The problem with this approach is that most real-world situations, and even some games, don’t have a simple set of rules governing how they work. So some researchers have tried to get around the problem by modeling how a particular game or environment will affect an outcome, and then using that knowledge to craft a plan. The downside of this approach is that some domains are so complex that it is almost impossible to model every aspect. This turned out to be the case with most Atari games, for example.
In a way, MuZero combines the best of both worlds. Rather than modeling everything, it only tries to take into account the factors that matter when making a decision. As DeepMind points out, this is something you do as a human being. When most people look out the window and see dark clouds forming on the horizon, they don’t usually get caught up thinking about things like condensation and pressure fronts. Rather, they think about how they should dress to stay dry if they go out. MuZero does something similar.
It takes three factors into account when making a decision. It looks at the outcome of its previous decision, its current position, and the best course of action to take next. This seemingly simple approach makes MuZero the most efficient DeepMind algorithm to date. During its testing, DeepMind found MuZero to be as good as AlphaZero at chess, Go, and shogi, and better than all of its previous algorithms, including Agent57, at Atari games. It also found that the more time it gave MuZero to consider an action, the better the algorithm performed. DeepMind also ran tests in which it limited the number of simulations MuZero could perform before committing to a move in Ms. Pac-Man. During these tests, it found that MuZero was still able to achieve good results.
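For readers who want to see how those three signals and a simulation budget might fit together, here is a minimal sketch in Python. It is an illustration under stated assumptions, not DeepMind’s implementation: the number-line world, the hand-written `reward`, `value`, and `policy` functions, and the `plan` helper are all hypothetical stand-ins. In MuZero these quantities come from learned neural networks and the planning is done with Monte Carlo tree search; the sketch swaps in plain random rollouts to keep things short.

```python
import random
from collections import defaultdict

# Hypothetical toy world: the agent stands on a number line and wants to reach GOAL.
ACTIONS = (-1, +1)
GOAL = 3

def next_state(state, action):
    # Stand-in for a learned dynamics model.
    return state + action

def reward(state, action):
    # "How good was the last action?" Pays 1 only when the step lands on the goal.
    return 1.0 if next_state(state, action) == GOAL else 0.0

def value(state):
    # "How good is the current position?" Closer to the goal is better.
    return -float(abs(GOAL - state))

def policy(state):
    # "Which action looks best?" A prior that leans toward the goal.
    if state < GOAL:
        return {+1: 0.7, -1: 0.3}
    return {+1: 0.3, -1: 0.7}

def plan(state, num_simulations, depth=2, rng=None):
    """Pick an action by running a limited number of imagined rollouts."""
    rng = rng or random.Random(0)
    totals = defaultdict(float)  # summed return per first action
    counts = defaultdict(int)    # number of rollouts per first action
    for _ in range(num_simulations):
        s, first, ret = state, None, 0.0
        for _ in range(depth):
            prior = policy(s)
            a = rng.choices(list(prior), weights=list(prior.values()))[0]
            if first is None:
                first = a
            ret += reward(s, a)
            s = next_state(s, a)
        ret += value(s)  # estimate the rest of the game from the final position
        totals[first] += ret
        counts[first] += 1
    # Choose the first action with the best average simulated return.
    return max(counts, key=lambda a: totals[a] / counts[a])

if __name__ == "__main__":
    print(plan(0, num_simulations=200))
```

Lowering `num_simulations` mimics the kind of limited budget described above: the planner still returns a move, it just bases the choice on fewer imagined futures.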
Getting high scores in Atari games is all well and good, but what about the practical applications of DeepMind’s latest research? In short, they could be revolutionary. Although we are not there yet, MuZero is the closest researchers have come to developing a general-purpose algorithm. The Alphabet subsidiary says MuZero’s learning abilities could one day help it solve complex problems in areas like robotics, where there are no simple rules.