UK-based DeepMind believes Agent57 could help advance the decision-making capabilities of AI, which are increasingly important for companies looking to use the technology in automated roles. DeepMind says Agent57 has achieved a first of its kind: “With Agent57, we have succeeded in building a more generally intelligent agent that has above-human performance on all tasks in the Atari57 benchmark,” wrote the study’s coauthors. “Agent57 was able to scale with increasing amounts of computation: the longer it trained, the higher its score got.”
Data Set
Comprising classic Atari games, the Arcade Learning Environment (ALE) is a data set and test platform for evaluating how AI performs across a range of game titles. It features a set of Atari 2600 games that are designed to be especially challenging for human players.
Atari games were chosen because they vary widely in style, are interesting for humans to play, and are free of experimenter bias. DeepMind has been leading the push for human-level competence on the ALE data set. The company’s Deep Q-Network (DQN) previously achieved human-level control in many Atari 2600 games. In collaboration with OpenAI, DeepMind later showed superhuman capabilities when playing Enduro and Pong.
Beating Humans
Agent57 takes this development a step further and achieves the best performance on the data set. The agent uses reinforcement learning (RL) running across numerous computers to learn how to make decisions. To put Agent57’s human-beating results in context, DeepMind compared it with other AI agents. MuZero had the highest mean score (5661.84), but although it beat humans in many games, it failed completely in other titles. On capped mean performance, Agent57 was highest with 100, compared to R2D2 (96.93) and MuZero (89.92). “Agent57 finally obtains above human-level performance on the very hardest games in the benchmark set, as well as the easiest ones,” wrote the coauthors in a blog post. “This by no means marks the end of Atari research, not only in terms of data efficiency, but also in terms of general performance … Key improvements to use might be enhancements in the representations that Agent57 uses for exploration, planning, and credit assignment.”
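The difference between a mean score and a capped mean score, which explains how MuZero can lead on one and trail on the other, can be illustrated with a short sketch. It assumes the human-normalized scoring commonly used for Atari benchmarks, where 0 corresponds to random play and 100 to the human baseline, and it assumes that capping clips each game's score to the range 0 to 100 before averaging. The game names and raw scores below are hypothetical and are not taken from DeepMind's results.

```python
from statistics import mean


def human_normalized(agent_score, random_score, human_score):
    """Human-normalized score in percent: 0 = random play, 100 = human baseline."""
    return 100.0 * (agent_score - random_score) / (human_score - random_score)


def capped(score, cap=100.0):
    """Clip a normalized score to [0, cap] so one game cannot dominate the mean."""
    return max(0.0, min(score, cap))


def summarize(games):
    """Return (mean, capped mean) of human-normalized scores over all games."""
    normalized = [
        human_normalized(g["agent"], g["random"], g["human"]) for g in games.values()
    ]
    return mean(normalized), mean(capped(s) for s in normalized)


# Hypothetical raw scores for illustration only.
GAMES = {
    "game_a": {"agent": 5000.0, "random": 0.0, "human": 100.0},   # far above human
    "game_b": {"agent": 150.0, "random": 50.0, "human": 250.0},   # halfway to human
    "game_c": {"agent": 10.0, "random": 10.0, "human": 110.0},    # no better than random
}

if __name__ == "__main__":
    mean_score, capped_mean = summarize(GAMES)
    print(f"mean: {mean_score:.2f}, capped mean: {capped_mean:.2f}")
```

In this toy example a single very high score inflates the plain mean even though the agent fails one game outright, while the capped mean only reaches 100 if the agent matches or beats the human baseline in every game.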