David Silver, leader of the reinforcement learning research group at DeepMind, received an honorary “ninth dan” professional ranking for AlphaGo.
Jung Yeon-Je | AFP | Getty Images
Computer scientists are questioning whether DeepMind, the Alphabet-owned British company widely regarded as one of the world’s leading artificial intelligence labs, will ever be able to build machines with the kind of “general” intelligence seen in humans and animals.
In its quest for general artificial intelligence, sometimes referred to as human-level AI, DeepMind is focusing a portion of its efforts on an approach called “reinforcement learning.”
This involves programming an AI to take actions that maximize its chances of earning a reward in a given situation. In other words, the algorithm “learns” to complete a task by seeking out these preprogrammed rewards. The technique has been used successfully to train AI models to play (and excel at) games like Go and chess. But these systems remain relatively dumb, or “narrow.” DeepMind’s famous AlphaGo AI can’t draw a stick figure or tell a cat from a rabbit, for example, while a seven-year-old can.
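The reward-seeking loop described above can be sketched with tabular Q-learning, a classic reinforcement learning algorithm. This is purely an illustrative toy (the corridor environment, hyperparameters, and names are invented for this sketch), not DeepMind’s actual setup:

```python
import random

# Toy "corridor" environment: states 0..4; reaching state 4 pays a reward of 1.
# The agent knows nothing about the task; it only sees states and rewards.
N_STATES = 5
ACTIONS = [-1, +1]  # move left or move right

def step(state, action):
    """Apply an action, returning (next_state, reward, episode_done)."""
    next_state = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    random.seed(seed)
    # Q-table: estimated value of taking each action in each state.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore occasionally; otherwise exploit the best-known action.
            if random.random() < epsilon:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Nudge the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q = train()
# The learned policy: in every non-terminal state, head right toward the reward.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The agent is never told “go right”; the behavior emerges solely from maximizing reward, which is the intuition the DeepMind paper generalizes from.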
Despite this, DeepMind, which Google acquired in 2014 for around $600 million, believes AI systems underpinned by reinforcement learning could theoretically grow and learn so much that they break through the theoretical barrier to AGI without any new technological developments.
Researchers at the company, which has grown to around 1,000 people under Alphabet’s ownership, argued in a paper submitted to the peer-reviewed journal Artificial Intelligence last month that “reward is enough” to reach general AI. The paper was first reported by VentureBeat last week.
In the article, the researchers state that if you keep “rewarding” an algorithm every time it does something you want, which is the essence of reinforcement learning, it will eventually start to show signs of general intelligence.
“Reward is enough to drive behavior that exhibits abilities studied in natural and artificial intelligence, including knowledge, learning, perception, social intelligence, language, generalization, and imitation,” the authors write.
“We suggest that agents that learn through trial and error to maximize reward could learn behavior that exhibits most, if not all, of these abilities, and therefore that powerful reinforcement learning agents could constitute a solution to artificial general intelligence.”
However, not everyone is convinced.
Samim Winiger, an AI researcher in Berlin, told CNBC that DeepMind’s “reward is enough” view is a “somewhat fringe philosophical position, misleadingly presented as hard science.”
He said the path to general AI is complex and the scientific community is aware of myriad challenges and known unknowns that “legitimately instill a sense of humility” in most researchers in the field and prevent them from making “grandiose, totalizing statements” such as “RL is the final answer; all you need is reward.”
DeepMind told CNBC that while reinforcement learning has been behind some of its best-known research breakthroughs, the technique represents only a fraction of its overall research. The company said it believes it is important to understand things at a more fundamental level, which is why it also pursues areas such as “symbolic artificial intelligence” and “population-based training.”
“In somewhat typical DeepMind fashion, they chose to make bold statements that grab attention at all costs, rather than a more nuanced approach,” Winiger said. “This is more like politics than science.”
Stephen Merity, an independent artificial intelligence researcher, told CNBC that there is “a difference between theory and practice,” adding that “a sufficient pile of dynamite can get you to the moon, but it’s not really practical.”
Ultimately, there is no evidence either way as to whether reinforcement learning will ever lead to AGI.
Rodolfo Rosini, a technology investor and entrepreneur with a focus on artificial intelligence, told CNBC: “The truth is that nobody knows and that the main product of DeepMind is still public relations and not technical innovation or products.”
Entrepreneur William Tunstall-Pedoe, who sold his Siri-like app Evi to Amazon, told CNBC that even if the researchers are right, “that doesn’t mean we’ll get there soon, nor does it mean there isn’t a better, faster way to get there.”
DeepMind’s “Reward Is Enough” paper was co-authored by DeepMind heavyweights Richard Sutton and David Silver; Silver met DeepMind CEO Demis Hassabis at the University of Cambridge in the 1990s.
“The key problem with the thesis posed by ‘Reward is enough’ is not that it is wrong, but rather that it cannot be wrong, and therefore it fails to satisfy Karl Popper’s famous criterion that all scientific hypotheses be falsifiable,” said a senior AI researcher at a large U.S. tech company, who wished to remain anonymous due to the sensitive nature of the discussion.
“Because Silver et al. are talking in generalities, and the notion of reward is suitably underspecified, you can always cherry-pick cases in which the hypothesis is satisfied, or the notion of reward can be shifted so that it is satisfied,” the researcher added.
“As such, the unfortunate verdict here is not that these prominent members of our research community are wrong in any way, but rather that what is written is trivial. What do we learn from this paper, in the end? Are there actionable consequences of acknowledging the truth of this hypothesis? Was this paper enough?”
What is AGI?
While AGI is often referred to as the holy grail of the AI community, there is no consensus on what AGI actually is. One definition is the ability of an intelligent agent to understand or learn any intellectual task that a human can.
But not everyone agrees with that and some wonder if AGI will ever exist. Others are terrified of its potential impacts and whether AGI would build its own, even more powerful forms of AI or so-called superintelligence.
Ian Hogarth, an entrepreneur-turned-angel investor, told CNBC that he hopes reinforcement learning is not enough to reach AGI. “The more existing techniques can be scaled up to achieve AGI, the less time we will have to prepare for AI safety efforts, and the lower the chance that things go well for our species,” he said.
Winiger argues that we are no closer to AGI today than we were several decades ago. “The only thing that has fundamentally changed since the 1950s and ’60s is that science fiction is now a valid tool for giant corporations to confuse and mislead the public, journalists, and shareholders,” he said.
Fueled by hundreds of millions of dollars from Alphabet each year, DeepMind is competing with companies like Facebook and OpenAI to hire the brightest people in the field as it seeks to develop AGI. “This invention could help society find answers to some of the world’s most pressing and fundamental scientific challenges,” DeepMind writes on its website.
DeepMind COO Lila Ibrahim said Monday that trying to “figure out how to operationalize the vision” has been the biggest challenge since she joined the company in April 2018.