Nobel Data Prize?

Data is eating the world and winning Nobel Prizes, hiding behind better marketing terms like “artificial intelligence” or “AI.” The 2024 Nobel Prize in Physics was awarded to Geoffrey Hinton (“the godfather of artificial intelligence”) and John Hopfield for their work on “machine learning with artificial neural networks.” Half of the chemistry prize went to Demis Hassabis and John Jumper of DeepMind for AlphaFold, a system that learned to accurately predict the structure of proteins using artificial neural networks.

Both awards were given for what used to be called “computational statistics”: complex statistical methods that identify patterns in large amounts of data, and the process of developing a computer program that can find the same patterns or abstractions in a new data set. Both awards celebrate the recent success of the “connectionism” approach to developing “artificial intelligence” computer programs.
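As a loose illustration of that two-step process, here is a minimal sketch in Python; the use of scikit-learn, its bundled digits dataset, and a small MLPClassifier are assumptions made for the example, not anything specified by the laureates’ work or this article.

```python
# Minimal sketch of "computational statistics" in the sense described above:
# fit a small neural network to labeled examples (identify the patterns), then
# ask it to classify examples it has never seen (find the same patterns in a
# new data set). Library and dataset choices are illustrative only.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()  # ~1,800 labeled 8x8 grayscale images of handwritten digits
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
model.fit(X_train, y_train)             # learn patterns from the training data
accuracy = model.score(X_test, y_test)  # apply them to data the model never saw
print(f"held-out accuracy: {accuracy:.2f}")
```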

“Connectionism,” based on artificial neural networks and born in the mid-1950s, was overshadowed until about a decade ago by “symbolic artificial intelligence,” another approach born in the same period. Proponents of symbolic AI dismissed as “alchemy” the statistical analysis favored by connectionists; they believed in the power of human intelligence and its ability to articulate “rules” for logical thinking that could then be programmed to make computers think, reason, and plan. This was the dominant ideology in computer science in general. It was the fervent belief behind “expert systems” (as one branch of symbolic AI was labeled): the belief that experts (computer scientists, AI researchers, and developers) could distill human knowledge and transfer it into computer code.

When Somite.ai co-founder and CTO Jonathan Rosenfeld recently explained to me his ideas about AI scaling laws, he mentioned Rich Sutton’s “The Bitter Lesson” in the context of why he (Rosenfeld) wanted to “do better than the experts.” Examining AI developments in chess, Go, speech recognition, and natural language processing, Sutton concluded that “in the long run, the only thing that matters is leveraging computation.” What matters, in other words, is the falling cost of a unit of computation, or “Moore’s Law.”

Sutton drew two lessons from computer science’s bitter lesson. One is that it doesn’t really matter what experts think about thinking or what rules they come up with, because trying to find “simple ways to think about the contents of minds” is futile. The other is that general-purpose methods always triumph (eventually) because, unlike experts, they scale: “the power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great.”

Sutton noted that “the two methods that seem to scale arbitrarily in this way are search and learning,” the methods that form the basis of recent successful AI breakthroughs such as image classification, AlphaFold, and LLMs. But while the cost of computing has been falling rapidly and steadily for decades, these breakthroughs have only occurred in the last decade. Why?

Sutton no doubt highlighted a key factor in the recent triumph of artificial neural networks (or deep learning, or computational statistics): the falling cost of computing. But writing in 2019, he should also have acknowledged another important factor contributing to connectionism’s sudden victory: the availability of vast amounts of data.

When Tim Berners-Lee invented the World Wide Web thirty-five years ago, he (and the many inventors who followed him) created a huge repository of data accessible to billions of internet users around the world. The Web, combined with new tools (primarily the smartphone) for creating and sharing data in multiple formats (text, images, video), has been the key enabler of the recent success of the old-new approach to “AI.”

The decline in the cost of computing and the discovery that GPUs were the most efficient way to run the calculations required to find patterns in large amounts of data did not, on their own, make possible the 2012 breakthrough in image classification. The main contribution to that breakthrough was ImageNet, an organized database of labeled images taken from the Web, assembled in 2009. Similarly, the invention in 2017 of a new kind of statistical model for processing and analyzing text was a significant contribution to today’s ChatGPT bubble, but “generative AI” could not have happened without the vast amounts of text (and images and videos) available (with or without permission) on the Web.

Why is the decreasing cost of computing, or “Moore’s Law,” so central to descriptions and explanations of the course of computing in general and “artificial intelligence” advances in particular? Why have IT industry observers missed the most important trend, the data explosion that has driven technological innovation since at least the 1990s?

The term “data processing” was coined in 1954. “This is not to say that industry participants ignored data,” I wrote in 2019; there has always been data, and increasingly larger containers to store it (also enabled by Moore’s Law).

“Data” is an ephemeral concept, difficult to define and measure, unlike computers, whose rapid shrinking and growing processing power we can see with our own eyes. The focus on processing rather than data was also promoted by Intel, a very successful marketing force.

“Data” enjoyed a brief PR success between about 2005 and 2015, when terms like “Big Data” and “Data Science” became the talk of the day. But these were quickly eclipsed by the most successful marketing and branding campaign ever: “artificial intelligence.” Yet data continues to eat the world, for better or worse. It even won two Nobel Prizes eventually, although that was not stated explicitly.