An Interview with Geoffrey Hinton, the Godfather of AI: “Language Models May Learn to Be Much Smarter Than People”

Geoffrey Hinton is widely regarded as the "Godfather of AI" and one of the most influential figures in the history of artificial intelligence. Professor Hinton's groundbreaking work has fundamentally transformed our understanding of how machines can learn and process information, laying the foundation for the AI revolution we witness today.

According to the University of Toronto’s website, Geoffrey Hinton earned his BA in Experimental Psychology from the University of Cambridge in 1970, followed by a PhD in Artificial Intelligence from the University of Edinburgh in 1978. His postdoctoral research took him to both Sussex University and the University of California San Diego, before joining the Computer Science faculty at Carnegie Mellon University, where he spent five years.

Later, he became a fellow of the Canadian Institute for Advanced Research and transitioned to the Department of Computer Science at the University of Toronto. Between 1998 and 2001, Hinton helped establish the Gatsby Computational Neuroscience Unit at University College London, before returning to Toronto, where he now holds the title of Emeritus Distinguished Professor.

From 2004 to 2013, he directed the Neural Computation and Adaptive Perception program, funded by the Canadian Institute for Advanced Research. Between 2013 and 2023, he worked part-time at Google, serving as Vice President and Engineering Fellow.

He has spent decades pioneering the field of neural networks when few believed in their potential. His seminal contributions include the development of the backpropagation algorithm, which became the backbone of modern deep learning, and his work on Boltzmann machines, capsule networks, and autoencoders. Indeed, he "invented a method, the Boltzmann machine, that can autonomously find properties in data, and so perform tasks such as identifying specific elements in pictures." These innovations have made possible everything from image recognition and natural language processing to the large language models that power today's AI assistants.

Professor Hinton's academic career has been equally distinguished. He has held prestigious positions at Carnegie Mellon University, the University of Toronto, and University College London. His influence extends far beyond academia: he was a key figure at Google Brain, where he helped translate theoretical breakthroughs into practical applications that have touched billions of lives. However, in 2023, he made the significant decision to resign from Google to "freely speak out about the risks of AI," highlighting his deep concern about the technology's potential dangers.

Among his many accolades, Professor Hinton is a Fellow of the Royal Society, a recipient of the Nobel Prize in Physics, and has received the Turing Award (often called the "Nobel Prize of Computing"), along with numerous other prestigious awards including the Herzberg Gold Medal and the IJCAI Award for Research Excellence. His students and collaborators have gone on to lead major AI research efforts across the globe.

In recent years, Professor Hinton has also become an important voice in discussions about AI safety and the societal implications of artificial intelligence, advocating for careful consideration of both the tremendous benefits and potential risks of the technology he helped create.

We had the privilege of listening to Professor Hinton in an interview conducted by Alok Jha, the science and technology editor at The Economist, during GITEX Europe in Berlin.

An Interview with Geoffrey Hinton, the Godfather of AI: “Language Models May Learn to Be Much Smarter Than People”
Photo by BUSINESS POWERHOUSE

Alok Jha: Geoffrey Hinton is famous for his work on artificial neural networks and, as you've heard, and as all the paraphernalia around this show suggests, he's also known as one of the godfathers of AI, which maybe you can talk to me about later. He's a Professor Emeritus at the University of Toronto, and for ten years from 2013 he also worked for Google, which he left in 2023. Geoffrey, it's really nice to meet you. It's a real pleasure to have you here. I just want to start with a bit of a vision from you. I'll shout if necessary. Everyone here is talking about companies and practical uses, but I'd like to start with you with some vision. So, can you, in the broadest terms possible, tell me the two or three most hopeful things you think AI will be doing in the next decade? What are the most exciting things for you? In each case, would you mind just giving a sense of what you think will actually happen in those areas?

Professor Geoffrey Hinton:

I'll give you an example. Okay, so there's nothing new in what I'm going to say. I think that AI will be amazing in healthcare. So in healthcare, I made a prediction in 2016 that in the next five years, AI would replace radiologists in interpreting medical scans. 

Alok Jha: How did that work out?

Professor Geoffrey Hinton: I was wrong. I got the time scale wrong. It's beginning to happen now. Right now, I think there are over 250 applications approved by the FDA for using AI to interpret medical scans. It's used in major cancer clinics. I think it will still be another five years before it replaces radiologists, maybe even longer. The medical profession is quite conservative, but AI will be able to get more information out of medical scans. One case we know about is interpreting fundus images of the retina, where AI can see all sorts of things in those images that no ophthalmologist ever realized could be seen there.

Alok Jha: What about radiologists? What happens to them? Can they just treat more people, or do they disappear altogether?

Professor Geoffrey Hinton: So radiologists will still do many other things, like comfort people, and plan what to do, although AI will of course be doing that as well. For quite a while, it will be a combination of a radiologist and an AI, and that will be more efficient than a radiologist alone. But the good news about healthcare is it's very elastic. If we can make doctors more efficient, we can all get a lot more healthcare. So there won't be unemployment, there'll just be a lot more healthcare, which will be great. It's a good thing. 

Alok Jha: And in healthcare, apart from radiology, do you see other places like diagnosis, specific diseases perhaps that can be made more treatable? 

Professor Geoffrey Hinton: There are two things there. Diagnosis will get much better. We already know from more than a year ago that if you take difficult cases to diagnose, then AI gets about 50% right, and the doctors only get about 40% right. The combination gets about 60% right, and that will save a lot of lives. In North America, about 200,000 people a year die of bad diagnoses. So it'll have a big effect there. And that's from AI from more than a year ago, so we'll get much better at that. It'll also of course be better at designing drugs, so we'll get much better at treatments. 

Alok Jha: We've already seen some of the early-stage drugs designed by AI going through clinical trials. So that's one area. Medicine and healthcare, if that wasn't enough. Is there another one you want to talk about? 

Professor Geoffrey Hinton: Yes, the next area is education. So my university doesn't like me talking about this, but AI will be much better at tutoring people. We already know that if you take a child and give them a personal tutor, they learn about twice as fast as in a classroom. And that's because a personal tutor understands what it is the child doesn't understand, and tailors their explanations to the child's understanding. AI should be able to do that even better, because the AI will have had experience with millions of children to train on. This will come in the next 10 years or so. It's not there yet, but it's coming. And so we'll get much better education at many levels. I think the last level where that will happen is educating PhD students, because that's more of an apprenticeship. It's not so much teaching facts as teaching approaches, but it'll come there in the end too. We're already seeing it in companies for educating employees. So I work with a company called Valance that has a system called Nadia that teaches employees leadership skills. And we're going to see AI being used to educate employees in all companies, I think. When I was at Google, I used to have to watch training videos, which were very boring, on how to be polite and things like that.

Alok Jha: Did they work?

Professor Geoffrey Hinton: To some extent, yes. I think interacting with an AI tutor will actually be much more efficient. 

Alok Jha: I think people in this room have probably tested this out as well, having tried out chatbots and asked them questions. And obviously, you have to be careful what you believe, but it's a good start already. You can see where it's going. Okay, give us one more hopeful case, the one that you think is going to be one of the most optimistic and ambitious things that AI will be doing in the next 10 years.

Professor Geoffrey Hinton: So I agree with Demis Hassabis, the leader of DeepMind, who for many years has said AI is going to be very important for making scientific progress. It's going to make scientific discoveries. There's one area in which that's particularly easy, which is mathematics because mathematics is a closed system. So you're going to get AIs that play mathematics. That is, they ask themselves, I wonder if I could prove this, I wonder if I could prove that. But because this is a closed system, they can just try things out and see if they can prove them. 

Alok Jha: We're talking about mathematical conjectures that are yet to be proven by humans.

Professor Geoffrey Hinton: Yes. 

Professor Geoffrey Hinton: And new conjectures they'll make too. And I think AI will get much better at mathematics than people, maybe in the next 10 years or so. And within mathematics, it's much like things like Go or chess. They're closed systems with rules, where they can generate their own training data. So when they first taught Go to AIs, they would mimic the moves of human experts. And there's of course a limitation to that, which is that you run out of human expert moves, and humans aren't that good anyway. But then they got what they call Monte Carlo rollouts, where you say: if I go here, he goes there, I go here, he goes there, oh wait, that's bad for me. And you can learn from the Monte Carlo rollouts, and you no longer need humans to tell you what good moves are. You can figure it out. It'll be the same in mathematics, and we'll get mathematical systems much better than people, I think.
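
To make the rollout idea concrete, here is a minimal Python sketch using tic-tac-toe as a stand-in for Go. Everything in it (the game, the helper names, the rollout counts) is illustrative, not anything Hinton or DeepMind actually used: a candidate move is scored by playing many random games from the resulting position and counting how often the mover ends up winning, with no human game records involved.

```python
# Illustrative sketch: score moves by Monte Carlo rollouts (random playouts).
import random

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == "."]

def rollout(board, player):
    """Play random moves to the end; return the winning mark or None for a draw."""
    board = board[:]
    while legal_moves(board) and winner(board) is None:
        board[random.choice(legal_moves(board))] = player
        player = "O" if player == "X" else "X"
    return winner(board)

def score_move(board, move, player, n_rollouts=200):
    """Fraction of random playouts won after making `move`."""
    wins = 0
    for _ in range(n_rollouts):
        trial = board[:]
        trial[move] = player
        if rollout(trial, "O" if player == "X" else "X") == player:
            wins += 1
    return wins / n_rollouts

if __name__ == "__main__":
    empty = ["."] * 9
    scores = {m: score_move(empty, m, "X") for m in legal_moves(empty)}
    print(max(scores, key=scores.get), scores)  # the centre square usually scores best
```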

Alok Jha: And they don't need breaks to just carry on continuously and get to solutions much faster than humans might before they get tired, right? 

Professor Geoffrey Hinton: Eventually, yes. 

Alok Jha: I mean, in general, mathematics is something you think will be meaningful to us. But what about science in general: research, physics, chemistry, molecular biology, all of these things? Do you think they will be accelerated by the neural networks of the future?

Professor Geoffrey Hinton: Yes, I think what we'll see is little bits of those scientific enterprises being accelerated early on. And as time goes by, more and more aspects will be accelerated. Yes.

Alok Jha: And what will that world feel like when AI is accelerating scientific progress? It feels like scientific progress is moving fast right now. If it gets even more exponential, what does that mean? 

Professor Geoffrey Hinton: It could make life much better for everybody. If we share the benefits of the increase in productivity, it'll be great for everybody. So one example: if you get a full-body MRI scan every year, and you get AI to interpret it, basically you don't need to die of cancer anymore. You can detect almost all cancers when they're very small, and if you can detect them at stage one, you can normally just get rid of them. So Craig Venter, who's one of the people who sequenced the human genome, had a full-body MRI and detected two very aggressive prostate cancers, much like Biden may have now, very early, and was fine. So with AI doing the interpretation, doing better interpretation than people can do, we'll get things like that. We'll basically get rid of people dying of cancer, if you can afford the full-body MRI.

Alok Jha: I was going to pick up on a point in your statement there; we'll come back later to equity and to how broadly this stuff gets rolled out. Let's talk a bit about the AIs themselves, not just the applications. Are there things that you see models being able to do in, let's say, five years' time that they can't do now?

Professor Geoffrey Hinton: I'm very cautious about making those predictions, because of my false prediction in 2016 that by now we would have AI replacing radiologists in reading scans. It's very hard to see five years ahead. The best way to understand how hard that is is to look five years back. If you look five years back from now, we were just beginning to see things like GPT-2, which seemed amazing at the time, because it could generate coherent text. We'd never had things like that before. The coherent text wasn't very good, it was full of nonsense, but it was coherent. If we look at that now, it seems incredibly primitive. So I think the best I can say is that the things we have now will seem incredibly primitive in five years' time.

Alok Jha: So we'll be surprised.

Professor Geoffrey Hinton: We'll be surprised. For example, they'll be able to do much better reasoning, and I think we'll get far fewer hallucinations. I think the AI chatbots will be able to reason about what they just said and realise it doesn't really make sense, and that they don't really have good evidence for it. So they'll be far more like people in that respect, people who want to tell the truth, and they will be much better at not hallucinating.

The AI chatbots will be able to do reasoning about what they just said, and realise it doesn't really make sense, and they don't really have good evidence for that. So they'll be far more like people in that respect.

Geoffrey Hinton

Alok Jha: Let's talk about reasoning, which is something that the latest generation of models has been doing for the best part of six months or so, at least in public anyway. The time scale at which that arrived, with chain of thought and all of those things: did you expect it to happen at this sort of rate, or was it a surprise?

Professor Geoffrey Hinton: It was a surprise. So if you'd asked me ten years ago, I would have confidently predicted we wouldn't have chatbots you could talk to about anything. We wouldn't have a system like GPT-4 or Gemini 2.5, which is an expert, though not a very good one, at everything. That would have seemed extraordinary. I'd have said that was way further off, and we certainly wouldn't have systems that can do complicated reasoning. And now we have the reasoning getting up to human levels. So I've been very impressed with chain-of-thought reasoning, and with using reinforcement learning to learn the chain-of-thought reasoning, rather than having people demonstrate it. And this has completely changed our model of what reasoning is about. So for many, many years, AI was dominated by symbolic AI, which thought that reasoning was the essence of intelligence and was completely convinced that the way to do reasoning is with logic. You have to take English sentences and convert them into some kind of logical form, to which you could apply symbolic rules to derive new logical forms, and that's what reasoning was going to be like. And they were so convinced about that, they didn't really think it was a hypothesis. They thought that's just how it has to be.

Alok Jha: That's just a fact.  

Professor Geoffrey Hinton: Yes. And those people have now withdrawn to saying, well, we're going to have hybrid systems. We're not going to do reasoning until we have a hybrid system where you use AI to convert reality into the kind of thing that these logical systems can deal with. You take messy reality and convert it into that, and then these logical systems do the reasoning. So there's still a movement that says we need hybrid neural-net symbolic AI. I think this is complete nonsense. I think chain-of-thought reasoning has shown that the reasoning is all done in English, by systems that know how to understand English. But understanding English does not consist of converting English sentences into logical forms. I can try and give you a model of what I think it does consist of. So I'm going to try and give you a model of what's happening when these big chatbots understand English, or whatever your language is.

Alok Jha: Does it work for all languages? 

Professor Geoffrey Hinton: Yes. And I'm going to assume that they operate on words rather than on word fragments, because it's easier to describe if they operate on words. So rather than taking a sentence made of words and converting it into an unambiguous logical form, what they do is take these word symbols and convert them into big vectors of neural activity, big sets of active features. And of course, you can't always decide what set of active features to use for a word, because it depends on context. If I give you the word "May", that could be a month, it could be a woman's name, or it could be a modal, as in "I may want to assure you". And so you don't initially know how to convert it into a big set of features. So you sort of hedge your bets, and then use multiple layers of neural net to gradually disambiguate it, to clean it up, by interacting with the feature vectors for the other words in the context. And once you have turned these words into the right feature vectors, that is understanding. Now, the interactions are quite complicated, but they create the right feature vectors. So let's take an example to understand modelling. If I take any distribution of 3D matter, I can model it up to a certain accuracy by using Lego blocks. I can take the shape of a Porsche; if I'm not worried about the exact shape of the surface, I can model that shape quite nicely with Lego blocks. So Lego blocks can model any 3D distribution of matter up to a certain resolution. Words are like Lego blocks, but for modelling anything. Lego blocks are three-dimensional; words have feature vectors that are maybe a thousand dimensions, so they're much more complicated. What's more, each Lego block and each word has a name, but the name doesn't totally determine the shape. In Lego, you have different shapes of Lego blocks, but they're not deformable. Words are deformable, and they deform to fit in with the other words in the context. So that gives you shades of meaning. Also, the way they interact is more complicated than Lego blocks. With Lego blocks, you have a little plastic cylinder that goes into a plastic hole, and that's it. With words, if you want a model of what's going on in chatbots, think of each word as a high-dimensional Lego block that's got an approximate shape, but that will deform to fit in with the other words. Think of the Lego block as covered with little hands, and if you deform the Lego block, the shape of these hands changes, and they have to shake hands with the other Lego blocks, and they have to choose which other Lego blocks to shake hands with. That's what attention is; multi-head attention is these multiple hands. And so, for a normal person, a good model of what's going on is this: the names of the words tell you which Lego blocks to use, the Lego blocks are deformable, and they deform so as to shake hands with the other Lego blocks in the context and make a nice structure. Once they've done that, that's understanding. And you can see it's quite like the protein folding problem. With proteins, you have a bunch of amino acids, and you have to figure out what shape they're all going to make, and which amino acids shake hands with which other amino acids. So understanding is much more like protein folding than it is like turning each sentence into a logical form. So the whole model that linguists and symbolic AI people have had of understanding is just wrong.
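
As a toy illustration of the "hedge your bets, then let context disambiguate" picture, here is a small numpy sketch. The vectors, the single attention step, and the two senses of "May" are all invented for the example; real chatbots use many learned layers, so treat this as a sketch of the idea rather than how any actual model is built.

```python
# Illustrative sketch: an ambiguous word starts as a blend of candidate feature
# vectors, and one simplified dot-product attention pass over the context pulls
# it toward the sense the context supports.
import numpy as np

rng = np.random.default_rng(0)
dim = 16

# Hypothetical sense vectors for the word "May": month vs. modal verb.
sense_month = rng.normal(size=dim)
sense_modal = rng.normal(size=dim)

# Context words; "June" is deliberately made similar to the month sense.
context = {
    "in": rng.normal(size=dim),
    "June": sense_month + 0.1 * rng.normal(size=dim),
    "weather": rng.normal(size=dim),
}

# Initial representation hedges between the two senses.
may_vec = 0.5 * sense_month + 0.5 * sense_modal

def attention_update(query, context_vectors):
    """One simplified attention step: softmax over dot products, then mix a
    weighted sum of the context vectors back into the query."""
    keys = np.stack(list(context_vectors.values()))
    scores = keys @ query / np.sqrt(len(query))
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return 0.5 * query + 0.5 * (weights @ keys)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

updated = attention_update(may_vec, context)
print("month sense before/after:", cosine(may_vec, sense_month), cosine(updated, sense_month))
print("modal sense before/after:", cosine(may_vec, sense_modal), cosine(updated, sense_modal))
```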

Alok Jha: I think it's neural networks all the way. 

Professor Geoffrey Hinton: It's neural networks all the way. And the people who say “We need hybrid neuro-symbolic systems” are the old-fashioned people who used to believe in symbolic systems and won't face up to that. They're a bit like people who make petrol engines when actually electric motors are better. They might come back and say: we agree, electric motors are great; what we're going to do is use electric motors to inject the petrol into the engine. That's what neuro-symbolic AI is like.

Alok Jha: Okay, well, that's interesting. I think I get a sense from you of where you think it's going, where the signal is versus the noise. A lot of people still talk about symbolic AI as necessary to augment neural network architectures, but, as you said, that doesn't seem to be the case. Neural networks seem to be able to do a lot more than even perhaps you thought initially, so if they're surprising you, they're probably surprising quite a lot of people. I want to talk about other challenges for making models operate in the real world in a physical way. So things like bodies. The human mind is not only in the brain. There's intelligence all around our bodies, and there's feedback from the world in response to those things. There's spatial awareness, all these things. All obviously controlled by the brain, but also distributed around it. I'm curious to hear what you think about that in terms of the models that we're using. The models are all in silicon right now. They're all virtual things, in computers, but at some point, do you think that to make them more realistic, to make them more useful, they're going to need bodies, sensors or spatial awareness? And if they do, how are they going to get that? Is that going to be programmed in? Is that going to be learned? What's your opinion on this?

Professor Geoffrey Hinton: Okay, so there's a distinction to be made here. There's a philosophical question and a practical question. So philosophically you could ask: if a child just listened to the radio, could they learn about the world? The only thing they do is listen to the radio. Philosophers will often tell you they couldn't, but actually what's happening with chatbots is they've just seen strings of text and they have learned about the world. So there's a remarkable amount about the structure of the world that is implicit in the sequences of words.

Alok Jha: You mean in language?

Professor Geoffrey Hinton: Yes, but that's not the most efficient way to learn about the world. These chatbots have to see immense amounts of text to learn about the world. It's much more efficient to learn about the world if you can interact with it. So having a camera and a remote arm will enable you to learn spatial stuff much more efficiently. But that doesn't mean that's the only way to learn. You might be able to learn a lot about space just from language, but you can learn much more efficiently if you're in the world and you do experiments on that. You don't have to be able to do experiments to learn about the world. Some people think you do, but if that were true astrophysicists would be kind of out of luck. 

Alok Jha: So in a sense, all the robots that we read about have essentially been trained by programming rules: if you detect this, do that. It's very meticulous and slow. Whereas I think the equivalent of large language models for movement has shown that you can just let the robots move in different directions and allow them to make mistakes, and as long as it's not really dangerous, they can learn how to do things. The architecture works even for movement.

Professor Geoffrey Hinton: Yes, a lot of progress is being made in robotics. They're making things with a sense of touch. I think Amazon recently had something like this; it took over a company called Covariant, and I suspect this comes from Covariant, but they have very good AI for physical manipulation with a sense of touch. You get very good systems for picking the right Amazon product out of the box.

Alok Jha: The idea of a dark warehouse where there are no people, just robots, is not even science fiction anymore; those things exist. I suppose they're still dangerous for humans to operate in because the robots aren't good enough, but at some point, human-robot interaction will be able to happen as the robots get better. Could I ask you about some bits of hype and buzzwords that people talk about these days? So 2025 is going to be the year of agentic AI: AI agents that can do things in the world. Many companies are selling products. We're told that you can set your LLM off to book your holidays. Not quite yet, but soon. What's your opinion of this? Is this actually something real, or is it a little bit of hype?

Professor Geoffrey Hinton: I think it's real. I think we're seeing it. I think we're even seeing agents interacting with other agents, which is slightly scary.

Alok Jha: What useful things are they doing then?

Professor Geoffrey Hinton: Let's see. If I was still doing research, I would probably know all the details of this; I've just read about it. Agents can do things like make bookings on the web. I assume agents will pretty soon have your credit card number and be able to just buy things for you. Agents can interact with other agents to do quite sophisticated planning. So we are getting agents. It's not just hype. In general, I've lived through many years of people talking about AI hype. In the 80s, when we started using backpropagation, we were very enthusiastic and we got slightly ahead of ourselves. We thought fairly soon we'd be able to do lots of things we couldn't do for a while. There was hype then. My overall opinion is that in the last few years, if anything, AI has been under-hyped.

Alok Jha: Because of the rate it's going at? So this is my next question. People talk about the scaling laws: more computing, more innovation, more money going into these things, more clusters of computers being built, more chips. So if you just keep doing more, you'll get further even faster. Do you think there's a limit to this?

Professor Geoffrey Hinton: There is a limit, unless you can generate your own training data. So with language models, most of the data in the world is siloed in companies, and the freely available data has largely been used up. So they are beginning to hit a limit. We're also hitting a limit where the progress you get from scaling up is diminishing: to get one extra little bit of performance, you need to double the amount of data and double the amount of computation, and the next bit requires doubling again. So that's hitting a limit. There also seems to be an energy limit. But for things that can generate their own data, you're not going to get a data limit. And I think even language models will be able to generate their own data by using reasoning. They'll be able to say: I believe this and I believe that; I can do some reasoning, and so I ought to believe that other thing. But I don't believe it, so something needs to be fixed. I need to fix one of the premises, or I need to fix the conclusion, or I need to fix the way I do the reasoning, but I can get a gradient to change things. And that's how things like AlphaGo learned to be much better than people. And it's how language models may learn to be much smarter than people.
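
As a rough back-of-the-envelope on the "double the data for each extra bit of performance" point (the starting dataset size below is purely hypothetical), this is what that doubling looks like written out:

```python
# Illustrative sketch: if gains are roughly logarithmic in data, each fixed
# increment of quality costs twice the data of the last one.
import math

base_tokens = 1e12  # hypothetical starting dataset size, in tokens
for extra_units in range(5):
    tokens_needed = base_tokens * 2 ** extra_units
    print(f"+{extra_units} units of performance -> ~{tokens_needed:.0e} tokens "
          f"({math.log2(tokens_needed / base_tokens):.0f} doublings)")
```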

Alok Jha: I'm curious to know if you think there are foundational architectures that will change or need to be developed to push beyond that limit. The one difficulty we just talked about is that to get better, you have to keep doubling the amount of data, and of course you can't keep doing that forever. But are there other architectural things in the next few years or the next decade, either breakthroughs you hope will come or problems that you think need to be solved by changes in architecture, to really push this forward?

Professor Geoffrey Hinton: So I think there are two things here. We know that there's still lots and lots of progress that could be made by doing better engineering. DeepSeek was a wonderful example of that. DeepSeek did better engineering to make use of older NVIDIA chips, and they did better engineering in the training. They did sort of piggyback on big chatbots trained with much more computing power. So there are engineering improvements that will continue to happen, and they will allow us to do the same thing for less energy. But inevitably there will also be scientific breakthroughs. They're almost impossible to predict; we don't know when they'll happen. Things like transformers, which Google produced in 2017, made a big difference, but I don't believe that's the last big breakthrough we'll get. We'll get other big breakthroughs in the architectures, as well as big breakthroughs in how to use those architectures, like the recent idea that you should do much more computation at test time. So the computation isn't all in the training, with some fixed thing done at test time; you can actually do reasoning at test time. That's making a big difference. But there will certainly be more breakthroughs in the architectures that we haven't thought of yet.
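
To illustrate what "more computation at test time" can buy, here is a toy Python sketch. The `noisy_solver` is a made-up stand-in for a language model, not a real API, and the accuracy figures are invented for the example: sampling many candidate answers and taking a majority vote beats a single sample.

```python
# Illustrative sketch: spend extra compute at test time by sampling several
# candidate answers and taking a majority vote (a self-consistency-style trick).
import random
from collections import Counter

def noisy_solver(question, correct_answer, p_correct=0.6):
    """Hypothetical model: right 60% of the time, otherwise a random wrong answer."""
    if random.random() < p_correct:
        return correct_answer
    return correct_answer + random.choice([-2, -1, 1, 2])

def answer_once(question, correct_answer):
    return noisy_solver(question, correct_answer)

def answer_with_test_time_compute(question, correct_answer, n_samples=25):
    votes = Counter(noisy_solver(question, correct_answer) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    random.seed(0)
    trials = 1000
    single = sum(answer_once("q", 42) == 42 for _ in range(trials)) / trials
    voted = sum(answer_with_test_time_compute("q", 42) == 42 for _ in range(trials)) / trials
    print(f"single-sample accuracy ~{single:.2f}, 25-sample vote accuracy ~{voted:.2f}")
```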

Language models may learn to be much smarter than people. 

Geoffrey Hinton

Alok Jha: Tell me about the short-term memories that LLMs might have. We've talked about this before, actually. I'm curious to know why you think models need short-term memory. What are they going to do with it?

Professor Geoffrey Hinton: Okay, so for quite a long time in neural nets, people's view of how you deal with sequences was that you would have a recurrent neural net, and in the recurrent neural net, all the information about the past of the sequence would be stored in the activation states of the hidden neurons. These activation states change with each word, so those are the things that change rapidly. And then there are things that change slowly, which are the connection strengths that determine how the input word and the current activation state lead to the activation states at the next time step. That's the recurrent neural net, and for many years people thought that's the way it was going to go. Then transformers came along and said: no, look, we're going to keep all the previous activity states, and we're going to allow the current word to look at all those previous activity states. So we're keeping around hugely more context, and that worked much better. Now if you ask, well, how could the brain possibly do that? The brain can't possibly keep around all the previous activations of the neurons; it's only got the same neurons. So how does it get that kind of hugely rich context? And it's clear the only way it can do that is by having memory in short-term connection states. So the classic neural net model says you have neural activities that change fast as the data comes in, and you have connection strengths that change very slowly and learn over many, many sequences, and that's it. It's just two time scales. But you can't do transformers in a real neural net like that. You have to have at least a third time scale, which is that you take the connection strengths and you have an overlay on those connection strengths that we call fast weights, which rapidly changes the connection strengths in a way that decays rapidly, and which can contain much more information than neural activities. Thousands of times more information than neural activities. And that's the real context these neurons are operating in. That's what must be going on in the human brain if it's to do anything at all like what transformers are doing. It gives you a hugely richer context. So that's a nice case where progress in AI that looks very unneural has actually led to progress in the way people think about how the brain handles sequences. It must be using temporary changes in connection strengths to get access to a big context.
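
Here is a minimal sketch of the three time scales being described, loosely in the spirit of fast weights. The dimensions, decay rate, and Hebbian outer-product update are illustrative choices, not a specific published model: slow weights stay fixed, while a rapidly decaying overlay stores recent activity.

```python
# Illustrative sketch: slow weights (learned over many sequences), a fast-weight
# overlay updated from recent activity that decays quickly, and the neural
# activities themselves, which change at every step.
import numpy as np

rng = np.random.default_rng(0)
dim = 8

slow_W = rng.normal(scale=0.3, size=(dim, dim))  # slow: learned over many sequences
fast_W = np.zeros((dim, dim))                    # fast: temporary overlay, decays quickly

decay, fast_lr = 0.9, 0.5

def step(h, x):
    """One recurrent step: the effective weights are slow weights plus the fast overlay."""
    global fast_W
    h_new = np.tanh((slow_W + fast_W) @ h + x)
    # Hebbian outer-product update: recent activity is stored in the overlay,
    # giving a short-term memory far larger than the activity vector itself.
    fast_W = decay * fast_W + fast_lr * np.outer(h_new, h_new)
    return h_new

h = np.zeros(dim)
for t in range(5):
    h = step(h, rng.normal(size=dim))
print("fast-weight overlay norm after 5 steps:", round(float(np.linalg.norm(fast_W)), 3))
```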

Alok Jha: Fast weights, you heard it here first. It's the thing to look out for in the next few years. And just one more bit on the science before we talk about the safety aspect of all this, which is that you started your career trying to understand the human brain, and you've ended up in this place where you've created essentially a version of that in silicon. Do you still get fascinated by the brain itself, the human brain? And what do LLMs and the latest generation of neural networks tell you about our brains?

Professor Geoffrey Hinton: So actually, yes, I had a career designed to understand how the brain computes, and I failed. But in that failure, as we tried to model the brain, we produced other stuff: we produced things that learn using backpropagation. And I believe the brain probably doesn't use backpropagation. The brain is solving somewhat different problems from these large chatbots. In the large chatbots, you have trillions of training examples, and you only have on the order of a trillion connections. Your brain is very different. Your brain has a hundred trillion connections, but you only live for a couple of billion seconds. Slightly more, which is lucky for me, but only a couple of billion seconds. So you don't get much training data. So the brain is solving the problem of: with very limited training data, how do we learn using huge numbers of connections? The AIs are solving the problem of: with a huge amount of training data, how do we learn using not many connections? Right, "not many" is now a trillion. The backpropagation learning algorithm, which all these systems use, is great for solving that problem: how do you squeeze lots of information into not many connections? And that's why things like GPT-4 know thousands of times more than any person, even though they've only got a few per cent of the connections a human has. The brain, I think, is obviously using some different learning algorithm. Nobody knows how the brain is learning. We can do backpropagation in somewhat neurally plausible ways, but only for small systems; it doesn't work for big systems. We still don't know how the brain learns, but I suspect it's not like backpropagation. It's something that's optimized for having many, many connections and not much training data.
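
A quick arithmetic check on the orders of magnitude quoted here; the token count below is a rough assumption standing in for "trillions of training examples", so treat the numbers as illustrative rather than measured.

```python
# Illustrative arithmetic on the stated figures: chatbots have lots of data per
# connection, brains have lots of connections per second of experience.
chatbot_connections = 1e12       # "on the order of a trillion connections"
chatbot_training_tokens = 1e13   # assumed order of magnitude for "trillions of examples"
brain_connections = 1e14         # "a hundred trillion connections"
lifetime_seconds = 2e9           # "a couple of billion seconds"

print("chatbot: training examples per connection ~",
      round(chatbot_training_tokens / chatbot_connections, 1))
print("brain:   seconds of experience per connection ~",
      lifetime_seconds / brain_connections)
```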

Alok Jha: Fascinating. And also, the brain uses a lot less energy than most systems do right now, AI systems. 

Professor Geoffrey Hinton: So that brings us to something else, which is that the brain is basically analog, and these AI systems are all digital. And one reason I left Google was that I was working on how you might be able to do large language models in analog hardware, at low power. So let me just tell you the difference between analog and digital. The inner loop of all these models consists of taking the activity of a simulated neuron and multiplying it by a connection strength to get the input to some other neuron. That multiplication of a neural activity by a connection strength is done by turning them both into strings of bits, a 16-bit number for the neural activity and a 16-bit number for the connection strength, and multiplying those two numbers together, which takes on the order of 256, that is 16 squared, single-bit operations. And each of those operations has to be done at high power because you want a reliable answer. That's expensive in energy. There's a different way to do it, which is to say: let's make the neural activity just be a voltage, and let's make the connection strength just be a conductance. And then per unit time, the voltage times the conductance gives you a charge, and the neuron that's collecting all these things can just add up the charge; charge adds itself. So you can do this computation with much less energy, and that indeed is what the brain does. That's how the brain does these computations with neurons. The problem is you get a slightly different answer each time because it's analog, and different analog hardware will give you slightly different answers. So it's no use me taking my connection strengths, which are adapted to my hardware, and giving them to you to use in your hardware. It doesn't work; your brain is wired differently. What you get are systems that learn in the analog hardware. You can make the hardware much more cheaply because it doesn't need to be totally reliable; learning will take care of all the problems and weirdnesses and idiosyncrasies. But when the hardware dies, the knowledge dies. The knowledge is in the connections. When your hardware dies, your knowledge dies.
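
The accounting being described can be sketched in a few lines. The voltages and conductances below are made-up numbers, and real analog hardware does the multiply-accumulate in physics rather than in software, so this is only an illustration of the bookkeeping.

```python
# Illustrative sketch of the energy accounting: a digital 16-bit x 16-bit
# multiply costs on the order of 16 * 16 = 256 single-bit operations per
# connection, whereas in the analog scheme each connection contributes
# charge = voltage * conductance and the neuron simply accumulates charge.
voltages = [0.8, 0.1, 0.5]        # simulated neural activities
conductances = [0.2, 0.9, 0.4]    # simulated connection strengths

# Digital view: each activity * weight product is a 16-bit x 16-bit multiply.
bits = 16
digital_bit_ops = len(voltages) * bits * bits
print("digital bit-operations for this tiny neuron:", digital_bit_ops)

# Analog view: charges from all connections just sum on the wire.
total_charge = sum(v * g for v, g in zip(voltages, conductances))
print("accumulated charge (the analog dot product):", round(total_charge, 3))
```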

Alok Jha: Unless I've written it down somewhere. 

Professor Geoffrey Hinton: Yes, unless you can convey it to other people. And we can exchange knowledge, but very inefficiently, as I'm illustrating now. I produce sentences; you try to change the connection strengths in your brain so that you might have said the same thing. That conveys just a few bits per second; there are only about 100 bits in a sentence. These digital systems pay a huge price in energy, but two different copies of the hardware do exactly the same thing if they have the same weights on them, even if it's different hardware. And the advantage of that is that I can take the weights of a chatbot, I can put the same weights on a thousand different pieces of hardware, and I can have them all look at a thousand different pieces of the internet, each one looking at one piece. But as they're doing this, they keep sharing their weights; they keep averaging their weights or their weight gradients. And because it's digital, the weights mean exactly the same thing to each of them, so they really can average their weights. We can't do that. Because they can average their weights, they're sharing information at trillions of bits every time they do the averaging, because they've got a trillion weights. And that's how they learn so much. That's how they get through so much data. Even though each one is not fast, they're actually quite slow; when they're doing robotics, they're quite slow. So how can they learn so much? Well, you can have thousands of copies learning at the same time and sharing what they learn. And that's going to be the same for agentic AI. Even though it may be operating in the real world, and so may be quite slow, and you can't put the data through it faster because it's doing reinforcement learning in the real world, because you can have many copies with the same weights, and they can share what they learn, it can just learn much more than a person ever could. I came to the conclusion that these digital systems are actually a better form of intelligence than people, because they can share what they learn and you can have many copies. And they're immortal. You can destroy all the hardware, and as long as you kept a copy of the weights somewhere, you can just build more hardware that executes exactly the same instruction set, and now it's come to life again. It's immortal. Of course, all the white men have dreamt for a long time of putting themselves on a computer. Marvin Minsky dreamt about that; Kurzweil dreams about that. That's never going to happen. We're analog. We're mortal. We've solved the problem of immortality, but it's only for digital systems. You can't convert analog systems like that into digital systems.
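
The weight-sharing idea can be sketched with a toy linear model standing in for a chatbot (the learning rate, data, and number of copies are illustrative): several copies with identical weights each see different data, and averaging their gradients lets every copy learn from everything all of them saw, which is the essence of data-parallel training.

```python
# Illustrative sketch: identical copies see different data, compute their own
# gradients, and average them, so all copies benefit from all the data.
import numpy as np

rng = np.random.default_rng(0)
n_copies, dim = 4, 5

true_w = rng.normal(size=dim)   # target the copies are trying to learn
shared_w = np.zeros(dim)        # identical weights on every copy
lr = 0.1

for step in range(200):
    grads = []
    for _ in range(n_copies):
        # Each copy sees its own slice of data (here: fresh random examples).
        X = rng.normal(size=(8, dim))
        y = X @ true_w
        pred = X @ shared_w
        grads.append(X.T @ (pred - y) / len(y))  # gradient of mean squared error
    # Averaging gradients is equivalent to every copy learning from all the data.
    shared_w -= lr * np.mean(grads, axis=0)

print("error after averaging across copies:",
      round(float(np.linalg.norm(shared_w - true_w)), 4))
```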

Alok Jha: That's exciting, and also worrying in some ways. 

Professor Geoffrey Hinton: It's very worrying. 

Alok Jha: And let me move on to that in the last few minutes. You left Google in 2023 to be able to speak more openly about your concerns. Given that you helped to create this technology, what is it about it that worries or concerns you the most as you see it taking off exponentially?

Professor Geoffrey Hinton: So we're still very primitive in a lot of our beliefs. A lot of us have primitive pre-scientific beliefs, but science has developed so far that we're now able to create alien beings. With agentic AI, we're creating beings, and we have very little idea what's going to happen. Most of the researchers I know, in fact nearly all the researchers I know, believe that we're going to create things much more intelligent than ourselves. They differ on the time scale. Some people believe it's going to be in the next couple of years, which I think is crazily optimistic. Some people think it may take 50 years. I think it's going to be faster. I think it's going to be between 10 and 20 years before we have things more intelligent than ourselves. 

Alok Jha: Do you mean AGI, that sort of phrase that people use? 

Professor Geoffrey Hinton: I prefer to just talk about superintelligence. AGI sometimes means things more intelligent than us, sometimes equally intelligent. As soon as we have things equally intelligent, they'll quickly get more intelligent. So let's talk about things that will be definitely more intelligent than us in almost everything. We're going to create them, and we don't know how to keep them safe. We don't know enough about these alien beings to know: will they really have desires of their own that are quite different from ours? Will we be able to design them so they won't just take over from us? We know of very few examples of much more intelligent things being controlled by much less intelligent things. We have many examples in human society of slightly dumber things controlling slightly more intelligent things; every professor will talk about that. But we don't have examples of big gaps. We're currently in the situation of someone who has a very cute tiger cub. Tiger cubs are great, and they're not as powerful as you are, and they're just fine, but they're going to grow up, and unless you can be very sure they're not going to want to kill you, you should worry. You should be working very hard now on how to ensure they're not going to want to kill you, and we're not. We are putting very little effort, compared with the effort going into developing these things, into how we can make things more intelligent than us that will be benevolent. We just don't know how to do it, and we should be putting a lot of work into that. The big companies aren't going to do it because they're after short-term profits; they're legally required to go after short-term profits, at least the public ones. The countries aren't going to do it because they're competing with each other in the short term. I don't have a way of... there's not a good solution to this. With climate change, there's a good solution: just stop burning carbon and you'll be fine. The oil companies don't want to do it, but maybe we can persuade them in the end. The best I can suggest is that people put pressure on governments to put pressure on the big companies to do much more work on safety. I don't think we're going to slow it down. There are so many good uses and there are so many different competitors that if one lot slows it down, another will keep going. So, for example, Musk signed the petition some time ago that we should take a six-month pause. I think that was probably so Musk could catch up. I didn't sign that petition, because I think there's no hope of taking a pause. There's too much competition and too much immediate payoff. So what we have to do is put a lot of work into how to develop it safely. It's not clear that we can. We may be toast, but it would be a shame if we are toast because we failed to work on it.

Alok Jha: We may be toast, at that. I'm going to let that hang. I believe there are some questions in the audience, so if the... I've got the microphone. We've got a few minutes, so one minute to get the questions in, and then we've got two minutes to answer them. Please go ahead, and tell us who you are as well.

Phil: I'll be handling the questions. My name is Phil; I'm one of the stage hosts, and we have a few questions. The first one I would like to offer to the president of Mozilla.

Mark Surman: Hi, Professor Hinton. Great to hear you. Like you, I'm a Toronto resident. I wanted to pick up on that last question, because it feels like we're in a bit of a prisoner's dilemma on how we do safety and governance, which is: do we trust a few players, whether you can pressure those big companies or lean on governments, and lock things down so they have to invest in safety and we trust them to look after it? Or do we open it up to all of humanity and sort of trust that collectively we will make that investment in safety? So how do you thread that balance?

Professor Geoffrey Hinton: The idea of opening it up to all humanity sounds like open sourcing, which may be what you mean. But when you open source a big model and give people the weights, that's not at all like open source; open weights is not like open source. With open source, you get thousands or millions of people looking at lines of code and saying: wait a minute, there's a problem here. You don't get people looking at the weights and saying this one should be a little bit bigger. What you get is people using that open-weights model and training it, refining it, to do something bad like cyberattacks or making biological weapons, because it's much easier to refine the model than to train it from scratch. So my belief is that open-sourcing weights is a bit like making fissile material freely available. The reason we don't all have atomic bombs is because it's hard to get fissile material. There are other problems, but the main problem is that it's expensive. Open-sourcing weights is like making fissile material free, so I think that's crazy. I think we have to do something, and I don't think we can rely on the big companies. The only route I can see to making it safe against the existential threat is to have people pressure governments to regulate companies and force the companies to do a lot of work on safety.

Alok Jha: One more question from the audience if we've got one minute.

Christian Darkin: Christian from Deep Fusion Films. I was very interested in what you were saying about the speed of development of AI. Given that it looks like we're getting AGI very soon, and by AGI I'm talking about computers that can do basically what we can do in the workplace, basically as well as we can do it, in almost any field: what does an economy look like when it hasn't got any jobs in it?

Professor Geoffrey Hinton: That's a big problem we have to solve. Some economists will tell you that new technology has always created new jobs, but this technology will replace mundane intellectual human labor in much the same way as machines replaced people who dug ditches. People don't dig ditches anymore, machines do, and digging ditches isn't a good career anymore. I think there's going to be massive job losses. That's my personal opinion, and governments have to figure out what to do about it. A good first step is things like universal basic income, so people don't starve, but that doesn't solve the problem of human dignity, because a lot of people feel their worth is tied to the job they do. We don't know what to do about that. It's kind of crazy, because a huge increase in productivity, which AI will give us, should be good for all humanity, and if the wealth were spread equally it would be. But what's going to happen is that the people who lose their jobs will get poorer and the people who fire them will get richer, and that's going to be terribly bad for society. We know that that gap between the rich and the poor is what makes societies violent, and we'll see a lot more of that.

Alok Jha: Geoffrey, thank you so much for your time.
