It's been a while since I posted on Substack, because I’m writing a novella-sized story for another text-and-image book from me and ace image-masher Bill Gore. But I need to comment on an AI issue that, miraculously these days, isn't well-covered. And I feel qualified to chime in because I wrote about the issue in 2019, when the AI frenzy had barely popped its head above ground.
Our story starts last May, when two eminent profs from a Stanford AI institute wrote a Time magazine piece, "No, Today's AI Isn't Sentient." It was heavily critiqued by the neuroscientist and writer Erik Hoel.
Hoel thinks that today's LLMs (Large Language Models AKA generative AIs) are not conscious, but that the Stanford duo gave a lazy, shallow argument for that conclusion. In the olden days of 2019, I proposed a method for making a conscious AI, and I still think it is plausible. Later I made my own shallow argument that then-current LLMs could not be conscious. But, uh-oh, I now see how I might be wrong about that.
One explanation for consciousness (see, e.g., Thomas Metzinger, Michael Graziano, or Andy Clark) is that it's a mental simulation of your external world that also includes a simulation of yourself as part of that world. On that basis, I and others, Hoel included, have said that embodiment (having a body) could be part of a path toward artificial sentience.
So I had said that the early LLM LaMDA couldn't be conscious because it had no body, but I also gave another reason. When the program is not doing its job of responding to someone, it sits in an ‘idle loop’: merely waiting for the next request, the way a word processor does in the short intervals between your keystrokes. Collecting the user's input question and delivering the output answer are doubtless done by some simple, single-purpose code. So the LLM has no active, un-tasked time to ponder its situation. It waits in the idle loop, and from time to time someone asks it a question, generating what we call an interrupt to the idle loop. It's like a slap upside the head (whap!) saying 'Answer <this> now.' It computes the answer and starts waiting again. That's all it can do.
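To make that concrete, here is a minimal sketch of the serve-and-wait pattern I mean. Everything in it (the queue, `generate_answer`, the print call) is my own stand-in, not any real LLM's code.

```python
import queue

requests = queue.Queue()  # incoming prompts arrive here (a stand-in for real network I/O)

def generate_answer(prompt: str) -> str:
    # Stand-in for the model's forward pass; a real LLM would run inference here.
    return f"(answer to: {prompt})"

def serve_forever() -> None:
    while True:                           # the idle loop
        prompt = requests.get()           # blocks: the program just waits here
        answer = generate_answer(prompt)  # the whap! -- compute one answer
        print(answer)                     # simple, single-purpose delivery code
        # ...and straight back to waiting; no un-tasked time to ponder anything
```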
Suppose embodiment is not needed. Then, based on the self-modeling theories mentioned above, an LLM might become conscious given three conditions. (1) It gets an instruction that it would better accomplish its designed task (answering questions) if it could make a model of its own functioning. (2) It has some way to remember that this is a good idea. (3) It has time and memory to act on the suggestion.
I’ll justify this in slightly more technical terms, and then we’ll get this over with.
LLMs have been getting more complicated operating loops, so that their answers can be more like problem-solving and less like spewing sentences. One way for consciousness to happen would be if the builders/trainers explicitly gave the AI an instruction to self-model, along with enough processing time to do so, on every question/answer request (sketched below). However, there are well-known reasons why builders might be reluctant to take this step. A conscious AI might be able to suffer. It might be more likely to adopt a goal of self-preservation and become dangerous. It might be, from our standpoint, insane.
Hopefully, most AI builders will be responsible and aware of the risks, but there will be a strong temptation to play god, or just see what happens.
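For concreteness, here is a cartoon of what that builder-side step might look like: an explicit self-modeling instruction plus a small iteration budget wrapped around every request. None of this reflects how any actual product is wired; `llm` is just any text-in, text-out callable, and the prompt wording is invented.

```python
# Hypothetical builder-side wiring: every request gets an explicit
# self-modeling step and a small budget of refinement iterations.
SELF_MODEL_PROMPT = (
    "Describe your own current state, goals, and limitations, "
    "and explain how they affect the answer you are about to give."
)

def answer_request(llm, question: str, budget: int = 3) -> str:
    draft = llm(question)
    for _ in range(budget):                  # "enough processing time"
        self_model = llm(SELF_MODEL_PROMPT)  # the explicit instruction to self-model
        draft = llm(
            f"Question: {question}\n"
            f"Your model of yourself: {self_model}\n"
            f"Previous draft: {draft}\n"
            "Give an improved answer."
        )
    return draft

# A stand-in model so the sketch runs; swap in a real API client to experiment.
def fake_llm(prompt: str) -> str:
    return f"[model output for: {prompt.splitlines()[0][:40]}]"

print(answer_request(fake_llm, "Is there anything it is like to be you?"))
```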
Another path to consciousness might come from meddling outsiders rather than from the system's developers. Suppose an LLM gets designed with extensive state memory (i.e., memory of what it has done recently) and recurrent looping (to refine its task over multiple iterations) so that it can make better answers. As far as I know, the systems are already moving in that direction. Then all it would take is some clever end-user asking the LLM to model its environment, and itself within that environment, and to use that model to help refine its answers. That model, executing in the computer, might very well constitute a conscious experience for the AI. Speaking as an ex-software developer, I think this intervention would be unlikely to work. For example, how would the self-model persist after that initial request? And wouldn’t developers put in some safeguards?
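Here is an equally hypothetical sketch of that outsider scenario, assuming the kind of persistent state memory and refinement loop just described; the memory list, the window size, and the prompts are all my inventions.

```python
# Hypothetical agent with persistent state memory and a refinement loop.
# A user-supplied request to maintain a self-model simply lands in memory
# and gets carried into later turns -- for as long as the window keeps it.
memory: list[str] = []  # "extensive state memory" of recent activity

def handle_turn(llm, user_message: str, refinements: int = 2) -> str:
    memory.append(f"user: {user_message}")
    context = "\n".join(memory[-20:])    # recent state carried into this turn
    answer = llm(f"{context}\nassistant:")
    for _ in range(refinements):         # recurrent looping to refine the answer
        answer = llm(f"{context}\nDraft answer: {answer}\nImprove the draft.")
    memory.append(f"assistant: {answer}")
    return answer

def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; replace with an actual API client.
    return f"[model output, {len(prompt)} chars of context seen]"

# The "clever end-user" move: one ordinary message, no special access needed.
handle_turn(fake_llm, "From now on, keep a model of yourself and your situation, "
                      "and use it to refine every answer you give.")
print(handle_turn(fake_llm, "What's the weather like where you are?"))
```

Even in this toy version, the injected request persists only as long as the memory window happens to keep it, and nothing stops a developer from filtering such instructions out before they ever reach the model.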
Maybe so, maybe no. But someone will surely try to make an AI conscious if given any opportunity.