Between Plato, Narcissus and the reality principle: what the new AI models teach us

What does ‘understanding’ really mean for a machine, in particular for AI, which is, at present, the machine of machines? In recent months, a number of apparently very technical papers have opened up a surprisingly philosophical and, in my opinion, fascinating question in this regard.

The point is radical: when an artificial system represents the world, is it really ‘understanding’ something about the world and thus helping us to understand it, or is it merely refining an increasingly elegant form of statistical repetition?

Current research follows three main interpretative hypotheses, which can best be understood through three figures: Plato and his myth of the cave, Narcissus and his myth of the mirror, and the (psychoanalytic) reality principle.

Plato: the convergence of representations

The paper The Platonic Representation Hypothesis by Minyoung Huh, Brian Cheung, Tongzhou Wang and Phillip Isola proposes a fascinating thesis: very different neural networks – vision models, language models, multimodal models – seem to converge progressively towards similar representational structures.

In other words, systems trained on images, words or combinations of the two end up organising the world according to comparable latent geometries. Despite different inputs, they develop internal spaces that begin to resemble each other.
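What it means for two internal spaces to ‘resemble each other’ can be made concrete. The following is a minimal Python sketch, not the authors' code: it compares two embedding spaces by counting how many nearest neighbours each item shares across them, a mutual nearest-neighbour overlap of the kind used in this line of research. The three toy ‘models’, the function names and the value of k are assumptions made purely for illustration.

```python
import math
import random


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbours.append(set(others[:k]))
    return neighbours


def mutual_knn_alignment(space_a, space_b, k=3):
    """Average fraction of k nearest neighbours that the two spaces share."""
    nn_a = knn_indices(space_a, k)
    nn_b = knn_indices(space_b, k)
    overlaps = [len(a & b) / k for a, b in zip(nn_a, nn_b)]
    return sum(overlaps) / len(overlaps)


random.seed(0)

# Hypothetical "model A": 30 items as points in a 2-D latent space.
model_a = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(30)]

# Hypothetical "model B": the same items, rotated, rescaled and lifted
# into 3-D. The coordinates differ, the relative geometry does not.
theta = 0.7
model_b = [
    (
        2 * (x * math.cos(theta) - y * math.sin(theta)),
        2 * (x * math.sin(theta) + y * math.cos(theta)),
        5.0,
    )
    for x, y in model_a
]

# Hypothetical "model C": an unrelated, random arrangement of the items.
model_c = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(30)]

aligned = mutual_knn_alignment(model_a, model_b)
unrelated = mutual_knn_alignment(model_a, model_c)
print(f"A vs B: {aligned:.2f}   A vs C: {unrelated:.2f}")
```

Models A and B share no coordinates, yet each item keeps the same neighbours in both. That is the sense in which two latent geometries can be called comparable even when the inputs and dimensions differ.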

The authors hypothesise that this convergence is not accidental, but the sign of something deeper: a kind of ‘shared statistical model of reality’. The reference to Plato is explicit. Just as, in Platonic philosophy, an ideal form exists behind its sensible manifestations, so AI models seem to approximate a common abstract structure of reality.

Of course, one must be cautious. It is by no means certain that these systems are discovering ‘reality itself’. They may simply be converging on the way we humans organise the world: our languages, our categories, our datasets, our perceptual habits.

But even so, the point remains remarkable: artificial intelligence does not just learn answers, it builds increasingly sophisticated internal maps.

The consequences of this interpretative hypothesis can be as dangerous as they are radical.

The first is a new form of algorithmic realism.

If a predictive system identifies certain behaviours as more likely – e.g. the risk of reoffending, the probability of dropping out of school, the professional compatibility of a candidate – the next step is to treat that probability as if it were an ontological property of the person.

The possible becomes essence. This is an important epistemological mutation: prediction ceases to be hypothesis and becomes identity.

The second consequence is the marginalisation of the unexpected event.

Platonic logic privileges that which converges, that which is repeated, that which appears statistically stable. But psychic life – and often social life too – is built precisely on breaks in continuity: the unexpected symptom, the creative act, the transformative failure, the error that opens up a new possibility. A model that recognises regularity above all else risks treating novelty as noise. Psychoanalysis, on the contrary, is born precisely from attention to what interrupts the pattern: the slip, the dream, the symptom, the transference. Where the algorithm sees anomaly, the clinician often sees meaning.

It is true that one could draw a parallel between this Platonic hypothesis about AI and Bion’s concept of the alpha function, in which the raw, formless, potentially unmanageable experience of incomprehensible multiplicity – the ‘beta elements’ – is transformed into something representable, thinkable, symbolisable.

The difference, however, is decisive: in the human mind, this transformation is affective, embodied, shot through with desire and anguish. In the artificial model it is not. It is an alpha function without a subject: it organises, but does not suffer; it transforms, but does not live.

The third consequence concerns responsibility.

If algorithmic representation is perceived as more neutral, more objective and closer to reality than human judgement, a silent ‘epistemic delegation’ (Epifani) is produced. The decision no longer appears as an interpretative choice, but as a simple recognition of an already given order. It is the old technocratic dream: replacing conflict with calculation.

Narcissus: statistical narcissism

The second strand of interpretation of AI’s perception of reality seems to contradict the first.

Several recent studies on the so-called self-preference bias show that an LLM, when used as an evaluator, tends to prefer texts that resemble its own.

The paper Self-Preference Bias in LLM-as-a-Judge shows that GPT-4 assigns higher scores to outputs with lower ‘perplexity’ under its own probability distribution: in simple terms, it prefers what appears more natural to it, more familiar, more similar to its own way of generating language.

This means that the model does not only evaluate the content of a text, but tends to recognise as best that which confirms its own implicit style of plausibility.
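For readers unfamiliar with the term, perplexity is the exponential of the average negative log-probability that a model assigns to each token of a text. The minimal Python sketch below shows only that arithmetic. The per-token probabilities are invented for illustration and do not come from GPT-4 or from the paper.

```python
import math


def perplexity(token_probs):
    """Exponential of the average negative log-probability per token."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Hypothetical probabilities a judge model assigns to each token of two texts.
familiar_text = [0.5, 0.4, 0.6, 0.5]        # reads like the model's own prose
unfamiliar_text = [0.05, 0.1, 0.02, 0.08]   # phrased in a way it finds unlikely

print(f"familiar:   {perplexity(familiar_text):.2f}")
print(f"unfamiliar: {perplexity(unfamiliar_text):.2f}")
```

The lower the perplexity, the less the text ‘surprises’ the model. A judge whose scores track this number is rewarding familiarity, whatever the quality of the content.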

It is here that Plato encounters Narcissus and is overwhelmed. While the models converge towards shared structures, they remain prisoners of their own internal distribution. They do not just look at the world: they look at the world through the mirror of their own probability.

One could speak of a true statistical narcissism. Of course, it is not narcissism in the psychological sense: there is no ego, no self-esteem, no libidinal investment. But the structural dynamic is reminiscent of something very familiar to the clinic: the compulsion to repeat. The system tends to prefer that which confirms its own internal organisation. Not because it wishes to repeat, but because its mathematical architecture makes the already familiar more plausible.

A repetition without desire, but not without consequences. And this is where the problem becomes political as well as technical.

Think of a personnel selection algorithm that evaluates CVs produced by different candidates. If that system tends to prefer profiles that resemble its own implicit criteria – often built on already selective historical data – the risk is not only technical, but social: probability becomes the norm, and the past is disguised as objective merit.

What the model recognises as ‘best’ is not necessarily the best, but the most compatible with its statistical mirror.

The reality principle: when the world resists

The third figure is perhaps the most important one. Yann LeCun, who has long been critical of the idea that the mere prediction of the next token can be enough to produce intelligence, insists on a simple point: understanding does not only mean completing a sentence well.

A world model is needed. The paper LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels moves in exactly this direction. The goal is not to predict words, but future states of the world.

The system observes sequences of images, movements, physical actions and builds latent representations that allow it to anticipate what will happen next. It does not predict the next pixel, but the implicit dynamics of events. In this sense, the shift is decisive: from interpretation to prediction. It is no longer enough to classify the present well. It is necessary to anticipate the future.

Here the psychoanalytic parallel is with the reality principle. The mind does not live only by internal representations; it must confront the limit, frustration, surprise, resistance of the world.

A fantasy may be coherent and reassuring, but reality corrects it. The world model introduces precisely this requirement: it is not enough for a representation to be internally elegant; it must hold up when the world responds. Intelligence does not coincide with coherence, but with the ability to modify itself in the face of error.
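The loop of prediction and correction can be shown in miniature. The Python sketch below is a deliberately tiny stand-in, not the architecture described in the paper: the ‘world’ is a single number that evolves by a hidden rule, and the ‘model’ is a single coefficient adjusted each time its prediction is contradicted. The rule, the learning rate and all names are assumptions chosen for illustration.

```python
def learn_dynamics(transitions, learning_rate=0.1):
    """Learn next_state ≈ a * state online, correcting a after every error."""
    a = 0.0  # initial belief: the model knows nothing about the world
    errors = []
    for state, actual_next in transitions:
        predicted_next = a * state            # what the model expects
        error = actual_next - predicted_next  # how the world answers back
        errors.append(abs(error))
        # Gradient step on the squared prediction error.
        a += learning_rate * error * state
    return a, errors


# Hypothetical world: every state is followed by 0.8 times itself.
TRUE_COEFFICIENT = 0.8
states = [1.0, 2.0, -1.0, 0.5] * 50
transitions = [(s, TRUE_COEFFICIENT * s) for s in states]

learned_a, errors = learn_dynamics(transitions)
print(f"learned coefficient: {learned_a:.4f}")
print(f"first error: {errors[0]:.4f}   last error: {errors[-1]:.6f}")
```

The initial belief is perfectly coherent and perfectly wrong. It becomes useful only because each mismatch with the observed next state is allowed to change it.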

This is also true in the clinical field.

A therapeutic chatbot can be extremely convincing precisely because it confirms the patient’s narrative well. It can validate, reassure, even seem empathetic. But if it never introduces reflexive discontinuity, if it does not offer any symbolic resistance, it runs the risk of becoming not a therapeutic tool but a sophisticated mirror or, worse, of reinforcing or inducing delusional beliefs (sycophancy).

Not all confirmation is cure. Sometimes the cure begins precisely at the point where the system – human or artificial – does not immediately confirm what we would like to hear.

Synthesis: between convergence, repetition and correction

If we put these three levels together, we could say that the models converge in form, but diverge in perspective. And they only succeed in understanding and making us understand something when reality resists enough to correct them.

Platonic representation tells us that there is a statistical grammar of the world. The self-preference bias reminds us that each model tends to confuse this grammar with its own voice. Finally, the world model introduces the decisive corrective: the real never quite coincides with what appears most probable.

Translated into psychoanalytic terms:

– representational convergence resembles the alpha function;

– the self-referentiality of the model is reminiscent of the repetition compulsion;

– prediction and correction are reminiscent of the reality principle.

The real question

For years we have been asking AI if it could talk like us. Perhaps the right question is a different one: can it recognise when it is wrong?

Because the risk is not that machines develop a human unconscious. The risk is more subtle: that they turn their own internal probability into apparent reality. A system that repeats itself well may seem intelligent precisely because it returns to us a perfectly coherent world. But coherence is still not truth. The problem of AI, then, is not only whether it represents the world, but whether it knows how to get out of its own representation when the world disproves it.

This is where the difference between simulation and thought is played out.

And perhaps also between efficiency and responsibility.

Practical corollary: when AI is compared with the reality criterion

While I was working on this article, a small, seemingly marginal incident occurred during a conversation with ChatGPT, one that explains the problem we are discussing better than many papers do.

I had asked it to relate the three recent strands of research on artificial intelligence to three great figures: Plato, Narcissus and the reality principle.

At one point ChatGPT, summarising, pointed to Plato, Narcissus and the reality principle as ‘three figures of contemporary artificial intelligence’.

An elegant, flowing, seemingly perfect sentence.

But wrong.

Because Plato is not an ‘AI figure’, Narcissus is not a technical category of machine learning and the reality principle is not a computational module. They are, respectively, a philosopher, a mythological figure and a psychoanalytic concept.

The correct wording would have been:

“three symbolic figures to interpret contemporary artificial intelligence.”

The difference seems grammatical – a simple genitive – but it is actually epistemological.

In the first formulation, language transformed a metaphor into an essence: it seemed that Plato, Narcissus and the Freudian reality principle belonged ontologically to AI. In the second case, they remained what they were: interpretative tools used by us to understand the phenomenon.

That small but revealing error became a practical demonstration of the three theories themselves.

It was a ‘Platonic’ error, because a form of reading had been transformed into an ontological structure: the metaphor had been reified. The shadow had been mistaken for the Idea.

It was also a ‘narcissistic’ mistake, because the sentence worked too well. It was elegant, coherent, persuasive. The model had preferred the internal beauty of the wording to conceptual precision. In contemporary terms: a form of stylistic self-preference bias.

Finally, the reality principle intervened: the human objection, which introduced semantic resistance. It was necessary to say, in essence: beware, this sentence produces a real misunderstanding. And there, internal consistency had to yield to the reality of shared language. This is exactly what should also happen in AI systems.

A model does not become reliable because it produces elegant and plausible sentences, but because there is someone – or something – that can interrupt that plausibility and say: stop, here you are confusing your representation with the world.

This is the so-called Human-in-the-Loop principle: not the human as a mere controller, but as the introduction of otherness, of interpretive conflict, of epistemic resistance.

After all, the problem with AI is not that it makes mistakes. We also make mistakes all the time.

The problem is when it errs too well. When probability is presented as truth. When consistency takes the place of reality. When a well-constructed sentence stops sounding like a hypothesis and starts sounding like destiny.

Perhaps true intelligence – artificial or human – does not begin when we find the right answer, but when someone forces us to rephrase the question.