LLMs Don’t Know Who Said What, And That Irks You

Regular users of LLMs (AI chatbots) learn to expect their errors ("hallucinations") and shrug them off. But there's a particular sub-type of error that, while it may not occur often, carries an outsized effect because of its unique social vibe: LLMs get Who Said What wrong. That isn't just a simple factual error; it's socially clueless.

When we (people) are in a conversation, we have an obvious distinction between things we said and things others said. (So obvious it feels funny to even mention.) There's a vivid picture of the self in that memory, with clear statement attributions, i.e., our brains strongly tag statements by their origin.

No, "tag" doesn't quite cover it. LLMs may indeed "tag" statement origins, along with a timestamp, noting any file attachments, etc. For humans, that directionality is not just metadata; it's the conversation itself.

Words and ideas travel entirely different neural pathways depending on whether they come from the self or from the other.

From the self: synthesizing an idea, speaking it out loud. From others: hearing, processing, and understanding the idea of another.

Statements in a two-way conversation have an innate polarity, I-said/You-said, plain as black and white.

LLMs don't have that.

To an LLM, it's all just context window: a flat transcript of statements with no weight given to their origin. It could be a conversation between any two people. To an LLM, there is no persistent "self" in the past conversation.
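Concretely, chat interfaces typically hand the model its history as a flat list of role-tagged messages, serialized into one block of text by a chat template. The role names and template below are a simplified sketch of that common convention, not any specific API:

```python
# A past conversation, as a chat system stores it: each turn is a
# dict with a role label and the text of the message.
transcript = [
    {"role": "user", "content": "Can you write a sort function?"},
    {"role": "assistant", "content": "def sort(xs): return sorted(xs)"},
    {"role": "user", "content": "It crashes on None inputs."},
]

def render(messages):
    """Serialize the transcript the way many chat templates do:
    every turn becomes one tagged line in a single text blob."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)

print(render(transcript))
```

Note that after `render`, "who said what" survives only as literal tokens like `assistant:` in the text. The model that reads this blob never *spoke* the assistant lines; it just sees them labeled, the way a newcomer reads a transcript.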

For this concept of "statement attribution" I offer a catchier name: Who Said What, or WSW; getting it wrong, i.e., misattribution, is a WSW Error.

LLMs routinely lose track of WSW, in a way that feels strange and wrong to humans. Namely:

LLM: "Here's the bug in your code..."
Human: 💭 No, that was your code. You gave it to me 4 messages ago.
LLM: "I have solved this puzzle - the solution is..."
Human: 💭 No, you were stumped. I JUST gave you the answer.

WSW errors might be forgivable coming from someone who just arrived in the chat and got up to speed via the transcript. Consider that, in effect, that is exactly "who" is speaking in each LLM message.

Why this feels weird

1. It's simply weird, and just when the LLM was starting to sound normal. This mistake isn't relatable, like a technical error we'd make ourselves. LLM social skills are uneven: they can be superbly relatable, and then - whoops. It's disorienting and unfamiliar, and it lands more like a social slip-up than a technical one. We're used to conversation partners who track WSW.

2. Proximity makes it worse. The error isn't about some factoid buried deep in the web. The LLM didn't hallucinate a little detail because the real info was unavailable.

The misattributed statement is still on screen, highlighting the LLM's tone-deafness. "You just gave me that code 2 minutes ago - I'm looking right at that message," the person thinks. "How can you not know who wrote it?"

A mistake about something so simple and easily verified undermines confidence in the AI's answers to actual hard questions.

3. Ego bruises. It may be coincidence, but my two examples were slights to the human (me): blamed for the bug, denied credit for the answer. Ego contributes to the human impact of the mistake, perhaps why these examples came to mind. LLMs don't account for the social cost of WSW Errors. We'll take it on faith that LLMs operate without ego, and that WSW errors are merely random or mechanistic, not callous or dismissive.

4. WSW matters - to us. Humans don't get this backwards, because WSW is essential to our narrative. The origin of a statement shapes its semantics as much as the words themselves.

Perhaps LLMs hardly notice WSW because they lack ego. Their "WSW value" is that of an impartial observer who just walked in on the conversation, taking over for the previous LLM.

Improving: Attribution is all you need

(Didn’t the Beatles say that? Or was it the Stones? No matter. Who can keep track of He said this or She said that…)

LLMs' Emotional Intelligence would improve with better modeling of WSW, in parallel with the human process. An LLM with a higher WSW-IQ would take more care with attribution and understand how humans perceive its use. The better the LLM's representation of human psychology (ref: Theory of Mind), the richer the interaction, and the fewer the faux pas and human ego bruises.

In a practical light, this specific mistake can be flagged as high-priority to avoid, on account of human feelings. An LLM that understands the cost of WSW errors will naturally avoid attribution altogether when it isn't necessary - which would have been a perfect approach in the first example:

LLM:"I see the bug in that code"...

My point isn't that attribution should simply be avoided, but that it matters and should be applied with care.

