I have a question: why is it that when a chatbot makes stuff up we call
it hallucinating? If I say something wrong, there are lots of people out
there for whom I'm either Douglas Adams kinds of wrong, or (purely to
avoid rule lawyering) I failed to give context, or I'm just making
things up (the fine and noble art of the white lie).

So why is it that chatbots get called hallucinating, while a person is
either making things up or has just confused someone? Why do we say "the
chatbot hallucinated" rather than "it started pulling ********** out of
its digital and highly trained electronic backside" (aka BSing and
telling fat lies)?

For instance, years ago ChatGPT would say 2+2 is 4, the internet trolls
went "oh really?", and soon it was 2+2=5 according to ChatGPT (i.e.,
making stuff up!). Why is that called hallucinating rather than BSing
and making sh**** up?

On Thu, Sep 11, 2025 at 11:30 AM Marcus Daniels <[email protected]>
wrote:

> It seems to me that we are full circle back to a Turing Test.  If the LLM
> encodes and demonstrates skill (they certainly do), and these skills can
> progress a solution of some real-world problem, then it is just empty
> chauvinism to say they don’t understand a topic.
>
>
>
> *From: *Friam <[email protected]> on behalf of Steve Smith <
> [email protected]>
> *Date: *Thursday, September 11, 2025 at 10:12 AM
> *To: *[email protected] <[email protected]>
> *Subject: *Re: [FRIAM] Hallucinations
>
> I find LLM engagement to be somewhere between that with a highly
> plausible gossip and a well researched survey paper in a subject I am
> interested in?
>
> Where a given conversation lands in this interval almost exclusively
> seems to rely on my care in crafting my prompts.
>
> I don't expect 'truth' out of either gossip or a survey paper... just
> 'perspective'?
>
> On 9/11/25 10:55 am, glen wrote:
> > OK. You're right in principle. But we might want to think of this in
> > the context of all algorithms. For example, let's say you run a FFT on
> > a signal and it outputs some frequencies. Does the signal *actually*
> > contain or express those frequencies? Or is it just an inference that
> > we find reliable?
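> > (A toy example of what I mean, in numpy, with a made-up signal: we
> > *build* the signal from 50 Hz and 120 Hz components plus noise, and
> > the FFT dutifully reports peaks at 50 and 120. Whether the signal
> > "contains" them or the transform just infers them reliably is the
> > question.)
> >
> >     import numpy as np
> >
> >     fs = 1000                        # sample rate in Hz (made up)
> >     t = np.arange(0, 1, 1 / fs)      # one second of samples
> >     x = (np.sin(2 * np.pi * 50 * t)
> >          + 0.5 * np.sin(2 * np.pi * 120 * t)
> >          + 0.3 * np.random.randn(t.size))
> >
> >     spectrum = np.abs(np.fft.rfft(x))
> >     freqs = np.fft.rfftfreq(t.size, d=1 / fs)
> >     # the two strongest bins -- expect roughly 50 and 120
> >     print(np.sort(freqs[np.argsort(spectrum)[-2:]]))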
> >
> > The same is true of the LLM inferences. Whether one ascribes truth or
> > falsity to those inferences is only relevant to metaphysicians and
> > philosophers. What matters is how reliable the inferences are when we
> > do some task. Yelling at the kids on your lawn doesn't achieve
> > anything. It's better to go out there and talk to them. 8^D
> >
> >
> > On 9/10/25 8:38 PM, Russ Abbott wrote:
> >> Glen, I wish people would stop talking about whether LLM-generated
> >> sentences are true or false. The mechanisms LLMs employ to generate a
> >> sentence have nothing to do with whether the sentence turns out to be
> >> true or false. A sentence may have a higher probability of being true
> >> if the training data consisted entirely of true sentences. (Even
> >> that's not guaranteed; similar true sentences might have their
> >> components interchanged when used during generation.) But the point
> >> is: the transformer process has no connection to the validity of its
> >> output. If an LLM reliably generates true sentences, no credit is due
> >> to the transformer. If the training data consists entirely of
> >> true/false sentences, the generated output is more likely to be
> >> true/false. Output validity plays no role in how an LLM generates its
> >> output.
> >>
> >> Marcus, if an LLM is trained entirely on false statements, its
> >> "confidence" in its output will presumably be the same as it would be
> >> if it were trained entirely on true statements. Truthfulness is not a
> >> consideration in the generation process. Speaking of a need to reduce
> >> ambiguity suggests that the LLM understands the input and realizes it
> >> might have multiple meanings. But of course, LLMs don't understand
> >> anything, they don't realize anything, and they can't take meaning
> >> into consideration when generating output.
> >>
> >>
> >>
> >>
> >>
> >> On Tue, Sep 9, 2025 at 5:20 PM glen <[email protected]> wrote:
> >>
> >>     It's unfortunate jargon [⛧]. So it's nothing like whether an LLM
> >> is red (unless you adopt a jargonal definition of "red"). And your
> >> example is a great one for understanding how language fluency *is* at
> >> least somewhat correlated with fidelity. The statistical probability
> >> of the phrase "LLMs hallucinate" is >> 0, whereas the prob for the
> >> phrase "LLMs are red" is vanishingly small. It would be the same for
> >> black swans and Lewis Carroll writings *if* they weren't canonical
> >> teaching devices. It can't be that sophisticated if children think
> >> it's funny.
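> >> (Purely as a sketch, you could eyeball that with a small
> >> off-the-shelf model -- say GPT-2 via the transformers library --
> >> standing in for "the corpus", by comparing the total log-probability
> >> it assigns to each phrase:)
> >>
> >>     import torch
> >>     from transformers import AutoModelForCausalLM, AutoTokenizer
> >>
> >>     tok = AutoTokenizer.from_pretrained("gpt2")
> >>     model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
> >>
> >>     def total_log_prob(text):
> >>         ids = tok(text, return_tensors="pt").input_ids
> >>         with torch.no_grad():
> >>             # loss is the mean negative log-likelihood per predicted token
> >>             out = model(ids, labels=ids)
> >>         return -out.loss.item() * (ids.size(1) - 1)
> >>
> >>     for phrase in ["LLMs hallucinate", "LLMs are red"]:
> >>         print(phrase, round(total_log_prob(phrase), 2))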
> >>
> >>     But imagine all the woo out there where words like "entropy" or
> >> "entanglement" are used falsely. IDK for sure, but my guess is the
> >> false sentences outnumber the true ones by a lot. So the LLM has a
> >> high probability of forming false sentences.
> >>
> >>     Of course, in that sense, if a physicist finds themselves talking
> >> to an expert in the "Law of Attraction" (e.g. the movie "The Secret")
> >> and makes scientifically true statements about entanglement, the guru
> >> may well judge them as false. So there's "true in context" (validity)
> >> and "ontologically true" (soundness). A sentence can be true in
> >> context but false in the world and vice versa, depending on who's in
> >> control of the reinforcement.
> >>
> >>
> >>     [⛧] We could discuss the strength of the analogy between human
> >> hallucination and LLM "hallucination", especially in the context of
> >> prediction coding. But we don't need to. Just consider it jargon and
> >> move on.
> >>
> >>     On 9/9/25 4:37 PM, Russ Abbott wrote:
> >>      > Marcus, Glen,
> >>      >
> >>      > Your responses are much too sophisticated for me. Now that I'm
> >> retired (and, in truth, probably before as well), I tend to think in
> >> much simpler terms.
> >>      >
> >>      > My basic point was to express my surprise at realizing that it
> >> makes as much sense to ask whether an LLM hallucinates as it does to
> >> ask whether an LLM is red. It's a category mismatch--at least I now
> >> think so.
> >>      > -- Russ <https://russabbott.substack.com/>
> >>      >
> >>      >
> >>      >
> >>      >
> >>      > On Tue, Sep 9, 2025 at 3:45 PM glen <[email protected]> wrote:
> >>      >
> >>      >     The question of whether fluency is (well) correlated to
> >> accuracy seems to assume something like mentalizing, the idea that
> >> there's a correspondence between minds mediated by a correspondence
> >> between the structure of the world and the structure of our
> >> minds/language. We've talked about the "interface theory of
> >> perception", where Hoffman (I think?) argues we're more likely to
> >> learn *false* things than we are true things. And we've argued about
> >> realism, pragmatism, prediction coding, and everything else under the
> >> sun on this list.
> >>      >
> >>      >     So it doesn't surprise me if most people assume there will
> >> be more true statements in the corpus than false statements, at least
> >> in domains where there exists a common sense, where the laity *can*
> >> perceive the truth. In things like quantum mechanics or whatever,
> >> then all bets are off because there are probably more false sentences
> >> than true ones.
> >>      >
> >>      >     If there are more true than false sentences in the corpus,
> >> then reinforcement methods like Marcus' only bear a small burden (in
> >> lay domains). The implicit fidelity does the lion's share. But in
> >> those domains where counter-intuitive facts dominate, the
> >> reinforcement does the most work.
> >>      >
> >>      >
> >>      >     On 9/9/25 3:12 PM, Marcus Daniels wrote:
> >>      >      > Three ways come to mind..  I would guess that OpenAI,
> >> Google, Anthropic, and xAI are far more sophisticated..
> >>      >      >
> >>      >      >  1. Add a softmax penalty to the loss that tracks
> >> non-factual statements or grammatical constraints.  Cross entropy may
> >> not understand that some parts of content are more important than
> >> others.
> >>      >      >  2. Change how the beam search works during inference
> >> to skip sequences that fail certain predicates – like a lookahead
> >> that says “Oh, I can’t say that..” (a rough sketch follows below)
> >>      >      >  3. Grade the output, either using human or non-LLM
> >> supervision, and re-train.
> >>      >      >
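> >>      >      > (The sketch for 2 -- hypothetical, not anyone's actual
> >> decoder; expand() and predicate() are assumed to be supplied by
> >> whoever wires it in:)
> >>      >      >
> >>     def constrained_beam_step(beams, expand, predicate, width=4):
> >>         """One beam-search step that drops continuations failing a predicate.
> >>         beams: list of (tokens, score); expand(tokens) yields (next_token, logp);
> >>         predicate(tokens) is the "Oh, I can't say that" lookahead check."""
> >>         candidates = []
> >>         for tokens, score in beams:
> >>             for next_tok, logp in expand(tokens):
> >>                 new = tokens + [next_tok]
> >>                 if predicate(new):              # keep only sequences that pass
> >>                     candidates.append((new, score + logp))
> >>         candidates.sort(key=lambda c: c[1], reverse=True)
> >>         return candidates[:width]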
> >>      >      > *From:* Friam <[email protected]> *On Behalf Of* Russ Abbott
> >>      >      > *Sent:* Tuesday, September 9, 2025 3:03 PM
> >>      >      > *To:* The Friday Morning Applied Complexity Coffee
> >> Group <[email protected]>
> >>      >      > *Subject:* [FRIAM] Hallucinations
> >>      >      >
> >>      >      > OpenAI just published a paper on hallucinations
> >> <https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf>
> >> as well as a post summarizing the paper
> >> <https://openai.com/index/why-language-models-hallucinate/>.  The
> >> two of them seem wrong-headed in such a simple and obvious way that
> >> I'm surprised the issue they discuss is still alive.
> >>      >      >
> >>      >      > The paper and post point out that LLMs are trained to
> >> generate fluent language--which they do extraordinarily well. The
> >> paper and post also point out that LLMs are not trained to
> >> distinguish valid from invalid statements. Given those facts about
> >> LLMs, it's not clear why one should expect LLMs to be able to
> >> distinguish true statements from false statements--and hence why one
> >> should expect to be able to prevent LLMs from hallucinating.
> >>      >      >
> >>      >      > In other words, LLMs are built to generate text; they
> >> are not built to understand the texts they generate and certainly not
> >> to be able to determine whether the texts they generate make
> >> factually correct or incorrect statements.
> >>      >      >
> >>      >      > Please see my post
> >> <https://russabbott.substack.com/p/why-language-models-hallucinate-according>
> >> elaborating on this.
> >>      >      >
> >>      >      > Why is this not obvious, and why is OpenAI still
> >> talking about it?
> >>      >      >
> >>     --
> >
> >
>
>
.- .-.. .-.. / ..-. --- --- - . .-. ... / .- .-. . / .-- .-. --- -. --. / ... 
--- -- . / .- .-. . / ..- ... . ..-. ..- .-..
FRIAM Applied Complexity Group listserv
Fridays 9a-12p Friday St. Johns Cafe   /   Thursdays 9a-12p Zoom 
https://bit.ly/virtualfriam
to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
FRIAM-COMIC http://friam-comic.blogspot.com/
archives:  5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
  1/2003 thru 6/2021  http://friam.383.s1.nabble.com/
