> On Feb 10, 2023, at 2:31 PM, Marcus Daniels wrote:
>
> I don't think it is necessarily the case a very large neural net would need
> to backtrack. It could devote more and more resources to different
> hypotheses so long as it was not energy constrained.
In some sense, I understand that this must be right, because your language is
more faithful to how the algorithm works.
I have another colleague who has commented that one needs to think differently
about large neural-net learners because in very high dimensions, there are not
the same kinds of snags as those encountered in lower dimensions. There is
always “a way around”, and therefore gradient descent works better than one has
come to expect.
Yet in the output there still seems to be a tree somehow; maybe it arises from
things that aren’t flexible, by some design logic (?):
Time runs forward, for the machine as for the rest of us.
Having said something, the chatbot can’t un-say it.
So then there is a design decision: Do we take what we have said as a
constraint on what we can say next? I can well imagine that there is strong
reinforcement for some version of that, because it is inherent in fluidity, and
even the notion of a “topic” in the pragmatics of a string of sentences. If
topicalization somehow means strict retention (doubling down), then our speech
lives on a tree, and there seem to be certain ways of putting “more resources
on different hypotheses” that become closed to us. Glen’s characterization as
mansplaining encapsulates this route nicely in one word.
Or, does the chatbot have a way to actually say Wow, shit. I was confused;
start over. Then more of the dimensionality would be available.
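The fork described here, between committing to everything already said and being free to abandon a line and start over, can be sketched as two toy decoding loops. This is purely illustrative (the vocabulary, the scoring table, and both decoders are invented for the example, not how any actual chatbot works): a greedy decoder walks one path of the tree and can never un-say a token, while a decoder allowed to backtrack can discover that an attractive early choice was a dead end.

```python
# Toy illustration (invented example, not a real chatbot's algorithm).
# A greedy autoregressive decoder commits to each token as it goes, so
# its output lives on a single path of the tree of continuations.
# A backtracking decoder may abandon a prefix ("I was confused; start
# over") and so can reach sequences the greedy path closes off.

VOCAB = ["a", "b", "c"]

def score(seq):
    """Hypothetical scores: 'a' then 'b' looks great locally, but the
    full sequence (a, b, b) is a dead end; (c, c, c) is best overall."""
    table = {
        ("a",): 3, ("a", "b"): 5, ("a", "b", "b"): 1,
        ("c",): 2, ("c", "c"): 4, ("c", "c", "c"): 9,
    }
    return table.get(tuple(seq), 0)

def greedy(length):
    """Pick the locally best next token; never revisit earlier choices."""
    seq = []
    for _ in range(length):
        seq.append(max(VOCAB, key=lambda t: score(seq + [t])))
    return seq

def backtracking(length):
    """Depth-first search over the whole tree; free to back up."""
    best_seq, best_val = None, -1
    def search(seq):
        nonlocal best_seq, best_val
        if len(seq) == length:
            if score(seq) > best_val:
                best_seq, best_val = seq, score(seq)
            return
        for t in VOCAB:
            search(seq + [t])
    search([])
    return best_seq

print(greedy(3))        # locked onto the locally tempting prefix
print(backtracking(3))  # found the globally better sequence
```

With this scoring table, the greedy decoder follows the tempting prefix and ends at a poor sequence, while the backtracker reaches the high-scoring one; which is only to say that "more resources on different hypotheses" can mean something different once revisiting earlier choices is allowed.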
Are there any chatbots that operate in this general space that people would
call “admitting one’s mistakes”? If not, why not? Is it a design
consideration in what to reward? I can’t imagine it’s a property of the
training set in the abstract, as there’s all sorts of literature on admitting
mistakes (the wedding guest in the Ancient Mariner “a sadder but a wiser man”).
Are those patterns somehow “harder” to learn, even though they are there?
What would make them harder, as a category of composition patterns?
A “third way”, I guess, is what one could call the “dumb pathological liar” (a
sort of trump-like character), who simply rambles along in mutually
contradicting utterances and doesn’t “structurally” acknowledge their
existence. Yet at some level of speech, that isn’t happening, because there is
local fluidity, topics, and so forth. At that lower level, the preference
given to continuity seems very strong.
A fun aside about topicalization. Our friend, the historical linguist Sergei
Starostin, whose English was to my ear essentially blank and perfect, was once
telling me that for a Russian speaker, the last latent point of fear in
speaking English was the use of the definite article. He said that no matter
how many years on, he still had this nagging stress about whether he was using
it in the right places. I don’t remember whether it was in that conversation —
I think it was, and that the conversation was also about topicalization in
Japanese with ga and wa, and in Chinese with topic-comment phrase structure
(which French also uses in many constructions, but less structured around a
“pivot” as it would be called for Chinese) — but either then or subsequently I
came to appreciate what a problem topicalization is. I would say it lives in
speech at the level of pragmatics, in that one can almost see the “attention”
as a kind of searchlight that is moving around, and that pragmatics is supposed
to respect in the unfolding of a conversation for the conversation to be
coherent. The challenge of marking topic — one form of “definiteness” of the
definite article versus an indefinite one — is that it involves this
ever-negotiated problem of how much either from the discourse or from presumed
shared knowledge the listener has in primed-awareness at any given moment.
“The” drifts back and forth between implicit definiteness (I can just say “the
moon”, without a further specifying clause, presuming that we both know there
is only one), versus definiteness that demands a specifier (the senator from
Wisconsin, when first introduced in the discourse). I guess “the” in English
is unusually fraught, in that its insertion or omission also modulates category
terms versus literal instances (the AI chatbots say silly things, versus AI
chatbots say silly things), and all these functional roles are in tension with
each other at the same time.
So it’s all very attention-scope semantic. Yet it can fail to be semantic at
other levels. What it is about the encoding of speech that makes these levels
so different is still hard for me to see.
Eric
> -----Original Message-----
> From: Friam On Behalf Of Santafe
> Sent: Friday, February 10, 2023 3:11 AM
> To: The Friday Morning Applied Complexity Coffee