Mark Waser wrote:
I'm going to try to put some words into Richard's mouth here since
I'm curious to see how close I am . . . . (while radically changing the
words).
I think that Richard is not arguing about the possibility of
Novamente-type solutions as much as he is arguing about the
predictability of *very* flexible Novamente-type solutions as they grow
larger and more complex (and the difficulty in getting it to not
instantaneously "crash-and-burn"). Indeed, I have heard a very faint
shadow of Richard's concerns in your statements about the "tuning"
problems that you had with BioMind.
This is true, but not precise enough to capture the true nature of my worry.
Let me focus on one aspect of the problem. My goal here is to describe
in a little detail how the Complex Systems Problem actually bites in a
particular case.
Suppose that in some significant part of Novamente there is a
representation system that uses "probability" or "likelihood" numbers to
encode the strength of facts, as in [I like cats](p=0.75). The (p=0.75)
is supposed to express the idea that the statement [I like cats] is in
some sense "75% true".
[Quick qualifier: I know that this oversimplifies the real situation in
Novamente, but I need to do this simplification in order to get my point
across, and I am pretty sure this will not affect my argument, so bear
with me].
We all know that this p value is not quite a "probability" or
"likelihood" or "confidence factor". It plays a very ambiguous role in
the system, because on the one hand we want it to be very much like a
probability in the sense that we want to do calculations with it: we
NEED a calculus of such values in order to combine facts in the system
to make inferences. But we also do not want to lock ourselves into a
particular interpretation of what it means, because we know full well
that we do not really have a clear semantics for these numbers.
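To make the need for such a calculus concrete, here is a toy sketch (my own illustration, not Novamente's actual inference rules; the independence assumptions baked in are exactly the kind of interpretive commitment at issue):

```python
# Illustrative sketch only: a naive "strength calculus" over p-valued facts.
# Every rule below silently assumes the facts are independent probabilities,
# which is precisely the semantic commitment the system never cashes out.

def p_and(pa, pb):
    """Strength of [A and B], assuming A and B are independent."""
    return pa * pb

def p_or(pa, pb):
    """Strength of [A or B], under the same independence assumption."""
    return pa + pb - pa * pb

def p_not(pa):
    """Strength of [not A]."""
    return 1.0 - pa

# [I like cats](p=0.75) combined with [cats like laps](p=0.6):
print(p_and(0.75, 0.6))  # → 0.44999999999999996 ... but 0.45 of *what*?
```

The arithmetic is trivial; the point is that the moment you write `p_and` you have committed to a reading of what p means, whether you admit it or not.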
Either way, we have a problem: a fact like [I like cats](p=0.75) is
ungrounded because we have to interpret it. Does it mean that I like
cats 75% of the time? That I like 75% of all cats? 75% of each cat?
Are the cats that I like always the same ones, or is the chance of an
individual cat being liked by me something that changes? Does it mean
that I like all cats, but only 75% as much as I like my human family,
which I like(p=1.0)? And so on and so on.
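The ambiguity is not merely philosophical: the different readings make different predictions. A toy sketch (function names and the cat "Sleti" are just my illustrative inventions):

```python
# Three incompatible readings that all "explain" p=0.75 for [I like cats],
# yet answer a concrete question differently.

def occasions_reading():
    """Reading 1: on any given occasion, I like cats with chance 0.75."""
    return 0.75  # a per-occasion probability

def subset_reading(cat, liked_cats):
    """Reading 2: a fixed 75% of cats are liked; a given cat either is or isn't."""
    return 1.0 if cat in liked_cats else 0.0

def intensity_reading():
    """Reading 3: I like every cat, at 75% of the strength I like my family."""
    return 0.75  # now 0.75 measures intensity, not chance

# Question: "Will I enjoy Sleti on my lap right now?"
# Reading 1: yes, with probability 0.75.
# Reading 2: depends entirely on whether Sleti is in the liked 75%.
# Reading 3: certainly yes, but at reduced intensity.
print(occasions_reading(), subset_reading("Sleti", {"Sleti"}), intensity_reading())
```

Same number, three different generative stories; an inference engine that combines the number has to pick one, even if its designers never did.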
Digging down to the root of this problem (and this is the point where I
am skipping from baby stuff to hard core AI) we want these numbers to be
semantically compositional and interpretable, but in order to make sure
they are grounded, the system itself is going to have to build them and
interpret them without our help ... and it is not clear that this
grounding can be completely implemented. Why is it not clear? Because
when you try to build the entire grounding mechanism(s) you are forced
to become explicit about what these numbers mean, during the process of
building a grounding system that you can trust to be doing its job: you
cannot create a mechanism that you *know* is constructing sensible p
numbers and facts during all of its development *unless* you finally
bite the bullet and say what the p numbers really mean, in fully cashed
out terms.
[Suppose you did not do this. Suppose you built the grounding mechanism
but remained ambiguous about the meaning of the p numbers. What would
the resulting system be computing? From end to end it would be building
facts with p numbers, but you the human observer would still be imposing
an interpretation on the facts. And if you are still doing anything to
interpret, it cannot be grounded].
Now, as far as I understand it, the standard approach to this conundrum
is that researchers (in Novamente and elsewhere) do indeed make an
attempt to disambiguate the p numbers, but they do it by developing more
sophisticated logical systems. First, perhaps, error-value bands of p
values instead of sharp values. And temporal logic mechanisms to deal
with time. Perhaps clusters of p and q and r and s values, each with
some slightly different zones of applicability. More generally, people
try to give structure to the qualifiers that are appended to the facts:
[I like cats](qualifier=value) instead of [I like cats](p=0.75).
The question is, does this process of refinement have an end? Does it
really lead to a situation where the qualifier is disambiguated and the
semantics is clear enough to build a trustworthy grounding system? Is
there a closed-form solution to the problem of building a logic that
disambiguates the qualifiers?
Here is what I think will happen if this process is continued. In order
to make the semantics unambiguous enough to let the system ground its
own knowledge without the interpretation of p values, researchers will
develop more and more sophisticated logics (with more and more
structured replacements for that simple p value), until they are forced
to introduce ideas that are so complicated that they do not allow you to
do the full job of compositionality any more: you cannot combine some
facts and have the combination of the complicated p-structures still be
interpretable. For example, if the system is encoded with such stuff as
[I like cats](general-likelihood=0.75 +- 0.05,
              mood-variability=0.10 +- 0.01,
              time-stability=0.99 +0.005 -0.03,
              overall-unsureness=0.07,
              special-circumstances-count=5)
Then can we be *absolutely* sure that a combination of facts of this
sort is going to preserve its accuracy across long ranges of inference?
Can we combine this fact with an [I am allergic to cats](....) fact to
come to a clear conclusion about the proposition [I want to sit down and
let Sleti jump onto my lap](....)?
If we built a calculus to handle such structured facts, would we be
kidding ourselves about whether the semantics was *really*
compositional...? Or would we just be sweeping the ambiguity of the
interpretation of these facts under the carpet? Hiding the ambiguity
inside an impossibly dense thicket of qualifiers?
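To see how the structured qualifiers resist composition, consider this hypothetical sketch (the field names echo the example above; everything else is my own construction, and the combination rule is deliberately left as the open question it is):

```python
from dataclasses import dataclass

@dataclass
class Qualifier:
    # Structured replacement for the single p value, following the
    # [I like cats](...) example above.
    general_likelihood: tuple        # (value, error band)
    mood_variability: tuple          # (value, error band)
    time_stability: tuple            # (value, +err, -err)
    overall_unsureness: float
    special_circumstances_count: int

likes_cats = Qualifier((0.75, 0.05), (0.10, 0.01), (0.99, 0.005, 0.03), 0.07, 5)
allergic   = Qualifier((0.90, 0.02), (0.05, 0.02), (0.95, 0.010, 0.01), 0.12, 2)

def combine(a: Qualifier, b: Qualifier) -> Qualifier:
    # What is the right rule?  Multiply the likelihoods?  Add the unsureness
    # values?  Take the max of the variabilities?  Every choice encodes a
    # semantic commitment that, per the argument above, was never cashed out.
    raise NotImplementedError("no principled combination rule has been given")
```

The dataclass is easy to write; `combine` is where the thicket of qualifiers stops being a logic and starts being a guess.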
Here, then, are the two conclusions from this phase of my comment:
1) I do not believe anyone seriously knows if there is any end to the
research process of trying to get a logic that does this disambiguation.
I think it is an endeavor driven by pure hope.
2) I believe that, in the end, this search for a good enough logic will
result in the construction of a grounding system (i.e. a mechanism that is
able to pick up and autonomously interpret all its own facts about the
world) that actually has NOT been disambiguated, and that for this
reason it will start to fall apart when used in large scale situations -
with large numbers of facts and/or over large stretches of autonomous
functioning. I think people will sweep the disambiguation problem under
the carpet, and then only notice that they are getting bitten by it when
the large-scale system does not seem to generate coherent, sensible
knowledge when left to its own devices.
This second point is where I finally meet up with your comment about
problems on the larger scale, and the system crashing and burning. I
think it will be a slowish crash. Incidentally, I presume I do not need
to labor the point about how this will probably appear on the larger
scale but might not be so obvious for small scale or toy demonstrations
of the mechanisms.
I need to finish by making a point about what I see as the underlying
cause of this problem.
The whole thing started because we wanted our p numbers to be
interpretable. What I believe will happen as a result of imposing this
design constraint is that we severely restrict the space of possible
grounding mechanisms that we allow ourselves to consider. By doing so,
we box ourselves into an increasingly tight corner, searching for a
solution that preserves compositional semantics, THEN quietly giving up
on the idea when we get into the depths of some horrendous
temporal/pragmatic/affective/case-based logic 8-) that we cannot, after
all, interpret ...... and then, having boxed ourselves into that
neighborhood of the space of all possible representational systems, we
find that there simply is no solution, given all those constraints.
(But, being stubborn, we carry on hacking away at it forever anyway).
So what is the solution? Well, easy: do not even try to make those p
numbers interpretable. Build systems that build their own
representations, give 'em p numbers to play with (and q and r and s
numbers if they want them), but let the mechanisms themselves use those
numbers without ever trying to exactly interpret them. Frankly, why
should we expect those numbers to be interpretable at all? Why should
we expect there to be a *calculus* that allows us to prove that a system
is truth-preserving?
In such a system the "truth value" of a fact would not be represented
inside the object(s) that encoded the fact, it would be the result of a
cluster of objects constraining one another. So, if the system has in
it the fact [I like cats], this would be connected to a host of other
facts, in such a way that if the system were asked "Do you like cats?"
it would build a large representation of the question and the
implications that were relevant in the present context, and the result
of all those objects interacting would be the thing that generated the
answer. If the person were responding to a questionnaire that forced
them to give an answer on a continuous scale between 0 and 100, they
might well put their mark at the 75% level, but this would not be the
result of retrieving a p value, it would be a nebulous, fleeting result
of the interaction of all the structures involved (and next time they
were asked, the value would probably be different).
Similarly, if the system were trying to decide whether or not to allow a
particular cat to jump up on its lap, given that it generally liked
cats, but was somewhat allergic, the decision would not be the result of
a combination of p numbers (be they ever so complicated), but the result
of a collision of some huge, extended structures involving many facts.
The collision would certainly involve some weighing of p, q, r and s
(etc.) numbers stored in these objects, but these numbers would not be
interpretable, and the combination process would not be consistent with
a logical calculus.
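One way to picture this: the answer emerges from a relaxation over mutually constraining facts rather than from a stored p value. A deliberately crude sketch (entirely my own construction; the facts, weights, and update rule are arbitrary illustrative choices, not a serious proposal):

```python
# Crude sketch: the "truth value" of [I like cats] is not stored anywhere;
# it is the settling point of a network of facts that constrain one another.

facts = {
    "I like cats": 0.5,              # initial activations, not probabilities
    "cats are soft": 0.9,
    "I am allergic to cats": 0.8,
    "Sleti scratched me once": 0.3,
}

# (source, target, weight): positive weights support, negative oppose.
constraints = [
    ("cats are soft",           "I like cats", +0.6),
    ("I am allergic to cats",   "I like cats", -0.4),
    ("Sleti scratched me once", "I like cats", -0.2),
]

for _ in range(50):                  # relax until roughly settled
    net = sum(w * facts[src] for src, tgt, w in constraints
              if tgt == "I like cats")
    target = max(0.0, min(1.0, 0.5 + net))
    facts["I like cats"] += 0.1 * (target - facts["I like cats"])

# The questionnaire answer is a fleeting settling point, not a stored value;
# change any fact in the cluster and a different number comes out next time.
print(round(facts["I like cats"], 2))  # → 0.66
```

Add or remove one fact from the cluster and the "75%" mark on the questionnaire moves, which is exactly the point: nothing in the system corresponds to a single interpretable p.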
There is much more that could be said about the methodology needed to
find mechanisms that could do this, but leaving that aside for the
moment, there is just the big philosophical question of whether to give
up our obsession with interpretable semantics, or whether to be so
scared of complex systems (because of course, such a system would very
likely introduce the Complex Systems Bogeyman) that we do not dare try it.
That is a huge difference in philosophy. It is not just a small matter
of technique, it is a huge perspective change.
So, to conclude, when I say that intelligence involves an irreducible
amount of complexity, I mean only that there are some situations in the
design of AGI systems, like the case I have just described, where I see
people going through a bizarre process:
Step 1) We decide that we must make our AGI as non-complex as
possible, so we can prove *something* about how knowledge-bits combine
to make reliable new knowledge-bits (in the above case, we try to make it
as much like a probability calculus or logical calculus as possible,
because we know that in the purest examples of such things, we can
preserve truth as knowledge is added).
Step 2) We are eventually forced to compromise our principles and
introduce hacks that flush the truth-preservation guarantees down the
toilet: in the above case, we complicate the qualifiers in our logic
until we can no longer really be sure what the semantics is when we
combine them (and in the related case of inference control engines, we
allow such engines to do funky, truncated explorations of the space of
possible inferences, with unpredictable consequences).
Step 3) We then refuse to acknowledge that what we have got, now, is
a compromise that *is* a complex system: its overall behavior is subtly
dependent on interactions down at the low level. One reason that we get
away with this blindness for so long is that it does not necessarily
show itself in small systems or in relatively small scale runs, or in
systems where the developmental mechanisms (the worst culprits for
bringing out the complexity) have not yet been implemented.
Step 4) Having let some complexity in through the back door, we then
keep hacking away at the design, hoping that somewhere in the design
neighborhood there is a solution that is both ALMOST compositional (i.e.
interpretable semantics, truth-preserving, etc.) and slightly complex.
In reality, we have most likely boxed ourselves in because of our
initial (quixotic) emphasis on making the semantics interpretable.
Hmmm... if my luck runs the way it usually does, all this will be as
clear as mud. Oh well. :-(
This commentary is not, of course, specific to Novamente, but is really
about an entire class of AGI systems that belong in the same family as
Novamente. My problem with Novamente is really that I do not see it
being flexible enough to throw out the meaningful, interpretable parameters.
Richard Loosemore
Novamente looks, at times, like the very first step in an inductive
proof . . . . except that it is in a chaotic environment rather than the
nice orderly number system. Pieces of the system clearly sail in calm,
friendly waters but hooking them all up in a wild environment is another
story entirely (again, look at your own BioMind stories).
I've got many doubts because I don't think that you have a handle on
the order -- the big-O -- of many of the operations you are proposing
(why I harp on scalability, modularity, etc.). Richard is going further
and saying that the predictability of even some of your smaller/simpler
operations is impossible (although, as he has pointed out, many of
them could be constrained by attractors, etc. if you were so inclined to
view/treat your design that way).
Personally, I believe that intelligence is *not* complex -- despite
the fact that it does (probably necessarily) rest on top of complex
pieces -- because those pieces' interactions are constrained enough that
intelligence is stable. I think that this could be built into a
Novamente-type design *but* you have to be attempting to do so (and I
think that I could convince Richard of that -- or else, I'd learn a lot
by trying :-).
Richard's main point is that he believes that the search space of
viable parameters and operations for Novamente is small enough that
you're not going to hit it by accident -- and Novamente's very
flexibility is what compounds the problem. Remember, life exists on the
boundary between order and chaos. Too much flexibility (unconstrained
chaos) is as deadly as too much structure.
I think that I see both sides of the issue and how Novamente could
be altered/enhanced to make Richard happy (since it's almost universally
flexible) -- but doing so would also impose many constraints that I
think that you would be unwilling to live with since I'm not sure that
you would see the point. I don't think that you're ever going to be
able to change his view that the current direction of Novamente
is -- pick one: a) a needle in an infinite haystack or b) too fragile
to succeed -- particularly since I'm pretty sure that you couldn't
convince me without making some serious additions to Novamente.
----- Original Message -----
*From:* Benjamin Goertzel <mailto:[EMAIL PROTECTED]>
*To:* agi@v2.listbox.com <mailto:agi@v2.listbox.com>
*Sent:* Monday, November 12, 2007 3:49 PM
*Subject:* Re: [agi] What best evidence for fast AI?
To be honest, Richard, I do wonder whether a sufficiently in-depth
conversation about AGI between us would result in you changing your views
about the CSP problem in a way that would accept the possibility of
Novamente-type solutions.
But, this conversation as I'm envisioning it would take dozens of hours,
and would require you to first spend 100+ hours studying detailed NM
materials, so this seems unlikely to happen in the near future.
-- Ben
On Nov 12, 2007 3:32 PM, Richard Loosemore <[EMAIL PROTECTED]
<mailto:[EMAIL PROTECTED]>> wrote:
Benjamin Goertzel wrote:
>
> Ed --
>
> Just a quick comment: Mark actually read a bunch of the
proprietary,
> NDA-required Novamente documents and looked at some source
code (3 years
> ago, so a lot of progress has happened since then). Richard
didn't, so
> he doesn't have the same basis of knowledge to form detailed
comments on
> NM, that Mark does.
This is true, but not important to my line of argument, since of course
I believe that a problem exists (CSP), which we have discussed on a
number of occasions, and your position is not that you have some
proprietary, unknown-to-me solution to the problem, but rather that you
do not really think there is a problem.
Richard Loosemore
-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&