Mark Waser wrote:
I'm going to try to put some words into Richard's mouth here since
I'm curious to see how close I am . . . . (while radically changing the
words).
I think that Richard is not arguing about the possibility of
Novamente-type solutions as much as he is arguing about the
predictability of *very* flexible Novamente-type solutions as they grow
larger and more complex (and the difficulty in getting it to not
instantaneously "crash-and-burn"). Indeed, I have heard a very faint
shadow of Richard's concerns in your statements about the "tuning"
problems that you had with BioMind.
This is true, but not precise enough to capture the true nature of my worry.
Let me focus on one aspect of the problem. My goal here is to describe
in a little detail how the Complex Systems Problem actually bites in a
particular case.
Suppose that in some significant part of Novamente there is a
representation system that uses "probability" or "likelihood" numbers to
encode the strength of facts, as in [I like cats](p=0.75). The (p=0.75)
is supposed to express the idea that the statement [I like cats] is in
some sense "75% true".
[Quick qualifier: I know that this oversimplifies the real situation in
Novamente, but I need to do this simplification in order to get my point
across, and I am pretty sure this will not affect my argument, so bear
with me].
We all know that this p value is not quite a "probability" or
"likelihood" or "confidence factor". It plays a very ambiguous role in
the system, because on the one hand we want it to be very much like a
probability in the sense that we want to do calculations with it: we
NEED a calculus of such values in order to combine facts in the system
to make inferences. But we also do not want to lock ourselves into a
particular interpretation of what it means, because we know full well
that we do not really have a clear semantics for these numbers.
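To make the need for such a calculus concrete, here is a toy sketch (my own illustration, not Novamente's actual inference rules; the independence assumptions baked in are exactly the kind of interpretive commitment at issue):

```python
# Illustrative sketch only: a naive "strength calculus" over p-valued facts.
# Every rule below silently assumes the facts are independent probabilities,
# which is precisely the semantic commitment the system never cashes out.

def p_and(pa, pb):
    """Strength of [A and B], assuming A and B are independent."""
    return pa * pb

def p_or(pa, pb):
    """Strength of [A or B], under the same independence assumption."""
    return pa + pb - pa * pb

def p_not(pa):
    """Strength of [not A]."""
    return 1.0 - pa

# [I like cats](p=0.75) combined with [cats like laps](p=0.6):
print(p_and(0.75, 0.6))  # → 0.44999999999999996 ... but 0.45 of *what*?
```

The arithmetic is trivial; the point is that the moment you write `p_and` you have committed to a reading of what p means, whether you admit it or not.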
Either way, we have a problem: a fact like [I like cats](p=0.75) is
ungrounded because we have to interpret it. Does it mean that I like
cats 75% of the time? That I like 75% of all cats? 75% of each cat?
Are the cats that I like always the same ones, or is the chance of an
individual cat being liked by me something that changes? Does it mean
that I like all cats, but only 75% as much as I like my human family,
which I like(p=1.0)? And so on and so on.
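The ambiguity is not merely philosophical: the different readings make different predictions. A toy sketch (function names and the cat "Sleti" are just my illustrative inventions):

```python
# Three incompatible readings that all "explain" p=0.75 for [I like cats],
# yet answer a concrete question differently.

def occasions_reading():
    """Reading 1: on any given occasion, I like cats with chance 0.75."""
    return 0.75  # a per-occasion probability

def subset_reading(cat, liked_cats):
    """Reading 2: a fixed 75% of cats are liked; a given cat either is or isn't."""
    return 1.0 if cat in liked_cats else 0.0

def intensity_reading():
    """Reading 3: I like every cat, at 75% of the strength I like my family."""
    return 0.75  # now 0.75 measures intensity, not chance

# Question: "Will I enjoy Sleti on my lap right now?"
# Reading 1: yes, with probability 0.75.
# Reading 2: depends entirely on whether Sleti is in the liked 75%.
# Reading 3: certainly yes, but at reduced intensity.
print(occasions_reading(), subset_reading("Sleti", {"Sleti"}), intensity_reading())
```

Same number, three different generative stories; an inference engine that combines the number has to pick one, even if its designers never did.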
Digging down to the root of this problem (and this is the point where I
am skipping from baby stuff to hard core AI) we want these numbers to be
semantically compositional and interpretable, but in order to make sure
they are grounded, the system itself is going to have to build them and
interpret them without our help ... and it is not clear that this
grounding can be completely implemented. Why is it not clear? Because
when you try to build the entire grounding mechanism(s) you are forced
to become explicit about what these numbers mean, during the process of
building a grounding system that you can trust to be doing its job: you
cannot create a mechanism that you *know* is constructing sensible p
numbers and facts during all of its development *unless* you finally
bite the bullet and say what the p numbers really mean, in fully cashed
out terms.
[Suppose you did not do this. Suppose you built the grounding mechanism
but remained ambiguous about the meaning of the p numbers. What would
the resulting system be computing? From end to end it would be building
facts with p numbers, but you the human observer would still be imposing
an interpretation on the facts. And if you are still doing anything to
interpret, it cannot be grounded].
Now, as far as I understand it, the standard approach to this conundrum
is that researchers (in Novamente and elsewhere) do indeed make an
attempt to disambiguate the p numbers, but they do it by developing more
sophisticated logical systems. First, perhaps, error-value bands of p
values instead of sharp values. And temporal logic mechanisms to deal
with time. Perhaps clusters of p and q and r and s values, each with
some slightly different zones of applicability. More generally, people
try to give structure to the qualifiers that are appended to the facts:
[I like cats](qualifier=value) instead of [I like cats](p=0.75).
The question is, does this process of refinement have an end? Does it
really lead to a situation where the qualifier is disambiguated and the
semantics is clear enough to build a trustworthy grounding system? Is
there a closed-form solution to the problem of building a logic that
disambiguates the qualifiers?
Here is what I think will happen if this process is continued. In order
to make the semantics unambiguous enough to let the system ground its
own knowledge without the interpretation of p values, researchers will
develop more and more sophisticated logics (with more and more
structured replacements for that simple p value), until they are forced
to introduce ideas that are so complicated that they do not allow you to
do the full job of compositionality any more: you cannot combine some
facts and have the combination of the complicated p-structures still be
interpretable. For example, if the system is encoded with such stuff as
[I like cats](general-likelihood=0.75 +- 0.05,
              mood-variability=0.10 +- 0.01,
              time-stability=0.99 +0.005 -0.03,
              overall-unsureness=0.07,
              special-circumstances-count=5)
Then can we be *absolutely* sure that a combination of facts of this
sort is going to preserve its accuracy across long ranges of inference?
Can we combine this fact with an [I am allergic to cats](....) fact to
come to a clear conclusion about the proposition [I want to sit down and
let Sleti jump onto my lap](....)?
If we built a calculus to handle such structured facts, would we be
kidding ourselves about whether the semantics was *really*
compositional...? Or would we just be sweeping the ambiguity of the
interpretation of these facts under the carpet? Hiding the ambiguity
inside an impossibly dense thicket of qualifiers?
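To see how the structured qualifiers resist composition, consider this hypothetical sketch (the field names echo the example above; everything else is my own construction, and the combination rule is deliberately left as the open question it is):

```python
from dataclasses import dataclass

@dataclass
class Qualifier:
    # Structured replacement for the single p value, following the
    # [I like cats](...) example above.
    general_likelihood: tuple        # (value, error band)
    mood_variability: tuple          # (value, error band)
    time_stability: tuple            # (value, +err, -err)
    overall_unsureness: float
    special_circumstances_count: int

likes_cats = Qualifier((0.75, 0.05), (0.10, 0.01), (0.99, 0.005, 0.03), 0.07, 5)
allergic   = Qualifier((0.90, 0.02), (0.05, 0.02), (0.95, 0.010, 0.01), 0.12, 2)

def combine(a: Qualifier, b: Qualifier) -> Qualifier:
    # What is the right rule?  Multiply the likelihoods?  Add the unsureness
    # values?  Take the max of the variabilities?  Every choice encodes a
    # semantic commitment that, per the argument above, was never cashed out.
    raise NotImplementedError("no principled combination rule has been given")
```

The dataclass is easy to write; `combine` is where the thicket of qualifiers stops being a logic and starts being a guess.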
Here, then, are the two conclusions from this phase of my comment:
1) I do not believe anyone seriously knows if there is any end to the
research process of trying to get a logic that does this disambiguation.
I think it is an endeavor driven by pure hope.
2) I believe that, in the end, this search for a good enough logic will
result in the construction of a grounding system (i.e. a mechanism that is
able to pick up and autonomously interpret all its own facts about the
world) that actually has NOT been disambiguated, and that for this
reason it will start to fall apart when used in large scale situations -
with large numbers of facts and/or over large stretches of autonomous
functioning. I think people will sweep the disambiguation problem under
the carpet, and then only notice that they are getting bitten by it when
the large-scale system does not seem to generate coherent, sensible
knowledge when left to its own devices.
This second point is where I finally meet up with your comment about
problems on the larger scale, and the system crashing and burning. I
think it will be a slowish crash. Incidentally, I presume I do not need
to labor the point about how this will probably appear on the larger
scale but might not be so obvious for small scale or toy demonstrations
of the mechanisms.
I need to finish by making a point about what I see as the underlying
cause of this problem.
The whole thing started because we wanted our p numbers to be
interpretable. What I believe will happen as a result of imposing this
design constraint is that we severely restrict the space of possible
grounding mechanisms that we allow ourselves to consider. By doing so,
we box ourselves into an increasingly tight corner, searching for a
solution that preserves compositional semantics, THEN quietly giving up
on the idea when we get into the depths of some horrendous
temporal/pragmatic/affective/case-based logic 8-) that we cannot, after
all, interpret ...... and then, having boxed ourselves into that
neighborhood of the space of all possible representational systems, we
find that there simply is no solution, given all those constraints.
(But, being stubborn, we carry on hacking away at it forever anyway).
So what is the solution? Well, easy: do not even try to make those p
numbers interpretable. Build systems that build their own
representations, give 'em p numbers to play with (and q and r and s
numbers if they want them), but let the mechanisms themselves use those
numbers without ever trying to exactly interpret them. Frankly, why
should we expect those numbers to be interpretable at all? Why should
we expect there to be a *calculus* that allows us to prove that a system
is truth-preserving?
In such a system the "truth value" of a fact would not be represented
inside the object(s) that encoded the fact, it would be the result of a
cluster of objects constraining one another. So, if the system has in
it the fact [I like cats], this would be connected to a host of other
facts, in such a way that if the system were asked "Do you like cats?"
it would build a large representation of the question and the
implications that were relevant in the present context, and the result
of all those objects interacting would be the thing that generated the
answer. If the person were responding to a questionnaire that forced
them to give an answer on a continuous scale between 0 and 100, they
might well put their mark at the 75% level, but this would not be the
result of retrieving a p value, it would be a nebulous, fleeting result
of the interaction of all the structures involved (and next time they
were asked, the value would probably be different).
Similarly, if the system were trying to decide whether or not to allow a
particular cat to jump up on its lap, given that it generally liked
cats, but was somewhat allergic, the decision would not be the result of
a combination of p numbers (be they ever so complicated), but the result
of a collision of some huge, extended structures involving many facts.
The collision would certainly involve some weighing of p, q, r and s
(etc.) numbers stored in these objects, but these numbers would not be
interpretable, and the combination process would not be consistent with
a logical calculus.
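One way to picture this: the answer emerges from a relaxation over mutually constraining facts rather than from a stored p value. A deliberately crude sketch (entirely my own construction; the facts, weights, and update rule are arbitrary illustrative choices, not a serious proposal):

```python
# Crude sketch: the "truth value" of [I like cats] is not stored anywhere;
# it is the settling point of a network of facts that constrain one another.

facts = {
    "I like cats": 0.5,              # initial activations, not probabilities
    "cats are soft": 0.9,
    "I am allergic to cats": 0.8,
    "Sleti scratched me once": 0.3,
}

# (source, target, weight): positive weights support, negative oppose.
constraints = [
    ("cats are soft",           "I like cats", +0.6),
    ("I am allergic to cats",   "I like cats", -0.4),
    ("Sleti scratched me once", "I like cats", -0.2),
]

for _ in range(50):                  # relax until roughly settled
    net = sum(w * facts[src] for src, tgt, w in constraints
              if tgt == "I like cats")
    target = max(0.0, min(1.0, 0.5 + net))
    facts["I like cats"] += 0.1 * (target - facts["I like cats"])

# The questionnaire answer is a fleeting settling point, not a stored value;
# change any fact in the cluster and a different number comes out next time.
print(round(facts["I like cats"], 2))  # → 0.66
```

Add or remove one fact from the cluster and the "75%" mark on the questionnaire moves, which is exactly the point: nothing in the system corresponds to a single interpretable p.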
There is much more that could be said about the methodology needed to
find mechanisms that could do this, but leaving that aside for the
moment, there is just the big philosophical question of whether to give
up our obsession with interpretable semantics, or whether to be so
scared of complex systems (because of course, such a system would very
likely introduce the Complex Systems Bogeyman) that we do not dare try it.
That is a huge difference in philosophy. It is not just a small matter
of technique, it is a huge perspective change.
So, to conclude, when I say that intelligence involves an irreducible
amount of complexity, I mean only that there are some situations in the
design of AGI systems, like the case I have just described, where I see
people going through a bizarre process:
Step 1) We decide that we must make our AGI as non-complex as
possible, so we can prove *something* about how knowledge-bits combine
to make reliable new knowledge-bits (in the above case, we try to make it
as much like a probability calculus or logical calculus as possible,
because we know that in the purest examples of such things, we can
preserve truth as knowledge is added).
Step 2) We are eventually forced to compromise our principles and
introduce hacks that flush the truth-preservation guarantees down the
toilet: in the above case, we complicate the qualifiers in our logic
until we can no longer really be sure what the semantics is when we
combine them (and in the related case of inference control engines, we
allow such engines to do funky, truncated explorations of the space of
possible inferences, with unpredictable consequences).
Step 3) We then refuse to acknowledge that what we have got, now, is
a compromise that *is* a complex system: its overall behavior is subtly
dependent on interactions down at the low level. One reason that we get
away with this blindness for so long is that it does not necessarily
show itself in small systems or in relatively small scale runs, or in
systems where the developmental mechanisms (the worst culprits for
bringing out the complexity) have not yet been implemented.
Step 4) Having let some complexity in through the back door, we then
keep hacking away at the design, hoping that somewhere in the design
neighborhood there is a solution that is both ALMOST compositional (i.e.
interpretable semantics, truth-preserving, etc.) and slightly complex.
In reality, we have most likely boxed ourselves in because of our
initial (quixotic) emphasis on making the semantics interpretable.
Hmmm... if my luck runs the way it usually does, all this will be as
clear as mud. Oh well. :-(
This commentary is not, of course, specific to Novamente, but is really
about an entire class of AGI systems that belong in the same family as
Novamente. My problem with Novamente is really that I do not see it
being flexible enough to throw out the meaningful, interpretable parameters.
Richard Loosemore
Novamente looks, at times, like the very first step in an inductive
proof . . . . except that it is in a chaotic environment rather than the
nice orderly number system. Pieces of the system clearly sail in calm,
friendly waters but hooking them all up in a wild environment is another
story entirely (again, look at your own BioMind stories).
I've got many doubts because I don't think that you have a handle on
the order -- the big-O -- of many of the operations you are proposing
(why I harp on scalability, modularity, etc.). Richard is going further
and saying that the predictability of even some of your smaller/simpler
operations is impossible (although, as he has pointed out, many of
them could be constrained by attractors, etc. if you were so inclined to
view/treat your design that way).
Personally, I believe that intelligence is *not* complex -- despite
the fact that it does (probably necessarily) rest on top of complex
pieces -- because those pieces' interactions are constrained enough that
intelligence is stable. I think that this could be built into a
Novamente-type design *but* you have to be attempting to do so (and I
think that I could convince Richard of that -- or else, I'd learn a lot
by trying :-).
Richard's main point is that he believes that the search space of
viable parameters and operations for Novamente is small enough that
you're not going to hit it by accident -- and Novamente's very
flexibility is what compounds the problem. Remember, life exists on the
boundary between order and chaos. Too much flexibility (unconstrained
chaos) is as deadly as too much structure.
I think that I see both sides of the issue and how Novamente could
be altered/enhanced to make Richard happy (since it's almost universally
flexible) -- but doing so would also impose many constraints that I
think that you would be unwilling to live with since I'm not sure that
you would see the point. I don't think that you're ever going to be
able to change his view that the current direction of Novamente
is -- pick one: a) a needle in an infinite haystack or b) too fragile
to succeed -- particularly since I'm pretty sure that you couldn't
convince me without making some serious additions to Novamente.
----- Original Message -----
*From:* Benjamin Goertzel <mailto:[EMAIL PROTECTED]>
*To:* agi@v2.listbox.com <mailto:agi@v2.listbox.com>
*Sent:* Monday, November 12, 2007 3:49 PM
*Subject:* Re: [agi] What best evidence for fast AI?
To be honest, Richard, I do wonder whether a sufficiently in-depth
conversation about AGI between us would result in you changing your views
about the CSP problem in a way that would accept the possibility of
Novamente-type solutions.
But, this conversation as I'm envisioning it would take dozens of hours,
and would require you to first spend 100+ hours studying detailed NM
materials, so this seems unlikely to happen in the near future.
-- Ben
On Nov 12, 2007 3:32 PM, Richard Loosemore <[EMAIL PROTECTED]
<mailto:[EMAIL PROTECTED]>> wrote:
Benjamin Goertzel wrote:
>
> Ed --
>
> Just a quick comment: Mark actually read a bunch of the
proprietary,
> NDA-required Novamente documents and looked at some source
code (3 years
> ago, so a lot of progress has happened since then). Richard
didn't, so
> he doesn't have the same basis of knowledge to form detailed
comments on
> NM, that Mark does.
This is true, but not important to my line of argument, since of course
I believe that a problem exists (CSP), which we have discussed on a
number of occasions, and your position is not that you have some
proprietary, unknown-to-me solution to the problem, but rather that you
do not really think there is a problem.
Richard Loosemore
-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&