Mark Waser wrote:
I'm going to try to put some words into Richard's mouth here since I'm curious to see how close I am . . . . (while radically changing the words). I think that Richard is not arguing about the possibility of Novamente-type solutions as much as he is arguing about the predictability of *very* flexible Novamente-type solutions as they grow larger and more complex (and the difficulty in getting it to not instantaneously "crash-and-burn"). Indeed, I have heard a very faint shadow of Richard's concerns in your statements about the "tuning" problems that you had with BioMind.

This is true, but not precise enough to capture the true nature of my worry.

Let me focus on one aspect of the problem. My goal here is to describe in a little detail how the Complex Systems Problem actually bites in a particular case.

Suppose that in some significant part of Novamente there is a representation system that uses "probability" or "likelihood" numbers to encode the strength of facts, as in [I like cats](p=0.75). The (p=0.75) is supposed to express the idea that the statement [I like cats] is in some sense "75% true".

[Quick qualifier: I know that this oversimplifies the real situation in Novamente, but I need to do this simplification in order to get my point across, and I am pretty sure this will not affect my argument, so bear with me].

We all know that this p value is not quite a "probability" or "likelihood" or "confidence factor". It plays a very ambiguous role in the system, because on the one hand we want it to be very much like a probability in the sense that we want to do calculations with it: we NEED a calculus of such values in order to combine facts in the system to make inferences. But on the other hand we do not want to lock ourselves into a particular interpretation of what it means, because we know full well that we do not really have a clear semantics for these numbers.
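
To make this concrete, here is a minimal sketch (entirely my own illustration, with made-up names and a deliberately naive combination rule, not Novamente's actual representation or inference formulas) of facts tagged with p values and the sort of calculus we need in order to combine them:

from dataclasses import dataclass

@dataclass
class Fact:
    """A statement tagged with a single 'strength' number."""
    statement: str
    p: float  # between 0.0 and 1.0, but what does it *mean*?

def combine_and(a: Fact, b: Fact) -> Fact:
    """Naive conjunction rule that treats p as the probability of an
    independent event.  The rule is only justified if the p values
    really are probabilities of independent events, which is exactly
    the interpretation we have been careful not to commit to."""
    return Fact(f"({a.statement}) AND ({b.statement})", a.p * b.p)

likes_cats = Fact("I like cats", 0.75)
cat_nearby = Fact("There is a cat nearby", 0.50)   # made-up value
print(combine_and(likes_cats, cat_nearby))
# Fact(statement='(I like cats) AND (There is a cat nearby)', p=0.375)

Nothing hangs on the details of this sketch; the point is only that *some* combination rule has to be chosen before the system can chain facts together at all.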

Either way, we have a problem: a fact like [I like cats](p=0.75) is ungrounded because we have to interpret it. Does it mean that I like cats 75% of the time? That I like 75% of all cats? 75% of each cat? Are the cats that I like always the same ones, or is the chance of an individual cat being liked by me something that changes? Does it mean that I like all cats, but only 75% as much as I like my human family, which I like(p=1.0)? And so on and so on.

Digging down to the root of this problem (and this is the point where I am skipping from baby stuff to hard-core AI): we want these numbers to be semantically compositional and interpretable, but in order to make sure they are grounded, the system itself is going to have to build them and interpret them without our help ... and it is not clear that this grounding can be completely implemented. Why is it not clear? Because when you try to build the entire grounding mechanism(s) you are forced to become explicit about what these numbers mean, during the process of building a grounding system that you can trust to be doing its job: you cannot create a mechanism that you *know* is constructing sensible p numbers and facts during all of its development *unless* you finally bite the bullet and say what the p numbers really mean, in fully cashed-out terms.

[Suppose you did not do this. Suppose you built the grounding mechanism but remained ambiguous about the meaning of the p numbers. What would the resulting system be computing? From end to end it would be building facts with p numbers, but you the human observer would still be imposing an interpretation on the facts. And if you are still doing any of the interpreting, it cannot be grounded].

Now, as far as I understand it, the standard approach to this conundrum is that researchers (in Novamente and elsewhere) do indeed make an attempt to disambiguate the p numbers, but they do it by developing more sophisticated logical systems. First, perhaps, error bands around the p values instead of sharp values. Then temporal logic mechanisms to deal with time. Perhaps clusters of p and q and r and s values, each with slightly different zones of applicability. More generally, people try to give structure to the qualifiers that are appended to the facts: [I like cats](qualifier=value) instead of [I like cats](p=0.75).

The question is, does this process of refinement have an end? Does it really lead to a situation where the qualifier is disambiguated and the semantics is clear enough to build a trustworthy grounding system? Is there a closed-form solution to the problem of building a logic that disambiguates the qualifiers?

Here is what I think will happen if this process is continued. In order to make the semantics unambiguous enough to let the system ground its own knowledge without our help in interpreting the p values, researchers will develop more and more sophisticated logics (with more and more structured replacements for that simple p value), until they are forced to introduce ideas that are so complicated that they no longer allow you to do the full job of compositionality: you cannot combine some facts and have the combination of the complicated p-structures still be interpretable. For example, if the system is encoded with such stuff as

[I like cats](general-likelihood=0.75 +- 0.05,
              mood-variability=0.10 +- 0.01,
              time-stability=0.99 +0.005 -0.03,
              overall-unsureness=0.07,
              special-circumstances-count=5)

then can we be *absolutely* sure that a combination of facts of this sort is going to preserve its accuracy across long ranges of inference? Can we combine this fact with an [I am allergic to cats](....) fact to come to a clear conclusion about the proposition [I want to sit down and let Sleti jump onto my lap](....)?

If we built a calculus to handle such structured facts, would we be kidding ourselves about whether the semantics was *really* compositional...? Or would we just be sweeping the ambiguity of the interpretation of these facts under the carpet? Hiding the ambiguity inside an impossibly dense thicket of qualifiers?
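
To illustrate where I think that refinement process leads, here is a caricature (entirely my own, with hypothetical field names echoing the structured fact above, and not any actual Novamente structure or formula). Once the qualifier has this much internal structure, every line of the combination rule is an arbitrary convention:

from dataclasses import dataclass

@dataclass
class Qualifier:
    """Hypothetical structured replacement for the single p value."""
    general_likelihood: float          # e.g. 0.75
    likelihood_error: float            # e.g. 0.05
    mood_variability: float
    time_stability: float
    overall_unsureness: float
    special_circumstances_count: int

def combine(a: Qualifier, b: Qualifier) -> Qualifier:
    """Combine the qualifiers of two facts being conjoined.

    Should error bands add, multiply, or take the max?  Does the mood
    variability of a combination average that of the parts?  Is time
    stability the minimum of the two?  Each answer below is a guess,
    and the guesses interact in ways that are very hard to audit once
    inferences are chained together."""
    return Qualifier(
        general_likelihood=a.general_likelihood * b.general_likelihood,
        likelihood_error=a.likelihood_error + b.likelihood_error,          # why add?
        mood_variability=(a.mood_variability + b.mood_variability) / 2,    # why average?
        time_stability=min(a.time_stability, b.time_stability),            # why min?
        overall_unsureness=max(a.overall_unsureness, b.overall_unsureness),
        special_circumstances_count=(a.special_circumstances_count
                                     + b.special_circumstances_count),
    )

A calculus built out of such choices may well be internally consistent, but that is not the same as its results meaning anything in particular.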

Here, then, are the two conclusions from this phase of my comment:

1) I do not believe anyone seriously knows whether there is any end to the research process of trying to get a logic that does this disambiguation. I think it is an endeavor driven by pure hope.

2) I believe that, in the end, this search for a good enough logic will result in the construction of a grounding system (i.e. a mechanism that is able to pick up and autonomously interpret all its own facts about the world) that actually has NOT been disambiguated, and that for this reason it will start to fall apart when used in large-scale situations, with large numbers of facts and/or over large stretches of autonomous functioning. I think people will sweep the disambiguation problem under the carpet, and only notice that they are getting bitten by it when the large-scale system does not seem to generate coherent, sensible knowledge when left to its own devices.

This second point is where I finally meet up with your comment about problems at the larger scale, and the system crashing and burning. I think it will be a slowish crash. Incidentally, I presume I do not need to labor the point that this will probably appear at the larger scale but might not be so obvious in small-scale or toy demonstrations of the mechanisms.

I need to finish by making a point about what I see as the underlying cause of this problem.

The whole thing started because we wanted our p numbers to be interpretable. What I believe will happen as a result of imposing this design constraint is that we severely restrict the space of possible grounding mechanisms that we allow ourselves to consider. By doing so, we box ourselves into an increasingly tight corner, searching for a solution that preserves compositional semantics, THEN quietly giving up on the idea when we get into the depths of some horrendous temporal/pragmatic/affective/case-based logic 8-) that we cannot, after all, interpret ...... and then, having boxed ourselves into that neighborhood of the space of all possible representational systems, we find that there simply is no solution, given all those constraints. (But, being stubborn, we carry on hacking away at it forever anyway).

So what is the solution? Well, easy: do not even try to make those p numbers interpretable. Build systems that build their own representations, give 'em p numbers to play with (and q and r and s numbers if they want them), but let the mechanisms themselves use those numbers without ever trying to exactly interpret them. Frankly, why should we expect those numbers to be interpretable at all? Why should we expect there to be a *calculus* that allows us to prove that a system is truth-preserving?

In such a system the "truth value" of a fact would not be represented inside the object(s) that encoded the fact; it would be the result of a cluster of objects constraining one another. So, if the system has in it the fact [I like cats], this would be connected to a host of other facts, in such a way that if the system were asked "Do you like cats?" it would build a large representation of the question and of the implications that were relevant in the present context, and the result of all those objects interacting would be the thing that generated the answer. If the system (like a person filling in a questionnaire) were forced to give an answer on a continuous scale between 0 and 100, it might well put its mark at the 75% level, but this would not be the result of retrieving a p value; it would be a nebulous, fleeting result of the interaction of all the structures involved (and the next time it was asked, the value would probably be different).
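
As a toy sketch of what I have in mind (my own illustration; the facts, link weights and noise term are all invented for the example), the answer is not read off a stored p value, it falls out of whichever related structures happen to be active at the moment:

import random

# A hypothetical web of facts related to [I like cats].  The numbers are
# link strengths used by the mechanism; they are not meant to be
# interpretable "truth values" of the facts themselves.
LINKS = {
    "I like cats": {
        "cats purr on my lap": +0.8,
        "cat hair makes me sneeze": -0.6,
        "fond memories of Sleti": +0.9,
        "scratched furniture": -0.3,
    },
}

def answer_strength(question, active_context):
    """Emergent 'strength' of an answer: the interaction of whichever
    related facts are currently active, plus a little noise standing in
    for all the structure not modelled here."""
    related = LINKS.get(question, {})
    weights = [w for fact, w in related.items() if fact in active_context]
    if not weights:
        return 0.5                                # no opinion either way
    raw = sum(weights) / len(weights)             # roughly in [-1, +1]
    raw += random.uniform(-0.05, 0.05)            # fleeting, contextual
    return max(0.0, min(1.0, (raw + 1.0) / 2))    # squash into [0, 1]

# Asked twice, with slightly different facts active, the mark on the
# 0-to-100 scale comes out differently: there is no single p value.
print(round(100 * answer_strength("I like cats",
                                  {"cats purr on my lap", "fond memories of Sleti"})))
print(round(100 * answer_strength("I like cats",
                                  {"cats purr on my lap", "cat hair makes me sneeze"})))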

Similarly, if the system were trying to decide whether or not to allow a particular cat to jump up on its lap, given that it generally liked cats, but was somewhat allergic, the decision would not be the result of a combination of p numbers (be they ever so complicated), but the result of a collision of some huge, extended structures involving many facts. The collision would certainly involve some weighing of p, q, r and s (etc.) numbers stored in these objects, but these numbers would not be interpretable, and the combination process would not be consistent with a logical calculus.

There is much more that could be said about the methodology needed to find mechanisms that could do this, but leaving that aside for the moment, there is just the big philosophical question of whether to give up our obsession with interpretable semantics, or whether to be so scared of complex systems (because of course, such a system would very likely introduce the Complex Systems Bogeyman) that we do not dare try it.

That is a huge difference in philosophy. It is not just a small matter of technique; it is a huge perspective change.

So, to conclude, when I say that intelligence involves an irreducible amount of complexity, I mean only that there are some situations in the design of AGI systems, like the case I have just described, where I see people going through a bizarre process:

Step 1) We decide that we must make our AGI as non-complex as possible, so we can prove *something* about how knowledge-bits combine to make reliable new knowledge-bits (in the above case, we try to make the mechanism as much like a probability calculus or logical calculus as possible, because we know that in the purest examples of such things we can preserve truth as knowledge is added).

Step 2) We are eventually forced to compromise our principles and introduce hacks that flush the truth-preservation guarantees down the toilet: in the above case, we complicate the qualifiers in our logic until we can no longer really be sure what the semantics is when we combine them (and in the related case of inference control engines, we allow such engines to do funky, truncated explorations of the space of possible inferences, with unpredictable consequences; see the sketch after this list).

Step 3) We then refuse to acknowledge that what we have got, now, is a compromise that *is* a complex system: its overall behavior is subtly dependent on interactions down at the low level. One reason that we get away with this blindness for so long is that it does not necessarily show itself in small systems, or in relatively small-scale runs, or in systems where the developmental mechanisms (the worst culprits for bringing out the complexity) have not yet been implemented.

Step 4) Having let some complexity in through the back door, we then keep hacking away at the design, hoping that somewhere in the design neighborhood there is a solution that is both ALMOST compositional (i.e. interpretable semantics, truth-preserving, etc.) and only slightly complex. In reality, we have most likely boxed ourselves in because of our initial (quixotic) emphasis on making the semantics interpretable.
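
For the "truncated explorations" remark in Step 2, here is the kind of thing I mean, sketched in the same toy style as before (a generic budget-limited best-first search, not Novamente's actual inference control). Which conclusions get drawn depends on the scoring heuristic and on where the budget happens to run out, so two nearly identical runs can end up knowing different things:

import heapq

def truncated_inference(facts, rules, budget=50):
    """Explore possible inferences best-first, stopping after a fixed
    budget of expansions.

    facts: dict mapping statement -> strength in [0, 1]
    rules: list of (premise_a, premise_b, conclusion) triples

    The conclusions reached, and their strengths, depend on the scoring,
    the tie-breaking and the budget, which is why the behaviour of the
    whole system becomes hard to predict."""
    known = dict(facts)
    frontier = [(-strength, stmt) for stmt, strength in facts.items()]
    heapq.heapify(frontier)
    expansions = 0
    while frontier and expansions < budget:
        _, stmt = heapq.heappop(frontier)
        expansions += 1
        for prem_a, prem_b, concl in rules:
            if (stmt in (prem_a, prem_b) and prem_a in known
                    and prem_b in known and concl not in known):
                strength = known[prem_a] * known[prem_b]   # the naive rule again
                known[concl] = strength
                heapq.heappush(frontier, (-strength, concl))
    return known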


Hmmm... if my luck runs the way it usually does, all this will be as clear as mud. Oh well. :-(

This commentary is not, of course, specific to Novamente, but is really about an entire class of AGI systems that belong in the same family as Novamente. My problem with Novamente is really that I do not see it being flexible enough to throw out the meaningful, interpretable parameters.





Richard Loosemore





Novamente looks, at times, like the very first step in an inductive proof . . . . except that it is in a chaotic environment rather than the nice orderly number system. Pieces of the system clearly sail in calm, friendly waters but hooking them all up in a wild environment is another story entirely (again, look at your own BioMind stories). I've got many doubts because I don't think that you have a handle on the order -- the big (O) -- of many of the operations you are proposing (why I harp on scalability, modularity, etc.).

Richard is going further and saying that the predictability of even some of your smaller/simpler operations is impossible (although, as he has pointed out, many of them could be constrained by attractors, etc. if you were so inclined to view/treat your design that way). Personally, I believe that intelligence is *not* complex -- despite the fact that it does (probably necessarily) rest on top of complex pieces -- because those pieces' interactions are constrained enough that intelligence is stable. I think that this could be built into a Novamente-type design *but* you have to be attempting to do so (and I think that I could convince Richard of that -- or else, I'd learn a lot by trying :-).

Richard's main point is that he believes that the search space of viable parameters and operations for Novamente is small enough that you're not going to hit it by accident -- and Novamente's very flexibility is what compounds the problem. Remember, life exists on the boundary between order and chaos. Too much flexibility (unconstrained chaos) is as deadly as too much structure.

I think that I see both sides of the issue and how Novamente could be altered/enhanced to make Richard happy (since it's almost universally flexible) -- but doing so would also impose many constraints that I think that you would be unwilling to live with since I'm not sure that you would see the point. I don't think that you're ever going to be able to change his view that the current direction of Novamente is -- pick one: a) a needle in an infinite haystack or b) too fragile to succeed -- particularly since I'm pretty sure that you couldn't convince me without making some serious additions to Novamente.
    ----- Original Message -----
    *From:* Benjamin Goertzel <[EMAIL PROTECTED]>
    *To:* agi@v2.listbox.com
    *Sent:* Monday, November 12, 2007 3:49 PM
    *Subject:* Re: [agi] What best evidence for fast AI?


    To be honest, Richard, I do wonder whether a sufficiently in-depth
    conversation
    about AGI between us would result in you changing your views about
    the CSP
    problem in a way that would accept the possibility of Novamente-type
    solutions.

    But, this conversation as I'm envisioning it would take dozens of
    hours, and would
    require you to first spend 100+ hours studying detailed NM
    materials, so this seems
    unlikely to happen in the near future.

    -- Ben

    On Nov 12, 2007 3:32 PM, Richard Loosemore <[EMAIL PROTECTED]> wrote:

        Benjamin Goertzel wrote:
         >
         > Ed --
         >
         > Just a quick comment: Mark actually read a bunch of the
        proprietary,
         > NDA-required Novamente documents and looked at some source
        code (3 years
         > ago, so a lot of progress has happened since then).  Richard
        didn't, so
         > he doesn't have the same basis of knowledge to form detailed
        comments on
         > NM, that Mark does.

        This is true, but not important to my line of argument, since of
        course
        I believe that a problem exists (CSP), which we have discussed on a
        number of occasions, and your position is not that you have some
        proprietary, unknown-to-me solution to the problem, but rather
        that you
        do not really think there is a problem.

        Richard Loosemore
