Philip Goetz gave an example of an intrusion detection system that learned
information that was not comprehensible to humans. You argued that he could
have understood it if he tried harder.

No, I gave five separate alternatives, most of which put the blame on the system for not being able to compress its data pattern into knowledge and explain it to Philip. As I keep saying (and am trying to rephrase better here), the problem with statistical and similar systems is that they generally don't pick out and isolate salient features (unless you are lucky enough to have constrained them to exactly the correct number of variables). Since they don't pick out and isolate features, they can't build upon what they have learned.

I disagreed and argued that an
explanation would be useless even if it could be understood.

In your explanation, however, you basically *did* explain exactly what the system did. Clearly, the intrusion detection system looks at a number of variables and, if their weighted sum exceeds a threshold, decides that the connection is likely an intrusion. The only real question is the degree of entanglement of the variables in the real world. It is *possible*, though I would argue extremely unlikely, that the variables really are entangled enough in the real world that a human being couldn't be trained to do intrusion detection. It is much, much, *MUCH* more probable that the system has improperly entangled the variables because it has too many degrees of freedom.
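
To be concrete about the kind of decision rule I am describing, here is a
minimal Python sketch.  The feature names, weights, and threshold are
invented purely for illustration -- they are not your system or anyone
else's:

# Minimal sketch of a weighted-sum threshold detector.  Everything here is
# hypothetical; no real intrusion detection system is being described.
FEATURE_WEIGHTS = {
    "fragmented_after_get": 2.5,   # e.g., TCP stream fragmented right after "GET"
    "lowercase_commands": 1.0,
    "unusual_ip_id_value": 0.8,
    "rare_option_combo": 1.7,
}
THRESHOLD = 3.0

def looks_like_intrusion(features: dict) -> bool:
    """Flag a connection when the weighted sum of its features crosses a threshold."""
    score = sum(FEATURE_WEIGHTS.get(name, 0.0) * value
                for name, value in features.items())
    return score > THRESHOLD

# Because the rule is just a weighted sum, a human can read off exactly which
# features fired and how much each one contributed to the score.
print(looks_like_intrusion({"fragmented_after_get": 1, "rare_option_combo": 1}))

A human can audit a rule like that directly; the only open question is
whether the real-world variables can actually be disentangled into named
terms like these.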

If you use a computer to add up a billion numbers, do you check the math, or
do you trust it to give you the right answer?

I trust it to give me the right answer because I know and understand exactly what it is doing.

My point is that when AGI is built, you will have to trust its answers based
on the correctness of the learning algorithms, and not by examining the
internal data or tracing the reasoning.

The problems are that 1) correct learning algorithms will still give bad results if given bad data, *and* 2) you haven't said how you are going to ensure that your learning algorithms are correct under all of the circumstances in which you're using them.
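
To make point 1 concrete, here is a toy Python sketch (the data are synthetic
and purely illustrative): even a provably correct algorithm such as ordinary
least squares happily learns garbage when it is fed garbage.

# A correct learning algorithm given bad data: ordinary least squares fit to
# labels where half the values were recorded with the wrong sign.
# All data here are synthetic and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y_true = 3.0 * x + 1.0           # the real relationship

y_bad = y_true.copy()
y_bad[:100] *= -1                # half the labels corrupted

slope_clean = np.polyfit(x, y_true, 1)[0]
slope_bad = np.polyfit(x, y_bad, 1)[0]

print(f"slope learned from clean data: {slope_clean:.2f}")   # ~3.0
print(f"slope learned from bad data:   {slope_bad:.2f}")     # nowhere near 3.0

The algorithm did exactly what it was proven to do; the answer is still
wrong, and nothing in the correctness of the algorithm warns you about it.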

I believe this is the fundamental
flaw of all AI systems based on structured knowledge representations, such as
first order logic, frames, connectionist systems, term logic, rule based
systems, and so on.  The evidence supporting my assertion is:
1. The relative success of statistical models vs. structured knowledge.

Statistical models are successful at pattern-matching and recognition. I am not aware of *anything* else that they are successful at. I am fully aware of Jeff Hawkins' contention that pattern-matching is the only thing that the brain does, but I would argue that the brain's pattern-matching includes feature extraction and knowledge compression, that current statistical AI models do neither, and that that is why current statistical models are anything but AI.

Straight statistical models like the ones you are touting are never going to get you to AI until you can successfully build them on top of each other -- and to do that, you need feature extraction and thus explainability. An AGI is certainly going to use statistics for feature extraction and the like, but knowledge is *NOT* going to be kept in raw, badly entangled statistical form (i.e. basically compressed data rather than knowledge). If you were to add functionality to a statistical system such that it could extract features and use them to explain its results, then I would say that it is on the way to AGI. The point is that your statistical systems can't correctly explain their results even to an unlimited being (because most of the time they are incorrectly entangled anyway).
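
To be equally concrete about what "feature extraction plus explanation" would
look like, here is a toy Python sketch.  It assumes numpy and scikit-learn
are available, and the statistic names and data are invented:

# Run PCA over per-connection statistics, then report which *named* inputs
# dominate each component, so the result can be stated in human terms
# instead of being handed back as an opaque vector.  All names and data are
# invented for illustration.
import numpy as np
from sklearn.decomposition import PCA

STAT_NAMES = ["packet_count", "bytes_sent", "frag_after_get",
              "lowercase_cmds", "ip_id_entropy"]
X = np.random.rand(1000, len(STAT_NAMES))   # stand-in for real per-connection stats

pca = PCA(n_components=2).fit(X)

for i, component in enumerate(pca.components_):
    # Isolate the salient named inputs for each component.
    top = sorted(zip(STAT_NAMES, component), key=lambda p: abs(p[1]), reverse=True)[:2]
    summary = ", ".join(f"{name} ({weight:+.2f})" for name, weight in top)
    print(f"component {i}: mostly {summary}")

That last loop is the part current statistical systems don't do: naming the
salient features so that both a human and the system itself can build on
them.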


----- Original Message ----- From: "Matt Mahoney" <[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Sunday, December 03, 2006 11:11 PM
Subject: Re: [agi] A question on the symbol-system hypothesis


Mark,

Philip Goetz gave an example of an intrusion detection system that learned
information that was not comprehensible to humans. You argued that he could
have understood it if he tried harder.  I disagreed and argued that an
explanation would be useless even if it could be understood.

If you use a computer to add up a billion numbers, do you check the math, or
do you trust it to give you the right answer?

My point is that when AGI is built, you will have to trust its answers based
on the correctness of the learning algorithms, and not by examining the
internal data or tracing the reasoning.  I believe this is the fundamental
flaw of all AI systems based on structured knowledge representations, such as
first order logic, frames, connectionist systems, term logic, rule based
systems, and so on.  The evidence supporting my assertion is:

1. The relative success of statistical models vs. structured knowledge.
2. Arguments based on algorithmic complexity. (The brain cannot model a more
complex machine).
3. The two examples above.

I'm afraid that's all the arguments I have.  Until we build AGI, we really
won't know. I realize I am repeating (summarizing) what I have said before.
If you want to tear down my argument line by line, please do it privately
because I don't think the rest of the list will be interested.

--- Mark Waser <[EMAIL PROTECTED]> wrote:

Matt,

    Why don't you try addressing my points instead of simply repeating
things that I acknowledged and answered and then trotting out tired old red
herrings?

As I said, your network intrusion anomaly detector is a pattern matcher.

It is a stupid pattern matcher that can't explain its reasoning and can't
build upon what it has learned.

    You, on the other hand, gave a very good explanation of how it works.
Thus, you have successfully proved that you are an explaining intelligence
and it is not.

If anything, you've further proved my point that an AGI is going to have to
be able to explain/be explained.


----- Original Message ----- From: "Matt Mahoney" <[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Saturday, December 02, 2006 5:17 PM
Subject: Re: [agi] A question on the symbol-system hypothesis


>
> --- Mark Waser <[EMAIL PROTECTED]> wrote:
>
>> A nice story but it proves absolutely nothing . . . . .
>
> I know a little about network intrusion anomaly detection (it was my
> dissertation topic), and yes it is an important lesson.
>
> Network traffic containing attacks has a higher algorithmic complexity than
> traffic without attacks.  It is less compressible.  The reason has nothing
> to do with the attacks, but with arbitrary variations in protocol usage made
> by the attacker.  For example, the Code Red worm fragments the TCP stream
> after the HTTP "GET" command, making it detectable even before the buffer
> overflow code is sent in the next packet.  A statistical model will learn
> that this is unusual (even though legal) in normal HTTP traffic, but offer
> no explanation why such an event should be hostile.  The reason such
> anomalies occur is because when attackers craft exploits, they follow enough
> of the protocol to make it work but often don't care about the undocumented
> conventions followed by normal servers and clients.  For example, they may
> use lower case commands where most software uses upper case, or they may put
> unusual but legal values in the TCP or IP-ID fields or a hundred other
> things that make the attack stand out.  Even if they are careful, many
> exploits require unusual commands or combinations of options that rarely
> appear in normal traffic and are therefore less carefully tested.
>
> So my point is that it is pointless to try to make an anomaly detection
> system explain its reasoning, because the only explanation is that the
> traffic is unusual.  The best you can do is have it estimate the probability
> of a false alarm based on the information content.
>
> So the lesson is that AGI is not the only intelligent system where you
> should not waste your time trying to understand what it has learned.  Even
> if you understood it, it would not tell you anything.  Would you understand
> why a person made some decision if you knew the complete state of every
> neuron and synapse in his brain?
>
>
>> You developed a pattern-matcher.  The pattern matcher worked (and I would
>> dispute that it worked better "than it had a right to").  Clearly, you do
>> not understand how it worked.  So what does that prove?
>>
>> Your contention (or, at least, the only one that continues the previous
>> thread) seems to be that you are too stupid to ever understand the pattern
>> that it found.
>>
>> Let me offer you several alternatives:
>> 1)  You missed something obvious
>> 2)  You would have understood it if the system could have explained it to
>> you
>> 3)  You would have understood it if the system had managed to losslessly
>> convert it into a more compact (and comprehensible) format
>> 4)  You would have understood it if the system had managed to losslessly
>> convert it into a more compact (and comprehensible) format and explained it
>> to you
>> 5)  You would have understood it if the system had managed to lossily
>> convert it into a more compact (and comprehensible -- and probably even,
>> more correct) format
>> 6)  You would have understood it if the system had managed to lossily
>> convert it into a more compact (and comprehensible -- and probably even,
>> more correct) format and explained it to you
>>
>> My contention is that the pattern that it found was simply not translated
>> into terms you could understand and/or explained.
>>
>> Further, and more importantly, the pattern matcher *doesn't* understand its
>> results either and certainly couldn't build upon them -- thus, it *fails*
>> the test as far as being the central component of an RSIAI or being able to
>> provide evidence as to the required behavior of such.
>>
>> ----- Original Message -----
>> From: "Philip Goetz" <[EMAIL PROTECTED]>
>> To: <agi@v2.listbox.com>
>> Sent: Friday, December 01, 2006 7:02 PM
>> Subject: Re: [agi] A question on the symbol-system hypothesis
>>
>>
>> > On 11/30/06, Mark Waser <[EMAIL PROTECTED]> wrote:
>> >>     With many SVD systems, however, the representation is more
>> >> vector-like and *not* conducive to easy translation to human terms.  I
>> >> have two answers to these cases.  Answer 1 is that it is still easy for
>> >> a human to look at the closest matches to a particular word pair and
>> >> figure out what they have in common.
>> >
>> > I developed an intrusion-detection system for detecting brand new
>> > attacks on computer systems.  It takes TCP connections, and produces
>> > 100-500 statistics on each connection.  It takes thousands of
>> > connections, and runs these statistics thru PCA to come up with 5
>> > dimensions.  Then it clusters each connection, and comes up with 1-3
>> > clusters per port that have a lot of connections and are declared to
>> > be "normal" traffic.  Those connections that lie far from any of those
>> > clusters are identified as possible intrusions.
>> >
>> > The system worked much better than I expected it to, or than it had a
>> > right to.  I went back and, by hand, tried to figure out how it was
>> > classifying attacks.  In most cases, my conclusion was that there was
>> > *no information available* to tell whether a connection was an attack,
>> > because the only information to tell that a connection was an attack was
>> > in the TCP packet contents, while my system looked only at packet
>> > headers.  And yet, the system succeeded in placing about 50% of all
>> > attacks in the top 1% of suspicious connections.  To this day, I don't
>> > know how it did it.
>> >
>
>
> -- Matt Mahoney, [EMAIL PROTECTED]
>




-- Matt Mahoney, [EMAIL PROTECTED]



