Mark,

Philip Goetz gave an example of an intrusion detection system that learned 
information that was not comprehensible to humans.  You argued that he could
have understood it if he tried harder.  I disagreed and argued that an
explanation would be useless even if it could be understood.

If you use a computer to add up a billion numbers, do you check the math, or
do you trust it to give you the right answer?
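
(To make that concrete -- a toy Python sketch, not anything from the
intrusion detection work.  The cross-check verifies the summation
algorithm, not the billion individual additions.)

# Sum a billion numbers.  You trust the result because you trust the
# algorithm (here cross-checked against the closed-form formula), not
# because you trace each of the 10^9 intermediate steps.
n = 1_000_000_000
total = sum(range(n))        # brute-force summation (slow but simple)
check = n * (n - 1) // 2     # closed form: 0 + 1 + ... + (n-1)
assert total == check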

My point is that when AGI is built, you will have to trust its answers based
on the correctness of the learning algorithms, not on examining the internal
data or tracing the reasoning.  I believe this is the fundamental flaw of
all AI systems based on structured knowledge representations, such as
first-order logic, frames, connectionist systems, term logic, rule-based
systems, and so on.  The evidence supporting my assertion is:

1. The relative success of statistical models vs. structured knowledge.
2. Arguments based on algorithmic complexity (the brain cannot model a
machine more complex than itself).
3. The two examples above.

I'm afraid those are all the arguments I have.  Until we build AGI, we
really won't know.  I realize I am repeating (summarizing) what I have
said before.
If you want to tear down my argument line by line, please do it privately
because I don't think the rest of the list will be interested.

--- Mark Waser <[EMAIL PROTECTED]> wrote:

> Matt,
> 
>     Why don't you try addressing my points instead of simply repeating
> things that I acknowledged and answered and then trotting out tired old
> red herrings?
> 
>     As I said, your network intrusion anomaly detector is a pattern
> matcher.  It is a stupid pattern matcher that can't explain its reasoning
> and can't build upon what it has learned.
> 
>     You, on the other hand, gave a very good explanation of how it works. 
> Thus, you have successfully proved that you are an explaining intelligence 
> and it is not.
> 
>     If anything, you've further proved my point that an AGI is going to
> have to be able to explain/be explained.
> 
> 
> ----- Original Message ----- 
> From: "Matt Mahoney" <[EMAIL PROTECTED]>
> To: <agi@v2.listbox.com>
> Sent: Saturday, December 02, 2006 5:17 PM
> Subject: Re: [agi] A question on the symbol-system hypothesis
> 
> 
> >
> > --- Mark Waser <[EMAIL PROTECTED]> wrote:
> >
> >> A nice story but it proves absolutely nothing . . . . .
> >
> > I know a little about network intrusion anomaly detection (it was my
> > dissertation topic), and yes, it is an important lesson.
> >
> > Network traffic containing attacks has a higher algorithmic complexity
> > than traffic without attacks.  It is less compressible.  The reason has
> > nothing to do with the attacks, but with arbitrary variations in
> > protocol usage made by the attacker.  For example, the Code Red worm
> > fragments the TCP stream after the HTTP "GET" command, making it
> > detectable even before the buffer overflow code is sent in the next
> > packet.  A statistical model will learn that this is unusual (even
> > though legal) in normal HTTP traffic, but offer no explanation why such
> > an event should be hostile.  The reason such anomalies occur is because
> > when attackers craft exploits, they follow enough of the protocol to
> > make it work but often don't care about the undocumented conventions
> > followed by normal servers and clients.  For example, they may use
> > lower case commands where most software uses upper case, or they may
> > put unusual but legal values in the TCP or IP-ID fields or a hundred
> > other things that make the attack stand out.  Even if they are careful,
> > many exploits require unusual commands or combinations of options that
> > rarely appear in normal traffic and are therefore less carefully tested.
> >
> > So my point is that it is pointless to try to make an anomaly detection
> > system explain its reasoning, because the only explanation is that the
> > traffic is unusual.  The best you can do is have it estimate the
> > probability of a false alarm based on the information content.
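
(A rough Python sketch of the kind of scoring being described -- not the
actual detector.  The field values, counts, and smoothing below are made
up for illustration; the point is that the only "explanation" available
is a surprise value, -log2 of the estimated probability of what was
observed.)

import math
from collections import Counter

# Learn how often each value of a protocol field appears in normal
# traffic, then score new traffic by its information content.
def train(normal_values):
    counts = Counter(normal_values)
    total = sum(counts.values())
    # Add-one smoothing so unseen values get a small nonzero probability.
    return lambda v: (counts[v] + 1) / (total + len(counts) + 1)

# Hypothetical training data: HTTP method tokens from normal traffic.
prob = train(["GET"] * 9000 + ["POST"] * 900 + ["HEAD"] * 100)

for method in ["GET", "POST", "get"]:
    print(method, -math.log2(prob(method)), "bits of surprise")

# "get" is perfectly legal HTTP, but rare in normal traffic, so its
# surprise is high.  The model cannot say *why* it is hostile; the
# probability is also, roughly, an estimate of the false alarm rate.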
> >
> > So the lesson is that AGI is not the only intelligent system where you
> > should not waste your time trying to understand what it has learned.
> > Even if you understood it, it would not tell you anything.  Would you
> > understand why a person made some decision if you knew the complete
> > state of every neuron and synapse in his brain?
> >
> >
> >> You developed a pattern-matcher.  The pattern matcher worked (and I would
> >> dispute that it worked better "than it had a right to").  Clearly, you do
> >> not understand how it worked.  So what does that prove?
> >>
> >> Your contention (or, at least, the only one that continues the previous
> >> thread) seems to be that you are too stupid to ever understand the
> >> pattern that it found.
> >>
> >> Let me offer you several alternatives:
> >> 1)  You missed something obvious
> >> 2)  You would have understood it if the system could have explained it
> >> to you
> >> 3)  You would have understood it if the system had managed to losslessly
> >> convert it into a more compact (and comprehensible) format
> >> 4)  You would have understood it if the system had managed to losslessly
> >> convert it into a more compact (and comprehensible) format and explained
> >> it to you
> >> 5)  You would have understood it if the system had managed to lossily
> >> convert it into a more compact (and comprehensible -- and probably even
> >> more correct) format
> >> 6)  You would have understood it if the system had managed to lossily
> >> convert it into a more compact (and comprehensible -- and probably even
> >> more correct) format and explained it to you
> >>
> >> My contention is that the pattern that it found was simply not translated
> >> into terms you could understand and/or explained.
> >>
> >> Further, and more importantly, the pattern matcher *doesn't* understand
> >> its results either and certainly couldn't build upon them -- thus, it
> >> *fails* the test as far as being the central component of an RSIAI or
> >> being able to provide evidence as to the required behavior of such.
> >>
> >> ----- Original Message ----- 
> >> From: "Philip Goetz" <[EMAIL PROTECTED]>
> >> To: <agi@v2.listbox.com>
> >> Sent: Friday, December 01, 2006 7:02 PM
> >> Subject: Re: [agi] A question on the symbol-system hypothesis
> >>
> >>
> >> > On 11/30/06, Mark Waser <[EMAIL PROTECTED]> wrote:
> >> >>     With many SVD systems, however, the representation is more
> >> >> vector-like and *not* conducive to easy translation to human terms.
> >> >> I have two answers to these cases.  Answer 1 is that it is still easy
> >> >> for a human to look at the closest matches to a particular word pair
> >> >> and figure out what they have in common.
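
(For context on the technique being described -- a minimal sketch of
inspecting an SVD-style vector representation by listing nearest
neighbours.  The corpus is a placeholder, it uses a single word rather
than a word pair, and it assumes numpy and scikit-learn are available.)

import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer

# Tiny stand-in corpus; a real system would use far more text.
docs = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "stocks fell on the market today",
    "the market rallied as stocks rose",
]

# Term-document matrix, reduced by SVD to a 2-dimensional "concept" space.
vec = CountVectorizer()
X = vec.fit_transform(docs)                 # documents x terms
svd = TruncatedSVD(n_components=2, random_state=0)
term_vectors = svd.fit_transform(X.T)       # terms x 2
terms = list(vec.get_feature_names_out())

def nearest(word, k=4):
    """Cosine-nearest terms; the word itself comes back first."""
    v = term_vectors[terms.index(word)]
    sims = term_vectors @ v / (
        np.linalg.norm(term_vectors, axis=1) * np.linalg.norm(v) + 1e-9)
    return [terms[i] for i in np.argsort(-sims)[:k]]

# The 2-d vector for "stocks" is opaque, but its nearest neighbours are
# something a human can read and interpret.
print(nearest("stocks"))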
> >> >
> >> > I developed an intrusion-detection system for detecting brand new
> >> > attacks on computer systems.  It takes TCP connections, and produces
> >> > 100-500 statistics on each connection.  It takes thousands of
> >> > connections, and runs these statistics thru PCA to come up with 5
> >> > dimensions.  Then it clusters each connection, and comes up with 1-3
> >> > clusters per port that have a lot of connections and are declared to
> >> > be "normal" traffic.  Those connections that lie far from any of those
> >> > clusters are identified as possible intrusions.
> >> >
> >> > The system worked much better than I expected it to, or than it had a
> >> > right to.  I went back and, by hand, tried to figure out how it was
> >> > classifying attacks.  In most cases, my conclusion was that there was
> >> > *no information available* to tell whether a connection was an attack,
> >> > because the only information to tell that a connection was an attack
> >> > was in the TCP packet contents, while my system looked only at packet
> >> > headers.  And yet, the system succeeded in placing about 50% of all
> >> > attacks in the top 1% of suspicious connections.  To this day, I don't
> >> > know how it did it.
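
(A rough sketch of the shape of the pipeline Philip describes -- not his
code.  Header-derived statistics per connection go through PCA down to a
few dimensions, the connections are clustered, and distance from the
nearest "normal" cluster becomes the suspicion score.  The random
stand-in data, the omitted per-port grouping, and the 1% cutoff are
placeholders, assuming numpy and scikit-learn.)

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# One row per TCP connection, one column per header-derived statistic
# (durations, packet counts, flag ratios, ...).  Random stand-in data.
rng = np.random.default_rng(0)
stats = rng.normal(size=(5000, 200))

# Normalize, then reduce the statistics to 5 principal components.
X5 = PCA(n_components=5).fit_transform(StandardScaler().fit_transform(stats))

# Cluster the reduced connections; dense clusters are treated as "normal".
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X5)

# Suspicion score: distance to the nearest cluster centre.  Connections
# far from every normal cluster are flagged as possible intrusions.
dists = np.min(
    np.linalg.norm(X5[:, None, :] - km.cluster_centers_[None, :, :], axis=2),
    axis=1)
suspects = np.argsort(-dists)[: len(dists) // 100]   # top 1% most unusual
print(suspects[:10])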
> >> >
> >>
> >
> >
> > -- Matt Mahoney, [EMAIL PROTECTED]
> >
> > 
> 
> 


-- Matt Mahoney, [EMAIL PROTECTED]

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=303
