Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-05 Thread Matt Mahoney

--- Ed Porter <[EMAIL PROTECTED]> wrote:

> Matt,
> 
> Perhaps you are right.
> 
> But one problem is that big Google-like compuplexes in the next five to ten
> years will be powerful enough to do AGI and they will be much more efficient
> for AGI search because the physical closeness of their machines will make it
> possible for them to perform the massive interconnection needed for powerful
> AGI much more efficiently.

Google controls about 0.1% of the world's computing power.  But I think their
ability to achieve AGI first will owe less to the high bandwidth of their CPU
cluster than to the fact that nobody controls the other 99.9%.

Centralized search tends to produce monopolies as the cost of entry goes up. 
It is not so bad now because Google still has a (dwindling) set of
competitors.  They can't yet hide content that threatens them.

Distributed search like Wikia/Atlas/Grub is interesting, but if people don't
see a compelling need for it, it won't happen.  How big will it have to get
before it is better than Google?  File sharing networks would probably be a
lot bigger and more useful (with mostly legitimate content) if we could solve
the distributed search problem.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=72969535-74e4ee


RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-05 Thread Ed Porter
I have a lot of respect for Google, but I don't like monopolies, whether it
is Microsoft or Google.  I think it is vitally important that there be
several viable search competitors.

I wish this wiki one luck.  As I said, it sounds a lot like your idea.

Ed Porter

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 05, 2007 9:24 PM
To: agi@v2.listbox.com
Subject: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])


RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Matt Mahoney

--- Ed Porter <[EMAIL PROTECTED]> wrote:

> I have a lot of respect for Google, but I don't like monopolies, whether it
> is Microsoft or Google.  I think it is vitally important that there be
> several viable search competitors.
> 
> I wish this wiki one luck.  As I said, it sounds a lot like your idea.

Partly.  The main difference is that I am also proposing a message posting
service, where messages become instantly searchable and are also directed to
persistent queries.
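
As a rough illustration of "instantly searchable" messages that are "also
directed to persistent queries", here is a minimal single-peer sketch.  It
assumes plain keyword matching, which the thread does not specify, and every
name in it (MessageBoard, subscribe, post, search) is hypothetical rather than
part of any proposed protocol.

```python
from collections import defaultdict


class MessageBoard:
    """Toy single-peer store: posts are searchable at once and are also
    pushed to any standing (persistent) query they match."""

    def __init__(self):
        self.index = defaultdict(set)  # word -> ids of messages containing it
        self.messages = []             # message id -> text
        self.persistent = []           # (query words, subscriber inbox)

    @staticmethod
    def _words(text):
        # Assumption: a query matches when all of its words occur in the text.
        return set(text.lower().split())

    def subscribe(self, query):
        """Register a persistent query; returns the inbox that will fill up."""
        inbox = []
        self.persistent.append((self._words(query), inbox))
        return inbox

    def post(self, text):
        """Index the message immediately and deliver it to matching queries."""
        msg_id = len(self.messages)
        self.messages.append(text)
        words = self._words(text)
        for word in words:
            self.index[word].add(msg_id)
        for query_words, inbox in self.persistent:
            if query_words <= words:
                inbox.append(text)
        return msg_id

    def search(self, query):
        """One-off query over everything posted so far, oldest first."""
        query_words = self._words(query)
        if not query_words:
            return []
        ids = set.intersection(
            *(self.index.get(word, set()) for word in query_words)
        )
        return [self.messages[i] for i in sorted(ids)]
```

A subscriber calls subscribe() once and later reads its inbox, while search()
serves ordinary one-off queries against the same index.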

Wikia has a big hurdle to get over.  People will ask "how is this better than
Google?" before they bother to download the software.  For example, Grub
(distributed spider) uses a lot of bandwidth and disk without providing much
direct benefit to the user.  The major benefit of Wikia seems to be that users
provide feedback on relevance to query responses, which in theory ought to
provide a better ranking algorithm than something like Google's PageRank.  But
assuming they get enough users to get to this level, spammers could still game
the system by flooding the network with high rankings for their websites.

In a distributed message posting service, each peer would have its own policy
regarding which messages to relay, keep in its cache, or ignore.  If a
document is valuable, then lots of peers would keep a copy.  A client could
then rank query responses by the number of copies received weighted by the
peer's reputation.  Spammers could try to game the system by adding lots of
peers and flooding the network with advertising, but this would fail because
most other peers would be configured to ignore peers that don't provide
reciprocal services by routing their own outgoing messages.  Any peer not so
configured would quickly be abused and isolated from the network in the same
way that open relay SMTP servers get abused by spammers and blacklisted by
spam filters.
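
The ranking and reciprocity rules described above can be sketched as follows.
This is only an illustration under stated assumptions: reputation is taken to
be a non-negative number per peer, unknown peers default to zero weight, and
the function names (rank_responses, should_relay) are hypothetical.

```python
from collections import defaultdict


def rank_responses(responses, reputation, default_reputation=0.0):
    """Rank documents by the number of copies received, each copy weighted
    by the reputation of the peer that returned it.

    responses:  iterable of (peer_id, document_id) pairs, one per copy
    reputation: dict mapping peer_id -> non-negative weight (an assumption)
    """
    scores = defaultdict(float)
    seen = set()
    for peer, doc in responses:
        if (peer, doc) in seen:  # count each peer's copy only once
            continue
        seen.add((peer, doc))
        scores[doc] += reputation.get(peer, default_reputation)
    # Highest score first; ties are broken by document id for stable output.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


def should_relay(peer_id, relayed_for_us, min_relayed=1):
    """Reciprocity policy: relay only for peers that have routed at least
    min_relayed of our own outgoing messages."""
    return relayed_for_us.get(peer_id, 0) >= min_relayed
```

With these defaults, a document copied by many unknown peers still scores
zero, which is the intended effect against flooding by newly added peers.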

Of course a message posting service would have a big hurdle too.  Initially,
the service would have to be well integrated with the existing Internet. 
Client queries would have to go to the major search engines, and there would
have to be websites set up as peers without the user having to install
software.  Most computers are not configured to run as servers (dynamic IP,
behind firewalls, slow upload, etc.), so peers will probably need to allow
message passing over client HTTP (website polling), by email, and over instant
messaging protocols.

File sharing networks became popular because they offered a service not
available elsewhere (free music).  But I don't intend for the message posting
service to be used to evade copyright or censorship (although it probably
could be).  The protocol requires that the message's originator and
intermediate routers all be identified by a reply address and time stamp.  It
won't work otherwise.


-- Matt Mahoney, [EMAIL PROTECTED]



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Ed Porter
Matt,

Does a PC become more vulnerable to viruses, worms, Trojan horses, rootkits,
and other web attacks if it becomes part of a P2P network? And if so, why and
how much?

Ed Porter

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 3:01 PM
To: agi@v2.listbox.com
Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Matt Mahoney
--- Ed Porter <[EMAIL PROTECTED]> wrote:

> Matt,
> 
> Does a PC become more vulnerable to viruses, worms, Trojan horses, rootkits,
> and other web attacks if it becomes part of a P2P network? And if so, why and
> how much?

It does if the P2P software has vulnerabilities, just like any other server or
client.  Worms would be especially dangerous because they could spread quickly
without user intervention, but slowly spreading viruses that are well hidden
can be dangerous too.  There is no foolproof defense, but it helps to keep the
protocol and software as simple as possible, to run the P2P software as a
nonprivileged process, to use open source code, and not to depend to any large
extent on a single source of software.

The protocol I have in mind is that a message contains searchable natural
language text, possibly some nonsearchable attached files, and a header with
the reply address and timestamp of the originator and any intermediate peers
through which the message was routed.  The protocol is not dangerous except
for the attached files, but these have to be included because they are a useful
service.  If you don't include them, people will figure out how to embed
arbitrary data in the message text, which would make the protocol more
dangerous because it wasn't planned for.
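
To make the message layout concrete, here is a minimal sketch of one possible
in-memory representation.  The field names, the use of seconds since the epoch
for timestamps, and the helper functions are all assumptions for illustration;
the thread defines no wire format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Hop:
    """One header entry: who handled the message and when."""
    reply_address: str
    timestamp: float  # seconds since the epoch (the unit is an assumption)


@dataclass
class Message:
    text: str                    # searchable natural language text
    route: List[Hop]             # originator first, then intermediate peers
    attachments: List[bytes] = field(default_factory=list)  # not searchable

    @property
    def originator(self) -> Hop:
        return self.route[0]


def new_message(text, reply_address, timestamp, attachments=None):
    """Create a message whose header holds only the originator."""
    return Message(text, [Hop(reply_address, timestamp)], list(attachments or []))


def relay(msg, reply_address, timestamp):
    """Return a copy of msg with the relaying peer's hop appended."""
    return Message(
        msg.text,
        msg.route + [Hop(reply_address, timestamp)],
        list(msg.attachments),
    )


def header_is_well_formed(msg):
    """Reject headers with no originator, a blank reply address, or
    timestamps that run backwards along the route."""
    if not msg.route:
        return False
    if any(not hop.reply_address for hop in msg.route):
        return False
    times = [hop.timestamp for hop in msg.route]
    return all(earlier <= later for earlier, later in zip(times, times[1:]))
```

A check like header_is_well_formed() only catches malformed headers.  It does
not detect a header that is consistent but forged.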

In theory, you could use the P2P network to spread information about malicious
peers and deliver software patches.  But I think this would introduce more
problems than it solves because it would also introduce a mechanism for
spreading false information and patches containing trojans.  Each peer should
have defenses that operate independently of the network, including
disconnecting itself if it detects anomalies in its own behavior.

Of course the network is vulnerable even if the peers behave properly. 
Malicious peers could forge headers, for example, to hide the true source of
messages or to force replies to be directed to unintended targets.  Some
attacks could be very complex depending on the idiosyncratic behavior of
particular peers.



-- Matt Mahoney, [EMAIL PROTECTED]



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Ed Porter
Matt,  
So if it is perceived as something that increases a machine's vulnerability,
it seems to me that would be one more reason for people to avoid using it.
Ed Porter

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 4:06 PM
To: agi@v2.listbox.com
Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])


Re: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread William Pearson
On 06/12/2007, Ed Porter <[EMAIL PROTECTED]> wrote:
> Matt,
> So if it is perceived as something that increases a machine's vulnerability,
> it seems to me that would be one more reason for people to avoid using it.
> Ed Porter


Why are you having this discussion on an AGI list?

  Will Pearson



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Ed Porter
It was part of a discussion of using a P2P network with OpenCog to develop
distributed AGIs.

-Original Message-
From: William Pearson [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 5:20 PM
To: agi@v2.listbox.com
Subject: Re: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])


Re: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Matt Mahoney

--- William Pearson <[EMAIL PROTECTED]> wrote:

> On 06/12/2007, Ed Porter <[EMAIL PROTECTED]> wrote:
> > Matt,
> > So if it is perceived as something that increases a machine's vulnerability,
> > it seems to me that would be one more reason for people to avoid using it.
> > Ed Porter
> 
> 
> Why are you having this discussion on an AGI list?

Because this is an AGI design.  The intelligence comes from having a lot of
specialized experts on narrow topics and a distributed infrastructure that
directs your queries to the right experts.  The P2P protocol is natural
language text.  I will write up the proposal so it will make more sense than
the current collection of posts.
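
As a toy illustration of directing a query to the right expert, the sketch
below scores each expert by word overlap between the query and a short topic
description.  The overlap measure, the route_query name, and the example
experts in the usage note are assumptions made here, not part of the proposal.

```python
def route_query(query, experts, top_n=1):
    """Pick the experts whose topic description shares the most words
    with the query.

    experts: dict mapping expert name -> natural language topic description
    Returns up to top_n names, best match first; experts with no word
    overlap are never returned.
    """
    query_words = set(query.lower().split())
    scored = []
    for name, description in experts.items():
        overlap = len(query_words & set(description.lower().split()))
        if overlap:
            scored.append((-overlap, name))  # negate so sorting puts best first
    return [name for _, name in sorted(scored)[:top_n]]
```

For example, given experts described as "weather forecast rain temperature"
and "recipes cooking baking bread", the query "will it rain tomorrow" is
routed to the first one, and a query matching neither is routed nowhere.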


-- Matt Mahoney, [EMAIL PROTECTED]



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Ed Porter
Are you saying the increase in vulnerability would be no more than that?

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 6:17 PM
To: agi@v2.listbox.com
Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])


--- Ed Porter <[EMAIL PROTECTED]> wrote:

> Matt,  
> So if it is perceived as something that increases a machine's vulnerability,
> it seems to me that would be one more reason for people to avoid using it.
> Ed Porter

A web browser and email increase your computer's vulnerability, but that
doesn't stop people from using them.


RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Matt Mahoney

--- Ed Porter <[EMAIL PROTECTED]> wrote:

> Matt,  
> So if it is perceived as something that increases a machine's vulnerability,
> it seems to me that would be one more reason for people to avoid using it.
> Ed Porter

A web browser and email increase your computer's vulnerability, but that
doesn't stop people from using them.


-- Matt Mahoney, [EMAIL PROTECTED]



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Matt Mahoney

--- Ed Porter <[EMAIL PROTECTED]> wrote:

> Are you saying the increase in vulnerability would be no more than that?

Yes, at least short term if we are careful with the design.  But then again,
you can't predict what AGI will do, or else it wouldn't be intelligent.  I
can't say for certain that in the long term (2040s?) it wouldn't launch a
singularity, or even that it wouldn't create an intelligent worm that would eat
the Internet.  I don't think anyone is smart enough to get it right, but it is
going to happen in one form or another.

I wrote up a quick description of my AGI proposal at
http://www.mattmahoney.net/agi.html
basically summarizing what I posted over the last several emails, including
various attack scenarios.  I'm sure I didn't think of everything.  It is kind
of sketchy because it's not an area I am actively pursuing.  It should be a
useful service at least in the short term before it destroys us.



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Jean-Paul Van Belle
Hi Matt

You call it an AGI proposal, but it is described as a distributed search
algorithm that (merely) appears intelligent, i.e. a "design for an Internet-wide
message posting and search service". There doesn't appear to be any grounding
or semantic interpretation by the AI system. How will it become more
intelligent?

=Jean-Paul
>>> On 2007/12/07 at 06:41, in message
<[EMAIL PROTECTED]>, Matt Mahoney
<[EMAIL PROTECTED]> wrote:
> I wrote up a quick description of my AGI proposal at
> http://www.mattmahoney.net/agi.html 
> basically summarizing what I posted over the last several emails, including
> various attack scenarios.  I'm sure I didn't think of everything.  It is kind
> of sketchy because it's not an area I am actively pursuing.  It should be a
> useful service at least in the short term before it destroys us.

-- 

Research Associate: CITANDA
Post-Graduate Section Head 
Department of Information Systems
Phone: (+27)-(0)21-6504256
Fax: (+27)-(0)21-6502280
Office: Leslie Commerce 4.21



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-07 Thread Ed Porter
Thanks Matt!


-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 11:42 PM
To: agi@v2.listbox.com
Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])


--- Ed Porter <[EMAIL PROTECTED]> wrote:

> Are you saying the increase in vulnerability would be no more than that?

Yes, at least short term if we are careful with the design.  But then again,
you can't predict what AGI will do, or else it wouldn't be intelligent.  I
can't say for certain long term (2040s?) it wouldn't launch a singularity, or
even that it wouldn't create an intelligent worm that would eat the Internet.
I don't think anyone is smart enough to get it right, but it is going to
happen in one form or another.

I wrote up a quick description of my AGI proposal at
http://www.mattmahoney.net/agi.html
basically summarizing what I posted over the last several emails, including
various attack scenarios.  I'm sure I didn't think of everything.  It is kind
of sketchy because it's not an area I am actively pursuing.  It should be a
useful service at least in the short term before it destroys us.


> 
> -Original Message-
> From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, December 06, 2007 6:17 PM
> To: agi@v2.listbox.com
> Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
> [agi] Funding AGI research])
> 
> 
> --- Ed Porter <[EMAIL PROTECTED]> wrote:
> 
> > Matt,  
> > So if it is perceived as something that increases a machine's vulnerability,
> > it seems to me that would be one more reason for people to avoid using it.
> > Ed Porter
> 
> A web browser and email increase your computer's vulnerability, but that
> doesn't stop people from using them.
> 
> > 
> > -Original Message-
> > From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
> > Sent: Thursday, December 06, 2007 4:06 PM
> > To: agi@v2.listbox.com
> > Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
> > [agi] Funding AGI research])
> > 
> > --- Ed Porter <[EMAIL PROTECTED]> wrote:
> > 
> > > Matt,
> > > 
> > > Does a PC become more vulnerable to viruses, worms, Trojan horses, root
> > > kits, and other web attacks if it becomes part of a P2P network? And if
> > > so, why and how much?
> > 
> > It does if the P2P software has vulnerabilities, just like any other server
> > or client.  Worms would be especially dangerous because they could spread
> > quickly without user intervention, but slowly spreading viruses that are
> > well hidden can be dangerous too.  There is no foolproof defense, but it
> > helps to keep the protocol and software as simple as possible, to run the
> > P2P software as a nonprivileged process, use open source code, and not to
> > depend to any large extent on a single source of software.
> > 
> > The protocol I have in mind is that a message contain searchable natural
> > language text, possibly some nonsearchable attached files, and a header with
> > the reply address and timestamp of the originator and any intermediate peers
> > through which the message was routed.  The protocol is not dangerous except
> > for the attached files, but these have to be included because it is a useful
> > service.  If you don't include it, people will figure out how to embed
> > arbitrary data in the message text, which would make the protocol more
> > dangerous because it wasn't planned for.
> > 
> > In theory, you could use the P2P network to spread information about
> > malicious peers and deliver software patches.  But I think this would
> > introduce more problems than it solves because it would also introduce a
> > mechanism for spreading false information and patches containing trojans.
> > Each peer should have defenses that operate independently of the network,
> > including disconnecting itself if it detects anomalies in its own behavior.
> > 
> > Of course the network is vulnerable even if the peers behave properly.
> > Malicious peers could forge headers, for example, to hide the true source of
> > messages or to force replies to be directed to unintended targets.  Some
> > attacks could be very complex depending on the idiosyncratic behavior of
> > particular peers.
> > 
> > 
> > 
> > -- Matt Mahoney, [EMAIL PROTECTED]
> > 
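
The message format described in the quoted protocol paragraph above (searchable
text, nonsearchable attachments, and a header listing the reply address and
timestamp of the originator and each intermediate peer) might be sketched as
follows. This is only an illustration: the names Hop, Message, forward and
matches are hypothetical, and the word-match search is a stand-in for whatever
matching a real peer would use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hop:
    reply_address: str  # where replies to this peer should be sent
    timestamp: float    # when this peer handled the message (seconds since epoch)


@dataclass
class Message:
    text: str                                                # searchable natural language text
    attachments: List[bytes] = field(default_factory=list)   # nonsearchable files
    route: List[Hop] = field(default_factory=list)           # originator first, then intermediate peers

    def forward(self, reply_address: str, timestamp: float) -> "Message":
        """Return a copy with the forwarding peer's hop appended to the header."""
        new_route = self.route + [Hop(reply_address, timestamp)]
        return Message(self.text, list(self.attachments), new_route)

    def originator(self) -> Hop:
        """The first hop in the header is the peer that posted the message."""
        return self.route[0]

    def matches(self, query: str) -> bool:
        """Search only the text; attachments are never inspected."""
        lowered = self.text.lower()
        return all(word in lowered for word in query.lower().split())


msg = Message("how do I tune a guitar", [b"\x00\x01"], [Hop("peer-a", 1.0)])
fwd = msg.forward("peer-b", 2.0)
print([hop.reply_address for hop in fwd.route])
print(fwd.matches("guitar tune"))
```

Nothing in this sketch authenticates a hop, so any peer can write whatever it
likes into the route. That is the forged-header weakness mentioned in the
quoted text.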

RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-07 Thread Jean-Paul Van Belle
Hi Matt, Wonderful idea, now it will even show the typical human trait of
lying... When I ask it "do you still love me?" most answers in its database will
have Yes as an answer, but when I ask it "what's my name?" it'll call me John?

However, your approach is actually already being implemented to a certain
extent. Apparently (was it Newsweek, Time?) the No. 1 search engine in
(Singapore? Hong Kong? Taiwan? Sorry, I forgot) is *not* Google but a local
language Q&A system that works very much the way you envisage it (except that it
collects the answers in its own SAN, i.e. not distributed over the user machines).

=Jean-Paul
>>> On 2007/12/07 at 18:58, in message
<[EMAIL PROTECTED]>, Matt Mahoney
<[EMAIL PROTECTED]> wrote:
> > Hi Matt
> > 
> > You call it an AGI proposal but it is described as a distributed search
> > algorithm that (merely) appears intelligent i.e. "design for an
> > Internet-wide message posting and search service". There doesn't appear to
> > be any grounding or semantic interpretation by the AI system? How will it
> > become more intelligent?
> 
> Turing was careful to make no distinction between "being intelligent" and
> "appearing intelligent".  The requirement for passing the Turing test is to be
> able to compute a probability distribution P over text strings that varies
> from the true distribution no more than it varies between different people.
> Once you can do this, then given a question Q, you can compute answer A that
> maximizes P(A|Q) = P(QA)/P(Q).
> 
> This does not require grounding.  The way my system appears intelligent is by
> directing Q to the right experts, and by being big enough to have experts on
> nearly every conceivable topic of interest to humans.
> 
> A lot of AGI research seems to be focused on how to represent knowledge and
> thought efficiently on a (much too small) computer, rather than on what
> services the AGI should provide for us.

-- 

Research Associate: CITANDA
Post-Graduate Section Head 
Department of Information Systems
Phone: (+27)-(0)21-6504256
Fax: (+27)-(0)21-6502280
Office: Leslie Commerce 4.21



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-07 Thread Matt Mahoney
--- Jean-Paul Van Belle <[EMAIL PROTECTED]> wrote:

> Hi Matt
> 
> You call it an AGI proposal but it is described as a distributed search
> algorithm that (merely) appears intelligent i.e. "design for an
> Internet-wide message posting and search service". There doesn't appear to
> be any grounding or semantic interpretation by the AI system? How will it
> become more intelligent?

Turing was careful to make no distinction between "being intelligent" and
"appearing intelligent".  The requirement for passing the Turing test is to be
able to compute a probability distribution P over text strings that varies
from the true distribution no more than it varies between different people. 
Once you can do this, then given a question Q, you can compute answer A that
maximizes P(A|Q) = P(QA)/P(Q).
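
A toy illustration of that last step, assuming P is estimated from counts of
(question, answer) pairs. The table of counts and the function name
best_answer are made up for illustration; a real P would range over all text
strings, not a handful of pairs.

```python
from collections import Counter


def best_answer(pair_counts, question):
    """Return (answer, P(A|Q)) for the answer maximizing P(A|Q) = P(QA) / P(Q)."""
    total = sum(pair_counts.values())
    # P(Q): probability mass of every pair that starts with this question.
    p_q = sum(c for (q, _), c in pair_counts.items() if q == question) / total
    if p_q == 0:
        return None, 0.0  # question never seen, nothing to maximize over
    best, best_p = None, 0.0
    for (q, a), c in pair_counts.items():
        if q != question:
            continue
        p_a_given_q = (c / total) / p_q  # P(QA) / P(Q)
        if p_a_given_q > best_p:
            best, best_p = a, p_a_given_q
    return best, best_p


# Hypothetical counts standing in for the distribution P.
counts = Counter({
    ("capital of france?", "paris"): 8,
    ("capital of france?", "lyon"): 2,
    ("2+2?", "4"): 10,
})
print(best_answer(counts, "capital of france?"))
```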

This does not require grounding.  The way my system appears intelligent is by
directing Q to the right experts, and by being big enough to have experts on
nearly every conceivable topic of interest to humans.
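
As a rough sketch of what "directing Q to the right experts" could mean in the
simplest case, assume each peer is scored by how many words of the question
appear in the text it stores. The scoring rule, the function name route and
the peer data are all assumptions for illustration, not the matching rule of
the proposal.

```python
def route(question, peers, k=2):
    """Return up to k peer names whose stored text shares the most words with the question."""
    q_words = set(question.lower().split())
    scored = []
    for name, texts in peers.items():
        vocab = {word for text in texts for word in text.lower().split()}
        scored.append((len(q_words & vocab), name))
    # Highest overlap first; ties broken alphabetically so the result is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:k] if score > 0]


# Hypothetical peers and the messages each one keeps.
peers = {
    "alice": ["tuning a guitar by ear", "guitar string gauges"],
    "bob": ["baking sourdough bread"],
    "carol": ["guitar amplifier repair"],
}
print(route("how to restring a guitar", peers))
```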

A lot of AGI research seems to be focused on how to represent knowledge and
thought efficiently on a (much too small) computer, rather than on what
services the AGI should provide for us.


> 
> =Jean-Paul
> >>> On 2007/12/07 at 06:41, in message
> <[EMAIL PROTECTED]>, Matt Mahoney
> <[EMAIL PROTECTED]> wrote:
> > I wrote up a quick description of my AGI proposal at
> > http://www.mattmahoney.net/agi.html
> > basically summarizing what I posted over the last several emails, including
> > various attack scenarios.  I'm sure I didn't think of everything.  It is kind
> > of sketchy because it's not an area I am actively pursuing.  It should be a
> > useful service at least in the short term before it destroys us.
> 
> -- 
> 
> Research Associate: CITANDA
> Post-Graduate Section Head 
> Department of Information Systems
> Phone: (+27)-(0)21-6504256
> Fax: (+27)-(0)21-6502280
> Office: Leslie Commerce 4.21
> 


-- Matt Mahoney, [EMAIL PROTECTED]



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-11 Thread Matt Mahoney
--- Jean-Paul Van Belle <[EMAIL PROTECTED]> wrote:

> Hi Matt, Wonderful idea, now it will even show the typical human trait of
> lying... When I ask it "do you still love me?" most answers in its database
> will have Yes as an answer, but when I ask it "what's my name?" it'll call
> me John?

My proposed message posting service allows anyone to contribute to its
knowledge base, just like Wikipedia, so it could certainly contain some false
or useless information.  However, the number of peers that keep a copy of a
message will depend on the number of peers that accept it according to the
peers' policies, which are set individually by their owners.  The network
provides an incentive for peers to produce useful information so that other
peers will accept it.  Thus, useful and truthful information is more likely to
be propagated.
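
To make the acceptance idea concrete, here is a small sketch in which each
peer's policy is just a function set by its owner, and the number of copies of
a message is the number of policies that accept it. The three policies and
the function name replicate are invented examples; real policies could be
anything the owners choose.

```python
def replicate(message, policies):
    """Return the names of the peers whose owner-set policy accepts the message."""
    return [name for name, accepts in policies.items() if accepts(message)]


# Hypothetical per-peer policies, each chosen independently by the peer's owner.
policies = {
    "strict": lambda m: "source:" in m and len(m) < 200,
    "lenient": lambda m: len(m) > 0,
    "no_spam": lambda m: "buy now" not in m.lower(),
}

useful = "Austin is the capital of Texas. source: state website"
spam = "BUY NOW cheap pills"

print(len(replicate(useful, policies)), "copies of the useful message")
print(len(replicate(spam, policies)), "copies of the spam message")
```

In this toy run the useful message is kept by all three peers and the spam by
only one, which is the sense in which widely acceptable content propagates
further.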


-- Matt Mahoney, [EMAIL PROTECTED]



RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-12 Thread James Ratcliff
I had been thinking about something along these lines, though not worded as you 
have in this message yet.

What I would be most interested in at this point is a knowledge gathering
system somewhere along these lines, where the main AGI could be
centralized/clustered or distributed, but where questions and information would
be posed to the Bot on each person's node and collected together.
The system would remember any facts and domains that a person has contributed, so
any future unique questions could be posed to the knowledgeable expert users.
  This would allow a large amount of knowledge to be extracted in a distributed
manner, keeping track of the quality of information gathered from each person
as a trust metric, and many facts would be gathered and checked for truth.

  Mainly the system should have an ability to ACTIVELY go out in search of the
answer, by chatting with known users to find answers and resolve any conflicting
results.

For instance, it would randomly ask me "Who is the highest paid baseball
player?"
and I would pass on that question. The system would then put a lower score on any
further baseball questions sent towards me, but based on my answering of other
questions about computers and about Austin, TX, it would be more likely to ask me
questions about those.
And only I and a couple of other people here would get the questions about
Austin, TX.
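
A minimal sketch of that scoring, assuming each user has a per-topic score that
moves toward 1 when they answer and toward 0 when they pass. The class name
TopicScores, the starting score of 0.5 and the step size of 0.2 are arbitrary
choices for illustration, and the quality-of-answer trust metric is left out.

```python
from collections import defaultdict


class TopicScores:
    """Per-user, per-topic scores in [0, 1]; higher means more likely to be asked."""

    def __init__(self, start=0.5, step=0.2):
        self.start = start               # score assumed for an unseen (user, topic) pair
        self.step = step                 # how far one answer or pass moves the score
        self.scores = defaultdict(dict)  # user -> {topic: score}

    def get(self, user, topic):
        return self.scores[user].get(topic, self.start)

    def record(self, user, topic, answered):
        """Move the score toward 1 after an answer, toward 0 after a pass."""
        target = 1.0 if answered else 0.0
        old = self.get(user, topic)
        self.scores[user][topic] = old + self.step * (target - old)

    def pick(self, topic, users, k=2):
        """Return the k users most likely to answer a question on this topic."""
        return sorted(users, key=lambda u: (-self.get(u, topic), u))[:k]


scores = TopicScores()
scores.record("james", "baseball", answered=False)  # passed on the baseball question
scores.record("james", "austin", answered=True)
scores.record("james", "computers", answered=True)

users = ["james", "ann", "raj"]
print(scores.pick("baseball", users))
print(scores.pick("austin", users))
```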

Something along the lines of a higher quality Yahoo Questions, with an active
component, and central knowledge base.
I think the knowledge base is one of the most important pieces of these, and
hope to start seeing some more of people's ideas and implementations of KR DBs.

James Ratcliff

Matt Mahoney <[EMAIL PROTECTED]> wrote:

> --- Jean-Paul Van Belle wrote:
> 
> > Hi Matt, Wonderful idea, now it will even show the typical human trait of
> > lying... When I ask it "do you still love me?" most answers in its database
> > will have Yes as an answer, but when I ask it "what's my name?" it'll call
> > me John?
> 
> My proposed message posting service allows anyone to contribute to its
> knowledge base, just like Wikipedia, so it could certainly contain some false
> or useless information.  However, the number of peers that keep a copy of a
> message will depend on the number of peers that accept it according to the
> peers' policies, which are set individually by their owners.  The network
> provides an incentive for peers to produce useful information so that other
> peers will accept it.  Thus, useful and truthful information is more likely to
> be propagated.
> 
> 
> -- Matt Mahoney, [EMAIL PROTECTED]




___
James Ratcliff - http://falazar.com
Looking for something...
   


Re: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-12 Thread Mike Dougherty
On 12/12/07, James Ratcliff <[EMAIL PROTECTED]> wrote:
>   This would allow a large amount of knowledge to be extracted in a
> distributed manner, keeping track of the quality of information gathered
> from each person as a trust metric, and many facts would be gathered and
> checked for truth.

> Something along the lines of a higher quality Yahoo Questions, with an
> active component, and central knowledge base.
> I think the knowledge base is one of the most important pieces of these, and
> hope to start seeing some more of people's ideas and implementations of KR DBs.

I believe where you said "central knowledge base" you mean
"distributed KB" - right?  The idea of keeping a local KB at each node
spreads the burden of storage/bandwidth across every node in the network.
Your trust metrics are how nodes conditionally connect for per-topic
fact-checking.

I have already volunteered my free CPU/bandwidth to a prototype of
this model.  Of course, I'd like to collaborate on the mechanisms
involved in addition to being a user of the grid.  Even if it starts out
as only a toy or hobby, it would still teach us a great deal.
