What would be the simplest system capable of recursive self improvement, not
necessarily with human level intelligence?  What are the time and memory
costs?  What would be its algorithmic complexity?

One could imagine environments that simplify the problem, e.g. "Core Wars" as
a competitive evolutionary algorithm where the objective function of a species
is to reproduce and acquire computing resources as fast as possible.  We could
imagine two different strategies.  One strategy is to make "intelligent"
changes, much as a programmer might, rewriting a line of code in its own copy.
Another would be to make simple changes like random bit flips, most of which
would be ineffective, but to compensate with a higher reproduction rate.  There
is an analogy in biology: slow reproduction in humans vs. fast reproduction in
insects and bacteria.  Both strategies are effective.
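
As a toy illustration of the two strategies (not Core Wars itself), the sketch
below treats a program as a 64-bit string whose fitness is simply its count of
1 bits.  The "smart" mutator always makes a useful edit, while the "dumb" one
flips random bits but produces many more offspring per generation.  The fitness
function and all parameters are made up for illustration.

  # Toy comparison of "intelligent edit" vs. "random bit flip" strategies.
  # Fitness is just the number of 1 bits; all parameters are arbitrary.
  import random

  GENOME_BITS = 64

  def fitness(genome):
      return bin(genome).count("1")

  def random_flip(genome):
      # Cheap mutation: flip one random bit; usually neutral or harmful here.
      return genome ^ (1 << random.randrange(GENOME_BITS))

  def smart_edit(genome):
      # "Intelligent" mutation: set the lowest 0 bit (always an improvement),
      # standing in for a programmer-like targeted change.
      for i in range(GENOME_BITS):
          if not genome & (1 << i):
              return genome | (1 << i)
      return genome

  def evolve(mutate, offspring_per_step, steps=100, pop_limit=256):
      pop = [0]  # start from the all-zero genome
      for _ in range(steps):
          children = [mutate(g) for g in pop for _ in range(offspring_per_step)]
          # Resource limit: only the fittest genomes survive.
          pop = sorted(pop + children, key=fitness, reverse=True)[:pop_limit]
      return max(map(fitness, pop))

  random.seed(0)
  print("smart, slow:", evolve(smart_edit, offspring_per_step=1))
  print("dumb, fast :", evolve(random_flip, offspring_per_step=20))

Both runs reach the maximum fitness of 64, the point of the analogy: targeted
changes and sheer reproductive volume both get there.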

A similar situation exists on the Internet.  SQL Slammer was very simple (a
376 byte UDP packet), but it doubled in population every 8 seconds, saturating
the Internet in 10 minutes.  It could not mutate, so it became extinct once
the vulnerability it exploited was patched.  At the other extreme, some email
viruses scan text and address books on their hosts to construct convincing
forgeries using complex algorithms.  These intelligent viruses are harder to
eradicate.
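
The doubling arithmetic is easy to check.  Assuming a single initial copy and
roughly 75,000 vulnerable hosts (the figure commonly cited for Slammer; treat
it as an assumption here), one doubling every 8 seconds exhausts the vulnerable
population in about 17 doublings, well inside the 10 minute figure.

  # Back-of-envelope: exponential growth at one doubling per 8 seconds.
  # The vulnerable-host count is an assumption for illustration.
  import math

  doubling_time = 8.0          # seconds per doubling
  vulnerable_hosts = 75_000    # assumed size of the vulnerable population

  doublings = math.ceil(math.log2(vulnerable_hosts))   # starting from 1 host
  print(f"{doublings} doublings, about {doublings * doubling_time / 60:.1f} minutes")
  # -> 17 doublings, about 2.3 minutes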

Instead of Core Wars, consider a real environment such as the Internet. 
Software has on average 1 to 2 bugs per 1000 lines of code.  This means that
operating systems and widely used software like web browsers, instant
messengers, media players, software on cell phones, etc., contain thousands of
bugs.  Some of these bugs can be exploited, e.g. to crash the program, to gain
control through a buffer overflow, or to trick the user.  New
vulnerabilities that affect your computer are discovered almost daily, so we
know that thousands more remain.  Today many exploits are discovered by white
hats who inform the authors, or gray hats who publish them, leaving them for
black hats to exploit.
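
To put rough numbers on "thousands of bugs": at 1 to 2 defects per 1000 lines,
the count scales linearly with code size.  The line counts below are
illustrative guesses, not measurements.

  # Rough defect estimate: defects ~= lines_of_code * defect_rate.
  # Line counts are illustrative guesses, not measured values.
  defect_rate = (1 / 1000, 2 / 1000)          # 1 to 2 bugs per KLOC
  code_bases = {"operating system": 40_000_000, "web browser": 5_000_000}

  for name, loc in code_bases.items():
      low, high = (round(loc * r) for r in defect_rate)
      print(f"{name}: roughly {low:,} to {high:,} latent bugs")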

Security tools like NMAP, Nessus, and password crackers are double-edged
swords.  System administrators need them to test their systems for
vulnerabilities, and hackers use them to build botnets.  Naturally there is a
big incentive on both sides to build better tools.  Consider a program with
human-level understanding of software that spent all of its time searching for
unpublished vulnerabilities.  Such a tool would be invaluable for testing your
software before releasing it.  It would be invaluable to hackers too, who
could discover exploits in your system that were unknown to any human, or to
any virus checker, firewall, or intrusion detection system you might be using.

The real danger is this: a program intelligent enough to understand software
would be intelligent enough to modify itself.  It would be a simple change for
a hacker to have the program break into systems and copy itself with small
changes.  Some of these changes would result in new systems that were more
successful at finding vulnerabilities, reproducing, and hiding from the
infected host's owners, even if that was not the intent of the person who
launched it.  For example, a white hat testing a system for resistance to this
very thing might test it on an isolated network, then accidentally release it
when the network was reconnected because he had not killed all the copies as
he thought.

It is likely that all computers are vulnerable, and there would be little we
could do about it.  Human intelligence is no match for billions of copies of a
self-improving worm that understands software better than we do.  It would not
only infect computers, but potentially any system that connects to them, such
as cell phones, cameras, cars, and appliances.  But just as successful
parasites don't kill their hosts, the most successful worms would probably not
make your system so unusable that you would turn it off.

My question is, how soon could this occur?  Two considerations.  First, Legg
proved that an agent cannot predict (understand) a system with greater
algorithmic complexity than its own [1].  This means that recursive self
improvement (RSI) must be experimental: not all of the copies will be more
successful.  RSI is an evolutionary algorithm.
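
A rough paraphrase of the result in [1], as I read it (not the paper's exact
statement): a predictor that is guaranteed to work on every computable
sequence up to a given Kolmogorov complexity must itself be at least about
that complex.

  % Informal paraphrase of [1]; p is a predictor, K() is Kolmogorov
  % complexity, c a constant independent of n.
  \[
    \bigl(p \text{ correctly predicts every computable sequence } x
          \text{ with } K(x) \le n\bigr)
    \;\Longrightarrow\;
    K(p) \ge n - c .
  \]

So a program of fixed size cannot deduce in advance how every candidate
modification more complex than itself will behave; it can only try them and
keep what works.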

Second, understanding software seems to be AI-complete. 
Programmers have to understand how users and other programmers think, and be
able to read and understand documentation written in natural language.  Does
this require the same complexity as a language model (10^9 bits)?  Could a
simpler program compensate for lesser knowledge with brute-force computation,
perhaps using simple pattern matching to identify likely programs at the
machine level?  The problem is equivalent to compressing code.  It seems to me
that the lower bound might be the algorithmic complexity of the underlying
hardware or programming language, plus some model of the environment of
unknown complexity.
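
As a crude illustration of "simple pattern matching at the machine level": the
byte statistics of compiled code are far from uniform, so even a tiny
hand-picked pattern list separates a native binary from random bytes.  This is
a hypothetical toy, assuming sys.executable points at an x86-64 binary, and
says nothing about how a real vulnerability finder would work.

  # Toy detector: does a byte string look more like x86-64 machine code than
  # random data?  The pattern list is a hand-picked illustration, not exhaustive.
  import os, sys

  COMMON_X86 = [b"\x55\x48\x89\xe5",  # push rbp; mov rbp, rsp (function prologue)
                b"\xc3",              # ret
                b"\xe8",              # call rel32
                b"\x48\x8b"]          # mov r64, r/m64

  def code_likelihood(data):
      # Hits per kilobyte of input for the patterns above.
      hits = sum(data.count(p) for p in COMMON_X86)
      return 1000.0 * hits / max(len(data), 1)

  with open(sys.executable, "rb") as f:   # the interpreter's own native binary
      binary = f.read()
  noise = os.urandom(len(binary))

  print("native executable:", round(code_likelihood(binary), 1))
  print("random bytes:     ", round(code_likelihood(noise), 1))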

References

1. Legg, Shane (2006), "Is There an Elegant Universal Theory of Prediction?",
Technical Report IDSIA-12-06, IDSIA / USI-SUPSI, Dalle Molle Institute for
Artificial Intelligence, Galleria 2, 6928 Manno, Switzerland.
http://www.vetta.org/documents/IDSIA-12-06-1.pdf


-- Matt Mahoney, [EMAIL PROTECTED]
