Pei, thanks for the comments. I posted an updated version of the paper to 
http://www.mattmahoney.net/rsi.pdf

I also found a bug in my RSI program and posted a revised version (which is 
also easier to read). My original program did not depend on input t, so 
according to my definition, it did not have a goal. The new program has a 
simple goal of printing big numbers, which get bigger as the time bound t 
increases. It also outputs an improved copy of itself whose output, for any 
input, is larger by 1. It is a 13-line C program such that the n'th 
generation outputs t+n.
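
To give the flavor here, below is a rough sketch of the idea (this is an 
illustration, not the actual 13-line program from the paper, and it assumes 
t is read from stdin). Generation N prints t+N, then prints its own source 
with N replaced by N+1, which is the "improved copy".

/* Sketch only -- not the 13-line program in the paper. Generation N
   reads the time bound t, prints t+N (the goal), then prints its own
   source with N incremented. The comment itself is not reproduced in
   the emitted copy; only the four lines below are. */
#include <stdio.h>
#define N 0
char*s="#include <stdio.h>%c#define N %d%cchar*s=%c%s%c;%cint main(void){long t=0;scanf(%c%cld%c,&t);printf(%c%cld%cn%c,t+N);printf(s,10,N+1,10,34,s,34,10,34,37,34,34,37,92,34,10);return 0;}%c";
int main(void){long t=0;scanf("%ld",&t);printf("%ld\n",t+N);printf(s,10,N+1,10,34,s,34,10,34,37,34,34,37,92,34,10);return 0;}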

I added a section to the paper explaining why I used a batch mode model of an 
intelligent agent when normally we use an interactive model (such as yours and 
AIXI). I distinguish between self-improvement and learning: self-improvement 
means the program rewrites its own software to better achieve some goal. It 
has to do this without any outside help beyond what it initially 
knows. If it updates itself based on new information, then that's learning. For 
batch mode testing, a utility function is sufficient to define a goal.
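
As a toy illustration of that (the names here are just for illustration; G 
is the identity function, so bigger output is better, and P(n,t) stands for 
the n'th generation of the example program above):

#include <stdio.h>

/* Toy batch-mode evaluation of programs against a goal G: N -> R.
   Here G is the identity, so a bigger output scores higher. */
double G(unsigned long x) { return (double)x; }

/* Generation n of the example program: given time bound t, output t+n. */
unsigned long P(unsigned long n, unsigned long t) { return t + n; }

int main(void) {
    unsigned long t;
    /* P(2,.) improves on P(1,.): for every t, G(P(2,t)) >= G(P(1,t)). */
    for (t = 1; t <= 5; t++)
        printf("t=%lu  G(P1)=%g  G(P2)=%g\n", t, G(P(1, t)), G(P(2, t)));
    return 0;
}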

> *. "AIXI has insufficient knowledge (none initially)
> ..."
> 
> But it assumes a reward signal, which contains sufficient
> knowledge to
> evaluate behaviors. What if the reward signal is wrong?

The reward signal (controlled by the environment) is by definition what the 
agent tries to maximize. It can't be "wrong".


-- Matt Mahoney, [EMAIL PROTECTED]


--- On Sun, 9/14/08, Pei Wang <[EMAIL PROTECTED]> wrote:

> From: Pei Wang <[EMAIL PROTECTED]>
> Subject: Re: [agi] A model for RSI
> To: agi@v2.listbox.com
> Date: Sunday, September 14, 2008, 1:58 PM
> Matt,
> 
> Thanks for the paper. Some random comments:
> 
> *. "If RSI is possible, then it is critical that the
> initial goals of
> the first iteration of agents (seed AI) are friendly to
> humans and
> that the goals not drift through successive
> iterations."
> 
> As I commented on Ben's paper recently, here the
> implicit assumption
> is that the initial goals fully determines the goal
> structure, which I
> don't think is correct. If you think otherwise, you
> should argue for
> it, or at least make it explicit.
> 
> *. "Turing [5] defined AI as the ability of a machine
> to fool a human
> into believing it was another human."
> 
> No he didn't. Turing proposed the imitation game as a
> sufficient
> condition for intelligence, and he made it clear that it
> may not be a
> necessary condition by saying "May not machines carry
> out something
> which ought to be described as thinking but which is very
> different
> from what a man does? This objection is a very strong one,
> but at
> least we can say that if, nevertheless, a machine can be
> constructed
> to play the imitation game satisfactorily, we need not be
> troubled by
> this objection."
> 
> *. "This would solve the general intelligence problem
> once and for
> all, except for the fact that the strategy is not
> computable."
> 
> Not only that. Other exceptions include the situations
> where the
> definition doesn't apply, such as in systems where
> goals change over
> time, where no immediate and reliable reward signals are
> given, etc.,
> not to mention the unrealistic assumption on infinite
> resources.
> 
> *. "AIXI has insufficient knowledge (none initially)
> ..."
> 
> But it assumes a reward signal, which contains sufficient
> knowledge to
> evaluate behaviors. What if the reward signal is wrong?
> 
> *. "Hutter also proved that in the case of space bound
> l and time bound t ..."
> 
> That is not the same thing as "insufficient
> resources".
> 
> *. "We define a goal as a function G: N → R mapping
> natural numbers
> ... to real numbers."
> 
> I'm sure you can build systems with such a goal, though
> call it a
> "definition of goal" seems too strong --- are you
> claiming that all
> the "goals" in the AGI context can be put into
> this format? On the
> other hand, are all N → R functions goals? If not, how to
> distinguish
> them?
> 
> *. "running P longer will eventually produce a better
> result and never
> produce a worse result afterwards"
> 
> This is true for certain goals, though not for all. Some
> goals ask for
> keeping some parameter (such as body temperature) at a
> certain value,
> which cannot be covered by your definition using
> monotonically
> increasing function.
> 
>  *. "Define an improving sequence with respect to G as
> an infinite
> sequence of programs P1, P2, P3,... such that for all i
> > 0, Pi+1
> improves on Pi with respect to goal G."
> 
> If this is what people means by RSI, I don't think it
> can be designed
> to happen --- it will either be impossible or only happens
> by
> accident. All realistic learning and adaptation is
> tentative --- you
> make a change with the belief that it will be a better
> strategy,
> according to your experience, though you can never be
> absolutely sure,
> because the future is different from the past. There is no
> guaranteed
> improvement in an open system.
> 
> Pei
> 
> On Sat, Sep 13, 2008 at 11:39 PM, Matt Mahoney
> <[EMAIL PROTECTED]> wrote:
> > I have produced a mathematical model for recursive self improvement
> > and would appreciate any comments before I publish this.
> >
> > http://www.mattmahoney.net/rsi.pdf
> >
> > In the paper, I try to give a sensible yet precise definition of
> > what it means for a program to have a goal. Then I describe infinite
> > sequences of programs that improve with respect to reaching a goal
> > within fixed time bounds, and finally I give an example (in C) of a
> > program that outputs the next program in this sequence. Although it
> > is my long sought goal to prove or disprove RSI, it doesn't entirely
> > resolve the question because the rate of knowledge gain is O(log n)
> > and I prove that is the best you can do given fixed goals.
> >
> > -- Matt Mahoney, [EMAIL PROTECTED]
> >
> >


