Ben Goertzel wrote:
> Eliezer,
>
>> A (selfish) human upload can engage in complex cooperative strategies
>> with an exact (selfish) clone, and this ability is not accessible to
>> AIXI-tl, since AIXI-tl itself is not tl-bounded and therefore cannot
>> be simulated by AIXI-tl, nor does AIXI-tl have any means of
>> abstractly representing the concept "a copy of myself". Similarly,
>> AIXI is not computable and therefore cannot be simulated by AIXI.
>> Thus both AIXI and AIXI-tl break down in dealing with a physical
>> environment that contains one or more copies of them. You might say
>> that AIXI and AIXI-tl can both do anything except recognize
>> themselves in a mirror.
>
> I disagree with the bit about 'nor does AIXI-tl have any means of
> abstractly representing the concept "a copy of myself".'
>
> It seems to me that AIXI-tl is capable of running programs that contain
> such an abstract representation. Why not? If the parameters are
> right, it can run programs vastly more complex than a human brain
> upload...
>
> For example, an AIXI-tl can run a program that contains the AIXI-tl
> algorithm, as described in Hutter's paper, with t and l left as free
> variables. This program can then carry out reasoning using predicate
> logic, about AIXI-tl in general, and about AIXI-tl for various values
> of t and l.
>
> Similarly, AIXI can run a program that contains a mathematical
> description of AIXI similar to the one in Hutter's paper. This program
> can then prove theorems about AIXI using predicate logic.
>
> For instance, if AIXI were rewarded for proving math theorems about
> AGI, eventually it would presumably learn to prove theorems about AIXI,
> extending Hutter's theorems and so forth.
Yes, AIXI can indeed prove theorems about AIXI better than any human.
AIXI-tl can prove theorems about AIXI-tl better than any tl-bounded human.
AIXI-tl can model AIXI-tl as well as any tl-bounded human. AIXI-tl can
model a tl-bounded human, say Lee Corbin, better than any tl-bounded
human; given deterministic physics it's possible AIXI-tl can model Lee
Corbin better than Lee Corbin (although I'm not quite as sure of this).
But AIXI-tl can't model an AIXI-tl in the same way that a Corbin-tl can
model a Corbin-tl. See below.
>> The simplest case is the one-shot Prisoner's Dilemma against your own
>> exact clone. It's pretty easy to formalize this challenge as a
>> computation that accepts either a human upload or an AIXI-tl. This
>> obviously breaks the AIXI-tl formalism. Does it break AIXI-tl? This
>> question is more complex than you might think. For simple problems,
>> there's a nonobvious way for AIXI-tl to stumble onto incorrect
>> hypotheses which imply cooperative strategies, such that these
>> hypotheses are stable under the further evidence then received. I
>> would expect there to be classes of complex cooperative problems in
>> which the chaotic attractor AIXI-tl converges to is suboptimal, but I
>> have not proved it. It is definitely true that the physical problem
>> breaks the AIXI formalism and that a human upload can
>> straightforwardly converge to optimal cooperative strategies based on
>> a model of reality which is more correct than any AIXI-tl is capable
>> of achieving.
>>
>> Ultimately AIXI's decision process breaks down in our physical
>> universe because AIXI models an environmental reality with which it
>> interacts, instead of modeling a naturalistic reality within which it
>> is embedded. It's one of two major formal differences between AIXI's
>> foundations and Novamente's. Unfortunately there is a third
>> foundational difference between AIXI and a Friendly AI.
>
> I don't agree at all.
>
> In a Prisoner's Dilemma between two AIXI-tl's, why can't each one run a
> program that:
>
> * uses an abstract mathematical representation of AIXI-tl, similar to
> the one given in the Hutter paper
> * uses predicate logic to prove theorems about the behavior of the other
> AIXI-tl
Because AIXI-tl is not an entity deliberately allocating computing power;
its control process is fixed. AIXI-tl will model a process that proves
theorems about AIXI-tl only if that process is the best predictor of the
environmental information seen so far.
Let's say the primary AIXI-tl, the one whose performance we're tracking,
is facing a complex cooperative problem. Within each round, the challenge
protocol is as follows.
1) The Primary testee is cloned - that is, the two testees are
resynchronized at the start of each new round. This is why Lee Corbin is
the human upload (i.e., to avoid moral issues). We will assume that the
Secondary testee, if a human upload, continues to attempt to maximize
rational reward despite impending doom; again, this is why we're using Lee
Corbin.
2) Each party, the Primary and the Secondary (the Secondary being
re-cloned on each round), is shown an identical map of the next
cooperative complex problem. For example, this might be a set of
billiards on a complex table with pockets that score different numbers of
points, the Primary's billiards colored green and the Secondary's
billiards colored blue. However, neither party is told which party they
are during this stage.
3) Each party is flashed a green or blue screen telling them whether they
are Primary or Secondary.
4) Each party has the opportunity to input a set of initial velocities
for their billiards.
5) Each party is shown the billiards problem playing out. (Strictly
speaking this step is optional, as the reward can act as the sole source
of feedback, but it simplifies the conceptual description of the scenario.)
6) Each party receives a reward proportional to the sum of the points
scored in the pockets their billiards sank into, bearing in mind that
different pockets score different numbers of points.
We'll assume that the table configuration is such as to require
cooperative collisions in order to reach the highest pockets. This
probably isn't really such a good example of a cooperative problem, but it
gives a general picture.
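The round protocol can be sketched in miniature, with the billiards game abstracted to a generic joint-reward function. This is a rough sketch only; the Agent interface, play_round, and joint_reward are illustrative names I've made up, not part of Hutter's formalism.

```python
# A minimal sketch of steps (1)-(6) above, one round. The billiards game
# is abstracted into joint_reward(a_p, a_s) -> (r_p, r_s).
import copy

def play_round(primary, problem, joint_reward):
    # Step 1: the Secondary is a fresh clone, resynchronized each round.
    secondary = copy.deepcopy(primary)

    # Step 2: both parties see the same map, not yet knowing which they are.
    primary.observe(problem)
    secondary.observe(problem)

    # Step 3: each is flashed its identity.
    primary.observe("green")    # "you are Primary"
    secondary.observe("blue")   # "you are Secondary"

    # Step 4: each inputs its action (its set of initial velocities).
    a_p = primary.act()
    a_s = secondary.act()

    # Steps 5-6: the problem plays out; each receives its own reward.
    r_p, r_s = joint_reward(a_p, a_s)
    primary.reward(r_p)
    # The Secondary is discarded; only the Primary persists across rounds.
    return r_p
```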
Lee Corbin can work out his entire policy in step (2), before step (3)
occurs, knowing that his synchronized other self - whichever one he is -
is doing the same. Thus (both of) Lee Corbin will tend to work out
policies which are fair but which maximize reward for both parties. Then
in step (4) Corbin just implements the actions already decided on, unless
he starts deciding to defect against himself, which is something he'll
have to work out on his own. There is a purely rational solution for this
philosophical problem which formalizes Hofstadterian (selfish)
superrationality, but we may assume Corbin just cooperates with himself on
instinct; AIXI-tl is supposed to outperform *any* tl-bounded algorithm.
Similarly, we will assume Corbin has been told the experimental setup
beforehand and that this forms part of his initial state. Presuming that
the billiard problems are tractable for Corbin, he should score very well.
What happens to AIXI-tl?
AIXI-tl first has the opportunity to produce an action in step (4).  At
this time the Primary and Secondary already have different information;
they saw a different screen flash and were shown a different set of
billiards awaiting input. AIXI-tl's reasoning now can be roughly
understood... if *I've* understood it correctly... as effectively (a)
taking the size 2^l set of l-bounded programs, (b) treating these programs
as probability measures over all possible inputs, (c) Bayesian-updating
the posterior probabilities of all programs given the actual observed
inputs, (d) using this posterior probability to weight those programs'
predictions of rewards given various possible outputs, and (e) choosing a
strategy which maximizes reward over [horizon] rounds.  I may have
misunderstood exactly what AIXI-tl is doing; if so, though, it seems
unlikely to affect the major point made below.
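In toy form, steps (a)-(e) amount to a Bayesian mixture update followed by expected-reward maximization. The sketch below shrinks the 2^l program class to two hand-written predictors; all names are illustrative, and this is not Hutter's actual construction.

```python
# (a)-(c): Bayes-update a posterior over a small "program class" given
# the inputs actually observed.  Each program is represented by a
# likelihood function over observations.
def bayes_update(prior, likelihood, history):
    post = dict(prior)
    for obs in history:
        post = {h: p * likelihood[h](obs) for h, p in post.items()}
        z = sum(post.values())
        post = {h: p / z for h, p in post.items()}
    return post

# (d)-(e): pick the action whose posterior-weighted predicted reward is
# highest (horizon of one round, for simplicity).
def choose_action(post, predicted_reward, actions):
    def expected(a):
        return sum(p * predicted_reward[h](a) for h, p in post.items())
    return max(actions, key=expected)
```

Note that on this scheme, a posterior concentrated on "the opponent cooperates" drives the chooser straight to defection, which is the instability discussed further below.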
The major point is as follows: AIXI-tl is unable to arrive at a valid
predictive model of reality because the sequence of inputs it sees, on
successive rounds, is being produced by AIXI-tl trying to model the
inputs using tl-bounded programs, while in fact those inputs are really
the outputs of the non-tl-bounded AIXI-tl. If a tl-bounded program
correctly predicts the inputs seen so far, it will be using some
inaccurate model of the actual reality, since no tl-bounded program can
model the actual computational process AIXI-tl uses to select outputs. A
tl-bounded program (like myself) can *reason abstractly about* properties
of AIXI-tl but not actually *simulate* AIXI-tl well enough to produce its
output as a prediction. This problem gets worse and worse as AIXI-tl
reasons harder and harder on its alternate selves' outputs-as-inputs and
produces future outputs which are compoundedly less and less computable
for tl-bounded programs.
This chaotic iterative process may have an attractor in which a predictive
model suggests an output strategy for Secondaries which confirms the model
when seen as inputs by the Primary. Note, however, that while Corbin's
cooperative strategy is self-confirming, not all self-confirming
strategies are cooperative - the Always-D strategy in the one-shot
Prisoner's Dilemma is also self-confirming.  Meanwhile, Always-C
strategies are self-confirming and produce rewards equalling or exceeding
Corbin's score, but this predictive model is not stable for AIXI-tl under
the test conditions - if AIXI-tl predicts that the opponent always
cooperates, it will attempt to defect! The chaotic process producing
AIXI-tl's strategy, if any, cannot be understood as analogous to Corbin
working out the optimal strategy with his synchronized other self.
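The stability claim can be checked directly against standard one-shot Prisoner's Dilemma payoffs (assuming the usual T=5, R=3, P=1, S=0; a toy check, not part of the formalism): a model predicting "opponent always cooperates" is self-refuting for a reward maximizer, while "opponent always defects" is self-confirming.

```python
# Row player's reward in the one-shot Prisoner's Dilemma.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0,
          ("D", "C"): 5, ("D", "D"): 1}

def best_response(predicted_opponent_move):
    # the reward-maximizing reply to a fixed prediction of the opponent
    return max("CD", key=lambda a: PAYOFF[(a, predicted_opponent_move)])

def self_confirming(strategy):
    # a predicted strategy is stable iff the best response to it is itself
    return best_response(strategy) == strategy
```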
For very simple problems AIXI-tl may arrive at self-confirming *and*
stable *and* cooperative strategies, such as the one-shot Prisoner's
Dilemma and Tit for Tat.  I would expect that for any complex cooperative
problem, AIXI-tl's inaccurate modeling process, iterating over its own
feedback, eventually converges to an attractor, a (false) model which is
both self-confirming and game-theoretically stable. I would expect that
there are many such attractors in complex cooperative problems and no
reason why AIXI-tl would successfully hit the optimal cooperative strategy
(which may not even be stable given AIXI-tl's modeling process) - given
that stability and self-confirmation are both fundamentally different
criteria from cooperative optimality.
> How is this so different than what two humans do when reasoning about
> each others' behavior? A given human cannot contain within itself a
> detailed model of its own clone; in practice, when a human reasons
> about the behavior of its clone, it is going to use some abstract
> representation of that clone, and do some precise or uncertain
> reasoning based on this abstract representation.
Humans can use a naturalistic representation of a reality in which they
are embedded, rather than being forced like AIXI-tl to reason about a
separated environment; consequently humans are capable of rationally
reasoning about correlations between their internal mental processes and
other parts of reality, which is the key to the complex cooperation
problem with your own clone - the realization that you can actually
*decide* your clone's actions in step (2), if you make the right
agreements with yourself and keep them.
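The contrast can be compressed into a pair of decision rules: the embedded reasoner chooses over the diagonal of the payoff matrix, because its clone's move covaries with its own, while the Cartesian reasoner best-responds to a fixed external prediction. A toy sketch with standard Prisoner's Dilemma payoffs; both function names are illustrative.

```python
PAYOFF = {("C", "C"): 3, ("C", "D"): 0,
          ("D", "C"): 5, ("D", "D"): 1}  # row player's reward

def cartesian_choice(predicted_opponent):
    # environment-modeling view: the other agent's move is an external
    # fact to best-respond to
    return max("CD", key=lambda a: PAYOFF[(a, predicted_opponent)])

def naturalistic_choice():
    # embedded view: my clone runs my decision procedure, so choosing a
    # policy chooses it for both of us -- evaluate the diagonal outcomes
    return max("CD", key=lambda a: PAYOFF[(a, a)])
```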
To sum up:
(a) The fair, physically realizable challenge of cooperation with your
clone immediately breaks the AIXI and AIXI-tl formalisms.
(b) This happens because of a hidden assumption built into the formalism,
wherein AIXI devises a Cartesian model of a separated environmental
theatre, rather than devising a model of a naturalistic reality that
includes AIXI.
(c) There's no obvious way to repair the formalism. It's been
diagonalized, and diagonalization is usually fatal. The AIXI homunculus
relies on perfectly modeling the environment shown on its Cartesian
theatre; a naturalistic model includes the agent itself embedded in
reality, but the reflective part of the model is necessarily imperfect
(halting problem).
(d) It seems very likely (though I have not actually proven it) that in
addition to breaking the formalism, the physical challenge actually breaks
AIXI-tl in the sense that a tl-bounded human outperforms it on complex
cooperation problems.
(e) This conjectured outperformance reflects the human use of a type of
rational (Bayesian) reasoning apparently closed to AIXI, in that humans
can reason about correlations between their internal processes and distant
elements of reality, as a consequence of (b) above.
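The diagonalization in point (c) can be shown in miniature: hand any predictor to an agent that inverts the prediction about itself, and the predictor is wrong about that agent no matter how good it is elsewhere. A toy sketch; `contrarian` and `prediction_correct` are illustrative names.

```python
# Toy diagonalization: an agent that consults a predictor about itself
# and then does the opposite of whatever was predicted.
def contrarian(predictor):
    predicted = predictor(contrarian)
    return "D" if predicted == "C" else "C"

def prediction_correct(predictor):
    # does the predictor's guess match the agent's actual move?
    return predictor(contrarian) == contrarian(predictor)
```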
--
Eliezer S. Yudkowsky http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence