> Ben Goertzel wrote:
> > I don't think that preventing an AI from tinkering with its
> > reward system is the only solution, or even the best one...
> >
> > It will in many cases be appropriate for an AI to tinker with its goal
> > system...
>
> I don't think I was being clear there. I don't mean
Ben Goertzel wrote:
> I don't think that preventing an AI from tinkering with its
> reward system is the only solution, or even the best one...
>
> It will in many cases be appropriate for an AI to tinker with its goal
> system...
I don't think I was being clear there. I don't mean the AI should b
> To avoid the problem entirely, you have to figure out how to make
> an AI that
> doesn't want to tinker with its reward system in the first place. This, in
> turn, requires some tricky design work that would not necessarily seem
> important unless one were aware of this problem. Which, of cours
Ben Goertzel wrote:
> Agreed, except for the "very modest resources" part. AIXI could
> potentially accumulate pretty significant resources pretty quickly.
Agreed. But if the AIXI needs to disassemble the planet to build its
defense mechanism, the fact that it is harmless afterwards isn't going
Philip,
> The discussion at times seems to have progressed on the basis that
> AIXI / AIXItl could choose to do all sorts of amazing, powerful things. But
> what I'm unclear on is what generates the infinite space of computer
> programs?
>
> Does AIXI / AIXItl itself generate these programs? Or does
I might have missed a key point made in the earlier part of the
discussion, but people have said on many occasions something like the
following in relation to AIXI / AIXItl:
> The function of this component would be much more effectively served
> by a module that was able to rapidly search throu
> It should also be pointed out that we are describing a state of
> AI such that:
>
> a) it provides no conceivable benefit to humanity
Not necessarily true: it's plausible that along the way, before learning how
to whack off by stimulating its own reward button, it could provide some
benefits t
Billy Brown wrote:
Ben Goertzel wrote:
I think this line of thinking makes way too many assumptions about
the technologies this uber-AI might discover.
It could discover a truly impenetrable shield, for example.
It could project itself into an entirely different universe...
It might decide we
> Now, it is certainly conceivable that the laws of physics just
> happen to be
> such that a sufficiently good technology can create a provably
> impenetrable
> defense in a short time span, using very modest resources.
Agreed, except for the "very modest resources" part. AIXI could potentially accumulate pretty significant resources pretty quickly.
Ben Goertzel wrote:
> I think this line of thinking makes way too many assumptions about the
> technologies this uber-AI might discover.
>
> It could discover a truly impenetrable shield, for example.
>
> It could project itself into an entirely different universe...
>
> It might decide we pose so
Wei Dai wrote:
Ok, I see. I think I agree with this. I was confused by your phrase
"Hofstadterian superrationality" because if I recall correctly, Hofstadter
suggested that one should always cooperate in one-shot PD, whereas you're
saying only cooperate if you have sufficient evidence that the
> Now, there is no easy way to predict what strategy it will settle on, but
> "build a modest bunker and ask to be left alone" surely isn't it. At the
> very least it needs to become the strongest military power in the
> world, and
> stay that way. I
...
> Billy Brown
>
I think this line of thin
On Wed, Feb 19, 2003 at 11:56:46AM -0500, Eliezer S. Yudkowsky wrote:
> The mathematical pattern of a goal system or decision may be instantiated
> in many distant locations simultaneously. Mathematical patterns are
> constant, and physical processes may produce knowably correlated outputs
> gi
>
> Now, there is no easy way to predict what strategy it will settle on, but
> "build a modest bunker and ask to be left alone" surely isn't it. At the
> very least it needs to become the strongest military power in the world, and
> stay that way. It might very well decide that exterminating the
Wei Dai wrote:
> The AIXI would just construct some nano-bots to modify the reward-button so
> that it's stuck in the down position, plus some defenses to
> prevent the reward mechanism from being further modified. It might need to
> trick humans initially into allowing it the ability to construct s
> The AIXI would just construct some nano-bots to modify the reward-button so
> that it's stuck in the down position, plus some defenses to
> prevent the reward mechanism from being further modified. It might need to
> trick humans initially into allowing it the ability to construct such
> nano-bo
On Wed, Feb 19, 2003 at 11:02:31AM -0500, Ben Goertzel wrote:
> I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
> operating program leading it to hurt or annihilate humans, though.
>
> It might learn a program involving actually doing beneficial acts for humans
>
> Or, it m
Wei Dai wrote:
Eliezer S. Yudkowsky wrote:
"Important", because I strongly suspect Hofstadterian superrationality
is a *lot* more ubiquitous among transhumans than among us...
It's my understanding that Hofstadterian superrationality is not generally
accepted within the game theory research co
Bill Hibbard wrote:
The real flaw in the AIXI discussion was Eliezer's statement:
Lee Corbin can work out his entire policy in step (2), before step
(3) occurs, knowing that his synchronized other self - whichever one
he is - is doing the same.
He was assuming that a human could know that ano
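To make the Corbin-style clone reasoning concrete, here is a toy sketch. It
is an illustration under the standard PD payoff assumptions, not code from
the thread, and the function names are mine: an agent that knows the other
player is an exact deterministic copy of itself only needs to compare the
two diagonal outcomes, while an agent that treats the other player's move
as an independent feature of the environment simply best-responds and
defects.

PAYOFF = {  # (my_move, their_move) -> my payoff; standard PD ordering (assumed)
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def clone_aware_choice():
    # My exact clone's move is whatever this same deterministic function
    # returns, so only the diagonal outcomes (C,C) and (D,D) are reachable.
    return max(["C", "D"], key=lambda m: PAYOFF[(m, m)])                # "C"

def correlation_blind_choice(predicted_other="C"):
    # Treating the other player's move as a fixed environmental variable
    # and best-responding to it makes defection dominant.
    return max(["C", "D"], key=lambda m: PAYOFF[(m, predicted_other)])  # "D"

print(clone_aware_choice(), correlation_blind_choice())                 # C D

The claim in the thread is that AIXI-tl's fixed control process is stuck in
the second position: it can weigh tl-computable models of the other player's
outputs, but has no way to recognize the other player as a transform of its
own policy.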
I wrote:
> I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
> operating program leading it to hurt or annihilate humans, though.
>
> It might learn a program involving actually doing beneficial acts
> for humans
>
> Or, it might learn a program that just tells humans what the
> This seems to be a non sequitur. The weakness of AIXI is not that its
> goals don't change, but that it has no goals other than to maximize an
> externally given reward. So it's going to do whatever it predicts will
> most efficiently produce that reward, which is to coerce or subvert
> the eva
Wei Dai wrote:
> This seems to be a non sequitur. The weakness of AIXI is not that its
> goals don't change, but that it has no goals other than to maximize an
> externally given reward. So it's going to do whatever it predicts will
> most efficiently produce that reward, which is to coerce or su
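For readers who want the formal version of "no goals other than to maximize
an externally given reward", the expectimax expression defining AIXI's
action choice makes the point visible. This is a reconstruction from
Hutter's published definition as I recall it, not something quoted in the
thread:

\[
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \left( r_k + \cdots + r_m \right)
       \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
\]

The reward r_i is simply one field of the percept the agent receives, so
nothing in the expression distinguishes rewards produced by a pleased human
evaluator from rewards produced by nano-bots holding the button down.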
On Tue, Feb 18, 2003 at 06:58:30PM -0500, Ben Goertzel wrote:
> However, I do think he ended up making a good point about AIXItl, which is
> that an AIXItl will probably be a lot worse at modeling other AIXItl's, than
> a human is at modeling other humans. This suggests that AIXItl's playing
> coo
Eliezer,
Allowing goals to change in a coupled way with thoughts and memories is not
simply "adding entropy"
-- Ben
> Ben Goertzel wrote:
> >>
> >>I always thought that the biggest problem with the AIXI model is that it
> >>assumes that something in the environment is evaluating the AI
> and gi
Ben Goertzel wrote:
I always thought that the biggest problem with the AIXI model is that it
assumes that something in the environment is evaluating the AI and giving
it rewards, so the easiest way for the AI to obtain its rewards would be
to coerce or subvert the evaluator rather than to accompl
Wei Dai wrote:
> > "Important", because I strongly suspect Hofstadterian superrationality
> > is a *lot* more ubiquitous among transhumans than among us...
>
> It's my understanding that Hofstadterian superrationality is not generally
> accepted within the game theory research community as a vali
Eliezer S. Yudkowsky wrote:
> "Important", because I strongly suspect Hofstadterian superrationality
> is a *lot* more ubiquitous among transhumans than among us...
It's my understanding that Hofstadterian superrationality is not generally
accepted within the game theory research community as a
> To me it's almost enough to know that both you and Eliezer agree that
> the AIXItl system can be 'broken' by the challenge he set and that a
> human digital simulation might not. The next step is to ask "so what?".
> What has this got to do with the AGI friendliness issue?
This last point of E
Hi Ben,
From a high order implications point of view I'm not sure that we need
too much written up from the last discussion.
To me it's almost enough to know that both you and Eliezer agree that
the AIXItl system can be 'broken' by the challenge he set and that a
human digital simulation migh
ent, we'll
see ...)
--
Ben
Hi Eliezer/Ben/all,
Well if the Breaking AIXI-tl discussion was the warm up then the discussion
of the hard stuff on AGI friendliness is going to be really something!
Bring it on! :)
Just a couple of suggestions about the methodology of the discussion -
could we complement e
Ben Goertzel wrote:
Actually, Eliezer said he had two points about AIXItl:
1) that it could be "broken" in the sense he's described
2) that it was intrinsically un-Friendly
So far he has only made point 1), and has not gotten to point 2) !!!
As for a general point about the teachability of Fri
"Friendliness analysis of AGI
systems," rather than for any pragmatic implications it may have.
-- Ben
Hi Eliezer/Ben,
My recollection was that Eliezer initiated the "Breaking AIXI-tl"
discussion as a way of proving that friendliness of AGIs had to be
consciously built in at the start and couldn't be assumed to be
teachable at a later point. (Or have I totally lost the plot?)
Do you feel the di
> I guess that for AIXI to learn this sort of thing, it would have to be
> rewarded for understanding AIXI in general, for proving theorems about AIXI,
> etc. Once it had learned this, it might be able to apply this knowledge in
> the one-shot PD context. But I am not sure.
>
For those of u
>
>
> Ben Goertzel wrote:
> >>AIXI-tl can learn the
Ben Goertzel wrote:
AIXI-tl can learn the iterated PD, of course; just not the
oneshot complex PD.
But if it's had the right prior experience, it may have an operating program
that is able to deal with the oneshot complex PD... ;-)
Ben, I'm not sure AIXI is capable of this. AIXI may inexorabl
>
> AIXI-tl can learn the iterated PD, of course; just not the
> oneshot complex PD.
>
But if it's had the right prior experience, it may have an operating program
that is able to deal with the oneshot complex PD... ;-)
ben
Ben Goertzel wrote:
>> In a naturalistic universe, where there is no sharp boundary between
>> the physics of you and the physics of the rest of the world, the
>> capability to invent new top-level internal reflective choices can be
>> very important, pragmatically, in terms of properties of distan
Eliezer S. Yudkowsky wrote:
> Let's imagine I'm a superintelligent magician, sitting in my castle,
> Dyson Sphere, what-have-you. I want to allow sentient beings some way
> to visit me, but I'm tired of all these wandering AIXI-tl spambots that
> script kiddies code up to brute-force my entrance
> Anyway, a constant cave with an infinite tape seems like a constant
> challenge to me, and a finite cave that breaks any {AIXI-tl, tl-human}
> contest up to l=googlebyte also still seems interesting, especially as
> AIXI-tl is supposed to work for any tl, not just sufficiently high tl.
It's
Ben Goertzel wrote:
hi,
No, the challenge can be posed in a way that refers to an arbitrary agent
A which a constant challenge C accepts as input.
But the problem with saying it this way, is that the "constant challenge"
has to have an infinite memory capacity.
So in a sense, it's an infinite
Let's imagine I'm a superintelligent magician, sitting in my castle, Dyson
Sphere, what-have-you. I want to allow sentient beings some way to visit
me, but I'm tired of all these wandering AIXI-tl spambots that script
kiddies code up to brute-force my entrance challenges. I don't want to
tl-b
hi,
> No, the challenge can be posed in a way that refers to an arbitrary agent
> A which a constant challenge C accepts as input.
But the problem with saying it this way, is that the "constant challenge"
has to have an infinite memory capacity.
So in a sense, it's an infinite constant ;)
> No
> In a naturalistic universe, where there is no sharp boundary between the
> physics of you and the physics of the rest of the world, the
> capability to
> invent new top-level internal reflective choices can be very important,
> pragmatically, in terms of properties of distant reality that direct
Ben Goertzel wrote:
It's really the formalizability of the challenge as a computation which
can be fed either a *single* AIXI-tl or a *single* tl-bounded uploaded
human that makes the whole thing interesting at all... I'm sorry I didn't
succeed in making clear the general class of real-world anal
Brian Atkins wrote:
> Ben Goertzel wrote:
>>
>> So your basic point is that, because these clones are acting by
>> simulating programs that finish running in
>> going to be able to simulate each other very accurately.
>>
>> Whereas, a pair of clones each possessing a more flexible control
>> algor
> From my bystander POV I got something different out of this exchange of
> messages... it appeared to me that Eliezer was not trying to say that
> his point was regarding having more time for simulating, but rather that
> humans possess a qualitatively different "level" of reflectivity that
> all
Ben Goertzel wrote:
So your basic point is that, because these clones are acting by simulating
programs that finish running in
From my bystander POV I got something different out of this exchange of
messages... it appeared to me that Eliezer was not trying to say that
his point was regarding
> Eliezer/Ben,
>
> When you've had time to draw breath can you explain, in non-obscure,
> non-mathematical language, what the implications of the AIXI-tl
> discussion are?
>
> Thanks.
>
> Cheers, Philip
Here's a brief attempt...
AIXItl is a non-practical AGI software design, which basically con
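A very rough sketch of what that design does on each control cycle follows.
This is my own paraphrase of Hutter's construction, with the time-bounded
execution and proof-checking machinery drastically simplified and every
identifier hypothetical:

def aixitl_step(history, policies, run_bounded, verify_claim):
    """policies     : every program of length <= l (astronomically many)
       run_bounded  : runs a policy on the history for at most t steps,
                      returning (action, claimed_value, justification) or None
       verify_claim : checks that a policy's claimed value is validly justified
       AIXItl executes the action of the highest validly self-rated policy."""
    best_action, best_claim = None, float("-inf")
    for p in policies:                     # brute-force enumeration
        result = run_bounded(p, history)
        if result is None:                 # exceeded its t-step budget
            continue
        action, claimed_value, justification = result
        if verify_claim(p, claimed_value, justification) and claimed_value > best_claim:
            best_action, best_claim = action, claimed_value
    return best_action

Note that this outer loop itself is fixed and is not one of the enumerated
l-bounded policies, which is the asymmetry Eliezer's challenge exploits.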
Hi,
> There's a physical challenge which operates on *one* AIXI-tl and breaks
> it, even though it involves diagonalizing the AIXI-tl as part of the
> challenge.
OK, I see what you mean by calling it a "physical challenge." You mean
that, as part of the challenge, the external agent posing the
Eliezer/Ben,
When you've had time to draw breath can you explain, in non-obscure,
non-mathematical language, what the implications of the AIXI-tl
discussion are?
Thanks.
Cheers, Philip
Eliezer S. Yudkowsky wrote:
> Bill Hibbard wrote:
> > On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote:
> >
> >>It *could* do this but it *doesn't* do this. Its control process is such
> >>that it follows an iterative trajectory through chaos which is forbidden
> >>to arrive at a truthful solution,
Eliezer S. Yudkowsky wrote:
But if this isn't immediately obvious to you, it doesn't seem like a top
priority to try and discuss it...
Argh. That came out really, really wrong and I apologize for how it
sounded. I'm not very good at agreeing to disagree.
Must... sleep...
--
Eliezer S. Yudk
Ben Goertzel wrote:
>
> I'll read the rest of your message tomorrow...
>
>> But we aren't *talking* about whether AIXI-tl has a mindlike
>> operating program. We're talking about whether the physically
>> realizable challenge, which definitely breaks the formalism, also
>> breaks AIXI-tl in practi
Hmmm My friend, I think you've pretty much convinced me with this last
batch of arguments. Or, actually, I'm not sure if it was your excellently
clear arguments or the fact that I finally got a quiet 15 minutes to really
think about it (the three kids, who have all been out sick from school
I'll read the rest of your message tomorrow...
> But we aren't *talking* about whether AIXI-tl has a mindlike operating
> program. We're talking about whether the physically realizable
> challenge,
> which definitely breaks the formalism, also breaks AIXI-tl in practice.
> That's what I origina
Ben Goertzel wrote:
>
>> AIXI-tl *cannot* figure this out because its control process is not
>> capable of recognizing tl-computable transforms of its own policies
>> and strategic abilities, *only* tl-computable transforms of its own
>> direct actions. Yes, it simulates entities who know this; it
Hi,
> You appear to be thinking of AIXI-tl as a fuzzy little harmless baby being
> confronted with some harsh trial.
Once again, your ability to see into my mind proves extremely flawed ;-)
You're right that my statement "AIXItl is slow at learning" was ill-said,
though. It is very inefficien
Eliezer S. Yudkowsky asked Ben Goertzel:
>
> Do you have a non-intuitive mental simulation mode?
>
LOL --#:^D
It *is* a valid question, Eliezer, but it makes me laugh.
Michael Roy Ames
[Who currently estimates his *non-intuitive mental simulation mode* to
contain about 3 iterations of 5 variab
Bill Hibbard wrote:
On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote:
It *could* do this but it *doesn't* do this. Its control process is such
that it follows an iterative trajectory through chaos which is forbidden
to arrive at a truthful solution, though it may converge to a stable
attractor.
Ben Goertzel wrote:
>> Even if a (grown) human is playing PD2, it outperforms AIXI-tl
>> playing PD2.
>
> Well, in the long run, I'm not at all sure this is the case. You
> haven't proved this to my satisfaction.
PD2 is very natural to humans; we can take for granted that humans excel
at PD2. Th
> Even if a (grown) human is playing PD2, it outperforms AIXI-tl playing
> PD2.
Well, in the long run, I'm not at all sure this is the case. You haven't
proved this to my satisfaction.
In the short run, it certainly is the case. But so what? AIXI-tl is damn
slow at learning, we know that.
Th
Ben Goertzel wrote:
OK. Rather than responding point by point, I'll try to say something
compact ;)
You're looking at the interesting scenario of an iterated prisoners' dilemma
between two AIXI-tl's, each of which has a blank operating program at the
start of the iterated prisoners' dilemma. (In
> Really, when has a computer (with the exception of certain Microsoft
> products) ever been able to disobey its human masters?
>
> It's easy to get caught up in the romance of "superpowers", but come on,
> there's nothing to worry about.
>
> -Daniel
Hi Daniel,
Clearly there is nothing to worry
lot like it is, and make a guess that symmetrical
friendly behavior might be a good thing?
-- Ben
Hi Eliezer
Some replies to "side points":
> This is a critical class of problem for would-be implementors of
> Friendliness. If all AIs, regardless of their foundations, did sort of
> what humans would do, given that AI's capabilities, the whole world would
> be a *lot* safer.
Hmmm. I don't
On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote:
> Ben Goertzel wrote:
> . . .
> >> Lee Corbin can work out his entire policy in step (2), before step
> >> (3) occurs, knowing that his synchronized other self - whichever one
> >> he is - is doing the same.
> >
> > OK -- now, if AIXItl were st
Ben Goertzel wrote:
>
>> Because AIXI-tl is not an entity deliberately allocating computing
>> power; its control process is fixed. AIXI-tl will model a process
>> that proves theorems about AIXI-tl only if that process is the best
>> predictor of the environmental information seen so far.
>
> Wel
Eliezer,
I will print your message and read it more slowly tomorrow morning when my
brain is better rested.
But I can't resist some replies now, albeit on 4 hours of sleep ;)
> Because AIXI-tl is not an entity deliberately allocating computing power;
> its control process is fixed. AIXI-tl wil
Ben Goertzel wrote:
> Eliezer,
>
>> A (selfish) human upload can engage in complex cooperative strategies
>> with an exact (selfish) clone, and this ability is not accessible to
>> AIXI-tl, since AIXI-tl itself is not tl-bounded and therefore cannot
>> be simulated by AIXI-tl, nor does AIXI-tl have
Hi Eliezer,
> An intuitively fair, physically realizable challenge, with important
> real-world analogues, formalizable as a computation which can be fed
> either a tl-bounded uploaded human or an AIXI-tl, for which the human
> enjoys greater success measured strictly by total reward over time, du
Eliezer S. Yudkowsky wrote:
Has the problem been thought up just in the sense of "What happens when
two AIXIs meet?" or in the formalizable sense of "Here's a computational
challenge C on which a tl-bounded human upload outperforms AIXI-tl?"
I don't know of anybody else considering "human uplo
Eliezer,
> A (selfish) human upload can engage in complex cooperative
> strategies with
> an exact (selfish) clone, and this ability is not accessible to AIXI-tl,
> since AIXI-tl itself is not tl-bounded and therefore cannot be simulated
> by AIXI-tl, nor does AIXI-tl have any means of abstractly
Shane Legg wrote:
Eliezer,
Yes, this is a clever argument. This problem with AIXI has been
thought up before but only appears, at least as far as I know, in
material that is currently unpublished. I don't know if anybody
has analysed the problem in detail as yet... but it certainly is
a very i
Eliezer,
Yes, this is a clever argument. This problem with AIXI has been
thought up before but only appears, at least as far as I know, in
material that is currently unpublished. I don't know if anybody
has analysed the problem in detail as yet... but it certainly is
a very interesting question