Ben Goertzel wrote:

Hi

All who are interested in such topics and are willing to endure some raw speculative trains of thought may be interested in an essay I recently posted on goal-preservation in strongly self-modifying systems, which is linked to from this blog post

http://multiverseaccordingtoben.blogspot.com/2008/08/on-preservation-of-goals-in-self.html

(I've got another, shorter post on AGI systems with superhuman empathy coming up ... but probably
not this weekend; I've got some real work to get through first ;-)


The following commentary appears on my blog at susaro.com.

**************************************************************

Preserving Goals in AI: First You Sort Out The Definitions, Then You Do The Math (not vice versa).

This is a brief commentary on Goertzel's paper entitled "Toward an Understanding of the Preservation of Goals in Self-Modifying Cognitive Systems", which is to be found at http://www.goertzel.org/papers/PreservationOfGoals.pdf

The purpose of his paper is to ask what happens when AI systems are given goals (like "Make humans happy") and also given the ability to modify their own design ... including the goals themselves.

In the following analysis I want to focus on the opening definition offered in that paper, and its relationship to everything that comes after it.

So, Goertzel begins by asking "What does it mean for system S to possess goal G over time interval T?".

His suggested answer to this question makes the assumption that there exists an "...observer O, who is hopefully a smart guy". With this assumption in hand, he proposes the following definition, couched in terms of that observer:

"S possesses goal G over time interval T if, to O, it appears that the actions S takes during T are significantly more probable to lead to the maximization of G than random actions S would be able to take."

Let's analyze this.

The definition says the following: compare S's actions during time interval T - a set of actions I will henceforth call A[T] - with some randomly chosen set of alternative actions, which I will call A'[T]. If the comparison shows that A[T] is significantly more likely than A'[T] to lead to a maximization of G, then the system is deemed to have possessed goal G.
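Before listing the problems, it may help to see the bare skeleton of that test written out as code. The following Python sketch is my own paraphrase, not anything from Goertzel's paper; the names observed_actions, possible_actions and score_goal, and the values n_random=1000 and threshold=0.95, are all hypothetical placeholders chosen purely for illustration.

import random

def observer_says_S_has_goal(observed_actions, possible_actions, score_goal,
                             n_random=1000, threshold=0.95):
    # Toy paraphrase of the definition: S "possesses" G over T if the observed
    # actions A[T] score notably better on G than randomly chosen alternatives
    # A'[T].  score_goal maps a list of actions to a number measuring how well
    # they serve G; every argument here is a choice the observer O has to make.
    observed = score_goal(observed_actions)              # O decides how G is measured
    random_scores = [
        score_goal([random.choice(possible_actions)      # O decides what S "could have done"
                    for _ in observed_actions])
        for _ in range(n_random)
    ]
    # O decides what "significantly more probable" means:
    fraction_beaten = sum(observed > s for s in random_scores) / n_random
    return fraction_beaten > threshold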

But this entirely depends on the intellect and subjective judgement of the observer O. So much so that we might as well say that S has the goal G if the observer believes that S has the goal G - which would be an empty definition.

More specifically, there are at least five points on which the definition depends on the subjectivity of the observer:

1) The actions A[T] have to be enumerated and compared with a random set A'[T] ... but 'actions' are not neatly quantized, self-evident units, nor are they always obvious to an observer. What if the observer does not notice that a particular action occurred, either because it was too subtle or too abstract?

2) What is the meaning of a "significant" difference between the outcomes of the two sets of actions?

3) What exactly is the "maximization" of a goal? The best possible achievement of the goal in all possible worlds? Some goals can be stated only in qualitative terms, not quantitative terms, so the observer may be forced to make subtle judgements about the relative merits of different kinds of goal achievement. Is it better to maximize happiness by making absolutely sure that no person ever experiences a degree of happiness that deviates from the average by more than 0.01%, on a particular measure? Or would maximization of happiness allow greater variance with a higher average? The more abstract the goal, the more meaningless it is to speak of its maximization. (A toy numerical illustration of this point follows the list.)

4) And the actions must be judged, by the observer, to "lead" to the maximization of the goal. Who is to say how actions are causally connected to outcomes? Does a reduction in income tax for the wealthiest people lead to a 'trickle-down' increase in the overall wealth of the poor, or does it just lead to greater income disparity?

5) When exactly are the effects of the actions allowed to come into play? If the actions A[T] do not have a significant effect during time T, but then have a massive effect at [T + delta], does the observer ignore that fact and say that the system did not have goal G during time interval T (even though system S may verbally declare that it did have the goal, but did not expect it to have any result until later)?
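Here is the toy numerical illustration of point 3 promised above. The numbers are my own invention, nothing from the paper; the point is only that which of two hypothetical worlds "maximizes" happiness flips depending on which aggregate the observer picks.

# Two invented happiness distributions, rated on a 0-to-1 scale.
world_a = [0.70, 0.70, 0.70, 0.70, 0.70]   # everyone modestly happy, zero variance
world_b = [0.95, 0.95, 0.95, 0.95, 0.20]   # higher average, one miserable person

mean_a = round(sum(world_a) / len(world_a), 2)   # 0.7
mean_b = round(sum(world_b) / len(world_b), 2)   # 0.8
worst_a, worst_b = min(world_a), min(world_b)    # 0.7 vs 0.2

# Judged by the mean, world_b "maximizes" happiness (0.8 > 0.7); judged by the
# worst-off person, or by variance, world_a does.  The definition gives O no
# principled way to choose between these readings of "maximization".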

Now, if the subsequent analysis offered by Goertzel in this paper were to result in a clarification of all these subjective aspects of the definition, then we might hope that the subjectivity was on the way to being reduced or eliminated. It is okay, I think, to start off with some vagueness in your definition if the math that comes later is designed to eliminate that vagueness in a believable way.

But this is not what happens. Rather, the ideas embedded in the above definition (like what counts as a goal, and what counts as its maximization) are just left as primitives. The observer O is a crucial character in this paper, because O is supposed to be ... you and me. We are being asked to step into O's shoes and buy the idea that we all agree, roughly, on what it means for a system to have a goal.

But nothing could be further from the truth: primitive ideas like "goal" and "maximization" are a million miles away from being cast in objective terms, and so the mathematical analysis that comes later is built on nothing firmer than quicksand. The primitive terms in that opening definition beg so many questions that the later analysis cannot be said to go anywhere at all.

Now, to be fair, Goertzel is modest about how much he has achieved in this paper, saying quite honestly that "Essentially nothing has been resolved in the above discussion. What I hope is that I have raised some interesting questions."

However, he then goes on to say that "My central goal here has been to replace vague conceptual questions about goal preservation in self-modifying systems with semantically similar questions that are at least somewhat more precise". With this I respectfully but firmly disagree: I believe that he started with concepts that were so subjective and vague as to be of no use at all, then built a mathematical apparatus on top of that insecure foundation.

In my book the first thing you do is sort out your definitions. Then you do the math. Not the other way around.

Sadly, the very existence of the mathematical apparatus that Goertzel proposes will serve to disguise the fact that all of our attention, right now, should be directed at the insecure foundations. In the literature as a whole, the concept of a "goal" is bandied about as if everyone understood that this was a moderately well-defined concept. In fact it is anything but. It is all well and good to have philosophical discussions in which we take a kind of Turing-esque, hands-off approach and say that a goal is just what a "reasonably smart guy" would judge to be a goal, but this kind of philosophical handwaving is not going to cut the mustard if real systems need to be designed to do real intelligent things.

We still await a calculus of goals and motivation that is founded on basic concepts that are not defined in terms of subjective observers or homunculi.


Richard Loosemore





