In response to Ben's paper on goal preservation, I think that identifying attractors or fixed points requires that we identify sources of goal drift. Here are some:
- Information loss. Caused by irreversible operations, for example the assignment statement. The algorithmic complexity of a machine's state (such as a set of self-improving processes) cannot increase over time without input. This suggests that the state of having no goals is an attractor.
- Software errors. An agent may have the goal of preserving its goals but be unsuccessful due to programming errors. Humans make errors, and there is no reason to believe that superintelligent beings will be different. Software verification reduces to the halting problem, which suggests that hard-to-detect bugs will accumulate.
- Deliberate modification. Friendliness is algorithmically complex, so it will require human maintenance. Many situations will arise that were not planned for. For example: is it ethical to copy a human and kill the original? Is it ethical to turn off, or simulate pain in, an AI that has some human traits but is not entirely human? Is it ethical to allow wireheading? This suggests a dynamic toward increasing algorithmic complexity, much like our legal system.
- Modification through learning. We could allow the AI to make ethical decisions for us on the premise that it is smarter than humans and will therefore make more intelligent decisions than we could. But by the same premise, we are not intelligent enough to predict where this dynamic will go.
- Evolution. Some goals may be harmful to the AI itself. Selective pressure will favor rapid reproduction and acquisition of computing resources.
- Noise. This results from copying errors. In living organisms there is an equilibrium between information gained through selection (about log n bits per birth/death, where n is the average number of children produced) and information lost through mutation, which limits the algorithmic complexity of the genome of higher organisms (like humans) to around 10^7 bits.
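To make the information-loss point concrete, here is a minimal sketch (my own illustration, not from the post) of the counting argument: an irreversible operation like assignment maps many machine states to one, so the number of distinct reachable states, and hence the information in the state, can only shrink without input.

```python
def assign_x_to_5(state):
    """Irreversible: whatever x was before, it is 5 afterward."""
    new_state = dict(state)
    new_state["x"] = 5
    return new_state

# Enumerate all states where x and y each take values 0..3.
states = [{"x": x, "y": y} for x in range(4) for y in range(4)]

# Count distinct states before and after the irreversible operation.
before = len({tuple(sorted(s.items())) for s in states})
after = len({tuple(sorted(assign_x_to_5(s).items())) for s in states})

print(before, after)  # 16 distinct states collapse to 4 (4 bits -> 2 bits)
```

No deterministic operation can raise these counts; only new input can, which is the sense in which a closed self-improving system drifts toward simpler (ultimately empty) goal states.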
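And a back-of-envelope check of the selection-vs-mutation equilibrium in the last bullet. The value of n here is my own assumption for illustration; the 10^7-bit ceiling is the figure claimed in the post.

```python
import math

n = 4                         # assumed average number of children (illustrative)
bits_per_death = math.log2(n) # selection gains about log2(n) bits per birth/death
genome_bits = 1e7             # claimed complexity ceiling for higher organisms

# At equilibrium, mutation destroys about as many bits per generation as
# selection adds, so the per-bit mutation rate the genome can sustain is roughly:
sustainable_rate = bits_per_death / genome_bits

print(bits_per_death, sustainable_rate)  # 2.0 bits per death, 2e-07 per bit
```

The point is that the equilibrium genome size scales with (bits gained per selection event) / (mutation rate per bit), so noisier copying forces simpler genomes, and by analogy, noisy self-copying AIs would face the same ceiling on goal complexity.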
There may be other forces I missed. To make this concrete, imagine a simple self-improving system, such as a data compressor that "wants" to output smaller files, or a linear regression program that "wants" to reduce RMS error. Describe an environment in which the program could rewrite itself (or modify its copies). How would its goals drift, and where would they settle?

-- Matt Mahoney, [EMAIL PROTECTED]