When I refer to a quantity of information, I mean its algorithmic complexity: the size of the smallest program that generates it. So yes, the Mandelbrot set contains very little information. I realize that algorithmic complexity is not computable in general. When I express AI or language modeling in terms of compression, I mean that the goal is to get as close to this unreachable limit as possible.
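Although the limit itself is uncomputable, any compressor gives a computable upper bound on algorithmic complexity. A quick sketch in Python (the exact sizes printed depend on the zlib version and level, so treat them as rough):

```python
import os
import zlib

# A compressor's output size is a computable upper bound on algorithmic
# complexity. A megabyte of zero bytes has true complexity about log(n) + C,
# and zlib gets within a few KB of that; incompressible random bytes
# compress to roughly their own length.
zeros = bytes(1_000_000)        # highly regular: tiny true complexity
noise = os.urandom(1_000_000)   # random: complexity near its own length

print(len(zlib.compress(zeros)))  # a few hundred to a few thousand bytes
print(len(zlib.compress(noise)))  # close to 1,000,000
```

The gap between the two outputs is the point: compression ratio is a practical stand-in for the uncomputable quantity.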
Algorithmic complexity can apply to either finite or infinite strings. For example, the algorithmic complexity of a string of n zero bits is log n + C for some constant C that depends on your choice of universal Turing machine. The complexity of an infinite string of zero bits is a (small) constant C.

When I talk about Kauffman's assertion that complex systems evolve toward the boundary between stability and chaos, I mean a discrete approximation of these concepts. Stability and chaos are defined for dynamic systems in real vector spaces governed by differential equations. (Chaos in a continuous autonomous system requires at least 3 dimensions.) A system is chaotic if its largest Lyapunov exponent is greater than 0, and stable if it is less than 0. Extensions to discrete systems have been described. For example, the logistic map x := rx(1 - x), 0 < x < 1, goes from stable to chaotic as r grows from 0 to 4 (the onset of chaos is near r = 3.57). For discrete spaces, pseudorandom number generators are simple examples of chaotic systems.

Kauffman studied chaos in large discrete systems (state machines built from randomly connected logic gates) and found that they transition from stable to chaotic as the number of inputs per gate is increased from 2 to 3. At the boundary, the number of discrete attractors (repeating cycles) is about the square root of the number of variables. Kauffman noted that gene regulation can be modeled this way (gene combinations turn other genes on or off) and that the number of human cell types (254) is about the square root of the number of genes (he estimated 100K, but the actual figure is closer to 30K). I noted (coincidentally?) that vocabulary size is about the square root of the size of a language model.

The significance of this to AI is that I believe it bounds the degree of interconnectedness of knowledge: it cannot be so great that small updates to the AI result in large changes in behavior. This places limits on what we can build. For example, in a neural network with feedback loops, the weights would have to be kept small.
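The stable-to-chaotic transition of the logistic map can be checked numerically by estimating the Lyapunov exponent, the average of ln|f'(x)| along an orbit, where f'(x) = r(1 - 2x). A minimal sketch (the particular r values, starting point, and iteration counts are illustrative choices, not canonical):

```python
import math

def lyapunov(r, x0=0.4, burn_in=1000, n=10000):
    """Estimate the Lyapunov exponent of the logistic map x -> r*x*(1-x).

    Negative means nearby orbits converge (stable); positive means they
    diverge exponentially (chaotic)."""
    x = x0
    for _ in range(burn_in):               # discard the transient
        x = r * x * (1 - x)
    total = 0.0
    for _ in range(n):
        x = r * x * (1 - x)
        d = abs(r * (1 - 2 * x))           # |f'(x)|
        total += math.log(max(d, 1e-300))  # guard against log(0)
    return total / n

print(lyapunov(2.5))  # negative: orbit settles onto a fixed point
print(lyapunov(4.0))  # positive, near ln 2: fully chaotic
```

Sweeping r from 0 to 4 with this function reproduces the transition described above, including the periodic windows inside the chaotic region.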
We should not confuse symbols with meaning. A language model associates patterns of symbols with other patterns of symbols. It is not grounded. A model does not need vision to know that the sky is blue; they are just words. I believe that an ungrounded model (plus a discourse model, which has a sense of time and of who is speaking) can pass the Turing test.

I don't believe all of the conditions are in place for a hard takeoff yet. You need:

1. Self-replicating computers.
2. AI smart enough to write programs from natural language specifications.
3. Enough hardware on the Internet to support AGI.
4. Execute access.

On each point: 1. Computer manufacturing depends heavily on computer automation, but you still need humans to make it all work. 2. AI language models are now at the level of a toddler, able to recognize simple sentences of a few words, but they can already learn in hours or days what takes a human years. 3. I estimate an adult-level language model will fit on a PC, but it would take 3 years to train it. A massively parallel architecture like Google's MapReduce could do it in an hour, but it would require a high speed network. A distributed implementation like GIMPS or SETI@home would not have enough interconnection speed to support a language model; I think you need about a 1 Gb/s connection with low latency to distribute it over a few hundred PCs. 4. Execute access is one buffer overflow away.

-- Matt Mahoney, [EMAIL PROTECTED]

----- Original Message ----
From: Mike Dougherty <[EMAIL PROTECTED]>
To: agi@v2.listbox.com
Sent: Saturday, November 18, 2006 1:32:05 AM
Subject: Re: [agi] A question on the symbol-system hypothesis

I'm not sure I follow every twist in this thread. No... I'm sure I don't follow every twist in this thread. I have a question about this compression concept. Compute the number of pixels required to graph the Mandelbrot set at whatever detail you feel to be sufficient for the sake of example. Now describe how this 'pattern' is compressed.
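The 3-years-on-a-PC versus 1-hour-on-a-cluster estimate implies a required speedup that is easy to work out. A quick arithmetic check (the 3-year and 1-hour figures are the post's assumptions, not measurements):

```python
# Rough arithmetic behind the training-time estimates above; the 3-year
# and 1-hour figures are assumptions from the discussion, not benchmarks.
hours_per_year = 365 * 24
pc_hours = 3 * hours_per_year   # ~3 years on a single PC
cluster_hours = 1               # ~1 hour on a MapReduce-style cluster

speedup = pc_hours / cluster_hours
print(f"required parallel speedup: ~{speedup:,.0f}x")  # ~26,280x
```

A speedup of that order needs tens of thousands of machines' worth of compute, which is why the interconnect (rather than raw CPU) becomes the binding constraint for a loosely coupled network like GIMPS or SETI@home.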
Of course the ideal compression is something like 6 bytes. Show me a 6-byte jpg of a Mandelbrot :)

Is there a concept of compression of an infinite series? Or was the term "bounding" being used to describe the attractor around which the values tend to fall? Chaotic attractor, statistical median, etc. -- they seem to describe the same tendency of human pattern recognition across different types of data.

Is a 'symbol' an idea, or a handle on an idea? Does this support the mechanics of how concepts can be built from agreed-upon ideas to make a new token we can exchange in communication that represents the sum of the constituent ideas? If this symbol-building process is used to communicate ideas across a highly volatile link (from me to you), then how would these symbols be used by a single computation machine? (Is that a hard takeoff situation, where the near-zero latency turns into an exponential increase in symbol complexity per unit time?)

If you could provide some feedback as a reality check on these thoughts, I'd appreciate the clarification... thanks.

This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=303