ED>>>>> Matt Mahoney's November 27, 2007 9:56 PM post, copied below, had much more informative content than most. As I understood it, it was a strong rebuttal to those who entertain the fantasy that the ">...human's brain computational power is about the same as of modern PC." and that "> AGI software is the missing part of AGI, not hardware."
As much as I agreed with Matt's rebuttal of this notion from the past, I did have some questions and comments concerning it, as reflected in the selected quotes below and my comments following them.

MATT>>>>> And some of the Blue Brain research suggests it is even worse. A mouse cortical column of 10^5 neurons is about 10% connected, but the neurons are arranged such that connections can be formed between any pair of neurons. Extending this idea to the human brain, with 10^6 columns of 10^5 neurons each, each column should be modeled as a 10^5 by 10^5 sparse matrix, 10% filled. This model requires about 10^16 bits.

ED>>>>> I must admit, I have never heard a cortical column described as containing 10^5 neurons. The figure I have commonly seen is 10^2 neurons for a cortical column, although my understanding is that the actual number could be either less or more. I guess the 10^5 figure would relate to so-called hypercolumns.

It would appear that software systems in which nodes are connected by pointer-based links are quite capable of efficiently representing connections in a high-dimensional sparse space where each node must be able to connect to every other node.

It seems to me one of the key unknowns for an AGI designed to handle human-level world knowledge is how large and how interconnected world knowledge is, and how many models are required to represent and compute that knowledge most efficiently. The range I normally think in is between 10 and 100 terabytes (i.e., 10^14 to 10^15 bits, which is not so far from the 10^16 number Matt suggested), a sizable fraction of which should be in RAM at any one time. I budget roughly 100 bytes to represent an atom (this includes RAM to represent state variables), so 10 TBytes would allow 100 billion atoms, or an average of 100 atoms/sec over a 31-year (1 billion second) period, a period long enough to learn much of world knowledge. 100 TBytes would allow an average of 1000 atoms/sec over that time.
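A back-of-envelope sketch of the arithmetic above. All numbers are the post's own estimates; the ~10 bits per synapse needed to get from Matt's 10^15 filled connections to his 10^16-bit figure is my inference, not something stated in the post.

```python
# Matt's column model: 10^6 columns, each a 10^5 x 10^5 potential
# connection matrix, 10% filled.
neurons_per_column = 10**5
columns = 10**6
synapses = columns * neurons_per_column**2 // 10   # 10% of potential
bits = synapses * 10                               # assumed ~10 bits/synapse

# Ed's atom budget: 100 bytes per atom in a 10 TByte store,
# accumulated over ~1 billion seconds (about 31 years).
store_bytes = 10 * 10**12
atoms = store_bytes // 100
atoms_per_sec = atoms / 10**9

print(f"synapses ~ {synapses:.0e}, model ~ {bits:.0e} bits")
print(f"atoms ~ {atoms:.0e}, average rate ~ {atoms_per_sec:.0f} atoms/sec")
```

Run as written, this reproduces the 10^15-synapse, 10^16-bit, and 100-atoms/sec figures quoted in the text.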
Intuitively this seems like it would allow human-level world knowledge to be represented. Of course, over time a larger and larger percentage of the representation would be dedicated to models derived from episodic experience rather than to the episodic memories themselves. But as the number of models of patterns grows, the efficiency with which experiences can be represented grows, much as in the text compression field Matt's email discusses at its end. Some of the key tuning parameters in any large AGI will be those that help determine which models are important enough to be formed and kept in existence, since there is a combinatorial explosion of possible models.

MATT>>>>> Perhaps there are ways to optimize neural networks by taking advantage of the reliability of digital hardware, but over the last few decades researchers have not found any. Approaches that reduce the number of neurons or synapses, such as connectionist systems and various weighted graphs, just haven't scaled well. Yes, I know Novamente and NARS fall into this category.

ED>>>>> I have said multiple times on this list that I believe we will need hardware within several orders of magnitude of the representational, computational, and interconnect capacity of the human brain to do what the human brain does (although obviously there are many tasks, such as sequential arithmetic, at which even simple computers are millions of times faster). So I am sympathetic with the basic thrust of the above paragraph. But it seems to me that there should be some reduction in the number of nodes and links required to model the human brain, because of factors such as these: digital memory is more crisp; digital logic is much faster; and there are many functions in the brain that would not have to be modeled. I think these factors should allow a reduction of at least one or two orders of magnitude in model complexity.
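A rough consistency check, using only numbers from the discussion above: a one-to-two order-of-magnitude reduction of Matt's ~10^16-bit brain model lands inside the 10-100 TB world-knowledge range mentioned earlier.

```python
# Shrink the ~10^16-bit model by 10x and 100x and express the result
# in terabytes (1 TB taken as 10^12 bytes).
brain_model_bits = 10**16
for reduction in (10, 100):
    bits = brain_model_bits // reduction
    terabytes = bits / 8 / 10**12      # bits -> bytes -> TB
    print(f"{reduction}x smaller: 10^{len(str(bits)) - 1} bits, "
          f"about {terabytes:g} TB")
```

The 100x case gives roughly 12.5 TB and the 10x case roughly 125 TB, bracketing the 10-100 TB range almost exactly.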
All of this assumes a new type of hardware optimized for massively parallel computing, with very high processor-to-memory and interconnect bandwidth (hardware that could well be possible in 5 years). So although big hardware is key to human-level AGI, to the best of my knowledge there is currently no funding for AGI research on such hardware.

That is why I am very interested in seeing what sort of rabbit Ben Goertzel and his Novamente clan can pull out of a hat. By that I mean what sort of AGI they can pull out of relatively small hardware in their current relatively modest, but possibly important, AGI projects. The amount of world knowledge they will be able to represent and compute per second will be quite small, but hopefully they can create impressive proofs of concept that show the potential of AGI on such little machines. This could be very valuable in creating a new world of vastly sub-human, but nevertheless useful, AGIs. And hopefully this will encourage funding for AGI on much larger machines.

Ed Porter

-----Original Message-----
From: Matt Mahoney [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 27, 2007 9:56 PM
To: agi@v2.listbox.com
Subject: Re: Re[10]: [agi] Funding AGI research

--- Dennis Gorelik <[EMAIL PROTECTED]> wrote:
> Matt,
>
> > --- Dennis Gorelik <[EMAIL PROTECTED]> wrote:
> >> Could you describe a piece of technology that simultaneously:
> >> - Is required for AGI.
> >> - Cannot be required part of any useful narrow AI.
> >
> > A one million CPU cluster.
>
> Are you claiming that computational power of human brain is equivalent
> to one million CPU cluster?
>
> My feeling is that human's brain computational power is about the same
> as of modern PC.
>
> AGI software is the missing part of AGI, not hardware.

We don't know that. What we do know is that people have historically underestimated the difficulty of AI since about 1950.
Our approach has always been to design algorithms that push the limits of whatever hardware capacity was available at the time. At every point in history we seem to have the hindsight to realize that past attempts have failed for lack of computing power, but not the foresight to realize when we are still in the same situation.

If AGI is possible with one millionth of the computing power of the human brain, then:
1. Why didn't we evolve insect sized brains?
2. Why aren't insects as smart as we are?
3. Why aren't our computers as smart as insects?

With regard to 1, the human brain accounts for most of our resting metabolism. It uses more power than any other organ except the muscles during exercise.

One of the arguments that AGI is possible on a PC is from information theory. Humans learn language from the equivalent of about 1 GB of training data (or 10^9 bits compressed). Turing argued in 1950 that a learning algorithm running on a computer with 10^9 bits of memory and educated like a child should pass the imitation game. Likewise, Landauer estimated human long term memory capacity to be 10^9 bits. Yet a human brain has 10^11 neurons and 10^15 synapses. Why?

And some of the Blue Brain research suggests it is even worse. A mouse cortical column of 10^5 neurons is about 10% connected, but the neurons are arranged such that connections can be formed between any pair of neurons. Extending this idea to the human brain, with 10^6 columns of 10^5 neurons each, each column should be modeled as a 10^5 by 10^5 sparse matrix, 10% filled. This model requires about 10^16 bits.

Perhaps there are ways to optimize neural networks by taking advantage of the reliability of digital hardware, but over the last few decades researchers have not found any. Approaches that reduce the number of neurons or synapses, such as connectionist systems and various weighted graphs, just haven't scaled well. Yes, I know Novamente and NARS fall into this category.
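[An aside on the arithmetic in the quoted paragraphs above, using only the message's own figures: if each synapse stored even a single bit, raw brain capacity would exceed Landauer's ~10^9-bit estimate of long-term memory by about six orders of magnitude, which is the puzzle being posed.]

```python
import math

landauer_bits = 10**9   # Landauer's long-term memory estimate, as quoted
synapses = 10**15       # synapse count cited in the message

# Orders of magnitude between raw capacity (at >= 1 bit/synapse)
# and the learned-knowledge estimate.
gap_orders = math.log10(synapses / landauer_bits)
print(f"capacity vs. learned knowledge: ~10^{gap_orders:.0f} gap")
```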
For narrow AI applications, we can usually find better algorithms than neural networks, for example for arithmetic, deductive logic, or playing chess. But none of these other algorithms are so broadly applicable to so many different domains, such as language, speech, vision, robotics, etc.

My work in text compression (an AI problem) is an attempt to answer the question by measuring trends in intelligence (compression) as a function of CPU and memory. The best algorithms model mostly at the lexical level (the level of a 1 year old child) with only a crude model of semantics and no syntax. Memory is so tightly constrained (at 2 GB) that modeling at a higher level is mostly pointless. The slope of the compression surface in speed/memory space is steep along the memory axis.

-- Matt Mahoney, [EMAIL PROTECTED]

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=69425067-483d7c