Hi Richard,

> Frankly, looking at recent posts, I think this list is already dead.

Dear Richard, be patient, or post more about your own results.  I have, rightly 
or wrongly, somewhat modest expectations for the posts on this list (aside from 
my favorite authors :-) ).  I, like perhaps some other developers, am hard at 
work solving the numerous mundane, tedious, obscure problems that bedevil our 
designs, and do not have the time to respond to every provocative post.  When I 
am fortunate enough to report progress, I do so.

I've spent a couple of weeks revisiting an issue I thought solved (i.e. 
incremental fluid construction grammar syntax), in the hope of dramatically 
collapsing the number of required grammar rules by allowing optional 
sub-constituents.  This feature simplifies the task of the grammar author but 
makes my otherwise simple Java parsing/generation code harder to test and to 
understand.

I'll have more to say in an upcoming blog post - both in regard to the issue at 
hand and some observations on my own cognitive activities during this process.

To recapitulate: this list is not dead; some of its historical posters are 
just very busy.

Cheers and warm regards,
-Steve


Stephen L. Reed


Artificial Intelligence Researcher
http://texai.org/blog
http://texai.org
3008 Oak Crest Ave.
Austin, Texas, USA 78704
512.791.7860



----- Original Message ----
From: Richard Loosemore <[EMAIL PROTECTED]>
To: agi@v2.listbox.com
Sent: Friday, June 27, 2008 3:29:59 PM
Subject: Re: [agi] WHAT SORT OF HARDWARE $33K AND $850K BUYS TODAY FOR USE IN 
AGI


At a quick glance I would say you could do it more cheaply by building it 
yourself rather than buying Dell servers (cf. the MicroWulf project that was 
discussed before: http://www.clustermonkey.net//content/view/211/33/).

Secondly:  if what you need to get done is spreading activation (which 
implies massive parallelism), you would probably be better off with a 
Celoxica system than COTS servers:  celoxica.com.  Hugo de Garis has a 
good deal of experience with this hardware:  it is FPGA based, so 
the potential parallelism is huge.

Third:  the problem, in any case, is not the hardware.  AI researchers 
have been saying "if only we had better hardware, we could really get these 
algorithms to sing, and THEN we will have a real AI!" since the f***ing 
1970s, at least.  There is nothing on this earth more stupid than 
watching people repeat the same mistakes over and over again, for 
decades in a row.

Pardon my fury, but the problem is understanding HOW TO DO IT, and HOW 
TO BUILD THE TOOLS TO DO IT, not having expensive hardware.  So long as 
some people on this list repeat this mistake, this list will degenerate 
even further into obsolescence.

Frankly, looking at recent posts, I think this list is already dead.




Richard Loosemore






Ed Porter wrote:
> WHAT SORT OF HARDWARE $33K AND $850K BUYS TODAY FOR USE IN AGI
> 
> On Wednesday, June 25, US East Coast time, I had an interesting phone
> conversation with Dave Hart, where we discussed just how much hardware you
> could get for the current buck, for the amounts of money AGI research teams
> using OpenCog (THE LUCKY ONES) might have available to them.
> 
> After our talk I checked out the cost of current servers at Dell (the
> easiest place I knew of to check out prices).  I found hardware, and
> particularly memory, was somewhat cheaper than Dave and I had thought.  But
> it is still sufficiently expensive that moderately funded projects are
> going to be greatly limited by processor-memory and inter-processor
> bandwidth in how much spreading activation and inferencing they will be
> capable of doing.
> 
> A RACK MOUNTABLE SERVER WITH 4 QUAD-CORE XEONS, WITH EACH PROCESSOR HAVING
> 8MB OF CACHE, AND THE WHOLE SERVER HAVING 128GBYTES OF RAM AND FOUR 300GBYTE
> HARD DRIVES WAS UNDER $30K.  The memory stayed roughly constant in price per
> GByte going from 32 to 64 to 128 GBytes.  Of course you would probably have
> to pay a several extra grand for software and warranties.  SO LET US SAY THE
> PRICE IS $33K PER SERVER.
> 
> A 24 port 20Gbit/sec infiniband switch with cables and one 20Gbit/sec
> adapter card for each of 24 servers would be about $52K
> 
> SO A TOTAL SYSTEM WITH 24 SERVERS, 96 PROCESSORS, 384 CORES, 768MBYTES OF L2
> CACHE, 3 TBYTES OF RAM, 28.8TBYTES OF DISK, AND THE 24 PORT 20GBIT/SEC
> SWITCH WOULD BE ROUGHLY $850 GRAND.  
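
As a quick sanity check, the totals above can be worked out directly. This is an editorial sketch using only the figures quoted in this post (the $33K server and $52K switch prices are taken as given):

```python
# Worked version of the cost and capacity totals quoted above.
server_cost = 33_000           # ~$33K per server, software/warranty included
n_servers = 24
switch_cost = 52_000           # 24-port 20Gbit/sec switch, cables, adapters

total_cost = n_servers * server_cost + switch_cost   # 844,000 ~ "$850 grand"
cores = n_servers * 4 * 4      # 4 quad-core Xeons per server -> 384 cores
ram_gbytes = n_servers * 128   # -> 3,072 GBytes ~ 3 TBytes
l2_mbytes = n_servers * 4 * 8  # 8MB L2 per processor -> 768 MBytes
disk_tbytes = n_servers * 4 * 0.3  # four 300GB drives each -> 28.8 TBytes

print(total_cost, cores, ram_gbytes, l2_mbytes, disk_tbytes)
```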
> 
> That doesn't include air conditioning.  I am guessing each server probably
> draws about 400 watts, so 24 of them would draw about 9,600 watts --- about
> the amount of heat of ten hair dryers running in one room, which obviously
> would require some cooling, but I would not think it would be that expensive
> to handle.
> 
> With regard to performance, such systems are not even close to human brain
> level, but they should allow some interesting proofs of concept.
> 
> Performance
> ---------------------------------------
> AI spreading activation often involves a fair amount of non-locality of
> memory.  Unfortunately there is a real penalty for accessing RAM randomly.
> Without interleaving, one article I read recently implied about 50ns is a
> short latency for a memory access.  So we will assume 20M random RAM accesses
> (randomRamOpps) per second per channel, and that an average activation will
> take two accesses, a read and a write, so roughly 10M activations/sec per
> memory channel.  
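
The latency arithmetic can be sketched as follows (the 50ns figure is the post's own assumption, not a measured number):

```python
# Random-access arithmetic from the paragraph above.
latency_sec = 50e-9                      # assumed ~50ns per random RAM access
accesses_per_sec = 1 / latency_sec       # 20M randomRamOpps/sec per channel
accesses_per_activation = 2              # one read plus one write
activations_per_sec = accesses_per_sec / accesses_per_activation

print(accesses_per_sec, activations_per_sec)  # ~20M accesses, ~10M activations
```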
> 
> Matt Mahoney has pointed out that spreading activation can be modeled by
> matrix methods that let you access RAM at much higher sequential memory
> access rates.  He claimed he could process about a gigabyte of matrix
> data a second.  If one assumes each element in the matrix is 8 bytes, that
> would be the equivalent of doing 125M activations a second, which is roughly
> 12.5 times faster (if just 2 bytes, it would be 50 times faster, or 500M
> activations/sec).
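
The claimed equivalence checks out arithmetically; a sketch, taking Matt Mahoney's ~1 GByte/sec streaming figure and the 10M/sec per-channel random-access rate from above as given:

```python
stream_bytes_per_sec = 1e9          # ~1 GByte/sec of matrix data
random_acts_per_sec = 10e6          # random-RAM activations/sec per channel

for elem_bytes in (8, 2):
    acts = stream_bytes_per_sec / elem_bytes
    print(elem_bytes, acts, acts / random_acts_per_sec)
# 8-byte elements -> 125M activations/sec, 12.5x faster
# 2-byte elements -> 500M activations/sec, 50x faster
```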
> 
> If one assumes each of the 4 cores of each of the 4 processors could handle a
> matrix at 1GByte/sec, and each element in the matrix was just 2 bytes, that
> would be 8G 2-byte matrix activations/sec/server, and 192G matrix
> activations/sec/system.  It is not clear how well this could be made to work
> with the type of interconnectivity of an AGI.  It is clear there would be
> some penalty for sparseness, perhaps a large one.  If one used run-length
> encoding in the matrix, which is read by rows, then a set of columns whose
> values could fit in cache could be loaded into cache, and the portions of all
> the rows relating to them could be read sequentially.  Once all the portions
> of all the rows relating to that subset of columns had been processed, the
> process could be repeated for another set of columns whose values would be
> read into cache.   If this were done, one should be able to get largely
> sequential, high memory access rates, but presumably this would
> substantially increase the number of bytes per contact required in the
> matrix representation, and would slow processing.
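
The column-blocking scheme described here can be sketched roughly as below. This is an illustrative editorial sketch only; the function name, data layout, and blocking logic are hypothetical, not from the original post:

```python
def spread_activation_blocked(rows, activations, block_size):
    """Hypothetical sketch: process a row-major sparse matrix in column
    blocks small enough that each block's activations fit in cache, so
    each row's entries for the current block are scanned sequentially."""
    n_cols = len(activations)
    out = [0.0] * len(rows)
    for start in range(0, n_cols, block_size):
        end = min(start + block_size, n_cols)
        block = activations[start:end]        # "load" the block into cache
        for i, row in enumerate(rows):
            # row is a list of (column, weight) pairs sorted by column;
            # only entries falling inside this column block are applied
            for col, weight in row:
                if start <= col < end:
                    out[i] += weight * block[col - start]
    return out
```

A real implementation would index each row's entries per block to avoid rescanning, and would use run-length or compressed-sparse-row storage; the sketch only shows the cache-friendly access pattern.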
> 
> But it is not clear how much the speed would decrease as the sparseness
> of the matrix increases.  It is also not clear how effective such a method
> would be in the presence of inference-control mechanisms, which would
> presumably be filtering a high percentage of messaging, and thus dynamically
> varying the sparseness and its pattern.
> 
> Thus I think that, at least initially, when we are exploring different types
> of activation-spreading and inferencing patterns, thinking in terms of such
> matrix systems would be highly constraining, and a lot of attention should be
> paid to the limit on AGI computing power caused by the processor-memory and
> inter-processor bandwidth limitations.
> 
> I am assuming below that our system has the type of quad-core Xeons that Dave
> Hart told me about Wednesday night, which he said were coming out soon and
> which would have 4 separate memory channels for each processor.  I have
> assumed in my estimates below that each channel can do 20M random RAM
> accesses/sec (randomRamOpps).  (You might get a higher number of RAM opps by
> interleaving memory reads and writes on the same bus, but I don't know how
> much this can speed up randomRamOpps/sec.)  That would be 320M
> randomRamOpps/sec across the 16 channels on one server, and 7.68G
> randomRamOpps/sec for the whole system.  Divide those numbers in half for
> read-modify-writes to RAM.
> 
> If one assumes that L2 cache accesses require 7 clock cycles (which is what
> they did in a P4), and if one assumes that each processor's four cores could
> access cache without any decrease in speed due to contention (which they
> probably can't), then for a 1.8GHz Xeon that would be a max of (1.8GHz/7)x4
> = ~1G L2 cache accesses/sec per processor.  That would be an optimistic max
> of 4G L2 cache accesses/sec for each server, and an optimistic max of 96G L2
> cache accesses/sec for the whole 24 server system.
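
The cache arithmetic works out as below; note the post rounds the ~1.03G per-processor figure down to 1G, which is where the 4G and 96G totals come from:

```python
clock_hz = 1.8e9              # 1.8GHz Xeon
cycles_per_l2_access = 7      # assumed P4-like L2 latency
cores_per_processor = 4

per_processor = clock_hz / cycles_per_l2_access * cores_per_processor
per_server = per_processor * 4           # 4 processors per server
per_system = per_server * 24

print(per_processor, per_server, per_system)
# ~1.03G, ~4.1G, ~99G accesses/sec; rounded to 1G, 4G, 96G in the text
```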
> 
> If one assumes inter-server messages are 16-byte messages packed into
> 16KByte infiniband packets, that would allow up to 1K messages per packet
> (obviously, if spreading-activation messages are 32 bytes each, the number
> would be half).  If two machines are just sending messages as fast as they
> can between each other at 20Gbit/sec, each node could both send and receive
> about 42.3M such 16-byte sub-messages/sec.  Allowing for the possible
> contention of 24 machines sending to each other, and the fact that the
> desired message flow may not be regular, let us assume optimistically that
> we can get 20% of this messaging capacity, or 8.4M such 16-byte
> sub-msgs/sec.  Over the 24 servers that would be an average of roughly 200M
> inter-node 16-byte sub-messages/sec.  This inter-processor messaging rate is
> roughly 1/20th of the rate at which the system can perform random
> read-modify-writes to RAM.  It is probable that at least several such
> read-modify-writes will be involved in the sending and receiving of each
> such sub-msg, and often a message from one graph node in one server will
> activate multiple graph nodes in the server at which it is received, and
> most of these activations will probably require random read-modify-writes to
> RAM.  So the inter-server bandwidth in this system is probably roughly
> balanced with the number of random RAM accesses each server can perform.
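
Taking the post's 42.3M/sec peak figure as given (its derivation is not shown), the derived numbers are consistent; a sketch:

```python
peak_per_node = 42.3e6     # post's assumed peak 16-byte sub-msgs/sec per node
efficiency = 0.20          # optimistic share under 24-way contention
n_servers = 24

per_node = peak_per_node * efficiency    # ~8.46M sub-msgs/sec ("8.4M" above)
per_system = per_node * n_servers        # ~203M/sec ("roughly 200M" above)
rmw_system = 3.8e9                       # random read-modify-writes/sec

print(rmw_system / per_system)           # ~18.7, i.e. roughly 1/20th
```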
> 
> Below is a summary of these very rough, often quite optimistic, estimates of
> the power of such a ~$33K server and the ~$850K 24 server system --- all
> with the qualifications discussed above:  (SOME OF THESE ESTIMATES MAY BE 2
> TO 5 TIMES TOO HIGH)
> 
> 
> ==================================================
> ---FOR ONE ROUGHLY $33K SERVER
> -------- FOR ~$850K 24 NODE SYSTEM
> ==================================================
> ---4 quadcore processors, 16 cores
> ------ 96 QUADCORE PROCESSORS, 384 CORES 
> ---128GBytes RAM
> -------- 3TBYTES RAM 
> ---32MBytes of L2 cache
> -------- 768MBYTES of L2 CACHE
> ---20Gbits/sec inter-server bandwidth
> -------- 480GBITS/SEC INTER-SERVER BANDWIDTH
> ==================================================
> ---16GByte of matrix processing/sec
> -------- 384 GBYTES OF MATRIX PROCESSING/SEC
> ---8G 2byte matrix elements processed/sec
> -------- 192G 2BYTE MATRIX ELEMENTS PROCESSED/SEC  
> ---2G 8byte matrix elements processed/sec
> -------- 48G 8BYTE MATRIX ELEMENTS PROCESSED/SEC
> ---4G L2 cache accesses/sec (if no contention between cores)
> -------- 96G L2 CACHE ACCESSES/SEC (if no contention between cores)
> ---320M randomRamOpps/sec (cache line reads or writes)
> -------- 7.6G RANDOMRAMOPPS/SEC (cache line reads or writes)
> ---160M random cache line read-modify-writes/sec
> -------- 3.8G RANDOM CACHE LINE READ-MODIFY-WRITES/SEC
> ---8.4M 16Byte inter-server sub-msg/sec (~1/20 of random r-m-writes)
> -------- 200M 16BYTE INTER-SERVER SUB-MSG/SEC(~1/20 of random r-m-writes)
> //one msg to another server could activate all connections to graph nodes in
> that other server from the sending graph node, but the amount of such
> message multiplication is limited by the number of random cache line
> read-modify-writes.
> ==================================================
> 
> 
> The take-home from this is that for $33K you can get a machine with
> 128GBytes of RAM and something in the ballpark of 160M random cache
> line read-modify-writes/sec.  This should be enough to demonstrate, to those
> enlightened enough to understand AI concepts, the potential power of
> promising AGI architectures.  $33K is cheap enough that hopefully in a year
> or so 10s or 100s of grad-student projects will each be working with one or
> more such systems on AGI problems (hopefully, many with OpenCog).
> 
> For $850K you can get 3 terabytes of RAM.  That should be roughly enough to
> store as much information as the brain of a rat.  For example,
> http://faculty.washington.edu/chudler/facts.html states the cerebral cortex
> of a rat has a 6cm2 area.  Since the cortex has roughly 10^5 neurons/mm2,
> that's 6x10^7 neurons, and if you assume 10^4 synapses per neuron, that's
> 6x10^11 synapses.  3TBytes of RAM would allow an average of 5 bytes per
> synapse.  Even if you doubled or tripled the number of neurons to reflect
> neurons in other parts of the brain, the $850K 24 server system would have
> more than one byte per synapse (which is probably too low by one or two
> orders of magnitude if you are not using a matrix representation, but in
> the right ballpark). 
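
The rat-cortex storage estimate works out as follows, using only the figures given in this post:

```python
cortex_area_mm2 = 6 * 100          # 6 cm^2 of rat cerebral cortex
neuron_density = 1e5               # ~10^5 neurons/mm^2
synapses_per_neuron = 1e4

neurons = cortex_area_mm2 * neuron_density        # 6e7 neurons
synapses = neurons * synapses_per_neuron          # 6e11 synapses
ram_bytes = 3e12                                  # 3 TBytes of RAM

print(ram_bytes / synapses)                       # 5.0 bytes per synapse
```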
> 
> When you get to processing and communicating power, however, the picture
> is much bleaker.  If you assume each neuron fires on average once a
> second, a number I have read in some papers, that is 6x10^11 synapse
> activations/sec.  If you could do your activations at the matrix speeds
> indicated above, you might be roughly in this ballpark (but you
> might well be slowed down by one or two orders of magnitude by things such
> as the sparseness of interconnects and L2 cache access speeds).  
> 
> But if you are using random accessing of RAM to do activations, you are only
> going to get about 1/200th of this assumed rat brain speed.  
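
That ratio can be checked against the system-wide read-modify-write rate estimated earlier (3.8G/sec); it comes out near 1/160, the same rough order of magnitude as the 1/200th figure here:

```python
synapse_acts_per_sec = 6e11     # 6e11 synapses firing ~once per second
rmw_per_sec = 3.8e9             # system random cache-line read-modify-writes

fraction = rmw_per_sec / synapse_acts_per_sec
print(fraction)                 # ~0.0063, i.e. roughly 1/160
```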
> 
> (It should be noted, however, that some people claim only about 1% of
> synapses are actually functional, which if true would indicate that even
> using random accessing of RAM you should be able to roughly simulate a rat
> brain.)
> 
> (And of course, an AGI program running on the 24 server system would
> probably be dealing at a higher level of abstraction than most of the
> processing done in a rat's brain, such as at the word level, or at sensory
> levels where a lot of the lower-level inputs have been preprocessed by more
> efficient matrix or stream-computing methods.)
> 
> Thus, current AGI projects are going to be limited in the amount of
> spreading activation and inferencing they will be able to do with the types
> of hardware likely to be funded by typical academic projects.  Once the
> hardware industry starts selling hardware with much greater
> processor-memory and inter-processor bandwidth --- such as 64- to 256-core
> chips, with the cores connected by a high-bandwidth mesh network, and
> with each core connected by through-silicon vias to multiple memory layers
> above, providing fat buses between RAM and each core --- AGIs will be able
> to demonstrate much greater capability for a given amount of RAM.
> 
> But for those who can get enough funding for systems ranging from the $33K
> server up to the $850K 24 server system, these systems should be powerful
> enough to provide good testbeds for many AGI ideas.
> 
> Ed Porter
> 
> 
> 
> 
> 
> 
> -------------------------------------------
> agi
> Archives: http://www.listbox.com/member/archive/303/=now
> RSS Feed: http://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: http://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com


