My comments on Richard's post of Friday, June 27, 2008 4:30 PM

>RICHARD LOOSEMORE========>
At a quick glance I would say you could do it cheaper by building it 
yourself rather than buying Dell servers (cf MicroWulf project that was 
discussed before: http://www.clustermonkey.net//content/view/211/33/).

ED PORTER========>
I was quoting a mainstream price that would be relevant to software types
who don't want to go to the trouble of building and debugging their own
hardware, and who want to buy from a company that can be trusted to stand
behind it.  I was assuming most readers would understand that cheaper
prices are available.

>RICHARD LOOSEMORE========>
Secondly:  if what you need to get done is spreading activation (which 
implies massive parallelism) you would probably be better off with a 
Celoxica system than COTS servers:  celoxica.com.  Hugo de Garis has a 
good deal of experience with using this hardware:  it is FPGA based, so 
the potential parallelism is huge.

ED PORTER========>
I have considered the use of FPGAs for AI.  But FPGAs are not cheap.
Correct me if I am wrong, but I would assume the tools for programming them,
the available software, and the available programming expertise for
developing a system with FPGAs are not as good or as common as for more
traditional hardware.

Again, correct me if I am wrong, but I think at this stage, in which most
AGI software architectures are still works in progress, there would be a
considerable advantage to working with a more traditional computing platform
that has more tools and more flexibility.

But once our AGI architectures are better understood, it might well make
sense to move them to other types of hardware.  As I said, much more highly
concurrent hardware is expected to arrive in about five years.

>RICHARD LOOSEMORE========>
Third:  the problem, in any case, is not the hardware.  AI researchers 
have been saying "if only we had better hardware, we could really get these 
algorithms to sing, and THEN we will have a real AI!" since the f***ing 
1970s, at least.  There is nothing on this earth more stupid than 
watching people repeat the same mistakes over and over again, for 
decades in a row.

ED PORTER========>
I don't know about you, Richard, but for me "the f***ing 1970's" brings
back fond memories.

Richard, you should be intelligent enough to understand that the fact that
people have been saying for thirty-some years that the lack of sufficiently
powerful hardware is a major obstacle to developing human-like AGI does not,
in any way, disprove that statement --- since during those years, few if any
AI researchers have had access to anything close to the level of hardware
necessary.  To paraphrase your language --- there is "nothing on this earth
more stupid" than thinking it does.

I do not claim the software architecture for AGI has been totally solved.
But I believe enough good AGI approaches exist (and I think Novamente is
one) that when powerful hardware is available to more people, we will be
able to get systems up and running relatively quickly that demonstrate the
parts of the problem we have solved.  And that will provide valuable
insights and test beds for solving the parts of the problem we have not yet
solved.


>RICHARD LOOSEMORE========>
Pardon my fury, but the problem is understanding HOW TO DO IT, and HOW 
TO BUILD THE TOOLS TO DO IT, not having expensive hardware.  So long as 
some people on this list repeat this mistake, this list will degenerate 
even further into obsolescence.

ED PORTER========>
"Fury," Richard?  Isn't that a little extreme?

With regard to your statement "the problem is understanding HOW TO DO IT"
---
WE DO UNDERSTAND HOW TO DO IT --- NOT ALL OF IT --- AND NOT HOW TO MAKE IT
ALL WORK TOGETHER WELL AUTOMATICALLY --- BUT --- GIVEN THE TYPE OF HARDWARE
EXPECTED TO COST LESS THAN $3M IN SIX YEARS --- WE KNOW HOW TO BUILD MUCH OF
IT --- ENOUGH THAT WE COULD BUILD EXTREMELY VALUABLE SYSTEMS WITH OUR
CURRENT UNDERSTANDING.

If Ben could get the funding he wants for the roughly twenty-some man-years
of work he thinks is needed --- and if he had, say, the $850K system
referred to in my email below to work with --- I think he could demonstrate
some very impressive generalized learning, perceiving, problem solving,
behavior learning and adapting, NL, attention focusing, and inferencing
capabilities --- capabilities that would show how close we actually are to
understanding how to make human-level AGIs.

Obviously that is my belief, and there is no reason why you have to believe
it, but I believe it after a lot of deep thinking.  I can't think of any
aspect of human thought that a Novamente-like system could not handle.
When, several months ago on this list, I asked what hard problems were left
in AGI, virtually none were mentioned that a Novamente-like architecture
would not seem capable of solving.

With regard to tools, one thing that would help AGI development is tools
for developing code efficiently on massively parallel systems.

If anything, the problem right now is the confusion of possible approaches
to many of the problems.  More cheap hardware will allow more of them to be
tested on systems of the necessary complexity, and the better ones to become
more widely accepted.

>RICHARD LOOSEMORE========>
Frankly, looking at recent posts, I think this list is already dead.

ED PORTER========>
Richard, if the list is so dead of late, how come you have posted to it so
often recently?  





-----Original Message-----
From: Richard Loosemore [mailto:[EMAIL PROTECTED] 
Sent: Friday, June 27, 2008 4:30 PM
To: agi@v2.listbox.com
Subject: Re: [agi] WHAT SORT OF HARDWARE $33K AND $850K BUYS TODAY FOR USE
IN AGI






Richard Loosemore






Ed Porter wrote:
> WHAT SORT OF HARDWARE $33K AND $850K BUYS TODAY FOR USE IN AGI
> 
> On Wednesday, June 25, US East Coast time, I had an interesting phone
> conversation with Dave Hart, where we discussed just how much hardware you
> could get for the current buck, for the amounts of money AGI research
> teams using OpenCog (THE LUCKY ONES) might have available to them.
> 
> After our talk I checked out the cost of current servers at Dell (the
> easiest place I knew of to check out prices).  I found hardware, and
> particularly memory, was somewhat cheaper than Dave and I had thought.
> But it is still sufficiently expensive that moderately funded projects are
> going to be greatly limited by processor-memory and inter-processor
> bandwidth in how much spreading activation and inferencing they will be
> capable of doing.
> 
> A RACK-MOUNTABLE SERVER WITH 4 QUAD-CORE XEONS, WITH EACH PROCESSOR
> HAVING 8MB OF CACHE, AND THE WHOLE SERVER HAVING 128GBYTES OF RAM AND FOUR
> 300GBYTE HARD DRIVES, WAS UNDER $30K.  The memory stayed roughly constant
> in price per GByte going from 32 to 64 to 128 GBytes.  Of course you would
> probably have to pay several extra grand for software and warranties.  SO
> LET US SAY THE PRICE IS $33K PER SERVER.
> 
> A 24-port 20Gbit/sec InfiniBand switch, with cables and one 20Gbit/sec
> adapter card for each of 24 servers, would be about $52K.
> 
> SO A TOTAL SYSTEM WITH 24 SERVERS, 96 PROCESSORS, 384 CORES, 768MBYTES OF
> L2 CACHE, 3 TBYTES OF RAM, 28.8TBYTES OF DISK, AND THE 24-PORT 20GBIT/SEC
> SWITCH WOULD BE ROUGHLY $850K.
> 
> That doesn't include air conditioning.  I am guessing each server probably
> draws about 400 watts, so 24 of them would be about 9600 watts --- about
> the amount of heat of ten hair dryers running in one room, which obviously
> would require some cooling, but I would not think would be that expensive
> to handle.
> 
> With regard to performance, such systems are not even close to human-brain
> level, but they should allow some interesting proofs of concept.
> 
> Performance
> ---------------------------------------
> AI spreading activation often involves a fair amount of non-locality of
> memory.  Unfortunately there is a real penalty for accessing RAM randomly.
> Without interleaving, one article I read recently implied about 50ns is a
> short latency for a memory access.  So we will assume 20M random RAM
> accesses (randomRamOpps) per second per channel, and that an average
> activation will take two --- a read and a write --- so roughly 10M
> activations/sec per memory channel.
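The arithmetic above can be sketched directly.  This is a back-of-envelope calculation, not a benchmark; the 50ns latency and the two-accesses-per-activation figure are the assumptions stated in the paragraph:

```python
# Random-access spreading-activation throughput per memory channel,
# using the figures from the text: ~50 ns per random RAM access, and
# two accesses (a read and a write) per activation.
RANDOM_ACCESS_LATENCY_S = 50e-9   # ~50 ns per random access
ACCESSES_PER_ACTIVATION = 2       # read + write

accesses_per_sec = 1 / RANDOM_ACCESS_LATENCY_S                    # ~20M/sec
activations_per_sec = accesses_per_sec / ACCESSES_PER_ACTIVATION  # ~10M/sec

print(f"{accesses_per_sec:.1e} random RAM accesses/sec/channel")
print(f"{activations_per_sec:.1e} activations/sec/channel")
```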
> 
> Matt Mahoney has pointed out that spreading activation can be modeled by
> matrix methods that let you access RAM at much higher sequential memory
> accessing rates.  He claimed he could process about a gigabyte of matrix
> data a second.  If one assumes each element in the matrix is 8 bytes, that
> would be the equivalent of doing 125M activations a second, which is
> roughly 12.5 times faster (if just 2 bytes, it would be 50 times faster,
> or 500M activations/sec).
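The matrix-versus-random-access comparison works out as follows.  The 1 GByte/sec and 10M-activations/sec figures are taken from the text, not measured:

```python
# Matrix-method throughput vs random-access throughput, per the text:
# ~1 GB/s of sequential matrix data vs ~10M random activations/sec.
MATRIX_BYTES_PER_SEC = 1e9
RANDOM_ACTIVATIONS_PER_SEC = 10e6

for elem_bytes in (8, 2):
    matrix_acts = MATRIX_BYTES_PER_SEC / elem_bytes
    speedup = matrix_acts / RANDOM_ACTIVATIONS_PER_SEC
    # prints a 12.5x speedup for 8-byte elements and 50x for 2-byte elements
    print(f"{elem_bytes}-byte elements: {matrix_acts:.3g} acts/sec, {speedup}x")
```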
> 
> If one assumes each of the 4 cores of each of the 4 processors could
> handle a matrix at 1GByte/sec, and each element in the matrix was just 2
> bytes, that would be 8G 2-byte matrix activations/sec/server, and 192G
> matrix activations/sec/system.  It is not clear how well this could be
> made to work with the type of interconnectivity of an AGI.  It is clear
> there would be some penalty for sparseness, perhaps a large one.  If one
> used run-length encoding in the matrix, which is read by rows, then a set
> of columns whose values could fit in cache could be loaded into cache, and
> the portions of all the rows relating to them could be read sequentially.
> Once all the portions of all the rows relating to that subset of columns
> had been processed, the process could be repeated for another set of
> columns whose values would be read into cache.  If this were done, one
> should be able to get largely sequential, high-rate memory accesses, but
> presumably this would substantially increase the number of bytes per
> contact required in the matrix representation, and would slow processing.
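The column-blocked scheme described above can be sketched as follows.  This is an illustrative toy (list-of-pairs sparse rows rather than a real run-length-encoded layout) that shows only the access pattern: the activation values for one block of columns are notionally held in cache while each row's entries for that block are streamed in order:

```python
# Sketch of column-blocked spreading activation: process the sparse
# connection matrix one column block at a time, so the source activations
# for that block fit in cache, and the row segments stream sequentially.

def blocked_spread(rows, activations, n_cols, block_size):
    """rows: one sparse row per target node, each a col-sorted list of
    (source_col, weight) pairs.  Returns the new input to each target."""
    out = [0.0] * len(rows)
    for start in range(0, n_cols, block_size):
        end = start + block_size          # this block's activations are "in cache"
        for i, row in enumerate(rows):
            # stream only the entries of this row that fall inside the block
            for col, w in row:
                if start <= col < end:
                    out[i] += w * activations[col]
    return out

rows = [[(0, 1.0), (3, 2.0)], [(1, 0.5)], [(0, 1.0), (2, 1.0), (3, 1.0)]]
acts = [1.0, 2.0, 3.0, 4.0]
print(blocked_spread(rows, acts, n_cols=4, block_size=2))  # [9.0, 1.0, 8.0]
```

The same answer falls out regardless of block size; only the memory-access order changes, which is the whole point of the blocking.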
> 
> But it is not clear how much the speed would decrease as the sparseness of
> the matrix increases.  It is also not clear how effective such methods
> would be in the presence of inference control mechanisms, which would
> presumably be filtering a high percentage of messaging, and thus
> dynamically varying the sparseness and its pattern.
> 
> Thus I think, at least initially, when we are exploring different types of
> activation-spreading and inferencing patterns, thinking in terms of such
> matrix systems would be highly constraining, and a lot of attention should
> be paid to the limits on AGI computing power caused by processor-memory
> and inter-processor bandwidth limitations.
> 
> I am assuming below that our system has the type of quad-core Xeons that
> Dave Hart told me about Wednesday night, which he said were coming out
> soon and which would have 4 separate memory channels for each processor.
> I have assumed in my estimates below that each channel can do 20M random
> RAM accesses/sec (randomRamOpps).  (You might get a higher number of RAM
> ops by interleaving memory reads and writes on the same bus, but I don't
> know how much this can speed up randomRamOpps/sec.)  That would be 320M
> randomRamOpps/sec across the 16 channels of one server, and 7.68G
> randomRamOpps/sec for the whole system.  Divide those numbers in half for
> read-modify-writes to RAM.
> 
> If one assumes that L2 cache accesses require 7 clock cycles (which is
> what they did in a P4), and if one assumes that each processor's four
> cores could access cache without any decrease in speed because of
> contention (which they probably can't), then for a 1.8GHz Xeon, that would
> be a max of (1.8GHz/7)x4 = ~1G L2 cache accesses/sec per processor.  That
> would be an optimistic max of 4G L2 cache accesses/sec for each server,
> and an optimistic max of 96G L2 cache accesses/sec for the whole 24-server
> system.
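Under the stated assumptions (7 cycles per L2 access, a 1.8GHz clock, and four contention-free cores per processor), the arithmetic is:

```python
# Optimistic L2-cache throughput estimate, using the figures from the text.
CLOCK_HZ = 1.8e9
CYCLES_PER_L2_ACCESS = 7       # P4-era figure quoted in the text
CORES_PER_PROC = 4             # assumed contention-free, which is optimistic
PROCS_PER_SERVER = 4
SERVERS = 24

per_proc = CLOCK_HZ / CYCLES_PER_L2_ACCESS * CORES_PER_PROC  # ~1.03e9
per_server = per_proc * PROCS_PER_SERVER                     # ~4.1e9
per_system = per_server * SERVERS                            # ~99e9

print(f"{per_proc:.2e} L2 accesses/sec/processor")
print(f"{per_server:.2e} /server, {per_system:.2e} /system")
```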
> 
> If one assumes inter-server messages are 16-byte messages that are packed
> into 16KByte InfiniBand packets, that would allow up to 1K messages per
> packet (obviously, if spreading-activation messages are 32 bytes each, the
> number would be half).  If two machines are just sending messages as fast
> as they can between each other at 20Gbit/sec, that would allow each node
> to both send and receive about 42.3M such 16-byte sub-messages/sec.  Given
> the possible contention of 24 machines sending to each other, and the fact
> that the desired message flow may not be regular, let us assume
> optimistically that we can get 20% of this messaging capacity, or 8.4M
> such 16-byte sub-msgs/sec.  Over the 24 servers that would be an average
> of roughly 200M inter-node 16-byte sub-messages/sec.  This interprocessor
> messaging rate is roughly 1/20th of the rate at which the system can
> perform random read-modify-writes to RAM.  It is probable that at least
> several such read-modify-writes will be involved in the sending and
> receiving of each such sub-msg, and often a message from one graph node in
> one server will activate multiple graph nodes in the server at which it is
> received, and most of these activations will probably require random
> read-modify-writes to RAM.  So the inter-server bandwidth in this system
> is probably roughly balanced with the number of random RAM accesses each
> server can perform.
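The messaging budget above, stated as a calculation.  The 42.3M point-to-point rate and the 20% contention derating are taken from the text as given, not re-derived:

```python
# Inter-server messaging budget, using the figures quoted in the text.
MSG_BYTES = 16
PACKET_BYTES = 16 * 1024
POINT_TO_POINT_MSGS_PER_SEC = 42.3e6   # figure from the text
CONTENTION_DERATING = 0.20             # assume only 20% usable under load
SERVERS = 24

msgs_per_packet = PACKET_BYTES // MSG_BYTES                      # 1024
per_server = POINT_TO_POINT_MSGS_PER_SEC * CONTENTION_DERATING   # ~8.46e6
system_total = per_server * SERVERS                              # ~2.0e8

print(f"{msgs_per_packet} sub-msgs/packet, {per_server:.2e}/sec/server, "
      f"{system_total:.2e}/sec system-wide")
```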
> 
> Below is a summary of these very rough, often quite optimistic, estimates
> of the power of such a ~$33K server and the ~$850K 24-server system ---
> all with the qualifications discussed above.  (SOME OF THESE ESTIMATES MAY
> BE 2 TO 5 TIMES TOO HIGH.)
> 
> 
> ==================================================
> ---FOR ONE ROUGHLY $33K SERVER
> -------- FOR THE $850K 24-NODE SYSTEM
> ==================================================
> ---4 quadcore processors, 16 cores
> -------- 96 QUADCORE PROCESSORS, 384 CORES
> ---128GBytes RAM
> -------- 3TBYTES RAM
> ---32MBytes of L2 cache
> -------- 768MBYTES OF L2 CACHE
> ---20Gbits/sec inter-server bandwidth
> -------- 480GBITS/SEC INTER-SERVER BANDWIDTH
> ==================================================
> ---16GBytes of matrix processing/sec
> -------- 384 GBYTES OF MATRIX PROCESSING/SEC
> ---8G 2byte matrix elements processed/sec
> -------- 192G 2BYTE MATRIX ELEMENTS PROCESSED/SEC
> ---2G 8byte matrix elements processed/sec
> -------- 48G 8BYTE MATRIX ELEMENTS PROCESSED/SEC
> ---4G L2 cache accesses/sec (if no contention between cores)
> -------- 96G L2 CACHE ACCESSES/SEC (if no contention between cores)
> ---320M randomRamOpps/sec (cache line reads or writes)
> -------- 7.68G RANDOMRAMOPPS/SEC (cache line reads or writes)
> ---160M random cache line read-modify-writes/sec
> -------- 3.84G RANDOM CACHE LINE READ-MODIFY-WRITES/SEC
> ---8.4M 16Byte inter-server sub-msgs/sec (~1/20 of random r-m-writes)
> -------- 200M 16BYTE INTER-SERVER SUB-MSGS/SEC (~1/20 of random r-m-writes)
> // One msg to another server could activate all connections from the
> // sending graph node to graph nodes in that other server, but the amount
> // of such message multiplication is limited by the number of random cache
> // line read-modify-writes.
> ==================================================
> 
> 
> The take-home from this is that for $33K you can get a machine with
> 128GBytes of RAM and something in the ballpark of 160M random cache-line
> read-modify-writes/sec.  This should be enough to demonstrate, to those
> enlightened enough to understand AI concepts, the potential power of
> promising AGI architectures.  $33K is cheap enough that hopefully in a
> year or so tens or hundreds of grad-student projects will each be working
> with one or more such systems on AGI problems (hopefully, many with
> OpenCog).
> 
> For $850K you can get 3 terabytes of RAM.  That should be roughly enough
> to store as much information as the brain of a rat.  For example,
> http://faculty.washington.edu/chudler/facts.html states the cerebral
> cortex of a rat has a 6cm2 area.  Since the cortex has roughly 10^5
> neurons/mm2, that's 6x10^7 neurons, and if you assume 10^4 synapses per
> neuron, that's 6x10^11 synapses.  3TBytes of RAM would allow an average of
> 5 bytes per synapse.  Even if you doubled or tripled the number of neurons
> to reflect neurons in other parts of the brain, the $850K 24-server system
> would have more than one byte per synapse (which is probably too low by
> one or two orders of magnitude if you are not using a matrix
> representation, but in the right ballpark).
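The rat-cortex capacity estimate, step by step.  All inputs are the figures quoted in the paragraph (6 cm^2 of cortex, 10^5 neurons/mm^2, 10^4 synapses per neuron):

```python
# Rat-cortex memory estimate, using the figures quoted in the text.
CORTEX_AREA_MM2 = 6 * 100        # 6 cm^2 = 600 mm^2
NEURONS_PER_MM2 = 1e5
SYNAPSES_PER_NEURON = 1e4
RAM_BYTES = 3e12                 # 3 TBytes in the 24-server system

neurons = CORTEX_AREA_MM2 * NEURONS_PER_MM2    # 6e7 neurons
synapses = neurons * SYNAPSES_PER_NEURON       # 6e11 synapses
bytes_per_synapse = RAM_BYTES / synapses       # 5.0 bytes/synapse

print(f"{neurons:.0e} neurons, {synapses:.0e} synapses, "
      f"{bytes_per_synapse} bytes/synapse")
```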
> 
> When you get to processing and communicating power, however, the picture
> is much bleaker.  If you assume each neuron fires on average once a
> second, a number I have read in some papers, that is 6x10^11 synapse
> activations/sec.  If you could do your activations at the matrix speeds
> indicated above, you might be roughly in this ballpark (but you might well
> be slowed down by one or two orders of magnitude by things such as the
> sparseness of interconnects and L2 cache access speeds).
> 
> But if you are using random accessing of RAM to do activations, you are
> only going to get about 1/200th of this assumed rat-brain speed.
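That ratio, using the summary-table figure of roughly 3.8G random cache-line read-modify-writes/sec for the whole system:

```python
# Assumed rat-brain activation rate vs the system's random
# read-modify-write rate, per the figures in the text.
RAT_SYNAPSE_ACTIVATIONS_PER_SEC = 6e11   # 6e11 synapses firing ~once/sec
SYSTEM_RANDOM_RMW_PER_SEC = 3.8e9        # summary-table figure

shortfall = RAT_SYNAPSE_ACTIVATIONS_PER_SEC / SYSTEM_RANDOM_RMW_PER_SEC
print(f"system is ~{shortfall:.0f}x too slow (roughly 1/200th of rat speed)")
```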
> 
> (It should be noted, however, that some people claim that only about 1% of
> synapses are actually functional, which if true would indicate that even
> using random accessing of RAM you should be able to roughly simulate a rat
> brain.)
> 
> (And of course, an AGI program running on the 24-server system would
> probably be operating at a higher level of abstraction than most of the
> processing done in a rat's brain, such as at the word level, or at sensory
> levels where a lot of the lower-level inputs have been preprocessed by
> more efficient matrix or stream-computing methods.)
> 
> Thus, current AGI projects are going to be limited in the amount of
> spreading activation and inferencing they will be able to do with the
> types of hardware likely to be funded by typical academic projects.  Once
> the hardware industry starts selling hardware with much greater
> processor-memory and inter-processor bandwidth --- such as 64- to 256-core
> chips, with the cores connected by a high-bandwidth mesh network, and with
> each core connected by through-silicon vias to multiple memory layers
> above, providing fat buses between RAM and each core --- AGIs will be able
> to demonstrate much greater capability for a given amount of RAM.
> 
> But for those who can get enough funding for systems like the $33K server,
> up to the $850K 24-server system, these systems should be powerful enough
> to provide good testbeds for many AGI ideas.
> 
> Ed Porter
> 
> 
> 
> 
> 
> 
> -------------------------------------------
> agi
> Archives: http://www.listbox.com/member/archive/303/=now
> RSS Feed: http://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: http://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com





