[A51] CPU implementation (more code)

Frank A. Stevenson Mon, 02 Nov 2009 13:12:47 -0800

I cleaned up the ATI Brook code and made a new version of the shared 
library that uses only the CPU for generating chains. The code can be 
found here (Linux only at the moment):


http://traxme.net/a5/a5_cpu.tar.gz

There is a small Python frontend that will generate real chains (follow 
the instructions in my previous post about using the Python script). 
The script can also be edited to set the number of threads (cores) you 
wish to use.

An AMD Phenom X4 @ 3.2 GHz makes around 16 chains / second. I suppose 
there is room for assembly optimization here, but that isn't really the 
point. I am writing this code to look into efficient table lookup once 
the tables have been generated. The idea is to spread the lookup part 
to machines that may not have a GPU.

On my machine the code will cause 32 lookups to disk / second, hardly a 
cause for alarm, so a bog standard hard disk will do.
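
To put a rough number on "hardly a cause for alarm", here is a small 
back-of-the-envelope sketch. The 10 ms average random access time is an 
assumed ballpark for a desktop hard disk, not something measured for 
this post, and the function names are made up for illustration.

```python
def max_random_reads_per_sec(access_time_ms):
    """Upper bound on random reads per second if every read costs one access."""
    return 1000.0 / access_time_ms


def disk_utilization(lookups_per_sec, access_time_ms):
    """Fraction of the disk's random-read capacity used by the lookups."""
    return lookups_per_sec / max_random_reads_per_sec(access_time_ms)


ASSUMED_HDD_ACCESS_MS = 10.0   # assumption: typical seek + rotational latency
CPU_LOOKUPS_PER_SEC = 32       # figure quoted in the post

if __name__ == "__main__":
    load = disk_utilization(CPU_LOOKUPS_PER_SEC, ASSUMED_HDD_ACCESS_MS)
    print("disk load at %d lookups/sec: %.0f%%" % (CPU_LOOKUPS_PER_SEC, load * 100))
```

Under that assumption the disk is only about a third busy, which is 
consistent with an ordinary hard disk being enough for CPU-only lookup.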

But if the GPU is used for lookup, the rate will be much higher 
(320/sec), and I am currently copying sorted tables to a slow USB flash 
drive to determine if it can keep up with the pace with respect to 
lookups / reads. (It takes some hours to copy 1 million files down to 
this device, so I may change my approach to sorting to something that 
is more in line with the 64 KB block size commonly found on 8 GB flash disks.)
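
One possible reading of "more in line with the block size" is to pack 
the sorted entries into fixed 64 KB blocks in a single file instead of 
a million small files, and keep a small in-memory index of the first 
key in each block, so that each lookup costs exactly one block read. 
The sketch below is only an illustration of that idea, not the layout 
used by the actual code. The record format (two 64-bit values: chain 
end point, chain start point) and all names are assumptions.

```python
import bisect
import struct

BLOCK_SIZE = 64 * 1024                          # block size mentioned in the post
RECORD = struct.Struct("<QQ")                   # assumed: (end point, start point)
RECORDS_PER_BLOCK = BLOCK_SIZE // RECORD.size   # 4096 records of 16 bytes


def pack_blocks(sorted_records):
    """Split records (sorted by end point) into padded 64 KB blocks.

    Returns the blocks and the first end point of each block.
    The all-ones value is reserved as padding in this sketch.
    """
    blocks, first_keys = [], []
    for i in range(0, len(sorted_records), RECORDS_PER_BLOCK):
        chunk = sorted_records[i:i + RECORDS_PER_BLOCK]
        data = b"".join(RECORD.pack(end, start) for end, start in chunk)
        blocks.append(data.ljust(BLOCK_SIZE, b"\xff"))
        first_keys.append(chunk[0][0])
    return blocks, first_keys


def lookup(blocks, first_keys, endpoint):
    """Find the start point for an end point by touching a single block."""
    idx = bisect.bisect_right(first_keys, endpoint) - 1
    if idx < 0:
        return None                 # smaller than every stored end point
    for end, start in RECORD.iter_unpack(blocks[idx]):
        if end == endpoint:
            return start
        if end > endpoint:          # records are sorted, so we can stop early
            break
    return None
```

On a real device the blocks would be written back to back into one 
file, and block idx would be read with a seek to idx * BLOCK_SIZE. At 
320 lookups/sec the device has roughly 1/320 s, about 3.1 ms, per read.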

cheers,
  Frank






_______________________________________________
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51
