> I just test things and go for the fastest. But if we do theoretic math,
> SHMEM is difficult to beat of course.
> Google for measurements with shmem, not many out there.

SHMEM within the node or between nodes?
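Since measurements are scarce, here is a minimal sketch of the kind of
blocked-read (get) latency loop one could run, written against the classic
SGI/Quadrics-style SHMEM API (start_pes/shmalloc; OpenSHMEM later renamed
these to shmem_init/shmem_malloc). The iteration count and the choice of
PE 1 as the target are mine; run it with at least 2 PEs, either on one node
or across two, to answer the question above:

    #include <shmem.h>
    #include <stdio.h>
    #include <sys/time.h>

    #define ITERS 100000

    int main(void)
    {
        long *src, tmp;
        struct timeval t0, t1;
        int i;

        start_pes(0);                          /* shmem_init() in OpenSHMEM */
        src = (long *)shmalloc(sizeof(long));  /* symmetric heap */
        *src = _my_pe();
        shmem_barrier_all();

        if (_my_pe() == 0) {
            gettimeofday(&t0, NULL);
            for (i = 0; i < ITERS; i++)
                shmem_long_get(&tmp, src, 1, 1);  /* blocking get from PE 1 */
            gettimeofday(&t1, NULL);
            printf("avg blocked read: %.2f us\n",
                   ((t1.tv_sec - t0.tv_sec) * 1e6 +
                    (t1.tv_usec - t0.tv_usec)) / ITERS);
        }
        shmem_barrier_all();
        return 0;
    }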
> The fact that so few standardized/rewrote their floating point software
> for gpu's already says enough about all the legacy codes in the HPC
> world :)
>
> When some years ago i had a working 2-node cluster here with QM500-A, it
> had, in the 32-bit, 33MHz PCI long-sleeve slots, a blocked read latency
> of under 3 us; that is what i saw on my screen. Sure, i had no switch in
> between; direct connection between the 2 elan4's.
>
> I'm not sure what pci-x adds to it when clocked at 133MHz, but it won't
> be a big diff with pci-e.

There is a big difference between PCI-X and PCIe. PCIe is half the latency:
from roughly 0.7 down to 0.3 microseconds, more or less.

> PCI-e probably only has bigger bandwidth, doesn't it?

Also bandwidth ... :-)

> Beating such hardware 2nd hand is difficult. $30 on ebay and i can
> install 4 rails or so.
> Didn't find the cables yet though...
>
> So i don't see how to outdo that with old infiniband cards, which are
> $130 and upwards for the connectx, say $150 soon, which would allow only
> single rail, or maybe at best 2 rails. So far i didn't hear from anyone
> yet who has more than single-rail IB.
>
> Is it possible to install 2 rails with IB?

Yes, you can do dual rails; see the sketch at the end of this mail.

> So if i use your number in pessimistic manner, which means that there is
> some overhead of pci-x, then the connectx type IB can do 1 million
> blocked reads per second theoretically with 2 rails. That is $300 or so,
> cables not counted.

Are you referring to RDMA reads?
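(For what it's worth, 1 million reads per second across two rails works out
to about 2 us per read per rail, consistent with the pessimistic estimate
above.) On the verbs side, a "blocked read" would be an RDMA read posted to
the send queue and then polled for completion. A sketch with libibverbs,
in a hypothetical helper rdma_read_once(); it assumes the connected RC
queue pair, the memory registration, and the out-of-band exchange of the
peer's address/rkey are already done, and error handling is trimmed:

    #include <infiniband/verbs.h>
    #include <stdint.h>
    #include <string.h>

    /* Post one RDMA read and spin until it completes; the time from
       ibv_post_send() to the completion is the round-trip latency
       being discussed in this thread. */
    int rdma_read_once(struct ibv_qp *qp, struct ibv_cq *cq,
                       void *buf, struct ibv_mr *mr,
                       uint64_t remote_addr, uint32_t rkey, uint32_t len)
    {
        struct ibv_sge sge;
        struct ibv_send_wr wr, *bad_wr;
        struct ibv_wc wc;
        int n;

        memset(&sge, 0, sizeof sge);
        sge.addr   = (uintptr_t)buf;
        sge.length = len;
        sge.lkey   = mr->lkey;

        memset(&wr, 0, sizeof wr);
        wr.opcode              = IBV_WR_RDMA_READ;   /* one-sided read */
        wr.sg_list             = &sge;
        wr.num_sge             = 1;
        wr.send_flags          = IBV_SEND_SIGNALED;  /* want a completion */
        wr.wr.rdma.remote_addr = remote_addr;
        wr.wr.rdma.rkey        = rkey;

        if (ibv_post_send(qp, &wr, &bad_wr))
            return -1;

        while ((n = ibv_poll_cq(cq, 1, &wc)) == 0)
            ;                               /* busy-wait for completion */
        return (n < 0 || wc.status != IBV_WC_SUCCESS) ? -1 : 0;
    }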

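The dual-rail sketch promised above: with Open MPI's openib BTL, for
example, multi-rail striping kicks in once the BTL is allowed to use both
HCAs. Something along these lines, if memory serves; the device names
(mlx4_0, mlx4_1) are placeholders, check yours with ibv_devinfo:

    mpirun -np 16 \
        --mca btl openib,self,sm \
        --mca btl_openib_if_include mlx4_0,mlx4_1 \
        ./your_app

Whether you actually get additive bandwidth and read rate out of the second
rail is worth benchmarking rather than assuming.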