As it's a single thread, I doubt that faster memory is going to help you much. It's going to suck whatever you do.
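
If you want to check that on a box you already have before spending the money, something like the rough sketch below would show what a single thread actually pulls out of memory. The function name, buffer size and repeat count are my own assumptions and have nothing to do with dgrid. The copy goes through the Python runtime, so treat the figure as a rough indication rather than a proper STREAM result.

```python
import time


def single_thread_copy_bandwidth(n_bytes=256 * 1024 * 1024, repeats=5):
    """Return the best observed copy rate in GB/s for one thread.

    Hypothetical helper: the buffer size and repeat count are arbitrary
    choices, picked only so the buffer is much larger than the CPU caches.
    """
    src = bytearray(n_bytes)              # zero-filled source buffer
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)                  # one full read of src, one full write of dst
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst                           # release the copy before the next round
    best = max(best, 1e-9)                # guard against a zero timer reading
    # Each copy reads n_bytes and writes n_bytes.
    return 2 * n_bytes / best / 1e9


if __name__ == "__main__":
    rate = single_thread_copy_bandwidth()
    print(f"roughly {rate:.1f} GB/s copy rate on one thread")
```

Run it on two machines with different memory speeds. If the numbers barely differ, the extra money for faster DIMMs is unlikely to buy much for a single-threaded job.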

On 9 Jan 2013 at 17:29, Jörg Saßmannshausen <[email protected]> wrote:

> Dear all,
>
> many thanks for the quick reply and all the suggestions.
>
> The code we want to use is this one:
>
> http://www.cpfs.mpg.de/~kohout/dgrid.html
>
> Feel free to download and dig into the code. I am no expert in Fortran, so I
> won't be able to help you much if you have specific questions about the
> code :-( However, my understanding is that it will only run on one
> core/thread.
>
> As for the budget: that is where it is getting a bit tricky. The ceiling is
> 10k GBP. I know that machines with less memory, say 256 GB, are cheaper, so
> one solution would be to get two of the beasts so we can do two calculations
> at the same time. If there are enough slots free, we could upgrade to 500 GB
> once we get another pot of money.
>
> I guess I would go for DDR3, simply as it is faster. Waiting 2 weeks for a
> calculation is no fun, so if we can save a bit of time here (faster RAM) we
> actually gain quite a bit.
>
> I am not convinced by the AMD Bulldozer, to be honest. From what I
> understand, Sandy Bridge has the faster memory access (higher bandwidth).
> Is that correct, or am I missing something here?
>
> I gather that the idea of just using one CPU is not a good one. So we need to
> have a dual-CPU machine, which is fine with me.
>
> I am wondering about the vSMP / ScaleMP suggestion from Joe. If I am using an
> InfiniBand network here, would I be able to spread the 'bottlenecks' a bit
> better? What I am getting at is this: when I tested the InfiniBand on the new
> cluster we got, I noticed that if you run a job in parallel between nodes,
> the same number of cores is marginally faster. At the time I put that down
> to slightly faster memory access, as there was no bottleneck to the RAM.
> I am not familiar with vSMP (i.e. I have never used it), but is it possible
> to aggregate RAM from a number of nodes (say 40) and use it as a large
> virtual SMP? So one node would be slaving away with the calculations and the
> other nodes would only be doing memory IO. Is that possible with vSMP?
> In a related context, how about NUMAScale?
>
> The idea of the aggregated SSDs is nice as well. I know some storage vendors
> are using a mixture of RAM and SSD for their metadata (fast access) and that
> seems to work quite well. So would that be a large swap file / partition, or
> is there another way to use disc space as RAM? I need to read the NVMalloc
> paper, I suppose. Is that actually used, or is it just a good idea for which
> we have a working example?
>
> I don't think there is much disc IO here. There is most certainly no
> network-bound traffic, as it is a single thread. A fast CPU would be an
> advantage as well; however, I have got the feeling the trade-off would be
> the memory access speed (bandwidth).
>
> I have tried to answer the questions raised. Let me know whether there are
> still some unclear points.
>
> Thanks for all your help and suggestions so far. I will need to digest them.
>
> All the best from a sunny London
>
> Jörg
>
> --
> *************************************************************
> Jörg Saßmannshausen
> University College London
> Department of Chemistry
> Gordon Street
> London
> WC1H 0AJ
>
> email: [email protected]
> web: http://sassy.formativ.net
>
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html
>
> _______________________________________________
> Beowulf mailing list, [email protected] sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
