On Thu, Nov 15, 2012 at 11:26 AM, Jörg Saßmannshausen <[email protected]> wrote:
> One of the older clusters has
> Mellanox MT23108 cards and a Voltaire sLB-24 switch, the newer cluster has
> Mellanox MT26428 with a QLogic 12300 switch.

You are comparing different IB cards as well, not only different switches.

> All clusters are running Debian
> Squeeze, all of them are 64 bit machines and all of them have the required
> packages for the IB network installed.

Are the IB drivers at the same version?

> That crashes immediately, and I have included the verbose output of that in
> the attached file.

This is not really a crash: it politely tells you that it couldn't reach
the other ranks and then terminates. The following lines:

  Process 1 ([[5187,1],1]) is on host: node24
  Process 2 ([[5187,1],0]) is on host: node32
  BTLs attempted: self sm

mean that the only BTLs that qualified to continue were self and sm, neither
of which allows inter-node communication. Very likely tcp (which you
disabled) was the only inter-node BTL available. So now it's up to you to
find out why the openib BTL could not be selected...

> However, if I am not using the cluster with the Voltaire
> switch (described above) but the one with the more recent Qlogic switch and
> _copy_ the binary just over, it is working.

You are copying the binary. Are you also copying the IB drivers/libs? Is IB
configured the same way? Is the OpenMPI lib compiled to dynamically look
for components? If so, does it find the IB libs in the right places?

> However, from the above observation (and I got a very similar case with NWchem)
> it appears to me that the program GAMESS-US has problems with the Voltaire
> network but no problems with the Qlogic network. That is something I find a
> bit puzzling.

You can make it even simpler: are you able to run a simple MPI program
(hello world, pi calculation, etc.) across several nodes when forcing OpenMPI
to use the openib BTL?

Cheers,
Bogdan
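
P.S. The way I read the "BTLs attempted" message can be sketched as a small Python script. This is only an illustration and not part of Open MPI: the function name `diagnose` and the set `INTRA_NODE_BTLS` are made up for this example, the list of intra-node BTLs is an assumption, and the sample text is just the three lines quoted above.

```python
import re

# Assumption: these BTLs only move data within a single node
# (loopback and shared-memory components).
INTRA_NODE_BTLS = {"self", "sm", "vader"}


def diagnose(log_text):
    """Summarise an Open MPI 'could not reach other ranks' message."""
    # Host names reported for each process.
    hosts = re.findall(r"is on host:\s*(\S+)", log_text)
    # BTLs listed on the 'BTLs attempted' line (rest of that line).
    match = re.search(r"BTLs attempted:\s*(.*)", log_text)
    attempted = match.group(1).split() if match else []
    # Anything not known to be intra-node might carry inter-node traffic.
    inter_node = [b for b in attempted if b not in INTRA_NODE_BTLS]
    return {
        "hosts": hosts,
        "attempted": attempted,
        "different_hosts": len(set(hosts)) > 1,
        "inter_node_btls": inter_node,
    }


SAMPLE = """\
Process 1 ([[5187,1],1]) is on host: node24
Process 2 ([[5187,1],0]) is on host: node32
BTLs attempted: self sm
"""

if __name__ == "__main__":
    result = diagnose(SAMPLE)
    if result["different_hosts"] and not result["inter_node_btls"]:
        print("ranks are on different hosts but no inter-node BTL was usable")
```

For the sample text, the two ranks sit on different hosts and nothing other than self and sm was attempted, which is why the job gives up. For the hello-world test itself, the usual approach (as far as I know) is to pass something like `--mca btl openib,self` to mpirun and to check with `ompi_info` whether the openib component was built at all.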
