A few thoughts:

1. Bind-to-socket is broken in 1.5.4 - fixed in the next release.
2. Add --report-bindings to the cmd line and see where it thinks the procs are bound.
3. Sounds like memory may not be local - might be worth checking mem binding.

Sent from my iPad

On Feb 13, 2012, at 7:07 AM, Matthias Jurenz <matthias.jur...@tu-dresden.de> wrote:

> Hi Sylvain,
>
> thanks for the quick response!
>
> Here are some results with process binding enabled. I hope I used the
> parameters correctly...
>
> Bind two ranks to one socket:
> $ mpirun -np 2 --bind-to-core ./all2all
> $ mpirun -np 2 -mca mpi_paffinity_alone 1 ./all2all
>
> Bind two ranks to two different sockets:
> $ mpirun -np 2 --bind-to-socket ./all2all
>
> All three runs resulted in similarly bad latencies (~1.4us).
> :-(
>
>
> Matthias
>
> On Monday 13 February 2012 12:43:22 sylvain.jeau...@bull.net wrote:
>> Hi Matthias,
>>
>> You might want to play with process binding to see if your problem is
>> related to bad memory affinity.
>>
>> Try to launch the pingpong on two CPUs of the same socket, then on
>> different sockets (i.e. bind each process to a core, and try different
>> configurations).
>>
>> Sylvain
>>
>>
>>
>> From: Matthias Jurenz <matthias.jur...@tu-dresden.de>
>> To: Open MPI Developers <de...@open-mpi.org>
>> Date: 13/02/2012 12:12
>> Subject: [OMPI devel] poor btl sm latency
>> Sent by: devel-boun...@open-mpi.org
>>
>>
>>
>> Hello all,
>>
>> on our new AMD cluster (AMD Opteron 6274, 2.2 GHz) we get very bad
>> latencies (~1.5us) when performing 0-byte p2p communication on a single
>> node using the Open MPI sm BTL. When using Platform MPI we get ~0.5us
>> latencies, which is pretty good. The bandwidth results are similar for
>> both MPI implementations (~3.3 GB/s) - this is okay.
>>
>> One node has 64 cores and 64 GB RAM. It doesn't matter how many ranks
>> are allocated by the application - we get similar results with different
>> numbers of ranks.
>>
>> We are using Open MPI 1.5.4, built with gcc 4.3.4 without any special
>> configure options except the installation prefix and the location of the
>> LSF stuff.
>>
>> As mentioned at http://www.open-mpi.org/faq/?category=sm we tried to use
>> /dev/shm instead of /tmp for the session directory, but it had no effect.
>> Furthermore, we tried the current release candidate 1.5.5rc1 of Open MPI,
>> which provides an option to use SysV shared memory (-mca shmem sysv) -
>> this also results in similarly poor latencies.
>>
>> Do you have any idea? Please help!
>>
>> Thanks,
>> Matthias
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
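
For reference, a minimal 0-byte ping-pong micro-benchmark along the lines of the runs above might look like the sketch below. This is only an illustration under stated assumptions, not the actual ./all2all source (which is not included in this thread); the file name "pingpong" and the iteration counts are placeholders.

    /* pingpong.c - minimal 0-byte ping-pong latency sketch (not the actual ./all2all code).
     * Run with e.g.: mpirun -np 2 --bind-to-core --report-bindings ./pingpong */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, peer, i, iters = 100000;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        peer = 1 - rank;    /* assumes exactly two ranks */

        /* warm-up round trips so the sm BTL is fully set up before timing */
        for (i = 0; i < 1000; i++) {
            if (rank == 0) {
                MPI_Send(NULL, 0, MPI_BYTE, peer, 0, MPI_COMM_WORLD);
                MPI_Recv(NULL, 0, MPI_BYTE, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else {
                MPI_Recv(NULL, 0, MPI_BYTE, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(NULL, 0, MPI_BYTE, peer, 0, MPI_COMM_WORLD);
            }
        }

        /* timed loop: latency is half the average round-trip time */
        t0 = MPI_Wtime();
        for (i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(NULL, 0, MPI_BYTE, peer, 0, MPI_COMM_WORLD);
                MPI_Recv(NULL, 0, MPI_BYTE, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else {
                MPI_Recv(NULL, 0, MPI_BYTE, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(NULL, 0, MPI_BYTE, peer, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        if (rank == 0)
            printf("0-byte latency: %.3f us\n", (t1 - t0) * 1e6 / (2.0 * iters));

        MPI_Finalize();
        return 0;
    }

Running it as "mpirun -np 2 --bind-to-core --report-bindings ./pingpong" both measures the latency and shows where the two ranks actually land. Comparing that with the node's NUMA layout (for example from "numactl --hardware", assuming numactl is installed) should help tell whether the ranks, or the memory they touch, end up on different sockets, which is the kind of non-local placement suggested in point 3 above.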