On 16/02/2012 17:12, Matthias Jurenz wrote:
> Thanks for the hint, Brice.
> I'll forward this bug report to our cluster vendor.
>
> Could this be the reason for the bad latencies with Open MPI, or does it
> only affect hwloc/lstopo?
It affects binding. So it may affect the performance you observed when using
"high-level" binding policies that end up binding to the wrong cores because
of hwloc/kernel problems. If you specify the binding manually, it shouldn't
hurt.

If the best latency is supposed to occur when the L2 is shared, then try:

    mpiexec -np 1 hwloc-bind pu:0 ./all2all : -np 1 hwloc-bind pu:1 ./all2all

Then we'll see whether you can get the same result with one of the OMPI
binding options.

Brice

> Matthias
>
> On Thursday 16 February 2012 15:46:46 Brice Goglin wrote:
>> On 16/02/2012 15:39, Matthias Jurenz wrote:
>>> Here is the output of lstopo from a single compute node. I'm wondering
>>> why the L1/L2 sharing isn't visible - not in the graphical output
>>> either...
>> That's a kernel bug. We're waiting for AMD to tell the kernel that L1i
>> and L2 are shared across dual-core modules. If you have a contact at
>> AMD, please ask them to look at
>> https://bugzilla.kernel.org/show_bug.cgi?id=42607
>>
>> Brice
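P.S. As a rough sketch of what "the same result with one of the OMPI binding
options" could look like, one possibility is an explicit rankfile (the
hostname "node01" and the socket:core numbers below are placeholders, not
taken from your machine, and this assumes your Open MPI build supports
-rf/--rankfile and --report-bindings):

    $ cat myrankfile
    rank 0=node01 slot=0:0
    rank 1=node01 slot=0:1

    $ mpiexec -np 2 -rf myrankfile --report-bindings ./all2all

Keep in mind that the socket:core numbers in a rankfile follow the topology
reported by hwloc/the kernel, so until the L2-sharing bug above is fixed, the
explicit hwloc-bind pu:0/pu:1 run remains the reference for "both ranks on
the same module".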