[OMPI devel] Very poor performance with btl sm on twin Nehalem servers with Mellanox ConnectX installed

2010-05-13 Thread Oskar Enoksson
Sorry for crossposting, I already posted this report to the users list, but the developers list is probably more relevant. I have a cluster with two Intel Xeon Nehalem E5520 CPUs per server (quad-core, 2.27 GHz). The interconnect is 4x QDR InfiniBand (Mellanox ConnectX). I have compiled and installed

Re: [OMPI devel] Very poor performance with btl sm on twin Nehalem servers with Mellanox ConnectX installed

2010-05-13 Thread Christopher Samuel
On 13/05/10 20:56, Oskar Enoksson wrote: > The problem is that I get very bad performance unless I > explicitly exclude the "sm" btl and I can't figure out why. Recently someone reported issues which were traced back to the fact that the files that sm uses for mmap() were in a /tmp which was NFS

[OMPI devel] RFC: Remove all other paffinity components

2010-05-13 Thread Jeff Squyres
WHAT: Remove all non-hwloc paffinity components. WHY: The hwloc component supports all those systems. WHERE: opal/mca/paffinity/[^hwloc|base] directories WHEN: for 1.5.1 TIMEOUT: Tuesday call, May 25 (yes, about 2 weeks from now -- let hwloc soak for a while...) --

[OMPI devel] RFC: move hwloc code base to opal/hwloc

2010-05-13 Thread Jeff Squyres
WHAT: hwloc is currently embedded in opal/mca/paffinity/hwloc/hwloc -- move it to be a first class citizen in opal/hwloc. WHY: Let other portions of the OPAL, ORTE, and OMPI code bases use hwloc services (remember that hwloc provides detailed topology information, not just processor binding).
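
As a rough illustration of the "hwloc services" the RFC refers to, the sketch below (not from the thread; it assumes the public hwloc C API of that era, with HWLOC_OBJ_SOCKET as the socket type) loads the machine topology and counts sockets, cores, and hardware threads:

    #include <stdio.h>
    #include <hwloc.h>

    int main(void)
    {
        hwloc_topology_t topology;

        /* Build a view of the current machine's topology. */
        hwloc_topology_init(&topology);
        hwloc_topology_load(topology);

        /* Query object counts by type -- the kind of detailed topology
           information (beyond processor binding) mentioned in the RFC. */
        printf("sockets: %d, cores: %d, hardware threads: %d\n",
               hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_SOCKET),
               hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE),
               hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU));

        hwloc_topology_destroy(topology);
        return 0;
    }

Compile and link against hwloc (e.g. gcc topo.c -lhwloc); the idea of the RFC is that these same calls would become usable from the OPAL, ORTE, and OMPI code bases once hwloc lives in opal/hwloc.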

Re: [OMPI devel] RFC: Remove all other paffinity components

2010-05-13 Thread Christopher Samuel
On 14/05/10 10:20, Jeff Squyres wrote: > That being said, we might as well leave the paffinity > framework around, even if it only has one component left, > simply on the argument that someday Open MPI may support > a platform that hwloc does not. Sounds good to me. cheers! Chris -- Christoph