On 12/04/13 14:53, Jeff Squyres (jsquyres) wrote:
> On Dec 4, 2013, at 4:31 AM, Paul Kapinos <kapi...@rz.rwth-aachen.de> wrote:
>
>> Argh - what a shame not to see "btl:usnic" :-|
>
> What a shame you don't have Cisco hardware to use the usnic BTL! :-p

Well, this is far above my decision level :o)

>>> Look for the openib messages, not the usnic messages.
>>
>> Well, as said there were *no messages* from the patch you provided in
>> http://www.open-mpi.org/community/lists/devel/2013/06/12472.php
>
> Ah, I see.
>
>> I've attached of a run with single process per node on nodes with 2 NICs,
>> maybe you can see what goes wrong..
>
> What I'm guessing is happening here is that hwloc was built without PCI
> device detection, and therefore you're not getting the benefit of the
> near/far detection. I don't think we currently export whether hwloc was
> built with PCI device detection support or not, so look for the section in
> your configure output labeled:
>
> --- MCA component hwloc:hwloc152 (m4 configuration macro, priority 75)
>
> Send the output of that section here. There should be tests for PCI
> libraries in there; that should tell us whether you have PCI detection
> support enabled.

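For anyone following along, the check described above can be sketched in a few lines of Python. This is only a rough illustration, not anything shipped with Open MPI: it assumes that each component section in the configure output starts with a line beginning `--- MCA component` (as the header quoted above does), and the helper names `extract_section` and `pci_checks` as well as most of the sample lines are made up for the example.

```python
import re

# Assumption: every component section of the configure output opens with
# a header line that starts like this (taken from the header quoted above).
SECTION_START = "--- MCA component"


def extract_section(lines, component):
    """Return the header and body lines of the section for `component`."""
    section = []
    inside = False
    for line in lines:
        if line.startswith(SECTION_START):
            # A component header either opens the wanted section or ends it.
            inside = component in line
        if inside:
            section.append(line.rstrip("\n"))
    return section


def pci_checks(section):
    """Keep only the lines that mention PCI, ignoring case."""
    return [line for line in section if re.search(r"pci", line, re.IGNORECASE)]


# Invented sample input. Only the hwloc152 header and the
# "PCI device support" line come from this thread; the rest is placeholder.
SAMPLE = """\
--- MCA component foo:bar (hypothetical component)
checking something unrelated... no
--- MCA component hwloc:hwloc152 (m4 configuration macro, priority 75)
checking whether to enable hwloc PCI device support... yes (default)
checking for a hypothetical pci library... no
--- MCA component baz:qux (hypothetical component)
checking another unrelated thing... yes
""".splitlines()

if __name__ == "__main__":
    for line in pci_checks(extract_section(SAMPLE, "hwloc:hwloc152")):
        print(line)
```

To run it against a real log, the lines could be read with `gzip.open("log_01_conf.txt.gz", "rt")` instead of using `SAMPLE`. Note that a "yes (default)" on the "whether to enable" line only says the feature was requested; the results of the PCI library tests in the same section are what show whether detection was actually built in.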
The whole configure output is attached, to prevent bad copying, as well as the output of 'ompi_info --all'.
As far as I see, "it should be there":

> checking whether to enable hwloc PCI device support... yes (default)

--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915

log_01_conf.txt.gz
Description: GNU Zip compressed data
openmpi-1.7.3js_ompi_info--all.txt.gz
Description: GNU Zip compressed data