Re: [OMPI users] Myricom MX2G Segmentation fault on OMPI 1.6

Jeff Squyres Fri, 15 Jun 2012 09:41:12 -0400

On Jun 11, 2012, at 7:48 PM, Yong Qin wrote:

> ah, I guess my original understanding of PML was wrong. Adding "-mca
> pml ob1" does help to ease the problem.


See the README for a little more discussion about this issue.  There can only 
be 1 PML in use by a given MPI job -- using "--mca pml ob1" forces the use of 
the "ob1" PML (i.e., the BTLs), as opposed to the "cm" PML (i.e., the MTLs).
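
For reference, here is a minimal sketch of two ways to force the "ob1" PML.
The application name (./my_mpi_app) and the process count are placeholders,
not taken from this thread, and the guard around mpirun is only there so the
snippet does nothing on a machine without Open MPI or the binary:

```shell
# Option 1: select the PML through the environment. Open MPI reads MCA
# parameters from environment variables named OMPI_MCA_<param>.
export OMPI_MCA_pml=ob1

# Option 2: pass the same parameter on the mpirun command line.
# "./my_mpi_app" and "-np 4" are hypothetical placeholders.
if command -v mpirun >/dev/null 2>&1 && [ -x ./my_mpi_app ]; then
    mpirun --mca pml ob1 -np 4 ./my_mpi_app
fi
```

Either form should select "ob1"; the command-line form is the easier one to
use for a one-off test.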

> But the question still
> remains. Why ompi decided to use the mx BTL in the first place, given
> there's no physical device onboard at all? This behavior is completely
> different than the original gm BTL.

That's not what is actually happening.

Open MPI was *built* with MX support, and it therefore assumes that you will 
likely want to use it.  So it *warns* you when there is no MX device available.

That being said, I have recently run into the issue you are seeing: if OMPI 1.6 
warns you that there is no high-speed device available (openib in my case), it 
then segv's (which it obviously shouldn't -- it should warn and then die 
gracefully).  I'll open a ticket on this behavior.  It's not a common scenario, 
but we still shouldn't segv.

My first guess is that this has something to do with the memory manager... but 
that's a guess.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
