Re: [OMPI devel] seg fault when using yalla, XRC, and yalla

2016-04-24 Thread Alina Sklarevich
Hi, When the segmentation fault happens, I get the following trace:
(gdb) bt
#0  0x7fffee4f007d in ibv_close_xrcd (xrcd=0x2) at /usr/include/infiniband/verbs.h:1227
#1  0x7fffee4f055f in mca_btl_openib_close_xrc_domain (device=0xfb20c0) at btl_openib_xrc.c:104
#2  0x7fffee4da0
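The faulting frame is the inline ibv_close_xrcd() wrapper from verbs.h, and the xrcd=0x2 argument shows the openib BTL handing libibverbs something that was never a valid XRC domain handle. As a rough illustration only (not the actual Open MPI code, and the helper name is made up), a defensive teardown around that verbs call could look like the sketch below; note that a NULL check cannot catch a stale sentinel such as 0x2, so the real cure is making sure the field is only set by a successful ibv_open_xrcd() and reset to NULL once released.

    #include <stdio.h>
    #include <infiniband/verbs.h>

    /* Hypothetical helper: close an XRC domain only if the handle was
     * actually set, and clear it afterwards to avoid a double close. */
    static void close_xrcd_safely(struct ibv_xrcd **xrcd)
    {
        if (NULL == xrcd || NULL == *xrcd) {
            return;                     /* nothing to close */
        }
        if (0 != ibv_close_xrcd(*xrcd)) {
            perror("ibv_close_xrcd");   /* report, keep shutting down */
        }
        *xrcd = NULL;
    }

    int main(void)
    {
        struct ibv_xrcd *xrcd = NULL;   /* never opened in this sketch */
        close_xrcd_safely(&xrcd);       /* safely does nothing */
        return 0;
    }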

Re: [OMPI devel] seg fault when using yalla, XRC, and yalla

2016-04-21 Thread Nathan Hjelm
In 1.10.x it is possible for the BTLs to be in use by either ob1 or an oshmem component. In 2.x, one-sided components can also use BTLs. The MTL interface does not provide support for accessing hardware atomics and RDMA. As for UD, it stands for Unreliable Datagram. Its usage gets better message ra
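For context on the UD remark: MXM picks its transports through the MXM_TLS environment variable, so a UD-based run under yalla (or cm/mxm) can be requested by exporting it through mpirun. The value list below is illustrative; check the MXM documentation for the release in use.

    mpirun -x MXM_TLS=self,shm,ud --mca pml yalla -n 2 ./my_app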

Re: [OMPI devel] seg fault when using yalla, XRC, and yalla

2016-04-21 Thread David Shrader
Hey Nathan, I thought only one PML could be loaded at a time, and that the only PML that could use BTLs was ob1. If that is the case, how can the openib btl run at the same time as cm and yalla? Also, what is UD? Thanks, David On 04/21/2016 09:25 AM, Nathan Hjelm wrote: The openib btl should be

Re: [OMPI devel] seg fault when using yalla, XRC, and yalla

2016-04-21 Thread Nathan Hjelm
The openib btl should be able to run alongside cm/mxm or yalla. If I have time this weekend I will get on mustang and see what the problem is. The best answer is to change the openmpi-mca-params.conf in the install to have pml = ob1. I have seen little to no benefit from using MXM on mustang.
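For anyone applying that workaround: openmpi-mca-params.conf normally sits under the installation prefix (typically <prefix>/etc/openmpi-mca-params.conf, depending on how Open MPI was installed), and the change Nathan describes is a single line:

    # prefer the ob1 PML over cm/mxm and yalla
    pml = ob1

The same selection can be tried per-run without editing the file:

    mpirun --mca pml ob1 -n 2 ./my_app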

Re: [OMPI devel] seg fault when using yalla, XRC, and yalla

2016-04-21 Thread Alina Sklarevich
David, thanks for the info you provided. I will try to dig in further to see what might be causing this issue. In the meantime, maybe Nathan can comment on the openib btl behavior here? Thanks, Alina. On Wed, Apr 20, 2016 at 8:01 PM, David Shrader wrote: > Hello Alina, > > Thank you

Re: [OMPI devel] seg fault when using yalla, XRC, and yalla

2016-04-20 Thread David Shrader
Hello Alina, Thank you for the information about how the pml components work. I knew that the other components were being opened and ultimately closed in favor of yalla, but I didn't realize that the initial open would cause a persistent change in the ompi runtime. Here's the information you req

Re: [OMPI devel] seg fault when using yalla, XRC, and yalla

2016-04-20 Thread Alina Sklarevich
Hi David, I was able to reproduce the issue you reported. When the command line doesn't specify the components to use, ompi will try to load/open all the ones available (and close them in the end) and then choose the components according to their priority and whether or not they were opened succe
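One way to watch this open/close/selection sequence is to list the PML components built into the install and raise the PML framework's verbosity (a standard MCA parameter; the exact messages vary between releases):

    ompi_info | grep "MCA pml"                       # PMLs available in this build
    mpirun --mca pml_base_verbose 10 -n 2 ./my_app   # reports which PML wins selection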

Re: [OMPI devel] seg fault when using yalla, XRC, and yalla

2016-04-20 Thread Joshua Ladd
Hi David, We are looking into your report. Best, Josh On Tue, Apr 19, 2016 at 4:41 PM, David Shrader wrote: > Hello, > > I have been investigating using XRC on a cluster with a Mellanox > interconnect. I have found that in a certain situation I get a seg fault. I > am using 1.10.2 compiled wi

[OMPI devel] seg fault when using yalla, XRC, and yalla

2016-04-19 Thread David Shrader
Hello, I have been investigating using XRC on a cluster with a Mellanox interconnect. I have found that in a certain situation I get a seg fault. I am using 1.10.2 compiled with gcc 5.3.0, and the simplest configure line that I have found that still results in the seg fault is as follows: $