Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-23 Thread Paul Hargrove
Ralph, Indeed, configuration with --enable-mca-no-build=coll-ml resolved my problem. So, this *is* the same problem that was already known. Sorry for the false alarm. -Paul On Sun, Aug 23, 2015 at 9:43 AM, Ralph Castain wrote: > I think that’s true - this looks like the hcoll symbol issue. I’d s

Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-23 Thread Ralph Castain
I think that’s true - this looks like the hcoll symbol issue. I’d suggest configuring with --enable-mca-no-build=coll-ml to resolve the problem in static builds, or follow Gilles' suggestion about .ompi_ignore > On Aug 22, 2015, at 10:14 PM, Gilles Gouaillardet > mailto:gilles.gouaillar...@gma
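
A minimal sketch of the static-build workaround suggested above, assuming an Open MPI source tree with the usual ./configure script. The three flags are the ones quoted in this thread; DRY_RUN is a hypothetical helper variable added only for illustration and is not an Open MPI setting.

```shell
#!/bin/sh
# Sketch only: static Open MPI build that skips the coll/ml component.
# DRY_RUN is a hypothetical helper variable (not part of Open MPI);
# it defaults to 1 so the script only prints the command it would run.
DRY_RUN="${DRY_RUN:-1}"

# Flags taken from the thread: static-only build, coll/ml not built.
CONFIGURE_ARGS="--enable-static --disable-shared --enable-mca-no-build=coll-ml"

if [ "$DRY_RUN" = "1" ]; then
    echo "./configure $CONFIGURE_ARGS"
else
    # Intentionally unquoted so the flags are passed as separate arguments.
    ./configure $CONFIGURE_ARGS
fi
```

Running the script with DRY_RUN=0 from the top of the source tree would invoke configure for real; any other options a given site needs (install prefix, MXM or hcoll locations) are not shown because the thread does not state them.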

Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-23 Thread Gilles Gouaillardet
Paul, if ompi is built statically or with --disable-dlopen, I do not think --mca coll ^ml can prevent the crash (assuming this is the same issue we discussed before). note if you build dynamically and without --disable-dlopen, it might or might not crash, depending on how modules are enumerated, a

Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-23 Thread Paul Hargrove
Gilles, This is on Mellanox's own system where /opt/mellanox/hcoll was updated Aug 2. This problem also did not occur unless I build libmpi statically. A run of "mpirun -mca coll ^ml -np 2 examples/ring_c" still crashes. So, I really don't know if this is the same issue, but suspect that it is not

Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-22 Thread Gilles Gouaillardet
Paul, isn't this an issue that was already discussed? mellanox proprietary hcoll library includes its own coll ml module that conflicts with the ompi one. mellanox folks fixed this internally but I am not sure this has been released. you can run nm libhcoll.so if there are some symbols starting w
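
A hedged sketch of the symbol check suggested above. The prefix to look for is cut off in this archive snippet, so it is left as a parameter rather than guessed. symbols_with_prefix is a hypothetical helper name, and the only assumption about nm is that the symbol name is the last field on each output line.

```shell
#!/bin/sh
# Sketch only: print the symbols from nm-style input that start with a prefix.
# symbols_with_prefix is a hypothetical helper, not part of Open MPI or hcoll.
symbols_with_prefix() {
    # $1 = prefix to match; nm-style lines are read from standard input.
    # index(name, prefix) == 1 is true only when name begins with prefix.
    awk -v p="$1" 'NF > 0 && index($NF, p) == 1 { print $NF }'
}

# Intended use (PREFIX stands for the prefix named in the full message):
#   nm libhcoll.so | symbols_with_prefix "$PREFIX"
```

If the command prints any symbols, the installed libhcoll.so still carries its own copy of those symbols, which is the conflict described in this message.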

[OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-22 Thread Paul Hargrove
Having seen problems with mtl:ofi with "--enable-static --disable-shared", I tried mtl:psm and mtl:mxm with those options as well. The good news is that mtl:psm was fine, but the bad news is when testing mtl:mxm I encountered a new problem involving coll:hcoll. Ralph probably wants to strangle me r