Fixed in https://github.com/open-mpi/ompi/pull/1959
> On Aug 11, 2016, at 6:23 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>
> Thanks George,
>
>
> fwiw, note the current behavior is a bit more "twisted" than that.
>
> OPAL_MODEX_RECV_VALUE() returns successfully (i.e. err == OPAL_SUCCESS) but
> the OPAL_PMIX_NODEID value (i.e. val) is -1.
>
> that means orted did "push" OPAL_PMIX_NODEID, but with an uninitialized value
> of -1 (this is set in the constructor).
>
> fortunately, you used the same -1 special value for the case where
> OPAL_MODEX_RECV_VALUE() fails (e.g. OPAL_ERR_NOT_FOUND),
>
> so bottom line, your commit does fix the crash.
>
> Cheers,
>
> Gilles
>
> On 8/12/2016 2:09 AM, George Bosilca wrote:
>> I just pushed a solution to this problem in 8d0baf140f. If we are unable to
>> extract the expected information from the RTE, we simply build a
>> non-reordered communicator and gracefully return.
>>
>> That being said, not being able to correctly retrieve OPAL_PMIX_NODEID has
>> the potential to drastically decrease performance, as no specialized
>> hierarchies can be built without the RTE information.
>>
>> George.
>>
>>
>> On Wed, Aug 10, 2016 at 3:57 AM, Gilles Gouaillardet <gil...@rist.or.jp>
>> wrote:
>> Ralph,
>>
>>
>> i noticed dist-graph/distgraph_test_4 from the ibm test suite fails when
>> using a hostfile and running no task on the host running mpirun.
>>
>> n0$ mpirun --host n1:1,n2:1 -np 2 ./dist-graph/distgraph_test_4
>>
>>
>> the root cause is that OPAL_PMIX_NODEID is correctly set (0, 1, 2) by mpirun,
>> but for some reason, orted sets it to -1 everywhere.
>>
>> an indirect consequence is a crash of the test (it believes tasks run on
>> zero distinct nodes instead of 2)
>>
>>
>> this occurs only on master, and v2.x is fine.
>>
>>
>> Could you please have a look ?
>>
>>
>> Cheers,
>>
>>
>> Gilles
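
For readers following the quoted thread, the fallback it describes (treat a node id of -1 as "unknown" and skip reordering) can be sketched roughly as below. This is a self-contained illustration only, not the code from the commit or the PR: NODEID_UNSET, all_nodeids_known(), count_distinct_nodes() and can_reorder() are hypothetical names, and the plain int array stands in for whatever the modex lookup returned for each peer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel mirroring the -1 "unset" node id from the thread. */
#define NODEID_UNSET (-1)

/* Hypothetical helper: true only if every peer reported a usable node id. */
bool all_nodeids_known(const int *nodeids, size_t n)
{
    assert(nodeids != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (nodeids[i] == NODEID_UNSET) {
            return false;
        }
    }
    return true;
}

/* Hypothetical helper: count distinct node ids, ignoring unset entries.
 * Quadratic on purpose to keep the sketch short. */
size_t count_distinct_nodes(const int *nodeids, size_t n)
{
    assert(nodeids != NULL || n == 0);
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++) {
        if (nodeids[i] == NODEID_UNSET) {
            continue;
        }
        bool seen = false;
        for (size_t j = 0; j < i; j++) {
            if (nodeids[j] == nodeids[i]) {
                seen = true;
                break;
            }
        }
        if (!seen) {
            distinct++;
        }
    }
    return distinct;
}

/* Hypothetical decision point: only attempt a reordered communicator when
 * the node information is complete; otherwise the caller would fall back
 * to a non-reordered one. */
bool can_reorder(const int *nodeids, size_t n)
{
    return n > 0 && all_nodeids_known(nodeids, n);
}
```

With node ids {1, 2} this sketch counts two distinct nodes and allows reordering; with {-1, -1} it counts zero distinct nodes and can_reorder() returns false, which corresponds to the non-reordered fallback George describes.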
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel