Hey Tom

Note that rc2 had a bug in the out-of-band messaging system - might be what you are hitting. I'd suggest working with rc4.

On Mon, Dec 15, 2014 at 12:57 PM, Tom Wurgler <twu...@goodyear.com> wrote:
>
> I have to take it back. While the first job was less than a node's
> worth of cores and ran properly on the cores I wanted. more testing is
> revealing other problems.
>
> Anything that spans more than one node crashes and burns, with a core
> dump, and nothing in the files to indicate why.
>
> Note this is still rc2....
>
> More testing on-going....
>
> ------------------------------
> *From:* devel <devel-boun...@open-mpi.org> on behalf of Tom Wurgler <twu...@goodyear.com>
> *Sent:* Monday, December 15, 2014 1:23 PM
> *To:* Open MPI Developers
> *Subject:* Re: [OMPI devel] 1.8.4rc Status
>
> It seems to be working in rc2 after all.
>
> I was still trying to use a rankfile, but it appears that is no longer
> needed.
>
> Thanks!
>
> ------------------------------
> *From:* devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain <r...@open-mpi.org>
> *Sent:* Monday, December 15, 2014 8:45 AM
> *To:* Open MPI Developers
> *Subject:* Re: [OMPI devel] 1.8.4rc Status
>
> Should be there in rc4, and I thought it made it to rc2 for that matter.
> I'll take a gander.
>
> FWIW: I'm working off-list with IBM to tighten the LSF integration so we
> correctly read and follow their binding directives. This will also be in
> 1.8.4 as we are in final test with it now.
>
> Ralph
>
> On Mon, Dec 15, 2014 at 5:40 AM, Tom Wurgler <twu...@goodyear.com> wrote:
>>
>> Forgive me if I've missed it, but I believe using physical OR logical
>> core numbering was going to be reimplemented in the 1.8.4 series.
>>
>> I've checked out rc2 and as far as I can tell, it isn't there as yet.
>> Is this correct?
>>
>> thanks!
>>
>> ------------------------------
>> *From:* devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain <r...@open-mpi.org>
>> *Sent:* Monday, December 15, 2014 8:35 AM
>> *To:* Open MPI Developers
>> *Subject:* [OMPI devel] 1.8.4rc Status
>>
>> Hi folks
>>
>> Trying to summarize the current situation on releasing 1.8.4. Remaining
>> identified issues:
>>
>> 1. TCP/BTL hang under mpi-thread-multiple. Asked George to look into it.
>>
>> 2. hwloc updates required. Brice committed them to the hwloc 1.7 repo.
>> Gilles volunteered to create the PR from there.
>>
>> 3. Fortran f08 binding disable for compilers not meeting certain
>> conditions. PR from Gilles awaiting review by Jeff
>>
>> 4. Topo signature issue reported by IBM. Ralph is waiting for more
>> debug.
>>
>> 5. MPI/IO issue reported by Eric Chamberland. Gilles investigating.
>>
>> 6. make check issue on SPARC. Problem and fix reported by Paul
>> Hargrove, Ralph will commit
>>
>> 7. Linkage issue on Solaris-11 reported by Paul Hargrove. Missing the
>> multi-threaded C libraries, apparently need "-mt=yes" in both compile and
>> link. Need someone to investigate.
>>
>> Please let me know if I've missed anything.
>> Ralph
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/12/16595.php
>>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/12/16604.php
>