Ralph and Jeff,

Thanks for all the rapid fixes. I'll send openmpi-1.7.4rc2r30031 for a spin while I go wait in line at the Post Office.

-Paul

On Fri, Dec 20, 2013 at 11:45 AM, Ralph Castain <r...@open-mpi.org> wrote:

> Hi Paul
>
> The binding stuff was in there, but the limit protection code just went in
> today. Jeff has since regenerated the tarball for the web site, so the one
> up there should have most (if not all) of these problems fixed
>
> Have a great holiday!
> Ralph
>
>
> On Dec 20, 2013, at 11:40 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Ralph,
>
> I see the same behavior w/ last night's 1.7 tarball
> (openmpi-1.7.4rc2r30002).
> The very next commit, r30003, is your addition (on trunk) of guards for
> RLIMIT_AS, etc..
> So, I DON'T think any fix for this behavior is in the 1.7 branch as you
> thought (maybe just CMR'ed?)
>
> Let me know if there is additional information about the platform or error
> which I should collect.
>
> -Paul
>
> P.S.
> You may see my email vacation auto-responder message.
> My vacation has started (no *paid* work) but I am still reading email
> today.
> I plan to re-test tonight's 1.7 tarball on all the systems where I
> reported issues on Thu night.
>
>
> On Thu, Dec 19, 2013 at 7:19 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> I believe this one has already been fixed and is in the nightly
>> (1.7.4rc2) - for now, you can just set "--bind-to none" on the cmd line to
>> get past it
>>
>>
>> On Dec 19, 2013, at 6:42 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> Testing with Solaris 10 on SPARC, I was expecting to encounter the bus
>> error reported previously by Siegman Gross. Instead I see the following
>> hwloc-related abort:
>>
>> $ env
>> PATH=/home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/bin:$PATH
>>
>> LD_LIBRARY_PATH_64=/home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/lib:$LD_LIBRARY_PATH_64
>> OMPI_MCA_shmem_mmap_enable_nfs_warning=0
>>
>> /home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/bin/mpirun
>> -mca btl sm,self -np 2 examples/ring_c
>> --------------------------------------------------------------------------
>> Open MPI tried to bind a new process, but something went wrong. The
>> process was killed without launching the target application. Your job
>> will now abort.
>>
>> Local host: niagara1
>> Application name: examples/ring_c
>> Error message: hwloc indicates cpu binding cannot be enforced
>> Location:
>>
>> /home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/openmpi-1.7.4rc1/orte/mca/odls/default/odls_default_module.c:478
>> --------------------------------------------------------------------------
>> 2 total processes failed to start
>>
>>
>> I am assuming I just need some magic pixie dust to disable cpu binding.
>> I'd appreciate some corresponding instructions.
>>
>> However, if this is NOT an expected/desired/known behavior please let me
>> know what I can/should do to help determine the root cause.
>>
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900