Ralph and Jeff,

Thanks for all the rapid fixes.
I'll send openmpi-1.7.4rc2r30031 for a spin while I go wait in line at the
Post Office.

-Paul


On Fri, Dec 20, 2013 at 11:45 AM, Ralph Castain <r...@open-mpi.org> wrote:

> Hi Paul
>
> The binding stuff was in there, but the limit protection code just went in
> today. Jeff has since regenerated the tarball for the web site, so the one
> up there should have most (if not all) of these problems fixed.
>
> Have a great holiday!
> Ralph
>
>
> On Dec 20, 2013, at 11:40 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Ralph,
>
> I see the same behavior w/ last night's 1.7 tarball
> (openmpi-1.7.4rc2r30002).
> The very next commit, r30003, is your addition (on trunk) of guards for
> RLIMIT_AS, etc.
> So, I DON'T think any fix for this behavior is in the 1.7 branch as you
> thought (maybe just CMR'ed?)
>
> Let me know if there is additional information about the platform or error
> which I should collect.
>
> -Paul
>
> P.S.
> You may see my email vacation auto-responder message.
> My vacation has started (no *paid* work) but I am still reading email
> today.
> I plan to re-test tonight's 1.7 tarball on all the systems where I
> reported issues on Thu night.
>
>
> On Thu, Dec 19, 2013 at 7:19 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> I believe this one has already been fixed and is in the nightly
>> (1.7.4rc2) - for now, you can just set "--bind-to none" on the cmd line to
>> get past it
>>
>>
>> On Dec 19, 2013, at 6:42 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> Testing with Solaris 10 on SPARC, I was expecting to encounter the bus
>> error reported previously by Siegmar Gross.  Instead I see the following
>> hwloc-related abort:
>>
>> $ env \
>>     PATH=/home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/bin:$PATH \
>>     LD_LIBRARY_PATH_64=/home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/lib:$LD_LIBRARY_PATH_64 \
>>     OMPI_MCA_shmem_mmap_enable_nfs_warning=0 \
>>     /home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/bin/mpirun \
>>     -mca btl sm,self -np 2 examples/ring_c
>> --------------------------------------------------------------------------
>> Open MPI tried to bind a new process, but something went wrong.  The
>> process was killed without launching the target application.  Your job
>> will now abort.
>>
>>   Local host:        niagara1
>>   Application name:  examples/ring_c
>>   Error message:     hwloc indicates cpu binding cannot be enforced
>>   Location:          /home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/openmpi-1.7.4rc1/orte/mca/odls/default/odls_default_module.c:478
>> --------------------------------------------------------------------------
>> 2 total processes failed to start
>>
>>
>> I am assuming I just need some magic pixie dust to disable cpu binding.
>> I'd appreciate some corresponding instructions.
>>
>> However, if this is NOT an expected/desired/known behavior please let me
>> know what I can/should do to help determine the root cause.
>>
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>  _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>
>



-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
