Paul and all,

Before r32408, the environment/abort test from the ibm test suite
crashed with SIGSEGV.

The crash no longer occurs after the fix :-)

That being said, I am seeing intermittent hangs on my VM:
--mca btl tcp,self => no hang
--mca btl sm,self or --mca btl vader,self => hangs about 25% of the time
--mca btl scif,self => always hangs

When this happens, only the mpirun process remains, and it is hung.

I will try to debug this, and I welcome any help!

Cheers,

Gilles

On 2014/08/04 11:57, Gilles Gouaillardet wrote:
> Paul,
>
> I confirm the ampersand was missing and this was a bug
> /* a similar bug was fixed by Ralph in r32357 */
>
> I committed r32408 to fix these three bugs.
>
> I also took the liberty of replacing the OMPI_CAST_RTE_NAME macro
> with an inline function (only in debug mode) in order to get a
> compiler warning on both 32-bit and 64-bit architectures in this case:
>
> #if OPAL_ENABLE_DEBUG
> static inline orte_process_name_t *
> OMPI_CAST_RTE_NAME(opal_process_name_t * name);
> #else
> #define OMPI_CAST_RTE_NAME(a) ((orte_process_name_t*)(a))
> #endif
>
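To illustrate why the inline function helps, here is a minimal sketch. The type definitions are simplified stand-ins (assumption: opal_process_name_t is a 64-bit integer, as Ralph's comment further down suggests), and the names CAST_RTE_NAME_MACRO, cast_rte_name_checked and forms_agree are hypothetical, not part of Open MPI:

```c
#include <stdint.h>

/* Simplified stand-ins for the real Open MPI types. Assumption: the OPAL
 * name is a plain 64-bit integer and the ORTE name is a struct of two
 * 32-bit fields. The real definitions may differ. */
typedef uint64_t opal_process_name_t;

typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} orte_process_name_t;

/* Macro form: the cast accepts any scalar argument. If the caller forgets
 * the ampersand and passes the 64-bit value itself, gcc only warns where
 * pointers are narrower than 64 bits ("cast to pointer from integer of
 * different size"). */
#define CAST_RTE_NAME_MACRO(a) ((orte_process_name_t *)(a))

/* Inline-function form: the parameter is typed, so passing an integer
 * instead of a pointer is diagnosed whatever the pointer width. */
static inline orte_process_name_t *
cast_rte_name_checked(opal_process_name_t *name)
{
    return (orte_process_name_t *)name;
}

/* Hypothetical helper: with correct usage (an address is passed), both
 * forms return the same pointer. Only addresses are compared here. */
static inline int forms_agree(opal_process_name_t *name)
{
    return CAST_RTE_NAME_MACRO(name) == cast_rte_name_checked(name);
}
```

A call such as CAST_RTE_NAME_MACRO(name) with the ampersand missing compiles without any diagnostic on a 64-bit build, whereas cast_rte_name_checked(name) should draw a complaint from the compiler on both 32-bit and 64-bit builds.
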
> Cheers,
>
> Gilles
>
> On 2014/08/03 14:49, Gilles GOUAILLARDET wrote:
>> Paul,
>>
>> IMHO, the root cause is a missing ampersand.
>>
>> I will only be able to double-check this tomorrow.
>>
>> Cheers,
>>
>> Gilles
>>
>> Ralph Castain <r...@open-mpi.org> wrote:
>>> Arg - that raises an interesting point. This is a pointer to a 64-bit 
>>> number. Will uintptr_t resolve that problem on such platforms?
>>>
>>>
>>> On Aug 2, 2014, at 8:12 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>>
>>>
>>> Looks like on a 32-bit platform a (uintptr_t) cast is desired in the 
>>> OMPI_CAST_RTE_NAME() macro.
>>>
>>>
>>> Warnings from the current trunk tarball attributable to the missing cast
>>> include:
>>>
>>>
>>> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:89: warning: cast to pointer from integer of different size
>>>
>>> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:97: warning: cast to pointer from integer of different size
>>>
>>> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/mca/pml/bfo/pml_bfo_failover.c:1417: warning: cast to pointer from integer of different size
>>>
>>>
>>> -Paul
>>>
>>>
>>> -- 
>>>
>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>>
>>> Future Technologies Group
>>>
>>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>>>
>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/devel/2014/08/15481.php
>>>
>
