Re: [OMPI devel] 1.7.5 fails on simple test

2014-02-10 Thread Paul Hargrove
All the platforms that failed over the weekend have passed today.

-Paul


Re: [OMPI devel] 1.7.5 fails on simple test

2014-02-10 Thread Paul Hargrove
The fastest of my systems that failed over the weekend (a ppc64) has
completed tests successfully.
I will report on the ppc32 and SPARC results when they have all passed or
failed.

-Paul


Re: [OMPI devel] 1.7.5 fails on simple test

2014-02-10 Thread Ralph Castain
Tarball is now posted

On Feb 10, 2014, at 1:31 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Generating it now - sorry for my lack of response, my OMPI email was down for 
> some reason. I can now receive it, but still haven't gotten the backlog from 
> the down period.
> 
> 
> On Feb 10, 2014, at 1:23 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> 
>> Ralph,
>> 
>> If you give me a heads-up when this makes it into a tarball, I will retest 
>> my failing ppc and sparc platforms.
>> 
>> -Paul
>> 
>> 
>> On Mon, Feb 10, 2014 at 1:13 PM, Rolf vandeVaart <rvandeva...@nvidia.com> 
>> wrote:
>> I have tracked this down.  There is a missing commit that affects 
>> ompi_mpi_init.c causing it to initialize bml twice.
>> 
>> Ralph, can you apply r30310 to 1.7?
>> 
>>  
>> 
>> Thanks,
>> 
>> Rolf
>> 
>>  
>> 
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Rolf vandeVaart
>> Sent: Monday, February 10, 2014 12:29 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] 1.7.5 fails on simple test
>> 
>>  
>> 
>> I have seen this same issue although my core dump is a little bit different. 
>>  I am running with tcp,self.  The first entry in the list of BTLs is 
>> garbage, but then there is tcp and self in the list.   Strange.  This is my 
>> core dump.  Line 208 in bml_r2.c is where I get the SEGV.
>> 
>>  
>> 
>> Program terminated with signal 11, Segmentation fault.
>> 
>> #0  0x7fb6dec981d0 in ?? ()
>> 
>> Missing separate debuginfos, use: debuginfo-install 
>> glibc-2.12-1.107.el6_4.5.x86_64
>> 
>> (gdb) where
>> 
>> #0  0x7fb6dec981d0 in ?? ()
>> 
>> #1  
>> 
>> #2  0x7fb6e82fff38 in main_arena () from /lib64/libc.so.6
>> 
>> #3  0x7fb6e4103de2 in mca_bml_r2_add_procs (nprocs=2, procs=0x2061440, 
>> reachable=0x7fff80487b40)
>> 
>> at ../../../../../ompi/mca/bml/r2/bml_r2.c:208
>> 
>> #4  0x7fb6df50a751 in mca_pml_ob1_add_procs (procs=0x2060bc0, nprocs=2)
>> 
>> at ../../../../../ompi/mca/pml/ob1/pml_ob1.c:332
>> 
>> #5  0x7fb6e8570dca in ompi_mpi_init (argc=1, argv=0x7fff80488158, 
>> requested=0, provided=0x7fff80487cc8)
>> 
>> at ../../ompi/runtime/ompi_mpi_init.c:776
>> 
>> #6  0x7fb6e85a3606 in PMPI_Init (argc=0x7fff80487d8c, 
>> argv=0x7fff80487d80) at pinit.c:84
>> 
>> #7  0x00401c56 in main (argc=1, argv=0x7fff80488158) at 
>> MPI_Isend_ator_c.c:143
>> 
>> (gdb)
>> 
>> #3  0x7fb6e4103de2 in mca_bml_r2_add_procs (nprocs=2, procs=0x2061440, 
>> reachable=0x7fff80487b40)
>> 
>> at ../../../../../ompi/mca/bml/r2/bml_r2.c:208
>> 
>> 208 rc = btl->btl_add_procs(btl, n_new_procs, new_procs, 
>> btl_endpoints, reachable);
>> 
>> (gdb) print *btl
>> 
>> $1 = {btl_component = 0x7fb6e82ffee8, btl_eager_limit = 140423556234984, 
>> btl_rndv_eager_limit = 140423556235000,
>> 
>>   btl_max_send_size = 140423556235000, btl_rdma_pipeline_send_length = 
>> 140423556235016,
>> 
>>   btl_rdma_pipeline_frag_size = 140423556235016, btl_min_rdma_pipeline_size 
>> = 140423556235032,
>> 
>>   btl_exclusivity = 3895459608, btl_latency = 32694, btl_bandwidth = 
>> 3895459624, btl_flags = 32694,
>> 
>>   btl_seg_size = 140423556235048, btl_add_procs = 0x7fb6e82fff38 
>> <main_arena+184>,
>> 
>>   btl_del_procs = 0x7fb6e82fff38 <main_arena+184>, btl_register = 
>> 0x7fb6e82fff48 <main_arena+200>,
>> 
>>   btl_finalize = 0x7fb6e82fff48 <main_arena+200>, btl_alloc = 0x7fb6e82fff58 
>> <main_arena+216>,
>> 
>>   btl_free = 0x7fb6e82fff58 <main_arena+216>, btl_prepare_src = 
>> 0x7fb6e82fff68 <main_arena+232>,
>> 
>>   btl_prepare_dst = 0x7fb6e82fff68 <main_arena+232>, btl_send = 
>> 0x7fb6e82fff78 <main_arena+248>,
>> 
>>   btl_sendi = 0x7fb6e82fff78 <main_arena+248>, btl_put = 0x7fb6e82fff88 
>> <main_arena+264>,
>> 
>>   btl_get = 0x7fb6e82fff88 <main_arena+264>, btl_dump = 0x7fb6e82fff98 
>> <main_arena+280>,
>> 
>>   btl_mpool = 0x7fb6e82fff98, btl_register_error = 0x7fb6e82fffa8 
>> <main_arena+296>,
>> 
>>   btl_ft_event = 0x7fb6e82fffa8 <main_arena+296>}
>> 
>> (gdb)
>> 
>>  
>> 
>>  
>> 
>> From: devel [mailto:devel-boun...@open-mpi.org] On Be

Re: [OMPI devel] 1.7.5 fails on simple test

2014-02-10 Thread Ralph Castain
Generating it now - sorry for my lack of response; my OMPI email was down for
some reason. I can receive mail again now, but I still haven't gotten the backlog
from the down period.


Re: [OMPI devel] 1.7.5 fails on simple test

2014-02-10 Thread Paul Hargrove
Ralph,

If you give me a heads-up when this makes it into a tarball, I will retest
my failing ppc and sparc platforms.

-Paul


Re: [OMPI devel] 1.7.5 fails on simple test

2014-02-10 Thread Ralph Castain
Done - thanks Rolf!!


Re: [OMPI devel] 1.7.5 fails on simple test

2014-02-10 Thread Rolf vandeVaart
I have tracked this down.  There is a missing commit that affects
ompi_mpi_init.c, causing it to initialize the BML twice.
Ralph, can you apply r30310 to 1.7?

Thanks,
Rolf
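
For context, the classic defence against this kind of double initialization is to
make the framework init idempotent, so a stray second call from ompi_mpi_init.c
becomes a harmless no-op instead of leaving a stale entry in the BTL list. Whether
r30310 adds such a guard or simply removes the duplicate call is not shown here;
the sketch below uses made-up names (it is not the real mca_bml_base_init()) and
only illustrates the general pattern:

#include <stdbool.h>
#include <stdio.h>

/* Illustration only: an init routine made idempotent so that an accidental
 * second call (e.g. from ompi_mpi_init) leaves existing state untouched.
 * The names are invented; this is not the actual Open MPI BML code. */
static bool bml_initialized = false;

static int example_bml_init(void)
{
    if (bml_initialized) {
        /* Second call: the BTL module list is already built, do nothing. */
        return 0;
    }
    printf("opening BTL components and building the module list\n");
    bml_initialized = true;
    return 0;
}

int main(void)
{
    example_bml_init();   /* normal path during MPI_Init */
    example_bml_init();   /* a duplicate call is now harmless */
    return 0;
}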


Re: [OMPI devel] 1.7.5 fails on simple test

2014-02-10 Thread Rolf vandeVaart
I have seen this same issue, although my core dump is a little different.  I am
running with tcp,self.  The first entry in the list of BTLs is garbage, but tcp
and self do appear later in the list.  Strange.  This is my core dump; line 208
in bml_r2.c is where I get the SEGV.

Program terminated with signal 11, Segmentation fault.
#0  0x7fb6dec981d0 in ?? ()
Missing separate debuginfos, use: debuginfo-install 
glibc-2.12-1.107.el6_4.5.x86_64
(gdb) where
#0  0x7fb6dec981d0 in ?? ()
#1  
#2  0x7fb6e82fff38 in main_arena () from /lib64/libc.so.6
#3  0x7fb6e4103de2 in mca_bml_r2_add_procs (nprocs=2, procs=0x2061440, 
reachable=0x7fff80487b40)
at ../../../../../ompi/mca/bml/r2/bml_r2.c:208
#4  0x7fb6df50a751 in mca_pml_ob1_add_procs (procs=0x2060bc0, nprocs=2)
at ../../../../../ompi/mca/pml/ob1/pml_ob1.c:332
#5  0x7fb6e8570dca in ompi_mpi_init (argc=1, argv=0x7fff80488158, 
requested=0, provided=0x7fff80487cc8)
at ../../ompi/runtime/ompi_mpi_init.c:776
#6  0x7fb6e85a3606 in PMPI_Init (argc=0x7fff80487d8c, argv=0x7fff80487d80) 
at pinit.c:84
#7  0x00401c56 in main (argc=1, argv=0x7fff80488158) at 
MPI_Isend_ator_c.c:143
(gdb)
#3  0x7fb6e4103de2 in mca_bml_r2_add_procs (nprocs=2, procs=0x2061440, 
reachable=0x7fff80487b40)
at ../../../../../ompi/mca/bml/r2/bml_r2.c:208
208 rc = btl->btl_add_procs(btl, n_new_procs, new_procs, 
btl_endpoints, reachable);
(gdb) print *btl
$1 = {btl_component = 0x7fb6e82ffee8, btl_eager_limit = 140423556234984, 
btl_rndv_eager_limit = 140423556235000,
  btl_max_send_size = 140423556235000, btl_rdma_pipeline_send_length = 
140423556235016,
  btl_rdma_pipeline_frag_size = 140423556235016, btl_min_rdma_pipeline_size = 
140423556235032,
  btl_exclusivity = 3895459608, btl_latency = 32694, btl_bandwidth = 
3895459624, btl_flags = 32694,
  btl_seg_size = 140423556235048, btl_add_procs = 0x7fb6e82fff38 
<main_arena+184>,
  btl_del_procs = 0x7fb6e82fff38 <main_arena+184>, btl_register = 
0x7fb6e82fff48 <main_arena+200>,
  btl_finalize = 0x7fb6e82fff48 <main_arena+200>, btl_alloc = 0x7fb6e82fff58 
<main_arena+216>,
  btl_free = 0x7fb6e82fff58 <main_arena+216>, btl_prepare_src = 0x7fb6e82fff68 
<main_arena+232>,
  btl_prepare_dst = 0x7fb6e82fff68 <main_arena+232>, btl_send = 0x7fb6e82fff78 
<main_arena+248>,
  btl_sendi = 0x7fb6e82fff78 <main_arena+248>, btl_put = 0x7fb6e82fff88 
<main_arena+264>,
  btl_get = 0x7fb6e82fff88 <main_arena+264>, btl_dump = 0x7fb6e82fff98 
<main_arena+280>,
  btl_mpool = 0x7fb6e82fff98, btl_register_error = 0x7fb6e82fffa8 
<main_arena+296>,
  btl_ft_event = 0x7fb6e82fffa8 <main_arena+296>}
(gdb)
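
Line 208, quoted above, is an indirect call through the module's btl_add_procs
pointer, and every function pointer in the struct that gdb printed points into
glibc's main_arena - i.e. the first entry in the BTL list is heap garbage rather
than a live module. A small standalone sketch of that failure mode (simplified
stand-in types, not the actual bml_r2.c code):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for mca_btl_base_module_t and the BML's module list;
 * this models the failure mode, it is NOT the real Open MPI code. */
typedef struct btl_module {
    const char *name;
    int (*btl_add_procs)(struct btl_module *btl, size_t nprocs);
} btl_module_t;

static int dummy_add_procs(btl_module_t *btl, size_t nprocs)
{
    printf("%s: adding %zu procs\n", btl->name, nprocs);
    return 0;
}

int main(void)
{
    /* Stand-in for the garbage first entry: its bytes are junk, so its
     * function pointer is not code at all (in the core above it holds
     * addresses inside glibc's main_arena). */
    btl_module_t *stale = malloc(sizeof(*stale));
    memset(stale, 0x41, sizeof(*stale));

    btl_module_t tcp  = { "tcp",  dummy_add_procs };
    btl_module_t self = { "self", dummy_add_procs };
    btl_module_t *btl_list[] = { stale, &tcp, &self };

    for (size_t i = 0; i < sizeof(btl_list) / sizeof(btl_list[0]); ++i) {
        btl_module_t *btl = btl_list[i];
        /* The moral equivalent of bml_r2.c:208: an indirect call through
         * the module's add_procs pointer.  The garbage entry sends
         * execution to a non-code address and the process takes SIGSEGV
         * before the valid tcp/self entries are ever reached. */
        btl->btl_add_procs(btl, 2);
    }

    free(stale);
    return 0;
}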


[OMPI devel] 1.7.5 fails on simple test

2014-02-10 Thread Mike Dubman
$/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/bin/mpirun
-np 8 -mca pml ob1 -mca btl self,tcp
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi
[vegas12:12724] *** Process received signal ***
[vegas12:12724] Signal: Segmentation fault (11)
[vegas12:12724] Signal code:  (128)
[vegas12:12724] Failing at address: (nil)
[vegas12:12724] [ 0] /lib64/libpthread.so.0[0x3937c0f500]
[vegas12:12724] [ 1]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_btl_tcp.so(mca_btl_tcp_component_init+0x583)[0x7395f813]
[vegas12:12724] [ 2]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_btl_base_select+0x117)[0x778e14a7]
[vegas12:12724] [ 3]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_bml_r2.so(mca_bml_r2_component_init+0x12)[0x73ded6f2]
[vegas12:12724] [ 4]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_bml_base_init+0x99)[0x778e0cc9]
[vegas12:12724] [ 5]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_pml_ob1.so(+0x51d8)[0x737481d8]
[vegas12:12724] [ 6]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_pml_base_select+0x1e0)[0x778f31e0]
[vegas12:12724] [ 7]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(ompi_mpi_init+0x52b)[0x778bffdb]
[vegas12:12724] [ 8]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(MPI_Init+0x170)[0x778d4210]
[vegas12:12724] [ 9]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi_mpifh.so.2(PMPI_Init_f08+0x25)[0x77b71c25]
[vegas12:12724] [10]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400c0b]
[vegas12:12724] [11]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400d4a]
[vegas12:12724] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x393741ecdd]
[vegas12:12724] [13]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400b29]
[vegas12:12724] *** End of error message ***
[vegas12:12731] *** Process received signal ***
[vegas12:12731] Signal: Segmentation fault (11)
[vegas12:12731] Signal code:  (128)
[vegas12:12731] Failing at address: (nil)
[vegas12:12731] [ 0] /lib64/libpthread.so.0[0x3937c0f500]
[vegas12:12731] [ 1]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_btl_tcp.so(mca_btl_tcp_component_init+0x583)[0x7395f813]
[vegas12:12731] [ 2]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_btl_base_select+0x117)[0x778e14a7]
[vegas12:12731] [ 3]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_bml_r2.so(mca_bml_r2_component_init+0x12)[0x73ded6f2]
[vegas12:12731] [ 4]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_bml_base_init+0x99)[0x778e0cc9]
[vegas12:12731] [ 5]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/openmpi/mca_pml_ob1.so(+0x51d8)[0x737481d8]
[vegas12:12731] [ 6]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(mca_pml_base_select+0x1e0)[0x778f31e0]
[vegas12:12731] [ 7]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(ompi_mpi_init+0x52b)[0x778bffdb]
[vegas12:12731] [ 8]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi.so.1(MPI_Init+0x170)[0x778d4210]
[vegas12:12731] [ 9]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/lib/libmpi_mpifh.so.2(PMPI_Init_f08+0x25)[0x77b71c25]
[vegas12:12731] [10]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400c0b]
[vegas12:12731] [11]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400d4a]
[vegas12:12731] [12] /lib64/libc.so.6(__libc_start_main+0xfd)[0x393741ecdd]
[vegas12:12731] [13]
/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/ompi_install1/examples/hello_usempi[0x400b29]
[vegas12:12731] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 12724 on node vegas12
exited on signal 11 (Segmentation fault).
--
jenkins@vegas12 ~
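
The crash is inside MPI initialization itself (frames 7-9 above: ompi_mpi_init,
MPI_Init, PMPI_Init_f08), so any trivial MPI program built against this install
should hit it in its first MPI call. For reference, a minimal C equivalent of the
hello_usempi example:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    /* The segfault reported above happens inside this call. */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world, I am %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

Built with mpicc and launched with the same flags, e.g.
mpirun -np 8 -mca pml ob1 -mca btl self,tcp ./hello_c, it should fail the same
way until the fix is in.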