Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-12-06 Thread Eugene Loh
On 11/21/11 20:51, Lukas Razik wrote: Hello everybody! I've Sun T5120 (SPARC64) Servers with - Debian: 6.0.3 - linux-2.6.39.4 (from kernel.org) - OFED-1.5.3.2 - InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0) with newest FW (2.9.1) and

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
urestorage.com> >Sent: 20:10 Wednesday, 23 November 2011 >Subject: Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10) > > >I think of -m32 (and -m64) as really selecting a different compiler. My practice is to put those flags in

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
TERRY DONTJE wrote: >>>Can you build OMPI as a 32 bit library and see if that works any better? >>So you mean I shall leave the whole OFED stack as 64 bit and build only >>openmpi as 32 bit? >I believe the OFED user libraries will need to be 32 bit also or the 32 bit

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Larry Baker
I think of -m32 (and -m64) as really selecting a different compiler. My practice is to put those flags in the compiler/linker environment variables. For example: # ./configure >configure.log 2>&1 \ --prefix=/usr/local/openmpi --with-sge \ CC="gcc -m32" \ CFLAGS="-g -O3" \
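
One way to confirm which ABI a given compiler setting actually produces is to compile a tiny probe with the same CC value and look at the pointer width. The following is only a sketch: the helper names pointer_bits and abi_name are hypothetical and are not part of Open MPI or OFED, and the code assumes 8-bit bytes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: width of a data pointer in bits for this build. */
static int pointer_bits(void)
{
    assert(CHAR_BIT == 8); /* assumption of this sketch: 8-bit bytes */
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Hypothetical helper: short label for the pointer width. */
static const char *abi_name(void)
{
    switch (pointer_bits()) {
    case 32: return "32-bit";
    case 64: return "64-bit";
    default: return "other";
    }
}
```

Built with CC="gcc -m32", pointer_bits() is expected to report 32, and with -m64 it should report 64. Calling abi_name() from a small main and printing the result is enough to check that the flags reached the compiler.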

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread TERRY DONTJE
On 11/23/2011 1:45 PM, Lukas Razik wrote: TERRY DONTJE wrote Can you build OMPI as a 32 bit library and see if that works any better? So you mean I shall leave the whole OFED stack as 64 bit and build only openmpi as 32 bit? I believe the OFED user libraries will

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
TERRY DONTJE wrote >Can you build OMPI as a 32 bit library and see if that works any better? So you mean I shall leave the whole OFED stack as 64 bit and build only openmpi as 32 bit? How must I configure openmpi that it'll be definitely built as 32bit? Regards, Lukas

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread TERRY DONTJE
On 11/23/2011 11:05 AM, Lukas Razik wrote: TERRY DONTJE wrote: Nuts!!! Ok I am going to have to think about this a little more. Do you have the ability to configure and remake your ompi install? I might want to have you add some stuff to help me track this down

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
Lukas Razik wrote: >TERRY DONTJE wrote: >> Nuts!!! Ok I am going to have to think about this a little more. Do you have the ability to configure and remake your ompi install? I might want to have you add some stuff to help me track this down

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
TERRY DONTJE wrote: > Nuts!!! Ok I am going to have to think about this a little more.  Do you have > the ability to configure and remake your ompi install? I might want to have > you add some stuff to help me track this down some more if you can recompile > your ompi.

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread TERRY DONTJE
On 11/23/2011 10:11 AM, Lukas Razik wrote: TERRY DONTJE wrote: Can you try running the benchmark with coalescing off? To do that add the following option to your mpirun line "-mca btl_openib_use_message_coalescing 0". I've tried this: #

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
TERRY DONTJE wrote: >Can you try running the benchmark with coalescing off?  To do that add the following option to your mpirun line "-mca btl_openib_use_message_coalescing 0". I've tried this: # /usr/mpi/gcc/openmpi-1.4.4/bin/mpirun -np 2   --mca

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread TERRY DONTJE
On 11/23/2011 9:57 AM, Lukas Razik wrote: TERRY DONTJE wrote: On 11/22/2011 6:59 PM, Lukas Razik wrote: Roland Dreier wrote: On Tue, Nov 22, 2011 at 3:05 PM, Lukas Razik wrote: #0 0xf8010229ba9c in

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
TERRY DONTJE wrote: > On 11/22/2011 6:59 PM, Lukas Razik wrote: >> Roland Dreier  wrote: >> >>> On Tue, Nov 22, 2011 at 3:05 PM, Lukas Razik > wrote:    #0  0xf8010229ba9c in mca_pml_ob1_send_request_start_copy >>>

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread TERRY DONTJE
On 11/22/2011 6:59 PM, Lukas Razik wrote: Roland Dreier wrote: On Tue, Nov 22, 2011 at 3:05 PM, Lukas Razik wrote: #0 0xf8010229ba9c in mca_pml_ob1_send_request_start_copy (sendreq=0xb23200, bml_btl=0xb29050, size=0) at pml_ob1_sendreq.c:551

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-22 Thread Lukas Razik
Roland Dreier wrote: > On Tue, Nov 22, 2011 at 3:05 PM, Lukas Razik wrote: >> #0  0xf8010229ba9c in mca_pml_ob1_send_request_start_copy > (sendreq=0xb23200, bml_btl=0xb29050, size=0) at pml_ob1_sendreq.c:551 >> 551 hdr->hdr_match.hdr_ctx

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-22 Thread Lukas Razik
TERRY DONTJE wrote: >On 11/22/2011 5:49 AM, TERRY DONTJE wrote: >The error you are seeing is usually indicative of some code operating on >memory that isn't aligned properly for a SPARC instruction being used.  The >address that is causing the failure is odd aligned

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-22 Thread Lukas Razik
Roland Dreier wrote: > > On Mon, Nov 21, 2011 at 5:51 PM, Lukas Razik wrote: >> [cluster1:64027] Signal code: Invalid address alignment (1) >> [cluster1:64027] Failing at address: 0xaa9053 >> [cluster1:64027] [ 0] >

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-22 Thread TERRY DONTJE
On 11/22/2011 5:49 AM, TERRY DONTJE wrote: The error you are seeing is usually indicative of some code operating on memory that isn't aligned properly for a SPARC instruction being used. The address that is causing the failure is odd aligned which is more than likely the culprit. If you
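
The alignment point can be illustrated with a short C sketch. It is not taken from the Open MPI sources: the helper names is_aligned, load_u16_unaligned and store_u16_unaligned are hypothetical, and the 16-bit width is chosen only as an example. The idea is that a multi-byte load or store through a misaligned pointer traps on strict-alignment CPUs such as SPARC, whereas copying the bytes with memcpy works for any address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns nonzero if addr is a multiple of align.
 * align must be a nonzero power of two. */
static int is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Reads a 16-bit value from an address that may be misaligned.
 * memcpy copies the bytes without requiring the source to be
 * aligned, so no misaligned halfword load is issued. */
static uint16_t load_u16_unaligned(const void *p)
{
    uint16_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Writes a 16-bit value to an address that may be misaligned. */
static void store_u16_unaligned(void *p, uint16_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

With these helpers, is_aligned(0xaa9053, 2) is false, which matches the odd failing address reported in the thread. A direct store such as *(uint16_t *)p = v through that kind of pointer is the sort of access that can raise a bus error on SPARC.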

[OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-21 Thread Lukas Razik
Hello everybody! I've Sun T5120 (SPARC64) Servers with - Debian: 6.0.3 - linux-2.6.39.4 (from kernel.org) - OFED-1.5.3.2 - InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0)   with newest FW (2.9.1) and the following issue: If I try to mpirun a