Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
Hello Larry, thank you for this helpful tip! Regards and have a nice day, Lukas > From: Larry Baker > To: Open MPI Developers > Cc: Lukas Razik; Roland Dreier > Sent: 20:10 Wednesday, 23 November 2011 > Subject: Re: [OMPI devel] [

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
TERRY DONTJE wrote: >>>Can you build OMPI as a 32 bit library and see if that works any better? >>So you mean I shall leave the whole OFED stack as 64 bit and build only >>openmpi as 32 bit? >I believe the OFED user libraries will need to be 32 bit also or the 32 bit >MPI libraries will not be a

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
TERRY DONTJE wrote: > Can you build OMPI as a 32 bit library and see if that works any better? So you mean I should leave the whole OFED stack as 64 bit and build only openmpi as 32 bit? How must I configure openmpi so that it is definitely built as 32 bit? Regards, Lukas

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
Lukas Razik wrote: > TERRY DONTJE wrote: >> Nuts!!! Ok I am going to have to think about this a little more.  Do you have the ability to configure and remake your ompi install? I might want to have you add some stuff to help me track this down some more if you can

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
TERRY DONTJE wrote: > Nuts!!! Ok I am going to have to think about this a little more.  Do you have the ability to configure and remake your ompi install? I might want to have you add some stuff to help me track this down some more if you can recompile your ompi. As I wrote you I've alr

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
TERRY DONTJE wrote: >Can you try running the benchmark with coalescing off?  To do that add the following option to your mpirun line "-mca btl_openib_use_message_coalescing 0". I've tried this: # /usr/mpi/gcc/openmpi-1.4.4/bin/mpirun -np 2   --mca btl_openib_use_message_coalescing 0   --

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread Lukas Razik
TERRY DONTJE wrote: > On 11/22/2011 6:59 PM, Lukas Razik wrote: >> Roland Dreier wrote: >>> On Tue, Nov 22, 2011 at 3:05 PM, Lukas Razik wrote: >>>> #0  0xf8010229ba9c in mca_pml_ob1_send_request_start_copy (sendreq=0xb23200, bm

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-22 Thread Lukas Razik
Roland Dreier wrote: > On Tue, Nov 22, 2011 at 3:05 PM, Lukas Razik wrote: >> #0  0xf8010229ba9c in mca_pml_ob1_send_request_start_copy (sendreq=0xb23200, bml_btl=0xb29050, size=0) at pml_ob1_sendreq.c:551 >> 551 hdr->hdr_match.hdr_ctx = sendreq->

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-22 Thread Lukas Razik
TERRY DONTJE wrote: > On 11/22/2011 5:49 AM, TERRY DONTJE wrote: > The error you are seeing is usually indicative of some code operating on memory that isn't aligned properly for a SPARC instruction being used.  The address that is causing the failure is odd-aligned, which is more than likely t
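Editor's note: the alignment explanation in the message above can be illustrated with a small C sketch. It shows one common way to read and write a multi-byte value at an address that may be odd: copy it through `memcpy` rather than dereferencing a cast pointer, which on strict-alignment CPUs such as SPARC can raise a bus error. The function names and the 16-bit field width are assumptions for illustration only; this is not the Open MPI code and not necessarily the fix adopted in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 16-bit value from an address that may be
 * misaligned. memcpy copies byte by byte in the abstract machine, so the
 * compiler never emits a load that requires 2-byte alignment. */
uint16_t load_u16_unaligned(const unsigned char *p)
{
    uint16_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: write a 16-bit value to a possibly misaligned
 * address, again going through memcpy instead of *(uint16_t *)p = v. */
void store_u16_unaligned(unsigned char *p, uint16_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

Calling `store_u16_unaligned(buf + 1, 0x1234)` on a byte buffer and reading the value back with `load_u16_unaligned(buf + 1)` works on any alignment, whereas a direct `*(uint16_t *)(buf + 1)` access is undefined behaviour in C and may trap on SPARC.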

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-22 Thread Lukas Razik
Roland Dreier wrote: > On Mon, Nov 21, 2011 at 5:51 PM, Lukas Razik wrote: >> [cluster1:64027] Signal code: Invalid address alignment (1) >> [cluster1:64027] Failing at address: 0xaa9053 >> [cluster1:64027] [ 0] /usr/mpi/gcc/openmpi-1.4.3/lib64/ope
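Editor's note: the failing address quoted above, 0xaa9053, ends in an odd hex digit, so it cannot satisfy any alignment larger than one byte. A minimal sketch of that check follows; `is_aligned` is a hypothetical helper written for this note and is not part of Open MPI or OFED.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of alignment,
 * 0 otherwise. alignment must be a power of two, which lets the test
 * be done with a bit mask instead of a division. */
int is_aligned(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}
```

With this helper, `is_aligned(0xaa9053, 2)` is false, so any 2-, 4-, or 8-byte access at that address is misaligned, which is consistent with the "Invalid address alignment" signal code in the log.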

[OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-21 Thread Lukas Razik
Hello everybody! I have Sun T5120 (SPARC64) servers with - Debian: 6.0.3 - linux-2.6.39.4 (from kernel.org) - OFED-1.5.3.2 - InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0) with the newest firmware (2.9.1), and the following issue: If I try to mpirun a p