Re: [OMPI users] OMPI 3.0.0 crashing at mpi_init on OS X using Fortran [FIXED]

2017-12-14 Thread Ricardo Fonseca
Hi guys

I figured out what the problem was: my code uses the HDF5 library, and the 
version I had installed was compiled with parallel I/O support and linked 
against OMPI 2.1.1. My own code was only being linked with OMPI 3.0.0, so the 
two MPI versions ended up mixed in the same executable; something in there got 
confused and it led to the crash.

Recompiling HDF5 with OMPI 3.0.0 fixed the whole thing.

Sorry about bothering you with this,

Thanks again for your help,
Ricardo
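
For anyone chasing a similar crash, here is a rough sketch of how this kind of mismatch might be spotted before it bites. It assumes macOS, where `otool -L` lists the dynamic libraries a binary references, and it assumes the Open MPI libraries carry versioned names such as the `libmpi.40.dylib` seen in the backtrace. The helper names (`mpi_libs`, `find_mpi_mismatch`, `run_otool`) are made up for illustration and are not part of Open MPI or HDF5.

```python
import re
import subprocess

# Versioned Open MPI dylib names, e.g. "libmpi.40.dylib" or "libmpi_mpifh.40.dylib".
_MPI_DYLIB = re.compile(r"(libmpi\w*)\.(\d+)\.dylib")


def mpi_libs(otool_output):
    """Return {(base name, install path)} for each libmpi* entry in `otool -L` output."""
    found = set()
    for line in otool_output.splitlines():
        # Each dependency line looks like "<path> (compatibility version ...)".
        path = line.strip().split(" (", 1)[0]
        match = _MPI_DYLIB.fullmatch(path.rsplit("/", 1)[-1])
        if match:
            found.add((match.group(1), path))
    return found


def find_mpi_mismatch(outputs):
    """outputs maps a binary's name to its `otool -L` text.

    Returns {base name: {install path: [binaries]}} for every MPI library
    that is referenced under more than one install path.
    """
    seen = {}
    for binary, text in outputs.items():
        for name, path in sorted(mpi_libs(text)):
            seen.setdefault(name, {}).setdefault(path, []).append(binary)
    return {name: paths for name, paths in seen.items() if len(paths) > 1}


def run_otool(path):
    """Run `otool -L` (macOS only) on one file and return its output."""
    return subprocess.run(["otool", "-L", path], capture_output=True,
                          text=True, check=True).stdout
```

A call along the lines of `find_mpi_mismatch({p: run_otool(p) for p in ("./osiris.e", "/path/to/libhdf5.dylib")})` (both paths are placeholders) would return a non-empty dictionary if the executable and the HDF5 library point at different `libmpi` builds, which is the situation described above.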



—
Ricardo Fonseca
 
Full Professor | Professor Catedrático
GoLP - Grupo de Lasers e Plasmas
Instituto de Plasmas e Fusão Nuclear
Instituto Superior Técnico
Av. Rovisco Pais
1049-001 Lisboa
Portugal
 
tel: +351 21 8419202
web: http://epp.tecnico.ulisboa.pt/


Re: [OMPI users] OMPI 3.0.0 crashing at mpi_init on OS X using Fortran

2017-12-12 Thread Jeff Squyres (jsquyres)
I am unable to reproduce your error with Open MPI v3.0.0 on the latest stable 
MacOS High Sierra.

Given that you're failing in MPI_INIT, it feels like the application shouldn't 
matter.  But regardless, can you test with the trivial Fortran test programs in 
the examples/ directory in the Open MPI tarball?





-- 
Jeff Squyres
jsquy...@cisco.com



___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] OMPI 3.0.0 crashing at mpi_init on OS X using Fortran

2017-12-11 Thread r...@open-mpi.org
FWIW: I just cloned the v3.0.x branch to get the latest 3.0.1 release 
candidate, built and ran it on Mac OSX High Sierra. Everything built and ran 
fine for both C and Fortran codes.

You might want to test the same - could be this was already fixed.



[OMPI users] OMPI 3.0.0 crashing at mpi_init on OS X using Fortran

2017-12-11 Thread Ricardo Parreira de Azambuja Fonseca

Hi guys

I’m having problems with a Fortran-based code that I develop, when using 
OpenMPI 3.0.0 on Mac OS X. The problem shows up with both the gfortran and 
Intel ifort compilers; the same code runs perfectly with version 2.1.2 (and 
earlier versions).


Launching the code, even without using mpiexec, causes a segfault when 
my code calls mpi_init():


Program received signal SIGSEGV: Segmentation fault - invalid memory 
reference.


Backtrace for this error:
#0  0x1107a41fc
(…)
#10  0x10f86eff1
Segmentation fault: 11

Recompiling OpenMPI with --enable-debug, and launching the code through 
lldb, gives:


(lldb) run
Process 65169 launched: '../source/build/osiris.e' (x86_64)
Process 65169 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x48)
    frame #0: 0x000100fbe79a libmpi.40.dylib`ompi_hook_base_mpi_init_top_post_opal(argc=0, argv=0x, requested=0, provided=0x7ffeefbfe290) at hook_base.c:278
   275
   276  void ompi_hook_base_mpi_init_top_post_opal(int argc, char **argv, int requested, int *provided)
   277  {
-> 278      HOOK_CALL_COMMON( mpi_init_top_post_opal, argc, argv, requested, provided);
   279  }
   280
   281  void ompi_hook_base_mpi_init_bottom(int argc, char **argv, int requested, int *provided)
Target 0: (osiris.e) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x48)
  * frame #0: 0x000100fbe79a libmpi.40.dylib`ompi_hook_base_mpi_init_top_post_opal(argc=0, argv=0x, requested=0, provided=0x7ffeefbfe290) at hook_base.c:278
    frame #1: 0x000100dce0ff libmpi.40.dylib`ompi_mpi_init(argc=0, argv=0x, requested=0, provided=0x7ffeefbfe290) at ompi_mpi_init.c:486
    frame #2: 0x000100eb3f38 libmpi.40.dylib`PMPI_Init(argc=0x7ffeefbfe2d0, argv=0x7ffeefbfe2c8) at pinit.c:66
    frame #3: 0x000100cceb0b libmpi_mpifh.40.dylib`ompi_init_f(ierr=0x7ffeefbfe9f8) at init_f.c:84
    frame #4: 0x000100ccead5 libmpi_mpifh.40.dylib`mpi_init_(ierr=0x7ffeefbfe9f8) at init_f.c:65
    frame #5: 0x00014e5a osiris.e`__m_system_MOD_system_init at os-sys-multi.f03:323
    frame #6: 0x00010036edb5 osiris.e`MAIN__ at os-main.f03:36
    frame #7: 0x00010039eff2 osiris.e`main at memory.h:19
    frame #8: 0x7fff6ee7d115 libdyld.dylib`start + 1

Any thoughts?

Thanks in advance,
Ricardo
