Oh yeah - that would indeed be very bad :-(

> On Oct 26, 2014, at 6:06 PM, Kawashima, Takahiro <t-kawash...@jp.fujitsu.com> 
> wrote:
> 
> Siegmar, Oscar,
> 
> I suspect that the problem is calling mca_base_var_register
> without initializing OPAL in JNI_OnLoad.
> 
> ompi/mpi/java/c/mpi_MPI.c:
> ----------------------------------------------------------------
> jint JNI_OnLoad(JavaVM *vm, void *reserved)
> {
>    libmpi = dlopen("libmpi." OPAL_DYN_LIB_SUFFIX, RTLD_NOW | RTLD_GLOBAL);
> 
>    if(libmpi == NULL)
>    {
>        fprintf(stderr, "Java bindings failed to load libmpi.\n");
>        exit(1);
>    }
> 
>    mca_base_var_register("ompi", "mpi", "java", "eager",
>                          "Java buffers eager size",
>                          MCA_BASE_VAR_TYPE_INT, NULL, 0, 0,
>                          OPAL_INFO_LVL_5,
>                          MCA_BASE_VAR_SCOPE_READONLY,
>                          &ompi_mpi_java_eager);
> 
>    return JNI_VERSION_1_6;
> }
> ----------------------------------------------------------------
> 
> I suppose JNI_OnLoad is the first function in libmpi_java.so that is
> called by the JVM, so OPAL is not initialized yet. As shown in
> Siegmar's JRE log, the SEGV occurred in asprintf, called from
> mca_base_var_cache_files.
> 
> Siegmar's hs_err_pid13080.log:
> ----------------------------------------------------------------
> siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), si_addr=0x0000000000000000
> 
> Stack: [0xffffffff7b400000,0xffffffff7b500000],  sp=0xffffffff7b4fc730,  free space=1009k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C  [libc.so.1+0x3c7f0]  strlen+0x50
> C  [libc.so.1+0xaf640]  vsnprintf+0x84
> C  [libc.so.1+0xaadb4]  vasprintf+0x20
> C  [libc.so.1+0xaaf04]  asprintf+0x28
> C  [libopen-pal.so.0.0.0+0xaf3cc]  mca_base_var_cache_files+0x160
> C  [libopen-pal.so.0.0.0+0xaed90]  mca_base_var_init+0x4e8
> C  [libopen-pal.so.0.0.0+0xb260c]  register_variable+0x214
> C  [libopen-pal.so.0.0.0+0xb36a0]  mca_base_var_register+0x104
> C  [libmpi_java.so.0.0.0+0x221e8]  JNI_OnLoad+0x128
> C  [libjava.so+0x10860]  Java_java_lang_ClassLoader_00024NativeLibrary_load+0xb8
> j  java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;Z)V+-665819
> j  java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;Z)V+0
> j  java.lang.ClassLoader.loadLibrary0(Ljava/lang/Class;Ljava/io/File;)Z+328
> j  java.lang.ClassLoader.loadLibrary(Ljava/lang/Class;Ljava/lang/String;Z)V+290
> j  java.lang.Runtime.loadLibrary0(Ljava/lang/Class;Ljava/lang/String;)V+54
> j  java.lang.System.loadLibrary(Ljava/lang/String;)V+7
> j  mpi.MPI.<clinit>()V+28
> ----------------------------------------------------------------
> 
> mca_base_var_cache_files passes opal_install_dirs.sysconfdir to
> asprintf.
> 
> opal/mca/base/mca_base_var.c:
> ----------------------------------------------------------------
>    asprintf(&mca_base_var_files, "%s"OPAL_PATH_SEP".openmpi" OPAL_PATH_SEP
>             "mca-params.conf%c%s" OPAL_PATH_SEP "openmpi-mca-params.conf",
>             home, OPAL_ENV_SEP, opal_install_dirs.sysconfdir);
> ----------------------------------------------------------------
> 
> In this situation, opal_install_dirs.sysconfdir is still NULL.
> 
> To confirm this, I ran an MPI Java program that only calls MPI.Init()
> and MPI.Finalize() with the MCA variable mpi_show_mca_params=1 on
> Linux. mca_base_param_files contains "(null)".
> 
> mpi_show_mca_params=1:
> ----------------------------------------------------------------
> [ppc:12232] mca_base_param_files=/home/rivis/.openmpi/mca-params.conf:(null)/openmpi-mca-params.conf (default)
> [ppc:12232] mca_param_files=/home/rivis/.openmpi/mca-params.conf:(null)/openmpi-mca-params.conf (default)
> [ppc:12232] mca_base_override_param_file=(null)/openmpi-mca-params-override.conf (default)
> [ppc:12232] mca_base_suppress_override_warning=false (default)
> [ppc:12232] mca_base_param_file_prefix= (default)
> [ppc:12232] mca_base_param_file_path=(null)/amca-param-sets:/home/rivis/src/mpisample (default)
> [ppc:12232] mca_base_param_file_path_force= (default)
> [ppc:12232] mca_base_env_list= (default)
> [ppc:12232] mca_base_env_list_delimiter=; (default)
> [ppc:12232] mpi_java_eager=65536 (default)
> (snip)
> ----------------------------------------------------------------
> 
> GNU libc prints "(null)" for asprintf(&buf, "%s", NULL), but Solaris
> libc raises SIGSEGV for it. I think this explains the difference
> between Siegmar's runs on Linux and Solaris.
> 
> I think this mca_base_var_register call should be moved elsewhere,
> or opal_init_util (or something similar) should be called before it.
> 
> Thanks,
> Takahiro
> 
>> Hi Gilles,
>> 
>> thank you very much for the quick tutorial. Unfortunately I still
>> can't get a backtrace.
>> 
>>> You might need to configure with --enable-debug and add -g -O0
>>> to your CFLAGS and LDFLAGS
>>> 
>>> Then, once you attach with gdb, you have to find the thread that is polling:
>>> thread 1
>>> bt
>>> thread 2
>>> bt
>>> and so on until you find the good thread
>>> If _dbg is a local variable, you need to select the right frame
>>> before you can change the value:
>>> get the frame number from bt (generally 1 under linux)
>>> f <frame number>
>>> set _dbg=0
>>> 
>>> I hope this helps
>> 
>> "--enable-debug" is one of my default options. Now I used the
>> following command to configure Open MPI. I always start the
>> build process in an empty directory and I always remove
>> /usr/local/openmpi-1.9.0_64_gcc, before I install a new version.
>> 
>> tyr openmpi-dev-124-g91e9686-SunOS.sparc.64_gcc 112 head config.log \
>>  | grep openmpi
>> $ ../openmpi-dev-124-g91e9686/configure
>>  --prefix=/usr/local/openmpi-1.9.0_64_gcc
>>  --libdir=/usr/local/openmpi-1.9.0_64_gcc/lib64
>>  --with-jdk-bindir=/usr/local/jdk1.8.0/bin
>>  --with-jdk-headers=/usr/local/jdk1.8.0/include
>>  JAVA_HOME=/usr/local/jdk1.8.0
>>  LDFLAGS=-m64 -g -O0 CC=gcc CXX=g++ FC=gfortran
>>  CFLAGS=-m64 -D_REENTRANT -g -O0
>>  CXXFLAGS=-m64 FCFLAGS=-m64 CPP=cpp CXXCPP=cpp
>>  CPPFLAGS=-D_REENTRANT CXXCPPFLAGS=
>>  --enable-mpi-cxx --enable-cxx-exceptions --enable-mpi-java
>>  --enable-heterogeneous --enable-mpi-thread-multiple
>>  --with-threads=posix --with-hwloc=internal --without-verbs
>>  --with-wrapper-cflags=-std=c11 -m64 --enable-debug
>> tyr openmpi-dev-124-g91e9686-SunOS.sparc.64_gcc 113 
>> 
>> 
>> "gdb" doesn't allow any backtrace for any thread.
>> 
>> tyr java 124 /usr/local/gdb-7.6.1_64_gcc/bin/gdb
>> GNU gdb (GDB) 7.6.1
>> ...
>> (gdb) attach 18876
>> Attaching to process 18876
>> [New process 18876]
>> Retry #1:
>> Retry #2:
>> Retry #3:
>> Retry #4:
>> 0x7eadcb04 in ?? ()
>> (gdb) info threads
>> [New LWP 12]
>> [New LWP 11]
>> [New LWP 10]
>> [New LWP 9]
>> [New LWP 8]
>> [New LWP 7]
>> [New LWP 6]
>> [New LWP 5]
>> [New LWP 4]
>> [New LWP 3]
>> [New LWP 2]
>>  Id   Target Id         Frame 
>>  12   LWP 2             0x7eadc6b0 in ?? ()
>>  11   LWP 3             0x7eadcbb8 in ?? ()
>>  10   LWP 4             0x7eadcbb8 in ?? ()
>>  9    LWP 5             0x7eadcbb8 in ?? ()
>>  8    LWP 6             0x7eadcbb8 in ?? ()
>>  7    LWP 7             0x7eadcbb8 in ?? ()
>>  6    LWP 8             0x7ead8b0c in ?? ()
>>  5    LWP 9             0x7eadcbb8 in ?? ()
>>  4    LWP 10            0x7eadcbb8 in ?? ()
>>  3    LWP 11            0x7eadcbb8 in ?? ()
>>  2    LWP 12            0x7eadcbb8 in ?? ()
>> * 1    LWP 1             0x7eadcb04 in ?? ()
>> (gdb) thread 1
>> [Switching to thread 1 (LWP 1)]
>> #0  0x7eadcb04 in ?? ()
>> (gdb) bt
>> #0  0x7eadcb04 in ?? ()
>> #1  0x7eaca12c in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 2
>> [Switching to thread 2 (LWP 12)]
>> #0  0x7eadcbb8 in ?? ()
>> (gdb) bt
>> #0  0x7eadcbb8 in ?? ()
>> #1  0x7eac2638 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 3
>> [Switching to thread 3 (LWP 11)]
>> #0  0x7eadcbb8 in ?? ()
>> (gdb) bt
>> #0  0x7eadcbb8 in ?? ()
>> #1  0x7eac25a8 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 4
>> [Switching to thread 4 (LWP 10)]
>> #0  0x7eadcbb8 in ?? ()
>> (gdb) bt
>> #0  0x7eadcbb8 in ?? ()
>> #1  0x7eac2638 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 5
>> [Switching to thread 5 (LWP 9)]
>> #0  0x7eadcbb8 in ?? ()
>> (gdb) bt
>> #0  0x7eadcbb8 in ?? ()
>> #1  0x7eac2638 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 6
>> [Switching to thread 6 (LWP 8)]
>> #0  0x7ead8b0c in ?? ()
>> (gdb) bt
>> #0  0x7ead8b0c in ?? ()
>> #1  0x7eacbcb0 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 7
>> [Switching to thread 7 (LWP 7)]
>> #0  0x7eadcbb8 in ?? ()
>> (gdb) bt
>> #0  0x7eadcbb8 in ?? ()
>> #1  0x7eac25a8 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 8
>> [Switching to thread 8 (LWP 6)]
>> #0  0x7eadcbb8 in ?? ()
>> (gdb) bt
>> #0  0x7eadcbb8 in ?? ()
>> #1  0x7eac25a8 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 9
>> [Switching to thread 9 (LWP 5)]
>> #0  0x7eadcbb8 in ?? ()
>> (gdb) bt
>> #0  0x7eadcbb8 in ?? ()
>> #1  0x7eac2638 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 10
>> [Switching to thread 10 (LWP 4)]
>> #0  0x7eadcbb8 in ?? ()
>> (gdb) bt
>> #0  0x7eadcbb8 in ?? ()
>> #1  0x7eac25a8 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 11
>> [Switching to thread 11 (LWP 3)]
>> #0  0x7eadcbb8 in ?? ()
>> (gdb) bt
>> #0  0x7eadcbb8 in ?? ()
>> #1  0x7eac25a8 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb) thread 12
>> [Switching to thread 12 (LWP 2)]
>> #0  0x7eadc6b0 in ?? ()
>> (gdb) 
>> 
>> 
>> 
>> I also tried to set _dbg in all available frames without success.
>> 
>> (gdb) f 1
>> #1  0x7eacb46c in ?? ()
>> (gdb) set _dbg=0
>> No symbol table is loaded.  Use the "file" command.
>> (gdb) symbol-file /usr/local/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so
>> Reading symbols from /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so.0.0.0...done.
>> (gdb) f 1
>> #1  0x7eacb46c in ?? ()
>> (gdb) set _dbg=0
>> No symbol "_dbg" in current context.
>> (gdb) f 2
>> #0  0x00000000 in ?? ()
>> (gdb) set _dbg=0
>> No symbol "_dbg" in current context.
>> (gdb) 
>> ...
>> 
>> 
>> With "list" I get source code from mpi_CartComm.c and not from mpi_MPI.c.
>> If I switch threads, "list" continues in the old file.
>> 
>> (gdb) thread 1
>> [Switching to thread 1 (LWP 1)]
>> #0  0x7eadcb04 in ?? ()
>> (gdb) list 36
>> 31          distributed under the License is distributed on an "AS IS" BASIS,
>> 32          WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> 33          See the License for the specific language governing permissions and
>> 34          limitations under the License.
>> 35      */
>> 36      /*
>> 37       * File         : mpi_CartComm.c
>> 38       * Headerfile   : mpi_CartComm.h
>> 39       * Author       : Sung-Hoon Ko, Xinying Li
>> 40       * Created      : Thu Apr  9 12:22:15 1998
>> (gdb) thread 2
>> [Switching to thread 2 (LWP 12)]
>> #0  0x7eadcbb8 in ?? ()
>> (gdb) list
>> 41       * Revision     : $Revision: 1.6 $
>> 42       * Updated      : $Date: 2003/01/16 16:39:34 $
>> 43       * Copyright: Northeast Parallel Architectures Center
>> 44       *            at Syracuse University 1998
>> 45       */
>> 46      #include "ompi_config.h"
>> 47      
>> 48      #include <stdlib.h>
>> 49      #ifdef HAVE_TARGETCONDITIONALS_H
>> 50      #include <TargetConditionals.h>
>> (gdb) 
>> 
>> 
>> Do you have any idea what's going wrong, or whether I must use a
>> different symbol table?
>> 
>> 
>> Kind regards
>> 
>> Siegmar
>> 
>> 
>> 
>> 
>>> 
>>> Gilles
>>> 
>>> 
>>> Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>> Hi Gilles,
>>>> 
>>>> I changed _dbg to a static variable, so that it is visible in the
>>>> library, but unfortunately still not in the symbol table.
>>>> 
>>>> 
>>>> tyr java 419 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so | 
>>>> grep -i _dbg
>>>> [271]   |  1249644|     4|OBJT |LOCL |0    |18     |_dbg.14258
>>>> tyr java 420 /usr/local/gdb-7.6.1_64_gcc/bin/gdb
>>>> GNU gdb (GDB) 7.6.1
>>>> ...
>>>> (gdb) attach 13019
>>>> Attaching to process 13019
>>>> [New process 13019]
>>>> Retry #1:
>>>> Retry #2:
>>>> Retry #3:
>>>> Retry #4:
>>>> 0x7eadcb04 in ?? ()
>>>> (gdb) symbol-file /usr/local/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so
>>>> Reading symbols from /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so.0.0.0...done.
>>>> (gdb) set var _dbg.14258=0
>>>> No symbol "_dbg" in current context.
>>>> (gdb) 
>>>> 
>>>> 
>>>> Kind regards
>>>> 
>>>> Siegmar
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> unfortunately I didn't get anything useful. It's probably my fault,
>>>>> because I'm still not very familiar with gdb or any other debugger.
>>>>> I did the following things.
>>>>> 
>>>>> 
>>>>> 1st window:
>>>>> -----------
>>>>> 
>>>>> tyr java 174 setenv OMPI_ATTACH 1
>>>>> tyr java 175 mpijavac InitFinalizeMain.java 
>>>>> warning: [path] bad path element
>>>>>  "/usr/local/openmpi-1.9.0_64_gcc/lib64/shmem.jar":
>>>>>  no such file or directory
>>>>> 1 warning
>>>>> tyr java 176 mpiexec -np 1 java InitFinalizeMain
>>>>> 
>>>>> 
>>>>> 
>>>>> 2nd window:
>>>>> -----------
>>>>> 
>>>>> tyr java 379 ps -aef | grep java
>>>>> noaccess  1345     1   0   May 22 ?         113:23 /usr/java/bin/java -server -Xmx128m -XX:+UseParallelGC -XX:ParallelGCThreads=4
>>>>>  fd1026  3661 10753   0 14:09:12 pts/14      0:00 mpiexec -np 1 java InitFinalizeMain
>>>>>  fd1026  3677 13371   0 14:16:55 pts/2       0:00 grep java
>>>>>  fd1026  3663  3661   0 14:09:12 pts/14      0:01 java -cp /home/fd1026/work/skripte/master/parallel/prog/mpi/java:/usr/local/jun
>>>>> tyr java 380 /usr/local/gdb-7.6.1_64_gcc/bin/gdb
>>>>> GNU gdb (GDB) 7.6.1
>>>>> ...
>>>>> (gdb) attach 3663
>>>>> Attaching to process 3663
>>>>> [New process 3663]
>>>>> Retry #1:
>>>>> Retry #2:
>>>>> Retry #3:
>>>>> Retry #4:
>>>>> 0x7eadcb04 in ?? ()
>>>>> (gdb) symbol-file /usr/local/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so
>>>>> Reading symbols from /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so.0.0.0...done.
>>>>> (gdb) set var _dbg=0
>>>>> No symbol "_dbg" in current context.
>>>>> (gdb) set var JNI_OnLoad::_dbg=0
>>>>> No symbol "_dbg" in specified context.
>>>>> (gdb) set JNI_OnLoad::_dbg=0
>>>>> No symbol "_dbg" in specified context.
>>>>> (gdb) info threads
>>>>> [New LWP 12]
>>>>> [New LWP 11]
>>>>> [New LWP 10]
>>>>> [New LWP 9]
>>>>> [New LWP 8]
>>>>> [New LWP 7]
>>>>> [New LWP 6]
>>>>> [New LWP 5]
>>>>> [New LWP 4]
>>>>> [New LWP 3]
>>>>> [New LWP 2]
>>>>>  Id   Target Id         Frame 
>>>>>  12   LWP 2             0x7eadc6b0 in ?? ()
>>>>>  11   LWP 3             0x7eadcbb8 in ?? ()
>>>>>  10   LWP 4             0x7eadcbb8 in ?? ()
>>>>>  9    LWP 5             0x7eadcbb8 in ?? ()
>>>>>  8    LWP 6             0x7eadcbb8 in ?? ()
>>>>>  7    LWP 7             0x7eadcbb8 in ?? ()
>>>>>  6    LWP 8             0x7ead8b0c in ?? ()
>>>>>  5    LWP 9             0x7eadcbb8 in ?? ()
>>>>>  4    LWP 10            0x7eadcbb8 in ?? ()
>>>>>  3    LWP 11            0x7eadcbb8 in ?? ()
>>>>>  2    LWP 12            0x7eadcbb8 in ?? ()
>>>>> * 1    LWP 1             0x7eadcb04 in ?? ()
>>>>> (gdb) 
>>>>> 
>>>>> 
>>>>> 
>>>>> It seems that "_dbg" is unknown and unavailable.
>>>>> 
>>>>> tyr java 399 grep _dbg /export2/src/openmpi-1.9/openmpi-dev-124-g91e9686/ompi/mpi/java/c/*
>>>>> /export2/src/openmpi-1.9/openmpi-dev-124-g91e9686/ompi/mpi/java/c/mpi_MPI.c:        volatile int _dbg = 1;
>>>>> /export2/src/openmpi-1.9/openmpi-dev-124-g91e9686/ompi/mpi/java/c/mpi_MPI.c:        while (_dbg) poll(NULL, 0, 1);
>>>>> tyr java 400 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/*.so | grep -i _dbg
>>>>> tyr java 401 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/*.so | grep -i JNI_OnLoad
>>>>> [1057]  |              139688|                 444|FUNC |GLOB |0    |11     |JNI_OnLoad
>>>>> tyr java 402 
>>>>> 
>>>>> 
>>>>> 
>>>>> How can I set _dbg to zero to continue mpiexec? I also tried to
>>>>> set a breakpoint on the function JNI_OnLoad, but it seems that
>>>>> the function isn't called before the SIGSEGV.
>>>>> 
>>>>> 
>>>>> tyr java 177 unsetenv OMPI_ATTACH 
>>>>> tyr java 178 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
>>>>> GNU gdb (GDB) 7.6.1
>>>>> ...
>>>>> (gdb) b mpi_MPI.c:JNI_OnLoad
>>>>> No source file named mpi_MPI.c.
>>>>> Make breakpoint pending on future shared library load? (y or [n]) y
>>>>> 
>>>>> Breakpoint 1 (mpi_MPI.c:JNI_OnLoad) pending.
>>>>> (gdb) run -np 1 java InitFinalizeMain 
>>>>> Starting program: /usr/local/openmpi-1.9.0_64_gcc/bin/mpiexec -np 1 java 
>>>>> InitFinalizeMain
>>>>> [Thread debugging using libthread_db enabled]
>>>>> [New Thread 1 (LWP 1)]
>>>>> [New LWP    2        ]
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> #  SIGSEGV (0xb) at pc=0xffffffff7ea3c7f0, pid=3518, tid=2
>>>>> ...
>>>>> 
>>>>> 
>>>>> 
>>>>> tyr java 381 cat InitFinalizeMain.java 
>>>>> import mpi.*;
>>>>> 
>>>>> public class InitFinalizeMain
>>>>> {
>>>>>  public static void main (String args[]) throws MPIException
>>>>>  {
>>>>>    MPI.Init (args);
>>>>>    System.out.print ("Hello!\n");
>>>>>    MPI.Finalize ();
>>>>>  }
>>>>> }
>>>>> 
>>>>> 
>>>>> The SIGSEGV happens in MPI.Init(args), since I can print a message
>>>>> before I call the method.
>>>>> 
>>>>> tyr java 192 unsetenv OMPI_ATTACH
>>>>> tyr java 193 mpijavac InitFinalizeMain.java
>>>>> tyr java 194 mpiexec -np 1 java InitFinalizeMain
>>>>> Before MPI.Init()
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> #  SIGSEGV (0xb) at pc=0xffffffff7ea3c7f0, pid=3697, tid=2
>>>>> ...
>>>>> 
>>>>> 
>>>>> 
>>>>> Any ideas how I can continue? I couldn't find a C function for
>>>>> MPI.Init() in a C file. Do you know which function is called first,
>>>>> so that I can set a breakpoint? By the way, I get the same error
>>>>> on Solaris 10 x86_64.
>>>>> 
>>>>> tyr java 388 ssh sunpc1
>>>>> ...
>>>>> sunpc1 java 106 mpijavac InitFinalizeMain.java
>>>>> sunpc1 java 107 uname -a
>>>>> SunOS sunpc1 5.10 Generic_147441-21 i86pc i386 i86pc Solaris
>>>>> sunpc1 java 108 isainfo -k
>>>>> amd64
>>>>> sunpc1 java 109 mpiexec -np 1 java InitFinalizeMain
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> #  SIGSEGV (0xb) at pc=0xfffffd7fff1d77f0, pid=20256, tid=2
>>>>> 
>>>>> 
>>>>> Thank you very much for any help in advance.
>>>>> 
>>>>> Kind regards
>>>>> 
>>>>> Siegmar
>>>>> 
>>>>> 
>>>>> 
>>>>>> thank you very much for your help.
>>>>>> 
>>>>>>> how did you configure openmpi ? which java version did you use ?
>>>>>>> 
>>>>>>> i just found a regression and you currently have to explicitly add
>>>>>>> CFLAGS=-D_REENTRANT CPPFLAGS=-D_REENTRANT
>>>>>>> to your configure command line
>>>>>> 
>>>>>> I added "-D_REENTRANT" to my command.
>>>>>> 
>>>>>> ../openmpi-dev-124-g91e9686/configure 
>>>>>> --prefix=/usr/local/openmpi-1.9.0_64_gcc \
>>>>>>  --libdir=/usr/local/openmpi-1.9.0_64_gcc/lib64 \
>>>>>>  --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
>>>>>>  --with-jdk-headers=/usr/local/jdk1.8.0/include \
>>>>>>  JAVA_HOME=/usr/local/jdk1.8.0 \
>>>>>>  LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \
>>>>>>  CFLAGS="-m64 -D_REENTRANT" CXXFLAGS="-m64" FCFLAGS="-m64" \
>>>>>>  CPP="cpp" CXXCPP="cpp" \
>>>>>>  CPPFLAGS="-D_REENTRANT" CXXCPPFLAGS="" \
>>>>>>  --enable-mpi-cxx \
>>>>>>  --enable-cxx-exceptions \
>>>>>>  --enable-mpi-java \
>>>>>>  --enable-heterogeneous \
>>>>>>  --enable-mpi-thread-multiple \
>>>>>>  --with-threads=posix \
>>>>>>  --with-hwloc=internal \
>>>>>>  --without-verbs \
>>>>>>  --with-wrapper-cflags="-std=c11 -m64" \
>>>>>>  --enable-debug \
>>>>>>  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc
>>>>>> 
>>>>>> I use Java 8.
>>>>>> 
>>>>>> tyr openmpi-1.9 112 java -version
>>>>>> java version "1.8.0"
>>>>>> Java(TM) SE Runtime Environment (build 1.8.0-b132)
>>>>>> Java HotSpot(TM) 64-Bit Server VM (build 25.0-b70, mixed mode)
>>>>>> tyr openmpi-1.9 113 
>>>>>> 
>>>>>> Unfortunately I still get a SIGSEGV with openmpi-dev-124-g91e9686.
>>>>>> I have applied your patch and will try to debug my small Java
>>>>>> program tomorrow or next week and then let you know the result.
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/10/25590.php 
