Hi all,

I am writing a new device for MPJ Express that uses a native MPI library
for communication. It is based on JNI wrappers, like the original mpiJava.
The device works fine with MPICH3 (and MVAPICH2.2), but I am running into
a problem when loading Open MPI 1.7.4 from MPJ Express, described below.
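
To give an idea of the approach (the class and method names below are only
illustrative, not the actual MPJ Express device API): the device is a thin
Java class whose native methods are implemented in a small C library that
forwards each call to the underlying MPI implementation, in the same spirit
as mpiJava.

    // Illustrative sketch only -- hypothetical names, not the real device classes.
    public class NativeMPIBridge {
        static {
            // libnativempi.so is located via -Djava.library.path
            System.loadLibrary("nativempi");
        }
        // Each native method is implemented in C and simply forwards to the
        // MPI library, e.g. init() calls MPI_Init(), send() calls MPI_Send().
        public static native void init(String[] args);
        public static native int  rank();
        public static native int  size();
        public static native void send(byte[] buf, int dest, int tag);
        public static native void recv(byte[] buf, int src, int tag);
        public static native void finish();
    }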

To keep things clear and to make the error easy to reproduce, I generated
the error message below from a simple JNI-to-MPI application rather than
from the full device. I have attached the app for your consideration.
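
For reference, the reproducer is essentially the Java half of such a wrapper,
reduced to initialization only (this is a sketch; the attached app may differ
in details, and the library name here is only assumed to match the
-Djava.library.path setting in the command below):

    // Minimal JNI-to-MPI reproducer, Java side (sketch). The native methods
    // are implemented in C: init() calls MPI_Init(), rank() calls
    // MPI_Comm_rank() on MPI_COMM_WORLD, and finish() calls MPI_Finalize().
    public class simpleJNI_MPI {
        static {
            System.loadLibrary("simpleJNI_MPI"); // resolved via -Djava.library.path
        }
        private static native void init();
        private static native int  rank();
        private static native void finish();

        public static void main(String[] args) {
            init();   // with Open MPI 1.7.4 this is where the errors below appear
            System.out.println("Hello from rank " + rank());
            finish();
        }
    }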


[bibrak@localhost JNI_to_MPI]$ mpirun -np 2 java -cp . -Djava.library.path=/home/bibrak/work/JNI_to_MPI/ simpleJNI_MPI
[localhost.localdomain:29086] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
undefined symbol: opal_show_help (ignored)
[localhost.localdomain:29085] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
undefined symbol: opal_show_help (ignored)
[localhost.localdomain:29085] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
undefined symbol: opal_shmem_base_framework (ignored)
[localhost.localdomain:29086] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
undefined symbol: opal_shmem_base_framework (ignored)
[localhost.localdomain:29086] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
undefined symbol: opal_show_help (ignored)
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
  --> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[localhost.localdomain:29085] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
undefined symbol: opal_show_help (ignored)
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_init failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[localhost.localdomain:29086] Local abort before MPI_INIT completed
successfully; not able to aggregate error messages, and not able to
guarantee that all other processes were killed!
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
  --> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status,
thus causing
the job to be terminated. The first process to do so was:

  Process name: [[41236,1],1]
  Exit code:    1
--------------------------------------------------------------------------


Here is a thread I found in which the Open MPI developers discuss issues
they ran into while integrating mpiJava into their stack.

http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201202.mbox/%3c5ea543bd-a12e-4729-b66a-c746bc789...@open-mpi.org%3E

I don't have a deep understanding of the internals of Open MPI, so my
question is: how can one use Open MPI through JNI wrappers? Any guidelines
in this regard would be much appreciated.

Thanks
Bibrak

Attachment: JNI_to_MPI.tar.gz
Description: GNU Zip compressed data
