Here's what I had to do to load the library correctly (we were only using ORTE;
if you need the full MPI library, substitute "libmpi") - this was called at the
beginning of "init":
/* first, load the required ORTE library */
#if OPAL_WANT_LIBLTDL
    lt_dladvise advise;

    if (lt_dlinit() != 0) {
        fprintf(stderr, "LT_DLINIT FAILED - CANNOT LOAD LIBMRPLUS\n");
        return JNI_FALSE;
    }

#if OPAL_HAVE_LTDL_ADVISE
    /* open the library into the global namespace */
    if (lt_dladvise_init(&advise)) {
        fprintf(stderr, "LT_DLADVISE INIT FAILED - CANNOT LOAD LIBMRPLUS\n");
        return JNI_FALSE;
    }

    if (lt_dladvise_ext(&advise)) {
        fprintf(stderr, "LT_DLADVISE EXT FAILED - CANNOT LOAD LIBMRPLUS\n");
        lt_dladvise_destroy(&advise);
        return JNI_FALSE;
    }

    if (lt_dladvise_global(&advise)) {
        fprintf(stderr, "LT_DLADVISE GLOBAL FAILED - CANNOT LOAD LIBMRPLUS\n");
        lt_dladvise_destroy(&advise);
        return JNI_FALSE;
    }

    /* we don't care about the return value
     * on dlopen - it might return an error
     * because the lib is already loaded,
     * depending on the way we were built
     */
    lt_dlopenadvise("libopen-rte", advise);
    lt_dladvise_destroy(&advise);
#else
    fprintf(stderr, "NO LT_DLADVISE - CANNOT LOAD LIBMRPLUS\n");
    /* need to balance the ltdl inits */
    lt_dlexit();
    /* if we don't have advise, then we are hosed */
    return JNI_FALSE;
#endif
#endif

/* if dlopen was disabled, then all symbols
 * should have been pulled up into the libraries,
 * so we don't need to do anything as the symbols
 * are already available.
 */
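
If you can't rely on OMPI's internal libltdl support at all (e.g., your wrapper
is built outside the OMPI tree), a rough, hypothetical equivalent is to pre-load
the library yourself with plain dlopen() and RTLD_GLOBAL before the first
MPI/ORTE call - the function name and library name below are illustrative, not
taken from our code:

#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical sketch: pull the MPI (or ORTE) library into the global
 * namespace so that the components Open MPI dlopen()s later can resolve
 * symbols such as opal_show_help.  Adjust the library name/suffix to
 * whatever your installation actually ships. */
static int preload_mpi_library(void)
{
    void *handle = dlopen("libmpi.so", RTLD_NOW | RTLD_GLOBAL);
    if (NULL == handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }
    return 0;
}

Either way, getting the library's symbols into the global namespace is the
point - without that, the MCA components fail to resolve symbols like
opal_show_help, which is exactly what the output quoted below shows.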
On Mar 12, 2014, at 6:32 AM, Jeff Squyres (jsquyres) <[email protected]> wrote:
> Check out how we did this with the embedded java bindings in Open MPI; see
> the comment describing exactly this issue starting here:
>
>
> https://svn.open-mpi.org/trac/ompi/browser/trunk/ompi/mpi/java/c/mpi_MPI.c#L79
>
> Feel free to compare MPJ to the OMPI java bindings -- they're shipping in
> 1.7.4 and have a bunch of improvements in the soon-to-be-released 1.7.5, but
> you must enable them since they aren't enabled by default:
>
> ./configure --enable-mpi-java ...
>
> FWIW, we found a few places in the Java bindings where it was necessary for
> the bindings to have some insight into the internals of the MPI
> implementation. Did you find the same thing with MPJ Express?
>
> Are your bindings similar in style/signature to ours?
>
>
>
> On Mar 12, 2014, at 6:40 AM, Bibrak Qamar <[email protected]> wrote:
>
>> Hi all,
>>
>> I am writing a new device for MPJ Express that uses a native MPI library for
>> communication. It's based on JNI wrappers, like the original mpiJava. The
>> device works fine with MPICH3 (and MVAPICH2.2). Here is the issue with
>> loading Open MPI 1.7.4 from MPJ Express.
>>
>> I have generated the following error message from a simple JNI-to-MPI
>> application, both for clarity and to make the error easy to reproduce. I
>> have attached the app for your consideration.
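>>
>> For context, a minimal JNI-to-MPI test of this kind looks roughly like the
>> sketch below (the method name nativeInit and the library name are
>> illustrative; this is not necessarily the attached code):
>>
>> /* Java side (sketch):
>>  *
>>  *     public class simpleJNI_MPI {
>>  *         static { System.loadLibrary("simpleJNI_MPI"); }
>>  *         private static native void nativeInit();
>>  *         public static void main(String[] args) { nativeInit(); }
>>  *     }
>>  */
>> #include <jni.h>
>> #include <mpi.h>
>> #include <stdio.h>
>>
>> /* Native side: just initialize and finalize MPI.  With Open MPI 1.7.4 the
>>  * failure shown below already occurs inside MPI_Init. */
>> JNIEXPORT void JNICALL
>> Java_simpleJNI_1MPI_nativeInit(JNIEnv *env, jclass clazz)
>> {
>>     int rank, size;
>>     MPI_Init(NULL, NULL);
>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>>     printf("Hello from rank %d of %d\n", rank, size);
>>     MPI_Finalize();
>> }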
>>
>>
>> [bibrak@localhost JNI_to_MPI]$ mpirun -np 2 java -cp .
>> -Djava.library.path=/home/bibrak/work/JNI_to_MPI/ simpleJNI_MPI
>> [localhost.localdomain:29086] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
>> undefined symbol: opal_show_help (ignored)
>> [localhost.localdomain:29085] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
>> undefined symbol: opal_show_help (ignored)
>> [localhost.localdomain:29085] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
>> undefined symbol: opal_shmem_base_framework (ignored)
>> [localhost.localdomain:29086] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
>> undefined symbol: opal_shmem_base_framework (ignored)
>> [localhost.localdomain:29086] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
>> undefined symbol: opal_show_help (ignored)
>> --------------------------------------------------------------------------
>> It looks like opal_init failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during opal_init; some of which are due to configuration or
>> environment problems. This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>> opal_shmem_base_select failed
>> --> Returned value -1 instead of OPAL_SUCCESS
>> --------------------------------------------------------------------------
>> [localhost.localdomain:29085] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
>> undefined symbol: opal_show_help (ignored)
>> --------------------------------------------------------------------------
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems. This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>> opal_init failed
>> --> Returned value Error (-1) instead of ORTE_SUCCESS
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> It looks like MPI_INIT failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during MPI_INIT; some of which are due to configuration or environment
>> problems. This failure appears to be an internal failure; here's some
>> additional information (which may only be relevant to an Open MPI
>> developer):
>>
>> ompi_mpi_init: ompi_rte_init failed
>> --> Returned "Error" (-1) instead of "Success" (0)
>> --------------------------------------------------------------------------
>> *** An error occurred in MPI_Init
>> *** on a NULL communicator
>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>> *** and potentially your MPI job)
>> [localhost.localdomain:29086] Local abort before MPI_INIT completed
>> successfully; not able to aggregate error messages, and not able to
>> guarantee that all other processes were killed!
>> --------------------------------------------------------------------------
>> It looks like opal_init failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during opal_init; some of which are due to configuration or
>> environment problems. This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>> opal_shmem_base_select failed
>> --> Returned value -1 instead of OPAL_SUCCESS
>> --------------------------------------------------------------------------
>> -------------------------------------------------------
>> Primary job terminated normally, but 1 process returned
>> a non-zero exit code.. Per user-direction, the job has been aborted.
>> -------------------------------------------------------
>> --------------------------------------------------------------------------
>> mpirun detected that one or more processes exited with non-zero status, thus
>> causing
>> the job to be terminated. The first process to do so was:
>>
>> Process name: [[41236,1],1]
>> Exit code: 1
>> --------------------------------------------------------------------------
>>
>>
>> This is a thread that I found where the Open MPI developers were having
>> issues while integrating mpiJava into their stack.
>>
>> http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201202.mbox/%[email protected]%3E
>>
>> I don't have a deep understanding of the internals of Open MPI, so my
>> question is: how do I use Open MPI through JNI wrappers? Any guidelines in
>> this regard?
>>
>> Thanks
>> Bibrak
>>
>> <JNI_to_MPI.tar.gz>
>
>
> --
> Jeff Squyres
> [email protected]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>