Re: [OMPI devel] orte-restart and PATH
That's what the --enable-orterun-prefix-by-default configure option is for.

On Mar 12, 2014, at 9:28 AM, Adrian Reber wrote:

> I am using orte-restart without setting my PATH to my Open MPI
> installation. I am running /full/path/to/orte-restart and orte-restart
> tries to run mpirun to restart the process. This fails on my system
> because I do not have any mpirun in my PATH. Is it expected for an Open
> MPI installation to set up the PATH variable or should it work using the
> absolute path to the binaries?
>
> Should I just set my PATH correctly and be done with it or should
> orte-restart figure out the full path to its accompanying mpirun and
> start mpirun with the full path?
>
> Adrian
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/03/14339.php
[OMPI devel] orte-restart and PATH
I am using orte-restart without setting my PATH to my Open MPI installation. I am running /full/path/to/orte-restart and orte-restart tries to run mpirun to restart the process. This fails on my system because I do not have any mpirun in my PATH. Is it expected for an Open MPI installation to set up the PATH variable or should it work using the absolute path to the binaries? Should I just set my PATH correctly and be done with it or should orte-restart figure out the full path to its accompanying mpirun and start mpirun with the full path? Adrian
Re: [OMPI devel] Loading Open MPI from MPJ Express (Java) fails
Here's what I had to do to load the library correctly (we were only using ORTE, so substitute "libmpi") - this was called at the beginning of "init":

    /* first, load the required ORTE library */
#if OPAL_WANT_LIBLTDL
    lt_dladvise advise;

    if (lt_dlinit() != 0) {
        fprintf(stderr, "LT_DLINIT FAILED - CANNOT LOAD LIBMRPLUS\n");
        return JNI_FALSE;
    }

#if OPAL_HAVE_LTDL_ADVISE
    /* open the library into the global namespace */
    if (lt_dladvise_init(&advise)) {
        fprintf(stderr, "LT_DLADVISE INIT FAILED - CANNOT LOAD LIBMRPLUS\n");
        return JNI_FALSE;
    }

    if (lt_dladvise_ext(&advise)) {
        fprintf(stderr, "LT_DLADVISE EXT FAILED - CANNOT LOAD LIBMRPLUS\n");
        lt_dladvise_destroy(&advise);
        return JNI_FALSE;
    }

    if (lt_dladvise_global(&advise)) {
        fprintf(stderr, "LT_DLADVISE GLOBAL FAILED - CANNOT LOAD LIBMRPLUS\n");
        lt_dladvise_destroy(&advise);
        return JNI_FALSE;
    }

    /* we don't care about the return value
     * on dlopen - it might return an error
     * because the lib is already loaded,
     * depending on the way we were built */
    lt_dlopenadvise("libopen-rte", advise);
    lt_dladvise_destroy(&advise);
#else
    fprintf(stderr, "NO LT_DLADVISE - CANNOT LOAD LIBMRPLUS\n");
    /* need to balance the ltdl inits */
    lt_dlexit();
    /* if we don't have advise, then we are hosed */
    return JNI_FALSE;
#endif
#endif

    /* if dlopen was disabled, then all symbols
     * should have been pulled up into the libraries,
     * so we don't need to do anything as the symbols
     * are already available. */

On Mar 12, 2014, at 6:32 AM, Jeff Squyres (jsquyres) wrote:

> Check out how we did this with the embedded java bindings in Open MPI; see
> the comment describing exactly this issue starting here:
>
> https://svn.open-mpi.org/trac/ompi/browser/trunk/ompi/mpi/java/c/mpi_MPI.c#L79
>
> Feel free to compare MPJ to the OMPI java bindings -- they're shipping in
> 1.7.4 and have a bunch of improvements in the soon-to-be-released 1.7.5, but
> you must enable them since they aren't enabled by default:
>
>     ./configure --enable-mpi-java ...
> FWIW, we found a few places in the Java bindings where it was necessary for
> the bindings to have some insight into the internals of the MPI
> implementation. Did you find the same thing with MPJ Express?
>
> Are your bindings similar in style/signature to ours?
>
> On Mar 12, 2014, at 6:40 AM, Bibrak Qamar wrote:
>
>> Hi all,
>>
>> I am writing a new device for MPJ Express that uses a native MPI library
>> for communication. It's based on JNI wrappers like the original mpiJava.
>> The device works fine with MPICH3 (and MVAPICH2.2). Here is the issue with
>> loading Open MPI 1.7.4 from MPJ Express.
>>
>> I have generated the following error message from a simple JNI to MPI
>> application for clarity purposes and also to regenerate the error easily.
>> I have attached the app for your consideration.
>>
>> [bibrak@localhost JNI_to_MPI]$ mpirun -np 2 java -cp . -Djava.library.path=/home/bibrak/work/JNI_to_MPI/ simpleJNI_MPI
>> [localhost.localdomain:29086] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
>> undefined symbol: opal_show_help (ignored)
>> [localhost.localdomain:29085] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
>> undefined symbol: opal_show_help (ignored)
>> [localhost.localdomain:29085] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
>> undefined symbol: opal_shmem_base_framework (ignored)
>> [localhost.localdomain:29086] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
>> undefined symbol: opal_shmem_base_framework (ignored)
>> [localhost.localdomain:29086] mca: base: component_find: unable to open
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
>> undefined symbol: opal_show_help (ignored)
>> --
>> It looks like opal_init failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during opal_init; some of which are due to configuration or
>> environment problems. This failure appears to be an internal failure;
>> here's some additional information
Re: [OMPI devel] Loading Open MPI from MPJ Express (Java) fails
Check out how we did this with the embedded java bindings in Open MPI; see the comment describing exactly this issue starting here:

https://svn.open-mpi.org/trac/ompi/browser/trunk/ompi/mpi/java/c/mpi_MPI.c#L79

Feel free to compare MPJ to the OMPI java bindings -- they're shipping in 1.7.4 and have a bunch of improvements in the soon-to-be-released 1.7.5, but you must enable them since they aren't enabled by default:

    ./configure --enable-mpi-java ...

FWIW, we found a few places in the Java bindings where it was necessary for the bindings to have some insight into the internals of the MPI implementation. Did you find the same thing with MPJ Express?

Are your bindings similar in style/signature to ours?

On Mar 12, 2014, at 6:40 AM, Bibrak Qamar wrote:

> Hi all,
>
> I am writing a new device for MPJ Express that uses a native MPI library for
> communication. It's based on JNI wrappers like the original mpiJava. The
> device works fine with MPICH3 (and MVAPICH2.2). Here is the issue with
> loading Open MPI 1.7.4 from MPJ Express.
>
> I have generated the following error message from a simple JNI to MPI
> application for clarity purposes and also to regenerate the error easily. I
> have attached the app for your consideration.
>
> [bibrak@localhost JNI_to_MPI]$ mpirun -np 2 java -cp . -Djava.library.path=/home/bibrak/work/JNI_to_MPI/ simpleJNI_MPI
> [localhost.localdomain:29086] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
> undefined symbol: opal_show_help (ignored)
> [localhost.localdomain:29085] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
> undefined symbol: opal_show_help (ignored)
> [localhost.localdomain:29085] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
> undefined symbol: opal_shmem_base_framework (ignored)
> [localhost.localdomain:29086] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
> undefined symbol: opal_shmem_base_framework (ignored)
> [localhost.localdomain:29086] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
> undefined symbol: opal_show_help (ignored)
> --
> It looks like opal_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> opal_shmem_base_select failed
> --> Returned value -1 instead of OPAL_SUCCESS
> --
> [localhost.localdomain:29085] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
> undefined symbol: opal_show_help (ignored)
> --
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> opal_init failed
> --> Returned value Error (-1) instead of ORTE_SUCCESS
> --
> --
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> ompi_mpi_init: ompi_rte_init failed
> --> Returned "Error" (-1) instead of "Success" (0)
> --
> *** An error occurred in MPI_Init
> *** on a NULL
Re: [OMPI devel] Loading Open MPI from MPJ Express (Java) fails
If you are going to use OMPI via JNI, then you have to load the OMPI library from within your code. This is a little tricky from Java as OMPI by default builds as a set of dynamic libraries, and each component is a dynamic library as well. The solution is to either build OMPI statically, or to use lt_dladvise and friends to ensure the load paths are followed.

On Mar 12, 2014, at 3:40 AM, Bibrak Qamar wrote:

> Hi all,
>
> I am writing a new device for MPJ Express that uses a native MPI library for
> communication. It's based on JNI wrappers like the original mpiJava. The
> device works fine with MPICH3 (and MVAPICH2.2). Here is the issue with
> loading Open MPI 1.7.4 from MPJ Express.
>
> I have generated the following error message from a simple JNI to MPI
> application for clarity purposes and also to regenerate the error easily. I
> have attached the app for your consideration.
>
> [bibrak@localhost JNI_to_MPI]$ mpirun -np 2 java -cp . -Djava.library.path=/home/bibrak/work/JNI_to_MPI/ simpleJNI_MPI
> [localhost.localdomain:29086] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
> undefined symbol: opal_show_help (ignored)
> [localhost.localdomain:29085] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
> undefined symbol: opal_show_help (ignored)
> [localhost.localdomain:29085] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
> undefined symbol: opal_shmem_base_framework (ignored)
> [localhost.localdomain:29086] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
> undefined symbol: opal_shmem_base_framework (ignored)
> [localhost.localdomain:29086] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
> undefined symbol: opal_show_help (ignored)
> --
> It looks like opal_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> opal_shmem_base_select failed
> --> Returned value -1 instead of OPAL_SUCCESS
> --
> [localhost.localdomain:29085] mca: base: component_find: unable to open
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
> undefined symbol: opal_show_help (ignored)
> --
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> opal_init failed
> --> Returned value Error (-1) instead of ORTE_SUCCESS
> --
> --
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> ompi_mpi_init: ompi_rte_init failed
> --> Returned "Error" (-1) instead of "Success" (0)
> --
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *** and potentially your MPI job)
> [localhost.localdomain:29086] Local abort before MPI_INIT completed
> successfully; not able to aggregate error messages, and not able to
> guarantee that all other processes were killed!
[OMPI devel] Loading Open MPI from MPJ Express (Java) fails
Hi all,

I am writing a new device for MPJ Express that uses a native MPI library for communication. It's based on JNI wrappers like the original mpiJava. The device works fine with MPICH3 (and MVAPICH2.2). Here is the issue with loading Open MPI 1.7.4 from MPJ Express.

I have generated the following error message from a simple JNI to MPI application for clarity purposes and also to regenerate the error easily. I have attached the app for your consideration.

[bibrak@localhost JNI_to_MPI]$ mpirun -np 2 java -cp . -Djava.library.path=/home/bibrak/work/JNI_to_MPI/ simpleJNI_MPI
[localhost.localdomain:29086] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
undefined symbol: opal_show_help (ignored)
[localhost.localdomain:29085] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so:
undefined symbol: opal_show_help (ignored)
[localhost.localdomain:29085] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
undefined symbol: opal_shmem_base_framework (ignored)
[localhost.localdomain:29086] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so:
undefined symbol: opal_shmem_base_framework (ignored)
[localhost.localdomain:29086] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
undefined symbol: opal_show_help (ignored)
--
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--
[localhost.localdomain:29085] mca: base: component_find: unable to open
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv:
/home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so:
undefined symbol: opal_show_help (ignored)
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_init failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
--
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

ompi_mpi_init: ompi_rte_init failed
--> Returned "Error" (-1) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[localhost.localdomain:29086] Local abort before MPI_INIT completed
successfully; not able to aggregate error messages, and not able to
guarantee that all other processes were killed!
--
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--
Re: [OMPI devel] [Score-P support] Compile errors of Fedora rawhide
On Mar 11, 2014, at 11:25 PM, Orion Poplawski wrote:

>> Did you find any others, perchance?
>
> Also:
> MPI_Type_hindexed
> MPI_Type_struct
>
> But these were also deprecated in MPI-2.0, so probably gone in MPI-3.

Correct: MPI_Type_hindexed and MPI_Type_struct were deprecated in MPI-2.0 (1997), and finally removed in MPI-3.0 (2012).

> That's it as far as score-p is concerned. Note that dropping functions
> has serious ABI/soname implications.

Keep in mind that MPI-3 deleted these interfaces, but they had already been deprecated for 2012-1997 = 15 years. MPI-3 also made it clear that MPI implementations can keep providing these interfaces, but they must adhere to the prototypes that were published in prior versions of the MPI specification (i.e., no const). At this point, I believe both Open MPI and MPICH will issue deprecation warnings, if your compiler supports them, when you use these functions. Open MPI doesn't yet have any plans for actually removing the functions. If/when we do remove them, we'll do it at the beginning of a new feature series.

> I'll probably have to hack up
> something for scorep to handle these, probably just by removing them.

It would be best to migrate to the new interfaces; it should be quite trivial (change the displacement parameter type from "int" to "MPI_Aint"). 15 years is enough. :-)

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
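[Editor's note: for the C bindings the migration is essentially a rename, since MPI_Type_struct already took MPI_Aint displacements in C; the int-to-MPI_Aint change mainly affects the Fortran bindings (INTEGER vs. INTEGER(KIND=MPI_ADDRESS_KIND)). A hedged C sketch, with a struct layout invented purely for illustration:]

    #include <stddef.h>
    #include <mpi.h>

    /* Hypothetical payload; not from score-p's actual code. */
    struct particle {
        int    id;
        double mass;
    };

    static void build_particle_type(MPI_Datatype *newtype)
    {
        int          blocklens[2] = { 1, 1 };
        MPI_Aint     displs[2]    = { offsetof(struct particle, id),
                                      offsetof(struct particle, mass) };
        MPI_Datatype types[2]     = { MPI_INT, MPI_DOUBLE };

        /* Deprecated in MPI-2.0, removed in MPI-3.0:
         * MPI_Type_struct(2, blocklens, displs, types, newtype);
         */

        /* Replacement, available since MPI-2.0 -- same arguments in C: */
        MPI_Type_create_struct(2, blocklens, displs, types, newtype);
        MPI_Type_commit(newtype);
    }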