On 10 October 2007 at 15:27, Brian Granger wrote:
| I am seeing the same error, but I am using mpi4py (Lisandro Dalcin's
| Python MPI bindings). I don't think that libmpi.so is being dlopen'd
| directly at runtime, but, the shared library that is linked at compile
| time to libmpi.so is probably being loaded at runtime. The odd thing
| is that mpi4py has been tested extensively with openmpi and this is
| the first version of openmpi that we have seen this issue. I tried
| 1.2.3 again yesterday and it works fine. What changed with 1.2.4?
|
| The problem with our case is that the code that is doing the dlopen is
| deep inside Python itself (not just mpi4py). It is the same code that

That's the same for R. We don't touch the inner guts of module loading for
this. What Hao realized after looking at the corresponding FAQ item was that
right before calling MPI_Init, one can load libmpi explicitly, and -- and
that's the important bit -- set the proper RTLD_GLOBAL argument.

So you could adapt the patch we used:

  a) add an include for dlfcn.h
  b) explicitly call dlopen on libmpi.so with RTLD_GLOBAL

That should be reasonably easy to test as you only need to rebuild mpi4py.

--- rmpi-0.5-4.orig/src/Rmpi.c
+++ rmpi-0.5-4/src/Rmpi.c
@@ -16,6 +16,7 @@
  */
 
 #include "Rmpi.h"
+#include <dlfcn.h>
 
 static MPI_Comm *comm;
 static MPI_Status *status;
@@ -32,7 +33,9 @@
         if (flag)
                 return AsInt(1);
         else {
-                MPI_Init((void *)0,(void *)0);
+                char *libm="libmpi.so";
+                dlopen(libm,RTLD_GLOBAL);
+                MPI_Init((void *)0,(void *)0);
                 MPI_Errhandler_set(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
                 MPI_Errhandler_set(MPI_COMM_SELF, MPI_ERRORS_RETURN);
                 comm=(MPI_Comm *)Calloc(COMM_MAXSIZE, MPI_Comm);

| is responsible for loading _everything_ into Python, and I am pretty
| sure that there is no way that people would be willing to change it.
| I am cc'ing this to Lisandro - maybe he has some ideas on this front.

Actually, looked like you didn't CC him.

Hth, Dirk

|
| Thanks
|
| Brian
|
| On 10/10/07, Brian Barrett <brbar...@open-mpi.org> wrote:
| > On Oct 10, 2007, at 1:27 PM, Dirk Eddelbuettel wrote:
| > > | Does this happen for all MPI programs (potentially only those that
| > > | use the MPI-2 one-sided stuff), or just your R environment?
| > >
| > > This is the likely winner.
| > >
| > > It seems indeed due to R's Rmpi package. Running a simple mpitest.c
| > > shows no error message. We will look at the Rmpi initialization to
| > > see what could cause this.
| >
| > Does rmpi link in libmpi.so or dynamically load it at run-time? The
| > pt2pt one-sided component uses the MPI-1 point-to-point calls for
| > communication (hence, the pt2pt name).
| > If those symbols were unavailable (say, because libmpi.so was
| > dynamically loaded) I could see how this would cause problems.
| >
| > The pt2pt component (rightly) does not have a -lmpi in its link
| > line. The other components that use symbols in libmpi.so (wrongly)
| > do have a -lmpi in their link line. This can cause some problems on
| > some platforms (Linux tends to do dynamic linking / dynamic loading
| > better than most). That's why only the pt2pt component fails.
| >
| > My guess is that Rmpi is dynamically loading libmpi.so, but not
| > specifying the RTLD_GLOBAL flag. This means that libmpi.so is not
| > available to the components the way it should be, and all goes
| > downhill from there. It only mostly works because we do something
| > silly with how we link most of our components, and Linux is just
| > smart enough to cover our rears (thankfully).
| >
| > Solutions:
| >
| >   - Someone could make the pt2pt osc component link in libmpi.so
| >     like the rest of the components and hope that no one ever
| >     tries this on a non-friendly platform.
| >   - Debian (and all Rmpi users) could configure Open MPI with the
| >     --disable-dlopen flag and ignore the problem.
| >   - Someone could fix Rmpi to dlopen libmpi.so with the RTLD_GLOBAL
| >     flag and fix the problem properly.
| >
| > I think it's clear I'm in favor of Option 3.
| >
| > Brian
| > _______________________________________________
| > users mailing list
| > us...@open-mpi.org
| > http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Three out of two people have difficulties with fractions.
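
Editor's sketch: Option 3 in the quoted list and the Rmpi patch both come down to loading libmpi.so with RTLD_GLOBAL before MPI_Init runs, so that components opened later can resolve the MPI symbols. The fragment below is a minimal illustration of that idea, not the code used by Rmpi or mpi4py. The helper name preload_global is hypothetical. It also differs from the patch in one respect: POSIX requires dlopen's mode to include RTLD_LAZY or RTLD_NOW, so RTLD_NOW is OR'ed with RTLD_GLOBAL here, and the return value is checked.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of Rmpi, mpi4py, or Open MPI).
 * Loads a shared library so that its symbols become visible to
 * libraries that are dlopen'ed afterwards. Returns the handle, or
 * NULL after printing the dlerror() message.
 */
static void *preload_global(const char *soname)
{
    assert(soname != NULL);

    /* RTLD_NOW resolves symbols immediately; RTLD_GLOBAL exports them. */
    void *handle = dlopen(soname, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL)
        fprintf(stderr, "dlopen(%s) failed: %s\n", soname, dlerror());
    return handle;
}
```

In a binding's initialization routine this would be called as preload_global("libmpi.so") immediately before MPI_Init, using the same library name as the patch. Older glibc versions need -ldl on the link line.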