On Feb 24, 2009, at 1:07 PM, Jeff Squyres wrote:

The minimpi(.py) Python module loads the minimpiext(.c) module and calls
its minimpiext.init() method (defined in minimpiext.c) which in turn
calls MPI_Init(). "minimpiext.c" is linked against libmpi. Libmpi is
loaded as soon as Python evaluates "import minimpi".

Ah, ok. I wonder if you're not building properly. -lmpi is usually not sufficient to build an Open MPI application; the mpicc wrapper adds a number of flags for you, which you can see via "mpicc --showme".

How does one add more ldflags to your setup.py script?


Never mind; a few quick Google searches and I found the extra_link_args Python param. That turned out to be a red herring, anyway.
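
For reference, here is a minimal sketch of how the wrapper's linker flags could be fed into a setup.py. It assumes an Open MPI "mpicc" is on the PATH and that the extension is built with distutils/setuptools, whose Extension class accepts an extra_link_args list. The helper name mpi_link_flags is hypothetical and not part of the original setup.py.

```python
import shlex
import subprocess


def mpi_link_flags(showme_output=None):
    """Return the MPI linker flags as a list of strings.

    If no text is passed in, ask the Open MPI wrapper compiler for it:
    "mpicc --showme:link" prints the linker flags the wrapper would add.
    """
    if showme_output is None:
        showme_output = subprocess.check_output(
            ["mpicc", "--showme:link"], text=True
        )
    # shlex.split handles quoting the same way a POSIX shell would.
    return shlex.split(showme_output)


# Hypothetical use inside setup.py:
#   Extension("minimpiext", ["minimpiext.c"],
#             extra_link_args=mpi_link_flags())
```

Passing the text in explicitly makes it possible to try the helper on a machine without Open MPI installed.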

The issue is that Python is apparently dlopen'ing libmpi in a private scope. Specifically, if I "strace python" and type in "import minimpi", I see the following go by:

-----
open("/home/jsquyres/bogus/lib/libmpi.so.0", O_RDONLY) = 5
read(5, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\241\1\0"..., 832) = 832
fstat(5, {st_mode=S_IFREG|0755, st_size=799503, ...}) = 0
mmap(NULL, 1690760, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 5, 0) = 0x2a987f6000
-----

Which I think corresponds to calling dlopen() without RTLD_GLOBAL.
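
One way to check this from inside the interpreter, sketched under the assumption of a Unix platform and a reasonably recent Python 3 (os.RTLD_GLOBAL and sys.getdlopenflags() are only available there), is to look at the flags Python passes to dlopen() when it imports an extension module. The function name below is made up for illustration.

```python
import os
import sys


def imports_use_global_scope():
    """True if extension modules are currently dlopen'ed with RTLD_GLOBAL."""
    # sys.getdlopenflags() returns the flags Python hands to dlopen()
    # when it imports a compiled extension module.
    return bool(sys.getdlopenflags() & os.RTLD_GLOBAL)


if __name__ == "__main__":
    print("RTLD_GLOBAL set for imports:", imports_use_global_scope())
```

Libraries that the extension is linked against, such as libmpi here, are pulled in by that same dlopen() call and so end up in the same scope.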

The problem is that Open MPI is itself built upon plugins, and OMPI's plugins use symbols in OMPI's libraries. When we dynamically load OMPI's plugins, they need to be able to resolve those symbols against symbols that are visible in the process. If libmpi is loaded in a private scope, and then libmpi turns around and calls dlopen() to open a plugin, libmpi's symbols (and those of libopen-rte and libopen-pal) are not available to the plugin, so the plugin fails to load. This error propagates up the stack and MPI_INIT eventually fails.

You have some possible workarounds:

- We recommended to the PyMPI author a while ago that he add his own dlopen() of libmpi before calling MPI_INIT, but specifically using RTLD_GLOBAL, so that the library is opened in the global process space (not a private space in the process). Then libmpi's (and friends) symbols will be available to its plugins. If you're unhappy with the non-portability of dlopen, try lt_dlopenadvise() -- it's a portable version that is linked inside Open MPI.

- Another option is to configure/compile Open MPI with "--disable-dlopen" or "--enable-static --disable-shared" configure options. Either of these options will cause Open MPI to slurp all of its plugins up into libmpi (etc) and not dynamically open them at run-time, thereby avoiding the problem of Python opening libmpi in a private scope.

- Get Python to give you the possibility of opening dependent libraries in the global scope. This may be somewhat controversial; there are good reasons to open plugins in private scopes. But I have to imagine that OMPI is not the only python extension out there that wants to open plugins of its own; other such projects should be running into similar issues.
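
The first and third workarounds can also be approximated from the Python side. What follows is only a sketch, assuming a Unix platform, Python 3, and that the library is installed as "libmpi.so.0" (the name seen in the strace output above); the helper names preload_global and global_dlopen_scope are invented for illustration and are not part of minimpi or Open MPI.

```python
import ctypes
import os
import sys
from contextlib import contextmanager


def preload_global(libname="libmpi.so.0"):
    """dlopen() libname with RTLD_GLOBAL; return the handle, or None on failure."""
    try:
        # mode=RTLD_GLOBAL puts the library's symbols in the global scope,
        # so plugins opened later by libmpi can resolve against them.
        return ctypes.CDLL(libname, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return None


@contextmanager
def global_dlopen_scope():
    """Temporarily make Python import extension modules with RTLD_GLOBAL."""
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Restore the previous flags so unrelated imports are unaffected.
        sys.setdlopenflags(old_flags)


# Hypothetical usage, either one before MPI_Init() is reached:
#   preload_global()            # workaround 1: open libmpi globally first
#   with global_dlopen_scope():  # workaround 3: import in the global scope
#       import minimpi
```

Restoring the flags afterwards keeps the change limited to the one import that needs it, which sidesteps some of the objections to opening everything in the global scope.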

--
Jeff Squyres
Cisco Systems
