On Mar 13, 2014, at 3:15 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:

> The motivation was
> http://www.stats.uwo.ca/faculty/yu/Rmpi/changelogs.htm notes
> ----------------------------------
> 2007-10-24, version 0.5-5:
> 
> dlopen has been used to load libmpi.so explicitly. This is mainly useful
> for Rmpi under OpenMPI where one might see many error messages:
> mca: base: component_find: unable to open osc pt2pt: file not found
> (ignored)
> if libmpi.so is not loaded with RTLD_GLOBAL flag.
> -------------------------------------
> 
> I think I'll try changing it to try libmpi.so first so that it picks up
> libmpi.so.1 if available.  I've already rebuilt R, though it looks as if
> Rmpi may have been the source of the problems.


If you care about the reason why: many (most? all?) of OMPI's plugins 
depend on symbols in the main MPI library.

Hence, if those symbols can't be found in the process' namespace when OMPI 
tries to dlopen one of its plugins, that dlopen fails because the symbols the 
plugin needs cannot be resolved.

It seems weird because libmpi.so *is* in the process (obviously), but it just 
can't be found by the plugin because libmpi.so may well be in a private 
namespace -- and therefore its symbols are hidden from the plugin that is being 
dlopened.  Weird, but true.
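
To make the workaround from the changelog concrete, here is a minimal sketch in 
C of loading a library with RTLD_GLOBAL before anything that depends on it is 
dlopened.  The helper name load_globally and the idea of passing a list of 
candidate names are assumptions for illustration only; this is not Rmpi's 
actual code.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: try each candidate name in order and load the first
 * one that opens, using RTLD_GLOBAL so that its symbols are visible to
 * libraries dlopen'ed later (such as OMPI's plugins).
 * Returns the handle, or NULL if no candidate could be loaded. */
static void *load_globally(const char *const *candidates, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        void *handle = dlopen(candidates[i], RTLD_NOW | RTLD_GLOBAL);
        if (handle != NULL)
            return handle;
        /* dlerror() describes why the most recent dlopen failed. */
        fprintf(stderr, "dlopen(%s) failed: %s\n", candidates[i], dlerror());
    }
    return NULL;
}
```

A caller would invoke this before MPI is initialized, passing names such as 
"libmpi.so" and "libmpi.so.1" in whatever order suits the installation.  On 
older glibc systems the program may need to be linked with -ldl.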

I honestly don't know what happens if you have a library opened in a private 
namespace in a process and then you dlopen it again in a public namespace in 
the same process.  Do you actually get a second copy of libmpi (with a second 
copy of all of its global symbols), or is the linker smart enough to realize 
you already have it loaded and effectively move it into the public namespace?

I'm not sure.
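
One way to check this on a given platform is a small experiment: open a 
library with RTLD_LOCAL, open it again with RTLD_GLOBAL, and compare the two 
handles.  The sketch below assumes a POSIX-style dlopen; the function name 
same_handle_on_reopen is hypothetical.  The Linux dlopen(3) man page says that 
reopening an already-loaded object returns the same handle, and that an object 
loaded with RTLD_LOCAL can be promoted to RTLD_GLOBAL by a later dlopen, which 
suggests there is no second copy on Linux, but other platforms may behave 
differently.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical experiment: open libname privately, then again globally.
 * Returns 1 if both opens yield the same handle, 0 if the handles differ,
 * and -1 if the library could not be opened at all. */
static int same_handle_on_reopen(const char *libname)
{
    void *local = dlopen(libname, RTLD_NOW | RTLD_LOCAL);
    if (local == NULL)
        return -1;

    void *global = dlopen(libname, RTLD_NOW | RTLD_GLOBAL);
    if (global == NULL) {
        dlclose(local);
        return -1;
    }

    int same = (local == global);

    /* Each successful dlopen takes a reference; release both. */
    dlclose(global);
    dlclose(local);
    return same;
}
```

Identical handles only show that the loader did not map a second copy.  To 
confirm that the symbols actually became globally visible, one would still 
need to dlopen a dependent library afterwards and see whether it resolves.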

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
