Hi!

While trying to adapt GridScheduler's (formerly SGE) Java DRMAA wrappers to 
Slurm's C DRMAA plugin, I discovered something peculiar about the way 
Slurm's plugins (for authentication, etc.) are loaded:

When using Slurm's libdrmaa.so with its example C program, everything 
works correctly. But when I run another test program from Java through 
a JNI wrapper library for libdrmaa, I get "undefined symbol" errors in 
auth_munge.so for symbols defined in libslurm.so.

After some digging, I think I understand why this is: I use Munge for 
authentication, and when libslurm.so is used it tries to load auth_munge.so. 
The plugin is loaded at run-time (with dlopen(...)) and in turn uses symbols 
from libslurm.so, but without being linked to it. The way this seems to work 
in the regular pure-C case is that libslurm is loaded in "global mode" (the 
RTLD_GLOBAL flag to dlopen, or equivalent), making all symbols in it available 
implicitly. Hence it works. But from JNI, it is loaded in "local mode" 
(RTLD_LOCAL), and hence auth_munge cannot find the symbols from libslurm. The 
same issue seems to exist in most plugins.
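
To make the scoping behaviour above concrete, here is a minimal sketch of 
how one could check it from C. The helper names (symbol_in_global_scope, 
open_with_scope) are my own and are not part of Slurm or DRMAA; the sketch 
only assumes the standard dlopen/dlsym interface from <dlfcn.h>.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if `name` resolves in the process-wide
 * (global) symbol scope, 0 otherwise. */
int symbol_in_global_scope(const char *name)
{
    /* dlopen(NULL, ...) returns a handle for the main program. Lookups on
     * it search the main program, the libraries loaded at program startup,
     * and every library opened with RTLD_GLOBAL. Libraries opened with
     * RTLD_LOCAL are not searched. */
    void *self = dlopen(NULL, RTLD_NOW);
    if (self == NULL)
        return 0;

    void *sym = dlsym(self, name);
    dlclose(self);
    return sym != NULL;
}

/* Hypothetical helper: opens `path` either in "global mode" (RTLD_GLOBAL)
 * or in "local mode" (RTLD_LOCAL). Returns the handle, or NULL on failure. */
void *open_with_scope(const char *path, int global)
{
    int flags = RTLD_NOW | (global ? RTLD_GLOBAL : RTLD_LOCAL);
    void *handle = dlopen(path, flags);

    if (handle == NULL)
        fprintf(stderr, "dlopen(%s): %s\n", path, dlerror());
    return handle;
}
```

If my analysis is right, calling symbol_in_global_scope() with one of the 
symbols named in the "undefined symbol" error should return 0 inside the JVM 
process and 1 in the plain C example program. (On older glibc versions this 
needs to be linked with -ldl.)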

Is there a reason why the plugins are not linked to libslurm when they are 
built, or is my build not configured correctly? Otherwise, it would seem that 
each library file should link to all the other libraries it uses.

(There seems to be a way to force libslurm to be loaded in global mode, by 
explicitly loading libdrmaa (which in turn loads libslurm) in the JNI on-load 
hook, but I don't know if this has any consequences I haven't discovered 
yet...)
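
For reference, the workaround looks roughly like the sketch below. It is 
not a definitive implementation: the library name "libdrmaa.so", the helper 
promote_to_global and the WITH_JNI_HOOK guard are assumptions of mine, and 
the JNI part is only compiled when built with -DWITH_JNI_HOOK and the JDK 
headers on the include path.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Assumption: the name under which the dynamic loader finds Slurm's DRMAA
 * library. Adjust to the actual install. */
#define DRMAA_LIBRARY "libdrmaa.so"

/* Hypothetical helper: (re)opens `path` with RTLD_GLOBAL so that its
 * symbols, and those of its dependencies, become visible to libraries
 * loaded later. If the library is already loaded in local mode, this
 * promotes it to global mode. Returns 0 on success, -1 on failure. */
int promote_to_global(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);

    if (handle == NULL) {
        fprintf(stderr, "dlopen(%s): %s\n", path, dlerror());
        return -1;
    }
    /* The handle is intentionally never closed: the library has to stay
     * resident for the lifetime of the process. */
    return 0;
}

#ifdef WITH_JNI_HOOK
#include <jni.h>

/* Called by the JVM when the wrapper library is loaded, e.g. through
 * System.loadLibrary(). */
JNIEXPORT jint JNICALL JNI_OnLoad(JavaVM *vm, void *reserved)
{
    (void)vm;
    (void)reserved;

    if (promote_to_global(DRMAA_LIBRARY) != 0)
        return JNI_ERR;
    return JNI_VERSION_1_6;
}
#endif
```

The obvious side effect is that every symbol exported by libdrmaa and 
libslurm ends up in the global scope of the whole JVM process, where it 
could clash with symbols from other native libraries.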

Regards,
/Sebastian Gröhn

---

Sebastian Gröhn, Research Assistant
Dept. of Computing Science, Umeå University
