Hi Nathan,
Nathan DeBardeleben writes:

I've been having this problem for a week or so and I've been asking other people to weigh in if they know what I'm doing wrong. I've gotten nowhere on this, so I figure I'll finally drop it on the list. First, here's the important info.

The machine:
[sparkplug]~ > cat /etc/issue
Welcome to SuSE Linux 9.1 (x86-64) - Kernel \r (\l).

[sparkplug]~ > uname -a
Linux sparkplug 2.6.10 #4 SMP Wed Jan 26 11:50:00 MST 2005 x86_64 x86_64 x86_64 GNU/Linux

My versions of libtool, autoconf, automake:
[sparkplug]~ > libtool --version
ltmain.sh (GNU libtool) 1.5.20 (1.1220.2.287 2005/08/31 18:54:15)
*snip*
My ompi version: 7322. This has been going on for a few days, as I said, and I've been updating a lot, with no progress.

Configured using:
$ ./configure --enable-static --disable-shared --without-threads --prefix=/home/ndebard/local/ompi --with-devel-headers --enable-mca-no-build=ptl-gm

Simple C file which I will compile into a shared library:
int test_compile(int x) {
    int rc;
    rc = orte_init(true);
    printf("rc = %d\n", rc);
    return x + 1;
}

The above file is named 'testlib.c'. OK, so let's build this:
[sparkplug]~/ompi-test > mpicc -c testlib.c
[sparkplug]~/ompi-test > mpicc -shared -o libtestlib.so testlib.o
/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/bin/ld:
testlib.o: relocation R_X86_64_32 can not be used when making a shared
object; recompile with -fPIC
testlib.o: could not read symbols: Bad value
collect2: ld returned 1 exit status

OK, I don't have time to reproduce this at the moment, but I see several
issues. First, testlib.o needs to be compiled PIC (you noticed that already).

OK, so relocation problems. Maybe I'll follow the directions and -fPIC my file myself:
[sparkplug]~/ompi-test > mpicc -c testlib.c -fPIC
[sparkplug]~/ompi-test > mpicc -shared -o libtestlib.so testlib.o
/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/bin/ld:
/home/ndebard/local/ompi/lib/liborte.a(orte_init.o): relocation
R_X86_64_32 can not be used when making a shared object; recompile with -fPIC
/home/ndebard/local/ompi/lib/liborte.a: could not read symbols: Bad value
collect2: ld returned 1 exit status

This is the second issue: orte_init.o is not compiled PIC (naturally,
since you configured with --disable-shared).  But the error here is that
it tries to link the static library into the shared one, which is wrong.
This is either a Libtool or an OpenMPI bug.  Please show what both of the
above mpicc calls generate.

OK, so I read this as there being a relocation problem in 'liborte.a'. I un-arred liborte.a and checked some of the files with 'file', and it says 64bit. I haven't yet written a script to check every file in here, but here's orte_init.o:
[sparkplug]~/<1>tmp > file orte_init.o
orte_init.o: ELF 64-bit LSB relocatable, AMD x86-64, version 1 (SYSV), not stripped

So that at least says it's 64bit.
And to confirm, my mpicc's 64bit too:
[sparkplug]~/<1>tmp > which mpicc
/home/ndebard/local/ompi/bin/mpicc
[sparkplug]~/<1>tmp > file /home/ndebard/local/ompi/bin/mpicc
/home/ndebard/local/ompi/bin/mpicc: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked (uses shared libs), not stripped
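
Checking every object in the archive, as mentioned above, does not require un-arring it by hand. Note that 'file' only reports the ELF class (64bit), not whether the code is position-independent; the relocation types are what the linker is complaining about. Here is a minimal sketch, assuming binutils 'readelf -r' prints a "File:" header per archive member and the relocation type in the third column. The name find_abs_relocs is a hypothetical helper, not part of any tool:

```shell
#!/bin/sh
# Sketch: report archive members that contain absolute 32-bit relocations
# (R_X86_64_32 or R_X86_64_32S), which ld rejects when building a shared object.
# find_abs_relocs is a hypothetical helper name chosen for this example.

# Reads `readelf -r` output on stdin and prints "member<TAB>count" for each
# archive member that has at least one such relocation, in input order.
find_abs_relocs() {
    awk '
        /^File: / {
            # Assumed header format: "File: libfoo.a(member.o)"
            member = $0
            sub(/^File: /, "", member)
            if (!(member in count)) { order[++n] = member; count[member] = 0 }
            next
        }
        # Assumed column layout: offset, info, type, symbol value, symbol name
        member != "" && ($3 == "R_X86_64_32" || $3 == "R_X86_64_32S") {
            count[member]++
        }
        END {
            for (i = 1; i <= n; i++)
                if (count[order[i]] > 0)
                    printf "%s\t%d\n", order[i], count[order[i]]
        }
    '
}

# Typical use (archive path taken from the linker error above):
#   readelf -r /home/ndebard/local/ompi/lib/liborte.a | find_abs_relocs
```

Any member listed by this was compiled without -fPIC and cannot go into a shared object; an empty result suggests the archive is PIC throughout.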

Someone suggested I take '--disable-shared' out of the configure line, so I did. The result was the same.

Are you sure you really rebuilt the library afterwards (I believe a
"make clean" in between is necessary)?  Please show the link line
of liborte.la.  (You can do a full build, then delete liborte.la and
type "make" again to capture its output more easily.)

So the result is that I cannot build a shared library that uses orte calls on a 64bit linux machine. So then I tried taking out the orte calls and using MPI calls instead. Sure, this function makes no sense, but here it is now:
#include "orte_config.h"
#include <mpi.h>
int test_compile(int x) {
    MPI_Comm_rank(MPI_COMM_WORLD, &x);
    return x + 1;
}

And now, when I try and make a shared object I get relocation errors:

Should be the same issue.

/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/bin/ld:
/home/ndebard/local/ompi/lib/libmpi.a(comm_init.o): relocation R_X86_64_32 can not be used when making a shared object; recompile with -fPIC
/home/ndebard/local/ompi/lib/libmpi.a: could not read symbols: Bad value

So... could perhaps the build be messed up and not be really using 64bit code? Am I the only one seeing this? It's a trivial test for those of you with access to a 64bit machine if you wouldn't mind testing for me.

As I said, I can probably only test this a few days from now.
Cheers,
Ralf
