Re: [O-MPI devel] 64bit shared library problems

2005-09-13 Thread Ralf Wildenhues
Hi Nathan, 


Nathan DeBardeleben writes:


I've been having this problem for a week or so and I've been asking 
other people to weigh in if they know what I'm doing wrong.  I've gotten 
nowhere on this so I figure I'll finally drop it out on the list.  
First, here's the important info:
The machine: 

[sparkplug]~ > cat /etc/issue 


Welcome to SuSE Linux 9.1 (x86-64) - Kernel \r (\l).



[sparkplug]~ > uname -a
Linux sparkplug 2.6.10 #4 SMP Wed Jan 26 11:50:00 MST 2005 x86_64 
x86_64 x86_64 GNU/Linux


My versions of libtool, autoconf, automake: 


[sparkplug]~ > libtool --version
ltmain.sh (GNU libtool) 1.5.20 (1.1220.2.287 2005/08/31 18:54:15)
*snip* 

My ompi version: 7322 - but this has been going on for a few days like I 
said and I've been updating a lot, with no progress. 

Configured using: 

$ ./configure --enable-static --disable-shared --without-threads 
--prefix=/home/ndebard/local/ompi --with-devel-headers 
--enable-mca-no-build=ptl-gm


Simple C file which I will compile into a shared library: 


int test_compile(int x) {
    int rc;

    rc = orte_init(true);
    printf("rc = %d\n", rc);

    return x + 1;
}


Above file is named 'testlib.c' 

OK, so let's build this: 


[sparkplug]~/ompi-test > mpicc -c testlib.c
[sparkplug]~/ompi-test > mpicc -shared -o libtestlib.so testlib.o
/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/bin/ld:
testlib.o: relocation R_X86_64_32 can not be used when making a shared
object; recompile with -fPIC
testlib.o: could not read symbols: Bad value
collect2: ld returned 1 exit status


OK, I don't have time to reproduce this at the moment, but I see several
issues: First, testlib.o needs to be compiled PIC (you noticed that 
already). 

OK so relocation problems.  Maybe I'll follow the directions and -fPIC 
my file myself: 


[sparkplug]~/ompi-test > mpicc -c testlib.c -fPIC
[sparkplug]~/ompi-test > mpicc -shared -o libtestlib.so testlib.o
/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/bin/ld:
/home/ndebard/local/ompi/lib/liborte.a(orte_init.o): relocation
R_X86_64_32 can not be used when making a shared object; recompile 
with -fPIC

/home/ndebard/local/ompi/lib/liborte.a: could not read symbols: Bad value
collect2: ld returned 1 exit status


This is the second issue: orte_init.o is not compiled PIC (naturally,
since you configured with --disable-shared).  But the error here is that
it tries to link the static library into the shared one, which is wrong.
Either a Libtool or an OpenMPI bug.  Please show what both of the above
mpicc calls generate.
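
To make the PIC requirement concrete, here is a minimal sketch, not taken from the thread. It is a variant of testlib.c in which orte_init(true) is replaced by a hypothetical stub, fake_init, so that it builds without Open MPI. The comments describe what a compiler typically does on x86-64; the exact instructions depend on the compiler and its defaults, so treat this as an illustration rather than a description of the GCC 3.3.3 build above.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical stand-in for orte_init(true); it is not part of Open MPI
 * and exists only so the sketch compiles on its own. */
static int fake_init(void)
{
    return 0;
}

int test_compile(int x)
{
    assert(x < INT_MAX);  /* x + 1 below must not overflow */

    int rc = fake_init();

    /* The address of this string literal has to end up in a register.
     * Non-PIC x86-64 code may encode it as a 32-bit absolute value,
     * which the object file records as an R_X86_64_32 relocation.
     * A shared object can be loaded at any address, so the linker
     * refuses such relocations. Code built with -fPIC addresses the
     * literal relative to the instruction pointer instead, which
     * works wherever the library is mapped. */
    printf("rc = %d\n", rc);

    return x + 1;
}
```

The same reasoning applies to every object pulled in from a static archive: if liborte.a was built without -fPIC, compiling testlib.c with -fPIC cannot help, because the archive members still carry absolute relocations. The relocation entries of an object file can be listed with readelf -r.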

OK so I read this as there's a relocation problem in 'liborte.a'.  I 
un-arred liborte.a and checked some of the files with 'file' and it says 
64bit.  I haven't yet written a script to check every file in here, but 
here's orte_init.o: 


[sparkplug]~/<1>tmp > file orte_init.o
orte_init.o: ELF 64-bit LSB relocatable, AMD x86-64, version 1 (SYSV), 
not stripped


So that at least says it's 64bit.
And to confirm, my mpicc's 64bit too: 


[sparkplug]~/<1>tmp > which mpicc
/home/ndebard/local/ompi/bin/mpicc
[sparkplug]~/<1>tmp > file /home/ndebard/local/ompi/bin/mpicc
/home/ndebard/local/ompi/bin/mpicc: ELF 64-bit LSB executable, AMD 
x86-64, version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked 
(uses shared libs), not stripped


Someone suggested I take out the '--disable-shared' from the configure 
line, so I did.  The result was the same.


Are you sure you really rebuilt the library afterwards (I believe a
"make clean" in between is necessary)?  Please show the link line
of liborte.la.  (You can do a full build, then delete liborte.la and
type "make" again to capture its output more easily.) 

So the result is that I cannot build a shared library that uses orte 
calls on a 64-bit Linux machine.
So then I tried taking out the orte calls and using MPI calls instead.  
Sure, this function makes no sense but here it is now: 


#include "orte_config.h"
#include <mpi.h>

int test_compile(int x) {
    MPI_Comm_rank(MPI_COMM_WORLD, &x);

    return x + 1;
}


And now, when I try and make a shared object I get relocation errors:


Should be the same issue. 

/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/bin/ld:
/home/ndebard/local/ompi/lib/libmpi.a(comm_init.o): relocation 
R_X86_64_32 can not be used when making a shared object; recompile 
with -fPIC

/home/ndebard/local/ompi/lib/libmpi.a: could not read symbols: Bad value


So... could perhaps the build be messed up and not be really using 64bit 
code?
Am I the only one seeing this?  It's a trivial test for those of you 
with access to a 64bit machine if you wouldn't mind testing for me.


As I said, I can probably only test this a few days from now. 


Cheers,
Ralf


[O-MPI devel] OMPI compile failing

2005-09-13 Thread Nathan DeBardeleben

Compiling I get:

 gcc -DHAVE_CONFIG_H -I. -I. -I../../../../include 
-I../../../../include -I../../../../include -I../../../.. 
-I../../../.. -I../../../../include -I../../../../opal 
-I../../../../orte -I../../../../ompi -g -Wall -Wundef -Wno-long-long 
-Wsign-compare -Wmissing-prototypes -Wstrict-prototypes -Wcomment 
-pedantic -Werror-implicit-function-declaration -fno-strict-aliasing 
-MT btl_gm.lo -MD -MP -MF .deps/btl_gm.Tpo -c btl_gm.c  -fPIC -DPIC -o 
.libs/btl_gm.o

btl_gm.c: In function `mca_btl_gm_prepare_src':
btl_gm.c:237: error: `gm_btl' undeclared (first use in this function)
btl_gm.c:237: error: (Each undeclared identifier is reported only once
btl_gm.c:237: error: for each function it appears in.)
btl_gm.c: In function `mca_btl_gm_prepare_dst':
btl_gm.c:398: warning: ISO C89 forbids mixed declarations and code
btl_gm.c:404: error: structure has no member named `mpoo_retain'
btl_gm.c:381: warning: unused variable `gm_btl'
make[4]: *** [btl_gm.lo] Error 1
make[4]: Leaving directory `/home/ndebard/ompi/ompi/mca/btl/gm'
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory `/home/ndebard/ompi/ompi/dynamic-mca/btl'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/ndebard/ompi/ompi/dynamic-mca'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/ndebard/ompi/ompi'
make: *** [all-recursive] Error 1
[sparkplug]~/ompi > 


I've configured using the option I thought would disable this:


--enable-mca-no-build=ptl-gm


I even tried --enable-mca-no-build=btl-gm.
No luck.

--
-- Nathan
Correspondence
-
Nathan DeBardeleben, Ph.D.
Los Alamos National Laboratory
Parallel Tools Team
High Performance Computing Environments
phone: 505-667-3428
email: ndeb...@lanl.gov
-



Re: [O-MPI devel] OMPI compile failing

2005-09-13 Thread Tim S. Woodall

Nathan - What machine are you on?

Galen - have you tried GM w/ your changes?




Re: [O-MPI devel] OMPI compile failing

2005-09-13 Thread Nathan DeBardeleben
I'm trying this on sparkplug.  I have no real desire to use GM, so if it 
can be disabled then that'd be great.


-- Nathan




___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

 



Re: [O-MPI devel] OMPI compile failing

2005-09-13 Thread George Bosilca
Please update again (rev 7352). I ran into the same problems yesterday
when I compiled on thor, but I didn't commit because I thought I was
the only one still using GM.

BTW I think the correct option to not compile GM is --without-gm at
configure time.


  george.




"Half of what I say is meaningless; but I say it so that the other  
half may reach you"

  Kahlil Gibran




Re: [O-MPI devel] OMPI compile failing

2005-09-13 Thread Galen M. Shipman


Looking into it now.. looks like a typo or two..






Re: [O-MPI devel] OMPI compile failing

2005-09-13 Thread Galen M. Shipman
Thanks George. I didn't get a chance to test this after yesterday's
merge; I will do so and commit any other needed changes.








[O-MPI devel] Startup/shutdown performance

2005-09-13 Thread Ralph H. Castain

Yo folks

Josh ran some tests for me on Odin earlier today - the results show a 
major improvement in our startup/shutdown performance. As you may 
recall, our times grew roughly exponentially before - as the attached 
graph shows, they now grow roughly linearly. The data also shows that 
the MPI_INIT penalty is fairly small. This is due to the data 
exchange being "encapsulated" in the initial data sent back at the 
stage_1 trigger, thus avoiding any further overhead as the number of 
processes grows. The data was taken using the rsh launcher.


We should be able to further improve our scalability once we (a) 
incorporate a tree-based scheme into the rsh launcher and (b) utilize 
a tree-based (or better) broadcast mechanism for sending the trigger 
messages (right now, we send them linearly across the processes).
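
To give a feel for what a tree-based trigger broadcast could buy, here is a small sketch. It assumes a binomial tree, which is only one possible tree scheme and is not necessarily what the launcher will end up using; the function names are made up for illustration. It compares the number of sequential sends from the root in the linear scheme with the number of forwarding rounds in the tree, and shows which rank forwards to which in each round.

```c
#include <assert.h>

/* Linear scheme: the root contacts every other process itself, so it
 * performs nprocs - 1 sends one after another. */
int linear_steps(int nprocs)
{
    assert(nprocs >= 1);
    return nprocs - 1;
}

/* Binomial tree: in every round, each process that already holds the
 * message forwards it to one process that does not, so the number of
 * informed processes doubles until everyone is covered. */
int tree_rounds(int nprocs)
{
    assert(nprocs >= 1);

    int rounds = 0;
    long informed = 1;  /* only the root holds the message at first */

    while (informed < nprocs) {
        informed *= 2;
        rounds++;
    }
    return rounds;
}

/* Destination of rank `rank` in round `round` (counted from 0): ranks
 * below 2^round forward to rank + 2^round, if that rank exists.
 * Returns -1 when the rank does not send in this round. */
int tree_dest(int rank, int round, int nprocs)
{
    assert(nprocs >= 1);
    assert(rank >= 0 && rank < nprocs);
    assert(round >= 0 && round < 30);

    int stride = 1 << round;
    if (rank >= stride)
        return -1;  /* this rank has not received the message yet */

    int dest = rank + stride;
    return dest < nprocs ? dest : -1;
}
```

For 64 processes, linear_steps gives 63 sequential sends from the root, while tree_rounds gives 6 rounds. The total number of messages is the same in both schemes; the tree only spreads the sending across processes.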


Anyway, thought you might find this of interest.
Ralph
[Attachment: startup/shutdown timing graph, not preserved in this archive]