Jeff and All,
Belated Merry Christmas and a Happy New Year! And now we can get back to
business :)

I did some additional tests with the versions installed on our cluster (1.8.3 and 1.8.4), and I *believe* I can confirm that the problem is not rooted in the LSF support. And yes, adding "-lbat -llsf" is not a solution but a weird "workaround" that is not a true workaround (*), as you wrote.

Back to error description:

a) the problem only arises in the 1.8.x series when configured with these flags:
>  --disable-dlopen  --disable-mca-dso

We have been adding these flags since early 2012 in order to minimise the NFS activity at start-up time. In the 1.6.x versions we *probably* do not see the error because of (*): 'libbat.so' and 'liblsf.so' contain all the missing symbols, and since these two libraries were linked in by default before 1.8.x, we get the mess you described below (symbols from libbat.so and liblsf.so are *probably* used instead of the symbols in the user's code).

b) the problem vanishes when the
>  --as-needed
option is passed to the linker:
$ mpif90 -o example main.o y.tab.o mymain.o lex.yy.o  -Wl,--as-needed

c) yes, it seems to be a general linkage issue: the Intel compiler behaves the same as the GCC and Studio compilers.

Some logs: version 1.8.3, configured with "--disable-dlopen  --disable-mca-dso"

$ mpif90 -o example main.o y.tab.o mymain.o lex.yy.o -showme
ifort -o example main.o y.tab.o mymain.o lex.yy.o -I/opt/MPI/openmpi-1.8.3/linux/intel/include -fexceptions -I/opt/MPI/openmpi-1.8.3/linux/intel/lib -L/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath -Wl,/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath -Wl,/opt/MPI/openmpi-1.8.3/linux/intel/lib -Wl,--enable-new-dtags -L/opt/MPI/openmpi-1.8.3/linux/intel/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
(===> error)

(===> add -Wl,--as-needed)
$ mpif90 -o example main.o y.tab.o mymain.o lex.yy.o -Wl,--as-needed -showme
ifort -o example main.o y.tab.o mymain.o lex.yy.o -Wl,--as-needed -I/opt/MPI/openmpi-1.8.3/linux/intel/include -fexceptions -I/opt/MPI/openmpi-1.8.3/linux/intel/lib -L/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath -Wl,/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath -Wl,/opt/MPI/openmpi-1.8.3/linux/intel/lib -Wl,--enable-new-dtags -L/opt/MPI/openmpi-1.8.3/linux/intel/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
(===> OK)


(===> try removing the LSF linking stuff altogether:)
ifort -o example main.o y.tab.o mymain.o lex.yy.o -Wl,--as-needed -I/opt/MPI/openmpi-1.8.3/linux/intel/include -fexceptions -I/opt/MPI/openmpi-1.8.3/linux/intel/lib -Wl,-rpath -Wl,/opt/MPI/openmpi-1.8.3/linux/intel/lib -Wl,--enable-new-dtags -L/opt/MPI/openmpi-1.8.3/linux/intel/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
(===> OK)

(the same line as above but without -Wl,--as-needed ===> error).

Now the fun fact: omitting the Open MPI part entirely,
> ifort -o example main.o y.tab.o mymain.o lex.yy.o
links OK (the compiled app is not an MPI app).



Recap:
1) - the error is related to configuring with '--disable-dlopen --disable-mca-dso'
2) - the error vanishes when '-Wl,--as-needed' is added to the link line
3) - the error is not specific to any compiler or version
4) - the error is not related to LSF; linking with the LSF libs merely suppresses it because of a symbol mix-up

Well, I'm not really sure whether (2) is a true workaround, or whether it just triggers a deeper library search and binds to LSF libs that are linked in somewhere behind the scenes.

Could someone with more experience in linking libraries, and especially Open MPI, take a look at this? (Sorry for pushing this, but to me it all smells like a general linking problem rooted somewhere in Open MPI and '--disable-dlopen'; see the "fun fact" above.)

best

Paul Kapinos

P.S. I never used "-fPIC" here.


On 12/01/14 20:48, Jeff Squyres (jsquyres) wrote:
Paul --

Sorry for the delay -- SC and the US Thanksgiving holiday last week got in the 
way of responding to this properly.

I talked with Dave Goodell about this issue a bunch today.

Going back to the original email in this thread 
(http://www.open-mpi.org/community/lists/devel/2014/10/16064.php), it seems 
like this is the original problem:

----
$ make
mpif90 -c main.f90
yacc -d example4.y
mpicc -c y.tab.c
mpicc -c mymain.c
lex example4.l
mpicc -c lex.yy.c
mpif90 -o example main.o y.tab.o mymain.o lex.yy.o
ld: y.tab.o(.text+0xd9): unresolvable R_X86_64_PC32 relocation against symbol 
`yylval'
ld: y.tab.o(.text+0x16f): unresolvable R_X86_64_PC32 relocation against symbol 
`yyval'
-----

You later confirmed that adding -fPIC to the compile/link lines makes everything
work without adding -lbat -llsf.

Dave and I are sorta convinced (i.e., we could still be wrong, but we *think* 
this is right) that adding -lbat and -llsf to the link line is the Wrong 
solution.  The issue seems to be that a correct/matching yylval symbol is not 
being found during your final link.

Crucial point: the yylval symbol should be in *your* code, not in the bat and 
lsf libraries.  Indeed, if adding -lbat -llsf resolves the problem (because a 
matching yylval symbol is found in libbat or liblsf), then it means you're 
using the lex/yacc-generated yylval symbol in the LSF libraries, not your code 
(!).

And that definitely does not seem right.

(even though it *works* [in v1.6 and/or by adding -lbat -llsf in v1.8], it may 
not be actually doing what you expect under the covers, and you're really just 
getting lucky that it actually works at all)
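
One way to check where yylval is actually defined is to look at the symbol tables. The following is only a minimal sketch, assuming GNU binutils (nm) is available; `symbol_kind` is a hypothetical helper name, not part of Open MPI or LSF. It filters `nm` output for a single symbol and prints the type letter nm reports (U = undefined, C = common, B/D = defined in bss/data, T = defined in text).

```shell
# Hypothetical helper: read `nm` output on stdin and print the type
# letter(s) that nm recorded for the symbol named in $1.
# The symbol name is the last field of each nm line and the type letter
# is the field just before it (undefined symbols have no address field).
symbol_kind() {
    awk -v sym="$1" '$NF == sym { print $(NF - 1) }'
}
```

It could be used as `nm y.tab.o | symbol_kind yylval` for the user's object file, and as `nm -D --defined-only /opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib/libbat.so | symbol_kind yylval` for the LSF library. Which letters show up depends on the compiler and its flags, so no particular result is assumed here.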

It *seems* like this is a generic C/Fortran linkage issue; i.e., it would be 
good to look at the docs for your version of icc/ifort to see if they are 
generating different modes of .o files by default, or somesuch (i.e., why 
adding -fPIC to the compile/link line makes it work).

Make sense?

That being said, you previously sent the v1.6/v1.8 differences for "mpicc --showme"
-- can you send the same comparison for "mpif90 -o example main.o y.tab.o mymain.o lex.yy.o
--showme"?

Thanks.



On Oct 21, 2014, at 4:13 AM, Paul Kapinos <kapi...@itc.rwth-aachen.de> wrote:

Jeff,
the output of "mpicc --showme" is attached below.

> Do you really need to add "-lbat -llsf" to the command line to make it work?
Since both the 1.6.5 and 1.8.3 versions are built to work with Platform LSF, yes,
we need libbat and liblsf. The 1.6.5 version links these libraries explicitly on
the link line; 1.8.3 does not.



### 1.6.5:
icc 
-I/opt/MPI/openmpi-1.6.5/linux/intel/include/openmpi/opal/mca/hwloc/hwloc132/hwloc/include
 -I/opt/MPI/openmpi-1.6.5/linux/intel/include 
-I/opt/MPI/openmpi-1.6.5/linux/intel/include/openmpi -fexceptions -pthread 
-L/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib 
-L/opt/MPI/openmpi-1.6.5/linux/intel/lib -lmpi -losmcomp -lrdmacm -libverbs 
-lrt -lnsl -lutil -lpsm_infinipath -lbat -llsf -ldl -lm -lnuma -lrt -lnsl -lutil

### 1.8.3:
icc
-I/opt/MPI/openmpi-1.8.3/linux/intel/include/openmpi/opal/mca/hwloc/hwloc172/hwloc/include
-I/opt/MPI/openmpi-1.8.3/linux/intel/include/openmpi/opal/mca/event/libevent2021/libevent
-I/opt/MPI/openmpi-1.8.3/linux/intel/include/openmpi/opal/mca/event/libevent2021/libevent/include
-I/opt/MPI/openmpi-1.8.3/linux/intel/include
-I/opt/MPI/openmpi-1.8.3/linux/intel/include/openmpi -fexceptions -pthread
-L/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath
-Wl,/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath
-Wl,/opt/MPI/openmpi-1.8.3/linux/intel/lib -Wl,--enable-new-dtags
-L/opt/MPI/openmpi-1.8.3/linux/intel/lib -lmpi
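
The difference between the two wrapper outputs above is easier to see when only the -l arguments are compared. A minimal sketch in plain sh, assuming each link line is available as a single string; `libs_of` and `libs_only_in_first` are hypothetical helper names introduced only for illustration:

```shell
# Print the -l<lib> arguments of a link line, one per line,
# sorted and de-duplicated.
libs_of() {
    printf '%s\n' "$1" | tr -s ' \t' '\n' | grep -e '^-l' | sort -u
}

# Print the libraries that appear in the first link line
# but not in the second one.
libs_only_in_first() {
    second=$(libs_of "$2")
    libs_of "$1" | while IFS= read -r lib; do
        # -x: whole-line match, -F: fixed string, -e: pattern starts with '-'
        printf '%s\n' "$second" | grep -qxF -e "$lib" || printf '%s\n' "$lib"
    done
}
```

Called as `libs_only_in_first "$old_line" "$new_line"` (both variable names are assumptions) with the 1.6.5 and 1.8.3 outputs, this lists every library that the 1.6.5 wrapper passed explicitly and the 1.8.3 wrapper no longer does, -lbat and -llsf among them.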


On 10/18/14 01:56, Jeff Squyres (jsquyres) wrote:
I think the LSF part of this may be a red herring.  Do you really need to add "-lbat 
-llsf" to the command line to make it work?

The error message *sounds* like y.tab.o was compiled differently than 
others...?  It's hard to know without seeing the output of mpicc --showme.


On Oct 17, 2014, at 7:51 AM, Ralph Castain <r...@open-mpi.org> wrote:

Forwarding this for Paul until his email address gets updated on the User list:

Begin forwarded message:

Date: October 17, 2014 at 6:35:31 AM PDT
From: Paul Kapinos <kapi...@itc.rwth-aachen.de>
To: Open MPI Users <us...@open-mpi.org>
Cc: "Kapinos, Paul" <kapi...@itc.rwth-aachen.de>, <fri...@cats.rwth-aachen.de>
Subject: Open MPI 1.8: link problem when Fortran+C+Platform LSF

Dear Open MPI developers,

we have both Open MPI 1.6(.5) and 1.8(.3) in our cluster, configured to be used
with Platform LSF.

One of our users ran into an issue when trying to link his code (a combination of
lex/C and Fortran) with v1.8, whereas with Open MPI 1.6 the code links OK.

$ make
mpif90 -c main.f90
yacc -d example4.y
mpicc -c y.tab.c
mpicc -c mymain.c
lex example4.l
mpicc -c lex.yy.c
mpif90 -o example main.o y.tab.o mymain.o lex.yy.o
ld: y.tab.o(.text+0xd9): unresolvable R_X86_64_PC32 relocation against symbol 
`yylval'
ld: y.tab.o(.text+0x16f): unresolvable R_X86_64_PC32 relocation against symbol 
`yyval'
.......

Looking at the output of "mpif90 --show-me" shows that the link line, and possibly
the philosophy behind it, has changed; there is also a note about it:

# Note that per https://svn.open-mpi.org/trac/ompi/ticket/3422, we
# intentionally only link in the MPI libraries (ORTE, OPAL, etc. are
# pulled in implicitly) because we intend MPI applications to only use
# the MPI API.




Well, by now we know two workarounds:
a) add "-lbat -llsf" to the link line
b) add " -Wl,--as-needed" to the link line

Which would be better? Maybe one of these should be added to "linker_flags=..." in
the .../share/openmpi/mpif90-wrapper-data.txt file? Given the note above, would (b)
be better?

Best

Paul Kapinos

P.S. $ mpif90 --show-me

1.6.5
ifort -nofor-main -I/opt/MPI/openmpi-1.6.5/linux/intel/include -fexceptions 
-I/opt/MPI/openmpi-1.6.5/linux/intel/lib 
-L/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib 
-L/opt/MPI/openmpi-1.6.5/linux/intel/lib -lmpi_f90 -lmpi_f77 -lmpi -losmcomp 
-lrdmacm -libverbs -lrt -lnsl -lutil -lpsm_infinipath -lbat -llsf -ldl -lm 
-lnuma -lrt -lnsl -lutil

1.8.3
ifort             -I/opt/MPI/openmpi-1.8.3/linux/intel/include -fexceptions 
-I/opt/MPI/openmpi-1.8.3/linux/intel/lib 
-L/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath 
-Wl,/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath 
-Wl,/opt/MPI/openmpi-1.8.3/linux/intel/lib -Wl,--enable-new-dtags 
-L/opt/MPI/openmpi-1.8.3/linux/intel/lib -lmpi_usempif08 
-lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi

P.S.2 $ man ld
....
       --as-needed
       --no-as-needed
           This option affects ELF DT_NEEDED tags for dynamic libraries
           mentioned on the command line after the --as-needed option.
           Normally the linker will add a DT_NEEDED tag for each dynamic
           library mentioned on the command line, regardless of whether the
           library is actually needed or not.  --as-needed causes a DT_NEEDED
           tag to only be emitted for a library that satisfies an undefined
           symbol reference from a regular object file or, if the library is
           not found in the DT_NEEDED lists of other libraries linked up to
           that point, an undefined symbol reference from another dynamic
           library.  --no-as-needed restores the default behaviour.

....
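
To see which DT_NEEDED tags actually ended up in a linked binary (for example the one produced with -Wl,--as-needed), the dynamic section can be inspected. A minimal sketch, assuming GNU readelf is available; `needed_libs` is a hypothetical helper name that only filters the text printed by `readelf -d`:

```shell
# Hypothetical helper: read `readelf -d` output on stdin and print the
# library name of every DT_NEEDED entry, one per line.
# A NEEDED line looks like:
#   0x0000000000000001 (NEEDED)   Shared library: [libfoo.so.1]
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}
```

Usage would be `readelf -d example | needed_libs`. If the LSF libraries show up in that list, the binary binds to them directly; if they do not, they can still be pulled in indirectly as dependencies of another library in the list.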

--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915


<lexyacc.zip>



