[OMPI devel] Building Error

2011-08-15 Thread Matthew Russell
I hope this problem merits being posted here.

On OS X (Snow Leopard, and Lion), I cannot seem to build Open MPI.

After a lot of building, I get the error:

/bin/sh ../../../libtool --tag=CC   --mode=link
/opt/pgi/osx86-64/10.9/bin/pgcc  -DNDEBUG -O2 -Msignextend -V
-export-dynamic   -o orte-clean orte-clean.o
../../../orte/libopen-rte.la -lutil
libtool: link: /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend -V
-o orte-clean orte-clean.o  ../../../orte/.libs/libopen-rte.a
/Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
Undefined symbols for architecture x86_64:
  "_orte_odls", referenced from:
  _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
ld: symbol(s) not found for architecture x86_64

This is with the PGI 10.9 compiler and Open MPI 1.4.3; the platform is x86_64.

The README does not list PGI as a compiler that Open MPI was tested with, and
there are notes about its support for Xgrid being broken (I'm not sure if
this is related).

I seem to get the error regardless of which configure flags I'm using, just
for completeness though, here are the flags I am using:
./configure --prefix=/usr/local/openmpi_pg --enable-mpi-f77 --enable-mpi-f90
--with-memory-manager=none

Has anyone else got or fixed this error?

I looked at other postings in this list, such as
http://www.open-mpi.org/community/lists/devel/2007/05/1590.php , but they
didn't help much.
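
One way to narrow down an undefined-symbol error like the `_orte_odls` one above is to check how the symbol is recorded in the archive that should provide it. The sketch below filters `nm` output for a single symbol and prints the type letter of each entry (`U` undefined, `C` common, `D`/`B` data or bss, `T` text). The function name `symbol_types` is hypothetical, and the sketch assumes the usual `nm` layout of value, type, name.

```shell
# symbol_types SYMBOL
# Reads `nm` output on stdin and prints the type letter of every entry
# for SYMBOL, one per line. Hypothetical helper; assumes the default
# "value type name" layout (undefined entries have no value column).
symbol_types() {
    awk -v sym="$1" 'NF >= 2 && $NF == sym { print $(NF - 1) }'
}

# Example (archive path taken from the failing link line above):
#   nm ../../../orte/.libs/libopen-rte.a | symbol_types _orte_odls
```

If the only entries printed are `U`, no object in the archive defines the symbol. If some other letter shows up as well, a definition exists and the question becomes why the linker is not picking it up.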


Re: [OMPI devel] Building Error

2011-08-16 Thread Matthew Russell
Hmm, I tried Larry's recommendation (quoted below), adding
-Wl,-search_paths_first, and I still ran into the same issue.  I suspect it
is an issue with PGI.

Meanwhile, I've been able to get my application (CMAQ) working with MPICH2,
so for now at least I am going to continue with that.

Thanks for the responses!

On Mon, Aug 15, 2011 at 8:43 PM, Ralph Castain wrote:

> FWIW: I build OMPI on Mac OS-X (Snow Leopard) every day, without adding any
> extra flags, without problem. The citation below relates to something from a
> long time ago, I believe - haven't seen that problem in quite some time.
>
> I do not, however, use PGI. We regularly have problems with PGI on a
> variety of systems, and I suspect you are hitting one here - but can't
> confirm it as we don't have PGI licenses to use for testing.
>
> The Xgrid support is broken, but has nothing to do with the problem you
> describe. Just means you can't launch via Xgrid.
>
>
>
> On Aug 15, 2011, at 2:53 PM, Larry Baker wrote:
>
> Matthew,
>
> I had the same type of error with a completely different software package on
> Mac OS X.  The error occurs because of the way that Mac OS X searches for
> -lutil.  If the libutil.a that ORTE needs is its own, i.e., not the system
> libutil.dylib, then you have exactly the same problem I did.
>
> Here are my notes for the fix using gcc.  You will have to find out the
> equivalent method to pass the -search_paths_first linker option using pgcc.
>
> # Mac OS X searches for shared libraries before static libraries.  Thus,
> # -L -lutil finds the system libutil.dylib before our libutil.a, which
> # causes undefined references in the link step because it is using the
> # wrong library.  The ld -search_paths_first option forces ld to search
> # each directory first for a matching library, instead of all directories
> # first for a shared library.
> # Note: this is the form to pass -search_paths_first to ld when $(CC) is
> # the linker command in makefile.ux
> export LDFLAGS=-Wl,-search_paths_first
>
>
> Larry Baker
> US Geological Survey
> 650-329-5608
> ba...@usgs.gov
>
> On 15 Aug 2011, at 1:01 PM, Matthew Russell wrote:
>
>
>
> I hope this problem merits being posted here.
>
> On OS X (Snow Leopard, and Lion), I cannot seem to build Open MPI.
>
> After a lot of building, I get the error:
>
> /bin/sh ../../../libtool --tag=CC   --mode=link
> /opt/pgi/osx86-64/10.9/bin/pgcc  -DNDEBUG -O2 -Msignextend -V
> -export-dynamic   -o orte-clean orte-clean.o
> ../../../orte/libopen-rte.la -lutil
> libtool: link: /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend -V
> -o orte-clean orte-clean.o  ../../../orte/.libs/libopen-rte.a
> /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
> Undefined symbols for architecture x86_64:
>   "_orte_odls", referenced from:
>   _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
> ld: symbol(s) not found for architecture x86_64
>
> This is with the PGI 10.9 compiler and Open MPI 1.4.3; the platform is x86_64.
>
> The README does not list PGI as a compiler that Open MPI was tested with,
> and there are notes about its support for Xgrid being broken (I'm not sure
> if this is related).
>
> I seem to get the error regardless of which configure flags I'm using, just
> for completeness though, here are the flags I am using:
> ./configure --prefix=/usr/local/openmpi_pg --enable-mpi-f77
> --enable-mpi-f90 --with-memory-manager=none
>
> Has anyone else got or fixed this error?
>
> I looked at other postings in this list, such as
> http://www.open-mpi.org/community/lists/devel/2007/05/1590.php , but they
> didn't help much.
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
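
The two search orders Larry describes above can be modeled with a small shell function. This is only a sketch of the lookup order, not of what ld does internally; the name `resolve_lib` and the directories in the usage lines are hypothetical.

```shell
# resolve_lib MODE NAME DIR...
# Prints the first file a -lNAME lookup would pick from the given
# directories. Hypothetical model of the two search orders:
#   default     - look in every directory for libNAME.dylib first,
#                 and only then in every directory for libNAME.a
#   paths_first - in each directory in turn, look for libNAME.dylib,
#                 then libNAME.a, before moving to the next directory
resolve_lib() {
    mode=$1
    name=$2
    shift 2
    case $mode in
    default)
        for ext in dylib a; do
            for dir in "$@"; do
                if [ -f "$dir/lib$name.$ext" ]; then
                    echo "$dir/lib$name.$ext"
                    return 0
                fi
            done
        done
        ;;
    paths_first)
        for dir in "$@"; do
            for ext in dylib a; do
                if [ -f "$dir/lib$name.$ext" ]; then
                    echo "$dir/lib$name.$ext"
                    return 0
                fi
            done
        done
        ;;
    *)
        echo "unknown mode: $mode" >&2
        return 2
        ;;
    esac
    return 1    # nothing found
}

# Example with hypothetical directories:
#   resolve_lib default     util ./ours /usr/lib
#   resolve_lib paths_first util ./ours /usr/lib
```

With a static libutil.a in the first directory and a libutil.dylib in the second, the default mode picks the .dylib and paths_first picks the .a, which is the difference -search_paths_first is meant to make.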


Re: [OMPI devel] Building Error

2011-08-17 Thread Matthew Russell
Hi, I'm really grateful for the detailed responses.

I'll try the different options Larry suggested.  Right now MPICH seems to be
satisfying my needs, so I have less time to devote to getting Open MPI
working, but I am interested in having it working as an alternative to
MPICH.

Thanks!

On Tue, Aug 16, 2011 at 10:35 PM, Ralph Castain wrote:

> Just an FYI. Disabling ORTE support is intended solely for systems that
> require no RTE assistance - e.g., Crays. Configuring without RTE support
> will generate something that cannot run on a Mac, which is why the build
> fails in that environment - it is looking for external RTE support that does
> not exist on the Mac. That configure option works fine on the intended
> targets.
>
> The declspec macro does indeed have visibility attributes - in fact, that
> is its sole purpose. You are welcome to try disabling visibility to see if
> that helps.
>
> The module definitions are actually identical, minus the visibility flags.
>
>
> On Aug 16, 2011, at 8:08 PM, Larry Baker wrote:
>
> Matthew,
>
> The best I can come up with is that somehow the declaration of
> external orte_odls in orte/mca/odls/odls.h
>
> ORTE_DECLSPEC extern orte_odls_base_module_t orte_odls;  /* holds selected
> module's function pointers */
>
>
> does not exactly match the definition of orte_odls in
> orte/mca/odls/base/odls_base_open.c
>
> orte_odls_base_module_t orte_odls;
>
>
> ORTE_DECLSPEC might include some decorations having to do with the
> visibility attribute.  Try adding --disable-visibility to your configure.
>
> Otherwise, I see in orte/mca/odls/base/odls_base_open.c that orte_odls is
> not defined if ORTE_DISABLE_FULL_SUPPORT == 1.  I tried to compile
> with --without-rte-support to force #define ORTE_DISABLE_FULL_SUPPORT 1, but
> the make failed before it reached the link that failed for you.  When
> --without-rte-support is requested in 1.4.3, there are declarations that
> depend on typedefs that are skipped, causing the make to fail.  You may be
> encountering something subtle like that when configure deduces some behavior
> for pgcc and the code doesn't quite have the conditional compilation tests
> in the right place.
>
> You might try a newer version of Open MPI, which might have fixed problems
> like --without-rte-support failing.
>
> Larry Baker
> US Geological Survey
> 650-329-5608
> ba...@usgs.gov
>
> On 16 Aug 2011, at 11:53 AM, Matthew Russell wrote:
>
> Hi Larry,
>
> Thank you for your interest.
>
> I believe your solution is the right one; however, I think there are some
> other issues causing problems too.
>
> When I add the search_paths_first flag to my configure, the command that
> breaks in the Makefile is,
>
> libtool: link: /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend -V
> -search_paths_first -o orte-clean orte-clean.o
>  ../../../orte/.libs/libopen-rte.a
> /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
> pgcc-Error-Unknown switch: -search_paths_first
>
> pgcc 10.9-0 64-bit target on Apple OS/X -tp nehalem-64
> Copyright 1989-2000, The Portland Group, Inc.  All Rights Reserved.
> Copyright 2000-2010, STMicroelectronics, Inc.  All Rights Reserved.
> make: *** [orte-clean] Error 1
>
> The problem there is that libtool isn't passing the "-Wl," prefix along with
> the -search_paths_first flag, so it isn't getting to the linker.  If I try
> to build it manually, I still have missing symbols:
>
> matt@pontus:orte-clean$ pgcc -DNDEBUG -O2 -Msignextend -V
> -Wl,-search_paths_first -o orte-clean orte-clean.o
>  ../../../orte/.libs/libopen-rte.a
> /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
>
> pgcc 10.9-0 64-bit target on Apple OS/X -tp nehalem-64
> Copyright 1989-2000, The Portland Group, Inc.  All Rights Reserved.
> Copyright 2000-2010, STMicroelectronics, Inc.  All Rights Reserved.
> Undefined symbols for architecture x86_64:
>   "_orte_odls", referenced from:
>   _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
> ld: symbol(s) not found for architecture x86_64
>
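
Since pgcc rejected the bare -search_paths_first switch but did not complain about the -Wl, form in the manual link above, linker-only options need the -Wl, prefix before they reach the compiler driver. A minimal sketch of that wrapping follows; `wl_wrap` is a hypothetical helper, and whether libtool preserves the prefix for pgcc is not settled in this thread.

```shell
# wl_wrap FLAG...
# Prefixes each linker-only flag with "-Wl," so the compiler driver
# forwards it to ld instead of trying to parse it itself.
# Hypothetical helper, not part of libtool or Open MPI.
wl_wrap() {
    out=
    for flag in "$@"; do
        out="${out:+$out }-Wl,$flag"
    done
    printf '%s\n' "$out"
}

# Example: set LDFLAGS before running configure.
#   LDFLAGS=$(wl_wrap -search_paths_first)
#   export LDFLAGS
```

This produces the same value as the `export LDFLAGS=-Wl,-search_paths_first` line from Larry's notes; the function only makes the prefixing step explicit.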
>
>
> On Tue, Aug 16, 2011 at 2:46 PM, Larry Baker wrote:
>
>> Matthew,
>>
>> What configure options did you use?
>>
>> I can try to replicate your findings, as best I can, using the Intel
>> compiler on my desktop Mac (Leopard).  One thing I want to investigate is
>> which libutil is supposed to be linked.  There is no -L in the failing link
>> step.  Is that possibly the error?
>>
>> I have PGI and about five other compilers on our cluster.  I'll get to
>> OpenMPI 1.4.3 with all those as soon as I f