Re: [OMPI devel] 1.10.0rc3 build failure Solaris/x86 + gcc

2015-08-20 Thread Jeff Squyres (jsquyres)
Paul --

I see that there was an ASM change in 1.8.8.  At first look, it seems harmless 
/ shouldn't have caused this kind of problem.

Nathan is checking into it...



> On Aug 14, 2015, at 9:52 PM, Paul Hargrove  wrote:
> 
> I have systems running Solaris 11.1 on x86-64 hardware and 11.2 in an 
> x86-64 VM.
> To the extent I have tested, the results are the same on both, despite 
> gcc-4.5.2 vs 4.8.2.
> 
> I have normally tested only the Sun/Oracle Studio compilers on these systems.
> However, today I gave the vendor-provided gcc, g++ and gfortran in /usr/bin a 
> try.
> So I configured the OpenMPI 1.10.0rc3 tarball with NO arguments to configure.
> 
> When doing so I see tons of warnings like:
> 
> ../../../../openmpi-1.10.0rc3/opal/include/opal/sys/atomic.h:393:9: warning: 
> `opal_atomic_add_32' used but never defined
> ../../../../openmpi-1.10.0rc3/opal/include/opal/sys/atomic.h:401:9: warning: 
> `opal_atomic_sub_32' used but never defined
> 
> and an eventual link failure to match:
> 
>   CCLD libopen-pal.la
> Text relocation remains referenced
> against symbol  offset  in file
> opal_atomic_add_32  0x1e4   runtime/.libs/opal_progress.o
> opal_atomic_sub_32  0x234   runtime/.libs/opal_progress.o
> ld: fatal: relocations remain against allocatable but non-writable sections
> collect2: ld returned 1 exit status
> 
> 
> 
> Here is the possibly-relevant portion of the configure output:
> 
> checking if gcc -std=gnu99 supports GCC inline assembly... yes
> checking if gcc -std=gnu99 supports DEC inline assembly... no
> checking if gcc -std=gnu99 supports XLC inline assembly... no
> checking for assembly format... default-.text-.globl-:--.L-@-1-0-1-1-0
> checking for assembly architecture... IA32
> checking for builtin atomics... BUILTIN_NO
> checking for perl... perl
> checking for pre-built assembly file... yes (atomic-ia32-linux-nongas.s)
> checking for atomic assembly filename... atomic-ia32-linux-nongas.s
> 
> 
> The same problem is present in Open MPI 1.8.8, but 1.8.7 builds just fine.
> 
> Note that on Solaris the default ABI is ILP32 (i.e. the default is -m32 
> rather than -m64).
> There are no problems with LP64 builds ("-m64" in *FLAGS and the wrapper 
> flags).
> There are also no problems with either ILP32 or LP64 and the Studio compilers.
> Only gcc with (default) 32-bit target experiences this failure.
> 
> -Paul
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department   Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/08/17750.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.10.0rc3 build failure Solaris/x86 + gcc

2015-08-20 Thread Nathan Hjelm

I see the problem. Both Ralph and I missed an error in the
cherry-pick. For add_32 in the ia32 atomics we were checking for
OPAL_GCC_INLINE_ASSEMBLY instead of OMPI_GCC_INLINE_ASSEMBLY.
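
To illustrate why a misspelled guard macro fails this quietly, here is a
minimal sketch in plain C. It is not the Open MPI source: every demo_* and
DEMO_* name is hypothetical, the #define stands in for what configure would
provide (an assumption), and ordinary C replaces the real inline assembly.
The point is that an undefined identifier in #if evaluates to 0, so the
guarded definition is skipped without any error, and callers are left with
only the declaration (which would match a "used but never defined" warning
followed by an unresolved symbol at link time).

```c
#include <stdint.h>

/* Assumption: the build defines the OMPI_-prefixed macro and never the
 * OPAL_-prefixed one, as described for the release branches. */
#define OMPI_GCC_INLINE_ASSEMBLY 1

/* Declaration that every caller sees, as in a shared atomics header. */
static inline int32_t demo_atomic_add_32(volatile int32_t *addr, int32_t delta);

/* Wrong guard: the macro is undefined, so the preprocessor treats it as 0. */
#if OPAL_GCC_INLINE_ASSEMBLY
#define DEMO_WRONG_GUARD_TAKEN 1
#else
#define DEMO_WRONG_GUARD_TAKEN 0
#endif

/* Right guard: the macro is defined, so the definition is compiled in. */
#if OMPI_GCC_INLINE_ASSEMBLY
#define DEMO_RIGHT_GUARD_TAKEN 1
static inline int32_t demo_atomic_add_32(volatile int32_t *addr, int32_t delta)
{
    /* Plain C stand-in for the inline assembly; NOT actually atomic. */
    *addr += delta;
    return *addr;
}
#else
#define DEMO_RIGHT_GUARD_TAKEN 0
#endif

/* Report which branch each guard selected. */
int demo_wrong_guard_taken(void) { return DEMO_WRONG_GUARD_TAKEN; }
int demo_right_guard_taken(void) { return DEMO_RIGHT_GUARD_TAKEN; }

/* Small caller so the guarded definition is actually exercised. */
int32_t demo_add(int32_t start, int32_t delta)
{
    volatile int32_t value = start;
    return demo_atomic_add_32(&value, delta);
}
```

Had the definition been placed under the wrong guard instead, demo_add would
still compile against the declaration but would have nothing to link to.
Compiling with gcc's -Wundef makes the preprocessor warn about the undefined
identifier in the #if line.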

-Nathan

On Thu, Aug 20, 2015 at 03:01:35PM +, Jeff Squyres (jsquyres) wrote:
> Paul --
> 
> I see that there was an ASM change in 1.8.8.  At first look, it seems 
> harmless / shouldn't have caused this kind of problem.
> 
> Nathan is checking into it...




Re: [OMPI devel] 1.10.0rc3 build failure Solaris/x86 + gcc

2015-08-20 Thread Jeff Squyres (jsquyres)
(the fix has been merged into the v1.8 and v1.10 branches)

> On Aug 20, 2015, at 12:18 PM, Nathan Hjelm  wrote:
> 
> 
> I see the problem. Both Ralph and I missed an error in the
> cherry-pick. For add_32 in the ia32 atomics we were checking for
> OPAL_GCC_INLINE_ASSEMBLY instead of OMPI_GCC_INLINE_ASSEMBLY.
> 
> -Nathan


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.10.0rc3 - cannot build libfabric support static?

2015-08-20 Thread Jeff Squyres (jsquyres)
Paul sent me some additional info off-list, and I fixed the issue.

PRs filed for v1.10 and v2.x.


> On Aug 19, 2015, at 5:09 PM, Jeff Squyres (jsquyres)  
> wrote:
> 
> On Aug 19, 2015, at 5:01 PM, Todd Kordenbrock  wrote:
>> 
>> Jeff,
>> 
>> The linker error that Paul posted isn't an OFI MTL specific linker line.  It 
>> is the linker line for otfmerge-mpi from the VampirTrace package.  Portals4 
>> just shows up as an external library, the same as OFI or torque.
> 
> Ah, right you are -- and those are there because of --enable-static 
> --disable-shared.  So: all is good there.
> 
>> As far as the 3 occurrences of the Portals4 path in that linker line, it 
>> breaks down as one -L and two -rpath.  I think the -rpath shows up twice 
>> because Portals4 depends on libev.so which is installed in the same place as 
>> libportals.so and you get one -rpath for each lib.  I'll see if that can be 
>> deduped.
> 
> K.  Not a huge deal, but would be nice to fix up.
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/08/17765.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.10.0rc3 - second Solaris build failure

2015-08-20 Thread Jeff Squyres (jsquyres)
Paul --

Can you give me more info on this?

Can you "make clean all V=1" in the ompi/mpi/fortran/mpif-h directory and send 
me the output?

Additionally, can you send the output of "ls -l ompi/mpi/fortran/mpif-h/.libs" 
after the make?


> On Aug 14, 2015, at 11:40 PM, Paul Hargrove  wrote:
> 
> The following is seen on my Solaris-11.2 (but not 11.1) system.
> It is present with the Studio compilers (at least 12.4 and 12.3) for both 32- 
> and 64-bit targets.
> It is also present with the GNU compiler for 64-bit targets (with 32-bit the 
> build dies for a different reason).
> 
>   FCLD libmpi_mpifh_pmpi.la
>   FCLD libmpi_mpifh_sizeof.la
>   CCLD libmpi_mpifh.la
> ld: fatal: file ./.libs/libmpi_mpifh_sizeof.a: open failed: No such file or 
> directory
> 
> On this same system I can build the 1.10.0rc2 tarball fine with identical 
> configure args (other than the prefix setting).
> I retested RC2 just now to be certain nothing relevant had changed in my 
> configuration.
> 
> -Paul
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department   Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/08/17751.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.10.0rc3 - second Solaris build failure

2015-08-20 Thread Paul Hargrove
OK, I'll see what I can do.  I have a conf call in an hour.
So if I don't have your requested output before that, it will be much
later today.

-Paul

On Thu, Aug 20, 2015 at 10:53 AM, Jeff Squyres (jsquyres) <
jsquy...@cisco.com> wrote:

> Paul --
>
> Can you give me more info on this?
>
> Can you "make clean all V=1" in the ompi/mpi/fortran/mpif-h directory and
> send me the output?
>
> Additionally, can you send the output of "ls -l
> ompi/mpi/fortran/mpif-h/.libs" after the make?



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.10.0rc3 build failure Solaris/x86 + gcc

2015-08-20 Thread Paul Hargrove
Excellent.  Sorry I let this escape into the 1.8.8 release.
-Paul

On Thu, Aug 20, 2015 at 10:29 AM, Jeff Squyres (jsquyres) <
jsquy...@cisco.com> wrote:

> (the fix has been merged into the v1.8 and v1.10 branches)



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science

Re: [OMPI devel] 1.10.0rc3 build failure Solaris/x86 + gcc

2015-08-20 Thread Nathan Hjelm

No problem. I should have caught it in my post-cherry-pick tests. I
forgot to test with -m32.
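
One way to confirm which data model a given compiler invocation actually
targets (for example with and without -m32) is to look at the sizes of the
basic types. The following is only a sketch; the demo_* helper names are
hypothetical and not part of Open MPI or its test suite.

```c
#include <stddef.h>

/* Returns a short label for the data model of the current build:
 * ILP32 means int, long and pointers are all 32-bit;
 * LP64 means long and pointers are 64-bit while int stays 32-bit. */
const char *demo_data_model(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4) {
        return "ILP32";
    }
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8) {
        return "LP64";
    }
    return "other";
}

/* Nonzero when the build uses 32-bit pointers, as a -m32 build would. */
int demo_is_32bit_target(void)
{
    return sizeof(void *) == 4;
}
```

Printing demo_data_model() from a small program built once with -m32 and
once with -m64 shows whether both variants are really being exercised.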

-Nathan

On Thu, Aug 20, 2015 at 11:37:17AM -0700, Paul Hargrove wrote:
>Excellent.  Sorry I let this escape into the 1.8.8 release.
>-Paul

Re: [OMPI devel] 1.10.0rc3 - second Solaris build failure

2015-08-20 Thread Paul Hargrove
Jeff,

My testing scripts always pass V=1 to each make command, but as you can see
in my report, that didn't give full command lines.
It is worth noting that on this system "make" is not GNU Make.
I am venturing a guess that this is why V=1 is not producing the expected
output.
That might be the known/expected automake behavior with non-GNU versions of
make - I honestly don't know.
So, you can consider this observation an additional bug report if you are
so inclined (and if you ignore it then I'll not complain).

After manually applying Nathan's fix for
opal/include/opal/sys/ia32/atomic.h this second failure mode remains.
I checked that first, in case the missing atomic functions had prevented
creation of the lib.

The attached transcript should contain the requested output.
It includes "make clean all V=1" *and* "gmake clean all V=1".
The gmake case also fails, but at least V=1 works.

In case anybody wants to reproduce for themselves:
I am using a VirtualBox VM image which anyone (with registration) can
download from Oracle.
I can provide more details upon request.

-Paul

On Thu, Aug 20, 2015 at 10:59 AM, Paul Hargrove  wrote:

> OK, I'll see what I can do.  I have a conf call in an hour.
> So if I don't have your requested output before that, it will be much
> later today.
>
> -Paul



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


typescript.bz2
Description: BZip2 compressed data


Re: [OMPI devel] 1.10.0rc3 - second Solaris build failure

2015-08-20 Thread Jeff Squyres (jsquyres)
Ah, I see the issue:

-
  GENERATE psizeof_f.f90
  FC   psizeof_f.lo
WARNING: Source file "psizeof_f.f90" contains no Fortran statements.

f90comp: 29 SOURCE LINES
f90comp: 0 ERRORS, 1 WARNINGS, 0 OTHER MESSAGES, 0 ANSI
-

As you can see, (p)sizeof_f.f90 is a generated file.  I'll bet that OMPI 
determined that your Fortran compiler didn't support enough Fortran mojo to 
properly support MPI_SIZEOF.  So it generated an empty .f90 file.  And 
therefore it 

Easy enough to fix...


> On Aug 20, 2015, at 2:52 PM, Paul Hargrove  wrote:
> 
> Jeff,
> 
> My testing scripts always pass V=1 to each make command, but as you can see 
> in my report, that didn't give full command lines.
> It is worth noting that on this system "make" is not GNU Make.
> I am venturing a guess that this is why V=1 is not producing the expected 
> output.
> That might be the known/expected automake behavior with non-GNU versions of 
> make - I honestly don't know.
> So, you can consider this observation an additional bug report if you are so 
> inclined (and if you ignore it then I'll not complain).
> 
> After manually applying Nathan's fix for opal/include/opal/sys/ia32/atomic.h 
> this second failure mode remains.
> I checked that first, in case the missing atomic functions had prevented 
> creation of the lib.
> 
> The attached transcript should contain the requested output.
> It includes "make clean all V=1" *and* "gmake clean all V=1".
> The gmake case also fails, but at least V=1 works.
> 
> In case anybody wants to reproduce for themselves:
> I am using a VirtualBox VM image which anyone (with registration) can 
> download from Oracle.
> I can provide more details upon request.
> 
> -Paul
> 
> On Thu, Aug 20, 2015 at 10:59 AM, Paul Hargrove  wrote:
> OK, I'll see what I can do.  I have a conf call in an hour.
> So I'll if I don't have your requested output before that, it will be much 
> later today.
> 
> -Paul
> 
> On Thu, Aug 20, 2015 at 10:53 AM, Jeff Squyres (jsquyres) 
>  wrote:
> Paul --
> 
> Can you give me more info on this?
> 
> Can you "make clean all V=1" in the ompi/mpi/fortran/mpif-h directory and 
> send me the output?
> 
> Additionally, can you send the output of "ls -l 
> ompi/mpi/fortran/mpif-h/.libs" after the make?
> 
> 
> > On Aug 14, 2015, at 11:40 PM, Paul Hargrove  wrote:
> >
> > The following is seen on my Solaris-11.2 (but not 11.1) system.
> > It is present with the Studio compilers (at least 12.4 and 12.3) for both 
> > 32- and 64-bit targets.
> > It is also present with the Gnu compiler for 64-bit targets (with 32-bit 
> > the build dies for a different reason).
> >
> >   FCLD libmpi_mpifh_pmpi.la
> >   FCLD libmpi_mpifh_sizeof.la
> >   CCLD libmpi_mpifh.la
> > ld: fatal: file ./.libs/libmpi_mpifh_sizeof.a: open failed: No such file or 
> > directory
> >
> > On this same system I can build the 1.10.0rc2 tarball fine with identical 
> > configure args (other than the prefix setting).
> > I retested RC2 just now to be certain nothing relevant had changed in my 
> > configuration.
> >
> > -Paul
> >
> > --
> > Paul H. Hargrove  phhargr...@lbl.gov
> > Computer Languages & Systems Software (CLaSS) Group
> > Computer Science Department   Tel: +1-510-495-2352
> > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post: 
> > http://www.open-mpi.org/community/lists/devel/2015/08/17751.php
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/08/17770.php
> 
> 
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department   Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/08/17774.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.10.0rc3 - second Solaris build failure

2015-08-20 Thread Jeff Squyres (jsquyres)
On Aug 20, 2015, at 3:54 PM, Jeff Squyres (jsquyres)  wrote:
> 
> Ah, I see the issue:
> 
> -
>  GENERATE psizeof_f.f90
>  FC   psizeof_f.lo
> WARNING: Source file "psizeof_f.f90" contains no Fortran statements.
> 
> f90comp: 29 SOURCE LINES
> f90comp: 0 ERRORS, 1 WARNINGS, 0 OTHER MESSAGES, 0 ANSI
> -
> 
> As you can see, (p)sizeof_f.f90 is a generated file.  I'll bet that OMPI 
> determined that your Fortran compiler didn't support enough Fortran mojo to 
> properly support MPI_SIZEOF.  So it generated an empty .f90 file.  And 
> therefore it 

Heh -- I didn't complete that sentence...

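And therefore it didn't generate libmpi_mpifh_sizeof.a (gfortran  will generate an 
effectively "empty" libmpi_mpifh_sizeof.a).  Hence, a subsequent link that 
depended on that library failed.

Easy enough to fix...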

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-2340-gd5763a8

2015-08-20 Thread Dave Goodell (dgoodell)
On Aug 20, 2015, at 3:03 PM, git...@crest.iu.edu wrote:

> This is an automated email from the git hooks/post-receive script. It was
> generated because a ref change was pushed to the repository containing
> the project "open-mpi/ompi".
> 
> The branch, master has been updated
>   via  d5763a8288c994e1d55a333d45f1a85d64341aff (commit)
>  from  305053615779af14ed36e0d94d85c2bbea59d55b (commit)
> 
> Those revisions listed above that are new to this repository have
> not appeared on any other notification email; so we list those
> revisions in full, below.
> 
> - Log -
> https://github.com/open-mpi/ompi/commit/d5763a8288c994e1d55a333d45f1a85d64341aff
> 
> commit d5763a8288c994e1d55a333d45f1a85d64341aff
> Author: --quiet <--quiet>

That's sure a funny author name and email... Jeff was this you somehow?

-Dave



Re: [OMPI devel] 1.10.0rc3 - second Solaris build failure

2015-08-20 Thread Jeff Squyres (jsquyres)
On Aug 20, 2015, at 3:56 PM, Jeff Squyres (jsquyres)  wrote:
> 
>> I'll bet that OMPI determined that your Fortran compiler didn't support 
>> enough Fortran mojo to properly support MPI_SIZEOF.  So it generated an 
>> empty .f90 file.  And therefore it 
> 
> And therefore it didn't generate libmpi_mpifh_sizeof.a (gfortran  will generate an 
> effectively "empty" libmpi_mpifh_sizeof.a).  Hence, a subsequent link that 
> depended on that library failed.

Paul: can you verify my theory?

Do this in your existing build:

-
rm -f ompi/mpi/fortran/base/gen-mpi-sizeof.pl
wget \
  https://raw.githubusercontent.com/open-mpi/ompi/master/ompi/mpi/fortran/base/gen-mpi-sizeof.pl \
  -O ompi/mpi/fortran/base/gen-mpi-sizeof.pl
chmod +x ompi/mpi/fortran/base/gen-mpi-sizeof.pl
rm ompi/mpi/fortran/mpif-h/profile/psizeof_f.f90
make -j 32
-

That will download the new script from master (which is identical to the v1.10 
version, but I have committed the fix to master), make it executable, remove 
the generated psizeof_f.f90 file, and then run the build again -- which will 
cause it to generate psizeof_f.f90 again and try to build again.
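
One quick way to see whether the regenerated file is still effectively empty is to look for any line that is neither blank nor a comment.  This is only a minimal sketch: the helper name has_fortran_statements is hypothetical, the grep pattern is a heuristic for free-form Fortran comments (not anything the OMPI build itself does), and the path assumes a non-VPATH build tree.

```shell
# Hypothetical helper: succeed if the file has at least one line that is
# neither blank nor a free-form Fortran comment (first non-blank char is '!').
has_fortran_statements() {
    grep -v -E '^[[:space:]]*(!.*)?$' "$1" | grep -q .
}

# Assumed location of the generated file; adjust for a VPATH build.
f=ompi/mpi/fortran/mpif-h/profile/psizeof_f.f90

if [ -f "$f" ]; then
    if has_fortran_statements "$f"; then
        echo "$f: contains Fortran statements"
    else
        echo "$f: only comments and blank lines"
    fi
fi
```

If the second message is printed after the rebuild, the new script did not change what gets generated.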

Thanks!

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-2340-gd5763a8

2015-08-20 Thread Jeff Squyres (jsquyres)
$#%@%@#$%

WTF?

Yes.  Somehow I stomped on my $HOME/.gitconfig.  :-(


> On Aug 20, 2015, at 4:10 PM, Dave Goodell (dgoodell)  
> wrote:
> 
> On Aug 20, 2015, at 3:03 PM, git...@crest.iu.edu wrote:
> 
>> This is an automated email from the git hooks/post-receive script. It was
>> generated because a ref change was pushed to the repository containing
>> the project "open-mpi/ompi".
>> 
>> The branch, master has been updated
>>  via  d5763a8288c994e1d55a333d45f1a85d64341aff (commit)
>> from  305053615779af14ed36e0d94d85c2bbea59d55b (commit)
>> 
>> Those revisions listed above that are new to this repository have
>> not appeared on any other notification email; so we list those
>> revisions in full, below.
>> 
>> - Log -
>> https://github.com/open-mpi/ompi/commit/d5763a8288c994e1d55a333d45f1a85d64341aff
>> 
>> commit d5763a8288c994e1d55a333d45f1a85d64341aff
>> Author: --quiet <--quiet>
> 
> That's sure a funny author name and email... Jeff was this you somehow?
> 
> -Dave
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.10.0rc3 - second Solaris build failure

2015-08-20 Thread Paul Hargrove
On Thu, Aug 20, 2015 at 1:14 PM, Jeff Squyres (jsquyres)  wrote:
>
> > And therefore it didn't generate libmpi_mpifh_sizeof.a (gfortran  will
> > generate an effectively "empty" libmpi_mpifh_sizeof.a).  Hence, a
> > subsequent link that depended on that library failed.
>
> Paul: can you verify my theory?
>
> Do this in your existing build:
>
> -
> rm -f ompi/mpi/fortran/base/gen-mpi-sizeof.pl
> wget \
>   https://raw.githubusercontent.com/open-mpi/ompi/master/ompi/mpi/fortran/base/gen-mpi-sizeof.pl \
>   -O ompi/mpi/fortran/base/gen-mpi-sizeof.pl
> chmod +x ompi/mpi/fortran/base/gen-mpi-sizeof.pl
> rm ompi/mpi/fortran/mpif-h/profile/psizeof_f.f90
> make -j 32
>

I adjusted your instructions to suit my VPATH build (cd
$BLDDIR after the chmod).
Solaris make has no '-j' option, but since I am running in a VM on a
dual-core laptop I chose to omit "-j 32" even after switching to gmake.
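
For anyone scripting this across platforms, something like the following could pick the make program and pass -j only when it is GNU make.  It is a rough sketch under stated assumptions: is_gnu_make is a hypothetical helper name, it relies on GNU make printing "GNU Make" in its --version output, and the job count of 2 is just an illustrative value for a dual-core machine.

```shell
# Hypothetical helper: succeed if the given make program identifies itself
# as GNU make in its --version output.
is_gnu_make() {
    "$1" --version 2>/dev/null | grep -q 'GNU Make'
}

# Prefer gmake when it is installed; otherwise fall back to plain make.
if command -v gmake >/dev/null 2>&1; then
    MAKE=gmake
else
    MAKE=make
fi

# Only GNU make is assumed to understand -j here.
if is_gnu_make "$MAKE"; then
    JOBS="-j 2"
else
    JOBS=""
fi

echo "would run: $MAKE $JOBS clean all V=1"
```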

Good-natured-nit-picking aside, the solution does NOT work (it may be
necessary, but is not sufficient).
There is a new generated psizeof_f.f90, containing a dummy subroutine, but
my pandas are still sad.
In fact, these pandas are so despondent that they started chewing on your
.gitconfig file (but I asked them to be --quiet about it).

A log from "gmake clean all V=1" in the mpif-h directory is (again)
attached.

I direct your attention to the following line:
/bin/sh ../../../../libtool  --tag=FC   --mode=link f90  -m32 -g   -o
libmpi_mpifh_sizeof.la -lm -lsocket -lnsl
Somebody appears to have specified no linker inputs!
On other platforms I see a "sizeof_f.lo" immediately before the -l options.
I am pretty sure this is a contributing factor. ;-)
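
To spot this kind of line in a long V=1 log, a small filter along these lines may help.  It is a sketch only: link_inputs is a hypothetical helper, it treats any word ending in .lo, .o, .la or .a as an input, and it skips the word following -o on the assumption that it names the output.

```shell
# Hypothetical helper: print the object/library inputs found on a libtool
# --mode=link command line, skipping the argument that follows -o.
link_inputs() {
    skip=0
    for word in $1; do
        if [ "$skip" -eq 1 ]; then
            skip=0
            continue
        fi
        case "$word" in
            -o) skip=1 ;;
            *.lo|*.o|*.la|*.a) echo "$word" ;;
        esac
    done
}

# The link line quoted above, joined onto one line.
line='/bin/sh ../../../../libtool --tag=FC --mode=link f90 -m32 -g -o libmpi_mpifh_sizeof.la -lm -lsocket -lnsl'

if [ -z "$(link_inputs "$line")" ]; then
    echo "no linker inputs"
fi
```

On a platform where the build works, the same check would be expected to list sizeof_f.lo instead.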

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


log.bz2
Description: BZip2 compressed data