Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-07-29 Thread Paul Hargrove
On Tue, Jul 29, 2014 at 2:10 PM, Nathan Hjelm  wrote:

> Is there a reason why the
> current implementations of opal atomics (add, cmpset) do not return the
> old value?
>

Because some CPUs don't implement such an atomic instruction?

On any CPU one *can* certainly synthesize the desired operation with an
added read before the compare-and-swap to return a value that was present
at some time before a failed cmpset.  That is almost certainly sufficient
for your purposes.  However, the added load makes it (marginally) more
expensive on some CPUs that only have the native equivalent of gcc's
__sync_bool_compare_and_swap().
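
The synthesis described above can be sketched in C roughly as follows. This is only an illustration, not Open MPI code: the names cas_return_old and cas_self_check are hypothetical, and the sketch assumes gcc's __sync_bool_compare_and_swap() builtin as the boolean-only primitive. It goes one small step beyond a single load plus cmpset by retrying when the load saw the expected value but the swap then lost a race, so the expected value is returned only when the swap really happened.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an Open MPI API): emulate a value-returning
 * compare-and-swap on top of a boolean-only primitive. */
int32_t cas_return_old(volatile int32_t *addr, int32_t expected, int32_t desired)
{
    for (;;) {
        /* The extra load: this is the added cost on CPUs that only
         * provide a boolean compare-and-swap. */
        int32_t observed = *addr;

        if (observed != expected) {
            /* The swap would fail; report a value that was present. */
            return observed;
        }
        if (__sync_bool_compare_and_swap(addr, expected, desired)) {
            /* Swap succeeded, so the old value was 'expected'. */
            return expected;
        }
        /* Another thread changed *addr between the load and the swap;
         * loop to pick up the new value. */
    }
}

/* Single-threaded usage example. */
void cas_self_check(void)
{
    volatile int32_t v = 5;

    assert(cas_return_old(&v, 5, 7) == 5);  /* succeeds, v becomes 7 */
    assert(v == 7);
    assert(cas_return_old(&v, 5, 9) == 7);  /* fails, v stays 7 */
    assert(v == 7);
}
```

A caller can tell success from failure by comparing the return value with the expected value it passed in.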

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-29 Thread Paul Hargrove
I have a license for PGI and installations of 14.1 and 14.4.
I will see what I can do today in terms of testing.

-Paul


On Tue, Jul 29, 2014 at 4:23 PM, Jeff Squyres (jsquyres)  wrote:

> Tetsuya --
>
> I am unable to test with the PGI compiler -- I don't have a license.  I
> was hoping that LANL would be able to test today, but I don't think they
> got to it.
>
> Can you send more details?
>
> E.g., can you send all the stuff listed on
> http://www.open-mpi.org/community/help/ for 1.8 and 1.8.2rc2 for the 14.7
> compiler?
>
> I'm *guessing* that we've done something new in the changes since 1.8 that
> PGI doesn't support, and we need to disable that something (hopefully while
> not needing to disable the entire mpi_f08 bindings...).
>
>
>
> On Jul 28, 2014, at 11:43 PM, tmish...@jcity.maeda.co.jp wrote:
>
> >
> > Hi folks,
> >
> > I tried to build openmpi-1.8.2rc2 with PGI-14.7 and execute a sample
> > program. Then, it causes linking error:
> >
> > [mishima@manage work]$ cat test.f
> >  program hello_world
> >  use mpi_f08
> >  implicit none
> >
> >  type(MPI_Comm) :: comm
> >  integer :: myid, npes, ierror
> >  integer :: name_length
> >  character(len=MPI_MAX_PROCESSOR_NAME) :: processor_name
> >
> >  call mpi_init(ierror)
> >  comm = MPI_COMM_WORLD
> >  call MPI_Comm_rank(comm, myid, ierror)
> >  call MPI_Comm_size(comm, npes, ierror)
> >  call MPI_Get_processor_name(processor_name, name_length, ierror)
> >  write (*,'(A,X,I4,X,A,X,I4,X,A,X,A)')
> > +"Process", myid, "of", npes, "is on", trim(processor_name)
> >  call MPI_Finalize(ierror)
> >
> >  end program hello_world
> >
> > [mishima@manage work]$ mpif90 test.f -o test.ex
> > /tmp/pgfortran65ZcUeoncoqT.o: In function `.C1_283':
> > test.f:(.data+0x6c): undefined reference to `mpi_f08_interfaces_callbacks_'
> > test.f:(.data+0x74): undefined reference to `mpi_f08_interfaces_'
> > test.f:(.data+0x7c): undefined reference to `pmpi_f08_interfaces_'
> > test.f:(.data+0x84): undefined reference to `mpi_f08_sizeof_'
> >
> > So, I did some more tests with previous version of PGI and
> > openmpi-1.8. The results are summarized as follows:
> >
> >                    PGI13.10                        PGI14.7
> > openmpi-1.8        OK                              OK
> > openmpi-1.8.2rc2   configure sets use_f08_mpi:no   link error
> >
> > Regards,
> > Tetsuya Mishima
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15303.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15335.php
>





Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-29 Thread Paul Hargrove
If I read the original email correctly then 1.8.2rc2 also failed on
pgi-13.10.
At the moment I am just hoping to reproduce at all.

-Paul


On Tue, Jul 29, 2014 at 4:46 PM, Larry Baker <ba...@usgs.gov> wrote:

> PGI 14.7 is VERY new -- I just received the announcement on Sunday.
>
> Larry Baker
> US Geological Survey
> 650-329-5608
> ba...@usgs.gov





Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-29 Thread Paul Hargrove
On Tue, Jul 29, 2014 at 4:23 PM, Jeff Squyres (jsquyres)  wrote:

> Tetsuya --
>
> I am unable to test with the PGI compiler -- I don't have a license.  I
> was hoping that LANL would be able to test today, but I don't think they
> got to it.
>
> Can you send more details?
>
> E.g., can you send the all the stuff listed on
> http://www.open-mpi.org/community/help/ for 1.8 and 1.8.2rc2 for the 14.7
> compiler?
>
> I'm *guessing* that we've done something new in the changes since 1.8 that
> PGI doesn't support, and we need to disable that something (hopefully while
> not needing to disable the entire mpi_f08 bindings...).
>


The good news is that my build with 1.8.2rc2 and PGI 14.4 isn't a total
failure.
However, with no fortran-specific configure arguments it did not install
mpi_f08.mod.
So, is it possible that configure is automatically (and correctly)
determining that F08 doesn't work?
I can extract the right bits from config.log if somebody (Jeff?) can tell
me what to look for.

I am trying again with an explicit --enable-mpi-fortran=usempi at configure
time to see what happens.

-Paul




Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-29 Thread Paul Hargrove
On Tue, Jul 29, 2014 at 6:33 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> I am trying again with an explicit --enable-mpi-fortran=usempi at
> configure time to see what happens.
>

Of course that should have said --enable-mpi-fortran=usempif08




Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
On Tue, Jul 29, 2014 at 6:38 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

>
> On Tue, Jul 29, 2014 at 6:33 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> I am trying again with an explicit --enable-mpi-fortran=usempi at
>> configure time to see what happens.
>>
>
> Of course that should have said --enable-mpi-fortran=usempif08
>

I've switched to using PGI 13.6 for my testing.
I find that even when I pass that flag I see that use_mpi_f08 is NOT
enabled:

checking Fortran compiler ignore TKR syntax... not cached; checking variants
checking for Fortran compiler support of TYPE(*), DIMENSION(*)... no
checking for Fortran compiler support of !DEC$ ATTRIBUTES NO_ARG_CHECK... no
checking for Fortran compiler support of !$PRAGMA IGNORE_TKR... no
checking for Fortran compiler support of !DIR$ IGNORE_TKR... yes
checking Fortran compiler ignore TKR syntax... 1:real, dimension(*):!DIR$ IGNORE_TKR
checking if Fortran compiler supports ISO_C_BINDING... yes
checking if building Fortran 'use mpi' bindings... yes
checking if Fortran compiler supports SUBROUTINE BIND(C)... yes
checking if Fortran compiler supports TYPE, BIND(C)... yes
checking if Fortran compiler supports TYPE(type), BIND(C, NAME="name")... yes
checking if Fortran compiler supports PROCEDURE... no
*checking if building Fortran 'use mpi_f08' bindings... no*

Contrast that to openmpi-1.8.1 and the same compiler:

checking Fortran compiler ignore TKR syntax... not cached; checking variants
checking for Fortran compiler support of TYPE(*), DIMENSION(*)... no
checking for Fortran compiler support of !DEC$ ATTRIBUTES NO_ARG_CHECK... no
checking for Fortran compiler support of !$PRAGMA IGNORE_TKR... no
checking for Fortran compiler support of !DIR$ IGNORE_TKR... yes
checking Fortran compiler ignore TKR syntax... 1:real, dimension(*):!DIR$ IGNORE_TKR
checking if building Fortran 'use mpi' bindings... yes
checking if Fortran compiler supports ISO_C_BINDING... yes
checking if Fortran compiler supports SUBROUTINE BIND(C)... yes
checking if Fortran compiler supports TYPE, BIND(C)... yes
checking if Fortran compiler supports TYPE(type), BIND(C, NAME="name")... yes
checking if Fortran compiler supports optional arguments... yes
checking if Fortran compiler supports PRIVATE... yes
checking if Fortran compiler supports PROTECTED... yes
checking if Fortran compiler supports ABSTRACT... yes
checking if Fortran compiler supports ASYNCHRONOUS... yes
checking if Fortran compiler supports PROCEDURE... no
checking size of Fortran type(test_mpi_handle)... 4
checking Fortran compiler F08 assumed rank syntax... not cached; checking
checking for Fortran compiler support of TYPE(*), DIMENSION(..)... no
checking Fortran compiler F08 assumed rank syntax... no
checking which mpi_f08 implementation to build... "good" compiler, no array subsections
*checking if building Fortran 'use mpi_f08' bindings... yes*

So, somewhere between 1.8.1 and 1.8.2rc2 something has happened in the
configure logic to disqualify the pgf90 compiler.

I am also surprised to see 1.8.2rc2 performing *fewer* tests of FC than 1.8.1
did (unless they moved elsewhere?).

In the end I cannot reproduce the originally reported problem for the
simple reason that I instead see:

{hargrove@hopper04 openmpi-1.8.2rc2-linux-x86_64-pgi-14.4}$
./INST/bin/mpif90 ../test.f
PGF90-F-0004-Unable to open MODULE file mpi_f08.mod (../test.f: 2)
PGF90/x86-64 Linux 14.4-0: compilation aborted


Tetsuya Mishima,

Is it possible that your installation of 1.8.2rc2 was to the same prefix as
an older build?
If that is the case, you may have the mpi_f08.mod from the older build even
though no f08 support is in the new build.


-Paul




Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
Gilles,

If you look more carefully at the output I provided you will see that 1.8.1
*does* test for PROCEDURE support and finds it lacking.  BOTH outputs
include:
 checking if Fortran compiler supports PROCEDURE... no

However in the 1.8.1 case that is apparently not sufficient to disqualify
building the f08 module.

The test does fail in both 1.8.1 and 1.8.2rc2.
Here is the related portion of config.log from one of them:

configure:57708: checking if Fortran compiler supports PROCEDURE
configure:57735: pgf90 -c -g conftest.f90 >&5
PGF90-S-0155-Illegal procedure interface - mpi_user_function (conftest.f90: 12)
PGF90-S-0155-Illegal procedure interface - mpi_user_function (conftest.f90: 12)
0 inform, 0 warnings, 2 severes, 0 fatal for test_proc
configure:57735: $? = 2
configure: failed program was:
| MODULE proc_mod
| INTERFACE
| SUBROUTINE MPI_User_function
| END SUBROUTINE
| END INTERFACE
| END MODULE proc_mod
|
| PROGRAM test_proc
| INTERFACE
| SUBROUTINE binky(user_fn)
| USE proc_mod
| PROCEDURE(MPI_User_function) :: user_fn
| END SUBROUTINE
| END INTERFACE
| END PROGRAM
configure:57751: result: no

Other than the line numbers the 1.8.1 and 1.8.2rc2 output are identical in
this respect.

The test also fails when run manually:

{hargrove@hopper04 OMPI}$ pgf90 -c -g conftest.f90
PGF90-S-0155-Illegal procedure interface - mpi_user_function (conftest.f90: 12)
PGF90-S-0155-Illegal procedure interface - mpi_user_function (conftest.f90: 12)
0 inform, 0 warnings, 2 severes, 0 fatal for test_proc
{hargrove@hopper04 OMPI}$ pgf90 -V
pgf90 13.10-0 64-bit target on x86-64 Linux -tp shanghai
The Portland Group - PGI Compilers and Tools
Copyright (c) 2013, NVIDIA CORPORATION. All rights reserved.

-Paul

On Tue, Jul 29, 2014 at 9:09 PM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

>  Paul,
>
> from the logs, the only difference i see is about Fortran PROCEDURE.
>
> openmpi 1.8 (svn checkout) does not build the usempif08 bindings if
> PROCEDURE is not supported.
>
> from the logs, openmpi 1.8.1 does not check whether PROCEDURE is supported
> or not
>
> here is the sample program to check PROCEDURE (from
> config/ompi_fortran_check_procedure.m4)
>
> MODULE proc_mod
> INTERFACE
> SUBROUTINE MPI_User_function
> END SUBROUTINE
> END INTERFACE
> END MODULE proc_mod
>
> PROGRAM test_proc
> INTERFACE
> SUBROUTINE binky(user_fn)
>   USE proc_mod
>   PROCEDURE(MPI_User_function) :: user_fn
> END SUBROUTINE
> END INTERFACE
> END PROGRAM
>
> i do not have a PGI license, could you please confirm the PGI compiler
> fails compiling the test above ?
>
> Cheers,
>
> Gilles

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
On a related topic:

I configured with an explicit --enable-mpi-fortran=usempif08.
Then configure found PROCEDURE was missing/broken.
The result is that the build continued, but without the requested f08
support.

If the user has explicitly enabled a given level of Fortran support, but it
cannot be provided, shouldn't this be a configure-time error?

-Paul


On Tue, Jul 29, 2014 at 9:41 PM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

>  Paul,
>
> i am sorry i missed that.
>
> and you are right, 1.8.1 and 1.8 from svn differ:
>
> from svn (config/ompi_setup_mpi_fortran.m4)
> # Per https://svn.open-mpi.org/trac/ompi/ticket/4590, if the
> # Fortran compiler doesn't support PROCEDURE in the way we
> # want/need, disable the mpi_f08 module.
> OMPI_FORTRAN_HAVE_PROCEDURE=0
> AS_IF([test $OMPI_WANT_FORTRAN_USEMPIF08_BINDINGS -eq 1 -a \
>        $OMPI_BUILD_FORTRAN_USEMPIF08_BINDINGS -eq 1],
>       [ # Does the compiler support "procedure"
>        OMPI_FORTRAN_CHECK_PROCEDURE(
>            [OMPI_FORTRAN_HAVE_PROCEDURE=1],
>            [OMPI_FORTRAN_HAVE_PROCEDURE=0
>             OMPI_BUILD_FORTRAN_USEMPIF08_BINDINGS=0])])
>
> 1.8.1 does not disqualify f08 bindings if PROCEDURE is not supported.
> /* for the sake of completeness, in some cases, 1.8.1 *might* disqualify f08
> bindings if PROCEDURE *is* supported :
> # Per https://svn.open-mpi.org/trac/ompi/ticket/4157, temporarily
> # disqualify the fortran compiler if it exhibits the behavior
> # described in that ticket.  Short version: OMPI does something
> # non-Fortran that we don't have time to fix 1.7.4.  So we just
> # disqualify Fortran compilers who actually enforce this issue,
> # and we'll fix OMPI to be Fortran-compliant after 1.7.4
> AS_IF([test $OMPI_WANT_FORTRAN_USEMPIF08_BINDINGS -eq 1 && \
>        test $OMPI_BUILD_FORTRAN_USEMPIF08_BINDINGS -eq 1 && \
>        test $OMPI_FORTRAN_HAVE_PROCEDURE -eq 1 && \
>        test $OMPI_FORTRAN_HAVE_ABSTRACT -eq 1],
>       [ # Check for ticket 4157
>        OMPI_FORTRAN_CHECK_TICKET_4157(
>            [],
>            [ # If we don't have this, don't build the mpi_f08 module
>             OMPI_BUILD_FORTRAN_USEMPIF08_BINDINGS=0])])
>
>
> from the sources and #4590, the f08 binding is intentionally disabled since
> PGI compilers do not support PROCEDURE.
> i agree this is really bad for PGI users :-(
>
> Jeff, can you comment on that ?
>
> Cheers,
>
> Gilles

Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
Jeff,

I am not "screaming" for a return of support for the PGI compilers.
I will also note that "use mpi" works fine; only the F2008 support is
lacking.

Rather than complain I am offering to help test any solution that might be
offered.
I will also note that Nathan and Howard both have accounts at NERSC that
allow them access to Hopper, the system I have used for testing (in
addition to whatever LANL has).

NEW INFO:

While the 13.6 version of pgf90 failed the PROCEDURE test, I find that
14.1 and 14.4 both *pass* (at least when attempted manually).
So, the issues I've had are DIFFERENT from the originally reported issue.
That is consistent with Tetsuya's mpi_f08.mod file having the same timestamp
as the others.
So, I am investigating the ORIGINAL problem once again with 14.4.


-Paul



On Wed, Jul 30, 2014 at 3:30 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
wrote:

> On Jul 30, 2014, at 12:36 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> > Unfortunately, this (and
> > https://svn.open-mpi.org/trac/ompi/changeset/31588 that followed)
> > represent a REGRESSION in that between 1.8.1 and 1.8.2rc2 Open MPI has lost
> > support for F08 with the PGI compilers.
>
> Yes, and the answer is for PGI to support more of the F2003 standard.
>  Then there might be a hope for supporting the MPI F08 bindings.  :-)
>
> Glib answer aside...
>
> The fact of the matter is that Fortran compilers are a nightmare of what
> specific Fortran features they support.  As part of r31587 and r31588,
> there was a simplification made to the (already quite complex) F08 bindings
> in OMPI to only support Fortran compilers that support PROCEDURE.
>
> I don't think I realized that I would be cutting off PGI support with this
> change.
>
> That being said, unless someone really screams, I would greatly prefer not
> to put back in the "support compilers who do not support PROCEDURE" code
> because a) it creates the problem that we solved by taking that stuff out,
> b) it adds more complexity to the F08 bindings, and c) we'll have to solve
> the original problem a different way... and I don't know how to do that.
>  :-\
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15374.php
>





Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
Tetsuya,

I found that the behavior of pgf90 changed somewhere between versions 13.6
and 14.1.
My previous reports were mostly based on my testing of 13.6.
So, I have probably been seeing an issue entirely different than yours.

I am testing 14.4 now and hope to be able to reproduce the problem you
reported.

-Paul


On Wed, Jul 30, 2014 at 12:14 AM,  wrote:

> Hi Paul, thank you for your comment.
>
> I don't think my mpi_f08.mod is older one, because the time stamp is
> equal to the time when I rebuilt them today.
>
> [mishima@manage openmpi-1.8.2rc2-pgi14.7]$ ll lib/mpi*
> -rwxr-xr-x 1 mishima mishima    315 Jul 30 12:27 lib/mpi_ext.mod
> -rwxr-xr-x 1 mishima mishima    327 Jul 30 12:27 lib/mpi_f08_ext.mod
> -rwxr-xr-x 1 mishima mishima  11716 Jul 30 12:27 lib/mpi_f08_interfaces_callbacks.mod
> -rwxr-xr-x 1 mishima mishima 374813 Jul 30 12:27 lib/mpi_f08_interfaces.mod
> -rwxr-xr-x 1 mishima mishima 715615 Jul 30 12:27 lib/mpi_f08.mod
> -rwxr-xr-x 1 mishima mishima  14730 Jul 30 12:27 lib/mpi_f08_sizeof.mod
> -rwxr-xr-x 1 mishima mishima  77141 Jul 30 12:27 lib/mpi_f08_types.mod
> -rwxr-xr-x 1 mishima mishima 878339 Jul 30 12:27 lib/mpi.mod
>
> Regards,
> Tetsuya
>






Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
Jeff,

I can now reproduce Tetsuya's original problem, using a build of 1.8.2rc2
with PGI 14.4.

$ INST/bin/mpifort  ../test.f
/scratch/scratchdirs/hargrove/pgf90pdegT3bhBmEq.o: In function `.C1_283':
test.f:(.data+0x6c): undefined reference to `mpi_f08_interfaces_callbacks_'
test.f:(.data+0x74): undefined reference to `mpi_f08_interfaces_'
test.f:(.data+0x7c): undefined reference to `pmpi_f08_interfaces_'
test.f:(.data+0x84): undefined reference to `mpi_f08_sizeof_'
/usr/bin/ld: link errors found, deleting executable `a.out'

And here is the showme:

$ INST/bin/mpifort  ../test.f --showme
pgf90 ../test.f
-I/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc2-linux-x86_64-pgi-14.4/INST/include
-I/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc2-linux-x86_64-pgi-14.4/INST/lib
-Wl,-rpath
-Wl,/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc2-linux-x86_64-pgi-14.4/INST/lib
-L/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc2-linux-x86_64-pgi-14.4/INST/lib
-lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi


It may be relevant to note that the 4 undefined references each name a
module.
There does not appear to be any definition of these in any library:

$ for x in INST/lib/*.{a,so}; do nm $x; done | grep -i mpi_f08_sizeof
 U mpi_f08_sizeof_

That undefined reference is in libmpi_usempif08.so along with the other
three in the linker error.


I am essentially illiterate with respect to any feature added to fortran
after F77.
So, I am happy to run tests but have no suggestions as to a resolution.

-Paul

On Wed, Jul 30, 2014 at 5:24 PM, Jeff Squyres (jsquyres)  wrote:

> On Jul 28, 2014, at 11:43 PM, tmish...@jcity.maeda.co.jp wrote:
>
> > [mishima@manage work]$ mpif90 test.f -o test.ex
> > /tmp/pgfortran65ZcUeoncoqT.o: In function `.C1_283':
> > test.f:(.data+0x6c): undefined reference to `mpi_f08_interfaces_callbacks_'
> > test.f:(.data+0x74): undefined reference to `mpi_f08_interfaces_'
> > test.f:(.data+0x7c): undefined reference to `pmpi_f08_interfaces_'
> > test.f:(.data+0x84): undefined reference to `mpi_f08_sizeof_'
>
> Just to go back to the original post here: can you send the results of
>
>   mpifort test.f -o test.ex --showme
>
> I'd like to see what fortran libraries are being linked in.  Here's what I
> get when I compile OMPI with the Intel suite:
>
> -
> $ mpifort hello_usempif08.f90 -o hello --showme
> ifort hello_usempif08.f90 -o hello -I/home/jsquyres/bogus/include
> -I/home/jsquyres/bogus/lib -Wl,-rpath -Wl,/home/jsquyres/bogus/lib
> -Wl,--enable-new-dtags -L/home/jsquyres/bogus/lib -lmpi_usempif08
> -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
> 
>
> I note that with the Intel compiler, the Fortran module files are created
> in the lib directory (i.e., $prefix/lib), which is -L'ed on the link line.
>  Does the PGI compiler require something different?  Does the PGI 14
> compiler make an additional library for modules that we need to link in?
>
> We didn't use CONTAINS, and it supposedly works fine with the mpi module
> (right, guys?), so I'm not sure why the same scheme wouldn't work for the
> mpi_f08 module...?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15377.php
>





Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-30 Thread Paul Hargrove
On Wed, Jul 30, 2014 at 6:15 PM,  wrote:
[...]

> Strange thing is that openmpi-1.8 with PGI14.7 works fine.
> What's the difference with openmpi-1.8 and openmpi-1.8.2rc2?
>
[...]

Tetsuya,

Now that I can reproduce the problem you have reported, I am building 1.8.1
with PGI14.4.
Then I may be able to answer the question about what is different.

-Paul




-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-31 Thread Paul Hargrove
On Wed, Jul 30, 2014 at 6:20 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

>
> On Wed, Jul 30, 2014 at 6:15 PM, <tmish...@jcity.maeda.co.jp> wrote:
> [...]
>
>> Strange thing is that openmpi-1.8 with PGI14.7 works fine.
>> What's the difference between openmpi-1.8 and openmpi-1.8.2rc2?
>>
> [...]
>
> Tetsuya,
>
> Now that I can reproduce the problem you have reported, I am building
> 1.8.1 with PGI14.4.
> Then I may be able to answer the question about what is different.
>
> -Paul
>


I have a clear answer to *what* is different (below) and am now looking
into the why/how.
It seems that 1.8.1 has included all dependencies into libmpi_usempif08
while 1.8.2rc2 does not.
My reflex is to blame libtool, but config/lt* are unchanged between the two
versions.

I am rebuilding now with "V=1" passed to make so I can see how the libs
were built.
I'd appreciate guidance if Jeff or anybody else has suggestions as to an
alternative approach to investigate this.
When completed, I will be (more than) happy to turn over the verbose make
output for somebody else to examine.

-Paul

In 1.8.1:
$ nm openmpi-1.8.1-linux-x86_64-pgi-14.4/INST/lib/libmpi_usempif08.so |
grep ' mpi_f08_sizeof_'
0004a9a0 T mpi_f08_sizeof_
0004ad70 T mpi_f08_sizeof_mpi_sizeof_complex_a_16_
0004acf0 T mpi_f08_sizeof_mpi_sizeof_complex_a_8_
0004ad30 T mpi_f08_sizeof_mpi_sizeof_complex_s_16_
0004acb0 T mpi_f08_sizeof_mpi_sizeof_complex_s_8_
0004a9f0 T mpi_f08_sizeof_mpi_sizeof_integer_a_1_
0004aa70 T mpi_f08_sizeof_mpi_sizeof_integer_a_2_
0004aaf0 T mpi_f08_sizeof_mpi_sizeof_integer_a_4_
0004ab70 T mpi_f08_sizeof_mpi_sizeof_integer_a_8_
0004a9b0 T mpi_f08_sizeof_mpi_sizeof_integer_s_1_
0004aa30 T mpi_f08_sizeof_mpi_sizeof_integer_s_2_
0004aab0 T mpi_f08_sizeof_mpi_sizeof_integer_s_4_
0004ab30 T mpi_f08_sizeof_mpi_sizeof_integer_s_8_
0004abf0 T mpi_f08_sizeof_mpi_sizeof_real_a_4_
0004ac70 T mpi_f08_sizeof_mpi_sizeof_real_a_8_
0004abb0 T mpi_f08_sizeof_mpi_sizeof_real_s_4_
0004ac30 T mpi_f08_sizeof_mpi_sizeof_real_s_8_

In 1.8.2rc2:
$ nm openmpi-1.8.2rc2-linux-x86_64-pgi-14.4/INST/lib/libmpi_usempif08.so |
grep ' mpi_f08_sizeof_'
 U mpi_f08_sizeof_


Similar differences exist corresponding to the other three modules that
give undefined references in Tetsuya's simple test code.


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-31 Thread Paul Hargrove
On Wed, Jul 30, 2014 at 8:53 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
[...]

> I have a clear answer to *what* is different (below) and am now looking
> into the why/how.
> It seems that 1.8.1 has included all dependencies into libmpi_usempif08
> while 1.8.2rc2 does not.
>
 [...]

The difference appears to stem from the following difference in
ompi/mpi/fortran/use-mpi-f08/Makefile.am:

1.8.1:
libmpi_usempif08_la_LIBADD = \
$(module_sentinel_file) \
$(OMPI_MPIEXT_USEMPIF08_LIBS) \
$(top_builddir)/ompi/libmpi.la

1.8.2rc2:
libmpi_usempif08_la_LIBADD = \
$(OMPI_MPIEXT_USEMPIF08_LIBS) \
$(top_builddir)/ompi/libmpi.la
libmpi_usempif08_la_DEPENDENCIES = $(module_sentinel_file)

Where in both cases one has:

module_sentinel_file = \
libforce_usempif08_internal_modules_to_be_built.la

which contains all of the symbols which my previous testing found had
"disappeared" from libmpi_usempif08.so between 1.8.1 and 1.8.2rc2.

I don't have recent enough autotools to attempt the change in Makefile.am,
so instead I restored the removed item to libmpi_usempif08_la_LIBADD
directly in Makefile.in.  However, rather than fixing the problem, that
resulted in multiple definitions of a bunch of _eq and _ne functions
(e.g. mpi_f08_types_ompi_request_op_ne_).  So, I am uncertain how to
proceed.

Using svn blame points at a "bulk" CMR of many Fortran-related changes,
including one related to the eq/ne operators.  So, I am turning over this
investigation to Jeff and/or Ralph to figure out what actually is required
to fix this without loss of whatever benefits were in that CMR.  I am still
available to test the proposed fixes.  Happy hunting...

Somebody owes me a virtual beer (or nihonshu) ;-)
-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-31 Thread Paul Hargrove
Gilles,


Just as you speculated, PGI is creating a _-suffixed reference to the module
name:

$ pgf90 -c test.f90
$ nm -u test.o | grep f08
 U mpi_f08_sizeof_
 U mpi_f08_sizeof_mpi_sizeof_real_s_4_



You suggested the following work-around in a previous email:

$ INST/bin/mpifort ../test.f \
    ./BLD/ompi/mpi/fortran/use-mpi-f08/.libs/libforce_usempif08_internal_modules_to_be_built.a

That works fine.  That doesn't surprise me, because I had already
identified that file as having been removed from libmpi_usempif08.so
between 1.8.1 and 1.8.2rc2.  It includes the symbol for the module names
plus trailing '_'.

-Paul


On Thu, Jul 31, 2014 at 1:07 AM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

> Paul,
>
> in .../ompi/mpi/fortran/use-mpi-f08, can you create the following dumb
> test program,
> compile and run nm | grep f08 on the object :
>
> $ cat foo.f90
> program foo
> use mpi_f08_sizeof
>
> implicit none
>
> real :: x
> integer :: size, ierror
>
> call MPI_Sizeof_real_s_4(x, size, ierror)
>
> stop
> end program
>
>
> with intel compiler :
> $ ifort -c foo.f90
> $ nm foo.o | grep f08
>  U mpi_f08_sizeof_mp_mpi_sizeof_real_s_4_
>
> i am wondering whether PGI compiler adds an additional undefined
> reference to mpi_f08_sizeof_ ...
>
> Cheers,
>
> Gilles
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15390.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-07-31 Thread Paul Hargrove
On Thu, Jul 31, 2014 at 4:13 PM, George Bosilca <bosi...@icl.utk.edu> wrote:

> Paul, I know you have a pretty diverse range of computers. Can you try to
> compile and run a "make check" with the following patch?


I will see what I can do for ARMv7, MIPS, PPC and IA64 (or whatever subset
of those is still supported).
The ARM and MIPS systems are emulators and take forever to build OMPI.
However, I am not even sure how soon I'll get to start this testing.

-Paul



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-07-31 Thread Paul Hargrove
On Thu, Jul 31, 2014 at 4:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

>
> On Thu, Jul 31, 2014 at 4:13 PM, George Bosilca <bosi...@icl.utk.edu>
> wrote:
>
>> Paul, I know you have a pretty diverse range of computers. Can you try to
>> compile and run a "make check" with the following patch?
>
>
> I will see what I can do for ARMv7, MIPS, PPC and IA64 (or whatever subset
> of those is still supported).
> The ARM and MIPS systems are emulators and take forever to build OMPI.
> However, I am not even sure how soon I'll get to start this testing.
>


Add SPARC (v8plus and v9) to that list.



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-31 Thread Paul Hargrove
Related question:

If I am understanding PGI's list of fixed-TPRs (bugs) then it looks like
one (certainly not the only) difference between 13.x and 14.1 is a fix to a
problem with PROCEDURE and zero-argument subroutines.  As it happens, the
configure probe for PROCEDURE is a zero-argument subroutine, but the
"real" usage in OMPI is *not* zero-argument.  This opens the possibility
(not certainty) that PROCEDURE may work as required in PGI-13.x, in which
case only a "more accurate" configure test would be required to restore F08
support for PGI-13 (present in 1.8.1 and lacking in 1.8.2rc2).

So, the most important question first:
Does anybody care about PGI-13 (i.e., cannot move to PGI-14 for some reason
other than cost of license)?

-Paul


On Thu, Jul 31, 2014 at 6:17 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com
> wrote:

> Many thanks guys, this thread was most helpful in finding the fix.
>
> Paul H. nailed 80% of it on the head in the post where he identified the
> Makefile.am change.  That Makefile.am change was due to three things:
>
> 1. Fixing a real bug (elsewhere in that commit)
> 2. My misunderstanding of how module files work in Fortran
> 3. The fact that gfortran, Absoft, and ifort *don't* require you to link
> in the .o files generated by modules, but apparently pgfortran *does*
>
> Blarg.
>
> That led to the duplicate symbol issue which Paul also encountered when he
> tried to fix the original problem, so I fixed that, too (which was a direct
> consequence of the first fix).
>
> Should be fixed in the trunk now; we tested with pgfortran on Craig
> Rasmussen's cluster (many thanks, Craig!).
>
> CMR is https://svn.open-mpi.org/trac/ompi/ticket/4519.
>
>
>
>
> On Jul 31, 2014, at 7:27 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> > Gilles,
> >
> >
> > Just as you speculate, PGI is creating a _-suffixed reference to the
> module name:
> >
> > $ pgf90 -c test.f90
> > $ nm -u test.o | grep f08
> >  U mpi_f08_sizeof_
> >  U mpi_f08_sizeof_mpi_sizeof_real_s_4_
> >
> >
> >
> > You suggested the following work-around in a previous email:
> >
> > $ INST/bin/mpifort  ../test.f
> ./BLD/ompi/mpi/fortran/use-mpi-f08/.libs/libforce_usempif08_internal_modules_to_be_built.a
> >
> > That works fine.  That doesn't surprise me, because I had already
> identified that file as having been removed from libmpi_usempif08.so
> between 1.8.1 and 1.8.2rc2.  It includes the symbol for the module names
> plus trailing '_'.
> >
> > -Paul
> >
> >
> > On Thu, Jul 31, 2014 at 1:07 AM, Gilles Gouaillardet <
> gilles.gouaillar...@iferc.org> wrote:
> > Paul,
> >
> > in .../ompi/mpi/fortran/use-mpi-f08, can you create the following dumb
> > test program,
> > compile and run nm | grep f08 on the object :
> >
> > $ cat foo.f90
> > program foo
> > use mpi_f08_sizeof
> >
> > implicit none
> >
> > real :: x
> > integer :: size, ierror
> >
> > call MPI_Sizeof_real_s_4(x, size, ierror)
> >
> > stop
> > end program
> >
> >
> > with intel compiler :
> > $ ifort -c foo.f90
> > $ nm foo.o | grep f08
> >  U mpi_f08_sizeof_mp_mpi_sizeof_real_s_4_
> >
> > i am wondering whether PGI compiler adds an additional undefined
> > reference to mpi_f08_sizeof_ ...
> >
> > Cheers,
> >
> > Gilles
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15390.php
> >
> >
> >
> > --
> > Paul H. Hargrove  phhargr...@lbl.gov
> > Future Technologies Group
> > Computer and Data Sciences Department Tel: +1-510-495-2352
> > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15391.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15415.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] openmpi-1.8.2rc2 and f08 interface built with PGI-14.7 causes link error

2014-07-31 Thread Paul Hargrove
Never mind my suggestion to revise examples/hello_usempif08.f90.
I've just determined that it is already sufficient to reproduce the problem.
(So now I need to see what's wrong in my testing scripts).

-Paul


On Thu, Jul 31, 2014 at 7:04 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Second related issue:
>
> Can/should examples/hello_usempif08.f90 be extended to use more of the
> module such that it would have illustrated the bug found with Tetsuya's
> example code?   I don't know about MTT, but my scripts for testing a
> release candidate include running "make" in the example subdir.
>
> -Paul
>
>
> On Thu, Jul 31, 2014 at 6:17 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
>
>> Many thanks guys, this thread was most helpful in finding the fix.
>>
>> Paul H. nailed 80% of it on the head in the post where he identified the
>> Makefile.am change.  That Makefile.am change was due to three things:
>>
>> 1. Fixing a real bug (elsewhere in that commit)
>> 2. My misunderstanding of how module files work in Fortran
>> 3. The fact that gfortran, Absoft, and ifort *don't* require you to link
>> in the .o files generated by modules, but apparently pgfortran *does*
>>
>> Blarg.
>>
>> That led to the duplicate symbol issue which Paul also encountered when
>> he tried to fix the original problem, so I fixed that, too (which was a
>> direct consequence of the first fix).
>>
>> Should be fixed in the trunk now; we tested with pgfortran on Craig
>> Rasmussen's cluster (many thanks, Craig!).
>>
>> CMR is https://svn.open-mpi.org/trac/ompi/ticket/4519.
>>
>>
>>
>>
>> On Jul 31, 2014, at 7:27 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> > Gilles,
>> >
>> >
>> > Just as you speculate, PGI is creating a _-suffixed reference to the
>> module name:
>> >
>> > $ pgf90 -c test.f90
>> > $ nm -u test.o | grep f08
>> >  U mpi_f08_sizeof_
>> >  U mpi_f08_sizeof_mpi_sizeof_real_s_4_
>> >
>> >
>> >
>> > You suggested the following work-around in a previous email:
>> >
>> > $ INST/bin/mpifort  ../test.f
>> ./BLD/ompi/mpi/fortran/use-mpi-f08/.libs/libforce_usempif08_internal_modules_to_be_built.a
>> >
>> > That works fine.  That doesn't surprise me, because I had already
>> identified that file as having been removed from libmpi_usempif08.so
>> between 1.8.1 and 1.8.2rc2.  It includes the symbol for the module names
>> plus trailing '_'.
>> >
>> > -Paul
>> >
>> >
>> > On Thu, Jul 31, 2014 at 1:07 AM, Gilles Gouaillardet <
>> gilles.gouaillar...@iferc.org> wrote:
>> > Paul,
>> >
>> > in .../ompi/mpi/fortran/use-mpi-f08, can you create the following dumb
>> > test program,
>> > compile and run nm | grep f08 on the object :
>> >
>> > $ cat foo.f90
>> > program foo
>> > use mpi_f08_sizeof
>> >
>> > implicit none
>> >
>> > real :: x
>> > integer :: size, ierror
>> >
>> > call MPI_Sizeof_real_s_4(x, size, ierror)
>> >
>> > stop
>> > end program
>> >
>> >
>> > with intel compiler :
>> > $ ifort -c foo.f90
>> > $ nm foo.o | grep f08
>> >  U mpi_f08_sizeof_mp_mpi_sizeof_real_s_4_
>> >
>> > i am wondering whether PGI compiler adds an additional undefined
>> > reference to mpi_f08_sizeof_ ...
>> >
>> > Cheers,
>> >
>> > Gilles
>> >
>> > ___
>> > devel mailing list
>> > de...@open-mpi.org
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/07/15390.php
>> >
>> >
>> >
>> > --
>> > Paul H. Hargrove  phhargr...@lbl.gov
>> > Future Technologies Group
>> > Computer and Data Sciences Department Tel: +1-510-495-2352
>> > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>> > ___
>> > devel mailing list
>> > de...@open-mpi.org
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/07/15391.php
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/07/15415.php
>>
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-07-31 Thread Paul Hargrove
George:

Have a failure with your patch applied on PPC64/Linux and gcc-4.4.6:

Making all in asm
make[2]: Entering directory
`/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/BLD/opal/asm'
  CC   asm.lo
In file included from
/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/openmpi-1.9a1r32369/opal/asm/asm.c:21:0:
/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/openmpi-1.9a1r32369/opal/include/opal/sys/atomic.h:374:9:
error: conflicting types for 'opal_atomic_cmpset_rel_64'
/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/openmpi-1.9a1r32369/opal/include/opal/sys/powerpc/atomic.h:214:19:
note: previous definition of 'opal_atomic_cmpset_rel_64' was here
/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/openmpi-1.9a1r32369/opal/include/opal/sys/atomic.h:374:9:
warning: 'opal_atomic_cmpset_rel_64' used but never defined [enabled by
default]
make[2]: *** [asm.lo] Error 1


BTW: the patch applied cleanly to trunk except the portion
changing opal/include/opal/sys/osx/atomic.h, which does not exist.
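
For anyone following along without the patch at hand, here is a minimal
sketch of the kind of operation this RFC is about: a compare-and-swap that
returns the old value, synthesized on top of a boolean-only primitive.  It
assumes gcc's __sync_bool_compare_and_swap() builtin; the names
sketch_cmpset_64 and sketch_cmpswap_val_64 are hypothetical and are NOT taken
from George's patch or from opal/include/opal/sys/atomic.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Boolean form: only reports whether the swap happened.
 * Hypothetical name; built on gcc's __sync builtin. */
static inline int sketch_cmpset_64(volatile int64_t *addr,
                                   int64_t oldval, int64_t newval)
{
    assert(addr != NULL);
    return __sync_bool_compare_and_swap(addr, oldval, newval) ? 1 : 0;
}

/* Value-returning form, synthesized with an extra load before the cmpset.
 * On failure it returns a value that was present at some time before the
 * failed attempt; on success it returns oldval. */
static inline int64_t sketch_cmpswap_val_64(volatile int64_t *addr,
                                            int64_t oldval, int64_t newval)
{
    int64_t seen;
    do {
        seen = *addr;              /* the added read */
        if (seen != oldval) {
            return seen;           /* would fail anyway: report what we saw */
        }
        /* value matched at read time; retry if it changed before the swap */
    } while (!sketch_cmpset_64(addr, oldval, newval));
    return oldval;
}
```

Whatever shape the real functions take, the generic prototype and the
per-architecture definition must agree exactly on return and argument types;
the "conflicting types" error above is what gcc reports when they do not.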

-Paul


On Thu, Jul 31, 2014 at 4:25 PM, George Bosilca <bosi...@icl.utk.edu> wrote:

> Awesome, thanks Paul. When the results are in, we will fix whatever is
> needed for these less common architectures.
>
>   George.
>
>
>
> On Thu, Jul 31, 2014 at 7:24 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>>
>>
>> On Thu, Jul 31, 2014 at 4:22 PM, Paul Hargrove <phhargr...@lbl.gov>
>> wrote:
>>
>>>
>>> On Thu, Jul 31, 2014 at 4:13 PM, George Bosilca <bosi...@icl.utk.edu>
>>> wrote:
>>>
>>>> Paul, I know you have a pretty diverse range of computers. Can you try to
>>>> compile and run a "make check" with the following patch?
>>>
>>>
>>> I will see what I can do for ARMv7, MIPS, PPC and IA64 (or whatever
>>> subset of those is still supported).
>>> The ARM and MIPS systems are emulators and take forever to build OMPI.
>>> However, I am not even sure how soon I'll get to start this testing.
>>>
>>
>>
>> Add SPARC (v8plus and v9) to that list.
>>
>>
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>>  Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/07/15411.php
>>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15412.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] trunk link failure on Solaris-10/SPARC

2014-08-01 Thread Paul Hargrove
$ INST/bin/mpirun -mca btl sm,self -np 2 examples/ring_c
ld.so.1: ring_c: fatal: relocation error: file
/home/hargrove/OMPI/openmpi-trunk-solaris10-sparcT2-ss12u3-v8plus/INST/lib/openmpi/mca_pml_ob1.so:
symbol alloca: referenced symbol not found


This platform has worked in the past.
I will be trying 1.8.2rc2 on this system ASAP.
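
A guess at the mechanism, offered as a sketch rather than a diagnosis: if a
source file calls alloca() without including <alloca.h>, some compilers
treat it as an ordinary external function instead of a builtin, and the
unresolved reference only shows up when the component is loaded.  The usual
guard looks like the following.  HAVE_ALLOCA_H is normally set by configure;
it is forced to 1 here only so the sketch stands alone, and
sketch_scratch_len is a hypothetical helper, not Open MPI code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef HAVE_ALLOCA_H
#define HAVE_ALLOCA_H 1   /* assumption: configure would normally define this */
#endif

#if HAVE_ALLOCA_H
#include <alloca.h>       /* lets the compiler see alloca() as a builtin */
#endif

/* Copy a string into stack scratch space and return its length.
 * The scratch buffer is released automatically when the function returns. */
static size_t sketch_scratch_len(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;
    char *tmp = alloca(n);    /* stack allocation, no free() needed */
    memcpy(tmp, s, n);
    return strlen(tmp);
}
```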


-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [OMPI svn] svn:open-mpi r32388 - trunk/ompi/mca/pml/ob1

2014-08-01 Thread Paul Hargrove
In general I am only set up to build from tarballs, not svn.
However, I can (and will) apply this change manually w/o difficulty.

I will report back when I've had a chance to try that.

I already have many builds in-flight to test George's atomics patch and am
in danger of confusing myself if I am not careful.

-Paul


On Thu, Jul 31, 2014 at 8:29 PM, Ralph Castain <r...@open-mpi.org> wrote:

> FWIW: we had Siegmar try that and it didn't solve the problem. Paul?
>
>
> On Jul 31, 2014, at 8:28 PM, svn-commit-mai...@open-mpi.org wrote:
>
> > Author: bosilca (George Bosilca)
> > Date: 2014-07-31 23:28:23 EDT (Thu, 31 Jul 2014)
> > New Revision: 32388
> > URL: https://svn.open-mpi.org/trac/ompi/changeset/32388
> >
> > Log:
> > Missing alloca.h. Thanks Paul for catching this.
> >
> > Text files modified:
> >   trunk/ompi/mca/pml/ob1/pml_ob1_irecv.c | 3 +++
> >   trunk/ompi/mca/pml/ob1/pml_ob1_isend.c | 3 +++
> >   2 files changed, 6 insertions(+), 0 deletions(-)
> >
> > Modified: trunk/ompi/mca/pml/ob1/pml_ob1_irecv.c
> > ==============================================================================
> > --- trunk/ompi/mca/pml/ob1/pml_ob1_irecv.c    Thu Jul 31 21:00:42 2014    (r32387)
> > +++ trunk/ompi/mca/pml/ob1/pml_ob1_irecv.c    2014-07-31 23:28:23 EDT (Thu, 31 Jul 2014)    (r32388)
> > @@ -28,6 +28,9 @@
> > #include "pml_ob1_recvfrag.h"
> > #include "ompi/peruse/peruse-internal.h"
> > #include "ompi/message/message.h"
> > +#if HAVE_ALLOCA_H
> > +#include <alloca.h>
> > +#endif  /* HAVE_ALLOCA_H */
> >
> > int mca_pml_ob1_irecv_init(void *addr,
> >                            size_t count,
> >
> > Modified: trunk/ompi/mca/pml/ob1/pml_ob1_isend.c
> > ==============================================================================
> > --- trunk/ompi/mca/pml/ob1/pml_ob1_isend.c    Thu Jul 31 21:00:42 2014    (r32387)
> > +++ trunk/ompi/mca/pml/ob1/pml_ob1_isend.c    2014-07-31 23:28:23 EDT (Thu, 31 Jul 2014)    (r32388)
> > @@ -26,6 +26,9 @@
> > #include "pml_ob1_sendreq.h"
> > #include "pml_ob1_recvreq.h"
> > #include "ompi/peruse/peruse-internal.h"
> > +#if HAVE_ALLOCA_H
> > +#include <alloca.h>
> > +#endif  /* HAVE_ALLOCA_H */
> >
> > int mca_pml_ob1_isend_init(void *buf,
> >                            size_t count,
> > ___
> > svn mailing list
> > s...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/svn
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15424.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [OMPI svn] svn:open-mpi r32388 - trunk/ompi/mca/pml/ob1

2014-08-01 Thread Paul Hargrove
Gilles,

This test was using the Solaris Studio Compilers version 12.3.

/usr/bin/gcc on this system is "gccfss" which Open MPI does NOT support.

There is also a gcc-3.3.2 in /usr/local/bin and gcc-3.4.3 in /usr/sfw/bin
Neither includes usable fortran compilers, which is why the Studio
compilers are preferred.
Let me know if you need me to try any of those gcc installations.

-Paul


On Thu, Jul 31, 2014 at 9:12 PM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

>  Paul,
>
> As Ralph pointed out, this issue was reported last month on the user mailing
> list.
>
> #include <alloca.h> did not help:
> http://www.open-mpi.org/community/lists/users/2014/07/24883.php
>
> I will see if I can reproduce and fix this issue on a solaris10 (but x86)
> VM
>
> BTW, are you using the GNU compiler ?
>
> Cheers,
>
> Gilles
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15428.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [OMPI svn] svn:open-mpi r32388 - trunk/ompi/mca/pml/ob1

2014-08-01 Thread Paul Hargrove
George's patch worked for me.

Now of course since this is a big-endian system things are still busted on
trunk, but ring_c is now hung instead of failing at load time.

-Paul


On Thu, Jul 31, 2014 at 9:30 PM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

>  Paul,
>
> George just made a good point, you should test with his patch first
>
> if it still does not work, could you try to mix gnu and sun compilers ?
> configure ... CC=/usr/sfw/bin/gcc CXX=/usr/sfw/bin/g++ FC=<fortran compiler>
>
> Cheers,
>
> Gilles
>

[OMPI devel] [1.8.2rc3] Build failure on FreeBSD (missing header)

2014-08-01 Thread Paul Hargrove
/home/phargrov/OMPI/openmpi-1.8.2rc3-freebsd10-amd64/openmpi-1.8.2rc3/orte/mca/ess/base/ess_base_std_app.c:412:36:
error: use of undeclared identifier 'S_IRUSR'
fd = open(myfile, O_CREAT, S_IRUSR);
   ^

To fix this it was sufficient to add the following 3 lines in the obvious
place in ess_base_std_app.c

#ifdef HAVE_SYS_STAT_H
#include <sys/stat.h>
#endif
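
For anyone wanting to reproduce the issue outside the tree, here is a minimal
sketch of the pattern involved. It is not the code from ess_base_std_app.c:
the helper name sketch_touch_private() is hypothetical, the header is included
unconditionally (the real fix keeps the HAVE_SYS_STAT_H guard from configure),
and the mode bits are chosen only for illustration. The point is simply that
the S_I* permission macros come from <sys/stat.h>, which some platforms do not
pull in indirectly through <fcntl.h>.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>   /* S_IRUSR, S_IWUSR: the header that was missing */
#include <unistd.h>

/*
 * Hypothetical helper: create (or open) a file that only its owner may
 * read and write.  Returns 0 on success, -1 on failure.
 */
static int sketch_touch_private(const char *path)
{
    assert(path != NULL);

    /* Without <sys/stat.h>, S_IRUSR is an undeclared identifier here. */
    int fd = open(path, O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR);
    if (fd < 0) {
        return -1;
    }
    return close(fd);
}
```

Compiling this with the <sys/stat.h> line removed should give the same
"undeclared identifier" error on platforms where no other header happens to
include it.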


-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] Trunk broken for PPC64?

2014-08-01 Thread Paul Hargrove
)[0xfffa9541408]
[bd-login:09106] [17]
/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/INST/lib/libmpi.so.0(MPI_Init-0xf28d4)[0xfffa9591c74]
[bd-login:09106] [18] examples/ring_c[0x199c]
[bd-login:09106] [19]
/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/INST/lib/libmpi.so.0(MPI_Init-0xf28d4)[0xfff8f451c74]
[bd-login:09107] [18] examples/ring_c[0x199c]
[bd-login:09107] [19] /lib64/libc.so.6[0x80c9b2bcd8]
[bd-login:09107] [20] /lib64/libc.so.6[0x80c9b2bcd8]
[bd-login:09106] [20]
/lib64/libc.so.6(__libc_start_main-0x184e00)[0x80c9b2bed0]
[bd-login:09107] *** End of error message ***
/lib64/libc.so.6(__libc_start_main-0x184e00)[0x80c9b2bed0]
[bd-login:09106] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 0 on node bd-login exited on
signal 11 (Segmentation fault).
--







On Thu, Jul 31, 2014 at 11:39 PM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

>  Paul and Ralph,
>
> for what it's worth :
>
> a) i faced the very same issue on my (slow) qemu emulated ppc64 vm
> b) i was able to run very basic programs when passing --mca coll ^ml to
> mpirun
>
> Cheers,
>
> Gilles
>
> On 2014/08/01 12:30, Ralph Castain wrote:
>
> Yes, I fear this will require some effort to chase all the breakage down 
> given that (to my knowledge, at least) we lack PPC machines in the devel 
> group.
>
>
> On Jul 31, 2014, at 5:46 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>
>  On the path to verifying George's atomics patch, I have started just by 
> verifying that I can still build the UNPATCHED trunk on each of the platforms 
> I listed.
>
> I have tried two PPC64/Linux systems so far and am seeing the same problem on 
> both.  Though I can pass "make check" both platforms SEGV on
>mpirun -mca btl sm,self -np 2 examples/ring_c
>
> Is this the expected state of the trunk on big-endian systems?
> I am thinking in particular of 
> http://www.open-mpi.org/community/lists/devel/2014/07/15365.php in which 
> Ralph wrote:
>
>  Yeah, my fix won't work for big endian machines - this is going to be an 
> issue across the
> code base now, so we'll have to troll and fix it. I was doing the minimal 
> change required to
> fix the trunk in the meantime.
>
>  If this big-endian failure is not known/expected let me know and I'll 
> provide details.
> Since testing George's patch only requires "make check" I can proceed with 
> that regardless.
>
> -Paul
>
>
> On Thu, Jul 31, 2014 at 4:25 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> Awesome, thanks Paul. When the results are in, we will fix whatever is
> needed for these less common architectures.
>
>   George.
>
>
>
> On Thu, Jul 31, 2014 at 7:24 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>
> On Thu, Jul 31, 2014 at 4:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> On Thu, Jul 31, 2014 at 4:13 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> Paul, I know you have a pretty diverse range of computers. Can you try to
> compile and run a "make check" with the following patch?
>
> I will see what I can do for ARMv7, MIPS, PPC and IA64 (or whatever subset of 
> those is still supported).
> The ARM and MIPS systems are emulators and take forever to build OMPI.
> However, I am not even sure how soon I'll get to start this testing.
>
>
> Add SPARC (v8plus and v9) to that list.
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/07/15411.php
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/07/15412.php
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> _

Re: [OMPI devel] Trunk broken for PPC64?

2014-08-01 Thread Paul Hargrove
Hmm, maybe this has nothing to do with big-endian.
Below is a backtrace from ring_c on an IA64 platform (definitely
little-endian) that looks very similar to me.

It happens that sysconf(_SC_PAGESIZE) returns 64K on both of these systems.
So, I wonder if that might be related.
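
To make the page-size hypothesis concrete (the thread has not established it
as the cause), here is a small sketch of how code can query the page size at
run time and round an allocation up to it, rather than assuming 4K. The helper
names and the 4096 fallback are assumptions for illustration and are not taken
from coll/ml.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Query the system page size; the 4096 fallback is only an assumption. */
static size_t sketch_page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    return ps > 0 ? (size_t)ps : (size_t)4096;
}

/* Round len up to a multiple of page, which must be a power of two. */
static size_t sketch_round_up_to_page(size_t len, size_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (len + page - 1) & ~(page - 1);
}
```

With 64K pages, a request of 65537 bytes rounds up to 131072, whereas code
that hard-codes 4K would compute 69632 and could end up misaligned.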

-Paul

$ mpirun -mca btl sm,self -np 2 examples/ring_c
[altix][[26769,1],0][/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/openmpi-1.9a1r32386/ompi/mca/coll/ml/coll_ml_lmngr.c:231:mca_coll_ml_lmngr_init]
COLL-ML [altix:20418] *** Process received signal ***
[altix:20418] Signal: Segmentation fault (11)
[altix:20418] Signal code: Invalid permissions (2)
[altix:20418] Failing at address: 0x16
[altix][[26769,1],1][/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/openmpi-1.9a1r32386/ompi/mca/coll/ml/coll_ml_lmngr.c:231:mca_coll_ml_lmngr_init]
COLL-ML [altix:20419] *** Process received signal ***
[altix:20419] Signal: Segmentation fault (11)
[altix:20419] Signal code: Invalid permissions (2)
[altix:20419] Failing at address: 0x16
[altix:20418] [ 0] [0xa0010800]
[altix:20418] [ 1] /lib/libc.so.6.1(strlen-0x92e930)[0x2051b2a0]
[altix:20418] [altix:20419] [ 0] [0xa0010800]
[altix:20419] [ 1] [ 2]
/lib/libc.so.6.1(strlen-0x92e930)[0x2051b2a0]
[altix:20419] [ 2]
/lib/libc.so.6.1(_IO_vfprintf-0x998610)[0x204b15d0]
[altix:20419] [ 3] /lib/libc.so.6.1(+0x82860)[0x204b2860]
[altix:20419] [ 4]
/lib/libc.so.6.1(_IO_vfprintf-0x99f140)[0x2040]
[altix:20419] [ 5]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/openmpi/mca_coll_ml.so(+0xc5a70)[0x21e55a70]
[altix:20419] [ 6]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/openmpi/mca_coll_ml.so(+0xc84a0)[0x21e584a0]
[altix:20419] [ 7]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/openmpi/mca_coll_ml.so(mca_coll_ml_lmngr_alloc+0x100f520)[0x21e59110]
[altix:20419] [ 8]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/openmpi/mca_coll_ml.so(mca_coll_ml_allocate_block+0xf6e940)[0x21db8540]
[altix:20419] [ 9]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/openmpi/mca_coll_ml.so(+0x10130)[0x21da0130]
[altix:20419] [10]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/openmpi/mca_coll_ml.so(+0x19970)[0x21da9970]
[altix:20419] [11]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/openmpi/mca_coll_ml.so(mca_coll_ml_comm_query+0xf6d6b0)[0x21db5830]
[altix:20419] [12]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/libmpi.so.0(+0x22fbd0)[0x2028fbd0]
[altix:20419] [13]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/libmpi.so.0(+0x22fac0)[0x2028fac0]
[altix:20419] [14]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/libmpi.so.0(+0x22f7e0)[0x2028f7e0]
[altix:20419] [15]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/libmpi.so.0(+0x22eac0)[0x2028eac0]
[altix:20419] [16]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/libmpi.so.0(mca_coll_base_comm_select-0xbcbb90)[0x2027e080]
[altix:20419] [17]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/libmpi.so.0(ompi_mpi_init-0xd38e70)[0x20110db0]
[altix:20419] [18]
/eng/home/PHHargrove/OMPI/openmpi-trunk-linux-ia64/INST/lib/libmpi.so.0(MPI_Init-0xc8bf40)[0x201bdcf0]
[altix:20419] [19] examples/ring_c[0x4c00]
[altix:20419] [20]
/lib/libc.so.6.1(__libc_start_main-0x9f56b0)[0x20454590]
[altix:20419] [21] examples/ring_c[0x4a20]
[altix:20419] *** End of error message ***
/lib/libc.so.6.1(_IO_vfprintf-0x998610)[0x204b15d0]
[altix:20418] [ 3] /lib/libc.so.6.1(+0x82860)[0x204b2860]
[altix:20418] [ 4]
/lib/libc.so.6.1(_IO_vfprintf-0x99f140)[0x2040]




On Thu, Jul 31, 2014 at 11:47 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Gilles's findings are consistent with mine which showed the SEGVs to be in
> the coll/ml code.
> I've built with --enable-debug and so below is a backtrace (well, two
> actually) that might be helpful.
> Unfortunately the output of the two ranks did get slightly entangled.
>
> -Paul
>
> $ ../INST/bin/mpirun -mca btl sm,self -np 2 examples/ring_c
> [bd-login][[43502,1],0][/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/openmpi-1.9a1r32369/ompi/mca/coll/ml/coll_ml_lmngr.c:231:mca_coll_ml_lmngr_init]
> COLL-ML [bd-login:09106] *** Process received signal ***
> [bd-login][[43502,1],1][/home/hargrov1/OMPI/openmpi-trunk-linux-ppc64-gcc/openmpi-1.9a1r32369/ompi/mca/coll/ml/coll_ml_lmngr.c:231:mca_coll_ml_lmngr_init]
> COLL-ML [bd-login:09107] *** Process received signal ***
> [bd-login:09107] Signal: Segmentation fault (11)
> [bd-login:09107] Signal code: Address not mapped (1)
> [bd-login:09107] Failing at address: 0x10
> [bd-login:09107] [ 0] [bd-login:09106] Signal: Segmentation fault (11)
> [bd-login:

Re: [OMPI devel] 1.8.2rc3 now out

2014-08-01 Thread Paul Hargrove
Note that the Solaris unresolved alloca problem George fixed in r32388 is
still present in 1.8.2rc3.
I have manually confirmed that the same patch resolves the problem in
1.8.2rc3.
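
For reference, a minimal sketch of the pattern the r32388 patch applies: pull
in <alloca.h> when it is available, because on some platforms (Solaris in this
case) alloca() is not declared by any other header. In the real tree
HAVE_ALLOCA_H comes from configure; the platform test below that sets it, and
the helper sketch_stack_copy_len(), are assumptions made only so the fragment
stands alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for configure: assume <alloca.h> exists on Linux and Solaris. */
#ifndef HAVE_ALLOCA_H
#  if defined(__linux__) || defined(__sun)
#    define HAVE_ALLOCA_H 1
#  endif
#endif

#if HAVE_ALLOCA_H
#include <alloca.h>
#endif  /* HAVE_ALLOCA_H */

/* Hypothetical helper: copy a string into stack storage, return its length. */
static size_t sketch_stack_copy_len(const char *src)
{
    assert(src != NULL);

    size_t n = strlen(src) + 1;
    char *tmp = (char *)alloca(n);   /* released automatically on return */
    memcpy(tmp, src, n);
    return strlen(tmp);
}
```

Without the guarded include, a C99 compiler on such a platform treats alloca
as an undeclared function and the link then fails with an unresolved symbol.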

-Paul


On Thu, Jul 31, 2014 at 9:44 PM, Ralph Castain  wrote:

> Usual place - this is a last-chance check, so please hit it. Main change
> from rc2 is the repairs to the Fortran binding config logic
>
> http://www.open-mpi.org/software/ompi/v1.8/
>
> Ralph
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15433.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-08-01 Thread Paul Hargrove
MIPS32, MIPS64 and ARMv7 tests are also pending.
-Paul
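
For readers joining the thread here, a sketch of the semantics being tested: a
compare-and-swap that returns the old value, synthesized from a boolean-only
primitive. This is not George's patch and not the OPAL API; the name
sketch_cas_fetch_32() is hypothetical, and the retry loop is my own addition
so that the returned value equals the expected value only when the swap
actually happened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: if *addr == expected, store desired.  Always returns
 * the value observed in *addr, so "result == expected" means the swap
 * succeeded.  Built only on gcc's __sync_bool_compare_and_swap().
 */
static int32_t sketch_cas_fetch_32(volatile int32_t *addr,
                                   int32_t expected, int32_t desired)
{
    assert(addr != NULL);

    for (;;) {
        int32_t seen = *addr;       /* the added load discussed earlier */
        if (seen != expected) {
            return seen;            /* a CAS at this point would have failed */
        }
        if (__sync_bool_compare_and_swap(addr, expected, desired)) {
            return expected;        /* swap succeeded */
        }
        /* Lost a race between the load and the CAS: look again. */
    }
}
```

Where the compiler provides it, gcc's __sync_val_compare_and_swap() returns
the old value directly and avoids the extra load.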


On Fri, Aug 1, 2014 at 9:40 AM, George Bosilca <bosi...@icl.utk.edu> wrote:

> Another version of the atomic patch. Paul has tested it on a bunch of
> platforms. At this point we have confirmation from all architectures except
> SPARC (v8+ and v9).
>
>   George.
>
>
>
> On Jul 31, 2014, at 19:13 , George Bosilca <bosi...@icl.utk.edu> wrote:
>
> > All,
> >
> > Here is the patch that changes the meaning of the atomics to make them
> always return the previous value (similar to sync_fetch_and_<*>). I tested
> this with the following atomics: OS X, gcc style intrinsics and AMD64.
> >
> > I did not change the base assembly files used when GCC style assembly
> operations are not supported. If someone feels like fixing them, feel free.
> >
> > Paul, I know you have a pretty diverse range of computers. Can you try to
> compile and run a "make check" with the following patch?
> >
> >  George.
> >
> > 
> >
> > On Jul 30, 2014, at 15:21 , Nathan Hjelm <hje...@lanl.gov> wrote:
> >
> >>
> >> That is what I would prefer. I was trying to not disturb things too
> >> much :). Please bring the changes over!
> >>
> >> -Nathan
> >>
> >> On Wed, Jul 30, 2014 at 03:18:44PM -0400, George Bosilca wrote:
> >>>  Why do you want to add new versions? This will lead to having two,
> almost
> >>>  identical, sets of atomics that are conceptually equivalent but
> different
> >>>  in terms of code. And we will have to maintain both!
> >>>  I did a similar change in a fork of OPAL in another project but
> instead of
> >>>  adding another flavor of atomics, I completely replaced the available
> ones
> >>>  with a set returning the old value. I can bring the code over.
> >>>George.
> >>>
> >>>  On Tue, Jul 29, 2014 at 5:29 PM, Paul Hargrove <phhargr...@lbl.gov>
> wrote:
> >>>
> >>>On Tue, Jul 29, 2014 at 2:10 PM, Nathan Hjelm <hje...@lanl.gov>
> wrote:
> >>>
> >>>  Is there a reason why the
> >>>  current implementations of opal atomics (add, cmpset) do not
> return
> >>>  the
> >>>  old value?
> >>>
> >>>Because some CPUs don't implement such an atomic instruction?
> >>>
> >>>On any CPU one *can* certainly synthesize the desired operation
> with an
> >>>added read before the compare-and-swap to return a value that was
> >>>present at some time before a failed cmpset.  That is almost
> certainly
> >>>sufficient for your purposes.  However, the added load makes it
> >>>(marginally) more expensive on some CPUs that only have the native
> >>>equivalent of gcc's __sync_bool_compare_and_swap().
> >>>
> >>>-Paul
> >>>--
> >>>Paul H. Hargrove  phhargr...@lbl.gov
> >>>Future Technologies Group
> >>>Computer and Data Sciences Department Tel: +1-510-495-2352
> >>>Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> >>>___
> >>>devel mailing list
> >>>de...@open-mpi.org
> >>>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>>Link to this post:
> >>>http://www.open-mpi.org/community/lists/devel/2014/07/15328.php
> >>
> >>> ___
> >>> devel mailing list
> >>> de...@open-mpi.org
> >>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15369.php
> >>
> >> ___
> >> devel mailing list
> >> de...@open-mpi.org
> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15370.php
> >
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15462.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.8.2rc3 now out

2014-08-01 Thread Paul Hargrove
761] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
> *11:45:01* *** An error occurred in MPI_Init
> *11:45:01* *** on a NULL communicator
> *11:45:01* *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *11:45:01* ***    and potentially your MPI job)
> *11:45:01* [hpctest:2757] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
> *11:45:01* *** An error occurred in MPI_Init
> *11:45:01* *** on a NULL communicator
> *11:45:01* *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *11:45:01* ***    and potentially your MPI job)
> *11:45:01* [hpctest:2751] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
> *11:45:01* *** An error occurred in MPI_Init
> *11:45:01* *** on a NULL communicator
> *11:45:01* *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *11:45:01* ***    and potentially your MPI job)
> *11:45:01* [hpctest:2752] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
> *11:45:01* *** An error occurred in MPI_Init
> *11:45:01* *** on a NULL communicator
> *11:45:01* *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *11:45:01* ***    and potentially your MPI job)
> *11:45:01* [hpctest:2753] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
> *11:45:01* *** An error occurred in MPI_Init
> *11:45:01* *** on a NULL communicator
> *11:45:01* *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *11:45:01* ***    and potentially your MPI job)
> *11:45:01* [hpctest:2755] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
> *11:45:01* *** An error occurred in MPI_Init
> *11:45:01* *** on a NULL communicator
> *11:45:01* *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *11:45:01* ***    and potentially your MPI job)
> *11:45:01* [hpctest:2759] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
> *11:45:01* *** An error occurred in MPI_Init
> *11:45:01* *** on a NULL communicator
> *11:45:01* *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *11:45:01* ***    and potentially your MPI job)
> *11:45:01* [hpctest:2763] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
>
>
>
> On Fri, Aug 1, 2014 at 11:00 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> Note that the Solaris unresolved alloca problem George fixed in r32388 is
>> still present in 1.8.2rc3.
>> I have manually confirmed that the same patch resolves the problem in
>> 1.8.2rc3.
>>
>> -Paul
>>
>>
>> On Thu, Jul 31, 2014 at 9:44 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>> Usual place - this is a last-chance check, so please hit it. Main change
>>> from rc2 is the repairs to the Fortran binding config logic
>>>
>>> http://www.open-mpi.org/software/ompi/v1.8/
>>>
>>> Ralph
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/08/15433.php
>>>
>>
>>
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15440.php
>>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15444.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15449.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-08-01 Thread Paul Hargrove
I have confirmed that George's latest version works on both SPARC ABIs.

ARMv7 and three MIPS ABIs still pending...

-Paul


On Fri, Aug 1, 2014 at 9:40 AM, George Bosilca <bosi...@icl.utk.edu> wrote:

> Another version of the atomic patch. Paul has tested it on a bunch of
> platforms. At this point we have confirmation from all architectures except
> SPARC (v8+ and v9).
>
>   George.
>
>
>
> On Jul 31, 2014, at 19:13 , George Bosilca <bosi...@icl.utk.edu> wrote:
>
> > All,
> >
> > Here is the patch that changes the meaning of the atomics to make them
> always return the previous value (similar to sync_fetch_and_<*>). I tested
> this with the following atomics: OS X, gcc style intrinsics and AMD64.
> >
> > I did not change the base assembly files used when GCC style assembly
> operations are not supported. If someone feels like fixing them, feel free.
> >
> > Paul, I know you have a pretty diverse range of computers. Can you try to
> compile and run a "make check" with the following patch?
> >
> >  George.
> >
> > 
> >
> > On Jul 30, 2014, at 15:21 , Nathan Hjelm <hje...@lanl.gov> wrote:
> >
> >>
> >> That is what I would prefer. I was trying to not disturb things too
> >> much :). Please bring the changes over!
> >>
> >> -Nathan
> >>
> >> On Wed, Jul 30, 2014 at 03:18:44PM -0400, George Bosilca wrote:
> >>>  Why do you want to add new versions? This will lead to having two,
> almost
> >>>  identical, sets of atomics that are conceptually equivalent but
> different
> >>>  in terms of code. And we will have to maintain both!
> >>>  I did a similar change in a fork of OPAL in another project but
> instead of
> >>>  adding another flavor of atomics, I completely replaced the available
> ones
> >>>  with a set returning the old value. I can bring the code over.
> >>>George.
> >>>
> >>>  On Tue, Jul 29, 2014 at 5:29 PM, Paul Hargrove <phhargr...@lbl.gov>
> wrote:
> >>>
> >>>On Tue, Jul 29, 2014 at 2:10 PM, Nathan Hjelm <hje...@lanl.gov>
> wrote:
> >>>
> >>>  Is there a reason why the
> >>>  current implementations of opal atomics (add, cmpset) do not
> return
> >>>  the
> >>>  old value?
> >>>
> >>>Because some CPUs don't implement such an atomic instruction?
> >>>
> >>>On any CPU one *can* certainly synthesize the desired operation
> with an
> >>>added read before the compare-and-swap to return a value that was
> >>>present at some time before a failed cmpset.  That is almost
> certainly
> >>>sufficient for your purposes.  However, the added load makes it
> >>>(marginally) more expensive on some CPUs that only have the native
> >>>equivalent of gcc's __sync_bool_compare_and_swap().
> >>>
> >>>-Paul
> >>>--
> >>>Paul H. Hargrove  phhargr...@lbl.gov
> >>>Future Technologies Group
> >>>Computer and Data Sciences Department Tel: +1-510-495-2352
> >>>Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> >>>___
> >>>devel mailing list
> >>>de...@open-mpi.org
> >>>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>>Link to this post:
> >>>http://www.open-mpi.org/community/lists/devel/2014/07/15328.php
> >>
> >>> ___
> >>> devel mailing list
> >>> de...@open-mpi.org
> >>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15369.php
> >>
> >> ___
> >> devel mailing list
> >> de...@open-mpi.org
> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15370.php
> >
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15462.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] [1.8.2rc3] build failure on OpenBSD (libevent)

2014-08-01 Thread Paul Hargrove
I am seeing the following on OpenBSD/amd64 with "make V=1":

Making all in tools/wrappers
/bin/sh ../../../libtool  --tag=CC--mode=link gcc -std=gnu99  -g
-finline-functions -fno-strict-aliasing -pthread   -export-dynamic   -o
opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lutil -lm
libtool: link: gcc -std=gnu99 -g -finline-functions -fno-strict-aliasing
-pthread -o .libs/opal_wrapper opal_wrapper.o -Wl,-E  -L../../../opal/.libs
-lopen-pal -lpthread -lutil -lm -pthread
-Wl,-rpath,/home/phargrov/OMPI/openmpi-1.8.2rc3-openbsd5-amd64/INST/lib
../../../opal/.libs/libopen-pal.so.8.0: warning: vsprintf() is often
misused, please use vsnprintf()
../../../opal/.libs/libopen-pal.so.8.0: warning: strcpy() is almost always
misused, please use strlcpy()
../../../opal/.libs/libopen-pal.so.8.0: warning: random() isn't random;
consider using arc4random()
../../../opal/.libs/libopen-pal.so.8.0: warning: strcat() is almost always
misused, please use strlcat()
../../../opal/.libs/libopen-pal.so.8.0: warning: sprintf() is often
misused, please use snprintf()
../../../opal/.libs/libopen-pal.so.8.0: undefined reference to
`arc4random_addrandom'
collect2: ld returned 1 exit status
*** Error 1 in opal/tools/wrappers (Makefile:1623 'opal_wrapper')
*** Error 1 in opal (Makefile:2145 'all-recursive')
*** Error 1 in /home/phargrov/OMPI/openmpi-1.8.2rc3-openbsd5-amd64/BLD
(Makefile:1689 'all-recursive')

Ignoring OpenBSD's typical warnings about functions their developers don't
like, there is an undefined reference to arc4random_addrandom.  The only
explicit reference appears to be in libevent:

$ grep -rlw arc4random_addrandom .
./opal/mca/event/libevent2021/libevent/evutil_rand.c
./opal/mca/event/libevent2021/libevent/arc4random.c

It appears that OpenBSD has arc4random, but no arc4random_addrandom():
/usr/include/stdlib.h:u_int32_t arc4random(void);
/usr/include/stdlib.h:u_int32_t arc4random_uniform(u_int32_t);
/usr/include/stdlib.h:void arc4random_buf(void *, size_t)

I tried to work-around this by adding  "ac_cv_func_arc4random=no" to the
configure command line, but that creates secondary problems because the #if
logic in libevent doesn't allow for the case that arc4random() does not
exist but arc4random_buf() does:

In file included from
/home/phargrov/OMPI/openmpi-1.8.2rc3-openbsd5-amd64/openmpi-1.8.2rc3/opal/mca/event/libev
ent2021/libevent/evutil_rand.c:119:
/home/phargrov/OMPI/openmpi-1.8.2rc3-openbsd5-amd64/openmpi-1.8.2rc3/opal/mca/event/libevent2021/libevent/./arc
4random.c:482: error: static declaration of 'arc4random_buf' follows
non-static declaration
/usr/include/stdlib.h:308: error: previous declaration of 'arc4random_buf'
was here
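
The mismatch above comes from treating "has arc4random()" as implying the rest
of the family. As a sketch of the alternative, each function can be probed and
guarded on its own. The macro SKETCH_HAVE_ARC4RANDOM_BUF and the helper below
are hypothetical (they are not libevent's names), and the /dev/urandom
fallback is just one possible choice for platforms without arc4random_buf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: fill buf with len random bytes.
 * SKETCH_HAVE_ARC4RANDOM_BUF would be set by a configure-time check for
 * that one function, independent of any check for arc4random() itself.
 * Returns 0 on success, -1 on failure.
 */
static int sketch_random_bytes(void *buf, size_t len)
{
    assert(buf != NULL);

#if defined(SKETCH_HAVE_ARC4RANDOM_BUF)
    arc4random_buf(buf, len);    /* use the system's version, never redefine it */
    return 0;
#else
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL) {
        return -1;
    }
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
#endif
}
```

Nothing in this sketch needs arc4random_addrandom(), which is the one symbol
OpenBSD no longer provides.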

Use of --with-libevent=... was no use because the pre-built libevent
package for OpenBSD lacks thread support.

So, I am left without any recipe to build 1.8.2rc3 on OpenBSD.
HOWEVER, it appears that 1.8, 1.8.1 and trunk all have the same problem.
Of course, I am the only one who tests Open MPI on OpenBSD, and I don't
actually USE it.
So, this is not any sort of a priority as far as I am concerned.

-Paul






-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.2rc3] build failure on OpenBSD (libevent)

2014-08-02 Thread Paul Hargrove
Ralph,

My position on support for OpenBSD is the same as for the numerous other
operating systems, CPU architectures and compilers I help test on.  I feel
that every additional platform for which one can maintain support improves
the chance of catching bugs in one's code and reduces the effort to port to
new platforms in the future, making portability a goal rather than just a
means to one's ends.

Therefore, I believe that resolving portability issues is deserving of
effort that may seem out of proportion to the number of potential users of
a given port.  Keep in mind that when I have time I aggressively test to
help ensure the wide portability of Open MPI despite the fact that I have
never written an MPI application outside of course work (over ten years
ago).  I am not an MPI developer or user - I am an advocate for portable
HPC middleware.

Unless somebody beats me to it, I will create a ticket for this issue,
assigning it to myself.
If/when I have the time I will try libevent patches to resolve the problem.

Regarding the possibility that this is fixed in a later libevent than is
packaged with Open MPI, I had a look at the OpenBSD ports tree.  They have
libevent-2.0.21-stable and still apply patches to remove the use of
arc4random_addrandom().  I believe that is the same version packaged with
Open MPI and so their patches will be the starting point for trying to fix
libevent in Open MPI.

Now having said all of that, I find that the OpenBSD ports tree and
repository of binary packages still contain Open MPI 1.4.1 (and nothing
newer) and no version of mpich at all (and thankfully no LAM/MPI).  This
suggests that either
a) there is no demand at all for MPI on OpenBSD
b) there are users building from sources

So, there is absolutely no reason to believe there is any time sensitivity
for resolution of this issue.  Only I am likely to ever notice the lack of
OpenBSD support.

-Paul


On Sat, Aug 2, 2014 at 11:46 AM, Ralph Castain <r...@open-mpi.org> wrote:

> This was apparently somewhat recent - here is the OpenBSD ticket regarding
> it:
>
> http://sourceforge.net/p/levent/bugs/320/
>
> If someone feels it important that we continue supporting OpenBSD, one
> might explore the solution recommended in that ticket. It's also possible
> that the libevent guys are working on solving it too (or may have already
> done so in a later version than we include)
>
>
> On Aug 1, 2014, at 4:07 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> I am seeing the following on OpenBSD/amd64 with "make V=1":
>
> Making all in tools/wrappers
> /bin/sh ../../../libtool  --tag=CC--mode=link gcc -std=gnu99  -g
> -finline-functions -fno-strict-aliasing -pthread   -export-dynamic   -o
> opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lutil -lm
> libtool: link: gcc -std=gnu99 -g -finline-functions -fno-strict-aliasing
> -pthread -o .libs/opal_wrapper opal_wrapper.o -Wl,-E  -L../../../opal/.libs
> -lopen-pal -lpthread -lutil -lm -pthread
> -Wl,-rpath,/home/phargrov/OMPI/openmpi-1.8.2rc3-openbsd5-amd64/INST/lib
> ../../../opal/.libs/libopen-pal.so.8.0: warning: vsprintf() is often
> misused, please use vsnprintf()
> ../../../opal/.libs/libopen-pal.so.8.0: warning: strcpy() is almost always
> misused, please use strlcpy()
> ../../../opal/.libs/libopen-pal.so.8.0: warning: random() isn't random;
> consider using arc4random()
> ../../../opal/.libs/libopen-pal.so.8.0: warning: strcat() is almost always
> misused, please use strlcat()
> ../../../opal/.libs/libopen-pal.so.8.0: warning: sprintf() is often
> misused, please use snprintf()
> ../../../opal/.libs/libopen-pal.so.8.0: undefined reference to
> `arc4random_addrandom'
> collect2: ld returned 1 exit status
> *** Error 1 in opal/tools/wrappers (Makefile:1623 'opal_wrapper')
> *** Error 1 in opal (Makefile:2145 'all-recursive')
> *** Error 1 in /home/phargrov/OMPI/openmpi-1.8.2rc3-openbsd5-amd64/BLD
> (Makefile:1689 'all-recursive')
>
> Ignoring OpenBSD's typical warnings about functions their developers don't
> like there is an undefined reference to arc4random_addrandom.  The only
> explicit reference appears to be in libevent:
>
> $ grep -rlw arc4random_addrandom .
> ./opal/mca/event/libevent2021/libevent/evutil_rand.c
> ./opal/mca/event/libevent2021/libevent/arc4random.c
>
> It appears that OpenBSD has arc4random, but no arc4random_addrandom():
> /usr/include/stdlib.h:u_int32_t arc4random(void);
> /usr/include/stdlib.h:u_int32_t arc4random_uniform(u_int32_t);
> /usr/include/stdlib.h:void arc4random_buf(void *, size_t)
>
> I tried to work-around this by adding  "ac_cv_func_arc4random=no" to the
> configure command line, but that creates secondary problems because the #if
> logic in libevent doesn't allow for the case that arc4random() does not
> exist but arc4

[OMPI devel] Trak missing Versions for 1.8.x

2014-08-02 Thread Paul Hargrove
I was just in Trac to open a new ticket and noticed that the Version
pull-down lacks entries for 1.8 and 1.8.1.

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.2rc3] build failure on OpenBSD (libevent)

2014-08-02 Thread Paul Hargrove
Ralph,

I understand that I am likely the only one in a position to test.

I have already completed initial testing of the approach in the bug report
you found at sourceforge: excise the sole *unused* routine that calls
arc4random_addrandom().
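
For illustration only (this is not the patch attached to the ticket): one
way to keep a build from referencing arc4random_addrandom() on systems that
lack it is to compile the call conditionally.  The sketch below assumes a
configure-defined macro HAVE_ARC4RANDOM_ADDRANDOM and uses a hypothetical
helper name, demo_add_entropy(); neither is taken from libevent or Open MPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * HAVE_ARC4RANDOM_ADDRANDOM is an assumed macro that a configure check
 * would define only where the function really exists.  On a system with
 * arc4random() but no arc4random_addrandom(), it stays undefined and the
 * call below is never compiled, so no undefined reference can reach the
 * linker.
 *
 * Returns 1 if the bytes were handed to the system RNG, 0 if skipped.
 */
static int demo_add_entropy(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
#ifdef HAVE_ARC4RANDOM_ADDRANDOM
    arc4random_addrandom((unsigned char *)buf, (int)len);
    return 1;
#else
    /* Nothing to do: the kernel-seeded arc4random() needs no extra input. */
    (void)buf;
    (void)len;
    return 0;
#endif
}
```

Removing the unused caller outright, as described above, reaches the same
end without needing any new configure test.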

Assuming the remainder of my testing is successful, I will soon attach a
patch to the trac issue (#4829).  It is up to you and the other core
developers to decide whether to CMR for inclusion in 1.8.2 or 1.8.3, or leave
it for 1.9.  I will not push to delay 1.8.2 for OpenBSD support.

-Paul


On Sat, Aug 2, 2014 at 2:10 PM, Ralph Castain <r...@open-mpi.org> wrote:

> To be clear, I fully support what you say. Ordinarily, I would just do the
> port, but sadly (a) I am totally buried at work right now, and (b) I have
> no way to verify that the patches actually work.
>
> If/when you have time, do let me know the results and I'll be happy to
> proceed.
>
> Thanks
> Ralph
>
>
> On Aug 2, 2014, at 12:49 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Ralph,
>
> My position on support for OpenBSD is the same as the numerous other
> operating systems, cpu architectures and compilers I help test on.  I feel
> that every additional platform for which one can maintain support improves
> the chance of catching bugs in one's code and reduces the effort to port to
> new platforms in the future, making portability a goal rather than just a
> means to one's ends.
>
> Therefore, I believe that resolving portability issues is deserving of
> effort that may seem out of proportion to the number of potential users of
> a given port.  Keep in mind that when I have time I aggressively test to
> help ensure the wide portability of Open MPI despite the fact that I have
> never written an MPI application outside of course work (over ten years
> ago).  I am not an MPI developer or user - I am an advocate for portable
> HPC middleware.
>
> Unless somebody beats me to it, I will create a ticket for this issue
> assigning it to myself.
> If/when I have the time I will try libevent patches to resolve the problem.
>
> Regarding the possibility that this is fixed in a later libevent than is
> packaged with Open MPI, I had a look at the OpenBSD ports tree.  They have
> libevent-2.0.21-stable and still apply patches to remove the use of
> arc4random_addrandom().  I believe that is the same version packaged with
> Open MPI and so their patches will be the starting point for trying to fix
> libevent in Open MPI.
>
> Now having said all of that, I find that the OpenBSD ports tree and
> repository of binary packages still contain Open MPI 1.4.1 (and nothing
> newer) and no version of mpich at all (and thankfully no LAM/MPI).  This
> suggests that either
> a) there is no demand at all for MPI on OpenBSD
> b) there are users building from sources
>
> So, there is absolutely no reason to believe there is any time sensitivity
> for resolution of this issue.  Only I am likely to ever notice the lack of
> OpenBSD support.
>
> -Paul
>
>
> On Sat, Aug 2, 2014 at 11:46 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> This was apparently somewhat recent - here is the OpenBSD ticket
>> regarding it:
>>
>> http://sourceforge.net/p/levent/bugs/320/
>>
>> If someone feels it important that we continue supporting OpenBSD, one
>> might explore the solution recommended in that ticket. It's also possible
>> that the libevent guys are working on solving it too (or may have already
>> done so in a later version than we include)
>>
>>
>> On Aug 1, 2014, at 4:07 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> I am seeing the following on OpenBSD/amd64 with "make V=1":
>>
>> Making all in tools/wrappers
>> /bin/sh ../../../libtool  --tag=CC --mode=link gcc -std=gnu99  -g
>> -finline-functions -fno-strict-aliasing -pthread   -export-dynamic   -o
>> opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lutil -lm
>> libtool: link: gcc -std=gnu99 -g -finline-functions -fno-strict-aliasing
>> -pthread -o .libs/opal_wrapper opal_wrapper.o -Wl,-E  -L../../../opal/.libs
>> -lopen-pal -lpthread -lutil -lm -pthread
>> -Wl,-rpath,/home/phargrov/OMPI/openmpi-1.8.2rc3-openbsd5-amd64/INST/lib
>> ../../../opal/.libs/libopen-pal.so.8.0: warning: vsprintf() is often
>> misused, please use vsnprintf()
>> ../../../opal/.libs/libopen-pal.so.8.0: warning: strcpy() is almost
>> always misused, please use strlcpy()
>> ../../../opal/.libs/libopen-pal.so.8.0: warning: random() isn't random;
>> consider using arc4random()
>> ../../../opal/.libs/libopen-pal.so.8.0: warning: strcat() is almost
>> always misused, please use strlcat()
>> ../../../opal/

Re: [OMPI devel] Trak missing Versions for 1.8.x

2014-08-02 Thread Paul Hargrove
Not sure I understood that reply.

I see Version going back to 1.0, but none for the *current* release series.
Is that really the intent?

-Paul


On Sat, Aug 2, 2014 at 2:11 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Yeah, we remove back entries as we aren't going to backport patches to old
> releases.
>
> On Aug 2, 2014, at 12:59 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> I was just in Trac to open a new ticket and noticed that the Version
> pull-down lacks entries for 1.8 and 1.8.1.
>
> -Paul
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15474.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15476.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] Trak missing Versions for 1.8.x

2014-08-02 Thread Paul Hargrove
I'm more concerned with the INABILITY to file bug reports against the
current release.
One can pick "1.8 branch" but not 1.8 or 1.8.1.

-Paul


On Sat, Aug 2, 2014 at 2:33 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Hmmm...I'll check with Jeff next week. I'm not sure why we would support
> creation of tickets for releases that we know we'll never fix
>
>
> On Aug 2, 2014, at 2:26 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Not sure I understood that reply.
>
> I see Version going back to 1.0, but none for the *current* release series.
> Is that really the intent?
>
> -Paul
>
>
> On Sat, Aug 2, 2014 at 2:11 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Yeah, we remove back entries as we aren't going to backport patches to
>> old releases.
>>
>> On Aug 2, 2014, at 12:59 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> I was just in Trac to open a new ticket and noticed that the Version
>> pull-down lacks entries for 1.8 and 1.8.1.
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15478.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15479.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] trunk warnings on x86

2014-08-03 Thread Paul Hargrove
Looks like on a 32-bit platform a (uintptr_t) cast is desired in the
OMPI_CAST_RTE_NAME() macro.

Warnings from current trunk tarball attributable to the missing cast
include:

/home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:89:
warning: cast to pointer from integer of different size
/home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:97:
warning: cast to pointer from integer of different size
/home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/mca/pml/bfo/pml_bfo_failover.c:1417:
warning: cast to pointer from integer of different size

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] trunk warnings on x86

2014-08-03 Thread Paul Hargrove
Whether just adding a (uintptr_t) cast is sufficient or not depends on the
usage, and I don't pretend to have looked much deeper than seeing that this
macro is common to the line numbers in the warnings I quoted.

If the intent is to uniformly store a pointer then a (uintptr_t *) cast may
be appropriate, though that would use the most-significant 32-bits on ppc32
and least-significant 32-bits on x86.  Again, the appropriate form for the
macro depends on how the field is used.
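
The two readings above can be sketched in C.  All names here (demo_name_t,
demo_int_to_ptr, and so on) are hypothetical and are not the real
OMPI_CAST_RTE_NAME() definition; the sketch only shows why the
integer-to-pointer route wants a (uintptr_t) step, and why reading 32 bits
through a pointer to a 64-bit value picks a different half on big- and
little-endian hosts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 64-bit process name. */
typedef uint64_t demo_name_t;

/*
 * Integer -> pointer.  Going through uintptr_t avoids "cast to pointer
 * from integer of different size" on 32-bit targets, but it also
 * truncates the value to the pointer width there.
 */
static void *demo_int_to_ptr(demo_name_t v)
{
    return (void *)(uintptr_t)v;
}

/* Pointer -> integer, the inverse of the helper above. */
static demo_name_t demo_ptr_to_int(void *p)
{
    return (demo_name_t)(uintptr_t)p;
}

/*
 * Read the first 32 bits stored at a 64-bit object.  Which half of the
 * value this yields depends on byte order: the most-significant half on
 * a big-endian host, the least-significant half on a little-endian one.
 */
static uint32_t demo_first_word(const demo_name_t *p)
{
    uint32_t w;
    assert(p != NULL);
    memcpy(&w, p, sizeof w);
    return w;
}
```

A value that fits in 32 bits survives the round trip through
demo_int_to_ptr() on either pointer width; a full 64-bit value does so only
where pointers are 64 bits wide, which is why the right form of the macro
depends on how the field is consumed.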

-Paul


On Sat, Aug 2, 2014 at 9:14 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Arg - that raises an interesting point. This is a pointer to a 64-bit
> number. Will uintptr_t resolve that problem on such platforms?
>
> On Aug 2, 2014, at 8:12 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Looks like on a 32-bit platform a (uintptr_t) cast is desired in the
> OMPI_CAST_RTE_NAME() macro.
>
> Warnings from current trunk tarball attributable to the missing cast
> include:
>
> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:89:
> warning: cast to pointer from integer of different size
> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:97:
> warning: cast to pointer from integer of different size
> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/mca/pml/bfo/pml_bfo_failover.c:1417:
> warning: cast to pointer from integer of different size
>
> -Paul
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15481.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15482.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] [1.8.2rc3] another openib bug (#4377)

2014-08-03 Thread Paul Hargrove
I have a pair of x86/linux (32 bit) hosts connected by Mellanox Tavor HCAs.
I have no idea if (or why) this has only appeared on this system, but I
find that btl:openib thinks the INI file says to ignore these HCAs.  See
the 4th line below:


[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_ip.c:364:add_rdma_addr]
Adding addr 172.18.0.105 (0x690012ac) subnet 0xac12 as mthca0:1
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_ini.c:170:ompi_btl_openib_ini_query]
Querying INI files for vendor 0x02c9, part ID 23108
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_ini.c:189:ompi_btl_openib_ini_query]
Found corresponding INI values: Mellanox Tavor Infinihost
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_component.c:1541:init_one_device]
device mthca0 skipped; ignore_device=1
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_component.c:988:device_destruct]
Failed to release mpool
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_component.c:1020:device_destruct]
Failed to destroy device resources
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:1981:rdmacm_component_finalize]
rdmacm_component_finalize

Turns out this is known, and has been entered as trac ticket #4377,
currently assigned to miked.
Applying the 2-line patch attached to the ticket fixes the ignore_device=1
problem for me.

Mike,
Please apply that patch to trunk and CMR for 1.8.2

BTW:
Even with the "ignore_device=1" problem fixed, I can't get btl:openib
running on x86.
So, there may be additional reports in the next few hours.

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.2rc3] another openib bug (#4377)

2014-08-03 Thread Paul Hargrove
On Sun, Aug 3, 2014 at 12:49 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> BTW:
> Even with the "ignore_device=1" problem fixed, I can't get btl:openib
> running on x86.
> So, there may be additional reports in the next few hours.
>

That turned out to be the already-known issue in 1.8.2rc3 that has since
been fixed.
So, with manual application of r32395 + the patch for ticket #4377 I can
run btl:openib on x86+tavor.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] [1.8.2rc3] static linking fails on linux (openpty undefined)

2014-08-03 Thread Paul Hargrove
I've configured the 1.8.2rc3 tarball with "--enable-static
--disable-shared" on a fairly standard Linux/x86-64 platform.  While there
are no problems on the same platform w/o these configure flags, with them I
cannot link any application codes.

$ mpicc -g hello_c.c -o hello_c
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib/libopen-pal.a(opal_pty.o):
In function `opal_openpty':
opal_pty.c:(.text+0x1): undefined reference to `openpty'

I checked "man openpty" and the manpage says to link with '-lutil'.
The '-showme' does not show libutil:

$ mpicc -showme hello_c.c
gcc hello_c.c
-I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/include
-pthread -L/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
-Wl,--enable-new-dtags
-L/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
-lmpi -lopen-rte -lopen-pal -lm -ldl -ltorque -libverbs -lrdmacm


It looks like configure is doing the right thing on some level, but failing
to add '-lutil' to the appropriate list of libs (OPAL_WRAPPER_EXTRA_LIBS?):


== Library and Function tests

checking if we need -lutil for openpty... yes
checking for openpty... yes


-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.2rc3] static linking fails on linux (openpty undefined)

2014-08-03 Thread Paul Hargrove
Hmm,

On a different Linux/x86-64 host things work as expected with '-lutil'
linked explicitly:

$ ./INST/bin/mpicc -showme BLD/examples/hello_c.c
pgcc BLD/examples/hello_c.c
-I/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/include
-L/opt/torque/4.2.7.h1/lib -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib
-Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath
-Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib
-Wl,-rpath
-Wl,/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
-L/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
-lmpi -lopen-rte -lopen-pal -lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil

Searching for relevant differences now...

-Paul


On Sun, Aug 3, 2014 at 4:58 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

>
> I've configured the 1.8.2rc3 tarball with "--enable-static
> --disable-shared" on a fairly standard Linux/x86-64 platform.  While there
> are no problems on the same platform w/o these configure flags, with them I
> cannot link any application codes.
>
> $ mpicc -g hello_c.c -o hello_c
> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib/libopen-pal.a(opal_pty.o):
> In function `opal_openpty':
> opal_pty.c:(.text+0x1): undefined reference to `openpty'
>
> I checked "man openpty" and the manpage says to link with '-lutil'.
> The '-showme' does not show libutil:
>
> $ mpicc -showme hello_c.c
> gcc hello_c.c
> -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/include
> -pthread -L/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
> -Wl,/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
> -Wl,--enable-new-dtags
> -L/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
> -lmpi -lopen-rte -lopen-pal -lm -ldl -ltorque -libverbs -lrdmacm
>
>
> It looks like configure is doing the right thing on some level, but
> failing to add '-lutil' to the appropriate list of libs
> (OPAL_WRAPPER_EXTRA_LIBS?):
>
>
> 
> == Library and Function tests
>
> 
> checking if we need -lutil for openpty... yes
> checking for openpty... yes
>
>
> -Paul
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.2rc3] static linking fails on linux when not building ROMIO

2014-08-04 Thread Paul Hargrove
I've identified the difference between the platform that does link libutil
and the one that does not.

1) libutil is linked (as an OMPI dependency) only on the working system:

Working system:
$ grep 'checking for .* LIBS' configure.out
checking for OPAL LIBS... -lm -lpciaccess -ldl
checking for ORTE LIBS... -lm -lpciaccess -ldl -ltorque
checking for OMPI LIBS... -lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil

NON-working system:
$ grep 'checking for .* LIBS' configure.out
checking for OPAL LIBS... -lm -ldl
checking for ORTE LIBS... -lm -ldl -ltorque
checking for OMPI LIBS... -lm -ldl -ltorque

So, the working system that does link libutil is doing so as an OMPI
dependency.
However, it is also needed for opal (the only caller of openpty is
opal/util/opal_pty.c).

2) Only the working system is building ROMIO:

Comparing the 'checking if * can compile' lines of configure output shows
only ONE difference:

 checking if MCA component fs:ufs can compile... yes
 checking if MCA component fs:pvfs2 can compile... no
 checking if MCA component io:ompio can compile... yes
-checking if MCA component io:romio can compile... no
+checking if MCA component io:romio can compile... yes
 checking if MCA component mpool:grdma can compile... yes
 checking if MCA component mpool:sm can compile... yes
 checking if MCA component mpool:udreg can compile... no

So, it appears that *if* ROMIO is configured in, then "-lutil" gets added
to OMPI_WRAPPER_EXTRA_LIBS.
This masks the fact that it is missing from OPAL_WRAPPER_EXTRA_LIBS.


I have confirmed that I can reproduce the static linking failure by adding
--disable-io-romio to the configure options of the system that worked
previously.

So, I update my report (and the email subject line) to:
   Static linking fails on Linux when not building ROMIO

-Paul



On Sun, Aug 3, 2014 at 6:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Hmm,
>
> On a different Linux/x86-64 host things work as expected with '-lutil'
> linked explicitly:
>
> $ ./INST/bin/mpicc -showme BLD/examples/hello_c.c
> pgcc BLD/examples/hello_c.c
> -I/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/include
> -L/opt/torque/4.2.7.h1/lib -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib
> -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath
> -Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib
> -Wl,-rpath
> -Wl,/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
> -L/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
> -lmpi -lopen-rte -lopen-pal -lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil
>
> Searching for relevant differences now...
>
> -Paul
>
>
> On Sun, Aug 3, 2014 at 4:58 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>>
>> I've configured the 1.8.2rc3 tarball with "--enable-static
>> --disable-shared" on a fairly standard Linux/x86-64 platform.  While there
>> are no problems on the same platform w/o these configure flags, with them I
>> cannot link any application codes.
>>
>> $ mpicc -g hello_c.c -o hello_c
>> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib/libopen-pal.a(opal_pty.o):
>> In function `opal_openpty':
>> opal_pty.c:(.text+0x1): undefined reference to `openpty'
>>
>> I checked "man openpty" and the manpage says to link with '-lutil'.
>> The '-showme' does not show libutil:
>>
>> $ mpicc -showme hello_c.c
>> gcc hello_c.c
>> -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/include
>> -pthread -L/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>> -Wl,/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
>> -Wl,--enable-new-dtags
>> -L/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
>> -lmpi -lopen-rte -lopen-pal -lm -ldl -ltorque -libverbs -lrdmacm
>>
>>
>> It looks like configure is doing the right thing on some level, but
>> failing to add '-lutil' to the appropriate list of libs
>> (OPAL_WRAPPER_EXTRA_LIBS?):
>>
>>
>> 
>> == Library and Function tests
>>
>> 
>> checking if we need -lutil for openpty... yes
>> checking for openpty... yes
>>
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Futur

[OMPI devel] 1.8.2rc3 cosmetic issues in configure

2014-08-04 Thread Paul Hargrove
It looks like four instances of AC_MSG_CHECKING are missing an
AC_MSG_RESULT or have other configure macros improperly nested between the
two:

checking for epoll support... checking for epoll_ctl... yes
yes
checking for working epoll library interface... yes
yes

checking if user requested CMA build... checking --with-knem value...
simple ok (unspecified)

checking if user requested CMA build... checking if MCA component btl:vader
can compile... yes

checking orte configuration args... checking if MCA component dpm:orte can
compile... yes

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] oshmem enabled by default

2014-08-04 Thread Paul Hargrove
In both trunk and 1.8.2rc3 the behavior is to enable oshmem by default.

In the 1.8.2rc3 tarball the configure help output matches the behavior.
HOWEVER, in the trunk the configure help output still says oshmem is
DISabled by default.

{~/OMPI/ompi-trunk}$ svn info | grep "Revision"
Revision: 32422
{~/OMPI/ompi-trunk}$ ./configure --help | grep -A1 'enable-oshmem '
  --enable-oshmem Enable building the OpenSHMEM interface (disabled by
  default)

-Paul


On Thu, Jul 24, 2014 at 2:09 PM, Ralph Castain  wrote:

> Actually, it already is set correctly - the help message was out of date,
> so I corrected that.
>
> On Jul 24, 2014, at 10:58 AM, Marco Atzeri  wrote:
>
> > On 24/07/2014 15:52, Ralph Castain wrote:
> >> Oshmem should be enabled by default now
> >
> > Ok,
> > so please reverse the configure switch
> >
> >  --enable-oshmem Enable building the OpenSHMEM interface
> (disabled by default)
> >
> > I will test enabling it in the meantime.
> >
> > Regards
> > Marco
> >
> >
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15254.php
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15261.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.2rc3] static linking fails on linux when not building ROMIO

2014-08-04 Thread Paul Hargrove
Ralph and Jeff,

I've been digging and find the problem is wider than just the one library
and has manifestations specific to FreeBSD, NetBSD and Solaris.  I am
adding new info to the ticket as I unearth it.

Additionally, it appears this existed in 1.8, 1.8.1 and in the 1.7 series
as well.
So, I would suggest this NOT be a blocker for a 1.8.2 release.

Of course I am willing to provide testing if you still want to push for a
quick resolution.

-Paul


On Mon, Aug 4, 2014 at 1:27 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Okay, I filed a blocker on this for 1.8.2 and assigned it to Jeff. I took
> a crack at fixing it, but came up short :-(
>
>
> On Aug 3, 2014, at 10:46 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> I've identified the difference between the platform that does link libutil
> and the one that does not.
>
> 1) libutil is linked (as an OMPI dependency) only on the working system:
>
> Working system:
> $ grep 'checking for .* LIBS' configure.out
> checking for OPAL LIBS... -lm -lpciaccess -ldl
> checking for ORTE LIBS... -lm -lpciaccess -ldl -ltorque
> checking for OMPI LIBS... -lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil
>
> NON-working system:
> $ grep 'checking for .* LIBS' configure.out
> checking for OPAL LIBS... -lm -ldl
> checking for ORTE LIBS... -lm -ldl -ltorque
> checking for OMPI LIBS... -lm -ldl -ltorque
>
> So, the working system that does link libutil is doing so as an OMPI
> dependency.
> However, it is also needed for opal (the only caller of openpty is
> opal/util/opal_pty.c).
>
> 2) Only the working system is building ROMIO:
>
> Comparing the 'checking if * can compile' lines of configure output shows
> only ONE difference:
>
>  checking if MCA component fs:ufs can compile... yes
>  checking if MCA component fs:pvfs2 can compile... no
>  checking if MCA component io:ompio can compile... yes
> -checking if MCA component io:romio can compile... no
> +checking if MCA component io:romio can compile... yes
>  checking if MCA component mpool:grdma can compile... yes
>  checking if MCA component mpool:sm can compile... yes
>  checking if MCA component mpool:udreg can compile... no
>
> So, it appears that *if* ROMIO is configured in, then "-lutil" gets added
> to OMPI_WRAPPER_EXTRA_LIBS.
> This masks the fact that it is missing from OPAL_WRAPPER_EXTRA_LIBS.
>
>
> I have confirmed that I can reproduce the static linking failure by adding
> --disable-io-romio to the configure options of the system that worked
> previously.
>
> So, I update my report (and the email subject line) to:
>    Static linking fails on Linux when not building ROMIO
>
> -Paul
>
>
>
> On Sun, Aug 3, 2014 at 6:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> Hmm,
>>
>> On a different Linux/x86-64 host things work as expected with '-lutil'
>> linked explicitly:
>>
>> $ ./INST/bin/mpicc -showme BLD/examples/hello_c.c
>> pgcc BLD/examples/hello_c.c
>> -I/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/include
>> -L/opt/torque/4.2.7.h1/lib -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib
>> -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath
>> -Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib
>> -Wl,-rpath
>> -Wl,/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
>> -L/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
>> -lmpi -lopen-rte -lopen-pal -lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil
>>
>> Searching for relevant differences now...
>>
>> -Paul
>>
>>
>> On Sun, Aug 3, 2014 at 4:58 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>>>
>>> I've configured the 1.8.2rc3 tarball with "--enable-static
>>> --disable-shared" on a fairly standard Linux/x86-64 platform.  While there
>>> are no problems on the same platform w/o these configure flags, with them I
>>> cannot link any application codes.
>>>
>>> $ mpicc -g hello_c.c -o hello_c
>>> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib/libopen-pal.a(opal_pty.o):
>>> In function `opal_openpty':
>>> opal_pty.c:(.text+0x1): undefined reference to `openpty'
>>>
>>> I checked "man openpty" and the manpage says to link with '-lutil'.
>>> The '-showme' does not show libutil:
>>>
>>> $ mpicc -showme hello_c.c
>>> gcc hello_c.c
>>> -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/include
>>> -pthread -L/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>>> -

[OMPI devel] [vt] --with-openmpi-inside configure argument

2014-08-04 Thread Paul Hargrove
I noticed that Open MPI is passing
--with-openmpi-inside=1.7
in the arguments passed to
ompi/contrib/vt/vt/configure
and
ompi/contrib/vt/vt/extlib/otf/configure

The extlib/otf case just tests if the value is set, but the top-level
vt/configure is checking for the specific string "1.7":

# Check whether we are inside Open MPI package
inside_openmpi="no"
AC_ARG_WITH(openmpi-inside, [],
[
AS_IF([test x"$withval" = "xyes" -o x"$withval" = "x1.7"],
[
inside_openmpi="$withval"
CPPFLAGS="-DINSIDE_OPENMPI $CPPFLAGS"

# Set FC to F77 if Open MPI version < 1.7
AS_IF([test x"$withval" = "xyes" -a x"$FC" = x -a x"$F77"
!= x],
[FC="$F77"])
])
])

That logic looks a bit fragile with respect to any future changes.
Specifically the inner AS_IF is true for the desired condition "version <
1.7" only because the outer AS_IF currently ensures the only possible
values of "$withval" are "yes" and "1.7".
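
One way to make the "version < 1.7" intent explicit would be to parse the value instead of relying on "yes" being the only other accepted string. What follows is only a sketch in plain POSIX shell, not the VampirTrace code: the helper name ompi_older_than_1_7 is made up, and it assumes (as the existing comment implies) that a bare "yes" comes from a pre-1.7 Open MPI.

```sh
# Hypothetical helper, not part of vt/configure.
# Succeeds (returns 0) when the --with-openmpi-inside value denotes an
# Open MPI older than 1.7.  Assumes a bare "yes" comes from such a version.
ompi_older_than_1_7() {
    case "$1" in
        yes) return 0 ;;
        [0-9]*.[0-9]*) ;;      # looks like major.minor[...]
        *) return 1 ;;         # empty or unrecognized: do not guess
    esac
    _maj=${1%%.*}              # text before the first dot
    _min=${1#*.}               # text after the first dot
    _min=${_min%%.*}           # drop any further components
    _maj=${_maj%%[!0-9]*}      # keep leading digits only
    _min=${_min%%[!0-9]*}
    [ -n "$_maj" ] && [ -n "$_min" ] || return 1
    [ "$_maj" -lt 1 ] || { [ "$_maj" -eq 1 ] && [ "$_min" -lt 7 ]; }
}
```

With something like this, the inner test could call ompi_older_than_1_7 "$withval" and would keep its meaning even if the outer AS_IF were later extended to accept "1.8" or "2.0".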

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] minor atomics nit

2014-08-04 Thread Paul Hargrove
Running "make dist" on trunk I see:

--> Generating assembly for "SPARC" "default-.text-.globl-:--.L-#-1-0-1-0-0"
Could not open ../../../opal/asm/base/SPARC.asm: No such file or directory

Which is apparently because the following lines were never removed
from opal/asm/asm-data.txt

# default compile mode on Solaris.  Evil.  equiv to about Sparc v8
SPARC   default-.text-.globl-:--.L-#-1-0-1-0-0  sparc-solaris

README is clear about having dropped support for SPARC < v8plus.
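
A quick consistency check along these lines might catch similar leftovers. This is a sketch only: the function name is invented, and the file layout (architecture name in the first column, sources named <ARCH>.asm in the base directory) is inferred from the two lines and the error message quoted above.

```sh
# Hypothetical checker, not part of the Open MPI build.
# $1: path to asm-data.txt   $2: directory holding the <ARCH>.asm sources
# Prints each architecture named in the data file that has no .asm source.
missing_asm_sources() {
    awk '/^[ \t]*#/ { next }  NF == 0 { next }  { print $1 }' "$1" |
    sort -u |
    while read -r _arch; do
        [ -f "$2/$_arch.asm" ] || echo "$_arch"
    done
}
```

Run from the top of the tree it would be invoked as: missing_asm_sources opal/asm/asm-data.txt opal/asm/base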


-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] oshmem enabled by default

2014-08-04 Thread Paul Hargrove
Since "disabled by default" is just part of a macro argument we can say
anything we want.
I propose the following:

Index: config/oshmem_configure_options.m4
===
--- config/oshmem_configure_options.m4  (revision 32424)
+++ config/oshmem_configure_options.m4  (working copy)
@@ -22,7 +22,7 @@
 AC_MSG_CHECKING([if want oshmem])
 AC_ARG_ENABLE([oshmem],
   [AC_HELP_STRING([--enable-oshmem],
-  [Enable building the OpenSHMEM interface (disabled by default)])],
+  [Enable building the OpenSHMEM interface (available on Linux only, where it is enabled by default)])],
   [oshmem_arg_given=yes],
   [oshmem_arg_given=no])
 if test "$oshmem_arg_given" = "yes"; then
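
For reference, the default that the new help text describes (as summarized in the r31155 log message quoted below) can be written out as a small decision function. This is a plain-shell sketch and not the actual m4 in config/oshmem_configure_options.m4; the function name and its two arguments are made up for illustration.

```sh
# Hypothetical helper mirroring the behavior described for r31155.
# $1: "yes", "no", or "" (neither --enable-oshmem nor --disable-oshmem given)
# $2: host OS name, e.g. the output of `uname -s`
# Prints "build" or "skip"; returns 1 for the unsupported-platform error.
oshmem_decision() {
    case "$2" in
        Linux*|linux*) _on_linux=yes ;;
        *)             _on_linux=no ;;
    esac
    case "$1" in
        yes)
            if [ "$_on_linux" = yes ]; then
                echo build
            else
                echo "error: OSHMEM is supported on Linux only" >&2
                return 1
            fi ;;
        no)
            echo skip ;;
        *)
            # No option given: build only where it is supported.
            if [ "$_on_linux" = yes ]; then echo build; else echo skip; fi ;;
    esac
}
```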


-Paul




On Mon, Aug 4, 2014 at 7:34 PM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

>  Paul,
>
> this is a bit trickier ...
>
> on a Linux platform oshmem is built by default,
> on a non Linux platform, oshmem is *not* built by default.
>
> so the configure message (disabled by default) is correct on non Linux
> platform, and incorrect on Linux platform ...
>
> i do not know what should be done, here are some options :
> - have a different behaviour on Linux vs non Linux platforms (by the way,
> does autotools support this ?)
> - disable by default, provide only the --enable-oshmem option (so
> configure abort if --enable-oshmem on non Linux platforms)
> - provide only the --disable-oshmem option, useful only on Linux
> platforms. on non Linux platforms do not build oshmem and this is not an
> error
> - other ?
>
> Cheers,
>
> Gilles
>
> r31155 | rhc | 2014-03-20 05:32:15 +0900 (Thu, 20 Mar 2014) | 5 lines
>
> As per the thread on ticket #4399, OSHMEM does not support non-Linux
> platforms. So provide a check for Linux and error out if --enable-oshmem is
> given on a non-supported platform. If no OSHMEM option is given (enable or
> disable), then don't attempt to build OSHMEM unless we are on a Linux
> platform. Default to building if we are on Linux for now, pending the
> outcome of the Debian situation.
>
>
> On 2014/08/05 6:41, Paul Hargrove wrote:
>
> In both trunk and 1.8.2rc3 the behavior is to enable oshmem by default.
>
> In the 1.8.2rc3 tarball the configure help output matches the behavior.
> HOWEVER, in the trunk the configure help output still says oshmem is
> DISabled by default.
>
> {~/OMPI/ompi-trunk}$ svn info | grep "Revision"
> Revision: 32422
> {~/OMPI/ompi-trunk}$ ./configure --help | grep -A1 'enable-oshmem '
>   --enable-oshmem Enable building the OpenSHMEM interface (disabled by
>   default)
>
> -Paul
>
>
> On Thu, Jul 24, 2014 at 2:09 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>
>  Actually, it already is set correctly - the help message was out of date,
> so I corrected that.
>
> On Jul 24, 2014, at 10:58 AM, Marco Atzeri <marco.atz...@gmail.com> wrote:
>
>
>  On 24/07/2014 15:52, Ralph Castain wrote:
>
>  Oshmem should be enabled by default now
>
>  Ok,
> so please reverse the configure switch
>
>  --enable-oshmem Enable building the OpenSHMEM interface
>
>  (disabled by default)
>
>  I will test enabling it in the meantime.
>
> Regards
> Marco
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
>
>  http://www.open-mpi.org/community/lists/devel/2014/07/15254.php
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/07/15261.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/08/15502.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15507.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [vt] --with-openmpi-inside configure argument

2014-08-05 Thread Paul Hargrove
Bert,

It is just an observation of something that could easily break in the
future.
The code is correct as written.
So, no immediate action is required.

-Paul


On Tue, Aug 5, 2014 at 12:04 AM, Bert Wesarg <bert.wes...@tu-dresden.de>
wrote:

> On 08/05/2014 02:40 AM, Paul Hargrove wrote:
>
>> I noticed that Open MPI is passing
>>  --with-openmpi-inside=1.7
>> in the arguments passed to
>>  ompi/contrib/vt/vt/configure
>> and
>>  ompi/contrib/vt/vt/extlib/otf/configure
>>
>> The extlib/otf case just tests if the value is set, but the top-level
>> vt/configure is checking for the specific string "1.7":
>>
>> # Check whether we are inside Open MPI package
>> inside_openmpi="no"
>> AC_ARG_WITH(openmpi-inside, [],
>> [
>>  AS_IF([test x"$withval" = "xyes" -o x"$withval" = "x1.7"],
>>  [
>>  inside_openmpi="$withval"
>>  CPPFLAGS="-DINSIDE_OPENMPI $CPPFLAGS"
>>
>>  # Set FC to F77 if Open MPI version < 1.7
>>  AS_IF([test x"$withval" = "xyes" -a x"$FC" = x -a x"$F77" != x],
>>  [FC="$F77"])
>>  ])
>> ])
>>
>> That logic looks a bit fragile with respect to any future changes.
>> Specifically the inner AS_IF is true for the desired condition "version <
>> 1.7" only because the outer AS_IF currently ensures the only possible
>> values of "$withval" are "yes" and "1.7".
>>
>
> Noted. But this is not my field. May take some time, because Matthias is
> still on vacation.
>
> Bert
>
>
>> -Paul
>>
>>
>>
> --
> Dipl.-Inf. Bert Wesarg
> wiss. Mitarbeiter
>
> Technische Universität Dresden
> Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
> 01062 Dresden
> Tel.: +49 (351) 463-42451
> Fax: +49 (351) 463-37773
> E-Mail: bert.wes...@tu-dresden.de
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15510.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.2rc3] static linking fails on linux when not building ROMIO

2014-08-05 Thread Paul Hargrove
 central point (thus catching it everywhere), then it could be quick and worth 
> doing. However, I'm skeptical as I tried to do that in the most obvious 
> place, and it failed (could be operator error).
>
> Will let you know tomorrow. Truly appreciate your digging on this!
> Ralph
>
> On Aug 4, 2014, at 3:50 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>
>  Ralph and Jeff,
>
> I've been digging and find the problem is wider than just the one library and 
> has manifestations specific to FreeBSD, NetBSD and Solaris.  I am adding new 
> info to the ticket as I unearth it.
>
> Additionally, it appears this existed in 1.8, 1.8.1 and in the 1.7 series as 
> well.
> So, I would suggest this NOT be a blocker for a 1.8.2 release.
>
> Of course I am willing to provide testing if you still want to push for a 
> quick resolution.
>
> -Paul
>
>
> On Mon, Aug 4, 2014 at 1:27 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Okay, I filed a blocker on this for 1.8.2 and assigned it to Jeff. I took a 
> crack at fixing it, but came up short :-(
>
>
> On Aug 3, 2014, at 10:46 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>
>  I've identified the difference between the platform that does link libutil 
> and the one that does not.
>
> 1) libutil is linked (as an OMPI dependency) only on the working system:
>
> Working system:
> $ grep 'checking for .* LIBS' configure.out
> checking for OPAL LIBS... -lm -lpciaccess -ldl
> checking for ORTE LIBS... -lm -lpciaccess -ldl -ltorque
> checking for OMPI LIBS... -lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil
>
> NON-working system:
> $ grep 'checking for .* LIBS' configure.out
> checking for OPAL LIBS... -lm -ldl
> checking for ORTE LIBS... -lm -ldl -ltorque
> checking for OMPI LIBS... -lm -ldl -ltorque
>
> So, the working system that does link libutil is doing so as an OMPI 
> dependency.
> However it is also needed for opal (only caller of openpty is 
> opal/util/opal_pty.c).
>
> 2) Only the working system is building ROMIO:
>
> Comparing the 'checking if * can compile' lines of configure output shows 
> only ONE difference:
>
>  checking if MCA component fs:ufs can compile... yes
>  checking if MCA component fs:pvfs2 can compile... no
>  checking if MCA component io:ompio can compile... yes
> -checking if MCA component io:romio can compile... no
> +checking if MCA component io:romio can compile... yes
>  checking if MCA component mpool:grdma can compile... yes
>  checking if MCA component mpool:sm can compile... yes
>  checking if MCA component mpool:udreg can compile... no
>
> So, it appears that *if* ROMIO is configured in, then "-lutil" gets added to 
> OMPI_WRAPPER_EXTRA_LIBS.
> This masks the fact that it is missing from OPAL_WRAPPER_EXTRA_LIBS.
>
>
> I have confirmed that I can reproduce the static linking failure by adding 
> --disable-io-romio to the configure options of the system that worked 
> previously.
>
> So, I update my report (and the email subject line) to:
>Static linking fails on Linux when not building ROMIO
>
> -Paul
>
>
>
> On Sun, Aug 3, 2014 at 6:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> Hmm,
>
> On a different Linux/x86-64 host things work as expected with '-lutil' linked 
> explicitly:
>
> $ ./INST/bin/mpicc -showme BLD/examples/hello_c.c
> pgcc BLD/examples/hello_c.c 
> -I/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/include
>  -L/opt/torque/4.2.7.h1/lib -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib 
> -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath 
> -Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib 
> -Wl,-rpath 
> -Wl,/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
>  
> -L/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
>  -lmpi -lopen-rte -lopen-pal -lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil
>
> Searching for relevant differences now...
>
> -Paul
>
>
> On Sun, Aug 3, 2014 at 4:58 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> I've configured the 1.8.2rc3 tarball with "--enable-static --disable-shared" 
> on a fairly standard Linux/x86-64 platform.  While there are no problems on 
> the same platform w/o these configure flags, with them I cannot link any 
> application codes.
>
> $ mpicc -g hello_c.c -o hello_c
> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib/libopen-pal.a(opal_pty.o):
>  In function `opal

Re: [OMPI devel] v1.8.2 still held up...

2014-08-07 Thread Paul Hargrove
Ralph,

I will hopefully be able to test Gilles's patch for 4834 on applicable
systems today or tomorrow.

So, I can soon answer whether the patch fixes all the problems I reported.
However, I cannot speak at all to the desirability of the approach relative
to the build infrastructure.
I think Jeff may be best qualified to make that judgement.

-Paul


On Thu, Aug 7, 2014 at 10:55 AM, Ralph Castain  wrote:

> Hey folks
>
> I *really* need your help to get this release out the door. It remains
> stuck on two things:
>
> * static linking failure - Gilles has posted a proposed fix, but somebody
> needs to approve and CMR it. Please see:
> https://svn.open-mpi.org/trac/ompi/ticket/4834
>
> * fixes to coll/ml that expanded to fixing page alignment in general -
> someone needs to review/approve it:
> https://svn.open-mpi.org/trac/ompi/ticket/4826
>
>
> We also have three outstanding issues that may not make 1.8.2:
>
> * MPI-I/O issues - looks like ROMIO needs some patches, and OMPIO may have
> an issue:
> http://www.open-mpi.org/community/lists/users/2014/08/24934.php
>
> * Siegmar reports another alignment issue on Sparc
> http://www.open-mpi.org/community/lists/users/2014/07/24869.php
>
> * Siegmar reports an issue that looks like it relates to handling of
> boolean MCA params:
>  http://www.open-mpi.org/community/lists/users/2014/08/24944.php
>
>
> Can someone *please* help out with these?
> Ralph
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15533.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] v1.8.2 still held up...

2014-08-08 Thread Paul Hargrove
On Thu, Aug 7, 2014 at 8:03 PM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

> > * Siegmar reports another alignment issue on Sparc
> > http://www.open-mpi.org/community/lists/users/2014/07/24869.php
> >
> Is there any chance r32449 fixes the issue ?
>
> i found the problem on Solaris/x86_64 but i have no way to test it on a
> Solaris/sparc box.
>

I have Solaris-10/SPARC, just as Siegmar reports using.
However, I don't have gcc-4.9.0 and doubt I can build it myself.

I will see if I can reproduce the problem with 1.8.2rc2 or rc3.
If so, then I'll give r32449 a try.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] v1.8.2 still held up...

2014-08-08 Thread Paul Hargrove
On Thu, Aug 7, 2014 at 8:03 PM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

> > * static linking failure - Gilles has posted a proposed fix, but
> somebody needs to approve and CMR it. Please see:
> > https://svn.open-mpi.org/trac/ompi/ticket/4834
>
> Jeff made a better fix (r32447) to which i added a minor correction
> (r32448).
> as far as i am concerned, i am fine with approving #4841
>
> that being said, per Jeff's commit log :
> "This needs to soak for a day or two on the trunk before moving to the
> v1.8 branch"
>
> so you might want to wait a bit ...
>


I trust Jeff's judgment on the waiting (or not), but can report that except
for an unrelated issue on Solaris-10/SPARC (email coming soon) the changes
in r32447+r32448 resolve the issue on all the OSes I test.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] circular library dependence prevents static link on Solaris-10/SPARC

2014-08-08 Thread Paul Hargrove
Testing r32448 on trunk for trac issue #4834, I encounter the following
which appears unrelated to #4834:

  CCLD orte-info
Undefined   first referenced
 symbol in file
ompi_proc_local_proc
 
/sandbox/hargrove/OMPI/openmpi-trunk-solaris10-sparcT2-ss12u3-v9-static/BLD/opal/.libs/libopen-pal.a(libmca_btl_sm_la-btl_sm_component.o)
ld: fatal: Symbol referencing errors. No output written to orte-info

Note that this is *static* linking.

This appears to indicate a call from OPAL to OMPI, and I am guessing this
is a side-effect of the BTL move.

Since OMPI contains (many) calls to OPAL this is a circular library
dependence.
Unfortunately, some linkers process their arguments strictly left-to-right.
Thus if this dependence is not eliminated one may need "-lmpi -lopen-pal
-lmpi" (or similar) to resolve it.
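
To illustrate the left-to-right behavior, here is a small self-contained shell sketch. The archive names libupper.a and liblower.a and all symbols are invented stand-ins for the OMPI and OPAL libraries, nothing in it comes from the Open MPI tree, and it assumes cc, ar and mktemp are available.

```sh
# Build two static archives with a circular dependence, then link a test
# program against the library list given in $1.  Returns the link/run status.
# All names are hypothetical; nothing here comes from the Open MPI tree.
circular_link_demo() {
    _libs=$1
    _dir=$(mktemp -d) || return 2
    (
        cd "$_dir" || exit 2
        # "upper" plays the role of OMPI, "lower" the role of OPAL.
        printf 'int lower_fn(void);\nint upper_entry(void) { return lower_fn(); }\n' > upper_entry.c
        printf 'int upper_helper(void) { return 42; }\n' > upper_helper.c
        printf 'int upper_helper(void);\nint lower_fn(void) { return upper_helper(); }\n' > lower_fn.c
        printf 'int upper_entry(void);\nint main(void) { return upper_entry() == 42 ? 0 : 1; }\n' > main.c
        cc -c upper_entry.c upper_helper.c lower_fn.c main.c || exit 2
        ar rcs libupper.a upper_entry.o upper_helper.o || exit 2
        ar rcs liblower.a lower_fn.o || exit 2
        # $_libs is intentionally unquoted so it splits into separate -l flags.
        cc main.o -L. $_libs -o demo 2>/dev/null && ./demo
    )
    _status=$?
    rm -rf "$_dir"
    return $_status
}
```

On a linker that scans static archives strictly left to right, circular_link_demo "-lupper -llower" is expected to fail with upper_helper undefined, while circular_link_demo "-lupper -llower -lupper" should link and run.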

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] circular library dependence prevents static link on Solaris-10/SPARC

2014-08-08 Thread Paul Hargrove
I will attempt to confirm on my Solaris-10 system ASAP.
That will allow me to finally be certain that the other static linking
issue has been resolved.

-Paul


On Fri, Aug 8, 2014 at 11:39 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> Thanks!
>
> On Aug 8, 2014, at 2:30 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> > r32467 should fix the problem.
> >
> >   George.
> >
> >
> > On Fri, Aug 8, 2014 at 1:20 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
> > That'll do it...
> >
> > George: can you fix?
> >
> >
> > On Aug 8, 2014, at 1:11 PM, Ralph Castain <r...@open-mpi.org> wrote:
> >
> > > I think it might be getting pulled in from this include:
> > >
> > > opal/mca/common/sm/common_sm.h:37:#include "ompi/group/group.h"
> > >
> > >
> > > On Aug 8, 2014, at 5:33 AM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
> > >
> > >> Weirdness; I don't see any name like that in the SM BTL.
> > >>
> > >> I see it used in the OMPI layer... not sure how it's being used down
> in the btl SM component file...?
> > >>
> > >>
> > >> On Aug 7, 2014, at 11:25 PM, Paul Hargrove <phhargr...@lbl.gov>
> wrote:
> > >>
> > >>> Testing r32448 on trunk for trac issue #4834, I encounter the
> following which appears unrelated to #4834:
> > >>>
> > >>>  CCLD orte-info
> > >>> Undefined   first referenced
> > >>> symbol in file
> > >>> ompi_proc_local_proc
>  
> /sandbox/hargrove/OMPI/openmpi-trunk-solaris10-sparcT2-ss12u3-v9-static/BLD/opal/.libs/libopen-pal.a(libmca_btl_sm_la-btl_sm_component.o)
> > >>> ld: fatal: Symbol referencing errors. No output written to orte-info
> > >>>
> > >>> Note that this is *static* linking.
> > >>>
> > >>> This appears to indicate a call from OPAL to OMPI, and I am guessing
> this is a side-effect of the BTL move.
> > >>>
> > >>> Since OMPI contains (many) calls to OPAL this is a circular library
> dependence.
> > >>> Unfortunately, some linkers process their argument strictly
> left-to-right.
> > >>> Thus if this dependence is not eliminated one may need "-lmpi
> -lopen-pal -lmpi" (or similar) to resolve it.
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15565.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15566.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] v1.8.2 still held up...

2014-08-09 Thread Paul Hargrove
On Thu, Aug 7, 2014 at 10:55 AM, Ralph Castain  wrote:

> * static linking failure - Gilles has posted a proposed fix, but somebody
> needs to approve and CMR it. Please see:
> https://svn.open-mpi.org/trac/ompi/ticket/4834
>


Jeff moved the fix to v1.8 in r32471.
I have tested tonight's tarball (1.8.2rc4r32480) and found the problem to
be resolved on all tested OSes (linux, macos, freebsd, netbsd, openbsd,
solaris-10 and solaris-11).

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] v1.8.2 still held up...

2014-08-09 Thread Paul Hargrove
On Thu, Aug 7, 2014 at 10:55 AM, Ralph Castain  wrote:

> * fixes to coll/ml that expanded to fixing page alignment in general -
> someone needs to review/approve it:
> https://svn.open-mpi.org/trac/ompi/ticket/4826
>

I've been able to confirm that the nightly tarball (1.8.2rc4r32480) works
as expected on the SPARC and PPC64 platforms where I had reproduced the
problem previously.  I won't have access to the IA64 platform (which also
has pagesize != 4K) until about 6 hours from now, but have no doubt the fix
will work there too.

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] [v1.8] java bindings build failure

2014-08-09 Thread Paul Hargrove
Below are errors from trying tonight's v1.8 tarball on one of the few
systems I have access to with java.  The trunk has the same errors but with
all the line numbers increased by exactly 18.

-Paul

Making all in java
make[3]: Entering directory
`/brashear/hargrove/OMPI/openmpi-1.8-latest-linux-x86_64-java/BLD/ompi/mpi/java/java'
  JAVAC  MPI.class
/usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:92:
type parameters of T cannot be determined; no unique maximal instance
exists for type variable T with upper bounds T,mpi.Struct.Data
return newData(buffer, 0);
  ^
/usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:107:
type parameters of T cannot be determined; no unique maximal instance
exists for type variable T with upper bounds T,mpi.Struct.Data
return newData(buffer, index * extent);
  ^
/usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:120:
type parameters of T cannot be determined; no unique maximal instance
exists for type variable T with upper bounds T,mpi.Struct.Data
return getData(buffer);
  ^
/usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:136:
type parameters of T cannot be determined; no unique maximal instance
exists for type variable T with upper bounds T,mpi.Struct.Data
return getData(buffer, index);
  ^
/usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:722:
type parameters of T cannot be determined; no unique maximal instance
exists for type variable T with upper bounds D,mpi.Struct.Data
return s.newData(buffer, offset + field);
^
/usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:737:
type parameters of T cannot be determined; no unique maximal instance
exists for type variable T with upper bounds D,mpi.Struct.Data
return s.newData(buffer, offset + field + index * s.extent);
^
6 errors



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] [v1.8] illegal commas after last item in enum

2014-08-09 Thread Paul Hargrove
The Solaris Studio 12.3 C++ compiler warns about commas after the last item
in an enum.
While these commas are legal in C99, they are ILLEGAL in C++ prior to C++11.

The warnings below list the four instances I encountered while building the
C++ bindings, but there might be others.

-Paul

"openmpi-1.8.2rc4r32480/ompi/include/ompi/constants.h", line 70: Warning:
Identifier expected instead of "}".
"openmpi-1.8.2rc4r32480/opal/mca/base/mca_base_framework.h", line 37:
Warning: Identifier expected instead of "}".
"openmpi-1.8.2rc4r32480/opal/mca/base/mca_base_framework.h", line 119:
Warning: Identifier expected instead of "}".
"openmpi-1.8.2rc4r32480/opal/mca/base/mca_base_var.h", line 694: Warning:
Identifier expected instead of "}".
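
Since there might be others, a rough scan for a comma directly before a closing brace can help locate them. This is a heuristic shell/awk sketch (the function name is made up): it also flags brace initializers, where a trailing comma is allowed, and it misses cases where a comment follows the comma, so every hit needs a manual look.

```sh
# Hypothetical helper: report lines where a comma is the last token before
# a closing brace.  Takes one or more source/header files as arguments.
find_trailing_commas() {
    awk '
        FNR == 1 { prev = "" }                  # do not carry state across files
        /^[ \t]*$/ { next }                     # ignore blank lines
        /^[ \t]*[}]/ && prev ~ /,[ \t]*$/ {     # brace right after a line ending in a comma
            printf "%s:%d: comma before closing brace\n", FILENAME, prevline
        }
        /,[ \t]*[}]/ {                          # comma and brace on the same line
            printf "%s:%d: comma before closing brace\n", FILENAME, FNR
        }
        { prev = $0; prevline = FNR }
    ' "$@"
}
```

For example: find_trailing_commas opal/mca/base/*.h ompi/include/ompi/*.h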


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [v1.8] java bindings build failure

2014-08-09 Thread Paul Hargrove
Ralph,

/usr/java/jdk1.6.0_21

-Paul


On Fri, Aug 8, 2014 at 9:51 PM, Ralph Castain <r...@open-mpi.org> wrote:

> This seems odd - I'm not seeing any warnings or errors when building Java.
> Which JDK version do you have?
>
> On Aug 8, 2014, at 9:31 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Below are errors from trying tonight's v1.8 tarball on one of the few
> systems I have access to with java.  The trunk has the same errors but with
> all the line numbers increased by exactly 18.
>
> -Paul
>
> Making all in java
> make[3]: Entering directory
> `/brashear/hargrove/OMPI/openmpi-1.8-latest-linux-x86_64-java/BLD/ompi/mpi/java/java'
>   JAVAC  MPI.class
> /usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:92:
> type parameters of T cannot be determined; no unique maximal instance
> exists for type variable T with upper bounds T,mpi.Struct.Data
> return newData(buffer, 0);
>   ^
> /usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:107:
> type parameters of T cannot be determined; no unique maximal instance
> exists for type variable T with upper bounds T,mpi.Struct.Data
> return newData(buffer, index * extent);
>   ^
> /usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:120:
> type parameters of T cannot be determined; no unique maximal instance
> exists for type variable T with upper bounds T,mpi.Struct.Data
> return getData(buffer);
>   ^
> /usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:136:
> type parameters of T cannot be determined; no unique maximal instance
> exists for type variable T with upper bounds T,mpi.Struct.Data
> return getData(buffer, index);
>   ^
> /usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:722:
> type parameters of T cannot be determined; no unique maximal instance
> exists for type variable T with upper bounds D,mpi.Struct.Data
> return s.newData(buffer, offset + field);
> ^
> /usr/users/6/hargrove/SCRATCH/OMPI/openmpi-1.8-latest-linux-x86_64-java/openmpi-1.8.2rc4r32480/ompi/mpi/java/java/Struct.java:737:
> type parameters of T cannot be determined; no unique maximal instance
> exists for type variable T with upper bounds D,mpi.Struct.Data
> return s.newData(buffer, offset + field + index * s.extent);
> ^
> 6 errors
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15575.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15577.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] [v1.8] 32-bit c++ build failure with Sun compilers

2014-08-09 Thread Paul Hargrove
A problem Siegmar reported on trunk over a year and a half ago is breaking
a 32-bit build of the v1.8 branch with the Sun C++ compiler:

Siegmar's report appears in
http://www.open-mpi.org/community/lists/users/2013/01/21269.php
There are several warnings, but the error is (from my current build):

"/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-ss12u3/openmpi-1.8.2rc4r32480/ompi/mpi/cxx/file.cc",
Error: The function opal_atomic_add_32(volatile int*, int) has not had a
body defined.

Brian attached a possible fix to
http://www.open-mpi.org/community/lists/users/2013/01/21272.php
It applies cleanly to v1.8 but appears to make things worse, trading that
one error for two:

"/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-ss12u3/openmpi-1.8.2rc4r32480/opal/include/opal/sys/atomic_impl.h",
line 106: Error: opal_atomic_add_64(volatile long long*, long long) was
previously declared "extern", not "inline".
"/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-ss12u3/openmpi-1.8.2rc4r32480/opal/include/opal/sys/atomic_impl.h",
line 121: Error: opal_atomic_sub_64(volatile long long*, long long) was
previously declared "extern", not "inline".


The good news is that the problem does not exist on the trunk.
So, hopefully somebody can track down the proper changes to CMR.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] v1.8.2 still held up...

2014-08-09 Thread Paul Hargrove
And for the record: the IA64 platform passed too.


On Sat, Aug 9, 2014 at 3:22 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
wrote:

> Thanks for all the testing!
>
> On Aug 8, 2014, at 11:21 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> >
> >
> >
> > On Thu, Aug 7, 2014 at 10:55 AM, Ralph Castain <r...@open-mpi.org> wrote:
> > * fixes to coll/ml that expanded to fixing page alignment in general -
> someone needs to review/approve it:
> > https://svn.open-mpi.org/trac/ompi/ticket/4826
> >
> > I've been able to confirm that the nightly tarball (1.8.2rc4r32480)
> works as expected on the SPARC and PPC64 platforms where I had reproduced
> the problem previously.  I won't have access to the IA64 platform (which
> also has pagesize != 4K) until about 6 hours from now, but have no doubt
> the fix will work there too.
> >
> > -Paul
> >
> > --
> > Paul H. Hargrove  phhargr...@lbl.gov
> > Future Technologies Group
> > Computer and Data Sciences Department Tel: +1-510-495-2352
> > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15574.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15583.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [v1.8] 32-bit c++ build failure with Sun compilers

2014-08-09 Thread Paul Hargrove
Ralph,

Yes, that did the trick.
The attached patch applied cleanly to last night's v1.8 tarball
(1.8.2rc4r32480) and I was able to build the C++ bindings on this platform.

-Paul


On Sat, Aug 9, 2014 at 7:58 AM, Ralph Castain <r...@open-mpi.org> wrote:

> I think I chased this down - looks like it is r28034. I've attached the
> patch here - can you please let me know if this fixes the problem?
>
>
>
> On Aug 8, 2014, at 11:11 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> A problem Siegmar reported on trunk over a year and a half ago is breaking
> a 32-bit build of the v1.8 branch with the Sun C++ compiler:
>
> Siegmar's report appears in
> http://www.open-mpi.org/community/lists/users/2013/01/21269.php
> There are several warnings, but the error is (from my current build):
>
>
> "/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-ss12u3/openmpi-1.8.2rc4r32480/ompi/mpi/cxx/
> file.cc", Error: The function opal_atomic_add_32(volatile int*, int) has
> not had a body defined.
>
> Brian attached a possible fix to
> http://www.open-mpi.org/community/lists/users/2013/01/21272.php
> It applies cleanly to v1.8 but appears to make things worse, trading that
> one error for two:
>
> "/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-ss12u3/openmpi-1.8.2rc4r32480/opal/include/opal/sys/atomic_impl.h",
> line 106: Error: opal_atomic_add_64(volatile long long*, long long) was
> previously declared "extern", not "inline".
> "/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-ss12u3/openmpi-1.8.2rc4r32480/opal/include/opal/sys/atomic_impl.h",
> line 121: Error: opal_atomic_sub_64(volatile long long*, long long) was
> previously declared "extern", not "inline".
>
>
> The good news is that the problem does not exist on the trunk.
> So, hopefully somebody can track down the proper changes to CMR.
>
> -Paul
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15582.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15591.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [v1.8] 32-bit c++ build failure with Sun compilers

2014-08-09 Thread Paul Hargrove
These changes also eliminate an equivalent g++ warning :

/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-gcc452/openmpi-1.8.2rc4r32480/opal/include/opal/sys/atomic.h:397:9:
warning: inline function `int32_t opal_atomic_add_32(volatile int32_t*,
int)' used but never defined
/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-gcc452/openmpi-1.8.2rc4r32480/opal/include/opal/sys/atomic.h:407:9:
warning: inline function `int32_t opal_atomic_sub_32(volatile int32_t*,
int)' used but never defined
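
For context, both the g++ warning and the Sun C++ error are what one sees when a header declares a function inline but no body is visible in the translation unit that calls it. Below is a minimal sketch of the well-formed pattern, with the declaration and the body in the same header. The name demo_add_32 is hypothetical, the body is deliberately NOT atomic, and this is not the content of the r28034 patch:

```c
#include <assert.h>
#include <stdint.h>

/* Declaration, as it might appear near the top of a header. */
static inline int32_t demo_add_32(volatile int32_t *addr, int delta);

/* The body must be visible in every translation unit that calls the
 * function; if it were missing, compilers report "used but never
 * defined" or "has not had a body defined".
 * Illustration only: a plain read-modify-write, NOT an atomic add. */
static inline int32_t demo_add_32(volatile int32_t *addr, int delta)
{
    assert(addr != 0);
    *addr += delta;
    return *addr;
}
```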

-Paul



On Sat, Aug 9, 2014 at 12:36 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Kewl - thanks!
>
> On Aug 9, 2014, at 12:24 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Ralph,
>
> Yes, that did the trick.
> The attached patch applied cleanly to last night's v1.8 tarball
> (1.8.2rc4r32480) and I was able to build the C++ bindings on this platform.
>
> -Paul
>
>
> On Sat, Aug 9, 2014 at 7:58 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> I think I chased this down - looks like it is r28034. I've attached the
>> patch here - can you please let me know if this fixes the problem?
>>
>>
>>
>> On Aug 8, 2014, at 11:11 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> A problem Siegmar reported on trunk over a year and a half ago is
>> breaking a 32-bit build of the v1.8 branch with the Sun C++ compiler:
>>
>> Siegmar's report appears in
>> http://www.open-mpi.org/community/lists/users/2013/01/21269.php
>> There are several warnings, but the error is (from my current build):
>>
>>
>> "/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-ss12u3/openmpi-1.8.2rc4r32480/ompi/mpi/cxx/
>> file.cc", Error: The function opal_atomic_add_32(volatile int*, int) has
>> not had a body defined.
>>
>> Brian attached a possible fix to
>> http://www.open-mpi.org/community/lists/users/2013/01/21272.php
>> It applies cleanly to v1.8 but appears to make things worse, trading that
>> one error for two:
>>
>> "/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-ss12u3/openmpi-1.8.2rc4r32480/opal/include/opal/sys/atomic_impl.h",
>> line 106: Error: opal_atomic_add_64(volatile long long*, long long) was
>> previously declared "extern", not "inline".
>> "/shared/OMPI/openmpi-1.8-latest-solaris11-x86-ib-ss12u3/openmpi-1.8.2rc4r32480/opal/include/opal/sys/atomic_impl.h",
>> line 121: Error: opal_atomic_sub_64(volatile long long*, long long) was
>> previously declared "extern", not "inline".
>>
>>
>> The good news is that the problem does not exist on the trunk.
>> So, hopefully somebody can track down the proper changes to CMR.
>>
>> -Paul
>>
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>  ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15582.php
>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15591.php
>>
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15594.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15595.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] [v1.8] build failure with xlc-11.1

2014-08-09 Thread Paul Hargrove
Building v1.8 nightly tarball with xlc-11.1 on a ppc64/linux platform:

Making all in asm
make[2]: Entering directory
`/home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/BLD/opal/asm'
  CC   asm.lo
rm -f atomic-asm.S
ln -s "../../opal/asm/generated/atomic-local.s" atomic-asm.S
  CPPAS atomic-asm.lo
  CCLD libasm.la
ar: .libs/atomic-asm.o: No such file or directory

The related portion of the configure output:

*** Assembler
checking dependency style of
/home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
xlc... none
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking for fgrep... /bin/grep -F
checking if need to remove -g from CCASFLAGS... no
checking whether to enable smp locks... yes
checking if .proc/endp is needed... no
checking directive for setting text section... .text
checking directive for exporting symbols... .globl
checking for objdump... objdump
checking if .note.GNU-stack is needed... no
checking suffix for labels... :
checking prefix for global symbol labels...
checking prefix for lsym labels... .L
checking prefix for function in .type... @
checking if .size is needed... yes
checking if .align directive takes logarithmic value... yes
checking if PowerPC registers have r prefix... no
checking if
/home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
xlc supports GCC inline assembly... yes
checking if
/home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
xlc supports DEC inline assembly... no
checking if
/home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
xlc supports XLC inline assembly... no
checking for assembly format... default-.text-.globl-:--.L-@-1-1-0-1-0
checking for asssembly architecture... POWERPC64
checking for builtin atomics... BUILTIN_NO
checking for perl... perl
checking for pre-built assembly file... no (not in asm-data)
checking whether possible to generate assembly file... yes
checking for atomic assembly filename... atomic-local.s

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] cosmetic configure nit

2014-08-09 Thread Paul Hargrove
One too many 's' characters in the following:

checking for asssembly architecture...

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [v1.8] build failure with xlc-11.1

2014-08-09 Thread Paul Hargrove
One note regarding my report below:

I have noticed that autoconf has chosen to use "$srcdir/config/compile xlc"
instead of just "xlc" (I set CC=xlc).  I strongly suspect this is related,
and am investigating why the compile wrapper is used.  However, independent
of that there does seem to be some flaw in how the atomics are getting
built on this configuration.

-Paul


On Sat, Aug 9, 2014 at 1:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Building v1.8 nightly tarball with xlc-11.1 on a ppc64/linux platform:
>
> Making all in asm
> make[2]: Entering directory
> `/home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/BLD/opal/asm'
>   CC   asm.lo
> rm -f atomic-asm.S
> ln -s "../../opal/asm/generated/atomic-local.s" atomic-asm.S
>   CPPAS atomic-asm.lo
>   CCLD libasm.la
> ar: .libs/atomic-asm.o: No such file or directory
>
> The related portion of the configure output:
>
> *** Assembler
> checking dependency style of
> /home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
> xlc... none
> checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
> checking the name lister (/usr/bin/nm -B) interface... BSD nm
> checking for fgrep... /bin/grep -F
> checking if need to remove -g from CCASFLAGS... no
> checking whether to enable smp locks... yes
> checking if .proc/endp is needed... no
> checking directive for setting text section... .text
> checking directive for exporting symbols... .globl
> checking for objdump... objdump
> checking if .note.GNU-stack is needed... no
> checking suffix for labels... :
> checking prefix for global symbol labels...
> checking prefix for lsym labels... .L
> checking prefix for function in .type... @
> checking if .size is needed... yes
> checking if .align directive takes logarithmic value... yes
> checking if PowerPC registers have r prefix... no
> checking if
> /home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
> xlc supports GCC inline assembly... yes
> checking if
> /home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
> xlc supports DEC inline assembly... no
> checking if
> /home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
> xlc supports XLC inline assembly... no
> checking for assembly format... default-.text-.globl-:--.L-@-1-1-0-1-0
> checking for asssembly architecture... POWERPC64
> checking for builtin atomics... BUILTIN_NO
> checking for perl... perl
> checking for pre-built assembly file... no (not in asm-data)
> checking whether possible to generate assembly file... yes
> checking for atomic assembly filename... atomic-local.s
>
> -Paul
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [v1.8] build failure with xlc-11.1

2014-08-09 Thread Paul Hargrove
You can disregard this thread... the problem was pilot error.

I understand now why the "compile" wrapper was getting used.
The probe "checking whether $CC and cc understand -c and -o together... "
is run WITHOUT $CFLAGS.

In my case CFLAGS included an argument required to locate the compiler's
config files and consequently the probe failed.  I consider that "pilot
error" on my part and have moved the config option to the definition of CC
instead (and the problem goes away).

So, while there *may* exist some valid set of conditions under which the
current configure/build could produce the reported failure, my test did NOT
represent a valid set of conditions.

-Paul


On Sat, Aug 9, 2014 at 1:29 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> One note regarding my report below:
>
> I have noticed that autoconf has chosen to use "$srcdir/config/compile
> xlc" instead of just "xlc" (I set CC=xlc).  I strongly suspect this is
> related, and am investigating why the compile wrapper is used.  However,
> independent of that there does seem to be some flaw in how the atomics are
> getting built on this configuration.
>
> -Paul
>
>
> On Sat, Aug 9, 2014 at 1:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> Building v1.8 nightly tarball with xlc-11.1 on a ppc64/linux platform:
>>
>> Making all in asm
>> make[2]: Entering directory
>> `/home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/BLD/opal/asm'
>>   CC   asm.lo
>> rm -f atomic-asm.S
>> ln -s "../../opal/asm/generated/atomic-local.s" atomic-asm.S
>>   CPPAS atomic-asm.lo
>>   CCLD libasm.la
>> ar: .libs/atomic-asm.o: No such file or directory
>>
>> The related portion of the configure output:
>>
>> *** Assembler
>> checking dependency style of
>> /home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
>> xlc... none
>> checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
>> checking the name lister (/usr/bin/nm -B) interface... BSD nm
>> checking for fgrep... /bin/grep -F
>> checking if need to remove -g from CCASFLAGS... no
>> checking whether to enable smp locks... yes
>> checking if .proc/endp is needed... no
>> checking directive for setting text section... .text
>> checking directive for exporting symbols... .globl
>> checking for objdump... objdump
>> checking if .note.GNU-stack is needed... no
>> checking suffix for labels... :
>> checking prefix for global symbol labels...
>> checking prefix for lsym labels... .L
>> checking prefix for function in .type... @
>> checking if .size is needed... yes
>> checking if .align directive takes logarithmic value... yes
>> checking if PowerPC registers have r prefix... no
>> checking if
>> /home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
>> xlc supports GCC inline assembly... yes
>> checking if
>> /home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
>> xlc supports DEC inline assembly... no
>> checking if
>> /home/hargrov1/OMPI/openmpi-1.8-latest-bluedrop-64-xlc/openmpi-1.8.2rc4r32480/config/compile
>> xlc supports XLC inline assembly... no
>> checking for assembly format... default-.text-.globl-:--.L-@-1-1-0-1-0
>> checking for asssembly architecture... POWERPC64
>> checking for builtin atomics... BUILTIN_NO
>> checking for perl... perl
>> checking for pre-built assembly file... no (not in asm-data)
>> checking whether possible to generate assembly file... yes
>> checking for atomic assembly filename... atomic-local.s
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-08-11 Thread Paul Hargrove
I am on the same page with George here - if it's on the list then support
it until it's been removed.

I happen to have systems to test, I believe, every supported atomics
implementation except for DEC Alpha, and so I did test them all.

AFAIK ARMv5 is outdated even as a smartphone platform.

-Paul


On Mon, Aug 11, 2014 at 9:46 AM, George Bosilca  wrote:

> It is not that I care, but it was one of our supported platforms and we
> don't usually drop support for anything without a proper RFC.
>
>   George.
>
>
>
>
> On Mon, Aug 11, 2014 at 12:09 PM, Dave Goodell (dgoodell) <
> dgood...@cisco.com> wrote:
>
>> On Aug 7, 2014, at 11:37 PM, George Bosilca  wrote:
>>
>> > Paul's tests identified a small issue with the previous patch (a real
>> corner-case for ARM v5). The patch below is fixing all known issues.
>>
>> Wait, why do we care about ARMv5?  It's certainly not a serious HPC
>> platform, nor is it even a relevant laptop platform at this point (AFAIK).
>>
>> -Dave
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15614.php
>>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15615.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-08-11 Thread Paul Hargrove
Well, the contents of opal/asm/asm-data.txt and the arch-specific subdirs
below opal/include/opal/sys have served me as a list of the atomics
implementations.  If those include architectures no longer officially
supported, then some cleanup may be in order (as SPARC_v8 was recently
removed from asm-data.txt).

-Paul


On Mon, Aug 11, 2014 at 11:44 AM, Jeff Squyres (jsquyres) <
jsquy...@cisco.com> wrote:

> I think the closest thing we have to a supported architecture list is in
> the README.
>
>
> On Aug 11, 2014, at 2:42 PM, Nathan Hjelm <hje...@lanl.gov> wrote:
>
> >
> > Which brings us back to Dave's question. Is there a list of supported
> > architectures? I don't want to bother with DEC Alpha if we no longer
> > support it.
> >
> > BTW, so far I have converted: AMD64, IA32, ARM. Working on IA64 now.
> >
> > -Nathan
> >
> > On Mon, Aug 11, 2014 at 01:57:21PM -0400, George Bosilca wrote:
> >>   Dave,
> >>   We all understand your concerns. However, the current issue has
> nothing to
> >>   do with Nathan, the code for supporting ARMv5 is already in the patch
> I
> >>   submitted and that Paul validated.
> >>   What Nathan said he might take a look at is a different method for
> >>   generating assembly code, one that only supports ARMv7 and later.
> >>     George.
> >>
> >>   On Mon, Aug 11, 2014 at 1:51 PM, Dave Goodell (dgoodell)
> >>   <dgood...@cisco.com> wrote:
> >>
> >> On Aug 11, 2014, at 11:54 AM, Paul Hargrove <phhargr...@lbl.gov>
> wrote:
> >>
> >>> I am on the same page with George here - if it's on the list then
> >> support it until it's been removed.
> >>>
> >>> I happen to have systems to test, I believe, every supported atomics
> >> implementation except for DEC Alpha, and so I did test them all.
> >>
> >> My comment was not intended to indicate that I don't value your
> testing
> >> contributions, Paul.  I am more concerned that Nathan is wasting
> time
> >> fixing support for an effectively useless platform.  It's not like
> this
> >> is a case where making the more portable change improves our general
> >> correctness on other platforms; it's a very (<= ARMv5)-specific
> >> situation.
> >>
> >> If there's actually an official list of supported platforms
> somewhere,
> >> then I'll let Nathan decide whether he wants to submit an RFC to
> drop
> >> ARMv5 support.  I know I'd support it, but I don't care enough to
> write
> >> an RFC of my own right now.
> >> -Dave
> >>
> >> ___
> >> devel mailing list
> >> de...@open-mpi.org
> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >> Link to this post:
> >> http://www.open-mpi.org/community/lists/devel/2014/08/15618.php
> >
> >> ___
> >> devel mailing list
> >> de...@open-mpi.org
> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15619.php
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15620.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15621.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] trunk hang when nodes have similar but private network

2014-08-13 Thread Paul Hargrove
I think that in this case one *could* add logic that would disqualify the
subnet because every compute node in the job has the SAME address.  In
fact, any subnet on which two or more compute nodes have the same address
must be suspect.  If this logic were introduced, the 127.0.0.1 loopback
address wouldn't need to be a special case.

This is just an observation, not a feature request (at least not on my
part).
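
To make the idea concrete, here is a minimal sketch of the check I have in mind. It assumes each node has already reported its IPv4 address on the candidate subnet as a host-byte-order integer; the function name and the way the addresses are gathered are hypothetical and do not correspond to any existing Open MPI code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* addrs[i] is the IPv4 address (host byte order) that node i reports
 * on the candidate subnet.  Returns true if two or more nodes report
 * the same address, in which case the subnet should be treated as
 * suspect and not used for inter-node traffic.
 * O(n^2) comparison, which is fine for a sketch. */
static bool subnet_has_duplicate_addr(const uint32_t *addrs, size_t n)
{
    assert(addrs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (addrs[i] == addrs[j]) {
                return true;
            }
        }
    }
    return false;
}
```

With this rule, two nodes that both report 192.168.255.1 (0xC0A8FF01) on a private bridge are flagged, and so are two nodes that both report 127.0.0.1, which is why loopback would no longer need special handling.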

-Paul


On Wed, Aug 13, 2014 at 7:55 AM, Jeff Squyres (jsquyres)  wrote:

> I think this is expected behavior.
>
> If you have networks that you need Open MPI to ignore (e.g., a private
> network that *looks* reachable between multiple servers -- because the
> interfaces are on the same subnet -- but actually *isn't*), then the
> include/exclude mechanism is the right way to exclude them.
>
> That being said, I'm not sure why the behavior is different between trunk
> and v1.8.
>
>
> On Aug 13, 2014, at 1:41 AM, Gilles Gouaillardet <
> gilles.gouaillar...@iferc.org> wrote:
>
> > Folks,
> >
> > i noticed mpirun (trunk) hangs when running any mpi program on two nodes
> > *and* each node has a private network with the same ip
> > (in my case, each node has a private network to a MIC)
> >
> > in order to reproduce the problem, you can simply run (as root) on the
> > two compute nodes
> > brctl addbr br0
> > ifconfig br0 192.168.255.1 netmask 255.255.255.0
> >
> > mpirun will hang
> >
> > a workaround is to add --mca btl_tcp_if_include eth0
> >
> > v1.8 does not hang in this case
> >
> > Cheers,
> >
> > Gilles
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15623.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15631.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] [1.8.2rc4] build failure with --enable-osx-builtin-atomics

2014-08-13 Thread Paul Hargrove
When configured with --enable-osx-builtin-atomics

Making all in asm
  CC   asm.lo
In file included from
/Users/Paul/OMPI/openmpi-1.8.2rc4-macos10.8-x86-clang-atomics/openmpi-1.8.2rc4/opal/asm/asm.c:21:
/Users/Paul/OMPI/openmpi-1.8.2rc4-macos10.8-x86-clang-atomics/openmpi-1.8.2rc4/opal/include/opal/sys/atomic.h:145:10:
fatal error: 'opal/sys/osx/atomic.h' file not found
#include "opal/sys/osx/atomic.h"
 ^
1 error generated.

I reported the same issue to George in the trunk last week.
So, I am 95% certain one just needs to cmr r32390 (commit msg == 'Dont miss
the Os X atomics on "make dist"')


-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] [1.8.2rc4] OSHMEM fortran bindings with bad compilers

2014-08-13 Thread Paul Hargrove
The following is NOT a bug report.
This is just an observation that may deserve some text in the README.

I've reported issues in the past with some Fortran compilers (mostly older
XLC and PGI) which either cannot build the "use mpi_f08" module, or cannot
correctly link to it (and sometimes this fails only if configured with
--enable-debug).

Testing the OSHMEM Fortran bindings (enabled by default on Linux) I have
found several compilers which fail to link the examples (hello_oshmemfh and
ring_oshmemfh).  I reported one specific instance (with xlc-11/xlf-13) back
in February: http://www.open-mpi.org/community/lists/devel/2014/02/14057.php

So far I have these failures only on platforms where the Fortran compiler
is *known* to be broken for the MPI f90 and/or f08 bindings.  Specifically,
all the failing platforms are ones on which either:
+ Configure determines (without my help) that FC cannot build the F90
and/or F08 modules.
OR
+ I must pass --enable-mpi-fortran=usempi or --enable-mpi-fortran=mpifh for
cases configure cannot detect.

So, I do *not* believe there is anything wrong with the OSHMEM code, which
is why I started this post with "The following is NOT a bug report".
However, I have two recommendations to make:

1) Documentation

The README says just:

--disable-oshmem-fortran
  Disable building only the Fortran OSHMEM bindings.

So, I recommend adding a sentence there referencing the "Compiler Notes"
section of the README which has details on some known bad Fortran compilers.

2) Configure:

As I noted above, at least some of the failures are on platforms where
configure has determined it cannot build the f08 MPI bindings.  So, maybe
there is something that could be done at configure time to disqualify some
Fortran compilers from building the OSHMEM Fortran bindings, too.

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.2rc4] build failure with --enable-osx-builtin-atomics

2014-08-14 Thread Paul Hargrove
Fix confirmed using the nightly tarball (v1.8rc5r32531).

-Paul


On Wed, Aug 13, 2014 at 6:16 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Thanks Paul - fixed in r32530
>
>
>
> On Wed, Aug 13, 2014 at 2:42 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> When configured with --enable-osx-builtin-atomics
>>
>> Making all in asm
>>   CC   asm.lo
>> In file included from
>> /Users/Paul/OMPI/openmpi-1.8.2rc4-macos10.8-x86-clang-atomics/openmpi-1.8.2rc4/opal/asm/asm.c:21:
>> /Users/Paul/OMPI/openmpi-1.8.2rc4-macos10.8-x86-clang-atomics/openmpi-1.8.2rc4/opal/include/opal/sys/atomic.h:145:10:
>> fatal error: 'opal/sys/osx/atomic.h' file not found
>> #include "opal/sys/osx/atomic.h"
>>  ^
>> 1 error generated.
>>
>> I reported the same issue to George in the trunk last week.
>> So, I am 95% certain one just needs to cmr r32390 (commit msg == 'Dont
>> miss the Os X atomics on "make dist"')
>>
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15642.php
>>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15644.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.8.4rc4 is out

2014-08-14 Thread Paul Hargrove
I have completed testing the majority of the platforms I have access to.
The only issue that is not already known to exist in earlier releases was
the missing osx/atomic.h, for which Ralph promptly merged George's fix.

If I include the re-tested osx-atomics (which passes w/ 1.8.2rc5r32531), I
have success on 75 distinct configurations which include x86, x86-64,
sparc-v8+, sparc64-v9, ppc32 and ppc64 ABIs with various releases of Linux,
Mac OS X, Solaris, FreeBSD, NetBSD and OpenBSD, with all sorts of
compilers, and static linking (w/o romio :-)) on at least one configuration
for each OS.

I will have results on ia64, ARMv5, ARMv7 and 3 MIPS ABIs in the next day
or two.

Looks good to me.
-Paul


On Wed, Aug 13, 2014 at 1:37 PM, Jeff Squyres (jsquyres)
<jsquy...@cisco.com> wrote:

> Please test!  Ralph would like to release after the teleconf next Tuesday:
>
> http://www.open-mpi.org/software/ompi/v1.8/
>
> Changes since last rc:
>
> - Fix cascading/over-quoting in some cases with the rsh/ssh-based
>   launcher.  Thanks to multiple users for raising the issue.
> - Properly add support for gfortran 4.9 ignore TKR pragma (it was
>   erroneously only partially added in v1.7.5).  Thanks to Marcus
>   Daniels for raising the issue.
> - Update/improve help messages in the usnic BTL.
> - Resolve a race condition in MPI_Abort.
> - Fix obscure cases where static linking from wrapper compilers would
>   fail.
> - Clarify the configure --help message about when OpenSHMEM is
>   enabled/disabled by default.  Thanks to Paul Hargrove for the
>   suggestion.
> - Align pages properly where relevant.  Thanks to Paul Hargrove for
>   identifying the issue.
> - Various compiler warning and minor fixes for OpenBSD, FreeBSD, and
>   Solaris/SPARC.  Thanks to Paul Hargrove for the patches.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15640.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.8.4rc4 is out

2014-08-15 Thread Paul Hargrove
My testing has additionally passed on
  IA64
  ARM - v5 and v7
  MIPS - "32", "n32" and "64" ABIs

-Paul


On Wed, Aug 13, 2014 at 9:18 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> I have completed testing the majority of the platforms I have access to.
> The only issue that is not already known to exist in earlier releases was
> the missing osx/atomic.h, for which Ralph promptly merged George's fix.
>
> If I include the re-tested osx-atomics (which passes w/ 1.8.2rc5r32531), I
> have success on 75 distinct configurations which include x86, x86-64,
> sparc-v8+, sparc64-v9, ppc32 and ppc64 ABIs with various releases of Linux,
> Mac OS X, Solaris, FreeBSD, NetBSD and OpenBSD, with all sorts of
> compilers, and static linking (w/o romio :-)) on at least one configuration
> for each OS.
>
> I will have results on ia64, ARMv5, ARMv7 and 3 MIPS ABIs in the next day
> or two.
>
> Looks good to me.
> -Paul
>
>
> On Wed, Aug 13, 2014 at 1:37 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
>
>> Please test!  Ralph would like to release after the teleconf next Tuesday:
>>
>> http://www.open-mpi.org/software/ompi/v1.8/
>>
>> Changes since last rc:
>>
>> - Fix cascading/over-quoting in some cases with the rsh/ssh-based
>>   launcher.  Thanks to multiple users for raising the issue.
>> - Properly add support for gfortran 4.9 ignore TKR pragma (it was
>>   erroneously only partially added in v1.7.5).  Thanks to Marcus
>>   Daniels for raising the issue.
>> - Update/improve help messages in the usnic BTL.
>> - Resolve a race condition in MPI_Abort.
>> - Fix obscure cases where static linking from wrapper compilers would
>>   fail.
>> - Clarify the configure --help message about when OpenSHMEM is
>>   enabled/disabled by default.  Thanks to Paul Hargrove for the
>>   suggestion.
>> - Align pages properly where relevant.  Thanks to Paul Hargrove for
>>   identifying the issue.
>> - Various compiler warning and minor fixes for OpenBSD, FreeBSD, and
>>   Solaris/SPARC.  Thanks to Paul Hargrove for the patches.
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15640.php
>>
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [OMPI svn] svn:open-mpi r32555 - trunk/opal/mca/btl/scif

2014-08-20 Thread Paul Hargrove
Can somebody confirm that configure is adding "-c9x" or "-c99" to CFLAGS
with this compiler?
If not then r32555 could possibly be reverted in favor of adding the proper
compiler flag.

Also, I am suspicious of this failure because even without a language-level
option pgcc 12.9 and 13.4 compile the following:

struct S { int i; double d; };
struct S x = {1,0};
int main (void)
{
  struct S y = { .i = x.i };
  return y.i;
}
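
To make the comparison concrete, here is a minimal sketch of the two forms at issue in r32555. The names (demo_modex, make_with_designator, make_with_assignment, demo_forms_agree) are hypothetical stand-ins, not the real btl/scif types. One difference is worth noting if the struct has members besides the one being set: C99 zero-initializes every member that a designated initializer omits, whereas a bare declaration followed by an assignment leaves the other members indeterminate. The sketch adds a memset so the two forms stay equivalent.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a modex-like struct; NOT the real Open MPI type. */
struct demo_modex {
    int port_id;
    int reserved;
};

/* C99 form: non-constant expression in a designated initializer.
 * Members that are not named (here: reserved) are zero-initialized. */
struct demo_modex make_with_designator(int port_id)
{
    struct demo_modex m = { .port_id = port_id };
    return m;
}

/* Form that older compilers accept: declare first, assign afterwards.
 * The memset is an addition of this sketch; without it, reserved would
 * be indeterminate. */
struct demo_modex make_with_assignment(int port_id)
{
    struct demo_modex m;
    memset(&m, 0, sizeof(m));
    m.port_id = port_id;
    return m;
}

/* Returns 1 when both construction styles yield the same member values. */
int demo_forms_agree(int port_id)
{
    struct demo_modex a = make_with_designator(port_id);
    struct demo_modex b = make_with_assignment(port_id);
    assert(a.reserved == 0); /* guaranteed by C99 for omitted members */
    return a.port_id == b.port_id && a.reserved == b.reserved;
}
```

If the real struct has only the one member, the memset is unnecessary and the two-line form in the commit is already equivalent.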


-Paul


On Wed, Aug 20, 2014 at 7:20 AM, Nathan Hjelm  wrote:

> Really? That means PGI 2013 is NOT C99 compliant! Figures.
>
> -Nathan
>
> On Tue, Aug 19, 2014 at 10:48:48PM -0400, svn-commit-mai...@open-mpi.org
> wrote:
> > Author: ggouaillardet (Gilles Gouaillardet)
> > Date: 2014-08-19 22:48:47 EDT (Tue, 19 Aug 2014)
> > New Revision: 32555
> > URL: https://svn.open-mpi.org/trac/ompi/changeset/32555
> >
> > Log:
> > btl/scif: use safe syntax
> >
> > PGI compilers 2013 and older do not support the following syntax :
> > mca_btl_scif_modex_t modex = {.port_id = mca_btl_scif_module.port_id};
> > so split it on two lines
> >
> > cmr=v1.8.2:reviewer=hjelmn
> >
> > Text files modified:
> >trunk/opal/mca/btl/scif/btl_scif_component.c | 3 ++-
> >1 files changed, 2 insertions(+), 1 deletions(-)
> >
> > Modified: trunk/opal/mca/btl/scif/btl_scif_component.c
> >
> ==
> > --- trunk/opal/mca/btl/scif/btl_scif_component.c  Tue Aug 19
> 18:34:49 2014(r32554)
> > +++ trunk/opal/mca/btl/scif/btl_scif_component.c  2014-08-19
> 22:48:47 EDT (Tue, 19 Aug 2014)  (r32555)
> > @@ -208,7 +208,8 @@
> >
> >  static int mca_btl_scif_modex_send (void)
> >  {
> > -mca_btl_scif_modex_t modex = {.port_id =
> mca_btl_scif_module.port_id};
> > +mca_btl_scif_modex_t modex;
> > +modex.port_id = mca_btl_scif_module.port_id;
> >
> >  return opal_modex_send (&mca_btl_scif_component.super.btl_version,
> &modex, sizeof (modex));
> >  }
> > ___
> > svn mailing list
> > s...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/svn
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15667.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] Envelope of HINDEXED_BLOCK

2014-08-26 Thread Paul Hargrove
>
> libtoolize: putting libltdl files in LT_CONFIG_LTDL_DIR, `opal/libltdl'.
> libtoolize: `COPYING.LIB' not found in `/usr/share/libtool/libltdl'
> autoreconf: libtoolize failed with exit status: 1
>
>
The error message is from libtoolize about a file missing from the libtool
installation directory.
So, this looks (to me) like a mis-installation of libtool.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.8.3rc1 - start your engines

2014-09-13 Thread Paul Hargrove
Ralph,

I am not sure if I will have time to run my full suite of configurations,
including all the PGI, Sun, Intel and IBM compilers on Linux.

However, the following non-(Linux/x86-64) platforms have passed:

+ Linux/{PPC32,PPC64,IA64}
+ Solaris-10/{SPARC-v8+,SPARC-v9} (Oracle and GNU compilers)
+ Solaris-11/{amd64,i386} (Oracle and GNU compilers)
+ NetBSD-6/{amd64,i386}
+ OpenBSD-5/{amd64,i386}
+ FreeBSD-10/{amd64,i386}

I've started runs on my ARM and MIPS Linux systems, but those results will
take a while.

-Paul

On Sat, Sep 13, 2014 at 11:23 AM, Ralph Castain  wrote:

> Hi folks
>
> Time to start the release process with rc1 - please test and report issues:
>
> http://www.open-mpi.org/software/ompi/v1.8/
>
> Ralph
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15822.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] CONVERSION TO GITHUB

2014-09-16 Thread Paul Hargrove
Jeff,

Any instructions for those who have never had Subversion accounts, but do
have Trac accounts?
You know... the people like me who primarily just make work for others :-)

-Paul

On Tue, Sep 16, 2014 at 10:34 AM, Jeff Squyres (jsquyres) <
jsquy...@cisco.com> wrote:

> Short version
> =
>
> - I have added / invited all users to the "ompi" Github repo.
>   *** You need to join and then "unwatch" the "ompi" repo on Github ASAP
> ***
>
> - The github migration is planned for *next* Wednesday: 24 Sep, 2014
>   ALL OMPI ACTIVITY MUST STOP THAT DAY: commits, tickets, wiki
>
> - Go read the new OMPI wiki pages about Git / Github.  They talk about how
> we're going to use Git/Github, etc.  Please reply here with comments,
> suggestions, questions, etc.:
>
>   https://github.com/open-mpi/ompi/wiki
>
> More detail
> ===
>
> The Github migration has been planned for Wednesday, 24 Sep 2014.  The
> migration will start at 8am US Eastern time, and will take all day.
> Subversion and Trac will be placed into read-only status at the beginning
> of the migration.
>
> *** Please reply ASAP if this date does not work for you.
>
> *** If you're a current OMPI developer, you MUST join and "unwatch" the
> "ompi" repo before the migration date (i.e., go to
> https://github.com/open-mpi/ompi/ and click the "Unwatch" button in the
> top right and select "Ignoring").  If you don't join, you make the
> migration harder for me (please don't do that).  If you don't "unwatch",
> you will get a ZILLION emails when the migration actually occurs.  YOU HAVE
> BEEN WARNED.
>
> There's much more information about the Github migration on this wiki page:
>
>  https://github.com/open-mpi/ompi/wiki/GithubMigration
>
> Go read it.  GO READ IT NOW.
>
> I will send out an "all clear" email next Wednesday when Github is ready
> to use.  At that point, it will be safe (and recommended) to start Watching
> the "ompi" repo again.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15839.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.8.3rc1 - start your engines

2014-09-16 Thread Paul Hargrove
The ARM results finished a couple of days back and the MIPS results (3 ABIs to
test) finally completed overnight.
In the meantime I was able to schedule tests of most of my menagerie of
Intel, PGI, Sun, Pathscale, and Open64 compilers on x86-64, and some IBM
compiler tests on PPC64 (but *NOT* yet the latest compiler release
available to me).

Other than the known issues with various compilers (such as the need to
explicitly disable F08 bindings with some PGI versions) there were no
problems found in 1.8.3rc1.

There may be some results later for the IBM compiler I didn't get to, and
possibly for Clang on Linux.

-Paul

On Sun, Sep 14, 2014 at 8:55 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Your contributions are always appreciated, Paul - thanks!
>
> On Sep 13, 2014, at 7:51 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Ralph,
>
> I am not sure if I will have time to run my full suite of configurations,
> including all the PGI, Sun, Intel and IBM compilers on Linux.
>
> However, the following non-(Linux/x86-64) platforms have passed:
>
> + Linux/{PPC32,PPC64,IA64}
> + Solaris-10/{SPARC-v8+,SPARC-v9} (Oracle and GNU compilers)
> + Solaris-11/{amd64,i386} (Oracle and GNU compilers)
> + NetBSD-6/{amd64,i386}
> + OpenBSD-5/{amd64,i386}
> + FreeBSD-10/{amd64,i386}
>
> I've started runs on my ARM and MIPS Linux systems, but those results will
> take a while.
>
> -Paul
>
> On Sat, Sep 13, 2014 at 11:23 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Hi folks
>>
>> Time to start the release process with rc1 - please test and report
>> issues:
>>
>> http://www.open-mpi.org/software/ompi/v1.8/
>>
>> Ralph
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/09/15822.php
>>
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15823.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15826.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] CONVERSION TO GITHUB

2014-09-16 Thread Paul Hargrove
Jeff,

So the instruction from your reply is "create a github account if you wish
to continue filing tickets".

But don't you want/need the trac->github account mapping now to convert
existing tickets?
For instance, I am "phargrov" in your Trac, but "PHHargrove" at github.

And by the way, on wiki page
https://github.com/open-mpi/ompi/wiki/SubmittingBugs you might consider
adding a link to the issue tracker, for folks not familiar with github
navigation.

-Paul

On Tue, Sep 16, 2014 at 11:47 AM, Jeff Squyres (jsquyres) <
jsquy...@cisco.com> wrote:

> Not really.
>
> One minor point: you'll need a Github account to file Github issues (i.e.,
> what's replacing Trac tickets) and/or use the code commenting tools.
>
>
>
> On Sep 16, 2014, at 2:33 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> > Jeff,
> >
> > Any instructions for those who have never had Subversion accounts, but
> do have Trac accounts?
> > You know... the people like me who primarily just make work for others
> :-)
> >
> > -Paul
> >
> > On Tue, Sep 16, 2014 at 10:34 AM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
> > Short version
> > =
> >
> > - I have added / invited all users to the "ompi" Github repo.
> >   *** You need to join and then "unwatch" the "ompi" repo on Github ASAP
> ***
> >
> > - The github migration is planned for *next* Wednesday: 24 Sep, 2014
> >   ALL OMPI ACTIVITY MUST STOP THAT DAY: commits, tickets, wiki
> >
> > - Go read the new OMPI wiki pages about Git / Github.  They talk about
> how we're going to use Git/Github, etc.  Please reply here with comments,
> suggestions, questions, etc.:
> >
> >   https://github.com/open-mpi/ompi/wiki
> >
> > More detail
> > ===
> >
> > The Github migration has been planned for Wednesday, 24 Sep 2014.  The
> migration will start at 8am US Eastern time, and will take all day.
> Subversion and Trac will be placed into read-only status at the beginning
> of the migration.
> >
> > *** Please reply ASAP if this date does not work for you.
> >
> > *** If you're a current OMPI developer, you MUST join and "unwatch" the
> "ompi" repo before the migration date (i.e., go to
> https://github.com/open-mpi/ompi/ and click the "Unwatch" button in the
> top right and select "Ignoring").  If you don't join, you make the
> migration harder for me (please don't do that).  If you don't "unwatch",
> you will get a ZILLION emails when the migration actually occurs.  YOU HAVE
> BEEN WARNED.
> >
> > There's much more information about the Github migration on this wiki
> page:
> >
> >  https://github.com/open-mpi/ompi/wiki/GithubMigration
> >
> > Go read it.  GO READ IT NOW.
> >
> > I will send out an "all clear" email next Wednesday when Github is ready
> to use.  At that point, it will be safe (and recommended) to start Watching
> the "ompi" repo again.
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15839.php
> >
> >
> >
> > --
> > Paul H. Hargrove  phhargr...@lbl.gov
> > Future Technologies Group
> > Computer and Data Sciences Department Tel: +1-510-495-2352
> > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15840.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15842.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] CONVERSION TO GITHUB

2014-09-17 Thread Paul Hargrove
On Wed, Sep 17, 2014 at 8:06 AM, Jeff Squyres (jsquyres)  wrote:

> I actually have the mapping already.  The *only* ID that is preserved
> between the two will be who the ticket is assigned to.


You sent out email asking for SVN -> github ID mapping, but did NOT ask
about IDs for Trac users who are not SVN users.
So my (minor) concern was over what happens to tickets assigned to Trac
users you didn't collect github IDs for?

Am I really the only person with Trac issues assigned but no SVN?
Anyway, you know my github ID now so I won't lose ownership of my one or
two tickets :-)

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] Bitbucket vs. GitHub (was: Conversion to GitHub: POSTPONED)

2014-09-25 Thread Paul Hargrove
some repos at BB and
> some at GH.  Keep in mind that this is two different web UIs, two different
> ticket systems, two different wiki formats, etc.  For those of us who work
> in multiple different projects in OMPI, it could be annoying to have to
> mentally switch between the two.
>
> Don't get me wrong: using two different systems is definitely do-able,
> but... meh.
>
> All in all, I think it distills down to:
>
> 1. There's one feature we hope GitHub will implement (per-branch push
> ACLs; we can easily switch from a two-repo system to a single-repo system
> if they ever do); Bitbucket has this feature today.  Otherwise, BB vs. GH =
> pretty feature-comparable.
>
> 2. Bitbucket is a bit more expensive / Cisco already paid for GitHub.  As
> a side-effect, using Bitbucket *may* result in committer-counting games (to
> stay on a given plan).
>
> 3. All the rest of OMPI projects are at GitHub
>
> Because of inertia, monetary cost, and logistics/mental cost, I'm inclined
> to stick with the existing migration plan and move the main Open MPI repo
> to GitHub next Wednesday, 1 Oct 2014, starting at 8am US Eastern.
>
> Comments?
>
>
>
>
> On Sep 24, 2014, at 6:37 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
> wrote:
>
> > If someone with a .edu account gets us a free Bitbucket for Open MPI,
> and then we use it for both research and industry stuff... at best, I think
> that falls into a grey area as to whether this is within Bitbucket's TOS
> (disclaimer: I haven't read their TOS).  It still sounds like a murky
> prospect; I'm not sure it's within the intent of a free account.
> >
> > Paying a reasonable amount for a private account isn't out of the
> question.  Indeed, Cisco has already paid $300 for the first year of a
> Github account so that OMPI can have a private repo.  :-\  That can be
> written off, if necessary, but it would be nice not to.  However, paying
> per developer may become prohibitive -- infrequent bulk payments (e.g.,
> $300/year) are do-able from those of us at corporations.  Managing a
> monthly fee that is dependent upon the number of active committers (and
> that number changes over time) could get a bit... complex, in terms of
> corporate payments / reimbursements.
> >
> > That being said, there's quite a bit of OMPI infrastructure that is
> actively in use at GitHub.  It would be a bit of a pain to migrate all of
> that *again* (from SVN/Trac -> Git/Github -> Git/Bitbucket).  Remember,
> it's not just moving the repos (which, since most repos are now Git, is
> easy to move to another hosting provider); it's also moving the wiki and
> the tickets, too.  That will take more effort.
> >
> > All the above being said:
> >
> > 1. I'll still have a look at Bitbucket today.  It may be a workable
> model that the main OMPI repo (and wiki and tickets) is at Bitbucket, and
> most other repos (and wikis and tickets) are at Github.
> > 2. I just sent a mail to Github support asking them if they plan to
> support per-branch push ACLs.  I don't know if they'll be able to give a
> direct answer, but it's worth asking.
> >
> > It would be a little weird to span Github and Bitbucket, but the
> individual OMPI sub-projects are suitably independent of each other such
> that it could work.  Indeed, we've effectively been doing that for a while
> (e.g., hwloc has been at Github for quite a few months now), but that was
> never intended to be the desired end state.
> >
> >
> >
> > On Sep 23, 2014, at 11:57 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> >
> >> The pricing question might not be as simple as it first sounds.  At
> BitBucket, Academic accounts are free and allow unlimited users.  So, if
> somebody with an .EDU email address (IU and UTK come to mind) is the
> owner of the repo then I believe the cost is zero.  Somebody should verify
> that rather than take my word for it.
> >>
> >> More points for comparison between BitBucket and GitHub are presented in
> >>
> http://www.infoworld.com/article/2611771/application-development/bitbucket-vs--github--which-project-host-has-the-most-.html
> >>
> >> -Paul
> >>
> >> On Tue, Sep 23, 2014 at 8:39 PM, Gilles Gouaillardet <
> gilles.gouaillar...@iferc.org> wrote:
> >> my 0.02 US$ ...
> >>
> >> Bitbucket pricing model is per user (but with free public/private
> >> repository up to 5 users)
> >> whereas github pricing is per *private* repository (and free public
> >> repository and with unlimited users)
> >>
> >> from an OpenMPI point of view, this means :
> >> - with github, only the pri

Re: [OMPI devel] RFC: calloc instead of malloc in opal_obj_new()

2014-10-03 Thread Paul Hargrove
I agree with George that zeroing memory only in the debug builds could hide
bugs, and thus would want to see the debug and non-debug builds have the
same behavior (both malloc or both calloc).  So, I also agree this looks
initially like a hard choice.

What about using malloc() in non-debug builds and having a MCA param
control malloc-vs-calloc in a debug build (with malloc being the default)?
The param name could be something with "valgrind" in it to allow it to
control any other "paranoid code" that may be introduced just to silence
valgrind warnings.
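
A minimal sketch of that idea, under stated assumptions: demo_zero_on_alloc stands in for the proposed MCA parameter, DEMO_ENABLE_DEBUG stands in for the debug-build macro, and demo_obj_alloc is a hypothetical helper, not the real opal_obj_new(). The debug macro defaults to 1 here only so that the switch can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifndef DEMO_ENABLE_DEBUG
#define DEMO_ENABLE_DEBUG 1  /* assumption: pretend this is a debug build */
#endif

/* Hypothetical runtime switch standing in for an MCA parameter.
 * Default is 0, i.e. plain malloc(), matching the proposed default. */
static int demo_zero_on_alloc = 0;

void demo_set_zero_on_alloc(int enabled)
{
    demo_zero_on_alloc = enabled;
}

/* Allocate object storage. Non-debug builds always use malloc();
 * debug builds use calloc() only when the switch is turned on. */
void *demo_obj_alloc(size_t size)
{
    assert(size > 0);
#if DEMO_ENABLE_DEBUG
    if (demo_zero_on_alloc) {
        return calloc(1, size);
    }
#endif
    return malloc(size);
}

/* Helper for checking the effect: returns 1 if all bytes are zero. */
int demo_is_all_zero(const void *p, size_t size)
{
    const unsigned char *bytes = (const unsigned char *) p;
    size_t i;
    for (i = 0; i < size; i++) {
        if (bytes[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

With the switch off, debug and non-debug builds behave identically, which addresses George's concern about zeroing hiding bugs. The zeroing only happens when someone asks for it explicitly, for example during a valgrind run.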

-Paul

On Fri, Oct 3, 2014 at 3:02 PM, George Bosilca  wrote:

> It's a tough call. This proposal will create significant differences
> between the debug and fast builds. As the entire objects will be set to
> zero this might reduce bugs in the debug build, bugs that will be horribly
> difficult to track in any non-debug builds. Moreover, if the structures are
> carefully accessed in our code, adding such a disruptive initialization
> just to prevent valgrind from reporting false positives about uninitialized
> reads in memcpy is too costly as a solution (I am also aware that it
> will be almost impossible to write a valgrind suppression rule for the
> specific case you mention).
>
> Some parts of the code have (or at least had) some level of cleanness for
> the gaps in the structures. The solution was to minimally zero-fy the gaps,
> maintaining the same behavior between debug and non-debug builds. However,
> in order to do this one needs to know the layout of the structure, so this
> is not a completely generic solution...
>
>   George.
>
>
> On Oct 3, 2014, at 16:54 , Jeff Squyres (jsquyres) 
> wrote:
>
> > WHAT: change the malloc() to calloc() in opal_obj_new() (perhaps only in
> debug builds?)
> >
> > WHY: Drastically reduces valgrind output
> >
> > WHERE: see
> https://github.com/open-mpi/ompi/blob/master/opal/class/opal_object.h#L462-L467
> >
> > TIMEOUT: teleconf, Tue, Oct 14 (there's no rush)
> >
> > MORE DETAIL:
> >
> > I was debugging some code today and came across a bunch of places where
> we write structs down various IPC mechanisms, and the structs contain
> holes.  In most places, the performance doesn't matter / the readability of
> struct members is more important, so we haven't re-ordered the structs to
> remove holes.  But consequently, those holes end up uninitialized, and
> therefore memcpy()ing or write()ing instances of these structs causes
> valgrind to emit warnings.
> >
> > The patch below eliminates most (all?) of these valgrind warnings -- in
> debug builds, it changes the malloc() inside OBJ_NEW to a calloc().
> >
> > Upon a little more thought, however, I wonder if we use OBJ_NEW in any
> fast code paths (other than in bulk, such as when we need to grow a free
> list).  Specifically: would it be terrible to *always* calloc -- not just
> for debug builds?
> >
> > -
> > diff --git a/opal/class/opal_object.h b/opal/class/opal_object.h
> > index 7012bac..585f13e 100644
> > --- a/opal/class/opal_object.h
> > +++ b/opal/class/opal_object.h
> > @@ -464,7 +464,11 @@ static inline opal_object_t
> *opal_obj_new(opal_class_t * cl
> > opal_object_t *object;
> > assert(cls->cls_sizeof >= sizeof(opal_object_t));
> >
> > +#if OPAL_ENABLE_DEBUG
> > +object = (opal_object_t *) calloc(1, cls->cls_sizeof);
> > +#else
> > object = (opal_object_t *) malloc(cls->cls_sizeof);
> > +#endif
> > if (0 == cls->cls_initialized) {
> > opal_class_initialize(cls);
> > }
> > -
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16001.php
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16004.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] RFC: calloc instead of malloc in opal_obj_new()

2014-10-03 Thread Paul Hargrove
Jeff,

Using calloc() only subject to --with-valgrind sounds good to me.
If I'd known such an option existed, I'd not have suggested the MCA param
idea.
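
For illustration, a compile-time variant might look roughly like the sketch below. DEMO_WITH_VALGRIND is a hypothetical macro standing in for whatever --with-valgrind ends up defining, and demo_msg is an invented struct, chosen only because it has a padding hole on typical ABIs. The macro defaults to 1 here so that the calloc() path is the one shown in action.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifndef DEMO_WITH_VALGRIND
#define DEMO_WITH_VALGRIND 1  /* assumption: a real build would set this from configure */
#endif

/* Invented example struct: on typical ABIs the compiler inserts padding
 * between tag and value so that value is suitably aligned. */
struct demo_msg {
    char   tag;
    double value;
};

/* Number of bytes in the struct that belong to no member (the "holes"). */
size_t demo_hole_bytes(void)
{
    assert(offsetof(struct demo_msg, value) >= sizeof(char));
    return sizeof(struct demo_msg) - (sizeof(char) + sizeof(double));
}

/* Allocate a message. Only valgrind-enabled builds pay for zeroing, so
 * the holes are defined when the struct is later memcpy()ed or
 * write()n as a whole. */
struct demo_msg *demo_msg_new(void)
{
#if DEMO_WITH_VALGRIND
    return calloc(1, sizeof(struct demo_msg));
#else
    return malloc(sizeof(struct demo_msg));
#endif
}
```

This ties the zeroing to a build option that is orthogonal to debug vs. optimized builds, which is the property that makes the compromise attractive.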

-Paul

On Fri, Oct 3, 2014 at 3:33 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
wrote:

> How about a compromise -- how about enabling calloc() when --with-valgrind
> is specified on the command line?
>
> I.e., don't tie it to debug builds, but to valgrind-enabled builds?
>
>
> On Oct 3, 2014, at 6:11 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> > I agree with George that zeroing memory only in the debug builds could
> hide bugs, and thus would want to see the debug and non-debug builds have
> the same behavior (both malloc or both calloc).  So, I also agree this
> looks initially like a hard choice.
> >
> > What about using malloc() in non-debug builds and having a MCA param
> control malloc-vs-calloc in a debug build (with malloc being the default)?
> The param name could be something with "valgrind" in it to allow it to
> control any other "paranoid code" that may be introduced just to silence
> valgrind warnings.
> >
> > -Paul
> >
> > On Fri, Oct 3, 2014 at 3:02 PM, George Bosilca <bosi...@icl.utk.edu>
> wrote:
> > It's a tough call. This proposal will create significant differences
> between the debug and fast builds. As the entire objects will be set to
> zero this might reduce bugs in the debug build, bugs that will be horribly
> difficult to track in any non-debug builds. Moreover, if the structures are
> carefully accessed in our code, adding such a disruptive initialization
> just to prevent valgrind from reporting false positives about uninitialized
> reads in memcpy is too costly as a solution (I am also aware that it
> will be almost impossible to write a valgrind suppression rule for the
> specific case you mention).
> >
> > Some parts of the code have (or at least had) some level of cleanness
> for the gaps in the structures. The solution was to minimally zero-fy the
> gaps, maintaining the same behavior between debug and non-debug builds.
> However, in order to do this one needs to know the layout of the structure,
> so this is not a completely generic solution...
> >
> >   George.
> >
> >
> > On Oct 3, 2014, at 16:54 , Jeff Squyres (jsquyres) <jsquy...@cisco.com>
> wrote:
> >
> > > WHAT: change the malloc() to calloc() in opal_obj_new() (perhaps only
> in debug builds?)
> > >
> > > WHY: Drastically reduces valgrind output
> > >
> > > WHERE: see
> https://github.com/open-mpi/ompi/blob/master/opal/class/opal_object.h#L462-L467
> > >
> > > TIMEOUT: teleconf, Tue, Oct 14 (there's no rush)
> > >
> > > MORE DETAIL:
> > >
> > > I was debugging some code today and came across a bunch of places
> where we write structs down various IPC mechanisms, and the structs contain
> holes.  In most places, the performance doesn't matter / the readability of
> struct members is more important, so we haven't re-ordered the structs to
> remove holes.  But consequently, those holes end up uninitialized, and
> therefore memcpy()ing or write()ing instances of these structs causes
> valgrind to emit warnings.
> > >
> > > The patch below eliminates most (all?) of these valgrind warnings --
> in debug builds, it changes the malloc() inside OBJ_NEW to a calloc().
> > >
> > > Upon a little more thought, however, I wonder if we use OBJ_NEW in any
> fast code paths (other than in bulk, such as when we need to grow a free
> list).  Specifically: would it be terrible to *always* calloc -- not just
> for debug builds?
> > >
> > > -
> > > diff --git a/opal/class/opal_object.h b/opal/class/opal_object.h
> > > index 7012bac..585f13e 100644
> > > --- a/opal/class/opal_object.h
> > > +++ b/opal/class/opal_object.h
> > > @@ -464,7 +464,11 @@ static inline opal_object_t
> *opal_obj_new(opal_class_t * cl
> > > opal_object_t *object;
> > > assert(cls->cls_sizeof >= sizeof(opal_object_t));
> > >
> > > +#if OPAL_ENABLE_DEBUG
> > > +object = (opal_object_t *) calloc(1, cls->cls_sizeof);
> > > +#else
> > > object = (opal_object_t *) malloc(cls->cls_sizeof);
> > > +#endif
> > > if (0 == cls->cls_initialized) {
> > > opal_class_initialize(cls);
> > > }
> > > -
> > >
> > > --
> > > Jeff Squyres
> > > jsquy...@cisco.com
> > > For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/le

Re: [OMPI devel] Fwd: Open MPI 1.8: link problem when Fortran+C+Platform LSF

2014-10-17 Thread Paul Hargrove
I know of two possibilities:

1) I cannot be certain but since the message concerns a PC-relative
addressing mode, it is possible that something needs to be compiled with
-fPIC to fix the issue.  See if adding that option to any of the mpicc
commands helps.

2) Try adding ONE of "-ll", "-lfl" or "-lfl_pic" to include the lex/flex
support lib.   This is PROBABLY the wrong solution because that lib defines
its own "main()".

-Paul



On Fri, Oct 17, 2014 at 4:56 PM, Jeff Squyres (jsquyres)  wrote:

> I think the LSF part of this may be a red herring.  Do you really need to
> add "-lbat -llsf" to the command line to make it work?
>
> The error message *sounds* like y.tab.o was compiled differently than
> others...?  It's hard to know without seeing the output of mpicc --showme.
>
>
> On Oct 17, 2014, at 7:51 AM, Ralph Castain  wrote:
>
> > Forwarding this for Paul until his email address gets updated on the
> User list:
> >
> >> Begin forwarded message:
> >>
> >> Date: October 17, 2014 at 6:35:31 AM PDT
> >> From: Paul Kapinos 
> >> To: Open MPI Users 
> >> Cc: "Kapinos, Paul" , <
> fri...@cats.rwth-aachen.de>
> >> Subject: Open MPI 1.8: link problem when Fortran+C+Platform LSF
> >>
> >> Dear Open MPI developer,
> >>
> >> we have both Open MPI 1.6(.5) and 1.8(.3) in our cluster, configured to
> be used with Platform LSF.
> >>
> >> One of our users ran into an issue when trying to link his code
> (a combination of lex/C and Fortran) with v1.8, whereas with Open MPI 1.6
> the code can be linked OK.
> >>
> >>> $ make
> >>> mpif90 -c main.f90
> >>> yacc -d example4.y
> >>> mpicc -c y.tab.c
> >>> mpicc -c mymain.c
> >>> lex example4.l
> >>> mpicc -c lex.yy.c
> >>> mpif90 -o example main.o y.tab.o mymain.o lex.yy.o
> >>> ld: y.tab.o(.text+0xd9): unresolvable R_X86_64_PC32 relocation against
> symbol `yylval'
> >>> ld: y.tab.o(.text+0x16f): unresolvable R_X86_64_PC32 relocation
> against symbol `yyval'
> >>> ...
> >>
> >> Looking at the "mpif90 --show-me" output, we see that the link line, and
> possibly the philosophy behind it, has changed; there is also a note about
> it:
> >>
> >> # Note that per https://svn.open-mpi.org/trac/ompi/ticket/3422, we
> >> # intentionally only link in the MPI libraries (ORTE, OPAL, etc. are
> >> # pulled in implicitly) because we intend MPI applications to only use
> >> # the MPI API.
> >>
> >>
> >>
> >>
> >> Well, by now we know two workarounds:
> >> a) add "-lbat -llsf" to the link line
> >> b) add " -Wl,--as-needed" to the link line
> >>
> >> Which would be better? Maybe one of these should be added to
> "linker_flags=..." in the .../share/openmpi/mpif90-wrapper-data.txt file?
> Given the note above, would (b) be better?
> >>
> >> Best
> >>
> >> Paul Kapinos
> >>
> >> P.S. $ mpif90 --show-me
> >>
> >> 1.6.5
> >> ifort -nofor-main -I/opt/MPI/openmpi-1.6.5/linux/intel/include
> -fexceptions -I/opt/MPI/openmpi-1.6.5/linux/intel/lib
> -L/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib
> -L/opt/MPI/openmpi-1.6.5/linux/intel/lib -lmpi_f90 -lmpi_f77 -lmpi
> -losmcomp -lrdmacm -libverbs -lrt -lnsl -lutil -lpsm_infinipath -lbat -llsf
> -ldl -lm -lnuma -lrt -lnsl -lutil
> >>
> >> 1.8.3
> >> ifort -I/opt/MPI/openmpi-1.8.3/linux/intel/include
> -fexceptions -I/opt/MPI/openmpi-1.8.3/linux/intel/lib
> -L/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath
> -Wl,/opt/lsf/9.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath
> -Wl,/opt/MPI/openmpi-1.8.3/linux/intel/lib -Wl,--enable-new-dtags
> -L/opt/MPI/openmpi-1.8.3/linux/intel/lib -lmpi_usempif08
> -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
> >>
> >> P.S.2 $ man ld
> >> 
> >>   --as-needed
> >>   --no-as-needed
> >>   This option affects ELF DT_NEEDED tags for dynamic libraries
> >>   mentioned on the command line after the --as-needed option.
> >>   Normally the linker will add a DT_NEEDED tag for each dynamic
> >>   library mentioned on the command line, regardless of whether
> >>   the library is actually needed or not.  --as-needed causes a
> >>   DT_NEEDED tag to only be emitted for a library that satisfies
> >>   an undefined symbol reference from a regular object file or,
> >>   if the library is not found in the DT_NEEDED lists of other
> >>   libraries linked up to that point, an undefined symbol
> >>   reference from another dynamic library.  --no-as-needed
> >>   restores the default behaviour.
> >>
> >> 
> >>
> >> --
> >> Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
> >> RWTH Aachen University, IT Center
> >> Seffenter Weg 23,  D 52074  Aachen (Germany)
> >> Tel: +49 241/80-24915
> >>
> >
> > 
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> 

Re: [OMPI devel] Open MPI 1.8: link problem when Fortran+C+Platform LSF

2014-10-20 Thread Paul Hargrove
Markus,

I based my suggestion on the presence of certain keywords in the error
message, not on any mental model of the compiler or linker action on your
input.  I don't think there is any valid reason one should *expect* a need
to compile or link with "mpif90 -fPIC".  So, I am afraid I cannot answer as
to why this fixes the problem.

-Paul

On Sun, Oct 19, 2014 at 10:44 PM, Frings, Markus <fri...@cats.rwth-aachen.de
> wrote:

>  Compiling the sources with -fPIC fixes the issue. But I wonder why I have
> to add -fPIC when I want to compile with mpif90, but not when I use ifort
> directly. With mpif90 I also use ifort with some additional flags and
> libraries as mpif90 --show-me shows.
>
>
> Markus Frings, M.Sc.
>
>  Chair for Computational Analysis of Technical Systems (CATS)
> RWTH Aachen University
> Schinkelstrasse 2, Room 222a
> D-52062 Aachen
>
>  Phone +49 (0)241 80 99932
> fri...@cats.rwth-aachen.de
> http://www.cats.rwth-aachen.de
>
>  On 18.10.2014, at 02:24, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>  I know of two possibilities:
>
>  1) I cannot be certain but since the message concerns a PC-relative
> addressing mode, it is possible that something needs to be compiled with
> -fPIC to fix the issue.  See if adding that option to any of the mpicc
> commands helps.
>
> 2) Try adding ONE of "-ll", "-lfl" or "-lfl_pic" to include the lex/flex
> support lib.   This is PROBABLY the wrong solution because that lib defines
> its own "main()".
>
>  -Paul
>
>
>
> On Fri, Oct 17, 2014 at 4:56 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
>
>> I think the LSF part of this may be a red herring.  Do you really need to
>> add "-lbat -llsf" to the command line to make it work?
>>
>> The error message *sounds* like y.tab.o was compiled differently than
>> others...?  It's hard to know without seeing the output of mpicc --showme.
>>
>>
>> On Oct 17, 2014, at 7:51 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> > Forwarding this for Paul until his email address gets updated on the
>> User list:
>> >
>> >> Begin forwarded message:
>> >>
>> >> Date: October 17, 2014 at 6:35:31 AM PDT
>> >> From: Paul Kapinos <kapi...@itc.rwth-aachen.de>
>> >> To: Open MPI Users <us...@open-mpi.org>
>> >> Cc: "Kapinos, Paul" <kapi...@itc.rwth-aachen.de>, <
>> fri...@cats.rwth-aachen.de>
>> >> Subject: Open MPI 1.8: link problem when Fortran+C+Platform LSF
>> >>
>> >> Dear Open MPI developer,
>> >>
>> >> we have both Open MPI 1.6(.5) and 1.8(.3) in our cluster, configured
>> to be used with Platform LSF.
>> >>
>> >> One of our users ran into an issue when trying to link his code
>> (a combination of lex/C and Fortran) with v1.8, whereas with Open MPI 1.6
>> the code can be linked OK.
>> >>
>> >>> $ make
>> >>> mpif90 -c main.f90
>> >>> yacc -d example4.y
>> >>> mpicc -c y.tab.c
>> >>> mpicc -c mymain.c
>> >>> lex example4.l
>> >>> mpicc -c lex.yy.c
>> >>> mpif90 -o example main.o y.tab.o mymain.o lex.yy.o
>> >>> ld: y.tab.o(.text+0xd9): unresolvable R_X86_64_PC32 relocation
>> against symbol `yylval'
>> >>> ld: y.tab.o(.text+0x16f): unresolvable R_X86_64_PC32 relocation
>> against symbol `yyval'
>> >>> ...
>> >>
>> >> Looking at the "mpif90 --show-me" output, we see that the link line, and
>> possibly the philosophy behind it, has changed; there is also a note about
>> it:
>> >>
>> >> # Note that per https://svn.open-mpi.org/trac/ompi/ticket/3422, we
>> >> # intentionally only link in the MPI libraries (ORTE, OPAL, etc. are
>> >> # pulled in implicitly) because we intend MPI applications to only use
>> >> # the MPI API.
>> >>
>> >>
>> >>
>> >>
>> >> Well, by now we know two workarounds:
>> >> a) add "-lbat -llsf" to the link line
>> >> b) add " -Wl,--as-needed" to the link line
>> >>
>> >> Which would be better? Maybe one of these should be added to
>> "linker_flags=..." in the .../share/openmpi/mpif90-wrapper-data.txt file?
>> Given the note above, would (b) be better?
>> >>
>> >> Best
>> >>
>> >> Paul Kapinos
>> >

Re: [OMPI devel] Deprecated call in sharedfp framework

2014-10-24 Thread Paul Hargrove
I can shed some light on these warnings.

sem_init() and sem_destroy() are POSIX-defined interfaces for UNNAMED
semaphores.
There are also POSIX interfaces, sem_{open,close,unlink}(), that operate on
NAMED semaphores.
See for more info:
   http://pubs.opengroup.org/onlinepubs/009695399/basedefs/semaphore.h.html

According to the following link, Mac OS X only implements the NAMED
semaphores, and I would guess they are now deprecating the ones that just
return -1 and set errno=ENOSYS:
  http://stackoverflow.com/questions/1413785/sem-init-on-os-x
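
For reference, a minimal sketch of the NAMED-semaphore interfaces, assuming a
POSIX system.  This is not the sharedfp code; the function name and the
"/sketch_sem_<pid>" name pattern are hypothetical and only for illustration.

```c
#include <fcntl.h>      /* O_CREAT, O_EXCL */
#include <semaphore.h>  /* sem_open, sem_wait, sem_post, sem_close, sem_unlink */
#include <stdio.h>      /* snprintf */
#include <unistd.h>     /* getpid */

/* Create, use, and remove a NAMED semaphore.
 * Returns 0 on success, -1 on failure.
 * The name pattern "/sketch_sem_<pid>" is hypothetical. */
int named_sem_roundtrip(void)
{
    char name[32];
    sem_t *sem;
    int rc = -1;

    snprintf(name, sizeof(name), "/sketch_sem_%ld", (long) getpid());

    /* O_EXCL makes creation fail if the name already exists. */
    sem = sem_open(name, O_CREAT | O_EXCL, 0600, 1);
    if (SEM_FAILED == sem) {
        return -1;
    }

    /* Acquire and release once, as a mutex-style user would. */
    if (0 == sem_wait(sem) && 0 == sem_post(sem)) {
        rc = 0;
    }

    /* sem_close() drops this process's handle; sem_unlink() removes the name. */
    if (0 != sem_close(sem)) {
        rc = -1;
    }
    if (0 != sem_unlink(name)) {
        rc = -1;
    }
    return rc;
}
```

Unlike sem_init(), which initializes a sem_t at an address the caller supplies,
sem_open() returns a pointer to a semaphore identified by name, so code that
keeps the sem_t inside a shared-memory segment would need restructuring rather
than a one-line substitution.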

-Paul


On Fri, Oct 24, 2014 at 1:45 PM, Edgar Gabriel  wrote:

> Yes, will have a look at it next week.
>
> Thanks
> Edgar
>
> On 10/24/2014 12:01 PM, Jeff Squyres (jsquyres) wrote:
>
>> Edgar -- can you have a look?
>>
>>
>> On Oct 24, 2014, at 12:04 PM, Ralph Castain  wrote:
>>
>>  I'm not sure who owns that framework, but I'm seeing this warning:
>>>
>>> sharedfp_sm_file_open.c: In function 'mca_sharedfp_sm_file_open':
>>> sharedfp_sm_file_open.c:159:5: warning: 'sem_init' is deprecated
>>> (declared at /usr/include/sys/semaphore.h:55)
>>> [-Wdeprecated-declarations]
>>>   if(sem_init(&sm_offset_ptr->mutex, 1, 1) != -1){
>>>   ^
>>> sharedfp_sm_file_open.c: In function 'mca_sharedfp_sm_file_close':
>>> sharedfp_sm_file_open.c:214:13: warning: 'sem_destroy' is deprecated
>>> (declared at /usr/include/sys/semaphore.h:53)
>>> [-Wdeprecated-declarations]
>>>   sem_destroy(&sm_data->sm_offset_ptr->mutex);
>>>   ^
>>>
>>>
>>> This is with gcc (MacPorts gcc49 4.9.1_0) 4.9.1
>>> Ralph
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post: http://www.open-mpi.org/
>>> community/lists/devel/2014/10/16088.php
>>>
>>
>>
>>
> --
> Edgar Gabriel
> Associate Professor
> Parallel Software Technologies Lab  http://pstl.cs.uh.edu
> Department of Computer Science  University of Houston
> Philip G. Hoffman Hall, Room 524  Houston, TX-77204, USA
> Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/10/
> 16090.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] errno and reentrance

2014-10-27 Thread Paul Hargrove
On Mon, Oct 27, 2014 at 2:42 AM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:
[...]

> Paul, since you have access to many platforms, could you please run this
> test with and without -D_REENTRANT / -D_THREAD_SAFE
> and tell me where the program produces incorrect behaviour (output is
> KO...) without the flag ?
>
> Thanks in advance,
>
> Gilles
>

Gilles,

I have a lot of things due between now and the SC14 conference.
I've added this test to my to-do list, but cannot be sure of how soon I'll
be able to get results back to you.

Feel free to remind me off-list,
-Paul



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] errno and reentrance

2014-10-27 Thread Paul Hargrove
Gilles,

I responded too quickly, not thinking that this test is pretty quick and
doesn't require that I try sparc, ppc, ia64, etc.
So my results:

Solaris-{10,11}:
  With "cc" I agree with your findings (need -D_REENTRANT for correct
behavior).
  With gcc either "-pthread" or "-D_REENTRANT" gave correct behavior

NetBSD-5:
  Got "KO: error 4 (0)" no matter what I tried

Linux,  FreeBSD-{9,10}, NetBSD-6, OpenBSD-5:
  Using "-pthread" or "-lpthread" was necessary to link, and sufficient for
correct results.

MacOSX-10.{5,6,7,8}:
  No compiler options were required for 'cc' (which has been gcc, llvm-gcc
and clang through those OS revs)

Though I have access, I did not try compute nodes on BG/Q or Cray X{E,K,C}.
Let me know if any of those are of significant concern.

I no longer have AIX or IRIX access.
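
The test program itself is not reproduced in this thread.  A minimal sketch of
this kind of check, assuming POSIX threads, is below; it is not Gilles's test,
and the function names are hypothetical.  It compares the address of errno as
seen by two threads, so it can be built with and without -D_REENTRANT (or
-pthread) to compare the result.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Runs in the child thread: record where this thread's errno lives. */
static void *record_errno_addr(void *arg)
{
    *(uintptr_t *) arg = (uintptr_t) &errno;
    return NULL;
}

/* Returns 1 if a second thread sees its own errno, 0 if errno is shared
 * with the calling thread, and -1 if the thread could not be run. */
int errno_is_per_thread(void)
{
    pthread_t tid;
    uintptr_t child_addr = 0;

    if (0 != pthread_create(&tid, NULL, record_errno_addr, &child_addr)) {
        return -1;
    }
    if (0 != pthread_join(tid, NULL)) {
        return -1;
    }
    /* Different addresses mean each thread has a private errno. */
    return child_addr != (uintptr_t) &errno;
}
```

A result of 0 would correspond to the incorrect case, where a failing call in
one thread can overwrite the errno value another thread is about to read.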

-Paul


On Mon, Oct 27, 2014 at 2:48 AM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

>  Thanks Paul !
>
> Gilles
>
> On 2014/10/27 18:47, Paul Hargrove wrote:
>
> On Mon, Oct 27, 2014 at 2:42 AM, Gilles Gouaillardet 
> <gilles.gouaillar...@iferc.org> wrote:
> [...]
>
>
>  Paul, since you have access to many platforms, could you please run this
> test with and without -D_REENTRANT / -D_THREAD_SAFE
> and tell me where the program produces incorrect behaviour (output is
> KO...) without the flag ?
>
> Thanks in advance,
>
> Gilles
>
>
>  Gilles,
>
> I have a lot of things due between now and the SC14 conference.
> I've added this test to my to-do list, but cannot be sure of how soon I'll
> be able to get results back to you.
>
> Feel free to remind me off-list,
> -Paul
>
>
>
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/10/16095.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16096.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard 
wrote:

>
>> We may no longer require those as you have separated the Cray check out,
>> but the original problem is that we would pickup the Slurm components on
>> the Cray because we would find pmi.h
>>
>> Oh, I forgot about that.
>

In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
So far that has been sufficient to disambiguate the implementations.
One might also try checking libpmi for Cray's extensions.
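
GASNet's actual check lives in its configure script and is not shown here.
As a rough compile-time analogue of the same ordering, assuming a compiler
that supports __has_include (C23, and available in recent GCC and Clang), one
could probe for "pmi_cray.h" first and fall back to "pmi.h".  The macro and
function names below are hypothetical.

```c
/* Hypothetical compile-time probe: prefer pmi_cray.h over pmi.h.
 * 2 = Cray PMI header found, 1 = only a generic pmi.h found, 0 = neither. */
#if defined(__has_include)
#  if __has_include(<pmi_cray.h>)
#    define SKETCH_PMI_FLAVOR 2
#  elif __has_include(<pmi.h>)
#    define SKETCH_PMI_FLAVOR 1
#  endif
#endif
#ifndef SKETCH_PMI_FLAVOR
#  define SKETCH_PMI_FLAVOR 0
#endif

/* Report which header the probe above selected. */
int sketch_pmi_flavor(void)
{
    return SKETCH_PMI_FLAVOR;
}
```

The order matters because a Cray system may provide both headers: testing for
"pmi.h" first would match on Cray and non-Cray systems alike.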

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
Ralph,

The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well).
That is why I said our configure logic checks for pmi_cray.h *first*.
Sorry if that wasn't clear.

On NERSC's XE6:

{hargrove@hopper06 ~}$ ls /opt/cray/pmi/default/include/
pmi2.h  pmi_cray_ext.h  pmi_cray.h  pmi.h  pmi_version.h
{hargrove@hopper06 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
cray-libpmi-devel-4.0.1-1..9753.86.3.gem


On NERSC's XC30:

{hargrove@edison08 ~}$ ls /opt/cray/pmi/default/include/
pmi.h  pmi2.h  pmi_cray.h  pmi_cray_ext.h  pmi_version.h
{hargrove@edison08 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
cray-libpmi-devel-5.0.5-1..10300.134.8.ari


-Paul

On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain <r...@open-mpi.org> wrote:

>
> On Oct 28, 2014, at 11:59 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>
> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard <hpprit...@gmail.com>
> wrote:
>
>>
>>> We may no longer require those as you have separated the Cray check out,
>>> but the original problem is that we would pickup the Slurm components on
>>> the Cray because we would find pmi.h
>>>
>>> Oh, I forgot about that.
>>
>
> In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
>
>
> Hmmm...on LANL's Cray systems, it was still labeled "pmi.h"
>
> So far that has been sufficient to disambiguate the implementations.
> One might also try checking libpmi for Cray's extensions.
>
> -Paul
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16114.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16115.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
On Tue, Oct 28, 2014 at 12:20 PM, Ralph Castain <r...@open-mpi.org> wrote:

> On Oct 28, 2014, at 12:17 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Ralph,
>
> The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well).
>
>
> I understand that - I was questioning if that is universally true or not.
> IF we are guaranteed that nobody with a Cray ever renames pmi_cray.h to
> pmi.h, THEN your check will be fine. Otherwise, we can't trust it.
>
> And I seem to recall that the earlier Crays, at least, didn't have this
> naming distinction - or at least, not at LANL. Hence my question.
>

Fair enough.
I would say anybody moving or renaming files provided by Cray gets what
they deserve. However, since I have no way to confirm older or future
systems, I cannot answer your question with an affirmative.

What about checking for the presence of pmi_cray_ext.h?
Is that any better?

So, if one is not going to trust ANY filenames, one might instead see if
pmi.h and libpmi.* provide Cray's extensions.  If there are Cray extensions
used by OPAL/ORTE/OMPI, then checking for those would be "the right way"
anyway.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
By Howard's definition I guess NERSC's Hopper (XE6) qualifies as "very old"
at PrgEnv 4.2.34

{hargrove@hopper06 ~}$ pkg-config --cflags cray-pmi
Package cray-alpslli was not found in the pkg-config search path.
Perhaps you should add the directory containing `cray-alpslli.pc'
to the PKG_CONFIG_PATH environment variable
Package 'cray-alpslli', required by 'cray-pmi', not found

-Paul


On Tue, Oct 28, 2014 at 1:05 PM, Howard Pritchard <hpprit...@gmail.com>
wrote:

> Hi Folks,
>
> The simplest and best way on Cray is to use the pkg-config command.
> No looking for odd header file names, etc.  There is a minor issue
> with external login nodes running very old releases (like CLE 4.X) that
> one has to work around, but otherwise it works well.
>
> pkg-config --cflags cray-pmi
>
> etc. etc.
>
> The .pc files for the various Cray software packages are supposed to include
> all dependencies on header files, libs, etc. from other Cray packages.
>
> Howard
>
>
>
>
> 2014-10-28 13:20 GMT-06:00 Ralph Castain <r...@open-mpi.org>:
>
>>
>> On Oct 28, 2014, at 12:17 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> Ralph,
>>
>> The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well).
>>
>>
>> I understand that - I was questioning if that is universally true or not.
>> IF we are guaranteed that nobody with a Cray ever renames pmi_cray.h to
>> pmi.h, THEN your check will be fine. Otherwise, we can't trust it.
>>
>> And I seem to recall that the earlier Crays, at least, didn't have this
>> naming distinction - or at least, not at LANL. Hence my question.
>>
>>
>> That is why I said our configure logic checks for pmi_cray.h *first*.
>> Sorry if that wasn't clear.
>>
>> On NERSC's XE6:
>>
>> {hargrove@hopper06 ~}$ ls /opt/cray/pmi/default/include/
>> pmi2.h  pmi_cray_ext.h  pmi_cray.h  pmi.h  pmi_version.h
>> {hargrove@hopper06 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
>> cray-libpmi-devel-4.0.1-1..9753.86.3.gem
>>
>>
>> On NERSC's XC30:
>>
>> {hargrove@edison08 ~}$ ls /opt/cray/pmi/default/include/
>> pmi.h  pmi2.h  pmi_cray.h  pmi_cray_ext.h  pmi_version.h
>> {hargrove@edison08 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
>> cray-libpmi-devel-5.0.5-1..10300.134.8.ari
>>
>>
>> -Paul
>>
>> On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>>
>>> On Oct 28, 2014, at 11:59 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>>
>>>
>>> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard <hpprit...@gmail.com>
>>> wrote:
>>>
>>>>
>>>>> We may no longer require those as you have separated the Cray check
>>>>> out, but the original problem is that we would pickup the Slurm components
>>>>> on the Cray because we would find pmi.h
>>>>>
>>>>> Oh, I forgot about that.
>>>>
>>>
>>> In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
>>>
>>>
>>> Hmmm...on LANL's Cray systems, it was still labeled "pmi.h"
>>>
>>> So far that has been sufficient to disambiguate the implementations.
>>> One might also try checking libpmi for Cray's extensions.
>>>
>>> -Paul
>>>
>>>
>>> --
>>> Paul H. Hargrove  phhargr...@lbl.gov
>>> Future Technologies Group
>>> Computer and Data Sciences Department Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>
>>
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>  ___

Re: [OMPI devel] Error: undefined reference `__builtin_va_gparg1'

2014-10-29 Thread Paul Hargrove
Amit,

You appear to be mixing PGI and GNU compilers, as shown by the "g++" in the
final portion of your output.
You must configure Open MPI with all compilers (C, C++ and Fortran) from
the same "family".

-Paul


On Wed, Oct 29, 2014 at 1:11 PM, Kumar, Amit  wrote:

> Dear Developers,
>
> I have run into the following errors while compiling Open MPI (both versions
> 1.8.2 and 1.8.3) using PGI-13.2.
>
> Any idea where I could locate the definition of `__builtin_va_gparg1'?
>
> Any help is greatly appreciated.
>
> Regards,
> Amit
>
> Making all in tool
> make[7]: Entering directory
> `/grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools/opari/tool'
>   CXX  opari-handler.o
>   CXX  opari-ompragma.o
>   CXX  opari-ompragma_c.o
>   CXX  opari-ompragma_f.o
>   CXX  opari-ompregion.o
>   CXX  opari-opari.o
>   CXX  opari-process_c.o
>   CXX  opari-process_f.o
>   CXX  opari-process_omp.o
> ln -s ../../../util/util.c
>   CC   util.o
>   CXXLD    opari
> util.o: In function `guess_strlen':
> /grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools/opari/tool/./util.c:51:
> undefined reference to `__builtin_va_gparg1'
> /grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools/opari/tool/./util.c:55:
> undefined reference to `__builtin_va_gparg1'
> /grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools/opari/tool/./util.c:69:
> undefined reference to `__builtin_va_gparg1'
> /grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools/opari/tool/./util.c:82:
> undefined reference to `__builtin_va_gparg1'
> /grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools/opari/tool/./util.c:91:
> undefined reference to `__builtin_va_gparg1'
> util.o:/grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools/opari/tool/./util.c:107:
> more undefined references to `__builtin_va_gparg1' follow
> collect2: ld returned 1 exit status
> make[7]: *** [opari] Error 1
> make[7]: Leaving directory
> `/grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools/opari/tool'
> make[6]: *** [all-recursive] Error 1
> make[6]: Leaving directory
> `/grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools/opari'
> make[5]: *** [all-recursive] Error 1
> make[5]: Leaving directory
> `/grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools'
> make[4]: *** [all-recursive] Error 1
> make[4]: Leaving directory
> `/grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt'
> make[3]: *** [all] Error 2
> make[3]: Leaving directory
> `/grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt'
> make[2]: *** [all-recursive] Error 1
> make[2]: Leaving directory
> `/grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory
> `/grid/software/admin/root/packages/build/openmpi-1.8.3/ompi'
> make: *** [all-recursive] Error 1
>
> # cd
> /grid/software/admin/root/packages/build/openmpi-1.8.3/ompi/contrib/vt/vt/tools/opari/tool
> # make -n
> rm -f opari
> echo "  CXXLD   " opari;/bin/sh ../../../libtool --silent --tag=CXX
>  --mode=link g++ -DOPARI_VT -O3 -DNDEBUG -finline-functions -pthread   -o
> opari opari-handler.o opari-ompragma.o opari-ompragma_c.o
> opari-ompragma_f.o opari-ompregion.o opari-opari.o opari-process_c.o
> opari-process_f.o opari-process_omp.o util.o  -lrt -lut
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16128.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Paul Hargrove
On Mon, Nov 3, 2014 at 8:29 AM, Dave Goodell (dgoodell) 
wrote:

> > btw, is there a push option to abort if that would make github history
> non linear ?
>
> No, not really.  There are some options to "pull" to prevent you from
> creating a merge commit, but the fix when you encounter that situation
> would simply be to rebase in some fashion, so you might as well just do
> that every time.
>

The "some options" Dave is referring to is probably
    git pull --ff-only
I have this aliased to "git ff" and use it instead of "git pull".

If your pull would require a merge, this will tell you so and not make any
changes.
As Dave says, "git pull --rebase" is *probably* going to be your next step
if "git pull --ff-only" fails.  However, that is not *always* the case.
Sometimes you may wish to "stash" or create a new branch for the local
modifications.

I prefer "git pull --ff-only" because it allows (some may say "forces") me
to examine the situation before I create non-linear history.  Without it,
imagine what happens when I login to some machine I seldom use, and there
are local mods from some experiment I had totally forgotten about.
- If I do a plain "git pull" I get a merge I probably didn't want
- If I do "git pull --rebase" then my local mods are (silently unless you
look carefully) rebased on the new tip.
In either of the above cases I may find myself resolving conflicts for
changes I didn't even remember making.

So, I favor "git pull --ff-only" because in the case of no local mods it
just updates my local repo, and otherwise I get to examine the local
changes before I let git merge or rebase them.  If you are familiar enough
with using "stash", you can even choose to ignore the local changes for now
and get on with the task at hand.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Paul Hargrove
IIRC it was not possible to merge with a dirty tree with git 1.7.
So, Dave, you may have been bitten in those dark days.
-Paul

On Mon, Nov 3, 2014 at 8:49 AM, Dave Goodell (dgoodell) 
wrote:

> On Nov 3, 2014, at 10:41 AM, Jed Brown  wrote:
>
> > "Dave Goodell (dgoodell)"  writes:
> >> Most of the time a "pull" won't succeed if you have uncommitted
> >> modifications in your tree, so I'm not sure how pull/commit/push would
> >> actually work for you.  Do you stash/unstash in the middle there?
> >
> > Git will happily do the pull/merge despite your dirty tree as long as
> > none of the dirty files are affected.  Linus says that he usually has
> > uncommitted changes in his tree when merging.
>
> Hmm... you can see how often I create proper merge commits on a dirty
> tree.  I must have been hit in the past by conflicting dirty files.
>
> -Dave
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16149.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900

