[OMPI devel] Suspicious warnings from gcc-4.5.0 (both 1.4.3rc1 and 1.5rc5)

2010-08-25 Thread Paul H. Hargrove
With both recent RCs I get the following suspicious warnings from 
gcc-4.5.0 on Linux/ia64


1.4.3rc1:

../../../../../ompi/mca/dpm/orte/dpm_orte.c:963:5: warning: attempt to 
free a non-heap object 'ompi_mpi_comm_null'
../../../../../ompi/mca/dpm/orte/dpm_orte.c:965:5: warning: attempt to 
free a non-heap object 'ompi_mpi_group_null'
../../../../../ompi/mca/dpm/orte/dpm_orte.c:967:5: warning: attempt to 
free a non-heap object 'ompi_mpi_errors_are_fatal'



1.5rc5:

../../../../../ompi/mca/dpm/orte/dpm_orte.c:990:5: warning: attempt to 
free a non-heap object 'ompi_mpi_comm_null'
../../../../../ompi/mca/dpm/orte/dpm_orte.c:992:5: warning: attempt to 
free a non-heap object 'ompi_mpi_group_null'
../../../../../ompi/mca/dpm/orte/dpm_orte.c:994:5: warning: attempt to 
free a non-heap object 'ompi_mpi_errors_are_fatal'


-Paul

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



[OMPI devel] 1.5rc5: warning from gcc-4.0.1 on Mac OS 10.4.11 (x86 and ppc)

2010-08-25 Thread Paul H. Hargrove

I get a warning building 1.5rc5 that appears unique to this Mac OS version.
It does NOT occur with
 Mac OS 10.5.8 for x86
 Mac OS 10.6.1 for x86
 Mac OS 10.5.8 for ppc
Nor does this occur with Open MPI 1.4.3rc1.

$ uname -a
Darwin irun.lbl.gov 8.11.1 Darwin Kernel Version 8.11.1: Wed Oct 10 
18:23:28 PDT 2007; root:xnu-792.25.20~1/RELEASE_I386 i386 i386


$ sw_vers
ProductName:Mac OS X
ProductVersion: 10.4.11
BuildVersion:   8S2167

$ gcc --version
i686-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build 5367)
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ [path_to]/openmpi-1.5rc5/configure
[...]

$ make
[...]
 CC libvt_mt_la-vt_pthreadwrap.lo
../../../../../../ompi/contrib/vt/vt/vtlib/vt_pthreadwrap.c: In function 
'VT_pthread_attr_getscope__':
../../../../../../ompi/contrib/vt/vt/vtlib/vt_pthreadwrap.c:365: 
warning: passing argument 1 of 'pthread_attr_getscope' discards 
qualifiers from pointer target type

[...]



Same for the equivalent ppc platform:

$ uname -a; echo; sw_vers; echo; gcc --version
Darwin iwalk.lbl.gov 8.11.0 Darwin Kernel Version 8.11.0: Wed Oct 10 
18:26:00 PDT 2007; root:xnu-792.24.17~1/RELEASE_PPC Power Macintosh powerpc


ProductName:Mac OS X
ProductVersion: 10.4.11
BuildVersion:   8S165

powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build 
5341)

Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

-Paul





--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



[OMPI devel] 1.5rc5 and 1.4.3rc1: type size warnings in ompi_rb_tree test

2010-08-25 Thread Paul H. Hargrove
Saw this one on Mac OS X 10.6.1 on an x86_64, but not on older Macs with 
their default 32-bit void*.

This is present in both the current RCs.
I've not checked if this is present on non-Mac systems.

$ uname -a; echo; sw_vers; echo; gcc --version
Darwin ijog.lbl.gov 10.0.0 Darwin Kernel Version 10.0.0: Fri Jul 31 
22:47:34 PDT 2009; root:xnu-1456.1.25~1/RELEASE_I386 i386


ProductName:Mac OS X
ProductVersion: 10.6.1
BuildVersion:   10B504

i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5646)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ [path_to]/openmpi-1.5rc5/configure
[...]

$ make
[...]

$ make check
[...]
Making check in class
make  ompi_rb_tree opal_bitmap opal_hash_table opal_list 
opal_value_array opal_pointer_array

 CC ompi_rb_tree.o
../../../test/class/ompi_rb_tree.c: In function 'test2':
../../../test/class/ompi_rb_tree.c:347: warning: cast to pointer from 
integer of different size
../../../test/class/ompi_rb_tree.c:365: warning: cast from pointer to 
integer of different size
../../../test/class/ompi_rb_tree.c:373: warning: cast from pointer to 
integer of different size

[...]
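
For illustration only, here is a minimal sketch of the usual way to avoid this class of warning when a small integer is stored in a void* slot: cast through intptr_t so the integer and pointer widths match. The helper names are hypothetical and are not taken from ompi_rb_tree.c; I have not checked what the test actually stores at those lines.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helpers, not taken from ompi_rb_tree.c: store a small
 * integer in a void* slot and read it back without size-mismatch casts. */
static void *demo_int_to_ptr(int value)
{
    /* Widen to intptr_t first so integer and pointer sizes agree. */
    return (void *)(intptr_t)value;
}

static int demo_ptr_to_int(void *p)
{
    /* Go back through intptr_t, then check the value fits in an int. */
    intptr_t wide = (intptr_t)p;
    assert(wide >= INT_MIN && wide <= INT_MAX);
    return (int)wide;
}
```

On an LP64 system a direct (void *)int_value cast is what triggers "cast to pointer from integer of different size"; the intermediate intptr_t makes the widening explicit. The round trip relies on implementation-defined conversions, which behave as expected on mainstream ABIs.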

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



[OMPI devel] 1.5rc5: opal_path_nfs test failure on GPFS filesystem

2010-08-25 Thread Paul H. Hargrove
Testing 1.5rc5 on Linux/PPC64 I get a test failure in "make check" that 
probably relates to the GPFS filesystems used on this machine.  Not sure 
if this is a serious error or just an annoyance:


$  cat /etc/SuSE-release
SUSE Linux Enterprise Server 10 (ppc)
VERSION = 10
PATCHLEVEL = 3

$ uname -a
Linux login2 2.6.16.60-0.67.1-ppc64 #1 SMP Thu Aug 5 10:54:46 UTC 2010 
ppc64 ppc64 ppc64 GNU/Linux


$ /lib64/libc.so.6
GNU C Library stable release version 2.4 (20090904), by Roland McGrath 
et al.

Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Configured for ppc64-suse-linux.
Compiled by GNU CC version 4.1.2 20070115 (SUSE Linux).
Compiled on a Linux 2.6.16 system on 2009-09-04.
Available extensions:
   crypt add-on version 2.1 by Michael Glad and others
   GNU Libidn by Simon Josefsson
   GNU libio by Per Bothner
   NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
   Native POSIX Threads Library by Ulrich Drepper et al
   BIND-8.2.3-T5B
Thread-local storage support included.
For bug reporting instructions, please see:
.

$ gcc -m64 --version
gcc (GCC) 4.1.2 20070115 (SUSE Linux)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ mount | grep gpfs
/dev/surveyor_software on /gpfs/software type gpfs 
(rw,mtime,dev=surveyor_software,autostart)
/dev/surveyor_home on /gpfs/home type gpfs 
(rw,mtime,dev=surveyor_home,autostart)


$ [path_to]/openmpi-1.5rc5/configure CC='gcc -m64' CXX='g++ -m64' 
F77='gfortran -m64' FC='gfortran -m64'

[...]

$ make
[...]

$ make check
[...]
gmake[3]: Entering directory 
`/gpfs/home/hargrove/tmp/openmpi-1.5rc5/BLD-64/test/util'

 CC opal_path_nfs.o
 CCLD   opal_path_nfs
gmake[3]: Leaving directory 
`/gpfs/home/hargrove/tmp/openmpi-1.5rc5/BLD-64/test/util'

gmake  check-TESTS
gmake[3]: Entering directory 
`/gpfs/home/hargrove/tmp/openmpi-1.5rc5/BLD-64/test/util'

Failure : Mismatch: input "/gpfs/software", expected:0 got:1

Failure : Mismatch: input "/gpfs/home", expected:0 got:1

SUPPORT: OMPI Test failed: opal_path_nfs() (2 of 17 failed)
FAIL: opal_path_nfs

1 of 1 test failed
Please report to http://www.open-mpi.org/community/help/

[...]



Same error occurs when configure is run with no argument (yielding an 
ILP32 build).


This test does not exist in 1.4.3rc1.

-Paul

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



Re: [OMPI devel] Suspicious warnings from gcc-4.5.0 (both 1.4.3rc1 and 1.5rc5)

2010-08-25 Thread Ralph Castain
Hi Paul

Much appreciate all your testing!

Quick question here: is this on compile, or were you trying to run something?

We haven't seen this before, but I'm wondering if it is due to us failing to 
initialize an object's fields. If so, then it might be we don't see it because 
those fields usually default to zero (looks like NULL), but you might see it if 
they don't on your system.

Ralph

On Aug 24, 2010, at 10:19 PM, Paul H. Hargrove wrote:

> With both recent RCs I get the following suspicious warnings from gcc-4.5.0 
> on Linux/ia64
> 
> 1.4.3rc1:
> 
> ../../../../../ompi/mca/dpm/orte/dpm_orte.c:963:5: warning: attempt to free a 
> non-heap object 'ompi_mpi_comm_null'
> ../../../../../ompi/mca/dpm/orte/dpm_orte.c:965:5: warning: attempt to free a 
> non-heap object 'ompi_mpi_group_null'
> ../../../../../ompi/mca/dpm/orte/dpm_orte.c:967:5: warning: attempt to free a 
> non-heap object 'ompi_mpi_errors_are_fatal'
> 
> 
> 1.5rc5:
> 
> ../../../../../ompi/mca/dpm/orte/dpm_orte.c:990:5: warning: attempt to free a 
> non-heap object 'ompi_mpi_comm_null'
> ../../../../../ompi/mca/dpm/orte/dpm_orte.c:992:5: warning: attempt to free a 
> non-heap object 'ompi_mpi_group_null'
> ../../../../../ompi/mca/dpm/orte/dpm_orte.c:994:5: warning: attempt to free a 
> non-heap object 'ompi_mpi_errors_are_fatal'
> 
> -Paul
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> HPC Research Department   Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] 1.5rc5 - warnings from Sun C 5.10

2010-08-25 Thread Rolf vandeVaart
With respect to the warnings with atomic.h, we have been down this road 
before.

Here is the ticket with the background information.

https://svn.open-mpi.org/trac/ompi/ticket/1500

Eventually, we decided to just live with the warnings.  However, I will 
take a look at George's two suggestions.


Rolf



On 08/24/10 21:28, George Bosilca wrote:

On Aug 24, 2010, at 20:40 , Paul H. Hargrove wrote:

  

"../../../openmpi-1.5rc5/opal/include/opal/sys/ia32/atomic.h", line 170: warning: 
impossible constraint for "%1" asm operand



   __asm__ __volatile__(
SMPLOCK "addl %1,%0"
:"=m" (*v)
:"ir" (i), "m" (*v));

The problem seems to come from the "ir". Based on a Sun blog post about gcc-style asm 
inlining support (http://blogs.sun.com/x86be/entry/gcc_style_asm_inlining_support), "i" is the 
immediate-integer constraint (any size) and "r" is the register constraint (any of rax, rbx, rcx, 
rdx, rbp, rsi, rdi, rsp, r8 - r15). As we don't apply our atomics only to immediates, I think we 
should drop the "i".

  

"../../../openmpi-1.5rc5/opal/include/opal/sys/ia32/atomic.h", line 170: 
warning: parameter in inline asm statement unused: %2



This one is trickier. Because of the %2, I suspect that the second (*v) on 
the inputs is not matched to the first (*v) on the outputs. While this might be 
significantly bad under some circumstances, in this case I think it can be 
safely ignored.

However I would like to try the following asm code instead with the SUN 
compiler:

   __asm__ __volatile__(
SMPLOCK "addl %1,%0"
:"+m" (*v)
:"r" (i));

  Thanks,
george.


  

"../../../openmpi-1.5rc5/opal/include/opal/sys/ia32/atomic.h", line 187: warning: 
impossible constraint for "%1" asm operand
"../../../openmpi-1.5rc5/opal/include/opal/sys/ia32/atomic.h", line 187: 
warning: parameter in inline asm statement unused: %2

"../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 145: Warning (Anachronism): Formal argument read_conversion_fn 
of type extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*) in call to MPI_Register_datarep(char*, 
extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" 
int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" int(*)(ompi_datatype_t*,int*,void*), void*) 
is being passed int(*)(void*,ompi_datatype_t*,int,void*,long long,void*).
"../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 146: Warning (Anachronism): Formal argument write_conversion_fn 
of type extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*) in call to MPI_Register_datarep(char*, 
extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" 
int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" int(*)(ompi_datatype_t*,int*,void*), void*) is 
being passed int(*)(void*,ompi_datatype_t*,int,void*,long long,void*).
"../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 147: Warning (Anachronism): Formal argument 
dtype_file_extent_fn of type extern "C" int(*)(ompi_datatype_t*,int*,void*) in call to MPI_Register_datarep(char*, 
extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" 
int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" int(*)(ompi_datatype_t*,int*,void*), void*) is 
being passed int(*)(ompi_datatype_t*,int*,void*).
"../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 172: Warning (Anachronism): Formal argument write_conversion_fn 
of type extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*) in call to MPI_Register_datarep(char*, 
extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" 
int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" int(*)(ompi_datatype_t*,int*,void*), void*) is 
being passed int(*)(void*,ompi_datatype_t*,int,void*,long long,void*).
"../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 173: Warning (Anachronism): Formal argument 
dtype_file_extent_fn of type extern "C" int(*)(ompi_datatype_t*,int*,void*) in call to MPI_Register_datarep(char*, 
extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" 
int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" int(*)(ompi_datatype_t*,int*,void*), void*) is 
being passed int(*)(ompi_datatype_t*,int*,void*).
"../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 197: Warning (Anachronism): Formal argument read_conversion_fn 
of type extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*) in call to MPI_Register_datarep(char*, 
extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" 
int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C" int(*)(ompi_datatype_t*,int*,void*), void*) is 
being passed int(*)(void*,ompi_datatype_t*,int,void*,long long,void*).
"../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 199: Warning (

Re: [OMPI devel] Suspicious warnings from gcc-4.5.0 (both 1.4.3rc1 and 1.5rc5)

2010-08-25 Thread Paul H. Hargrove

Ralph,

 This is seen when compiling Open MPI.  I suspect that gcc's analysis 
is seeing a free() call on a value it can prove did not come from 
malloc() (or equivalent).  However, if as you say the value is always 
NULL, then this would be a false alarm.


-Paul



--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



[OMPI devel] delivering SIGUSR2 to an ompi process

2010-08-25 Thread Steve Wise

Hey Open MPI wizards,

I'm trying to debug something in my library that gets loaded into my mpi 
processes when they are started via mpirun.  With other MPIs, I've been 
able to deliver SIGUSR2 to the process and trigger some debug code I 
have in my library that sets up a handler for SIGUSR2.  However, when I 
deliver SIGUSR2 to my process running under OMPI, the process just dies 
and mpirun logs this:


--
mpirun noticed that process rank 0 with PID 13568 on node hpc-cn2 exited 
on signal 12 (User defined signal 2).

--


Is there any way to allow SIGUSR2 to reach my library handler?

Does OMPI use SIGUSR1/2 for other purposes?

Is there some other clever way I can kick my library at runtime to dump 
its debug code?  Like maybe interface with the MPI debug code somehow so 
things like padb could trigger this debug logic?


Thanks in advance,

Steve.



Re: [OMPI devel] delivering SIGUSR2 to an ompi process

2010-08-25 Thread Ralph Castain
We don't use it - mpirun traps it and then propagates it by default to all 
remote procs.

What OMPI version is this?





[OMPI devel] Fwd: [OMPI svn-full] svn:open-mpi r23664

2010-08-25 Thread Jeff Squyres
FYI.  This is an ABI-changing commit.

We've unfortunately had some F90 API parameters wrong *for years* and no one 
noticed.

I'm inclined not to change this in the 1.4 series, just because it changes the 
ABI.  But it should go into 1.5.0 since we're already breaking ABI from 1.4.x 
-> 1.5.0.


Begin forwarded message:

> From: jsquy...@osl.iu.edu
> Date: August 25, 2010 12:46:37 PM EDT
> To: svn-f...@open-mpi.org
> Subject: [OMPI svn-full] svn:open-mpi r23664
> Reply-To: de...@open-mpi.org
> 
> Author: jsquyres
> Date: 2010-08-25 12:46:36 EDT (Wed, 25 Aug 2010)
> New Revision: 23664
> URL: https://svn.open-mpi.org/trac/ompi/changeset/23664
> 
> Log:
> Several EXTRA_STATE parameter types were erroneously "INTEGER" (they
> should be "INTEGER(kind=MPI_ADDRESS_KIND)").  This has been wrong for
> ''years''.  Apparently no one who uses the F90 bindings also uses MPI
> attributes.  Sigh.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] 1.5rc5 - warnings from Sun C 5.10

2010-08-25 Thread George Bosilca
Right, that was a pretty intense discussion. However, I don't think we talked 
about replacing the = by a +. The difference is that = means write and + means 
read&write. Here is the assembly output from gcc -O3:

with output "=m" (*v)           |  with output "+m" (*v)
and  input  "ir" (i), "m" (*v)  |  and  input  "r" (i)
                                |
_opal_atomic_add_32:            |  _opal_atomic_add_32:
LFB5:                           |  LFB5:
    pushq   %rbp                |      pushq   %rbp
LCFI3:                          |  LCFI3:
    movq    %rsp, %rbp          |      movq    %rsp, %rbp
LCFI4:                          |  LCFI4:
    movq    %rdi, -8(%rbp)      |      movq    %rdi, -8(%rbp)
    movl    %esi, -12(%rbp)     |      movl    %esi, -12(%rbp)
    movq    -8(%rbp), %rcx      |      movq    -8(%rbp), %rcx
    movl    -12(%rbp), %edx     |      movl    -12(%rbp), %edx
    movq    -8(%rbp), %rax      |      movq    -8(%rbp), %rax
    lock;addl %edx,(%rcx)       |      lock;addl %edx,(%rcx)
    movq    -8(%rbp), %rax      |      movq    -8(%rbp), %rax
    movl    (%rax), %eax        |      movl    (%rax), %eax
    leave                       |      leave
    ret                         |      ret

It generates multiple loads, as %rax is updated before the lock. Useless!

Now, if we put "=m" (*v) on the output and drop the (*v) from the input arguments, 
the code looks like this:

LFB7:
    pushq   %rbp
LCFI0:
    movq    %rsp, %rbp
LCFI1:
    movl    $0, -4(%rbp)
    leaq    -4(%rbp), %rax
    movl    $1, %edx
    lock
    addl %edx,(%rax)
    movl    -4(%rbp), %eax
    leave
    ret

Which is a LOT better. Not perfect, as it still generates a load after the locked 
addl, but this is because we wanted to return the (*v).

Thus the code should look at least like this:

static inline int32_t opal_atomic_add_32(volatile int32_t* v, int i)
{
   __asm__ __volatile__(
                        SMPLOCK "addl %1,%0"
                        :"=m" (*v)
                        :"r" (i));
   return (*v);  /* should be an atomic operation */
}

Now, if what we want back from this function is the __REAL__ result of the 
atomic addition, then the code is wrong. Well, mostly wrong under heavy usage 
(i.e. multiple threads doing atomics on the same variable).

Here is the opal_atomic_add_32 returning the correct result. This is similar to 
the atomic intrinsic called add_and_fetch.

static inline int32_t opal_atomic_add_32(volatile int32_t* v, int i)
{
   int ret = i;
   __asm__ __volatile__(
                        SMPLOCK "xaddl %1,%0"
                        :"=m" (*v), "+r" (ret)
                        );
   return ret+i;
}
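
For comparison, a self-contained sketch of the same add-and-fetch idea that can be compiled outside the Open MPI tree. The function name is hypothetical and this is not the code in atomic.h. It assumes a GCC-compatible compiler, marks the memory operand "+m" because xadd both reads and writes it, adds "memory" and "cc" clobbers, and falls back to the __sync_add_and_fetch builtin on non-x86 targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; a sketch, not the Open MPI implementation.
 * Atomically adds i to *v and returns the new value. */
static inline int32_t demo_atomic_add_fetch_32(volatile int32_t *v, int32_t i)
{
    assert(v != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int32_t old = i;
    /* xadd: the register receives the old *v, memory receives old + i. */
    __asm__ __volatile__(
        "lock; xaddl %1,%0"
        : "+m" (*v), "+r" (old)
        :
        : "memory", "cc");
    return old + i;
#else
    /* Assumption: the GCC __sync builtins are available on this target. */
    return __sync_add_and_fetch(v, i);
#endif
}
```

The returned value is computed from the register that xadd filled in, so it does not depend on a second, non-atomic load of *v.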

  george.


Re: [OMPI devel] Suspicious warnings from gcc-4.5.0 (both 1.4.3rc1 and 1.5rc5)

2010-08-25 Thread George Bosilca
In this particular case the compiler is both right and wrong. It is right to 
complain because, as Paul pointed out, there is a free on a non-malloced object 
(ompi_mpi_comm_null). However, this free is protected by the reference count 
going to zero, and this should never happen in this particular piece of code 
(hopefully!).

What we really need here is one of the following:
1) Simply decrease the reference count once, to signal that ompi_comm_parent 
is no longer using ompi_mpi_comm_null. Unfortunately, we don't have such 
a macro.

2) Since this code handles only statically allocated objects, remove the 
OBJ_RELEASE calls from the dyn_init code, and their counterparts (OBJ_RETAIN) in 
comm_init.c:166.
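
To make option 1 concrete, here is a rough sketch with a simplified, hypothetical object model. None of these names exist in OPAL, and the real OBJ_RETAIN/OBJ_RELEASE macros do more than this. The point is only that a "drop one reference" helper has no path to free(), so the compiler has nothing to warn about for static objects.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a reference-counted object. */
typedef struct {
    int refcount;
} demo_obj_t;

/* Statically allocated "null" object; it starts with one reference
 * held by the library itself. */
static demo_obj_t demo_comm_null = { 1 };

static void demo_retain(demo_obj_t *obj)
{
    obj->refcount++;
}

/* General release: frees the object when the last reference goes away.
 * Passing a static object is only safe while its count stays above 0. */
static void demo_release(demo_obj_t *obj)
{
    if (--obj->refcount == 0) {
        free(obj);
    }
}

/* Option 1: drop one reference with no path to free() at all. */
static void demo_drop_ref(demo_obj_t *obj)
{
    assert(obj->refcount > 1);
    obj->refcount--;
}
```

With demo_drop_ref() the static object keeps its initial reference forever, which matches the "should never reach zero" assumption without relying on it at the free() call site.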

  george.





Re: [OMPI devel] delivering SIGUSR2 to an ompi process

2010-08-25 Thread Steve Wise

On 08/25/2010 11:33 AM, Ralph Castain wrote:

We don't use it - mpirun traps it and then propagates it by default to all 
remote procs.

   


So I should send the signal to the mpirun process?



What OMPI version is this?

   


1.4.1







Re: [OMPI devel] Suspicious warnings from gcc-4.5.0 (both 1.4.3rc1 and 1.5rc5)

2010-08-25 Thread Jeff Squyres
Ralph pinged Edgar and me about this off-list -- I speculated that that chunk 
of code could be replaced with:

   OBJ_RELEASE(ompi_mpi_comm_parent);
   ompi_mpi_comm_parent = newcomm;

...but that was after only a quick look at the code.  :-)  There might well 
have been a good reason why it wasn't written that way in the first place.





-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] delivering SIGUSR2 to an ompi process

2010-08-25 Thread Ralph Castain

On Aug 25, 2010, at 11:26 AM, Steve Wise wrote:

> On 08/25/2010 11:33 AM, Ralph Castain wrote:
>> We don't use it - mpirun traps it and then propagates it by default to all 
>> remote procs.
>> 
>>   
> 
> So I should send the signal to the mpirun process?

Yes - however, note that it will be propagated to ALL processes in the job.

If you want to only get the signal in one proc, you can just do a "kill" to 
that specific process on its node. We don't trap signals on the application 
procs themselves, so your proc can do whatever it wants with it.


> 
> 
>> What OMPI version is this?
>> 
>>   
> 
> 1.4.1
> 
> 
>> On Aug 25, 2010, at 10:23 AM, Steve Wise wrote:
>> 
>>   
>>> Hey Open MPI wizards,
>>> 
>>> I'm trying to debug something in my library that gets loaded into my mpi 
>>> processes when they are started via mpirun.  With other MPIs, I've been 
>>> able to deliver SIGUSR2 to the process and trigger some debug code I have 
>>> in my library that sets up a handler for SIGUSR2.  However, when I deliver 
>>> SIGUSR2 to my process running under OMPI, the process just dies and mpirun 
>>> logs this:
>>> 
>>> --
>>> mpirun noticed that process rank 0 with PID 13568 on node hpc-cn2 exited on 
>>> signal 12 (User defined signal 2).
>>> --
>>> 
>>> 
>>> Is there any way to allow SIGUSR2 to reach my library handler?
>>> 
>>> Does OMPI use SIGUSR1/2 for other purposes?
>>> 
>>> Is there some other clever way I can kick my library at runtime to dump its 
>>> debug code?  Like maybe interface with the MPI debug code somehow so things 
>>> like padb could trigger this debug logic?
>>> 
>>> Thanks in advance,
>>> 
>>> Steve.
>>> 




Re: [OMPI devel] delivering SIGUSR2 to an ompi process

2010-08-25 Thread Steve Wise

On 08/25/2010 12:43 PM, Ralph Castain wrote:
> On Aug 25, 2010, at 11:26 AM, Steve Wise wrote:
>
>> On 08/25/2010 11:33 AM, Ralph Castain wrote:
>>> We don't use it - mpirun traps it and then propagates it by default to all
>>> remote procs.
>>
>> So I should send the signal to the mpirun process?
>
> Yes - however, note that it will be propagated to ALL processes in the job.
>
> If you want to only get the signal in one proc, you can just do a "kill" to
> that specific process on its node. We don't trap signals on the application
> procs themselves, so your proc can do whatever it wants with it.

Something is funny then.  When I send SIGUSR2 to the process itself -or- 
to the mpirun proc, it just kills the process and doesn't get to my sig 
handler.  And my same library works when I run the job using mvapich2.


I'll keep digging.

Thanks!

Steve.





Re: [OMPI devel] delivering SIGUSR2 to an ompi process

2010-08-25 Thread Ralph Castain
Could be a bug on our part. If you configure with --enable-debug, you can 
then set -mca odls_base_verbose 5 and (amidst a lot of other output) you'll see 
the signal being delivered to the proc if you sent it to mpirun.

If you send the signal directly to the proc yourself, we shouldn't touch 
it...unless we have a bug that does so.
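
A process that "exited on signal 12" took the default action for SIGUSR2, so at the moment of delivery the library's handler was apparently not the active disposition. One way to check this from inside the library is sketched below. It is only a minimal illustration using POSIX sigaction(); the function names (install_usr2_handler, usr2_handler_still_installed, usr2_was_seen) are hypothetical and are not part of Open MPI or of Steve's library.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose sigaction() under strict -std modes */
#endif
#include <signal.h>
#include <string.h>

/* Flag set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t usr2_flag = 0;

/* Hypothetical debug handler: only records that SIGUSR2 arrived.
 * The actual debug dump should happen later, outside the handler. */
static void on_usr2(int signo)
{
    (void)signo;
    usr2_flag = 1;
}

/* Install the handler.  Returns 0 on success, -1 on failure. */
int install_usr2_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_usr2;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;     /* restart interrupted system calls */
    return sigaction(SIGUSR2, &sa, NULL);
}

/* Query the current disposition without changing it.
 * Returns 1 if our handler is still installed, 0 if something has
 * replaced or reset it, and -1 if the query itself failed. */
int usr2_handler_still_installed(void)
{
    struct sigaction cur;
    if (sigaction(SIGUSR2, NULL, &cur) != 0) {
        return -1;
    }
    return cur.sa_handler == on_usr2;
}

/* Returns 1 once the handler has run at least once. */
int usr2_was_seen(void)
{
    return usr2_flag != 0;
}
```

Calling usr2_handler_still_installed() right after MPI_Init returns, and again shortly before sending the signal, would show whether the disposition changed somewhere in between. That does not identify who changed it, but it separates "the handler was never installed in this process" from "something reset it later".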

On Aug 25, 2010, at 12:04 PM, Steve Wise wrote:

> On 08/25/2010 12:43 PM, Ralph Castain wrote:
>> On Aug 25, 2010, at 11:26 AM, Steve Wise wrote:
>> 
>>   
>>> On 08/25/2010 11:33 AM, Ralph Castain wrote:
>>> 
>>>> We don't use it - mpirun traps it and then propagates it by default to all 
>>>> remote procs.
>>>> 
>>> So I should send the signal to the mpirun process?
>>> 
>> Yes - however, note that it will be propagated to ALL processes in the job.
>> 
>> If you want to only get the signal in one proc, you can just do a "kill" to 
>> that specific process on its node. We don't trap signals on the application 
>> procs themselves, so your proc can do whatever it wants with it.
>> 
>> 
>>   
> 
> Something is funny then.  When I send SIGUSR2 to the process itself -or- to 
> the mpirun proc, it just kills the process and doesn't get to my sig handler. 
>  And my same library works when I run the job using mvapich2.
> 
> I'll keep digging.
> 
> Thanks!
> 
> Steve.
> 
> 
> 




[OMPI devel] VT "platform" selection needs documentation

2010-08-25 Thread Paul H. Hargrove
I wanted to test builds of OpenMPI 1.5rc5 and 1.4.3rc1 on Linux/PPC64.  
As it happens the only such host I currently have access to is the 
front-end for a BG/P.  It was NOT my intention to build Open MPI (or 
VampirTrace) for the BG/P, but VT's configure logic decided I was on a 
BG/P and so built using the front-end compiler with the compute-node 
headers (which was a messy situation).


The autoconf for VT uses a macro ACVT_PLATFORM, located in
   openmpi-1.4.3rc1/ompi/contrib/vt/vt/m4/acinclude.pform.m4
   openmpi-1.5rc5/ompi/contrib/vt/vt/config/m4/acinclude.pform.m4

In 1.4.3rc1, that macro includes the following logic, which runs if 
--platform is not specified:


   linux*)
       AS_IF([test "$host_cpu" = "ia64" -a -f /etc/sgi-release],
             [PLATFORM=altix],
             [AS_IF([test "$host_cpu" = "powerpc64" -a -d /bgl/BlueLight],
                    [PLATFORM=bgl],
                    [AS_IF([test "$host_cpu" = "x86_64" -a -d /opt/xt-boot],
                           [PLATFORM=crayxt],
                           [PLATFORM=linux])])])
       ;;

In 1.5rc5 that has expanded to detect more platforms, including the BG/P 
where I was working:


   case $host_os in
       linux*)
           AS_IF([test "$host_cpu" = "ia64" -a -f /etc/sgi-release],
                 [PLATFORM=altix],
                 [AS_IF([test "$host_cpu" = "powerpc64" -a -d /bgl/BlueLight],
                        [PLATFORM=bgl],
                        [AS_IF([test "$host_cpu" = "powerpc64" -a -d /bgsys],
                               [PLATFORM=bgp],
                               [AS_IF([test "$host_cpu" = "x86_64" -a -d /opt/xt-boot],
                                      [PLATFORM=crayxt],
                                      [AS_IF([test "$host_cpu" = "mips64" -a -d /opt/sicortex],
                                             [PLATFORM=sicortex],
                                             [PLATFORM=linux])])])])])
           ;;


So the issue I have is that if building on the front-end for any of 
these specialized systems, I will get a VT build for the "back-end" 
unless I explicitly pass --platform=linux.  By itself that sounds OK, 
though something about this in ompi/contrib/vt/vt/INSTALL would be nice.


The NEXT problem comes from the fact that Open MPI's top-level configure 
has the OMPI_LOAD_PLATFORM macro which expects 
--with-platform=.  Thus there appears to be a conflict between 
the VT INSTALL documentation and the OMPI configure script.


I am not a newbie, so I DID find the desired solution: 
--with-contrib-vt-flags=--platform=linux
However, the only documentation I could find for --with-contrib-vt-flags 
in the source tree (as in "grep -R with-contrib-vt-flags") is the output 
of "configure --help".  I did eventually also find the related FAQ 
entry: http://www.open-mpi.org/faq/?category=vampirtrace#vt_options

but that was only AFTER I knew of the problem passing --platform.

So, 3 requests of the VT folks:

1) Document the fact that compilation on a "front-end" of various 
systems (Altix, BGP, BGL, SiCortex and CrayXT) will default to building 
for the "back-end" system if one doesn't explicitly set platform=linux.  
This would be good in the VT INSTALL, and in the VT-related FAQ page for 
OpenMPI.


2) Document --with-contrib-vt-flags in the Open MPI README, the VT 
INSTALL, or in both places.


3) Consider, for the BGL, BGP and CrayXT cases, checking in ACVT_PLATFORM 
whether the back-end compiler is being used.  I know that is not possible 
for the Altix and SiCortex.


-Paul

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



[OMPI devel] Problem w/ documented SPARC/gcc flags (1.5rc5 and 1.4.3rc1)

2010-08-25 Thread Paul H. Hargrove

In both 1.5rc5 and 1.4.3rc1, README says:
- Open MPI does not support the Sparc v8 CPU target, which is the
 default on Sun Solaris.  The v8plus (32 bit) or v9 (64 bit)
 targets must be used to build Open MPI on Solaris.  This can be
 done by including a flag in CFLAGS, CXXFLAGS, FFLAGS, and FCFLAGS,
 -xarch=v8plus for the Sun compilers, -mv8plus for GCC.

However, the -mv8plus flag DOES NOT work for me.
The following occurs for both 1.5rc5 and 1.4.3rc1:

$ uname -a
SunOS lem.lbl.gov 5.10 s10_69 sun4u sparc SUNW,Ultra-5_10

$ gcc --version
gcc (GCC) 3.3.2
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ [path_to]/configure --disable-mpi-f77 --disable-mpi-f90 CFLAGS=-mv8plus CXXFLAGS=-mv8plus

[...]
*** Assembler
[...]
checking if have Sparc v8+/v9 support... no
configure: WARNING: Sparc v8 target is not supported in this release of Open MPI.
configure: WARNING: You must specify the target architecture v8plus
configure: WARNING: (cc: -xarch=v8plus, gcc: -mcpu=v9) for CFLAGS, CXXFLAGS,
configure: WARNING: FFLAGS, and FCFLAGS to compile Open MPI in 32 bit mode on
configure: WARNING: Sparc processors
configure: error: Can not continue.


Following the recommendation from configure:
 $ [path_to]/configure --disable-mpi-f77 --disable-mpi-f90 CFLAGS=-mcpu=v9 CXXFLAGS=-mcpu=v9

DOES work for both of the current RCs.

So, I see a few possibilities:

1) -mv8plus SHOULD work (as -xarch=v8plus appears to w/ Sun C 5.10) but 
configure is unconditionally too strict.

OR
2) My gcc is older than what others have tested and configure is mistakenly 
thinking the ABI is wrong.

OR
3) -mcpu=v9 is the proper incantation and README needs correction.

No matter which of the above is correct, I suspect README and configure 
need to give the user the same information.


-Paul

P.S.  I can provide temporary machine access if needed to resolve this.
P.P.S.  I am /still/ not finished testing all the platforms available to 
me ;-)


--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



[OMPI devel] Some positive test results (1.5rc5 and 1.4.3rc1)

2010-08-25 Thread Paul H. Hargrove
I have mostly been sending the negative findings, but rest assured that I 
have had positive ones too.

I won't list them all, but wanted in particular to note that I have 
built successfully both 1.5rc5 and 1.4.3rc1 for the following transports 
that may or may not be getting tested by others:

  elan on a Linux/x86 host  gcc-3.4.0
  gm-1.6.4 on a Linux/x86 host  gcc-3.4.0
  gm-2.0.19 on a Linux/x86 host  gcc-3.4.0
  psm on a Linux/x86-64 host  gcc-3.4.6
I have not compiled or run any MPI apps, just "make check".

I have also had success with LP64 builds of both RCs on a Linux/ppc64 
host with gcc-4.1.2


-Paul

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



[OMPI devel] 1.5rc5: attribute((noreturn)) and pointers to functions

2010-08-25 Thread Paul H. Hargrove
Building 1.5rc5 with xlc on linux/ppc I see many instances of the 
following warnings


"../../../../orte/mca/ess/ess.h", line 61.16: 1506-959 (W) The attribute 
"noreturn" is not a valid type attribute and is ignored.
"../../../../orte/mca/errmgr/errmgr.h", line 134.16: 1506-959 (W) The 
attribute "noreturn" is not a valid type attribute and is ignored.


This is nearly the same as the Sun C 5.10 warning I reported in 
http://www.open-mpi.org/community/lists/devel/2010/08/8323.php


"../../../../openmpi-1.5rc5/orte/mca/ess/ess.h", line 61: warning: 
attribute "noreturn" may not be applied to variable, ignored
"../../../../openmpi-1.5rc5/orte/mca/errmgr/errmgr.h", line 138: 
warning: attribute "noreturn" may not be applied to variable, ignored


This indicates a common cause and suggests a common solution:

In both cases the configure probe for compiler support for the 
"noreturn" attribute has passed.  However, in both cases the compiler is 
not happy applying this attribute to a pointer-to-function (though gcc 
is apparently fine with this).  I believe that the solution to this is 
simply to add a "noreturn_funcptr" probe to 
opal/config/opal_check_attributes.m4, analogous to the format_funcptr 
probe, and then define and use a __opal_attribute_noreturn_funcptr__ as 
appropriate.
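
A rough sketch of what the header side of such a split could look like follows. It is an illustration under assumptions, not the actual Open MPI change: the SKETCH_HAVE_* feature macros stand in for whatever the configure probes would define, the sketch_* function and type names are hypothetical, and only __opal_attribute_noreturn_funcptr__ is the name proposed above.

```c
#include <stdlib.h>

/* Hypothetical feature-test macros.  In a real build the configure
 * probes would define these; the defaults here are for illustration. */
#ifndef SKETCH_HAVE_ATTRIBUTE_NORETURN
#  if defined(__GNUC__)
#    define SKETCH_HAVE_ATTRIBUTE_NORETURN 1
#  else
#    define SKETCH_HAVE_ATTRIBUTE_NORETURN 0
#  endif
#endif
#ifndef SKETCH_HAVE_ATTRIBUTE_NORETURN_FUNCPTR
#  define SKETCH_HAVE_ATTRIBUTE_NORETURN_FUNCPTR 0   /* conservative default */
#endif

/* Attribute used on ordinary function declarations. */
#if SKETCH_HAVE_ATTRIBUTE_NORETURN
#  define __opal_attribute_noreturn__ __attribute__((__noreturn__))
#else
#  define __opal_attribute_noreturn__
#endif

/* Separate attribute for pointer-to-function declarations.  It expands
 * to nothing unless the (hypothetical) funcptr probe succeeded. */
#if SKETCH_HAVE_ATTRIBUTE_NORETURN_FUNCPTR
#  define __opal_attribute_noreturn_funcptr__ __attribute__((__noreturn__))
#else
#  define __opal_attribute_noreturn_funcptr__
#endif

/* An ordinary function keeps the plain attribute. */
void sketch_abort(int status) __opal_attribute_noreturn__;

/* A pointer-to-function type uses the funcptr variant instead. */
typedef void (*sketch_abort_fn_t)(int status) __opal_attribute_noreturn_funcptr__;

/* Hypothetical module struct holding such a pointer. */
struct sketch_module {
    sketch_abort_fn_t abort_fn;
};

void sketch_abort(int status)
{
    exit(status);
}
```

With this arrangement a compiler that accepts "noreturn" on functions but not on pointers to functions sees the attribute only on sketch_abort, and the typedef compiles without the warning.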


-Paul

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



[OMPI devel] nit-pick: typo in README (1.4.3rc1 and 1.5rc5)

2010-08-25 Thread Paul H. Hargrove
The following patch applies to both 1.4.3rc1 and 1.5rc5 to fix a typo in 
the README:


--- README.orig 2010-08-25 14:45:09.0 -0700
+++ README      2010-08-25 14:45:20.0 -0700
@@ -69,7 +69,7 @@
 - Asynchronous, transparent checkpoint/restart support
   - Fully coordinated checkpoint/restart coordination component
   - Support for the following checkpoint/restart services:
-    - blcr: Berkley Lab's Checkpoint/Restart
+    - blcr: Berkeley Lab's Checkpoint/Restart
     - self: Application level callbacks
   - Support for the following interconnects:
     - tcp



-Paul

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



Re: [OMPI devel] Problem w/ documented SPARC/gcc flags (1.5rc5 and 1.4.3rc1)

2010-08-25 Thread Rolf vandeVaart
Paul, is it possible for you to try one more thing?  Can you reconfigure 
with

CFLAGS="-mv8plus -Wa,-xarch=v8plus"

I think this will get past the configure test, as the configure test is 
compiling a piece of assembly and, for some reason, the -mv8plus is not 
finding its way to the assembler.

If that works, then we eliminate #2 on your list below, and have to 
decide between #1 and #3.


Rolf

On 08/25/10 15:56, Paul H. Hargrove wrote:

In both 1.5rc5 and 1.4.3rc1, README says:
- Open MPI does not support the Sparc v8 CPU target, which is the
 default on Sun Solaris.  The v8plus (32 bit) or v9 (64 bit)
 targets must be used to build Open MPI on Solaris.  This can be
 done by including a flag in CFLAGS, CXXFLAGS, FFLAGS, and FCFLAGS,
 -xarch=v8plus for the Sun compilers, -mv8plus for GCC.

However, the -mv8plus flag DOES NOT work for me.
The following occurs for both 1.5rc5 and 1.4.3rc1:

$ uname -a
SunOS lem.lbl.gov 5.10 s10_69 sun4u sparc SUNW,Ultra-5_10

$ gcc --version
gcc (GCC) 3.3.2
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There 
is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR 
PURPOSE.


$ [path_to]/configure --disable-mpi-f77 --disable-mpi-f90 
CFLAGS=-mv8plus CXXFLAGS=-mv8plus

[...]
*** Assembler
[...]
checking if have Sparc v8+/v9 support... no
configure: WARNING: Sparc v8 target is not supported in this release 
of Open MPI.

configure: WARNING: You must specify the target architecture v8plus
configure: WARNING: (cc: -xarch=v8plus, gcc: -mcpu=v9) for CFLAGS, 
CXXFLAGS,
configure: WARNING: FFLAGS, and FCFLAGS to compile Open MPI in 32 bit 
mode on

configure: WARNING: Sparc processors
configure: error: Can not continue.


Following the recommendation from configure:
 $ [path_to]/configure --disable-mpi-f77 --disable-mpi-f90 
CFLAGS=-mcpu=v9 CXXFLAGS=-mcpu=v9

DOES work for both of the current RCs.

So, I see a few possibilities:

1) -mv8plus SHOULD work (as -xarch=v8plus appears to w/ Sun C 5.10) 
but configure is unconditionally too strict.

OR
2) My gcc is older than what others have tested and configure is mistakenly 
thinking the ABI is wrong.

OR
3) -mcpu=v9 is the proper incantation and README needs correction.

No matter which of the above is correct, I suspect README and configure 
need to give the user the same information.


-Paul

P.S.  I can provide temporary machine access if needed to resolve this.
P.P.S.  I am /still/ not finished testing all the platforms available 
to me ;-)






Re: [OMPI devel] Problem w/ documented SPARC/gcc flags (1.5rc5 and 1.4.3rc1)

2010-08-25 Thread Paul H. Hargrove

Trying Rolf's suggestion, I configure 1.4.3rc1 with
  CFLAGS="-mv8plus -Wa,-xarch=v8plus" CXXFLAGS="-mv8plus 
-Wa,-xarch=v8plus"
I find that I get configure past the v8+/v9 Assembler ABI probe (but 
didn't wait for the full configure to run).


Another datapoint in favor of #2 is that I can successfully build 
1.4.3rc1 w/ gcc-4.3.3 when I configure with

   CC=gcc-4.3.3 CXX=g++-4.3.3 CFLAGS=-mv8plus CXXFLAGS=-mv8plus
And have configured (again stopping after the Assembler ABI probe) with 
gcc-4.3.3 AND Rolf's flags
   CC=gcc-4.3.3 CXX=g++-4.3.3 CFLAGS=-mv8plus CC=gcc-4.3.3 
CXX=g++-4.3.3 CFLAGS=-mv8plus


So, here is MY summary:

+ For gcc-4.3.3 README is providing correct information
+ For gcc-3.3.2 README is providing INcorrect information
+ For both gcc versions configure provides correct info on failure, but 
following it prevents using the V8+ ABI.


My suggested fix:

+ Edit README and configure both to suggest "-mv8plus -Wa,-xarch=v8plus" 
as that should be correct for either compiler version.


-Paul


--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



Re: [OMPI devel] Problem w/ documented SPARC/gcc flags (1.5rc5 and 1.4.3rc1)

2010-08-25 Thread Paul H. Hargrove

In the message below I fouled up some cut-and-paste.
Please mentally replace

And have configured (again stopping after the Assembler ABI probe) with 
gcc-4.3.3 AND Rolf's flags
  CC=gcc-4.3.3 CXX=g++-4.3.3 CFLAGS=-mv8plus CC=gcc-4.3.3 CXX=g++-4.3.3 
CFLAGS=-mv8plus


with

And have configured (again stopping after the Assembler ABI probe) with 
gcc-4.3.3 AND Rolf's flags
  CC=gcc-4.3.3 CXX=g++-4.3.3  CFLAGS="-mv8plus -Wa,-xarch=v8plus" 
CXXFLAGS="-mv8plus -Wa,-xarch=v8plus"


-Paul



Paul H. Hargrove wrote:

Trying Rolf's suggestion, I configure 1.4.3rc1 with
  CFLAGS="-mv8plus -Wa,-xarch=v8plus" CXXFLAGS="-mv8plus 
-Wa,-xarch=v8plus"
I find that I get configure past the v8+/v9 Assembler ABI probe (but 
didn't wait for the full configure to run).


Another datapoint in favor of #2 is that I can successfully build 
1.4.3rc1 w/ gcc-4.3.3 when I configure with

   CC=gcc-4.3.3 CXX=g++-4.3.3 CFLAGS=-mv8plus CXXFLAGS=-mv8plus
And have configured (again stopping after the Assembler ABI probe) 
with gcc-4.3.3 AND Rolf's flags
   CC=gcc-4.3.3 CXX=g++-4.3.3 CFLAGS=-mv8plus CC=gcc-4.3.3 
CXX=g++-4.3.3 CFLAGS=-mv8plus


So, here is MY summary:

+ For gcc-4.3.3 README is providing correct information
+ For gcc-3.3.2 README is providing INcorrect information
+ For both gcc versions configure provides correct info on failure, 
but following it prevents using the V8+ ABI.


My suggested fix:

+ Edit README and configure both to suggest "-mv8plus 
-Wa,-xarch=v8plus" as that should be correct for either compiler version.


-Paul




--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



[OMPI devel] Checkpoint/restart question

2010-08-25 Thread Tomas Oppelstrup
Hi,
I have a question about checkpoint-restart operation with open-mpi. I
hope this is an appropriate forum for my question.

I do not have access to recompile the kernel or load kernel modules,
so I would like to use the condor checkpoint-restart library. Can
that be made to work with openmpi's checkpoint-restart
infrastructure?

The condor library, upon receipt of a signal or upon a call to its
checkpoint function from within the program, generates a file containing
the complete (as complete as possible) state of the process, including
the state of libraries, e.g. openmpi. On restart, the process
image/state is loaded into memory and execution is resumed at the
checkpoint location.

On restart, I assume that some information in the mpi-state may need
to be reinitialized, since e.g. the names of the hosts of the
mpi-processes, and pids of possible support processes will have
changed.

Is this tricky to fix (that code must somehow be there for the BLCR
compatibility)?

Perhaps it can be achieved by (in violation of the mpi-standard)
calling MPI_Finalize before the checkpoint, and MPI_Init after
restart? This seems like a conceptually appealing solution, but may
not be allowed, nor do the correct thing, in openmpi?!

  Thanks for any ideas/help/pointers to more information!

 Tomas


[OMPI devel] "make check" (libtool?) failure on Solaris/SPARC (1.5rc5 and 1.4.3rc1)

2010-08-25 Thread Paul H. Hargrove
I have been able to configure and build both 1.5rc5 and 1.4.3rc1 on 
Solaris 10 for SPARC, using Sun C 5.10.
I have also built 1.5rc5 w/ gcc-3.3.2 (and expect 1.4.3rc1 to build w/ 
gcc as well, once I have time).


All 3 builds fail "make check" in a way that suggests to me that libtool 
is not working correctly on this platform.


The two RC versions fail in different places, but I suspect that is just 
because they contain different tests.


Platform:

$ uname -a
SunOS lem.lbl.gov 5.10 s10_69 sun4u sparc SUNW,Ultra-5_10

$ cc -V
cc: Sun C 5.10 SunOS_sparc 2009/06/03
usage: cc [ options] files.  Use 'cc -flags' for details

$ gcc --version
gcc (GCC) 3.3.2
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


Configure/build/check for 1.4.3rc1 w/ Sun C 5.10

$ [path_to]/openmpi-1.4.3rc1/configure CC='cc -m32 -xarch=sparc' CXX='CC 
-m32 -xarch=sparc' F77='f77 -m32 -xarch=sparc' FC='f90 -m32 -xarch=sparc'

[...Yes, this does pass the V8+/v9 ABI check...]

$ make
[...]

$ make check
[...]
make[3]: Entering directory 
`/export/home/phargrov/openmpi-1.4.3rc1/BLD-cc-5.10/test/datatype'

source='../../../test/datatype/checksum.c' object='checksum.o' libtool=no \
DEPDIR=.deps depmode=none /bin/bash ../../../config/depcomp \
cc -m32 -xarch=sparc -DHAVE_CONFIG_H -I. -I../../../test/datatype 
-I../../opal/include -I../../orte/include -I../../ompi/include 
-I../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../.. -I../.. 
-I../../../opal/include -I../../../orte/include 
-I../../../ompi/include-O -DNDEBUG  -mt -c -o checksum.o 
../../../test/datatype/checksum.c
"../../../test/datatype/checksum.c", line 68: warning: assignment type 
mismatch:

   pointer to char "=" pointer to int
"../../../test/datatype/checksum.c", line 86: warning: assignment type 
mismatch:

   pointer to char "=" pointer to int
"../../../test/datatype/checksum.c", line 106: warning: assignment type 
mismatch:

   pointer to char "=" pointer to int
/bin/bash ../../libtool --tag=CC   --mode=link cc -m32 -xarch=sparc  -O 
-DNDEBUG  -mt  -export-dynamic   -o checksum checksum.o 
../../ompi/libmpi.la -lsocket -lnsl  -lrt -lm -lthread
libtool: link: cc -m32 -xarch=sparc -O -DNDEBUG -mt -o .libs/checksum 
checksum.o  ../../ompi/.libs/libmpi.so 
/export/home/phargrov/openmpi-1.4.3rc1/BLD-cc-5.10/orte/.libs/libopen-rte.so 
/export/home/phargrov/openmpi-1.4.3rc1/BLD-cc-5.10/opal/.libs/libopen-pal.so 
-lsocket -lnsl -lrt -lm -lthread -mt -R/usr/local/lib

source='../../../test/datatype/position.c' object='position.o' libtool=no \
DEPDIR=.deps depmode=none /bin/bash ../../../config/depcomp \
cc -m32 -xarch=sparc -DHAVE_CONFIG_H -I. -I../../../test/datatype 
-I../../opal/include -I../../orte/include -I../../ompi/include 
-I../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../.. -I../.. 
-I../../../opal/include -I../../../orte/include 
-I../../../ompi/include-O -DNDEBUG  -mt -c -o position.o 
../../../test/datatype/position.c
/bin/bash ../../libtool --tag=CC   --mode=link cc -m32 -xarch=sparc  -O 
-DNDEBUG  -mt  -export-dynamic   -o position position.o 
../../ompi/libmpi.la -lsocket -lnsl  -lrt -lm -lthread
libtool: link: cc -m32 -xarch=sparc -O -DNDEBUG -mt -o .libs/position 
position.o  ../../ompi/.libs/libmpi.so 
/export/home/phargrov/openmpi-1.4.3rc1/BLD-cc-5.10/orte/.libs/libopen-rte.so 
/export/home/phargrov/openmpi-1.4.3rc1/BLD-cc-5.10/opal/.libs/libopen-pal.so 
-lsocket -lnsl -lrt -lm -lthread -mt -R/usr/local/lib

source='../../../test/datatype/to_self.c' object='to_self.o' libtool=no \
DEPDIR=.deps depmode=none /bin/bash ../../../config/depcomp \
cc -m32 -xarch=sparc -DHAVE_CONFIG_H -I. -I../../../test/datatype 
-I../../opal/include -I../../orte/include -I../../ompi/include 
-I../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../.. -I../.. 
-I../../../opal/include -I../../../orte/include 
-I../../../ompi/include-O -DNDEBUG  -mt -c -o to_self.o 
../../../test/datatype/to_self.c
/bin/bash ../../libtool --tag=CC   --mode=link cc -m32 -xarch=sparc  -O 
-DNDEBUG  -mt  -export-dynamic   -o to_self to_self.o 
../../ompi/libmpi.la -lsocket -lnsl  -lrt -lm -lthread
libtool: link: cc -m32 -xarch=sparc -O -DNDEBUG -mt -o .libs/to_self 
to_self.o  ../../ompi/.libs/libmpi.so 
/export/home/phargrov/openmpi-1.4.3rc1/BLD-cc-5.10/orte/.libs/libopen-rte.so 
/export/home/phargrov/openmpi-1.4.3rc1/BLD-cc-5.10/opal/.libs/libopen-pal.so 
-lsocket -lnsl -lrt -lm -lthread -mt -R/usr/local/lib

source='../../../test/datatype/ddt_pack.c' object='ddt_pack.o' libtool=no \
DEPDIR=.deps depmode=none /bin/bash ../../../config/depcomp \
cc -m32 -xarch=sparc -DHAVE_CONFIG_H -I. -I../../../test/datatype 
-I../../opal/include -I../../orte/include -I../../ompi/include 
-I../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../.. -I../.. 
-I../../../opal/include -I../../../orte/include 
-I../../../ompi/

[OMPI devel] bitbucket announced downtime for upgrade

2010-08-25 Thread Jeff Squyres
For all of you using bitbucket:


http://blog.bitbucket.org/2010/08/25/bitbucket-downtime-for-a-hardware-upgrade/

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI devel] atomic_spinlock test failure with xlc/ppc64 (1.5rc5 and 1.4.3rc1)

2010-08-25 Thread Paul H. Hargrove
I know Linux/PPC64 is listed as an under-tested platform, and the BG/P 
release of XLC is probably not supported at all, but I tested it anyway 
(on a front-end, not the BG/P compute nodes) and have the following to 
report.  I report here only the 1.5rc5 case, but results are identical 
with 1.4.3rc1. 


$ make check
[...]
--> Testing atomic_spinlock
../../../test/asm/run_tests: line 8: 25146 Segmentation fault  $* 
$threads

   - 1 threads: Failed
../../../test/asm/run_tests: line 8: 25148 Segmentation fault  $* 
$threads

   - 2 threads: Failed
../../../test/asm/run_tests: line 8: 25151 Segmentation fault  $* 
$threads

   - 4 threads: Failed
../../../test/asm/run_tests: line 8: 25154 Segmentation fault  $* 
$threads

   - 5 threads: Failed
../../../test/asm/run_tests: line 8: 25157 Segmentation fault  $* 
$threads

   - 8 threads: Failed
FAIL: atomic_spinlock
[...]

1 of 8 tests failed
Please report to http://www.open-mpi.org/community/help/



Here are the details of the platform:

$ uname -a
Linux login1 2.6.16.60-0.67.1-ppc64 #1 SMP Thu Aug 5 10:54:46 UTC 2010 
ppc64 ppc64 ppc64 GNU/Linux


$ which xlc
/soft/apps/ibmcmp-aug2010/vac/bg/9.0/bin/xlc
$ xlc -qversion
IBM XL C/C++ Advanced Edition for Blue Gene/P, V9.0
Version: 09.00..0009

$ /lib64/libc.so.6
GNU C Library stable release version 2.4 (20090904), by Roland McGrath 
et al.

Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Configured for ppc64-suse-linux.
Compiled by GNU CC version 4.1.2 20070115 (SUSE Linux).
Compiled on a Linux 2.6.16 system on 2009-09-04.
Available extensions:
   crypt add-on version 2.1 by Michael Glad and others
   GNU Libidn by Simon Josefsson
   GNU libio by Per Bothner
   NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
   Native POSIX Threads Library by Ulrich Drepper et al
   BIND-8.2.3-T5B
Thread-local storage support included.
For bug reporting instructions, please see:
.


Here is the configure command:

$ [path_to]/openmpi-1.5rc5/configure --enable-static --disable-shared 
--with-contrib-vt-flags=--with-platform=linux CC='xlc_r -q64' CXX='xlC_r 
-q64' F77='xlf -q64' FC='xlf90 -q64'


These deserve some explanation:
+ --enable-static --disable-shared
   These are due to an apparent libtool-vs-xlc problem I will report
   separately.

+ --with-contrib-vt-flags=--with-platform=linux
   This is due to VT's BG/P auto-detection which is not appropriate
   when building for the front end.
   See http://www.open-mpi.org/community/lists/devel/2010/08/8358.php

+ CC='xlc_r -q64' CXX='xlC_r -q64'
   The -q64 requests the LP64 ABI.
   The "_r" suffixes are for the thread-safe versions, and are needed
   to avoid undefined pthread symbols at link time.

+ F77='xlf -q64' FC='xlf90 -q64'
   The -q64 flag is, again, for the LP64 ABI.
   No "_r" suffix was needed, as I suspect no Fortran+pthread code is
   built when compiling Open MPI.


Here is the Assembler section of the configure output:

*** Assembler
checking dependency style of xlc_r -q64... none
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking for fgrep... /bin/grep -F
checking if need to remove -g from CCASFLAGS... no
checking whether to enable smp locks... yes
checking if .proc/endp is needed... no
checking directive for setting text section... .text
checking directive for exporting symbols... .globl
checking for objdump... objdump
checking if .note.GNU-stack is needed... no
checking suffix for labels... :
checking prefix for global symbol labels...
checking prefix for lsym labels... .L
checking prefix for function in .type... @
checking if .size is needed... yes
checking if .align directive takes logarithmic value... yes
checking if PowerPC registers have r prefix... no
checking if xlc_r -q64 supports GCC inline assembly... yes
checking if xlc_r -q64 supports DEC inline assembly... no
checking if xlc_r -q64 supports XLC inline assembly... no
checking if xlC_r -q64 supports GCC inline assembly... yes
checking if xlC_r -q64 supports DEC inline assembly... no
checking if xlC_r -q64 supports XLC inline assembly... no
checking for assembly format... default-.text-.globl-:--.L-@-1-1-0-1-0
checking for asssembly architecture... POWERPC64
checking for perl... perl
checking for pre-built assembly file... no (not in asm-data)
checking whether possible to generate assembly file... yes
checking for atomic assembly filename... atomic-local.s


Note that "supports GCC inline assembly... yes" is NOT a mistake (though I 
was not expecting to see "XLC inline assembly... no").


-Paul

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technol

[OMPI devel] make install (libtool) failure on Solaris 10 (1.5rc5 and 1.4.3rc1)

2010-08-25 Thread Paul H. Hargrove

This has got to be the stupidest failure I have ever seen!

$ make install
[...]
make[3]: Entering directory 
`/export/home/phargrov/openmpi-1.5rc5/BLD-gcc-vt/ompi'
test -z "/usr/local/pkg/ompi-1.5rc5/lib" || ../../config/install-sh -c 
-d "/usr/local/pkg/ompi-1.5rc5/lib"
/bin/bash ../libtool   --mode=install ../../config/install-sh -c   
libmpi.la '/usr/local/pkg/ompi-1.5rc5/lib'
libtool: install: ../../config/install-sh -c .libs/libmpi.so.0.0.2 
/usr/local/pkg/ompi-1.5rc5/lib/libmpi.so.0.0.2
libtool: install: (cd /usr/local/pkg/ompi-1.5rc5/lib && { ln -s -f 
libmpi.so.0.0.2 libmpi.so.0 || { rm -f libmpi.so.0 && ln -s 
libmpi.so.0.0.2 libmpi.so.0; }; })

Usage: ln [-f] [-s] f1
  ln [-f] [-s] f1 f2
  ln [-f] [-s] f1 ... fn d1
[...]

This is due to an incomprehensibly stupid "ln" that cares about the 
order of the "-s" and "-f" options:


$ rm -f b; touch a; ln -f -s a b
$ rm -f b; touch a; ln -s -f a b
Usage: ln [-f] [-s] f1
  ln [-f] [-s] f1 f2
  ln [-f] [-s] f1 ... fn d1

$ which ln
/usr/ucb/ln

$ uname -a
SunOS lem.lbl.gov 5.10 s10_69 sun4u sparc SUNW,Ultra-5_10


I see the same with both the 1.5rc5 and 1.4.3rc1 tarballs, which both 
contain

 # ltmain.sh (GNU libtool) 2.2.6b
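
For anyone who wants to check their own system, here is a minimal sketch of a probe for the option-order problem shown above. It is not part of libtool or Open MPI; the function name pick_ln_flags and the variable names are hypothetical, and it assumes a POSIX-style shell with $(...) plus a working "mktemp -d". It tries "-s -f" first, then "-f -s", and prints whichever order the given ln accepts.

```shell
# pick_ln_flags LN_CMD   (hypothetical helper, for illustration only)
# Prints the flag order ("-s -f" or "-f -s") that LN_CMD accepts when
# creating a forced symlink; returns non-zero if neither order works.
# LN_CMD should be a bare command name or an absolute path, because the
# probe runs from inside a scratch directory.
pick_ln_flags() {
    ln_cmd=${1:-ln}
    probe_dir=$(mktemp -d) || return 1
    touch "$probe_dir/a"
    result=
    for flags in "-s -f" "-f -s"; do
        rm -f "$probe_dir/b"
        # $flags is deliberately unquoted so it splits into two options.
        if (cd "$probe_dir" && "$ln_cmd" $flags a b) >/dev/null 2>&1 &&
           test -h "$probe_dir/b"; then
            result=$flags
            break
        fi
    done
    rm -rf "$probe_dir"
    test -n "$result" && echo "$result"
}
```

Running "pick_ln_flags /usr/ucb/ln" on the machine above should report "-f -s" if the behavior is as in my transcript; I have only reasoned about that, not run this exact function there.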

-Paul

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900