Re: [O-MPI devel] couple of problems in openib mpool.

2005-08-12 Thread Galen Shipman

Hey Gleb,

Sorry for the delay.. we have been doing a bit of reworking of the 
pml/btl so that the btl's can be shared outside of just the pml 
(collectives, etc).


I have added the bug fix (old_reg). Will look at the assumption of 
non-null registration next.


Thanks (and keep them coming ;-) ,

Galen

On Aug 11, 2005, at 8:27 AM, Gleb Natapov wrote:


Hello,

 There are a couple of bugs/typos in the openib mpool. The first one is
fixed by the included patch. The second one is in the function
mca_mpool_openib_free(). This function assumes that registration is never
NULL, but there are callers that think differently
(ompi/class/ompi_fifo.h, ompi/class/ompi_circular_buffer_fifo.h).


Index: ompi/mca/mpool/openib/mpool_openib_module.c
===
--- ompi/mca/mpool/openib/mpool_openib_module.c (revision 6806)
+++ ompi/mca/mpool/openib/mpool_openib_module.c (working copy)
@@ -127,7 +127,7 @@
 mca_mpool_base_registration_t* old_reg  = *registration;
 void* new_mem = mpool->mpool_alloc(mpool, size, 0, registration);
 memcpy(new_mem, addr, old_reg->bound - old_reg->base);
-mpool->mpool_free(mpool, addr, &old_reg);
+mpool->mpool_free(mpool, addr, old_reg);
 return new_mem;

 }
--
Gleb.
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
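
The second problem Gleb reports (a free routine that dereferences a
registration some callers pass as NULL) can be illustrated with a minimal
sketch. The type and function names below (demo_registration_t,
demo_mpool_free) are hypothetical stand-ins, not the actual openib mpool
code; the sketch only shows the shape of a NULL guard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory registration record. */
typedef struct {
    void *base;
    void *bound;
    int   is_registered;
} demo_registration_t;

/*
 * Hypothetical free routine: drop the registration only when the caller
 * actually supplied one, then release the memory itself.
 * Returns 1 if a registration was released, 0 if none was passed in.
 */
static int demo_mpool_free(void *addr, demo_registration_t *reg)
{
    int deregistered = 0;

    if (NULL != reg) {
        reg->is_registered = 0;   /* stands in for the real deregistration */
        deregistered = 1;
    }
    free(addr);
    return deregistered;
}
```

With a guard like this, callers that have no registration can pass NULL
and the memory is still released.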




[O-MPI devel] build warnings..

2005-08-12 Thread Galen Shipman

Current build warnings:

mca_base_parse_paramfile_lex.c:1664: warning: 'yy_flex_realloc' defined but not used

qsort.c:163: warning: cast from pointer to integer of different size
show_help_lex.c:1606: warning: 'yy_flex_realloc' defined but not used
rmgr_proxy.c:237: warning: ISO C forbids conversion of object pointer to function pointer type
rmgr_proxy.c:356: warning: ISO C forbids conversion of function pointer to object pointer type
rmgr_urm.c:184: warning: ISO C forbids conversion of object pointer to function pointer type
rmgr_urm.c:309: warning: ISO C forbids conversion of function pointer to object pointer type

comm_cid.c:167: warning: comparison between signed and unsigned
fake_stack.c:46: warning: no previous prototype for 'ompi_convertor_create_stack_with_pos_general'




Re: [O-MPI devel] build warnings..

2005-08-12 Thread Brian Barrett

On Aug 12, 2005, at 9:36 AM, Galen Shipman wrote:


Current build warnings:

mca_base_parse_paramfile_lex.c:1664: warning: 'yy_flex_realloc' defined
but not used


This one is pretty much impossible to fix (it's in a file generated by
lex, and isn't easy to deal with).



qsort.c:163: warning: cast from pointer to integer of different size


This one is pretty difficult to get rid of without doing really dumb 
things.  But that code really shouldn't be built on platforms where the 
native qsort works.  I'll fix this today.



show_help_lex.c:1606: warning: 'yy_flex_realloc' defined but not used


Same as the other lex.


rmgr_proxy.c:237: warning: ISO C forbids conversion of object pointer to function pointer type
rmgr_proxy.c:356: warning: ISO C forbids conversion of function pointer to object pointer type
rmgr_urm.c:184: warning: ISO C forbids conversion of object pointer to function pointer type
rmgr_urm.c:309: warning: ISO C forbids conversion of function pointer to object pointer type

comm_cid.c:167: warning: comparison between signed and unsigned


This could have been due to my MAX_CID changes - I'll have a look and 
make it right.



fake_stack.c:46: warning: no previous prototype for
'ompi_convertor_create_stack_with_pos_general'


Brian

--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/
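
The rmgr warnings above come from casting directly between object pointers
and function pointers, which -pedantic flags because ISO C does not define
that conversion. One common way to avoid the cast, sketched here under the
assumption that both pointer kinds have the same size (true on POSIX
systems, not guaranteed by ISO C), is to copy the bits with memcpy. The
names demo_fn_t, demo_fn_to_obj and demo_obj_to_fn are made up for
illustration and are not taken from the rmgr code.

```c
#include <assert.h>
#include <string.h>

typedef int (*demo_fn_t)(int);

/* Small function used only to exercise the round trip. */
static int demo_double(int x)
{
    return 2 * x;
}

/* Store a function pointer in a void * without a direct cast. */
static void *demo_fn_to_obj(demo_fn_t fn)
{
    void *obj = NULL;

    /* Assumption: object and function pointers are the same size. */
    assert(sizeof(obj) == sizeof(fn));
    memcpy(&obj, &fn, sizeof(obj));
    return obj;
}

/* Recover the function pointer from the void * the same way. */
static demo_fn_t demo_obj_to_fn(void *obj)
{
    demo_fn_t fn = NULL;

    memcpy(&fn, &obj, sizeof(fn));
    return fn;
}
```

Whether this is preferable to living with the warning is a judgment call;
it trades a diagnostic for an explicit size assumption.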



[O-MPI devel] OMPI 32bit on a 64bit Linux box

2005-08-12 Thread Nathan DeBardeleben
We've got a 64bit Linux (SUSE) box here.  For a variety of reasons 
(Java, JNI, linking in with OMPI libraries, etc which I won't get into) 
I need to compile OMPI 32 bit (or get 64bit versions of a lot of other 
libraries).
I get various compile errors when I try different things, but first let 
me explain the system we have:



[sparkplug]~/ompi > uname -a
Linux sparkplug 2.6.10 #4 SMP Wed Jan 26 11:50:00 MST 2005 x86_64 
x86_64 x86_64 GNU/Linux

[sparkplug]~/ompi >
[sparkplug]~/ompi > cat /etc/issue

Welcome to SuSE Linux 9.1 (x86-64) - Kernel \r (\l).


[sparkplug]~/ompi > 


I tried the obvious:


./configure CFLAGS=-m32 FFLAGS=-m32 .. 


The make then bailed out with compile errors:



 gcc -m32 -g -Wall -Wundef -Wno-long-long -Wsign-compare -Wmissing-prototypes -Wstrict-prototypes -Wcomment -pedantic -Werror-implicit-function-declaration -fno-strict-aliasing -c atomic-asm.s -o atomic-asm.o

atomic-asm.s: Assembler messages:
atomic-asm.s:6: Error: suffix or operands invalid for `push'
atomic-asm.s:7: Error: suffix or operands invalid for `movq'
atomic-asm.s:16: Error: suffix or operands invalid for `push'
atomic-asm.s:17: Error: suffix or operands invalid for `movq'
atomic-asm.s:26: Error: suffix or operands invalid for `push'
atomic-asm.s:27: Error: suffix or operands invalid for `movq'
atomic-asm.s:36: Error: suffix or operands invalid for `push'
atomic-asm.s:37: Error: suffix or operands invalid for `movq'
atomic-asm.s:38: Error: `-8(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:39: Error: `-12(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:40: Error: `-16(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:41: Error: `-16(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:42: Error: `-8(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:43: Error: `-12(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:45: Error: `(%rdx)' is not a valid 32 bit base/index expression
atomic-asm.s:47: Error: `-24(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:48: Error: `-24(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:49: Error: `-28(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:50: Error: `-28(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:51: Error: `-12(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:54: Error: `-28(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:55: Error: `-28(%rbp)' is not a valid 32 bit base/index expression

atomic-asm.s:64: Error: suffix or operands invalid for `push'
atomic-asm.s:65: Error: suffix or operands invalid for `movq'
atomic-asm.s:66: Error: `-8(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:67: Error: `-16(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:68: Error: `-24(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:69: Error: `-24(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:70: Error: `-8(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:71: Error: `-16(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:73: Error: `(%rdx)' is not a valid 32 bit base/index expression
atomic-asm.s:76: Error: `-32(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:77: Error: `-32(%rbp)' is not a valid 32 bit base/index expression
atomic-asm.s:78: Error: `-16(%rbp)' is not a valid 32 bit base/index expression

make[2]: *** [atomic-asm.lo] Error 1
make[2]: Leaving directory `/home/ndebard/ompi/opal/asm'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/ndebard/ompi/opal'
make: *** [all-recursive] Error 1


Greg Watson then suggested I add to my configure:


--build=i586-suse-linux


That got the Make further, but now it dies saying:

/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/bin/ld: 
warning: i386 architecture of input file 
`../../../opal/.libs/libopal.a(memory.o)' is incompatible with 
i386:x86-64 output
../../../ompi/.libs/libmpi.a(mca_io_romio_dist_ad_nfs_read.o)(.text+0x106e): 
In function `mca_io_romio_dist_ADIOI_NFS_ReadStrided':
/home/ndebard/ompi/ompi/mca/io/romio/romio-dist/adio/ad_nfs/mca_io_romio_dist_ad_nfs_read.c:230: 
undefined reference to `__divdi3'
../../../ompi/.libs/libmpi.a(mca_io_romio_dist_ad_nfs_read.o)(.text+0x108f):/home/ndebard/ompi/ompi/mca/io/romio/romio-dist/adio/ad_nfs/mca_io_romio_dist_ad_nfs_read.c:231: 
undefined reference to `__moddi3'
../../../ompi/.libs/libmpi.a(mca_io_romio_dist_ad_nfs_write.o)(.text+0xf76): 
In function `mca_io_romio_dist_ADIOI_NFS_WriteStrided':
/home/ndebard/ompi/ompi/mca/io/romio/romio-dist/adio/ad_nfs/mca_io_romio_dist_ad_nfs_write.c:268: 
undefined reference to `__divdi3'
../../../ompi/.libs/libmpi.a(mca_io_romio_dist_ad_nfs_write.o)(.text+0xf97):/home/ndebard/ompi/ompi/mca/io/romio/romio-dist/adio/ad_nfs/mca_io_romio_dist_ad_nfs_write.c:269:

Re: [O-MPI devel] OMPI 32bit on a 64bit Linux box

2005-08-12 Thread Brian Barrett

On Aug 12, 2005, at 3:13 PM, Nathan DeBardeleben wrote:


We've got a 64bit Linux (SUSE) box here.  For a variety of reasons
(Java, JNI, linking in with OMPI libraries, etc which I won't get into)
I need to compile OMPI 32 bit (or get 64bit versions of a lot of other
libraries).
I get various compile errors when I try different things, but first let
me explain the system we have:





This goes on and on and on actually.  And the 'is incompatible with
i386:x86-64 output' looks to be repeated for every line before this
error which actually caused the Make to bomb.

Any suggestions at all?  Surely someone must have tried to force OMPI to
build in 32bit mode on a 64bit machine.


I don't think anyone has tried to build 32 bit on an Opteron, which  
is the cause of the problems...


I think I know how to fix this, but won't happen until later in the  
weekend.  I can't think of a good workaround until then.  Well, one  
possibility is to set the target like you were doing and disable  
ROMIO.  Actually, you'll also need to disable Fortran 77.  So  
something like:


  ./configure [usual options] --build=i586-suse-linux --disable-io-romio --disable-f77


might just do the trick.

Brian


--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/




Re: [O-MPI devel] OMPI 32bit on a 64bit Linux box

2005-08-12 Thread Nathan DeBardeleben
Thanks, trying that now.  While I'd like those things in the long run, 
they're not needed right now to test what I'm trying to test.  Will let 
you know how it goes!  (What's the problem, by the way?)


-- Nathan
Correspondence
-
Nathan DeBardeleben, Ph.D.
Los Alamos National Laboratory
Parallel Tools Team
High Performance Computing Environments
phone: 505-667-3428
email: ndeb...@lanl.gov
-






Re: [O-MPI devel] OMPI 32bit on a 64bit Linux box

2005-08-12 Thread Brian Barrett

On Aug 12, 2005, at 3:22 PM, Nathan DeBardeleben wrote:


Thanks, trying that now.  While I'd like those things in the long run,
they're not needed right now to test what I'm trying to test.  Will let
you know how it goes!  (What's the problem, by the way?)


The problem is that I key off the target host string to decide what  
assembly to use for the atomic operations.  For most 64 bit  
platforms, the architecture string is the same for 32/64 bit and then  
you use sizeof(long) to determine whether to use 32 or 64 bit  
instructions.  So what I need to add to the configure script is a  
check if we're on x86_64 that if sizeof(long) == 4, we use the  
assembly for x86, not x86_64.


Brian

--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/
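
The check Brian describes (on x86_64, fall back to the x86 assembly when
sizeof(long) is 4) can be sketched as a small decision function. This is
only an illustration of the logic, not the actual configure test; the
function name demo_select_asm_arch and the labels "IA32", "AMD64" and
"UNSUPPORTED" are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: choose an assembly flavor from the host CPU string
 * and the size of long reported by the compiler that will build the code.
 */
static const char *demo_select_asm_arch(const char *host_cpu,
                                        size_t sizeof_long)
{
    if (0 == strcmp(host_cpu, "x86_64")) {
        /* A 32-bit build (e.g. gcc -m32) on an x86_64 host still sees
         * "x86_64" in the host string, but long is only 4 bytes. */
        return (4 == sizeof_long) ? "IA32" : "AMD64";
    }

    /* i386, i486, i586, i686: always the 32-bit x86 assembly. */
    if (4 == strlen(host_cpu) && 'i' == host_cpu[0] &&
        0 == strcmp(host_cpu + 2, "86")) {
        return "IA32";
    }

    return "UNSUPPORTED";
}
```

In a real build, the second argument would come from the compiler under
test, for example by evaluating sizeof(long) with the same CFLAGS that the
library is built with.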




Re: [O-MPI devel] OMPI 32bit on a 64bit Linux box

2005-08-12 Thread Nathan DeBardeleben

OK, so I reconfigured, made, etc:

   137  14:29   ./configure CFLAGS=-m32 FFLAGS=-m32 
--build=i586-suse-linux --enable-static --disable-shared 
--without-threads --prefix=/home/ndebard/local/ompi 
--with-devel-headers --disable-io-romio --disable-f77

   138  14:48   make clean all install


But mpicc now segfaults immediately:


[sparkplug]~/ompi > /home/ndebard/local/ompi/bin/mpicc
Segmentation fault



[sparkplug]~/ompi > gdb /home/ndebard/local/ompi/bin/mpicc
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-suse-linux"...DW_FORM_strp pointing
outside of .debug_str section [in module /home/ndebard/local/ompi/bin/mpicc]
Using host libthread_db library "/lib64/tls/libthread_db.so.1".

(gdb) run
Starting program: /home/ndebard/local/ompi/bin/mpicc
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...

Program received signal SIGSEGV, Segmentation fault.
0x00408d4a in ?? ()
(gdb) where
#0  0x00408d4a in ?? ()
Cannot access memory at address 0xbfffecf8
(gdb)



[sparkplug]~/ompi > /home/ndebard/local/ompi/bin/mpic++
Segmentation fault
[sparkplug]~/ompi > 



-- Nathan
Correspondence
-
Nathan DeBardeleben, Ph.D.
Los Alamos National Laboratory
Parallel Tools Team
High Performance Computing Environments
phone: 505-667-3428
email: ndeb...@lanl.gov
-






Re: [O-MPI devel] OMPI 32bit on a 64bit Linux box

2005-08-12 Thread Jeff Squyres
That's a neat one.  mpicc shouldn't care about any of this stuff -- 
it's a trivial C++ program that invokes none of the MCA framework 
stuff, etc.


I'll try to replicate.

Just out of curiosity -- do other C++ applications work nicely in 32 
bit on that machine?  (particularly ones that use std::vector and 
std::string)






--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/



Re: [O-MPI devel] OMPI 32bit on a 64bit Linux box

2005-08-12 Thread Jeff Squyres
Actually, Brian just pointed out the problem -- you also need to set 
CXXFLAGS=-m32.






--
{+} Jeff Squyres
{+} jsquy...@lam-mpi.org
{+} http://www.lam-mpi.org/



[O-MPI devel] Memory manager changes

2005-08-12 Thread Brian Barrett

Hi all -

For those not on the telecon Tuesday, we finally broke down and  
decided we needed to do all the system nastiness to intercept free()  
and munmap() and the like for high speed interconnects so that we can  
do pinned page caching and not take the pinning performance hit on  
applications like NetPIPE (and, to be fair, many user applications).   
Unlike LAM, however, we're going to try to make this not be the  
center of all pain and suffering ;).  While we'll support the  
ptmalloc2 trick that LAM and MPICH-gm use, it will not be on by  
default and we're trying to find better alternatives.  Below are your  
current choices for intercepting memory releases back to the  
operating system.  The default is malloc_hooks on platforms that  
support it when threads aren't enabled.  Otherwise the current  
default is "none".


In all cases, in addition to dealing with free() and realloc(), we  
provide intercepts for munmap() to catch the user doing his own  
memory management.  We may also want to intercept SysV shared memory  
functions.


You can choose exactly which "memory manager" to use with the
--with-memory-manager=TYPE option to configure, where TYPE is one of
"ptmalloc2", "malloc_hooks", "darwin7", or "ldpreload".  Of course,
you can also use --without-memory-manager or
--with-memory-manager=none to completely disable the things.


* PTMALLOC2

  + Very fast implementation of the full malloc/free suite.
    Directly used by glibc as their memory manager.
  + Works properly in threaded environment
  + Only call unpin callbacks when giving memory back to the
    OS (i.e., when sbrk() or munmap() are called)
  - Does not work properly in some situations (abacus linker
    tricks, for example) that appear to be within the
    spirit of using the MPI library
  - Does not work on many platforms (everywhere but Linux, really)
  - Feels massively icky

* MALLOC_HOOKS

  + Use the hooks provided by ptmalloc2 (and therefore glibc)
    to get callbacks when free(), realloc(), etc are called
  + No "corner cases" that cause unexpected behavior like with
    ptmalloc2
  - Does not support threads (disables itself if either
    progress or mpi threads are enabled)
  - Have to call unpin callbacks when memory is free()d or
    realloc()ed, not when giving back to OS
  - Very low performance impact (1-2%) on calling free() when
    there are no mpools registering callbacks

* LDPRELOAD

  + Thread safe
  + No "corner cases" that cause unexpected behavior like with
    ptmalloc2
  + Should work on every platform that supports LD Preload and
    dlsym()
  - Requires doing ldpreload tricks
  - On some platforms, have to call unpin callbacks when
    memory is free()d or realloc()ed, not when giving back
    to the OS
  - Did I mention, it requires doing ldpreload?
  + If LDPRELOAD doesn't succeed, opal can properly determine
    this and will just say free() interception is unavailable

* DARWIN7

  + Thread safe
  - Requires some nasty linker tricks to make work.  User
    application must be linked with mpicc or a long list
    of special flags
  + If application is not linked with the special sauce,
    opal should be able to properly determine this and just
    say free() interception is unavailable.
  - Total hack of linker tricks

LD Preload is not yet implemented, but should be by the end of the  
weekend.  The initial version will most likely only support making  
callbacks every time free() / realloc() is called, rather than every  
time memory is given back to the OS.  Not optimal, but better than  
nothing.


I'm going to talk with some Darwin developers about better ways to do  
things on Darwin, but probably won't have any results on that front  
until sometime middle of next week.



Brian

--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/
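
All four options share one idea: some layer sees a free() (or a release
back to the OS) before it happens and tells interested parties, such as an
mpool holding a pinned-page cache, that the address is going away. A
minimal sketch of that callback pattern follows. It is not the opal memory
code; every name in it (demo_release_cb_t, demo_register_release_cb,
demo_intercepted_free, demo_count_release) is hypothetical, and
demo_intercepted_free is an ordinary function standing in for whatever
mechanism really intercepts free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAX_CALLBACKS 8

/* Callback invoked just before a block is released. */
typedef void (*demo_release_cb_t)(void *addr, void *ctx);

static struct {
    demo_release_cb_t fn;
    void             *ctx;
} demo_callbacks[DEMO_MAX_CALLBACKS];
static size_t demo_num_callbacks = 0;

/* Register a callback; returns 0 on success, -1 if fn is NULL or the
 * fixed-size table is full. */
static int demo_register_release_cb(demo_release_cb_t fn, void *ctx)
{
    if (NULL == fn || demo_num_callbacks >= DEMO_MAX_CALLBACKS) {
        return -1;
    }
    demo_callbacks[demo_num_callbacks].fn  = fn;
    demo_callbacks[demo_num_callbacks].ctx = ctx;
    demo_num_callbacks++;
    return 0;
}

/* Stand-in for an intercepted free(): notify every registered callback
 * (e.g. so a cached registration can be unpinned), then release. */
static void demo_intercepted_free(void *addr)
{
    size_t i;

    if (NULL == addr) {
        return;   /* free(NULL) is a no-op, so nothing to report */
    }
    for (i = 0; i < demo_num_callbacks; i++) {
        demo_callbacks[i].fn(addr, demo_callbacks[i].ctx);
    }
    free(addr);
}

/* Example callback: counts how many releases it was told about. */
static void demo_count_release(void *addr, void *ctx)
{
    (void)addr;
    ++*(int *)ctx;
}
```

This sketch fires the callbacks on every free(), which matches the "not
optimal, but better than nothing" behaviour described for the initial
LD Preload version rather than the release-to-OS granularity of ptmalloc2.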




[O-MPI devel] Fwd: Memory manager changes

2005-08-12 Thread Rich L. Graham

Brian,
Sounds like I got off the call a bit too early ;-)
  Can we choose to use standard platform libraries, or are we pinning
ourselves into a corner?  I.e., is this optional?
  What sort of problems are we getting into playing with pre-load
options?  I would be VERY careful here, and do plenty of testing,
especially with C++ codes, before you decide to do this.  We used to use
some of these tricks in LA-MPI, but backed off because of loader
ordering issues.
  As you can tell, I am VERY leery of these sorts of tricks for a
production-grade bit of code.  If it is easy to decide at run-time
whether to use these tricks (w/o a performance penalty), this is a
different question.

Rich


Re: [O-MPI devel] Fwd: Memory manager changes

2005-08-12 Thread Brian Barrett

On Aug 12, 2005, at 9:43 PM, Rich L. Graham wrote:


Sounds like I got off the call a bit too early ;-)
   Can we choose to use standard platform libraries, or are we pinning
ourselves into a corner ?  I.e., is this optional ?


Yes - the code is all built around trying to use the standard  
platform.  And yes, everything is optional.  In many cases (pretty  
much everywhere but single threaded Linux), the default will be to  
not do any memory manager tricks at all.  Of course, not having any  
memory manager hooks lessens the performance of the BTLs since we  
have to do pin/rdma pipelining, but that's the price we have to pay.


  What sort of problems are we getting into playing with pre-load
options ?  I would be VERY careful here, and do plenty of testing,
especially with c++ codes, before you decide to do this.  We used to use
some of these tricks in LA-MPI, but backed off because of loader
ordering issues.


Agreed - I'm one of the ones who was very against doing it in the  
first place :).  Currently, the default on everywhere but single  
threaded Linux is to not have any memory manager hooks at all.  On  
single threaded Linux, we use the hooks provided by glibc for doing  
"something" before the actual free/realloc occurs.  Because these are  
official, recommended ways of doing things, they should work on any  
C, C++, and Fortran codes, even if they are statically linked.  I've  
tested them with C++ apps, and they work as the documentation implies  
they would.
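
A minimal sketch of the "do something before the actual free occurs"
pattern described above. It does not use glibc's real hook variables;
release_cb, set_release_cb, hooked_free and count_release are
hypothetical names made up for illustration, and hooked_free only
stands in for an intercepted free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback type: called with the pointer about to be freed. */
typedef void (*release_cb_t)(void *addr);

release_cb_t release_cb = NULL;  /* hypothetical hook slot */
int release_count = 0;           /* how many times the callback has run */

/* Install a callback; returns the previous one so hooks can be chained. */
release_cb_t set_release_cb(release_cb_t cb)
{
    release_cb_t old = release_cb;
    release_cb = cb;
    return old;
}

/* Stand-in for an intercepted free(): notify first, then really free. */
void hooked_free(void *addr)
{
    if (addr != NULL && release_cb != NULL) {
        release_cb(addr);  /* e.g. drop any cached registration */
    }
    free(addr);
}

/* Example callback: only counts notifications. */
void count_release(void *addr)
{
    (void)addr;
    release_count++;
}
```

Installing count_release and then calling hooked_free() on a malloc'ed
block runs the callback once before the memory is returned; a NULL
pointer skips the callback.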


I don't think that the ldpreload tricks should ever be the default.   
I'd like to provide them, because on threaded builds (where the glibc  
hooks aren't available), they provide a much better solution than  
using ptmalloc2.  The sysadmin/user would have to set up his
environment to load the preload library.  If the module fails to  
preload, there is a facility in place for the memory code to tell the  
mpools that there is no memory manager intercept and to fall back to
the unpin after use mode.  Further, the ldpreload module (not yet  
committed, but half written) can run just fine even if the app  
started isn't an opal code (with little if any performance  
difference).  I don't envision us ever explicitly setting the  
LD_PRELOAD in the pls components or anything like that.  Instead, I  
see us documenting "Add this to your LD_PRELOAD or /etc/ld.so.preload
and OMPI goes faster".
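
The fallback described above (no intercept loaded means unpin after
use) can be sketched roughly as follows. This is an illustration under
assumptions, not the actual opal code: pin_policy_t, intercept_active,
intercept_announce and choose_pin_policy are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy names; not Open MPI identifiers. */
typedef enum {
    PIN_CACHE_REGISTRATIONS,  /* leave memory pinned, rely on the intercept */
    PIN_UNPIN_AFTER_USE       /* safe fallback: unpin once the transfer ends */
} pin_policy_t;

/* Would be set by the (hypothetical) preload module when it is loaded. */
bool intercept_active = false;

/* Called by the intercept module to announce that it is present. */
void intercept_announce(void)
{
    intercept_active = true;
}

/* The memory pools ask this before deciding whether to cache pins. */
pin_policy_t choose_pin_policy(void)
{
    return intercept_active ? PIN_CACHE_REGISTRATIONS : PIN_UNPIN_AFTER_USE;
}
```

The point of defaulting intercept_active to false is that a missing or
failed preload silently yields the slower but safe mode.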


  As you can tell, I am VERY leery of these sorts of tricks for a
production-grade bit of code.  If it is easy to decide at run-time
whether to use these tricks (w/o a performance penalty), this is a
different question.


Some of these will be very difficult to turn off at runtime (the  
LD_PRELOAD probably being the exception - you can at least turn that  
off any time before the application starts running).  However, I  
don't think this is a problem because the defaults are going to be so  
pessimistic that we shouldn't get in a situation where the user is  
going to have to turn them off.  I'm thinking big, annoying warnings  
in the installation document about turning the less-safe ones on.


Brian



Begin forwarded message:



From: Brian Barrett 
Date: August 12, 2005 7:47:45 PM MDT
To: Open MPI Developers 
Subject: [O-MPI devel] Memory manager changes
Reply-To: Open MPI Developers 

Hi all -

For those not on the telecon Tuesday, we finally broke down and
decided we needed to do all the system nastiness to intercept free()
and munmap() and the like for high speed interconnects so that we can
do pinned page caching and not take the pinning performance hit on
applications like NetPIPE (and, to be fair, many user applications).
Unlike LAM, however, we're going to try to make this not be the
center of all pain and suffering ;).  While we'll support the
ptmalloc2 trick that LAM and MPICH-gm use, it will not be on by
default and we're trying to find better alternatives.  Below are your
current choices for intercepting memory releases back to the
operating system.  The default is malloc_hooks on platforms that
support it when threads aren't enabled.  Otherwise the current
default is "none".
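
To make the pinned page caching idea concrete, here is a small sketch
of a registration cache that is invalidated when memory is released.
It is an assumption-laden illustration, not the mpool code: the fixed
slot count, reg_entry_t, and the reg_cache_* functions are hypothetical,
and the actual pin/unpin calls to the interconnect are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REG_CACHE_SLOTS 8  /* arbitrary size for the sketch */

typedef struct {
    uintptr_t base;   /* start of the pinned range */
    size_t    len;    /* length in bytes */
    bool      valid;
} reg_entry_t;

reg_entry_t reg_cache[REG_CACHE_SLOTS];

/* Record a pinned range; returns false if the cache is full. */
bool reg_cache_insert(uintptr_t base, size_t len)
{
    for (int i = 0; i < REG_CACHE_SLOTS; i++) {
        if (!reg_cache[i].valid) {
            reg_cache[i] = (reg_entry_t){ base, len, true };
            return true;
        }
    }
    return false;
}

/* True if [addr, addr+len) lies inside a cached pinned range,
 * i.e. the buffer can be used without pinning it again. */
bool reg_cache_lookup(uintptr_t addr, size_t len)
{
    for (int i = 0; i < REG_CACHE_SLOTS; i++) {
        const reg_entry_t *e = &reg_cache[i];
        if (e->valid && addr >= e->base && len <= e->len &&
            addr - e->base <= e->len - len) {
            return true;
        }
    }
    return false;
}

/* Called from the free()/munmap() intercept: drop every entry that
 * overlaps the released range (a real version would unpin it here).
 * Returns the number of entries dropped. */
int reg_cache_release(uintptr_t addr, size_t len)
{
    int dropped = 0;
    for (int i = 0; i < REG_CACHE_SLOTS; i++) {
        reg_entry_t *e = &reg_cache[i];
        if (e->valid && addr < e->base + e->len && e->base < addr + len) {
            e->valid = false;
            dropped++;
        }
    }
    return dropped;
}
```

Without the release intercept, a cached entry could outlive the memory
it describes, which is why the no-intercept configurations have to
unpin after each use instead of caching.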

In all cases, in addition to dealing with free() and realloc(), we
provide intercepts for munmap() to catch the user doing his own
memory management.  We may also want to intercept SysV shared memory
functions.

You can choose exactly which "memory manager" to use with the
--with-memory-manager=TYPE option to configure, where TYPE is one of
"ptmalloc2", "malloc_hooks", "darwin7", or "ldpreload".  Of course,
you can also use --without-memory-manager or
--with-memory-manager=none to completely disable the things.

* PTMALLOC2

   + Very fast implementation of the full malloc/free suite.
     Directly used by glibc as their memory manager.
   + Works properly in threaded environment
   + Only calls unpin callbacks when giving memory back to the
     OS (i.e., when sbrk() or munmap() are called)
   - Does not work properly in some situations

Re: [O-MPI devel] Fwd: Memory manager changes

2005-08-12 Thread Rich L. Graham

Sounds reasonable - I am for being able to turn on optional things
that will improve performance...

Thanks,
Rich

On Aug 12, 2005, at 9:14 PM, Brian Barrett wrote:


On Aug 12, 2005, at 9:43 PM, Rich L. Graham wrote:


Sounds like I got off the call a bit too early ;-)
   Can we choose to use standard platform libraries, or are we pinning
ourselves into a corner?  I.e., is this optional?


Yes - the code is all built around trying to use the standard
platform.  And yes, everything is optional.  In many cases (pretty
much everywhere but single threaded Linux), the default will be to
not do any memory manager tricks at all.  Of course, not having any
memory manager hooks lessens the performance of the BTLs since we
have to do pin/rdma pipelining, but that's the price we have to pay.


  What sort of problems are we getting into playing with pre-load
options?  I would be VERY careful here, and do plenty of testing,
especially with C++ codes, before you decide to do this.  We used to
use some of these tricks in LA-MPI, but backed off because of loader
ordering issues.


Agreed - I'm one of the ones who was very against doing it in the
first place :).  Currently, the default everywhere but single
threaded Linux is to not have any memory manager hooks at all.  On
single threaded Linux, we use the hooks provided by glibc for doing
"something" before the actual free/realloc occurs.  Because these are
official, recommended ways of doing things, they should work on any
C, C++, and Fortran codes, even if they are statically linked.  I've
tested them with C++ apps, and they work as the documentation implies
they would.

I don't think that the ldpreload tricks should ever be the default.
I'd like to provide them, because on threaded builds (where the glibc
hooks aren't available), they provide a much better solution than
using ptmalloc2.  The sysadmin/user would have to set up his
environment to load the preload library.  If the module fails to
preload, there is a facility in place for the memory code to tell the
mpools that there is no memory manager intercept and to fall back to
the unpin after use mode.  Further, the ldpreload module (not yet
committed, but half written) can run just fine even if the app
started isn't an opal code (with little if any performance
difference).  I don't envision us ever explicitly setting the
LD_PRELOAD in the pls components or anything like that.  Instead, I
see us documenting "Add this to your LD_PRELOAD or /etc/ld.so.preload
and OMPI goes faster".


  As you can tell, I am VERY leery of these sorts of tricks for a
production-grade bit of code.  If it is easy to decide at run-time
whether to use these tricks (w/o a performance penalty), this is a
different question.


Some of these will be very difficult to turn off at runtime (the
LD_PRELOAD probably being the exception - you can at least turn that
off any time before the application starts running).  However, I
don't think this is a problem because the defaults are going to be so
pessimistic that we shouldn't get in a situation where the user is
going to have to turn them off.  I'm thinking big, annoying warnings
in the installation document about turning the less-safe ones on.

Brian



Begin forwarded message:



From: Brian Barrett 
Date: August 12, 2005 7:47:45 PM MDT
To: Open MPI Developers 
Subject: [O-MPI devel] Memory manager changes
Reply-To: Open MPI Developers 

Hi all -

For those not on the telecon Tuesday, we finally broke down and
decided we needed to do all the system nastiness to intercept free()
and munmap() and the like for high speed interconnects so that we can
do pinned page caching and not take the pinning performance hit on
applications like NetPIPE (and, to be fair, many user applications).
Unlike LAM, however, we're going to try to make this not be the
center of all pain and suffering ;).  While we'll support the
ptmalloc2 trick that LAM and MPICH-gm use, it will not be on by
default and we're trying to find better alternatives.  Below are your
current choices for intercepting memory releases back to the
operating system.  The default is malloc_hooks on platforms that
support it when threads aren't enabled.  Otherwise the current
default is "none".

In all cases, in addition to dealing with free() and realloc(), we
provide intercepts for munmap() to catch the user doing his own
memory management.  We may also want to intercept SysV shared memory
functions.

You can choose exactly which "memory manager" to use with the
--with-memory-manager=TYPE option to configure, where TYPE is one of
"ptmalloc2", "malloc_hooks", "darwin7", or "ldpreload".  Of course,
you can also use --without-memory-manager or
--with-memory-manager=none to completely disable the things.

* PTMALLOC2

   + Very fast implementation of the full malloc/free suite.
     Directly used by glibc as their memory manager.
   + Works properly in threaded environment
   + Only calls unpin callbacks when giving memory back to the
 OS (i