Re: [OMPI devel] Improvements to "mpi_leave_pinned" behavior

2009-08-24 Thread Ashley Pittman
On Fri, 2009-08-21 at 10:41 -0400, Jeff Squyres wrote:
> Roland has pushed his new Linux "ummunotify" kernel module upstream
> (i.e., it's in his -next git branch):
> 
> http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commit;h=2fadea9acc19674c07ae7a9d90758f4b9b793940
> 
> It's not yet guaranteed that it will be accepted, but it looks good so
> far.  With some bug fixes from Pasha/Mellanox and Lenny+Mike/Voltaire,
> I think it's ready for widespread testing (I mailed some of you
> yesterday asking for specific testing).  I'm asking everyone to give
> the prototype code a whirl to shake out any remaining design bugs.

Good to hear this long-standing issue is getting the attention it
deserves; this will be a huge step forward once it's up and running.

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk



[OMPI devel] Improvements to "mpi_leave_pinned" behavior

2009-08-21 Thread Jeff Squyres
Roland has pushed his new Linux "ummunotify" kernel module upstream
(i.e., it's in his -next git branch):


http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commit;h=2fadea9acc19674c07ae7a9d90758f4b9b793940

It's not yet guaranteed that it will be accepted, but it looks good so
far.  With some bug fixes from Pasha/Mellanox and Lenny+Mike/Voltaire,
I think it's ready for widespread testing (I mailed some of you
yesterday asking for specific testing).  I'm asking everyone to give
the prototype code a whirl to shake out any remaining design bugs.


I describe the issue that we're fixing in my new MPI-themed blog:


http://blogs.cisco.com/ciscotalk/performance/comments/better_linux_memory_tracking

The Mercurial (HG) repository where this OMPI work is being done is here:

http://bitbucket.org/jsquyres/ummunot/

You need a very recent Linux kernel (2.6.31+) with Roland's
ummunotify module installed and loaded.  Configure the OMPI HG tree
with the "--enable-mca-no-build=memory-ptmalloc2" option to disable
ptmalloc2 and enable the ummunotify stuff.
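
For anyone scripting the test setup, here is a rough sketch of a check
for the 2.6.31+ kernel requirement before building.  The helper names
(parse_kernel_release, kernel_is_new_enough) are made up for
illustration and are not part of OMPI or the ummunotify code; the
script only looks at the kernel release string and does not verify
that the module itself is loaded.

```python
import platform
import re

# Minimum kernel version mentioned in this mail.
MIN_KERNEL = (2, 6, 31)


def parse_kernel_release(release):
    """Return the leading numeric part of a release string as a 3-tuple of ints.

    Hypothetical helper: "2.6.31-rc5" -> (2, 6, 31), "3.0" -> (3, 0, 0).
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def kernel_is_new_enough(release, minimum=MIN_KERNEL):
    """True if the release string is at or above the minimum version."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    release = platform.release()
    try:
        ok = kernel_is_new_enough(release)
    except ValueError:
        ok = False
    print("kernel %s: %s" % (release, "ok" if ok else "too old or unrecognized"))
```

Comparing tuples rather than strings avoids the usual trap where
"2.6.9" sorts after "2.6.31" lexically.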


This hack-ish "disable ptmalloc2" step is only necessary while we're
shaking out the design issues.  I'm halfway through merging the
ummunot+ptmalloc2 code into a new opal/mca/memory component named
"linux".  This component will choose at run time whether to use
ptmalloc2 or the ummunotify stuff (i.e., the --enable-mca-no-build...
step won't be necessary when all is said and done; a default OMPI
Linux build will do the Right Things).
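
To make the run-time choice concrete, here is a toy sketch of one way
such a selection could work.  This is not the component's actual
logic: the selection criterion (use ummunotify if its device node is
readable, otherwise fall back to ptmalloc2) and the device path
/dev/ummunotify are both assumptions made for illustration, and
choose_memory_backend is a made-up name.

```python
import os

# Assumed device node for the ummunotify module; not stated in this thread.
UMMUNOTIFY_DEVICE = "/dev/ummunotify"


def choose_memory_backend(device_path=UMMUNOTIFY_DEVICE):
    """Return "ummunotify" if the device node is readable, else "ptmalloc2".

    Illustrative only: the real component may use different criteria.
    """
    if os.path.exists(device_path) and os.access(device_path, os.R_OK):
        return "ummunotify"
    return "ptmalloc2"


if __name__ == "__main__":
    print("memory backend:", choose_memory_backend())
```

The point of deciding at run time is that one binary build can run on
both kinds of nodes: those with the module loaded and those without.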


Thanks.

--
Jeff Squyres
jsquy...@cisco.com