Re: [OMPI devel] MALLOC_MMAP_MAX (and MALLOC_MMAP_THRESHOLD)

2010-01-11 Thread Jeff Squyres
Arrgh -- if only the Linux kernel community had accepted ummunotify, this would now be a moot point (i.e., the argument would be solely with the OS/glibc, not the MPI!). On Jan 9, 2010, at 10:45 PM, Barrett, Brian W wrote: > We should absolutely not change this. For simple applications, yes,

Re: [OMPI devel] How can I achieve node fail over

2010-01-11 Thread Josh Hursey
On Jan 6, 2010, at 9:04 AM, Sai Sudheesh wrote: Hi, Just about two months ago I started experimenting with OpenMPI. I found this piece of software very interesting. How can I make this software fault tolerant? Depends on what you mean by fault tolerant. :) As of no

Re: [OMPI devel] Howto pause BTL's sending at runtime - hope mail is working again

2010-01-11 Thread Christoph Konersmann
Thanks a lot for your help! I will give it a try. Christoph Ralph Castain wrote: > You've got this a tad wrong, but that's okay - let me try to clarify a couple > of things that may help. > > First, you don't want to add this as a separate orted command. As you noted, > orte has no direct wa

Re: [OMPI devel] Howto pause BTL's sending at runtime - hope mail is working again

2010-01-11 Thread Jeff Squyres
Additionally, I believe that the FT system already does something like what you describe (although perhaps not exactly the same thing) -- there is a phase where the FT system pauses and quiesces all BTLs. Did you look at that part of the code, perchance, and see if it meets your needs? On Jan

Re: [OMPI devel] Howto pause BTL's sending at runtime - hope mail is working again

2010-01-11 Thread Josh Hursey
The ft_event() function that you mentioned is part of the larger fault tolerance infrastructure in Open MPI. You need to make sure to enable it before using it (if it is not enabled, many of the ft_event functions default to NULL). Add '--with-ft=cr' to your ./configure line and that will enabl

Re: [OMPI devel] How can I achieve node fail over

2010-01-11 Thread Sai Sudheesh
Hi Josh, First of all, thanks for your response. There were some typos in my mail that made it vague in places. Let me describe the scenarios mentioned in the previous mail in more detail. What I tried is as follows.

Re: [OMPI devel] How can I achieve node fail over

2010-01-11 Thread Ralph Castain
As Josh indicated, the current OMPI trunk won't do that at the moment. Josh and I are working on a side branch to integrate the OpenRCM methods with mpirun to provide an OMPI capability for those not running ORCM on their systems. What wasn't clear is your motivation. Are you trying to develop t