Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r21287

2009-05-26 Thread Rainer Keller
Dear all,
please note that with the separation of configure flags into project-related
sections, the one for openib-control-hdr-padding was moved.

As Jeff noted, it is most suitable in ompi_check_openib.m4.
However, I implemented it as an AC_ARG_ENABLE instead of an AC_ARG_WITH.
Ralph, this is used by LANL, correct?

With best regards,
Rainer


On Tuesday 26 May 2009 11:03:19 pm rusra...@osl.iu.edu wrote:
>  #
> +# Add padding to OpenIB header
> +#
> +AC_ARG_ENABLE([openib-control-hdr-padding],
> +[AC_HELP_STRING([--enable-openib-control-hdr-padding],
> +[Add padding bytes to the openib control header
> (default:disabled)])])

> -#
> -# Add padding to OpenIB header
> -#
> -AC_MSG_CHECKING([whether to add padding to the openib control header])
> -AC_ARG_WITH([openib-control-hdr-padding],
> - [AC_HELP_STRING([--with-openib-control-hdr-padding],
> - [Add padding bytes to the openib control header])])


-- 

Rainer Keller, PhD  Tel: +1 (865) 241-6293
Oak Ridge National Lab  Fax: +1 (865) 241-4811
PO Box 2008 MS 6164   Email: kel...@ornl.gov
Oak Ridge, TN 37831-2008    AIM/Skype: rusraink




Re: [OMPI devel] XML stdout/stderr

2009-05-26 Thread Ralph Castain
Guess I had just never seen that format before - thanks for clarifying!

I committed the revisions to the trunk in r21285 - see what you think...
Ralph


On Tue, May 26, 2009 at 1:55 PM, Greg Watson  wrote:

> Ralph,
> Both my proposals are correct XML and should be parsable by any conforming
> XML parser. Just changing the tags will not work because any text that
> contains "&", "<", or ">" will still confuse an XML parser.
>
> Regards,
>
> Greg
>
> On May 26, 2009, at 8:25 AM, Ralph Castain wrote:
>
> Yo Greg
>
> I'm slow, but it did hit me that there may be a simpler solution after all.
> I gather that the problem is that the user's output could have tags in it
> that match our own, thus causing tag-confusion. True?
>
> My concern is that our proposed solution generates pidgin-xml which could
> only ever be translated by a specially written parser. Kinda makes xml a
> little moot in ways.
>
> What if we simply change the name of our tags to something ompi-specific? I
> could tag things with , for example. This would follow the
> natural naming convention for internal variables, and would avoid any
> conflicts unless the user were truly stupid - in which case, the onus would
> be on them.
>
> Would that resolve the problem?
> Ralph
>
>
> On Tue, May 26, 2009 at 5:42 AM, Ralph Castain  wrote:
>
>>
>>
>> On Mon, May 25, 2009 at 9:10 AM, Greg Watson wrote:
>>
>>> Ralph,
>>>
>>> In life, nothing is ever easy...
>>
>>
>> :-)
>>
>>
>>>
>>>
>>> While the XML output is working well, I've come across an issue with
>>> stdout/stderr. Unfortunately it's not just enough to wrap it in tags,
>>> because it's possible that the output will contain XML formatting
>>> information. There are two ways to get around this. The easiest is to wrap
>>> the output in "<![CDATA[ ... ]]>". This has the benefit of being
>>> relatively easy, but will fail if the output contains the string "]]>". The
>>> other way is to replace all instances of "&", "<", and ">" with "&amp;",
>>> "&lt;", and "&gt;" respectively. This is safer, but requires more
>>> processing.
>>>
>>> Thoughts?
>>
>>
>> "Ick" immediately comes to mind, but is hardly helpful. :-)
>>
>> I am already doing some processing to deal with linefeeds in the middle of
>> output streams, so adding these three special chars isn't -that- big a deal.
>> I can have a test version for you in the next day or so (svn trunk) - I am
>> on reduced hours while moving my son (driving across country).
>>
>> Let's give that a try and see if it resolves the problem...
>>
>>
>>
>>>
>>>
>>> Greg
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] problem in the ORTE notifier framework

2009-05-26 Thread Jeff Squyres
Sure, I can set up a WebEx (with international dial-ins) if it would be
useful.



On May 26, 2009, at 7:24 PM, Ralph Castain wrote:

First, to answer Nadia's question: you will find that the init  
function for the module is already called when it is selected - see  
the code in orte/mca/notifier/base/notifier_base_select.c, lines 72-76 (in
the trunk).


It would be a good idea to tie into the sos work to avoid conflicts  
when it all gets merged back together, assuming that isn't a big  
problem for you.


As for Jeff's suggestion: dealing with the performance hit problem  
is why I suggested ORTE_NOTIFIER_VERBOSE, modeled after the  
OPAL_OUTPUT_VERBOSE model. The idea was to compile it in -only- when  
the system is built for it - maybe using a --with-notifier-verbose  
configuration option. Frankly, some organizations would happily pay  
a small performance penalty for the benefits.


I would personally recommend that the notifier framework keep the  
stats so things can be compact and self-contained. We still get  
atomicity by allowing each framework/component/whatever to specify the
threshold. Creating yet another system to do nothing more than track  
error/warning frequencies to decide whether or not to notify seems  
wasteful.


Perhaps worth a phone call to decide path forward?


On Tue, May 26, 2009 at 1:06 PM, Jeff Squyres   
wrote:

Nadia --

Sorry I didn't get to jump in on the other thread earlier.

We have made considerable changes to the notifier framework in a  
branch to better support "SOS" functionality:


   https://www.open-mpi.org/hg/auth/hgwebdir.cgi/jsquyres/opal-sos

Cisco and Indiana U. have been working on this branch for a while.   
A description of the SOS stuff is here:


   https://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages

As for setting up an external web server with hg, don't bother --  
just get an account at bitbucket.org.  They're free and allow you to  
host hg repositories there.  I've used bitbucket to collaborate on  
code before it hits OMPI's SVN trunk with both internal and external  
OMPI developers.


We can certainly move the opal-sos repo to bitbucket (or branch  
again off opal-sos to bitbucket -- whatever makes more sense) to  
facilitate collaborating with you.


Back on topic...

I'd actually suggest a combination of what has been discussed in the  
other thread.  The notifier can be the mechanism that actually sends  
the output message, but it doesn't have to be the mechanism that  
tracks the stats and decides when to output a message.  That can be  
separate logic, and therefore be more fine-grained (and potentially  
even specific to the MPI layer).


The Big Question will be how to do this with zero performance impact
when it is not being used. This has always been the difficult issue  
when trying to implement any kind of monitoring inside the core OMPI  
performance-sensitive paths.  Even adding individual branches has  
met with resistance (in performance-critical code paths)...





On May 26, 2009, at 10:59 AM, Nadia Derbey wrote:

Hi,

While having a look at the notifier framework under orte, I noticed that
the way it is written, the init routine for the selected module cannot
be called.

Attached is a small patch that fixes this issue.

Regards,
Nadia




--
Jeff Squyres
Cisco Systems

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] problem in the ORTE notifier framework

2009-05-26 Thread Ralph Castain
First, to answer Nadia's question: you will find that the init function for
the module is already called when it is selected - see the code in
orte/mca/notifier/base/notifier_base_select.c, lines 72-76 (in the trunk).

It would be a good idea to tie into the sos work to avoid conflicts when it
all gets merged back together, assuming that isn't a big problem for you.

As for Jeff's suggestion: dealing with the performance hit problem is why I
suggested ORTE_NOTIFIER_VERBOSE, modeled after the OPAL_OUTPUT_VERBOSE
model. The idea was to compile it in -only- when the system is built for it
- maybe using a --with-notifier-verbose configuration option. Frankly, some
organizations would happily pay a small performance penalty for the
benefits.

I would personally recommend that the notifier framework keep the stats so
things can be compact and self-contained. We still get atomicity by allowing
each framework/component/whatever to specify the threshold. Creating yet
another system to do nothing more than track error/warning frequencies to
decide whether or not to notify seems wasteful.

Perhaps worth a phone call to decide path forward?


On Tue, May 26, 2009 at 1:06 PM, Jeff Squyres  wrote:

> Nadia --
>
> Sorry I didn't get to jump in on the other thread earlier.
>
> We have made considerable changes to the notifier framework in a branch to
> better support "SOS" functionality:
>
>https://www.open-mpi.org/hg/auth/hgwebdir.cgi/jsquyres/opal-sos
>
> Cisco and Indiana U. have been working on this branch for a while.  A
> description of the SOS stuff is here:
>
>https://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages
>
> As for setting up an external web server with hg, don't bother -- just get
> an account at bitbucket.org.  They're free and allow you to host hg
> repositories there.  I've used bitbucket to collaborate on code before it
> hits OMPI's SVN trunk with both internal and external OMPI developers.
>
> We can certainly move the opal-sos repo to bitbucket (or branch again off
> opal-sos to bitbucket -- whatever makes more sense) to facilitate
> collaborating with you.
>
> Back on topic...
>
> I'd actually suggest a combination of what has been discussed in the other
> thread.  The notifier can be the mechanism that actually sends the output
> message, but it doesn't have to be the mechanism that tracks the stats and
> decides when to output a message.  That can be separate logic, and therefore
> be more fine-grained (and potentially even specific to the MPI layer).
>
> The Big Question will be how to do this with zero performance impact when it
> is not being used. This has always been the difficult issue when trying to
> implement any kind of monitoring inside the core OMPI performance-sensitive
> paths.  Even adding individual branches has met with resistance (in
> performance-critical code paths)...
>
>
>
>
> On May 26, 2009, at 10:59 AM, Nadia Derbey wrote:
>
>  Hi,
>>
>> While having a look at the notifier framework under orte, I noticed that
>> the way it is written, the init routine for the selected module cannot
>> be called.
>>
>> Attached is a small patch that fixes this issue.
>>
>> Regards,
>> Nadia
>>
>> 
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] RFC: Diagnostic framework for MPI

2009-05-26 Thread Eugene Loh

Nadia Derbey wrote:


What: Warn the administrator when unusual events are occurring too
frequently.

Why: Such unusual events might be the symptom of some problem that can
easily be fixed (by a better tuning, for example)
 

Before Sun HPC ClusterTools adopted the Open MPI code base (that is, CT6 
and earlier), there was some performance analysis support called 
MPProf.  See 
http://docs.sun.com/source/819-4134-10/profile.html#pgfId-999249 .  The 
key characteristic was supposed to be that it would be very easy to 
use: set an environment variable before running; run a report generator 
afterwards; the report is self-explanatory; data volumes were relatively 
small and so easy to manage.


One part in particular seemed germane to your RFC:  advice on 
implementation-specific environment variables.  See 
http://docs.sun.com/source/819-4134-10/profile.html#pgfId-1000209 .  Sun 
MPI had instrumentation embedded in it that looked for various 
"performance conditions".  Then, in post processing, the report 
generator would translate that information into user-actionable 
feedback.  At least, that was the concept.  The idea would be that all 
user feedback should include:


*) a brief explanation of what happened ("you ran out of postboxes... 
see Appendix A.1.b.23 of user guide if you really dare to understand 
what this means")
*) an estimate of how important this is ("we think this cost you 10% 
performance")
*) a concise description of what to do to improve performance and 
discussion of ramifications ("set the environment variable 
MPI_NUMPOSTBOX to 256 and rerun, this will cost about 50 Mbyte more 
memory per process")


The feedback need not be limited to environment variables or 
implementation-specific conditions.  E.g., perhaps one could detect when 
MPI_Ssend is used in place of MPI_Send and how much performance 
(unneeded synchronization) that cost.
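
As a rough illustration of the kind of check described above, here is a
minimal PMPI-layer sketch that counts MPI_Ssend calls and reports them at
MPI_Finalize. The counter, message text, and reporting style are purely
illustrative -- this is not MPProf or Open MPI code.

/*-------------------------------------------------------------------*/
#include <stdio.h>
#include <mpi.h>

static long ssend_count = 0;

/* Intercept MPI_Ssend via the PMPI profiling layer and count the calls.
 * (MPI-3 const signature; drop the const for older MPI headers.) */
int MPI_Ssend(const void *buf, int count, MPI_Datatype type,
              int dest, int tag, MPI_Comm comm)
{
    ssend_count++;
    return PMPI_Ssend(buf, count, type, dest, tag, comm);
}

/* Report at finalize, before the library shuts down. */
int MPI_Finalize(void)
{
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (ssend_count > 0) {
        fprintf(stderr, "[rank %d] MPI_Ssend called %ld times; consider "
                "MPI_Send if the synchronization is not needed\n",
                rank, ssend_count);
    }
    return PMPI_Finalize();
}
/*-------------------------------------------------------------------*/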


Re: [OMPI devel] RFC: MPI Interface Extensions Infrastructure

2009-05-26 Thread Josh Hursey

This was committed in r21272

Let me know if you have any problems with this commit.

Cheers,
Josh

On May 26, 2009, at 10:00 AM, Jeff Squyres wrote:


Exxxcellent.  :-)

On May 26, 2009, at 9:57 AM, Josh Hursey wrote:


As a heads up, this RFC expires today. We discussed it last week
during the teleconf and there were no objections.

I updated the HG branch to the current trunk, and, if there are not
objections, I will commit it to the trunk this afternoon [target  
1.5].


Cheers,
Josh

On May 11, 2009, at 2:37 PM, Josh Hursey wrote:

>
> What:  Infrastructure for MPI Interface Extensions
>
> Why:   Allow for experimentation with new interfaces without
> changing mpi.h
>
> Where: Temporary Mercurial branch (link below)
>   http://cgi.cs.indiana.edu/~jjhursey/public-tmp/hg/hgwebdir.cgi/mpi-ext/
>
> When:  Apply on trunk before branching for v1.5
>
> Timeout: 2 weeks - May 26, 2009 after the teleconf.
>
>  
-

> Description:
>
> At times developers want to expose non-standard, optional interfaces
> to users. These interfaces may represent MPI interfaces to be
> presented to the MPI Forum for standardization. In order to add such
> an interface to Open MPI you must add it directly to the ompi/mpi/
> directory and mpi.h. The combination of standard and non-standard
> interfaces inside mpi.h becomes troublesome to many developers and
> users.
>
> This branch allows developers to create a directory under ompi/
> mpiext/ for their extension (see ompi/mpiext/example in the HG
> branch for an example). By default, all extensions are disabled.
> They can be enabled through a configure option '--enable-ext='. This
> option takes a list of extensions that should be built as part of
> Open MPI. The user can include all of the extensions by referencing
> the appropriate header file (e.g., #include  ), and
> compiling with the normal wrapper compilers (e.g., mpicc).
>
> This infrastructure was designed and discussed on July 2, 2008 at an
> Open MPI developers meeting directly following an MPI Forum meeting.
> I have been developing this branch over the past few months under
> the advisement of Jeff and Brian. The C interface is functional and
> stable. The C++, F77, and F90 interfaces have not been completed.
> There are comments in the appropriate build system files
> (particularly config/ompi_ext.m4) that indicate where a developer
> would need to focus to finish support for these language bindings if
> needed. I have not completed them since I do not feel comfortable
> enough at this time with these languages to provide such
> functionality.
>
> I would like to bring this into the trunk before v1.5 branch. Having
> the infrastructure in the trunk will make it easier to maintain off-
> trunk experimental interface development.
>
> As part of this RFC, I will also update the 'MPI Extensions' wiki
> page to describe how a developer can get started using this
> infrastructure:
>  https://svn.open-mpi.org/trac/ompi/wiki/MPIExtensions
>
>  
-

> How to use the branch:
>
> Configure with this additional option:
> --enable-ext=example
>
> Compile the following sample MPI program with 'mpicc' per usual.
> /*-*/
> #include 
> #include 
> #include 
>
> int main(int argc, char *argv[])
> {
>int rank, size;
>
>MPI_Init(&argc, &argv);
>
>MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>OMPI_Progress("Go OMPI! Go!");
>
>MPI_Finalize();
>
>return 0;
> }
> /*-*/
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




--
Jeff Squyres
Cisco Systems

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] XML stdout/stderr

2009-05-26 Thread Greg Watson

Ralph,

Both my proposals are correct XML and should be parsable by any  
conforming XML parser. Just changing the tags will not work because  
any text that contains "&", "<", or ">" will still confuse an XML  
parser.


Regards,

Greg

On May 26, 2009, at 8:25 AM, Ralph Castain wrote:


Yo Greg

I'm slow, but it did hit me that there may be a simpler solution  
after all. I gather that the problem is that the user's output could  
have tags in it that match our own, thus causing tag-confusion. True?


My concern is that our proposed solution generates pidgin-xml which  
could only ever be translated by a specially written parser. Kinda  
makes xml a little moot in ways.


What if we simply change the name of our tags to something ompi- 
specific? I could tag things with , for example. This  
would follow the natural naming convention for internal variables,  
and would avoid any conflicts unless the user were truly stupid - in  
which case, the onus would be on them.


Would that resolve the problem?
Ralph


On Tue, May 26, 2009 at 5:42 AM, Ralph Castain   
wrote:



On Mon, May 25, 2009 at 9:10 AM, Greg Watson   
wrote:

Ralph,

In life, nothing is ever easy...

:-)



While the XML output is working well, I've come across an issue with  
stdout/stderr. Unfortunately it's not just enough to wrap it in  
tags, because it's possible that the output will contain XML  
formatting information. There are two ways to get around this. The  
easiest is to wrap the output in "<![CDATA[ ... ]]>". This has the
benefit of being relatively easy, but will fail if the output
contains the string "]]>". The other way is to replace all instances
of "&", "<", and ">" with "&amp;", "&lt;", and "&gt;" respectively.
This is safer, but requires more processing.
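
For reference, here is a minimal sketch of the escaping approach described
above, written as a stand-alone helper; the tag name and function are
illustrative only, not the actual mpirun code.

/*-------------------------------------------------------------------*/
#include <stdio.h>

/* Write user output with the three XML special characters escaped so it
 * can sit inside the surrounding tags without confusing an XML parser. */
static void xml_escape_write(FILE *fp, const char *text)
{
    for (const char *p = text; *p != '\0'; ++p) {
        switch (*p) {
        case '&': fputs("&amp;", fp); break;
        case '<': fputs("&lt;",  fp); break;
        case '>': fputs("&gt;",  fp); break;
        default:  fputc(*p, fp);      break;
        }
    }
}

int main(void)
{
    fputs("<stdout rank=\"0\">", stdout);      /* tag name is illustrative */
    xml_escape_write(stdout, "a < b && b > c");
    fputs("</stdout>\n", stdout);
    return 0;
}
/*-------------------------------------------------------------------*/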


Thoughts?

"Ick" immediately comes to mind, but is hardly helpful. :-)

I am already doing some processing to deal with linefeeds in the  
middle of output streams, so adding these three special chars isn't - 
that- big a deal. I can have a test version for you in the next day  
or so (svn trunk) - I am on reduced hours while moving my son  
(driving across country).


Let's give that a try and see if it resolves the problem...




Greg
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] problem in the ORTE notifier framework

2009-05-26 Thread Jeff Squyres

Nadia --

Sorry I didn't get to jump in on the other thread earlier.

We have made considerable changes to the notifier framework in a  
branch to better support "SOS" functionality:


https://www.open-mpi.org/hg/auth/hgwebdir.cgi/jsquyres/opal-sos

Cisco and Indiana U. have been working on this branch for a while.  A  
description of the SOS stuff is here:


https://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages

As for setting up an external web server with hg, don't bother -- just  
get an account at bitbucket.org.  They're free and allow you to host  
hg repositories there.  I've used bitbucket to collaborate on code  
before it hits OMPI's SVN trunk with both internal and external OMPI  
developers.


We can certainly move the opal-sos repo to bitbucket (or branch again  
off opal-sos to bitbucket -- whatever makes more sense) to facilitate  
collaborating with you.


Back on topic...

I'd actually suggest a combination of what has been discussed in the  
other thread.  The notifier can be the mechanism that actually sends  
the output message, but it doesn't have to be the mechanism that  
tracks the stats and decides when to output a message.  That can be  
separate logic, and therefore be more fine-grained (and potentially  
even specific to the MPI layer).


The Big Question will be how to do this with zero performance impact when
it is not being used. This has always been the difficult issue when  
trying to implement any kind of monitoring inside the core OMPI  
performance-sensitive paths.  Even adding individual branches has met  
with resistance (in performance-critical code paths)...




On May 26, 2009, at 10:59 AM, Nadia Derbey wrote:


Hi,

While having a look at the notifier framework under orte, I noticed that
the way it is written, the init routine for the selected module cannot
be called.

Attached is a small patch that fixes this issue.

Regards,
Nadia





--
Jeff Squyres
Cisco Systems



[OMPI devel] problem in the ORTE notifier framework

2009-05-26 Thread Nadia Derbey
Hi,

While having a look at the notifier framework under orte, I noticed that
the way it is written, the init routine for the selected module cannot
be called.

Attached is a small patch that fixes this issue.

Regards,
Nadia
ORTE notifier module init routine is never called: orte_notifier.init checking
should be done after orte_notifier has been set.


diff -r 876c02c65058 orte/mca/notifier/base/notifier_base_select.c
--- a/orte/mca/notifier/base/notifier_base_select.c	Mon May 25 14:17:38 2009 +0200
+++ b/orte/mca/notifier/base/notifier_base_select.c	Tue May 26 17:00:28 2009 +0200
@@ -69,17 +69,16 @@ int orte_notifier_base_select(void)
         goto cleanup;
     }
 
+    /* Save the winner */
+    orte_notifier = *best_module;
+
     if (NULL != orte_notifier.init) {
         /* if an init function is provided, use it */
         if (ORTE_SUCCESS != (ret = orte_notifier.init()) ) {
             exit_status = ret;
-            goto cleanup;
         }
     }
 
-    /* Save the winner */
-    orte_notifier = *best_module;
-
  cleanup:
     return exit_status;
 }
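
A simplified, stand-alone model of what the reordering achieves -- not the
actual ORTE code, just the selection pattern with trimmed-down names:

/*-------------------------------------------------------------------*/
#include <stdio.h>

typedef struct {
    int (*init)(void);
} notifier_module_t;

/* The global starts out empty, like orte_notifier before selection. */
static notifier_module_t orte_notifier = { NULL };

static int example_init(void)
{
    printf("module init called\n");
    return 0;
}

int main(void)
{
    notifier_module_t best_module = { example_init };

    /* Save the winner first (this is what the patch moves up) ... */
    orte_notifier = best_module;

    /* ... so that this check sees the selected module's init function
     * instead of the still-unset global. */
    if (NULL != orte_notifier.init) {
        orte_notifier.init();
    }
    return 0;
}
/*-------------------------------------------------------------------*/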


Re: [OMPI devel] RFC: MPI Interface Extensions Infrastructure

2009-05-26 Thread Jeff Squyres

Exxxcellent.  :-)

On May 26, 2009, at 9:57 AM, Josh Hursey wrote:


As a heads up, this RFC expires today. We discussed it last week
during the teleconf and there were no objections.

I updated the HG branch to the current trunk, and, if there are not
objections, I will commit it to the trunk this afternoon [target 1.5].

Cheers,
Josh

On May 11, 2009, at 2:37 PM, Josh Hursey wrote:

>
> What:  Infrastructure for MPI Interface Extensions
>
> Why:   Allow for experimentation with new interfaces without
> changing mpi.h
>
> Where: Temporary Mercurial branch (link below)
>   http://cgi.cs.indiana.edu/~jjhursey/public-tmp/hg/hgwebdir.cgi/mpi-ext/
>
> When:  Apply on trunk before branching for v1.5
>
> Timeout: 2 weeks - May 26, 2009 after the teleconf.
>
>  
-

> Description:
>
> At times developers want to expose non-standard, optional interfaces
> to users. These interfaces may represent MPI interfaces to be
> presented to the MPI Forum for standardization. In order to add such
> an interface to Open MPI you must add it directly to the ompi/mpi/
> directory and mpi.h. The combination of standard and non-standard
> interfaces inside mpi.h becomes troublesome to many developers and
> users.
>
> This branch allows developers to create a directory under ompi/
> mpiext/ for their extension (see ompi/mpiext/example in the HG
> branch for an example). By default, all extensions are disabled.
> They can be enabled through a configure option '--enable-ext='. This
> option takes a list of extensions that should be built as part of
> Open MPI. The user can include all of the extensions by referencing
> the appropriate header file (e.g., #include  ), and
> compiling with the normal wrapper compilers (e.g., mpicc).
>
> This infrastructure was designed and discussed on July 2, 2008 at an
> Open MPI developers meeting directly following an MPI Forum meeting.
> I have been developing this branch over the past few months under
> the advisement of Jeff and Brian. The C interface is functional and
> stable. The C++, F77, and F90 interfaces have not been completed.
> There are comments in the appropriate build system files
> (particularly config/ompi_ext.m4) that indicate where a developer
> would need to focus to finish support for these language bindings if
> needed. I have not completed them since I do not feel comfortable
> enough at this time with these languages to provide such
> functionality.
>
> I would like to bring this into the trunk before v1.5 branch. Having
> the infrastructure in the trunk will make it easier to maintain off-
> trunk experimental interface development.
>
> As part of this RFC, I will also update the 'MPI Extensions' wiki
> page to describe how a developer can get started using this
> infrastructure:
>  https://svn.open-mpi.org/trac/ompi/wiki/MPIExtensions
>
>  
-

> How to use the branch:
>
> Configure with this additional option:
> --enable-ext=example
>
> Compile the following sample MPI program with 'mpicc' per usual.
> /*-*/
> #include 
> #include 
> #include 
>
> int main(int argc, char *argv[])
> {
>int rank, size;
>
>MPI_Init(&argc, &argv);
>
>MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>OMPI_Progress("Go OMPI! Go!");
>
>MPI_Finalize();
>
>return 0;
> }
> /*-*/
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] RFC: MPI Interface Extensions Infrastructure

2009-05-26 Thread Josh Hursey
As a heads up, this RFC expires today. We discussed it last week  
during the teleconf and there were no objections.


I updated the HG branch to the current trunk, and, if there are not  
objections, I will commit it to the trunk this afternoon [target 1.5].


Cheers,
Josh

On May 11, 2009, at 2:37 PM, Josh Hursey wrote:



What:  Infrastructure for MPI Interface Extensions

Why:   Allow for experimentation with new interfaces without  
changing mpi.h


Where: Temporary Mercurial branch (link below)
  http://cgi.cs.indiana.edu/~jjhursey/public-tmp/hg/hgwebdir.cgi/mpi-ext/

When:  Apply on trunk before branching for v1.5

Timeout: 2 weeks - May 26, 2009 after the teleconf.

-
Description:

At times developers want to expose non-standard, optional interfaces  
to users. These interfaces may represent MPI interfaces to be  
presented to the MPI Forum for standardization. In order to add such  
an interface to Open MPI you must add it directly to the ompi/mpi/  
directory and mpi.h. The combination of standard and non-standard  
interfaces inside mpi.h becomes troublesome to many developers and  
users.


This branch allows developers to create a directory under ompi/ 
mpiext/ for their extension (see ompi/mpiext/example in the HG  
branch for an example). By default, all extensions are disabled.  
They can be enabled through a configure option '--enable-ext='. This  
option takes a list of extensions that should be built as part of  
Open MPI. The user can include all of the extensions by referencing  
the appropriate header file (e.g., #include  ), and  
compiling with the normal wrapper compilers (e.g., mpicc).


This infrastructure was designed and discussed on July 2, 2008 at an  
Open MPI developers meeting directly following an MPI Forum meeting.  
I have been developing this branch over the past few months under  
the advisement of Jeff and Brian. The C interface is functional and  
stable. The C++, F77, and F90 interfaces have not been completed.  
There are comments in the appropriate build system files  
(particularly config/ompi_ext.m4) that indicate where a developer  
would need to focus to finish support for these language bindings if  
needed. I have not completed them since I do not feel comfortable  
enough at this time with these languages to provide such  
functionality.


I would like to bring this into the trunk before v1.5 branch. Having  
the infrastructure in the trunk will make it easier to maintain off- 
trunk experimental interface development.


As part of this RFC, I will also update the 'MPI Extensions' wiki  
page to describe how a developer can get started using this  
infrastructure:

 https://svn.open-mpi.org/trac/ompi/wiki/MPIExtensions

-
How to use the branch:

Configure with this additional option:
--enable-ext=example

Compile the following sample MPI program with 'mpicc' per usual.
/*-*/
#include 
#include 
#include 

int main(int argc, char *argv[])
{
   int rank, size;

   MPI_Init(&argc, &argv);

   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   MPI_Comm_size(MPI_COMM_WORLD, &size);

   OMPI_Progress("Go OMPI! Go!");

   MPI_Finalize();

   return 0;
}
/*-*/

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] MTT usage

2009-05-26 Thread Jeff Squyres

The ompi-core-testers branch has now been deleted.


On Apr 25, 2009, at 7:55 AM, Jeff Squyres wrote:


OMPI testing organizations:

*** This is an end-of-life announcement for the "ompi-core-testers"  
MTT SVN branch.  This branch will be deleted on 25 May, 2009.


*** Also note that new development will soon be occurring on the MTT  
SVN trunk such that it may become unstable at times.  We encourage  
all OMPI testing organizations using the MTT trunk to move to the  
official /v3.0/ branch (see below).


All organizations are strongly encouraged to do the following:

1. Migrate to use the new https://svn.open-mpi.org/svn/mtt/branches/v3.0/ 
 branch.  This branch is a copy of MTT's trunk as of 24 April 2009.   
This branch is therefore the newest version of MTT, but has been in  
production use by Cisco, Sun, and Indiana U. (and others) for many  
months.  It is quite stable and is the preferred version to use.   
Its INI file syntax is slightly different than the old "ompi-core- 
testers" branch; you'll likely need to update your syntax.


2. If you are currently using the ompi-core-testers branch and  
cannot upgrade your INI file syntax, please move to the https://svn.open-mpi.org/svn/mtt/branches/v2.0/ 
 branch.  It is a copy of the ompi-core-testers branch as of 24  
April 2009, so nothing has changed.  The name is simply more  
consistent and allows a clear versioning path forward.


Thanks.

--
Jeff Squyres
Cisco Systems

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Jeff Squyres
Cisco Systems



[OMPI devel] Remove IMB 2.3 from ompi-tests?

2009-05-26 Thread Jeff Squyres
We've had IMB 3.1 in the ompi-tests svn for a long time; it's what I  
run in my nightly MTT.  I just uploaded 3.2 as well, and will be  
switching my nightly MTT to use it.


*** Note that I have applied a custom bug fix to IMB_window.c in  
3.1/3.2 to make the code function properly -- otherwise OMPI  
[correctly] aborts right near the beginning (per MPI-2.1 11.2.1).  I've
notified Intel of the fix; they're examining it.


Is it time to remove IMB 2.3 from ompi-tests?  Or, more specifically,  
is there any reason to keep 2.3 around?


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] XML stdout/stderr

2009-05-26 Thread Ralph Castain
Yo Greg

I'm slow, but it did hit me that there may be a simpler solution after all.
I gather that the problem is that the user's output could have tags in it
that match our own, thus causing tag-confusion. True?

My concern is that our proposed solution generates pidgin-xml which could
only ever be translated by a specially written parser. Kinda makes xml a
little moot in ways.

What if we simply change the name of our tags to something ompi-specific? I
could tag things with , for example. This would follow the
natural naming convention for internal variables, and would avoid any
conflicts unless the user were truly stupid - in which case, the onus would
be on them.

Would that resolve the problem?
Ralph


On Tue, May 26, 2009 at 5:42 AM, Ralph Castain  wrote:

>
>
> On Mon, May 25, 2009 at 9:10 AM, Greg Watson wrote:
>
>> Ralph,
>>
>> In life, nothing is ever easy...
>
>
> :-)
>
>
>>
>>
>> While the XML output is working well, I've come across an issue with
>> stdout/stderr. Unfortunately it's not just enough to wrap it in tags,
>> because it's possible that the output will contain XML formatting
>> information. There are two ways to get around this. The easiest is to wrap
>> the output in "<![CDATA[ ... ]]>". This has the benefit of being
>> relatively easy, but will fail if the output contains the string "]]>". The
>> other way is to replace all instances of "&", "<", and ">" with "&amp;",
>> "&lt;", and "&gt;" respectively. This is safer, but requires more
>> processing.
>>
>> Thoughts?
>
>
> "Ick" immediately comes to mind, but is hardly helpful. :-)
>
> I am already doing some processing to deal with linefeeds in the middle of
> output streams, so adding these three special chars isn't -that- big a deal.
> I can have a test version for you in the next day or so (svn trunk) - I am
> on reduced hours while moving my son (driving across country).
>
> Let's give that a try and see if it resolves the problem...
>
>
>
>>
>>
>> Greg
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>


Re: [OMPI devel] RFC: Diagnostic framework for MPI

2009-05-26 Thread Nadia Derbey
On Tue, 2009-05-26 at 05:35 -0600, Ralph Castain wrote:
> Hi Nadia
> 
> We actually have a framework in the system for this purpose, though it
> might require some minor modifications to do precisely what you
> describe. It is the ORTE "notifier" framework - you will find it at
> orte/mca/notifier. There are several components, each of which
> supports a different notification mechanism (e.g., message into the
> sys log, smtp, and even "twitter").

Ralph,

Thanks a lot for your detailed answer. I'll have a look at the notifier
framework to see if it could serve our purpose. Actually, from what you
describe, it looks like it does.

Regards,
Nadia
> 
> The system works by adding orte_notifier calls to the OMPI code
> wherever we deem it advisable to alert someone. For example, if we
> think a sys admin might want to be alerted when the number of IB send
> retries exceeds some limit, we add a call to orte_notifier to the IB
> code with:
> 
> if (#retries > threshold) {
> orte_notifier.xxx();
> }
> 
> I believe we could easily extend this to support your proposed
> functionality. A couple of possibilities that immediately spring to
> mind would be:
> 
> 1. you could create a new component (or we could modify the existing
> ones) that tracks how many times it is called for a given error, and
> only actually issues a notification for that specific error when the
> count exceeds a threshold. The negative to this approach is that the
> threshold would be uniform across all errors.
> 
> 2. we could extend the current notifier APIs to add a threshold count
> upon which the notification is to be sent, perhaps creating a new
> macro ORTE_NOTIFIER_VERBOSE that takes the threshold as one of its
> arguments. We could then let each OMPI framework have a new
> "threshold" MCA param, thus allowing the sys admins to "tune" the
> frequency of error reporting by framework. Of course, we could let
> them get as detailed here as you want - they could even have
> "threshold" params for each component, function, or whatever. This
> would be combined with #1 above to alert only when the count exceeded
> the threshold for that specific error message.
> 
> I'm sure you and others will come up with additional (probably better)
> ways of implementing this extension. My point here was simply to
> ensure you knew that the basic mechanism already exists, and to
> stimulate some thought as to how to use it for your proposed purpose.
> 
> I would be happy to help you do so as this is something we (LANL) have
> put at a high priority - our sys admins on the large clusters really
> need the help.
> 
> HTH
> Ralph
> 
> 
> On Mon, May 25, 2009 at 11:33 PM, Nadia Derbey 
> wrote:
> What: Warn the administrator when unusual events are occurring
> too
> frequently.
> 
> Why: Such unusual events might be the symptom of some problem
> that can
> easily be fixed (by a better tuning, for example)
> 
> Where: Adds a new ompi framework
> 
> ---
> 
> Description:
> 
> The objective of the Open MPI library is to make applications
> run to
> completion, given that no fatal error is encountered.
> In some situations, unusual events may occur. Since these
> events are not
> considered to be fatal enough, the library arbitrarily chooses
> to bypass
> them using a software mechanism, instead of actually stopping
> the
> application. But even though this choice helps in completing
> the
> application, it may frequently result in significant
> performance
> degradation. This is not an issue if such “unusual events”
> don't occur
> too frequently. But if they actually do, that might be
> representative of
> a real problem that could sometimes be easily avoided.
> 
> For example, when mca_pml_ob1_send_request_start() starts a
> send request
> and faces a resource shortage, it silently calls
> add_request_to_send_pending() to queue that send request into
> the list
> of pending send requests in order to process it later on. If
> an adapting
> mechanism is not provided at runtime to increase the receive
> queue
> length, at least a message can be sent to the administrator to
> let him
> do the tuning by hand before the next run.
> 
> We had a look at other tracing utilities (like PMPI, PERUSE,
> VT), but
> found them either too high level or too intrusive at the
> application
> level.
> 
> The “diagnostic framework” we'd like to propose would help
> capturing
> such “unusual events” and tracing them, while having a very
> low impact
> on the performan

Re: [OMPI devel] XML stdout/stderr

2009-05-26 Thread Ralph Castain
On Mon, May 25, 2009 at 9:10 AM, Greg Watson  wrote:

> Ralph,
>
> In life, nothing is ever easy...


:-)


>
>
> While the XML output is working well, I've come across an issue with
> stdout/stderr. Unfortunately it's not just enough to wrap it in tags,
> because it's possible that the output will contain XML formatting
> information. There are two ways to get around this. The easiest is to wrap
> the output in "<![CDATA[ ... ]]>". This has the benefit of being
> relatively easy, but will fail if the output contains the string "]]>". The
> other way is to replace all instances of "&", "<", and ">" with "&amp;",
> "&lt;", and "&gt;" respectively. This is safer, but requires more
> processing.
>
> Thoughts?


"Ick" immediately comes to mind, but is hardly helpful. :-)

I am already doing some processing to deal with linefeeds in the middle of
output streams, so adding these three special chars isn't -that- big a deal.
I can have a test version for you in the next day or so (svn trunk) - I am
on reduced hours while moving my son (driving across country).

Let's give that a try and see if it resolves the problem...



>
>
> Greg
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] RFC: Diagnostic framework for MPI

2009-05-26 Thread Ralph Castain
Hi Nadia

We actually have a framework in the system for this purpose, though it might
require some minor modifications to do precisely what you describe. It is
the ORTE "notifier" framework - you will find it at orte/mca/notifier. There
are several components, each of which supports a different notification
mechanism (e.g., message into the sys log, smtp, and even "twitter").

The system works by adding orte_notifier calls to the OMPI code wherever we
deem it advisable to alert someone. For example, if we think a sys admin
might want to be alerted when the number of IB send retries exceeds some
limit, we add a call to orte_notifier to the IB code with:

if (#retries > threshold) {
orte_notifier.xxx();
}

I believe we could easily extend this to support your proposed
functionality. A couple of possibilities that immediately spring to mind
would be:

1. you could create a new component (or we could modify the existing ones)
that tracks how many times it is called for a given error, and only actually
issues a notification for that specific error when the count exceeds a
threshold. The negative to this approach is that the threshold would be
uniform across all errors.

2. we could extend the current notifier APIs to add a threshold count upon
which the notification is to be sent, perhaps creating a new macro
ORTE_NOTIFIER_VERBOSE that takes the threshold as one of its arguments. We
could then let each OMPI framework have a new "threshold" MCA param, thus
allowing the sys admins to "tune" the frequency of error reporting by
framework. Of course, we could let them get as detailed here as you want -
they could even have "threshold" params for each component, function, or
whatever. This would be combined with #1 above to alert only when the count
exceeded the threshold for that specific error message.
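
To make the idea concrete, here is a rough, stand-alone sketch of what such a
threshold-gated macro could look like. The names and behavior are
hypothetical -- the real notifier API may differ, and the threshold would
presumably come from a per-framework MCA parameter rather than a literal.

/*-------------------------------------------------------------------*/
#include <stdio.h>

/* Hypothetical stand-in for the real notifier call (orte_notifier.xxx()). */
static void notifier_send(const char *msg)
{
    fprintf(stderr, "notifier: %s\n", msg);
}

/* Fire the notifier only after the event has been seen 'threshold' times.
 * Each call site gets its own static counter, so the count is per event. */
#define ORTE_NOTIFIER_VERBOSE(threshold, msg)               \
    do {                                                    \
        static unsigned long count_ = 0;                    \
        if (++count_ >= (unsigned long) (threshold)) {      \
            notifier_send(msg);                             \
            count_ = 0;   /* re-arm for the next batch */   \
        }                                                   \
    } while (0)

int main(void)
{
    /* e.g. called from the IB send-retry path */
    for (int i = 0; i < 2500; ++i) {
        ORTE_NOTIFIER_VERBOSE(1000, "IB send retries hit threshold");
    }
    return 0;
}
/*-------------------------------------------------------------------*/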

I'm sure you and others will come up with additional (probably better) ways
of implementing this extension. My point here was simply to ensure you knew
that the basic mechanism already exists, and to stimulate some thought as to
how to use it for your proposed purpose.

I would be happy to help you do so as this is something we (LANL) have put
at a high priority - our sys admins on the large clusters really need the
help.

HTH
Ralph


On Mon, May 25, 2009 at 11:33 PM, Nadia Derbey wrote:

> What: Warn the administrator when unusual events are occurring too
> frequently.
>
> Why: Such unusual events might be the symptom of some problem that can
> easily be fixed (by a better tuning, for example)
>
> Where: Adds a new ompi framework
>
> ---
>
> Description:
>
> The objective of the Open MPI library is to make applications run to
> completion, given that no fatal error is encountered.
> In some situations, unusual events may occur. Since these events are not
> considered to be fatal enough, the library arbitrarily chooses to bypass
> them using a software mechanism, instead of actually stopping the
> application. But even though this choice helps in completing the
> application, it may frequently result in significant performance
> degradation. This is not an issue if such “unusual events” don't occur
> too frequently. But if they actually do, that might be representative of
> a real problem that could sometimes be easily avoided.
>
> For example, when mca_pml_ob1_send_request_start() starts a send request
> and faces a resource shortage, it silently calls
> add_request_to_send_pending() to queue that send request into the list
> of pending send requests in order to process it later on. If an adapting
> mechanism is not provided at runtime to increase the receive queue
> length, at least a message can be sent to the administrator to let him
> do the tuning by hand before the next run.
>
> We had a look at other tracing utilities (like PMPI, PERUSE, VT), but
> found them either too high level or too intrusive at the application
> level.
>
> The “diagnostic framework” we'd like to propose would help capturing
> such “unusual events” and tracing them, while having a very low impact
> on the performances. This is obtained by defining tracing routines that
> can be called from the ompi code. The collected events are aggregated
> per MPI process and only traced if a threshold has been reached. Another
> threshold (time threshold) can be used to condition subsequent traces
> generation for an already traced event.
>
> This is obtained by defining 2 mca parameters and a rule:
> . the count threshold C
> . the time delay T
> The rule is: an event will only be traced if it happened C times, and it
> won't be traced more than once every T seconds.
>
> Thus, events happening at a very low rate will never generate a trace
> except one at MPI_Finalize summarizing:
> [time] At finalize : 23 times : pre-allocated buffers all full, calling
> malloc
>
> Those happening "a little too much" will sometimes generate a trace
> saying something like:
> [time] 1000 warnin

[OMPI devel] RFC: Diagnostic framework for MPI

2009-05-26 Thread Nadia Derbey
What: Warn the administrator when unusual events are occurring too
frequently.

Why: Such unusual events might be the symptom of some problem that can
easily be fixed (by a better tuning, for example)

Where: Adds a new ompi framework

---

Description:

The objective of the Open MPI library is to make applications run to
completion, given that no fatal error is encountered.
In some situations, unusual events may occur. Since these events are not
considered to be fatal enough, the library arbitrarily chooses to bypass
them using a software mechanism, instead of actually stopping the
application. But even though this choice helps in completing the
application, it may frequently result in significant performance
degradation. This is not an issue if such “unusual events” don't occur
too frequently. But if they actually do, that might be representative of
a real problem that could sometimes be easily avoided.

For example, when mca_pml_ob1_send_request_start() starts a send request
and faces a resource shortage, it silently calls
add_request_to_send_pending() to queue that send request into the list
of pending send requests in order to process it later on. If an adapting
mechanism is not provided at runtime to increase the receive queue
length, at least a message can be sent to the administrator to let him
do the tuning by hand before the next run.

We had a look at other tracing utilities (like PMPI, PERUSE, VT), but
found them either too high level or too intrusive at the application
level.

The “diagnostic framework” we'd like to propose would help capture
such “unusual events” and trace them, while having a very low impact
on performance. This is obtained by defining tracing routines that
can be called from the ompi code. The collected events are aggregated
per MPI process and only traced if a threshold has been reached. Another
threshold (time threshold) can be used to condition subsequent traces
generation for an already traced event.

This is obtained by defining 2 mca parameters and a rule:
. the count threshold C
. the time delay T
The rule is: an event will only be traced if it happened C times, and it
won't be traced more than once every T seconds.

Thus, events happening at a very low rate will never generate a trace
except one at MPI_Finalize summarizing:
[time] At finalize : 23 times : pre-allocated buffers all full, calling
malloc

Those happening "a little too much" will sometimes generate a trace
saying something like:
[time] 1000 warnings : could not send in openib now, delaying
[time+12345 sec] 1000 warnings : could not send in openib now, delaying

And events occurring at a high frequency will only generate a message
every T seconds saying:
[time] 1000 warnings : adding buffers in the SRQ
[time+T]   1,234,567 warnings (in T seconds) : adding buffers in the SRQ
[time+2*T] 2,345,678 warnings (in T seconds) : adding buffers in the SRQ

The count threshold and time delay are defined per event.
They can also be defined as MCA parameters. In that case, the mca
parameter value overrides the per event values.
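
A minimal stand-alone sketch of the per-event rule described above (count
threshold C, time delay T); the structure and trace call are illustrative
only, not the proposed implementation itself.

/*-------------------------------------------------------------------*/
#include <stdio.h>
#include <time.h>

/* Per-event state: trace only once the event has occurred C times, and
 * never more often than once every T seconds. */
typedef struct {
    const char   *msg;
    unsigned long count;        /* occurrences since the last trace */
    unsigned long threshold_c;  /* count threshold C                */
    time_t        delay_t;      /* time delay T, in seconds         */
    time_t        last_trace;   /* 0 until the first trace          */
} diag_event_t;

static void diag_record(diag_event_t *ev)
{
    time_t now;

    ev->count++;
    if (ev->count < ev->threshold_c) {
        return;                 /* below the count threshold */
    }
    now = time(NULL);
    if (0 != ev->last_trace && now - ev->last_trace < ev->delay_t) {
        return;                 /* traced too recently */
    }
    fprintf(stderr, "[%ld] %lu warnings : %s\n",
            (long) now, ev->count, ev->msg);
    ev->last_trace = now;
    ev->count = 0;
}

int main(void)
{
    diag_event_t srq = { "adding buffers in the SRQ", 0, 1000, 60, 0 };
    int i;

    for (i = 0; i < 5000; ++i) {
        diag_record(&srq);      /* would be called from the hot path */
    }
    return 0;
}
/*-------------------------------------------------------------------*/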

The following information is also traced:
  . job family
  . the local job id
  . the job vpid

Another aspect of performance savings is that a mechanism à la
show_help() can be used to let the HNP actually do the job.

We have started implementing this feature, so patches are available if
needed. We are currently trying to set up hgweb on an external server.

Since I'm an Open MPI newbie, I'm submitting this RFC to get your
opinion about its usefulness, or even to find out whether an existing
mechanism already does this job.

Regards,
Nadia

-- 
Nadia Derbey