The code in v1.3 is definitely different from the trunk, as it lags quite a bit behind. However, the trunk does include the code I referenced.
Not sure why the hg mirror wouldn't have it. I would have to defer to Jeff on that question - could be a bug in the update macro that maintains the mirror?

I haven't checked the opal_sos branch to see if it has the code in it, but I would have thought those guys were tracking the trunk closely - that code was committed in r19209.

Ralph

On Thu, May 28, 2009 at 1:45 AM, Sylvain Jeaugey <sylvain.jeau...@bull.net> wrote:

> To be more complete, we pull Hg from
> http://www.open-mpi.org/hg/hgwebdir.cgi/ompi-svn-mirror/ ; are we mistaken
> ?
>
> If not, the code in v1.3 seems to be different from the code in the trunk
> ...
>
> Sylvain
>
>
> On Thu, 28 May 2009, Nadia Derbey wrote:
>
> On Tue, 2009-05-26 at 17:24 -0600, Ralph Castain wrote:
>>
>>> First, to answer Nadia's question: you will find that the init
>>> function for the module is already called when it is selected - see
>>> the code in orte/mca/base/notifier_base_select.c, lines 72-76 (in the
>>> trunk).
>>>
>>
>> Strange? Our repository is a clone of the trunk?
>>
>>>
>>> It's true that if I "hg update" to v1.3 I see that the fix is there.
>>
>> Regards,
>> Nadia
>>
>> It would be a good idea to tie into the sos work to avoid conflicts
>>> when it all gets merged back together, assuming that isn't a big
>>> problem for you.
>>>
>>> As for Jeff's suggestion: dealing with the performance hit problem is
>>> why I suggested ORTE_NOTIFIER_VERBOSE, modeled after the
>>> OPAL_OUTPUT_VERBOSE model. The idea was to compile it in -only- when
>>> the system is built for it - maybe using a --with-notifier-verbose
>>> configuration option. Frankly, some organizations would happily pay a
>>> small performance penalty for the benefits.
>>>
>>> I would personally recommend that the notifier framework keep the
>>> stats so things can be compact and self-contained. We still get
>>> atomicity by allowing each framework/component/whatever specify the
>>> threshold. Creating yet another system to do nothing more than track
>>> error/warning frequencies to decide whether or not to notify seems
>>> wasteful.
>>>
>>> Perhaps worth a phone call to decide path forward?
>>>
>>>
>>> On Tue, May 26, 2009 at 1:06 PM, Jeff Squyres <jsquy...@cisco.com>
>>> wrote:
>>>     Nadia --
>>>
>>>     Sorry I didn't get to jump in on the other thread earlier.
>>>
>>>     We have made considerable changes to the notifier framework in
>>>     a branch to better support "SOS" functionality:
>>>
>>>
>>>     https://www.open-mpi.org/hg/auth/hgwebdir.cgi/jsquyres/opal-sos
>>>
>>>     Cisco and Indiana U. have been working on this branch for a
>>>     while. A description of the SOS stuff is here:
>>>
>>>     https://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages
>>>
>>>     As for setting up an external web server with hg, don't bother
>>>     -- just get an account at bitbucket.org. They're free and
>>>     allow you to host hg repositories there. I've used bitbucket
>>>     to collaborate on code before it hits OMPI's SVN trunk with
>>>     both internal and external OMPI developers.
>>>
>>>     We can certainly move the opal-sos repo to bitbucket (or
>>>     branch again off opal-sos to bitbucket -- whatever makes more
>>>     sense) to facilitate collaborating with you.
>>>
>>>     Back on topic...
>>>
>>>     I'd actually suggest a combination of what has been discussed
>>>     in the other thread. The notifier can be the mechanism that
>>>     actually sends the output message, but it doesn't have to be
>>>     the mechanism that tracks the stats and decides when to output
>>>     a message. That can be separate logic, and therefore be more
>>>     fine-grained (and potentially even specific to the MPI layer).
>>>
>>>     The Big Question will be how to do this with zero performance
>>>     impact when it is not being used. This has always been the
>>>     difficult issue when trying to implement any kind of
>>>     monitoring inside the core OMPI performance-sensitive paths.
>>>     Even adding individual branches has met with resistance (in
>>>     performance-critical code paths)...
>>>
>>>
>>>     On May 26, 2009, at 10:59 AM, Nadia Derbey wrote:
>>>
>>>         Hi,
>>>
>>>         While having a look at the notifier framework under
>>>         orte, I noticed that
>>>         the way it is written, the init routine for the
>>>         selected module cannot
>>>         be called.
>>>
>>>         Attached is a small patch that fixes this issue.
>>>
>>>         Regards,
>>>         Nadia
>>>
>>>
>>>     <orte_notifier_fix_select.patch><ATT14046023.txt>
>>>
>>>
>>>     --
>>>     Jeff Squyres
>>>     Cisco Systems
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>> --
>> Nadia Derbey <nadia.der...@bull.net>
>>
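
For reference, the compile-time gating idea quoted above (a verbose-style notifier macro that is compiled in only when the build asks for it, with the stats kept next to a per-call-site threshold) can be sketched roughly as follows. This is only an illustration under stated assumptions: every name here (SKETCH_WANT_NOTIFIER_VERBOSE, SKETCH_NOTIFIER_VERBOSE, sketch_notifier_stat_t, sketch_notifier_record, sketch_notifier_send) is hypothetical and is not the Open MPI API, and the "notify once every N occurrences" policy is just one possible reading of "threshold".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical switch: assumed to be set by the build system, e.g. from a
 * --with-notifier-verbose style configure option. Defaults to enabled here. */
#ifndef SKETCH_WANT_NOTIFIER_VERBOSE
#define SKETCH_WANT_NOTIFIER_VERBOSE 1
#endif

/* Hypothetical per-call-site statistics kept by the notifier itself. */
typedef struct {
    unsigned long count;      /* occurrences recorded so far */
    unsigned long threshold;  /* notify once every `threshold` occurrences */
} sketch_notifier_stat_t;

/* Number of notifications actually sent (for illustration only). */
static unsigned long sketch_notifier_sent = 0;

/* Stand-in for whatever component would really deliver the message. */
static void sketch_notifier_send(const char *msg)
{
    sketch_notifier_sent++;
    fprintf(stderr, "notifier: %s\n", msg);
}

/* Count one occurrence; send a notification each time the count reaches a
 * multiple of the threshold. Returns 1 if a notification was sent. */
static int sketch_notifier_record(sketch_notifier_stat_t *stat, const char *msg)
{
    assert(stat != NULL);
    stat->count++;
    if (stat->threshold > 0 && stat->count % stat->threshold == 0) {
        sketch_notifier_send(msg);
        return 1;
    }
    return 0;
}

#if SKETCH_WANT_NOTIFIER_VERBOSE
/* Enabled build: the call site pays for one increment and one test. */
#define SKETCH_NOTIFIER_VERBOSE(stat, msg) \
    ((void) sketch_notifier_record((stat), (msg)))
#else
/* Disabled build: expands to nothing, so the arguments are not even
 * evaluated and no branch is added to the calling code path. */
#define SKETCH_NOTIFIER_VERBOSE(stat, msg) ((void) 0)
#endif
```

A call site would declare its own sketch_notifier_stat_t (for example with a threshold of 3) and invoke SKETCH_NOTIFIER_VERBOSE(&stat, "link error") wherever the error or warning is detected. When the switch is defined to 0, the macro expands to nothing, which is the "zero impact when not being used" property discussed in the thread. The enabled build still pays the small cost of the counter update.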