Re: [OMPI devel] RFC 1/1: improvements to the "notifier" framework and ORTE WDC
On Mar 29, 2010, at 9:16 PM, Ralph Castain wrote:

On Mar 29, 2010, at 5:53 PM, Abhishek Kulkarni wrote:

On Mon, 29 Mar 2010, Sylvain Jeaugey wrote:

Hi Ralph,

For now, I think that yes, this is a unique identifier. However, in my opinion, this could be improved in the future by replacing it with a unique string. Something like:

    #define ORTE_NOTIFIER_DEFINE_EVENT(eventstr, associated_text) {        \
        static int event = -1;                                             \
        if (OPAL_UNLIKELY(event == -1)) {                                  \
            event = opal_sos_create_new_event(eventstr, associated_text);  \
        }                                                                  \
        ..                                                                 \
    }

This would move the event numbering to the OPAL layer, making it transparent to the developer.

This is a good suggestion, but then I think we end up relying on run-time generation of the event numbers and have to pay the extra cost of looking up the event in a list/array/hash each time we log the event.

Since it is -solely- intended to be in an error path, I fail to see the concern here.

My bad. Clearly I misunderstood here -- mostly because I vaguely remember (from [1]) that the original motivation was to put conditional #ifdef'd hooks in the "fast path" as well. But if they ought to be on the "slow path", I think it would be fair enough to consider Sylvain's suggestion of pushing the event numbering to SOS. In that case, the SOS hashtable could map the notifier events to their unique identifier, and the threshold counter itself could be encoded inside the identifier returned by SOS.

[1] http://www.open-mpi.org/community/lists/devel/2009/05/6132.php

From what I understand, and from the discussions that took place when this proposal was first put up on the devel list, since the event tracing hooks could lie in the critical path, we want the overhead to be as low as possible. By manually defining the unique identifiers, we can generate the event tracing macro at compile-time and have a minimal tracing impact.

Surely you jest - yes?? The event tracing hooks should -never- be in the critical path.
The notifier is intended -solely- to be called when an error (or some other critical event) has already been detected. The idea was that we detect an error, and then (if selected) notify someone about it. The last thing we want to do, IMHO, is put the notifier in a critical path. If we do, I personally will regret having created it :-)

My 2¢ of course.

Thanks
Abhishek

Just my 2 cents ...

Sylvain

On Mon, 29 Mar 2010, Ralph Castain wrote:

Hi Abhishek

I'm confused by the WDC wiki page, specifically the part about the new ORTE_NOTIFIER_DEFINE_EVENT macro. Are you saying that I (as the developer) have to provide this macro with a unique notifier id? So that would mean that ORTE/OMPI would have to maintain a global notifier id counter to ensure it is unique?

If so, that seems really cumbersome. Could you please clarify?

Thanks
Ralph

On Mar 29, 2010, at 8:57 AM, Abhishek Kulkarni wrote:

==
[RFC 1/2]
==

WHAT: Merge improvements to the "notifier" framework from the OPAL SOS and the ORTE WDC mercurial branches into the SVN trunk.

WHY: Some improvements and interface changes were put into the ORTE notifier framework during the development of the OPAL SOS[1] and ORTE WDC[2] branches.

WHERE: Mostly restricted to ORTE notifier files and files using the notifier interface in OMPI.

TIMEOUT: The weekend of April 2-3.

REFERENCE MERCURIAL REPOS:
* SOS development: http://bitbucket.org/jsquyres/opal-sos-fixed/
* WDC development: http://bitbucket.org/derbeyn/orte-wdc-fixed/

==

BACKGROUND:

The notifier interface and its components underwent a host of improvements and changes during the development of the SOS[1] and the WDC[2] branches. The ORTE WDC (Warning Data Capture) branch enables accounting of events through the use of the notifier interface, whereas OPAL SOS uses the notifier interface by setting up callbacks to relay out logged events.

Some of the improvements include:

- added more severity levels.
- "ftb" notifier improvements.
- "command" notifier improvements.
- added "file" notifier component
- changes in the notifier modules selection
- activate only a subset of the callbacks (i.e. any combination of log, help, log_peer)
- define different output media for any given callback (e.g. log_peer can be redirected to the syslog and smtp, while the show_help can be sent to the hnp).
- ORTE_NOTIFIER_LOG_EVENT() (that accounts and warns about unusual events)

Much more information is available on these t
Re: [OMPI devel] RFC 1/1: improvements to the "notifier" framework and ORTE WDC
On Mon, 29 Mar 2010, Abhishek Kulkarni wrote:

    #define ORTE_NOTIFIER_DEFINE_EVENT(eventstr, associated_text) {        \
        static int event = -1;                                             \
        if (OPAL_UNLIKELY(event == -1)) {                                  \
            event = opal_sos_create_new_event(eventstr, associated_text);  \
        }                                                                  \
        ..                                                                 \
    }

This is a good suggestion, but then I think we end up relying on run-time generation of the event numbers

Yes.

and have to pay the extra cost of looking up the event in a list/array/hash each time we log the event.

No. Of course not, that's the point of the "static int" here. The "create_new_event" function will only be called once; the event is then stored and used directly whenever we enter this code again. But yes, I'm adding an "if", which may cost a little more than just the counter increment.

From what I understand, and from the discussions that took place when this proposal was first put up on the devel list, since the event tracing hooks could lie in the critical path, we want the overhead to be as low as possible. By manually defining the unique identifiers, we can generate the event tracing macro at compile-time and have a minimal tracing impact.

Not in the critical path. And from my point of view, not on error paths either. I prefer to talk about some "slow path": not critical, but slow.

Sylvain

On Mon, 29 Mar 2010, Ralph Castain wrote:

Hi Abhishek

I'm confused by the WDC wiki page, specifically the part about the new ORTE_NOTIFIER_DEFINE_EVENT macro. Are you saying that I (as the developer) have to provide this macro with a unique notifier id? So that would mean that ORTE/OMPI would have to maintain a global notifier id counter to ensure it is unique?

If so, that seems really cumbersome. Could you please clarify?

Thanks
Ralph

On Mar 29, 2010, at 8:57 AM, Abhishek Kulkarni wrote:

==
[RFC 1/2]
==

WHAT: Merge improvements to the "notifier" framework from the OPAL SOS and the ORTE WDC mercurial branches into the SVN trunk.
WHY: Some improvements and interface changes were put into the ORTE notifier framework during the development of the OPAL SOS[1] and ORTE WDC[2] branches.

WHERE: Mostly restricted to ORTE notifier files and files using the notifier interface in OMPI.

TIMEOUT: The weekend of April 2-3.

REFERENCE MERCURIAL REPOS:
* SOS development: http://bitbucket.org/jsquyres/opal-sos-fixed/
* WDC development: http://bitbucket.org/derbeyn/orte-wdc-fixed/

==

BACKGROUND:

The notifier interface and its components underwent a host of improvements and changes during the development of the SOS[1] and the WDC[2] branches. The ORTE WDC (Warning Data Capture) branch enables accounting of events through the use of the notifier interface, whereas OPAL SOS uses the notifier interface by setting up callbacks to relay out logged events.

Some of the improvements include:

- added more severity levels.
- "ftb" notifier improvements.
- "command" notifier improvements.
- added "file" notifier component
- changes in the notifier modules selection
- activate only a subset of the callbacks (i.e. any combination of log, help, log_peer)
- define different output media for any given callback (e.g. log_peer can be redirected to the syslog and smtp, while the show_help can be sent to the hnp).
- ORTE_NOTIFIER_LOG_EVENT() (that accounts and warns about unusual events)

Much more information is available on these two wiki pages:

[1] http://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages
[2] http://svn.open-mpi.org/trac/ompi/wiki/ORTEWDC

NOTE: This is the first of a two-part RFC to bring the SOS and WDC branches to the trunk. This only brings in the "notifier" changes from the SOS branch, while the rest of the branch will be brought over after the timeout of the second RFC.

==
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
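Sylvain's point about the "static int" can be sketched in plain C. This is only an illustration of the caching pattern under discussion, not code from the SOS or WDC branches: demo_create_new_event() is a hypothetical stand-in for the proposed opal_sos_create_new_event(), and demo_log_link_down() and the event strings are made up for the example.

```c
#include <assert.h>

/* Hypothetical stand-in for the opal_sos_create_new_event() proposed in the
 * thread. It hands out increasing ids and counts how often it is called. */
static int demo_create_calls = 0;

static int demo_create_new_event(const char *eventstr, const char *text)
{
    (void)eventstr;
    (void)text;
    return demo_create_calls++;
}

/* Logs one occurrence of a fixed event and returns its id. The id is
 * registered on the first call only; every later call costs one comparison
 * and no lookup. Note: this sketch is not thread safe. */
static int demo_log_link_down(void)
{
    static int event = -1;   /* cached across calls */

    if (event == -1) {
        event = demo_create_new_event("link-down", "a link went down");
    }
    assert(event >= 0);
    return event;
}
```

Calling demo_log_link_down() repeatedly returns the same id each time, and demo_create_calls stays at 1, which is the "called only once" behaviour described above. The remaining per-call cost is the extra "if".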
[OMPI devel] Some questions about checkpoint/restart (8)
8th question is as follows:

(8) The result of communication that uses derived datatypes constructed with MPI_Type_vector or MPI_Type_indexed is incorrect after taking a checkpoint.

Framework : datatype
Component : datatype
The source file : ompi/datatype/dt_copy.c
The function name : ompi_ddt_copy_content_same_ddt

Framework : crcp
Component : bkmrk
The source file : ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c
The function name : ?

Here's the code that causes the problem:

#define SLPTIME 60
#define ITEMNUM 10

int buf[ITEMNUM][ITEMNUM];

  MPI_Type_vector(10,1,10,MPI_INT,&newdt);
  MPI_Type_commit(&newdt);

  MPI_Barrier(MPI_COMM_WORLD);
  if (rank == 0) {
    MPI_Isend(&buf[0][0],1,newdt,1,1000,MPI_COMM_WORLD,&req);
    printf(" rank=%d sleep start \n",rank); fflush(stdout);
    sleep(SLPTIME);  /** take checkpoint at this point **/
    printf(" rank=%d sleep end \n",rank); fflush(stdout);
    MPI_Wait(&req,&sts);
    MPI_Type_free(&newdt);
  } else {
    printf(" rank=%d sleep start \n",rank); fflush(stdout);
    sleep(SLPTIME);  /** take checkpoint at this point **/
    printf(" rank=%d sleep end \n",rank); fflush(stdout);
    MPI_Irecv(&buf[0][0],1,newdt,0,1000,MPI_COMM_WORLD,&req);
    MPI_Wait(&req,&sts);
    MPI_Type_free(&newdt);
  }
  for (i=0;isize=1]

wait_quiesce_drained:xx=0 0
wait_quiesce_drained:xx=1 100
wait_quiesce_drained:xx=2 200
wait_quiesce_drained:xx=3 300
wait_quiesce_drained:xx=4 400
wait_quiesce_drained:xx=5 500
wait_quiesce_drained:xx=6 600
wait_quiesce_drained:xx=7 700
wait_quiesce_drained:xx=8 800
wait_quiesce_drained:xx=9 900
ompi_ddt_copy_content_same_ddt:Start size=40 flag=102/4 count=1

* I think that the receiver received the message correctly in bkmrk. The received messages are contiguous.
* I think that the problem is the copy processing in ompi_ddt_copy_content_same_ddt. Or is it wrong to use the ompi_ddt_copy_content_same_ddt function here?
* The first argument (datatype) of the ompi_ddt_copy_content_same_ddt function in drain_message_copy_remove is the one specified by the user's application. The hexadecimal value of datatype->flags is 0x102.
It does not contain DT_FLAG_CONTIGUOUS, which means it is a derived datatype.
* I think that the problem occurs in the following parts of the ompi_ddt_copy_content_same_ddt function. Both source and destination use the same datatype information, which is specified by the user's application. But the source (the received messages in bkmrk) is a simple contiguous message.

---
destination += datatype->true_lb;
source += datatype->true_lb;
---
ptrdiff_t extent = (datatype->ub - datatype->lb);
destination += extent;
source += extent;
---
pStack = (dt_stack_t*)alloca( sizeof(dt_stack_t) * (datatype->btypes[DT_LOOP] + 1) );
source = (unsigned char*)source_base + pStack->disp;
destination = (unsigned char*)destination_base + pStack->disp;

* If the source datatype is different from the destination datatype, should the ompi_ddt_copy_content_same_ddt function not be used?

-bash-3.2$ cat t_mpi_question-8.c
#include
#include
#include
#include "mpi.h"

#define SLPTIME 60
#define ITEMNUM 10

int buf[ITEMNUM][ITEMNUM];

int main(int ac,char **av)
{
  int rank,size,cc,i,j;
  MPI_Request req;
  MPI_Status sts;
  MPI_Datatype newdt;

  MPI_Init(&ac,&av);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  MPI_Comm_size(MPI_COMM_WORLD,&size);

  for (i=0;i

#include
#include
#include "mpi.h"

#define SLPTIME 60
#define ITEMNUM 10

int buf[ITEMNUM][ITEMNUM];

int main(int ac,char **av)
{
  int rank,size,cc,i,j;
  MPI_Request req;
  MPI_Status sts;
  MPI_Datatype newdt;
  int block_length[ITEMNUM];
  int disp[ITEMNUM];

  MPI_Init(&ac,&av);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  MPI_Comm_size(MPI_COMM_WORLD,&size);

  for (i=0;i

#include
#include
#include "mpi.h"

#define ITEMNUM_1 10
#define SLPTIME 60

int buf[ITEMNUM_1][ITEMNUM_1];

int main(int ac,char **av)
{
  int rank,size,cc,i,j,k;
  MPI_Request req;
  MPI_Status sts;
  MPI_Datatype newdt;
  int itmnum,newdt_size;
  int b_l[3];
  MPI_Aint dp[3],newdt_extent,newdt_lb,newdt_ub;
  MPI_Datatype dt[3];

  itmnum = 10;
  rank=0;
  MPI_Init(&ac,&av);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  MPI_Comm_size(MPI_COMM_WORLD,&size);

  for (i=0;i
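The layout mismatch suspected in question (8) can be shown without MPI. The sketch below assumes, as the debug output suggests, that the drained message is held as 10 packed ints while the user buffer uses the MPI_Type_vector(10,1,10,MPI_INT) layout (one int every 10 ints). It is not Open MPI code; unpack_vector() and copy_same_layout() are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ITEMNUM 10

/* Packed source, strided destination: element i of the packed message goes
 * to dst[i * stride]. This is the copy a contiguous drained message needs. */
static void unpack_vector(int *dst, const int *packed,
                          size_t count, size_t stride)
{
    assert(dst != NULL && packed != NULL);
    for (size_t i = 0; i < count; i++) {
        dst[i * stride] = packed[i];
    }
}

/* Same layout on both sides: the stride is applied to the source as well.
 * This is only valid when src really has the strided layout. Given a packed
 * buffer of count ints, it would read src[stride], src[2 * stride], ...,
 * which lie beyond the end of that buffer. */
static void copy_same_layout(int *dst, const int *src,
                             size_t count, size_t stride)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; i++) {
        dst[i * stride] = src[i * stride];
    }
}
```

With a packed source holding 0, 100, ..., 900, unpack_vector(&buf[0][0], packed, ITEMNUM, ITEMNUM) fills buf[i][0] as expected. Handing the same packed buffer to copy_same_layout() would touch memory past its 40 bytes, which matches the behaviour the question attributes to using the user's datatype for both source and destination.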
Re: [OMPI devel] RFC 1/1: improvements to the "notifier" framework and ORTE WDC
On Mon, 2010-03-29 at 09:37 -0600, Ralph Castain wrote:
> Hi Abhishek
>
>
> I'm confused by the WDC wiki page, specifically the part about the new
> ORTE_NOTIFIER_DEFINE_EVENT macro. Are you saying that I (as the
> developer) have to provide this macro with a unique notifier id?

Hi Ralph,

Actually ORTE_NOTIFIER_DEFINE_EVENT(, ) expands to a static inline routine notifier_log_event_(). So I would say there is a one-to-one relationship between an event id and a log_event routine. So there is no need to do a lookup inside an array or a list.

So yes, the event identifier needs to be unique, but only inside a single source file: you can perfectly well call ORTE_NOTIFIER_DEFINE_EVENT(0, ) in a .c file and ORTE_NOTIFIER_DEFINE_EVENT(0, ) in another one.

Now, we could centralize the event ids in a .h file in the notifier framework, but the purpose here would only be to have something "cleaner".

> So that would mean that ORTE/OMPI would have to maintain a global
> notifier id counter to ensure it is unique?

From what I said before, we don't need this.

Regards,
Nadia

>
>
> If so, that seems really cumbersome. Could you please clarify?
>
>
> Thanks
> Ralph
>
> On Mar 29, 2010, at 8:57 AM, Abhishek Kulkarni wrote:
>
> >
> > ==
> > [RFC 1/2]
> > ==
> >
> > WHAT: Merge improvements to the "notifier" framework from the OPAL
> > SOS and the ORTE WDC mercurial branches into the SVN trunk.
> >
> > WHY: Some improvements and interface changes were put into the ORTE
> > notifier framework during the development of the OPAL SOS[1] and
> > ORTE WDC[2] branches.
> >
> > WHERE: Mostly restricted to ORTE notifier files and files using the
> > notifier interface in OMPI.
> >
> > TIMEOUT: The weekend of April 2-3.
> >
> > REFERENCE MERCURIAL REPOS:
> > * SOS development: http://bitbucket.org/jsquyres/opal-sos-fixed/
> > * WDC development: http://bitbucket.org/derbeyn/orte-wdc-fixed/
> >
> > ==
> >
> > BACKGROUND:
> >
> > The notifier interface and its components underwent a host of
> > improvements and changes during the development of the SOS[1] and
> > the WDC[2] branches. The ORTE WDC (Warning Data Capture) branch enables
> > accounting of events through the use of notifier interface, whereas
> > OPAL SOS uses the notifier interface by setting up callbacks to
> > relay out logged events.
> >
> > Some of the improvements include:
> >
> > - added more severity levels.
> > - "ftb" notifier improvements.
> > - "command" notifier improvements.
> > - added "file" notifier component
> > - changes in the notifier modules selection
> > - activate only a subset of the callbacks
> > (i.e. any combination of log, help, log_peer)
> > - define different output media for any given callback (e.g.
> > log_peer can be redirected to the syslog and smtp, while the
> > show_help can be sent to the hnp).
> > - ORTE_NOTIFIER_LOG_EVENT() (that accounts and warns about unusual
> > events)
> >
> > Much more information is available on these two wiki pages:
> >
> > [1] http://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages
> > [2] http://svn.open-mpi.org/trac/ompi/wiki/ORTEWDC
> >
> > NOTE: This is first of a two-part RFC to bring the SOS and WDC
> > branches to the trunk. This only brings in the "notifier" changes
> > from the SOS branch, while the rest of the branch will be brought
> > over after the timeout of the second RFC.
> >
> > ==

--
Nadia Derbey
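Nadia's description (one static inline routine per event id, unique only within a source file) can be sketched with token pasting. This is a guess at the general shape, not the real ORTE_NOTIFIER_DEFINE_EVENT: the DEMO_DEFINE_EVENT macro, the demo_log_event_ prefix, the counter, and the event texts are all hypothetical.

```c
#include <assert.h>

/* Hypothetical illustration only, not the real ORTE_NOTIFIER_DEFINE_EVENT.
 * Each use defines a static inline function whose name ends in the event id
 * and which owns a private occurrence counter. The function returns how many
 * times this event has been logged so far. */
#define DEMO_DEFINE_EVENT(id, text)                      \
    static inline int demo_log_event_##id(void)          \
    {                                                    \
        static const char *const msg = (text);           \
        static int count = 0;                            \
        (void)msg;                                       \
        ++count;                                         \
        assert(count > 0);                               \
        return count;                                    \
    }

/* Two events in this source file. Because the generated functions are
 * static, another .c file could also define event 0 without a clash. */
DEMO_DEFINE_EVENT(0, "send retry limit exceeded")
DEMO_DEFINE_EVENT(1, "link went down")
```

Each call site then calls demo_log_event_0() or demo_log_event_1() directly, so no array or list lookup is involved. Reusing an id inside the same file fails at compile time with a redefinition error, which is the only uniqueness the scheme needs.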