I have a few comments:

- This looks nice. Thanks for the contribution.
- I notice that the ORTE timing stuff is now a compile-time decision, not a run-time decision. Do we care that we've now taken away the ability for users to do timings in a production build?

- "clksync" -- can we use "clocksync"? It's only 2 more letters. We tend to use real words in the OMPI code base; unnecessary abbreviation should be avoided.

- r32738 introduced several files into the code base that have no copyrights and do not have the standard OMPI copyright header block. Please fix.

- There's no documentation on how to use mpisync, mpirun_prof, or ompi_timing_post, even though they're installed when you --enable-timing. What are these 3 executables? Can we get man pages?

- What's the purpose of the MCA param orte_rml_base_timing? A *quick* look through the code seems to indicate that it is ignored.

- What's the purpose of the MCA params opal_clksync_file, opal_timing_file, and opal_timing_overhead? E.g., what is a "clksync" file, what is it for, and what is its format? Does the user have to provide one? If so, how do you get one? Or is it an output file? ...etc. The brief descriptions given in the MCA help strings don't really provide enough information for someone who has no idea what the timing stuff is. Also, can those 3 params have a common prefix? I.e., it's not obvious that opal_clksync_file is related to opal_timing_* at all.

- A *quick* look at ompi/tools/mpisync shows that a bunch of that code came from an external project. Is the license compatible with OMPI's license? What do we need to do to conform to their license?

- opal/util/timings.h is protected by OPAL_SYS_TIMING_H -- shouldn't it be OPAL_UTIL_TIMINGS_H?

- There's commented-out code in opal/util/timings.h.

- There's no doxygen-style documentation in opal/util/timings.h to tell developers how to use it.

- There are "TODO" comments in opal/util/timings.c; should those be fixed?

- opal_config.h should be the first include in opal/util/timings.c.
- If timing support is not to be compiled in, then opal/util/timings.c should not be compiled via the Makefile.am (rather than entirely #if'ed out).

It looks like this work is about 95% complete. Finishing the remaining 5% would make it great and genuinely useful to the rest of the code base.

Thanks!

On Sep 16, 2014, at 10:20 AM, Artem Polyakov <artpo...@gmail.com> wrote:

> Hello,
>
> I would like to introduce the OMPI timing framework that was included in the
> trunk yesterday (r32738). The code is new, so if you hit any bugs, just
> let me know.
>
> The framework consists of a set of macros and routines for internal OMPI
> usage, plus the standalone tool mpisync and a few additional scripts:
> mpirun_prof and ompi_timing_post. The set of features is very basic and I am
> open to discussion of new things that are desirable there.
>
> To enable compilation of the framework, configure OMPI with the
> --enable-timing option. If the option was passed to ./configure, the
> standalone tools and scripts will be installed into <prefix>/bin.
>
> The timing code is located in OPAL (opal/util/timings.[ch]). There is a set
> of macros that should be used so that all mentions of the timing code are
> preprocessed out when it wasn't requested with --enable-timing:
> OPAL_TIMING_DECLARE(t) - declare timing handler structure with name "t".
> OPAL_TIMING_DECLARE_EXT(x, t) - external declaration of a timing handler "t".
> OPAL_TIMING_INIT(t) - initialize timing handler "t"
> OPAL_TIMING_EVENT(x) - printf-like event declaration similar to OPAL_OUTPUT.
> The information about the event will be quickly inserted into the linked
> list. The maximum event description length is limited by
> OPAL_TIMING_DESCR_MAX.
> The malloc is performed in buckets (OPAL_TIMING_BUFSIZE elements at once) and
> the overhead (time to malloc and prepare the bucket) is accounted for in the
> corresponding list element. It may be excluded from the timing results
> (controlled by the OMPI_MCA_opal_timing_overhead parameter).
> OPAL_TIMING_REPORT(enable, t, prefix) - prepare and print out timing
> information. If OMPI_MCA_opal_timing_file was specified, the output will go
> to that file. Otherwise the output will be directed through opal_output, and
> each line will be prefixed with "prefix" to ease grep'ing. "enable" is a
> boolean/integer variable that is used for runtime selection of what should
> be reported.
> OPAL_TIMING_RELEASE(t) - the counterpart of OPAL_TIMING_INIT.
>
> There are several examples in the OMPI code. Here is another simple example:
> OPAL_TIMING_DECLARE(tm);
> OPAL_TIMING_INIT(&tm);
> ...
> OPAL_TIMING_EVENT((&tm,"Begin of timing: %s",
>                    ORTE_NAME_PRINT(&(peer->name)) ));
> ....
> OPAL_TIMING_EVENT((&tm,"Next timing event with condition x = %d", x ));
> ...
> OPAL_TIMING_EVENT((&tm,"Finish"));
> OPAL_TIMING_REPORT(enable_var, &tm,"MPI Init");
> OPAL_TIMING_RELEASE(&tm);
>
>
> The output from all OMPI processes (mpirun, orteds, user processes) is merged
> together. NTP provides precision at the level of 1 millisecond to 100
> microseconds. This may not be sufficient to order events globally.
> To help developers extract the most realistic picture of what is going on,
> additional time synchronisation may be performed before profiling. The
> mpisync program should be run with one user process per node to obtain a
> file with each node's time offset relative to the HNP. If the cluster runs
> over Gigabit Ethernet the precision will be 30-50 microseconds; with
> InfiniBand, 4 microseconds. mpisync produces an output file that can be read
> and used by the timing framework (OMPI_MCA_opal_clksync_file parameter). The
> bad news is that this synchronisation is not enough, because the clock skew
> differs from node to node. Additional periodic synchronisation is needed.
> This is planned for the near future (Ralph and I are discussing possible
> ways now).
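
The offset correction described in the quoted paragraph above can be sketched roughly as follows. This is a minimal illustration only, assuming mpisync yields one static offset per node and that the offset is defined as (node clock) - (HNP clock). The record type, the function name, and the sign convention are all hypothetical and are not taken from the OMPI code base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event record; NOT the actual OPAL timing structure. */
typedef struct {
    size_t    node;     /* index of the node that recorded the event */
    long long ts_usec;  /* timestamp from that node's local clock, in microseconds */
} node_event_t;

/*
 * Map a local timestamp onto the HNP time base.
 * Assumption: offsets_usec[node] = (node clock) - (HNP clock).
 */
long long to_hnp_usec(const node_event_t *ev,
                      const long long *offsets_usec, size_t num_nodes)
{
    assert(ev != NULL && offsets_usec != NULL);
    assert(ev->node < num_nodes);
    return ev->ts_usec - offsets_usec[ev->node];
}
```

With offsets of {0, 35, -4} microseconds, an event stamped 10000120 on node 1 maps to 10000085 on the HNP time base. A single static offset like this does not account for clock skew, which is why the quoted text calls for periodic re-synchronisation.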
>
> The mpirun_prof & ompi_timing_post scripts may be used to automate clock
> synchronisation in the following manner:
> export OMPI_MCA_ompi_timing=true
> export OMPI_MCA_orte_oob_timing=true
> export OMPI_MCA_orte_rml_timing=true
> export OMPI_MCA_opal_timing_file=timing.out
> mpirun_prof <ompi-params> ./mpiprog
> ompi_timing_post timing.out
>
> ompi_timing_post will simply sort the events and make all times relative to
> the first one.
>
> --
> С Уважением, Поляков Артем Юрьевич
> Best regards, Artem Y. Polyakov
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15837.php

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
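
The post-processing step described in the quoted message (sort the events, then make every time relative to the first one) can be sketched as below. This is an illustration under stated assumptions, not the ompi_timing_post script itself: the record layout, the microsecond unit, and the function names are hypothetical, and the real format of the timing output file is not specified in this thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical in-memory event; the real timing file format is not shown here. */
typedef struct {
    long long   ts_usec;  /* absolute timestamp in microseconds (assumed unit) */
    const char *descr;    /* event description */
} timing_event_t;

/* qsort comparator: ascending by timestamp, without overflow-prone subtraction. */
int cmp_events(const void *a, const void *b)
{
    long long ta = ((const timing_event_t *)a)->ts_usec;
    long long tb = ((const timing_event_t *)b)->ts_usec;
    return (ta > tb) - (ta < tb);
}

/* Sort events by time, then rebase them so the earliest event is at 0. */
void normalize_events(timing_event_t *ev, size_t n)
{
    assert(ev != NULL || n == 0);
    if (n == 0) {
        return;
    }
    qsort(ev, n, sizeof(ev[0]), cmp_events);
    long long t0 = ev[0].ts_usec;
    for (size_t i = 0; i < n; i++) {
        ev[i].ts_usec -= t0;
    }
}
```

Calling normalize_events() on events stamped 1500, 1000, and 1200 leaves them ordered with relative times 0, 200, and 500.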