reductionista opened a new pull request #521:
URL: https://github.com/apache/madlib/pull/521


   There are no function functional changes in this PR, this is just for making 
MADlib development and performance profiling easier.
   
   1. Adds a new collection of utility functions (`utilities/debug.py_in`), to 
aid with the debugging process.
   
   The main purpose of this new debug module is to make it easy to
       turn on or off certain kinds of debug messages, without needing to make 
lots of temporary changes to the code.
       There is a frequent debugging pattern which has kept repeating when it 
comes to performance issues:
          - We find a bug, for example something is running slowly or filling 
up the disk or memory.  But we don't know where in a long sequence of SQL 
queriers it's happening.
          - We add lots of different extra lines of code to print out timing 
information after each query is sent, as well as what the queries look like 
after all python formatting params have been filled in, what their EXPLAIN 
plans look like, and sometimes other things... like which models are on which 
segments after each hop (for the case of `madlib_keras_fit_mutliple`).
          - We check these changes in on a temporary branch, and spend a while 
investigating, eventually getting a good picture of where the performance 
bottleneck is.  As we go along, often the types of information getting printed 
get more complicated and spread out through the code, and these commits get all 
mixed up with simultaneously commits we're making to actually fix the bug.
           - After a series of tweeks to different stages of the UDF 
implementation, the performance profiling looks good again, and we declare the 
bug fixed.
           - Then we sift through a complicated mess of commits, stripping out 
any debugging changes so they don't get merged in to the production code.
           - Soon after, we find another bug, and now we've got to start 
putting some debugging statements back in, little by little.  Eventually the 
whole cycle repeats, and we've done and undone a lot of work many times at this 
point.
   
   To improve the situation, we can build any of the profiling and debugging 
statements right into the production code, in a way where it can be easily 
enabled or disabled simply by changing a single line of code.  (In the future, 
we could even make it a compiler option.)
   
   Hopefully this will speed up the process of tracking down performance 
bottlenecks and reduce the amount of redundant work repeated.
   
      debug.start_timing(msg)
         Starts a timer.  msg is any name or short description of the timer 
which can be grepped for later.  Multiple timers can be running simultaneously, 
and they won't interfere with each other as long as they are each named 
differently.  This will print a header with column names that can be matched up 
later with the actual timings as they get printed.
   
       debug.print_timing(msg, force=False) :
          Used to print timing information for different stages of a python UDF 
(for example, madlib_keras_fit_multiple()).
          This will print (or not print) how many seconds have elapsed since 
this particular timer (associate with msg param) was started. If enabled, it 
will print a nicely formatted banner with information in different columns that 
can be easily extracted from logs with the grep and cut commands.  For example, 
we could use it to break down the current run_training() function into 
_hop_time_, _uda_time_, _truncate_time_, _copy_time_, and _delete_time_.  I'm 
not including this example yet because it's on a branch with a major refactor 
and these particular stages no longer line up.  (Refactor will be raised as a 
separate PR and call these functions with names appropriate for the new stages.)
   
         Enable all timing messages in a source file (just after debug module 
has been imported):
            debug.timings_enabled = True
   
         Enable a single call to debug.print_timings()
            by passing force=True.  Will print even if
            debug.timings_enabled is False
   
       debug.print_mst_keys(force=False) :
           If enabled, prints a map between mst_keys and seg_id's as the 
mst_keys
           hop around from segment to segment... to verify correct
           MOP behavior.
   
       debug.plpy.execute(query, ..., force=False)
   
           If enabled, prints query, EXPLAIN ANALYZE, and timing of a query
           while executing it.  Otherwise, just runs the query with 
plpy.execute(query, ...)
   
       debug.plpy.info(msg,..., force=False):
          Calls plpy.info(msg, ...) if enabled.
          Otherwise, does nothing.
   
       debug.plpy.debug(msg, ..., force=False):
           Behaves like plpy.debug(msg, ...) if disabled (printing only
           if DEBUG level is set high enough), but becomes a
           plpy.info() if enabled.
   
   2. One minor fix to silence a warning message that started showing up for me 
recently (I think when I upgraded from pg9 to pg12).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to