On Tue, 9 Dec 2008 at 08:40, Slaunger wrote:
I am a novice Python 2.5 programmer, who write some cmd line scripts
for processing large amounts of data.

I would like to have possibility to regularly print out the progress
made during the processing, say every 1 seconds, and i am wondering
what a proper generic way to do this is.

I have created this test example to show the general problem. Running
the script gives me the output:

Work through all 20 steps reporting progress every 1.0 secs...
Work step 0
Work step 1
Work step 2
Work step 3
Work step 4
Processed 4 of 20
Work step 5
[...]
Work step 19
Finished working through 20 steps
[...]
Quite frankly, I do not like what I have made! It is a mess,
responsibilities are mixed, and it seems overly complicated. But I
can't figure out how to do this right.

I would therefore like some feedback on this proposed generic "report
progress at regular intervals" approach presented here. What could I
do better?

I felt like a little lunchtime challenge, so I wrote something that
I think matches your spec, based on your sample code.  This is not
necessarily the best implementation, but I think it is simpler and
clearer than yours.  The biggest change is that the work is being
done in the subthread, while the main thread does the monitoring.

It would be fairly simple to enhance this so that you could pass
arbitrary arguments in to the worker function, in addition to
or instead of the loop counter.

-----------------------------------------------------------------------
"""
Test module for testing generic ways of displaying progress
information at regular intervals.
"""
import random
import threading
import time

def work(i):
    """
    Dummy process function, which takes a random time in the interval
    0.0-0.5 secs to execute
    """
    print "Work step %d" % i
    time.sleep(0.5 * random.random())


class Monitor(object):
    """
    This class creates an object that will execute a worker function
    in a loop and at regular intervals emit a progress report on
    how many times the function has been called.
    """

    def dowork(self):
        """
        Call the worker function in a loop, keeping track of how
        many times it was called in self.no
        """
        for self.no in xrange(self.max_iter):
            self.func(self.no)

    def __call__(self, func, verbose=True, max_iter=20, progress_interval=1.0):
        """
        Repeatedly call 'func', passing it the loop count, for max_iter
        iterations, and every progress_interval seconds report how
        many times the function has been called.
        """
        # Not all of these need to be instance variables, but they might
        # as well be in case we want to reference them in an enhanced
        # dowork function.
        self.func = func
        self.verbose = verbose
        self.max_iter=max_iter
        self.progress_interval=progress_interval

        if self.verbose:
            print ("Work through all %d steps reporting progress every "
                "%3.1f secs...") % (self.max_iter, self.progress_interval)

        # Create a thread to run the loop, and start it going.
        worker = threading.Thread(target=self.dowork)
        worker.start()

        # Monitoring loop.
        loops = 0
        # We're going to loop ten times per second using an integer count,
        # so multiply the seconds parameter by 10 to give it the same
        # magnitude.
        intint = int(self.progress_interval*10)
        # isAlive will be false after dowork returns
        while worker.isAlive():
            loops += 1
            # Wait 0.1 seconds between checks so that we aren't chewing
            # CPU in a spin loop.
            time.sleep(0.1)
            # when the modulus (second element of divmod tuple) is zero,
            # then we have hit a new progress_interval, so emit the report.
            if not divmod(loops, intint)[1]:
                print "Processed %d of %d" % (self.no, self.max_iter)

        if verbose:
            print "Finished working through %d steps" % max_iter

if __name__ == "__main__":
    #Create the monitor.
    monitor = Monitor()
    #Run the work function under monitoring.
    monitor(work)
--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to