[issue19495] context manager for measuring duration of blocks of code

Caleb Hattingh Mon, 25 Mar 2019 06:47:59 -0700

Caleb Hattingh <caleb.hatti...@gmail.com> added the comment:

Somehow I missed that there's been an open issue on this. Like others I've 
written a bunch of different incarnations of an "elapsed" context manager over 
the years.  Always for the more crude "how long did this take" reason like 
David mentioned, never the microbenchmarks scenario that timeit serves. The 
work is never quite substantial enough to make a Pypi package of it, but always 
annoying to have to do again the next time the need arises.



The overall least fiddly scheme I've found is to just use a callback. It's the 
simplest option:


@contextmanager
def elapsed(cb: Callable[[float, float], None], counter=time.perf_counter):
    t0 = counter()
    try:
        yield
    finally:
        t1 = counter()
        cb(t0, t1)


The simple case, the one most people want when they just want a quick check of 
elapsed time on a chunk of code, is then quite easy:


with elapsed(lambda t0, t1: print(f'read_excel: {t1 - t0:.2f} s')):
    # very large spreadsheet
    df = pandas.read_excel(filename, dtype=str)


(I rarely need to use a timer decorator for functions, because the profiler 
tracks function calls. But within the scope of a function it can sometimes be 
difficult to get timing information, particularly if the calls made there are 
into native extensions)

One of the consequences of using a callback strategy is that an additional 
version might be required for async callbacks (I have used these in production 
also):


@asynccontextmanager
async def aioelapsed(acb: Callable[[float, float], Awaitable[None]],
                     counter=time.perf_counter):
    t0 = counter()
    try:
        yield
    finally:
        t1 = counter()
        await acb(t0, t1)


So, the interesting thing here is that there is a general form for which an 
"elapsed" function is just a special case:


T = TypeVar('T')


@contextmanager
def sample_before_and_after(cb: Callable[[T, T], None], sample: Callable[[], 
T]):
    before = sample()
    try:
        yield
    finally:
        after = sample()
        cb(before, after)


The version of "elapsed" given further above is just this with kwarg 
sample=time.perf_counter.  So, it might be sufficient to cover the use-case of 
an "elapsed" context manager instead with something like the above instead, 
which is more general. However, I don't actually have any use cases for this 
more general thing above, other than "elapsed", but I thought it was 
interesting.


Whether any of this merits being in the stdlib or not is hard to say. These 
code snippets are all short and easy to write. But I've written them multiple 
times to make "elapsed".


---


Once the "elapsed" abstraction is available, other cool ideas become a little 
bit easier to think about. These would be things that are user code (not be in 
the stdlib), but which can make use of the "elapsed" cm; for example, a clever 
logger for slow code blocks (written a few of these too):


@contextmanager
def slow_code_logging(logger_name, msg, *args, threshold_sec=1.0, **kwargs):
    logger = logging.getLogger(logger_name)

    if logger.isEnabledFor(logging.INFO):
        def cb(t0: float, t1: float) -> None:
            dt = t1 - t0
            if dt < threshold_sec:
                # Change the logger level depending on threshold
                logger.debug(msg, dt, *args, **kwargs)
            else:
                logger.info(msg, dt, *args, **kwargs)

        cm = elapsed(cb)
    else:
        # Logger is not even enabled, do nothing.
        cm = nullcontext()

    with cm:
        yield


with slow_code_logging(__name__, 'Took longer to run than expected: %.4g s'):
    ...


And a super-hacky timing histogram generator (which would be quite interesting 
to use for measuring latency in network calls, e.g. with asyncio code):


@contextmanager
def histobuilder(counter, bin_size):
    def cb(t0, t1):
        dt = t1 - t0
        bucket = dt - dt % bin_size
        counter[bucket, bucket + bin_size] += 1

    with elapsed(cb, counter=time.perf_counter_ns):
        yield

counter = Counter()

for i in range(100):
    with histobuilder(counter, bin_size=int(5e4)):  # 50 us
        time.sleep(0.01)  # 10 ms

for (a, b), v in sorted(counter.items(), key=lambda _: _[0][0]):
    print(f'{a/1e6:6.2f} ms - {b/1e6:>6.2f} ms: {v:4} ' + '\u2588' * v)


output:


  9.85 ms -   9.90 ms:    1 █
  9.90 ms -   9.95 ms:   10 ██████████
  9.95 ms -  10.00 ms:   17 █████████████████
 10.00 ms -  10.05 ms:    8 ████████
 10.05 ms -  10.10 ms:   12 ████████████
 10.10 ms -  10.15 ms:    5 █████
 10.15 ms -  10.20 ms:    4 ████
 10.20 ms -  10.25 ms:    4 ████
 10.25 ms -  10.30 ms:    6 ██████
 10.30 ms -  10.35 ms:    9 █████████
 10.35 ms -  10.40 ms:    3 ███
 10.40 ms -  10.45 ms:    5 █████
 10.45 ms -  10.50 ms:   12 ████████████
 10.50 ms -  10.55 ms:    3 ███
 10.55 ms -  10.60 ms:    1 █


Things like the histogram builder above become (for me) much simpler when 
"elapsed" is just a thing that already exists even though it's very simple on 
its own.

If the right call is that we shouldn't add such a thing to the stdlib, should 
we instead just add an example of an elapsed context manager to the contextlib 
docs? Perhaps copy-paste from the docs is good enough for most users (like we 
do with the itertools recipes for example).  I'd be willing to add something to 
the docs if it's decided that's the way to go.

(On the off chance some version of "elapsed" goes into the stdlib, I think 
contextlib is right place rather than timeit, particularly because of how the 
"sample_before_and_after" pattern generalises)

----------
nosy: +cjrh

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue19495>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue19495] context manager for measuring duration of blocks of code

Reply via email to