New submission from Matthew Suozzo <matthew.suo...@gmail.com>:

# Issue

When profiling a generator function, the initial call and all subsequent yields 
are aggregated into the same "ncalls" metric by cProfile.

## Example

>>> cProfile.run("""
... def foo():
...   yield 1
...   yield 2
... assert tuple(foo()) == (1, 2)
... """)
   <snip>
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      ...
        3    0.000    0.000    0.000    0.000 <string>:2(foo)
      ...

This was unexpected behavior since it *looks* like a single call from the code. 
This also complicates basic analysis about the frequency of a codepath's 
execution where a generator might yield a variable number of times depending on 
the input.

The profile interface can and does differentiate between call types: Normal 
calls and "primitive" calls, i.e. those that are not induced via recursion, are 
displayed as "all_calls/primitive_calls" e.g. "3/1" for a single initial calls 
with 2 recursed calls (comparable to the example above).

This seems like an appropriate abstraction to apply in the generator case: Each 
yield is better modeled as an 'interior' call to the generator, not as a call 
on its own.

# Possible fix

I have two ideas that seem like they might address the problem:

* Add a new PyTrace_YIELD constant (and handle it in [3])
* Track some state from the `frame->f_gen` in the ProfilerEntry (injected in 
[3], used in [4]) to determine whether this is the first or a subsequent call.

I've not been poking around for long so I don't have an intuition about which 
would be less invasive (nor really whether either would work in practice).

As background, this seems to be the call chain from trace invocation to 
callcount increment:
[0]: 
https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Python/ceval.c#L4106-L4107
[1]: 
https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Python/ceval.c#L4937
[2]: 
https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Python/ceval.c#L4961
[3]: 
https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Modules/_lsprof.c#L419
[4]: 
https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Modules/_lsprof.c#L389
[5]: 
https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Modules/_lsprof.c#L311-L316

----------
components: C API
messages: 382373
nosy: matthew.suozzo
priority: normal
severity: normal
status: open
title: Generator `yield`s counted as primitive calls by cProfile
type: behavior
versions: Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue42551>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to