[issue30491] Add a lightweight mechanism for detecting un-awaited coroutine objects

Nathaniel Smith Mon, 11 Dec 2017 23:56:35 -0800

Nathaniel Smith <[email protected]> added the comment:

Update!


I've been experimenting with this some more, and here's a more detailed 
proposal, that I'd ideally like to get into 3.7. I don't *think* this is big 
enough to need a PEP? I dunno, thoughts on that welcome.

Motivation: It's easy to accidentally write 'f()' where you meant 'await f()', 
which is why Python issues a warning whenever an unawaited coroutine is GCed. 
This helps, and for asyncio proper, it may not be possible to do better than 
this -- since the problem is detected at GC time, there's very little we can do 
*except* print a warning. In particular, we can't raise an error. But this 
warning is still easy to miss, and prone to obscure problems: it's easy to have 
a test that passes ... because it didn't actually run any code. And then the 
warning is attached to a different test entirely. But, in some specific cases, 
we could do better: for example, if pytest-asyncio could check for unawaited 
coroutines after each test, it could immediately raise a proper and detailed 
error on the correct test. And if trio could check for unawaited coroutines at 
selected points like schedule points, it could reliably detect these problems 
and raise them as errors, right at the source.
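
To make the failure mode concrete, here's a small sketch of what happens today on CPython when the 'await' is forgotten. The names f and test_that_passes_vacuously are made up for illustration. The point is just that the "test" finishes without error and the only trace left behind is a RuntimeWarning:

```python
import gc
import warnings


async def f():
    return 1


def test_that_passes_vacuously():
    # Bug: this should have been 'await f()' inside an async test.
    # The coroutine object is created and immediately dropped, so the
    # body of f never runs and the test "passes".
    f()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    test_that_passes_vacuously()
    gc.collect()  # make sure the dropped coroutine has been finalized

# Collect the warning text so it can be inspected.
messages = [str(w.message) for w in caught]
print(messages)
```

Nothing here raises, which is exactly the problem: whoever happens to be running when the coroutine is finalized gets the warning.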

Specification: We add two new functions, 
sys.set_unawaited_coroutine_tracking(enabled: bool) -> None and 
sys.collect_unawaited_coroutines() -> List[Coroutine]. The semantics are: 
internally, there is a thread-local bool I'll call tracking_enabled that 
defaults to False. set_unawaited_coroutine_tracking lets you set it. If 
tracking_enabled == False, everything works like now. If tracking_enabled == 
True, then the interpreter internally keeps a table of all unawaited coroutine 
objects: when a coroutine object is created, it's automatically added to the 
table; when it's awaited, it's automatically removed. When 
collect_unawaited_coroutines is called, it returns the current contents of the 
table as a list, and clears it. The table holds a strong reference to the 
coroutines in it, which makes this a simple and reliable way to track 
unawaited coroutines (but also means that we need the enable/disable API 
instead of leaving it on all the time, because once it's enabled someone needs 
to call collect_unawaited_coroutines regularly to avoid a memory leak).
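
Here's a rough pure-Python emulation of those semantics, just to pin down the intended behavior. Big caveats: the two sys functions are only proposed and don't exist in any released Python, so this sketch defines module-level stand-ins with the same names. It also relies on a hypothetical 'tracked' decorator and '_Tracked' wrapper class to notice creation and awaiting, which is exactly the kind of per-call wrapper overhead the real implementation would avoid.

```python
import asyncio
import threading
from functools import wraps

_state = threading.local()  # holds tracking_enabled and the table, per thread


def set_unawaited_coroutine_tracking(enabled):
    # Stand-in for the proposed sys function (hypothetical).
    _state.enabled = bool(enabled)
    if not hasattr(_state, "table"):
        _state.table = {}


def collect_unawaited_coroutines():
    # Stand-in for the proposed sys function (hypothetical):
    # return the current contents of the table and clear it.
    table = getattr(_state, "table", {})
    unawaited = list(table.values())
    table.clear()
    return unawaited


class _Tracked:
    """Hypothetical wrapper that records creation and awaiting."""

    def __init__(self, coro):
        self._coro = coro
        if getattr(_state, "enabled", False):
            _state.table[id(self)] = self  # strong reference, as specified

    def __await__(self):
        # Being awaited removes the entry from the table.
        getattr(_state, "table", {}).pop(id(self), None)
        return self._coro.__await__()

    def close(self):
        self._coro.close()


def tracked(async_fn):
    # Hypothetical decorator; the real proposal would need no opt-in.
    @wraps(async_fn)
    def wrapper(*args, **kwargs):
        return _Tracked(async_fn(*args, **kwargs))
    return wrapper


@tracked
async def f():
    return 42


async def main():
    f()               # bug: forgot 'await'
    return await f()  # correct call


set_unawaited_coroutine_tracking(True)
result = asyncio.run(main())
leaked = collect_unawaited_coroutines()
print(result, len(leaked))
for coro in leaked:
    coro.close()  # avoid the GC-time warning for the ones we already caught
```

A test runner could do the collect step after each test and fail the test if the returned list is non-empty.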

Implementation: this can be made fast and cheap by storing the table as a 
thread-specific intrusive doubly-linked list. Basically each coroutine object 
would gain two pointer slots (this adds a small amount of memory overhead, but 
a coroutine object + frame is already >500 bytes, so the relative overhead is 
low), which are used to link it into a list when it's created (O(1), very 
cheap), and then unlink it again when it's awaited (also O(1), very cheap).
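
As a sketch of the data structure (written in Python for readability; the real thing would be two pointer fields on the C coroutine struct), here's an intrusive doubly-linked list with O(1) link and unlink. Node and IntrusiveList are illustrative names, not part of the proposal, and Node stands in for a coroutine object.

```python
class Node:
    """Stand-in for a coroutine object carrying two extra pointer slots."""
    __slots__ = ("name", "prev", "next")

    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None


class IntrusiveList:
    def __init__(self):
        # A sentinel node makes link/unlink branch-free at the list ends.
        self.head = Node("<sentinel>")
        self.head.prev = self.head.next = self.head

    def link(self, node):
        # O(1): append at the tail (on coroutine creation).
        last = self.head.prev
        node.prev = last
        node.next = self.head
        last.next = node
        self.head.prev = node

    @staticmethod
    def unlink(node):
        # O(1): splice the node out (on first await). No search is needed
        # because the node itself stores its neighbours.
        if node.next is None:
            return  # not currently linked
        node.prev.next = node.next
        node.next.prev = node.prev
        node.prev = node.next = None

    def drain(self):
        # Return everything still linked and reset the list to empty.
        items = []
        node = self.head.next
        while node is not self.head:
            following = node.next
            node.prev = node.next = None
            items.append(node)
            node = following
        self.head.prev = self.head.next = self.head
        return items


table = IntrusiveList()
a, b, c = Node("a"), Node("b"), Node("c")
for n in (a, b, c):
    table.link(n)      # three coroutines created
table.unlink(b)        # 'b' gets awaited
remaining = [n.name for n in table.drain()]
print(remaining)
```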

Rejected alternatives:

- The original comment above suggested keeping a count of unawaited coroutines 
instead of tracking the actual objects, but this way is just about as cheap 
while (a) allowing for much better debugging information when an unawaited 
coroutine is detected, since you have the actual objects there and (b) avoiding 
a mess of issues around unawaited coroutines that get GCed before the user 
checks for them.

- What about using the existing coroutine wrapper hook? You could do this, but 
this proposal has two advantages. First, it's much faster, which is important 
because Trio wants to run with this enabled by default, and pytest-asyncio 
doesn't want to slow down everyone's test suites too much. (I should benchmark 
this properly, but in general the coroutine wrappers add a ton of overhead b/c 
they're effectively a whole new Python-level object being allocated on every 
function call.) And second, since the coroutine wrapper hook is such a generic 
mechanism, it's prone to collisions between different uses. For example, 
pytest-asyncio's unawaited coroutine detection and asyncio's debug mode seem 
like they ought to complement each other: pytest-asyncio finds the problematic 
coroutines, and then asyncio's debug mode gives the details on where they came 
from. But if they're both trying to use the same coroutine wrapper hook, then 
they'll end up fighting over it. So this proposal follows Python's general 
rule that generic hooks are fine when you really need an escape hatch, but if 
there's a specific use case it's often worth handling it specifically. (Recent 
example: module __class__ assignment vs. PEP 562.)

----------

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue30491>
_______________________________________