[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.
Joshua Moore-Oliva added the comment:

> Hm. That sounds like you won't actually be interoperable with other asyncio-using code.

asyncio code can be interoperated with by spinning off an asyncio coroutine that, on completion, calls a callback that reschedules a non-asyncio coroutine.

I assume we shouldn't be spamming an issue with unrelated chatter; I'd be happy to discuss more via email if you would like.

--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.
Joshua Moore-Oliva added the comment:

Also - should I close this issue now that a patch has been committed?
[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.
Joshua Moore-Oliva added the comment:

> You can contribute upstream to the Tulip project first.

Will I be writing a patch and tests for Tulip, and then a separate patch and tests for Python 3.4? Or will I submit to Tulip, and then the changes will get merged from Tulip into Python by some other process?

If possible, I would like to get this into Python 3.4.2 (assuming all goes well).
[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.
Joshua Moore-Oliva added the comment:

My patch is ready for review. If I followed the process correctly, I think you should have received an email.

https://codereview.appspot.com/145220043

> By the way I just looked at wait_for.py; it has a bug where do_work() isn't using yield-from with the sleep() call. But that may well be the issue you were trying to debug, and this does not change my opinion about the issue

That was not intended; it was just a mistake.

(A quick aside on yield from; feel free to ignore, I don't expect to change anyone's opinion on this.)

I don't use yield from much. My first use of asyncio was porting an application from gevent (I made a small custom wrapper with fibers (https://pypi.python.org/pypi/fibers) that can internally yield on coroutines).

I have read https://glyph.twistedmatrix.com/2014/02/unyielding.html, but in my cases I tend to write my code with the thought that any non-standard-library function can yield (I initially tried porting to vanilla asyncio, but I ended up having yield from almost everywhere). In the rare cases where I want to ensure no yielding takes place across function calls, I like the way gruvi (https://github.com/geertj/gruvi) handles it, with a construct to assert that no yielding takes place:

    with assert_no_switchpoints():
        do_something()
        do_something_else()

I also find that it is less error-prone (no missing yield from), but that is a minor point, as I could write a static analyzer (on top of test cases, of course) to check for that.

But that's just my opinion, and opinions evolve :)
[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.
Joshua Moore-Oliva added the comment:

> I will try to review later tonight.

Thanks!

> That makes sense when using gevent, but not when using asyncio or Trollius. Nothing will make events run if you don't use yield [from].

Yes, I am aware of that. I have written a small custom library using fibers (a greenlet-like library) on top of asyncio so that I don't need to use yield from in my application(s).
[issue21965] Add support for Memory BIO to _ssl
Changes by Joshua Moore-Oliva chatg...@gmail.com:

--
nosy: +chatgris

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21965
[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.
New submission from Joshua Moore-Oliva:

The core issue stems from the implementation of Timer cancellation (which features like asyncio.wait_for build upon). BaseEventLoop stores scheduled events in an array-backed heap (heapq) named _scheduled. Once an event has been scheduled with call_at, cancelling the event only marks it as cancelled; it does not remove it from the array-backed heap. It is only removed once the cancelled event is the next scheduled event for the loop.

In a system where many events that may have long timeout periods are scheduled (and then cancelled), and there always exists at least one event that is scheduled for an earlier time, memory use is practically unbounded.

The attached program wait_for.py demonstrates a trivial example where memory use is practically unbounded for an hour of time. This is the case even though the program only ever has two uncancelled events and two coroutines at any given time in its execution.

This could be fixed in a variety of ways:

a) Timer cancellation could result in the object being removed from the heap, like in the sched module. This would be at least O(N), where N is the number of scheduled events.

b) Timer cancellation could trigger a callback that tracks the number of cancelled events in the _scheduled list. Once this number exceeds a threshold (50%?), the list could be cleared of all cancelled events and then be re-heapified.

c) A balanced tree structure could be used to store the scheduled events, giving O(log N) removal (the current implementation is already O(log N) for heappop anyway).

Given Python's lack of a balanced tree structure in the standard library, I assume option c) is a non-starter.
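To make option b) concrete, here is a minimal sketch of the idea, not asyncio's actual code and not the patch under review. The names SketchHandle, SketchScheduler, MIN_SIZE and MAX_CANCELLED_FRACTION are hypothetical, and the thresholds are arbitrary assumptions. Cancellation only marks the handle and bumps a counter; once more than half of a sufficiently large heap is cancelled, the list is filtered and re-heapified in one O(N) pass.

```python
import heapq
import itertools


class SketchHandle:
    """Hypothetical timer handle for illustration; not asyncio's TimerHandle."""

    def __init__(self, when, seq, owner):
        self.when = when
        self.seq = seq          # tie-breaker so equal deadlines stay ordered
        self.cancelled = False
        self.in_heap = True     # cleared once the scheduler pops the handle
        self._owner = owner

    def __lt__(self, other):
        return (self.when, self.seq) < (other.when, other.seq)

    def cancel(self):
        if not self.cancelled:
            self.cancelled = True
            if self.in_heap:
                self._owner._note_cancelled()


class SketchScheduler:
    """Hypothetical scheduler showing lazy cancellation with compaction."""

    MIN_SIZE = 100                  # assumed: skip compaction for small heaps
    MAX_CANCELLED_FRACTION = 0.5    # assumed: the "50%?" threshold from option b)

    def __init__(self):
        self._scheduled = []
        self._cancelled_count = 0
        self._seq = itertools.count()

    def call_at(self, when):
        handle = SketchHandle(when, next(self._seq), self)
        heapq.heappush(self._scheduled, handle)
        return handle

    def _note_cancelled(self):
        self._cancelled_count += 1
        size = len(self._scheduled)
        if (size > self.MIN_SIZE
                and self._cancelled_count > size * self.MAX_CANCELLED_FRACTION):
            # One O(N) pass drops every cancelled handle, then restores the heap.
            self._scheduled = [h for h in self._scheduled if not h.cancelled]
            heapq.heapify(self._scheduled)
            self._cancelled_count = 0

    def pop_due(self, now):
        """Return the live handles whose deadline has passed."""
        due = []
        while self._scheduled and self._scheduled[0].when <= now:
            handle = heapq.heappop(self._scheduled)
            handle.in_heap = False
            if handle.cancelled:
                self._cancelled_count -= 1
            else:
                due.append(handle)
        return due
```

With one early live handle and a stream of long-deadline handles that are cancelled right after being scheduled (the pattern described above), the heap in this sketch stays near MIN_SIZE entries instead of growing with every cancellation.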
I would prefer option b) over option a): when there are a lot of scheduled events in the system (upwards of 50,000 - 100,000 in some of my use cases), the amortized complexity for cancelling an event trends towards O(1) (N/2 cancellations are handled by a single O(N) event), at the cost of slightly more memory, though bounded relative to the number of events.

I would be willing to take a shot at implementing this patch with the most agreeable option. Please let me know if that would be appreciated, or if someone else would rather tackle this issue. (First-time bug report for Python; not sure of the politics/protocols involved.)

Disclaimer: I am by no means an asyncio expert; my understanding of the code base comes from reading it while debugging this memory leak.

--
components: asyncio
files: wait_for.py
messages: 227136
nosy: chatgris, gvanrossum, haypo, yselivanov
priority: normal
severity: normal
status: open
title: call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.
type: resource usage
versions: Python 3.4

Added file: http://bugs.python.org/file3/wait_for.py