Re: [Python-Dev] PEP 525, fourth update

2016-09-07 Thread Guido van Rossum
Thanks Yury! (Everyone else following along, the PEP is accepted
provisionally, and we may make small tweaks from time to time during
Python 3.6's lifetime.)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 525, fourth update

2016-09-07 Thread Yury Selivanov

Thank you, Guido!


I've updated the PEP to make shutdown_asyncgens a coroutine, as we 
discussed.



Yury


On 2016-09-06 7:10 PM, Guido van Rossum wrote:

Thanks Yury!

I am hereby accepting PEP 525 provisionally. The acceptance is so that
you can go ahead and merge this into 3.6 before the feature freeze
this weekend. The provisional status is because this is a big project
and it's likely that we'll need to tweak some small aspect of the API
once the code is in, even after 3.6.0 is out. (Similar to the way PEP
492, async/await, was accepted provisionally.) But I am cautiously
optimistic and I am grateful to Yury for the care and effort he has
put into it.

--Guido

On Tue, Sep 6, 2016 at 5:10 PM, Yury Selivanov  wrote:

Hi,

I've updated PEP 525 with a new section about asyncio changes.

Essentially, the asyncio event loop will get a new "shutdown_asyncgens" method
that allows one to reliably close all AGs associated with the loop before
closing the loop itself.

Only the updated section is pasted below:


asyncio
-------

The asyncio event loop will use ``sys.set_asyncgen_hooks()`` API to
maintain a weak set of all scheduled asynchronous generators, and to
schedule their ``aclose()`` coroutine methods when it is time for
generators to be GCed.

To make sure that asyncio programs can finalize all scheduled
asynchronous generators reliably, we propose to add a new event loop
method ``loop.shutdown_asyncgens(*, timeout=30)``.  The method will
schedule all currently open asynchronous generators to close with an
``aclose()`` call.

After calling the ``loop.shutdown_asyncgens()`` method, the event loop
will issue a warning whenever a new asynchronous generator is iterated
for the first time.  The idea is that after requesting all asynchronous
generators to be shut down, the program should not execute code that
iterates over new asynchronous generators.

An example of how ``shutdown_asyncgens`` should be used::

    try:
        loop.run_forever()
        # or loop.run_until_complete(...)
    finally:
        loop.shutdown_asyncgens()
        loop.close()
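Outside asyncio, the tracking mechanism described above can be sketched with the PEP's proposed hooks. This is a simplified illustration of the hook protocol, not asyncio's actual implementation; the `drive` helper and the timing of the `del`-triggered finalizer rely on CPython's reference counting:

```python
import sys
import weakref

tracked = weakref.WeakSet()   # the loop's weak set of live generators
finalized = []                # names of generators handed to the finalizer

def firstiter(agen):
    # called the first time an async generator is iterated
    tracked.add(agen)

def finalizer(agen):
    # a real event loop would schedule agen.aclose() here
    finalized.append(agen.ag_code.co_name)

sys.set_asyncgen_hooks(firstiter=firstiter, finalizer=finalizer)

async def ticker():
    yield 1

def drive(coro):
    # step a coroutine that never awaits anything external
    try:
        coro.send(None)
    except StopIteration as exc:
        return exc.value

g = ticker()
assert drive(g.__anext__()) == 1   # first iteration -> firstiter fires
assert g in tracked
del g                              # dropped while suspended -> finalizer fires
assert finalized == ['ticker']
```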

-
Yury







Re: [Python-Dev] PEP 525, fourth update

2016-09-06 Thread Guido van Rossum
Thanks Yury!

I am hereby accepting PEP 525 provisionally. The acceptance is so that
you can go ahead and merge this into 3.6 before the feature freeze
this weekend. The provisional status is because this is a big project
and it's likely that we'll need to tweak some small aspect of the API
once the code is in, even after 3.6.0 is out. (Similar to the way PEP
492, async/await, was accepted provisionally.) But I am cautiously
optimistic and I am grateful to Yury for the care and effort he has
put into it.

--Guido

On Tue, Sep 6, 2016 at 5:10 PM, Yury Selivanov  wrote:
> Hi,
>
> I've updated PEP 525 with a new section about asyncio changes.
>
> Essentially, the asyncio event loop will get a new "shutdown_asyncgens" method
> that allows one to reliably close all AGs associated with the loop before
> closing the loop itself.
>
> Only the updated section is pasted below:
>
>
> asyncio
> -------
>
> The asyncio event loop will use ``sys.set_asyncgen_hooks()`` API to
> maintain a weak set of all scheduled asynchronous generators, and to
> schedule their ``aclose()`` coroutine methods when it is time for
> generators to be GCed.
>
> To make sure that asyncio programs can finalize all scheduled
> asynchronous generators reliably, we propose to add a new event loop
> method ``loop.shutdown_asyncgens(*, timeout=30)``.  The method will
> schedule all currently open asynchronous generators to close with an
> ``aclose()`` call.
>
> After calling the ``loop.shutdown_asyncgens()`` method, the event loop
> will issue a warning whenever a new asynchronous generator is iterated
> for the first time.  The idea is that after requesting all asynchronous
> generators to be shut down, the program should not execute code that
> iterates over new asynchronous generators.
>
> An example of how ``shutdown_asyncgens`` should be used::
>
>     try:
>         loop.run_forever()
>         # or loop.run_until_complete(...)
>     finally:
>         loop.shutdown_asyncgens()
>         loop.close()
>
> -
> Yury



-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] PEP 525, fourth update

2016-09-06 Thread Yury Selivanov

Hi,

I've updated PEP 525 with a new section about asyncio changes.

Essentially, the asyncio event loop will get a new "shutdown_asyncgens"
method that allows one to reliably close all AGs associated with the loop
before closing the loop itself.


Only the updated section is pasted below:


asyncio
-------

The asyncio event loop will use ``sys.set_asyncgen_hooks()`` API to
maintain a weak set of all scheduled asynchronous generators, and to
schedule their ``aclose()`` coroutine methods when it is time for
generators to be GCed.

To make sure that asyncio programs can finalize all scheduled
asynchronous generators reliably, we propose to add a new event loop
method ``loop.shutdown_asyncgens(*, timeout=30)``.  The method will
schedule all currently open asynchronous generators to close with an
``aclose()`` call.

After calling the ``loop.shutdown_asyncgens()`` method, the event loop
will issue a warning whenever a new asynchronous generator is iterated
for the first time.  The idea is that after requesting all asynchronous
generators to be shut down, the program should not execute code that
iterates over new asynchronous generators.

An example of how ``shutdown_asyncgens`` should be used::

    try:
        loop.run_forever()
        # or loop.run_until_complete(...)
    finally:
        loop.shutdown_asyncgens()
        loop.close()

-
Yury


Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Greg Ewing

Nick Coghlan wrote:

For synchronous code, that's a relatively easy burden to push back
onto the programmer - assuming fair thread scheduling, a with
statement can reliably ensure prompt resource cleanup.

That assurance goes out the window as soon as you explicitly pause
code execution inside the body of the with statement - it doesn't
matter whether it's via yield, yield from, or await, you've completely
lost that assurance of immediacy.


I don't see how this is any worse than a thread containing
an ordinary with-statement that waits for something that
will never happen. If that's the case, then you've got a
deadlock, and you have more to worry about than resources
not being released.

I think what all this means is that an event loop must
not simply drop async tasks on the floor. If it's asked
to cancel a task, it should do that by throwing an
appropriate exception into it and letting it unwind
itself.

To go along with that, the programmer needs to understand
that he can't just fire off a task and abandon it if it
uses external resources and is not guaranteed to finish
under its own steam. He needs to arrange a timeout or
other mechanism to cancel it if it doesn't complete in a
timely manner.

If those things are done, an async with should be exactly
as adequate for resource cleanup as an ordinary with is in
a thread. It also shouldn't be necessary to have any
special protocol for finalising an async generator; async
with together with a way of throwing an exception into a
task should be all that's needed.
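Greg's point can be demonstrated without an event loop by driving a coroutine by hand: closing it (which throws GeneratorExit at the suspension point) unwinds the async with block. A minimal sketch; the Resource and Forever classes are stand-ins, not any real API:

```python
class Resource:
    # stand-in async context manager
    def __init__(self, log):
        self.log = log
    async def __aenter__(self):
        self.log.append('acquired')
        return self
    async def __aexit__(self, *exc_info):
        self.log.append('released')

class Forever:
    # an awaitable that suspends once and would never resume on its own
    def __await__(self):
        yield self

async def task(log):
    async with Resource(log):
        await Forever()          # stuck until someone cancels us

log = []
t = task(log)
t.send(None)                     # start: resource acquired, task suspends
t.close()                        # "cancel": throws GeneratorExit into it
assert log == ['acquired', 'released']
```

The async with exit handler runs during unwinding, exactly as a with block would when a thread is interrupted by an exception.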

--
Greg



Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Yury Selivanov

Hi Oscar,

I don't think PyPy is in breach of the language spec here. Python made
a decision a long time ago to shun RAII-style implicit cleanup in
favour of with-style explicit cleanup.

The solution to this problem is to move resource management outside of
the generator functions. This is true for ordinary generators without
an event-loop etc. The example in the PEP is

async def square_series(con, to):
    async with con.transaction():
        cursor = con.cursor(
            'SELECT generate_series(0, $1) AS i', to)
        async for row in cursor:
            yield row['i'] ** 2

async for i in square_series(con, 1000):
    if i == 100:
        break

The normal generator equivalent of this is:

def square_series(con, to):
    with con.transaction():
        cursor = con.cursor(
            'SELECT generate_series(0, $1) AS i', to)
        for row in cursor:
            yield row['i'] ** 2

This code is already broken: move the with statement outside to the
caller of the generator function.


Exactly.

I used 'async with' in the PEP to demonstrate that the cleanup 
mechanisms are powerful enough to handle bad code patterns.
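The caller-managed pattern Oscar recommends can be sketched with a synchronous stand-in; the transaction context manager and cursor below are toys, not a real database API:

```python
from contextlib import contextmanager

log = []

@contextmanager
def transaction():
    # toy stand-in for con.transaction()
    log.append('begin')
    try:
        yield
    finally:
        log.append('finish')

def square_series(cursor):
    # the generator no longer owns the resource; it only consumes it
    for i in cursor:
        yield i ** 2

with transaction():                      # the *caller* owns the lifetime
    for sq in square_series(iter(range(1000))):
        if sq == 100:
            break
# cleanup ran promptly even though the generator was abandoned mid-iteration
assert log == ['begin', 'finish']
```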


Thank you,
Yury


Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Nick Coghlan
On 4 September 2016 at 04:38, Oscar Benjamin  wrote:
> On 3 September 2016 at 16:42, Nick Coghlan  wrote:
>> On 2 September 2016 at 19:13, Nathaniel Smith  wrote:
>>> This works OK on CPython because the reference-counting gc will call
>>> handle.__del__() at the end of the scope (so on CPython it's at level
>>> 2), but it famously causes huge problems when porting to PyPy with
>> its much faster and more sophisticated gc that only runs when
>>> triggered by memory pressure. (Or for "PyPy" you can substitute
>>> "Jython", "IronPython", whatever.) Technically this code doesn't
>>> actually "leak" file descriptors on PyPy, because handle.__del__()
>>> will get called *eventually* (this code is at level 1, not level 0),
>>> but by the time "eventually" arrives your server process has probably
>>> run out of file descriptors and crashed. Level 1 isn't good enough. So
>>> now we have all learned to instead write
> ...
>>> BUT, with the current PEP 525 proposal, trying to use this generator
>>> in this way is exactly analogous to the open(path).read() case: on
>>> CPython it will work fine -- the generator object will leave scope at
>>> the end of the 'async for' loop, cleanup methods will be called, etc.
>>> But on PyPy, the weakref callback will not be triggered until some
>>> arbitrary time later, you will "leak" file descriptors, and your
>>> server will crash.
>>
>> That suggests the PyPy GC should probably be tracking pressure on more
>> resources than just memory when deciding whether or not to trigger a
>> GC run.
>
> PyPy's GC is conformant to the language spec

The language spec doesn't say anything about what triggers GC cycles -
that's purely a decision for runtime implementors based on the
programming experience they want to provide their users.

CPython runs GC pretty eagerly, with it being immediate when the
automatic reference counting is sufficient and the cyclic GC doesn't
have to get involved at all.
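That eagerness is easy to observe; a tiny demonstration (the immediate timing is CPython-specific):

```python
# With reference counting, __del__ runs as soon as the last reference
# disappears, without waiting for a cyclic-GC pass.
log = []

class Tracked:
    def __del__(self):
        log.append('collected')

def scope():
    t = Tracked()            # last reference dies when scope() returns

scope()
assert log == ['collected']  # immediate on CPython; deferred on PyPy et al.
```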

If I understand correctly, PyPy currently decides whether or not to
trigger a GC cycle based primarily on memory pressure, even though the
uncollected garbage may also be holding on to system resources other
than memory (like file descriptors).

For synchronous code, that's a relatively easy burden to push back
onto the programmer - assuming fair thread scheduling, a with
statement can reliably ensure prompt resource cleanup.

That assurance goes out the window as soon as you explicitly pause
code execution inside the body of the with statement - it doesn't
matter whether it's via yield, yield from, or await, you've completely
lost that assurance of immediacy.

At that point, even CPython doesn't ensure prompt release of resources
- it just promises to try to clean things up as soon as it can and as
best it can (which is usually pretty soon and pretty well, with recent
iterations of 3.x, but event loops will still happily keep things
alive indefinitely if they're waiting for events that never happen).

For synchronous generators, you can make your API a bit more
complicated, and ask your caller to handle the manual resource
management, but you may not want to do that.

The asynchronous case is even worse though, as there, you often simply
can't readily push the burden back onto the programmer, because the
code is *meant* to be waiting for events and reacting to them, rather
than proceeding deterministically from beginning to end.

So while it's good that PEP 492 and 525 attempt to adapt synchronous
resource management models to the asynchronous world, it's also
important to remember that there's a fundamental mismatch of
underlying concepts when it comes to trying to pair up deterministic
resource management with asynchronous code - you're often going to
want to tip the model on its side and set up a dedicated resource
manager that other components can interact with, and then have the
resource manager take care of promptly releasing the resources when
the other components go away (perhaps with notions of leases and lease
renewals if you simply cannot afford unexpected delays in resources
being released).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Yury Selivanov

Hi Nathaniel,

On 2016-09-02 2:13 AM, Nathaniel Smith wrote:

On Thu, Sep 1, 2016 at 3:34 PM, Yury Selivanov  wrote:

Hi,

I've spent quite a while thinking and experimenting with PEP 525, trying to
figure out how to make asynchronous generator (AG) finalization reliable.
I've tried to replace the GC callback with a callback that intercepts the
first iteration of AGs.  It turns out it's very hard to work with weak refs
and make the asyncio event loop reliably track and shut down all open AGs.

My new approach is to replace the "sys.set_asyncgen_finalizer(finalizer)"
function with "sys.set_asyncgen_hooks(firstiter=None, finalizer=None)".

1) Can/should these hooks be used by other types besides async
generators? (e.g., async iterators that are not async generators?)
What would that look like?


Asynchronous iterators (classes implementing __aiter__, __anext__) 
should use __del__ for any cleanup purposes.


sys.set_asyncgen_hooks only supports asynchronous generators.



2) In the asyncio design it's legal for an event loop to be stopped
and then started again. Currently (I guess for this reason?) asyncio
event loops do not forcefully clean up resources associated with them
on shutdown. For example, if I open a StreamReader, loop.stop() and
loop.close() will not automatically close it for me. When, concretely,
are you imagining that asyncio will run these finalizers?


I think we will add another API method to the asyncio event loop, which 
users will call before closing the loop.  In my reference implementation 
I added a synchronous `loop.shutdown()` method.




3) Should the cleanup code in the generator be able to distinguish
between "this iterator has left scope" versus "the event loop is being
violently shut down"?


This is already handled in the reference implementation.  When an AG is 
iterated for the first time, the loop starts tracking it by adding it to 
a weak set.  When the AG is about to be GCed, the loop removes it from 
the weak set, and schedules its 'aclose()'.


If 'loop.shutdown' is called, it means that the loop is being "violently 
shut down", so we schedule 'aclose()' for all AGs in the weak set.




4) More fundamentally -- this revision is definitely an improvement,
but it doesn't really address the main concern I have. Let me see if I
can restate it more clearly.

Let's define 3 levels of cleanup handling:

   Level 0: resources (e.g. file descriptors) cannot be reliably cleaned up.

   Level 1: resources are cleaned up reliably, but at an unpredictable time.

   Level 2: resources are cleaned up both reliably and promptly.

In Python 3.5, unless you're very anal about writing cumbersome 'async
with' blocks around every single 'async for', resources owned by async
iterators land at level 0. (Because the only cleanup method available
is __del__, and __del__ cannot make async calls, so if you need async
calls to do clean up then you're just doomed.)

I think the revised draft does a good job of moving async
generators from level 0 to level 1 -- the finalizer hook gives a way
to effectively call back into the event loop from __del__, and the
shutdown hook gives us a way to guarantee that the cleanup happens
while the event loop is still running.

Right.  It's good to hear that you agree that the latest revision of the 
PEP makes AG cleanup reliable (albeit with unpredictable timing; more on 
that below).


My goal was exactly this - make the mechanism reliable, with the same 
predictability as what we have for __del__.



But... IIUC, it's now generally agreed that for Python code, level 1
is simply *not good enough*. (Or to be a little more precise, it's
good enough for the case where the resource being cleaned up is
memory, because the garbage collector knows when memory is short, but
it's not good enough for resources like file descriptors.) The classic
example of this is code like:


I think this is where I don't agree with you 100%.  There is no strict 
guarantee that an object will be GCed in a timely manner in CPython or 
PyPy.  If it's part of a ref cycle, it might not be cleaned up at all.


All in all, I don't see where in your examples AGs are different from, 
let's say, synchronous generators.


For instance:

   async def read_json_lines_from_server(host, port):
       async for line in asyncio.open_connection(host, port)[0]:
           yield json.loads(line)

You would expect to use this like:

   async for data in read_json_lines_from_server(host, port):
       ...


If you rewrite the above code without the 'async' keyword, you'd have a 
synchronous generator with *exactly* the same problems.

tl;dr: AFAICT this revision of PEP 525 is enough to make it work
reliably on CPython, but I have serious concerns that it bakes a
CPython-specific design into the language. I would prefer a design
that actually aims for "level 2" cleanup semantics (for example, [1])



I honestly don't see why PEP 525 can't be implemented in 

Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Oscar Benjamin
On 3 September 2016 at 16:42, Nick Coghlan  wrote:
> On 2 September 2016 at 19:13, Nathaniel Smith  wrote:
>> This works OK on CPython because the reference-counting gc will call
>> handle.__del__() at the end of the scope (so on CPython it's at level
>> 2), but it famously causes huge problems when porting to PyPy with
>> its much faster and more sophisticated gc that only runs when
>> triggered by memory pressure. (Or for "PyPy" you can substitute
>> "Jython", "IronPython", whatever.) Technically this code doesn't
>> actually "leak" file descriptors on PyPy, because handle.__del__()
>> will get called *eventually* (this code is at level 1, not level 0),
>> but by the time "eventually" arrives your server process has probably
>> run out of file descriptors and crashed. Level 1 isn't good enough. So
>> now we have all learned to instead write
...
>> BUT, with the current PEP 525 proposal, trying to use this generator
>> in this way is exactly analogous to the open(path).read() case: on
>> CPython it will work fine -- the generator object will leave scope at
>> the end of the 'async for' loop, cleanup methods will be called, etc.
>> But on PyPy, the weakref callback will not be triggered until some
>> arbitrary time later, you will "leak" file descriptors, and your
>> server will crash.
>
> That suggests the PyPy GC should probably be tracking pressure on more
> resources than just memory when deciding whether or not to trigger a
> GC run.

PyPy's GC is conformant to the language spec AFAICT:
https://docs.python.org/3/reference/datamodel.html#object.__del__

"""
object.__del__(self)

Called when the instance is about to be destroyed. This is also called
a destructor. If a base class has a __del__() method, the derived
class’s __del__() method, if any, must explicitly call it to ensure
proper deletion of the base class part of the instance. Note that it
is possible (though not recommended!) for the __del__() method to
postpone destruction of the instance by creating a new reference to
it. It may then be called at a later time when this new reference is
deleted. It is not guaranteed that __del__() methods are called for
objects that still exist when the interpreter exits.
"""

Note the last sentence. It is also not guaranteed (across different
Python implementations and regardless of the CPython-specific notes in
the docs) that any particular object will cease to exist before the
interpreter exits. Taken together these two imply that it is not
guaranteed that *any* __del__ method will ever be called.

Antoine's excellent work in PEP 442 has improved the situation with
CPython but the language spec (covering all implementations) remains
the same and changing that requires a new PEP and coordination with
other implementations. Without changing it, it is a mistake to base a new
core language feature (async finalisation) on CPython-specific
implementation details. Already, using with (or try/finally etc.)
inside a generator function behaves differently under PyPy:

$ cat gentest.py

def generator_needs_finalisation():
    try:
        for n in range(10):
            yield n
    finally:
        print('Doing important cleanup')

for obj in generator_needs_finalisation():
    if obj == 5:
        break

print('Process exit')

$ python gentest.py
Doing important cleanup
Process exit

So here the cleanup is triggered by the reference count of the
generator falling at the break statement. Under CPython this
corresponds to Nathaniel's "level 2" cleanup. If we keep another
reference around it gets done at process exit:

$ cat gentest2.py

def generator_needs_finalisation():
    try:
        for n in range(10):
            yield n
    finally:
        print('Doing important cleanup')

gen = generator_needs_finalisation()
for obj in gen:
    if obj == 5:
        break

print('Process exit')

$ python gentest2.py
Process exit
Doing important cleanup

So that's Nathaniel's "level 1" cleanup. However if you run either of
these scripts under PyPy the cleanup simply won't occur (i.e. "level
0" cleanup):

$ pypy gentest.py
Process exit
$ pypy gentest2.py
Process exit

I don't think PyPy is in breach of the language spec here. Python made
a decision a long time ago to shun RAII-style implicit cleanup in
favour of with-style explicit cleanup.

The solution to this problem is to move resource management outside of
the generator functions. This is true for ordinary generators without
an event-loop etc. The example in the PEP is

async def square_series(con, to):
    async with con.transaction():
        cursor = con.cursor(
            'SELECT generate_series(0, $1) AS i', to)
        async for row in cursor:
            yield row['i'] ** 2

async for i in square_series(con, 1000):
    if i == 100:
        break

The normal generator equivalent of this is:

def square_series(con, to):
    with con.transaction():
        cursor = con.cursor(
            'SELECT generate_series(0, $1) AS i', to)
  

Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Nick Coghlan
On 2 September 2016 at 19:13, Nathaniel Smith  wrote:
> This works OK on CPython because the reference-counting gc will call
> handle.__del__() at the end of the scope (so on CPython it's at level
> 2), but it famously causes huge problems when porting to PyPy with
> its much faster and more sophisticated gc that only runs when
> triggered by memory pressure. (Or for "PyPy" you can substitute
> "Jython", "IronPython", whatever.) Technically this code doesn't
> actually "leak" file descriptors on PyPy, because handle.__del__()
> will get called *eventually* (this code is at level 1, not level 0),
> but by the time "eventually" arrives your server process has probably
> run out of file descriptors and crashed. Level 1 isn't good enough. So
> now we have all learned to instead write
>
>  # good modern Python style:
>  def get_file_contents(path):
>      with open(path) as handle:
>          return handle.read()

This only works if the file fits in memory - otherwise you just have
to accept the fact that you need to leave the file handle open until
you're "done with the iterator", which means deferring the resource
management to the caller.

> and we have fancy tools like the ResourceWarning machinery to help us
> catch these bugs.
>
> Here's the analogous example for async generators. This is a useful,
> realistic async generator, that lets us incrementally read from a TCP
> connection that streams newline-separated JSON documents:
>
>   async def read_json_lines_from_server(host, port):
>       async for line in asyncio.open_connection(host, port)[0]:
>           yield json.loads(line)
>
> You would expect to use this like:
>
>   async for data in read_json_lines_from_server(host, port):
>       ...

The actual synchronous equivalent to this would look more like:

def read_data_from_file(path):
    with open(path) as f:
        for line in f:
            yield line

(Assume we're doing something interesting to each line, rather than
reproducing normal file iteration behaviour)

And that has the same problem as your asynchronous example: the caller
needs to worry about resource management on the generator and do:


with closing(read_data_from_file(path)) as itr:
    for line in itr:
        ...

Which means the problem causing your concern doesn't arise from the
generator being asynchronous - it comes from the fact that the generator
actually *needs* to hold the FD open in order to work as intended (if
it didn't, then the code wouldn't need to be asynchronous).

> BUT, with the current PEP 525 proposal, trying to use this generator
> in this way is exactly analogous to the open(path).read() case: on
> CPython it will work fine -- the generator object will leave scope at
> the end of the 'async for' loop, cleanup methods will be called, etc.
> But on PyPy, the weakref callback will not be triggered until some
> arbitrary time later, you will "leak" file descriptors, and your
> server will crash.

That suggests the PyPy GC should probably be tracking pressure on more
resources than just memory when deciding whether or not to trigger a
GC run.

> For correct operation, you have to replace the
> simple 'async for' loop with this lovely construct:
>
>   async with aclosing(read_json_lines_from_server(host, port)) as ait:
>       async for data in ait:
>           ...
>
> Of course, you only have to do this on loops whose iterator might
> potentially hold resources like file descriptors, either currently or
> in the future. So... uh... basically that's all loops, I guess? If you
> want to be a good defensive programmer?

At that level of defensiveness in asynchronous code, you need to start
treating all external resources (including file descriptors) as a
managed pool, just as we have process and thread pools in the standard
library, and many database and networking libraries offer connection
pooling. It limits your per-process concurrency, but that limit exists
anyway at the operating system level - modelling it explicitly just
lets you manage how the application handles those limits.
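The pooling idea Nick describes can be sketched in a few lines; this is a toy synchronous illustration (names are made up), not a production pool:

```python
import queue

class ResourcePool:
    """Toy pool: the pool, not GC timing, bounds how many resources
    are open at once; callers lease and must release."""
    def __init__(self, factory, size):
        self._free = queue.Queue()
        for _ in range(size):
            self._free.put(factory())

    def lease(self, timeout=None):
        # blocks (up to ``timeout`` seconds) when the pool is exhausted
        return self._free.get(timeout=timeout)

    def release(self, resource):
        self._free.put(resource)

pool = ResourcePool(object, size=2)
a = pool.lease()
b = pool.lease()
pool.release(a)
c = pool.lease()     # reuses the resource released above
assert c is a
```

A lease timeout plays the role of the "lease renewal" Nick mentions: a caller that fails to release in time can be reclaimed by the pool instead of holding a descriptor forever.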

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-02 Thread Nathaniel Smith
On Thu, Sep 1, 2016 at 3:34 PM, Yury Selivanov  wrote:
> Hi,
>
> I've spent quite a while thinking and experimenting with PEP 525, trying to
> figure out how to make asynchronous generator (AG) finalization reliable.
> I've tried to replace the GC callback with a callback that intercepts the
> first iteration of AGs.  It turns out it's very hard to work with weak refs
> and make the asyncio event loop reliably track and shut down all open AGs.
>
> My new approach is to replace the "sys.set_asyncgen_finalizer(finalizer)"
> function with "sys.set_asyncgen_hooks(firstiter=None, finalizer=None)".

1) Can/should these hooks be used by other types besides async
generators? (e.g., async iterators that are not async generators?)
What would that look like?

2) In the asyncio design it's legal for an event loop to be stopped
and then started again. Currently (I guess for this reason?) asyncio
event loops do not forcefully clean up resources associated with them
on shutdown. For example, if I open a StreamReader, loop.stop() and
loop.close() will not automatically close it for me. When, concretely,
are you imagining that asyncio will run these finalizers?

3) Should the cleanup code in the generator be able to distinguish
between "this iterator has left scope" versus "the event loop is being
violently shut down"?

4) More fundamentally -- this revision is definitely an improvement,
but it doesn't really address the main concern I have. Let me see if I
can restate it more clearly.

Let's define 3 levels of cleanup handling:

  Level 0: resources (e.g. file descriptors) cannot be reliably cleaned up.

  Level 1: resources are cleaned up reliably, but at an unpredictable time.

  Level 2: resources are cleaned up both reliably and promptly.

In Python 3.5, unless you're very anal about writing cumbersome 'async
with' blocks around every single 'async for', resources owned by async
iterators land at level 0. (Because the only cleanup method available
is __del__, and __del__ cannot make async calls, so if you need async
calls to do clean up then you're just doomed.)

I think the revised draft does a good job of moving async
generators from level 0 to level 1 -- the finalizer hook gives a way
to effectively call back into the event loop from __del__, and the
shutdown hook gives us a way to guarantee that the cleanup happens
while the event loop is still running.

But... IIUC, it's now generally agreed that for Python code, level 1
is simply *not good enough*. (Or to be a little more precise, it's
good enough for the case where the resource being cleaned up is
memory, because the garbage collector knows when memory is short, but
it's not good enough for resources like file descriptors.) The classic
example of this is code like:

 # used to be good, now considered poor style:
 def get_file_contents(path):
     handle = open(path)
     return handle.read()

This works OK on CPython because the reference-counting gc will call
handle.__del__() at the end of the scope (so on CPython it's at level
2), but it famously causes huge problems when porting to PyPy with
its much faster and more sophisticated gc that only runs when
triggered by memory pressure. (Or for "PyPy" you can substitute
"Jython", "IronPython", whatever.) Technically this code doesn't
actually "leak" file descriptors on PyPy, because handle.__del__()
will get called *eventually* (this code is at level 1, not level 0),
but by the time "eventually" arrives your server process has probably
run out of file descriptors and crashed. Level 1 isn't good enough. So
now we have all learned to instead write

 # good modern Python style:
 def get_file_contents(path):
     with open(path) as handle:
         return handle.read()

and we have fancy tools like the ResourceWarning machinery to help us
catch these bugs.

Here's the analogous example for async generators. This is a useful,
realistic async generator, that lets us incrementally read from a TCP
connection that streams newline-separated JSON documents:

  async def read_json_lines_from_server(host, port):
      async for line in (await asyncio.open_connection(host, port))[0]:
          yield json.loads(line)

You would expect to use this like:

  async for data in read_json_lines_from_server(host, port):
      ...

BUT, with the current PEP 525 proposal, trying to use this generator
in this way is exactly analogous to the open(path).read() case: on
CPython it will work fine -- the generator object will leave scope at
the end of the 'async for' loop, cleanup methods will be called, etc.
But on PyPy, the weakref callback will not be triggered until some
arbitrary time later, you will "leak" file descriptors, and your
server will crash. For correct operation, you have to replace the
simple 'async for' loop with this lovely construct:

  async with aclosing(read_json_lines_from_server(host, port)) as ait:
      async for data in ait:
          ...
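The aclosing() helper used above is assumed rather than defined; it is not in the 3.6 stdlib (an equivalent, contextlib.aclosing, only arrived in Python 3.10). A minimal sketch is just a few lines:

```python
class aclosing:
    """Async context manager that awaits agen.aclose() on exit,
    making async generator cleanup prompt instead of GC-dependent."""

    def __init__(self, agen):
        self._agen = agen

    async def __aenter__(self):
        # Hand the generator back so the caller can 'async for' over it.
        return self._agen

    async def __aexit__(self, exc_type, exc, tb):
        # Runs even on 'break' or exception: finalize the generator now.
        await self._agen.aclose()
```

Breaking out of the inner loop then still runs the generator's finally / async-with cleanup immediately, on CPython and PyPy alike.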

Of course, you only have to do this on loops whose 

[Python-Dev] PEP 525, third round, better finalization

2016-09-01 Thread Yury Selivanov

Hi,

I've spent quite a while thinking and experimenting with PEP 525, trying 
to figure out how to make asynchronous generators (AG) finalization 
reliable.  I tried to replace the GC callback with a callback that 
intercepts the first iteration of AGs.  Turns out it's very hard to work 
with weak-refs and to make the asyncio event loop reliably track and 
shut down all open AGs.


My new approach is to replace the 
"sys.set_asyncgen_finalizer(finalizer)" function with 
"sys.set_asyncgen_hooks(firstiter=None, finalizer=None)".


This design allows us to:

1. intercept first iteration of an AG.  That makes it possible for event 
loops to keep a weak set of all "open" AGs, and to implement a 
"shutdown" method to close the loop and close all AGs *reliably*.


2. intercept AG garbage collection.  That makes it possible to call 
"aclose" on GCed AGs to guarantee that 'finally' and 'async with' 
blocks are properly executed.


3. in later Python versions we can add more hooks, although I can't 
think of anything else we need to add right now.


I'm posting below the only updated PEP section. The latest PEP revision 
should also be available on python.org shortly.


All new proposed changes are available to play with in my fork of 
CPython here: https://github.com/1st1/cpython/tree/async_gen



Finalization
============

PEP 492 requires an event loop or a scheduler to run coroutines.
Because asynchronous generators are meant to be used from coroutines,
they also require an event loop to run and finalize them.

Asynchronous generators can have ``try..finally`` blocks, as well as
``async with``.  It is important to provide a guarantee that, even
when partially iterated, and then garbage collected, generators can
be safely finalized.  For example::

    async def square_series(con, to):
        async with con.transaction():
            cursor = con.cursor(
                'SELECT generate_series(0, $1) AS i', to)
            async for row in cursor:
                yield row['i'] ** 2

    async for i in square_series(con, 1000):
        if i == 100:
            break

The above code defines an asynchronous generator that uses
``async with`` to iterate over a database cursor in a transaction.
The generator is then iterated over with ``async for``, which interrupts
the iteration at some point.

The ``square_series()`` generator will then be garbage collected,
and without a mechanism to asynchronously close the generator, the
Python interpreter would not be able to do anything.

To solve this problem we propose to do the following:

1. Implement an ``aclose`` method on asynchronous generators
   returning a special *awaitable*.  When awaited it
   throws a ``GeneratorExit`` into the suspended generator and
   iterates over it until either a ``GeneratorExit`` or
   a ``StopAsyncIteration`` occurs.

   This is very similar to what the ``close()`` method does to regular
   Python generators, except that an event loop is required to execute
   ``aclose()``.

2. Raise a ``RuntimeError``, when an asynchronous generator executes
   a ``yield`` expression in its ``finally`` block (using ``await``
   is fine, though)::

    async def gen():
        try:
            yield
        finally:
            await asyncio.sleep(1)   # Can use 'await'.

            yield                    # Cannot use 'yield',
                                     # this line will trigger a
                                     # RuntimeError.

3. Add two new functions to the ``sys`` module:
   ``set_asyncgen_hooks()`` and ``get_asyncgen_hooks()``.

The idea behind ``sys.set_asyncgen_hooks()`` is to allow event
loops to intercept asynchronous generators iteration and finalization,
so that the end user does not need to care about the finalization
problem, and everything just works.

``sys.set_asyncgen_hooks()`` accepts two arguments:

* ``firstiter``: a callable which will be called when an asynchronous
  generator is iterated for the first time.

* ``finalizer``: a callable which will be called when an asynchronous
  generator is about to be GCed.

When an asynchronous generator is iterated for the first time,
it stores a reference to the current finalizer.  If there is none,
a ``RuntimeError`` is raised.  This provides a strong guarantee that
every asynchronous generator object will always have a finalizer
installed by the correct event loop.

When an asynchronous generator is about to be garbage collected,
it calls its cached finalizer.  The assumption is that the finalizer
will schedule an ``aclose()`` call with the loop that was active
when the iteration started.
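The hook mechanics can be observed without any event loop by driving an async generator by hand (a sketch; the `events` list is purely illustrative, and explicit aclose() is used so the finalizer is never needed):

```python
import sys

events = []

def firstiter(agen):
    # An event loop would add `agen` to a weak set here.
    events.append('firstiter')

def finalizer(agen):
    # An event loop would schedule `agen.aclose()` on itself here.
    events.append('finalizer')

old_hooks = sys.get_asyncgen_hooks()
sys.set_asyncgen_hooks(firstiter=firstiter, finalizer=finalizer)

async def gen():
    yield 42

g = gen()
try:
    # Drive one step manually: the asend awaitable delivers the
    # yielded value via StopIteration.
    g.__anext__().send(None)
except StopIteration as stop:
    value = stop.value            # 42

try:
    g.aclose().send(None)         # close explicitly, no GC involved
except StopIteration:
    pass

sys.set_asyncgen_hooks(*old_hooks)
```

The firstiter hook fires on the very first __anext__() call, which is exactly the point at which the generator also caches the current finalizer.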

For instance, here is how asyncio is modified to allow safe
finalization of asynchronous generators::

    # asyncio/base_events.py

    class BaseEventLoop:

        def run_forever(self):
            ...
            old_hooks = sys.get_asyncgen_hooks()
            sys.set_asyncgen_hooks(finalizer=self._finalize_asyncgen)
            try:
                ...
            finally:
                ...

Re: [Python-Dev] PEP 525

2016-08-25 Thread Nick Coghlan
On 25 August 2016 at 05:00, Yury Selivanov  wrote:
> On 2016-08-24 12:35 PM, Guido van Rossum wrote:
>> Hopefully there will be other discussion as well, otherwise I'll have to
>> accept the PEP once this issue is cleared up. :-)
>
> Curious to hear your thoughts on two different approaches to finalization.
> At this point, I'm inclined to change the PEP to use the second approach.  I
> think it gives much more power to event loops, and basically means that any
> kind of APIs to control AG (or to finalize the loop) is possible.

The notification/callback approach where the event loop is given a
chance to intercept the first iteration of any given coroutine seems
nicer to me, since it opens up more opportunities for event loops to
experiment with new ideas. As a very simple example, they could emit a
debugging message every time a new coroutine is started.

asyncio could provide a default notification hook that just mapped
weakref finalisation to asynchronous execution of aclose().

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 525

2016-08-24 Thread Sven R. Kunze

On 24.08.2016 21:05, Yury Selivanov wrote:
Sorry for making you irritated.  Please feel free to remind me about 
any concrete changes to the PEP that I promised to add on 
python-ideas.  I'll go and re-read that thread right now anyways.


No problem as it seems I wasn't the only one. So, it doesn't matter 
anymore. :)


Best,
Sven


Re: [Python-Dev] PEP 525

2016-08-24 Thread Sven R. Kunze

On 24.08.2016 21:00, Yury Selivanov wrote:


For an async generator there are two cases: either it tries to yield 
another value (the first time this happens you can throw an error 
back into it) or it tries to await -- in that case you can also throw 
an error back into it, and if the error comes out unhandled you can 
print the error (in both cases actually).


It's probably possible to specify all this behavior using some kind of 
default finalizer (though you don't have to implement it that way).


Hopefully there will be other discussion as well, otherwise I'll have 
to accept the PEP once this issue is cleared up. :-)


Curious to hear your thoughts on two different approaches to 
finalization.  At this point, I'm inclined to change the PEP to use 
the second approach.  I think it gives much more power to event loops, 
and basically means that any kind of APIs to control AG (or to 
finalize the loop) is possible.


I think your alternative approach is the better one. It feels more 
integrated even though it's harder for event loop implementors (which 
are rarer than normal event loop users). Also AG finalization is 
something that's not really custom to each AG but makes more sense at 
the event loop level, I think.


Best,
Sven


Re: [Python-Dev] PEP 525

2016-08-24 Thread Yury Selivanov

On 2016-08-24 3:01 PM, Sven R. Kunze wrote:


On 24.08.2016 18:35, Guido van Rossum wrote:
On Wed, Aug 24, 2016 at 8:17 AM, Yury Selivanov wrote:


On 2016-08-23 10:38 PM, Rajiv Kumar wrote:

I was playing with your implementation to gain a better
understanding of the operation of asend() and friends. Since
I was explicitly trying to "manually" advance the generators,
I wasn't using asyncio or other event loop. This meant that
the first thing I ran into with my toy code was the
RuntimeError ("cannot iterate async generator without
finalizer set").

As you have argued elsewhere, in practice the finalizer is
likely to be set by the event loop. Since the authors of
event loops are likely to know that they should set the
finalizer, would it perhaps be acceptable to merely issue a
warning instead of an error if the finalizer is not set? That
way there isn't an extra hoop to jump through for simple
examples.

In my case, I just called
sys.set_asyncgen_finalizer(lambda g: 1)
to get around the error and continue playing :) (I realize
that's a bad thing to do but it didn't matter for the toy cases)


Yeah, maybe warning would be sufficient.  I just find it's highly
unlikely that a lot of people would use async generators without
a loop/coroutine runner, as it's a very tedious process.


Heh, I had the same reaction as Rajiv. I think the tediousness is 
actually a good argument that there's no reason to forbid this. I 
don't even think a warning is needed. People who don't use a 
coroutine runner are probably just playing around (maybe even in the 
REPL) and they shouldn't get advice unasked.


I also was irritated as Yury said there were absolutely no changes 
after python-ideas. He said he might consider a clearer warning for 
those examples at the beginning of the PEP to make them work for the 
reader.


Sorry for making you irritated.  Please feel free to remind me about any 
concrete changes to the PEP that I promised to add on python-ideas.  
I'll go and re-read that thread right now anyways.


Yury


Re: [Python-Dev] PEP 525

2016-08-24 Thread Sven R. Kunze

On 24.08.2016 18:35, Guido van Rossum wrote:
On Wed, Aug 24, 2016 at 8:17 AM, Yury Selivanov wrote:


On 2016-08-23 10:38 PM, Rajiv Kumar wrote:

I was playing with your implementation to gain a better
understanding of the operation of asend() and friends. Since I
was explicitly trying to "manually" advance the generators, I
wasn't using asyncio or other event loop. This meant that the
first thing I ran into with my toy code was the RuntimeError
("cannot iterate async generator without finalizer set").

As you have argued elsewhere, in practice the finalizer is
likely to be set by the event loop. Since the authors of event
loops are likely to know that they should set the finalizer,
would it perhaps be acceptable to merely issue a warning
instead of an error if the finalizer is not set? That way
there isn't an extra hoop to jump through for simple examples.

In my case, I just called
sys.set_asyncgen_finalizer(lambda g: 1)
to get around the error and continue playing :) (I realize
that's a bad thing to do but it didn't matter for the toy cases)


Yeah, maybe warning would be sufficient.  I just find it's highly
unlikely that a lot of people would use async generators without a
loop/coroutine runner, as it's a very tedious process.


Heh, I had the same reaction as Rajiv. I think the tediousness is 
actually a good argument that there's no reason to forbid this. I 
don't even think a warning is needed. People who don't use a coroutine 
runner are probably just playing around (maybe even in the REPL) and 
they shouldn't get advice unasked.


I also was irritated as Yury said there were absolutely no changes after 
python-ideas. He said he might consider a clearer warning for those 
examples at the beginning of the PEP to make them work for the reader.




Would it be possible to print a warning only when an async generator 
is being finalized and doesn't run straight to the end without 
suspending or yielding? For regular generators we have a similar 
exception (although I don't recall whether we actually warn) -- if you 
call close() and it tries to yield another value it is just GC'ed 
without giving the frame more control. For an async generator there 
are two cases: either it tries to yield another value (the first time 
this happens you can throw an error back into it) or it tries to await 
-- in that case you can also throw an error back into it, and if the 
error comes out unhandled you can print the error (in both cases 
actually).


It's probably possible to specify all this behavior using some kind of 
default finalizer (though you don't have to implement it that way).


Does a default finalizer solve the "event loop does not know its AGs" 
problem?


Hopefully there will be other discussion as well, otherwise I'll have 
to accept the PEP once this issue is cleared up. :-)


--
--Guido van Rossum (python.org/~guido)






Re: [Python-Dev] PEP 525

2016-08-24 Thread Yury Selivanov

On 2016-08-24 12:35 PM, Guido van Rossum wrote:

On Wed, Aug 24, 2016 at 8:17 AM, Yury Selivanov wrote:


On 2016-08-23 10:38 PM, Rajiv Kumar wrote:

I was playing with your implementation to gain a better
understanding of the operation of asend() and friends. Since I
was explicitly trying to "manually" advance the generators, I
wasn't using asyncio or other event loop. This meant that the
first thing I ran into with my toy code was the RuntimeError
("cannot iterate async generator without finalizer set").

As you have argued elsewhere, in practice the finalizer is
likely to be set by the event loop. Since the authors of event
loops are likely to know that they should set the finalizer,
would it perhaps be acceptable to merely issue a warning
instead of an error if the finalizer is not set? That way
there isn't an extra hoop to jump through for simple examples.

In my case, I just called
sys.set_asyncgen_finalizer(lambda g: 1)
to get around the error and continue playing :) (I realize
that's a bad thing to do but it didn't matter for the toy cases)


Yeah, maybe warning would be sufficient.  I just find it's highly
unlikely that a lot of people would use async generators without a
loop/coroutine runner, as it's a very tedious process.


Heh, I had the same reaction as Rajiv. I think the tediousness is 
actually a good argument that there's no reason to forbid this. I 
don't even think a warning is needed. People who don't use a coroutine 
runner are probably just playing around (maybe even in the REPL) and 
they shouldn't get advice unasked.


Good point.



Would it be possible to print a warning only when an async generator 
is being finalized and doesn't run straight to the end without 
suspending or yielding? For regular generators we have a similar 
exception (although I don't recall whether we actually warn) -- if you 
call close() and it tries to yield another value it is just GC'ed 
without giving the frame more control.


Yes, we can implement the exact same semantics for AGs:

- A ResourceWarning will be issued if an AG is GCed and cannot be 
synchronously closed (that will happen if no finalizer is set and there 
are 'await' expressions in 'finally').


- A RuntimeError is issued when an AG is yielding (asynchronously) in 
its 'finally' block.


I think both of those things are already there in the reference 
implementation.  So we can just lift the requirement that an 
asynchronous finalizer be set before you iterate an AG.


For an async generator there are two cases: either it tries to yield 
another value (the first time this happens you can throw an error back 
into it) or it tries to await -- in that case you can also throw an 
error back into it, and if the error comes out unhandled you can print 
the error (in both cases actually).


It's probably possible to specify all this behavior using some kind of 
default finalizer (though you don't have to implement it that way).


Hopefully there will be other discussion as well, otherwise I'll have 
to accept the PEP once this issue is cleared up. :-)


Curious to hear your thoughts on two different approaches to 
finalization.  At this point, I'm inclined to change the PEP to use the 
second approach.  I think it gives much more power to event loops, and 
basically means that any kind of APIs to control AG (or to finalize the 
loop) is possible.


Thank you,
Yury


Re: [Python-Dev] PEP 525

2016-08-24 Thread Guido van Rossum
On Wed, Aug 24, 2016 at 8:17 AM, Yury Selivanov 
wrote:

> On 2016-08-23 10:38 PM, Rajiv Kumar wrote:
>
>> I was playing with your implementation to gain a better understanding of
>> the operation of asend() and friends. Since I was explicitly trying to
>> "manually" advance the generators, I wasn't using asyncio or other event
>> loop. This meant that the first thing I ran into with my toy code was the
>> RuntimeError ("cannot iterate async generator without finalizer set").
>>
>> As you have argued elsewhere, in practice the finalizer is likely to be
>> set by the event loop. Since the authors of event loops are likely to know
>> that they should set the finalizer, would it perhaps be acceptable to
>> merely issue a warning instead of an error if the finalizer is not set?
>> That way there isn't an extra hoop to jump through for simple examples.
>>
>> In my case, I just called
>> sys.set_asyncgen_finalizer(lambda g: 1)
>> to get around the error and continue playing :) (I realize that's a bad
>> thing to do but it didn't matter for the toy cases)
>>
>
> Yeah, maybe warning would be sufficient.  I just find it's highly unlikely
> that a lot of people would use async generators without a loop/coroutine
> runner, as it's a very tedious process.
>

Heh, I had the same reaction as Rajiv. I think the tediousness is actually
a good argument that there's no reason to forbid this. I don't even think a
warning is needed. People who don't use a coroutine runner are probably
just playing around (maybe even in the REPL) and they shouldn't get advice
unasked.

Would it be possible to print a warning only when an async generator is
being finalized and doesn't run straight to the end without suspending or
yielding? For regular generators we have a similar exception (although I
don't recall whether we actually warn) -- if you call close() and it tries
to yield another value it is just GC'ed without giving the frame more
control. For an async generator there are two cases: either it tries to
yield another value (the first time this happens you can throw an error
back into it) or it tries to await -- in that case you can also throw an
error back into it, and if the error comes out unhandled you can print the
error (in both cases actually).

It's probably possible to specify all this behavior using some kind of
default finalizer (though you don't have to implement it that way).

Hopefully there will be other discussion as well, otherwise I'll have to
accept the PEP once this issue is cleared up. :-)

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 525

2016-08-24 Thread Yury Selivanov

Hi Rajiv,

On 2016-08-23 10:38 PM, Rajiv Kumar wrote:

Hi Yury,

I was playing with your implementation to gain a better understanding 
of the operation of asend() and friends. Since I was explicitly trying 
to "manually" advance the generators, I wasn't using asyncio or other 
event loop. This meant that the first thing I ran into with my toy 
code was the RuntimeError ("cannot iterate async generator without 
finalizer set").


As you have argued elsewhere, in practice the finalizer is likely to 
be set by the event loop. Since the authors of event loops are likely 
to know that they should set the finalizer, would it perhaps be 
acceptable to merely issue a warning instead of an error if the 
finalizer is not set? That way there isn't an extra hoop to jump 
through for simple examples.


In my case, I just called
sys.set_asyncgen_finalizer(lambda g: 1)
to get around the error and continue playing :) (I realize that's a 
bad thing to do but it didn't matter for the toy cases)


Yeah, maybe warning would be sufficient.  I just find it's highly 
unlikely that a lot of people would use async generators without a 
loop/coroutine runner, as it's a very tedious process.


Thank you,
Yury


Re: [Python-Dev] PEP 525

2016-08-23 Thread Rajiv Kumar
Hi Yury,

I was playing with your implementation to gain a better understanding of
the operation of asend() and friends. Since I was explicitly trying to
"manually" advance the generators, I wasn't using asyncio or other event
loop. This meant that the first thing I ran into with my toy code was the
RuntimeError ("cannot iterate async generator without finalizer set").

As you have argued elsewhere, in practice the finalizer is likely to be set
by the event loop. Since the authors of event loops are likely to know that
they should set the finalizer, would it perhaps be acceptable to merely
issue a warning instead of an error if the finalizer is not set? That way
there isn't an extra hoop to jump through for simple examples.

In my case, I just called
sys.set_asyncgen_finalizer(lambda g: 1)
to get around the error and continue playing :) (I realize that's a bad
thing to do but it didn't matter for the toy cases)

- Rajiv


[Python-Dev] PEP 525

2016-08-23 Thread Yury Selivanov

Hi,

I think it's time to discuss PEP 525 on python-dev (pasted below).

There were no changes in the PEP since I posted it to python-ideas
a couple of weeks ago.

One really critical thing that will block PEP acceptance is
asynchronous generators (AG) finalization.  The problem is
to provide a reliable way to correctly close all AGs on program
shutdown.

To recap: PEP 492 requires an event loop or a coroutine runner
to run coroutines.  PEP 525 defines AGs using asynchronous
iteration protocol, also defined in PEP 492.  AGs require an
'async for' statement to iterate over them, which in turn can
only be used in a coroutine.  Therefore, AGs also require an
event loop or a coroutine runner to operate.

The finalization problem is related to partial iteration.
For instance, let's look at an ordinary synchronous generator:

  def gen():
      try:
          while True:
              yield 42
      finally:
          print("done")

  g = gen()
  next(g)
  next(g)
  del g

In the above example, when 'g' is GCed, the interpreter will
try to close the generator.  It will do that by throwing a
GeneratorExit exception into 'g', which would trigger the 'finally'
statement.

For AGs we have a similar problem.  Except that they can have
`await` expressions in their `finally` statements, which means
that the interpreter can't close them on its own.  An event
loop is required to run an AG, and an event loop is required to
close it correctly.

To enable correct AGs finalization, PEP 525 proposes to add a
`sys.set_asyncgen_finalizer` API.  The idea is to have a finalizer
callback assigned to each AG, and when it's time to close the AG,
that callback will be called.  The callback will be installed by
the event loop (or coroutine runner), and should schedule a
correct asynchronous finalization of the AG (remember, AGs can
have 'await' expressions in their finally statements).

The problem with 'set_asyncgen_finalizer' is that the event loop
doesn't know about AGs until they are GCed.  This can be a problem
if we want to write a program that gracefully closes all AGs
when the event loop is being closed.

There is an alternative approach to finalization of AGs: instead
of assigning a finalizer callback to an AG, we can add an API to
intercept AG first iteration.  That would allow event loops to
have weak references to all AGs running under their control:

1. that would make it possible to intercept AGs garbage collection
similarly to the currently proposed set_asyncgen_finalizer

2. it would also allow us to implement 'loop.shutdown' coroutine,
which would try to asynchronously close all open AGs.

The second approach gives event loops more control and allows to
implement APIs to collect open resources gracefully.  The only
downside is that it's a bit harder for event loops to work with.
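Concretely, the second approach could look something like the following sketch. AsyncGenTracker is a hypothetical standalone helper, not a proposed API; in asyncio the same logic would live on the event loop itself:

```python
import asyncio
import sys
import weakref

class AsyncGenTracker:
    """Sketch of approach 2: remember every AG on first iteration,
    so they can all be closed gracefully at shutdown."""

    def __init__(self):
        self._asyncgens = weakref.WeakSet()
        self._old_hooks = None

    def install(self):
        self._old_hooks = sys.get_asyncgen_hooks()
        # firstiter registers each AG; weak refs let GCed AGs drop out.
        sys.set_asyncgen_hooks(firstiter=self._asyncgens.add,
                               finalizer=self._finalize)

    def _finalize(self, agen):
        # A real event loop would schedule agen.aclose() on itself here.
        pass

    async def shutdown(self):
        # The 'loop.shutdown' idea: asynchronously close all open AGs.
        closing = [agen.aclose() for agen in list(self._asyncgens)]
        if closing:
            await asyncio.gather(*closing, return_exceptions=True)

    def uninstall(self):
        sys.set_asyncgen_hooks(*self._old_hooks)
```

Here shutdown() awaits every pending aclose(), so 'finally' blocks with 'await' expressions in them run to completion while the loop is still alive.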

Let's discuss.


PEP: 525
Title: Asynchronous Generators
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov 
Discussions-To: 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Jul-2016
Python-Version: 3.6
Post-History: 02-Aug-2016


Abstract
========

PEP 492 introduced support for native coroutines and ``async``/``await``
syntax to Python 3.5.  It is proposed here to extend Python's
asynchronous capabilities by adding support for
*asynchronous generators*.


Rationale and Goals
===================

Regular generators (introduced in PEP 255) enabled an elegant way of
writing complex *data producers* and have them behave like an iterator.

However, currently there is no equivalent concept for the *asynchronous
iteration protocol* (``async for``).  This makes writing asynchronous
data producers unnecessarily complex, as one must define a class that
implements ``__aiter__`` and ``__anext__`` to be able to use it in
an ``async for`` statement.

Essentially, the goals and rationale for PEP 255, applied to the
asynchronous execution case, hold true for this proposal as well.

Performance is an additional point for this proposal: in our testing of
the reference implementation, asynchronous generators are **2x** faster
than an equivalent implemented as an asynchronous iterator.

As an illustration of the code quality improvement, consider the
following class that prints numbers with a given delay once iterated::

    class Ticker:
        """Yield numbers from 0 to `to` every `delay` seconds."""

        def __init__(self, delay, to):
            self.delay = delay
            self.i = 0
            self.to = to

        def __aiter__(self):
            return self

        async def __anext__(self):
            i = self.i
            if i >= self.to:
                raise StopAsyncIteration
            self.i += 1
            if i:
                await asyncio.sleep(self.delay)
            return i


The same can be implemented as a much simpler asynchronous generator::

    async def ticker(delay, to):
        """Yield numbers from 0 to `to` every `delay` seconds."""
        for i in