Re: Question about asyncio and blocking operations
"Frank Millman" wrote in message news:n8et0d$hem$1...@ger.gmane.org... I have read the other messages, and I can see that there are some clever ideas there. However, having found something that seems to work and that I feel comfortable with, I plan to run with this for the time being. A quick update. Now that I am starting to understand this a bit better, I found it very easy to turn my concept into an Asynchronous Iterator. class AsyncCursor: def __init__(self, loop, sql): self.return_queue = asyncio.Queue() request_queue.put((self.return_queue, loop, sql)) async def __aiter__(self): return self async def __anext__(self): row = await self.return_queue.get() if row is not None: return row else: self.return_queue.task_done() raise StopAsyncIteration The caller can use it like this - sql = 'SELECT ...' cur = AsyncCursor(loop, sql) async for row in cur: print('got', row) Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On 28 Jan 2016 at 22:52, "Ian Kelly" wrote:
>
> On Thu, Jan 28, 2016 at 2:23 PM, Maxime S wrote:
> >
> > 2016-01-28 17:53 GMT+01:00 Ian Kelly :
> >>
> >> On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
> >>
> >> > The caller requests some data from the database like this.
> >> >
> >> >     return_queue = asyncio.Queue()
> >> >     sql = 'SELECT ...'
> >> >     request_queue.put((return_queue, sql))
> >>
> >> Note that since this is a queue.Queue, the put call has the potential
> >> to block your entire event loop.
> >
> > Actually, I don't think you need an asyncio.Queue.
> >
> > You could use a simple deque as a buffer, and call fetchmany() when it
> > is empty, like that (untested):
>
> True. The asyncio Queue is really just a wrapper around a deque with
> an interface designed for use with the producer-consumer pattern. If
> the producer isn't a coroutine then it may not be appropriate.
>
> This seems like a nice suggestion. Caution is advised if multiple
> cursor methods are executed concurrently, since they would be in
> different threads and the underlying cursor may not be thread-safe.

Indeed, the run_in_executor call should probably be protected by an
asyncio.Lock. But it is a pretty strange idea to call two fetch*() methods
concurrently anyway.
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:calwzidn6nft_o0cfhw1itwja81+mw3schuecadvcen3ix6z...@mail.gmail.com... As I commented in my previous message, asyncio.Queue is not thread-safe, so it's very important that the put calls here be done on the event loop thread using event_loop.call_soon_threadsafe. This could be the cause of the strange behavior you're seeing in getting the results. Using call_soon_threadsafe makes all the difference. The rows are now retrieved instantly. I have read the other messages, and I can see that there are some clever ideas there. However, having found something that seems to work and that I feel comfortable with, I plan to run with this for the time being. Thanks to all for the very stimulating discussion. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Jan 28, 2016 3:07 PM, "Maxime Steisel" wrote:
>
> But it is a pretty strange idea to call two fetch*() methods concurrently
> anyway.

If you want to process rows concurrently and aren't concerned with
processing them in order, it may be attractive to create multiple threads /
coroutines, pass the cursor to each, and let them each call fetchmany
independently.

I agree this is a bad idea unless you use a lock to isolate the calls, or
are certain that you'll never use a dbapi implementation with
threadsafety < 3. I pointed it out because the wrapper makes it less
obvious that multiple threads are involved; one could naively assume that
the separate calls are isolated by the event loop.
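[Editor's sketch] To make the lock idea concrete, here is a hedged illustration (not code from the thread) of serializing executor calls with an asyncio.Lock. FakeCursor is a stand-in for a real, possibly non-threadsafe DB-API cursor, and exists only to make the example runnable:

```python
import asyncio

class FakeCursor:
    """Stand-in for a DB-API cursor that may not be thread-safe."""
    def __init__(self, rows):
        self._rows = list(rows)

    def fetchmany(self, size):
        batch, self._rows = self._rows[:size], self._rows[size:]
        return batch

class LockedCursor:
    """Serialize run_in_executor calls so two coroutines never drive
    the underlying cursor from different threads at the same time."""
    def __init__(self, cur):
        self._cur = cur
        self._lock = asyncio.Lock()

    async def fetchmany(self, size=2):
        async with self._lock:  # one executor call at a time
            loop = asyncio.get_running_loop()
            return await loop.run_in_executor(None, self._cur.fetchmany, size)

async def main():
    cur = LockedCursor(FakeCursor(range(6)))
    # Two concurrent consumers; the lock isolates their fetches.
    a, b = await asyncio.gather(cur.fetchmany(3), cur.fetchmany(3))
    return sorted(a + b)

print(asyncio.run(main()))
```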
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 2:23 PM, Maxime S wrote:
>
> 2016-01-28 17:53 GMT+01:00 Ian Kelly :
>>
>> On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
>>
>> > The caller requests some data from the database like this.
>> >
>> >     return_queue = asyncio.Queue()
>> >     sql = 'SELECT ...'
>> >     request_queue.put((return_queue, sql))
>>
>> Note that since this is a queue.Queue, the put call has the potential
>> to block your entire event loop.
>
> Actually, I don't think you need an asyncio.Queue.
>
> You could use a simple deque as a buffer, and call fetchmany() when it is
> empty, like that (untested):

True. The asyncio Queue is really just a wrapper around a deque with an
interface designed for use with the producer-consumer pattern. If the
producer isn't a coroutine then it may not be appropriate.

This seems like a nice suggestion. Caution is advised if multiple cursor
methods are executed concurrently, since they would be in different threads
and the underlying cursor may not be thread-safe.
Re: Question about asyncio and blocking operations
2016-01-28 17:53 GMT+01:00 Ian Kelly :
>
> On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
>
> > The caller requests some data from the database like this.
> >
> >     return_queue = asyncio.Queue()
> >     sql = 'SELECT ...'
> >     request_queue.put((return_queue, sql))
>
> Note that since this is a queue.Queue, the put call has the potential
> to block your entire event loop.

Actually, I don't think you need an asyncio.Queue.

You could use a simple deque as a buffer, and call fetchmany() when it is
empty, like this (untested):

    class AsyncCursor:
        """Wraps a DB cursor and provides async methods for blocking
        operations"""
        def __init__(self, cur, loop=None):
            if loop is None:
                loop = asyncio.get_event_loop()
            self._loop = loop
            self._cur = cur
            self._queue = deque()

        def __getattr__(self, attr):
            return getattr(self._cur, attr)

        def __setattr__(self, attr, value):
            # internal attributes stay on the wrapper; the rest are
            # delegated to the cursor
            if attr.startswith('_'):
                super().__setattr__(attr, value)
            else:
                setattr(self._cur, attr, value)

        async def execute(self, operation, params):
            return await self._loop.run_in_executor(
                None, self._cur.execute, operation, params)

        async def fetchall(self):
            return await self._loop.run_in_executor(None, self._cur.fetchall)

        async def fetchone(self):
            return await self._loop.run_in_executor(None, self._cur.fetchone)

        async def fetchmany(self, size=None):
            return await self._loop.run_in_executor(
                None, self._cur.fetchmany, size)

        async def __aiter__(self):
            return self

        async def __anext__(self):
            if not self._queue:
                rows = await self.fetchmany()
                if not rows:
                    raise StopAsyncIteration()
                self._queue.extend(rows)
            return self._queue.popleft()
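[Editor's sketch] A condensed, runnable variant of this idea — my own code, not Maxime's — using sqlite3 as a stand-in database. Note that `check_same_thread=False` is needed because the executor fetches rows from a different thread than the one that created the connection:

```python
import asyncio
import sqlite3
from collections import deque

class AsyncCursor:
    """Buffer rows in a deque; refill with a blocking fetchmany()
    run in the default executor."""
    def __init__(self, cur):
        self._cur = cur
        self._buf = deque()

    async def execute(self, operation, params=()):
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, self._cur.execute, operation, params)

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._buf:
            loop = asyncio.get_running_loop()
            rows = await loop.run_in_executor(None, self._cur.fetchmany, 50)
            if not rows:
                raise StopAsyncIteration
            self._buf.extend(rows)
        return self._buf.popleft()

async def main():
    # check_same_thread=False: the executor thread also touches the cursor
    conn = sqlite3.connect(':memory:', check_same_thread=False)
    cur = AsyncCursor(conn.cursor())
    await cur.execute('CREATE TABLE t (x INTEGER)')
    for i in range(5):
        await cur.execute('INSERT INTO t VALUES (?)', (i,))
    await cur.execute('SELECT x FROM t ORDER BY x')
    return [row[0] async for row in cur]

print(asyncio.run(main()))
```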
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:CALwzidnGbz7kM=d7mkua2ta9-csfn9u0ohl0w-x5bbixpcw...@mail.gmail.com... On Jan 28, 2016 4:13 AM, "Frank Millman" wrote: > > I *think* I have this one covered. When the caller makes a request, it creates an instance of an asyncio.Queue, and includes it with the request. The db handler uses this queue to send the result back. > > Do you see any problem with this? That seems reasonable to me. I assume that when you send the result back you would be queuing up individual rows and not just sending a single object across, which could be more easily with just a single future. I have hit a snag. It feels like a bug in 'await q.get()', though I am sure it is just me misunderstanding how it works. I can post some working code if necessary, but here is a short description. Here is the database handler - 'request_queue' is a queue.Queue - while not request_queue.empty(): return_queue, sql = request_queue.get() cur.execute(sql) for row in cur: return_queue.put_nowait(row) return_queue.put_nowait(None) request_queue.task_done() The caller requests some data from the database like this. return_queue = asyncio.Queue() sql = 'SELECT ...' request_queue.put((return_queue, sql)) while True: row = await return_queue.get() if row is None: break print('got', row) return_queue.task_done() The first time 'await return_queue.get()' is called, return_queue is empty, as the db handler has not had time to do anything yet. It is supposed to pause there, wait for something to appear in the queue, and then process it. I have confirmed that the db handler populates the queue virtually instantly. What seems to happen is that it pauses there, but then waits for some other event in the event loop to occur before it continues. Then it processes all rows very quickly. I am running a 'counter' task in the background that prints a sequential number, using await asyncio.sleep(1). 
I noticed a short but variable delay before the rows were printed, and thought it might be waiting for the counter. I increased the sleep to 10, and sure enough it now waits up to 10 seconds before printing any rows. Any ideas? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Jan 28, 2016 4:13 AM, "Frank Millman" wrote:
>
> "Chris Angelico" wrote in message
> news:captjjmr162+k4lzefpxrur6wxrhxbr-_wkrclldyr7kst+k...@mail.gmail.com...
> >
> > On Thu, Jan 28, 2016 at 8:13 PM, Frank Millman wrote:
> > > Run the database handler in a separate thread. Use a queue.Queue to
> > > send requests to the handler. Use an asyncio.Queue to send results
> > > back to the caller, which can call 'await q.get()'.
> > >
> > > I ran a quick test and it seems to work. What do you think?
> >
> > My gut feeling is that any queue can block at either get or put
>
> H'mm, I will have to think about that one, and figure out how to create a
> worst-case scenario. I will report back on that.

The get and put methods of asyncio queues are coroutines, so I don't think
this would be a real issue. The coroutine might block, but it won't block
the event loop. If the queue fills up, then effectively the waiting
coroutines just become a (possibly unordered) extension of the queue.

> > The other risk is that the wrong result will be queried (two async
> > tasks put something onto the queue - which one gets the first
> > result?), which could either be coped with by simple sequencing (maybe
> > this happens automatically, although I'd prefer a
> > mathematically-provable result to "it seems to work"), or by wrapping
> > the whole thing up in a function/class.
>
> I *think* I have this one covered. When the caller makes a request, it
> creates an instance of an asyncio.Queue, and includes it with the request.
> The db handler uses this queue to send the result back.
>
> Do you see any problem with this?

That seems reasonable to me. I assume that when you send the result back
you would be queuing up individual rows and not just sending a single
object across, which could be done more easily with just a single future.

The main risk of adding limited threads to an asyncio program is that
threads make it harder to reason about concurrency. Just make sure the
threads don't share any state and you should be good.

Note that I can only see queues being used to move data in this direction,
not in the opposite. It's unclear to me how queue.get would work from the
blocking thread. Asyncio queues aren't threadsafe, but you couldn't just
use call_soon_threadsafe since the result is important. You might want to
use a queue.Queue instead in that case, but then you run back into the
problem of queue.put being a blocking operation.
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
> I have hit a snag. It feels like a bug in 'await q.get()', though I am
> sure it is just me misunderstanding how it works.
>
> I can post some working code if necessary, but here is a short
> description.
>
> Here is the database handler - 'request_queue' is a queue.Queue -
>
>     while not request_queue.empty():
>         return_queue, sql = request_queue.get()
>         cur.execute(sql)
>         for row in cur:
>             return_queue.put_nowait(row)
>         return_queue.put_nowait(None)
>         request_queue.task_done()

As I commented in my previous message, asyncio.Queue is not thread-safe, so
it's very important that the put calls here be done on the event loop
thread using event_loop.call_soon_threadsafe. This could be the cause of
the strange behavior you're seeing in getting the results.

> The caller requests some data from the database like this.
>
>     return_queue = asyncio.Queue()
>     sql = 'SELECT ...'
>     request_queue.put((return_queue, sql))

Note that since this is a queue.Queue, the put call has the potential to
block your entire event loop.
Re: Question about asyncio and blocking operations
"Chris Angelico" wrote in message news:captjjmr162+k4lzefpxrur6wxrhxbr-_wkrclldyr7kst+k...@mail.gmail.com... On Thu, Jan 28, 2016 at 8:13 PM, Frank Millman wrote: > Run the database handler in a separate thread. Use a queue.Queue to send > requests to the handler. Use an asyncio.Queue to send results back to > the > caller, which can call 'await q.get()'. > > I ran a quick test and it seems to work. What do you think? My gut feeling is that any queue can block at either get or put H'mm, I will have to think about that one, and figure out how to create a worst-case scenario. I will report back on that. The other risk is that the wrong result will be queried (two async tasks put something onto the queue - which one gets the first result?), which could either be coped with by simple sequencing (maybe this happens automatically, although I'd prefer a mathematically-provable result to "it seems to work"), or by wrapping the whole thing up in a function/class. I *think* I have this one covered. When the caller makes a request, it creates an instance of an asyncio.Queue, and includes it with the request. The db handler uses this queue to send the result back. Do you see any problem with this? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 8:13 PM, Frank Millman wrote:
> Run the database handler in a separate thread. Use a queue.Queue to send
> requests to the handler. Use an asyncio.Queue to send results back to the
> caller, which can call 'await q.get()'.
>
> I ran a quick test and it seems to work. What do you think?

My gut feeling is that any queue can block at either get or put. Half of
your operations are "correct", and the other half are "wrong". The caller
can "await q.get()", which is the "correct" way to handle the blocking
operation; the database thread can "jobqueue.get()" as a blocking
operation, which is also fine. But your queue-putting operations are going
to have to assume that they never block. Maybe you can have the database
put its results onto the asyncio.Queue safely, but the requests going onto
the queue.Queue could block waiting for the database. Specifically, this
will happen if database queries come in faster than the database can handle
them - quite literally, your jobs will be "blocked on the database". What
should happen then?

The other risk is that the wrong result will be queried (two async tasks
put something onto the queue - which one gets the first result?), which
could either be coped with by simple sequencing (maybe this happens
automatically, although I'd prefer a mathematically-provable result to "it
seems to work"), or by wrapping the whole thing up in a function/class.

Both of these are risks seen purely by looking at the idea description, not
at any sort of code. It's entirely possible I'm mistaken about them. But
there's only one way to find out :)

By the way, just out of interest... there's no way you can actually switch
out the database communication for something purely socket-based, is there?
PostgreSQL's protocol, for instance, is fairly straightforward, and you
don't *have* to use libpq; Pike's inbuilt pgsql module just opens a socket
and says hello, and it looks like py-postgresql [1] is the same thing for
Python. Taking something like that and making it asynchronous would be as
straightforward as converting any other socket-based code. Could be an
alternative to all this weirdness.

ChrisA

[1] https://pypi.python.org/pypi/py-postgresql
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 10:11 PM, Frank Millman wrote:
>> The other risk is that the wrong result will be queried (two async
>> tasks put something onto the queue - which one gets the first
>> result?), which could either be coped with by simple sequencing (maybe
>> this happens automatically, although I'd prefer a
>> mathematically-provable result to "it seems to work"), or by wrapping
>> the whole thing up in a function/class.
>
> I *think* I have this one covered. When the caller makes a request, it
> creates an instance of an asyncio.Queue, and includes it with the request.
> The db handler uses this queue to send the result back.
>
> Do you see any problem with this?

Oh, I get it. In that case, you should be safe (at the cost of efficiency,
presumably, but probably immeasurably so).

The easiest way to thrash-test this is to simulate an ever-increasing
stream of requests that eventually get bottlenecked by the database. For
instance, have the database "process" a maximum of 10 requests a second,
and then start feeding it 11 requests a second. Monitoring the queue length
should tell you what's happening. Eventually, the queue will hit its limit
(and you can make that happen sooner by creating the queue with a small
maxsize), at which point the queue.put() will block. My suspicion is that
that's going to lock your entire backend, unlocking only when the database
takes something off its queue; you'll end up throttling at the database's
rate (which is fine), but with something perpetually blocked waiting for
the database (which is bad - other jobs, like socket read/write, won't be
happening).

As an alternative, you could use put_nowait or put(item, block=False) to
put things on the queue. If there's no room, it'll fail immediately, which
you can spit back to the user as "Database overloaded, please try again
later". Tuning the size of the queue would then be an important
consideration for real-world work; you'd want it long enough to cope with
bursty traffic, but short enough that people's requests don't time out
while they're blocked on the database (it's much tidier to fail them
instantly if that's going to happen).

ChrisA
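[Editor's sketch] Chris's fail-fast suggestion can be sketched like this — the names and the error message are illustrative, not code from the thread. A bounded queue.Queue plus put_nowait turns "silently stall the event loop" into an immediate, reportable error:

```python
import queue

request_queue = queue.Queue(maxsize=2)  # small bound for demonstration

def submit(job):
    """Enqueue a database request without ever blocking the event loop.

    Raises RuntimeError immediately when the handler is saturated, so the
    caller can report 'overloaded' instead of stalling.
    """
    try:
        request_queue.put_nowait(job)
    except queue.Full:
        raise RuntimeError('Database overloaded, please try again later')

submit('SELECT 1')
submit('SELECT 2')
try:
    submit('SELECT 3')  # queue is full: fails fast instead of blocking
except RuntimeError as exc:
    print(exc)
```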
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:CALwzidkr-fT6S6wH2caNaxyQvUdAw=x7xdqkqofnrrwzwnj...@mail.gmail.com... On Wed, Jan 27, 2016 at 10:14 AM, Ian Kelly wrote: > Unfortunately this doesn't actually work at present. > EventLoop.run_in_executor swallows the StopIteration exception and > just returns None, which I assume is a bug. http://bugs.python.org/issue26221 Thanks for that. Fascinating discussion between you and GvR. Reading it gave me an idea. Run the database handler in a separate thread. Use a queue.Queue to send requests to the handler. Use an asyncio.Queue to send results back to the caller, which can call 'await q.get()'. I ran a quick test and it seems to work. What do you think? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:CALwzidk-RBkB-vi6CgcEeoFHQrsoTFvqX9MqzDD=rny5boc...@mail.gmail.com... On Tue, Jan 26, 2016 at 7:15 AM, Frank Millman wrote: > > If I return the cursor, I can iterate over it, but isn't this a blocking > operation? As far as I know, the DB adaptor will only actually retrieve > the > row when requested. > > If I am right, I should call fetchall() while inside get_rows(), and > return > all the rows as a list. > You probably want an asynchronous iterator here. If the cursor doesn't provide that, then you can wrap it in one. In fact, this is basically one of the examples in the PEP: https://www.python.org/dev/peps/pep-0492/#example-1 Thanks, Ian. I had a look, and it does seem to fit the bill, but I could not get it to work, and I am running out of time. Specifically, I tried to get it working with the sqlite3 cursor. I am no expert, but after some googling I tried this - import sqlite3 conn = sqlite3.connect('/sqlite_db') cur = conn.cursor() async def __aiter__(self): return self async def __anext__(self): loop = asyncio.get_event_loop() return await loop.run_in_executor(None, self.__next__) import types cur.__aiter__ = types.MethodType( __aiter__, cur ) cur.__anext__ = types.MethodType( __anext__, cur ) It failed with this exception - AttributeError: 'sqlite3.Cursor' object has no attribute '__aiter__' I think this is what happens if a class uses 'slots' to define its attributes - it will not permit the creation of a new one. Anyway, moving on, I decided to change tack. Up to now I have been trying to isolate the function where I actually communicate with the database, and wrap that in a Future with 'run_in_executor'. In practice, the vast majority of my interactions with the database consist of very small CRUD commands, and will have minimal impact on response times even if they block. So I decided to focus on a couple of functions which are larger, and try to wrap the entire function in a Future with 'run_in_executor'. 
It seems to be working, but it looks a bit odd, so I will show what I am doing and ask for feedback. Assume a slow function - async def slow_function(arg1, arg2): [do stuff] It now looks like this - async def slow_function(arg1, arg2): loop = asyncio.get_event_loop() await loop.run_in_executor(None, slow_function_1, arg1, arg2) def slow_function_1(self, arg1, arg2): loop = asyncio.new_event_loop() asyncio.set_event_loop(loop) loop.run_until_complete(slow_function_2(arg1, arg2)) async slow_function_2(arg1, arg2): [do stuff] Does this look right? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:calwzidn6tvn9w-2qnn2jyvju8nhzn499nptfjn9ohjddceb...@mail.gmail.com... On Wed, Jan 27, 2016 at 7:40 AM, Frank Millman wrote: > > Assume a slow function - > > async def slow_function(arg1, arg2): >[do stuff] > > It now looks like this - > > async def slow_function(arg1, arg2): >loop = asyncio.get_event_loop() >await loop.run_in_executor(None, slow_function_1, arg1, arg2) > > def slow_function_1(self, arg1, arg2): >loop = asyncio.new_event_loop() >asyncio.set_event_loop(loop) >loop.run_until_complete(slow_function_2(arg1, arg2)) > > async slow_function_2(arg1, arg2): >[do stuff] > > Does this look right? I'm not sure I understand what you're trying to accomplish by running a second event loop inside the executor thread. It will only be useful for scheduling asynchronous operations, and if they're asynchronous then why not schedule them on the original event loop? I could be confusing myself here, but this is what I am trying to do. run_in_executor() schedules a blocking function to run in the executor, and returns a Future. If you just invoke it, the blocking function will execute in the background, and the calling function will carry on. If you obtain a reference to the Future, and then 'await' it, the calling function will be suspended until the blocking function is complete. You might do this because you want the calling function to block, but you do not want to block the entire event loop. In the above example, I do not want the calling function to block. However, the blocking function invokes one or more coroutines, so it needs an event loop to operate. Creating a new event loop allows them to run independently. Hope this makes sense. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Wed, Jan 27, 2016 at 10:14 AM, Ian Kelly wrote:
> Unfortunately this doesn't actually work at present.
> EventLoop.run_in_executor swallows the StopIteration exception and
> just returns None, which I assume is a bug.

http://bugs.python.org/issue26221
Re: Question about asyncio and blocking operations
On Wed, Jan 27, 2016 at 9:15 AM, Ian Kelly wrote:
> class CursorWrapper:
>
>     def __init__(self, cursor):
>         self._cursor = cursor
>
>     async def __aiter__(self):
>         return self
>
>     async def __anext__(self):
>         loop = asyncio.get_event_loop()
>         return await loop.run_in_executor(None, next, self._cursor)

Oh, except you'd want to be sure to catch StopIteration and raise
StopAsyncIteration in its place.

This could also be generalized as an iterator wrapper, similar to the
example in the PEP except using run_in_executor to actually avoid blocking.

    class AsyncIteratorWrapper:

        def __init__(self, iterable, loop=None, executor=None):
            self._iterator = iter(iterable)
            self._loop = loop or asyncio.get_event_loop()
            self._executor = executor

        async def __aiter__(self):
            return self

        async def __anext__(self):
            try:
                return await self._loop.run_in_executor(
                    self._executor, next, self._iterator)
            except StopIteration:
                raise StopAsyncIteration

Unfortunately this doesn't actually work at present.
EventLoop.run_in_executor swallows the StopIteration exception and just
returns None, which I assume is a bug.
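[Editor's sketch] One workaround for the bug Ian describes — my own suggestion, not from the thread — is to keep StopIteration from ever crossing the executor boundary at all, by using the two-argument form of next() with a sentinel value:

```python
import asyncio

_SENTINEL = object()

class AsyncIteratorWrapper:
    """Like the wrapper above, but next(it, sentinel) returns a marker
    instead of raising StopIteration inside the executor thread."""
    def __init__(self, iterable, executor=None):
        self._iterator = iter(iterable)
        self._executor = executor

    def __aiter__(self):
        return self

    async def __anext__(self):
        loop = asyncio.get_running_loop()
        value = await loop.run_in_executor(
            self._executor, next, self._iterator, _SENTINEL)
        if value is _SENTINEL:
            raise StopAsyncIteration
        return value

async def main():
    return [x async for x in AsyncIteratorWrapper(range(4))]

print(asyncio.run(main()))
```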
Re: Question about asyncio and blocking operations
On Wed, Jan 27, 2016 at 7:40 AM, Frank Millman wrote:
> "Ian Kelly" wrote in message
> news:CALwzidk-RBkB-vi6CgcEeoFHQrsoTFvqX9MqzDD=rny5boc...@mail.gmail.com...
>
> > You probably want an asynchronous iterator here. If the cursor doesn't
> > provide that, then you can wrap it in one. In fact, this is basically
> > one of the examples in the PEP:
> > https://www.python.org/dev/peps/pep-0492/#example-1
>
> Thanks, Ian. I had a look, and it does seem to fit the bill, but I could
> not get it to work, and I am running out of time.
>
> Specifically, I tried to get it working with the sqlite3 cursor. I am no
> expert, but after some googling I tried this -
>
>     import sqlite3
>     conn = sqlite3.connect('/sqlite_db')
>     cur = conn.cursor()
>
>     async def __aiter__(self):
>         return self
>
>     async def __anext__(self):
>         loop = asyncio.get_event_loop()
>         return await loop.run_in_executor(None, self.__next__)
>
>     import types
>     cur.__aiter__ = types.MethodType(__aiter__, cur)
>     cur.__anext__ = types.MethodType(__anext__, cur)
>
> It failed with this exception -
>
>     AttributeError: 'sqlite3.Cursor' object has no attribute '__aiter__'
>
> I think this is what happens if a class uses 'slots' to define its
> attributes - it will not permit the creation of a new one.

This is why I suggested wrapping the cursor instead. Something like this:

    class CursorWrapper:

        def __init__(self, cursor):
            self._cursor = cursor

        async def __aiter__(self):
            return self

        async def __anext__(self):
            loop = asyncio.get_event_loop()
            return await loop.run_in_executor(None, next, self._cursor)

> Anyway, moving on, I decided to change tack. Up to now I have been trying
> to isolate the function where I actually communicate with the database,
> and wrap that in a Future with 'run_in_executor'.
>
> In practice, the vast majority of my interactions with the database
> consist of very small CRUD commands, and will have minimal impact on
> response times even if they block. So I decided to focus on a couple of
> functions which are larger, and try to wrap the entire function in a
> Future with 'run_in_executor'.
>
> It seems to be working, but it looks a bit odd, so I will show what I am
> doing and ask for feedback.
>
> Assume a slow function -
>
>     async def slow_function(arg1, arg2):
>         [do stuff]
>
> It now looks like this -
>
>     async def slow_function(arg1, arg2):
>         loop = asyncio.get_event_loop()
>         await loop.run_in_executor(None, slow_function_1, arg1, arg2)
>
>     def slow_function_1(arg1, arg2):
>         loop = asyncio.new_event_loop()
>         asyncio.set_event_loop(loop)
>         loop.run_until_complete(slow_function_2(arg1, arg2))
>
>     async def slow_function_2(arg1, arg2):
>         [do stuff]
>
> Does this look right?

I'm not sure I understand what you're trying to accomplish by running a
second event loop inside the executor thread. It will only be useful for
scheduling asynchronous operations, and if they're asynchronous then why
not schedule them on the original event loop?
Re: Question about asyncio and blocking operations
> "Alberto" == Alberto Berti writes: Alberto> async external_coro(): # this is the calling context, which is a coro Alberto> async with transction.begin(): Alberto> o = MyObject Alberto> # maybe other stuff ops... here it is "o = MyObject()" ;-) -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
> "Frank" == Frank Millman writes: Frank> Now I have another problem. I have some classes which retrieve some Frank> data from the database during their __init__() method. I find that it Frank> is not allowed to call a coroutine from __init__(), and it is not Frank> allowed to turn __init__() into a coroutine. IMHO this is semantically correct for a method tha should really initialize that instance an await in the __init__ means having a suspension point that makes the initialization somewhat... unpredictable :-). To cover the cases when you need to call a coroutine from a non coroutine function like __init__ I have developed a small package that helps maintaining your code almost clean, where you can be sure that after some point in your code flow, the coroutines scheduled by the normal function have been executed. With that you can write code like this: from metapensiero.asyncio import transaction class MyObject(): def __init__(self): tran = transaction.get() tran.add(get_db_object('company'), cback=self._init) # get_db_object is a coroutine def _init(self, fut): self.company = fut.result() async external_coro(): # this is the calling context, which is a coro async with transction.begin(): o = MyObject # maybe other stuff # start using your db object o.company... This way the management of the "inner" coroutine is simpler, and from your code it's clear it suspends to wait and after that all the "stashed" coroutines are guaranteed to be executed. Hope it helps, Alberto -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Tue, Jan 26, 2016 at 7:15 AM, Frank Millman wrote:
> I am making some progress, but I have found a snag - possibly unavoidable,
> but worth a mention.
>
> Usually when I retrieve rows from a database I iterate over the cursor -
>
>     def get_rows(sql, params):
>         cur.execute(sql, params)
>         for row in cur:
>             yield row
>
> If I create a Future to run get_rows(), I have to 'return' the result so
> that the caller can access it by calling future.result().
>
> If I return the cursor, I can iterate over it, but isn't this a blocking
> operation? As far as I know, the DB adaptor will only actually retrieve the
> row when requested.
>
> If I am right, I should call fetchall() while inside get_rows(), and return
> all the rows as a list.
>
> This seems to be swapping one bit of asynchronicity for another.
>
> Does this sound right?

You probably want an asynchronous iterator here. If the cursor doesn't provide that, then you can wrap it in one. In fact, this is basically one of the examples in the PEP:

https://www.python.org/dev/peps/pep-0492/#example-1

--
https://mail.python.org/mailman/listinfo/python-list
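[Editor's note: one possible illustration of that suggestion. `AsyncRows` and the sqlite3 demo data are invented for the example; the wrapper turns a blocking DB-API cursor into an asynchronous iterator by pushing each `fetchone()` call onto the executor.]

```python
import asyncio
import sqlite3

class AsyncRows:
    """Async iterator over a blocking DB-API cursor (illustrative only)."""

    def __init__(self, loop, cur):
        self.loop = loop
        self.cur = cur

    def __aiter__(self):
        return self

    async def __anext__(self):
        # fetchone() blocks, so run it in the default executor
        row = await self.loop.run_in_executor(None, self.cur.fetchone)
        if row is None:
            raise StopAsyncIteration
        return row

async def main(loop):
    # check_same_thread=False because fetchone() runs in an executor thread
    conn = sqlite3.connect(':memory:', check_same_thread=False)
    cur = conn.cursor()
    cur.execute('CREATE TABLE t (x INTEGER)')
    cur.executemany('INSERT INTO t VALUES (?)', [(1,), (2,), (3,)])
    cur.execute('SELECT x FROM t ORDER BY x')
    total = 0
    async for (x,) in AsyncRows(loop, cur):
        total += x
    return total

loop = asyncio.new_event_loop()
total = loop.run_until_complete(main(loop))
loop.close()
print(total)
```

Note that one row is fetched per trip through the executor; for large result sets, fetching in batches with `fetchmany()` (as suggested elsewhere in this thread) would cut the overhead considerably.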
Re: Question about asyncio and blocking operations
"Frank Millman" wrote in message news:n8038j$575$1...@ger.gmane.org...

I am developing a typical accounting/business application which involves a front-end allowing clients to access the system, a back-end connecting to a database, and a middle layer that glues it all together.

[...]

There was one aspect that I deliberately ignored at that stage. I did not change the database access to an asyncio approach, so all reading from/writing to the database involved a blocking operation. I am now ready to tackle that.

I am making some progress, but I have found a snag - possibly unavoidable, but worth a mention.

Usually when I retrieve rows from a database I iterate over the cursor -

    def get_rows(sql, params):
        cur.execute(sql, params)
        for row in cur:
            yield row

If I create a Future to run get_rows(), I have to 'return' the result so that the caller can access it by calling future.result().

If I return the cursor, I can iterate over it, but isn't this a blocking operation? As far as I know, the DB adaptor will only actually retrieve the row when requested.

If I am right, I should call fetchall() while inside get_rows(), and return all the rows as a list.

This seems to be swapping one bit of asynchronicity for another.

Does this sound right?

Frank

--
https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
Marko Rauhamaa writes: > Note that neither the multithreading model (which I dislike) nor the > callback hell (which I like) suffer from this problem. There are some runtimes (GHC and Erlang) where everything is nonblocking under the covers, which lets even the asyncs be swept under the rug. Similarly with some low-tech cooperative multitaskers, say in Forth. When you've got a mixture of blocking and nonblocking, it becomes a mess. -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
Rustom Mody :
> Bah -- What a bloody mess!
> And thanks for pointing this out, Ian.
> Keep wondering whether my brain is atrophying, or its rocket science or...

I'm afraid the asyncio idea will not fly. Adding the keywords "async" and "await" did make things much better, but the programming model seems very cumbersome.

Say you have an async that calls a nonblocking function as follows:

    async def t():
        ...
        f()
        ...

    def f():
        ...
        g()
        ...

    def g():
        ...
        h()
        ...

    def h():
        ...

Then, you need to add a blocking call to h(). You then have a cascading effect of having to sprinkle asyncs and awaits everywhere:

    async def t():
        ...
        await f()
        ...

    async def f():
        ...
        await g()
        ...

    async def g():
        ...
        await h()
        ...

    async def h():
        ...
        await ...
        ...

A nasty case of nonlocality. Makes you wonder if you ought to declare *all* functions *always* as asyncs just in case they turn out that way.

Note that neither the multithreading model (which I dislike) nor the callback hell (which I like) suffer from this problem.

Marko

--
https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Monday, January 25, 2016 at 9:16:13 PM UTC+5:30, Ian wrote: > On Mon, Jan 25, 2016 at 8:32 AM, Ian Kelly wrote: > > > > On Jan 25, 2016 2:04 AM, "Frank Millman" wrote: > >> > >> "Ian Kelly" wrote in message > >>> > >>> This seems to be a common misapprehension about asyncio programming. > >>> While coroutines are the focus of the library, they're based on > >>> futures, and so by working at a slightly lower level you can also > >>> handle them as such. So while this would be the typical way to use > >>> run_in_executor: > >>> > >>> async def my_coroutine(stuff): > >>> value = await get_event_loop().run_in_executor(None, > >>> blocking_function, stuff) > >>> result = await do_something_else_with(value) > >>> return result > >>> > >>> This is also a perfectly valid way to use it: > >>> > >>> def normal_function(stuff): > >>> loop = get_event_loop() > >>> coro = loop.run_in_executor(None, blocking_function, stuff) > >>> task = loop.create_task(coro) > >>> task.add_done_callback(do_something_else) > >>> return task > >> > >> > >> I am struggling to get my head around this. > >> > >> 1. In the second function, AFAICT coro is already a future. Why is it > >> necessary to turn it into a task? In fact when I tried that in my testing, > >> I > >> got an assertion error - > >> > >> File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task > >>task = tasks.Task(coro, loop=self) > >> File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__ > >>assert coroutines.iscoroutine(coro), repr(coro) > >> AssertionError: > > > > I didn't test this; it was based on the documentation, which says that > > run_in_executor is a coroutine. Looking at the source, it's actually a > > function that returns a future, so this may be a documentation bug. > > And now I'm reminded of this note in the asyncio docs: > > """ > Note: In this documentation, some methods are documented as > coroutines, even if they are plain Python functions returning a > Future. 
This is intentional to have a freedom of tweaking the > implementation of these functions in the future. If such a function is > needed to be used in a callback-style code, wrap its result with > ensure_future(). > """ > > IMO such methods should simply be documented as awaitables, not > coroutines. I wonder if that's already settled, or if it's worth > starting a discussion around. Bah -- What a bloody mess! And thanks for pointing this out, Ian. Keep wondering whether my brain is atrophying, or its rocket science or... -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Mon, Jan 25, 2016 at 8:32 AM, Ian Kelly wrote: > > On Jan 25, 2016 2:04 AM, "Frank Millman" wrote: >> >> "Ian Kelly" wrote in message >> news:calwzidngogpx+cpmvba8vpefuq4-bwmvs0gz3shb0owzi0b...@mail.gmail.com... >>> >>> This seems to be a common misapprehension about asyncio programming. >>> While coroutines are the focus of the library, they're based on >>> futures, and so by working at a slightly lower level you can also >>> handle them as such. So while this would be the typical way to use >>> run_in_executor: >>> >>> async def my_coroutine(stuff): >>> value = await get_event_loop().run_in_executor(None, >>> blocking_function, stuff) >>> result = await do_something_else_with(value) >>> return result >>> >>> This is also a perfectly valid way to use it: >>> >>> def normal_function(stuff): >>> loop = get_event_loop() >>> coro = loop.run_in_executor(None, blocking_function, stuff) >>> task = loop.create_task(coro) >>> task.add_done_callback(do_something_else) >>> return task >> >> >> I am struggling to get my head around this. >> >> 1. In the second function, AFAICT coro is already a future. Why is it >> necessary to turn it into a task? In fact when I tried that in my testing, I >> got an assertion error - >> >> File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task >>task = tasks.Task(coro, loop=self) >> File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__ >>assert coroutines.iscoroutine(coro), repr(coro) >> AssertionError: > > I didn't test this; it was based on the documentation, which says that > run_in_executor is a coroutine. Looking at the source, it's actually a > function that returns a future, so this may be a documentation bug. And now I'm reminded of this note in the asyncio docs: """ Note: In this documentation, some methods are documented as coroutines, even if they are plain Python functions returning a Future. This is intentional to have a freedom of tweaking the implementation of these functions in the future. 
If such a function is needed to be used in a callback-style code, wrap its result with ensure_future(). """ IMO such methods should simply be documented as awaitables, not coroutines. I wonder if that's already settled, or if it's worth starting a discussion around. -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Jan 25, 2016 2:04 AM, "Frank Millman" wrote:
>
> "Ian Kelly" wrote in message
> news:calwzidngogpx+cpmvba8vpefuq4-bwmvs0gz3shb0owzi0b...@mail.gmail.com...
>>
>> This seems to be a common misapprehension about asyncio programming.
>> While coroutines are the focus of the library, they're based on
>> futures, and so by working at a slightly lower level you can also
>> handle them as such. So while this would be the typical way to use
>> run_in_executor:
>>
>>     async def my_coroutine(stuff):
>>         value = await get_event_loop().run_in_executor(None,
>>             blocking_function, stuff)
>>         result = await do_something_else_with(value)
>>         return result
>>
>> This is also a perfectly valid way to use it:
>>
>>     def normal_function(stuff):
>>         loop = get_event_loop()
>>         coro = loop.run_in_executor(None, blocking_function, stuff)
>>         task = loop.create_task(coro)
>>         task.add_done_callback(do_something_else)
>>         return task
>
> I am struggling to get my head around this.
>
> 1. In the second function, AFAICT coro is already a future. Why is it
> necessary to turn it into a task? In fact when I tried that in my testing,
> I got an assertion error -
>
>     File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task
>         task = tasks.Task(coro, loop=self)
>     File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__
>         assert coroutines.iscoroutine(coro), repr(coro)
>     AssertionError:

I didn't test this; it was based on the documentation, which says that run_in_executor is a coroutine. Looking at the source, it's actually a function that returns a future, so this may be a documentation bug.

There's no need to get a task specifically. We just need a future so that callbacks can be added, so if the result of run_in_executor is already a future then the create_task call is unnecessary. To be safe, you could replace that call with asyncio.ensure_future, which accepts any awaitable and returns a future.

> 2. In the first function, calling 'run_in_executor' unblocks the main loop
> so that it can continue with other tasks, but the function itself is
> suspended until the blocking function returns. In the second function, I
> cannot see how the function gets suspended. It looks as if the blocking
> function will run in the background, and the main function will continue.

Correct. It's not a coroutine, so it has no facility for being suspended and resumed; it can only block or return. That's why the callback is necessary to schedule additional code to run after blocking_function finishes. normal_function itself can continue to make other non-blocking calls such as scheduling additional tasks, but it shouldn't do anything that depends on the result of blocking_function, since it can't be assumed to be available yet.

> I would like to experiment with this further, but I would need to see the
> broader context - IOW see the 'caller' of normal_function(), and see what
> it does with the return value.

The caller of normal_function can do anything it wants with the return value, including adding additional callbacks or just discarding it. The caller could be a coroutine or another normal non-blocking function. If it's a coroutine, then it can await the future, but it doesn't need to unless it wants to do something with the result. Depending on what the future represents, it might also be considered internal to normal_function, in which case it shouldn't be returned at all.

--
https://mail.python.org/mailman/listinfo/python-list
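[Editor's note: a runnable sketch of the callback style discussed above. `blocking_function` and `do_something_else` are stand-ins, and `asyncio.ensure_future` is used in place of `create_task`, since run_in_executor already returns a future.]

```python
import asyncio
import time

def blocking_function(stuff):
    time.sleep(0.1)        # stand-in for the blocking work
    return stuff * 2

results = []

def do_something_else(future):
    # runs on the event loop once blocking_function has finished
    results.append(future.result())

def normal_function(loop, stuff):
    # run_in_executor returns a future; ensure_future also accepts
    # coroutines, so it is a safe general-purpose wrapper
    fut = asyncio.ensure_future(
        loop.run_in_executor(None, blocking_function, stuff))
    fut.add_done_callback(do_something_else)
    return fut

loop = asyncio.new_event_loop()
fut = normal_function(loop, 21)
loop.run_until_complete(fut)
loop.run_until_complete(asyncio.sleep(0))  # let the done-callback run
loop.close()
print(results)
```

`normal_function` itself never suspends: it schedules the work, attaches the callback, and returns immediately; only the event loop later invokes `do_something_else`.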
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:calwzidngogpx+cpmvba8vpefuq4-bwmvs0gz3shb0owzi0b...@mail.gmail.com...

On Sat, Jan 23, 2016 at 7:38 AM, Frank Millman wrote:
> Here is the difficulty. The recommended way to handle a blocking operation
> is to run it as a task in a different thread, using run_in_executor(). This
> method is a coroutine. An implication of this is that any method that calls
> it must also be a coroutine, so I end up with a chain of coroutines
> stretching all the way back to the initial event that triggered it.

This seems to be a common misapprehension about asyncio programming. While coroutines are the focus of the library, they're based on futures, and so by working at a slightly lower level you can also handle them as such. So while this would be the typical way to use run_in_executor:

    async def my_coroutine(stuff):
        value = await get_event_loop().run_in_executor(None,
            blocking_function, stuff)
        result = await do_something_else_with(value)
        return result

This is also a perfectly valid way to use it:

    def normal_function(stuff):
        loop = get_event_loop()
        coro = loop.run_in_executor(None, blocking_function, stuff)
        task = loop.create_task(coro)
        task.add_done_callback(do_something_else)
        return task

I am struggling to get my head around this.

1. In the second function, AFAICT coro is already a future. Why is it necessary to turn it into a task? In fact when I tried that in my testing, I got an assertion error -

    File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task
        task = tasks.Task(coro, loop=self)
    File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__
        assert coroutines.iscoroutine(coro), repr(coro)
    AssertionError:

2. In the first function, calling 'run_in_executor' unblocks the main loop so that it can continue with other tasks, but the function itself is suspended until the blocking function returns. In the second function, I cannot see how the function gets suspended. It looks as if the blocking function will run in the background, and the main function will continue.

I would like to experiment with this further, but I would need to see the broader context - IOW see the 'caller' of normal_function(), and see what it does with the return value.

I feel I am getting closer to an 'aha' moment, but I am not there yet, so all info is appreciated.

Frank

--
https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
"Frank Millman" wrote in message news:n8038j$575$1...@ger.gmane.org... So I thought I would ask here if anyone has been through a similar exercise, and if what I am going through sounds normal, or if I am doing something fundamentally wrong. Thanks for any input Just a quick note of thanks to ChrisA and Ian. Very interesting responses and plenty to think about. I will have to sleep on it and come back with renewed vigour in the morning. I may well be back with more questions :-) Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Sat, Jan 23, 2016 at 8:44 AM, Ian Kelly wrote: > This is where it would make sense to me to use callbacks instead of > subroutines. You can structure your __init__ method like this: Doh. s/subroutines/coroutines -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Sat, Jan 23, 2016 at 7:38 AM, Frank Millman wrote:
> Here is the difficulty. The recommended way to handle a blocking operation
> is to run it as a task in a different thread, using run_in_executor(). This
> method is a coroutine. An implication of this is that any method that calls
> it must also be a coroutine, so I end up with a chain of coroutines
> stretching all the way back to the initial event that triggered it.

This seems to be a common misapprehension about asyncio programming. While coroutines are the focus of the library, they're based on futures, and so by working at a slightly lower level you can also handle them as such. So while this would be the typical way to use run_in_executor:

    async def my_coroutine(stuff):
        value = await get_event_loop().run_in_executor(None,
            blocking_function, stuff)
        result = await do_something_else_with(value)
        return result

This is also a perfectly valid way to use it:

    def normal_function(stuff):
        loop = get_event_loop()
        coro = loop.run_in_executor(None, blocking_function, stuff)
        task = loop.create_task(coro)
        task.add_done_callback(do_something_else)
        return task

> I use a cache to store frequently used objects, but I wait for the first
> request before I actually retrieve it from the database. This is how it
> worked -
>
>     # cache of database objects for each company
>     class DbObjects(dict):
>         def __missing__(self, company):
>             db_object = self[company] = get_db_object_from_database()
>             return db_object
>
>     db_objects = DbObjects()
>
> Any function could ask for db_cache.db_objects[company]. The first time it
> would be read from the database, on subsequent requests it would be
> returned from the dictionary.
>
> Now get_db_object_from_database() is a coroutine, so I have to change it to
>
>     db_object = self[company] = await get_db_object_from_database()
>
> But that is not allowed, because __missing__() is not a coroutine.
>
> I fixed it by replacing the cache with a function -
>
>     # cache of database objects for each company
>     db_objects = {}
>
>     async def get_db_object(company):
>         if company not in db_objects:
>             db_object = db_objects[company] = await get_db_object_from_database()
>         return db_objects[company]
>
> Now the calling functions have to call 'await
> db_cache.get_db_object(company)'
>
> Ok, once I had made the change it did not feel so bad.

This all sounds pretty reasonable to me.

> Now I have another problem. I have some classes which retrieve some data
> from the database during their __init__() method. I find that it is not
> allowed to call a coroutine from __init__(), and it is not allowed to turn
> __init__() into a coroutine.
>
> I imagine that I will have to split __init__() into two parts, put the
> database functionality into a separately-callable method, and then go
> through my app to find all occurrences of instantiating the object and
> follow it with an explicit call to the new method.
>
> Again, I can handle that without too much difficulty. But at this stage I
> do not know what other problems I am going to face, and how easy they will
> be to fix.
>
> So I thought I would ask here if anyone has been through a similar
> exercise, and if what I am going through sounds normal, or if I am doing
> something fundamentally wrong.

This is where it would make sense to me to use callbacks instead of subroutines. You can structure your __init__ method like this:

    def __init__(self, params):
        self.params = params
        self.db_object_future = get_event_loop().create_task(
            get_db_object(params))

    async def method_depending_on_db_object(self):
        db_object = await self.db_object_future
        result = do_something_with(db_object)
        return result

The caveat with this is that while __init__ itself doesn't need to be a coroutine, any method that depends on the DB lookup does need to be (or at least needs to return a future).

--
https://mail.python.org/mailman/listinfo/python-list
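[Editor's note: a self-contained sketch of the future-in-__init__ pattern described above. `get_db_object` is a stand-in coroutine and `DbAwareObject` is an invented class name.]

```python
import asyncio

async def get_db_object(params):
    await asyncio.sleep(0.01)          # stand-in for the database lookup
    return {'company': params}

class DbAwareObject:
    def __init__(self, loop, params):
        self.params = params
        # kick off the lookup immediately; store the future, not the result
        self.db_object_future = loop.create_task(get_db_object(params))

    async def method_depending_on_db_object(self):
        # the first caller waits for the lookup; later callers get the
        # already-resolved future's result immediately
        db_object = await self.db_object_future
        return db_object['company']

async def main(loop):
    obj = DbAwareObject(loop, 'acme')
    return await obj.method_depending_on_db_object()

loop = asyncio.new_event_loop()
result = loop.run_until_complete(main(loop))
loop.close()
print(result)
```

Awaiting an already-completed future is cheap, so every method that needs the DB object can simply `await self.db_object_future` without worrying whether the lookup has finished yet.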
Re: Question about asyncio and blocking operations
On Sun, Jan 24, 2016 at 1:38 AM, Frank Millman wrote:
> I find I am bumping my head more than I expected, so I thought I would try
> to get some feedback here to see if I have some flaw in my approach, or if
> it is just in the nature of writing an asynchronous-style application.

I don't have a lot of experience with Python's async/await as such, but I've written asynchronous apps using a variety of systems (and also written threaded ones many times), so I'll answer questions on the basis of design principles that were passed down to me through the generations.

> I use a cache to store frequently used objects, but I wait for the first
> request before I actually retrieve it from the database. This is how it
> worked -
>
>     # cache of database objects for each company
>     class DbObjects(dict):
>         def __missing__(self, company):
>             db_object = self[company] = get_db_object_from_database()
>             return db_object
>
>     db_objects = DbObjects()
>
> Any function could ask for db_cache.db_objects[company]. The first time it
> would be read from the database, on subsequent requests it would be
> returned from the dictionary.
>
> Now get_db_object_from_database() is a coroutine, so I have to change it to
>
>     db_object = self[company] = await get_db_object_from_database()
>
> But that is not allowed, because __missing__() is not a coroutine.
>
> I fixed it by replacing the cache with a function -
>
>     # cache of database objects for each company
>     db_objects = {}
>
>     async def get_db_object(company):
>         if company not in db_objects:
>             db_object = db_objects[company] = await get_db_object_from_database()
>         return db_objects[company]
>
> Now the calling functions have to call 'await
> db_cache.get_db_object(company)'
>
> Ok, once I had made the change it did not feel so bad.

I would prefer the function call anyway. Subscripting a dictionary is fine for something that's fairly cheap, but if it's potentially hugely expensive, I'd rather see it spelled as a function call. There's plenty of precedent for caching function calls so only the first one is expensive.

> Now I have another problem. I have some classes which retrieve some data
> from the database during their __init__() method. I find that it is not
> allowed to call a coroutine from __init__(), and it is not allowed to turn
> __init__() into a coroutine.
>
> I imagine that I will have to split __init__() into two parts, put the
> database functionality into a separately-callable method, and then go
> through my app to find all occurrences of instantiating the object and
> follow it with an explicit call to the new method.
>
> Again, I can handle that without too much difficulty. But at this stage I
> do not know what other problems I am going to face, and how easy they will
> be to fix.

The question here is: Until you get that data from the database, what state would the object be in? There are two basic options:

1) If the object is somewhat usable and meaningful, divide initialization into two parts - one that sets up the object itself (__init__) and one that fetches stuff from the database. If you can, trigger the database fetch in __init__ so it's potentially partly done when you come to wait for it.

2) If the object would be completely useless, use an awaitable factory function instead. Rather than constructing an object, you ask an asynchronous procedure to give you an object. It's a subtle change, and by carefully managing the naming, you could make it almost transparent in your code:

    # Old way:
    class User:
        def __init__(self, domain, name):
            self.id = blocking_database_call("get user", domain, name)

    # And used thus:
    me = User("example.com", "rosuav")

    # New way:
    class User:
        def __init__(self, id):
            self.id = id

    _User = User

    async def User(domain, name):
        id = await async_database_call("get user", domain, name)
        return _User(id)

    # And used thus:
    me = await User("example.com", "rosuav")

> So I thought I would ask here if anyone has been through a similar
> exercise, and if what I am going through sounds normal, or if I am doing
> something fundamentally wrong.

I think this looks pretty much right. There are some small things you can do to make it look a bit easier, but it's minor.

ChrisA

--
https://mail.python.org/mailman/listinfo/python-list
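[Editor's note: a runnable sketch of the "new way" above. `async_database_call` is a stand-in coroutine, and for clarity the private class is written directly as `_User` rather than via the rebinding trick.]

```python
import asyncio

async def async_database_call(query, domain, name):
    await asyncio.sleep(0.01)   # stand-in for a real async DB driver
    return 42                   # pretend this is the user's id

class _User:
    def __init__(self, id):
        self.id = id

async def User(domain, name):
    # awaitable factory: reads like a constructor at the call site,
    # but can await the database lookup before building the object
    id = await async_database_call("get user", domain, name)
    return _User(id)

async def main():
    me = await User("example.com", "rosuav")
    return me

loop = asyncio.new_event_loop()
me = loop.run_until_complete(main())
loop.close()
print(type(me).__name__, me.id)
```

The caller never sees a half-initialized object: by the time `await User(...)` returns, the database data is already in place.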
Question about asyncio and blocking operations
Hi all

I am developing a typical accounting/business application which involves a front-end allowing clients to access the system, a back-end connecting to a database, and a middle layer that glues it all together.

Some time ago I converted the front-end from a multi-threaded approach to an asyncio approach. It was surprisingly easy, and did not require me to delve into asyncio too deeply.

There was one aspect that I deliberately ignored at that stage. I did not change the database access to an asyncio approach, so all reading from/writing to the database involved a blocking operation. I am now ready to tackle that.

I find I am bumping my head more than I expected, so I thought I would try to get some feedback here to see if I have some flaw in my approach, or if it is just in the nature of writing an asynchronous-style application.

Here is the difficulty. The recommended way to handle a blocking operation is to run it as a task in a different thread, using run_in_executor(). This method is a coroutine. An implication of this is that any method that calls it must also be a coroutine, so I end up with a chain of coroutines stretching all the way back to the initial event that triggered it.

I can understand why this is necessary, but it does lead to some awkward programming.

I use a cache to store frequently used objects, but I wait for the first request before I actually retrieve it from the database. This is how it worked -

    # cache of database objects for each company
    class DbObjects(dict):
        def __missing__(self, company):
            db_object = self[company] = get_db_object_from_database()
            return db_object

    db_objects = DbObjects()

Any function could ask for db_cache.db_objects[company]. The first time it would be read from the database, on subsequent requests it would be returned from the dictionary.

Now get_db_object_from_database() is a coroutine, so I have to change it to

    db_object = self[company] = await get_db_object_from_database()

But that is not allowed, because __missing__() is not a coroutine.

I fixed it by replacing the cache with a function -

    # cache of database objects for each company
    db_objects = {}

    async def get_db_object(company):
        if company not in db_objects:
            db_object = db_objects[company] = await get_db_object_from_database()
        return db_objects[company]

Now the calling functions have to call 'await db_cache.get_db_object(company)'

Ok, once I had made the change it did not feel so bad.

Now I have another problem. I have some classes which retrieve some data from the database during their __init__() method. I find that it is not allowed to call a coroutine from __init__(), and it is not allowed to turn __init__() into a coroutine.

I imagine that I will have to split __init__() into two parts, put the database functionality into a separately-callable method, and then go through my app to find all occurrences of instantiating the object and follow it with an explicit call to the new method.

Again, I can handle that without too much difficulty. But at this stage I do not know what other problems I am going to face, and how easy they will be to fix.

So I thought I would ask here if anyone has been through a similar exercise, and if what I am going through sounds normal, or if I am doing something fundamentally wrong.

Thanks for any input

Frank Millman

--
https://mail.python.org/mailman/listinfo/python-list
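[Editor's note: the replacement cache function described above can be sketched as a runnable example like this, with `get_db_object_from_database` as a stand-in coroutine.]

```python
import asyncio

async def get_db_object_from_database(company):
    await asyncio.sleep(0.01)       # stand-in for the real database read
    return {'name': company}

# cache of database objects for each company
db_objects = {}

async def get_db_object(company):
    if company not in db_objects:
        db_objects[company] = await get_db_object_from_database(company)
    return db_objects[company]

async def main():
    a = await get_db_object('acme')   # first call hits the "database"
    b = await get_db_object('acme')   # second call comes from the cache
    return a is b

loop = asyncio.new_event_loop()
same = loop.run_until_complete(main())
loop.close()
print(same)
```

One caveat: two coroutines that both miss the cache for the same company concurrently will each fetch from the database; storing a per-company future (or guarding the lookup with an asyncio.Lock) would close that race.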