Re: Question about asyncio and blocking operations
"Frank Millman" wrote in message news:n8et0d$hem$1...@ger.gmane.org... I have read the other messages, and I can see that there are some clever ideas there. However, having found something that seems to work and that I feel comfortable with, I plan to run with this for the time being. A quick update. Now that I am starting to understand this a bit better, I found it very easy to turn my concept into an Asynchronous Iterator. class AsyncCursor: def __init__(self, loop, sql): self.return_queue = asyncio.Queue() request_queue.put((self.return_queue, loop, sql)) async def __aiter__(self): return self async def __anext__(self): row = await self.return_queue.get() if row is not None: return row else: self.return_queue.task_done() raise StopAsyncIteration The caller can use it like this - sql = 'SELECT ...' cur = AsyncCursor(loop, sql) async for row in cur: print('got', row) Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On 28 Jan 2016 at 22:52, "Ian Kelly" wrote:
>
> On Thu, Jan 28, 2016 at 2:23 PM, Maxime S wrote:
> >
> > 2016-01-28 17:53 GMT+01:00 Ian Kelly :
> >>
> >> On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
> >>
> >> > The caller requests some data from the database like this.
> >> >
> >> >     return_queue = asyncio.Queue()
> >> >     sql = 'SELECT ...'
> >> >     request_queue.put((return_queue, sql))
> >>
> >> Note that since this is a queue.Queue, the put call has the potential
> >> to block your entire event loop.
> >
> > Actually, I don't think you need an asyncio.Queue.
> >
> > You could use a simple deque as a buffer, and call fetchmany() when it
> > is empty, like this (untested):
>
> True. The asyncio Queue is really just a wrapper around a deque with
> an interface designed for use with the producer-consumer pattern. If
> the producer isn't a coroutine then it may not be appropriate.
>
> This seems like a nice suggestion. Caution is advised if multiple
> cursor methods are executed concurrently, since they would be in
> different threads and the underlying cursor may not be thread-safe.

Indeed, the run_in_executor call should probably be protected by an
asyncio.Lock. But it is a pretty strange idea to call two fetch*() methods
concurrently anyway.
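The asyncio.Lock idea mentioned here can be sketched as follows. The class name and structure are mine, not from the thread; sqlite3 with check_same_thread=False stands in for whatever DB-API driver is actually used.

```python
import asyncio
import sqlite3
from concurrent.futures import ThreadPoolExecutor

class LockedCursor:
    """Hypothetical wrapper (my naming): serialize executor calls that
    touch a shared, possibly non-thread-safe cursor."""

    def __init__(self, cur):
        self._cur = cur
        self._lock = asyncio.Lock()
        self._executor = ThreadPoolExecutor(max_workers=1)

    async def fetchmany(self, size):
        loop = asyncio.get_running_loop()
        async with self._lock:   # one fetch at a time reaches the cursor
            return await loop.run_in_executor(
                self._executor, self._cur.fetchmany, size)

conn = sqlite3.connect(':memory:', check_same_thread=False)
conn.execute('CREATE TABLE t (x)')
conn.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(4)])

async def main():
    lc = LockedCursor(conn.execute('SELECT x FROM t ORDER BY x'))
    # concurrent calls are safe: the lock forces them to run one by one
    a, b = await asyncio.gather(lc.fetchmany(2), lc.fetchmany(2))
    return a, b

a, b = asyncio.run(main())
print(sorted(a + b))  # [(0,), (1,), (2,), (3,)]
```

Note that a single-worker executor already serializes the blocking calls; the explicit asyncio.Lock additionally keeps whole async operations (e.g. an execute followed by its fetches) from interleaving.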
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:calwzidn6nft_o0cfhw1itwja81+mw3schuecadvcen3ix6z...@mail.gmail.com... As I commented in my previous message, asyncio.Queue is not thread-safe, so it's very important that the put calls here be done on the event loop thread using event_loop.call_soon_threadsafe. This could be the cause of the strange behavior you're seeing in getting the results. Using call_soon_threadsafe makes all the difference. The rows are now retrieved instantly. I have read the other messages, and I can see that there are some clever ideas there. However, having found something that seems to work and that I feel comfortable with, I plan to run with this for the time being. Thanks to all for the very stimulating discussion. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Jan 28, 2016 3:07 PM, "Maxime Steisel" wrote:
>
> But it is a pretty strange idea to call two fetch*() methods concurrently
> anyway.

If you want to process rows concurrently and aren't concerned with
processing them in order, it may be attractive to create multiple threads /
coroutines, pass the cursor to each, and let them each call fetchmany
independently.

I agree this is a bad idea unless you use a lock to isolate the calls or
are certain that you'll never use a dbapi implementation with threadsafety
< 3. I pointed it out because the wrapper makes it less obvious that
multiple threads are involved; one could naively assume that the separate
calls are isolated by the event loop.
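The threadsafety level Ian refers to is the module-level attribute defined by the DB-API spec (PEP 249). A small sketch of checking it, with sqlite3 as the example module (its reported value varies between Python builds, so no particular value is assumed):

```python
import sqlite3

# DB-API 2.0 (PEP 249) module-level attribute:
#   0 - threads may not share the module
#   1 - threads may share the module, but not connections
#   2 - threads may also share connections
#   3 - threads may also share cursors
def cursors_shareable(dbapi_module):
    """Return True if the driver declares cursors safe to share."""
    return getattr(dbapi_module, 'threadsafety', 0) >= 3

print(cursors_shareable(sqlite3))
```

A wrapper like the one under discussion could refuse concurrent fetches, or take the lock, whenever this check fails.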
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 2:23 PM, Maxime S wrote:
>
> 2016-01-28 17:53 GMT+01:00 Ian Kelly :
> >
> > On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
> >
> > > The caller requests some data from the database like this.
> > >
> > >     return_queue = asyncio.Queue()
> > >     sql = 'SELECT ...'
> > >     request_queue.put((return_queue, sql))
> >
> > Note that since this is a queue.Queue, the put call has the potential
> > to block your entire event loop.
>
> Actually, I don't think you need an asyncio.Queue.
>
> You could use a simple deque as a buffer, and call fetchmany() when it is
> empty, like this (untested):

True. The asyncio Queue is really just a wrapper around a deque with an
interface designed for use with the producer-consumer pattern. If the
producer isn't a coroutine then it may not be appropriate.

This seems like a nice suggestion. Caution is advised if multiple cursor
methods are executed concurrently, since they would be in different threads
and the underlying cursor may not be thread-safe.
Re: Question about asyncio and blocking operations
2016-01-28 17:53 GMT+01:00 Ian Kelly :

> On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
>
> > The caller requests some data from the database like this.
> >
> >     return_queue = asyncio.Queue()
> >     sql = 'SELECT ...'
> >     request_queue.put((return_queue, sql))
>
> Note that since this is a queue.Queue, the put call has the potential
> to block your entire event loop.

Actually, I don't think you need an asyncio.Queue.

You could use a simple deque as a buffer, and call fetchmany() when it is
empty, like this (untested):

    class AsyncCursor:
        """Wraps a DB cursor and provides async methods for blocking
        operations."""

        def __init__(self, cur, loop=None):
            if loop is None:
                loop = asyncio.get_event_loop()
            self._loop = loop
            self._cur = cur
            self._queue = deque()

        def __getattr__(self, attr):
            return getattr(self._cur, attr)

        def __setattr__(self, attr, value):
            # delegate everything except our own private attributes
            if attr.startswith('_'):
                super().__setattr__(attr, value)
            else:
                setattr(self._cur, attr, value)

        async def execute(self, operation, params):
            return await self._loop.run_in_executor(
                None, self._cur.execute, operation, params)

        async def fetchall(self):
            return await self._loop.run_in_executor(
                None, self._cur.fetchall)

        async def fetchone(self):
            return await self._loop.run_in_executor(
                None, self._cur.fetchone)

        async def fetchmany(self, size=None):
            if size is None:
                size = self._cur.arraysize
            return await self._loop.run_in_executor(
                None, self._cur.fetchmany, size)

        async def __aiter__(self):
            return self

        async def __anext__(self):
            if not self._queue:    # a deque has no empty() method
                rows = await self.fetchmany()
                if not rows:
                    raise StopAsyncIteration()
                self._queue.extend(rows)
            return self._queue.popleft()
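The deque-as-buffer idea can be shown in isolation as a small runnable sketch. The class name, batch size, and sqlite3 table are illustrative, not from the post; check_same_thread=False is needed because fetchmany runs in an executor thread.

```python
import asyncio
import sqlite3
from collections import deque

class BufferedRows:
    """Async-iterate over a cursor, refilling a deque via fetchmany()."""

    def __init__(self, cur, batch_size=2):
        self._cur = cur
        self._batch = batch_size
        self._buf = deque()

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._buf:  # refill only when the buffer is drained
            loop = asyncio.get_running_loop()
            rows = await loop.run_in_executor(
                None, self._cur.fetchmany, self._batch)
            if not rows:
                raise StopAsyncIteration
            self._buf.extend(rows)
        return self._buf.popleft()

conn = sqlite3.connect(':memory:', check_same_thread=False)
conn.execute('CREATE TABLE t (x)')
conn.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(5)])

async def main():
    cur = conn.execute('SELECT x FROM t ORDER BY x')
    return [row async for row in BufferedRows(cur)]

rows = asyncio.run(main())
print(rows)  # [(0,), (1,), (2,), (3,), (4,)]
```

Only every batch_size-th iteration pays the cost of an executor round-trip; the rest are served from the deque without leaving the event loop.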
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:CALwzidnGbz7kM=d7mkua2ta9-csfn9u0ohl0w-x5bbixpcw...@mail.gmail.com... On Jan 28, 2016 4:13 AM, "Frank Millman" wrote: > > I *think* I have this one covered. When the caller makes a request, it creates an instance of an asyncio.Queue, and includes it with the request. The db handler uses this queue to send the result back. > > Do you see any problem with this? That seems reasonable to me. I assume that when you send the result back you would be queuing up individual rows and not just sending a single object across, which could be more easily with just a single future. I have hit a snag. It feels like a bug in 'await q.get()', though I am sure it is just me misunderstanding how it works. I can post some working code if necessary, but here is a short description. Here is the database handler - 'request_queue' is a queue.Queue - while not request_queue.empty(): return_queue, sql = request_queue.get() cur.execute(sql) for row in cur: return_queue.put_nowait(row) return_queue.put_nowait(None) request_queue.task_done() The caller requests some data from the database like this. return_queue = asyncio.Queue() sql = 'SELECT ...' request_queue.put((return_queue, sql)) while True: row = await return_queue.get() if row is None: break print('got', row) return_queue.task_done() The first time 'await return_queue.get()' is called, return_queue is empty, as the db handler has not had time to do anything yet. It is supposed to pause there, wait for something to appear in the queue, and then process it. I have confirmed that the db handler populates the queue virtually instantly. What seems to happen is that it pauses there, but then waits for some other event in the event loop to occur before it continues. Then it processes all rows very quickly. I am running a 'counter' task in the background that prints a sequential number, using await asyncio.sleep(1). 
I noticed a short but variable delay before the rows were printed, and thought it might be waiting for the counter. I increased the sleep to 10, and sure enough it now waits up to 10 seconds before printing any rows. Any ideas? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Jan 28, 2016 4:13 AM, "Frank Millman" wrote:
>
> "Chris Angelico" wrote in message
> news:captjjmr162+k4lzefpxrur6wxrhxbr-_wkrclldyr7kst+k...@mail.gmail.com...
> >
> > On Thu, Jan 28, 2016 at 8:13 PM, Frank Millman wrote:
> > > Run the database handler in a separate thread. Use a queue.Queue to
> > > send requests to the handler. Use an asyncio.Queue to send results
> > > back to the caller, which can call 'await q.get()'.
> > >
> > > I ran a quick test and it seems to work. What do you think?
> >
> > My gut feeling is that any queue can block at either get or put
>
> H'mm, I will have to think about that one, and figure out how to create a
> worst-case scenario. I will report back on that.

The get and put methods of asyncio queues are coroutines, so I don't think
this would be a real issue. The coroutine might block, but it won't block
the event loop. If the queue fills up, then effectively the waiting
coroutines just become a (possibly unordered) extension of the queue.

> > The other risk is that the wrong result will be queried (two async
> > tasks put something onto the queue - which one gets the first
> > result?), which could either be coped with by simple sequencing (maybe
> > this happens automatically, although I'd prefer a
> > mathematically-provable result to "it seems to work"), or by wrapping
> > the whole thing up in a function/class.
>
> I *think* I have this one covered. When the caller makes a request, it
> creates an instance of an asyncio.Queue, and includes it with the
> request. The db handler uses this queue to send the result back.
>
> Do you see any problem with this?

That seems reasonable to me. I assume that when you send the result back
you would be queuing up individual rows and not just sending a single
object across, which could be done more easily with just a single future.

The main risk of adding limited threads to an asyncio program is that
threads make it harder to reason about concurrency. Just make sure the
threads don't share any state and you should be good.

Note that I can only see queues being used to move data in this direction,
not in the opposite. It's unclear to me how queue.get would work from the
blocking thread. Asyncio queues aren't thread-safe, but you couldn't just
use call_soon_threadsafe since the result is important. You might want to
use a queue.Queue instead in that case, but then you run back into the
problem of queue.put being a blocking operation.
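The single-future alternative Ian mentions - sending the whole result set back as one object rather than row by row - can be sketched like this. The worker, table, and data are invented for the example; the key point is that a Future, like an asyncio.Queue, is not thread-safe, so its result must be set via call_soon_threadsafe.

```python
import asyncio
import sqlite3
import threading

def db_worker(loop, fut, sql):
    # Runs in its own thread; a fresh in-memory db keeps the sketch
    # self-contained.
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE t (x)')
    conn.executemany('INSERT INTO t VALUES (?)', [(1,), (2,)])
    rows = conn.execute(sql).fetchall()
    # Futures aren't thread-safe either: set the result on the loop thread
    loop.call_soon_threadsafe(fut.set_result, rows)

async def fetch(sql):
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    threading.Thread(target=db_worker, args=(loop, fut, sql)).start()
    return await fut

print(asyncio.run(fetch('SELECT x FROM t')))  # [(1,), (2,)]
```

The trade-off is the one discussed above: a future delivers one object per request, so the caller cannot start consuming rows until the query has finished, whereas the per-request queue streams rows as they arrive.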
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
> I have hit a snag. It feels like a bug in 'await q.get()', though I am
> sure it is just me misunderstanding how it works.
>
> I can post some working code if necessary, but here is a short
> description.
>
> Here is the database handler - 'request_queue' is a queue.Queue -
>
>     while not request_queue.empty():
>         return_queue, sql = request_queue.get()
>         cur.execute(sql)
>         for row in cur:
>             return_queue.put_nowait(row)
>         return_queue.put_nowait(None)
>         request_queue.task_done()

As I commented in my previous message, asyncio.Queue is not thread-safe,
so it's very important that the put calls here be done on the event loop
thread using event_loop.call_soon_threadsafe. This could be the cause of
the strange behavior you're seeing in getting the results.

> The caller requests some data from the database like this.
>
>     return_queue = asyncio.Queue()
>     sql = 'SELECT ...'
>     request_queue.put((return_queue, sql))

Note that since this is a queue.Queue, the put call has the potential
to block your entire event loop.
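A minimal sketch of the hand-off Ian describes: a worker thread delivers an item to a coroutine waiting on an asyncio.Queue by scheduling the put on the loop thread. The worker and the 0.1-second sleep are invented to stand in for blocking database work.

```python
import asyncio
import threading
import time

async def main():
    loop = asyncio.get_running_loop()
    q = asyncio.Queue()

    def worker():
        time.sleep(0.1)  # pretend to do blocking database work
        # Wrong: calling q.put_nowait('row') directly from this thread
        # would append the item but not wake the event loop, so the
        # awaiting get() can stall until some unrelated event (such as
        # a timer) happens to run - the symptom Frank reported.
        loop.call_soon_threadsafe(q.put_nowait, 'row')

    threading.Thread(target=worker).start()
    return await q.get()   # wakes promptly

print(asyncio.run(main()))  # row
```

call_soon_threadsafe both hands the callback to the loop thread and wakes the loop, which is exactly the piece put_nowait alone is missing across threads.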
Re: Question about asyncio and blocking operations
"Chris Angelico" wrote in message news:captjjmr162+k4lzefpxrur6wxrhxbr-_wkrclldyr7kst+k...@mail.gmail.com... On Thu, Jan 28, 2016 at 8:13 PM, Frank Millman wrote: > Run the database handler in a separate thread. Use a queue.Queue to send > requests to the handler. Use an asyncio.Queue to send results back to > the > caller, which can call 'await q.get()'. > > I ran a quick test and it seems to work. What do you think? My gut feeling is that any queue can block at either get or put H'mm, I will have to think about that one, and figure out how to create a worst-case scenario. I will report back on that. The other risk is that the wrong result will be queried (two async tasks put something onto the queue - which one gets the first result?), which could either be coped with by simple sequencing (maybe this happens automatically, although I'd prefer a mathematically-provable result to "it seems to work"), or by wrapping the whole thing up in a function/class. I *think* I have this one covered. When the caller makes a request, it creates an instance of an asyncio.Queue, and includes it with the request. The db handler uses this queue to send the result back. Do you see any problem with this? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 8:13 PM, Frank Millman wrote:
> Run the database handler in a separate thread. Use a queue.Queue to send
> requests to the handler. Use an asyncio.Queue to send results back to the
> caller, which can call 'await q.get()'.
>
> I ran a quick test and it seems to work. What do you think?

My gut feeling is that any queue can block at either get or put. Half of
your operations are "correct", and the other half are "wrong". The caller
can "await q.get()", which is the "correct" way to handle the blocking
operation; the database thread can "jobqueue.get()" as a blocking
operation, which is also fine. But your queue-putting operations are going
to have to assume that they never block.

Maybe you can have the database put its results onto the asyncio.Queue
safely, but the requests going onto the queue.Queue could block waiting for
the database. Specifically, this will happen if database queries come in
faster than the database can handle them - quite literally, your jobs will
be "blocked on the database". What should happen then?

The other risk is that the wrong result will be queried (two async tasks
put something onto the queue - which one gets the first result?), which
could either be coped with by simple sequencing (maybe this happens
automatically, although I'd prefer a mathematically-provable result to "it
seems to work"), or by wrapping the whole thing up in a function/class.

Both of these are risks seen purely by looking at the idea description, not
at any sort of code. It's entirely possible I'm mistaken about them. But
there's only one way to find out :)

By the way, just out of interest... there's no way you can actually switch
out the database communication for something purely socket-based, is there?
PostgreSQL's protocol, for instance, is fairly straight-forward, and you
don't *have* to use libpq; Pike's inbuilt pgsql module just opens a socket
and says hello, and it looks like py-postgresql [1] is the same thing for
Python. Taking something like that and making it asynchronous would be as
straight-forward as converting any other socket-based code. Could be an
alternative to all this weirdness.

ChrisA

[1] https://pypi.python.org/pypi/py-postgresql
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 10:11 PM, Frank Millman wrote:
> >
> > The other risk is that the wrong result will be queried (two async
> > tasks put something onto the queue - which one gets the first
> > result?), which could either be coped with by simple sequencing (maybe
> > this happens automatically, although I'd prefer a
> > mathematically-provable result to "it seems to work"), or by wrapping
> > the whole thing up in a function/class.
>
> I *think* I have this one covered. When the caller makes a request, it
> creates an instance of an asyncio.Queue, and includes it with the
> request. The db handler uses this queue to send the result back.
>
> Do you see any problem with this?

Oh, I get it. In that case, you should be safe (at the cost of efficiency,
presumably, but probably immeasurably so).

The easiest way to thrash-test this is to simulate an ever-increasing
stream of requests that eventually get bottlenecked by the database. For
instance, have the database "process" a maximum of 10 requests a second,
and then start feeding it 11 requests a second. Monitoring the queue
length should tell you what's happening.

Eventually, the queue will hit its limit (and you can make that happen
sooner by creating the queue with a small maxsize), at which point the
queue.put() will block. My suspicion is that that's going to lock your
entire backend, unlocking only when the database takes something off its
queue; you'll end up throttling at the database's rate (which is fine),
but with something perpetually blocked waiting for the database (which is
bad - other jobs, like socket read/write, won't be happening).

As an alternative, you could use put_nowait or put(item, block=False) to
put things on the queue. If there's no room, it'll fail immediately, which
you can spit back to the user as "Database overloaded, please try again
later". Tuning the size of the queue would then be an important
consideration for real-world work; you'd want it long enough to cope with
bursty traffic, but short enough that people's requests don't time out
while they're blocked on the database (it's much tidier to fail them
instantly if that's going to happen).

ChrisA
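The fail-fast alternative Chris describes is a few lines with a bounded queue.Queue. The maxsize, the submit helper, and the message text are illustrative only:

```python
import queue

# A bounded request queue plus put_nowait: shed load immediately instead
# of letting a full queue block the event loop.
request_queue = queue.Queue(maxsize=2)   # tiny maxsize for demonstration

def submit(request):
    """Queue a request, or report overload without blocking."""
    try:
        request_queue.put_nowait(request)
        return 'queued'
    except queue.Full:
        return 'Database overloaded, please try again later'

results = [submit(n) for n in range(3)]
print(results)
# ['queued', 'queued', 'Database overloaded, please try again later']
```

In a real server the overload branch would translate into an HTTP 503 or similar, and maxsize would be tuned as Chris suggests.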
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:CALwzidkr-fT6S6wH2caNaxyQvUdAw=x7xdqkqofnrrwzwnj...@mail.gmail.com... On Wed, Jan 27, 2016 at 10:14 AM, Ian Kelly wrote: > Unfortunately this doesn't actually work at present. > EventLoop.run_in_executor swallows the StopIteration exception and > just returns None, which I assume is a bug. http://bugs.python.org/issue26221 Thanks for that. Fascinating discussion between you and GvR. Reading it gave me an idea. Run the database handler in a separate thread. Use a queue.Queue to send requests to the handler. Use an asyncio.Queue to send results back to the caller, which can call 'await q.get()'. I ran a quick test and it seems to work. What do you think? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:CALwzidk-RBkB-vi6CgcEeoFHQrsoTFvqX9MqzDD=rny5boc...@mail.gmail.com... On Tue, Jan 26, 2016 at 7:15 AM, Frank Millman wrote: > > If I return the cursor, I can iterate over it, but isn't this a blocking > operation? As far as I know, the DB adaptor will only actually retrieve > the > row when requested. > > If I am right, I should call fetchall() while inside get_rows(), and > return > all the rows as a list. > You probably want an asynchronous iterator here. If the cursor doesn't provide that, then you can wrap it in one. In fact, this is basically one of the examples in the PEP: https://www.python.org/dev/peps/pep-0492/#example-1 Thanks, Ian. I had a look, and it does seem to fit the bill, but I could not get it to work, and I am running out of time. Specifically, I tried to get it working with the sqlite3 cursor. I am no expert, but after some googling I tried this - import sqlite3 conn = sqlite3.connect('/sqlite_db') cur = conn.cursor() async def __aiter__(self): return self async def __anext__(self): loop = asyncio.get_event_loop() return await loop.run_in_executor(None, self.__next__) import types cur.__aiter__ = types.MethodType( __aiter__, cur ) cur.__anext__ = types.MethodType( __anext__, cur ) It failed with this exception - AttributeError: 'sqlite3.Cursor' object has no attribute '__aiter__' I think this is what happens if a class uses 'slots' to define its attributes - it will not permit the creation of a new one. Anyway, moving on, I decided to change tack. Up to now I have been trying to isolate the function where I actually communicate with the database, and wrap that in a Future with 'run_in_executor'. In practice, the vast majority of my interactions with the database consist of very small CRUD commands, and will have minimal impact on response times even if they block. So I decided to focus on a couple of functions which are larger, and try to wrap the entire function in a Future with 'run_in_executor'. 
It seems to be working, but it looks a bit odd, so I will show what I am doing and ask for feedback. Assume a slow function - async def slow_function(arg1, arg2): [do stuff] It now looks like this - async def slow_function(arg1, arg2): loop = asyncio.get_event_loop() await loop.run_in_executor(None, slow_function_1, arg1, arg2) def slow_function_1(self, arg1, arg2): loop = asyncio.new_event_loop() asyncio.set_event_loop(loop) loop.run_until_complete(slow_function_2(arg1, arg2)) async slow_function_2(arg1, arg2): [do stuff] Does this look right? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:calwzidn6tvn9w-2qnn2jyvju8nhzn499nptfjn9ohjddceb...@mail.gmail.com... On Wed, Jan 27, 2016 at 7:40 AM, Frank Millman wrote: > > Assume a slow function - > > async def slow_function(arg1, arg2): >[do stuff] > > It now looks like this - > > async def slow_function(arg1, arg2): >loop = asyncio.get_event_loop() >await loop.run_in_executor(None, slow_function_1, arg1, arg2) > > def slow_function_1(self, arg1, arg2): >loop = asyncio.new_event_loop() >asyncio.set_event_loop(loop) >loop.run_until_complete(slow_function_2(arg1, arg2)) > > async slow_function_2(arg1, arg2): >[do stuff] > > Does this look right? I'm not sure I understand what you're trying to accomplish by running a second event loop inside the executor thread. It will only be useful for scheduling asynchronous operations, and if they're asynchronous then why not schedule them on the original event loop? I could be confusing myself here, but this is what I am trying to do. run_in_executor() schedules a blocking function to run in the executor, and returns a Future. If you just invoke it, the blocking function will execute in the background, and the calling function will carry on. If you obtain a reference to the Future, and then 'await' it, the calling function will be suspended until the blocking function is complete. You might do this because you want the calling function to block, but you do not want to block the entire event loop. In the above example, I do not want the calling function to block. However, the blocking function invokes one or more coroutines, so it needs an event loop to operate. Creating a new event loop allows them to run independently. Hope this makes sense. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Wed, Jan 27, 2016 at 10:14 AM, Ian Kelly wrote:
> Unfortunately this doesn't actually work at present.
> EventLoop.run_in_executor swallows the StopIteration exception and
> just returns None, which I assume is a bug.

http://bugs.python.org/issue26221
Re: Question about asyncio and blocking operations
On Wed, Jan 27, 2016 at 9:15 AM, Ian Kelly wrote:
> class CursorWrapper:
>
>     def __init__(self, cursor):
>         self._cursor = cursor
>
>     async def __aiter__(self):
>         return self
>
>     async def __anext__(self):
>         loop = asyncio.get_event_loop()
>         return await loop.run_in_executor(None, next, self._cursor)

Oh, except you'd want to be sure to catch StopIteration and raise
StopAsyncIteration in its place.

This could also be generalized as an iterator wrapper, similar to the
example in the PEP except using run_in_executor to actually avoid
blocking.

    class AsyncIteratorWrapper:

        def __init__(self, iterable, loop=None, executor=None):
            self._iterator = iter(iterable)
            self._loop = loop or asyncio.get_event_loop()
            self._executor = executor

        async def __aiter__(self):
            return self

        async def __anext__(self):
            try:
                return await self._loop.run_in_executor(
                    self._executor, next, self._iterator)
            except StopIteration:
                raise StopAsyncIteration

Unfortunately this doesn't actually work at present.
EventLoop.run_in_executor swallows the StopIteration exception and just
returns None, which I assume is a bug.
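A variant that sidesteps the StopIteration problem entirely is to keep the exception from ever crossing the executor boundary, using next()'s default argument and a sentinel. This is my adaptation of the idea, not Ian's code; it works on current Python versions.

```python
import asyncio

_SENTINEL = object()   # marks iterator exhaustion across the thread boundary

class AsyncIteratorWrapper:
    """Drive a blocking iterator from async code, one next() per
    executor job, without letting StopIteration cross run_in_executor
    (which does not propagate it cleanly).
    """

    def __init__(self, iterable, executor=None):
        self._iterator = iter(iterable)
        self._executor = executor

    def __aiter__(self):
        return self

    async def __anext__(self):
        loop = asyncio.get_running_loop()
        value = await loop.run_in_executor(
            self._executor, next, self._iterator, _SENTINEL)
        if value is _SENTINEL:
            raise StopAsyncIteration
        return value

async def main():
    return [x async for x in AsyncIteratorWrapper(range(3))]

print(asyncio.run(main()))  # [0, 1, 2]
```

next(iterator, default) returns the sentinel instead of raising when the iterator is exhausted, so the wrapper can translate it into StopAsyncIteration on the loop side.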
Re: Question about asyncio and blocking operations
On Wed, Jan 27, 2016 at 7:40 AM, Frank Millman wrote:
> "Ian Kelly" wrote in message
> news:CALwzidk-RBkB-vi6CgcEeoFHQrsoTFvqX9MqzDD=rny5boc...@mail.gmail.com...
>
> > You probably want an asynchronous iterator here. If the cursor doesn't
> > provide that, then you can wrap it in one. In fact, this is basically
> > one of the examples in the PEP:
> > https://www.python.org/dev/peps/pep-0492/#example-1
>
> Thanks, Ian. I had a look, and it does seem to fit the bill, but I could
> not get it to work, and I am running out of time.
>
> Specifically, I tried to get it working with the sqlite3 cursor. I am no
> expert, but after some googling I tried this -
>
>     import sqlite3
>     conn = sqlite3.connect('/sqlite_db')
>     cur = conn.cursor()
>
>     async def __aiter__(self):
>         return self
>
>     async def __anext__(self):
>         loop = asyncio.get_event_loop()
>         return await loop.run_in_executor(None, self.__next__)
>
>     import types
>     cur.__aiter__ = types.MethodType(__aiter__, cur)
>     cur.__anext__ = types.MethodType(__anext__, cur)
>
> It failed with this exception -
>
>     AttributeError: 'sqlite3.Cursor' object has no attribute '__aiter__'
>
> I think this is what happens if a class uses 'slots' to define its
> attributes - it will not permit the creation of a new one.

This is why I suggested wrapping the cursor instead. Something like this:

    class CursorWrapper:

        def __init__(self, cursor):
            self._cursor = cursor

        async def __aiter__(self):
            return self

        async def __anext__(self):
            loop = asyncio.get_event_loop()
            return await loop.run_in_executor(None, next, self._cursor)

> Anyway, moving on, I decided to change tack. Up to now I have been
> trying to isolate the function where I actually communicate with the
> database, and wrap that in a Future with 'run_in_executor'.
>
> In practice, the vast majority of my interactions with the database
> consist of very small CRUD commands, and will have minimal impact on
> response times even if they block. So I decided to focus on a couple of
> functions which are larger, and try to wrap the entire function in a
> Future with 'run_in_executor'.
>
> It seems to be working, but it looks a bit odd, so I will show what I am
> doing and ask for feedback.
>
> Assume a slow function -
>
>     async def slow_function(arg1, arg2):
>         [do stuff]
>
> It now looks like this -
>
>     async def slow_function(arg1, arg2):
>         loop = asyncio.get_event_loop()
>         await loop.run_in_executor(None, slow_function_1, arg1, arg2)
>
>     def slow_function_1(arg1, arg2):
>         loop = asyncio.new_event_loop()
>         asyncio.set_event_loop(loop)
>         loop.run_until_complete(slow_function_2(arg1, arg2))
>
>     async def slow_function_2(arg1, arg2):
>         [do stuff]
>
> Does this look right?

I'm not sure I understand what you're trying to accomplish by running a
second event loop inside the executor thread. It will only be useful for
scheduling asynchronous operations, and if they're asynchronous then why
not schedule them on the original event loop?
Re: Question about asyncio and blocking operations
> "Alberto" == Alberto Berti writes: Alberto> async external_coro(): # this is the calling context, which is a coro Alberto> async with transction.begin(): Alberto> o = MyObject Alberto> # maybe other stuff ops... here it is "o = MyObject()" ;-) -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
> "Frank" == Frank Millman writes: Frank> Now I have another problem. I have some classes which retrieve some Frank> data from the database during their __init__() method. I find that it Frank> is not allowed to call a coroutine from __init__(), and it is not Frank> allowed to turn __init__() into a coroutine. IMHO this is semantically correct for a method tha should really initialize that instance an await in the __init__ means having a suspension point that makes the initialization somewhat... unpredictable :-). To cover the cases when you need to call a coroutine from a non coroutine function like __init__ I have developed a small package that helps maintaining your code almost clean, where you can be sure that after some point in your code flow, the coroutines scheduled by the normal function have been executed. With that you can write code like this: from metapensiero.asyncio import transaction class MyObject(): def __init__(self): tran = transaction.get() tran.add(get_db_object('company'), cback=self._init) # get_db_object is a coroutine def _init(self, fut): self.company = fut.result() async external_coro(): # this is the calling context, which is a coro async with transction.begin(): o = MyObject # maybe other stuff # start using your db object o.company... This way the management of the "inner" coroutine is simpler, and from your code it's clear it suspends to wait and after that all the "stashed" coroutines are guaranteed to be executed. Hope it helps, Alberto -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Tue, Jan 26, 2016 at 7:15 AM, Frank Millman wrote: > I am making some progress, but I have found a snag - possibly unavoidable, > but worth a mention. > > Usually when I retrieve rows from a database I iterate over the cursor - > >def get_rows(sql, params): >cur.execute(sql, params) >for row in cur: >yield row > > If I create a Future to run get_rows(), I have to 'return' the result so > that the caller can access it by calling future.result(). > > If I return the cursor, I can iterate over it, but isn't this a blocking > operation? As far as I know, the DB adaptor will only actually retrieve the > row when requested. > > If I am right, I should call fetchall() while inside get_rows(), and return > all the rows as a list. > > This seems to be swapping one bit of asynchronicity for another. > > Does this sound right? You probably want an asynchronous iterator here. If the cursor doesn't provide that, then you can wrap it in one. In fact, this is basically one of the examples in the PEP: https://www.python.org/dev/peps/pep-0492/#example-1 -- https://mail.python.org/mailman/listinfo/python-list
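As a concrete illustration of Ian's suggestion, here is a hedged sketch of an async iterator that wraps a blocking cursor, fetching each row in the default executor so the event loop never blocks. `FakeCursor` is a stand-in for a real DB-API cursor; the names are illustrative, not from Frank's application.

```python
import asyncio

class FakeCursor:
    # hypothetical stand-in for a blocking DB-API cursor
    def __init__(self, rows):
        self._rows = iter(rows)

    def fetchone(self):
        # returns the next row, or None when exhausted (blocking in real life)
        return next(self._rows, None)

class AsyncRows:
    # async iterator that fetches each row in the default executor
    def __init__(self, cursor):
        self.cursor = cursor

    def __aiter__(self):
        return self

    async def __anext__(self):
        loop = asyncio.get_running_loop()
        row = await loop.run_in_executor(None, self.cursor.fetchone)
        if row is None:
            raise StopAsyncIteration
        return row

async def main():
    result = []
    async for row in AsyncRows(FakeCursor([(1,), (2,), (3,)])):
        result.append(row)
    return result

print(asyncio.run(main()))  # [(1,), (2,), (3,)]
```

In a real adaptor, calling `fetchmany()` in the executor and buffering the batch would amortize the per-row thread hand-off.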
Re: Question about asyncio and blocking operations
"Frank Millman" wrote in message news:n8038j$575$1...@ger.gmane.org... I am developing a typical accounting/business application which involves a front-end allowing clients to access the system, a back-end connecting to a database, and a middle layer that glues it all together. [...] There was one aspect that I deliberately ignored at that stage. I did not change the database access to an asyncio approach, so all reading from/writing to the database involved a blocking operation. I am now ready to tackle that. I am making some progress, but I have found a snag - possibly unavoidable, but worth a mention. Usually when I retrieve rows from a database I iterate over the cursor - def get_rows(sql, params): cur.execute(sql, params) for row in cur: yield row If I create a Future to run get_rows(), I have to 'return' the result so that the caller can access it by calling future.result(). If I return the cursor, I can iterate over it, but isn't this a blocking operation? As far as I know, the DB adaptor will only actually retrieve the row when requested. If I am right, I should call fetchall() while inside get_rows(), and return all the rows as a list. This seems to be swapping one bit of asynchronicity for another. Does this sound right? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
Marko Rauhamaa writes: > Note that neither the multithreading model (which I dislike) nor the > callback hell (which I like) suffer from this problem. There are some runtimes (GHC and Erlang) where everything is nonblocking under the covers, which lets even the asyncs be swept under the rug. Similarly with some low-tech cooperative multitaskers, say in Forth. When you've got a mixture of blocking and nonblocking, it becomes a mess. -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
Rustom Mody : > Bah -- What a bloody mess! > And thanks for pointing this out, Ian. > Keep wondering whether my brain is atrophying, or its rocket science or... I'm afraid the asyncio idea will not fly. Adding the keywords "async" and "await" did make things much better, but the programming model seems very cumbersome. Say you have an async that calls a nonblocking function as follows: async def t(): ... f() ... def f(): ... g() ... def g(): ... h() ... def h(): ... Then, you need to add a blocking call to h(). You then have a cascading effect of having to sprinkle asyncs and awaits everywhere: async def t(): ... await f() ... async def f(): ... await g() ... async def g(): ... await h() ... async def h(): ... await ... ... A nasty case of nonlocality. Makes you wonder if you ought to declare *all* functions *always* as asyncs just in case they turn out that way. Note that neither the multithreading model (which I dislike) nor the callback hell (which I like) suffer from this problem. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Monday, January 25, 2016 at 9:16:13 PM UTC+5:30, Ian wrote: > On Mon, Jan 25, 2016 at 8:32 AM, Ian Kelly wrote: > > > > On Jan 25, 2016 2:04 AM, "Frank Millman" wrote: > >> > >> "Ian Kelly" wrote in message > >>> > >>> This seems to be a common misapprehension about asyncio programming. > >>> While coroutines are the focus of the library, they're based on > >>> futures, and so by working at a slightly lower level you can also > >>> handle them as such. So while this would be the typical way to use > >>> run_in_executor: > >>> > >>> async def my_coroutine(stuff): > >>> value = await get_event_loop().run_in_executor(None, > >>> blocking_function, stuff) > >>> result = await do_something_else_with(value) > >>> return result > >>> > >>> This is also a perfectly valid way to use it: > >>> > >>> def normal_function(stuff): > >>> loop = get_event_loop() > >>> coro = loop.run_in_executor(None, blocking_function, stuff) > >>> task = loop.create_task(coro) > >>> task.add_done_callback(do_something_else) > >>> return task > >> > >> > >> I am struggling to get my head around this. > >> > >> 1. In the second function, AFAICT coro is already a future. Why is it > >> necessary to turn it into a task? In fact when I tried that in my testing, > >> I > >> got an assertion error - > >> > >> File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task > >>task = tasks.Task(coro, loop=self) > >> File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__ > >>assert coroutines.iscoroutine(coro), repr(coro) > >> AssertionError: > > > > I didn't test this; it was based on the documentation, which says that > > run_in_executor is a coroutine. Looking at the source, it's actually a > > function that returns a future, so this may be a documentation bug. > > And now I'm reminded of this note in the asyncio docs: > > """ > Note: In this documentation, some methods are documented as > coroutines, even if they are plain Python functions returning a > Future. 
This is intentional to have a freedom of tweaking the > implementation of these functions in the future. If such a function is > needed to be used in a callback-style code, wrap its result with > ensure_future(). > """ > > IMO such methods should simply be documented as awaitables, not > coroutines. I wonder if that's already settled, or if it's worth > starting a discussion around. Bah -- What a bloody mess! And thanks for pointing this out, Ian. Keep wondering whether my brain is atrophying, or its rocket science or... -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Mon, Jan 25, 2016 at 8:32 AM, Ian Kelly wrote: > > On Jan 25, 2016 2:04 AM, "Frank Millman" wrote: >> >> "Ian Kelly" wrote in message >> news:calwzidngogpx+cpmvba8vpefuq4-bwmvs0gz3shb0owzi0b...@mail.gmail.com... >>> >>> This seems to be a common misapprehension about asyncio programming. >>> While coroutines are the focus of the library, they're based on >>> futures, and so by working at a slightly lower level you can also >>> handle them as such. So while this would be the typical way to use >>> run_in_executor: >>> >>> async def my_coroutine(stuff): >>> value = await get_event_loop().run_in_executor(None, >>> blocking_function, stuff) >>> result = await do_something_else_with(value) >>> return result >>> >>> This is also a perfectly valid way to use it: >>> >>> def normal_function(stuff): >>> loop = get_event_loop() >>> coro = loop.run_in_executor(None, blocking_function, stuff) >>> task = loop.create_task(coro) >>> task.add_done_callback(do_something_else) >>> return task >> >> >> I am struggling to get my head around this. >> >> 1. In the second function, AFAICT coro is already a future. Why is it >> necessary to turn it into a task? In fact when I tried that in my testing, I >> got an assertion error - >> >> File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task >>task = tasks.Task(coro, loop=self) >> File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__ >>assert coroutines.iscoroutine(coro), repr(coro) >> AssertionError: > > I didn't test this; it was based on the documentation, which says that > run_in_executor is a coroutine. Looking at the source, it's actually a > function that returns a future, so this may be a documentation bug. And now I'm reminded of this note in the asyncio docs: """ Note: In this documentation, some methods are documented as coroutines, even if they are plain Python functions returning a Future. This is intentional to have a freedom of tweaking the implementation of these functions in the future. 
If such a function is needed to be used in a callback-style code, wrap its result with ensure_future(). """ IMO such methods should simply be documented as awaitables, not coroutines. I wonder if that's already settled, or if it's worth starting a discussion around. -- https://mail.python.org/mailman/listinfo/python-list
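The practical effect of the `ensure_future()` wrapping mentioned in the docs note can be sketched in a few lines: it turns a bare coroutine into a Task, but passes an existing Future through unchanged. This is a minimal illustration, not tied to any particular library call.

```python
import asyncio

async def coro():
    return 1

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    fut.set_result(2)
    # ensure_future accepts any awaitable: a coroutine is wrapped in a
    # Task, while a Future is returned unchanged
    t = asyncio.ensure_future(coro())
    f = asyncio.ensure_future(fut)
    assert f is fut          # same object came back
    return await t + await f

print(asyncio.run(main()))  # 3
```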
Re: Question about asyncio and blocking operations
On Jan 25, 2016 2:04 AM, "Frank Millman" wrote: > > "Ian Kelly" wrote in message news:calwzidngogpx+cpmvba8vpefuq4-bwmvs0gz3shb0owzi0b...@mail.gmail.com... >> >> This seems to be a common misapprehension about asyncio programming. >> While coroutines are the focus of the library, they're based on >> futures, and so by working at a slightly lower level you can also >> handle them as such. So while this would be the typical way to use >> run_in_executor: >> >> async def my_coroutine(stuff): >> value = await get_event_loop().run_in_executor(None, >> blocking_function, stuff) >> result = await do_something_else_with(value) >> return result >> >> This is also a perfectly valid way to use it: >> >> def normal_function(stuff): >> loop = get_event_loop() >> coro = loop.run_in_executor(None, blocking_function, stuff) >> task = loop.create_task(coro) >> task.add_done_callback(do_something_else) >> return task > > > I am struggling to get my head around this. > > 1. In the second function, AFAICT coro is already a future. Why is it necessary to turn it into a task? In fact when I tried that in my testing, I got an assertion error - > > File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task >task = tasks.Task(coro, loop=self) > File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__ >assert coroutines.iscoroutine(coro), repr(coro) > AssertionError: I didn't test this; it was based on the documentation, which says that run_in_executor is a coroutine. Looking at the source, it's actually a function that returns a future, so this may be a documentation bug. There's no need to get a task specifically. We just need a future so that callbacks can be added, so if the result of run_in_executor is already a future then the create_task call is unnecessary. To be safe, you could replace that call with asyncio.ensure_future, which accepts any awaitable and returns a future. > 2. 
In the first function, calling 'run_in_executor' unblocks the main loop so that it can continue with other tasks, but the function itself is suspended until the blocking function returns. In the second function, I cannot see how the function gets suspended. It looks as if the blocking function will run in the background, and the main function will continue. Correct. It's not a coroutine, so it has no facility for being suspended and resumed; it can only block or return. That's why the callback is necessary to schedule additional code to run after blocking_function finishes. normal_function itself can continue to make other non-blocking calls such as scheduling additional tasks, but it shouldn't do anything that depends on the result of blocking_function since it can't be assumed to be available yet. > I would like to experiment with this further, but I would need to see the broader context - IOW see the 'caller' of normal_function(), and see what it does with the return value. The caller of normal_function can do anything it wants with the return value, including adding additional callbacks or just discarding it. The caller could be a coroutine or another normal non-blocking function. If it's a coroutine, then it can await the future, but it doesn't need to unless it wants to do something with the result. Depending on what the future represents, it might also be considered internal to normal_function, in which case it shouldn't be returned at all. -- https://mail.python.org/mailman/listinfo/python-list
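A runnable sketch of the callback-style pattern Ian describes may help; `blocking_function` and `do_something_else` are hypothetical placeholders, and `ensure_future` is used instead of `create_task` per the correction above.

```python
import asyncio

def blocking_function(x):
    # stand-in for slow, blocking work done in the executor thread
    return x + 1

results = []

def do_something_else(fut):
    # done-callback: runs in the event loop once the executor finishes
    results.append(fut.result())

def normal_function(loop, x):
    # not a coroutine: schedules the work and returns immediately
    fut = asyncio.ensure_future(loop.run_in_executor(None, blocking_function, x))
    fut.add_done_callback(do_something_else)
    return fut

async def main():
    loop = asyncio.get_running_loop()
    fut = normal_function(loop, 41)
    await fut                # a coroutine may still await the future
    await asyncio.sleep(0)   # let any remaining done-callbacks run
    return list(results)

print(asyncio.run(main()))
```

The caller never suspends inside `normal_function`; everything that depends on the result lives in the callback.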
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:calwzidngogpx+cpmvba8vpefuq4-bwmvs0gz3shb0owzi0b...@mail.gmail.com... On Sat, Jan 23, 2016 at 7:38 AM, Frank Millman wrote: > Here is the difficulty. The recommended way to handle a blocking > operation > is to run it as task in a different thread, using run_in_executor(). > This > method is a coroutine. An implication of this is that any method that > calls > it must also be a coroutine, so I end up with a chain of coroutines > stretching all the way back to the initial event that triggered it. This seems to be a common misapprehension about asyncio programming. While coroutines are the focus of the library, they're based on futures, and so by working at a slightly lower level you can also handle them as such. So while this would be the typical way to use run_in_executor: async def my_coroutine(stuff): value = await get_event_loop().run_in_executor(None, blocking_function, stuff) result = await do_something_else_with(value) return result This is also a perfectly valid way to use it: def normal_function(stuff): loop = get_event_loop() coro = loop.run_in_executor(None, blocking_function, stuff) task = loop.create_task(coro) task.add_done_callback(do_something_else) return task I am struggling to get my head around this. 1. In the second function, AFAICT coro is already a future. Why is it necessary to turn it into a task? In fact when I tried that in my testing, I got an assertion error - File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task task = tasks.Task(coro, loop=self) File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__ assert coroutines.iscoroutine(coro), repr(coro) AssertionError: 2. In the first function, calling 'run_in_executor' unblocks the main loop so that it can continue with other tasks, but the function itself is suspended until the blocking function returns. In the second function, I cannot see how the function gets suspended. 
It looks as if the blocking function will run in the background, and the main function will continue. I would like to experiment with this further, but I would need to see the broader context - IOW see the 'caller' of normal_function(), and see what it does with the return value. I feel I am getting closer to an 'aha' moment, but I am not there yet, so all info is appreciated. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
"Frank Millman" wrote in message news:n8038j$575$1...@ger.gmane.org... So I thought I would ask here if anyone has been through a similar exercise, and if what I am going through sounds normal, or if I am doing something fundamentally wrong. Thanks for any input Just a quick note of thanks to ChrisA and Ian. Very interesting responses and plenty to think about. I will have to sleep on it and come back with renewed vigour in the morning. I may well be back with more questions :-) Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Sat, Jan 23, 2016 at 8:44 AM, Ian Kelly wrote: > This is where it would make sense to me to use callbacks instead of > subroutines. You can structure your __init__ method like this: Doh. s/subroutines/coroutines -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Sat, Jan 23, 2016 at 7:38 AM, Frank Millman wrote: > Here is the difficulty. The recommended way to handle a blocking operation > is to run it as a task in a different thread, using run_in_executor(). This > method is a coroutine. An implication of this is that any method that calls > it must also be a coroutine, so I end up with a chain of coroutines > stretching all the way back to the initial event that triggered it. This seems to be a common misapprehension about asyncio programming. While coroutines are the focus of the library, they're based on futures, and so by working at a slightly lower level you can also handle them as such. So while this would be the typical way to use run_in_executor: async def my_coroutine(stuff): value = await get_event_loop().run_in_executor(None, blocking_function, stuff) result = await do_something_else_with(value) return result This is also a perfectly valid way to use it: def normal_function(stuff): loop = get_event_loop() coro = loop.run_in_executor(None, blocking_function, stuff) task = loop.create_task(coro) task.add_done_callback(do_something_else) return task > I use a cache to store frequently used objects, but I wait for the first > request before I actually retrieve it from the database. This is how it > worked - > > # cache of database objects for each company > class DbObjects(dict): >def __missing__(self, company): >db_object = self[company] = get_db_object_from_database() >return db_object > db_objects = DbObjects() > > Any function could ask for db_cache.db_objects[company]. The first time it > would be read from the database, on subsequent requests it would be returned > from the dictionary. > > Now get_db_object_from_database() is a coroutine, so I have to change it to >db_object = self[company] = await get_db_object_from_database() > > But that is not allowed, because __missing__() is not a coroutine. 
> > I fixed it by replacing the cache with a function - > > # cache of database objects for each company > db_objects = {} > async def get_db_object(company): >if company not in db_objects: >db_object = db_objects[company] = await get_db_object_from_database() >return db_objects[company] > > Now the calling functions have to call 'await > db_cache.get_db_object(company)' > > Ok, once I had made the change it did not feel so bad. This all sounds pretty reasonable to me. > Now I have another problem. I have some classes which retrieve some data > from the database during their __init__() method. I find that it is not > allowed to call a coroutine from __init__(), and it is not allowed to turn > __init__() into a coroutine. > > I imagine that I will have to split __init__() into two parts, put the > database functionality into a separately-callable method, and then go > through my app to find all occurrences of instantiating the object and > follow it with an explicit call to the new method. > > Again, I can handle that without too much difficulty. But at this stage I do > not know what other problems I am going to face, and how easy they will be > to fix. > > So I thought I would ask here if anyone has been through a similar exercise, > and if what I am going through sounds normal, or if I am doing something > fundamentally wrong. This is where it would make sense to me to use callbacks instead of subroutines. You can structure your __init__ method like this: def __init__(self, params): self.params = params self.db_object_future = get_event_loop().create_task( get_db_object(params)) async def method_depending_on_db_object(self): db_object = await self.db_object_future result = do_something_with(db_object) return result The caveat with this is that while __init__ itself doesn't need to be a coroutine, any method that depends on the DB lookup does need to be (or at least needs to return a future). -- https://mail.python.org/mailman/listinfo/python-list
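A self-contained version of this __init__-schedules-a-future pattern might look like the following. `Handler`, `describe` and the fake `get_db_object` coroutine are illustrative names, not from Frank's application.

```python
import asyncio

async def get_db_object(name):
    # stand-in for the real database coroutine
    await asyncio.sleep(0)
    return {'name': name}

class Handler:
    def __init__(self, name):
        # __init__ cannot await, but it can schedule the lookup as a task;
        # the returned future is stored for later methods to await
        self.db_object_future = asyncio.ensure_future(get_db_object(name))

    async def describe(self):
        # any method that needs the DB object awaits the stored future
        db_object = await self.db_object_future
        return db_object['name'].upper()

async def main():
    h = Handler('acme')     # the lookup starts here, in the background
    return await h.describe()

print(asyncio.run(main()))  # ACME
```

Awaiting an already-completed future is cheap, so repeated method calls after the first pay no extra cost.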
Re: Question about asyncio and blocking operations
On Sun, Jan 24, 2016 at 1:38 AM, Frank Millman wrote: > I find I am bumping my head more than I expected, so I thought I would try > to get some feedback here to see if I have some flaw in my approach, or if > it is just in the nature of writing an asynchronous-style application. I don't have a lot of experience with Python's async/await as such, but I've written asynchronous apps using a variety of systems (and also written threaded ones many times), so I'll answer questions on the basis of design principles that were passed down to me through the generations. > I use a cache to store frequently used objects, but I wait for the first > request before I actually retrieve it from the database. This is how it > worked - > > # cache of database objects for each company > class DbObjects(dict): >def __missing__(self, company): >db_object = self[company] = get_db_object_from_database() >return db_object > db_objects = DbObjects() > > Any function could ask for db_cache.db_objects[company]. The first time it > would be read from the database, on subsequent requests it would be returned > from the dictionary. > > Now get_db_object_from_database() is a coroutine, so I have to change it to >db_object = self[company] = await get_db_object_from_database() > > But that is not allowed, because __missing__() is not a coroutine. > > I fixed it by replacing the cache with a function - > > # cache of database objects for each company > db_objects = {} > async def get_db_object(company): >if company not in db_objects: >db_object = db_objects[company] = await get_db_object_from_database() >return db_objects[company] > > Now the calling functions have to call 'await > db_cache.get_db_object(company)' > > Ok, once I had made the change it did not feel so bad. I would prefer the function call anyway. Subscripting a dictionary is fine for something that's fairly cheap, but if it's potentially hugely expensive, I'd rather see it spelled as a function call. 
There's plenty of precedent for caching function calls so only the first one is expensive. > Now I have another problem. I have some classes which retrieve some data > from the database during their __init__() method. I find that it is not > allowed to call a coroutine from __init__(), and it is not allowed to turn > __init__() into a coroutine. > > I imagine that I will have to split __init__() into two parts, put the > database functionality into a separately-callable method, and then go > through my app to find all occurrences of instantiating the object and > follow it with an explicit call to the new method. > > Again, I can handle that without too much difficulty. But at this stage I do > not know what other problems I am going to face, and how easy they will be > to fix. The question here is: Until you get that data from the database, what state would the object be in? There are two basic options: 1) If the object is somewhat usable and meaningful, divide initialization into two parts - one that sets up the object itself (__init__) and one that fetches stuff from the database. If you can, trigger the database fetch in __init__ so it's potentially partly done when you come to wait for it. 2) If the object would be completely useless, use an awaitable factory function instead. Rather than constructing an object, you ask an asynchronous procedure to give you an object. 
It's a subtle change, and by carefully managing the naming, you could make it almost transparent in your code: # Old way: class User: def __init__(self, domain, name): self.id = blocking_database_call("get user", domain, name) # And used thus: me = User("example.com", "rosuav") # New way: class User: def __init__(self, id): self.id = id _User = User async def User(domain, name): id = await async_database_call("get user", domain, name) return _User(id) # And used thus: me = await User("example.com", "rosuav") > So I thought I would ask here if anyone has been through a similar exercise, > and if what I am going through sounds normal, or if I am doing something > fundamentally wrong. I think this looks pretty much right. There are some small things you can do to make it look a bit easier, but it's minor. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
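Chris's awaitable-factory pattern can be made runnable with a faked async database call; everything here (the fake `async_database_call`, the returned id) is a stand-in, kept only to show the name-swapping trick.

```python
import asyncio

async def async_database_call(op, domain, name):
    # stand-in for the real asynchronous DB call
    await asyncio.sleep(0)
    return 1234            # pretend this is the looked-up user id

class User:
    def __init__(self, id):
        self.id = id

# stash the real class, then shadow its name with an awaitable factory
_User = User

async def User(domain, name):
    # looks like a constructor at the call site, but is awaited
    id = await async_database_call('get user', domain, name)
    return _User(id)

async def main():
    me = await User('example.com', 'rosuav')
    return isinstance(me, _User), me.id

print(asyncio.run(main()))  # (True, 1234)
```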
Question about asyncio and blocking operations
Hi all I am developing a typical accounting/business application which involves a front-end allowing clients to access the system, a back-end connecting to a database, and a middle layer that glues it all together. Some time ago I converted the front-end from a multi-threaded approach to an asyncio approach. It was surprisingly easy, and did not require me to delve into asyncio too deeply. There was one aspect that I deliberately ignored at that stage. I did not change the database access to an asyncio approach, so all reading from/writing to the database involved a blocking operation. I am now ready to tackle that. I find I am bumping my head more than I expected, so I thought I would try to get some feedback here to see if I have some flaw in my approach, or if it is just in the nature of writing an asynchronous-style application. Here is the difficulty. The recommended way to handle a blocking operation is to run it as a task in a different thread, using run_in_executor(). This method is a coroutine. An implication of this is that any method that calls it must also be a coroutine, so I end up with a chain of coroutines stretching all the way back to the initial event that triggered it. I can understand why this is necessary, but it does lead to some awkward programming. I use a cache to store frequently used objects, but I wait for the first request before I actually retrieve it from the database. This is how it worked - # cache of database objects for each company class DbObjects(dict): def __missing__(self, company): db_object = self[company] = get_db_object_from_database() return db_object db_objects = DbObjects() Any function could ask for db_cache.db_objects[company]. The first time it would be read from the database, on subsequent requests it would be returned from the dictionary. 
Now get_db_object_from_database() is a coroutine, so I have to change it to db_object = self[company] = await get_db_object_from_database() But that is not allowed, because __missing__() is not a coroutine. I fixed it by replacing the cache with a function - # cache of database objects for each company db_objects = {} async def get_db_object(company): if company not in db_objects: db_object = db_objects[company] = await get_db_object_from_database() return db_objects[company] Now the calling functions have to call 'await db_cache.get_db_object(company)' Ok, once I had made the change it did not feel so bad. Now I have another problem. I have some classes which retrieve some data from the database during their __init__() method. I find that it is not allowed to call a coroutine from __init__(), and it is not allowed to turn __init__() into a coroutine. I imagine that I will have to split __init__() into two parts, put the database functionality into a separately-callable method, and then go through my app to find all occurrences of instantiating the object and follow it with an explicit call to the new method. Again, I can handle that without too much difficulty. But at this stage I do not know what other problems I am going to face, and how easy they will be to fix. So I thought I would ask here if anyone has been through a similar exercise, and if what I am going through sounds normal, or if I am doing something fundamentally wrong. Thanks for any input Frank Millman -- https://mail.python.org/mailman/listinfo/python-list
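Frank's replacement cache can be sketched as a runnable example; the fake `get_db_object_from_database` coroutine is a stand-in for the real DB call, and the `calls` list merely counts how often the "database" is hit.

```python
import asyncio

calls = []

async def get_db_object_from_database(company):
    # stand-in for the real DB coroutine; records each "database" hit
    calls.append(company)
    await asyncio.sleep(0)
    return {'company': company}

# cache of database objects for each company
db_objects = {}

async def get_db_object(company):
    # first request populates the cache; later requests reuse it
    if company not in db_objects:
        db_objects[company] = await get_db_object_from_database(company)
    return db_objects[company]

async def main():
    a = await get_db_object('acme')
    b = await get_db_object('acme')   # served from the cache
    return a is b, len(calls)

print(asyncio.run(main()))  # (True, 1)
```

One caveat: two coroutines that miss the cache concurrently would both hit the database; guarding the lookup with an asyncio.Lock closes that race.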
Re: Question about asyncio doc example
Ian Kelly : > Callbacks can easily schedule coroutines, but they can't wait on them, > because that would require suspending their execution, dropping back > to the event loop, and resuming later -- in other words, the callback > would need to be a coroutine also. I guess the key is, can a callback release a lock or semaphore, notify a condition variable, or put an item into a queue that a coroutine is waiting on? Quite possibly. Didn't try it. In that case, callbacks mix just fine with coroutines. Marko -- https://mail.python.org/mailman/listinfo/python-list
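Marko's question can in fact be answered affirmatively with a small experiment: a plain callback cannot await, but it can use the non-blocking put_nowait() to feed a queue that a coroutine is awaiting. A minimal sketch:

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()

    def callback():
        # a plain callback hands data to a waiting coroutine via the
        # non-blocking put_nowait (no await needed)
        queue.put_nowait('hello')

    loop.call_soon(callback)
    return await queue.get()   # the coroutine suspends until the callback fires

print(asyncio.run(main()))  # hello
```

The same trick works for releasing an asyncio.Lock or Semaphore (`release()` is an ordinary method call), so callbacks and coroutines do mix here.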
Re: Question about asyncio doc example
On Jul 24, 2014 1:26 AM, "Marko Rauhamaa" wrote: > > Terry Reedy : > > > 18.5.3. Tasks and coroutines, seems to be devoid of event wait > > examples. However, there is a 'yield from' network example in 18.5.5 > > Streams using socket functions wrapped with coroutines. These should > > definitely be used instead of sleep. In fact, for cross-platform > > network code meant to run on *nix and Windows, they are better than > > the unix oriented select and poll functions. > > Asyncio has full support for the callback style as well. What I don't > know is how well the two styles mix. Say, you have a module that > produces callbacks and another one that is based on coroutines. The > coroutines can easily emit callbacks but can callbacks call "yield > from"? Callbacks can easily schedule coroutines, but they can't wait on them, because that would require suspending their execution, dropping back to the event loop, and resuming later -- in other words, the callback would need to be a coroutine also. -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio doc example
Terry Reedy : > 18.5.3. Tasks and coroutines, seems to be devoid of event wait > examples. However, there is a 'yield from' network example in 18.5.5 > Streams using socket functions wrapped with coroutines. These should > definitely be used instead of sleep. In fact, for cross-platform > network code meant to run on *nix and Windows, they are better than > the unix oriented select and poll functions. Asyncio has full support for the callback style as well. What I don't know is how well the two styles mix. Say, you have a module that produces callbacks and another one that is based on coroutines. The coroutines can easily emit callbacks but can callbacks call "yield from"? Marko -- https://mail.python.org/mailman/listinfo/python-list
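On Marko's closing question: a callback cannot use "yield from", but it can schedule a coroutine to run on the loop. A minimal sketch (the names are illustrative):

```python
import asyncio

results = []

async def coro():
    results.append('ran')

async def main():
    loop = asyncio.get_running_loop()

    def callback():
        # a callback cannot "yield from" a coroutine, but it can
        # schedule one as a task and optionally attach a done-callback
        asyncio.ensure_future(coro())

    loop.call_soon(callback)
    await asyncio.sleep(0.01)   # give the scheduled task a chance to run
    return list(results)

print(asyncio.run(main()))
```

What a callback cannot do is wait for that task to finish; anything depending on the result must itself be a coroutine or another callback.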
Re: Question about asyncio doc example
On 7/24/2014 1:15 AM, Saimadhav Heblikar wrote:
> On 24 July 2014 05:54, Terry Reedy wrote:
>> On 7/23/2014 6:43 AM, Saimadhav Heblikar wrote:
>>> Hi,
>>>
>>> The example in question is
>>> https://docs.python.org/3/library/asyncio-task.html#example-hello-world-coroutine.
>>> I'd like to learn the purpose of the statement
>>> "yield from asyncio.sleep(2)" in that example.
>>>
>>> In particular, I'd like to know if asyncio.sleep() is used as a
>>> substitute for slow/time consuming operation, i.e. in real code,
>>> whether there will be a real time consuming statement in place of
>>> asyncio.sleep().
>>
>> The context is
>>     while True:
>>         print('Hello')
>>         yield from asyncio.sleep(3)
>>
>> sleep is both itself, to show how to schedule something at intervals
>> in a non-blocking fashion, as well as a placefiller. The blocking
>> equivalent would use 'time' instead of 'yield from asyncio'. The
>> following shows the non-blocking feature a bit better.
>>
>> import asyncio
>>
>> @asyncio.coroutine
>> def hello():
>>     while True:
>>         print('Hello')
>>         yield from asyncio.sleep(3)
>>
>> @asyncio.coroutine
>> def goodbye():
>>     while True:
>>         print('Goodbye')
>>         yield from asyncio.sleep(5.01)
>>
>> @asyncio.coroutine
>> def world():
>>     while True:
>>         print('World')
>>         yield from asyncio.sleep(2.02)
>>
>> loop = asyncio.get_event_loop()
>> loop.run_until_complete(asyncio.wait([hello(), goodbye(), world()]))
>>
>> Getting the same time behavior in a while...sleep loop requires
>> reproducing some of the calculation and queue manipulation included
>> in the event loop.
>
> That clears it up for me. For situations where I don't really know how
> long a function is going to take (say waiting for user input or a
> network operation), I am better off using callbacks than "yield from
> asyncio.sleep()". Is my understanding correct?

The question is not formulated very well. In asyncio parlance, 'using
callbacks' contrasts with 'using co-routines'. It is a coding-style
contrast. Tkinter only has the callback style.

On the other hand, waiting (via sleep, without blocking other tasks) for
a definite time interval contrasts with waiting (without blocking other
tasks) until an event happens. This is an operational contrast. Tkinter
has both possibilities, using call_after versus event-handler
registration. I believe asyncio can do either type of waiting with
either coding style.

18.5.3. Tasks and coroutines, seems to be devoid of event wait examples.
However, there is a 'yield from' network example in 18.5.5 Streams using
socket functions wrapped with coroutines. These should definitely be
used instead of sleep. In fact, for cross-platform network code meant to
run on *nix and Windows, they are better than the unix oriented select
and poll functions.

I believe asyncio does not do key events, even though that is a form of
unpredictable input.

--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
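Terry's operational contrast (a timed wait versus waiting until an event happens) can be shown with asyncio.Event: the coroutine awaits the event while a plain call_later callback (roughly the call_after analogue) sets it. A sketch in modern async/await syntax; the function names and the 0.01s delay are illustrative:

```python
import asyncio

async def waiter(event):
    # Suspends until the event is set -- an event wait, not a timed
    # wait -- while other tasks keep running in the meantime.
    await event.wait()
    return "event happened"

async def main():
    event = asyncio.Event()
    # Callback style on the setting side: fire the event a bit later.
    asyncio.get_running_loop().call_later(0.01, event.set)
    return await waiter(event)

result = asyncio.run(main())
```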
Re: Question about asyncio doc example
Saimadhav Heblikar :

> For situations where I don't really know how long a function is going
> to take (say waiting for user input or a network operation), I am
> better off using callbacks than "yield from asyncio.sleep()". Is my
> understanding correct?

If you choose the coroutine style of programming, you wouldn't normally
use callbacks. Instead, you would "yield from" any blocking event. There
are coroutine equivalents for locking, network I/O, multiplexing etc.

The callback style encodes the state in a variable. The coroutine style
(which closely resembles multithreading) encodes the state in the code
itself. Both styles can easily become really messy (because reality is
surprisingly messy).

Marko
--
https://mail.python.org/mailman/listinfo/python-list
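Marko's contrast can be made concrete with the same three-step counter written both ways, in modern async/await syntax (the class and function names are just for illustration): the callback version must keep its progress in an explicit variable, while the coroutine version's progress is implicit in where the code is suspended.

```python
import asyncio

# Callback style: progress lives in an explicit state variable
# (self.count), inspected on every invocation of the callback.
class CallbackCounter:
    def __init__(self, loop, done):
        self.count = 0
        self.loop = loop
        self.done = done

    def tick(self):
        self.count += 1
        if self.count == 3:
            self.done.set_result(self.count)
        else:
            self.loop.call_soon(self.tick)

# Coroutine style: the same progress is encoded by the position at
# which the code is suspended; only a local variable is needed.
async def coroutine_counter():
    count = 0
    for _ in range(3):
        await asyncio.sleep(0)  # hand control back to the event loop
        count += 1
    return count

async def main():
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    CallbackCounter(loop, done).tick()
    return await done, await coroutine_counter()

results = asyncio.run(main())
```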
Re: Question about asyncio doc example
On 24 July 2014 05:54, Terry Reedy wrote:
> On 7/23/2014 6:43 AM, Saimadhav Heblikar wrote:
>>
>> Hi,
>>
>> The example in question is
>> https://docs.python.org/3/library/asyncio-task.html#example-hello-world-coroutine.
>> I'd like to learn the purpose of the statement
>> "yield from asyncio.sleep(2)" in that example.
>>
>> In particular, I'd like to know if asyncio.sleep() is used as a
>> substitute for slow/time consuming operation, i.e. in real code,
>> whether there will be a real time consuming statement in place of
>> asyncio.sleep().
>
> The context is
>     while True:
>         print('Hello')
>         yield from asyncio.sleep(3)
>
> sleep is both itself, to show how to schedule something at intervals
> in a non-blocking fashion, as well as a placefiller. The blocking
> equivalent would use 'time' instead of 'yield from asyncio'. The
> following shows the non-blocking feature a bit better.
>
> import asyncio
>
> @asyncio.coroutine
> def hello():
>     while True:
>         print('Hello')
>         yield from asyncio.sleep(3)
>
> @asyncio.coroutine
> def goodbye():
>     while True:
>         print('Goodbye')
>         yield from asyncio.sleep(5.01)
>
> @asyncio.coroutine
> def world():
>     while True:
>         print('World')
>         yield from asyncio.sleep(2.02)
>
> loop = asyncio.get_event_loop()
> loop.run_until_complete(asyncio.wait([hello(), goodbye(), world()]))
>
> Getting the same time behavior in a while...sleep loop requires
> reproducing some of the calculation and queue manipulation included
> in the event loop.
>
> --
> Terry Jan Reedy

That clears it up for me. For situations where I don't really know how
long a function is going to take (say waiting for user input or a
network operation), I am better off using callbacks than "yield from
asyncio.sleep()". Is my understanding correct?

--
Regards
Saimadhav Heblikar
--
https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio doc example
On 7/23/2014 6:43 AM, Saimadhav Heblikar wrote:
> Hi,
>
> The example in question is
> https://docs.python.org/3/library/asyncio-task.html#example-hello-world-coroutine.
> I'd like to learn the purpose of the statement
> "yield from asyncio.sleep(2)" in that example.
>
> In particular, I'd like to know if asyncio.sleep() is used as a
> substitute for slow/time consuming operation, i.e. in real code,
> whether there will be a real time consuming statement in place of
> asyncio.sleep().

The context is

    while True:
        print('Hello')
        yield from asyncio.sleep(3)

sleep is both itself, to show how to schedule something at intervals in
a non-blocking fashion, as well as a placefiller. The blocking
equivalent would use 'time' instead of 'yield from asyncio'. The
following shows the non-blocking feature a bit better.

    import asyncio

    @asyncio.coroutine
    def hello():
        while True:
            print('Hello')
            yield from asyncio.sleep(3)

    @asyncio.coroutine
    def goodbye():
        while True:
            print('Goodbye')
            yield from asyncio.sleep(5.01)

    @asyncio.coroutine
    def world():
        while True:
            print('World')
            yield from asyncio.sleep(2.02)

    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait([hello(), goodbye(), world()]))

Getting the same time behavior in a while...sleep loop requires
reproducing some of the calculation and queue manipulation included in
the event loop.

--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio doc example
asyncio.sleep() returns you a Future. When you yield from a future, your
coroutine blocks until the Future completes. In the meantime, the event
loop continues to execute other things that are waiting to be executed.
The Future returned from asyncio.sleep gets completed after the
specified number of seconds.

2014-07-23 13:43 GMT+03:00 Saimadhav Heblikar :
> Hi,
>
> The example in question is
> https://docs.python.org/3/library/asyncio-task.html#example-hello-world-coroutine.
> I'd like to learn the purpose of the statement
> "yield from asyncio.sleep(2)" in that example.
>
> In particular, I'd like to know if asyncio.sleep() is used as a
> substitute for slow/time consuming operation, i.e. in real code,
> whether there will be a real time consuming statement in place of
> asyncio.sleep().
>
> --
> Regards
> Saimadhav Heblikar

--
http://ysar.net/
--
https://mail.python.org/mailman/listinfo/python-list
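That the sleep Future blocks only the one coroutine, not the loop, can be observed directly: two sleeps awaited together complete in roughly the time of the longer one, not the sum. A small check in modern async/await syntax (the durations are arbitrary):

```python
import asyncio
import time

async def main():
    start = time.monotonic()
    # Both sleep Futures are pending at once, so the event loop waits
    # about 0.05s in total rather than 0.05s + 0.03s.
    await asyncio.gather(asyncio.sleep(0.05), asyncio.sleep(0.03))
    return time.monotonic() - start

elapsed = asyncio.run(main())
```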
Question about asyncio doc example
Hi,

The example in question is
https://docs.python.org/3/library/asyncio-task.html#example-hello-world-coroutine.
I'd like to learn the purpose of the statement
"yield from asyncio.sleep(2)" in that example.

In particular, I'd like to know if asyncio.sleep() is used as a
substitute for slow/time consuming operation, i.e. in real code, whether
there will be a real time consuming statement in place of
asyncio.sleep().

--
Regards
Saimadhav Heblikar
--
https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio
"Ian Kelly" wrote in message
news:CALwzidmzG_WA5shw+PS4Y976M4DVTOwE=zb+kurvcpj3n+5...@mail.gmail.com...

> On Fri, Jun 13, 2014 at 5:42 AM, Frank Millman wrote:
>> Now I want to use the functionality of asyncio by using a 'yield
>> from' to suspend the currently executing function at a particular
>> point while it waits for some information.
[...]
> If the caller needs to wait on the result, then I don't think you have
> another option but to make it a coroutine also. However if it doesn't
> need to wait on the result, then you can just schedule it and move on,
> and the caller doesn't need to be a coroutine itself. Just be aware
> that this could result in different behavior from the threaded
> approach, since whatever the function does after the scheduling will
> happen before the coroutine is started rather than after.

Thanks for the info, Ian. It confirms that I am on the right track by
converting all functions involved in responding to an HTTP request into
a chain of coroutines. I don't know if there is any overhead in that,
but so far the response times feel quite crisp.

It took me a while to actually implement what I wanted to do, but I now
realise that I had not got my head around the 'async' way of thinking.
It is coming clearer, and I now have a toy example working.

For the record, this is what I am trying to accomplish.

I have a server, written in Python, listening for and responding to HTTP
requests. I have a browser-based client, written in Javascript. Once
past the opening connection, all subsequent communication is carried out
by XMLHttpRequests (Ajax).

Action by a user can trigger one or more messages to be sent to the
server. They are joined together in a list and sent. The server unpacks
the list and processes the messages in sequence. Each step in the
process can generate one or more responses to be sent back to the
client. Again they are built up in a list, and when the final step is
completed, the entire list is sent back.

This all works well, but I have now introduced a complication. At any
point in the server-side process, I want to be able to send a message to
the client to open a dialog box, wait for the response, and use the
response to determine how the process must continue.

This was the main reason why I wanted to move to an 'async' approach. I
create a Task to handle asking the question, and then use
asyncio.wait_for() to wait for the response. So far it seems to be
working.

Thanks for the assistance.

Frank
--
https://mail.python.org/mailman/listinfo/python-list
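Frank's dialog-box pattern (a Task that asks the question, with asyncio.wait_for() bounding the wait for the answer) might look roughly like the sketch below. Everything here is illustrative, including the names and the simulated client, and it uses modern async/await syntax rather than the thread's yield-from style:

```python
import asyncio

async def ask_client(reply_queue):
    # Stand-in for sending the dialog request to the browser and
    # receiving the user's answer; here the answer just arrives later.
    await asyncio.sleep(0.01)
    await reply_queue.put("OK")

async def process_step():
    reply_queue = asyncio.Queue()
    # Create a Task to handle asking the question...
    asyncio.ensure_future(ask_client(reply_queue))
    # ...then suspend this step of the server-side process until the
    # answer arrives, giving up after a timeout instead of hanging.
    answer = await asyncio.wait_for(reply_queue.get(), timeout=1.0)
    return answer

answer = asyncio.run(process_step())
```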
Re: Question about asyncio
On Fri, Jun 13, 2014 at 5:42 AM, Frank Millman wrote:
> Now I want to use the functionality of asyncio by using a 'yield from'
> to suspend the currently executing function at a particular point
> while it waits for some information. I find that adding 'yield from'
> turns the function into a generator, which means that the caller has
> to iterate over it.

Hold up; you shouldn't be iterating over the coroutines yourself. Your
choices for invoking an asyncio coroutine are either:

1) schedule it (the simplest way to do this is by calling
asyncio.async); or

2) using 'yield from' in another coroutine that is already being run as
a task.

> I can avoid that by telling the caller to 'yield from' the generator,
> but then *its* caller has to be modified. Now I find I am going
> through my entire application and changing every function into a
> coroutine by decorating it with @asyncio.coroutine, and changing a
> simple function call to a 'yield from'.

If the caller needs to wait on the result, then I don't think you have
another option but to make it a coroutine also. However if it doesn't
need to wait on the result, then you can just schedule it and move on,
and the caller doesn't need to be a coroutine itself. Just be aware that
this could result in different behavior from the threaded approach,
since whatever the function does after the scheduling will happen before
the coroutine is started rather than after.

--
https://mail.python.org/mailman/listinfo/python-list
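Ian's two options look like this side by side. The sketch uses today's names (asyncio.async was later renamed asyncio.ensure_future, and await replaced 'yield from'); the function names are illustrative:

```python
import asyncio

async def do_work():
    await asyncio.sleep(0.01)
    return "done"

async def caller():
    # Option 2: await ('yield from') -- the caller waits on the result,
    # so the caller must itself be a coroutine.
    awaited = await do_work()

    # Option 1: schedule it and move on. The next line runs before the
    # new task has made any progress, which is Ian's behavioural caveat
    # about differing from the threaded approach.
    task = asyncio.ensure_future(do_work())
    scheduled = await task  # only needed here to collect the result
    return awaited, scheduled

results = asyncio.run(caller())
```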
Question about asyncio
Hi all

I am trying to get to grips with asyncio, but I don't know if I am doing
it right.

I have an app that listens for http connections and sends responses. I
had it working with cherrypy, which uses threading. Now I am trying to
convert it to asyncio, using a package called aiohttp. I got a basic
version working quite quickly, just replacing cherrypy's request handler
with aiohttp's, with a bit of tweaking where required.

Now I want to use the functionality of asyncio by using a 'yield from'
to suspend the currently executing function at a particular point while
it waits for some information. I find that adding 'yield from' turns the
function into a generator, which means that the caller has to iterate
over it. I can avoid that by telling the caller to 'yield from' the
generator, but then *its* caller has to be modified. Now I find I am
going through my entire application and changing every function into a
coroutine by decorating it with @asyncio.coroutine, and changing a
simple function call to a 'yield from'.

So far it is working, but there are dozens of functions to modify, so
before digging too deep a hole for myself I would like to know if this
feels like the right approach.

Thanks

Frank Millman
--
https://mail.python.org/mailman/listinfo/python-list