Re: Question about asyncio and blocking operations
"Frank Millman" wrote in message news:n8et0d$hem$1...@ger.gmane.org... I have read the other messages, and I can see that there are some clever ideas there. However, having found something that seems to work and that I feel comfortable with, I plan to run with this for the time being. A quick update. Now that I am starting to understand this a bit better, I found it very easy to turn my concept into an Asynchronous Iterator. class AsyncCursor: def __init__(self, loop, sql): self.return_queue = asyncio.Queue() request_queue.put((self.return_queue, loop, sql)) async def __aiter__(self): return self async def __anext__(self): row = await self.return_queue.get() if row is not None: return row else: self.return_queue.task_done() raise StopAsyncIteration The caller can use it like this - sql = 'SELECT ...' cur = AsyncCursor(loop, sql) async for row in cur: print('got', row) Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On 28 Jan 2016 at 22:52, "Ian Kelly" wrote:
>
> On Thu, Jan 28, 2016 at 2:23 PM, Maxime S wrote:
> >
> > 2016-01-28 17:53 GMT+01:00 Ian Kelly :
> >>
> >> On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
> >>
> >> > The caller requests some data from the database like this.
> >> >
> >> >     return_queue = asyncio.Queue()
> >> >     sql = 'SELECT ...'
> >> >     request_queue.put((return_queue, sql))
> >>
> >> Note that since this is a queue.Queue, the put call has the potential
> >> to block your entire event loop.
> >
> > Actually, I don't think you need an asyncio.Queue.
> >
> > You could use a simple deque as a buffer, and call fetchmany() when it
> > is empty, like that (untested):
>
> True. The asyncio Queue is really just a wrapper around a deque with
> an interface designed for use with the producer-consumer pattern. If
> the producer isn't a coroutine then it may not be appropriate.
>
> This seems like a nice suggestion. Caution is advised if multiple
> cursor methods are executed concurrently, since they would be in
> different threads and the underlying cursor may not be thread-safe.

Indeed, the run_in_executor call should probably be protected by an
asyncio.Lock. But it is a pretty strange idea to call two fetch*() methods
concurrently anyway.
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:calwzidn6nft_o0cfhw1itwja81+mw3schuecadvcen3ix6z...@mail.gmail.com... As I commented in my previous message, asyncio.Queue is not thread-safe, so it's very important that the put calls here be done on the event loop thread using event_loop.call_soon_threadsafe. This could be the cause of the strange behavior you're seeing in getting the results. Using call_soon_threadsafe makes all the difference. The rows are now retrieved instantly. I have read the other messages, and I can see that there are some clever ideas there. However, having found something that seems to work and that I feel comfortable with, I plan to run with this for the time being. Thanks to all for the very stimulating discussion. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Jan 28, 2016 3:07 PM, "Maxime Steisel" wrote:
>
> But it is a pretty strange idea to call two fetch*() methods concurrently
> anyway.

If you want to process rows concurrently and aren't concerned with
processing them in order, it may be attractive to create multiple threads /
coroutines, pass the cursor to each, and let them each call fetchmany
independently.

I agree this is a bad idea unless you use a lock to isolate the calls, or
are certain that you'll never use a dbapi implementation with
threadsafety < 3. I pointed it out because the wrapper makes it less
obvious that multiple threads are involved; one could naively assume that
the separate calls are isolated by the event loop.
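[Editor's sketch] To make the lock idea concrete, here is a hedged illustration (not code from the thread) of serializing executor calls with an asyncio.Lock. FakeCursor is a stand-in for a real, possibly non-threadsafe DB-API cursor, and exists only to make the example runnable:

```python
import asyncio

class FakeCursor:
    """Stand-in for a DB-API cursor that may not be thread-safe."""
    def __init__(self, rows):
        self._rows = list(rows)

    def fetchmany(self, size):
        batch, self._rows = self._rows[:size], self._rows[size:]
        return batch

class LockedCursor:
    """Serialize run_in_executor calls so two coroutines never drive
    the underlying cursor from different threads at the same time."""
    def __init__(self, cur):
        self._cur = cur
        self._lock = asyncio.Lock()

    async def fetchmany(self, size=2):
        async with self._lock:  # one executor call at a time
            loop = asyncio.get_running_loop()
            return await loop.run_in_executor(None, self._cur.fetchmany, size)

async def main():
    cur = LockedCursor(FakeCursor(range(6)))
    # Two concurrent consumers; the lock isolates their fetches.
    a, b = await asyncio.gather(cur.fetchmany(3), cur.fetchmany(3))
    return sorted(a + b)

print(asyncio.run(main()))
```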
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 2:23 PM, Maxime S wrote:
>
> 2016-01-28 17:53 GMT+01:00 Ian Kelly :
>>
>> On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
>>
>> > The caller requests some data from the database like this.
>> >
>> >     return_queue = asyncio.Queue()
>> >     sql = 'SELECT ...'
>> >     request_queue.put((return_queue, sql))
>>
>> Note that since this is a queue.Queue, the put call has the potential
>> to block your entire event loop.
>
> Actually, I don't think you need an asyncio.Queue.
>
> You could use a simple deque as a buffer, and call fetchmany() when it is
> empty, like that (untested):

True. The asyncio Queue is really just a wrapper around a deque with an
interface designed for use with the producer-consumer pattern. If the
producer isn't a coroutine then it may not be appropriate.

This seems like a nice suggestion. Caution is advised if multiple cursor
methods are executed concurrently, since they would be in different threads
and the underlying cursor may not be thread-safe.
Re: Question about asyncio and blocking operations
2016-01-28 17:53 GMT+01:00 Ian Kelly :
>
> On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
>
> > The caller requests some data from the database like this.
> >
> >     return_queue = asyncio.Queue()
> >     sql = 'SELECT ...'
> >     request_queue.put((return_queue, sql))
>
> Note that since this is a queue.Queue, the put call has the potential
> to block your entire event loop.

Actually, I don't think you need an asyncio.Queue.

You could use a simple deque as a buffer, and call fetchmany() when it is
empty, like this (untested):

    class AsyncCursor:
        """Wraps a DB cursor and provides async methods for blocking
        operations"""
        def __init__(self, cur, loop=None):
            if loop is None:
                loop = asyncio.get_event_loop()
            self._loop = loop
            self._cur = cur
            self._queue = deque()

        def __getattr__(self, attr):
            return getattr(self._cur, attr)

        def __setattr__(self, attr, value):
            # internal attributes stay on the wrapper; the rest are
            # delegated to the cursor
            if attr.startswith('_'):
                super().__setattr__(attr, value)
            else:
                setattr(self._cur, attr, value)

        async def execute(self, operation, params):
            return await self._loop.run_in_executor(
                None, self._cur.execute, operation, params)

        async def fetchall(self):
            return await self._loop.run_in_executor(None, self._cur.fetchall)

        async def fetchone(self):
            return await self._loop.run_in_executor(None, self._cur.fetchone)

        async def fetchmany(self, size=None):
            return await self._loop.run_in_executor(
                None, self._cur.fetchmany, size)

        async def __aiter__(self):
            return self

        async def __anext__(self):
            if not self._queue:
                rows = await self.fetchmany()
                if not rows:
                    raise StopAsyncIteration()
                self._queue.extend(rows)
            return self._queue.popleft()
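[Editor's sketch] A condensed, runnable variant of this idea — my own code, not Maxime's — using sqlite3 as a stand-in database. Note that `check_same_thread=False` is needed because the executor fetches rows from a different thread than the one that created the connection:

```python
import asyncio
import sqlite3
from collections import deque

class AsyncCursor:
    """Buffer rows in a deque; refill with a blocking fetchmany()
    run in the default executor."""
    def __init__(self, cur):
        self._cur = cur
        self._buf = deque()

    async def execute(self, operation, params=()):
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, self._cur.execute, operation, params)

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._buf:
            loop = asyncio.get_running_loop()
            rows = await loop.run_in_executor(None, self._cur.fetchmany, 50)
            if not rows:
                raise StopAsyncIteration
            self._buf.extend(rows)
        return self._buf.popleft()

async def main():
    # check_same_thread=False: the executor thread also touches the cursor
    conn = sqlite3.connect(':memory:', check_same_thread=False)
    cur = AsyncCursor(conn.cursor())
    await cur.execute('CREATE TABLE t (x INTEGER)')
    for i in range(5):
        await cur.execute('INSERT INTO t VALUES (?)', (i,))
    await cur.execute('SELECT x FROM t ORDER BY x')
    return [row[0] async for row in cur]

print(asyncio.run(main()))
```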
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:CALwzidnGbz7kM=d7mkua2ta9-csfn9u0ohl0w-x5bbixpcw...@mail.gmail.com... On Jan 28, 2016 4:13 AM, "Frank Millman" wrote: > > I *think* I have this one covered. When the caller makes a request, it creates an instance of an asyncio.Queue, and includes it with the request. The db handler uses this queue to send the result back. > > Do you see any problem with this? That seems reasonable to me. I assume that when you send the result back you would be queuing up individual rows and not just sending a single object across, which could be more easily with just a single future. I have hit a snag. It feels like a bug in 'await q.get()', though I am sure it is just me misunderstanding how it works. I can post some working code if necessary, but here is a short description. Here is the database handler - 'request_queue' is a queue.Queue - while not request_queue.empty(): return_queue, sql = request_queue.get() cur.execute(sql) for row in cur: return_queue.put_nowait(row) return_queue.put_nowait(None) request_queue.task_done() The caller requests some data from the database like this. return_queue = asyncio.Queue() sql = 'SELECT ...' request_queue.put((return_queue, sql)) while True: row = await return_queue.get() if row is None: break print('got', row) return_queue.task_done() The first time 'await return_queue.get()' is called, return_queue is empty, as the db handler has not had time to do anything yet. It is supposed to pause there, wait for something to appear in the queue, and then process it. I have confirmed that the db handler populates the queue virtually instantly. What seems to happen is that it pauses there, but then waits for some other event in the event loop to occur before it continues. Then it processes all rows very quickly. I am running a 'counter' task in the background that prints a sequential number, using await asyncio.sleep(1). 
I noticed a short but variable delay before the rows were printed, and thought it might be waiting for the counter. I increased the sleep to 10, and sure enough it now waits up to 10 seconds before printing any rows. Any ideas? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Jan 28, 2016 4:13 AM, "Frank Millman" wrote:
>
> "Chris Angelico" wrote in message
> news:captjjmr162+k4lzefpxrur6wxrhxbr-_wkrclldyr7kst+k...@mail.gmail.com...
> >
> > On Thu, Jan 28, 2016 at 8:13 PM, Frank Millman wrote:
> > > Run the database handler in a separate thread. Use a queue.Queue to
> > > send requests to the handler. Use an asyncio.Queue to send results
> > > back to the caller, which can call 'await q.get()'.
> > >
> > > I ran a quick test and it seems to work. What do you think?
> >
> > My gut feeling is that any queue can block at either get or put
>
> H'mm, I will have to think about that one, and figure out how to create a
> worst-case scenario. I will report back on that.

The get and put methods of asyncio queues are coroutines, so I don't think
this would be a real issue. The coroutine might block, but it won't block
the event loop. If the queue fills up, then effectively the waiting
coroutines just become a (possibly unordered) extension of the queue.

> > The other risk is that the wrong result will be queried (two async
> > tasks put something onto the queue - which one gets the first
> > result?), which could either be coped with by simple sequencing (maybe
> > this happens automatically, although I'd prefer a
> > mathematically-provable result to "it seems to work"), or by wrapping
> > the whole thing up in a function/class.
>
> I *think* I have this one covered. When the caller makes a request, it
> creates an instance of an asyncio.Queue, and includes it with the request.
> The db handler uses this queue to send the result back.
>
> Do you see any problem with this?

That seems reasonable to me. I assume that when you send the result back
you would be queuing up individual rows and not just sending a single
object across, which could be done more easily with just a single future.

The main risk of adding limited threads to an asyncio program is that
threads make it harder to reason about concurrency. Just make sure the
threads don't share any state and you should be good.

Note that I can only see queues being used to move data in this direction,
not in the opposite. It's unclear to me how queue.get would work from the
blocking thread. Asyncio queues aren't threadsafe, but you couldn't just
use call_soon_threadsafe since the result is important. You might want to
use a queue.Queue instead in that case, but then you run back into the
problem of queue.put being a blocking operation.
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 9:40 AM, Frank Millman wrote:
> I have hit a snag. It feels like a bug in 'await q.get()', though I am
> sure it is just me misunderstanding how it works.
>
> I can post some working code if necessary, but here is a short
> description.
>
> Here is the database handler - 'request_queue' is a queue.Queue -
>
>     while not request_queue.empty():
>         return_queue, sql = request_queue.get()
>         cur.execute(sql)
>         for row in cur:
>             return_queue.put_nowait(row)
>         return_queue.put_nowait(None)
>         request_queue.task_done()

As I commented in my previous message, asyncio.Queue is not thread-safe, so
it's very important that the put calls here be done on the event loop
thread using event_loop.call_soon_threadsafe. This could be the cause of
the strange behavior you're seeing in getting the results.

> The caller requests some data from the database like this.
>
>     return_queue = asyncio.Queue()
>     sql = 'SELECT ...'
>     request_queue.put((return_queue, sql))

Note that since this is a queue.Queue, the put call has the potential to
block your entire event loop.
Re: Question about asyncio and blocking operations
"Chris Angelico" wrote in message news:captjjmr162+k4lzefpxrur6wxrhxbr-_wkrclldyr7kst+k...@mail.gmail.com... On Thu, Jan 28, 2016 at 8:13 PM, Frank Millman wrote: > Run the database handler in a separate thread. Use a queue.Queue to send > requests to the handler. Use an asyncio.Queue to send results back to > the > caller, which can call 'await q.get()'. > > I ran a quick test and it seems to work. What do you think? My gut feeling is that any queue can block at either get or put H'mm, I will have to think about that one, and figure out how to create a worst-case scenario. I will report back on that. The other risk is that the wrong result will be queried (two async tasks put something onto the queue - which one gets the first result?), which could either be coped with by simple sequencing (maybe this happens automatically, although I'd prefer a mathematically-provable result to "it seems to work"), or by wrapping the whole thing up in a function/class. I *think* I have this one covered. When the caller makes a request, it creates an instance of an asyncio.Queue, and includes it with the request. The db handler uses this queue to send the result back. Do you see any problem with this? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 8:13 PM, Frank Millman wrote:
> Run the database handler in a separate thread. Use a queue.Queue to send
> requests to the handler. Use an asyncio.Queue to send results back to the
> caller, which can call 'await q.get()'.
>
> I ran a quick test and it seems to work. What do you think?

My gut feeling is that any queue can block at either get or put. Half of
your operations are "correct", and the other half are "wrong". The caller
can "await q.get()", which is the "correct" way to handle the blocking
operation; the database thread can "jobqueue.get()" as a blocking
operation, which is also fine. But your queue-putting operations are going
to have to assume that they never block. Maybe you can have the database
put its results onto the asyncio.Queue safely, but the requests going onto
the queue.Queue could block waiting for the database. Specifically, this
will happen if database queries come in faster than the database can handle
them - quite literally, your jobs will be "blocked on the database". What
should happen then?

The other risk is that the wrong result will be queried (two async tasks
put something onto the queue - which one gets the first result?), which
could either be coped with by simple sequencing (maybe this happens
automatically, although I'd prefer a mathematically-provable result to "it
seems to work"), or by wrapping the whole thing up in a function/class.

Both of these are risks seen purely by looking at the idea description, not
at any sort of code. It's entirely possible I'm mistaken about them. But
there's only one way to find out :)

By the way, just out of interest... there's no way you can actually switch
out the database communication for something purely socket-based, is there?
PostgreSQL's protocol, for instance, is fairly straightforward, and you
don't *have* to use libpq; Pike's inbuilt pgsql module just opens a socket
and says hello, and it looks like py-postgresql [1] is the same thing for
Python. Taking something like that and making it asynchronous would be as
straightforward as converting any other socket-based code. Could be an
alternative to all this weirdness.

ChrisA

[1] https://pypi.python.org/pypi/py-postgresql
Re: Question about asyncio and blocking operations
On Thu, Jan 28, 2016 at 10:11 PM, Frank Millman wrote:
>> The other risk is that the wrong result will be queried (two async
>> tasks put something onto the queue - which one gets the first
>> result?), which could either be coped with by simple sequencing (maybe
>> this happens automatically, although I'd prefer a
>> mathematically-provable result to "it seems to work"), or by wrapping
>> the whole thing up in a function/class.
>
> I *think* I have this one covered. When the caller makes a request, it
> creates an instance of an asyncio.Queue, and includes it with the request.
> The db handler uses this queue to send the result back.
>
> Do you see any problem with this?

Oh, I get it. In that case, you should be safe (at the cost of efficiency,
presumably, but probably immeasurably so).

The easiest way to thrash-test this is to simulate an ever-increasing
stream of requests that eventually get bottlenecked by the database. For
instance, have the database "process" a maximum of 10 requests a second,
and then start feeding it 11 requests a second. Monitoring the queue length
should tell you what's happening. Eventually, the queue will hit its limit
(and you can make that happen sooner by creating the queue with a small
maxsize), at which point the queue.put() will block. My suspicion is that
that's going to lock your entire backend, unlocking only when the database
takes something off its queue; you'll end up throttling at the database's
rate (which is fine), but with something perpetually blocked waiting for
the database (which is bad - other jobs, like socket read/write, won't be
happening).

As an alternative, you could use put_nowait or put(item, block=False) to
put things on the queue. If there's no room, it'll fail immediately, which
you can spit back to the user as "Database overloaded, please try again
later". Tuning the size of the queue would then be an important
consideration for real-world work; you'd want it long enough to cope with
bursty traffic, but short enough that people's requests don't time out
while they're blocked on the database (it's much tidier to fail them
instantly if that's going to happen).

ChrisA
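[Editor's sketch] Chris's fail-fast suggestion can be sketched like this — the names and the error message are illustrative, not code from the thread. A bounded queue.Queue plus put_nowait turns "silently stall the event loop" into an immediate, reportable error:

```python
import queue

request_queue = queue.Queue(maxsize=2)  # small bound for demonstration

def submit(job):
    """Enqueue a database request without ever blocking the event loop.

    Raises RuntimeError immediately when the handler is saturated, so the
    caller can report 'overloaded' instead of stalling.
    """
    try:
        request_queue.put_nowait(job)
    except queue.Full:
        raise RuntimeError('Database overloaded, please try again later')

submit('SELECT 1')
submit('SELECT 2')
try:
    submit('SELECT 3')  # queue is full: fails fast instead of blocking
except RuntimeError as exc:
    print(exc)
```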
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:CALwzidkr-fT6S6wH2caNaxyQvUdAw=x7xdqkqofnrrwzwnj...@mail.gmail.com... On Wed, Jan 27, 2016 at 10:14 AM, Ian Kelly wrote: > Unfortunately this doesn't actually work at present. > EventLoop.run_in_executor swallows the StopIteration exception and > just returns None, which I assume is a bug. http://bugs.python.org/issue26221 Thanks for that. Fascinating discussion between you and GvR. Reading it gave me an idea. Run the database handler in a separate thread. Use a queue.Queue to send requests to the handler. Use an asyncio.Queue to send results back to the caller, which can call 'await q.get()'. I ran a quick test and it seems to work. What do you think? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:CALwzidk-RBkB-vi6CgcEeoFHQrsoTFvqX9MqzDD=rny5boc...@mail.gmail.com... On Tue, Jan 26, 2016 at 7:15 AM, Frank Millman wrote: > > If I return the cursor, I can iterate over it, but isn't this a blocking > operation? As far as I know, the DB adaptor will only actually retrieve > the > row when requested. > > If I am right, I should call fetchall() while inside get_rows(), and > return > all the rows as a list. > You probably want an asynchronous iterator here. If the cursor doesn't provide that, then you can wrap it in one. In fact, this is basically one of the examples in the PEP: https://www.python.org/dev/peps/pep-0492/#example-1 Thanks, Ian. I had a look, and it does seem to fit the bill, but I could not get it to work, and I am running out of time. Specifically, I tried to get it working with the sqlite3 cursor. I am no expert, but after some googling I tried this - import sqlite3 conn = sqlite3.connect('/sqlite_db') cur = conn.cursor() async def __aiter__(self): return self async def __anext__(self): loop = asyncio.get_event_loop() return await loop.run_in_executor(None, self.__next__) import types cur.__aiter__ = types.MethodType( __aiter__, cur ) cur.__anext__ = types.MethodType( __anext__, cur ) It failed with this exception - AttributeError: 'sqlite3.Cursor' object has no attribute '__aiter__' I think this is what happens if a class uses 'slots' to define its attributes - it will not permit the creation of a new one. Anyway, moving on, I decided to change tack. Up to now I have been trying to isolate the function where I actually communicate with the database, and wrap that in a Future with 'run_in_executor'. In practice, the vast majority of my interactions with the database consist of very small CRUD commands, and will have minimal impact on response times even if they block. So I decided to focus on a couple of functions which are larger, and try to wrap the entire function in a Future with 'run_in_executor'. 
It seems to be working, but it looks a bit odd, so I will show what I am doing and ask for feedback. Assume a slow function - async def slow_function(arg1, arg2): [do stuff] It now looks like this - async def slow_function(arg1, arg2): loop = asyncio.get_event_loop() await loop.run_in_executor(None, slow_function_1, arg1, arg2) def slow_function_1(self, arg1, arg2): loop = asyncio.new_event_loop() asyncio.set_event_loop(loop) loop.run_until_complete(slow_function_2(arg1, arg2)) async slow_function_2(arg1, arg2): [do stuff] Does this look right? Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:calwzidn6tvn9w-2qnn2jyvju8nhzn499nptfjn9ohjddceb...@mail.gmail.com... On Wed, Jan 27, 2016 at 7:40 AM, Frank Millman wrote: > > Assume a slow function - > > async def slow_function(arg1, arg2): >[do stuff] > > It now looks like this - > > async def slow_function(arg1, arg2): >loop = asyncio.get_event_loop() >await loop.run_in_executor(None, slow_function_1, arg1, arg2) > > def slow_function_1(self, arg1, arg2): >loop = asyncio.new_event_loop() >asyncio.set_event_loop(loop) >loop.run_until_complete(slow_function_2(arg1, arg2)) > > async slow_function_2(arg1, arg2): >[do stuff] > > Does this look right? I'm not sure I understand what you're trying to accomplish by running a second event loop inside the executor thread. It will only be useful for scheduling asynchronous operations, and if they're asynchronous then why not schedule them on the original event loop? I could be confusing myself here, but this is what I am trying to do. run_in_executor() schedules a blocking function to run in the executor, and returns a Future. If you just invoke it, the blocking function will execute in the background, and the calling function will carry on. If you obtain a reference to the Future, and then 'await' it, the calling function will be suspended until the blocking function is complete. You might do this because you want the calling function to block, but you do not want to block the entire event loop. In the above example, I do not want the calling function to block. However, the blocking function invokes one or more coroutines, so it needs an event loop to operate. Creating a new event loop allows them to run independently. Hope this makes sense. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Wed, Jan 27, 2016 at 10:14 AM, Ian Kelly wrote:
> Unfortunately this doesn't actually work at present.
> EventLoop.run_in_executor swallows the StopIteration exception and
> just returns None, which I assume is a bug.

http://bugs.python.org/issue26221
Re: Question about asyncio and blocking operations
On Wed, Jan 27, 2016 at 9:15 AM, Ian Kelly wrote:
> class CursorWrapper:
>
>     def __init__(self, cursor):
>         self._cursor = cursor
>
>     async def __aiter__(self):
>         return self
>
>     async def __anext__(self):
>         loop = asyncio.get_event_loop()
>         return await loop.run_in_executor(None, next, self._cursor)

Oh, except you'd want to be sure to catch StopIteration and raise
StopAsyncIteration in its place.

This could also be generalized as an iterator wrapper, similar to the
example in the PEP except using run_in_executor to actually avoid blocking.

    class AsyncIteratorWrapper:

        def __init__(self, iterable, loop=None, executor=None):
            self._iterator = iter(iterable)
            self._loop = loop or asyncio.get_event_loop()
            self._executor = executor

        async def __aiter__(self):
            return self

        async def __anext__(self):
            try:
                return await self._loop.run_in_executor(
                    self._executor, next, self._iterator)
            except StopIteration:
                raise StopAsyncIteration

Unfortunately this doesn't actually work at present.
EventLoop.run_in_executor swallows the StopIteration exception and just
returns None, which I assume is a bug.
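[Editor's sketch] One workaround for the bug Ian describes — my own suggestion, not from the thread — is to keep StopIteration from ever crossing the executor boundary at all, by using the two-argument form of next() with a sentinel value:

```python
import asyncio

_SENTINEL = object()

class AsyncIteratorWrapper:
    """Like the wrapper above, but next(it, sentinel) returns a marker
    instead of raising StopIteration inside the executor thread."""
    def __init__(self, iterable, executor=None):
        self._iterator = iter(iterable)
        self._executor = executor

    def __aiter__(self):
        return self

    async def __anext__(self):
        loop = asyncio.get_running_loop()
        value = await loop.run_in_executor(
            self._executor, next, self._iterator, _SENTINEL)
        if value is _SENTINEL:
            raise StopAsyncIteration
        return value

async def main():
    return [x async for x in AsyncIteratorWrapper(range(4))]

print(asyncio.run(main()))
```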
Re: Question about asyncio and blocking operations
On Wed, Jan 27, 2016 at 7:40 AM, Frank Millman wrote:
> "Ian Kelly" wrote in message
> news:CALwzidk-RBkB-vi6CgcEeoFHQrsoTFvqX9MqzDD=rny5boc...@mail.gmail.com...
>
> > You probably want an asynchronous iterator here. If the cursor doesn't
> > provide that, then you can wrap it in one. In fact, this is basically
> > one of the examples in the PEP:
> > https://www.python.org/dev/peps/pep-0492/#example-1
>
> Thanks, Ian. I had a look, and it does seem to fit the bill, but I could
> not get it to work, and I am running out of time.
>
> Specifically, I tried to get it working with the sqlite3 cursor. I am no
> expert, but after some googling I tried this -
>
>     import sqlite3
>     conn = sqlite3.connect('/sqlite_db')
>     cur = conn.cursor()
>
>     async def __aiter__(self):
>         return self
>
>     async def __anext__(self):
>         loop = asyncio.get_event_loop()
>         return await loop.run_in_executor(None, self.__next__)
>
>     import types
>     cur.__aiter__ = types.MethodType(__aiter__, cur)
>     cur.__anext__ = types.MethodType(__anext__, cur)
>
> It failed with this exception -
>
>     AttributeError: 'sqlite3.Cursor' object has no attribute '__aiter__'
>
> I think this is what happens if a class uses 'slots' to define its
> attributes - it will not permit the creation of a new one.

This is why I suggested wrapping the cursor instead. Something like this:

    class CursorWrapper:

        def __init__(self, cursor):
            self._cursor = cursor

        async def __aiter__(self):
            return self

        async def __anext__(self):
            loop = asyncio.get_event_loop()
            return await loop.run_in_executor(None, next, self._cursor)

> Anyway, moving on, I decided to change tack. Up to now I have been trying
> to isolate the function where I actually communicate with the database,
> and wrap that in a Future with 'run_in_executor'.
>
> In practice, the vast majority of my interactions with the database
> consist of very small CRUD commands, and will have minimal impact on
> response times even if they block. So I decided to focus on a couple of
> functions which are larger, and try to wrap the entire function in a
> Future with 'run_in_executor'.
>
> It seems to be working, but it looks a bit odd, so I will show what I am
> doing and ask for feedback.
>
> Assume a slow function -
>
>     async def slow_function(arg1, arg2):
>         [do stuff]
>
> It now looks like this -
>
>     async def slow_function(arg1, arg2):
>         loop = asyncio.get_event_loop()
>         await loop.run_in_executor(None, slow_function_1, arg1, arg2)
>
>     def slow_function_1(arg1, arg2):
>         loop = asyncio.new_event_loop()
>         asyncio.set_event_loop(loop)
>         loop.run_until_complete(slow_function_2(arg1, arg2))
>
>     async def slow_function_2(arg1, arg2):
>         [do stuff]
>
> Does this look right?

I'm not sure I understand what you're trying to accomplish by running a
second event loop inside the executor thread. It will only be useful for
scheduling asynchronous operations, and if they're asynchronous then why
not schedule them on the original event loop?
Re: Question about asyncio and blocking operations
> "Alberto" == Alberto Berti writes: Alberto> async external_coro(): # this is the calling context, which is a coro Alberto> async with transction.begin(): Alberto> o = MyObject Alberto> # maybe other stuff ops... here it is "o = MyObject()" ;-) -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
> "Frank" == Frank Millman writes: Frank> Now I have another problem. I have some classes which retrieve some Frank> data from the database during their __init__() method. I find that it Frank> is not allowed to call a coroutine from __init__(), and it is not Frank> allowed to turn __init__() into a coroutine. IMHO this is semantically correct for a method tha should really initialize that instance an await in the __init__ means having a suspension point that makes the initialization somewhat... unpredictable :-). To cover the cases when you need to call a coroutine from a non coroutine function like __init__ I have developed a small package that helps maintaining your code almost clean, where you can be sure that after some point in your code flow, the coroutines scheduled by the normal function have been executed. With that you can write code like this: from metapensiero.asyncio import transaction class MyObject(): def __init__(self): tran = transaction.get() tran.add(get_db_object('company'), cback=self._init) # get_db_object is a coroutine def _init(self, fut): self.company = fut.result() async external_coro(): # this is the calling context, which is a coro async with transction.begin(): o = MyObject # maybe other stuff # start using your db object o.company... This way the management of the "inner" coroutine is simpler, and from your code it's clear it suspends to wait and after that all the "stashed" coroutines are guaranteed to be executed. Hope it helps, Alberto -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Tue, Jan 26, 2016 at 7:15 AM, Frank Millman wrote:
> I am making some progress, but I have found a snag - possibly unavoidable,
> but worth a mention.
>
> Usually when I retrieve rows from a database I iterate over the cursor -
>
>     def get_rows(sql, params):
>         cur.execute(sql, params)
>         for row in cur:
>             yield row
>
> If I create a Future to run get_rows(), I have to 'return' the result so
> that the caller can access it by calling future.result().
>
> If I return the cursor, I can iterate over it, but isn't this a blocking
> operation? As far as I know, the DB adaptor will only actually retrieve the
> row when requested.
>
> If I am right, I should call fetchall() while inside get_rows(), and return
> all the rows as a list.
>
> This seems to be swapping one bit of asynchronicity for another.
>
> Does this sound right?

You probably want an asynchronous iterator here. If the cursor doesn't provide that, then you can wrap it in one. In fact, this is basically one of the examples in the PEP:

https://www.python.org/dev/peps/pep-0492/#example-1

--
https://mail.python.org/mailman/listinfo/python-list
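[Editor's note: one possible illustration of that suggestion. `AsyncRows` and the sqlite3 demo data are invented for the example; the wrapper turns a blocking DB-API cursor into an asynchronous iterator by pushing each `fetchone()` call onto the executor.]

```python
import asyncio
import sqlite3

class AsyncRows:
    """Async iterator over a blocking DB-API cursor (illustrative only)."""

    def __init__(self, loop, cur):
        self.loop = loop
        self.cur = cur

    def __aiter__(self):
        return self

    async def __anext__(self):
        # fetchone() blocks, so run it in the default executor
        row = await self.loop.run_in_executor(None, self.cur.fetchone)
        if row is None:
            raise StopAsyncIteration
        return row

async def main(loop):
    # check_same_thread=False because fetchone() runs in an executor thread
    conn = sqlite3.connect(':memory:', check_same_thread=False)
    cur = conn.cursor()
    cur.execute('CREATE TABLE t (x INTEGER)')
    cur.executemany('INSERT INTO t VALUES (?)', [(1,), (2,), (3,)])
    cur.execute('SELECT x FROM t ORDER BY x')
    total = 0
    async for (x,) in AsyncRows(loop, cur):
        total += x
    return total

loop = asyncio.new_event_loop()
total = loop.run_until_complete(main(loop))
loop.close()
print(total)
```

Note that one row is fetched per trip through the executor; for large result sets, fetching in batches with `fetchmany()` (as suggested elsewhere in this thread) would cut the overhead considerably.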
Re: Question about asyncio and blocking operations
"Frank Millman" wrote in message news:n8038j$575$1...@ger.gmane.org...

I am developing a typical accounting/business application which involves a front-end allowing clients to access the system, a back-end connecting to a database, and a middle layer that glues it all together.

[...]

There was one aspect that I deliberately ignored at that stage. I did not change the database access to an asyncio approach, so all reading from/writing to the database involved a blocking operation. I am now ready to tackle that.

I am making some progress, but I have found a snag - possibly unavoidable, but worth a mention.

Usually when I retrieve rows from a database I iterate over the cursor -

    def get_rows(sql, params):
        cur.execute(sql, params)
        for row in cur:
            yield row

If I create a Future to run get_rows(), I have to 'return' the result so that the caller can access it by calling future.result().

If I return the cursor, I can iterate over it, but isn't this a blocking operation? As far as I know, the DB adaptor will only actually retrieve the row when requested.

If I am right, I should call fetchall() while inside get_rows(), and return all the rows as a list.

This seems to be swapping one bit of asynchronicity for another.

Does this sound right?

Frank

--
https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
Marko Rauhamaa writes: > Note that neither the multithreading model (which I dislike) nor the > callback hell (which I like) suffer from this problem. There are some runtimes (GHC and Erlang) where everything is nonblocking under the covers, which lets even the asyncs be swept under the rug. Similarly with some low-tech cooperative multitaskers, say in Forth. When you've got a mixture of blocking and nonblocking, it becomes a mess. -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
Rustom Mody :
> Bah -- What a bloody mess!
> And thanks for pointing this out, Ian.
> Keep wondering whether my brain is atrophying, or its rocket science or...

I'm afraid the asyncio idea will not fly. Adding the keywords "async" and "await" did make things much better, but the programming model seems very cumbersome.

Say you have an async that calls a nonblocking function as follows:

    async def t():
        ...
        f()
        ...

    def f():
        ...
        g()
        ...

    def g():
        ...
        h()
        ...

    def h():
        ...

Then, you need to add a blocking call to h(). You then have a cascading effect of having to sprinkle asyncs and awaits everywhere:

    async def t():
        ...
        await f()
        ...

    async def f():
        ...
        await g()
        ...

    async def g():
        ...
        await h()
        ...

    async def h():
        ...
        await ...
        ...

A nasty case of nonlocality. Makes you wonder if you ought to declare *all* functions *always* as asyncs just in case they turn out that way.

Note that neither the multithreading model (which I dislike) nor the callback hell (which I like) suffer from this problem.

Marko

--
https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Monday, January 25, 2016 at 9:16:13 PM UTC+5:30, Ian wrote: > On Mon, Jan 25, 2016 at 8:32 AM, Ian Kelly wrote: > > > > On Jan 25, 2016 2:04 AM, "Frank Millman" wrote: > >> > >> "Ian Kelly" wrote in message > >>> > >>> This seems to be a common misapprehension about asyncio programming. > >>> While coroutines are the focus of the library, they're based on > >>> futures, and so by working at a slightly lower level you can also > >>> handle them as such. So while this would be the typical way to use > >>> run_in_executor: > >>> > >>> async def my_coroutine(stuff): > >>> value = await get_event_loop().run_in_executor(None, > >>> blocking_function, stuff) > >>> result = await do_something_else_with(value) > >>> return result > >>> > >>> This is also a perfectly valid way to use it: > >>> > >>> def normal_function(stuff): > >>> loop = get_event_loop() > >>> coro = loop.run_in_executor(None, blocking_function, stuff) > >>> task = loop.create_task(coro) > >>> task.add_done_callback(do_something_else) > >>> return task > >> > >> > >> I am struggling to get my head around this. > >> > >> 1. In the second function, AFAICT coro is already a future. Why is it > >> necessary to turn it into a task? In fact when I tried that in my testing, > >> I > >> got an assertion error - > >> > >> File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task > >>task = tasks.Task(coro, loop=self) > >> File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__ > >>assert coroutines.iscoroutine(coro), repr(coro) > >> AssertionError: > > > > I didn't test this; it was based on the documentation, which says that > > run_in_executor is a coroutine. Looking at the source, it's actually a > > function that returns a future, so this may be a documentation bug. > > And now I'm reminded of this note in the asyncio docs: > > """ > Note: In this documentation, some methods are documented as > coroutines, even if they are plain Python functions returning a > Future. 
This is intentional to have a freedom of tweaking the > implementation of these functions in the future. If such a function is > needed to be used in a callback-style code, wrap its result with > ensure_future(). > """ > > IMO such methods should simply be documented as awaitables, not > coroutines. I wonder if that's already settled, or if it's worth > starting a discussion around. Bah -- What a bloody mess! And thanks for pointing this out, Ian. Keep wondering whether my brain is atrophying, or its rocket science or... -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Mon, Jan 25, 2016 at 8:32 AM, Ian Kelly wrote: > > On Jan 25, 2016 2:04 AM, "Frank Millman" wrote: >> >> "Ian Kelly" wrote in message >> news:calwzidngogpx+cpmvba8vpefuq4-bwmvs0gz3shb0owzi0b...@mail.gmail.com... >>> >>> This seems to be a common misapprehension about asyncio programming. >>> While coroutines are the focus of the library, they're based on >>> futures, and so by working at a slightly lower level you can also >>> handle them as such. So while this would be the typical way to use >>> run_in_executor: >>> >>> async def my_coroutine(stuff): >>> value = await get_event_loop().run_in_executor(None, >>> blocking_function, stuff) >>> result = await do_something_else_with(value) >>> return result >>> >>> This is also a perfectly valid way to use it: >>> >>> def normal_function(stuff): >>> loop = get_event_loop() >>> coro = loop.run_in_executor(None, blocking_function, stuff) >>> task = loop.create_task(coro) >>> task.add_done_callback(do_something_else) >>> return task >> >> >> I am struggling to get my head around this. >> >> 1. In the second function, AFAICT coro is already a future. Why is it >> necessary to turn it into a task? In fact when I tried that in my testing, I >> got an assertion error - >> >> File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task >>task = tasks.Task(coro, loop=self) >> File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__ >>assert coroutines.iscoroutine(coro), repr(coro) >> AssertionError: > > I didn't test this; it was based on the documentation, which says that > run_in_executor is a coroutine. Looking at the source, it's actually a > function that returns a future, so this may be a documentation bug. And now I'm reminded of this note in the asyncio docs: """ Note: In this documentation, some methods are documented as coroutines, even if they are plain Python functions returning a Future. This is intentional to have a freedom of tweaking the implementation of these functions in the future. 
If such a function is needed to be used in a callback-style code, wrap its result with ensure_future(). """ IMO such methods should simply be documented as awaitables, not coroutines. I wonder if that's already settled, or if it's worth starting a discussion around. -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Jan 25, 2016 2:04 AM, "Frank Millman" wrote:
>
> "Ian Kelly" wrote in message
> news:calwzidngogpx+cpmvba8vpefuq4-bwmvs0gz3shb0owzi0b...@mail.gmail.com...
>>
>> This seems to be a common misapprehension about asyncio programming.
>> While coroutines are the focus of the library, they're based on
>> futures, and so by working at a slightly lower level you can also
>> handle them as such. So while this would be the typical way to use
>> run_in_executor:
>>
>>     async def my_coroutine(stuff):
>>         value = await get_event_loop().run_in_executor(None,
>>             blocking_function, stuff)
>>         result = await do_something_else_with(value)
>>         return result
>>
>> This is also a perfectly valid way to use it:
>>
>>     def normal_function(stuff):
>>         loop = get_event_loop()
>>         coro = loop.run_in_executor(None, blocking_function, stuff)
>>         task = loop.create_task(coro)
>>         task.add_done_callback(do_something_else)
>>         return task
>
> I am struggling to get my head around this.
>
> 1. In the second function, AFAICT coro is already a future. Why is it
> necessary to turn it into a task? In fact when I tried that in my testing,
> I got an assertion error -
>
>     File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task
>         task = tasks.Task(coro, loop=self)
>     File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__
>         assert coroutines.iscoroutine(coro), repr(coro)
>     AssertionError:

I didn't test this; it was based on the documentation, which says that run_in_executor is a coroutine. Looking at the source, it's actually a function that returns a future, so this may be a documentation bug.

There's no need to get a task specifically. We just need a future so that callbacks can be added, so if the result of run_in_executor is already a future then the create_task call is unnecessary. To be safe, you could replace that call with asyncio.ensure_future, which accepts any awaitable and returns a future.

> 2. In the first function, calling 'run_in_executor' unblocks the main loop
> so that it can continue with other tasks, but the function itself is
> suspended until the blocking function returns. In the second function, I
> cannot see how the function gets suspended. It looks as if the blocking
> function will run in the background, and the main function will continue.

Correct. It's not a coroutine, so it has no facility for being suspended and resumed; it can only block or return. That's why the callback is necessary to schedule additional code to run after blocking_function finishes. normal_function itself can continue to make other non-blocking calls such as scheduling additional tasks, but it shouldn't do anything that depends on the result of blocking_function, since it can't be assumed to be available yet.

> I would like to experiment with this further, but I would need to see the
> broader context - IOW see the 'caller' of normal_function(), and see what
> it does with the return value.

The caller of normal_function can do anything it wants with the return value, including adding additional callbacks or just discarding it. The caller could be a coroutine or another normal non-blocking function. If it's a coroutine, then it can await the future, but it doesn't need to unless it wants to do something with the result. Depending on what the future represents, it might also be considered internal to normal_function, in which case it shouldn't be returned at all.

--
https://mail.python.org/mailman/listinfo/python-list
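[Editor's note: a runnable sketch of the callback style discussed above. `blocking_function` and `do_something_else` are stand-ins, and `asyncio.ensure_future` is used in place of `create_task`, since run_in_executor already returns a future.]

```python
import asyncio
import time

def blocking_function(stuff):
    time.sleep(0.1)        # stand-in for the blocking work
    return stuff * 2

results = []

def do_something_else(future):
    # runs on the event loop once blocking_function has finished
    results.append(future.result())

def normal_function(loop, stuff):
    # run_in_executor returns a future; ensure_future also accepts
    # coroutines, so it is a safe general-purpose wrapper
    fut = asyncio.ensure_future(
        loop.run_in_executor(None, blocking_function, stuff))
    fut.add_done_callback(do_something_else)
    return fut

loop = asyncio.new_event_loop()
fut = normal_function(loop, 21)
loop.run_until_complete(fut)
loop.run_until_complete(asyncio.sleep(0))  # let the done-callback run
loop.close()
print(results)
```

`normal_function` itself never suspends: it schedules the work, attaches the callback, and returns immediately; only the event loop later invokes `do_something_else`.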
Re: Question about asyncio and blocking operations
"Ian Kelly" wrote in message news:calwzidngogpx+cpmvba8vpefuq4-bwmvs0gz3shb0owzi0b...@mail.gmail.com...

On Sat, Jan 23, 2016 at 7:38 AM, Frank Millman wrote:
> Here is the difficulty. The recommended way to handle a blocking operation
> is to run it as a task in a different thread, using run_in_executor(). This
> method is a coroutine. An implication of this is that any method that calls
> it must also be a coroutine, so I end up with a chain of coroutines
> stretching all the way back to the initial event that triggered it.

This seems to be a common misapprehension about asyncio programming. While coroutines are the focus of the library, they're based on futures, and so by working at a slightly lower level you can also handle them as such. So while this would be the typical way to use run_in_executor:

    async def my_coroutine(stuff):
        value = await get_event_loop().run_in_executor(None,
            blocking_function, stuff)
        result = await do_something_else_with(value)
        return result

This is also a perfectly valid way to use it:

    def normal_function(stuff):
        loop = get_event_loop()
        coro = loop.run_in_executor(None, blocking_function, stuff)
        task = loop.create_task(coro)
        task.add_done_callback(do_something_else)
        return task

I am struggling to get my head around this.

1. In the second function, AFAICT coro is already a future. Why is it necessary to turn it into a task? In fact when I tried that in my testing, I got an assertion error -

    File: "C:\Python35\lib\asyncio\base_events.py", line 211, in create_task
        task = tasks.Task(coro, loop=self)
    File: "C:\Python35\lib\asyncio\tasks.py", line 70, in __init__
        assert coroutines.iscoroutine(coro), repr(coro)
    AssertionError:

2. In the first function, calling 'run_in_executor' unblocks the main loop so that it can continue with other tasks, but the function itself is suspended until the blocking function returns. In the second function, I cannot see how the function gets suspended. It looks as if the blocking function will run in the background, and the main function will continue.

I would like to experiment with this further, but I would need to see the broader context - IOW see the 'caller' of normal_function(), and see what it does with the return value.

I feel I am getting closer to an 'aha' moment, but I am not there yet, so all info is appreciated.

Frank

--
https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
"Frank Millman" wrote in message news:n8038j$575$1...@ger.gmane.org... So I thought I would ask here if anyone has been through a similar exercise, and if what I am going through sounds normal, or if I am doing something fundamentally wrong. Thanks for any input Just a quick note of thanks to ChrisA and Ian. Very interesting responses and plenty to think about. I will have to sleep on it and come back with renewed vigour in the morning. I may well be back with more questions :-) Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Sat, Jan 23, 2016 at 8:44 AM, Ian Kelly wrote: > This is where it would make sense to me to use callbacks instead of > subroutines. You can structure your __init__ method like this: Doh. s/subroutines/coroutines -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about asyncio and blocking operations
On Sat, Jan 23, 2016 at 7:38 AM, Frank Millman wrote:
> Here is the difficulty. The recommended way to handle a blocking operation
> is to run it as a task in a different thread, using run_in_executor(). This
> method is a coroutine. An implication of this is that any method that calls
> it must also be a coroutine, so I end up with a chain of coroutines
> stretching all the way back to the initial event that triggered it.

This seems to be a common misapprehension about asyncio programming. While coroutines are the focus of the library, they're based on futures, and so by working at a slightly lower level you can also handle them as such. So while this would be the typical way to use run_in_executor:

    async def my_coroutine(stuff):
        value = await get_event_loop().run_in_executor(None,
            blocking_function, stuff)
        result = await do_something_else_with(value)
        return result

This is also a perfectly valid way to use it:

    def normal_function(stuff):
        loop = get_event_loop()
        coro = loop.run_in_executor(None, blocking_function, stuff)
        task = loop.create_task(coro)
        task.add_done_callback(do_something_else)
        return task

> I use a cache to store frequently used objects, but I wait for the first
> request before I actually retrieve it from the database. This is how it
> worked -
>
>     # cache of database objects for each company
>     class DbObjects(dict):
>         def __missing__(self, company):
>             db_object = self[company] = get_db_object_from_database()
>             return db_object
>
>     db_objects = DbObjects()
>
> Any function could ask for db_cache.db_objects[company]. The first time it
> would be read from the database, on subsequent requests it would be
> returned from the dictionary.
>
> Now get_db_object_from_database() is a coroutine, so I have to change it to
>
>     db_object = self[company] = await get_db_object_from_database()
>
> But that is not allowed, because __missing__() is not a coroutine.
>
> I fixed it by replacing the cache with a function -
>
>     # cache of database objects for each company
>     db_objects = {}
>
>     async def get_db_object(company):
>         if company not in db_objects:
>             db_object = db_objects[company] = await get_db_object_from_database()
>         return db_objects[company]
>
> Now the calling functions have to call 'await
> db_cache.get_db_object(company)'
>
> Ok, once I had made the change it did not feel so bad.

This all sounds pretty reasonable to me.

> Now I have another problem. I have some classes which retrieve some data
> from the database during their __init__() method. I find that it is not
> allowed to call a coroutine from __init__(), and it is not allowed to turn
> __init__() into a coroutine.
>
> I imagine that I will have to split __init__() into two parts, put the
> database functionality into a separately-callable method, and then go
> through my app to find all occurrences of instantiating the object and
> follow it with an explicit call to the new method.
>
> Again, I can handle that without too much difficulty. But at this stage I
> do not know what other problems I am going to face, and how easy they will
> be to fix.
>
> So I thought I would ask here if anyone has been through a similar
> exercise, and if what I am going through sounds normal, or if I am doing
> something fundamentally wrong.

This is where it would make sense to me to use callbacks instead of subroutines. You can structure your __init__ method like this:

    def __init__(self, params):
        self.params = params
        self.db_object_future = get_event_loop().create_task(
            get_db_object(params))

    async def method_depending_on_db_object(self):
        db_object = await self.db_object_future
        result = do_something_with(db_object)
        return result

The caveat with this is that while __init__ itself doesn't need to be a coroutine, any method that depends on the DB lookup does need to be (or at least needs to return a future).

--
https://mail.python.org/mailman/listinfo/python-list
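[Editor's note: a self-contained sketch of the future-in-__init__ pattern described above. `get_db_object` is a stand-in coroutine and `DbAwareObject` is an invented class name.]

```python
import asyncio

async def get_db_object(params):
    await asyncio.sleep(0.01)          # stand-in for the database lookup
    return {'company': params}

class DbAwareObject:
    def __init__(self, loop, params):
        self.params = params
        # kick off the lookup immediately; store the future, not the result
        self.db_object_future = loop.create_task(get_db_object(params))

    async def method_depending_on_db_object(self):
        # the first caller waits for the lookup; later callers get the
        # already-resolved future's result immediately
        db_object = await self.db_object_future
        return db_object['company']

async def main(loop):
    obj = DbAwareObject(loop, 'acme')
    return await obj.method_depending_on_db_object()

loop = asyncio.new_event_loop()
result = loop.run_until_complete(main(loop))
loop.close()
print(result)
```

Awaiting an already-completed future is cheap, so every method that needs the DB object can simply `await self.db_object_future` without worrying whether the lookup has finished yet.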
Re: Question about asyncio and blocking operations
On Sun, Jan 24, 2016 at 1:38 AM, Frank Millman wrote:
> I find I am bumping my head more than I expected, so I thought I would try
> to get some feedback here to see if I have some flaw in my approach, or if
> it is just in the nature of writing an asynchronous-style application.

I don't have a lot of experience with Python's async/await as such, but I've written asynchronous apps using a variety of systems (and also written threaded ones many times), so I'll answer questions on the basis of design principles that were passed down to me through the generations.

> I use a cache to store frequently used objects, but I wait for the first
> request before I actually retrieve it from the database. This is how it
> worked -
>
>     # cache of database objects for each company
>     class DbObjects(dict):
>         def __missing__(self, company):
>             db_object = self[company] = get_db_object_from_database()
>             return db_object
>
>     db_objects = DbObjects()
>
> Any function could ask for db_cache.db_objects[company]. The first time it
> would be read from the database, on subsequent requests it would be
> returned from the dictionary.
>
> Now get_db_object_from_database() is a coroutine, so I have to change it to
>
>     db_object = self[company] = await get_db_object_from_database()
>
> But that is not allowed, because __missing__() is not a coroutine.
>
> I fixed it by replacing the cache with a function -
>
>     # cache of database objects for each company
>     db_objects = {}
>
>     async def get_db_object(company):
>         if company not in db_objects:
>             db_object = db_objects[company] = await get_db_object_from_database()
>         return db_objects[company]
>
> Now the calling functions have to call 'await
> db_cache.get_db_object(company)'
>
> Ok, once I had made the change it did not feel so bad.

I would prefer the function call anyway. Subscripting a dictionary is fine for something that's fairly cheap, but if it's potentially hugely expensive, I'd rather see it spelled as a function call. There's plenty of precedent for caching function calls so only the first one is expensive.

> Now I have another problem. I have some classes which retrieve some data
> from the database during their __init__() method. I find that it is not
> allowed to call a coroutine from __init__(), and it is not allowed to turn
> __init__() into a coroutine.
>
> I imagine that I will have to split __init__() into two parts, put the
> database functionality into a separately-callable method, and then go
> through my app to find all occurrences of instantiating the object and
> follow it with an explicit call to the new method.
>
> Again, I can handle that without too much difficulty. But at this stage I
> do not know what other problems I am going to face, and how easy they will
> be to fix.

The question here is: Until you get that data from the database, what state would the object be in? There are two basic options:

1) If the object is somewhat usable and meaningful, divide initialization into two parts - one that sets up the object itself (__init__) and one that fetches stuff from the database. If you can, trigger the database fetch in __init__ so it's potentially partly done when you come to wait for it.

2) If the object would be completely useless, use an awaitable factory function instead. Rather than constructing an object, you ask an asynchronous procedure to give you an object. It's a subtle change, and by carefully managing the naming, you could make it almost transparent in your code:

    # Old way:
    class User:
        def __init__(self, domain, name):
            self.id = blocking_database_call("get user", domain, name)

    # And used thus:
    me = User("example.com", "rosuav")

    # New way:
    class User:
        def __init__(self, id):
            self.id = id

    _User = User

    async def User(domain, name):
        id = await async_database_call("get user", domain, name)
        return _User(id)

    # And used thus:
    me = await User("example.com", "rosuav")

> So I thought I would ask here if anyone has been through a similar
> exercise, and if what I am going through sounds normal, or if I am doing
> something fundamentally wrong.

I think this looks pretty much right. There are some small things you can do to make it look a bit easier, but it's minor.

ChrisA

--
https://mail.python.org/mailman/listinfo/python-list
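[Editor's note: a runnable sketch of the "new way" above. `async_database_call` is a stand-in coroutine, and for clarity the private class is written directly as `_User` rather than via the rebinding trick.]

```python
import asyncio

async def async_database_call(query, domain, name):
    await asyncio.sleep(0.01)   # stand-in for a real async DB driver
    return 42                   # pretend this is the user's id

class _User:
    def __init__(self, id):
        self.id = id

async def User(domain, name):
    # awaitable factory: reads like a constructor at the call site,
    # but can await the database lookup before building the object
    id = await async_database_call("get user", domain, name)
    return _User(id)

async def main():
    me = await User("example.com", "rosuav")
    return me

loop = asyncio.new_event_loop()
me = loop.run_until_complete(main())
loop.close()
print(type(me).__name__, me.id)
```

The caller never sees a half-initialized object: by the time `await User(...)` returns, the database data is already in place.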
Question about asyncio and blocking operations
Hi all

I am developing a typical accounting/business application which involves a front-end allowing clients to access the system, a back-end connecting to a database, and a middle layer that glues it all together.

Some time ago I converted the front-end from a multi-threaded approach to an asyncio approach. It was surprisingly easy, and did not require me to delve into asyncio too deeply.

There was one aspect that I deliberately ignored at that stage. I did not change the database access to an asyncio approach, so all reading from/writing to the database involved a blocking operation. I am now ready to tackle that.

I find I am bumping my head more than I expected, so I thought I would try to get some feedback here to see if I have some flaw in my approach, or if it is just in the nature of writing an asynchronous-style application.

Here is the difficulty. The recommended way to handle a blocking operation is to run it as a task in a different thread, using run_in_executor(). This method is a coroutine. An implication of this is that any method that calls it must also be a coroutine, so I end up with a chain of coroutines stretching all the way back to the initial event that triggered it.

I can understand why this is necessary, but it does lead to some awkward programming.

I use a cache to store frequently used objects, but I wait for the first request before I actually retrieve it from the database. This is how it worked -

    # cache of database objects for each company
    class DbObjects(dict):
        def __missing__(self, company):
            db_object = self[company] = get_db_object_from_database()
            return db_object

    db_objects = DbObjects()

Any function could ask for db_cache.db_objects[company]. The first time it would be read from the database, on subsequent requests it would be returned from the dictionary.

Now get_db_object_from_database() is a coroutine, so I have to change it to

    db_object = self[company] = await get_db_object_from_database()

But that is not allowed, because __missing__() is not a coroutine.

I fixed it by replacing the cache with a function -

    # cache of database objects for each company
    db_objects = {}

    async def get_db_object(company):
        if company not in db_objects:
            db_object = db_objects[company] = await get_db_object_from_database()
        return db_objects[company]

Now the calling functions have to call 'await db_cache.get_db_object(company)'

Ok, once I had made the change it did not feel so bad.

Now I have another problem. I have some classes which retrieve some data from the database during their __init__() method. I find that it is not allowed to call a coroutine from __init__(), and it is not allowed to turn __init__() into a coroutine.

I imagine that I will have to split __init__() into two parts, put the database functionality into a separately-callable method, and then go through my app to find all occurrences of instantiating the object and follow it with an explicit call to the new method.

Again, I can handle that without too much difficulty. But at this stage I do not know what other problems I am going to face, and how easy they will be to fix.

So I thought I would ask here if anyone has been through a similar exercise, and if what I am going through sounds normal, or if I am doing something fundamentally wrong.

Thanks for any input

Frank Millman

--
https://mail.python.org/mailman/listinfo/python-list
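[Editor's note: the replacement cache function described above can be sketched as a runnable example like this, with `get_db_object_from_database` as a stand-in coroutine.]

```python
import asyncio

async def get_db_object_from_database(company):
    await asyncio.sleep(0.01)       # stand-in for the real database read
    return {'name': company}

# cache of database objects for each company
db_objects = {}

async def get_db_object(company):
    if company not in db_objects:
        db_objects[company] = await get_db_object_from_database(company)
    return db_objects[company]

async def main():
    a = await get_db_object('acme')   # first call hits the "database"
    b = await get_db_object('acme')   # second call comes from the cache
    return a is b

loop = asyncio.new_event_loop()
same = loop.run_until_complete(main())
loop.close()
print(same)
```

One caveat: two coroutines that both miss the cache for the same company concurrently will each fetch from the database; storing a per-company future (or guarding the lookup with an asyncio.Lock) would close that race.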