Re: asyncio question
On Tue, Nov 3, 2020 at 3:27 AM Frank Millman wrote:

> It works, and it does look neater. But I want to start some background
> tasks before starting the server, and cancel them on Ctrl+C.
>
> Using the 'old' method, I can wrap 'loop.run_forever()' in a
> try/except/finally, check for KeyboardInterrupt, and run my cleanup in
> the 'finally' block.
>
> Using the 'new' method, KeyboardInterrupt is not caught by
> 'server.serve_forever()' but by 'asyncio.run()'. It is too late to do
> any cleanup at this point, as the loop has already been stopped.
>
> Is it ok to stick to the 'old' method, or is there a better way to do
> this?

It's fine to stick with the older method in your case; there's nothing
inherently wrong with continuing to use it. `asyncio.run()` is largely
a convenience function that takes care of some finalization/cleanup
steps that are often forgotten (cancelling remaining tasks, closing the
event loop's default ThreadPoolExecutor, closing async generators,
etc.).

If you want custom KeyboardInterrupt handling and still want to use
`asyncio.run()`, you can either (a) use `loop.add_signal_handler()` or
(b) make a slightly modified local version of `asyncio.run()` that has
your desired KeyboardInterrupt behavior, based roughly on
https://github.com/python/cpython/blob/master/Lib/asyncio/runners.py.

However, using `loop.run_until_complete()` instead of `asyncio.run()`
is also perfectly fine, especially in existing code that still works
without issue. It just leaves the author with a bit more responsibility
for resource finalization and for managing the event loop in general
(which adds some cognitive burden and room for error, particularly when
dealing with multiple event loops or threads). But it's reasonably
common to keep using `loop.run_until_complete()` in situations where
the default `asyncio.run()` behavior isn't what you need or want, such
as your case.

--
https://mail.python.org/mailman/listinfo/python-list
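For illustration, a minimal sketch of option (b): a local `run()`
modeled loosely on Lib/asyncio/runners.py, with a hook for running
cleanup on Ctrl+C. The `cleanup` parameter is a hypothetical coroutine
function you would supply; this is a sketch, not the stdlib
implementation.

    import asyncio

    def run(coro, cleanup=None):
        # local variant of asyncio.run() with custom KeyboardInterrupt
        # handling
        loop = asyncio.new_event_loop()
        try:
            asyncio.set_event_loop(loop)
            return loop.run_until_complete(coro)
        except KeyboardInterrupt:
            if cleanup is not None:
                # the loop has stopped but is not yet closed, so it can
                # still run the user-supplied cleanup coroutine
                loop.run_until_complete(cleanup())
        finally:
            try:
                # the finalization steps asyncio.run() normally performs
                tasks = asyncio.all_tasks(loop)
                if tasks:
                    for task in tasks:
                        task.cancel()
                    loop.run_until_complete(
                        asyncio.gather(*tasks, return_exceptions=True))
                loop.run_until_complete(loop.shutdown_asyncgens())
            finally:
                asyncio.set_event_loop(None)
                loop.close()

You would then call it as `run(main(), cleanup=my_cleanup)` in place of
`asyncio.run(main())`.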
asyncio question
Hi all

My app runs an HTTP server using asyncio. A lot of the code dates back
to Python 3.4, and I am trying to bring it up to date. There is one
aspect I do not understand.

The 'old' way looks like this -

    import asyncio

    def main():
        loop = asyncio.get_event_loop()
        server = loop.run_until_complete(
            asyncio.start_server(handle_client, host, port))
        loop.run_forever()

    if __name__ == '__main__':
        main()

According to the docs, the preferred way is now like this -

    import asyncio

    async def main():
        loop = asyncio.get_running_loop()
        server = await asyncio.start_server(
            handle_client, host, port)
        async with server:
            await server.serve_forever()

    if __name__ == '__main__':
        asyncio.run(main())

It works, and it does look neater. But I want to start some background
tasks before starting the server, and cancel them on Ctrl+C.

Using the 'old' method, I can wrap 'loop.run_forever()' in a
try/except/finally, check for KeyboardInterrupt, and run my cleanup in
the 'finally' block.

Using the 'new' method, KeyboardInterrupt is not caught by
'server.serve_forever()' but by 'asyncio.run()'. It is too late to do
any cleanup at this point, as the loop has already been stopped.

Is it ok to stick to the 'old' method, or is there a better way to do
this?

Thanks

Frank Millman

--
https://mail.python.org/mailman/listinfo/python-list
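For reference, the cleanup arrangement described above might look
something like this (a minimal sketch; `poll_something`,
`handle_client`, `host` and `port` are hypothetical stand-ins, not code
from the thread):

    import asyncio

    async def poll_something():
        # hypothetical background task started before the server
        while True:
            await asyncio.sleep(60)

    def main():
        loop = asyncio.get_event_loop()
        bg_task = loop.create_task(poll_something())
        server = loop.run_until_complete(
            asyncio.start_server(handle_client, host, port))
        try:
            loop.run_forever()
        except KeyboardInterrupt:
            pass
        finally:
            # cancel the background task and wait for it to finish
            bg_task.cancel()
            loop.run_until_complete(
                asyncio.gather(bg_task, return_exceptions=True))
            server.close()
            loop.run_until_complete(server.wait_closed())
            loop.close()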
Re: Asyncio question (rmlibre)
On 2020-02-28 1:37 AM, rmli...@riseup.net wrote:

> What resources are you trying to conserve?
>
> If you want to try conserving time, you shouldn't have to worry about
> starting too many background tasks. That's because asyncio code was
> designed to be extremely time efficient at handling large numbers of
> concurrent async tasks.

Thanks for the reply. That is exactly what I want, and in an earlier
response Greg echoes what you say here - background tasks are
lightweight and are ideal for my situation.

Frank

--
https://mail.python.org/mailman/listinfo/python-list
Re: Asyncio question (rmlibre)
What resources are you trying to conserve?

If you want to conserve time, you shouldn't have to worry about
starting too many background tasks. That's because asyncio code was
designed to be extremely time efficient at handling large numbers of
concurrent async tasks.

For your application, starting background tasks that await execution
based on their designated queue seems like a good idea. This is time
efficient since it takes full advantage of async concurrency, while
also allowing you to control the order of execution. There may be other
efficiency boosts to be had as well, for instance, if everything except
the precise changes that need to be atomic is run concurrently.

However, if you want to conserve CPU cycles per unit time, then
staggering the processing of requests sequentially is the best option,
although there's little need for async code if that is the case.

Or, if you'd like to conserve memory, making the code more
generator-based is a good option. Lazy computation is quite efficient
in both memory and time. That said, rewriting your codebase to run on
generators can be a lot of work, and their efficiency won't really be
felt unless your code is handling "big data" or very large requests.

In any case, you'd probably want to run benchmark and profiling tools
against a mock-up runtime of your code, and optimize/experiment only
after you've noticed there's an efficiency problem and have deduced its
causes. Barring that, it's just guess-work and may be a waste of time.

On 2020-02-21 17:00, python-list-requ...@python.org wrote:

> Hi all
>
> I use asyncio in my project, and it works very well without my having
> to understand what goes on under the hood. It is a multi-user
> client/server system, and I want it to scale to many concurrent
> users. I have a situation where I have to decide between two
> approaches, and I want to choose the least resource-intensive, but I
> find it hard to reason about which, if either, is better.
>
> I use HTTP. On the initial connection from a client, I set up a
> session object, and the session id is passed to the client. All
> subsequent requests from that client include the session id, and the
> request is passed to the session object for handling.
>
> It is possible for a new request to be received from a client before
> the previous one has been completed, and I want each request to be
> handled atomically, so each session maintains its own
> asyncio.Queue(). The main routine gets the session id from the
> request and 'puts' the request in the appropriate queue. The session
> object 'gets' from the queue and handles the request. It works well.
>
> The question is, how to arrange for each session to 'await' its
> queue. My first attempt was to create a background task for each
> session which runs for the life-time of the session, and 'awaits' its
> queue. It works, but I was concerned about having a lot of background
> tasks active at the same time.
>
> Then I came up with what I thought was a better idea. On the initial
> connection, I create the session object, send the response to the
> client, and then 'await' the method that sets up the session's queue.
> This also works, and there is no background task involved. However, I
> then realised that the initial response handler never completes, and
> will 'await' until the session is closed.
>
> Is this better, worse, or does it make no difference? If it makes no
> difference, I will lean towards the first approach, as it is easier
> to reason about what is going on.
>
> Thanks for any advice.
>
> Frank Millman

--
https://mail.python.org/mailman/listinfo/python-list
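To illustrate the generator-based suggestion above: a minimal sketch
that consumes a stream lazily with an async generator, so only the
current record is held in memory at any time (the line-delimited wire
format is an assumption for the example, not something from the
thread):

    import asyncio

    async def records(reader):
        # async generator: yield one parsed record at a time instead
        # of buffering the whole request body
        while True:
            line = await reader.readline()
            if not line:
                return
            yield line.decode().rstrip()

    async def handle_client(reader, writer):
        async for record in records(reader):
            ...  # process each record as it arrives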
Re: Asyncio question
On 2020-02-21 11:13 PM, Greg Ewing wrote:

> On 21/02/20 7:59 pm, Frank Millman wrote:
>> My first attempt was to create a background task for each session
>> which runs for the life-time of the session, and 'awaits' its queue.
>> It works, but I was concerned about having a lot of background tasks
>> active at the same time.
>
> The whole point of asyncio is to make tasks very lightweight, so you
> can use as many of them as is convenient without worries. One task
> per client sounds like the right thing to do here.

Perfect. Thanks so much.

Frank

--
https://mail.python.org/mailman/listinfo/python-list
Re: Asyncio question
On 21/02/20 7:59 pm, Frank Millman wrote:

> My first attempt was to create a background task for each session
> which runs for the life-time of the session, and 'awaits' its queue.
> It works, but I was concerned about having a lot of background tasks
> active at the same time.

The whole point of asyncio is to make tasks very lightweight, so you
can use as many of them as is convenient without worries. One task per
client sounds like the right thing to do here.

--
Greg

--
https://mail.python.org/mailman/listinfo/python-list
Asyncio question
Hi all

I use asyncio in my project, and it works very well without my having
to understand what goes on under the hood. It is a multi-user
client/server system, and I want it to scale to many concurrent users.
I have a situation where I have to decide between two approaches, and I
want to choose the least resource-intensive, but I find it hard to
reason about which, if either, is better.

I use HTTP. On the initial connection from a client, I set up a session
object, and the session id is passed to the client. All subsequent
requests from that client include the session id, and the request is
passed to the session object for handling.

It is possible for a new request to be received from a client before
the previous one has been completed, and I want each request to be
handled atomically, so each session maintains its own asyncio.Queue().
The main routine gets the session id from the request and 'puts' the
request in the appropriate queue. The session object 'gets' from the
queue and handles the request. It works well.

The question is, how to arrange for each session to 'await' its queue.
My first attempt was to create a background task for each session which
runs for the life-time of the session, and 'awaits' its queue. It
works, but I was concerned about having a lot of background tasks
active at the same time.

Then I came up with what I thought was a better idea. On the initial
connection, I create the session object, send the response to the
client, and then 'await' the method that sets up the session's queue.
This also works, and there is no background task involved. However, I
then realised that the initial response handler never completes, and
will 'await' until the session is closed.

Is this better, worse, or does it make no difference? If it makes no
difference, I will lean towards the first approach, as it is easier to
reason about what is going on.

Thanks for any advice.

Frank Millman

--
https://mail.python.org/mailman/listinfo/python-list
RE: asyncio Question
> -----Original Message-----
> From: Python-list <bounces+jcasale=activenetwerx@python.org>
> On Behalf Of Simon Connah
> Sent: Thursday, March 14, 2019 3:03 AM
> To: Python
> Subject: asyncio Question
>
> Hi,
>
> Hopefully this isn't a stupid question. For the record I am using
> Python 3.7 on Ubuntu Linux.
>
> I've decided to use asyncio to write a TCP network server using
> Streams and asyncio.start_server(). I can handle that part of it
> without many problems as the documentation is pretty good. I have one
> problem though that I am not sure how to solve. Each request to the
> server will be a JSON string and one of the components of the JSON
> string will be the latitude and longitude. What I want to do is
> associate a latitude and longitude with the client connection. Every
> time the latitude and longitude changes then the two values will be
> updated. There will only ever be one latitude and longitude for each
> client connection (since a device can only ever be in one place). I'm
> just not sure what the best method to achieve this would be when
> using asyncio.start_server().

You expect the client to provide the latitude and longitude, so what
happens when it misbehaves, either by accident or intentionally, and
all of a sudden you have several connections with the same values?
You'd better rely on your own means of differentiating them.

--
https://mail.python.org/mailman/listinfo/python-list
asyncio Question
Hi,

Hopefully this isn't a stupid question. For the record I am using
Python 3.7 on Ubuntu Linux.

I've decided to use asyncio to write a TCP network server using Streams
and asyncio.start_server(). I can handle that part of it without many
problems as the documentation is pretty good. I have one problem,
though, that I am not sure how to solve. Each request to the server
will be a JSON string, and one of the components of the JSON string
will be the latitude and longitude. What I want to do is associate a
latitude and longitude with the client connection. Every time the
latitude and longitude changes, the two values will be updated. There
will only ever be one latitude and longitude for each client connection
(since a device can only ever be in one place). I'm just not sure what
the best method to achieve this would be when using
asyncio.start_server().

Any help would be greatly appreciated.

--
https://mail.python.org/mailman/listinfo/python-list
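One possible shape for this, sketched under assumptions (the
'lat'/'lon' JSON keys, the line-delimited framing and the address are
all invented for the example): keep the state in the handler's local
variables, since asyncio.start_server() runs one handler task per
connection.

    import asyncio
    import json

    async def handle_client(reader, writer):
        # per-connection state lives in the handler's locals
        lat = lon = None
        peer = writer.get_extra_info('peername')
        while True:
            line = await reader.readline()
            if not line:
                break  # client closed the connection
            try:
                msg = json.loads(line)
            except json.JSONDecodeError:
                continue  # ignore malformed requests
            # update the connection's one-and-only position
            lat = msg.get('lat', lat)
            lon = msg.get('lon', lon)
            print(f'{peer}: now at ({lat}, {lon})')
        writer.close()
        await writer.wait_closed()

    async def main():
        server = await asyncio.start_server(
            handle_client, '127.0.0.1', 8888)
        async with server:
            await server.serve_forever()

    asyncio.run(main())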
Re: asyncio question
"Ian Kelly" wrote in message news:CALwzid=vdczAH18mHKaL7ryvDUB=7_y-JVUrTkRZ=gkz66p...@mail.gmail.com... On Tue, Dec 13, 2016 at 6:15 AM, Frank Millmanwrote: > The client uses AJAX to send messages to the server. It sends the > message > and continues processing, while a background task waits for the response > and > handles it appropriately. As a result, the client can send a second > message > before receiving a response to the first one. The server can detect > this, > but it cannot wait for the first message to complete, otherwise it will > block other clients. I have not noticed any problems with processing 2 > requests from the same client concurrently, but I don't like it, so I > want > to process them sequentially. Is there a particular reason why you're worried about this? The browser is perfectly capable of keeping the requests straight. Also, note that since the requests use separate HTTP connections, even if the server sends its responses in a particular order, there's no guarantee that the client will read them in that order, so this doesn't free you from the need to allow the client to handle the requests coming back in any order. I had not thought of that, thanks. In fact, more to the point in my case, I assume that there is no guarantee that the server will receive the requests in the same order that the client sends them. The particular reason for my concern was that each request can change the state of the session, and I wanted to be sure that the state has been fully updated before processing the next request. One scenario, that I had not thought of, is that the requests could be received out of sequence. I don't think this will be a problem, but I will have to think about it. The second scenario, which was my main concern, is that the server starts processing the second request before processing of the first request has been completed, meaning that the session data may not be in a stable state. My proposed solution solves this problem. In a 64-bit Linux build of CPython, the combined size of a generator and a stack frame is around half a kilobyte (not including whatever space is needed for local variables), so hundreds of yielding asyncio tasks should consume no more than hundreds of kilobytes of memory. Remember that half a kilobyte figure is per generator, not per task, so if your while loop is four coroutines deep, that will inflate the cost of the task to two kilobytes each. This is different from the threading model where each thread would need its own separate stack space, not just a frame on the heap; you probably wouldn't want to do this with threads, but with coroutines it should be fine. Good to know, thanks. I will proceed on the assumption that if anyone runs my system with hundreds of users, they will run it on some serious hardware, so I should be safe. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: asyncio question
On Tue, Dec 13, 2016 at 6:15 AM, Frank Millman wrote:

> The client uses AJAX to send messages to the server. It sends the
> message and continues processing, while a background task waits for
> the response and handles it appropriately. As a result, the client can
> send a second message before receiving a response to the first one.
> The server can detect this, but it cannot wait for the first message
> to complete, otherwise it will block other clients. I have not noticed
> any problems with processing 2 requests from the same client
> concurrently, but I don't like it, so I want to process them
> sequentially.

Is there a particular reason why you're worried about this? The browser
is perfectly capable of keeping the requests straight. Also, note that
since the requests use separate HTTP connections, even if the server
sends its responses in a particular order, there's no guarantee that
the client will read them in that order, so this doesn't free you from
the need to allow the client to handle the requests coming back in any
order.

> Here is my solution. As I create each Session instance, I set up a
> background task, using asyncio.ensure_future, which sets up an
> asyncio.Queue. The request handler identifies the session that the
> request belongs to, and 'puts' the request onto that session's Queue.
> The background task runs a 'while True' loop waiting for requests. As
> they come in it 'gets' them and processes them. It seems to work.
>
> This means that I have a background task running for each concurrent
> user. Each one will be idle most of the time. My gut-feel says that
> this will not cause a problem, even if there are hundreds of them, but
> any comments will be welcome.

In a 64-bit Linux build of CPython, the combined size of a generator
and a stack frame is around half a kilobyte (not including whatever
space is needed for local variables), so hundreds of yielding asyncio
tasks should consume no more than hundreds of kilobytes of memory.
Remember that the half-kilobyte figure is per generator, not per task,
so if your while loop is four coroutines deep, that will inflate the
cost of the task to two kilobytes each. This is different from the
threading model, where each thread would need its own separate stack
space, not just a frame on the heap; you probably wouldn't want to do
this with threads, but with coroutines it should be fine.

--
https://mail.python.org/mailman/listinfo/python-list
asyncio question
Hi all

I had a problem with asyncio - not a programming problem, but one with
organising my code to achieve a given result. I have come up with a
solution, but thought I would mention it here to see if there is a
better approach.

I am using asyncio.start_server() to run a simple HTTP server. Each
request is passed to a handler. I strive to complete each request as
quickly as possible, but I use 'await' where necessary to prevent
blocking. It all works well.

HTTP does not keep the connection open - it sends a message, waits for
a response, and closes the connection. In order to maintain 'state' for
concurrent users, I have a Session class, and each message contains a
session id so that I can pass the message to the correct session
instance. Again, it all works well.

The client uses AJAX to send messages to the server. It sends the
message and continues processing, while a background task waits for the
response and handles it appropriately. As a result, the client can send
a second message before receiving a response to the first one. The
server can detect this, but it cannot wait for the first message to
complete, otherwise it will block other clients. I have not noticed any
problems with processing 2 requests from the same client concurrently,
but I don't like it, so I want to process them sequentially.

Here is my solution. As I create each Session instance, I set up a
background task, using asyncio.ensure_future, which sets up an
asyncio.Queue. The request handler identifies the session that the
request belongs to, and 'puts' the request onto that session's Queue.
The background task runs a 'while True' loop waiting for requests. As
they come in it 'gets' them and processes them. It seems to work. A
sketch of this pattern appears below.

This means that I have a background task running for each concurrent
user. Each one will be idle most of the time. My gut-feel says that
this will not cause a problem, even if there are hundreds of them, but
any comments will be welcome.

Thanks

Frank Millman

--
https://mail.python.org/mailman/listinfo/python-list
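A minimal sketch of the pattern described above (Session and
handle_request are stand-ins; the actual code is not shown in the
thread):

    import asyncio

    class Session:
        def __init__(self):
            self.queue = asyncio.Queue()
            # one long-lived background task per session drains the
            # queue for the lifetime of the session
            self.worker = asyncio.ensure_future(self.process())

        async def process(self):
            while True:
                request = await self.queue.get()
                # finish each request before starting the next, so
                # requests from one client are handled sequentially
                await self.handle_request(request)

        async def handle_request(self, request):
            ...  # application-specific handling

    sessions = {}  # session id -> Session

    async def dispatch(session_id, request):
        # the request handler routes each request to its session
        await sessions[session_id].queue.put(request)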
asyncio question
I have a portion of code I need to speed up. There are 3 API calls to
an external system, where the first enumerates a large collection of
objects, which I then loop through, performing two additional API calls
each. The first call is instant; the second and third per object are
very slow. Currently, after accumulating all the data, I write the
relevant data into a database.

I have the ability to hold all this in memory and dump it once fully
accumulated, so performing the second and third call in parallel with
fixed batches would be great. I took a look at coroutines and some
skeleton code worked fine, but I am not sure how to perform the
acquisition in fixed groups, like I might for example with
multiprocessing and a pool of workers.

Anyone done something like this or have an opinion?

Thanks,
jlc

--
https://mail.python.org/mailman/listinfo/python-list
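One common way to get the fixed-group behavior with asyncio is to cap
concurrency with a semaphore rather than batching explicitly. A minimal
sketch under assumed names (enumerate_objects, fetch_details,
fetch_extra and write_to_database stand in for the real API and
database calls):

    import asyncio

    BATCH = 10  # max slow calls in flight, like a pool of 10 workers

    async def fetch_object(obj, sem):
        async with sem:
            # the two slow per-object calls
            details = await fetch_details(obj)
            extra = await fetch_extra(obj)
            return obj, details, extra

    async def main():
        objects = await enumerate_objects()  # the fast first call
        sem = asyncio.Semaphore(BATCH)
        # everything is scheduled at once, but the semaphore keeps at
        # most BATCH fetches running at any moment
        results = await asyncio.gather(
            *(fetch_object(obj, sem) for obj in objects))
        write_to_database(results)  # dump the accumulated data

    asyncio.run(main())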