Re: asyncio question

2020-11-03 Thread Kyle Stanley
 On Tue, Nov 3, 2020 at 3:27 AM Frank Millman  wrote:

> It works, and it does look neater. But I want to start some background
> tasks before starting the server, and cancel them on Ctrl+C.
>
> Using the 'old' method, I can wrap 'loop.run_forever()' in a
> try/except/finally, check for KeyboardInterrupt, and run my cleanup in
> the 'finally' block.
>
> Using the 'new' method, KeyboardInterrupt is not caught by
> 'server.serve_forever()' but by 'asyncio.run()'. It is too late to do
> any cleanup at this point, as the loop has already been stopped.
>
> Is it ok to stick to the 'old' method, or is there a better way to do this?
>

It's fine to stick with the older method in your case, as there's nothing
inherently wrong with continuing to use it. `asyncio.run()` is largely a
convenience function that takes care of some finalization/cleanup steps
that are often forgotten (cancelling remaining tasks, closing the event
loop's default ThreadPoolExecutor, closing async generators, etc.). If you want to
use custom KeyboardInterrupt handling and still use asyncio.run(), you can
either (a) use `loop.add_signal_handler()` or (b) make a slightly modified
local version of `asyncio.run()` that has your desired KeyboardInterrupt
behavior, based roughly on
https://github.com/python/cpython/blob/master/Lib/asyncio/runners.py.
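
As a rough illustration of option (b), a simplified local runner might
look something like this (a sketch only, not the real runners.py code;
`run_with_cleanup()` and `my_cleanup()` are made-up names):

import asyncio

async def my_cleanup():
    # Hypothetical hook: cancel background tasks, close
    # connections, flush buffers, etc.
    pass

async def _cancel_remaining(loop):
    # Cancel whatever tasks are still pending, then wait for
    # them to actually finish cancelling.
    tasks = [t for t in asyncio.all_tasks(loop)
             if t is not asyncio.current_task()]
    for t in tasks:
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)

def run_with_cleanup(coro):
    # Simplified variant of asyncio.run() that runs custom cleanup
    # on Ctrl+C while the event loop is still usable.
    loop = asyncio.new_event_loop()
    try:
        loop.run_until_complete(coro)
    except KeyboardInterrupt:
        loop.run_until_complete(my_cleanup())
    finally:
        # Approximate asyncio.run()'s normal finalization steps.
        loop.run_until_complete(_cancel_remaining(loop))
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.close()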

However, using `loop.run_until_complete()` instead of `asyncio.run()` is
also perfectly fine, especially in existing code that still works without
issue. It just leaves the author with a bit more responsibility when it
comes to resource finalization and dealing with the event loop in general
(which can add some extra cognitive burden and room for error, particularly
when dealing with multiple event loops or threads). But it's reasonably
common to continue using `loop.run_until_complete()` in situations where
the default `asyncio.run()` behavior isn't what you need/want, such as your
case.
-- 
https://mail.python.org/mailman/listinfo/python-list


asyncio question

2020-11-03 Thread Frank Millman

Hi all

My app runs an HTTP server using asyncio. A lot of the code dates back 
to Python 3.4, and I am trying to bring it up to date. There is one 
aspect I do not understand.


The 'old' way looks like this -

import asyncio

def main():
    loop = asyncio.get_event_loop()
    server = loop.run_until_complete(
        asyncio.start_server(handle_client, host, port))
    loop.run_forever()

if __name__ == '__main__':
    main()

According to the docs, the preferred way is now like this -

import asyncio

async def main():
    loop = asyncio.get_running_loop()
    server = await asyncio.start_server(
        handle_client, host, port)
    async with server:
        await server.serve_forever()

if __name__ == '__main__':
    asyncio.run(main())

It works, and it does look neater. But I want to start some background 
tasks before starting the server, and cancel them on Ctrl+C.


Using the 'old' method, I can wrap 'loop.run_forever()' in a 
try/except/finally, check for KeyboardInterrupt, and run my cleanup in 
the 'finally' block.
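
In outline, that looks something like the sketch below ('bg_job()',
'handle_client()', 'host' and 'port' are placeholders for the real
application pieces):

import asyncio

async def bg_job():
    while True:
        await asyncio.sleep(60)  # placeholder background work

async def handle_client(reader, writer):
    writer.close()  # placeholder handler

host, port = '127.0.0.1', 8080

def main():
    loop = asyncio.get_event_loop()
    # Start background tasks before starting the server.
    tasks = [loop.create_task(bg_job())]
    server = loop.run_until_complete(
        asyncio.start_server(handle_client, host, port))
    try:
        loop.run_forever()
    except KeyboardInterrupt:
        pass
    finally:
        # Cleanup runs while the loop is still usable.
        for task in tasks:
            task.cancel()
        loop.run_until_complete(
            asyncio.gather(*tasks, return_exceptions=True))
        server.close()
        loop.run_until_complete(server.wait_closed())
        loop.close()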


Using the 'new' method, KeyboardInterrupt is not caught by 
'server.serve_forever()' but by 'asyncio.run()'. It is too late to do 
any cleanup at this point, as the loop has already been stopped.


Is it ok to stick to the 'old' method, or is there a better way to do this?

Thanks

Frank Millman

--
https://mail.python.org/mailman/listinfo/python-list


Re: Asyncio question (rmlibre)

2020-02-28 Thread Frank Millman



On 2020-02-28 1:37 AM, rmli...@riseup.net wrote:
> What resources are you trying to conserve?
>
> If you want to try conserving time, you shouldn't have to worry about
> starting too many background tasks. That's because asyncio code was
> designed to be extremely time efficient at handling large numbers of
> concurrent async tasks.
>

Thanks for the reply.

That is exactly what I want, and in an earlier response Greg echoes what 
you say here - background tasks are lightweight and are ideal for 
my situation.


Frank

--
https://mail.python.org/mailman/listinfo/python-list


Re: Asyncio question (rmlibre)

2020-02-27 Thread rmlibre
What resources are you trying to conserve? 


If you want to try conserving time, you shouldn't have to worry about
starting too many background tasks. That's because asyncio code was
designed to be extremely time efficient at handling large numbers of
concurrent async tasks. 

For your application, it seems starting background tasks that
appropriately await execution based on their designated queue is a good
idea. This is more time efficient since it takes full advantage of async
concurrency, while also allowing you to control the order of execution.
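
A minimal sketch of that per-session pattern (the Session class and
handle_request() here are illustrative, not the actual application code):

import asyncio

class Session:
    def __init__(self):
        self.queue = asyncio.Queue()
        # One long-lived consumer task per session.
        self._task = asyncio.ensure_future(self._worker())

    async def _worker(self):
        while True:
            request = await self.queue.get()
            # One request at a time, so each session handles its
            # requests atomically and in arrival order.
            await self.handle_request(request)

    async def handle_request(self, request):
        ...  # application-specific handling

    def close(self):
        self._task.cancel()

The main request handler then just looks up the session by its id and
does 'await session.queue.put(request)', returning immediately so no
client ever blocks another.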

There may be other efficiency boosts to be had, though - for instance,
if all but the precise changes that need to be atomic are run
concurrently.


However, if you want to conserve CPU cycles per unit time, then
staggering the processing of requests sequentially is the best option,
although there's little need for async code if that is the case.


Or, if you'd like to conserve memory, making the code more
generator-based is a good option. Lazy computation is quite efficient in
both memory and time. Rewriting your codebase to run on generators can
be a lot of work, though, and their efficiency won't really be felt
unless your code is handling "big data" or very large requests.


In any case, you'd probably want to run some benchmark and profiling
tools against a mock-up runtime of your code and optimize/experiment
only after you've noticed there's an efficiency problem and have deduced
its causes. Barring that, it's just guess-work & may just be a waste of
time.




On 2020-02-21 17:00, python-list-requ...@python.org wrote:
> Hi all

> I use asyncio in my project, and it works very well without my having to 
> understand what goes on under the hood. It is a multi-user client/server 
> system, and I want it to scale to many concurrent users. I have a situation 
> where I have to decide between two approaches, and I want to choose the least 
> resource-intensive, but I find it hard to reason about which, if either, is 
> better.
>
> I use HTTP. On the initial connection from a client, I set up a session 
> object, and the session id is passed to the client. All subsequent requests 
> from that client include the session id, and the request is passed to the 
> session object for handling.
>
> It is possible for a new request to be received from a client before the 
> previous one has been completed, and I want each request to be handled 
> atomically, so each session maintains its own asyncio.Queue(). The main 
> routine gets the session id from the request and 'puts' the request in the 
> appropriate queue. The session object 'gets' from the queue and handles the 
> request. It works well.
>
> The question is, how to arrange for each session to 'await' its queue. My 
> first attempt was to create a background task for each session which runs for 
> the life-time of the session, and 'awaits' its queue. It works, but I was 
> concerned about having a lot of background tasks active at the same time.
>
> Then I came up with what I thought was a better idea. On the initial 
> connection, I create the session object, send the response to the client, and 
> then 'await' the method that sets up the session's queue. This also works, 
> and there is no background task involved. However, I then realised that the 
> initial response handler never completes, and will 'await' until the session 
> is closed.
>
> Is this better, worse, or does it make no difference? If it makes no 
> difference, I will lean towards the first approach, as it is easier to reason 
> about what is going on.
>
> Thanks for any advice.
>
> Frank Millman
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Asyncio question

2020-02-21 Thread Frank Millman

On 2020-02-21 11:13 PM, Greg Ewing wrote:

> On 21/02/20 7:59 pm, Frank Millman wrote:
>> My first attempt was to create a background task for each session
>> which runs for the life-time of the session, and 'awaits' its queue.
>> It works, but I was concerned about having a lot of background tasks
>> active at the same time.
>
> The whole point of asyncio is to make tasks very lightweight, so you
> can use as many of them as is convenient without worries. One task
> per client sounds like the right thing to do here.



Perfect. Thanks so much.

Frank

--
https://mail.python.org/mailman/listinfo/python-list


Re: Asyncio question

2020-02-21 Thread Greg Ewing

On 21/02/20 7:59 pm, Frank Millman wrote:
> My first attempt was to create a background task for each session which
> runs for the life-time of the session, and 'awaits' its queue. It works,
> but I was concerned about having a lot of background tasks active at the
> same time.


The whole point of asyncio is to make tasks very lightweight, so you
can use as many of them as is convenient without worries. One task
per client sounds like the right thing to do here.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Asyncio question

2020-02-20 Thread Frank Millman

Hi all

I use asyncio in my project, and it works very well without my having to 
understand what goes on under the hood. It is a multi-user client/server 
system, and I want it to scale to many concurrent users. I have a 
situation where I have to decide between two approaches, and I want to 
choose the least resource-intensive, but I find it hard to reason about 
which, if either, is better.


I use HTTP. On the initial connection from a client, I set up a session 
object, and the session id is passed to the client. All subsequent 
requests from that client include the session id, and the request is 
passed to the session object for handling.


It is possible for a new request to be received from a client before the 
previous one has been completed, and I want each request to be handled 
atomically, so each session maintains its own asyncio.Queue(). The main 
routine gets the session id from the request and 'puts' the request in 
the appropriate queue. The session object 'gets' from the queue and 
handles the request. It works well.


The question is, how to arrange for each session to 'await' its queue. 
My first attempt was to create a background task for each session which 
runs for the life-time of the session, and 'awaits' its queue. It works, 
but I was concerned about having a lot of background tasks active at the 
same time.


Then I came up with what I thought was a better idea. On the initial 
connection, I create the session object, send the response to the 
client, and then 'await' the method that sets up the session's queue. 
This also works, and there is no background task involved. However, I 
then realised that the initial response handler never completes, and 
will 'await' until the session is closed.


Is this better, worse, or does it make no difference? If it makes no 
difference, I will lean towards the first approach, as it is easier to 
reason about what is going on.


Thanks for any advice.

Frank Millman

--
https://mail.python.org/mailman/listinfo/python-list


RE: asyncio Question

2019-03-15 Thread Joseph L. Casale
On Thursday, March 14, 2019 at 3:03 AM, Simon Connah wrote:
> 
> Hi,
> 
> Hopefully this isn't a stupid question. For the record I am using Python
> 3.7 on Ubuntu Linux.
> 
> I've decided to use asyncio to write a TCP network server using Streams
> and asyncio.start_server(). I can handle that part of it without many
> problems as the documentation is pretty good. I have one problem though
> that I am not sure how to solve. Each request to the server will be a
> JSON string and one of the components of the JSON string will be the
> latitude and longitude. What I want to do is associate a latitude and
> longitude with the client connection. Every time the latitude and
> longitude changes then the two values will be updated. There will only
> ever be one latitude and longitude for each client connection (since a
> device can only ever be in one place). I'm just not sure what the best
> method to achieve this would be when using asyncio.start_server().

Since you expect the client to provide the latitude and longitude, what
happens when it misbehaves, either by accident or intentionally, and all
of a sudden you have several connections reporting the same values?

You'd be better off relying on your own means of differentiating them.
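
For what it's worth, one straightforward way to hold per-connection state
with asyncio.start_server() is to keep it in local variables of the
handler coroutine, which lives exactly as long as the connection (a
sketch; the 'lat'/'lon' JSON fields and the newline-delimited framing are
assumptions about the protocol):

import asyncio
import json

async def handle_client(reader, writer):
    # Local state lives for exactly one connection.
    position = {'lat': None, 'lon': None}
    # peername is unique per TCP connection; it could also key a
    # shared dict if the state must outlive the handler.
    peer = writer.get_extra_info('peername')
    try:
        while True:
            line = await reader.readline()
            if not line:
                break  # client disconnected
            msg = json.loads(line)
            position['lat'] = msg.get('lat', position['lat'])
            position['lon'] = msg.get('lon', position['lon'])
    finally:
        writer.close()
        await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle_client, '127.0.0.1', 8888)
    async with server:
        await server.serve_forever()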
-- 
https://mail.python.org/mailman/listinfo/python-list


asyncio Question

2019-03-14 Thread Simon Connah

Hi,

Hopefully this isn't a stupid question. For the record I am using Python 
3.7 on Ubuntu Linux.


I've decided to use asyncio to write a TCP network server using Streams 
and asyncio.start_server(). I can handle that part of it without many 
problems as the documentation is pretty good. I have one problem though 
that I am not sure how to solve. Each request to the server will be a 
JSON string and one of the components of the JSON string will be the 
latitude and longitude. What I want to do is associate a latitude and 
longitude with the client connection. Every time the latitude and 
longitude changes then the two values will be updated. There will only 
ever be one latitude and longitude for each client connection (since a 
device can only ever be in one place). I'm just not sure what the best 
method to achieve this would be when using asyncio.start_server().


Any help would be greatly appreciated.

--
https://mail.python.org/mailman/listinfo/python-list


Re: asyncio question

2016-12-13 Thread Frank Millman
"Ian Kelly"  wrote in message 
news:CALwzid=vdczAH18mHKaL7ryvDUB=7_y-JVUrTkRZ=gkz66p...@mail.gmail.com...


> On Tue, Dec 13, 2016 at 6:15 AM, Frank Millman wrote:
>> The client uses AJAX to send messages to the server. It sends the
>> message and continues processing, while a background task waits for the
>> response and handles it appropriately. As a result, the client can send
>> a second message before receiving a response to the first one. The
>> server can detect this, but it cannot wait for the first message to
>> complete, otherwise it will block other clients. I have not noticed any
>> problems with processing 2 requests from the same client concurrently,
>> but I don't like it, so I want to process them sequentially.
>
> Is there a particular reason why you're worried about this? The
> browser is perfectly capable of keeping the requests straight. Also,
> note that since the requests use separate HTTP connections, even if
> the server sends its responses in a particular order, there's no
> guarantee that the client will read them in that order, so this
> doesn't free you from the need to allow the client to handle the
> requests coming back in any order.



I had not thought of that, thanks. In fact, more to the point in my case, I 
assume that there is no guarantee that the server will receive the requests 
in the same order that the client sends them.


The particular reason for my concern was that each request can change the 
state of the session, and I wanted to be sure that the state has been fully 
updated before processing the next request.


One scenario that I had not thought of is that the requests could be 
received out of sequence. I don't think this will be a problem, but I will 
have to think about it.


The second scenario, which was my main concern, is that the server starts 
processing the second request before processing of the first request has 
been completed, meaning that the session data may not be in a stable state. 
My proposed solution solves this problem.




> In a 64-bit Linux build of CPython, the combined size of a generator
> and a stack frame is around half a kilobyte (not including whatever
> space is needed for local variables), so hundreds of yielding asyncio
> tasks should consume no more than hundreds of kilobytes of memory.
> Remember that half a kilobyte figure is per generator, not per task,
> so if your while loop is four coroutines deep, that will inflate the
> cost of the task to two kilobytes each. This is different from the
> threading model where each thread would need its own separate stack
> space, not just a frame on the heap; you probably wouldn't want to do
> this with threads, but with coroutines it should be fine.



Good to know, thanks. I will proceed on the assumption that if anyone runs 
my system with hundreds of users, they will run it on some serious hardware, 
so I should be safe.


Frank


--
https://mail.python.org/mailman/listinfo/python-list


Re: asyncio question

2016-12-13 Thread Ian Kelly
On Tue, Dec 13, 2016 at 6:15 AM, Frank Millman  wrote:
> The client uses AJAX to send messages to the server. It sends the message
> and continues processing, while a background task waits for the response and
> handles it appropriately. As a result, the client can send a second message
> before receiving a response to the first one. The server can detect this,
> but it cannot wait for the first message to complete, otherwise it will
> block other clients. I have not noticed any problems with processing 2
> requests from the same client concurrently, but I don't like it, so I want
> to process them sequentially.

Is there a particular reason why you're worried about this? The
browser is perfectly capable of keeping the requests straight. Also,
note that since the requests use separate HTTP connections, even if
the server sends its responses in a particular order, there's no
guarantee that the client will read them in that order, so this
doesn't free you from the need to allow the client to handle the
requests coming back in any order.

> Here is my solution. As I create each Session instance, I set up a
> background task, using asyncio.ensure_future, which sets up an
> asyncio.Queue. The request handler identifies the session that the request
> belongs to, and 'puts' the request onto that session's Queue. The background
> task runs a 'while True' loop waiting for requests. As they come in it
> 'gets' them and processes them. It seems to work.
>
> This means that I have a background task running for each concurrent user.
> Each one will be idle most of the time. My gut-feel says that this will not
> cause a problem, even if there are hundreds of them, but any comments will
> be welcome.

In a 64-bit Linux build of CPython, the combined size of a generator
and a stack frame is around half a kilobyte (not including whatever
space is needed for local variables), so hundreds of yielding asyncio
tasks should consume no more than hundreds of kilobytes of memory.
Remember that half a kilobyte figure is per generator, not per task,
so if your while loop is four coroutines deep, that will inflate the
cost of the task to two kilobytes each. This is different from the
threading model where each thread would need its own separate stack
space, not just a frame on the heap; you probably wouldn't want to do
this with threads, but with coroutines it should be fine.
-- 
https://mail.python.org/mailman/listinfo/python-list


asyncio question

2016-12-13 Thread Frank Millman

Hi all

I had a problem with asyncio - not a programming problem, but one with 
organising my code to achieve a given result.


I have come up with a solution, but thought I would mention it here to see 
if there is a better approach.


I am using asyncio.start_server() to run a simple HTTP server. Each request 
is passed to a handler. I strive to complete each request as quickly as 
possible, but I use 'await' where necessary to prevent blocking. It all 
works well.


HTTP does not keep the connection open: it sends a message, waits for a 
response, and closes the connection. In order to maintain 'state' for 
concurrent users, I have a Session class, and each message contains a 
session id so that I can pass the message to the correct session instance. 
Again, it all works well.


The client uses AJAX to send messages to the server. It sends the message 
and continues processing, while a background task waits for the response and 
handles it appropriately. As a result, the client can send a second message 
before receiving a response to the first one. The server can detect this, 
but it cannot wait for the first message to complete, otherwise it will 
block other clients. I have not noticed any problems with processing 2 
requests from the same client concurrently, but I don't like it, so I want 
to process them sequentially.


Here is my solution. As I create each Session instance, I set up a 
background task, using asyncio.ensure_future, which sets up an 
asyncio.Queue. The request handler identifies the session that the request 
belongs to, and 'puts' the request onto that session's Queue. The background 
task runs a 'while True' loop waiting for requests. As they come in it 
'gets' them and processes them. It seems to work.


This means that I have a background task running for each concurrent user. 
Each one will be idle most of the time. My gut-feel says that this will not 
cause a problem, even if there are hundreds of them, but any comments will 
be welcome.


Thanks

Frank Millman


--
https://mail.python.org/mailman/listinfo/python-list


asyncio question

2014-03-13 Thread Joseph L. Casale
I have a portion of code I need to speed up: there are 3 API calls to an
external system, where the first enumerates a large collection of objects
that I then loop through, performing two additional API calls for each.
The first call is instant; the second and third, per object, are very
slow. Currently, after accumulating all the data, I write the relevant
data into a database.

I have the ability to hold all this in memory and dump it once fully
accumulated, so performing the second and third calls in parallel, in
fixed batches, would be great.

I took a look at coroutines and some skeleton code worked fine, but I am
not sure how to perform the acquisition in fixed groups, as I might, for
example, with multiprocessing and a pool of workers.
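
One common way to do this is to bound concurrency with a semaphore rather
than fixed batches (a sketch in modern async/await syntax rather than the
generator-based coroutines of 2014-era asyncio; fetch_details() and
fetch_extra() are hypothetical stand-ins for the two slow per-object
calls):

import asyncio

MAX_CONCURRENT = 10  # effectively the size of the worker pool

async def fetch_details(obj):
    ...  # stand-in for the slow second API call

async def fetch_extra(obj):
    ...  # stand-in for the slow third API call

async def process(obj, sem):
    async with sem:
        # At most MAX_CONCURRENT of these bodies run at once.
        details = await fetch_details(obj)
        extra = await fetch_extra(obj)
        return obj, details, extra

async def main(objects):
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    # Everything is accumulated in memory, then can be written
    # to the database in one pass.
    return await asyncio.gather(
        *(process(obj, sem) for obj in objects))

Each process() call waits on the semaphore, so this behaves like a pool
of MAX_CONCURRENT workers without slicing the collection into explicit
batches.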

Anyone done something like this or have an opinion?

Thanks,
jlc
-- 
https://mail.python.org/mailman/listinfo/python-list