> On May 6, 2016, at 1:45 PM, Andrew Godwin <and...@aeracode.org> wrote:
> 
> Want to just cover a few more things I didn't in my reply to Aymeric.
> 
> On Fri, May 6, 2016 at 9:11 AM, Donald Stufft <don...@stufft.io 
> <mailto:don...@stufft.io>> wrote:
> 
> In short, I think that the message bus adds an additional layer of
> complexity for very little actual gain over other possible, less complex
> solutions. The message bus also removes a key part of the control that the
> server which is *actually* receiving the connection has over the lifetime
> and processing of the eventual request.
> 
> True; however, having a message bus/channel abstraction also removes a
> layer of complexity: caring about socket handling, and sinking your
> performance by doing even a slightly blocking operation.
> 
> In an ideal world we'd have some magical language that let us all write
> amazing async code and detected all possible deadlocks or livelocks before
> they happened, but that's not yet the case, and I think the worker model
> has been a good substitute for it in software design generally.
> 
> 
> For an example: in traditional HTTP servers, where you have an open
> connection associated with whatever view code you're running, whenever the
> client disconnects you're given a few options, and the most common one in
> my experience is that once the connection has been lost the HTTP server
> cancels the execution of whatever view code it had been running [1]. This
> allows a single process to serve more by shedding the load of connections
> that have since disconnected. In ASGI, however, since there's no way to
> remove an item from the queue or cancel it once it has begun to be
> processed by a worker process, you lose this ability to shed the load of a
> request once it has already been scheduled.
> 
> But as soon as you introduce a layer like Varnish into the equation, you've 
> lost this anyway, as you're no longer seeing the true client socket. 
> Abandoned requests are an existent problem with HTTP and WSGI; I see them in 
> our logs all the time.


I don’t believe that to be true. For example: the client connects to Varnish,
Varnish connects to h2o, h2o connects to gunicorn which is running WSGI. The
client closes the connection to Varnish, so Varnish closes the connection to
h2o, so h2o closes the connection to gunicorn, which can then throw a
SystemExit exception and halt execution of the code.
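
That propagation can be seen in miniature with a plain socket pair. This is
an illustration of the principle, not gunicorn’s actual implementation:

```python
import socket

def stream_response(conn):
    """Stream a large response; halt as soon as the peer is gone.

    This mimics what a WSGI server can do: once the hop in front of
    us closes the connection, writes fail and execution stops,
    shedding the load of an abandoned request.
    """
    chunks_sent = 0
    try:
        for _ in range(1000):
            conn.sendall(b"x" * 65536)
            chunks_sent += 1
    except (BrokenPipeError, ConnectionResetError):
        # The client (or the proxy in front of us) disconnected --
        # analogous to gunicorn raising SystemExit in the view.
        pass
    return chunks_sent

server_side, client_side = socket.socketpair()
client_side.close()                  # the "client" goes away mid-request
sent = stream_response(server_side)  # stops well short of 1000 chunks
server_side.close()
```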

> 
> 
> This additional complexity incurred by the message bus also ends up
> requiring additional complexity layered onto ASGI to re-invent some of the
> "natural" features of TCP and/or HTTP (or whatever the underlying protocol
> is). An example of this is the ``order`` keyword in the WebSocket spec,
> something that isn't required and just naturally happens whenever you're
> directly connected to a websocket, because the ``order`` is just whatever
> bytes come in off the wire. This also shows up in other features, like
> backpressure: ASGI didn't previously have a concept of allowing the queue
> to apply backpressure to the web connection, but now Andrew has started to
> come around to the idea of adding a bound to the queue (which is good!).
> If the indirection of the message bus hadn't been added, though,
> backpressure would have occurred naturally: once enough things were being
> processed that new connections stopped being ``accept``-ed, the listen
> backlog would fill up and new connections would block waiting to connect.
> It's good that Andrew is adding the ability to bound the queue, but that's
> something that will require care to tune in each individual deployment
> (and will need to be regularly re-evaluated) rather than something that
> just occurs naturally as a consequence of the design of the system.
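
That "natural" backpressure can be demonstrated with nothing but the socket
module. This is a sketch; the exact backlog accounting varies slightly by
operating system:

```python
import socket

# A server that listens with a tiny backlog and never calls accept().
# Completed connections queue up in the kernel; once that queue is
# full, further connection attempts stall instead of completing --
# backpressure with no explicit queue tuning in the application.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)  # deliberately tiny accept backlog
addr = srv.getsockname()

clients, completed = [], 0
for _ in range(6):
    c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    c.settimeout(0.5)
    try:
        c.connect(addr)  # stalls (then times out) once the backlog is full
        completed += 1
    except OSError:
        pass
    clients.append(c)

# Only the first couple of connections complete; the rest are held
# back because nothing is accept()-ing.
for c in clients:
    c.close()
srv.close()
```
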
> 
> Client buffers in OSs were also manually tuned to begin with; I suspect we
> can home in on how to make this work best over time once we have more
> experience with how it runs in the wild. I don't disagree that I'm
> reinventing existing features of TCP sockets, but it's also a mix of UDP
> features too; there's a reason a lot of modern protocols are built on UDP
> instead of TCP, and I'm trying to strike the balance.
> 
> 
> Anytime you add a message bus you need to make a few trade-offs; the
> particular trade-off that ASGI made is that it should prefer "at most
> once" delivery of messages and low latency over guaranteed delivery. That
> choice is likely one of the sanest you can make for the design of ASGI,
> but with it you end up with new problems that don't exist otherwise. For
> example, HTTP/1.1 has the concept of pipelining, which allows you to make
> several HTTP requests on a single connection without waiting for the
> responses before sending each one. Given the nature of ASGI it would be
> very difficult to support this feature without either violating the RFC
> or forcing either Daphne or the queue to buffer potentially huge responses
> while waiting for an earlier request to finish. Again, you get this for
> free using either async IO (you just don't await the result of the second
> request until the first request has been processed) or WSGI with
> generators (you just don't iterate over the result until you're ready for
> it).
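
The WSGI-generator half of that claim is easy to see in miniature (a toy
illustration, not real server code):

```python
def view(label, events):
    # A WSGI-style response body as a generator: none of the work
    # below runs until the server actually iterates the response.
    events.append(f"{label} started")
    yield f"{label} body".encode()
    events.append(f"{label} finished")

events = []
first = view("req1", events)   # first pipelined request
second = view("req2", events)  # second request, parked unevaluated

# The server drains the first response completely before touching
# the second, so responses go out in order with nothing buffered.
body1 = b"".join(first)
body2 = b"".join(second)
```
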
> 
> Even with asyncio that data has to be buffered somewhere, whether it's in
> the client transmit buffer, the receiving OS buffer, or Python memory. If
> Daphne refuses to read() more from a socket it got an HTTP/1.1 pipelined
> request on before the response for the first one comes back, that would
> achieve the same effect as asyncio, no? (This may in fact be what it does
> already; I need to check the twisted.web pipeline handling.)

The point is that it doesn’t have to be (entirely) buffered anywhere. You
stop producing the data when your buffer fills up, until the consumer of
that buffer drains it and it’s available for more data again. You’re not
just growing a buffer unbounded.
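
A bounded buffer captures the idea: the producer is paced by the consumer
rather than growing memory without limit. A generic sketch, not Daphne’s or
asyncio’s internals:

```python
import queue
import threading

# A bounded buffer: put() blocks while the buffer is full, so the
# producer stops generating data until the consumer drains it --
# nothing is ever held beyond the fixed capacity of four items.
buf = queue.Queue(maxsize=4)
consumed = []

def producer():
    for i in range(20):
        buf.put(i)   # blocks whenever four items are outstanding
    buf.put(None)    # sentinel: production finished

def consumer():
    while True:
        item = buf.get()
        if item is None:
            break
        consumed.append(item)

threads = [threading.Thread(target=producer),
           threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```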

> 
> 
> ASGI purports to make it easier to gracefully restart your servers by
> making it possible to restart the worker servers (since there are no
> long-lived open connections to them) and simply spin up new ones. However,
> that's not the whole story: it's only true as long as your code changes
> don't touch something that Daphne needs to be aware of in order to process
> incoming requests. As soon as Daphne needs to be restarted you're back to
> needing another solution for graceful restarts, and since Daphne depends
> on project-specific code, it will need restarting much more frequently
> than solutions that don't. It seems difficult to automatically determine
> whether or not Daphne needs a restart on any particular deployment, so it
> will be common for people to just restart the whole stack anyway.
> 
> Daphne only depends on one tiny piece of project code, the channel layer 
> configuration. I don't imagine that changing nearly as often as actual 
> business logic. You're right that once there's a new Daphne version or that 
> config changes, it needs a restart too, but that's not going to be very 
> common.

Right, but in an automated system it’ll be difficult to determine whether
Daphne or the worker processes need to be restarted. A human could figure it
out, but a machine would need to trace Python code to work out whether
Daphne is affected.

> 
> 
> So what sort of solution would I personally advocate, had I the time or
> energy to do so? I would look toward what sort of pure Python API (like
> WSGI itself) could be added to allow a web server to pass websockets down
> into Django. I admit that in some cases people would then need to layer on
> their own message buses (since that's just about the only reasonable way
> to implement something like Group().send()), but even there they'd get
> added gains and "for free" features by using something that specializes in
> this sort of multicast messaging (a pub/sub message bus, more or less). Of
> course currently no web servers would support whatever this new "WSGI but
> for WebSockets" would be, so you'd need to implement something like Daphne
> to handle it in the interim (or possibly forever, if nobody implemented
> it), but that's the same situation as with ASGI now.
> 
> Scaling out to multiple processes and graceful restarts would be handled
> the way they are today: you'd have some master process that isn't specific
> to the Django code (like Daphne is) that would spin up new processes,
> start sending traffic to them, and then close out the old processes. This
> generalizes past a single machine too, where you'd have something like
> HAProxy load balancing between machines, able to gracefully stop sending
> requests to one instance and start sending them to a new instance. For
> WebSockets, anytime you have a persistent connection to your worker you'll
> need some way to trigger your clients to disconnect and reconnect (so they
> get scheduled onto the new server/process), but that's something you'll
> need with ASGI anytime you need to restart Daphne anyway (and since the
> thing initiating the restart there is tied to your application code, a
> hook can be provided that gets called on shutdown to let the application
> tell clients to reconnect).
> 
> In this solution, since everything is just HTTP (or WebSockets, or
> whatever) all the way down, you get to reuse all of the battle-tested
> pieces that already exist, like HAProxy. It's also easier to drop in
> another piece, possibly written in another language or technology, since
> everything in the stack speaks HTTP/WebSocket and you don't have to go and
> teach, say, Erlang how to speak ASGI.
> 
> 
> I agree with the desire to use things like HAProxy in the stack, but I
> think your idea of handling WebSockets natively in Django is far more
> difficult and fragile than Channels is, mostly due to our ten-year history
> of synchronous code. We would have to audit a large amount of the codebase
> to ensure it was all async compatible, not to mention drop Python 2
> support, before we'd even get close.

You don’t need to write it asynchronously. You need an async server, but
that async server can execute synchronous code just fine using something
like deferToThread. That’s how ``twistd -n web --wsgi`` works today: it gets
a request and deferToThread’s it to synchronous WSGI code.
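
The same pattern is available in the standard library: in this sketch,
asyncio’s run_in_executor plays the role of Twisted’s deferToThread (which
may not be installed everywhere). An async front end serving ordinary
blocking view code:

```python
import asyncio
import threading
import time

def sync_view(request):
    # Ordinary blocking, synchronous code -- e.g. an existing WSGI view.
    time.sleep(0.05)
    return f"{request} handled on {threading.current_thread().name}"

async def handle(request):
    # The async server never blocks: the synchronous view runs in a
    # worker thread, asyncio's analogue of Twisted's deferToThread.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, sync_view, request)

async def main():
    # Two requests served concurrently despite the blocking views.
    return await asyncio.gather(handle("req1"), handle("req2"))

results = asyncio.run(main())
```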

> 
> I'm not saying my solution is perfect, I'm saying it's pragmatic given our
> current position and likely future position. Channels adds a spectrum to
> Django where you can run it on anything from a single process to a single
> machine (with the IPC channel layer) to a cluster of machines.
> 
> I look forward to Python async being in a better place in five to ten years 
> so we can revisit this and improve things (but hopefully keep a similar 
> end-developer API, which I think is quite nice to use and reflects URL 
> routing and view writing in a nice way), but I believe we need something that 
> works well now, which means taking a few tradeoffs along the way; after all, 
> it's not going to be forced on anyone, WSGI will still be there for a long 
> time to come*.
> 
> (*At least until I get around to working out what an in-process asyncio WSGI 
> replacement with WebSocket support might look like)
> 
> Andrew
> 
> --
> You received this message because you are subscribed to the Google Groups 
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-developers+unsubscr...@googlegroups.com 
> <mailto:django-developers+unsubscr...@googlegroups.com>.
> To post to this group, send email to django-developers@googlegroups.com 
> <mailto:django-developers@googlegroups.com>.
> Visit this group at https://groups.google.com/group/django-developers 
> <https://groups.google.com/group/django-developers>.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/CAFwN1urfvxwUsGSsk3UHLMqZwrqTYfaCvgFQqfFqM%2BiGtkRUmg%40mail.gmail.com
>  
> <https://groups.google.com/d/msgid/django-developers/CAFwN1urfvxwUsGSsk3UHLMqZwrqTYfaCvgFQqfFqM%2BiGtkRUmg%40mail.gmail.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout 
> <https://groups.google.com/d/optout>.


-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
