At 02:04 PM 4/10/2010 +0100, Chris Dent wrote:
> I realize I'm able to build up a complete string or yield via a
> generator, or a whole bunch of various ways to accomplish things
> (which is part of why I like WSGI: that content is just an iterator,
> that's a good thing) so I'm not looking for a statement of what is or
> isn't possible, but rather opinions. Why is yielding lots of moderately
> sized strings *very bad*? Why is it _not_ very bad (as presumably
> others think)?

How bad it is depends a lot on the specific middleware, server architecture, OS, and what else is running on the machine. The more layers of architecture you have, the worse the overhead is going to be.

The main reason, though, is that alternating control between your app and the server means increased request lifetime and worsened average request completion latency.

Imagine that I have five tasks to work on right now. Let us say each takes five units of time to complete. If I have five units of time right now, I can either finish one task now, or partially finish five. If I work on them in an interleaved way, *none* of the tasks will be done until 25 units have elapsed, and so all of them will have a completion latency of 25 units.

If I work on them one at a time, however, then one task will be done in 5 units, the next in 10, and so on -- for an average latency of only 15 units. And that is *not* counting any task switching overhead.

But it's *worse* than that, because by multitasking, my task queue has five things in it the whole time... so I am using more memory and have more management overhead, as well as task switching overhead.
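Just to put that arithmetic in runnable form (this is only a toy sketch, using the same simplification as above, where nothing in the interleaved run counts as finished until all the work is done):

    # Toy sketch of the arithmetic above: five tasks, five units of work each,
    # comparing one-at-a-time completion against fully interleaved completion.
    TASKS = 5
    UNITS = 5

    # One at a time: task i finishes after (i + 1) * UNITS units of work.
    sequential = [(i + 1) * UNITS for i in range(TASKS)]

    # Interleaved: as in the example above, treat every task as finishing
    # only when all the work is done (TASKS * UNITS units).
    interleaved = [TASKS * UNITS] * TASKS

    print(sequential, sum(sequential) / TASKS)    # [5, 10, 15, 20, 25] 15.0
    print(interleaved, sum(interleaved) / TASKS)  # [25, 25, 25, 25, 25] 25.0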

If you translate this to the architecture of a web application, where the "work" is the server serving up bytes produced by the application, then you will see that if the application serves up small chunks, the web server is effectively forced to multitask and keep more application instances running at once -- with increased latency, increased memory usage, and so on.

However, if the application hands its entire output to the server, then the "task" is already *done* -- the server doesn't need the thread or child process for that app anymore, and can have it do something else while the I/O is happening. The OS is in a better position to interleave its own I/O with the app's computation, and the overall request latency is reduced.
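To translate that into WSGI terms, here's a minimal sketch contrasting the two styles (the app names and body content are made up for illustration):

    def chunky_app(environ, start_response):
        # Yields many tiny chunks: the server has to bounce control back to
        # the app for every piece, so the request stays "in progress" longer.
        start_response('200 OK', [('Content-Type', 'text/plain')])
        for i in range(1000):
            yield ('line %d\n' % i).encode('ascii')

    def buffered_app(environ, start_response):
        # Builds the whole body up front and hands it over in one piece; once
        # it returns, the server no longer needs the app to finish the request.
        start_response('200 OK', [('Content-Type', 'text/plain')])
        body = b''.join(('line %d\n' % i).encode('ascii') for i in range(1000))
        return [body]

Both are legal WSGI; the difference is just how many times the server has to come back to the application before it can consider the response finished.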

Is this a big emergency if your server's mostly idle? Nope. Is it a problem if you're writing a CGI program or using some other direct output API that doesn't flush on every write? Not at all. I/O buffering works just fine for making sure the data is handed off in bigger chunks.
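For instance, here's a small sketch of buffered I/O coalescing a pile of tiny writes into one hand-off (the CountingSink class is just a stand-in I've made up for a socket or pipe):

    import io

    class CountingSink(io.RawIOBase):
        # Stand-in for the real output channel: counts hand-offs and bytes.
        def __init__(self):
            self.handoffs = 0
            self.total = 0
        def writable(self):
            return True
        def write(self, b):
            self.handoffs += 1
            self.total += len(b)
            return len(b)

    sink = CountingSink()
    out = io.BufferedWriter(sink, buffer_size=8192)

    for i in range(100):
        out.write(b'tiny chunk\n')        # 100 small writes...

    print(sink.handoffs, sink.total)      # 0 0    -- still sitting in the buffer
    out.flush()
    print(sink.handoffs, sink.total)      # 1 1100 -- one big hand-off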

But if you're coding up a WSGI framework, you don't really want to have it sending tiny chunks of data up a stack of middleware, because WSGI doesn't *have* any buffering, and each chunk is supposed to be sent *immediately*.

Well-written web frameworks usually do some degree of buffering already, for API and performance reasons, so for simplicity's sake, WSGI was spec'd assuming that applications would send data in already-buffered chunks.

(Specifically, the simplicity of not needing to have an explicit flushing API, which would otherwise have been necessary if middleware and servers were allowed to buffer the data, too.)
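In practice, that framework-side buffering can be as simple as coalescing the application's small writes before anything is handed to WSGI. A rough sketch (the helper name and the 8K threshold are invented):

    # Hypothetical helper a framework might use to coalesce an application's
    # small byte strings into reasonably sized WSGI chunks before yielding
    # them to the server.
    def coalesce(pieces, min_chunk=8192):
        buf, size = [], 0
        for piece in pieces:
            buf.append(piece)
            size += len(piece)
            if size >= min_chunk:
                yield b''.join(buf)
                buf, size = [], 0
        if buf:
            yield b''.join(buf)           # whatever is left at the end

The server still sends each yielded chunk immediately, as the spec requires; the buffering happens entirely on the application's side of the line.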
