Re: [Web-SIG] WSGI start_response exc_info argument

Phillip J. Eby Tue, 05 Apr 2005 18:22:47 -0700

At 03:51 PM 4/5/05 -0500, Ian Bicking wrote:

Phillip J. Eby wrote:
But I don't mind all of that, because it is only contained in the error catching middleware and nowhere else. I have other middleware that overrides start_response, and don't want to bother with all the exc_info in that case.
Just pass it through to the upstream start_response; the top-level server is the only one that needs to care.
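As a rough illustration of "just pass it through", here is a minimal sketch of middleware that overrides start_response for an unrelated purpose. The names add_header_middleware and extra_header are hypothetical, invented for this example; the only point is that exc_info is forwarded untouched.

```python
def add_header_middleware(app, extra_header=("X-Example", "1")):
    """Hypothetical middleware: appends one header, ignores error handling."""
    def wrapped_app(environ, start_response):
        def wrapped_start_response(status, headers, exc_info=None):
            new_headers = list(headers) + [extra_header]
            # Forward exc_info unchanged. Whether the response can still be
            # restarted is decided by whatever sits upstream, not here.
            return start_response(status, new_headers, exc_info)
        return app(environ, wrapped_start_response)
    return wrapped_app
```

The middleware keeps no "has the response started?" flag of its own; it only relays the call.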

And a lot of the logic -- like trying to show errors even when there's been a partial response -- is just work, there's no way to get around it.
So leave it to the server. All I'm saying is that there is no need to track whether the response has started. It's the server's job to know that, and the opinion of middleware doesn't count here. As long as the *server* hasn't sent the headers yet, you can restart the response.
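For concreteness, the server-side bookkeeping might look roughly like the sketch below. TinyServerResponse is a hypothetical stand-in, not a real server, and it is written in Python 3 syntax; it only shows that the headers_sent flag lives in one place and that start_response re-raises exc_info once headers have gone out.

```python
class TinyServerResponse:
    """Hypothetical server-side response state, for illustration only."""

    def __init__(self):
        self.status = None
        self.headers = None
        self.headers_sent = False
        self.output = []

    def start_response(self, status, headers, exc_info=None):
        if exc_info:
            try:
                if self.headers_sent:
                    # Too late to restart: hand the original error back.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid keeping the traceback alive
        elif self.status is not None:
            raise AssertionError("start_response called twice without exc_info")
        # Headers not sent yet, so the response can be (re)started.
        self.status, self.headers = status, list(headers)
        return self.write

    def write(self, data):
        # A real server would emit the status line and headers here,
        # just before the first body data.
        self.headers_sent = True
        self.output.append(data)
```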
My concern is mostly that it is error-prone to leave it to the server, because it's not something you can pass upward easily (AFAICT).

I don't understand. If you want to implement an in-stream recovery middleware, you certainly can. (But even then you don't need to track the state; if start_response() raises an error, you know the server above you has already sent the headers. So you can always trap the error from start_response in order to *know* that you need in-stream recovery.)
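A minimal sketch of that detection step, assuming a hypothetical helper named try_restart: instead of tracking state, it simply attempts the restart and treats an exception from start_response as the signal that only in-stream recovery remains.

```python
def try_restart(start_response, status, headers, exc_info):
    """Hypothetical helper: return (write, True) if the restart succeeded.

    Returns (None, False) when the upstream start_response raised, which
    means the headers were already sent by whatever sits above us.
    """
    try:
        write = start_response(status, headers, exc_info)
    except Exception:
        # The spec says the server *should* re-raise exc_info but may raise
        # something else, so any exception is treated as "already started".
        return None, False
    return write, True
```

What to do in the (None, False) case is the hard, application-specific part, and the sketch deliberately leaves it out.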

And for anything that's *not* in-stream recovery middleware, you shouldn't care; just call start_response with exc_info and proceed about your business. If there is an upstream handler, it will throw exc_info back at you if need be, then catch the error after it breaks out of your code.

The purpose of exc_info is to simplify application-level error handlers; they just pass the exc_info and proceed as they would for pre-stream recovery. If pre-stream recovery is impossible, the error handler will be aborted and the server (or error-handling middleware) gets to take over.
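An application-level error handler along these lines could be sketched as follows. make_app and render_page are hypothetical names used only for this example, and the code is Python 3 style; the handler behaves the same whether or not output has begun, and if the restart is impossible the exception raised by start_response simply propagates out of the application.

```python
import sys


def make_app(render_page):
    """Build a WSGI app around a hypothetical render_page(environ) -> bytes."""
    def app(environ, start_response):
        try:
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [render_page(environ)]
        except Exception:
            exc_info = sys.exc_info()
            try:
                # If headers were already sent, this call raises and the
                # handler is aborted; the server or middleware takes over.
                start_response(
                    "500 Internal Server Error",
                    [("Content-Type", "text/plain")],
                    exc_info,
                )
            finally:
                exc_info = None  # avoid a traceback reference cycle
            return [b"A server error occurred."]
    return app
```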

I know my middleware is mostly not compliant with this part of the spec, and it's not even clear to me how I'd fix them all. I'm sure I could figure it out, but most of WSGI doesn't require deep thought (and I like that), and this part doesn't feel like that to me.

I think you're over-analyzing it and there isn't anything complex except the case of an in-stream handler, which is inherently complex due to the task. But there's nothing stopping you writing middleware that lies to its downstream application when called with exc_info, by *not* re-throwing exc_info but instead attempting recovery. Technically, this is against the letter of the spec, which says that if HTTP headers have been output you must abort. (Although this is then loosened in the Error Handling section to say that error-handling middleware can just return without an exception.)

I personally still believe, however, that leaving middleware out of in-stream recovery is by far the best course of action, because a good framework will buffer its output for the majority of human-readable pages, so in-stream recovery is only needed for streaming data or large files, where *only the application* knows what the safe way to recover is! Therefore, having middleware attempt in-stream recovery is IMO inherently unsafe, unless it is tuned for precisely that particular application, amounting to little more than a monkey patch for that specific scenario.

To put it another way, if you think you need this, it's probably because the application isn't buffering properly. In the common case, a WSGI application *should* be sending its output as a single block. (See http://www.python.org/peps/pep-0333.html#buffering-and-streaming for details.)
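A minimal sketch of the buffered, single-block style, assuming a hypothetical render_chunks(environ) that yields byte strings: everything is rendered before start_response is called, so a rendering error surfaces while the response can still be replaced.

```python
def make_buffered_app(render_chunks):
    """Wrap a hypothetical render_chunks(environ) -> iterable of bytes."""
    def app(environ, start_response):
        # Render everything first; any exception happens before headers exist.
        body = b"".join(render_chunks(environ))
        start_response(
            "200 OK",
            [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ],
        )
        return [body]  # a single block, as the common case suggests
    return app
```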

I'm trying to outsmart the servers, because I want to be able to control the error handling independent of servers. I'm trying to advocate that servers be as dumb as possible, and I expect to trust them as little as possible, so I don't want to leave stuff up to them. And showing partial responses is just Hard -- all the more reason to avoid leaving it up to servers with all the implementations that exist.

Right; partial responses are hard, so don't do them except in *application* code. 99% of application output should be buffered, so in-stream recovery is irrelevant and useless.

Well, I guess the idea is to let the error middleware do its thing, but give the server an option to bail out gracefully if necessary (by raising the exception passed in). I think it's actually reasonable to have the server bail out ungracefully -- or the middleware -- in those few cases where there's a conflict.

It's allowed to; the spec just says it *should* raise exc_info, but it's allowed to raise something else.

It mostly only applies to cases where there are errors in the streamed output, which seems unlikely to me (at least in cases where there's interactive debugging via a web browser).

Right, it's only for errors in streamed output that the exc_info argument can even be used; apart from that scenario it's a total red herring.

Now that I'm thinking about it, can you remind me why WSGI doesn't work like this:
    status, headers, body_iter = application(environ)
    print status
    print headers...
    for block in body_iter: ...
    body_iter.close()
Why is there a start_response and then a separate return?

One reason is that it allows you to write an application as a generator. But more importantly, it's necessary in order to support 'write()' for backward compatibility with existing frameworks, and that's pretty much the "killer reason" it's structured how it is. This particular innovation was Tony Lownds' brainchild, though, not mine. In my original WSGI concept, the application received an output stream and just wrote headers and everything to it.
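Both reasons can be sketched briefly. The two applications below are hypothetical examples in Python 3 style: the first is written as a generator, the second uses the write() callable returned by start_response in the way an older, imperative framework might.

```python
def generator_app(environ, start_response):
    """Application written as a generator; the body is yielded block by block."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"first block\n"
    yield b"second block\n"


def legacy_app(environ, start_response):
    """Application that pushes output through write(), then returns no body."""
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"written via write()\n")
    return []
```

Note that with the generator form nothing runs, including start_response, until the server begins iterating over the result.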
