Re: [Web-SIG] PEP 444 (aka Web3)
I have some pending changes to the PEP 444 spec (the working copy is at http://github.com/mcdonc/web3/blob/master/pep-0444.rst but please don't consider that canonical in any sense; it will change before an official republication of the proposal). The modifications fold in most of what we've talked about on the list, or at least acknowledge the issues; a change log is contained near the top.

However, I'm currently trying to work through what to do about offering up quoted PATH_INFO and SCRIPT_NAME values (quoted in the sense that, at least on platforms that support it, these would be the original values before being run through urllib.unquote).

The current published proposal on Python.org indicates that these would go into web3.path_info and web3.script_name, but nobody seems to much like that because it would make things like path_info_pop hard (the code would need to keep two data structures in sync, and would need to be pretty magical in the face of %2F markers).

The pending, unpublished proposal turns SCRIPT_NAME and PATH_INFO into *quoted* values, and adds a ``web3.path_requoted`` flag for debugging purposes, which will be True if the SCRIPT_NAME and/or PATH_INFO needed to be recomposed and requoted (e.g. on CGI platforms). But private conversations lead me to believe that not many folks will like this either, because it commandeers CGI names that are well understood to be unquoted.

The only sensible way to break the deadlock seems to be to not use any CGI names in the specification at all, so as not to break people's expectations. I know that when I change it to not use any CGI names, it will be received poorly, but I can't think of a better idea.
- C On Wed, 2010-09-15 at 19:03 -0400, Chris McDonough wrote: A PEP was submitted and accepted today for a WSGI successor protocol named Web3: http://python.org/dev/peps/pep-0444/ I'd encourage other folks to suggest improvements to that spec or to submit a competing spec, so we can get WSGI-on-Python3 settled soon. - C ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/chrism%40plope.com ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
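To make the %2F problem in the message above concrete, here is a minimal sketch of a path_info_pop that works on quoted values. The helper name and signature are hypothetical and not taken from any PEP 444 draft, and it uses Python 3's urllib.parse rather than the urllib.unquote mentioned above.

```python
from urllib.parse import quote, unquote


def path_info_pop_quoted(script_name, path_info):
    """Move the first segment of a *quoted* path_info onto script_name.

    Hypothetical helper: both arguments are assumed to be percent-encoded
    strings, so an encoded slash (%2F) stays inside its segment.
    Returns (new_script_name, new_path_info, unquoted_segment_or_None).
    """
    if not path_info.startswith("/"):
        return script_name, path_info, None
    segment, sep, rest = path_info[1:].partition("/")
    return script_name + "/" + segment, sep + rest, unquote(segment)


# A path whose second segment contains an encoded slash.
raw = "/files/" + quote("a/b", safe="") + "/edit"

# Unquoting first turns the encoded slash into a real separator,
# so the segment boundaries can no longer be recovered.
lossy_segments = unquote(raw).split("/")[1:]

# Popping from the quoted value keeps "a/b" together as one segment.
script_name, path_info, first = path_info_pop_quoted("", raw)
script_name, path_info, second = path_info_pop_quoted(script_name, path_info)
```

Because only one (quoted) representation is kept, there are no two data structures to hold in sync; the cost is that every consumer has to unquote a segment before using it.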
Re: [Web-SIG] PEP 444 (aka Web3)
On Sun, 19 Sep 2010, Ian Bicking wrote: On Sun, Sep 19, 2010 at 11:32 AM, Chris McDonough chr...@plope.com wrote: I propose to write in the PEP that a middleware should provide an app attribute to get the wrapped application or middleware. It seems to be the most common name used out there. We can't really mandate this because middleware is not required to be an instance. It can be a function. We could suggest it, and suggest the attribute name. Composites, lazy loading middleware, or a bunch of other situations can break it... but it's nice for introspection tools to at least be able to attempt to run down the chain. Middleware is almost always a closure if it's a function, I believe, so you could still do: If the goal here is to write a spec, then I would prefer that spec say what must be done and what must not be done, not what may be done, could be done or is suggested as perhaps a best practice. Those sorts of things belong in communication that is out of band of the spec. -- Chris Dent http://burningchrome.com/ [...] ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] PEP 444 (aka Web3)
On Sun, 2010-09-19 at 21:52 -0400, Chris McDonough wrote: I'm -0 on the server trying to guess the Content-Length header. It just doesn't seem like much of a burden to place on an application and it's easier to specify that an application must do this than it is to specify how a server should behave in the face of a missing Content-Length. I also believe Graham has argued against making the server guess, I presume this causes him some pain somehow (probably underspecification in WSGI).

Graham's issues with requiring the server to set Content-Length are detailed here: http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html

Chris, Thanks for that link. I had completely forgotten about that issue. I'd really appreciate it if your web3 spec made some definitive decision on whether applications and middleware are responsible for correctly differentiating HEAD from GET, or whether servers should transform HEAD to GET before invoking the first application callable. I'd personally prefer the former.

Robert Brewer fuman...@aminus.org
Re: [Web-SIG] PEP 444 (aka Web3)
On 20 September 2010 16:19, Robert Brewer fuman...@aminus.org wrote: On Sun, 2010-09-19 at 21:52 -0400, Chris McDonough wrote: I'm -0 on the server trying to guess the Content-Length header. It just doesn't seem like much of a burden to place on an application and it's easier to specify that an application must do this than it is to specify how a server should behave in the face of a missing Content-Length. I also believe Graham has argued against making the server guess, I presume this causes him some pain somehow (probably underspecification in WSGI).

Graham's issues with requiring the server to set Content-Length are detailed here: http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html

Chris, Thanks for that link. I had completely forgotten about that issue. I'd really appreciate it if your web3 spec made some definitive decision on whether applications and middleware are responsible for correctly differentiating HEAD from GET, or whether servers should transform HEAD to GET before invoking the first application callable. I'd personally prefer the former.

Servers should definitely not transform a HEAD to a GET. Transforming HEAD to GET and then discarding the body is often not a bad default, but an application may well want to handle the HEAD explicitly. For instance, an application's HEAD handler may only need to check an ETag in a database before returning a 304 Not Modified response (with the correct Content-Length and no body, of course).

Similarly, it's almost certainly a bad idea for a WSGI server or middleware to change the Content-Length header in the application's HTTP response because there may be no body to look at.

- Matt
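A minimal sketch of the kind of explicit HEAD handling described above. It assumes a Web3-style callable with bytes values in the environ and uses the (status, headers, body) ordering from Ian's example elsewhere in this thread; the published draft may order the tuple differently. ETAG and BODY are hypothetical stand-ins for a database lookup.

```python
ETAG = b'"v1"'              # hypothetical stored validator
BODY = b"Hello, world!\n"   # hypothetical resource body


def etag_app(environ):
    """Answer HEAD and conditional requests without generating a body."""
    method = environ.get("REQUEST_METHOD", b"GET")

    if environ.get("HTTP_IF_NONE_MATCH") == ETAG:
        # Cheap path: only the validator was consulted, no body is built.
        return b"304 Not Modified", [(b"ETag", ETAG)], []

    headers = [
        (b"Content-Type", b"text/plain"),
        (b"Content-Length", str(len(BODY)).encode("ascii")),
        (b"ETag", ETAG),
    ]
    if method == b"HEAD":
        # Same headers as GET, including Content-Length, but an empty body.
        # A server that recomputed Content-Length from this body would
        # wrongly report zero.
        return b"200 OK", headers, []
    return b"200 OK", headers, [BODY]
```

The HEAD branch is the case where a server or middleware that rewrites Content-Length from the body it sees would corrupt the response.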
Re: [Web-SIG] PEP 444 (aka Web3)
Hi, On 9/20/10 6:31 PM, Matt Goodall wrote: Servers should definitely not transform a HEAD to a GET. There are some good reasons why it currently has to. I haven't read the link in question but I had a discussion with Graham a few days ago on Skype and he outlined the issue in detail. I will write a summary to the list in a few days, just too busy to do that right now :( Regards, Armin
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, 2010-09-16 at 05:29 +0200, Roberto De Ioris wrote: About the *.file_wrapper removal, I suggest a PSGI-like approach where 'body' can contain a File Object.

    def file_app(environ):
        # Binary mode, so that iterating the file yields bytes
        fd = open('/tmp/pippo.txt', 'rb')
        status = b'200 OK'
        headers = [(b'Content-type', b'text/plain')]
        body = fd
        return body, status, headers

I don't see why this couldn't work as long as middleware didn't convert the body into something not-file-like. But it is really an implementation detail of the origin server (it might specialize when the body is a file), and doesn't really need to be in the spec.

or

    def file_app(environ):
        fd = open('/tmp/pippo.txt', 'rb')
        status = b'200 OK'
        headers = [(b'Content-type', b'text/plain')]
        # Mixes bytes with a nested iterable (the file object)
        body = [b'Header', fd, b'Footer']
        return body, status, headers

This won't work, as the body is required to be an iterable which yields bytes, and cannot be an iterable which yields either bytes or other iterables (it must be a flat sequence).

- C
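One way to get the header/file/footer effect while keeping the body flat is to chain the pieces into a single iterable of bytes. This is only a sketch under the (body, status, headers) ordering used in Roberto's example; file_chunks and the opener parameter are hypothetical names introduced here for illustration and testing.

```python
import io
from itertools import chain


def file_chunks(fileobj, blocksize=8192):
    """Yield bytes blocks from a binary file object, closing it at the end."""
    try:
        while True:
            block = fileobj.read(blocksize)
            if not block:
                break
            yield block
    finally:
        fileobj.close()


def wrapped_file_app(environ, opener=open):
    # `opener` is injectable only so the sketch can be exercised without
    # touching the filesystem.
    fd = opener('/tmp/pippo.txt', 'rb')
    status = b'200 OK'
    headers = [(b'Content-type', b'text/plain')]
    # chain() flattens the three parts into one iterable that yields bytes.
    body = chain([b'Header'], file_chunks(fd), [b'Footer'])
    return body, status, headers


# Usage with an in-memory file standing in for /tmp/pippo.txt.
fake_open = lambda path, mode: io.BytesIO(b'abc')
body, status, headers = wrapped_file_app({}, opener=fake_open)
result = b''.join(body)
```

The trade-off is that the origin server no longer sees a bare file object, so it cannot apply a sendfile-style optimisation to this response.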
Re: [Web-SIG] PEP 444 (aka Web3)
On Sun, Sep 19, 2010 at 11:32 AM, Chris McDonough chr...@plope.com wrote: I propose to write in the PEP that a middleware should provide an app attribute to get the wrapped application or middleware. It seems to be the most common name used out there.

We can't really mandate this because middleware is not required to be an instance. It can be a function.

We could suggest it, and suggest the attribute name. Composites, lazy loading middleware, or a bunch of other situations can break it... but it's nice for introspection tools to at least be able to attempt to run down the chain. Middleware is almost always a closure if it's a function, I believe, so you could still do:

    def caps(app):
        def replacement_app(environ):
            status, headers, body = app(environ)
            # Join the chunks and upper-case them; b'' assumes a bytes body
            body = [b''.join(body).upper()]
            return status, headers, body
        # Expose the wrapped app so introspection tools can walk the chain
        replacement_app.app = app
        return replacement_app

-- Ian Bicking | http://blog.ianbicking.org
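As a rough illustration of what an introspection tool could do with such an attribute, the sketch below follows `app` from the outermost callable inwards. unwrap_chain is a hypothetical helper, not part of any spec, and it simply stops when the attribute is missing, which is what happens with composites or lazy-loading middleware.

```python
def hello(environ):
    return b'200 OK', [(b'Content-Type', b'text/plain')], [b'hello']


def caps(app):
    def replacement_app(environ):
        status, headers, body = app(environ)
        return status, headers, [b''.join(body).upper()]
    replacement_app.app = app
    return replacement_app


def unwrap_chain(app, attr='app', limit=50):
    """Return [outermost, ..., innermost] by following `attr`.

    Stops at the first callable without the attribute, on a cycle,
    or after `limit` entries.
    """
    found = [app]
    seen = {id(app)}
    while len(found) < limit:
        inner = getattr(found[-1], attr, None)
        if inner is None or id(inner) in seen:
            break
        found.append(inner)
        seen.add(id(inner))
    return found


stack = caps(caps(hello))
layers = unwrap_chain(stack)
```

Since the attribute is only a suggestion, a tool built this way has to treat a short chain as "unknown below this point" rather than as proof that the innermost application was reached.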
Re: [Web-SIG] PEP 444 (aka Web3)
On Sun, 2010-09-19 at 21:52 -0400, Chris McDonough wrote: I'm -0 on the server trying to guess the Content-Length header. It just doesn't seem like much of a burden to place on an application and it's easier to specify that an application must do this than it is to specify how a server should behave in the face of a missing Content-Length. I also believe Graham has argued against making the server guess, I presume this causes him some pain somehow (probably underspecification in WSGI). Graham's issues with requiring the server to set Content-Length are detailed here: http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html
Re: [Web-SIG] PEP 444 (aka Web3)
On Sat, Sep 18, 2010 at 5:03 AM, Marcel Hellkamp m...@gsites.de wrote: With WSGI it was possible to yield empty strings as long as the application is waiting for data and call start_response once the headers are final. Not perfect, but at least non-blocking. Web3 removes this possibility. The headers must be returned before the body iterable yielded its first element, empty or not. Removing any support for this type of asynchronism would render web3 useless for all but completely synchronous and trivial applications. Even frameworks would have no way to work around this anymore. I'm aware of what a lot of people have done with WSGI, but I'm not aware of anyone doing an async proxy of any sort, or implementing anything in a way where this empty string policy served any function. It's not implausible that it *could* be used, but years of practice have shown it is not used. -- Ian Bicking | http://blog.ianbicking.org ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] PEP 444 (aka Web3)
Marcel Hellkamp wrote: On Thursday, 16.09.2010, at 22:58 +0200, Armin Ronacher wrote: - The async part. If I can't find someone that is willing to provide some input on that I will remove that section.

I see a problem here: The response tuple must be returned synchronously according to web3. Once returned, the values are final. If an application needs to wait for some background task to finish in order to decide about headers or the status code, it is now forced to block completely. A common use case for this is a web service that itself queries other web services (e.g. an ajax proxy to work around same origin policy).

With WSGI it was possible to yield empty strings as long as the application is waiting for data and call start_response once the headers are final. Not perfect, but at least non-blocking. Web3 removes this possibility. The headers must be returned before the body iterable yielded its first element, empty or not.

Removing any support for this type of asynchronism would render web3 useless for all but completely synchronous and trivial applications. Even frameworks would have no way to work around this anymore. I do understand that the start_response callable is inconvenient for middleware to implement, but it totally made sense.

I don't follow. What is the benefit of yielding empty strings instead of just waiting for the status and headers to be available? Do you then run off and do other things with that server thread?

I've run a few businesses now on WSGI without doing what you describe, so I don't see why blocking makes an application 'trivial'.

Robert Brewer fuman...@aminus.org
Re: [Web-SIG] PEP 444 (aka Web3)
At 09:01 AM 9/18/2010 -0700, Robert Brewer wrote: Marcel Hellkamp wrote: Removing any support for this type of asynchronism would render web3 useless for all but completely synchronous and trivial applications. Even frameworks would have no way to work around this anymore. I've run a few businesses now on WSGI without doing what you describe, so I don't see why blocking makes an application 'trivial'. I believe he means: all_but(synchronous_apps + trivial_apps), not all_but(apps(synchronous trivial)). ;-) (That being said, for WSGI 2 I still want to get rid of start_response. IMO, async WSGI needs to be a different protocol.) ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 21:39, P.J. Eby p...@telecommunity.com wrote: Or, to put it another way: splitting the spec into two 100% incompatible versions is a bad idea for Python 3 adoption. With a WSGI 1 addendum, we should be able to make it possible to put the same apps and middleware on 2 and 3 with just a decorator wrapping them. (i.e., people should be able to write libraries that run on both 2 and 3, which is probably critical to adoption). I just wish I'd come to these conclusions much sooner... like a year or two ago. :-(

Meh, I'd much rather have Web3/WSGI 2 (and I prefer the WSGI name, too) for Python 3 than the small update you're proposing. IMO there are some good improvements in Chris and Armin's spec over the original WSGI, and I would be sad to have to go back to an incremental update that does just enough to make PEP 333 work on Python 3. (Also I think there might actually be value in having some incompatibility to make the distinction clearer.)

Cheers, Dirkjan
Re: [Web-SIG] PEP 444 (aka Web3)
On 16.09.2010 20:00, Ian Bicking wrote: On Thu, Sep 16, 2010 at 12:35 PM, Guido van Rossum gu...@python.org wrote: On Thu, Sep 16, 2010 at 10:01 AM, Ian Bicking i...@colorstudy.com wrote: Well, reiterating some things I've said before:

* This is clearly just WSGI slightly reworked, why the new name?
* Why byte values in the environ? No one has offered any real reason they are better than native strings. I keep asking people to offer a reason, *and no one ever does*. It's just hyperbole and distraction. Frankly I'm feeling annoyed. So far my experience makes me believe using native strings will make it easier to port and support libraries across 2 and 3.

Hm. IIUC the proposal is to implicitly assume Latin1 when decoding the bytes to Unicode. I worry that this will just perpetuate mojibake and other atrocities committed in Python 2.

I was reading http://python.org/dev/peps/pep-0444/ -- is there another revision under discussion? This seems to explicitly say all environ values will be bytes. There have been other str-oriented proposals, including mod_wsgi's implementation.

IIUC Guido was referring to your proposal.

Georg

-- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
Re: [Web-SIG] PEP 444 (aka Web3)
On Fri, Sep 17, 2010 at 10:36 AM, Georg Brandl g.bra...@gmx.net wrote: On 16.09.2010 23:07, James Mills wrote: - the web3 name

If there is any value in this PEP and we find something to decide on, there is no reason this couldn't be WSGI 2. But as long as it's just something a small part of the web-sig community worked on directly, a separate name is a good thing I think, because it does not reserve the name WSGI 2 for something that might actually become WSGI 2 in case this PEP gets rejected.

I personally still don't see any real benefit to changing the key names from wsgi to web3 (or whatever). I would prefer it remain the same. If you're going to use Python3, you know you're using Python3 (you don't need web3 key names to know that). (subjective)

That statement shows another weakness of the web3 name: this spec is not in the least exclusive to Python 3. (Which would be a bit useless, having two incompatible WSGI/web specs on two incompatible Python versions.) The goal would be to first migrate to WSGI2/web3, and *then* have an easy transition going to Python 3.

Georg

Also, the WSGI acronym describes the purpose better by itself than web3, which means nothing.

- benoit
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 6:41 PM, Armin Ronacher armin.ronac...@active-4.com wrote: 4. The web3 spec says, "In case a content length header is absent the stream must not return anything on read. It must never request more data than specified from the client." but later it says, "Web3 servers must handle any supported inbound hop-by-hop headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable." I would be sad if web3 did not support streaming uploads via Transfer-Encoding. One way to implement that would be to make the origin server handle read() transparently by returning '' on EOF, regardless of whether a Content-Length or a Transfer-Encoding header was provided.

I was toying with the idea to have a websocket extension for web3 which would have solved my use case for requests without a content-length header. The problem with the content length of incoming data is quite complex and that seemed to be the solution that was easiest for everybody involved.

Uh? Since with Transfer-Encoding: chunked we know when the stream ends, I would be in favor of returning an EOF at the end too. Also, most servers know when a stream ends even if there is no Content-Length. Maybe we could have a capability setting in environ that says whether the server supports streaming or not, and in all cases return EOF at the end?

- benoît
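A rough sketch of the "return an empty result at EOF" behaviour discussed above, for a chunked request body. ChunkedReader is a hypothetical class written for this illustration only; it is not taken from any server, it ignores chunk extensions and discards trailers, and a real origin server would need stricter error handling.

```python
import io


class ChunkedReader:
    """File-like wrapper that decodes Transfer-Encoding: chunked.

    read() returns b'' once the terminating zero-size chunk has been seen,
    so the application never needs a Content-Length to find the end.
    """

    def __init__(self, raw):
        self.raw = raw        # binary file-like positioned at the first chunk
        self.remaining = 0    # bytes left in the current chunk
        self.eof = False

    def _next_chunk(self):
        size_line = self.raw.readline()
        if not size_line:
            self.eof = True   # truncated input: treat as end of stream
            return
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            # Skip optional trailer lines up to the final blank line.
            while self.raw.readline() not in (b"\r\n", b"\n", b""):
                pass
            self.eof = True
        self.remaining = size

    def read(self, n=-1):
        out = []
        while not self.eof and n != 0:
            if self.remaining == 0:
                self._next_chunk()
                continue
            want = self.remaining if n < 0 else min(n, self.remaining)
            data = self.raw.read(want)
            if not data:
                self.eof = True
                break
            self.remaining -= len(data)
            if self.remaining == 0:
                self.raw.readline()   # consume the CRLF after the chunk data
            out.append(data)
            if n > 0:
                n -= len(data)
        return b"".join(out)


wire = io.BytesIO(b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n")
stream = ChunkedReader(wire)
# The application side is an ordinary read-until-empty loop.
received = b"".join(iter(lambda: stream.read(4), b""))
```

With a wrapper like this in the server, application code looks the same whether the request carried a Content-Length or was chunked.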
Re: [Web-SIG] PEP 444 (aka Web3)
On 09/17/2010 04:21 AM, Ian Bicking wrote: Yes, if we get rid of SCRIPT_NAME/PATH_INFO then the problem goes away. For servers without access to the unencoded value, reencoding those values doesn't actually lose any information over what we have now, and avoids any encoding issues.

It doesn't lose any information, but it also makes script_name/path_info inherently unreliable. My fear is that if gateways are allowed to create a reconstructed script_name/path_info without clearly signalling they have done so, those values will continue to be unreliable at all times and server authors won't feel the need to get it right since it's broken everywhere anyway: the unhappy status quo.

This is why I am continuing to plead for a 'script_name/path_info are authoritative' flag in environ that applications can use to detect situations where it is safe to go ahead and rely on them. I want to say "Unicode paths are supported if your server/gateway does", not "Unicode paths might sometimes work, depending on how you configure your server and application".

It is not just CGI that is affected here! IIS does not provide the original undecoded path at all, even through ISAPI.

At the moment I am using a 'fixPathInfo' method in my form-reading layer to try to compensate as much as possible for the problems of CGI:

- on Python 2 on Windows, re-read the environment variables using ctypes if available, to avoid the mangling caused by reading os.environ using mbcs. (This didn't use to work, as old versions of IIS deliberately mbcs-filtered values before putting them in the environment, but it does now.)
- on Python 3 on POSIX, re-read the environment variables using environb if available. Otherwise try to reverse the faulty decoding of environ using surrogateescapes, where available.
- on Windows, encode the Unicode environment to bytes using ISO-8859-1 if the server is Apache, or UTF-8 if the server is IIS. (IIS tries to decode path bytes using UTF-8, falling back to mbcs where the input is not valid UTF-8. Unfortunately there is no way to tell this has happened.)
- when server is Microsoft-IIS, remove the erroneously repeated SCRIPT_NAME components from the front of PATH_INFO. (This is a long-standing bug that can be configured away using the allowPathInfo/AllowPathInfoForScriptMappings configs, but no-one does as it breaks ASP.)

However, the form layer is not really the right place to be doing these hacks. It would be better done in the stdlib CGI handler.

Servers with REQUEST_URI can at least attempt to reconstruct the encoded values.

This is slightly unsafe. It's something an application might want to do (or at least provide as an option), but a gateway probably couldn't get away with it for the general case because REQUEST_URI doesn't reflect the redirections done by a RewriteRule or an ErrorDocument.

Cookie is also the one header that can't be safely folded.

There are others, e.g. Authorization. Anyway: folding doesn't happen in the HTTP world. It can be forgotten about.

-- And Clover a...@doxdesk.com http://www.doxdesk.com/
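Two of the fix-ups listed in the message above can be sketched in a few lines. This is not And Clover's fixPathInfo code, which is not shown in the thread; both function names are hypothetical, and the encoding is passed explicitly here only to keep the example deterministic (on a real POSIX system os.environb or os.fsencode would be the natural tools).

```python
def environ_str_to_bytes(value, encoding="utf-8"):
    """Reverse Python 3's surrogateescape decoding of an environment value.

    Bytes that could not be decoded were smuggled into the str as lone
    surrogates; encoding with the same error handler restores them.
    """
    return value.encode(encoding, "surrogateescape")


def strip_iis_script_prefix(script_name, path_info, server_software):
    """Drop a repeated SCRIPT_NAME from the front of PATH_INFO under IIS.

    Heuristic only: a path that legitimately starts with the script name
    cannot be told apart from the buggy case.
    """
    if (server_software.startswith("Microsoft-IIS")
            and script_name
            and path_info.startswith(script_name)):
        return path_info[len(script_name):]
    return path_info


# A Latin-1 encoded path that is not valid UTF-8.
original = b"/caf\xe9"
as_seen_in_environ = original.decode("utf-8", "surrogateescape")
recovered = environ_str_to_bytes(as_seen_in_environ)

fixed = strip_iis_script_prefix("/app.py", "/app.py/foo", "Microsoft-IIS/7.0")
untouched = strip_iis_script_prefix("/app.py", "/app.py/foo", "Apache/2.2")
```

Neither helper can tell whether middleware has already rewritten the environ, which is the argument made above for doing this in the gateway rather than further up the stack.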
Re: [Web-SIG] PEP 444 (aka Web3)
Hi, On 9/17/10 11:40 AM, And Clover wrote: This is why I am continuing to plead for a 'script_name/path_info are authoritative' flag in environ that applications can use to detect situations where it is safe to go ahead and rely on them. I want to say Unicode paths are supported if your server/gateway does, not Unicode paths might sometimes work, depending on how you configure your server and application. In case there is no raw value with the current spec, you can see SCRIPT_NAME and PATH_INFO as unreliable. In case we change the spec as Ian mentioned above, I am all for a wsgi.guessed_encoding = True flag or something like that. It is not just CGI that is affected here! IIS does not provide the original undecoded path at all, even through ISAPI. Unless I am mistaken, the same is true for CGI scripts running on Apache2 on Windows. - on Python 2 on Windows, re-read the environment variables using ctypes if available, to avoid the mangling caused by reading os.environ using mbcs. (This didn't used to work, as old versions of IIS deliberately mbcs-filtered values before putting them in the environment, but it does now.) I did some tests a while ago and was pretty sure that Apache2 on Windows did the same. Might be wrong though. However, the form layer is not really the right place to be doing these hacks. It would be better done in the stdlib CGI handler. The correct place for these hacks would be the appropriate WSGI/Web3 handler of the webserver. Certainly not a particular WSGI/Web3 implementation or even the CGI module of the standard library. Regards, Armin ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] PEP 444 (aka Web3)
On 09/17/2010 02:03 PM, Armin Ronacher wrote: In case we change the spec as Ian mentioned above, I am all for a wsgi.guessed_encoding = True flag or something like that.

Yes, I'd like to see that. I believe going with *only* a raw-or-reconstructed path_info, rather than having both path_info and PATH_INFO, is probably best, for the middleware-duplication reasons PJE mentioned.

A more in-depth possibility might be:

wsgi.path_accuracy =

- 0: script_name/path_info have been crudely reconstructed from SCRIPT_NAME/PATH_INFO from an unknown source. Beware! If there is to be backwards compatibility with WSGI1, this would be seen as the 'default value' given a missing path_accuracy.
- 1: script_name/path_info have been reconstructed, but it is known that path_info is accurate, other than %2F and non-ASCII issues. That is, it's known that the path doesn't come from IIS's broken PATH_INFO, or the IIS error has been detected and compensated for.
- 2: script_name/path_info have been reconstructed using known-good encodings for the env. The only way in which they may differ from the original request path is that a slash might originally have been a %2F. (This is good enough for the vast majority of applications.)
- 3: script_name/path_info come directly from the request path without any intervening mangling.

Unless I am mistaken, the same is true for CGI scripts running on Apache2 on Windows.

Yes, it's true of *all* CGI scripts, but also for non-CGI scripts on IIS.

I did some tests a while ago and was pretty sure that Apache2 on Windows did the same.

Apache-on-Windows puts the bytes of the decoded path into the environment variables as one code unit per byte: that is, as if encoded by ISO-8859-1. You still have to read the environ using ctypes because mbcs is never ISO-8859-1, but at least the original bytes are recoverable, which isn't the case with IIS.

The correct place for these hacks would be the appropriate WSGI/Web3 handler of the webserver.

The IIS PATH_INFO-prefix hack would be appropriate to put in an IIS-specific handler; indeed, I believe isapi_wsgi does just that. But the other hacks are specific to CGI. For CGI, there is no 'handler of the webserver', there is only the standard CGI-to-WSGI adapter, so this is the only component it is reasonable to burden with the hacks. Frameworks and libraries further up the stack cannot reliably do the fixups, because they don't know whether the WSGI environ they have been given comes from os.environ or somewhere else, or whether middleware has played with it.

-- And Clover a...@doxdesk.com http://www.doxdesk.com/
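To show how an application might consume the proposed wsgi.path_accuracy value, here is a small sketch. The key name and the 0 to 3 scale come from the message above; the enum member names and helper functions are hypothetical and only illustrate one possible reading of the levels.

```python
from enum import IntEnum


class PathAccuracy(IntEnum):
    # Member names are invented for this sketch; only the numbers are
    # from the proposal.
    CRUDE = 0            # reconstructed from an unknown source
    NO_IIS_BUG = 1       # accurate apart from %2F and non-ASCII issues
    KNOWN_ENCODING = 2   # only %2F vs. "/" may differ from the request
    EXACT = 3            # taken directly from the request path


def path_accuracy(environ):
    """Read the proposed key, treating a missing value as level 0."""
    return PathAccuracy(environ.get("wsgi.path_accuracy", 0))


def unicode_paths_safe(environ):
    """True when non-ASCII path segments can be relied upon."""
    return path_accuracy(environ) >= PathAccuracy.KNOWN_ENCODING


def encoded_slash_safe(environ):
    """True only when %2F inside a segment survives intact."""
    return path_accuracy(environ) == PathAccuracy.EXACT
```

Defaulting a missing key to 0 matches the backwards-compatibility note in the proposal: an environ from a gateway that knows nothing about the flag is treated as the least trustworthy case.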
Re: [Web-SIG] PEP 444 (aka Web3)
At 03:43 PM 9/17/2010 +0200, And Clover wrote: On 09/17/2010 02:03 PM, Armin Ronacher wrote: In case we change the spec as Ian mentioned above, I am all for a wsgi.guessed_encoding = True flag or something like that. Yes, I'd like to see that. I believe going with *only* a raw-or-reconstructed path_info, rather than having both path_info and PATH_INFO, is probably best, for the middleware-dupication reasons PJE mentioned. A more in-depth possibility might be: wsgi.path_accuracy = 0: script_name/path_info have been crudely reconstructed from SCRIPT_NAME/PATH_INFO from an unknown source. Beware! If there is to be backwards compatibility with WSGI1, this would be seen as the 'default value' given a missing path_accuracy. 1: script_name/path_info have been reconstructed, but it is known that path_info is accurate, other than %2F and non-ASCII issues. That is, it's known that the path doesn't come from IIS's broken PATH_INFO, or the IIS error has been detected and compensated for. 2: script_name/path_info have been reconstructed using known-good encodings for the env. The only way in which they may differ from the original request path is that a slash might originally have been a %2F. (This is good enough for the vast majority of applications.) 3: script_name/path_info come directly from the request path without any intervening mangling. So, do you have an example of what some real-world code is going to *do* with this information? i.e., what's the use case for knowing the precise degree of messed-uppedness of the path? ;-) ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] PEP 444 (aka Web3)
Hi, On 9/17/10 5:42 PM, P.J. Eby wrote: So, do you have an example of what some real-world code is going to *do* with this information? i.e., what's the use case for knowing the precise degree of messed-uppedness of the path? ;-)

Actually, I can see a couple of use cases. I have a blog that by default only produces ASCII-safe slugs for the URLs, which means that if you are a Chinese person you will only get the ID-based fallback there. If I could safely detect whether the setup supports unicode identifiers in URLs in a way that works, I could give a good default and warn the user if they change the setting.

Regards, Armin
Re: [Web-SIG] PEP 444 (aka Web3)
I don't like this proposal at all. Besides having to go through the bytes craziness, the design is pretty backwards for middleware and asynchronous applications.

Even the proxy_and_timing_support example in the PEP is broken for async or streaming apps: it won't return the proper time (since it doesn't consume the body iterable) and it will fail most of the time, since you can't just add a tuple to an iterable.

The missing requirement that middleware must yield at least an empty string if it needs more information from the application iterable also breaks async gateways that expect out-of-band (oob) information from the app (for example, cogen can't be ported to this spec).

The removed requirement, "middleware components *must not* block iteration waiting for multiple values from an application iterable. If the middleware needs to accumulate more data from the application before it can produce any output, it *must* yield an empty string.", also breaks async gateways/apps.

I feel this spec puts too much burden on applications: having to process all those byte strings, and having to add Content-Length even for naive buffered-body apps.

--ionel

On Thu, Sep 16, 2010 at 02:03, Chris McDonough chr...@plope.com wrote: A PEP was submitted and accepted today for a WSGI successor protocol named Web3: http://python.org/dev/peps/pep-0444/ I'd encourage other folks to suggest improvements to that spec or to submit a competing spec, so we can get WSGI-on-Python3 settled soon. - C
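The timing complaint above can be illustrated with a sketch of a middleware that reports elapsed time only after the body has actually been consumed. This is not the PEP's proxy_and_timing_support example; timing_middleware and its report callback are hypothetical, the (status, headers, body) ordering is assumed, and the close() handling borrows the WSGI convention, which Web3 may or may not keep.

```python
import time


def timing_middleware(app, report):
    """Wrap `app` and call report(seconds) once the body is exhausted."""
    def timed_app(environ):
        start = time.monotonic()
        status, headers, body = app(environ)

        def timed_body():
            try:
                # Pass chunks through one at a time, so streaming apps
                # are not buffered by the middleware.
                for chunk in body:
                    yield chunk
            finally:
                close = getattr(body, "close", None)
                if close is not None:
                    close()
                # Measured here, after iteration, not when the tuple is
                # returned.
                report(time.monotonic() - start)

        return status, headers, timed_body()

    timed_app.app = app
    return timed_app
```

A timer stopped when the tuple is returned would measure only how long it took to create the iterable, which for a streaming or lazy app is close to zero.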
Re: [Web-SIG] PEP 444 (aka Web3)
On Fri, 2010-09-17 at 19:47 +0300, Ionel Maries Cristian wrote:
> I don't like this proposal at all. Besides having to go through the bytes craziness the design is pretty backwards for middleware and asynchronous applications.

We've acknowledged in other messages to this thread that the web3.async red herring is speculative, and Armin has indicated that if he does not find a champion willing to create a reference implementation for it today, it will be taken out. This doesn't help async people, but it also doesn't harm them (no difference from WSGI really). Personally, I hope nobody steps up and we just rip it out. ;-)

I'm not sure why you characterize using bytes as "bytes craziness". We have been using strings as byte sequences in WSGI for over five years. Python itself draws an equivalence between the Python 3 bytes type and Python 2 str (bytes is aliased to str under Python 2). I'm not really sure why we shouldn't take advantage of that equivalence, and why people are so enamored of treating envvar values, headers, and such as text, other than the brokenness of the Python 3 stdlib urllib stuff.

IMO, WSGI/Web3 isn't really a programming platform (or at least if it is, it is destined to be a pretty crappy one), it's just a connection protocol, so any "it's more typing" or "it's ugly" argument seems pretty thin to me. I'd personally rather have it be more general and less easy to use than potentially broken in some corner-case circumstance.

- C
Re: [Web-SIG] PEP 444 (aka Web3)
On Fri, Sep 17, 2010 at 9:43 AM, And Clover and...@doxdesk.com wrote:
> On 09/17/2010 02:03 PM, Armin Ronacher wrote:
>> In case we change the spec as Ian mentioned above, I am all for a wsgi.guessed_encoding = True flag or something like that.
>
> Yes, I'd like to see that. I believe going with *only* a raw-or-reconstructed path_info, rather than having both path_info and PATH_INFO, is probably best, for the middleware-duplication reasons PJE mentioned.
>
> A more in-depth possibility might be:
>
> wsgi.path_accuracy =
>
> 0: script_name/path_info have been crudely reconstructed from SCRIPT_NAME/PATH_INFO from an unknown source. Beware! If there is to be backwards compatibility with WSGI1, this would be seen as the 'default value' given a missing path_accuracy.
>
> 1: script_name/path_info have been reconstructed, but it is known that path_info is accurate, other than %2F and non-ASCII issues. That is, it's known that the path doesn't come from IIS's broken PATH_INFO, or the IIS error has been detected and compensated for.
>
> 2: script_name/path_info have been reconstructed using known-good encodings for the env. The only way in which they may differ from the original request path is that a slash might originally have been a %2F. (This is good enough for the vast majority of applications.)
>
> 3: script_name/path_info come directly from the request path without any intervening mangling.

path_accuracy is certainly a better name than encoding; nothing here actually relates to encoding (except insofar as attempts to encode or reencode values corrupt the path).

Personally I wouldn't want to split it up this much; I'd rather have a simple flag to indicate that something was guessed, vs. an accurate request. The only real value I see in it is to help people debug problems. Maybe. I'm not sure it's that realistic to imagine this will be noticed by people deploying software and encountering problems. A helpful application could use it to warn the deployer of potential problems.

It seems that it would be possible to create a WSGI application and client library that together can detect and help resolve these issues. E.g., the application always returns the values of script_name, path_info, and query_string, and the client fires off a bunch of different requests to see how they get interpreted. It could suggest corrections until everything passes.

I would really like to see concerns over bad gateways not be used to keep valuable information out of the spec. We want people to use well-configured gateways that accurately represent requests. There are limits, e.g., in environments where information is lost. The only really problematic example is losing the distinction between %2f and /, and I think it's reasonable to suggest that applications should avoid making that distinction in the path if they want to be easily deployed in different environments.

-- 
Ian Bicking | http://blog.ianbicking.org
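The diagnostic application described above could look roughly like the following. This is only a sketch written against the existing WSGI 1.0 calling convention (not Web3), and `echo_paths_app` is a hypothetical name. It reports the path-related values exactly as the gateway delivered them, so a test client can compare them with the URL it actually requested.

```python
import json


def echo_paths_app(environ, start_response):
    """WSGI 1.0 app that reports how the gateway split the request path."""
    report = {
        key: environ.get(key, '')
        for key in ('SCRIPT_NAME', 'PATH_INFO', 'QUERY_STRING')
    }
    body = json.dumps(report, sort_keys=True).encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'application/json'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

A client would request paths containing %2F, non-ASCII bytes and so on, and flag any response whose values do not round-trip to the URL it sent.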
Re: [Web-SIG] PEP 444 (aka Web3)
On Fri, Sep 17, 2010 at 1:02 PM, Ian Bicking i...@colorstudy.com wrote:
> I would really like to see concerns over bad gateways not be used to keep valuable information out of the spec. We want people to use well-configured gateways that accurately represent requests. There are limits, e.g., in environments where information is lost. The only really problematic example is losing the distinction between %2f and /, and I think it's reasonable to suggest that applications should avoid making that distinction in the path if they want to be easily deployed in different environments.

Just to expand: the reason %2f is special is that "/" has special meaning in URL paths, or at least is treated as such. "?" has special meaning too, but that's already handled by splitting off QUERY_STRING. Technically ";" is supposed to mean something, but no one ever cared, so it doesn't really. In theory you could make any character special, and in doing so want an escape mechanism to tell the difference between, e.g., "," and "%2c"... but no one does that, so no problem.

All the other potential problems are problems of gateway corruption, e.g., where the bytes were decoded with Latin1 and then encoded with sys.getfilesystemencoding(), or some other mismatched combination. I don't believe we should expose gateway corruption to the spec. I *do* believe that we can build tools inside WSGI to help debug and fix those problems, and I don't think any of these changes makes those tools particularly harder to implement.

-- 
Ian Bicking | http://blog.ianbicking.org
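The %2f point can be checked directly with the standard library. The sketch below uses Python 3's `urllib.parse` (the thread itself refers to the Python 2 `urllib.unquote`); the helper names and example paths are made up for illustration. Once a path has been unquoted, an escaped slash and a literal slash are indistinguishable, and requoting cannot bring the difference back. Splitting the still-quoted path into segments first keeps the distinction.

```python
from urllib.parse import quote, unquote


def segments_from_raw(raw_path):
    """Split a still-quoted path on '/', then unquote each segment."""
    return [unquote(segment) for segment in raw_path.split('/')]


def segments_from_unquoted(raw_path):
    """Unquote first (as CGI-style PATH_INFO does), then split."""
    return unquote(raw_path).split('/')


escaped = '/files/a%2Fb'   # one segment named "a/b"
literal = '/files/a/b'     # two segments, "a" and "b"

# After unquoting, both requests look the same to the application.
assert unquote(escaped) == unquote(literal) == '/files/a/b'

# Requoting the unquoted value does not restore the %2F.
assert quote(unquote(escaped)) == '/files/a/b'

# Working from the raw value preserves the difference.
assert segments_from_raw(escaped) == ['', 'files', 'a/b']
assert segments_from_raw(literal) == ['', 'files', 'a', 'b']
```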
Re: [Web-SIG] PEP 444 (aka Web3)
On Fri, 17 Sep 2010, Ionel Maries Cristian wrote:
> I feel this spec puts too much burden on applications - having to process all those byte strings and even having to add Content-Length even for naive buffered-body apps.

The Content-Length requirement is a big killer for me. I'm usually generating content in apps, rather deep in a stack of middleware-like pieces that may or may not be looking at or modifying that content. I don't want to a) have to unwind my generators at each level, or b) reset the content-length here, there and everywhere.

It could be I'm doing it completely wrong, but it works rather nicely.

-- 
Chris Dent http://burningchrome.com/ [...]
Re: [Web-SIG] PEP 444 (aka Web3)
On Fri, Sep 17, 2010 at 1:37 PM, chris.d...@gmail.com wrote:
> On Fri, 17 Sep 2010, Ionel Maries Cristian wrote:
>> I feel this spec puts too much burden on applications - having to process all those byte strings and even having to add Content-Length even for naive buffered-body apps.
>
> The Content-Length requirement is a big killer for me. I'm usually generating content in apps, rather deep in a stack of middleware-like pieces that may or may not be looking at or modifying that content. I don't want to a) have to unwind my generators at each level b) reset the content-length here there and everywhere.
>
> It could be I'm doing it completely wrong, but it works rather nicely.

I'm unclear what exactly you guys are reacting to. This?

> - The server must not inject an additional Content-Length header by guessing the length from the response iterable. This must be set by the application itself in all situations.

I'm also not sure what motivated this particular change, but I don't have any opinion one way or the other.

-- 
Ian Bicking | http://blog.ianbicking.org
Re: [Web-SIG] PEP 444 (aka Web3)
Hi,

On 9/17/10 7:43 PM, Ian Bicking wrote:
> I'm also not sure what motivated this particular change, but I don't have any opinion one way or the other.

The motivation is that WSGI wants servers to do something like this:

    if len(iterable) == 1 and content_length_header_missing:
        headers.append(('Content-Length', str(len(iterable[0]))))

However, not everybody was doing that, and some applications set a Content-Length header while others did not. If no Content-Length header was set, some middlewares that changed the content worked properly even though they did not check the header. The idea is that with web3 every tool in the chain is supposed to look for that header and update it appropriately. Even the piglatin middleware from PEP 333 did not check the content length, if I remember correctly.

Regards,
Armin
Re: [Web-SIG] PEP 444 (aka Web3)
On Fri, Sep 17, 2010 at 2:06 PM, Armin Ronacher armin.ronac...@active-4.com wrote:
> Hi,
>
> On 9/17/10 7:43 PM, Ian Bicking wrote:
>> I'm also not sure what motivated this particular change, but I don't have any opinion one way or the other.
>
> Motivation is that WSGI wants servers to do something like this:
>
>     if len(iterable) == 1 and content_length_header_missing:
>         headers.append(('Content-Length', str(len(iterable[0]))))
>
> However not everybody was doing that and some applications were setting a content length header or not. If a content length header was not set some middlewares that changed content worked properly even though they did not check the header. The idea is that with web3 every tool in the chain is supposed to look for that header and update it appropriately. Even the piglatin middleware from the PEP 333 did not check the content length if I remember correctly.

OK, so maybe it should just be clarified:

* Middleware and servers should not modify or add Content-Length, Date, or other headers unless they have reason to do so, and they must ensure that the response is valid (e.g., there should never be two Content-Length headers).

It still seems reasonable that *if* there is no Content-Length, and the server can guess it easily enough (mostly when it is returned an actual list/tuple that we know can be introspected fast and without side effects), then it's perfectly reasonable to set it. But certainly the server doesn't own that header (or any other, except maybe some connection-related headers?).

-- 
Ian Bicking | http://blog.ianbicking.org
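A sketch of the rule suggested above, under the assumption that headers are a list of (bytes, bytes) pairs and the body is an iterable of bytes as in the Web3 draft. `maybe_add_content_length` is a hypothetical helper name, not something the PEP defines. It only adds the header when none is present and the body is a real list or tuple, so measuring it has no side effects and a second Content-Length is never produced.

```python
def maybe_add_content_length(headers, body):
    """Return headers, adding Content-Length only when that is safe.

    The header is left alone if the application already set one, or if
    the body is a generator or other iterable that cannot be measured
    without consuming it.
    """
    already_set = any(
        name.lower() == b'content-length' for name, _ in headers
    )
    if already_set or not isinstance(body, (list, tuple)):
        return list(headers)
    total = sum(len(chunk) for chunk in body)
    return list(headers) + [(b'Content-Length', str(total).encode('ascii'))]
```

Unlike the `len(iterable) == 1` check quoted above, this variant sums all chunks of a list body; that is a design choice of the sketch, not something either specification requires.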
Re: [Web-SIG] PEP 444 (aka Web3)
On 16 Sep 2010, at 08:37, Masklinn wrote:
>> I generally like it.
>>
>> About the *.file_wrapper removal, I suggest a PSGI-like approach where 'body' can contain a File Object.
>>
>>     def file_app(environ):
>>         fd = open('/tmp/pippo.txt', 'r')
>>         status = b'200 OK'
>>         headers = [(b'Content-type', b'text/plain')]
>>         body = fd
>>         return body, status, headers
>
> As far as I understand it, `body` is an iterable so there should not be any problem with sending a file through directly in this manner. Better, the web3 spec specifically mandates that if the `body` iterable has a `close` method it must be called on request completion (second-to-last paragraph in the specification details section [0]). So a File Object as a body is already completely handled by web3.
>
> On the other hand, `body` has to yield bytes, so `fd = open('/tmp/pippo.txt', 'rb')` I think.

In this case I do not see a need for a wsgi.file_wrapper replacement. The Web3 gateway/hosting system can manage file-like objects the way it wants (and transparently for the application).

-- 
Roberto De Ioris
http://unbit.it
JID: robe...@jabber.unbit.it
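To make the exchange above concrete, here is a minimal sketch of both sides. It assumes the draft's `(body, status, headers)` return order used in the example above; `make_file_app` and `run_once` are hypothetical names, and a real gateway would stream chunks to the client rather than join them in memory. The application returns an open binary file as the body, and the gateway iterates it and then calls `close()` if the body has one.

```python
def make_file_app(path):
    """Build a Web3-style app that serves one file as its body."""
    def file_app(environ):
        body = open(path, 'rb')  # binary mode, so iteration yields bytes
        status = b'200 OK'
        headers = [(b'Content-Type', b'text/plain')]
        return body, status, headers
    return file_app


def run_once(app, environ):
    """Toy gateway: consume the body, then close it if possible."""
    body, status, headers = app(environ)
    try:
        payload = b''.join(body)
    finally:
        close = getattr(body, 'close', None)
        if close is not None:
            close()
    return status, headers, payload
```

Because the gateway sees the file object itself, it is free to substitute a faster transfer mechanism where the platform offers one, which is the role wsgi.file_wrapper played.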
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 13:32, Armin Ronacher armin.ronac...@active-4.com wrote:
> The motivation is that you can pass that to constructors of response objects already in place.
>
>     response_tuple = response.get_response_tuple()
>     response = Response(*response_tuple)
>
> The order body, status code, headers is what Werkzeug and WebOb are currently using. Django has (content, mimetype, status) as constructor but if they detect a list/dict on the third parameter they could assume that mimetype refers to the status thus they have a proper upgrade path.

Okay, I can see why the order makes sense from a default arguments point of view, but I'm still not sure why it helps if the Response() signature looks like the application return signature.

> That would be a nice to have, but makes the middleware logic harder because each middleware would have to check for the type. Works for 2.x, but on 3.x that would mean each middleware would have to check the type before each operation and convert to bytes if necessary which means a lot of overhead for each middleware in the stack.

Okay, I guess it makes sense. I just thoroughly dislike that we're making applications harder in a bunch of places to make the life of middleware easier. Surely we write more applications than middleware?

Can we somehow invert the model to have the gateway act as a controller for middleware, so that we can canonicalize application returns before passing them to the middleware? Or provide a function in wsgiref that allows me to write an application like this:

    import wsgiref

    def app(environ):
        return wsgiref.canonicalize(200, {'Content-Type': 'text/plain'}, ['foo'])

Maybe it should be an exceedingly light-weight response class (which could be inherited by the frameworks) instead.

Cheers,

Dirkjan
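No `canonicalize` function exists in wsgiref; the name above is a proposal. A standalone sketch of what such a helper might do is shown below, with every detail an assumption: it accepts an integer status, a dict or list of headers and a body of str or bytes chunks, and returns all-bytes values in the draft's `(body, status, headers)` order. Text is encoded as Latin-1 here purely for illustration.

```python
from http.client import responses


def _to_bytes(value):
    """Pass bytes through; encode anything else via str() as Latin-1."""
    if isinstance(value, bytes):
        return value
    return str(value).encode('latin-1')


def canonicalize(status, headers, body):
    """Hypothetical helper turning convenient values into a bytes tuple."""
    if isinstance(status, int):
        # Look up the standard reason phrase, e.g. 200 -> 'OK'.
        status = '%d %s' % (status, responses.get(status, 'Unknown'))
    header_items = headers.items() if isinstance(headers, dict) else headers
    return (
        [_to_bytes(chunk) for chunk in body],
        _to_bytes(status),
        [(_to_bytes(name), _to_bytes(value)) for name, value in header_items],
    )
```

Note that this version consumes the body eagerly into a list, so it would not suit streaming responses; a lazy variant would wrap the iterable in a generator instead.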
Re: [Web-SIG] PEP 444 (aka Web3)
Hi,

On 9/16/10 1:44 PM, Tarek Ziadé wrote:
> I propose to write in the PEP that a middleware should provide an app attribute to get the wrapped application or middleware. It seems to be the most common name used out there.

What about middlewares that encapsulate more than one application?

Regards,
Armin
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 1:57 PM, Armin Ronacher armin.ronac...@active-4.com wrote:
> Hi,
>
> On 9/16/10 1:44 PM, Tarek Ziadé wrote:
>> I propose to write in the PEP that a middleware should provide an app attribute to get the wrapped application or middleware. It seems to be the most common name used out there.
>
> What about middlewares that encapsulate more than one application?

True... I don't know what's the best option here. I guess we need to provide all children so one may visit the whole graph.

Do you have a list of middleware that does this?

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 2:40 PM, Armin Ronacher armin.ronac...@active-4.com wrote:
> Hi,
>
> On 9/16/10 2:38 PM, Tarek Ziadé wrote:
>> True... I don't know what's the best option here.. I guess we need to provide all children so one may visit the whole graph.
>
> Another gripe I have with WSGI is that if you attempt to combine applications together with a dispatcher middleware, the inner application does not know the URL of the outer one. Its SCRIPT_NAME points to itself and there is no ORIGINAL_SCRIPT_NAME.
>
>> Do you have a list of middleware that does this ?
>
> I know that Paste has a cascade middleware and I think it also has one that maps applications to specific prefixes.

Ah yes, the composite thing IIRC. I didn't know this was a middleware.

Should those be middlewares? ISTM that they should be at the front of the stack instead, and that a stack of middleware should be dedicated to a single application, for the griefs you mentioned and probably other problems.

I mean, one call does not visit several applications, and this is some kind of dynamic rewriting of the stack.

Another possibility would be to define a get_application(environ=None) method so the middleware is able to return the right app at the right moment.

> Regards,
> Armin

-- 
Tarek Ziadé | http://ziade.org
Re: [Web-SIG] PEP 444 (aka Web3)
On 2010-09-16, at 18:08 , Tarek Ziadé wrote:
> On Thu, Sep 16, 2010 at 1:57 PM, Armin Ronacher armin.ronac...@active-4.com wrote:
>> Hi,
>>
>> On 9/16/10 1:44 PM, Tarek Ziadé wrote:
>>> I propose to write in the PEP that a middleware should provide an app attribute to get the wrapped application or middleware. It seems to be the most common name used out there.
>>
>> What about middlewares that encapsulate more than one application?
>
> True... I don't know what's the best option here.. I guess we need to provide all children so one may visit the whole graph.

That would require a hypothetical self.app to always be a list, or at least an iterable, right?
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 2:57 PM, Masklinn maskl...@masklinn.net wrote:
> On 2010-09-16, at 18:08 , Tarek Ziadé wrote:
>> On Thu, Sep 16, 2010 at 1:57 PM, Armin Ronacher armin.ronac...@active-4.com wrote:
>>> Hi,
>>>
>>> On 9/16/10 1:44 PM, Tarek Ziadé wrote:
>>>> I propose to write in the PEP that a middleware should provide an app attribute to get the wrapped application or middleware. It seems to be the most common name used out there.
>>>
>>> What about middlewares that encapsulate more than one application?
>>
>> True... I don't know what's the best option here.. I guess we need to provide all children so one may visit the whole graph.
>
> That would require a hypothetical self.app to always be a list, or at least an iterable, right?

I would prefer a get_application(environ=None) iterator that would reach the final application depending on the environment, and return only one app or middleware per level, but I am not sure...

-- 
Tarek Ziadé | http://ziade.org
Re: [Web-SIG] PEP 444 (aka Web3)
On 09/16/2010 02:05 AM, P.J. Eby wrote:
> note that the spec's sample CGI implementation does not itself provide the new variables

It can't:

    This is the original URL-encoded value derived from the request URI. If the server cannot provide this value, it must omit it from the environ.

A CGI gateway doesn't have access to the original URL-encoded value.

> middleware must be explicitly written to handle the case where there is duplication.

The alternative to duplication would be to allow a gateway to try to 'reconstruct' `path_info` from CGI `PATH_INFO`.

If this is done there really needs to be a flag somewhere to say that it has been done, i.e. that `/` and non-ASCII characters in the path are unreliable. Otherwise we're just going to end up in the same sorry situation we have today where all sorts of different encodings and corruptions lurk inside PATH_INFO and apps simply cannot rely on it.

chr...@plope.com wrote:
> The most sensible thing to me would be to put it in PATH_INFO.

Please don't have a field with encoded semantics that re-uses the name of a field that has always had decoded semantics.

-- 
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 5:48 AM, Tarek Ziadé ziade.ta...@gmail.com wrote:
> On Thu, Sep 16, 2010 at 2:40 PM, Armin Ronacher armin.ronac...@active-4.com wrote:
>> Hi,
>>
>> On 9/16/10 2:38 PM, Tarek Ziadé wrote:
>>> True... I don't know what's the best option here.. I guess we need to provide all children so one may visit the whole graph.
>>
>> Another gripe I have with WSGI is that if you attempt to combine applications together with a dispatcher middleware, the inner application does not know the URL of the outer one. It's SCRIPT_NAME points to itself and there is no ORIGINAL_SCRIPT_NAME.
>>
>>> Do you have a list of middleware that does this ?
>>
>> I know that Paste has a cascade middleware and I think it also has one that maps applications to specific prefixes.
>
> Ah yes, the composite thing IIRC - I didn't know this was a middleware.
>
> Should those be middlewares ? ISTM that they should in the front of the stack instead, and that a stack of middleware should be dedicated to a single application -- for the griefs you mentioned and probably other problems.
>
> I mean, one call does not visit several application, and this is some kind of dynamic rewriting of the stack..
>
> Another possibility would be to define a get_application(environ=None) method so the middleware is able to return the right app at the right moment

The 'pegboard' middleware composes a result out of an arbitrary graph of WSGI apps, with one request visiting many applications. The graph can be built at runtime in application code, so it would be very difficult to report all of the '.app's applicable for a given environ until after the request. Also, it is quite reasonable in practice to have middleware both in front of such a composer and also in the stacks of the apps it composes.

A concern with "should have .app" is that a single closure middleware breaks the chain. For example:

    def unproxy(app):
        def middleware(environ):
            environ['HTTP_HOST'] = environ['HTTP_X_FORWARDED_FOR_HOST']
            return app(environ)
        return middleware

For the use case of original_app = self.app.app.application.app, I've had great success with a pattern I first saw in Zine: applying the middleware internally to the application instance, not wrapping the instance. It seems fairly robust against closures and middleware that can't or won't play along with .app. Unlike .app, this isn't generically traversable, but in cases where I need this kind of cross-talk between middleware/apps I haven't had any problems getting the right instances into scope at runtime.

    class MyApp:
        def apply_middleware(self, factory, *args, **kw):
            self.dispatch_wsgi = factory(self.dispatch_wsgi, *args, **kw)

        def dispatch_wsgi(self, environ):
            return [b'hi'], b'200 OK', [(b'Content-type', b'text/plain')]

        def __call__(self, environ):
            return self.dispatch_wsgi(environ)

    app = MyApp()
    app.apply_middleware(unproxy)
    app.apply_middleware(StaticContent, 'static/')
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, 16 Sep 2010, jason kirtland wrote:
> The 'pegboard' middleware composes a result out of an arbitrary graph of WSGI apps, with one request visiting many applications. The graph can be built at runtime in application code, so it would be very difficult to report all of the '.app's applicable for a given environ until after the request. Also, it is quite reasonable in practice to have middleware both in front of such a composer and also in the stacks of the apps it composes.

The general rule we can extract from this is that we don't want the spec to limit what is possible for the sake of fairly arbitrary things that only some people (think they?) need, and that can be satisfied using the more fundamental units already present in the design.

I can see that applying here, thus we don't want to enforce some kind of app method or attribute, as that could be costly for assembling flexible groups of apps (in the same app).

On the other end of that same principle, I'm not sure I can see much justification in (paraphrase) "let's make the return signature be the same as the signature of some constructors at use out there in the wild."

One of the best things about WSGI, that I hope does not get lost in Web3 (thanks for moving things forward, by the way), is that in its most basic use it is almost entirely about (simple) data structure and (simple) data flow and not about methods, objects, magical attributes and other flim flammery.

In other words it is good that the units are basic and fundamental.

-- 
Chris Dent http://burningchrome.com/ [...]
Re: [Web-SIG] PEP 444 (aka Web3)
Thanks to Chris M. and Armin for moving forward with a PEP!

Armin Ronacher wrote:
> Hi,
>
> On 9/16/10 1:23 PM, Dirkjan Ochtman wrote:
>> I find the order of the application return arguments really annoying, could it just be status, headers, body? Mirrors the actual structure of the request, which is easier to remember IMO.
>
> The motivation is that you can pass that to constructors of response objects already in place.
>
>     response_tuple = response.get_response_tuple()
>     response = Response(*response_tuple)

chris.d...@gmail.com wrote:
> On the other end of that same principle, I'm not sure I can see much justification in (paraphrase) let's make the return signature be the same as the signature of some constructors at use out there in the wild.

FWIW, I am with Dirkjan and Chris on this... the most logical ordering for a response tuple is:

    status, headers, body

Trying to conform the spec to existing frameworks doesn't seem like the best approach in this case.

-- 
Randy Syring
Intelicom
502-644-4776

Whether, then, you eat or drink or whatever you do, do all to the glory of God. 1 Cor 10:31
Re: [Web-SIG] PEP 444 (aka Web3)
Chris McDonough wrote:
> A PEP was submitted and accepted today for a WSGI successor protocol named Web3: http://python.org/dev/peps/pep-0444/
>
> I'd encourage other folks to suggest improvements to that spec or to submit a competing spec, so we can get WSGI-on-Python3 settled soon.

Thanks Chris, a few comments:

1. Hooray for all-byte output.

2. Hardly anybody implements RFC 2047, and http-bis is phasing it out. In addition, since folded and/or 2047-encoded lines are equivalent to their non-folded-nor-encoded variants, applications have no business emitting folded or encoded versions of these; that decision should be left up to the origin server. So keep the text about control characters, carriage returns and linefeeds, please.

3. +1 on (status, headers, body) in that order. Your own example code composed them in that order, and then re-arranged them for output! One of the benefits of a new spec is the opportunity to coerce rewrites in existing codebases that undo their poor design choices and make them more readable. By the way, the Specification Details and Values Returned sections have this in the (s, h, b) order in your draft.

4. The web3 spec says, "In case a content length header is absent the stream must not return anything on read. It must never request more data than specified from the client." but later it says, "Web3 servers must handle any supported inbound hop-by-hop headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable." I would be sad if web3 did not support streaming uploads via Transfer-Encoding. One way to implement that would be to make the origin server handle read() transparently by returning '' on EOF, regardless of whether a Content-Length or a Transfer-Encoding header was provided.

5. Conversely, streaming output is nice to have and should be explicitly supported in the web3 spec. One way would be to require servers to respect a 'Transfer-Encoding: chunked' header emitted by the application. However, the WSGI and web3 specs specifically deny this approach by saying, "Applications and middleware are forbidden from using HTTP/1.1 hop-by-hop features or headers." A workaround would be for the application to signal Transfer-Encoding by omitting any Content-Length header in its response headers (this is what CherryPy currently does).

6. I'd personally like to see it be OK for apps and middleware to emit "Connection: close" too, or have some other way of communicating that desire to the server.

7. "it is presumed that Web3 middleware will be created which can be used in front of existing WSGI 1.0 applications, allowing those existing WSGI 1.0 applications to run under a Web3 stack. This middleware will require, when under Python 3, an equivalence to be drawn between Python 3 str types and the bytes values represented by the HTTP request and all the attendant encoding-guessing (or configuration) it implies." Just some field experience: that's not hard. CherryPy 3.2 does this now between various WSGI proposals.

Robert Brewer
fuman...@aminus.org
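The input-stream behaviour discussed in point 4 can be sketched as a small wrapper. This is an illustration, not code from any server; `BoundedInput` is a hypothetical name. It never reads more from the underlying socket-like stream than the declared length, and returns `b''` once that many bytes have been delivered, which is also what it does from the start when the length is 0 (for example because no Content-Length header was sent).

```python
class BoundedInput:
    """Input stream that never reads past a declared body length."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._remaining = content_length

    def read(self, size=-1):
        if self._remaining <= 0:
            # Body fully delivered (or no body declared): signal EOF.
            return b''
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data
```

A server that decodes chunked uploads itself could hand the application an object with the same `read()` contract, so application code would not need to know which framing the client used.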
Re: [Web-SIG] PEP 444 (aka Web3)
Hi,

On 9/16/10 6:19 PM, Robert Brewer wrote:
> 1. Hooray for all-byte output.

Hooray for agreeing :)

> 3. +1 on (status, headers, body) in that order. Your own example code composed them in that order, and then re-arranged them for output! One of the benefits of a new spec is the opportunity to coerce rewrites in existing codebases that undo their poor design choices and make them more readable. By the way, the Specification Details and Values Returned sections have this in the (s, h, b) order in your draft.

I suppose it makes sense to word the spec in that order then; it seems the majority wants it that way round.

> 4. The web3 spec says, "In case a content length header is absent the stream must not return anything on read. It must never request more data than specified from the client." but later it says, "Web3 servers must handle any supported inbound hop-by-hop headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable." I would be sad if web3 did not support streaming uploads via Transfer-Encoding. One way to implement that would be to make the origin server handle read() transparently by returning '' on EOF, regardless of whether a Content-Length or a Transfer-Encoding header was provided.

I was toying with the idea of a websocket extension for web3, which would have solved my use case for requests without a content-length header. The problem with the content length of incoming data is quite complex, and that seemed to be the solution that was easiest for everybody involved.

> 5. Conversely, streaming output is nice to have and should be explicitly supported in the web3 spec. One way would be to require servers to respect a 'Transfer-Encoding: chunked' header emitted by the application. However, the WSGI and web3 specs specifically deny this approach by saying, "Applications and middleware are forbidden from using HTTP/1.1 hop-by-hop features or headers." A workaround would be for the application to signal Transfer-Encoding by omitting any Content-Length header in its response headers (this is what CherryPy currently does).

I am fine with improving that, but it would require a very good reference implementation with enough comments that people have an idea of how it's supposed to behave. wsgiref is nice in WSGI already, but it has faults that we should try to keep in mind for web3 (for example, it sets the multithreaded flag despite being single-threaded, and it always appends a Date header, breaking some applications).

> 6. I'd personally like to see it be OK for apps and middleware to emit Connection: close too, or have some other way of communicating that desire to the server.

I would like to see this feature as well, but you will have to fight for it with Phillip and Graham, I suppose.

> 7. "It is presumed that Web3 middleware will be created which can be used in front of existing WSGI 1.0 applications, allowing those existing WSGI 1.0 applications to run under a Web3 stack. This middleware will require, when under Python 3, an equivalence to be drawn between Python 3 str types and the bytes values represented by the HTTP request and all the attendant encoding-guessing (or configuration) it implies." Just some field experience: that's not hard. CherryPy 3.2 does this now between various WSGI proposals.

I suppose we will see some adapters that have some configuration parameters to adapt to different usage patterns.

Regards,
Armin
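Point 4 above suggests that read() could simply return an empty string at EOF, whether the request was framed by Content-Length or by chunked Transfer-Encoding. Seen from the application side, that behaviour might be consumed roughly as in the sketch below. It is only an illustration: `read_body` and `chunk_size` are hypothetical names, and `io.BytesIO` stands in for whatever input stream a server would actually supply.

```python
import io

def read_body(stream, chunk_size=8192):
    """Collect a request body by reading until the stream reports EOF."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty result means EOF, however the request was framed
            break
        chunks.append(chunk)
    return b"".join(chunks)

# Stand-in for an input stream whose total length the application does not know.
body = read_body(io.BytesIO(b"x" * 20000))
```

With this contract the application never needs to inspect Content-Length or Transfer-Encoding itself; decoding the framing stays the server's job.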
Re: [Web-SIG] PEP 444 (aka Web3)
Well, reiterating some things I've said before:

* This is clearly just WSGI slightly reworked, why the new name?

* Why byte values in the environ? No one has offered any real reason they are better than native strings. I keep asking people to offer a reason, *and no one ever does*. It's just hyperbole and distraction. Frankly I'm feeling annoyed. So far my experience makes me believe using native strings will make it easier to port and support libraries across 2 and 3.

* It makes sense to me that the error stream should accept both bytes and unicode, and should do a best effort to handle either. Getting encoding errors or type errors when logging an error is very distracting.

* Instead of focusing on Response(*response_tuple), I'd rather just rely on something like Response.from_wsgi(response_tuple). Body first feels very unnatural.

* Regarding long response headers, I think we should ignore the HTTP spec. You can put 4k in a Set-Cookie header, such headers aren't easily or safely folded... I think the line length constraint in the HTTP spec isn't a constraint we need to pay attention to.

-- Ian Bicking | http://blog.ianbicking.org
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 12:35 PM, Guido van Rossum gu...@python.org wrote:
> On Thu, Sep 16, 2010 at 10:01 AM, Ian Bicking i...@colorstudy.com wrote:
> > Well, reiterating some things I've said before:
> > * This is clearly just WSGI slightly reworked, why the new name?
> > * Why byte values in the environ? No one has offered any real reason they are better than native strings. I keep asking people to offer a reason, *and no one ever does*. It's just hyperbole and distraction. Frankly I'm feeling annoyed. So far my experience makes me believe using native strings will make it easier to port and support libraries across 2 and 3.
>
> Hm. IIUC the proposal is to implicitly assume Latin1 when decoding the bytes to Unicode. I worry that this will just perpetuate mojibake and other atrocities committed in Python 2.

I was reading http://python.org/dev/peps/pep-0444/ -- is there another revision under discussion? This seems to explicitly say all environ values will be bytes.

There have been other str-oriented proposals, including mod_wsgi's implementation. There is consensus that request and response bodies should be bytes. So really we're talking about whether headers and status are bytes or native strings.

Most HTTP headers can only contain sensible characters in ASCII, and while anyone can submit anything in a header I'm not aware of it being a problem that, e.g., someone submits a Cache-Control header with non-ASCII values. There are a small number of headers that can reasonably contain Latin1 characters. Latin1 is specified in HTTP, and in a few instances RFC2047 encoding is allowed, though I don't believe anyone proposes that servers should try to handle RFC2047 (I believe CherryPy does/did do this, but I believe Robert Brewer who is in charge of that project supports removing that). There are headers that can reasonably contain RFC2047, but this can be decoded at the application level.

The Cookie header does frequently contain incorrect encodings, but to handle this you have to decode the header as bytes or latin1 (all the meaningful characters are the same in both cases) and then decode/transcode values after parsing. Latin1 imposes only a small speedbump for a header that already has a bunch of speedbumps.

The other case when Latin1 is not appropriate is the URL-decoded path, WSGI 1's SCRIPT_NAME and PATH_INFO. This proposal removes those. The URL-encoded values are ASCII-safe, or at least could be safely normalized to be safe at the server level.

-- Ian Bicking | http://blog.ianbicking.org
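The Cookie handling described above (treat the header as latin1, parse it, then transcode each value) could look roughly like the following. This is a minimal sketch, not a full cookie parser: `parse_cookie` is a hypothetical helper, the split on ";" and "=" ignores quoting rules, and UTF-8 is only an assumed application-level charset.

```python
def parse_cookie(raw, charset="utf-8"):
    """Parse a raw Cookie header, transcoding values only after parsing."""
    text = raw.decode("latin-1")  # lossless: every byte maps to one code point
    cookies = {}
    for part in text.split(";"):
        if "=" not in part:
            continue
        key, _, value = part.strip().partition("=")
        # Recover the original bytes, then decode with the charset the
        # application actually expects (an assumption in this sketch).
        cookies[key] = value.encode("latin-1").decode(charset, "replace")
    return cookies

cookies = parse_cookie(b"name=caf\xc3\xa9; sid=abc123")
```

The structural characters of the header (";", "=", whitespace) are ASCII, so parsing the latin1 text and parsing the raw bytes give the same split points.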
Re: [Web-SIG] PEP 444 (aka Web3)
On Sep 16, 2010, at 1:01 PM, Ian Bicking wrote:
> Well, reiterating some things I've said before:
> * This is clearly just WSGI slightly reworked, why the new name?

Agreed. Among many other reasons, it seems poor from a Python 3 marketing perspective to introduce a name change that implies something totally different from WSGI that will require major rewrites to port to.

It's also a poor choice as a rebranding even if one were desirable, I think. It's terribly generic, and suggests it's somehow a successor to Web 2.0. Nor is it very search engine friendly, and there may be trademark issues (http://www.networkedplanet.com/Products/Web3/)

Also, ordering the response tuple for the very minor convenience of a couple of frameworks while simultaneously requiring them to make adjustments for the web3.* names seems strange to me.

Count me for retaining the WSGI naming, and for (status, headers, body), for what little it's worth.

> * It makes sense to me that the error stream should accept both bytes and unicode, and should do a best effort to handle either. Getting encoding errors or type errors when logging an error is very distracting.

I think I agree with this too.
Re: [Web-SIG] PEP 444 (aka Web3)
At 10:35 AM 9/16/2010 -0700, Guido van Rossum wrote:
> No comments on the rest except to note that at this point it looks unlikely that we can make everyone happy (or even get an agreement to adopt what would be the long-term technically optimal solution -- AFAICT there is no agreement on what that solution would be, if one weren't to take porting Python 2 code into account). IOW something/somebody has gotta give.

Indeed. This entire discussion has pushed me strongly in favor of doing a super-minimalist update to PEP 333 with the following points:

* Clarifying the encoding of environ values (locale+surrogateescape vs. latin1, TBD)

* Making the streams and all output values byte strings ('str' on 2.x, 'bytes' on 3.x), leaving everything else native strings ('str' on both 2.x and 3.x)

* Any other minor errata/clarifications that the folks with the requisite experience (e.g. Robert, Ian, Graham -- not an exclusive list, but at least they all have both heavy WSGI implementations under their belts and 3.x experience) think are absolutely necessary to resolve open questions for Python 3.2 WSGI implementations.

Something like that has a halfway decent chance of being able to settle and get implemented in the short timeline, and it also doesn't put Graham (mod_wsgi) in the position of coming back from vacation to a huge new spec to unravel. ;-)

(To be clear, what I'm suggesting is almost exactly what mod_wsgi does; it's just stricter on outputs than what mod_wsgi accepts, and there may be some minor issues regarding the environ encoding: mod_wsgi is probably using the latin1 approach rather than locale+surrogateescape, and I think we need to talk that one out a bit.)

Anyway, web3 is nice, but it doesn't look like it'll really fit the bill for porting applications. i.e., it's like a bike shed full of red herrings for what Python-Dev needs right now. ;-)
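For readers comparing the two environ encodings left "TBD" above, the sketch below shows how each one represents the same raw bytes as a Python 3 str. It is an illustration only: UTF-8 stands in for "the locale encoding" (an assumption; a real locale may differ), and the byte string is made up.

```python
# UTF-8 encoded "/café/" followed by one byte that is not valid UTF-8.
raw = b"/caf\xc3\xa9/\xff"

# latin1: every byte becomes exactly one code point, so decoding never fails,
# but multi-byte sequences show up as two unrelated characters.
as_latin1 = raw.decode("latin-1")

# surrogateescape: valid sequences decode properly, and undecodable bytes are
# smuggled through as lone surrogate code points.
as_escaped = raw.decode("utf-8", "surrogateescape")

# Both approaches are reversible, so the original bytes can be recovered.
latin1_roundtrip = as_latin1.encode("latin-1")
escaped_roundtrip = as_escaped.encode("utf-8", "surrogateescape")
```

Both strings round-trip to the original bytes; they differ in what an application sees if it uses the str value directly without transcoding.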
Re: [Web-SIG] PEP 444 (aka Web3)
Hi,

On 9/16/10 7:56 PM, Ty Sarna wrote:
> Agreed. Among many other reasons, it seems poor from a Python 3 marketing perspective to introduce a name change that implies something totally different from WSGI that will require major rewrites to port to.
>
> It's also a poor choice as a rebranding even if one were desirable, I think. It's terribly generic, and suggests it's somehow a successor to Web 2.0. Nor is it very search engine friendly, and there may be trademark issues (http://www.networkedplanet.com/Products/Web3/)

The name is not set in stone. I am very happy to accept WSGI 2 as a name for that, but we did not want to totally bypass the discussions on web-sig here and announce something that clearly says it will be WSGI 2 when only a small set of the people here participated directly in the writing of that PEP.

> > * It makes sense to me that the error stream should accept both bytes and unicode, and should do a best effort to handle either. Getting encoding errors or type errors when logging an error is very distracting.
>
> I think I agree with this too.

There are no such stream objects on Python 3 unless I am missing something. Furthermore there are no libraries on Python 3 that would emit string information as text, so I don't see the reason for considering bytes and unicode for that stream.

Regards,
Armin
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, 2010-09-16 at 12:01 -0500, Ian Bicking wrote:
> Well, reiterating some things I've said before:
> * This is clearly just WSGI slightly reworked, why the new name?

The PEP says "Web3 is clearly a WSGI derivative; it only uses a different name than WSGI in order to indicate that it is not in any way backwards compatible."

I don't really care what the name is. My experience in various communities suggests that naming the new totally-bw-incompat thing the same as the old thing weakens both the new thing and the old thing, but.. whatever. I just don't care much.

> * Why byte values in the environ? No one has offered any real reason they are better than native strings. I keep asking people to offer a reason, *and no one ever does*. It's just hyperbole and distraction. Frankly I'm feeling annoyed. So far my experience makes me believe using native strings will make it easier to port and support libraries across 2 and 3.

I'm sorry you're annoyed. I chose bytes here mainly out of ignorance and fear. This is an extremely low level protocol, and I just literally don't know how we can sanely convert environ values to Unicode without some loss of control or potential for incorrect decoding without having server encoding configuration. You say it's easy and straightforward, and that's fine. I just haven't internalized enough specification to know.

I'd very much encourage folks who want to use native strings to create another PEP: it's just a lot easier to argue about one thing than it is to argue endlessly in snippets on blogs and epic maillist threads. I couldn't care less if this *particular* PEP is selected, to be honest. Let's just get it over with, within a process where there's at least some chance of resolution.

> * It makes sense to me that the error stream should accept both bytes and unicode, and should do a best effort to handle either. Getting encoding errors or type errors when logging an error is very distracting.

Sounds good.

> * Instead of focusing on Response(*response_tuple), I'd rather just rely on something like Response.from_wsgi(response_tuple). Body first feels very unnatural.

Others have said the same, also good.

> * Regarding long response headers, I think we should ignore the HTTP spec. You can put 4k in a Set-Cookie header, such headers aren't easily or safely folded... I think the line length constraint in the HTTP spec isn't a constraint we need to pay attention to.

OK.

- C
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, 2010-09-16 at 14:04 -0400, P.J. Eby wrote:
> At 10:35 AM 9/16/2010 -0700, Guido van Rossum wrote:
> > No comments on the rest except to note that at this point it looks unlikely that we can make everyone happy (or even get an agreement to adopt what would be the long-term technically optimal solution -- AFAICT there is no agreement on what that solution would be, if one weren't to take porting Python 2 code into account). IOW something/somebody has gotta give.
>
> Indeed. This entire discussion has pushed me strongly in favor of doing a super-minimalist update to PEP 333 with the following points:

Right on, write it all down! ;-)

- C
Re: [Web-SIG] PEP 444 (aka Web3)
> My experience in various communities suggests that naming the new totally-bw-incompat thing the same as the old thing weakens both the new thing and the old thing,

I share the same experience.
Re: [Web-SIG] PEP 444 (aka Web3)
At 02:17 PM 9/16/2010 -0500, Ian Bicking wrote:
> On Thu, Sep 16, 2010 at 1:04 PM, P.J. Eby p...@telecommunity.com wrote:
> > * Clarifying the encoding of environ values (locale+surrogateescape vs. latin1, TBD)
>
> locale+surrogateescape would be insanity! CGI will just require some configuration with respect to the environment. Anyway, I suspect CGI only really works because: (a) people using CGI are sticking to ASCII, (b) they've fixed stuff up in their apps, (c) they just produce garbage and no one cares.

Ok.

> There are some simple errata, most of which I believe web3 covers (in addition to other things it covers). I think everyone is on board with:
>
>   status, headers, app_iter = app(environ)
>
> Web3 proposed a different order, but it seems clear from the thread that people prefer the more natural order, and web3 authors don't particularly object.

My comments were about releasing a WSGI 1.0 update for Python 3, not making changes to web3. The current free-for-all (and the 3.2 stdlib need) have convinced me to stop arguing for throwing out WSGI 1 on Python 3.

Or, to put it another way: splitting the spec into two 100% incompatible versions is a bad idea for Python 3 adoption. With a WSGI 1 addendum, we should be able to make it possible to put the same apps and middleware on 2 and 3 with just a decorator wrapping them. (i.e., people should be able to write libraries that run on both 2 and 3, which is probably critical to adoption).

I just wish I'd come to these conclusions much sooner... like a year or two ago. :-(
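The calling convention quoted above, `status, headers, app_iter = app(environ)`, can be bridged to WSGI 1's start_response style with a small wrapper. The sketch below is a hypothetical adapter, not something any of the posters proposed verbatim: `as_wsgi1` and `hello` are made-up names, and it sidesteps the bytes-versus-native-string question by using native strings for status and headers and bytes for the body.

```python
def as_wsgi1(app):
    """Adapt a callable returning (status, headers, app_iter) to WSGI 1."""
    def wsgi_app(environ, start_response):
        status, headers, app_iter = app(environ)
        start_response(status, headers)  # WSGI 1 takes status/headers via a callback
        return app_iter                  # and the body as the return value
    return wsgi_app

@as_wsgi1
def hello(environ):
    body = b"Hello, world\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    return "200 OK", headers, [body]
```

The wrapped `hello` can be handed to any WSGI 1 server, for example `wsgiref.simple_server.make_server("", 8000, hello)`.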
Re: [Web-SIG] PEP 444 (aka Web3)
Ty Sarna wrote:
> On Sep 16, 2010, at 2:55 PM, Massimo Di Pierro wrote:
> > > My experience in various communities suggests that naming the new totally-bw-incompat thing the same as the old thing weakens both the new thing and the old thing,
> >
> > I share the same experience.
>
> Interesting. Do you feel that Python 3.x should have been named something other than Python? I think that would rather have weakened both 3.x and 2.x by suggesting a fork, placing the two in competition, when the goal was to have one supersede the other, as is also the case here.

FWIW, I agree on this point. WSGI2 seems better than WEB3. IMO, it's OK to put a disclaimer at the top of the spec that states they are different specs and entirely backwards incompatible.

If there is consensus to move away from WSGI, then I think a name other than WEB3 is in order. It's just too generic.

-- Randy Syring Intelicom 502-644-4776 Whether, then, you eat or drink or whatever you do, do all to the glory of God. 1 Cor 10:31
Re: [Web-SIG] PEP 444 (aka Web3)
Um, talk about a whopper of a topic change. None of that is on the table. Maybe for Python 4. And certainly not in web-sig.

On Thu, Sep 16, 2010 at 12:32 PM, Massimo Di Pierro mdipie...@cs.depaul.edu wrote:
> Not sure this discussion belongs here but since you asked:
>
> I think it should have taken three/four more bold steps:
>
> 1) address the GIL issue completely by removing reference counting
> 2) add more support for lightweight threads (like stackless, erlang and go)
> 3) perhaps allow some mechanism for tainting data and do restricted execution
> 4) change name to avoid confusion
>
> ... and yet stress that it was almost 100% compatible with existing python code.
>
> I think a lot more people would have jumped on it from outside the existing community. The future is in multi core processors and lightweight threads. Of course I am not a developer and I do realize these things may be hard to accomplish. I also trust Guido's judgement more than my own in this respect so consider mine a wish more than a realistic suggestion.
>
> Massimo
>
> On Sep 16, 2010, at 2:16 PM, Ty Sarna wrote:
> > On Sep 16, 2010, at 2:55 PM, Massimo Di Pierro wrote:
> > > > My experience in various communities suggests that naming the new totally-bw-incompat thing the same as the old thing weakens both the new thing and the old thing,
> > >
> > > I share the same experience.
> >
> > Interesting. Do you feel that Python 3.x should have been named something other than Python? I think that would rather have weakened both 3.x and 2.x by suggesting a fork, placing the two in competition, when the goal was to have one supersede the other, as is also the case here.

-- 
--Guido van Rossum (python.org/~guido)
Re: [Web-SIG] PEP 444 (aka Web3)
Hi,

Here are some comments summarized, and how things will change:

- The order of the response tuple. The majority of this list wants it to be changed to the standard (status, headers, body) format, and we agree. The original motivation was passing it to the constructor of a common response object, but there is no reason this shouldn't be changed. Will update the PEP and implementation appropriately.

- The async part. It was added in the hope that someone would step up and come up with something better as a replacement. I asked in the #twisted IRC channel but they did not see any value in supporting a common specification that was shared with the synchronous world, and it looks like it will be harder to find someone who does care about this particular issue. The motivation was that facebook's tornado framework is currently attracting a lot of users and creating an environment besides the WSGI one, which means that it might be quite hard to share some code between those two worlds. I also remember hearing a lot of backlash from the nginx mod_wsgi maintainer the last time start_response was considered for deletion. If I can't find someone who is willing to provide some input on that, I will remove that section.

- Bytes values in the environment: HTTP transmits bytes, that's a fact we can't change. When we go with native strings we will go with unicode on 3.x. This has the following implications:

  - getting the right path info requires a decode + an encode unless you are assuming latin1.
  - same as above for the script name and cookie header

  When going with unicode strings on 3.x for environ values, we would have to do the same for outgoing values, which makes middlewares a lot harder to write:

  - header keys and values might then be bytes and unicode strings. Because of this all middlewares would have to convert to either str objects or bytes, which might mean a lot of extra encoding and decoding depending on how the middleware is implemented.
  - We can't change the fact that a large percentage of Python developers is living in an ASCII-only world which would never have to deal with encodings that way and might be encouraged to just assume ASCII as encoding.

  For implementations not based on the standard library the bytes-only approach seems to be easier in any way, as far as I can see. The only real issue appears to be urllib for the moment, and until that is resolved one could easily do an encode/decode around the calls to that particular library.

- web3.errors. I think Ian raised concern that it's specified to support unicode only. I don't think changing that to accept either bytes or unicode is a good idea on Python 3, where there is no stream in the language or standard library that accepts both at the same time. An implementation for 2.x could support both, but I don't know if there is a use case for that. In general though I have to say that very few people use wsgi.errors currently, so I don't think this is a real issue anyway.

- The web3 name. If there is any value in this PEP and we find something to decide on, there is no reason this couldn't be WSGI 2. But as long as it's just something a small part of the web-sig community worked on directly, a separate name is a good thing I think, because it does not reserve the name WSGI 2 for something that might actually become WSGI 2 in case this PEP gets rejected.

Regards,
Armin
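The "decode + an encode" step mentioned above for native-string environ values can be illustrated as follows. This is a minimal sketch under stated assumptions: the server is assumed to have decoded the raw path bytes as latin1, the application is assumed to want UTF-8, and `transcode_path` is a hypothetical helper name.

```python
def transcode_path(path_info, charset="utf-8"):
    """Turn a latin1-decoded path back into bytes, then decode it properly."""
    raw = path_info.encode("latin-1")      # recover the bytes the client sent
    return raw.decode(charset, "replace")  # decode with the intended charset

# What a latin1-decoding server would hand over for the UTF-8 bytes of "/café".
environ = {"PATH_INFO": b"/caf\xc3\xa9".decode("latin-1")}
path = transcode_path(environ["PATH_INFO"])
```

With a bytes environ the same application would do a single decode instead; with a latin1 native string it pays the extra encode, but nothing is lost, because latin1 maps each byte to exactly one code point.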
Re: [Web-SIG] PEP 444 (aka Web3)
On Fri, Sep 17, 2010 at 6:58 AM, Armin Ronacher armin.ronac...@active-4.com wrote:
> - The async part. It was added in the hope that someone would step up and come up with something better as replacement. I asked in the #twisted IRC channel but they did not see any value in supporting a common specification that was shared with the synchronous world and it looks like it will be harder to find someone that does care about this particular issue. The motivation was that facebook's tornado framework is currently attracting a lot of users and creating an environment besides the WSGI one which means that it might be quite hard to share some code between those two worlds. I also remember hearing a lot of backlash when start_response was considered for deleting last time from the nginx mod_wsgi maintainer. If I can't find someone that is willing to provide some input on that I will remove that section.

I'm with the Twisted community on this one in that I see no real value. Async operations and awareness should (IMHO) really be left up to the server/framework, not the application(s) or middleware.

> - the web3 name If there is any value in this PEP and we find something to decide on, there is no reason this couldn't be WSGI 2. But until it's just something a small part of the web-sig community worked on directly a separate name is a good thing I think, because it does not reserve the name WSGI 2 for something that might actually become WSGI 2 in case this PEP gets rejected.

I personally still don't see any real benefit to changing the key names from wsgi to web3 (or whatever). I would prefer it remain the same. If you're going to use Python3, you know you're using Python3 (you don't need web3 key names to know that). (subjective)

cheers
James

-- 
-- James Mills
--
-- Problems are solved by method
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 4:58 PM, Armin Ronacher armin.ronac...@active-4.com wrote:
> - Bytes values in the environment: HTTP transmits bytes, that's a fact we can't change. When we go with native strings we will go with unicode on 3.x This has the following implications:
> - getting the right path info requires a decode + an encode unless you are assuming latin1.

Not if you are working with the URL-encoded paths.

> - same as above for the script name and cookie header

Cookie is weird. If that one header could be bytes, that'd be great... but special-casing Cookie/Set-Cookie is too hard/weird. Plus handling Cookie/Set-Cookie as Latin1 is just one more line of code (well, two, one for each header).

> When going with unicode strings on 3.x for environ values, we would have to do the same for outgoing values which makes middlewares a lot harder to write:

All response headers handle encoded URLs (e.g., Location), so SCRIPT_NAME/PATH_INFO issues don't come into play. Set-Cookie could be an issue, though only really when someone wants to replicate an external system's weird cookies -- except for legacy issues it's best for application developers to stick to ASCII cookies (URL-encoding cookie values is a popular way of doing this).

I don't know of any other header (or the status) that would reasonably cause a problem. And I'm not glossing over corner cases -- I'm generally very aware and concerned with legacy issues, and interacting with legacy systems. There just aren't any here except for the resolvable issues I've listed.

> - web3.errors I think Ian raised concern that it's specified to support unicode only. I don't think we should change that to accepting either bytes or unicode is a good idea on Python 3 where there is no stream in the language or standard library that accepts both at the same time. An implementation for 2.x could support both, but I don't know if there is a usecase for that. In general though I have to say that very few people use wsgi.errors currently, so I don't think this is a real issue anyways.

It's more of an issue under Python 2, it could probably be ignored with Python 3. Under Python 2 when you have some error condition it's really frustrating to encounter some unicode error with the logging of that error (often covering up the original error).

-- Ian Bicking | http://blog.ianbicking.org
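The point that URL-encoded paths avoid the decode/encode dance can be made concrete: once a path is percent-quoted it is pure ASCII, so it reads the same as bytes or as a native string. The sketch below assumes a server that only has the unquoted, latin1-decoded value and must requote it; `requote_path` is a hypothetical helper.

```python
from urllib.parse import quote, unquote_to_bytes

def requote_path(path_info):
    """Percent-quote a latin1-decoded path so the result is ASCII-only."""
    raw = path_info.encode("latin-1")  # the bytes originally sent by the client
    return quote(raw, safe="/")        # "/" is kept; other unsafe bytes become %XX

quoted = requote_path(b"/caf\xc3\xa9 menu".decode("latin-1"))
original_bytes = unquote_to_bytes(quoted)
```

One limitation noted elsewhere in the thread still applies: a "%2F" in the original request has already been turned into "/" by the time the server unquotes, so requoting cannot bring it back.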
Re: [Web-SIG] PEP 444 (aka Web3)
Hi,

On 9/17/10 3:43 AM, Ian Bicking wrote:
> Not if you are working with the URL-encoded paths.

SCRIPT_NAME / PATH_INFO will always stay unencoded and the current spec requires the web3.script_name thing to only be provided if the server can safely provide that. So at least for the fallback, we are dealing with (properly latin1 decoded) non-URL encoded things. Can be changed of course.

> Cookie is weird. If that one header could be bytes, that'd be great... but special-casing Cookie/Set-Cookie is too hard/weird.

Special casing one header is indeed weird.

> I don't know of any other header (or the status) that would reasonably cause a problem. And I'm not glossing over corner cases -- I'm generally very aware and concerned with legacy issues, and interacting with legacy systems. There just aren't any here except for the resolvable issues I've listed.

Technically speaking it would affect etags too, but I doubt anyone is using non-ASCII quoted strings there. A very funny header is btw the Warning header which actually can have any encoding:

    The warn-text SHOULD be in a natural language and character set that is most likely to be intelligible to the human user receiving the response. This decision MAY be based on any available knowledge, such as the location of the cache or user, the Accept-Language field in a request, the Content-Language field in a response, etc. The default language is English and the default character set is ISO-8859-1. If a character set other than ISO-8859-1 is used, it MUST be encoded in the warn-text using the method described in RFC 2047 [14].

Doubt anyone is using that header though.

> It's more of an issue under Python 2, it could probably be ignored with Python 3. Under Python 2 when you have some error condition it's really frustrating to encounter some unicode error with the logging of that error (often covering up the original error).

I guess there it would be fine to have a stderr-like stream that accepts unicode and bytes.

Regards,
Armin
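A stderr-like stream that accepts both unicode and bytes, as floated at the end of the message above, could be a thin wrapper around a text stream. The class below is a hypothetical sketch (`TolerantErrors` is not part of any spec or library); it assumes UTF-8 for incoming bytes and replaces undecodable input rather than raising, so that logging an error never produces a second error.

```python
import io

class TolerantErrors:
    """Wrap a text stream so write() accepts both str and bytes."""

    def __init__(self, stream, encoding="utf-8"):
        self._stream = stream
        self._encoding = encoding

    def write(self, data):
        if isinstance(data, bytes):
            # Best effort: never raise a decoding error while logging.
            data = data.decode(self._encoding, "replace")
        self._stream.write(data)

    def writelines(self, lines):
        for line in lines:
            self.write(line)

    def flush(self):
        self._stream.flush()

buffer = io.StringIO()
errors = TolerantErrors(buffer)
errors.write("text ")
errors.write(b"bytes \xff")
```

In a real deployment the wrapped stream would be whatever the server exposes as its error stream; `io.StringIO` is used here only to keep the sketch self-contained.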
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 9:59 PM, Armin Ronacher armin.ronac...@active-4.com wrote:
> On 9/17/10 3:43 AM, Ian Bicking wrote:
> > Not if you are working with the URL-encoded paths.
>
> SCRIPT_NAME / PATH_INFO will always stay unencoded and the current spec requires the web3.script_name thing to only be provided if the server can safely provide that. So at least for the fallback, we are dealing with (properly latin1 decoded) non-URL encoded things. Can be changed of course.

Yes, if we get rid of SCRIPT_NAME/PATH_INFO then the problem goes away. For servers without access to the unencoded value, reencoding those values doesn't actually lose any information over what we have now, and avoids any encoding issues. Servers with REQUEST_URI can at least attempt to reconstruct the encoded values.

> > Cookie is weird. If that one header could be bytes, that'd be great... but special-casing Cookie/Set-Cookie is too hard/weird.
>
> Special casing one header is indeed weird.

Cookie is also the one header that can't be safely folded. It's just a messed up header, and requires hacky workarounds.

> > I don't know of any other header (or the status) that would reasonably cause a problem. And I'm not glossing over corner cases -- I'm generally very aware and concerned with legacy issues, and interacting with legacy systems. There just aren't any here except for the resolvable issues I've listed.
>
> Technically speaking it would affect etags too, but I doubt anyone is using non-ASCII quoted strings there. A very funny header is btw the Warning header which actually can have any encoding:
>
>     The warn-text SHOULD be in a natural language and character set that is most likely to be intelligible to the human user receiving the response. This decision MAY be based on any available knowledge, such as the location of the cache or user, the Accept-Language field in a request, the Content-Language field in a response, etc. The default language is English and the default character set is ISO-8859-1. If a character set other than ISO-8859-1 is used, it MUST be encoded in the warn-text using the method described in RFC 2047 [14].
>
> Doubt anyone is using that header though.

The Title header (in Atompub) also suggests 2047, but that's essentially an ASCII conversion like URL quoting. It looks something like =?iso-8859-1?q?p=F6stal?=

-- Ian Bicking | http://blog.ianbicking.org
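Since an RFC 2047 encoded word is plain ASCII on the wire, decoding it can be left to the application, as suggested earlier in the thread. A minimal sketch using the standard library's `email.header` module is shown below; `decode_rfc2047` is a hypothetical wrapper name, and the sample value is the one quoted in the message above.

```python
from email.header import decode_header, make_header

def decode_rfc2047(value):
    """Decode RFC 2047 encoded words in a header value into a str."""
    return str(make_header(decode_header(value)))

title = decode_rfc2047("=?iso-8859-1?q?p=F6stal?=")
```

Values without encoded words pass through unchanged, so the helper can be applied to a header without first checking whether it uses RFC 2047.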
Re: [Web-SIG] PEP 444 (aka Web3)
Hi,

On 9/17/10 4:21 AM, Ian Bicking wrote:

The Title header (in Atompub) also suggests 2047, but that's essentially an ASCII conversion like URL quoting. It looks something like =?iso-8859-1?q?p=F6stal?=

Yep. That was merely a fun fact I wanted to share. I was not aware of HTTP specifying a non-latin1 header anywhere. I suppose the authors of the HTTP specification were aware of encoding issues; it's just that the people who wrote the Cookie specification didn't have non-ASCII payloads in mind. Not too surprising, after all it's called Cookie and not arbitrary data-store :)

Regards,
Armin
Re: [Web-SIG] PEP 444 (aka Web3)
I fully support it!

Massimo

On Sep 15, 2010, at 6:03 PM, Chris McDonough wrote:

A PEP was submitted and accepted today for a WSGI successor protocol named Web3: http://python.org/dev/peps/pep-0444/ I'd encourage other folks to suggest improvements to that spec or to submit a competing spec, so we can get WSGI-on-Python3 settled soon.

- C
Re: [Web-SIG] PEP 444 (aka Web3)
On Thu, Sep 16, 2010 at 9:40 AM, Massimo Di Pierro mdipie...@cs.depaul.edu wrote:

I fully support it!

I don't entirely. I don't quite agree with the key changes from wsgi to web3. I think it's unnecessary.

cheers
james

--
-- James Mills
--
-- Problems are solved by method
Re: [Web-SIG] PEP 444 (aka Web3)
At 07:03 PM 9/15/2010 -0400, Chris McDonough wrote:

A PEP was submitted and accepted today for a WSGI successor protocol named Web3: http://python.org/dev/peps/pep-0444/ I'd encourage other folks to suggest improvements to that spec or to submit a competing spec, so we can get WSGI-on-Python3 settled soon.

The first thing I notice is that web3.async appears to force all existing middleware to delete it from the environment if it wishes to remain compatible, unless it adapts to support receiving callables itself.

On further reading I see you have something about middleware disabling itself if it doesn't support async execution, but this doesn't make any sense to me: if it can't support async execution, why wouldn't it just delete web3.async from the environ, forcing its wrapped app to be synchronous instead?

I'm also not a fan of the bytes environ, or the new path_info/script_name variables; note that the spec's sample CGI implementation does not itself provide the new variables, and that middleware must be explicitly written to handle the case where there is duplication.

My main fear with this spec is that people will assume they can just make a few superficial changes to run WSGI code on it, when in fact it is deeply incompatible where middleware is concerned. In fact, AFAICT, it seems like it will be *harder* to write correct web3 middleware than it is to write correct WSGI middleware now.

This seems like a step backward, since the whole idea behind dropping start_response() was to make correct middleware *easier* to write. Any time a spec makes something optional or allows More Than One Way To Do It, it immediately doubles the minimum code required to implement that portion of the spec in compliant middleware. This spec has two optionalities: web3.async, and the optional path_info/script_name, so the return handling of every piece of middleware is doubled (or else environ['web3.async'] = False must be added at the top), and any code that modifies paths must similarly ditch the special variables or do double work to update them.
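To make the middleware burden concrete, here is a rough sketch of the "opt out at the top" approach mentioned above. It assumes a web3-style application is a callable taking environ and returning a (body, status, headers) tuple of bytes values, as in the file_app example elsewhere in this thread. The names sync_only_middleware, hello_app and the X-Example header are hypothetical and not part of any spec.

```python
def sync_only_middleware(app):
    """Hypothetical middleware that only understands synchronous results."""
    def wrapper(environ):
        # Opt out of async up front, so the wrapped app has to return
        # a finished (body, status, headers) tuple rather than a callable.
        environ['web3.async'] = False
        body, status, headers = app(environ)
        # Illustrative post-processing: append one response header.
        headers = list(headers) + [(b'X-Example', b'1')]
        return body, status, headers
    return wrapper


def hello_app(environ):
    """Trivial application used to exercise the middleware."""
    return [b'hello'], b'200 OK', [(b'Content-Type', b'text/plain')]


wrapped = sync_only_middleware(hello_app)
environ = {'web3.async': True}
body, status, headers = wrapped(environ)
```

Middleware that wanted to support async as well would need a second code path for the case where the wrapped app returns a callable, which is the doubling described above.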
Re: [Web-SIG] PEP 444 (aka Web3)
On Wed, 2010-09-15 at 20:05 -0400, P.J. Eby wrote:

At 07:03 PM 9/15/2010 -0400, Chris McDonough wrote:

A PEP was submitted and accepted today for a WSGI successor protocol named Web3: http://python.org/dev/peps/pep-0444/ I'd encourage other folks to suggest improvements to that spec or to submit a competing spec, so we can get WSGI-on-Python3 settled soon.

The first thing I notice is that web3.async appears to force all existing middleware to delete it from the environment if it wishes to remain compatible, unless it adapts to support receiving callables itself.

We can ditch everything concerning web3.async as far as I'm concerned. Ian has told me that this feature won't be liked by the async people anyway, as it doesn't have a trigger mechanism.

On further reading I see you have something about middleware disabling itself if it doesn't support async execution, but this doesn't make any sense to me: if it can't support async execution, why wouldn't it just delete web3.async from the environ, forcing its wrapped app to be synchronous instead? I'm also not a fan of the bytes environ, or the new path_info/script_name variables; note that the spec's sample CGI implementation does not itself provide the new variables, and that middleware must be explicitly written to handle the case where there is duplication.

I'm not concerned about which environment variables have it, but I would definitely like to be able to get at the original (non-%2F-decoded) path info somewhere. I'd be fine if PATH_INFO was just that, and get rid of web3.path_info. web3.script_name is probably just a mistake entirely.

My main fear with this spec is that people will assume they can just make a few superficial changes to run WSGI code on it, when in fact it is deeply incompatible where middleware is concerned. In fact, AFAICT, it seems like it will be *harder* to write correct web3 middleware than it is to write correct WSGI middleware now.

I'm very willing to drop web3.async entirely. It seems reasonable to do so. I should have done so before I mailed the spec, as I knew it would be unpopular.

This seems like a step backward, since the whole idea behind dropping start_response() was to make correct middleware *easier* to write. Any time a spec makes something optional or allows More Than One Way To Do It, it immediately doubles the minimum code required to implement that portion of the spec in compliant middleware. This spec has two optionalities: web3.async, and the optional path_info/script_name, so the return handling of every piece of middleware is doubled (or else environ['web3.async'] = False must be added at the top), and any code that modifies paths must similarly ditch the special variables or do double work to update them.

No worries, let's get rid of both, with the caveat that it's pretty essential (to me anyway) to be able to get at the non-%2F-decoded path somewhere. The most sensible thing to me would be to put it in PATH_INFO.

As far as bytes vs. strings, whatever, we have to pick one. Bytes makes more sense to me. I'll leave it to the native-string and/or unicode people to create their own spec.

- C
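As an illustration of why the original (non-%2F-decoded) path matters, here is a small sketch of the information that is lost once a path has been unquoted. It assumes Python 3's urllib.parse (the thread itself refers to Python 2's urllib.unquote), and the path /files/a%2Fb/report is a made-up example.

```python
from urllib.parse import quote, unquote

# Hypothetical request path whose second segment is the single
# name "a/b", sent with its slash escaped as %2F.
raw_path = "/files/a%2Fb/report"

# What a CGI-style server hands over after unquoting.
decoded = unquote(raw_path)

# Splitting the decoded path invents an extra segment boundary.
segments_from_decoded = decoded.split("/")

# Splitting the raw path first, then unquoting each segment,
# keeps "a/b" together.
segments_from_raw = [unquote(seg) for seg in raw_path.split("/")]

# Requoting the decoded path cannot recover the original %2F.
requoted = quote(decoded)

print(segments_from_decoded)
print(segments_from_raw)
print(requoted)
```

Once the server has unquoted the path, no amount of requoting downstream can tell an escaped slash from a real segment separator, so the raw form has to be carried somewhere in the environ if applications are to see it.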
Re: [Web-SIG] PEP 444 (aka Web3)
A PEP was submitted and accepted today for a WSGI successor protocol named Web3: http://python.org/dev/peps/pep-0444/ I'd encourage other folks to suggest improvements to that spec or to submit a competing spec, so we can get WSGI-on-Python3 settled soon.

- C

I generally like it. About the *.file_wrapper removal, I suggest a PSGI-like approach where 'body' can contain a file object.

    def file_app(environ):
        # Binary mode, so the file yields bytes like the rest of the response
        fd = open('/tmp/pippo.txt', 'rb')
        status = b'200 OK'
        headers = [(b'Content-type', b'text/plain')]
        body = fd
        return body, status, headers

or

    def file_app(environ):
        # Binary mode, so the file yields bytes like the rest of the response
        fd = open('/tmp/pippo.txt', 'rb')
        status = b'200 OK'
        headers = [(b'Content-type', b'text/plain')]
        # The body mixes plain bytes with a file object
        body = [b'Header', fd, b'Footer']
        return body, status, headers

(And what about returning multiple file objects?)

By the way, congratulations on the big step forward.

-- Roberto De Ioris http://unbit.it
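One way a server could consume such a body is sketched below. This is only an assumption about how the suggestion might be implemented: iter_body is a hypothetical helper, not part of PEP 444 or PSGI, and io.BytesIO stands in for a real file opened in binary mode.

```python
import io


def iter_body(body, chunk_size=8192):
    """Hypothetical server-side helper: yield bytes chunks from a body that
    is either a single file object or an iterable mixing bytes and files."""
    # A bare file object is treated as a one-item body.
    if hasattr(body, 'read'):
        body = [body]
    for item in body:
        if hasattr(item, 'read'):
            # File-like item: stream it in chunks, then close it.
            try:
                while True:
                    chunk = item.read(chunk_size)
                    if not chunk:
                        break
                    yield chunk
            finally:
                item.close()
        else:
            # Plain bytes item: pass it through unchanged.
            yield item


fd = io.BytesIO(b'file contents')
body = [b'Header', fd, b'Footer']
result = b''.join(iter_body(body))

single = b''.join(iter_body(io.BytesIO(b'only a file')))
```

Because each item is handled independently, a body holding several file objects needs no extra code in this sketch. A real server might instead hand file objects to a platform facility such as sendfile, which this example does not attempt.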