Manlio Perillo wrote:
> In a CGI application, HTTP headers are Unicode strings, and are decoded
> using system default encoding.
> In a future WSGI application, HTTP headers are Unicode strings, and are
> decoded using latin-1 encoding.
Yes. As proposed, WSGI 1.1 would require the CGI-to-WSGI handler to undo
the decode stage caused by reading the environ using the default encoding.
At least this is now reliably possible thanks to surrogateescape.
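
As a rough sketch of what that undo step could look like in a CGI adapter
(the function name and the explicit environ_encoding parameter are my own
assumptions for illustration, not anything the proposal specifies):

```python
def cgi_value_to_wsgi(value, environ_encoding="utf-8"):
    """Undo the environ decode, then re-decode the raw bytes as Latin-1.

    Hypothetical helper. `environ_encoding` stands in for whatever encoding
    the interpreter used when it read the environment; it is a parameter
    here only so the example is reproducible.
    """
    # surrogateescape turns the lone surrogates back into the original bytes.
    raw = value.encode(environ_encoding, "surrogateescape")
    # Latin-1 maps every byte to one code point, so this cannot fail.
    return raw.decode("latin-1")


# Simulate the interpreter reading a PATH_INFO that is not valid UTF-8.
raw_path = b"/caf\xe9"
as_read = raw_path.decode("utf-8", "surrogateescape")
wsgi_path = cgi_value_to_wsgi(as_read)
```

In a real adapter, os.fsencode(value).decode("latin-1") should do the same
job, since os.fsencode uses the filesystem encoding with surrogateescape.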
PATH_INFO is the only really important HTTP-related environment variable
for Unicode. Potentially SCRIPT_NAME could also be significant in
relation to PATH_INFO. The HTTP headers don't massively matter because
there are almost never any non-ASCII characters in them.
Previously the job of undoing an unwanted decode step was dumped on
whatever read the PATH_INFO; usually a routing component, which would
have to make guesses with typically poor results. The CGI adapter is in
a much better place to do it, being closer to the server.
> The problem is that not all browsers use latin-1.
Not WSGI's problem. WSGI will deliver bytes encoded into Unicode
strings, not ready-to-use Unicode strings. It is up to the application
to decide how it wants to handle those bytes; maybe it wants Latin-1
and can do nothing, maybe it wants to recode to UTF-8, maybe something
else completely. No solution satisfies every app, so there is always
going to have to be a recode step somewhere.
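
For an application that wants UTF-8, the recode step might look roughly
like this. The helper name and the fall-back-to-Latin-1 policy are
assumptions of mine; an app could just as well return an error instead:

```python
def recode_path(wsgi_value, encoding="utf-8"):
    """Recode a Latin-1-decoded WSGI string to the encoding the app wants.

    Hypothetical helper: falls back to the Latin-1 reading when the bytes
    are not valid in the requested encoding.
    """
    # Encoding as Latin-1 recovers exactly the bytes the server sent.
    raw = wsgi_value.encode("latin-1")
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return wsgi_value


# UTF-8 bytes for "/café", as WSGI would hand them over.
from_utf8_client = recode_path("/caf\xc3\xa9")
# Latin-1 bytes for "/café"; not valid UTF-8, so the value is kept as-is.
from_latin1_client = recode_path("/caf\xe9")
```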
An application that doesn't want to think about this will use a
framework that does it for them.
> What about HTTP_COOKIE?
For what it's worth, the choice of Latin-1 here results in the 'right'
Unicode string for more browsers than any other potential encoding.
In any case, as previously discussed, non-ASCII cookies are already
totally broken everywhere and hence used by no one.
--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/