On Mon, Sep 21, 2009 at 09:14:13PM +0200, Armin Ronacher wrote:
> So the same standard should have different behavior on different Python
> versions? That would make framework code a lot more complicated.
I don't understand why it would be 'a lot more' complicated.
(The following code snippets are Python 3 only, and assume we're using
'native strings' everywhere.)
In the gateway, environ would be populated this way:
    environ['some_key'] = some_value.decode('utf8', 'surrogateescape')
Compare that to the utf-8-then-latin-1 alternative:
    try:
        environ['some_key'] = some_value.decode('utf-8')
        environ['some_key.encoding'] = 'utf-8'
    except UnicodeError:
        environ['some_key'] = some_value.decode('latin-1')
        environ['some_key.encoding'] = 'latin-1'
What you would have in the application to get the original value:
    environ['some_key'].encode('utf8', 'surrogateescape')
With utf8-then-latin1:
    environ['some_key'].encode(environ['some_key.encoding'])
The 'surrogateescape' way is clearly simpler. The 'equivalent' Python 2
code is even simpler:
    environ['some_key'] = some_value
And:
    environ['some_key']
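For anyone who wants to try the two Python 3 approaches side by side, here
is a minimal, self-contained sketch. The helper names and the sample byte
strings are made up for illustration and are not part of any spec; the
'some_key' and 'some_key.encoding' keys are the placeholders used above.
It only checks that both schemes give back the original bytes.

```python
# Sketch only: helper names and sample values are hypothetical.
# Requires Python 3.1+ for the 'surrogateescape' error handler (PEP 383).

def gateway_surrogateescape(environ, raw):
    """Gateway side: a single decode, no extra key needed."""
    environ['some_key'] = raw.decode('utf8', 'surrogateescape')


def app_surrogateescape(environ):
    """Application side: recover the original bytes."""
    return environ['some_key'].encode('utf8', 'surrogateescape')


def gateway_fallback(environ, raw):
    """Gateway side: try UTF-8, fall back to Latin-1, record which one."""
    try:
        environ['some_key'] = raw.decode('utf-8')
        environ['some_key.encoding'] = 'utf-8'
    except UnicodeError:
        environ['some_key'] = raw.decode('latin-1')
        environ['some_key.encoding'] = 'latin-1'


def app_fallback(environ):
    """Application side: needs the recorded encoding to recover the bytes."""
    return environ['some_key'].encode(environ['some_key.encoding'])


# One valid UTF-8 value and one that is not valid UTF-8.
samples = [b'/caf\xc3\xa9', b'/caf\xe9']

for raw in samples:
    env_a, env_b = {}, {}
    gateway_surrogateescape(env_a, raw)
    gateway_fallback(env_b, raw)
    # Both schemes round-trip back to the original bytes.
    assert app_surrogateescape(env_a) == raw
    assert app_fallback(env_b) == raw
```

Both round trips succeed; the difference is that the fallback scheme needs
a second environ key per value, and the application has to look it up.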
--
Henry Prêcheur