Glenn Linderman <v+pyt...@g.nevcal.com> added the comment:

Victor said:
I mean: you should pass sys.stdin.buffer instead of sys.stdin.

I say:
That would be possible, but in that case it is hard to leave the argument at 
its default, because sys.stdin is, by default, not a binary stream.  It is a 
convenience for FieldStorage to have a useful default for its input, since RFC 
3875 declares that the message body is obtained from "standard input".
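
One way to keep a useful default while still reading bytes is sketched below. 
The helper name binary_input is hypothetical and is not part of the cgi module; 
the sketch only assumes that text streams such as sys.stdin expose their 
underlying byte stream as .buffer.

```python
import io
import sys


def binary_input(fp=None):
    """Return a binary stream for the request body (hypothetical helper)."""
    if fp is None:
        # RFC 3875: the message body arrives on standard input.
        fp = sys.stdin
    # A text wrapper such as sys.stdin exposes its byte stream as .buffer;
    # streams without that attribute are returned unchanged.
    buffer = getattr(fp, "buffer", None)
    if isinstance(fp, io.TextIOBase) and buffer is not None:
        return buffer
    return fp
```

A caller that passes nothing gets sys.stdin.buffer, and a caller that already 
passes a binary stream gets that same stream back.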

Pierre said:
I wish it could be as simple, but I'm afraid it's not. On my PC, 
sys.stdin.encoding is cp-1252. I tested a multipart/form-data with an INPUT 
field, and I entered the euro character, which is encoded \x80 in cp-1252.

If I use the encoding defined for sys.stdin (cp-1252) to decode the bytes 
received on sys.stdin.buffer, I get the correct value in the cgi script; if I 
set the encoding to latin-1 in FieldStorage, since \x80 maps to undefined in 
latin-1, I get a UnicodeEncodeError if I try to print the value ("character 
maps to <undefined>").
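
The behaviour described above can be reproduced with a minimal sketch, assuming 
the output stream uses cp1252 (Python's name for the codec).  Decoding \x80 as 
latin-1 succeeds and yields the control character U+0080; the error appears 
only when that character is encoded back to cp1252 for printing.

```python
raw = b"\x80"  # the single byte received for the euro sign

# Decoding with cp1252 gives the euro sign.
as_cp1252 = raw.decode("cp1252")

# Decoding with latin-1 never fails: every byte maps to the code point
# with the same number, here the C1 control character U+0080.
as_latin1 = raw.decode("latin-1")

# Printing to a cp1252 console means encoding to cp1252, and U+0080 has
# no representation there ("character maps to <undefined>").
try:
    as_latin1.encode("cp1252")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True
```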

I say:
Interesting. I'm curious what system (probably Windows, since you mention 
cp-), browser, and HTTP server you used for that test.  Is it possible to 
capture the data stream for that test?  If you can, describe how and at what 
stage the data stream was captured.  Most interesting would be the interface 
between browser and HTTP server.
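
If capturing between browser and server is not practical, a capture at the CGI 
stage could look roughly like the sketch below.  The function name 
capture_body is hypothetical; in a real script the stream would be 
sys.stdin.buffer and the length would come from the CONTENT_LENGTH environment 
variable.

```python
import io


def capture_body(stream, content_length):
    """Read the raw request body and return it with a hex dump (sketch)."""
    body = stream.read(content_length)
    # A hex dump shows exactly which bytes arrived, before any decoding.
    dump = " ".join(f"{byte:02x}" for byte in body)
    return body, dump


# Example with an in-memory stream standing in for sys.stdin.buffer.
body, dump = capture_body(io.BytesIO(b"v=\x80&x"), 4)
```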

RFC 3875 states (section 4.1.3) what the default encodings should be, but I see 
that the first possibility is "system defined".  On the other hand, it seems to 
imply that this should be a default defined specifically for particular media 
types, not just a general system setting such as might be used as a default 
encoding for file handles... after all, most Web communication crosses system 
boundaries.  So lacking a system-defined default for text/ types, it then 
indicates that the default for text/ types is Latin-1.
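
That reading of section 4.1.3 can be sketched as follows.  The function name 
default_charset and the system_defaults mapping are hypothetical, chosen only 
to illustrate the order in which the rules would apply; this is not how the cgi 
module currently behaves.

```python
from email.message import Message


def default_charset(content_type, system_defaults=None):
    """Pick a charset for a CONTENT_TYPE value (illustrative sketch)."""
    msg = Message()
    msg["Content-Type"] = content_type

    # An explicit charset parameter always wins.
    charset = msg.get_param("charset")
    if charset is not None:
        return charset

    # Rule 1: a system-defined default for this particular media type.
    media_type = msg.get_content_type()
    if system_defaults and media_type in system_defaults:
        return system_defaults[media_type]

    # Rule 2: text/ types default to ISO-8859-1 (Latin-1).
    if msg.get_content_maintype() == "text":
        return "ISO-8859-1"

    # No default applies to other media types.
    return None
```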

I wonder what result you get with the same browser, at the web page 
http://rishida.net/tools/conversion/ by entering the euro symbol into the 
Characters entry field, and choosing convert.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4953>
_______________________________________