Glenn Linderman <v+pyt...@g.nevcal.com> added the comment:

Pierre said:
The encoding used by the browser is defined in the Content-Type meta tag, or 
the content-type header; if not, the default seems to vary for different 
browsers. So it's definitely better to define it.

The argument stream_encoding used in FieldStorage *must* be this encoding.

I say:
I agree it is better to define it.  I think you just said the same thing that 
the page I linked to said; I might not have conveyed that correctly in my 
paraphrasing.  I assume you are talking about the charset of the Content-Type 
of the form page itself, as served to the browser, since the browser, sadly, 
doesn't send that charset back with the form data.
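
To illustrate why the charset of the form page matters: the browser sends only percent-encoded bytes, so the same payload decodes to different text depending on the encoding the script assumes. The following is a minimal sketch using urllib.parse.parse_qs from the standard library, not the FieldStorage code under discussion; the field name and value are made up for the example.

```python
from urllib.parse import parse_qs

# Bytes a browser would send for U+00E9 (e with acute accent) when the
# form page was served as UTF-8.
raw_query = "name=%C3%A9"

# Decoding with the page's actual charset recovers the original text.
as_utf8 = parse_qs(raw_query, encoding="utf-8")

# Assuming the wrong charset raises no error; it silently yields mojibake.
as_latin1 = parse_qs(raw_query, encoding="latin-1")

# ascii() keeps the output printable whatever the console encoding is.
print(ascii(as_utf8))
print(ascii(as_latin1))
```

Nothing in the request distinguishes the two readings, which is why the script has to know which charset the form page was served with.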

Pierre says:
But this raises another problem, when the CGI script has to print the data 
received. The built-in print() function encodes the string with 
sys.stdout.encoding, and this will fail if the string can't be encoded with it. 
It is the case on my PC, where sys.stdout.encoding is cp1252: it can't handle 
Arabic or Chinese characters.
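
The failure described here can be reproduced without a Windows console. The sketch below stands in for a cp1252 stdout with an io.TextIOWrapper over an in-memory buffer; the helper name try_print is hypothetical and exists only for this illustration.

```python
import io

# Stand-in for a stdout whose encoding is cp1252.
raw = io.BytesIO()
cp1252_stdout = io.TextIOWrapper(raw, encoding="cp1252")


def try_print(text, stream):
    """Return True if print() could encode text for the given stream."""
    try:
        print(text, file=stream)
        stream.flush()
        return True
    except UnicodeEncodeError:
        return False


# cp1252 covers Western European characters...
latin_ok = try_print("caf\u00e9", cp1252_stdout)
# ...but has no mapping for Chinese characters.
chinese_ok = try_print("\u4e2d\u6587", cp1252_stdout)

print(latin_ok, chinese_ok)
```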

I say:
I don't think there is any need to override print, especially not 
builtins.print.  It is still true that the HTTP data stream is and should be 
treated as a binary stream.  So the script author is responsible for creating 
such a binary stream.
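
As a rough sketch of what "creating such a binary stream" can look like, the script encodes the page itself and writes bytes, so sys.stdout.encoding never comes into play. The function names build_response and send are assumptions made for this example, and an io.BytesIO stands in for sys.stdout.buffer.

```python
import io


def build_response(body, encoding="utf-8"):
    """Encode the header and the body into the bytes a CGI script emits."""
    header = "Content-Type: text/html; charset={}\r\n\r\n".format(encoding)
    return header.encode("ascii") + body.encode(encoding)


def send(binary_stream, body, encoding="utf-8"):
    """Write the encoded response to a binary stream and flush it."""
    binary_stream.write(build_response(body, encoding))
    binary_stream.flush()


# In a real script the stream would be sys.stdout.buffer.
out = io.BytesIO()
send(out, "<p>\u4e2d\u6587</p>")
```

The charset announced in the header and the encoding applied to the body come from the same parameter, so they cannot drift apart.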

The FieldStorage class does not use the print method, so it seems inappropriate 
to add a parameter to its constructor to create a print method that it doesn't 
use.

For the convenience of CGI script authors, it would be nice if CGI provided 
access to the output stream in a useful way... and I agree that, because the 
generation of an output page comes complete with its own encoding, the 
output stream encoding parameter should be separate from the stream_encoding 
parameter required for FieldStorage.

A separate, new function or class for doing that seems appropriate, possibly 
included in cgi.py, but not in FieldStorage.  Message 125100 in this issue 
describes a class IOMix that I wrote and use for such; codifying it by 
including it in cgi.py would be fine by me... I've been using it quite 
successfully for some months now.

The last line of Message 125100 may be true: perhaps a few more methods should 
be added.  However, print is not one of them.  I think you'll be pleasantly 
surprised to discover (as I was, after writing that line) that 
builtins.print converts its parameters to str and writes to stdout, assuming 
that stdout will do the appropriate encoding.  The class IOMix will, in fact, 
do that appropriate encoding (given an appropriate parameter to its 
initialization).  Perhaps for CGI, a convenience function could be added to 
IOMix to include the last two code lines after IOMix in the prior message:

        @staticmethod
        def setup( encoding="UTF-8"):
            sys.stdout = IOMix( sys.stdout, encoding )
            sys.stderr = IOMix( sys.stderr, encoding )

Note that IOMix allows the user's choice of output stream encoding, applies it 
to both stdout and stderr, which both need it, and also allows the user to 
generate binary directly (if sending back a file, for example), as both bytes 
and str are accepted.
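
The behaviour described above can be sketched as follows. This is not the IOMix class from Message 125100, which is not reproduced here; MixedWriter is a hypothetical stand-in that assumes it is handed a binary stream, and it shows only the idea of accepting both str and bytes while leaving print untouched.

```python
import io


class MixedWriter:
    """Hypothetical wrapper: accepts str or bytes, writes bytes downstream."""

    def __init__(self, binary_stream, encoding="UTF-8"):
        self.binary_stream = binary_stream
        self.encoding = encoding

    def write(self, data):
        # str is encoded with the chosen encoding; bytes pass through as-is.
        if isinstance(data, str):
            data = data.encode(self.encoding)
        self.binary_stream.write(data)

    def flush(self):
        self.binary_stream.flush()


sink = io.BytesIO()  # stands in for the underlying binary stdout
out = MixedWriter(sink, "UTF-8")

# The unmodified built-in print converts its arguments to str and calls write().
print("\u0645\u0631\u062d\u0628\u0627", file=out)
# Binary data, such as the start of a file being sent back, goes through directly.
out.write(b"\x89PNG")
out.flush()
```

Because print only needs an object with a write method, the encoding decision lives entirely in the stream object and builtins.print does not have to change.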

In 3.x, print can be used with a file= parameter, which your implementation 
doesn't permit, and which a CGI script could use to write to other files. 
So I really, really don't think we want to override builtins.print with a 
version that lacks the file= parameter and is tied specifically to stdout.
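
For reference, a small sketch of the file= usage that such an override would take away. The two io.StringIO objects are stand-ins for a log file and the response stream, and their names are made up for the example.

```python
import io

log_file = io.StringIO()  # stands in for an open log file
page = io.StringIO()      # stands in for the response stream

# The same built-in print targets whichever stream is passed as file=.
print("request received", file=log_file)
print("<p>hello</p>", file=page)
```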

My message 126075 still needs to be included in your next patch.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4953>
_______________________________________