Chris Lambacher <[email protected]> added the comment:
Sorry in advance for the long-winded response.
Ron, have you looked at my patch?
The underlying issue is that the semantics of print() changed between Python 2
and 3. print() does not write a bytes object through unchanged in Python 3. In
Python 2 str was a "bytes" type and so print happily sent encoded strings to
stdout.
This presents an issue for both --type=html and the text version if an encoding
is asked for. Just using print() will result in repr being called on the byte
string and you get either an invalid HTML file or a text file with extra junk
in it (same junk in both).
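As a small illustration of that "junk", the sketch below captures what print()
emits for an encoded string in Python 3. The sample markup and the variable
names are made up for the example and are not taken from the patch.

```python
import io
from contextlib import redirect_stdout

# Hypothetical sample output; any non-ASCII text shows the same effect.
html = "<p>caf\u00e9</p>"
encoded = html.encode("utf-8", "xmlcharrefreplace")  # bytes, not str

# Capture what print() would send to stdout.
buf = io.StringIO()
with redirect_stdout(buf):
    print(encoded)
printed = buf.getvalue()

# print() falls back to the repr of the bytes object, so the output carries
# a leading b' and escaped bytes instead of the encoded text itself.
```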
If you ask for an encoding, you are going to get bytes. Changing it back into a
string to mask that effect does not actually fix anything: once you call
print() you are back to the default encoding, which is more broken still,
because you are no longer doing what the user asked for (a particular
encoding).
In order for:
    return str(''.join(v).encode(encoding, "xmlcharrefreplace"),
               encoding=encoding)
to solve the issue, you would also need to take away the ability for the user
to specify an encoding (at the command line and via the API). It's already a
string; why turn it into bytes and then back into a string? If you don't want to deal
with encoding, then return a string and leave it up to the consumer of the API
to handle the desired encoding (and the "xmlcharrefreplace", maybe with a note
in the docs).
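To make the problem with the encode-then-decode round trip concrete, here is a
rough sketch. It assumes a stdout whose default encoding is UTF-8 (simulated
with an in-memory stream) and a user who asked for latin-1; both choices are
only for illustration.

```python
import io

text = "caf\u00e9"
requested = "latin-1"  # the encoding the user asked for (example value)

# The round trip from the snippet above: encode, then decode right back.
round_tripped = str(text.encode(requested, "xmlcharrefreplace"),
                    encoding=requested)

# Stand-in for sys.stdout with an assumed UTF-8 default encoding.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
print(round_tripped, file=fake_stdout)
fake_stdout.flush()

emitted = raw.getvalue()                  # bytes that actually went out
wanted = text.encode(requested) + b"\n"   # bytes the user asked for
```

Here emitted is UTF-8 while wanted is latin-1, so the requested encoding is
silently dropped.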
If you do want to deal with encoding (which I think we are stuck with), then
solve the real issue by not using print() (see my patch).
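One way to avoid print() is to write the encoded bytes to the binary layer
underneath stdout. The helper below is a hypothetical sketch of that idea, not
a copy of the patch; write_encoded and its stream parameter are names invented
for the example.

```python
import io
import sys


def write_encoded(text, encoding, stream=None):
    """Encode text as requested and write the raw bytes, bypassing print()."""
    data = text.encode(encoding, "xmlcharrefreplace")
    if stream is None:
        # sys.stdout.buffer is the binary stream underneath the text stdout.
        stream = sys.stdout.buffer
    stream.write(data)
    return data


# Usage with an in-memory sink so the result can be inspected.
sink = io.BytesIO()
write_encoded("caf\u00e9", "ascii", sink)
```

With "ascii" as the requested encoding, the non-ASCII character comes out as a
character reference and nothing is re-encoded on the way to the stream.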
I think the only reason that my patch was not accepted, and why this is still
languishing, is that I said I would provide tests and have not had time to do
so.
Please feel free to correct me if I am wrong about any of the above.
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue10087>
_______________________________________