Chris Lambacher <ch...@kateandchris.net> added the comment:

Sorry in advance for the long-winded response.

Ron, have you looked at my patch?

The underlying issue is that the semantics of print() changed between Python 2 and 3. 
print() does not accept a bytes type in Python 3. In Python 2 str was a "bytes" 
type and so print happily sent encoded strings to stdout. 

This presents an issue for both --type=html and the text version if an encoding 
is asked for. Just using print() will result in repr being called on the byte 
string and you get either an invalid HTML file or a text file with extra junk 
in it (same junk in both).
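To make the "extra junk" concrete, here is a minimal sketch of what print() does with a bytes object in Python 3. The HTML fragment is a made-up stand-in, not actual calendar output, and stdout is captured only so the result can be inspected.

```python
import io
from contextlib import redirect_stdout

# Hypothetical stand-in for the encoded output of the formatter.
encoded = "<td>caf\u00e9</td>".encode("utf-8")

buf = io.StringIO()
with redirect_stdout(buf):
    print(encoded)  # print() calls str() on the bytes, which yields its repr

printed = buf.getvalue()
# printed now starts with b' and contains literal \xc3\xa9 escape text
```

The b'...' wrapper and the backslash escapes end up in the output verbatim, which is what breaks both the HTML and the text case.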

If you ask for an encoding, you are going to get bytes. Changing it back into a 
string to mask that effect does not actually fix things for you because once 
you do print() you are back to a default encoding and therefore more broken 
because you are not doing what the user asked for (which is a particular 
encoding).

In order for:
    return str(''.join(v).encode(encoding, "xmlcharrefreplace"),
                encoding=encoding)

to solve the issue, you would also need to take away the ability for the user 
to specify an encoding (at the command line and via the API). It's already a 
string, why make it a byte and then a string again? If you don't want to deal 
with encoding, then return a string and leave it up to the consumer of the API 
to handle the desired encoding (and the "xmlcharrefreplace", maybe with a note 
in the docs).
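A rough sketch of why the round trip does not help, assuming a stdout whose encoding differs from the one the user requested. The sample text, the requested encoding, and the fake stdout are all illustrative choices of mine, not taken from the module.

```python
import io

text = "caf\u00e9 \u20ac"  # hypothetical output; the euro sign is not in latin-1
requested = "latin-1"      # encoding the user asked for (assumed for this example)

# The round trip from the snippet above: encode, then decode straight back.
round_tripped = str(text.encode(requested, "xmlcharrefreplace"),
                    encoding=requested)

# Simulate a stdout whose default encoding is not the requested one.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
print(round_tripped, file=fake_stdout)
fake_stdout.flush()

wanted = text.encode(requested, "xmlcharrefreplace") + b"\n"
actual = raw.getvalue()
# actual holds UTF-8 bytes, wanted holds latin-1 bytes; they differ
```

The character reference survives the round trip, but the bytes that reach the stream are in the stream's encoding, not the one that was asked for.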

If you do want to deal with encoding (which I think we are stuck with), then 
solve the real issue by not using print() (see my patch).
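As an illustration of the general idea only (this is a sketch, not the patch itself), the encoded bytes can be written to a binary stream instead of going through print(). The helper name write_encoded is hypothetical; sys.stdout.buffer is the binary layer underneath the usual text-mode stdout.

```python
import io
import sys

def write_encoded(text, encoding, stream=None):
    """Hypothetical helper: encode text and write the bytes unmodified."""
    if stream is None:
        stream = sys.stdout.buffer  # binary layer under sys.stdout
    stream.write(text.encode(encoding, "xmlcharrefreplace"))

# Use an in-memory binary stream here so the result can be inspected.
out = io.BytesIO()
write_encoded("caf\u00e9 \u20ac\n", "latin-1", out)
result = out.getvalue()
```

Because nothing re-encodes the data on the way out, the stream receives exactly the bytes for the requested encoding.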

I think the only reason that my patch was not accepted, and why this is still 
languishing, is that I said I would provide tests and have not had time to do 
so. 

Please feel free to correct me if I am wrong about any of the above.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10087>
_______________________________________