[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

Stefan Behnel Fri, 12 Mar 2010 01:38:36 -0800

Stefan Behnel <sco...@users.sourceforge.net> added the comment:

"'None' has always been the documented default for the encoding parameter"


What I meant here was that "help(ET.tostring)" will show you that as the 
default. Also, in the docs, the signature is "tostring(tree, encoding=None)", 
so None is the documented default value for the argument, regardless of the 
internal handling.


> "writing out the Unicode serialisation will result in an incorrect
> XML serialisation"
> I think Guido meant the ElementTree.write method; is that broken too?

Yes, the feature has been implemented deep down in the _encode() helper 
function, so it impacts the entire serialiser, not only its API.
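
As a rough illustration of why write() is affected as well, the sketch below serialises the same tree once as text and once as bytes. It assumes the encoding="unicode" spelling found in current CPython releases of xml.etree.ElementTree, which may not match the development snapshot discussed here; the variable names are only for illustration.

```python
import io
import xml.etree.ElementTree as ET

# A tiny tree to serialise (illustrative content only).
tree = ET.ElementTree(ET.fromstring("<doc><item>text</item></doc>"))

# Text output: write() hands str objects to a text stream.
text_buf = io.StringIO()
tree.write(text_buf, encoding="unicode")

# Byte output: write() encodes and hands bytes to a binary stream.
byte_buf = io.BytesIO()
tree.write(byte_buf, encoding="utf-8")

text_result = text_buf.getvalue()   # a str
byte_result = byte_buf.getvalue()   # a bytes object

print(type(text_result).__name__, type(byte_result).__name__)
```

The text form carries no encoding of its own, so anything that later stores it has to pick one independently of the serialiser.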


> I think I'd prefer old "tostring" behaviour and a separate "tounicode" 
> function, and I'm still not convinced that the latter is required for the XML 
> use case (which implies that maybe it should live in lxml.html for the HTML 
> case, even if it ends up calling the same internal implementation).

I obviously agree that the use case for XML is feeble, but that alone doesn't 
make this a convincing argument to move it into lxml.html when the 
implementation will stay in lxml.etree anyway. Besides, that's pretty off-topic 
for this bug tracker.


> Or should that be "tobytes" and "tounicode" to eliminate all ambiguity?

That might be the clean break-all-bridges solution, but I don't think the name 
tostring() is so inherently broken in Py3 that it needs fixing. It's not 
"tostr()", for example.

I wouldn't raise much opposition against tobytes() as an alias for tostring(), 
although that sounds more like duplicating an otherwise simple API.
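
To make the naming question concrete, here is a minimal sketch of what such a pair could look like as thin wrappers around tostring(). Both tobytes() and tounicode() are hypothetical helpers here, not functions provided by xml.etree.ElementTree, and the sketch again assumes the encoding="unicode" value accepted by current CPython releases.

```python
import xml.etree.ElementTree as ET


def tobytes(element, encoding="utf-8"):
    """Hypothetical helper: always return an encoded byte string."""
    if encoding is None or encoding.lower() == "unicode":
        raise ValueError("tobytes() needs a real byte encoding")
    return ET.tostring(element, encoding=encoding)


def tounicode(element):
    """Hypothetical helper: always return text (str)."""
    return ET.tostring(element, encoding="unicode")


root = ET.fromstring("<doc><item>caf\u00e9</item></doc>")

as_bytes = tobytes(root)    # bytes, encoded as UTF-8
as_text = tounicode(root)   # str, no encoding applied

print(type(as_bytes).__name__, type(as_text).__name__)
```

With wrappers like these the return type follows from the function name rather than from the encoding argument, which is the ambiguity the question is about.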

Stefan

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8047>
_______________________________________