Stefan Behnel added the comment:

We are talking about two different things here.

I said that (serialised) XML is defined as a sequence of bytes. Read the spec 
on that.

What you are talking about is the Infoset, or the parsed/generated in-memory 
XML tree. That's obviously not bytes; it's defined in terms of Unicode. Parsing 
and serialising do the mapping between the two.
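
To illustrate the distinction, here is a minimal sketch (the sample documents and variable names are made up for this example). It parses two different byte sequences that describe the same document, then serialises the resulting tree back to bytes:

```python
import xml.etree.ElementTree as ET

# The same document, serialised as two different byte sequences.
latin1_bytes = '<?xml version="1.0" encoding="iso-8859-1"?><a>\u00e9</a>'.encode("iso-8859-1")
utf8_bytes = '<?xml version="1.0" encoding="utf-8"?><a>\u00e9</a>'.encode("utf-8")

# Parsing maps bytes to an in-memory tree whose text is Unicode (str).
text_from_latin1 = ET.fromstring(latin1_bytes).text
text_from_utf8 = ET.fromstring(utf8_bytes).text

# Serialising maps the tree back to bytes in the requested encoding.
reserialised = ET.tostring(ET.fromstring(latin1_bytes), encoding="utf-8")
```

The two inputs differ on the wire, but both trees hold the same one-character str, and the re-serialised output is bytes again.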

The "attack" that you presented is based on serialised XML, thus on a sequence 
of bytes. What I am saying is that this "attack" can be done with any kind of 
binary data, so it's not XML-specific, thus not a problem with ElementTree.

The fact that ElementTree allows you to generate non well-formed 'XML' 
containing control characters when you tell it to do so is unfortunate, but 
it's neither a security risk (you already had the non well-formed content in 
your hands *before* you passed it into ElementTree), nor clearly a bug, because 
the user specifically requested the serialisation of an in-memory tree that 
contained these control characters.
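
A small sketch of the behaviour described above, assuming the ElementTree behaviour reported in this issue (the helper name roundtrip_ok is hypothetical). It builds a tree in memory, serialises it, and checks whether the result parses again:

```python
import xml.etree.ElementTree as ET

def roundtrip_ok(text):
    """Return True if a tree containing `text` serialises and parses back."""
    root = ET.Element("root")
    root.text = text
    try:
        data = ET.tostring(root)   # serialise the in-memory tree to bytes
        ET.fromstring(data)        # try to parse the bytes back
    except (ET.ParseError, ValueError):
        # ParseError: the parser rejected the serialised bytes.
        # ValueError: covers a serialiser that refuses the text up front
        # (an assumption about possible future behaviour, not current API).
        return False
    return True

plain_ok = roundtrip_ok("hello")
control_ok = roundtrip_ok("a\x08b")  # U+0008 is not allowed in XML 1.0
```

The control character was already in the caller's hands before ElementTree saw it; the serialiser only wrote out what it was given.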

But, once again, it would be nice if ElementTree rejected this input in one way 
or another, and that's a feature request.
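
Until such a feature exists, a caller could check text before putting it into a tree. The following is only a sketch of that idea, not anything ElementTree provides: the helper check_xml_text is hypothetical, and the character class follows the Char production of XML 1.0.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000a\u000d\u0020-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]"
)

def check_xml_text(text):
    """Hypothetical helper: raise ValueError if `text` cannot appear in XML 1.0."""
    match = _ILLEGAL_XML_CHARS.search(text)
    if match:
        raise ValueError(
            "character %r at index %d is not allowed in XML 1.0"
            % (match.group(), match.start())
        )
    return text
```

It would be used as, for example, element.text = check_xml_text(value), so that bad input fails at the point where it enters the tree rather than when the serialised output is parsed.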

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18850>
_______________________________________