[issue34160] ElementTree not preserving attribute order

Stefan Behnel Mon, 18 Mar 2019 10:45:18 -0700


Stefan Behnel <[email protected]> added the comment:


Victor, as much as I appreciate backwards compatibility, I really don't think 
it's a big deal in this case. In fact, it might not even apply.

My (somewhat educated) gut feeling is that most users simply won't care or 
won't even notice the change. Of those who do (or have to) care, many are 
better off fixing their code to not rely on an entirely arbitrary (sorted by 
name) attribute order than getting the old behaviour back. And for those few 
who really need attributes to be sorted by name, there's the recipe I posted, 
which works on all ElementTree implementations out there with all currently 
supported CPython versions.
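
For readers without the earlier message at hand, here is a minimal sketch of 
what such a sort-by-name recipe could look like. It is not necessarily 
identical to the one posted on the tracker, and the function name 
sort_attributes is only an illustrative choice. It assumes that el.attrib is 
a plain dict that keeps insertion order, which holds on CPython 3.7 and later.

```python
import xml.etree.ElementTree as ET


def sort_attributes(root):
    """Reorder the attributes of every element under root by name, in place."""
    for el in root.iter():
        attrib = el.attrib
        if len(attrib) > 1:
            # Rebuild the dict so that its insertion order is the sorted order.
            items = sorted(attrib.items())
            attrib.clear()
            attrib.update(items)


root = ET.fromstring('<doc z="1" a="2"><child y="3" b="4"/></doc>')
sort_attributes(root)
serialised = ET.tostring(root, encoding="unicode")
# In 'serialised', "a" now comes before "z" and "b" before "y".
```

Note that this modifies the tree itself, so it is best called right before 
serialising.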

> This recipe does modify the document and so changes the behaviour of the 
> application when it iterates on attributes later

Iterating over attributes after serialising is actually a very rare thing. If 
I were to make up numbers, I'd guess that some 99% of applications do XML 
serialisation as the last thing and then throw away the tree afterwards, 
without touching it (or its attributes) again. And the remaining cases are 
most probably covered by the "don't need to care" type of users. I don't think 
we should optimise for 0.05% of our user base by providing a new API option 
for them. Especially in ElementTree, which decidedly aims to be simple.

The example that Ned gave refers to a very specific and narrow case: comparing 
serialised XML, at the byte level, in tests. He was very lucky that ElementTree 
was so stable over the last 10 Python releases that the output did not change 
at all. That is not something that an XML library needs to guarantee. There is 
some ambiguity in XML for everything that's outside of the XML Information set, 
and there is a good reason why the W3C has tackled this ambiguity with an 
explicit and *separate* specification: C14N. So, when you write:

> Many XML parsers rely on the order of attributes

It's certainly not many parsers, and could even be close to none. The order of 
attributes is explicitly excluded from the XML Information Set:

https://www.w3.org/TR/xml-infoset/#omitted
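
For the test comparison case mentioned above, a minimal sketch of comparing 
canonical forms instead of raw serialised output might look like this. It 
assumes Python 3.8 or later, where xml.etree.ElementTree.canonicalize() 
(C14N 2.0) is available; the two sample documents are made up for 
illustration.

```python
import xml.etree.ElementTree as ET

# Two serialisations of the same information, differing only in attribute order.
expected = '<item id="1" name="x"/>'
actual = '<item name="x" id="1"/>'

# Comparing the raw text is sensitive to attribute order ...
same_text = expected == actual

# ... whereas C14N puts attributes into a defined order before comparing.
same_canonical = ET.canonicalize(expected) == ET.canonicalize(actual)
```

Here same_text is False while same_canonical is True, so a test based on the 
canonical form does not depend on the serialiser's attribute order.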

Despite this, cases where the order of the attributes matters to the 
*application* are not unheard of. But for them, attributes sorted by their name 
are most likely the problem and not a solution. Raymond mentioned one such 
example. Sorting attributes by their name really only fulfils a single purpose: 
to get reproducible output in cases where the order does *not* matter. For all 
the (few) cases where the order *does* matter, it gets in the way.

But by removing the sorting, as this change does, we still get predictable 
output, because dicts preserve insertion order. So this use case is still 
covered. It's just not necessarily the same output as before, because now the 
ordering is entirely in the hands of the users. Meaning, those users who *do* 
care can now actually influence the ordering, which was very difficult and 
hackish to achieve before. We are allowing users to remove these hacks, not 
forcing them to add new ones.
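
As a small illustration of the ordering being in the user's hands, assuming 
Python 3.8 or later (where this change applies), attributes are serialised in 
the order in which they were set:

```python
import xml.etree.ElementTree as ET

el = ET.Element("item")
el.set("z", "1")  # set first, so serialised first on Python 3.8+
el.set("a", "2")  # set second, so serialised second

serialised = ET.tostring(el, encoding="unicode")
# On Python 3.8+, "z" appears before "a" in 'serialised'.
# Earlier versions sorted the attributes by name on output instead.
```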

----------

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue34160>
_______________________________________