[issue18304] ElementTree -- provide a way to ignore namespace in tags and searches

Stefan Behnel Thu, 17 Apr 2014 22:08:22 -0700

Stefan Behnel added the comment:

You can already use iterparse for this.


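    import xml.etree.ElementTree as ET

    it = ET.iterparse('somefile.xml')
    for _, el in it:
        if '}' in el.tag:  # unnamespaced tags have no '}' and are left as-is
            el.tag = el.tag.split('}', 1)[1]  # strip all namespaces
    root = it.root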

As I said, this would be a little friendlier with support in the QName class, 
but it's not really complex code. Could be added to the docs as a recipe, with 
a visible warning that this can easily lead to incorrect data processing and 
therefore should not be used in systems where the input is not entirely under 
control.

Note that it's unclear what the "right way to do it" is, though. Is it better 
to 1) alter the data by stripping all namespaces off, or 2) let the tree API 
itself provide a namespace agnostic mode? Depends on the use case, but the more 
generic way 2) would be fairly involved in terms of implementation complexity, 
for just a minor use case. 1) would be ok in most cases where this "feature" is 
useful, I guess, and can be done as shown above.
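
For comparison, option 2) can be roughly approximated in user code without touching ElementTree itself. The following is only a sketch: the helper names local_name() and iter_local() are made up for illustration and are not part of any API. It matches elements by local name and leaves the tree unmodified.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Return the tag without its '{namespace}' prefix, if any."""
    return tag.rpartition('}')[2]

def iter_local(root, name):
    """Yield root and descendants whose local name equals `name`."""
    for el in root.iter():
        # Comments and processing instructions have non-string tags.
        if isinstance(el.tag, str) and local_name(el.tag) == name:
            yield el

# Hypothetical sample document with two different namespaces.
root = ET.fromstring(
    '<a xmlns="urn:x" xmlns:y="urn:y"><b/><y:b/><c/></a>'
)
matches = list(iter_local(root, 'b'))
```

Note that this finds both "b" elements even though they live in different namespaces, which is exactly the kind of ambiguity the warning above is about.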

In fact, the advantage of doing it explicitly with iterparse() is that instead 
of stripping all namespaces, only the expected namespaces can be discarded. And 
errors can be raised when finding unnamespaced elements, for example. This 
allows for a safety guard that prevents the code from completely 
misinterpreting input. There is a reason why namespaces were added to XML at 
some point.
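
A minimal sketch of that safer variant, assuming the set of expected namespace URIs is known up front. EXPECTED_NS, its URI and parse_stripping() are hypothetical names chosen for this example.

```python
import io
import xml.etree.ElementTree as ET

EXPECTED_NS = {'urn:example:data'}  # hypothetical namespace URI

def parse_stripping(source, expected=EXPECTED_NS):
    """Parse `source`, stripping only the expected namespaces."""
    it = ET.iterparse(source)
    for _, el in it:
        if not el.tag.startswith('{'):
            # Safety guard: refuse input that is not namespaced at all.
            raise ValueError('unnamespaced element: %r' % el.tag)
        uri, local = el.tag[1:].split('}', 1)
        if uri in expected:
            el.tag = local
        # Elements in other namespaces keep their qualified tag.
    return it.root

doc = (b'<d:a xmlns:d="urn:example:data" xmlns:o="urn:other">'
       b'<d:b/><o:c/></d:a>')
root = parse_stripping(io.BytesIO(doc))
child_tags = [child.tag for child in root]
```

Elements from unexpected namespaces stay fully qualified here, so later lookups by plain tag name will not match them by accident.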

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18304>
_______________________________________