Tomalak <m8r-t1tu...@mailinator.com> added the comment:

Francesco, I think you are missing the point. :-) The problem has two sides.
If I create an XML document using the DOM (not by parsing it from a string!), then I can put newline characters into an attribute value. This is allowed and conforms to the XML spec.

However, *literal* newlines in an attribute value (i.e. when the document is parsed from a string) have no meaning. The parser treats them as insignificant whitespace and normalizes each one to a space. This is also valid and conforms to the XML spec.

The catch: this leads to actual data loss if I *wanted* to store newline characters in an attribute, unless the newline characters are properly encoded as character references (for example `&#10;`). Encoding the newline characters is also valid and conforms to the spec, so the DOM implementation should do it.

In other words, the parsing process you refer to is actually working fine. If an attribute contains a literal newline, it is indeed okay to turn it into a space. It is only the document serialization that is broken. Minidom is clearly missing functionality here, and it does not conform to the XML spec.

If I store a string of data in an XML document, it must be ensured that upon reading the document again, I get the *same* data back. This is what I check with my test script.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5752>
_______________________________________
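The round-trip check described in the comment above could be sketched roughly as follows. This is not the test script mentioned in the comment, only an illustration. The element name `root`, the attribute name `value`, and the helpers `parse_attr`, `round_trip` and `encode_attr` are made up for the example, and `encode_attr` is just one assumed way a serializer might escape whitespace in attribute values.

```python
from xml.dom import minidom


def parse_attr(xml_text, name="value"):
    """Parse xml_text and return the named attribute of the root element."""
    return minidom.parseString(xml_text).documentElement.getAttribute(name)


def round_trip(data, name="value"):
    """Store data in an attribute via the DOM, serialize, re-parse, return the result."""
    doc = minidom.Document()
    root = doc.createElement("root")
    root.setAttribute(name, data)
    doc.appendChild(root)
    return parse_attr(doc.toxml(), name)


def encode_attr(data):
    """Hypothetical helper: escape markup and whitespace for a double-quoted attribute."""
    replacements = [("&", "&amp;"), ("<", "&lt;"), ('"', "&quot;"),
                    ("\r", "&#13;"), ("\n", "&#10;"), ("\t", "&#9;")]
    for raw, escaped in replacements:
        data = data.replace(raw, escaped)
    return data


original = "line one\nline two"

# A literal newline in the source text is normalized to a space by the parser.
literal = parse_attr('<root value="line one\nline two"/>')

# A character reference survives parsing as a real newline.
encoded = parse_attr('<root value="%s"/>' % encode_attr(original))

print(repr(literal))
print(repr(encoded))
# Whether this prints True depends on whether the minidom serializer
# in the running Python version escapes newlines in attribute values.
print(round_trip(original) == original)
```

The first two results illustrate the parsing side, which behaves as the spec requires. The last line exercises the serializing side, which is the part this issue is about, so its result can differ between Python versions.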