Re: XML and namespaces
Uche Ogbuji [EMAIL PROTECTED] wrote: Andrew Clover also suggested an overly-legalistic argument that current minidom behavior is not a bug. I stick by my language-law interpretation of spec. DOM 2 Core specifically disclaims any responsibility for namespace fixup and advises the application writer to do it themselves if they want to be sure of the right output. W3C knew they weren't going to get all that standardised by Level 2 so they left it open for future work - if minidom claimed to support DOM 3 LS it would be a different matter. '?xml version=1.0 ?\nferh/' (i.e. ferh rather than href), would you not consider that a minidom bug? It's not a *spec* bug, as no spec that minidom claims to conform to says anything about serialisation. It's a *minidom* bug in that it fails to conform to the minimal documentation of the method toxml() which claims to Return the XML that the DOM represents as a string - the DOM does not represent that XML. However that doc for toxml() says nothing about being namespace-aware. XML and XML-with-namespaces both still exist, and for the former class of document the minidom behaviour is correct. The reality is that once the poor user has done: element = document.createElementNS(DAV:, href) They are following DOM specification that they have created an element in a namespace It's possible that a namespaced node could also be imported/parsed into a non-namespace document and then serialised; it's particularly likely this could happen for scripts processing XHTML. We shouldn't change the existing behaviour for toxml/writexml because people may be relying on it. One of the reasons I ended up writing a replacement was that the behaviour of minidom was not only wrong, but kept changing under my feet with each version. However, adding the ability to do fixup on serialisation would indeed be very welcome - toxmlns() maybe, or toxml(namespaces= True)? 
I'll be sure to emphasize heavily to users that minidom is broken with respect to Namespaces and serialization, and that they abandon it in favor of third-party tools. Well yes... there are in any case more fundamental bugs than just serialisation problems. Frederik wrote: can anyone perhaps dig up a DOM L2 implementation that's not written by anyone involved in this thread g -- And Clover mailto:[EMAIL PROTECTED] http://doxdesk.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
[Paul Boddie] It's interesting that minidom plus PrettyPrint seems to generate the xmlns attributes in the serialisation, though; should that be reported as a bug? I believe that it is a bug. [Paul Boddie] Well, with the automagic, all DOM users get the once in a lifetime chance to exchange those lead boots for concrete ones. I'm sure there are all sorts of interesting reasons for assigning namespaces to nodes, serialising the document, and then not getting all the document information back when parsing it, but I'd rather be spared all the amusement behind all those reasons and just have life made easier for just about everyone concerned. Well, if you have a fair amount of spare time and really want to improve things, I recommend that you consider implementing the DOM L3 namespace normalisation algorithm. http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/namespaces-algorithms.html That way, everyone can have namespace well-formed documents by simply calling a single method, and not a line of automagic in sight: just standards-compliant XML processing. Anyway, thank you for your helpful commentary on this matter! And thanks to you for actually informing yourself on the issue, and for taking the time to research and understand it. I wish that your refreshing attitude was more widespread! now-i-really-must-get-back-to-work-ly'yrs, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
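A rough sketch of what such an explicit, single-call fixup step might look like for xml.dom.minidom is given below. It is not the DOM Level 3 algorithm itself: it only looks at element names (never attributes in namespaces), it never invents prefixes, and the helper name fixup_namespaces is hypothetical, not part of minidom.

```python
import xml.dom
from xml.dom import minidom


def fixup_namespaces(element, in_scope=None):
    """Add missing xmlns declarations for element names (hypothetical helper).

    Simplified sketch: attributes in namespaces are ignored and no new
    prefixes are invented, unlike the full DOM Level 3 algorithm.
    """
    # Maps prefix (None for the default namespace) to namespace URI.
    in_scope = dict(in_scope or {})

    # Record declarations the user already placed on this element.
    for name, value in element.attributes.items():
        if name == "xmlns":
            in_scope[None] = value or None
        elif name.startswith("xmlns:"):
            in_scope[name[len("xmlns:"):]] = value or None

    prefix = element.prefix or None
    uri = element.namespaceURI or None
    # A prefixed name with no namespace cannot be repaired by a declaration
    # (xmlns:p="" is not allowed in Namespaces 1.0), so it is left alone.
    if in_scope.get(prefix) != uri and not (prefix and uri is None):
        qname = "xmlns:" + prefix if prefix else "xmlns"
        element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, qname, uri or "")
        in_scope[prefix] = uri

    # Recurse into child elements with the declarations now in scope.
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            fixup_namespaces(child, in_scope)


doc = minidom.Document()
root = doc.createElementNS("DAV:", "href")
doc.appendChild(root)
root.appendChild(doc.createElementNS(None, "no_ns"))

fixup_namespaces(root)
fixed = root.toxml()
print(fixed)
```

The point of the design is that nothing happens unless the caller asks for it: toxml() stays naive, and the declarations appear only after the explicit call.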
Re: XML and namespaces
Alan Kennedy wrote:

> [Discussing the appearance of xmlns="DAV:"]
> But that's incorrect. You have now defaulted the namespace to "DAV:" for every unprefixed element that is a descendant of the href element.
>
> [Code creating the no_ns element with namespaceURI set to None]
>
> <?xml version="1.0"?>
> <href xmlns="DAV:"><no_ns/></href>

I must admit that I was focusing on the first issue rather than this one, even though it is related, when I responded before. Moreover, libxml2dom really should respect the lack of a namespace on the no_ns element, which the current version unfortunately doesn't do. However, wouldn't the correct serialisation of the document be as follows?

    <?xml version="1.0"?>
    <href xmlns="DAV:"><no_ns xmlns=""/></href>

As for the first issue - the presence of the xmlns attribute in the serialised document - I'd be interested to hear whether it is considered acceptable to parse the serialised document and to find that no non-null namespaceURI is set on the href element, given that such a namespaceURI was set when the document was created. In other words, ...

    document = libxml2dom.createDocument(None, "doc", None)
    top = document.xpath("*")[0]
    elem1 = document.createElementNS("DAV:", "href")
    document.replaceChild(elem1, top)
    elem2 = document.createElementNS(None, "no_ns")
    document.xpath("*")[0].appendChild(elem2)
    document.toFile(open("test_ns.xml", "wb"))

...as before, followed by this test:

    document = libxml2dom.parse("test_ns.xml")
    print "Namespace is", repr(document.xpath("*")[0].namespaceURI)

What should the "Namespace is" message produce?

Paul
Re: XML and namespaces
[Paul Boddie]
> However, wouldn't the correct serialisation of the document be as follows?
>
> <?xml version="1.0"?>
> <href xmlns="DAV:"><no_ns xmlns=""/></href>

Yes, the correct way to override a default namespace is an xmlns="" attribute.

[Paul Boddie]
> As for the first issue - the presence of the xmlns attribute in the serialised document - I'd be interested to hear whether it is considered acceptable to parse the serialised document and to find that no non-null namespaceURI is set on the href element, given that such a namespaceURI was set when the document was created.

The key issue: should the serialised-then-reparsed document have the same DOM content (XML InfoSet) if the user did not explicitly create the requisite namespace declaration attributes?

My answer: No, it should not be the same.

My reasoning:

The user did not explicitly create the attributes
=> The DOM should not automagically create them (according to the L2 spec)
=> Such attributes should not be serialised
   - The user didn't create them
   - The DOM implementation didn't create them
   - If the serialisation processor creates them, that gives the same end result as if the DOM impl had (wrongly) created them.
=> The serialisation is a faithful/naive representation of the (not-namespace-well-formed) DOM constructed by the user (who omitted required attributes).
=> The reloaded document is a different DOM to the original, i.e. it has a different infoset.

The xerces and jython snippet I posted the other day demonstrates this. If you look closely at that code, the actual DOM implementation and the serialisation processor used are from different libraries. The DOM is the inbuilt JAXP DOM implementation, Apache Crimson (the example only works on JDK 1.4). The serialisation processor is the Apache Xerces serialiser. The fact that the xmlns="DAV:" attribute didn't appear in the output document shows that BOTH the (Crimson) DOM implementation AND the (Xerces) serialiser chose NOT to automagically create the attribute.

If you run that snippet with other DOM implementations, by setting the "javax.xml.parsers.DocumentBuilderFactory" property, you'll find the same result.

Serialisation and namespace normalisation are both in the realm of DOM Level 3, whereas minidom is only L2 compliant. Automagically introducing L3 semantics into the L2 implementation is the wrong thing to do.

http://www.w3.org/TR/DOM-Level-3-LS/load-save.html
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/namespaces-algorithms.html

[Paul Boddie]
> In other words, ...
>
> What should the "Namespace is" message produce?

    Namespace is None

If you want it to produce

    Namespace is 'DAV:'

and for your code to be portable to other DOM implementations besides libxml2dom, then your code should look like:-

    import xml.dom
    import libxml2dom

    document = libxml2dom.createDocument(None, "doc", None)
    top = document.xpath("*")[0]
    elem1 = document.createElementNS("DAV:", "href")
    # Explicitly declare the default namespace on the href element.
    elem1.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "DAV:")
    document.replaceChild(elem1, top)
    elem2 = document.createElementNS(None, "no_ns")
    # Explicitly undeclare the default namespace for the no_ns element.
    elem2.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "")
    document.xpath("*")[0].appendChild(elem2)
    document.toFile(open("test_ns.xml", "wb"))

its-not-about-namespaces-its-about-automagic-ly'yrs,

-- alan kennedy
-- email alan: http://xhaus.com/contact/alan
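The round trip argued over here can be tried with the standard library. The sketch below substitutes xml.dom.minidom for libxml2dom (an assumption made so the example runs without third-party packages) and keeps the document in memory instead of writing test_ns.xml; the helper name roundtrip_namespace is hypothetical.

```python
from xml.dom import XMLNS_NAMESPACE, minidom


def roundtrip_namespace(declare):
    """Build <href> in the DAV: namespace, serialise, reparse, report namespaceURI."""
    doc = minidom.Document()
    href = doc.createElementNS("DAV:", "href")
    if declare:
        # The explicit declaration that the DOM Level 2 reading expects
        # the application to add for itself.
        href.setAttributeNS(XMLNS_NAMESPACE, "xmlns", "DAV:")
    doc.appendChild(href)
    # Naive serialisation followed by a namespace-aware parse.
    reparsed = minidom.parseString(doc.toxml())
    return reparsed.documentElement.namespaceURI


without_decl = roundtrip_namespace(declare=False)
with_decl = roundtrip_namespace(declare=True)
print("Namespace is", repr(without_decl))
print("Namespace is", repr(with_decl))
```

Under minidom as shipped with CPython, the namespace survives the round trip only when the declaration attribute was created explicitly, which is the behaviour both sides of the thread are describing.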
Re: XML and namespaces
Alan Kennedy wrote: Serialisation and namespace normalisation are both in the realm of DOM Level 3, whereas minidom is only L2 compliant. Automagically introducing L3 semantics into the L2 implementation is the wrong thing to do. I think I'll have to either add some configuration support, in order to let the user specify which standards they have in mind, or to deny/assert support for one or another of the standards. It's interesting that minidom plus PrettyPrint seems to generate the xmlns attributes in the serialisation, though; should that be reported as a bug? As for the toxml method in minidom, the subject did seem to be briefly discussed on the XML-SIG mailing list earlier in the year: http://mail.python.org/pipermail/xml-sig/2005-July/011157.html its-not-about-namespaces-its-about-automagic-ly'yrs Well, with the automagic, all DOM users get the once in a lifetime chance to exchange those lead boots for concrete ones. I'm sure there are all sorts of interesting reasons for assigning namespaces to nodes, serialising the document, and then not getting all the document information back when parsing it, but I'd rather be spared all the amusement behind all those reasons and just have life made easier for just about everyone concerned. I think the closing remarks in the following message say it pretty well: http://mail-archives.apache.org/mod_mbox/xml-security-dev/200409.mbox/1095071819.17967.44.camel%40amida And there are some interesting comments on this archived page, too: http://web.archive.org/web/20010211173643/http://xmlbastard.editthispage.com/discuss/msgReader$6 Anyway, thank you for your helpful commentary on this matter! Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
Alan Kennedy wrote: [Fredrik Lundh] but isn't libxml2dom just a binding for libxml2? as I mention above, I had libxml2 in mind when I wrote widely used, not the libxml2dom binding itself. No, libxml2dom is Paul Boddie's DOM API compatibility layer on top of the cpython bindings for libxml2. So a binding that just passes things through to another binding is not a binding? Alright, let's call it a compatibility layer then. but libxml2 is also widely used, so we have at least two ways to interpret the spec. Don't confuse libxml2dom with libxml2. As Paul has said several times, libxml2dom is just a thin API compatibility layer on top of libxml2. It's libxml2 that does all the work, and the libxml2 authors claim that libxml2 implements the DOM level 2 document model, but with a different API. Maybe they're wrong, but wasn't the whole point of this subthread that different developers have interpreted the specification in different ways ? /F -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
[Fredrik Lundh] It's libxml2 that does all the work, and the libxml2 authors claim that libxml2 implements the DOM level 2 document model, but with a different API. That statement is meaningless. The DOM is *only* an API, i.e. an interface. The opening statement on the W3C DOM page is What is the Document Object Model? The Document Object Model is a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents. http://www.w3.org/DOM/ The interfaces that make up the different levels of the DOM are described in CORBA IDL - Interface Definition Language. DOM Implementations are free to implement the methods and properties of the IDL interfaces as they see fit. Some implementations might maintain an object model, with separate objects for each node in the tree, several string variables associated with each node, i.e. node name, namespace, etc. But they could just as easily store those data in tables, indexed by some node id. (As an aside, the non-DOM-compatible Xalan Table Model does exactly that: http://xml.apache.org/xalan-j/dtm.html). So when the libxml2 developers say (copied from http://www.xmlsoft.org/) To some extent libxml2 provides support for the following additional specifications but doesn't claim to implement them completely: * Document Object Model (DOM) http://www.w3.org/TR/DOM-Level-2-Core/ the document model, but it doesn't implement the API itself, gdome2 does this on top of libxml2 They've completely missed the point: DOM is *only* the API. Maybe they're wrong, but wasn't the whole point of this subthread that different developers have interpreted the specification in different ways ? What specification? Libxml2 implements none of the DOM specifications. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
[Alan Kennedy] Don't confuse libxml2dom with libxml2. [Paul Boddie] Well, quite, but perhaps you can explain what I'm doing wrong with this low-level version of the previously specified code: Well, if your purpose is to make a point about minidom and DOM standards compliance in relation to serialisation of namespaces, then what you're doing wrong is to use a library that bears no relationship to the DOM to make your point. Think about it this way: Say you decide to create a new XML document using a non-DOM library, such as the excellent ElementTree. So you make a series of ElementTree-API-specific calls to create the document, the elements, attributes, namespaces, etc, and then serialise the whole thing. And the end result is that you end up with a document that looks like this ?xml version=1.0 encoding=utf-8? href xmlns=DAV:/ It is not possible to use that ElementTree code to make inferences on how minidom should behave, because the syntax and semantics of the minidom API calls and the ElementTree API calls are different. Minidom is constrained to implement the precise semantics of the DOM APIs, because it claims standards compliance. ElementTree is free to do whatever it likes, e.g. be pythonic, because it has no standard to conform to: it is designed solely according to the experience and intuition of its author, who is free change it at any stage if he feels like it. s/ElementTree/libxml2/g If I've completely missed your point and you were talking something else entirely, please forgive me. I'd be happy to help with any questions if I can. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
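For readers who want to see the contrast concretely, here is a small sketch using xml.etree.ElementTree from the Python standard library (an assumption: the 2005 standalone ElementTree package may have chosen different prefixes or output details). It only illustrates that ElementTree, having no DOM contract to honour, writes the declaration itself.

```python
import xml.etree.ElementTree as ET

# ElementTree names elements in Clark notation: {namespace-uri}local-name.
elem = ET.Element("{DAV:}href")

# The serialiser emits whatever declaration it needs; the prefix it picks
# is an implementation detail, so it is not relied on here.
serialised = ET.tostring(elem, encoding="unicode")
reparsed_tag = ET.fromstring(serialised).tag
print(serialised)
print(reparsed_tag)
```

Because the serialiser declares the namespace on its own, the reparsed element keeps its namespace-qualified name without any explicit declaration from the user.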
Re: XML and namespaces
Paul Boddie wrote:

> It is difficult to say whether this usage of the API is correct or not, judging from the Web site's material [...]

Some more on this: I found an example on the libxml2 mailing list (searching for "xmlNewNs default namespace") which is similar to the one I gave:

http://mail.gnome.org/archives/xml/2004-April/msg00282.html

Meanwhile, the usage of xmlNewNs seems to have some correlation with the production of xmlns attributes (found in a search for "xmlns default namespace"):

http://mail.gnome.org/archives/xml/2002-March/msg00111.html

And whilst gdome2 - the GNOME project's DOM wrapper for libxml2 - seems to create unowned namespaces, adding them to the document as global namespace declarations (looking at the code for gdome_xmlNewNs and gdome_xml_doc_createElementNS respectively)...

http://cvs.gnome.org/viewcvs/gdome2/libgdome/gdomecore/gdome-xml-xmlutil.c?rev=1.18&view=markup
http://cvs.gnome.org/viewcvs/gdome2/libgdome/gdomecore/gdome-xml-document.c?rev=1.50&view=markup

...seemingly comparable operations with libxml2mod seem to be no longer supported:

    >>> libxml2mod.xmlNewGlobalNs(d, "DAV:", None)
    xmlNewGlobalNs() deprecated function reached

Given that I've recently unsubscribed from some pretty unproductive mailing lists, perhaps I should make some enquiries on the libxml2 mailing list and possibly report back.

Paul
Re: XML and namespaces
Alan Kennedy wrote: Well, if your purpose is to make a point about minidom and DOM standards compliance in relation to serialisation of namespaces, then what you're doing wrong is to use a library that bears no relationship to the DOM to make your point. Alright. I respectfully withdraw libxml2/libxml2dom as an example of a DOM Level 2 compatible implementation. Since I only profess to support a PyXML-style DOM in libxml2dom, the course I take in any amendments to that package will follow whatever Uche decides to do with 4DOM and PyXML. ;-) Whatever happens, I'll attempt to make it compatible with qtxmldom in both its flavours (qtxml and KHTML). As for the various issues with namespaces and the DOM, with memories of slapping empty xmlns attributes strategically-but-desperately in XSL processing pipelines to avoid invisible-but-still-present default namespaces now thankfully receding into the incoherent past, the whole business merely reinforces my impression of the various standards committees as a group of corporate delegates meeting regularly to hold a measuring competition amongst themselves. Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
[Fredrik Lundh]
> my point was that (unless I'm missing something here), there are at least two widely used implementations (libxml2 and the 4DOM domlette stuff) that don't interpret the spec in this way.

Libxml2dom is of alpha quality, according to its CheeseShop page anyway.

http://cheeseshop.python.org/pypi/libxml2dom/0.2.4

This can be seen in its incorrect serialisation of the following valid DOM.

    #-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    import xml.dom
    import libxml2dom

    document = libxml2dom.createDocument(None, "doc", None)
    top = document.xpath("*")[0]
    elem1 = document.createElementNS("DAV:", "myns:href")
    elem1.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns:myns", "DAV:")
    document.replaceChild(elem1, top)
    print document.toString()
    #-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Which produces

    <?xml version="1.0"?>
    <myns:href xmlns:myns="DAV:" xmlns:xmlns="http://www.w3.org/2000/xmlns/" xmlns:myns="DAV:"/>

Which is not even well-formed XML (duplicate attributes), let alone namespace well-formed. Note also the invalid xml namespace "xmlns:xmlns" attribute.

So I don't accept that libxml2dom's behaviour is definitive in this case.

The other DOM you refer to, the 4DOM stuff, was written by a participant in this discussion.

Will you accept Apache Xerces 2 for Java as a widely used DOM Implementation? I guarantee that it is far more widely used than either of the DOMs mentioned.

Download Xerces 2 (I am using Xerces 2.7.1), and run the following code under jython:-

http://www.apache.org/dist/xml/xerces-j/

    #-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    #
    # This is a simple adaptation of the DOMGenerate.java
    # sample from the Xerces 2.7.1 distribution.
    #
    from javax.xml.parsers import DocumentBuilder, DocumentBuilderFactory
    from org.apache.xml.serialize import OutputFormat, XMLSerializer
    from java.io import StringWriter

    def create_document():
        dbf = DocumentBuilderFactory.newInstance()
        db = dbf.newDocumentBuilder()
        return db.newDocument()

    def serialise(doc):
        format = OutputFormat( doc )
        outbuf = StringWriter()
        serial = XMLSerializer( outbuf, format )
        serial.asDOMSerializer()
        serial.serialize(doc.getDocumentElement())
        return outbuf.toString()

    doc = create_document()
    root = doc.createElementNS("DAV:", "href")
    doc.appendChild( root )
    print serialise(doc)
    #-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Which produces

    <?xml version="1.0" encoding="UTF-8"?>
    <href/>

As I expected it would.

-- alan kennedy
-- email alan: http://xhaus.com/contact/alan
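For comparison, the same prefixed DOM can be built with xml.dom.minidom from the standard library. This is a sketch that swaps minidom in for libxml2dom (an assumption, so that it runs without third-party packages); it says nothing about how current libxml2dom releases behave.

```python
from xml.dom import XMLNS_NAMESPACE, minidom

doc = minidom.Document()
elem = doc.createElementNS("DAV:", "myns:href")
# One explicit declaration binding the myns prefix to the DAV: namespace.
elem.setAttributeNS(XMLNS_NAMESPACE, "xmlns:myns", "DAV:")
doc.appendChild(elem)

# Serialise the element, then parse it back with namespace processing on.
serialised = elem.toxml()
reparsed = minidom.parseString(serialised).documentElement
print(serialised)
print(reparsed.namespaceURI, reparsed.tagName)
```

Here the declaration is written exactly once, as created, so the output is well-formed and the reparsed element is back in the DAV: namespace.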
Re: XML and namespaces
[EMAIL PROTECTED]
> You're the one who doesn't seem to clearly understand XML namespaces. It's your position that is bewildering, not XML namespaces (well, they are confusing, but I have a good handle on all the nuances by now).

So you keep claiming, but I have yet to see the evidence.

> Again, no skin off my back here: I write and use tools that are XML namespaces compliant. It doesn't hurt me that Minidom is not. I was hoping to help, but again I don't have time for this argument.

If you make statements such as "you're wrong on this", "you misunderstand", "you're guessing", etc, then you should be prepared to back them up, not state them and then say "but I'm too busy and/or important to discuss it with you".

Perhaps you should think twice before making such statements in the future.

-- alan kennedy
-- email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
Alan Kennedy wrote: [Fredrik Lundh] my point was that (unless I'm missing something here), there are at least two widely used implementations (libxml2 and the 4DOM domlette stuff) that don't interpret the spec in this way. Libxml2dom is of alpha quality, according to its CheeseShop page anyway. http://cheeseshop.python.org/pypi/libxml2dom/0.2.4 but isn't libxml2dom just a binding for libxml2? as I mention above, I had libxml2 in mind when I wrote widely used, not the libxml2dom binding itself. Will you accept Apache Xerces 2 for Java as a widely used DOM Implementation? sure. but libxml2 is also widely used, so we have at least two ways to interpret the spec. the defacto interpretation of the spec seems to be that namespace handling during serialization is undefined... (is there perhaps a DOM library that starts hack or rogue when you use name- spaces ? ;-) /F -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
[Fredrik Lundh] but isn't libxml2dom just a binding for libxml2? as I mention above, I had libxml2 in mind when I wrote widely used, not the libxml2dom binding itself. No, libxml2dom is Paul Boddie's DOM API compatibility layer on top of the cpython bindings for libxml2. From the CheeseShop page The libxml2dom package provides a traditional DOM wrapper around the Python bindings for libxml2. In contrast to the libxml2 bindings, libxml2dom provides an API reminiscent of minidom, pxdom and other Python-based and Python-related XML toolkits. http://cheeseshop.python.org/pypi/libxml2dom [Alan Kennedy] Will you accept Apache Xerces 2 for Java as a widely used DOM Implementation? [Fredrik Lundh] sure. but libxml2 is also widely used, so we have at least two ways to interpret the spec. Don't confuse libxml2dom with libxml2. As I showed with a code snippet in a previous message, libxml2dom has significant defects in relation to serialisation of namespaced documents, whereby the serialised documents it produces aren't even well-formed xml. Perhaps you can show a code snippet in libxml2 that illustrates the behaviour you describe? -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
Alan Kennedy wrote: Libxml2dom is of alpha quality, according to its CheeseShop page anyway. Given that I gave it that classification, let me explain that its alpha status is primarily justified by the fact that it doesn't attempt to cover the entire DOM API. As I mentioned in my original contribution to this thread, the serialisation is done by libxml2 itself - arguably a wise choice given the abysmal performance of many Python DOM implementations when serialising documents. I'll look into namespace-setting issues in the libxml2 API, but I imagine that the serialisation mechanisms control much of what you're seeing, and it's quite possible that they can be configured to perform in whichever way is desirable. Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
Alan Kennedy wrote:

> Don't confuse libxml2dom with libxml2.

Well, quite, but perhaps you can explain what I'm doing wrong with this low-level version of the previously specified code:

    import libxml2mod
    document = libxml2mod.xmlNewDoc(None)
    element = libxml2mod.xmlNewChild(document, None, "href", None)
    print libxml2mod.serializeNode(document, None, 1)

This prints the following:

    <?xml version="1.0"?>
    <href/>

Extending the above code...

    ns = libxml2mod.xmlNewNs(element, "DAV:", None)
    print libxml2mod.serializeNode(document, None, 1)

This prints the following:

    <?xml version="1.0"?>
    <href xmlns="DAV:"/>

Note that libxml2mod is as close to the libxml2 C API as you can get in Python. As far as I can tell, by using that module, you're effectively driving the C API almost directly. Note also that libxml2mod is nothing to do with what I've written myself - I'm just using it here, just as libxml2dom does.

Now, in the first part of the code, we didn't specify a namespace on the element at all, but in the second part we chose to set a namespace on the element with a null prefix. As you can see, we get the xmlns attribute as soon as the namespace is introduced.

It is difficult to say whether this usage of the API is correct or not, judging from the Web site's material [1], so I'd be happy if someone could point out improvements or corrections.

Paul

[1] http://xmlsoft.org/
Re: XML and namespaces
Alan Kennedy:

> These namespace declaration nodes, i.e. attribute nodes in the xml.dom.XMLNS_NAMESPACE namespace, are a pre-requisite for any namespaced DOM document to be well-formed, and thus naively serializable.
>
> The argument could be made that application authors should be protected from themselves by having the underlying DOM library automatically create the relevant namespace nodes.
>
> But to me that's not pythonic: it's implicit, not explicit.
>
> My vote is that the existing xml.dom.minidom behaviour wrt namespace nodes is correct and should not be changed.

Andrew Clover also suggested an overly-legalistic argument that current minidom behavior is not a bug.

It's a very strange attitude that because a behavior is not specifically proscribed in a spec, that it is not a bug. Let me try a reductio ad absurdum, which I think in this case is a very fair stratagem. If the code in question produced the following:

    >>> document = xml.dom.minidom.Document()
    >>> element = document.createElementNS("DAV:", "href")
    >>> document.appendChild(element)
    <DOM Element: href at 0x1443e68>
    >>> document.toxml()
    '<?xml version="1.0" ?>\n<ferh/>'

(i.e. "ferh" rather than "href"), would you not consider that a minidom bug?

Now consider that DOM Level 2 does not proscribe such mangling.

Do you still think that's a useful way to determine what is a bug?

The current, erroneous behavior, which you advocate, is the same kind of bug. Minidom is an XML Namespaces aware API. In XML Namespaces, the namespace URI is *part of* the name. No question about it. In Clark notation the element name that is specified in

    element = document.createElementNS("DAV:", "href")

is "{DAV:}href". In Clark notation the element name of the document element in the created document is "href". That is not the name the user specified. It is a mangled version of it. The mangling is no better than my reductio of reversing the qname. This is a bug. Simple as that. With this behavior, minidom is not an API correct with respect to XML Namespaces.

So you try the tack of invoking pythonicness. Well I have one for ya:

"In the face of ambiguity, refuse the temptation to guess."

You're guessing that explicit XMLNS attributes are the only way the user means to express namespace information, even though DOM allows this to be provided through such attributes *or* through namespace properties. I could easily argue that since these are core properties in the DOM, that DOM should ignore explicit XMLNS attributes and only use namespace properties in determining output namespace. You are guessing that XMLNS attributes (and only those) represent what the user really means. I would be arguing the same of namespace properties.

The reality is that once the poor user has done:

    element = document.createElementNS("DAV:", "href")

They are following DOM specification that they have created an element in a namespace, and you seem to be arguing that they cannot usefully have completed their work until they also do:

    element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, None, "DAV:")

I'd love to hear how many actual minidom users would agree with you.

It's currently a bug. It needs to be fixed. However, I have no time for this bewildering fight. If the consensus is to leave minidom the way it is, I'll just wash my hands of the matter, but I'll be sure to emphasize heavily to users that minidom is broken with respect to Namespaces and serialization, and that they abandon it in favor of third-party tools.

-- Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net
http://fourthought.com
http://copia.ogbuji.net
http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/
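The Clark-notation comparison made in this message can be reproduced with a short sketch. The helper clark_name is hypothetical (it is not a minidom API), and it derives the local name from tagName purely for illustration.

```python
from xml.dom import minidom


def clark_name(element):
    """Return the element's name in Clark notation (hypothetical helper)."""
    local = element.tagName.split(":", 1)[-1]  # drop any prefix
    if element.namespaceURI:
        return "{%s}%s" % (element.namespaceURI, local)
    return local


doc = minidom.Document()
created = doc.createElementNS("DAV:", "href")
doc.appendChild(created)

# Serialise with toxml() and parse the result back in.
reparsed = minidom.parseString(doc.toxml()).documentElement

name_before = clark_name(created)
name_after = clark_name(reparsed)
print(name_before, name_after)
```

With minidom as shipped in CPython, the name seen after the round trip differs from the one given to createElementNS. Whether that difference is a bug or a faithful rendering of an incomplete DOM is exactly what the thread disputes.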
Re: XML and namespaces
I wrote:

> The reality is that once the poor user has done:
>
>     element = document.createElementNS("DAV:", "href")
>
> They are following DOM specification that they have created an element in a namespace, and you seem to be arguing that they cannot usefully have completed their work until they also do:
>
>     element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, None, "DAV:")
>
> I'd love to hear how many actual minidom users would agree with you.

Of course (FWIW) I meant

    element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "DAV:")

-- Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net
http://fourthought.com
http://copia.ogbuji.net
http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/
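The difference between the two calls can be checked directly. The following is a minimal sketch against xml.dom.minidom as shipped with CPython; other DOM implementations may treat a None qualified name differently.

```python
from xml.dom import XMLNS_NAMESPACE, minidom

doc = minidom.Document()
element = doc.createElementNS("DAV:", "href")
doc.appendChild(element)

# Passing None as the qualified name fails inside minidom, which tries
# to split the name on ":" to find a prefix.
try:
    element.setAttributeNS(XMLNS_NAMESPACE, None, "DAV:")
    none_error = None
except AttributeError as exc:
    none_error = exc

# The corrected call names the attribute "xmlns", declaring the
# default namespace on the element.
element.setAttributeNS(XMLNS_NAMESPACE, "xmlns", "DAV:")
declared = element.toxml()
print(type(none_error).__name__, declared)
```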
Re: XML and namespaces
[EMAIL PROTECTED] The current, erroneous behavior, which you advocate, is of the same bug. Minidom is an XML Namespaces aware API. In XML Namespaces, the namespace URI is *part of* the name. No question about it. In Clark notation the element name that is specified in element = document.createElementNS(DAV:, href) is {DAV:}href. In Clark notation the element name of the document element in the created docuent is href. I think if we're going to get anywhere in this discussion, we'll have to stick to the convention that we are dealing with some specific values. I suggest the following element_local_name = 'href' element_ns_prefix = 'DAV' element_ns_uri = 'somescheme://someuri' Therefore, in Clark notation, the qualified name of the element in the OPs example is {somescheme://someuri}href. (Yes, I know that DAV: is a valid namespace URI. But it's a poor example because it looks like a namespace prefix, and may be giving rise to some confusion.) So, to create a namespaced element, we must specify the namespace uri, the namespace prefix and the element local name, like so qname = %s:%s % (element_ns_prefix, element_local_name) element = document.createElementNS(element_ns_uri, qname) Now, if we create, as the OP did, an element with a namespace uri but no prefix, like so element = document.createElementNS(element_ns_uri, element_local_name) that element *cannot* be serialised naively, because the namespace prefix has not been declared. Yes, the element is correctly scoped to the element_ns_uri namespace, but it cannot be serialised because declaration of namespace prefixes is a pre-requisite of the Namespaces REC. Relevant quotes from the Namespaces REC are URI references can contain characters not allowed in names, so cannot be used directly as namespace prefixes. Therefore, the namespace prefix serves as a proxy for a URI reference. 
An attribute-based syntax described below is used to declare the association of the namespace prefix with a URI reference; software which supports this namespace proposal must recognize and act on these declarations and prefixes.

and

Namespace Constraint: Prefix Declared The namespace prefix, unless it is xml or xmlns, must have been declared in a namespace declaration attribute in either the start-tag of the element where the prefix is used or in an ancestor element (i.e. an element in whose content the prefixed markup occurs).

http://www.w3.org/TR/REC-xml-names/

[EMAIL PROTECTED] So you try the tack of invoking pythonicness. Well I have one for ya: In the face of ambiguity, refuse the temptation to guess.

Precisely: If the user has created a document that is not namespace correct, then do not try to guess whether it should be corrected or not: simply serialize the dud document. If the user wants a namespace well-formed document, then they are responsible for either ensuring that the relevant namespaces, prefixes and uris are explicitly declared, or for explicitly calling some normalization routine that automagically does that for them.

[EMAIL PROTECTED] You're guessing that explicit XMLNS attributes are the only way the user means to express namespace information, even though DOM allows this to be provided through such attributes *or* through namespace properties. I could easily argue that since these are core properties in the DOM, that DOM should ignore explicit XMLNS attributes and only use namespace properties in determining output namespace. You are guessing that XMLNS attributes (and only those) represent what the user really means. I would be arguing the same of namespace properties.

I'm not guessing anything: I'm asserting that with DOM Level 2, the user is expected to manage their own namespace prefix declarations. DOM L2 states that

Namespace validation is not enforced; the DOM application is responsible.
In particular, since the mapping between prefixes and namespace URIs is not enforced, in general, the resulting document cannot be serialized naively.

DOM L3 provides the normalizeNamespaces method, which the user should have to *explicitly* call in order to make their document namespace well-formed if it was not already.

http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/namespaces-algorithms.html

The proposal that minidom should automagically fixup namespace declarations and prefixes on output would leave it compliant with *neither* DOM L2 nor L3.

[EMAIL PROTECTED] The reality is that once the poor user has done:

    element = document.createElementNS('DAV:', 'href')

They are following DOM specification that they have created an element in a namespace, and you seem to be arguing that they cannot usefully have completed their work until they also do:

    element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, None, 'DAV:')

Actually no, that statement produces AttributeError: 'NoneType' object has no attribute 'split'. I believe that you're
Re: XML and namespaces
Is this automatic creation an expected behaviour? Of course. Not exactly a bug /.../ So it should probably be optional. My interpretation of namespace nodes is that the application is responsible /.../ I'm sorry but you're wrong on this. Well, my reading of the DOM L2 spec is such that it does not agree with the statement above. It's currently a bug. It needs to be fixed. It's not a bug, it doesn't need fixing, minidom is not broken. further p^H^H^H^H^H^H^H^H^H can anyone perhaps dig up a DOM L2 implementation that's not written by anyone involved in this thread, and see what it does ? /F -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
Fredrik Lundh wrote: can anyone perhaps dig up a DOM L2 implementation that's not written by anyone involved in this thread, and see what it does ?

Alright. Look away from the wrapper code (which I wrote, and which doesn't do anything particularly clever) and look at the underlying libxml2 serialisation behaviour:

    import libxml2dom
    document = libxml2dom.createDocument('DAV:', 'href', None)
    print document.toString()

This outputs the following:

    <?xml version="1.0"?>
    <href xmlns="DAV:"/>

To reproduce the creation of bare Document objects (which I thought wasn't strictly supported by minidom), we perform some tricks:

    document = libxml2dom.createDocument(None, 'doc', None)
    top = document.xpath('*')[0]
    element = document.createElementNS('DAV:', 'href')
    document.replaceChild(element, top)
    print document.toString()

This outputs the following:

    <?xml version="1.0"?>
    <href xmlns="DAV:"/>

While I can understand the desire to suppress xmlns attribute generation for certain document types, this is probably only interesting for legacy XML processors and for HTML. Leaving such attributes out by default, whilst claiming some kind of fine print standards compliance, is really a recipe for unnecessary user frustration.

Paul
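For comparison with the libxml2 output above, here is a small sketch of what the standard library's ElementTree does. ElementTree is not a DOM Level 2 implementation, so it does not settle the spec argument; the example only assumes Python 3 and shows that its serializer declares the namespace on output without being asked, inventing a prefix to do so.

```python
import xml.etree.ElementTree as ET

# ElementTree names elements in Clark notation: {namespace-uri}local-name.
element = ET.Element('{DAV:}href')

# The serializer generates a prefix and the matching declaration itself.
serialized = ET.tostring(element, encoding='unicode')
print(serialized)

# Reparsing yields the same namespaced name, so nothing was lost.
roundtrip = ET.fromstring(serialized)
print(roundtrip.tag)
```

The generated prefix is an implementation detail; what matters for this thread is that the expanded name survives a round trip.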
Re: XML and namespaces
[Fredrik Lundh] can anyone perhaps dig up a DOM L2 implementation that's not written by anyone involved in this thread, and see what it does ?

[Paul Boddie]

    document = libxml2dom.createDocument(None, 'doc', None)
    top = document.xpath('*')[0]
    element = document.createElementNS('DAV:', 'href')
    document.replaceChild(element, top)
    print document.toString()

This outputs the following:

    <?xml version="1.0"?>
    <href xmlns="DAV:"/>

But that's incorrect. You have now defaulted the namespace to DAV: for every unprefixed element that is a descendant of the href element. Here is an example

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
document = libxml2dom.createDocument(None, 'doc', None)
top = document.xpath('*')[0]
elem1 = document.createElementNS('DAV:', 'href')
document.replaceChild(elem1, top)
elem2 = document.createElementNS(None, 'no_ns')
document.childNodes[0].appendChild(elem2)
print document.toString()
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

which produces

    <?xml version="1.0"?>
    <href xmlns="DAV:"><no_ns/></href>

The defaulting rules of XML namespaces state

5.2 Namespace Defaulting A default namespace is considered to apply to the element where it is declared (if that element has no namespace prefix), and to all elements with no prefix within the content of that element.

http://www.w3.org/TR/REC-xml-names/#defaulting

So although I have explicitly specified no namespace for the no_ns subelement, it now defaults to the default DAV: namespace which has been declared in the automagically created xmlns attribute. This is wrong behaviour. If I want for my sub-element to truly have no namespace, I have to write it like this

    <?xml version="1.0"?>
    <myns:href xmlns:myns="DAV:"><no_ns/></myns:href>

[Paul Boddie] Leaving such attributes out by default, whilst claiming some kind of fine print standards compliance, is really a recipe for unnecessary user frustration.

On the contrary, once you start second guessing the standards and making guesses about what users are really trying to do, and making decisions for them, then some people are going to get different behaviour from what they rightfully expect according to the standard. People whose expectations match with the guesses made on their behalf will find that their software is not portable between DOM implementations. With something as finicky as XML namespaces, you can't just make ad-hoc decisions as to what the user really wants. That's why DOM L2 punted on the whole problem, and left it to DOM L3.

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
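The defaulting rule quoted in this message can be checked with a short sketch. It assumes Python 3's xml.dom.minidom, whose parser is namespace-aware, and reuses the two documents from the message; the variable names are only illustrative.

```python
from xml.dom import minidom

# An unprefixed child inside an element that declares a default namespace.
defaulted = minidom.parseString('<href xmlns="DAV:"><no_ns/></href>')
child_a = defaulted.documentElement.firstChild

# The same child inside a parent that uses a prefix instead.
prefixed = minidom.parseString(
    '<myns:href xmlns:myns="DAV:"><no_ns/></myns:href>')
child_b = prefixed.documentElement.firstChild

print(repr(child_a.namespaceURI))  # picked up from the default declaration
print(repr(child_b.namespaceURI))  # stays outside any namespace
```

The Namespaces REC also lets a child reset the default with an empty declaration, xmlns="", so a serializer that emits default declarations has that second way to keep no_ns out of the DAV: namespace.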
Re: XML and namespaces
Leaving such attributes out by default, whilst claiming some kind of fine print standards compliance, is really a recipe for unnecessary user frustration. On the contrary, once you start second guessing the standards and making guesses about what users are really trying to do, and making decisions for them, then some people are going to get different behaviour from what they rightfully expect according to the standard. People whose expectations match with the guesses made on their behalf will find that their software is not portable between DOM implementations. and this hypothetical situation is different from the current situation in exactly what way? With something as finicky as XML namespaces, you can't just make ad-hoc decisions as to what the user really wants. That's why DOM L2 punted on the whole problem, and left it to DOM L3. so L2 is the we support namespaces, but we don't really support them level ? maybe we could take everyone involved with the DOM design out to the backyard and beat them with empty PET bottles until they promise never to touch a computer again ? /F -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
[Alan Kennedy] On the contrary, once you start second guessing the standards and making guesses about what users are really trying to do, and making decisions for them, then some people are going to get different behaviour from what they rightfully expect according to the standard. People whose expectations match with the guesses made on their behalf will find that their software is not portable between DOM implementations.

[Fredrik Lundh] and this hypothetical situation is different from the current situation in exactly what way?

Hmm, not sure I understand what you're getting at. If changes are made to minidom that implement non-standard behaviour, there are two groups of people I'm thinking of

1. The people who expect the standard behaviour, not the modified behaviour. From these people's POV, the software can then be considered broken, since it produces different results from what is expected according to the standard.

2. The people who are ignorant of the decisions made on their behalf, and assume that they have written correct code. But their code won't work on other DOM implementations (because the automagic namespace fixup code isn't present, for example). From these people's POV, the software can then be considered broken.

[Alan Kennedy] With something as finicky as XML namespaces, you can't just make ad-hoc decisions as to what the user really wants. That's why DOM L2 punted on the whole problem, and left it to DOM L3.

[Fredrik Lundh] so L2 is the "we support namespaces, but we don't really support them" level ?

Well, I read it as "we support namespaces, but only if you know what you're doing".

[Fredrik Lundh] maybe we could take everyone involved with the DOM design out to the backyard and beat them with empty PET bottles until they promise never to touch a computer again ?

:-D

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
Alan Kennedy wrote: [Fredrik Lundh] and this hypothetical situation is different from the current situation in exactly what way? Hmm, not sure I understand what you're getting at. If changes are made to minidom that implement non-standard behaviour, there are two groups of people I'm thinking of 1. The people who expect the standard behaviour, not the modified behaviour. From these people's POV, the software can then be considered broken, since it produces different results from what is expected according to the standard. 2. The people who are ignorant of the decisions made on their behalf, and assume that they have written correct code. But their code won't work on other DOM implementations (because the automagic namespace fixup code isn't present, for example). From these people's POV, the software can then be considered broken. my point was that (unless I'm missing something here), there are at least two widely used implementations (libxml2 and the 4DOM domlette stuff) that don't interpret the spec in this way. so L2 is the we support namespaces, but we don't really support them level ? Well, I read it as we support namespaces, but only if you know what you're doing. or we support namespaces, but no matter how you interpret the word 'support', we probably mean something else. nyah nyah! /F -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
Alan Kennedy: Although I am sympathetic to your bewilderment: xml namespaces can be overly complex when it comes to the nitty, gritty details.

You're the one who doesn't seem to clearly understand XML namespaces. It's your position that is bewildering, not XML namespaces (well, they are confusing, but I have a good handle on all the nuances by now). Again, no skin off my back here: I write and use tools that are XML namespaces compliant. It doesn't hurt me that Minidom is not. I was hoping to help, but again I don't have time for this argument.

-- Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://fourthought.com http://copia.ogbuji.net http://4Suite.org Articles: http://uche.ogbuji.net/tech/publications/
Re: XML and namespaces
Wilfredo Sánchez Vega: I'm having some issues around namespace handling with XML:

    >>> document = xml.dom.minidom.Document()
    >>> element = document.createElementNS('DAV:', 'href')
    >>> document.appendChild(element)
    <DOM Element: href at 0x1443e68>
    >>> document.toxml()
    '<?xml version="1.0" ?>\n<href/>'

Note that the namespace wasn't emitted. If I have PyXML, xml.dom.ext.Print does emit the namespace:

    >>> xml.dom.ext.Print(document)
    <?xml version='1.0' encoding='UTF-8'?><href xmlns='DAV:'/>

Is that a limitation in toxml(), or is there an option to make it include namespaces?

Getting back to the OP: PyXML's xml.dom.ext.Print does get things right, and based on discussion in this thread, the only way you can serialize correctly is to use that add-on with minidom, or to use a third party, properly Namespaces-aware tool such as 4Suite (there are others as well). Good luck.

-- Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://fourthought.com http://copia.ogbuji.net http://4Suite.org Articles: http://uche.ogbuji.net/tech/publications/
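The report above can still be reproduced with the minidom shipped in Python 3. The following is a minimal sketch, assuming only the standard library; it also shows the explicit-declaration route that works without PyXML or 4Suite, which is the approach argued for elsewhere in this thread.

```python
from xml.dom import minidom, XMLNS_NAMESPACE

document = minidom.Document()
element = document.createElementNS('DAV:', 'href')
document.appendChild(element)

# The namespace is recorded on the node ...
print(element.namespaceURI)

# ... but minidom's serializer writes only the tag name and attributes.
bare = element.toxml()
print(bare)

# Declaring the namespace as an attribute makes it appear in the output.
element.setAttributeNS(XMLNS_NAMESPACE, 'xmlns', 'DAV:')
declared = element.toxml()
print(declared)
```

Whether the first output counts as a bug is exactly what the thread disputes; the sketch only shows what the code does.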
Re: XML and namespaces
Uche [EMAIL PROTECTED] wrote: Of course. Minidom implements level 2 (thus the NS at the end of the method name), which means that its APIs should all be namespace aware. The bug is that writexml() and thus toxml() are not so.

Not exactly a bug - DOM Level 2 Core 1.1.8p2 explicitly leaves namespace fixup at the mercy of the application. It's only standardised as a DOM feature in Level 3, which minidom does not yet claim to support. It would be a nice feature to add, but it's not entirely trivial to implement, especially when you can serialize a partial DOM tree.

Additionally, it might have some compatibility problems with apps that don't expect namespace declarations to automagically appear. For example, perhaps, an app dealing with HTML that doesn't want spare xmlns="http://www.w3.org/1999/xhtml" declarations appearing in every snippet of serialized output.

So it should probably be optional. In DOM Level 3 (and pxdom) there's a DOMConfiguration parameter 'namespaces' to control it; perhaps for minidom an argument to toxml() might be best?

-- And Clover mailto:[EMAIL PROTECTED] http://www.doxdesk.com/
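To make the idea of optional fixup concrete, here is a rough sketch of a helper that an application could call before toxml(). It is not part of minidom and not the DOM Level 3 algorithm: declare_namespaces is a hypothetical name, it only looks at element names (namespaced attributes and the reserved xml prefix are ignored), and it assumes Python 3's minidom.

```python
from xml.dom import minidom, Node, XMLNS_NAMESPACE


def declare_namespaces(element, in_scope=None):
    """Add missing xmlns attributes for element names, recursively.

    Hypothetical helper, not part of minidom. `in_scope` maps a prefix
    (None for the default namespace) to the URI declared by an ancestor.
    """
    in_scope = dict(in_scope or {})

    # Honour declarations already present as attributes on this element.
    for name, value in element.attributes.items():
        if name == 'xmlns':
            in_scope[None] = value or None
        elif name.startswith('xmlns:'):
            in_scope[name[len('xmlns:'):]] = value

    prefix = element.prefix or None
    uri = element.namespaceURI or None
    # A prefix with no namespace URI cannot be declared, so skip that case.
    if in_scope.get(prefix) != uri and (uri or not prefix):
        # Bind the prefix, or set/reset the default namespace.
        attr_name = 'xmlns:' + prefix if prefix else 'xmlns'
        element.setAttributeNS(XMLNS_NAMESPACE, attr_name, uri or '')
        in_scope[prefix] = uri

    for child in element.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            declare_namespaces(child, in_scope)


# Usage: a DAV: element with a child that is in no namespace.
document = minidom.Document()
href = document.createElementNS('DAV:', 'href')
document.appendChild(href)
href.appendChild(document.createElementNS(None, 'no_ns'))

declare_namespaces(document.documentElement)
fixed = document.documentElement.toxml()
print(fixed)
```

Because the helper mutates the tree, calling it is an explicit choice by the application, which keeps toxml() itself unchanged for code that relies on the current output.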
Re: XML and namespaces
Quoting Andrew Kuchling:

    element = document.createElementNS('DAV:', 'href')

This call is incorrect; the signature is createElementNS(namespaceURI, qualifiedName).

Not at all, Andrew. href is a valid qname, as is foo:href. The prefix is optional in a QName. Here is the correct behavior, taken from a non-broken DOM library (4Suite's Domlette)

    >>> from Ft.Xml import Domlette
    >>> document = Domlette.implementation.createDocument(None, None, None)
    >>> element = document.createElementNS('DAV:', 'href')
    >>> document.appendChild(element)
    <Element at 0xb7d12e2c: name u'href', 0 attributes, 0 children>
    >>> Domlette.Print(document)
    <?xml version="1.0" encoding="UTF-8"?>
    <href xmlns="DAV:"/>

If you call .createElementNS('whatever', 'DAV:href'), the output is the expected: <?xml version="1.0" ?><DAV:href/>

Oh, no. That is not at all expected. The output should be:

    <?xml version="1.0" ?><DAV:href xmlns:DAV="whatever"/>

It doesn't look like there's any code in minidom that will automatically create an 'xmlns:DAV="whatever"' attribute for you. Is this automatic creation an expected behaviour?

Of course. Minidom implements level 2 (thus the NS at the end of the method name), which means that its APIs should all be namespace aware. The bug is that writexml() and thus toxml() are not so.

(I assume not. Section 1.3.3 of the DOM Level 3 says Similarly, creating a node with a namespace prefix and namespace URI, or changing the namespace prefix of a node, does not result in any addition, removal, or modification of any special attributes for declaring the appropriate XML namespaces. So the DOM can create XML documents that aren't well-formed w.r.t. namespaces, I think.)

Oh no. That only means that namespace declaration attributes are not created in the DOM data structure. However, output has to fix up namespaces in .namespaceURI properties as well as directly asserted xmlns attributes. It would be silly for DOM to produce malformed XML+XMLNS, and of course it is not meant to. The minidom behavior needs fixing, badly.

-- Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://fourthought.com http://copia.ogbuji.net http://4Suite.org Articles: http://uche.ogbuji.net/tech/publications/
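The point that the prefix is optional in a QName can be illustrated with a short sketch. It assumes Python 3's minidom; clark_name is a hypothetical helper written for this example, not a library function.

```python
from xml.dom import minidom


def clark_name(element):
    """Return the element name in Clark notation, {uri}local-name."""
    local = element.tagName.split(':', 1)[-1]   # drop any prefix
    uri = element.namespaceURI
    return '{%s}%s' % (uri, local) if uri else local


document = minidom.Document()
unprefixed = document.createElementNS('DAV:', 'href')
prefixed = document.createElementNS('DAV:', 'foo:href')

# Different qualified names and prefixes ...
print(unprefixed.tagName, unprefixed.prefix)
print(prefixed.tagName, prefixed.prefix)

# ... but the same expanded name.
print(clark_name(unprefixed), clark_name(prefixed))
```

Both calls are accepted by createElementNS; they differ only in how the name would have to be written out, which is where the serialization question starts.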
Re: XML and namespaces
On 2 Dec 2005 06:16:29 -0800, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Of course. Minidom implements level 2 (thus the NS at the end of the method name), which means that its APIs should all be namespace aware. The bug is that writexml() and thus toxml() are not so. Hm, OK. Filed as bug #1371937 in the Python bug tracker. Maybe I'll look at this during the bug day this Sunday. --amk -- http://mail.python.org/mailman/listinfo/python-list
Re: XML and namespaces
[AMK] (I assume not. Section 1.3.3 of the DOM Level 3 says Similarly, creating a node with a namespace prefix and namespace URI, or changing the namespace prefix of a node, does not result in any addition, removal, or modification of any special attributes for declaring the appropriate XML namespaces. So the DOM can create XML documents that aren't well-formed w.r.t. namespaces, I think.)

[Uche] Oh no. That only means that namespace declaration attributes are not created in the DOM data structure. However, output has to fix up namespaces in .namespaceURI properties as well as directly asserted xmlns attributes. It would be silly for DOM to produce malformed XML+XMLNS, and of course it is not meant to. The minidom behavior needs fixing, badly.

My interpretation of namespace nodes is that the application is responsible for creating whatever namespace declaration attribute nodes are required, on the DOM tree. DOM should not have to imply any attributes on output.

#-=-=-=-=-=-=-=-=-=
import xml.dom
import xml.dom.minidom

DAV_NS_U = "http://webdav.org"

xmldoc = xml.dom.minidom.Document()
xmlroot = xmldoc.createElementNS(DAV_NS_U, "DAV:xpg")
xmlroot.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns:DAV", DAV_NS_U)
xmldoc.appendChild(xmlroot)
print xmldoc.toprettyxml()
#-=-=-=-=-=-=-=-=-=

produces

    <?xml version="1.0" ?>
    <DAV:xpg xmlns:DAV="http://webdav.org"/>

Which is well formed wrt namespaces.

regards,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
Alan Kennedy: Oh no. That only means that namespace declaration attributes are not created in the DOM data structure. However, output has to fix up namespaces in .namespaceURI properties as well as directly asserted xmlns attributes. It would be silly for DOM to produce malformed XML+XMLNS, and of course it is not meant to. The minidom behavior needs fixing, badly.

My interpretation of namespace nodes is that the application is responsible for creating whatever namespace declaration attribute nodes are required, on the DOM tree. DOM should not have to imply any attributes on output.

I'm sorry but you're wrong on this. First of all, DOM L2 (the level minidom targets) does not have the concept of namespace nodes. That's XPath. DOM supports two ways of expressing namespace information. The first way is through the node properties .namespaceURI, .prefix (for the QName) and .localName. It *also* supports literal namespace declaration attributes (the NSDecl attributes themselves must have a namespace of "http://www.w3.org/2000/xmlns/"). As if this is not confusing enough, the Level 1 property .nodeName must provide the QName, redundantly.

As a result, you have to perform fix-up to merge properties with explicit NSDecl attributes in order to serialize. If the serializer does not do so, it is losing all the information in namespace properties, and the resulting output is not the same document that is represented in the DOM.

Believe me, I've spent many weary hours with all these issues, and implemented code to deal with the mess multiple times, and I know it all too painfully well. I wrote Amara largely because I got irrecoverably sick of DOM's idiosyncrasies.

Andrew, for this reason I'll probably take the initiative to work up a patch for the issue. I'll do what I can to get to it tomorrow. If you help me with code review and maybe writing some tests, that would be a huge help.

-- Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://fourthought.com http://copia.ogbuji.net http://4Suite.org Articles: http://uche.ogbuji.net/tech/publications/
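The two ways of carrying namespace information described in this message can be seen side by side in a small sketch. It assumes Python 3's minidom; the element and prefix names are only illustrative.

```python
from xml.dom import minidom

# Parsed: the declaration attribute and the namespace properties coexist.
parsed = minidom.parseString('<d:href xmlns:d="DAV:"/>').documentElement

# Built through the API: only the namespace properties are set.
built = minidom.Document().createElementNS('DAV:', 'd:href')

for label, node in (('parsed', parsed), ('built', built)):
    print(label, node.nodeName, node.prefix, node.namespaceURI,
          node.hasAttribute('xmlns:d'))
```

The two nodes agree on every namespace property and differ only in whether the declaration attribute exists, which is the information a serializer would have to merge during fix-up.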
Re: XML and namespaces
Wilfredo Sánchez Vega: I'm having some issues around namespace handling with XML:

    >>> document = xml.dom.minidom.Document()
    >>> element = document.createElementNS('DAV:', 'href')
    >>> document.appendChild(element)
    <DOM Element: href at 0x1443e68>
    >>> document.toxml()
    '<?xml version="1.0" ?>\n<href/>'

I haven't worked with minidom in just about forever, but from what I can tell this is a serious bug (or at least an appalling missing feature). I can't find anything in the Element.writexml() method that deals with namespaces. But I'm just baffled. Is there really any way such a bug could have gone so long unnoticed in Python and PyXML? I searched both trackers, and the closest thing I could find was this from 2002:

http://sourceforge.net/tracker/index.php?func=detail&aid=637355&group_id=6473&atid=106473

Different symptom, but also looks like a case of namespace ignorant code. Can anyone who's worked on minidom more recently let me know if I'm just blind to something?

-- Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://fourthought.com http://copia.ogbuji.net http://4Suite.org Articles: http://uche.ogbuji.net/tech/publications/
Re: XML and namespaces
I've found the same bug. This is what I've been doing:

    from xml.dom.minidom import Document
    try:
        from xml.dom.ext import PrettyPrint
    except ImportError:
        PrettyPrint = None

    doc = Document()
    ...
    if PrettyPrint is not None:
        PrettyPrint(doc, stream=output, indent='')
    else:
        top_parent.setAttribute("xmlns", xmlns)
        output.write(doc.toprettyxml(indent=''))
Re: XML and namespaces
On 30 Nov 2005 07:22:56 -0800, [EMAIL PROTECTED] [EMAIL PROTECTED] quoted:

    element = document.createElementNS('DAV:', 'href')

This call is incorrect; the signature is createElementNS(namespaceURI, qualifiedName). If you call .createElementNS('whatever', 'DAV:href'), the output is the expected:

    <?xml version="1.0" ?><DAV:href/>

It doesn't look like there's any code in minidom that will automatically create an 'xmlns:DAV="whatever"' attribute for you. Is this automatic creation an expected behaviour?

(I assume not. Section 1.3.3 of the DOM Level 3 says Similarly, creating a node with a namespace prefix and namespace URI, or changing the namespace prefix of a node, does not result in any addition, removal, or modification of any special attributes for declaring the appropriate XML namespaces. So the DOM can create XML documents that aren't well-formed w.r.t. namespaces, I think.)

--amk
Re: Clarification on XML parsing namespaces (xml.dom.minidom)
Greg Wogan-Browne wrote: I am having some trouble figuring out what is going on here - is this a bug, or correct behaviour? Basically, when I create an XML document with a namespace using xml.dom.minidom.parse() or parseString(), the namespace exists as an xmlns attribute in the DOM (fair enough, as it's in the original source document). However, if I use the DOM implementation to create an identical document with a namespace, the xmlns attribute is not present. This mainly affects me when I go to print out the document again using Document.toxml(), as the xmlns attribute is not printed for documents I create dynamically, and therefore XSLT does not kick in (I'm using an external processor). Any thoughts on this would be appreciated. Should I file a bug on pyxml?

It's odd behavior, but I think it's a stretch to call it a bug. Your problem is that you're mixing namespaced documents with the non-namespace DOM API. That means trouble and such odd quirks every time. Use getAttributeNS, createElementNS, setAttributeNS, etc. rather than getAttribute, createElement, setAttribute, etc.

-- Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Use CSS to display XML - http://www.ibm.com/developerworks/edu/x-dw-x-xmlcss-i.html
Introducing the Amara XML Toolkit - http://www.xml.com/pub/a/2005/01/19/amara.html
Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286
UBL 1.0 - http://www-106.ibm.com/developerworks/xml/library/x-think28.html
Manage XML collections with XAPI - http://www-106.ibm.com/developerworks/xml/library/x-xapi.html
Default and error handling in XSLT lookup tables - http://www.ibm.com/developerworks/xml/library/x-tiplook.html
Packaging XSLT lookup tables as EXSLT functions - http://www.ibm.com/developerworks/xml/library/x-tiplook2.html
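A short sketch of why mixing the two API families gives odd results, assuming Python 3's minidom. The namespace URI urn:example:meta and the attribute name are made up for the illustration.

```python
from xml.dom import minidom

META_NS = 'urn:example:meta'   # hypothetical namespace URI

document = minidom.Document()
element = document.createElementNS('DAV:', 'href')
element.setAttributeNS(META_NS, 'm:lang', 'en')

# The namespace-aware getter looks up (namespace URI, local name).
by_ns = element.getAttributeNS(META_NS, 'lang')
# The plain getter looks up the qualified name exactly as written.
by_qname = element.getAttribute('m:lang')
# Asking the plain getter for the local name finds nothing.
by_local = element.getAttribute('lang')

print(repr(by_ns), repr(by_qname), repr(by_local))
```

The plain methods treat the whole qualified name as an opaque string, so code that switches between the two families has to keep track of which key each one uses.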
Clarification on XML parsing namespaces (xml.dom.minidom)
Hi all, I am having some trouble figuring out what is going on here - is this a bug, or correct behaviour? Basically, when I create an XML document with a namespace using xml.dom.minidom.parse() or parseString(), the namespace exists as an xmlns attribute in the DOM (fair enough, as it's in the original source document). However, if I use the DOM implementation to create an identical document with a namespace, the xmlns attribute is not present. This mainly affects me when I go to print out the document again using Document.toxml(), as the xmlns attribute is not printed for documents I create dynamically, and therefore XSLT does not kick in (I'm using an external processor). Any thoughts on this would be appreciated. Should I file a bug on pyxml?

Greg

    Python 2.3.3 (#1, May 7 2004, 10:31:40)
    [GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import xml.dom.minidom
    >>> raw = '<test xmlns="http://example.com/namespace"/>'
    >>> doc = xml.dom.minidom.parseString(raw)
    >>> print doc.documentElement.namespaceURI
    http://example.com/namespace
    >>> print doc.documentElement.getAttribute('xmlns')
    http://example.com/namespace
    >>> impl = xml.dom.minidom.getDOMImplementation()
    >>> doc2 = impl.createDocument('http://example.com/namespace','test',None)
    >>> print doc2.documentElement.namespaceURI
    http://example.com/namespace
    >>> print doc2.documentElement.getAttribute('xmlns')
