Brian,

That is a lot of detail - thanks for the care you have put into it. I understand that the "attributes" starting with "xmlns:" are not real attributes of the elements but are the representation of the namespace nodes associated with elements that have been serialized.
My approach to creating the DOM document from the SAX events has been based on responding to a startElement event by creating a new DOM Element node, inserting it into the document, and then, for each real attribute of that element as supplied in the attrs parameter, adding the attribute using the element's setAttributeNS() method. To my understanding, that does not add any attributes called things like xmlns:xyz. Instead it adds the attribute and, if required, adds a namespace node to declare the namespace/prefix association that is used by the attribute that has been added.

All that said, I am going to look into the "handler.startPrefixMapping" method that you suggested because I agree that the conversion from SAX events to a DOM structure is where the wheels are falling off the cart. I will give it another day of investigation and then try posting this to the Xerces user list.

Regards

Geoff Shuetrim

On Fri, 2005-06-03 at 00:17 -0400, Brian Minchau wrote:
> Geoff,
> this seems to be an issue with the DOM object that you want to serialize,
> not a problem with an XSL transformation (Xalan), nor a problem with the
> serializer that is part of Xalan.
>
> You are using Xerces as your DocumentBuilderFactory, so I suggest taking
> this issue up on one of the Xerces mailing lists.
>
> I'm not a DOM expert, so I don't quite know what your recent code fragment is
> trying to do with namespaces.
>
> My instinct says that the root of your problem may be how namespace nodes
> are represented.
>
> Consider this example, an element named "A" with a namespace node of prefix
> "p1" mapped to URI "http://uri1". Element "A" has direct child element "B"
> that has two namespace nodes, "p1" mapped to URI "http://uri1" and "p2"
> mapped to "http://uri2". Also, "B" has an attribute with name "attrb" and
> value "valueb".
>
> Each element has its own associated namespace nodes. This more or less is
> the DOM point of view.
> When you serialize (write out a stream of
> characters or bytes) the above concepts you get this:
>   <A xmlns:p1='http://uri1'><B xmlns:p1='http://uri1' xmlns:p2='http://uri2' attrb='valueb' /></A>
>
> It is a feature of serialized XML that if a parent element has a namespace
> mapping, xmlns:prefix='uri', then all descendant elements get that mapping
> "for free". So any serializer worth its salt wouldn't write out what I did
> but this:
>   <A xmlns:p1='http://uri1'><B xmlns:p2='http://uri2' attrb='valueb' /></A>
>
> These attributes, the ones starting with xmlns, are not real attributes of
> A and B, but are the representation of the namespace nodes associated with
> A and with B when you serialize. When an XML parser like Xerces reads such
> a stream in, it will convert the xmlns:prefix="uri" attributes back into
> namespace nodes. I'm not sure if it also leaves them as attribute values
> as well.
>
> If you want to generate the above described elements with SAX events you
> will need:
>   handler.startPrefixMapping("p1","http://uri1");
>   handler.startElement(...) ; // for A
>   handler.startPrefixMapping("p2","http://uri2");
>   handler.startElement(...) ; // for B, and the attributes object passed in here contains attrb with value valueb
>   endElement() // end B
>   endElement() // end A
>
> If you tried to do it like this, I don't think the meaning is the same:
>   handler.startElement(...) ; // for A, and the attributes object passed in here contains xmlns:p1 with value http://uri1
>   handler.startElement(...) ; // for B, and the attributes object passed in here contains attrb with value valueb and xmlns:p2 with value http://uri2
>   endElement() // end B
>   endElement() // end A
>
> It may look the same when serialized, but neither A nor B had any namespace
> nodes.
>
> Element A has one attribute named xmlns:p1 with value http://uri1. However
> I don't think this is a namespace node.
> If you squeezed these SAX events
> into a DOM, and asked the DOM for a list of namespace nodes associated with
> A, I don't think it would include a namespace node for p1. It isn't a
> namespace node, it is just an attribute that happens to look like a
> namespace node. With SAX you want to use the startPrefixMapping() calls to
> declare namespace nodes.
>
> - Brian
> - - - - - - - - - - - - - - - - - - - -
> Brian Minchau
> XSLT Development, IBM Toronto
> e-mail: [EMAIL PROTECTED]
>
> "You want it today? I thought tomorrow was yesterday, and I still had more
> time." - My daughter.
>
> Geoffrey Shuetrim <[EMAIL PROTECTED]>
> 06/02/2005 10:27 PM
> Please respond to geoff
>
> To: xalan <[email protected]>
> cc:
> Subject: Re: Losing namespace declarations for namespaces that are used only
>   on attributes (eg xlink) when using org.apache.xml.serializer
>
> OK - I am stumped:
>
> The original Document that I use to start the chain of XSLT
> transformations is also losing its namespace declarations for attribute-only
> namespaces when it is serialised, i.e. before any transformations,
> so I think I can rule out any side effects of the transformation
> process.
>
> I can also run the following code against the original Document:
>
> NodeList n = document.getElementsByTagNameNS(
>     "http://www.xbrl.org/2003/linkbase",
>     "linkbaseRef"
> );
> Element e = (Element) n.item(0);
> String type = e.getAttributeNS("http://www.w3.org/1999/xlink","type");
> System.out.println(type);
>
> and I get back the text: "simple", which is what I was after for an
> XLink simple link. This suggests to me that the XLink attribute does
> have its namespace declaration in the Document that is serialized
> incorrectly.
>
> I also run:
>
> System.out.println(document.getClass().getName());
>
> and it produces "org.apache.xerces.dom.DocumentImpl", telling me (I
> think) that I am working on the intended Document implementation.
>
> Running Brian's code results in my not losing any namespaces for the
> trivial example that I provided, either when creating the original XML
> as a string or when loading it into a DOM Document from a file.
> Similarly, I can parse in very large complex XML files that correspond
> closely to what I am having trouble serializing, and then serialize them
> again without losing the namespaces.
>
> That leaves me with the view that the problem lies in there being some
> difference between the DOM Document that I build up in memory from a
> series of SAX events (my original DOM Document) and the kinds of DOM
> Documents created when I parse XML from files or Strings. When I add
> attributes to the original DOM Document, I use the following:
>
> for (int i = 0; i < attrs.getLength(); i++) {
>     // Handle the XML namespace (equals(), since == compares references)
>     if (XMLNamespace.equals(attrs.getURI(i)))
>         newElement.setAttribute(
>             attrs.getQName(i),
>             attrs.getValue(i)
>         );
>     else
>         newElement.setAttributeNS(
>             attrs.getURI(i),
>             attrs.getQName(i),
>             attrs.getValue(i)
>         );
> }
>
> My understanding is that this code is setting up the namespace nodes for
> the attributes correctly as well as adding the attributes themselves.
> This seems to be borne out by the success of the stylesheets that I can
> apply to this original DOM Document.
>
> That leaves me with a big zero in terms of the number of possible
> differences I can think of. Suggestions are welcome.
>
> Cheers
>
> Geoff Shuetrim
>
> On Thu, 2005-06-02 at 17:17 -0400, Brian Minchau wrote:
> > Geoff,
> > I tried two things. The first was to run your input XML through the
> > identity transform, which will essentially use the same code underneath,
> > but not use DOM at all.
> > Here is the code I ran:
> > static void case3() throws TransformerException, IOException {
> >     final javax.xml.transform.TransformerFactory tFactory;
> >     tFactory = new org.apache.xalan.processor.TransformerFactoryImpl();
> >
> >     final javax.xml.transform.Transformer transformer;
> >     transformer = tFactory.newTransformer();
> >
> >     StringWriter sw = new StringWriter();
> >     StringReader sr = new StringReader(
> >         "<?xml version='1.0' ?>\n" +
> >         "<c:a xmlns:b='http://somenamespace.com/'\n"+
> >         " xmlns:c='http://othernamespace.com/'\n"+
> >         " b:d='e'/>");
> >     StreamResult strmrslt = new StreamResult(sw);
> >     StreamSource strmsrc = new StreamSource(sr);
> >
> >     transformer.setOutputProperty("method","xml");
> >     transformer.setOutputProperty("indent","yes");
> >     transformer.setOutputProperty("standalone","no");
> >     transformer.transform(strmsrc, strmrslt);
> >
> >     sw.flush();
> >     String out = sw.toString();
> >     sw.close();
> >     System.out.println("=================================");
> >     System.out.println(out);
> >     System.out.println("=================================");
> > }
> >
> > Here is the output that I got:
> > =================================
> > <?xml version="1.0" encoding="UTF-8" standalone="no"?>
> > <c:a xmlns:b="http://somenamespace.com/"
> > xmlns:c="http://othernamespace.com/" b:d="e"/>
> >
> > =================================
> >
> > Then I tried with Xerces as the creator of the DOM.
> >
> > public static void case4()
> >     throws SAXException, IOException, ParserConfigurationException {
> >
> >     javax.xml.parsers.DocumentBuilderFactory dfactory;
> >     // dfactory =
> >     //     javax.xml.parsers.DocumentBuilderFactory.newInstance();
> >
> >     dfactory = new org.apache.xerces.jaxp.DocumentBuilderFactoryImpl();
> >     dfactory.setNamespaceAware(true);
> >
> >     final org.w3c.dom.Document document;
> >     {
> >         StringReader sr =
> >             new StringReader(
> >                 "<c:a xmlns:b='http://somenamespace.com/' "+
> >                 " xmlns:c='http://othernamespace.com/' b:d='e'/>");
> >         StreamSource strmsrc = new StreamSource(sr);
> >         InputSource is = new InputSource(sr);
> >
> >         javax.xml.parsers.DocumentBuilder dBuilder =
> >             dfactory.newDocumentBuilder();
> >         document = dBuilder.parse(is);
> >     }
> >
> >     final StringWriter sw = new StringWriter();
> >
> >     {
> >         java.util.Properties xmlProps =
> >             OutputPropertiesFactory.getDefaultMethodProperties("xml");
> >         xmlProps.setProperty("indent", "yes");
> >         xmlProps.setProperty("standalone", "no");
> >         Serializer serializer =
> >             SerializerFactory.getSerializer(xmlProps);
> >         serializer.setWriter(sw);
> >         serializer.asDOMSerializer().serialize(document);
> >     }
> >
> >     sw.flush();
> >     String out = sw.toString();
> >     sw.close();
> >     System.out.println("=================================");
> >     System.out.println(out);
> >     System.out.println("=================================");
> > }
> >
> > Whether the factory (dfactory) was set to be namespace aware, or not, made
> > no difference in what I got. The same output in all cases. Namespace nodes
> > were not lost.
> >
> > I don't know how you get your DOM document, but I think that is where the
> > problem is.
> >
> > - Brian
> > - - - - - - - - - - - - - - - - - - - -
> > Brian Minchau
> > XSLT Development, IBM Toronto
> > e-mail: [EMAIL PROTECTED]
> >
> > "If the lion purrs, it is only because it is saving you for dessert."
> > - My wife
> >
> > Geoffrey Shuetrim <[EMAIL PROTECTED]>
> > 06/02/2005 05:21 AM
> > Please respond to geoff
> >
> > To: xalan <[email protected]>
> > cc:
> > Subject: Losing namespace declarations for namespaces that are used only on
> >   attributes (eg xlink) when using org.apache.xml.serializer
> >
> > The subject says most of it. I am trying to serialize the results of
> > stylesheet transformations and some xmlns declarations are going
> > missing.
> >
> > For example, if I am serializing the following markup:
> >
> > <c:a
> >   xmlns:b="http://somenamespace.com/"
> >   xmlns:c="http://othernamespace.com/"
> >   b:d="e"/>
> >
> > then I get the following:
> >
> > <c:a
> >   xmlns:c="http://othernamespace.com/"
> >   b:d="e"/>
> >
> > The serialize method that I have written is:
> >
> > public void serialize(
> >         OutputStream outputStream,
> >         Document document)
> >         throws Exception {
> >     java.util.Properties xmlProps =
> >         OutputPropertiesFactory.getDefaultMethodProperties("xml");
> >     xmlProps.setProperty("indent", "yes");
> >     xmlProps.setProperty("standalone", "no");
> >     Serializer serializer =
> >         SerializerFactory.getSerializer(xmlProps);
> >     serializer.setOutputStream(outputStream);
> >     serializer.asDOMSerializer().serialize(document);
> > }
> >
> > The document being serialized has always been parsed by a namespace
> > aware DOM builder or created by a transformer applied to a namespace
> > aware DOM and created from a stylesheet parsed in using a namespace
> > aware DOM builder.
> >
> > The nearest archived message that I have come across is:
> > http://mail-archives.apache.org/mod_mbox/xml-security-dev/200409.mbox/%[EMAIL PROTECTED]
> > but that relates to XMLSerializer.
> >
> > Any guidance on what is going on would be very welcome!
> >
> > Geoff Shuetrim
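
For anyone who finds this thread later, here is a minimal sketch of the startPrefixMapping approach discussed above. It is not code from the thread and not Geoff's actual handler. The class name SaxToDomSketch and the helper build() are made up for illustration, and it rests on an assumption the thread does not confirm: that the declarations go missing because the hand-built DOM never receives them, and that copying each prefix mapping onto the element as an xmlns attribute (in the http://www.w3.org/2000/xmlns/ namespace) is enough for a DOM serializer to see them. Text content, comments and error handling are left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: builds a DOM from SAX events and keeps prefix mappings.
public class SaxToDomSketch extends DefaultHandler {
    private final Document document;
    private Node current;
    // {prefix, uri} pairs reported since the previous startElement.
    private final List<String[]> pending = new ArrayList<>();

    SaxToDomSketch(Document document) {
        this.document = document;
        this.current = document;
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        // SAX reports these just before the startElement they belong to.
        pending.add(new String[] {prefix, uri});
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        Element element = document.createElementNS(uri.isEmpty() ? null : uri, qName);
        // Record each mapping as an explicit declaration on the new element.
        for (String[] mapping : pending) {
            String name = mapping[0].isEmpty() ? "xmlns" : "xmlns:" + mapping[0];
            element.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, name, mapping[1]);
        }
        pending.clear();
        for (int i = 0; i < attrs.getLength(); i++) {
            String attrName = attrs.getQName(i);
            // Skip declarations if the parser also reports them as attributes.
            if (attrName.equals("xmlns") || attrName.startsWith("xmlns:")) {
                continue;
            }
            String attrUri = attrs.getURI(i);
            element.setAttributeNS(attrUri.isEmpty() ? null : attrUri, attrName, attrs.getValue(i));
        }
        current.appendChild(element);
        current = element;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        current = current.getParentNode();
    }

    // Parses the XML with a namespace-aware SAX parser and returns the DOM built by the handler.
    public static Document build(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document document = dbf.newDocumentBuilder().newDocument();
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.newSAXParser().parse(new InputSource(new StringReader(xml)), new SaxToDomSketch(document));
        return document;
    }
}
```

Calling SaxToDomSketch.build() on the c:a example from the first message should leave the root element carrying both the b:d attribute and an xmlns:b declaration. Whether that cures the serialization problem in the real application is something only a test against the actual handler can show.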
