Hello,
I'm hoping someone can help me figure out a strange problem.
I have a string containing XML, and two ways of turning it into an org.w3c.dom.Document give different results when I walk the tree:
 
Scenario 1:
1.  I use DocumentHelper.parseText to parse the string into an org.dom4j.Document.
2.  I use DOMWriter to convert that document to an org.w3c.dom.Document.
3.  I create an instance of a Xerces TreeWalker on the converted document.
4.  When I traverse the tree, the text is split into extra #text nodes that should not be there.
 
Scenario 2:
1.  I use DocumentBuilderFactory -> DocumentBuilder -> InputSource to create an org.w3c.dom.Document directly from the original source string.
2.  I create an instance of a Xerces TreeWalker on that document.
3.  Tree traversal gives the expected nodes: one #text node per paragraph.
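
As a cross-check that does not depend on dom4j or the org.apache.xerces classes, Scenario 2 can be approximated with the JDK's bundled parser alone. This is only a sketch, separate from the full test program further down: the class name Scenario2Sketch, the helper textNodes, and the shortened sample string are made up for illustration, and it assumes the DOM implementation returned by the JDK supports the Traversal feature.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.TreeWalker;
import org.xml.sax.InputSource;

public class Scenario2Sketch {
    // Shortened version of the sample string from the post (hypothetical test data)
    static final String SOURCE = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<body><p>This Office is a test document: Officer</p>"
            + "<p>t &quot;Bert&quot;.  y &quot;Sam&quot;.</p></body>";

    // Parses the string and returns the value of every text node in document order
    static List<String> textNodes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Assumption: the parser's Document also implements DocumentTraversal
            if (!(doc instanceof DocumentTraversal)) {
                throw new IllegalStateException("DOM Traversal is not supported");
            }
            TreeWalker walker = ((DocumentTraversal) doc).createTreeWalker(
                    doc.getDocumentElement(), NodeFilter.SHOW_TEXT, null, true);
            List<String> values = new ArrayList<>();
            for (Node n = walker.nextNode(); n != null; n = walker.nextNode()) {
                values.add(n.getNodeValue());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        for (String value : textNodes(SOURCE)) {
            System.out.println("#text [" + value + "]");
        }
    }
}
```

Using SHOW_TEXT with a null filter keeps the walk flat, so a count of more than one value per paragraph would point at split text nodes rather than at the recursion in the walk method.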
 
Here is my code:
------------------------------------------------------------------------------------------
import java.io.StringReader;
import java.io.StringWriter;
 
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
 
import org.apache.xerces.dom.DocumentImpl;
import org.apache.xerces.dom.TreeWalkerImpl;
import org.apache.xml.serialize.LineSeparator;
import org.apache.xml.serialize.OutputFormat;
import org.apache.xml.serialize.XMLSerializer;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.io.DOMWriter;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.TreeWalker;
import org.xml.sax.InputSource;
 
public class Dom4jTest {
    public static void main(String[] args) {
        try {   
            Dom4jTest dt = new Dom4jTest();
            String source ="<?xml version=\"1.0\" encoding=\"UTF-8\"?>"+
                         "<body><p>This Office is a test document: Officer</p><p>t &quot;Bert&quot;.  y &quot;Sam&quot;. &quot;Bert&quot; ow aobl him.  Jreine aine.</p></body>";
            Document dom4jDoc = DocumentHelper.parseText(source);
            org.w3c.dom.Document domDoc = dt.convert(dom4jDoc);
            System.out.println("----Calling createDocument with dom4j Document converted----");
            System.out.println("domDoc: "+dt.serializeDom(domDoc));
            dt.createDocument(domDoc);
           
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            InputSource is = new InputSource(new StringReader(source));
            domDoc = builder.parse(is);
            System.out.println("----Calling createDocument with w3c Document----");
            System.out.println("domDoc: "+dt.serializeDom(domDoc));
            dt.createDocument(domDoc);
 
        } catch (Exception e) {
            // Print the stack trace so the cause of a failure is not hidden
            System.out.println("Exception has occurred.");
            e.printStackTrace();
        }
    }
    public Dom4jTest() {
        super();
    }
    public org.w3c.dom.Document convert(Document dom4jDoc) throws DocumentException {
        DOMWriter writer = new DOMWriter();
        return writer.write(dom4jDoc);
    }
    public void createDocument(org.w3c.dom.Document srcDoc) throws Exception {
        AllElements allelements = new AllElements();
        Node sourceRoot = srcDoc.getLastChild();
        DocumentImpl sourceImpl = (DocumentImpl)srcDoc;
        TreeWalkerImpl tw =
           (TreeWalkerImpl)sourceImpl.createTreeWalker(sourceRoot,
              NodeFilter.SHOW_ALL, allelements, true);
        walk(tw);
    }
    private void walk(TreeWalker sourceIterator) {
        Node n = sourceIterator.getCurrentNode();
        for (Node tagSourceElem = sourceIterator.firstChild();tagSourceElem != null; tagSourceElem = sourceIterator.nextSibling()) {
            if (tagSourceElem.getNodeValue() != null) {   
                System.out.println("The name and value of the node: "+tagSourceElem.getNodeName() + " " + tagSourceElem.getNodeValue());
            }
            walk(sourceIterator);
        }
        sourceIterator.setCurrentNode(n);
    }
    // Filter passed to the TreeWalker. DOM node type constants start at 1,
    // so this test is always true and every node is accepted.
    class AllElements implements NodeFilter
    {
      public short acceptNode (Node n)
      {
        if (n.getNodeType() > 0 )
          return FILTER_ACCEPT;
        return FILTER_SKIP;
      }
    }
    private static String serializeDom(org.w3c.dom.Document xmlDom) {
        OutputFormat format = new OutputFormat(xmlDom);
        format.setLineSeparator(LineSeparator.Windows);
        StringWriter writer = new StringWriter();
       
        XMLSerializer serializer = new XMLSerializer(writer, format);
        try {
            serializer.asDOMSerializer();
            serializer.serialize(xmlDom);
        } catch (Exception e) {
            System.out.println("Exception in SerializeDom");
            e.printStackTrace();
        }
        return writer.toString();
    }
}
------------------------------------------------------------------------------------------
Here is my output:
 

----Calling createDocument with dom4j Document converted----
domDoc: <?xml version="1.0" encoding="UTF-8"?>
<body><p>This Office is a test document: Officer</p><p>t "Bert". y "Sam". "Bert" ow aobl him. Jreine aine.</p></body>
The name and value of the node: #text This Office is a
The name and value of the node: #text test document: Officer
The name and value of the node: #text t
The name and value of the node: #text "
The name and value of the node: #text Bert
The name and value of the node: #text "
The name and value of the node: #text . y
The name and value of the node: #text "
The name and value of the node: #text Sam
The name and value of the node: #text "
The name and value of the node: #text .
The name and value of the node: #text "
The name and value of the node: #text Bert
The name and value of the node: #text "
The name and value of the node: #text ow aobl him. Jreine aine.

----Calling createDocument with w3c Document----
domDoc: <?xml version="1.0" encoding="UTF-8"?>
<body><p>This Office is a test document: Officer</p><p>t "Bert". y "Sam". "Bert" ow aobl him. Jreine aine.</p></body>
The name and value of the node: #text This Office is a test document: Officer
The name and value of the node: #text t "Bert". y "Sam". "Bert" ow aobl him. Jreine aine.
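
The first run looks like what a walker prints when one element holds several adjacent Text siblings, while both serialized documents are identical. Whether DOMWriter really builds the tree that way is an assumption that this post does not confirm. A minimal sketch of how that could be checked with standard DOM calls only: count the Text children of an element, call normalize(), and count again. The class name AdjacentTextSketch and both helper methods are hypothetical, and the split paragraph is built by hand to imitate the first run.

```java
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class AdjacentTextSketch {
    // Counts the direct children of parent that are Text nodes
    static int countTextChildren(Node parent) {
        int count = 0;
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.TEXT_NODE) {
                count++;
            }
        }
        return count;
    }

    // Builds a <p> whose text is deliberately split into five adjacent Text nodes
    static Element buildSplitParagraph() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element p = doc.createElement("p");
            doc.appendChild(p);
            for (String piece : new String[] {"t ", "\"", "Bert", "\"", "."}) {
                p.appendChild(doc.createTextNode(piece));
            }
            return p;
        } catch (Exception e) {
            throw new IllegalStateException("Could not create an empty document", e);
        }
    }

    public static void main(String[] args) {
        Element p = buildSplitParagraph();
        System.out.println("Text children before normalize(): " + countTextChildren(p));
        p.normalize(); // merges adjacent Text nodes in the subtree
        System.out.println("Text children after normalize():  " + countTextChildren(p));
        System.out.println("Merged value: " + p.getFirstChild().getNodeValue());
    }
}
```

If the converted document from Scenario 1 shows the same drop in the count after normalize() is called on its root element, the extra nodes are adjacent Text siblings and not a TreeWalker problem.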

 

 

Thanks in advance for your assistance.

Terry
