Hi,

I am having a problem parsing XML that contains a backspace (0x0008)
character.

(The tab character (0x0009) is fine.)

It turns out that dom4j cannot parse XML containing a backspace character,
even when that XML was generated by dom4j itself.

To demonstrate the problem, here is the code:

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class Foo {

    public static void main( String[] args ) {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement( "root" );             

        // Append a backspace character (0x0008) to the element text.
        Element author1 = root.addElement( "author" )
            .addText( "James Strachan" + (char) 8 );
        
        String text = document.asXML();
        System.out.println(text);
        
        try
        {
            DocumentHelper.parseText(text);
        } 
        catch (DocumentException e)
        {
            e.printStackTrace();                // <----- exception occurs here
        }
        
    }
}

The output is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<root><author>James Strachan&#8;</author></root>
org.dom4j.DocumentException: Error on line 2 of document  : Character reference "&#8" is an invalid XML character. Nested exception: Character reference "&#8" is an invalid XML character.
        at org.dom4j.io.SAXReader.read(SAXReader.java:482)
        at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
        at Foo.main(Foo.java:24)
Nested exception: 
org.xml.sax.SAXParseException: Character reference "&#8" is an invalid XML character.
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.dom4j.io.SAXReader.read(SAXReader.java:465)
        at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
        at Foo.main(Foo.java:24)
Nested exception: org.xml.sax.SAXParseException: Character reference "&#8" is an invalid XML character.
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.dom4j.io.SAXReader.read(SAXReader.java:465)
        at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
        at Foo.main(Foo.java:24)
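
The parser's message looks consistent with the XML 1.0 "Char" production,
which allows tab (0x9), line feed (0xA) and carriage return (0xD) but no
other character below 0x20. Here is a minimal sketch of that check using
only the JDK, together with a possible workaround that drops such characters
before the text is added to the document. The class XmlCharCheck and the
methods isXml10Char and stripInvalidXml10 are hypothetical names of my own,
not part of dom4j, and I am assuming XML 1.0 rules are what the parser
applies here.

```java
public class XmlCharCheck {

    // True if the code point matches the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Hypothetical helper (not a dom4j API): returns a copy of s with every
    // code point that XML 1.0 does not allow removed.
    static String stripInvalidXml10(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isXml10Char(cp)) {
                sb.appendCodePoint(cp);
            }
            // Advance by one or two chars depending on the code point.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(isXml10Char(0x8));   // backspace: prints false
        System.out.println(isXml10Char(0x9));   // tab: prints true
        // The backspace is dropped, leaving the 14 visible characters.
        System.out.println(stripInvalidXml10("James Strachan\b").length());
    }
}
```

Passing the text through stripInvalidXml10 before calling addText would
avoid the exception, at the cost of silently losing the backspace, so it only
helps if that character carries no meaning in the data.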

Does anyone know how to solve this? I am using dom4j-1.6.1.jar.

Regards,

Chris Lai


