Richard,

When you do the following,

>         Document document = DocumentHelper.createDocument();
>         Element root = document.addElement( "root" );
>
>         Element author1 = root.addElement( "author" )
>             .addText( "James Strachan" + (new Character((char) 8)).toString() );
>
>         String text = document.asXML();
>         System.out.println(text);

The result is 

        <?xml version="1.0" encoding="UTF-8"?>
        <root><author>James Strachan&#8;</author></root>

which may be correct, but the dom4j parser rejects the XML that dom4j itself generated.

So it may be a bug in either

1) the XML serialization (asXML), which incorrectly encodes backspace as &#8;
or
2) the parser, which incorrectly rejects &#8;

If 2) is true, I need to know a workaround for encoding backspace so that
dom4j can parse it without an exception.

IE has no problem parsing the XML with &#8;.

Regards,

Chris Lai

29597369
GET 6303


-----Original Message-----
From: Richard Eckart [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 10, 2007 4:29 PM
To: Chris Lai /EEL/IT
Cc: dom4j-user@lists.sourceforge.net
Subject: Re: [dom4j-user] problem on parsing backspace character


Hi Chris,

It's due to the XML specification. The backspace character is not a
valid XML character. If you want to have it in your documents, you
need to escape it. It seems there is a bug that causes it to fail to
escape the backspace char when an XML document is serialized to a
String. What I suppose should happen is that the resulting XML
contains a &#8; character reference.
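
One way to carry such a character through a write/parse round trip is to
encode the text before adding it to the document and decode it after
reading. The following is only a minimal sketch, assuming that both the
producer and the consumer of the XML agree on the scheme. The class name
ControlCharCodec and the choice of Base64 are assumptions made for
illustration and are not part of dom4j.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not part of dom4j): makes arbitrary text safe to
// store as XML character data by Base64-encoding its UTF-8 bytes.
final class ControlCharCodec {

    private ControlCharCodec() {}

    // The Base64 alphabet (A-Z, a-z, 0-9, '+', '/', '=') contains only
    // characters that are legal in XML and need no escaping.
    static String encode(String raw) {
        return Base64.getEncoder()
                .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Reverses encode(): Base64-decodes and interprets the bytes as UTF-8.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "James Strachan" + (char) 8;
        String safe = encode(raw);   // this is what would go into addText(...)
        System.out.println(safe);
        System.out.println(decode(safe).equals(raw));
    }
}
```

The encoded string would be passed to addText(...) and the element text
decoded again after parsing. The cost is that the element content is no
longer human-readable in the serialized XML.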

See here: http://www.w3.org/TR/REC-xml/#charsets
(Section 2.2 Characters - Character range)
Character Range

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
/* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
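
The Char production above can be turned into a small check. This is only a
sketch: the class XmlChars and its methods are hypothetical names, not part
of dom4j, and dropping the offending characters is just one possible policy,
assuming the data can live without them.

```java
// Hypothetical helper (not part of dom4j) that mirrors the XML 1.0
// Char production from section 2.2 of the specification.
final class XmlChars {

    private XmlChars() {}

    // Returns true if the code point matches the XML 1.0 Char production.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every code point that is not a
    // valid XML 1.0 Char removed (assumed policy: drop, do not replace).
    static String stripInvalid(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            int cp = in.codePointAt(i);
            if (isXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(isXmlChar(8));   // backspace: false
        System.out.println(isXmlChar(9));   // tab: true
        System.out.println(stripInvalid("James Strachan" + (char) 8));
    }
}
```

Calling stripInvalid(...) on the text before passing it to addText(...)
would keep the backspace out of the document in the first place.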

Cheers,

Richard

On 10.01.2007 at 08:08, Chris Lai /EEL/IT wrote:

> hi,
>
> I am having a problem parsing XML that contains a backspace (0x0008) char.
>
> (The tab char (0x0009) is fine)
>
> It turns out that dom4j cannot parse XML containing a backspace char, even
> when the XML was generated by dom4j itself.
>
> To demo the problem, here is the code section:
>
> import org.dom4j.Document;
> import org.dom4j.DocumentException;
> import org.dom4j.DocumentHelper;
> import org.dom4j.Element;
>
> public class Foo {
>
>     public static void main(String args[] ) {
>         Document document = DocumentHelper.createDocument();
>         Element root = document.addElement( "root" );
>
>         Element author1 = root.addElement( "author" )
>             .addText( "James Strachan" + (new Character((char) 8)).toString() );
>
>         String text = document.asXML();
>         System.out.println(text);
>
>         try
>         {
>             DocumentHelper.parseText(text);
>         }
>         catch (DocumentException e)
>         {
>             e.printStackTrace();              //<----- exception occurs
>         }
>
>     }
> }
>
> The following is the output:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <root><author>James Strachan&#8;</author></root>
> org.dom4j.DocumentException: Error on line 2 of document  : Character reference "&#8" is an invalid XML character. Nested exception: Character reference "&#8" is an invalid XML character.
>       at org.dom4j.io.SAXReader.read(SAXReader.java:482)
>       at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
>       at Foo.main(Foo.java:24)
> Nested exception:
> org.xml.sax.SAXParseException: Character reference "&#8" is an invalid XML character.
>       at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
>       at org.dom4j.io.SAXReader.read(SAXReader.java:465)
>       at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
>       at Foo.main(Foo.java:24)
> Nested exception: org.xml.sax.SAXParseException: Character reference "&#8" is an invalid XML character.
>       at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
>       at org.dom4j.io.SAXReader.read(SAXReader.java:465)
>       at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
>       at Foo.main(Foo.java:24)
>
> Does anyone know how to solve it? I use dom4j-1.6.1.jar.
>
> Regards,
>
> Chris Lai
>
> 29597369
> GET 6303
>
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys - and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> dom4j-user mailing list
> dom4j-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dom4j-user
