[ 
https://issues.apache.org/jira/browse/DIGESTER-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579049#action_12579049
 ] 

Simon Kitching commented on DIGESTER-120:
-----------------------------------------

Ok, using entities I was able to reproduce this pretty quickly. That also makes 
sense: the parser already has the entity cached as a string, so of course it 
makes a separate call to the characters method for it.
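
For anyone who wants to see the effect in isolation, here is a minimal sketch 
(not the committed patch). It assumes the SAX parser bundled with the JDK; the 
class and method names are made up for illustration. It records each chunk 
passed to characters() and then joins the chunks with and without the 
trim() check quoted in the report below. How the text is split into chunks is 
up to the parser, so the printed list may differ between implementations.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Digester.
public class CharactersChunkDemo {

    /** Returns every chunk the parser passes to characters(), in order. */
    public static List<String> chunks(String xml) {
        final List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                seen.add(new String(ch, start, length));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return seen;
    }

    /**
     * Concatenates the chunks. With dropWhitespaceOnly set, chunks that are
     * empty after trim() are skipped, mirroring the check in the report.
     */
    public static String join(List<String> chunks, boolean dropWhitespaceOnly) {
        StringBuilder sb = new StringBuilder();
        for (String s : chunks) {
            if (dropWhitespaceOnly && s.trim().length() == 0) {
                continue; // a whitespace-only chunk is lost here
            }
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> c = chunks("<body>&#65; &#65;</body>");
        System.out.println("chunks:           " + c);
        System.out.println("with trim check:  " + join(c, true));
        System.out.println("keeping all text: " + join(c, false));
    }
}
```

If the parser delivers the space between the two character references as its 
own chunk, the trim() variant loses it, while appending every chunk keeps the 
text intact regardless of how it was split.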

I have committed a patch to the trunk, and deployed a new 1.8.1-SNAPSHOT 
version to the Apache Maven snapshot repository. Could you please try it out 
and confirm it fixes the problem for you?

Thanks, Simon

> digesting xml content with NodeCreateRule swallows spaces.
> ----------------------------------------------------------
>
>                 Key: DIGESTER-120
>                 URL: https://issues.apache.org/jira/browse/DIGESTER-120
>             Project: Commons Digester
>          Issue Type: Bug
>    Affects Versions: 1.8
>         Environment: jdk 1.4.2_08, digester 1.8
>            Reporter: Nguyen Thanh Son Daniel
>         Attachments: digester-patch.txt, simple.xml
>
>
> I need to process an XML file that contains entities, e.g.:
> <?xml version="1.0" encoding="UTF-8"?>
> <top>
> <body>&#65; &#65;</body>
> </top>
> I'm using Digester as follows:
> Digester digester = new Digester ();
> digester.addRule ("top", new ObjectCreateRule (MyContent.class));
> digester.addRule ("top/body", new NodeCreateRule ());
> digester.addSetNext ("top/body", "setBody");
> then
> ...
> digester.parse (file);
> The MyContent class transforms the node into text as follows:
> public class MyContent
> {
>  public void setBody (Element node)
>  {
>   String content = serializeNode (node);
>   System.out.println (content);
>  }
>  ...
> }
> In this case the content displayed is: <body>AA</body>
> If the body were encoded in the XML file as <top><body>A A</body></top>, the 
> content would be displayed correctly as: 
> <body>A A</body>
> Looking at the NodeCreateRule.NodeBuilder.characters() implementation, the 
> following code causes the problem: 
> String str = new String(ch, start, length);
> if (str.trim().length() > 0) { 
>  top.appendChild(doc.createTextNode(str));
> }
> When entities are used, the characters() method is called three times, with 
> 'A', ' ' and 'A', so the whitespace-only ' ' chunk fails the trim() check and 
> is dropped. In the second case, it is called once with 'A A'.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
