[ https://issues.apache.org/jira/browse/DIGESTER-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579049#action_12579049 ]
Simon Kitching commented on DIGESTER-120:
-----------------------------------------

Ok, using entities I was able to duplicate this pretty quickly. That also makes sense; the parser has the entity already cached as a string, so of course it makes a separate call to the characters method for it.

I have committed a patch to the trunk, and deployed a new 1.8.1-SNAPSHOT version to the Apache Maven snapshot repository. Could you please try it out and confirm it fixes the problem for you?

Thanks,
Simon

> digesting xml content with NodeCreateRule swallows spaces.
> ----------------------------------------------------------
>
>                 Key: DIGESTER-120
>                 URL: https://issues.apache.org/jira/browse/DIGESTER-120
>             Project: Commons Digester
>          Issue Type: Bug
>    Affects Versions: 1.8
>         Environment: jdk 1.4.2_08, digester 1.8
>            Reporter: Nguyen Thanh Son Daniel
>         Attachments: digester-patch.txt, simple.xml
>
>
> i need to process an xml file that contains entities: ie:
> <?xml version="1.0" encoding="UTF-8"?>
> <top>
>   <body>A A</body>
> </top>
> i'm using digester as follows:
>     Digester digester = new Digester ();
>     digester.addRule ("top", new ObjectCreateRule (MyContent.class));
>     digester.addRule ("top/body", new NodeCreateRule ());
>     digester.addSetNext ("top/body", "setBody");
> then
>     ...
>     digester.parse (file);
> MyContent class transforms the node into text as follows:
>     public class MyContent
>     {
>         public void setBody (Element node)
>         {
>             String content = serializeNode (node);
>             System.out.println (content);
>         }
>         ...
>     }
> the content displayed is in this case: <body>AA</body>
> if the body was encoded in the xml file as: <top><body>A A</body></top>, the
> content would then be correctly displayed as:
> <body>A A</body>
> looking at the NodeCreateRule.NodeBuilder.characters () implementation, the
> following code generates the problem:
>     String str = new String(ch, start, length);
>     if (str.trim().length() > 0) {
>         top.appendChild(doc.createTextNode(str));
> when entities are being used, the characters () method is called for 'A', ' '
> and 'A' in the first case. in the second case, it is called once with 'A A'.
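
The behaviour described in the report can be reproduced without a parser by feeding the reported chunk sequences straight into the quoted check. The sketch below is illustrative only: the class and method names (CharactersChunkDemo, perChunkCheck, bufferedCheck) are hypothetical and are not part of Digester, and it assumes the entity in the first example resolves to a single space, as the reported 'A', ' ', 'A' sequence suggests. The buffered variant is one possible way to avoid the problem; it is not necessarily what the committed patch does.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; it only imitates the check quoted in the report.
public class CharactersChunkDemo {

    /** Creates an empty <body> element in a fresh DOM document. */
    static Element newBody() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element body = doc.createElement("body");
            doc.appendChild(body);
            return body;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Applies the quoted check to every chunk, as if each were one characters() call. */
    static String perChunkCheck(String... chunks) {
        Element top = newBody();
        Document doc = top.getOwnerDocument();
        for (String chunk : chunks) {
            char[] ch = chunk.toCharArray();
            String str = new String(ch, 0, ch.length);
            // A chunk holding only whitespace trims to "" and is silently dropped.
            if (str.trim().length() > 0) {
                top.appendChild(doc.createTextNode(str));
            }
        }
        return top.getTextContent();
    }

    /** Assumed alternative: collect all chunks first, then apply the check once. */
    static String bufferedCheck(String... chunks) {
        Element top = newBody();
        Document doc = top.getOwnerDocument();
        StringBuilder pending = new StringBuilder();
        for (String chunk : chunks) {
            pending.append(chunk);
        }
        String str = pending.toString();
        // Only text that is blank as a whole is dropped; inner spaces survive.
        if (str.trim().length() > 0) {
            top.appendChild(doc.createTextNode(str));
        }
        return top.getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("[" + perChunkCheck("A", " ", "A") + "]"); // prints [AA]
        System.out.println("[" + perChunkCheck("A A") + "]");         // prints [A A]
        System.out.println("[" + bufferedCheck("A", " ", "A") + "]"); // prints [A A]
    }
}
```

The first call mirrors the entity case from the report and loses the space; the second mirrors the plain-text case and keeps it. Deferring the whitespace test until all chunks for the element have arrived makes the result independent of how the parser splits the text.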