[ https://issues.apache.org/jira/browse/DIGESTER-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579043#action_12579043 ]
Nguyen Thanh Son Daniel commented on DIGESTER-120:
--------------------------------------------------

Simon,

1- The parser being used is identified by the following Maven artifact:
     groupId: xerces
     artifactId: xercesImpl
     version: 2.6.2
   Stepping through the code in a debugger also confirms that this is the parser being used.

2- I am not aware of where I should be getting the patch. Can you let me know how I should proceed to access your patch?

> digesting xml content with NodeCreateRule swallows spaces.
> ----------------------------------------------------------
>
>                 Key: DIGESTER-120
>                 URL: https://issues.apache.org/jira/browse/DIGESTER-120
>             Project: Commons Digester
>          Issue Type: Bug
>    Affects Versions: 1.8
>         Environment: jdk 1.4.2_08, digester 1.8
>            Reporter: Nguyen Thanh Son Daniel
>         Attachments: digester-patch.txt
>
>
> i need to process an xml file that contains entities: ie:
> <?xml version="1.0" encoding="UTF-8"?>
> <top>
> <body>A A</body>
> </top>
> i'm using digester as follows:
> Digester digester = new Digester ();
> digester.addRule ("top", new ObjectCreateRule (MyContent.class));
> digester.addRule ("top/body", new NodeCreateRule ());
> digester.addSetNext ("top/body", "setBody");
> then
> ...
> digester.parse (file);
> MyContent class transforms the node into text as follows:
> public class MyContent
> {
>     public void setBody (Element node)
>     {
>         String content = serializeNode (node);
>         System.out.println (content);
>     }
>     ...
> }
> the content displayed is in this case: <body>AA</body>
> if the body was encoded in the xml file as: <top><body>A A</body></top>, the
> content would then be correctly displayed as:
> <body>A A</body>
> looking at the NodeCreateRule.NodeBuilder.characters () implementation, the
> following code generates the problem:
>     String str = new String(ch, start, length);
>     if (str.trim().length() > 0) {
>         top.appendChild(doc.createTextNode(str));
>     }
> when entities are being used; the characters () method is called for 'A', ' '
> and 'A' in the first case. in the second case, it is called once with 'A A'.
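
To make the reported behaviour easier to reproduce without a parser, here is a minimal sketch that feeds the same chunks ('A', ' ', 'A' versus 'A A') through the whitespace check quoted above. The class name CharactersChunkDemo and both method names are hypothetical and exist only for this illustration. The bufferedTrim variant is one possible way to avoid the problem, assuming the text can be collected before it is tested; it is not the content of digester-patch.txt and not the actual NodeBuilder code.

```java
final class CharactersChunkDemo {

    // Mirrors the check quoted from NodeBuilder.characters(): every chunk is
    // tested on its own, so a chunk that holds only whitespace is dropped.
    static String perChunkTrim(String[] chunks) {
        StringBuilder out = new StringBuilder();
        for (String chunk : chunks) {
            if (chunk.trim().length() > 0) {
                out.append(chunk);
            }
        }
        return out.toString();
    }

    // Hypothetical alternative (an assumption, not the attached patch):
    // concatenate all chunks first, then drop the text only when the whole
    // run is whitespace, e.g. indentation between elements.
    static String bufferedTrim(String[] chunks) {
        StringBuilder buffer = new StringBuilder();
        for (String chunk : chunks) {
            buffer.append(chunk);
        }
        String text = buffer.toString();
        return text.trim().length() > 0 ? text : "";
    }

    public static void main(String[] args) {
        String[] split = {"A", " ", "A"}; // chunks as reported for the entity case
        String[] whole = {"A A"};         // single call, as in the second case

        System.out.println("[" + perChunkTrim(split) + "]"); // [AA]
        System.out.println("[" + perChunkTrim(whole) + "]"); // [A A]
        System.out.println("[" + bufferedTrim(split) + "]"); // [A A]
    }
}
```

In a real ContentHandler the buffer would have to be flushed whenever an element starts or ends, since SAX allows a parser to deliver contiguous character data in any number of characters() calls.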