On Sat, May 1, 2010 at 07:33, Kasun Indrasiri <[email protected]> wrote:
> Hi,
>
> I guess this becomes even riskier in a scenario like this.
>
> XML string : "<a>a_lengthy_string</a>" -> omElem
>
> Once we parse this XML in non-coalescing mode and create an OM
> element (omElem) with this,
>
> - first child : contains the first portion of the 'a_lengthy_string' string
> - last child : contains the rest
>
> However, as Hiranya mentioned, 'omElem.getText()' will give us the correct
> value of the text content.
>
> Is this the acceptable behavior?

It's not the default behavior, but if someone explicitly configures
Axiom to switch off coalescing, then he has to live with the
consequences ;-)

> regards,
>
> Kasun
>
>
> On Fri, Apr 30, 2010 at 9:12 PM, Andreas Veithen
> <[email protected]> wrote:
>
>> Axiom always creates the nodes based on the events received from the
>> underlying parser. If javax.xml.stream.isCoalescing is set to false on
>> the parser, then by definition the parser may return large text nodes
>> in multiple chunks. The problem is that if
>> javax.xml.stream.isCoalescing is set to true, StAX doesn't report
>> CDATA sections in the document as CDATA events, but as CHARACTERS
>> events. It is however possible to configure Woodstox to report CDATA
>> sections without splitting text nodes into chunks. Note that even with
>> such a configuration, OMElement#getText should always be used to
>> extract the text content of an element (to cover the case where the
>> element contains a mix of text nodes and CDATA sections).
>>
>> Note that while coalescing is switched off by default at the StAX
>> level, Axiom overrides this so that by default coalescing is turned on
>> [1]. It is not surprising that there is code that implicitly relies on
>> this. Therefore, working with Axiom in non-coalescing mode is always a
>> risk.
>>
>> Andreas
>>
>> [1] http://people.apache.org/~veithen/axiom/userguide/ch04.html#d0e866
>>
>> On Fri, Apr 30, 2010 at 11:51, Kasun Indrasiri <[email protected]> wrote:
>> > Hi,
>> >
>> > When parsing XML in non-coalescing mode ("javax.xml.stream.isCoalescing",
>> > false) Axiom breaks down large text entries into multiple chunks. Therefore
>> > CDATA elements with lengthy texts get translated into multiple CDATA
>> > elements.
>> >
>> > thanks,
>> > --
>> > Kasun Indrasiri
>> > Senior Software Engineer,
>> > WSO2 Inc. - "Lean . Enterprise . Middleware" - http://www.wso2.com/
>> > Blog : http://kasunpanorama.blogspot.com/
>> >
>> >
>
>
>
> --
> Kasun Indrasiri
> Senior Software Engineer,
> WSO2 Inc. - "Lean . Enterprise . Middleware" - http://www.wso2.com/
> Blog : http://kasunpanorama.blogspot.com/
>
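
For illustration, here is a minimal sketch of the behavior discussed in this thread, using only the plain StAX API that ships with the JDK (not Axiom itself). The class name CoalescingDemo, the Result holder and the sample document are assumptions made up for this example. It parses the same document with javax.xml.stream.isCoalescing switched on and off, counts the text-like events the parser reports, and concatenates them, which is the kind of "collect every text and CDATA child" approach that makes OMElement#getText safe regardless of how the content was chunked.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class CoalescingDemo {

    /** Hypothetical holder: concatenated text plus the number of text-like events seen. */
    public static final class Result {
        public final String text;
        public final int chunks;

        Result(String text, int chunks) {
            this.text = text;
            this.chunks = chunks;
        }
    }

    /** Parses xml with the given coalescing setting and gathers all character data. */
    public static Result read(String xml, boolean coalescing) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // IS_COALESCING is the constant for "javax.xml.stream.isCoalescing".
        factory.setProperty(XMLInputFactory.IS_COALESCING, coalescing);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        int chunks = 0;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                // Treat plain text, CDATA sections and whitespace alike, so the
                // result does not depend on how the parser splits the content.
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA
                        || event == XMLStreamConstants.SPACE) {
                    text.append(reader.getText());
                    chunks++;
                }
            }
        } finally {
            reader.close();
        }
        return new Result(text.toString(), chunks);
    }

    public static void main(String[] args) throws XMLStreamException {
        // Sample input mixing text nodes and a CDATA section (made up for this sketch).
        String xml = "<a>foo<![CDATA[bar]]>baz</a>";
        for (boolean coalescing : new boolean[] {true, false}) {
            Result r = read(xml, coalescing);
            System.out.println("coalescing=" + coalescing
                    + " chunks=" + r.chunks + " text=" + r.text);
        }
    }
}
```

With coalescing on, the text and the CDATA section arrive as a single CHARACTERS event. With coalescing off, the number and type of events depend on the parser implementation, so code that reads only the first text child may see a fragment, while the concatenated text stays the same in both modes.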
