Hi Matthew,

We use Axiom as the underlying XML infoset. AFAIK it usually works well with special characters, so I'm not sure why it cannot handle this pound sign. Maybe Andreas can shed some light on the matter?

Actually, in this case the exception is thrown by the Woodstox parser, which sits at a lower layer than Axiom. So this could be a Woodstox issue.
However, if the underlying XML parser cannot handle this payload, then I don't think any of our built-in utils will be able to parse it without throwing an error. So your best option is to serialize this into a string buffer or a byte buffer and run the necessary replacement operations.

Anyway, let's wait and see what others have to say.

Thanks,
Hiranya

On Tue, Feb 14, 2012 at 7:55 PM, Matthew Clark <[email protected]> wrote:

> Sure - the service I'm looking at right now is very simple - the input just
> looks like this:
>
> <oxxml version="1.0" xmlns="http://xyz.com/xmlapi/">
>   <function>findOrderByReference</function>
>   <args>
>     <arg id="1">SomeRef123</arg>
>   </args>
> </oxxml>
>
> The response then looks like this (I've removed a large chunk of it):
>
> <oxxml version="1.0" xmlns="http://xyz.com/xmlapi/">
>   <response function="findOrderByReference" uuid="4444-4444-4444-4444">
>     <matches count="1">
>       <order id="1234567">
>         <description>Some description including a £ (pound) sign</description>
>       </order>
>     </matches>
>   </response>
> </oxxml>
>
> The pound sign causes StAX to throw an exception.. so I'd like to replace
> it as follows:
>
> <oxxml version="1.0" xmlns="http://xyz.com/xmlapi/">
>   <response function="findOrderByReference" txn-uuid="4444-4444-4444-4444">
>     <matches count="1">
>       <order id="1234567">
>         <description>Some description including a £ (ampersandhash163;) sign</description>
>       </order>
>     </matches>
>   </response>
> </oxxml>
>
>
> On 14 February 2012 13:16, Hiranya Jayathilaka <[email protected]> wrote:
>
> > On Tue, Feb 14, 2012 at 5:01 PM, Matthew Clark <[email protected]> wrote:
> >
> > > Hi thanks for that - for some reason I had overlooked the message
> > > builders..
> > >
> > > I have a rudimentary version of this working now but given the various
> > > classes available (XMLStreamReader, StAXbuilder and so on), what would be
> > > the most efficient way to do the replacement?
> > >
> >
> > If the input byte stream contains invalid characters then I don't think you
> > can use any of the above classes to process your inputs.
> >
> > >
> > > I have about 40 characters (such as the pound sign) that I would like to
> > > replace with entity references... For the first version, I simply converted
> > > to a string used StringUtils.replaceEach() but this is obviously not
> > > ideal..
> > >
> >
> > Can you please share an input message and a preprocessed message for us to
> > get a better understanding of your requirement?
> >
> > Thanks,
> > Hiranya
> >
> > >
> > > On 14 February 2012 04:32, Hiranya Jayathilaka <[email protected]> wrote:
> > >
> > > > Hi Mark,
> > > >
> > > > If you want to preprocess the responses then I'd recommend you to write a
> > > > custom message builder. You can register the custom message builder in the
> > > > axis2.xml file against the content type of your responses. There you will
> > > > be able to include any custom logic along with code for handling invalid
> > > > characters in the payload.
> > > >
> > > > Here are some useful resources I found on the web:
> > > >
> > > > http://charithwiki.blogspot.com/2010/11/how-to-write-axis2-message-builder.html
> > > >
> > > > http://wso2.org/library/articles/axis2-configuration-part2-learning-axis2-xml
> > > >
> > > > Thanks,
> > > > Hiranya
> > > >
> > > > On Tue, Feb 14, 2012 at 4:34 AM, Matthew Clark <[email protected]> wrote:
> > > >
> > > > > Hi all, I'd really appreciate some help with this one... it's hurting my
> > > > > brain!
> > > > >
> > > > > We have a legacy service that I would like to include in some of our ESB
> > > > > operations.
> > > > > The legacy service uses XML for both request and response payloads making
> > > > > it a very easy integration.
> > > > >
> > > > > I've created a very simple proxy service (see below).
> > > > >
> > > > > The problem I am having is that the legacy service can return some invalid
> > > > > characters and is causing the stax parser to blow up in such a way that I
> > > > > can't even handle it gracefully with a fault sequence. I'd really like to
> > > > > pre-process the responses (before they are parsed/built) as 99% of the time
> > > > > it is simply a case of replacing characters with numeric character
> > > > > references or character entity references..
> > > > >
> > > > > We are unable to modify the legacy service to remove these erroneous
> > > > > responses.
> > > > >
> > > > > Heres the proxy config (I said it was simple!!) followed by the Exception
> > > > > thrown... The exception causes the service to hang and the fault sequence
> > > > > is only entered after a 60 second timeout.
> > > > >
> > > > > <proxy xmlns="http://ws.apache.org/ns/synapse" name="legacyservice"
> > > > >        transports="http" startOnLoad="true">
> > > > >   <target endpoint="legacyXMLReceiver">
> > > > >     <inSequence>
> > > > >       <log level="full">
> > > > >         <property name="MESSAGE" value="InSequence" />
> > > > >       </log>
> > > > >     </inSequence>
> > > > >     <outSequence>
> > > > >       <log level="full">
> > > > >         <property name="MESSAGE" value="OutSequence" />
> > > > >       </log>
> > > > >       <send />
> > > > >     </outSequence>
> > > > >     <faultSequence>
> > > > >       <makefault version="soap11">
> > > > >         <code xmlns:soap11Env="http://schemas.xmlsoap.org/soap/envelope/" value="soap11Env:Server" />
> > > > >         <reason expression="get-property('ERROR_MESSAGE')" />
> > > > >         <role />
> > > > >       </makefault>
> > > > >       <log level="full">
> > > > >         <property name="MESSAGE" value="FaultSequence" />
> > > > >       </log>
> > > > >       <property name="HTTP_SC" value="500" scope="axis2" />
> > > > >       <send />
> > > > >     </faultSequence>
> > > > >   </target>
> > > > > </proxy>
> > > > >
> > > > > <endpoint xmlns="http://ws.apache.org/ns/synapse" name="legacyXMLReceiver">
> > > > >   <address uri="http://a.b.c.d:8080/legacyService/LegacyServlet" format="pox">
> > > > >   </address>
> > > > > </endpoint>
> > > > >
> > > > > ERROR {org.apache.axis2.transport.base.threads.NativeWorkerPool} -
> > > > > Uncaught exception
> > > > > {org.apache.axis2.transport.base.threads.NativeWorkerPool}
> > > > > *org.apache.axiom.om.OMException: com.ctc.wstx.exc.WstxIOException: Invalid
> > > > > UTF-8 middle byte 0x3c (at char #714, byte #127)*
> > > > > at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:296)
> > > > > at org.apache.axiom.om.impl.llom.OMElementImpl.buildNext(OMElementImpl.java:653)
> > > > > at org.apache.axiom.om.impl.llom.OMNodeImpl.getNextOMSibling(OMNodeImpl.java:122)
> > > > > at org.apache.axiom.om.impl.llom.OMElementImpl.getNextOMSibling(OMElementImpl.java:343)
> > > > > at org.apache.axiom.om.impl.traverse.OMChildrenIterator.getNextNode(OMChildrenIterator.java:36)
> > > > > at org.apache.axiom.om.impl.traverse.OMAbstractIterator.hasNext(OMAbstractIterator.java:58)
> > > > > at org.apache.axiom.om.impl.util.OMSerializerUtil.serializeChildren(OMSerializerUtil.java:555)
> > > > > at org.apache.axiom.om.impl.llom.OMElementImpl.internalSerialize(OMElementImpl.java:875)
> > > > > at org.apache.axiom.om.impl.util.OMSerializerUtil.serializeChildren(OMSerializerUtil.java:556)
> > > > > at org.apache.axiom.om.impl.llom.OMElementImpl.internalSerialize(OMElementImpl.java:875)
> > > > > at org.apache.axiom.om.impl.util.OMSerializerUtil.serializeChildren(OMSerializerUtil.java:556)
> > > > > at org.apache.axiom.om.impl.llom.OMElementImpl.internalSerialize(OMElementImpl.java:875)
> > > > > at org.apache.axiom.soap.impl.llom.SOAPEnvelopeImpl.internalSerialize(SOAPEnvelopeImpl.java:230)
> > > > > at org.apache.axiom.om.impl.llom.OMSerializableImpl.serialize(OMSerializableImpl.java:125)
> > > > > at org.apache.axiom.om.impl.llom.OMSerializableImpl.serialize(OMSerializableImpl.java:113)
> > > > > at org.apache.axiom.om.impl.llom.OMElementImpl.toString(OMElementImpl.java:988)
> > > > > at java.lang.String.valueOf(String.java:2826)
> > > > > at java.lang.StringBuffer.append(StringBuffer.java:219)
> > > > > at org.apache.synapse.mediators.builtin.LogMediator.getFullLogMessage(LogMediator.java:184)
> > > > > at org.apache.synapse.mediators.builtin.LogMediator.getLogMessage(LogMediator.java:123)
> > > > > at org.apache.synapse.mediators.builtin.LogMediator.mediate(LogMediator.java:91)
> > > > > at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:60)
> > > > > at org.apache.synapse.mediators.base.SequenceMediator.mediate(SequenceMediator.java:114)
> > > > > at org.apache.synapse.core.axis2.Axis2SynapseEnvironment.injectMessage(Axis2SynapseEnvironment.java:229)
> > > > > at org.apache.synapse.core.axis2.SynapseCallbackReceiver.handleMessage(SynapseCallbackReceiver.java:370)
> > > > > at org.apache.synapse.core.axis2.SynapseCallbackReceiver.receive(SynapseCallbackReceiver.java:160)
> > > > > at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:181)
> > > > > at org.apache.synapse.transport.nhttp.ClientWorker.run(ClientWorker.java:275)
> > > > > at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(NativeWorkerPool.java:173)
> > > > > at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > > > > at java.lang.Thread.run(Thread.java:680)
> > > > > *Caused by: com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 middle byte
> > > > > 0x3c (at char #714, byte #127)*
> > > > > at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:708)
> > > > > at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
> > > > > at org.apache.axiom.util.stax.wrapper.XMLStreamReaderWrapper.next(XMLStreamReaderWrapper.java:225)
> > > > > at org.apache.axiom.util.stax.dialect.DisallowDoctypeDeclStreamReaderWrapper.next(DisallowDoctypeDeclStreamReaderWrapper.java:34)
> > > > > at org.apache.axiom.util.stax.wrapper.XMLStreamReaderWrapper.next(XMLStreamReaderWrapper.java:225)
> > > > > at org.apache.axiom.om.impl.builder.StAXOMBuilder.parserNext(StAXOMBuilder.java:681)
> > > > > at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:214)
> > > > > ... 31 more
> > > > > *Caused by: java.io.CharConversionException: Invalid UTF-8 middle byte 0x3c
> > > > > (at char #714, byte #127)*
> > > > > at com.ctc.wstx.io.UTF8Reader.reportInvalidOther(UTF8Reader.java:313)
> > > > > at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:204)
> > > > > at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
> > > > > at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
> > > > > at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
> > > > > at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1046)
> > > > > at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1053)
> > > > > at com.ctc.wstx.sr.StreamScanner.getNextInCurrAfterWS(StreamScanner.java:892)
> > > > > at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:2963)
> > > > > at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2936)
> > > > > at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2848)
> > > > > at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
> > > > >
> > > >
> > > > --
> > > > Hiranya Jayathilaka
> > > > Associate Technical Lead;
> > > > WSO2 Inc.; http://wso2.org
> > > > E-mail: [email protected]; Mobile: +94 77 633 3491
> > > > Blog: http://techfeast-hiranya.blogspot.com
> > > >
> >
> > --
> > Hiranya Jayathilaka
> > Associate Technical Lead;
> > WSO2 Inc.; http://wso2.org
> > E-mail: [email protected]; Mobile: +94 77 633 3491
> > Blog: http://techfeast-hiranya.blogspot.com
>

--
Hiranya Jayathilaka
Associate Technical Lead;
WSO2 Inc.; http://wso2.org
E-mail: [email protected]; Mobile: +94 77 633 3491
Blog: http://techfeast-hiranya.blogspot.com
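
P.S. In case it helps, here is a rough sketch of the "read into a buffer and replace" idea mentioned above. It is only an illustration under assumptions: I am guessing that the legacy service emits a single-byte encoding such as ISO-8859-1 (the "Invalid UTF-8 middle byte" error would be consistent with that, but nothing in this thread confirms it), and the class and method names (CharRefEscaper, escapeNonAscii, preprocess) are made up for the example. It decodes the raw bytes with the assumed charset and rewrites every non-ASCII character as a numeric character reference, so you would not need to maintain a table of 40 replacements. How you wire this into a custom message builder is not shown.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Synapse, Axis2 or Axiom.
public class CharRefEscaper {

    // Replaces every non-ASCII code point with a numeric character
    // reference (for example U+00A3 becomes "&#163;"). ASCII is left as-is.
    public static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // Decodes the raw payload using the charset the legacy service is
    // ASSUMED to use, escapes it, and returns pure ASCII bytes, which are
    // also valid UTF-8 for the downstream parser.
    public static byte[] preprocess(byte[] raw, Charset assumedSourceCharset) {
        String decoded = new String(raw, assumedSourceCharset);
        return escapeNonAscii(decoded).getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Simulated legacy response containing a pound sign as byte 0xA3.
        byte[] raw = "<description>a \u00A3 sign</description>"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] fixed = preprocess(raw, StandardCharsets.ISO_8859_1);
        // prints: <description>a &#163; sign</description>
        System.out.println(new String(fixed, StandardCharsets.US_ASCII));
    }
}
```

Note that this does nothing about characters that are illegal in XML 1.0 even when escaped (most control characters), and it assumes the payload carries no conflicting encoding declaration. If the real encoding turns out to be something else, only the charset argument needs to change.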
