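Axiom/Woodstox is of course able to handle special characters, provided that it is given the correct charset encoding for the message. To me this looks like the Content-Type header of the response doesn't declare the correct charset: the "Invalid UTF-8 middle byte" error shows that the parser is decoding the payload as UTF-8, so the service is probably sending it in some other encoding.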
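To illustrate the point about the charset, here is a minimal, self-contained sketch. It uses only the plain JDK, not Axiom or Woodstox; the class and method names are made up for the example, and ISO-8859-1 is only an assumed encoding for the legacy payload (nothing in the thread confirms it). It shows that the same bytes are rejected by a strict UTF-8 decoder but decode cleanly once the right charset is supplied:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Axiom, Woodstox or Synapse.
class CharsetMismatchDemo {

    // Decode strictly: malformed or unmappable input raises an exception
    // instead of being silently replaced.
    static String decodeStrict(byte[] bytes, Charset charset) throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // True if the bytes form a valid sequence in the given charset.
    static boolean isValid(byte[] bytes, Charset charset) {
        try {
            decodeStrict(bytes, charset);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // \u00A3 is the pound sign.
        String text = "<description>Price in \u00A3</description>";

        // Assumption: the legacy service emits ISO-8859-1 bytes.
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        // Decoding those bytes as UTF-8 fails, much like the parser does.
        System.out.println(isValid(latin1, StandardCharsets.UTF_8)); // false

        // Decoding with the charset that was actually used round-trips.
        System.out.println(decodeStrict(latin1, StandardCharsets.ISO_8859_1).equals(text)); // true
    }
}
```

If the legacy service cannot be changed to send the right charset in its Content-Type header, the same idea could be applied in a custom message builder: decode the stream with the charset the service really uses before handing it to the parser, rather than replacing characters one by one.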
Andreas

On Tue, Feb 14, 2012 at 17:41, Hiranya Jayathilaka <[email protected]> wrote:
> Hi Matthew,
>
> We use Axiom as the underlying XML infoset. AFAIK it usually works well
> with special characters. Not sure why it cannot handle this pound sign.
> Maybe Andreas can shed some light on the matter? Actually in this case the
> exception is thrown by the Woodstox parser which is at a layer lower than
> Axiom. So this could be a Woodstox issue.
>
> However if the underlying XML parser cannot handle this payload, then I
> don't think any of our built-in utils will be able to parse it without
> throwing an error. So your best option is to serialize this into a string
> buffer or a byte buffer and run the necessary replacement operations.
> Anyway let's wait and see what others have to say.
>
> Thanks,
> Hiranya
>
> On Tue, Feb 14, 2012 at 7:55 PM, Matthew Clark <[email protected]> wrote:
>
>> Sure - the service I'm looking at right now is very simple - the input just
>> looks like this:
>>
>> <oxxml version="1.0" xmlns="http://xyz.com/xmlapi/">
>>   <function>findOrderByReference</function>
>>   <args>
>>     <arg id="1">SomeRef123</arg>
>>   </args>
>> </oxxml>
>>
>> The response then looks like this (I've removed a large chunk of it):
>>
>> <oxxml version="1.0" xmlns="http://xyz.com/xmlapi/">
>>   <response function="findOrderByReference" uuid="4444-4444-4444-4444">
>>     <matches count="1">
>>       <order id="1234567">
>>         <description>Some description including a £ (pound) sign</description>
>>       </order>
>>     </matches>
>>   </response>
>> </oxxml>
>>
>> The pound sign causes StAX to throw an exception, so I'd like to replace
>> it as follows:
>>
>> <oxxml version="1.0" xmlns="http://xyz.com/xmlapi/">
>>   <response function="findOrderByReference" txn-uuid="4444-4444-4444-4444">
>>     <matches count="1">
>>       <order id="1234567">
>>         <description>Some description including a £ (ampersandhash163;) sign</description>
>>       </order>
>>     </matches>
>>   </response>
>> </oxxml>
>>
>>
>> On 14 February 2012 13:16, Hiranya Jayathilaka <[email protected]> wrote:
>>
>> > On Tue, Feb 14, 2012 at 5:01 PM, Matthew Clark <[email protected]> wrote:
>> >
>> > > Hi thanks for that - for some reason I had overlooked the message
>> > > builders..
>> > >
>> > > I have a rudimentary version of this working now but given the various
>> > > classes available (XMLStreamReader, StAXbuilder and so on), what would be
>> > > the most efficient way to do the replacement?
>> > >
>> >
>> > If the input byte stream contains invalid characters then I don't think you
>> > can use any of the above classes to process your inputs.
>> >
>> > >
>> > > I have about 40 characters (such as the pound sign) that I would like to
>> > > replace with entity references... For the first version, I simply converted
>> > > to a string and used StringUtils.replaceEach() but this is obviously not
>> > > ideal..
>> > >
>> >
>> > Can you please share an input message and a preprocessed message for us to
>> > get a better understanding of your requirement?
>> >
>> > Thanks,
>> > Hiranya
>> >
>> > >
>> > > On 14 February 2012 04:32, Hiranya Jayathilaka <[email protected]> wrote:
>> > >
>> > > > Hi Mark,
>> > > >
>> > > > If you want to preprocess the responses then I'd recommend you to write a
>> > > > custom message builder. You can register the custom message builder in the
>> > > > axis2.xml file against the content type of your responses. There you will
>> > > > be able to include any custom logic along with code for handling invalid
>> > > > characters in the payload.
>> > > >
>> > > > Here are some useful resources I found on the web:
>> > > >
>> > > > http://charithwiki.blogspot.com/2010/11/how-to-write-axis2-message-builder.html
>> > > >
>> > > > http://wso2.org/library/articles/axis2-configuration-part2-learning-axis2-xml
>> > > >
>> > > > Thanks,
>> > > > Hiranya
>> > > >
>> > > > On Tue, Feb 14, 2012 at 4:34 AM, Matthew Clark <[email protected]> wrote:
>> > > >
>> > > > > Hi all, I'd really appreciate some help with this one... it's hurting my
>> > > > > brain!
>> > > > >
>> > > > > We have a legacy service that I would like to include in some of our ESB
>> > > > > operations.
>> > > > > The legacy service uses XML for both request and response payloads, making
>> > > > > it a very easy integration.
>> > > > >
>> > > > > I've created a very simple proxy service (see below).
>> > > > >
>> > > > > The problem I am having is that the legacy service can return some invalid
>> > > > > characters and is causing the stax parser to blow up in such a way that I
>> > > > > can't even handle it gracefully with a fault sequence. I'd really like to
>> > > > > pre-process the responses (before they are parsed/built) as 99% of the time
>> > > > > it is simply a case of replacing characters with numeric character
>> > > > > references or character entity references..
>> > > > >
>> > > > > We are unable to modify the legacy service to remove these erroneous
>> > > > > responses.
>> > > > >
>> > > > > Here's the proxy config (I said it was simple!!) followed by the Exception
>> > > > > thrown... The exception causes the service to hang and the fault sequence
>> > > > > is only entered after a 60 second timeout.
>> > > > >
>> > > > > <proxy xmlns="http://ws.apache.org/ns/synapse" name="legacyservice" transports="http" startOnLoad="true">
>> > > > >   <target endpoint="legacyXMLReceiver">
>> > > > >     <inSequence>
>> > > > >       <log level="full">
>> > > > >         <property name="MESSAGE" value="InSequence" />
>> > > > >       </log>
>> > > > >     </inSequence>
>> > > > >     <outSequence>
>> > > > >       <log level="full">
>> > > > >         <property name="MESSAGE" value="OutSequence" />
>> > > > >       </log>
>> > > > >       <send />
>> > > > >     </outSequence>
>> > > > >     <faultSequence>
>> > > > >       <makefault version="soap11">
>> > > > >         <code xmlns:soap11Env="http://schemas.xmlsoap.org/soap/envelope/" value="soap11Env:Server" />
>> > > > >         <reason expression="get-property('ERROR_MESSAGE')" />
>> > > > >         <role />
>> > > > >       </makefault>
>> > > > >       <log level="full">
>> > > > >         <property name="MESSAGE" value="FaultSequence" />
>> > > > >       </log>
>> > > > >       <property name="HTTP_SC" value="500" scope="axis2" />
>> > > > >       <send />
>> > > > >     </faultSequence>
>> > > > >   </target>
>> > > > > </proxy>
>> > > > >
>> > > > > <endpoint xmlns="http://ws.apache.org/ns/synapse" name="legacyXMLReceiver">
>> > > > >   <address uri="http://a.b.c.d:8080/legacyService/LegacyServlet" format="pox" >
>> > > > >   </address>
>> > > > > </endpoint>
>> > > > >
>> > > > > ERROR {org.apache.axis2.transport.base.threads.NativeWorkerPool} - Uncaught exception {org.apache.axis2.transport.base.threads.NativeWorkerPool}
>> > > > > *org.apache.axiom.om.OMException: com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 middle byte 0x3c (at char #714, byte #127)*
>> > > > >   at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:296)
>> > > > >   at org.apache.axiom.om.impl.llom.OMElementImpl.buildNext(OMElementImpl.java:653)
>> > > > >   at org.apache.axiom.om.impl.llom.OMNodeImpl.getNextOMSibling(OMNodeImpl.java:122)
>> > > > >   at org.apache.axiom.om.impl.llom.OMElementImpl.getNextOMSibling(OMElementImpl.java:343)
>> > > > >   at org.apache.axiom.om.impl.traverse.OMChildrenIterator.getNextNode(OMChildrenIterator.java:36)
>> > > > >   at org.apache.axiom.om.impl.traverse.OMAbstractIterator.hasNext(OMAbstractIterator.java:58)
>> > > > >   at org.apache.axiom.om.impl.util.OMSerializerUtil.serializeChildren(OMSerializerUtil.java:555)
>> > > > >   at org.apache.axiom.om.impl.llom.OMElementImpl.internalSerialize(OMElementImpl.java:875)
>> > > > >   at org.apache.axiom.om.impl.util.OMSerializerUtil.serializeChildren(OMSerializerUtil.java:556)
>> > > > >   at org.apache.axiom.om.impl.llom.OMElementImpl.internalSerialize(OMElementImpl.java:875)
>> > > > >   at org.apache.axiom.om.impl.util.OMSerializerUtil.serializeChildren(OMSerializerUtil.java:556)
>> > > > >   at org.apache.axiom.om.impl.llom.OMElementImpl.internalSerialize(OMElementImpl.java:875)
>> > > > >   at org.apache.axiom.soap.impl.llom.SOAPEnvelopeImpl.internalSerialize(SOAPEnvelopeImpl.java:230)
>> > > > >   at org.apache.axiom.om.impl.llom.OMSerializableImpl.serialize(OMSerializableImpl.java:125)
>> > > > >   at org.apache.axiom.om.impl.llom.OMSerializableImpl.serialize(OMSerializableImpl.java:113)
>> > > > >   at org.apache.axiom.om.impl.llom.OMElementImpl.toString(OMElementImpl.java:988)
>> > > > >   at java.lang.String.valueOf(String.java:2826)
>> > > > >   at java.lang.StringBuffer.append(StringBuffer.java:219)
>> > > > >   at org.apache.synapse.mediators.builtin.LogMediator.getFullLogMessage(LogMediator.java:184)
>> > > > >   at org.apache.synapse.mediators.builtin.LogMediator.getLogMessage(LogMediator.java:123)
>> > > > >   at org.apache.synapse.mediators.builtin.LogMediator.mediate(LogMediator.java:91)
>> > > > >   at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:60)
>> > > > >   at org.apache.synapse.mediators.base.SequenceMediator.mediate(SequenceMediator.java:114)
>> > > > >   at org.apache.synapse.core.axis2.Axis2SynapseEnvironment.injectMessage(Axis2SynapseEnvironment.java:229)
>> > > > >   at org.apache.synapse.core.axis2.SynapseCallbackReceiver.handleMessage(SynapseCallbackReceiver.java:370)
>> > > > >   at org.apache.synapse.core.axis2.SynapseCallbackReceiver.receive(SynapseCallbackReceiver.java:160)
>> > > > >   at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:181)
>> > > > >   at org.apache.synapse.transport.nhttp.ClientWorker.run(ClientWorker.java:275)
>> > > > >   at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(NativeWorkerPool.java:173)
>> > > > >   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> > > > >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> > > > >   at java.lang.Thread.run(Thread.java:680)
>> > > > > *Caused by: com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 middle byte 0x3c (at char #714, byte #127)*
>> > > > >   at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:708)
>> > > > >   at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
>> > > > >   at org.apache.axiom.util.stax.wrapper.XMLStreamReaderWrapper.next(XMLStreamReaderWrapper.java:225)
>> > > > >   at org.apache.axiom.util.stax.dialect.DisallowDoctypeDeclStreamReaderWrapper.next(DisallowDoctypeDeclStreamReaderWrapper.java:34)
>> > > > >   at org.apache.axiom.util.stax.wrapper.XMLStreamReaderWrapper.next(XMLStreamReaderWrapper.java:225)
>> > > > >   at org.apache.axiom.om.impl.builder.StAXOMBuilder.parserNext(StAXOMBuilder.java:681)
>> > > > >   at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:214)
>> > > > >   ... 31 more
>> > > > > *Caused by: java.io.CharConversionException: Invalid UTF-8 middle byte 0x3c (at char #714, byte #127)*
>> > > > >   at com.ctc.wstx.io.UTF8Reader.reportInvalidOther(UTF8Reader.java:313)
>> > > > >   at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:204)
>> > > > >   at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
>> > > > >   at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
>> > > > >   at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
>> > > > >   at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1046)
>> > > > >   at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1053)
>> > > > >   at com.ctc.wstx.sr.StreamScanner.getNextInCurrAfterWS(StreamScanner.java:892)
>> > > > >   at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:2963)
>> > > > >   at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2936)
>> > > > >   at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2848)
>> > > > >   at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
>> > > > >
>> > > >
>> > > > --
>> > > > Hiranya Jayathilaka
>> > > > Associate Technical Lead;
>> > > > WSO2 Inc.; http://wso2.org
>> > > > E-mail: [email protected]; Mobile: +94 77 633 3491
>> > > > Blog: http://techfeast-hiranya.blogspot.com
