Re: Sorting overlapping annotation of same type using UIMAFIT
Thank you, Marshall.

What if they are of the same type? My workaround was to add a feature that stores an integer, which I use to sort the annotations. It is not a good approach, because the user has to remember to sort the annotations before using them.

Thank you,
William

2016-11-21 20:10 GMT-02:00 Marshall Schor:

> The select form you're using iterates using UIMA's built-in Annotation index.
> This index is sorting the annotations based on 3 criteria:
>
> 1) the begin (ascending order)
> 2) the end (descending order)
> 3) the type priority
>
> You can use the 3rd criterion to set a preference ordering among two annotations
> of different types, which have the same begin / end.
> You specify the type priorities as part of Analysis Engine metadata, see
> http://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.aes.primitive
>
> -Marshall
>
> On 11/20/2016 9:52 PM, William Colen wrote:
> > Hi,
> >
> > In Portuguese we have contractions, which are words composed of, for
> > example, a preposition + article, pronoun or an adverb.
> >
> > Example:
> >
> > Nós acreditávamos nele. (We believed him.)
> >
> > Where "nele" can be divided into "em" + "ele". (in + him)
> >
> > To properly analyze this, I created two token annotations with the same
> > begin and end, but the first I associated with the POS tag preposition, and
> > the second pronoun.
> >
> > This is especially important when we are doing chunking, because the first
> > token will be part of a prepositional phrase, while the second of a nominal
> > phrase.
> >
> > How can I guarantee that when I call UIMAFit JCasUtil.select I will get the
> > tokens ordered, first the preposition, second the pronoun?
> >
> > Thank you,
> > William
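The ordering discussed in this message can be modeled in plain Java. What follows is a minimal sketch, not UIMA or uimaFIT code: the class `AnnotationOrderSketch`, the type `Tok`, and its `order` field are hypothetical stand-ins for a token annotation and for the integer feature used in the workaround. It sorts by begin (ascending), then end (descending), as the built-in annotation index does, and uses the integer only as a tie-breaker, since type priority cannot separate two annotations of the same type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java model of the ordering discussed in the thread; no UIMA classes are used.
class AnnotationOrderSketch {

    // Hypothetical stand-in for a token annotation. "order" mirrors the
    // integer feature from the workaround; it is not a UIMA built-in.
    static final class Tok {
        final int begin;
        final int end;
        final String pos;
        final int order;

        Tok(int begin, int end, String pos, int order) {
            this.begin = begin;
            this.end = end;
            this.pos = pos;
            this.order = order;
        }
    }

    // begin ascending, end descending, then the explicit order value
    // as tie-breaker for annotations with identical offsets.
    static final Comparator<Tok> ORDER =
        Comparator.comparingInt((Tok t) -> t.begin)
            .thenComparing(Comparator.comparingInt((Tok t) -> t.end).reversed())
            .thenComparingInt(t -> t.order);

    // Returns a sorted copy and leaves the input list untouched.
    static List<Tok> sorted(List<Tok> tokens) {
        List<Tok> copy = new ArrayList<>(tokens);
        copy.sort(ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // "Nós acreditávamos nele." : the contraction "nele" covers offsets 18..22.
        List<Tok> tokens = Arrays.asList(
            new Tok(18, 22, "pronoun", 1),      // "ele"
            new Tok(18, 22, "preposition", 0),  // "em"
            new Tok(4, 17, "verb", 0),
            new Tok(0, 3, "pronoun", 0));
        for (Tok t : sorted(tokens)) {
            System.out.println(t.begin + "-" + t.end + " " + t.pos);
        }
    }
}
```

In real code the same idea could be wrapped in one helper that copies the collection returned by `JCasUtil.select` into a list and sorts it, so that callers do not have to remember the extra sorting step; that wrapper is an assumption about how one might package the workaround, not something uimaFIT provides.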
Re: Sorting overlapping annotation of same type using UIMAFIT
The select form you're using iterates using UIMA's built-in Annotation index.
This index sorts the annotations based on 3 criteria:

1) the begin (ascending order)
2) the end (descending order)
3) the type priority

You can use the 3rd criterion to set a preference ordering among two annotations of different types, which have the same begin / end.
You specify the type priorities as part of Analysis Engine metadata, see
http://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.aes.primitive

-Marshall

On 11/20/2016 9:52 PM, William Colen wrote:
> Hi,
>
> In Portuguese we have contractions, which are words composed of, for
> example, a preposition + article, pronoun or an adverb.
>
> Example:
>
> Nós acreditávamos nele. (We believed him.)
>
> Where "nele" can be divided into "em" + "ele". (in + him)
>
> To properly analyze this, I created two token annotations with the same
> begin and end, but the first I associated with the POS tag preposition, and
> the second pronoun.
>
> This is especially important when we are doing chunking, because the first
> token will be part of a prepositional phrase, while the second of a nominal
> phrase.
>
> How can I guarantee that when I call UIMAFit JCasUtil.select I will get the
> tokens ordered, first the preposition, second the pronoun?
>
> Thank you,
> William
Re: No service reply, after org.xml.sax.SAXParseException; Trying to serialize non-XML 1.0 character:
Nelson, a fix for this is part of JIRA UIMA-5189, which addresses error handling when a serializer throws an exception. I will post the UIMA-AS 2.9.0 release candidate tomorrow so you can test your use case. Watch for an email on the UIMA dev list.

Jerry

On Mon, Nov 21, 2016 at 4:17 PM, nelson rivera wrote:

> I tried to process an input CAS in an aggregate service deployed in
> UIMA-AS. The annotations produced by the annotators apparently contain
> an invalid character. After processing finishes, when the framework
> tries to send the reply, it fails with an org.xml.sax.SAXParseException
> while serializing the CAS. On the client side I get no reply at all: the
> associated listener is not notified of the error, and the client
> program keeps waiting.
>
> The error log of the aggregate service:
>
> 03:50:03.578 - 22: org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient:
> WARNING: Service: XDataFileExtractorAggregate Runtime Exception
> 03:50:03.579 - 22: org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient:
> WARNING:
> org.apache.uima.aae.error.AsynchAEException: org.xml.sax.SAXParseException; Trying to serialize non-XML 1.0 character: , 0x1 at offset 0 in string starting with
>     at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1258)
>     at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.sendReply(JmsOutputChannel.java:793)
>     at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.sendReplyToRemoteClient(AggregateAnalysisEngineController_impl.java:2166)
>     at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient(AggregateAnalysisEngineController_impl.java:2335)
>     at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.finalStep(AggregateAnalysisEngineController_impl.java:1855)
>     at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.executeFlowStep(AggregateAnalysisEngineController_impl.java:2482)
>     at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.process(AggregateAnalysisEngineController_impl.java:1264)
>     at org.apache.uima.aae.handler.HandlerBase.invokeProcess(HandlerBase.java:118)
>     at org.apache.uima.aae.handler.input.ProcessResponseHandler.cancelTimerAndProcess(ProcessResponseHandler.java:117)
>     at org.apache.uima.aae.handler.input.ProcessResponseHandler.handleProcessResponseWithCASReference(ProcessResponseHandler.java:485)
>     at org.apache.uima.aae.handler.input.ProcessResponseHandler.handle(ProcessResponseHandler.java:767)
>     at org.apache.uima.aae.handler.HandlerBase.delegate(HandlerBase.java:149)
>     at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handle(ProcessRequestHandler_impl.java:1085)
>     at org.apache.uima.aae.spi.transport.vm.UimaVmMessageListener.onMessage(UimaVmMessageListener.java:107)
>     at org.apache.uima.aae.spi.transport.vm.UimaVmMessageDispatcher$1.run(UimaVmMessageDispatcher.java:70)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at org.apache.uima.aae.UimaAsThreadFactory$1.run(UimaAsThreadFactory.java:132)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.xml.sax.SAXParseException; Trying to serialize non-XML 1.0 character: , 0x1 at offset 0 in string starting with
>     at org.apache.uima.util.XMLSerializer$CharacterValidatingContentHandler.checkForInvalidXmlChars(XMLSerializer.java:374)
>     at org.apache.uima.util.XMLSerializer$CharacterValidatingContentHandler.startElement(XMLSerializer.java:275)
>     at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.startElement(XmiCasSerializer.java:1197)
>     at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeFsOrLists(XmiCasSerializer.java:711)
>     at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeFs(XmiCasSerializer.java:697)
>     at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.encodeFS(CasSerializerSupport.java:)
>     at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.encodeQueued(CasSerializerSupport.java:1015)
>     at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeFeatureStructures(XmiCasSerializer.java:563)
>     at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.serialize(CasSerializerSupport.java:439)
>     at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:415)
>     at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:385)
>     at org.apache.uima.aae.UimaSerializer.serializeCasToXmi(UimaSerializer.java:145)
No service reply, after org.xml.sax.SAXParseException; Trying to serialize non-XML 1.0 character:
I tried to process an input CAS in an aggregate service deployed in UIMA-AS. The annotations produced by the annotators apparently contain an invalid character. After processing finishes, when the framework tries to send the reply, it fails with an org.xml.sax.SAXParseException while serializing the CAS. On the client side I get no reply at all: the associated listener is not notified of the error, and the client program keeps waiting.

The error log of the aggregate service:

03:50:03.578 - 22: org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient:
WARNING: Service: XDataFileExtractorAggregate Runtime Exception
03:50:03.579 - 22: org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient:
WARNING:
org.apache.uima.aae.error.AsynchAEException: org.xml.sax.SAXParseException; Trying to serialize non-XML 1.0 character: , 0x1 at offset 0 in string starting with
    at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1258)
    at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.sendReply(JmsOutputChannel.java:793)
    at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.sendReplyToRemoteClient(AggregateAnalysisEngineController_impl.java:2166)
    at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient(AggregateAnalysisEngineController_impl.java:2335)
    at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.finalStep(AggregateAnalysisEngineController_impl.java:1855)
    at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.executeFlowStep(AggregateAnalysisEngineController_impl.java:2482)
    at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.process(AggregateAnalysisEngineController_impl.java:1264)
    at org.apache.uima.aae.handler.HandlerBase.invokeProcess(HandlerBase.java:118)
    at org.apache.uima.aae.handler.input.ProcessResponseHandler.cancelTimerAndProcess(ProcessResponseHandler.java:117)
    at org.apache.uima.aae.handler.input.ProcessResponseHandler.handleProcessResponseWithCASReference(ProcessResponseHandler.java:485)
    at org.apache.uima.aae.handler.input.ProcessResponseHandler.handle(ProcessResponseHandler.java:767)
    at org.apache.uima.aae.handler.HandlerBase.delegate(HandlerBase.java:149)
    at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handle(ProcessRequestHandler_impl.java:1085)
    at org.apache.uima.aae.spi.transport.vm.UimaVmMessageListener.onMessage(UimaVmMessageListener.java:107)
    at org.apache.uima.aae.spi.transport.vm.UimaVmMessageDispatcher$1.run(UimaVmMessageDispatcher.java:70)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at org.apache.uima.aae.UimaAsThreadFactory$1.run(UimaAsThreadFactory.java:132)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.xml.sax.SAXParseException; Trying to serialize non-XML 1.0 character: , 0x1 at offset 0 in string starting with
    at org.apache.uima.util.XMLSerializer$CharacterValidatingContentHandler.checkForInvalidXmlChars(XMLSerializer.java:374)
    at org.apache.uima.util.XMLSerializer$CharacterValidatingContentHandler.startElement(XMLSerializer.java:275)
    at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.startElement(XmiCasSerializer.java:1197)
    at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeFsOrLists(XmiCasSerializer.java:711)
    at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeFs(XmiCasSerializer.java:697)
    at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.encodeFS(CasSerializerSupport.java:)
    at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.encodeQueued(CasSerializerSupport.java:1015)
    at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeFeatureStructures(XmiCasSerializer.java:563)
    at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.serialize(CasSerializerSupport.java:439)
    at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:415)
    at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:385)
    at org.apache.uima.aae.UimaSerializer.serializeCasToXmi(UimaSerializer.java:145)
    at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.serializeCAS(JmsOutputChannel.java:244)
    at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1243)
    ... 18 more
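The log above reports character 0x1, which is outside the range XML 1.0 allows, so the XMI serializer rejects it. One possible way to avoid the failure on the application side, independent of any framework fix, is to clean strings before an annotator stores them in the CAS. The following is a minimal sketch under that assumption; the class and method names (`XmlCharSketch`, `isXml10Char`, `replaceInvalidXml10Chars`) are hypothetical and not part of UIMA. The accepted ranges follow the `Char` production of the XML 1.0 specification.

```java
// Plain-Java helper; it does not depend on UIMA.
class XmlCharSketch {

    // True if the code point is allowed by the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replaces each disallowed code point with spaces of the same UTF-16
    // length, so the offsets of the remaining text do not shift.
    static String replaceInvalidXml10Chars(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            int width = Character.charCount(cp);
            if (isXml10Char(cp)) {
                sb.appendCodePoint(cp);
            } else {
                for (int k = 0; k < width; k++) {
                    sb.append(' ');
                }
            }
            i += width;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dirty = "\u0001token";
        System.out.println("[" + replaceInvalidXml10Chars(dirty) + "]");
    }
}
```

Replacing rather than deleting the character keeps the string length unchanged, which matters when annotation begin/end offsets refer to positions in that text.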
Re: TERMINATE Action with org.xml.sax.SAXParseException in deserializeCasFromXmi function
Nelson, I've created a JIRA for this bug:
https://issues.apache.org/jira/browse/UIMA-5189

This will be fixed soon and will be part of the next UIMA-AS release (2.9.0). Thanks for finding the bug.

Jerry

On Fri, Nov 18, 2016 at 3:39 PM, Jaroslaw Cwiklik wrote:

> Hi, looks like a bug. Will take a look on Monday.
> Thanks
> Jerry
>
> On Fri, Nov 18, 2016 at 11:12 AM, nelson rivera wrote:
>
>> I have an aggregate service deployed in UIMA-AS. When I send an input CAS
>> whose text apparently contains an invalid character, an error occurs
>> while deserializing the CAS and the framework stops the aggregate
>> service.
>>
>> This is the complete stack trace:
>>
>> 09:54:38.24 - 1: org.apache.uima.adapter.jms.activemq.SpringContainerDeployer.doStartListeners:
>> INFO: Controller: XTokenizerAggregate Trying to Start Listener on Endpoint: queue://XTokenizerAggregate Selector: Command=2000 OR Command=2002 Broker: tcp://localhost:61616
>> 09:54:38.193 - 1: org.apache.uima.adapter.jms.activemq.SpringContainerDeployer.doStartListeners:
>> INFO: Controller: XTokenizerAggregate Trying to Start Listener on Endpoint: queue://XTokenizerAggregate Selector: Command=2001 Broker: tcp://localhost:61616
>> 09:55:11.411 - 16: org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handleProcessRequestFromRemoteClient:
>> WARNING: Service: XTokenizerAggregate Runtime Exception
>> 09:55:11.411 - 16: org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handleProcessRequestFromRemoteClient:
>> WARNING:
>> org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 585; Character reference "
>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1239)
>>     at org.apache.uima.aae.UimaSerializer.deserializeCasFromXmi(UimaSerializer.java:187)
>>     at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.deserializeCASandRegisterWithCache(ProcessRequestHandler_impl.java:220)
>>     at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handleProcessRequestFromRemoteClient(ProcessRequestHandler_impl.java:531)
>>     at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handle(ProcessRequestHandler_impl.java:1062)
>>     at org.apache.uima.aae.handler.input.MetadataRequestHandler_impl.handle(MetadataRequestHandler_impl.java:78)
>>     at org.apache.uima.adapter.jms.activemq.JmsInputChannel.onMessage(JmsInputChannel.java:731)
>>     at org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:689)
>>     at org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:649)
>>     at org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:619)
>>     at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:307)
>>     at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:245)
>>     at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1144)
>>     at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1136)
>>     at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1033)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at org.apache.uima.aae.UimaAsThreadFactory$1.run(UimaAsThreadFactory.java:132)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> 09:55:11.412 - 16: org.apache.uima.aae.error.handler.ProcessCasErrorHandler.handleError:
>> WARNING: Service: XTokenizerAggregate Runtime Exception
>> 09:55:11.412 - 16: org.apache.uima.aae.error.handler.ProcessCasErrorHandler.handleError:
>> WARNING:
>> org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 585; Character reference "
>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1239)
>>     at org.apache.uima.aae.UimaSerializer.deserializeCasFromXmi(UimaSerializer.java:187)
>>     at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.deserializeCASandRegisterWithCache(ProcessRequestHandler_impl.java:220)
>>     at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handleProcessRequestFromRemoteClient(ProcessRequestHandler_