Re: Sorting overlapping annotation of same type using UIMAFIT

2016-11-21 Thread William Colen
Thank you, Marshall.
What if they are of the same type?
The workaround for me was to add a feature in which I store an integer that I
use to sort the annotations. It is not a good approach because the user
has to remember to sort the annotations before using them.
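
A minimal sketch of the workaround described above. It assumes a hypothetical integer feature called `order` on the token type. `Token` is a plain Java stand-in rather than a real UIMA annotation class, so the example runs without UIMA on the classpath. The comparator mirrors the first two criteria of the built-in annotation index (begin ascending, end descending) and then falls back to the explicit `order` value.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

class ContractionOrderSketch {

    // Hypothetical stand-in for a token annotation; not a real UIMA type.
    static final class Token {
        final int begin;
        final int end;
        final int order;   // hypothetical integer feature used only for sorting
        final String pos;  // POS tag label, e.g. "PREP" or "PRON"

        Token(int begin, int end, int order, String pos) {
            this.begin = begin;
            this.end = end;
            this.order = order;
            this.pos = pos;
        }
    }

    // begin ascending, end descending, then the explicit order feature.
    static final Comparator<Token> TOKEN_ORDER = (a, b) -> {
        if (a.begin != b.begin) {
            return Integer.compare(a.begin, b.begin);
        }
        if (a.end != b.end) {
            return Integer.compare(b.end, a.end);
        }
        return Integer.compare(a.order, b.order);
    };

    // Returns a sorted copy so callers cannot forget to sort before iterating.
    static List<Token> sorted(Collection<Token> tokens) {
        List<Token> copy = new ArrayList<>(tokens);
        copy.sort(TOKEN_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // Two tokens covering the contraction "nele" (offsets 18-22 in the example sentence).
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token(18, 22, 1, "PRON")); // "ele"
        tokens.add(new Token(18, 22, 0, "PREP")); // "em"
        for (Token t : sorted(tokens)) {
            System.out.println(t.pos);
        }
    }
}
```

Wrapping the sort in one helper such as `sorted` is one way to reduce the risk that a caller forgets to sort; with real UIMA types the same comparator would read the feature through the annotation's getter.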

Thank you
William

2016-11-21 20:10 GMT-02:00 Marshall Schor :

> The select form you're using iterates using UIMA's built-in Annotation
> index.
> This index is sorting the annotations based on 3 criteria:
>
> 1) the begin (ascending order)
>
> 2) the end (descending order)
>
> 3) the type priority
>
> You can use the 3rd criterion to set a preference ordering between two
> annotations of different types that have the same begin / end.
> You specify the type priorities as part of Analysis Engine metadata, see
> http://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.aes.primitive
>
> -Marshall
>
> On 11/20/2016 9:52 PM, William Colen wrote:
> > Hi,
> >
> > In Portuguese we have contractions, which are words composed of, for
> > example, a preposition + an article, a pronoun, or an adverb.
> >
> > Example:
> >
> > Nós acreditávamos nele. (We believed him.)
> >
> > Where "nele" can be divided into "em" + "ele". (in + him)
> >
> > To properly analyze this, I created two token annotations with the same
> > begin and end; the first I associated with the POS tag preposition, and
> > the second with pronoun.
> >
> > This is especially important when we are doing chunking, because the first
> > token will be part of a prepositional phrase, while the second will be part
> > of a nominal phrase.
> >
> > How can I guarantee that when I call uimaFIT JCasUtil.select I will get the
> > tokens ordered, first the preposition, second the pronoun?
> >
> > Thank you,
> > William
> >
>
>


No service reply, after org.xml.sax.SAXParseException; Trying to serialize non-XML 1.0 character:

2016-11-21 Thread nelson rivera
I tried to process an input CAS in an aggregate service deployed in
UIMA-AS. The annotations produced by the annotators apparently contain an
invalid character. After processing finishes, when the framework
tries to send the reply, it throws an org.xml.sax.SAXParseException while
serializing the CAS. On the client side I get no reply at all: the
associated listener is not notified of the error, and the client
program keeps waiting.

The error log of the aggregate service:

03:50:03.578 - 22:
org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient:
WARNING: Service: XDataFileExtractorAggregate Runtime Exception
03:50:03.579 - 22:
org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient:
WARNING:
org.apache.uima.aae.error.AsynchAEException:
org.xml.sax.SAXParseException; Trying to serialize non-XML 1.0
character: , 0x1 at offset 0 in string starting with
at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1258)
at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.sendReply(JmsOutputChannel.java:793)
at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.sendReplyToRemoteClient(AggregateAnalysisEngineController_impl.java:2166)
at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient(AggregateAnalysisEngineController_impl.java:2335)
at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.finalStep(AggregateAnalysisEngineController_impl.java:1855)
at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.executeFlowStep(AggregateAnalysisEngineController_impl.java:2482)
at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.process(AggregateAnalysisEngineController_impl.java:1264)
at org.apache.uima.aae.handler.HandlerBase.invokeProcess(HandlerBase.java:118)
at org.apache.uima.aae.handler.input.ProcessResponseHandler.cancelTimerAndProcess(ProcessResponseHandler.java:117)
at org.apache.uima.aae.handler.input.ProcessResponseHandler.handleProcessResponseWithCASReference(ProcessResponseHandler.java:485)
at org.apache.uima.aae.handler.input.ProcessResponseHandler.handle(ProcessResponseHandler.java:767)
at org.apache.uima.aae.handler.HandlerBase.delegate(HandlerBase.java:149)
at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handle(ProcessRequestHandler_impl.java:1085)
at org.apache.uima.aae.spi.transport.vm.UimaVmMessageListener.onMessage(UimaVmMessageListener.java:107)
at org.apache.uima.aae.spi.transport.vm.UimaVmMessageDispatcher$1.run(UimaVmMessageDispatcher.java:70)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at org.apache.uima.aae.UimaAsThreadFactory$1.run(UimaAsThreadFactory.java:132)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.xml.sax.SAXParseException; Trying to serialize non-XML
1.0 character: , 0x1 at offset 0 in string starting with
at org.apache.uima.util.XMLSerializer$CharacterValidatingContentHandler.checkForInvalidXmlChars(XMLSerializer.java:374)
at org.apache.uima.util.XMLSerializer$CharacterValidatingContentHandler.startElement(XMLSerializer.java:275)
at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.startElement(XmiCasSerializer.java:1197)
at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeFsOrLists(XmiCasSerializer.java:711)
at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeFs(XmiCasSerializer.java:697)
at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.encodeFS(CasSerializerSupport.java:)
at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.encodeQueued(CasSerializerSupport.java:1015)
at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeFeatureStructures(XmiCasSerializer.java:563)
at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.serialize(CasSerializerSupport.java:439)
at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:415)
at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:385)
at org.apache.uima.aae.UimaSerializer.serializeCasToXmi(UimaSerializer.java:145)
at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.serializeCAS(JmsOutputChannel.java:244)
at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1243)
... 18 more
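
The exception above reports character 0x1, which lies outside the XML 1.0 character range, so the CAS cannot be serialized to XMI. One possible annotator-side workaround, sketched here as an assumption rather than as the fix adopted in UIMA-AS, is to replace such characters before storing a string in a feature. The class and method names are hypothetical. Replacing each invalid character with a single character, instead of deleting it, keeps the string length unchanged, so existing begin/end offsets still line up.

```java
class XmlCharSanitizerSketch {

    // Character ranges permitted by the XML 1.0 specification (production "Char").
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replaces every code point that is not valid in XML 1.0 with the given character.
    // Unpaired surrogates are reported by codePointAt as a single invalid code point.
    static String replaceInvalidXml10(String text, char replacement) {
        StringBuilder sb = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            if (isValidXml10(cp)) {
                sb.appendCodePoint(cp);
            } else {
                sb.append(replacement);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A string that starts with 0x1, like the one in the log above.
        String raw = (char) 0x1 + "abc";
        String clean = replaceInvalidXml10(raw, ' ');
        System.out.println(clean.length() == raw.length());
    }
}
```

This only hides the bad input on the service side; the missing error reply to the client is a separate problem in the framework.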


Re: TERMINATE Action with org.xml.sax.SAXParseException in deserializeCasFromXmi function

2016-11-21 Thread Jaroslaw Cwiklik
Nelson, I've created a JIRA for this bug:
https://issues.apache.org/jira/browse/UIMA-5189

This will be fixed soon and will be part of the next UIMA-AS release
(2.9.0).

Thanks for finding the bug.
Jerry

On Fri, Nov 18, 2016 at 3:39 PM, Jaroslaw Cwiklik  wrote:

> Hi, looks like a bug. Will take a look on Monday.
> Thanks
> Jerry
>
> On Fri, Nov 18, 2016 at 11:12 AM, nelson rivera 
> wrote:
>
>> I have an aggregate service deployed in UIMA-AS. When I send an input CAS
>> with text that apparently contains an invalid character, an
>> error occurs while deserializing the CAS and the framework stops the
>> aggregate service.
>>
>> this is the complete stacktrace:
>>
>> 09:54:38.24 - 1:
>> org.apache.uima.adapter.jms.activemq.SpringContainerDeployer.doStartListeners:
>> INFO: Controller: XTokenizerAggregate Trying to Start Listener on
>> Endpoint: queue://XTokenizerAggregate Selector: Command=2000 OR
>> Command=2002 Broker: tcp://localhost:61616
>> 09:54:38.193 - 1:
>> org.apache.uima.adapter.jms.activemq.SpringContainerDeployer.doStartListeners:
>> INFO: Controller: XTokenizerAggregate Trying to Start Listener on
>> Endpoint: queue://XTokenizerAggregate Selector: Command=2001 Broker:
>> tcp://localhost:61616
>> 09:55:11.411 - 16:
>> org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handleProcessRequestFromRemoteClient:
>> WARNING: Service: XTokenizerAggregate Runtime Exception
>> 09:55:11.411 - 16:
>> org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handleProcessRequestFromRemoteClient:
>> WARNING:
>> org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 585;
>> Character reference "