Thank you, Marshall. What if they are of the same type? The workaround for me was to add a feature where I can store an integer, which I use to sort the annotations. It is not a good approach, because the user has to remember to sort the annotations before using them.
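A minimal sketch of that workaround in plain Java, without UIMA on the classpath. The `Token` class, its `order` field, and the `sorted` helper are hypothetical stand-ins for the annotation type and the extra integer feature. The comparator mirrors the begin-ascending, end-descending order Marshall describes and then falls back to the stored integer; it is an illustration under those assumptions, not the UIMA API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java stand-in for the workaround; no UIMA types are used here.
final class TokenOrderSketch {

    // Hypothetical token: "order" plays the role of the extra integer feature.
    static final class Token {
        final int begin;
        final int end;
        final int order;
        final String pos;

        Token(int begin, int end, int order, String pos) {
            this.begin = begin;
            this.end = end;
            this.order = order;
            this.pos = pos;
        }
    }

    // begin ascending, end descending, then the explicit order value as tie-breaker.
    static final Comparator<Token> BY_SPAN_THEN_ORDER =
        Comparator.comparingInt((Token t) -> t.begin)
            .thenComparing(Comparator.comparingInt((Token t) -> t.end).reversed())
            .thenComparingInt(t -> t.order);

    // Returns a sorted copy; the input list is left untouched.
    static List<Token> sorted(List<Token> tokens) {
        List<Token> copy = new ArrayList<>(tokens);
        copy.sort(BY_SPAN_THEN_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // "nele" covers offsets 18..22 in "Nós acreditávamos nele."
        List<Token> tokens = Arrays.asList(
            new Token(18, 22, 1, "pronoun"),      // "ele"
            new Token(18, 22, 0, "preposition"),  // "em"
            new Token(4, 17, 0, "verb"));         // "acreditávamos"
        for (Token t : sorted(tokens)) {
            System.out.println(t.begin + "-" + t.end + " " + t.pos);
        }
    }
}
```

The caller still has to invoke `sorted` explicitly on whatever collection it gets back, which is exactly the drawback of this approach.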
Thank you
William

2016-11-21 20:10 GMT-02:00 Marshall Schor <m...@schor.com>:

> The select form you're using iterates using UIMA's built-in Annotation index.
> This index is sorting the annotations based on 3 criteria:
>
> 1) the begin (ascending order)
>
> 2) the end (descending order)
>
> 3) the type priority
>
> You can use the 3rd criterion to set a preference ordering among two annotations
> of different types, which have the same begin / end.
> You specify the type priorities as part of Analysis Engine metadata, see
> http://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.aes.primitive
>
> -Marshall
>
> On 11/20/2016 9:52 PM, William Colen wrote:
> > Hi,
> >
> > In Portuguese we have contractions, that are words composed by, for
> > example, a preposition + article, pronoun or an adverb.
> >
> > Example:
> >
> > Nós acreditávamos nele. (We believed him.)
> >
> > Where "nele" can be divided into "em" + "ele". (in + him)
> >
> > To properly analyze this, I created two token annotation with the same
> > begin and end, but the first I associated with the POS Tag preposition, and
> > the second pronoun.
> >
> > This is especially important when we are doing chunking, because the first
> > token will be part of a prepositional phrase, while the second of a nominal
> > phrase.
> >
> > How can I guarantee that when I call UIMAFit JCasUtil.select I will get the
> > tokens ordered, first the preposition, second the pronoun?
> >
> > Thank you,
> > William