Hi!

On Thu, Jan 14, 2016 at 02:09:07PM +0000, Sean Crist wrote:
> I have a few questions on the basic concepts of UIMA.  It’s fine if you tell 
> me to read the manuals, but I haven’t been able to find the answers there so 
> far, so a chapter reference would be a big help.
> 
> 
> 
> 1)    If Annotator A creates an annotation, is it OK for Annotator B to 
> modify the information in the annotations which A created?

  Yes, that's fine.  (I hope - maybe the rules change a little in a
distributed environment, and for some reason I always reindex the
annotations, but that might not be necessary anymore - I'll let someone
else fill in the details here.)

> 2)   I’ve read that an annotation can contain a reference to another 
> annotation, but I haven’t been able to find instructions or an example.
> 
> Possibly, I could generate the annotation class using JCasGen, and then 
> manually augment the auto-generated code to support references to other 
> annotation objects.  Is that a good way to do it?  Or is there some kind of 
> built-in support?

  Sure, there is built-in support.  The feature type does not have to
be a primitive UIMA type like

        <rangeTypeName>uima.cas.Integer</rangeTypeName>

it can also be a reference to another feature structure type like

        <rangeTypeName>uima.tcas.Annotation</rangeTypeName>

(reference to an unspecified type of annotation) or

        
        <rangeTypeName>de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token</rangeTypeName>

(reference to a particular type of annotation).  JCas then handles all
the resolution for you and the get...() function will return an instance
of the correct JCas class of the referenced annotation.
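
  To make this a bit more concrete, a type description for a parse tree
node might look roughly like the sketch below.  The type name
org.example.ParseNode and the feature names parent and children are made
up for illustration; uima.tcas.Annotation and uima.cas.FSArray are the
built-in UIMA types.

```xml
<!-- Hypothetical type: org.example.ParseNode and its feature names
     are invented for this sketch, not taken from a real type system. -->
<typeDescription>
  <name>org.example.ParseNode</name>
  <description>A node of a parse tree.</description>
  <supertypeName>uima.tcas.Annotation</supertypeName>
  <features>
    <!-- Reference to a single other annotation -->
    <featureDescription>
      <name>parent</name>
      <description>The parent node, unset for the root.</description>
      <rangeTypeName>org.example.ParseNode</rangeTypeName>
    </featureDescription>
    <!-- References to several annotations go through an FSArray -->
    <featureDescription>
      <name>children</name>
      <description>Child nodes or tokens.</description>
      <rangeTypeName>uima.cas.FSArray</rangeTypeName>
      <elementType>uima.tcas.Annotation</elementType>
    </featureDescription>
  </features>
</typeDescription>
```

  Running JCasGen on a descriptor like this should give you getParent()
and getChildren() accessors on the generated class, so there should be
no need to edit the generated code by hand.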

> 3)   Suppose I want a parser to build a parse tree over tokens.  A parse tree 
> consists of a hierarchy of nodes.
> 
> I could represent each node as an annotation.  Is that the most UIMA-like 
> solution?
> 
> The reason I hesitate is this.  If I were writing a non-UIMA solution from 
> scratch, I’d treat all of the nodes above the token level as abstract units, 
> and those abstract units wouldn’t deal in concrete information such as the 
> beginning and end of a character range.  I’d keep track of that only at the 
> token level.  I think that all UIMA annotations are required to keep track of 
> this information.
> 
> Also, it sounds the only way for an annotator to retrieve existing 
> annotations is to create an iterator and pull them out one by one.  I wish 
> there were a way to just get a reference to the root node of my parse tree, 
> so that I can simply step recursively through the tree (which assumes I’ve 
> arranged for each node to contain references to its children).

  Yes, you would represent each node as an annotation - or rather, each
edge as an annotation (typically annotating the "receiving end").
That's exactly how e.g. DKPro does it when wrapping the Stanford Parser.

  It's not really painful to work with the tree this way, see e.g.

        
        https://github.com/brmson/yodaqa/blob/master/src/main/java/cz/brmlab/yodaqa/analysis/question/FocusGenerator.java

for an example of code that applies a simple set of blackboard rules
to a parse tree to find a focus of a question sentence.

-- 
                                Petr Baudis
        If you have good ideas, good data and fast computers,
        you can do almost anything. -- Geoffrey Hinton
