Wrong behavior of the annotation index subiterator
--------------------------------------------------
Key: UIMA-1764
URL: https://issues.apache.org/jira/browse/UIMA-1764
Project: UIMA
Issue Type: Bug
Affects Versions: 2.3, 2.2.2
Reporter: Hannes Korte
Attachments: files.zip
I noticed a strange behavior of the annotation index subiterator in
uimaj 2.2.2 and 2.3.0.
Consider the sentence: 'Testing the UIMA-Framework'
with tokens: 'Testing' 'the' 'UIMA-Framework'
and the named entity: 'UIMA'
The type priorities list NamedEntity ahead of the Token type.
If I call the Token subiterator for the NamedEntity 'UIMA' with
strict=false, I get an empty result. According to the docs, with
strict=false a Token b is contained in a NamedEntity annot if:
annot.getBegin() <= b.getBegin() <= annot.getEnd()
This holds for the NamedEntity 'UIMA' and the Token 'UIMA-Framework',
but the subiterator is empty.
If I change the NamedEntity to ' UIMA' (including the preceding space),
then it works correctly, and the Token 'UIMA-Framework' is contained in
the subiterator.
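To make the expectation concrete, here is a minimal plain-Java sketch that
applies only the strict=false condition quoted above to the character offsets
of the example sentence. It does not use the UIMA API and does not model type
priorities or index ordering; the names SubiteratorCheck, Span and
nonStrictTokens are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the non-strict containment rule quoted in this report.
// SubiteratorCheck and Span are hypothetical names, not UIMA classes.
final class SubiteratorCheck {

    static final String TEXT = "Testing the UIMA-Framework";

    // Minimal stand-in for an annotation: begin (inclusive) and end (exclusive) offsets.
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        String coveredText() {
            return TEXT.substring(begin, end);
        }
    }

    // Token offsets for the example sentence.
    static final List<Span> TOKENS = Arrays.asList(
            new Span(0, 7),    // "Testing"
            new Span(8, 11),   // "the"
            new Span(12, 26)); // "UIMA-Framework"

    // Collects the tokens b with annot.begin <= b.begin <= annot.end.
    static List<String> nonStrictTokens(Span annot) {
        List<String> result = new ArrayList<>();
        for (Span b : TOKENS) {
            if (annot.begin <= b.begin && b.begin <= annot.end) {
                result.add(b.coveredText());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(nonStrictTokens(new Span(12, 16))); // named entity "UIMA"
        System.out.println(nonStrictTokens(new Span(11, 16))); // named entity " UIMA"
    }
}
```

Under the quoted condition alone, both named-entity spans select the Token
'UIMA-Framework', which is the result I would expect from the subiterator in
both cases.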
I attached a simple Java class with all needed files to demonstrate the
problem. Any ideas?