Mikhail Sogrin created UIMA-2434:
------------------------------------
Summary: Feature structure removal from sorted index is very slow
Key: UIMA-2434
URL: https://issues.apache.org/jira/browse/UIMA-2434
Project: UIMA
Issue Type: Improvement
Components: Core Java Framework
Affects Versions: 2.3.1SDK
Reporter: Mikhail Sogrin
Removal of feature structures from sorted indexes (e.g. the default index) is
very slow. The FSIntArrayIndex.remove() method performs two operations: a
linear search through the array until the given FS is found, followed by a
shift of all subsequent elements one position to the left.
If many annotations (millions or more) are deleted at once, this operation
becomes very slow, much slower than adding those annotations in the first
place. Removing N annotations appears to take O(N^2) time.
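
To make the cost concrete, here is a minimal sketch of the removal pattern described above. It is not the UIMA source: the class name NaiveSortedIntArray and the method signature are hypothetical, and a plain int array in natural order stands in for the index.

```java
// Hypothetical stand-in for the behaviour described above; not the UIMA source.
final class NaiveSortedIntArray {

    /** Removes one occurrence of value from a[0..size) and returns the new size. */
    static int remove(int[] a, int size, int value) {
        int pos = 0;
        // Linear search: visits pos + 1 elements before finding the value.
        while (pos < size && a[pos] != value) {
            pos++;
        }
        if (pos == size) {
            return size; // value not present, nothing to remove
        }
        // Hand-written shift of the tail one slot to the left:
        // touches the remaining size - 1 - pos elements.
        for (int i = pos; i < size - 1; i++) {
            a[i] = a[i + 1];
        }
        return size - 1;
    }
}
```

In this sketch a successful removal visits every live element once (search plus shift), so emptying an array of N elements one call at a time costs roughly N + (N-1) + ... + 1 = N(N+1)/2 element visits, which matches the quadratic behaviour reported here.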
First, the linear search can be replaced by the binary search that is already
implemented in the same class.
Second, the array copy can be done with a Java built-in method instead of a
custom loop.
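
The two changes could look roughly like the sketch below. The class SortedIntArrayRemoval and its signature are hypothetical, and it assumes the array is sorted by plain int value so that java.util.Arrays.binarySearch applies directly. The real index orders entries with an FS comparator, so its own binary search would be used instead, and if several entries compare equal the neighbours of the hit would still have to be scanned for the exact FS.

```java
import java.util.Arrays;

// Hypothetical sketch of the proposed change; not the UIMA source.
final class SortedIntArrayRemoval {

    /** Removes one occurrence of value from the sorted range a[0..size); returns the new size. */
    static int remove(int[] a, int size, int value) {
        // Binary search over the live range only: O(log N) comparisons.
        int pos = Arrays.binarySearch(a, 0, size, value);
        if (pos < 0) {
            return size; // value not present, nothing to remove
        }
        // Built-in copy; System.arraycopy handles overlapping source and destination.
        System.arraycopy(a, pos + 1, a, pos, size - pos - 1);
        return size - 1;
    }
}
```

Note that the shift still moves up to N elements per call, so this sketch reduces the search cost and the constant factor of the copy rather than the worst-case order of a single removal.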
Ideally, there would be a method for bulk removal of a collection of
annotations, for example a method to remove all annotations of a given type.
That would be the most efficient option.
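
One possible shape for such a bulk operation is a single-pass compaction, sketched below. The class BulkRemoval, the removeIf method, and the use of an IntPredicate to select entries are assumptions made for illustration; no such method exists in the class discussed here. In a real index the predicate might test the type of the FS behind each entry.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch of a bulk removal; not an existing UIMA API.
final class BulkRemoval {

    /**
     * Removes every entry of a[0..size) that matches shouldRemove, in one pass.
     * Surviving entries keep their relative order, so a sorted array stays sorted.
     * Returns the new size.
     */
    static int removeIf(int[] a, int size, IntPredicate shouldRemove) {
        int write = 0;
        for (int read = 0; read < size; read++) {
            if (!shouldRemove.test(a[read])) {
                a[write++] = a[read]; // keep this entry, compacting to the left
            }
        }
        return write;
    }
}
```

Each element is read once and written at most once, so removing any number of entries this way takes O(N) time in total instead of one shift per removed entry.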