Marshall Schor created UIMA-3603:
------------------------------------
Summary: IntArray code to sort and remove duplicates fails on edge
case of empty array
Key: UIMA-3603
URL: https://issues.apache.org/jira/browse/UIMA-3603
Project: UIMA
Issue Type: Bug
Components: Core Java Framework
Affects Versions: 2.5.0SDK
Reporter: Marshall Schor
Assignee: Marshall Schor
Fix For: 2.5.1SDK
As part of serialization, the code which collects the set of indexed FSs has a
portion which removes duplicates; this is run on an IntVector object which is
collecting indexed FSs for a single type (FSIndexRepositoryImpl, line 1531).
This object is cleared and reused in FSIndexRepositoryImpl, lines 1522 to
1532.
There's an edge case bug where this dedup routine fails for empty arrays: it
returns an array of length 1, where value[0] is whatever was there before the
last "removeAllElements" call (FSIndexRepositoryImpl, line 1522).
If the 0-th element in this working IntVector happened to be 0 (which is the
case on first use), then this will include the FS at address 0 (which is null,
and has a 0 type code, etc.), which causes XmiSerialization to fail.
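
The failure mode described above can be illustrated with a small stand-in class. This is only a sketch under assumptions: BuggyIntVector, its fields, and the body of its sortDedup loop are hypothetical and are not copied from the UIMA IntVector source. It shows one way a dedup loop that unconditionally keeps slot 0 can report length 1 for an empty vector, exposing either the initial 0 or a stale value left behind by removeAllElements.

```java
import java.util.Arrays;

// Hypothetical stand-in for the working IntVector; NOT the UIMA source.
class BuggyIntVector {
    private int[] array = new int[8]; // backing store, zero-filled on first use
    private int size = 0;             // number of valid elements

    void add(int v) {
        if (size == array.length) {
            array = Arrays.copyOf(array, size * 2);
        }
        array[size++] = v;
    }

    // Resets the logical size only; old values stay in the backing array.
    void removeAllElements() {
        size = 0;
    }

    int size() {
        return size;
    }

    int[] toArray() {
        return Arrays.copyOf(array, size);
    }

    // Sorts the valid range, then compacts runs of equal values.
    // Implicitly assumes size >= 1, which is the edge case bug.
    void sortDedup() {
        Arrays.sort(array, 0, size);
        int to = 1; // slot 0 is kept unconditionally
        for (int from = 1; from < size; from++) {
            if (array[from] != array[to - 1]) {
                array[to++] = array[from];
            }
        }
        size = to; // becomes 1 even when size was 0
    }

    public static void main(String[] args) {
        BuggyIntVector fresh = new BuggyIntVector();
        fresh.sortDedup();
        System.out.println(Arrays.toString(fresh.toArray())); // [0]

        BuggyIntVector reused = new BuggyIntVector();
        reused.add(42);
        reused.add(7);
        reused.add(42);
        reused.sortDedup();
        System.out.println(Arrays.toString(reused.toArray())); // [7, 42]

        reused.removeAllElements();
        reused.sortDedup();
        System.out.println(Arrays.toString(reused.toArray())); // [7]
    }
}
```

In the first case the phantom element is 0, which corresponds to the null FS at address 0 mentioned above; in the reused case it is a leftover address from the previous type.
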
The only time this error would manifest itself as an NPE is if a previously
used, but now 0-length, index for some type was the first index to be iterated
over.
Peter Klügl found this via an NPE while serializing a CAS, and it was order
dependent. I reproduced it by making a CAS, adding and then removing an
instance of an annotation (which is a sorted type), and then setting the text
for the CAS. In normal operation, the text is set first and the other steps
follow; the DocumentAnnotation instance is then created as part of setting the
text and is found first by the index collection code, ensuring that position 0
in the working IntVector is not 0.
The fix is to handle the edge case where the input array to sortDedup is of 0
length.
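
A minimal sketch of such a guard follows. It assumes a simplified vector class; FixedIntVector and its members are hypothetical names for illustration and do not reproduce the actual UIMA patch. The only point it makes is that sortDedup returns early when there are no elements, so the reported size stays 0.

```java
import java.util.Arrays;

// Hypothetical stand-in for the working IntVector; NOT the UIMA source.
class FixedIntVector {
    private int[] array = new int[8]; // backing store, reused across types
    private int size = 0;             // number of valid elements

    void add(int v) {
        if (size == array.length) {
            array = Arrays.copyOf(array, size * 2);
        }
        array[size++] = v;
    }

    // Resets the logical size only; old values stay in the backing array.
    void removeAllElements() {
        size = 0;
    }

    int size() {
        return size;
    }

    int[] toArray() {
        return Arrays.copyOf(array, size);
    }

    // Sorts the valid range and removes duplicates in place.
    void sortDedup() {
        if (size == 0) {
            return; // edge case: nothing to sort or dedup, size must stay 0
        }
        Arrays.sort(array, 0, size);
        int to = 1; // slot 0 is always kept, which is safe because size >= 1
        for (int from = 1; from < size; from++) {
            if (array[from] != array[to - 1]) {
                array[to++] = array[from];
            }
        }
        size = to;
    }

    public static void main(String[] args) {
        FixedIntVector v = new FixedIntVector();
        v.sortDedup();
        System.out.println(v.size()); // 0

        v.add(42);
        v.add(7);
        v.add(42);
        v.sortDedup();
        System.out.println(Arrays.toString(v.toArray())); // [7, 42]

        v.removeAllElements();
        v.sortDedup();
        System.out.println(v.size()); // 0
    }
}
```
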