Phew, good! Unsolicited garbage collection (CAS rewriting) would break at least 
one of our projects. We use the CAS address as an FS ID (as a substitute for 
the "id" feature that Georg asked for in another mail).

By the way, the desire for stable IDs seems to come up quite often lately…

-- Richard

On 06.05.2013 at 15:28, Marshall Schor <[email protected]> wrote:

> yes, that's right.
> 
> Currently, the only ways to reclaim space are serialization followed by
> deserialization, or CAS copying; it's like a stop-and-copy garbage collection.
> 
> -Marshall
> 
> On 5/6/2013 7:00 AM, Richard Eckart de Castilho (JIRA) wrote:
>>    [ 
>> https://issues.apache.org/jira/browse/UIMA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649647#comment-13649647
>>  ] 
>> 
>> Richard Eckart de Castilho commented on UIMA-2434:
>> --------------------------------------------------
>> 
>> About reclaiming space: this change is only affecting indexes, right? The 
>> addresses of FSes in the low-level CAS remain untouched?
>> 
>>> Feature structure removal from sorted index is very slow
>>> --------------------------------------------------------
>>> 
>>>                Key: UIMA-2434
>>>                URL: https://issues.apache.org/jira/browse/UIMA-2434
>>>            Project: UIMA
>>>         Issue Type: Improvement
>>>         Components: Core Java Framework
>>>   Affects Versions: 2.3.1SDK
>>>           Reporter: Mikhail Sogrin
>>>           Assignee: Marshall Schor
>>>            Fix For: 2.4.1SDK
>>> 
>>> 
>>> Removal of feature structures from sorted indexes (e.g. the default index) is 
>>> very slow. The FSIntArrayIndex.remove() method performs two operations: a 
>>> linear search of the array until the given FS is found, followed by shifting 
>>> all subsequent elements one position to the left.
>>> If many annotations (millions or more) are deleted at once, this 
>>> operation gets very slow, much slower than adding these annotations 
>>> in the first place. It seems to require O(N^2) time to remove N annotations.
>>> First, the linear search can be replaced by the binary search 
>>> method, which is already implemented in the same class.
>>> Second, the array copy can be done with Java's built-in method instead of a 
>>> custom loop.
>>> Ideally, a method for bulk removal of a collection of annotations would 
>>> be the most efficient, for example a method to remove all 
>>> annotations of a given type.
>> --
>> This message is automatically generated by JIRA.
>> If you think it was sent incorrectly, please contact your JIRA administrators
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>> 
> 
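
For reference, the fixes proposed in the issue description quoted above (binary search instead of linear search, a built-in array copy instead of a custom loop, and a bulk removal method) can be sketched roughly as follows. This is a standalone illustration on a plain int array sorted by value, not the actual FSIntArrayIndex code: the class SortedIntIndex and all of its method names are hypothetical, and a real UIMA sorted index orders entries by index comparator keys rather than by the raw int value.

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

/** Hypothetical stand-in for a sorted int-array index; NOT UIMA's FSIntArrayIndex. */
class SortedIntIndex {
    private int[] data;
    private int size;

    SortedIntIndex(int capacity) {
        data = new int[Math.max(1, capacity)];
    }

    /** Inserts a value at its sorted position, growing the array if needed. */
    void add(int value) {
        int pos = Arrays.binarySearch(data, 0, size, value);
        if (pos < 0) {
            pos = -(pos + 1); // convert "not found" result to insertion point
        }
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        // Shift the tail one slot to the right to make room.
        System.arraycopy(data, pos, data, pos + 1, size - pos);
        data[pos] = value;
        size++;
    }

    /** Removes one occurrence: O(log n) search plus a single block copy. */
    boolean remove(int value) {
        int pos = Arrays.binarySearch(data, 0, size, value);
        if (pos < 0) {
            return false; // value is not in the index
        }
        // Shift the tail one slot to the left over the removed element.
        System.arraycopy(data, pos + 1, data, pos, size - pos - 1);
        size--;
        return true;
    }

    /** Bulk removal: one pass over the array, however many elements are dropped. */
    int removeIf(IntPredicate drop) {
        int kept = 0;
        for (int i = 0; i < size; i++) {
            if (!drop.test(data[i])) {
                data[kept++] = data[i]; // compact survivors towards the front
            }
        }
        int removed = size - kept;
        size = kept;
        return removed;
    }

    int size() {
        return size;
    }

    int[] toArray() {
        return Arrays.copyOf(data, size);
    }
}
```

A single remove() still has to shift the tail of the array, so removing N elements one by one remains quadratic in the worst case; only the search gets cheaper. The single-pass removeIf() is what avoids the repeated shifting, which matches the remark in the issue that a bulk removal method would be the most efficient option.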
