[ 
https://issues.apache.org/jira/browse/OAK-5192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996398#comment-15996398
 ] 

Chetan Mehrotra commented on OAK-5192:
--------------------------------------

bq. break into the Lucene files management algorithm and that existing 
instances of Directory may still reference those files that we proactively 
delete

[~teofili] The proposed approach only reclaims files that have been deleted 
for a sufficient time (those that are deleted and older than the last valid 
checkpoint), and only once the indexes have been refreshed and moved to newer 
versions. So it should not cause any issues for a running setup.
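
The eligibility rule above can be sketched roughly as follows. This is only an illustration: the class, field, and method names are hypothetical and are not Oak or Lucene APIs, and it assumes that "older than the last valid checkpoint" means the deletion time precedes the checkpoint's creation time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from Oak or Lucene.
final class ReclaimSketch {

    /** An index file that has been marked as deleted (hypothetical type). */
    static final class DeletedFile {
        final String name;
        final long deletedAtMillis;

        DeletedFile(String name, long deletedAtMillis) {
            this.name = name;
            this.deletedAtMillis = deletedAtMillis;
        }
    }

    /**
     * Returns the names of deleted files that may be reclaimed.
     * Assumption: a file qualifies only if it was deleted before the last
     * valid checkpoint was created, and only once the index readers have
     * been refreshed to a newer index version.
     */
    static List<String> reclaimable(List<DeletedFile> deleted,
                                    long lastValidCheckpointMillis,
                                    boolean readersRefreshed) {
        List<String> result = new ArrayList<>();
        if (!readersRefreshed) {
            // Open readers may still reference the old files; reclaim nothing.
            return result;
        }
        for (DeletedFile f : deleted) {
            if (f.deletedAtMillis < lastValidCheckpointMillis) {
                result.add(f.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<DeletedFile> deleted = Arrays.asList(
                new DeletedFile("_0.cfs", 1000L),   // deleted before the checkpoint
                new DeletedFile("_1.cfs", 5000L));  // deleted after the checkpoint
        System.out.println(reclaimable(deleted, 3000L, true));  // prints [_0.cfs]
    }
}
```

With the sample data, only the file deleted before the checkpoint is returned, and nothing is returned while the readers have not yet been refreshed.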

> Reduce Lucene related growth of repository size
> -----------------------------------------------
>
>                 Key: OAK-5192
>                 URL: https://issues.apache.org/jira/browse/OAK-5192
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene, segment-tar
>            Reporter: Michael Dürig
>            Assignee: Tommaso Teofili
>              Labels: perfomance, scalability
>             Fix For: 1.8, 1.7.3
>
>         Attachments: added-bytes-zoom.png
>
>
> I observed Lucene indexing contributing to up to 99% of repository growth. 
> While the size of the index itself is well inside reasonable bounds, the 
> overall turnover of data being written and removed again can be as much as 
> 99%. 
> In the case of the TarMK this negatively impacts overall system performance 
> due to a fast-growing number of tar files / segments, bad locality of 
> reference, cache misses/thrashing when looking up segments, and vastly 
> prolonged garbage collection cycles.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
