[ https://issues.apache.org/jira/browse/OAK-5192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996379#comment-15996379 ]
Chetan Mehrotra commented on OAK-5192:
--------------------------------------

bq. We could use re-enable compression (which also affects write/read performance). This was disabled in OAK-1737.

Compression is enabled (i.e. default codec used) for Lucene indexes which do not involve fulltext index i.e. which purely index properties literally.

I think with OAK-2808 we should be able to mitigate this issue considerably i.e. reclaiming the garbage quickly. We can still explore ways to reduce generation of garbage but OAK-2808 would reduce the impact of the growth quite a bit.

> Reduce Lucene related growth of repository size
> -----------------------------------------------
>
>                 Key: OAK-5192
>                 URL: https://issues.apache.org/jira/browse/OAK-5192
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene, segment-tar
>            Reporter: Michael Dürig
>            Assignee: Tommaso Teofili
>              Labels: perfomance, scalability
>             Fix For: 1.8, 1.7.3
>
>         Attachments: added-bytes-zoom.png
>
>
> I observed Lucene indexing contributing to up to 99% of repository growth.
> While the size of the index itself is well inside reasonable bounds, the
> overall turnover of data being written and removed again can be as much as
> 99%.
> In the case of the TarMK this negatively impacts overall system performance
> due to fast growing number of tar files / segments, bad locality of
> reference, cache misses/thrashing when looking up segments and vastly
> prolonged garbage collection cycles.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)