[ https://issues.apache.org/jira/browse/OAK-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659948#comment-14659948 ]
Michael Dürig edited comment on OAK-3177 at 8/6/15 12:40 PM:
-------------------------------------------------------------

[^OAK-3177.png]

Attaching a graph showing average compaction times with and without the patch for 7 consecutive compaction cycles on a repository with 5 concurrent writer threads ({{SegmentCompactionIT}}). The graph shows the time of each compaction cycle normalised against the first cycle.

> Compaction slow on repository with continuous writes
> ----------------------------------------------------
>
>                 Key: OAK-3177
>                 URL: https://issues.apache.org/jira/browse/OAK-3177
>             Project: Jackrabbit Oak
>          Issue Type: Sub-task
>          Components: segmentmk
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: compaction, gc
>             Fix For: 1.3.5
>
>         Attachments: OAK-3177.patch, OAK-3177.png
>
>
> OAK-2734 introduced retry cycles and the option to force compaction when all
> cycles fail. However, OAK-2192 introduced a performance regression: each
> compaction cycle takes time on the order of the size of the repository
> instead of on the order of the number of remaining changes to compact. This
> is caused by comparing compacted with pre-compacted node states, which is
> necessary to avoid mixed segments (aka OAK-2192). To fix the performance
> regression I propose to pass the compactor an additional node state (the
> 'onto' state). The diff would then be calculated across the pre-compacted
> states, which performs on the order of the number of changes. The changes
> would then be applied to the 'onto' state, which is a compacted state, to
> avoid mixed segments.
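
The 'onto' idea from the description can be illustrated with a small, self-contained sketch. This is not the code from OAK-3177.patch and it does not use Oak's API: the {{Node}} type, {{compact}}, {{rebase}} and the {{visited}} counter are hypothetical names made up for this example. It assumes that two pre-compacted states share unchanged subtrees by identity, so the diff can skip them, while a compacted state shares nothing with its pre-compacted counterpart.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these types are Oak classes.
final class OntoCompactionSketch {

    /** Immutable tree node standing in for a node state. */
    static final class Node {
        final String value;
        final Map<String, Node> children;

        Node(String value, Map<String, Node> children) {
            this.value = value;
            this.children = Collections.unmodifiableMap(new LinkedHashMap<>(children));
        }
    }

    /** Number of nodes the diff had to look at (for the demo only). */
    static int visited;

    /** Full deep copy, standing in for the initial compaction of a state. */
    static Node compact(Node node) {
        Map<String, Node> copy = new LinkedHashMap<>();
        for (Map.Entry<String, Node> e : node.children.entrySet()) {
            copy.put(e.getKey(), compact(e.getValue()));
        }
        return new Node(node.value, copy);
    }

    /**
     * Diffs the pre-compacted states before -> after and replays the
     * changes on top of the compacted state onto. The result never
     * references nodes of before or after.
     */
    static Node rebase(Node before, Node after, Node onto) {
        if (before == after) {
            return onto; // shared subtree: unchanged, reuse the compacted copy
        }
        visited++;
        Map<String, Node> children = new LinkedHashMap<>();
        if (onto != null) {
            children.putAll(onto.children);
        }
        if (before != null) {
            for (String name : before.children.keySet()) {
                if (!after.children.containsKey(name)) {
                    children.remove(name); // child was deleted
                }
            }
        }
        for (Map.Entry<String, Node> e : after.children.entrySet()) {
            Node b = before == null ? null : before.children.get(e.getKey());
            Node merged = rebase(b, e.getValue(), children.get(e.getKey()));
            if (merged != null) {
                children.put(e.getKey(), merged);
            }
        }
        return new Node(after.value, children);
    }

    public static void main(String[] args) {
        Map<String, Node> kids = new LinkedHashMap<>();
        kids.put("a", new Node("1", Collections.emptyMap()));
        kids.put("b", new Node("1", Collections.emptyMap()));
        Node before = new Node("root", kids);

        // A concurrent write changes only "b"; "a" stays shared with before.
        kids.put("b", new Node("2", Collections.emptyMap()));
        Node after = new Node("root", kids);

        Node onto = compact(before); // compacted copy of the old head
        Node result = rebase(before, after, onto);
        System.out.println("nodes visited by diff: " + visited);
        System.out.println("b = " + result.children.get("b").value);
    }
}
```

In this sketch the work done by {{rebase}} depends on the number of changed nodes, because unchanged subtrees are skipped by identity. Diffing {{after}} directly against {{onto}} would find no shared subtrees and walk the whole tree, which corresponds to the regression described above.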