[ https://issues.apache.org/jira/browse/OAK-7162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352135#comment-16352135 ]
Michael Dürig commented on OAK-7162: ------------------------------------ [~dulceanu], I agree with your reasoning regarding removing the call to #{{refreshHead}}. Should there be future cases where the head is changed from outside the scheduler we should still be on the safe side as #{{refreshHead}} gets called every time the root node is retrieved. The only difference is delaying distribution of the content changes from that concurrent modification for a bit. A few minor remarks wrt. the patch: * {{LockBasedSchedulerTest#temporaryFolder}} is not used. * Please add a message to the the assertion asserting the commit tasks do not return {{null}}. The message should somehow capture that {{null}} means a commit got lost because of a race condition. * There is the {{EmptyHook}} class that you could use in {{#createCommit}}. * Please add a few words of Javadoc to {{#testSimulatedRaceOnRevisions}} explaining what the tests is guarding against. This issue's description might be a good start. Otherwise the test looks really good, great stuff! > Race condition on revisions head between compaction and scheduler could > result in skipped commit > ------------------------------------------------------------------------------------------------ > > Key: OAK-7162 > URL: https://issues.apache.org/jira/browse/OAK-7162 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segment-tar > Affects Versions: 1.8.0 > Reporter: Andrei Dulceanu > Assignee: Andrei Dulceanu > Priority: Blocker > Labels: scalability > Fix For: 1.9.0, 1.10, 1.8.2 > > Attachments: OAK-7162-02.patch, OAK-7162-03.patch, > OAK-7162-test.patch, OAK-7162.patch > > > There is a race condition on {{TarRevisions#head}} between a running > compaction trying to set the new head [0] and the scheduler doing the same > after executing a specific commit [1]. If the compaction thread is first, > then the head assignment in the scheduler will fail and not be re-attempted. > IMO, the simple if statement should be changed to a while loop in which the > head is refreshed and the commit is re-applied against the new head, before > attempting again to set a new head in {{TarRevisions}}. This is somehow > similar to what we previously had [2], but without the unneeded > optimistic/pessimistic strategies involving tokens. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L764 > [1] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/scheduler/LockBasedScheduler.java#L253 > [2] > https://github.com/apache/jackrabbit-oak/blob/1.6/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentNodeStore.java#L686 -- This message was sent by Atlassian JIRA (v7.6.3#76005)