Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1131. Avoid CHECK failure during compaction when an operation is slow to commit
......................................................................

KUDU-1131. Avoid CHECK failure during compaction when an operation is slow to commit

This fixes a CHECK failure in the following case:

- an operation with txid 5 is replicated, but not yet applying
- an operation with txid 6 is replicated, and starts to apply
- the tablet flushes. We wait for txid 6 to commit, since it was already
  applying when the flush snapshot was taken.
-- the resulting UNDO file now includes an UNDO at txid 6, so its
   max_timestamp is 6
- the tablet issues a compaction, before txid 5 has committed

This would trigger a CHECK because we see that the current snapshot has
txid 5 as uncommitted, but there is an UNDO delta with a max txid of 6.
With the code before this patch, that would have erroneously made the
compaction code think the UNDO file was actually a REDO file, since its
time range overlapped the current snapshot.

To fix this, I added a new call to DeltaTracker to specifically fetch
UNDO or REDO delta files, rather than relying on the time range and
snapshot to do so. The CHECK can now safely be removed.

The original commit that added this CHECK was
1a6b80a310a7de3519d78a2f5e90ecaae1cf405a. The commit message there
mentions that linked_list-test was modified at that point to act as a
regression test for the original bug. I ran that test 500 times and all
runs passed:

http://dist-test.cloudera.org//job?job_id=todd.1463187608.28912

I also ran mt-tablet-test "DoTestAllAtOnce" 4000 times. This test was
flaky (1 or 2 failures out of 4000) prior to this change and now passes:

http://dist-test.cloudera.org/job?job_id=todd.1463186765.27185

Change-Id: Ie16f2c6d190a322c107d60312d4c35d7aa409c43
Reviewed-on: http://gerrit.cloudera.org:8080/3073
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <[email protected]>
---
M src/kudu/tablet/compaction.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
3 files changed, 46 insertions(+), 57 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/3073
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie16f2c6d190a322c107d60312d4c35d7aa409c43
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
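
For readers following the scenario in the commit message, here is a minimal sketch of why inferring a delta file's kind from its time range can go wrong, and why selecting by an explicitly recorded kind does not. Everything in it is an assumption made for illustration: `DeltaType`, `DeltaFile`, `Snapshot`, `GuessTypeFromTimeRange`, `CollectByType` and `RunScenario` are hypothetical names, and the heuristic is only a simplified reading of "its time range overlapped the current snapshot". None of this is the actual DeltaTracker code or API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for illustration; not Kudu's real types.
enum class DeltaType { kRedo, kUndo };

struct DeltaFile {
  DeltaType type;    // kind recorded when the file was written
  int64_t max_txid;  // highest txid of any delta stored in the file
};

// Simplified snapshot: a txid counts as committed if it is below the
// watermark, or if it was explicitly recorded as committed above it.
struct Snapshot {
  int64_t all_committed_before;
  std::vector<int64_t> committed_above;

  bool IsCommitted(int64_t txid) const {
    if (txid < all_committed_before) return true;
    for (int64_t t : committed_above) {
      if (t == txid) return true;
    }
    return false;
  }
};

// Assumed form of the old approach: treat a file as UNDO only when its whole
// time range lies below the snapshot's watermark, otherwise call it REDO.
DeltaType GuessTypeFromTimeRange(const DeltaFile& f, const Snapshot& snap) {
  return f.max_txid < snap.all_committed_before ? DeltaType::kUndo
                                                : DeltaType::kRedo;
}

// Approach in the spirit of the fix: select files by their recorded kind,
// without consulting the snapshot at all.
std::vector<DeltaFile> CollectByType(const std::vector<DeltaFile>& files,
                                     DeltaType wanted) {
  std::vector<DeltaFile> out;
  for (const DeltaFile& f : files) {
    if (f.type == wanted) out.push_back(f);
  }
  return out;
}

// Replays the commit message's scenario: txid 5 is still uncommitted,
// txid 6 has committed, and the flush produced an UNDO file with max txid 6.
void RunScenario() {
  Snapshot snap{5, {6}};
  std::vector<DeltaFile> files = {{DeltaType::kUndo, 6}};

  assert(snap.IsCommitted(6) && !snap.IsCommitted(5));
  // The time-range guess mislabels the UNDO file as REDO.
  assert(GuessTypeFromTimeRange(files[0], snap) == DeltaType::kRedo);
  // Selecting by recorded kind returns it as UNDO regardless of the snapshot.
  assert(CollectByType(files, DeltaType::kUndo).size() == 1);
  assert(CollectByType(files, DeltaType::kRedo).empty());
}
```

Calling `RunScenario()` from a `main` function exercises the case above. The point of the sketch is only that the explicit lookup never depends on which transactions happen to be committed at compaction time.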
