Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1131. Avoid CHECK failure during compaction when an 
operation is slow to commit
......................................................................


KUDU-1131. Avoid CHECK failure during compaction when an operation is slow to 
commit

This fixes a CHECK failure in the following case:

- an operation with txid 5 is replicated, but not yet applying
- an operation with txid 6 is replicated, and starts to apply
- the tablet flushes. We wait for txid 6 to commit, since it was already
  applying when the flush snapshot was taken.
  - the resulting UNDO file now includes an UNDO at txid 6, so its
    max_timestamp is 6
- the tablet issues a compaction, before txid 5 has committed

This would trigger a CHECK because we see that the current snapshot has txid 5
as uncommitted, but there is an UNDO delta with a max txid of 6. With the code
before this patch, that would have erroneously made the compaction code think
the UNDO file was actually a REDO file, since its time range overlapped the
current snapshot.
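
For illustration, the misclassification can be reproduced with a toy model.
This is a minimal sketch, not the Kudu code: Snapshot, DeltaFile, and
GuessTypeFromTimeRange are hypothetical names, and the heuristic is a
simplified stand-in for the time-range check described above.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical, simplified snapshot: every txid below 'all_committed_before'
// is committed, plus any txids listed explicitly in 'committed_later'.
struct Snapshot {
  uint64_t all_committed_before;
  std::set<uint64_t> committed_later;

  // Conservative check: true if some txid <= ts might still be uncommitted.
  bool MayHaveUncommittedAtOrBefore(uint64_t ts) const {
    return ts >= all_committed_before;
  }
};

enum class DeltaType { UNDO, REDO };

// Hypothetical delta file descriptor: its real type and its timestamp range.
struct DeltaFile {
  DeltaType actual_type;
  uint64_t min_ts;
  uint64_t max_ts;
};

// Assumed shape of the old heuristic: a file whose time range lies entirely
// in the committed past is treated as UNDO, anything else as REDO.
DeltaType GuessTypeFromTimeRange(const DeltaFile& f, const Snapshot& snap) {
  assert(f.min_ts <= f.max_ts);
  return snap.MayHaveUncommittedAtOrBefore(f.max_ts) ? DeltaType::REDO
                                                     : DeltaType::UNDO;
}

// The UNDO file written by the flush: it contains only the UNDO at txid 6.
DeltaFile UndoFromFlush() { return {DeltaType::UNDO, 6, 6}; }

// The snapshot taken by the compaction: txid 6 committed, txid 5 still not.
Snapshot SnapshotAtCompaction() { return {5, {6}}; }
```

In this model, GuessTypeFromTimeRange(UndoFromFlush(), SnapshotAtCompaction())
yields REDO even though the file's actual type is UNDO, which is the mismatch
that tripped the CHECK.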

To fix this, I added a new call to DeltaTracker to specifically fetch UNDO or
REDO delta files, rather than relying on the time range and snapshot to do so.
The CHECK can now safely be removed.
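
The idea of selecting stores by explicit type can be sketched as follows. This
is an illustration under assumptions, not the actual DeltaTracker interface:
TrackerSketch, DeltaStore, and StoresOfType are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

enum class DeltaType { UNDO, REDO };

// Hypothetical delta store descriptor.
struct DeltaStore {
  DeltaType type;
  uint64_t max_ts;
};

// Hypothetical tracker that keeps UNDO and REDO stores in separate lists, so
// a caller asks for one kind by name instead of inferring it from timestamps.
class TrackerSketch {
 public:
  void AddUndo(uint64_t max_ts) { undos_.push_back({DeltaType::UNDO, max_ts}); }
  void AddRedo(uint64_t max_ts) { redos_.push_back({DeltaType::REDO, max_ts}); }

  // Returns the stores of the requested type. No snapshot is consulted.
  const std::vector<DeltaStore>& StoresOfType(DeltaType which) const {
    return which == DeltaType::UNDO ? undos_ : redos_;
  }

 private:
  std::vector<DeltaStore> undos_;
  std::vector<DeltaStore> redos_;
};

// State after the flush in the scenario above: one UNDO store with
// max_timestamp 6 and no REDO stores.
TrackerSketch TrackerAfterFlush() {
  TrackerSketch t;
  t.AddUndo(6);
  return t;
}
```

With this shape, the UNDO store at txid 6 is returned as an UNDO whether or not
txid 5 has committed, so the caller has no time-range inference left to
sanity-check.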

The original commit that added this CHECK was
1a6b80a310a7de3519d78a2f5e90ecaae1cf405a. The commit message there mentions
that linked_list-test was modified at that point to act as a regression test
for the original bug. I ran that test 500 times and all runs passed:
http://dist-test.cloudera.org//job?job_id=todd.1463187608.28912

I also ran mt-tablet-test "DoTestAllAtOnce" 4000 times. This test was flaky (1
or 2 failures out of 4000) prior to this change and now passes:
http://dist-test.cloudera.org/job?job_id=todd.1463186765.27185

Change-Id: Ie16f2c6d190a322c107d60312d4c35d7aa409c43
Reviewed-on: http://gerrit.cloudera.org:8080/3073
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <[email protected]>
---
M src/kudu/tablet/compaction.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
3 files changed, 46 insertions(+), 57 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3073
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie16f2c6d190a322c107d60312d4c35d7aa409c43
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
