Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5055

to look at the new patch set (#4).

Change subject: KUDU-768 (part 1) - Move timestamp assignment out of Tablet
......................................................................

KUDU-768 (part 1) - Move timestamp assignment out of Tablet

Safe time is a timestamp such that all transactions before it are
known and either completed or in-flight. Waiting for the Mvcc
snapshot at "safe time" to be "clean" yields repeatable
reads: scans of a tablet at a snapshot defined by a timestamp
will always return the same results. Proper "safe time"
advancement also improves load balancing: a scan at a clean
timestamp that is lower than "safe time" on a replica is guaranteed
to yield the same results as the same scan on the leader replica
(though possibly with a latency penalty).

Currently this timestamp is advanced within Mvcc, but this is not
natural as it conflates the consensus state (all the operations
that are being replicated and/or replayed) and the mvcc state
(all the operations that have been consensus committed and are
being applied). Furthermore, Mvcc confusingly mixes the concepts
of "safe time" and "clean time": the latter means a timestamp
such that all operations before it have been completed, whereas
the former also includes the operations that are in-flight, even
if they haven't started being applied to the tablet.

This patch series aims at separating the two concepts and fixing
safe time advancement:
a) - Safe time advancement will be handled by consensus: The leader
can easily establish which timestamps are safe for a replica by
looking at which operations that replica knows and what the
timestamp of the last committed operation is.
b) - Mvcc will only take care of monitoring "clean time" advancement.
This makes it simpler to wait for a timestamp to be "safe" and "clean":
the caller will first wait for the timestamp to be "safe", meaning all
operations before it are known and in-flight, and then wait for it to
be "clean" in mvcc, meaning all the in-flight operations before it
have completed.
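
The "safe, then clean" check described above can be sketched roughly as
follows. This is a minimal, non-blocking illustration, not code from this
patch: SafeTimeTracker, MvccSketch and CanServeSnapshot are hypothetical
names, timestamps are plain integers, and "before" is assumed to mean
"at or below" the given timestamp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>

// All names here are hypothetical; they are not Kudu's actual classes.
using Timestamp = uint64_t;

// Tracks "safe time": every operation at or below it is assumed to be known.
// In the design described above, consensus would be the one advancing it.
class SafeTimeTracker {
 public:
  // Safe time only moves forward.
  void AdvanceSafeTime(Timestamp t) { safe_time_ = std::max(safe_time_, t); }
  bool IsSafe(Timestamp t) const { return t <= safe_time_; }

 private:
  Timestamp safe_time_ = 0;
};

// Tracks only in-flight operations, i.e. what "clean time" depends on.
class MvccSketch {
 public:
  void StartAtTimestamp(Timestamp t) { in_flight_.insert(t); }

  void Commit(Timestamp t) {
    // Committing an operation that was never started would be a bug.
    assert(in_flight_.count(t) > 0);
    in_flight_.erase(t);
  }

  // A timestamp is clean when no in-flight operation is at or below it.
  bool IsClean(Timestamp t) const {
    return in_flight_.empty() || *in_flight_.begin() > t;
  }

 private:
  std::set<Timestamp> in_flight_;
};

// A snapshot at `t` is repeatable once `t` is both safe and clean.
// A real implementation would block or register a callback instead of polling.
bool CanServeSnapshot(const SafeTimeTracker& safe, const MvccSketch& mvcc,
                      Timestamp t) {
  return safe.IsSafe(t) && mvcc.IsClean(t);
}
```

With an operation in flight at timestamp 5, a snapshot at 10 is not
servable until safe time has reached 10 and the operation at 5 has
committed; keeping the two checks in separate objects mirrors the split
between consensus state and mvcc state.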

This patch in particular takes the first two steps in this direction:
1) It moves timestamp assignment out of Tablet and into the
TransactionDriver, to be done prior to pushing the operation to
consensus for replication.
2) It makes all operations be "operations at a timestamp", so that
all operations behave the same way within mvcc regardless of
whether they were started at the leader or at a follower.
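
The two steps above could look roughly like the sketch below. It is an
illustration under stated assumptions, not the code in this patch:
LogicalClock, Operation, MvccSketch and DriverSketch are hypothetical
stand-ins, and replication itself is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// All names here are hypothetical; they are not Kudu's actual classes.
using Timestamp = uint64_t;

// Stand-in for a clock that hands out increasing timestamps.
class LogicalClock {
 public:
  Timestamp Now() { return ++last_; }

 private:
  Timestamp last_ = 0;
};

struct Operation {
  bool has_timestamp = false;
  Timestamp timestamp = 0;
};

// Mvcc only ever sees operations that already carry a timestamp.
class MvccSketch {
 public:
  void StartAtTimestamp(Timestamp t) { in_flight_.insert(t); }
  bool IsInFlight(Timestamp t) const { return in_flight_.count(t) > 0; }

 private:
  std::set<Timestamp> in_flight_;
};

class DriverSketch {
 public:
  DriverSketch(LogicalClock* clock, MvccSketch* mvcc)
      : clock_(clock), mvcc_(mvcc) {}

  // Leader path: the driver, not the tablet, assigns the timestamp, and it
  // does so before the operation would be handed to consensus.
  void PrepareOnLeader(Operation* op) {
    op->timestamp = clock_->Now();
    op->has_timestamp = true;
    Start(*op);
  }

  // Follower path: the timestamp arrived with the replicated operation.
  void PrepareOnFollower(const Operation& op) {
    assert(op.has_timestamp);
    Start(op);
  }

 private:
  // Both paths end up in the same "start at a timestamp" call.
  void Start(const Operation& op) { mvcc_->StartAtTimestamp(op.timestamp); }

  LogicalClock* clock_;
  MvccSketch* mvcc_;
};
```

Because both paths funnel into a single StartAtTimestamp call, mvcc in
this sketch has no leader-specific entry point that picks a timestamp on
its own.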

Follow-up patches will completely remove the Mvcc APIs for automatic
safe time advancement and timestamp assignment and will introduce
the new entity responsible for "safe time".

Change-Id: I3ba7212f9211f585d4bef00e5ccfc24d5eece224
---
M src/kudu/tablet/local_tablet_writer.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_peer.cc
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/tablet/transactions/transaction_tracker-test.cc
7 files changed, 51 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/55/5055/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ba7212f9211f585d4bef00e5ccfc24d5eece224
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
