Hello Tidy Bot, Mike Percy, David Ribeiro Alves, Kudu Jenkins, Todd Lipcon,
I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7439

to look at the new patch set (#24).

Change subject: introduce closed mvcc and stopped tablets
......................................................................

introduce closed mvcc and stopped tablets

Currently, the only way to stop an Applying transaction is to wait for it to finish and Commit it. This constraint was put in place to guarantee on-disk correctness, but is sometimes too strict. E.g. if the tablet is shutting down, the Apply doesn't need to finish.

This patch adds a new state to the MvccManager in which it is closed for transactions. Once in this closed state:
1. New Applies will return and not move to the Commit phase, and any methods waiting for the tablet's Applies to Commit (e.g. new snapshot scans, FlushMRS) will respond with an error immediately. This allows an escape from the existing invariant that Applies _must_ Commit, provided the MvccManager is in this closed state.
2. Applies that are already underway may still Commit, but will return early on a best-effort basis. These non-Committed operations are inconsequential w.r.t. consistency; having some in-flight transactions Commit and others not is consistent with the server shutting down in between the Commits of two transactions.
3. New transaction drivers will abort immediately, before even reaching the Prepare phase, ensuring no more writes to the tablet are made durable.

The Tablet class uses this closed MVCC state in a new "stopped" state of its own. A Tablet that has been stopped will avoid further activity: its MvccManager is closed to prevent further writes, and its maintenance ops are cancelled to prevent further scheduling.

This patch includes these new behaviors when shutting down a tablet, with the assumption that a tablet will only be shut down when it's being deleted and we don't care too much about its in-flight transactions Committing or its further maintenance ops.
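
As a rough illustration of the closed state described above, here is a toy model. It is not the code in this patch: ToyMvcc, its Result enum, and every method name are hypothetical stand-ins, and the real MvccManager tracks far more state. The sketch only shows the three behaviors in miniature: new Applies are rejected once closed, in-flight Applies may still Commit, and waiters return an error instead of blocking.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <set>

// Toy model of an MVCC manager with a "closed" state. All names here are
// hypothetical; this is not Kudu's MvccManager.
class ToyMvcc {
 public:
  enum class Result { kOk, kClosed };

  // Moves a transaction into the Applying state. Once the manager is closed,
  // new Applies are rejected and never reach Commit.
  Result StartApplying(int64_t ts) {
    std::lock_guard<std::mutex> l(mu_);
    if (closed_) return Result::kClosed;
    applying_.insert(ts);
    return Result::kOk;
  }

  // Commits an Applying transaction. Still allowed after Close(), since
  // Applies that were already underway may Commit.
  void Commit(int64_t ts) {
    std::lock_guard<std::mutex> l(mu_);
    assert(applying_.count(ts) == 1);
    applying_.erase(ts);
    cv_.notify_all();
  }

  // Closes the manager and wakes every waiter so it can return an error.
  void Close() {
    std::lock_guard<std::mutex> l(mu_);
    closed_ = true;
    cv_.notify_all();
  }

  // Blocks until no transaction is Applying. Returns kClosed right away if
  // the manager is closed, or as soon as it becomes closed while waiting.
  Result WaitForApplyingToCommit() {
    std::unique_lock<std::mutex> l(mu_);
    cv_.wait(l, [this] { return closed_ || applying_.empty(); });
    return closed_ ? Result::kClosed : Result::kOk;
  }

  // Number of transactions currently in the Applying state.
  std::size_t NumApplying() {
    std::lock_guard<std::mutex> l(mu_);
    return applying_.size();
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::set<int64_t> applying_;
  bool closed_ = false;
};
```

In this sketch, a caller that starts transaction 1, calls Close(), and then calls WaitForApplyingToCommit() gets kClosed back without blocking, even though transaction 1 never Committed. A thread already blocked in the wait would be woken by Close() and see the same result.
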
Code paths that previously crashed if Applies did not succeed (e.g. TransactionDriver::ApplyTask, MvccManager::AbortTransaction, etc.) or that waited for Applies to finish (e.g. Tablet::FlushUnlocked) will now _not_ crash if the Tablet has been stopped and will log a warning instead.

Testing is done by adding the following:
- a test in mvcc-test to shut down MVCC and delete an Applying transaction, ensuring that there are no errors when it leaves scope.
- a test in mvcc-test to wait on an Applying transaction, shut down MVCC, and ensure that any waiters will return with an error.
- a new test, stop_tablet-itest, to ensure that stopped leaders block writes (because they cannot start new transactions) and stopped followers don't (because while they cannot service the op, there still exists a majority that can), and that stopped tablets don't prevent fault-tolerant scans.

Change-Id: I983620f27e7226806a2cca253db7619731914d42
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/stop_tablet-itest.cc
M src/kudu/tablet/local_tablet_writer.h
M src/kudu/tablet/mvcc-test.cc
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/mvcc.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tserver/tablet_service.cc
16 files changed, 638 insertions(+), 118 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/39/7439/24
--
To view, visit http://gerrit.cloudera.org:8080/7439
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I983620f27e7226806a2cca253db7619731914d42
Gerrit-Change-Number: 7439
Gerrit-PatchSet: 24
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>