Hello David Ribeiro Alves, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6572

to look at the new patch set (#3).

Change subject: KUDU-1607. Unpin tablet flush after failed bootstrap
......................................................................

KUDU-1607. Unpin tablet flush after failed bootstrap

We have heard reports that, in certain scenarios, a failed tablet is
unable to be deleted. After some investigation, I determined that this
is because we neglect to unpin the Tablet when tablet bootstrap fails.

When a table is being deleted, we delete each tablet superblock by
calling TabletMetadata::DeleteSuperBlock(). This method double-checks
that no orphaned blocks remain referenced to ensure we don't leak disk
space. That requires TabletMetadata::DeleteTabletData() to be called
first, which invokes Flush() twice to ensure that no orphaned blocks are
referenced on disk upon returning. However, if the tablet is pinned,
Flush() silently becomes a no-op (except for a log message that is
printed) and so DeleteSuperBlock() also fails, resulting in the tablet
not being fully deleted and the following warning message being written
to the log file:

  W0317 13:05:19.324373 13242 ts_tablet_manager.cc:634] Invalid argument: Unable to delete on-disk data from tablet d1b857ddaa0743c79c9bcbd9b2fda174: The metadata for tablet d1b857ddaa0743c79c9bcbd9b2fda174 still references orphaned blocks. Call DeleteTabletData() first

This patch makes the following changes:

1. Always unpin the tablet after pinning it.
2. Add a new itest that fails without that change, reproducing the above
   warning message.
3. Add a little more test infra to MiniCluster (and the MiniClusterBase
   interface) to make it easier to use the helper functions in
   cluster_itest_util.h in MiniCluster-based itests.

Note: This backport has been modified not to rely on KUDU-1741.
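
To illustrate the failure mode described above, here is a minimal,
self-contained C++ sketch. It is a toy model under stated assumptions,
not the code in this patch: MetadataSketch, ScopedFlushPin,
BootstrapLeakyPin, BootstrapScopedPin and DeleteAfterFailedBootstrap are
hypothetical names, PinFlush()/UnPinFlush() are assumed names for the
pin and unpin operations, and the block bookkeeping is heavily
simplified. Only Flush(), DeleteTabletData() and DeleteSuperBlock()
echo the method names mentioned in the description.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for tablet metadata (not Kudu's class).
struct MetadataSketch {
  int pin_count = 0;
  std::vector<int> live_blocks{1, 2, 3};
  std::vector<int> orphaned_blocks;

  void PinFlush() { ++pin_count; }
  void UnPinFlush() {
    assert(pin_count > 0);
    --pin_count;
  }

  // While pinned, the flush is skipped, so orphaned blocks stay referenced.
  void Flush() {
    if (pin_count > 0) return;
    orphaned_blocks.clear();
  }

  // Orphans every live block, then flushes twice, as in the description.
  void DeleteTabletData() {
    orphaned_blocks.insert(orphaned_blocks.end(),
                           live_blocks.begin(), live_blocks.end());
    live_blocks.clear();
    Flush();
    Flush();
  }

  // Refuses to proceed while orphaned blocks are still referenced.
  bool DeleteSuperBlock() const { return orphaned_blocks.empty(); }
};

// RAII guard (assumed helper): unpins on every exit path.
class ScopedFlushPin {
 public:
  explicit ScopedFlushPin(MetadataSketch* meta) : meta_(meta) {
    meta_->PinFlush();
  }
  ~ScopedFlushPin() { meta_->UnPinFlush(); }
  ScopedFlushPin(const ScopedFlushPin&) = delete;
  ScopedFlushPin& operator=(const ScopedFlushPin&) = delete;

 private:
  MetadataSketch* meta_;
};

// Models the bug: the early return on failure leaves the metadata pinned.
bool BootstrapLeakyPin(MetadataSketch* meta, bool replay_ok) {
  meta->PinFlush();
  if (!replay_ok) return false;
  meta->UnPinFlush();
  return true;
}

// Models change 1: the pin is always released, even when bootstrap fails.
bool BootstrapScopedPin(MetadataSketch* meta, bool replay_ok) {
  ScopedFlushPin pin(meta);
  return replay_ok;
}

// Fails bootstrap, then attempts the two-step delete; true if it succeeds.
bool DeleteAfterFailedBootstrap(bool use_scoped_pin) {
  MetadataSketch meta;
  if (use_scoped_pin) {
    BootstrapScopedPin(&meta, /*replay_ok=*/false);
  } else {
    BootstrapLeakyPin(&meta, /*replay_ok=*/false);
  }
  meta.DeleteTabletData();
  return meta.DeleteSuperBlock();
}
```

In this model, DeleteAfterFailedBootstrap(false) returns false because
both flushes are skipped while the pin is leaked, which corresponds to
the "still references orphaned blocks" warning. With the scoped pin,
DeleteAfterFailedBootstrap(true) returns true.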

Change-Id: Id274c6ee1da75bc6f92ab91c0a01edaa009b8962
Reviewed-on: http://gerrit.cloudera.org:8080/6422
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <dral...@apache.org>
(cherry picked from commit 0450f77f69c74cc6dec08ad36bb89ac12c17608f)
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/delete_tablet-itest.cc
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/integration-tests/mini_cluster.cc
M src/kudu/integration-tests/mini_cluster.h
M src/kudu/integration-tests/mini_cluster_base.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/mini_tablet_server.cc
M src/kudu/tserver/mini_tablet_server.h
10 files changed, 224 insertions(+), 48 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/6572/3
--
To view, visit http://gerrit.cloudera.org:8080/6572
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id274c6ee1da75bc6f92ab91c0a01edaa009b8962
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: branch-1.2.x
Gerrit-Owner: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>