Todd Lipcon has submitted this change and it was merged.
Change subject: catalog_manager: wait for table visitors before shutting down catalog
......................................................................
catalog_manager: wait for table visitors before shutting down catalog
We observed the following crash:
*** Aborted at 1456910767 (unix time) try "date -d @1456910767" if you are using GNU date ***
PC: @ 0x7f1892a8c3c0 base::subtle::NoBarrier_CompareAndSwap()
*** SIGSEGV (@0x158) received by PID 6439 (TID 0x7f187b54e700) from PID 344; stack trace: ***
@ 0x3b44e0f710 (unknown) at ??:0
@ 0x7f1892a8c3c0 base::subtle::NoBarrier_CompareAndSwap() at ??:0
@ 0x7f1892a8c440 base::subtle::Acquire_CompareAndSwap() at ??:0
@ 0x7f1892a8c5ca base::SpinLock::Lock() at ??:0
@ 0x7f1892a8ce74 kudu::simple_spinlock::lock() at ??:0
@ 0x7f1892a9a90e boost::lock_guard<>::lock_guard() at ??:0
@ 0x7f1891f2a43a kudu::tablet::MvccManager::TakeSnapshot() at ??:0
@ 0x7f1891f2b00e kudu::tablet::MvccSnapshot::MvccSnapshot() at ??:0
@ 0x7f1891e63755 kudu::tablet::Tablet::NewRowIterator() at ??:0
@ 0x7f1892ade0c1 kudu::master::SysCatalogTable::VisitTablets() at ??:0
@ 0x7f1892a74497 kudu::master::CatalogManager::VisitTablesAndTabletsUnlocked() at ??:0
@ 0x7f1892a73f8e kudu::master::CatalogManager::VisitTablesAndTabletsTask() at ??:0
This can happen if the deferral of VisitTablesAndTabletsTask to the worker
thread pool is slow enough that the catalog is able to shut down before the
table visitor starts.
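
The ordering problem described above can be illustrated with a minimal,
self-contained sketch. This is not the actual patch: SysCatalog, WorkerPool,
ScheduleVisit(), catalog_destroyed() and RunSketch() are hypothetical
stand-ins, and the 50 ms sleep only simulates a slow hand-off to the worker
pool. The one idea it shows is that Shutdown() waits for outstanding visitor
tasks before it tears down the catalog state they read.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the sys catalog state that the visitor reads.
struct SysCatalog {
  std::mutex lock;
  int tablets_visited = 0;

  void VisitTablets() {
    std::lock_guard<std::mutex> l(lock);
    ++tablets_visited;
  }
};

// Hypothetical worker pool: runs each task on its own thread and lets
// callers block until every submitted task has finished.
class WorkerPool {
 public:
  ~WorkerPool() { Wait(); }

  void Submit(std::function<void()> task) {
    std::lock_guard<std::mutex> l(mu_);
    threads_.emplace_back(std::move(task));
  }

  // Joins every task submitted so far.
  void Wait() {
    std::vector<std::thread> to_join;
    {
      std::lock_guard<std::mutex> l(mu_);
      to_join.swap(threads_);
    }
    for (auto& t : to_join) t.join();
  }

 private:
  std::mutex mu_;
  std::vector<std::thread> threads_;
};

// Hypothetical, heavily simplified catalog manager.
class CatalogManager {
 public:
  CatalogManager() : sys_catalog_(new SysCatalog) {}

  // Defers the table/tablet visit to the worker pool.
  void ScheduleVisit() {
    pool_.Submit([this] {
      // Simulate a slow deferral so that Shutdown() is called first.
      std::this_thread::sleep_for(std::chrono::milliseconds(50));
      sys_catalog_->VisitTablets();
    });
  }

  // Waits for in-flight visitors, then destroys the catalog.
  // Without the Wait() call, the visitor could dereference a null
  // sys_catalog_ after reset(). Returns the number of visits that ran.
  int Shutdown() {
    pool_.Wait();
    int visited = sys_catalog_->tablets_visited;
    sys_catalog_.reset();
    return visited;
  }

  bool catalog_destroyed() const { return sys_catalog_ == nullptr; }

 private:
  std::unique_ptr<SysCatalog> sys_catalog_;
  WorkerPool pool_;  // Declared last so it is destroyed (and drained) first.
};

// Schedules a visit, shuts down immediately, and reports how many visits ran.
inline int RunSketch() {
  CatalogManager cm;
  cm.ScheduleVisit();
  int visited = cm.Shutdown();
  assert(cm.catalog_destroyed());
  return visited;
}
```

In this sketch, removing the pool_.Wait() call from Shutdown() reintroduces
the race: the deferred task may run after sys_catalog_ has been reset.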
I spent a bunch of time trying to reproduce this crash. The attached test is
simple, but it needs class friendship and I've only seen it trigger the
crash once in thousands of runs. As such, it may hurt more than it helps.
Change-Id: I142b8dbdf4356a324bcde0e63fa44ea63798d509
Reviewed-on: http://gerrit.cloudera.org:8080/2427
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <[email protected]>
---
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master-test.cc
3 files changed, 18 insertions(+), 0 deletions(-)
Approvals:
Todd Lipcon: Looks good to me, approved
Kudu Jenkins: Verified
--
To view, visit http://gerrit.cloudera.org:8080/2427
Gerrit-MessageType: merged
Gerrit-Change-Id: I142b8dbdf4356a324bcde0e63fa44ea63798d509
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>