Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18239 )


Change subject: [master] fix rare crash on master shutdown
......................................................................

[master] fix rare crash on master shutdown

kudu-master once crashed in the RemoteKsckTest.TestClusterWithLocation
scenario when running in a dist-test setup [1] (TSAN build configuration).

The logs showed that the Raft pool had already been shut down,
but it seems the system tablet replica wasn't stopped (see below).

I replaced the call that shuts down the Raft pool of the system tablet
replica with a call that shuts down the tablet itself: it seems that
should take care of everything as needed.

[1] http://dist-test.cloudera.org/job?job_id=jenkins-slave.1645053595.2558268

I0216 23:21:53.056661 11915 raft_consensus.cc:2227] T 00000000000000000000000000000000 P 8ffb1b8a80764dcab6d0fd9860c9ef7a [term 1 FOLLOWER]: Raft consensus shutting down.
I0216 23:21:53.057236 11915 raft_consensus.cc:2256] T 00000000000000000000000000000000 P 8ffb1b8a80764dcab6d0fd9860c9ef7a [term 1 FOLLOWER]: Raft consensus is shut down!
I0216 23:21:53.057643 11915 tablet_replica.cc:323] T 00000000000000000000000000000000 P 8ffb1b8a80764dcab6d0fd9860c9ef7a: stopping tablet replica
I0216 23:21:53.110038 11915 master.cc:427] Master@127.11.162.252:43629 shutdown complete.
F0216 23:21:53.113328 11915 threadpool.cc:306] Check failed: 1 == tokens_.size() (1 vs. 3) Threadpool raft destroyed with 3 allocated tokens

The stack trace looked roughly like this:

    @     0x7f814c154332 google::logging_fail() at ??:0
    @     0x7f814c1531a4 google::LogMessage::SendToLog() at ??:0
    @     0x7f814c153c42 google::LogMessage::Flush() at ??:0
    @     0x7f814c1583ab google::LogMessageFatal::~LogMessageFatal() at ??:0
    @     0x7f814c979e47 kudu::ThreadPool::~ThreadPool() at ??:0
    @     0x7f8153c3854f std::__1::default_delete<>::operator()() at ??:0
    @     0x7f8153c384be std::__1::unique_ptr<>::reset() at ??:0
    @     0x7f8153bf7cfc std::__1::unique_ptr<>::~unique_ptr() at ??:0
    @     0x7f8153c98b54 kudu::kserver::KuduServer::~KuduServer() at ??:0
    @     0x7f8153c916eb kudu::master::Master::~Master() at ??:0
    @     0x7f8153c9199a kudu::master::Master::~Master() at ??:0
    @     0x7f8153cc8c78 std::__1::default_delete<>::operator()() at ??:0
    @     0x7f8153cc1f9e std::__1::unique_ptr<>::reset() at ??:0
    @     0x7f8153cd8393 kudu::master::MiniMaster::Shutdown() at ??:0
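
For illustration only, the failure mode suggested by the fatal check and
the destructor frames above can be sketched as follows. This is a minimal
model, not Kudu code: SketchPool, SketchToken, SketchReplica and
TokensLeftAtPoolTeardown are hypothetical names, and the sketch counts only
explicitly allocated tokens, so its "safe" count is 0 rather than the 1
that the real check expects. It assumes a replica holds tokens from a shared
pool and releases them only when the replica itself is shut down.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a thread pool that tracks outstanding tokens.
class SketchPool {
 public:
  void AddToken() { ++live_tokens_; }
  void ReleaseToken() {
    assert(live_tokens_ > 0);
    --live_tokens_;
  }
  std::size_t live_tokens() const { return live_tokens_; }

 private:
  std::size_t live_tokens_ = 0;
};

// RAII token: registers with the pool on creation, releases on destruction.
class SketchToken {
 public:
  explicit SketchToken(SketchPool* pool) : pool_(pool) { pool_->AddToken(); }
  ~SketchToken() { pool_->ReleaseToken(); }
  SketchToken(const SketchToken&) = delete;
  SketchToken& operator=(const SketchToken&) = delete;

 private:
  SketchPool* pool_;
};

// Hypothetical replica that holds two tokens until it is shut down.
class SketchReplica {
 public:
  explicit SketchReplica(SketchPool* pool)
      : prepare_token_(new SketchToken(pool)),
        apply_token_(new SketchToken(pool)) {}

  // Shutting down the replica releases every token it holds.
  void Shutdown() {
    prepare_token_.reset();
    apply_token_.reset();
  }

 private:
  std::unique_ptr<SketchToken> prepare_token_;
  std::unique_ptr<SketchToken> apply_token_;
};

// Returns how many tokens are still allocated at the point where the
// pool's owner would destroy the pool.
std::size_t TokensLeftAtPoolTeardown(bool shut_down_replica_first) {
  SketchPool pool;
  SketchReplica replica(&pool);
  if (shut_down_replica_first) {
    replica.Shutdown();
  }
  return pool.live_tokens();
}
```

In this model, TokensLeftAtPoolTeardown(false) leaves tokens allocated at
teardown, which is the condition a destructor-time check like the one in
the log would reject, while TokensLeftAtPoolTeardown(true) leaves none.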

Change-Id: I006a4d617cd1cb21312dac2f467fa3d04060b866
---
M src/kudu/master/catalog_manager.cc
1 file changed, 6 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/39/18239/1
--
To view, visit http://gerrit.cloudera.org:8080/18239
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I006a4d617cd1cb21312dac2f467fa3d04060b866
Gerrit-Change-Number: 18239
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
