[ 
https://issues.apache.org/jira/browse/KUDU-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wong updated KUDU-3149:
------------------------------
    Description: 
We saw a bunch of tablets bootstrapping extremely slowly, and many stuck 
supposedly bootstrapping, but not showing up in the {{/tablets}} page, i.e. we 
could only see INITIALIZED and RUNNING tablets, no BOOTSTRAPPING.

Upon digging into the stacks, we saw a bunch waiting to acquire the MM lock:
{code:java}
TID 46577(tablet-open [wo):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @          0x23b4058  kudu::Mutex::Acquire()
    @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
    @           0xb59b99  kudu::tablet::Tablet::RegisterMaintenanceOps()
    @           0xb855a1  kudu::tablet::TabletReplica::RegisterMaintenanceOps()
    @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)
TID 46574(tablet-open [wo):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @          0x23b4058  kudu::Mutex::Acquire()
    @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
    @           0xb59c74  kudu::tablet::Tablet::RegisterMaintenanceOps()
    @           0xb855a1  kudu::tablet::TabletReplica::RegisterMaintenanceOps()
    @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)

7 threads with same stack:
TID 46575(tablet-open [wo):
TID 46576(tablet-open [wo):
TID 46578(tablet-open [wo):
TID 46580(tablet-open [wo):
TID 46581(tablet-open [wo):
TID 46582(tablet-open [wo):
TID 46583(tablet-open [wo):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @          0x23b4058  kudu::Mutex::Acquire()
    @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
    @           0xb85374  kudu::tablet::TabletReplica::RegisterMaintenanceOps()
    @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)
TID 46573(tablet-open [wo):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @          0x23b4058  kudu::Mutex::Acquire()
    @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
    @           0xb854c7  kudu::tablet::TabletReplica::RegisterMaintenanceOps()
    @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)

2 threads with same stack:
TID 43795(MaintenanceMgr ):
TID 43796(MaintenanceMgr ):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @          0x23b4058  kudu::Mutex::Acquire()
    @          0x239a064  kudu::MaintenanceManager::LaunchOp()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)
{code}

A couple more stacks show some work being done by the maintenance manager:

{code:java}
TID 43794(MaintenanceMgr ):
    @     0x7f1dd57147e0  (unknown)
    @           0xba7b41  
kudu::tablet::BudgetedCompactionPolicy::RunApproximation()
    @           0xba8f5d  kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0xb5b1a1  kudu::tablet::Tablet::PickRowSetsToCompact()
    @           0xb64e93  kudu::tablet::Tablet::Compact()
    @           0xb81071  kudu::tablet::CompactRowSetsOp::Perform()
    @          0x2399c77  kudu::MaintenanceManager::LaunchOp()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)

TID 46999(maintenance_sch):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @           0xb51f29  kudu::tablet::Tablet::UpdateCompactionStats()
    @           0xb7f435  kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x23956e4  kudu::MaintenanceManager::FindBestOp()
    @          0x2396af9  kudu::MaintenanceManager::FindAndLaunchOp()
    @          0x2397858  kudu::MaintenanceManager::RunSchedulerThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)
{code}

So the question is, why is updating stats taking so long? Getting the full 
stacks, the scheduler thread is updating compaction stats, but is waiting to 
take a lock:

{code:java}
Thread 4 (Thread 0x7f1d7d358700 (LWP 46999)):
#0  0x00007f1dd5713334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f1dd570e5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007f1dd570e4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000b51f29 in 
kudu::tablet::Tablet::UpdateCompactionStats(kudu::MaintenanceOpStats*) ()
#4  0x0000000000b7f435 in 
kudu::tablet::CompactRowSetsOp::UpdateStats(kudu::MaintenanceOpStats*) ()
#5  0x00000000023956e4 in kudu::MaintenanceManager::FindBestOp() ()
#6  0x0000000002396af9 in 
kudu::MaintenanceManager::FindAndLaunchOp(std::unique_lock<kudu::Mutex>*) ()
#7  0x0000000002397858 in kudu::MaintenanceManager::RunSchedulerThread() ()
{code}
And a compaction thread running the rowset-picking algorithm, which happens 
while the {{compact_select_lock_}} is held:
{code:java}
Thread 125 (Thread 0x7f1dce377700 (LWP 43794)):
#0  0x0000000000ba7b90 in 
kudu::tablet::BudgetedCompactionPolicy::RunApproximation(std::vector<kudu::tablet::RowSetInfo,
 std::allocator<kudu::tablet::RowSetInfo> > const&, 
std::vector<kudu::tablet::RowSetInfo, std::allocator<kudu::tablet::RowSetInfo> 
> const&, std::vector<double, std::allocator<double> >*, 
kudu::tablet::BudgetedCompactionPolicy::SolutionAndValue*) const ()
#1  0x0000000000ba8f5d in 
kudu::tablet::BudgetedCompactionPolicy::PickRowSets(kudu::tablet::RowSetTree 
const&, std::unordered_set<kudu::tablet::RowSet const*, 
std::hash<kudu::tablet::RowSet const*>, std::equal_to<kudu::tablet::RowSet 
const*>, std::allocator<kudu::tablet::RowSet const*> >*, double*, 
std::vector<std::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::allocator<std::basic_string<char, 
std::char_traits<char>, std::allocator<char> > > >*) ()
#2  0x0000000000b5b1a1 in 
kudu::tablet::Tablet::PickRowSetsToCompact(kudu::tablet::RowSetsInCompaction*, 
int) const ()
#3  0x0000000000b64e93 in kudu::tablet::Tablet::Compact(int) ()
#4  0x0000000000b81071 in kudu::tablet::CompactRowSetsOp::Perform() ()
#5  0x0000000002399c77 in 
kudu::MaintenanceManager::LaunchOp(kudu::MaintenanceOp*) ()
#6  0x00000000023f994c in kudu::ThreadPool::DispatchThread() ()
#7  0x00000000023f3f8b in kudu::Thread::SuperviseThread(void*) ()
#8  0x00007f1dd570caa1 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f1dd3b18bcd in clone () from /lib64/libc.so.6
{code}
It seems the computing of stats is being blocked by on-going compactions, 
contending with the {{compact_select_lock_}}. Additionally, we're holding the 
maintenance manager's {{lock_}} member, for the duration of us computing stats, 
which is contending with the registration of maintenance manager ops. So the 
compaction selection is blocking the scheduler thread, and is thus effectively 
blocking the registration of many tablet replicas' ops, and blocking further 
bootstrapping.

A couple things come to mind with regards to the registration issue:
 - We could probably take a snapshot of the ops under lock and release the 
lock_ when finding the best op to run, so op registration would be somewhat 
independent of scheduling.
 - Additionally, we may want to consider disabling compactions entirely until 
the initial set of tablets finishes bootstrapping.

It's worth noting that it isn't the act of compacting that is contending 
directly with registration per se, but rather the computation of the stats. The 
computation of stats happens to be contending with on-going compactions. 
Improving either contention would likely help.

As a workaround, we used the {{set_flag}} tool to disable compactions on the 
node and noted significantly faster bootstrapping.

  was:
We saw a bunch of tablets bootstrapping extremely slowly, and many stuck 
supposedly bootstrapping, but not showing up in the {{/tablets}} page, i.e. we 
could only see INITIALIZED and RUNNING tablets, no BOOTSTRAPPING.

Upon digging into the stacks, we saw a bunch waiting to acquire the MM lock:
{code:java}
TID 46577(tablet-open [wo):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @          0x23b4058  kudu::Mutex::Acquire()
    @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
    @           0xb59b99  kudu::tablet::Tablet::RegisterMaintenanceOps()
    @           0xb855a1  kudu::tablet::TabletReplica::RegisterMaintenanceOps()
    @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)
TID 46574(tablet-open [wo):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @          0x23b4058  kudu::Mutex::Acquire()
    @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
    @           0xb59c74  kudu::tablet::Tablet::RegisterMaintenanceOps()
    @           0xb855a1  kudu::tablet::TabletReplica::RegisterMaintenanceOps()
    @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)

7 threads with same stack:
TID 46575(tablet-open [wo):
TID 46576(tablet-open [wo):
TID 46578(tablet-open [wo):
TID 46580(tablet-open [wo):
TID 46581(tablet-open [wo):
TID 46582(tablet-open [wo):
TID 46583(tablet-open [wo):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @          0x23b4058  kudu::Mutex::Acquire()
    @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
    @           0xb85374  kudu::tablet::TabletReplica::RegisterMaintenanceOps()
    @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)
TID 46573(tablet-open [wo):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @          0x23b4058  kudu::Mutex::Acquire()
    @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
    @           0xb854c7  kudu::tablet::TabletReplica::RegisterMaintenanceOps()
    @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)

2 threads with same stack:
TID 43795(MaintenanceMgr ):
TID 43796(MaintenanceMgr ):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @          0x23b4058  kudu::Mutex::Acquire()
    @          0x239a064  kudu::MaintenanceManager::LaunchOp()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)
{code}

A couple more stats 

{code:java}
TID 43794(MaintenanceMgr ):
    @     0x7f1dd57147e0  (unknown)
    @           0xba7b41  
kudu::tablet::BudgetedCompactionPolicy::RunApproximation()
    @           0xba8f5d  kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0xb5b1a1  kudu::tablet::Tablet::PickRowSetsToCompact()
    @           0xb64e93  kudu::tablet::Tablet::Compact()
    @           0xb81071  kudu::tablet::CompactRowSetsOp::Perform()
    @          0x2399c77  kudu::MaintenanceManager::LaunchOp()
    @          0x23f994c  kudu::ThreadPool::DispatchThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)

TID 46999(maintenance_sch):
    @     0x7f1dd57147e0  (unknown)
    @     0x7f1dd5713332  (unknown)
    @     0x7f1dd570e5d8  (unknown)
    @     0x7f1dd570e4a7  (unknown)
    @           0xb51f29  kudu::tablet::Tablet::UpdateCompactionStats()
    @           0xb7f435  kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x23956e4  kudu::MaintenanceManager::FindBestOp()
    @          0x2396af9  kudu::MaintenanceManager::FindAndLaunchOp()
    @          0x2397858  kudu::MaintenanceManager::RunSchedulerThread()
    @          0x23f3f8b  kudu::Thread::SuperviseThread()
    @     0x7f1dd570caa1  (unknown)
    @     0x7f1dd3b18bcd  (unknown)
{code}
So the question is, why is updating stats taking so long? Getting the full 
stacks, the scheduler thread is updating compaction stats, but is waiting to 
take a lock:
{code:java}
Thread 4 (Thread 0x7f1d7d358700 (LWP 46999)):
#0  0x00007f1dd5713334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f1dd570e5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007f1dd570e4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000b51f29 in 
kudu::tablet::Tablet::UpdateCompactionStats(kudu::MaintenanceOpStats*) ()
#4  0x0000000000b7f435 in 
kudu::tablet::CompactRowSetsOp::UpdateStats(kudu::MaintenanceOpStats*) ()
#5  0x00000000023956e4 in kudu::MaintenanceManager::FindBestOp() ()
#6  0x0000000002396af9 in 
kudu::MaintenanceManager::FindAndLaunchOp(std::unique_lock<kudu::Mutex>*) ()
#7  0x0000000002397858 in kudu::MaintenanceManager::RunSchedulerThread() ()
{code}
And a compaction thread running the rowset-picking algorithm, which happens 
while the {{compact_select_lock_}} is held:
{code:java}
Thread 125 (Thread 0x7f1dce377700 (LWP 43794)):
#0  0x0000000000ba7b90 in 
kudu::tablet::BudgetedCompactionPolicy::RunApproximation(std::vector<kudu::tablet::RowSetInfo,
 std::allocator<kudu::tablet::RowSetInfo> > const&, 
std::vector<kudu::tablet::RowSetInfo, std::allocator<kudu::tablet::RowSetInfo> 
> const&, std::vector<double, std::allocator<double> >*, 
kudu::tablet::BudgetedCompactionPolicy::SolutionAndValue*) const ()
#1  0x0000000000ba8f5d in 
kudu::tablet::BudgetedCompactionPolicy::PickRowSets(kudu::tablet::RowSetTree 
const&, std::unordered_set<kudu::tablet::RowSet const*, 
std::hash<kudu::tablet::RowSet const*>, std::equal_to<kudu::tablet::RowSet 
const*>, std::allocator<kudu::tablet::RowSet const*> >*, double*, 
std::vector<std::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::allocator<std::basic_string<char, 
std::char_traits<char>, std::allocator<char> > > >*) ()
#2  0x0000000000b5b1a1 in 
kudu::tablet::Tablet::PickRowSetsToCompact(kudu::tablet::RowSetsInCompaction*, 
int) const ()
#3  0x0000000000b64e93 in kudu::tablet::Tablet::Compact(int) ()
#4  0x0000000000b81071 in kudu::tablet::CompactRowSetsOp::Perform() ()
#5  0x0000000002399c77 in 
kudu::MaintenanceManager::LaunchOp(kudu::MaintenanceOp*) ()
#6  0x00000000023f994c in kudu::ThreadPool::DispatchThread() ()
#7  0x00000000023f3f8b in kudu::Thread::SuperviseThread(void*) ()
#8  0x00007f1dd570caa1 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f1dd3b18bcd in clone () from /lib64/libc.so.6
{code}
It seems the computing of stats is being blocked by on-going compactions, 
contending with the {{compact_select_lock_}}. Additionally, we're holding the 
maintenance manager's {{lock_}} member, for the duration of us computing stats, 
which is contending with the registration of maintenance manager ops. So the 
compaction selection is blocking the scheduler thread, and is thus effectively 
blocking the registration of many tablet replicas' ops, and blocking further 
bootstrapping.

A couple things come to mind with regards to the registration issue:
 - We could probably take a snapshot of the ops under lock and release the 
lock_ when finding the best op to run, so op registration would be somewhat 
independent of scheduling.
 - Additionally, we may want to consider disabling compactions entirely until 
the initial set of tablets finishes bootstrapping.

It's worth noting that it isn't the act of compacting that is contending 
directly with registration per se, but rather the computation of the stats. The 
computation of stats happens to be contending with on-going compactions. 
Improving either contention would likely help.

As a workaround, we used the {{set_flag}} tool to disable compactions on the 
node and noted significantly faster bootstrapping.


> Lock contention between registering ops and computing maintenance op stats
> --------------------------------------------------------------------------
>
>                 Key: KUDU-3149
>                 URL: https://issues.apache.org/jira/browse/KUDU-3149
>             Project: Kudu
>          Issue Type: Bug
>          Components: perf, tserver
>            Reporter: Andrew Wong
>            Priority: Major
>
> We saw a bunch of tablets bootstrapping extremely slowly, and many stuck 
> supposedly bootstrapping, but not showing up in the {{/tablets}} page, i.e. 
> we could only see INITIALIZED and RUNNING tablets, no BOOTSTRAPPING.
> Upon digging into the stacks, we saw a bunch waiting to acquire the MM lock:
> {code:java}
> TID 46577(tablet-open [wo):
>     @     0x7f1dd57147e0  (unknown)
>     @     0x7f1dd5713332  (unknown)
>     @     0x7f1dd570e5d8  (unknown)
>     @     0x7f1dd570e4a7  (unknown)
>     @          0x23b4058  kudu::Mutex::Acquire()
>     @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
>     @           0xb59b99  kudu::tablet::Tablet::RegisterMaintenanceOps()
>     @           0xb855a1  
> kudu::tablet::TabletReplica::RegisterMaintenanceOps()
>     @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
>     @          0x23f994c  kudu::ThreadPool::DispatchThread()
>     @          0x23f3f8b  kudu::Thread::SuperviseThread()
>     @     0x7f1dd570caa1  (unknown)
>     @     0x7f1dd3b18bcd  (unknown)
> TID 46574(tablet-open [wo):
>     @     0x7f1dd57147e0  (unknown)
>     @     0x7f1dd5713332  (unknown)
>     @     0x7f1dd570e5d8  (unknown)
>     @     0x7f1dd570e4a7  (unknown)
>     @          0x23b4058  kudu::Mutex::Acquire()
>     @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
>     @           0xb59c74  kudu::tablet::Tablet::RegisterMaintenanceOps()
>     @           0xb855a1  
> kudu::tablet::TabletReplica::RegisterMaintenanceOps()
>     @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
>     @          0x23f994c  kudu::ThreadPool::DispatchThread()
>     @          0x23f3f8b  kudu::Thread::SuperviseThread()
>     @     0x7f1dd570caa1  (unknown)
>     @     0x7f1dd3b18bcd  (unknown)
> 7 threads with same stack:
> TID 46575(tablet-open [wo):
> TID 46576(tablet-open [wo):
> TID 46578(tablet-open [wo):
> TID 46580(tablet-open [wo):
> TID 46581(tablet-open [wo):
> TID 46582(tablet-open [wo):
> TID 46583(tablet-open [wo):
>     @     0x7f1dd57147e0  (unknown)
>     @     0x7f1dd5713332  (unknown)
>     @     0x7f1dd570e5d8  (unknown)
>     @     0x7f1dd570e4a7  (unknown)
>     @          0x23b4058  kudu::Mutex::Acquire()
>     @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
>     @           0xb85374  
> kudu::tablet::TabletReplica::RegisterMaintenanceOps()
>     @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
>     @          0x23f994c  kudu::ThreadPool::DispatchThread()
>     @          0x23f3f8b  kudu::Thread::SuperviseThread()
>     @     0x7f1dd570caa1  (unknown)
>     @     0x7f1dd3b18bcd  (unknown)
> TID 46573(tablet-open [wo):
>     @     0x7f1dd57147e0  (unknown)
>     @     0x7f1dd5713332  (unknown)
>     @     0x7f1dd570e5d8  (unknown)
>     @     0x7f1dd570e4a7  (unknown)
>     @          0x23b4058  kudu::Mutex::Acquire()
>     @          0x23980ff  kudu::MaintenanceManager::RegisterOp()
>     @           0xb854c7  
> kudu::tablet::TabletReplica::RegisterMaintenanceOps()
>     @           0xa0055b  kudu::tserver::TSTabletManager::OpenTablet()
>     @          0x23f994c  kudu::ThreadPool::DispatchThread()
>     @          0x23f3f8b  kudu::Thread::SuperviseThread()
>     @     0x7f1dd570caa1  (unknown)
>     @     0x7f1dd3b18bcd  (unknown)
> 2 threads with same stack:
> TID 43795(MaintenanceMgr ):
> TID 43796(MaintenanceMgr ):
>     @     0x7f1dd57147e0  (unknown)
>     @     0x7f1dd5713332  (unknown)
>     @     0x7f1dd570e5d8  (unknown)
>     @     0x7f1dd570e4a7  (unknown)
>     @          0x23b4058  kudu::Mutex::Acquire()
>     @          0x239a064  kudu::MaintenanceManager::LaunchOp()
>     @          0x23f994c  kudu::ThreadPool::DispatchThread()
>     @          0x23f3f8b  kudu::Thread::SuperviseThread()
>     @     0x7f1dd570caa1  (unknown)
>     @     0x7f1dd3b18bcd  (unknown)
> {code}
> A couple more stacks show some work being done by the maintenance manager:
> {code:java}
> TID 43794(MaintenanceMgr ):
>     @     0x7f1dd57147e0  (unknown)
>     @           0xba7b41  
> kudu::tablet::BudgetedCompactionPolicy::RunApproximation()
>     @           0xba8f5d  
> kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
>     @           0xb5b1a1  kudu::tablet::Tablet::PickRowSetsToCompact()
>     @           0xb64e93  kudu::tablet::Tablet::Compact()
>     @           0xb81071  kudu::tablet::CompactRowSetsOp::Perform()
>     @          0x2399c77  kudu::MaintenanceManager::LaunchOp()
>     @          0x23f994c  kudu::ThreadPool::DispatchThread()
>     @          0x23f3f8b  kudu::Thread::SuperviseThread()
>     @     0x7f1dd570caa1  (unknown)
>     @     0x7f1dd3b18bcd  (unknown)
> TID 46999(maintenance_sch):
>     @     0x7f1dd57147e0  (unknown)
>     @     0x7f1dd5713332  (unknown)
>     @     0x7f1dd570e5d8  (unknown)
>     @     0x7f1dd570e4a7  (unknown)
>     @           0xb51f29  kudu::tablet::Tablet::UpdateCompactionStats()
>     @           0xb7f435  kudu::tablet::CompactRowSetsOp::UpdateStats()
>     @          0x23956e4  kudu::MaintenanceManager::FindBestOp()
>     @          0x2396af9  kudu::MaintenanceManager::FindAndLaunchOp()
>     @          0x2397858  kudu::MaintenanceManager::RunSchedulerThread()
>     @          0x23f3f8b  kudu::Thread::SuperviseThread()
>     @     0x7f1dd570caa1  (unknown)
>     @     0x7f1dd3b18bcd  (unknown)
> {code}
> So the question is, why is updating stats taking so long? Getting the full 
> stacks, the scheduler thread is updating compaction stats, but is waiting to 
> take a lock:
> {code:java}
> Thread 4 (Thread 0x7f1d7d358700 (LWP 46999)):
> #0  0x00007f1dd5713334 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x00007f1dd570e5d8 in _L_lock_854 () from /lib64/libpthread.so.0
> #2  0x00007f1dd570e4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x0000000000b51f29 in 
> kudu::tablet::Tablet::UpdateCompactionStats(kudu::MaintenanceOpStats*) ()
> #4  0x0000000000b7f435 in 
> kudu::tablet::CompactRowSetsOp::UpdateStats(kudu::MaintenanceOpStats*) ()
> #5  0x00000000023956e4 in kudu::MaintenanceManager::FindBestOp() ()
> #6  0x0000000002396af9 in 
> kudu::MaintenanceManager::FindAndLaunchOp(std::unique_lock<kudu::Mutex>*) ()
> #7  0x0000000002397858 in kudu::MaintenanceManager::RunSchedulerThread() ()
> {code}
> And a compaction thread running the rowset-picking algorithm, which happens 
> while the {{compact_select_lock_}} is held:
> {code:java}
> Thread 125 (Thread 0x7f1dce377700 (LWP 43794)):
> #0  0x0000000000ba7b90 in 
> kudu::tablet::BudgetedCompactionPolicy::RunApproximation(std::vector<kudu::tablet::RowSetInfo,
>  std::allocator<kudu::tablet::RowSetInfo> > const&, 
> std::vector<kudu::tablet::RowSetInfo, 
> std::allocator<kudu::tablet::RowSetInfo> > const&, std::vector<double, 
> std::allocator<double> >*, 
> kudu::tablet::BudgetedCompactionPolicy::SolutionAndValue*) const ()
> #1  0x0000000000ba8f5d in 
> kudu::tablet::BudgetedCompactionPolicy::PickRowSets(kudu::tablet::RowSetTree 
> const&, std::unordered_set<kudu::tablet::RowSet const*, 
> std::hash<kudu::tablet::RowSet const*>, std::equal_to<kudu::tablet::RowSet 
> const*>, std::allocator<kudu::tablet::RowSet const*> >*, double*, 
> std::vector<std::basic_string<char, std::char_traits<char>, 
> std::allocator<char> >, std::allocator<std::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > > >*) ()
> #2  0x0000000000b5b1a1 in 
> kudu::tablet::Tablet::PickRowSetsToCompact(kudu::tablet::RowSetsInCompaction*,
>  int) const ()
> #3  0x0000000000b64e93 in kudu::tablet::Tablet::Compact(int) ()
> #4  0x0000000000b81071 in kudu::tablet::CompactRowSetsOp::Perform() ()
> #5  0x0000000002399c77 in 
> kudu::MaintenanceManager::LaunchOp(kudu::MaintenanceOp*) ()
> #6  0x00000000023f994c in kudu::ThreadPool::DispatchThread() ()
> #7  0x00000000023f3f8b in kudu::Thread::SuperviseThread(void*) ()
> #8  0x00007f1dd570caa1 in start_thread () from /lib64/libpthread.so.0
> #9  0x00007f1dd3b18bcd in clone () from /lib64/libc.so.6
> {code}
> It seems the computing of stats is being blocked by on-going compactions, 
> contending with the {{compact_select_lock_}}. Additionally, we're holding the 
> maintenance manager's {{lock_}} member, for the duration of us computing 
> stats, which is contending with the registration of maintenance manager ops. 
> So the compaction selection is blocking the scheduler thread, and is thus 
> effectively blocking the registration of many tablet replicas' ops, and 
> blocking further bootstrapping.
> A couple things come to mind with regards to the registration issue:
>  - We could probably take a snapshot of the ops under lock and release the 
> lock_ when finding the best op to run, so op registration would be somewhat 
> independent of scheduling.
>  - Additionally, we may want to consider disabling compactions entirely until 
> the initial set of tablets finishes bootstrapping.
> It's worth noting that it isn't the act of compacting that is contending 
> directly with registration per se, but rather the computation of the stats. 
> The computation of stats happens to be contending with on-going compactions. 
> Improving either contention would likely help.
> As a workaround, we used the {{set_flag}} tool to disable compactions on the 
> node and noted significantly faster bootstrapping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to