Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16934 )
Change subject: [util] fix another lock contention in MaintenanceManager
......................................................................

[util] fix another lock contention in MaintenanceManager

I had a chance to look at stack traces in tservers' diagnostic log
files at a Kudu cluster with a high data ingest rate. There were many
stack snapshots captured periodically every 30 seconds, and the pattern
below was present in every consecutive snapshot for several hours.

  tids=[4016]
      0x7f64f36b05e0 <unknown>
            0xa116c6 kudu::tablet::BudgetedCompactionPolicy::RunApproximation()
            0xa129c9 kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
            0x9c8d80 kudu::tablet::Tablet::UpdateCompactionStats()
            0x9ec848 kudu::tablet::CompactRowSetsOp::UpdateStats()
           0x1b3de5c kudu::MaintenanceManager::FindBestOp()
           0x1b3f3c5 kudu::MaintenanceManager::RunSchedulerThread()
           0x1b86014 kudu::Thread::SuperviseThread()
      0x7f64f36a8e25 start_thread
      0x7f64f176f34d __clone

  tids=[48325,48324,48323]
      0x7f64f36b05e0 <unknown>
      0x7f64f36af42b __lll_lock_wait
      0x7f64f36aadcb _L_lock_812
      0x7f64f36aac98 __GI___pthread_mutex_lock
           0x1b546fd kudu::Mutex::Acquire()
           0x1b42913 kudu::MaintenanceManager::LaunchOp()
           0x1b929cd kudu::FunctionRunnable::Run()
           0x1b8fa87 kudu::ThreadPool::DispatchThread()
           0x1b86014 kudu::Thread::SuperviseThread()
      0x7f64f36a8e25 start_thread
      0x7f64f176f34d __clone

Thread 4016 above had acquired the MaintenanceManager::lock_ mutex and
was busy computing the scores for compaction candidates. Three other
threads (48325, 48324, 48323) were waiting to acquire the same mutex
upon returning from the MaintenanceManager::LaunchOp() method. These
three maintenance threads were the only threads in the 'MaintenanceMgr'
thread pool, i.e. no other threads were available to perform scheduled
compaction operations. Because these three threads were blocked after
completing their scheduled maintenance operations, they could not pick
up any new compaction tasks already scheduled by the former thread.
The more compaction candidates there were, the longer the maintenance
threads stayed blocked waiting on the compaction score computation
performed by the scheduler thread.

To relieve the contention, I updated the code to use separate mutexes
for the op-specific condition variables and the scheduler's condition
variable. Now each op-specific condition variable uses the
MaintenanceManager::running_instances_lock_, which is also used to
guard access to the MaintenanceManager::running_instances_ container.

This patch also fixes the reporting of the duration of compaction
operations. Before this patch, the timings reported for compaction
operations could be inflated by lock contention, especially under the
conditions that produce stacks like those shown above.

This patch doesn't contain a test to evaluate the performance impact
of this change: I'm planning to add one in a separate changelist.

Change-Id: I63b12dd3641ef655f8fcbbad8d8ac515d874c0fb
Reviewed-on: http://gerrit.cloudera.org:8080/16934
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/util/maintenance_manager.cc
M src/kudu/util/maintenance_manager.h
2 files changed, 172 insertions(+), 130 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/16934
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I63b12dd3641ef655f8fcbbad8d8ac515d874c0fb
Gerrit-Change-Number: 16934
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)