[ https://issues.apache.org/jira/browse/KUDU-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Henke updated KUDU-2720:
------------------------------
    Component/s: perf

> Improve concurrency of ResultTracker
> ------------------------------------
>
>                 Key: KUDU-2720
>                 URL: https://issues.apache.org/jira/browse/KUDU-2720
>             Project: Kudu
>          Issue Type: Improvement
>          Components: perf
>    Affects Versions: 1.10.0
>            Reporter: William Berkeley
>            Priority: Major
>
> Running a workload that's pushing many small batches from many clients, I see
> a lot of contention on the spinlock in the ResultTracker:
> {noformat}
> Stacks at 0228 14:19:29.339088 (service queue overflowed for
> kudu.tserver.TabletServerService):
> tids=[17223]
>     0x379ba0f710 <unknown>
>     0x89ee80 <unknown>
>     0x1fb8f72 base::internal::SpinLockDelay()
>     0x1fb8ea7 base::SpinLock::SlowLock()
>     0x1e138dc kudu::rpc::ResultTracker::TrackRpc()
>     0x1e289e5 kudu::rpc::GeneratedServiceIf::Handle()
>     0x1e2935a kudu::rpc::ServicePool::RunThread()
>     0x1f9bd91 kudu::Thread::SuperviseThread()
>     0x379ba079d1 start_thread
>     0x379b6e88fd clone
> ...
> tids=[5695,5673]
>     0x379ba0f710 <unknown>
>     0x1fb900a base::internal::SpinLockDelay()
>     0x1fb8ea7 base::SpinLock::SlowLock()
>     0x1e11b60 kudu::rpc::ResultTracker::IsCurrentDriver()
>     0xaaaf16 kudu::tablet::TransactionDriver::Prepare()
>     0xaabbdd kudu::tablet::TransactionDriver::PrepareTask()
>     0x1fa32dd kudu::ThreadPool::DispatchThread()
>     0x1f9bd91 kudu::Thread::SuperviseThread()
>     0x379ba079d1 start_thread
>     0x379b6e88fd clone
>
> tids=[5689,5696,5693,5692,5691,5690,5698,5688,5681,5682,5683,5685,5686,5687,5700,5669,5668,5667,5714,5704,5703,5702,5701,5697,5670,5665,5699,5664,5671,5672,5680]
>     0x379ba0f710 <unknown>
>     0x1fb900a base::internal::SpinLockDelay()
>     0x1fb8ea7 base::SpinLock::SlowLock()
>     0x1e11bcc kudu::rpc::ResultTracker::RecordCompletionAndRespond()
>     0x1e15e6c kudu::rpc::RpcContext::RespondSuccess()
>     0xaad024 kudu::tablet::TransactionDriver::Finalize()
>     0xaad531 kudu::tablet::TransactionDriver::ApplyTask()
>     0x1fa32dd kudu::ThreadPool::DispatchThread()
>     0x1f9bd91 kudu::Thread::SuperviseThread()
>     0x379ba079d1 start_thread
>     0x379b6e88fd clone
> {noformat}
> The lock in this case is being held by
> {noformat}
> tids=[5679]
>     0x379ba0f710 <unknown>
>     0x212f81b google::protobuf::Message::SpaceUsedLong()
>     0x1e11f2f kudu::rpc::ResultTracker::RecordCompletionAndRespond()
>     0x1e15e6c kudu::rpc::RpcContext::RespondSuccess()
>     0xaad024 kudu::tablet::TransactionDriver::Finalize()
>     0xaad531 kudu::tablet::TransactionDriver::ApplyTask()
>     0x1fa32dd kudu::ThreadPool::DispatchThread()
>     0x1f9bd91 kudu::Thread::SuperviseThread()
>     0x379ba079d1 start_thread
>     0x379b6e88fd clone
> {noformat}
> KUDU-1622 contained some suggestions for improving the ResultTracker. Some
> were implemented, but maybe we should consider implementing other suggestions
> there.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
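
Illustrative note on the stacks above: the holder thread is inside SpaceUsedLong() while it owns the ResultTracker lock, so every waiter pays for that size computation. One general way to shorten such a critical section is to compute the size before taking the lock and do only the bookkeeping under it. The sketch below is a hypothetical, self-contained illustration of that pattern. It is not Kudu's code and not necessarily one of the KUDU-1622 suggestions. Response, SketchTracker, SpaceUsed() and RecordCompletion() are invented names, and std::mutex stands in for the spinlock.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a response message; not a Kudu or protobuf type.
struct Response {
  std::string payload;

  // Stand-in for a potentially expensive size computation
  // (the role SpaceUsedLong() plays in the stack trace).
  int64_t SpaceUsed() const {
    return static_cast<int64_t>(sizeof(*this) + payload.capacity());
  }
};

// Hypothetical tracker, for illustration only.
class SketchTracker {
 public:
  void RecordCompletion(int64_t seq_no, Response resp) {
    // Measure the response before taking the lock, so the critical
    // section below contains only map and counter updates.
    const int64_t bytes = resp.SpaceUsed();

    std::lock_guard<std::mutex> l(lock_);
    auto it = completed_.find(seq_no);
    if (it != completed_.end()) {
      // Replacing an existing entry: drop its old contribution first.
      tracked_bytes_ -= it->second.bytes;
      it->second = Entry{std::move(resp), bytes};
    } else {
      completed_.emplace(seq_no, Entry{std::move(resp), bytes});
    }
    tracked_bytes_ += bytes;
    assert(tracked_bytes_ >= 0);
  }

  int64_t tracked_bytes() const {
    std::lock_guard<std::mutex> l(lock_);
    return tracked_bytes_;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> l(lock_);
    return completed_.size();
  }

 private:
  struct Entry {
    Response resp;
    int64_t bytes;  // Size cached at record time, measured outside the lock.
  };

  mutable std::mutex lock_;
  std::unordered_map<int64_t, Entry> completed_;
  int64_t tracked_bytes_ = 0;
};
```

Caching the measured size in each entry means later accounting never has to re-measure a response under the lock. Whether this is safe in the real ResultTracker depends on whether the response can still change after it is measured, which this sketch does not model.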