[ https://issues.apache.org/jira/browse/KUDU-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Henke updated KUDU-2244:
------------------------------
    Labels: performance  (was: )

> spinlock contention in raft_consensus
> -------------------------------------
>
>                 Key: KUDU-2244
>                 URL: https://issues.apache.org/jira/browse/KUDU-2244
>             Project: Kudu
>          Issue Type: Improvement
>          Components: consensus
>            Reporter: Andrew Wong
>            Priority: Major
>              Labels: performance
>
> I was going through the logs of a cluster that was seeing a bunch of
> kernel_stack_watchdog traces, and the slowness seemed to be caused by a lot
> of activity in consensus requests. E.g.
>
> W1214 18:57:29.514219 36138 kernel_stack_watchdog.cc:145] Thread 36317 stuck at
> /data/jenkins/workspace/generic-package-centos64-7-0-impala/topdir/BUILD/kudu-1.3.0-cdh5.11.0/src/kudu/rpc/outbound_call.cc:192
> for 123ms:
> Kernel stack:
> [<ffffffff810b7ef5>] sys_sched_yield+0x65/0xd0
> [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> User stack:
>     @     0x7f72fab92057  __GI___sched_yield
>     @          0x19498bf  kudu::Thread::StartThread()
>     @          0x1952e7d  kudu::ThreadPool::CreateThreadUnlocked()
>     @          0x19534d3  kudu::ThreadPool::Submit()
>     @          0x1953a27  kudu::ThreadPool::SubmitFunc()
>     @          0x1953ecb  kudu::ThreadPool::SubmitClosure()
>     @           0x9c94ec  kudu::consensus::RaftConsensus::ElectionCallback()
>     @           0x9e6032  kudu::consensus::LeaderElection::CheckForDecision()
>     @           0x9e78c3  kudu::consensus::LeaderElection::VoteResponseRpcCallback()
>     @           0xa8b137  kudu::rpc::OutboundCall::CallCallback()
>     @           0xa8c2bc  kudu::rpc::OutboundCall::SetResponse()
>     @           0xa822c0  kudu::rpc::Connection::HandleCallResponse()
>     @           0xa83ffc  ev::base<>::method_thunk<>()
>     @          0x198a07f  ev_invoke_pending
>     @          0x198af71  ev_run
>     @           0xa5e049  kudu::rpc::ReactorThread::RunThread()
>
> So it seemed to be caused by some slowness in getting threads. Upon perusing
> the logs a bit more, there were a sizable number of spinlock profiling traces:
>
> W1214 18:54:27.897955 36379 rpcz_store.cc:238] Trace:
> 1214 18:54:26.766922 (+     0us) service_pool.cc:143] Inserting onto call queue
> 1214 18:54:26.771135 (+  4213us) service_pool.cc:202] Handling call
> 1214 18:54:26.771138 (+     3us) raft_consensus.cc:1126] Updating replica for 0 ops
> 1214 18:54:27.897699 (+1126561us) raft_consensus.cc:1165] Early marking committed up to index 0
> 1214 18:54:27.897700 (+     1us) raft_consensus.cc:1170] Triggering prepare for 0 ops
> 1214 18:54:27.897701 (+     1us) raft_consensus.cc:1282] Marking committed up to 1766
> 1214 18:54:27.897702 (+     1us) raft_consensus.cc:1332] Filling consensus response to leader.
> 1214 18:54:27.897736 (+    34us) spinlock_profiling.cc:255] Waited 991 ms on lock 0x120b3540. stack: 00000000019406c5 00000000009c60d7 00000000009c75f7 00000000007dc628 0000000000a7adfc 0000000000a7b9cd 000000000194d059 00007f72fbcc2dc4 00007f72fabad1cc ffffffffffffffff
> 1214 18:54:27.897737 (+     1us) raft_consensus.cc:1327] UpdateReplicas() finished
> 1214 18:54:27.897741 (+     4us) inbound_call.cc:130] Queueing success response
> Metrics: {"spinlock_wait_cycles":2478395136}
>
> Each of the traces noted on the order of 500-1000ms of waiting on spinlocks.
> Upon looking at raft_consensus.cc, it seems we're holding a spinlock
> (update_lock_) while we call RaftConsensus::UpdateReplica(), which, according
> to its header, "won't return until all operations have been stored in the log
> and all Prepares() have been completed". While locking may be necessary, it's
> worth considering using a different kind of lock here.
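To make the pathology concrete, below is a minimal, self-contained C++ sketch of the
pattern the report describes. All names here (SpinLock, ReplicaSketch,
AppendToLogAndPrepare) are hypothetical stand-ins, not Kudu's actual classes; the
point is only to show why holding a spinlock across a long blocking call (log
append plus Prepare) makes every waiter burn CPU, whereas a blocking lock such as
std::mutex puts waiters to sleep for the duration.

    // Sketch only: hypothetical names, not Kudu's real code.
    #include <atomic>
    #include <mutex>
    #include <thread>

    class SpinLock {
     public:
      void lock() {
        // Busy-waits: each blocked caller keeps a core occupied while the
        // holder does log I/O. This is the time that shows up in the
        // spinlock_wait_cycles metric and the "Waited 991 ms" trace above.
        while (flag_.test_and_set(std::memory_order_acquire)) {
          std::this_thread::yield();
        }
      }
      void unlock() { flag_.clear(std::memory_order_release); }

     private:
      std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
    };

    class ReplicaSketch {
     public:
      // Shape of the reported problem: the spinlock is held for the whole
      // update, which can take hundreds of milliseconds.
      void UpdateReplicaSpin() {
        std::lock_guard<SpinLock> l(update_spinlock_);
        AppendToLogAndPrepare();  // long, blocking work under a spinlock
      }

      // One alternative the report hints at: a blocking mutex lets waiters
      // sleep instead of spinning, trading a small wakeup latency for CPU.
      void UpdateReplicaMutex() {
        std::lock_guard<std::mutex> l(update_mutex_);
        AppendToLogAndPrepare();
      }

     private:
      void AppendToLogAndPrepare() { /* stands in for log append + Prepare */ }

      SpinLock update_spinlock_;
      std::mutex update_mutex_;
    };

    int main() {
      ReplicaSketch r;
      std::thread t1([&] { r.UpdateReplicaSpin(); });
      std::thread t2([&] { r.UpdateReplicaMutex(); });
      t1.join();
      t2.join();
    }

Under contention, the spinlock variant behaves exactly as in the traces above:
waiters spin or yield for the full duration of the holder's I/O, while the mutex
variant would have them scheduled out instead.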