[ https://issues.apache.org/jira/browse/KUDU-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mike Percy updated KUDU-2433: ----------------------------- Summary: Possible stack-smashing bug (was: Potential flake in client-negotiation-failover-itest due to negotiation timeout) > Possible stack-smashing bug > --------------------------- > > Key: KUDU-2433 > URL: https://issues.apache.org/jira/browse/KUDU-2433 > Project: Kudu > Issue Type: Bug > Reporter: Mike Percy > Priority: Major > Attachments: client-negotiation-failover-itest.txt > > > There appears to be a rare flake in client-negotiation-failover-itest in > ClientFailoverOnNegotiationTimeoutITest.Kudu1580ConnectToTServer due to a > negotiation timeout. This appeared on one of the test dashboards: > {code:java} > I0507 20:10:39.140327 1387 raft_consensus.cc:2459] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 > FOLLOWER]: Leader pre-election lost for term 4. Reason: could not achieve > majority > W0507 20:10:39.140895 296 leader_election.cc:286] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer > 5014824b94b64fd49da2c1a645c078aa: Network error: Client connection > negotiation failed: client connection to 127.0.52.194:43942: connect: > Connection refused (error 111) > I0507 20:10:39.165484 1389 raft_consensus.cc:435] T > a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [term 3 > FOLLOWER]: Starting pre-election (detected failure of leader > b5e887ddd8a14cf3b2c24a0aa14be025) > I0507 20:10:39.165598 1389 raft_consensus.cc:2704] T > a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing > failure detection for 0.853s (starting election) > I0507 20:10:39.165699 1389 raft_consensus.cc:457] T > a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [term 3 > FOLLOWER]: Starting pre-election with config: opid_index: -1 OBSOLETE_local: > false peers { permanent_uuid: "b5e887ddd8a14cf3b2c24a0aa14be025" member_type: > VOTER last_known_addr { host: "127.0.52.195" port: 38453 } } peers { > permanent_uuid: "d1bf0f72a88f4483a1906bc8246849dd" member_type: VOTER > last_known_addr { host: "127.0.52.193" port: 46057 } } peers { > permanent_uuid: "5014824b94b64fd49da2c1a645c078aa" member_type: VOTER > last_known_addr { host: "127.0.52.194" port: 43942 } } > I0507 20:10:39.166122 1389 leader_election.cc:230] T > a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer > b5e887ddd8a14cf3b2c24a0aa14be025 > I0507 20:10:39.166318 1389 leader_election.cc:230] T > a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer > 5014824b94b64fd49da2c1a645c078aa > W0507 20:10:39.166895 297 leader_election.cc:286] T > a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer > b5e887ddd8a14cf3b2c24a0aa14be025: Timed out: RequestConsensusVote RPC to > 127.0.52.195:38453 timed out after 0.633s (SENT) > I0507 20:10:39.166954 297 leader_election.cc:257] T > a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: Election decided. Result: candidate lost. > W0507 20:10:39.167076 296 leader_election.cc:286] T > a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer > 5014824b94b64fd49da2c1a645c078aa: Network error: Client connection > negotiation failed: client connection to 127.0.52.194:43942: connect: > Connection refused (error 111) > I0507 20:10:39.167102 1389 raft_consensus.cc:2704] T > a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing > failure detection for 0.685s (election complete) > I0507 20:10:39.167136 1389 raft_consensus.cc:2459] T > a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [term 3 > FOLLOWER]: Leader pre-election lost for term 4. Reason: could not achieve > majority > W0507 20:10:39.179502 660 negotiation.cc:313] Failed RPC negotiation. Trace: > 0507 20:10:36.182479 (+ 0us) reactor.cc:577] Submitting negotiation task for > client connection to 127.0.52.195:38453 > 0507 20:10:36.182694 (+ 215us) negotiation.cc:98] Waiting for socket to > connect > 0507 20:10:36.182707 (+ 13us) client_negotiation.cc:160] Beginning negotiation > 0507 20:10:36.182797 (+ 90us) client_negotiation.cc:237] Sending NEGOTIATE > NegotiatePB request > 0507 20:10:39.179166 (+2996369us) negotiation.cc:304] Negotiation complete: > Timed out: Client connection negotiation failed: client connection to > 127.0.52.195:38453 > Metrics: {"client-negotiator.queue_time_us":63} > W0507 20:10:39.179760 659 meta_cache.cc:216] Tablet > a0376a5e3b9f495aa00d366b35594a27: Replica b5e887ddd8a14cf3b2c24a0aa14be025 > (127.0.52.195:38453) has failed: Timed out: Client connection negotiation > failed: client connection to 127.0.52.195:38453 > W0507 20:10:39.242803 297 leader_election.cc:286] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer > b5e887ddd8a14cf3b2c24a0aa14be025: Timed out: RequestConsensusVote RPC to > 127.0.52.195:38453 timed out after 0.101s (SENT) > I0507 20:10:39.242909 297 leader_election.cc:257] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: Election decided. Result: candidate lost. > I0507 20:10:39.243211 1389 raft_consensus.cc:2704] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing > failure detection for 0.097s (election complete) > I0507 20:10:39.243288 1389 raft_consensus.cc:2459] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 > FOLLOWER]: Leader pre-election lost for term 4. Reason: could not achieve > majority > W0507 20:10:39.263934 297 leader_election.cc:286] T > d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 5 pre-election: RPC error from VoteRequest() call to peer > b5e887ddd8a14cf3b2c24a0aa14be025: Timed out: RequestConsensusVote RPC to > 127.0.52.195:38453 timed out after 0.975s (SENT) > I0507 20:10:39.264031 297 leader_election.cc:257] T > d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 5 pre-election: Election decided. Result: candidate lost. > I0507 20:10:39.264344 1389 raft_consensus.cc:2704] T > d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing > failure detection for 0.720s (election complete) > I0507 20:10:39.264427 1389 raft_consensus.cc:2459] T > d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd [term 4 > FOLLOWER]: Leader pre-election lost for term 5. Reason: could not achieve > majority > I0507 20:10:39.340955 1389 raft_consensus.cc:435] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 > FOLLOWER]: Starting pre-election (detected failure of leader > b5e887ddd8a14cf3b2c24a0aa14be025) > I0507 20:10:39.341055 1389 raft_consensus.cc:2704] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing > failure detection for 0.311s (starting election) > I0507 20:10:39.341125 1389 raft_consensus.cc:457] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 > FOLLOWER]: Starting pre-election with config: opid_index: -1 OBSOLETE_local: > false peers { permanent_uuid: "5014824b94b64fd49da2c1a645c078aa" member_type: > VOTER last_known_addr { host: "127.0.52.194" port: 43942 } } peers { > permanent_uuid: "d1bf0f72a88f4483a1906bc8246849dd" member_type: VOTER > last_known_addr { host: "127.0.52.193" port: 46057 } } peers { > permanent_uuid: "b5e887ddd8a14cf3b2c24a0aa14be025" member_type: VOTER > last_known_addr { host: "127.0.52.195" port: 38453 } } > I0507 20:10:39.341584 1389 leader_election.cc:230] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer > 5014824b94b64fd49da2c1a645c078aa > I0507 20:10:39.341753 1389 leader_election.cc:230] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer > b5e887ddd8a14cf3b2c24a0aa14be025 > W0507 20:10:39.342707 296 leader_election.cc:286] T > 7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd > [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer > 5014824b94b64fd49da2c1a645c078aa: Network error: Client connection > negotiation failed: client connection to 127.0.52.194:43942: connect: > Connection refused (error 111) > W0507 20:10:39.443085 659 batcher.cc:348] Timed out: Failed to write batch of > 1 ops to tablet a0376a5e3b9f495aa00d366b35594a27 after 261 attempt(s): Failed > to write to server: (no server available): Write(tablet: > a0376a5e3b9f495aa00d366b35594a27, num_ops: 1, num_attempts: 261) passed its > deadline: Illegal state: Replica d1bf0f72a88f4483a1906bc8246849dd is not > leader of this config. Role: FOLLOWER. Consensus state: current_term: 3 > leader_uuid: "b5e887ddd8a14cf3b2c24a0aa14be025" committed_config { > opid_index: -1 OBSOLETE_local: false peers { permanent_uuid: > "b5e887ddd8a14cf3b2c24a0aa14be025" member_type: VOTER last_known_addr { host: > "127.0.52.195" port: 38453 } } peers { permanent_uuid: > "d1bf0f72a88f4483a1906bc8246849dd" member_type: VOTER last_known_addr { host: > "127.0.52.193" port: 46057 } } peers { permanent_uuid: > "5014824b94b64fd49da2c1a645c078aa" member_type: VOTER last_known_addr { host: > "127.0.52.194" port: 43942 } } } > /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/integration-tests/client-negotiation-failover-itest.cc:186: > Failure > Failed > Bad status: IO error: Some errors occurred > ... > [ FAILED ] ClientFailoverOnNegotiationTimeoutITest.Kudu1580ConnectToTServer > (325968 ms){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)