[ 
https://issues.apache.org/jira/browse/KUDU-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362395#comment-17362395
 ] 

ASF subversion and git services commented on KUDU-2302:
-------------------------------------------------------

Commit f9647149a49ddb87ea0ecf069eab3b5ec0217136 in kudu's branch 
refs/heads/master from Andrew Wong
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=f964714 ]

[consensus] KUDU-2302: don't crash if new leader can't resolve peer

When a tablet replica is elected leader, it constructs a Peer object for
each replica in the Raft config so that it can send RPCs to each one.
If, during this construction, any remote peer could not be reached (for
example, because its address failed to resolve), the new leader crashed.

Rather than crashing, this patch allows us to start Peers without a
proxy, and retries constructing the proxy the next time a proxy is
required.

Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7
Reviewed-on: http://gerrit.cloudera.org:8080/17585
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <aser...@cloudera.com>
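
The approach described in the commit message above (start a Peer without a
proxy, then retry building the proxy the next time one is needed) can be
sketched roughly as follows. This is a minimal, self-contained illustration and
not the code from the patch: Status, Proxy, ProxyFactory and LazyPeer are
hypothetical stand-ins invented for this example, and the real Kudu classes
have different names and interfaces.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a status type; not Kudu's real Status class.
struct Status {
  bool ok;
  std::string message;
  static Status OK() { return {true, ""}; }
  static Status NetworkError(std::string msg) { return {false, std::move(msg)}; }
};

// Hypothetical stand-in for an RPC proxy bound to a resolved address.
struct Proxy {
  std::string resolved_addr;
};

// Hypothetical factory: resolves the host and builds a proxy, or returns
// nullptr if resolution fails.
using ProxyFactory =
    std::function<std::unique_ptr<Proxy>(const std::string& host)>;

class LazyPeer {
 public:
  LazyPeer(std::string host, ProxyFactory factory)
      : host_(std::move(host)), factory_(std::move(factory)) {
    // Best-effort attempt at construction time. A failure here is not fatal;
    // the peer simply starts without a proxy.
    proxy_ = factory_(host_);
  }

  // Called whenever an RPC needs to go out. If no proxy exists yet, try to
  // build one now; if that still fails, report the RPC as failed.
  Status SendRequest() {
    if (!proxy_) {
      proxy_ = factory_(host_);
      if (!proxy_) {
        return Status::NetworkError("could not resolve " + host_);
      }
    }
    assert(proxy_ != nullptr);
    ++requests_sent_;  // Placeholder for actually sending the RPC.
    return Status::OK();
  }

  bool has_proxy() const { return proxy_ != nullptr; }
  int requests_sent() const { return requests_sent_; }

 private:
  std::string host_;
  ProxyFactory factory_;
  std::unique_ptr<Proxy> proxy_;
  int requests_sent_ = 0;
};
```

In this sketch, a caller that constructs a LazyPeer while resolution is failing
gets a peer whose SendRequest() returns a network error instead of aborting the
process. Once the factory starts succeeding, the next SendRequest() builds the
proxy and proceeds normally.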


> Leader crashes if it can't resolve DNS address of a peer
> --------------------------------------------------------
>
>                 Key: KUDU-2302
>                 URL: https://issues.apache.org/jira/browse/KUDU-2302
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus, master, tserver
>    Affects Versions: 1.6.0, 1.7.0, 1.8.0, 1.7.1, 1.9.0, 1.10.0, 1.10.1, 
> 1.11.0, 1.12.0, 1.11.1, 1.13.0, 1.14.0
>            Reporter: Todd Lipcon
>            Assignee: Andrew Wong
>            Priority: Critical
>              Labels: crash, roadmap-candidate, stability
>
> In BecomeLeader we call:
> {code}
>  CHECK_OK(BecomeLeaderUnlocked());
> {code}
> The CHECK_OK crashes the process if BecomeLeaderUnlocked() fails to resolve 
> the address of one of its peers. Instead, the replica should probably remain 
> leader and treat RPC attempts to that peer as failed due to network 
> resolution, retrying resolution periodically.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
