[ 
https://issues.apache.org/jira/browse/KUDU-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425882#comment-17425882
 ] 

ASF subversion and git services commented on KUDU-1620:
-------------------------------------------------------

Commit 3884a6388b2696a826b8903144ae555faa595473 in kudu's branch 
refs/heads/master from Andrew Wong
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=3884a63 ]

[consensus] KUDU-1620: re-resolve consensus peers on network error

This plumbs the work from KUDU-75 into the long-lived consensus proxy,
allowing Raft peers to re-resolve on error.

This has the knock-on effect that masters starting up also re-resolve
other masters' addresses when attempting to fetch UUIDs, since this
process also uses consensus proxies.

Change-Id: Ibd1b68c3c14d7d8f81168e16fe450d2ffcce840b
Reviewed-on: http://gerrit.cloudera.org:8080/17868
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <aser...@cloudera.com>


> Consensus peer proxy hostnames should be reresolved on failure
> --------------------------------------------------------------
>
>                 Key: KUDU-1620
>                 URL: https://issues.apache.org/jira/browse/KUDU-1620
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>    Affects Versions: 1.0.0
>            Reporter: Adar Dembo
>            Priority: Major
>              Labels: docker
>
> Noticed this while documenting the workflow to replace a dead master, which 
> currently bypasses Raft config changes in favor of having the replacement 
> master "masquerade" as the dead master via DNS changes.
> Internally we never rebuild consensus peer proxies in the event of network 
> failure; we assume that the peer will return at the same location. Nominally 
> this is reasonable; allowing peers to change host/port information on the fly 
> is tricky and has yet to be implemented. But, we should at least retry the 
> DNS resolution; not doing so forces the workflow to include steps to restart 
> the existing masters, which creates a (small) availability outage.
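
The behavior requested above can be illustrated with a minimal sketch. It is
not Kudu's implementation (Kudu is written in C++, and the actual change
lives in its consensus proxy code); the class name ReResolvingProxy, the
injected send and resolve callables, and the host, port, and IP values used
in the usage note are all hypothetical. The assumed policy is: cache the
resolved address, and on a network error drop the cache so that the next
call (for example, the next heartbeat) resolves the hostname again.

```python
import socket
from typing import Any, Callable, Optional, Tuple

Address = Tuple[str, int]


def dns_resolve(host: str, port: int) -> Address:
    """Resolve host via the system resolver and return the first result."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    ip, resolved_port = infos[0][4][:2]
    return ip, resolved_port


class ReResolvingProxy:
    """Hypothetical long-lived proxy that re-resolves its peer after errors."""

    def __init__(
        self,
        host: str,
        port: int,
        send: Callable[[Address, Any], Any],
        resolve: Callable[[str, int], Address] = dns_resolve,
    ) -> None:
        self._host = host
        self._port = port
        self._send = send        # assumed transport function: (address, request) -> response
        self._resolve = resolve  # injectable so the lookup can be faked in tests
        self._addr: Optional[Address] = None

    def call(self, request: Any) -> Any:
        # Resolve lazily: on first use, or after a previous network error.
        if self._addr is None:
            self._addr = self._resolve(self._host, self._port)
        try:
            return self._send(self._addr, request)
        except OSError:
            # Network failure: forget the cached address so the next call
            # performs a fresh lookup, then let the caller decide when to retry.
            self._addr = None
            raise
```

With this shape, a caller that retries periodically picks up a DNS change
without restarting the process: the first call after the failure triggers a
new lookup, and later successful calls reuse the cached address.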



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
