[ https://issues.apache.org/jira/browse/KUDU-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Percy updated KUDU-1096:
-----------------------------
    Parent: KUDU-434

> Re-replication support for Kudu beta
> ------------------------------------
>
>                 Key: KUDU-1096
>                 URL: https://issues.apache.org/jira/browse/KUDU-1096
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: consensus
>    Affects Versions: Feature Complete
>            Reporter: Mike Percy
>            Assignee: Mike Percy
>            Priority: Critical
>
> We want to add initial support for re-replication for the beta release.
> Design: 
> # When a leader detects that a follower has fallen so far behind that it 
> can no longer catch up, it will trigger a "remove server" config change.
> # When the master gets a report from a tablet and sees that the number of 
> replicas in the config is less than the table's desired replication factor, 
> it will itself start a task to create a new replica (sketched below).
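> A minimal sketch of the master-side check in the second design point. The 
> names here (TabletReport, the start_add_replica_task callback) are 
> illustrative assumptions, not Kudu's actual master internals:
> {code:cpp}
> // Hypothetical sketch: none of these types are Kudu's real master code.
> #include <functional>
> #include <string>
> #include <vector>
>
> struct TabletReport {
>   std::string tablet_id;
>   std::vector<std::string> replica_uuids;  // members of the reported Raft config
> };
>
> // Invoked when the master processes a tablet report from a tserver. If the
> // reported config has fewer replicas than the table wants, kick off an
> // async task to create a new replica.
> void ProcessTabletReport(
>     const TabletReport& report, int desired_replication_factor,
>     const std::function<void(const std::string&)>& start_add_replica_task) {
>   if (static_cast<int>(report.replica_uuids.size()) < desired_replication_factor) {
>     start_add_replica_task(report.tablet_id);
>   }
> }
> {code}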
> Details:
> # Let's start with choosing randomly among any tservers whose most recent 
> heartbeat arrived within the last 3 heartbeat periods, as a reasonable proxy 
> for "live tservers" (see the sketch after this list). Later we can do 
> something smarter, like "power of two choices" or load-aware placement. 
> Random placement isn't optimal, but it also has the least risk of causing 
> weird emergent behavior.
> # The master task will call AddServer() to add the newly selected replica 
> to the tablet's Raft config.
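> A minimal sketch of the liveness filter and random choice from the first 
> detail above; the TServerEntry type and heartbeat_period parameter are 
> assumptions for illustration:
> {code:cpp}
> // Hypothetical sketch of random placement among "live" tservers, i.e. those
> // whose most recent heartbeat arrived within the last 3 heartbeat periods.
> #include <chrono>
> #include <optional>
> #include <random>
> #include <string>
> #include <vector>
>
> using Clock = std::chrono::steady_clock;
>
> struct TServerEntry {
>   std::string uuid;
>   Clock::time_point last_heartbeat;
> };
>
> std::optional<std::string> PickRandomLiveTServer(
>     const std::vector<TServerEntry>& tservers,
>     Clock::duration heartbeat_period, std::mt19937& rng) {
>   const Clock::time_point cutoff = Clock::now() - 3 * heartbeat_period;
>   std::vector<const TServerEntry*> live;
>   for (const TServerEntry& ts : tservers) {
>     if (ts.last_heartbeat >= cutoff) live.push_back(&ts);
>   }
>   if (live.empty()) return std::nullopt;  // no live tserver to place on
>   std::uniform_int_distribution<size_t> pick(0, live.size() - 1);
>   // The master task would then issue an AddServer() config change for the
>   // tablet, naming this uuid as the new replica.
>   return live[pick(rng)]->uuid;
> }
> {code}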
> Additional possible refinements:
> # We should also trigger this same process if the leader detects that it 
> hasn't had a successful request round-trip to a follower in N heartbeat 
> periods.
> # We should build in some safety net here for the case where the follower 
> is actually still in the middle of bootstrapping and making progress; 
> otherwise we could flap, repeatedly evicting and re-adding a replica that 
> would have caught up on its own.
> # We probably want to prohibit the leader from doing this unless it knows 
> it's still within its "lease period". Otherwise, if we are already down to 
> a 2-node config and the leader itself has some issue, we might too easily 
> drop to a 1-node config. (All three refinements are combined in the sketch 
> below.)
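> A minimal sketch combining the three refinements into one leader-side 
> guard; all field and function names here are illustrative, not Kudu's 
> consensus implementation:
> {code:cpp}
> // Hypothetical leader-side guard: decide whether to trigger a
> // RemoveServer() config change for a given follower.
> #include <cstdint>
>
> struct FollowerState {
>   int64_t periods_since_success;  // heartbeat periods since last successful request
>   bool is_bootstrapping;          // follower is still copying/replaying data
>   int64_t last_acked_index;       // highest op index the follower has acked
>   int64_t prev_acked_index;       // same value observed at the previous check
> };
>
> bool ShouldEvictFollower(bool leader_lease_valid, const FollowerState& f,
>                          int64_t n_periods_threshold) {
>   // Refinement 3: never evict unless we know we still hold the leader
>   // lease; otherwise a 2-node config could too easily drop to 1 node.
>   if (!leader_lease_valid) return false;
>   // Refinement 2: anti-flapping safety net; leave alone a bootstrapping
>   // follower that is still making progress.
>   if (f.is_bootstrapping && f.last_acked_index > f.prev_acked_index) return false;
>   // Refinement 1: evict after N heartbeat periods without a successful request.
>   return f.periods_since_success >= n_periods_threshold;
> }
> {code}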
> Pros:
> * A fairly simple, easy-to-implement approach to re-replication.
> Cons:
> * Availability is less than optimal. For example, if a follower is slow 
> enough to fall behind the log, causing the leader to remove it from the 
> Raft config, and the leader itself simultaneously fails (e.g. due to a bad 
> disk), then administrator intervention will be required to bring the tablet 
> back online, since the only remaining replica will be unable to get elected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
