[
https://issues.apache.org/jira/browse/HELIX-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802496#comment-13802496
]
Kanak Biscuitwala commented on HELIX-276:
-----------------------------------------
Also, for the "quick failover" case in particular, we could temporarily honor
transition preferences until the new node has caught up, and then do the
Slave --> Master transition. State balance would be off while bootstrapping
is in progress, but it would eventually be restored. This may be a good way
to go, especially in the short term.
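As a rough illustration of the two-phase idea above, here is a minimal Java
sketch. It does not use any Helix API: the class name, the method
chooseMaster, and the notion of a "caught-up slave" set are all hypothetical,
introduced only to show the decision order (prefer the balanced target once
it is caught up, otherwise keep or pick an interim master).

```java
import java.util.SortedSet;

// Hypothetical sketch; none of these names come from Helix.
final class QuickFailoverSketch {

    /**
     * Decides which node should hold MASTER for one partition right now.
     *
     * @param idealMaster    node the balanced assignment wants as master
     * @param currentMaster  live node currently in MASTER state, or null if none
     * @param caughtUpSlaves live nodes in SLAVE state that are caught up
     *                       (how "caught up" is detected is an assumption)
     */
    static String chooseMaster(String idealMaster,
                               String currentMaster,
                               SortedSet<String> caughtUpSlaves) {
        // Phase 2: the balanced target has caught up, so its
        // Slave --> Master transition is cheap. Restore state balance.
        if (caughtUpSlaves.contains(idealMaster)) {
            return idealMaster;
        }
        // An interim master is already serving; keep it while the
        // balanced target is still bootstrapping.
        if (currentMaster != null) {
            return currentMaster;
        }
        // Phase 1: no master, but an existing slave is caught up.
        // Honor the transition preference and promote it temporarily.
        if (!caughtUpSlaves.isEmpty()) {
            return caughtUpSlaves.first();
        }
        // No caught-up replica exists; the balanced target is the only choice.
        return idealMaster;
    }
}
```

With this ordering, state balance is skewed only between the failure and the
moment the balanced target appears in the caught-up set.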
> Allow FULL_AUTO mode to favor some transitions
> ----------------------------------------------
>
> Key: HELIX-276
> URL: https://issues.apache.org/jira/browse/HELIX-276
> Project: Apache Helix
> Issue Type: Improvement
> Reporter: Matthieu Morel
> Assignee: Kanak Biscuitwala
>
> In FULL_AUTO mode, Helix computes both partitioning and states.
> Currently, in a master-replica model, when rebalancing due to a failure of
> the master node, Helix does not promote an existing replica to master, but
> instead assigns a new master (i.e., offline -> replica -> master).
> The current algorithm optimizes for minimal partition movement and even
> distribution of state. However, it should also take into account the
> priorities between states, or provide a way to customize it. For instance,
> when it is more costly (number of transitions, priorities) to perform offline
> -> master than replica -> master, the algorithm could favor replica -> master
> transitions.
> One application would be quick failover: master ops are logged to a
> journal, a replica builds its state by tailing the journal, and upon failure
> of the master, recovery is fast since only a few operations may have to be
> replayed to reach the latest state of the master. If a new node is assigned
> the master role from scratch, the whole journal must be replayed.
> More context in this thread:
> http://markmail.org/message/inq6tnlnk5ckscwr
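The preference requested in the quoted issue could be expressed as a simple
cost comparison. The Java sketch below is only an illustration and is not how
Helix computes assignments: the class, the enum, and both methods are
hypothetical, and "cost" is taken to be the number of transitions needed to
reach master, which is one of the criteria the issue mentions.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from Helix.
final class PromotionCostSketch {

    enum State { OFFLINE, REPLICA, MASTER }

    /** Number of transitions needed to reach MASTER from the given state. */
    static int transitionsToMaster(State state) {
        switch (state) {
            case MASTER:
                return 0;
            case REPLICA:
                return 1; // replica -> master
            default:
                return 2; // offline -> replica -> master
        }
    }

    /**
     * Returns the live node that can become master with the fewest
     * transitions, or null if the map is empty. Ties are broken by
     * node name so the result is deterministic.
     */
    static String cheapestMasterCandidate(Map<String, State> currentStates) {
        String best = null;
        int bestCost = Integer.MAX_VALUE;
        // TreeMap iterates in key order, which gives the name-based tie-break.
        for (Map.Entry<String, State> entry : new TreeMap<>(currentStates).entrySet()) {
            int cost = transitionsToMaster(entry.getValue());
            if (cost < bestCost) {
                best = entry.getKey();
                bestCost = cost;
            }
        }
        return best;
    }
}
```

A real rebalancer would have to weigh this cost against partition movement
and even state distribution, which this sketch ignores.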