My understanding is that Baseline Topology is the set of nodes that are
*expected* to be in the cluster.
Let me go a little further, because BT (or whatever name we choose) can
and will solve more issues than just auto-activation:

1) More graceful control over rebalancing than just rebalance delay. If a
server is shut down for maintenance and there are enough backup nodes in
the cluster, there is no need to rebalance.
2) A guarantee that there will be no conflicting key-value mappings due to
incorrect cluster activation. For example, consider a cluster of 10 nodes
that is shut down. The first 5 nodes are started, activated, and updated,
then shut down. The other 5 nodes are started, activated, and updated, and
then the first 5 nodes are started again. Currently, there is no way to
determine that an incompatible topology change occurred, and such a change
leads to data inconsistency.
3) When a cluster is shut down node by node, we must track which node was
the last to 'see' each partition and not activate the cluster until all
nodes are present. Otherwise, again, we may activate too early and see
outdated values.
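
To make points 1 and 2 a bit more concrete, here is a rough sketch in
Python. It is not Ignite code. The function names, the per-node "activation
history" list, and the rule "at most `backups` baseline nodes may be
absent" are all assumptions I am making for illustration only.

```python
# Illustrative sketch only: none of these names exist in Ignite.

def can_skip_rebalance(baseline, online, backups):
    """Point 1 (assumed rule): if no more than `backups` baseline nodes are
    absent, every partition still has at least one live copy, so the
    cluster can wait for the nodes to return instead of rebalancing."""
    missing = baseline - online
    return len(missing) <= backups


def histories_compatible(a, b):
    """Point 2 (assumed mechanism): each node keeps the ordered list of
    activation ids it took part in. Two nodes are compatible only if one
    history is a prefix of the other, i.e. they never diverged."""
    shorter, longer = sorted((a, b), key=len)
    return longer[:len(shorter)] == shorter


baseline = {f"n{i}" for i in range(10)}

# One baseline node is down for maintenance, one backup is configured.
maintenance_ok = can_skip_rebalance(baseline, baseline - {"n3"}, backups=1)

# The 10-node scenario: both halves saw activation "a0", then each half
# was activated separately ("a1" and "a2") and received its own updates.
first_half_history = ["a0", "a1"]
second_half_history = ["a0", "a2"]
halves_ok = histories_compatible(first_half_history, second_half_history)

print(maintenance_ok)  # True
print(halves_ok)       # False
```

The prefix check is just the simplest way I could think of to show that
the branching in point 2 is detectable once nodes remember what they were
activated with. The real design may well track this differently.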

I do not want to add any 'faster' hacks here because they will only make
the issues above more likely to appear. Besides, BT should be available in
2.2 anyway, so there is no need to rush with hacks.

--AG

2017-08-03 15:09 GMT+03:00 Yakov Zhdanov <yzhda...@apache.org>:

> >Obvious connotation of "minimal set" is a set that cannot be decreased.
>
> >But lets consider the following case: user has a cluster of 50 nodes and
> >decides to switch off 3 nodes for maintenance for a while. Ok, user just
> >does it and then recreates this "minimal node set" to only 47 nodes.
>
> >So initial minimal node set was decreased - something counter-intuitive to
> >me and may cause confusion as well.
>
> That was my point. If I have 50 nodes and 3 backups I can restart on 48, 49
> and 50 without data loss. In case of 48 and 49 after cluster gets activated
> missing backups are assigned and rebalancing starts.
>
> --Yakov
>
