This is an automated email from the ASF dual-hosted git repository.
hulee pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/helix.wiki.git
The following commit(s) were added to refs/heads/master by this push:
new 96617cd Created Cluster Change Detector for Helix Rebalaner (markdown)
96617cd is described below
commit 96617cd9cae892743dc668c754321e64ffd32e36
Author: Hunter Lee <[email protected]>
AuthorDate: Wed Aug 14 16:54:56 2019 -0700
Created Cluster Change Detector for Helix Rebalaner (markdown)
---
Cluster-Change-Detector-for-Helix-Rebalaner.md | 90 ++++++++++++++++++++++++++
1 file changed, 90 insertions(+)
diff --git a/Cluster-Change-Detector-for-Helix-Rebalaner.md
b/Cluster-Change-Detector-for-Helix-Rebalaner.md
new file mode 100644
index 0000000..dbb4610
--- /dev/null
+++ b/Cluster-Change-Detector-for-Helix-Rebalaner.md
@@ -0,0 +1,90 @@
+# Overview
+This document outlines the design and implementation details of the cluster
change detector for Helix rebalancers.
+# Introduction
+### What
+The distributed nature of applications requires the Helix controller to
rebalance in response to various scenarios and changes that take place in such
systems. Currently, Helix uses ZooKeeper's child/data change callbacks to be
notified of changes happening around the cluster. Cluster Change Detector aims
to become the central component that consolidates these
changes/callbacks/notifications and efficiently informs Helix's rebalancer
when rebalancing is needed.
+
+### Why
+Currently, the Controller relies on callbacks generated from ZooKeeper
Watchers to trigger the rebalancing pipeline. However, depending on the kind of
change, no rebalancing may be needed at all, and there are various types of
rebalancing that Helix performs in parallel with the original controller
pipeline. It has become evident that Helix rebalancers should not
directly react to all changes in the clust [...]
+
+### How
+Once ready, Helix's rebalancers will rely on Cluster Change Detector's APIs to
determine whether a rebalance is needed.
+
+# Background
+Cluster Change Detector is a critical component of the next-generation
rebalancer for Helix: [The New Helix Rebalancer: Weight-Aware Globally-Evenly
Distributed
Rebalancer](https://github.com/apache/helix/wiki/Weight-aware-Globally-Evenly-distributed-Rebalancer).
+
+# Problem Statement
+## Objectives
+The primary objectives of Cluster Change Detector are the following:
+
+* Detect any changes happening around the cluster so that the rebalancer
doesn't have to react to changes directly.
+ * We want clear separation of responsibility where the rebalancer purely
does rebalancing, and change detection comes from another separate, independent
component.
+* Determine what kind of rebalance is needed and which
resources/partitions/replicas are affected.
+ * Previously, Helix in FULL-AUTO mode would trigger a rebalance for every
event that triggered the pipeline. This caused unnecessary rebalances,
increasing latency and performing redundant computation. Cluster Change
Detector aims to solve this problem.
+* Enhanced audit log for changes
+ * Helix outputs a large volume of logs, sometimes to the point that they are
not useful. Cluster Change Detector will be the central place for Helix-related
audit logs, and the logs will carry relevant information about changes
happening in the cluster, which should aid debugging.
+* (Optional) Detect changes as fast as possible to reduce reaction time.
+ * Helix's FULL-AUTO rebalancer relies on the creation of pipeline events
that are queued as they come in. This means that a slow pipeline run could
increase the rebalancer's reaction time.
+
+# Architecture/Implementation
+## Defining Change Types
+There are two types of changes: permanent and transient.
+
+### Permanent Changes
+Permanent changes alter the nature of the cluster. The following are example
scenarios:
+
+1. Helix Participants added/removed
+1. Participants' configs (for example, fault zone, capacity, traffic load,
etc.) changed
+1. Resources added/removed
+
+### Transient Changes
+Transient changes do not change the nature of the cluster. These changes
include common failure scenarios experienced by distributed systems, such as
network issues, hardware issues, and connection loss. In other words, nodes in
a cluster could come and go; in Helix, this translates to a LiveInstance
change.
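
To make the permanent/transient distinction concrete, a change-type enum could carry a flag indicating its category. This is a hypothetical sketch only; the enum and method names are assumptions for illustration, not Helix's actual API.

```java
// Hypothetical sketch: tagging each change type as permanent or transient.
// Enum constant names mirror the change types discussed in this document.
enum ClusterChangeType {
    CLUSTER_CONFIG(true),   // permanent: cluster-level config changed
    INSTANCE_CONFIG(true),  // permanent: participant config changed
    IDEAL_STATES(true),     // permanent: resources added/removed
    RESOURCE_CONFIG(true),  // permanent: resource config changed
    LIVE_INSTANCE(false);   // transient: a node came or went

    private final boolean permanent;

    ClusterChangeType(boolean permanent) {
        this.permanent = permanent;
    }

    /** Returns true if this change alters the nature of the cluster. */
    public boolean isPermanent() {
        return permanent;
    }
}
```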
+
+## Global Baseline Calculation and Partial Rebalance
+With the possible types of changes defined, we can now move on to what
actually needs to be done about these changes. Any change will require
Helix to take action; that is, Helix will trigger state transitions to
temporarily accommodate such changes. These reactive state transitions will
have to be sent out quickly (preferably within milliseconds) to prevent,
for example, situations like masterless partitions. These state transitions
would be makeshift transitio [...]
+
+On the other hand, there might be partition assignments more ideal than the
assignment resulting from a partial rebalance, in the sense that a set of
partition mappings might exist that is more evenly distributed (when all
constraints are accounted for). However, finding such an ideal, or good-enough,
set of assignments arguably takes more time because the calculation is more
involved. We will refer to the computation of a more globally-optimized set of
map [...]
+
+The following summarizes what kind of rebalance would be needed by change type:
+
+| Change Category | Change Type | Rebalance Needed |
+| --- | --- | --- |
+| Permanent | ClusterConfig | Global Baseline Calculation + Partial Rebalance |
+| Permanent | InstanceConfig | Global Baseline Calculation + Partial Rebalance |
+| Permanent | IdealStates/ResourceConfig | Global Baseline Calculation + Partial Rebalance |
+| Transient | LiveInstance | Partial Rebalance |
+
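
This mapping can be sketched as a small dispatch helper. The class, enum, and method names below are illustrative assumptions, not Helix's real API; the logic simply encodes the table above.

```java
// Illustrative sketch of the table above: which rebalance computations a
// detected change requires. All names are hypothetical.
final class RebalanceScope {
    enum ChangeType { CLUSTER_CONFIG, INSTANCE_CONFIG, IDEAL_STATES, RESOURCE_CONFIG, LIVE_INSTANCE }

    private RebalanceScope() {}

    /** Every change requires at least the fast Partial Rebalance. */
    static boolean requiresPartialRebalance(ChangeType type) {
        return true;
    }

    /**
     * Only permanent (config/resource) changes also require the slower
     * Global Baseline Calculation; transient LiveInstance changes do not.
     */
    static boolean requiresGlobalBaseline(ChangeType type) {
        return type != ChangeType.LIVE_INSTANCE;
    }
}
```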
+## Enhanced Logging
+There are two types of logging that are crucial for maintaining online
clusters: 1) what changes were made to the cluster by external entities (such
as the operator, connection loss, etc.), and 2) what changes Helix is making to
the cluster internally (moving partitions to react to external changes).
+
+## Change In Helix's Controller Pipeline
+
+We will have an additional stage in which we create a change detector that
reconciles the difference between the cache created by the previous controller
pipeline run and the cache created by the current run.
+
+An additional consideration at implementation time could be given to making
the stages in the dataProcess pipeline occur asynchronously because they do not
depend on each other. This will be an optional step because these pipeline
stages will only involve in-memory computation, and they are not expected to be
a latency bottleneck in the Controller pipeline.
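
If the independent stages were ever made asynchronous, standard JDK primitives would suffice. Below is a minimal sketch using `CompletableFuture`; the stage names and string results are placeholders, not Helix's actual pipeline stages.

```java
import java.util.concurrent.CompletableFuture;

// Minimal sketch: running two independent in-memory pipeline "stages"
// concurrently, then joining both before any dependent stage runs.
final class AsyncStagesSketch {
    private AsyncStagesSketch() {}

    /** Runs two placeholder stages concurrently and combines their results. */
    static String runIndependentStages() {
        CompletableFuture<String> changeDetection =
            CompletableFuture.supplyAsync(() -> "changes-detected");
        CompletableFuture<String> cacheRefresh =
            CompletableFuture.supplyAsync(() -> "cache-refreshed");
        // join() blocks until each stage completes, so dependent stages
        // only start after both independent stages have finished.
        return changeDetection.join() + "," + cacheRefresh.join();
    }
}
```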
+
+## Cluster Change Detector API
+The APIs listed here are loosely defined; that is, they are subject to change
during implementation.
+
+```java
+public class ClusterChangeDetector {
+
+  public ClusterChangeDetector() {}
+
+  /**
+   * Returns all change types detected during the ClusterDetection stage.
+   */
+  public Set<ChangeType> getChangeTypes();
+
+  /**
+   * Returns a set of the names of components that changed based on the
+   * given change type.
+   */
+  public Set<String> getChangesBasedOnType(ChangeType changeType);
+}
+```
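
For illustration, here is how a rebalancer might consult these APIs. The tiny stub below stands in for the real detector so the example is self-contained; all names and the `record` helper are assumptions based on the loosely defined sketch above and may change during implementation.

```java
import java.util.*;

// Self-contained usage sketch. StubChangeDetector mimics the proposed API
// surface; in real Helix the detector would be populated during the
// ChangeDetector pipeline stage, not by hand.
class DetectorUsageSketch {
    enum ChangeType { CLUSTER_CONFIG, INSTANCE_CONFIG, IDEAL_STATES, LIVE_INSTANCE }

    static class StubChangeDetector {
        private final Map<ChangeType, Set<String>> detected = new EnumMap<>(ChangeType.class);

        // Hypothetical helper used here only to simulate detection.
        void record(ChangeType type, String componentName) {
            detected.computeIfAbsent(type, t -> new HashSet<>()).add(componentName);
        }

        /** Returns all change types detected during the detection stage. */
        public Set<ChangeType> getChangeTypes() {
            return Collections.unmodifiableSet(detected.keySet());
        }

        /** Returns names of components that changed for the given type. */
        public Set<String> getChangesBasedOnType(ChangeType changeType) {
            return detected.getOrDefault(changeType, Collections.emptySet());
        }
    }

    public static void main(String[] args) {
        StubChangeDetector detector = new StubChangeDetector();
        detector.record(ChangeType.LIVE_INSTANCE, "instance_1");

        // A rebalancer would iterate detected changes to scope its work.
        for (ChangeType type : detector.getChangeTypes()) {
            System.out.println(type + " -> " + detector.getChangesBasedOnType(type));
        }
    }
}
```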
+
+## Logging and Monitoring
+### Logging
+In every iteration of the Helix Controller pipeline, the cluster change
detector will run its change-detection logic in the ChangeDetector stage.
During that stage, Helix will log what types of changes were detected. Note
that the changes referred to in this section will not contain individual state
or details, such as a listing of all names of changed instances and partitions;
rather, they will only include changes around the cluster topology. Logging
cluster information at such minute [...]
+
+### Monitoring
+Helix could emit inGraph metrics for the aforementioned changes for easier
monitoring. This will provide insight into how frequently a given cluster
undergoes topology changes, and both Helix developers and application teams
will be able to tell more easily how often changes take place and of what kind.
This information is not currently available and will be useful for maintaining
clusters.
+
+## Future Design
+### Listening Directly on ZK Changes
+This design proposes to implement IZkChildListener and/or IZkDataListener in
order to bypass Helix's controller event queue. The expected result is faster
detection of permanent/transient changes. We will not go this route because we
agreed that Cluster Change Detector does not need to be faster in cadence than
the rebalancer, and the speed of rebalancing will be capped at how fast Helix
processes events regardless of how fast Cluster Change Detector detects changes.
\ No newline at end of file