This is a very cool feature proposal.

One lesson learned from the ZooKeeper-based HA is that it is overly
complicated to have the Leader RPC address in a different node than the
LeaderLock. There is extra code needed to make sure these converge, and
they can still be temporarily out of sync.

A much easier design would be to have the RPC address as payload in the
lock entry (ZNode in ZK), the same way that the leader fencing token is
stored as payload of the lock.
I think for the design above it would mean having a single ConfigMap for
both leader lock and leader RPC address discovery.
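
To illustrate (just a rough sketch, assuming the fabric8 Kubernetes client
that Flink already uses for the native K8s integration; the ConfigMap name,
data keys, and the leader annotation from the K8s leader-election recipe are
only illustrative, not something the FLIP defines), a single ConfigMap
carrying both pieces could look roughly like this:

```java
import io.fabric8.kubernetes.api.model.ConfigMap;
import io.fabric8.kubernetes.api.model.ConfigMapBuilder;

public class LeaderConfigMapSketch {

    // Sketch of one ConfigMap that holds both the leader lock (modeled here as
    // an annotation, analogous to the fencing token stored as payload of the
    // ZK lock ZNode) and the leader RPC address as data payload, so that both
    // are written in a single atomic update of the same resource.
    static ConfigMap buildLeaderConfigMap(
            String clusterId,
            String leaderIdentityJson,
            String leaderRpcAddress,
            String leaderSessionId) {

        return new ConfigMapBuilder()
            .withNewMetadata()
                .withName(clusterId + "-resourcemanager-leader")
                // leader identity / lease information lives on the lock itself
                .addToAnnotations(
                    "control-plane.alpha.kubernetes.io/leader", leaderIdentityJson)
            .endMetadata()
            // the RPC address is payload of the very same ConfigMap, so the
            // lock and the address cannot diverge
            .addToData("address", leaderRpcAddress)
            .addToData("sessionId", leaderSessionId)
            .build();
    }
}
```

TaskManagers would then watch this one ConfigMap and read the address from
its data section, instead of reconciling two separately updated resources.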

This probably serves as a good design principle in general: do not divide
information that is updated together across different resources.

Best,
Stephan


On Wed, Sep 16, 2020 at 11:26 AM Xintong Song <tonysong...@gmail.com> wrote:

> Thanks for preparing this FLIP, @Yang.
>
> In general, I'm +1 for this new feature. Leveraging Kubernetes's built-in
> ConfigMap for Flink's HA services should significantly reduce the
> maintenance overhead compared to deploying a ZK cluster. I think this is an
> attractive feature for users.
>
> Concerning the proposed design, I have some questions. Might not be
> problems, just trying to understand.
>
> ## Architecture
>
> Why does the leader election need two ConfigMaps (`lock for contending
> leader`, and `leader RPC address`)? What happens if the two ConfigMaps are
> not updated consistently? E.g., a TM learns about a new JM becoming leader
> (lock for contending leader updated), but still gets the old leader's
> address when trying to read `leader RPC address`?
>
> ## HA storage > Lock and release
>
> It seems to me that the owner needs to explicitly release the lock so that
> other peers can write/remove the stored object. What if the previous owner
> failed to release the lock (e.g., died before releasing it)? Would there be
> any problem?
>
> ## HA storage > HA data clean up
>
> If the ConfigMap is destroyed on `kubectl delete deploy <ClusterID>`, how
> are the HA data retained?
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Tue, Sep 15, 2020 at 11:26 AM Yang Wang <danrtsey...@gmail.com> wrote:
>
>> Hi devs and users,
>>
>> I would like to start the discussion about FLIP-144[1], which will
>> introduce
>> a new native high availability service for Kubernetes.
>>
>> Currently, Flink provides a ZooKeeper-based HA service that is widely used
>> in production environments. It can be integrated with standalone, Yarn, and
>> Kubernetes deployments. However, using the ZooKeeper HA on K8s incurs
>> additional cost since we need to manage a ZooKeeper cluster.
>> In the meantime, K8s provides public APIs for leader election[2]
>> and configuration storage (i.e. ConfigMap[3]). We could leverage these
>> features to make running an HA-configured Flink cluster on K8s more
>> convenient.
>>
>> Both standalone on K8s and native K8s deployments could benefit from the
>> newly introduced KubernetesHaService.
>>
>> [1].
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-144%3A+Native+Kubernetes+HA+for+Flink
>> [2].
>> https://kubernetes.io/blog/2016/01/simple-leader-election-with-kubernetes/
>> [3]. https://kubernetes.io/docs/concepts/configuration/configmap/
>>
>> Looking forward to your feedback.
>>
>> Best,
>> Yang
>>
>
