Hi Niklas,

We had the same problem a year ago and chose Ververica Platform Community Edition.

Pros:
- support for jobs on Session Clusters
- good support for restoring jobs from checkpoints and savepoints
- support for even hundreds of jobs

Cons:
- state is kept in SQLite (we have already corrupted the db file once)
- new Flink versions are supported with a delay

One year later I still think there is no perfect solution for managing Flink on K8s, but for us Ververica was the closest match.

Regards,
Maciek

On Fri, Aug 6, 2021 at 13:49 Niklas Wilcke <niklas.wil...@uniberg.com> wrote:
>
> Hi Flink Community,
>
> I'm currently assessing how to properly deploy Flink on Kubernetes via GitOps. There are some options available to deploy Flink on Kubernetes, which I would like to discuss. In general we are looking for an open source or at least unpaid solution, but I don't exclude paid solutions from the beginning.
>
> I see the following options.
>
> 1. Kubernetes Standalone [1]
>    * Seems to be deprecated, since the docs state to use Native Kubernetes instead
> 2. Native Kubernetes [2]
>    * Doesn't seem to implement the Kubernetes operator pattern
>    * Seems to require command line activities to be operated / upgraded (not GitOps compatible out of the box)
> 3. "GoogleCloudPlatform/flink-on-k8s-operator" Operator [3]
>    * Seems not to be well maintained / documented
>    * We had some trouble with crashes during configuration changes, but we need to investigate further
>    * There is a "maintained" fork from Spotify, which could be an option
> 4. Flink Native Kubernetes Operator [4]
>    * Seems to be a private project from a Flink Committer, which might not be mature enough for a stable operation
> 5. Proprietary Solution Ververica Platform [5]
>    * I didn't try it out yet and have no experience with it
>    * I'm unsure whether the Community Edition is suited for a production environment. (one namespace, no auto scaling, no RBAC, etc.)
>
> I have the following questions.
>
> 1. Is the "Native Kubernetes" approach suited to be operated via GitOps and does it have some drawbacks compared to an operator based setup? (e.g. is a rollback during a failed upgrade possible?)
> 2. Are there any experiences with the "GoogleCloudPlatform/flink-on-k8s-operator" or a fork of it in a production environment?
> 3. Is the "Flink Native Kubernetes Operator" an option or is it just a playground project? How is it related to the "Native Kubernetes" setup? Is it going to be "integrated" into Flink?
> 4. Is a proprietary unpaid solution like "Ververica Platform Community Edition" a solution for a production environment or will it definitely lack features I need?
>
> Any information or feedback is highly appreciated. Thank you very much in advance.
>
> Kind Regards,
> Niklas Wilcke
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/
> [2] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/
> [3] https://github.com/GoogleCloudPlatform/flink-on-k8s-operator
> [4] https://github.com/wangyang0918/flink-native-k8s-operator
> [5] https://www.ververica.com/getting-started-flink-ververica
>
> UNIBERG GmbH
> Simon-von-Utrecht-Straße 85a
> 20359 Hamburg
>
> niklas.wil...@uniberg.com
> Mobile: +49 160 9793 2593
> Office: +49 40 2380 6523
>
> UNIBERG GmbH, Dorfstraße 3, 23816 Bebensee
>
> Register: Amtsgericht Kiel HRB SE-1507
> CEOs: Andreas Möller, Martin Ulbricht
>
> Privacy Information: https://www.uniberg.com/impressum.html

--
Maciek Bryński