Flink k8s operator starts wrong job config from FlinkSessionJob

2024-06-26 Thread Peter Klauke
Hi all, we're running a session cluster and I submit around 20 jobs to it at the same time by creating FlinkSessionJob Kubernetes resources. After sufficient time there are 20 jobs created and running healthy. However, it appears that some jobs are started with the wrong config. As a result some

Re: Autoscaling with flink-k8s-operator 1.8.0

2024-05-02 Thread Chetas Joshi
Hi Gyula, Thanks for getting back and explaining the difference in the responsibilities of the autoscaler and the operator. I figured out what the issue was. Here is what I was trying to do: the autoscaler had initially down-scaled (2->1) the flinkDeployment so there was

Re: Autoscaling with flink-k8s-operator 1.8.0

2024-05-01 Thread Gyula Fóra
Hi Chetas, The operator logic itself would normally call the rescale api during the upgrade process, not the autoscaler module. The autoscaler module sets the correct config with the parallelism overrides, and then the operator performs the regular upgrade cycle (as when you yourself change

Autoscaling with flink-k8s-operator 1.8.0

2024-05-01 Thread Chetas Joshi
Hello, We recently upgraded the operator to 1.8.0 to leverage the new autoscaling features ( https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.8/docs/custom-resource/autoscaler/). The FlinkDeployment (application cluster) is set to flink v1_18 as well. I am able to

flink k8s operator chk config interval bug.inoperative

2024-03-14 Thread kcz
kcz 573693...@qq.com

[flink-k8s-connector] In-place scaling up often takes several times till it succeeds.

2023-12-06 Thread Xiaolong Wang
Hi, I'm playing with a Flink 1.18 demo with the auto-scaler and the adaptive scheduler. The operator can correctly collect data and order the job to scale up, but it'll take the job several times to reach the required parallelism. E.g. The original parallelism for each vertex is something like

Flink K8s operator ignores taskmanager.host setting

2023-11-29 Thread Salva Alcántara
Hi! I'm deploying a job via the Flink K8s Operator with these settings in the FlinkDeployment resource: ``` spec: flinkConfiguration: taskmanager.host: 0.0.0.0 <-- ignored / not applied ``` When I look into the flink-conf.yaml in the TM the setting is not there. Is there any rea

Re: Using Flink k8s operator on OKD

2023-10-05 Thread Gyula Fóra
sources: >>> - deployments >>> - deployments/finalizers >>> verbs: >>> - '*' >>> --- >>> apiVersion: rbac.authorization.k8s.io/v1 >>> kind: RoleBinding >>> metadata: >>> labels: >>> app.kubernetes.io/

Re: Using Flink k8s operator on OKD

2023-10-05 Thread Krzysztof Chmielewski
;> verbs: >> - '*' >> --- >> apiVersion: rbac.authorization.k8s.io/v1 >> kind: RoleBinding >> metadata: >> labels: >> app.kubernetes.io/name: flink-kubernetes-operator >> app.kubernetes.io/version: 1.5.0 >> name: flink-role-binding >

Re: Using Flink k8s operator on OKD

2023-09-20 Thread Krzysztof Chmielewski
-binding > roleRef: > apiGroup: rbac.authorization.k8s.io > kind: Role > name: flink > subjects: > - kind: ServiceAccount > name: flink > EOF > > Hopefully that helps. > > > On Tue, Sep 19, 2023 at 5:40 PM Krzysztof Chmielewski < > krzysiek.chmielew.

Re: Using Flink k8s operator on OKD

2023-09-19 Thread Zach Lorimer
want to have Flink deployments in. kubectl apply -f - < wrote: > Hi community, > I was wondering if anyone tried to deploy Flink using Flink k8s operator > on machine where OKD [1] is installed? > > We have tried to install Flink k8s operator version 1.6 which seems to > succeed

Using Flink k8s operator on OKD

2023-09-19 Thread Krzysztof Chmielewski
Hi community, I was wondering if anyone tried to deploy Flink using Flink k8s operator on machine where OKD [1] is installed? We have tried to install Flink k8s operator version 1.6 which seems to succeed, however when we try to deploy simple Flink deployment we are getting an error. 2023-09-19

Re: Flink K8S operator does not support IPv6

2023-09-05 Thread Xiaolong Wang
FYI, adding environment variables of ` KUBERNETES_DISABLE_HOSTNAME_VERIFICATION=true` works for me. This env variable needs to be added to both the Flink operator and the Flink job definition. On Tue, Aug 8, 2023 at 12:03 PM Xiaolong Wang wrote: > Ok, thank you. > > On Tue, Aug 8, 2023 at

Re: flink k8s operator - problem with patching seession cluster

2023-08-31 Thread Gyula Fóra
TM to Flink Session cluster via Java K8s client if Session >>> Cluster has running jobs? >>> >>> Thanks, >>> Krzysztof >>> >>> pt., 25 sie 2023 o 23:48 Krzysztof Chmielewski < >>> krzysiek.chmielew...@gmail.com> napisał(a): >

Re: flink k8s operator - problem with patching seession cluster

2023-08-31 Thread Krzysztof Chmielewski
napisał(a): >> >>> Hi community, >>> I have a use case where I would like to add an extra TM) to a running >>> Flink session cluster that has Flink jobs deployed. Session cluster >>> creation, job submission and cluster patching is done using flink k8s &

Re: flink k8s operator - problem with patching seession cluster

2023-08-31 Thread Gyula Fóra
wski < > krzysiek.chmielew...@gmail.com> napisał(a): > >> Hi community, >> I have a use case where I would like to add an extra TM) to a running >> Flink session cluster that has Flink jobs deployed. Session cluster >> creation, job submission and

Re: flink k8s operator - problem with patching seession cluster

2023-08-30 Thread Krzysztof Chmielewski
apisał(a): > Hi community, > I have a use case where I would like to add an extra TM) to a running > Flink session cluster that has Flink jobs deployed. Session cluster > creation, job submission and cluster patching is done using flink k8s > operator Java API. The Details of this are pre

flink k8s operator - problem with patching seession cluster

2023-08-25 Thread Krzysztof Chmielewski
Hi community, I have a use case where I would like to add an extra TM) to a running Flink session cluster that has Flink jobs deployed. Session cluster creation, job submission and cluster patching is done using flink k8s operator Java API. The Details of this are presented here [1] I would like

Re: Flink k8s operator - managde from java microservice

2023-08-16 Thread Yaroslav Tkachenko
obs using Apache Flink > k8s operator. > Where actions like job submission (new and from save point), Job cancel > with save point, cluster creations will be triggered from Java based micro > service. > > Is there any recommended/Dedicated Java API for Flink k8s operator? > I se

Flink k8s operator - managde from java microservice

2023-08-16 Thread Krzysztof Chmielewski
Hi, I have a use case where I would like to run Flink jobs using Apache Flink k8s operator. Where actions like job submission (new and from save point), Job cancel with save point, cluster creations will be triggered from Java based micro service. Is there any recommended/Dedicated Java API

Re: Flink K8S operator does not support IPv6

2023-08-07 Thread Xiaolong Wang
Ok, thank you. On Tue, Aug 8, 2023 at 11:22 AM Peter Huang wrote: > We will handle it asap. Please check the status of this jira > https://issues.apache.org/jira/browse/FLINK-32777 > > On Mon, Aug 7, 2023 at 8:08 PM Xiaolong Wang > wrote: > >> Hi, >> >> I was testing flink-kubernetes-operator

Re: Flink K8S operator does not support IPv6

2023-08-07 Thread Peter Huang
We will handle it asap. Please check the status of this jira https://issues.apache.org/jira/browse/FLINK-32777 On Mon, Aug 7, 2023 at 8:08 PM Xiaolong Wang wrote: > Hi, > > I was testing flink-kubernetes-operator in an IPv6 cluster and found out > the below issues: > > *Caused by:

Flink K8S operator does not support IPv6

2023-08-07 Thread Xiaolong Wang
Hi, I was testing flink-kubernetes-operator in an IPv6 cluster and found out the below issues: *Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname > fd70:e66a:970d::1 not verified:* > > *certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=* > > *DN:

Re: TM fails to register with JM while trying to run basic.yaml example with Flink K8S operator

2023-07-19 Thread Orkhan Dadashov
I have fixed the issue by increasing the CPU and memory for my JM and TM pods. Make sure your instance type can accommodate the required resources. On Wed, 19 Jul 2023 at 13:35, Orkhan Dadashov wrote: > Hi Flink users, > > I'm following up on this guide to try the Flink K8S operat

TM fails to register with JM while trying to run basic.yaml example with Flink K8S operator

2023-07-19 Thread Orkhan Dadashov
Hi Flink users, I'm following up on this guide to try the Flink K8S operator (1.5.0 version ): https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.5/docs/try-flink-kubernetes-operator/quick-start/ When I try to deploy a basic example, JM and TM start, but TM fails

TM fails to register with JM while trying to run basic.yaml example with Flink K8S operator

2023-07-19 Thread Orkhan Dadashov
Hi Flink users, I'm following up on this guide to try the Flink K8S operator (1.5.0 version ): https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.5/docs/try-flink-kubernetes-operator/quick-start/ When I try to deploy a basic example, JM and TM start, but TM fails

Re: [Flink K8s Operator] Trigger nonce missing for manual savepoint info

2023-07-12 Thread Gyula Fóra
Maybe you have inconsistent operator / CRC versions? In any case I highly recommend upgrading to the lates operator version to get all the bug / security fixes and improvements. Gyula On Wed, 12 Jul 2023 at 10:58, Paul Lam wrote: > Hi, > > I’m using K8s operator 1.3.1 with Flink 1.15.2 on 2

[Flink K8s Operator] Trigger nonce missing for manual savepoint info

2023-07-12 Thread Paul Lam
Hi, I’m using K8s operator 1.3.1 with Flink 1.15.2 on 2 K8s clusters. Weird enough, on one K8s cluster the flink deployments would show savepoint trigger nonce. while the flink deployments on the other cluster wouldn’t. The normal output is as follows: ``` Last Savepoint: Format

Re: [Flink K8s Operator] Automatic cleanup of terminated deployments

2023-05-21 Thread Paul Lam
Hi Gyula, Thank you and sorry for the late response. My use case is that users may run finite jobs (either batch jobs or finite stream jobs), leaving a lot of deprecated flink deployments around. I’ve filed a ticket[1]. [1] https://issues.apache.org/jira/browse/FLINK-32143 Best, Paul Lam >

Re: [Flink K8s Operator] Automatic cleanup of terminated deployments

2023-05-14 Thread Gyula Fóra
There is no such feature currently, Kubernetes resources usually do not delete themselves :) The problem I see here is by deleting the resource you lose all information about what happened, you won't know if it failed or completed etc. What is the use-case you are thinking about? If this is

[Flink K8s Operator] Automatic cleanup of terminated deployments

2023-05-14 Thread Paul Lam
Hi all, Currently, if a job turns into terminated status (e.g. FINISHED or FAILED), the flinkdeployment remains until a manual cleanup is performed. I went through the docs but did not find any way to clean them up automatically. Am I missing something? Thanks! Best, Paul Lam

Re: flink k8s 部署启动报错

2023-03-13 Thread Weihua Hu
- 回复的原邮件 > | 发件人 | Weihua Hu | > | 发送日期 | 2023年3月14日 10:39 | > | 收件人 | | > | 主题 | Re: flink k8s 部署启动报错 | > Hi, > > 看异常信息是 Flink 集群在启动时检索到 HA 路径上存在 DirtyResults 数据,但是数据已经不完整了,无法正常读取。 > 可以参考文档[1],检查相关的 HA 路径,清理下异常数据 > > 另外问一下,之前是通过同名的 cluster-id 启动过 Flink 集群吗? > &

回复: flink k8s 部署启动报错

2023-03-13 Thread Jason_H
您好, 我找到了我的ha目录,请教一下,怎么确定哪些数据是脏数据,可以允许删除的,这个有什么办法可以确定吗,我看到的都是些系统数据 | | Jason_H | | hyb_he...@163.com | 回复的原邮件 | 发件人 | Weihua Hu | | 发送日期 | 2023年3月14日 10:39 | | 收件人 | | | 主题 | Re: flink k8s 部署启动报错 | Hi, 看异常信息是 Flink 集群在启动时检索到 HA 路径上存在 DirtyResults 数据,但是数据已经不完整了,无法正常读取。 可以参考文档[1],检查相关的

回复: flink k8s 部署启动报错

2023-03-13 Thread Jason_H
您好, 对的,之前是正常启动的,突然失败了,然后我直接重启pod,就一直报这个错了。 | | Jason_H | | hyb_he...@163.com | 回复的原邮件 | 发件人 | Weihua Hu | | 发送日期 | 2023年3月14日 10:39 | | 收件人 | | | 主题 | Re: flink k8s 部署启动报错 | Hi, 看异常信息是 Flink 集群在启动时检索到 HA 路径上存在 DirtyResults 数据,但是数据已经不完整了,无法正常读取。 可以参考文档[1],检查相关的 HA 路径,清理下异常数据 另外问一下,之前

Re: flink k8s 部署启动报错

2023-03-13 Thread Weihua Hu
Hi, 看异常信息是 Flink 集群在启动时检索到 HA 路径上存在 DirtyResults 数据,但是数据已经不完整了,无法正常读取。 可以参考文档[1],检查相关的 HA 路径,清理下异常数据 另外问一下,之前是通过同名的 cluster-id 启动过 Flink 集群吗? [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#job-result-store-storage-path Best, Weihua On Tue, Mar 14, 2023 at

flink k8s 部署启动报错

2023-03-13 Thread Jason_H
hi,大家好 请教一个问题,我在k8s上部署的flink集群,启动不来,报如下的错误,大家有遇到过吗 java.util.concurrent.CompletionException: org.apache.flink.util.FlinkRuntimeException: Could not retrieve JobResults of globally-terminated jobs from JobResultStore at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown

Re: Flink K8s operator pod section of CRD

2023-02-24 Thread Őrhidi Mátyás
Yep! Simple oversight, it was :/ Cheers, Matyas On Thu, Feb 23, 2023 at 10:54 PM Gyula Fóra wrote: > Hey! > You are right, these fields could have been of the PodTemplate / > PodTemplateSpec type (probably PodTemplateSpec is actually better). > > I think the reason why we used it is two fold:

Re: Flink K8s operator pod section of CRD

2023-02-23 Thread Gyula Fóra
Hey! You are right, these fields could have been of the PodTemplate / PodTemplateSpec type (probably PodTemplateSpec is actually better). I think the reason why we used it is two fold: - Simple oversight :) - Flink itself "expects" the podtemplate in this form for the native integration as you

Flink K8s operator pod section of CRD

2023-02-23 Thread Mason Chen
Hi all, Why does the FlinkDeployment CRD refer to the Pod class instead of the PodTemplate class from the fabric8 library? As far as I can tell, the only difference is that the Pod class exposes the PodStatus, which doesn't seem mutable. Thanks in advance! Best, Mason

Re: [Flink K8s Operator] flinkdep stays in DEPLOYED and never turns STABLE

2022-12-06 Thread Paul Lam
fecycleState enum if you want a single condensed status view . > > Cheers > Gyula > > On Tue, 6 Dec 2022 at 04:12, Paul Lam <mailto:paullin3...@gmail.com>> wrote: > Hi all, > > I’m trying out Flink K8s operator 1.2 with K8s 1.25 and Flink 1.15. > > I found kub

Re: [Flink K8s Operator] flinkdep stays in DEPLOYED and never turns STABLE

2022-12-06 Thread Gyula Fóra
want a single condensed status view . Cheers Gyula On Tue, 6 Dec 2022 at 04:12, Paul Lam wrote: > Hi all, > > I’m trying out Flink K8s operator 1.2 with K8s 1.25 and Flink 1.15. > > I found kubectl shows that flinkdeployments stay in DEPLOYED like forever > (the Flink job

[Flink K8s Operator] flinkdep stays in DEPLOYED and never turns STABLE

2022-12-06 Thread Paul Lam
Hi all, I’m trying out Flink K8s operator 1.2 with K8s 1.25 and Flink 1.15. I found kubectl shows that flinkdeployments stay in DEPLOYED like forever (the Flink job status are RUNNING), but the operator logs shows that the flinkdeployments already turned into STABLE. Is that a known bug

Re: [Flink K8s operator] HA metadata not available to restore from last state

2022-11-22 Thread Dongwon Kim
liminate most of these cases. > > Cheers, > Gyula > > On Tue, Nov 22, 2022 at 9:43 AM Dongwon Kim wrote: > >> Hi, >> >> While using a last-state upgrade mode on flink-k8s-operator-1.2.0 and >> flink-1.14.3, we're occasionally facing the following error: &g

Re: [Flink K8s operator] HA metadata not available to restore from last state

2022-11-22 Thread Gyula Fóra
eliminate most of these cases. Cheers, Gyula On Tue, Nov 22, 2022 at 9:43 AM Dongwon Kim wrote: > Hi, > > While using a last-state upgrade mode on flink-k8s-operator-1.2.0 and > flink-1.14.3, we're occasionally facing the following error: > > Status: >> Cluster Info:

[Flink K8s operator] HA metadata not available to restore from last state

2022-11-22 Thread Dongwon Kim
Hi, While using a last-state upgrade mode on flink-k8s-operator-1.2.0 and flink-1.14.3, we're occasionally facing the following error: Status: > Cluster Info: > Flink - Revision: 98997ea @ 2022-01-08T23:23:54+01:00 > Flink - Version: 1.14.3

Re: Flink k8s operator高可用部署Flink Session Cluster,提交job遇到异常。

2022-11-01 Thread 汪赟
gt; 也许你可以创建一个 jira issue 来跟进这个问题 > > Best, > Weihua > > >> On Thu, Oct 27, 2022 at 6:51 PM Young Chen wrote: >> >> 【问题描述】 >> >> Flink k8s operator(v1.1.0)高可用部署了一个Flink Session Cluster(两个JobManager), >> 然后用SessionJob 部署一个例子job,job

Re: Flink k8s operator高可用部署Flink Session Cluster,提交job遇到异常。

2022-10-27 Thread Weihua Hu
: > 【问题描述】 > > Flink k8s operator(v1.1.0)高可用部署了一个Flink Session Cluster(两个JobManager), > 然后用SessionJob 部署一个例子job,job有时可以部署,有时部署不了。 > > 可以看到容器中如下error日志。 > > > > 【操作步骤】 > > 部署Cluster > > > > apiVersion: flink.apache.org/v1beta1 > > kind: Flink

Flink k8s operator高可用部署Flink Session Cluster,提交job遇到异常。

2022-10-27 Thread Young Chen
【问题描述】 Flink k8s operator(v1.1.0)高可用部署了一个Flink Session Cluster(两个JobManager), 然后用SessionJob 部署一个例子job,job有时可以部署,有时部署不了。 可以看到容器中如下error日志。 【操作步骤】 部署Cluster apiVersion: flink.apache.org/v1beta1 kind: FlinkDeployment metadata: name: flink-cluster-1jm-checkpoint spec: image: flink

Re: batch job 结束时, flink-k8s-operator crd 状态展示不清晰

2022-10-25 Thread Yang Wang
日(星期二) 下午3:33 > 收件人:"user-zh" > 主题:batch job 结束时, flink-k8s-operator crd 状态展示不清晰 > > > > hi, > 我在使用flink-k8s-operator 部署batch job。 我发现当batch job 结束之后, > flink-k8s-operator 的 FlinkDeployment CRD 状态发生了变化: > jobManagerDeploymentStatus 变成了"missing&quo

Re: status no clear when deploying batch job with flink-k8s-operator

2022-10-25 Thread Gyula Fóra
ing a flink batch job with flink-k8s-operator. > My flink-k8s-operator's version is 1.2.0 and flink's version is 1.14.6. I > found after the batch job execute finish, the jobManagerDeploymentStatus > field became "MISSING" in FlinkDeployment crd. And the error field became

batch job 结束时, flink-k8s-operator crd 状态展示不清晰

2022-10-25 Thread Liting Liu (litiliu)
hi, 我在使用flink-k8s-operator 部署batch job。 我发现当batch job 结束之后, flink-k8s-operator 的 FlinkDeployment CRD 状态发生了变化: jobManagerDeploymentStatus 变成了"missing", "error" 变成了“Missing JobManager deployment”。 我想这个应该是batch job执行完毕之后,native-k8s 自动将JobmanagerDeployment 删除导致的。 请问该如何通过判断C

status no clear when deploying batch job with flink-k8s-operator

2022-10-25 Thread Liting Liu (litiliu)
  Hi, I'm deploying a flink batch job with flink-k8s-operator. My flink-k8s-operator's version is 1.2.0 and flink's version is 1.14.6. I found after the batch job execute finish, the jobManagerDeploymentStatus field became "MISSING" in FlinkDeployment crd. And the error fi

Re: fail to mount hadoop-config-volume when using flink-k8s-operator

2022-10-13 Thread Yang Wang
adoop conf directory exists in the image. For flink-k8s-operator, another feasible solution is to create a hadoop-config-configmap manually and then use *"kubernetes.hadoop.conf.config-map.name <http://kubernetes.hadoop.conf.config-map.name>" *to mount it to JobManager and TaskManager

fail to mount hadoop-config-volume when using flink-k8s-operator

2022-10-12 Thread Liting Liu (litiliu)
Hi, community: I'm using flink-k8s-operator v1.2.0 to deploy flink job. And the "HADOOP_CONF_DIR" environment variable was setted in the image that i buiilded from flink:1.15. I found the taskmanager pod was trying to mount a volume named "hadoop-config-volume&

Re: flink-k8s-operator中webhook的作用

2022-07-27 Thread Yang Wang
Webhook主要的作用是做CR的校验,避免提交到K8s上之后才发现 例如:parallelism被错误的设置为负值,jarURI没有设置等 Best, Yang Kyle Zhang 于2022年7月27日周三 18:59写道: > Hi,all > 最近在看flink-k8s-operator[1],架构里有一个flink-webhook,请问这个container的作用是什么,如果配置 > webhook.create=false对整体功能有什么影响? > > Best regards > > [1] > > h

flink-k8s-operator中webhook的作用

2022-07-27 Thread Kyle Zhang
Hi,all 最近在看flink-k8s-operator[1],架构里有一个flink-webhook,请问这个container的作用是什么,如果配置 webhook.create=false对整体功能有什么影响? Best regards [1] https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.1/docs/concepts/architecture/

Re: Flink k8s 作业提交流程

2022-06-27 Thread Lijie Wang
Hi, 使用文档可以查看: https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes 设计文档可以查看: https://docs.google.com/document/d/1-jNzqGF6NfZuwVaFICoFQ5HFFXzF5NVIagUZByFMfBY/edit?usp=sharing jira: https://issues.apache.org/jira/browse/FLINK-9953 Best, Lijie

Flink k8s ????????????

2022-06-27 Thread hjw
Flink version:1.15.0 ??1.15.0Flink??native k8s?Flink on Native k8s ??:)

Re: Flink k8s Operator on AWS?

2022-06-27 Thread Matt Casters
it. I chalk all that up to just me lacking a bit of experience with k8s. That being said... It's all working now and I documented the deployment over here: https://hop.apache.org/manual/next/pipeline/beam/flink-k8s-operator-running-hop-pipeline.html A big thank you to everyone that helped me out

Re: Flink k8s Operator on AWS?

2022-06-26 Thread Yang Wang
Could you please share the JobManager logs of failed deployment? It will also help a lot if you could show the pending pod status via "kubectl describe ". Given that the current Flink Kubernetes Operator is built on top of native K8s integration[1], the Flink ResourceManager should allocate

Re: Flink k8s Operator on AWS?

2022-06-24 Thread Matt Casters
Yes of-course. I already feel a bit less intelligent for having asked the question ;-) The status now is that I managed to have it all puzzled together. Copying the files from s3 to an ephemeral volume takes all of 2 seconds so it's really not an issue. The cluster starts and our fat jar and

Re: Flink k8s Operator on AWS?

2022-06-24 Thread Őrhidi Mátyás
Hi Matt, Yes. There are several official Flink images with various JVMs including Java 11. https://hub.docker.com/_/flink Cheers, Matyas On Fri, Jun 24, 2022 at 2:06 PM Matt Casters wrote: > Hi Mátyás & all, > > Thanks again for the advice so far. On a related note I noticed Java 8 > being

Re: Flink k8s Operator on AWS?

2022-06-24 Thread Matt Casters
Hi Mátyás & all, Thanks again for the advice so far. On a related note I noticed Java 8 being used, indicated in the log. org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - JAVA_HOME: /usr/local/openjdk-8 Is there a way to use Java 11 to start Flink with? Kind regards, Matt

Re: Flink k8s Operator on AWS?

2022-06-23 Thread Yang Wang
Thanks for your valuable inputs. To make deploying Flink on K8s easy as a normal Java application is certainly the mission of Flink Kubernetes Operator. Obviously, we are still a little far from this mission. Back to the user jars download, I think it makes sense to introduce the artifact fetcher

Re: Flink k8s Operator on AWS?

2022-06-22 Thread Matt Casters
Hi Yang, Thanks for the suggestion! I looked into this volume sharing on EKS yesterday but I couldn't figure it out right away. The way that people come into the Apache Hop project is often with very little technical knowledge since that's sort of the goal of the project: make things easy.

Re: Flink k8s Operator on AWS?

2022-06-22 Thread Matt Casters
Hi Matyas, Again thank you very much for the information. I'm a beginner and all the help is really appreciated. After some diving into the script behind s3-artifiact-fetcher I kind of figured it out. Have an folder sync'ed into the pod container of the task manager. Then I guess we should be

Re: Flink k8s Operator on AWS?

2022-06-21 Thread Yang Wang
Matyas and Gyula have shared many great informations about how to make the Flink Kubernetes Operator work on the EKS. One more input about how to prepare the user jars. If you are more familiar with K8s, you could use persistent volume to provide the user jars and them mount the volume to

Re: Flink k8s Operator on AWS?

2022-06-21 Thread Őrhidi Mátyás
Hi Matt, I believe an artifact fetcher (e.g https://hub.docker.com/r/agiledigital/s3-artifact-fetcher ) + the pod template ( https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/pod-template/#pod-template) is an elegant way to solve your problem. The

Re: Flink k8s Operator on AWS?

2022-06-21 Thread Matt Casters
Thank you very much for the help Matyas and Gyula! I just saw a video today where you were presenting the FKO. Really nice stuff! So I'm guessing we're executing "flink run" at some point on the master and that this is when we need the jar file to be local? Am I right in assuming that this

Re: Flink k8s Operator on AWS?

2022-06-21 Thread Gyula Fóra
A small addition to what Matyas has said: The limitation of only supporting local scheme is coming from the Flink Kubernetes Application mode directly and is not related to the operator itself. Once this feature is added to Flink itself the operator can also support it for newer Flink versions.

Re: Flink k8s Operator on AWS?

2022-06-21 Thread Őrhidi Mátyás
Hi Matt, - In FlinkDeployments you can utilize an init container to download your artifact onto a shared volume, then you can refer to it as local:/.. from the main container. FlinkDeployments comes with pod template support

Flink k8s Operator on AWS?

2022-06-21 Thread Matt Casters
Hi Flink team! I'm interested in getting the new Flink Kubernetes Operator to work on AWS EKS. Following the documentation I got pretty far. However, when trying to run a job I got the following error: Only "local" is supported as schema for application mode. This assumes t > hat the jar is

Re:Re: Flink k8s HA 手动删除作业deployment导致的异常

2022-06-13 Thread m18814122325
的HA数据泄露 > >[1]. >https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/ha/kubernetes_ha/#high-availability-data-clean-up > > >Best, >Yang > >Zhanghao Chen 于2022年6月13日周一 07:53写道: > >> 1.基于K8S做HA的Flink任务要想正常,不能手动删除作业deployment,必须通过cancel,stop命令进行停止。基于上面我

Re: Flink k8s HA 手动删除作业deployment导致的异常

2022-06-12 Thread Yang Wang
-availability-data-clean-up Best, Yang Zhanghao Chen 于2022年6月13日周一 07:53写道: > 1.基于K8S做HA的Flink任务要想正常,不能手动删除作业deployment,必须通过cancel,stop命令进行停止。基于上面我猜测Flink > k8s HA是基于ConfigMap之上开发的,其声明周期从K8S角度不能像作业的svc一样带ownerreference。 > > 是的,Flink K8s HA 是基于 ConfigMap 开发的,并且 HA configmap 没有设置 > o

Re: Flink k8s HA 手动删除作业deployment导致的异常

2022-06-12 Thread Zhanghao Chen
1.基于K8S做HA的Flink任务要想正常,不能手动删除作业deployment,必须通过cancel,stop命令进行停止。基于上面我猜测Flink k8s HA是基于ConfigMap之上开发的,其声明周期从K8S角度不能像作业的svc一样带ownerreference。 是的,Flink K8s HA 是基于 ConfigMap 开发的,并且 HA configmap 没有设置 ownerreference,因此如果想在保留 HA 数据的情况下重启集群直接 delete deployment 就行,重启后会从最新 cp 恢复。 2.基于k8s做HA的Flink job id

Flink k8s HA 手动删除作业deployment导致的异常

2022-06-12 Thread m18814122325
-sql-application-job-cluster-config-map 1 13m 我有以下疑问: 1.基于K8S做HA的Flink任务要想正常,不能手动删除作业deployment,必须通过cancel,stop命令进行停止。基于上面我猜测Flink k8s HA是基于ConfigMap之上开发的,其声明周期从K8S角度不能像作业的svc一样带ownerreference。 2.基于k8s做HA的Flink job id皆为

Re:Re: flink k8s ha

2022-06-08 Thread json
恩,明白保留HA配置的意义了但感觉是不是有bug,看我的问题,重启报找不到 /high-availability.storageDir/task/completedCheckpointe5c125ad20ea 文件但oss上的HA目录只有 /high-availability.storageDir/task/completedCheckpointacdfb4309903既HA的configmap 信息和 high-availability.storageDir 目录里的文件不一致了 在 2022-06-08 23:06:03,"Weihua Hu" 写道: >Hi, >删除

Re: flink k8s ha

2022-06-08 Thread Weihua Hu
Hi, 删除 deployment 会将关联到这个 Deployment 的 Pod、Service、flink-conf configmap 等删除。但是 HA 相关的 configmap 没有配置 owner reference,是不会被删除的。主要目的是集群重启时可以从之前的HA 状态中恢复。更多内容参考官方文档[1] [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/ha/kubernetes_ha/#high-availability-data-clean-up Best,

flink k8s ha

2022-06-08 Thread json
flink1.13.6 on k8s application 模式,设置HA high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory high-availability.storageDir: oss 会在 k8s 上生成configmap 1. 但在 k8s 删除此任务的 deployment 后,为什么这些configmap还在?(任务都删了,这些ha应该不需要了吧) 2. 任务重新启动后,还是会去这些 configmap

Re: multiple pipeline deployment using flink k8s operator

2022-06-01 Thread Yang Wang
Best, Yang Sigalit Eliazov 于2022年6月1日周三 14:54写道: > Hi all, > we just started using the flink k8s operator to deploy our flink cluster. > From what we understand we are only able to start a flink cluster per job. > So in our case when we have 2 jobs we have to create 2 different clusters

multiple pipeline deployment using flink k8s operator

2022-06-01 Thread Sigalit Eliazov
Hi all, we just started using the flink k8s operator to deploy our flink cluster. >From what we understand we are only able to start a flink cluster per job. So in our case when we have 2 jobs we have to create 2 different clusters. obviously we would prefer to deploy these 2 job which rel

[QUESTION] In Flink k8s Application mode with HA can not using History Server for history backend

2022-05-11 Thread 谭家良
In Flink k8s application mode with high-availability, it's job id always 00, but in history server, it make job's id for the key. How can I using the application mode with HA and store the history job status with history server? Best, tanjialiang.

flink ????????k8s????????jar??????????

2022-04-25 Thread ????????
flink??kubernetes session

flink ????????k8s????????jar??????????

2022-04-25 Thread ????????
flink??kubernetes session jar ??!

Re: Flink + K8s

2021-11-02 Thread Austin Cawley-Edwards
Hi Rommel, That’s correct that K8s will restart the JM pod (assuming it’s been created by a K8s Job or Deployment), and it will pick up the HA data and resume work. The only use case for having multiple replicas is faster failover, so you don’t have to wait for K8s to provision that new pod

Flink + K8s

2021-11-02 Thread Rommel Holmes
Hi, >From my understanding, when i set Flink in HA mode in K8s, I don't need to setup more than 1 job manager, because once the job manager dies, K8s will restart it for me. Is that the correct understanding or for the HA purpose, I still need to setup more than 1 job manager? Thanks. Rommel

回复:flink k8s部署使用s3做HA问题

2021-07-27 Thread johnjlong
Caused by: java.lang.ClassNotFoundException: com.amazonaws.services.s3.model.AmazonS3Exception | | johnjlong | | johnjl...@163.com | 签名由网易邮箱大师定制 在2021年7月27日 15:18,maker_d...@foxmail.com 写道: 各位开发者: 大家好! 我在使用flink native Kubernetes方式部署,使用minio做文件系统,配置如下: state.backend: filesystem

flink k8s部署使用s3做HA问题

2021-07-27 Thread maker_d...@foxmail.com
各位开发者: 大家好! 我在使用flink native Kubernetes方式部署,使用minio做文件系统,配置如下: state.backend: filesystem fs.allowed-fallback-filesystems: s3 s3.endpoint: http://172.16.14.40:9000 s3.path-style: true s3.access-key: admin s3.secret-key: admin123

Re: flink k8s高可用如何使用oss作为high-availability.storageDir?

2021-02-17 Thread Yang Wang
使用社区官方镜像flink:1.12.1,你需要配置如下参数 最后两个参数是通过环境变量的方式来enable oss的plugin high-availability.storageDir: oss://flink/flink-ha fs.oss.endpoint: fs.oss.accessKeyId: fs.oss.accessKeySecret: containerized.master.env.ENABLE_BUILT_IN_PLUGINS: flink-oss-fs-hadoop-1.12.1.jar

flink k8s高可用如何使用oss作为high-availability.storageDir?

2021-02-17 Thread casel.chen
如题,在k8s环境下不想使用hdfs作为high-availability.storageDir,有没有办法直接使用oss呢?checkpoint和savepoint已经能够使用oss了。