Hi all,
we're running a session cluster and submit around 20 jobs to it at the
same time by creating FlinkSessionJob Kubernetes resources. After
sufficient time, all 20 jobs are created and running healthily. However, it
appears that some jobs are started with the wrong config. As a result some
Hi Gyula,
Thanks for getting back and explaining the difference in the
responsibilities of the autoscaler and the operator.
I figured out what the issue was.
Here is what I was trying to do: the autoscaler had initially down-scaled
(2->1) the flinkDeployment so there was
Hi Chetas,
The operator logic itself would normally call the rescale api during the
upgrade process, not the autoscaler module. The autoscaler module sets the
correct config with the parallelism overrides, and then the operator
performs the regular upgrade cycle (as when you yourself change
Hello,
We recently upgraded the operator to 1.8.0 to leverage the new autoscaling
features (
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.8/docs/custom-resource/autoscaler/).
The FlinkDeployment (application cluster) is set to flink v1_18 as well. I
am able to
kcz
573693...@qq.com
Hi,
I'm playing with a Flink 1.18 demo with the auto-scaler and the adaptive
scheduler.
The operator can correctly collect data and order the job to scale up, but
it takes the job several rescaling attempts to reach the required parallelism.
E.g. The original parallelism for each vertex is something like
Hi! I'm deploying a job via the Flink K8s Operator with these settings in
the FlinkDeployment resource:
```
spec:
  flinkConfiguration:
    taskmanager.host: 0.0.0.0   # <-- ignored / not applied
```
When I look into the flink-conf.yaml in the TM the setting is not there. Is
there any rea
sources:
>>> - deployments
>>> - deployments/finalizers
>>> verbs:
>>> - '*'
>>> ---
>>> apiVersion: rbac.authorization.k8s.io/v1
>>> kind: RoleBinding
>>> metadata:
>>> labels:
>>> app.kubernetes.io/
>> verbs:
>> - '*'
>> ---
>> apiVersion: rbac.authorization.k8s.io/v1
>> kind: RoleBinding
>> metadata:
>> labels:
>> app.kubernetes.io/name: flink-kubernetes-operator
>> app.kubernetes.io/version: 1.5.0
>> name: flink-role-binding
>
-binding
> roleRef:
> apiGroup: rbac.authorization.k8s.io
> kind: Role
> name: flink
> subjects:
> - kind: ServiceAccount
> name: flink
> EOF
>
> Hopefully that helps.
>
>
> On Tue, Sep 19, 2023 at 5:40 PM Krzysztof Chmielewski <
> krzysiek.chmielew.
want to have Flink
deployments in.
kubectl apply -f - <<EOF
> Hi community,
> I was wondering if anyone tried to deploy Flink using Flink k8s operator
> on machine where OKD [1] is installed?
>
> We have tried to install Flink k8s operator version 1.6 which seems to
> succeed
Hi community,
I was wondering if anyone tried to deploy Flink using Flink k8s operator on
machine where OKD [1] is installed?
We have tried to install Flink k8s operator version 1.6 which seems to
succeed, however when we try to deploy simple Flink deployment we are
getting an error.
2023-09-19
FYI, adding the environment variable
`KUBERNETES_DISABLE_HOSTNAME_VERIFICATION=true` works for me.
This env variable needs to be added to both the Flink operator and the
Flink job definition.
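For reference, a minimal sketch of where this variable can go on each side; the exact keys (`operatorPod.env` in the operator Helm chart values, `podTemplate` in the FlinkDeployment) are assumptions to verify against your chart and CRD version:

```yaml
# Operator side: Helm values for flink-kubernetes-operator (assumed key)
operatorPod:
  env:
    - name: KUBERNETES_DISABLE_HOSTNAME_VERIFICATION
      value: "true"
---
# Job side: pod template inside the FlinkDeployment resource
spec:
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          env:
            - name: KUBERNETES_DISABLE_HOSTNAME_VERIFICATION
              value: "true"
```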
On Tue, Aug 8, 2023 at 12:03 PM Xiaolong Wang
wrote:
> Ok, thank you.
>
> On Tue, Aug 8, 2023 at
TM to Flink Session cluster via Java K8s client if Session
>>> Cluster has running jobs?
>>>
>>> Thanks,
>>> Krzysztof
>>>
>>> On Fri, Aug 25, 2023 at 23:48 Krzysztof Chmielewski <
>>> krzysiek.chmielew...@gmail.com> wrote:
>
wrote:
>>
>>> Hi community,
>>> I have a use case where I would like to add an extra TM to a running
>>> Flink session cluster that has Flink jobs deployed. Session cluster
>>> creation, job submission and cluster patching is done using flink k8s
wski <
> krzysiek.chmielew...@gmail.com> wrote:
>
>> Hi community,
>> I have a use case where I would like to add an extra TM to a running
>> Flink session cluster that has Flink jobs deployed. Session cluster
>> creation, job submission and
wrote:
> Hi community,
> I have a use case where I would like to add an extra TM to a running
> Flink session cluster that has Flink jobs deployed. Session cluster
> creation, job submission and cluster patching is done using flink k8s
> operator Java API. The Details of this are pre
Hi community,
I have a use case where I would like to add an extra TM to a running Flink
session cluster that has Flink jobs deployed. Session cluster creation, job
submission and cluster patching is done using flink k8s operator Java API.
The Details of this are presented here [1]
I would like
obs using Apache Flink
> k8s operator.
> Where actions like job submission (new and from save point), Job cancel
> with save point, cluster creations will be triggered from Java based micro
> service.
>
> Is there any recommended/Dedicated Java API for Flink k8s operator?
> I se
Hi,
I have a use case where I would like to run Flink jobs using Apache Flink
k8s operator.
Where actions like job submission (new and from save point), Job cancel
with save point, cluster creations will be triggered from Java based micro
service.
Is there any recommended/Dedicated Java API
Ok, thank you.
On Tue, Aug 8, 2023 at 11:22 AM Peter Huang
wrote:
> We will handle it asap. Please check the status of this jira
> https://issues.apache.org/jira/browse/FLINK-32777
>
> On Mon, Aug 7, 2023 at 8:08 PM Xiaolong Wang
> wrote:
>
>> Hi,
>>
>> I was testing flink-kubernetes-operator
We will handle it asap. Please check the status of this jira
https://issues.apache.org/jira/browse/FLINK-32777
On Mon, Aug 7, 2023 at 8:08 PM Xiaolong Wang
wrote:
> Hi,
>
> I was testing flink-kubernetes-operator in an IPv6 cluster and found out
> the below issues:
>
> *Caused by:
Hi,
I was testing flink-kubernetes-operator in an IPv6 cluster and found out
the below issues:
*Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname
> fd70:e66a:970d::1 not verified:*
>
> *certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=*
>
> *DN:
I have fixed the issue by increasing the CPU and memory for my JM and TM
pods.
Make sure your instance type can accommodate the required resources.
On Wed, 19 Jul 2023 at 13:35, Orkhan Dadashov
wrote:
> Hi Flink users,
>
> I'm following up on this guide to try the Flink K8S operat
Hi Flink users,
I'm following up on this guide to try the Flink K8S operator (1.5.0 version
):
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.5/docs/try-flink-kubernetes-operator/quick-start/
When I try to deploy a basic example, JM and TM start, but TM fails
Hi Flink users,
I'm following up on this guide to try the Flink K8S operator (1.5.0 version
):
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.5/docs/try-flink-kubernetes-operator/quick-start/
When I try to deploy a basic example, JM and TM start, but TM fails
Maybe you have inconsistent operator / CRD versions? In any case I highly
recommend upgrading to the latest operator version to get all the bug /
security fixes and improvements.
Gyula
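For the Helm-based install from the quickstart, the upgrade is roughly as follows (the version number is only an example; pick the latest release, and note that Helm does not upgrade CRDs automatically, so check the operator upgrade notes):

```shell
helm repo add flink-operator-repo \
  https://downloads.apache.org/flink/flink-kubernetes-operator-1.8.0/
helm repo update
helm upgrade flink-kubernetes-operator \
  flink-operator-repo/flink-kubernetes-operator
```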
On Wed, 12 Jul 2023 at 10:58, Paul Lam wrote:
> Hi,
>
> I’m using K8s operator 1.3.1 with Flink 1.15.2 on 2
Hi,
I’m using K8s operator 1.3.1 with Flink 1.15.2 on 2 K8s clusters. Weird enough,
on one K8s cluster the flink deployments would show savepoint trigger nonce.
while the flink deployments on the other cluster wouldn’t.
The normal output is as follows:
```
Last Savepoint:
Format
Hi Gyula,
Thank you and sorry for the late response.
My use case is that users may run finite jobs (either batch jobs or finite
stream jobs), leaving a lot of deprecated flink deployments around. I’ve filed
a ticket[1].
[1] https://issues.apache.org/jira/browse/FLINK-32143
Best,
Paul Lam
>
There is no such feature currently, Kubernetes resources usually do not
delete themselves :)
The problem I see here is by deleting the resource you lose all information
about what happened, you won't know if it failed or completed etc.
What is the use-case you are thinking about?
If this is
Hi all,
Currently, if a job turns into terminated status (e.g. FINISHED or FAILED),
the flinkdeployment remains until a manual cleanup is performed. I went
through the docs but did not find any way to clean them up automatically.
Am I missing something? Thanks!
Best,
Paul Lam
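Until the operator grows such a feature, one workaround is an external sweep of terminal deployments. A sketch (the `.status.jobStatus.state` path is an assumption to verify against your operator version, and deleting the resource does discard the completion/failure info):

```shell
# Delete FlinkDeployments whose job reached FINISHED (requires jq)
kubectl get flinkdeployments -o json \
  | jq -r '.items[] | select(.status.jobStatus.state == "FINISHED") | .metadata.name' \
  | xargs -r kubectl delete flinkdeployment
```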
- Replied mail
> | From | Weihua Hu |
> | Date | Mar 14, 2023 10:39 |
> | To | |
> | Subject | Re: flink k8s deployment startup error |
> Hi,
>
> From the exception, the Flink cluster found DirtyResults data on the HA path
> at startup, but the data is incomplete and cannot be read.
> You can check the relevant HA path per the docs [1] and clean up the corrupted data.
>
> One more question: was a Flink cluster started with the same cluster-id before?
>
Hello,
I found my HA directory. Could you advise how to determine which data is dirty
and safe to delete? Is there any way to tell? Everything I see looks like system data.
Jason_H
hyb_he...@163.com
Replied mail
| From | Weihua Hu |
| Date | Mar 14, 2023 10:39 |
| To | |
| Subject | Re: flink k8s deployment startup error |
Hi,
From the exception, the Flink cluster found DirtyResults data on the HA path
at startup, but the data is incomplete and cannot be read.
Per the docs [1], check the relevant
Hello,
Yes, it started normally before, then suddenly failed. I restarted the pod
directly and it has kept reporting this error since.
Jason_H
hyb_he...@163.com
Replied mail
| From | Weihua Hu |
| Date | Mar 14, 2023 10:39 |
| To | |
| Subject | Re: flink k8s deployment startup error |
Hi,
From the exception, the Flink cluster found DirtyResults data on the HA path
at startup, but the data is incomplete and cannot be read.
You can check the relevant HA path per the docs [1] and clean up the corrupted data.
One more question: was it previously
Hi,
From the exception, the Flink cluster found DirtyResults data on the HA path
at startup, but the data is incomplete and cannot be read.
You can check the relevant HA path per the docs [1] and clean up the corrupted data.
One more question: was a Flink cluster started with the same cluster-id before?
[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#job-result-store-storage-path
Best,
Weihua
On Tue, Mar 14, 2023 at
Hi everyone,
A question: the Flink cluster I deployed on K8s fails to start with the
following error. Has anyone run into this?
java.util.concurrent.CompletionException:
org.apache.flink.util.FlinkRuntimeException: Could not retrieve JobResults of
globally-terminated jobs from JobResultStore
at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown
Yep!
Simple oversight, it was :/
Cheers,
Matyas
On Thu, Feb 23, 2023 at 10:54 PM Gyula Fóra wrote:
> Hey!
> You are right, these fields could have been of the PodTemplate /
> PodTemplateSpec type (probably PodTemplateSpec is actually better).
>
> I think the reason why we used it is two fold:
Hey!
You are right, these fields could have been of the PodTemplate /
PodTemplateSpec type (probably PodTemplateSpec is actually better).
I think the reason why we used it is two fold:
- Simple oversight :)
- Flink itself "expects" the podtemplate in this form for the native
integration as you
Hi all,
Why does the FlinkDeployment CRD refer to the Pod class instead of the
PodTemplate class from the fabric8 library? As far as I can tell, the only
difference is that the Pod class exposes the PodStatus, which doesn't seem
mutable. Thanks in advance!
Best,
Mason
fecycleState enum if you want a single condensed status view.
>
> Cheers
> Gyula
>
> On Tue, 6 Dec 2022 at 04:12, Paul Lam <paullin3...@gmail.com> wrote:
> Hi all,
>
> I’m trying out Flink K8s operator 1.2 with K8s 1.25 and Flink 1.15.
>
> I found kub
want a single condensed status view .
Cheers
Gyula
On Tue, 6 Dec 2022 at 04:12, Paul Lam wrote:
> Hi all,
>
> I’m trying out Flink K8s operator 1.2 with K8s 1.25 and Flink 1.15.
>
> I found kubectl shows that flinkdeployments stay in DEPLOYED like forever
> (the Flink job
Hi all,
I’m trying out Flink K8s operator 1.2 with K8s 1.25 and Flink 1.15.
I found kubectl shows that flinkdeployments stay in DEPLOYED seemingly forever (the
Flink job status is RUNNING), but the operator logs show that the
flinkdeployments have already turned into STABLE.
Is that a known bug
liminate most of these cases.
>
> Cheers,
> Gyula
>
> On Tue, Nov 22, 2022 at 9:43 AM Dongwon Kim wrote:
>
>> Hi,
>>
>> While using a last-state upgrade mode on flink-k8s-operator-1.2.0 and
>> flink-1.14.3, we're occasionally facing the following error:
eliminate most of these cases.
Cheers,
Gyula
On Tue, Nov 22, 2022 at 9:43 AM Dongwon Kim wrote:
> Hi,
>
> While using a last-state upgrade mode on flink-k8s-operator-1.2.0 and
> flink-1.14.3, we're occasionally facing the following error:
>
> Status:
>> Cluster Info:
Hi,
While using a last-state upgrade mode on flink-k8s-operator-1.2.0 and
flink-1.14.3, we're occasionally facing the following error:
Status:
> Cluster Info:
> Flink - Revision: 98997ea @ 2022-01-08T23:23:54+01:00
> Flink - Version: 1.14.3
> Perhaps you could create a jira issue to follow up on this.
>
> Best,
> Weihua
>
>
>> On Thu, Oct 27, 2022 at 6:51 PM Young Chen wrote:
>>
>> [Problem description]
>>
>> With the Flink k8s operator (v1.1.0) I deployed an HA Flink Session Cluster
>> (two JobManagers), then deployed an example job with a SessionJob; the job
> [Problem description]
>
> With the Flink k8s operator (v1.1.0) I deployed an HA Flink Session Cluster
> (two JobManagers), then deployed an example job with a SessionJob. The job
> can sometimes be deployed and sometimes not.
>
> The following error log can be seen in the container.
>
>
>
> [Steps to reproduce]
>
> Deploy the cluster:
>
>
>
> apiVersion: flink.apache.org/v1beta1
>
> kind: Flink
[Problem description]
With the Flink k8s operator (v1.1.0) I deployed an HA Flink Session Cluster
(two JobManagers), then deployed an example job with a SessionJob. The job can
sometimes be deployed and sometimes not.
The following error log can be seen in the container.
[Steps to reproduce]
Deploy the cluster:
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
name: flink-cluster-1jm-checkpoint
spec:
image: flink
(Tuesday) 3:33 PM
> To: "user-zh"
> Subject: flink-k8s-operator CRD status is unclear when a batch job finishes
>
>
>
> hi,
> I'm deploying a batch job with flink-k8s-operator. I found that after the
> batch job finishes, the status of the flink-k8s-operator FlinkDeployment CRD
> changes: jobManagerDeploymentStatus becomes "missing"
ing a flink batch job with flink-k8s-operator.
> My flink-k8s-operator's version is 1.2.0 and flink's version is 1.14.6. I
> found that after the batch job finished, the jobManagerDeploymentStatus
> field became "MISSING" in the FlinkDeployment CRD, and the error field became
hi,
I'm deploying a batch job with flink-k8s-operator. I found that after the batch
job finishes, the flink-k8s-operator FlinkDeployment CRD status changes:
jobManagerDeploymentStatus becomes "missing" and "error" becomes "Missing
JobManager deployment". I think this is caused by native-k8s automatically
deleting the JobManager deployment after the batch job finishes. How can I
determine, via the C
Hi, I'm deploying a flink batch job with flink-k8s-operator. My
flink-k8s-operator's version is 1.2.0 and flink's version is 1.14.6. I found that
after the batch job finished, the jobManagerDeploymentStatus field became
"MISSING" in the FlinkDeployment CRD. And the error fi
Hadoop conf directory exists in the image.
For flink-k8s-operator, another feasible solution is to create a
hadoop-config-configmap manually and then use
*"kubernetes.hadoop.conf.config-map.name"* to mount it to the JobManager
and TaskManager
Hi, community:
I'm using flink-k8s-operator v1.2.0 to deploy a flink job. The
"HADOOP_CONF_DIR" environment variable was set in the image that I built
from flink:1.15. I found the taskmanager pod was trying to mount a volume
named "hadoop-config-volume"
The webhook's main job is CR validation, so mistakes are caught before the
resource is submitted to K8s, e.g. parallelism mistakenly set to a negative
value, or jarURI not set.
Best,
Yang
Kyle Zhang wrote on Wed, Jul 27, 2022 at 18:59:
> Hi, all
> I've recently been looking at flink-k8s-operator [1]. The architecture
> includes a flink-webhook. What is this container for, and what is the impact
> on overall functionality if webhook.create=false is configured?
>
> Best regards
>
> [1]
>
> h
Hi, all
I've recently been looking at flink-k8s-operator [1]. The architecture includes
a flink-webhook. What is this container for, and what is the impact on overall
functionality if webhook.create=false is configured?
Best regards
[1]
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.1/docs/concepts/architecture/
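For completeness, the webhook can be left out at install time; with it disabled, invalid specs are rejected later by the reconciliation loop instead of at admission. A sketch of the Helm invocation (repo alias assumed from the quickstart setup):

```shell
helm install flink-kubernetes-operator \
  flink-operator-repo/flink-kubernetes-operator \
  --set webhook.create=false
```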
Hi,
For usage docs, see:
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes
For the design doc, see:
https://docs.google.com/document/d/1-jNzqGF6NfZuwVaFICoFQ5HFFXzF5NVIagUZByFMfBY/edit?usp=sharing
jira: https://issues.apache.org/jira/browse/FLINK-9953
Best,
Lijie
Flink version: 1.15.0
(question about running Flink 1.15.0 on native K8s; original text garbled) :)
it. I
chalk all that up to just me lacking a bit of experience with k8s.
That being said... It's all working now and I documented the deployment
over here:
https://hop.apache.org/manual/next/pipeline/beam/flink-k8s-operator-running-hop-pipeline.html
A big thank you to everyone that helped me out
Could you please share the JobManager logs of failed deployment? It will
also help a lot if you could show the pending pod status via "kubectl
describe ".
Given that the current Flink Kubernetes Operator is built on top of native
K8s integration[1], the Flink ResourceManager should allocate
Yes, of course. I already feel a bit less intelligent for having asked the
question ;-)
The status now is that I managed to have it all puzzled together. Copying
the files from s3 to an ephemeral volume takes all of 2 seconds so it's
really not an issue. The cluster starts and our fat jar and
Hi Matt,
Yes. There are several official Flink images with various JVMs including
Java 11.
https://hub.docker.com/_/flink
Cheers,
Matyas
On Fri, Jun 24, 2022 at 2:06 PM Matt Casters
wrote:
> Hi Mátyás & all,
>
> Thanks again for the advice so far. On a related note I noticed Java 8
> being
Hi Mátyás & all,
Thanks again for the advice so far. On a related note I noticed Java 8
being used, indicated in the log.
org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] -
JAVA_HOME: /usr/local/openjdk-8
Is there a way to use Java 11 to start Flink with?
Kind regards,
Matt
Thanks for your valuable inputs.
To make deploying Flink on K8s easy as a normal Java application
is certainly the mission of Flink Kubernetes Operator. Obviously, we are
still a little far from this mission.
Back to the user jars download, I think it makes sense to introduce the
artifact fetcher
Hi Yang,
Thanks for the suggestion! I looked into this volume sharing on EKS
yesterday but I couldn't figure it out right away.
The way that people come into the Apache Hop project is often with very
little technical knowledge since that's sort of the goal of the project:
make things easy.
Hi Matyas,
Again thank you very much for the information. I'm a beginner and all
the help is really appreciated. After some diving into the script
behind s3-artifiact-fetcher I kind of figured it out. Have an folder
sync'ed into the pod container of the task manager. Then I guess we should
be
Matyas and Gyula have shared a lot of great information about how to make the
Flink Kubernetes Operator work on EKS.
One more input on how to prepare the user jars: if you are more familiar
with K8s, you could use a persistent volume to provide the user jars and then
mount the volume to
Hi Matt,
I believe an artifact fetcher (e.g
https://hub.docker.com/r/agiledigital/s3-artifact-fetcher ) + the pod
template (
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/pod-template/#pod-template)
is an elegant way to solve your problem.
The
Thank you very much for the help Matyas and Gyula!
I just saw a video today where you were presenting the FKO. Really nice
stuff!
So I'm guessing we're executing "flink run" at some point on the master and
that this is when we need the jar file to be local?
Am I right in assuming that this
A small addition to what Matyas has said:
The limitation of only supporting local scheme is coming from the Flink
Kubernetes Application mode directly and is not related to the operator
itself.
Once this feature is added to Flink itself the operator can also support it
for newer Flink versions.
Hi Matt,
- In FlinkDeployments you can utilize an init container to download your
artifact onto a shared volume, then you can refer to it as local:/.. from
the main container. FlinkDeployments comes with pod template support
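A sketch of that init-container pattern (the image, URL, and paths are placeholders, not from this thread):

```yaml
spec:
  podTemplate:
    spec:
      volumes:
        - name: artifacts
          emptyDir: {}
      initContainers:
        - name: fetch-job-jar              # hypothetical downloader
          image: busybox:1.36
          command: ["wget", "-O", "/opt/flink/artifacts/job.jar",
                    "https://example.com/my-job.jar"]
          volumeMounts:
            - name: artifacts
              mountPath: /opt/flink/artifacts
      containers:
        - name: flink-main-container
          volumeMounts:
            - name: artifacts
              mountPath: /opt/flink/artifacts
  job:
    jarURI: local:///opt/flink/artifacts/job.jar
```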
Hi Flink team!
I'm interested in getting the new Flink Kubernetes Operator to work on AWS
EKS. Following the documentation I got pretty far. However, when trying
to run a job I got the following error:
Only "local" is supported as schema for application mode. This assumes
> that the jar is
the HA data leakage
>
> [1].
> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/ha/kubernetes_ha/#high-availability-data-clean-up
>
>
> Best,
> Yang
>
> Zhanghao Chen wrote on Mon, Jun 13, 2022 at 07:53:
>
>> 1. For a Flink job using K8s-based HA to work properly, you must not
>> manually delete the job's deployment; it has to be stopped with the
>> cancel/stop commands. Based on this I
-availability-data-clean-up
Best,
Yang
Zhanghao Chen wrote on Mon, Jun 13, 2022 at 07:53:
> 1. For a Flink job using K8s-based HA to work properly, you must not manually
> delete the job's deployment; it has to be stopped with the cancel/stop
> commands. Based on this I guess Flink K8s HA is built on top of ConfigMaps,
> whose lifecycle, from the K8s point of view, cannot carry an ownerReference
> the way the job's svc does.
>
> Yes, Flink K8s HA is built on ConfigMaps, and the HA configmap does not set
> o
1. For a Flink job using K8s-based HA to work properly, you must not manually
delete the job's deployment; it has to be stopped with the cancel/stop commands.
Based on this I guess Flink K8s HA is built on top of ConfigMaps, whose lifecycle,
from the K8s point of view, cannot carry an ownerReference the way the job's svc does.
Yes, Flink K8s HA is built on ConfigMaps, and the HA configmap does not set an
ownerReference, so if you want to restart the cluster while keeping the HA data,
simply delete the deployment; after restart it will recover from the latest checkpoint.
2. With K8s-based HA, the Flink job id
-sql-application-job-cluster-config-map
1 13m
I have the following questions:
1. For a Flink job using K8s-based HA to work properly, you must not manually
delete the job's deployment; it has to be stopped with the cancel/stop commands.
Based on this I guess Flink K8s HA is built on top of ConfigMaps, whose lifecycle,
from the K8s point of view, cannot carry an ownerReference the way the job's svc does.
2. With K8s-based HA, the Flink job ids are all
OK, I see the point of keeping the HA configuration now, but I suspect there may
be a bug. In my case, after a restart it reports that the file
/high-availability.storageDir/task/completedCheckpointe5c125ad20ea
cannot be found, but the HA directory on OSS only contains
/high-availability.storageDir/task/completedCheckpointacdfb4309903; that is, the HA
configmap info and the files under the high-availability.storageDir directory are
now inconsistent.
On 2022-06-08 23:06:03, "Weihua Hu" wrote:
> Hi,
> Deleting
Hi,
Deleting the deployment removes the Pods, Services, flink-conf configmap, etc.
associated with that Deployment. But the HA-related configmaps are not configured
with an owner reference and will not be deleted. The main purpose is to let the
cluster recover from the previous HA state when it restarts. See the official docs
for more [1].
[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/ha/kubernetes_ha/#high-availability-data-clean-up
Best,
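Conversely, when the HA state is no longer needed, the Flink HA docs suggest deleting the leftover ConfigMaps by label selector (check the labels on your cluster's ConfigMaps first):

```shell
# <cluster-id> is the Flink cluster name
kubectl delete configmaps \
  --selector='app=<cluster-id>,configmap-type=high-availability'
```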
Flink 1.13.6 on K8s in application mode, with HA configured:
high-availability:
org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: oss
This creates configmaps on K8s.
1. But after deleting this job's deployment on K8s, why are these configmaps
still there? (The job has been deleted, so this HA data shouldn't be needed
anymore, right?)
2. After the job restarts, it still goes to these configmaps
Best,
Yang
Sigalit Eliazov 于2022年6月1日周三 14:54写道:
> Hi all,
> we just started using the flink k8s operator to deploy our flink cluster.
> From what we understand we are only able to start a flink cluster per job.
> So in our case when we have 2 jobs we have to create 2 different clusters
Hi all,
we just started using the flink k8s operator to deploy our flink cluster.
From what we understand we are only able to start a flink cluster per job.
So in our case when we have 2 jobs we have to create 2 different clusters.
obviously we would prefer to deploy these 2 job which rel
In Flink k8s application mode with high-availability, it's job id always
00, but in history server, it make job's id for the key. How can I
using the application mode with HA and store the history job status with
history server?
Best,
tanjialiang.
How to submit a jar to a Flink Kubernetes session cluster? Thanks!
Hi Rommel,
That’s correct that K8s will restart the JM pod (assuming it’s been created
by a K8s Job or Deployment), and it will pick up the HA data and resume
work. The only use case for having multiple replicas is faster failover, so
you don’t have to wait for K8s to provision that new pod
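For that faster-failover setup, the standby JobManagers can be requested directly in the Flink configuration when using the native K8s HA services (a sketch; the storage path is a placeholder):

```yaml
kubernetes.jobmanager.replicas: 2
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: s3://my-bucket/flink-ha   # placeholder path
```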
Hi,
>From my understanding, when i set Flink in HA mode in K8s, I don't need to
setup more than 1 job manager, because once the job manager dies, K8s will
restart it for me. Is that the correct understanding or for the HA purpose,
I still need to setup more than 1 job manager?
Thanks.
Rommel
Caused by: java.lang.ClassNotFoundException:
com.amazonaws.services.s3.model.AmazonS3Exception
johnjlong
johnjl...@163.com
On Jul 27, 2021 at 15:18, maker_d...@foxmail.com wrote:
Dear developers,
Hello!
I'm deploying with Flink native Kubernetes, using MinIO as the file system,
configured as follows:
state.backend: filesystem
Dear developers,
Hello!
I'm deploying with Flink native Kubernetes, using MinIO as the file system,
configured as follows:
state.backend: filesystem
fs.allowed-fallback-filesystems: s3
s3.endpoint: http://172.16.14.40:9000
s3.path-style: true
s3.access-key: admin
s3.secret-key: admin123
With the official community image flink:1.12.1, you need the following settings;
the last two parameters enable the OSS plugin via environment variables:
high-availability.storageDir: oss://flink/flink-ha
fs.oss.endpoint:
fs.oss.accessKeyId:
fs.oss.accessKeySecret:
containerized.master.env.ENABLE_BUILT_IN_PLUGINS:
flink-oss-fs-hadoop-1.12.1.jar
As the subject says: in a K8s environment I don't want to use HDFS as
high-availability.storageDir. Is there a way to use OSS directly? Checkpoints
and savepoints can already use OSS.