The fabric8 K8s client uses PATCH to replace get-and-update since v6.6.2.
That's why you also need to grant the PATCH permission to the K8s service
account. This helps decrease the pressure on the K8s APIServer. You can find
more information here[1].
[1].
Leader election is performed with a unified config map.
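For reference, the RBAC rule on the service account would then look roughly like
the following minimal sketch (role name and namespace are placeholders to adapt
to your own setup):

  # Sketch only: a Role granting the Flink service account the verbs it needs
  # on config maps, including the "patch" verb.
  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    name: flink-ha-role        # placeholder name
    namespace: flink           # placeholder namespace
  rules:
    - apiGroups: [""]
      resources: ["configmaps"]
      verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]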
Best,
Zhanghao Chen
From: Ethan T Yang
Sent: Wednesday, December 6, 2023 5:40
To: user@flink.apache.org
Subject: Flink Kubernetes HA
Hi Flink users,
After upgrading Flink (from 1.13.1 -> 1.18.
Never mind. The issue was fixed; the service account permission was missing the
“patch” verb, which led to the RPC service not starting.
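If anyone else hits this, one quick way to check whether the service account has
the verb is something along these lines (namespace and service account name are
placeholders):

  kubectl auth can-i patch configmaps \
    --as=system:serviceaccount:<namespace>:<service-account> \
    -n <namespace>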
> On Dec 5, 2023, at 1:40 PM, Ethan T Yang wrote:
>
> Hi Flink users,
> After upgrading Flink (from 1.13.1 -> 1.18.0), I noticed an issue when
> HA is
Hi Flink users,
After upgrading Flink (from 1.13.1 -> 1.18.0), I noticed an issue when HA
is enabled (see exception below). I am using a k8s deployment and I cleaned the
previous configmaps, like leader files etc. I know Pekko is a recent
thing. Can someone share a doc on how to use or
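For reference, a minimal flink-conf.yaml sketch for Kubernetes HA on 1.18 would
look roughly like this (the bucket and cluster id are placeholders, not actual
values from this setup):

  high-availability.type: kubernetes
  high-availability.storageDir: s3://<bucket>/recovery
  kubernetes.cluster-id: <cluster-id>   # placeholder; must be unique per Flink cluster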
Hello Matthias,
I'll extract the logs from the cluster and update that here.
For the TMs, I'll try to find the relevant logs; we had many of them deployed
at that time, and all of the logs may not be that interesting to upload.
Regards,
Gil
On Thu, Aug 26, 2021, 12:31 Matthias Pohl wrote:
> Hi
Hi Gil,
could you provide the complete logs (TaskManager & JobManager) for us to
investigate it? The error itself and the behavior you're describing sound
like expected behavior if there are not enough slots available for all the
submitted jobs to be handled in time. Have you tried increasing the
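(Assuming the question is about slot capacity, the settings that usually matter
here are along these lines in flink-conf.yaml; the values are illustrative:)

  taskmanager.numberOfTaskSlots: 4   # slots offered per TaskManager
  slot.request.timeout: 600000       # ms to wait for a slot before failing; default 300000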
Hello,
We are struggling a bit with an error in our kubernetes deployment.
The deployment is composed of 2 Flink job managers and 58 task managers.
When deploying the jobs, everything goes fine at first, but after the
deployment of several jobs (a mix of batch and streaming jobs using the SQL
From the implementation of DefaultCompletedCheckpointStore, Flink will only
retain the configured amount of checkpoints.
Maybe you could also check the content of the jobmanager-leader ConfigMap. It
should have the same number of pointers to the completedCheckpoint files.
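To see how many checkpoints should be retained and to look at the HA config maps,
something along these lines should work (the cluster id is a placeholder, and the
label selector is the one documented for Flink's Kubernetes HA config maps; adjust
it if your labels differ):

  # flink-conf.yaml: number of retained completed checkpoints (default is 1)
  state.checkpoints.num-retained: 3

  # list the HA-related config maps created by Flink
  kubectl get configmap --selector='app=<cluster-id>,configmap-type=high-availability'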
Best,
Yang
Ivan Yang
Thanks for the reply. Yes, we are seeing all the completedCheckpoint files and
they keep growing. We will revisit our k8s setup, configmaps, etc.
> On Jun 23, 2021, at 2:09 AM, Yang Wang wrote:
>
> Hi Ivan,
>
> For the completedCheckpoint files that keep growing, do you mean too many
> files
Hi Dear Flink users,
We recently enabled the ZooKeeper-less HA in our Kubernetes Flink
deployment. The setup has
high-availability.storageDir: s3://some-bucket/recovery
Since we have a relatively short retention policy (7 days) on the s3 bucket,
the HA will fail if the