Thanks, nacisimsek. I will try your suggestion.
------------------ Original ------------------
From: "nacisimsek" <[email protected]>;
Date: Thu, Jul 11, 2024 06:25 PM
To: "Enric Ott" <[email protected]>;
Cc: "user" <[email protected]>;
Subject: Re: Flink reactive deployment on with kubernetes operator
Hi Enric,
You can try using a PersistentVolumeClaim on your Kubernetes cluster as the storage backing the JobResultStore, instead of a local path from your underlying host, and see if it works:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flink-data-pvc
spec:
  resources:
    requests:
      storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
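As a rough sketch of how you could create and check the claim (assuming the manifest above is saved as flink-data-pvc.yaml and your cluster has a default StorageClass; the filename is just for illustration):

    # create the PVC in the namespace where the FlinkDeployment will run
    kubectl apply -f flink-data-pvc.yaml
    # verify the claim reaches the Bound state before deploying Flink
    kubectl get pvc flink-data-pvc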
And edit your YAML (spec.podTemplate.spec.volumes.persistentVolumeClaim.claimName) to use this PVC:
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-reactive-example
spec:
  image: flink:1.17
  flinkVersion: v1_17
  flinkConfiguration:
    scheduler-mode: REACTIVE
    taskmanager.numberOfTaskSlots: "2"
    state.savepoints.dir: file:///flink-data/savepoints
    state.checkpoints.dir: file:///flink-data/checkpoints
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: file:///flink-data/ha
  serviceAccount: flink
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          volumeMounts:
            - mountPath: /flink-data
              name: flink-volume
      volumes:
        - name: flink-volume
          persistentVolumeClaim:
            claimName: flink-data-pvc
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    upgradeMode: savepoint
    state: running
    savepointTriggerNonce: 0
  mode: standalone
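Roughly, you might then apply and verify it like this (assuming the edited manifest is saved as basic-reactive-example.yaml, and that the JobManager deployment name matches the CR name, which is how the operator normally names it):

    # apply the edited FlinkDeployment and let the operator reconcile it
    kubectl apply -f basic-reactive-example.yaml
    kubectl get flinkdeployment basic-reactive-example
    # check the JobManager log to confirm the JobResultStore error is gone
    kubectl logs deploy/basic-reactive-example -f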
Naci
On 11. Jul 2024, at 05:40, Enric Ott <[email protected]> wrote:
Hi, Community:
I have encountered a problem when deploying the reactive Flink scheduler on Kubernetes with Flink Kubernetes Operator 1.6.0; the manifest and exception stack info are listed below.
Any clues would be appreciated.
################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-reactive-example
spec:
  image: flink:1.17
  flinkVersion: v1_17
  flinkConfiguration:
    scheduler-mode: REACTIVE
    taskmanager.numberOfTaskSlots: "2"
    state.savepoints.dir: file:///flink-data/savepoints
    state.checkpoints.dir: file:///flink-data/checkpoints
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: file:///flink-data/ha
  serviceAccount: flink
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          volumeMounts:
            - mountPath: /flink-data
              name: flink-volume
      volumes:
        - name: flink-volume
          hostPath:
            # directory location on host
            path: /run/desktop/mnt/host/c/Users/24381/Documents/
            # this field is optional
            type: DirectoryOrCreate
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    upgradeMode: savepoint
    state: running
    savepointTriggerNonce: 0
  mode: standalone
ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error occurred in the cluster entrypoint.
java.util.concurrent.CompletionException: java.lang.IllegalStateException: The base directory of the JobResultStore isn't accessible. No dirty JobResults can be restored.
    at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) ~[?:?]
    at java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source) [?:?]
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: java.lang.IllegalStateException: The base directory of the JobResultStore isn't accessible. No dirty JobResults can be restored.
    at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.highavailability.FileSystemJobResultStore.getDirtyResultsInternal(FileSystemJobResultStore.java:199) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.highavailability.AbstractThreadsafeJobResultStore.withReadLock(AbstractThreadsafeJobResultStore.java:118) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.highavailability.AbstractThreadsafeJobResultStore.getDirtyResults(AbstractThreadsafeJobResultStore.java:100) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResults(SessionDispatcherLeaderProcess.java:194) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess.supplyUnsynchronizedIfRunning(AbstractDispatcherLeaderProcess.java:198) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResultsIfRunning(SessionDispatcherLeaderProcess.java:188) ~[flink-dist-1.17.1.jar:1.17.1]
    ... 4 more