Hi Pachara.
We are also struggling to get a PyFlink job to run. We are hitting a different
wall, but I would like to share some thoughts here.
At the moment, we are launching our fleet of jobs from a specially crafted
Docker image: Flink 1.20.1 + plugins + PyFlink + our Python libs + our jobs.
From that image we invoke “flink run -py … --pyFiles …”, so the image runs as a
Flink client and talks to the Flink session cluster.
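For reference, a minimal sketch of that kind of client invocation (the host,
paths, and file names below are placeholders, not our real values):

# Submit a PyFlink job from the client image to a running session cluster.
# -m points at the session cluster's JobManager REST endpoint.
./bin/flink run \
  -m flink-session-rest.namespace-b:8081 \
  --python /opt/jobs/my_job.py \
  --pyFiles /opt/libs/our_python_libs.zip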
The reason I am mentioning all of this is that, I suspect, under the covers the
Flink Kubernetes Operator works in a similar fashion. When you create a
Kubernetes custom resource, the operator acts as a client and tries to read the
files you mention.
And there lies the problem: file:///flink-tmp/... is a reference to a local
file, but local to what? Local to the Kubernetes operator pod, I suspect. That
is why some examples show the following:
spec:
  job:
    jarURI: local:///opt/flink/lib/flink-python-1.20.1.jar
…
This fails for us, stating that “local” is not a recognized URI scheme. Never
mind, we skipped it (and hit another wall).
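For what it is worth, the examples I have seen use local:// in application-mode
FlinkDeployments, where the path is resolved inside the Flink containers built
from the job image, not on the operator pod. A rough sketch along those lines
(the image name and jar path are illustrative, not from this thread):

apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example-application
spec:
  image: my-registry/flink-with-job:1.20.1  # custom image containing the job jar
  flinkVersion: v1_20
  serviceAccount: flink
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    # local:// is resolved inside the Flink containers, not on the operator pod
    jarURI: local:///opt/flink/usrlib/my-job.jar
    parallelism: 2
    upgradeMode: stateless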
Sooo… you could try putting your Python files on some shared storage, like S3,
and see if that works. If it does, please tell us.
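If you do try that route, the session job spec might look roughly like this.
Completely untested on our side (hence the “please tell us”): the bucket and
script path are made up, and an s3:// URI will only resolve where the S3
filesystem plugin is on the classpath.

apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: transform-confirmed-trade-job
  namespace: namespace-b
spec:
  deploymentName: flink-cluster
  job:
    # the operator fetches this jar itself, so a remote URI sidesteps
    # the "local to what?" question for the jar
    jarURI: https://repo1.maven.org/maven2/org/apache/flink/flink-python/1.20.1/flink-python-1.20.1.jar
    entryClass: org.apache.flink.client.python.PythonDriver
    args: ["-pyclientexec", "/usr/bin/python3",
           "--python", "s3://my-bucket/jobs/transform_confirmed_trade.py"]
    parallelism: 2
    upgradeMode: stateless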
Nikola.
From: Pachara Aryuyuen (Kas) <[email protected]>
Date: Monday, July 7, 2025 at 8:38 AM
To: [email protected] <[email protected]>
Cc: Santhana Jinjakam (Eye) <[email protected]>
Subject: [PyFlink] Issue Deploying FlinkSessionJob with PVC and Python Script
Access
Dear Flink Community,
I’m currently working on deploying a FlinkSessionJob using PyFlink and have
encountered an issue I hope someone can help clarify.
Here’s my setup:
* I installed the Flink Kubernetes Operator in namespace A
* I deployed a FlinkDeployment in namespace B
* I’m trying to deploy a FlinkSessionJob in namespace B
The job fails with a file not found error related to the jarURI. I’m
referencing a JAR file located at /flink-tmp/, which is mounted as a PVC on the
FlinkDeployment. This PVC is backed by EFS, so it should be accessible across
pods. I’ve confirmed that the file exists at that path.
I also tried using the following public JAR instead:
jarURI:
https://repo1.maven.org/maven2/org/apache/flink/flink-python/1.20.1/flink-python-1.20.1.jar
However, even with this change, the Python script located in the same mounted
volume (/flink-tmp/transform_confirmed_trade.py) still cannot be found.
I haven’t been able to find any best practices or working examples of
FlinkSessionJob with PyFlink. Do you have any sample manifests or guidance for
this use case?
Here is the manifest I’m using:
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: transform-confirmed-trade-job
  namespace: namespace-b
spec:
  deploymentName: flink-cluster
  job:
    jarURI: file:///flink-tmp/flink-python-1.20.1.jar
    entryClass: "org.apache.flink.client.python.PythonDriver"
    args: ["-pyclientexec", "/usr/bin/python3", "--python",
           "/flink-tmp/transform_confirmed_trade.py"]
    parallelism: 2
    upgradeMode: stateless
Additional Question:
Is it possible to configure TaskManagers to run permanently (i.e., not shut
down when idle)? If so, what configuration options are required to achieve this
behavior?
Any help or pointers would be greatly appreciated.
Best regards,
Pachara Aryuyuen (Kas)
Cloud Platform Engineer