Hello,
I am using an Apache Flink cluster to run a streaming pipeline that I've created with Apache Beam. This streaming pipeline should be the only one of its type running on the Flink cluster, and I need some help ensuring that is the case.

A Dockerized pipeline runner program submits the streaming pipeline. If the pipeline exits (e.g. because of an error), the runner program exits and is re-run, so that the pipeline is submitted again and continues running. The problem I am running into is that the runner program can exit while the streaming pipeline is still running (e.g. because the job server went down and came back up). In that case, the runner program needs a way to check whether the pipeline is still running or has gone down before it resubmits.

My first thought was to submit the job under a specific name and then query Flink's REST API for that name to see whether the job is already running. I'm having trouble doing this: I seem to be able to set a job name in Beam, but that name does not seem to be accessible via Flink's REST API once the pipeline is run using Flink. (I've put a sketch of both halves of this approach in a P.S. at the end of this message.)

From researching this problem, I found this <https://github.com/apache/beam/blob/9a11e28ce79e3b243a13fbf148f2ba26b8c14107/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java#L340> method, which initializes an AppName. This seems promising, but it is written in Java and I am working in Python. Is there a way to specify the Flink job name via the Beam Python SDK? Or is there a simpler way to know that a particular Beam pipeline is running, and therefore not resubmit it?

Please let me know if you have any suggestions, either about how to execute the approaches I've described or about a simpler solution I am overlooking. Thank you for your help!

Best,
Adlae D'Orazio
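P.S. For concreteness, here is roughly what I'm doing. First, how I set the job name when submitting with the Python SDK. I'm assuming the portable FlinkRunner honors the --job_name option (the master address and job name below are placeholders), and that assumption may well be the part I have wrong:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions([
        '--runner=FlinkRunner',
        '--flink_master=localhost:8081',     # placeholder for our JobManager
        '--job_name=my-streaming-pipeline',  # the name I hope to see in Flink
    ])

    with beam.Pipeline(options=options) as pipeline:
        # The real streaming pipeline is elided; a trivial transform stands in.
        _ = pipeline | beam.Create(['placeholder']) | beam.Map(print)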
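And second, the check I would like the runner program to perform before resubmitting, against what I understand to be Flink's /jobs/overview endpoint (again, the address and job name are placeholders):

    import requests

    FLINK_REST = 'http://localhost:8081'   # placeholder JobManager address
    JOB_NAME = 'my-streaming-pipeline'     # must match the name Flink records

    def job_is_running(name: str) -> bool:
        """Return True if a job with the given name is currently RUNNING."""
        # /jobs/overview lists each job's id, name, and state; this is
        # where I expected the Beam-side job name to show up.
        resp = requests.get(f'{FLINK_REST}/jobs/overview', timeout=10)
        resp.raise_for_status()
        return any(
            job['name'] == name and job['state'] == 'RUNNING'
            for job in resp.json().get('jobs', [])
        )

    if not job_is_running(JOB_NAME):
        print('Job not found; resubmitting.')  # resubmission logic goes here

If the job name never makes it into Flink, this check obviously can't work, which is why I'm stuck on the naming question above.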