Hi, I had a similar use case recently, and adding a metadata key solved the issue: https://github.com/GoogleCloudDataproc/initialization-actions/pull/334. You keep the original initialization action and add the key when creating the cluster, for example (using gcloud): '--metadata flink-snapshot-url=http://mirrors.up.pt/pub/apache/flink/flink-1.9.1/flink-1.9.1-bin-scala_2.11.tgz'
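To show where that flag fits, here is a minimal sketch of a full cluster-creation command. The cluster name, region, and bucket path are placeholders I made up for illustration; only the flink-snapshot-url key and the mirror URL come from the message above. It assumes the flink.sh you point at already includes the change from the linked pull request. By default the script only prints the command so the flags can be checked first.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with your own names and paths.
CLUSTER_NAME="my-flink-cluster"
REGION="us-central1"
INIT_ACTION="gs://my-bucket/flink/flink.sh"  # a copy of flink.sh that reads flink-snapshot-url

# Flink tarball to install instead of the default package (URL from the message above).
FLINK_SNAPSHOT_URL="http://mirrors.up.pt/pub/apache/flink/flink-1.9.1/flink-1.9.1-bin-scala_2.11.tgz"

# Print the command unless DRY_RUN=0 is set, in which case it is executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

create_cluster() {
  run gcloud dataproc clusters create "$CLUSTER_NAME" \
    --region "$REGION" \
    --initialization-actions "$INIT_ACTION" \
    --metadata "flink-snapshot-url=$FLINK_SNAPSHOT_URL"
}

create_cluster
```

Running the script with DRY_RUN=0 in the environment would execute the gcloud command instead of printing it. Once the cluster is up, the installed Flink version can be checked on the master node before submitting the Beam job.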
Cheers,
Pawel

________________________________
From: Ismaël Mejía <ieme...@gmail.com>
Sent: Friday, February 7, 2020 2:24 PM
To: Xander Song <iamuuriw...@gmail.com>; user@beam.apache.org <user@beam.apache.org>
Cc: u...@flink.apache.org <u...@flink.apache.org>
Subject: Re: Running a Beam Pipeline on GCP Dataproc Flink Cluster

+user@beam.apache.org

On Fri, Feb 7, 2020 at 12:54 AM Xander Song <iamuuriw...@gmail.com> wrote:

I am attempting to run a Beam pipeline on a GCP Dataproc Flink cluster. I have followed the instructions at this repo <https://github.com/GoogleCloudDataproc/initialization-actions/tree/master/flink> to create a Flink cluster on Dataproc using an initialization action. However, the resulting cluster uses version 1.5.6 of Flink, and my project requires a more recent version (version 1.7, 1.8, or 1.9) for compatibility with Beam <https://beam.apache.org/documentation/runners/flink/>.

Inside of the flink.sh script in the linked repo, there is a line for installing Flink from a snapshot URL instead of apt <https://github.com/GoogleCloudDataproc/initialization-actions/blob/81e453d8f8a036e371e144d5103aaa38ecb2c679/flink/flink.sh#L53>. Is this the correct mechanism for installing a different version of Flink using the initialization script? If so, how is it meant to be used?

Thank you in advance.