Hi,

> If I provide the same jobId, will it restore from the HA first, or just
> start anew?

It will restore from HA first, and will not start a new one.

I tried with Flink v1.15.2 with the following example, but can't reproduce.
Could you provide the full start command?

1. ./bin/standalone-job.sh start --job-classname org.apache.flink.streaming.examples.windowing.TopSpeedWindowing
2. ./bin/taskmanager.sh start
3. kill -9 ${JobManager PID}
4. ./bin/standalone-job.sh start --job-classname org.apache.flink.streaming.examples.windowing.TopSpeedWindowing

There is only one job running in my cluster after the new jobmanager started.

Best,
Weihua

On Sat, May 6, 2023 at 4:00 PM Shubham Bansal <s.bansa...@gmail.com> wrote:

> Thanks for the response Weihua.
>
> I am using flink v.15.2. I am unable to understand how I should be
> starting the jobmanager without the job so that it can poll HA store and
> start the job.
> Since the current method of running the script standalone-job.sh requires
> a job at start. If I provide the same jobId, will it restore from the HA
> first, or just start anew?
>
> Best,
> Shubham Bansal
>
>
> On Sat, May 6, 2023 at 9:34 AM Weihua Hu <huweihua....@gmail.com> wrote:
>
>> Hi, Shubham
>>
>> Which Flink version are you using?
>> AFAIK, the JobManager will recover the job from the HA store first, and
>> it won't submit the same job (identified by jobID) again if it has already
>> been recovered.
>>
>> Best,
>> Weihua
>>
>>
>> On Tue, May 2, 2023 at 8:02 PM Shubham Bansal <s.bansa...@gmail.com>
>> wrote:
>>
>>> Hello everyone,
>>>
>>> Elastic scaling, as of now, has to be started in application mode. I am
>>> having some trouble in making it work with HA.
>>> I have a jobmanager pod and a single taskmanager pod, I use a post
>>> install helm hook to start the jobmanager along with the single job (As
>>> it's required in application mode).
>>>
>>>> ./bin/standalone-job.sh start (..removed for brevity)
>>>>
>>>> If due to some reason my job manager goes down, I have two options
>>>
>>>    1. I start the same job from the required checkpoint, my HA
>>>    (zookeeper) also kicks in and submits another job to the jobmanager. So
>>>    there are 2 jobs running where it should have been 1.
>>>    2. If I don't submit the job, I don't have any other way to start
>>>    the jobmanager without the job with elastic scaling enabled.
>>>
>>> Is there any way to start the jobmanager without the job with elastic
>>> scaling?
>>>
>>> Best,
>>> Shubham
>>>
>>
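
For readers of this thread: the recovery behavior Weihua describes (recover from the HA store first, then skip a submission whose jobID was already recovered) can be pictured with a small toy model. This is only a sketch of the described behavior, not Flink code; ToyJobManager, its start method, and the dictionary standing in for the HA store are all hypothetical names made up for illustration.

```python
# Toy model only: these names are hypothetical and are not Flink APIs.
class ToyJobManager:
    def __init__(self, ha_store):
        # ha_store maps jobID -> job description and outlives any one JobManager.
        self.ha_store = ha_store
        self.running = {}

    def start(self, job_id, job):
        # Step 1: recover every job found in the HA store first.
        for recovered_id, recovered_job in self.ha_store.items():
            self.running[recovered_id] = recovered_job
        # Step 2: submit the job given at startup only if its jobID
        # was not already recovered in step 1.
        if job_id not in self.running:
            self.running[job_id] = job
            self.ha_store[job_id] = job
        return sorted(self.running)


ha_store = {}
# First JobManager starts the job and records it in the HA store.
first = ToyJobManager(ha_store).start("job-a", "TopSpeedWindowing")
# Simulated "kill -9": the process is gone, but the HA store survives.
# A new JobManager is started with the same job and the same jobID.
second = ToyJobManager(ha_store).start("job-a", "TopSpeedWindowing")
print(first, second)
```

In this model the second start ends with a single running job, which matches what Weihua reports seeing after the four reproduction steps.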