Re: Flink 1.11 job stop with save point timeout error

2020-07-25 Thread Congxian Qiu
Hi Ivan, From the JM log, the savepoint completed within 1 second, and the timeout exception says that the stop-with-savepoint could not be completed within 60s (this is calculated as 20 (RestOptions#RETRY_MAX_ATTEMPTS) * 3s (RestOptions#RETRY_DELAY); you can check the logic here[1]). I'm not sure
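
The 60s figure in the message above can be reproduced with a small sketch. It assumes only the two values quoted in the message (20 attempts, 3s delay); the constant and function names below are illustrative and are not read from a real Flink configuration.

```python
from datetime import timedelta

# Values quoted in the message above; treated as assumptions here,
# not looked up from an actual Flink installation.
RETRY_MAX_ATTEMPTS = 20             # stands in for RestOptions#RETRY_MAX_ATTEMPTS
RETRY_DELAY = timedelta(seconds=3)  # stands in for RestOptions#RETRY_DELAY


def client_wait_budget(max_attempts: int, delay: timedelta) -> timedelta:
    """Total wait implied by the message: attempts multiplied by the delay."""
    return max_attempts * delay


budget = client_wait_budget(RETRY_MAX_ATTEMPTS, RETRY_DELAY)
print(f"stop-with-savepoint wait budget: {budget.total_seconds():.0f}s")
```

If either value is changed on the client side, the same multiplication gives the new budget.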

Re: Flink 1.11 job stop with save point timeout error

2020-07-24 Thread Ivan Yang
Hi Robert, Below is the job manager log after issuing the “flink stop” command:
2020-07-24 19:24:12,388 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Triggering checkpoint 1 (type=CHECKPOINT) @ 1595618652138 for job

Re: Flink 1.11 job stop with save point timeout error

2020-07-24 Thread Robert Metzger
Hi Ivan, thanks a lot for your message. Can you post the JobManager log here as well? It might contain additional information on the reason for the timeout.
On Fri, Jul 24, 2020 at 4:03 AM Ivan Yang wrote:
> Hello everyone,
>
> We recently upgraded Flink from 1.9.1 to 1.11.0. Found one strange

Flink 1.11 job stop with save point timeout error

2020-07-23 Thread Ivan Yang
Hello everyone, We recently upgraded Flink from 1.9.1 to 1.11.0. We found one strange behavior: when we stop a job with a savepoint, we get the following timeout error. I checked the Flink web console, and the savepoint is created in S3 within 1 second. The job is fairly simple, so 1 second for savepoint generation is