Hi Vino, Yes I am enabling checkpoint in the code as follows :
StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment("<job_manager_host>,<job_manager_port>,getJobConfiguration(),jarPath"); env.enableCheckpointing(1000); env.setSateBackend(new FsStateBackend("file:///<shared_mount_point_location>")); env.getCheckpointConfig().setMinPauseBetweenCheckpoints(1000); In getJobConfiguration method I have set HA related properties like HA_STORAGE_PATH,HA_ZOOKEEPER_QUORUM,HA_ZOOKEEPER_ROOT,HA_MODE,HA_JOB_MANAGER_PORT_RANGE,HA_CLUSTER_ID I can see the error in Job Manager logs where it says Collection Source is not being executed at the moment. Aborting checkpoint. In the pipeline I have a stream initialized using "fromCollection". I think I will have to get rid of this. What do you suggest Regards, Vinay Patil On Thu, Jul 26, 2018 at 12:04 PM vino yang <yanghua1...@gmail.com> wrote: > Hi Vinay: > > Did you call specific config API refer to this documentation[1]; > > Can you share your job program and JM Log? Or the JM log contains the log > message like this pattern "Triggering checkpoint {} @ {} for job {}."? > > [1]: > https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/state/checkpointing.html#enabling-and-configuring-checkpointing > > Thanks, vino. > > 2018-07-25 19:43 GMT+08:00 Chesnay Schepler <ches...@apache.org>: > >> Can you provide us with the job code? >> >> I assume that checkpointing runs properly if you submit the same job to a >> normal cluster? >> >> >> On 25.07.2018 13:15, Vinay Patil wrote: >> >> No error in the logs. That is why I am not able to understand why >> checkpoints are not getting triggered. >> >> Regards, >> Vinay Patil >> >> >> On Wed, Jul 25, 2018 at 4:44 PM Vinay Patil <vinay18.pa...@gmail.com> >> wrote: >> >>> Hi Chesnay, >>> >>> No error in the logs. That is why I am not able to understand why >>> checkpoints are getting triggered. >>> >>> Regards, >>> Vinay Patil >>> >>> >>> On Wed, Jul 25, 2018 at 4:36 PM Chesnay Schepler <ches...@apache.org> >>> wrote: >>> >>>> Please check the job- and taskmanager logs for anything suspicious. >>>> >>>> On 25.07.2018 12:33, Vinay Patil wrote: >>>> >>>> Hi, >>>> >>>> I am starting the cluster using bootstrap application where in I am >>>> calling Job Manager and Task Manager main class to form the cluster. The HA >>>> cluster is formed correctly and I am able to submit jobs to this cluster >>>> using RemoteExecutionEnvironment but when I enable checkpointing in code I >>>> do not see any checkpoints triggered on Flink UI. >>>> >>>> Am I missing any configurations to be set for the >>>> RemoteExecutionEnvironment for checkpointing to work. >>>> >>>> >>>> Regards, >>>> Vinay Patil >>>> >>>> >>>> >> >