I have followed the memory migration guide at
https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_migration.html#container-cut-off-memory
and I am now using taskmanager.memory.flink.size instead of
taskmanager.heap.size.
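
As a rough sketch, the corresponding change in flink-conf.yaml would look
something like the fragment below. The sizes are placeholders chosen for
illustration, not the values from this job.

```yaml
# Flink 1.9 setting, deprecated in 1.10 (placeholder value)
# taskmanager.heap.size: 4096m

# Flink 1.10 replacement: total Flink memory, which excludes
# JVM metaspace and JVM overhead (placeholder value)
taskmanager.memory.flink.size: 4096m
```

Note that the migration guide describes taskmanager.memory.process.size as the
equivalent option for containerized deployments, so the right option depends
on how the cluster is deployed.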
________________________________
From: Deshpande, Omkar <omkar_deshpa...@intuit.com>
Sent: Monday, September 14, 2020 6:23 PM
To: user@flink.apache.org <user@flink.apache.org>
Subject: flink checkpoint timeout

Hello,

I recently upgraded from Flink 1.9 to 1.10. Checkpointing succeeds the first
couple of times and then starts failing because of timeouts. The checkpoint
duration grows with every checkpoint and eventually exceeds 10 minutes. I do
not see any exceptions in the logs, and I have enabled debug logging for the
"org.apache.flink" logger. How do I investigate this? Garbage collection seems
fine, and there is no backpressure. The same setup worked as is on Flink 1.9
without any issue.

Any pointers on how to investigate why checkpoints take so long to complete?

Omkar
