Re: Flink checkpoint timeout

2023-06-01 Thread Hangxiang Yu
Hi, Ivan. Could you provide more information about it: 1. Which operator subtask is stuck? Or is it random? 2. Could you share the stack or flame graph of the stuck subtask? On Wed, May 31, 2023 at 12:45 PM Ethan T Yang wrote: > Hello all, > > We recently start to experience Checkpoint

Flink checkpoint timeout

2023-05-30 Thread Ethan T Yang
Hello all, We recently started to experience checkpoint timeouts randomly. Here is some background information: 1. We are on Flink 1.13.1. 2. We have been running this type of streaming job for years. When a checkpoint succeeds, it takes only a few seconds. About a week ago, we started to see random

Re: Flink Checkpoint Timeout

2022-03-08 Thread Mahantesh Patil
I see that with every consecutive checkpoint timeout failure, the number of tasks that completed checkpointing keeps decreasing. Why would that happen? Does Flink try to process data beyond the old checkpoint barrier that failed to complete due to the timeout? On Tue, Mar 8, 2022 at 12:48 AM yidan zhao wrote:

Re: Flink Checkpoint Timeout

2022-03-08 Thread yidan zhao
If the checkpoint timeout causes the job to fail, then the job will be recovered and data will be reprocessed from the last completed checkpoint. If the job doesn't fail, it will not. On Tue, Mar 8, 2022 at 14:47, Mahantesh Patil wrote: > Hello Team, > > What happens after checkpoint timeout? > > Does Flink

Flink Checkpoint Timeout

2022-03-07 Thread Mahantesh Patil
Hello Team, What happens after a checkpoint timeout? Does Flink reprocess data from the previous checkpoint for all tasks? I have one compute-intensive operator with a parallelism of 20, and only one of the parallel tasks seems to get stuck because of data skew. On checkpoint timeout, will data be

Re: flink checkpoint timeout

2020-10-12 Thread Arvid Heise
>> checkpoints keep timing out since migrating to 1.10 from 1.9 >> -- >> *From:* Deshpande, Omkar >> *Sent:* Wednesday, September 16, 2020 5:27 PM >> *To:* Congxian Qiu >> *Cc:* user@flink.apache.org ; Yun Tang < >> myas...@live.com> >> *Subject:* R

Re: flink checkpoint timeout

2020-10-05 Thread Yu Li
om 1.9 > -- > *From:* Deshpande, Omkar > *Sent:* Wednesday, September 16, 2020 5:27 PM > *To:* Congxian Qiu > *Cc:* user@flink.apache.org ; Yun Tang < > myas...@live.com> > *Subject:* Re: flink checkpoint timeout > > This email is from an

Re: flink checkpoint timeout

2020-09-15 Thread Deshpande, Omkar
nstead of taskmanager.heap.size From: Deshpande, Omkar Sent: Monday, September 14, 2020 6:23 PM To: user@flink.apache.org Subject: flink checkpoint timeout This email is from an external sender. Hello, I recently upgraded from flink 1.9 to 1.10. The checkpointing succeeds

Re: flink checkpoint timeout

2020-09-14 Thread Congxian Qiu
ould I be looking for in the thread dump? > > -- > *From:* Yun Tang > *Sent:* Monday, September 14, 2020 8:52 PM > *To:* Deshpande, Omkar ; user@flink.apache.org > > *Subject:* Re: flink checkpoint timeout > > This email is from an external sender.

flink checkpoint timeout

2020-09-14 Thread Deshpande, Omkar
Hello, I recently upgraded from Flink 1.9 to 1.10. The checkpointing succeeds the first couple of times and then starts failing because of timeouts. The checkpoint time grows with every checkpoint and starts exceeding 10 minutes. I do not see any exceptions in the logs. I have enabled debug

Re: flink checkpoint timeout

2020-09-14 Thread Yun Tang
/browse/FLINK-14816 Best Yun Tang From: Deshpande, Omkar Sent: Tuesday, September 15, 2020 10:25 To: user@flink.apache.org Subject: Re: flink checkpoint timeout I have followed this https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory