Re: checkpoint interval and hdfs file capacity

Congxian Qiu Wed, 11 Nov 2020 03:11:23 -0800

Hi,
    Currently, the checkpoint discard logic is executed in an Executor [1], so
old checkpoint files may not be deleted immediately.


[1]
https://github.com/apache/flink/blob/91404f435f20c5cd6714ee18bf4ccf95c81fb73e/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointsCleaner.java#L45
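
To make the effect concrete, here is a minimal sketch of the idea in plain
Java. It is not Flink code: the class AsyncDiscardSketch, its methods, and the
latch that stands in for a slow storage system are all hypothetical, and it
assumes only the latest checkpoint is retained. New checkpoints are written on
the calling thread, while the discard of the previous one is handed to a
background executor, so files pile up whenever cleanup lags behind.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of asynchronous checkpoint cleanup; not Flink code. */
public class AsyncDiscardSketch {

    /** Stand-in for the checkpoint directory: ids of files that still exist. */
    private final Set<Long> filesOnStorage = ConcurrentHashMap.newKeySet();

    /** Single background thread, so discards run one at a time. */
    private final ExecutorService cleanupExecutor = Executors.newSingleThreadExecutor();

    /** Discards wait on this latch, which models a slow storage system. */
    private final CountDownLatch storageReady;

    public AsyncDiscardSketch(CountDownLatch storageReady) {
        this.storageReady = storageReady;
    }

    /** Writes the new checkpoint now; deletes the previous one later. */
    public void completeCheckpoint(long id) {
        filesOnStorage.add(id);
        if (id > 1) {
            long subsumed = id - 1; // assumption: only the latest is retained
            cleanupExecutor.execute(() -> {
                try {
                    storageReady.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                filesOnStorage.remove(subsumed);
            });
        }
    }

    public int fileCount() {
        return filesOnStorage.size();
    }

    /** Waits until every queued discard has run. */
    public void drain() {
        cleanupExecutor.shutdown();
        try {
            if (!cleanupExecutor.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("cleanup did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Returns {file count while cleanup is stalled, file count after it catches up}. */
    public static int[] demo() {
        CountDownLatch storageReady = new CountDownLatch(1);
        AsyncDiscardSketch sketch = new AsyncDiscardSketch(storageReady);
        for (long id = 1; id <= 5; id++) {
            sketch.completeCheckpoint(id);
        }
        int whileStalled = sketch.fileCount();
        storageReady.countDown(); // storage becomes responsive again
        sketch.drain();
        return new int[] {whileStalled, sketch.fileCount()};
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("files while cleanup lags: " + counts[0]);
        System.out.println("files after cleanup catches up: " + counts[1]);
    }
}
```

In this model all five checkpoints stay on storage while the cleanup thread is
stalled, and only the newest one is left once it catches up. With a very short
checkpoint interval the real system can spend more time in the stalled state.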

Best,
Congxian


lec ssmi <shicheng31...@gmail.com> wrote on Tue, Nov 10, 2020 at 2:25 PM:

> Thanks.
>    I have some jobs with a checkpoint interval of 1000 ms, and the HDFS
> files grow so large that things no longer work normally.
> What I am curious about is: are writing and deleting performed
> synchronously? Is it possible that new files are added faster than old
> ones can be deleted?
>
> Congxian Qiu <qcx978132...@gmail.com> wrote on Tue, Nov 10, 2020 at 2:16 PM:
>
>> Hi,
>>     No matter what interval you set, Flink will take care of the
>> checkpoints (removing obsolete checkpoints when it can). However, if you
>> set a very small checkpoint interval, it may put high pressure on the
>> storage system (here, RPC pressure on the HDFS NameNode).
>>
>> Best,
>> Congxian
>>
>>
>> lec ssmi <shicheng31...@gmail.com> wrote on Tue, Nov 10, 2020 at 1:19 PM:
>>
>>> Hi, if I set the checkpoint interval to be very small, such as 5
>>> seconds, will there be a lot of state files on HDFS? In theory, no matter
>>> what the interval is set to, every time a checkpoint is taken, the old
>>> files will be deleted and new files will be written, right?
>>>
>>
