[jira] [Commented] (FLINK-28386) Trigger an immediate checkpoint after all sources finished

2023-07-04 Thread Dong Lin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17739931#comment-17739931
 ] 

Dong Lin commented on FLINK-28386:
--

Merged to the apache/flink master branch as commit 
c6f443c92880f9c4afc8b30c886e2fba523c564a

> Trigger an immediate checkpoint after all sources finished
> --
>
> Key: FLINK-28386
> URL: https://issues.apache.org/jira/browse/FLINK-28386
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing
>Reporter: Yun Gao
>Assignee: Jiang Xin
>Priority: Major
>  Labels: pull-request-available
>
> Currently, a bounded job in streaming mode by default waits for one more 
> checkpoint to commit the last piece of data. If the checkpoint period is 
> long, the waiting time might also be long. To optimize this situation, we 
> could eagerly trigger a checkpoint after all sources are finished. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-28386) Trigger an immediate checkpoint after all sources finished

2023-05-22 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724809#comment-17724809
 ] 

Piotr Nowojski commented on FLINK-28386:


Hi [~Jiang Xin], thanks! I guess it depends on what you mean by "sink 
operators". If you mean just checking the tail (most downstream) operators, 
then yes. If you mean literally checking for {{SinkOperator}}, then probably 
not, as that could be a fragile check.
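
To illustrate the distinction between "tail operators" and a literal {{SinkOperator}} check, here is a minimal sketch in plain Java. It is not Flink code: the class {{TailVertexCheck}}, its methods, and the map-based job graph are hypothetical names introduced only for this example. It treats any vertex without downstream edges as a tail, regardless of its operator type, and assumes every vertex appears as a key in the map.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper, not part of Flink: finds the most downstream vertices of a job graph. */
final class TailVertexCheck {

    /**
     * Returns the vertices that have no downstream edges.
     * Assumes every vertex appears as a key, mapped to its (possibly empty) list of successors.
     */
    static Set<String> tailVertices(Map<String, List<String>> downstream) {
        Set<String> tails = new TreeSet<>();
        for (Map.Entry<String, List<String>> entry : downstream.entrySet()) {
            if (entry.getValue().isEmpty()) {
                tails.add(entry.getKey());
            }
        }
        return tails;
    }

    /** True once every tail vertex is contained in the set of finished vertices. */
    static boolean allTailsFinished(Map<String, List<String>> downstream, Set<String> finished) {
        return finished.containsAll(tailVertices(downstream));
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
                "source", List.of("map"),
                "map", List.of("sinkA", "sinkB"),
                "sinkA", List.of(),
                "sinkB", List.of());
        System.out.println(tailVertices(graph));
        System.out.println(allTailsFinished(graph, Set.of("source", "sinkA")));
    }
}
```

The check depends only on graph topology, so it would keep working for a job whose last operator is not a sink at all.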






[jira] [Commented] (FLINK-28386) Trigger an immediate checkpoint after all sources finished

2023-05-18 Thread Jiang Xin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724093#comment-17724093
 ] 

Jiang Xin commented on FLINK-28386:
---

Hi [~pnowojski], I'm working on this. I agree with you that checking only 
that the sources have finished is not enough, but would it be simpler to just 
check that all sink operators have finished?






[jira] [Commented] (FLINK-28386) Trigger an immediate checkpoint after all sources finished

2023-04-21 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17714956#comment-17714956
 ] 

Piotr Nowojski commented on FLINK-28386:


[~zlzhang0122] A checkpoint could be used just to make the side effects 
visible (committing results in two-phase-commit operators/sinks). On the other 
hand, why would a savepoint make any sense? There is no point in recovering 
from such a snapshot anyway.

About the ticket: taking unaligned checkpoints into account, I think a better 
condition would be to trigger a checkpoint once all tasks are finished. With 
unaligned checkpoints, downstream tasks can still be processing in-flight data 
while upstream sources are finished, so triggering a checkpoint on finished 
sources wouldn't achieve the desired goal of stopping the job faster.
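
The "once all tasks are finished" condition can be sketched as follows. This is a simplified illustration in plain Java, not the code that was merged for this ticket: {{FinishedTaskTracker}}, {{onTaskFinished}}, and the {{Runnable}} standing in for the checkpoint trigger are all hypothetical names. The tracker fires the trigger exactly once, and only after the last task (not merely the last source) reports that it has finished.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical tracker, not part of Flink: fires a callback once every task has finished. */
final class FinishedTaskTracker {
    private final Set<String> pendingTasks;
    private final Runnable triggerCheckpoint; // stand-in for the real checkpoint trigger
    private boolean triggered;

    FinishedTaskTracker(Set<String> allTasks, Runnable triggerCheckpoint) {
        this.pendingTasks = new HashSet<>(allTasks);
        this.triggerCheckpoint = triggerCheckpoint;
    }

    /**
     * Records that a task finished. Runs the trigger the first time no tasks remain pending.
     * Returns true only for the call that actually ran the trigger.
     */
    synchronized boolean onTaskFinished(String taskId) {
        pendingTasks.remove(taskId);
        if (pendingTasks.isEmpty() && !triggered) {
            triggered = true;
            triggerCheckpoint.run();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        FinishedTaskTracker tracker = new FinishedTaskTracker(
                Set.of("source", "map", "sink"),
                () -> System.out.println("triggering final checkpoint"));
        tracker.onTaskFinished("source"); // sources done, downstream still busy: no trigger
        tracker.onTaskFinished("map");
        tracker.onTaskFinished("sink");   // last task done: trigger runs here
    }
}
```

A source-only condition would have fired after the first call in {{main}}, while the downstream tasks might still be draining in-flight data.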






[jira] [Commented] (FLINK-28386) Trigger an immediate checkpoint after all sources finished

2022-07-05 Thread zlzhang0122 (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17562941#comment-17562941
 ] 

zlzhang0122 commented on FLINK-28386:
-

For a bounded job in this scenario, IMO, maybe a savepoint would be better? 
Why should we use a checkpoint instead of a savepoint?



