[jira] [Commented] (FLINK-31588) The unaligned checkpoint type is wrong at subtask level

2023-04-22 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17715340#comment-17715340
 ] 

Rui Fan commented on FLINK-31588:
-

Thanks [~pnowojski] for the review and discussion.

Merged master commit: d46d8d0f6b590f185608b23fbe8b2fcbded111de

> The unaligned checkpoint type is wrong at subtask level
> ---
>
> Key: FLINK-31588
> URL: https://issues.apache.org/jira/browse/FLINK-31588
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.16.0, 1.17.0
> Reporter: Rui Fan
> Assignee: Rui Fan
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2023-03-23-18-45-01-535.png
>
>
> FLINK-20488 added support for showing the checkpoint type of each subtask. It
> is based on the received `CheckpointOptions`, which was correct at the time.
> However, FLINK-27251 added support for converting a timed-out aligned
> checkpoint barrier into an unaligned one in the output buffers. This means the
> received `CheckpointOptions` can later be converted from aligned checkpoint to
> unaligned checkpoint.
> So the unaligned checkpoint type reported at subtask level may be wrong. For
> example, in the figure below the "Unaligned checkpoint" flag is false, but the
> checkpoint is actually unaligned (persisted data > 0).
>  
> !image-2023-03-23-18-45-01-535.png|width=1879,height=797!
>  
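
To make the mismatch described above concrete, here is a minimal sketch. It is a simplified model under stated assumptions, not Flink's actual code: `Options`, `afterTimeout` and `reportFromReceived` are hypothetical names invented for illustration, and the real `CheckpointOptions` class carries more state than a single boolean.

```java
// Hypothetical model for illustration only; these are not Flink's real classes.
public class ReceivedOptionsSketch {

    /** Simplified stand-in for the options carried by a checkpoint barrier. */
    static final class Options {
        final boolean unaligned;

        Options(boolean unaligned) {
            this.unaligned = unaligned;
        }
    }

    /** Converts aligned options to unaligned ones when the alignment timeout fires. */
    static Options afterTimeout(Options received, boolean timedOut) {
        return (timedOut && !received.unaligned) ? new Options(true) : received;
    }

    /** Reporting from the received options ignores any later conversion. */
    static boolean reportFromReceived(Options received) {
        return received.unaligned;
    }

    public static void main(String[] args) {
        Options received = new Options(false);            // barrier arrived aligned
        Options effective = afterTimeout(received, true); // timeout switched it
        System.out.println("reported unaligned:  " + reportFromReceived(received));
        System.out.println("effective unaligned: " + effective.unaligned);
    }
}
```

In this model the reported flag stays false while the effective options are unaligned, which mirrors the screenshot: the flag says aligned although data was persisted.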



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31588) The unaligned checkpoint type is wrong at subtask level

2023-04-07 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17709823#comment-17709823
 ] 

Rui Fan commented on FLINK-31588:
-

Thanks for your feedback.
{quote} If the checkpoint was unaligned, as it arrived unaligned, it should be 
reported as such, even if that particular subtask didn't persist any data.
{quote}
That makes sense. I will prepare the PR next week.






[jira] [Commented] (FLINK-31588) The unaligned checkpoint type is wrong at subtask level

2023-04-07 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17709783#comment-17709783
 ] 

Piotr Nowojski commented on FLINK-31588:


Sorry for the late response. I've just found an old tab with a WIP comment that 
I wanted to write but never sent; something must have interrupted me :(

Thanks for reporting the issue. I see the problem. I think ideally we should 
try to keep the semantics of that flag in sync with what {{StreamTask}} was 
actually doing. If the checkpoint was unaligned, as it arrived unaligned, it 
should be reported as such, even if that particular subtask didn't persist any 
data. Can we still achieve that? 
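
The difference between the two reporting rules discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: `byPersistedData` and `byEffectiveOptions` are hypothetical helpers, not Flink APIs, and the real decision inside {{StreamTask}} involves more than two booleans.

```java
// Hypothetical comparison for illustration only; not Flink's real code.
public class ReportingPolicySketch {

    /** Rule proposed in the ticket: unaligned only if some data was persisted. */
    static boolean byPersistedData(long persistedBytes) {
        return persistedBytes > 0;
    }

    /** Rule suggested in the review: report what the task actually did. */
    static boolean byEffectiveOptions(boolean receivedUnaligned, boolean switchedToUnaligned) {
        return receivedUnaligned || switchedToUnaligned;
    }

    public static void main(String[] args) {
        // The checkpoint arrived unaligned, but this subtask had nothing to persist.
        boolean receivedUnaligned = true;
        boolean switchedToUnaligned = false;
        long persistedBytes = 0L;

        System.out.println("by persisted data:    " + byPersistedData(persistedBytes));
        System.out.println("by effective options: "
                + byEffectiveOptions(receivedUnaligned, switchedToUnaligned));
    }
}
```

For this input the persisted-data rule reports the checkpoint as aligned, while the effective-options rule reports it as unaligned, which is the case the comment above asks to preserve.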






[jira] [Commented] (FLINK-31588) The unaligned checkpoint type is wrong at subtask level

2023-03-23 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17704072#comment-17704072
 ] 

Rui Fan commented on FLINK-31588:
-

Hi [~pnowojski] [~yuanmei], please take a look at this ticket when you have 
time, thanks!

I would prefer to derive the unaligned checkpoint type from the persisted data. 
If a checkpoint is switched to unaligned but no data is persisted, Flink users 
would still consider it an aligned checkpoint. WDYT?



