[ 
https://issues.apache.org/jira/browse/FLINK-22132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17316161#comment-17316161
 ] 

Arvid Heise commented on FLINK-22132:
-------------------------------------

Since the task is pretty huge, I'm splitting it into subtasks for defining the 
application and executing the application.

> Test unaligned checkpoints rescaling manually on a real cluster
> ---------------------------------------------------------------
>
>                 Key: FLINK-22132
>                 URL: https://issues.apache.org/jira/browse/FLINK-22132
>             Project: Flink
>          Issue Type: Test
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.13.0
>            Reporter: Piotr Nowojski
>            Priority: Blocker
>             Fix For: 1.13.0
>
>
> To test unaligned checkpoints, we should use a few different applications 
> that use different features:
> - Mixing forward/rescale channels with keyby or other shuffle operations
> - Unions
> - 2 or n-ary operators
> - Associated state ((keyed) process function)
> - Correctness verifications
> The sinks should not be mocked but rather should be able to induce a fair 
> amount of backpressure into the system. Then, after induced failure, the user 
> needs to restart from a retained checkpoint with
> - lower
> - same
> - higher degree of parallelism.
> To enable unaligned checkpoints, set 
> - execution.checkpointing.unaligned: true
> - execution.checkpointing.alignment-timeout to 0s, 10s, 1min (for high 
> backpressure)
> The primary objective is to check whether all data is recovered properly and 
> whether the semantics are correct (does state match input?). 
> The secondary objective is to check if Flink UI shows the information 
> correctly:
> - unaligned checkpoint enabled on job level
> - timeout on job level
> - for each checkpoint, if it's unaligned or not; how much data was written
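
The combinations described above (three restore parallelisms times three alignment timeouts) can be enumerated as a checklist for the manual runs. The following is only a minimal sketch in plain Java with no Flink dependency. The class name RescaleTestMatrix, the helper buildMatrix, and the choice of halving and doubling the original parallelism are assumptions made for illustration; the issue only asks for lower, same, and higher parallelism. The two configuration keys and the timeout values are taken verbatim from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the manual test runs implied by the issue text.
class RescaleTestMatrix {
    // Configuration keys quoted from the issue description.
    static final String UNALIGNED_KEY = "execution.checkpointing.unaligned";
    static final String TIMEOUT_KEY = "execution.checkpointing.alignment-timeout";
    // Timeout values quoted from the issue description.
    static final String[] ALIGNMENT_TIMEOUTS = {"0s", "10s", "1min"};

    // One manual run: the parallelism to restore with and the timeout to set.
    static final class Run {
        final String rescale;
        final int parallelism;
        final String alignmentTimeout;

        Run(String rescale, int parallelism, String alignmentTimeout) {
            this.rescale = rescale;
            this.parallelism = parallelism;
            this.alignmentTimeout = alignmentTimeout;
        }

        String describe() {
            return String.format("restore with %s parallelism (%d): %s=true, %s=%s",
                    rescale, parallelism, UNALIGNED_KEY, TIMEOUT_KEY, alignmentTimeout);
        }
    }

    // Assumption: "lower" is half and "higher" is double the original
    // parallelism; any other lower/higher values would serve equally well.
    static List<Run> buildMatrix(int originalParallelism) {
        if (originalParallelism < 2) {
            throw new IllegalArgumentException("need parallelism >= 2 to scale down");
        }
        String[] labels = {"lower", "same", "higher"};
        int[] parallelisms = {
            originalParallelism / 2, originalParallelism, originalParallelism * 2
        };
        List<Run> runs = new ArrayList<>();
        for (int i = 0; i < labels.length; i++) {
            for (String timeout : ALIGNMENT_TIMEOUTS) {
                runs.add(new Run(labels[i], parallelisms[i], timeout));
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // Print the checklist for a job that originally ran with parallelism 4.
        for (Run run : buildMatrix(4)) {
            System.out.println(run.describe());
        }
    }
}
```

Each printed line corresponds to one restart from a retained checkpoint after the induced failure. The correctness and UI checks from the description still have to be done by hand for every run.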



--
This message was sent by Atlassian Jira
(v8.3.4#803005)