[ 
https://issues.apache.org/jira/browse/FLINK-30792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686891#comment-17686891
 ] 

Feifan Wang commented on FLINK-30792:
-------------------------------------

Thanks for your reply, [~roman]. You are right: ref counting state changes per 
state handle can indeed solve the changelog "file not found" problem mentioned 
above. The changes in this PR are only intended to reduce useless data uploads.

As for the performance regression you mentioned, I had not thought about it 
carefully before, and I am still not sure whether this change would cause one. 
On the one hand, it will reduce the amount of data uploaded; on the other hand, 
more data will have to be uploaded when the checkpoint is triggered.
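
To make the cleanup idea concrete, here is a minimal sketch of dropping 
not-yet-uploaded changes once a materialization completes. It is not the code 
from the PR or from FsStateChangelogWriter: the class NotUploadedBuffer, its 
methods, and the use of plain long sequence numbers with byte[] payloads are 
all assumptions made for illustration.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/**
 * Hypothetical stand-in for a writer's buffer of not-yet-uploaded state
 * changes. Names and types are illustrative assumptions, not Flink's API.
 */
class NotUploadedBuffer {
    // Sequence number -> buffered state change (simplified to raw bytes).
    private final NavigableMap<Long, byte[]> notUploaded = new TreeMap<>();
    private long nextSqn = 0L;

    /** Buffers one state change and returns the sequence number assigned to it. */
    long append(byte[] change) {
        long sqn = nextSqn++;
        notUploaded.put(sqn, change);
        return sqn;
    }

    /**
     * Drops every buffered change with a sequence number strictly below
     * materializedUpTo, on the assumption that the materialized snapshot
     * already covers them. Returns how many entries were dropped.
     */
    int onMaterializationCompleted(long materializedUpTo) {
        // headMap is a live view, so clearing it removes entries from notUploaded.
        NavigableMap<Long, byte[]> covered = notUploaded.headMap(materializedUpTo, false);
        int dropped = covered.size();
        covered.clear();
        return dropped;
    }

    /** Number of changes still waiting to be uploaded. */
    int pendingCount() {
        return notUploaded.size();
    }
}
```

Whether dropping these entries early helps or hurts overall depends on the 
upload pattern discussed above, so the sketch only shows the bookkeeping.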

> clean up not uploaded state changes after materialization complete
> ------------------------------------------------------------------
>
>                 Key: FLINK-30792
>                 URL: https://issues.apache.org/jira/browse/FLINK-30792
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>    Affects Versions: 1.16.0
>            Reporter: Feifan Wang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2023-02-03-11-30-40-198.png
>
>
> We should clean up state changes that have not been uploaded once 
> materialization has completed; otherwise a FileNotFoundException may occur.
> Since state changes in FsStateChangelogWriter#notUploaded that precede a 
> completed materialization will not be used in any subsequent checkpoint, I 
> suggest cleaning them up while handling the materialization result. 
> What do you think about this? [~ym], [~roman]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
