[ https://issues.apache.org/jira/browse/FLINK-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116496#comment-17116496 ]
Roman Khachatryan commented on FLINK-17820:
-------------------------------------------

I've created a PR: [https://github.com/apache/flink/pull/12332]

It flushes to disk only if the buffered data is above the threshold:
{code:java}
public void flush() {
    if (pos > localStateThreshold) {
        flushToFile();
    }
}
{code}
(I realized that a flush that always does nothing may be counter-intuitive.)

> Memory threshold is ignored for channel state
> ---------------------------------------------
>
>                 Key: FLINK-17820
>                 URL: https://issues.apache.org/jira/browse/FLINK-17820
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / Task
>    Affects Versions: 1.11.0
>            Reporter: Roman Khachatryan
>            Assignee: Roman Khachatryan
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.11.0
>
>
> The config parameter state.backend.fs.memory-threshold is ignored for channel
> state, so each subtask writes a file per checkpoint regardless of the size of
> that subtask's channel state.
> This also causes slow cleanup and delays the next checkpoint.
>
> The problem is that {{ChannelStateCheckpointWriter.finishWriteAndResult}}
> calls flush(), which actually flushes the data to disk.
>
> From the FSDataOutputStream.flush Javadoc:
> A completed flush does not mean that the data is necessarily persistent. Data
> persistence can is only assumed after calls to close() or sync().
>
> Possible solutions:
> 1. Do not flush in {{ChannelStateCheckpointWriter.finishWriteAndResult}}
> (which can lead to data loss in a wrapping stream).
> 2. Change the {{FsCheckpointStateOutputStream.flush}} behavior.
> 3. Wrap {{FsCheckpointStateOutputStream}} to prevent flush.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
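
To illustrate the threshold-aware flush described in the comment, here is a minimal, self-contained sketch. It is not the code from the PR and not Flink's FsCheckpointStateOutputStream: the class name ThresholdStreamSketch, the methods spilledToFile and closeAndGetInlineState, and the buffer sizing are assumptions made for illustration. Only the flush() condition mirrors the snippet in the comment.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical stand-in for a checkpoint stream that honors a memory threshold. */
public class ThresholdStreamSketch extends OutputStream {
    private final byte[] buffer;
    private final int localStateThreshold;
    private final Path file;
    private int pos;
    private OutputStream fileOut; // opened lazily, only when data must go to disk

    public ThresholdStreamSketch(int bufferSize, int localStateThreshold, Path file) {
        // Assumption: the buffer can hold at least `localStateThreshold` bytes.
        this.buffer = new byte[Math.max(1, Math.max(bufferSize, localStateThreshold))];
        this.localStateThreshold = localStateThreshold;
        this.file = file;
    }

    @Override
    public void write(int b) throws IOException {
        if (pos >= buffer.length) {
            flushToFile(); // buffer is full, so spilling is unavoidable
        }
        buffer[pos++] = (byte) b;
    }

    @Override
    public void flush() throws IOException {
        // Same condition as in the comment: small state stays in memory.
        if (pos > localStateThreshold) {
            flushToFile();
        }
    }

    private void flushToFile() throws IOException {
        if (fileOut == null) {
            fileOut = Files.newOutputStream(file);
        }
        fileOut.write(buffer, 0, pos);
        fileOut.flush();
        pos = 0;
    }

    public boolean spilledToFile() {
        return fileOut != null;
    }

    /** Returns the bytes if nothing was spilled; otherwise finishes the file and returns null. */
    public byte[] closeAndGetInlineState() throws IOException {
        if (fileOut == null && pos <= localStateThreshold) {
            return Arrays.copyOf(buffer, pos);
        }
        flushToFile();
        fileOut.close();
        return null;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("channel-state-sketch");

        ThresholdStreamSketch small = new ThresholdStreamSketch(64, 16, dir.resolve("small"));
        small.write(new byte[8]);
        small.flush(); // 8 bytes is below the 16-byte threshold
        System.out.println("small spilled: " + small.spilledToFile());

        ThresholdStreamSketch large = new ThresholdStreamSketch(64, 16, dir.resolve("large"));
        large.write(new byte[32]);
        large.flush(); // 32 bytes is above the 16-byte threshold
        System.out.println("large spilled: " + large.spilledToFile());
    }
}
```

In this sketch, a subtask with little channel state never creates a file even though flush() is called, which is the behavior the issue asks for. State above the threshold is still written out on flush().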