[ https://issues.apache.org/jira/browse/FLINK-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-14118:
-----------------------------------
    Labels: pull-request-available  (was: )

> Reduce the unnecessary flushing when there is no data available for flush
> -------------------------------------------------------------------------
>
>                 Key: FLINK-14118
>                 URL: https://issues.apache.org/jira/browse/FLINK-14118
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>            Reporter: Yingjie Cao
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.10.0
>
>
> The new flush implementation, which works by triggering a Netty user event,
> may cause a performance regression compared to the old synchronization-based
> one. More specifically, the problem appears when there is exactly one
> BufferConsumer in the buffer queue of a subpartition and no new data will be
> added for a while (either because there is simply no input, or because the
> operator collects data for processing and does not emit records immediately).
> In that case there is no data to send, yet the OutputFlusher continuously
> signals that data is available and wakes up the Netty thread, even though the
> pollBuffer method returns no data.
> For some of our production jobs, this incurs 20% to 40% CPU overhead compared
> to the old implementation. We tried to fix the problem by checking whether
> new data is available when flushing; if there is none, the Netty thread is
> not notified. This works for our jobs, and CPU usage falls back to the
> previous level.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
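
For illustration only, the check described in the issue can be sketched roughly as follows. This is not the actual Flink patch: `FlushSketch`, `SketchBuffer`, `SketchSubpartition` and the `readerNotifier` callback are hypothetical stand-ins for the BufferConsumer, the subpartition and the Netty wake-up, and the sketch is single-threaded, so it leaves out all synchronization.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

public class FlushSketch {

    /** Hypothetical stand-in for a buffer that a writer appends to and a reader drains. */
    static final class SketchBuffer {
        private int written;
        private int read;

        void write(int bytes) {
            written += bytes;
        }

        boolean hasUnreadData() {
            return written > read;
        }

        int drain() {
            int unread = written - read;
            read = written;
            return unread;
        }
    }

    /** Hypothetical subpartition; readerNotifier stands in for waking up the reader thread. */
    static final class SketchSubpartition {
        private final Deque<SketchBuffer> buffers = new ArrayDeque<>();
        private final Runnable readerNotifier;
        private boolean flushRequested;

        SketchSubpartition(Runnable readerNotifier) {
            this.readerNotifier = readerNotifier;
        }

        void add(SketchBuffer buffer) {
            buffers.add(buffer);
        }

        /** Called periodically by the flusher. Wakes the reader only if it would find data. */
        void flush() {
            if (buffers.isEmpty() || flushRequested) {
                return; // nothing queued, or a wake-up is already pending
            }
            boolean dataAvailable = buffers.size() > 1 || buffers.peek().hasUnreadData();
            if (!dataAvailable) {
                return; // the check from the issue: no new data, so no wake-up
            }
            flushRequested = true;
            readerNotifier.run();
        }

        /** Called by the reader after a wake-up; returns the number of bytes it picked up. */
        int poll() {
            flushRequested = false;
            SketchBuffer head = buffers.peek();
            return head == null ? 0 : head.drain();
        }
    }

    public static void main(String[] args) {
        AtomicInteger wakeUps = new AtomicInteger();
        SketchSubpartition subpartition = new SketchSubpartition(wakeUps::incrementAndGet);
        SketchBuffer buffer = new SketchBuffer();
        subpartition.add(buffer);

        buffer.write(10);
        subpartition.flush(); // new data: the reader is woken up
        subpartition.poll();  // the reader drains the data

        for (int i = 0; i < 5; i++) {
            subpartition.flush(); // idle period: the reader is not woken up
        }
        System.out.println("wake-ups: " + wakeUps.get());
    }
}
```

Without the `dataAvailable` check, each of the five idle `flush()` calls in `main` would wake the reader only for it to find nothing to send, which is the pattern the issue reports as CPU overhead.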