GitHub user geserdugarov added a comment to the discussion: RLI support for Flink streaming
As far as I know, Flink write operators currently flush their buffers during a checkpoint. As a result, stream processing cannot continue until the slowest task manager has finished its flush. This can degrade performance under data skew, when one task manager receives significantly more data than the others, or when some task manager has lower I/O bandwidth.

I suppose it might be possible to use Flink's local state on the task managers and flush the buffers there, instead of writing to remote storage during the checkpoint. This should be faster, and it would decouple Flink checkpoints from Hudi commits.

@danny0405, what do you think?

GitHub link: https://github.com/apache/hudi/discussions/17452#discussioncomment-15204698
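
To make the skew argument in the comment concrete, here is a rough toy model, not Hudi or Flink code. It assumes the checkpoint takes as long as the slowest task's flush, and every number in it (buffer sizes, bandwidths) as well as the function names are hypothetical, chosen only for illustration.

```python
# Toy model with hypothetical numbers; this is not Hudi or Flink code.

def flush_seconds(buffered_mb, bandwidth_mb_s):
    """Time for one task to flush its buffer at the given write bandwidth."""
    return buffered_mb / bandwidth_mb_s


def checkpoint_seconds(buffers_mb, bandwidth_mb_s):
    """Assumption: the checkpoint completes only when the slowest task has flushed."""
    return max(flush_seconds(mb, bandwidth_mb_s) for mb in buffers_mb)


# Hypothetical buffered data per task, in MB. Both cases hold 400 MB in total,
# but in the skewed case the last task holds most of it.
balanced = [100, 100, 100, 100]
skewed = [50, 50, 50, 250]

REMOTE_MB_S = 50   # assumed write bandwidth to remote storage
LOCAL_MB_S = 500   # assumed write bandwidth to task-local state

remote_balanced = checkpoint_seconds(balanced, REMOTE_MB_S)
remote_skewed = checkpoint_seconds(skewed, REMOTE_MB_S)
local_skewed = checkpoint_seconds(skewed, LOCAL_MB_S)

print(f"remote, balanced: {remote_balanced:.1f} s")
print(f"remote, skewed:   {remote_skewed:.1f} s")
print(f"local, skewed:    {local_skewed:.1f} s")
```

With the same total volume, the skewed case takes longer because the maximum, not the average, sets the checkpoint duration. A faster local flush shortens that maximum but does not remove the skew itself. The model ignores the later upload from local state to remote storage, which would still have to happen outside the checkpoint.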
