Hi Dong and Yunfeng,

Thanks for the FLIP. What's not clear to me is the expected behaviour when the allowed latency can't be met, for whatever reason. Since we're calling it an "allowed latency", doesn't that imply that something has gone wrong when it is exceeded, and that the job should fail? Isn't what you're proposing more of a minimum latency?

There's also the part about the Hudi Sink processing records immediately upon arrival. Given that the SinkV2 API provides the ability for custom post and pre-commit topologies [1], specifically targeted to avoid generating multiple small files, why isn't that sufficient for the Hudi Sink? It would be great to see that added under Rejected Alternatives if this is indeed not sufficient.

Best regards,

Martijn

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction

On Sun, Jun 25, 2023 at 4:25 AM Yunfeng Zhou <flink.zhouyunf...@gmail.com> wrote:
>
> Hi all,
>
> Dong(cc'ed) and I are opening this thread to discuss our proposal to
> support configuring end-to-end allowed latency for Flink jobs, which
> has been documented in FLIP-325
> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-325%3A+Support+configuring+end-to-end+allowed+latency>.
>
> By configuring the latency requirement for a Flink job, users would be
> able to optimize the throughput and overhead of the job while still
> acceptably increasing latency. This approach is particularly useful
> when dealing with records that do not require immediate processing and
> emission upon arrival.
>
> Please refer to the FLIP document for more details about the proposed
> design and implementation. We welcome any feedback and opinions on
> this proposal.
>
> Best regards.
>
> Dong and Yunfeng