Hi Roman,

Thanks for your proposal. Intuitively, I think this optimization would be very useful for reducing record amplification in TopN operators. After briefly looking at your Google doc, I have the following questions:

1. Could you describe in detail how the TopN record amplification is solved, similar to the explanation for Minibatch Join [1]? From the figure in the current Motivation section, I cannot understand how you did it.
2. TopN currently has multiple implementations, including AppendOnlyFirstNFunction, AppendOnlyTopNFunction, FastTop1Function, RetractableTopNFunction, and UpdatableTopNFunction. Could you elaborate on which of them the minibatch optimization applies to?
3. Could you provide the PoC code?
4. Finally, we need a formal FLIP document on the wiki [2].

[1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/tuning/#minibatch-regular-joins
[2] https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

Best,
Ron

On Sun, Mar 24, 2024 at 01:14, Roman Boyko <ro.v.bo...@gmail.com> wrote:

> Hi Flink Community,
>
> I tried to describe my idea about minibatch for TopNFunction in this doc -
> https://docs.google.com/document/d/1YPHwxKfiGSUOUOa6bc68fIJHO_UojTwZEC29VVEa-Uk/edit?usp=sharing
>
> Looking forward to your feedback, thank you
>
> On Tue, 19 Mar 2024 at 12:24, Roman Boyko <ro.v.bo...@gmail.com> wrote:
>
> > Hello Flink Community,
> >
> > The same problem with record amplification as described in FLIP-415:
> > Introduce a new join operator to support minibatch[1] exists for most of
> > implementations of AbstractTopNFunction. Especially when the rank is
> > provided to output. For example, when calculating Top100 with rank output,
> > every input record might produce 100 -U records and 100 +U records.
> >
> > According to my POC (which is similar to FLIP-415) the record
> > amplification could be significantly reduced by using input or output
> > buffer.
> >
> > What do you think if we implement such optimization for TopNFunctions?
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-415%3A+Introduce+a+new+join+operator+to+support+minibatch
> >
> > --
> > Best regards,
> > Roman Boyko
> > e.: ro.v.bo...@gmail.com
> > m.: +79059592443
> > telegram: @rboyko
>
> --
> Best regards,
> Roman Boyko
> e.: ro.v.bo...@gmail.com
> m.: +79059592443
> telegram: @rboyko
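
For readers of the thread, here is a rough sketch of the amplification described in the quoted message (Top-N with rank output, where one input record can shift every rank). It is a toy Python model and not Flink code or the PoC: the function names, the `batch_size` parameter, and the "diff the state once per batch" strategy are all assumptions made for illustration. It only covers an append-only input with distinct values.

```python
from typing import List, Tuple

# (kind, rank, value); kind is one of "+I", "-U", "+U"
Change = Tuple[str, int, int]


def diff_top_n(old: List[int], new: List[int]) -> List[Change]:
    """Emit changelog records for every rank whose value changed."""
    changes: List[Change] = []
    for rank in range(1, len(new) + 1):
        if rank > len(old):
            # A rank that did not exist before is a plain insert.
            changes.append(("+I", rank, new[rank - 1]))
        elif old[rank - 1] != new[rank - 1]:
            # An existing rank with a new value is retracted and re-emitted.
            changes.append(("-U", rank, old[rank - 1]))
            changes.append(("+U", rank, new[rank - 1]))
    return changes


def top_n_changelog(records: List[int], n: int, batch_size: int) -> List[Change]:
    """Toy append-only Top-N (descending) with rank output.

    batch_size=1 models per-record emission. A larger batch_size models a
    hypothetical minibatch buffer that diffs the state once per batch.
    """
    state: List[int] = []  # current top-n values, largest first
    out: List[Change] = []
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        new_state = sorted(state + batch, reverse=True)[:n]
        out.extend(diff_top_n(state, new_state))
        state = new_state
    return out


# Ascending input: every record becomes the new rank 1 and shifts all ranks.
records = list(range(1, 21))
per_record = top_n_changelog(records, n=5, batch_size=1)
minibatch = top_n_changelog(records, n=5, batch_size=10)
print(len(per_record), len(minibatch))
```

With this worst-case input, the per-record variant emits 175 changelog records for 20 inputs, while the batched variant emits 15 and ends in the same final Top-5. How a real operator would buffer (input side or output side) and handle retractions is exactly what questions 1 and 2 above ask about.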