Hi Roman,

Thanks for your proposal. Intuitively, I think this optimization would be
very useful for reducing message amplification in TopN operators. After
briefly looking at your Google doc, I have the following questions:

1. Could you describe in detail how the approach solves record
amplification for the TopN operator, similar to the explanation for
Minibatch Join[1]? From the figure in the current Motivation section, I
cannot understand how it works.
2. TopN currently has multiple implementation functions, including
AppendOnlyFirstNFunction, AppendOnlyTopNFunction, FastTop1Function,
RetractableTopNFunction, and UpdatableTopNFunction. Could you elaborate
on which of them the minibatch optimization applies to?
3. Could you provide the PoC code?
4. Finally, we need a formal FLIP document on the wiki[2].

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/tuning/#minibatch-regular-joins
[2]
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

Best,
Ron

On Sun, 24 Mar 2024 at 01:14, Roman Boyko <ro.v.bo...@gmail.com> wrote:

> Hi Flink Community,
>
> I tried to describe my idea about minibatch for TopNFunction in this doc -
>
> https://docs.google.com/document/d/1YPHwxKfiGSUOUOa6bc68fIJHO_UojTwZEC29VVEa-Uk/edit?usp=sharing
>
> Looking forward to your feedback, thank you
>
> On Tue, 19 Mar 2024 at 12:24, Roman Boyko <ro.v.bo...@gmail.com> wrote:
>
> > Hello Flink Community,
> >
> > The same record amplification problem described in FLIP-415: Introduce
> > a new join operator to support minibatch[1] exists for most
> > implementations of AbstractTopNFunction, especially when the rank is
> > included in the output. For example, when calculating Top100 with rank
> > output, every input record might produce 100 -U records and 100 +U
> > records.
> >
> > According to my POC (which is similar to FLIP-415), the record
> > amplification could be significantly reduced by using an input or
> > output buffer.
> >
> > What do you think about implementing such an optimization for
> > TopNFunctions?
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-415%3A+Introduce+a+new+join+operator+to+support+minibatch
> >
> > --
> > Best regards,
> > Roman Boyko
> > e.: ro.v.bo...@gmail.com
> > m.: +79059592443
> > telegram: @rboyko
> >
>
>
> --
> Best regards,
> Roman Boyko
> e.: ro.v.bo...@gmail.com
> m.: +79059592443
> telegram: @rboyko
>
