Hi Yunhong,

Thanks for the updating. And introducing the new bushy join reorder
algorithm would be great. And I also agree with the newly added config
option "table.optimizer.bushy-join-reorder-threshold" and 12 as the default
value.


> As for optimization
> latency, this is the problem to be solved by the parameters to be
> introduced in this discussion. When there are many tables need to be
> reordered, the optimization latency will increase greatly. But when the
> table numbers less than the threshold, the latency is the same as the
> LoptOptimizeJoinRule.


This sounds great. If possible, could you share more numbers to us? E.g.,
what's the latency of optimization when there are 11/12 tables for both
approach?

 For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
> is very simple, and it will not store the intermediate results at all. So,
> the implementation of Calcite cannot get all possible join reorder results
> and it cannot combine with the current cost model to get more reasonable
> join reorder results.


It's ok to do it in Flink as the first step. It would be great to also
contribute it to Calcite later if possible, it depends on you.

yh z <zhengyunhon...@gmail.com> 于2023年1月3日周二 15:27写道:

> Hi Benchao,
>
> Thanks for your reply.
>
> Actually,  I mistakenly wrote the name "bushy join reorder" to "busy join
> reorder". I'm sorry for the trouble bring to you. "Bushy join reorder"
> means we can build a bushy join tree based on cost model, but now Flink can
> only build a left-deep tree using Calcite LoptOptimizeJoinRule. I hope my
> answers can help you solve the following questions:
>
> For question #1: The biggest advantage of this "bushy join reorder"
> strategy over the default Flink left-deep tree strategy is that it can
> retail all possible join reorder plans, and then select the optimal plan
> according to the cost model. This means that the busy join reorder strategy
> can be better combined with the current cost model to get more reasonable
> join reorder results. We verified it on the TPC-DS benchmark, with the
> spark plan as a reference, the new busy join reorder strategy can make more
> TPC-DS query plans be adjusted to be consistent with the Spark plan, and
> the execution time is signifcantly reduced.  As for optimization
> latency, this is the problem to be solved by the parameters to be
> introduced in this discussion. When there are many tables need to be
> reordered, the optimization latency will increase greatly. But when the
> table numbers less than the threshold, the latency is the same as the
> LoptOptimizeJoinRule.
>
> For question #2: According to my research, many compute or database systems
> have the "bushy join reorder" strategies based on dynamic programming. For
> example, Spark and PostgresSql use the same strategy, and the threshold be
> set to 12. Also, some papers, like [1] and [2], have also researched this
> strategy, and [2] set the threshold to 14.
>
> For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
> is very simple, and it will not store the intermediate results at all. So,
> the implementation of Calcite cannot get all possible join reorder results
> and it cannot combine with the current cost model to get more reasonable
> join reorder results.
>
>
> [1]
>
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf
>
>
>
> Benchao Li <libenc...@apache.org> 于2023年1月3日周二 12:54写道:
>
> > Hi Yunhong,
> >
> > Thanks for driving this~
> >
> > I haven't gone deep into the implementation details yet. Regarding the
> > general description, I would ask a few questions firstly:
> >
> > #1, Is there any benchmark results about the optimization latency change
> > compared to current approach? In OLAP scenario, query optimization
> latency
> > is more crucial.
> >
> > #2, About the term "busy join reorder", is there any others systems which
> > also use this term? I know Calcite has a rule[1] which uses the term
> "bushy
> > join".
> >
> > #3, About the implementation, if this does the same work as Calcite
> > MultiJoinOptimizeBushyRule, is it possible to use the Calcite version
> > directly or extend it in some way?
> >
> > [1]
> >
> >
> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
> >
> > yh z <zhengyunhon...@gmail.com> 于2022年12月29日周四 14:44写道:
> >
> > > Hi, devs,
> > >
> > > I'd like to start a discuss about adding an option called
> > > "table.oprimizer.busy-join-reorder-threshold" for planner rule while we
> > try
> > > to introduce a new busy join reorder rule[1] into Flink.
> > >
> > > This join reorder rule is based on dynamic programing[2], which can
> store
> > > all possible intermediate results, and the cost model can be used to
> > select
> > > the optimal join reorder result. Compare with the existing Lopt join
> > > reorder rule, the new rule can give more possible results and the
> result
> > > can be more accurate. However, the search space of this rule will
> become
> > > very large as the number of tables increases. So we should introduce an
> > > option to limit the expansion of search space, if the number of table
> can
> > > be reordered less than the threshold, the new busy join reorder rule is
> > > used. On the contrary, the Lopt rule is used.
> > >
> > > The default threshold intended to be set to 12. One reason is that in
> the
> > > tpc-ds benchmark test, when the number of tables exceeds 12, the
> > > optimization time will be very long. The other reason is that it refers
> > to
> > > relevant engines, like Spark, whose recommended setting is 12.[3]
> > >
> > > Looking forward to your feedback.
> > >
> > > [1]  https://issues.apache.org/jira/browse/FLINK-30376
> > > [2]
> > >
> > >
> >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > > [3]
> > >
> > >
> >
> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
> > >
> > > Best regards,
> > > Yunhong Zheng
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >
>


-- 

Best,
Benchao Li

Reply via email to