Hi Jark,

Thanks for your reply.

We are going to use both the bushy join reorder rule and the Lopt join
reorder rule. The option "table.optimizer.bushy-join-reorder-threshold"
decides between them: when the number of tables that need to be reordered
is less than or equal to this threshold, the bushy join reorder rule is
used; when it is greater than this threshold, the Lopt join reorder rule
is used. Since the number of tables to be reordered is at or below this
threshold for most queries, the bushy join reorder rule can be regarded as
the default join reorder rule.
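
As a rough illustration of the selection logic described above, here is a
minimal sketch. The function and constant names are hypothetical and are not
the planner's actual code; the default of 12 is the value proposed in the
first email of this thread.

```python
# Hypothetical sketch of the threshold-based rule selection.
# None of these names come from Flink's planner; they are illustrative only.

# Assumed default, taken from the proposal in the first email of the thread.
DEFAULT_BUSHY_THRESHOLD = 12


def choose_join_reorder_rule(num_tables: int,
                             threshold: int = DEFAULT_BUSHY_THRESHOLD) -> str:
    """Return the name of the reorder rule for a join over `num_tables` tables."""
    if num_tables <= threshold:
        # At or below the threshold: use the bushy (dynamic programming) rule.
        return "bushy"
    # Above the threshold: fall back to the existing Lopt rule.
    return "lopt"


if __name__ == "__main__":
    for n in (3, 12, 13):
        print(n, choose_join_reorder_rule(n))
```

With these assumptions, a query joining exactly as many tables as the
threshold still goes through the bushy rule, and only larger joins fall back
to Lopt.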

I'm really sorry. I didn't carefully check the contents of the first email
and wrote the wrong words in it. I have made sure that the correct word
"bushy" is used in the PR[1]. The option name is indeed
"table.optimizer.bushy-join-reorder-threshold".

[1] https://github.com/apache/flink/pull/21530

Best regards,
Yunhong Zheng

Jark Wu <imj...@gmail.com> wrote on Tue, Jan 3, 2023 at 20:06:

> Hi Yunhong,
>
> Thanks for driving the feature.
>
> I just have one question. Is the bushy join reorder optimization enabled
> by default? Will the bushy join reorder rule replace the existing Lopt join
> reorder rule?
>
> Besides, I guess the option "table.oprimizer.busy-join-reorder-threshold"
> should be "table.optimizer.bushy-join-reorder-threshold"? (I guess they
> are just typos, as your last email said, but I just want to clarify as it
> is a public API).
>
> Best,
> Jark
>
>
> > On Jan 3, 2023, at 12:53, Benchao Li <libenc...@apache.org> wrote:
> >
> > Hi Yunhong,
> >
> > Thanks for driving this~
> >
> > I haven't gone deep into the implementation details yet. Regarding the
> > general description, I'd like to ask a few questions first:
> >
> > #1, Are there any benchmark results on how optimization latency changes
> > compared to the current approach? In OLAP scenarios, query optimization
> > latency is more crucial.
> >
> > #2, About the term "busy join reorder", are there any other systems that
> > also use this term? I know Calcite has a rule[1] which uses the term
> > "bushy join".
> >
> > #3, About the implementation, if this does the same work as Calcite
> > MultiJoinOptimizeBushyRule, is it possible to use the Calcite version
> > directly or extend it in some way?
> >
> > [1] https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
> >
> > yh z <zhengyunhon...@gmail.com> wrote on Thu, Dec 29, 2022 at 14:44:
> >
> >> Hi, devs,
> >>
> >> I'd like to start a discussion about adding an option called
> >> "table.oprimizer.busy-join-reorder-threshold" for the planner rules while
> >> we try to introduce a new busy join reorder rule[1] into Flink.
> >>
> >> This join reorder rule is based on dynamic programming[2], which can
> >> store all possible intermediate results, and the cost model can be used
> >> to select the optimal join reorder result. Compared with the existing
> >> Lopt join reorder rule, the new rule can give more possible results and
> >> the result can be more accurate. However, the search space of this rule
> >> will become very large as the number of tables increases. So we should
> >> introduce an option to limit the expansion of the search space: if the
> >> number of tables that can be reordered is less than the threshold, the
> >> new busy join reorder rule is used. Otherwise, the Lopt rule is used.
> >>
> >> The default threshold is intended to be 12. One reason is that in the
> >> TPC-DS benchmark test, when the number of tables exceeds 12, the
> >> optimization time becomes very long. The other reason is that it follows
> >> related engines such as Spark, whose recommended setting is 12.[3]
> >>
> >> Looking forward to your feedback.
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-30376
> >> [2] https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> >> [3] https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
> >>
> >> Best regards,
> >> Yunhong Zheng
> >>
> >
> >
> > --
> >
> > Best,
> > Benchao Li
>
>
