Re: [Question] Unknown cost calculation/propagation in RelSubsets

Thomas Rebele Tue, 16 Jan 2024 12:37:06 -0800

Hello,

The RuleMatchVisualizer uses the planner to get the cost [1], and the
Volcano planner uses the bestCost attribute for RelSubset [2].


The color depends on the steps:
* For intermediate steps, the purple color shows which nodes have been
matched by the rule. Light blue shows added nodes.
* For the final step, the purple and light blue colors show the chosen
nodes of the final plan. Light blue for the RelSubset nodes.

If I remember correctly, the cost for a subset shown at a step should be
the same as the best cost of all children for that particular step.

It would be helpful to share at least the generated files (especially the
.js), to understand what's going on.

[1]
https://github.com/apache/calcite/blob/c4042a34ef054b89cec1c47fefcbc8689bad55be/core/src/main/java/org/apache/calcite/plan/visualizer/RuleMatchVisualizer.java#L300C34-L300C34
[2]
https://github.com/apache/calcite/blob/c4042a34ef054b89cec1c47fefcbc8689bad55be/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L722

Cordialement / Best Regards,
*Dr. Thomas Rebele* | R&D Developer | Germany | *E* *[email protected]
<[email protected]>* | *W* *www.cloud.com <http://www.cloud.com/>*

*Cloud SG Germany GmbH* |  ℅ Citrix Systems GmbH, Erika-Mann-Straße 67-69,
80636 München Deutschland | Registergericht: Amtsgericht München, HRB
123355 | Geschäftsführer: Antonio Gomes, Alexander E. Kolar, Ganesh
Vaidyanathan, Brian Lee Shytle



On Fri, Jan 12, 2024 at 11:35 PM Tony Fiedler <
[email protected]> wrote:

> Dear Calcite devs,
>
> First of all I really appreciate having a mature framework like Calcite.
> Please continue your great work on this project!
>
> My use case is feeding Calcite (v1.35.0) with an SQL query and doing
> some optimizations by providing metadata and selected planner rules. I
> initialize the Volcano planner and convert the logical plan resulting
> from the sql to a physical plan (using bindable convention).
> After the optimization, I convert the physical plan back to sql --
> hoping its execution time is faster (running the query by a PostgreSQL
> server) than the original query.
>
> There are some aspects I don't understand regarding both the cost
> calculation and cost propagation of (Rel) Subsets in the tree-based plan
> representation generated by RuleMatchVisualizer.
>
> AFAIK Subsets don't have any costs [1], so I'm really confused why
> (cumulative) `cpu` is higher in the subset than it is in its child
> elements (BindableJoin and BindableFilter), see [2]. In addition to that
> the cost metric `rows` is smaller(!) than the values provided by the
> children.
>
> What I expect is that Subset has exactly the same `rows`, `cpu` (and
> `io`) of the selected (purple) child element.
> Having a look at this sub tree [3] the cost propagation works like
> expected.
>
> Besides that I already noticed that Calcite costs seem to have an upper
> bound (9.223372036854775807E18) where costs can't get any higher in sub
> trees where this value is reached in an (physical operator) element.
>
> I know it's hard to tell what Calcite actually does just using
> screenshots. Please let me know if I should provide e.g., my code for
> giving better insights.
>
> Thank you in advance for your reply!
>
> [1]:
>
> https://github.com/apache/calcite/blob/c4042a34ef054b89cec1c47fefcbc8689bad55be/core/src/main/java/org/apache/calcite/plan/volcano/RelSubset.java#L254
> [2]: https://ibb.co/7jtXKH3
> [3]: https://ibb.co/5BZZyLz
>
> Best regards,
> Tony
>

Re: [Question] Unknown cost calculation/propagation in RelSubsets

Reply via email to