[Question] Unknown cost calculation/propagation in RelSubsets

2024-01-12 Thread Tony Fiedler
Dear Calcite devs, First of all I really appreciate having a mature framework like Calcite. Please continue your great work on this project! My use case is feeding Calcite (v1.35.0) with an SQL query and doing some optimizations by providing metadata and selected planner rules. I initialize th

Re: [Question] Unknown cost calculation/propagation in RelSubsets

2024-01-16 Thread Thomas Rebele
Hello, The RuleMatchVisualizer uses the planner to get the cost [1], and the Volcano planner uses the bestCost attribute for RelSubset [2]. The color depends on the steps: * For intermediate steps, the purple color shows which nodes have been matched by the rule. Light blue shows added nodes. * F

Re: [Question] Unknown cost calculation/propagation in RelSubsets

2024-01-16 Thread Julian Hyde
Tony, You’re asking about how the Volcano algorithm computes metadata for equivalence classes (what Calcite calls subsets) and to my knowledge it’s not been spelled out explicitly (either in the Volcano/Cascades papers or in Calcite discussions). Calcite needs various kinds of metadata, such a

Re: [Question] Unknown cost calculation/propagation in RelSubsets

2024-01-24 Thread Tony Fiedler
Julian, many thanks for the insights. I'm obviously not able to know/find out where in the code those aggregate/combiner functions for RelSubsets are located. AFAIK there are no metadata handler methods inside the metadata handler implementations (in the form of `RelMdXXX implements MetadataHa

Re: [Question] Unknown cost calculation/propagation in RelSubsets

2024-01-24 Thread Julian Hyde
You’re right, metadata handler methods for RelSubset would be the way to go. If there are no such methods I guess no one has tried to solve this problem before. (At least no one who contributed their changes back.) Yes, those are the papers. They are foundational for Calcite but I haven’t read

RE: Re: [Question] Unknown cost calculation/propagation in RelSubsets

2024-01-24 Thread Tony Fiedler
Hello, Right, I guess I understand what the color encoding means. Thanks for confirming my thoughts. If I remember correctly, the cost for a subset shown at a step should be the same as the best cost of all children for that particular step. Right and this is what confuses me since this doe

Re: Re: [Question] Unknown cost calculation/propagation in RelSubsets

2024-02-06 Thread Thomas Rebele
Hello, Thank you for sharing the files. I assume the intermediate costs were included, as the cost is updated on each step for many nodes. So in step 89, #197-BindableJoin has a CPU cost of 6.787504586323E12, and that cost gets updated to 5.6541574E7 in step 90. However, its parent subset#170 does
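
For readers following the thread, the expectation discussed above (a subset's cost equals the best cost among its members, and a member's cost includes the best cost of its input subsets) can be illustrated with a toy model. This is only a sketch under stated assumptions, not Calcite's implementation: `Subset`, `Node` and `propagate` are hypothetical names, costs are reduced to a single `double`, and only cost improvements are pushed upward.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of best-cost propagation between equivalence classes. Not Calcite code. */
final class SubsetCostSketch {

  /** Hypothetical stand-in for an equivalence class (what Calcite calls a RelSubset). */
  static final class Subset {
    final String name;
    final List<Node> members = new ArrayList<>(); // alternative plans in this class
    final List<Node> parents = new ArrayList<>(); // nodes that read from this class
    double bestCost = Double.POSITIVE_INFINITY;   // unknown until a member is costed

    Subset(String name) {
      this.name = name;
    }
  }

  /** Hypothetical stand-in for one physical operator registered in a subset. */
  static final class Node {
    final String name;
    final Subset owner;
    final List<Subset> inputs;
    double selfCost;

    Node(String name, Subset owner, double selfCost, List<Subset> inputs) {
      this.name = name;
      this.owner = owner;
      this.selfCost = selfCost;
      this.inputs = inputs;
      owner.members.add(this);
      for (Subset input : inputs) {
        input.parents.add(this); // remember who must be re-costed when the input improves
      }
    }

    /** Own cost plus the best known cost of every input subset. */
    double cumulativeCost() {
      double total = selfCost;
      for (Subset input : inputs) {
        total += input.bestCost;
      }
      return total;
    }
  }

  /** Re-costs a node and, if its subset improves, re-costs the subset's parents. */
  static void propagate(Node node) {
    double cost = node.cumulativeCost();
    Subset owner = node.owner;
    if (cost < owner.bestCost) {
      owner.bestCost = cost;
      for (Node parent : owner.parents) {
        propagate(parent);
      }
    }
  }
}
```

In this model, lowering a join's `selfCost` and calling `propagate` on it lowers the best cost of the subset that owns the join, and then re-costs every node that reads from that subset. A cost that goes up is deliberately ignored here, which is a simplification of this sketch.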