[jira] [Created] (CALCITE-3972) Allow RelBuilder to create RelNode with convention and use it for trait convert
Xiening Dai created CALCITE-3972:

Summary: Allow RelBuilder to create RelNode with convention and use it for trait convert
Key: CALCITE-3972
URL: https://issues.apache.org/jira/browse/CALCITE-3972
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

1. Provide Convention.transformRelBuilder() to transform an existing RelBuilder into one with a specific convention.
2. RelBuilder provides a withRelFactories() method that lets the caller swap the underlying RelFactories and create a new builder.
3. Use the new interface in RelCollationTraitDef for converting into RelCollation traits.

With this change we can avoid ~1/3 of the total rule firings in an N-way join case.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3963) Maintains logical properties at RelSet (equivalent group) instead of RelNode
Xiening Dai created CALCITE-3963:

Summary: Maintains logical properties at RelSet (equivalent group) instead of RelNode
Key: CALCITE-3963
URL: https://issues.apache.org/jira/browse/CALCITE-3963
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

Currently the logical properties (such as row count, distinct row count, etc.) are maintained at the RelNode level. This creates a number of metadata consistency problems, e.g. CALCITE-1048, CALCITE-2166. In theory, all RelNodes in a RelSet should share the same logical properties, by the definition of relational equivalence. So it makes more sense to keep logical properties at the RelSet level rather than at the RelNode level. Such properties shouldn't change when a new subset is created or a subset's best is changed.

Specifically, I think the built-in metadata below should fall into the logical properties category:

- Selectivity
- UniqueKeys
- ColumnUniqueness
- RowCount
- MaxRowCount
- MinRowCount
- DistinctRowCount
- Size (averageRowSize, averageColumnSize)
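To illustrate the idea, here is a minimal sketch in which a logical property is stored once per equivalence group and every member reads it from the group. All names here (LogicalPropsSketch, Group, Node, nodeLevelEstimate) are hypothetical and are not Calcite classes; deriving the group value from the first registered member is an assumption made only to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

class LogicalPropsSketch {
  /** Hypothetical equivalence group; plays the role a RelSet would play. */
  static final class Group {
    private final List<Node> members = new ArrayList<>();
    private Double rowCount; // logical property, derived once per group

    /** Registers a new member; its own estimate never overrides the group value. */
    Node add(String name, double nodeLevelEstimate) {
      Node node = new Node(name, nodeLevelEstimate, this);
      members.add(node);
      return node;
    }

    double rowCount() {
      if (members.isEmpty()) {
        throw new IllegalStateException("group has no members");
      }
      if (rowCount == null) {
        // Assumption for this sketch: derive the value from the first member.
        rowCount = members.get(0).nodeLevelEstimate;
      }
      return rowCount;
    }
  }

  /** Hypothetical group member; plays the role a RelNode would play. */
  static final class Node {
    final String name;
    final double nodeLevelEstimate; // what a per-node computation might return
    final Group group;

    Node(String name, double nodeLevelEstimate, Group group) {
      this.name = name;
      this.nodeLevelEstimate = nodeLevelEstimate;
      this.group = group;
    }

    /** Logical properties are always answered by the group. */
    double rowCount() {
      return group.rowCount();
    }
  }

  public static void main(String[] args) {
    Group group = new Group();
    Node a = group.add("a", 100.0);
    Node b = group.add("b", 90.0); // equivalent node with a different per-node estimate
    System.out.println(a.rowCount() == b.rowCount());
  }
}
```

Because both nodes answer from the group, adding another equivalent node later cannot change the row count that earlier callers observed.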
[jira] [Created] (CALCITE-3915) Add rule listener to report rule attempts and time at DEBUG log level
Xiening Dai created CALCITE-3915:

Summary: Add rule listener to report rule attempts and time at DEBUG log level
Key: CALCITE-3915
URL: https://issues.apache.org/jira/browse/CALCITE-3915
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

We can add a rule listener to the VolcanoPlanner that records rule attempts and the accumulated time spent firing each rule. This would be very helpful for debugging planner performance issues (such as CALCITE-2970). The listener will only be added at DEBUG log level. It will produce a rule attempt summary like the one below:

2020-04-10 18:03:24,621 [main] DEBUG -
Rules                                                    Attempts    Time (us)
ProjectMergeRule:force_mode                                 1,203    1,124,242
EnumerableProjectRule(in:NONE,out:ENUMERABLE)                 129      226,475
AggregatePullUpConstantsRule                                   84        1,949
AggregateProjectMergeRule                                      81      128,575
ProjectFilterTransposeRule                                     76        8,734
JoinPushExpressionsRule                                        71       10,314
FilterJoinRule:FilterJoinRule:no-filter                        71        1,306
JoinPushThroughJoinRule:right                                  47          676
JoinPushThroughJoinRule:left                                   45      175,696
FilterJoinRule:FilterJoinRule:filter                           41       89,381
ReduceExpressionsRule(Filter)                                  24       38,128
EnumerableFilterRule(in:NONE,out:ENUMERABLE)                   24       52,457
EnumerableJoinRule(in:NONE,out:ENUMERABLE)                     24       71,752
EnumerableMergeJoinRule(in:NONE,out:ENUMERABLE)                24       47,987
FilterProjectTransposeRule                                     22       55,177
JoinCommuteRule                                                20       70,240
EnumerableAggregateRule(in:NONE,out:ENUMERABLE)                10       26,522
AggregateExpandDistinctAggregatesRule                          10           37
EnumerableTableScanRule(in:NONE,out:ENUMERABLE)                 1          527
EnumerableInterpreterRule(in:BINDABLE,out:ENUMERABLE)           1          764
BindableTableScanRule                                           1        1,710
ExpandConversionRule                                            1          175
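The bookkeeping behind such a summary could look roughly like the sketch below. RuleAttemptStats and its methods are hypothetical names for illustration only; this is not the listener interface of VolcanoPlanner, and how the planner would call record() is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-rule statistics collector; not Calcite's listener API. */
class RuleAttemptStats {
  // Per rule: element 0 = attempts, element 1 = accumulated time in microseconds.
  private final Map<String, long[]> stats = new LinkedHashMap<>();

  /** Records one attempt of the named rule that took elapsedMicros. */
  void record(String ruleName, long elapsedMicros) {
    long[] s = stats.computeIfAbsent(ruleName, k -> new long[2]);
    s[0]++;
    s[1] += elapsedMicros;
  }

  long attempts(String ruleName) {
    long[] s = stats.get(ruleName);
    return s == null ? 0 : s[0];
  }

  long totalMicros(String ruleName) {
    long[] s = stats.get(ruleName);
    return s == null ? 0 : s[1];
  }

  /** Formats one row per rule, sorted by number of attempts, descending. */
  String summary() {
    List<Map.Entry<String, long[]>> rows = new ArrayList<>(stats.entrySet());
    rows.sort(Comparator
        .comparingLong((Map.Entry<String, long[]> e) -> e.getValue()[0])
        .reversed());
    StringBuilder sb = new StringBuilder(
        String.format("%-55s%10s%13s%n", "Rules", "Attempts", "Time (us)"));
    for (Map.Entry<String, long[]> e : rows) {
      sb.append(String.format("%-55s%,10d%,13d%n",
          e.getKey(), e.getValue()[0], e.getValue()[1]));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    RuleAttemptStats stats = new RuleAttemptStats();
    long start = System.nanoTime();
    // ... a rule attempt would run here ...
    stats.record("ProjectMergeRule", (System.nanoTime() - start) / 1_000);
    stats.record("JoinCommuteRule", 42);
    System.out.print(stats.summary());
  }
}
```

Keeping both counters in one map entry means each rule attempt costs a single hash lookup, which matters if the collector is enabled while investigating planner performance.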
[jira] [Created] (CALCITE-3821) RelOptUtil::containsMultisetOrWindowedAgg doesn't really check multiset
Xiening Dai created CALCITE-3821:

Summary: RelOptUtil::containsMultisetOrWindowedAgg doesn't really check multiset
Key: CALCITE-3821
URL: https://issues.apache.org/jira/browse/CALCITE-3821
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

The check of containsMultiset() is short-circuited by a "false" constant. Also, what the function really does is check that the rel node does *not* contain aggs.
[jira] [Created] (CALCITE-3807) checkForSatisfiedConverters() is unnecessary
Xiening Dai created CALCITE-3807:

Summary: checkForSatisfiedConverters() is unnecessary
Key: CALCITE-3807
URL: https://issues.apache.org/jira/browse/CALCITE-3807
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

When VolcanoPlanner registers an abstract converter, it adds the converter to the set.abstractConverters list, then calls checkForSatisfiedConverters() to see whether any converter is satisfied and can be removed from the list. But every abstract converter always satisfies itself (changeTraitsUsingConverters() returns the converter itself), so the converter is removed from the list right after it is added. This check is therefore completely unnecessary, and it slows down the planner.
[jira] [Created] (CALCITE-3785) HepPlanner.belongToDag() doesn't have to use mapDigestToVertex
Xiening Dai created CALCITE-3785:

Summary: HepPlanner.belongToDag() doesn't have to use mapDigestToVertex
Key: CALCITE-3785
URL: https://issues.apache.org/jira/browse/CALCITE-3785
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

To test whether a vertex belongs to the DAG, we can simply call graph.vertexSet().contains(vertex). There's no need to look it up in the mapDigestToVertex map, which incurs the overhead of creating the map key. This problem was amplified by CALCITE-3713.
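The difference between the two lookups can be sketched as follows. DagMembership, Vertex, and digest() are hypothetical stand-ins, not the HepPlanner classes; only the names vertexSet and mapDigestToVertex are borrowed from the description above, and the string concatenation in digest() merely imitates the cost of building a map key.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DagMembership {
  /** Hypothetical vertex; uses identity equals/hashCode like a plain object. */
  static final class Vertex {
    final String expr;

    Vertex(String expr) {
      this.expr = expr;
    }

    /** Stand-in for an expensive key: a new string is built on every call. */
    String digest() {
      return "digest(" + expr + ")";
    }
  }

  final Set<Vertex> vertexSet = new HashSet<>();
  final Map<String, Vertex> mapDigestToVertex = new HashMap<>();

  void add(Vertex v) {
    vertexSet.add(v);
    mapDigestToVertex.put(v.digest(), v);
  }

  /** Lookup through the digest map: has to build the key first. */
  boolean belongsToDagViaDigest(Vertex v) {
    return mapDigestToVertex.get(v.digest()) == v;
  }

  /** Direct lookup in the vertex set: no key construction needed. */
  boolean belongsToDag(Vertex v) {
    return vertexSet.contains(v);
  }

  public static void main(String[] args) {
    DagMembership dag = new DagMembership();
    Vertex registered = new Vertex("Project(a)");
    Vertex stranger = new Vertex("Project(a)"); // same digest, never added
    dag.add(registered);
    System.out.println(dag.belongsToDag(registered) + " " + dag.belongsToDag(stranger));
    System.out.println(dag.belongsToDagViaDigest(registered) + " " + dag.belongsToDagViaDigest(stranger));
  }
}
```

Both methods give the same answer in this sketch; the direct set lookup just avoids constructing the digest.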
[jira] [Created] (CALCITE-3768) VolcanoPlanner.changeTraitsUsingConverters() has parameter that's never used
Xiening Dai created CALCITE-3768:

Summary: VolcanoPlanner.changeTraitsUsingConverters() has parameter that's never used
Key: CALCITE-3768
URL: https://issues.apache.org/jira/browse/CALCITE-3768
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

{code:java}
private RelNode changeTraitsUsingConverters(
    RelNode rel,
    RelTraitSet toTraits,
    boolean allowAbstractConverters)
{code}

The third parameter, allowAbstractConverters, is always set to false.
[jira] [Created] (CALCITE-3491) VolcanoPlanner.completeConversion() is bypassed by "if (true)"
Xiening Dai created CALCITE-3491:

Summary: VolcanoPlanner.completeConversion() is bypassed by "if (true)"
Key: CALCITE-3491
URL: https://issues.apache.org/jira/browse/CALCITE-3491
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

It is not clear why we need VolcanoPlanner.completeConversion(). It is currently bypassed as shown below [1]:

    if (true) {
      return rel;
    }

If the method is not needed, we should just remove it.

[1] https://github.com/apache/calcite/blob/f90e5d7b48848191c1a70e6c0f78eece93433806/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L1096
[jira] [Created] (CALCITE-3487) Should not hard code RelMetadataQuery class in VolcanoPlanner.isValid()
Xiening Dai created CALCITE-3487:

Summary: Should not hard code RelMetadataQuery class in VolcanoPlanner.isValid()
Key: CALCITE-3487
URL: https://issues.apache.org/jira/browse/CALCITE-3487
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

With CALCITE-3446, a Calcite user can choose to provide a customized subclass of RelMetadataQuery through RelOptCluster.setMetadataQuery(). We cannot assume that the base class RelMetadataQuery is always used in the planner.
[jira] [Created] (CALCITE-3455) Redundant rule firing for both logical and physical nodes
Xiening Dai created CALCITE-3455:

Summary: Redundant rule firing for both logical and physical nodes
Key: CALCITE-3455
URL: https://issues.apache.org/jira/browse/CALCITE-3455
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

While looking at CALCITE-2970, I noticed that some of the rules are fired for both logical and physical nodes. For example, ProjectMergeRule matches Project.class, so it is fired for LogicalProject. But after LogicalProject is converted into EnumerableProject, the same rule is fired again for the physical rels. The same applies to EnumerableLimitRule, SortRemoveConstantKeysRule, etc.

This seems to be unnecessary. When ProjectMergeRule is applied to LogicalProject nodes, we already generate all possible alternatives with merged projects. We just need to convert the LogicalProject into EnumerableProject; there's no need to merge EnumerableProject again.

If I update those rules to match only logical nodes, the planning time of the case in CALCITE-2970 is reduced by ~30%.
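The effect of the operand class on how often a rule matches can be shown with a small sketch. The three classes below are empty hypothetical stand-ins that only mirror the names used above, and countMatches() is an illustration of class-based matching, not how Calcite's rule operands are implemented.

```java
import java.util.Arrays;
import java.util.List;

class RuleMatchSketch {
  // Hypothetical stand-ins that mirror the class hierarchy described above.
  static class Project {}
  static class LogicalProject extends Project {}
  static class EnumerableProject extends Project {}

  /** Counts the nodes that a rule with the given operand class would match. */
  static int countMatches(Class<? extends Project> operandClass, List<Project> nodes) {
    int matches = 0;
    for (Project node : nodes) {
      if (operandClass.isInstance(node)) {
        matches++;
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    // One logical node plus the physical node it was converted into.
    List<Project> nodes = Arrays.asList(new LogicalProject(), new EnumerableProject());
    // Matching the base class fires for both nodes.
    System.out.println(countMatches(Project.class, nodes));
    // Matching only the logical class fires once.
    System.out.println(countMatches(LogicalProject.class, nodes));
  }
}
```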
[jira] [Created] (CALCITE-3330) propagateCostImprovements() could result in stack overflow
Xiening Dai created CALCITE-3330:

Summary: propagateCostImprovements() could result in stack overflow
Key: CALCITE-3330
URL: https://issues.apache.org/jira/browse/CALCITE-3330
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai
Assignee: Xiening Dai

The current implementation uses a depth-first approach to propagate cost improvements to parent rel nodes. This could lead to a stack overflow if the rel node hierarchy is very deep. I suggest using a breadth-first approach for cost propagation.
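A breadth-first propagation can be sketched with an explicit work queue instead of recursion, so the depth of the hierarchy no longer consumes call stack. CostPropagation and Node are hypothetical, and the cost model (a parent's cost is its own cost plus its input's best cost) is an assumption chosen to keep the example small; this is not the VolcanoPlanner code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class CostPropagation {
  /** Hypothetical node with a single input; parents point up the hierarchy. */
  static final class Node {
    final double selfCost;
    double bestCost = Double.POSITIVE_INFINITY;
    final List<Node> parents = new ArrayList<>();

    Node(double selfCost) {
      this.selfCost = selfCost;
    }
  }

  /** Propagates an improved cost upwards using a queue rather than recursion. */
  static void propagate(Node start, double newCost) {
    if (newCost >= start.bestCost) {
      return; // not an improvement, nothing to propagate
    }
    start.bestCost = newCost;
    Deque<Node> queue = new ArrayDeque<>();
    queue.add(start);
    while (!queue.isEmpty()) {
      Node node = queue.poll();
      for (Node parent : node.parents) {
        double candidate = parent.selfCost + node.bestCost;
        if (candidate < parent.bestCost) {
          parent.bestCost = candidate;
          queue.add(parent); // revisit the parent's own parents later
        }
      }
    }
  }

  /** Builds a chain of the given depth; returns {leaf, root}. */
  static Node[] chain(int depth) {
    Node leaf = new Node(1.0);
    Node current = leaf;
    for (int i = 1; i < depth; i++) {
      Node parent = new Node(1.0);
      current.parents.add(parent);
      current = parent;
    }
    return new Node[] {leaf, current};
  }

  public static void main(String[] args) {
    // A depth at which a recursive, depth-first version would risk overflowing the stack.
    Node[] ends = chain(200_000);
    propagate(ends[0], 1.0);
    System.out.println(ends[1].bestCost);
  }
}
```

The queue holds only nodes whose cost actually improved, so memory use is bounded by the number of nodes in the hierarchy rather than by the thread's stack size.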
[jira] [Created] (CALCITE-3283) RelSubSet's best is not existed in the set
Xiening Dai created CALCITE-3283:

Summary: RelSubSet's best is not existed in the set
Key: CALCITE-3283
URL: https://issues.apache.org/jira/browse/CALCITE-3283
Project: Calcite
Issue Type: Bug
Reporter: Xiening Dai

To reproduce this, add the code below to VolcanoPlanner.isValid(). It verifies that RelSubset.best actually belongs to its set:

{code:java}
// Make sure best RelNode is valid
if (subset.best != null && !subset.set.rels.contains(subset.best)) {
  return litmus.fail("RelSubSet [{}] has best RelNode [{}] which is not existed in its set.",
      subset.getDescription(), subset.best.getDescription());
}
{code}

Run JdbcTest.testVarcharEquals and you will get the exception below:

java.lang.AssertionError: RelSubSet [rel#6828:Subset#2.JDBC.foodmart.[]] has best RelNode [rel#6871:JdbcProject.JDBC.foodmart.[](input=RelSubset#6844,lname=$2)] which is not existed in its set.
    at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.isValid(VolcanoPlanner.java:888)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:851)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:868)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1939)
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:129)
    at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:236)
    at org.apache.calcite.rel.convert.ConverterRule.onMatch(ConverterRule.java:141)
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:208)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:631)
    at org.apache.calcite.tools.Programs.lambda$standard$3(Programs.java:286)
    at org.apache.calcite.tools.Programs$SequenceProgram.run(Programs.java:346)
    at org.apache.calcite.prepare.Prepare.optimize(Prepare.java:189)
    at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:320)
    at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:231)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:638)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:502)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:472)
    at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231)
    at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:550)
    at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
    at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
    at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)
    at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:522)
    at org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1466)
    at org.apache.calcite.test.CalciteAssert$AssertQuery.withConnection(CalciteAssert.java:1398)
    at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1464)
    at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1447)
    at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1410)
    at org.apache.calcite.test.JdbcTest.testVarcharEquals(JdbcTest.java:4420)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
[jira] [Created] (CALCITE-3257) Need to clear RelMetaDataQuery cache after dumping RelNodes and Graphviz
Xiening Dai created CALCITE-3257:

Summary: Need to clear RelMetaDataQuery cache after dumping RelNodes and Graphviz
Key: CALCITE-3257
URL: https://issues.apache.org/jira/browse/CALCITE-3257
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Xiening Dai

To reproduce:

1. Set the Log4J log level to TRACE, so that the planner dumps rel node info at every rule match (https://github.com/apache/calcite/blob/3124a85b93ff2f1b79484c7bd4cc41835d4f1920/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L435).
2. Run JdbcTest.testInnerJoinValues. It fails with the exception below:

java.lang.AssertionError: rel [rel#2128:EnumerableProject.ENUMERABLE.[](input=RelSubset#2122,EMPNO=$0,DESC=$3)] has lower cost {309.75 rows, 215.5 cpu, 0.0 io} than best cost {322.5 rows, 241.0 cpu, 0.0 io} of subset [rel#2119:Subset#5.ENUMERABLE.[]]
    at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.isValid(VolcanoPlanner.java:888)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:851)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:868)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:90)
    at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:329)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1656)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:846)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:868)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1927)
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:129)
    at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:236)
    at org.apache.calcite.rel.rules.AggregateProjectMergeRule.onMatch(AggregateProjectMergeRule.java:74)
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:208)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:631)
    at org.apache.calcite.tools.Programs.lambda$standard$3(Programs.java:286)
    at org.apache.calcite.tools.Programs$SequenceProgram.run(Programs.java:346)
    at org.apache.calcite.prepare.Prepare.optimize(Prepare.java:189)
    at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:314)
    at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:231)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:638)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:502)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:472)
    at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231)
    at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:550)
    at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
    at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
    at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)
    at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:522)
    at org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1466)
    at org.apache.calcite.test.CalciteAssert$AssertQuery.withConnection(CalciteAssert.java:1398)
    at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1464)
    at org.apache.calcite.test.CalciteAssert$AssertQuery.explainMatches(CalciteAssert.java:1561)
    at org.apache.calcite.test.CalciteAssert$AssertQuery.explainContains(CalciteAssert.java:1556)
    at org.apache.calcite.test.JdbcTest.testInnerJoinValues(JdbcTest.java:2580)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at