[jira] [Created] (CALCITE-3972) Allow RelBuilder to create RelNode with convention and use it for trait convert

2020-05-05 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3972:


 Summary: Allow RelBuilder to create RelNode with convention and 
use it for trait convert
 Key: CALCITE-3972
 URL: https://issues.apache.org/jira/browse/CALCITE-3972
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


1. Provide Convention.transformRelBuilder() to transform an existing RelBuilder 
into one with a specific convention.
2. Provide a RelBuilder.withRelFactories() method that lets the caller swap the 
underlying RelFactories and create a new builder. 
3. Use the new interface in RelCollationTraitDef when converting to 
RelCollation traits.

We can avoid ~1/3 of the total rule firings in an N-way join case with this change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3963) Maintains logical properties at RelSet (equivalent group) instead of RelNode

2020-04-29 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3963:


 Summary: Maintains logical properties at RelSet (equivalent group) 
instead of RelNode
 Key: CALCITE-3963
 URL: https://issues.apache.org/jira/browse/CALCITE-3963
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


Currently the logical properties (such as row count, distinct row count, etc.) 
are maintained at the RelNode level. This creates a number of metadata consistency 
problems, e.g. CALCITE-1048, CALCITE-2166. 

In theory, all RelNodes in a RelSet should share the same logical properties, 
by the definition of relational equivalence. So it makes more sense to keep 
logical properties at the RelSet level rather than the RelNode level. Such 
properties shouldn't change when a new subset is created or a subset's best is 
changed.
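
A minimal sketch of the idea, assuming a toy model: EquivSet and Expr below are 
hypothetical stand-ins for RelSet and RelNode, not Calcite classes. The row 
count lives on the set, and every member reads it from there, so registering a 
new equivalent expression cannot make the members disagree.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; EquivSet and Expr are hypothetical, not Calcite classes. */
final class LogicalPropsSketch {

  /** Stand-in for a RelSet: a group of equivalent expressions. */
  static final class EquivSet {
    private final double rowCount; // logical property, owned by the set
    private final List<Expr> members = new ArrayList<>();

    EquivSet(double rowCount) {
      this.rowCount = rowCount;
    }

    /** Registers a new equivalent expression; the set's properties are untouched. */
    Expr add(String description) {
      Expr expr = new Expr(description, this);
      members.add(expr);
      return expr;
    }

    double rowCount() {
      return rowCount;
    }

    int size() {
      return members.size();
    }
  }

  /** Stand-in for a RelNode: it asks its set instead of deriving its own value. */
  static final class Expr {
    final String description;
    private final EquivSet set;

    Expr(String description, EquivSet set) {
      this.description = description;
      this.set = set;
    }

    double rowCount() {
      return set.rowCount();
    }
  }

  public static void main(String[] args) {
    EquivSet set = new EquivSet(100.0);
    Expr logical = set.add("LogicalJoin");
    Expr physical = set.add("EnumerableJoin");
    // Both members report the same value because it is stored once, on the set.
    System.out.println(logical.rowCount() == physical.rowCount());
  }
}
```

Physical properties (cost, collation, convention) would still differ per 
member; only the properties that follow from relational equivalence are shared 
in this sketch.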

Specifically, I think the following built-in metadata should fall into the 
logical properties category -

Selectivity
UniqueKeys
ColumnUniqueness
RowCount
MaxRowCount
MinRowCount
DistinctRowCount
Size (averageRowSize, averageColumnSize)




[jira] [Created] (CALCITE-3915) Add rule listener to report rule attempts and time at DEBUG log level

2020-04-10 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3915:


 Summary: Add rule listener to report rule attempts and time at 
DEBUG log level
 Key: CALCITE-3915
 URL: https://issues.apache.org/jira/browse/CALCITE-3915
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


We can add a rule listener to the VolcanoPlanner to record rule attempts 
and the cumulative time spent on rule firing. This would be very helpful for 
debugging planner performance issues (such as CALCITE-2970). The listener will 
only be added at DEBUG log level.

It will produce a rule attempt summary like the one below -

2020-04-10 18:03:24,621 [main] DEBUG - 
Rules                                                  Attempts  Time (us)
ProjectMergeRule:force_mode                               1,203  1,124,242
EnumerableProjectRule(in:NONE,out:ENUMERABLE)               129    226,475
AggregatePullUpConstantsRule                                 84      1,949
AggregateProjectMergeRule                                    81    128,575
ProjectFilterTransposeRule                                   76      8,734
JoinPushExpressionsRule                                      71     10,314
FilterJoinRule:FilterJoinRule:no-filter                      71      1,306
JoinPushThroughJoinRule:right                                47        676
JoinPushThroughJoinRule:left                                 45    175,696
FilterJoinRule:FilterJoinRule:filter                         41     89,381
ReduceExpressionsRule(Filter)                                24     38,128
EnumerableFilterRule(in:NONE,out:ENUMERABLE)                 24     52,457
EnumerableJoinRule(in:NONE,out:ENUMERABLE)                   24     71,752
EnumerableMergeJoinRule(in:NONE,out:ENUMERABLE)              24     47,987
FilterProjectTransposeRule                                   22     55,177
JoinCommuteRule                                              20     70,240
EnumerableAggregateRule(in:NONE,out:ENUMERABLE)              10     26,522
AggregateExpandDistinctAggregatesRule                        10         37
EnumerableTableScanRule(in:NONE,out:ENUMERABLE)               1        527
EnumerableInterpreterRule(in:BINDABLE,out:ENUMERABLE)         1        764
BindableTableScanRule                                         1      1,710
ExpandConversionRule                                          1        175
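
A summary like the one above could be collected roughly as follows. This is 
only a sketch under stated assumptions: RuleAttemptStats, before() and after() 
are hypothetical names, not the Calcite listener API, and timestamps are passed 
in explicitly so the bookkeeping is easy to follow. A real listener would be 
driven by the planner's before/after rule-attempt events.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy collector only; the class and method names are hypothetical. */
final class RuleAttemptStats {
  // rule description -> {attempt count, cumulative time in nanoseconds}
  private final Map<String, long[]> stats = new HashMap<>();
  // rule description -> start time of the attempt currently in flight
  private final Map<String, Long> startNanos = new HashMap<>();

  /** Called just before a rule fires. */
  void before(String rule, long nowNanos) {
    startNanos.put(rule, nowNanos);
  }

  /** Called right after the rule returns; accumulates count and elapsed time. */
  void after(String rule, long nowNanos) {
    Long start = startNanos.remove(rule);
    if (start == null) {
      return; // no matching "before" event, ignore
    }
    long[] entry = stats.computeIfAbsent(rule, k -> new long[2]);
    entry[0]++;
    entry[1] += nowNanos - start;
  }

  long attempts(String rule) {
    long[] entry = stats.get(rule);
    return entry == null ? 0 : entry[0];
  }

  long micros(String rule) {
    long[] entry = stats.get(rule);
    return entry == null ? 0 : entry[1] / 1_000;
  }

  /** Formats the table, most frequently attempted rule first. */
  String summary() {
    List<Map.Entry<String, long[]>> rows = new ArrayList<>(stats.entrySet());
    rows.sort((a, b) -> Long.compare(b.getValue()[0], a.getValue()[0]));
    StringBuilder sb = new StringBuilder(
        String.format("%-45s%10s%14s%n", "Rules", "Attempts", "Time (us)"));
    for (Map.Entry<String, long[]> row : rows) {
      sb.append(String.format("%-45s%,10d%,14d%n",
          row.getKey(), row.getValue()[0], row.getValue()[1] / 1_000));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    RuleAttemptStats stats = new RuleAttemptStats();
    long t0 = System.nanoTime();
    stats.before("JoinCommuteRule", t0);
    stats.after("JoinCommuteRule", System.nanoTime());
    System.out.print(stats.summary());
  }
}
```

Keeping the counters in a plain map keeps the overhead small, which matters 
because the listener itself runs on every rule attempt.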





[jira] [Created] (CALCITE-3821) RelOptUtil::containsMultisetOrWindowedAgg doesn't really check multiset

2020-02-25 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3821:


 Summary: RelOptUtil::containsMultisetOrWindowedAgg doesn't really 
check multiset
 Key: CALCITE-3821
 URL: https://issues.apache.org/jira/browse/CALCITE-3821
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


The containsMultiset() check is short-circuited by a "false" constant. Also, 
what the function really does is check that the rel node does *not* contain aggs.





[jira] [Created] (CALCITE-3807) checkForSatisfiedConverters() is unnecessary

2020-02-19 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3807:


 Summary: checkForSatisfiedConverters() is unnecessary 
 Key: CALCITE-3807
 URL: https://issues.apache.org/jira/browse/CALCITE-3807
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


When VolcanoPlanner registers an abstract converter, it adds the converter to the 
set.abstractConverters list, then calls checkForSatisfiedConverters() to see if 
any converter is satisfied and can be removed from the list. But every abstract 
converter always satisfies itself (changeTraitsUsingConverters() returns 
itself). Basically, the converter would be removed from the list right after 
it's added. So this check is completely unnecessary, and it slows down the 
planner.





[jira] [Created] (CALCITE-3785) HepPlanner.belongToDag() doesn't have to use mapDigestToVertex

2020-02-11 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3785:


 Summary: HepPlanner.belongToDag() doesn't have to use 
mapDigestToVertex
 Key: CALCITE-3785
 URL: https://issues.apache.org/jira/browse/CALCITE-3785
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


To test whether or not a vertex belongs to the DAG, we can simply call 
graph.vertexSet().contains(vertex). There's no need to look it up in the 
mapDigestToVertex map, which incurs the overhead of creating the map key. This 
problem was amplified by CALCITE-3713.
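
To illustrate the difference, here is a toy sketch. Vertex, digest() and the 
digestsBuilt counter are hypothetical and much simpler than the real HepPlanner 
structures; the point is only that the set lookup never has to build a key.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only; Vertex is hypothetical and much simpler than a HepPlanner vertex. */
final class DagMembershipSketch {
  static int digestsBuilt = 0; // counts how often a lookup key is constructed

  static final class Vertex {
    final String op;
    final String input;

    Vertex(String op, String input) {
      this.op = op;
      this.input = input;
    }

    /** Builds the lookup key; this construction is the overhead in question. */
    String digest() {
      digestsBuilt++;
      return op + "(" + input + ")";
    }
  }

  /** Membership test that goes through the digest map: one key built per call. */
  static boolean belongsViaDigestMap(Map<String, Vertex> mapDigestToVertex, Vertex v) {
    return mapDigestToVertex.get(v.digest()) != null;
  }

  /** Membership test on the vertex set: uses the vertex itself, no key built. */
  static boolean belongsViaVertexSet(Set<Vertex> vertexSet, Vertex v) {
    return vertexSet.contains(v);
  }

  public static void main(String[] args) {
    Vertex v = new Vertex("Project", "Scan");
    Map<String, Vertex> map = new HashMap<>();
    map.put(v.digest(), v);
    Set<Vertex> set = new HashSet<>();
    set.add(v);

    digestsBuilt = 0;
    System.out.println(belongsViaVertexSet(set, v) + ", keys built: " + digestsBuilt);
    System.out.println(belongsViaDigestMap(map, v) + ", keys built: " + digestsBuilt);
  }
}
```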





[jira] [Created] (CALCITE-3768) VolcanoPlanner.changeTraitsUsingConverters() has parameter that's never used

2020-02-04 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3768:


 Summary: VolcanoPlanner.changeTraitsUsingConverters() has 
parameter that's never used
 Key: CALCITE-3768
 URL: https://issues.apache.org/jira/browse/CALCITE-3768
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai



{code:java}
private RelNode changeTraitsUsingConverters(
  RelNode rel,
  RelTraitSet toTraits,
  boolean allowAbstractConverters)
{code}

The 3rd parameter allowAbstractConverters is always set to false.





[jira] [Created] (CALCITE-3491) VolcanoPlanner.completeConversion() is bypassed by "if (true)"

2019-11-11 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3491:


 Summary: VolcanoPlanner.completeConversion() is bypassed by "if 
(true)"
 Key: CALCITE-3491
 URL: https://issues.apache.org/jira/browse/CALCITE-3491
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


It is not very clear why we need VolcanoPlanner.completeConversion(). It's 
currently bypassed, as shown below [1] -

  if (true) {
    return rel;
  }

If the method is not needed, we should just remove it.

[1] 
https://github.com/apache/calcite/blob/f90e5d7b48848191c1a70e6c0f78eece93433806/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L1096







[jira] [Created] (CALCITE-3487) Should not hard code RelMetadataQuery class in VolcanoPlanner.isValid()

2019-11-08 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3487:


 Summary: Should not hard code RelMetadataQuery class in 
VolcanoPlanner.isValid()
 Key: CALCITE-3487
 URL: https://issues.apache.org/jira/browse/CALCITE-3487
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


With CALCITE-3446, a Calcite user can choose to provide a customized 
sub-class of RelMetadataQuery through RelOptCluster.setMetadataQuery(). We 
cannot assume the base class RelMetadataQuery is always used in the planner.





[jira] [Created] (CALCITE-3455) Redundant rule firing for both logical and physical nodes

2019-10-28 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3455:


 Summary: Redundant rule firing for both logical and physical nodes
 Key: CALCITE-3455
 URL: https://issues.apache.org/jira/browse/CALCITE-3455
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


While I was looking at CALCITE-2970, I noticed that some of the rules are fired 
for both logical and physical nodes. For example, ProjectMergeRule matches 
Project.class, so it’s fired for LogicalProject. But then after LogicalProject 
is converted into EnumerableProject, the same rule is fired again for the 
physical rels. The same goes for EnumerableLimitRule, SortRemoveConstantKeysRule, 
etc. 

This seems to be unnecessary. When ProjectMergeRule is applied to LogicalProject 
nodes, we already generate all possible alternatives with merged projects. We 
just need to convert the LogicalProject into EnumerableProject. There’s no need 
to merge EnumerableProject again. 

If I update those rules to only match logical nodes, the planning time of the 
case in CALCITE-2970 is reduced by ~30%. 
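
The effect of the operand class can be sketched with a toy model. The three 
classes below are hypothetical stand-ins that only mirror the names of the 
Calcite classes, and firings() is an assumed helper that counts how many nodes 
a rule operand would match.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; these classes merely mirror the names of Calcite's Project types. */
final class RuleOperandSketch {
  static class Project {}
  static final class LogicalProject extends Project {}
  static final class EnumerableProject extends Project {}

  /** Counts the nodes an operand of the given class would match, i.e. rule firings. */
  static int firings(Class<? extends Project> operandClass, List<Project> nodes) {
    int count = 0;
    for (Project node : nodes) {
      if (operandClass.isInstance(node)) {
        count++;
      }
    }
    return count;
  }

  /** One logical project plus the physical project it was converted into. */
  static List<Project> samplePlan() {
    List<Project> nodes = new ArrayList<>();
    nodes.add(new LogicalProject());
    nodes.add(new EnumerableProject());
    return nodes;
  }

  public static void main(String[] args) {
    List<Project> nodes = samplePlan();
    // Matching the base class fires for the logical and the physical node.
    System.out.println("Project.class: " + firings(Project.class, nodes));
    // Matching only the logical class skips the redundant physical firing.
    System.out.println("LogicalProject.class: " + firings(LogicalProject.class, nodes));
  }
}
```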





[jira] [Created] (CALCITE-3330) propagateCostImprovements() could result in stack overflow

2019-09-07 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3330:


 Summary: propagateCostImprovements() could result in stack overflow
 Key: CALCITE-3330
 URL: https://issues.apache.org/jira/browse/CALCITE-3330
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai
Assignee: Xiening Dai


The current implementation uses a depth-first approach for propagating cost 
improvements to parent rel nodes. This could lead to a stack overflow if the rel 
node hierarchy is very deep. I suggest using a breadth-first approach for cost 
propagation. 
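
A rough sketch of the breadth-first variant, assuming a toy cost model: Node, 
selfCost and bestCost are hypothetical and each node's cost is simply its own 
cost plus its input's best cost. This is not the VolcanoPlanner code; it only 
shows that an explicit queue keeps the call stack flat no matter how deep the 
hierarchy is.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model only; Node and its fields are hypothetical, not planner classes. */
final class CostPropagationSketch {

  static final class Node {
    final double selfCost;
    double bestCost = Double.POSITIVE_INFINITY; // best cumulative cost known so far
    final List<Node> parents = new ArrayList<>();

    Node(double selfCost) {
      this.selfCost = selfCost;
    }
  }

  /**
   * Records an improved cost for start and pushes the improvement up to its
   * ancestors with a FIFO queue instead of recursion. Returns the number of
   * nodes whose best cost changed.
   */
  static int propagate(Node start, double newCost) {
    if (newCost >= start.bestCost) {
      return 0; // not an improvement, nothing to do
    }
    start.bestCost = newCost;
    int updates = 1;
    Deque<Node> queue = new ArrayDeque<>();
    queue.add(start);
    while (!queue.isEmpty()) {
      Node node = queue.poll();
      for (Node parent : node.parents) {
        double candidate = parent.selfCost + node.bestCost;
        if (candidate < parent.bestCost) {
          parent.bestCost = candidate;
          updates++;
          queue.add(parent); // visit this parent's own parents later
        }
      }
    }
    return updates;
  }

  /** Builds a single chain of the given depth and returns {leaf, root}. */
  static Node[] chain(int depth) {
    Node leaf = new Node(1.0);
    Node current = leaf;
    for (int i = 1; i < depth; i++) {
      Node parent = new Node(1.0);
      current.parents.add(parent);
      current = parent;
    }
    return new Node[] {leaf, current};
  }

  public static void main(String[] args) {
    // A chain this deep would risk a StackOverflowError with naive recursion.
    Node[] ends = chain(200_000);
    int updates = propagate(ends[0], 1.0);
    System.out.println(updates + " nodes updated, root cost " + ends[1].bestCost);
  }
}
```

The queue may visit a node more than once when it has several inputs that 
improve, but each visit is constant stack depth, which is the property that 
matters here.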





[jira] [Created] (CALCITE-3283) RelSubSet's best is not existed in the set

2019-08-22 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3283:


 Summary: RelSubSet's best is not existed in the set
 Key: CALCITE-3283
 URL: https://issues.apache.org/jira/browse/CALCITE-3283
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


To repro this, add the code below to VolcanoPlanner.isValid(), which verifies 
that RelSubset.best indeed belongs to its set -

{code:java}
// Make sure the best RelNode of the subset is registered in the subset's set
if (subset.best != null && !subset.set.rels.contains(subset.best)) {
  return litmus.fail("RelSubSet [{}] has best RelNode [{}] which is not "
          + "existed in its set.",
      subset.getDescription(), subset.best.getDescription());
}
{code}

Run JdbcTest.testVarcharEquals and you will get the exception below -

java.lang.AssertionError: RelSubSet [rel#6828:Subset#2.JDBC.foodmart.[]] has best RelNode [rel#6871:JdbcProject.JDBC.foodmart.[](input=RelSubset#6844,lname=$2)] which is not existed in its set.

at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
at org.apache.calcite.plan.volcano.VolcanoPlanner.isValid(VolcanoPlanner.java:888)
at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:851)
at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:868)
at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1939)
at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:129)
at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:236)
at org.apache.calcite.rel.convert.ConverterRule.onMatch(ConverterRule.java:141)
at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:208)
at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:631)
at org.apache.calcite.tools.Programs.lambda$standard$3(Programs.java:286)
at org.apache.calcite.tools.Programs$SequenceProgram.run(Programs.java:346)
at org.apache.calcite.prepare.Prepare.optimize(Prepare.java:189)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:320)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:231)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:638)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:502)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:472)
at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231)
at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:550)
at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)
at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:522)
at org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1466)
at org.apache.calcite.test.CalciteAssert$AssertQuery.withConnection(CalciteAssert.java:1398)
at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1464)
at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1447)
at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1410)
at org.apache.calcite.test.JdbcTest.testVarcharEquals(JdbcTest.java:4420)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
   

[jira] [Created] (CALCITE-3257) Need to clear RelMetaDataQuery cache after dumping RelNodes and Graphviz

2019-08-16 Thread Xiening Dai (JIRA)
Xiening Dai created CALCITE-3257:


 Summary: Need to clear RelMetaDataQuery cache after dumping 
RelNodes and Graphviz
 Key: CALCITE-3257
 URL: https://issues.apache.org/jira/browse/CALCITE-3257
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Xiening Dai


To repro -

1. Set the Log4J log level to TRACE, so that the planner dumps rel node info at 
every rule match 
(https://github.com/apache/calcite/blob/3124a85b93ff2f1b79484c7bd4cc41835d4f1920/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L435)

2. Run JdbcTest.testInnerJoinValues. You will get the exception below -

java.lang.AssertionError: rel [rel#2128:EnumerableProject.ENUMERABLE.[](input=RelSubset#2122,EMPNO=$0,DESC=$3)] has lower cost {309.75 rows, 215.5 cpu, 0.0 io} than best cost {322.5 rows, 241.0 cpu, 0.0 io} of subset [rel#2119:Subset#5.ENUMERABLE.[]]

at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
at org.apache.calcite.plan.volcano.VolcanoPlanner.isValid(VolcanoPlanner.java:888)
at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:851)
at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:868)
at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:90)
at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:329)
at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1656)
at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:846)
at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:868)
at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1927)
at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:129)
at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:236)
at org.apache.calcite.rel.rules.AggregateProjectMergeRule.onMatch(AggregateProjectMergeRule.java:74)
at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:208)
at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:631)
at org.apache.calcite.tools.Programs.lambda$standard$3(Programs.java:286)
at org.apache.calcite.tools.Programs$SequenceProgram.run(Programs.java:346)
at org.apache.calcite.prepare.Prepare.optimize(Prepare.java:189)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:314)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:231)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:638)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:502)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:472)
at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231)
at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:550)
at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)
at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:522)
at org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1466)
at org.apache.calcite.test.CalciteAssert$AssertQuery.withConnection(CalciteAssert.java:1398)
at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1464)
at org.apache.calcite.test.CalciteAssert$AssertQuery.explainMatches(CalciteAssert.java:1561)
at org.apache.calcite.test.CalciteAssert$AssertQuery.explainContains(CalciteAssert.java:1556)
at org.apache.calcite.test.JdbcTest.testInnerJoinValues(JdbcTest.java:2580)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at