[jira] [Created] (CALCITE-4712) Add RelHashDistribution
Haisheng Yuan created CALCITE-4712:

Summary: Add RelHashDistribution
Key: CALCITE-4712
URL: https://issues.apache.org/jira/browse/CALCITE-4712
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Haisheng Yuan

Add RelHashDistribution. The hash distribution should have the following properties:

1. ImmutableIntList keys; // distribution keys
2. ImmutableList equivKeys; // equivalent keys for each distribution key
3. int bucketNum; // number of buckets or shards
4. boolean nullsColocated; // are NULLs colocated?
5. String hashFunc; // name or identity of hash function

--
This message was sent by Atlassian Jira (v8.3.4#803005)
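The five properties listed above could be held in a small immutable value class. The following is only a sketch under stated assumptions: the class name `HashDistributionSketch` is hypothetical, plain `java.util.List` stands in for `ImmutableIntList` and `ImmutableList`, the element type of `equivKeys` (one list of column indexes per distribution key) is a guess, and the value-based `equals`/`hashCode` is an illustrative choice rather than anything the issue specifies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Hypothetical value class for illustration; not part of Calcite's API. */
final class HashDistributionSketch {
  final List<Integer> keys;            // distribution keys
  final List<List<Integer>> equivKeys; // equivalent keys for each distribution key (element type assumed)
  final int bucketNum;                 // number of buckets or shards
  final boolean nullsColocated;        // are NULLs colocated?
  final String hashFunc;               // name or identity of hash function

  HashDistributionSketch(List<Integer> keys, List<List<Integer>> equivKeys,
      int bucketNum, boolean nullsColocated, String hashFunc) {
    // Assumption: exactly one (possibly empty) equivalence list per key.
    if (keys.size() != equivKeys.size()) {
      throw new IllegalArgumentException("need one equivKeys entry per key");
    }
    this.keys = Collections.unmodifiableList(new ArrayList<>(keys));
    List<List<Integer>> copy = new ArrayList<>();
    for (List<Integer> e : equivKeys) {
      copy.add(Collections.unmodifiableList(new ArrayList<>(e)));
    }
    this.equivKeys = Collections.unmodifiableList(copy);
    this.bucketNum = bucketNum;
    this.nullsColocated = nullsColocated;
    this.hashFunc = Objects.requireNonNull(hashFunc);
  }

  // Value-based equality: two distributions with the same properties compare equal.
  @Override public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof HashDistributionSketch)) {
      return false;
    }
    HashDistributionSketch that = (HashDistributionSketch) o;
    return bucketNum == that.bucketNum
        && nullsColocated == that.nullsColocated
        && keys.equals(that.keys)
        && equivKeys.equals(that.equivKeys)
        && hashFunc.equals(that.hashFunc);
  }

  @Override public int hashCode() {
    return Objects.hash(keys, equivKeys, bucketNum, nullsColocated, hashFunc);
  }
}
```

Making all five fields part of equality means that two distributions on the same keys but with different bucket counts or hash functions are treated as different traits, which seems to be the intent of listing them as properties.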
[jira] [Created] (CALCITE-4638) Volcano top-down optimizer failed to recognize transformation rule correctly
Haisheng Yuan created CALCITE-4638:

Summary: Volcano top-down optimizer failed to recognize transformation rule correctly
Key: CALCITE-4638
URL: https://issues.apache.org/jira/browse/CALCITE-4638
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Haisheng Yuan

In VolcanoPlanner.isTransformationRule, a physical implementation rule may be incorrectly recognized as a transformation rule. See the discussion on the mailing list: https://lists.apache.org/thread.html/r362da53656b80b727c72cc6e4b807c5b1eccb15a0aa6fec5f6bbe91e%40%3Cdev.calcite.apache.org%3E
[jira] [Created] (CALCITE-4131) ERROR message in SqlToRelConverterExtendedTest
Haisheng Yuan created CALCITE-4131:

Summary: ERROR message in SqlToRelConverterExtendedTest
Key: CALCITE-4131
URL: https://issues.apache.org/jira/browse/CALCITE-4131
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

The error does not stop the test suite from proceeding and does not fail the test, but the error output is noisy:

{code:java}
SqlToRelConverterExtendedTest > testCrossJoinLateral1() STANDARD_ERROR
[Fatal Error] :1:2: XML document structures must start and end within the same entity.
ERROR: 'Could not compile stylesheet'
FATAL ERROR: 'XML document structures must start and end within the same entity.'
:XML document structures must start and end within the same entity.
SqlToRelConverterExtendedTest > testMatchRecognizeIn() STANDARD_ERROR
ERROR: 'XML document structures must start and end within the same entity.'
ERROR: 'com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: XML document structures must start and end within the same entity.'
{code}
[jira] [Created] (CALCITE-4129) Support equality check for whole rel plan tree
Haisheng Yuan created CALCITE-4129:

Summary: Support equality check for whole rel plan tree
Key: CALCITE-4129
URL: https://issues.apache.org/jira/browse/CALCITE-4129
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Currently, the only way to check deep equality of a RelNode tree is to convert it to a String with {{RelOptUtil.toString(rel)}} and {{SqlExplainLevel.EXPPLAN_ATTRIBUTES}}, which is inefficient. One example is RexSubQuery. It has to do it this way because the rel referenced by RexSubQuery may not yet be registered with VolcanoPlanner, and the digest {{equals}} checks the input RelNode by identity (not content). That is OK for RelSubset and HepRelVertex, where the RelNode is already registered in the planner, but not for a plain RelNode outside the planner. Because of this, we have to implement another set of deep-equals logic in our system.
[jira] [Created] (CALCITE-4126) Stackoverflow error when applying JoinCommuteRule
Haisheng Yuan created CALCITE-4126:

Summary: Stackoverflow error when applying JoinCommuteRule
Key: CALCITE-4126
URL: https://issues.apache.org/jira/browse/CALCITE-4126
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Add the following test case to RelOptRulesTest.java:

{code:java}
@Test void testJoinCommute() {
  final HepProgram program = HepProgram.builder()
      .addRuleInstance(JoinCommuteRule.INSTANCE)
      .build();
  final String sql = "select * \n"
      + "from sales.emp e join sales.dept d on e.empno > d.deptno";
  sql(sql).with(program).check();
}
{code}

Stacktrace:

{code:java}
java.lang.StackOverflowError
  at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
  at org.apache.calcite.rel.core.Project.isValid(Project.java:210)
  at org.apache.calcite.rel.core.Project.<init>(Project.java:90)
  at org.apache.calcite.rel.logical.LogicalProject.<init>(LogicalProject.java:67)
  at org.apache.calcite.rel.logical.LogicalProject.copy(LogicalProject.java:128)
  at org.apache.calcite.rel.logical.LogicalProject.copy(LogicalProject.java:45)
  at org.apache.calcite.rel.core.Project.copy(Project.java:122)
  at org.apache.calcite.plan.hep.HepPlanner.addRelToGraph(HepPlanner.java:803)
  at org.apache.calcite.plan.hep.HepPlanner.applyTransformationResults(HepPlanner.java:732)
  at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:543)
  at org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:360)
  at org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:372)
  at org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:372)
  at org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:372)
  at org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:372)
  at org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:372)
  at org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:372)
  at org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:372)
  at org.apache.calcite.plan.hep.HepPlanner.depthFirstApply(HepPlanner.java:372)
{code}

I checked the master branch and 1.23.0; both throw StackOverflowError.
[jira] [Created] (CALCITE-4124) Stop invalidating metadata cache in VolcanoRuleCall
Haisheng Yuan created CALCITE-4124:

Summary: Stop invalidating metadata cache in VolcanoRuleCall
Key: CALCITE-4124
URL: https://issues.apache.org/jira/browse/CALCITE-4124
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

After CALCITE-2018, we don't need to invalidate the metadata query when a new RelNode is generated.
[jira] [Created] (CALCITE-4122) Support on-demand rule matching
Haisheng Yuan created CALCITE-4122:

Summary: Support on-demand rule matching
Key: CALCITE-4122
URL: https://issues.apache.org/jira/browse/CALCITE-4122
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Currently, in the top-down planner, rules are matched whenever a new RelNode is added to the memo, regardless of whether the group has been pruned. We should match rules for each operator only when needed, so that the rule queue is no longer required.
[jira] [Created] (CALCITE-4105) Replace Pair with Flat2List in RelDigestWriter to improve hash code distribution
Haisheng Yuan created CALCITE-4105:

Summary: Replace Pair with Flat2List in RelDigestWriter to improve hash code distribution
Key: CALCITE-4105
URL: https://issues.apache.org/jira/browse/CALCITE-4105
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan
Assignee: Haisheng Yuan

Pair.hashCode uses XOR; Flat2List uses multiplication by the prime 31.
[jira] [Created] (CALCITE-4103) SetOp and Union should accept RelNode subclasses in the constructor
Haisheng Yuan created CALCITE-4103:

Summary: SetOp and Union should accept RelNode subclasses in the constructor
Key: CALCITE-4103
URL: https://issues.apache.org/jira/browse/CALCITE-4103
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Currently only a list of RelNode is accepted:

{code:java}
protected SetOp(RelOptCluster cluster, RelTraitSet traits,
    List<RelNode> inputs, SqlKind kind, boolean all) {}
{code}

If I have a List<EnumerableTableScan>, the compiler will complain. It might be better to change the parameter type to List<? extends RelNode>. I think this can still keep backward compatibility.
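The compile-time difference described above can be reproduced without Calcite. In this minimal sketch, `Node` and `Scan` are hypothetical stand-ins for `RelNode` and `EnumerableTableScan`, and the two `count…` methods stand in for the constructor parameter before and after the proposed change.

```java
import java.util.ArrayList;
import java.util.List;

final class WildcardInputsDemo {
  interface Node {}                        // stand-in for RelNode
  static final class Scan implements Node {} // stand-in for EnumerableTableScan

  // Invariant parameter: only accepts exactly List<Node>.
  static int countStrict(List<Node> inputs) {
    return inputs.size();
  }

  // Bounded wildcard: accepts List<Node> and lists of any Node subtype.
  static int countRelaxed(List<? extends Node> inputs) {
    return inputs.size();
  }

  public static void main(String[] args) {
    List<Scan> scans = new ArrayList<>();
    scans.add(new Scan());
    scans.add(new Scan());

    // countStrict(scans);  // does not compile: List<Scan> is not a List<Node>
    System.out.println(countRelaxed(scans));

    // Existing callers that already pass List<Node> keep compiling.
    List<Node> nodes = new ArrayList<>(scans);
    System.out.println(countStrict(nodes));
    System.out.println(countRelaxed(nodes));
  }
}
```

Because generics are erased, both signatures have the same erasure, which is consistent with the expectation above that existing callers are not broken.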
[jira] [Created] (CALCITE-4097) Avoid requesting unnecessary trait request when deriving traits from child inputs
Haisheng Yuan created CALCITE-4097:

Summary: Avoid requesting unnecessary trait request when deriving traits from child inputs
Key: CALCITE-4097
URL: https://issues.apache.org/jira/browse/CALCITE-4097
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

If a child subset is used to derive new traits for the current RelNode, the subset is marked REQUIRED when the newly derived RelNode is registered, and enforcers will later be added between other delivered subsets.

For example, a MergeJoin requests that both inputs be hash distributed by [a,b] and sorted by [a,b]. If the left input R1 happens to be distributed by [a], the MergeJoin can derive new traits from this input and request both inputs to be distributed by [a] and sorted by [a,b]. If there is an alternative R2 with ANY distribution in the left input's RelSet, we end up requesting hash distribution [a] on R2, which is unnecessary and wasteful: we request distribution by [a] only because R1 can deliver exactly that distribution, so we don't need to enforce properties on other subsets that can't satisfy the specific trait requirement.
[jira] [Created] (CALCITE-4096) Change Pair.hashCode() not to use XOR
Haisheng Yuan created CALCITE-4096:

Summary: Change Pair.hashCode() not to use XOR
Key: CALCITE-4096
URL: https://issues.apache.org/jira/browse/CALCITE-4096
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

JDK Map.Entry uses XOR, perhaps because a Map.Entry is unlikely to be used as a HashMap key. But Pair in Calcite is a general data structure and is used in several places as the key of a HashMap/HashSet. XOR is not a good choice for a hash algorithm: it is more likely to produce hash collisions than simple prime multiplication, especially when pair.left equals pair.right.
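The collision behaviour described above is easy to reproduce. The sketch below compares an XOR combination (as specified for `Map.Entry.hashCode()`) with a 31-based combination; the class and method names are hypothetical and this is not Calcite's actual `Pair` code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class PairHashDemo {
  // XOR combination, as in the Map.Entry contract.
  static int xorHash(Object left, Object right) {
    return Objects.hashCode(left) ^ Objects.hashCode(right);
  }

  // Prime-multiplication combination.
  static int multiplicativeHash(Object left, Object right) {
    return 31 * Objects.hashCode(left) + Objects.hashCode(right);
  }

  // Counts distinct hash codes over the n pairs (0,0), (1,1), ..., (n-1,n-1).
  static int distinctHashesForEqualSides(boolean useXor, int n) {
    Set<Integer> seen = new HashSet<>();
    for (int i = 0; i < n; i++) {
      seen.add(useXor ? xorHash(i, i) : multiplicativeHash(i, i));
    }
    return seen.size();
  }

  public static void main(String[] args) {
    // x ^ x is always 0, so every pair with left == right lands in one bucket.
    System.out.println(distinctHashesForEqualSides(true, 1000));
    // 31 * i + i = 32 * i stays distinct for these small values.
    System.out.println(distinctHashesForEqualSides(false, 1000));
    // XOR is also symmetric: (1, 2) and (2, 1) collide.
    System.out.println(xorHash(1, 2) == xorHash(2, 1));
  }
}
```

With XOR, all 1000 pairs whose sides are equal share the hash code 0, and swapped pairs always collide; the multiplicative form avoids both patterns.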
[jira] [Created] (CALCITE-4083) RelTraitSet failed to canonize traits
Haisheng Yuan created CALCITE-4083:

Summary: RelTraitSet failed to canonize traits
Key: CALCITE-4083
URL: https://issues.apache.org/jira/browse/CALCITE-4083
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Haisheng Yuan

RelTraitSet#plus uses FlatLists and ImmutableList. They have different hash algorithms, and both are different classes from RelTraitSet. RelTraitSet equality requires the other object to be a RelTraitSet too, and HashMap#get(key) uses key.equals() to check equality, not the other way around. If we pass a RelTraitSet as the search key to the cache but the cached entry's key is a FlatLists or ImmutableList instance, we may fail to find the cached RelTraitSet. Because of this, using == to check traitSet equality may fail even when the trait sets have the same traits.
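The lookup failure described above comes from an asymmetric `equals`. The following sketch reproduces the effect with a hypothetical `StrictList` class standing in for a trait set; it relies on OpenJDK's `HashMap`, which calls `equals` on the search key rather than on the stored key, and it deliberately gives both key types the same hash code so that only the `equals` direction matters.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AsymmetricEqualsDemo {
  /** Hypothetical stand-in: a list whose equals() only accepts its own class. */
  static final class StrictList extends AbstractList<String> {
    private final List<String> items;

    StrictList(String... items) {
      this.items = Arrays.asList(items);
    }

    @Override public String get(int index) {
      return items.get(index);
    }

    @Override public int size() {
      return items.size();
    }

    @Override public boolean equals(Object o) {
      return o instanceof StrictList && items.equals(((StrictList) o).items);
    }

    @Override public int hashCode() {
      return items.hashCode(); // same value as a plain List with these elements
    }
  }

  // Stores one entry under storedKey and looks it up with searchKey.
  static String lookup(Object storedKey, Object searchKey) {
    Map<Object, String> cache = new HashMap<>();
    cache.put(storedKey, "cached");
    return cache.get(searchKey);
  }

  public static void main(String[] args) {
    List<String> plain = Arrays.asList("a", "b");
    StrictList strict = new StrictList("a", "b");
    // Search key is strict: strict.equals(plain) is false, so the lookup misses.
    System.out.println(lookup(plain, strict));
    // Search key is plain: plain.equals(strict) compares elements, so it hits.
    System.out.println(lookup(strict, plain));
  }
}
```

In the situation the issue describes, the differing hash algorithms of the list classes can make the miss even more likely, since the lookup may not reach the `equals` call at all.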
[jira] [Created] (CALCITE-4071) Make RelNode Immutable
Haisheng Yuan created CALCITE-4071:

Summary: Make RelNode Immutable
Key: CALCITE-4071
URL: https://issues.apache.org/jira/browse/CALCITE-4071
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Currently, RelNode is mutable, mainly in the planner. That makes planner code error-prone and hard to maintain, and complicates the logic. Let's try to avoid using {{replaceInput}}, at least inside the planner, and see if we can deprecate {{replaceInput}}. That will make the planner more robust.
[jira] [Created] (CALCITE-4058) Add Limit and LogicalLimit operator and deprecate limit in Sort
Haisheng Yuan created CALCITE-4058:

Summary: Add Limit and LogicalLimit operator and deprecate limit in Sort
Key: CALCITE-4058
URL: https://issues.apache.org/jira/browse/CALCITE-4058
Project: Calcite
Issue Type: New Feature
Components: core
Reporter: Haisheng Yuan

Add a {{Limit}} base class and a {{LogicalLimit}} operator. Fetch and offset should not be in the Sort operator; we should deprecate those fields while keeping Sort working for compatibility.

{code:java}
select * from foo limit 5 offset 3;
select * from foo order by a;
{code}

These two queries are completely different but use the same operator. SortxxxTranspose sometimes means limit push-down, which is counter-intuitive.
[jira] [Created] (CALCITE-4056) Remove Digest from RelNode and RexNode
Haisheng Yuan created CALCITE-4056:

Summary: Remove Digest from RelNode and RexNode
Key: CALCITE-4056
URL: https://issues.apache.org/jira/browse/CALCITE-4056
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

The digest is used everywhere (RelNode, RexNode, DataType) and easily causes OOM for large queries or complex expressions. DataType is cached in a global interner and can be reused. Unlike RelNode, RexNode is not stored in the MEMO as a GROUP, so it can't be shared. This prevents Calcite from scaling to large queries, e.g. CALCITE-3784.
[jira] [Created] (CALCITE-4032) Mark CalcMergeRule as TransformationRule
Haisheng Yuan created CALCITE-4032:

Summary: Mark CalcMergeRule as TransformationRule
Key: CALCITE-4032
URL: https://issues.apache.org/jira/browse/CALCITE-4032
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Previously it was removed in CALCITE-3997.
[jira] [Created] (CALCITE-4027) Add -Doverwrite option to SqlToRelTestBase
Haisheng Yuan created CALCITE-4027:

Summary: Add -Doverwrite option to SqlToRelTestBase
Key: CALCITE-4027
URL: https://issues.apache.org/jira/browse/CALCITE-4027
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

With overwrite=true, the expected XML file is automatically overwritten with the actual output file.
[jira] [Created] (CALCITE-4024) In top-down optimizer, forbid Sort (non-limit) to participate any rule matches
Haisheng Yuan created CALCITE-4024:

Summary: In top-down optimizer, forbid Sort (non-limit) to participate any rule matches
Key: CALCITE-4024
URL: https://issues.apache.org/jira/browse/CALCITE-4024
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

In the top-down optimizer, forbid enforcer operators, e.g. Sort (non-limit), from participating in any rule matches.
[jira] [Created] (CALCITE-4023) Remove or deprecate ProjectSortTransposeRule
Haisheng Yuan created CALCITE-4023:

Summary: Remove or deprecate ProjectSortTransposeRule
Key: CALCITE-4023
URL: https://issues.apache.org/jira/browse/CALCITE-4023
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

It never worked. The check condition {{if (sort.getClass() != Sort.class)}} is always true.
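Why an exact-class check like the one quoted above can be "always true" is shown in the sketch below. The classes are hypothetical stand-ins, not Calcite's; the sketch assumes that every sort node the rule sees at runtime is an instance of a subclass (such as a logical or physical sort), which is what the claim above implies.

```java
final class ClassCheckDemo {
  static class Sort {}                          // stand-in for the base Sort class
  static final class LogicalSort extends Sort {} // stand-in for a concrete subclass

  // Mirrors the quoted condition: true means the rule would bail out.
  static boolean bailsOut(Sort sort) {
    return sort.getClass() != Sort.class;
  }

  // A subtype-aware alternative, shown only for contrast.
  static boolean isSort(Object rel) {
    return rel instanceof Sort;
  }

  public static void main(String[] args) {
    Sort sort = new LogicalSort();
    // getClass() returns LogicalSort.class, which is never identical to Sort.class.
    System.out.println(bailsOut(sort));
    System.out.println(isSort(sort));
  }
}
```

`getClass()` compares the exact runtime class, so the check only passes for direct instances of the base class; if none are ever created, the rule body is unreachable.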
[jira] [Created] (CALCITE-4018) Implement trait propagation for EnumerableValues
Haisheng Yuan created CALCITE-4018:

Summary: Implement trait propagation for EnumerableValues
Key: CALCITE-4018
URL: https://issues.apache.org/jira/browse/CALCITE-4018
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Only passThrough is needed. Currently, when a Values is created, it enumerates all possible collations whether or not the parent operator requires them. That is a disaster if the Values has thousands of columns and the parent operator is just a hash aggregate or hash join, which doesn't care about collation. The collation should be created on demand by calling passThrough.
[jira] [Created] (CALCITE-4017) Implement trait propagation for Enumerable Setop
Haisheng Yuan created CALCITE-4017:

Summary: Implement trait propagation for Enumerable Setop
Key: CALCITE-4017
URL: https://issues.apache.org/jira/browse/CALCITE-4017
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Mainly for the Union operator; not sure about Minus and Intersect. I haven't checked how the executors of Enumerable Minus and Intersect are implemented.
[jira] [Created] (CALCITE-4016) Implement trait propagation for EnumerableCalc
Haisheng Yuan created CALCITE-4016:

Summary: Implement trait propagation for EnumerableCalc
Key: CALCITE-4016
URL: https://issues.apache.org/jira/browse/CALCITE-4016
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

It should be similar to EnumerableProject, maybe the same logic.
[jira] [Created] (CALCITE-4015) Pass through parent collation request on subset or superset of join keys for EnumerableMergeJoin
Haisheng Yuan created CALCITE-4015:

Summary: Pass through parent collation request on subset or superset of join keys for EnumerableMergeJoin
Key: CALCITE-4015
URL: https://issues.apache.org/jira/browse/CALCITE-4015
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Pass through the parent's collation request on a subset or superset of the join keys for EnumerableMergeJoin, e.g.

{code:java}
select * from foo join bar on foo.a=bar.a and foo.b=bar.b order by b;
{code}

The collation on b is passed to the merge join, which should generate an alternative with collation on b, a.

{code:java}
select * from foo join bar on foo.a=bar.a and foo.b=bar.b order by foo.a, foo.b, foo.c;
{code}

The collation on a,b,c is passed to the merge join, which should generate an alternative with collation on a,b,c. Even though the join keys are only a,b, foo.c is also an output column, so the merge join can still request its child foo to deliver collation on a,b,c. The join keys must form a prefix of the collation request.
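The two cases above (request on a subset of the join keys, request on a superset) can be sketched as a small function over column indexes. This is a simplified illustration, not EnumerableMergeJoin's actual logic: the name `passThrough` and its signature are hypothetical, sort direction and null ordering are ignored, and the sketch assumes any extra requested columns come from the left input.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class MergeJoinCollationSketch {
  /**
   * Returns the collation (as column indexes) the join could request from its
   * left input, or an empty list if the request cannot be passed through.
   */
  static List<Integer> passThrough(List<Integer> joinKeys, List<Integer> required) {
    Set<Integer> keySet = new LinkedHashSet<>(joinKeys);

    // Subset case: every requested column is a join key.
    // Keep the requested order, then append the remaining join keys.
    if (keySet.containsAll(required)) {
      List<Integer> result = new ArrayList<>(required);
      for (Integer key : joinKeys) {
        if (!result.contains(key)) {
          result.add(key);
        }
      }
      return result;
    }

    // Superset case: the join keys (in any order) must form a prefix of the request.
    int n = keySet.size();
    if (required.size() > n
        && new HashSet<>(required.subList(0, n)).equals(keySet)) {
      return new ArrayList<>(required);
    }

    return Collections.emptyList();
  }

  public static void main(String[] args) {
    List<Integer> joinKeys = List.of(0, 1); // a = 0, b = 1
    System.out.println(passThrough(joinKeys, List.of(1)));       // order by b
    System.out.println(passThrough(joinKeys, List.of(0, 1, 2))); // order by a, b, c
    System.out.println(passThrough(joinKeys, List.of(2, 0, 1))); // order by c, a, b
  }
}
```

With a = 0, b = 1, c = 2, the first call mirrors the `order by b` query (yielding b, a), the second mirrors `order by foo.a, foo.b, foo.c`, and the third shows a request that cannot be passed through because the join keys are not a prefix of it.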
[jira] [Created] (CALCITE-4013) Remove traitset derivation when creating logical operators
Haisheng Yuan created CALCITE-4013:

Summary: Remove traitset derivation when creating logical operators
Key: CALCITE-4013
URL: https://issues.apache.org/jira/browse/CALCITE-4013
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

The profiler shows that traitset derivation accounts for 65% of the total runtime of LogicalProject.create(). EnumerableProject will do it again anyway.
[jira] [Created] (CALCITE-4012) Implement trait propagation for EnumerableHashJoin and EnumerableNestedLoopJoin
Haisheng Yuan created CALCITE-4012:

Summary: Implement trait propagation for EnumerableHashJoin and EnumerableNestedLoopJoin
Key: CALCITE-4012
URL: https://issues.apache.org/jira/browse/CALCITE-4012
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

The traitset can be derived from the outer relation of a hash join or nested loop join. These joins can also pass the parent's trait request down to their outer child.
[jira] [Created] (CALCITE-4011) Implement trait propagation for EnumerableProject and EnumerableFilter
Haisheng Yuan created CALCITE-4011:

Summary: Implement trait propagation for EnumerableProject and EnumerableFilter
Key: CALCITE-4011
URL: https://issues.apache.org/jira/browse/CALCITE-4011
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

This allows the parent trait to be passed down and new traitsets to be derived from the child. Most importantly, it serves as a demonstration. It also allows SortProjectTransposeRule and ProjectSortTransposeRule to be disabled when topDownOpt is enabled.
[jira] [Created] (CALCITE-4010) Revise codegen for EnumerableMergeJoin to accept any sort order
Haisheng Yuan created CALCITE-4010:

Summary: Revise codegen for EnumerableMergeJoin to accept any sort order
Key: CALCITE-4010
URL: https://issues.apache.org/jira/browse/CALCITE-4010
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Given a MergeJoin on foo.a=bar.a and foo.b=bar.b, the codegen for EnumerableMergeJoin always sorts tuples by (a,b). But after 1.23.0 Calcite can generate a MergeJoin with collation (b,a), or even (a,b,c) or (b,a,c), with any asc/desc and null direction, all of which are legitimate. So the codegen should sort tuples exactly according to the required collation.

See https://github.com/apache/calcite/blob/master/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1966 and https://github.com/apache/calcite/pull/1833
[jira] [Created] (CALCITE-4009) Revert traitset mapping that was added to ProjectJoinTransposeRule
Haisheng Yuan created CALCITE-4009:

Summary: Revert traitset mapping that was added to ProjectJoinTransposeRule
Key: CALCITE-4009
URL: https://issues.apache.org/jira/browse/CALCITE-4009
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Revert the traitset mapping that was added to ProjectJoinTransposeRule by CALCITE-3353. It is now obsolete, and we should fail fast if that happens. Otherwise, all the downstream projects that use this rule will waste time dealing with traitsets they don't need.
[jira] [Created] (CALCITE-4008) Implement Code generation for EnumerableSortedAggregate
Haisheng Yuan created CALCITE-4008:

Summary: Implement Code generation for EnumerableSortedAggregate
Key: CALCITE-4008
URL: https://issues.apache.org/jira/browse/CALCITE-4008
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

EnumerableSortedAggregate was added in 1.23.0 but does not implement code generation. It would be nice to add code generation for it, so that more efficient plans can be generated and used for some queries, e.g.

{code:java}
select foo.a, max(b) from foo join bar on foo.a=bar.a group by a order by a desc;
{code}
[jira] [Created] (CALCITE-4007) MergeJoin collation check should not be limited to join key's order
Haisheng Yuan created CALCITE-4007:

Summary: MergeJoin collation check should not be limited to join key's order
Key: CALCITE-4007
URL: https://issues.apache.org/jira/browse/CALCITE-4007
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Given a MergeJoin on foo.a=bar.a and foo.b=bar.b, the collation check on MergeJoin always requires it to be sorted by (a,b). But after 1.23.0 Calcite can generate a MergeJoin with collation (b,a), or even (a,b,c) or (b,a,c), all of which are legitimate. We should relax the check condition.
[jira] [Created] (CALCITE-4004) Override RelOptRuleOperand.toString()
Haisheng Yuan created CALCITE-4004:

Summary: Override RelOptRuleOperand.toString()
Key: CALCITE-4004
URL: https://issues.apache.org/jira/browse/CALCITE-4004
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Override RelOptRuleOperand.toString() to facilitate debugging; otherwise, debugging is tedious.
[jira] [Created] (CALCITE-4003) In MaterializationTest, FilterProjectTransposeRule matches with logical and physical convention
Haisheng Yuan created CALCITE-4003:

Summary: In MaterializationTest, FilterProjectTransposeRule matches with logical and physical convention
Key: CALCITE-4003
URL: https://issues.apache.org/jira/browse/CALCITE-4003
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

In MaterializationTest.testMaterializationSubstitution2, FilterProjectTransposeRule matches the logical and physical conventions at the same time, that is, LogicalFilter and EnumerableProject. We should check for this and prevent it from happening.
[jira] [Created] (CALCITE-4002) Add security check to make sure TransformationRule doesn't generate PhysicalNode
Haisheng Yuan created CALCITE-4002:

Summary: Add security check to make sure TransformationRule doesn't generate PhysicalNode
Key: CALCITE-4002
URL: https://issues.apache.org/jira/browse/CALCITE-4002
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Right now, generating a {{PhysicalNode}} in a logical {{TransformationRule}} is not allowed, but that is just a contract and is not enforced. Add a security check to make sure we throw if a {{PhysicalNode}} is generated in {{VolcanoPlanner}}.
[jira] [Created] (CALCITE-3993) Add isDefault() to RelTrait interface
Haisheng Yuan created CALCITE-3993:

Summary: Add isDefault() to RelTrait interface
Key: CALCITE-3993
URL: https://issues.apache.org/jira/browse/CALCITE-3993
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Checking whether the distribution is the default (ANY) or the collation is the default (EMPTY) is an extremely frequent operation.
[jira] [Created] (CALCITE-3989) Release Calcite 1.23.0
Haisheng Yuan created CALCITE-3989:

Summary: Release Calcite 1.23.0
Key: CALCITE-3989
URL: https://issues.apache.org/jira/browse/CALCITE-3989
Project: Calcite
Issue Type: Task
Reporter: Haisheng Yuan
Assignee: Haisheng Yuan
Fix For: 1.23.0
[jira] [Created] (CALCITE-3983) Add utility methods to RelTraitSet
Haisheng Yuan created CALCITE-3983:

Summary: Add utility methods to RelTraitSet
Key: CALCITE-3983
URL: https://issues.apache.org/jira/browse/CALCITE-3983
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Add the following methods to RelTraitSet:

{code:java}
boolean isDefault(boolean ignoreConvention=true);

// Convenient methods to get the reltrait defined by
// Calcite, should throw exception if default trait
// defs are not registered.
getConvention();
getDistribution();
getCollation();
{code}
[jira] [Created] (CALCITE-3968) Disable JoinPushThroughJoinRule.left by default
Haisheng Yuan created CALCITE-3968:

Summary: Disable JoinPushThroughJoinRule.left by default
Key: CALCITE-3968
URL: https://issues.apache.org/jira/browse/CALCITE-3968
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

JoinPushThroughJoinRule.right does (RS)T -> (RT)S
JoinPushThroughJoinRule.left does (RS)T -> (TS)R

If JoinCommuteRule is enabled, enabling only JoinPushThroughJoinRule.right can reach the same alternatives as JoinPushThroughJoinRule.left, and vice versa (assuming the relations are connected). So there is no need to enable the left and right instances at the same time; doing so slows down optimization dramatically, e.g. TPCH q05 and q07 in TpchTest.java never finish. The same problem is reported in [1].

Meanwhile, JoinPushThroughJoinRule matches RelNode.class, which is unnecessary. It should match just a group, or RelSet / RelSubset, as a placeholder, because we don't care what exactly the top join's right child is. But since the rule is designed to work with both HepPlanner and VolcanoPlanner, we just have to bear with the slowness.

[1] https://lists.apache.org/thread.html/r195c267ef15f50aa21bbcefd7bf523f8bf2495b2345fd163e91e3c36%40%3Cdev.calcite.apache.org%3E
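The redundancy argument above can be checked on a toy join tree. In this sketch the `Node` class and the `commute`, `pushRight`, and `pushLeft` helpers are hypothetical stand-ins for the rules; join conditions are ignored, so it only illustrates the shape of the trees and assumes, as the message does, that the relations are connected.

```java
final class JoinReorderDemo {
  /** A leaf table or a binary join; a toy stand-in for a plan tree. */
  static final class Node {
    final String name; // non-null for leaves only
    final Node left;
    final Node right;

    Node(String name) {
      this.name = name;
      this.left = null;
      this.right = null;
    }

    Node(Node left, Node right) {
      this.name = null;
      this.left = left;
      this.right = right;
    }

    @Override public String toString() {
      return name != null ? name : "(" + left + right + ")";
    }
  }

  // Commute: (XY) -> (YX)
  static Node commute(Node join) {
    return new Node(join.right, join.left);
  }

  // "right" variant: (XY)Z -> (XZ)Y
  static Node pushRight(Node top) {
    Node bottom = top.left;
    return new Node(new Node(bottom.left, top.right), bottom.right);
  }

  // "left" variant: (XY)Z -> (ZY)X
  static Node pushLeft(Node top) {
    Node bottom = top.left;
    return new Node(new Node(top.right, bottom.right), bottom.left);
  }

  // Reaches the "left" result using only commute and the "right" variant.
  static Node pushLeftViaRight(Node top) {
    Node step1 = new Node(commute(top.left), top.right);  // (RS)T -> (SR)T
    Node step2 = pushRight(step1);                        // (SR)T -> (ST)R
    return new Node(commute(step2.left), step2.right);    // (ST)R -> (TS)R
  }

  public static void main(String[] args) {
    Node rst = new Node(new Node(new Node("R"), new Node("S")), new Node("T"));
    System.out.println(pushLeft(rst));
    System.out.println(pushLeftViaRight(rst));
  }
}
```

Both calls print the same tree, which is the sense in which enabling the left instance adds no new alternatives once commute and the right instance are available.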
[jira] [Created] (CALCITE-3966) Trigger rules for existing RelSubset when it becomes delivered
Haisheng Yuan created CALCITE-3966:

Summary: Trigger rules for existing RelSubset when it becomes delivered
Key: CALCITE-3966
URL: https://issues.apache.org/jira/browse/CALCITE-3966
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan
Fix For: 1.23.0

An existing RelSubset may first be created by a parent's trait request, so its state is required. But when a new RelNode in the same RelSet can deliver the same traitset, we mark the subset as derived/delivered too, in which case we should fire rules for the subset as well.
[jira] [Created] (CALCITE-3949) RelDistributions.of() and RelCollations.of() should canonize trait instance
Haisheng Yuan created CALCITE-3949:

Summary: RelDistributions.of() and RelCollations.of() should canonize trait instance
Key: CALCITE-3949
URL: https://issues.apache.org/jira/browse/CALCITE-3949
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Haisheng Yuan
[jira] [Created] (CALCITE-3944) Move dumpSets and dumpGraphviz out of VolcanoPlanner
Haisheng Yuan created CALCITE-3944:

Summary: Move dumpSets and dumpGraphviz out of VolcanoPlanner
Key: CALCITE-3944
URL: https://issues.apache.org/jira/browse/CALCITE-3944
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

They are helper methods and will be moved to Dumper.java.
[jira] [Created] (CALCITE-3937) Only fire rule for RelSubset when it is derived
Haisheng Yuan created CALCITE-3937:

Summary: Only fire rule for RelSubset when it is derived
Key: CALCITE-3937
URL: https://issues.apache.org/jira/browse/CALCITE-3937
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

It is meaningless to fire rules for a RelSubset when it is generated by the parent's trait requirement.
[jira] [Created] (CALCITE-3932) Make data type cache thread local, non-evictable
Haisheng Yuan created CALCITE-3932:

Summary: Make data type cache thread local, non-evictable
Key: CALCITE-3932
URL: https://issues.apache.org/jira/browse/CALCITE-3932
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

Data types can be evicted from the cache, which is global and thread-safe. `=` based type comparison may suffer from cache eviction. It does not seem necessary to cache them globally, because most of them are RelRecordType, which is query-dependent and not sharable between different queries.
[jira] [Created] (CALCITE-3927) RelSubset is not fired for rule when set gets merged
Haisheng Yuan created CALCITE-3927:

Summary: RelSubset is not fired for rule when set gets merged
Key: CALCITE-3927
URL: https://issues.apache.org/jira/browse/CALCITE-3927
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Haisheng Yuan

In VolcanoPlanner, when sets are merged, the planner fires rules again for the RelNodes in both sets, but not for RelSubsets. We might miss something because of this. If all the logical transformation rules and physical implementation rules were separated and physical rules didn't do logical work, we might be OK. But in reality everything is mixed together at the moment.
[jira] [Created] (CALCITE-3918) SubQueryFilterRemoveRule failed to decorrelate subquery for TPCH q17
Haisheng Yuan created CALCITE-3918:

Summary: SubQueryFilterRemoveRule failed to decorrelate subquery for TPCH q17
Key: CALCITE-3918
URL: https://issues.apache.org/jira/browse/CALCITE-3918
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Haisheng Yuan

Disable RelDecorrelator and run TpchTest.testQuery17(). SubQueryFilterRemoveRule generates a plan with Correlate, which is not expected.

{code:java}
EnumerableProject(AVG_YEARLY=[/($0, 7.0:DECIMAL(2, 1))])
  EnumerableAggregate(group=[{}], agg#0=[SUM($2)])
    EnumerableFilter(condition=[AND(=($3, $0), =(CAST($4):VARCHAR, 'Brand#13'), =(CAST($5):VARCHAR, 'JUMBO CAN'), <($1, $6))])
      EnumerableCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{3}])
        EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
          EnumerableProject(L_PARTKEY=[$1], L_QUANTITY=[$4], L_EXTENDEDPRICE=[$5])
            EnumerableTableScan(table=[[TPCH_01, LINEITEM]])
          EnumerableProject(P_PARTKEY=[$0], P_BRAND=[$3], P_CONTAINER=[$6])
            EnumerableTableScan(table=[[TPCH_01, PART]])
        EnumerableProject($f0=[*(0.2:DECIMAL(2, 1), CAST(/(CASE(=($1, 0), null:JavaType(class java.lang.Long), $0), $1)):JavaType(class java.lang.Long))])
          EnumerableAggregate(group=[{}], agg#0=[$SUM0($4)], agg#1=[COUNT($4)])
            EnumerableFilter(condition=[=($1, $cor0.P_PARTKEY)])
              EnumerableTableScan(table=[[TPCH_01, LINEITEM]])
{code}
[jira] [Created] (CALCITE-3917) Revive pruned node when a rule generates RelNode that is already pruned
Haisheng Yuan created CALCITE-3917: -- Summary: Revive pruned node when a rule generates RelNode that is already pruned Key: CALCITE-3917 URL: https://issues.apache.org/jira/browse/CALCITE-3917 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan Manually pruning nodes in a rule can be dangerous. For example: logical transformation rule X matches RelNode A and generates RelNode B. Rule X assumes B is always better than A, so it prunes A. But suppose B is already in the MEMO and was matched and pruned by another rule Y before X ran, and rule Y generated A. In the end we may see a CannotPlanException, because there is no available RelNode left in the RelSet. In this case, we had better revive the node that was already pruned by the planner. Even though this can bring additional rule applications and overhead, it is better than a failure to plan. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3916) Apply rules bottom up by RelSet
Haisheng Yuan created CALCITE-3916: -- Summary: Apply rules bottom up by RelSet Key: CALCITE-3916 URL: https://issues.apache.org/jira/browse/CALCITE-3916 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan Apply rules in leaf RelSet -> root RelSet order. For every RelNode in a RelSet, rules are matched and applied sequentially. RuleQueue and DeferringRuleCall are no longer needed. This will make space pruning and rule mutual exclusivity checks possible. Rules that use AbstractConverter as an operand are an exception: to keep backward compatibility, this kind of rule still needs to be applied top-down. This should be done after CALCITE-3896. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3911) JoinCommuteRule may generate wrong plan for SEMI/ANTI join
Haisheng Yuan created CALCITE-3911: -- Summary: JoinCommuteRule may generate wrong plan for SEMI/ANTI join Key: CALCITE-3911 URL: https://issues.apache.org/jira/browse/CALCITE-3911 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan JoinCommuteRule generates wrong plan for SEMI/ANTI join when swapOuter is true. Semi / Anti joins are not swappable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
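A minimal sketch of the guard CALCITE-3911 asks for, assuming a simplified join-type enum. `JoinSwapSketch`, its nested `JoinType`, and `canSwap` are hypothetical names for illustration only, not Calcite's JoinCommuteRule or JoinRelType. The idea is that SEMI and ANTI joins are rejected before the swapOuter flag is even consulted:

```java
import java.util.EnumSet;

final class JoinSwapSketch {
  /** Hypothetical stand-in for a join type enum; not Calcite's own type. */
  enum JoinType { INNER, LEFT, RIGHT, FULL, SEMI, ANTI }

  // SEMI and ANTI joins emit only left-input columns, so exchanging the
  // inputs would change the result.
  private static final EnumSet<JoinType> NOT_SWAPPABLE =
      EnumSet.of(JoinType.SEMI, JoinType.ANTI);

  /** Whether a commute-style rule may exchange the two join inputs. */
  static boolean canSwap(JoinType type, boolean swapOuter) {
    if (NOT_SWAPPABLE.contains(type)) {
      return false; // never swappable, regardless of swapOuter
    }
    // Inner joins always commute; outer joins only when the caller allows it.
    return type == JoinType.INNER || swapOuter;
  }

  public static void main(String[] args) {
    System.out.println(canSwap(JoinType.SEMI, true));
    System.out.println(canSwap(JoinType.LEFT, true));
  }
}
```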
[jira] [Created] (CALCITE-3896) Pass through parent trait requests to child operators
Haisheng Yuan created CALCITE-3896: -- Summary: Pass through parent trait requests to child operators Key: CALCITE-3896 URL: https://issues.apache.org/jira/browse/CALCITE-3896 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan This is not on-demand trait requests as described in [mailing list|http://mail-archives.apache.org/mod_mbox/calcite-dev/201910.mbox/%3cd75b20f4-542a-4a73-897e-66ab426494c1.h.y...@alibaba-inc.com%3e], which requires the overhaul of the core planner. This ticket tries to give VolcanoPlanner a basic, minimal ability to pass parent trait requests through to child operators without rules. It may not be flexible or powerful, but it should work with current Calcite applications with minimal changes. The method for physical operators to implement would be: {code:java} interface PhysicalNode { RelNode passThrough(RelTraitSet required); } {code} Given that a Calcite physical operator decides its child operators' traits when it is created in the physical implementation rule, there are some drawbacks that can't be avoided. e.g., given the following plan: {code:java} StreamAgg on [a] +-- MergeJoin on [a, b, c] |--- TableScan foo +--- TableScan bar {code} Suppose the MergeJoin implementation rule generates several merge joins that distribute by [a], [a,b], [a,b,c]. Then we pass parent operator StreamAgg's trait request to MergeJoin. Since MergeJoin[a] satisfies the parent's request, there is nothing to do. Next we pass the request to MergeJoin[a,b] and get MergeJoin[a]; then we pass the request to MergeJoin[a,b,c] and get MergeJoin[a] again. We know they are redundant and there is no need to pass through the parent operator's trait request, but these MergeJoin operators are independent and agnostic of each other's existence. The ideal way is that in physical implementation rule, during the creation of physical operator, it should not care about itself and its child operators' physical traits. 
But this is another different topic. Anyway, better than nothing, once it is done, we can provide the option to obsolete or disable {{AbstractConverter}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3891) Remove use of Pair.zip in RelTraitSet.satisfies
Haisheng Yuan created CALCITE-3891: -- Summary: Remove use of Pair.zip in RelTraitSet.satisfies Key: CALCITE-3891 URL: https://issues.apache.org/jira/browse/CALCITE-3891 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan Attachments: image-2020-04-01-19-05-58-306.png It accounts for too much of the method's time, even though this is a tiny portion overall. !image-2020-04-01-19-05-58-306.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3889) Add apply(Mappings.Mapping) to RelTrait and RelTraitSet
Haisheng Yuan created CALCITE-3889: -- Summary: Add apply(Mappings.Mapping) to RelTrait and RelTraitSet Key: CALCITE-3889 URL: https://issues.apache.org/jira/browse/CALCITE-3889 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan RelTrait Collation, Distribution have key indices, when we pass down the traitset to child or propagate to parent operator, we have to remap these keys. It would be nice to have {{apply(Mappings.Mapping)}} on RelTrait and RelTraitSet. RelDistribution already has the method, but we may want it on every RelTrait except Convention. -- This message was sent by Atlassian Jira (v8.3.4#803005)
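The key remapping described in CALCITE-3889 can be sketched roughly as follows, assuming the mapping is a plain source-index to target-index map. `KeyRemapSketch` and `apply` are hypothetical names and a plain `Map` stands in for Calcite's Mappings.Mapping. Returning an empty result when a key has no target is an assumption of this sketch, not documented Calcite behavior:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class KeyRemapSketch {
  /**
   * Remaps trait key indices (e.g. collation or distribution keys) through
   * a source-to-target column mapping. If any key has no target column,
   * the trait cannot be expressed on the other side, so the result is empty.
   */
  static Optional<List<Integer>> apply(Map<Integer, Integer> sourceToTarget,
      List<Integer> keys) {
    List<Integer> mapped = new ArrayList<>(keys.size());
    for (Integer key : keys) {
      Integer target = sourceToTarget.get(key);
      if (target == null) {
        return Optional.empty(); // key column is not visible after mapping
      }
      mapped.add(target);
    }
    return Optional.of(mapped);
  }
}
```

Key order is preserved on purpose, since it matters for collations.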
[jira] [Created] (CALCITE-3886) Execute substitution rule according to the order they get matched
Haisheng Yuan created CALCITE-3886: -- Summary: Execute substitution rule according to the order they get matched Key: CALCITE-3886 URL: https://issues.apache.org/jira/browse/CALCITE-3886 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan Currently the substitution rule is always added to the head of the queue. We prefer to execute the rules in the order they get matched, so we need a separate queue for substitution rule matches. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3879) Rel Id generator should not be static
Haisheng Yuan created CALCITE-3879: -- Summary: Rel Id generator should not be static Key: CALCITE-3879 URL: https://issues.apache.org/jira/browse/CALCITE-3879 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan The rel id will continue increasing until the class is unloaded. We have observed the id reaching 100 million after running for just several days. Although we don't use the id for calculation, so integer overflow might be OK, it doesn't make sense to generate unique ids globally. It makes the dumped plan and the rel ids in the log hard to read, because the rel id is so large. IMO, the id should be generated per query context. -- This message was sent by Atlassian Jira (v8.3.4#803005)
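A per-query id generator along the lines proposed in CALCITE-3879 might look like the sketch below. `QueryContextSketch` is a hypothetical class, not an existing Calcite type. The counter is an instance field rather than a static one, so every query starts numbering from zero:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-query context that owns its own rel id counter. */
final class QueryContextSketch {
  // Instance-level counter: ids restart at 0 for every new query context
  // instead of growing for the lifetime of the class.
  private final AtomicInteger nextRelId = new AtomicInteger(0);

  /** Returns the next rel id, unique only within this query context. */
  int nextRelId() {
    return nextRelId.getAndIncrement();
  }

  public static void main(String[] args) {
    QueryContextSketch q1 = new QueryContextSketch();
    QueryContextSketch q2 = new QueryContextSketch();
    System.out.println(q1.nextRelId() + " " + q1.nextRelId() + " " + q2.nextRelId());
  }
}
```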
[jira] [Created] (CALCITE-3868) Remove redundant ruleSet and ruleNames in VolcanoPlanner
Haisheng Yuan created CALCITE-3868: -- Summary: Remove redundant ruleSet and ruleNames in VolcanoPlanner Key: CALCITE-3868 URL: https://issues.apache.org/jira/browse/CALCITE-3868 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan ruleSet and ruleNames in VolcanoPlanner are redundant, remove them. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3865) RelCollationTraitDef.canConvert should always return true
Haisheng Yuan created CALCITE-3865: -- Summary: RelCollationTraitDef.canConvert should always return true Key: CALCITE-3865 URL: https://issues.apache.org/jira/browse/CALCITE-3865 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan Attachments: image-2020-03-19-16-44-25-621.png, image-2020-03-19-16-47-21-805.png CALCITE-1148 introduced the following change to RelCollationTraitDef to fix a RelTrait conversion bug, but it just hides the underlying issue and adds a redundant and unnecessary check to the planner. !image-2020-03-19-16-47-21-805.png! The root cause is that logical operators, especially LogicalSort, can have traits, which is a bad design decision, and {{AggregateReduceFunctionsRule}} fails to adjust the column mapping in RelTraitSet. The newly created {{LogicalProject}} has a collation on column 5 (it just copies its input's RelTraitSet blindly), but it only has 2 columns. !image-2020-03-19-16-44-25-621.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3819) Prune parent RelNode when merging child RelSet with parent RelSet
Haisheng Yuan created CALCITE-3819: -- Summary: Prune parent RelNode when merging child RelSet with parent RelSet Key: CALCITE-3819 URL: https://issues.apache.org/jira/browse/CALCITE-3819 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan Suppose we have 2 RelSets: RelSet A: rel1 RelSet B: rel2 rel1 is the parent of rel2. If there is a transformation rule that transforms rel1 to rel2, we will merge RelSet A and B. During the merge process, we can safely prune rel1 to avoid further rule application on rel1 and reduce the search space and, more importantly, avoid cyclic references. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3757) When merging sets, relnodes may be reregistered multiple times
Haisheng Yuan created CALCITE-3757: -- Summary: When merging sets, relnodes may be reregistered multiple times Key: CALCITE-3757 URL: https://issues.apache.org/jira/browse/CALCITE-3757 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan In RelSet#mergeWith(), {code:java} // merge subsets for (RelSubset otherSubset : otherSet.subsets) { .. for (RelNode otherRel : otherSubset.getRels()) { planner.reregister(this, otherRel); } } {code} otherSubset.getRels() returns all the rels that satisfy the traitset. A relnode's traitset may satisfy many traitsets, hence it will be reregistered multiple times. -- This message was sent by Atlassian Jira (v8.3.4#803005)
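One way to avoid the repeated reregistration described in CALCITE-3757 is to collect the rels from all subsets into an identity-based set first. The sketch below uses plain lists in place of RelSubset.getRels(). `MergeDedupSketch` and `distinctByIdentity` are hypothetical names, and this is not necessarily how the issue was resolved in Calcite:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class MergeDedupSketch {
  /**
   * Flattens the per-subset rel lists into one list in which every rel
   * object appears exactly once, in first-seen order. A rel whose traitset
   * satisfies several subsets shows up in several input lists.
   */
  static <T> List<T> distinctByIdentity(List<List<T>> relsPerSubset) {
    // Identity semantics: two rels count as the same only if they are the
    // same object, which avoids relying on equals() or digests.
    Set<T> seen = Collections.newSetFromMap(new IdentityHashMap<>());
    List<T> result = new ArrayList<>();
    for (List<T> rels : relsPerSubset) {
      for (T rel : rels) {
        if (seen.add(rel)) {
          result.add(rel); // reregister this rel once
        }
      }
    }
    return result;
  }
}
```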
[jira] [Created] (CALCITE-3756) RelSubset should not match operand(RelNode.class)
Haisheng Yuan created CALCITE-3756: -- Summary: RelSubset should not match operand(RelNode.class) Key: CALCITE-3756 URL: https://issues.apache.org/jira/browse/CALCITE-3756 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Although RelSubset is a subclass of RelNode, the two are quite different. RelNode already fires these rules, so it is not necessary to fire them for RelSubset. More importantly, it never worked due to CALCITE-3755. In addition, it wastes time to iterate the class operands of RelSubset and try to match the pattern operand(RelNode), given that it never matches when ascending. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3755) Ascending rule match with RelSubset operand doesn't work
Haisheng Yuan created CALCITE-3755: -- Summary: Ascending rule match with RelSubset operand doesn't work Key: CALCITE-3755 URL: https://issues.apache.org/jira/browse/CALCITE-3755 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Given a rule with match pattern {{operand(Filter, operand(RelSubset))}}, when we add a new RelSubset into the set, planner will fire rules for RelSubset. But in {{VolcanoRuleCall.matchRecurse}}, if !inputRels.contains(previous), it will skip. Obviously, the input rels will never contain a RelSubset, hence fails to match the rule. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3753) Always try to match and execute substitution rule first and remove rulematch ordering
Haisheng Yuan created CALCITE-3753: -- Summary: Always try to match and execute substitution rule first and remove rulematch ordering Key: CALCITE-3753 URL: https://issues.apache.org/jira/browse/CALCITE-3753 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan Some rules, e.g. ProjectMergeRule and PruneEmptyRule, can be defined as SubstitutionRule, so that we can always try to match and execute them first (without deferring the rule call). All the other rule matches don't need to be sorted, and rules can be executed in whatever order they matched, since we are going to execute all of them anyway, sooner or later. Computing and comparing importances causes a lot of latency. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3744) Duplicate RuleMatches when RelSet gets merged
Haisheng Yuan created CALCITE-3744: -- Summary: Duplicate RuleMatches when RelSet gets merged Key: CALCITE-3744 URL: https://issues.apache.org/jira/browse/CALCITE-3744 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Sometimes RelSets can get merged. e.g. each number represents a Rel id, and they are in different RelSets. {code:java} 14 \ / 2 / \/ 3 {code} Assume that in the rule queue we have rule matches named 12, 23, 43. For simplicity, the rule name is omitted; the first digit represents the parent RelNode and the second represents the child. If after some rule we merge the set of 3 into the set of 2, RelNode 4's input is replaced by the RelSubset of 2. We will retrigger rules and try to add rule matches 12, 42, 43 into the rule queue. 12 will be filtered out, because there is a duplicate in the RuleMatch names set. But 43 now has a different rule match digest from the previous match: because RelNode #4's input RelSubset changed, its digest also changed. So we can't detect the duplication and will apply rule match 43 twice. Although we try to recompute the rule match's digest in popMatch(), it seems too late. I think that in the RuleMatch digest, just using the RelNode's id should suffice, and we don't need to call recomputeDigest in popMatch. -- This message was sent by Atlassian Jira (v8.3.4#803005)
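The id-based RuleMatch key suggested in CALCITE-3744 can be sketched as below. `RuleMatchKeySketch`, `key`, and `offer` are hypothetical names, and the key format is made up for illustration. Because the key depends only on the rule name and the rel ids, it stays the same when an input RelSubset (and therefore a digest) changes after a set merge:

```java
import java.util.HashSet;
import java.util.Set;

final class RuleMatchKeySketch {
  private final Set<String> seen = new HashSet<>();

  /** Builds a key from the rule name and the ids of the matched rels. */
  static String key(String ruleName, int... relIds) {
    StringBuilder sb = new StringBuilder(ruleName).append(':');
    for (int i = 0; i < relIds.length; i++) {
      if (i > 0) {
        sb.append(',');
      }
      sb.append('#').append(relIds[i]);
    }
    return sb.toString();
  }

  /** Returns true if the match is new and should be queued, false if duplicate. */
  boolean offer(String ruleName, int... relIds) {
    return seen.add(key(ruleName, relIds));
  }
}
```

With this, offering match 43 a second time after the merge is rejected, since rel ids 4 and 3 are unchanged.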
[jira] [Created] (CALCITE-3676) VolcanoPlanner. dumpGraphviz should handle exception gracefully
Haisheng Yuan created CALCITE-3676: -- Summary: VolcanoPlanner. dumpGraphviz should handle exception gracefully Key: CALCITE-3676 URL: https://issues.apache.org/jira/browse/CALCITE-3676 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan In case VolcanoPlanner can't generate plan due to some RelSubset doesn't have best rel, it will dump the sets and Graphviz. Since there is error during planning, we want to see the sets in the exception message, but when dumping Graphviz, it might encounter another exception (we can't guarantee all the sets and subsets are correctly generated for Graphviz), causing failure to dump Sets info, which is rather helpful in production system. {code:java} Caused by: java.lang.AssertionError at org.apache.calcite.util.PartiallyOrderedSet.findParentsChildren(PartiallyOrderedSet.java:318) at org.apache.calcite.util.PartiallyOrderedSet.findParents(PartiallyOrderedSet.java:308) at org.apache.calcite.util.PartiallyOrderedSet.add(PartiallyOrderedSet.java:226) at org.apache.calcite.plan.volcano.VolcanoPlanner.dumpGraphviz(VolcanoPlanner.java:1320) at org.apache.calcite.plan.volcano.VolcanoPlanner.dump(VolcanoPlanner.java:1194) at org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:606) at org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:307) at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:649) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3668) VolcanoPlanner doesn't match all the RelSubSet in matchRecursive
Haisheng Yuan created CALCITE-3668: -- Summary: VolcanoPlanner doesn't match all the RelSubSet in matchRecursive Key: CALCITE-3668 URL: https://issues.apache.org/jira/browse/CALCITE-3668 Project: Calcite Issue Type: Bug Reporter: Haisheng Yuan If I have a rule to match pattern with Filter-RelSubset, VolcanoPlanner only matches 1 RelSubset in the RelSet, instead of all the subsets. {code:java} if (operand.getMatchedClass() == RelSubset.class) { // If the rule wants the whole subset, we just provide it successors = ImmutableList.of(subset); } else { successors = subset.getRelList(); } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3576) Remove Enumerable convention check in FilterIntoJoinRule
Haisheng Yuan created CALCITE-3576: -- Summary: Remove Enumerable convention check in FilterIntoJoinRule Key: CALCITE-3576 URL: https://issues.apache.org/jira/browse/CALCITE-3576 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan Fix For: 1.22.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3521) CalciteSystemProperty can't load config file
Haisheng Yuan created CALCITE-3521: -- Summary: CalciteSystemProperty can't load config file Key: CALCITE-3521 URL: https://issues.apache.org/jira/browse/CALCITE-3521 Project: Calcite Issue Type: Bug Reporter: Haisheng Yuan I am using calcite as a standalone library, want to specify default charset and collation through global config file {{saffron.properties}}. But it failed to load property file. I think it should use {{Thread.currentThread().getContextClassLoader()}} to get class loader. -- This message was sent by Atlassian Jira (v8.3.4#803005)
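The class-loader lookup suggested in CALCITE-3521 might be sketched like this. `ConfigLoaderSketch` and `load` are hypothetical names. The fallback order is an assumption of this sketch, and a missing file simply yields empty properties:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ConfigLoaderSketch {
  /** Loads a properties resource, preferring the thread context class loader. */
  static Properties load(String resourceName) {
    Properties props = new Properties();
    // The context class loader usually sees the application's classpath,
    // even when the library itself was loaded by a parent class loader.
    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    if (cl == null) {
      cl = ConfigLoaderSketch.class.getClassLoader();
    }
    if (cl == null) {
      cl = ClassLoader.getSystemClassLoader();
    }
    try (InputStream in = cl.getResourceAsStream(resourceName)) {
      if (in != null) {
        props.load(in); // resource found: read its key=value pairs
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }
}
```

A caller would use it as `ConfigLoaderSketch.load("saffron.properties")`.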
[jira] [Created] (CALCITE-3492) Exception thrown when terms has 1 RexNode in RexUtil.simplifyOrs()
Haisheng Yuan created CALCITE-3492: -- Summary: Exception thrown when terms has 1 RexNode in RexUtil.simplifyOrs() Key: CALCITE-3492 URL: https://issues.apache.org/jira/browse/CALCITE-3492 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan This is a regression caused by the refactoring in 1.20.0. When {{terms}} has only 1 element, it will throw an exception. We discovered the issue when trying to upgrade Calcite. {code:java} public static RexNode simplifyOrs(RexBuilder rexBuilder, List terms) { return new RexSimplify(rexBuilder, RelOptPredicateList.EMPTY, EXECUTOR) .simplifyUnknownAs(rexBuilder.makeCall(SqlStdOperatorTable.OR, terms), RexUnknownAs.UNKNOWN); } {code} We can't assume the {{terms}} always has more than 1 element. Although this method is deprecated, I think we need to correct the regression. Maybe we don't need a test case, given it is a deprecated method. -- This message was sent by Atlassian Jira (v8.3.4#803005)
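The size handling that CALCITE-3492 says is missing can be illustrated with strings standing in for RexNodes. `OrComposeSketch` and `composeOr` are hypothetical names. Mapping the empty list to FALSE is an assumption based on the usual identity of OR. The point is only that zero and one term are handled before an OR call is built:

```java
import java.util.List;

final class OrComposeSketch {
  /** Composes a disjunction without assuming there are at least two terms. */
  static String composeOr(List<String> terms) {
    switch (terms.size()) {
      case 0:
        return "FALSE"; // empty disjunction
      case 1:
        return terms.get(0); // a single term needs no OR call
      default:
        return "OR(" + String.join(", ", terms) + ")";
    }
  }
}
```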
[jira] [Created] (CALCITE-3483) Make RexLiteral member fields accessible by sub-class
Haisheng Yuan created CALCITE-3483: -- Summary: Make RexLiteral member fields accessible by sub-class Key: CALCITE-3483 URL: https://issues.apache.org/jira/browse/CALCITE-3483 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan RexLiteral may be sub-classed and some methods may be overridden, e.g. {{printAsJava}}. But currently RexLiteral fields are private, so sub-classes can't access them, which makes it impossible to override methods like {{printAsJava}}. We recommend making RexLiteral member fields protected, so that we can sub-class it and customize the behavior of methods such as {{printAsJava}}. Does it make sense? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3460) Poor performance in RexReplacer for large queries
Haisheng Yuan created CALCITE-3460: -- Summary: Poor performance in RexReplacer for large queries Key: CALCITE-3460 URL: https://issues.apache.org/jira/browse/CALCITE-3460 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan We have queries that have tens of thousands of RexCalls. reducibleExps.indexOf(call) is an O(n) operation, which takes 50% of the running time, causing the query to run until it times out. In RexShuttle, ImmutableList iterator creation in {{visitList}} takes another 5~7% of the running time, and it creates millions of temporary iterator objects, which costs not only time but also memory. -- This message was sent by Atlassian Jira (v8.3.4#803005)
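A common way to remove the O(n) indexOf cost mentioned in CALCITE-3460 is to build a hash index once and look positions up in expected constant time. The sketch below is generic. `IndexLookupSketch` and `firstIndexOf` are hypothetical names, and it assumes the elements have consistent equals() and hashCode(). It is not necessarily the fix that was applied to RexReplacer:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IndexLookupSketch {
  /**
   * Builds an element-to-position index in one pass. putIfAbsent keeps the
   * first position of a repeated element, matching List.indexOf semantics.
   */
  static <T> Map<T, Integer> firstIndexOf(List<T> items) {
    Map<T, Integer> index = new HashMap<>(items.size() * 2);
    for (int i = 0; i < items.size(); i++) {
      index.putIfAbsent(items.get(i), i);
    }
    return index;
  }
}
```

Each later lookup is then `index.get(call)`, with a null result playing the role of indexOf's -1.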
[jira] [Created] (CALCITE-3458) Remove desc in AbstractRelNode
Haisheng Yuan created CALCITE-3458: -- Summary: Remove desc in AbstractRelNode Key: CALCITE-3458 URL: https://issues.apache.org/jira/browse/CALCITE-3458 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan If the query is super large, e.g. contains tens of thousands of nodes or expressions, the RelNode digest and desc become very large. The content of desc and digest are almost the same, except that desc consists of id plus digest, which causes OOM in our production system. So remove desc, just use {{id + digest}} to produce description. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3425) Inconsistent behavior of MetadataProvider in RelOptCluster
Haisheng Yuan created CALCITE-3425: -- Summary: Inconsistent behavior of MetadataProvider in RelOptCluster Key: CALCITE-3425 URL: https://issues.apache.org/jira/browse/CALCITE-3425 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan To use a customized metadata provider, we can do the following: {code:java} RelMetadataQuery.THREAD_PROVIDERS.set( JaninoRelMetadataProvider.of(xxxmetadataProvider)); {code} It only works for builtin metadata types; for customized metadata, we still get an exception when retrieving the metadata using reflection, because when the RelOptCluster is created, it always uses the default metadata provider instead of the customized one. {code:java} setMetadataProvider(DefaultRelMetadataProvider.INSTANCE); {code} This is confusing. We have to set the provider in 2 places. Should we unify them in a single place? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3403) WindowReduceExpressionsRule does not reuse RelMetadataQuery
Haisheng Yuan created CALCITE-3403: -- Summary: WindowReduceExpressionsRule does not reuse RelMetadataQuery Key: CALCITE-3403 URL: https://issues.apache.org/jira/browse/CALCITE-3403 Project: Calcite Issue Type: Improvement Reporter: Haisheng Yuan It creates a new RelMetadataQuery instance, which is a fresh RelMetadataQuery without any data cache. We should get RelMetadataQuery from RelOptRuleCall to reuse cache as much as possible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3372) Add a rule to transform a semijoin into a join on top of aggregate
Haisheng Yuan created CALCITE-3372: -- Summary: Add a rule to transform a semijoin into a join on top of aggregate Key: CALCITE-3372 URL: https://issues.apache.org/jira/browse/CALCITE-3372 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan {code:java} Semijoin |- TS1 +- TS2 {code} can be transformed into {code:java} Project +-InnerJoin |- TS1 +- Aggregate + TS2 {code} We may benefit from the new plan by join relation reordering if TS2 is large even after aggregation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3285) EnumerableMergeJoin should support non-equi join conditions
Haisheng Yuan created CALCITE-3285: -- Summary: EnumerableMergeJoin should support non-equi join conditions Key: CALCITE-3285 URL: https://issues.apache.org/jira/browse/CALCITE-3285 Project: Calcite Issue Type: Improvement Reporter: Haisheng Yuan Calcite should be able to generate EnumerableMergeJoin with non-equi join conditions, as long as there are equi-join conditions. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (CALCITE-3284) Enumerable hash semijoin support non-equi join conditions
Haisheng Yuan created CALCITE-3284: -- Summary: Enumerable hash semijoin support non-equi join conditions Key: CALCITE-3284 URL: https://issues.apache.org/jira/browse/CALCITE-3284 Project: Calcite Issue Type: Improvement Reporter: Haisheng Yuan Calcite should be able to generate an enumerable hash semijoin with non-equi join conditions, as long as there are equi-join conditions, so that we can do a hash lookup. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (CALCITE-3181) Support limit per group in Window
Haisheng Yuan created CALCITE-3181: -- Summary: Support limit per group in Window Key: CALCITE-3181 URL: https://issues.apache.org/jira/browse/CALCITE-3181 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan We have a lot of queries like the following to retrieve top N tuples per group: {code:java} SELECT x, y FROM (SELECT x, y, ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) AS rn FROM t1) t2 WHERE rn <= 3; {code} The performance is not good if each group has a lot more tuples than wanted, because we will retrieve and sort all the tuples, instead of just doing a top-N heap sort. In order to do optimization for this kind of query, we need to extend window to support limit, if and only if there is only 1 window function, and it is {{row_number()}}. We also need a substitute rule to push the limit into window. Of course, we also need to modify executor to support this optimization (can be later). Thoughts? Objections? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
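The top-N heap idea in CALCITE-3181 can be sketched independently of Calcite: keep one bounded heap per partition key and evict the worst row whenever a heap grows past N. `TopNPerGroupSketch` and `topN` are hypothetical names, rows are modeled as key/value entries, and this says nothing about how a Window operator or executor would actually implement it:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class TopNPerGroupSketch {
  /** Keeps at most n values per key, the first n according to {@code order}. */
  static <K, V> Map<K, List<V>> topN(List<Map.Entry<K, V>> rows, int n,
      Comparator<V> order) {
    Map<K, PriorityQueue<V>> heaps = new LinkedHashMap<>();
    for (Map.Entry<K, V> row : rows) {
      // Reversed order puts the worst kept value at the head of the heap.
      PriorityQueue<V> heap = heaps.computeIfAbsent(row.getKey(),
          k -> new PriorityQueue<V>(order.reversed()));
      heap.add(row.getValue());
      if (heap.size() > n) {
        heap.poll(); // evict the worst value, keeping memory at O(n) per group
      }
    }
    Map<K, List<V>> result = new LinkedHashMap<>();
    for (Map.Entry<K, PriorityQueue<V>> e : heaps.entrySet()) {
      List<V> kept = new ArrayList<>(e.getValue());
      kept.sort(order); // only the n survivors are sorted
      result.put(e.getKey(), kept);
    }
    return result;
  }
}
```

Compared with sorting every partition in full, each group costs O(m log n) for m rows instead of O(m log m), which is where the benefit comes from when groups are much larger than N.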
[jira] [Created] (CALCITE-3160) Failed to materialize when the aggregate function uses group key
Haisheng Yuan created CALCITE-3160: -- Summary: Failed to materialize when the aggregate function uses group key Key: CALCITE-3160 URL: https://issues.apache.org/jira/browse/CALCITE-3160 Project: Calcite Issue Type: Bug Reporter: Haisheng Yuan Repro: {code:java} @Test public void testAggregateGroupSetsRollUp() { checkMaterialize( "select \"empid\", \"deptno\", count(*) as c, sum(\"empid\") as s from \"emps\" " + "group by \"empid\", \"deptno\"", "select count(*) + 1 as c, \"deptno\" from \"emps\" group by cube(\"empid\",\"deptno\")", HR_FKUK_MODEL, CalciteAssert.checkResultContains( "EnumerableCalc(expr#0..2=[{inputs}], expr#3=[1], " + "expr#4=[+($t2, $t3)], C=[$t4], deptno=[$t1])\n" + " EnumerableAggregate(group=[{0, 1}], groups=[[{0, 1}, {0}, {1}, {}]], agg#0=[$SUM0($2)])\n" + "EnumerableTableScan(table=[[hr, m0]])")); } {code} Note that if we change sum(\"empid\") to sum("salary"), it can materialize successfully. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3159) Distinct can be removed for MIN/MAX/BIT_OR/BIT_AND aggregate functions
Haisheng Yuan created CALCITE-3159: -- Summary: Distinct can be removed for MIN/MAX/BIT_OR/BIT_AND aggregate functions Key: CALCITE-3159 URL: https://issues.apache.org/jira/browse/CALCITE-3159 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan For the following query: {code:java} select a, min(distinct b), bit_or(distinct c) from foo; {code} Currently Calcite still preserves the distinct for these aggregate functions, but DISTINCT is not meaningful with MIN/MAX and is available for ISO compatibility only. We can safely remove the distinct and get more optimization opportunities. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
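The reasoning in CALCITE-3159 rests on these aggregates being insensitive to duplicates: feeding a value twice does not change the result. A small sketch of the check and of the property itself follows. `DistinctAggSketch`, `canDropDistinct`, and `bitOr` are hypothetical names, and aggregate functions are identified by plain strings here rather than by Calcite's operator classes:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class DistinctAggSketch {
  // Aggregates whose result does not change when duplicate inputs are removed.
  private static final Set<String> DUPLICATE_INSENSITIVE =
      new HashSet<>(Arrays.asList("MIN", "MAX", "BIT_OR", "BIT_AND"));

  /** Whether DISTINCT can be dropped from a call to the named aggregate. */
  static boolean canDropDistinct(String aggName) {
    return DUPLICATE_INSENSITIVE.contains(aggName.toUpperCase(Locale.ROOT));
  }

  /** Bitwise OR over all values; repeated values leave the result unchanged. */
  static int bitOr(Collection<Integer> values) {
    int acc = 0;
    for (int v : values) {
      acc |= v;
    }
    return acc;
  }
}
```

SUM and COUNT are deliberately absent from the set, since duplicates do change their results.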
[jira] [Created] (CALCITE-3149) CACHE in RelDataTypeFactoryImpl can't be garbage collected
Haisheng Yuan created CALCITE-3149: -- Summary: CACHE in RelDataTypeFactoryImpl can't be garbage collected Key: CALCITE-3149 URL: https://issues.apache.org/jira/browse/CALCITE-3149 Project: Calcite Issue Type: Bug Reporter: Haisheng Yuan Attachments: image-2019-06-26-22-01-50-965.png Key and Value point to the same object. With soft references, the items in CACHE will not be garbage collected. This also causes the RelDataType not to be reused. !image-2019-06-26-22-01-50-965.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3138) Restructuring ROW type fields is not supported
Haisheng Yuan created CALCITE-3138: -- Summary: Restructuring ROW type fields is not supported Key: CALCITE-3138 URL: https://issues.apache.org/jira/browse/CALCITE-3138 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Attachments: ROW_repro.patch Restructuring fields of ROW type after flattening is not supported (see RelStructuredTypeFlattener.restructureFields), which may cause wrong results. The attached patch file can be used to reproduce the issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3137) AssertionError when reconstructing STRUCTURED type fields
Haisheng Yuan created CALCITE-3137: -- Summary: AssertionError when reconstructing STRUCTURED type fields Key: CALCITE-3137 URL: https://issues.apache.org/jira/browse/CALCITE-3137 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Add the following test to SqlToRelConverterTest.java. {code:java} @Test public void testRecontructStructFields() { final String sql = "select HOME_ADDRESS from EMP_ADDRESS "; sql(sql).convertsTo( "LogicalProject(HOME_ADDRESS=[NEW($1.CITY, $1.ZIP, $1.STATE, $2.STREET):ObjectSqlType(ADDRESS) NOT NULL])\n" + " LogicalTableScan(table=[[CATALOG, SALES, EMP_ADDRESS]])" ); } {code} Error: {code:java} java.lang.AssertionError: wrong operand count 4 for NEW at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31) at org.apache.calcite.sql.SqlPrefixOperator.validRexOperands(SqlPrefixOperator.java:100) at org.apache.calcite.rex.RexCall.(RexCall.java:86) at org.apache.calcite.rex.RexBuilder.makeNewInvocation(RexBuilder.java:494) at org.apache.calcite.sql2rel.RelStructuredTypeFlattener.restructure(RelStructuredTypeFlattener.java:249) at org.apache.calcite.sql2rel.RelStructuredTypeFlattener.restructureFields(RelStructuredTypeFlattener.java:227) at org.apache.calcite.sql2rel.RelStructuredTypeFlattener.rewrite(RelStructuredTypeFlattener.java:206) at org.apache.calcite.sql2rel.SqlToRelConverter.flattenTypes(SqlToRelConverter.java:468) at org.apache.calcite.test.SqlToRelTestBase$TesterImpl.convertSqlToRel(SqlToRelTestBase.java:614) {code} The assertion should be removed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3124) Infinite rule matching when AggregateRemoveRule is enabled for SUM0
Haisheng Yuan created CALCITE-3124: -- Summary: Infinite rule matching when AggregateRemoveRule is enabled for SUM0 Key: CALCITE-3124 URL: https://issues.apache.org/jira/browse/CALCITE-3124 Project: Calcite Issue Type: Bug Reporter: Haisheng Yuan Make the following changes (uncomment return clause) to /AggregateRemoveRule, the test case {{JdbcTest.testHavingNot2}} won't complete due to infinite rule matching. {code:diff} --- a/core/src/main/java/org/apache/calcite/rel/rules/AggregateRemoveRule.java +++ b/core/src/main/java/org/apache/calcite/rel/rules/AggregateRemoveRule.java @@ -102,7 +102,7 @@ public void onMatch(RelOptRuleCall call) { if (aggregation.getKind() == SqlKind.SUM0) { // Bail out for SUM0 to avoid potential infinite rule matching, // because it may be generated by transforming SUM aggregate // function to SUM0 and COUNT. -return; +//return; } final SqlSplittableAggFunction splitter = Objects.requireNonNull( {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3110) Enable parallel execution of parameterized test
Haisheng Yuan created CALCITE-3110: -- Summary: Enable parallel execution of parameterized test Key: CALCITE-3110 URL: https://issues.apache.org/jira/browse/CALCITE-3110 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan The patch of CALCITE-2812 introduced a parameterized test, but parameterized JUnit tests hang when running in parallel mode due to [1]. So we annotated the test case as not thread safe to avoid parallel execution. Enable parallel execution of EnumerableRepeatUnionHierarchyTest when issue [1] is fixed. See discussion at [2]. [1] https://issues.apache.org/jira/browse/SUREFIRE-1430 [2] https://lists.apache.org/thread.html/d9205b98873d60102ed53e46ffde2fa63632eeca948ee1c9ab810827@%3Cdev.calcite.apache.org%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3089) Deprecate EquiJoin
Haisheng Yuan created CALCITE-3089: -- Summary: Deprecate EquiJoin Key: CALCITE-3089 URL: https://issues.apache.org/jira/browse/CALCITE-3089 Project: Calcite Issue Type: Improvement Reporter: Haisheng Yuan EquiJoin should be replaced by Join with equiConds and nonEquiConds (empty if there are none). EquiJoin will not have any subclasses. EnumerableHashJoin, EnumerableMergeJoin, and SemiJoin should extend Join directly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3087) Mappings.apply2 doesn't keep order
Haisheng Yuan created CALCITE-3087: -- Summary: Mappings.apply2 doesn't keep order Key: CALCITE-3087 URL: https://issues.apache.org/jira/browse/CALCITE-3087 Project: Calcite Issue Type: Bug Reporter: Haisheng Yuan Add the following test to MaterializationTest: {code:java} @Test public void testAggregateGroupSetsRollUp0() { checkMaterialize( "select \"empid\", \"deptno\", count(*) as c, sum(\"salary\") as s from \"emps\" " + "group by \"empid\", \"deptno\"", "select count(*) + 1 as c, \"deptno\" from \"emps\" group by cube(\"deptno\", \"empid\")", HR_FKUK_MODEL, CalciteAssert.checkResultContains( "EnumerableCalc(expr#0..2=[{inputs}], expr#3=[1], " + "expr#4=[+($t2, $t3)], C=[$t4], deptno=[$t1])\n" + " EnumerableAggregate(group=[{0, 1}], groups=[[{0, 1}, {0}, {1}, {}]], agg#0=[$SUM0($2)])\n" + "EnumerableTableScan(table=[[hr, m0]])")); } {code} We got error: {code:java} Caused by: java.lang.IllegalArgumentException: must be sorted: [{0, 1}, {1}, {0}, {}] at org.apache.calcite.rel.core.Aggregate$Group.induce(Aggregate.java:472) at org.apache.calcite.rel.mutable.MutableAggregate.getGroupType(MutableAggregate.java:83) at org.apache.calcite.plan.SubstitutionVisitor.unifyAggregates(SubstitutionVisitor.java:1281) at org.apache.calcite.plan.SubstitutionVisitor$AggregateOnProjectToAggregateUnifyRule.apply(SubstitutionVisitor.java:1345) at org.apache.calcite.plan.SubstitutionVisitor.go(SubstitutionVisitor.java:531) at org.apache.calcite.plan.SubstitutionVisitor.go(SubstitutionVisitor.java:466) at org.apache.calcite.plan.MaterializedViewSubstitutionVisitor.go(MaterializedViewSubstitutionVisitor.java:56) at org.apache.calcite.plan.RelOptMaterializations.substitute(RelOptMaterializations.java:200) at org.apache.calcite.plan.RelOptMaterializations.useMaterializedViews(RelOptMaterializations.java:72) at org.apache.calcite.plan.volcano.VolcanoPlanner.registerMaterializations(VolcanoPlanner.java:348) at 
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:584) {code} The reason is that Mappings.apply2 doesn't sort. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
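The ordering problem reported in CALCITE-3087 can be illustrated without Calcite. The following is a minimal sketch, assuming that "sorted" means enclosing group sets come first with ties broken lexicographically. The class GroupSetOrdering, its comparator ENCLOSING_FIRST, and the helpers set and sortGroupSets are hypothetical names for illustration, not Calcite API, and whether this matches the order that Aggregate.Group.induce checks exactly is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: re-sorting group sets after a mapping has been applied. */
public class GroupSetOrdering {
  /** Assumed ordering: enclosing sets first, then lexicographic by element. */
  static final Comparator<TreeSet<Integer>> ENCLOSING_FIRST = (a, b) -> {
    if (a.equals(b)) {
      return 0;
    }
    if (a.containsAll(b)) {
      return -1; // a encloses b, so a comes first
    }
    if (b.containsAll(a)) {
      return 1; // b encloses a, so b comes first
    }
    // Neither encloses the other: compare element by element.
    Iterator<Integer> ia = a.iterator();
    Iterator<Integer> ib = b.iterator();
    while (ia.hasNext() && ib.hasNext()) {
      int c = Integer.compare(ia.next(), ib.next());
      if (c != 0) {
        return c;
      }
    }
    return Boolean.compare(ia.hasNext(), ib.hasNext());
  };

  /** Builds a group set from column ordinals. */
  static TreeSet<Integer> set(Integer... bits) {
    return new TreeSet<>(Arrays.asList(bits));
  }

  /** Returns a sorted copy of the group sets; the input is left untouched. */
  static List<TreeSet<Integer>> sortGroupSets(List<TreeSet<Integer>> groupSets) {
    List<TreeSet<Integer>> sorted = new ArrayList<>(groupSets);
    sorted.sort(ENCLOSING_FIRST);
    return sorted;
  }

  public static void main(String[] args) {
    // The order from the error message: [{0, 1}, {1}, {0}, {}]
    List<TreeSet<Integer>> mapped =
        Arrays.asList(set(0, 1), set(1), set(0), set());
    System.out.println(sortGroupSets(mapped));
  }
}
```

With the group sets from the error message as input, the sketch yields the order that appears in the expected plan, {0, 1}, {0}, {1}, {}.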
[jira] [Created] (CALCITE-3086) MULTI_FUNCTION mapping is not implemented
Haisheng Yuan created CALCITE-3086: -- Summary: MULTI_FUNCTION mapping is not implemented Key: CALCITE-3086 URL: https://issues.apache.org/jira/browse/CALCITE-3086 Project: Calcite Issue Type: Improvement Reporter: Haisheng Yuan When I create a Mapping with MULTI_FUNCTION type, it throws an exception. It is not implemented. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3041) Query runs endless when setImportance to 0
Haisheng Yuan created CALCITE-3041: -- Summary: Query runs endless when setImportance to 0 Key: CALCITE-3041 URL: https://issues.apache.org/jira/browse/CALCITE-3041 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Make the following changes: {code:java} --- a/core/src/main/java/org/apache/calcite/rel/rules/AggregateRemoveRule.java +++ b/core/src/main/java/org/apache/calcite/rel/rules/AggregateRemoveRule.java @@ -123,6 +123,7 @@ public void onMatch(RelOptRuleCall call) { // aggregate functions, add a project for the same effect. relBuilder.project(relBuilder.fields(aggregate.getGroupSet().asList())); } +call.getPlanner().setImportance(aggregate, 0.0); call.transformTo(relBuilder.build()); } } {code} The query in {{JdbcTest.testGroupBy1Max1}} will then run endlessly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3031) AssertionError: correlation id $cor0 not found in correlation list []
Haisheng Yuan created CALCITE-3031: -- Summary: AssertionError: correlation id $cor0 not found in correlation list [] Key: CALCITE-3031 URL: https://issues.apache.org/jira/browse/CALCITE-3031 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Repro: {code:java} @Test public void testSelectAnyCorrelated() { final String sql = "select empno > any(select deptno from dept where emp.job = dept.name) from emp\n" ; checkSubQuery(sql).withLateDecorrelation(true).check(); } {code} Error: {code:java} java.lang.AssertionError: correlation id $cor0 not found in correlation list [] at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31) at org.apache.calcite.rex.RexChecker.visitCorrelVariable(RexChecker.java:174) at org.apache.calcite.rex.RexChecker.visitCorrelVariable(RexChecker.java:57) at org.apache.calcite.rex.RexCorrelVariable.accept(RexCorrelVariable.java:47) at org.apache.calcite.rex.RexVisitorImpl.visitFieldAccess(RexVisitorImpl.java:98) at org.apache.calcite.rex.RexChecker.visitFieldAccess(RexChecker.java:149) at org.apache.calcite.rex.RexChecker.visitFieldAccess(RexChecker.java:57) at org.apache.calcite.rex.RexFieldAccess.accept(RexFieldAccess.java:81) at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:140) at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:57) at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) at org.apache.calcite.rel.core.Filter.isValid(Filter.java:120) at org.apache.calcite.test.SqlToRelConverterTest$RelValidityChecker.visit(SqlToRelConverterTest.java:3312) at org.apache.calcite.rel.SingleRel.childrenAccept(SingleRel.java:72) at org.apache.calcite.rel.RelVisitor.visit(RelVisitor.java:44) {code} The plan after SubQueryRemoveRule is: {code:xml} LogicalProject(EXPR$0=[CAST(OR(AND(IS TRUE(>($0, $9)), <>($10, 0)), AND(>($10, $11), null, <>($10, 0), IS NOT TRUE(>($0, $9))), AND(>($0, $9), <>($10, 0), IS NOT TRUE(>($0, $9)), <=($10, $11:BOOLEAN NOT NULL]) LogicalJoin(condition=[true], 
joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) LogicalAggregate(group=[{}], m=[MIN($0)], c=[COUNT()], d=[COUNT($0)]) LogicalProject(DEPTNO=[$0]) LogicalFilter(condition=[=($cor0.JOB, $1)]) LogicalTableScan(table=[[CATALOG, SALES, DEPT]]) {code} It should be a Correlate, instead of a Join. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3007) Type mismatch for > ANY subquery in project
Haisheng Yuan created CALCITE-3007: -- Summary: Type mismatch for > ANY subquery in project Key: CALCITE-3007 URL: https://issues.apache.org/jira/browse/CALCITE-3007 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Repro: {code:java} @Test public void testGtAnyInProject() { final String sql = "select name, \n" + " deptno > ANY (\n" + " select deptno from emp) \n" + " from dept"; checkSubQuery(sql).withLateDecorrelation(true).check(); } {code} Exception: {code:java} java.lang.AssertionError: Cannot add expression of different type to set: set type is RecordType(VARCHAR(10) NOT NULL NAME, BOOLEAN NOT NULL EXPR$1) NOT NULL expression type is RecordType(VARCHAR(10) NOT NULL NAME, BOOLEAN EXPR$1) NOT NULL set is rel#5:LogicalProject(input=HepRelVertex#4,NAME=$1,EXPR$1=> SOME($0, { LogicalProject(DEPTNO=[$7]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) })) expression is LogicalProject(NAME=[$1], EXPR$1=[OR(AND(IS TRUE(>($0, $2)), <>($3, 0)), AND(>($3, $4), null, <>($3, 0), IS NOT TRUE(>($0, $2))), AND(>($0, $2), <>($3, 0), IS NOT TRUE(>($0, $2)), <=($3, $4)))]) LogicalJoin(condition=[true], joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, DEPT]]) LogicalAggregate(group=[{}], m=[MIN($0)], c=[COUNT()], d=[COUNT($0)]) LogicalProject(DEPTNO=[$7]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) at org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:382) at org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57) at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:236) at org.apache.calcite.rel.rules.SubQueryRemoveRule$SubQueryProjectRemoveRule.onMatch(SubQueryRemoveRule.java:519) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2970) Performance issue when enabling abstract converter for EnumerableConvention
Haisheng Yuan created CALCITE-2970: -- Summary: Performance issue when enabling abstract converter for EnumerableConvention Key: CALCITE-2970 URL: https://issues.apache.org/jira/browse/CALCITE-2970 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan If we enable the use of abstract converter for {{EnumerableConvention}}, by making {{useAbstractConvertersForConversion}} return true, {{JDBCTest.testJoinManyWay}} will not complete. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2964) testEmptyFilterProjectUnion fails when copyProject is disabled in FilterProjectTransposeRule
Haisheng Yuan created CALCITE-2964: -- Summary: testEmptyFilterProjectUnion fails when copyProject is disabled in FilterProjectTransposeRule Key: CALCITE-2964 URL: https://issues.apache.org/jira/browse/CALCITE-2964 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Changing {code:java} public static final FilterProjectTransposeRule INSTANCE = new FilterProjectTransposeRule(Filter.class, Project.class, true, true, RelFactories.LOGICAL_BUILDER); {code} to {code:java} public static final FilterProjectTransposeRule INSTANCE = new FilterProjectTransposeRule(Filter.class, Project.class, true, false, RelFactories.LOGICAL_BUILDER); {code} will cause a plan diff in testEmptyFilterProjectUnion: {code:java} LogicalProject(X=[$0], Y=[$1]) LogicalUnion(all=[true]) LogicalFilter(condition=[>(+($0, $1), 30)]) LogicalValues(tuples=[[{ 10, 1 }, { 30, 3 }]]) LogicalFilter(condition=[>(+($0, $1), 30)]) LogicalValues(tuples=[[{ 20, 2 }]]) {code} The expected plan is: {code:java} LogicalProject(X=[$0], Y=[$1]) LogicalValues(tuples=[[{ 30, 3 }]]) {code} I assumed that the same plan would be generated whether copyFilter/copyProject is on or off. Is this a wrong assumption? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2953) Lattice
Haisheng Yuan created CALCITE-2953: -- Summary: Lattice Key: CALCITE-2953 URL: https://issues.apache.org/jira/browse/CALCITE-2953 Project: Calcite Issue Type: Bug Reporter: Haisheng Yuan -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2949) Double aggregate generated for in subquery with distinct
Haisheng Yuan created CALCITE-2949: -- Summary: Double aggregate generated for in subquery with distinct Key: CALCITE-2949 URL: https://issues.apache.org/jira/browse/CALCITE-2949 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Repro: Add the following test to SqlToRelConverterTest.java. {code:java} @Test public void testSubQueryIN() { final String sql = "select deptno\n" + "from EMP e\n" + "where deptno in (select distinct deptno\n" + "from EMP where empno=e.empno)"; sql(sql).ok(); } {code} Plan: {code:java} LogicalProject(DEPTNO=[$7]) LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) LogicalAggregate(group=[{0, 1}]) LogicalProject(DEPTNO=[$0], EMPNO=[$1]) LogicalAggregate(group=[{0, 1}]) LogicalProject(DEPTNO=[$7], EMPNO=[$0]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} One LogicalAggregate would suffice. The two aggregates could be merged by the rules described in issue [CALCITE-1172|https://issues.apache.org/jira/browse/CALCITE-1172], but those rules are not available yet. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2948) SqlToRelConverter generates complicated logical plan for subquery
Haisheng Yuan created CALCITE-2948: -- Summary: SqlToRelConverter generates complicated logical plan for subquery Key: CALCITE-2948 URL: https://issues.apache.org/jira/browse/CALCITE-2948 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Repro: Add the following test to SqlToRelConverterTest.java. {code:java} @Test public void testSubQueryIN() { final String sql = "select deptno\n" + "from EMP e\n" + "where deptno in (select deptno\n" + "from EMP where empno=e.empno+1)"; sql(sql).ok(); } {code} Plan: {code:java} LogicalProject(DEPTNO=[$7]) LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) LogicalAggregate(group=[{0, 1}]) LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) LogicalProject(EMPNO=[$0]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2947) Type mismatch assertion error when converting NOT IN subquery
Haisheng Yuan created CALCITE-2947: -- Summary: Type mismatch assertion error when converting NOT IN subquery Key: CALCITE-2947 URL: https://issues.apache.org/jira/browse/CALCITE-2947 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Repro: Add the following test to SqlToRelConverterTest.java. {code:java} @Test public void testSubQueryNotIN() { final String sql = "select deptno\n" + "from EMP e\n" + "where deptno not in (select deptno\n" + "from EMP where empno=e.empno)"; sql(sql).ok(); } {code} Error: {code:java} java.lang.AssertionError: type mismatch: ref: INTEGER NOT NULL input: INTEGER at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31) at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1832) at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:125) at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:57) at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112) at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:140) at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:57) at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) at org.apache.calcite.rel.core.Filter.isValid(Filter.java:120) at org.apache.calcite.rel.logical.LogicalFilter.<init>(LogicalFilter.java:70) at org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:114) at org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:101) at org.apache.calcite.rel.core.RelFactories$FilterFactoryImpl.createFilter(RelFactories.java:300) at org.apache.calcite.sql2rel.SqlToRelConverter.createJoin(SqlToRelConverter.java:2433) {code} If we change the NOT IN subquery to an IN subquery, it runs without error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2944) Lack of test for Aggregate.indicator
Haisheng Yuan created CALCITE-2944: -- Summary: Lack of test for Aggregate.indicator Key: CALCITE-2944 URL: https://issues.apache.org/jira/browse/CALCITE-2944 Project: Calcite Issue Type: Improvement Components: core Reporter: Haisheng Yuan Attachments: image-2019-03-21-18-19-58-933.png Although use of Aggregate.indicator is discouraged, no test case guarantees that indicator works correctly. When I set indicator to always be false in the Aggregate constructor, all the tests still passed. !image-2019-03-21-18-19-58-933.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2936) Existential subquery that has aggregate without grouping key should be simplified
Haisheng Yuan created CALCITE-2936: -- Summary: Existential subquery that has aggregate without grouping key should be simplified Key: CALCITE-2936 URL: https://issues.apache.org/jira/browse/CALCITE-2936 Project: Calcite Issue Type: New Feature Reporter: Haisheng Yuan An existential subquery whose inner child is an aggregate with no grouping columns should be simplified to a Boolean constant. Examples: exists(select sum(i) from X) --> true; not exists(select sum(i) from X) --> false. Repro: {code:java} @Test public void testExistentialSubquery() { final String sql = "SELECT e1.empno\n" + "FROM emp e1 where exists\n" + "(select avg(sal) from emp e2 where e1.empno = e2.empno )"; sql(sql).decorrelate(true).ok(); } {code} We got plan: {code:java} LogicalProject(EMPNO=[$0]) LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EMPNO0=[CAST($9):INTEGER], $f1=[CAST($10):BOOLEAN]) LogicalJoin(condition=[=($0, $9)], joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) LogicalAggregate(group=[{0}], agg#0=[MIN($1)]) LogicalProject(EMPNO=[$0], $f0=[true]) LogicalAggregate(group=[{0}], EXPR$0=[AVG($1)]) LogicalProject(EMPNO=[$0], SAL=[$5]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} The preferred plan should be: {code:java} LogicalProject(EMPNO=[$0]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
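The reasoning behind the simplification in CALCITE-2936 is that an aggregate without grouping columns returns exactly one row even over empty input, so EXISTS over it is always true. A minimal sketch in plain Java follows; the class ScalarAggregateExists and its methods are hypothetical stand-ins for illustration, not Calcite code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: EXISTS over an aggregate that has no GROUP BY. */
public class ScalarAggregateExists {
  /**
   * Models "select sum(i) from X" without GROUP BY. The result always has
   * exactly one row; the value is null when X has no non-null inputs.
   */
  static List<Integer> sumWithoutGroupBy(List<Integer> x) {
    Integer sum = null;
    for (Integer i : x) {
      if (i != null) {
        sum = (sum == null) ? i : sum + i;
      }
    }
    return Collections.singletonList(sum);
  }

  /** Models EXISTS: true if the subquery returns at least one row. */
  static boolean exists(List<?> rows) {
    return !rows.isEmpty();
  }

  public static void main(String[] args) {
    // Non-empty and empty inputs both yield one row, so EXISTS holds for both.
    System.out.println(exists(sumWithoutGroupBy(Arrays.asList(1, 2, 3))));
    System.out.println(exists(sumWithoutGroupBy(Collections.<Integer>emptyList())));
  }
}
```

Because the row count does not depend on the input, the EXISTS predicate can be replaced by the constant true (and NOT EXISTS by false) without evaluating the subquery.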
[jira] [Created] (CALCITE-2935) Support ANY, SOME, EVERY aggregate functions
Haisheng Yuan created CALCITE-2935: -- Summary: Support ANY, SOME, EVERY aggregate functions Key: CALCITE-2935 URL: https://issues.apache.org/jira/browse/CALCITE-2935 Project: Calcite Issue Type: New Feature Components: core Reporter: Haisheng Yuan ANY and SOME are equivalent to bool_or. EVERY is equivalent to bool_and. https://blog.jooq.org/2014/12/18/a-true-sql-gem-you-didnt-know-yet-the-every-aggregate-function/ https://mysqlserverteam.com/using-the-aggregate-functions-any-some-every-with-mysql/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
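The equivalence requested in CALCITE-2935 can be sketched in plain Java. This is a minimal illustration, assuming PostgreSQL-style semantics in which NULL inputs are skipped and the result is NULL when there are no non-null inputs. The class BoolAggregates and its methods boolOr and boolAnd are hypothetical, not Calcite API.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of bool_or (ANY/SOME) and bool_and (EVERY) semantics. */
public class BoolAggregates {
  /** ANY/SOME as an aggregate: true if at least one non-null input is true. */
  static Boolean boolOr(List<Boolean> values) {
    Boolean result = null; // assumed: NULL when there are no non-null inputs
    for (Boolean v : values) {
      if (v == null) {
        continue; // assumed: aggregates skip NULL inputs
      }
      result = (result == null) ? v : (result || v);
    }
    return result;
  }

  /** EVERY as an aggregate: true if all non-null inputs are true. */
  static Boolean boolAnd(List<Boolean> values) {
    Boolean result = null;
    for (Boolean v : values) {
      if (v == null) {
        continue;
      }
      result = (result == null) ? v : (result && v);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Boolean> flags = Arrays.asList(true, null, false);
    System.out.println("bool_or  = " + boolOr(flags));
    System.out.println("bool_and = " + boolAnd(flags));
  }
}
```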
[jira] [Created] (CALCITE-2904) Column not found in table when query contain duplicate table alias
Haisheng Yuan created CALCITE-2904: -- Summary: Column not found in table when query contain duplicate table alias Key: CALCITE-2904 URL: https://issues.apache.org/jira/browse/CALCITE-2904 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Repro: In SqlToRelConverterTest.java, add this test. {code:java} @Test public void testDuplicateTableAlias() { final String sql = "select * from (values 4) as t(c) " + "left join lateral " + "(select c,c*a from " + "(values 2) as t(a)" + ") as r(d,c) " + "using(c)"; sql(sql).ok(); } {code} Error message: {code:java} org.apache.calcite.runtime.CalciteContextException: At line 1, column 60: Column 'C' not found in table 'T' {code} The tables with the same alias are at different nesting levels. Postgres can parse this query without any issue. Not sure whether this is expected behavior in Calcite. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2903) Exception thrown when decorrelating streaming query
Haisheng Yuan created CALCITE-2903: -- Summary: Exception thrown when decorrelating streaming query Key: CALCITE-2903 URL: https://issues.apache.org/jira/browse/CALCITE-2903 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Repro: In SqlToRelConverterTest.java, add the following case: {code:java} @Test public void testStreamLateralJoin() { final String sql = "select stream *\n" + "from orders\n" + "cross join lateral (" + " select * from products_temporal for system_time as of orders.rowtime\n" + " where orders.productid = products_temporal.productid)\n"; sql(sql).ok(); } {code} We will see the exception: {code:java} java.lang.RuntimeException: While invoking method 'public org.apache.calcite.sql2rel.RelDecorrelator$Frame org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(org.apache.calcite.rel.RelNode)' at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:527) at org.apache.calcite.sql2rel.RelDecorrelator.getInvoke(RelDecorrelator.java:613) at org.apache.calcite.sql2rel.RelDecorrelator.decorrelate(RelDecorrelator.java:254) at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateQuery(RelDecorrelator.java:214) at org.apache.calcite.sql2rel.SqlToRelConverter.decorrelateQuery(SqlToRelConverter.java:3073) at org.apache.calcite.sql2rel.SqlToRelConverter.decorrelate(SqlToRelConverter.java:483) at org.apache.calcite.test.SqlToRelTestBase$TesterImpl.convertSqlToRel(SqlToRelTestBase.java:616) at org.apache.calcite.test.SqlToRelTestBase$TesterImpl.assertConvertsTo(SqlToRelTestBase.java:723) at org.apache.calcite.test.SqlToRelConverterTest$Sql.convertsTo(SqlToRelConverterTest.java:3096) at org.apache.calcite.test.SqlToRelConverterTest$Sql.ok(SqlToRelConverterTest.java:3088) . 
Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:524) ... 32 more Caused by: java.lang.RuntimeException: While invoking method 'public org.apache.calcite.sql2rel.RelDecorrelator$Frame org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(org.apache.calcite.rel.logical.LogicalProject)' at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:527) at org.apache.calcite.sql2rel.RelDecorrelator.getInvoke(RelDecorrelator.java:613) at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:369) {code} Inside {noformat} public Frame decorrelateRel(LogicalCorrelate rel) {noformat}, the following assertion fails: {code:java} assert rel.getRequiredColumns().cardinality() <= rightFrame.corDefOutputs.keySet().size(); {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2893) Type mismatch assertion error when creating join condition
Haisheng Yuan created CALCITE-2893: -- Summary: Type mismatch assertion error when creating join condition Key: CALCITE-2893 URL: https://issues.apache.org/jira/browse/CALCITE-2893 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan Repro: {code:java} @Test public void testValuesLateralJoin() { final String q1 = "select *\n" + "from (values 2, 5) as t (c)\n" + "left join lateral table(\"s\".\"fibonacci2\"(c)) as R(n) on c=n"; with() .with(CalciteConnectionProperty.CONFORMANCE, SqlConformanceEnum.LENIENT) .query(q1) .returnsUnordered("C=2; N=null", "C=2; N=null", "C=2; N=2", "C=5; N=null", "C=5; N=null", "C=5; N=null", "C=5; N=null", "C=5; N=5"); } {code} Error message: {code:java} java.lang.AssertionError: type mismatch: ref: BIGINT NOT NULL input: BIGINT at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31) at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1832) at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:125) at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:57) at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112) at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:140) at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:57) at org.apache.calcite.rex.RexCall.accept(RexCall.java:190) at org.apache.calcite.rel.core.Filter.isValid(Filter.java:120) at org.apache.calcite.rel.logical.LogicalFilter.<init>(LogicalFilter.java:70) at org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:114) at org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:101) at org.apache.calcite.rel.core.RelFactories$FilterFactoryImpl.createFilter(RelFactories.java:295) at org.apache.calcite.sql2rel.SqlToRelConverter.createJoin(SqlToRelConverter.java:2413) at org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2065) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:644) at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:625) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3080) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:561) {code} Even though the literal is definitely not null, we should still be able to compare the two. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2891) Alias suggester failed to suggest name based on original name
Haisheng Yuan created CALCITE-2891: -- Summary: Alias suggester failed to suggest name based on original name Key: CALCITE-2891 URL: https://issues.apache.org/jira/browse/CALCITE-2891 Project: Calcite Issue Type: Bug Components: core Reporter: Haisheng Yuan In testAliasFilter(), the plan is {code:java} LogicalFilter(condition=[>($1, $2)]) LogicalProject($f1=[20], $f12=[10], DEPTNO=[$7]) LogicalTableScan(table=[[scott, EMP]]) {code} The project alias should be {code:java} LogicalProject($f1=[20], $f2=[10], DEPTNO=[$7]) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
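One way to obtain $f2 rather than $f12 for the second field in CALCITE-2891 is to bump the numeric suffix of the original name instead of appending another digit to it. The following is a minimal sketch under that assumption; the class AliasSuggester and its method suggest are hypothetical and do not reflect how Calcite's suggester is actually implemented.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: suggest a unique alias based on the original name. */
public class AliasSuggester {
  /**
   * Returns the original name if it is unused. Otherwise strips any trailing
   * digits and counts upward from the original number until a free name is found.
   * The chosen name is recorded in the used set.
   */
  static String suggest(String original, Set<String> used) {
    if (used.add(original)) {
      return original;
    }
    int end = original.length();
    int start = end;
    while (start > 0 && Character.isDigit(original.charAt(start - 1))) {
      start--;
    }
    String base = original.substring(0, start);
    // Assumed policy: names without a numeric suffix start counting at 0.
    int next = (start == end) ? 0 : Integer.parseInt(original.substring(start)) + 1;
    while (true) {
      String candidate = base + next++;
      if (used.add(candidate)) {
        return candidate;
      }
    }
  }

  public static void main(String[] args) {
    Set<String> used = new HashSet<>();
    System.out.println(suggest("$f1", used));    // first use keeps the name
    System.out.println(suggest("$f1", used));    // clash: bump the suffix
    System.out.println(suggest("DEPTNO", used)); // unrelated name is unchanged
  }
}
```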