[
https://issues.apache.org/jira/browse/CALCITE-457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584655#comment-14584655
]
Jinfeng Ni commented on CALCITE-457:
------------------------------------
Hi Julian,
I modified the code based on your comments. The PR is updated at
https://github.com/apache/incubator-calcite/pull/92. Please take a look.
Here are responses to your comments.
Comment 1 . In misc.oq, a field that was called "DESC" is now called "EXPR$1".
Can you restore the old name?
[jni] I guess you are talking about the JdbcTest.testInnerJoinValues(), where
"DESC" is changed to "EXPR$1" in EXPLAIN.
Actually, the field name change happens only in EXPLAIN PLAN output; the query
result is still using "DESC". I spent quite some time trying to figure out why
the field name is changed in EXPLAIN. I do not have a completely clear idea why
it happened. But seems the cause might be related to the fact that Calcite put
two RelNodes with different rowType under the same RelSet.
The following is from the calcite trace "before" my code change:
Set#1, type: RecordType(INTEGER EXPR$0, CHAR(8) EXPR$1)
rel#11:Subset#1.NONE.[], best=null, importance=0.5904900000000001
rel#1:LogicalValues.NONE.[[0, 1], [1]](type=RecordType(INTEGER EXPR$0,
CHAR(8) EXPR$1),tuples=[{ 10, 'SameName' }]), rowcount=1.0, cumulative
cost={inf}
rel#12:LogicalProject.NONE.[[0, 1],
[1]](input=rel#11:Subset#1.NONE.[],ID=$0,DESC=$1), rowcount=1.0, cumulative
cost={inf}
In the above, the LogicalValues has (EXPR$0, EXPR$1), LogicalProject has (ID,
DESC). Yet Calcite still put them together under the same RelSet. My guess is
CALCITE-457 patch will impact the order of some rules firing, and somehow the
rowType of downstream RelNode is messed up.
On the other hand, I think Calcite's API does not guarantee that the
intermediate RelNode will keep the same field name; the output field name is
ensured to be "DESC" via validator.getValidatedNodeType(). That's why the query
result still has "DESC" in the output. Therefore, I feel it might be ok to have
different field name in the EXPLAIN plan output.
Another comment : this testcase does not have the join condition pushed, simply
because f costing : Values() has one row only. I tried at least with two rows,
and verified the condition will be pushed down.
Comment 2: It's appropriate to always create a LogicalJoin when in
SqlToRelConverter, less so in general-purpose planner rules. Can you use a join
factory and not add method RelOptUtil.createJoin?
[jni]
I tried to add join factory to FilterJoinRule, that seems require more change.
In stead, I chose to passing in originalJoin, and use join.copy() method. That
way, in the general-purpose planner rule or SqlToRelConverter, the new created
Join will be of the same class of original join node.
Comment 3: re-format
[jni] Now I pass in param originalJoin. No need for such format.
Comment 4: join.oq should end with newline
[jni] Done.
Comment 5: Can you add a test for FilterJoinRule's new capability to
RelOptRulesTest.
[jni] testPushJoinCondDownToProject() is Added.
> Non-ansi join should not be processed as a filter on top of "on (true)" join
> ----------------------------------------------------------------------------
>
> Key: CALCITE-457
> URL: https://issues.apache.org/jira/browse/CALCITE-457
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 0.9.1-incubating
> Reporter: Vladimir Sitnikov
> Assignee: Julian Hyde
>
> I've tested two plans and it turns out the query with non-ansi joins has
> extremely bad plan (note {{EnumerableJoinRel(condition=\[true\]}}):
> {code:sql}
> explain plan for select d."deptno", e."empid"
> from "hr"."emps" as e
> , "hr"."depts" as d
> where e."deptno" = d."deptno"+0
> PLAN=EnumerableCalcRel(expr#0..2=[{inputs}], expr#3=[CAST($t1):INTEGER NOT
> NULL], expr#4=[0], expr#5=[+($t2, $t4)], expr#6=[=($t3, $t5)], deptno=[$t2],
> empid=[$t0], $condition=[$t6])
> EnumerableJoinRel(condition=[true], joinType=[inner])
> EnumerableCalcRel(expr#0..4=[{inputs}], proj#0..1=[{exprs}])
> EnumerableTableAccessRel(table=[[hr, emps]])
> EnumerableCalcRel(expr#0..2=[{inputs}], deptno=[$t0])
> EnumerableTableAccessRel(table=[[hr, depts]])
> {code}
> Same works fine with ANSI style:
> {code:sql}
> explain plan for select d."deptno", e."empid"
> from "hr"."emps" as e
> join "hr"."depts" as d
> on (e."deptno" = d."deptno"+0)
> PLAN=EnumerableCalcRel(expr#0..3=[{inputs}], deptno=[$t2], empid=[$t0])
> EnumerableJoinRel(condition=[=($1, $3)], joinType=[inner])
> EnumerableCalcRel(expr#0..4=[{inputs}], expr#5=[CAST($t1):INTEGER NOT
> NULL], empid=[$t0], $f5=[$t5])
> EnumerableTableAccessRel(table=[[hr, emps]])
> EnumerableCalcRel(expr#0..2=[{inputs}], expr#3=[0], expr#4=[+($t0, $t3)],
> deptno=[$t0], $f3=[$t4])
> EnumerableTableAccessRel(table=[[hr, depts]])
> {code}
> The query that does not use calculations works fine even with non-ansi style:
> {code:sql}
> explain plan for select d."deptno", e."empid"
> from "hr"."emps" as e
> , "hr"."depts" as d
> where e."deptno" = d."deptno"
> PLAN=EnumerableCalcRel(expr#0..2=[{inputs}], deptno=[$t2], empid=[$t0])
> EnumerableJoinRel(condition=[=($1, $2)], joinType=[inner])
> EnumerableCalcRel(expr#0..4=[{inputs}], proj#0..1=[{exprs}])
> EnumerableTableAccessRel(table=[[hr, emps]])
> EnumerableCalcRel(expr#0..2=[{inputs}], deptno=[$t0])
> EnumerableTableAccessRel(table=[[hr, depts]])
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)