[
https://issues.apache.org/jira/browse/CALCITE-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705784#comment-14705784
]
Jinfeng Ni commented on CALCITE-850:
------------------------------------
I looked at the code change. I think it makes sense to move the logic of
pushing join expressions into a separate rule, and it's up to each system to
decide whether to turn such a rule on or off in its planner. The code change
looks fine to me (one minor comment).
I'm a bit surprised that pushing expressions into a project below the join
caused a performance regression in Hive, though. I guess there are two
scenarios in which such a push-down would cause performance overhead:
1) The join condition itself does little or no filtering. As such, it does
not matter much whether the filter is applied in the join operator or in a
filter operator after the join.
2) The join condition evaluation applies short-circuit evaluation. As such,
it might be possible to skip some expensive expressions. In contrast, if we
push down the expressions, we will end up always evaluating every expression.
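To make scenario 2 concrete, here is a minimal sketch in plain Python. It is
not Calcite or Hive code; every name in it (CallCounter, join_inline,
join_pushed, run) is hypothetical, and rows are simply (key, value) tuples. It
counts how often an "expensive" expression is evaluated when the condition
`l.key = r.key AND f(l.val) = f(r.val)` is evaluated inside the join with
short-circuit AND, versus when f is computed up front in a project below the
join.

```python
class CallCounter:
    """Stand-in for an expensive expression; counts how often it is evaluated."""

    def __init__(self):
        self.calls = 0

    def __call__(self, v):
        self.calls += 1
        return (v * v) % 7


def join_inline(left, right, expr):
    """Evaluate the whole condition in the join; `and` short-circuits,
    so expr runs only for pairs whose keys already match."""
    return [(l, r) for l in left for r in right
            if l[0] == r[0] and expr(l[1]) == expr(r[1])]


def join_pushed(left, right, expr):
    """Compute expr once per input row in a 'project' below the join,
    then join on the precomputed columns."""
    lp = [(k, v, expr(v)) for k, v in left]
    rp = [(k, v, expr(v)) for k, v in right]
    return [((lk, lv), (rk, rv))
            for lk, lv, le in lp for rk, rv, re_ in rp
            if lk == rk and le == re_]


def run(join, left, right):
    """Run one join variant and return (result rows, number of expr calls)."""
    counter = CallCounter()
    rows = join(left, right, counter)
    return rows, counter.calls


left = [(1, 10), (2, 20), (3, 30)]
disjoint = [(7, 10), (8, 20), (9, 30)]   # no key on this side matches `left`
same_key = [(1, 10), (1, 20), (1, 30)]   # every key matches every other key

print(run(join_inline, left, disjoint)[1])      # calls when keys never match
print(run(join_pushed, left, disjoint)[1])      # calls with precomputation
print(run(join_inline, same_key, same_key)[1])  # calls when all keys match
print(run(join_pushed, same_key, same_key)[1])  # calls with precomputation
```

In this toy setup, the inline form evaluates the expression zero times when
no keys match, while the pushed-down form always evaluates it once per input
row. When every key matches, the balance flips and the pushed-down form does
less work. Which variant wins depends on the data.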
I guess such scenarios should probably be reflected in the costing; it's up
to the costing to decide which way to go, while the rule's job is to enumerate
the possible choices.
Also, if the query's join is ANSI SQL style, with the join condition in the
"ON" clause, then Calcite will always do such a push-down in
SqlToRelConverter, before the optimization phase kicks in.
> Remove push down expressions from FilterJoinRule and create a new rule for it
> -----------------------------------------------------------------------------
>
> Key: CALCITE-850
> URL: https://issues.apache.org/jira/browse/CALCITE-850
> Project: Calcite
> Issue Type: Bug
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
>
> CALCITE-457 added pushing expressions in join conditions into projects below
> the join in the FilterJoinRule, so the expression would be computed
> beforehand and not in the join predicate.
> While this can be an interesting feature for some projects using Calcite, it
> is a different functionality and it should be a standalone independent rule.
> For instance, in Hive we do not want to enable it at the moment, as it causes
> some performance regressions in many test cases.