Hi Benchao,
Looking at this closer, I think I see what you mean.
The distinguishing factor here is that the Policy [1] for the EQUALS operator 
is ANY, which is used in Strong.isNull to check if any of the operands can be 
null (which is true when above an outer join), which allows 
RelOptUtil.simplifyJoin to simplify the join into an inner join.
Then this allows the filter to be pushed down onto what use to be the null 
producing side of the join.
In the case of a custom function, the operator is given an AS_IS, which causes 
Strong.isNull to base the nullability on the expression kind. The node is an 
OTHER_FUNCTION which is not present in the switch, and defaults to false [3], 
which does *not* allow the join to be converted to an inner join, which 
disallows the rule to push down the filters below the join.

I think that if we were able to define the operator’s Policy (which I guess we 
are only able to do by defining SqlOperators) then this problem would be solved.

Thanks!
-Ian J. Bertolacci

[1] 
https://calcite.apache.org/javadocAggregate/org/apache/calcite/plan/Strong.Policy.html
[2] 
https://github.com/apache/calcite/blob/2bf1e7b2e8596a61b5ac19aa7bebb9910c4eea94/core/src/main/java/org/apache/calcite/plan/Strong.java#L211
[3] 
https://github.com/apache/calcite/blob/2bf1e7b2e8596a61b5ac19aa7bebb9910c4eea94/core/src/main/java/org/apache/calcite/plan/Strong.java#L242
On 2023/08/30 12:39:24 Benchao Li wrote:
> Hi Ian,
>
> FilterJoinRule[1] is designed to not push filters to null-generating
> side of Join, it has nothing to do with your custom UDF.
>
> [1] 
> https://github.com/apache/calcite/blob/2bf1e7b2e8596a61b5ac19aa7bebb9910c4eea94/core/src/main/java/org/apache/calcite/rel/rules/FilterJoinRule.java#L111-L125
>
> Alessandro Solimando <al...@gmail.com<mailto:al...@gmail.com>> 于2023年8月30日周三 
> 14:13写道:
> >
> > Hi Ian,
> > I did not have time to dig more into your question (not very familiar with
> > the way you register functions), but for RuleEventLogger you might want to
> > check these slides (9 to 12 for how to activate it, the rest 15 to 23 on
> > how to read the output):
> > https://www.slideshare.net/StamatisZampetakis/debugging-planning-issues-using-calcites-builtin-loggers
> > .
> >
> > Hth,
> > Alessandro
> >
> >
> >
> > On Wed, 30 Aug 2023 at 04:14, Ian Bertolacci
> > <ia...@workday.com.inva<mailto:ia...@workday.com.inva>lid> wrote:
> >
> > > Hello,
> > > We have defined some extra comparison functions for our users.
> > > However, we’ve noticed that filter push-down (using `
> > > CoreRules.FILTER_INTO_JOIN`) does not occur when the operands to the
> > > functions come from the opposite side of an outer join (i.e. from the 
> > > right
> > > side of a left outer join, or from the left side of a right outer join).
> > >
> > > Here is a demonstration of this with SQL, the logical plan, and the
> > > physical plan post optimization: https://pastebin.com/raw/KjE40z5X
> > > In this example, I’ve defined a simple function called “test” which takes
> > > a BigInt and returns Boolean.
> > > In the first query, the operand to `test` comes from the left side of the
> > > left join, and the tree before planning has the filter node above the 
> > > join,
> > > and after planning the filter is below the join.
> > > In the second query, the operand to `test` comes from the left side of the
> > > right join, and the tree before planning has the filter node above the
> > > join, but after planning the filter is still above the join.
> > > (I was hoping to get the output from the RuleEventLogger, but I haven’t
> > > been able to get it working.)
> > >
> > > I figure the push down is not happening because we are not properly
> > > communicating how these functions handles null (which is to return null,
> > > which would be false-y, and therefore allow the join to be converted to an
> > > inner join and the filter pushed down below the join).
> > > We are not providing these functions as SqlOperator instances in a
> > > SqlOperatorTable; instead we use `ScalarFunctionImpl.create` on native
> > > methods, and provide those Function instances through the Schema, via
> > > `AbstractSchema.getFunctionMultimap` (which eventually make it to the
> > > CatalogReader)
> > > I’m guessing that if we were providing these via a SqlOperatorTable (which
> > > we’re hoping to do sometime soon) that having the return type of the
> > > operator be nullable, or have the nullability be inferred from the
> > > operands, would allow the planner to push the filter below the join.
> > >
> > > Am I right?
> > > In the meantime, is there a way to provide this nullability information
> > > through `ScalarFunctionImpl.create` pathway we’re currently using?
> > >
> > > Thanks!
> > > -Ian J. Bertolacci
> > >
>
>
>
> --
>
> Best,
> Benchao Li
>

Reply via email to