I wanted to figure out whether the transformation that you want is valid. Let’s 
convert your query

  select T1.id, T2.id
  from T1 left join T2 on T1.id = T2.C201
  where test(T1.id)

into one of the same shape:

  select d.deptno, e.empno
  from dept as d left join emp as e on d.deptno = e.deptno
  where is_odd_or_null(d.deptno)

Is this query equivalent to

  select d.deptno, e.empno
  from (select * from dept as d
    where is_odd_or_null(d.deptno)
  ) as d left join emp as e on d.deptno = e.deptno

for all possible is_odd_or_null functions, and without assuming and primary or 
foreign key constraints, and without assuming that any columns are not null?

Yes, I think it is. In particular, we do not need any knowledge of the ’strong’ 
behavior of is_odd_or_null.

Julian



> On Aug 30, 2023, at 10:02 AM, Ian Bertolacci 
> <ian.bertola...@workday.com.INVALID> wrote:
> 
> Hi Benchao,
> Looking at this closer, I think I see what you mean.
> The distinguishing factor here is that the Policy [1] for the EQUALS operator 
> is ANY, which is used in Strong.isNull to check if any of the operands can be 
> null (which is true when above an outer join), which allows 
> RelOptUtil.simplifyJoin to simplify the join into an inner join.
> Then this allows the filter to be pushed down onto what use to be the null 
> producing side of the join.
> In the case of a custom function, the operator is given an AS_IS, which 
> causes Strong.isNull to base the nullability on the expression kind. The node 
> is an OTHER_FUNCTION which is not present in the switch, and defaults to 
> false [3], which does *not* allow the join to be converted to an inner join, 
> which disallows the rule to push down the filters below the join.
> 
> I think that if we were able to define the operator’s Policy (which I guess 
> we are only able to do by defining SqlOperators) then this problem would be 
> solved.
> 
> Thanks!
> -Ian J. Bertolacci
> 
> [1] 
> https://calcite.apache.org/javadocAggregate/org/apache/calcite/plan/Strong.Policy.html
> [2] 
> https://github.com/apache/calcite/blob/2bf1e7b2e8596a61b5ac19aa7bebb9910c4eea94/core/src/main/java/org/apache/calcite/plan/Strong.java#L211
> [3] 
> https://github.com/apache/calcite/blob/2bf1e7b2e8596a61b5ac19aa7bebb9910c4eea94/core/src/main/java/org/apache/calcite/plan/Strong.java#L242
> On 2023/08/30 12:39:24 Benchao Li wrote:
>> Hi Ian,
>> 
>> FilterJoinRule[1] is designed to not push filters to null-generating
>> side of Join, it has nothing to do with your custom UDF.
>> 
>> [1] 
>> https://github.com/apache/calcite/blob/2bf1e7b2e8596a61b5ac19aa7bebb9910c4eea94/core/src/main/java/org/apache/calcite/rel/rules/FilterJoinRule.java#L111-L125
>> 
>> Alessandro Solimando <al...@gmail.com<mailto:al...@gmail.com>> 于2023年8月30日周三 
>> 14:13写道:
>>> 
>>> Hi Ian,
>>> I did not have time to dig more into your question (not very familiar with
>>> the way you register functions), but for RuleEventLogger you might want to
>>> check these slides (9 to 12 for how to activate it, the rest 15 to 23 on
>>> how to read the output):
>>> https://www.slideshare.net/StamatisZampetakis/debugging-planning-issues-using-calcites-builtin-loggers
>>> .
>>> 
>>> Hth,
>>> Alessandro
>>> 
>>> 
>>> 
>>> On Wed, 30 Aug 2023 at 04:14, Ian Bertolacci
>>> <ia...@workday.com.inva<mailto:ia...@workday.com.inva>lid> wrote:
>>> 
>>>> Hello,
>>>> We have defined some extra comparison functions for our users.
>>>> However, we’ve noticed that filter push-down (using `
>>>> CoreRules.FILTER_INTO_JOIN`) does not occur when the operands to the
>>>> functions come from the opposite side of an outer join (i.e. from the right
>>>> side of a left outer join, or from the left side of a right outer join).
>>>> 
>>>> Here is a demonstration of this with SQL, the logical plan, and the
>>>> physical plan post optimization: https://pastebin.com/raw/KjE40z5X
>>>> In this example, I’ve defined a simple function called “test” which takes
>>>> a BigInt and returns Boolean.
>>>> In the first query, the operand to `test` comes from the left side of the
>>>> left join, and the tree before planning has the filter node above the join,
>>>> and after planning the filter is below the join.
>>>> In the second query, the operand to `test` comes from the left side of the
>>>> right join, and the tree before planning has the filter node above the
>>>> join, but after planning the filter is still above the join.
>>>> (I was hoping to get the output from the RuleEventLogger, but I haven’t
>>>> been able to get it working.)
>>>> 
>>>> I figure the push down is not happening because we are not properly
>>>> communicating how these functions handles null (which is to return null,
>>>> which would be false-y, and therefore allow the join to be converted to an
>>>> inner join and the filter pushed down below the join).
>>>> We are not providing these functions as SqlOperator instances in a
>>>> SqlOperatorTable; instead we use `ScalarFunctionImpl.create` on native
>>>> methods, and provide those Function instances through the Schema, via
>>>> `AbstractSchema.getFunctionMultimap` (which eventually make it to the
>>>> CatalogReader)
>>>> I’m guessing that if we were providing these via a SqlOperatorTable (which
>>>> we’re hoping to do sometime soon) that having the return type of the
>>>> operator be nullable, or have the nullability be inferred from the
>>>> operands, would allow the planner to push the filter below the join.
>>>> 
>>>> Am I right?
>>>> In the meantime, is there a way to provide this nullability information
>>>> through `ScalarFunctionImpl.create` pathway we’re currently using?
>>>> 
>>>> Thanks!
>>>> -Ian J. Bertolacci
>>>> 
>> 
>> 
>> 
>> --
>> 
>> Best,
>> Benchao Li
>> 

Reply via email to