I wanted to figure out whether the transformation that you want is valid. Let’s convert your query
select T1.id, T2.id from T1 left join T2 on T1.id = T2.C201 where test(T1.id) into one of the same shape: select d.deptno, e.empno from dept as d left join emp as e on d.deptno = e.deptno where is_odd_or_null(d.deptno) Is this query equivalent to select d.deptno, e.empno from (select * from dept as d where is_odd_or_null(d.deptno) ) as d left join emp as e on d.deptno = e.deptno for all possible is_odd_or_null functions, and without assuming and primary or foreign key constraints, and without assuming that any columns are not null? Yes, I think it is. In particular, we do not need any knowledge of the ’strong’ behavior of is_odd_or_null. Julian > On Aug 30, 2023, at 10:02 AM, Ian Bertolacci > <ian.bertola...@workday.com.INVALID> wrote: > > Hi Benchao, > Looking at this closer, I think I see what you mean. > The distinguishing factor here is that the Policy [1] for the EQUALS operator > is ANY, which is used in Strong.isNull to check if any of the operands can be > null (which is true when above an outer join), which allows > RelOptUtil.simplifyJoin to simplify the join into an inner join. > Then this allows the filter to be pushed down onto what use to be the null > producing side of the join. > In the case of a custom function, the operator is given an AS_IS, which > causes Strong.isNull to base the nullability on the expression kind. The node > is an OTHER_FUNCTION which is not present in the switch, and defaults to > false [3], which does *not* allow the join to be converted to an inner join, > which disallows the rule to push down the filters below the join. > > I think that if we were able to define the operator’s Policy (which I guess > we are only able to do by defining SqlOperators) then this problem would be > solved. > > Thanks! > -Ian J. Bertolacci > > [1] > https://calcite.apache.org/javadocAggregate/org/apache/calcite/plan/Strong.Policy.html > [2] > https://github.com/apache/calcite/blob/2bf1e7b2e8596a61b5ac19aa7bebb9910c4eea94/core/src/main/java/org/apache/calcite/plan/Strong.java#L211 > [3] > https://github.com/apache/calcite/blob/2bf1e7b2e8596a61b5ac19aa7bebb9910c4eea94/core/src/main/java/org/apache/calcite/plan/Strong.java#L242 > On 2023/08/30 12:39:24 Benchao Li wrote: >> Hi Ian, >> >> FilterJoinRule[1] is designed to not push filters to null-generating >> side of Join, it has nothing to do with your custom UDF. >> >> [1] >> https://github.com/apache/calcite/blob/2bf1e7b2e8596a61b5ac19aa7bebb9910c4eea94/core/src/main/java/org/apache/calcite/rel/rules/FilterJoinRule.java#L111-L125 >> >> Alessandro Solimando <al...@gmail.com<mailto:al...@gmail.com>> 于2023年8月30日周三 >> 14:13写道: >>> >>> Hi Ian, >>> I did not have time to dig more into your question (not very familiar with >>> the way you register functions), but for RuleEventLogger you might want to >>> check these slides (9 to 12 for how to activate it, the rest 15 to 23 on >>> how to read the output): >>> https://www.slideshare.net/StamatisZampetakis/debugging-planning-issues-using-calcites-builtin-loggers >>> . >>> >>> Hth, >>> Alessandro >>> >>> >>> >>> On Wed, 30 Aug 2023 at 04:14, Ian Bertolacci >>> <ia...@workday.com.inva<mailto:ia...@workday.com.inva>lid> wrote: >>> >>>> Hello, >>>> We have defined some extra comparison functions for our users. >>>> However, we’ve noticed that filter push-down (using ` >>>> CoreRules.FILTER_INTO_JOIN`) does not occur when the operands to the >>>> functions come from the opposite side of an outer join (i.e. from the right >>>> side of a left outer join, or from the left side of a right outer join). >>>> >>>> Here is a demonstration of this with SQL, the logical plan, and the >>>> physical plan post optimization: https://pastebin.com/raw/KjE40z5X >>>> In this example, I’ve defined a simple function called “test” which takes >>>> a BigInt and returns Boolean. >>>> In the first query, the operand to `test` comes from the left side of the >>>> left join, and the tree before planning has the filter node above the join, >>>> and after planning the filter is below the join. >>>> In the second query, the operand to `test` comes from the left side of the >>>> right join, and the tree before planning has the filter node above the >>>> join, but after planning the filter is still above the join. >>>> (I was hoping to get the output from the RuleEventLogger, but I haven’t >>>> been able to get it working.) >>>> >>>> I figure the push down is not happening because we are not properly >>>> communicating how these functions handles null (which is to return null, >>>> which would be false-y, and therefore allow the join to be converted to an >>>> inner join and the filter pushed down below the join). >>>> We are not providing these functions as SqlOperator instances in a >>>> SqlOperatorTable; instead we use `ScalarFunctionImpl.create` on native >>>> methods, and provide those Function instances through the Schema, via >>>> `AbstractSchema.getFunctionMultimap` (which eventually make it to the >>>> CatalogReader) >>>> I’m guessing that if we were providing these via a SqlOperatorTable (which >>>> we’re hoping to do sometime soon) that having the return type of the >>>> operator be nullable, or have the nullability be inferred from the >>>> operands, would allow the planner to push the filter below the join. >>>> >>>> Am I right? >>>> In the meantime, is there a way to provide this nullability information >>>> through `ScalarFunctionImpl.create` pathway we’re currently using? >>>> >>>> Thanks! >>>> -Ian J. Bertolacci >>>> >> >> >> >> -- >> >> Best, >> Benchao Li >>