Probably RelRule should not change nullability. But we have done so in the past, and I suspect we continue to do so today. It would be interesting to find a list of rules that do this.
Deducing that a field is never null is very useful. With that knowledge, further optimizations become possible. So, we must continue to do it. But we also have to abide by the contract that we don't change row type (except for field names). How to square that circle? I think we should keep the columns' type as nullable, but generate predicates (see RelMdPredicates, RelOptPredicateList) that indicate that the field is never null. We already use 'not null' predicates during the simplify process. If we are simplifying 'x is null or x > 5', by the time we process the 'x > 5' factor we will have deduced that x is not null. Predicates are good, and we're making ever more use of them these days. Similar issues could occur with other type constraints. Imagine simplifying select 'abc' from t1 where false union all select 'xy' from t2 We could tighten the type from CHAR(3) to CHAR(2) when we notice that no rows will come from t1. But should we? Julian On Sun, Nov 8, 2020 at 10:46 AM Vladimir Sitnikov <sitnikov.vladi...@gmail.com> wrote: > > Hi, > > Can anybody clarify if RelRule should be allowed to change RelNode field > nullability? > > For instance, RelOptRulesTest#testReduceNullableToNotNull has the following > SQL: > ...empno + *case* when 'a' = 'a' then 1 else *null end* as newcol... > > Of course, the initial type is nullable int (since the last CASE branch is > null), however, > you might see that the branch is never taken. > > A year or so ago the test ensured that the simplified output was +($0, > CAST(1):INTEGER, > however CALCITE-2852 RexNode simplification does not traverse unknown > functions (see [1]) > <https://github.com/apache/calcite/pull/1053/files#diff-bc6d90f2392477eb02e40d8bc26ad2d8b0d6bd6dd2714ea0c2b94b3043ff9a8eR6952> > altered the behavior, so the actual test output became NEWCOL=[+($0, 1)] > which means the rule was allowed > to modify row type. > > It is strange if a rule could replace a RelNode with a different type, so I > would expect all transformations to be type-preserving. > Should those transformations be allowed? > Should the planner deny such transformations? > Should the planner add "cast to nullable" in case the original field was > nullable? > > [1]: > https://github.com/apache/calcite/pull/1053/files#diff-bc6d90f2392477eb02e40d8bc26ad2d8b0d6bd6dd2714ea0c2b94b3043ff9a8eR6952 > > Vladimir