Yanjing, Generally, the default HEP_PROGRAM will translate Filter and Project into Calc, and later it will be matched via Substitution#CalcToCalcUnifyRule. If you only got FILTER_REDUCE_EXPRESSIONS in the program, then it will be matched via Substitution#TrivialRule.
I think the TrivialRule is only used to handle cases in which we do not have specific Rule for them. And you can see that CalcToCalcUnifyRule is much powerful than TrivialRule, and this is the reason why the test case does not pass in your branch. Hope this helps~ Yanjing Wang <[email protected]> 于2022年6月1日周三 12:04写道: > Thanks Benchao, > > I changed the canonizing HEP_PROGRAM to only FILTER_REDUCE_EXPRESSIONS rule > so that the materialized view can't be applied, the original HEP_PROGRAM is > ok. I don't know if the original HEP_PROGRAM is a must and I expect it > should be applied without the original HEP_PROGRAM in this case. > > Benchao Li <[email protected]> 于2022年6月1日周三 11:53写道: > > > Hi Yanjing, > > > > I know your concerns, and I know the difference between the inferred > types. > > However, my point is that this does not affect the materialization > > substitution. > > > > My branch is: > > > https://github.com/libenchao/calcite/tree/5169-simplification-improvement > > You can try your test case on this branch, or cherry-pick the commit to > > your branch. > > > > Yanjing Wang <[email protected]> 于2022年6月1日周三 11:34写道: > > > > > Benchao, > > > > > > I noticed your Sargs in RelOptRulesTest.xml, you can see > > > LogicalFilter(condition=[SEARCH($1, Sarg[(-∞..'':VARCHAR(20)), > > > ('':VARCHAR(20)..'3':VARCHAR(20)), > ('3':VARCHAR(20)..+∞)]:VARCHAR(20))]) > > > vs > > > LogicalFilter(condition=[SEARCH($1, Sarg[(-∞..'':CHAR(1)), > > > ('':CHAR(1)..'3'), ('3'..+∞)]:CHAR(1))]) > > > > > > the difference between VARCHAR(20) and CHAR(1) is my problem. > > > > > > --Your test case passed on my branch. > > > It's not on my branch merged with master and your pr, did you test > > > in MaterializedViewSubstitutionVisitorTest? What's your branch version? > > > > > > > > > Benchao Li <[email protected]> 于2022年5月31日周二 19:55写道: > > > > > > > Hi Yanjing, > > > > > > > > Your test case passed on my branch. Can you try it with my > improvement > > > > in https://github.com/apache/calcite/pull/2821? > > > > > > > > Yanjing Wang <[email protected]> 于2022年5月31日周二 16:06写道: > > > > > > > > > Hi, Benchao, > > > > > > > > > > The pr is good, and I noticed that the two sarg types is not same. > > > > > > > > > > Add the following test in MaterializedViewSubstitutionVisitorTest, > > you > > > > will > > > > > see the substitution fails and sarg types is not same. > > > > > > > > > > @Test void testFilter2() { > > > > > sql("select * from \"EMP\" where (\"ENAME\" > '' or \"ENAME\" < > > > '') " > > > > > + "and (\"ENAME\" > '3' or \"ENAME\" < '3')", > > > > > "select * from \"EMP\" where \"ENAME\" <> '' and \"ENAME\" > <> > > > > '3'") > > > > > .withDefaultSchemaSpec(CalciteAssert.SchemaSpec.SCOTT) > > > > > .ok(); > > > > > } > > > > > > > > > > Benchao Li <[email protected]> 于2022年5月30日周一 14:47写道: > > > > > > > > > > > Hi Yanjing, > > > > > > > > > > > > The type derivation is different in this case indeed. > > > > > > However, after my fix[1], they both can be optimized to Sarg, > > > > > > and they have the same plan structure. > > > > > > (I've added tests to show this, welcome review) > > > > > > > > > > > > In your case, if you have some different handling logic in later > > > steps, > > > > > > maybe you can share with us. > > > > > > > > > > > > [1] https://github.com/apache/calcite/pull/2821 > > > > > > > > > > > > Yanjing Wang <[email protected]> 于2022年5月30日周一 12:14写道: > > > > > > > > > > > > > Thanks Julian, Benchao, > > > > > > > > > > > > > > This is not only a problem about expression simplification, > also > > > type > > > > > > > consistency in conversion[1]. > > > > > > > As Xiong said in CALCITE-4993 > > > > > > > <https://issues.apache.org/jira/browse/CALCITE-4993> , "EQUALS > > and > > > > > > > NOT-EQUALS comparison. Because they use the same > > LEAST_RESTRICTIVE > > > > > > > strategy to validate the parameter. Other comparators use the > > > COMPARE > > > > > > > strategy." > > > > > > > > > > > > > > This will cause NOT-EQUALS comparison will use different > strategy > > > > from > > > > > > > other compare operator like LESS THAN to generate literal > > operand's > > > > > type. > > > > > > > > > > > > > > In my case, if "name" column has a type varchar(256), "name" > <> > > '' > > > > and > > > > > > > "name" <> '3' will cause '' and '3' have type varchar(256), but > > > > > ("name" > > > > > > > > > > > > > > '' or "name" < '') and ("name" > '3' or "name" < '3') will > cause > > '' > > > > and > > > > > > '3' > > > > > > > has type char(0) and char(1) respectively. > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/calcite/blob/f278efb0411ba29c7bd167f5d02e566bf542acb3/core/src/main/java/org/apache/calcite/sql2rel/StandardConvertletTable.java#L993 > > > > > > > > > > > > > > Benchao Li <[email protected]> 于2022年5月29日周日 15:35写道: > > > > > > > > > > > > > > > I've filed an issue[1] to track this. > > > > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/CALCITE-5169 > > > > > > > > > > > > > > > > Benchao Li <[email protected]> 于2022年5月29日周日 11:19写道: > > > > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > > > I've confirmed it. > > > > > > > > > The reason why different plans for queries: > > > > > > > > > query 1: > > > > > > > > > select * from "emps" where "name" <> '' and "name" <> '3' > > > > > > > > > query 2: > > > > > > > > > select * from "emps" where ("name" > '' or "name" < '') and > > > > > ("name" > > > > > > > > '3' > > > > > > > > > or "name" < '3') > > > > > > > > > is not from the operator consistency. > > > > > > > > > It's just because the expression: ("name" > '' or "name" < > > '') > > > > and > > > > > > > > ("name" > > > > > > > > > > '3' > > > > > > > > > or "name" < '3') cannot be translated into Sarg for now. > > > > > > > > > > > > > > > > > > I'll file a Jira issue to track and improve this. > > > > > > > > > > > > > > > > > > Benchao Li <[email protected]> 于2022年5月28日周六 09:30写道: > > > > > > > > > > > > > > > > > >> FYI, the issue might be this one: > > > > > > > > >> https://issues.apache.org/jira/browse/CALCITE-4993 > > > > > > > > >> > > > > > > > > >> I also looked into this in this direction yesterday, > > however, > > > I > > > > > > didn't > > > > > > > > >> confirm it yet. > > > > > > > > >> That's why I didn't reply to this email before. I will do > > > > further > > > > > > > > >> verifications and > > > > > > > > >> post the result here later. > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> Julian Hyde <[email protected]> 于2022年5月27日周五 > 23:52写道: > > > > > > > > >> > > > > > > > > >>> I think there’s a JIRA case for this. The implicit casts > > > > prevent > > > > > > SARG > > > > > > > > >>> simplification from kicking in. In SARG representation > the > > > > > > > expressions > > > > > > > > >>> would be the same. Which is why we love SARGs. > > > > > > > > >>> > > > > > > > > >>> Julian > > > > > > > > >>> > > > > > > > > >>> > On May 26, 2022, at 17:49, Yanjing Wang < > > > > > > [email protected] > > > > > > > > > > > > > > > > >>> wrote: > > > > > > > > >>> > > > > > > > > > >>> > Hi community, > > > > > > > > >>> > > > > > > > > > >>> > I have this sql: select * from "emps" where "name" <> > '' > > > and > > > > > > "name" > > > > > > > > <> > > > > > > > > >>> '3' > > > > > > > > >>> > > > > > > > > > >>> > I thought it would generate the same plan with > > > > > > > > >>> > select * from "emps" where ("name" > '' or "name" < '') > > and > > > > > > > ("name" > > > > > > > > > >>> '3' > > > > > > > > >>> > or "name" < '3') > > > > > > > > >>> > > > > > > > > > >>> > but not, the not equal operator consistency is > different > > > with > > > > > > less > > > > > > > > >>> than and > > > > > > > > >>> > greater than operator, > > > > > > > > >>> > > > > > > > > > >>> > which will cause the literal '' and '3' have different > > data > > > > > type > > > > > > in > > > > > > > > >>> plans > > > > > > > > >>> > of the above sqls. > > > > > > > > >>> > > > > > > > > > >>> > That behavior maybe cause some queries will not hit the > > > > > > > > >>> materialization. > > > > > > > > >>> > > > > > > > > > >>> > should we canonize ("name" > '' or "name" < '') and > > > ("name" > > > > > > > > > > > '3' > > > > > > > or > > > > > > > > >>> > "name" < '3') to not equal or vice versa as RelToSql > > > > > > > > >>> > < > > > > > > > > >>> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/calcite/blob/9bdfd9a178f493b235d8785afd94fd0c998e8cce/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L870 > > > > > > > > >>> > > > > > > > > > >>> > behaves? > > > > > > > > >>> > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> -- > > > > > > > > >> > > > > > > > > >> Best, > > > > > > > > >> Benchao Li > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > Best, > > > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > Best, > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Best, > > > > Benchao Li > > > > > > > > > > > > > -- > > > > Best, > > Benchao Li > > > -- Best, Benchao Li
