hi Huang 1. Reduce the interleaved project-filter to one single project-filter or filter-project,i always use `Calc Operator` to optimize `project-filter` or `filter-project`, Using the `Calc` operator, relational algebra can be simplified,This relational expression combines the functionality of `Project` and `Filter`.In Calcite, you can use `FilterToCalcRule` and `ProjectToCalcRule`, `CalcMergeRule` may also be used. These rules can be used by `HepPlanner`?? 2. I'm not sure if you want to remove `CAST` from RexNode,Maybe you can refer `org.apache.calcite.rex.RexUtil#removeCast`. In the RelNode, we can implement a `RelOptRule` to match the operator, and then remove the cast operator.you can refer `RelOptRules` in `org.apache.calcite.rel.rules.CoreRules`.In addition, you can use `RelShuttle` to rewrite RelNode, both of them can remove the `cast operator`. I hope it can help you.
Best Xzh ------------------ Original ------------------ From: "dev" <yohu...@nvidia.com.INVALID>; Date: Mon, Sep 27, 2021 10:38 AM To: "dev@calcite.apache.org"<dev@calcite.apache.org>; Subject: Need help: Optimization rule to reduce project-filter-project-filter on one table Hi Calcite developers, I just started learning Calcite, and tried Calcite optimization on a fake SQL as shown in below table. I see that generated Logical plan has interleaved Project-Filter-Project-Filter chain on one table. I need some help on 2 questions: 1. Does Calcite have any rule to reduce the interleaved project-filter to one single project-filter or filter-project? 1. How can I remove the CAST operation in the SQL statement generated from RelNode? Thanks, Justin # Input fake SQL statement select * from schema_csv.table_csv left join ( select BugId, ARB, ( BugId + BugId ) as BugId2 from ( select * from schema_bug.table_bug where BugId = 100 and ARB = \'abc\' ) where BugId > 200741801 AND BugId < (200751927 - 2) ) as MyBug on schema_csv.table_csv.BugId = MyBug.BugId # Generated Logical plan LogicalProject(name=[$0], age=[$1], position=[$2], BugId=[$3], BugId0=[$4], ARB=[$5], BugId2=[$6]) LogicalJoin(condition=[=($3, $4)], joinType=[left]) LogicalTableScan(table=[[schema_csv, table_csv]]) LogicalProject(BugId=[$0], ARB=[CAST('abc':VARCHAR):VARCHAR], BugId2=[+($0, $0)]) LogicalFilter(condition=[AND(>($0, 200741801), <($0, -(200751927, 2)))]) LogicalProject(BugId=[CAST(100:BIGINT):BIGINT], ARB=[CAST('abc':VARCHAR):VARCHAR]) LogicalFilter(condition=[AND(=($0, 100), =($1, 'abc'))]) LogicalTableScan(table=[[schema_bug, table_bug]]) # Generated sql (HiveSqlDialect) from the LogicalProject input of LogicalJoin SELECT BugId, CAST('abc' AS VARCHAR) ARB, BugId + BugId BugId2 FROM (SELECT CAST(100 AS BIGINT) BugId, CAST('abc' AS VARCHAR) ARB FROM table_bug WHERE BugId = 100 AND ARB = 'abc') t0 WHERE BugId > 200741801 AND BugId < 200751927 - 2