hi Huang
    1. Reduce the interleaved project-filter to one single 
project-filter or filter-project,i always use `Calc Operator` to optimize 
`project-filter` or `filter-project`, Using the `Calc` operator, 
relational algebra can be simplified,This relational expression combines the 
functionality of `Project` and `Filter`.In Calcite, you can use 
`FilterToCalcRule` and `ProjectToCalcRule`, `CalcMergeRule` may also be 
used. These rules can be used by `HepPlanner`??
    2. I'm not sure if you want to remove `CAST` from 
RexNode,Maybe you can refer 
`org.apache.calcite.rex.RexUtil#removeCast`. In the RelNode, we can 
implement a `RelOptRule` to match the operator, and then remove the cast 
operator.you can refer `RelOptRules` in 
`org.apache.calcite.rel.rules.CoreRules`.In addition, you can use `RelShuttle` 
to rewrite RelNode, both of them can remove the `cast operator`.
    I hope it can help you.


 Best
   Xzh


















------------------ Original ------------------
From:                                                                           
                                             "dev"                              
                                                      
<yohu...@nvidia.com.INVALID&gt;;
Date:&nbsp;Mon, Sep 27, 2021 10:38 AM
To:&nbsp;"dev@calcite.apache.org"<dev@calcite.apache.org&gt;;

Subject:&nbsp;Need help: Optimization rule to reduce 
project-filter-project-filter on one table



Hi Calcite developers,

I just started learning Calcite, and tried Calcite optimization on a fake SQL 
as shown in below table.

I see that generated Logical plan has interleaved Project-Filter-Project-Filter 
chain on one table.

I need some help on 2 questions:

&nbsp; 1.&nbsp; Does Calcite have any rule to reduce the interleaved 
project-filter to one single project-filter or filter-project?


&nbsp; 1.&nbsp; How can I remove the CAST operation in the SQL statement 
generated from RelNode?


Thanks,
Justin


# Input fake SQL statement

select
&nbsp;&nbsp; *
from
&nbsp;&nbsp; schema_csv.table_csv
&nbsp;&nbsp; left join
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; select
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; BugId,
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ARB,
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 BugId + BugId
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; )
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; as BugId2
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; from
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 select
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 *
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 from
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 schema_bug.table_bug
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 where
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 BugId = 100
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 and ARB = \'abc\'
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; )
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; where
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; BugId &gt; 
200741801
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; AND BugId < 
(200751927 - 2)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; )
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; as MyBug
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; on schema_csv.table_csv.BugId = MyBug.BugId


# Generated Logical plan

LogicalProject(name=[$0], age=[$1], position=[$2], BugId=[$3], BugId0=[$4], 
ARB=[$5], BugId2=[$6])
&nbsp; LogicalJoin(condition=[=($3, $4)], joinType=[left])
&nbsp;&nbsp;&nbsp; LogicalTableScan(table=[[schema_csv, table_csv]])
&nbsp;&nbsp;&nbsp; LogicalProject(BugId=[$0], 
ARB=[CAST('abc':VARCHAR):VARCHAR], BugId2=[+($0, $0)])
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; LogicalFilter(condition=[AND(&gt;($0, 
200741801), <($0, -(200751927, 2)))])
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
LogicalProject(BugId=[CAST(100:BIGINT):BIGINT], 
ARB=[CAST('abc':VARCHAR):VARCHAR])
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
LogicalFilter(condition=[AND(=($0, 100), =($1, 'abc'))])
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
LogicalTableScan(table=[[schema_bug, table_bug]])


# Generated sql (HiveSqlDialect) from the LogicalProject input of LogicalJoin

SELECT BugId, CAST('abc' AS VARCHAR) ARB, BugId + BugId BugId2 FROM (SELECT 
CAST(100 AS BIGINT) BugId, CAST('abc' AS VARCHAR) ARB FROM table_bug WHERE 
BugId = 100 AND ARB = 'abc') t0 WHERE BugId &gt; 200741801 AND BugId < 
200751927 - 2

Reply via email to