[ 
https://issues.apache.org/jira/browse/DRILL-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni reassigned DRILL-1150:
---------------------------------

    Assignee: DrillCommitter  (was: Jinfeng Ni)

> Sub-optimal expression pushdown for slightly modified version of Tpch 19
> ------------------------------------------------------------------------
>
>                 Key: DRILL-1150
>                 URL: https://issues.apache.org/jira/browse/DRILL-1150
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Aman Sinha
>            Assignee: DrillCommitter
>
> A slightly modified version of TPC-H query 19, called 19_1 in the 
> TestTpchDistributed JUnit test suite, produces the following plan on the 
> latest master (commit 699851b). The plan shows several expressions pushed 
> into the Project just above the Lineitem scan. Ideally these expressions 
> would be evaluated after the join, since there is no need to evaluate them 
> for a row that does not satisfy the join condition. Also note that there are 
> 2 Projects above the Part scan (00-06 and 00-08); these should have been 
> merged into one. 
> 00-00    Screen
> 00-01      StreamAgg(group=[{}], revenue=[SUM($0)])
> 00-02        Project($f0=[*($2, -(1, $3))])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[OR(AND(=($15, 'Brand#41'), OR(=($14, 'SM 
> CASE'), =($14, 'SM BOX'), =($14, 'SM PACK'), =($14, 'SM PKG')), $4, $5, $16, 
> $17, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($6, 'DELIVER IN PERSON')), 
> AND(=($18, 'Brand#13'), OR(=($14, 'MED BAG'), =($14, 'MED BOX'), =($14, 'MED 
> PKG'), =($14, 'MED PACK')), $7, $8, $19, $20, OR(=($0, 'AIR'), =($0, 'AIR 
> REG')), =($9, 'DELIVER IN PERSON')), AND(=($21, 'Brand#55'), OR(=($14, 'LG 
> CASE'), =($14, 'LG BOX'), =($14, 'LG PACK'), =($14, 'LG PKG')), $10, $11, 
> $22, $23, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($12, 'DELIVER IN PERSON')))])
> 00-05              HashJoin(condition=[=($1, $13)], joinType=[inner])
> 00-07                Project(l_shipmode=[$5], l_partkey=[$4], 
> l_extendedprice=[$3], l_discount=[$1], $f7=[>=($2, 2)], $f8=[<=($2, +(2, 
> 10))], $f9=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE 
> "ISO-8859-1$en_US$primary"], $f10=[>=($2, 14)], $f11=[<=($2, +(14, 10))], 
> $f12=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE 
> "ISO-8859-1$en_US$primary"], $f13=[>=($2, 23)], $f14=[<=($2, +(23, 10))], 
> $f15=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE 
> "ISO-8859-1$en_US$primary"])
> 00-09                  ProducerConsumer
> 00-11                    Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=/tpch/lineitem.parquet]], 
> selectionRoot=/tpch/lineitem.parquet, columns=[SchemaPath [`l_shipmode`], 
> SchemaPath [`l_partkey`], SchemaPath [`l_extendedprice`], SchemaPath 
> [`l_discount`], SchemaPath [`l_quantity`], SchemaPath [`l_shipinstruct`]]]])
> 00-06                Project(p_partkey=[$0], p_container=[$1], $f5=[$2], 
> $f6=[$3], $f70=[$4], $f80=[$5], $f90=[$6], $f100=[$7], $f110=[$8], 
> $f120=[$9], $f130=[$10])
> 00-08                  Project(p_partkey=[$2], p_container=[$3], 
> $f5=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE 
> "ISO-8859-1$en_US$primary"], $f6=[>=($0, 1)], $f7=[<=($0, 5)], 
> $f8=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE 
> "ISO-8859-1$en_US$primary"], $f9=[>=($0, 1)], $f10=[<=($0, 10)], 
> $f11=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE 
> "ISO-8859-1$en_US$primary"], $f12=[>=($0, 1)], $f13=[<=($0, 15)])
> 00-10                    ProducerConsumer
> 00-12                      Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=/tpch/part.parquet]], 
> selectionRoot=/tpch/part.parquet, columns=[SchemaPath [`p_partkey`], 
> SchemaPath [`p_container`], SchemaPath [`p_brand`], SchemaPath [`p_size`]]]])
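
The cost argument in the description can be illustrated with a small sketch. This is not Drill code: the rows, the `hash_join` helper, and the `CountingPredicate` wrapper are made up for illustration, and the predicate only loosely mirrors the `l_quantity` range checks in the plan. It counts how often an expression is evaluated when it is computed in a Project below the join versus after the join.

```python
from collections import defaultdict


def hash_join(left, right, left_key, right_key):
    """Inner hash join: build a hash table on `right`, probe it with `left`."""
    table = defaultdict(list)
    for r in right:
        table[r[right_key]].append(r)
    for l in left:
        for r in table.get(l[left_key], []):
            yield {**l, **r}


class CountingPredicate:
    """Wraps a predicate and counts how many times it is evaluated."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, row):
        self.calls += 1
        return self.fn(row)


# Hypothetical toy data: only l_partkey = 1 has a matching part row.
lineitem = [
    {"l_partkey": k, "l_quantity": q}
    for k, q in [(1, 3), (2, 20), (3, 5), (4, 30), (1, 40)]
]
part = [{"p_partkey": 1, "p_brand": "Brand#41"}]


def in_range(row):
    # Loosely modeled on the l_quantity >= 2 AND l_quantity <= 2 + 10 checks.
    return 2 <= row["l_quantity"] <= 12


# Plan A: the expression is computed in a Project below the join,
# so it runs once per lineitem row, whether or not the row joins.
below = CountingPredicate(in_range)
projected = [{**row, "f": below(row)} for row in lineitem]
result_a = [r for r in hash_join(projected, part, "l_partkey", "p_partkey") if r["f"]]

# Plan B: the expression is evaluated after the join,
# so it runs only on rows that survived the join.
above = CountingPredicate(in_range)
result_b = [r for r in hash_join(lineitem, part, "l_partkey", "p_partkey") if above(r)]

print("evaluations below join:", below.calls)
print("evaluations above join:", above.calls)
```

Both variants return the same qualifying rows; the only difference is how many times the expression is evaluated. The gap grows as the join becomes more selective.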



--
This message was sent by Atlassian JIRA
(v6.2#6252)
