Aman Sinha created DRILL-1150:
---------------------------------

             Summary: Sub-optimal expression pushdown for slightly modified version of Tpch 19
                 Key: DRILL-1150
                 URL: https://issues.apache.org/jira/browse/DRILL-1150
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Aman Sinha
            Assignee: Jinfeng Ni


A slightly modified version of TPCH 19, called 19_1 in the TestTpchDistributed 
JUnit test suite, produces the following plan on the latest master (commit 
699851b). The plan shows several expressions pushed into the Project just above 
the Lineitem scan, whereas these expressions should ideally be evaluated after 
the join, since there is no need to evaluate an expression for a row that does 
not qualify for the join. Also notice that there are 2 Projects above the Part 
scan (00-06 and 00-08); these should have been merged into one. 


00-00    Screen
00-01      StreamAgg(group=[{}], revenue=[SUM($0)])
00-02        Project($f0=[*($2, -(1, $3))])
00-03          SelectionVectorRemover
00-04            Filter(condition=[OR(AND(=($15, 'Brand#41'), OR(=($14, 'SM CASE'), =($14, 'SM BOX'), =($14, 'SM PACK'), =($14, 'SM PKG')), $4, $5, $16, $17, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($6, 'DELIVER IN PERSON')), AND(=($18, 'Brand#13'), OR(=($14, 'MED BAG'), =($14, 'MED BOX'), =($14, 'MED PKG'), =($14, 'MED PACK')), $7, $8, $19, $20, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($9, 'DELIVER IN PERSON')), AND(=($21, 'Brand#55'), OR(=($14, 'LG CASE'), =($14, 'LG BOX'), =($14, 'LG PACK'), =($14, 'LG PKG')), $10, $11, $22, $23, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($12, 'DELIVER IN PERSON')))])
00-05              HashJoin(condition=[=($1, $13)], joinType=[inner])
00-07                Project(l_shipmode=[$5], l_partkey=[$4], l_extendedprice=[$3], l_discount=[$1], $f7=[>=($2, 2)], $f8=[<=($2, +(2, 10))], $f9=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f10=[>=($2, 14)], $f11=[<=($2, +(14, 10))], $f12=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f13=[>=($2, 23)], $f14=[<=($2, +(23, 10))], $f15=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"])
00-09                  ProducerConsumer
00-11                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/lineitem.parquet]], selectionRoot=/tpch/lineitem.parquet, columns=[SchemaPath [`l_shipmode`], SchemaPath [`l_partkey`], SchemaPath [`l_extendedprice`], SchemaPath [`l_discount`], SchemaPath [`l_quantity`], SchemaPath [`l_shipinstruct`]]]])
00-06                Project(p_partkey=[$0], p_container=[$1], $f5=[$2], $f6=[$3], $f70=[$4], $f80=[$5], $f90=[$6], $f100=[$7], $f110=[$8], $f120=[$9], $f130=[$10])
00-08                  Project(p_partkey=[$2], p_container=[$3], $f5=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f6=[>=($0, 1)], $f7=[<=($0, 5)], $f8=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f9=[>=($0, 1)], $f10=[<=($0, 10)], $f11=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f12=[>=($0, 1)], $f13=[<=($0, 15)])
00-10                    ProducerConsumer
00-12                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/part.parquet]], selectionRoot=/tpch/part.parquet, columns=[SchemaPath [`p_partkey`], SchemaPath [`p_container`], SchemaPath [`p_brand`], SchemaPath [`p_size`]]]])
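
To make the two concerns concrete, here is a minimal Python sketch of a toy 
model, not Drill or Calcite code. It assumes rows are plain dicts and uses 
hypothetical helpers (make_counting_expr, inner_join, merge_projects) and 
made-up sample rows. The first part counts how often an expression such as 
>=($2, 2) runs when it sits in a Project below the join versus after the join. 
The second part shows how a Project that only re-references its input columns, 
like 00-06 over 00-08, could be folded into the Project beneath it.

```python
# Toy model only: none of these names are Drill or Calcite APIs.

def make_counting_expr(counter):
    """Build the predicate l_quantity >= 2 and count each evaluation."""
    def expr(row):
        counter["evals"] += 1
        return row["l_quantity"] >= 2
    return expr

def inner_join(left, right, left_key, right_key):
    """Simple hash join: build on the right input, probe with the left."""
    index = {}
    for r in right:
        index.setdefault(r[right_key], []).append(r)
    for l in left:
        for r in index.get(l[left_key], []):
            yield {**l, **r}

# Made-up sample data: only part keys 1 and 3 survive the join.
lineitem = [{"l_partkey": k, "l_quantity": q}
            for k, q in [(1, 5), (2, 1), (3, 7), (4, 3), (5, 9)]]
part = [{"p_partkey": 1}, {"p_partkey": 3}]

# Plan A: expression in the Project below the join (as in the reported plan).
before = {"evals": 0}
expr_a = make_counting_expr(before)
projected = [{**row, "f7": expr_a(row)} for row in lineitem]
rows_a = list(inner_join(projected, part, "l_partkey", "p_partkey"))

# Plan B: expression evaluated only for rows that survive the join.
after = {"evals": 0}
expr_b = make_counting_expr(after)
rows_b = [{**row, "f7": expr_b(row)}
          for row in inner_join(lineitem, part, "l_partkey", "p_partkey")]

def merge_projects(top, bottom):
    """Fold a top Project of pure column references into the one below it.

    top:    list of (output_name, input_index) pairs
    bottom: list of (output_name, expression_text) pairs
    """
    return [(name, bottom[idx][1]) for name, idx in top]

# Abbreviated versions of Projects 00-08 (bottom) and 00-06 (top).
bottom = [("p_partkey", "$2"), ("p_container", "$3"), ("$f5", "CAST($1):CHAR(8)")]
top = [("p_partkey", 0), ("p_container", 1), ("$f5", 2)]
merged = merge_projects(top, bottom)

print(before["evals"], after["evals"])  # 5 2
print(merged)
```

Both toy plans return the same rows; the difference is only in how many times 
the expression runs. The merge helper handles just the simple case where the 
upper Project contains nothing but column references.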



--
This message was sent by Atlassian JIRA
(v6.2#6252)
