[ https://issues.apache.org/jira/browse/DRILL-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker updated DRILL-6896:
---------------------------------
    Fix Version/s:     (was: 1.16.0)
                   Future

> Extraneous columns being projected past a join
> ----------------------------------------------
>
>                 Key: DRILL-6896
>                 URL: https://issues.apache.org/jira/browse/DRILL-6896
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.15.0
>            Reporter: Karthikeyan Manivannan
>            Assignee: Aman Sinha
>            Priority: Major
>             Fix For: Future
>
>
> [~rhou] noted that TPCH13 on Drill 1.15 was running slower than on Drill 1.14.
> Analysis revealed that an extra column was being projected in 1.15, and the
> slowdown occurred because the extra column was being unnecessarily pushed
> across an exchange.
> Here is a simplified query written by [~amansinha100] that exhibits the same
> problem:
> In the first plan, o_custkey and o_comment are both extraneous projections.
> In the second plan (on 1.14.0), there is also an extraneous projection:
> o_custkey, but not o_comment.
> On 1.15.0:
> -------------
> {noformat}
> explain plan without implementation for
> select
>   c.c_custkey
> from
>   cp.`tpch/customer.parquet` c
>   left outer join cp.`tpch/orders.parquet` o
>   on c.c_custkey = o.o_custkey
>   and o.o_comment not like '%special%requests%'
> ;
> DrillScreenRel
>   DrillProjectRel(c_custkey=[$0])
>     DrillProjectRel(c_custkey=[$2], o_custkey=[$0], o_comment=[$1])
>       DrillJoinRel(condition=[=($2, $0)], joinType=[right])
>         DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
>           DrillScanRel(table=[[cp, tpch/orders.parquet]],
> groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
> [path=classpath:/tpch/orders.parquet]],
> selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1,
> usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
>         DrillScanRel(table=[[cp, tpch/customer.parquet]],
> groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
> [path=classpath:/tpch/customer.parquet]],
> selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1,
> usedMetadataFile=false, columns=[`c_custkey`]]])
> {noformat}
> On 1.14.0:
> -------------
> {noformat}
> DrillScreenRel
>   DrillProjectRel(c_custkey=[$0])
>     DrillProjectRel(c_custkey=[$1], o_custkey=[$0])
>       DrillJoinRel(condition=[=($1, $0)], joinType=[right])
>         DrillProjectRel(o_custkey=[$0])
>           DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
>             DrillScanRel(table=[[cp, tpch/orders.parquet]],
> groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
> [path=classpath:/tpch/orders.parquet]],
> selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1,
> usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
>         DrillScanRel(table=[[cp, tpch/customer.parquet]],
> groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
> [path=classpath:/tpch/customer.parquet]],
> selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1,
> usedMetadataFile=false, columns=[`c_custkey`]]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)