[jira] [Updated] (DRILL-6896) Extraneous columns being projected past a join

2019-03-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6896:
---------------------------------
Fix Version/s: Future (was: 1.16.0)

> Extraneous columns being projected past a join
> ----------------------------------------------
>
>                 Key: DRILL-6896
>                 URL: https://issues.apache.org/jira/browse/DRILL-6896
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.15.0
>            Reporter: Karthikeyan Manivannan
>            Assignee: Aman Sinha
>            Priority: Major
>             Fix For: Future
>
>
> [~rhou] noted that TPCH13 ran slower on Drill 1.15 than on Drill 1.14.
> Analysis revealed that 1.15 projects an extra column, and that the slowdown
> comes from this column being pushed across an exchange unnecessarily.
> Here is a simplified query, written by [~amansinha100], that exhibits the
> same problem.
> In the first plan (on 1.15.0), o_custkey and o_comment are both extraneous
> projections. The second plan (on 1.14.0) also has an extraneous projection,
> o_custkey, but not o_comment.
> On 1.15.0:
> ----------
> {noformat}
> explain plan without implementation for
> select
>   c.c_custkey
> from
>   cp.`tpch/customer.parquet` c
>   left outer join cp.`tpch/orders.parquet` o
>     on c.c_custkey = o.o_custkey
>     and o.o_comment not like '%special%requests%'
> ;
>
> DrillScreenRel
>   DrillProjectRel(c_custkey=[$0])
>     DrillProjectRel(c_custkey=[$2], o_custkey=[$0], o_comment=[$1])
>       DrillJoinRel(condition=[=($2, $0)], joinType=[right])
>         DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
>           DrillScanRel(table=[[cp, tpch/orders.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
>         DrillScanRel(table=[[cp, tpch/customer.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/customer.parquet]], selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`c_custkey`]]])
> {noformat}
> On 1.14.0:
> ----------
> {noformat}
> DrillScreenRel
>   DrillProjectRel(c_custkey=[$0])
>     DrillProjectRel(c_custkey=[$1], o_custkey=[$0])
>       DrillJoinRel(condition=[=($1, $0)], joinType=[right])
>         DrillProjectRel(o_custkey=[$0])
>           DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
>             DrillScanRel(table=[[cp, tpch/orders.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
>         DrillScanRel(table=[[cp, tpch/customer.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/customer.parquet]], selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`c_custkey`]]])
> {noformat}
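
The column accounting behind the description above can be sketched in a few lines of Python. This is only a toy model for illustration, not Drill or Calcite code: the function name `minimal_columns` and the hard-coded column sets are assumptions, with the sets read off the two plans quoted in this issue.

```python
# Illustrative sketch only: a toy model of column pruning around a join.
# None of these names are Drill or Calcite APIs.

def minimal_columns(select_cols, join_keys, filter_cols):
    """Return the smallest column set needed at each stage of the plan."""
    above_join = set(select_cols)             # only what the query returns
    join_input = above_join | set(join_keys)  # join keys must reach the join
    scan = join_input | set(filter_cols)      # the filter runs below the join
    return {"scan": scan, "join_input": join_input, "above_join": above_join}


needed = minimal_columns(
    select_cols={"c_custkey"},
    join_keys={"c_custkey", "o_custkey"},
    filter_cols={"o_comment"},
)

# Columns projected directly above the join, read off the two plans.
above_join_1_15 = {"c_custkey", "o_custkey", "o_comment"}
above_join_1_14 = {"c_custkey", "o_custkey"}

extraneous_1_15 = above_join_1_15 - needed["above_join"]
extraneous_1_14 = above_join_1_14 - needed["above_join"]

# Columns entering the join: the 1.15 plan has no project under the join,
# so the filter's full output (including o_comment) flows into it.
join_input_1_15 = {"c_custkey", "o_custkey", "o_comment"}
join_input_1_14 = {"c_custkey", "o_custkey"}

print(sorted(extraneous_1_15))
print(sorted(extraneous_1_14))
print(sorted(join_input_1_15 - needed["join_input"]))
print(sorted(join_input_1_14 - needed["join_input"]))
```

Under these assumptions, both versions carry o_custkey above the join without needing it, and only the 1.15 plan also carries o_comment into and past the join.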



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6896) Extraneous columns being projected past a join

2018-12-11 Thread Aman Sinha (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-6896:
--
Fix Version/s: 1.16.0

> Extraneous columns being projected past a join
> ----------------------------------------------
>
>                 Key: DRILL-6896
>                 URL: https://issues.apache.org/jira/browse/DRILL-6896
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.15.0
>            Reporter: Karthikeyan Manivannan
>            Assignee: Aman Sinha
>            Priority: Major
>             Fix For: 1.16.0


[jira] [Updated] (DRILL-6896) Extraneous columns being projected past a join

2018-12-11 Thread Aman Sinha (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-6896:
--
Description: 
[~rhou] noted that TPCH13 ran slower on Drill 1.15 than on Drill 1.14.
Analysis revealed that 1.15 projects an extra column, and that the slowdown
comes from this column being pushed across an exchange unnecessarily.

Here is a simplified query, written by [~amansinha100], that exhibits the same
problem.

In the first plan (on 1.15.0), o_custkey and o_comment are both extraneous
projections. The second plan (on 1.14.0) also has an extraneous projection,
o_custkey, but not o_comment.

On 1.15.0:
----------
{noformat}
explain plan without implementation for
select
  c.c_custkey
from
  cp.`tpch/customer.parquet` c
  left outer join cp.`tpch/orders.parquet` o
    on c.c_custkey = o.o_custkey
    and o.o_comment not like '%special%requests%'
;

DrillScreenRel
  DrillProjectRel(c_custkey=[$0])
    DrillProjectRel(c_custkey=[$2], o_custkey=[$0], o_comment=[$1])
      DrillJoinRel(condition=[=($2, $0)], joinType=[right])
        DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
          DrillScanRel(table=[[cp, tpch/orders.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
        DrillScanRel(table=[[cp, tpch/customer.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/customer.parquet]], selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`c_custkey`]]])
{noformat}


On 1.14.0:
----------
{noformat}
DrillScreenRel
  DrillProjectRel(c_custkey=[$0])
    DrillProjectRel(c_custkey=[$1], o_custkey=[$0])
      DrillJoinRel(condition=[=($1, $0)], joinType=[right])
        DrillProjectRel(o_custkey=[$0])
          DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
            DrillScanRel(table=[[cp, tpch/orders.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
        DrillScanRel(table=[[cp, tpch/customer.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/customer.parquet]], selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`c_custkey`]]])
{noformat}

  was:
[~rhou] noted that TPCH13 on Drill 1.15 was running slower than Drill 1.14. 
Analysis revealed that an extra column was being projected in 1.15 and the 
slowdown was because the extra column was being unnecessarily pushed across an 
exchange.

Here is a simplified query written by [~amansinha100] that exhibits the same 
problem :

In first plan, o_custkey and o_comment are both extraneous projections. 
 In the second plan (on 1.14.0), also, there is an extraneous projection: 
o_custkey but not o_comment.

On 1.15.0:

-

explain plan without implementation for 
 select
 c.c_custkey
 from
 cp.`tpch/customer.parquet` c 
 left outer join cp.`tpch/orders.parquet` o 
 on c.c_custkey = o.o_custkey
 and o.o_comment not like '%special%requests%'
 ;

DrillScreenRel

DrillProjectRel(c_custkey=[$0])

DrillProjectRel(c_custkey=[$2], o_custkey=[$0], o_comment=[$1])

DrillJoinRel(condition=[=($2, $0)], joinType=[right])

DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])

DrillScanRel(table=[[cp, tpch/orders.parquet]], groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], 
selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])

DrillScanRel(table=[[cp, tpch/customer.parquet]], groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/customer.parquet]], 
selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`c_custkey`]]])

On 1.14.0:

-

DrillScreenRel

DrillProjectRel(c_custkey=[$0])

DrillProjectRel(c_custkey=[$1], o_custkey=[$0])

DrillJoinRel(condition=[=($1, $0)], joinType=[right])

DrillProjectRel(o_custkey=[$0])

DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])

DrillScanRel(table=[[cp, tpch/orders.parquet]], groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], 
selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])

DrillScanRel(table=[[cp, tpch/customer.parquet]], groupscan=[ParquetGroupScan 

[jira] [Updated] (DRILL-6896) Extraneous columns being projected past a join

2018-12-11 Thread Aman Sinha (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-6896:
--
Summary: Extraneous columns being projected past a join  (was: Extraneous 
columns being projected in Drill 1.15)

> Extraneous columns being projected past a join
> ----------------------------------------------
>
>                 Key: DRILL-6896
>                 URL: https://issues.apache.org/jira/browse/DRILL-6896
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.15.0
>            Reporter: Karthikeyan Manivannan
>            Assignee: Aman Sinha
>            Priority: Major
>
> [~rhou] noted that TPCH13 on Drill 1.15 was running slower than Drill 1.14. 
> Analysis revealed that an extra column was being projected in 1.15 and the 
> slowdown was because the extra column was being unnecessarily pushed across 
> an exchange.
> Here is a simplified query written by [~amansinha100] that exhibits the same 
> problem :
> In first plan, o_custkey and o_comment are both extraneous projections. 
>  In the second plan (on 1.14.0), also, there is an extraneous projection: 
> o_custkey but not o_comment.
> On 1.15.0:
> -
> explain plan without implementation for 
>  select
>  c.c_custkey
>  from
>  cp.`tpch/customer.parquet` c 
>  left outer join cp.`tpch/orders.parquet` o 
>  on c.c_custkey = o.o_custkey
>  and o.o_comment not like '%special%requests%'
>  ;
> DrillScreenRel
> DrillProjectRel(c_custkey=[$0])
> DrillProjectRel(c_custkey=[$2], o_custkey=[$0], o_comment=[$1])
> DrillJoinRel(condition=[=($2, $0)], joinType=[right])
> DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
> DrillScanRel(table=[[cp, tpch/orders.parquet]], groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], 
> selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
> DrillScanRel(table=[[cp, tpch/customer.parquet]], groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=classpath:/tpch/customer.parquet]], 
> selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`c_custkey`]]])
> On 1.14.0:
> -
> DrillScreenRel
> DrillProjectRel(c_custkey=[$0])
> DrillProjectRel(c_custkey=[$1], o_custkey=[$0])
> DrillJoinRel(condition=[=($1, $0)], joinType=[right])
> DrillProjectRel(o_custkey=[$0])
> DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
> DrillScanRel(table=[[cp, tpch/orders.parquet]], groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], 
> selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
> DrillScanRel(table=[[cp, tpch/customer.parquet]], groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=classpath:/tpch/customer.parquet]], 
> selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`c_custkey`]]])
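
When plan text arrives flattened like this, the projected columns can still be compared across versions with a small script. The following is a minimal sketch that applies a plain regular expression to the EXPLAIN text; it does not use any Drill API, and the helper names `project_columns` and `extraneous_above_join` are hypothetical. It compares columns by name, which is a simplification that works here because the top project passes c_custkey through unchanged.

```python
import re

# Illustrative sketch: plain regex over EXPLAIN text, not a Drill API.
PROJECT_FIELD = re.compile(r"(\w+)=\[\$\d+\]")


def project_columns(plan_text):
    """Return the output column names of each DrillProjectRel, top-down."""
    result = []
    for line in plan_text.splitlines():
        # Drop mail quoting ("> ") and any indentation before matching.
        line = line.lstrip("> ").strip()
        if line.startswith("DrillProjectRel("):
            result.append(PROJECT_FIELD.findall(line))
    return result


def extraneous_above_join(plan_text):
    """Columns in the second project that the top project does not output."""
    projects = project_columns(plan_text)
    top, below = projects[0], projects[1]
    return [col for col in below if col not in top]


# Upper part of each plan, copied from the issue description.
PLAN_1_15 = """\
> DrillScreenRel
> DrillProjectRel(c_custkey=[$0])
> DrillProjectRel(c_custkey=[$2], o_custkey=[$0], o_comment=[$1])
> DrillJoinRel(condition=[=($2, $0)], joinType=[right])
"""

PLAN_1_14 = """\
> DrillScreenRel
> DrillProjectRel(c_custkey=[$0])
> DrillProjectRel(c_custkey=[$1], o_custkey=[$0])
> DrillJoinRel(condition=[=($1, $0)], joinType=[right])
> DrillProjectRel(o_custkey=[$0])
"""

print(extraneous_above_join(PLAN_1_15))
print(extraneous_above_join(PLAN_1_14))
```

Because the matching is done per line, the script does not depend on the plan's indentation, so it works on both the `{noformat}` version and the flattened one.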


