[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-12-15 Thread Rahul Challapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059081#comment-15059081
 ] 

Rahul Challapalli commented on DRILL-3765:
--

[~jni] I am trying to verify this, and the only thing I can think of is a 
performance test. Let me know if a functional test can be added to test this 
patch.

> Partition prune rule is unnecessary fired multiple times. 
> --
>
> Key: DRILL-3765
> URL: https://issues.apache.org/jira/browse/DRILL-3765
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Jinfeng Ni
> Assignee: Jinfeng Ni
> Fix For: 1.4.0
>
> It seems that the partition prune rule may be fired multiple times, even 
> after the first rule execution has pushed the filter into the scan operator. 
> Since partition pruning has to build vectors to hold the partition / file / 
> directory information, invoking the partition prune rule unnecessarily may 
> lead to a large memory overhead.
> The Drill planner should avoid unnecessary executions of the partition prune 
> rule, in order to reduce the chance of hitting an OOM exception while the 
> rule is executed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-12-15 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059185#comment-15059185
 ] 

Jinfeng Ni commented on DRILL-3765:
---

[~rkins], that's right. I posted some preliminary performance numbers for 
"explain plan" in an earlier comment. Please note that the improvement depends 
on the complexity of the partition filter; essentially, this patch tries to 
reduce the number of partition filter evaluations. The more complex the 
partition filter is, the more likely we are to see an improvement in planning 
time.



[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-16 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15006861#comment-15006861
 ] 

Aman Sinha commented on DRILL-3765:
---

[~jni] I did not realize this was assigned to me for review...will do that 
soon.  



[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15007101#comment-15007101
 ] 

ASF GitHub Bot commented on DRILL-3765:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/255#discussion_r44964325
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java
 ---
@@ -116,20 +117,32 @@ public DrillSqlWorker(QueryContext context) {
 
   private RuleSet[] getRules(QueryContext context) {
 StoragePluginRegistry storagePluginRegistry = context.getStorage();
-RuleSet drillLogicalRules = DrillRuleSets.mergedRuleSets(
+
+// Ruleset for the case where VolcanoPlanner is used for everything : join, filter/project pushdown, partition pruning.
+RuleSet drillLogicalVolOnlyRules = DrillRuleSets.mergedRuleSets(
 DrillRuleSets.getDrillBasicRules(context),
+DrillRuleSets.getPruneScanRules(context),
 DrillRuleSets.getJoinPermRules(context),
 DrillRuleSets.getDrillUserConfigurableLogicalRules(context));
+
+// Ruleset for the case where join planning is done in Hep-LOPT, filter/project pushdown and parttion pruning are done in VolcanoPlanner
+RuleSet drillLogicalHepJoinRules = DrillRuleSets.mergedRuleSets(
+DrillRuleSets.getDrillBasicRules(context),
+DrillRuleSets.getPruneScanRules(context),
+DrillRuleSets.getDrillUserConfigurableLogicalRules(context));
+
+// Ruleset for the case where join planning and partition pruning is done in Hep, filter/project pushdown are done in VolcanoPlanner
--- End diff --

This comment refers to partition pruning rules, but the ruleset does not 
have those.




[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15007395#comment-15007395
 ] 

ASF GitHub Bot commented on DRILL-3765:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/255#discussion_r44988307
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java
 ---
@@ -116,20 +117,32 @@ public DrillSqlWorker(QueryContext context) {
 
   private RuleSet[] getRules(QueryContext context) {
 StoragePluginRegistry storagePluginRegistry = context.getStorage();
-RuleSet drillLogicalRules = DrillRuleSets.mergedRuleSets(
+
+// Ruleset for the case where VolcanoPlanner is used for everything : join, filter/project pushdown, partition pruning.
+RuleSet drillLogicalVolOnlyRules = DrillRuleSets.mergedRuleSets(
 DrillRuleSets.getDrillBasicRules(context),
+DrillRuleSets.getPruneScanRules(context),
 DrillRuleSets.getJoinPermRules(context),
 DrillRuleSets.getDrillUserConfigurableLogicalRules(context));
+
+// Ruleset for the case where join planning is done in Hep-LOPT, filter/project pushdown and parttion pruning are done in VolcanoPlanner
+RuleSet drillLogicalHepJoinRules = DrillRuleSets.mergedRuleSets(
+DrillRuleSets.getDrillBasicRules(context),
+DrillRuleSets.getPruneScanRules(context),
+DrillRuleSets.getDrillUserConfigurableLogicalRules(context));
+
+// Ruleset for the case where join planning and partition pruning is done in Hep, filter/project pushdown are done in VolcanoPlanner
--- End diff --

Ok, I see.   
LGTM +1.  




[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15007223#comment-15007223
 ] 

ASF GitHub Bot commented on DRILL-3765:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/255#discussion_r44974839
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java
 ---
@@ -116,20 +117,32 @@ public DrillSqlWorker(QueryContext context) {
 
   private RuleSet[] getRules(QueryContext context) {
 StoragePluginRegistry storagePluginRegistry = context.getStorage();
-RuleSet drillLogicalRules = DrillRuleSets.mergedRuleSets(
+
+// Ruleset for the case where VolcanoPlanner is used for everything : join, filter/project pushdown, partition pruning.
+RuleSet drillLogicalVolOnlyRules = DrillRuleSets.mergedRuleSets(
 DrillRuleSets.getDrillBasicRules(context),
+DrillRuleSets.getPruneScanRules(context),
 DrillRuleSets.getJoinPermRules(context),
 DrillRuleSets.getDrillUserConfigurableLogicalRules(context));
+
+// Ruleset for the case where join planning is done in Hep-LOPT, filter/project pushdown and parttion pruning are done in VolcanoPlanner
+RuleSet drillLogicalHepJoinRules = DrillRuleSets.mergedRuleSets(
+DrillRuleSets.getDrillBasicRules(context),
+DrillRuleSets.getPruneScanRules(context),
+DrillRuleSets.getDrillUserConfigurableLogicalRules(context));
+
+// Ruleset for the case where join planning and partition pruning is done in Hep, filter/project pushdown are done in VolcanoPlanner
--- End diff --

Exactly. The ruleset defined here is used in Frameworks.VolcanoPlanner. When 
join / partition pruning is done in Hep, we do not put the join/partition 
pruning rules in the ruleset here. Instead, the join/partition pruning rules 
are specified when the HepPlanner is built. 
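
The split described in this reply can be sketched as a small, self-contained model. This is not Drill's code: `RuleGroup`, `PlanningMode` and `RuleSetSketch` are hypothetical names, and the contents of the third ruleset are an assumption, since the quoted diff stops before showing them. The only point is that the prune-scan rules are given to the Volcano planner in the first two cases and left out in the third.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical labels for the rule groups returned by the DrillRuleSets getters in the diff.
enum RuleGroup { BASIC, PRUNE_SCAN, JOIN_PERM, USER_CONFIGURABLE }

// Hypothetical names for the three cases described in the diff comments.
enum PlanningMode { VOLCANO_ONLY, HEP_JOIN, HEP_JOIN_AND_PRUNING }

final class RuleSetSketch {
    // Rule groups handed to the Volcano planner for each case.
    static Set<RuleGroup> volcanoRules(PlanningMode mode) {
        switch (mode) {
            case VOLCANO_ONLY:
                return EnumSet.of(RuleGroup.BASIC, RuleGroup.PRUNE_SCAN,
                        RuleGroup.JOIN_PERM, RuleGroup.USER_CONFIGURABLE);
            case HEP_JOIN:
                return EnumSet.of(RuleGroup.BASIC, RuleGroup.PRUNE_SCAN,
                        RuleGroup.USER_CONFIGURABLE);
            case HEP_JOIN_AND_PRUNING:
                // Assumption: the diff is truncated here; per the review comment,
                // this ruleset carries no partition pruning rules.
                return EnumSet.of(RuleGroup.BASIC, RuleGroup.USER_CONFIGURABLE);
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    // If the Volcano ruleset has no prune rules, pruning must be configured on the Hep side.
    static boolean pruningInHep(PlanningMode mode) {
        return !volcanoRules(mode).contains(RuleGroup.PRUNE_SCAN);
    }
}
```

In this model the prune rules live in exactly one planner per mode, which is why the comment in the diff can mention partition pruning even though the ruleset below it does not list those rules.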




[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15008000#comment-15008000
 ] 

ASF GitHub Bot commented on DRILL-3765:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/255




[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-16 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15007761#comment-15007761
 ] 

Jinfeng Ni commented on DRILL-3765:
---

I should point out that the improvement from this patch depends on how much 
of the overall planning time is spent evaluating the pruning filter in the 
rule.  The more complex the pruning filter, the more likely we are to see an 
improvement.

For example, in the previous comment I compared the planning time for a 
pruning filter with a 5-value in-list (5.2 seconds vs 9.4 seconds).  If the 
pruning filter is changed to a single = condition, the planning time becomes 
3.7 seconds vs 4.6 seconds. That is, we see a smaller improvement with a 
simpler pruning filter, which seems reasonable. 

{code}
explain plan for select ss_sold_date_sk, ss_sold_time_sk, ss_item_sk, 
ss_customer_sk from dfs.tmp.store_pb_item_sk where ss_item_sk =100  and 
ss_customer_sk = 96479;

1 row selected (3.709 seconds)

alter session set `planner.enable_hep_partition_pruning` = false;

1 row selected (4.65 seconds)
{code}
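
For reference, the relative reductions implied by the timings quoted in this thread can be worked out with a few lines of Java. The class and method names are illustrative only; the four timings are the `explain plan` numbers reported in the comments.

```java
final class PlanningTimeSketch {
    // Fraction of planning time saved: 1 - (faster time / slower time).
    static double reduction(double slowerSeconds, double fasterSeconds) {
        return 1.0 - fasterSeconds / slowerSeconds;
    }

    public static void main(String[] args) {
        // 5-value in-list filter: 9.412 s vs 5.246 s.
        System.out.printf("in-list:  %.1f%%%n", 100 * reduction(9.412, 5.246));
        // Single equality filter: 4.65 s vs 3.709 s.
        System.out.printf("equality: %.1f%%%n", 100 * reduction(4.65, 3.709));
    }
}
```

That works out to roughly a 44% reduction for the in-list filter and roughly 20% for the single equality filter, consistent with the observation that simpler pruning filters benefit less.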




[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-15 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15006236#comment-15006236
 ] 

Jinfeng Ni commented on DRILL-3765:
---

[~amansinha100], could you please review the PR for DRILL-3765? Thanks!

https://github.com/apache/drill/pull/255





[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15004845#comment-15004845
 ] 

ASF GitHub Bot commented on DRILL-3765:
---

GitHub user jinfengni opened a pull request:

https://github.com/apache/drill/pull/255

DRILL-3765: Move partitioning pruning to HepPlanner to avoid the performance overhead for redundant rule execution.

Add fall back option in planner.

Ran the regression/unit tests with this feature turned on/off.

Please see DRILL-3765 for some performance numbers.

Regarding the planner options:
1. planner.enable_join_hep_opt is replaced with planner.enable_hep_opt.  
This controls whether HepPlanner is used in Drill logical planning. If 
this option is off, planning falls back to the Volcano planner. 
2. A new planner option, planner.enable_hep_partition_pruning, turns this 
feature on or off. If this option is off, planning falls back to the 
code prior to this patch. 
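
One way to read the two options together is as a small decision table. The sketch below is an interpretation, not Drill's implementation: the enum and method names are hypothetical, and it assumes that "the code prior to this patch" means join planning in Hep with partition pruning left in the Volcano planner.

```java
// Hypothetical names for the resulting planning behaviour.
enum LogicalPlanning { VOLCANO_ONLY, HEP_JOIN_ONLY, HEP_JOIN_AND_PRUNING }

final class PlannerOptionsSketch {
    // enableHepOpt              ~ planner.enable_hep_opt
    // enableHepPartitionPruning ~ planner.enable_hep_partition_pruning
    static LogicalPlanning resolve(boolean enableHepOpt, boolean enableHepPartitionPruning) {
        if (!enableHepOpt) {
            // HepPlanner is not used at all, so the pruning option has no effect.
            return LogicalPlanning.VOLCANO_ONLY;
        }
        // Assumption: with Hep pruning off, behaviour matches the pre-patch code path.
        return enableHepPartitionPruning
                ? LogicalPlanning.HEP_JOIN_AND_PRUNING
                : LogicalPlanning.HEP_JOIN_ONLY;
    }
}
```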

@amansinha100, could you please review this patch?




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jinfengni/incubator-drill DRILL-3765

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/255.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #255


commit 739ac9482605cd2593b569ce2ec22fd176f6afbd
Author: Jinfeng Ni 
Date:   2015-11-03T23:51:10Z

DRILL-3765: Move partitioning pruning to HepPlanner to avoid the 
performance overhead for redundant rule execution.

Add fall back option in planner.






[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-11 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001400#comment-15001400
 ] 

Jinfeng Ni commented on DRILL-3765:
---

Did some preliminary testing to see how much performance we may gain from the 
patch if we move the PruneScanRules into a HepPlanner that runs once the 
project/filter pushdown has been applied. Here is the result when run on a Mac.  

Data: TPC-DS sample dataset.
1. Create a partitioned table.  This produces a table with 18000 parquet files. 
{code}
create table dfs.tmp.store_pb_item_sk partition by (ss_item_sk) as select * 
from store_sale;
{code} 

2. Query the partitioned table with a filter referring to the partition column 
(ss_item_sk) and a non-partitioning column.
{code}
explain plan for select ss_sold_date_sk, ss_sold_time_sk, ss_item_sk, 
ss_customer_sk from dfs.tmp.store_pb_item_sk where ss_item_sk in (100, 200, 
300, 400, 500) and ss_customer_sk = 96479;
{code}  

3. Results:
{code}
alter session set `planner.enable_hep_opt` = true;

explain plan for select ss_sold_date_sk, ss_sold_time_sk, ss_item_sk, 
ss_customer_sk from dfs.tmp.store_pb_item_sk where ss_item_sk in (100, 200, 
300, 400, 500) and ss_customer_sk = 96479;

1 row selected (5.246 seconds)

alter session set `planner.enable_hep_opt` = false;
explain plan for select ss_sold_date_sk, ss_sold_time_sk, ss_item_sk, 
ss_customer_sk from dfs.tmp.store_pb_item_sk where ss_item_sk in (100, 200, 
300, 400, 500) and ss_customer_sk = 96479;

1 row selected (9.412 seconds)
{code}

By avoiding the repeated PruneScanRule executions, the planning time is reduced 
from 9.4 seconds to 5.2 seconds.  With more parquet files in the table, or with 
a multi-table join query, we would expect to see even bigger improvements with 
this patch.

With a parquet metadata cache file created, I saw similar numbers for the 
existing code and the new code. 

The log shows that the existing code does indeed fire the PruneScanRules 
multiple times, including both directory-based pruning and partitioning-column 
(from CTAS) based pruning. With the patch, partition pruning is fired once for 
directory-based pruning and once for partitioning-column pruning. That explains 
the performance gain we saw in this preliminary test.
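
A rough way to picture the difference in work is the toy count below. It is only a sketch under stated assumptions: the class is hypothetical, it assumes every firing of a prune rule evaluates the filter against the partition information of every file, and the number of intermediate plans is a made-up parameter rather than anything measured from the log.

```java
import java.util.Arrays;
import java.util.List;

final class PruneFiringSketch {
    // The two kinds of pruning mentioned in the comment above.
    static final List<String> PRUNE_RULES =
            Arrays.asList("directory-based", "partition-column-based");

    // Interleaved with pushdown: assume each prune rule fires once per
    // intermediate filter-over-scan plan the planner produces.
    static int firingsInterleaved(int intermediatePlans) {
        return PRUNE_RULES.size() * intermediatePlans;
    }

    // Separate phase after pushdown: each prune rule fires once on the final plan.
    static int firingsAfterPushdown() {
        return PRUNE_RULES.size();
    }

    // Assumed cost model: one filter evaluation per file per firing.
    static long fileEvaluations(int firings, int fileCount) {
        return (long) firings * fileCount;
    }

    public static void main(String[] args) {
        int files = 18000;          // file count of the CTAS table in this test
        int intermediatePlans = 4;  // hypothetical value, for illustration only
        System.out.println(fileEvaluations(firingsInterleaved(intermediatePlans), files));
        System.out.println(fileEvaluations(firingsAfterPushdown(), files));
    }
}
```

Under this model the cost of the interleaved case grows with the number of intermediate plans, while the post-pushdown case stays fixed at one firing per prune rule.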






[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-03 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987748#comment-14987748
 ] 

Jinfeng Ni commented on DRILL-3765:
---

Caching the result would help. But I feel it probably would not help much in 
cases where intermediate filters are pushed down and the partition rule is 
fired against those intermediate filters. Ideally, we only want to apply the 
partition rule against the final filter. 
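
The limitation of caching can be illustrated with a minimal sketch. The class below is hypothetical and keys a cache on the filter text, which is an assumption about how such a cache might work, not a description of any Drill code. Repeated firings on the same filter hit the cache, but each distinct intermediate filter still costs one full evaluation.

```java
import java.util.HashMap;
import java.util.Map;

final class PruneCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    int evaluations = 0;  // number of real (uncached) pruning evaluations

    // Returns a placeholder pruning result, evaluating only on a cache miss.
    String prune(String filter) {
        return cache.computeIfAbsent(filter, f -> {
            evaluations++;
            return "pruned[" + f + "]";
        });
    }

    public static void main(String[] args) {
        PruneCacheSketch sketch = new PruneCacheSketch();
        // Hypothetical sequence of filters seen while pushdown is in progress.
        sketch.prune("ss_item_sk = 100");
        sketch.prune("ss_item_sk = 100");                             // cache hit
        sketch.prune("ss_item_sk = 100 AND ss_customer_sk = 96479");  // new filter, cache miss
        System.out.println(sketch.evaluations);
    }
}
```

Applying the rule only to the final filter avoids the evaluations for the intermediate filters altogether, which a cache keyed this way cannot do.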
 



[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-03 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987809#comment-14987809
 ] 

Jacques Nadeau commented on DRILL-3765:
---

got it. thx



[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-03 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987626#comment-14987626
 ] 

Jacques Nadeau commented on DRILL-3765:
---

Wouldn't another option be caching the result? It seems like that could be a 
simpler solution.
