[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384800#comment-16384800 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/1096

> Drill does not push limit past project (flatten) if it cannot be pushed into
> scan
> -
>
>                 Key: DRILL-6099
>                 URL: https://issues.apache.org/jira/browse/DRILL-6099
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.12.0
>            Reporter: Gautam Kumar Parai
>            Assignee: Gautam Kumar Parai
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.13.0
>
>
> It would be useful to have pushdown occur past flatten(project). Here is an
> example to illustrate the issue:
> {{explain plan without implementation for }}{{select name,
> flatten(categories) as category from dfs.`/tmp/t_json_20` LIMIT 1;}}
> {{DrillScreenRel}}{{ }}
> {{ DrillLimitRel(fetch=[1])}}{{ }}
> {{ DrillProjectRel(name=[$0], category=[FLATTEN($1)])}}
> {{ DrillScanRel(table=[[dfs, /tmp/t_json_20]], groupscan=[EasyGroupScan
> [selectionRoot=maprfs:/tmp/t_json_20, numFiles=1, columns=[`name`,
> `categories`], files=[maprfs:///tmp/t_json_20/0_0_0.json]]])}}
> =
> Content of 0_0_0.json
> =
> {
>   "name" : "Eric Goldberg, MD",
>   "categories" : [ "Doctors", "Health & Medical" ]
> } {
>   "name" : "Pine Cone Restaurant",
>   "categories" : [ "Restaurants" ]
> } {
>   "name" : "Deforest Family Restaurant",
>   "categories" : [ "American (Traditional)", "Restaurants" ]
> } {
>   "name" : "Culver's",
>   "categories" : [ "Food", "Ice Cream & Frozen Yogurt", "Fast Food",
>   "Restaurants" ]
> } {
>   "name" : "Chang Jiang Chinese Kitchen",
>   "categories" : [ "Chinese", "Restaurants" ]
> }
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
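The effect of the requested pushdown can be illustrated with a small, self-contained Java sketch over the five sample records quoted in the issue. This is a toy model, not Drill code: `FlattenLimitSketch`, `Business`, and the helper methods are hypothetical names introduced only for illustration, and the sketch assumes every `categories` array is non-empty, as in the sample file. It compares a limit applied only on top of the flatten with a limit applied both below and above it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Toy model only: rows are plain Java objects, not Drill record batches.
final class FlattenLimitSketch {

  // One input record from 0_0_0.json: a name plus its categories array.
  static final class Business {
    final String name;
    final List<String> categories;

    Business(String name, String... categories) {
      this.name = name;
      this.categories = Arrays.asList(categories);
    }
  }

  // The five records quoted in the issue description.
  static List<Business> sampleRows() {
    return Arrays.asList(
        new Business("Eric Goldberg, MD", "Doctors", "Health & Medical"),
        new Business("Pine Cone Restaurant", "Restaurants"),
        new Business("Deforest Family Restaurant", "American (Traditional)", "Restaurants"),
        new Business("Culver's", "Food", "Ice Cream & Frozen Yogurt", "Fast Food", "Restaurants"),
        new Business("Chang Jiang Chinese Kitchen", "Chinese", "Restaurants"));
  }

  // Stand-in for FLATTEN(categories): one "name | category" row per array element.
  static List<String> flatten(List<Business> input) {
    List<String> out = new ArrayList<>();
    for (Business b : input) {
      for (String category : b.categories) {
        out.add(b.name + " | " + category);
      }
    }
    return out;
  }

  // Stand-in for LIMIT fetch: keep at most the first `fetch` rows.
  static <T> List<T> limit(List<T> rows, int fetch) {
    return rows.stream().limit(fetch).collect(Collectors.toList());
  }

  // Plan shape from the issue: the limit sits only above the flatten.
  static List<String> limitOnTop(List<Business> scan, int fetch) {
    return limit(flatten(scan), fetch);
  }

  // Pushed-down shape: limit the scan output first, flatten, then limit again.
  static List<String> limitPushedBelow(List<Business> scan, int fetch) {
    return limit(flatten(limit(scan, fetch)), fetch);
  }

  public static void main(String[] args) {
    System.out.println(limitOnTop(sampleRows(), 1));
    System.out.println(limitPushedBelow(sampleRows(), 1));
  }
}
```

In this sketch both shapes return the same rows for the sample data, but the pushed-down shape flattens only `fetch` input records instead of all five. The upper limit still matters: flattening the first record alone yields two rows, so dropping it would return more rows than `LIMIT 1` allows.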
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382745#comment-16382745 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171711326

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java ---
    @@ -55,18 +62,21 @@ public void onMatch(RelOptRuleCall call) {
         }
       };
     
    -  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT =
    -      new DrillPushLimitToScanRule(
    -          RelOptHelper.some(DrillLimitRel.class, RelOptHelper.some(
    -              DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class))),
    -          "DrillPushLimitToScanRule_LimitOnProject") {
    +  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT = new DrillPushLimitToScanRule(
    +      RelOptHelper.some(DrillLimitRel.class, RelOptHelper.any(DrillProjectRel.class)), "DrillPushLimitToScanRule_LimitOnProject") {
         @Override
         public boolean matches(RelOptRuleCall call) {
           DrillLimitRel limitRel = call.rel(0);
    -      DrillScanRel scanRel = call.rel(2);
    -      // For now only applies to Parquet. And pushdown only apply limit but not offset,
    +      DrillProjectRel projectRel = call.rel(1);
    +      // pushdown only apply limit but not offset,
           // so if getFetch() return null no need to run this rule.
    -      if (scanRel.getGroupScan().supportsLimitPushdown() && (limitRel.getFetch() != null)) {
    --- End diff --

    Ok, yeah in that case we are not generating a redundant limit.
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382746#comment-16382746 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user amansinha100 commented on the issue:

    https://github.com/apache/drill/pull/1096

    Updated version lgtm. +1
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382736#comment-16382736 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user gparai commented on the issue:

    https://github.com/apache/drill/pull/1096

    @amansinha100 I have addressed your review comments. Please take a look. Thanks!
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382732#comment-16382732 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user gparai commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171708636

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java ---
    @@ -55,18 +62,21 @@ public void onMatch(RelOptRuleCall call) {
         }
       };
     
    -  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT =
    -      new DrillPushLimitToScanRule(
    -          RelOptHelper.some(DrillLimitRel.class, RelOptHelper.some(
    -              DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class))),
    -          "DrillPushLimitToScanRule_LimitOnProject") {
    +  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT = new DrillPushLimitToScanRule(
    +      RelOptHelper.some(DrillLimitRel.class, RelOptHelper.any(DrillProjectRel.class)), "DrillPushLimitToScanRule_LimitOnProject") {
         @Override
         public boolean matches(RelOptRuleCall call) {
           DrillLimitRel limitRel = call.rel(0);
    -      DrillScanRel scanRel = call.rel(2);
    -      // For now only applies to Parquet. And pushdown only apply limit but not offset,
    +      DrillProjectRel projectRel = call.rel(1);
    +      // pushdown only apply limit but not offset,
           // so if getFetch() return null no need to run this rule.
    -      if (scanRel.getGroupScan().supportsLimitPushdown() && (limitRel.getFetch() != null)) {
    --- End diff --

    Without a FLATTEN, the LIMIT would be fully pushed past the PROJECT i.e. we would not have a LIMIT on top of the project.
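The behaviour described in this reply can be sketched on a toy plan tree. The classes below are hypothetical stand-ins written for illustration; they are not Calcite or Drill classes, and the rewrite is an assumption based only on the comments in this thread, not a copy of the pull request. Plans are rendered bottom-up, in the same `Scan->Limit->Project` style the review uses.

```java
// Toy plan nodes for illustration; these are NOT Calcite or Drill classes.
final class LimitPastProjectSketch {

  static abstract class Node {
    final Node input;

    Node(Node input) {
      this.input = input;
    }

    abstract String label();

    // Render the plan bottom-up, e.g. "Scan->Limit->Project".
    String render() {
      return input == null ? label() : input.render() + "->" + label();
    }
  }

  static final class Scan extends Node {
    Scan() {
      super(null);
    }

    String label() {
      return "Scan";
    }
  }

  static final class Project extends Node {
    final boolean hasFlatten;

    Project(Node input, boolean hasFlatten) {
      super(input);
      this.hasFlatten = hasFlatten;
    }

    String label() {
      return hasFlatten ? "Project(flatten)" : "Project";
    }
  }

  static final class Limit extends Node {
    final int fetch;

    Limit(Node input, int fetch) {
      super(input);
      this.fetch = fetch;
    }

    String label() {
      return "Limit";
    }
  }

  // Rewrite Limit(Project(x)): always place a limit below the project, and keep
  // a limit above it only when the project contains a flatten, because a
  // flatten can emit more rows than it reads.
  static Node pushLimitPastProject(Limit limit, Project project) {
    Node limitBelow = new Limit(project.input, limit.fetch);
    Node newProject = new Project(limitBelow, project.hasFlatten);
    return project.hasFlatten ? new Limit(newProject, limit.fetch) : newProject;
  }

  public static void main(String[] args) {
    Project plain = new Project(new Scan(), false);
    Project withFlatten = new Project(new Scan(), true);
    System.out.println(pushLimitPastProject(new Limit(plain, 1), plain).render());
    System.out.println(pushLimitPastProject(new Limit(withFlatten, 1), withFlatten).render());
  }
}
```

Under these assumptions the plain project ends up as `Scan->Limit->Project`, with no limit left on top, while the flatten case keeps both limits.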
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382728#comment-16382728 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user gparai commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171708384

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java ---
    @@ -224,4 +226,64 @@ public Void visitInputRef(RexInputRef inputRef) {
           }
         }
     
    +  public static boolean isLimit0(RexNode fetch) {
    +    if (fetch != null && fetch.isA(SqlKind.LITERAL)) {
    +      RexLiteral l = (RexLiteral) fetch;
    +      switch (l.getTypeName()) {
    +        case BIGINT:
    +        case INTEGER:
    +        case DECIMAL:
    +          if (((long) l.getValue2()) == 0) {
    +            return true;
    +          }
    +      }
    +    }
    +    return false;
    +  }
    +
    +  public static boolean isProjectOutputRowcountUnknown(RelNode project) {
    +    assert project instanceof Project : "Rel is NOT an instance of project!";
    +    try {
    +      RexVisitor visitor =
    --- End diff --

    Yes, you are correct. If the rewrite does not consider it as embedded within other expressions then it is fine for the utility function to do the same.
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382731#comment-16382731 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user gparai commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171708439

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java ---
    @@ -224,4 +226,64 @@ public Void visitInputRef(RexInputRef inputRef) {
           }
         }
     
    +  public static boolean isLimit0(RexNode fetch) {
    +    if (fetch != null && fetch.isA(SqlKind.LITERAL)) {
    +      RexLiteral l = (RexLiteral) fetch;
    +      switch (l.getTypeName()) {
    +        case BIGINT:
    +        case INTEGER:
    +        case DECIMAL:
    +          if (((long) l.getValue2()) == 0) {
    +            return true;
    +          }
    +      }
    +    }
    +    return false;
    +  }
    +
    +  public static boolean isProjectOutputRowcountUnknown(RelNode project) {
    +    assert project instanceof Project : "Rel is NOT an instance of project!";
    +    try {
    +      RexVisitor visitor =
    +          new RexVisitorImpl(true) {
    +            public Void visitCall(RexCall call) {
    +              if ("flatten".equals(call.getOperator().getName().toLowerCase())) {
    +                throw new Util.FoundOne(call); /* throw exception to interrupt tree walk (this is similar to
    +                                              other utility methods in RexUtil.java */
    +              }
    +              return super.visitCall(call);
    +            }
    +          };
    +      for (RexNode rex : ((Project) project).getProjects()) {
    +        rex.accept(visitor);
    +      }
    +    } catch (Util.FoundOne e) {
    +      Util.swallow(e, null);
    +      return true;
    +    }
    +    return false;
    +  }
    +
    +  public static boolean isProjectOutputSchemaUnknown(RelNode project) {
    --- End diff --

    Done
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16382729#comment-16382729 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user gparai commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171708410

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java ---
    @@ -224,4 +226,64 @@ public Void visitInputRef(RexInputRef inputRef) {
           }
         }
     
    +  public static boolean isLimit0(RexNode fetch) {
    +    if (fetch != null && fetch.isA(SqlKind.LITERAL)) {
    +      RexLiteral l = (RexLiteral) fetch;
    +      switch (l.getTypeName()) {
    +        case BIGINT:
    +        case INTEGER:
    +        case DECIMAL:
    +          if (((long) l.getValue2()) == 0) {
    +            return true;
    +          }
    +      }
    +    }
    +    return false;
    +  }
    +
    +  public static boolean isProjectOutputRowcountUnknown(RelNode project) {
    --- End diff --

    Done
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381616#comment-16381616 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171479641

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java ---
    @@ -224,4 +226,64 @@ public Void visitInputRef(RexInputRef inputRef) {
           }
         }
     
    +  public static boolean isLimit0(RexNode fetch) {
    +    if (fetch != null && fetch.isA(SqlKind.LITERAL)) {
    +      RexLiteral l = (RexLiteral) fetch;
    +      switch (l.getTypeName()) {
    +        case BIGINT:
    +        case INTEGER:
    +        case DECIMAL:
    +          if (((long) l.getValue2()) == 0) {
    +            return true;
    +          }
    +      }
    +    }
    +    return false;
    +  }
    +
    +  public static boolean isProjectOutputRowcountUnknown(RelNode project) {
    +    assert project instanceof Project : "Rel is NOT an instance of project!";
    +    try {
    +      RexVisitor visitor =
    --- End diff --

    Would FLATTEN ever occur within other expressions ? I believe it always occurs as an independent expression. If that's the case, it seems to me that having a visitor is overkill.. what do you think ? Even the original rewrite from project to flatten just iterates over the project exprs here [1].

    [1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/RewriteProjectToFlatten.java#L77
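The trade-off raised in this review comment can be sketched with a toy expression tree. `FlattenDetectionSketch` and `Expr` are hypothetical names, not Calcite's `RexNode` API, and the nested `UPPER(FLATTEN($1))` case is purely illustrative: the reviewer's point is that such nesting is not expected to occur. The sketch contrasts a check over top-level project expressions with a recursive walk of the kind a visitor performs.

```java
import java.util.Arrays;
import java.util.List;

// Toy expression tree for illustration; this is NOT Calcite's RexNode API.
final class FlattenDetectionSketch {

  static final class Expr {
    final String op;
    final List<Expr> operands;

    Expr(String op, Expr... operands) {
      this.op = op;
      this.operands = Arrays.asList(operands);
    }
  }

  // Simple iteration: inspect only the top-level project expressions.
  static boolean topLevelHasFlatten(List<Expr> projects) {
    for (Expr e : projects) {
      if ("flatten".equalsIgnoreCase(e.op)) {
        return true;
      }
    }
    return false;
  }

  // Visitor-style walk: also descend into the operands of every expression.
  static boolean anyDepthHasFlatten(List<Expr> projects) {
    for (Expr e : projects) {
      if ("flatten".equalsIgnoreCase(e.op) || anyDepthHasFlatten(e.operands)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Shape from the issue: name=[$0], category=[FLATTEN($1)].
    List<Expr> fromIssue = Arrays.asList(new Expr("$0"), new Expr("FLATTEN", new Expr("$1")));
    // Hypothetical nested shape, used only to show where the two checks differ.
    List<Expr> nested = Arrays.asList(new Expr("UPPER", new Expr("FLATTEN", new Expr("$1"))));
    System.out.println(topLevelHasFlatten(fromIssue) + " " + anyDepthHasFlatten(fromIssue));
    System.out.println(topLevelHasFlatten(nested) + " " + anyDepthHasFlatten(nested));
  }
}
```

If FLATTEN only ever appears as a top-level project expression, the two checks agree and the simpler loop is sufficient; they differ only for the nested shape.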
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381617#comment-16381617 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171478117

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java ---
    @@ -224,4 +226,64 @@ public Void visitInputRef(RexInputRef inputRef) {
           }
         }
     
    +  public static boolean isLimit0(RexNode fetch) {
    +    if (fetch != null && fetch.isA(SqlKind.LITERAL)) {
    +      RexLiteral l = (RexLiteral) fetch;
    +      switch (l.getTypeName()) {
    +        case BIGINT:
    +        case INTEGER:
    +        case DECIMAL:
    +          if (((long) l.getValue2()) == 0) {
    +            return true;
    +          }
    +      }
    +    }
    +    return false;
    +  }
    +
    +  public static boolean isProjectOutputRowcountUnknown(RelNode project) {
    +    assert project instanceof Project : "Rel is NOT an instance of project!";
    +    try {
    +      RexVisitor visitor =
    +          new RexVisitorImpl(true) {
    +            public Void visitCall(RexCall call) {
    +              if ("flatten".equals(call.getOperator().getName().toLowerCase())) {
    +                throw new Util.FoundOne(call); /* throw exception to interrupt tree walk (this is similar to
    +                                              other utility methods in RexUtil.java */
    +              }
    +              return super.visitCall(call);
    +            }
    +          };
    +      for (RexNode rex : ((Project) project).getProjects()) {
    +        rex.accept(visitor);
    +      }
    +    } catch (Util.FoundOne e) {
    +      Util.swallow(e, null);
    +      return true;
    +    }
    +    return false;
    +  }
    +
    +  public static boolean isProjectOutputSchemaUnknown(RelNode project) {
    --- End diff --

    Javadoc
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381615#comment-16381615 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171478085

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java ---
    @@ -224,4 +226,64 @@ public Void visitInputRef(RexInputRef inputRef) {
           }
         }
     
    +  public static boolean isLimit0(RexNode fetch) {
    +    if (fetch != null && fetch.isA(SqlKind.LITERAL)) {
    +      RexLiteral l = (RexLiteral) fetch;
    +      switch (l.getTypeName()) {
    +        case BIGINT:
    +        case INTEGER:
    +        case DECIMAL:
    +          if (((long) l.getValue2()) == 0) {
    +            return true;
    +          }
    +      }
    +    }
    +    return false;
    +  }
    +
    +  public static boolean isProjectOutputRowcountUnknown(RelNode project) {
    --- End diff --

    Could you add javadoc for this utility function.
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
    [ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381618#comment-16381618 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171480227

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java ---
    @@ -55,18 +62,21 @@ public void onMatch(RelOptRuleCall call) {
         }
       };
     
    -  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT =
    -      new DrillPushLimitToScanRule(
    -          RelOptHelper.some(DrillLimitRel.class, RelOptHelper.some(
    -              DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class))),
    -          "DrillPushLimitToScanRule_LimitOnProject") {
    +  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT = new DrillPushLimitToScanRule(
    +      RelOptHelper.some(DrillLimitRel.class, RelOptHelper.any(DrillProjectRel.class)), "DrillPushLimitToScanRule_LimitOnProject") {
         @Override
         public boolean matches(RelOptRuleCall call) {
           DrillLimitRel limitRel = call.rel(0);
    -      DrillScanRel scanRel = call.rel(2);
    -      // For now only applies to Parquet. And pushdown only apply limit but not offset,
    +      DrillProjectRel projectRel = call.rel(1);
    +      // pushdown only apply limit but not offset,
           // so if getFetch() return null no need to run this rule.
    -      if (scanRel.getGroupScan().supportsLimitPushdown() && (limitRel.getFetch() != null)) {
    --- End diff --

    One implication of this is suppose the underlying Scan does not support Limit pushdown, you could end up with a plan `Scan->Limit->Project->Limit` where the Limit above the Scan is redundant (assume that there is no FLATTEN in this query). Can this be avoided ?
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379395#comment-16379395 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user gparai commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171086924

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java ---
    @@ -55,18 +62,21 @@ public void onMatch(RelOptRuleCall call) {
         }
       };

    -  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT =
    -      new DrillPushLimitToScanRule(
    -          RelOptHelper.some(DrillLimitRel.class, RelOptHelper.some(
    -              DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class))),
    -          "DrillPushLimitToScanRule_LimitOnProject") {
    +  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT = new DrillPushLimitToScanRule(
    --- End diff --

    There are many instances where we would have a PROJECT on top of the SCAN. The way the rule is refactored now, the LIMIT_SCAN rule would not work in those cases unless LIMIT_PROJECT is applied as well. Hence, these rules should go together for LIMIT_SCAN to work effectively. That is the reason I kept the rule here rather than creating a new rule.
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379390#comment-16379390 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user gparai commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r171085780

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java ---
    @@ -55,18 +62,21 @@ public void onMatch(RelOptRuleCall call) {
         }
       };

    -  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT =
    -      new DrillPushLimitToScanRule(
    -          RelOptHelper.some(DrillLimitRel.class, RelOptHelper.some(
    -              DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class))),
    -          "DrillPushLimitToScanRule_LimitOnProject") {
    +  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT = new DrillPushLimitToScanRule(
    +      RelOptHelper.some(DrillLimitRel.class, RelOptHelper.any(DrillProjectRel.class)), "DrillPushLimitToScanRule_LimitOnProject") {
         @Override
         public boolean matches(RelOptRuleCall call) {
           DrillLimitRel limitRel = call.rel(0);
    -      DrillScanRel scanRel = call.rel(2);
    -      // For now only applies to Parquet. And pushdown only apply limit but not offset,
    +      DrillProjectRel projectRel = call.rel(1);
    +      // pushdown only apply limit but not offset,
           // so if getFetch() return null no need to run this rule.
    -      if (scanRel.getGroupScan().supportsLimitPushdown() && (limitRel.getFetch() != null)) {
    --- End diff --

    We still have the LIMIT_ON_SCAN rule, which does that check. This rule has changed from LIMIT_PROJECT_SCAN to LIMIT_PROJECT. LIMIT_SCAN together with LIMIT_PROJECT works the way LIMIT_PROJECT_SCAN did.
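The claim that LIMIT_SCAN plus LIMIT_PROJECT together cover the old LIMIT_PROJECT_SCAN case can be sketched with a toy rewriter. Everything below is an assumption made for illustration: the class `LimitRuleCompositionSketch`, the `Op` enum, and the two rule methods are hypothetical stand-ins, not the Calcite or Drill planner API, and the model ignores FLATTEN and OFFSET. Plans are listed top-down, so `[LIMIT, PROJECT, SCAN]` is a Limit over a Project over a Scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy rewriter, not the Drill planner.
final class LimitRuleCompositionSketch {
    enum Op { LIMIT, PROJECT, SCAN, SCAN_WITH_LIMIT }

    // Toy LIMIT_PROJECT: move a Limit that sits directly on a Project below it.
    static boolean limitOnProject(List<Op> plan) {
        for (int i = 0; i + 1 < plan.size(); i++) {
            if (plan.get(i) == Op.LIMIT && plan.get(i + 1) == Op.PROJECT) {
                plan.set(i, Op.PROJECT);
                plan.set(i + 1, Op.LIMIT);
                return true;
            }
        }
        return false;
    }

    // Toy LIMIT_SCAN: only this rule checks whether the scan supports limit pushdown.
    static boolean limitOnScan(List<Op> plan, boolean scanSupportsLimitPushdown) {
        if (!scanSupportsLimitPushdown) {
            return false;
        }
        for (int i = 0; i + 1 < plan.size(); i++) {
            if (plan.get(i) == Op.LIMIT && plan.get(i + 1) == Op.SCAN) {
                plan.set(i + 1, Op.SCAN_WITH_LIMIT);
                return true;
            }
        }
        return false;
    }

    // Apply both rules until neither fires.
    static List<Op> optimize(List<Op> input, boolean scanSupportsLimitPushdown) {
        List<Op> plan = new ArrayList<>(input);
        boolean changed = true;
        while (changed) {
            changed = limitOnProject(plan) || limitOnScan(plan, scanSupportsLimitPushdown);
        }
        return plan;
    }

    public static void main(String[] args) {
        List<Op> limitProjectScan = Arrays.asList(Op.LIMIT, Op.PROJECT, Op.SCAN);
        System.out.println(optimize(limitProjectScan, true));  // prints [PROJECT, LIMIT, SCAN_WITH_LIMIT]
        System.out.println(optimize(limitProjectScan, false)); // prints [PROJECT, LIMIT, SCAN]
    }
}
```

In this sketch the first rule only moves the Limit, and the scan capability check lives in the second rule alone, which mirrors the division of labor described in the comment above.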
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379154#comment-16379154 ]

Gautam Kumar Parai commented on DRILL-6099:
---

[~priteshm] No, I have not had a chance to address them yet. I will take a look.
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379135#comment-16379135 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user priteshm commented on the issue:

    https://github.com/apache/drill/pull/1096

    @gparai did you get a chance to address @amansinha100's comment for this PR?
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351094#comment-16351094 ]

Aman Sinha commented on DRILL-6099:
---

[~gparai] I had a couple of questions. Sorry, I didn't get around to sending them earlier.
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351090#comment-16351090 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r165788152

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java ---
    @@ -55,18 +62,21 @@ public void onMatch(RelOptRuleCall call) {
         }
       };

    -  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT =
    -      new DrillPushLimitToScanRule(
    -          RelOptHelper.some(DrillLimitRel.class, RelOptHelper.some(
    -              DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class))),
    -          "DrillPushLimitToScanRule_LimitOnProject") {
    +  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT = new DrillPushLimitToScanRule(
    --- End diff --

    I am not sure why this rule is being overloaded to push the limit past the project. This particular rule is about pushing the limit into the scan for cases where we have LIMIT-SCAN or LIMIT-PROJECT-SCAN. I think we should keep this rule as-is but create a separate rule that pushes the limit past the project. Was there a strong reason to do it this way? Could there be a side effect of removing the check for the Scan?
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351091#comment-16351091 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r165791415

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java ---
    @@ -55,18 +62,21 @@ public void onMatch(RelOptRuleCall call) {
         }
       };

    -  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT =
    -      new DrillPushLimitToScanRule(
    -          RelOptHelper.some(DrillLimitRel.class, RelOptHelper.some(
    -              DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class))),
    -          "DrillPushLimitToScanRule_LimitOnProject") {
    +  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT = new DrillPushLimitToScanRule(
    +      RelOptHelper.some(DrillLimitRel.class, RelOptHelper.any(DrillProjectRel.class)), "DrillPushLimitToScanRule_LimitOnProject") {
         @Override
         public boolean matches(RelOptRuleCall call) {
           DrillLimitRel limitRel = call.rel(0);
    -      DrillScanRel scanRel = call.rel(2);
    -      // For now only applies to Parquet. And pushdown only apply limit but not offset,
    +      DrillProjectRel projectRel = call.rel(1);
    +      // pushdown only apply limit but not offset,
           // so if getFetch() return null no need to run this rule.
    -      if (scanRel.getGroupScan().supportsLimitPushdown() && (limitRel.getFetch() != null)) {
    --- End diff --

    I can understand that this check was removed because this matches() method is no longer checking for DrillScanRel, but does that mean that no one is checking the GroupScan for supportsLimitPushdown()?
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344166#comment-16344166 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user gparai commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r164593749

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java ---
    @@ -121,4 +132,50 @@ protected void doOnMatch(RelOptRuleCall call, DrillLimitRel limitRel, DrillScanR
         }
       }
    +
    +  private static boolean isProjectFlatten(RelNode project) {
    --- End diff --

    Done.
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344124#comment-16344124 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r164585670

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java ---
    @@ -121,4 +132,50 @@ protected void doOnMatch(RelOptRuleCall call, DrillLimitRel limitRel, DrillScanR
         }
       }
    +
    +  private static boolean isProjectFlatten(RelNode project) {
    --- End diff --

    I think it might be more general to name the functions schemaUnknown (for convert_fromJson) and rowCountUnknown (for flatten), so that if in future other functions fall into these two categories, we could easily add them to the categories. What do you think?
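The naming suggestion above amounts to classifying functions by the property that matters to the planner rather than by name. A minimal sketch of that idea follows. It is an assumption-laden illustration, not the code in the pull request: the class `ProjectFunctionCategories`, the two sets, and the use of plain function-name strings in place of Calcite expressions are all hypothetical; only the two function names and the two category names come from the comment.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Drill.
final class ProjectFunctionCategories {
    // Functions whose output row count cannot be derived from the input row count.
    static final Set<String> ROW_COUNT_UNKNOWN =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList("FLATTEN")));

    // Functions whose output schema is only known after evaluation.
    static final Set<String> SCHEMA_UNKNOWN =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList("CONVERT_FROMJSON")));

    static boolean rowCountUnknown(List<String> projectFunctions) {
        return containsAny(projectFunctions, ROW_COUNT_UNKNOWN);
    }

    static boolean schemaUnknown(List<String> projectFunctions) {
        return containsAny(projectFunctions, SCHEMA_UNKNOWN);
    }

    // Case-insensitive membership test against one category.
    private static boolean containsAny(List<String> names, Set<String> category) {
        for (String name : names) {
            if (category.contains(name.toUpperCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(rowCountUnknown(Arrays.asList("flatten")));          // prints true
        System.out.println(schemaUnknown(Arrays.asList("convert_fromJson")));   // prints true
        System.out.println(rowCountUnknown(Arrays.asList("upper", "concat")));  // prints false
    }
}
```

With this shape, supporting another function that behaves like FLATTEN would mean adding one name to a set instead of writing a new `isProjectXxx` method.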
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342471#comment-16342471 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user gparai commented on the issue:

    https://github.com/apache/drill/pull/1096

    @chunhui-shi can you please review the new changes (in commit e6dcf14)? Thanks!
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335199#comment-16335199 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user chunhui-shi commented on the issue:

    https://github.com/apache/drill/pull/1096

    Once all tests are done, I think it is fine to add the 'ready-to-commit' label to the JIRA.
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335198#comment-16335198 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user chunhui-shi commented on the issue:

    https://github.com/apache/drill/pull/1096

    +1
[jira] [Commented] (DRILL-6099) Drill does not push limit past project (flatten) if it cannot be pushed into scan
[ https://issues.apache.org/jira/browse/DRILL-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334796#comment-16334796 ]

ASF GitHub Bot commented on DRILL-6099:
---

Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1096#discussion_r163048747

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java ---
    @@ -341,7 +346,7 @@ static RuleSet getPruneScanRules(OptimizerRulesContext optimizerRulesContext) {
                 ParquetPruneScanRule.getFilterOnProjectParquet(optimizerRulesContext),
                 ParquetPruneScanRule.getFilterOnScanParquet(optimizerRulesContext),
                 DrillPushLimitToScanRule.LIMIT_ON_SCAN,
    -            DrillPushLimitToScanRule.LIMIT_ON_PROJECT
    +            DrillPushLimitToScanRule.LIMIT_ON_PROJECT_SCAN
    --- End diff --

    Not sure if we still need "limit_on_project_scan". In theory, limit_on_project and limit_on_scan should already cover all the cases. Have you tested with "limit_on_project_scan" disabled?