[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136381#comment-15136381 ]

Zelaine Fong commented on DRILL-4320:
-------------------------------------

Possibly. Another example is https://github.com/apache/drill/pull/351/files, where the change made was to look for 2 variations of the query plan.

> Difference in query plan on JDK8 for window function query
> ----------------------------------------------------------
>
> Key: DRILL-4320
> URL: https://issues.apache.org/jira/browse/DRILL-4320
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.4.0
> Environment: 4 node cluster CentOS
> Reporter: Khurram Faraaz
> Labels: JDK8SUPPORT
>
> Difference in query plan seen in window function query on JDK8 with below
> test environment, the difference being that a Project is missing after the
> initial Scan, the new plan looks more optimized. Should we update the
> expected query plan or further investigation is required ?
>
> Java 8
> MapR Drill 1.4.0 GA
> JDK8
> MapR FS 5.0.0 GA
>
> Functional/window_functions/optimization/plan/pp_03.sql
>
> {noformat}
> Actual plan
> 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project($0=[$2])
> 00-03          Window(window#0=[window(partition {1} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($0)])])
> 00-04            SelectionVectorRemover
> 00-05              Sort(sort0=[$1], dir0=[ASC])
> 00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], selectionRoot=maprfs:/drill/testdata/subqueries/t1, numFiles=1, usedMetadataFile=false, columns=[`a1`, `c1`]]])
>
> Expected plan
> Screen
> .*Project.*
> .*Project.*
> .*Window.*range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs.*
> .*SelectionVectorRemover.*
> .*Sort.*
> .*Project.*
> .*Scan.*
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
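The idea of accepting 2 variations of the plan can be sketched roughly as follows. This is only an illustration and not the change made in the linked pull request: the helper names `operator_names` and `plan_is_accepted` are hypothetical, the two operator sequences are taken from the actual and expected plans quoted in this issue, and the Scan arguments are shortened.

```python
import re

# Accepted operator sequences: one with the Project above the Scan, one without.
# Both lists are assumptions derived from the plans quoted in DRILL-4320.
ACCEPTED = [
    ["Screen", "Project", "Project", "Window",
     "SelectionVectorRemover", "Sort", "Project", "Scan"],
    ["Screen", "Project", "Project", "Window",
     "SelectionVectorRemover", "Sort", "Scan"],
]

def operator_names(plan):
    """Extract operator names, e.g. '00-05 Sort(...)' gives 'Sort'."""
    return re.findall(r"^\d\d-\d\d\s*([A-Za-z]+)", plan, flags=re.MULTILINE)

def plan_is_accepted(plan):
    """Return True if the plan's operator sequence is one of the accepted variants."""
    return operator_names(plan) in ACCEPTED

# Plan seen on JDK8 (Scan arguments shortened for the example).
jdk8_plan = "\n".join([
    "00-00 Screen",
    "00-01 Project(EXPR$0=[$0])",
    "00-02 Project($0=[$2])",
    "00-03 Window(window#0=[window(partition {1} order by [] range between "
    "UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($0)])])",
    "00-04 SelectionVectorRemover",
    "00-05 Sort(sort0=[$1], dir0=[ASC])",
    "00-06 Scan(groupscan=[ParquetGroupScan])",
])

print(plan_is_accepted(jdk8_plan))
```

Listing the variants explicitly keeps the test strict about everything else in the plan, at the cost of having to add a new entry whenever another equally valid plan shows up.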
[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135543#comment-15135543 ]

Jacques Nadeau commented on DRILL-4320:
---------------------------------------

In a lot of places, we're matching too much of the plan for what we're actually testing. I wonder if that is the case here, and thus we could address it by narrowing the match profile.
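A narrower match could look something like the sketch below, which checks only the Window operator and its frame and ignores the operators around it. It assumes the test is really about the window frame, which the thread does not confirm; the function name `has_unbounded_window` and the shortened sample plans are hypothetical.

```python
import re

# Only the part of the plan under test: a Window operator with an unbounded range frame.
FRAME = r"Window\(.*range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs"

def has_unbounded_window(plan):
    """Check for the Window frame only, ignoring every other operator in the plan."""
    flat = " ".join(plan.split())  # undo line wrapping before matching
    return re.search(FRAME, flat) is not None

window_line = ("00-03 Window(window#0=[window(partition {1} order by [] range\n"
               " between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($0)])])")

# Two plans that differ only in the Project above the Scan (arguments shortened).
plan_without_project = window_line + "\n00-05 Sort(sort0=[$1])\n00-06 Scan(t1)"
plan_with_project = window_line + "\n00-05 Sort(sort0=[$1])\n00-06 Project(a1)\n00-07 Scan(t1)"

print(has_unbounded_window(plan_without_project))
print(has_unbounded_window(plan_with_project))
```

Both sample plans pass this check, so the test would no longer depend on whether the planner keeps the extra Project.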
[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134342#comment-15134342 ]

Zelaine Fong commented on DRILL-4320:
-------------------------------------

Are these tables smallish? My hypothesis is that, because Drill is still using the Volcano planner and the cost of a plan with the Project is the same as the cost of one without, it's arbitrary which one gets picked by the optimizer. So, there is some level of non-determinism in this test. I'm not sure if there's a good way of addressing this.
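A toy illustration of how equal costs can make the choice depend on ordering is given below. It is not Calcite's Volcano planner: the cost values are made up, and the suggestion that candidates are visited in a different order on JDK8 is only an assumption consistent with the hypothesis above.

```python
def cheapest(candidates):
    """Return the first candidate with the minimum cost; ties keep iteration order."""
    return min(candidates, key=lambda candidate: candidate[1])

# Two equivalent sub-plans with the same (made-up) cost.
with_project = ("Sort <- Project <- Scan", 10.0)
without_project = ("Sort <- Scan", 10.0)

# The same cost model picks a different plan depending on the order of the candidates.
print(cheapest([with_project, without_project])[0])
print(cheapest([without_project, with_project])[0])
```

If the two plans really do tie on cost, any change in the order in which the planner sees them would be enough to flip the result without any change in Drill itself.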
[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133715#comment-15133715 ]

Khurram Faraaz commented on DRILL-4320:
---------------------------------------

Test case: Functional/window_functions/optimization/plan/pp_03.sql

explain plan for select sum(a1) over(partition by c1) from t1;

Query plan for the above query:

{noformat}
Actual plan from JDK8 and Drill 1.5.0
00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project($0=[$2])
00-03          Window(window#0=[window(partition {1} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($0)])])
00-04            SelectionVectorRemover
00-05              Sort(sort0=[$1], dir0=[ASC])
00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], selectionRoot=maprfs:/drill/testdata/subqueries/t1, numFiles=1, usedMetadataFile=false, columns=[`a1`, `c1`]]])

Expected plan
Screen
.*Project.*
.*Project.*
.*Window.*range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs.*
.*SelectionVectorRemover.*
.*Sort.*
.*Project.*
.*Scan.*
{noformat}
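To see where the expected plan and the JDK8 plan diverge, the patterns can be applied line by line, as in this sketch. Matching one pattern per plan line with `re.search` is an assumption about how the test framework compares plans, `first_mismatch` is a hypothetical helper, and the Scan arguments are shortened.

```python
import re

EXPECTED = [
    "Screen",
    ".*Project.*",
    ".*Project.*",
    ".*Window.*range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs.*",
    ".*SelectionVectorRemover.*",
    ".*Sort.*",
    ".*Project.*",
    ".*Scan.*",
]

ACTUAL = [
    "00-00 Screen",
    "00-01 Project(EXPR$0=[$0])",
    "00-02 Project($0=[$2])",
    "00-03 Window(window#0=[window(partition {1} order by [] range between "
    "UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($0)])])",
    "00-04 SelectionVectorRemover",
    "00-05 Sort(sort0=[$1], dir0=[ASC])",
    "00-06 Scan(groupscan=[ParquetGroupScan])",
]

def first_mismatch(expected, actual):
    """Return the index of the first expected pattern that fails, or None if all match."""
    for i, pattern in enumerate(expected):
        if i >= len(actual) or re.search(pattern, actual[i]) is None:
            return i
    return None if len(expected) == len(actual) else len(expected)

print(first_mismatch(EXPECTED, ACTUAL))  # prints 6
```

Under that assumption the first failure is at index 6, where the expected `.*Project.*` is compared against the Scan line, which matches the report that the Project after the initial Scan is missing.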
[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133710#comment-15133710 ]

Zelaine Fong commented on DRILL-4320:
-------------------------------------

[~khfaraaz] - can you include a sample of one of the queries that's missing the Project in JDK 1.8?
[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133589#comment-15133589 ]

Khurram Faraaz commented on DRILL-4320:
---------------------------------------

With JDK7 on the same Drill build, we expect a Project after the initial Scan in the query plan. With JDK8 on the same Drill build, we do not see the Project after the initial Scan in the query plan, and hence the test fails.

All our regression runs use JDK7, and we do not see such failures there due to a difference in query plan (a missing Project after the initial Scan).
[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133117#comment-15133117 ]

Zelaine Fong commented on DRILL-4320:
-------------------------------------

[~khfaraaz] - are you saying that in the case of JDK 1.7, the additional project is not there, even when using the same Drill build?
[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130220#comment-15130220 ]

Khurram Faraaz commented on DRILL-4320:
---------------------------------------

Fourteen tests fail on MapR Drill 1.5.0 with JDK8 due to a difference in plan: a Project after the initial Scan is missing from the actual query plan.

Project is missing after initial Scan:
Functional/filter/pushdown/plan/q1.sql
Functional/filter/pushdown/plan/q2.sql
Functional/filter/pushdown/plan/q3.sql
Functional/filter/pushdown/plan/q4.sql
Functional/filter/pushdown/plan/q5.sql
Functional/filter/pushdown/plan/q6.sql
Functional/filter/pushdown/plan/q7.sql
Functional/filter/pushdown/plan/q8.sql
Functional/filter/pushdown/plan/q9.sql
Functional/filter/pushdown/plan/q10.sql
Functional/filter/pushdown/plan/q11.sql
Functional/filter/pushdown/plan/q13.sql
Functional/filter/pushdown/plan/q14.sql
Functional/filter/pushdown/plan/q16.sql
[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130181#comment-15130181 ]

Khurram Faraaz commented on DRILL-4320:
---------------------------------------

I can repro on MapR Drill 1.5.0 (git.commit.id=6a36a704) with JDK8 and MapR FS 5.0.0 on a 4 node cluster on CentOS.