[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query

2016-02-07 Thread Zelaine Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136381#comment-15136381
 ] 

Zelaine Fong commented on DRILL-4320:
-

Possibly.  Another example is https://github.com/apache/drill/pull/351/files, 
where the change made was to look for 2 variations of the query plan.
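
A minimal sketch of the "accept either variation" approach, assuming the expected plan is a list of per-line regular expressions like the one quoted below. The helper names (`matches_variant`, `plan_is_acceptable`) are hypothetical and are not part of Drill's test framework; the JDK8 plan text is the one reported in this issue.

```python
import re

# Per-line patterns taken from the expected plan quoted in this issue.
WITH_PROJECT = [
    r"Screen",
    r"Project",
    r"Project",
    r"Window.*range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs",
    r"SelectionVectorRemover",
    r"Sort",
    r"Project",
    r"Scan",
]
# Second variation: identical, minus the Project directly above the Scan.
WITHOUT_PROJECT = WITH_PROJECT[:6] + WITH_PROJECT[7:]

# Plan reported on JDK8 in this issue (wrapped lines rejoined).
JDK8_PLAN = """
00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project($0=[$2])
00-03          Window(window#0=[window(partition {1} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($0)])])
00-04            SelectionVectorRemover
00-05              Sort(sort0=[$1], dir0=[ASC])
00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], selectionRoot=maprfs:/drill/testdata/subqueries/t1, numFiles=1, usedMetadataFile=false, columns=[`a1`, `c1`]]])
"""


def matches_variant(plan, patterns):
    """True if every non-blank plan line matches the pattern at the same position."""
    lines = [ln for ln in plan.splitlines() if ln.strip()]
    return len(lines) == len(patterns) and all(
        re.search(p, ln) for p, ln in zip(patterns, lines)
    )


def plan_is_acceptable(plan, variants=(WITH_PROJECT, WITHOUT_PROJECT)):
    """Accept the plan if it matches any one of the known variations."""
    return any(matches_variant(plan, v) for v in variants)
```

With these definitions, `plan_is_acceptable(JDK8_PLAN)` holds because the plan matches the variation without the extra Project, while a plan that keeps the Project above the Scan would match the other one.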

> Difference in query plan on JDK8 for window function query
> --
>
> Key: DRILL-4320
> URL: https://issues.apache.org/jira/browse/DRILL-4320
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.4.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>  Labels: JDK8SUPPORT
>
> A difference in query plan is seen for a window function query on JDK8 with the 
> test environment below: a Project is missing after the initial Scan, and the 
> new plan looks more optimized. Should we update the expected query plan, or is 
> further investigation required?
> Java 8
> MapR Drill 1.4.0 GA
> JDK8
> MapR FS 5.0.0 GA
> Functional/window_functions/optimization/plan/pp_03.sql
> {noformat}
> Actual plan
> 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project($0=[$2])
> 00-03          Window(window#0=[window(partition {1} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($0)])])
> 00-04            SelectionVectorRemover
> 00-05              Sort(sort0=[$1], dir0=[ASC])
> 00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], selectionRoot=maprfs:/drill/testdata/subqueries/t1, numFiles=1, usedMetadataFile=false, columns=[`a1`, `c1`]]])
>
> Expected plan
> Screen
>   .*Project.*
>     .*Project.*
>       .*Window.*range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs.*
>         .*SelectionVectorRemover.*
>           .*Sort.*
>             .*Project.*
>               .*Scan.*
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query

2016-02-05 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135543#comment-15135543
 ] 

Jacques Nadeau commented on DRILL-4320:
---

In a lot of places, we're matching too much of the plan for what we're actually 
testing. I wonder if that is the case here, and thus whether we could address 
it by narrowing the match profile.
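
One way to picture a narrower match, as a sketch only: assume the test really cares that a Window with the unbounded frame sits above a Sort, which sits above the Scan, and ignore any Project operators in between. The function names and the choice of required operators are assumptions made for illustration, not Drill's test framework.

```python
import re

# Matches the "00-03  Window(...)" prefix of a plan line and captures the operator name.
OPERATOR = re.compile(r"^\d{2}-\d{2}\s*([A-Za-z]+)")


def operators(plan):
    """Return the operator names of a plan, top to bottom."""
    names = []
    for line in plan.splitlines():
        m = OPERATOR.match(line.strip())
        if m:
            names.append(m.group(1))
    return names


def is_subsequence(required, actual):
    """True if `required` appears in `actual` in order, with gaps allowed."""
    it = iter(actual)
    return all(op in it for op in required)


def window_plan_ok(plan):
    """Check only the Window frame and the Window > Sort > Scan ordering."""
    frame = "range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING"
    has_frame = any("Window" in ln and frame in ln for ln in plan.splitlines())
    return has_frame and is_subsequence(["Window", "Sort", "Scan"], operators(plan))
```

Because Project is not in the required sequence, a plan with or without the Project above the Scan passes the same check, so the test no longer depends on which of the two the planner happens to pick.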



[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query

2016-02-05 Thread Zelaine Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134342#comment-15134342
 ] 

Zelaine Fong commented on DRILL-4320:
-

Are these tables smallish?  My hypothesis is that, because Drill still uses the 
Volcano planner and the cost of a plan with the Project is the same as that of 
one without, it is arbitrary which one the optimizer picks.  So there is some 
level of non-determinism in this test.  I'm not sure there's a good way of 
addressing this.
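
A toy illustration of that hypothesis, not the Volcano planner's actual code: if two candidate plans carry exactly the same cost and the selection keeps the first minimum it sees, the winner depends only on the order in which candidates are enumerated. The plan labels and the cost value below are invented for the example.

```python
def cheapest(candidates):
    """Return the name of the first candidate with the lowest cost.

    Ties are not broken explicitly: a later candidate replaces the current
    best only if it is strictly cheaper, so among equal-cost plans the one
    enumerated first wins.
    """
    best = None
    for name, cost in candidates:
        if best is None or cost < best[1]:
            best = (name, cost)
    return best[0]


# Hypothetical equal-cost alternatives for the bottom of the plan.
with_project = ("Sort <- Project <- Scan", 10.0)
without_project = ("Sort <- Scan", 10.0)

first_order = cheapest([with_project, without_project])
second_order = cheapest([without_project, with_project])
```

Here `first_order` and `second_order` name different plans even though nothing about the costs changed. Anything that alters enumeration order, for example a collection whose iteration order is unspecified, could flip the result between runtimes; whether that is what happens between JDK7 and JDK8 in Drill is an assumption this thread does not confirm.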



[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query

2016-02-04 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133715#comment-15133715
 ] 

Khurram Faraaz commented on DRILL-4320:
---

test case : Functional/window_functions/optimization/plan/pp_03.sql

explain plan for select sum(a1) over(partition by c1) from t1;

Query plan for above query
{noformat}
Actual plan from JDK8 and Drill 1.5.0

00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project($0=[$2])
00-03          Window(window#0=[window(partition {1} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($0)])])
00-04            SelectionVectorRemover
00-05              Sort(sort0=[$1], dir0=[ASC])
00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], selectionRoot=maprfs:/drill/testdata/subqueries/t1, numFiles=1, usedMetadataFile=false, columns=[`a1`, `c1`]]])

Expected plan

Screen
  .*Project.*
    .*Project.*
      .*Window.*range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs.*
        .*SelectionVectorRemover.*
          .*Sort.*
            .*Project.*
              .*Scan.*
{noformat}



[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query

2016-02-04 Thread Zelaine Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133710#comment-15133710
 ] 

Zelaine Fong commented on DRILL-4320:
-

[~khfaraaz] - can you include a sample of one of the queries that's missing the 
Project in JDK 1.8?



[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query

2016-02-04 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133589#comment-15133589
 ] 

Khurram Faraaz commented on DRILL-4320:
---

With JDK7 on the same Drill build, we expect a Project after the initial Scan 
in the query plan.

With JDK8 on the same Drill build, we do not see the Project after the initial 
Scan in the query plan, and hence the test fails.

All our regression runs use JDK7, and there we do not see such failures caused 
by a difference in query plan (a missing Project after the initial Scan).



[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query

2016-02-04 Thread Zelaine Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133117#comment-15133117
 ] 

Zelaine Fong commented on DRILL-4320:
-

[~khfaraaz] - are you saying that in the case of JDK 1.7, the additional 
project is not there, even when using the same Drill build?



[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query

2016-02-03 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130220#comment-15130220
 ] 

Khurram Faraaz commented on DRILL-4320:
---

Fourteen tests fail on MapR Drill 1.5.0 with JDK8 due to a difference in plan: 
a Project after the initial Scan is missing from the actual query plan.

Project is missing after initial Scan
 
 Functional/filter/pushdown/plan/q1.sql
 Functional/filter/pushdown/plan/q2.sql
 Functional/filter/pushdown/plan/q3.sql
 Functional/filter/pushdown/plan/q4.sql
 Functional/filter/pushdown/plan/q5.sql
 Functional/filter/pushdown/plan/q6.sql
 Functional/filter/pushdown/plan/q7.sql
 Functional/filter/pushdown/plan/q8.sql
 Functional/filter/pushdown/plan/q9.sql
 Functional/filter/pushdown/plan/q10.sql
 Functional/filter/pushdown/plan/q11.sql
 Functional/filter/pushdown/plan/q13.sql
 Functional/filter/pushdown/plan/q14.sql
 Functional/filter/pushdown/plan/q16.sql



[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query

2016-02-03 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130181#comment-15130181
 ] 

Khurram Faraaz commented on DRILL-4320:
---

I can reproduce this on MapR Drill 1.5.0 (git.commit.id=6a36a704) with JDK8 and 
MapR FS 5.0.0 on a 4-node CentOS cluster.
