[jira] [Comment Edited] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746973#comment-14746973
 ] 

Sean Hsuan-Yi Chu edited comment on DRILL-3783 at 9/16/15 6:36 AM:
---

[~jaltekruse] I think I know what might be confusing. 

Please see the first query in the description. The parentheses do NOT 
surround the whole UNION ALL. Instead, they surround just the "inputs" of the 
UNION ALL. In other words, the structure is (...) union all (...), NOT 
(... union all ...).

The second query does not have "extra" parentheses. There are two inner pairs 
of parentheses, each enclosing an input of the union-all, plus an outer pair 
which surrounds the whole union-all. In other words, the structure is
((...) union all (...)), NOT ((... union all ...)).
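For illustration, a minimal sketch of the two shapes, reusing the two queries 
from the description (no new queries, just reformatted):

{code}
-- (...) union all (...): the COUNT applies only to the left input
select count(c1)
from (select cast(columns[0] as int) c1 from `testWindow.csv`)
union all
(select cast(columns[0] as int) c2 from `testWindow.csv`);

-- ((...) union all (...)): the COUNT applies to the whole union
select count(c1)
from ((select cast(columns[0] as int) c1 from `testWindow.csv`)
      union all
      (select cast(columns[0] as int) c2 from `testWindow.csv`));
{code}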


was (Author: seanhychu):
[~jaltekruse] I think I know what might be confusing. 

Please see the first query in the description. The parenthesis does NOT 
surround the whole UNION-ALL. Instead, it surrounds just the "inputs" of 
UNION-ALL. Meaning the structure is more like this: (...) union all (...); NOT 
(... union all ...)

The second query does not have "extra" parenthesis. There are two inner 
parentheses, each embracing an input of union-all; Besides, there is an outer 
parenthesis which surrounds the whole Union-all. Meaning the structure is more 
like this:
((...) union all (...))

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746973#comment-14746973
 ] 

Sean Hsuan-Yi Chu edited comment on DRILL-3783 at 9/16/15 6:30 AM:
---

[~jaltekruse] I think I know what might be confusing. 

Please see the first query in the description. The parentheses do NOT 
surround the whole UNION ALL. Instead, they surround just the "inputs" of the 
UNION ALL. In other words, the structure is (...) union all (...), NOT 
(... union all ...).

The second query does not have "extra" parentheses. There are two inner pairs 
of parentheses, each enclosing an input of the union-all, plus an outer pair 
which surrounds the whole union-all. In other words, the structure is
((...) union all (...)).


was (Author: seanhychu):
[~jaltekruse] I think I know what might be confusing. 

Please see the first query in the description. The parenthesis does NOT 
surround the whole UNION-ALL. Instead, it surrounds just the "inputs" of 
UNION-ALL. Meaning the structure is more like this: (...) union all (...); NOT 
(... union all ...)

The second query does not have "extra" parenthesis. There are two inner 
parentheses, each embracing an input of union-all; Besides, there is an outer 
parenthesis which surrounds the whole Union-all. Meaning the structure is more 
like this:
((...) union all (...))

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746973#comment-14746973
 ] 

Sean Hsuan-Yi Chu edited comment on DRILL-3783 at 9/16/15 6:30 AM:
---

[~jaltekruse] I think I know what might be confusing. 

Please see the first query in the description. The parentheses do NOT 
surround the whole UNION ALL. Instead, they surround just the "inputs" of the 
UNION ALL. In other words, the structure is (...) union all (...), NOT 
(... union all ...).

The second query does not have "extra" parentheses. There are two inner pairs 
of parentheses, each enclosing an input of the union-all, plus an outer pair 
which surrounds the whole union-all. In other words, the structure is
((...) union all (...)).


was (Author: seanhychu):
[~jaltekruse] I think I know what might be confusing. 

Please see the first query in the description. The parenthesis does NOT 
surround the whole UNION-ALL. Instead, it surrounds just the "inputs" of 
UNION-ALL. Meaning the structure is more like this: (...) union all (...)

The second query does not have "extra" parenthesis. There are two inner 
parentheses, each embracing an input of union-all; Besides, there is an outer 
parenthesis which surrounds the whole Union-all. Meaning the structure is more 
like this:
((...) union all (...))

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746958#comment-14746958
 ] 

Sean Hsuan-Yi Chu edited comment on DRILL-3783 at 9/16/15 6:29 AM:
---

[~jaltekruse] So what is the problem now? I am confused :(

Let's first summarize how Postgres works:
1. If there are no parentheses => the Union-All is evaluated last
2. If there are parentheses => the subquery inside the parentheses (the 
union-all) is evaluated first

In both cases, Drill is doing exactly the same thing as Postgres.
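A minimal sketch of those two cases, borrowing the table t1 and column a1 from 
the Postgres example elsewhere in this thread (purely illustrative):

{code}
-- no parentheses around the union: the union-all is evaluated last,
-- so the count applies only to the left input
select count(a1) from t1 union all select a1 from t1;

-- parentheses around the whole union: the union-all subquery is evaluated
-- first, and the count applies to its result
select count(a1) from ((select a1 from t1 union all select a1 from t1)) x;
{code}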


was (Author: seanhychu):
[~jaltekruse] So what is the problem now ? I am confused :(

Let's summarize how Postgres works firstly.
1. If there is no parenthesis => Union-All is evaluated the last
2. If there is parenthesis => the subquery in the parenthesis (union-all) is 
evaluated firstly

In both cases, Drill is doing exactly the same thing as Postgres. So why do 

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746973#comment-14746973
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3783:
--

[~jaltekruse] I think I know what might be confusing. 

Please see the first query in the description. The parentheses do NOT 
surround the whole UNION ALL. Instead, they surround just the "inputs" of the 
UNION ALL. In other words, the structure is (...) union all (...).

The second query does not have "extra" parentheses. There are two inner pairs 
of parentheses, each enclosing an input of the union-all, plus an outer pair 
which surrounds the whole union-all. In other words, the structure is
((...) union all (...)).

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746958#comment-14746958
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3783:
--

[~jaltekruse] So what is the problem now? I am confused :(

Let's first summarize how Postgres works:
1. If there are no parentheses => the Union-All is evaluated last
2. If there are parentheses => the subquery inside the parentheses (the 
union-all) is evaluated first

In both cases, Drill is doing exactly the same thing as Postgres. So why do 

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Jason Altekruse (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746858#comment-14746858
 ] 

Jason Altekruse commented on DRILL-3783:


I was pretty surprised that it had a problem with it; I haven't seen any other 
language that complains about extra parentheses before. I am guessing that 
Calcite and Postgres are getting it right here.

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2015-09-15 Thread Abhishek Girish (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746848#comment-14746848
 ] 

Abhishek Girish commented on DRILL-3786:


Value is 23

> Query with window function fails with IllegalFormatConversionException
> --
>
> Key: DRILL-3786
> URL: https://issues.apache.org/jira/browse/DRILL-3786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
> Fix For: 1.2.0
>
> Attachments: drillbit.log.txt, query_profile.json
>
>
> Query fails with Runtime exception:
> {code:sql}
> SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
> ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IllegalFormatConversionException: d != java.lang.Character
> Fragment 1:0
> [Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746800#comment-14746800
 ] 

Victoria Markman commented on DRILL-3783:
-

{code}
postgres=#  select count(a1) from ((select a1 from t1 union all select a1 from 
t1)) x;
 count 
-------
    18
(1 row)
{code}

Who is going to read the standard? :)

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2482) JDBC : calling getObject when the actual column type is 'NVARCHAR' results in NoClassDefFoundError

2015-09-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) resolved DRILL-2482.
---
Resolution: Fixed

> JDBC : calling getObject when the actual column type is 'NVARCHAR' results in 
> NoClassDefFoundError
> --
>
> Key: DRILL-2482
> URL: https://issues.apache.org/jira/browse/DRILL-2482
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Rahul Challapalli
>Assignee: Daniel Barclay (Drill)
>Priority: Blocker
> Fix For: 1.2.0
>
>
> git.commit.id.abbrev=7b4c887
> I tried to call getObject(i) on a column which is of type varchar; Drill 
> failed with the error below:
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/io/Text
>   at 
> org.apache.drill.exec.vector.VarCharVector$Accessor.getObject(VarCharVector.java:407)
>   at 
> org.apache.drill.exec.vector.NullableVarCharVector$Accessor.getObject(NullableVarCharVector.java:386)
>   at 
> org.apache.drill.exec.vector.accessor.NullableVarCharAccessor.getObject(NullableVarCharAccessor.java:98)
>   at 
> org.apache.drill.exec.vector.accessor.BoundCheckingAccessor.getObject(BoundCheckingAccessor.java:137)
>   at 
> org.apache.drill.jdbc.AvaticaDrillSqlAccessor.getObject(AvaticaDrillSqlAccessor.java:136)
>   at 
> net.hydromatic.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:351)
>   at Dummy.testComplexQuery(Dummy.java:94)
>   at Dummy.main(Dummy.java:30)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.Text
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 8 more
> {code}
> When the underlying type is a primitive, the getObject call succeeds



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3658) Missing org.apache.hadoop in the JDBC jar

2015-09-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) resolved DRILL-3658.
---
Resolution: Fixed

> Missing org.apache.hadoop in the JDBC jar
> -
>
> Key: DRILL-3658
> URL: https://issues.apache.org/jira/browse/DRILL-3658
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Piotr Sokólski
>Assignee: Daniel Barclay (Drill)
>Priority: Blocker
> Fix For: 1.2.0
>
>
> java.lang.ClassNotFoundException: local.org.apache.hadoop.io.Text is thrown 
> while trying to access a text field from a result set returned from Drill 
> while using the drill-jdbc-all.jar



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2015-09-15 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746757#comment-14746757
 ] 

Deneche A. Hakim commented on DRILL-3786:
-

[~agirish] what is the value of {{planner.width.max_per_node}}?

> Query with window function fails with IllegalFormatConversionException
> --
>
> Key: DRILL-3786
> URL: https://issues.apache.org/jira/browse/DRILL-3786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
> Fix For: 1.2.0
>
> Attachments: drillbit.log.txt, query_profile.json
>
>
> Query fails with Runtime exception:
> {code:sql}
> SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
> ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IllegalFormatConversionException: d != java.lang.Character
> Fragment 1:0
> [Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2015-09-15 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3786:

Fix Version/s: 1.2.0

> Query with window function fails with IllegalFormatConversionException
> --
>
> Key: DRILL-3786
> URL: https://issues.apache.org/jira/browse/DRILL-3786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
> Fix For: 1.2.0
>
> Attachments: drillbit.log.txt, query_profile.json
>
>
> Query fails with Runtime exception:
> {code:sql}
> SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
> ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IllegalFormatConversionException: d != java.lang.Character
> Fragment 1:0
> [Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2015-09-15 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746751#comment-14746751
 ] 

Deneche A. Hakim commented on DRILL-3786:
-

The failure is happening here:
{code}
if (runningBatches >= Character.MAX_VALUE) {
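  // Character.MAX_VALUE is a char, but the %d conversion expects an integral
  // type, which causes the IllegalFormatConversionException from this issue.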
  final String errMsg = String.format("Tried to add more than %d number of 
batches.", Character.MAX_VALUE);
  logger.error(errMsg);
  throw new DrillRuntimeException(errMsg);
}
{code}

We are passing a char (Character.MAX_VALUE) to String.format() where it's 
expecting an int.

I will submit a patch to fix the error message, but the query will still fail 
because the number of batches the sort needs to handle exceeds the maximum 
that SelectionVector4 can index.

> Query with window function fails with IllegalFormatConversionException
> --
>
> Key: DRILL-3786
> URL: https://issues.apache.org/jira/browse/DRILL-3786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
> Fix For: 1.2.0
>
> Attachments: drillbit.log.txt, query_profile.json
>
>
> Query fails with Runtime exception:
> {code:sql}
> SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
> ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IllegalFormatConversionException: d != java.lang.Character
> Fragment 1:0
> [Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Jason Altekruse (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746669#comment-14746669
 ] 

Jason Altekruse commented on DRILL-3783:


Interestingly, when I tried to put extra parentheses around the sub-query (the 
thing that "fixed" the Drill result) it threw a syntax error. We might be 
allowing invalid syntax with regard to parentheses.

{code}
mysql> select count(firstname) from ((select firstname from users union all 
select firstname from users)) t;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that 
corresponds to your MySQL server version for the right syntax to use near ')) 
t' at line 1
{code}

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3788) Partition Pruning not taking place with metadata caching when we have ~20k files

2015-09-15 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746670#comment-14746670
 ] 

Aman Sinha commented on DRILL-3788:
---

That is true; in this case the data was probably created with the hierarchical 
directory structure. [~rkins] can you run both types of tests? 
i.e. (1) CTAS auto partitioning with partitioning column 'x', run metadata 
refresh, add a new set of auto-partitioned files to the same directory, and 
then run your query with a filter on 'x' (a sketch of this flow is below). 
(2) Create the hierarchical directory structure, refresh metadata, and query 
with filters on these directories (it sounds like you are doing this). 
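A minimal sketch of scenario (1); the table and column names are hypothetical, 
for illustration only:

{code}
-- hypothetical names: dfs.tmp.`sales_part`, source dfs.tmp.`sales_src`, column x
create table dfs.tmp.`sales_part` partition by (x) as
  select * from dfs.tmp.`sales_src`;
refresh table metadata dfs.tmp.`sales_part`;
-- add a new set of auto-partitioned files to the same directory, then check:
explain plan for select * from dfs.tmp.`sales_part` where x = 1;
{code}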



> Partition Pruning not taking place with metadata caching when we have ~20k 
> files
> 
>
> Key: DRILL-3788
> URL: https://issues.apache.org/jira/browse/DRILL-3788
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Rahul Challapalli
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: plan.txt
>
>
> git.commit.id.abbrev=240a455
> Partition Pruning did not take place for the below query after I executed the 
> "refresh table metadata command"
> {code}
>  explain plan for 
> select
>   l_returnflag,
>   l_linestatus
> from
>   `lineitem/2006/1`
> where
>   dir0=1 or dir0=2
> {code}
> The logs did not indicate that "pruning did not take place"
> Before executing the refresh table metadata command, partition pruning did 
> take effect
> I am not attaching the data set as it is larger than 10MB. Reach out to me if 
> you need more information



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Jason Altekruse (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746667#comment-14746667
 ] 

Jason Altekruse commented on DRILL-3783:


Shouldn't the parentheses that surround the sub-query with the union all in it 
have absolute top precedence?

I tried playing around with MySQL, and the behavior is consistent with what is 
described in this report as the expected behavior. It does confirm what you 
said about the aggregate applying before the union all in the case of no 
parentheses. MySQL seems to be doing implicit casting on the result of the 
aggregation to allow it to co-exist with the varchar values that are introduced 
by the union all.

{code}
mysql> select * from users;
++---+--+---+-+
| id | firstname | lastname | email | reg_date|
++---+--+---+-+
|  1 | john  | smith| NULL  | 2015-09-15 18:51:06 |
|  2 | john  | doe  | NULL  | 2015-09-15 18:51:06 |
|  3 | bill  | williams | NULL  | 2015-09-15 18:51:06 |
++---+--+---+-+
3 rows in set (0.00 sec)

mysql> select count(firstname) from (select firstname from users union all 
select firstname from users) t;
+--+
| count(firstname) |
+--+
|6 |
+--+
1 row in set (0.00 sec)

mysql> select count(firstname) from users union all select firstname from users 
t;
+--+
| count(firstname) |
+--+
| 3|
| john |
| john |
| bill |
+--+
4 rows in set (0.00 sec)

{code}

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse reopened DRILL-3783:


> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3783.
-

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-3783.
--
Resolution: Invalid

This is fine. Set operators (e.g., UNION, INTERSECT) have the lowest 
precedence, so they are applied last.

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over results returned by a union all query returns incorrect results. The 
> query below returned an exception (please see DRILL-2637); that JIRA was marked 
> as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3788) Partition Pruning not taking place with metadata caching when we have ~20k files

2015-09-15 Thread Steven Phillips (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746645#comment-14746645
 ] 

Steven Phillips commented on DRILL-3788:


I am a bit confused. This jira seems to be related to directory-based partition 
pruning, not single-valued column based pruning. As far as I know they should 
both be working, though. I would have to use a debugger to find out why it's 
failing.

> Partition Pruning not taking place with metadata caching when we have ~20k 
> files
> 
>
> Key: DRILL-3788
> URL: https://issues.apache.org/jira/browse/DRILL-3788
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Rahul Challapalli
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: plan.txt
>
>
> git.commit.id.abbrev=240a455
> Partition Pruning did not take place for the below query after I executed the 
> "refresh table metadata command"
> {code}
>  explain plan for 
> select
>   l_returnflag,
>   l_linestatus
> from
>   `lineitem/2006/1`
> where
>   dir0=1 or dir0=2
> {code}
> The logs did not indicate that "pruning did not take place"
> Before executing the refresh table metadata command, partition pruning did 
> take effect
> I am not attaching the data set as it is larger than 10MB. Reach out to me if 
> you need more information



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3788) Partition Pruning not taking place with metadata caching when we have ~20k files

2015-09-15 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746640#comment-14746640
 ] 

Aman Sinha commented on DRILL-3788:
---

[~rkins] can you try to create a smaller repro? Given the nature of the issue 
it shouldn't require a large number of files. 
[~sphillips] does the metadata cache store the min/max values for a column in a 
row group to determine if it is single_value? Any thoughts on what is missing 
to get partition pruning to work with metadata caching? 

> Partition Pruning not taking place with metadata caching when we have ~20k 
> files
> 
>
> Key: DRILL-3788
> URL: https://issues.apache.org/jira/browse/DRILL-3788
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Rahul Challapalli
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: plan.txt
>
>
> git.commit.id.abbrev=240a455
> Partition Pruning did not take place for the below query after I executed the 
> "refresh table metadata command"
> {code}
>  explain plan for 
> select
>   l_returnflag,
>   l_linestatus
> from
>   `lineitem/2006/1`
> where
>   dir0=1 or dir0=2
> {code}
> The logs did not indicate that "pruning did not take place"
> Before executing the refresh table metadata command, partition pruning did 
> take effect
> I am not attaching the data set as it is larger than 10MB. Reach out to me if 
> you need more information



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2748) Filter is not pushed down into subquery with the group by

2015-09-15 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746631#comment-14746631
 ] 

Jinfeng Ni commented on DRILL-2748:
---

It turns out that adding FilterAggregateTransposableRule is not enough to address 
this issue. The current cost estimate for aggregation in Drill's cost model is way 
off from the real cost. In some cases the costing issue makes the planner pick the 
plan without the filter pushdown even when the rule produces the alternative, 
because the alternative is estimated to have a higher cost than the original plan.

I have revised the cost formula for aggregation, and it seems to produce the plan 
that we want. I'll submit another patch.
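
For context, the rewrite the rule is meant to enable is roughly the following (a 
sketch based on the query in the description, not actual Drill plan output; it is 
valid here because the filter is on a grouping column):

{code:sql}
-- Original shape: the filter sits above the aggregation.
select x, y, z
from (select a1, b1, avg(a1) from t1 group by a1, b1) as sq(x, y, z)
where x = 10;

-- After pushing the filter past the aggregate, only matching rows feed the
-- HashAgg, which is the plan the revised costing should make the planner prefer.
select a1, b1, avg(a1)
from t1
where a1 = 10
group by a1, b1;
{code}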



> Filter is not pushed down into subquery with the group by
> -
>
> Key: DRILL-2748
> URL: https://issues.apache.org/jira/browse/DRILL-2748
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0, 1.0.0, 1.1.0
>Reporter: Victoria Markman
>Assignee: Jinfeng Ni
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-2748-Add-optimizer-rule-to-push-filter-past-ag.patch
>
>
> I'm not sure about this one, theoretically filter could have been pushed into 
> the subquery.
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select a1, 
> b1, avg(a1) from t1 group by a1, b1) as sq(x, y, z) where x = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(x=[$0], y=[$1], z=[$2])
> 00-02Project(x=[$0], y=[$1], z=[CAST(/(CastHigh(CASE(=($3, 0), null, 
> $2)), $3)):ANY NOT NULL])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($0, 10)])
> 00-05  HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], 
> agg#1=[COUNT($0)])
> 00-06Project(a1=[$1], b1=[$0])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, 
> `b1`]]])
> {code}
> Same with distinct in subquery:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from ( select 
> distinct a1, b1, c1 from t1 ) as sq(x, y, z) where x = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(x=[$0], y=[$1], z=[$2])
> 00-02Project(x=[$0], y=[$1], z=[$2])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($0, 10)])
> 00-05  HashAgg(group=[{0, 1, 2}])
> 00-06Project(a1=[$2], b1=[$1], c1=[$0])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`, 
> `c1`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3788) Partition Pruning not taking place with metadata caching when we have ~20k files

2015-09-15 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3788:
--
Assignee: Aman Sinha  (was: Jinfeng Ni)

> Partition Pruning not taking place with metadata caching when we have ~20k 
> files
> 
>
> Key: DRILL-3788
> URL: https://issues.apache.org/jira/browse/DRILL-3788
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Rahul Challapalli
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: plan.txt
>
>
> git.commit.id.abbrev=240a455
> Partition Pruning did not take place for the below query after I executed the 
> "refresh table metadata command"
> {code}
>  explain plan for 
> select
>   l_returnflag,
>   l_linestatus
> from
>   `lineitem/2006/1`
> where
>   dir0=1 or dir0=2
> {code}
> The logs did not indicate that "pruning did not take place"
> Before executing the refresh table metadata command, partition pruning did 
> take effect
> I am not attaching the data set as it is larger than 10MB. Reach out to me if 
> you need more information



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3788) Partition Pruning not taking place with metadata caching when we have ~20k files

2015-09-15 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli updated DRILL-3788:
-
Attachment: plan.txt

Log file contents
{code}
 2015-09-16 00:59:51,812 [2a0740fa-e089-4afc-a162-cfc554a4e70a:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a0740fa-e089-4afc-a162-cfc554a4e70a:0:0: 
State to report: FINISHED
 2015-09-16 00:59:51,822 [BitServer-4] INFO  
o.a.drill.exec.work.foreman.Foreman - State change requested.  RUNNING --> 
COMPLETED 
 2015-09-16 00:59:51,828 [BitServer-4] INFO  
o.a.drill.exec.work.foreman.Foreman - foreman cleaning up.
 2015-09-16 01:02:10,342 [2a07406d-9089-755a-2371-fc8750a53628:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
RUNNING
 2015-09-16 01:02:10,343 [2a07406d-9089-755a-2371-fc8750a53628:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a07406d-9089-755a-2371-fc8750a53628:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
 2015-09-16 01:02:10,343 [2a07406d-9089-755a-2371-fc8750a53628:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a07406d-9089-755a-2371-fc8750a53628:0:0: 
State to report: RUNNING
 2015-09-16 01:02:10,351 [2a07406d-9089-755a-2371-fc8750a53628:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a07406d-9089-755a-2371-fc8750a53628:0:0: 
State change requested RUNNING --> FINISHED
 2015-09-16 01:02:10,351 [2a07406d-9089-755a-2371-fc8750a53628:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a07406d-9089-755a-2371-fc8750a53628:0:0: 
State to report: FINISHED
 2015-09-16 01:02:10,360 [BitServer-4] INFO  
o.a.drill.exec.work.foreman.Foreman - State change requested.  RUNNING --> 
COMPLETED 
 2015-09-16 01:02:10,367 [BitServer-4] INFO  
o.a.drill.exec.work.foreman.Foreman - foreman cleaning up.
 2015-09-15 22:42:33,037 [2a07630c-0d47-a04a-5490-a814a65a575d:frag:1:11] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 
2a07630c-0d47-a04a-5490-a814a65a575d:1:11: State change requested RUNNING --> 
FINISHED
 2015-09-15 22:42:33,038 [2a07630c-0d47-a04a-5490-a814a65a575d:frag:1:11] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a07630c-0d47-a04a-5490-a814a65a575d:1:11: 
State to report: FINISHED
 2015-09-15 22:42:33,040 [2a07630c-0d47-a04a-5490-a814a65a575d:frag:1:13] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 
2a07630c-0d47-a04a-5490-a814a65a575d:1:13: State change requested RUNNING --> 
FINISHED
 2015-09-15 22:42:33,040 [2a07630c-0d47-a04a-5490-a814a65a575d:frag:1:13] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a07630c-0d47-a04a-5490-a814a65a575d:1:13: 
State to report: FINISHED
 2015-09-15 22:42:33,041 [2a07630c-0d47-a04a-5490-a814a65a575d:frag:1:15] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 
2a07630c-0d47-a04a-5490-a814a65a575d:1:15: State change requested RUNNING --> 
FINISHED
 2015-09-15 22:42:33,042 [2a07630c-0d47-a04a-5490-a814a65a575d:frag:1:15] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a07630c-0d47-a04a-5490-a814a65a575d:1:15: 
State to report: FINISHED
 2015-09-15 22:42:33,042 [2a07630c-0d47-a04a-5490-a814a65a575d:frag:2:1] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a07630c-0d47-a04a-5490-a814a65a575d:2:1: 
State change requested RUNNING --> FINISHED
 2015-09-15 22:42:33,043 [2a07630c-0d47-a04a-5490-a814a65a575d:frag:2:1] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a07630c-0d47-a04a-5490-a814a65a575d:2:1: 
State to report: FINISHED
 2015-09-15 22:42:33,156 [2a07630c-0d47-a04a-5490-a814a65a575d:frag:1:29] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 
2a07630c-0d47-a04a-5490-a814a65a575d:1:29: State change requested RUNNING --> 
FINISHED
 2015-09-15 22:42:33,157 [2a07630c-0d47-a04a-5490-a814a65a575d:frag:1:29] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a07630c-0d47-a04a-5490-a814a65a575d:1:29: 
State to report: FINISHED
 2015-09-16 01:09:37,149 [2a073eae-ca95-9ca5-26a5-335acf3085c5:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
RUNNING
 2015-09-16 01:09:37,150 [2a073eae-ca95-9ca5-26a5-335acf3085c5:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a073eae-ca95-9ca5-26a5-335acf3085c5:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
 2015-09-16 01:09:37,150 [2a073eae-ca95-9ca5-26a5-335acf3085c5:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a073eae-ca95-9ca5-26a5-335acf3085c5:0:0: 
State to report: RUNNING
 2015-09-16 01:09:37,160 [2a073eae-ca95-9ca5-26a5-335acf3085c5:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a073eae-ca95-9ca5-26a5-335acf3085c5:0:0: 
State change requested RUNNING --> FINISHED
 2015-09-16 01:09:37,160 [2a073eae-ca95-9ca5-26a5-335acf3085c5:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a073eae-ca95-9ca5-26a5-335acf3085c5:0:0: 
State to report: FINISHED
 2015-09-16 01:09:37,171 [BitServer-4] INFO  
o.a.drill.exec.work.foreman.Foreman - State change requested.  RUNNING --> 
COMPLETED
 2015-09-16 01:09:37,178 [BitServer-4] INFO  
o.a.drill.exec.work

[jira] [Created] (DRILL-3788) Partition Pruning not taking place with metadata caching when we have ~20k files

2015-09-15 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-3788:


 Summary: Partition Pruning not taking place with metadata caching 
when we have ~20k files
 Key: DRILL-3788
 URL: https://issues.apache.org/jira/browse/DRILL-3788
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.2.0
Reporter: Rahul Challapalli
Assignee: Jinfeng Ni
Priority: Critical
 Fix For: 1.2.0


git.commit.id.abbrev=240a455

Partition Pruning did not take place for the below query after I executed the 
"refresh table metadata command"
{code}
 explain plan for 
select
  l_returnflag,
  l_linestatus
from
  `lineitem/2006/1`
where
  dir0=1 or dir0=2
{code}

The logs did not indicate that "pruning did not take place"

Before executing the refresh table metadata command, partition pruning did take 
effect

I am not attaching the data set as it is larger than 10MB. Reach out to me if 
you need more information
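
For reference, the metadata cache command mentioned above is roughly the following 
(a sketch; the table name is assumed from the query):

{code:sql}
-- Builds/refreshes the Parquet metadata cache for the table so planning can read
-- one summary file instead of opening ~20k individual Parquet footers.
refresh table metadata `lineitem`;
{code}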




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (DRILL-2748) Filter is not pushed down into subquery with the group by

2015-09-15 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2748:

Comment: was deleted

(was: [~jni]

I tried with tpcds sf100, filter is not pushed down as well as in the original 
query:

{code}
0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select 
ss_quantity, ss_store_sk, avg(ss_quantity) from store sales group by 
ss_quantity, ss_store_sk) as sq(x, y, z) where x = 10;
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(x=[$0], y=[$1], z=[$2])
00-02Project(x=[$0], y=[$1], z=[$2])
00-03  Project(ss_quantity=[$0], ss_store_sk=[$1], 
EXPR$2=[CAST(/(CastHigh(CASE(=($3, 0), null, $2)), $3)):ANY NOT NULL])
00-04SelectionVectorRemover
00-05  Filter(condition=[=($0, 10)])
00-06HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], 
agg#1=[COUNT($0)])
00-07  Project(ss_quantity=[$1], ss_store_sk=[$0])
00-08Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///tpcds100/parquet/store]], 
selectionRoot=maprfs:/tpcds100/parquet/store, numFiles=1, 
columns=[`ss_quantity`, `ss_store_sk`]]])
{code})

> Filter is not pushed down into subquery with the group by
> -
>
> Key: DRILL-2748
> URL: https://issues.apache.org/jira/browse/DRILL-2748
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0, 1.0.0, 1.1.0
>Reporter: Victoria Markman
>Assignee: Jinfeng Ni
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-2748-Add-optimizer-rule-to-push-filter-past-ag.patch
>
>
> I'm not sure about this one, theoretically filter could have been pushed into 
> the subquery.
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select a1, 
> b1, avg(a1) from t1 group by a1, b1) as sq(x, y, z) where x = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(x=[$0], y=[$1], z=[$2])
> 00-02Project(x=[$0], y=[$1], z=[CAST(/(CastHigh(CASE(=($3, 0), null, 
> $2)), $3)):ANY NOT NULL])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($0, 10)])
> 00-05  HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], 
> agg#1=[COUNT($0)])
> 00-06Project(a1=[$1], b1=[$0])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, 
> `b1`]]])
> {code}
> Same with distinct in subquery:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from ( select 
> distinct a1, b1, c1 from t1 ) as sq(x, y, z) where x = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(x=[$0], y=[$1], z=[$2])
> 00-02Project(x=[$0], y=[$1], z=[$2])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($0, 10)])
> 00-05  HashAgg(group=[{0, 1, 2}])
> 00-06Project(a1=[$2], b1=[$1], c1=[$0])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`, 
> `c1`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (DRILL-2748) Filter is not pushed down into subquery with the group by

2015-09-15 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman reopened DRILL-2748:
-

> Filter is not pushed down into subquery with the group by
> -
>
> Key: DRILL-2748
> URL: https://issues.apache.org/jira/browse/DRILL-2748
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0, 1.0.0, 1.1.0
>Reporter: Victoria Markman
>Assignee: Jinfeng Ni
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-2748-Add-optimizer-rule-to-push-filter-past-ag.patch
>
>
> I'm not sure about this one, theoretically filter could have been pushed into 
> the subquery.
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select a1, 
> b1, avg(a1) from t1 group by a1, b1) as sq(x, y, z) where x = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(x=[$0], y=[$1], z=[$2])
> 00-02Project(x=[$0], y=[$1], z=[CAST(/(CastHigh(CASE(=($3, 0), null, 
> $2)), $3)):ANY NOT NULL])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($0, 10)])
> 00-05  HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], 
> agg#1=[COUNT($0)])
> 00-06Project(a1=[$1], b1=[$0])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, 
> `b1`]]])
> {code}
> Same with distinct in subquery:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from ( select 
> distinct a1, b1, c1 from t1 ) as sq(x, y, z) where x = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(x=[$0], y=[$1], z=[$2])
> 00-02Project(x=[$0], y=[$1], z=[$2])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($0, 10)])
> 00-05  HashAgg(group=[{0, 1, 2}])
> 00-06Project(a1=[$2], b1=[$1], c1=[$0])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`, 
> `c1`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2748) Filter is not pushed down into subquery with the group by

2015-09-15 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746599#comment-14746599
 ] 

Victoria Markman commented on DRILL-2748:
-

[~jni]

I tried with TPC-DS SF100; the filter is not pushed down here either, same as in 
the original query:

{code}
0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select 
ss_quantity, ss_store_sk, avg(ss_quantity) from store sales group by 
ss_quantity, ss_store_sk) as sq(x, y, z) where x = 10;
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(x=[$0], y=[$1], z=[$2])
00-02Project(x=[$0], y=[$1], z=[$2])
00-03  Project(ss_quantity=[$0], ss_store_sk=[$1], 
EXPR$2=[CAST(/(CastHigh(CASE(=($3, 0), null, $2)), $3)):ANY NOT NULL])
00-04SelectionVectorRemover
00-05  Filter(condition=[=($0, 10)])
00-06HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], 
agg#1=[COUNT($0)])
00-07  Project(ss_quantity=[$1], ss_store_sk=[$0])
00-08Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///tpcds100/parquet/store]], 
selectionRoot=maprfs:/tpcds100/parquet/store, numFiles=1, 
columns=[`ss_quantity`, `ss_store_sk`]]])
{code}

> Filter is not pushed down into subquery with the group by
> -
>
> Key: DRILL-2748
> URL: https://issues.apache.org/jira/browse/DRILL-2748
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0, 1.0.0, 1.1.0
>Reporter: Victoria Markman
>Assignee: Jinfeng Ni
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-2748-Add-optimizer-rule-to-push-filter-past-ag.patch
>
>
> I'm not sure about this one, theoretically filter could have been pushed into 
> the subquery.
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select a1, 
> b1, avg(a1) from t1 group by a1, b1) as sq(x, y, z) where x = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(x=[$0], y=[$1], z=[$2])
> 00-02Project(x=[$0], y=[$1], z=[CAST(/(CastHigh(CASE(=($3, 0), null, 
> $2)), $3)):ANY NOT NULL])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($0, 10)])
> 00-05  HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], 
> agg#1=[COUNT($0)])
> 00-06Project(a1=[$1], b1=[$0])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, 
> `b1`]]])
> {code}
> Same with distinct in subquery:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from ( select 
> distinct a1, b1, c1 from t1 ) as sq(x, y, z) where x = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(x=[$0], y=[$1], z=[$2])
> 00-02Project(x=[$0], y=[$1], z=[$2])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($0, 10)])
> 00-05  HashAgg(group=[{0, 1, 2}])
> 00-06Project(a1=[$2], b1=[$1], c1=[$0])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`, 
> `c1`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2748) Filter is not pushed down into subquery with the group by

2015-09-15 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746596#comment-14746596
 ] 

Victoria Markman commented on DRILL-2748:
-

[~jni]

The costing theory may not be correct: I tried with TPC-DS SF100, and the filter is 
still not pushed down :(

{code}
0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select 
ss_quantity, ss_store_sk, avg(ss_quantity) from store sales group by 
ss_quantity, ss_store_sk) as sq(x, y, z) where x = 10;
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(x=[$0], y=[$1], z=[$2])
00-02Project(x=[$0], y=[$1], z=[$2])
00-03  Project(ss_quantity=[$0], ss_store_sk=[$1], 
EXPR$2=[CAST(/(CastHigh(CASE(=($3, 0), null, $2)), $3)):ANY NOT NULL])
00-04SelectionVectorRemover
00-05  Filter(condition=[=($0, 10)])
00-06HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], 
agg#1=[COUNT($0)])
00-07  Project(ss_quantity=[$1], ss_store_sk=[$0])
00-08Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///tpcds100/parquet/store]], 
selectionRoot=maprfs:/tpcds100/parquet/store, numFiles=1, 
columns=[`ss_quantity`, `ss_store_sk`]]])
{code}

> Filter is not pushed down into subquery with the group by
> -
>
> Key: DRILL-2748
> URL: https://issues.apache.org/jira/browse/DRILL-2748
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0, 1.0.0, 1.1.0
>Reporter: Victoria Markman
>Assignee: Jinfeng Ni
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-2748-Add-optimizer-rule-to-push-filter-past-ag.patch
>
>
> I'm not sure about this one, theoretically filter could have been pushed into 
> the subquery.
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select a1, 
> b1, avg(a1) from t1 group by a1, b1) as sq(x, y, z) where x = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(x=[$0], y=[$1], z=[$2])
> 00-02Project(x=[$0], y=[$1], z=[CAST(/(CastHigh(CASE(=($3, 0), null, 
> $2)), $3)):ANY NOT NULL])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($0, 10)])
> 00-05  HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], 
> agg#1=[COUNT($0)])
> 00-06Project(a1=[$1], b1=[$0])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, 
> `b1`]]])
> {code}
> Same with distinct in subquery:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from ( select 
> distinct a1, b1, c1 from t1 ) as sq(x, y, z) where x = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(x=[$0], y=[$1], z=[$2])
> 00-02Project(x=[$0], y=[$1], z=[$2])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($0, 10)])
> 00-05  HashAgg(group=[{0, 1, 2}])
> 00-06Project(a1=[$2], b1=[$1], c1=[$0])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`, 
> `c1`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3679) IOB Exception : when window functions used in outer and inner query

2015-09-15 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746598#comment-14746598
 ] 

Khurram Faraaz commented on DRILL-3679:
---

Ok I will verify and confirm.

> IOB Exception : when window functions used in outer and inner query
> ---
>
> Key: DRILL-3679
> URL: https://issues.apache.org/jira/browse/DRILL-3679
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
> Environment: private-branch 
> https://github.com/adeneche/incubator-drill/tree/new-window-funcs
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
>  Labels: window_function
> Fix For: 1.2.0
>
>
> IOB Exception seen when two different window functions are used in inner and 
> outer queries.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select rnum, position_id, ntile(4) over(order 
> by position_id) from (select position_id, row_number() over(order by 
> position_id) as rnum from cp.`employee.json`);
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
> Fragment 0:0
> [Error Id: 8e0cbf82-842d-4fa7-ab0d-1d982a3d6c24 on centos-03.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3679) IOB Exception : when window functions used in outer and inner query

2015-09-15 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746594#comment-14746594
 ] 

Jinfeng Ni commented on DRILL-3679:
---

Can you run against the latest master branch? The fix for DRILL-3680 is in 
commit 9afcf61f6c993cd028022d827daa7f873a61ffaa.

The commit you tried seems to be at least two days older. 
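
A quick way to check which commit a Drillbit is running (a sketch; sys.version and 
its columns are as shown later in this thread):

{code:sql}
-- Returns the commit id and commit time of the running build.
select commit_id, commit_time from sys.version;
{code}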


> IOB Exception : when window functions used in outer and inner query
> ---
>
> Key: DRILL-3679
> URL: https://issues.apache.org/jira/browse/DRILL-3679
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
> Environment: private-branch 
> https://github.com/adeneche/incubator-drill/tree/new-window-funcs
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
>  Labels: window_function
> Fix For: 1.2.0
>
>
> IOB Exception seen when two different window functions are used in inner and 
> outer queries.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select rnum, position_id, ntile(4) over(order 
> by position_id) from (select position_id, row_number() over(order by 
> position_id) as rnum from cp.`employee.json`);
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
> Fragment 0:0
> [Error Id: 8e0cbf82-842d-4fa7-ab0d-1d982a3d6c24 on centos-03.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3679) IOB Exception : when window functions used in outer and inner query

2015-09-15 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746584#comment-14746584
 ] 

Khurram Faraaz commented on DRILL-3679:
---

I see this JIRA is marked as resolved, but the issue still exists on master; see 
the commit id below. I have verified the fix for DRILL-3680, which is now marked 
Fixed on master, on the same commit id as this one.

{code}
0: jdbc:drill:schema=dfs.tmp> select * from sys.version;
+---+-++--++
| commit_id | 
commit_message  |commit_time | 
build_email  | build_time |
+---+-++--++
| b525692e05c2a562a664093abd46bf68137b4a3b  | DRILL-3280, DRILL-3360, 
DRILL-3601, DRILL-3649: Add test cases  | 11.09.2015 @ 00:52:25 UTC  | Unknown  
| 11.09.2015 @ 05:43:11 UTC  |
+---+-++--++
1 row selected (0.209 seconds)
0: jdbc:drill:schema=dfs.tmp> select rnum, position_id, ntile(4) over(order by 
position_id) from (select position_id, row_number() over(order by position_id) 
as rnum from cp.`employee.json`);
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))

Fragment 0:0

[Error Id: 42b5d107-a3c8-477d-943d-40469f6cbd17 on centos-04.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
0: jdbc:drill:schema=dfs.tmp> 
{code}

> IOB Exception : when window functions used in outer and inner query
> ---
>
> Key: DRILL-3679
> URL: https://issues.apache.org/jira/browse/DRILL-3679
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
> Environment: private-branch 
> https://github.com/adeneche/incubator-drill/tree/new-window-funcs
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
>  Labels: window_function
> Fix For: 1.2.0
>
>
> IOB Exception seen when two different window functions are used in inner and 
> outer queries.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select rnum, position_id, ntile(4) over(order 
> by position_id) from (select position_id, row_number() over(order by 
> position_id) as rnum from cp.`employee.json`);
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
> Fragment 0:0
> [Error Id: 8e0cbf82-842d-4fa7-ab0d-1d982a3d6c24 on centos-03.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3679) IOB Exception : when window functions used in outer and inner query

2015-09-15 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746586#comment-14746586
 ] 

Khurram Faraaz commented on DRILL-3679:
---

Should I re-open this JIRA or report a new one?

> IOB Exception : when window functions used in outer and inner query
> ---
>
> Key: DRILL-3679
> URL: https://issues.apache.org/jira/browse/DRILL-3679
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
> Environment: private-branch 
> https://github.com/adeneche/incubator-drill/tree/new-window-funcs
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
>  Labels: window_function
> Fix For: 1.2.0
>
>
> IOB Exception seen when two different window functions are used in inner and 
> outer queries.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select rnum, position_id, ntile(4) over(order 
> by position_id) from (select position_id, row_number() over(order by 
> position_id) as rnum from cp.`employee.json`);
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
> Fragment 0:0
> [Error Id: 8e0cbf82-842d-4fa7-ab0d-1d982a3d6c24 on centos-03.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3680) window function query returns Incorrect results

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3680.
-

Verified the fix on master, commit id b525692e, and added a test.

{code}
0: jdbc:drill:schema=dfs.tmp>  select c1 , c2 , lead(c2) OVER ( PARTITION BY c2 
ORDER BY c1) lead_c2 FROM (SELECT c1 , c2, ntile(3) over(PARTITION BY c2 ORDER 
BY c1) FROM `tblWnulls.parquet`);
+-+---+--+
| c1  |  c2   | lead_c2  |
+-+---+--+
| 0   | a | a|
| 1   | a | a|
| 5   | a | a|
| 10  | a | a|
| 11  | a | a|
| 14  | a | a|
| 1   | a | null |
| 2   | b | b|
| 9   | b | b|
| 13  | b | b|
| 17  | b | null |
| 4   | c | c|
| 6   | c | c|
| 8   | c | c|
| 12  | c | c|
| 13  | c | c|
| 13  | c | c|
| null| c | null |
| 10  | d | d|
| 11  | d | d|
| 2147483647  | d | d|
| 2147483647  | d | d|
| null| d | d|
| null| d | null |
| -1  | e | e|
| 15  | e | null |
| 19  | null  | null |
| 65536   | null  | null |
| 100 | null  | null |
| null| null  | null |
+-+---+--+
30 rows selected (0.579 seconds)
{code}

> window function query returns Incorrect results 
> 
>
> Key: DRILL-3680
> URL: https://issues.apache.org/jira/browse/DRILL-3680
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: private-branch 
> https://github.com/adeneche/incubator-drill/tree/new-window-funcs
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
>Priority: Critical
>  Labels: window_function
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-3680-Fix-incorrect-query-result-or-IOBE-when-t.patch, 
> 0001-DRILL-3680-Fix-incorrect-query-result-or-IOBE-when-t.patch.2, 
> tblWnulls.parquet
>
>
> Query plan from Drill for the query that returns wrong results
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select c1 , c2 , lead(c2) OVER 
> ( PARTITION BY c2 ORDER BY c1) lead_c2 FROM (SELECT c1 , c2, ntile(3) 
> over(PARTITION BY c2 ORDER BY c1) FROM `tblWnulls.parquet`);
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(c1=[$0], c2=[$1], lead_c2=[$2])
> 00-02Project(c1=[$0], c2=[$1], lead_c2=[$2])
> 00-03  Project(c1=[$0], c2=[$1], $2=[$3])
> 00-04Window(window#0=[window(partition {1} order by [0] range 
> between UNBOUNDED PRECEDING and CURRENT ROW aggs [LEAD($1)])])
> 00-05  Window(window#0=[window(partition {1} order by [0] range 
> between UNBOUNDED PRECEDING and CURRENT ROW aggs [NTILE($2)])])
> 00-06SelectionVectorRemover
> 00-07  Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
> 00-08Project(c1=[$1], c2=[$0])
> 00-09  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls.parquet]], 
> selectionRoot=maprfs:/tmp/tblWnulls.parquet, numFiles=1, columns=[`c1`, 
> `c2`]]])
> {code}
> Results returned by Drill.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select c1 , c2 , lead(c2) OVER ( PARTITION BY 
> c2 ORDER BY c1) lead_c2 FROM (SELECT c1 , c2, ntile(3) over(PARTITION BY c2 
> ORDER BY c1) FROM `tblWnulls.parquet`);
> +-+---+--+
> | c1  |  c2   | lead_c2  |
> +-+---+--+
> | 0   | a | null |
> | 1   | a | null |
> | 5   | a | null |
> | 10  | a | null |
> | 11  | a | null |
> | 14  | a | null |
> | 1   | a | null |
> | 2   | b | null |
> | 9   | b | null |
> | 13  | b | null |
> | 17  | b | null |
> | 4   | c | null |
> | 6   | c | null |
> | 8   | c | null |
> | 12  | c | null |
> | 13  | c | null |
> | 13  | c | null |
> | null| c | null |
> | 10  | d | null |
> | 11  | d | null |
> | 2147483647  | d | null |
> | 2147483647  | d | null |
> | null| d | null |
> | null| d | null |
> | -1  | e | null |
> | 15  

[jira] [Closed] (DRILL-3537) Empty Json file can potentially result into wrong results

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-3537.
-
Assignee: Chun Chang  (was: Parth Chandra)

verified fix.

{noformat}
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select * from 
dfs.`/tmp/drill-3537/b.json`;
++
| a  |
++
| 1  |
++
1 row selected (0.305 seconds)
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select * from 
dfs.`/tmp/drill-3537/a.json`;
+--+
|  |
+--+
+--+
No rows selected (0.268 seconds)
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select * from 
dfs.`/tmp/drill-3537`;
++
| a  |
++
| 1  |
++
{noformat}

> Empty Json file can potentially result into wrong results 
> --
>
> Key: DRILL-3537
> URL: https://issues.apache.org/jira/browse/DRILL-3537
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators, Storage - JSON
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Chun Chang
>Priority: Critical
> Fix For: 1.2.0
>
>
> In the directory, we have two files. One has some data and the other one is 
> empty. A query as below:
> {code}
> select * from dfs.`directory`;
> {code}
> will produce different results according to the order of the files being read 
> (The default order is in the alphabetic order of the filenames). To give a 
> more concrete example, the non-empty json has data:
> {code}
> {
>   a:1
> }
> {code}
> By naming the files, you can control the orders. If the empty file is read in 
> firstly, the result is
> {code}
> +---++
> |   *   | a  |
> +---++
> | null  | 1  |
> +---++
> {code}
> If the opposite order takes place, the result is
> {code}
> ++
> | a  |
> ++
> | 1  |
> | 2  |
> ++
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3679) IOB Exception : when window functions used in outer and inner query

2015-09-15 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-3679.
---
Resolution: Fixed

The symptom is different, but the root cause is the same as DRILL-3680's. Marking 
as a duplicate. 

> IOB Exception : when window functions used in outer and inner query
> ---
>
> Key: DRILL-3679
> URL: https://issues.apache.org/jira/browse/DRILL-3679
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
> Environment: private-branch 
> https://github.com/adeneche/incubator-drill/tree/new-window-funcs
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
>  Labels: window_function
> Fix For: 1.2.0
>
>
> IOB Exception seen when two different window functions are used in inner and 
> outer queries.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select rnum, position_id, ntile(4) over(order 
> by position_id) from (select position_id, row_number() over(order by 
> position_id) as rnum from cp.`employee.json`);
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
> Fragment 0:0
> [Error Id: 8e0cbf82-842d-4fa7-ab0d-1d982a3d6c24 on centos-03.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3787) Need a better error message

2015-09-15 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3787:
-

 Summary: Need a better error message
 Key: DRILL-3787
 URL: https://issues.apache.org/jira/browse/DRILL-3787
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 1.2.0
Reporter: Khurram Faraaz
Assignee: Hanifi Gunes
Priority: Minor


A predicate over a DATE-type column results in a SchemaChangeException. All values in 
the column are of type DATE. We need to fix the error message.

{code}
0: jdbc:drill:schema=dfs.tmp> SELECT col5 FROM `fewRowsAllData.parquet` WHERE 
col5 < 1968-06-06;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[castTIMESTAMP(INT-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 6fddaef0-ec51-4896-b12f-046c4a011995 on centos-04.qa.lab:31010] 
(state=,code=0)
{code}
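
As a side note (a sketch, assuming col5 is a DATE column): the unquoted literal is 
evaluated as integer arithmetic (1968 - 6 - 6 = 1956), which is what forces the 
failing castTIMESTAMP(INT); an explicit DATE literal avoids the ambiguity, though 
the error message itself still needs improving:

{code:sql}
-- Compare against a DATE literal instead of the integer expression 1968-06-06.
SELECT col5 FROM `fewRowsAllData.parquet` WHERE col5 < DATE '1968-06-06';
{code}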



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2015-09-15 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746556#comment-14746556
 ] 

Jinfeng Ni commented on DRILL-3786:
---

The stack trace shows the error is raised in the Sort operator. Assigning to Hakim. 

at 
org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.add(SortRecordBatchBuilder.java:106)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]

> Query with window function fails with IllegalFormatConversionException
> --
>
> Key: DRILL-3786
> URL: https://issues.apache.org/jira/browse/DRILL-3786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
> Attachments: drillbit.log.txt, query_profile.json
>
>
> Query fails with Runtime exception:
> {code:sql}
> SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
> ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IllegalFormatConversionException: d != java.lang.Character
> Fragment 1:0
> [Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2015-09-15 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3786:
--
Assignee: Deneche A. Hakim  (was: Jinfeng Ni)

> Query with window function fails with IllegalFormatConversionException
> --
>
> Key: DRILL-3786
> URL: https://issues.apache.org/jira/browse/DRILL-3786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
> Attachments: drillbit.log.txt, query_profile.json
>
>
> Query fails with Runtime exception:
> {code:sql}
> SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
> ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IllegalFormatConversionException: d != java.lang.Character
> Fragment 1:0
> [Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2015-09-15 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3786:
--
Component/s: (was: Query Planning & Optimization)
 Execution - Relational Operators

> Query with window function fails with IllegalFormatConversionException
> --
>
> Key: DRILL-3786
> URL: https://issues.apache.org/jira/browse/DRILL-3786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
> Attachments: drillbit.log.txt, query_profile.json
>
>
> Query fails with Runtime exception:
> {code:sql}
> SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
> ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IllegalFormatConversionException: d != java.lang.Character
> Fragment 1:0
> [Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2015-09-15 Thread Abhishek Girish (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-3786:
---
Attachment: query_profile.json
drillbit.log.txt

> Query with window function fails with IllegalFormatConversionException
> --
>
> Key: DRILL-3786
> URL: https://issues.apache.org/jira/browse/DRILL-3786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Jinfeng Ni
> Attachments: drillbit.log.txt, query_profile.json
>
>
> Query fails with Runtime exception:
> {code:sql}
> SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
> ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IllegalFormatConversionException: d != java.lang.Character
> Fragment 1:0
> [Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2015-09-15 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-3786:
--

 Summary: Query with window function fails with 
IllegalFormatConversionException
 Key: DRILL-3786
 URL: https://issues.apache.org/jira/browse/DRILL-3786
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.2.0
 Environment: 10 Performance Nodes
DRILL_MAX_DIRECT_MEMORY=100g
DRILL_INIT_HEAP="8g"
DRILL_MAX_HEAP="8g"
planner.memory.query_max_memory_per_node bumped up to 20 GB
TPC-DS SF 1000 dataset (Parquet)
Reporter: Abhishek Girish
Assignee: Jinfeng Ni


Query fails with Runtime exception:

{code:sql}
SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
IllegalFormatConversionException: d != java.lang.Character

Fragment 1:0

[Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3737) CTAS from empty text file fails with NPE

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-3737.
-
Assignee: Chun Chang  (was: Sean Hsuan-Yi Chu)

Duplicate of DRILL-3539.

> CTAS from empty text file fails with NPE
> 
>
> Key: DRILL-3737
> URL: https://issues.apache.org/jira/browse/DRILL-3737
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Chun Chang
>Priority: Critical
> Fix For: 1.2.0
>
>
> {code}
> create table a(aa) as select columns[0] from `empty.csv`;
> {code}
> shows:
> Error: SYSTEM ERROR: NullPointerException
> Fragment 0:0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-09-15 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-3765.
---
Resolution: Won't Fix

I looked at the Calcite trace file. It seems the multiple rule firings are what 
Calcite's rule framework is supposed to do.

The multiple rule firings are actually against different sets of RelNodes, which 
are the result of other rules firing. For instance, one scan could be the one 
before project push-down, while another could be the one after project push-down. 
In other words, there is no redundant rule firing against the same set of 
RelNodes; the multiple rule firings in this case appear to be the way it should be.

If we stopped subsequent rule firings after the first one, the query planner could 
lose the chance to find the ultimate best plan.

Therefore, I'm going to mark this as "won't fix".

> Partition prune rule is unnecessary fired multiple times. 
> --
>
> Key: DRILL-3765
> URL: https://issues.apache.org/jira/browse/DRILL-3765
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
>
> It seems that the partition prune rule may be fired multiple times, even 
> after the first rule execution has pushed the filter into the scan operator. 
> Since partition pruning has to build vectors to hold the partition / file / 
> directory information, invoking the partition prune rule unnecessarily may 
> lead to a big memory overhead.
> The Drill planner should avoid unnecessary partition prune rule invocations, in 
> order to reduce the chance of hitting an OOM exception while the partition 
> prune rule is executed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3121) Hive partition pruning is not happening

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang updated DRILL-3121:
--
Assignee: Rahul Challapalli  (was: Mehant Baid)

> Hive partition pruning is not happening
> ---
>
> Key: DRILL-3121
> URL: https://issues.apache.org/jira/browse/DRILL-3121
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Affects Versions: 1.0.0
>Reporter: Hao Zhu
>Assignee: Rahul Challapalli
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: DRILL-3121.patch
>
>
> Tested on 1.0.0 with below commit id, and hive 0.13.
> {code}
> >  select * from sys.version;
> +---+++--++
> | commit_id |   
> commit_message   |commit_time | 
> build_email  | build_time |
> +---+++--++
> | d8b19759657698581cc0d01d7038797952888123  | DRILL-3100: 
> TestImpersonationDisabledWithMiniDFS fails on Windows  | 15.05.2015 @ 
> 01:18:03 EDT  | Unknown  | 15.05.2015 @ 03:07:10 EDT  |
> +---+++--++
> 1 row selected (0.083 seconds)
> {code}
> How to reproduce:
> 1. Use hive to create below partition table:
> {code}
> CREATE TABLE partition_table(id INT, username string)
>  PARTITIONED BY(year STRING, month STRING)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
> insert into table partition_table PARTITION(year='2014',month='11') select 
> 1,'u' from passwords limit 1;
> insert into table partition_table PARTITION(year='2014',month='12') select 
> 2,'s' from passwords limit 1;
> insert into table partition_table PARTITION(year='2015',month='01') select 
> 3,'e' from passwords limit 1;
> insert into table partition_table PARTITION(year='2015',month='02') select 
> 4,'r' from passwords limit 1;
> insert into table partition_table PARTITION(year='2015',month='03') select 
> 5,'n' from passwords limit 1;
> {code}
> 2. Hive query can do partition pruning for below 2 queries:
> {code}
> hive>  explain EXTENDED select * from partition_table where year='2015' and 
> month in ( '02','03') ;
> partition values:
>   month 02
>   year 2015
> partition values:
>   month 03
>   year 2015  
> explain EXTENDED select * from partition_table where year='2015' and (month 
> >= '02' and month <= '03') ;
> partition values:
>   month 02
>   year 2015
> partition values:
>   month 03
>   year 2015
> {code}
> Hive only scans 2 partitions -- 2015/02 and 2015/03.
> 3. Drill can not do partition pruning for below 2 queries:
> {code}
> > explain plan for select * from hive.partition_table where `year`='2015' and 
> > `month` in ('02','03');
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(id=[$0], username=[$1], year=[$2], month=[$3])
> 00-02SelectionVectorRemover
> 00-03  Filter(condition=[AND(=($2, '2015'), OR(=($3, '02'), =($3, 
> '03')))])
> 00-04Scan(groupscan=[HiveScan [table=Table(dbName:default, 
> tableName:partition_table), 
> inputSplits=[maprfs:/user/hive/warehouse/partition_table/year=2015/month=01/00_0:0+4,
>  maprfs:/user/hive/warehouse/partition_table/year=2015/month=02/00_0:0+4, 
> maprfs:/user/hive/warehouse/partition_table/year=2015/month=03/00_0:0+4], 
> columns=[`*`], partitions= [Partition(values:[2015, 01]), 
> Partition(values:[2015, 02]), Partition(values:[2015, 03])]]])
> > explain plan for select * from hive.partition_table where `year`='2015' and 
> > (`month` >= '02' and `month` <= '03' );
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(id=[$0], username=[$1], year=[$2], month=[$3])
> 00-02SelectionVectorRemover
> 00-03  Filter(condition=[AND(=($2, '2015'), >=($3, '02'), <=($3, 
> '03'))])
> 00-04Scan(groupscan=[HiveScan [table=Table(dbName:default, 
> tableName:partition_table), 
> inputSplits=[maprfs:/user/hive/warehouse/partition_table/year=2015/month=01/00_0:0+4,
>  maprfs:/user/hive/warehouse/partition_table/year=2015/month=02/00_0:0+4, 
> maprfs:/user/hive/warehouse/partition_table/year=2015/month

[jira] [Closed] (DRILL-2625) org.apache.drill.common.StackTrace should follow standard stacktrace format

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-2625.
-
Assignee: Chun Chang  (was: Chris Westin)

verified fix.

{noformat}
[Error Id: ed3fbb63-6468-4f59-b312-7226291d8727 ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
 ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.isHomogeneous(WorkspaceSchemaFactory.java:389)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.dropTable(WorkspaceSchemaFactory.java:430)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DropTableHandler.getPlan(DropTableHandler.java:72)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:903) 
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:242) 
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
... 3 common frames omitted
{noformat}

> org.apache.drill.common.StackTrace should follow standard stacktrace format
> ---
>
> Key: DRILL-2625
> URL: https://issues.apache.org/jira/browse/DRILL-2625
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Daniel Barclay (Drill)
>Assignee: Chun Chang
> Fix For: 1.2.0
>
>
> org.apache.drill.common.StackTrace uses a different textual format than JDK's 
> standard format for stack traces.
> It should probably use the standard format so that its stack trace output can 
> be used by tools that already can parse the standard format to provide 
> functionality such as displaying the corresponding source.
> (After correcting for DRILL-2624, StackTrace formats stack traces like this:
> org.apache.drill.common.StackTrace.:1
> org.apache.drill.exec.server.Drillbit.run:20
> org.apache.drill.jdbc.DrillConnectionImpl.:232
> The normal form is like this:
> {noformat}
>   at 
> org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:162)
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:75)
>   at com.google.common.io.Closeables.close(Closeables.java:77)
> {noformat}
> )



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2625) org.apache.drill.common.StackTrace should follow standard stacktrace format

2015-09-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-2625:
--
Description: 
org.apache.drill.common.StackTrace uses a different textual format than JDK's 
standard format for stack traces.

It should probably use the standard format so that its stack trace output can 
be used by tools that already can parse the standard format to provide 
functionality such as displaying the corresponding source.

(After correcting for DRILL-2624, StackTrace formats stack traces like this:

org.apache.drill.common.StackTrace.:1
org.apache.drill.exec.server.Drillbit.run:20
org.apache.drill.jdbc.DrillConnectionImpl.:232

The normal form is like this:
{noformat}
at 
org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:162)
at 
org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:75)
at com.google.common.io.Closeables.close(Closeables.java:77)
{noformat}
)



  was:
org.apache.drill.common.StackTrace uses a different textual format than JDK's 
standard format for stack traces.

It should probably use the standard format so that its stack trace output can 
be used by tools that already can parse the standard format to provide 
functionality such as displaying the corresponding source.

(After correcting for DRILL-2624, StackTrace formats stack traces like this:

org.apache.drill.common.StackTrace.:1
org.apache.drill.exec.server.Drillbit.run:20
org.apache.drill.jdbc.DrillConnectionImpl.:232

The normal form is like this:
at 
org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:162)
at 
org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:75)
at com.google.common.io.Closeables.close(Closeables.java:77)
)




> org.apache.drill.common.StackTrace should follow standard stacktrace format
> ---
>
> Key: DRILL-2625
> URL: https://issues.apache.org/jira/browse/DRILL-2625
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Daniel Barclay (Drill)
>Assignee: Chris Westin
> Fix For: 1.2.0
>
>
> org.apache.drill.common.StackTrace uses a different textual format than JDK's 
> standard format for stack traces.
> It should probably use the standard format so that its stack trace output can 
> be used by tools that already can parse the standard format to provide 
> functionality such as displaying the corresponding source.
> (After correcting for DRILL-2624, StackTrace formats stack traces like this:
> org.apache.drill.common.StackTrace.:1
> org.apache.drill.exec.server.Drillbit.run:20
> org.apache.drill.jdbc.DrillConnectionImpl.:232
> The normal form is like this:
> {noformat}
>   at 
> org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:162)
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:75)
>   at com.google.common.io.Closeables.close(Closeables.java:77)
> {noformat}
> )



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-1065) Provide a reset command to reset an option to its default value

2015-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746489#comment-14746489
 ] 

ASF GitHub Bot commented on DRILL-1065:
---

GitHub user sudheeshkatkam opened a pull request:

https://github.com/apache/drill/pull/159

DRILL-1065: Support for ALTER ... RESET statement

+ Support for "SET option = value" statement (assumes scope as SESSION)
+ Better error messages in SetOptionHandler
+ Changes in CompoundIdentifierConverter
  - update when rewritten operand is not deeply equal to the original 
operand
  - Added Override annotations
+ Default ExecutionControls option value should be at SYSTEM level

@jacques-n @vkorukanti please review.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sudheeshkatkam/drill DRILL-1065

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/159.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #159


commit f785c3ba4d50bfe5a7ff77a8d37b3589bbe1c1cf
Author: Sudheesh Katkam 
Date:   2015-09-15T23:13:23Z

DRILL-1065: Support for ALTER ... RESET statement

+ Support for "SET option = value" statement (assumes scope as SESSION)
+ Better error messages in SetOptionHandler
+ Changes in CompoundIdentifierConverter
  - update when rewritten operand is not deeply equal to the original 
operand
  - Added Override annotations
+ Default ExecutionControls option value should be at SYSTEM level
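
As a rough usage sketch of the statements described above (the option name is chosen 
only for illustration; the exact grammar is whatever this pull request implements):

{code}
-- set an option; scope is assumed to default to SESSION
ALTER SESSION SET `planner.slice_target` = 10;
-- reset the option back to its default value
ALTER SESSION RESET `planner.slice_target`;
{code}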




> Provide a reset command to reset an option to its default value
> ---
>
> Key: DRILL-1065
> URL: https://issues.apache.org/jira/browse/DRILL-1065
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Reporter: Aman Sinha
>Assignee: Sudheesh Katkam
>Priority: Minor
> Fix For: 1.2.0
>
>
> Within a session, we currently set configuration options, and it would be very 
> useful to have a 'reset' command to reset the value of an option to its 
> default system value: 
>   ALTER SESSION RESET <option>
> If we don't want to add a new keyword for RESET, we could potentially 
> overload the SET command and allow the user to set an option to the 'default' value.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3030) Foreman hangs trying to cancel non-root fragments

2015-09-15 Thread Sudheesh Katkam (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746488#comment-14746488
 ] 

Sudheesh Katkam commented on DRILL-3030:


I don't think this is a duplicate of 
[DRILL-3448|https://issues.apache.org/jira/browse/DRILL-3448]. The threadstack 
is different; however, they might be related.

> Foreman hangs trying to cancel non-root fragments
> -
>
> Key: DRILL-3030
> URL: https://issues.apache.org/jira/browse/DRILL-3030
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.0.0
>Reporter: Ramana Inukonda Nagaraj
>Assignee: Chun Chang
> Fix For: 1.2.0
>
> Attachments: threadstack
>
>
> Steps to repro:
> 1. Ran a long-running query on a clean Drill restart. 
> 2. Killed a non-foreman node. 
> 3. Restarted the drillbits using clush.
> One of the drillbits (coincidentally always a foreman node) refused to 
> shut down. 
> Jstack shows that the foreman is waiting: 
> {code}
>   at 
> org.apache.drill.exec.rpc.ReconnectingConnection$ConnectionListeningFuture.waitAndRun(ReconnectingConnection.java:105)
> at 
> org.apache.drill.exec.rpc.ReconnectingConnection.runCommand(ReconnectingConnection.java:81)
> - locked <0x00073878aaa8> (a 
> org.apache.drill.exec.rpc.control.ControlConnectionManager)
> at 
> org.apache.drill.exec.rpc.control.ControlTunnel.cancelFragment(ControlTunnel.java:57)
> at 
> org.apache.drill.exec.work.foreman.QueryManager.cancelExecutingFragments(QueryManager.java:192)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:824)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:768)
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:770)
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:871)
> at 
> org.apache.drill.exec.work.foreman.Foreman.access$2700(Foreman.java:107)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateListener.moveToState(Foreman.java:1132)
> at 
> org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:460)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3030) Foreman hangs trying to cancel non-root fragments

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-3030.
-
Assignee: Chun Chang  (was: Sudheesh Katkam)

Dup of DRILL-3448.

> Foreman hangs trying to cancel non-root fragments
> -
>
> Key: DRILL-3030
> URL: https://issues.apache.org/jira/browse/DRILL-3030
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.0.0
>Reporter: Ramana Inukonda Nagaraj
>Assignee: Chun Chang
> Fix For: 1.2.0
>
> Attachments: threadstack
>
>
> Steps to repro:
> 1. Ran a long-running query on a clean Drill restart. 
> 2. Killed a non-foreman node. 
> 3. Restarted the drillbits using clush.
> One of the drillbits (coincidentally always a foreman node) refused to 
> shut down. 
> Jstack shows that the foreman is waiting: 
> {code}
>   at 
> org.apache.drill.exec.rpc.ReconnectingConnection$ConnectionListeningFuture.waitAndRun(ReconnectingConnection.java:105)
> at 
> org.apache.drill.exec.rpc.ReconnectingConnection.runCommand(ReconnectingConnection.java:81)
> - locked <0x00073878aaa8> (a 
> org.apache.drill.exec.rpc.control.ControlConnectionManager)
> at 
> org.apache.drill.exec.rpc.control.ControlTunnel.cancelFragment(ControlTunnel.java:57)
> at 
> org.apache.drill.exec.work.foreman.QueryManager.cancelExecutingFragments(QueryManager.java:192)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:824)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:768)
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:770)
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:871)
> at 
> org.apache.drill.exec.work.foreman.Foreman.access$2700(Foreman.java:107)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateListener.moveToState(Foreman.java:1132)
> at 
> org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:460)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3238) Cannot Plan Exception is raised when the same window partition is defined in select & window clauses

2015-09-15 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-3238:

Fix Version/s: (was: Future)
   1.2.0

> Cannot Plan Exception is raised when the same window partition is defined in 
> select & window clauses
> 
>
> Key: DRILL-3238
> URL: https://issues.apache.org/jira/browse/DRILL-3238
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
>  Labels: window_function
> Fix For: 1.2.0
>
>
> While this works:
> {code}
> select sum(a2) over(partition by a2 order by a2), count(*) over(partition by 
> a2 order by a2) 
> from t
> {code}
> , this fails
> {code}
> select sum(a2) over(w), count(*) over(partition by a2 order by a2) 
> from t
> window w as (partition by a2 order by a2)
> {code}
> Notice these two queries are logically the same thing if we plug the 
> window definition back into the SELECT clause of the 2nd query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-2650) Cancelled queries json profile shows query end time occurs before fragments end time

2015-09-15 Thread Krystal (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krystal closed DRILL-2650.
--

git.commit.id.abbrev=0c1b293

Verified the durations of the individual fragments are within the duration of 
the cancelled query itself.

> Cancelled queries json profile shows query end time occurs before fragments 
> end time 
> -
>
> Key: DRILL-2650
> URL: https://issues.apache.org/jira/browse/DRILL-2650
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 0.9.0
>Reporter: Krystal
>Assignee: Sudheesh Katkam
> Fix For: 1.2.0
>
> Attachments: DRILL-2650.1.patch.txt
>
>
> I have a query that was successfully cancelled.  The query start and end time 
> is as follows:
> "type": 1,
> "start": 1427839192049,
> "end": 1427839194966,
> This translates to a query duration of about 3 seconds.  However, the 
> durations of the query's fragments are much longer, up to more than 6 seconds.  
> Here is an entry for majorFragmentId=0 with a duration of 6.6 seconds:
>  "startTime": 1427839192796,
>  "endTime": 1427839199408,
> 8 out of 11 major fragments have a duration greater than the query itself.  To 
> an end user, this is confusing and does not make sense.  We should wait for 
> all of the major fragments to be completely cancelled before updating the 
> "end" time of the query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3393) Quotes not being recognized in tab delimited (tsv) files

2015-09-15 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-3393.
--
Resolution: Fixed

Resolved in commit#: 48bc0b9a8916af7191b0a99351c27fd5b69786c3

> Quotes not being recognized in tab delimited (tsv) files
> 
>
> Key: DRILL-3393
> URL: https://issues.apache.org/jira/browse/DRILL-3393
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.0.0
>Reporter: Chi Lang
>Assignee: Steven Phillips
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: DRILL-3393.patch, fail.tsv
>
>
> Drill doesn't seem to recognise quotes in tsv, while working fine for csv 
> files.
> For example, given the following files
> test.tsv
> ---
> foobarbar
> "aa"  "bc"
> ---
> test.csv
> --
> foobar,bar
> "aa","bc"
> --
> I get these results:
> 0: jdbc:drill:zk=local> select columns[0], columns[1] from dfs.`test.csv`;
> +-+-+
> | EXPR$0  | EXPR$1  |
> +-+-+
> | foobar  | bar |
> | aa  | bc  |
> +-+-+
> 2 rows selected (0.259 seconds)
> 0: jdbc:drill:zk=local> select columns[0], columns[1] from dfs.`test.tsv`;
> +--+-+
> |  EXPR$0  | EXPR$1  |
> +--+-+
> | foobar   | bar |
> | aa" "bc  | null|
> +--+-+
> 2 rows selected (0.122 seconds)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3448) typo in QueryManager.DrillbitStatusListener will cause the Foreman to hang if a Drillbit dies

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang updated DRILL-3448:
--
Assignee: Khurram Faraaz  (was: Sudheesh Katkam)

> typo in QueryManager.DrillbitStatusListener will cause the Foreman to hang if 
> a Drillbit dies
> -
>
> Key: DRILL-3448
> URL: https://issues.apache.org/jira/browse/DRILL-3448
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.0.0
>Reporter: Deneche A. Hakim
>Assignee: Khurram Faraaz
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: DRILL-3448.1.patch.txt
>
>
> at the end of DrillbitStatusListener.drillbitUnregistered() there is an if block:
> {code}
>   if (!atLeastOneFailure) {
> logger.warn("...");
> stateListener.moveToState(QueryState.FAILED,
> new ForemanException(...));
>   }
> {code}
> this will basically fail the query if the drillbit DIDN'T contain any 
> fragment for this Foreman, which in fact should be the inverse.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3718) quotes in .tsv trigger exception

2015-09-15 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746395#comment-14746395
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3718:
--

Resolved in commit#: 48bc0b9a8916af7191b0a99351c27fd5b69786c3

> quotes in .tsv trigger exception 
> -
>
> Key: DRILL-3718
> URL: https://issues.apache.org/jira/browse/DRILL-3718
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
>
> Given a simple tsv file as below
> {code}
> "a"   a
> a a
> a
> {code}
> After getting the first quote, the TextReader would just keep going down the 
> entire file, as opposed to stopping at the second quote.
> This will trigger an exception
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> TextParsingException: Error processing input: Cannot use newline character 
> within quoted string, line=2, char=12. Content parsed: [ ]
> Fragment 0:0
> {code}
> which complains about a newline appearing within the quoted string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3785) AssertionError in FindPartitionConditions.java

2015-09-15 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli updated DRILL-3785:
-
Attachment: error (4).log

> AssertionError in FindPartitionConditions.java
> --
>
> Key: DRILL-3785
> URL: https://issues.apache.org/jira/browse/DRILL-3785
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Rahul Challapalli
>Assignee: Jinfeng Ni
> Attachments: error (4).log
>
>
> git.commit.id.abbrev=240a455
> I am running the drillbits with assertions enabled and the query below is 
> producing an AssertionError:
> {code}
> explain plan for 
> select
>   l_returnflag,
>   l_linestatus,
>   sum(l_quantity) as sum_qty,
>   sum(l_extendedprice) as sum_base_price,
>   sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
>   sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
>   avg(l_quantity) as avg_qty,
>   avg(l_extendedprice) as avg_price,
>   avg(l_discount) as avg_disc,
>   count(*) as count_order
> from
>   `lineitem`
> where
>   dir0=2006 or dir0=2007 and
>   l_shipdate <= date '1998-12-01' - interval '120' day (3)
> group by
>   l_returnflag,
>   l_linestatus
> order by
>   l_returnflag,
>   l_linestatus;
> {code}
> I tried running the same query immediately again, but the error did not show 
> up.
> The data set used is TPC-H SF100, where the lineitem data set is split into 
> 200K parquet files and stored using the structure below
> Data Set Structure :
> {code}
> lineitem
>   -- 2006 (year)
> -- 1 (month)
>   -- 1 (day)
>   ..
>   ..
>   -- 31(day)
> ..
> -- 12 (month)
>   ..
>   ..
>   2015(year)
> {code}
> Log file is attached



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3785) AssertionError in FindPartitionConditions.java

2015-09-15 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-3785:


 Summary: AssertionError in FindPartitionConditions.java
 Key: DRILL-3785
 URL: https://issues.apache.org/jira/browse/DRILL-3785
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Rahul Challapalli
Assignee: Jinfeng Ni


git.commit.id.abbrev=240a455

I am running the drillbits with assertions enabled and the query below is 
producing an AssertionError:
{code}
explain plan for 
select
  l_returnflag,
  l_linestatus,
  sum(l_quantity) as sum_qty,
  sum(l_extendedprice) as sum_base_price,
  sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
  sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
  avg(l_quantity) as avg_qty,
  avg(l_extendedprice) as avg_price,
  avg(l_discount) as avg_disc,
  count(*) as count_order
from
  `lineitem`
where
  dir0=2006 or dir0=2007 and
  l_shipdate <= date '1998-12-01' - interval '120' day (3)
group by
  l_returnflag,
  l_linestatus
order by
  l_returnflag,
  l_linestatus;
{code}

I tried running the same query immediately again, but the error did not show up.

The data set used is TPC-H SF100, where the lineitem data set is split into 200K 
parquet files and stored using the structure below

Data Set Structure :
{code}
lineitem
  -- 2006 (year)
-- 1 (month)
  -- 1 (day)
  ..
  ..
  -- 31(day)
..
-- 12 (month)
  ..
  ..
  2015(year)
{code}

Log file is attached



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3238) Cannot Plan Exception is raised when the same window partition is defined in select & window clauses

2015-09-15 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-3238.
--
Resolution: Fixed

> Cannot Plan Exception is raised when the same window partition is defined in 
> select & window clauses
> 
>
> Key: DRILL-3238
> URL: https://issues.apache.org/jira/browse/DRILL-3238
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
>  Labels: window_function
> Fix For: Future
>
>
> While this works:
> {code}
> select sum(a2) over(partition by a2 order by a2), count(*) over(partition by 
> a2 order by a2) 
> from t
> {code}
> , this fails
> {code}
> select sum(a2) over(w), count(*) over(partition by a2 order by a2) 
> from t
> window w as (partition by a2 order by a2)
> {code}
> Notice these two queries are logically the same thing if we plug the 
> window definition back into the SELECT clause of the 2nd query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3497) Throw UserException#validationError for errors when modifying options

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-3497.
-
Assignee: Chun Chang  (was: Sudheesh Katkam)

verified fix.

{code}
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> alter session set 
`planner.slice_target` = 10;
+---++
|  ok   |summary |
+---++
| true  | planner.slice_target updated.  |
+---++
1 row selected (0.296 seconds)
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> alter session set 
`planner.slice_targetx` = 10;
Error: VALIDATION ERROR: Unknown option: planner.slice_targetx


[Error Id: 6c06e182-d398-46bf-8450-736cb5eb3b03 on 10.10.30.168:31010] 
(state=,code=0)
{code}

> Throw UserException#validationError for errors when modifying options
> -
>
> Key: DRILL-3497
> URL: https://issues.apache.org/jira/browse/DRILL-3497
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Sudheesh Katkam
>Assignee: Chun Chang
>Priority: Minor
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3455) If a drillbit, that contains fragments for the current query, dies the QueryManager will fail the query even if those fragments already finished successfully

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang updated DRILL-3455:
--
Assignee: Khurram Faraaz  (was: Sudheesh Katkam)

> If a drillbit, that contains fragments for the current query, dies the 
> QueryManager will fail the query even if those fragments already finished 
> successfully
> -
>
> Key: DRILL-3455
> URL: https://issues.apache.org/jira/browse/DRILL-3455
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Deneche A. Hakim
>Assignee: Khurram Faraaz
> Fix For: 1.2.0
>
> Attachments: DRILL-3455.1.patch.txt, DRILL-3455.2.patch.txt
>
>
> Once DRILL-3448 is fixed we need to update 
> QueryManager.DrillbitStatusListener so that the query is failed only when a 
> fragment is still running on the dead node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3497) Throw UserException#validationError for errors when modifying options

2015-09-15 Thread Sudheesh Katkam (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746090#comment-14746090
 ] 

Sudheesh Katkam commented on DRILL-3497:


Invalid parameters to "ALTER ... SET ..." command should throw exceptions with 
specific type (e.g. VALIDATION_ERROR) instead of SYSTEM_ERROR.

> Throw UserException#validationError for errors when modifying options
> -
>
> Key: DRILL-3497
> URL: https://issues.apache.org/jira/browse/DRILL-3497
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Sudheesh Katkam
>Assignee: Sudheesh Katkam
>Priority: Minor
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3180) Apache Drill JDBC storage plugin to query rdbms systems such as MySQL and Netezza from Apache Drill

2015-09-15 Thread Magnus Pierre (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746059#comment-14746059
 ] 

Magnus Pierre commented on DRILL-3180:
--

If I remember correctly, JDBC drivers are able to expose both internal and 
user-defined functions to the application using them. Does Drill have a function 
"store" to validate whether a function is valid and should/could be pushed down 
or not?

> Apache Drill JDBC storage plugin to query rdbms systems such as MySQL and 
> Netezza from Apache Drill
> ---
>
> Key: DRILL-3180
> URL: https://issues.apache.org/jira/browse/DRILL-3180
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.0.0
>Reporter: Magnus Pierre
>Assignee: Jacques Nadeau
>  Labels: Drill, JDBC, plugin
> Fix For: 1.2.0
>
> Attachments: patch.diff, pom.xml, storage-mpjdbc.zip
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> I have developed the base code for a JDBC storage-plugin for Apache Drill. 
> The code is primitive but constitutes a good starting point for further 
> coding. Today it provides primitive support for SELECT against RDBMS with 
> JDBC. 
> The goal is to provide complete SELECT support against RDBMS with push down 
> capabilities.
> Currently the code is using standard JDBC classes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (DRILL-2332) Drill should be consistent with Implicit casting rules across data formats

2015-09-15 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman reopened DRILL-2332:
-

Sean,

The fix for DRILL-2313, I believe, makes this case work now:

t1 is a parquet file:
message root {
  optional int32 a1;
  optional binary b1 (UTF8);
  optional int32 c1 (DATE);
}

{code}
0: jdbc:drill:schema=dfs> select * from t1 where c1 in ('2015-01-01');
+-++-+
| a1  |   b1   | c1  |
+-++-+
| 1   | a  | 2015-01-01  |
+-++-+
1 row selected (0.191 seconds)

0: jdbc:drill:schema=dfs> create view v1(a1, b1, c1) as select cast(a1 as int), 
cast(b1 as varchar(20)), cast(c1 as date) from t1;
+---++
|  ok   |  summary   |
+---++
| true  | View 'v1' created successfully in 'dfs.subqueries' schema  |
+---++
1 row selected (0.226 seconds)

0: jdbc:drill:schema=dfs> describe v1;
+--++--+
| COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
+--++--+
| a1   | INTEGER| YES  |
| b1   | CHARACTER VARYING  | YES  |
| c1   | DATE   | YES  |
+--++--+
3 rows selected (0.334 seconds)

0: jdbc:drill:schema=dfs> select * from v1 where c1 in ('2015-01-01');
+-++-+
| a1  |   b1   | c1  |
+-++-+
| 1   | a  | 2015-01-01  |
+-++-+
1 row selected (0.241 seconds)
{code}

We should resolve it as a duplicate of DRILL-2313 (if you agree) and 
extensively test implicit cast. This is a pretty significant change in Drill's 
behavior.

> Drill should be consistent with Implicit casting rules across data formats
> --
>
> Key: DRILL-2332
> URL: https://issues.apache.org/jira/browse/DRILL-2332
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Reporter: Abhishek Girish
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
>
> Currently, the outcome of a query with a filter on a column comparing it with 
> a literal depends on the underlying data format. 
> *Parquet*
> {code:sql}
> select * from date_dim where d_month_seq ='1193' limit 1;
> [Succeeds]
> select * from date_dim where d_date in ('1999-06-30') limit 1;
> [Succeeds]
> {code}
> *View on top of text:*
> {code:sql}
> select * from date_dim where d_date in ('1999-06-30') limit 1;
> Query failed: SqlValidatorException: Values passed to IN operator must have 
> compatible types
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> select * from date_dim where d_month_seq ='1193' limit 1;
> Query failed: SqlValidatorException: Cannot apply '=' to arguments of type 
> ' = '. Supported form(s): ' = 
> '
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> I understand that in the case of a View on Text, SQL validation fails at the 
> Optiq layer. 
> But from the perspective of an end user, Drill's behavior must be consistent 
> across data formats. Also, a view by definition should abstract out 
> this information.
> Here, both the view and parquet were created with type information. 
> *Parquet-meta*
> {code}
> parquet-schema /mapr/abhi311/data/parquet/tpcds/scale1/date_dim/0_0_0.parquet 
> message root {
>   optional int32 d_date_sk;
>   optional binary d_date_id (UTF8);
>   optional binary d_date (UTF8);
>   optional int32 d_month_seq;
>   optional int32 d_week_seq;
>   optional int32 d_quarter_seq;
>   optional int32 d_year;
>   optional int32 d_dow;
>   optional int32 d_moy;
>   optional int32 d_dom;
>   optional int32 d_qoy;
>   optional int32 d_fy_year;
>   optional int32 d_fy_quarter_seq;
>   optional int32 s_fy_week_seq;
>   optional binary d_day_name (UTF8);
>   optional binary d_quarter_name (UTF8);
>   optional binary d_holiday (UTF8);
>   optional binary d_weekend (UTF8);
>   optional binary d_following_holiday (UTF8);
>   optional int32 d_first_dom;
>   optional int32 d_last_dom;
>   optional int32 d_same_day_ly;
>   optional int32 d_same_day_lq;
>   optional binary d_current_day (UTF8);
>   optional binary d_current_week (UTF8);
>   optional binary d_current_month (UTF8);
>   optional binary d_current_quarter (UTF8);
>   optional binary d_current_year (UTF8);
> }
> {code}
> *Describe View*
> {code:sql}
> > describe date_dim;
> +-++-+
> | COLUM

[jira] [Commented] (DRILL-3180) Apache Drill JDBC storage plugin to query rdbms systems such as MySQL and Netezza from Apache Drill

2015-09-15 Thread Magnus Pierre (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746044#comment-14746044
 ] 

Magnus Pierre commented on DRILL-3180:
--

Hello Jacques,
Sounds good. :)
Re the question:
I humbly disagree with the last sentence in your response. Having spent almost 15 
years in Enterprise Data Warehousing, some of the most common queries I came 
across, or wrote myself, were queries that dealt with time, and filtering 
conditions as part of the join clause were quite common.

Consider joining n tables, i.e. a much bigger query than the one expressed, 
where you have history on most of the tables you join with; it is common to put 
the filter condition as part of the join since:

1) It makes the query more clearly expressed and readable, because the filter 
conditions sit together with the join condition (most often a left outer join 
where the filter is applied to the right-hand table). 

2) It makes the query easier to maintain for the simple reason that you can 
comment out a block of code without touching multiple places.

3) Legacy SQL could be supported, provided we support filters as part of 
the join clause:
For some DB engines (Teradata, to mention one), it is common to put the filter 
in the join since it is more likely that the optimizer will be able to apply the 
filter before the join, at least on the ancient releases I worked with (even 
though it should not matter from a query optimization perspective).
By the same token, derived tables are commonly used in some databases (TD as an 
example again) to ensure that a certain condition is processed before the join: 
Example: SELECT * from customer c inner join ( select s0.x,s0.y, s0.z from 
table_1 s0 where  s0.z < 100) as t1 on  c.cust_id = t1.x

This is basically trying to circumvent certain limitations of query rewrite by 
explicitly expressing the processing order, knowing that a large query is hard 
to untangle for most optimizers.

It should not matter for a mature optimizer, but for some it does.
So to conclude: it is important to support both cases since for some engines it 
will make a difference in efficiency and processing order.
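
To illustrate the left-outer-join point above (table and column names here are 
hypothetical, chosen only for the sketch):

{code}
-- Filter on the right-hand table expressed in the join clause:
-- unmatched customers are kept, with NULLs for the history columns.
-- Putting the same predicate in the WHERE clause instead would discard
-- those customers, effectively turning the outer join into an inner join.
SELECT c.cust_id, h.balance
FROM customer c
LEFT OUTER JOIN account_history h
  ON c.cust_id = h.cust_id
 AND h.snapshot_date = DATE '2015-01-01';
{code}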
 



> Apache Drill JDBC storage plugin to query rdbms systems such as MySQL and 
> Netezza from Apache Drill
> ---
>
> Key: DRILL-3180
> URL: https://issues.apache.org/jira/browse/DRILL-3180
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.0.0
>Reporter: Magnus Pierre
>Assignee: Jacques Nadeau
>  Labels: Drill, JDBC, plugin
> Fix For: 1.2.0
>
> Attachments: patch.diff, pom.xml, storage-mpjdbc.zip
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> I have developed the base code for a JDBC storage-plugin for Apache Drill. 
> The code is primitive but constitutes a good starting point for further 
> coding. Today it provides primitive support for SELECT against RDBMS with 
> JDBC. 
> The goal is to provide complete SELECT support against RDBMS with push down 
> capabilities.
> Currently the code is using standard JDBC classes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2313) Query fails when one of the operands is a DATE literal without an explicit cast

2015-09-15 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746026#comment-14746026
 ] 

Victoria Markman commented on DRILL-2313:
-

I answered question #2 myself:

{code}
0: jdbc:drill:schema=dfs> select * from t1 where a1 between 1 and cast('11' 
as int);
+-++-+
| a1  |   b1   | c1  |
+-++-+
| 1   | a  | 2015-01-01  |
| 2   | b  | 2015-01-02  |
| 3   | c  | 2015-01-03  |
| 4   | null   | 2015-01-04  |
| 5   | e  | 2015-01-05  |
| 6   | f  | 2015-01-06  |
| 7   | g  | 2015-01-07  |
| 9   | i  | null|
| 10  | j  | 2015-01-10  |
+-++-+
9 rows selected (0.215 seconds)
{code}

> Query fails when one of the operands is a DATE literal without an explicit 
> cast
> ---
>
> Key: DRILL-2313
> URL: https://issues.apache.org/jira/browse/DRILL-2313
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Abhishek Girish
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
>
> For operations involving the date datatype, when one of the operands is a 
> DATE literal without a cast, the query fails. 
> *The following query fails to validate:*
> {code:sql}
> SELECT
>  *
> FROM 
>  date_dim
>  
> WHERE d_date BETWEEN '2002-3-01' AND cast('2002-3-01' AS DATE) 
> LIMIT 1;
> {code}
> Query failed: SqlValidatorException: Cannot apply 'BETWEEN' to arguments of 
> type ' BETWEEN  AND '. Supported form(s): 
> ' BETWEEN  AND '
> *The following query executes fine:*
> {code:sql}
> SELECT
>  *
> FROM 
>  date_dim
>  
> WHERE d_date BETWEEN '2002-3-01' AND 
>   '2002-3-01'
> LIMIT 1;
> {code}
> Both queries execute fine on Postgres.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2313) Query fails when one of the operands is a DATE literal without an explicit cast

2015-09-15 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746017#comment-14746017
 ] 

Victoria Markman commented on DRILL-2313:
-

[~seanhychu],

(1) Is this expected behavior?

{code}
0: jdbc:drill:schema=dfs> select * from t1 where c1 between cast('2015-01-01' 
as date) and '2015-01-03 xxx';
+-++-+
| a1  |   b1   | c1  |
+-++-+
| 1   | a  | 2015-01-01  |
| 2   | b  | 2015-01-02  |
| 3   | c  | 2015-01-03  |
+-++-+
3 rows selected (0.222 seconds)
{code}

Postgres, for example, only tolerates trailing spaces:
{code}
postgres=# select * from t1 where c1 between cast('2015-01-01' as date) and 
'2015-01-03 xx';
'ERROR:  invalid input syntax for type date: "2015-01-03 xx"
LINE 1: ...1 where c1 between cast('2015-01-01' as date) and '2015-01-0...
 ^
postgres=# select * from t1 where c1 between cast('2015-01-01' as date) and 
'2015-01-03';
 a1 |  b1   | c1 
+---+
  1 | a | 2015-01-01
  2 | b | 2015-01-02
  3 | c | 2015-01-03
(3 rows)
{code}

(2) In this fix, did we only implement implicit cast from string to date, or 
should string to numeric work as well?

(3) It feels like we may need to document this behavior as well. What do you 
think?

> Query fails when one of the operands is a DATE literal without an explicit 
> cast
> ---
>
> Key: DRILL-2313
> URL: https://issues.apache.org/jira/browse/DRILL-2313
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Abhishek Girish
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
>
> For operations involving the date datatype, when one of the operands is a 
> DATE literal without a cast, the query fails. 
> *The following query fails to validate:*
> {code:sql}
> SELECT
>  *
> FROM 
>  date_dim
>  
> WHERE d_date BETWEEN '2002-3-01' AND cast('2002-3-01' AS DATE) 
> LIMIT 1;
> {code}
> Query failed: SqlValidatorException: Cannot apply 'BETWEEN' to arguments of 
> type ' BETWEEN  AND '. Supported form(s): 
> ' BETWEEN  AND '
> *The following query executes fine:*
> {code:sql}
> SELECT
>  *
> FROM 
>  date_dim
>  
> WHERE d_date BETWEEN '2002-3-01' AND 
>   '2002-3-01'
> LIMIT 1;
> {code}
> Both queries execute fine on Postgres.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3539) CTAS over empty json file throws NPE

2015-09-15 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746013#comment-14746013
 ] 

Khurram Faraaz commented on DRILL-3539:
---

Adding this here.

{code}
0: jdbc:drill:schema=dfs.tmp> create table empty_tbl as select * from 
`empty.csv`;
Error: SYSTEM ERROR: NullPointerException

Fragment 0:0

[Error Id: 80f34826-d86d-44c4-a5bd-f4d3527a4165 on centos-01.qa.lab:31010] 
(state=,code=0)
{code}

> CTAS over empty json file throws NPE
> 
>
> Key: DRILL-3539
> URL: https://issues.apache.org/jira/browse/DRILL-3539
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>
> CTAS over empty JSON file results in NPE.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> create table t45645 as select * from 
> `empty.json`;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 0:0
> [Error Id: 79039288-5402-4b0a-b32d-5bf5024f3b71 on centos-02.qa.lab:31010] 
> (state=,code=0)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-07-22 00:34:03,788 [2a511b03-90b3-1d39-f4e3-cfd754aa085f:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException
> Fragment 0:0
> [Error Id: 79039288-5402-4b0a-b32d-5bf5024f3b71 on centos-02.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> Fragment 0:0
> [Error Id: 79039288-5402-4b0a-b32d-5bf5024f3b71 on centos-02.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.lang.NullPointerException: null
> at 
> org.apache.drill.exec.physical.impl.WriterRecordBatch.addOutputContainerData(WriterRecordBatch.java:133)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:126)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:83) 
> ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:79)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:73) 
> ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:258)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:252)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) 
> ~[na:1.7.0_45]
>

[jira] [Updated] (DRILL-3539) CTAS over empty json file throws NPE

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3539:
--
Assignee: Sean Hsuan-Yi Chu  (was: Steven Phillips)

> CTAS over empty json file throws NPE
> 
>
> Key: DRILL-3539
> URL: https://issues.apache.org/jira/browse/DRILL-3539
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>
> CTAS over empty JSON file results in NPE.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> create table t45645 as select * from 
> `empty.json`;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 0:0
> [Error Id: 79039288-5402-4b0a-b32d-5bf5024f3b71 on centos-02.qa.lab:31010] 
> (state=,code=0)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-07-22 00:34:03,788 [2a511b03-90b3-1d39-f4e3-cfd754aa085f:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException
> Fragment 0:0
> [Error Id: 79039288-5402-4b0a-b32d-5bf5024f3b71 on centos-02.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> Fragment 0:0
> [Error Id: 79039288-5402-4b0a-b32d-5bf5024f3b71 on centos-02.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.lang.NullPointerException: null
> at 
> org.apache.drill.exec.physical.impl.WriterRecordBatch.addOutputContainerData(WriterRecordBatch.java:133)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:126)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:83) 
> ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:79)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:73) 
> ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:258)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:252)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) 
> ~[na:1.7.0_45]
> at javax.security.auth.Subject.doAs(Subject.java:415) ~[na:1.7.0_45]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.drill.exec.work.fra

[jira] [Created] (DRILL-3784) simple Jdbc program fails with NoClassDefFoundError

2015-09-15 Thread Deneche A. Hakim (JIRA)
Deneche A. Hakim created DRILL-3784:
---

 Summary: simple Jdbc program fails with NoClassDefFoundError
 Key: DRILL-3784
 URL: https://issues.apache.org/jira/browse/DRILL-3784
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Deneche A. Hakim
Assignee: Jacques Nadeau
Priority: Blocker


I have zookeeper installed and I'm running a single drillbit.

I'm running a simple Jdbc program that uses drill-jdbc-all, and I do get the 
following error when trying to connect:
{noformat}
Exception in thread "main" java.lang.NoClassDefFoundError: 
oadd/org/codehaus/jackson/map/ObjectMapper
at 
oadd.org.apache.curator.x.discovery.details.JsonInstanceSerializer.(JsonInstanceSerializer.java:42)
at 
oadd.org.apache.curator.x.discovery.ServiceDiscoveryBuilder.builder(ServiceDiscoveryBuilder.java:42)
at 
oadd.org.apache.drill.exec.coord.zk.ZKClusterCoordinator.getDiscovery(ZKClusterCoordinator.java:265)
at 
oadd.org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:103)
at 
oadd.org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:185)
at 
org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:134)
at 
org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:66)
at 
org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69)
at 
oadd.net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:126)
at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:233)
at SimpleJdbc.main(SimpleJdbc.java:13)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.lang.ClassNotFoundException: 
oadd.org.codehaus.jackson.map.ObjectMapper
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 18 more
{noformat}

Using jdbc-all built right before the DRILL-3589 commit works fine, so the problem 
seems to be related to the changes in DRILL-3589.

Here is the program I'm running:
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SimpleJdbc {

  public static void main(String[] args) throws Exception {
Class.forName("org.apache.drill.jdbc.Driver").newInstance();
Connection conn = DriverManager.getConnection("jdbc:drill:");

Statement stmt = null;
ResultSet rs = null;

try {
  stmt = conn.createStatement();
  rs = stmt.executeQuery("SELECT employee_id FROM cp.`employee.json`");

  while (rs.next()) {
System.out.println(rs.getObject(1));
  }

} finally {
  if (stmt != null) {
stmt.close();
  }
  if (rs != null) {
rs.close();
  }

  conn.close();
}
  }
}
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-2727) CTAS select * from CSV file results in Exception

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-2727.
-

Verified on master b525692e

> CTAS select * from CSV file results in Exception
> 
>
> Key: DRILL-2727
> URL: https://issues.apache.org/jira/browse/DRILL-2727
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 0.9.0
> Environment: 4 node cluster on CentOS
> | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early from 
> HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT 
>Reporter: Khurram Faraaz
>Assignee: Jacques Nadeau
> Fix For: 1.2.0
>
>
> CREATE TABLE csv_tbl as SELECT * FROM `input.csv`
> results in an Exception with the message "Repeated types are not supported"
> {code}
> 0: jdbc:drill:> create table newCSV_Int_tbl12 as select * from `int_f.csv`;
> Query failed: Query stopped., Repeated types are not supported. [ 
> 84ab8d15-6aac-42bb-908a-d663e7453abf on centos-02.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> 0: jdbc:drill:> create table newCSV_Int_tbl12 as select * from `int_f.csv`;
> Query failed: Query stopped., Repeated types are not supported. [ 
> 84ab8d15-6aac-42bb-908a-d663e7453abf on centos-02.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> 0: jdbc:drill:> create table newCSV_Int_tbl13 as select columns[0] from 
> `int_f.csv`;
> ++---+
> |  Fragment  | Number of records written |
> ++---+
> | 0_0| 12|
> ++---+
> 1 row selected (0.146 seconds)
> 0: jdbc:drill:> select * from newCSV_Int_tbl13;
> ++
> |  columns   |
> ++
> | ["EXPR$0"] |
> | ["1"]  |
> | ["0"]  |
> | ["-1"] |
> | ["65535"]  |
> | ["1234567"] |
> | ["100"] |
> | ["101010"] |
> | ["1"]  |
> | ["100"]|
> | ["13"] |
> | ["19"] |
> | ["17"] |
> ++
> 13 rows selected (0.117 seconds)
> Stack trace from drillbit.log
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running 
> fragment., Repeated types are not supported. [ 
> 0b36b8c1-3d1e-40a4-9937-aade9ead445b on centos-02.qa.lab:31010 ]
> [ 0b36b8c1-3d1e-40a4-9937-aade9ead445b on centos-02.qa.lab:31010 ]
> at 
> org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:163)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:281)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:114)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:110)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:230)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:165)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_75]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_75]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> 2015-04-09 00:37:48,977 [2ada3623-63c8-3c52-31cd-45c219efd25f:frag:0:0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
> fragment
> java.lang.RuntimeException: Error closing fragment context.
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:224)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:166)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_75]
> at 
> java.util.concurrent.ThreadP

[jira] [Closed] (DRILL-2843) Reading empty CSV file fails with error (rather than yielding zero rows)

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-2843.
-

> Reading empty CSV file fails with error (rather than yielding zero rows)
> 
>
> Key: DRILL-2843
> URL: https://issues.apache.org/jira/browse/DRILL-2843
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Reporter: Daniel Barclay (Drill)
>Assignee: Steven Phillips
> Fix For: 1.2.0
>
>
> Reading an empty CSV file (an empty file with a name ending with ".csv") 
> fails with an internal error rather than yielding zero rows:
> > SELECT * FROM `empty_file.csv`;
> Query failed: SYSTEM ERROR: Unexpected exception during fragment 
> initialization: Incoming endpoints 1 is greater than number of row groups 0
> [4e7c7167-51eb-485c-a07b-d41c2f63a670 on dev-linux2:31010]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> > 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3295) UNION (distinct type) is supported now

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3295.
-

Verified.

> UNION (distinct type) is supported now
> --
>
> Key: DRILL-3295
> URL: https://issues.apache.org/jira/browse/DRILL-3295
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Bridget Bevens
> Fix For: 1.2.0
>
>
> UNION (distinct type) is supported now
> https://issues.apache.org/jira/browse/DRILL-1169
> We can update the documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3095) Memory Leak : Failure while closing accountor.

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3095.
-

> Memory Leak : Failure while closing accountor.
> --
>
> Key: DRILL-3095
> URL: https://issues.apache.org/jira/browse/DRILL-3095
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.0.0
> Environment: f7f6efc525cd833ce1530deae32eb9ccb20b664a
>Reporter: Khurram Faraaz
>Assignee: Chris Westin
> Fix For: 1.2.0
>
>
> I am seeing a memory leak when I cancel a long-running query in sqlline. I am 
> re-running the query with assertions enabled and will add details after the 
> second run is complete.
> Long running query was,
> {code}
> select key1, key2 from `twoKeyJsn.json`;
> {code}
> I pressed Ctrl-C while the above query was running in sqlline, and then issued the 
> query below, which returned correct results. After that I see a memory 
> leak message in drillbit.log.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(*) from `twoKeyJsn.json`;
> ++
> |   EXPR$0   |
> ++
> | 26212355   |
> ++
> 1 row selected (14.734 seconds)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-05-15 00:59:01,951 [2aaabb50-3afd-2906-3f48-eb86a315a1f5:frag:0:0] WARN  
> o.a.drill.exec.ops.SendingAccountor - Interrupted while waiting for send 
> complete. Continuing to wait.
> java.lang.InterruptedException: null
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1301)
>  ~[na:1.7.0_45]
> at java.util.concurrent.Semaphore.acquire(Semaphore.java:472) 
> ~[na:1.7.0_45]
> at 
> org.apache.drill.exec.ops.SendingAccountor.waitForSendComplete(SendingAccountor.java:48)
>  ~[drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.FragmentContext.waitForSendComplete(FragmentContext.java:436)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.close(BaseRootExec.java:112) 
> [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.close(ScreenCreator.java:141)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:333)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:278)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 2015-05-15 00:59:01,952 [2aaabb50-3afd-2906-3f48-eb86a315a1f5:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2aaabb50-3afd-2906-3f48-eb86a315a1f5:0:0: State change requested from 
> CANCELLATION_REQUESTED --> FAILED for
> 2015-05-15 00:59:01,952 [2aaabb50-3afd-2906-3f48-eb86a315a1f5:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2aaabb50-3afd-2906-3f48-eb86a315a1f5:0:0: State change requested from FAILED 
> --> FAILED for
> 2015-05-15 00:59:01,952 [2aaabb50-3afd-2906-3f48-eb86a315a1f5:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2aaabb50-3afd-2906-3f48-eb86a315a1f5:0:0: State change requested from FAILED 
> --> FINISHED for
> 2015-05-15 00:59:01,956 [2aaabb50-3afd-2906-3f48-eb86a315a1f5:frag:0:0] ERROR 
> o.a.d.c.exceptions.UserException - SYSTEM ERROR: 
> java.lang.IllegalStateException: Failure while closing accountor.  Expected 
> private and shared pools to be set to initial values.  However, one or more 
> were not.  Stats are
> zoneinitallocated   delta
> private 100 918080  81920
> shared  00  00  0.
> Fragment 0:0
> [Error Id: 90ced8b1-b6db-438f-b193-b7634de31b81 on centos-03.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> java.lang.IllegalStateException: Failure while closing accountor.  Expected 
> private and shared pools to be set to initial values.  However, one or more 
> were not.  Stats are
> zoneinitallocated   delta
> private 100 918080  81920
> shared  00  00  0.
> Fragment 0:0
> [Error Id: 90

[jira] [Closed] (DRILL-3536) Add support for LEAD, LAG, NTILE, FIRST_VALUE and LAST_VALUE window functions

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3536.
-

Verified on master b525692e

> Add support for LEAD, LAG, NTILE, FIRST_VALUE and LAST_VALUE window functions
> -
>
> Key: DRILL-3536
> URL: https://issues.apache.org/jira/browse/DRILL-3536
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
>  Labels: window_function
> Fix For: 1.2.0
>
> Attachments: DRILL-3536.1.patch.txt, DRILL-3536.2.patch.txt, 
> DRILL-3536.3.patch.txt
>
>
> This JIRA will track the progress on the following window functions (no 
> particular order):
> - LEAD
> - LAG
> - NTILE
> - FIRST_VALUE
> - LAST_VALUE
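
As an illustrative sketch (not taken from the JIRA; the sample table and the zk=local 
connection URL are assumptions), the functions listed above can be exercised over the 
bundled employee.json sample through the JDBC driver:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class WindowFunctionsSketch {

  public static void main(String[] args) throws Exception {
    // Assumes an embedded/local Drill; cp.`employee.json` ships with Drill.
    try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=local");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(
             "SELECT employee_id, position_id, salary, "
           + "  LAG(salary)         OVER (PARTITION BY position_id ORDER BY employee_id) AS prev_salary, "
           + "  LEAD(salary)        OVER (PARTITION BY position_id ORDER BY employee_id) AS next_salary, "
           + "  NTILE(4)            OVER (PARTITION BY position_id ORDER BY employee_id) AS quartile, "
           + "  FIRST_VALUE(salary) OVER (PARTITION BY position_id ORDER BY employee_id) AS first_sal, "
           + "  LAST_VALUE(salary)  OVER (PARTITION BY position_id ORDER BY employee_id) AS last_sal "
           + "FROM cp.`employee.json` LIMIT 10")) {
      while (rs.next()) {
        System.out.println(rs.getObject("employee_id") + "\t" + rs.getObject("quartile"));
      }
    }
  }
}
{code}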



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-15 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745981#comment-14745981
 ] 

Khurram Faraaz commented on DRILL-3783:
---

Query plan for the query

{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select count(c1) from (select 
cast(columns[0] as int) c1 from `testWindow.csv`) union all (select 
cast(columns[0] as int) c2 from `testWindow.csv`);
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(EXPR$0=[$0])
00-02UnionAll(all=[true])
00-04  StreamAgg(group=[{}], EXPR$0=[COUNT($0)])
00-06Project(c1=[CAST(ITEM($0, 0)):INTEGER])
00-07  Scan(groupscan=[EasyGroupScan 
[selectionRoot=maprfs:/tmp/testWindow.csv, numFiles=1, columns=[`columns`[0]], 
files=[maprfs:///tmp/testWindow.csv]]])
00-03  Project(c2=[CAST(ITEM($0, 0)):INTEGER])
00-05Scan(groupscan=[EasyGroupScan 
[selectionRoot=maprfs:/tmp/testWindow.csv, numFiles=1, columns=[`columns`[0]], 
files=[maprfs:///tmp/testWindow.csv]]])
{code}

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> COUNT over the results returned by a UNION ALL query returns incorrect results. The 
> query below used to return an Exception (please see DRILL-2637); that JIRA was marked 
> as fixed, but the query now returns incorrect results.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2949) TPC-DS queries 1 and 30 fail with CannotPlanException

2015-09-15 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745977#comment-14745977
 ] 

Victoria Markman commented on DRILL-2949:
-

[~agirish],

I think you've already re-enabled these. Can you please confirm ?

Thanks,
Vicky.

> TPC-DS queries 1 and 30 fail with CannotPlanException
> -
>
> Key: DRILL-2949
> URL: https://issues.apache.org/jira/browse/DRILL-2949
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-2949-Fix-CannotPlanException-for-TPC-DS-1-and-.patch
>
>
> TPC-DS queries 1 & 30 fail on recent master [Regression]:
> {code}
> SYSTEM ERROR: This query cannot be planned possibly due to either a cartesian 
> join or an inequality join 
> {code}
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2949) TPC-DS queries 1 and 30 fail with CannotPlanException

2015-09-15 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2949:

Assignee: Abhishek Girish  (was: Aman Sinha)

> TPC-DS queries 1 and 30 fail with CannotPlanException
> -
>
> Key: DRILL-2949
> URL: https://issues.apache.org/jira/browse/DRILL-2949
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-2949-Fix-CannotPlanException-for-TPC-DS-1-and-.patch
>
>
> TPC-DS queries 1 & 30 fail on recent master [Regression]:
> {code}
> SYSTEM ERROR: This query cannot be planned possibly due to either a cartesian 
> join or an inequality join 
> {code}
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3580) wrong plan for window function queries containing function(col1 + colb)

2015-09-15 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745973#comment-14745973
 ] 

Victoria Markman commented on DRILL-3580:
-

Verified fixed in 1.2.0

#Fri Sep 11 05:38:24 UTC 2015
git.commit.id.abbrev=b525692

Tests added under: Functional/Passing/window_functions/multiple_partitions

> wrong plan for window function queries containing function(col1 + colb)
> ---
>
> Key: DRILL-3580
> URL: https://issues.apache.org/jira/browse/DRILL-3580
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Deneche A. Hakim
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
>  Labels: window_function
> Fix For: 1.2.0
>
>
> The following query has a wrong plan:
> {noformat}
> explain plan for select position_id, salary, sum(salary) over (partition by 
> position_id), sum(position_id + salary) over (partition by position_id) from 
> cp.`employee.json` limit 20;
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  ProjectAllowDup(position_id=[$0], salary=[$1], EXPR$2=[$2], 
> EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[20])
> 00-04Project(position_id=[$0], salary=[$1], w0$o0=[$2], 
> w0$o00=[$4])
> 00-05  Window(window#0=[window(partition {0} order by [] range 
> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($3)])])
> 00-06Project(position_id=[$1], salary=[$2], w0$o0=[$3], 
> $3=[+($1, $2)])
> 00-07  Window(window#0=[window(partition {1} order by [] 
> range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($2)])])
> 00-08SelectionVectorRemover
> 00-09  Sort(sort0=[$1], dir0=[ASC])
> 00-10Project(T13¦¦*=[$0], position_id=[$1], 
> salary=[$2])
> 00-11  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=classpath:/employee.json, numFiles=1, columns=[`*`], 
> files=[classpath:/employee.json]]])
> {noformat}
> The plan contains 2 window operators which shouldn't be possible according to 
> DRILL-3196. 
> The results are also incorrect.
> Depending on which aggregation or window function used we get wrong results 
> or an IndexOutOfBounds exception



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3580) wrong plan for window function queries containing function(col1 + colb)

2015-09-15 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman closed DRILL-3580.
---

> wrong plan for window function queries containing function(col1 + colb)
> ---
>
> Key: DRILL-3580
> URL: https://issues.apache.org/jira/browse/DRILL-3580
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Deneche A. Hakim
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
>  Labels: window_function
> Fix For: 1.2.0
>
>
> The following query has a wrong plan:
> {noformat}
> explain plan for select position_id, salary, sum(salary) over (partition by 
> position_id), sum(position_id + salary) over (partition by position_id) from 
> cp.`employee.json` limit 20;
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  ProjectAllowDup(position_id=[$0], salary=[$1], EXPR$2=[$2], 
> EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[20])
> 00-04Project(position_id=[$0], salary=[$1], w0$o0=[$2], 
> w0$o00=[$4])
> 00-05  Window(window#0=[window(partition {0} order by [] range 
> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($3)])])
> 00-06Project(position_id=[$1], salary=[$2], w0$o0=[$3], 
> $3=[+($1, $2)])
> 00-07  Window(window#0=[window(partition {1} order by [] 
> range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($2)])])
> 00-08SelectionVectorRemover
> 00-09  Sort(sort0=[$1], dir0=[ASC])
> 00-10Project(T13¦¦*=[$0], position_id=[$1], 
> salary=[$2])
> 00-11  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=classpath:/employee.json, numFiles=1, columns=[`*`], 
> files=[classpath:/employee.json]]])
> {noformat}
> The plan contains 2 window operators which shouldn't be possible according to 
> DRILL-3196. 
> The results are also incorrect.
> Depending on which aggregation or window function used we get wrong results 
> or an IndexOutOfBounds exception



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3542) Rebase Drill on Calcite 1.4.0 release

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3542.
-

> Rebase Drill on Calcite 1.4.0 release
> -
>
> Key: DRILL-3542
> URL: https://issues.apache.org/jira/browse/DRILL-3542
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Aman Sinha
> Fix For: 1.2.0
>
>
> DRILL-1384 rebased Drill on Calcite from 0.9 to 1.1. However, since the most 
> recent version of Calcite is 1.4.0-SNAPSHOT, Drill has to move forward with the 
> Calcite library again.
> Once we finish the rebasing, if there are regressions on the Drill side, this 
> JIRA will be used as the umbrella JIRA; we will create an individual JIRA for 
> each category of regression and link those JIRA(s) here. This is different 
> from the way DRILL-1384 worked, where we mixed the fixes together, making it 
> hard to understand the reason for each code change during the rebasing effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3553) add support for LEAD and LAG window functions

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3553.
-

Verified on master b525692e

> add support for LEAD and LAG window functions
> -
>
> Key: DRILL-3553
> URL: https://issues.apache.org/jira/browse/DRILL-3553
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
>  Labels: window_function
> Fix For: 1.2.0
>
>
> From SQL standard here is the general format of LEAD and LAG:
> {noformat}
> <window function> ::=
>   <window function type> OVER <window name or specification>
> {noformat}
> {noformat}
> <lead or lag function> ::=
>   <lead or lag> ( <lead or lag extent>
>   [ , <offset> [ , <default expression> ] ] )
>   [ <null treatment> ]
> {noformat}
> {noformat}
> <lead or lag> ::=
>   LEAD | LAG
> {noformat}
> {noformat}
> <lead or lag extent> ::=
>   <value expression>
> {noformat}
> {noformat}
> <offset> ::=
>   <exact numeric literal>
> {noformat}
> {noformat}
> <default expression> ::=
>   <value expression>
> {noformat}
> The following won't be supported until CALCITE-337 is resolved:
> {noformat}
> <null treatment> ::=
>   RESPECT NULLS | IGNORE NULLS
> {noformat}
> As part of this JIRA task only the following syntax will be supported:
> {noformat}
> <lead or lag function> ::=
>   <lead or lag> ( <lead or lag extent> )
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3542) Rebase Drill on Calcite 1.4.0 release

2015-09-15 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745970#comment-14745970
 ] 

Khurram Faraaz commented on DRILL-3542:
---

Verified on master b525692e

> Rebase Drill on Calcite 1.4.0 release
> -
>
> Key: DRILL-3542
> URL: https://issues.apache.org/jira/browse/DRILL-3542
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Aman Sinha
> Fix For: 1.2.0
>
>
> DRILL-1384 rebased Drill on Calcite from 0.9 to 1.1. However, since the most 
> recent version of Calcite is 1.4.0-SNAPSHOT, Drill has to move forward with the 
> Calcite library again.
> Once we finish the rebasing, if there are regressions on the Drill side, this 
> JIRA will be used as the umbrella JIRA; we will create an individual JIRA for 
> each category of regression and link those JIRA(s) here. This is different 
> from the way DRILL-1384 worked, where we mixed the fixes together, making it 
> hard to understand the reason for each code change during the rebasing effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3557) Reading empty CSV file fails with SYSTEM ERROR

2015-09-15 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745967#comment-14745967
 ] 

Khurram Faraaz commented on DRILL-3557:
---

Verified on master commit id b525692e

{code}
[root@centos-01 ~]# hadoop fs -ls /tmp/empty.csv
-rwxr-xr-x   3 root root  0 2015-09-15 19:18 /tmp/empty.csv

0: jdbc:drill:schema=dfs.tmp> select * from `empty.csv`;
+--+
|  |
+--+
+--+
No rows selected (0.175 seconds)
{code}

> Reading empty CSV file fails with SYSTEM ERROR
> --
>
> Key: DRILL-3557
> URL: https://issues.apache.org/jira/browse/DRILL-3557
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
>
> Trying to read an empty CSV file (a file containing zero bytes) fails with a 
> system error:
> {noformat}
> 0: jdbc:drill:zk=local> SELECT * FROM `dfs.root`.`/tmp/empty.csv`;
> Error: SYSTEM ERROR: IllegalArgumentException: MinorFragmentId 0 has no read 
> entries assigned
> [Error Id: f1da68f6-9749-45bc-956b-20cbc6d28894 on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3557) Reading empty CSV file fails with SYSTEM ERROR

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3557.
-

> Reading empty CSV file fails with SYSTEM ERROR
> --
>
> Key: DRILL-3557
> URL: https://issues.apache.org/jira/browse/DRILL-3557
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
>
> Trying to read an empty CSV file (a file containing zero bytes) fails with a 
> system error:
> {noformat}
> 0: jdbc:drill:zk=local> SELECT * FROM `dfs.root`.`/tmp/empty.csv`;
> Error: SYSTEM ERROR: IllegalArgumentException: MinorFragmentId 0 has no read 
> entries assigned
> [Error Id: f1da68f6-9749-45bc-956b-20cbc6d28894 on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3583) SUM on varchar column produces incorrect error

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3583.
-

> SUM on varchar column produces incorrect error
> --
>
> Key: DRILL-3583
> URL: https://issues.apache.org/jira/browse/DRILL-3583
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Affects Versions: 1.1.0
>Reporter: Adam Gilmore
>Assignee: Sudheesh Katkam
> Fix For: 1.2.0
>
> Attachments: DRILL-3583.1.patch.txt
>
>
> With the implementation of DRILL-3319, a bug was introduced whereby the 
> codegen for an aggregate fails when SUMing a varchar column:
> {code}
> 0: jdbc:drill:zk=local> select sum(full_name) from cp.`employee.json`;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> CompileException: Line 57, Column 177: Unknown variable or type "logger"
> Fragment 0:0
> [Error Id: 8d5585c4-620c-4275-b0c5-8bc4cbc2da90 on 
> pharma-lap14.ad.pharmadata.net.au:31010]
> at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
> at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
> at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
> at sqlline.SqlLine.print(SqlLine.java:1583)
> at sqlline.Commands.execute(Commands.java:852)
> at sqlline.Commands.sql(Commands.java:751)
> at sqlline.SqlLine.dispatch(SqlLine.java:738)
> at sqlline.SqlLine.begin(SqlLine.java:612)
> at sqlline.SqlLine.start(SqlLine.java:366)
> at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> This is due to the fact that AggregateErrorFunctions now builds its errors with a 
> "logger" static field, which does not exist in the codegenned code.
> We either need to include a static logger in codegen aggregates, or revert 
> back to simpler exceptions for these functions.
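
A rough sketch of the first option, assuming illustrative class and method names only 
(the real Drill codegen templates differ): declaring the logger as a static field on the 
generated class would let error-building helpers that reference "logger" compile.

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative shape of a generated aggregate class that carries its own static
// logger, so helper code referencing "logger" resolves at compile time.
public class GeneratedVarCharAggSketch {

  static final Logger logger = LoggerFactory.getLogger(GeneratedVarCharAggSketch.class);

  public void add(Object value) {
    if (value instanceof CharSequence) {
      // An error helper built around the static logger now has it in scope.
      logger.debug("SUM is not defined for varchar input: {}", value);
      throw new UnsupportedOperationException(
          "Only COUNT, MIN and MAX aggregate functions supported for VarChar type");
    }
  }
}
{code}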



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3583) SUM on varchar column produces incorrect error

2015-09-15 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745960#comment-14745960
 ] 

Khurram Faraaz commented on DRILL-3583:
---

Verified on master commit ID: b525692e
We now get a better error message.

{code}
0: jdbc:drill:schema=dfs.tmp> select sum(full_name) from cp.`employee.json`;
Error: UNSUPPORTED_OPERATION ERROR: Only COUNT, MIN and MAX aggregate functions 
supported for VarChar type

Fragment 0:0

[Error Id: 94a2b8f3-b67e-42e8-a776-a131b5f467bc on centos-01.qa.lab:31010] 
(state=,code=0)
{code}

> SUM on varchar column produces incorrect error
> --
>
> Key: DRILL-3583
> URL: https://issues.apache.org/jira/browse/DRILL-3583
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Affects Versions: 1.1.0
>Reporter: Adam Gilmore
>Assignee: Sudheesh Katkam
> Fix For: 1.2.0
>
> Attachments: DRILL-3583.1.patch.txt
>
>
> With the implementation of DRILL-3319, a bug was introduced whereby the 
> codegen for an aggregate fails when SUMing a varchar column:
> {code}
> 0: jdbc:drill:zk=local> select sum(full_name) from cp.`employee.json`;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> CompileException: Line 57, Column 177: Unknown variable or type "logger"
> Fragment 0:0
> [Error Id: 8d5585c4-620c-4275-b0c5-8bc4cbc2da90 on 
> pharma-lap14.ad.pharmadata.net.au:31010]
> at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
> at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
> at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
> at sqlline.SqlLine.print(SqlLine.java:1583)
> at sqlline.Commands.execute(Commands.java:852)
> at sqlline.Commands.sql(Commands.java:751)
> at sqlline.SqlLine.dispatch(SqlLine.java:738)
> at sqlline.SqlLine.begin(SqlLine.java:612)
> at sqlline.SqlLine.start(SqlLine.java:366)
> at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> This is due to the fact that AggregateErrorFunctions now builds its errors with a 
> "logger" static field, which does not exist in the codegenned code.
> We either need to include a static logger in codegen aggregates, or revert 
> back to simpler exceptions for these functions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3616) Memory leak in a cleanup code after canceling queries with window functions spilling to disk

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang updated DRILL-3616:
--
Assignee: Victoria Markman  (was: Deneche A. Hakim)

> Memory leak in a cleanup code after canceling queries with window functions 
> spilling to disk
> 
>
> Key: DRILL-3616
> URL: https://issues.apache.org/jira/browse/DRILL-3616
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: private-locking-allocator-branch
>Reporter: Victoria Markman
>Assignee: Victoria Markman
> Fix For: 1.2.0
>
> Attachments: DRILL-3616.1.patch.txt
>
>
> A bunch of concurrent queries with window functions were cancelled.
> Got an error in drillbit.log that might indicate that we have a memory leak 
> in the cleanup code after cancellation.
> Assigning to myself for creation of a reproducible test case.
> {code}
> 2015-08-05 22:43:56,475 [2a3d6e54-12c2-2519-3ea1-736cb1e39e2a:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2a3d6e54-12c2-2519-3ea1-736cb1e39e2a:0:0: State change requested from 
> CANCELLATION_REQUESTED --> FAILED
> 2015-08-05 22:43:56,475 [2a3d6e54-12c2-2519-3ea1-736cb1e39e2a:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2a3d6e54-12c2-2519-3ea1-736cb1e39e2a:0:0: State change requested from FAILED 
> --> FAILED
> 2015-08-05 22:43:56,475 [2a3d6e54-12c2-2519-3ea1-736cb1e39e2a:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2a3d6e54-12c2-2519-3ea1-736cb1e39e2a:0:0: State change requested from FAILED 
> --> FINISHED
> 2015-08-05 22:43:56,476 [2a3d6e54-12c2-2519-3ea1-736cb1e39e2a:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: 
> Unaccounted for outstanding allocation (902492)
> Fragment 0:0
> [Error Id: 1b9714b9-5a39-48ec-80e7-c49c79825cda on atsqa4-133.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Unaccounted for outstanding allocation (902492)
> Fragment 0:0
> [Error Id: 1b9714b9-5a39-48ec-80e7-c49c79825cda on atsqa4-133.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.RuntimeException: Exception while closing
> at 
> org.apache.drill.common.DrillAutoCloseables.closeNoChecked(DrillAutoCloseables.java:46)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:139)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:439)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:424) 
> ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:352)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:173)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> ... 5 common frames omitted
> Caused by: java.lang.IllegalStateException: Unaccounted for outstanding 
> allocation (902492)
> at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:1278) 
> ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.common.DrillAutoCloseables.closeNoChecked(DrillAutoCloseables.java:44)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> ... 10 common frames omitted
> 2015-08-05 22:43:56,477 [2a3d6e54-f2c1-4682-9121-73c9110e3dd7:frag:0:0] ERROR 

[jira] [Updated] (DRILL-3622) With user authentication enabled, only admin users should be able to change system options

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang updated DRILL-3622:
--
Assignee: Krystal  (was: Venki Korukanti)

> With user authentication enabled, only admin users should be able to change 
> system options
> --
>
> Key: DRILL-3622
> URL: https://issues.apache.org/jira/browse/DRILL-3622
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Sudheesh Katkam
>Assignee: Krystal
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3619) Add support for NTILE window function

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3619.
-

Verified.

> Add support for NTILE window function
> -
>
> Key: DRILL-3619
> URL: https://issues.apache.org/jira/browse/DRILL-3619
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
>  Labels: window_function
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3608) add support for FIRST_VALUE and LAST_VALUE

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3608.
-

Verified on master.

> add support for FIRST_VALUE and LAST_VALUE
> --
>
> Key: DRILL-3608
> URL: https://issues.apache.org/jira/browse/DRILL-3608
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
>  Labels: window_function
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3180) Apache Drill JDBC storage plugin to query rdbms systems such as MySQL and Netezza from Apache Drill

2015-09-15 Thread Ajo Abraham (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745931#comment-14745931
 ] 

Ajo Abraham commented on DRILL-3180:


Hi [~jnadeau] - is it possible to push a query down directly, bypassing all the 
Drill optimization? Then we could leverage database-native functionality not 
supported by Drill. Use case: I have a really complex query I want to run on a 
Postgres DB and then use that result set inside Drill with other Drill sources. But 
I just want the query on Postgres to be passed through as-is. Thanks!

> Apache Drill JDBC storage plugin to query rdbms systems such as MySQL and 
> Netezza from Apache Drill
> ---
>
> Key: DRILL-3180
> URL: https://issues.apache.org/jira/browse/DRILL-3180
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.0.0
>Reporter: Magnus Pierre
>Assignee: Jacques Nadeau
>  Labels: Drill, JDBC, plugin
> Fix For: 1.2.0
>
> Attachments: patch.diff, pom.xml, storage-mpjdbc.zip
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> I have developed the base code for a JDBC storage-plugin for Apache Drill. 
> The code is primitive but constitutes a good starting point for further 
> coding. Today it provides primitive support for SELECT against RDBMS with 
> JDBC. 
> The goal is to provide complete SELECT support against RDBMS with push down 
> capabilities.
> Currently the code is using standard JDBC classes.
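
For context, a minimal sketch of the kind of plain-JDBC read the plugin wraps (the driver 
URL, credentials, and table name are assumptions, not taken from the attached code):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

// Sketch: a plain SELECT through standard JDBC classes, which is essentially
// what the storage plugin wraps when scanning an RDBMS table.
public class PlainJdbcScan {

  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
             "jdbc:mysql://localhost:3306/test", "user", "password");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT id, name FROM customers")) {
      ResultSetMetaData md = rs.getMetaData();
      while (rs.next()) {
        for (int i = 1; i <= md.getColumnCount(); i++) {
          System.out.print(rs.getObject(i) + (i < md.getColumnCount() ? ", " : "\n"));
        }
      }
    }
  }
}
{code}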



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3635) IllegalArgumentException - not a Parquet file (too small)

2015-09-15 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-3635.
-
Assignee: Chun Chang  (was: Deneche A. Hakim)

Ran the same test in a loop 10 times and did not hit the failure.

> IllegalArgumentException - not a Parquet file (too small)
> -
>
> Key: DRILL-3635
> URL: https://issues.apache.org/jira/browse/DRILL-3635
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow, Storage - Parquet
>Affects Versions: 1.1.0
> Environment: Test framework
>Reporter: Chris Westin
>Assignee: Chun Chang
> Fix For: 1.2.0
>
>
> The (MapR internal) regression suite is sporadically seeing this error:
> /root/private-sql-hadoop-test/framework/resources/Precommit/Functional/ctas_flatten/10rows/filter4.q
> Query: 
> select * from dfs.ctas_flatten.`filter4_10rows_ctas`
> Failed with exception
> java.sql.SQLException: SYSTEM ERROR: IllegalArgumentException: 
> maprfs:///drill/testdata/ctas_flatten/filter4_10rows_ctas/0_0_0.parquet 
> is not a Parquet file (too small)
> [Error Id: 9749d6a7-685d-4663-9b27-1a456a5dec40 on drillats3.qa.lab:31010]
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:244)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:287)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1362)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:72)
>   at 
> net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:78)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:96)
>   at 
> org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:144)
>   at 
> org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:83)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM 
> ERROR: IllegalArgumentException: 
> maprfs:///drill/testdata/ctas_flatten/filter4_10rows_ctas/0_0_0.parquet 
> is not a Parquet file (too small)
> It doesn't happen every time, but based on looking at log files, it seems to 
> happen more than half the time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (DRILL-3658) Missing org.apache.hadoop in the JDBC jar

2015-09-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) reopened DRILL-3658:
---

> Missing org.apache.hadoop in the JDBC jar
> -
>
> Key: DRILL-3658
> URL: https://issues.apache.org/jira/browse/DRILL-3658
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Piotr Sokólski
>Assignee: Daniel Barclay (Drill)
>Priority: Blocker
> Fix For: 1.2.0
>
>
> java.lang.ClassNotFoundException: local.org.apache.hadoop.io.Text is thrown 
> while trying to access a text field from a result set returned from Drill 
> while using the drill-jdbc-all.jar



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-2489) Accessing Connection, Statement, PreparedStatement after they are closed should throw a SQLException

2015-09-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) reassigned DRILL-2489:
-

Assignee: Daniel Barclay (Drill)  (was: Mehant Baid)

> Accessing Connection, Statement, PreparedStatement after they are closed 
> should throw a SQLException
> 
>
> Key: DRILL-2489
> URL: https://issues.apache.org/jira/browse/DRILL-2489
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Rahul Challapalli
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> git.commit.id.abbrev=7b4c887
> According to the JDBC spec we should throw a SQLException when methods are accessed 
> on a closed Connection, Statement, or PreparedStatement. Drill is currently 
> not doing this. 
> I can raise multiple JIRAs if the developer wishes to work on them 
> independently.
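
To make the expected behavior concrete, a minimal sketch (the connection URL is an 
assumption; per the JDBC spec, the call on the closed Statement must throw SQLException):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class ClosedStatementCheck {

  public static void main(String[] args) throws Exception {
    Connection conn = DriverManager.getConnection("jdbc:drill:zk=local");
    Statement stmt = conn.createStatement();
    stmt.close();
    try {
      // A closed Statement must reject further use with a SQLException.
      stmt.executeQuery("SELECT * FROM cp.`employee.json`");
      System.out.println("BUG: no exception thrown for a closed Statement");
    } catch (SQLException expected) {
      System.out.println("OK: " + expected.getMessage());
    } finally {
      conn.close();
    }
  }
}
{code}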



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3780) storage plugin configurations in ZooKeeper need to be secured

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3780:
--
Assignee: Venki Korukanti

> storage plugin configurations in ZooKeeper need to be secured
> -
>
> Key: DRILL-3780
> URL: https://issues.apache.org/jira/browse/DRILL-3780
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Kristine Hahn
>Assignee: Venki Korukanti
>
> Drill saves storage plugin configurations in ZooKeeper (distributed mode), 
> and even when authorization is enabled to prevent modification or deletion of the 
> configurations from the Web UI (DRILL-3725, 3201, 3622), an unauthorized user 
> can still access the configurations directly in ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-2637) Schema change reported incorrectly although both the input columns are of same datatype

2015-09-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-2637.
-

> Schema change reported incorrectly although both the input columns are of 
> same datatype
> ---
>
> Key: DRILL-2637
> URL: https://issues.apache.org/jira/browse/DRILL-2637
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>
> Schema change is reported incorrectly, although the two columns hold data of 
> the same datatype; those two columns are input to Union All and an aggregate 
> count is performed on the results returned by Union All.
> Tests were run on a 4 node cluster.
> {code}
> 0: jdbc:drill:> select c1 from (select columns[0] c1 from `testWindow.csv`) 
> union all (select columns[0] c2 from `testWindow.csv`);
> ++
> | c1 |
> ++
> | 100|
> | 10 |
> | 2  |
> | 50 |
> | 55 |
> | 67 |
> | 113|
> | 119|
> | 89 |
> | 57 |
> | 61 |
> | 100|
> | 10 |
> | 2  |
> | 50 |
> | 55 |
> | 67 |
> | 113|
> | 119|
> | 89 |
> | 57 |
> | 61 |
> ++
> 22 rows selected (0.121 seconds)
> {code}
> {code}
> 0: jdbc:drill:> select count(c1) from (select columns[0] c1 from 
> `testWindow.csv`) union all (select columns[0] c2 from `testWindow.csv`);
> ++
> |   EXPR$0   |
> ++
> Query failed: Query stopped., Schema change detected in the left input of 
> Union-All. This is not currently supported [ 
> 57dd6384-fb23-4ab0-aee9-fb7def390788 on centos-04.qa.lab:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
> query.
>   at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
>   at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
>   at sqlline.SqlLine.print(SqlLine.java:1809)
>   at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
>   at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
>   at sqlline.SqlLine.dispatch(SqlLine.java:889)
>   at sqlline.SqlLine.begin(SqlLine.java:763)
>   at sqlline.SqlLine.start(SqlLine.java:498)
>   at sqlline.SqlLine.main(SqlLine.java:460)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-03-31 20:10:07,825 [2ae500df-db85-2583-fa7f-b89beb7e5ac0:frag:0:0] ERROR 
> o.a.drill.exec.work.foreman.Foreman - Error 
> 0b4d9b3a-d8af-4dc9-be47-46c4547a793a: RemoteRpcException: Failure while 
> running fragment., Schema change detected in the left input of Union-All. 
> This is not currently supported [ b9555eb8-c009-4e9c-b058-ffae3f015df7 on 
> centos-04.qa.lab:31010 ]
> [ b9555eb8-c009-4e9c-b058-ffae3f015df7 on centos-04.qa.lab:31010 ]
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running 
> fragment., Schema change detected in the left input of Union-All. This is not 
> currently supported [ b9555eb8-c009-4e9c-b058-ffae3f015df7 on 
> centos-04.qa.lab:31010 ]
> [ b9555eb8-c009-4e9c-b058-ffae3f015df7 on centos-04.qa.lab:31010 ]
> at 
> org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:163)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:281)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:114)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:110)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:230)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:165)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_75]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_75]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> 2015-03-31 20:10:07,825 [2ae500df-db85-2583-fa7f-b89beb7e5ac0:frag:0:0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Error while initializing o

[jira] [Commented] (DRILL-2637) Schema change reported incorrectly although both the input columns are of same datatype

2015-09-15 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745886#comment-14745886
 ] 

Khurram Faraaz commented on DRILL-2637:
---

Verified that we do not see the RelOptPlanner.CannotPlanException anymore on 
master (commit id: b525692e).
However, the same query returns incorrect results. I will report another JIRA to 
track the incorrect results.

> Schema change reported incorrectly although both the input columns are of 
> same datatype
> ---
>
> Key: DRILL-2637
> URL: https://issues.apache.org/jira/browse/DRILL-2637
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>
> Schema change is reported incorrectly, although the two columns hold data of 
> the same datatype; those two columns are input to Union All and an aggregate 
> count is performed on the results returned by Union All.
> Tests were run on a 4 node cluster.
> {code}
> 0: jdbc:drill:> select c1 from (select columns[0] c1 from `testWindow.csv`) 
> union all (select columns[0] c2 from `testWindow.csv`);
> ++
> | c1 |
> ++
> | 100|
> | 10 |
> | 2  |
> | 50 |
> | 55 |
> | 67 |
> | 113|
> | 119|
> | 89 |
> | 57 |
> | 61 |
> | 100|
> | 10 |
> | 2  |
> | 50 |
> | 55 |
> | 67 |
> | 113|
> | 119|
> | 89 |
> | 57 |
> | 61 |
> ++
> 22 rows selected (0.121 seconds)
> {code}
> {code}
> 0: jdbc:drill:> select count(c1) from (select columns[0] c1 from 
> `testWindow.csv`) union all (select columns[0] c2 from `testWindow.csv`);
> ++
> |   EXPR$0   |
> ++
> Query failed: Query stopped., Schema change detected in the left input of 
> Union-All. This is not currently supported [ 
> 57dd6384-fb23-4ab0-aee9-fb7def390788 on centos-04.qa.lab:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
> query.
>   at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
>   at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
>   at sqlline.SqlLine.print(SqlLine.java:1809)
>   at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
>   at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
>   at sqlline.SqlLine.dispatch(SqlLine.java:889)
>   at sqlline.SqlLine.begin(SqlLine.java:763)
>   at sqlline.SqlLine.start(SqlLine.java:498)
>   at sqlline.SqlLine.main(SqlLine.java:460)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-03-31 20:10:07,825 [2ae500df-db85-2583-fa7f-b89beb7e5ac0:frag:0:0] ERROR 
> o.a.drill.exec.work.foreman.Foreman - Error 
> 0b4d9b3a-d8af-4dc9-be47-46c4547a793a: RemoteRpcException: Failure while 
> running fragment., Schema change detected in the left input of Union-All. 
> This is not currently supported [ b9555eb8-c009-4e9c-b058-ffae3f015df7 on 
> centos-04.qa.lab:31010 ]
> [ b9555eb8-c009-4e9c-b058-ffae3f015df7 on centos-04.qa.lab:31010 ]
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running 
> fragment., Schema change detected in the left input of Union-All. This is not 
> currently supported [ b9555eb8-c009-4e9c-b058-ffae3f015df7 on 
> centos-04.qa.lab:31010 ]
> [ b9555eb8-c009-4e9c-b058-ffae3f015df7 on centos-04.qa.lab:31010 ]
> at 
> org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:163)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:281)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:114)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:110)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:230)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:165)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_75]
> at 
> java.util.concurrent.ThreadPoolExecu

[jira] [Comment Edited] (DRILL-3189) Disable ALLOW PARTIAL/DISALLOW PARTIAL in window function grammar

2015-09-15 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745884#comment-14745884
 ] 

Victoria Markman edited comment on DRILL-3189 at 9/15/15 6:25 PM:
--

Since we decided not to support "ROWS" syntax for now, it's not possible to 
verify this fix.
Will revisit if we decide to support it.


was (Author: vicky):
Since we decided not to support "ROWS" syntax for now, it's not possible to 
verify this fix.

> Disable ALLOW PARTIAL/DISALLOW PARTIAL in window function grammar
> -
>
> Key: DRILL-3189
> URL: https://issues.apache.org/jira/browse/DRILL-3189
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.0.0
>Reporter: Victoria Markman
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
>  Labels: window_function
> Fix For: 1.2.0
>
>
> It does not seem to be implemented on the Drill side. Looks like 
> Calcite-specific grammar; I don't see it in the SQL Standard.
> Looks like a wrong result:
> {code}
> 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 
> rows between 1 preceding and 1 following disallow partial) from t2 order by 
> a2;
> +-+-+
> | a2  | EXPR$1  |
> +-+-+
> | 0   | null|
> | 1   | null|
> | 2   | 6   |
> | 2   | 6   |
> | 2   | 6   |
> | 3   | null|
> | 4   | null|
> | 5   | null|
> | 6   | null|
> | 7   | 14  |
> | 7   | 14  |
> | 8   | null|
> | 9   | null|
> +-+-+
> 13 rows selected (0.213 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 
> rows between 1 preceding and 1 following allow partial) from t2 order by a2;
> +-+-+
> | a2  | EXPR$1  |
> +-+-+
> | 0   | 0   |
> | 1   | 1   |
> | 2   | 6   |
> | 2   | 6   |
> | 2   | 6   |
> | 3   | 3   |
> | 4   | 4   |
> | 5   | 5   |
> | 6   | 6   |
> | 7   | 14  |
> | 7   | 14  |
> | 8   | 8   |
> | 9   | 9   |
> +-+-+
> 13 rows selected (0.208 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 
> disallow partial) from t2 order by a2;
> Error: PARSE ERROR: From line 1, column 53 to line 1, column 68: Cannot use 
> DISALLOW PARTIAL with window based on RANGE
> [Error Id: 984c4b81-9eb0-401d-b36a-9580640b4a78 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3189) Disable ALLOW PARTIAL/DISALLOW PARTIAL in window function grammar

2015-09-15 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman closed DRILL-3189.
---

> Disable ALLOW PARTIAL/DISALLOW PARTIAL in window function grammar
> -
>
> Key: DRILL-3189
> URL: https://issues.apache.org/jira/browse/DRILL-3189
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.0.0
>Reporter: Victoria Markman
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
>  Labels: window_function
> Fix For: 1.2.0
>
>
> It does not seem to be implemented on the Drill side. Looks like 
> Calcite-specific grammar; I don't see it in the SQL Standard.
> Looks like a wrong result:
> {code}
> 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 
> rows between 1 preceding and 1 following disallow partial) from t2 order by 
> a2;
> +-+-+
> | a2  | EXPR$1  |
> +-+-+
> | 0   | null|
> | 1   | null|
> | 2   | 6   |
> | 2   | 6   |
> | 2   | 6   |
> | 3   | null|
> | 4   | null|
> | 5   | null|
> | 6   | null|
> | 7   | 14  |
> | 7   | 14  |
> | 8   | null|
> | 9   | null|
> +-+-+
> 13 rows selected (0.213 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 
> rows between 1 preceding and 1 following allow partial) from t2 order by a2;
> +-+-+
> | a2  | EXPR$1  |
> +-+-+
> | 0   | 0   |
> | 1   | 1   |
> | 2   | 6   |
> | 2   | 6   |
> | 2   | 6   |
> | 3   | 3   |
> | 4   | 4   |
> | 5   | 5   |
> | 6   | 6   |
> | 7   | 14  |
> | 7   | 14  |
> | 8   | 8   |
> | 9   | 9   |
> +-+-+
> 13 rows selected (0.208 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 
> disallow partial) from t2 order by a2;
> Error: PARSE ERROR: From line 1, column 53 to line 1, column 68: Cannot use 
> DISALLOW PARTIAL with window based on RANGE
> [Error Id: 984c4b81-9eb0-401d-b36a-9580640b4a78 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

