[jira] [Commented] (DRILL-3408) CTAS partition by columns[i] from csv fails

2015-06-26 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604031#comment-14604031
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3408:
--

[~jni] It is a little different from your comment. Can you help clarify whether 
it is expected?

If yes, does that mean the arguments in PARTITION BY are actually pointing at 
the elements in the select list? Besides, it is sometimes necessary to give an 
alias (e.g., the use cases in this issue and DRILL-3411).
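
For instance, a hedged sketch of the aliasing workaround (the column names 
c1-c3 are illustrative, adapted from the comment elsewhere in this thread):

{code:sql}
create table t1 (c1, c2, c3)
partition by (c1)
as
select columns[0] c1, columns[1] c2, columns[2] c3
from `t.csv`;
{code}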

> CTAS partition by columns[i] from csv fails
> ---
>
> Key: DRILL-3408
> URL: https://issues.apache.org/jira/browse/DRILL-3408
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Jinfeng Ni
>
> CTAS does not work when users try to partition by index on complex types.
> For example,
> create table `z` partition by columns[0] as select columns[0], columns[1], 
> columns[2] from `t.csv`;
> Will result in 
> Error: PARSE ERROR: Encountered "columns" at line 1, column 31.
> The query parser does not support it; we need to do it from here.





[jira] [Resolved] (DRILL-3411) CTAS Partition by column in deeper layer fails

2015-06-26 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-3411.
--
Resolution: Invalid

If an alias is given in the select list, it works!

create table t1 (c1, c2) partition by (c2) as 
select t.id {color:red}c1{color}, t.batters.batter {color:red}c2{color} from 
`t.json` t;

> CTAS Partition by column in deeper layer fails
> --
>
> Key: DRILL-3411
> URL: https://issues.apache.org/jira/browse/DRILL-3411
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Jinfeng Ni
>
> Simple data such as 
> {code}
> {
> "id": "0001",
> "type": "donut1",
> "batters":
> {
>   "batter": 1
> }
> }
> {
>   "id": "0002",
> "type": "donut2",
> "batters":
> {
>   "batter": 2
> }
> }
> {code}
> I tried to partition by batters.batter: 
> {code}
> create table t1 (c1, c2) partition by (c2) as 
> select t.id, t.batters.batter from `t.json` t;
> {code}
> But got this exception:
> Error: SYSTEM ERROR: IllegalArgumentException: partition col c2 could not be 
> resolved in table's column lists!





[jira] [Commented] (DRILL-3408) CTAS partition by columns[i] from csv fails

2015-06-26 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604028#comment-14604028
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3408:
--

It seems we also need to give aliases in the select list. 

create table t1 ( c1, c2, c3) 
partition by (c1, c2)
as
select columns[0] {color:red}c1{color}, columns[1] {color:red}c2{color}, 
columns[2] {color:red}c3{color}
from `t.csv`;

> CTAS partition by columns[i] from csv fails
> ---
>
> Key: DRILL-3408
> URL: https://issues.apache.org/jira/browse/DRILL-3408
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Jinfeng Ni
>
> CTAS does not work when users try to partition by index on complex types.
> For example,
> create table `z` partition by columns[0] as select columns[0], columns[1], 
> columns[2] from `t.csv`;
> Will result in 
> Error: PARSE ERROR: Encountered "columns" at line 1, column 31.
> The query parser does not support it; we need to do it from here.





[jira] [Created] (DRILL-3411) CTAS Partition by column in deeper layer fails

2015-06-26 Thread Sean Hsuan-Yi Chu (JIRA)
Sean Hsuan-Yi Chu created DRILL-3411:


 Summary: CTAS Partition by column in deeper layer fails
 Key: DRILL-3411
 URL: https://issues.apache.org/jira/browse/DRILL-3411
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Sean Hsuan-Yi Chu
Assignee: Jinfeng Ni


Simple data such as 
{code}
{
"id": "0001",
"type": "donut1",
"batters":
{
"batter": 1
}
}

{
"id": "0002",
"type": "donut2",
"batters":
{
"batter": 2
}
}
{code}

I tried to partition by batters.batter: 
{code}
create table t1 (c1, c2) partition by (c2) as 
select t.id, t.batters.batter from `t.json` t;
{code}

But got this exception:
Error: SYSTEM ERROR: IllegalArgumentException: partition col c2 could not be 
resolved in table's column lists!





[jira] [Updated] (DRILL-3408) CTAS partition by columns[i] from csv fails

2015-06-26 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu updated DRILL-3408:
-
Summary: CTAS partition by columns[i] from csv fails  (was: CTAS partition 
by columns[i] from cdv fails)

> CTAS partition by columns[i] from csv fails
> ---
>
> Key: DRILL-3408
> URL: https://issues.apache.org/jira/browse/DRILL-3408
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Jinfeng Ni
>
> CTAS does not work when users try to partition by index on complex types.
> For example,
> create table `z` partition by columns[0] as select columns[0], columns[1], 
> columns[2] from `t.csv`;
> Will result in 
> Error: PARSE ERROR: Encountered "columns" at line 1, column 31.
> The query parser does not support it; we need to do it from here.





[jira] [Updated] (DRILL-3151) ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3151:
--
Attachment: (was: DRILL-3151.2.patch.txt)

> ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)
> --
>
> Key: DRILL-3151
> URL: https://issues.apache.org/jira/browse/DRILL-3151
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: DRILL-3151.3.patch.txt
>
>
> In Drill's JDBC driver, some ResultSetMetaData methods don't return what JDBC 
> specifies they should return.
> Some cases:
> {{getTableName(int)}}:
> - (JDBC says: {{table name or "" if not applicable}})
> - Drill returns {{null}} (instead of empty string or table name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getSchemaName(int)}}:
> - (JDBC says: {{schema name or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or schema name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getCatalogName(int)}}:
> - (JDBC says: {{the name of the catalog for the table in which the given 
> column appears or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or catalog name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{isSearchable(int)}}:
> - (JDBC says:  {{Indicates whether the designated column can be used in a 
> where clause.}})
> - Drill returns {{false}}.
> {{getColumnClassName(int)}}:
> - (JDBC says: {{the fully-qualified name of the class in the Java programming 
> language that would be used by the method ResultSet.getObject to retrieve the 
> value in the specified column. This is the class name used for custom 
> mapping.}})
> - Drill returns "{{none}}" (instead of the correct class name).
> More cases:
> {{getColumnDisplaySize}}
> - (JDBC says (quite ambiguously): {{the normal maximum number of characters 
> allowed as the width of the designated column}})
> - Drill always returns {{10}}!





[jira] [Updated] (DRILL-3151) ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3151:
--
Attachment: (was: DRILL-3151.1.patch.txt)

> ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)
> --
>
> Key: DRILL-3151
> URL: https://issues.apache.org/jira/browse/DRILL-3151
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: DRILL-3151.2.patch.txt, DRILL-3151.3.patch.txt
>
>
> In Drill's JDBC driver, some ResultSetMetaData methods don't return what JDBC 
> specifies they should return.
> Some cases:
> {{getTableName(int)}}:
> - (JDBC says: {{table name or "" if not applicable}})
> - Drill returns {{null}} (instead of empty string or table name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getSchemaName(int)}}:
> - (JDBC says: {{schema name or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or schema name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getCatalogName(int)}}:
> - (JDBC says: {{the name of the catalog for the table in which the given 
> column appears or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or catalog name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{isSearchable(int)}}:
> - (JDBC says:  {{Indicates whether the designated column can be used in a 
> where clause.}})
> - Drill returns {{false}}.
> {{getColumnClassName(int)}}:
> - (JDBC says: {{the fully-qualified name of the class in the Java programming 
> language that would be used by the method ResultSet.getObject to retrieve the 
> value in the specified column. This is the class name used for custom 
> mapping.}})
> - Drill returns "{{none}}" (instead of the correct class name).
> More cases:
> {{getColumnDisplaySize}}
> - (JDBC says (quite ambiguously): {{the normal maximum number of characters 
> allowed as the width of the designated column}})
> - Drill always returns {{10}}!





[jira] [Updated] (DRILL-3151) ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3151:
--
Attachment: DRILL-3151.3.patch.txt

> ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)
> --
>
> Key: DRILL-3151
> URL: https://issues.apache.org/jira/browse/DRILL-3151
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: DRILL-3151.2.patch.txt, DRILL-3151.3.patch.txt
>
>
> In Drill's JDBC driver, some ResultSetMetaData methods don't return what JDBC 
> specifies they should return.
> Some cases:
> {{getTableName(int)}}:
> - (JDBC says: {{table name or "" if not applicable}})
> - Drill returns {{null}} (instead of empty string or table name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getSchemaName(int)}}:
> - (JDBC says: {{schema name or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or schema name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getCatalogName(int)}}:
> - (JDBC says: {{the name of the catalog for the table in which the given 
> column appears or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or catalog name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{isSearchable(int)}}:
> - (JDBC says:  {{Indicates whether the designated column can be used in a 
> where clause.}})
> - Drill returns {{false}}.
> {{getColumnClassName(int)}}:
> - (JDBC says: {{the fully-qualified name of the class in the Java programming 
> language that would be used by the method ResultSet.getObject to retrieve the 
> value in the specified column. This is the class name used for custom 
> mapping.}})
> - Drill returns "{{none}}" (instead of the correct class name).
> More cases:
> {{getColumnDisplaySize}}
> - (JDBC says (quite ambiguously): {{the normal maximum number of characters 
> allowed as the width of the designated column}})
> - Drill always returns {{10}}!





[jira] [Updated] (DRILL-1332) Statistics functions - regr_sxx(X, Y) regr_sxy(X, Y) regr_syy(X, Y)

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-1332:
--
Fix Version/s: (was: 1.1.0)
   Future

> Statistics functions - regr_sxx(X, Y) regr_sxy(X, Y) regr_syy(X, Y)
> ---
>
> Key: DRILL-1332
> URL: https://issues.apache.org/jira/browse/DRILL-1332
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Reporter: Yash Sharma
>Assignee: Jinfeng Ni
>Priority: Minor
> Fix For: Future
>
> Attachments: DRILL-1332.patch
>
>






[jira] [Updated] (DRILL-1331) Aggregate Statistics function - regr_avgx(X, Y) regr_avgy(X, Y)

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-1331:
--
Fix Version/s: (was: 1.1.0)
   Future

> Aggregate Statistics function - regr_avgx(X, Y) regr_avgy(X, Y)
> ---
>
> Key: DRILL-1331
> URL: https://issues.apache.org/jira/browse/DRILL-1331
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Reporter: Yash Sharma
>Assignee: Jinfeng Ni
>Priority: Minor
> Fix For: Future
>
> Attachments: DRILL-1331.patch, DRILL-1331.patch
>
>






[jira] [Commented] (DRILL-3307) Query with window function runs out of memory

2015-06-26 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603934#comment-14603934
 ] 

Deneche A. Hakim commented on DRILL-3307:
-

all unit tests are passing along with functional and tpch100

> Query with window function runs out of memory
> -
>
> Key: DRILL-3307
> URL: https://issues.apache.org/jira/browse/DRILL-3307
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
> Environment: Data set: TPC-DS SF 100 Parquet
> Number of Nodes: 4
>Reporter: Abhishek Girish
>Assignee: Steven Phillips
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: DRILL-3307.1.patch.txt, drillbit.log.txt
>
>
> Query with window function runs out of memory:
> {code:sql}
>  SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss ORDER BY 1 LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or 
> more nodes ran out of memory while executing the query.
> Fragment 3:0
> [Error Id: 9af19064-9175-46a4-b557-714d1c77cd76 on abhi6.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Plan:
> {code}
> 00-00Screen : rowType = RecordType(ANY TotalSpend): rowcount = 
> 2.87997024E8, cumulative cost = {4.3487550824E9 rows, 5.7539970079068695E10 
> cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 memory}, id = 142297
> 00-01  SelectionVectorRemover : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.31995538E9 rows, 
> 5.751117037666869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142296
> 00-02Limit(fetch=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.031958356E9 rows, 
> 5.722317335266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142295
> 00-03  SingleMergeExchange(sort0=[0 ASC]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {4.031958336E9 rows, 
> 5.722317327266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142294
> 01-01SelectionVectorRemover : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.743961312E9 rows, 
> 5.261522088866869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142293
> 01-02  TopN(limit=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {3.455964288E9 rows, 
> 5.232722386466869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142292
> 01-03Project(TotalSpend=[$0]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.167967264E9 rows, 
> 4.734841414759049E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142291
> 01-04  HashToRandomExchange(dist0=[[$0]]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {3.167967264E9 rows, 4.734841414759049E10 
> cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 memory}, id = 142290
> 02-01UnorderedMuxExchange : rowType = RecordType(ANY 
> TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.87997024E8, 
> cumulative cost = {2.87997024E9 rows, 4.274046176359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142289
> 03-01  Project(TotalSpend=[$0], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {2.591973216E9 rows, 4.245246473959049E10 
> cpu, 0.0 io, 3.538907430912E12 network, 4.607952384E9 memory}, id = 142288
> 03-02Project(TotalSpend=[CASE(>($2, 0), CAST($3):ANY, 
> null)]) : rowType = RecordType(ANY TotalSpend): rowcount = 2.87997024E8, 
> cumulative cost = {2.303976192E9 rows, 4.130047664359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142287
> 03-03  Window(wind

[jira] [Commented] (DRILL-3378) Average over window on a view returns wrong results

2015-06-26 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603921#comment-14603921
 ] 

Aman Sinha commented on DRILL-3378:
---

+1.

> Average over window on a view returns wrong results
> ---
>
> Key: DRILL-3378
> URL: https://issues.apache.org/jira/browse/DRILL-3378
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Aman Sinha
>Priority: Critical
>  Labels: window_function
> Fix For: 1.1.0
>
> Attachments: DRILL-3378.patch
>
>
> We see a loss of precision for a window query over a view.
> Average aggregate query over parquet input.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(col_int) OVER() average FROM 
> `forViewCrn.parquet`;
> ++
> |  average   |
> ++
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> ++
> 30 rows selected (0.121 seconds)
> {code}
> The same query over a view that was created on the above parquet data. Note 
> that in this case we lose the precision after the decimal point, which is 
> incorrect.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(col_int) OVER() average FROM 
> vwOnParq_wCst;
> +--+
> | average  |
> +--+
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> +--+
> 30 rows selected (0.165 seconds)
> {code}
> Aggregate AVG over original parquet file, with cast to INT.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(cast(col_int as INT)) OVER() average 
> FROM `forViewCrn.parquet`;
> +--+
> | average  |
> +--+
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> +--+
> 30 rows selected (0.133 seconds)
> {code}
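
As an aside, a hedged workaround sketch (not part of the original report): 
casting the column to DOUBLE before averaging should restore the fractional 
result over the view:

{code:sql}
SELECT AVG(CAST(col_int AS DOUBLE)) OVER() average FROM vwOnParq_wCst;
{code}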





[jira] [Updated] (DRILL-3378) Average over window on a view returns wrong results

2015-06-26 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-3378:
--
Assignee: Mehant Baid  (was: Aman Sinha)

> Average over window on a view returns wrong results
> ---
>
> Key: DRILL-3378
> URL: https://issues.apache.org/jira/browse/DRILL-3378
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Mehant Baid
>Priority: Critical
>  Labels: window_function
> Fix For: 1.1.0
>
> Attachments: DRILL-3378.patch
>
>
> We see a loss of precision for a window query over a view.
> Average aggregate query over parquet input.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(col_int) OVER() average FROM 
> `forViewCrn.parquet`;
> ++
> |  average   |
> ++
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> ++
> 30 rows selected (0.121 seconds)
> {code}
> The same query over a view that was created on the above parquet data. Note 
> that in this case we lose the precision after the decimal point, which is 
> incorrect.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(col_int) OVER() average FROM 
> vwOnParq_wCst;
> +--+
> | average  |
> +--+
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> +--+
> 30 rows selected (0.165 seconds)
> {code}
> Aggregate AVG over original parquet file, with cast to INT.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(cast(col_int as INT)) OVER() average 
> FROM `forViewCrn.parquet`;
> +--+
> | average  |
> +--+
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> +--+
> 30 rows selected (0.133 seconds)
> {code}





[jira] [Commented] (DRILL-3410) Partition Pruning : We are doing a prune when we shouldn't

2015-06-26 Thread Steven Phillips (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603909#comment-14603909
 ] 

Steven Phillips commented on DRILL-3410:


This appears to be due to the fact that the FindPartitionConditions class, 
which is the code that walks the expression tree and determines whether pruning 
is valid, assumes that the "Binary" operators "OR" and "AND" have exactly two 
arguments. But you can see from the expression in the plan:

{code}
OR(AND(=($1, 1993), >(ITEM($2, 0), 29600)), =($1, 1994), >(ITEM($2, 0), 29700))
{code}

that the expression was rewritten into a single OR operator with 3 arguments.

Rewriting the expression with true binary operators seems to fix the problem. I 
will have a patch available shortly.
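
For illustration, a minimal, hedged sketch of such a rewrite (the Expr/Leaf/Call 
classes are hypothetical stand-ins, not Drill's actual RexNode code): an n-ary 
OR/AND is left-folded into nested two-argument calls, so a visitor that assumes 
binary operators stays correct:

{code}
import java.util.ArrayList;
import java.util.List;

/** Hypothetical boolean-expression nodes; not Drill's actual classes. */
abstract class Expr {}

class Leaf extends Expr {
  final String text;
  Leaf(String text) { this.text = text; }
  @Override public String toString() { return text; }
}

class Call extends Expr {
  final String op;        // "OR" or "AND"
  final List<Expr> args;  // may hold more than two arguments
  Call(String op, List<Expr> args) { this.op = op; this.args = args; }
  @Override public String toString() { return op + args; }
}

class RewriteBinary {
  /** Left-folds n-ary OR/AND calls into nested two-argument calls. */
  static Expr toBinary(Expr e) {
    if (!(e instanceof Call)) return e;
    Call c = (Call) e;
    List<Expr> args = new ArrayList<>();
    for (Expr a : c.args) args.add(toBinary(a));  // rewrite children first
    Expr result = args.get(0);
    for (int i = 1; i < args.size(); i++) {
      result = new Call(c.op, List.of(result, args.get(i)));
    }
    return result;
  }

  public static void main(String[] args) {
    // The 3-argument OR from the plan above.
    Expr e = new Call("OR", List.of(
        new Call("AND", List.of(new Leaf("=($1, 1993)"),
                                new Leaf(">(ITEM($2, 0), 29600)"))),
        new Leaf("=($1, 1994)"),
        new Leaf(">(ITEM($2, 0), 29700)")));
    // Prints OR[OR[AND[...], =($1, 1994)], >(ITEM($2, 0), 29700)]
    System.out.println(RewriteBinary.toBinary(e));
  }
}
{code}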

> Partition Pruning : We are doing a prune when we shouldn't
> --
>
> Key: DRILL-3410
> URL: https://issues.apache.org/jira/browse/DRILL-3410
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Rahul Challapalli
>Assignee: Steven Phillips
>Priority: Critical
> Fix For: 1.1.0
>
>
> git.commit.id.abbrev=60bc945
> The plan below does not look right. It should scan all the files, based on 
> the filters in the query. Also, Hive returned more rows than Drill did.
> {code}
> explain plan for select * from `existing_partition_pruning/lineitempart` 
> where (dir0=1993 and columns[0] >29600) or (dir0=1994 or columns[0]>29700);
> | 00-00Screen
> 00-01  Project(*=[$0])
> 00-02Project(T70¦¦*=[$0])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[OR(AND(=($1, 1993), >(ITEM($2, 0), 
> 29600)), =($1, 1994), >(ITEM($2, 0), 29700))])
> 00-05  Project(T70¦¦*=[$0], dir0=[$1], columns=[$2])
> 00-06Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath 
> [path=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart/0_0_3.parquet],
>  ReadEntryWithPath 
> [path=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart/0_0_4.parquet]],
>  
> selectionRoot=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart,
>  numFiles=2, columns=[`*`]]])
>  |
> {code}
> I attached the data set used. Let me know if you need anything more.





[jira] [Updated] (DRILL-3151) ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3151:
--
Attachment: DRILL-3151.2.patch.txt

> ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)
> --
>
> Key: DRILL-3151
> URL: https://issues.apache.org/jira/browse/DRILL-3151
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: DRILL-3151.1.patch.txt, DRILL-3151.2.patch.txt
>
>
> In Drill's JDBC driver, some ResultSetMetaData methods don't return what JDBC 
> specifies they should return.
> Some cases:
> {{getTableName(int)}}:
> - (JDBC says: {{table name or "" if not applicable}})
> - Drill returns {{null}} (instead of empty string or table name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getSchemaName(int)}}:
> - (JDBC says: {{schema name or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or schema name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getCatalogName(int)}}:
> - (JDBC says: {{the name of the catalog for the table in which the given 
> column appears or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or catalog name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{isSearchable(int)}}:
> - (JDBC says:  {{Indicates whether the designated column can be used in a 
> where clause.}})
> - Drill returns {{false}}.
> {{getColumnClassName(int)}}:
> - (JDBC says: {{the fully-qualified name of the class in the Java programming 
> language that would be used by the method ResultSet.getObject to retrieve the 
> value in the specified column. This is the class name used for custom 
> mapping.}})
> - Drill returns "{{none}}" (instead of the correct class name).
> More cases:
> {{getColumnDisplaySize}}
> - (JDBC says (quite ambiguously): {{the normal maximum number of characters 
> allowed as the width of the designated column}})
> - Drill always returns {{10}}!





[jira] [Commented] (DRILL-3151) ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603874#comment-14603874
 ] 

Daniel Barclay (Drill) commented on DRILL-3151:
---

Patch commit message:

DRILL-3151:  Fix many ResultSetMetaData method return values.

Added ~unit test for ResultSetMetaData implementation.

Made getObject return classes available to implementation of getColumnClassName:
- Added SqlAccessor.getObjectClass() (to put that metadata right next to code 
to which it corresponds rather than in far-away parallel code).
- Added similar AvaticaDrillSqlAccessor.getObjectClass().
- Changed DrillAccessorList.accessors from Accessor[] to 
AvaticaDrillSqlAccessor[] for better access to JDBC getObject return class.
- Extracted return classes from accessors to pass to updateColumnMetaData.

Reworked some data type mapping and utilities:
- Added Types.getSqlTypeName(...).
- Renamed Types.getJdbcType(...) to getJdbcTypeCode(...).
- Replaced Types.isUnSigned with isJdbcSignedType.
- Fixed various bogus RPC-type XXX -> java.sql.Types.SMALLINT mappings.
- Removed DrillColumnMetaDataList.getJdbcTypeName.
- Moved getAvaticaType up (for bottom-up order).
- Revised DrillColumnMetaDataList.getAvaticaType(...).

MAIN:
- Updated updateColumnMetaData(...) to change many calculations of metadata 
input to ColumnMetaData construction.  [DrillColumnMetaDataList]

Updated other metadata tests per changes.
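
For reference, a minimal sketch of exercising these methods through the 
standard java.sql API (the embedded connection URL is an assumption; adjust 
for a real cluster):

{code}
import java.sql.*;

public class MetaDataCheck {
  public static void main(String[] args) throws SQLException {
    // Assumed embedded-Drillbit URL.
    try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=local");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(
             "SELECT * FROM INFORMATION_SCHEMA.CATALOGS")) {
      ResultSetMetaData md = rs.getMetaData();
      for (int i = 1; i <= md.getColumnCount(); i++) {
        // Per JDBC, the name methods return the name or "" when not
        // applicable -- never null or a dummy such as "--UNKNOWN--".
        System.out.printf("col %d: table=%s schema=%s catalog=%s class=%s%n",
            i, md.getTableName(i), md.getSchemaName(i),
            md.getCatalogName(i), md.getColumnClassName(i));
      }
    }
  }
}
{code}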


> ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)
> --
>
> Key: DRILL-3151
> URL: https://issues.apache.org/jira/browse/DRILL-3151
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: DRILL-3151.1.patch.txt
>
>
> In Drill's JDBC driver, some ResultSetMetaData methods don't return what JDBC 
> specifies they should return.
> Some cases:
> {{getTableName(int)}}:
> - (JDBC says: {{table name or "" if not applicable}})
> - Drill returns {{null}} (instead of empty string or table name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getSchemaName(int)}}:
> - (JDBC says: {{schema name or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or schema name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getCatalogName(int)}}:
> - (JDBC says: {{the name of the catalog for the table in which the given 
> column appears or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or catalog name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{isSearchable(int)}}:
> - (JDBC says:  {{Indicates whether the designated column can be used in a 
> where clause.}})
> - Drill returns {{false}}.
> {{getColumnClassName(int)}}:
> - (JDBC says: {{the fully-qualified name of the class in the Java programming 
> language that would be used by the method ResultSet.getObject to retrieve the 
> value in the specified column. This is the class name used for custom 
> mapping.}})
> - Drill returns "{{none}}" (instead of the correct class name).
> More cases:
> {{getColumnDisplaySize}}
> - (JDBC says (quite ambiguously): {{the normal maximum number of characters 
> allowed as the width of the designated column}})
> - Drill always returns {{10}}!





[jira] [Updated] (DRILL-3151) ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3151:
--
Attachment: DRILL-3151.1.patch.txt

> ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)
> --
>
> Key: DRILL-3151
> URL: https://issues.apache.org/jira/browse/DRILL-3151
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: DRILL-3151.1.patch.txt
>
>
> In Drill's JDBC driver, some ResultSetMetaData methods don't return what JDBC 
> specifies they should return.
> Some cases:
> {{getTableName(int)}}:
> - (JDBC says: {{table name or "" if not applicable}})
> - Drill returns {{null}} (instead of empty string or table name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getSchemaName(int)}}:
> - (JDBC says: {{schema name or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or schema name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getCatalogName(int)}}:
> - (JDBC says: {{the name of the catalog for the table in which the given 
> column appears or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or catalog name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{isSearchable(int)}}:
> - (JDBC says:  {{Indicates whether the designated column can be used in a 
> where clause.}})
> - Drill returns {{false}}.
> {{getColumnClassName(int)}}:
> - (JDBC says: {{the fully-qualified name of the class in the Java programming 
> language that would be used by the method ResultSet.getObject to retrieve the 
> value in the specified column. This is the class name used for custom 
> mapping.}})
> - Drill returns "{{none}}" (instead of the correct class name).
> More cases:
> {{getColumnDisplaySize}}
> - (JDBC says (quite ambiguously): {{the normal maximum number of characters 
> allowed as the width of the designated column}})
> - Drill always returns {{10}}!





[jira] [Updated] (DRILL-3307) Query with window function runs out of memory

2015-06-26 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3307:

Assignee: Steven Phillips  (was: Deneche A. Hakim)

> Query with window function runs out of memory
> -
>
> Key: DRILL-3307
> URL: https://issues.apache.org/jira/browse/DRILL-3307
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
> Environment: Data set: TPC-DS SF 100 Parquet
> Number of Nodes: 4
>Reporter: Abhishek Girish
>Assignee: Steven Phillips
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: DRILL-3307.1.patch.txt, drillbit.log.txt
>
>
> Query with window function runs out of memory:
> {code:sql}
>  SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss ORDER BY 1 LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or 
> more nodes ran out of memory while executing the query.
> Fragment 3:0
> [Error Id: 9af19064-9175-46a4-b557-714d1c77cd76 on abhi6.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Plan:
> {code}
> 00-00Screen : rowType = RecordType(ANY TotalSpend): rowcount = 
> 2.87997024E8, cumulative cost = {4.3487550824E9 rows, 5.7539970079068695E10 
> cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 memory}, id = 142297
> 00-01  SelectionVectorRemover : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.31995538E9 rows, 
> 5.751117037666869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142296
> 00-02Limit(fetch=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.031958356E9 rows, 
> 5.722317335266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142295
> 00-03  SingleMergeExchange(sort0=[0 ASC]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {4.031958336E9 rows, 
> 5.722317327266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142294
> 01-01SelectionVectorRemover : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.743961312E9 rows, 
> 5.261522088866869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142293
> 01-02  TopN(limit=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {3.455964288E9 rows, 
> 5.232722386466869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142292
> 01-03Project(TotalSpend=[$0]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.167967264E9 rows, 
> 4.734841414759049E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142291
> 01-04  HashToRandomExchange(dist0=[[$0]]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {3.167967264E9 rows, 4.734841414759049E10 
> cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 memory}, id = 142290
> 02-01UnorderedMuxExchange : rowType = RecordType(ANY 
> TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.87997024E8, 
> cumulative cost = {2.87997024E9 rows, 4.274046176359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142289
> 03-01  Project(TotalSpend=[$0], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {2.591973216E9 rows, 4.245246473959049E10 
> cpu, 0.0 io, 3.538907430912E12 network, 4.607952384E9 memory}, id = 142288
> 03-02Project(TotalSpend=[CASE(>($2, 0), CAST($3):ANY, 
> null)]) : rowType = RecordType(ANY TotalSpend): rowcount = 2.87997024E8, 
> cumulative cost = {2.303976192E9 rows, 4.130047664359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142287
> 03-03  Window(window#0=[window(partition {1} order by 
> [] range between UNBOUNDE

[jira] [Updated] (DRILL-3307) Query with window function runs out of memory

2015-06-26 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3307:

Attachment: DRILL-3307.1.patch.txt

> Query with window function runs out of memory
> -
>
> Key: DRILL-3307
> URL: https://issues.apache.org/jira/browse/DRILL-3307
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
> Environment: Data set: TPC-DS SF 100 Parquet
> Number of Nodes: 4
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: DRILL-3307.1.patch.txt, drillbit.log.txt
>
>
> Query with window function runs out of memory:
> {code:sql}
>  SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss ORDER BY 1 LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or 
> more nodes ran out of memory while executing the query.
> Fragment 3:0
> [Error Id: 9af19064-9175-46a4-b557-714d1c77cd76 on abhi6.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Plan:
> {code}
> 00-00Screen : rowType = RecordType(ANY TotalSpend): rowcount = 
> 2.87997024E8, cumulative cost = {4.3487550824E9 rows, 5.7539970079068695E10 
> cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 memory}, id = 142297
> 00-01  SelectionVectorRemover : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.31995538E9 rows, 
> 5.751117037666869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142296
> 00-02Limit(fetch=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.031958356E9 rows, 
> 5.722317335266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142295
> 00-03  SingleMergeExchange(sort0=[0 ASC]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {4.031958336E9 rows, 
> 5.722317327266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142294
> 01-01SelectionVectorRemover : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.743961312E9 rows, 
> 5.261522088866869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142293
> 01-02  TopN(limit=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {3.455964288E9 rows, 
> 5.232722386466869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142292
> 01-03Project(TotalSpend=[$0]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.167967264E9 rows, 
> 4.734841414759049E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142291
> 01-04  HashToRandomExchange(dist0=[[$0]]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {3.167967264E9 rows, 4.734841414759049E10 
> cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 memory}, id = 142290
> 02-01UnorderedMuxExchange : rowType = RecordType(ANY 
> TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.87997024E8, 
> cumulative cost = {2.87997024E9 rows, 4.274046176359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142289
> 03-01  Project(TotalSpend=[$0], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {2.591973216E9 rows, 4.245246473959049E10 
> cpu, 0.0 io, 3.538907430912E12 network, 4.607952384E9 memory}, id = 142288
> 03-02Project(TotalSpend=[CASE(>($2, 0), CAST($3):ANY, 
> null)]) : rowType = RecordType(ANY TotalSpend): rowcount = 2.87997024E8, 
> cumulative cost = {2.303976192E9 rows, 4.130047664359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142287
> 03-03  Window(window#0=[window(partition {1} order by 
> [] range between UNBOUNDED PRECEDING and

[jira] [Updated] (DRILL-3378) Average over window on a view returns wrong results

2015-06-26 Thread Mehant Baid (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3378:
---
Assignee: Aman Sinha  (was: Mehant Baid)

> Average over window on a view returns wrong results
> ---
>
> Key: DRILL-3378
> URL: https://issues.apache.org/jira/browse/DRILL-3378
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Aman Sinha
>Priority: Critical
>  Labels: window_function
> Fix For: 1.1.0
>
> Attachments: DRILL-3378.patch
>
>
> We see a loss of precision for a window query over a view.
> Average aggregate query over parquet input.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(col_int) OVER() average FROM 
> `forViewCrn.parquet`;
> ++
> |  average   |
> ++
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> ++
> 30 rows selected (0.121 seconds)
> {code}
> The same query over a view that was created on the above parquet data. Note 
> that in this case we lose the precision after the decimal point, which is 
> incorrect.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(col_int) OVER() average FROM 
> vwOnParq_wCst;
> +--+
> | average  |
> +--+
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> +--+
> 30 rows selected (0.165 seconds)
> {code}
> Aggregate AVG over original parquet file, with cast to INT.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(cast(col_int as INT)) OVER() average 
> FROM `forViewCrn.parquet`;
> +--+
> | average  |
> +--+
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> +--+
> 30 rows selected (0.133 seconds)
> {code}





[jira] [Updated] (DRILL-3378) Average over window on a view returns wrong results

2015-06-26 Thread Mehant Baid (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3378:
---
Attachment: DRILL-3378.patch

[~amansinha100], please review.

> Average over window on a view returns wrong results
> ---
>
> Key: DRILL-3378
> URL: https://issues.apache.org/jira/browse/DRILL-3378
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Mehant Baid
>Priority: Critical
>  Labels: window_function
> Fix For: 1.1.0
>
> Attachments: DRILL-3378.patch
>
>
> We see a loss of precision for a window query over a view.
> Average aggregate query over parquet input.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(col_int) OVER() average FROM 
> `forViewCrn.parquet`;
> ++
> |  average   |
> ++
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> | 3.033  |
> ++
> 30 rows selected (0.121 seconds)
> {code}
> The same query over a view that was created on the above parquet data. Note 
> that in this case we lose the precision after the decimal point, which is 
> incorrect.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(col_int) OVER() average FROM 
> vwOnParq_wCst;
> +--+
> | average  |
> +--+
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> +--+
> 30 rows selected (0.165 seconds)
> {code}
> Aggregate AVG over original parquet file, with cast to INT.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT AVG(cast(col_int as INT)) OVER() average 
> FROM `forViewCrn.parquet`;
> +--+
> | average  |
> +--+
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> | 3|
> +--+
> 30 rows selected (0.133 seconds)
> {code}





[jira] [Updated] (DRILL-3307) Query with window function runs out of memory

2015-06-26 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3307:

Fix Version/s: (was: 1.2.0)
   1.1.0

> Query with window function runs out of memory
> -
>
> Key: DRILL-3307
> URL: https://issues.apache.org/jira/browse/DRILL-3307
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
> Environment: Data set: TPC-DS SF 100 Parquet
> Number of Nodes: 4
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: drillbit.log.txt
>
>
> Query with window function runs out of memory:
> {code:sql}
>  SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss ORDER BY 1 LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or 
> more nodes ran out of memory while executing the query.
> Fragment 3:0
> [Error Id: 9af19064-9175-46a4-b557-714d1c77cd76 on abhi6.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Plan:
> {code}
> 00-00Screen : rowType = RecordType(ANY TotalSpend): rowcount = 
> 2.87997024E8, cumulative cost = {4.3487550824E9 rows, 5.7539970079068695E10 
> cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 memory}, id = 142297
> 00-01  SelectionVectorRemover : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.31995538E9 rows, 
> 5.751117037666869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142296
> 00-02Limit(fetch=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.031958356E9 rows, 
> 5.722317335266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142295
> 00-03  SingleMergeExchange(sort0=[0 ASC]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {4.031958336E9 rows, 
> 5.722317327266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142294
> 01-01SelectionVectorRemover : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.743961312E9 rows, 
> 5.261522088866869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142293
> 01-02  TopN(limit=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {3.455964288E9 rows, 
> 5.232722386466869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142292
> 01-03Project(TotalSpend=[$0]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.167967264E9 rows, 
> 4.734841414759049E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142291
> 01-04  HashToRandomExchange(dist0=[[$0]]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {3.167967264E9 rows, 4.734841414759049E10 
> cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 memory}, id = 142290
> 02-01UnorderedMuxExchange : rowType = RecordType(ANY 
> TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.87997024E8, 
> cumulative cost = {2.87997024E9 rows, 4.274046176359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142289
> 03-01  Project(TotalSpend=[$0], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {2.591973216E9 rows, 4.245246473959049E10 
> cpu, 0.0 io, 3.538907430912E12 network, 4.607952384E9 memory}, id = 142288
> 03-02Project(TotalSpend=[CASE(>($2, 0), CAST($3):ANY, 
> null)]) : rowType = RecordType(ANY TotalSpend): rowcount = 2.87997024E8, 
> cumulative cost = {2.303976192E9 rows, 4.130047664359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142287
> 03-03  Window(window#0=[window(partition {1} order by 
> [] range between UNBOUNDED PRECEDING and U

[jira] [Commented] (DRILL-3382) CTAS with order by clause fails with IOOB exception

2015-06-26 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603839#comment-14603839
 ] 

Aman Sinha commented on DRILL-3382:
---

+1.

> CTAS with order by clause fails with IOOB exception
> ---
>
> Key: DRILL-3382
> URL: https://issues.apache.org/jira/browse/DRILL-3382
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Parth Chandra
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch, 
> 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch
>
>
> The query :
> {panel}
> create table `lineitem__5`  as select l_suppkey, l_partkey, l_linenumber from 
> cp.`tpch/lineitem.parquet` l order by l_linenumber;
> {panel}
> fails with an IOOB exception
> Trace in log - 
> {panel}
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index (2) must be less than size (2)
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:737)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:839)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:781)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:783)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:892) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: Error while 
> applying rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying 
> rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> at org.apache.calcite.util.Util.newInternal(Util.java:790) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:795)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:316) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:260)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.convertToPrel(CreateTableHandler.java:120)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:99)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:903) 
> [drill-java-exec-1.1.0-SNAPSHOT.

[jira] [Updated] (DRILL-3382) CTAS with order by clause fails with IOOB exception

2015-06-26 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-3382:
--
Assignee: Jinfeng Ni  (was: Aman Sinha)

> CTAS with order by clause fails with IOOB exception
> ---
>
> Key: DRILL-3382
> URL: https://issues.apache.org/jira/browse/DRILL-3382
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Parth Chandra
>Assignee: Jinfeng Ni
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch, 
> 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch
>
>
> The query :
> {panel}
> create table `lineitem__5`  as select l_suppkey, l_partkey, l_linenumber from 
> cp.`tpch/lineitem.parquet` l order by l_linenumber;
> {panel}
> fails with an IOOB exception
> Trace in log - 
> {panel}
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index (2) must be less than size (2)
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:737)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:839)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:781)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:783)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:892) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: Error while 
> applying rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying 
> rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> at org.apache.calcite.util.Util.newInternal(Util.java:790) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:795)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:316) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:260)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.convertToPrel(CreateTableHandler.java:120)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:99)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:903) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]

[jira] [Updated] (DRILL-3410) Partition Pruning : We are doing a prune when we shouldn't

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3410:
--
Assignee: Steven Phillips  (was: Jinfeng Ni)

> Partition Pruning : We are doing a prune when we shouldn't
> --
>
> Key: DRILL-3410
> URL: https://issues.apache.org/jira/browse/DRILL-3410
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Rahul Challapalli
>Assignee: Steven Phillips
>Priority: Critical
> Fix For: 1.1.0
>
>
> git.commit.id.abbrev=60bc945
> The plan below does not look right. It should scan all the files, given the
> filters in the query. Also, Hive returned more rows than Drill.
> {code}
> explain plan for select * from `existing_partition_pruning/lineitempart` 
> where (dir0=1993 and columns[0] >29600) or (dir0=1994 or columns[0]>29700);
> | 00-00Screen
> 00-01  Project(*=[$0])
> 00-02Project(T70¦¦*=[$0])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[OR(AND(=($1, 1993), >(ITEM($2, 0), 
> 29600)), =($1, 1994), >(ITEM($2, 0), 29700))])
> 00-05  Project(T70¦¦*=[$0], dir0=[$1], columns=[$2])
> 00-06Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath 
> [path=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart/0_0_3.parquet],
>  ReadEntryWithPath 
> [path=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart/0_0_4.parquet]],
>  
> selectionRoot=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart,
>  numFiles=2, columns=[`*`]]])
>  |
> {code}
> I attached the data set used. Let me know if you need anything more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3410) Partition Pruning : We are doing a prune when we shouldn't

2015-06-26 Thread Rahul Challapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603834#comment-14603834
 ] 

Rahul Challapalli commented on DRILL-3410:
--

I am not attaching the data as it is larger than 10MB.

> Partition Pruning : We are doing a prune when we shouldn't
> --
>
> Key: DRILL-3410
> URL: https://issues.apache.org/jira/browse/DRILL-3410
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Rahul Challapalli
>Assignee: Jinfeng Ni
>Priority: Critical
> Fix For: 1.1.0
>
>
> git.commit.id.abbrev=60bc945
> The plan below does not look right. It should scan all the files, given the
> filters in the query. Also, Hive returned more rows than Drill.
> {code}
> explain plan for select * from `existing_partition_pruning/lineitempart` 
> where (dir0=1993 and columns[0] >29600) or (dir0=1994 or columns[0]>29700);
> | 00-00Screen
> 00-01  Project(*=[$0])
> 00-02Project(T70¦¦*=[$0])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[OR(AND(=($1, 1993), >(ITEM($2, 0), 
> 29600)), =($1, 1994), >(ITEM($2, 0), 29700))])
> 00-05  Project(T70¦¦*=[$0], dir0=[$1], columns=[$2])
> 00-06Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath 
> [path=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart/0_0_3.parquet],
>  ReadEntryWithPath 
> [path=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart/0_0_4.parquet]],
>  
> selectionRoot=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart,
>  numFiles=2, columns=[`*`]]])
>  |
> {code}
> I attached the data set used. Let me know if you need anything more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3410) Partition Pruning : We are doing a prune when we shouldn't

2015-06-26 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-3410:


 Summary: Partition Pruning : We are doing a prune when we shouldn't
 Key: DRILL-3410
 URL: https://issues.apache.org/jira/browse/DRILL-3410
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Rahul Challapalli
Assignee: Jinfeng Ni
Priority: Critical
 Fix For: 1.1.0


git.commit.id.abbrev=60bc945

The plan below does not look right. It should scan all the files: the disjunct 
columns[0] > 29700 is not constrained by dir0, so no directory can be pruned. 
Also, Hive returned more rows than Drill.
{code}
explain plan for select * from `existing_partition_pruning/lineitempart` where 
(dir0=1993 and columns[0] >29600) or (dir0=1994 or columns[0]>29700);
| 00-00Screen
00-01  Project(*=[$0])
00-02Project(T70¦¦*=[$0])
00-03  SelectionVectorRemover
00-04Filter(condition=[OR(AND(=($1, 1993), >(ITEM($2, 0), 29600)), 
=($1, 1994), >(ITEM($2, 0), 29700))])
00-05  Project(T70¦¦*=[$0], dir0=[$1], columns=[$2])
00-06Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath 
[path=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart/0_0_3.parquet],
 ReadEntryWithPath 
[path=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart/0_0_4.parquet]],
 
selectionRoot=/drill/testdata/ctas_auto_partition/existing_partition_pruning/lineitempart,
 numFiles=2, columns=[`*`]]])
 |
{code}

I attached the data set used. Let me know if you need anything more.
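
To make the bug concrete, here is a hedged probe (untested; path and columns 
taken from the query above): any rows it returns live outside the 1993/1994 
directories yet still satisfy the filter via the third disjunct, so pruning 
those directories changes results.

{code}
select count(*)
from `existing_partition_pruning/lineitempart`
where dir0 not in (1993, 1994) and columns[0] > 29700;
{code}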



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3408) CTAS partition by columns[i] from csv fails

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-3408.
---
Resolution: Fixed

The error for your CTAS seems to be expected.

Partitioning columns can only be columns of the created table. Therefore, each 
column name in the PARTITION BY clause must be an unqualified identifier; it 
will not accept t1.a, t1.columns[1], or a[b][c].

If you want to partition by columns[0] and columns[1]:

{code}
create table t1 ( c1, c2, c3) 
partition by (c1, c2)
as
select columns[0], columns[1], columns[2]
from csv_file;
{code}
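
For contrast, a hedged sketch of the forms the rule above rejects (the first 
statement is the one from the report; the second uses illustrative names):

{code}
-- Rejected (parse error): indexing a complex type in PARTITION BY
create table `z` partition by columns[0] as
select columns[0], columns[1], columns[2] from `t.csv`;

-- Rejected: qualified identifiers such as t1.a or t1.columns[1]
create table t2 partition by (t1.a) as
select t1.a, t1.b from some_table t1;
{code}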

> CTAS partition by columns[i] from csv fails
> ---
>
> Key: DRILL-3408
> URL: https://issues.apache.org/jira/browse/DRILL-3408
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Jinfeng Ni
>
> CTAS does not work when users try to partition by index on complex types.
> For example,
> create table `z` partition by columns[0] as select columns[0], columns[1], 
> columns[2] from `t.csv`;
> This will result in 
> Error: PARSE ERROR: Encountered "columns" at line 1, column 31.
> The query parser does not support it; we need to fix it there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3409) Specifying default frame explicitly results in an error

2015-06-26 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-3409:

Labels: window_function  (was: )

> Specifying default frame explicitly results in an error
> ---
>
> Key: DRILL-3409
> URL: https://issues.apache.org/jira/browse/DRILL-3409
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Victoria Markman
>Assignee: Jinfeng Ni
>  Labels: window_function
>
> If I spell out the default frame, I get an error:
> {code}
> 0: jdbc:drill:schema=dfs> select c_bigint, min(c_double) over(partition by 
> c_bigint order by c_date, c_time nulls first range between unbounded 
> preceding and current row) from j9;
> Error: PARSE ERROR: From line 1, column 95 to line 1, column 99: RANGE clause 
> cannot be used with compound ORDER BY clause
> [Error Id: fe955fc0-bc0f-4588-bdc2-24defdc9390c on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> If I don't explicitly specify the "default" frame, as in the example above, the 
> query parses and returns the same result as Postgres:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select c_bigint, min(c_double) 
> over(partition by c_bigint order by c_date, c_time nulls first) from j9;
> 00-00Screen
> 00-01  ProjectAllowDup(c_bigint=[$0], EXPR$1=[$1])
> 00-02Project(c_bigint=[$1], w0$o0=[$5])
> 00-03  Window(window#0=[window(partition {1} order by [3, 4 
> ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs 
> [MIN($2)])])
> 00-04SelectionVectorRemover
> 00-05  Sort(sort0=[$1], sort1=[$3], sort2=[$4], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC-nulls-first])
> 00-06Project(T32¦¦*=[$0], c_bigint=[$1], c_double=[$2], 
> c_date=[$3], c_time=[$4])
> 00-07  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/j9]], 
> selectionRoot=/drill/testdata/subqueries/j9, numFiles=1, columns=[`*`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3409) Specifying default frame explicitly results in an error

2015-06-26 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3409:
---

 Summary: Specifying default frame explicitly results in an error
 Key: DRILL-3409
 URL: https://issues.apache.org/jira/browse/DRILL-3409
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni


If I spell out the default frame, I get an error:
{code}
0: jdbc:drill:schema=dfs> select c_bigint, min(c_double) over(partition by 
c_bigint order by c_date, c_time nulls first range between unbounded preceding 
and current row) from j9;
Error: PARSE ERROR: From line 1, column 95 to line 1, column 99: RANGE clause 
cannot be used with compound ORDER BY clause
[Error Id: fe955fc0-bc0f-4588-bdc2-24defdc9390c on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

If I don't explicitly specify the "default" frame, as in the example above, the 
query parses and returns the same result as Postgres:
{code}
0: jdbc:drill:schema=dfs> explain plan for select c_bigint, min(c_double) 
over(partition by c_bigint order by c_date, c_time nulls first) from j9;
00-00Screen
00-01  ProjectAllowDup(c_bigint=[$0], EXPR$1=[$1])
00-02Project(c_bigint=[$1], w0$o0=[$5])
00-03  Window(window#0=[window(partition {1} order by [3, 4 
ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs 
[MIN($2)])])
00-04SelectionVectorRemover
00-05  Sort(sort0=[$1], sort1=[$3], sort2=[$4], dir0=[ASC], 
dir1=[ASC], dir2=[ASC-nulls-first])
00-06Project(T32¦¦*=[$0], c_bigint=[$1], c_double=[$2], 
c_date=[$3], c_time=[$4])
00-07  Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/j9]], 
selectionRoot=/drill/testdata/subqueries/j9, numFiles=1, columns=[`*`]]])
{code}
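
A hedged note on workarounds: the error text only objects to RANGE with a 
compound ORDER BY, so leaving the default frame implicit (as above) works, and 
an explicit RANGE frame with a single sort key should parse. An untested sketch:

{code}
select c_bigint,
       min(c_double) over(partition by c_bigint
                          order by c_date
                          range between unbounded preceding and current row)
from j9;
{code}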



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3408) CTAS partition by columns[i] from csv fails

2015-06-26 Thread Sean Hsuan-Yi Chu (JIRA)
Sean Hsuan-Yi Chu created DRILL-3408:


 Summary: CTAS partition by columns[i] from csv fails
 Key: DRILL-3408
 URL: https://issues.apache.org/jira/browse/DRILL-3408
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Sean Hsuan-Yi Chu
Assignee: Jinfeng Ni


CTAS does not work when users try to partition by index on complex types.
For example,
create table `z` partition by columns[0] as select columns[0], columns[1], 
columns[2] from `t.csv`;

This will result in 
Error: PARSE ERROR: Encountered "columns" at line 1, column 31.

The query parser does not support it; we need to fix it there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1970) Hive views must not be listed with the show tables command

2015-06-26 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1970:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> Hive views must not be listed with the show tables command
> --
>
> Key: DRILL-1970
> URL: https://issues.apache.org/jira/browse/DRILL-1970
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Reporter: Abhishek Girish
>Assignee: Jason Altekruse
> Fix For: 1.2.0
>
> Attachments: DRILL-1970.1.patch.txt, DRILL-1970.2.patch.txt
>
>
> This is related to DRILL-1969. 
> Until Drill supports querying Hive views, Hive view metadata must not be 
> visible when issuing the "show tables" command. 
> > use hive;
> +-------+----------------------------------+
> |  ok   |             summary              |
> +-------+----------------------------------+
> | true  | Default schema changed to 'hive' |
> +-------+----------------------------------+
> Currently Observed:
> > show tables ;
> +---------------+---------------+
> | TABLE_SCHEMA  |  TABLE_NAME   |
> +---------------+---------------+
> | hive.default  | table1        |
> | hive.default  | table2        |
> | hive.default  | table1_view1  |
> | hive.default  | table2_view1  |
> ...
> +---------------+---------------+
> Expected:
> > show tables ;
> +---------------+---------------+
> | TABLE_SCHEMA  |  TABLE_NAME   |
> +---------------+---------------+
> | hive.default  | table1        |
> | hive.default  | table2        |
> +---------------+---------------+



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2677) Query does not go beyond 4096 lines in small JSON files

2015-06-26 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-2677:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> Query does not go beyond 4096 lines in small JSON files
> ---
>
> Key: DRILL-2677
> URL: https://issues.apache.org/jira/browse/DRILL-2677
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
> Environment: drill 0.8 official build
>Reporter: Alexander Reshetov
>Assignee: Jason Altekruse
> Fix For: 1.2.0
>
> Attachments: dataset_4095_and_1.json, dataset_4096_and_1.json, 
> dataset_sample.json.gz.part-aa, dataset_sample.json.gz.part-ab, 
> dataset_sample.json.gz.part-ac, dataset_sample.json.gz.part-ad, 
> dataset_sample.json.gz.part-ae, dataset_sample.json.gz.part-af
>
>
> Hello,
> I'm trying to execute next query:
> {code}
> select * from (select source.pck, source.`timestamp`, 
> flatten(source.HostUpdateTypeNW.Transfers) as entry from 
> dfs.`/mnt/data/dataset_4095_and_1.json` as source) as parsed;
> {code}
> It works as expected, and I get this result:
> {code}
> +-------+-------------------+--------------------+
> |  pck  |     timestamp     |       entry        |
> +-------+-------------------+--------------------+
> | 3547  | 1419807470286356  | 
> {"TransferingPurpose":"8","TransferingImpact":"88","TransferingKind":"8","TransferingTime":"8","PackageOrigSenderID":"8","TransferingID":"8","TransitCN":"888","PackageChkPnt":"","PackageFullSize":"8","TransferingSessionID":"8","SubpackagesCounter":"8"}
>  |
> +-------+-------------------+--------------------+
> 1 row selected (0.188 seconds)
> {code}
> This file contains 4095 identical lines of one JSON string, plus one more JSON 
> line at the end (see the attached file dataset_4095_and_1.json).
> The problem is that when the first string repeats more than 4095 times, the 
> query gets an exception. Here is the query for a file with 4096 strings of the 
> first type plus 1 string of the other (see the attached file 
> dataset_4096_and_1.json).
> {code}
> select * from (select source.pck, source.`timestamp`, 
> flatten(source.HostUpdateTypeNW.Transfers) as entry from 
> dfs.`/mnt/data/dataset_4096_and_1.json` as source) as parsed;
> Exception in thread "2ae108ff-b7ea-8f07-054e-84875815d856:frag:0:0" 
> java.lang.RuntimeException: Error closing fragment context.
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:224)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:187)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassCastException: 
> org.apache.drill.exec.vector.NullableIntVector cannot be cast to 
> org.apache.drill.exec.vector.RepeatedVector
>   at 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.getFlattenFieldTransferPair(FlattenRecordBatch.java:274)
>   at 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.setupNewSchema(FlattenRecordBatch.java:296)
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78)
>   at 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122)
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>   at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:68)
>   at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:96)
>   at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:58)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:163)
>   ... 4 more
>

[jira] [Updated] (DRILL-3008) Canonicalize Option Names, update calls to use validators rather than names.

2015-06-26 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-3008:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> Canonicalize Option Names, update calls to use validators rather than names.
> 
>
> Key: DRILL-3008
> URL: https://issues.apache.org/jira/browse/DRILL-3008
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Jacques Nadeau
>Assignee: Jason Altekruse
> Fix For: 1.2.0
>
> Attachments: DRILL-3008.patch
>
>
> Clean up option usages before 1.0:
> - Update the option names to be consistent.
> - Always use type-safe validators rather than direct names in Drill core and 
> testcode.  
> - Update test framework to take validator and setting rather than string
> - Remove all string ALTER SESSION settings in Drill codebase



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3167) When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

2015-06-26 Thread Jason Altekruse (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603794#comment-14603794
 ] 

Jason Altekruse commented on DRILL-3167:


I have been running the unit tests as master has been moving forward pretty 
quickly yesterday and today. Originally I had seen what I assumed were 
intermittent failures, but I am now seeing a consistent failure in 
TestDrillbitResilience. I will be moving this to 1.2.

> When a query fails, Foreman should wait for all fragments to finish cleaning 
> up before sending a FAILED state to the client
> ---
>
> Key: DRILL-3167
> URL: https://issues.apache.org/jira/browse/DRILL-3167
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Deneche A. Hakim
>Assignee: Jason Altekruse
> Fix For: 1.2.0
>
> Attachments: DRILL-3167.1.patch.txt
>
>
> TestDrillbitResilience.foreman_runTryEnd() exposes this problem intermittently.
> The query fails and the Foreman reports the failure to the client, which 
> removes the results listener associated with the failed query. 
> Sometimes, a data batch reaches the client after the FAILED state already 
> arrived, the client doesn't handle this properly and the corresponding buffer 
> is never released.
> Making the Foreman wait for all fragments to finish before sending the final 
> state should help avoid such scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3167) When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

2015-06-26 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-3167:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> When a query fails, Foreman should wait for all fragments to finish cleaning 
> up before sending a FAILED state to the client
> ---
>
> Key: DRILL-3167
> URL: https://issues.apache.org/jira/browse/DRILL-3167
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Deneche A. Hakim
>Assignee: Jason Altekruse
> Fix For: 1.2.0
>
> Attachments: DRILL-3167.1.patch.txt
>
>
> TestDrillbitResilience.foreman_runTryEnd() exposes this problem intermittently.
> The query fails and the Foreman reports the failure to the client, which 
> removes the results listener associated with the failed query. 
> Sometimes, a data batch reaches the client after the FAILED state already 
> arrived, the client doesn't handle this properly and the corresponding buffer 
> is never released.
> Making the Foreman wait for all fragments to finish before sending the final 
> state should help avoid such scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3344) When Group By clause is present, the argument in window function should not refer to any column outside Group By

2015-06-26 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu updated DRILL-3344:
-
Summary: When Group By clause is present, the argument in window function 
should not refer to any column outside Group By  (was: Empty OVER clause + 
Group By : AssertionError)

> When Group By clause is present, the argument in window function should not 
> refer to any column outside Group By
> 
>
> Key: DRILL-3344
> URL: https://issues.apache.org/jira/browse/DRILL-3344
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
> Environment: 6ebfbb9d0fc0b87b032f5e5d5cb0825f5464426e
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>  Labels: window_function
> Attachments: forPrqView.csv
>
>
> CTAS
> {code}
> 0: jdbc:drill:schema=dfs.tmp> create table tblForView(col_int, col_bigint, 
> col_char_2, col_vchar_52, col_tmstmp, col_dt, col_booln, col_dbl, col_tm) as 
> select cast(columns[0] as INT), cast(columns[1] as BIGINT),cast(columns[2] as 
> CHAR(2)), cast(columns[3] as VARCHAR(52)), cast(columns[4] as TIMESTAMP), 
> cast(columns[5] as DATE), cast(columns[6] as BOOLEAN),cast(columns[7] as 
> DOUBLE),cast(columns[8] as TIME) from `forPrqView.csv`;
> +-----------+-----------------------------+
> | Fragment  | Number of records written   |
> +-----------+-----------------------------+
> | 0_0       | 30                          |
> +-----------+-----------------------------+
> 1 row selected (0.586 seconds)
> {code}
> Failing query
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select max(col_tm) over(), col_char_2 from 
> tblForView group by col_char_2;
> Error: SYSTEM ERROR: java.lang.AssertionError: Internal error: while 
> converting MAX(`tblForView`.`col_tm`)
> [Error Id: 11afbdc9-d47a-4a52-aa77-40c20ffd2bc6 on centos-03.qa.lab:31010] 
> (state=,code=0)
> {code}
> Stack trace
> {code}
> [Error Id: 11afbdc9-d47a-4a52-aa77-40c20ffd2bc6 on centos-03.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> java.lang.AssertionError: Internal error: while converting 
> MAX(`tblForView`.`col_tm`)
> [Error Id: 11afbdc9-d47a-4a52-aa77-40c20ffd2bc6 on centos-03.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:522)
>  ~[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:738)
>  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:840)
>  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:782)
>  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:784)
>  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:893) 
> [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
> [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: while converting 
> MAX(`tblForView`.`col_tm`)
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: while converting 
> MAX(`tblForView`.`col_tm`)
> at org.apache.calcite.util.Util.newInternal(Util.java:790) 
> ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:152)
>  ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:60)
>  ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertOver(SqlToRelConverter.java:1762)
>  ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-dri
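
Given the new summary, a hedged rewrite of the failing query (untested sketch): 
aggregate first so the window argument only references the grouped result, then 
apply the window function on top.

{code}
select max(mx) over(), col_char_2
from (select col_char_2, max(col_tm) as mx
      from tblForView
      group by col_char_2) t;
{code}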

[jira] [Updated] (DRILL-3183) Query that uses window functions returns NPE

2015-06-26 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu updated DRILL-3183:
-
Summary: Query that uses window functions returns NPE  (was: When Group By 
clause is present, the argument in window function should not refer to any 
column outside Group By)

> Query that uses window functions returns NPE
> 
>
> Key: DRILL-3183
> URL: https://issues.apache.org/jira/browse/DRILL-3183
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.0.0
> Environment: faec150598840c40827e6493992d81209aa936da
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>  Labels: window_function
> Fix For: 1.1.0
>
>
> The test was run on a 4-node cluster on CentOS. We see an NPE for a query that 
> uses window functions. Data came from a CSV file (headers were ignored).
> To enable window functions, 
> alter session set `window.enable`=true;
> These two queries work 
> {code}
> SELECT count(salary) OVER w, count(salary) OVER w FROM cp.`employee.json` t 
> WINDOW w AS (PARTITION BY store_id ORDER BY position_id DESC);
> SELECT count(columns[0]) OVER(PARTITION BY columns[1] ORDER BY columns[0] 
> DESC), count(columns[0]) OVER(PARTITION BY columns[1] ORDER BY columns[0] 
> DESC) FROM `airports.csv`;
> {code}
> These two queries do not work and in the second query we see a NPE
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT count(*) OVER w, count(*) OVER w FROM 
> `airports.csv` WINDOW w AS (PARTITION BY columns[1] ORDER BY columns[0] DESC);
> Error: PARSE ERROR: From line 1, column 87 to line 1, column 93: Table 
> 'columns' not found
> [Error Id: 51d080bc-580f-44cc-a9be-d29ae60900c3 on centos-03.qa.lab:31010] 
> (state=,code=0)
> {code}
> Query that returns NPE.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT count(*) OVER w, count(*) OVER w FROM 
> `airports.csv` t WINDOW w AS (PARTITION BY t.columns[1] ORDER BY t.columns[0] 
> DESC);
> Error: PARSE ERROR: java.lang.NullPointerException
> [Error Id: 27e933bf-1382-4aae-bfef-36444a69acc9 on centos-03.qa.lab:31010] 
> (state=,code=0)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-05-26 19:07:33,104 [2a9b3b8a-4d0b-ba7b-f0ff-f8038f9f9dbd:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
> FAILED
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization: PARSE ERROR: java.lang.NullPointerException
> [Error Id: 16e17855-32f7-4687-9502-5b4880bb11a4 ]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:251) 
> [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: org.apache.drill.common.exceptions.UserException: PARSE ERROR: 
> java.lang.NullPointerException
> [Error Id: 16e17855-32f7-4687-9502-5b4880bb11a4 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:522)
>  ~[drill-common-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:180)
>  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:902) 
> [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:240) 
> [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
> ... 3 common frames omitted
> Caused by: org.apache.calcite.tools.ValidationException: 
> java.lang.NullPointerException
> at 
> org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:176) 
> ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
> at 
> org.apache.calcite.prepare.PlannerImpl.validateAndGetType(PlannerImpl.java:185)
>  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:226)
>  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:178)
>  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:177)
>  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
> ... 5 common frames omitted
> Caused by: java.lang.NullPointerException: null
> at 
> org.
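
As with the other columns[i] issues in this digest, a hedged workaround sketch 
(untested) is to alias the array accesses in a subquery so the WINDOW clause 
only references plain column names:

{code}
SELECT count(*) OVER w, count(*) OVER w
FROM (SELECT columns[0] AS c0, columns[1] AS c1 FROM `airports.csv`) t
WINDOW w AS (PARTITION BY c1 ORDER BY c0 DESC);
{code}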

[jira] [Updated] (DRILL-3370) FLATTEN error with a where clause

2015-06-26 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-3370:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> FLATTEN error with a where clause
> -
>
> Key: DRILL-3370
> URL: https://issues.apache.org/jira/browse/DRILL-3370
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.0.0
>Reporter: Joanlynn LIN
>Assignee: Jason Altekruse
> Fix For: 1.2.0
>
> Attachments: DRILL-3370.patch, jsonarray.150.json
>
>
> I've got a JSON file which contains 150 JSON strings all like this:
> {"arr": [94]}
> {"arr": [39]}
> {"arr": [180]}
> I was trying to Flatten() the arrays and filter the values using this SQL 
> query:
> select flatten(arr) as a from dfs.`/data/test/jsonarray.150.json` where a 
> > 100;
> However, it returned no result. Then I modified my expression like this:
>   select a from (select flatten(arr) as a from 
> dfs.`/data/test/jsonarray.150.json`) where a > 100;
> It then failed:
> Error: SYSTEM ERROR: 
> org.apache.drill.exec.exception.SchemaChangeException: Failure while trying 
> to materialize incoming schema.  Errors:
> Error in expression at index -1.  Error: Missing function implementation: 
> [flatten(BIGINT-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..
> Fragment 0:0
> [Error Id: 1d71bf0e-48da-43f8-8b36-6a513120d7e0 on slave2:31010] 
> (state=,code=0)
> After a lot of attempts, I finally got it to work:
> select a from (select flatten(arr) as a from 
> dfs.`/data/test/jsonarray.150.json` limit 1000) where a > 100;
> See, I just added a "limit 1000" to this query, and I am wondering whether 
> this is a bug in Drill?
> Looking forward to your attention and help. Many thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2838) Applying flatten after joining 2 sub-queries returns empty maps

2015-06-26 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-2838:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> Applying flatten after joining 2 sub-queries returns empty maps
> ---
>
> Key: DRILL-2838
> URL: https://issues.apache.org/jira/browse/DRILL-2838
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: data.json
>
>
> git.commit.id.abbrev=5cd36c5
> The query below applies flatten after joining 2 subqueries. It generates 
> empty maps, which is wrong.
> {code}
> select v1.uid, flatten(events), flatten(transactions) from 
> (select uid, events from `data.json`) v1
> inner join
> (select uid, transactions from `data.json`) v2
> on v1.uid = v2.uid;
> +------+---------+---------+
> | uid  | EXPR$1  | EXPR$2  |
> +------+---------+---------+
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 1    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> | 2    | {}      | {}      |
> +------+---------+---------+
> 36 rows selected (0.244 seconds)
> {code}
> I attached the data set. Let me know if you have any questions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3354) TestBuilder can check if the number of result batches equals some expected value

2015-06-26 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-3354:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> TestBuilder can check if the number of result batches equals some expected 
> value
> 
>
> Key: DRILL-3354
> URL: https://issues.apache.org/jira/browse/DRILL-3354
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Reporter: Deneche A. Hakim
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: DRILL-3354.1.patch.txt
>
>
> TestWindowFrame unit tests are only meaningful if memory sort exposes batches 
> of a specific size (20 rows) downstream. Otherwise, the tests will pass but 
> they won't catch problems related to some specific edge cases when multiple 
> batches are involved.
> The purpose of this JIRA is to extend TestBuilder so that unit tests that 
> wish to do so can fail if the number of result batches differs from some 
> expected number.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1673) Flatten function can not work well with nested arrays.

2015-06-26 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1673:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> Flatten function can not work well with nested arrays.
> --
>
> Key: DRILL-1673
> URL: https://issues.apache.org/jira/browse/DRILL-1673
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 0.7.0
> Environment: 0.7.0
>Reporter: Hao Zhu
>Assignee: Jason Altekruse
>Priority: Blocker
> Fix For: 1.2.0
>
> Attachments: DRILL-1673-reopened.patch, DRILL-1673.patch, error.log
>
>
> The flatten function fails to scan nested arrays, for example something like 
> "[[ ]]".
> The only difference between the JSON files in the two tests below is 
> "num":[1,2,3]
> VS
> "num":[[1,2,3]]
> ==Test 1 (Works well):==
> file:
> {code}
> {"fixed_column":"abc", "list_column":[{"id1":"1","name":"zhu", "num": 
> [1,2,3]}, {"id1":"2","name":"hao", "num": [4,5,6]} ]}
> {code}
> SQL:
> {code}
> 0: jdbc:drill:zk=local> select t.`fixed_column` as fixed_column, 
> flatten(t.`list_column`)  from 
> dfs.root.`/Users/hzu/Documents/sharefolder/hp/n2.json` as t;
> +---------------+------------------------------------------+
> | fixed_column  |                  EXPR$1                  |
> +---------------+------------------------------------------+
> | abc           | {"id1":"1","name":"zhu","num":[1,2,3]}  |
> | abc           | {"id1":"2","name":"hao","num":[4,5,6]}  |
> +---------------+------------------------------------------+
> 2 rows selected (0.154 seconds)
> {code}
> ==Test 2 (Failed):==
> file:
> {code}
> {"fixed_column":"abc", "list_column":[{"id1":"1","name":"zhu", "num": 
> [[1,2,3]]}, {"id1":"2","name":"hao", "num": [[4,5,6]]} ]}
> {code}
> SQL:
> {code}
> 0: jdbc:drill:zk=local>  select t.`fixed_column` as fixed_column, 
> flatten(t.`list_column`)  from 
> dfs.root.`/Users/hzu/Documents/sharefolder/hp/n3.json` as t;
> +---------------+------------------------------------------+
> | fixed_column  |                  EXPR$1                  |
> +---------------+------------------------------------------+
> Query failed: Failure while running fragment.[ 
> df28347b-fac1-497d-b9c5-a313ba77aa4d on 10.250.0.115:31010 ]
>   (java.lang.UnsupportedOperationException) 
> 
> org.apache.drill.exec.vector.complex.RepeatedListVector$RepeatedListTransferPair.splitAndTransfer():339
> 
> org.apache.drill.exec.vector.complex.RepeatedMapVector$SingleMapTransferPair.splitAndTransfer():305
> org.apache.drill.exec.test.generated.FlattenerGen22.flattenRecords():93
> 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.doWork():152
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():89
> 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext():118
> org.apache.drill.exec.record.AbstractRecordBatch.next():106
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():124
> org.apache.drill.exec.record.AbstractRecordBatch.next():86
> org.apache.drill.exec.record.AbstractRecordBatch.next():76
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():52
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
> org.apache.drill.exec.record.AbstractRecordBatch.next():106
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():124
> org.apache.drill.exec.physical.impl.BaseRootExec.next():67
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():122
> org.apache.drill.exec.physical.impl.BaseRootExec.next():57
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():105
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run():249
> ...():0
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
> query.
>   at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
>   at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
>   at sqlline.SqlLine.print(SqlLine.java:1809)
>   at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
>   at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
>   at sqlline.SqlLine.dispatch(SqlLine.java:889)
>   at sqlline.SqlLine.begin(SqlLine.java:763)
>   at sqlline.SqlLine.start(SqlLine.java:498)
>   at sqlline.SqlLine.main(SqlLine.java:460)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3382) CTAS with order by clause fails with IOOB exception

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3382:
--
Attachment: 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch

Revised code based on review comments. 

> CTAS with order by clause fails with IOOB exception
> ---
>
> Key: DRILL-3382
> URL: https://issues.apache.org/jira/browse/DRILL-3382
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Parth Chandra
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch, 
> 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch
>
>
> The query :
> {panel}
> create table `lineitem__5`  as select l_suppkey, l_partkey, l_linenumber from 
> cp.`tpch/lineitem.parquet` l order by l_linenumber;
> {panel}
> fails with an IOOB exception
> Trace in log - 
> {panel}
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index (2) must be less than size (2)
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:737)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:839)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:781)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:783)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:892) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: Error while 
> applying rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying 
> rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> at org.apache.calcite.util.Util.newInternal(Util.java:790) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:795)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:316) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:260)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.convertToPrel(CreateTableHandler.java:120)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:99)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run

[jira] [Commented] (DRILL-3307) Query with window function runs out of memory

2015-06-26 Thread Abhishek Girish (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603758#comment-14603758
 ] 

Abhishek Girish commented on DRILL-3307:


I tried running with a WHERE clause to remove NULLs, but still hit the issue. 
Raising priority.
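
For reference, a sketch of the variant described above (the exact predicate is 
an assumption; per the comment, it still ran out of memory):

{code}
SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS TotalSpend
FROM store_sales ss
WHERE ss.ss_store_sk IS NOT NULL
ORDER BY 1 LIMIT 20;
{code}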

> Query with window function runs out of memory
> -
>
> Key: DRILL-3307
> URL: https://issues.apache.org/jira/browse/DRILL-3307
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
> Environment: Data set: TPC-DS SF 100 Parquet
> Number of Nodes: 4
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
>Priority: Blocker
> Fix For: 1.2.0
>
> Attachments: drillbit.log.txt
>
>
> Query with window function runs out of memory:
> {code:sql}
>  SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss ORDER BY 1 LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or 
> more nodes ran out of memory while executing the query.
> Fragment 3:0
> [Error Id: 9af19064-9175-46a4-b557-714d1c77cd76 on abhi6.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Plan:
> {code}
> 00-00Screen : rowType = RecordType(ANY TotalSpend): rowcount = 
> 2.87997024E8, cumulative cost = {4.3487550824E9 rows, 5.7539970079068695E10 
> cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 memory}, id = 142297
> 00-01  SelectionVectorRemover : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.31995538E9 rows, 
> 5.751117037666869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142296
> 00-02Limit(fetch=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.031958356E9 rows, 
> 5.722317335266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142295
> 00-03  SingleMergeExchange(sort0=[0 ASC]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {4.031958336E9 rows, 
> 5.722317327266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142294
> 01-01SelectionVectorRemover : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.743961312E9 rows, 
> 5.261522088866869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142293
> 01-02  TopN(limit=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {3.455964288E9 rows, 
> 5.232722386466869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142292
> 01-03Project(TotalSpend=[$0]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.167967264E9 rows, 
> 4.734841414759049E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142291
> 01-04  HashToRandomExchange(dist0=[[$0]]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {3.167967264E9 rows, 4.734841414759049E10 
> cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 memory}, id = 142290
> 02-01UnorderedMuxExchange : rowType = RecordType(ANY 
> TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.87997024E8, 
> cumulative cost = {2.87997024E9 rows, 4.274046176359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142289
> 03-01  Project(TotalSpend=[$0], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {2.591973216E9 rows, 4.245246473959049E10 
> cpu, 0.0 io, 3.538907430912E12 network, 4.607952384E9 memory}, id = 142288
> 03-02Project(TotalSpend=[CASE(>($2, 0), CAST($3):ANY, 
> null)]) : rowType = RecordType(ANY TotalSpend): rowcount = 2.87997024E8, 
> cumulative cost = {2.303976192E9 rows, 4.130047664359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142287
> 03-03 

[jira] [Updated] (DRILL-3307) Query with window function runs out of memory

2015-06-26 Thread Abhishek Girish (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-3307:
---
Priority: Blocker  (was: Major)

> Query with window function runs out of memory
> -
>
> Key: DRILL-3307
> URL: https://issues.apache.org/jira/browse/DRILL-3307
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
> Environment: Data set: TPC-DS SF 100 Parquet
> Number of Nodes: 4
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
>Priority: Blocker
> Fix For: 1.2.0
>
> Attachments: drillbit.log.txt
>
>
> Query with window function runs out of memory:
> {code:sql}
>  SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss ORDER BY 1 LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or 
> more nodes ran out of memory while executing the query.
> Fragment 3:0
> [Error Id: 9af19064-9175-46a4-b557-714d1c77cd76 on abhi6.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Plan:
> {code}
> 00-00Screen : rowType = RecordType(ANY TotalSpend): rowcount = 
> 2.87997024E8, cumulative cost = {4.3487550824E9 rows, 5.7539970079068695E10 
> cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 memory}, id = 142297
> 00-01  SelectionVectorRemover : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.31995538E9 rows, 
> 5.751117037666869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142296
> 00-02Limit(fetch=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {4.031958356E9 rows, 
> 5.722317335266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142295
> 00-03  SingleMergeExchange(sort0=[0 ASC]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {4.031958336E9 rows, 
> 5.722317327266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
> memory}, id = 142294
> 01-01SelectionVectorRemover : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.743961312E9 rows, 
> 5.261522088866869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142293
> 01-02  TopN(limit=[20]) : rowType = RecordType(ANY TotalSpend): 
> rowcount = 2.87997024E8, cumulative cost = {3.455964288E9 rows, 
> 5.232722386466869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142292
> 01-03Project(TotalSpend=[$0]) : rowType = RecordType(ANY 
> TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.167967264E9 rows, 
> 4.734841414759049E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
> memory}, id = 142291
> 01-04  HashToRandomExchange(dist0=[[$0]]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {3.167967264E9 rows, 4.734841414759049E10 
> cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 memory}, id = 142290
> 02-01UnorderedMuxExchange : rowType = RecordType(ANY 
> TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.87997024E8, 
> cumulative cost = {2.87997024E9 rows, 4.274046176359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142289
> 03-01  Project(TotalSpend=[$0], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))]) : rowType = 
> RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
> 2.87997024E8, cumulative cost = {2.591973216E9 rows, 4.245246473959049E10 
> cpu, 0.0 io, 3.538907430912E12 network, 4.607952384E9 memory}, id = 142288
> 03-02Project(TotalSpend=[CASE(>($2, 0), CAST($3):ANY, 
> null)]) : rowType = RecordType(ANY TotalSpend): rowcount = 2.87997024E8, 
> cumulative cost = {2.303976192E9 rows, 4.130047664359049E10 cpu, 0.0 io, 
> 3.538907430912E12 network, 4.607952384E9 memory}, id = 142287
> 03-03  Window(window#0=[window(partition {1} order by 
> [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [CO

[jira] [Updated] (DRILL-3192) TestDrillbitResilience#cancelWhenQueryIdArrives hangs

2015-06-26 Thread Sudheesh Katkam (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheesh Katkam updated DRILL-3192:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> TestDrillbitResilience#cancelWhenQueryIdArrives hangs
> -
>
> Key: DRILL-3192
> URL: https://issues.apache.org/jira/browse/DRILL-3192
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Sudheesh Katkam
>Assignee: Sudheesh Katkam
> Fix For: 1.2.0
>
>
> TestDrillbitResilience#cancelWhenQueryIdArrives (previously named 
> cancelBeforeAnyResultsArrive) hangs when the test is run multiple times. 
> (Will add more information)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3382) CTAS with order by clause fails with IOOB exception

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3382:
--
Assignee: Aman Sinha  (was: Jinfeng Ni)

> CTAS with order by clause fails with IOOB exception
> ---
>
> Key: DRILL-3382
> URL: https://issues.apache.org/jira/browse/DRILL-3382
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Parth Chandra
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch
>
>
> The query :
> {panel}
> create table `lineitem__5`  as select l_suppkey, l_partkey, l_linenumber from 
> cp.`tpch/lineitem.parquet` l order by l_linenumber;
> {panel}
> fails with an IOOB exception
> Trace in log - 
> {panel}
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index (2) must be less than size (2)
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:737)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:839)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:781)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:783)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:892) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: Error while 
> applying rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying 
> rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> at org.apache.calcite.util.Util.newInternal(Util.java:790) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:795)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:316) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:260)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.convertToPrel(CreateTableHandler.java:120)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:99)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:903) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java

[jira] [Commented] (DRILL-3382) CTAS with order by clause fails with IOOB exception

2015-06-26 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603709#comment-14603709
 ] 

Jinfeng Ni commented on DRILL-3382:
---

[~amansinha100], could you please review the patch? Thanks!




> CTAS with order by clause fails with IOOB exception
> ---
>
> Key: DRILL-3382
> URL: https://issues.apache.org/jira/browse/DRILL-3382
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Parth Chandra
>Assignee: Jinfeng Ni
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch
>
>
> The query :
> {panel}
> create table `lineitem__5`  as select l_suppkey, l_partkey, l_linenumber from 
> cp.`tpch/lineitem.parquet` l order by l_linenumber;
> {panel}
> fails with an IOOB exception
> Trace in log - 
> {panel}
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index (2) must be less than size (2)
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:737)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:839)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:781)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:783)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:892) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: Error while 
> applying rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying 
> rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> at org.apache.calcite.util.Util.newInternal(Util.java:790) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:795)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:316) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:260)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.convertToPrel(CreateTableHandler.java:120)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:99)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:903) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-S

[jira] [Updated] (DRILL-3382) CTAS with order by clause fails with IOOB exception

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3382:
--
Attachment: 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch

> CTAS with order by clause fails with IOOB exception
> ---
>
> Key: DRILL-3382
> URL: https://issues.apache.org/jira/browse/DRILL-3382
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Parth Chandra
>Assignee: Jinfeng Ni
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-3382-Fix-IOOB-error-for-CTAS-order-by-statemen.patch
>
>
> The query :
> {panel}
> create table `lineitem__5`  as select l_suppkey, l_partkey, l_linenumber from 
> cp.`tpch/lineitem.parquet` l order by l_linenumber;
> {panel}
> fails with an IOOB exception
> Trace in log - 
> {panel}
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index (2) must be less than size (2)
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:737)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:839)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:781)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:783)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:892) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: Error while 
> applying rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying 
> rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> at org.apache.calcite.util.Util.newInternal(Util.java:790) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:795)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:316) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:260)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.convertToPrel(CreateTableHandler.java:120)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:99)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:903) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.w

[jira] [Updated] (DRILL-3274) remove option 'window.enable'

2015-06-26 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-3274:

Labels: window_function  (was: )

> remove option 'window.enable'
> -
>
> Key: DRILL-3274
> URL: https://issues.apache.org/jira/browse/DRILL-3274
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
>  Labels: window_function
> Fix For: 1.2.0
>
>
> as part of DRILL-3200 window functions will be enabled by default. We 
> shouldn't need to disable window functions, so it's safe to remove the 
> 'window.enable' option



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3407) CTAS Auto Partition : The plan for count(*) should show the list of files scanned

2015-06-26 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-3407:


 Summary: CTAS Auto Partition : The plan for count(*) should show 
the list of files scanned
 Key: DRILL-3407
 URL: https://issues.apache.org/jira/browse/DRILL-3407
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Rahul Challapalli
Assignee: Jinfeng Ni
Priority: Minor
 Fix For: 1.2.0


#Generated by Git-Commit-Id-Plugin
#Fri Jun 26 19:46:34 UTC 2015
git.commit.id.abbrev=60bc945

The below plan does not give information about the list of files scanned
{code}
0: jdbc:drill:schema=dfs_eea> explain plan for select count(*) from 
`existing_partition_pruning/lineitempart` where dir0=1991;
+--------+-------+
|  text  | json  |
+--------+-------+
| 00-00Screen
00-01  Project(EXPR$0=[$0])
00-02
Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@17153c76])
 | {
  "head" : {
"version" : 1,
"generator" : {
  "type" : "ExplainHandler",
  "info" : ""
},
"type" : "APACHE_DRILL_PHYSICAL",
"options" : [ {
  "name" : "drill.exec.storage.file.partition.column.label",
  "kind" : "STRING",
  "type" : "SESSION",
  "string_val" : "partition_string1"
} ],
"queue" : 0,
"resultMode" : "EXEC"
  },
  "graph" : [ {
"pop" : "DirectGroupScan",
"@id" : 2,
"cost" : 20.0
  }, {
"pop" : "project",
"@id" : 1,
"exprs" : [ {
  "ref" : "`EXPR$0`",
  "expr" : "`count`"
} ],
"child" : 2,
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 20.0
  }, {
"pop" : "screen",
"@id" : 0,
"child" : 1,
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 20.0
  } ]
} |
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3402) Throw exception when attempting to partition for format that don't support

2015-06-26 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603660#comment-14603660
 ] 

Jinfeng Ni commented on DRILL-3402:
---

Please upload the new patch here. 

+1.  




> Throw exception when attempting to partition for format that don't support
> --
>
> Key: DRILL-3402
> URL: https://issues.apache.org/jira/browse/DRILL-3402
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Attachments: DRILL-3402.patch
>
>
> CTAS auto-partitioning only works with Parquet output, so we need to make 
> sure we catch it if the output format is set to something other than Parquet. 
> Since CTAS is only supported for the FileSystem storage, that means we only 
> have to handle it for the various FormatPlugins.
> I will add a method to the FormatPlugin interface, supportAutoPartitioning(), 
> which will indicate whether it is supported. If it is not supported, and the 
> statement contains a partition clause, it will throw an exception.
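
A minimal sketch of the described change, assuming only the method name 
supportAutoPartitioning() given above; the guard class below is hypothetical 
wiring to show where the check would sit, not the actual Drill implementation:

{code}
// Sketch only: proposed accessor on the FormatPlugin interface.
public interface FormatPlugin {
  // Report whether this output format supports CTAS auto-partitioning.
  boolean supportAutoPartitioning();
  // ... existing FormatPlugin methods ...
}

// Hypothetical guard in the CTAS planning path (class name is ours):
class CtasPartitionCheck {
  static void check(FormatPlugin plugin, java.util.List<String> partitionColumns) {
    if (!partitionColumns.isEmpty() && !plugin.supportAutoPartitioning()) {
      throw new UnsupportedOperationException(
          "CTAS auto-partitioning is only supported for the Parquet output format");
    }
  }
}
{code}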



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3404) Filter on window function does not appear in query plan

2015-06-26 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3404:
--
Attachment: 0_0_0.parquet

Attached parquet input file used in the test.

> Filter on window function does not appear in query plan
> ---
>
> Key: DRILL-3404
> URL: https://issues.apache.org/jira/browse/DRILL-3404
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
>Priority: Critical
> Attachments: 0_0_0.parquet
>
>
> Filter is missing in the query plan for the below query in Drill, and hence 
> wrong results are returned (rows with a null w_sum, which the WHERE clause 
> should have filtered out, still appear).
> Results from Drill
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select c1, c2, w_sum from ( select c1, c2, sum 
> ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from 
> `tblWnulls` ) sub_query where w_sum is not null;
> +-+---+-+
> | c1  |  c2   |w_sum|
> +-+---+-+
> | 0   | a | 0   |
> | 1   | a | 1   |
> | 5   | a | 6   |
> | 10  | a | 16  |
> | 11  | a | 27  |
> | 14  | a | 41  |
> | 1   | a | 11152   |
> | 2   | b | 2   |
> | 9   | b | 11  |
> | 13  | b | 24  |
> | 17  | b | 41  |
> | null| c | null|
> | 4   | c | 4   |
> | 6   | c | 10  |
> | 8   | c | 18  |
> | 12  | c | 30  |
> | 13  | c | 56  |
> | 13  | c | 56  |
> | null| d | null|
> | null| d | null|
> | 10  | d | 10  |
> | 11  | d | 21  |
> | 2147483647  | d | 4294967315  |
> | 2147483647  | d | 4294967315  |
> | -1  | e | -1  |
> | 15  | e | 14  |
> | null| null  | null|
> | 19  | null  | 19  |
> | 65536   | null  | 6   |
> | 100 | null  | 106 |
> +-+---+-+
> 30 rows selected (0.337 seconds)
> {code}
> Explain plan for the above query from Drill
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, c2, w_sum from ( 
> select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first 
> ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
> +--------+-------+
> |  text  | json  |
> +--------+-------+
> | 00-00Screen
> 00-01  Project(c1=[$0], c2=[$1], w_sum=[$2])
> 00-02Project(c1=[$

[jira] [Updated] (DRILL-3402) Throw exception when attempting to partition for format that don't support

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3402:
--
Assignee: Steven Phillips  (was: Jinfeng Ni)

> Throw exception when attempting to partition for format that don't support
> --
>
> Key: DRILL-3402
> URL: https://issues.apache.org/jira/browse/DRILL-3402
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Attachments: DRILL-3402.patch
>
>
> CTAS auto-partitioning only works with Parquet output, so we need to make 
> sure we catch it if the output format is set to something other than Parquet. 
> Since CTAS is only supported for the FileSystem storage, that means we only 
> have to handle it for the various FormatPlugins.
> I will add a method to the FormatPlugin interface, supportAutoPartitioning(), 
> which will indicate whether it is supported. If it is not supported, and the 
> statement contains a partition clause, it will throw an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3401) How to use Snappy compression on Parquet table?

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3401:
--
Component/s: (was: Functions - Drill)
 Storage - Parquet

> How to use Snappy compression on Parquet table?
> ---
>
> Key: DRILL-3401
> URL: https://issues.apache.org/jira/browse/DRILL-3401
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.0.0
> Environment: Cloudera 5.4.2 AWS cluster
>Reporter: Kathy Qiu
>Assignee: Daniel Barclay (Drill)
>  Labels: easyfix, features
>
> To use Snappy compression on a Parquet table I created, these are the 
> commands I used:
> alter session set `store.format`='parquet';
> alter session set `store.parquet.compression`='snappy';
> create table <table_name> as (select 
> cast (columns[0] as DECIMAL(10,0)) 
> etc...
> from dfs.`<file_path>`);
> Does this suffice? Or do I need to specify Snappy compression in the CTAS 
> command as well?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-3405) "INTERVAL '1111111111' YEAR(10)" yields garbage result

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603587#comment-14603587
 ] 

Daniel Barclay (Drill) edited comment on DRILL-3405 at 6/26/15 9:24 PM:


There's also a problem with interval literals involving seconds and not days 
(HOUR(10) to SECOND, MINUTE(10) to SECOND, and SECOND(10)).

Oh--and it's not just with a precision of 10 and 10-digit numbers:

{noformat}
0: jdbc:drill:zk=local> SELECT  INTERVAL '987654321' YEAR(9) FROM 
INFORMATION_SCHEMA.CATALOGS;
+--+
|EXPR$0|
+--+
| P-86087503Y  |
+--+
1 row selected (0.209 seconds)
0: jdbc:drill:zk=local> 
{noformat}



was (Author: dsbos):
There's also a problem with interval literals involving seconds and not days 
(HOUR(10) to SECOND, MINUTE(10) to SECOND, and SECOND(10)).

> "INTERVAL '11' YEAR(10)" yields garbage result
> --
>
> Key: DRILL-3405
> URL: https://issues.apache.org/jira/browse/DRILL-3405
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>
> Some interval literals yield garbage results.
> It seems to be those with YEAR and/or MONTH fields, with Drill's maximum 
> leading-digit precision (10), and having a ten-digit value for that leading 
> field (even when it's 10 digits that fit in int).
> {noformat}
> 0: jdbc:drill:zk=local> SELECT  INTERVAL '1111111111' YEAR(10) FROM 
> INFORMATION_SCHEMA.CATALOGS;
> +-+
> |   EXPR$0|
> +-+
> | P37369287Y  |
> +-+
> 1 row selected (0.234 seconds)
> 0: jdbc:drill:zk=local> 
> 0: jdbc:drill:zk=local> SELECT  INTERVAL '1111111111' MONTH(10) FROM 
> INFORMATION_SCHEMA.CATALOGS;
> +---+
> |EXPR$0 |
> +---+
> | P92592592Y7M  |
> +---+
> 1 row selected (0.171 seconds)
> 0: jdbc:drill:zk=local> 
> 0: jdbc:drill:zk=local> SELECT  INTERVAL '1111111111-06' YEAR(10) TO MONTH 
> FROM INFORMATION_SCHEMA.CATALOGS;
> +---+
> |EXPR$0 |
> +---+
> | P37369287Y6M  |
> +---+
> 1 row selected (0.229 seconds)
> 0: jdbc:drill:zk=local> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3406) Drill allows YEAR(10) but then disallows 10 digits

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3406:
-

 Summary: Drill allows YEAR(10) but then disallows 10 digits
 Key: DRILL-3406
 URL: https://issues.apache.org/jira/browse/DRILL-3406
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


In interval literals, Drill allows specifying interval types with a leading 
field precision of 10, but then doesn't allow all 10-digit values.  For example:

{noformat}
0: jdbc:drill:zk=local> SELECT  INTERVAL '2222222222' YEAR(10) FROM 
INFORMATION_SCHEMA.CATALOGS;
Jun 26, 2015 2:13:53 PM org.apache.calcite.sql.validate.SqlValidatorException 

SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Interval field 
value 2,222,222,222 exceeds precision of YEAR(10) field
Jun 26, 2015 2:13:53 PM org.apache.calcite.runtime.CalciteException 
SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 
9 to line 1, column 38: Interval field value 2,222,222,222 exceeds precision of 
YEAR(10) field
Error: PARSE ERROR: From line 1, column 9 to line 1, column 38: Interval field 
value 2,222,222,222 exceeds precision of YEAR(10) field


[Error Id: dea32980-c1ad-4d7c-9780-5a08714ffcb7 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}

Note that the value does _not_ exceed the declared precision of 10.

If Drill isn't going to allow a 10-digit value, it shouldn't accept a precision 
of 10 digits.

Either the maximum allowed leading digit precision should be reduced to 9 
(because 9-digit values seem to be accepted--although larger 9-digit values are 
processed wrong) or 10-digit values should be accepted.
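
For what it's worth, the P-86087503Y result for the YEAR(9) value 987654321 
shown in the DRILL-3405 comment above is consistent with the year count being 
scaled to months in signed 32-bit arithmetic. This is an inference from the 
observed output, not a reading of the source:

{code}
// Assumption: years are converted to months in a signed 32-bit int.
// Class name is ours, for illustration only.
public class IntervalWrapCheck9 {
  public static void main(String[] args) {
    long months = 987654321L * 12;     // 11851851852, well above 2^31 - 1
    int wrapped = (int) months;        // -1033050036 after signed 32-bit wraparound
    System.out.println(wrapped / 12);  // prints -86087503, i.e. "P-86087503Y"
  }
}
{code}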




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3405) "INTERVAL '1111111111' YEAR(10)" yields garbage result

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603587#comment-14603587
 ] 

Daniel Barclay (Drill) commented on DRILL-3405:
---

There's also a problem with interval literals involving seconds and not days 
(HOUR(10) to SECOND, MINUTE(10) to SECOND, and SECOND(10)).

> "INTERVAL '11' YEAR(10)" yields garbage result
> --
>
> Key: DRILL-3405
> URL: https://issues.apache.org/jira/browse/DRILL-3405
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>
> Some interval literals yield garbage results.
> It seems to be those with YEAR and/or MONTH fields, with Drill's maximum 
> leading-digit precision (10), and having a ten-digit value for that leading 
> field (even when it's 10 digits that fit in int).
> {noformat}
> 0: jdbc:drill:zk=local> SELECT  INTERVAL '1111111111' YEAR(10) FROM 
> INFORMATION_SCHEMA.CATALOGS;
> +-+
> |   EXPR$0|
> +-+
> | P37369287Y  |
> +-+
> 1 row selected (0.234 seconds)
> 0: jdbc:drill:zk=local> 
> 0: jdbc:drill:zk=local> SELECT  INTERVAL '1111111111' MONTH(10) FROM 
> INFORMATION_SCHEMA.CATALOGS;
> +---+
> |EXPR$0 |
> +---+
> | P92592592Y7M  |
> +---+
> 1 row selected (0.171 seconds)
> 0: jdbc:drill:zk=local> 
> 0: jdbc:drill:zk=local> SELECT  INTERVAL '1111111111-06' YEAR(10) TO MONTH 
> FROM INFORMATION_SCHEMA.CATALOGS;
> +---+
> |EXPR$0 |
> +---+
> | P37369287Y6M  |
> +---+
> 1 row selected (0.229 seconds)
> 0: jdbc:drill:zk=local> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3405) "INTERVAL '1111111111' YEAR(10)" yields garbage result

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3405:
-

 Summary: "INTERVAL '11' YEAR(10)" yields garbage result
 Key: DRILL-3405
 URL: https://issues.apache.org/jira/browse/DRILL-3405
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


Some interval literals yield garbage results.

It seems to be those with YEAR and/or MONTH fields, with Drill's maximum 
leading-digit precision (10), and having a ten-digit value for that leading 
field (even when it's 10 digits that fit in int).

{noformat}
0: jdbc:drill:zk=local> SELECT  INTERVAL '1111111111' YEAR(10) FROM 
INFORMATION_SCHEMA.CATALOGS;
+-+
|   EXPR$0|
+-+
| P37369287Y  |
+-+
1 row selected (0.234 seconds)
0: jdbc:drill:zk=local> 

0: jdbc:drill:zk=local> SELECT  INTERVAL '1111111111' MONTH(10) FROM 
INFORMATION_SCHEMA.CATALOGS;
+---+
|EXPR$0 |
+---+
| P92592592Y7M  |
+---+
1 row selected (0.171 seconds)
0: jdbc:drill:zk=local> 

0: jdbc:drill:zk=local> SELECT  INTERVAL '1111111111-06' YEAR(10) TO MONTH FROM 
INFORMATION_SCHEMA.CATALOGS;
+---+
|EXPR$0 |
+---+
| P37369287Y6M  |
+---+
1 row selected (0.229 seconds)
0: jdbc:drill:zk=local> 

{noformat}
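
The YEAR(10) output above fits the same kind of 32-bit wraparound; again an 
inference from the numbers, not confirmed against the code:

{code}
// Assumption: the YEAR field is scaled to months in 32-bit int math.
// Class name is ours, for illustration only.
public class IntervalWrapCheck10 {
  public static void main(String[] args) {
    long months = 1111111111L * 12;    // 13333333332 months
    int wrapped = (int) months;        // 448431444 after 32-bit wraparound
    System.out.println(wrapped / 12);  // prints 37369287, matching "P37369287Y"
  }
}
{code}

Note that the MONTH(10) case above appears numerically exact (92592592 years 
7 months is precisely 1111111111 months), which fits an overflow that occurs 
only when a YEAR field is scaled to months.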




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3404) Filter on window function does not appear in query plan

2015-06-26 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3404:
-

 Summary: Filter on window function does not appear in query plan
 Key: DRILL-3404
 URL: https://issues.apache.org/jira/browse/DRILL-3404
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
 Environment: 4 node cluster on CentOS
Reporter: Khurram Faraaz
Assignee: Jinfeng Ni
Priority: Critical


Filter is missing in the query plan for the below query in Drill, and hence 
wrong results are returned (rows with a null w_sum, which the WHERE clause 
should have filtered out, still appear).

Results from Drill
{code}
0: jdbc:drill:schema=dfs.tmp> select c1, c2, w_sum from ( select c1, c2, sum ( 
c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from 
`tblWnulls` ) sub_query where w_sum is not null;
+-+---+-+
| c1  |  c2   |w_sum|
+-+---+-+
| 0   | a | 0   |
| 1   | a | 1   |
| 5   | a | 6   |
| 10  | a | 16  |
| 11  | a | 27  |
| 14  | a | 41  |
| 1   | a | 11152   |
| 2   | b | 2   |
| 9   | b | 11  |
| 13  | b | 24  |
| 17  | b | 41  |
| null| c | null|
| 4   | c | 4   |
| 6   | c | 10  |
| 8   | c | 18  |
| 12  | c | 30  |
| 13  | c | 56  |
| 13  | c | 56  |
| null| d | null|
| null| d | null|
| 10  | d | 10  |
| 11  | d | 21  |
| 2147483647  | d | 4294967315  |
| 2147483647  | d | 4294967315  |
| -1  | e | -1  |
| 15  | e | 14  |
| null| null  | null|
| 19  | null  | 19  |
| 65536   | null  | 6   |
| 100 | null  | 106 |
+-+---+-+
30 rows selected (0.337 seconds)
{code}

Explain plan for the above query from Drill
{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, c2, w_sum from ( 
select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) 
w_sum from `tblWnulls` ) sub_query where w_sum is not null;
+--------+-------+
|  text  | json  |
+--------+-------+
| 00-00Screen
00-01  Project(c1=[$0], c2=[$1], w_sum=[$2])
00-02Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
00-03  Window(window#0=[window(partition {1} order by [0 
ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs 
[COUNT($0), $SUM0($0)])])
00-04SelectionVectorRemover
00-05  Sort(sort0=[$1], sort1=[$0], dir0=[ASC], 
dir1=[ASC-nulls-first])
00-06Project(c1=[$1], c2=[$0])
00-07  Scan(g

[jira] [Closed] (DRILL-3376) Reading individual files created by CTAS with partition causes an exception

2015-06-26 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra closed DRILL-3376.


Fix verified

> Reading individual files created by CTAS with partition causes an exception
> ---
>
> Key: DRILL-3376
> URL: https://issues.apache.org/jira/browse/DRILL-3376
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Writer
>Affects Versions: 1.1.0
>Reporter: Parth Chandra
>Assignee: Steven Phillips
> Fix For: 1.1.0
>
>
> Create a table using CTAS with partitioning:
> {code}
> create table `lineitem_part` partition by (l_moddate) as select l.*, 
> l_shipdate - extract(day from l_shipdate) + 1 l_moddate from 
> cp.`tpch/lineitem.parquet` l
> {code}
> Then the following query causes an exception
> {code}
> select distinct l_moddate from `lineitem_part/0_0_1.parquet` where l_moddate 
> = date '1992-01-01';
> {code}
> Trace in the log file - 
> {panel}
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: 0
> at java.lang.String.charAt(String.java:658) ~[na:1.7.0_65]
> at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule$PathPartition.<init>(PruneScanRule.java:493)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:385)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule$4.onMatch(PruneScanRule.java:278)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> ... 13 common frames omitted
> {panel}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3199) GenericAccessor doesn't support isNull

2015-06-26 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-3199:
-
Assignee: Parth Chandra  (was: Daniel Barclay (Drill))

> GenericAccessor doesn't support isNull
> --
>
> Key: DRILL-3199
> URL: https://issues.apache.org/jira/browse/DRILL-3199
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
> Environment: I found this problem when calling the driver's wasNull() 
> method on a field that represented a nested JSON object (one level below the 
> root object), using the 'dfs' storage plugin and pointing at my local 
> filesystem.
>Reporter: Matt Burgess
>Assignee: Parth Chandra
> Fix For: 1.1.0
>
> Attachments: DRILL-3199.patch.1, DRILL-3199.patch.2, 
> DRILL-3199.patch.3
>
>
> GenericAccessor throws an UnsupportedOperationException when isNull() is 
> called. However for other methods it delegates to its ValueVector's accessor. 
> I think it should do the same for isNull().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3150) Error when filtering non-existent field with a string

2015-06-26 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-3150:
-
Assignee: Parth Chandra  (was: Adam Gilmore)

> Error when filtering non-existent field with a string
> -
>
> Key: DRILL-3150
> URL: https://issues.apache.org/jira/browse/DRILL-3150
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.0.0
>Reporter: Adam Gilmore
>Assignee: Parth Chandra
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: DRILL-3150.1.patch.txt
>
>
> The following query throws an exception:
> {code}
> select count(*) from cp.`employee.json` where `blah` = 'test'
> {code}
> "blah" does not exist as a field in the JSON.  The expected behaviour would 
> be to filter out all rows as that field is not present (thus cannot equal the 
> string 'test').
> Instead, the following exception occurs:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: test
> Fragment 0:0
> [Error Id: 5d6c9a82-8f87-41b2-a496-67b360302b76 on 
> ip-10-1-50-208.ec2.internal:31010]
> {code}
> Apart from the fact the real error message is hidden, the issue is that we're 
> trying to cast the varchar to int ('test' to an int).  This seems to be 
> because the projection out of the scan when a field is not found becomes 
> INT:OPTIONAL.
> The filter should not fail on this - if the varchar fails to convert to an 
> int, the filter should just simply not allow any records through.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3403) <Unicode 6-hex-digit escape> handled incorrectly

2015-06-26 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3403:
-

 Summary: <Unicode 6-hex-digit escape> handled incorrectly
 Key: DRILL-3403
 URL: https://issues.apache.org/jira/browse/DRILL-3403
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


The {{<Unicode 6-hex-digit escape>}} syntax ({{+XXXXXX}} after the escape 
character in a {{U&'...' UESCAPE '...'}} literal) is not handled correctly.

In particular, the parser doesn't seem to recognize that the first character 
after the escape character is a "{{+}}"; it takes the first three hex digits 
and decodes them into a character, and then it takes the next three hex-digit 
characters as plain characters.

In the following, note how the part with a backslash followed by 
{{+000043}} yields a NULL (as evidenced by the unaligned trailing vertical 
bar) and "043" instead of yielding "C":

{noformat}
0: jdbc:drill:zk=local> SELECT  U&'\0041 2 \+000043'  UESCAPE '\' FROM 
INFORMATION_SCHEMA.CATALOGS;
+---+
|  EXPR$0   |
+---+
| A 2 043  |
+---+
1 row selected (0.253 seconds)
0: jdbc:drill:zk=local> 
{noformat}

(This means that Drill can't accept character string literals containing 
characters beyond code point U+FFFF.)
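
As a sanity check on the intended semantics (assumed from the description 
above, not taken from Drill code), decoding the 6-hex-digit escape by hand 
does give "C":

{code}
// The escape \+000043 should denote code point U+000043, i.e. 'C'.
// Class name is ours, for illustration only.
public class UnicodeEscapeCheck {
  public static void main(String[] args) {
    int codePoint = Integer.parseInt("000043", 16);               // 67
    System.out.println(new String(Character.toChars(codePoint))); // prints "C"
  }
}
{code}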





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3402) Throw exception when attempting to partition for format that don't support

2015-06-26 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-3402:
---
Assignee: Jinfeng Ni

> Throw exception when attempting to partition for format that don't support
> --
>
> Key: DRILL-3402
> URL: https://issues.apache.org/jira/browse/DRILL-3402
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Steven Phillips
>Assignee: Jinfeng Ni
> Attachments: DRILL-3402.patch
>
>
> CTAS auto-partitioning only works with Parquet output, so we need to make 
> sure we catch it if the output format is set to something other than Parquet. 
> Since CTAS is only supported for the FileSystem storage, that means we only 
> have to handle it for the various FormatPlugins.
> I will add a method to the FormatPlugin interface, supportAutoPartitioning(), 
> which will indicate whether it is supported. If it is not supported, and the 
> statement contains a partition clause, it will throw an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1673) Flatten function can not work well with nested arrays.

2015-06-26 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1673:
---
Attachment: DRILL-1673-reopened.patch

> Flatten function can not work well with nested arrays.
> --
>
> Key: DRILL-1673
> URL: https://issues.apache.org/jira/browse/DRILL-1673
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 0.7.0
> Environment: 0.7.0
>Reporter: Hao Zhu
>Assignee: Jason Altekruse
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: DRILL-1673-reopened.patch, DRILL-1673.patch, error.log
>
>
> Flatten function failed to scan nested arrays , for example something like 
> ""[[ ]]"".
> The only difference of JSON files between below 2 tests is 
> "num":[1,2,3]
> VS
> "num":[[1,2,3]]
> ==Test 1 (Works well):==
> file:
> {code}
> {"fixed_column":"abc", "list_column":[{"id1":"1","name":"zhu", "num": 
> [1,2,3]}, {"id1":"2","name":"hao", "num": [4,5,6]} ]}
> {code}
> SQL:
> {code}
> 0: jdbc:drill:zk=local> select t.`fixed_column` as fixed_column, 
> flatten(t.`list_column`)  from 
> dfs.root.`/Users/hzu/Documents/sharefolder/hp/n2.json` as t;
> +--++
> | fixed_column |   EXPR$1   |
> +--++
> | abc  | {"id1":"1","name":"zhu","num":[1,2,3]} |
> | abc  | {"id1":"2","name":"hao","num":[4,5,6]} |
> +--++
> 2 rows selected (0.154 seconds)
> {code}
> ==Test 2 (Failed):==
> file:
> {code}
> {"fixed_column":"abc", "list_column":[{"id1":"1","name":"zhu", "num": 
> [[1,2,3]]}, {"id1":"2","name":"hao", "num": [[4,5,6]]} ]}
> {code}
> SQL:
> {code}
> 0: jdbc:drill:zk=local>  select t.`fixed_column` as fixed_column, 
> flatten(t.`list_column`)  from 
> dfs.root.`/Users/hzu/Documents/sharefolder/hp/n3.json` as t;
> +--++
> | fixed_column |   EXPR$1   |
> +--++
> Query failed: Failure while running fragment.[ 
> df28347b-fac1-497d-b9c5-a313ba77aa4d on 10.250.0.115:31010 ]
>   (java.lang.UnsupportedOperationException) 
> 
> org.apache.drill.exec.vector.complex.RepeatedListVector$RepeatedListTransferPair.splitAndTransfer():339
> 
> org.apache.drill.exec.vector.complex.RepeatedMapVector$SingleMapTransferPair.splitAndTransfer():305
> org.apache.drill.exec.test.generated.FlattenerGen22.flattenRecords():93
> 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.doWork():152
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():89
> 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext():118
> org.apache.drill.exec.record.AbstractRecordBatch.next():106
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():124
> org.apache.drill.exec.record.AbstractRecordBatch.next():86
> org.apache.drill.exec.record.AbstractRecordBatch.next():76
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():52
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
> org.apache.drill.exec.record.AbstractRecordBatch.next():106
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():124
> org.apache.drill.exec.physical.impl.BaseRootExec.next():67
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():122
> org.apache.drill.exec.physical.impl.BaseRootExec.next():57
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():105
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run():249
> ...():0
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
> query.
>   at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
>   at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
>   at sqlline.SqlLine.print(SqlLine.java:1809)
>   at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
>   at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
>   at sqlline.SqlLine.dispatch(SqlLine.java:889)
>   at sqlline.SqlLine.begin(SqlLine.java:763)
>   at sqlline.SqlLine.start(SqlLine.java:498)
>   at sqlline.SqlLine.main(SqlLine.java:460)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2046) Merge join inconsistent results

2015-06-26 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-2046:
--
Fix Version/s: (was: 1.1.0)
   1.2.0

> Merge join inconsistent results
> ---
>
> Key: DRILL-2046
> URL: https://issues.apache.org/jira/browse/DRILL-2046
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: widestrings_small.parquet
>
>
> git.commit.id.abbrev=a418af1
> The below queries should result in the same no of records. However the counts 
> do not match when we use merge join.
> {code}
> alter session set `planner.enable_hashjoin` = false;
> select ws1.* from widestrings_small ws1 INNER JOIN widestrings_small ws2 on 
> ws1.str_fixed_null_empty=ws2.str_var_null_empty where 
> ws1.str_fixed_null_empty is not null and ws2.str_var_null_empty is not null 
> and ws1.tinyint_var > 120;
> 6 records
> select count(*) from widestrings_small ws1 INNER JOIN widestrings_small ws2 
> on ws1.str_fixed_null_empty=ws2.str_var_null_empty where 
> ws1.str_fixed_null_empty is not null and ws2.str_var_null_empty is not null 
> and ws1.tinyint_var > 120;
> 60 records
> select count(ws1.str_var) from widestrings_small ws1 INNER JOIN 
> widestrings_small ws2 on ws1.str_fixed_null_empty=ws2.str_var_null_empty 
> where ws1.str_fixed_null_empty is not null and ws2.str_var_null_empty is not 
> null and ws1.tinyint_var > 120;
> 4 records
> {code}
> For hash join all the above queries result in 60 records. I attached the 
> dataset used. Let me know if you have any questions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3402) Throw exception when attempting to partition for format that don't support

2015-06-26 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-3402:
---
Attachment: DRILL-3402.patch

> Throw exception when attempting to partition for format that don't support
> --
>
> Key: DRILL-3402
> URL: https://issues.apache.org/jira/browse/DRILL-3402
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Steven Phillips
> Attachments: DRILL-3402.patch
>
>
> CTAS auto-partitioning only works with Parquet output, so we need to make 
> sure we catch it if the output format is set to something other than Parquet. 
> Since CTAS is only supported for the FileSystem storage, that means we only 
> have to handle it for the various FormatPlugins.
> I will add a method to the FormatPlugin interface, supportAutoPartitioning(), 
> which will indicate whether it is supported. If it is not supported, and the 
> statement contains a partition clause, it will throw an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3402) Throw exception when attempting to partition for format that don't support

2015-06-26 Thread Steven Phillips (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603510#comment-14603510
 ] 

Steven Phillips commented on DRILL-3402:


Created reviewboard https://reviews.apache.org/r/35941/


> Throw exception when attempting to partition for format that don't support
> --
>
> Key: DRILL-3402
> URL: https://issues.apache.org/jira/browse/DRILL-3402
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Steven Phillips
> Attachments: DRILL-3402.patch
>
>
> CTAS auto-partitioning only works with Parquet output, so we need to make 
> sure we catch it if the output format is set to something other than Parquet. 
> Since CTAS is only supported for the FileSystem storage, that means we only 
> have to handle it for the various FormatPlugins.
> I will add a method to the FormatPlugin interface, supportAutoPartitioning(), 
> which will indicate whether it is supported. If it is not supported, and the 
> statement contains a partition clause, it will throw an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3402) Throw exception when attempting to partition for format that don't support

2015-06-26 Thread Steven Phillips (JIRA)
Steven Phillips created DRILL-3402:
--

 Summary: Throw exception when attempting to partition for format 
that don't support
 Key: DRILL-3402
 URL: https://issues.apache.org/jira/browse/DRILL-3402
 Project: Apache Drill
  Issue Type: Bug
Reporter: Steven Phillips


CTAS auto-partitioning only works with Parquet output, so we need to make sure 
we catch it if the output format is set to something other than Parquet. Since 
CTAS is only supported for the FileSystem storage, that means we only have to 
handle it for the various FormatPlugins.

I will add a method to the FormatPlugin interface, supportAutoPartitioning(), 
which will indicate whether it is supported. If it is not supported, and the 
statement contains a partition clause, it will throw an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3382) CTAS with order by clause fails with IOOB exception

2015-06-26 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-3382:
-
Fix Version/s: (was: 1.1.0)
   1.2.0

> CTAS with order by clause fails with IOOB exception
> ---
>
> Key: DRILL-3382
> URL: https://issues.apache.org/jira/browse/DRILL-3382
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Parth Chandra
>Assignee: Jinfeng Ni
>Priority: Critical
> Fix For: 1.2.0
>
>
> The query :
> {panel}
> create table `lineitem__5`  as select l_suppkey, l_partkey, l_linenumber from 
> cp.`tpch/lineitem.parquet` l order by l_linenumber;
> {panel}
> fails with an IOOB exception
> Trace in log - 
> {panel}
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index (2) must be less than size (2)
> [Error Id: 3351dcf3-032f-4d10-b2a4-c42959d0c06a on localhost:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:737)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:839)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:781)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:783)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:892) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: Error while 
> applying rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying 
> rule ExpandConversionRule, args 
> [rel#9013:AbstractConverter.PHYSICAL.SINGLETON([]).[2](input=rel#9011:Subset#7.PHYSICAL.ANY([]).[2],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[2])]
> at org.apache.calcite.util.Util.newInternal(Util.java:790) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:795)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:316) 
> ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:260)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.convertToPrel(CreateTableHandler.java:120)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:99)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:903) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:242) 
> [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> ..

[jira] [Updated] (DRILL-3030) Foreman hangs trying to cancel non-root fragments

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-3030:

Fix Version/s: (was: 1.1.0)
   1.2.0

> Foreman hangs trying to cancel non-root fragments
> -
>
> Key: DRILL-3030
> URL: https://issues.apache.org/jira/browse/DRILL-3030
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.0.0
>Reporter: Ramana Inukonda Nagaraj
>Assignee: Sudheesh Katkam
> Fix For: 1.2.0
>
> Attachments: threadstack
>
>
> Steps to repro:
> 1. Ran a long-running query on a clean drill restart. 
> 2. Killed a non-foreman node. 
> 3. Restarted drillbits using clush.
> One of the drillbits (coincidentally, always a foreman node) refused to 
> shut down. 
> Jstack shows that the foreman is waiting 
> {code}
>   at 
> org.apache.drill.exec.rpc.ReconnectingConnection$ConnectionListeningFuture.waitAndRun(ReconnectingConnection.java:105)
> at 
> org.apache.drill.exec.rpc.ReconnectingConnection.runCommand(ReconnectingConnection.java:81)
> - locked <0x00073878aaa8> (a 
> org.apache.drill.exec.rpc.control.ControlConnectionManager)
> at 
> org.apache.drill.exec.rpc.control.ControlTunnel.cancelFragment(ControlTunnel.java:57)
> at 
> org.apache.drill.exec.work.foreman.QueryManager.cancelExecutingFragments(QueryManager.java:192)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:824)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:768)
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:770)
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:871)
> at 
> org.apache.drill.exec.work.foreman.Foreman.access$2700(Foreman.java:107)
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateListener.moveToState(Foreman.java:1132)
> at 
> org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:460)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3257) TPCDS query 74 results in a StackOverflowError on Scale Factor 1

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-3257:

Fix Version/s: (was: 1.1.0)
   1.2.0

> TPCDS query 74 results in a StackOverflowError on Scale Factor 1
> 
>
> Key: DRILL-3257
> URL: https://issues.apache.org/jira/browse/DRILL-3257
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Rahul Challapalli
>Assignee: Chris Westin
> Fix For: 1.2.0
>
> Attachments: error.log
>
>
> git.commit.id.abbrev=5f26b8b
> Query :
> {code}
> WITH year_total 
>  AS (SELECT c_customer_id customer_id, 
> c_first_name customer_first_name, 
> c_last_name  customer_last_name, 
> d_year   AS year1, 
> Sum(ss_net_paid) year_total, 
> 's'  sale_type 
>  FROM   customer, 
> store_sales, 
> date_dim 
>  WHERE  c_customer_sk = ss_customer_sk 
> AND ss_sold_date_sk = d_date_sk 
> AND d_year IN ( 1999, 1999 + 1 ) 
>  GROUP  BY c_customer_id, 
>c_first_name, 
>c_last_name, 
>d_year 
>  UNION ALL 
>  SELECT c_customer_id customer_id, 
> c_first_name customer_first_name, 
> c_last_name  customer_last_name, 
> d_year   AS year1, 
> Sum(ws_net_paid) year_total, 
> 'w'  sale_type 
>  FROM   customer, 
> web_sales, 
> date_dim 
>  WHERE  c_customer_sk = ws_bill_customer_sk 
> AND ws_sold_date_sk = d_date_sk 
> AND d_year IN ( 1999, 1999 + 1 ) 
>  GROUP  BY c_customer_id, 
>c_first_name, 
>c_last_name, 
>d_year) 
> SELECT t_s_secyear.customer_id, 
>t_s_secyear.customer_first_name, 
>t_s_secyear.customer_last_name 
> FROM   year_total t_s_firstyear, 
>year_total t_s_secyear, 
>year_total t_w_firstyear, 
>year_total t_w_secyear 
> WHERE  t_s_secyear.customer_id = t_s_firstyear.customer_id 
>AND t_s_firstyear.customer_id = t_w_secyear.customer_id 
>AND t_s_firstyear.customer_id = t_w_firstyear.customer_id 
>AND t_s_firstyear.sale_type = 's' 
>AND t_w_firstyear.sale_type = 'w' 
>AND t_s_secyear.sale_type = 's' 
>AND t_w_secyear.sale_type = 'w' 
>AND t_s_firstyear.year1 = 1999 
>AND t_s_secyear.year1 = 1999 + 1 
>AND t_w_firstyear.year1 = 1999 
>AND t_w_secyear.year1 = 1999 + 1 
>AND t_s_firstyear.year_total > 0 
>AND t_w_firstyear.year_total > 0 
>AND CASE 
>  WHEN t_w_firstyear.year_total > 0 THEN t_w_secyear.year_total / 
> t_w_firstyear.year_total 
>  ELSE NULL 
>END > CASE 
>WHEN t_s_firstyear.year_total > 0 THEN 
>t_s_secyear.year_total / 
>t_s_firstyear.year_total 
>ELSE NULL 
>  END 
> ORDER  BY 1, 
>   2, 
>   3
> LIMIT 100;
> {code}
> The above query never returns. I attached the log file.
> Since the data is 1GB I cannot attach it here. Kindly reach out to me if you 
> want more information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2625) org.apache.drill.common.StackTrace should follow standard stacktrace format

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2625:

Fix Version/s: (was: 1.1.0)
   1.2.0

> org.apache.drill.common.StackTrace should follow standard stacktrace format
> ---
>
> Key: DRILL-2625
> URL: https://issues.apache.org/jira/browse/DRILL-2625
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Daniel Barclay (Drill)
>Assignee: Chris Westin
> Fix For: 1.2.0
>
>
> org.apache.drill.common.StackTrace uses a different textual format than JDK's 
> standard format for stack traces.
> It should probably use the standard format so that its stack trace output can 
> be used by tools that already can parse the standard format to provide 
> functionality such as displaying the corresponding source.
> (After correcting for DRILL-2624, StackTrace formats stack traces like this:
> org.apache.drill.common.StackTrace.:1
> org.apache.drill.exec.server.Drillbit.run:20
> org.apache.drill.jdbc.DrillConnectionImpl.:232
> The normal form is like this:
>   at 
> org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:162)
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:75)
>   at com.google.common.io.Closeables.close(Closeables.java:77)
> )



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3251) TPC-DS SF 100 query 37 & 82 fail with possible memory leak

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-3251:

Fix Version/s: (was: 1.1.0)
   1.2.0

> TPC-DS SF 100 query 37 & 82 fail with possible memory leak
> --
>
> Key: DRILL-3251
> URL: https://issues.apache.org/jira/browse/DRILL-3251
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
>Reporter: Abhishek Girish
>Assignee: Chris Westin
> Fix For: 1.2.0
>
> Attachments: drillbit_memoryleak.log, query 37 - explain plan with 
> all attributes.txt, query 82 - explain plan with all attributes.txt
>
>
> Queries can be found here: 
> https://github.com/Agirish/tpcds/blob/master/query37.sql & 
> https://github.com/Agirish/tpcds/blob/master/query82.sql
> Dataset: SF 100 - Parquet
> Environment: 4 nodes - 48Gigs of Direct memory for Drill on each node
> Both queries fail with exceptions: 
> *Query 37 failed: *
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> java.lang.IllegalStateException: Failure while closing accountor.  Expected 
> private and shared pools to be set to initial values.  However, one or more 
> were not.  Stats are
>   zone    init    allocated   delta 
>   private 100010000 
>   shared  000 99989990666 9334.
> Fragment 5:72
> *Query 82 failed: *
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> java.lang.IllegalStateException: Failure while closing accountor.  Expected 
> private and shared pools to be set to initial values.  However, one or more 
> were not.  Stats are
>   zone    init    allocated   delta 
>   private 10009990666 9334 
>   shared  000 000 0.
> Fragment 5:47
> Query plans attached. Log attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3040) Accountor drill.exec.memory.enable_frag_limit not defaulted normally

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-3040:

Fix Version/s: (was: 1.1.0)
   1.2.0

> Accountor drill.exec.memory.enable_frag_limit not defaulted normally
> 
>
> Key: DRILL-3040
> URL: https://issues.apache.org/jira/browse/DRILL-3040
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Daniel Barclay (Drill)
>Assignee: Chris Westin
> Fix For: 1.2.0
>
>
> Defaulting of drill.exec.memory.enable_frag_limit is implemented "manually" 
> in Accountor (by catching ConfigException) rather than by using the normal 
> configuration file hierarchy (i.e., defining the default in the base file).
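> The difference, roughly (a sketch against the Typesafe Config API, not the 
> actual Accountor code; the default value shown is illustrative):
> {code}
> import com.typesafe.config.Config;
> import com.typesafe.config.ConfigException;
>
> class FragLimitSetting {
>   // Current approach: hand-rolled defaulting by catching ConfigException.
>   static boolean manualDefault(Config config) {
>     try {
>       return config.getBoolean("drill.exec.memory.enable_frag_limit");
>     } catch (ConfigException e) {
>       return false;  // illustrative default
>     }
>   }
>
>   // Normal approach: declare the default once in the base file, e.g. in
>   // drill-default.conf:  drill.exec.memory.enable_frag_limit: false
>   // and then read it unconditionally:
>   static boolean fromBaseFile(Config config) {
>     return config.getBoolean("drill.exec.memory.enable_frag_limit");
>   }
> }
> {code}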



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2421) ensure all allocators for a query are descendants of a single root

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2421:

Fix Version/s: (was: 1.1.0)
   Future

> ensure all allocators for a query are descendants of a single root
> --
>
> Key: DRILL-2421
> URL: https://issues.apache.org/jira/browse/DRILL-2421
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Chris Westin
>Assignee: Chris Westin
> Fix For: Future
>
>
> In order to help improve usage tracking, allocations for a single query 
> should all roll up to a single root.
> This requires that the Foreman create that root, and label it, and then pass 
> that along to anyone else that needs to create additional sub-allocators. The 
> patch for DRILL-2406 introduces the creation of a new allocator in 
> QueryContext, but this is currently a child of the Drillbit's 
> TopLevelAllocator, violating the principle above. This is a reminder to fix 
> that after the dependencies above are available.
> As well as the known case in QueryContext, check to make sure other locations 
> aren't creating new children from the DrillbitContext, but are using the 
> allocator from the FragmentContext instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1942) Improve off-heap memory usage tracking

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-1942:

Fix Version/s: (was: 1.1.0)
   1.2.0

> Improve off-heap memory usage tracking
> --
>
> Key: DRILL-1942
> URL: https://issues.apache.org/jira/browse/DRILL-1942
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Reporter: Chris Westin
>Assignee: Chris Westin
> Fix For: 1.2.0
>
> Attachments: DRILL-1942.1.patch.txt, DRILL-1942.2.patch.txt, 
> DRILL-1942.3.patch.txt
>
>
> We're using a lot more memory than we think we should. We may be leaking it, 
> or not releasing it as soon as we could. 
> This is a call to come up with some improved tracking so that we can get 
> statistics out about exactly where we're using it, and whether or not we can 
> release it earlier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2515) Add allocator state verification at the end of test suites

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2515:

Fix Version/s: (was: 1.1.0)
   1.3.0

> Add allocator state verification at the end of test suites
> --
>
> Key: DRILL-2515
> URL: https://issues.apache.org/jira/browse/DRILL-2515
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 0.8.0
>Reporter: Chris Westin
>Assignee: Chris Westin
> Fix For: 1.3.0
>
>
> In order to speed up testing, we've set up maven so that it reuses the JVM 
> between test classes in "mvn install". However, this means that if there are 
> missing closures of allocators or other resources, we don't find out until 
> much later, after the offending test, instead of in the test, as we would 
> when running a JUnit test from an IDE.
> The suggestion here is to add @AfterClass methods to base test classes that 
> do some kind of verification. We'll need to add something to the top level 
> allocator that does the same verification it does now in close(), but allows 
> us to spot check it between tests. 
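> A minimal sketch of such a hook (assertNoLeakedAllocations() is hypothetical; 
> it stands in for the verification that close() performs today):
> {code}
> import org.junit.AfterClass;
>
> public class BaseTestQuery {
>   // Runs after each test class even though the JVM is reused by "mvn install",
>   // so leaked allocators are caught near the offending test.
>   @AfterClass
>   public static void verifyAllocators() {
>     // Hypothetical helper exposing the same check close() does now.
>     TopLevelAllocator.assertNoLeakedAllocations();
>   }
> }
> {code}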



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2626) org.apache.drill.common.StackTrace seems to have duplicate code; should we re-use Throwable's code?

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2626:

Fix Version/s: (was: 1.1.0)
   1.2.0

> org.apache.drill.common.StackTrace seems to have duplicate code; should we 
> re-use Throwable's code?
> ---
>
> Key: DRILL-2626
> URL: https://issues.apache.org/jira/browse/DRILL-2626
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Daniel Barclay (Drill)
>Assignee: Chris Westin
> Fix For: 1.2.0
>
>
> It seems that class org.apache.drill.common.StackTrace needlessly duplicates 
> code that's already in the JDK.
> In particular, it has code to format the stack trace.  That seems at least 
> mostly redundant with the formatting code already in java.lang.Throwable.
> StackTrace does have a comment about eliminating the StackTrace constructor 
> from the stack trace.  However, StackTrace does _not_ actually eliminate its 
> constructor from the stack trace (e.g., its stack traces start with 
> "org.apache.drill.common.StackTrace.:...").
> Should StackTrace be implemented by simply subclassing Throwable?  
> That would eliminate StackTrace's current formatting code (which would also 
> eliminate the difference between StackTrace's format and the standard format).
> That should also eliminate having the StackTrace constructor's stack frame 
> show up in the stack trace.  (Throwable's constructor/fillInStackTrace 
> already handles that.)
> (Having "StackTrace extends Throwable" isn't ideal, since StackTrace is not 
> intended to be a kind of exception, but that would probably be better than 
> the current form, given the bugs StackTrace has/has had (DRILL-2624, 
> DRILL-2625).
> That non-ideal subclassing could be eliminated by having a member variable of 
> type Throwable that is constructed during StackTrace's construction, although 
> that would either cause the StackTrace constructor to re-appear in the stack 
> trace or require a non-trivial workaround to re-eliminate it.
> Perhaps client code should simply use "new Throwable()" to capture the stack 
> trace and a static method on a utility class to format the stack trace into 
> a String.)
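> For instance, the member-variable variant could look like this (sketch only):
> {code}
> import java.io.PrintWriter;
> import java.io.StringWriter;
>
> public class StackTrace {
>   // Let Throwable capture the frames and reuse its standard formatting.
>   private final Throwable capture = new Throwable();
>
>   @Override
>   public String toString() {
>     StringWriter sw = new StringWriter();
>     capture.printStackTrace(new PrintWriter(sw));
>     // Note: without extra trimming this still includes the StackTrace
>     // constructor's frame, as discussed above.
>     return sw.toString();
>   }
> }
> {code}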



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3095) Memory Leak : Failure while closing accountor.

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-3095:

Fix Version/s: (was: 1.1.0)
   1.2.0

> Memory Leak : Failure while closing accountor.
> --
>
> Key: DRILL-3095
> URL: https://issues.apache.org/jira/browse/DRILL-3095
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.0.0
> Environment: f7f6efc525cd833ce1530deae32eb9ccb20b664a
>Reporter: Khurram Faraaz
>Assignee: Chris Westin
> Fix For: 1.2.0
>
>
> I am seeing a memory leak when I cancel a long-running query on sqlline. I am 
> re-running the query with assertions enabled, and will add details after the 
> second run is complete.
> Long running query was,
> {code}
> select key1, key2 from `twoKeyJsn.json`;
> {code}
> I did Ctrl-C while the above query was running on sqlline, and then issued the 
> below query, which returned correct results. Then I saw a memory leak 
> message in drillbit.log.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(*) from `twoKeyJsn.json`;
> ++
> |   EXPR$0   |
> ++
> | 26212355   |
> ++
> 1 row selected (14.734 seconds)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-05-15 00:59:01,951 [2aaabb50-3afd-2906-3f48-eb86a315a1f5:frag:0:0] WARN  
> o.a.drill.exec.ops.SendingAccountor - Interrupted while waiting for send 
> complete. Continuing to wait.
> java.lang.InterruptedException: null
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1301)
>  ~[na:1.7.0_45]
> at java.util.concurrent.Semaphore.acquire(Semaphore.java:472) 
> ~[na:1.7.0_45]
> at 
> org.apache.drill.exec.ops.SendingAccountor.waitForSendComplete(SendingAccountor.java:48)
>  ~[drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.FragmentContext.waitForSendComplete(FragmentContext.java:436)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.close(BaseRootExec.java:112) 
> [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.close(ScreenCreator.java:141)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:333)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:278)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 2015-05-15 00:59:01,952 [2aaabb50-3afd-2906-3f48-eb86a315a1f5:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2aaabb50-3afd-2906-3f48-eb86a315a1f5:0:0: State change requested from 
> CANCELLATION_REQUESTED --> FAILED for
> 2015-05-15 00:59:01,952 [2aaabb50-3afd-2906-3f48-eb86a315a1f5:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2aaabb50-3afd-2906-3f48-eb86a315a1f5:0:0: State change requested from FAILED 
> --> FAILED for
> 2015-05-15 00:59:01,952 [2aaabb50-3afd-2906-3f48-eb86a315a1f5:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2aaabb50-3afd-2906-3f48-eb86a315a1f5:0:0: State change requested from FAILED 
> --> FINISHED for
> 2015-05-15 00:59:01,956 [2aaabb50-3afd-2906-3f48-eb86a315a1f5:frag:0:0] ERROR 
> o.a.d.c.exceptions.UserException - SYSTEM ERROR: 
> java.lang.IllegalStateException: Failure while closing accountor.  Expected 
> private and shared pools to be set to initial values.  However, one or more 
> were not.  Stats are
> zone    init     allocated  delta
> private 1000000  918080     81920
> shared  0        0          0.
> Fragment 0:0
> [Error Id: 90ced8b1-b6db-438f-b193-b7634de31b81 on centos-03.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> java.lang.IllegalStateException: Failure while closing accountor.  Expected 
> private and shared pools to be set to initial values.  However, one or more 
> were not.  Stats are
> zone    init     allocated  delta
> private 1000000  918080     81920
> shared  00

[jira] [Updated] (DRILL-3150) Error when filtering non-existent field with a string

2015-06-26 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-3150:

Fix Version/s: (was: 1.1.0)
   1.2.0

> Error when filtering non-existent field with a string
> -
>
> Key: DRILL-3150
> URL: https://issues.apache.org/jira/browse/DRILL-3150
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.0.0
>Reporter: Adam Gilmore
>Assignee: Adam Gilmore
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: DRILL-3150.1.patch.txt
>
>
> The following query throws an exception:
> {code}
> select count(*) from cp.`employee.json` where `blah` = 'test'
> {code}
> "blah" does not exist as a field in the JSON.  The expected behaviour would 
> be to filter out all rows as that field is not present (thus cannot equal the 
> string 'test').
> Instead, the following exception occurs:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: test
> Fragment 0:0
> [Error Id: 5d6c9a82-8f87-41b2-a496-67b360302b76 on 
> ip-10-1-50-208.ec2.internal:31010]
> {code}
> Apart from the fact that the real error message is hidden, the issue is that 
> we're 
> trying to cast the varchar to int ('test' to an int).  This seems to be 
> because the projection out of the scan when a field is not found becomes 
> INT:OPTIONAL.
> The filter should not fail on this - if the varchar fails to convert to an 
> int, the filter should simply not allow any records through.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3401) How to use Snappy compression on Parquet table?

2015-06-26 Thread Kathy Qiu (JIRA)
Kathy Qiu created DRILL-3401:


 Summary: How to use Snappy compression on Parquet table?
 Key: DRILL-3401
 URL: https://issues.apache.org/jira/browse/DRILL-3401
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.0.0
 Environment: Cloudera 5.4.2 AWS cluster
Reporter: Kathy Qiu
Assignee: Daniel Barclay (Drill)


To use Snappy compression on a Parquet table I created, these are the commands 
I used:

alter session set `store.format`='parquet';
alter session set `store.parquet.compression`='snappy';
create table  as (select 
cast (columns[0] as DECIMAL(10,0)) 
etc...
from dfs.``);

Does this suffice? Or do I need to specify Snappy compression in the CTAS 
command as well?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3376) Reading individual files created by CTAS with partition causes an exception

2015-06-26 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips resolved DRILL-3376.

Resolution: Fixed

Fixed by 5f0e4cbd0f49600c41abf38056bcd29849c5cdf9

> Reading individual files created by CTAS with partition causes an exception
> ---
>
> Key: DRILL-3376
> URL: https://issues.apache.org/jira/browse/DRILL-3376
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Writer
>Affects Versions: 1.1.0
>Reporter: Parth Chandra
>Assignee: Steven Phillips
> Fix For: 1.1.0
>
>
> Create a table using CTAS with partitioning:
> {code}
> create table `lineitem_part` partition by (l_moddate) as select l.*, 
> l_shipdate - extract(day from l_shipdate) + 1 l_moddate from 
> cp.`tpch/lineitem.parquet` l
> {code}
> Then the following query causes an exception
> {code}
> select distinct l_moddate from `lineitem_part/0_0_1.parquet` where l_moddate 
> = date '1992-01-01';
> {code}
> Trace in the log file - 
> {panel}
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: 0
> at java.lang.String.charAt(String.java:658) ~[na:1.7.0_65]
> at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule$PathPartition.(PruneScanRule.java:493)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:385)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule$4.onMatch(PruneScanRule.java:278)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> ... 13 common frames omitted
> {panel}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3398) WebServer is leaking memory for queries submitted through REST API or WebUI

2015-06-26 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti resolved DRILL-3398.

Resolution: Fixed

Fixed in 
[60bc945|https://github.com/apache/drill/commit/60bc9459bd8ef29e9d90ffe885771090ab658a40].

> WebServer is leaking memory for queries submitted through REST API or WebUI
> ---
>
> Key: DRILL-3398
> URL: https://issues.apache.org/jira/browse/DRILL-3398
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
> Fix For: 1.1.0
>
> Attachments: DRILL-3398-1.patch
>
>
> 1. Start embedded drillbit
> 2. Submit queries through WebUI or REST APIs
> 3. Shut down the drillbit. At this point TopLevelAllocator's close() prints 
> out the leaked pools.
> [~sudheeshkatkam] and I looked into the issue; it turns out we don't release 
> the RecordBatchLoader in QueryWrapper.
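> The fix pattern, roughly (a sketch of per-batch cleanup, not the exact patch):
> {code}
> RecordBatchLoader loader = new RecordBatchLoader(allocator);
> try {
>   loader.load(batch.getHeader().getDef(), batch.getData());
>   // ... copy the rows out for the REST/WebUI response ...
> } finally {
>   loader.clear();   // release the loaded buffers
>   batch.release();  // release the incoming QueryDataBatch
> }
> {code}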



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2100) Drill not deleting spooling files

2015-06-26 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-2100:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> Drill not deleting spooling files
> -
>
> Key: DRILL-2100
> URL: https://issues.apache.org/jira/browse/DRILL-2100
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.8.0
>Reporter: Abhishek Girish
>Assignee: Steven Phillips
> Fix For: 1.2.0
>
>
> Currently, forcing queries to use an external sort by switching off 
> hash join/agg causes spill-to-disk files to accumulate. 
> This causes issues with disk space availability when the spill is configured 
> to be on the local file system (/tmp/drill). It is also not optimal when 
> configured to use DFS (custom). 
> Drill must clean up all temporary files created after a query completes or 
> after a drillbit restart. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2930) Need detail of Drillbit version in output of sys.drillbits

2015-06-26 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-2930:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> Need detail of Drillbit version in output of sys.drillbits
> --
>
> Key: DRILL-2930
> URL: https://issues.apache.org/jira/browse/DRILL-2930
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: 0.9.0
>Reporter: Khurram Faraaz
>Assignee: Steven Phillips
>Priority: Minor
> Fix For: 1.2.0
>
>
> On a cluster setup, where there are several nodes and each node has a 
> Drillbit, it would help to expose the Drillbit version as one of the columns 
> in the output of sys.drillbits. That would let users verify that they are 
> running the same version of the Drillbit on each of the nodes in the cluster.
> Today, we need to manually query sys.version on each of the nodes to know the 
> version of the Drillbit running on that node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3249) CTAS from an empty TSV file fails

2015-06-26 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-3249:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> CTAS from an empty TSV file fails
> -
>
> Key: DRILL-3249
> URL: https://issues.apache.org/jira/browse/DRILL-3249
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.0.0
>Reporter: Stuart Hayes
>Assignee: Steven Phillips
>  Labels: csv
> Fix For: 1.2.0
>
>
> Performing a CTAS to parquet from an empty CSV file fails:
> CREATE TABLE dfs.latest.d_csi(dsa_id, uid, DN, d_csi_name, d_svc_avail, 
> d_sending_opt, d_supp_cap_ph, d_repl_hndl_lup, d_repl_hnd_interr, 
> d_opt_routing, ref_status) AS 
> SELECT cast(columns[0] AS INTEGER), columns[1], columns[2], columns[3], 
> columns[4], columns[5], columns[6], columns[7], columns[8], columns[9], 
> columns[10] FROM dfs.converted.`02062015/25/d_csi.txt`;
> Error: SYSTEM ERROR: java.lang.IllegalArgumentException: MinorFragmentId 0 
> has no read entries assigned
> This is part of a daily automated job, so the entire job fails.  An 
> empty file/table should be a valid scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3400) After shifting CTAS's data, query on CTAS table failed

2015-06-26 Thread Sean Hsuan-Yi Chu (JIRA)
Sean Hsuan-Yi Chu created DRILL-3400:


 Summary: After shifting CTAS's data, query on CTAS table failed 
 Key: DRILL-3400
 URL: https://issues.apache.org/jira/browse/DRILL-3400
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Sean Hsuan-Yi Chu
Assignee: Jinfeng Ni


A simple CTAS like this:
create table `ttt` partition by (r_regionkey) as select * from 
cp.`tpch/region.parquet`;

Without touching the data generated by CTAS, the query select * from `ttt`; 
works. 

Then I tried to reorganize the parquet files generated by CTAS as:
|-Q1/ ... .pq
   Q2/ ... .pq
   Q3/ ... .pq

However, after this manual move, any query fails. Surprisingly, even after 
moving the data back to its original place, the queries that worked before 
all fail with:

Error: PARSE ERROR: From line 1, column 15 to line 1, column 19: Table 'ttt' 
not found



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2475) Handle IterOutcome.NONE correctly in operators

2015-06-26 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-2475:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> Handle IterOutcome.NONE correctly in operators
> --
>
> Key: DRILL-2475
> URL: https://issues.apache.org/jira/browse/DRILL-2475
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.8.0
>Reporter: Venki Korukanti
>Assignee: Steven Phillips
> Fix For: 1.2.0
>
>
> Currently not all operators handle NONE (with no OK_NEW_SCHEMA) 
> correctly. This JIRA is to go through the operators, check whether each 
> handles NONE correctly, and modify them accordingly.
> (from DRILL-2453)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3180) Apache Drill JDBC storage plugin to query rdbms systems such as MySQL and Netezza from Apache Drill

2015-06-26 Thread Magnus Pierre (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603445#comment-14603445
 ] 

Magnus Pierre commented on DRILL-3180:
--

Yes, please guide me how to initiate the process.


> Apache Drill JDBC storage plugin to query rdbms systems such as MySQL and 
> Netezza from Apache Drill
> ---
>
> Key: DRILL-3180
> URL: https://issues.apache.org/jira/browse/DRILL-3180
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.0.0
>Reporter: Magnus Pierre
>Assignee: Jacques Nadeau
>  Labels: Drill, JDBC, plugin
> Attachments: pom.xml, storage-mpjdbc.zip
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> I have developed the base code for a JDBC storage-plugin for Apache Drill. 
> The code is primitive but constitutes a good starting point for further 
> coding. Today it provides basic support for SELECT against RDBMS over 
> JDBC. 
> The goal is to provide complete SELECT support against RDBMS with push down 
> capabilities.
> Currently the code is using standard JDBC classes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2774) Updated drill-patch-review.py to use git-format-patch

2015-06-26 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-2774:
---
Fix Version/s: (was: 1.1.0)
   1.2.0

> Updated drill-patch-review.py to use git-format-patch
> -
>
> Key: DRILL-2774
> URL: https://issues.apache.org/jira/browse/DRILL-2774
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Reporter: Steven Phillips
>Assignee: Steven Phillips
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: DRILL-2774.patch, DRILL-2774.patch
>
>
> The tool currently uses git diff to generate the patches, which does not 
> preserve commit information; commit information is required for patches 
> submitted to the Drill community.
> This doesn't work properly when there are multiple commits, so as part of 
> this change, we enforce the requirement that the branch used to create the 
> patch is exactly one commit ahead of, and zero commits behind, the remote 
> branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3199) GenericAccessor doesn't support isNull

2015-06-26 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603426#comment-14603426
 ] 

Jacques Nadeau commented on DRILL-3199:
---

Looks reasonable to me.  +1.

Let's get it in.

> GenericAccessor doesn't support isNull
> --
>
> Key: DRILL-3199
> URL: https://issues.apache.org/jira/browse/DRILL-3199
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
> Environment: I found this problem when calling the driver's wasNull() 
> method on a field that represented a nested JSON object (one level below the 
> root object), using the 'dfs' storage plugin and pointing at my local 
> filesystem.
>Reporter: Matt Burgess
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.1.0
>
> Attachments: DRILL-3199.patch.1, DRILL-3199.patch.2, 
> DRILL-3199.patch.3
>
>
> GenericAccessor throws an UnsupportedOperationException when isNull() is 
> called. However for other methods it delegates to its ValueVector's accessor. 
> I think it should do the same for isNull().
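> The delegation is a one-liner (sketch; v stands for the wrapped ValueVector, 
> as in the other accessor methods):
> {code}
> @Override
> public boolean isNull(int rowOffset) {
>   // Delegate to the ValueVector's accessor, as the other getters already do.
>   return v.getAccessor().isNull(rowOffset);
> }
> {code}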



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3199) GenericAccessor doesn't support isNull

2015-06-26 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-3199:
--
Fix Version/s: 1.1.0

> GenericAccessor doesn't support isNull
> --
>
> Key: DRILL-3199
> URL: https://issues.apache.org/jira/browse/DRILL-3199
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
> Environment: I found this problem when calling the driver's wasNull() 
> method on a field that represented a nested JSON object (one level below the 
> root object), using the 'dfs' storage plugin and pointing at my local 
> filesystem.
>Reporter: Matt Burgess
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.1.0
>
> Attachments: DRILL-3199.patch.1, DRILL-3199.patch.2, 
> DRILL-3199.patch.3
>
>
> GenericAccessor throws an UnsupportedOperationException when isNull() is 
> called. However for other methods it delegates to its ValueVector's accessor. 
> I think it should do the same for isNull().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3374) CTAS with PARTITION BY, partition column name from view can not be resolved

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-3374.
---
Resolution: Fixed

Fixed in commit 80270d1b687ec4cbff69fd13f1364ec77473588f.

> CTAS with PARTITION BY, partition column name from view can not be resolved
> ---
>
> Key: DRILL-3374
> URL: https://issues.apache.org/jira/browse/DRILL-3374
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
>  Labels: window_function
> Fix For: 1.1.0
>
>
> CTAS with a PARTITION BY clause fails to resolve the column name when the 
> partitioning column comes from a view.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> create table ctas_prtng_01 partition by 
> (col_vchar_52) as select * from vwOnParq_wCst;
> Error: SYSTEM ERROR: IllegalArgumentException: partition col col_vchar_52 
> could not be resolved in table's column lists!
> [Error Id: 7cb227c1-65c5-48cb-a00b-1a89a5309bc8 on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
> The table used in the above CTAS does exist, and the column used to partition 
> by also exists.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> describe vwOnParq_wCst;
> +---++--+
> |  COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
> +---++--+
> | col_int   | INTEGER| YES  |
> | col_bigint| BIGINT | YES  |
> | col_char_2| CHARACTER  | YES  |
> | col_vchar_52  | CHARACTER VARYING  | YES  |
> | col_tmstmp| TIMESTAMP  | YES  |
> | col_dt| DATE   | YES  |
> | col_booln | BOOLEAN| YES  |
> | col_dbl   | DOUBLE | YES  |
> | col_tm| TIME   | YES  |
> +---++--+
> 9 rows selected (0.411 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3399) document how to get the Drill views definition

2015-06-26 Thread Kristine Hahn (JIRA)
Kristine Hahn created DRILL-3399:


 Summary: document how to get the Drill views definition
 Key: DRILL-3399
 URL: https://issues.apache.org/jira/browse/DRILL-3399
 Project: Apache Drill
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.0.0
Reporter: Kristine Hahn


Document (if not already covered) how to get a view definition, which is the 
SQL from which the Drill view is created:
{code}
select VIEW_DEFINITION from INFORMATION_SCHEMA.VIEWS where TABLE_NAME 
='your_view_name';
{code}
In Drill, a view is just a JSON file, which will live within the workspace 
where you saved it. Example:
{code}
 create or replace view dfs.workspace.myview as select * from mytable;
{code}

It will create a file called 'myview.view.drill', which will look 
something like this:
{code}
{
  "name" : "testview",
  "sql" : "SELECT *\nFROM `drill/new.json`\nFETCH NEXT 10 ROWS ONLY",
  "fields" : [ {
    "name" : "*",
    "type" : "ANY",
    "isNullable" : true
  } ],
  "workspaceSchemaPath" : [ "dfs", "workspace" ]
}
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3377) Can't partition by expression when columns are explicitly specified in the CTAS column list

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-3377.
---
Resolution: Fixed

Fixed in commit 80270d1b687ec4cbff69fd13f1364ec77473588f.

> Can't partition by expression when columns are explicitly specified in the 
> CTAS column list
> ---
>
> Key: DRILL-3377
> URL: https://issues.apache.org/jira/browse/DRILL-3377
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.0.0
>Reporter: Victoria Markman
>Assignee: Jinfeng Ni
>  Labels: ctas
> Attachments: 
> 0001-DRILL-3377-Fix-naming-resolution-error-for-partition.patch
>
>
> Query below throws an error:
> {code:sql}
> create table test(x1, x2) partition by (x1) as 
> select sum(a1),  b1 
> from   t1 
> group by  b1;
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> create table test(x1, x2) partition by (x1) as 
> select sum(a1), b1 from t1 group by b1;
> Error: SYSTEM ERROR: IllegalArgumentException: partition col x1 could not be 
> resolved in table's column lists!
> [Error Id: ab5624e8-e4dd-4752-95af-8bc2eef5d056 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> When column aliases are used, it works:
> {code}
> 0: jdbc:drill:schema=dfs> create table test partition by (x1) as select 
> sum(a1) x1, b1 x2 from t1 group by b1;
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 0_0   | 10 |
> +---++
> 1 row selected (0.904 seconds)
> 0: jdbc:drill:schema=dfs> select * from test;
> +---++
> |  x1   |   x2   |
> +---++
> | null  | h  |
> | 2 | b  |
> | 10| j  |
> | 1 | a  |
> | 3 | c  |
> | 4 | null   |
> | 5 | e  |
> | 7 | g  |
> | 6 | f  |
> | 9 | i  |
> +---++
> 10 rows selected (0.161 seconds)
> 0: jdbc:drill:schema=dfs> select * from test order by x1;
> +---++
> |  x1   |   x2   |
> +---++
> | 1 | a  |
> | 2 | b  |
> | 3 | c  |
> | 4 | null   |
> | 5 | e  |
> | 6 | f  |
> | 7 | g  |
> | 9 | i  |
> | 10| j  |
> | null  | h  |
> +---++
> 10 rows selected (0.299 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3377) Can't partition by expression when columns are explicitly specified in the CTAS column list

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3377:
--
Fix Version/s: 1.1.0

> Can't partition by expression when columns are explicitly specified in the 
> CTAS column list
> ---
>
> Key: DRILL-3377
> URL: https://issues.apache.org/jira/browse/DRILL-3377
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.0.0
>Reporter: Victoria Markman
>Assignee: Jinfeng Ni
>  Labels: ctas
> Fix For: 1.1.0
>
> Attachments: 
> 0001-DRILL-3377-Fix-naming-resolution-error-for-partition.patch
>
>
> Query below throws an error:
> {code:sql}
> create table test(x1, x2) partition by (x1) as 
> select sum(a1),  b1 
> from   t1 
> group by  b1;
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> create table test(x1, x2) partition by (x1) as 
> select sum(a1), b1 from t1 group by b1;
> Error: SYSTEM ERROR: IllegalArgumentException: partition col x1 could not be 
> resolved in table's column lists!
> [Error Id: ab5624e8-e4dd-4752-95af-8bc2eef5d056 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> When column aliases are used, it works:
> {code}
> 0: jdbc:drill:schema=dfs> create table test partition by (x1) as select 
> sum(a1) x1, b1 x2 from t1 group by b1;
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 0_0   | 10 |
> +---++
> 1 row selected (0.904 seconds)
> 0: jdbc:drill:schema=dfs> select * from test;
> +---++
> |  x1   |   x2   |
> +---++
> | null  | h  |
> | 2 | b  |
> | 10| j  |
> | 1 | a  |
> | 3 | c  |
> | 4 | null   |
> | 5 | e  |
> | 7 | g  |
> | 6 | f  |
> | 9 | i  |
> +---++
> 10 rows selected (0.161 seconds)
> 0: jdbc:drill:schema=dfs> select * from test order by x1;
> +---++
> |  x1   |   x2   |
> +---++
> | 1 | a  |
> | 2 | b  |
> | 3 | c  |
> | 4 | null   |
> | 5 | e  |
> | 6 | f  |
> | 7 | g  |
> | 9 | i  |
> | 10| j  |
> | null  | h  |
> +---++
> 10 rows selected (0.299 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3374) CTAS with PARTITION BY, partition column name from view can not be resolved

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3374:
--
Fix Version/s: (was: 1.2.0)
   1.1.0

> CTAS with PARTITION BY, partition column name from view can not be resolved
> ---
>
> Key: DRILL-3374
> URL: https://issues.apache.org/jira/browse/DRILL-3374
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
>  Labels: window_function
> Fix For: 1.1.0
>
>
> CTAS with a PARTITION BY clause fails to resolve the column name when the 
> partitioning column comes from a view.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> create table ctas_prtng_01 partition by 
> (col_vchar_52) as select * from vwOnParq_wCst;
> Error: SYSTEM ERROR: IllegalArgumentException: partition col col_vchar_52 
> could not be resolved in table's column lists!
> [Error Id: 7cb227c1-65c5-48cb-a00b-1a89a5309bc8 on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
> The table used in the above CTAS does exist, and the column used to partition 
> by also exists.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> describe vwOnParq_wCst;
> +---++--+
> |  COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
> +---++--+
> | col_int   | INTEGER| YES  |
> | col_bigint| BIGINT | YES  |
> | col_char_2| CHARACTER  | YES  |
> | col_vchar_52  | CHARACTER VARYING  | YES  |
> | col_tmstmp| TIMESTAMP  | YES  |
> | col_dt| DATE   | YES  |
> | col_booln | BOOLEAN| YES  |
> | col_dbl   | DOUBLE | YES  |
> | col_tm| TIME   | YES  |
> +---++--+
> 9 rows selected (0.411 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-3377) Can't partition by expression when columns are explicitly specified in the CTAS column list

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni reassigned DRILL-3377:
-

Assignee: Jinfeng Ni  (was: Venki Korukanti)

> Can't partition by expression when columns are explicitly specified in the 
> CTAS column list
> ---
>
> Key: DRILL-3377
> URL: https://issues.apache.org/jira/browse/DRILL-3377
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.0.0
>Reporter: Victoria Markman
>Assignee: Jinfeng Ni
>  Labels: ctas
> Attachments: 
> 0001-DRILL-3377-Fix-naming-resolution-error-for-partition.patch
>
>
> Query below throws an error:
> {code:sql}
> create table test(x1, x2) partition by (x1) as 
> select sum(a1),  b1 
> from   t1 
> group by  b1;
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> create table test(x1, x2) partition by (x1) as 
> select sum(a1), b1 from t1 group by b1;
> Error: SYSTEM ERROR: IllegalArgumentException: partition col x1 could not be 
> resolved in table's column lists!
> [Error Id: ab5624e8-e4dd-4752-95af-8bc2eef5d056 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> When column aliases are used, it works:
> {code}
> 0: jdbc:drill:schema=dfs> create table test partition by (x1) as select 
> sum(a1) x1, b1 x2 from t1 group by b1;
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 0_0   | 10 |
> +---++
> 1 row selected (0.904 seconds)
> 0: jdbc:drill:schema=dfs> select * from test;
> +---++
> |  x1   |   x2   |
> +---++
> | null  | h  |
> | 2 | b  |
> | 10| j  |
> | 1 | a  |
> | 3 | c  |
> | 4 | null   |
> | 5 | e  |
> | 7 | g  |
> | 6 | f  |
> | 9 | i  |
> +---++
> 10 rows selected (0.161 seconds)
> 0: jdbc:drill:schema=dfs> select * from test order by x1;
> +---++
> |  x1   |   x2   |
> +---++
> | 1 | a  |
> | 2 | b  |
> | 3 | c  |
> | 4 | null   |
> | 5 | e  |
> | 6 | f  |
> | 7 | g  |
> | 9 | i  |
> | 10| j  |
> | null  | h  |
> +---++
> 10 rows selected (0.299 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

