[jira] [Commented] (DRILL-3764) Support the ability to identify and/or skip records when a function evaluation fails

2015-10-08 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949716#comment-14949716
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3764:
--

I just made comments available.

> Support the ability to identify and/or skip records when a function 
> evaluation fails
> 
>
> Key: DRILL-3764
> URL: https://issues.apache.org/jira/browse/DRILL-3764
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.1.0
>Reporter: Aman Sinha
> Fix For: Future
>
>
> Drill can point out the filename and location of corrupted records in a file 
> but it does not have a good mechanism to deal with the following scenario: 
> Consider a text file with 2 records:
> {code}
> $ cat t4.csv
> 10,2001
> 11,http://www.cnn.com
> {code}
> {code}
> 0: jdbc:drill:zk=local> alter session set `exec.errors.verbose` = true;
> 0: jdbc:drill:zk=local> select cast(columns[0] as init), cast(columns[1] as 
> bigint) from dfs.`t4.csv`;
> Error: SYSTEM ERROR: NumberFormatException: http://www.cnn.com
> Fragment 0:0
> [Error Id: 72aad22c-a345-4100-9a57-dcd8436105f7 on 10.250.56.140:31010]
>   (java.lang.NumberFormatException) http://www.cnn.com
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeL():91
> 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varCharToLong():62
> org.apache.drill.exec.test.generated.ProjectorGen1.doEval():62
> org.apache.drill.exec.test.generated.ProjectorGen1.projectRecords():62
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork():172
> {code}
> The problem is that the user does not have the context of where the error 
> occurred - either the file name or the record number. This becomes a pain 
> point especially when CTAS is used to convert data from (say) text format 
> to Parquet format. The CTAS may be accessing thousands of files, and one 
> such cast (or other function) failure aborts the query. 
> It would substantially improve the user experience if we provided: 
> 1) the filename and record number where this failure occurred
> 2) the ability to skip such records depending on a session option
> 3) the ability to write such records to a staging table for future ingestion
> Please see discussion on dev list: 
> http://mail-archives.apache.org/mod_mbox/drill-dev/201509.mbox/%3cCAFyDVvLuPLgTNZ56S6=J=9Vb=aBs=pdw7nrhkkdupbdxgfa...@mail.gmail.com%3e
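The requested skip-and-record behavior can be sketched outside Drill. Below is a minimal Python illustration (all names here are hypothetical, not Drill APIs) that keeps the file/record context the report says is lost, instead of aborting on the first bad cast:

```python
# Sketch of the requested behavior: attempt a cast per record and collect
# failures with file/row context instead of aborting the whole load.
# cast_or_skip and its arguments are illustrative names, not Drill APIs.
def cast_or_skip(rows, filename, cast=int):
    good, rejected = [], []
    for lineno, row in enumerate(rows, start=1):
        try:
            good.append([cast(v) for v in row])
        except ValueError as e:
            # Keep the context Drill currently loses: file + record number.
            rejected.append({"file": filename, "line": lineno,
                             "row": row, "error": str(e)})
    return good, rejected

# The two records from t4.csv in the report.
rows = [["10", "2001"], ["11", "http://www.cnn.com"]]
good, rejected = cast_or_skip(rows, "t4.csv")
print(good)                  # [[10, 2001]]
print(rejected[0]["line"])   # 2
```

The `rejected` list corresponds to item 3 of the proposal: failed records could be written to a staging table for future ingestion.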



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-1072) Drill is very slow when we have a large number of text files

2015-10-08 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli closed DRILL-1072.

Resolution: Fixed

> Drill is very slow when we have a large number of text files
> 
>
> Key: DRILL-1072
> URL: https://issues.apache.org/jira/browse/DRILL-1072
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization, Storage - Parquet, 
> Storage - Text & CSV
>Reporter: Rahul Challapalli
>Priority: Minor
> Fix For: Future
>
>
> git.commit.id.abbrev=efa3274
> Build# 26178
> As the total number of files under the below directory increases, Drill 
> becomes very slow. Check the results for different file counts for the 
> query below.
> All files contain just one number and have a '.tbl' extension:
> select count(*) from dfs.`/drill/testdata/morefiles`;
> 100 files --- 5.183 seconds
> 250 files --- 15.021 seconds
> 500 files --- 26.846 seconds
> 1000 files --- 69.835 seconds
> 5000 files --- 1573.589 seconds
> The logs contain these messages repeatedly when executing against 5000 files:
> 22:02:22.818 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - vector value capacity 65536
> 22:02:22.818 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - vector byte capacity 32767500
> 22:02:22.819 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - text scan batch size 5
> 22:02:22.840 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - vector value capacity 65536
> 22:02:22.841 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - vector byte capacity 32767500
> 22:02:22.841 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - text scan batch size 0
> 22:02:22.863 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - vector value capacity 65536
> 22:02:22.863 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - vector byte capacity 32767500
> 22:02:22.864 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - text scan batch size 5
> 22:02:23.035 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - vector value capacity 65536
> 22:02:23.036 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - vector byte capacity 32767500
> 22:02:23.036 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - text scan batch size 0
> 22:02:23.059 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - vector value capacity 65536
> 22:02:23.059 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - vector byte capacity 32767500
> 22:02:23.060 [b5a7fdd3-f788-4a40-9fd7-bf525bad09e3:frag:0:0] DEBUG 
> o.a.d.e.s.text.DrillTextRecordReader - text scan batch size 5
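A quick sanity check on the timings quoted above shows the slowdown is superlinear: the per-file cost itself grows roughly sixfold from 100 to 5000 files. Computed here in Python, using only the numbers from the report:

```python
# Per-file cost from the timings reported above. If scan cost were linear
# in the file count, seconds/file would stay roughly constant; instead it
# grows from ~0.052 s/file at 100 files to ~0.315 s/file at 5000 files.
timings = {100: 5.183, 250: 15.021, 500: 26.846, 1000: 69.835, 5000: 1573.589}
per_file = {n: t / n for n, t in timings.items()}
for n in sorted(per_file):
    print(n, round(per_file[n], 4))
```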





[jira] [Commented] (DRILL-3764) Support the ability to identify and/or skip records when a function evaluation fails

2015-10-08 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949687#comment-14949687
 ] 

Jacques Nadeau commented on DRILL-3764:
---

No way to edit or comment on design doc.



> Support the ability to identify and/or skip records when a function 
> evaluation fails
> 
>
> Key: DRILL-3764
> URL: https://issues.apache.org/jira/browse/DRILL-3764





[jira] [Closed] (DRILL-3832) Metadata Caching : There should be a way for the user to know that the cache has been leveraged

2015-10-08 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli closed DRILL-3832.

Resolution: Fixed

This has been fixed by Steven's commit 1cfd4c20d30dd042290e769472e60d06ae66020c.

> Metadata Caching : There should be a way for the user to know that the cache 
> has been leveraged
> ---
>
> Key: DRILL-3832
> URL: https://issues.apache.org/jira/browse/DRILL-3832
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Metadata
>Reporter: Rahul Challapalli
>
> git.commit.id.abbrev=3c89b30
> Currently the user has no way of knowing that the metadata cache file has 
> been leveraged, apart from comparing query times, which can be influenced 
> by other factors.
> It would be helpful while debugging to know whether or not the cache has 
> been leveraged. This information could be added to one or a combination of 
> the following places:
> 1. Profiles
> 2. Explain Plan Output
> 3. Log files





[jira] [Updated] (DRILL-1951) Improve Error message when trying to cast a string with a decimal point to integer

2015-10-08 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-1951:
--
Summary: Improve Error message when trying to cast a string with a decimal 
point to integer  (was: Can't cast numeric value with decimal point read from 
CSV file into integer data type)

> Improve Error message when trying to cast a string with a decimal point to 
> integer
> --
>
> Key: DRILL-1951
> URL: https://issues.apache.org/jira/browse/DRILL-1951
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
> Fix For: Future
>
>
> sales.csv file:
> {code}
> 997,Ford,ME350,3000.00, comment#1
> 1999,Chevy,Venture,4900.00, comment#2
> 1999,Chevy,Venture,5000.00, comment#3
> 1996,Jeep,Cherokee,1.01, comment#4
> {code}
> -- Can cast to decimal
> {code}
> 0: jdbc:drill:schema=dfs> select cast(columns[3] as decimal(18,2))  from 
> `sales.csv`;
> ++
> |   EXPR$0   |
> ++
> | 3000.00|
> | 4900.00|
> | 5000.00|
> | 1.01   |
> ++
> 4 rows selected (0.095 seconds)
> {code}
> -- Can cast to float
> {code}
> 0: jdbc:drill:schema=dfs> select cast(columns[3] as float)  from `sales.csv`;
> ++
> |   EXPR$0   |
> ++
> | 3000.0 |
> | 4900.0 |
> | 5000.0 |
> | 1.01   |
> ++
> 4 rows selected (0.112 seconds)
> {code}
> -- Can't cast to INT/BIGINT
> {code}
> 0: jdbc:drill:schema=dfs> select cast(columns[3] as bigint)  from `sales.csv`;
> Query failed: Query failed: Failure while running fragment., 3000.00 [ 
> 4818451a-c731-48a9-9992-1e81ab1d520d on atsqa4-134.qa.lab:31010 ]
> [ 4818451a-c731-48a9-9992-1e81ab1d520d on atsqa4-134.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> -- Same works with json/parquet files
> {code}
> 0: jdbc:drill:schema=dfs> select a1  from `t1.json`;
> ++
> | a1 |
> ++
> | 10.01  |
> ++
> 1 row selected (0.077 seconds)
> 0: jdbc:drill:schema=dfs> select cast(a1 as int)  from `t1.json`;
> ++
> |   EXPR$0   |
> ++
> | 10 |
> ++
> 0: jdbc:drill:schema=dfs> select * from test_cast;
> ++
> | a1 |
> ++
> | 10.0100|
> ++
> 1 row selected (0.06 seconds)
> 0: jdbc:drill:schema=dfs> select cast(a1 as int) from test_cast;
> ++
> |   EXPR$0   |
> ++
> | 10 |
> ++
> 1 row selected (0.094 seconds)
> {code}
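The failure pattern above has a close analogy in other languages. As an illustration, Python's int() rejects a string containing a decimal point just as Java's Long.parseLong does, while a two-step conversion succeeds; the analogous Drill workaround should be a double cast such as CAST(CAST(columns[3] AS DOUBLE) AS BIGINT):

```python
# Analogy for the failure above: like Java's Long.parseLong, Python's
# int() rejects a string with a decimal point, while converting via
# float first succeeds.
def direct_cast(s):
    return int(s)          # raises ValueError on "3000.00"

def two_step_cast(s):
    return int(float(s))   # truncation through float works

try:
    direct_cast("3000.00")
    failed = False
except ValueError:
    failed = True

print(failed)                     # True
print(two_step_cast("3000.00"))   # 3000
```

This also suggests what a better error message could say: the value is numeric but not an integer literal, so the user should cast through DOUBLE or DECIMAL first.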





[jira] [Updated] (DRILL-2759) Wrong result on count(*) from empty csv file

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2759:

Summary: Wrong result on count(*) from empty csv file  (was: Improve error 
message when reading from an empty csv file)

> Wrong result on count(*) from empty csv file
> 
>
> Key: DRILL-2759
> URL: https://issues.apache.org/jira/browse/DRILL-2759
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 0.9.0
>Reporter: Victoria Markman
>Priority: Critical
> Fix For: Future
>
>
> t1.csv is an empty file:
> {code}
> 0: jdbc:drill:schema=dfs> select count(*) from `bigtable/2015/01/t1.csv`;
> Query failed: IllegalArgumentException: Incoming endpoints 1 is greater than 
> number of row groups 0
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
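For reference, standard SQL semantics for the query above: a scalar COUNT(*) over an empty input returns one row containing 0, not an error (and a grouped count returns zero rows). A minimal sketch of those semantics:

```python
# SQL semantics the report expects: a scalar COUNT(*) over empty input
# returns a single row with value 0; a grouped count returns zero rows.
rows = []  # empty t1.csv

scalar_count = [len(rows)]   # SELECT COUNT(*) FROM t1 -> one row
grouped = {}                 # SELECT k, COUNT(*) FROM t1 GROUP BY k -> no rows
for r in rows:
    grouped[r[0]] = grouped.get(r[0], 0) + 1

print(scalar_count)   # [0]  <- one row, not an IllegalArgumentException
print(len(grouped))   # 0
```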





[jira] [Commented] (DRILL-3764) Support the ability to identify and/or skip records when a function evaluation fails

2015-10-08 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949584#comment-14949584
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3764:
--

We have a design proposal here:
https://docs.google.com/document/d/1jCeYW924_SFwf-nOqtXrO68eixmAitM-tLngezzXw3Y/edit

First of all, we target a narrower scope in which the skipping mechanism lives 
in Project; for this mechanism to be effective, the Project should be directly 
on top of Scan.

> Support the ability to identify and/or skip records when a function 
> evaluation fails
> 
>
> Key: DRILL-3764
> URL: https://issues.apache.org/jira/browse/DRILL-3764





[jira] [Commented] (DRILL-3764) Support the ability to identify and/or skip records when a function evaluation fails

2015-10-08 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949575#comment-14949575
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3764:
--

This option might not be viable if the data type is non-nullable. 

Further, we cannot simply cast to a nullable data type, since the batches prior 
to the current one might already have been sent to the downstream operator, and 
changing the type to nullable would cause SchemaChange issues.
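The schema-change concern can be sketched as follows (illustrative Python, not Drill internals): a downstream operator that has already accepted non-nullable batches rejects a later batch whose column has turned nullable.

```python
# Sketch of the schema-change problem: once a downstream operator has
# received batches with a non-nullable column, switching that column to
# nullable mid-stream is a schema change. All names are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class ColumnSchema:
    name: str
    type: str
    nullable: bool

def receive(batches):
    """Downstream operator: fails on any schema change between batches."""
    seen = None
    for schema, _rows in batches:
        if seen is not None and schema != seen:
            raise RuntimeError(f"schema change: {seen} -> {schema}")
        seen = schema
    return "ok"

col = ColumnSchema("c1", "BIGINT", nullable=False)
col_nullable = ColumnSchema("c1", "BIGINT", nullable=True)

# Batch 1 was already sent non-nullable; making batch 2 nullable (to hold
# a NULL for a skipped/failed value) triggers the schema-change error.
try:
    receive([(col, [10]), (col_nullable, [None])])
    changed = False
except RuntimeError:
    changed = True
print(changed)  # True
```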

> Support the ability to identify and/or skip records when a function 
> evaluation fails
> 
>
> Key: DRILL-3764
> URL: https://issues.apache.org/jira/browse/DRILL-3764





[jira] [Updated] (DRILL-3714) Query runs out of memory and remains in CANCELLATION_REQUESTED state until drillbit is restarted

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-3714:

Priority: Critical  (was: Major)

> Query runs out of memory and remains in CANCELLATION_REQUESTED state until 
> drillbit is restarted
> 
>
> Key: DRILL-3714
> URL: https://issues.apache.org/jira/browse/DRILL-3714
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>Priority: Critical
> Fix For: Future
>
> Attachments: Screen Shot 2015-08-26 at 10.36.33 AM.png, drillbit.log, 
> jstack.txt, query_profile_2a2210a7-7a78-c774-d54c-c863d0b77bb0.json
>
>
> This is a variation of DRILL-3705; the difference is Drill's behavior when 
> hitting the OOM condition.
> The query runs out of memory during execution and remains in 
> "CANCELLATION_REQUESTED" state until the drillbit is bounced.
> The client (sqlline in this case) never gets a response from the server.
> Reproduction details:
> Single node drillbit installation.
> DRILL_MAX_DIRECT_MEMORY="8G"
> DRILL_HEAP="4G"
> Run this query on TPCDS SF100 data set
> {code}
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss WHERE ss.ss_store_sk IS NOT NULL ORDER BY 1 
> LIMIT 10;
> {code}
> drillbit.log
> {code}
> 2015-08-26 16:54:58,469 [2a2210a7-7a78-c774-d54c-c863d0b77bb0:frag:3:22] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 2a2210a7-7a78-c774-d54c-c863d0b77bb0:3:22: State to report: RUNNING
> 2015-08-26 16:55:50,498 [BitServer-5] WARN  
> o.a.drill.exec.rpc.data.DataServer - Message of mode REQUEST of rpc type 3 
> took longer than 500ms.  Actual duration was 2569ms.
> 2015-08-26 16:56:31,086 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.88.133:31012 <--> /10.10.88.133:54554 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct 
> buffer memory
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618)
>  [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
> at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_71]
> at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
> ~[na:1.7.0_71]
> at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
> ~[na:1.7.0_71]
> at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:437) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.reallocate(PoolArena.java:280) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.net

[jira] [Updated] (DRILL-3705) Query runs out of memory, reported as FAILED and leaves thread running

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-3705:

Priority: Critical  (was: Major)

> Query runs out of memory, reported as FAILED and leaves thread running 
> ---
>
> Key: DRILL-3705
> URL: https://issues.apache.org/jira/browse/DRILL-3705
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>Priority: Critical
> Fix For: Future
>
> Attachments: 2a2451ec-09d8-9f26-e856-5fd349ae72fd.sys.drill, 
> drillbit.log, jstack.txt
>
>
> Single node drill installation
> DRILL_MAX_DIRECT_MEMORY="2G"
> DRILL_HEAP="1G"
> Execute tpcds query 15 SF100 (parquet) with the settings above. Reproduces 2 
> out of 3 times.
> {code}
> SELECT ca.ca_zip,
>Sum(cs.cs_sales_price)
> FROM   catalog_sales cs,
>customer c,
>customer_address ca,
>date_dim dd
> WHERE  cs.cs_bill_customer_sk = c.c_customer_sk
>AND c.c_current_addr_sk = ca.ca_address_sk
>AND ( Substr(ca.ca_zip, 1, 5) IN ( '85669', '86197', '88274', '83405',
>'86475', '85392', '85460', '80348',
>'81792' )
>   OR ca.ca_state IN ( 'CA', 'WA', 'GA' )
>   OR cs.cs_sales_price > 500 )
>AND cs.cs_sold_date_sk = dd.d_date_sk
>AND dd.d_qoy = 1
>AND dd.d_year = 1998
> GROUP  BY ca.ca_zip
> ORDER  BY ca.ca_zip
> LIMIT 100;
> {code}
> Query runs out of memory but leaves a thread behind, even though it is 
> reported as FAILED (the expected result).
> Snippet from jstack:
> {code}
> "2a2451ec-09d8-9f26-e856-5fd349ae72fd:frag:4:0" daemon prio=10 
> tid=0x7f507414 nid=0x3000 waiting on condition [0x7f5055b66000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0xc012b038> (a 
> java.util.concurrent.Semaphore$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
> at java.util.concurrent.Semaphore.acquire(Semaphore.java:472)
> at 
> org.apache.drill.exec.ops.SendingAccountor.waitForSendComplete(SendingAccountor.java:48)
> - locked <0xc012b068> (a 
> org.apache.drill.exec.ops.SendingAccountor)
> at 
> org.apache.drill.exec.ops.FragmentContext.waitForSendComplete(FragmentContext.java:436)
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.close(BaseRootExec.java:112)
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:341)
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:173)
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> NPE in drillbit.log:
> {code}
> 2015-08-24 23:52:04,486 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.88.133:31012 <--> /10.10.88.133:52417 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: java.lang.NullPointerException
> at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:99)
>  [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.handler.timeout.ReadTimeoutHandler.channelRead(ReadTimeoutHandler.java:150)
>  [netty-handler-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Fin
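The jstack above shows cleanup blocked in SendingAccountor.waitForSendComplete(). The hang mechanism can be sketched as a semaphore whose release (the send-completion ack) never arrives once the connection dies (illustrative Python; the 0.1 s timeout exists only so the demo terminates, the real wait has none):

```python
# Sketch of the hang described above: the fragment's cleanup path blocks
# waiting for outstanding sends to be acknowledged, but after the OOM
# kills the connection the ack (semaphore release) never arrives.
import threading

sends_outstanding = threading.Semaphore(0)  # zero permits: ack never arrived

def cleanup():
    # Analogous to SendingAccountor.waitForSendComplete: blocks until the
    # ack thread releases the semaphore. Timeout is for the demo only.
    acquired = sends_outstanding.acquire(timeout=0.1)
    return acquired

print(cleanup())  # False: the ack never came, so real cleanup would hang
```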

[jira] [Updated] (DRILL-3241) Query with window function runs out of direct memory and does not report back to client that it did

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-3241:

Priority: Critical  (was: Major)

> Query with window function runs out of direct memory and does not report back 
> to client that it did
> ---
>
> Key: DRILL-3241
> URL: https://issues.apache.org/jira/browse/DRILL-3241
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.0.0
>Reporter: Victoria Markman
>Priority: Critical
> Fix For: Future
>
>
> Even though the query ran out of memory and was cancelled on the server, the 
> client (sqlline) was never notified of the event, and it appears to the user 
> that the query is hung. 
> Configuration:
> Single drillbit configured with:
> DRILL_MAX_DIRECT_MEMORY="2G"
> DRILL_HEAP="1G"
> TPCDS100 parquet files
> Query:
> {code}
> select 
>   sum(ss_quantity) over(partition by ss_store_sk order by ss_sold_date_sk) 
> from store_sales;
> {code}
> drillbit.log
> {code}
> 2015-06-01 21:42:29,514 [BitServer-5] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  Connection: /10.10.88.133:31012 <--> /10.10.88.133:38887 (data server).  Closing connection.
> io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct buffer memory
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
> at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_71]
> at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.7.0_71]
> at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) ~[na:1.7.0_71]
> at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:437) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.reallocate(PoolArena.java:280) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:110) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.WrappedByteBuf.writeBytes(WrappedByteBuf.java:60

[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause

2015-10-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949565#comment-14949565
 ] 

ASF GitHub Bot commented on DRILL-3623:
---

GitHub user sudheeshkatkam opened a pull request:

https://github.com/apache/drill/pull/193

DRILL-3623: Use shorter query path for LIMIT 0 queries on schema-ed tables

Initial patch.

The DrillTable#providesDeferredSchema method is used by the NonDeferredSchemaTableLimit0Visitor to check whether a table can provide its schema directly; if so, the result is returned immediately.

It seems the shorter query path for this query needs a hacky "otherPlan" in the DefaultSqlHandler without major refactoring (should I go ahead and make changes?). This also means that "EXPLAIN PLAN ..." returns a plan that is different from the actual query plan (without a check in ExplainHandler, another hack).

I think the classes need more meaningful names 
(NonDeferredSchemaTableLimit0Visitor).

Also, note the type conversion using CALCITE_TO_DRILL_TYPE_MAPPING.
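The fast path described above can be sketched roughly as follows. The `Table` interface and method names here are hypothetical stand-ins for Drill's actual DrillTable API, meant only to illustrate the idea: when a table's schema is known without scanning data, a LIMIT 0 query can return an empty result with that schema immediately.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the LIMIT 0 fast path (not Drill's real classes).
public class Limit0FastPath {
    interface Table {
        // Mirrors the idea of DrillTable#providesDeferredSchema: true means the
        // schema is only known after execution, so the fast path cannot apply.
        boolean providesDeferredSchema();
        List<String> columnNames();
    }

    // Returns the column names for a LIMIT 0 result, or null when the table
    // cannot provide a schema without executing the query.
    static List<String> limit0Schema(Table t) {
        return t.providesDeferredSchema() ? null : t.columnNames();
    }

    public static void main(String[] args) {
        Table schemaedTable = new Table() {
            public boolean providesDeferredSchema() { return false; }
            public List<String> columnNames() { return Arrays.asList("a", "b"); }
        };
        // Schema is returned without running the full execution path.
        System.out.println(limit0Schema(schemaedTable));
    }
}
```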

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sudheeshkatkam/drill DRILL-3623

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/193.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #193


commit a766c54b34697df8b851204705ea1ce16c7114b7
Author: Sudheesh Katkam 
Date:   2015-10-08T22:38:00Z

DRILL-3623: Use shorter query path for LIMIT 0 queries on schema-ed tables




> Hive query hangs with limit 0 clause
> 
>
> Key: DRILL-3623
> URL: https://issues.apache.org/jira/browse/DRILL-3623
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.1.0
> Environment: MapR cluster
>Reporter: Andries Engelbrecht
> Fix For: Future
>
>
> Running a select * from hive.table limit 0 does not return (hangs).
> Select * from hive.table limit 1 works fine.
> Hive table is about 6GB with 330 files with parquet using snappy compression.
> Data types are int, bigint, string and double.
> Querying directory with parquet files through the DFS plugin works fine
> select * from dfs.root.`/user/hive/warehouse/database/table` limit 0;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2891) Allowing ROUND function on boolean type can cause all sorts of problems

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949561#comment-14949561
 ] 

Victoria Markman commented on DRILL-2891:
-

Now that I think about it, the first case is technically data corruption. Raising priority to critical.

> Allowing ROUND function on boolean type can cause all sorts of problems
> ---
>
> Key: DRILL-2891
> URL: https://issues.apache.org/jira/browse/DRILL-2891
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Victoria Markman
>Priority: Minor
> Fix For: 1.4.0
>
>
> Works, and I don't think it makes much sense:
> {code}
> 0: jdbc:drill:schema=dfs> select round(c_boolean) from alltypes_with_nulls 
> limit 1;
> ++
> |   EXPR$0   |
> ++
> | 1  |
> ++
> 1 row selected (0.19 seconds)
> {code}
> Fails later if used in other parts of the query.
> In order by:
> {code}
> 0: jdbc:drill:schema=dfs> select round(c_boolean) from alltypes_with_nulls 
> order by 1;
> ++
> |   EXPR$0   |
> ++
> Query failed: SYSTEM ERROR: java.lang.UnsupportedOperationException: Failure 
> finding function that runtime code generation expected.  Signature: 
> compare_to_nulls_high( TINYINT:OPTIONAL, TINYINT:OPTIONAL ) returns 
> INT:REQUIRED
> Fragment 0:0
> [7add2ed7-de6a-4c66-b511-ecad32413fcc on atsqa4-133.qa.lab:31010]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
> query.
>   at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
>   at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
>   at sqlline.SqlLine.print(SqlLine.java:1809)
>   at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
>   at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
>   at sqlline.SqlLine.dispatch(SqlLine.java:889)
>   at sqlline.SqlLine.begin(SqlLine.java:763)
>   at sqlline.SqlLine.start(SqlLine.java:498)
>   at sqlline.SqlLine.main(SqlLine.java:460)
> {code}
> In group by
> {code}
> 0: jdbc:drill:schema=dfs> select round(c_boolean) from alltypes group by 
> round(c_boolean);
> Query failed: SYSTEM ERROR: Failure finding function that runtime code 
> generation expected.  Signature: compare_to_nulls_high( TINYINT:REQUIRED, 
> TINYINT:REQUIRED ) returns INT:REQUIRED
> Fragment 0:0
> [286777b2-3395-4e44-94a2-d9dafa07f9dc on atsqa4-133.qa.lab:31010]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> We should not allow that.
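The "Failure finding function" errors above can be illustrated with a small sketch. This is not Drill's actual function registry; it is a hypothetical lookup-by-signature model showing why a ROUND that unexpectedly yields TINYINT breaks later operators: runtime code generation resolves functions by exact type signature, and no comparator is registered for the surprise type.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Illustrative signature-based function registry (not Drill's real one).
public class SignatureLookup {
    private final Map<String, String> implementations = new HashMap<>();

    void register(String name, String... argTypes) {
        implementations.put(name + Arrays.toString(argTypes), name);
    }

    String resolve(String name, String... argTypes) {
        String signature = name + Arrays.toString(argTypes);
        String impl = implementations.get(signature);
        if (impl == null) {
            // Analogous to the UnsupportedOperationException in the report.
            throw new UnsupportedOperationException(
                "Failure finding function that runtime code generation expected. Signature: " + signature);
        }
        return impl;
    }

    public static void main(String[] args) {
        SignatureLookup registry = new SignatureLookup();
        registry.register("compare_to_nulls_high", "INT", "INT");
        try {
            // The TINYINT produced by round(boolean) has no registered comparator.
            registry.resolve("compare_to_nulls_high", "TINYINT", "TINYINT");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```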





[jira] [Updated] (DRILL-2891) Allowing ROUND function on boolean type can cause all sorts of problems

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2891:

Priority: Critical  (was: Minor)






[jira] [Updated] (DRILL-2860) Unable to cast integer column from parquet file to interval day

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2860:

Labels: interval  (was: )

> Unable to cast integer column from parquet file to interval day
> ---
>
> Key: DRILL-2860
> URL: https://issues.apache.org/jira/browse/DRILL-2860
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Victoria Markman
>  Labels: interval
> Fix For: Future
>
> Attachments: t1.parquet
>
>
> I can cast numeric literal to "interval day":
> {code}
> 0: jdbc:drill:schema=dfs> select cast(1 as interval day) from t1;
> ++
> |   EXPR$0   |
> ++
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> ++
> 10 rows selected (0.122 seconds)
> {code}
> Get an error when I'm trying to do the same from parquet file:
> {code}
> 0: jdbc:drill:schema=dfs> select cast(a1 as interval day) from t1 where a1 = 
> 1;
> Query failed: SYSTEM ERROR: Invalid format: "1"
> Fragment 0:0
> [6a4adf04-f3db-4feb-8010-ebc3bfced1e3 on atsqa4-134.qa.lab:31010]
>   (java.lang.IllegalArgumentException) Invalid format: "1"
> org.joda.time.format.PeriodFormatter.parseMutablePeriod():326
> org.joda.time.format.PeriodFormatter.parsePeriod():304
> org.joda.time.Period.parse():92
> org.joda.time.Period.parse():81
> org.apache.drill.exec.test.generated.ProjectorGen180.doEval():77
> org.apache.drill.exec.test.generated.ProjectorGen180.projectRecords():62
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork():170
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():93
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():130
> org.apache.drill.exec.record.AbstractRecordBatch.next():144
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():118
> org.apache.drill.exec.physical.impl.BaseRootExec.next():74
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():80
> org.apache.drill.exec.physical.impl.BaseRootExec.next():64
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():198
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():192
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1469
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():192
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> If I try casting a1 to an integer I run into drill-2859
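The `Invalid format: "1"` failure in the trace comes from an ISO-8601 period parser being handed a bare integer. The same parsing contract can be demonstrated with the JDK's `java.time` API; this is a stand-in for illustration, since Drill's code path uses joda-time's `Period.parse`, as the stack trace shows, but both accept only ISO-8601 period strings like `P1D`.

```java
import java.time.Period;
import java.time.format.DateTimeParseException;

public class IntervalParseDemo {
    public static void main(String[] args) {
        // An interval literal is rendered as an ISO-8601 period ("P1D"), so it parses.
        System.out.println(Period.parse("P1D").getDays());

        // A raw integer column value ("1") is not a valid period string, which
        // mirrors the Invalid format: "1" failure in the trace above.
        try {
            Period.parse("1");
        } catch (DateTimeParseException e) {
            System.out.println("not a valid ISO-8601 period: 1");
        }
    }
}
```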





[jira] [Updated] (DRILL-2859) Unexpected exception in the query with an interval data type

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2859:

Labels: interval  (was: )

> Unexpected exception in the query with an interval data type
> 
>
> Key: DRILL-2859
> URL: https://issues.apache.org/jira/browse/DRILL-2859
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0
>Reporter: Victoria Markman
>Priority: Critical
>  Labels: interval
> Fix For: Future
>
>
> {code}
> 0: jdbc:drill:schema=dfs> select cast(cast(a1 as int) as interval day) from 
> t1 where a1 = 1;
> Query failed: SYSTEM ERROR: Unexpected exception during fragment 
> initialization: todo: implement syntax 
> SPECIAL(Reinterpret(*(Reinterpret(CAST(CAST($0):INTEGER):DECIMAL(2, 0)), 
> 8640)))
> [5119315b-dd73-432f-ab93-49e76e9165f6 on atsqa4-134.qa.lab:31010]
>   (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception 
> during fragment initialization: todo: implement syntax 
> SPECIAL(Reinterpret(*(Reinterpret(CAST(CAST($0):INTEGER):DECIMAL(2, 0)), 
> 8640)))
> org.apache.drill.exec.work.foreman.Foreman.run():212
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745
>   Caused By (java.lang.AssertionError) todo: implement syntax 
> SPECIAL(Reinterpret(*(Reinterpret(CAST(CAST($0):INTEGER):DECIMAL(2, 0)), 
> 8640)))
> 
> org.apache.drill.exec.planner.logical.DrillOptiq$RexToDrill.visitCall():182
> org.apache.drill.exec.planner.logical.DrillOptiq$RexToDrill.visitCall():73
> org.apache.calcite.rex.RexCall.accept():107
> org.apache.drill.exec.planner.logical.DrillOptiq.toDrill():70
> 
> org.apache.drill.exec.planner.common.DrillProjectRelBase.getProjectExpressions():111
> 
> org.apache.drill.exec.planner.physical.ProjectPrel.getPhysicalOperator():57
> org.apache.drill.exec.planner.physical.ScreenPrel.getPhysicalOperator():51
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPop():376
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():157
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():167
> org.apache.drill.exec.work.foreman.Foreman.runSQL():773
> org.apache.drill.exec.work.foreman.Foreman.run():203
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}





[jira] [Commented] (DRILL-2859) Unexpected exception in the query with an interval data type

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949551#comment-14949551
 ] 

Victoria Markman commented on DRILL-2859:
-

I marked this bug as critical because this looks like it could be a very common 
case ... 






[jira] [Commented] (DRILL-2859) Unexpected exception in the query with an interval data type

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949550#comment-14949550
 ] 

Victoria Markman commented on DRILL-2859:
-

{code}
0: jdbc:drill:schema=dfs> select cast(cast(a1 as int) as interval day) from 
`t1.csv` where a1 = 1;
Error: SYSTEM ERROR: AssertionError: todo: implement syntax 
SPECIAL(Reinterpret(*(Reinterpret(CAST(CAST($0):INTEGER):DECIMAL(2, 0)), 
8640)))
[Error Id: 538e6136-8ef1-4a4b-ba29-3756617af561 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}






[jira] [Updated] (DRILL-2859) Unexpected exception in the query with an interval data type

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2859:

Priority: Critical  (was: Major)






[jira] [Comment Edited] (DRILL-2859) Unexpected exception in the query with an interval data type

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949550#comment-14949550
 ] 

Victoria Markman edited comment on DRILL-2859 at 10/8/15 10:43 PM:
---

As of 1.2.0:

{code}
0: jdbc:drill:schema=dfs> select cast(cast(a1 as int) as interval day) from 
`t1.csv` where a1 = 1;
Error: SYSTEM ERROR: AssertionError: todo: implement syntax 
SPECIAL(Reinterpret(*(Reinterpret(CAST(CAST($0):INTEGER):DECIMAL(2, 0)), 
8640)))
[Error Id: 538e6136-8ef1-4a4b-ba29-3756617af561 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}


was (Author: vicky):
{code}
0: jdbc:drill:schema=dfs> select cast(cast(a1 as int) as interval day) from 
`t1.csv` where a1 = 1;
Error: SYSTEM ERROR: AssertionError: todo: implement syntax 
SPECIAL(Reinterpret(*(Reinterpret(CAST(CAST($0):INTEGER):DECIMAL(2, 0)), 
8640)))
[Error Id: 538e6136-8ef1-4a4b-ba29-3756617af561 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}






[jira] [Updated] (DRILL-2833) Exception on select * from INFORMATION_SCHEMA.COLUMNS caused by enabled HBase storage plug in where HBase is not installed

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2833:

Summary: Exception on select * from INFORMATION_SCHEMA.COLUMNS caused by 
enabled HBase storage plug in where HBase is not installed  (was: Exception on 
select * from INFORMATION_SCHEMA.COLUMNS)

> Exception on select * from INFORMATION_SCHEMA.COLUMNS caused by enabled HBase 
> storage plug in where HBase is not installed
> --
>
> Key: DRILL-2833
> URL: https://issues.apache.org/jira/browse/DRILL-2833
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 0.9.0
>Reporter: Victoria Markman
>Assignee: Aditya Kishore
>Priority: Critical
>
> #Sat Apr 18 21:26:53 EDT 2015
> git.commit.id.abbrev=9ec257e
> {code}
> 0: jdbc:drill:schema=dfs> select * from INFORMATION_SCHEMA.COLUMNS;
> Query failed: SYSTEM ERROR: null
> Fragment 0:0
> [bd2a0477-90ea-423b-ad77-ad9784f4116b on atsqa4-133.qa.lab:31010]
>   (java.lang.NullPointerException) null
> org.eigenbase.sql.type.IntervalSqlType.():41
> org.eigenbase.sql.type.SqlTypeFactoryImpl.createSqlIntervalType():104
> org.apache.drill.exec.dotdrill.View.getRowType():236
> org.apache.drill.exec.planner.logical.DrillViewTable.getRowType():46
> org.apache.drill.exec.store.ischema.RecordGenerator.scanSchema():123
> org.apache.drill.exec.store.ischema.RecordGenerator.scanSchema():109
> org.apache.drill.exec.store.ischema.RecordGenerator.scanSchema():109
> org.apache.drill.exec.store.ischema.RecordGenerator.scanSchema():97
> org.apache.drill.exec.store.ischema.SelectedTable.getRecordReader():47
> org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch():35
> org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch():30
> org.apache.drill.exec.physical.impl.ImplCreator.visitOp():62
> org.apache.drill.exec.physical.impl.ImplCreator.visitOp():39
> 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitSubScan():127
> org.apache.drill.exec.physical.base.AbstractSubScan.accept():39
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren():74
> org.apache.drill.exec.physical.impl.ImplCreator.visitOp():62
> org.apache.drill.exec.physical.impl.ImplCreator.visitOp():39
> 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitIteratorValidator():215
> org.apache.drill.exec.physical.config.IteratorValidator.accept():34
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren():74
> org.apache.drill.exec.physical.impl.ImplCreator.visitOp():62
> org.apache.drill.exec.physical.impl.ImplCreator.visitOp():39
> 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitProject():77
> org.apache.drill.exec.physical.config.Project.accept():51
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren():74
> org.apache.drill.exec.physical.impl.ImplCreator.visitOp():62
> org.apache.drill.exec.physical.impl.ImplCreator.visitOp():39
> 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitIteratorValidator():215
> org.apache.drill.exec.physical.config.IteratorValidator.accept():34
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren():74
> org.apache.drill.exec.physical.impl.ImplCreator.visitOp():59
> org.apache.drill.exec.physical.impl.ImplCreator.visitOp():39
> 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitStore():132
> 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitScreen():195
> org.apache.drill.exec.physical.config.Screen.accept():97
> org.apache.drill.exec.physical.impl.ImplCreator.getExec():87
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():148
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> Only happens when the HBase plugin is enabled. Thanks Rahul for helping diagnose the problem!





[jira] [Commented] (DRILL-2759) Improve error message when reading from an empty csv file

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949537#comment-14949537
 ] 

Victoria Markman commented on DRILL-2759:
-

As of 1.2.0

I think the expected result is 0, not an empty set ... please correct me if I'm wrong.
{code}
0: jdbc:drill:schema=dfs> select count(*) from `empty.csv`;
+--+
|  |
+--+
+--+
No rows selected (0.245 seconds)
{code}

> Improve error message when reading from an empty csv file
> -
>
> Key: DRILL-2759
> URL: https://issues.apache.org/jira/browse/DRILL-2759
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 0.9.0
>Reporter: Victoria Markman
>Priority: Minor
> Fix For: Future
>
>
> t1.csv is an empty file:
> {code}
> 0: jdbc:drill:schema=dfs> select count(*) from `bigtable/2015/01/t1.csv`;
> Query failed: IllegalArgumentException: Incoming endpoints 1 is greater than 
> number of row groups 0
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}





[jira] [Updated] (DRILL-2759) Improve error message when reading from an empty csv file

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2759:

Priority: Critical  (was: Minor)

> Improve error message when reading from an empty csv file
> -
>
> Key: DRILL-2759
> URL: https://issues.apache.org/jira/browse/DRILL-2759
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 0.9.0
>Reporter: Victoria Markman
>Priority: Critical
> Fix For: Future
>
>
> t1.csv is an empty file:
> {code}
> 0: jdbc:drill:schema=dfs> select count(*) from `bigtable/2015/01/t1.csv`;
> Query failed: IllegalArgumentException: Incoming endpoints 1 is greater than 
> number of row groups 0
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}





[jira] [Updated] (DRILL-2543) Correlated subquery where outer table contains NULL values returns seemingly wrong result

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2543:

Priority: Critical  (was: Major)

> Correlated subquery where outer table contains NULL values returns  seemingly 
> wrong result
> --
>
> Key: DRILL-2543
> URL: https://issues.apache.org/jira/browse/DRILL-2543
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Priority: Critical
> Fix For: Future
>
>
> {code}
> 0: jdbc:drill:schema=dfs> select * from t1;
> +-------+-------------+-------+
> |  a1   |     b1      |  c1   |
> +-------+-------------+-------+
> | 1     | 2015-03-01  | a     |
> | 2     | 2015-03-02  | b     |
> | null  | null        | null  |
> +-------+-------------+-------+
> 3 rows selected (0.064 seconds)
> 0: jdbc:drill:schema=dfs> select * from t2;
> +-------+-------------+-------+
> |  a2   |     b2      |  c2   |
> +-------+-------------+-------+
> | 5     | 2017-03-01  | a     |
> +-------+-------------+-------+
> 1 row selected (0.07 seconds)
> 0: jdbc:drill:schema=dfs> select t1.c1, count(*) from t1 where t1.b1 not in
> (select b2 from t2 where t1.a1 = t2.a2) group by t1.c1 order by t1.c1;
> +-----+---------+
> | c1  | EXPR$1  |
> +-----+---------+
> | a   | 1       |
> | b   | 1       |
> +-----+---------+
> 2 rows selected (0.32 seconds)
> {code}
> Postgres returns the row from the outer table where a1 is null.
> This is the part I don't understand, because the join condition in the
> subquery should have eliminated the row where a1 IS NULL. To me the Drill
> result looks correct, unless there is something different about correlated
> comparison semantics that I'm not aware of.
> {code}
> postgres=# select * from t1;
>  a1 |     b1     | c1
> ----+------------+----
>   1 | 2015-03-01 | a
>   2 | 2015-03-02 | b
>     |            |
> (3 rows)
> {code}
> Explain plan for the query:
> {code}
> 00-01  Project(c1=[$0], EXPR$1=[$1])
> 00-02StreamAgg(group=[{0}], EXPR$1=[COUNT()])
> 00-03  Sort(sort0=[$0], dir0=[ASC])
> 00-04Project(c1=[$0])
> 00-05  SelectionVectorRemover
> 00-06Filter(condition=[NOT(IS TRUE($3))])
> 00-07  HashJoin(condition=[=($1, $2)], joinType=[left])
> 00-09Project($f1=[$0], $f3=[$2])
> 00-11  SelectionVectorRemover
> 00-13Filter(condition=[IS NOT NULL($1)])
> 00-15  Project(c1=[$1], b1=[$0], a1=[$2])
> 00-17Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/test/t1]], selectionRoot=/test/t1, 
> numFiles=1, columns=[`c1`, `b1`, `a1`]]])
> 00-08Project($f02=[$1], $f2=[$2])
> 00-10  StreamAgg(group=[{0, 1}], agg#0=[MIN($2)])
> 00-12Sort(sort0=[$0], sort1=[$1], dir0=[ASC], 
> dir1=[ASC])
> 00-14  Project($f0=[$1], $f02=[$2], $f1=[true])
> 00-16HashJoin(condition=[=($2, $0)], 
> joinType=[inner])
> 00-18  StreamAgg(group=[{0}])
> 00-20Sort(sort0=[$0], dir0=[ASC])
> 00-22  Project($f0=[$1])
> 00-23SelectionVectorRemover
> 00-24  Filter(condition=[IS NOT NULL($0)])
> 00-25Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/test/t1]], selectionRoot=/test/t1, 
> numFiles=1, columns=[`b1`, `a1`]]])
> 00-19  Project(a2=[$1], b2=[$0])
> 00-21Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/test/t2]], selectionRoot=/test/t2, 
> numFiles=1, columns=[`a2`, `b2`]]])
> {code}





[jira] [Commented] (DRILL-2543) Correlated subquery where outer table contains NULL values returns seemingly wrong result

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949526#comment-14949526
 ] 

Victoria Markman commented on DRILL-2543:
-

[~jni] it does make sense after you spelled it out against the standard. It is
technically a wrong result; raising the priority.
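For reference, a minimal Python sketch of the three-valued NOT IN semantics at issue here (an illustration of the SQL-standard behavior, not Drill code; `sql_not_in` is a hypothetical helper name):

```python
def sql_not_in(value, subquery_values):
    # SQL three-valued logic for `value NOT IN (subquery)`:
    # an empty subquery result yields TRUE regardless of value;
    # otherwise a NULL (None) on either side makes the result UNKNOWN (None).
    if not subquery_values:
        return True
    if value is None or None in subquery_values:
        return None
    return value not in subquery_values

# For the row with a1 = NULL, the correlated predicate t1.a1 = t2.a2 is
# UNKNOWN, so the subquery returns no rows and NOT IN evaluates to TRUE --
# which is why Postgres returns that row.
assert sql_not_in(None, []) is True
```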

> Correlated subquery where outer table contains NULL values returns  seemingly 
> wrong result
> --
>
> Key: DRILL-2543
> URL: https://issues.apache.org/jira/browse/DRILL-2543
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
> Fix For: Future
>
>
> {code}
> 0: jdbc:drill:schema=dfs> select * from t1;
> +-------+-------------+-------+
> |  a1   |     b1      |  c1   |
> +-------+-------------+-------+
> | 1     | 2015-03-01  | a     |
> | 2     | 2015-03-02  | b     |
> | null  | null        | null  |
> +-------+-------------+-------+
> 3 rows selected (0.064 seconds)
> 0: jdbc:drill:schema=dfs> select * from t2;
> +-------+-------------+-------+
> |  a2   |     b2      |  c2   |
> +-------+-------------+-------+
> | 5     | 2017-03-01  | a     |
> +-------+-------------+-------+
> 1 row selected (0.07 seconds)
> 0: jdbc:drill:schema=dfs> select t1.c1, count(*) from t1 where t1.b1 not in
> (select b2 from t2 where t1.a1 = t2.a2) group by t1.c1 order by t1.c1;
> +-----+---------+
> | c1  | EXPR$1  |
> +-----+---------+
> | a   | 1       |
> | b   | 1       |
> +-----+---------+
> 2 rows selected (0.32 seconds)
> {code}
> Postgres returns the row from the outer table where a1 is null.
> This is the part I don't understand, because the join condition in the
> subquery should have eliminated the row where a1 IS NULL. To me the Drill
> result looks correct, unless there is something different about correlated
> comparison semantics that I'm not aware of.
> {code}
> postgres=# select * from t1;
>  a1 |     b1     | c1
> ----+------------+----
>   1 | 2015-03-01 | a
>   2 | 2015-03-02 | b
>     |            |
> (3 rows)
> {code}
> Explain plan for the query:
> {code}
> 00-01  Project(c1=[$0], EXPR$1=[$1])
> 00-02StreamAgg(group=[{0}], EXPR$1=[COUNT()])
> 00-03  Sort(sort0=[$0], dir0=[ASC])
> 00-04Project(c1=[$0])
> 00-05  SelectionVectorRemover
> 00-06Filter(condition=[NOT(IS TRUE($3))])
> 00-07  HashJoin(condition=[=($1, $2)], joinType=[left])
> 00-09Project($f1=[$0], $f3=[$2])
> 00-11  SelectionVectorRemover
> 00-13Filter(condition=[IS NOT NULL($1)])
> 00-15  Project(c1=[$1], b1=[$0], a1=[$2])
> 00-17Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/test/t1]], selectionRoot=/test/t1, 
> numFiles=1, columns=[`c1`, `b1`, `a1`]]])
> 00-08Project($f02=[$1], $f2=[$2])
> 00-10  StreamAgg(group=[{0, 1}], agg#0=[MIN($2)])
> 00-12Sort(sort0=[$0], sort1=[$1], dir0=[ASC], 
> dir1=[ASC])
> 00-14  Project($f0=[$1], $f02=[$2], $f1=[true])
> 00-16HashJoin(condition=[=($2, $0)], 
> joinType=[inner])
> 00-18  StreamAgg(group=[{0}])
> 00-20Sort(sort0=[$0], dir0=[ASC])
> 00-22  Project($f0=[$1])
> 00-23SelectionVectorRemover
> 00-24  Filter(condition=[IS NOT NULL($0)])
> 00-25Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/test/t1]], selectionRoot=/test/t1, 
> numFiles=1, columns=[`b1`, `a1`]]])
> 00-19  Project(a2=[$1], b2=[$0])
> 00-21Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/test/t2]], selectionRoot=/test/t2, 
> numFiles=1, columns=[`a2`, `b2`]]])
> {code}





[jira] [Commented] (DRILL-3712) Drill does not recognize UTF-16-LE encoding

2015-10-08 Thread Steven Phillips (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949503#comment-14949503
 ] 

Steven Phillips commented on DRILL-3712:


I think one solution would be to write a UDF to convert from UTF-16 to UTF-8. We
already have a function that does the reverse: CastVarCharVar16Char.
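The core transformation such a UDF would perform can be sketched in Python (an illustration of the byte-level conversion only, not a Drill UDF; `utf16le_to_utf8` is a hypothetical name):

```python
def utf16le_to_utf8(raw_bytes):
    # Decode the incoming bytes as UTF-16-LE, then re-encode as UTF-8.
    return raw_bytes.decode("utf-16-le").encode("utf-8")

# ASCII text round-trips to identical bytes under UTF-8.
raw = "Encoded B".encode("utf-16-le")
assert utf16le_to_utf8(raw) == b"Encoded B"
```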

> Drill does not recognize UTF-16-LE encoding
> ---
>
> Key: DRILL-3712
> URL: https://issues.apache.org/jira/browse/DRILL-3712
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.1.0
> Environment: OSX, likely Linux. 
>Reporter: Edmon Begoli
> Fix For: Future
>
>
> We are unable to process files that OSX identifies as character set UTF-16LE.
> After unzipping and converting to UTF-8, we are able to process them fine.
> There are CONVERT_TO and CONVERT_FROM functions that appear to address the
> issue, but we were unable to make them work on a gzipped or unzipped version
> of the UTF-16 file. We were able to use CONVERT_FROM OK, but when we tried
> to wrap its result in a cast to a date, or anything else, it failed.
> Trying to work with the file natively exposed its double-byte nature (a
> substring 1,4 only returns the first two characters).
> I cannot post the data because it is proprietary in nature, but I am posting 
> this code that might be useful in re-creating an issue:
> {noformat}
> #!/usr/bin/env python
> """ Generates a test psv file with some text fields encoded as UTF-16-LE. """
> def write_utf16le_encoded_psv():
>     total_lines = 10
>     encoded = "Encoded B".encode("utf-16-le")
>     with open("test.psv", "wb") as csv_file:
>         csv_file.write("header 1|header 2|header 3\n")
>         for i in xrange(total_lines):
>             csv_file.write("value A" + str(i) + "|" + encoded + "|value C" + str(i) + "\n")
>
> if __name__ == "__main__":
>     write_utf16le_encoded_psv()
> {noformat}
> then:
> tar zcvf test.psv.tar.gz test.psv
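As an aside on why this file confuses byte-oriented readers: UTF-16-LE interleaves each ASCII character with a NUL byte, which is what shows up as \u0000 escapes when the raw bytes are read back as if they were single-byte text. A quick Python check (illustrative):

```python
# "Encoded B" in UTF-16-LE: every ASCII byte is followed by a NUL byte.
raw = "Encoded B".encode("utf-16-le")
assert raw == b"E\x00n\x00c\x00o\x00d\x00e\x00d\x00 \x00B\x00"
```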





[jira] [Updated] (DRILL-2355) TRUNC function returns incorrect result when second parameter is specified

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2355:

Priority: Critical  (was: Minor)

> TRUNC function returns incorrect result when second parameter is specified
> --
>
> Key: DRILL-2355
> URL: https://issues.apache.org/jira/browse/DRILL-2355
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Priority: Critical
>  Labels: document_if_not_fixed
> Fix For: Future
>
>
> I believe our TRUNC function is modeled on the Postgres TRUNC function.
> Semantics of the Postgres TRUNC function:
> {code}
> postgres=# select trunc(1234.1234);
>  trunc
> ---
>   1234
> (1 row)
> postgres=# select trunc(1234.1234, 0);
>  trunc
> ---
>   1234
> (1 row)
> postgres=# select trunc(1234.1234, 2);
>   trunc
> -
>  1234.12
> (1 row)
> postgres=# select trunc(1234.1234, -1);
>  trunc 
> ---
>   1230
> (1 row)
> postgres=# select trunc(1234.1234, -3);
>  trunc
> ---
>   1000
> (1 row)
> {code}
> This is incorrect: I can't truncate to zero decimal places:
> {code}
> 0: jdbc:drill:schema=dfs> select trunc(1234.1234) from sys.options limit 1;
> +---------+
> | EXPR$0  |
> +---------+
> | 1234.0  |
> +---------+
> 1 row selected (0.133 seconds)
> 0: jdbc:drill:schema=dfs> select trunc(1234.1234,0) from sys.options limit 1;
> +---------+
> | EXPR$0  |
> +---------+
> | 1234.0  |
> +---------+
> 1 row selected (0.065 seconds)
> {code}
> A negative second parameter does not do what it is supposed to do either:
> {code}
> 0: jdbc:drill:schema=dfs> select trunc(1234.1234, -1) from sys.options limit 1;
> +---------+
> | EXPR$0  |
> +---------+
> | 1230.0  |
> +---------+
> 1 row selected (0.068 seconds)
> 0: jdbc:drill:schema=dfs> select trunc(1234.1234, -3) from sys.options limit 1;
> +---------+
> | EXPR$0  |
> +---------+
> | 1000.0  |
> +---------+
> 1 row selected (0.072 seconds)
> {code}
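The Postgres-style semantics described above can be modeled in Python with `Decimal` (a reference model of the expected behavior under the assumption that Drill should match Postgres, not Drill's actual implementation):

```python
from decimal import Decimal, ROUND_DOWN

def trunc(value, places=0):
    # Truncate toward zero to `places` decimal places; a negative
    # `places` truncates to the left of the decimal point.
    quantum = Decimal(1).scaleb(-places)  # places=2 -> 0.01, places=-1 -> 1E+1
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_DOWN)

assert trunc(1234.1234) == Decimal("1234")
assert trunc(1234.1234, 2) == Decimal("1234.12")
assert trunc(1234.1234, -1) == Decimal("1230")
assert trunc(1234.1234, -3) == Decimal("1000")
```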





[jira] [Commented] (DRILL-3712) Drill does not recognize UTF-16-LE encoding

2015-10-08 Thread Steven Phillips (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949499#comment-14949499
 ] 

Steven Phillips commented on DRILL-3712:


The second column is utf16 encoded. I don't think any of our cast functions 
will deal with it properly. Nor will any of the string functions.

> Drill does not recognize UTF-16-LE encoding
> ---
>
> Key: DRILL-3712
> URL: https://issues.apache.org/jira/browse/DRILL-3712
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.1.0
> Environment: OSX, likely Linux. 
>Reporter: Edmon Begoli
> Fix For: Future
>
>
> We are unable to process files that OSX identifies as character set UTF-16LE.
> After unzipping and converting to UTF-8, we are able to process them fine.
> There are CONVERT_TO and CONVERT_FROM functions that appear to address the
> issue, but we were unable to make them work on a gzipped or unzipped version
> of the UTF-16 file. We were able to use CONVERT_FROM OK, but when we tried
> to wrap its result in a cast to a date, or anything else, it failed.
> Trying to work with the file natively exposed its double-byte nature (a
> substring 1,4 only returns the first two characters).
> I cannot post the data because it is proprietary in nature, but I am posting 
> this code that might be useful in re-creating an issue:
> {noformat}
> #!/usr/bin/env python
> """ Generates a test psv file with some text fields encoded as UTF-16-LE. """
> def write_utf16le_encoded_psv():
>     total_lines = 10
>     encoded = "Encoded B".encode("utf-16-le")
>     with open("test.psv", "wb") as csv_file:
>         csv_file.write("header 1|header 2|header 3\n")
>         for i in xrange(total_lines):
>             csv_file.write("value A" + str(i) + "|" + encoded + "|value C" + str(i) + "\n")
>
> if __name__ == "__main__":
>     write_utf16le_encoded_psv()
> {noformat}
> then:
> tar zcvf test.psv.tar.gz test.psv





[jira] [Comment Edited] (DRILL-3712) Drill does not recognize UTF-16-LE encoding

2015-10-08 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949489#comment-14949489
 ] 

Deneche A. Hakim edited comment on DRILL-3712 at 10/8/15 10:13 PM:
---

[~ebegoli] I did the following using the latest master:
- I used your script to create a test.psv file
- I created a gzipped version of the file (just .gz not tar.gz)
- I updated the "psv" definition in my dfs storage plugin like this:
{noformat}
"psv": {
  "type": "text",
  "extensions": [
"tbl",
"psv"
  ],
  "skipFirstLine": true,
  "delimiter": "|"
}
{noformat}

Here are the results I get when I query the file:
{noformat}
0: jdbc:drill:zk=local> select * from dfs.data.`test.psv.gz`;
+--------------------------------------------------------------------------------------------+
|                                          columns                                           |
+--------------------------------------------------------------------------------------------+
| ["value A0","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C0"]  |
| ["value A1","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C1"]  |
| ["value A2","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C2"]  |
| ["value A3","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C3"]  |
| ["value A4","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C4"]  |
| ["value A5","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C5"]  |
| ["value A6","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C6"]  |
| ["value A7","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C7"]  |
| ["value A8","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C8"]  |
| ["value A9","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C9"]  |
+--------------------------------------------------------------------------------------------+
10 rows selected (0.136 seconds)
{noformat}

{noformat}
0: jdbc:drill:zk=local> select columns[0], columns[1], columns[2] from 
dfs.data.`test.psv.gz`;
+-----------+------------+-----------+
|  EXPR$0   |   EXPR$1   |  EXPR$2   |
+-----------+------------+-----------+
| value A0  | Encoded B  | value C0  |
| value A1  | Encoded B  | value C1  |
| value A2  | Encoded B  | value C2  |
| value A3  | Encoded B  | value C3  |
| value A4  | Encoded B  | value C4  |
| value A5  | Encoded B  | value C5  |
| value A6  | Encoded B  | value C6  |
| value A7  | Encoded B  | value C7  |
| value A8  | Encoded B  | value C8  |
| value A9  | Encoded B  | value C9  |
+-----------+------------+-----------+
10 rows selected (0.194 seconds)
{noformat}

Do you have more details about how to reproduce the issues you are seeing?


was (Author: adeneche):
[~ebegoli] I did the following using the latest master:
- I used your script to create a test.psv file
- I created a gzipped version of the file (just .gz not tar.gz)
- I updated the "psv" definition in my dfs storage plugin like this:
{noformat}
"psv": {
  "type": "text",
  "extensions": [
"tbl",
"psv"
  ],
  "skipFirstLine": true,
  "delimiter": "|"
}
{noformat}

Here are the results I get when I query the file:
{noformat}
0: jdbc:drill:zk=local> select * from dfs.data.`test.psv.gz`;
+--------------------------------------------------------------------------------------------+
|                                          columns                                           |
+--------------------------------------------------------------------------------------------+
| ["value A0","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C0"]  |
| ["value A1","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C1"]  |
| ["value A2","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C2"]  |
| ["value A3","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C3"]  |
| ["value A4","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C4"]  |
| ["value A5","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C5"]  |
| ["value A6","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C6"]  |
| ["value A7","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C7"]  |
| ["value A8","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C8"]  |
| ["value A9","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C9"]  |
+--------------------------------------------------------------------------------------------+
10 rows selected (0.136 seconds)
{noformat}

{noformat}

[jira] [Commented] (DRILL-3712) Drill does not recognize UTF-16-LE encoding

2015-10-08 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949489#comment-14949489
 ] 

Deneche A. Hakim commented on DRILL-3712:
-

[~ebegoli] I did the following using the latest master:
- I used your script to create a test.psv file
- I created a gzipped version of the file (just .gz not tar.gz)
- I updated the "psv" definition in my dfs storage plugin like this:
{noformat}
"psv": {
  "type": "text",
  "extensions": [
"tbl",
"psv"
  ],
  "skipFirstLine": true,
  "delimiter": "|"
}
{noformat}

Here are the results I get when I query the file:
{noformat}
0: jdbc:drill:zk=local> select * from dfs.data.`test.psv.gz`;
+--------------------------------------------------------------------------------------------+
|                                          columns                                           |
+--------------------------------------------------------------------------------------------+
| ["value A0","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C0"]  |
| ["value A1","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C1"]  |
| ["value A2","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C2"]  |
| ["value A3","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C3"]  |
| ["value A4","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C4"]  |
| ["value A5","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C5"]  |
| ["value A6","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C6"]  |
| ["value A7","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C7"]  |
| ["value A8","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C8"]  |
| ["value A9","E\u0000n\u0000c\u0000o\u0000d\u0000e\u0000d\u0000 \u0000B\u0000","value C9"]  |
+--------------------------------------------------------------------------------------------+
10 rows selected (0.136 seconds)
{noformat}

{noformat}
0: jdbc:drill:zk=local> select columns[0], columns[1], columns[2] from 
dfs.data.`test.psv.gz`;
+-----------+------------+-----------+
|  EXPR$0   |   EXPR$1   |  EXPR$2   |
+-----------+------------+-----------+
| value A0  | Encoded B  | value C0  |
| value A1  | Encoded B  | value C1  |
| value A2  | Encoded B  | value C2  |
| value A3  | Encoded B  | value C3  |
| value A4  | Encoded B  | value C4  |
| value A5  | Encoded B  | value C5  |
| value A6  | Encoded B  | value C6  |
| value A7  | Encoded B  | value C7  |
| value A8  | Encoded B  | value C8  |
| value A9  | Encoded B  | value C9  |
+-----------+------------+-----------+
10 rows selected (0.194 seconds)
{noformat}


> Drill does not recognize UTF-16-LE encoding
> ---
>
> Key: DRILL-3712
> URL: https://issues.apache.org/jira/browse/DRILL-3712
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.1.0
> Environment: OSX, likely Linux. 
>Reporter: Edmon Begoli
> Fix For: Future
>
>
> We are unable to process files that OSX identifies as character set UTF-16LE.
> After unzipping and converting to UTF-8, we are able to process them fine.
> There are CONVERT_TO and CONVERT_FROM functions that appear to address the
> issue, but we were unable to make them work on a gzipped or unzipped version
> of the UTF-16 file. We were able to use CONVERT_FROM OK, but when we tried
> to wrap its result in a cast to a date, or anything else, it failed.
> Trying to work with the file natively exposed its double-byte nature (a
> substring 1,4 only returns the first two characters).
> I cannot post the data because it is proprietary in nature, but I am posting 
> this code that might be useful in re-creating an issue:
> {noformat}
> #!/usr/bin/env python
> """ Generates a test psv file with some text fields encoded as UTF-16-LE. """
> def write_utf16le_encoded_psv():
>     total_lines = 10
>     encoded = "Encoded B".encode("utf-16-le")
>     with open("test.psv", "wb") as csv_file:
>         csv_file.write("header 1|header 2|header 3\n")
>         for i in xrange(total_lines):
>             csv_file.write("value A" + str(i) + "|" + encoded + "|value C" + str(i) + "\n")
>
> if __name__ == "__main__":
>     write_utf16le_encoded_psv()
> {noformat}
> then:
> tar zcvf test.psv.tar.gz test.psv





[jira] [Updated] (DRILL-3630) error message needs to be fixed LEAD , LAG window functions

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3630:
--
Assignee: (was: Deneche A. Hakim)

> error message needs to be fixed LEAD , LAG window functions
> ---
>
> Key: DRILL-3630
> URL: https://issues.apache.org/jira/browse/DRILL-3630
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: private-branch 
> https://github.com/adeneche/incubator-drill/tree/new-window-funcs
>Reporter: Khurram Faraaz
>Priority: Minor
>  Labels: window_function
> Fix For: Future
>
>
> The error message says LEAD expects ONE input argument; however, as the
> second query below shows, lead(col9, 1) does not throw any error (this is
> valid syntax and is supported). Another case is lead(col9,1,1), where no
> error is reported at all. We need to fix the error message and handle the
> inputs appropriately.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select col9 , lead(col9,1,0,1) over(partition 
> by col7 order by col0) lead_col9 from FEWRWSPQQ_101;
> Error: PARSE ERROR: From line 1, column 15 to line 1, column 35: Invalid 
> number of arguments to function 'LEAD'. Was expecting 1 arguments
> [Error Id: 4a94487c-42e6-4672-978e-f011c997c537 on centos-03.qa.lab:31010] 
> (state=,code=0)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select col9 , lead(col9,1) over(partition by 
> col7 order by col0) lead_col9 from FEWRWSPQQ_101;
> +-------+------------+
> | col9  | lead_col9  |
> +-------+------------+
> | OXCB  | ZXCZ       |
> | ZXCZ  | AXCZ       |
> | AXCZ  | CXCB       |
> | CXCB  | HXCZ       |
> | HXCZ  | UXCB       |
> | UXCB  | TXCD       |
> | TXCD  | PXCD       |
> | PXCD  | NXCB       |
> | NXCB  | MXCB       |
> | MXCB  | WXCB       |
> | WXCB  | null       |
> | KXCB  | BXCD       |
> | BXCD  | DXCB       |
> | DXCB  | IXCD       |
> | IXCD  | YXCB       |
> | YXCB  | EXCZ       |
> | EXCZ  | LXCB       |
> | LXCB  | SXCB       |
> | SXCB  | VXCZ       |
> | VXCZ  | FXCB       |
> | FXCB  | QXCZ       |
> | QXCZ  | null       |
> +-------+------------+
> 22 rows selected (0.245 seconds)
> {code}
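For context, the LEAD(col, offset, default) semantics the queries above rely on can be sketched in Python (an illustrative model over an already-ordered partition, not Drill's implementation):

```python
def lead(values, offset=1, default=None):
    # LEAD(col, offset, default): for each row in the ordered partition,
    # the value `offset` rows ahead, or `default` past the end.
    n = len(values)
    return [values[i + offset] if i + offset < n else default
            for i in range(n)]

assert lead(["OXCB", "ZXCZ", "AXCZ"]) == ["ZXCZ", "AXCZ", None]
assert lead(["OXCB", "ZXCZ", "AXCZ"], 1, "X") == ["ZXCZ", "AXCZ", "X"]
```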





[jira] [Updated] (DRILL-3647) Handle null as input to window function NTILE

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3647:
--
Assignee: (was: Deneche A. Hakim)

> Handle null as input to window function NTILE 
> --
>
> Key: DRILL-3647
> URL: https://issues.apache.org/jira/browse/DRILL-3647
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
> Environment: private-branch 
> https://github.com/adeneche/incubator-drill/tree/new-window-funcs
>Reporter: Khurram Faraaz
>  Labels: window_function
> Fix For: Future
>
>
> We need to handle null as input to window functions. The NTILE function must
> return null as output when its input is null.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select col7 , col0 , ntile(null) over(partition 
> by col7 order by col0) lead_col0 from FEWRWSPQQ_101;
> Error: PARSE ERROR: From line 1, column 22 to line 1, column 37: Argument to 
> function 'NTILE' must not be NULL
> [Error Id: e5e69582-8502-4a99-8ba1-dffdfb8ac028 on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select col7 , col0 , lead(null) over(partition 
> by col7 order by col0) lead_col0 from FEWRWSPQQ_101;
> Error: PARSE ERROR: From line 1, column 27 to line 1, column 30: Illegal use 
> of 'NULL'
> [Error Id: 6824ca01-e3f1-4338-b4c8-5535e7a42e13 on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
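For context, standard NTILE bucketing can be sketched in Python (an illustrative model of the function whose null handling the issue discusses; the issue argues NTILE(NULL) should yield NULL rather than a parse error):

```python
def ntile(n, num_rows):
    # NTILE(n): assign each of `num_rows` ordered rows in a partition to
    # one of n buckets, as evenly as possible; earlier buckets receive the
    # extra rows when num_rows is not divisible by n.
    base, extra = divmod(num_rows, n)
    buckets = []
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)
        buckets.extend([bucket] * size)
    return buckets

assert ntile(4, 10) == [1, 1, 1, 2, 2, 2, 3, 3, 4, 4]
```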





[jira] [Updated] (DRILL-2796) Select keys from JSON file where not in null results in RelOptPlanner.CannotPlanException

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-2796:
--
Assignee: (was: Jinfeng Ni)

> Select keys from JSON file where  not in null results in 
> RelOptPlanner.CannotPlanException
> ---
>
> Key: DRILL-2796
> URL: https://issues.apache.org/jira/browse/DRILL-2796
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0
> Environment:  9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: 
> Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 
> EDT
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> A query that has NOT IN (null) in its predicate results in a
> RelOptPlanner.CannotPlanException. Tests were run on a 4-node cluster on
> CentOS.
> Data is being selected from a JSON data file.
> {code}
> 0: jdbc:drill:> select * from `mKeyJSN.json`;
> +-------+----------+---------------+-------+-------------+--------------------------+---------------+----------+
> | key1  |   key2   |     key3      | key4  |    key5     |           key6           |     key7      |   key8   |
> +-------+----------+---------------+-------+-------------+--------------------------+---------------+----------+
> | 1234  | null     | null          | null  | null        | null                     | null          | null     |
> | null  | 1245685  | null          | null  | null        | null                     | null          | null     |
> | null  | null     | hello world!  | null  | null        | null                     | null          | null     |
> | null  | null     | null          | true  | null        | null                     | null          | null     |
> | null  | null     | null          | null  | 2000-03-10  | null                     | null          | null     |
> | null  | null     | null          | null  | null        | 2012-01-21 15:19:12.123  | null          | null     |
> | null  | null     | null          | null  | null        | null                     | 21:34:32.321  | null     |
> | null  | null     | null          | null  | null        | null                     | null          | 9789.99  |
> +-------+----------+---------------+-------+-------------+--------------------------+---------------+----------+
> 8 rows selected (0.1 seconds)
> store.json.all_text_mode was set to false
> 0: jdbc:drill:> select * from sys.options where name like 
> '%json.all_text_mode%';
> +---------------------------+----------+---------+----------+-------------+-----------+------------+
> |           name            |   kind   |  type   | num_val  | string_val  | bool_val  | float_val  |
> +---------------------------+----------+---------+----------+-------------+-----------+------------+
> | store.json.all_text_mode  | BOOLEAN  | SYSTEM  | null     | null        | false     | null       |
> +---------------------------+----------+---------+----------+-------------+-----------+------------+
> 1 row selected (0.135 seconds)
> Failing query
> 0: jdbc:drill:> select key1,key2,key3,key4,key5,key6,key7,key8 from 
> `mKeyJSN.json` where key5 not in (null);
> Query failed: RelOptPlanner.CannotPlanException: Node 
> [rel#14924:Subset#7.LOGICAL.ANY([]).[]] could not be implemented; planner 
> state:
> Root: rel#14924:Subset#7.LOGICAL.ANY([]).[]
> Original rel:
> AbstractConverter(subset=[rel#14924:Subset#7.LOGICAL.ANY([]).[]], 
> convention=[LOGICAL], DrillDistributionTraitDef=[ANY([])], sort=[[]]): 
> rowcount = 1.7976931348623157E308, cumulative cost = {inf}, id = 14925
>   ProjectRel(subset=[rel#14923:Subset#7.NONE.ANY([]).[]], key1=[$2], 
> key2=[$3], key3=[$4], key4=[$5], key5=[$1], key6=[$6], key7=[$7], key8=[$8]): 
> rowcount = 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 
> rows, Infinity cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14922
> FilterRel(subset=[rel#14921:Subset#6.NONE.ANY([]).[]], 
> condition=[AND(NOT(IS TRUE($11)), IS NOT NULL($9))]): rowcount = 
> 4.0448095534402104E307, cumulative cost = {4.0448095534402104E307 rows, 
> 1.7976931348623157E308 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14920
>   JoinRel(subset=[rel#14919:Subset#5.NONE.ANY([]).[]], condition=[=($9, 
> $10)], joinType=[left]): rowcount = 1.7976931348623157E308, cumulative cost = 
> {1.7976931348623157E308 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 
> 14918
> ProjectRel(subset=[rel#14912:Subset#1.NONE.ANY([]).[]], $f0=[$0], 
> $f1=[$1], $f2=[$2], $f3=[$3], $f4=[$4], $f5=[$5], $f6=[$6], $f7=[$7], 
> $f8=[$8], $f9=[$1]): rowcount = 100.0, cumulative cost = {100.0 rows, 1000.0 
> cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14911
>   
> EnumerableTableAccessRel(subset=[rel#14910:Subset#0.ENUMERABLE.ANY([]).[]], 
> table=[[dfs, tmp, mKeyJSN.js

[jira] [Updated] (DRILL-2796) Select keys from JSON file where not in null results in RelOptPlanner.CannotPlanException

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-2796:
--
Fix Version/s: (was: 1.3.0)
   Future

> Select keys from JSON file where  not in null results in 
> RelOptPlanner.CannotPlanException
> ---
>
> Key: DRILL-2796
> URL: https://issues.apache.org/jira/browse/DRILL-2796
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0
> Environment:  9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: 
> Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 
> EDT
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
> Fix For: Future
>
>
> A query that has NOT IN (null) in its predicate results in a
> RelOptPlanner.CannotPlanException. Tests were run on a 4-node cluster on
> CentOS.
> Data is being selected from a JSON data file.
> {code}
> 0: jdbc:drill:> select * from `mKeyJSN.json`;
> +-------+----------+---------------+-------+-------------+--------------------------+---------------+----------+
> | key1  |   key2   |     key3      | key4  |    key5     |           key6           |     key7      |   key8   |
> +-------+----------+---------------+-------+-------------+--------------------------+---------------+----------+
> | 1234  | null     | null          | null  | null        | null                     | null          | null     |
> | null  | 1245685  | null          | null  | null        | null                     | null          | null     |
> | null  | null     | hello world!  | null  | null        | null                     | null          | null     |
> | null  | null     | null          | true  | null        | null                     | null          | null     |
> | null  | null     | null          | null  | 2000-03-10  | null                     | null          | null     |
> | null  | null     | null          | null  | null        | 2012-01-21 15:19:12.123  | null          | null     |
> | null  | null     | null          | null  | null        | null                     | 21:34:32.321  | null     |
> | null  | null     | null          | null  | null        | null                     | null          | 9789.99  |
> +-------+----------+---------------+-------+-------------+--------------------------+---------------+----------+
> 8 rows selected (0.1 seconds)
> store.json.all_text_mode was set to false
> 0: jdbc:drill:> select * from sys.options where name like 
> '%json.all_text_mode%';
> +---------------------------+----------+---------+----------+-------------+-----------+------------+
> | name                      | kind     | type    | num_val  | string_val  | bool_val  | float_val  |
> +---------------------------+----------+---------+----------+-------------+-----------+------------+
> | store.json.all_text_mode  | BOOLEAN  | SYSTEM  | null     | null        | false     | null       |
> +---------------------------+----------+---------+----------+-------------+-----------+------------+
> 1 row selected (0.135 seconds)
> Failing query
> 0: jdbc:drill:> select key1,key2,key3,key4,key5,key6,key7,key8 from 
> `mKeyJSN.json` where key5 not in (null);
> Query failed: RelOptPlanner.CannotPlanException: Node 
> [rel#14924:Subset#7.LOGICAL.ANY([]).[]] could not be implemented; planner 
> state:
> Root: rel#14924:Subset#7.LOGICAL.ANY([]).[]
> Original rel:
> AbstractConverter(subset=[rel#14924:Subset#7.LOGICAL.ANY([]).[]], 
> convention=[LOGICAL], DrillDistributionTraitDef=[ANY([])], sort=[[]]): 
> rowcount = 1.7976931348623157E308, cumulative cost = {inf}, id = 14925
>   ProjectRel(subset=[rel#14923:Subset#7.NONE.ANY([]).[]], key1=[$2], 
> key2=[$3], key3=[$4], key4=[$5], key5=[$1], key6=[$6], key7=[$7], key8=[$8]): 
> rowcount = 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 
> rows, Infinity cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14922
> FilterRel(subset=[rel#14921:Subset#6.NONE.ANY([]).[]], 
> condition=[AND(NOT(IS TRUE($11)), IS NOT NULL($9))]): rowcount = 
> 4.0448095534402104E307, cumulative cost = {4.0448095534402104E307 rows, 
> 1.7976931348623157E308 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14920
>   JoinRel(subset=[rel#14919:Subset#5.NONE.ANY([]).[]], condition=[=($9, 
> $10)], joinType=[left]): rowcount = 1.7976931348623157E308, cumulative cost = 
> {1.7976931348623157E308 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 
> 14918
> ProjectRel(subset=[rel#14912:Subset#1.NONE.ANY([]).[]], $f0=[$0], 
> $f1=[$1], $f2=[$2], $f3=[$3], $f4=[$4], $f5=[$5], $f6=[$6], $f7=[$7], 
> $f8=[$8], $f9=[$1]): rowcount = 100.0, cumulative cost = {100.0 rows, 1000.0 
> cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14911
>   
> EnumerableTableAccessRel(subset=[rel#14910:Subs
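Planner failure aside, it is worth noting what the predicate itself means: under SQL three-valued logic, key5 not in (null) evaluates to UNKNOWN for every row, so even a correctly planned query would return zero rows. A minimal sketch of that logic, with Python's None standing in for SQL NULL/UNKNOWN (illustration only, not Drill code):

```python
# Sketch of SQL three-valued logic; None stands in for NULL / UNKNOWN.
def sql_eq(a, b):
    # Any comparison involving NULL is UNKNOWN.
    return None if a is None or b is None else a == b

def sql_in(value, candidates):
    # value IN (...) is TRUE if some candidate matches, UNKNOWN if there is
    # no match but a NULL candidate is present, FALSE otherwise.
    results = [sql_eq(value, c) for c in candidates]
    if True in results:
        return True
    return None if None in results else False

def sql_not(x):
    return None if x is None else not x

# key5 NOT IN (null) is UNKNOWN, and WHERE keeps only TRUE rows.
assert sql_not(sql_in("2000-03-10", [None])) is None
assert sql_not(sql_in(5, [1, 2])) is True
```

So a plan for this query would effectively filter out every row, which is consistent with the AND(NOT(IS TRUE($11)), ...) condition the planner derived above.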

[jira] [Commented] (DRILL-2853) AssertionError: Internal error: while converting `KVGEN`(*)

2015-10-08 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949459#comment-14949459
 ] 

Khurram Faraaz commented on DRILL-2853:
---

IOB Exception on Drill 1.2 master commit id eafe0a24

{code}
0: jdbc:drill:schema=dfs.tmp> select kvgen(*) from `kvgenData.json`;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index (-1) must not be negative


[Error Id: 9a9204bc-0701-4fb1-b322-bae78019a6fc on centos-04.qa.lab:31010] 
(state=,code=0)
{code}

> AssertionError: Internal error: while converting `KVGEN`(*)
> ---
>
> Key: DRILL-2853
> URL: https://issues.apache.org/jira/browse/DRILL-2853
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.9.0
> Environment: 64e3ec52b93e9331aa5179e040eca19afece8317 | DRILL-2611: 
> value vectors should report valid value count | 16.04.2015 @ 13:53:34 EDT
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> I am seeing an assertion error in the query below, which uses the KVGEN
> function. The data in the JSON file was:
> {code}
> 0: jdbc:drill:> select * from `kvgenData.json`;
> +--------+------------+-------+----------+------------+---------------+
> | eid    | ename      | eage  | empDept  | empSalary  | empPhone      |
> +--------+------------+-------+----------+------------+---------------+
> | 12345  | MR WRIGHT  | null  | null     | null       | null          |
> | null   | null       | 45    | HR       | null       | null          |
> | null   | null       | null  | null     | 5          | 123-456-6789  |
> +--------+------------+-------+----------+------------+---------------+
> 3 rows selected (0.185 seconds)
> {code}
> Failing query
> {code}
> 0: jdbc:drill:> select kvgen(*) from `kvgenData.json`;
> Query failed: SYSTEM ERROR: Unexpected exception during fragment 
> initialization: Internal error: while converting `KVGEN`(*)
> [13b9f79f-6f24-43d4-9119-2cdae441895f on centos-02.qa.lab:31010]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
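For context on why the argument matters: KVGEN turns a single map value into a repeated list of key/value pairs, so it expects one map-typed expression rather than the row wildcard `*`. A rough sketch of that transformation in plain Python (an illustration of the intended semantics, not Drill internals):

```python
# Toy model of KVGEN semantics: one map value becomes a list of
# {"key": ..., "value": ...} records. kvgen(*) fails because "*" is the
# whole row, not a single map expression.
def kvgen(map_value):
    return [{"key": k, "value": v} for k, v in map_value.items()]

row = {"eid": 12345, "ename": "MR WRIGHT"}
assert kvgen(row) == [
    {"key": "eid", "value": 12345},
    {"key": "ename", "value": "MR WRIGHT"},
]
```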
> Stack trace from drillbit.log 
> {code}
> 2015-04-23 06:42:15,807 [2ac76bb7-e392-bb0b-c5e3-91dfa98b2c94:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
> FAILED
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization: Internal error: while converting `KVGEN`(*)
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:211) 
> [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_75]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_75]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> Caused by: java.lang.AssertionError: Internal error: while converting 
> `KVGEN`(*)
> at org.eigenbase.util.Util.newInternal(Util.java:750) 
> ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.eigenbase.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:149)
>  ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.eigenbase.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:52)
>  ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4099)
>  ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3485)
>  ~[optiq-core-0.9-drill-r21.jar:na]
> at org.eigenbase.sql.SqlCall.accept(SqlCall.java:125) 
> ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:3994)
>  ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:3301)
>  ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:519)
>  ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:474)
>  ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2657)
>  ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:432)
>  ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> net.hydromatic.optiq.prepare.PlannerImpl.convert(PlannerImpl.java:190) 
> ~[optiq-core-0.9-drill-r21.jar:na]
> at 
> org.a

[jira] [Commented] (DRILL-3712) Drill does not recognize UTF-16-LE encoding

2015-10-08 Thread Steven Phillips (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949453#comment-14949453
 ] 

Steven Phillips commented on DRILL-3712:


What convert_from function did you use?

> Drill does not recognize UTF-16-LE encoding
> ---
>
> Key: DRILL-3712
> URL: https://issues.apache.org/jira/browse/DRILL-3712
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.1.0
> Environment: OSX, likely Linux. 
>Reporter: Edmon Begoli
> Fix For: Future
>
>
> We are unable to process files that OSX identifies as character set UTF-16LE.
> After unzipping and converting to UTF-8, we are able to process them fine.
> There are CONVERT_TO and CONVERT_FROM functions that appear to address the
> issue, but we were unable to make them work on a gzipped or unzipped version
> of the UTF-16 file. We were able to use CONVERT_FROM ok, but when we tried
> to wrap the results of that to cast as a date, or anything else, it failed.
> Trying to work with it natively caused the double-byte nature to appear (a
> substring(1,4) only returns the first two characters).
> I cannot post the data because it is proprietary, but I am posting this code
> that might be useful in re-creating the issue:
> {noformat}
> #!/usr/bin/env python
> """ Generates a test psv file with some text fields encoded as UTF-16-LE. """
> def write_utf16le_encoded_psv():
>     total_lines = 10
>     encoded = "Encoded B".encode("utf-16-le")
>     with open("test.psv", "wb") as csv_file:
>         csv_file.write("header 1|header 2|header 3\n")
>         for i in xrange(total_lines):
>             csv_file.write("value A" + str(i) + "|" + encoded + "|value C" + str(i) + "\n")
>
> if __name__ == "__main__":
>     write_utf16le_encoded_psv()
> {noformat}
> then:
> tar zcvf test.psv.tar.gz test.psv
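The "converting to UTF8" workaround the reporter mentions can be scripted; a minimal sketch, with file names assumed for illustration rather than taken from the report:

```python
# Hypothetical workaround sketch: re-encode a UTF-16-LE file to UTF-8 before
# handing it to Drill. File names here are illustrative.
import codecs

def convert_utf16le_to_utf8(src_path, dst_path):
    # codecs.open decodes each line from UTF-16-LE; the output is written UTF-8.
    with codecs.open(src_path, "r", encoding="utf-16-le") as src, \
         codecs.open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line)

if __name__ == "__main__":
    convert_utf16le_to_utf8("test.psv", "test_utf8.psv")
```

After conversion, the resulting file should query cleanly without CONVERT_FROM, matching the reporter's observation that the UTF-8 copy processes fine.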



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3384) CTAS Partitioning by a non-existing column results in NPE.

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3384:
--
Assignee: (was: Sean Hsuan-Yi Chu)

> CTAS Partitioning by a non-existing column results in NPE.
> --
>
> Key: DRILL-3384
> URL: https://issues.apache.org/jira/browse/DRILL-3384
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> CTAS Partitioning by a non-existing column results in NPE. 
> Input CSV file had 107 rows, with many duplicates. Note that columns[2] 
> does not exist in the CSV file used in CTAS.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> create table ctas_prtn_978  as select 
> columns[0] c1, columns[2] c2 from `manyDuplicates.csv`;
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 107                        |
> +-----------+----------------------------+
> 1 row selected (1.822 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs.tmp> create table ctas_prtn_999 partition by (c2) as 
> select * from ctas_prtn_978;
> Error: SYSTEM ERROR: NullPointerException: src
> Fragment 0:0
> [Error Id: a464dd65-ba7d-412b-b97c-4e31ac74a9d7 on centos-01.qa.lab:31010] 
> (state=,code=0)
> {code}
> {code}
> 2015-06-26 01:35:41,995 [2a735392-aa07-be41-ec14-bc5591aa044d:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException: src
> Fragment 0:0
> [Error Id: 86f15f89-ba57-40b9-bff8-554c0fee481b on centos-01.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException: src
> Fragment 0:0
> [Error Id: 86f15f89-ba57-40b9-bff8-554c0fee481b on centos-01.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:326)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:181)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:295)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.lang.NullPointerException: src
> at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:252)
>  ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:378) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:28)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:4.0.27.Final]
> at io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:699) 
> ~[drill-java-exec-1.1.0-SNAPSHOT.jar:4.0.27.Final]
> at 
> org.apache.drill.exec.test.generated.ProjectorGen4107.doEval(ProjectorTemplate.java:109)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.ProjectorGen4107.projectRecords(ProjectorTemplate.java:62)
>  ~[na:na]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:172)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.phys

[jira] [Updated] (DRILL-3417) Filter present in query plan when we should not see one

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3417:
--
Assignee: (was: Jinfeng Ni)

> Filter present in query plan when we should not see one 
> 
>
> Key: DRILL-3417
> URL: https://issues.apache.org/jira/browse/DRILL-3417
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
> Fix For: Future, 1.3.0
>
>
> We are seeing a FILTER in the query plan; in this case we should not see one.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE CTAS_ONE_MILN_RWS_PER_GROUP(col1, 
> col2) PARTITION BY (col2) AS select cast(columns[0] as bigint) col1, 
> cast(columns[1] as char(2)) col2 from `millionValGroup.csv`;
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 1_1       | 21932064                   |
> | 1_0       | 28067936                   |
> +-----------+----------------------------+
> 2 rows selected (73.661 seconds)
> {code}
> Total number of rows in CTAS output
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(*) from 
> CTAS_ONE_MILN_RWS_PER_GROUP;
> +-----------+
> |  EXPR$0   |
> +-----------+
> | 50000000  |
> +-----------+
> 1 row selected (0.197 seconds)
> {code}
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where 
> col2 >= 'AK';
> | 00-00    Screen
> 00-01      Project(col1=[$0], col2=[$1])
> 00-02        UnionExchange
> 01-01          Project(col1=[$1], col2=[$0])
> 01-02            SelectionVectorRemover
> 01-03              Filter(condition=[>=($0, 'AK')])
> 01-04                Project(col2=[$1], col1=[$0])
> 01-05                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/CTAS_ONE_MILN_RWS_PER_GROUP]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=1, columns=[`col2`, `col1`]]])
> {code}
> Number of files created by CTAS
> {code}
> [root@centos-01 ~]# hadoop fs -ls /tmp/CTAS_ONE_MILN_RWS_PER_GROUP
> Found 98 items
> -rwxr-xr-x   3 mapr mapr    2907957 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_1.parquet
> -rwxr-xr-x   3 mapr mapr    2902189 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_10.parquet
> -rwxr-xr-x   3 mapr mapr    2910365 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_11.parquet
> -rwxr-xr-x   3 mapr mapr    2906479 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_12.parquet
> -rwxr-xr-x   3 mapr mapr    2900842 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_13.parquet
> -rwxr-xr-x   3 mapr mapr    2901196 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_14.parquet
> -rwxr-xr-x   3 mapr mapr    2909687 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_15.parquet
> -rwxr-xr-x   3 mapr mapr    2908603 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_16.parquet
> -rwxr-xr-x   3 mapr mapr    2903334 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_17.parquet
> -rwxr-xr-x   3 mapr mapr    2906378 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_18.parquet
> -rwxr-xr-x   3 mapr mapr    2904710 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_19.parquet
> -rwxr-xr-x   3 mapr mapr    2903170 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_2.parquet
> -rwxr-xr-x   3 mapr mapr    2908703 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_20.parquet
> -rwxr-xr-x   3 mapr mapr    2903634 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_21.parquet
> -rwxr-xr-x   3 mapr mapr    2898076 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_22.parquet
> -rwxr-xr-x   3 mapr mapr    2899426 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_23.parquet
> -rwxr-xr-x   3 mapr mapr    2903914 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_24.parquet
> -rwxr-xr-x   3 mapr mapr    2906561 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_25.parquet
> -rwxr-xr-x   3 mapr mapr    2899655 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_26.parquet
> -rwxr-xr-x   3 mapr mapr    2902479 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_27.parquet
> -rwxr-xr-x   3 mapr mapr    2905985 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_28.parquet
> -rwxr-xr-x   3 mapr mapr    2901645 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_29.parquet
> -rwxr-xr-x   3 mapr mapr    2901653 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet
> -rwxr-xr-x   3 mapr mapr    2903008 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_30.parquet
> -rwxr-xr-x   3 mapr mapr    2898135 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/

[jira] [Updated] (DRILL-3417) Filter present in query plan when we should not see one

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3417:
--
Fix Version/s: Future

> Filter present in query plan when we should not see one 
> 
>
> Key: DRILL-3417
> URL: https://issues.apache.org/jira/browse/DRILL-3417
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
> Fix For: Future, 1.3.0
>
>
> We are seeing a FILTER in the query plan; in this case we should not see one.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE CTAS_ONE_MILN_RWS_PER_GROUP(col1, 
> col2) PARTITION BY (col2) AS select cast(columns[0] as bigint) col1, 
> cast(columns[1] as char(2)) col2 from `millionValGroup.csv`;
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 1_1       | 21932064                   |
> | 1_0       | 28067936                   |
> +-----------+----------------------------+
> 2 rows selected (73.661 seconds)
> {code}
> Total number of rows in CTAS output
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(*) from 
> CTAS_ONE_MILN_RWS_PER_GROUP;
> +-----------+
> |  EXPR$0   |
> +-----------+
> | 50000000  |
> +-----------+
> 1 row selected (0.197 seconds)
> {code}
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where 
> col2 >= 'AK';
> | 00-00    Screen
> 00-01      Project(col1=[$0], col2=[$1])
> 00-02        UnionExchange
> 01-01          Project(col1=[$1], col2=[$0])
> 01-02            SelectionVectorRemover
> 01-03              Filter(condition=[>=($0, 'AK')])
> 01-04                Project(col2=[$1], col1=[$0])
> 01-05                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/CTAS_ONE_MILN_RWS_PER_GROUP]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=1, columns=[`col2`, `col1`]]])
> {code}
> Number of files created by CTAS
> {code}
> [root@centos-01 ~]# hadoop fs -ls /tmp/CTAS_ONE_MILN_RWS_PER_GROUP
> Found 98 items
> -rwxr-xr-x   3 mapr mapr    2907957 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_1.parquet
> -rwxr-xr-x   3 mapr mapr    2902189 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_10.parquet
> -rwxr-xr-x   3 mapr mapr    2910365 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_11.parquet
> -rwxr-xr-x   3 mapr mapr    2906479 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_12.parquet
> -rwxr-xr-x   3 mapr mapr    2900842 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_13.parquet
> -rwxr-xr-x   3 mapr mapr    2901196 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_14.parquet
> -rwxr-xr-x   3 mapr mapr    2909687 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_15.parquet
> -rwxr-xr-x   3 mapr mapr    2908603 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_16.parquet
> -rwxr-xr-x   3 mapr mapr    2903334 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_17.parquet
> -rwxr-xr-x   3 mapr mapr    2906378 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_18.parquet
> -rwxr-xr-x   3 mapr mapr    2904710 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_19.parquet
> -rwxr-xr-x   3 mapr mapr    2903170 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_2.parquet
> -rwxr-xr-x   3 mapr mapr    2908703 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_20.parquet
> -rwxr-xr-x   3 mapr mapr    2903634 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_21.parquet
> -rwxr-xr-x   3 mapr mapr    2898076 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_22.parquet
> -rwxr-xr-x   3 mapr mapr    2899426 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_23.parquet
> -rwxr-xr-x   3 mapr mapr    2903914 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_24.parquet
> -rwxr-xr-x   3 mapr mapr    2906561 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_25.parquet
> -rwxr-xr-x   3 mapr mapr    2899655 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_26.parquet
> -rwxr-xr-x   3 mapr mapr    2902479 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_27.parquet
> -rwxr-xr-x   3 mapr mapr    2905985 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_28.parquet
> -rwxr-xr-x   3 mapr mapr    2901645 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_29.parquet
> -rwxr-xr-x   3 mapr mapr    2901653 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet
> -rwxr-xr-x   3 mapr mapr    2903008 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_30.parquet
> -rwxr-xr-x   3 mapr mapr    2898135 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_31.par

[jira] [Updated] (DRILL-3384) CTAS Partitioning by a non-existing column results in NPE.

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3384:
--
Fix Version/s: (was: 1.3.0)
   Future

> CTAS Partitioning by a non-existing column results in NPE.
> --
>
> Key: DRILL-3384
> URL: https://issues.apache.org/jira/browse/DRILL-3384
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> CTAS Partitioning by a non-existing column results in NPE. 
> Input CSV file had 107 rows, with many duplicates. Note that columns[2] 
> does not exist in the CSV file used in CTAS.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> create table ctas_prtn_978  as select 
> columns[0] c1, columns[2] c2 from `manyDuplicates.csv`;
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 107                        |
> +-----------+----------------------------+
> 1 row selected (1.822 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs.tmp> create table ctas_prtn_999 partition by (c2) as 
> select * from ctas_prtn_978;
> Error: SYSTEM ERROR: NullPointerException: src
> Fragment 0:0
> [Error Id: a464dd65-ba7d-412b-b97c-4e31ac74a9d7 on centos-01.qa.lab:31010] 
> (state=,code=0)
> {code}
> {code}
> 2015-06-26 01:35:41,995 [2a735392-aa07-be41-ec14-bc5591aa044d:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException: src
> Fragment 0:0
> [Error Id: 86f15f89-ba57-40b9-bff8-554c0fee481b on centos-01.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException: src
> Fragment 0:0
> [Error Id: 86f15f89-ba57-40b9-bff8-554c0fee481b on centos-01.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:326)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:181)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:295)
>  [drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.lang.NullPointerException: src
> at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:252)
>  ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:378) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:28)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:4.0.27.Final]
> at io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:699) 
> ~[drill-java-exec-1.1.0-SNAPSHOT.jar:4.0.27.Final]
> at 
> org.apache.drill.exec.test.generated.ProjectorGen4107.doEval(ProjectorTemplate.java:109)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.ProjectorGen4107.projectRecords(ProjectorTemplate.java:62)
>  ~[na:na]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:172)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apa

[jira] [Updated] (DRILL-3456) Need support for default values for column in CTAS

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3456:
--
Fix Version/s: (was: 1.4.0)
   Future

> Need support for default values for column in CTAS
> --
>
> Key: DRILL-3456
> URL: https://issues.apache.org/jira/browse/DRILL-3456
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Data Types
>Affects Versions: 1.1.0
>Reporter: Khurram Faraaz
>Priority: Minor
> Fix For: Future
>
>
> Drill does not allow assigning default values to columns in CTAS. We should
> support default values for columns, i.e.:
> column-name DATATYPE DEFAULT value
> {code}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE TBL_DFLT_NULL(col1 default null) 
> AS SELECT cast(columns[1] as char(1)) col1 from `deflt_Null.csv`;
> Error: PARSE ERROR: Encountered "default" at line 1, column 33.
> Was expecting one of:
> ")" ...
> "," ...
> 
> [Error Id: db98f32e-27ac-47c5-b5c7-4dc33f288b58 on centos-03.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE TBL_DFLT_NULL(col1 null) AS SELECT 
> cast(columns[1] as char(1)) col1 from `deflt_Null.csv`;
> Error: PARSE ERROR: Encountered "null" at line 1, column 33.
> Was expecting one of:
> ")" ...
> "," ...
> 
> [Error Id: 69021eff-b65c-4db8-8628-8650a569175a on centos-03.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE TBL_DFLT_NULL(col1) AS SELECT 
> cast(columns[1] as char(1)) col1 default null from `deflt_Null.csv`;
> Error: PARSE ERROR: Encountered "default" at line 1, column 77.
> Was expecting one of:
> "FROM" ...
> "," ...
> 
> [Error Id: ebec6600-82b0-4c87-ad9d-756ecd1b9095 on centos-03.qa.lab:31010] 
> (state=,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3419) Handle scans optimally when all files are pruned out

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3419:
--
Assignee: (was: Jinfeng Ni)

> Handle scans optimally when all files are pruned out
> 
>
> Key: DRILL-3419
> URL: https://issues.apache.org/jira/browse/DRILL-3419
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> Note that in case (1) and case (2) we prune; however, it is not clear whether
> we prune in case (3), because we see a FILTER in the query plan in case (3).
> CTAS 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE CTAS_ONE_MILN_RWS_PER_GROUP(col1, 
> col2) PARTITION BY (col2) AS select cast(columns[0] as bigint) col1, 
> cast(columns[1] as char(2)) col2 from `millionValGroup.csv`;
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 1_1       | 21932064                   |
> | 1_0       | 28067936                   |
> +-----------+----------------------------+
> 2 rows selected (73.661 seconds)
> {code}
> case 1)
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where 
> col2 LIKE '%Z%';
> | 00-00    Screen
> 00-01      Project(col1=[$0], col2=[$1])
> 00-02        UnionExchange
> 01-01          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_3.parquet]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=2, columns=[`col2`, `col1`]]])
> {code}
> case 2)
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where 
> col2 LIKE 'A%';
> | 00-00    Screen
> 00-01      Project(col1=[$0], col2=[$1])
> 00-02        UnionExchange
> 01-01          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_2.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_1.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_2.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_3.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_1.parquet]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=6, columns=[`col2`, `col1`]]])
> {code}
> case 3) we are NOT pruning here.
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where 
> col2 LIKE 'Z%';
> | 00-00    Screen
> 00-01      Project(col1=[$1], col2=[$0])
> 00-02        SelectionVectorRemover
> 00-03          Filter(condition=[LIKE($0, 'Z%')])
> 00-04            Project(col2=[$1], col1=[$0])
> 00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_48.parquet]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=1, columns=[`col2`, `col1`]]])
> {code}
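For intuition about the three cases, partition pruning can be modeled independently of Drill: each file carries the single partition value it contains, and a scan only needs the files whose value can satisfy the predicate. A toy sketch (plain Python, with fnmatch's "*" standing in for SQL LIKE's "%"; file names are illustrative):

```python
# Toy model of partition pruning, not Drill internals: keep only the files
# whose partition value matches the LIKE-style pattern.
import fnmatch

def prune(files_by_value, pattern):
    # files_by_value maps a col2 partition value to the file holding it.
    return sorted(
        f for value, f in files_by_value.items()
        if fnmatch.fnmatch(value, pattern)
    )

files = {"AK": "1_0_1.parquet", "AL": "1_0_2.parquet", "ZZ": "1_1_48.parquet"}

# A prefix pattern prunes to the matching partitions only.
assert prune(files, "A*") == ["1_0_1.parquet", "1_0_2.parquet"]
# 'Z*' keeps a single file, mirroring the numFiles=1 scan in case (3).
assert prune(files, "Z*") == ["1_1_48.parquet"]
```

In all three cases above the scan's numFiles is already reduced, so the open question in case (3) is why the redundant Filter survives on top of the pruned scan.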



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3468) CTAS IOB

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3468:
--
Fix Version/s: (was: 1.3.0)
   Future

> CTAS IOB
> 
>
> Key: DRILL-3468
> URL: https://issues.apache.org/jira/browse/DRILL-3468
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Priority: Critical
> Fix For: Future
>
>
> I am seeing an IOB when I reuse the same table name in CTAS after deleting the 
> previously created parquet file.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_allData AS SELECT 
> CAST(columns[0] as INT ), CAST(columns[1] as BIGINT ), CAST(columns[2] as 
> CHAR(2) ), CAST(columns[3] as VARCHAR(52) ), CAST(columns[4] as TIMESTAMP ), 
> CAST(columns[5] as DATE ), CAST(columns[6] as BOOLEAN ), CAST(columns[7] as 
> DOUBLE), CAST( columns[8] as TIME) FROM `allData.csv`;
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 0_0   | 11196  |
> +---++
> 1 row selected (1.864 seconds)
> {code}
> Remove the parquet file that was created by the above CTAS.
> {code}
> [root@centos-01 aggregates]# hadoop fs -ls /tmp/tbl_allData
> Found 1 items
> -rwxr-xr-x   3 mapr mapr 397868 2015-07-07 21:08 
> /tmp/tbl_allData/0_0_0.parquet
> [root@centos-01 aggregates]# hadoop fs -rm /tmp/tbl_allData/0_0_0.parquet
> 15/07/07 21:10:47 INFO Configuration.deprecation: io.bytes.per.checksum is 
> deprecated. Instead, use dfs.bytes-per-checksum
> 15/07/07 21:10:47 INFO fs.TrashPolicyDefault: Namenode trash configuration: 
> Deletion interval = 0 minutes, Emptier interval = 0 minutes.
> Deleted /tmp/tbl_allData/0_0_0.parquet
> {code}
> I see an IOB when I run CTAS with the same table name as the one removed in 
> the step above.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_allData AS SELECT 
> CAST(columns[0] as INT ), CAST(columns[1] as BIGINT ), CAST(columns[2] as 
> CHAR(2) ), CAST(columns[3] as VARCHAR(52) ), CAST(columns[4] as TIMESTAMP ), 
> CAST(columns[5] as DATE ), CAST(columns[6] as BOOLEAN ), CAST(columns[7] as 
> DOUBLE), CAST( columns[8] as TIME) FROM `lessData.csv`;
> Error: SYSTEM ERROR: IndexOutOfBoundsException: Index: 0, Size: 0
> [Error Id: 6d6df8e9-699c-4475-8ad3-183c0a91dc99 on centos-02.qa.lab:31010] 
> (state=,code=0)
> {code}
> stack trace from drillbit.log
> {code}
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization: Failure while trying to check if a table or 
> view with given name [tbl_allData] already exists in schema [dfs.tmp]: Index: 
> 0, Size: 0
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
> [drill-java-exec-1.1.0.jar:1.1.0]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Failure 
> while trying to check if a table or view with given name [tbl_allData] 
> already exists in schema [dfs.tmp]: Index: 0, Size: 0
> at 
> org.apache.drill.exec.planner.sql.handlers.SqlHandlerUtil.getTableFromSchema(SqlHandlerUtil.java:222)
>  ~[drill-java-exec-1.1.0.jar:1.1.0]
> at 
> org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:88)
>  ~[drill-java-exec-1.1.0.jar:1.1.0]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
>  ~[drill-java-exec-1.1.0.jar:1.1.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:903) 
> [drill-java-exec-1.1.0.jar:1.1.0]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:242) 
> [drill-java-exec-1.1.0.jar:1.1.0]
> ... 3 common frames omitted
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[na:1.7.0_45]
> at java.util.ArrayList.get(ArrayList.java:411) ~[na:1.7.0_45]
> at 
> org.apache.drill.exec.store.dfs.FileSelection.getFirstPath(FileSelection.java:100)
>  ~[drill-java-exec-1.1.0.jar:1.1.0]
> at 
> org.apache.drill.exec.store.dfs.BasicFormatMatcher.isReadable(BasicFormatMatcher.java:75)
>  ~[drill-java-exec-1.1.0.jar:1.1.0]
> at 
> org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:303)
>  ~[drill-java-exec-1.1.0.jar:1.1.0]
> at 
> org.apache.drill.exec.store.dfs.WorkspaceSch
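
The stack trace above bottoms out in FileSelection.getFirstPath calling ArrayList.get(0) on an empty list: the table directory still exists, but it no longer contains any files. A minimal, hypothetical sketch of that failure mode (not Drill code; names invented):

```java
import java.util.ArrayList;
import java.util.List;

public class EmptySelectionRepro {
    // Mirrors the failure mode in the trace: no emptiness check before get(0).
    static String getFirstPath(List<String> files) {
        return files.get(0);
    }

    public static void main(String[] args) {
        try {
            // An existing-but-empty table directory yields an empty file list.
            getFirstPath(new ArrayList<>());
            throw new AssertionError("expected IndexOutOfBoundsException");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

A guard such as `files.isEmpty()` before the `get(0)` call would turn this into a clean "table does not exist" answer instead of a SYSTEM ERROR.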

[jira] [Updated] (DRILL-3419) Handle scans optimally when all files are pruned out

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3419:
--
Fix Version/s: (was: 1.3.0)
   Future

> Handle scans optimally when all files are pruned out
> 
>
> Key: DRILL-3419
> URL: https://issues.apache.org/jira/browse/DRILL-3419
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Khurram Faraaz
>Assignee: Jinfeng Ni
> Fix For: Future
>
>
> Note that in case (1) and case (2) we prune; however, it is not clear whether we 
> prune in case (3), because we see a FILTER in the query plan in case 
> (3).
> CTAS 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE CTAS_ONE_MILN_RWS_PER_GROUP(col1, 
> col2) PARTITION BY (col2) AS select cast(columns[0] as bigint) col1, 
> cast(columns[1] as char(2)) col2 from `millionValGroup.csv`;
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 1_1   | 21932064   |
> | 1_0   | 28067936   |
> +---++
> 2 rows selected (73.661 seconds)
> {code}
> case 1)
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where 
> col2 LIKE '%Z%';
> | 00-00Screen
> 00-01  Project(col1=[$0], col2=[$1])
> 00-02UnionExchange
> 01-01  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet], ReadEntryWithPath 
> [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_3.parquet]], 
> selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=2, columns=[`col2`, 
> `col1`]]])
> {code}
> case 2)
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where 
> col2 LIKE 'A%';
> | 00-00Screen
> 00-01  Project(col1=[$0], col2=[$1])
> 00-02UnionExchange
> 01-01  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet], ReadEntryWithPath 
> [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_2.parquet], ReadEntryWithPath 
> [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_1.parquet], ReadEntryWithPath 
> [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_2.parquet], ReadEntryWithPath 
> [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_3.parquet], ReadEntryWithPath 
> [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_1.parquet]], 
> selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=6, columns=[`col2`, 
> `col1`]]])
> {code}
> case 3) we are NOT pruning here.
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where 
> col2 LIKE 'Z%';
> | 00-00Screen
> 00-01  Project(col1=[$1], col2=[$0])
> 00-02SelectionVectorRemover
> 00-03  Filter(condition=[LIKE($0, 'Z%')])
> 00-04Project(col2=[$1], col1=[$0])
> 00-05  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath 
> [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_48.parquet]], 
> selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=1, columns=[`col2`, 
> `col1`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
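
The cases above suggest the planner keeps only files whose partition value can satisfy the LIKE predicate. A simplified, hypothetical sketch of that idea, outside Drill (file names and partition values invented; regex metacharacters in the pattern are not escaped, so this is illustration only):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class LikePruningSketch {
    // Translate a SQL LIKE pattern into a regex: % -> .*, _ -> .
    // (Sketch only: regex metacharacters in the pattern are not escaped.)
    static Pattern likeToRegex(String like) {
        return Pattern.compile(like.replace("%", ".*").replace("_", "."));
    }

    // Keep only the files whose partition value matches the pattern.
    static List<String> prune(Map<String, String> fileToPartitionValue, String like) {
        Pattern p = likeToRegex(like);
        return fileToPartitionValue.entrySet().stream()
                .filter(e -> p.matcher(e.getValue()).matches())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("1_0_1.parquet", "AA");
        files.put("1_0_2.parquet", "AB");
        files.put("1_1_48.parquet", "ZQ");

        System.out.println(prune(files, "A%"));   // [1_0_1.parquet, 1_0_2.parquet]
        System.out.println(prune(files, "Z%"));   // [1_1_48.parquet]
        System.out.println(prune(files, "%Z%"));  // [1_1_48.parquet]
    }
}
```

Whether Drill applies this per-file evaluation in case (3), or falls back to a full Filter, is exactly the ambiguity the report raises.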


[jira] [Updated] (DRILL-3468) CTAS IOB

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3468:
--
Assignee: (was: Steven Phillips)

> CTAS IOB
> 
>
> Key: DRILL-3468
> URL: https://issues.apache.org/jira/browse/DRILL-3468
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Priority: Critical
> Fix For: Future
>
>

[jira] [Updated] (DRILL-3539) CTAS over empty json file throws NPE

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3539:
--
Assignee: (was: Sean Hsuan-Yi Chu)

> CTAS over empty json file throws NPE
> 
>
> Key: DRILL-3539
> URL: https://issues.apache.org/jira/browse/DRILL-3539
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> CTAS over an empty JSON file results in an NPE.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> create table t45645 as select * from 
> `empty.json`;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 0:0
> [Error Id: 79039288-5402-4b0a-b32d-5bf5024f3b71 on centos-02.qa.lab:31010] 
> (state=,code=0)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-07-22 00:34:03,788 [2a511b03-90b3-1d39-f4e3-cfd754aa085f:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException
> Fragment 0:0
> [Error Id: 79039288-5402-4b0a-b32d-5bf5024f3b71 on centos-02.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> Fragment 0:0
> [Error Id: 79039288-5402-4b0a-b32d-5bf5024f3b71 on centos-02.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.lang.NullPointerException: null
> at 
> org.apache.drill.exec.physical.impl.WriterRecordBatch.addOutputContainerData(WriterRecordBatch.java:133)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:126)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:83) 
> ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:79)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:73) 
> ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:258)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:252)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) 
> ~[na:1.7.0_45]
> at javax.security.auth.Subject.doAs(Subject.java:415) ~[na:1.7.0_45]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor
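
The NPE above arises while the writer finalizes its output after an empty input produced no batches. A hypothetical guard, sketched outside Drill's actual classes, shows the general shape of a fix: treat a never-initialized container as an empty result instead of dereferencing null.

```java
import java.util.Collections;
import java.util.List;

public class WriterFinalizeSketch {
    private List<String> container;  // stays null if no upstream batch ever arrives

    void receiveBatch(List<String> batch) {
        container = batch;
    }

    // Finalizing must tolerate empty input (e.g. an empty JSON file).
    List<String> finalizeOutput() {
        if (container == null) {
            return Collections.emptyList();  // graceful empty result, no NPE
        }
        return container;
    }

    public static void main(String[] args) {
        WriterFinalizeSketch writer = new WriterFinalizeSketch();
        System.out.println(writer.finalizeOutput().size());  // 0, no NPE
    }
}
```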

[jira] [Updated] (DRILL-3538) We do not prune partitions when we count over partitioning key and filter over partitioning key

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3538:
--
Fix Version/s: (was: 1.3.0)
   Future

> We do not prune partitions when we count over partitioning key and filter 
> over partitioning key
> ---
>
> Key: DRILL-3538
> URL: https://issues.apache.org/jira/browse/DRILL-3538
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Mehant Baid
>Priority: Critical
> Fix For: Future
>
>
> We do not prune partitions when we do a count over the partitioning key and 
> the predicate involves the partitioning key. The CTAS used was:
> {code}
> create table t3214 partition by (key2) as select cast(key1 as double) key1, 
> cast(key2 as char(1)) key2 from `twoKeyJsn.json`;
> {code}
> case 1) We do not do partition pruning in this case.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select count(key2) from t3214 
> where key2 = 'm';
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02Project(EXPR$0=[$0])
> 00-03  
> Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@e2471d7])
> {code}
> case 2) We do not do partition pruning in this case.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select count(*) from t3214 
> where key2 = 'm';
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02Project(EXPR$0=[$0])
> 00-03  
> Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@211930a2])
> {code}
> case 3) We do not do partition pruning in this case.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select count(key1) from t3214 
> where key2 = 'm';
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02Project(EXPR$0=[$0])
> 00-03  
> Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@23fea3b0])
> {code}
> case 4) we do prune here.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select avg(key1) from t3214 
> where key2 = 'm';
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(EXPR$0=[CAST(/(CastHigh(CASE(=($1, 0), null, $0)), 
> $1)):ANY NOT NULL])
> 00-02StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[$SUM0($1)])
> 00-03  StreamAgg(group=[{}], agg#0=[$SUM0($0)], agg#1=[COUNT($0)])
> 00-04Project(key1=[$1])
> 00-05  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=/tmp/t3214/0_0_15.parquet]], 
> selectionRoot=maprfs:/tmp/t3214, numFiles=1, columns=[`key2`, `key1`]]])
> {code}
> case 5) we do prune here.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select min(key1) from t3214 
> where key2 = 'm';
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02StreamAgg(group=[{}], EXPR$0=[MIN($0)])
> 00-03  StreamAgg(group=[{}], EXPR$0=[MIN($0)])
> 00-04Project(key1=[$1])
> 00-05  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=/tmp/t3214/0_0_15.parquet]], 
> selectionRoot=maprfs:/tmp/t3214, numFiles=1, columns=[`key2`, `key1`]]])
> {code}
> commit id that I am testing on : 17e580a7



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
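
One reading of the PojoRecordReader scans in cases (1)-(3) above is that the count queries are answered from a metadata-backed reader rather than a pruned file scan; whether that is what actually happens here is exactly what this report questions. As context, a toy sketch of answering a filtered count from per-partition row counts (all names and numbers invented):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetadataCountSketch {
    // partition value -> row count, as file-level metadata could supply it
    static long countForPartition(Map<String, Long> rowCounts, String key) {
        return rowCounts.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        Map<String, Long> rowCounts = new LinkedHashMap<>();
        rowCounts.put("m", 1234L);
        rowCounts.put("n", 4321L);
        // count(key2) where key2 = 'm' needs only the matching partition's count;
        // no data files are read at all.
        System.out.println(countForPartition(rowCounts, "m"));  // 1234
    }
}
```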


[jira] [Updated] (DRILL-3538) We do not prune partitions when we count over partitioning key and filter over partitioning key

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3538:
--
Assignee: (was: Mehant Baid)

> We do not prune partitions when we count over partitioning key and filter 
> over partitioning key
> ---
>
> Key: DRILL-3538
> URL: https://issues.apache.org/jira/browse/DRILL-3538
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Priority: Critical
> Fix For: Future
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3539) CTAS over empty json file throws NPE

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3539:
--
Fix Version/s: (was: 1.3.0)
   Future

> CTAS over empty json file throws NPE
> 
>
> Key: DRILL-3539
> URL: https://issues.apache.org/jira/browse/DRILL-3539
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>

[jira] [Updated] (DRILL-3564) Error message fix required

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3564:
--
Assignee: (was: Sean Hsuan-Yi Chu)

> Error message fix required
> --
>
> Key: DRILL-3564
> URL: https://issues.apache.org/jira/browse/DRILL-3564
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Khurram Faraaz
>Priority: Minor
> Fix For: Future
>
>
> We report Union-All in the error message; we should say "Union", since the query 
> involves a UNION, not a UNION ALL.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select * from union_01 UNION select * from 
> union_02;
> Error: UNSUPPORTED_OPERATION ERROR: Union-All over schema-less tables must 
> specify the columns explicitly
> See Apache Drill JIRA: DRILL-2414
> [Error Id: 760e48d8-ffac-4d5f-ac14-25aabcfd8033 on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3564) Error message fix required

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3564:
--
Fix Version/s: (was: 1.3.0)
   Future

> Error message fix required
> --
>
> Key: DRILL-3564
> URL: https://issues.apache.org/jira/browse/DRILL-3564
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Minor
> Fix For: Future
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3597) Disable RESPECT NULLS, IGNORE NULLS option for LEAD, LAG window functions

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3597:
--
Fix Version/s: (was: 1.3.0)
   Future

> Disable RESPECT NULLS, IGNORE NULLS option for LEAD, LAG window functions
> -
>
> Key: DRILL-3597
> URL: https://issues.apache.org/jira/browse/DRILL-3597
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
> Fix For: Future
>
>
> The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for the lead, 
> lag, first_value, and last_value window functions. We need to disable these 
> options and report a meaningful message to the user if either of them is 
> used in a query with any of those functions.
> Currently we throw parse errors when these options are used in a query.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select lead(c1,2) respect nulls over w from 
> union_01 window w as (partition by c3 order by c1);
> Error: PARSE ERROR: Encountered "nulls" at line 1, column 27.
> Was expecting one of:
> "FROM" ...
> "," ...
>
> [Error Id: 73d09692-6374-41a2-bce0-db73d2828f1f on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select lead(c1,2) ignore nulls over w from 
> union_01 window w as (partition by c3 order by c1);
> Error: PARSE ERROR: Encountered "nulls" at line 1, column 26.
> Was expecting one of:
> "FROM" ...
> "," ...
> 
> [Error Id: a7bd21b3-b46c-417d-a9d9-aa6d0378b0fc on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
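For context, the behavior being disabled can be modeled as follows. This is an illustrative sketch of the SQL-standard IGNORE NULLS semantics for LEAD, not Drill code:

```python
def lead_ignore_nulls(values, i, n):
    """Return the n-th non-null value after index i, or None if there are
    fewer than n non-null values (SQL-standard LEAD ... IGNORE NULLS);
    RESPECT NULLS would simply take values[i + n]."""
    seen = 0
    for v in values[i + 1:]:
        if v is not None:
            seen += 1
            if seen == n:
                return v
    return None

lead_ignore_nulls([1, None, 3, None, 5], 0, 2)  # skips the Nones -> 5
```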





[jira] [Updated] (DRILL-3597) Disable RESPECT NULLS, IGNORE NULLS option for LEAD, LAG window functions

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3597:
--
Assignee: (was: Sean Hsuan-Yi Chu)

> Disable RESPECT NULLS, IGNORE NULLS option for LEAD, LAG window functions
> -
>
> Key: DRILL-3597
> URL: https://issues.apache.org/jira/browse/DRILL-3597
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> The SQL standard defines RESPECT NULLS and IGNORE NULLS options for the lead, 
> lag, first_value, and last_value window functions. We need to disable these 
> options and report a meaningful message to the user if either of them is used 
> with any of those functions.
> Currently we throw a parse error when these options appear in a query.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select lead(c1,2) respect nulls over w from 
> union_01 window w as (partition by c3 order by c1);
> Error: PARSE ERROR: Encountered "nulls" at line 1, column 27.
> Was expecting one of:
> "FROM" ...
> "," ...
>
> [Error Id: 73d09692-6374-41a2-bce0-db73d2828f1f on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select lead(c1,2) ignore nulls over w from 
> union_01 window w as (partition by c3 order by c1);
> Error: PARSE ERROR: Encountered "nulls" at line 1, column 26.
> Was expecting one of:
> "FROM" ...
> "," ...
> 
> [Error Id: a7bd21b3-b46c-417d-a9d9-aa6d0378b0fc on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}





[jira] [Updated] (DRILL-3664) CAST integer zero , one to boolean false , true

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3664:
--
Assignee: (was: Mehant Baid)

> CAST integer zero , one to boolean false , true
> ---
>
> Key: DRILL-3664
> URL: https://issues.apache.org/jira/browse/DRILL-3664
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> We should be able to cast the integer 0 (zero) to false and 1 (one) to true; 
> currently we report a parse error when an explicit cast is used in a query.
> In the input parquet file below, col7 is of type Boolean.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select col7 from FEWRWSPQQ_101 where col7 IN 
> (cast(0 as boolean),cast(1 as boolean));
> Error: PARSE ERROR: From line 1, column 47 to line 1, column 64: Cast 
> function cannot convert value of type INTEGER to type BOOLEAN
> [Error Id: d751945f-8a0f-4369-ae9e-c42504f6d978 on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
> Without an explicit cast, we see a SchemaChangeException.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select col7 from FEWRWSPQQ_101 where col7 IN 
> (0,1);
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castINT(BIT-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--.
> Error in expression at index -1.  Error: Missing function implementation: 
> [castINT(BIT-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
> Fragment 0:0
> [Error Id: ecf51dae-62c5-40d7-b0f5-3b9bf9fd3377 on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
> Postgres results for the same query.
> {code}
> postgres=# select col7 from FEWRWSPQQ_101 where col7 IN (cast(0 as 
> boolean),cast(1 as boolean));
>  col7 
> --
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
> (22 rows)
> {code}
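The requested behavior mirrors the Postgres cast shown above. A minimal sketch of those semantics, assuming (as Postgres does for integer-to-boolean casts) that only 0 and 1 are valid inputs:

```python
def cast_int_to_boolean(v: int) -> bool:
    # Postgres-style semantics: 0 -> false, 1 -> true, anything else errors.
    if v not in (0, 1):
        raise ValueError("cannot cast %d to boolean" % v)
    return v == 1

cast_int_to_boolean(0), cast_int_to_boolean(1)  # (False, True)
```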





[jira] [Updated] (DRILL-3664) CAST integer zero , one to boolean false , true

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3664:
--
Fix Version/s: (was: 1.3.0)
   Future

> CAST integer zero , one to boolean false , true
> ---
>
> Key: DRILL-3664
> URL: https://issues.apache.org/jira/browse/DRILL-3664
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> We should be able to cast the integer 0 (zero) to false and 1 (one) to true; 
> currently we report a parse error when an explicit cast is used in a query.
> In the input parquet file below, col7 is of type Boolean.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select col7 from FEWRWSPQQ_101 where col7 IN 
> (cast(0 as boolean),cast(1 as boolean));
> Error: PARSE ERROR: From line 1, column 47 to line 1, column 64: Cast 
> function cannot convert value of type INTEGER to type BOOLEAN
> [Error Id: d751945f-8a0f-4369-ae9e-c42504f6d978 on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
> Without an explicit cast, we see a SchemaChangeException.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select col7 from FEWRWSPQQ_101 where col7 IN 
> (0,1);
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castINT(BIT-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--.
> Error in expression at index -1.  Error: Missing function implementation: 
> [castINT(BIT-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
> Fragment 0:0
> [Error Id: ecf51dae-62c5-40d7-b0f5-3b9bf9fd3377 on centos-04.qa.lab:31010] 
> (state=,code=0)
> {code}
> Postgres results for the same query.
> {code}
> postgres=# select col7 from FEWRWSPQQ_101 where col7 IN (cast(0 as 
> boolean),cast(1 as boolean));
>  col7 
> --
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
>  f
>  t
> (22 rows)
> {code}





[jira] [Updated] (DRILL-3728) millisecond portion of time value missing from query results

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3728:
--
Fix Version/s: (was: 1.3.0)
   Future

> millisecond portion of time value missing from query results
> 
>
> Key: DRILL-3728
> URL: https://issues.apache.org/jira/browse/DRILL-3728
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> Some rows in the results below are missing the millisecond portion of the 
> time value.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT cast(col5 as time) FROM FEWRWSPQQ_101;
> +---+
> |EXPR$0 |
> +---+
> | 00:28:02.338  |
> | 00:28:02.228  |
> | 00:28:02.616  |
> | 00:28:02.404  |
> | 00:28:02.309  |
> | 00:28:02.638  |
> | 00:28:02.748  |
> | 00:28:02.321  |
> | 00:28:02  |
> | 00:28:02  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.118  |
> | 00:28:02.218  |
> | 00:28:02.418  |
> | 00:28:02.318  |
> | 20:28:02.318  |
> +---+
> 22 rows selected (0.491 seconds)
> {code}
> The above query extracts the time portion from col5, which is a column of 
> type timestamp.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT col5 FROM FEWRWSPQQ_101;
> +--+
> |   col5   |
> +--+
> | 2014-03-02 00:28:02.338  |
> | 2014-01-02 00:28:02.228  |
> | 2014-09-02 00:28:02.616  |
> | 2015-02-02 00:28:02.404  |
> | 2014-07-02 00:28:02.309  |
> | 1985-04-02 00:28:02.638  |
> | 2006-05-02 00:28:02.748  |
> | 2005-06-02 00:28:02.321  |
> | 1950-08-02 00:28:02.111  |
> | 1947-07-02 00:28:02.418  |
> | 1973-06-02 00:28:02.418  |
> | 1992-06-02 00:28:02.418  |
> | 1994-06-02 00:28:02.418  |
> | 2000-06-02 00:28:02.418  |
> | 2002-06-02 00:28:02.418  |
> | 2003-06-02 00:28:02.418  |
> | 2004-06-02 00:28:02.418  |
> | 2010-06-02 00:28:02.118  |
> | 2011-06-02 00:28:02.218  |
> | 2012-06-02 00:28:02.418  |
> | 2013-06-02 00:28:02.318  |
> | 2015-08-02 20:28:02.318  |
> +--+
> 22 rows selected (0.285 seconds)
> {code}
> When we cast the extracted time portion to varchar, the millisecond portion 
> is present even in the rows where the first query above dropped it.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT cast(cast(col5 as time) as varchar(12)) 
> FROM FEWRWSPQQ_101;
> +---+
> |EXPR$0 |
> +---+
> | 00:28:02.338  |
> | 00:28:02.228  |
> | 00:28:02.616  |
> | 00:28:02.404  |
> | 00:28:02.309  |
> | 00:28:02.638  |
> | 00:28:02.748  |
> | 00:28:02.321  |
> | 00:28:02.111  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.118  |
> | 00:28:02.218  |
> | 00:28:02.418  |
> | 00:28:02.318  |
> | 20:28:02.318  |
> +---+
> 22 rows selected (0.285 seconds)
> {code}
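The varchar cast shows the millisecond data is intact, so the problem is in how the TIME value is rendered. A formatter that unconditionally prints a three-digit millisecond field, sketched here in Python purely for illustration, would display every row consistently:

```python
from datetime import time

def format_time_ms(t: time) -> str:
    # Always emit a three-digit millisecond field, even when it is zero.
    return t.strftime("%H:%M:%S") + ".%03d" % (t.microsecond // 1000)

format_time_ms(time(0, 28, 2, 111000))  # '00:28:02.111'
```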





[jira] [Updated] (DRILL-3676) Group by ordinal number of an output column results in parse error

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3676:
--
Assignee: (was: Sean Hsuan-Yi Chu)

> Group by ordinal number of an output column results in parse error
> --
>
> Key: DRILL-3676
> URL: https://issues.apache.org/jira/browse/DRILL-3676
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> Grouping by the ordinal number of an output column results in a parse error.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select sub_q.col1 from (select col1 from 
> FEWRWSPQQ_101) sub_q group by 1;
> Error: PARSE ERROR: At line 1, column 8: Expression 'q.col1' is not being 
> grouped
> [Error Id: 0eedafd9-372e-4610-b7a8-d97e26458d58 on centos-02.qa.lab:31010] 
> (state=,code=0)
> {code}
> When we use the column name instead of the number, the query compiles and 
> returns results.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select col1 from (select col1 from 
> FEWRWSPQQ_101) group by col1;
> +--+
> | col1 |
> +--+
> | 65534|
> | 1000 |
> | -1   |
> | 0|
> | 1|
> | 13   |
> | 17   |
> | 23   |
> | 1000 |
> | 999  |
> | 30   |
> | 25   |
> | 1001 |
> | -65535   |
> | 5000 |
> | 3000 |
> | 200  |
> | 197  |
> | 4611686018427387903  |
> | 9223372036854775806  |
> | 9223372036854775807  |
> | 92233720385475807|
> +--+
> 22 rows selected (0.218 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select sub_query.col1 from (select col1 from 
> FEWRWSPQQ_101) sub_query group by sub_query.col1;
> +--+
> | col1 |
> +--+
> | 65534|
> | 1000 |
> | -1   |
> | 0|
> | 1|
> | 13   |
> | 17   |
> | 23   |
> | 1000 |
> | 999  |
> | 30   |
> | 25   |
> | 1001 |
> | -65535   |
> | 5000 |
> | 3000 |
> | 200  |
> | 197  |
> | 4611686018427387903  |
> | 9223372036854775806  |
> | 9223372036854775807  |
> | 92233720385475807|
> +--+
> 22 rows selected (0.177 seconds)
> {code}
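Supporting GROUP BY 1 is typically a planner rewrite: each ordinal in the GROUP BY list is replaced by the corresponding select-list item before validation. A minimal sketch under that assumption; the names are illustrative, not Drill's planner API:

```python
def resolve_group_by_ordinals(select_items, group_by_items):
    # Replace each integer (1-based ordinal) with the matching select item;
    # leave ordinary column references untouched.
    resolved = []
    for item in group_by_items:
        if isinstance(item, int):
            if not 1 <= item <= len(select_items):
                raise ValueError("ordinal %d out of range" % item)
            resolved.append(select_items[item - 1])
        else:
            resolved.append(item)
    return resolved

resolve_group_by_ordinals(["sub_q.col1"], [1])  # -> ['sub_q.col1']
```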





[jira] [Updated] (DRILL-3728) millisecond portion of time value missing from query results

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3728:
--
Assignee: (was: Mehant Baid)

> millisecond portion of time value missing from query results
> 
>
> Key: DRILL-3728
> URL: https://issues.apache.org/jira/browse/DRILL-3728
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> Some rows in the results below are missing the millisecond portion of the 
> time value.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT cast(col5 as time) FROM FEWRWSPQQ_101;
> +---+
> |EXPR$0 |
> +---+
> | 00:28:02.338  |
> | 00:28:02.228  |
> | 00:28:02.616  |
> | 00:28:02.404  |
> | 00:28:02.309  |
> | 00:28:02.638  |
> | 00:28:02.748  |
> | 00:28:02.321  |
> | 00:28:02  |
> | 00:28:02  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.118  |
> | 00:28:02.218  |
> | 00:28:02.418  |
> | 00:28:02.318  |
> | 20:28:02.318  |
> +---+
> 22 rows selected (0.491 seconds)
> {code}
> The above query extracts the time portion from col5, which is a column of 
> type timestamp.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT col5 FROM FEWRWSPQQ_101;
> +--+
> |   col5   |
> +--+
> | 2014-03-02 00:28:02.338  |
> | 2014-01-02 00:28:02.228  |
> | 2014-09-02 00:28:02.616  |
> | 2015-02-02 00:28:02.404  |
> | 2014-07-02 00:28:02.309  |
> | 1985-04-02 00:28:02.638  |
> | 2006-05-02 00:28:02.748  |
> | 2005-06-02 00:28:02.321  |
> | 1950-08-02 00:28:02.111  |
> | 1947-07-02 00:28:02.418  |
> | 1973-06-02 00:28:02.418  |
> | 1992-06-02 00:28:02.418  |
> | 1994-06-02 00:28:02.418  |
> | 2000-06-02 00:28:02.418  |
> | 2002-06-02 00:28:02.418  |
> | 2003-06-02 00:28:02.418  |
> | 2004-06-02 00:28:02.418  |
> | 2010-06-02 00:28:02.118  |
> | 2011-06-02 00:28:02.218  |
> | 2012-06-02 00:28:02.418  |
> | 2013-06-02 00:28:02.318  |
> | 2015-08-02 20:28:02.318  |
> +--+
> 22 rows selected (0.285 seconds)
> {code}
> When we cast the extracted time portion to varchar, the millisecond portion 
> is present even in the rows where the first query above dropped it.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT cast(cast(col5 as time) as varchar(12)) 
> FROM FEWRWSPQQ_101;
> +---+
> |EXPR$0 |
> +---+
> | 00:28:02.338  |
> | 00:28:02.228  |
> | 00:28:02.616  |
> | 00:28:02.404  |
> | 00:28:02.309  |
> | 00:28:02.638  |
> | 00:28:02.748  |
> | 00:28:02.321  |
> | 00:28:02.111  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.418  |
> | 00:28:02.118  |
> | 00:28:02.218  |
> | 00:28:02.418  |
> | 00:28:02.318  |
> | 20:28:02.318  |
> +---+
> 22 rows selected (0.285 seconds)
> {code}





[jira] [Updated] (DRILL-3743) query hangs on sqlline once Drillbit on foreman node is killed

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3743:
--
Fix Version/s: (was: 1.3.0)
   Future

> query hangs on sqlline once Drillbit on foreman node is killed
> --
>
> Key: DRILL-3743
> URL: https://issues.apache.org/jira/browse/DRILL-3743
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>Priority: Critical
> Fix For: Future
>
>
> sqlline hangs once the Drillbit on the foreman node is killed (kill -9).
> The query was issued from the foreman node; it is a long-running query that 
> returns many records.
> Steps to reproduce the problem.
> set planner.slice_target=1
> 1.  clush -g khurram service mapr-warden stop
> 2.  clush -g khurram service mapr-warden start
> 3.  ./sqlline -u "jdbc:drill:schema=dfs.tmp"
> 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json` limit 200;
> 4.  While the query is running on sqlline, immediately from another console 
> run jps and kill the Drillbit process (in this case the foreman). You will 
> notice that sqlline just hangs; no exceptions or errors are reported at the 
> sqlline prompt or in drillbit.log or drillbit.out.
> I do see this exception in sqlline.log on the node from which sqlline was 
> started:
> {code}
> 2015-09-04 18:45:12,069 [Client-1] INFO  o.a.d.e.rpc.user.QueryResultHandler 
> - User Error Occurred
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: 
> Connection /10.10.100.201:53425 <--> /10.10.100.201:31010 (user client) 
> closed unexpectedly.
> [Error Id: ec316cfd-c9a5-4905-98e3-da20cb799ba5 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.rpc.user.QueryResultHandler$SubmissionListener$ChannelClosedListener.operationComplete(QueryResultHandler.java:298)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 2015-09-04 18:45:12,069 [Client-1] INFO  
> o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#7] Query failed:
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: 
> Connection /10.10.100.201:53425 <--> /10.10.100.201:31010 (user client) 
> closed unexpectedly.
> [Error Id: ec316cfd-c9a5-4905-98e3-da20cb799ba5 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.rpc.user.QueryResultHandler$SubmissionListener$ChannelClosedListener.operationComplete(QueryResultHandler.java:298)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 2015-09-
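Until a server-side fix lands, a client can guard against this class of hang by bounding its blocking fetch with a timeout. A generic sketch using only the Python standard library, not Drill's client API:

```python
import concurrent.futures

def fetch_with_timeout(fetch_fn, seconds):
    # Run the blocking fetch on a worker thread; a dead foreman then surfaces
    # as a TimeoutError instead of an indefinite hang at the prompt.
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fetch_fn).result(timeout=seconds)
    finally:
        pool.shutdown(wait=False)  # do not block on a thread that may be hung

fetch_with_timeout(lambda: "rows", seconds=1.0)  # -> 'rows'
```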

[jira] [Updated] (DRILL-3743) query hangs on sqlline once Drillbit on foreman node is killed

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3743:
--
Assignee: (was: Deneche A. Hakim)

> query hangs on sqlline once Drillbit on foreman node is killed
> --
>
> Key: DRILL-3743
> URL: https://issues.apache.org/jira/browse/DRILL-3743
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>Priority: Critical
> Fix For: Future
>
>
> sqlline hangs once the Drillbit on the foreman node is killed (kill -9).
> The query was issued from the foreman node; it is a long-running query that 
> returns many records.
> Steps to reproduce the problem.
> set planner.slice_target=1
> 1.  clush -g khurram service mapr-warden stop
> 2.  clush -g khurram service mapr-warden start
> 3.  ./sqlline -u "jdbc:drill:schema=dfs.tmp"
> 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json` limit 200;
> 4.  While the query is running on sqlline, immediately from another console 
> run jps and kill the Drillbit process (in this case the foreman). You will 
> notice that sqlline just hangs; no exceptions or errors are reported at the 
> sqlline prompt or in drillbit.log or drillbit.out.
> I do see this exception in sqlline.log on the node from which sqlline was 
> started:
> {code}
> 2015-09-04 18:45:12,069 [Client-1] INFO  o.a.d.e.rpc.user.QueryResultHandler 
> - User Error Occurred
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: 
> Connection /10.10.100.201:53425 <--> /10.10.100.201:31010 (user client) 
> closed unexpectedly.
> [Error Id: ec316cfd-c9a5-4905-98e3-da20cb799ba5 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.rpc.user.QueryResultHandler$SubmissionListener$ChannelClosedListener.operationComplete(QueryResultHandler.java:298)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 2015-09-04 18:45:12,069 [Client-1] INFO  
> o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#7] Query failed:
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: 
> Connection /10.10.100.201:53425 <--> /10.10.100.201:31010 (user client) 
> closed unexpectedly.
> [Error Id: ec316cfd-c9a5-4905-98e3-da20cb799ba5 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.rpc.user.QueryResultHandler$SubmissionListener$ChannelClosedListener.operationComplete(QueryResultHandler.java:298)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 2015-09-04 18:45:12,071 [Cli

[jira] [Updated] (DRILL-3751) Query hang when zookeeper is stopped

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3751:
--
Fix Version/s: (was: 1.3.0)
   Future

> Query hang when zookeeper is stopped
> 
>
> Key: DRILL-3751
> URL: https://issues.apache.org/jira/browse/DRILL-3751
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Priority: Critical
> Fix For: Future
>
>
> Sqlline hangs indefinitely when I issue a long-running query and then stop 
> the zookeeper process while the query is still executing. The sqlline prompt 
> never returns, and the stack trace below is shown. I am on master.
> Steps to reproduce the problem
> clush -g khurram service mapr-warden stop
> clush -g khurram service mapr-warden start
> Issue a long-running query from sqlline.
> While the query is running, stop zookeeper using the script below.
> To stop zookeeper:
> {code}
> [root@centos-01 bin]# ./zkServer.sh stop
> JMX enabled by default
> Using config: /opt/mapr/zookeeper/zookeeper-3.4.5/bin/../conf/zoo.cfg
> Stopping zookeeper ... STOPPED
> {code}
> Issue the long-running query below from sqlline:
> {code}
> ./sqlline -u "jdbc:drill:schema=dfs.tmp"
> 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json` limit 800;
> ...
> | 7.40907649723E8  | g|
> | 1.12378007695E9  | d|
> 03:03:28.482 [CuratorFramework-0] ERROR org.apache.curator.ConnectionState - 
> Connection timed out for connection string (10.10.100.201:5181) and timeout 
> (5000) / elapsed (5013)
> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = 
> ConnectionLoss
>   at 
> org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198) 
> [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88) 
> [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
>  [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:807)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:793)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:57)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:275)
>  [curator-framework-2.5.0.jar:na]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> {code}
> Here is the stack for sqlline process
> {code}
> [root@centos-01 bin]# /usr/java/jdk1.7.0_45/bin/jstack 32136
> 2015-09-05 03:21:52
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode):
> "Attach Listener" daemon prio=10 tid=0x7f8328003800 nid=0x27f1 waiting on 
> condition [0x]
>java.lang.Thread.State: RUNNABLE
> "CuratorFramework-0-EventThread" daemon prio=10 tid=0x012fd800 
> nid=0x26e1 waiting on condition [0x7f8317c2e000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007e2117798> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491)
> "CuratorFramework-0-SendThread(centos-01.qa.lab:5181)" daemon prio=10 
> tid=0x01109800 nid=0x26e0 waiting on condition [0x7f8317b2d000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:86)
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:937)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:995)
> "threadDeathWatcher-2-1" daemon prio=10 tid=0x7f833043b800 nid=0x7e16 
> waiting on condition

[jira] [Updated] (DRILL-3751) Query hang when zookeeper is stopped

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3751:
--
Assignee: (was: Sudheesh Katkam)

> Query hang when zookeeper is stopped
> 
>
> Key: DRILL-3751
> URL: https://issues.apache.org/jira/browse/DRILL-3751
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Priority: Critical
> Fix For: Future
>
>
> Sqlline hangs indefinitely when I issue a long-running query and then stop 
> the zookeeper process while the query is still executing. The sqlline prompt 
> never returns, and the stack trace below is shown. I am on master.
> Steps to reproduce the problem
> clush -g khurram service mapr-warden stop
> clush -g khurram service mapr-warden start
> Issue a long-running query from sqlline.
> While the query is running, stop zookeeper using the script below.
> To stop zookeeper:
> {code}
> [root@centos-01 bin]# ./zkServer.sh stop
> JMX enabled by default
> Using config: /opt/mapr/zookeeper/zookeeper-3.4.5/bin/../conf/zoo.cfg
> Stopping zookeeper ... STOPPED
> {code}
> Issue the long-running query below from sqlline:
> {code}
> ./sqlline -u "jdbc:drill:schema=dfs.tmp"
> 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json` limit 800;
> ...
> | 7.40907649723E8  | g|
> | 1.12378007695E9  | d|
> 03:03:28.482 [CuratorFramework-0] ERROR org.apache.curator.ConnectionState - 
> Connection timed out for connection string (10.10.100.201:5181) and timeout 
> (5000) / elapsed (5013)
> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = 
> ConnectionLoss
>   at 
> org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198) 
> [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88) 
> [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
>  [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:807)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:793)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:57)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:275)
>  [curator-framework-2.5.0.jar:na]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> {code}
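The Curator error above fires from ConnectionState.checkTimeouts(): once the elapsed time since the connection attempt exceeds the configured connection timeout (5013 ms elapsed vs. a 5000 ms timeout in this log), the client reports connection loss. A minimal Python sketch of that timeout check, with illustrative names rather than Curator's actual Java API:

```python
import time

class ConnectionState:
    """Sketch of the check Curator performs in checkTimeouts(): once the
    elapsed time since the last connection attempt exceeds the configured
    timeout, the client reports connection loss instead of blocking forever.
    Names here are illustrative, not Curator's Java API."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms
        self.connection_start_ms = time.time() * 1000.0

    def check_timeouts(self, now_ms=None):
        now_ms = time.time() * 1000.0 if now_ms is None else now_ms
        elapsed = now_ms - self.connection_start_ms
        if elapsed >= self.timeout_ms:
            raise TimeoutError(
                "Connection timed out: timeout (%d) / elapsed (%d)"
                % (self.timeout_ms, elapsed))

state = ConnectionState(timeout_ms=5000)
# Simulate a check 5013 ms after the connection attempt started.
try:
    state.check_timeouts(now_ms=state.connection_start_ms + 5013)
except TimeoutError as e:
    print(e)  # Connection timed out: timeout (5000) / elapsed (5013)
```

The bug report's complaint is that sqlline never surfaces this condition: once the timeout fires, the prompt should return with an error rather than hang.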
> Here is the stack trace for the sqlline process:
> {code}
> [root@centos-01 bin]# /usr/java/jdk1.7.0_45/bin/jstack 32136
> 2015-09-05 03:21:52
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode):
> "Attach Listener" daemon prio=10 tid=0x7f8328003800 nid=0x27f1 waiting on 
> condition [0x]
>java.lang.Thread.State: RUNNABLE
> "CuratorFramework-0-EventThread" daemon prio=10 tid=0x012fd800 
> nid=0x26e1 waiting on condition [0x7f8317c2e000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007e2117798> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491)
> "CuratorFramework-0-SendThread(centos-01.qa.lab:5181)" daemon prio=10 
> tid=0x01109800 nid=0x26e0 waiting on condition [0x7f8317b2d000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:86)
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:937)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:995)
> "threadDeathWatcher-2-1" daemon prio=10 tid=0x7f833043b800 nid=0x7e16 
> waiting on condition [0x7f831751f000]

[jira] [Comment Edited] (DRILL-2123) Order of columns in the Web UI is wrong when columns are explicitly specified in projection list

2015-10-08 Thread Sudheesh Katkam (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949437#comment-14949437
 ] 

Sudheesh Katkam edited comment on DRILL-2123 at 10/8/15 9:34 PM:
-

You just happen to get the right order; I don't think the fix is in.


was (Author: sudheeshkatkam):
This still happens.

> Order of columns in the Web UI is wrong when columns are explicitly specified 
> in projection list
> 
>
> Key: DRILL-2123
> URL: https://issues.apache.org/jira/browse/DRILL-2123
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Priority: Critical
> Fix For: Future
>
> Attachments: Screen Shot 2015-01-29 at 4.08.06 PM.png
>
>
> I'm running query:
> {code}
> select  c_integer, 
>c_bigint, 
>nullif(c_integer, c_bigint) 
> from   `dfs.aggregation`.t1 
> order by c_integer
> {code}
> In sqlline I get correct order of columns:
> {code}
> 0: jdbc:drill:schema=dfs> select c_integer, c_bigint, nullif(c_integer, 
> c_bigint) from `dfs.aggregation`.t1;
> ++++
> | c_integer  |  c_bigint  |   EXPR$2   |
> ++++
> | 451237400  | -3477884857818808320 | 451237400  |
> {code}
> In the Web UI, columns are sorted in alphabetical order. 
> A screenshot is attached.
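One plausible way the symptom arises is serializing each result row as a JSON object with sorted keys, which discards the projection order; carrying an explicit column list preserves it. A Python sketch under that assumption (not the Web UI's actual code), using the row from the sqlline output above:

```python
import json

# A result row keyed by column name, in projection order.
columns = ["c_integer", "c_bigint", "EXPR$2"]
row = {"c_integer": 451237400,
       "c_bigint": -3477884857818808320,
       "EXPR$2": 451237400}

# Serializing with sort_keys reproduces the Web UI symptom:
# columns come back alphabetized, not in projection order.
alphabetized = json.loads(json.dumps(row, sort_keys=True))
print(list(alphabetized))  # ['EXPR$2', 'c_bigint', 'c_integer']

# Carrying the column list explicitly preserves projection order.
ordered = [(name, row[name]) for name in columns]
print([name for name, _ in ordered])  # ['c_integer', 'c_bigint', 'EXPR$2']
```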



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3762) NPE : Query nested JSON data

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3762:
--
Fix Version/s: (was: 1.3.0)
   Future

> NPE : Query nested JSON data
> 
>
> Key: DRILL-3762
> URL: https://issues.apache.org/jira/browse/DRILL-3762
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> Drill master commit ID : 0686bc23
> I am seeing an NPE when I try to query nested data. Interestingly, there are 
> no exceptions written to either drillbit.log or drillbit.out; I had to set 
> verbose mode to ON in sqlline to see the stack trace.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select * from `repro_data.json`;
> +-+
> |meta |
> +-+
> | {"view":{"columns":[{"cachedCounts":{"smallest":"hello world"}}]}}  |
> +-+
> 1 row selected (2.401 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select t.meta.view.columns from 
> `repro_data.json` t;
> ++
> | EXPR$0 |
> ++
> | [{"cachedCounts":{"smallest":"hello world"}}]  |
> ++
> 1 row selected (0.347 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select t.meta.view.columns[1] from 
> `repro_data.json` t;
> Error: Unexpected RuntimeException: java.lang.NullPointerException 
> (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp> !verbose true
> verbose: on
> 0: jdbc:drill:schema=dfs.tmp> select t.meta.view.columns[1] from 
> `repro_data.json` t;
> Error: Unexpected RuntimeException: java.lang.NullPointerException 
> (state=,code=0)
> java.sql.SQLException: Unexpected RuntimeException: 
> java.lang.NullPointerException
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:261)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:290)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1359)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:74)
>   at 
> net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:338)
>   at 
> net.hydromatic.avatica.AvaticaStatement.execute(AvaticaStatement.java:69)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:86)
>   at sqlline.Commands.execute(Commands.java:841)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.drill.exec.record.RecordBatchLoader.load(RecordBatchLoader.java:99)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:223)
>   ... 14 more
> {code}
> Data used in the test, repro_data.json:
> {code}
> {
> "meta":{
> "view":{
> "columns":[{"cachedCounts":{ "smallest":"hello world"}}]
> }
> }
> }
> {code}
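The failing query asks for `columns[1]` while the array has a single element, so the conventional SQL behavior would be a NULL result rather than a NullPointerException. A small Python sketch of that out-of-range-returns-null convention (an illustrative helper, not Drill's implementation), run against the test data above:

```python
import json

def safe_index(value, idx):
    # Return value[idx], or None (SQL NULL) when idx is out of range --
    # the behavior one would expect from columns[1] here instead of an
    # NPE. Illustrative helper, not Drill's implementation.
    if isinstance(value, list) and 0 <= idx < len(value):
        return value[idx]
    return None

record = json.loads(
    '{"meta": {"view": {"columns":'
    ' [{"cachedCounts": {"smallest": "hello world"}}]}}}')

columns = record["meta"]["view"]["columns"]
print(safe_index(columns, 0))  # {'cachedCounts': {'smallest': 'hello world'}}
print(safe_index(columns, 1))  # None, not an exception
```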





[jira] [Updated] (DRILL-2123) Order of columns in the Web UI is wrong when columns are explicitly specified in projection list

2015-10-08 Thread Sudheesh Katkam (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheesh Katkam updated DRILL-2123:
---
Priority: Critical  (was: Major)

> Order of columns in the Web UI is wrong when columns are explicitly specified 
> in projection list
> 
>
> Key: DRILL-2123
> URL: https://issues.apache.org/jira/browse/DRILL-2123
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Priority: Critical
> Fix For: Future
>
> Attachments: Screen Shot 2015-01-29 at 4.08.06 PM.png
>
>





[jira] [Updated] (DRILL-3762) NPE : Query nested JSON data

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3762:
--
Assignee: (was: Jason Altekruse)

> NPE : Query nested JSON data
> 
>
> Key: DRILL-3762
> URL: https://issues.apache.org/jira/browse/DRILL-3762
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
> Fix For: 1.3.0
>
>





[jira] [Commented] (DRILL-2123) Order of columns in the Web UI is wrong when columns are explicitly specified in projection list

2015-10-08 Thread Sudheesh Katkam (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949437#comment-14949437
 ] 

Sudheesh Katkam commented on DRILL-2123:


This still happens.

> Order of columns in the Web UI is wrong when columns are explicitly specified 
> in projection list
> 
>
> Key: DRILL-2123
> URL: https://issues.apache.org/jira/browse/DRILL-2123
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Priority: Critical
> Fix For: Future
>
> Attachments: Screen Shot 2015-01-29 at 4.08.06 PM.png
>
>





[jira] [Commented] (DRILL-2246) select from sys.options fails in a weird way when slice_target=1

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949421#comment-14949421
 ] 

Victoria Markman commented on DRILL-2246:
-

Works in 1.2.0; no info on the check-in.

{code}
0: jdbc:drill:schema=dfs> alter system set `planner.slice_target` =1;
+---++
|  ok   |summary |
+---++
| true  | planner.slice_target updated.  |
+---++
1 row selected (0.244 seconds)
0: jdbc:drill:schema=dfs> select * from sys.options order by name;
++--+-+--+-+-+---++
|name|   kind   |  type   |  status  |   num_val   | string_val  | bool_val  | float_val  |
++--+-+--+-+-+---++
| drill.exec.functions.cast_empty_string_to_null | BOOLEAN  | SYSTEM  | DEFAULT  | null| null| false | null   |
| drill.exec.storage.file.partition.column.label | STRING   | SYSTEM  | DEFAULT  | null| dir | null  | null   |
| exec.errors.verbose| BOOLEAN  | SYSTEM  | DEFAULT  | null| null| false | null   |
| exec.java_compiler | STRING   | SYSTEM  | DEFAULT  | null| DEFAULT | null  | null   |
| exec.java_compiler_debug   | BOOLEAN  | SYSTEM  | DEFAULT  | null| null| true  | null   |
| exec.java_compiler_janino_maxsize  | LONG | SYSTEM  | DEFAULT  | 262144  | null| null  | null   |
| exec.max_hash_table_size   | LONG | SYSTEM  | DEFAULT  | 1073741824  | null| null  | null   |
| exec.min_hash_table_size   | LONG | SYSTEM  | DEFAULT  | 65536   | null| null  | null   |
| exec.queue.enable  | BOOLEAN  | SYSTEM  | DEFAULT  | null| null| false | null   |
| exec.queue.large   | LONG | SYSTEM  | DEFAULT  | 10  | null| null  | null   |
| exec.queue.small   | LONG | SYSTEM  | DEFAULT  | 100 | null| null  | null   |
| exec.queue.threshold   | LONG | SYSTEM  | DEFAULT  | 3000| null| null  | null   |
| exec.queue.timeout_millis  | LONG | SYSTEM  | DEFAULT  | 30  | null| null  | null   |
| exec.schedule.assignment.old   | BOOLEAN  | SYSTEM  | DEFAULT  | null| null| false | null   |
| exec.storage.enable_new_text_reader| BOOLEAN  | SYSTEM  | DEFAULT  | null| null| true  | null   |
| new_view_default_permissions   | STRING   | SYSTEM  | DEFAULT  | null| 700 | null  | null   |
| org.apache.drill.exec.compile.ClassTransformer.scalar_replacement  | STRING   | SYSTEM  | DEFAULT  | null| try | null  | null   |
| planner.add_producer_consumer  | BOOLEAN  | SYSTEM  | DEFAULT  | null| null| false | null   |
| planner.affinity_factor| DOUBLE   | SYSTEM  | DEFAULT  | null| null| null  | 1.2|
| planner.broadcast_factor   | DOUBLE   | SYSTEM  | DEFAULT  | null| null| null  | 1.0|
| planner.broadcast_threshold| LONG | SYSTEM  | DEFAULT  | 1000| null| null  | null   |
| planner.disable_exchanges  | BOOLEAN  | SYSTEM  | DEFAULT  | null| null| false | null   |
| planner.enable_broadcast_join  | BOOLEAN  | SYSTEM  | DEFAULT  | null| null| true  | null   |
| planner.enable_constant_folding| BOOLEAN  | SYSTEM  | DEFAULT  | null| null| true  | null   |
| planner.enable_decimal_data_type   | BOO

[jira] [Resolved] (DRILL-2246) select from sys.options fails in a weird way when slice_target=1

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman resolved DRILL-2246.
-
Resolution: Fixed

> select from sys.options fails in a weird way when slice_target=1
> 
>
> Key: DRILL-2246
> URL: https://issues.apache.org/jira/browse/DRILL-2246
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Victoria Markman
>Priority: Minor
> Fix For: Future
>
>
> Since this is not supposed to be a customer-visible option, it's a minor issue.
> {code}
> alter system set `planner.slice_target` =1;
> select * from sys.options order by name;
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select * from sys.options order by name;
> Query failed: RemoteRpcException: Failure while trying to start remote 
> fragment, No suitable constructor found for type [simple type, class 
> org.apache.drill.exec.store.sys.SystemTablePlugin]: can not instantiate from 
> JSON object (need to add/enable type information?)
>  at [Source: {
>   "pop" : "hash-partition-sender",
>   "@id" : 0,
>   "receiver-major-fragment" : 1,
>   "child" : {
> "pop" : "sys",
> "@id" : 1,
> "table" : "OPTION",
> "plugin" : {
>   "config" : {
> "type" : "SystemTablePluginConfig",
> "enabled" : true
>   },
>   "optimizerRules" : [ ]
> },
> "cost" : 20.0
>   },
>   "expr" : "hash(`name`) ",
>   "destinations" : [ "ChFhdHNxYTQtMTMzLnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTM0LnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTMzLnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTM0LnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTMzLnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTM0LnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTMzLnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTM0LnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTMzLnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTM0LnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTMzLnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTM0LnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTMzLnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTM0LnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTMzLnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTM0LnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTMzLnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTM0LnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTMzLnFhLmxhYhCi8gEYo/IBIKTyAQ==", 
> "ChFhdHNxYTQtMTM0LnFhLmxhYhCi8gEYo/IBIKTyAQ==" ],
>   "initialAllocation" : 100,
>   "maxAllocation" : 100,
>   "cost" : 0.0
> }; line: 10, column: 7] [ 3082d365-3bc6-4aeb-bb29-ddd133832cff on 
> atsqa4-134.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
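The failure above is a deserialization problem: the remote fragment receives the plan as JSON but has no registered way to instantiate the plugin class from its "type" tag. A Python sketch of tag-based construction under that reading — the `SystemTablePluginConfig` name comes from the error text, but the classes here are illustrative stand-ins, not Drill's:

```python
import json

class SystemTablePluginConfig:
    """Illustrative stand-in for the config class named in the error,
    not Drill's actual Java class."""
    def __init__(self, enabled):
        self.enabled = enabled

# Map the "type" tag in the JSON plan to a constructor. This plays the
# role of Jackson's type information: without a registered mapping the
# remote fragment cannot instantiate the class from a JSON object.
CONFIG_TYPES = {"SystemTablePluginConfig": SystemTablePluginConfig}

def load_config(doc):
    data = json.loads(doc)
    ctor = CONFIG_TYPES.get(data.get("type"))
    if ctor is None:
        raise ValueError("can not instantiate from JSON object "
                         "(need to add/enable type information?)")
    return ctor(enabled=data["enabled"])

cfg = load_config('{"type": "SystemTablePluginConfig", "enabled": true}')
print(cfg.enabled)  # True
```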





[jira] [Commented] (DRILL-2235) Assert when NOT IN clause contains multiple columns

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949417#comment-14949417
 ] 

Victoria Markman commented on DRILL-2235:
-

Here is the new runtime error without count(*):

{code}
0: jdbc:drill:schema=dfs> select * from t1 where (a1, b1) not in (select a2, b2 
from t2);
Error: SYSTEM ERROR: SchemaChangeException: Failure while materializing 
expression. 
Error in expression at index -1.  Error: Missing function implementation: 
[count(INT-OPTIONAL, VARCHAR-OPTIONAL)].  Full expression: --UNKNOWN 
EXPRESSION--.
Fragment 0:0
[Error Id: 10394832-b18f-4158-853a-23add6463318 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}
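For context, SQL evaluates a multi-column NOT IN with three-valued logic: the row is excluded if any subquery row matches, and the predicate is UNKNOWN (also excluded) when nulls make a comparison indeterminate. A Python sketch of those semantics (not Drill's implementation), exercised with a few of the t2 rows from the report:

```python
def row_eq(left, right):
    # Three-valued equality of two row tuples: True, False, or None
    # (SQL UNKNOWN). A pair is UNKNOWN when either side is null; the
    # whole comparison is False if any pair is definitely unequal.
    unknown = False
    for l, r in zip(left, right):
        if l is None or r is None:
            unknown = True
        elif l != r:
            return False
    return None if unknown else True

def not_in(row, subquery_rows):
    # row NOT IN subquery: False if any row matches, UNKNOWN (None)
    # if no match but some comparison was UNKNOWN, else True.
    any_unknown = False
    for other in subquery_rows:
        eq = row_eq(row, other)
        if eq is True:
            return False
        if eq is None:
            any_unknown = True
    return None if any_unknown else True

t2 = [(0, "zzz"), (1, "a"), (2, "b"), (4, "d"), (8, "h")]
print(not_in((3, "c"), t2))     # True  -> row qualifies
print(not_in((1, "a"), t2))     # False -> row filtered out
print(not_in((None, "h"), t2))  # None  -> UNKNOWN, also filtered
```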


> Assert when NOT IN clause contains multiple columns
> ---
>
> Key: DRILL-2235
> URL: https://issues.apache.org/jira/browse/DRILL-2235
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
> Fix For: Future
>
>
> {code}
> 0: jdbc:drill:schema=dfs> select * from t1;
> ++++
> | a1 | b1 | c1 |
> ++++
> | 1  | a  | 2015-01-01 |
> | 2  | b  | 2015-01-02 |
> | 3  | c  | 2015-01-03 |
> | 4  | null   | 2015-01-04 |
> | 5  | e  | 2015-01-05 |
> | 6  | f  | 2015-01-06 |
> | 7  | g  | 2015-01-07 |
> | null   | h  | 2015-01-08 |
> | 9  | i  | null   |
> | 10 | j  | 2015-01-10 |
> ++++
> 10 rows selected (0.056 seconds)
> 0: jdbc:drill:schema=dfs> select * from t2;
> ++++
> | a2 | b2 | c2 |
> ++++
> | 0  | zzz| 2014-12-31 |
> | 1  | a  | 2015-01-01 |
> | 2  | b  | 2015-01-02 |
> | 2  | b  | 2015-01-02 |
> | 2  | b  | 2015-01-02 |
> | 3  | c  | 2015-01-03 |
> | 4  | d  | 2015-01-04 |
> | 5  | e  | 2015-01-05 |
> | 6  | f  | 2015-01-06 |
> | 7  | g  | 2015-01-07 |
> | 7  | g  | 2015-01-07 |
> | 8  | h  | 2015-01-08 |
> | 9  | i  | 2015-01-09 |
> ++++
> 13 rows selected (0.069 seconds)
> {code}
> IN clause returns correct result:
> {code}
> 0: jdbc:drill:schema=dfs> select count(*) from t1 where (a1, b1) in (select 
> a2, b2 from t2);
> ++
> |   EXPR$0   |
> ++
> | 7  |
> ++
> 1 row selected (0.258 seconds)
> {code}
> NOT IN clause asserts:
> {code}
> 0: jdbc:drill:schema=dfs> select count(*) from t1 where (a1, b1) not in 
> (select a2, b2 from t2);
> Query failed: AssertionError: AND(AND(NOT(IS TRUE($7)), IS NOT NULL($3)), IS 
> NOT NULL($4))
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> {code}
> #Thu Feb 12 12:13:26 EST 2015
> git.commit.id.abbrev=de89f36
> {code}
> drillbit.log
> {code}
> 2015-02-12 22:47:11,730 [2b22d290-315e-4450-8b3f-9b3590eb20c3:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
> FAILED
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization: AND(AND(NOT(IS TRUE($7)), IS NOT NULL($3)), 
> IS NOT NULL($4))
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:197) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.AssertionError: AND(AND(NOT(IS TRUE($7)), IS NOT 
> NULL($3)), IS NOT NULL($4))
> at org.eigenbase.rel.FilterRelBase.(FilterRelBase.java:56) 
> ~[optiq-core-0.9-drill-r18.jar:na]
> at org.eigenbase.rel.FilterRel.(FilterRel.java:50) 
> ~[optiq-core-0.9-drill-r18.jar:na]
> at org.eigenbase.rel.CalcRel.createFilter(CalcRel.java:212) 
> ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertWhere(SqlToRelConverter.java:840)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConver

[jira] [Updated] (DRILL-2139) Star is not expanded correctly in "select distinct" query

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2139:

Priority: Critical  (was: Major)

> Star is not expanded correctly in "select distinct" query
> -
>
> Key: DRILL-2139
> URL: https://issues.apache.org/jira/browse/DRILL-2139
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.3.0
>
>
> {code}
> 0: jdbc:drill:schema=dfs> select distinct * from t1;
> ++
> | *  |
> ++
> | null   |
> ++
> 1 row selected (0.14 seconds)
> 0: jdbc:drill:schema=dfs> select distinct * from `test.json`;
> ++
> | *  |
> ++
> | null   |
> ++
> 1 row selected (0.163 seconds)
> {code}





[jira] [Commented] (DRILL-2123) Order of columns in the Web UI is wrong when columns are explicitly specified in projection list

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949406#comment-14949406
 ] 

Victoria Markman commented on DRILL-2123:
-

[~sudheeshkatkam] I just ran this query and it looks correct to me, but I don't 
remember any check-ins related to this area. Can you please check it out?

> Order of columns in the Web UI is wrong when columns are explicitly specified 
> in projection list
> 
>
> Key: DRILL-2123
> URL: https://issues.apache.org/jira/browse/DRILL-2123
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
> Fix For: Future
>
> Attachments: Screen Shot 2015-01-29 at 4.08.06 PM.png
>
>





[jira] [Updated] (DRILL-3712) Drill does not recognize UTF-16-LE encoding

2015-10-08 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3712:

Description: 
We are unable to process files that OSX identifies as character set UTF-16LE.  
After unzipping and converting to UTF-8, we are able to process one fine.  There 
are CONVERT_TO and CONVERT_FROM commands that appear to address the issue, but 
we were unable to make them work on a gzipped or unzipped version of the UTF-16 
file.  We were able to use CONVERT_FROM ok, but when we tried to wrap the 
results of that in a cast to a date, or anything else, it failed.  Trying to 
work with it natively caused the double-byte nature to appear (a substring 1,4 
only returns the first two characters).

I cannot post the data because it is proprietary in nature, but I am posting 
this code that might be useful in re-creating the issue:

{noformat}
#!/usr/bin/env python
""" Generates a test psv file with some text fields encoded as UTF-16-LE. """
def write_utf16le_encoded_psv():
    total_lines = 10
    encoded = "Encoded B".encode("utf-16-le")
    with open("test.psv", "wb") as csv_file:
        csv_file.write("header 1|header 2|header 3\n")
        for i in xrange(total_lines):
            csv_file.write("value A" + str(i) + "|" + encoded + "|value C" + str(i) + "\n")

if __name__ == "__main__":
    write_utf16le_encoded_psv()
{noformat}


then:

tar zcvf test.psv.tar.gz test.psv




  was:
We are unable to process files that OSX identifies as character set UTF-16LE.  
After unzipping and converting to UTF-8, we are able to process one fine.  There 
are CONVERT_TO and CONVERT_FROM commands that appear to address the issue, but 
we were unable to make them work on a gzipped or unzipped version of the UTF-16 
file.  We were able to use CONVERT_FROM ok, but when we tried to wrap the 
results of that in a cast to a date, or anything else, it failed.  Trying to 
work with it natively caused the double-byte nature to appear (a substring 1,4 
only returns the first two characters).

I cannot post the data because it is proprietary in nature, but I am posting 
this code that might be useful in re-creating the issue:

{code}
#!/usr/bin/env python
""" Generates a test psv file with some text fields encoded as UTF-16-LE. """
def write_utf16le_encoded_psv():
    total_lines = 10
    encoded = "Encoded B".encode("utf-16-le")
    with open("test.psv", "wb") as csv_file:
        csv_file.write("header 1|header 2|header 3\n")
        for i in xrange(total_lines):
            csv_file.write("value A" + str(i) + "|" + encoded + "|value C" + str(i) + "\n")

if __name__ == "__main__":
    write_utf16le_encoded_psv()
{code}


then:

tar zcvf test.psv.tar.gz test.psv





> Drill does not recognize UTF-16-LE encoding
> ---
>
> Key: DRILL-3712
> URL: https://issues.apache.org/jira/browse/DRILL-3712
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.1.0
> Environment: OSX, likely Linux. 
>Reporter: Edmon Begoli
> Fix For: Future
>
>
> We are unable to process files that OSX identifies as character set UTF-16LE. 
>  After unzipping and converting to UTF-8, we are able to process one fine.  
> There are CONVERT_TO and CONVERT_FROM commands that appear to address the 
> issue, but we were unable to make them work on a gzipped or unzipped version 
> of the UTF-16 file.  We were able to use CONVERT_FROM ok, but when we tried 
> to wrap the results of that in a cast to a date, or anything else, it failed. 
>  Trying to work with it natively caused the double-byte nature to appear (a 
> substring 1,4 only returns the first two characters).
> I cannot post the data because it is proprietary in nature, but I am posting 
> this code that might be useful in re-creating the issue:
> {noformat}
> #!/usr/bin/env python
> """ Generates a test psv file with some text fields encoded as UTF-16-LE. """
> def write_utf16le_encoded_psv():
>     total_lines = 10
>     encoded = "Encoded B".encode("utf-16-le")
>     with open("test.psv", "wb") as csv_file:
>         csv_file.write("header 1|header 2|header 3\n")
>         for i in xrange(total_lines):
>             csv_file.write("value A" + str(i) + "|" + encoded + "|value C" + str(i) + "\n")
> if __name__ == "__main__":
>     write_utf16le_encoded_psv()
> {noformat}
> then:
> tar zcvf test.psv.tar.gz test.psv
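A Python 3 translation of the repro script above (an assumed translation, not part of the original report): it embeds a field encoded as UTF-16-LE bytes in an otherwise ASCII pipe-separated row, then shows that an explicit `utf-16-le` decode recovers the text the reporter could not process natively.

```python
# Write one field as UTF-16-LE bytes inside an ASCII psv row (in memory,
# to keep the sketch self-contained), then decode it back explicitly.
import io

encoded = "Encoded B".encode("utf-16-le")
buf = io.BytesIO()
buf.write(b"header 1|header 2|header 3\n")
for i in range(3):
    buf.write(b"value A%d|" % i + encoded + b"|value C%d\n" % i)

raw = buf.getvalue().splitlines()[1]  # first data row
field = raw.split(b"|")[1]
print(field)                     # raw UTF-16-LE bytes, e.g. b'E\x00n\x00...'
print(field.decode("utf-16-le"))  # 'Encoded B'
```

Note that a UTF-8 (or byte-wise) read of this field is what produces the "double-byte nature" the reporter describes: every other byte is NUL.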





[jira] [Updated] (DRILL-3712) Drill does not recognize UTF-16-LE encoding

2015-10-08 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3712:

Description: 
We are unable to process files that OSX identifies as character set UTF-16LE.  
After unzipping and converting to UTF-8, we are able to process one fine.  There 
are CONVERT_TO and CONVERT_FROM commands that appear to address the issue, but 
we were unable to make them work on a gzipped or unzipped version of the UTF-16 
file.  We were able to use CONVERT_FROM ok, but when we tried to wrap the 
results of that in a cast to a date, or anything else, it failed.  Trying to 
work with it natively caused the double-byte nature to appear (a substring 1,4 
only returns the first two characters).

I cannot post the data because it is proprietary in nature, but I am posting 
this code that might be useful in re-creating the issue:

{code}
#!/usr/bin/env python
""" Generates a test psv file with some text fields encoded as UTF-16-LE. """
def write_utf16le_encoded_psv():
    total_lines = 10
    encoded = "Encoded B".encode("utf-16-le")
    with open("test.psv", "wb") as csv_file:
        csv_file.write("header 1|header 2|header 3\n")
        for i in xrange(total_lines):
            csv_file.write("value A" + str(i) + "|" + encoded + "|value C" + str(i) + "\n")

if __name__ == "__main__":
    write_utf16le_encoded_psv()
{code}


then:

tar zcvf test.psv.tar.gz test.psv




  was:
We are unable to process files that OSX identifies as character set UTF-16LE. 
After unzipping and converting to UTF-8, we are able to process the file fine. 
There are CONVERT_TO and CONVERT_FROM commands that appear to address the 
issue, but we were unable to make them work on a gzipped or unzipped version 
of the UTF-16 file. We were able to use CONVERT_FROM ok, but when we tried to 
wrap the results of that to cast as a date, or anything else, it failed. 
Trying to work with it natively caused the double-byte nature to appear (a 
substring 1,4 only returns the first two characters).

I cannot post the data because it is proprietary in nature, but I am posting 
this code that might be useful in re-creating an issue:


#!/usr/bin/env python
""" Generates a test psv file with some text fields encoded as UTF-16-LE. """
def write_utf16le_encoded_psv():
    total_lines = 10
    encoded = "Encoded B".encode("utf-16-le")
    with open("test.psv","wb") as csv_file:
        csv_file.write("header 1|header 2|header 3\n")
        for i in xrange(total_lines):
            csv_file.write("value A"+str(i)+"|"+encoded+"|value C"+str(i)+"\n")

if __name__ == "__main__":
    write_utf16le_encoded_psv()


then:

tar zcvf test.psv.tar.gz test.psv





> Drill does not recognize UTF-16-LE encoding
> ---
>
> Key: DRILL-3712
> URL: https://issues.apache.org/jira/browse/DRILL-3712
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.1.0
> Environment: OSX, likely Linux. 
>Reporter: Edmon Begoli
> Fix For: Future
>
>
> We are unable to process files that OSX identifies as character set UTF-16LE. 
> After unzipping and converting to UTF-8, we are able to process the file fine. 
> There are CONVERT_TO and CONVERT_FROM commands that appear to address the 
> issue, but we were unable to make them work on a gzipped or unzipped version 
> of the UTF-16 file. We were able to use CONVERT_FROM ok, but when we tried 
> to wrap the results of that to cast as a date, or anything else, it failed. 
> Trying to work with it natively caused the double-byte nature to appear (a 
> substring 1,4 only returns the first two characters).
> I cannot post the data because it is proprietary in nature, but I am posting 
> this code that might be useful in re-creating an issue:
> {code}
> #!/usr/bin/env python
> """ Generates a test psv file with some text fields encoded as UTF-16-LE. """
> def write_utf16le_encoded_psv():
>   total_lines = 10
>   encoded = "Encoded B".encode("utf-16-le")
>   with open("test.psv","wb") as csv_file:
>   csv_file.write("header 1|header 2|header 3\n")
>   for i in xrange(total_lines):
>   csv_file.write("value A"+str(i)+"|"+encoded+"|value C"+str(i)+"\n")
> if __name__ == "__main__":
>   write_utf16le_encoded_psv()
> {code}
> then:
> tar zcvf test.psv.tar.gz test.psv



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-1951) Can't cast numeric value with decimal point read from CSV file into integer data type

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949380#comment-14949380
 ] 

Victoria Markman commented on DRILL-1951:
-

In 1.2.0 it throws a new error:

{code}
0: jdbc:drill:schema=dfs> select cast(columns[3] as bigint)  from `sales.csv`;
Error: SYSTEM ERROR: NumberFormatException: 3000.00
Fragment 0:0
[Error Id: 2fdbf4a5-7d58-4473-a984-aee9e2c81a76 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}
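The failure mirrors strict integer parsing in most languages: a string with a decimal point cannot be parsed directly as an integer. A quick Python analogy (editor's illustration, not Drill code) of why "3000.00" trips the cast and how a two-step conversion sidesteps it:

```python
def to_bigint(text):
    """Parse a numeric string as an integer, tolerating a decimal point."""
    try:
        # Direct integer parse: "3000.00" raises ValueError here, analogous
        # to Drill's NumberFormatException for cast(... as bigint).
        return int(text)
    except ValueError:
        # Two-step conversion: parse as a float first, then truncate.
        return int(float(text))
```

In Drill, casting through decimal or float first and then to bigint may similarly work around the error, though that is an assumption rather than something verified in this thread.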

> Can't cast numeric value with decimal point read from CSV file into integer 
> data type
> -
>
> Key: DRILL-1951
> URL: https://issues.apache.org/jira/browse/DRILL-1951
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
> Fix For: Future
>
>
> sales.csv file:
> {code}
> 997,Ford,ME350,3000.00, comment#1
> 1999,Chevy,Venture,4900.00, comment#2
> 1999,Chevy,Venture,5000.00, comment#3
> 1996,Jeep,Cherokee,1.01, comment#4
> {code}
> -- Can cast to decimal
> {code}
> 0: jdbc:drill:schema=dfs> select cast(columns[3] as decimal(18,2))  from 
> `sales.csv`;
> ++
> |   EXPR$0   |
> ++
> | 3000.00|
> | 4900.00|
> | 5000.00|
> | 1.01   |
> ++
> 4 rows selected (0.095 seconds)
> {code}
> -- Can cast to float
> {code}
> 0: jdbc:drill:schema=dfs> select cast(columns[3] as float)  from `sales.csv`;
> ++
> |   EXPR$0   |
> ++
> | 3000.0 |
> | 4900.0 |
> | 5000.0 |
> | 1.01   |
> ++
> 4 rows selected (0.112 seconds)
> {code}
> -- Can't cast to INT/BIGINT
> {code}
> 0: jdbc:drill:schema=dfs> select cast(columns[3] as bigint)  from `sales.csv`;
> Query failed: Query failed: Failure while running fragment., 3000.00 [ 
> 4818451a-c731-48a9-9992-1e81ab1d520d on atsqa4-134.qa.lab:31010 ]
> [ 4818451a-c731-48a9-9992-1e81ab1d520d on atsqa4-134.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> -- Same works with json/parquet files
> {code}
> 0: jdbc:drill:schema=dfs> select a1  from `t1.json`;
> ++
> | a1 |
> ++
> | 10.01  |
> ++
> 1 row selected (0.077 seconds)
> 0: jdbc:drill:schema=dfs> select cast(a1 as int)  from `t1.json`;
> ++
> |   EXPR$0   |
> ++
> | 10 |
> ++
> 0: jdbc:drill:schema=dfs> select * from test_cast;
> ++
> | a1 |
> ++
> | 10.0100|
> ++
> 1 row selected (0.06 seconds)
> 0: jdbc:drill:schema=dfs> select cast(a1 as int) from test_cast;
> ++
> |   EXPR$0   |
> ++
> | 10 |
> ++
> 1 row selected (0.094 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2101) Decimal literals are treated as double

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2101:

Labels: decimal  (was: )

> Decimal literals are treated as double
> --
>
> Key: DRILL-2101
> URL: https://issues.apache.org/jira/browse/DRILL-2101
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>  Labels: decimal
> Fix For: Future
>
> Attachments: DRILL-2101-PARTIAL-PATCH-enable-decimal-literals.patch, 
> DRILL-2101.patch
>
>
> {code}
> create table t1(c1) as
> select
> cast(null as decimal(28,4))
> from `t1.csv`;
> message root {
>   optional double c1; <-- Wrong, should be decimal
> }
> {code}
> This is very commonly used construct to convert csv files to parquet files, 
> that's why I'm marking this bug as critical.
> {code}
> create table t2 as 
> select
> case when columns[3] = '' then cast(null as decimal(28,4)) else 
> cast(columns[3] as decimal(28, 4)) end
> from `t1.csv`;
> {code}
> Correct - cast string literal to decimal
> {code}
> create table t3(c1) as
> select
> cast('12345678901234567890.1234' as decimal(28,4))
> from `t1.csv`;
> message root {
>   required fixed_len_byte_array(12) c1 (DECIMAL(28,4));
> }
> {code}
> Correct - cast literal from csv file as decimal
> {code}
> create table t4(c1) as
> select
> cast(columns[3] as decimal(28,4))
> from `t1.csv`;
> message root {
>   optional fixed_len_byte_array(12) c1 (DECIMAL(28,4));
> }
> {code}
> Correct - case statement (no null involved)
> {code}
> create table t5(c1) as
> select
> case when columns[3] = '' then cast('' as decimal(28,4)) else 
> cast(columns[3] as decimal(28,4)) end
> from `t1.csv`;
> message root {
>   optional fixed_len_byte_array(12) c1 (DECIMAL(28,4));
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2051) NPE when querying view with where clause and derived table

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman resolved DRILL-2051.
-
Resolution: Fixed

> NPE when querying view with where clause and derived table
> --
>
> Key: DRILL-2051
> URL: https://issues.apache.org/jira/browse/DRILL-2051
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
> Fix For: Future
>
> Attachments: drill-2051.log
>
>
> {code}
> #Wed Jan 21 12:38:45 EST 2015
> git.commit.id.abbrev=8d1e1af
> {code}
> `customer.json`
> {code}
> { "CustomerId": "100", "cityId": 10 }
> { "CustomerId": "101", "cityId": 10 }
> { "CustomerId": "102", "cityId": 10 }
> { "CustomerId": "103", "cityId": 20 }
> { "CustomerId": "104", "cityId": 30 }
> { "CustomerId": "105", "cityId": null }
> { "CustomerId": null,  "cityId": 50 }
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> create view v3 as select * from ( select * from 
> `customer.json` ) where customerid >= 100;
> +++
> | ok |  summary   |
> +++
> | true   | View 'v3' created successfully in 'dfs.identifiers' schema |
> +++
> 1 row selected (0.063 seconds)
> 0: jdbc:drill:schema=dfs> select * from v3;
> Query failed: NullPointerException: 
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> Query by itself works fine:
> {code}
> 0: jdbc:drill:schema=dfs> select * from ( select * from `customer.json` ) 
> where CustomerId >= 100;
> +++
> | CustomerId |   cityId   |
> +++
> | 100| 10 |
> | 101| 10 |
> | 102| 10 |
> | 103| 20 |
> | 104| 30 |
> | 105| null   |
> +++
> 6 rows selected (0.117 seconds)
> {code}
> If you remove where clause and leave just derived table in the view creation, 
> you can query the view as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2051) NPE when querying view with where clause and derived table

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949371#comment-14949371
 ] 

Victoria Markman commented on DRILL-2051:
-

This bug is fixed; I can't point at the check-in that fixed it. Resolving.

Tested with 1.2.0:
{code}
0: jdbc:drill:schema=dfs> create or replace view v3 as select * from ( select * 
from `customer.json` ) where customerid >= 100;
+---+--+
|  ok   |   summary|
+---+--+
| true  | View 'v3' created successfully in 'dfs.test' schema  |
+---+--+
1 row selected (0.31 seconds)

0: jdbc:drill:schema=dfs> select * from v3;
+-+-+
| CustomerId  | cityId  |
+-+-+
| 100 | 10  |
| 101 | 10  |
| 102 | 10  |
| 103 | 20  |
| 104 | 30  |
| 105 | null|
+-+-+
6 rows selected (0.287 seconds)
{code}

> NPE when querying view with where clause and derived table
> --
>
> Key: DRILL-2051
> URL: https://issues.apache.org/jira/browse/DRILL-2051
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
> Fix For: Future
>
> Attachments: drill-2051.log
>
>
> {code}
> #Wed Jan 21 12:38:45 EST 2015
> git.commit.id.abbrev=8d1e1af
> {code}
> `customer.json`
> {code}
> { "CustomerId": "100", "cityId": 10 }
> { "CustomerId": "101", "cityId": 10 }
> { "CustomerId": "102", "cityId": 10 }
> { "CustomerId": "103", "cityId": 20 }
> { "CustomerId": "104", "cityId": 30 }
> { "CustomerId": "105", "cityId": null }
> { "CustomerId": null,  "cityId": 50 }
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> create view v3 as select * from ( select * from 
> `customer.json` ) where customerid >= 100;
> +++
> | ok |  summary   |
> +++
> | true   | View 'v3' created successfully in 'dfs.identifiers' schema |
> +++
> 1 row selected (0.063 seconds)
> 0: jdbc:drill:schema=dfs> select * from v3;
> Query failed: NullPointerException: 
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> Query by itself works fine:
> {code}
> 0: jdbc:drill:schema=dfs> select * from ( select * from `customer.json` ) 
> where CustomerId >= 100;
> +++
> | CustomerId |   cityId   |
> +++
> | 100| 10 |
> | 101| 10 |
> | 102| 10 |
> | 103| 20 |
> | 104| 30 |
> | 105| null   |
> +++
> 6 rows selected (0.117 seconds)
> {code}
> If you remove where clause and leave just derived table in the view creation, 
> you can query the view as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3429) DrillAvgVarianceConvertlet may produce wrong results while rewriting stddev, variance

2015-10-08 Thread Mehant Baid (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3429:
---
Attachment: DRILL-3429.patch

> DrillAvgVarianceConvertlet may produce wrong results while rewriting stddev, 
> variance
> -
>
> Key: DRILL-3429
> URL: https://issues.apache.org/jira/browse/DRILL-3429
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Mehant Baid
>Priority: Critical
> Fix For: 1.3.0
>
> Attachments: DRILL-3429.patch
>
>
> DrillAvgVarianceConvertlet currently rewrites aggregate functions like avg, 
> stddev, variance to simple computations. 
> Eg: 
> Stddev( x ) => power(
>  (sum(x * x) - sum( x ) * sum( x ) / count( x ))
>  / count( x ),
>  .5)
> Consider the case when the input is an integer. The rewrite contains 
> multiplication and division, which will bind to functions that operate on 
> integers. However, the expected result should be a double, and since double 
> has more precision than integer, we should be operating on doubles during 
> the multiplication and division.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3914) Support geospatial queries

2015-10-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949357#comment-14949357
 ] 

ASF GitHub Bot commented on DRILL-3914:
---

Github user k255 commented on the pull request:

https://github.com/apache/drill/pull/191#issuecomment-146679625
  
I added some general tests to check if geometry functions work as expected.

I'm happy that you like it. Currently it's quite simple, but it can grow.
One direction is to address the limited size of varbinary (introduce a new 
type or extend the size of the existing one), because it limits geometries to 
just simple shapes.
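For readers unfamiliar with the PostGIS-style predicate used in the proposed query, a rough Python sketch of the assumed semantics (a planar distance check; the actual drill-gis implementation may differ, e.g. in how it handles geodesic distance):

```python
import math

def st_dwithin(p1, p2, dist):
    """True when the planar (Euclidean) distance between two (x, y)
    points is at most dist -- the semantics assumed for ST_DWithin."""
    (x1, y1), (x2, y2) = p1, p2
    return math.hypot(x2 - x1, y2 - y1) <= dist
```

In the quoted query, ST_DWithin(ST_Point(-121.895, 37.339), ST_Point(lon, lat), 0.1) would then keep rows whose point lies within 0.1 degrees of the reference point.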


> Support geospatial queries
> --
>
> Key: DRILL-3914
> URL: https://issues.apache.org/jira/browse/DRILL-3914
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - JDBC, Functions - Drill
>Reporter: Karol Potocki
>
> Implement spatial query functionality in Drill to provide location based 
> queries and filtering. It could be similar to PostGIS for Postgres and allow 
> queries like:
> select * from
> (select columns[2] as location, columns[4] as lon, columns[3] as lat,
> ST_DWithin(ST_Point(-121.895, 37.339), ST_Point(columns[4], 
> columns[3]), 0.1) as isWithin
> from dfs.`default`.`/home/k255/drill/sample-data/CA-cities.csv`
> )
> where isWithin = true;
> Working proposal is available at http://github.com/k255 (see drill-gis and 
> drill fork).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2015) Casting numeric value that does not fit in integer data type produces incorrect result

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949351#comment-14949351
 ] 

Victoria Markman commented on DRILL-2015:
-

Changed priority to critical, since it produces an incorrect result.
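The reported value 805551103 is exactly what two's-complement truncation to 32 bits produces. A Python sketch (editor's illustration) of that silent wraparound:

```python
def to_int32(n):
    """Truncate an integer to its low 32 bits, interpreted as signed
    two's complement -- reproducing the wraparound reported in the issue."""
    n &= 0xFFFFFFFF                      # keep only the low 32 bits
    return n - 0x100000000 if n >= 0x80000000 else n
```

Here to_int32(5000147483647) gives 805551103 and to_int32(2147483648) gives -2147483648, matching the results shown in the issue.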

> Casting numeric value that does not fit in integer data type produces 
> incorrect result
> --
>
> Key: DRILL-2015
> URL: https://issues.apache.org/jira/browse/DRILL-2015
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Data Types
>Affects Versions: 0.8.0, 1.2.0
>Reporter: Victoria Markman
>Priority: Critical
>  Labels: document_if_not_fixed
> Fix For: Future
>
>
> t1.json
> {code}
> { "a1": 1 ,"b1" : 1}
> { "a1": 2 ,"b1" : 1}
> { "a1": 2 ,"b1" : 2}
> { "a1": 3 ,"b1" : 2}
> { "a1": 5000147483647 , "b1" : 3}
> {code}
> We should throw an error; this is technically data corruption.
> {code}
> 0: jdbc:drill:schema=dfs> select cast(a1 as integer) from `t1.json`;
> ++
> |   EXPR$0   |
> ++
> | 1  |
> | 2  |
> | 2  |
> | 3  |
> | 805551103  |
> ++
> 5 rows selected (0.074 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select cast(2147483648 as integer) from `t1.json`;
> ++
> |   EXPR$0   |
> ++
> | -2147483648 |
> | -2147483648 |
> | -2147483648 |
> | -2147483648 |
> | -2147483648 |
> ++
> 5 rows selected (0.076 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2015) Casting numeric value that does not fit in integer data type produces incorrect result

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2015:

Affects Version/s: 1.2.0

> Casting numeric value that does not fit in integer data type produces 
> incorrect result
> --
>
> Key: DRILL-2015
> URL: https://issues.apache.org/jira/browse/DRILL-2015
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Data Types
>Affects Versions: 0.8.0, 1.2.0
>Reporter: Victoria Markman
>Priority: Critical
>  Labels: document_if_not_fixed
> Fix For: Future
>
>
> t1.json
> {code}
> { "a1": 1 ,"b1" : 1}
> { "a1": 2 ,"b1" : 1}
> { "a1": 2 ,"b1" : 2}
> { "a1": 3 ,"b1" : 2}
> { "a1": 5000147483647 , "b1" : 3}
> {code}
> We should throw an error; this is technically data corruption.
> {code}
> 0: jdbc:drill:schema=dfs> select cast(a1 as integer) from `t1.json`;
> ++
> |   EXPR$0   |
> ++
> | 1  |
> | 2  |
> | 2  |
> | 3  |
> | 805551103  |
> ++
> 5 rows selected (0.074 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select cast(2147483648 as integer) from `t1.json`;
> ++
> |   EXPR$0   |
> ++
> | -2147483648 |
> | -2147483648 |
> | -2147483648 |
> | -2147483648 |
> | -2147483648 |
> ++
> 5 rows selected (0.076 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2015) Casting numeric value that does not fit in integer data type produces incorrect result

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-2015:

Priority: Critical  (was: Major)

> Casting numeric value that does not fit in integer data type produces 
> incorrect result
> --
>
> Key: DRILL-2015
> URL: https://issues.apache.org/jira/browse/DRILL-2015
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Priority: Critical
>  Labels: document_if_not_fixed
> Fix For: Future
>
>
> t1.json
> {code}
> { "a1": 1 ,"b1" : 1}
> { "a1": 2 ,"b1" : 1}
> { "a1": 2 ,"b1" : 2}
> { "a1": 3 ,"b1" : 2}
> { "a1": 5000147483647 , "b1" : 3}
> {code}
> We should throw an error; this is technically data corruption.
> {code}
> 0: jdbc:drill:schema=dfs> select cast(a1 as integer) from `t1.json`;
> ++
> |   EXPR$0   |
> ++
> | 1  |
> | 2  |
> | 2  |
> | 3  |
> | 805551103  |
> ++
> 5 rows selected (0.074 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select cast(2147483648 as integer) from `t1.json`;
> ++
> |   EXPR$0   |
> ++
> | -2147483648 |
> | -2147483648 |
> | -2147483648 |
> | -2147483648 |
> | -2147483648 |
> ++
> 5 rows selected (0.076 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2001) Poor error message when arithmetic expression with MIN/MAX functions on a string type

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949346#comment-14949346
 ] 

Victoria Markman commented on DRILL-2001:
-

In 1.2.0 we return a better error; however, it should be a USER error and not 
a SYSTEM error:

{code}
0: jdbc:drill:schema=dfs>  select min(a1)+1 from `t1.json`;
Error: SYSTEM ERROR: NumberFormatException: aaa
Fragment 0:0
[Error Id: 4623ffca-3b83-4d92-9c55-585fedb27789 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

> Poor error message when arithmetic expression with MIN/MAX functions on a 
> string type
> -
>
> Key: DRILL-2001
> URL: https://issues.apache.org/jira/browse/DRILL-2001
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Priority: Minor
> Fix For: 1.4.0
>
>
> We do support MIN/MAX().
> However, if I inadvertently were to use this in arithmetic expression, I 
> would get an exception:
> {code}
> { "a1": "aaa" }
> { "a1": "bbb" }
> { "a1": "bbb" }
> { "a1": "eee" }
> {code}
> Works correctly:
> {code}
> 0: jdbc:drill:schema=dfs> select max(a1) from `t.json`;
> ++
> |   EXPR$0   |
> ++
> | eee|
> ++
> 1 row selected (0.085 seconds)
> 0: jdbc:drill:schema=dfs> select min(a1) from `t.json`;
> ++
> |   EXPR$0   |
> ++
> | aaa|
> ++
> 1 row selected (0.104 seconds)
> {code}
> Throws an exception:
> {code}
> 0: jdbc:drill:schema=dfs> select min(a1)+1 from `t.json`;
> ++
> |   EXPR$0   |
> ++
> Query failed: Query failed: Failure while running fragment., aaa [ 
> 9194a73d-676d-4e63-8a49-a3c0ff0e63e0 on atsqa4-133.qa.lab:31010 ]
> [ 9194a73d-676d-4e63-8a49-a3c0ff0e63e0 on atsqa4-133.qa.lab:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
> query.
>   at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
>   at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
>   at sqlline.SqlLine.print(SqlLine.java:1809)
>   at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
>   at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
>   at sqlline.SqlLine.dispatch(SqlLine.java:889)
>   at sqlline.SqlLine.begin(SqlLine.java:763)
>   at sqlline.SqlLine.start(SqlLine.java:498)
>   at sqlline.SqlLine.main(SqlLine.java:460)
> 0: jdbc:drill:schema=dfs> select max(a1)+1 from `t.json`;
> ++
> |   EXPR$0   |
> ++
> Query failed: Query failed: Failure while running fragment., eee [ 
> 23d7d06d-b93b-4cb0-a35b-c0fa0faf3369 on atsqa4-133.qa.lab:31010 ]
> [ 23d7d06d-b93b-4cb0-a35b-c0fa0faf3369 on atsqa4-133.qa.lab:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
> query.
>   at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
>   at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
>   at sqlline.SqlLine.print(SqlLine.java:1809)
>   at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
>   at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
>   at sqlline.SqlLine.dispatch(SqlLine.java:889)
>   at sqlline.SqlLine.begin(SqlLine.java:763)
>   at sqlline.SqlLine.start(SqlLine.java:498)
>   at sqlline.SqlLine.main(SqlLine.java:460)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-1998) Confusing error message when you pass character data type to SUM aggregate function

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman resolved DRILL-1998.
-
Resolution: Fixed

> Confusing error message when you pass character data type to SUM aggregate 
> function
> ---
>
> Key: DRILL-1998
> URL: https://issues.apache.org/jira/browse/DRILL-1998
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Priority: Minor
> Fix For: Future
>
>
> t.json
> {code}
> { "a2": 1,  "b2": 1,"c2": 1, "d2": "aaa",   e1: "2015-01-01"}
> { "a2": 2,  "b2": 2,"c2": 2, "d2": "bbb",   e1: "2015-01-02"}
> { "a2": 2,  "b2": 2,"c2": 2, "d2": "bbb",   e1: "2015-01-02"}
> { "a2": 5,  "b2": 5,"c2": 5, "d2": "eee",   e1: "2015-01-05"}
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select sum(d2) from `t2.json`;
> Query failed: Query failed: Failure while running fragment., Only COUNT 
> aggregate function supported for Boolean type [ 
> 003f011e-ae98-47ac-99c8-1674b1edb74b on atsqa4-133.qa.lab:31010 ]
> [ 003f011e-ae98-47ac-99c8-1674b1edb74b on atsqa4-133.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-1998) Confusing error message when you pass character data type to SUM aggregate function

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949337#comment-14949337
 ] 

Victoria Markman commented on DRILL-1998:
-

This bug is fixed in version 1.1.0, see comments in: 
https://issues.apache.org/jira/browse/DRILL-3245

In 1.2.0
{code}
0: jdbc:drill:schema=dfs> select sum(d2) from `t1.json`;
Error: UNSUPPORTED_OPERATION ERROR: Only COUNT, MIN and MAX aggregate functions 
supported for VarChar type
Fragment 0:0
[Error Id: 405da0ff-f3c9-4617-a463-76cade51aefe on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

> Confusing error message when you pass character data type to SUM aggregate 
> function
> ---
>
> Key: DRILL-1998
> URL: https://issues.apache.org/jira/browse/DRILL-1998
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Priority: Minor
> Fix For: Future
>
>
> t.json
> {code}
> { "a2": 1,  "b2": 1,"c2": 1, "d2": "aaa",   e1: "2015-01-01"}
> { "a2": 2,  "b2": 2,"c2": 2, "d2": "bbb",   e1: "2015-01-02"}
> { "a2": 2,  "b2": 2,"c2": 2, "d2": "bbb",   e1: "2015-01-02"}
> { "a2": 5,  "b2": 5,"c2": 5, "d2": "eee",   e1: "2015-01-05"}
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select sum(d2) from `t2.json`;
> Query failed: Query failed: Failure while running fragment., Only COUNT 
> aggregate function supported for Boolean type [ 
> 003f011e-ae98-47ac-99c8-1674b1edb74b on atsqa4-133.qa.lab:31010 ]
> [ 003f011e-ae98-47ac-99c8-1674b1edb74b on atsqa4-133.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1868) Provide options to handle nonexistent columns in schema-less query

2015-10-08 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-1868:

Summary: Provide options to handle nonexistent columns in schema-less query 
 (was: Filtering on an alias should return an error, user  gets wrong result 
instead)

> Provide options to handle nonexistent columns in schema-less query
> --
>
> Key: DRILL-1868
> URL: https://issues.apache.org/jira/browse/DRILL-1868
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: SQL Parser
>Reporter: Victoria Markman
>  Labels: document_if_not_fixed
> Fix For: Future
>
>
> git.commit.id.abbrev=c65928f
> {code}
> 0: jdbc:drill:schema=dfs> select * from `test.json`;
> +++
> | eventdate  |sold|
> +++
> | 2014-01-01 | 100|
> | 2014-01-01 | 100|
> | 2014-02-01 | 200|
> +++
> 3 rows selected (0.099 seconds)
> {code}
>  
> {code}
> 0: jdbc:drill:schema=dfs> -- Correct result
> 0: jdbc:drill:schema=dfs> SELECT
> . . . . . . . . . . . . > extract( month from eventdate ) as 
> `month`,
> . . . . . . . . . . . . > extract( year  from eventdate ) as 
> `year`
> . . . . . . . . . . . . > 
> . . . . . . . . . . . . > FROM`test.json`
> . . . . . . . . . . . . > WHERE   extract( month from eventdate ) IS 
> NOT NULL;
> +++
> |   month|year|
> +++
> | 1  | 2014   |
> | 1  | 2014   |
> | 2  | 2014   |
> +++
> 3 rows selected (0.074 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> -- Wrong result, should throw an error
> 0: jdbc:drill:schema=dfs> SELECT
> . . . . . . . . . . . . > extract( month from eventdate ) as 
> `month`,
> . . . . . . . . . . . . > extract( year  from eventdate ) as 
> `year`
> . . . . . . . . . . . . > 
> . . . . . . . . . . . . > FROM`test.json`
> . . . . . . . . . . . . > WHERE   `month` IS NOT NULL;
> +++
> |   month|year|
> +++
> +++
> No rows selected (0.079 seconds)
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> -- Wrong result, should throw an error
> 0: jdbc:drill:schema=dfs> SELECT
> . . . . . . . . . . . . > extract( month from eventdate ) as 
> xyz,
> . . . . . . . . . . . . > extract( year  from eventdate ) as 
> `year`
> . . . . . . . . . . . . > 
> . . . . . . . . . . . . > FROM`test.json`
> . . . . . . . . . . . . > WHERE   xyz IS NOT NULL;
> +++
> |xyz |year|
> +++
> +++
> No rows selected (0.073 seconds)
> {code}
> {code} 
> 0: jdbc:drill:schema=dfs> -- Correct result
> 0: jdbc:drill:schema=dfs> SELECT *
> . . . . . . . . . . . . > FROM
> . . . . . . . . . . . . > (
> . . . . . . . . . . . . > SELECT
> . . . . . . . . . . . . > extract( month from eventdate ) as 
> `month`,
> . . . . . . . . . . . . > extract( year  from eventdate ) as 
> `year`
> . . . . . . . . . . . . > 
> . . . . . . . . . . . . > FROM`test.json`
> . . . . . . . . . . . . > WHERE   `month` IS NULL
> . . . . . . . . . . . . > )
> . . . . . . . . . . . . > WHERE `month` IS NOT NULL;
> +++
> |   month|year|
> +++
> | 1  | 2014   |
> | 1  | 2014   |
> | 2  | 2014   |
> +++
> 3 rows selected (0.099 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-1895) Improve message where implicit cast fails (e.g. with IN clause)

2015-10-08 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949304#comment-14949304
 ] 

Victoria Markman commented on DRILL-1895:
-

Tried with 1.2.0; the error is much more descriptive now. However, I don't 
think it should be a SYSTEM error, but rather a USER error:

{code}
0: jdbc:drill:schema=dfs> select * from cp.`tpch/nation.parquet` where 
n_regionkey in ('abc');
Error: SYSTEM ERROR: NumberFormatException: abc
Fragment 0:0
[Error Id: 8f70a9f3-ab15-4c72-8887-14b05ad57752 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

> Improve message where implicit cast fails (e.g. with IN clause)
> ---
>
> Key: DRILL-1895
> URL: https://issues.apache.org/jira/browse/DRILL-1895
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.7.0
>Reporter: Victoria Markman
>Priority: Minor
> Fix For: Future
>
>
> -- Works, because value in "IN CLAUSE" is a compatible numeric type
> {code}
> 0: jdbc:drill:schema=dfs> select * from cp.`tpch/nation.parquet` where 
> n_regionkey in (4);
> +-++-++
> | n_nationkey |   n_name   | n_regionkey | n_comment  |
> +-++-++
> | 4   | EGYPT  | 4   | y above the carefully unusual 
> theodolites. final dugouts are quickly across the furiously regular d |
> | 10  | IRAN   | 4   | efully alongside of the slyly 
> final dependencies.  |
> | 11  | IRAQ   | 4   | nic deposits boost atop the 
> quickly final requests? quickly regula |
> | 13  | JORDAN | 4   | ic deposits are blithely about the 
> carefully regular pa |
> | 20  | SAUDI ARABIA | 4   | ts. silent requests haggle. 
> closely express packages sleep across the blithely |
> +-++-++
> 5 rows selected (0.092 seconds)
> {code}
> -- WORKS (trying to convert literal string to numeric and succeeds, because 
> it can be implicitly converted)
> {code}
> 0: jdbc:drill:schema=dfs> select * from cp.`tpch/nation.parquet` where 
> n_regionkey in ('4');
> +-------------+--------------+-------------+-----------+
> | n_nationkey | n_name       | n_regionkey | n_comment |
> +-------------+--------------+-------------+-----------+
> | 4           | EGYPT        | 4           | y above the carefully unusual theodolites. final dugouts are quickly across the furiously regular d |
> | 10          | IRAN         | 4           | efully alongside of the slyly final dependencies. |
> | 11          | IRAQ         | 4           | nic deposits boost atop the quickly final requests? quickly regula |
> | 13          | JORDAN       | 4           | ic deposits are blithely about the carefully regular pa |
> | 20          | SAUDI ARABIA | 4           | ts. silent requests haggle. closely express packages sleep across the blithely |
> +-------------+--------------+-------------+-----------+
> 5 rows selected (0.073 seconds)
> {code}
> -- FAILS (can't be converted to numeric type)
> {code}
> 0: jdbc:drill:schema=dfs> select * from cp.`tpch/nation.parquet` where 
> n_regionkey in ('abc');
> Query failed: Query failed: Failure while running fragment., abc [ 
> 4578a64c-75c5-4acf-be8c-28ce0db8623d on atsqa4-133.qa.lab:31010 ]
> [ 4578a64c-75c5-4acf-be8c-28ce0db8623d on atsqa4-133.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> 0: jdbc:drill:schema=dfs> select * from cp.`tpch/nation.parquet` where 
> n_regionkey in ('4');
> +-------------+--------------+-------------+-----------+
> | n_nationkey | n_name       | n_regionkey | n_comment |
> +-------------+--------------+-------------+-----------+
> | 4           | EGYPT        | 4           | y above the carefully unusual theodolites. final dugouts are quickly across the furiously regular d |
> | 10          | IRAN         | 4           | efully alongside of the slyly final dependencies. |
> | 11          | IRAQ         | 4           | nic deposits boost atop the quickly final requests? quickly regula |
> | 13          | JORDAN       | 4           | ic deposits are blithely about the carefully regular pa |
> | 20          | SAUDI ARABIA | 4           | ts. silent requests haggle. closely express packages sleep across the blithely |
> +-------------+--------------+-------------+-----------+
> 5 rows selected (0.073 seconds)
> {code}
> It would be really neat to get a descriptive error message.
> Postgres example:
> {code}
> postgres=# select * from t1 where c1 in ('abc');
> ERROR:  invalid input syntax for integer: "abc"
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3916) Assembly for JDBC storage plugin missing

2015-10-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949243#comment-14949243
 ] 

ASF GitHub Bot commented on DRILL-3916:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/192


> Assembly for JDBC storage plugin missing
> 
>
> Key: DRILL-3916
> URL: https://issues.apache.org/jira/browse/DRILL-3916
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.2.0
>Reporter: Andrew
>Assignee: Andrew
>
> The JDBC storage plugin is missing from the assembly instructions, which 
> means that the plugin fails to load when the drillbit starts.





[jira] [Commented] (DRILL-3916) Assembly for JDBC storage plugin missing

2015-10-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949218#comment-14949218
 ] 

ASF GitHub Bot commented on DRILL-3916:
---

GitHub user aleph-zero opened a pull request:

https://github.com/apache/drill/pull/192

DRILL-3916: Add JDBC plugin to assembly

This commit adds the JDBC plugin jar to the assembly so that it can be
loaded by Drill as a storage plugin.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aleph-zero/drill issues/DRILL-3916

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/192.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #192


commit ca2c4f29b4bac21c949dd340fb6214cc5b2afbe4
Author: aleph-zero 
Date:   2015-10-08T19:25:11Z

DRILL-3916: Add JDBC plugin to assembly

This commit adds the JDBC plugin jar to the assembly so that it can be
loaded by Drill as a storage plugin.




> Assembly for JDBC storage plugin missing
> 
>
> Key: DRILL-3916
> URL: https://issues.apache.org/jira/browse/DRILL-3916
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.2.0
>Reporter: Andrew
>Assignee: Andrew
>
> The JDBC storage plugin is missing from the assembly instructions, which 
> means that the plugin fails to load when the drillbit starts.





[jira] [Created] (DRILL-3916) Assembly for JDBC storage plugin missing

2015-10-08 Thread Andrew (JIRA)
Andrew created DRILL-3916:
-

 Summary: Assembly for JDBC storage plugin missing
 Key: DRILL-3916
 URL: https://issues.apache.org/jira/browse/DRILL-3916
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Other
Affects Versions: 1.2.0
Reporter: Andrew
Assignee: Andrew


The JDBC storage plugin is missing from the assembly instructions, which means 
that the plugin fails to load when the drillbit starts.





[jira] [Updated] (DRILL-3873) SchemaChangeException : CONVERT_FROM(...row_key...) IS NOT NULL

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3873:
--
Assignee: (was: Mehant Baid)

> SchemaChangeException : CONVERT_FROM(...row_key...) IS NOT NULL
> ---
>
> Key: DRILL-3873
> URL: https://issues.apache.org/jira/browse/DRILL-3873
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> A SchemaChangeException is seen when
> CONVERT_FROM(BYTE_SUBSTR(row_key,1,8),'uint8_be') IS NOT NULL
> is used in a predicate.
> Drill master commit ID : 69c73af5
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select * from dt_Tbl WHERE 
> CONVERT_FROM(BYTE_SUBSTR(row_key,1,8),'uint8_be') IS NOT NULL;
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
> Error in expression at index -1.  Error: Missing function implementation: 
> [convert_fromUINT8_BE(VARBINARY-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> Fragment 0:0
> [Error Id: c29cc65c-dfe3-4a18-ae64-ba65c97dfb73 on centos-03.qa.lab:31010] 
> (state=,code=0)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-09-30 21:01:26,935 [29f3b259-3fd2-2fe2-3cf7-7ecf3f2b4278:frag:0:0] ERROR 
> o.a.d.e.r.AbstractSingleRecordBatch - Failure during query
> org.apache.drill.exec.exception.SchemaChangeException: Failure while trying 
> to materialize incoming schema.  Errors:
> Error in expression at index -1.  Error: Missing function implementation: 
> [convert_fromUINT8_BE(VARBINARY-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> at 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer(FilterRecordBatch.java:181)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema(FilterRecordBatch.java:107)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:104)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:94)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:94)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:104)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:94)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:83) 
> [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:80)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:73) 
> [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:258)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:252)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) 

[jira] [Updated] (DRILL-3870) CONVERT_FROM(..., uint8_be) can not be used in project

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3870:
--
Fix Version/s: (was: 1.3.0)
   Future

> CONVERT_FROM(..., uint8_be) can not be used in project
> --
>
> Key: DRILL-3870
> URL: https://issues.apache.org/jira/browse/DRILL-3870
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>Assignee: Aditya Kishore
> Fix For: Future
>
>
> CONVERT_FROM(..., 'uint8_be') results in an UNSUPPORTED_OPERATION error when 
> used in a projection. However, when it is used in a predicate in the WHERE 
> clause, no error is reported.
> Data in the HBase table has long integer values.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT 
> CONVERT_FROM(BYTE_SUBSTR(row_key,1,8),'bigint_be') AS RK from dt_Tbl T;
> +----------------+
> |       RK       |
> +----------------+
> | 140892480      |
> | 1443240886094  |
> | 1443413686094  |
> | 1443491446094  |
> | 1443500086094  |
> +----------------+
> 5 rows selected (0.767 seconds)
> {code}
> The query below fails when the CONVERT_FROM function is used in the projection.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT 
> CONVERT_FROM(BYTE_SUBSTR(row_key,1,8),'uint8_be') AS RK from dt_Tbl T;
> Error: UNSUPPORTED_OPERATION ERROR: CONVERT_FROM does not support conversion 
> from type 'uint8_be'.
> Did you mean UINT8?
> [Error Id: 1e58b2bb-73c3-4c55-84db-e59b80ea6dbb on centos-03.qa.lab:31010] 
> (state=,code=0)
> {code}
> When the same CONVERT_FROM function is used in the predicate, no error is 
> reported.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select * from dt_Tbl WHERE 
> CONVERT_FROM(BYTE_SUBSTR(row_key,1,8),'uint8_be') < cast(1443500086094 as 
> bigint);
> +--------------+-----------------------+
> | row_key      | colfam1               |
> +--------------+-----------------------+
> | [B@14c9d776  | {"qual1":"dmFsMQ=="}  |
> | [B@4f608ea3  | {"qual1":"dmFsNQ=="}  |
> | [B@4c34980b  | {"qual1":"dmFsNA=="}  |
> | [B@10ea2143  | {"qual1":"dmFsMw=="}  |
> +--------------+-----------------------+
> 4 rows selected (0.865 seconds)
> {code}
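For reference, the `bigint_be` encoding used in the working query above is just an 8-byte big-endian two's-complement integer. Below is a minimal Java sketch (not Drill's implementation; `decodeBigintBe` is a hypothetical helper) of the decode that CONVERT_FROM(..., 'bigint_be') performs on the first 8 bytes of the row key:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the byte-level decode behind CONVERT_FROM(..., 'bigint_be'):
// interpret the first 8 bytes as a big-endian signed 64-bit integer.
public class BigEndianDecodeDemo {

    static long decodeBigintBe(byte[] rowKey) {
        // ByteBuffer is big-endian by default, matching the *_be encodings.
        return ByteBuffer.wrap(rowKey, 0, 8).getLong();
    }

    public static void main(String[] args) {
        // Round-trip one of the row-key values shown in the issue.
        byte[] key = ByteBuffer.allocate(8).putLong(1443240886094L).array();
        System.out.println(decodeBigintBe(key)); // prints 1443240886094
    }
}
```

'uint8_be' would be the unsigned interpretation of the same 8 bytes; the errors above suggest that no convert_from implementation is registered for that encoding.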





[jira] [Updated] (DRILL-3873) SchemaChangeException : CONVERT_FROM(...row_key...) IS NOT NULL

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3873:
--
Fix Version/s: (was: 1.3.0)
   Future

> SchemaChangeException : CONVERT_FROM(...row_key...) IS NOT NULL
> ---
>
> Key: DRILL-3873
> URL: https://issues.apache.org/jira/browse/DRILL-3873
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> A SchemaChangeException is seen when
> CONVERT_FROM(BYTE_SUBSTR(row_key,1,8),'uint8_be') IS NOT NULL
> is used in a predicate.
> Drill master commit ID : 69c73af5
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select * from dt_Tbl WHERE 
> CONVERT_FROM(BYTE_SUBSTR(row_key,1,8),'uint8_be') IS NOT NULL;
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
> Error in expression at index -1.  Error: Missing function implementation: 
> [convert_fromUINT8_BE(VARBINARY-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> Fragment 0:0
> [Error Id: c29cc65c-dfe3-4a18-ae64-ba65c97dfb73 on centos-03.qa.lab:31010] 
> (state=,code=0)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-09-30 21:01:26,935 [29f3b259-3fd2-2fe2-3cf7-7ecf3f2b4278:frag:0:0] ERROR 
> o.a.d.e.r.AbstractSingleRecordBatch - Failure during query
> org.apache.drill.exec.exception.SchemaChangeException: Failure while trying 
> to materialize incoming schema.  Errors:
> Error in expression at index -1.  Error: Missing function implementation: 
> [convert_fromUINT8_BE(VARBINARY-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> at 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer(FilterRecordBatch.java:181)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema(FilterRecordBatch.java:107)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:104)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:94)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:94)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:104)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:94)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:83) 
> [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:80)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:73) 
> [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:258)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:252)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at java.security.AccessController.doP

[jira] [Updated] (DRILL-3881) Rowkey filter does not get pushed into Scan

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3881:
--
Fix Version/s: (was: 1.3.0)
   Future

> Rowkey filter does not get pushed into Scan
> ---
>
> Key: DRILL-3881
> URL: https://issues.apache.org/jira/browse/DRILL-3881
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Assignee: Smidth Panchamia
>Priority: Critical
> Fix For: Future
>
>
> Rowkey filter does not get pushed down into Scan
> 4 node cluster CentOS
> Drill master commit ID: b9afcf8f
> case 1) Rowkey filter does not get pushed into Scan
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select 
> CONVERT_FROM(ROW_KEY,'FLOAT_OB') AS 
> RK,CONVERT_FROM(T.`colfam1`.`qual1`,'UTF8') FROM flt_Tbl T WHERE ROW_KEY = 
> CAST('3.0838087E38' AS FLOAT);
> +------+------+
> | text | json |
> +------+------+
> | 00-00Screen
> 00-01  Project(RK=[CONVERT_FROMFLOAT_OB($0)], 
> EXPR$1=[CONVERT_FROMUTF8(ITEM($1, 'qual1'))])
> 00-02SelectionVectorRemover
> 00-03  Filter(condition=[=($0, CAST('3.0838087E38'):FLOAT NOT NULL)])
> 00-04Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=flt_Tbl, startRow=null, stopRow=null, filter=null], 
> columns=[`*`]]])
> {code}
> case 2) Rowkey filter does not get pushed into Scan
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select 
> CONVERT_FROM(ROW_KEY,'FLOAT_OB') AS 
> RK,CONVERT_FROM(T.`colfam1`.`qual1`,'UTF8') FROM flt_Tbl T WHERE 
> CONVERT_FROM(ROW_KEY,'FLOAT_OB') = CAST('3.0838087E38' AS FLOAT) AND 
> CONVERT_FROM(T.`colfam1`.`qual1`,'UTF8') LIKE '%30838087473969088%' order by 
> CONVERT_FROM(ROW_KEY,'FLOAT_OB') ASC;
> +------+------+
> | text | json |
> +------+------+
> | 00-00Screen
> 00-01  Project(RK=[$0], EXPR$1=[$1])
> 00-02SelectionVectorRemover
> 00-03  Sort(sort0=[$0], dir0=[ASC])
> 00-04Project(RK=[CONVERT_FROMFLOAT_OB($0)], 
> EXPR$1=[CONVERT_FROMUTF8(ITEM($1, 'qual1'))])
> 00-05  SelectionVectorRemover
> 00-06Filter(condition=[AND(=(CONVERT_FROM($0, 'FLOAT_OB'), 
> CAST('3.0838087E38'):FLOAT NOT NULL), LIKE(CONVERT_FROM(ITEM($1, 'qual1'), 
> 'UTF8'), '%30838087473969088%'))])
> 00-07  Scan(groupscan=[HBaseGroupScan 
> [HBaseScanSpec=HBaseScanSpec [tableName=flt_Tbl, startRow=, stopRow=, 
> filter=SingleColumnValueFilter (colfam1, qual1, EQUAL, 
> ^.*\x5CQ30838087473969088\x5CE.*$)], columns=[`*`]]])
> {code}
> Same as case (2), except that ASC is missing from the ORDER BY clause.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select 
> CONVERT_FROM(ROW_KEY,'FLOAT_OB') AS 
> RK,CONVERT_FROM(T.`colfam1`.`qual1`,'UTF8') FROM flt_Tbl T WHERE 
> CONVERT_FROM(ROW_KEY,'FLOAT_OB') = CAST('3.0838087E38' AS FLOAT) AND 
> CONVERT_FROM(T.`colfam1`.`qual1`,'UTF8') LIKE '%30838087473969088%' order by 
> CONVERT_FROM(ROW_KEY,'FLOAT_OB');
> +------+------+
> | text | json |
> +------+------+
> | 00-00Screen
> 00-01  Project(RK=[$0], EXPR$1=[$1])
> 00-02SelectionVectorRemover
> 00-03  Sort(sort0=[$0], dir0=[ASC])
> 00-04Project(RK=[CONVERT_FROMFLOAT_OB($0)], 
> EXPR$1=[CONVERT_FROMUTF8(ITEM($1, 'qual1'))])
> 00-05  SelectionVectorRemover
> 00-06Filter(condition=[AND(=(CONVERT_FROM($0, 'FLOAT_OB'), 
> CAST('3.0838087E38'):FLOAT NOT NULL), LIKE(CONVERT_FROM(ITEM($1, 'qual1'), 
> 'UTF8'), '%30838087473969088%'))])
> 00-07  Scan(groupscan=[HBaseGroupScan 
> [HBaseScanSpec=HBaseScanSpec [tableName=flt_Tbl, startRow=, stopRow=, 
> filter=SingleColumnValueFilter (colfam1, qual1, EQUAL, 
> ^.*\x5CQ30838087473969088\x5CE.*$)], columns=[`*`]]])
> {code}
> Snippet that creates the HBase table and inserts data into it.
> {code}
> public static void main(String args[]) throws IOException {
> Configuration conf = HBaseConfiguration.create();
> conf.set("hbase.zookeeper.property.clientPort","5181");
> HBaseAdmin admin = new HBaseAdmin(conf);
> if (admin.tableExists("flt_Tbl")) {
> admin.disableTable("flt_Tbl");
> admin.deleteTable("flt_Tbl");
> }
> HTableDescriptor tableDesc = new
> HTableDescriptor(TableName.valueOf("flt_Tbl"));
> tableDesc.addFamily(new HColumnDescriptor("colfam1"));
> admin.createTable(tableDesc);
> HTable table  = new HTable(conf, "flt_Tbl");
> //for (float i = (float)0.5; i <= 100.00; i += 0.75) {
> for (float i = (float)1.4E-45; i <= Float.MAX_VALUE; i += 
> Float.MAX_VALUE / 64) {
> byte[] bytes = new byte[5];
> org.apache.hadoop.hbase.util.PositionedByteRange 

[jira] [Updated] (DRILL-3797) Filter not pushed down as part of scan (for JSON data)

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3797:
--
Assignee: (was: Sean Hsuan-Yi Chu)

> Filter not pushed down as part of scan (for JSON data)
> --
>
> Key: DRILL-3797
> URL: https://issues.apache.org/jira/browse/DRILL-3797
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Priority: Minor
> Fix For: Future
>
>
> Filter is not part of the scan operator.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select key1, key2 from 
> `twoKeyJsn.json` where key1 = 1.26807240856E8;
> +------+------+
> | text | json |
> +------+------+
> | 00-00Screen
> 00-01  Project(key1=[$0], key2=[$1])
> 00-02SelectionVectorRemover
> 00-03  Filter(condition=[=($0, 1.26807240856E8)])
> 00-04Scan(groupscan=[EasyGroupScan 
> [selectionRoot=maprfs:/tmp/twoKeyJsn.json, numFiles=1, columns=[`key1`, 
> `key2`], files=[maprfs:///tmp/twoKeyJsn.json]]])
> {code}
> {code}
> Here is snippet of data from JSON file
> root@centos-01 ~]# head -n 10 twoKeyJsn.json
> {"key1":1296815267.3,"key2":"d"}
> {"key1":46736552.9012,"key2":"c"}
> {"key1":93968206.5896,"key2":"b"}
> {"key1":1015801729.33,"key2":"d"}
> {"key1":49878.641,"key2":"1"}
> {"key1":152391833.107,"key2":"1"}
> {"key1":731290386.917,"key2":"a"}
> {"key1":692726688.161,"key2":"d"}
> {"key1":1123835226.54,"key2":"a"}
> {"key1":126807240.856,"key2":"1"}
> {code}





[jira] [Updated] (DRILL-3805) Empty JSON on LHS UNION non empty JSON on RHS must return results

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3805:
--
Fix Version/s: (was: 1.3.0)
   Future

> Empty JSON on LHS UNION non empty JSON on RHS must return results
> -
>
> Key: DRILL-3805
> URL: https://issues.apache.org/jira/browse/DRILL-3805
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Priority: Minor
> Fix For: Future
>
>
> When the input on the LHS of the UNION operator is empty and there is 
> non-empty input on the RHS, we should return the data from the RHS. 
> Currently we raise a SchemaChangeException.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select key1 from `empty.json` UNION select key1 
> from `fewRows.json`;
> Error: SYSTEM ERROR: SchemaChangeException: The left input of Union-All 
> should not come from an empty data source
> Fragment 0:0
> [Error Id: f0fcff87-f470-46a8-9733-316b7da1a87f on centos-02.qa.lab:31010] 
> (state=,code=0)
> {code}
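The requested behavior amounts to a schema-resolution rule for Union-All. Below is a minimal Java sketch (hypothetical; not Drill's operator code, with schemas modeled simply as lists of column names) of adopting the non-empty side's schema:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: when one Union-All input comes from an empty data
// source (no schema), adopt the other side's schema instead of failing.
public class UnionSchemaDemo {

    static List<String> resolveUnionSchema(List<String> left, List<String> right) {
        if (left.isEmpty()) return right;   // empty LHS: use the RHS schema
        if (right.isEmpty()) return left;   // empty RHS: use the LHS schema
        if (!left.equals(right)) {
            throw new IllegalStateException("incompatible Union-All schemas");
        }
        return left;
    }

    public static void main(String[] args) {
        // Empty LHS (e.g. empty.json) unioned with a one-column RHS.
        List<String> resolved = resolveUnionSchema(
                Collections.emptyList(), Arrays.asList("key1"));
        System.out.println(resolved); // prints [key1]
    }
}
```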





[jira] [Updated] (DRILL-3805) Empty JSON on LHS UNION non empty JSON on RHS must return results

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3805:
--
Assignee: (was: Sean Hsuan-Yi Chu)

> Empty JSON on LHS UNION non empty JSON on RHS must return results
> -
>
> Key: DRILL-3805
> URL: https://issues.apache.org/jira/browse/DRILL-3805
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Priority: Minor
> Fix For: Future
>
>
> When the input on the LHS of the UNION operator is empty and there is 
> non-empty input on the RHS, we should return the data from the RHS. 
> Currently we raise a SchemaChangeException.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select key1 from `empty.json` UNION select key1 
> from `fewRows.json`;
> Error: SYSTEM ERROR: SchemaChangeException: The left input of Union-All 
> should not come from an empty data source
> Fragment 0:0
> [Error Id: f0fcff87-f470-46a8-9733-316b7da1a87f on centos-02.qa.lab:31010] 
> (state=,code=0)
> {code}





[jira] [Updated] (DRILL-3865) Difference in results - query parquet table vs query HBase table

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3865:
--
Assignee: (was: Aditya Kishore)

> Difference in results - query parquet table vs query HBase table
> 
>
> Key: DRILL-3865
> URL: https://issues.apache.org/jira/browse/DRILL-3865
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> Querying HBase from Drill returns no results, while querying the same data 
> in Parquet format returns the expected results.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> describe dt_Tbl;
> +--------------+------------+--------------+
> | COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
> +--------------+------------+--------------+
> | row_key      | ANY        | NO           |
> | colfam1      | MAP        | NO           |
> +--------------+------------+--------------+
> 2 rows selected (1.494 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select RK from dfs.tmp.t20 WHERE RK >= TIME 
> '00:00:00' AND RK <= TIME '04:14:46.094';
> +---------------+
> |      RK       |
> +---------------+
> | 00:00:00      |
> | 04:14:46.094  |
> | 04:14:46.094  |
> | 01:50:46.094  |
> | 04:14:46.094  |
> +---------------+
> 5 rows selected (0.242 seconds)
> {code}
> {code}
> root@centos-01 ~]# parquet-tools/parquet-schema 0_0_0.parquet
> message root {
>   required int32 RK (TIME_MILLIS);  
> }
> 0: jdbc:drill:schema=dfs.tmp> SELECT 
> CONVERT_FROM(BYTE_SUBSTR(row_key,1,8),'time_epoch_be') AS RK from dt_Tbl T 
> WHERE CONVERT_FROM(BYTE_SUBSTR(row_key,1,8),'time_epoch_be') >= TIME 
> '00:00:00' AND CONVERT_FROM(BYTE_SUBSTR(row_key,1,8),'time_epoch_be') <= TIME 
> '04:14:46.094';
> +--+
> |  |
> +--+
> +--+
> No rows selected (0.902 seconds)
> explain plan for above query
> 00-01  Project(RK=[CONVERT_FROMTIME_EPOCH_BE(BYTE_SUBSTR($0, 1, 8))])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=dt_Tbl, startRow=\x00\x00\x00\x00\x00\x00\x00\x00, 
> stopRow=\x00\x00\x00\x00\x00\xE9?O, filter=null], columns=[`row_key`]]])
> {code}





[jira] [Updated] (DRILL-3854) IOB Exception : CONVERT_FROM (sal, int_be)

2015-10-08 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3854:
--
Assignee: (was: Mehant Baid)

> IOB Exception : CONVERT_FROM (sal, int_be)
> --
>
> Key: DRILL-3854
> URL: https://issues.apache.org/jira/browse/DRILL-3854
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
> Fix For: Future
>
>
> The CONVERT_FROM function results in an IndexOutOfBoundsException (IOB).
> Drill master commit id : b9afcf8f
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select salary from Emp;
> +---------+
> | salary  |
> +---------+
> | 8       |
> | 9       |
> | 20      |
> | 95000   |
> | 85000   |
> | 9       |
> | 10      |
> | 87000   |
> | 8       |
> | 10      |
> | 99000   |
> +---------+
> 11 rows selected (0.535 seconds)
> # create table using above Emp table
> create table tbl_int_be as select convert_to(salary, 'int_be') sal from Emp;
> 0: jdbc:drill:schema=dfs.tmp> alter session set `planner.slice_target`=1;
> +-------+--------------------------------+
> |  ok   |            summary             |
> +-------+--------------------------------+
> | true  | planner.slice_target updated.  |
> +-------+--------------------------------+
> 1 row selected (0.19 seconds)
> # The query below results in an IOB on the server.
> 0: jdbc:drill:schema=dfs.tmp> select convert_from(sal, 'int_be') from 
> tbl_int_be order by sal;
> Error: SYSTEM ERROR: IndexOutOfBoundsException: DrillBuf(ridx: 0, widx: 158, 
> cap: 158/158, unwrapped: SlicedByteBuf(ridx: 0, widx: 158, cap: 158/158, 
> unwrapped: UnsafeDirectLittleEndian(PooledUnsafeDirectByteBuf(ridx: 0, widx: 
> 0, cap: 417/417.slice(158, 44)
> Fragment 2:0
> [Error Id: 4ee1361d-9877-45eb-bde6-57d5add9fe5e on centos-04.qa.lab:31010] 
> (state=,code=0)
> # Applying the convert_from function while projecting the original column 
> # results in an IOB on the client (note the Error Id is missing).
> 0: jdbc:drill:schema=dfs.tmp> select convert_from(sal, 'int_be'), sal from 
> tbl_int_be;
> Error: Unexpected RuntimeException: java.lang.IndexOutOfBoundsException: 
> DrillBuf(ridx: 0, widx: 114, cap: 114/114, unwrapped: DrillBuf(ridx: 321, 
> widx: 321, cap: 321/321, unwrapped: 
> UnsafeDirectLittleEndian(PooledUnsafeDirectByteBuf(ridx: 0, widx: 0, cap: 
> 321/321.slice(55, 103) (state=,code=0)
> {code}
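For context, `int_be` is a 4-byte big-endian integer encoding, so convert_to followed by convert_from should round-trip a salary value exactly. Below is a minimal Java sketch (hypothetical helpers, not Drill code) of that round trip:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the int_be round trip used by the queries above:
// convert_to writes a 4-byte big-endian int, convert_from reads it back.
public class IntBeRoundTrip {

    static byte[] toIntBe(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    static int fromIntBe(byte[] b) {
        // Big-endian is ByteBuffer's default byte order.
        return ByteBuffer.wrap(b).getInt();
    }

    public static void main(String[] args) {
        System.out.println(fromIntBe(toIntBe(95000))); // prints 95000
    }
}
```

Since the encoding itself round-trips cleanly, the IOB errors reported above would seem to point at buffer offsets and slicing in the execution path rather than at the encoding.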




