[jira] [Updated] (DRILL-3546) S3 - jets3t - No such File Or Directory

2015-07-24 Thread Philip Deegan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Deegan updated DRILL-3546:
-
Description: 
Tested on 1.1 with commit id:
{noformat}
0: jdbc:drill:zk=local> select commit_id from sys.version;
+-------------------------------------------+
|                 commit_id                 |
+-------------------------------------------+
| e3fc7e97bfe712dc09d43a8a055a5135c96b7344  |
+-------------------------------------------+
{noformat}

Three-instance ZooKeeper cluster running Drill with the jets3t plugin. 
Occasionally throws a "No such file or directory" error. Example query: SELECT 
COUNT(*) FROM s3.json_directory;
Might be a jets3t issue, existing issue here: 
https://bitbucket.org/jmurty/jets3t/issues/215/drill-intermittent-file-not-found-error
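Intermittent missing-file errors against S3 are often a symptom of eventual consistency in the object listing. A common client-side mitigation is to retry the open a few times before failing; this is a hypothetical sketch (not anything Drill or jets3t provides), with `open_fn` standing in for whatever opens the object:

```python
import time

def read_with_retry(open_fn, path, attempts=3, delay=1.0):
    # Retry an intermittently failing open; eventual consistency means
    # a just-listed key can briefly raise "No such file or directory".
    last_error = None
    for _ in range(attempts):
        try:
            return open_fn(path)
        except FileNotFoundError as e:
            last_error = e
            time.sleep(delay)
    raise last_error
```

This only papers over the race; the linked jets3t issue suggests the root cause may be in the connector itself.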
{code}
org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: Failure reading JSON file - No such file or directory 's3n://json_directory/xyz.json.gz'

File  /json_directory/xyz.json.gz
Record  1

[Error Id: 3f83967b-0b7b-4778-b623-b7a20528e3d1 ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523) ~[drill-common-1.1.0.jar:1.1.0]
at org.apache.drill.exec.store.easy.json.JSONRecordReader.handleAndRaise(JSONRecordReader.java:161) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.store.easy.json.JSONRecordReader.setup(JSONRecordReader.java:130) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ScanBatch.<init>(ScanBatch.java:100) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:195) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:150) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:106) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:81) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:235) [drill-java-exec-1.1.0.jar:1.1.0]
at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.1.0.jar:1.1.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_80]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: 

[jira] [Commented] (DRILL-3547) IndexOutOfBoundsException on directory with ~20 subdirectories

2015-07-24 Thread Philip Deegan (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640166#comment-14640166
 ] 

Philip Deegan commented on DRILL-3547:
--

The issue is due to an empty file amongst non-empty files.
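That diagnosis matches the stack trace: a reader that assumes every file yields at least one record will index into an empty batch. A minimal Python sketch of that failure mode (hypothetical reader logic, not Drill's actual code):

```python
def first_record(batches):
    # Naive reader: assumes every file produces a non-empty batch.
    # An empty file yields [], and batch[0] raises IndexError --
    # the Python analogue of Drill's IndexOutOfBoundsException.
    return [batch[0] for batch in batches]

files = [[{"a": 1}], [], [{"a": 2}]]  # the middle file is empty
try:
    first_record(files)
except IndexError:
    print("failed on the empty batch")
```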

 IndexOutOfBoundsException on directory with ~20 subdirectories
 --

 Key: DRILL-3547
 URL: https://issues.apache.org/jira/browse/DRILL-3547
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.1.0
 Environment: RHEL 7
Reporter: Philip Deegan
Assignee: Daniel Barclay (Drill)

 Tested on 1.1 with commit id:
 {noformat}
 0: jdbc:drill:zk=local> select commit_id from sys.version;
 +-------------------------------------------+
 |                 commit_id                 |
 +-------------------------------------------+
 | e3fc7e97bfe712dc09d43a8a055a5135c96b7344  |
 +-------------------------------------------+
 {noformat}
 Directory has child directories a to u, each containing JSON files.
 Running the query on each subdirectory individually does not cause an error.
 {noformat}
 java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
 IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
 Fragment 1:2
 [Error Id: 69a0879f-f718-4930-ae6f-c526de05528c on 
 ip-172-31-29-60.eu-central-1.compute.internal:31010]
   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
   at 
 sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
   at sqlline.SqlLine.print(SqlLine.java:1583)
   at sqlline.Commands.execute(Commands.java:852)
   at sqlline.Commands.sql(Commands.java:751)
   at sqlline.SqlLine.dispatch(SqlLine.java:738)
   at sqlline.SqlLine.begin(SqlLine.java:612)
   at sqlline.SqlLine.start(SqlLine.java:366)
   at sqlline.SqlLine.main(SqlLine.java:259)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3537) Empty Json file can potentially result into wrong results

2015-07-24 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640626#comment-14640626
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3537:
--

https://reviews.apache.org/r/36782/

 Empty Json file can potentially result into wrong results 
 --

 Key: DRILL-3537
 URL: https://issues.apache.org/jira/browse/DRILL-3537
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators, Storage - JSON
Reporter: Sean Hsuan-Yi Chu
Assignee: Sean Hsuan-Yi Chu
Priority: Critical
 Fix For: 1.2.0


 In the directory, we have two files. One has some data and the other one is 
 empty. A query as below:
 {code}
 select * from dfs.`directory`;
 {code}
 will produce different results according to the order of the files being read 
 (The default order is in the alphabetic order of the filenames). To give a 
 more concrete example, the non-empty json has data:
 {code}
 {
   a:1
 }
 {
   a:2
 }
 {code}
 By naming the files, you can control the order. If the empty file is read 
 first, the result is
 {code}
 +-------+----+
 |   *   | a  |
 +-------+----+
 | null  | 1  |
 +-------+----+
 {code}
 If the opposite order takes place, the result is
 {code}
 +----+
 | a  |
 +----+
 | 1  |
 | 2  |
 +----+
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3313) Eliminate redundant #load methods and unit-test loading & exporting of vectors

2015-07-24 Thread Hanifi Gunes (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanifi Gunes updated DRILL-3313:

Assignee: Jason Altekruse  (was: Steven Phillips)

 Eliminate redundant #load methods and unit-test loading & exporting of vectors
 --

 Key: DRILL-3313
 URL: https://issues.apache.org/jira/browse/DRILL-3313
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Execution - Data Types
Affects Versions: 1.0.0
Reporter: Hanifi Gunes
Assignee: Jason Altekruse
 Fix For: 1.2.0


 Vectors have multiple #load methods that are used to populate data from raw 
 buffers. It is relatively tough to reason about, maintain, and unit-test 
 loading and exporting of data since there is much redundant code around the 
 load methods. This issue proposes a single #load method conforming to the 
 VV#load(def, buffer) signature, eliminating all other #load overrides.
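The shape of the proposed consolidation can be sketched as follows (Python used for illustration; the names here are hypothetical, the real API is the Java `VV#load(def, buffer)` signature):

```python
class Vector:
    # Before: several redundant overloads (load(buf), load(start, len, buf), ...).
    # After: one entry point that takes a metadata definition plus a raw buffer,
    # mirroring VV#load(def, buffer).
    def load(self, definition, buffer):
        self.value_count = definition["value_count"]
        self.data = buffer[:definition["length"]]
        return self

v = Vector().load({"value_count": 3, "length": 6}, b"abcdef\x00\x00")
print(v.value_count, v.data)  # 3 b'abcdef'
```

A single signature makes round-trip load/export tests straightforward, which is the point of the issue.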



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1750) Querying directories with JSON files returns incomplete results

2015-07-24 Thread Hanifi Gunes (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanifi Gunes updated DRILL-1750:

Assignee: Steven Phillips  (was: Hanifi Gunes)

 Querying directories with JSON files returns incomplete results
 ---

 Key: DRILL-1750
 URL: https://issues.apache.org/jira/browse/DRILL-1750
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - JSON
Reporter: Abhishek Girish
Assignee: Steven Phillips
Priority: Critical
 Fix For: 1.2.0

 Attachments: 1.json, 2.json, 3.json, 4.json, 
 DRILL-1750_2015-07-06_16:39:04.patch


 I happened to observe that querying (select *) a directory with JSON files 
 displays only the fields common to all of the JSON files. All corresponding 
 fields are displayed when querying each of the JSON files individually. In 
 some scenarios, querying the directory crashes sqlline.
 The example below may help make the issue clear:
  select * from dfs.`/data/json/tmp/1.json`;
 ++++
 |   artist   |  track_id  |   title|
 ++++
 | Jonathan King | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA 
 Theme) |
 ++++
 1 row selected (1.305 seconds)
  select * from dfs.`/data/json/tmp/2.json`;
 +++++
 |   artist   | timestamp  |  track_id  |   title|
 +++++
 | Supersuckers | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double 
 Wide |
 +++++
 1 row selected (0.105 seconds)
  select * from dfs.`/data/json/tmp/3.json`;
 ++++
 | timestamp  |  track_id  |   title|
 ++++
 | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
 ++++
 1 row selected (0.083 seconds)
  select * from dfs.`/data/json/tmp/4.json`;
 +++
 |  track_id  |   title|
 +++
 | TRAAAQN128F9353BA0 | Double Wide |
 +++
 1 row selected (0.076 seconds)
  select * from dfs.`/data/json/tmp`;
 +++
 |  track_id  |   title|
 +++
 | TRAAAQN128F9353BA0 | Double Wide |
 | TRAAAQN128F9353BA0 | Double Wide |
 | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) |
 | TRAAAQN128F9353BA0 | Double Wide |
 +++
 4 rows selected (0.121 seconds)
 JVM Crash occurs at times:
  select * from dfs.`/data/json/tmp`;
 ++++
 | timestamp  |  track_id  |   title|
 ++++
 | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
 #
 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x7f3cb99be053, pid=13943, tid=139898808436480
 #
 # JRE version: OpenJDK Runtime Environment (7.0_65-b17) (build 
 1.7.0_65-mockbuild_2014_07_16_06_06-b00)
 # Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 
 compressed oops)
 # Problematic frame:
 # V  [libjvm.so+0x932053]
 #
 # Failed to write core dump. Core dumps have been disabled. To enable core 
 dumping, try ulimit -c unlimited before starting Java again
 #
 # An error report file with more information is saved as:
 # /tmp/jvm-13943/hs_error.log
 #
 # If you would like to submit a bug report, please include
 # instructions on how to reproduce the bug and visit:
 #   http://icedtea.classpath.org/bugzilla
 #
 Aborted
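The symptom (only `track_id` and `title` survive) is exactly what intersecting the per-file schemas would produce, where the expected result is their union. A hedged sketch of the two behaviours, using the four files above (not Drill's actual code):

```python
def common_fields(files):
    # Buggy behaviour observed: only keys present in every file remain.
    keys = set(files[0][0])
    for records in files[1:]:
        keys &= set(records[0])
    return sorted(keys)

def all_fields(files):
    # Expected behaviour: the union of all keys across all files.
    keys = set()
    for records in files:
        for rec in records:
            keys |= set(rec)
    return sorted(keys)

files = [
    [{"artist": "x", "track_id": "t1", "title": "a"}],
    [{"artist": "y", "timestamp": "ts", "track_id": "t2", "title": "b"}],
    [{"timestamp": "ts", "track_id": "t2", "title": "b"}],
    [{"track_id": "t2", "title": "b"}],
]
print(common_fields(files))  # ['title', 'track_id'] -- matches the buggy output
print(all_fields(files))     # ['artist', 'timestamp', 'title', 'track_id']
```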



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3353) Non data-type related schema changes errors

2015-07-24 Thread Hanifi Gunes (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanifi Gunes updated DRILL-3353:

Assignee: Steven Phillips  (was: Hanifi Gunes)

 Non data-type related schema changes errors
 ---

 Key: DRILL-3353
 URL: https://issues.apache.org/jira/browse/DRILL-3353
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - JSON
Affects Versions: 1.0.0
Reporter: Oscar Bernal
Assignee: Steven Phillips
 Fix For: 1.2.0

 Attachments: i-bfbc0a5c-ios-PulsarEvent-2015-06-23_19.json.zip


 I'm having trouble querying a data set with a varying schema for nested 
 object fields. The majority of my data for a specific type of record has the 
 following nested data:
 {code}
 {attributes:{daysSinceInstall:0,destination:none,logged:no,nth:1,type:organic,wearable:no}}
 {code}
 Among those records (hundreds of them) I have only two with a slightly 
 different schema:
 {code}
 {attributes:{adSet:Teste-Adwords-Engagement-Branch-iOS-230615-adset,campaign:Teste-Adwords-Engagement-Branch-iOS-230615,channel:Adwords,daysSinceInstall:0,destination:none,logged:no,nth:4,type:branch,wearable:no}}
 {code}
 When trying to query the new fields, my queries fail:
 With {code:sql}ALTER SYSTEM SET `store.json.all_text_mode` = true;{code}
 {noformat}
 0: jdbc:drill:zk=local> select log.event.attributes from 
 `dfs`.`root`.`/file.json` as log where log.si = 
 '07A3F985-4B34-4A01-9B83-3B14548EF7BE' and log.event.attributes.ad = 
 'Teste-FB-Engagement-Puro-iOS-230615';
 Error: SYSTEM ERROR: java.lang.NumberFormatException: 
 Teste-FB-Engagement-Puro-iOS-230615
 Fragment 0:0
 [Error Id: 22d37a65-7dd0-4661-bbfc-7a50bbee9388 on 
 ip-10-0-1-16.sa-east-1.compute.internal:31010] (state=,code=0)
 {noformat}
 With {code:sql}ALTER SYSTEM SET `store.json.all_text_mode` = false;{code}
 {noformat}
 0: jdbc:drill:zk=local> select log.event.attributes from 
 `dfs`.`root`.`/file.json` as log where log.si = 
 '07A3F985-4B34-4A01-9B83-3B14548EF7BE';
 Error: DATA_READ ERROR: Error parsing JSON - You tried to write a Bit type 
 when you are using a ValueWriter of type NullableVarCharWriterImpl.
 File  file.json
 Record  35
 Fragment 0:0
 [Error Id: 5746e3e9-48c0-44b1-8e5f-7c94e7c64d0f on 
 ip-10-0-1-16.sa-east-1.compute.internal:31010] (state=,code=0)
 {noformat}
 If I try to extract all attributes from those events, Drill will only 
 return a subset of the fields, ignoring the others. 
 {noformat}
 0: jdbc:drill:zk=local> select log.event.attributes from 
 `dfs`.`root`.`/file.json` as log where log.si = 
 '07A3F985-4B34-4A01-9B83-3B14548EF7BE' and log.type ='Opens App';
 +---------------------------------+
 |             EXPR$0              |
 +---------------------------------+
 | {logged:no,wearable:no,type:}   |
 | {logged:no,wearable:no,type:}   |
 | {logged:no,wearable:no,type:}   |
 | {logged:no,wearable:no,type:}   |
 | {logged:no,wearable:no,type:}   |
 +---------------------------------+
 {noformat}
 What I find strange is that I have thousands of records in the same file with 
 different schemas for different record types, and all other queries seem to 
 run well.
 Is there something about how Drill infers schema that I might be missing 
 here? Does it infer based on a sample percentage of the data and fail for 
 records that were not taken into account while inferring the schema? I 
 suspect I wouldn't have this error if I had hundreds of records with that 
 other schema inside the file, but I can't find anything in the docs or code 
 to support that hypothesis. Perhaps it's just a bug? Is it expected?
 The troubleshooting guide seems to mention something about this, but it's 
 very vague in implying that Drill doesn't fully support schema changes. I 
 thought that was mostly for data-type changes, for which there are other 
 well-documented issues.
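The NumberFormatException is consistent with a reader that fixes a field's type from the first values it sees and then parses later, divergent values with that type. A hypothetical Python analogue of that mechanism (not Drill's actual inference logic):

```python
def read_column(values):
    # Hypothetical: the type is inferred from the first value seen,
    # then applied to every later value in the column.
    inferred = type(values[0])  # e.g. int, from {"nth": 1}
    return [inferred(v) for v in values]

print(read_column([1, 4]))  # fine: both values parse as int
try:
    read_column([0, "Teste-FB-Engagement-Puro-iOS-230615"])
except ValueError:
    # Python's analogue of java.lang.NumberFormatException
    print("parse failed on the string value")
```

This is also why `store.json.all_text_mode = true` usually sidesteps the problem: every value is read as text, so no numeric parse is attempted.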



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3476) Filter on nested element gives wrong results

2015-07-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640611#comment-14640611
 ] 

ASF GitHub Bot commented on DRILL-3476:
---

Github user hnfgns commented on the pull request:

https://github.com/apache/drill/pull/83#issuecomment-124561596
  
+1


 Filter on nested element gives wrong results
 

 Key: DRILL-3476
 URL: https://issues.apache.org/jira/browse/DRILL-3476
 Project: Apache Drill
  Issue Type: Bug
Reporter: Steven Phillips
Assignee: Steven Phillips
Priority: Critical
 Fix For: 1.2.0


 Take this query for example:
 {code}
 0: jdbc:drill:drillbit=localhost> select * from t;
 +------------+
 |     a      |
 +------------+
 | {b:1,c:1}  |
 +------------+
 {code}
 if I instead run:
 {code}
 0: jdbc:drill:drillbit=localhost> select a from t where t.a.b = 1;
 +--------+
 |   a    |
 +--------+
 | {b:1}  |
 +--------+
 {code}
 Only a.b was returned, but the select specified a. In this case, it should 
 have returned all of the elements of a, not just the one specified in the 
 filter.
 This is because the logic in FieldSelection does not correctly handle the 
 case where a selected column is a child of another selected column. In such a 
 case, the record reader should ignore the child column, and just return the 
 full selected parent column.
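The fix described can be sketched as pruning any selected column whose ancestor is also selected, so the reader returns the full parent instead of the narrower child (illustrative Python; the real logic lives in Drill's FieldSelection):

```python
def prune(columns):
    # Keep only columns with no selected ancestor: if both 'a' and
    # 'a.b' are selected, 'a' already covers 'a.b', so 'a.b' is dropped.
    keep = []
    selected = set(columns)
    for col in columns:
        parts = col.split(".")
        ancestors = {".".join(parts[:i]) for i in range(1, len(parts))}
        if not ancestors & selected:
            keep.append(col)
    return keep

print(prune(["a", "a.b"]))  # ['a'] -- return all of 'a', not just 'a.b'
```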



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3353) Non data-type related schema changes errors

2015-07-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640613#comment-14640613
 ] 

ASF GitHub Bot commented on DRILL-3353:
---

Github user hnfgns commented on the pull request:

https://github.com/apache/drill/pull/86#issuecomment-124561688
  
+1


 Non data-type related schema changes errors
 ---

 Key: DRILL-3353
 URL: https://issues.apache.org/jira/browse/DRILL-3353
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - JSON
Affects Versions: 1.0.0
Reporter: Oscar Bernal
Assignee: Hanifi Gunes
 Fix For: 1.2.0

 Attachments: i-bfbc0a5c-ios-PulsarEvent-2015-06-23_19.json.zip





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2838) Applying flatten after joining 2 sub-queries returns empty maps

2015-07-24 Thread Hanifi Gunes (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanifi Gunes updated DRILL-2838:

Assignee: Jason Altekruse  (was: Hanifi Gunes)

 Applying flatten after joining 2 sub-queries returns empty maps
 ---

 Key: DRILL-2838
 URL: https://issues.apache.org/jira/browse/DRILL-2838
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Reporter: Rahul Challapalli
Assignee: Jason Altekruse
Priority: Critical
 Fix For: 1.2.0

 Attachments: DRILL-2838.patch, data.json


 git.commit.id.abbrev=5cd36c5
 The query below applies flatten after joining two sub-queries. It generates 
 empty maps, which is wrong:
 {code}
 select v1.uid, flatten(events), flatten(transactions) from 
 (select uid, events from `data.json`) v1
 inner join
 (select uid, transactions from `data.json`) v2
 on v1.uid = v2.uid;
 ++++
 |uid |   EXPR$1   |   EXPR$2   |
 ++++
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 1  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 | 2  | {} | {} |
 ++++
 36 rows selected (0.244 seconds)
 {code}
 I attached the data set. Let me know if you have any questions.
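The expected semantics of that query (inner join on uid, then flatten each list column) can be sketched in Python; the bug is that Drill returns `{}` where these populated maps should appear. The sample data here is invented for illustration, not the attached data.json:

```python
def join_and_flatten(v1, v2):
    # Inner join on uid, then cross-flatten the two list columns,
    # mirroring flatten(events) and flatten(transactions) in the query.
    out = []
    for r1 in v1:
        for r2 in v2:
            if r1["uid"] == r2["uid"]:
                for event in r1["events"]:
                    for txn in r2["transactions"]:
                        out.append((r1["uid"], event, txn))
    return out

v1 = [{"uid": 1, "events": [{"type": "a"}, {"type": "b"}]}]
v2 = [{"uid": 1, "transactions": [{"amount": 10}]}]
print(join_and_flatten(v1, v2))
# [(1, {'type': 'a'}, {'amount': 10}), (1, {'type': 'b'}, {'amount': 10})]
```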



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3476) Filter on nested element gives wrong results

2015-07-24 Thread Hanifi Gunes (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanifi Gunes updated DRILL-3476:

Assignee: Steven Phillips  (was: Hanifi Gunes)

 Filter on nested element gives wrong results
 

 Key: DRILL-3476
 URL: https://issues.apache.org/jira/browse/DRILL-3476
 Project: Apache Drill
  Issue Type: Bug
Reporter: Steven Phillips
Assignee: Steven Phillips
Priority: Critical
 Fix For: 1.2.0





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3533) null values in a sub-structure in Parquet returns unexpected/misleading results

2015-07-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640835#comment-14640835
 ] 

ASF GitHub Bot commented on DRILL-3533:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/97


 null values in a sub-structure in Parquet returns unexpected/misleading 
 results
 ---

 Key: DRILL-3533
 URL: https://issues.apache.org/jira/browse/DRILL-3533
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
Reporter: Stefán Baxter
Assignee: Parth Chandra
Priority: Critical

 With this minimal dataset as /tmp/test.json:
 {dimensions:{adults:A}}
 select lower(p.dimensions.budgetLevel) as `field1`, 
 lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test.json` as p;
 Returns this:
 +---------+---------+
 | field1  | field2  |
 +---------+---------+
 | null    | a       |
 +---------+---------+
 With the same data as a Parquet file
 CREATE TABLE dfs.tmp.`/test` AS SELECT * FROM dfs.tmp.`/test.json`;
 The same query:
 select lower(p.dimensions.budgetLevel) as `field1`, 
 lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test/0_0_0.parquet` as 
 p;
 Return this:
 +---------+---------+
 | field1  | field2  |
 +---------+---------+
 | a       | null    |
 +---------+---------+
 After some more testing it appears that this has nothing to do with trim. 
 (any non existing nested-value will be pushed aside)
 select p.dimensions.budgetLevel as `field1`, lower(p.dimensions.adults) as 
 `field2` from dfs.tmp.`/test/0_0_0.parquet` as p;
 also returns:
 +---------+---------+
 | field1  | field2  |
 +---------+---------+
 | a       | null    |
 +---------+---------+
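The JSON and Parquet results look like the same value landing under swapped headers: the missing nested field should simply be null under its own name. The expected behaviour, sketched with plain dict access (a hypothetical model of correct projection, not Drill's code):

```python
def project(record, fields):
    # Expected: a missing nested field yields null under its own header;
    # present fields keep their own values and positions.
    dims = record.get("dimensions", {})
    return {f: dims.get(f) for f in fields}

rec = {"dimensions": {"adults": "A"}}
print(project(rec, ["budgetLevel", "adults"]))
# {'budgetLevel': None, 'adults': 'A'} -- the JSON result; Parquet swaps them
```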



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3546) S3 - jets3t - No such File Or Directory

2015-07-24 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3546:
--
Assignee: (was: Daniel Barclay (Drill))

 S3 - jets3t - No such File Or Directory
 ---

 Key: DRILL-3546
 URL: https://issues.apache.org/jira/browse/DRILL-3546
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.1.0
 Environment: RHEL 7
Reporter: Philip Deegan

 Tested on 1.1 with commit id:
 {noformat}
 0: jdbc:drill:zk=local> select commit_id from sys.version;
 +-------------------------------------------+
 | commit_id                                 |
 +-------------------------------------------+
 | e3fc7e97bfe712dc09d43a8a055a5135c96b7344  |
 +-------------------------------------------+
 {noformat}
 Three-instance ZooKeeper cluster running Drill with the jets3t plugin. 
 Occasionally throws a "No such file or directory" error. Example query: 
 SELECT COUNT(*) FROM s3.json_directory;
 This might be a jets3t issue; an existing report is here: 
 https://bitbucket.org/jmurty/jets3t/issues/215/drill-intermittent-file-not-found-error
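If the root cause is an intermittent S3 listing/visibility error, a common client-side mitigation is to retry the read a few times before failing. The sketch below is a generic illustration of that pattern, not Drill or jets3t code; the flaky reader is hypothetical.

```python
import time

def with_retries(fn, attempts=3, delay=0.1, retriable=(FileNotFoundError,)):
    """Call fn(); retry on transient 'no such file'-style errors."""
    for i in range(attempts):
        try:
            return fn()
        except retriable:
            if i == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(delay)

# Demo: a reader that fails once before succeeding, mimicking an
# intermittent S3 visibility error.
calls = {"n": 0}
def flaky_read():
    calls["n"] += 1
    if calls["n"] == 1:
        raise FileNotFoundError("s3n://json_directory/xyz.json.gz")
    return b'{"ok": true}'

data = with_retries(flaky_read)
```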
 {code}
 org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: Failure 
 reading JSON file - No such file or directory 
 's3n://json_directory/xyz.json.gz'
 File  /json_directory/xyz.json.gz
 Record  1
 [Error Id: 3f83967b-0b7b-4778-b623-b7a20528e3d1 ]
 at 
 org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
  ~[drill-common-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.store.easy.json.JSONRecordReader.handleAndRaise(JSONRecordReader.java:161)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.store.easy.json.JSONRecordReader.setup(JSONRecordReader.java:130)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ScanBatch.init(ScanBatch.java:100) 
 [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:195)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:150)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:173)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:106)
  [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:81) 
 [drill-java-exec-1.1.0.jar:1.1.0]
 at 
 

[jira] [Comment Edited] (DRILL-2288) result set metadata not set for zero-row result (DatabaseMetaData.getColumns(...))

2015-07-24 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14632614#comment-14632614
 ] 

Daniel Barclay (Drill) edited comment on DRILL-2288 at 7/24/15 5:54 PM:


Investigation notes:
- Is not a JDBC problem--seems to be an INFORMATION_SCHEMA/ischema problem.
- Has something to do with ischema filtering: metadata is missing when the 
zero-row result was caused by mismatching one of the specially filtered 
(pushed-down?) fields (e.g., TABLE_SCHEMA and TABLE_NAME for COLUMNS), and 
present otherwise.
- Seems that a downstream schema is derived from the set of value vectors 
(etc.) at some point, but that set is empty sometimes when there are no rows 
(when no values have been written to vectors/vector container?).
- Does seem to be in INFORMATION_SCHEMA plug-in:  It doesn't seem to use 
PojoDataType as system-tables plug-in does.
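The suspected failure mode in the notes above can be illustrated outside Drill. This is an analogy under that hypothesis, not Drill's actual code: if result-set metadata is derived only from vectors that were actually written, a zero-row result yields an empty schema, whereas deriving it from the declared columns works regardless of row count.

```python
def schema_from_written_vectors(rows):
    """Broken approach: derive the schema from values actually written."""
    written = {}
    for row in rows:
        for col, val in row.items():
            written.setdefault(col, type(val).__name__)
    return written  # empty when there are zero rows

declared = ["TABLE_SCHEM", "TABLE_NAME", "COLUMN_NAME"]

broken = schema_from_written_vectors([])          # zero rows -> no schema
fixed = {col: "VARCHAR" for col in declared}      # schema from declaration
```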


was (Author: dsbos):
Investigation notes:
- Is not a JDBC problem--seems to be an INFORMATION_SCHEMA/ischema problem.
- Has something to do with ischema filtering--whether metadata is missing or 
not depends on whether having zero rows was caused by mismatching one of the 
specially filtered (pushed-down?) fields (e.g., TABLE_SCHEMA and TABLE_NAME for 
COLUMNS) or not, respectively.
- Might not be in INFORMATION_SCHEMA.
- Seems that a downstream schema is derived from the set of value vectors 
(etc.) at some point, but that set is empty sometimes when there are no rows 
(when no values have been written to vectors/vector container?).
- Does seem to be in INFORMATION_SCHEMA plug-in:  It doesn't seem to use 
PojoDataType as system-tables plug-in does.

 result set metadata not set for zero-row result  
 (DatabaseMetaData.getColumns(...))
 ---

 Key: DRILL-2288
 URL: https://issues.apache.org/jira/browse/DRILL-2288
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Information Schema
Reporter: Daniel Barclay (Drill)
Assignee: Daniel Barclay (Drill)
 Fix For: 1.2.0

 Attachments: Drill2288NoResultSetMetadataWhenZeroRowsTest.java


 The ResultSetMetaData object from getMetadata() of a ResultSet is not set up 
 (getColumnCount() returns zero, and trying to access any other metadata 
 throws IndexOutOfBoundsException) for a result set with zero rows, at least 
 for one from DatabaseMetaData.getColumns(...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3553) add support for LEAD and LAG window functions

2015-07-24 Thread Deneche A. Hakim (JIRA)
Deneche A. Hakim created DRILL-3553:
---

 Summary: add support for LEAD and LAG window functions
 Key: DRILL-3553
 URL: https://issues.apache.org/jira/browse/DRILL-3553
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Execution - Relational Operators
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3536) Add support for LEAD, LAG, NTILE, FIRST_VALUE and LAST_VALUE window functions

2015-07-24 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3536:

Labels: window_function  (was: )

 Add support for LEAD, LAG, NTILE, FIRST_VALUE and LAST_VALUE window functions
 -

 Key: DRILL-3536
 URL: https://issues.apache.org/jira/browse/DRILL-3536
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Relational Operators
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
  Labels: window_function
 Fix For: 1.2.0


 This JIRA will track the progress on the following window functions (no 
 particular order):
 - LEAD
 - LAG
 - NTILE
 - FIRST_VALUE
 - LAST_VALUE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3553) add support for LEAD and LAG window functions

2015-07-24 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3553:

Description: 
From the SQL standard, here is the general format of LEAD and LAG:
{noformat}
<window function> ::=
  <lead or lag function> OVER <window name or specification>
{noformat}

{noformat}
<lead or lag function> ::=
  <lead or lag> ( <lead or lag extent>
  [ , <offset> [ , <default expression> ] ] )
  [ <null treatment> ]
{noformat}

{noformat}
<lead or lag> ::=
  LEAD | LAG
{noformat}

{noformat}
<lead or lag extent> ::=
  <value expression>
{noformat}

{noformat}
<offset> ::=
  <exact numeric literal>
{noformat}

{noformat}
<default expression> ::=
  <value expression>
{noformat}

{noformat}
<null treatment> ::=
  RESPECT NULLS | IGNORE NULLS
{noformat}

  was:
From the SQL standard, here is the general format of LEAD and LAG:
{noformat}
<window function> ::=
  <lead or lag function> OVER <window name or specification>
<lead or lag function> ::=
  <lead or lag> ( <lead or lag extent>
  [ , <offset> [ , <default expression> ] ] )
  [ <null treatment> ]
<lead or lag> ::=
  LEAD | LAG
<lead or lag extent> ::=
  <value expression>
<offset> ::=
  <exact numeric literal>
<default expression> ::=
  <value expression>
<null treatment> ::=
  RESPECT NULLS | IGNORE NULLS
{noformat}


 add support for LEAD and LAG window functions
 -

 Key: DRILL-3553
 URL: https://issues.apache.org/jira/browse/DRILL-3553
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Execution - Relational Operators
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
  Labels: window_function
 Fix For: 1.2.0


 From the SQL standard, here is the general format of LEAD and LAG:
 {noformat}
 <window function> ::=
   <lead or lag function> OVER <window name or specification>
 {noformat}
 {noformat}
 <lead or lag function> ::=
   <lead or lag> ( <lead or lag extent>
   [ , <offset> [ , <default expression> ] ] )
   [ <null treatment> ]
 {noformat}
 {noformat}
 <lead or lag> ::=
   LEAD | LAG
 {noformat}
 {noformat}
 <lead or lag extent> ::=
   <value expression>
 {noformat}
 {noformat}
 <offset> ::=
   <exact numeric literal>
 {noformat}
 {noformat}
 <default expression> ::=
   <value expression>
 {noformat}
 {noformat}
 <null treatment> ::=
   RESPECT NULLS | IGNORE NULLS
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3553) add support for LEAD and LAG window functions

2015-07-24 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641182#comment-14641182
 ] 

Deneche A. Hakim commented on DRILL-3553:
-

For now, Calcite only supports {{RESPECT NULLS}} as the default behavior (it 
is not accepted when stated explicitly in a query). Until CALCITE-337 is 
fixed we won't be able to support {{IGNORE NULLS}}.
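For reference, the semantics under discussion can be sketched in plain Python. This illustrates the standard's behavior, not Drill's implementation: LEAD looks `offset` rows ahead within an ordered partition (returning `default` past the end), and IGNORE NULLS skips NULL values when counting the offset.

```python
def lead(rows, offset=1, default=None, ignore_nulls=False):
    """LEAD over a single ordered partition (list of values, None = NULL)."""
    out = []
    for i in range(len(rows)):
        if ignore_nulls:
            # Count only non-NULL values ahead of row i.
            seen, j, val = 0, i, default
            while j + 1 < len(rows):
                j += 1
                if rows[j] is not None:
                    seen += 1
                    if seen == offset:
                        val = rows[j]
                        break
            out.append(val)
        else:
            # RESPECT NULLS: plain positional offset.
            j = i + offset
            out.append(rows[j] if j < len(rows) else default)
    return out

vals = [10, None, 30]
respect = lead(vals)                     # [None, 30, None]
ignore = lead(vals, ignore_nulls=True)   # [30, 30, None]
```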

 add support for LEAD and LAG window functions
 -

 Key: DRILL-3553
 URL: https://issues.apache.org/jira/browse/DRILL-3553
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Execution - Relational Operators
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
  Labels: window_function
 Fix For: 1.2.0


 From the SQL standard, here is the general format of LEAD and LAG:
 {noformat}
 <window function> ::=
   <lead or lag function> OVER <window name or specification>
 {noformat}
 {noformat}
 <lead or lag function> ::=
   <lead or lag> ( <lead or lag extent>
   [ , <offset> [ , <default expression> ] ] )
   [ <null treatment> ]
 {noformat}
 {noformat}
 <lead or lag> ::=
   LEAD | LAG
 {noformat}
 {noformat}
 <lead or lag extent> ::=
   <value expression>
 {noformat}
 {noformat}
 <offset> ::=
   <exact numeric literal>
 {noformat}
 {noformat}
 <default expression> ::=
   <value expression>
 {noformat}
 {noformat}
 <null treatment> ::=
   RESPECT NULLS | IGNORE NULLS
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3553) add support for LEAD and LAG window functions

2015-07-24 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3553:

Description: 
From the SQL standard, here is the general format of LEAD and LAG:
{noformat}
<window function> ::=
  <lead or lag function> OVER <window name or specification>
<lead or lag function> ::=
  <lead or lag> ( <lead or lag extent>
  [ , <offset> [ , <default expression> ] ] )
  [ <null treatment> ]
<lead or lag> ::=
  LEAD | LAG
<lead or lag extent> ::=
  <value expression>
<offset> ::=
  <exact numeric literal>
<default expression> ::=
  <value expression>
<null treatment> ::=
  RESPECT NULLS | IGNORE NULLS
{noformat}

 add support for LEAD and LAG window functions
 -

 Key: DRILL-3553
 URL: https://issues.apache.org/jira/browse/DRILL-3553
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Execution - Relational Operators
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
  Labels: window_function
 Fix For: 1.2.0


 From the SQL standard, here is the general format of LEAD and LAG:
 {noformat}
 <window function> ::=
   <lead or lag function> OVER <window name or specification>
 <lead or lag function> ::=
   <lead or lag> ( <lead or lag extent>
   [ , <offset> [ , <default expression> ] ] )
   [ <null treatment> ]
 <lead or lag> ::=
   LEAD | LAG
 <lead or lag extent> ::=
   <value expression>
 <offset> ::=
   <exact numeric literal>
 <default expression> ::=
   <value expression>
 <null treatment> ::=
   RESPECT NULLS | IGNORE NULLS
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3553) add support for LEAD and LAG window functions

2015-07-24 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3553:

Labels: window_function  (was: )

 add support for LEAD and LAG window functions
 -

 Key: DRILL-3553
 URL: https://issues.apache.org/jira/browse/DRILL-3553
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Execution - Relational Operators
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
  Labels: window_function
 Fix For: 1.2.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3554) Union over TIME and TIMESTAMP values throws SchemaChangeException

2015-07-24 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3554:
-

 Summary: Union over TIME and TIMESTAMP values throws 
SchemaChangeException
 Key: DRILL-3554
 URL: https://issues.apache.org/jira/browse/DRILL-3554
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.2.0
 Environment: 4 node cluster on CentOS
Reporter: Khurram Faraaz
Assignee: Chris Westin


Union over TIME and TIMESTAMP values results in an exception.
Commit ID: 17e580a7

{code}
0: jdbc:drill:schema=dfs.tmp> select c9, c5 from union_01 union select c5, c9 
from union_02;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[castTIMESTAMP(TIME-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 18eed3ba-f046-48ed-93a6-19ffa87f969e on centos-02.qa.lab:31010] 
(state=,code=0)
{code}

Stack trace from drillbit.log

2015-07-24 22:09:57,467 [2a4d4849-d440-981d-ebf0-b4c35010bf02:frag:0:0] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: SchemaChangeException: 
Failure while trying to materialize incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation: 
[castTIMESTAMP(TIME-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 18eed3ba-f046-48ed-93a6-19ffa87f969e on centos-02.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
SchemaChangeException: Failure while trying to materialize incoming schema.  
Errors:

Error in expression at index -1.  Error: Missing function implementation: 
[castTIMESTAMP(TIME-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 18eed3ba-f046-48ed-93a6-19ffa87f969e on centos-02.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
 ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_45]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_45]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: org.apache.drill.exec.exception.SchemaChangeException: Failure while 
trying to materialize incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation: 
[castTIMESTAMP(TIME-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
at 
org.apache.drill.exec.physical.impl.union.UnionAllRecordBatch.doWork(UnionAllRecordBatch.java:228)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.union.UnionAllRecordBatch.innerNext(UnionAllRecordBatch.java:116)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema(HashAggBatch.java:96)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:127)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
 

[jira] [Updated] (DRILL-3364) Prune scan range if the filter is on the leading field with byte comparable encoding

2015-07-24 Thread Smidth Panchamia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Smidth Panchamia updated DRILL-3364:

Attachment: 0001-DRILL-3364-Prune-scan-range-if-the-filter-is-on-the-.patch

The change adds some framework to handle conditions on the row-key prefix.
It also adds support to perform row-key range pruning when the row-key
prefix is interpreted as DATE_EPOCH_BE encoded.
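The reason byte-comparable (`*_BE`) encodings enable this pruning: a fixed-width big-endian encoding preserves numeric order under bytewise comparison, so a filter on the decoded prefix maps directly to a start/stop row-key range. A small Python illustration of that property (not the patch's code; assumes non-negative values, since signed negatives would need a bias to keep bytewise order):

```python
import struct

def date_epoch_be(days_since_epoch):
    """Fixed-width big-endian encoding; bytewise order == numeric order."""
    return struct.pack(">q", days_since_epoch)

a, b, c = date_epoch_be(100), date_epoch_be(200), date_epoch_be(300)
assert a < b < c  # byte comparison matches numeric comparison

# So a predicate like prefix >= 200 AND prefix < 300 prunes the HBase scan
# to the range [date_epoch_be(200), date_epoch_be(300)) instead of a full
# table scan.
start, stop = date_epoch_be(200), date_epoch_be(300)
```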

 Prune scan range if the filter is on the leading field with byte comparable 
 encoding
 

 Key: DRILL-3364
 URL: https://issues.apache.org/jira/browse/DRILL-3364
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Storage - HBase
Reporter: Aditya Kishore
Assignee: Smidth Panchamia
 Fix For: 1.2.0

 Attachments: 
 0001-DRILL-3364-Prune-scan-range-if-the-filter-is-on-the-.patch, 
 composite.jun26.diff






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3553) add support for LEAD and LAG window functions

2015-07-24 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3553:

Description: 
From the SQL standard, here is the general format of LEAD and LAG:
{noformat}
<window function> ::=
  <lead or lag function> OVER <window name or specification>
{noformat}

{noformat}
<lead or lag function> ::=
  <lead or lag> ( <lead or lag extent>
  [ , <offset> [ , <default expression> ] ] )
  [ <null treatment> ]
{noformat}

{noformat}
<lead or lag> ::=
  LEAD | LAG
{noformat}

{noformat}
<lead or lag extent> ::=
  <value expression>
{noformat}

{noformat}
<offset> ::=
  <exact numeric literal>
{noformat}

{noformat}
<default expression> ::=
  <value expression>
{noformat}

The following won't be supported until CALCITE-337 is resolved:
{noformat}
<null treatment> ::=
  RESPECT NULLS | IGNORE NULLS
{noformat}

  was:
From the SQL standard, here is the general format of LEAD and LAG:
{noformat}
<window function> ::=
  <lead or lag function> OVER <window name or specification>
{noformat}

{noformat}
<lead or lag function> ::=
  <lead or lag> ( <lead or lag extent>
  [ , <offset> [ , <default expression> ] ] )
  [ <null treatment> ]
{noformat}

{noformat}
<lead or lag> ::=
  LEAD | LAG
{noformat}

{noformat}
<lead or lag extent> ::=
  <value expression>
{noformat}

{noformat}
<offset> ::=
  <exact numeric literal>
{noformat}

{noformat}
<default expression> ::=
  <value expression>
{noformat}

{noformat}
<null treatment> ::=
  RESPECT NULLS | IGNORE NULLS
{noformat}


 add support for LEAD and LAG window functions
 -

 Key: DRILL-3553
 URL: https://issues.apache.org/jira/browse/DRILL-3553
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Execution - Relational Operators
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
  Labels: window_function
 Fix For: 1.2.0


 From the SQL standard, here is the general format of LEAD and LAG:
 {noformat}
 <window function> ::=
   <lead or lag function> OVER <window name or specification>
 {noformat}
 {noformat}
 <lead or lag function> ::=
   <lead or lag> ( <lead or lag extent>
   [ , <offset> [ , <default expression> ] ] )
   [ <null treatment> ]
 {noformat}
 {noformat}
 <lead or lag> ::=
   LEAD | LAG
 {noformat}
 {noformat}
 <lead or lag extent> ::=
   <value expression>
 {noformat}
 {noformat}
 <offset> ::=
   <exact numeric literal>
 {noformat}
 {noformat}
 <default expression> ::=
   <value expression>
 {noformat}
 The following won't be supported until CALCITE-337 is resolved:
 {noformat}
 <null treatment> ::=
   RESPECT NULLS | IGNORE NULLS
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3547) IndexOutOfBoundsException on directory with ~20 subdirectories

2015-07-24 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3547:
--
Assignee: (was: Daniel Barclay (Drill))

 IndexOutOfBoundsException on directory with ~20 subdirectories
 --

 Key: DRILL-3547
 URL: https://issues.apache.org/jira/browse/DRILL-3547
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.1.0
 Environment: RHEL 7
Reporter: Philip Deegan

 Tested on 1.1 with commit id:
 {noformat}
 0: jdbc:drill:zk=local> select commit_id from sys.version;
 +-------------------------------------------+
 | commit_id                                 |
 +-------------------------------------------+
 | e3fc7e97bfe712dc09d43a8a055a5135c96b7344  |
 +-------------------------------------------+
 {noformat}
 The directory has child directories a to u, each containing JSON files.
 Running the query on each subdirectory individually does not cause an error.
 {noformat}
 java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
 IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
 Fragment 1:2
 [Error Id: 69a0879f-f718-4930-ae6f-c526de05528c on 
 ip-172-31-29-60.eu-central-1.compute.internal:31010]
   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
   at 
 sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
   at sqlline.SqlLine.print(SqlLine.java:1583)
   at sqlline.Commands.execute(Commands.java:852)
   at sqlline.Commands.sql(Commands.java:751)
   at sqlline.SqlLine.dispatch(SqlLine.java:738)
   at sqlline.SqlLine.begin(SqlLine.java:612)
   at sqlline.SqlLine.start(SqlLine.java:366)
   at sqlline.SqlLine.main(SqlLine.java:259)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3554) Union over TIME and TIMESTAMP values throws SchemaChangeException

2015-07-24 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-3554:
--
Assignee: Sean Hsuan-Yi Chu  (was: Chris Westin)

 Union over TIME and TIMESTAMP values throws SchemaChangeException
 -

 Key: DRILL-3554
 URL: https://issues.apache.org/jira/browse/DRILL-3554
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.2.0
 Environment: 4 node cluster on CentOS
Reporter: Khurram Faraaz
Assignee: Sean Hsuan-Yi Chu

 Union over TIME and TIMESTAMP values results in Exception
 commit ID : 17e580a7
 {code}
 0: jdbc:drill:schema=dfs.tmp> select c9, c5 from union_01 union select c5, c9 
 from union_02;
 Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
 materialize incoming schema.  Errors:
  
 Error in expression at index -1.  Error: Missing function implementation: 
 [castTIMESTAMP(TIME-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
 Fragment 0:0
 [Error Id: 18eed3ba-f046-48ed-93a6-19ffa87f969e on centos-02.qa.lab:31010] 
 (state=,code=0)
 {code}
 Stack trace from drillbit.log
 2015-07-24 22:09:57,467 [2a4d4849-d440-981d-ebf0-b4c35010bf02:frag:0:0] ERROR 
 o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: SchemaChangeException: 
 Failure while trying to materialize incoming schema.  Errors:
 Error in expression at index -1.  Error: Missing function implementation: 
 [castTIMESTAMP(TIME-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
 Fragment 0:0
 [Error Id: 18eed3ba-f046-48ed-93a6-19ffa87f969e on centos-02.qa.lab:31010]
 org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
 SchemaChangeException: Failure while trying to materialize incoming schema.  
 Errors:
 Error in expression at index -1.  Error: Missing function implementation: 
 [castTIMESTAMP(TIME-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
 Fragment 0:0
 [Error Id: 18eed3ba-f046-48ed-93a6-19ffa87f969e on centos-02.qa.lab:31010]
 at 
 org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
  [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  [na:1.7.0_45]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  [na:1.7.0_45]
 at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
 Caused by: org.apache.drill.exec.exception.SchemaChangeException: Failure 
 while trying to materialize incoming schema.  Errors:
 Error in expression at index -1.  Error: Missing function implementation: 
 [castTIMESTAMP(TIME-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
 at 
 org.apache.drill.exec.physical.impl.union.UnionAllRecordBatch.doWork(UnionAllRecordBatch.java:228)
  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.union.UnionAllRecordBatch.innerNext(UnionAllRecordBatch.java:116)
  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema(HashAggBatch.java:96)
  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:127)
  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 

[jira] [Updated] (DRILL-3364) Prune scan range if the filter is on the leading field with byte comparable encoding

2015-07-24 Thread Smidth Panchamia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Smidth Panchamia updated DRILL-3364:

Attachment: 0001-DRILL-3364-Prune-scan-range-if-the-filter-is-on-the-.patch

The change adds support to perform row-key range pruning when the
row-key prefix is interpreted as TIME_EPOCH_BE, TIMESTAMP_EPOCH_BE or UINT8_BE
encoded.

 Prune scan range if the filter is on the leading field with byte comparable 
 encoding
 

 Key: DRILL-3364
 URL: https://issues.apache.org/jira/browse/DRILL-3364
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Storage - HBase
Reporter: Aditya Kishore
Assignee: Smidth Panchamia
 Fix For: 1.2.0

 Attachments: 
 0001-Add-convert_from-and-convert_to-methods-for-TIMESTAM.patch, 
 0001-DRILL-3364-Prune-scan-range-if-the-filter-is-on-the-.patch, 
 0001-DRILL-3364-Prune-scan-range-if-the-filter-is-on-the-.patch, 
 composite.jun26.diff






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3364) Prune scan range if the filter is on the leading field with byte comparable encoding

2015-07-24 Thread Smidth Panchamia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Smidth Panchamia updated DRILL-3364:

Attachment: 0001-Add-convert_from-and-convert_to-methods-for-TIMESTAM.patch

Adds convert_from and convert_to methods for the TIMESTAMP type.
This helps with scan-range pruning when the query filters on the leading 
bytes of the row-key and they need to be interpreted as a timestamp.

 Prune scan range if the filter is on the leading field with byte comparable 
 encoding
 

 Key: DRILL-3364
 URL: https://issues.apache.org/jira/browse/DRILL-3364
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Storage - HBase
Reporter: Aditya Kishore
Assignee: Smidth Panchamia
 Fix For: 1.2.0

 Attachments: 
 0001-Add-convert_from-and-convert_to-methods-for-TIMESTAM.patch, 
 0001-DRILL-3364-Prune-scan-range-if-the-filter-is-on-the-.patch, 
 0001-DRILL-3364-Prune-scan-range-if-the-filter-is-on-the-.patch, 
 composite.jun26.diff






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3555) Changing defaults for planner.memory.max_query_memory_per_node causes queries with window function to fail

2015-07-24 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641301#comment-14641301
 ] 

Deneche A. Hakim commented on DRILL-3555:
-

What dataset are you running the query on?

 Changing defaults for planner.memory.max_query_memory_per_node causes queries 
 with window function to fail
 --

 Key: DRILL-3555
 URL: https://issues.apache.org/jira/browse/DRILL-3555
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0, 1.2.0
 Environment: 4 Nodes. Direct Memory= 48 GB each
Reporter: Abhishek Girish
Assignee: Jinfeng Ni
Priority: Critical

 Changing the default value for planner.memory.max_query_memory_per_node from 
 2 GB to anything higher causes queries with window functions to fail. 
 Changed system options
 {code:sql}
  select * from sys.options where status like '%CHANGE%';
 +-------------------------------------------+----------+---------+----------+-------------+-------------+-----------+------------+
 |                   name                    |   kind   |  type   |  status  |   num_val   | string_val  | bool_val  | float_val  |
 +-------------------------------------------+----------+---------+----------+-------------+-------------+-----------+------------+
 | planner.enable_decimal_data_type          | BOOLEAN  | SYSTEM  | CHANGED  | null        | null        | true      | null       |
 | planner.memory.max_query_memory_per_node  | LONG     | SYSTEM  | CHANGED  | 8589934592  | null        | null      | null       |
 +-------------------------------------------+----------+---------+----------+-------------+-------------+-----------+------------+
 2 rows selected (0.249 seconds)
 {code}
 Query
 {code:sql}
  SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) FROM 
  store_sales ss LIMIT 20;
 java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
 DrillRuntimeException: Adding this batch causes the total size to exceed max 
 allowed size. Current runningBytes 1073638500, Incoming batchBytes 127875. 
 maxBytes 1073741824
 Fragment 1:0
 [Error Id: 9c2ec9cf-21c6-4d5e-b0d6-7cd59e32c49d on abhi1:31010]
 at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
 at 
 sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
 at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
 at sqlline.SqlLine.print(SqlLine.java:1583)
 at sqlline.Commands.execute(Commands.java:852)
 at sqlline.Commands.sql(Commands.java:751)
 at sqlline.SqlLine.dispatch(SqlLine.java:738)
 at sqlline.SqlLine.begin(SqlLine.java:612)
 at sqlline.SqlLine.start(SqlLine.java:366)
 at sqlline.SqlLine.main(SqlLine.java:259)
 {code}
 Log:
 {code}
 2015-07-23 18:16:52,292 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:2:2] INFO  
 o.a.d.e.w.fragment.FragmentExecutor - 
 2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:2:2: State change requested RUNNING --> 
 FINISHED
 2015-07-23 18:16:52,292 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:2:2] INFO  
 o.a.d.e.w.f.FragmentStatusReporter - 
 2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:2:2: State to report: FINISHED
 2015-07-23 18:17:05,485 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:1:0] ERROR 
 o.a.d.e.p.i.s.SortRecordBatchBuilder - Adding this batch causes the total 
 size to exceed max allowed size. Current runningBytes 1073638500, Incoming 
 batchBytes 127875. maxBytes 1073741824
 2015-07-23 18:17:05,486 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:1:0] INFO  
 o.a.d.e.w.fragment.FragmentExecutor - 
 2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:1:0: State change requested RUNNING --> 
 FAILED
 ...
 2015-07-23 18:17:05,990 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:1:0] INFO  
 o.a.d.e.w.fragment.FragmentExecutor - 
 2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:1:0: State change requested FAILED --> 
 FINISHED
 2015-07-23 18:17:05,999 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:1:0] ERROR 
 o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: DrillRuntimeException: 
 Adding this batch causes the total size to exceed max allowed size. Current 
 runningBytes 1073638500, Incoming batchBytes 127875. maxBytes 1073741824
 Fragment 1:0
 [Error Id: 9c2ec9cf-21c6-4d5e-b0d6-7cd59e32c49d on abhi1:31010]
 org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
 DrillRuntimeException: Adding this batch causes the total size to exceed max 
 allowed size. Current runningBytes 1073638500, Incoming batchBytes 127875. 
 maxBytes 1073741824
 Fragment 1:0
 [Error Id: 9c2ec9cf-21c6-4d5e-b0d6-7cd59e32c49d on abhi1:31010]
 at 
 org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
 at 
 

[jira] [Commented] (DRILL-2218) Constant folding rule exposing planning bugs and not being used in plan where the constant expression is in the select list

2015-07-24 Thread Jason Altekruse (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641285#comment-14641285
 ] 

Jason Altekruse commented on DRILL-2218:


An update on this issue after discussing with [~jni] and [~amansinha100]. The 
cost model for project currently only considers the number of expressions 
present, not their complexity. Therefore the rule being fired to reduce the 
expression produces the correct rewritten project, but that project is not 
selected because it reports the same cost value as the version of the project 
where the full expression is still present.
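
The tie can be illustrated with a toy cost model that, like the one described above, charges only per output expression (a deliberate simplification; Calcite's and Drill's real cost models are richer):

```java
// Toy project cost model that ignores expression complexity. Both the
// original project ("3+5") and the folded one ("8") emit one expression,
// so they tie on cost and the planner has no reason to prefer the folded plan.
public class ProjectCost {
    static double cost(int numExpressions, double rowCount) {
        // Only the count of expressions matters; how expensive each
        // expression is to evaluate is not modeled.
        return numExpressions * rowCount;
    }

    public static void main(String[] args) {
        double unfolded = cost(1, 1000); // SELECT 3+5 FROM ...
        double folded   = cost(1, 1000); // SELECT 8 FROM ...
        System.out.println(unfolded == folded); // prints true: the plans tie
    }
}
```

Under such a model the planner must break the tie some other way (e.g. rule importance), which is why the folded project can silently lose.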

 Constant folding rule exposing planning bugs and not being used in plan where 
 the constant expression is in the select list
 ---

 Key: DRILL-2218
 URL: https://issues.apache.org/jira/browse/DRILL-2218
 Project: Apache Drill
  Issue Type: Improvement
  Components: Query Planning & Optimization
Reporter: Jason Altekruse
Assignee: Aman Sinha
 Fix For: 1.4.0


 This test method and rule are not currently in the master branch, but they do 
 appear in the patch posted for constant expression folding during planning, 
 DRILL-2060. Once it is merged, the test 
 TestConstantFolding.testConstExprFolding_InSelect(), which is currently 
 ignored, will fail. The issue is that even though the constant folding 
 rule for project is firing, and I have traced it to see that a replacement 
 project with a literal is created, it is not being selected in the final 
 plan. This seems rather odd, as the last line of the rule's onMatch() method 
 carries the following comment; it does not appear to be having the desired 
 effect, so we may need to file a bug in Calcite.
 {code}
 // New plan is absolutely better than old plan.
 call.getPlanner().setImportance(project, 0.0);
 {code}
 Here is the query from the test; I expect the sum to be folded during planning 
 by the newly enabled project constant-folding rule.
 {code}
 select columns[0], 3+5 from cp.`test_input.csv`
 {code}
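 The transformation the rule is expected to perform is ordinary constant 
 folding; a minimal, self-contained sketch of the idea (an illustration only, 
 not Calcite's ReduceExpressionsRule implementation):

```java
// Minimal constant folding over a toy expression tree: an Add node whose
// children both fold to literals is replaced by a single literal, which is
// what should happen to "3+5" at plan time.
public class ConstantFold {
    static abstract class Expr {}
    static class Lit extends Expr {
        final int value;
        Lit(int v) { value = v; }
    }
    static class Add extends Expr {
        final Expr l, r;
        Add(Expr l, Expr r) { this.l = l; this.r = r; }
    }

    static Expr fold(Expr e) {
        if (e instanceof Add) {
            Expr l = fold(((Add) e).l);
            Expr r = fold(((Add) e).r);
            if (l instanceof Lit && r instanceof Lit) {
                // Both operands are constants: replace the Add with a literal.
                return new Lit(((Lit) l).value + ((Lit) r).value);
            }
            return new Add(l, r);
        }
        return e; // literals (and, in a real planner, column refs) are left alone
    }

    public static void main(String[] args) {
        Expr folded = fold(new Add(new Lit(3), new Lit(5)));
        System.out.println(((Lit) folded).value); // prints 8
    }
}
```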
 There are also some planning bugs that are exposed when this rule is enabled, 
 even if the ReduceExpressionsRule.PROJECT_INSTANCE has no impact on the plan 
 itself.
 It causes a planning bug for TestAggregateFunctions.testDrill2092 -as 
 well as TestProjectPushDown.testProjectPastJoinPastFilterPastJoinPushDown()-. 
 The rule's onMatch() is being called but does not modify the plan; it seems 
 that its presence in the optimizer makes another rule fire that creates a 
 bad plan.





[jira] [Created] (DRILL-3555) Changing defaults for planner.memory.max_query_memory_per_node causes queries with window function to fail

2015-07-24 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-3555:
--

 Summary: Changing defaults for 
planner.memory.max_query_memory_per_node causes queries with window function to 
fail
 Key: DRILL-3555
 URL: https://issues.apache.org/jira/browse/DRILL-3555
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0, 1.2.0
 Environment: 4 Nodes. Direct Memory= 48 GB each
Reporter: Abhishek Girish
Assignee: Jinfeng Ni
Priority: Critical


Changing the default value for planner.memory.max_query_memory_per_node from 2 
GB to anything higher causes queries with window functions to fail. 

Changed system options
{code:sql}
 select * from sys.options where status like '%CHANGE%';
+-------------------------------------------+----------+---------+----------+-------------+-------------+-----------+------------+
|                   name                    |   kind   |  type   |  status  |   num_val   | string_val  | bool_val  | float_val  |
+-------------------------------------------+----------+---------+----------+-------------+-------------+-----------+------------+
| planner.enable_decimal_data_type          | BOOLEAN  | SYSTEM  | CHANGED  | null        | null        | true      | null       |
| planner.memory.max_query_memory_per_node  | LONG     | SYSTEM  | CHANGED  | 8589934592  | null        | null      | null       |
+-------------------------------------------+----------+---------+----------+-------------+-------------+-----------+------------+
2 rows selected (0.249 seconds)
{code}

Query
{code:sql}
 SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) FROM 
 store_sales ss LIMIT 20;

java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
DrillRuntimeException: Adding this batch causes the total size to exceed max 
allowed size. Current runningBytes 1073638500, Incoming batchBytes 127875. 
maxBytes 1073741824
Fragment 1:0
[Error Id: 9c2ec9cf-21c6-4d5e-b0d6-7cd59e32c49d on abhi1:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}
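
The numbers in the error are consistent with a simple additive size guard. Below is a sketch of that check, with the constant and field names taken from the log message; the actual SortRecordBatchBuilder code may differ.

```java
// Sketch of the size guard implied by the log message; names mirror the
// message ("runningBytes", "batchBytes", "maxBytes"), not Drill's source.
public class BatchSizeGuard {
    static final long MAX_BYTES = 1073741824L; // 1 GiB, "maxBytes" in the log
    long runningBytes = 0L;

    // Returns false when adding the batch would exceed the limit, i.e. the
    // condition that produced the DrillRuntimeException above.
    boolean tryAdd(long batchBytes) {
        if (runningBytes + batchBytes > MAX_BYTES) {
            return false;
        }
        runningBytes += batchBytes;
        return true;
    }

    public static void main(String[] args) {
        BatchSizeGuard guard = new BatchSizeGuard();
        guard.runningBytes = 1073638500L;          // "Current runningBytes" in the log
        // 1073638500 + 127875 = 1073766375 > 1073741824, so the add is rejected
        System.out.println(guard.tryAdd(127875L)); // prints false
    }
}
```

Note that the maxBytes in the log is 1 GiB even though the option was set to 8589934592 (8 GiB), which suggests this particular limit is derived independently of planner.memory.max_query_memory_per_node.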


Log:
{code}
2015-07-23 18:16:52,292 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:2:2] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:2:2: 
State change requested RUNNING --> FINISHED

2015-07-23 18:16:52,292 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:2:2] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:2:2: 
State to report: FINISHED

2015-07-23 18:17:05,485 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:1:0] ERROR 
o.a.d.e.p.i.s.SortRecordBatchBuilder - Adding this batch causes the total size 
to exceed max allowed size. Current runningBytes 1073638500, Incoming 
batchBytes 127875. maxBytes 1073741824

2015-07-23 18:17:05,486 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:1:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:1:0: 
State change requested RUNNING --> FAILED

...

2015-07-23 18:17:05,990 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:1:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:1:0: 
State change requested FAILED --> FINISHED

2015-07-23 18:17:05,999 [2a4e6e2e-8cfa-ed8f-de56-e6c5517b5da6:frag:1:0] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: DrillRuntimeException: 
Adding this batch causes the total size to exceed max allowed size. Current 
runningBytes 1073638500, Incoming batchBytes 127875. maxBytes 1073741824

Fragment 1:0

[Error Id: 9c2ec9cf-21c6-4d5e-b0d6-7cd59e32c49d on abhi1:31010]

org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
DrillRuntimeException: Adding this batch causes the total size to exceed max 
allowed size. Current runningBytes 1073638500, Incoming batchBytes 127875. 
maxBytes 1073741824

Fragment 1:0

[Error Id: 9c2ec9cf-21c6-4d5e-b0d6-7cd59e32c49d on abhi1:31010]

at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
 ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]

at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]

at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]

at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)