[jira] [Commented] (DRILL-4779) Kafka storage plugin support

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248369#comment-16248369
 ] 

ASF GitHub Bot commented on DRILL-4779:
---

Github user akumarb2010 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1027#discussion_r150376586
  
--- Diff: 
contrib/storage-kafka/src/test/java/org/apache/drill/exec/store/kafka/cluster/EmbeddedZKQuorum.java
 ---
@@ -0,0 +1,83 @@
+/**
--- End diff --

This has been taken care of.


> Kafka storage plugin support
> 
>
> Key: DRILL-4779
> URL: https://issues.apache.org/jira/browse/DRILL-4779
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.11.0
>Reporter: B Anil Kumar
>Assignee: B Anil Kumar
>  Labels: doc-impacting
> Fix For: 1.12.0
>
>
> Implementing a Kafka storage plugin will enable strong SQL support for Kafka.
> The initial implementation can target JSON and Avro message types.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-4779) Kafka storage plugin support

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248355#comment-16248355
 ] 

ASF GitHub Bot commented on DRILL-4779:
---

Github user akumarb2010 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1027#discussion_r150376174
  
--- Diff: contrib/storage-kafka/README.md ---
@@ -0,0 +1,230 @@
+# Drill Kafka Plugin
+
+Drill kafka storage plugin allows you to perform interactive analysis 
using SQL against Apache Kafka.
+
+Supported Kafka Version
+Kafka-0.10 and above 
+
+Message Formats
+Currently this plugin supports reading only Kafka messages of type 
JSON.
+
+
+Message Readers
+Message readers are used for reading messages from Kafka. The message reader types supported as of now are:
+
+| MessageReader | Description | Key DeSerializer | Value DeSerializer |
+|---|---|---|---|
+| JsonMessageReader | To read Json messages | org.apache.kafka.common.serialization.ByteArrayDeserializer | org.apache.kafka.common.serialization.ByteArrayDeserializer |
+
+Plugin Configurations
+Drill Kafka plugin supports following properties
+
+   kafkaConsumerProps: These are typical Kafka consumer properties 
(https://kafka.apache.org/documentation/#consumerconfigs).
+drillKafkaProps: These are Drill Kafka plugin 
properties. As of now, it supports the following properties:
+   
+drill.kafka.message.reader: Message reader 
implementation to use while reading messages from Kafka. The message reader 
implementation should be configured based on the message format. Available 
message readers:
+ 
+ org.apache.drill.exec.store.kafka.decoders.JsonMessageReader
+ 
+
+drill.kafka.poll.timeout: Polling timeout used by 
Kafka client while fetching messages from Kafka cluster.
+
+
+
+
+Plugin Registration
+To register the kafka plugin, open the drill web interface. To open the 
drill web interface, enter http://drillbit:8047/storage in 
your browser.
+
+The following is an example plugin registration configuration
+
+{
+  "type": "kafka",
+  "kafkaConsumerProps": {
+"key.deserializer": 
"org.apache.kafka.common.serialization.ByteArrayDeserializer",
+"auto.offset.reset": "earliest",
+"bootstrap.servers": "localhost:9092",
+"enable.auto.commit": "true",
+"group.id": "drill-query-consumer-1",
+"value.deserializer": 
"org.apache.kafka.common.serialization.ByteArrayDeserializer",
+"session.timeout.ms": "3"
+  },
+  "drillKafkaProps": {
+"drill.kafka.message.reader": 
"org.apache.drill.exec.store.kafka.decoders.JsonMessageReader",
+"drill.kafka.poll.timeout": "2000"
+  },
+  "enabled": true
+}
+
+
+ Abstraction 
+In Drill, each Kafka topic is mapped to a SQL table. When a query is 
issued on a table, it scans all the messages from the earliest offset to the 
latest offset of that topic at that point in time. This plugin automatically 
discovers all the topics (tables), allowing you to perform analysis without 
executing DDL statements.
--- End diff --

This is a very valid point, Paul. The only issue is that in other storage 
plugins, such as Mongo, we can push these predicates down to the storage layer 
as filters, because those stores support predicate pushdown.

In the case of Kafka, we cannot push these filters down. To achieve this, 
we can create a specific KafkaSubScanSpec for the query range by parsing the 
predicates on kafkaMsgOffset, but this needs some time for development and 
testing. Is it OK if we create a separate JIRA for this and release it in the 
next version?
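
As a rough illustration of the query-range idea above (a minimal sketch, not the plugin's code): the class, enum, and method names below are hypothetical, and the sketch assumes only simple AND-ed comparisons on kafkaMsgOffset, with the partition's earliest and latest offsets already known.

```java
import java.util.List;

/** Hypothetical sketch: narrow a partition's scan range using predicates on kafkaMsgOffset. */
final class OffsetRangeSketch {

  /** Comparison operators this sketch understands. */
  enum Op { GT, GTE, LT, LTE, EQ }

  /** One predicate of the form "kafkaMsgOffset <op> value". */
  static final class OffsetPredicate {
    final Op op;
    final long value;

    OffsetPredicate(Op op, long value) {
      this.op = op;
      this.value = value;
    }
  }

  /**
   * Intersects [earliest, latest) with the AND-ed predicates.
   * Returns {start, end} with end exclusive; start == end means nothing to scan.
   * Overflow at Long.MAX_VALUE is ignored in this sketch.
   */
  static long[] scanRange(long earliest, long latest, List<OffsetPredicate> predicates) {
    long start = earliest;
    long end = latest;
    for (OffsetPredicate p : predicates) {
      switch (p.op) {
        case GT:
          start = Math.max(start, p.value + 1);
          break;
        case GTE:
          start = Math.max(start, p.value);
          break;
        case LT:
          end = Math.min(end, p.value);
          break;
        case LTE:
          end = Math.min(end, p.value + 1);
          break;
        case EQ:
          start = Math.max(start, p.value);
          end = Math.min(end, p.value + 1);
          break;
      }
    }
    if (end < start) {
      end = start; // contradictory predicates: empty range
    }
    return new long[] {start, end};
  }
}
```

With such a range in hand, a scan spec per partition would only need to read the offsets in [start, end) instead of the whole topic; predicates on other columns would still be evaluated by Drill after the scan.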







[jira] [Updated] (DRILL-5783) Make code generation in the TopN operator more modular and test it

2017-11-10 Thread Timothy Farkas (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas updated DRILL-5783:
--
Labels: ready-to-commit  (was: )

> Make code generation in the TopN operator more modular and test it
> --
>
> Key: DRILL-5783
> URL: https://issues.apache.org/jira/browse/DRILL-5783
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>  Labels: ready-to-commit
>
> The work for this PR has had several other PRs batched together with it. The 
> full description of work is the following:
> DRILL-5783
> * A unit test is created for the priority queue in the TopN operator
> * The code generation classes passed around a completely unused function 
> registry reference in some places so I removed it.
> * The priority queue had unused parameters for some of its methods so I 
> removed them.
> DRILL-5841
> * There were many many ways in which temporary folders were created in unit 
> tests. I have unified the way these folders are created with the 
> DirTestWatcher, SubDirTestWatcher, and BaseDirTestWatcher. All the unit tests 
> have been updated to use these. The test watchers create temp directories in 
> ./target//. So all the files generated and used in the context of a test can 
> easily be found in the same consistent location.
> * This change should fix the sporadic hashagg test failures, as well as 
> failures caused by stray files in /tmp
> DRILL-5894
> * dfs_test is used as a storage plugin throughout the unit tests. This is 
> highly confusing and we can just use dfs instead.
> *Misc*
> * General code cleanup.
> * There are many places where String.format is used unnecessarily. The test 
> builder methods already use String.format for you when you pass them args. I 
> cleaned some of these up.





[jira] [Updated] (DRILL-5909) need new JMX metrics for (FAILED and CANCELED) queries

2017-11-10 Thread Prasad Nagaraj Subramanya (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Nagaraj Subramanya updated DRILL-5909:
-
Issue Type: Improvement  (was: Bug)

> need new JMX metrics for (FAILED and CANCELED) queries
> --
>
> Key: DRILL-5909
> URL: https://issues.apache.org/jira/browse/DRILL-5909
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Monitoring
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Khurram Faraaz
>Assignee: Prasad Nagaraj Subramanya
>  Labels: ready-to-commit
> Fix For: 1.12.0
>
>
> We have these JMX metrics today:
> {noformat}
> drill.queries.running
> drill.queries.completed
> {noformat}
> We need these new JMX metrics:
> {noformat}
> drill.queries.failed
> drill.queries.canceled
> {noformat}
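
For illustration, the requested counters could be modelled as below. This is a minimal sketch using only the JDK; it does not use Drill's metrics registry, and the class, enum, and method names are hypothetical. Only the metric names come from the issue text.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: one counter per terminal query outcome. */
final class QueryCountersSketch {

  /** Terminal outcomes; FAILED and CANCELED are the ones the issue asks for. */
  enum Outcome { COMPLETED, FAILED, CANCELED }

  private final Map<Outcome, AtomicLong> counters = new EnumMap<>(Outcome.class);

  QueryCountersSketch() {
    for (Outcome outcome : Outcome.values()) {
      counters.put(outcome, new AtomicLong());
    }
  }

  /** Called once when a query reaches the given terminal state. */
  void increment(Outcome outcome) {
    counters.get(outcome).incrementAndGet();
  }

  long count(Outcome outcome) {
    return counters.get(outcome).get();
  }

  /** Builds names of the form drill.queries.failed, drill.queries.canceled, ... */
  static String metricName(Outcome outcome) {
    return "drill.queries." + outcome.name().toLowerCase(Locale.ROOT);
  }
}
```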





[jira] [Updated] (DRILL-5952) Implement "CREATE TABLE IF NOT EXISTS"

2017-11-10 Thread Prasad Nagaraj Subramanya (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Nagaraj Subramanya updated DRILL-5952:
-
Issue Type: Improvement  (was: Bug)

> Implement "CREATE TABLE IF NOT EXISTS"
> --
>
> Key: DRILL-5952
> URL: https://issues.apache.org/jira/browse/DRILL-5952
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: SQL Parser
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Prasad Nagaraj Subramanya
> Fix For: 1.12.0
>
>
> Currently, if a table or view with the same name exists, CREATE TABLE fails 
> with a VALIDATION ERROR.
> Adding "IF NOT EXISTS" support for CREATE TABLE will ensure that the query 
> succeeds.





[jira] [Updated] (DRILL-5921) Counters metrics should be listed in table

2017-11-10 Thread Prasad Nagaraj Subramanya (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Nagaraj Subramanya updated DRILL-5921:
-
Issue Type: Improvement  (was: Bug)

> Counters metrics should be listed in table
> --
>
> Key: DRILL-5921
> URL: https://issues.apache.org/jira/browse/DRILL-5921
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Prasad Nagaraj Subramanya
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.12.0
>
>
> Counter metrics are currently displayed as a JSON string in the Drill UI. They 
> should be listed in a table similar to other metrics.





[jira] [Commented] (DRILL-5783) Make code generation in the TopN operator more modular and test it

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248288#comment-16248288
 ] 

ASF GitHub Bot commented on DRILL-5783:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/984
  
After squashing the commits and rebasing, I noticed the Windows functional 
tests were failing. The issue was caused by replacing the '/' constant in 
FileUtils (now renamed to DrillFileUtils) in **ClassTransformer** and 
**FunctionInitializer** with File.separator. This broke the build because 
File.separator is '\' on Windows, and in the context of **ClassTransformer** and 
**FunctionInitializer** the separator is used to look things up in the 
classpath. Looking things up in the classpath requires '/' on both Windows and 
Linux, so I added the '/' constant back to DrillFileUtils, along with a 
comment explaining its purpose, and used it in **ClassTransformer** and 
**FunctionInitializer**. *Note:* this was a minor fix that affected only a few 
lines.
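
To illustrate the point about classpath lookups, here is a minimal sketch. The class and method names are hypothetical and are not the ones in DrillFileUtils; the only facts relied on are that classpath resource names use '/' on every platform and that File.separator is platform dependent.

```java
/** Sketch: classpath resource names use '/' on every platform. */
final class ClasspathPathSketch {

  /** Separator for classpath lookups; deliberately not File.separatorChar. */
  static final char CLASSPATH_SEPARATOR = '/';

  /** Turns a fully qualified class name into the resource path used for classpath lookups. */
  static String classResourcePath(String className) {
    return className.replace('.', CLASSPATH_SEPARATOR) + ".class";
  }

  /** Returns true if the class file can be found through the system class loader. */
  static boolean isOnClasspath(String className) {
    return ClassLoader.getSystemResource(classResourcePath(className)) != null;
  }
}
```

Had the path been built with File.separator instead, the resource name would contain '\' on Windows and the lookup would not find the class file there, which matches the failure described above.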


> Make code generation in the TopN operator more modular and test it
> --
>
> Key: DRILL-5783
> URL: https://issues.apache.org/jira/browse/DRILL-5783
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> The work for this PR has had several other PRs batched together with it. The 
> full description of work is the following:
> DRILL-5783
> * A unit test is created for the priority queue in the TopN operator
> * The code generation classes passed around a completely unused function 
> registry reference in some places so I removed it.
> * The priority queue had unused parameters for some of its methods so I 
> removed them.
> DRILL-5841
> * There were many many ways in which temporary folders were created in unit 
> tests. I have unified the way these folders are created with the 
> DirTestWatcher, SubDirTestWatcher, and BaseDirTestWatcher. All the unit tests 
> have been updated to use these. The test watchers create temp directories in 
> ./target//. So all the files generated and used in the context of a test can 
> easily be found in the same consistent location.
> * This change should fix the sporadic hashagg test failures, as well as 
> failures caused by stray files in /tmp
> DRILL-5894
> * dfs_test is used as a storage plugin throughout the unit tests. This is 
> highly confusing and we can just use dfs instead.
> *Misc*
> * General code cleanup.
> * There are many places where String.format is used unnecessarily. The test 
> builder methods already use String.format for you when you pass them args. I 
> cleaned some of these up.





[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248269#comment-16248269
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1026
  
Further, is the extra option to `convertFromJSON` really needed? Can't we 
just accept `NaN` and `Infinity` by default?

Consider: if the option is off by default, users without `NaN` or 
`Infinity` data will see no difference, but users with this data will get an 
error and have to hunt down the option to make their data work.

If the option is on by default, users without `NaN` or `Infinity` data will 
still see no difference, and users with this data will have their queries work 
by default.

So there seems to be no harm in turning `NaN` and `Infinity` support on by 
default.
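
The argument can be made concrete with a small sketch. This is not Drill's JSON reader; the class and the `allowNonNumeric` flag are hypothetical stand-ins for the proposed option, and the parsing relies only on the JDK's `Double.parseDouble`, which accepts "NaN", "Infinity", and "-Infinity".

```java
/** Hypothetical sketch of a reader option that gates NaN/Infinity literals. */
final class NonNumericSketch {

  /** Parses a number token; NaN, Infinity, and -Infinity are accepted only when allowed. */
  static double parseNumber(String token, boolean allowNonNumeric) {
    switch (token) {
      case "NaN":
      case "Infinity":
      case "-Infinity":
        if (!allowNonNumeric) {
          throw new NumberFormatException("Non-numeric literal not allowed: " + token);
        }
        return Double.parseDouble(token);
      default:
        // Ordinary numbers take the same path whatever the flag says.
        return Double.parseDouble(token);
    }
  }
}
```

For ordinary numeric tokens the flag never changes the result, so in this sketch turning it on by default only affects inputs that would otherwise fail.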


> Add non-numeric support for JSON processing
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>  Labels: doc-impacting
> Fix For: Future
>
>
> Add session options to allow Drill to work with non-standard JSON number 
> literals such as NaN, Infinity, and -Infinity. By default these options will 
> be switched off; the user will be able to toggle them during a working session.
> *For documentation*
> 1. Added two session options {{store.json.reader.non_numeric_numbers}} and 
> {{store.json.reader.non_numeric_numbers}} that allow reading/writing NaN and 
> Infinity as numbers. By default these options are set to false.
> 2. Extended the signatures of the {{convert_toJSON}} and {{convert_fromJSON}} 
> functions by adding a second, optional parameter that enables reading/writing 
> NaN and Infinity.
> For example:
> {noformat}
> select convert_fromJSON('{"key": NaN}') from (values(1)); will result with 
> JsonParseException, but
> select convert_fromJSON('{"key": NaN}', true) from (values(1)); will parse 
> NaN as a number.
> {noformat}





[jira] [Commented] (DRILL-5941) Skip header / footer logic works incorrectly for Hive tables when file has several input splits

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248259#comment-16248259
 ] 

ASF GitHub Bot commented on DRILL-5941:
---

Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/1030
  
@arina-ielchiieva I have a question. If the splits for a file are spread 
across multiple fragments, does this logic work?



> Skip header / footer logic works incorrectly for Hive tables when file has 
> several input splits
> ---
>
> Key: DRILL-5941
> URL: https://issues.apache.org/jira/browse/DRILL-5941
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.11.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> *To reproduce*
> 1. Create a CSV file with two columns (key, value) and 329 rows, where the 
> first row is a header.
> The data file size should be greater than the chunk size of 256 MB. Copy the 
> file to the distributed file system.
> 2. Create table in Hive:
> {noformat}
> CREATE EXTERNAL TABLE `h_table`(
>   `key` bigint,
>   `value` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY ','
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'maprfs:/tmp/h_table'
> TBLPROPERTIES (
>  'skip.header.line.count'='1');
> {noformat}
> 3. Execute the query {{select * from hive.h_table}} in Drill (query data using 
> the Hive plugin). The query returns fewer rows than expected. The expected 
> result is 328 (total count minus one row for the header).
> *The root cause*
> Since the file is larger than the default chunk size, it is split into several 
> fragments, known as input splits. For example:
> {noformat}
> maprfs:/tmp/h_table/h_table.csv:0+268435456
> maprfs:/tmp/h_table/h_table.csv:268435457+492782112
> {noformat}
> TextHiveReader is responsible for handling the skip header and/or footer logic.
> Currently Drill creates a reader [for each input 
> split|https://github.com/apache/drill/blob/master/contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveScanBatchCreator.java#L84],
> so the skip header and/or footer logic is applied to every input split. 
> Ideally, the input splits mentioned above should be read by a single 
> reader so that the skip header/footer logic is applied correctly.
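
The row loss described in the root cause can be shown with a little arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, and the per-split row counts are made up. Only the totals (329 rows, 1 header line, 328 expected) come from the issue.

```java
import java.util.List;

/** Hypothetical sketch: why skipping the header once per input split loses rows. */
final class SkipHeaderSketch {

  /** Rows returned when every split's reader skips headerLines (the behaviour described above). */
  static long rowsWhenSkippingPerSplit(List<Long> rowsPerSplit, int headerLines) {
    long total = 0;
    for (long rows : rowsPerSplit) {
      total += Math.max(0, rows - headerLines);
    }
    return total;
  }

  /** Rows returned when the header is skipped once for the whole file. */
  static long rowsWhenSkippingPerFile(List<Long> rowsPerSplit, int headerLines) {
    long total = 0;
    for (long rows : rowsPerSplit) {
      total += rows;
    }
    return Math.max(0, total - headerLines);
  }
}
```

With two splits of, say, 200 and 129 rows and one header line, the per-split variant returns 327 rows while the per-file variant returns the expected 328.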





[jira] [Commented] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248253#comment-16248253
 ] 

ASF GitHub Bot commented on DRILL-5922:
---

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1023
  
For commit, let's do this:

* With luck, Arina will commit two PRs this week that may conflict: PR 
#970, and PR #978.
* Tim should rebase this PR on top of those changes once they are committed.
* If the changes are non-trivial, they may require additional review. If 
so, Paul to do that early the week of the 13th.
* If there is time to squeeze in this commit before Arina starts the 1.12 
release, Paul will do a one-off commit.
* Otherwise, since Arina wants to do the release during that week, this 
commit may have to wait until after that release.
* Once this commit is in, Paul to rebase the PR #914, as it changed some of 
the same files.


> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>  Labels: ready-to-commit
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
>  [jmockit-1.3.jar:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>  

[jira] [Commented] (DRILL-5952) Implement "CREATE TABLE IF NOT EXISTS"

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248223#comment-16248223
 ] 

ASF GitHub Bot commented on DRILL-5952:
---

Github user prasadns14 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1033#discussion_r150364771
  
--- Diff: exec/java-exec/src/main/codegen/includes/parserImpls.ftl ---
@@ -215,11 +215,12 @@ SqlNode SqlDropView() :
 
 /**
  * Parses a CTAS or CTTAS statement.
- * CREATE [TEMPORARY] TABLE tblname [ (field1, field2, ...) ] AS 
select_statement.
+ * CREATE [TEMPORARY] [IF NOT EXISTS] TABLE tblname [ (field1, field2, 
...) ] AS select_statement.
--- End diff --

Fixed it


> Implement "CREATE TABLE IF NOT EXISTS"
> --
>
> Key: DRILL-5952
> URL: https://issues.apache.org/jira/browse/DRILL-5952
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Prasad Nagaraj Subramanya
> Fix For: 1.12.0
>
>
> Currently, if a table or view with the same name exists, CREATE TABLE fails 
> with a VALIDATION ERROR.
> Adding "IF NOT EXISTS" support for CREATE TABLE will ensure that the query 
> succeeds.





[jira] [Commented] (DRILL-5952) Implement "CREATE TABLE IF NOT EXISTS"

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248221#comment-16248221
 ] 

ASF GitHub Bot commented on DRILL-5952:
---

Github user Agirish commented on a diff in the pull request:

https://github.com/apache/drill/pull/1033#discussion_r150364581
  
--- Diff: exec/java-exec/src/main/codegen/includes/parserImpls.ftl ---
@@ -215,11 +215,12 @@ SqlNode SqlDropView() :
 
 /**
  * Parses a CTAS or CTTAS statement.
- * CREATE [TEMPORARY] TABLE tblname [ (field1, field2, ...) ] AS 
select_statement.
+ * CREATE [TEMPORARY] [IF NOT EXISTS] TABLE tblname [ (field1, field2, 
...) ] AS select_statement.
--- End diff --

[IF NOT EXISTS] should be after TABLE?







[jira] [Commented] (DRILL-4842) SELECT * on JSON data results in NumberFormatException

2017-11-10 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248214#comment-16248214
 ] 

Paul Rogers commented on DRILL-4842:


The JSON reader has been updated in the "Batch Size Control" project. A new 
feature was added to handle runs of nulls.

If the first value seen for a field is null, then we enter "deferred null mode."

While in that mode, we do not create a value vector; instead we bide our time 
waiting for an actual value.

When we see an actual value, we create a vector of that type. The vector 
automatically performs "fill empties" logic for the missing null values.

If we reach the end of a batch, and we've seen no value, then we assume "text 
mode" for this field. That is, this field (only) acts as if "all text mode" 
were set. The field becomes a nullable Varchar.

Once we see a scalar value, that value is saved as text. So, if we have 10K 
nulls, followed by the value 10, the value will be saved as the string "10".

If the value ends up being an array or object, then the parse fails with a type 
mismatch error.

The above is only a work-around. The proper solution is to allow the user to 
specify a schema so that Drill need not try to predict the future.
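
The behaviour described above can be sketched for a single column as follows. This is a simplified illustration, not the reader's code: the class is hypothetical, a plain list stands in for a value vector, and the array/object type-mismatch case is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of "deferred null mode" for one column; not the actual reader code. */
final class DeferredNullColumn {
  private int deferredNulls = 0;   // nulls seen before any real value
  private List<Object> values;     // stands in for a value vector; created lazily
  private Class<?> type;           // type taken from the first non-null value
  private boolean textMode;        // set when a batch ends with nothing but nulls

  /** Adds one value for the column. */
  void add(Object value) {
    if (values == null) {
      if (value == null) {
        deferredNulls++;           // still deferring: no vector yet
        return;
      }
      type = value.getClass();     // first real value decides the type
      createVector();
    }
    // In text mode, later scalars are stored as their string form.
    values.add(textMode && value != null ? String.valueOf(value) : value);
  }

  /** Called at the end of a batch; an all-null column falls back to text. */
  void endBatch() {
    if (values == null) {
      textMode = true;
      type = String.class;
      createVector();
    }
  }

  /** Creates the "vector" and back-fills the deferred nulls ("fill empties"). */
  private void createVector() {
    values = new ArrayList<>();
    for (int i = 0; i < deferredNulls; i++) {
      values.add(null);
    }
  }

  Class<?> type() {
    return type;
  }

  List<Object> values() {
    return values;
  }
}
```

In this sketch, nulls followed by 10 within one batch yield an Integer column, whereas an all-null batch followed by 10 yields a text column holding "10", mirroring the two cases described in the comment.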

> SELECT * on JSON data results in NumberFormatException
> --
>
> Key: DRILL-4842
> URL: https://issues.apache.org/jira/browse/DRILL-4842
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Assignee: Serhii Harnyk
> Attachments: tooManyNulls.json
>
>
> Note that SELECT c1 returns correct results; the failure is seen when 
> we do SELECT *. store.json.all_text_mode was set to true.
> The JSON file tooManyNulls.json has one key, c1, with 4096 nulls as its value, 
> and the 4097th occurrence of c1 has the value "Hello World".
> git commit ID : aaf220ff
> MapR Drill 1.8.0 RPM
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> alter session set 
> `store.json.all_text_mode`=true;
> +---++
> |  ok   |  summary   |
> +---++
> | true  | store.json.all_text_mode updated.  |
> +---++
> 1 row selected (0.27 seconds)
> 0: jdbc:drill:schema=dfs.tmp> SELECT c1 FROM `tooManyNulls.json` WHERE c1 IN 
> ('Hello World');
> +--+
> |  c1  |
> +--+
> | Hello World  |
> +--+
> 1 row selected (0.243 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select * FROM `tooManyNulls.json` WHERE c1 IN 
> ('Hello World');
> Error: SYSTEM ERROR: NumberFormatException: Hello World
> Fragment 0:0
> [Error Id: 9cafb3f9-3d5c-478a-b55c-900602b8765e on centos-01.qa.lab:31010]
>  (java.lang.NumberFormatException) Hello World
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI():95
> 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varTypesToInt():120
> org.apache.drill.exec.test.generated.FiltererGen1169.doSetup():45
> org.apache.drill.exec.test.generated.FiltererGen1169.setup():54
> 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer():195
> 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema():107
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():78
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():257
> 

[jira] [Commented] (DRILL-5952) Implement "CREATE TABLE IF NOT EXISTS"

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248212#comment-16248212
 ] 

ASF GitHub Bot commented on DRILL-5952:
---

GitHub user prasadns14 opened a pull request:

https://github.com/apache/drill/pull/1033

DRILL-5952: Implement "CREATE TABLE IF NOT EXISTS"

1) Added support for CREATE TABLE IF NOT EXISTS
2) Added unit tests for the same

@paul-rogers please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/prasadns14/drill DRILL-5951

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1033.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1033


commit 9dbeeeb2ae2ac0d4626bbe175095fc85f5068680
Author: Prasad Nagaraj Subramanya 
Date:   2017-11-10T23:52:31Z

DRILL-5952: Implement "CREATE TABLE IF NOT EXISTS"









[jira] [Created] (DRILL-5952) Implement "CREATE TABLE IF NOT EXISTS"

2017-11-10 Thread Prasad Nagaraj Subramanya (JIRA)
Prasad Nagaraj Subramanya created DRILL-5952:


 Summary: Implement "CREATE TABLE IF NOT EXISTS"
 Key: DRILL-5952
 URL: https://issues.apache.org/jira/browse/DRILL-5952
 Project: Apache Drill
  Issue Type: Bug
  Components: SQL Parser
Affects Versions: 1.11.0
Reporter: Prasad Nagaraj Subramanya
Assignee: Prasad Nagaraj Subramanya
 Fix For: 1.12.0


Currently, if a table or view with the same name exists, CREATE TABLE fails with a 
VALIDATION ERROR.

Adding "IF NOT EXISTS" support to CREATE TABLE will ensure that the query succeeds.
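
The requested semantics can be sketched in a few lines of Java. This is only an illustration of the behavior described above, not Drill's implementation: the `Catalog` class, its `createTable` method, and the `ValidationException` type are hypothetical names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory catalog used only to illustrate the semantics; not part of Drill. */
final class Catalog {

  /** Stand-in for the VALIDATION ERROR mentioned in the issue. */
  static final class ValidationException extends RuntimeException {
    ValidationException(String message) {
      super(message);
    }
  }

  private final Map<String, String> tables = new HashMap<>();

  /**
   * Registers a table definition under the given name.
   *
   * @return true if the table was created, false if it already existed
   *         and ifNotExists was set (the existing table is left untouched)
   */
  boolean createTable(String name, String definition, boolean ifNotExists) {
    if (tables.containsKey(name)) {
      if (ifNotExists) {
        return false; // IF NOT EXISTS: succeed without changing anything
      }
      throw new ValidationException("Table or view already exists: " + name);
    }
    tables.put(name, definition);
    return true;
  }

  /** Returns the stored definition, or null if no such table exists. */
  String definitionOf(String name) {
    return tables.get(name);
  }
}
```

With this sketch, a second `createTable("t", ..., true)` call returns `false` and keeps the original definition, while the same call with `ifNotExists` set to `false` throws.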





[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248210#comment-16248210
 ] 

Paul Rogers commented on DRILL-5919:


As a reference, this change has been ported to the revised JSON reader created 
within the "Batch Size Control" project. PR to be issued in Drill 1.13.

> Add non-numeric support for JSON processing
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>  Labels: doc-impacting
> Fix For: Future
>
>
> Add session options that allow Drill to work with non-standard JSON 
> number literals such as NaN, Infinity, and -Infinity. By default these options 
> are switched off; the user can toggle them during a working session.
> *For documentation*
> 1. Added two session options, {{store.json.reader.non_numeric_numbers}} and 
> {{store.json.writer.non_numeric_numbers}}, that allow reading and writing NaN 
> and Infinity as numbers. By default these options are set to false.
> 2. Extended the signatures of the {{convert_toJSON}} and {{convert_fromJSON}} 
> functions by adding an optional second parameter that enables reading and 
> writing NaN and Infinity.
> For example:
> {noformat}
> select convert_fromJSON('{"key": NaN}') from (values(1)); will result in a 
> JsonParseException, but
> select convert_fromJSON('{"key": NaN}', true) from (values(1)); will parse 
> NaN as a number.
> {noformat}
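
To make the effect of such a flag concrete, the sketch below parses a single number token with and without non-numeric support. It is a stand-alone illustration, not Drill's JSON reader: `NonNumericTokens` and `parseNumber` are hypothetical names, and the only library call relied on is `Double.parseDouble`, which accepts `NaN`, `Infinity`, and `-Infinity`.

```java
/** Hypothetical helper illustrating a "non-numeric numbers" switch; not part of Drill. */
final class NonNumericTokens {

  private NonNumericTokens() {
  }

  /**
   * Parses one number token.
   *
   * When allowNonNumeric is false, the tokens NaN, Infinity and -Infinity
   * are rejected, as standard JSON requires. This sketch does not otherwise
   * validate the JSON number grammar; it delegates to Double.parseDouble.
   */
  static double parseNumber(String token, boolean allowNonNumeric) {
    switch (token) {
      case "NaN":
      case "Infinity":
      case "-Infinity":
        if (!allowNonNumeric) {
          throw new IllegalArgumentException("Non-standard number literal: " + token);
        }
        break;
      default:
        break;
    }
    return Double.parseDouble(token);
  }
}
```

Calling `parseNumber("NaN", false)` throws, which mirrors the failing query in the description, while `parseNumber("NaN", true)` yields `Double.NaN`.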





[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248189#comment-16248189
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150359510
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertFrom.java
 ---
@@ -91,4 +92,60 @@ public void eval(){
 }
   }
 
+  @FunctionTemplate(name = "convert_fromJSON", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL, isRandom = true)
+  public static class ConvertFromJsonVarcharNonNumerics implements DrillSimpleFunc{
+
+@Param VarCharHolder in;
+@Param BitHolder enableNonNumeric;
+@Inject DrillBuf buffer;
+@Workspace org.apache.drill.exec.vector.complex.fn.JsonReader jsonReader;
+
+@Output ComplexWriter writer;
+
+public void setup(){
+  jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false,/* do not read numbers as doubles */
--- End diff --

See above.




[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248191#comment-16248191
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150360262
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
 store.format: "parquet",
 store.hive.optimize_scan_with_native_readers: false,
 store.json.all_text_mode: false,
+store.json.writer.non_numeric_numbers: false,
+store.json.reader.non_numeric_numbers: false,
--- End diff --

Any reason these are not enabled by default?




[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248187#comment-16248187
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150359625
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertTo.java
 ---
@@ -90,7 +91,71 @@ public void eval(){
 
   java.io.ByteArrayOutputStream stream = new java.io.ByteArrayOutputStream();
   try {
-org.apache.drill.exec.vector.complex.fn.JsonWriter jsonWriter = new org.apache.drill.exec.vector.complex.fn.JsonWriter(stream, true, true);
+org.apache.drill.exec.vector.complex.fn.JsonWriter jsonWriter = new org.apache.drill.exec.vector.complex.fn.JsonWriter(stream, true, true, false);
+
+jsonWriter.write(input);
+  } catch (Exception e) {
+throw new RuntimeException(e);
+  }
+
+  byte [] bytea = stream.toByteArray();
+
+  out.buffer = buffer = buffer.reallocIfNeeded(bytea.length);
+  out.buffer.setBytes(0, bytea);
+  out.end = bytea.length;
+}
+  }
+
+  @FunctionTemplate(names = { "convert_toJSON", "convert_toSIMPLEJSON" } , scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL)
+  public static class ConvertToJsonNonNumeric implements DrillSimpleFunc{
+
+@Param FieldReader input;
+@Param BitHolder nonNumeric;
+@Output VarBinaryHolder out;
+@Inject DrillBuf buffer;
+
+public void setup(){
+}
+
+public void eval(){
+  out.start = 0;
+
+  java.io.ByteArrayOutputStream stream = new java.io.ByteArrayOutputStream();
+  try {
+org.apache.drill.exec.vector.complex.fn.JsonWriter jsonWriter = new org.apache.drill.exec.vector.complex.fn.JsonWriter(stream, true, false,
--- End diff --

More copies.




[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248192#comment-16248192
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150360475
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestJsonNonNumerics.java
 ---
@@ -0,0 +1,167 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to you under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.drill.exec.vector.complex.writer;
+
+import com.google.common.collect.ImmutableMap;
+import org.apache.commons.io.FileUtils;
+import org.apache.drill.BaseTestQuery;
+import org.apache.drill.common.exceptions.UserRemoteException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.record.RecordBatchLoader;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.rpc.user.QueryDataBatch;
+import org.apache.drill.exec.vector.VarCharVector;
+import org.junit.Test;
+
+import java.io.File;
+import java.util.List;
+
+import static org.hamcrest.CoreMatchers.containsString;
+import static org.junit.Assert.*;
+
+public class TestJsonNonNumerics extends BaseTestQuery {
+
+  @Test
+  public void testNonNumericSelect() throws Exception {
+File file = new File(getTempDir("nan_test"), "nan_test.json");
+String json = "{\"nan\":NaN, \"inf\":Infinity}";
+String query = String.format("select * from dfs.`%s`",file.getAbsolutePath());
+try {
+  FileUtils.writeStringToFile(file, json);
+  test("alter session set `store.json.reader.non_numeric_numbers` = true");
+  testBuilder()
+.sqlQuery(query)
+.unOrdered()
+.baselineColumns("nan", "inf")
+.baselineValues(Double.NaN, Double.POSITIVE_INFINITY)
+.build()
+.run();
+} finally {
+  test("alter session reset `store.json.reader.non_numeric_numbers`");
+  FileUtils.deleteQuietly(file);
+}
+  }
+
+  @Test(expected = UserRemoteException.class)
+  public void testNonNumericFailure() throws Exception {
+File file = new File(getTempDir("nan_test"), "nan_test.json");
+test("alter session set `store.json.reader.non_numeric_numbers` = false");
+String json = "{\"nan\":NaN, \"inf\":Infinity}";
+try {
+  FileUtils.writeStringToFile(file, json);
+  test("select * from dfs.`%s`;", file.getAbsolutePath());
+} catch (UserRemoteException e) {
+  assertThat(e.getMessage(), containsString("Error parsing JSON"));
+  throw e;
+} finally {
+  test("alter session reset `store.json.reader.non_numeric_numbers`");
+  FileUtils.deleteQuietly(file);
+}
+  }
+
+  @Test
+  public void testCreateTableNonNumerics() throws Exception {
+File file = new File(getTempDir("nan_test"), "nan_test.json");
+String json = "{\"nan\":NaN, \"inf\":Infinity}";
+String tableName = "ctas_test";
+try {
+  FileUtils.writeStringToFile(file, json);
+  test("alter session set `store.json.reader.non_numeric_numbers` = true");
+  test("alter session set `store.json.writer.non_numeric_numbers` = true");
+  test("alter session set `store.format`='json'");
+  test("create table dfs_test.tmp.`%s` as select * from dfs.`%s`;", tableName, file.getAbsolutePath());
+
+  // ensuring that `NaN` and `Infinity` tokens ARE NOT enclosed with double quotes
+  File resultFile = new File(new File(getDfsTestTmpSchemaLocation(),tableName),"0_0_0.json");
+  String resultJson = FileUtils.readFileToString(resultFile);
+  int nanIndex = resultJson.indexOf("NaN");
+  assertFalse("`NaN` must not be enclosed with \"\" ", resultJson.charAt(nanIndex - 1) == '"');
+  assertFalse("`NaN` must not be enclosed with \"\" ", 

[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248188#comment-16248188
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150361681
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestJsonNonNumerics.java
 ---
@@ -0,0 +1,167 @@

[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248197#comment-16248197
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150360192
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
 store.format: "parquet",
 store.hive.optimize_scan_with_native_readers: false,
 store.json.all_text_mode: false,
+store.json.writer.non_numeric_numbers: false,
+store.json.reader.non_numeric_numbers: false,
--- End diff --

`allow_nan_inf`?




[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248196#comment-16248196
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150360424
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestJsonNonNumerics.java
 ---
@@ -0,0 +1,167 @@

[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248190#comment-16248190
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150359353
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertFrom.java
 ---
@@ -50,7 +51,7 @@ private JsonConvertFrom(){}
 @Output ComplexWriter writer;
 
 public void setup(){
-  jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false /* do not read numbers as doubles */);
+  jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false, false /* do not read numbers as doubles */);
--- End diff --

Here, the comment refers to the second-to-last item. Consider this:

```
false, // What is the first one?
false, // What is the second one?
false, // do not read numbers as doubles
false // Do not allow NaN, INF
```
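
One way to avoid positional boolean arguments altogether is to give each flag a name. The sketch below is hypothetical: `ReaderOptions` and its fields are invented for illustration and do not reflect the actual `JsonReader` constructor, whose first two boolean parameters are not identified in this thread.

```java
/** Hypothetical options holder; the names are illustrative, not Drill's API. */
final class ReaderOptions {

  final boolean readNumbersAsDouble;
  final boolean allowNanInf;

  private ReaderOptions(Builder builder) {
    this.readNumbersAsDouble = builder.readNumbersAsDouble;
    this.allowNanInf = builder.allowNanInf;
  }

  static Builder builder() {
    return new Builder();
  }

  /** Every flag defaults to false, so a call site names only what it changes. */
  static final class Builder {
    private boolean readNumbersAsDouble;
    private boolean allowNanInf;

    Builder readNumbersAsDouble(boolean value) {
      this.readNumbersAsDouble = value;
      return this;
    }

    Builder allowNanInf(boolean value) {
      this.allowNanInf = value;
      return this;
    }

    ReaderOptions build() {
      return new ReaderOptions(this);
    }
  }
}
```

A call such as `ReaderOptions.builder().allowNanInf(true).build()` then documents itself, and adding a further flag later does not shift the meaning of an existing trailing comment.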




[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248186#comment-16248186
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150359452
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertFrom.java
 ---
@@ -76,7 +77,7 @@ public void eval(){
 @Output ComplexWriter writer;
 
 public void setup(){
-  jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false /* do not read numbers as doubles */);
+  jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false, false /* do not read numbers as doubles */);
--- End diff --

See above. Why do we have two copies? Can we have a function that returns 
the reader using default configs?
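
A shared factory along the lines suggested here might look like the following sketch. Everything in it is an assumption made for illustration: `SketchJsonReader` is a stand-in that only records its flags (the real constructor also takes a buffer, omitted here), and `SketchJsonReaders` is a hypothetical helper, not an existing Drill class.

```java
/** Stand-in for the real reader; it only records the flags it was given. */
final class SketchJsonReader {

  final boolean firstFlag;            // meaning not stated in the thread
  final boolean secondFlag;           // meaning not stated in the thread
  final boolean readNumbersAsDouble;
  final boolean enableNonNumeric;

  SketchJsonReader(boolean firstFlag, boolean secondFlag,
                   boolean readNumbersAsDouble, boolean enableNonNumeric) {
    this.firstFlag = firstFlag;
    this.secondFlag = secondFlag;
    this.readNumbersAsDouble = readNumbersAsDouble;
    this.enableNonNumeric = enableNonNumeric;
  }
}

/** Hypothetical factory that keeps the default flag values in one place. */
final class SketchJsonReaders {

  private SketchJsonReaders() {
  }

  /** Reader with every option at its default (all false). */
  static SketchJsonReader createDefault() {
    return create(false);
  }

  /** Reader with default options except for non-numeric support. */
  static SketchJsonReader create(boolean enableNonNumeric) {
    return new SketchJsonReader(
        false, // first flag: meaning not stated in the thread
        false, // second flag: meaning not stated in the thread
        false, // do not read numbers as doubles
        enableNonNumeric);
  }
}
```

Each `setup()` body would then shrink to a single factory call, so a new constructor parameter is added in one place instead of in every copy.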




[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248194#comment-16248194
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150359565
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertFrom.java
 ---
@@ -91,4 +92,60 @@ public void eval(){
 }
   }
 
+  @FunctionTemplate(name = "convert_fromJSON", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL, isRandom = true)
+  public static class ConvertFromJsonVarcharNonNumerics implements DrillSimpleFunc{
+
+@Param VarCharHolder in;
+@Param BitHolder enableNonNumeric;
+@Inject DrillBuf buffer;
+@Workspace org.apache.drill.exec.vector.complex.fn.JsonReader jsonReader;
+
+@Output ComplexWriter writer;
+
+public void setup(){
+  jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false,/* do not read numbers as doubles */
+  enableNonNumeric.value == 1);
+}
+
+public void eval(){
+  try {
+jsonReader.setSource(in.start, in.end, in.buffer);
+jsonReader.write(writer);
+buffer = jsonReader.getWorkBuf();
+
+  } catch (Exception e) {
+throw new org.apache.drill.common.exceptions.DrillRuntimeException("Error while converting from JSON. ", e);
+  }
+}
+  }
+
+  @FunctionTemplate(name = "convert_fromJSON", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL, isRandom = true)
+  public static class ConvertFromJsonNonNumerics implements DrillSimpleFunc{
+
+@Param VarBinaryHolder in;
+@Param BitHolder enableNonNumeric;
+@Inject DrillBuf buffer;
+@Workspace org.apache.drill.exec.vector.complex.fn.JsonReader jsonReader;
+
+@Output ComplexWriter writer;
+
+public void setup(){
+  jsonReader = new org.apache.drill.exec.vector.complex.fn.JsonReader(buffer, false, false, false, /* do not read numbers as doubles */
--- End diff --

See above. We really don't want all these duplicate copies.


> Add non-numeric support for JSON processing
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>  Labels: doc-impacting
> Fix For: Future
>
>
> Add session options that allow Drill to work with non-standard JSON number 
> literals such as NaN, Infinity, and -Infinity. By default these options are 
> switched off; the user can toggle them during a working session.
> *For documentation*
> 1. Added two session options, {{store.json.reader.non_numeric_numbers}} and 
> {{store.json.writer.non_numeric_numbers}}, that allow reading and writing NaN 
> and Infinity as numbers. By default these options are set to false.
> 2. Extended the signatures of the {{convert_toJSON}} and {{convert_fromJSON}} 
> functions by adding an optional second parameter that enables reading and 
> writing NaN and Infinity.
> For example:
> {noformat}
> select convert_fromJSON('{"key": NaN}') from (values(1)); will result in a 
> JsonParseException, but
> select convert_fromJSON('{"key": NaN}', true) from (values(1)); will parse 
> NaN as a number.
> {noformat}





[jira] [Updated] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5919:
---
Labels: doc-impacting  (was: doc-impacting ready-to-commit)

> Add non-numeric support for JSON processing
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>  Labels: doc-impacting
> Fix For: Future
>
>
> Add session options that allow Drill to work with non-standard JSON number 
> literals such as NaN, Infinity, and -Infinity. By default these options are 
> switched off; the user can toggle them during a working session.
> *For documentation*
> 1. Added two session options, {{store.json.reader.non_numeric_numbers}} and 
> {{store.json.writer.non_numeric_numbers}}, that allow reading and writing NaN 
> and Infinity as numbers. By default these options are set to false.
> 2. Extended the signatures of the {{convert_toJSON}} and {{convert_fromJSON}} 
> functions by adding an optional second parameter that enables reading and 
> writing NaN and Infinity.
> For example:
> {noformat}
> select convert_fromJSON('{"key": NaN}') from (values(1)); will result in a 
> JsonParseException, but
> select convert_fromJSON('{"key": NaN}', true) from (values(1)); will parse 
> NaN as a number.
> {noformat}





[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248193#comment-16248193
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150360309
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
 store.format: "parquet",
 store.hive.optimize_scan_with_native_readers: false,
 store.json.all_text_mode: false,
+store.json.writer.non_numeric_numbers: false,
+store.json.reader.non_numeric_numbers: false,
--- End diff --

Have we tested all Drill's floating point methods to ensure that they 
correctly handle NaN and INF?
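
A minimal sketch of why this question matters, using only plain Java floating-point semantics. It does not exercise Drill's actual functions; the class name `NanInfSketch` and the helper `naiveMax` are hypothetical and exist only for illustration. It shows that a comparison-based implementation can silently drop or keep NaN depending on argument order:

```java
// Sketch only: plain Java semantics, not Drill's function implementations.
final class NanInfSketch {

    /** Hypothetical naive maximum built on '>'; its NaN handling depends on argument order. */
    static double naiveMax(double a, double b) {
        return a > b ? a : b;
    }

    public static void main(String[] args) {
        double nan = Double.NaN;
        double inf = Double.POSITIVE_INFINITY;

        // NaN is never equal to itself under '=='.
        System.out.println(nan == nan);
        // Double.compare() defines a total order and places NaN above +Infinity.
        System.out.println(Double.compare(nan, inf) > 0);
        // Infinity minus Infinity is NaN, so sums over such data can turn into NaN.
        System.out.println(Double.isNaN(inf - inf));
        // Every ordered comparison with NaN is false, so the second argument always wins.
        System.out.println(naiveMax(nan, 1.0));
        System.out.println(naiveMax(1.0, nan));
        // Math.max() propagates NaN regardless of argument order.
        System.out.println(Double.isNaN(Math.max(nan, 1.0)));
    }
}
```

Any function that relies on `<`, `>` or `==` internally would need a similar check once NaN and Infinity can enter through the JSON reader.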


> Add non-numeric support for JSON processing
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>  Labels: doc-impacting
> Fix For: Future
>
>
> Add session options that allow Drill to work with non-standard JSON number 
> literals such as NaN, Infinity, and -Infinity. By default these options are 
> switched off; the user can toggle them during a working session.
> *For documentation*
> 1. Added two session options, {{store.json.reader.non_numeric_numbers}} and 
> {{store.json.writer.non_numeric_numbers}}, that allow reading and writing NaN 
> and Infinity as numbers. By default these options are set to false.
> 2. Extended the signatures of the {{convert_toJSON}} and {{convert_fromJSON}} 
> functions by adding an optional second parameter that enables reading and 
> writing NaN and Infinity.
> For example:
> {noformat}
> select convert_fromJSON('{"key": NaN}') from (values(1)); will result in a 
> JsonParseException, but
> select convert_fromJSON('{"key": NaN}', true) from (values(1)); will parse 
> NaN as a number.
> {noformat}





[jira] [Commented] (DRILL-5919) Add non-numeric support for JSON processing

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248195#comment-16248195
 ] 

ASF GitHub Bot commented on DRILL-5919:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150359992
  
--- Diff: 
contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java
 ---
@@ -73,6 +73,7 @@
   private final MongoStoragePlugin plugin;
 
   private final boolean enableAllTextMode;
+  private final boolean enableNonNumericNumbers;
--- End diff --

Would recommend: `enableNanInf`.

"Non-numeric numbers" sounds like we might allow "foo" or "thirteen".


> Add non-numeric support for JSON processing
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>  Labels: doc-impacting
> Fix For: Future
>
>
> Add session options that allow Drill to work with non-standard JSON number 
> literals such as NaN, Infinity, and -Infinity. By default these options are 
> switched off; the user can toggle them during a working session.
> *For documentation*
> 1. Added two session options, {{store.json.reader.non_numeric_numbers}} and 
> {{store.json.writer.non_numeric_numbers}}, that allow reading and writing NaN 
> and Infinity as numbers. By default these options are set to false.
> 2. Extended the signatures of the {{convert_toJSON}} and {{convert_fromJSON}} 
> functions by adding an optional second parameter that enables reading and 
> writing NaN and Infinity.
> For example:
> {noformat}
> select convert_fromJSON('{"key": NaN}') from (values(1)); will result in a 
> JsonParseException, but
> select convert_fromJSON('{"key": NaN}', true) from (values(1)); will parse 
> NaN as a number.
> {noformat}





[jira] [Commented] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248121#comment-16248121
 ] 

ASF GitHub Bot commented on DRILL-5922:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1023
  
*Note for batch committer:* Please do not squash the two commits in this 
PR. Please see the discussion on **pom.xml** for details.


> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>  Labels: ready-to-commit
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
>  [jmockit-1.3.jar:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> mockit.internal.util.MethodReflection.invokeWithCheckedThrows(MethodReflection.java:95)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.internal.annotations.MockMethodBridge.callMock(MockMethodBridge.java:76)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.internal.annotations.MockMethodBridge.invoke(MockMethodBridge.java:41) 
> [jmockit-1.3.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java)
>  [junit-4.11.jar:na]
>   at 
> 

[jira] [Updated] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread Timothy Farkas (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas updated DRILL-5922:
--
Labels: ready-to-commit  (was: )

> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>  Labels: ready-to-commit
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
>  [jmockit-1.3.jar:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> mockit.internal.util.MethodReflection.invokeWithCheckedThrows(MethodReflection.java:95)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.internal.annotations.MockMethodBridge.callMock(MockMethodBridge.java:76)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.internal.annotations.MockMethodBridge.invoke(MockMethodBridge.java:41) 
> [jmockit-1.3.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
>  [junit-4.11.jar:na]
> {code}




[jira] [Updated] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly

2017-11-10 Thread Pritesh Maker (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-5899:
-
Reviewer: salim achouche

> Simple pattern matchers can work with DrillBuf directly
> ---
>
> Key: DRILL-5899
> URL: https://issues.apache.org/jira/browse/DRILL-5899
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Critical
>  Labels: ready-to-commit
>
> For the 4 simple patterns we have, i.e. startsWith, endsWith, contains and 
> constant, we do not need the overhead of charSequenceWrapper. We can work 
> with DrillBuf directly. This saves us from doing the isAscii check and UTF-8 
> decoding for each row.
> UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid 
> character. So, instead of decoding the varChar from each row we are processing, 
> we encode the patternString once during setup and do a raw byte comparison. 
> Instead of bounds checking and reading one byte at a time, we get the whole 
> buffer in one shot and use that for comparison.
> This improved overall performance of the filter operator by around 20%. 
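
The approach described above can be sketched roughly as follows. This is not the code from the pull request: it uses a plain `byte[]` in place of DrillBuf, and the class `Utf8PatternSketch` and its method names are hypothetical. It only illustrates encoding the pattern once and then comparing raw bytes over a `[start, end)` range:

```java
import java.nio.charset.StandardCharsets;

// Sketch only: a byte[] stands in for DrillBuf; names are illustrative.
final class Utf8PatternSketch {
    private final byte[] pattern;

    /** Encode the pattern once, at setup time. */
    Utf8PatternSketch(String patternString) {
        this.pattern = patternString.getBytes(StandardCharsets.UTF_8);
    }

    /** True if the value in buf[start, end) begins with the pattern bytes. */
    boolean startsWith(byte[] buf, int start, int end) {
        return end - start >= pattern.length && regionMatches(buf, start);
    }

    /** True if the value in buf[start, end) ends with the pattern bytes. */
    boolean endsWith(byte[] buf, int start, int end) {
        return end - start >= pattern.length && regionMatches(buf, end - pattern.length);
    }

    /** True if the pattern bytes occur anywhere in buf[start, end). */
    boolean contains(byte[] buf, int start, int end) {
        for (int i = start; i <= end - pattern.length; i++) {
            if (regionMatches(buf, i)) {
                return true;
            }
        }
        return false;
    }

    /** True if buf[start, end) is exactly the pattern (the "constant" case). */
    boolean constant(byte[] buf, int start, int end) {
        return end - start == pattern.length && regionMatches(buf, start);
    }

    // Raw byte comparison; callers guarantee offset + pattern.length is in range.
    private boolean regionMatches(byte[] buf, int offset) {
        for (int i = 0; i < pattern.length; i++) {
            if (buf[offset + i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "naive cafe" with accented characters, written as escapes to stay ASCII-safe.
        byte[] row = "na\u00efve caf\u00e9".getBytes(StandardCharsets.UTF_8);
        Utf8PatternSketch p = new Utf8PatternSketch("caf\u00e9");
        System.out.println(p.endsWith(row, 0, row.length));
        System.out.println(p.startsWith(row, 0, row.length));
    }
}
```

Byte-level matching is safe here because a valid UTF-8 pattern always begins with a lead byte, so a match cannot start in the middle of a multi-byte character of the row value.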





[jira] [Commented] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248080#comment-16248080
 ] 

ASF GitHub Bot commented on DRILL-5922:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r150348155
  
--- Diff: pom.xml ---
@@ -442,7 +442,7 @@
   
-Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
   -Ddrill.test.query.printing.silent=true
   -Ddrill.catastrophic_to_standard_out=true
-  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M
+  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
--- End diff --

I am in favor of 1 as well.

    @paul-rogers One of the tests that fails initially tries to allocate 2GB 
(without re-allocation). The test intermittently fails due to Pooled Allocator 
not releasing memory back to the system. I don't know if it is supposed to 
return memory back to the system when it is closed and whether it is supposed 
to be closed at all during the unit tests.


> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
>  [jmockit-1.3.jar:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 

[jira] [Commented] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248066#comment-16248066
 ] 

ASF GitHub Bot commented on DRILL-5922:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r150345195
  
--- Diff: pom.xml ---
@@ -442,7 +442,7 @@
   
-Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
   -Ddrill.test.query.printing.silent=true
   -Ddrill.catastrophic_to_standard_out=true
-  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M
+  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
--- End diff --

I'm in favor of 1 for now, then we can revert it, if needed, once the batch 
size work is available and we remove the offending test (option 3).


> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
>  [jmockit-1.3.jar:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> 

[jira] [Commented] (DRILL-5771) Fix serDe errors for format plugins

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248056#comment-16248056
 ] 

ASF GitHub Bot commented on DRILL-5771:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1014
  
    @paul-rogers Thanks for the explanation. Your explanation is in sync with 
Arina's descriptions in the ticket and with the code changes. The only point of 
confusion I have now is with regard to the Calcite changes you 
mentioned. I was under the impression that the following happened:

  1. FormatPlugins created a DrillTable, which held the format 
configuration. 
  2. The DrillTable is then encapsulated in a ScanOperator.
  3. The ScanOperator automagically gets serialized into the PhysicalPlan 
with the DrillTable and the corresponding format configuration.

But it seems like there is more Calcite machinery at work here. 
@arina-ielchiieva could you please point me to the general flow I need to look 
at in order to understand how the FormatPluginConfig interacts with Calcite and 
gets serialized into the PhysicalPlan?


> Fix serDe errors for format plugins
> ---
>
> Key: DRILL-5771
> URL: https://issues.apache.org/jira/browse/DRILL-5771
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.11.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Minor
> Fix For: 1.12.0
>
>
> Create unit tests to check that all storage format plugins can be 
> successfully serialized / deserialized.
> Usually this happens when a query has several major fragments. 
> One way to check serde is to generate the physical plan (generated as JSON) and 
> then submit it back to Drill.
> One example of the errors found is described in the first comment. Another 
> example is described in DRILL-5166.
> *Serde issues:*
> 1. Could not obtain format plugin during deserialization
> A format plugin is created based on the format plugin configuration or its name. 
> On Drill start-up we load information about the available plugins (it is reloaded 
> each time a storage plugin is updated, which can be done only by an admin).
> When a query is parsed, we try to get the plugin from the available ones; if we 
> cannot find one, we try to [create 
> one|https://github.com/apache/drill/blob/3e8b01d5b0d3013e3811913f0fd6028b22c1ac3f/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java#L136-L144]
> but at other query execution stages we always assume that the [plugin exists 
> based on 
> configuration|https://github.com/apache/drill/blob/3e8b01d5b0d3013e3811913f0fd6028b22c1ac3f/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java#L156-L162].
> For example, during query parsing we had to create a format plugin on one node 
> based on the format configuration.
> Then we sent a major fragment to a different node; when we used this 
> format configuration there, we could not get a format plugin based on it and 
> deserialization failed.
> To fix this problem we need to create the format plugin during query 
> deserialization if it's absent.
>   
> 2.  Absent hash code and equals.
> Format plugins are stored in hash map where key is format plugin config.
> Since some format plugin configs did not have overridden hash code and 
> equals, we could not find format plugin based on its configuration.
> 3. Named format plugin usage
> Named format plugins configs allow to get format plugin by its name for 
> configuration shared among all drillbits.
> They are used as alias for pre-configured format plugiins. User with admin 
> priliges can modify them at runtime.
> Named format plugins configs are used instead of sending all non-default 
> parameters of format plugin config, in this case only name is sent.
> Their usage in distributed system may cause raise conditions.
> For example, 
> 1. Query is submitted. 
> 2. Parquet format plugin is created with the following configuration 
> (autoCorrectCorruptDates=>true).
> 3. The named format plugin config is serialized with the name parquet.
> 4. Major fragment is sent to the different node.
> 5. Admin has changed parquet configuration for the alias 'parquet' on all 
> nodes to autoCorrectCorruptDates=>false.
> 6. Named format is deserialized on the different node into parquet format 
> plugin with configuration (autoCorrectCorruptDates=>false).
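
Serde issue 2 above (a config used as a hash map key without overridden hash code and equals) can be illustrated with a minimal sketch. The classes `ConfigWithoutEquals` and `ConfigWithEquals` are hypothetical stand-ins, not Drill's actual format plugin config classes; only the `autoCorrectCorruptDates` field name is taken from the description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class FormatConfigLookupSketch {

  // Hypothetical config that inherits identity-based equals/hashCode from Object.
  static final class ConfigWithoutEquals {
    final boolean autoCorrectCorruptDates;

    ConfigWithoutEquals(boolean autoCorrectCorruptDates) {
      this.autoCorrectCorruptDates = autoCorrectCorruptDates;
    }
  }

  // Hypothetical config that compares by field value.
  static final class ConfigWithEquals {
    final boolean autoCorrectCorruptDates;

    ConfigWithEquals(boolean autoCorrectCorruptDates) {
      this.autoCorrectCorruptDates = autoCorrectCorruptDates;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof ConfigWithEquals)) {
        return false;
      }
      return autoCorrectCorruptDates == ((ConfigWithEquals) o).autoCorrectCorruptDates;
    }

    @Override
    public int hashCode() {
      return Objects.hash(autoCorrectCorruptDates);
    }
  }

  // A deserialized config is a new object with the same field values,
  // so an identity-based key never matches the stored entry.
  static boolean lookupWithoutEquals() {
    Map<ConfigWithoutEquals, String> plugins = new HashMap<>();
    plugins.put(new ConfigWithoutEquals(true), "parquet plugin");
    return plugins.containsKey(new ConfigWithoutEquals(true));
  }

  // With value-based equals/hashCode the same lookup succeeds.
  static boolean lookupWithEquals() {
    Map<ConfigWithEquals, String> plugins = new HashMap<>();
    plugins.put(new ConfigWithEquals(true), "parquet plugin");
    return plugins.containsKey(new ConfigWithEquals(true));
  }

  public static void main(String[] args) {
    System.out.println("without equals/hashCode: " + lookupWithoutEquals());
    System.out.println("with equals/hashCode: " + lookupWithEquals());
  }
}
```

The first lookup misses and the second hits, which is the behavior the description attributes to configs that lacked these overrides.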





[jira] [Updated] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread Timothy Farkas (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas updated DRILL-5922:
--
Labels:   (was: ready-to-commit)

> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
>  [jmockit-1.3.jar:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> mockit.internal.util.MethodReflection.invokeWithCheckedThrows(MethodReflection.java:95)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.internal.annotations.MockMethodBridge.callMock(MockMethodBridge.java:76)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.internal.annotations.MockMethodBridge.invoke(MockMethodBridge.java:41) 
> [jmockit-1.3.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
>  [junit-4.11.jar:na]
> {code}





[jira] [Commented] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248039#comment-16248039
 ] 

ASF GitHub Bot commented on DRILL-5922:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r150339444
  
--- Diff: pom.xml ---
@@ -442,7 +442,7 @@
   
-Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
   -Ddrill.test.query.printing.silent=true
   -Ddrill.catastrophic_to_standard_out=true
-  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M
+  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
--- End diff --

@paul-rogers I think there are a few choices available to us:

 1. Keep this change in a separate commit (already done) and revert it once 
the Batch Size Limitation work is done. The benefit is that the build will not 
be broken until the Batch Size work is in. I can make a note on the Batch 
Size jira to revert this change as well.
 2. Remove this change and suffer with a broken build until the Batch Size 
work is complete. (not fun)
 3. Delete the offending tests instead as part of this PR.

I am in favor of option **1**. I am strongly against option **2** since 
ignoring build issues makes life much harder for everyone.

Please let me know what path forward you'd like to take.


> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>  Labels: ready-to-commit
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)

[jira] [Commented] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248025#comment-16248025
 ] 

ASF GitHub Bot commented on DRILL-5922:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r150336286
  
--- Diff: pom.xml ---
@@ -442,7 +442,7 @@
   
-Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
   -Ddrill.test.query.printing.silent=true
   -Ddrill.catastrophic_to_standard_out=true
-  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M
+  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
--- End diff --

This is a fix for a test that we don't really even need. We are attempting 
to verify that we can allocate a single value vector of 2 GB by doubling from a 
small amount. On the last allocation, we have a 1 GB vector doubling to 2 GB, 
so we temporarily need 3 GB.

But, vectors should never get this large. The batch size limitation project 
is doing work to limit vectors to the 16 MB Netty slab size.

For this reason, this is a workaround to a test for a feature that is 
actually a bug which we will be fixing in a later release.
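
The 3 GB figure above follows from the old and new buffers coexisting during the final doubling. A minimal sketch of that arithmetic, assuming a 1 MB starting size (a hypothetical value) and ignoring any rounding or bookkeeping a real allocator would add:

```java
public class VectorDoublingSketch {

  // Peak bytes held while a buffer grows by doubling from startBytes until it
  // reaches targetBytes. During each reallocation the old and the new buffer
  // are both live, so the peak is their sum.
  static long peakBytesWhileDoubling(long startBytes, long targetBytes) {
    long current = startBytes;
    long peak = current;
    while (current < targetBytes) {
      long next = current * 2;
      peak = Math.max(peak, current + next);
      current = next;
    }
    return peak;
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    long peak = peakBytesWhileDoubling(mb, 2048L * mb);
    System.out.println(peak / mb + " MB"); // prints "3072 MB"
  }
}
```

This only models the doubling itself; it does not explain what else the test process allocates.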


> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>  Labels: ready-to-commit
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
>  [jmockit-1.3.jar:na]
> 

[jira] [Commented] (DRILL-5936) Refactor MergingRecordBatch based on code inspection

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248023#comment-16248023
 ] 

ASF GitHub Bot commented on DRILL-5936:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1025#discussion_r150335985
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java
 ---
@@ -795,6 +788,8 @@ private void generateComparisons(final 
ClassGenerator g, final VectorAccessib
* @param node Reference to the next record to copy from the incoming 
batches
*/
   private boolean copyRecordToOutgoingBatch(final Node node) {
+assert outgoingPosition < OUTGOING_BATCH_SIZE
--- End diff --

Ok.  Updated changes lgtm. 


> Refactor MergingRecordBatch based on code inspection
> 
>
> Key: DRILL-5936
> URL: https://issues.apache.org/jira/browse/DRILL-5936
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> * Reorganize code to remove unnecessary {{pqueue.peek()}}
> * Reuse Node





[jira] [Commented] (DRILL-5936) Refactor MergingRecordBatch based on code inspection

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248008#comment-16248008
 ] 

ASF GitHub Bot commented on DRILL-5936:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1025#discussion_r150331389
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java
 ---
@@ -795,6 +788,8 @@ private void generateComparisons(final 
ClassGenerator g, final VectorAccessib
* @param node Reference to the next record to copy from the incoming 
batches
*/
   private boolean copyRecordToOutgoingBatch(final Node node) {
+assert outgoingPosition < OUTGOING_BATCH_SIZE
--- End diff --

I added the assert to avoid possible errors during further code 
refactoring. As it is an assert that will not affect performance in production 
and there is another assert already, I'd prefer to keep it.
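
For context on the production-cost point: Java `assert` statements are only evaluated when the JVM is started with assertions enabled (for example with `-ea`); by default they are skipped. A minimal sketch, where the class, the method body, and the batch size value are hypothetical and only the asserted condition mirrors the diff:

```java
public class AssertCostSketch {

  // Hypothetical small value, chosen only for illustration.
  static final int OUTGOING_BATCH_SIZE = 4;

  // Standard idiom: the assignment inside the assert runs only under -ea.
  static boolean assertionsEnabled() {
    boolean enabled = false;
    assert enabled = true;
    return enabled;
  }

  // Hypothetical stand-in for a per-record copy; returns the next position.
  static int copyRecord(int outgoingPosition) {
    assert outgoingPosition < OUTGOING_BATCH_SIZE : "outgoing batch is full";
    return outgoingPosition + 1;
  }

  public static void main(String[] args) {
    System.out.println("assertions enabled: " + assertionsEnabled());
    System.out.println("next position: " + copyRecord(0));
  }
}
```

Running with `java -ea AssertCostSketch` turns the check on, which is typically how test runs are configured; without the flag the condition is not evaluated.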


> Refactor MergingRecordBatch based on code inspection
> 
>
> Key: DRILL-5936
> URL: https://issues.apache.org/jira/browse/DRILL-5936
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> * Reorganize code to remove unnecessary {{pqueue.peek()}}
> * Reuse Node





[jira] [Commented] (DRILL-5936) Refactor MergingRecordBatch based on code inspection

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248003#comment-16248003
 ] 

ASF GitHub Bot commented on DRILL-5936:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1025#discussion_r150329246
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java
 ---
@@ -795,6 +788,8 @@ private void generateComparisons(final 
ClassGenerator g, final VectorAccessib
* @param node Reference to the next record to copy from the incoming 
batches
*/
   private boolean copyRecordToOutgoingBatch(final Node node) {
+assert outgoingPosition < OUTGOING_BATCH_SIZE
--- End diff --

Doing this assert check on each record would add some overhead.  Since the 
caller already is checking outgoingBatchHasSpace in the while loop, this check 
is not needed.  You could add a javadoc comment that caller is expected to do 
the validity check. 


> Refactor MergingRecordBatch based on code inspection
> 
>
> Key: DRILL-5936
> URL: https://issues.apache.org/jira/browse/DRILL-5936
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> * Reorganize code to remove unnecessary {{pqueue.peek()}}
> * Reuse Node





[jira] [Commented] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247998#comment-16247998
 ] 

ASF GitHub Bot commented on DRILL-5922:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r150327067
  
--- Diff: pom.xml ---
@@ -442,7 +442,7 @@
   
-Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
   -Ddrill.test.query.printing.silent=true
   -Ddrill.catastrophic_to_standard_out=true
-  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M
+  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
--- End diff --

I will move DRILL-5926 into a separate commit.


> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>  Labels: ready-to-commit
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
>  [jmockit-1.3.jar:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> mockit.internal.util.MethodReflection.invokeWithCheckedThrows(MethodReflection.java:95)
>  

[jira] [Updated] (DRILL-5936) Refactor MergingRecordBatch based on code inspection

2017-11-10 Thread Vlad Rozov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vlad Rozov updated DRILL-5936:
--
Summary: Refactor MergingRecordBatch based on code inspection  (was: 
Refactor MergingRecordBatch based on code review)

> Refactor MergingRecordBatch based on code inspection
> 
>
> Key: DRILL-5936
> URL: https://issues.apache.org/jira/browse/DRILL-5936
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> * Reorganize code to remove unnecessary {{pqueue.peek()}}
> * Reuse Node





[jira] [Commented] (DRILL-5771) Fix serDe errors for format plugins

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247961#comment-16247961
 ] 

ASF GitHub Bot commented on DRILL-5771:
---

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1014
  
@ilooner, there are two phases of issues. The problem is in the planner, 
but we've been talking about serialization out to the workers.

Here, we can learn from the work Arina did with dynamic UDFs, especially in 
the synchronization aspects as it has potential challenges similar to what we 
have here.

Let's walk through the lifecycle.

* User A defines a storage plugin config, call it P.
* User B runs a query Q that references P.
* At the same time, user A changes P to P'.
* Query Q is distributed to nodes.
* At the same time, user A changes P again to produce P''.

This is a classic distributed system synchronization problem. As we 
discussed, once the query is planned, the contents of the plugin definition are 
serialized as part of the physical plan for each fragment. As a result we 
"freeze" the contents of P at the time of serialization.

The problem that Arina seems to be describing is during planning. Rather 
than taking a copy of P at some fixed point in time and then always using that 
copy, we seem to be holding a named reference to P. This introduces the obvious 
synchronization issues.

So, what we should do is, once a query resolves P, the query takes a copy 
and uses that copy for the rest of query planning and execution.

This resolves race conditions by saying that a query (on all nodes) uses 
the version of the plugin definition that existed at the moment that the 
reference to the plugin definition was first resolved by that query. The query 
will be oblivious to all subsequent updates.

Since plugin definition is done as part of workspace and table definition 
in Calcite, this is likely to be a tricky change; it is not clear that Calcite 
offers a per-query context to hold such information. Arina probably can offer 
advice on this front.
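
The "resolve once, then keep using that copy" idea can be sketched as follows. All names here (`PluginConfig`, `Registry`, `QueryContext`) are hypothetical and do not correspond to Drill or Calcite classes; the sketch assumes the plugin definition is immutable, so holding the resolved reference is equivalent to holding a copy.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PluginSnapshotSketch {

  // Hypothetical immutable plugin definition.
  static final class PluginConfig {
    final String name;
    final int version;

    PluginConfig(String name, int version) {
      this.name = name;
      this.version = version;
    }
  }

  // Hypothetical shared registry that an admin may update at any time.
  static final class Registry {
    private final Map<String, PluginConfig> configs = new ConcurrentHashMap<>();

    void update(PluginConfig config) {
      configs.put(config.name, config);
    }

    PluginConfig current(String name) {
      return configs.get(name);
    }
  }

  // Hypothetical per-query context: the first resolution of a name is
  // remembered and reused for the rest of the query.
  static final class QueryContext {
    private final Registry registry;
    private final Map<String, PluginConfig> resolved = new HashMap<>();

    QueryContext(Registry registry) {
      this.registry = registry;
    }

    PluginConfig resolve(String name) {
      return resolved.computeIfAbsent(name, registry::current);
    }
  }

  // Returns the versions seen by query Q before and after an update,
  // followed by the version seen by a query that starts after the update.
  static int[] versionsSeen() {
    Registry registry = new Registry();
    registry.update(new PluginConfig("P", 1));

    QueryContext queryQ = new QueryContext(registry);
    int first = queryQ.resolve("P").version;

    registry.update(new PluginConfig("P", 2)); // user A changes P to P'
    int second = queryQ.resolve("P").version;

    int laterQuery = new QueryContext(registry).resolve("P").version;
    return new int[] {first, second, laterQuery};
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(versionsSeen())); // prints "[1, 1, 2]"
  }
}
```

Query Q keeps seeing version 1 after the update, while a query that starts later sees version 2. Where such a per-query context would live in the planner is the open question raised above and is not addressed by this sketch.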



> Fix serDe errors for format plugins
> ---
>
> Key: DRILL-5771
> URL: https://issues.apache.org/jira/browse/DRILL-5771
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.11.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Minor
> Fix For: 1.12.0
>
>
> Create unit tests to check that all storage format plugins can be 
> successfully serialized  / deserialized.
> Usually this happens when query has several major fragments. 
> One way to check serde is to generate physical plan (generated as json) and 
> then submit it back to Drill.
> One example of found errors is described in the first comment. Another 
> example is described in DRILL-5166.
> *Serde issues:*
> 1. Could not obtain format plugin during deserialization
> Format plugin is created based on format plugin configuration or its name. 
> On Drill startup we load information about the available plugins (it is reloaded 
> each time a storage plugin is updated, which only an admin can do).
> When a query is parsed, we try to get the plugin from the available ones; if we 
> cannot find one, we try to [create 
> one|https://github.com/apache/drill/blob/3e8b01d5b0d3013e3811913f0fd6028b22c1ac3f/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java#L136-L144]
> but on other query execution stages we always assume that [plugin exists 
> based on 
> configuration|https://github.com/apache/drill/blob/3e8b01d5b0d3013e3811913f0fd6028b22c1ac3f/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java#L156-L162].
> For example, during query parsing we had to create format plugin on one node 
> based on format configuration.
> Then we sent the major fragment to a different node. There we could not get 
> the format plugin from this format configuration, and deserialization 
> failed.
> To fix this problem we need to create format plugin during query 
> deserialization if it's absent.
>   
> 2.  Absent hash code and equals.
> Format plugins are stored in hash map where key is format plugin config.
> Since some format plugin configs did not have overridden hash code and 
> equals, we could not find format plugin based on its configuration.
> 3. Named format plugin usage
> Named format plugin configs allow getting a format plugin by name for a 
> configuration shared among all drillbits.
> They are used as aliases for pre-configured format plugins. A user with admin 
> privileges can modify them at runtime.
> Named format plugin configs are used instead of sending all non-default 
> parameters of a format plugin config; in this case only the name is sent.
> Their usage in 

[jira] [Commented] (DRILL-5936) Refactor MergingRecordBatch based on code review

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247941#comment-16247941
 ] 

ASF GitHub Bot commented on DRILL-5936:
---

Github user amansinha100 commented on the issue:

https://github.com/apache/drill/pull/1025
  
+1  with a minor comment.   In the commit message and JIRA it would be 
better to say 'code inspection'  instead of code review which may be 
interpreted to mean the normal code review process. 


> Refactor MergingRecordBatch based on code review
> 
>
> Key: DRILL-5936
> URL: https://issues.apache.org/jira/browse/DRILL-5936
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> * Reorganize code to remove unnecessary {{pqueue.peek()}}
> * Reuse Node





[jira] [Commented] (DRILL-5936) Refactor MergingRecordBatch based on code review

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247933#comment-16247933
 ] 

ASF GitHub Bot commented on DRILL-5936:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1025#discussion_r150310556
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java
 ---
@@ -177,11 +177,11 @@ public IterOutcome innerNext() {
 }
 boolean schemaChanged = false;
 
-if (prevBatchWasFull) {
+if (!prevBatchNotFull) {
--- End diff --

The double negative makes it somewhat confusing.  Perhaps rename the 
variable to 'prevBatchHasSpace' . 


> Refactor MergingRecordBatch based on code review
> 
>
> Key: DRILL-5936
> URL: https://issues.apache.org/jira/browse/DRILL-5936
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> * Reorganize code to remove unnecessary {{pqueue.peek()}}
> * Reuse Node





[jira] [Commented] (DRILL-5771) Fix serDe errors for format plugins

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247895#comment-16247895
 ] 

ASF GitHub Bot commented on DRILL-5771:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1014
  
@paul-rogers I follow your explanation except for the last paragraph.

The Drill UI lets me update storage plugins and their formatters at 
runtime (at least on my laptop) by going to the **Storage** tab and clicking on 
**Update** for a plugin. It also lets me add storage plugins provided the 
necessary jars are in the classpath. I thought the goal of this change was to 
address the fact that such a runtime update cannot be propagated to the cluster 
atomically, so it is unreliable to reference a FormatPluginConfig by name 
(which was done previously). It is, however, safe to copy the entire 
FormatPluginConfig into the query plan as you said. Is this accurate?
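
To make the question concrete, here is a rough sketch of the two strategies, under stated assumptions. `FormatConfig` and `REGISTRY` are hypothetical stand-ins for a FormatPluginConfig and a node-local table of named configs; neither is Drill's actual API. A fragment that carries only the alias sees whatever the registry holds when it is resolved, while a fragment that carries a copy of the config sees the value captured at planning time.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class PlanConfigSketch {
  // Hypothetical immutable config; stands in for a FormatPluginConfig.
  static final class FormatConfig {
    final boolean autoCorrectCorruptDates;

    FormatConfig(boolean autoCorrectCorruptDates) {
      this.autoCorrectCorruptDates = autoCorrectCorruptDates;
    }
  }

  // Hypothetical registry of named configs that an admin may change at runtime.
  static final Map<String, FormatConfig> REGISTRY = new ConcurrentHashMap<>();

  // The fragment carries only the alias; the value is looked up at execution time.
  static boolean resolveByName(String alias) {
    return REGISTRY.get(alias).autoCorrectCorruptDates;
  }

  // The fragment carries the whole config captured at planning time.
  static boolean resolveEmbedded(FormatConfig captured) {
    return captured.autoCorrectCorruptDates;
  }

  public static void main(String[] args) {
    REGISTRY.put("parquet", new FormatConfig(true));
    FormatConfig captured = REGISTRY.get("parquet");  // planning time
    REGISTRY.put("parquet", new FormatConfig(false)); // admin update in between
    System.out.println(resolveByName("parquet"));     // false: sees the update
    System.out.println(resolveEmbedded(captured));    // true: unaffected by it
  }
}
```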


> Fix serDe errors for format plugins
> ---
>
> Key: DRILL-5771
> URL: https://issues.apache.org/jira/browse/DRILL-5771
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.11.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Minor
> Fix For: 1.12.0
>
>
> Create unit tests to check that all storage format plugins can be 
> successfully serialized / deserialized.
> Usually serialization happens when a query has several major fragments. 
> One way to check serde is to generate a physical plan (generated as JSON) and 
> then submit it back to Drill.
> One example of the errors found is described in the first comment. Another 
> example is described in DRILL-5166.
> *Serde issues:*
> 1. Could not obtain format plugin during deserialization
> A format plugin is created based on the format plugin configuration or its name. 
> On Drill startup we load information about the available plugins (it is reloaded 
> each time a storage plugin is updated, which can be done only by an admin).
> When a query is parsed, we try to get the plugin from the available ones; if we 
> cannot find one, we try to [create 
> one|https://github.com/apache/drill/blob/3e8b01d5b0d3013e3811913f0fd6028b22c1ac3f/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java#L136-L144]
> but on other query execution stages we always assume that [plugin exists 
> based on 
> configuration|https://github.com/apache/drill/blob/3e8b01d5b0d3013e3811913f0fd6028b22c1ac3f/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java#L156-L162].
> For example, during query parsing we had to create a format plugin on one node 
> based on the format configuration.
> Then we sent a major fragment to a different node. When that node used this 
> format configuration, it could not get a format plugin based on it, and 
> deserialization failed.
> To fix this problem, we need to create the format plugin during query 
> deserialization if it is absent.
>   
> 2.  Absent hash code and equals.
> Format plugins are stored in a hash map where the key is the format plugin config.
> Since some format plugin configs did not override hash code and 
> equals, we could not find the format plugin based on its configuration.
> 3. Named format plugin usage
> Named format plugin configs allow getting a format plugin by its name for 
> configuration shared among all drillbits.
> They are used as aliases for pre-configured format plugins. A user with admin 
> privileges can modify them at runtime.
> Named format plugin configs are used instead of sending all non-default 
> parameters of the format plugin config; in this case only the name is sent.
> Their usage in a distributed system may cause race conditions.
> For example, 
> 1. Query is submitted. 
> 2. Parquet format plugin is created with the following configuration 
> (autoCorrectCorruptDates=>true).
> 3. The named format plugin config is serialized with the name parquet.
> 4. The major fragment is sent to a different node.
> 5. An admin changes the parquet configuration for the alias 'parquet' on all 
> nodes to autoCorrectCorruptDates=>false.
> 6. Named format is deserialized on the different node into parquet format 
> plugin with configuration (autoCorrectCorruptDates=>false).
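
The "absent hash code and equals" problem from item 2 can be illustrated with a short sketch. The two config classes below are hypothetical and are not Drill's format plugin configs; they only show why a deserialized copy of a config cannot be used as a hash map key unless the class overrides both `equals` and `hashCode`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class ConfigKeySketch {
  // Hypothetical config that inherits identity-based equals/hashCode from Object.
  static final class NoEqualsConfig {
    final boolean autoCorrectCorruptDates;

    NoEqualsConfig(boolean autoCorrectCorruptDates) {
      this.autoCorrectCorruptDates = autoCorrectCorruptDates;
    }
  }

  // Hypothetical config with value-based equals/hashCode.
  static final class ValueConfig {
    final boolean autoCorrectCorruptDates;

    ValueConfig(boolean autoCorrectCorruptDates) {
      this.autoCorrectCorruptDates = autoCorrectCorruptDates;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof ValueConfig)) {
        return false;
      }
      return autoCorrectCorruptDates == ((ValueConfig) o).autoCorrectCorruptDates;
    }

    @Override
    public int hashCode() {
      return Objects.hash(autoCorrectCorruptDates);
    }
  }

  public static void main(String[] args) {
    Map<Object, String> plugins = new HashMap<>();
    plugins.put(new NoEqualsConfig(true), "plugin-a");
    plugins.put(new ValueConfig(true), "plugin-b");

    // Fresh instances play the role of configs deserialized on another node.
    System.out.println(plugins.get(new NoEqualsConfig(true))); // null
    System.out.println(plugins.get(new ValueConfig(true)));    // plugin-b
  }
}
```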





[jira] [Commented] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247871#comment-16247871
 ] 

ASF GitHub Bot commented on DRILL-5922:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r150302692
  
--- Diff: pom.xml ---
@@ -442,7 +442,7 @@
   
-Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
   -Ddrill.test.query.printing.silent=true
   -Ddrill.catastrophic_to_standard_out=true
-  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M
+  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
--- End diff --

I could not properly test my PR because of a pre-existing sporadic test 
failure. This fix was necessary in order to test my changes. Errors with the 
builds should be fixed immediately, especially when they interfere with testing 
a change. So I will keep the fix in this PR.


> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>  Labels: ready-to-commit
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
>  [jmockit-1.3.jar:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> 

[jira] [Commented] (DRILL-5657) Implement size-aware result set loader

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247833#comment-16247833
 ] 

ASF GitHub Bot commented on DRILL-5657:
---

Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/914#discussion_r149759147
  
--- Diff: exec/vector/src/main/codegen/templates/ColumnAccessors.java ---
@@ -191,141 +180,268 @@ public void bind(RowIndex vectorIndex, ValueVector 
vector) {
 <#if accessorType=="BigDecimal">
   <#assign label="Decimal">
 
+<#if drillType == "VarChar" || drillType == "Var16Char">
+  <#assign accessorType = "byte[]">
+  <#assign label = "Bytes">
+
 <#if ! notyet>
   
//
   // ${drillType} readers and writers
 
-  public static class ${drillType}ColumnReader extends 
AbstractColumnReader {
+  public static class ${drillType}ColumnReader extends BaseScalarReader {
 
-<@bindReader "" drillType />
+<@bindReader "" drillType false />
 
-<@getType label />
+<@getType drillType label />
 
 <@get drillType accessorType label false/>
   }
 
-  public static class Nullable${drillType}ColumnReader extends 
AbstractColumnReader {
+  public static class Nullable${drillType}ColumnReader extends 
BaseScalarReader {
 
-<@bindReader "Nullable" drillType />
+<@bindReader "Nullable" drillType false />
 
-<@getType label />
+<@getType drillType label />
 
 @Override
 public boolean isNull() {
-  return accessor().isNull(vectorIndex.index());
-}
-
-<@get drillType accessorType label false/>
-  }
-
-  public static class Repeated${drillType}ColumnReader extends 
AbstractArrayReader {
-
-<@bindReader "Repeated" drillType />
-
-<@getType label />
-
-@Override
-public int size() {
-  return accessor().getInnerValueCountAt(vectorIndex.index());
+  return accessor().isNull(vectorIndex.vectorIndex());
 }
 
-<@get drillType accessorType label true/>
+<@get drillType accessorType label false />
   }
 
-  public static class ${drillType}ColumnWriter extends 
AbstractColumnWriter {
+  public static class Repeated${drillType}ColumnReader extends 
BaseElementReader {
 
-<@bindWriter "" drillType />
+<@bindReader "" drillType true />
 
-<@getType label />
+<@getType drillType label />
 
-<@set drillType accessorType label false "set" />
+<@get drillType accessorType label true />
   }
 
-  public static class Nullable${drillType}ColumnWriter extends 
AbstractColumnWriter {
-
-<@bindWriter "Nullable" drillType />
+  <#assign varWidth = drillType == "VarChar" || drillType == 
"Var16Char" || drillType == "VarBinary" />
+  <#if varWidth>
+  public static class ${drillType}ColumnWriter extends BaseVarWidthWriter {
+  <#else>
+  public static class ${drillType}ColumnWriter extends 
BaseFixedWidthWriter {
+<#if drillType = "Decimal9" || drillType == "Decimal18" ||
+ drillType == "Decimal28Sparse" || drillType == 
"Decimal38Sparse">
+private MajorType type;
+
+private static final int VALUE_WIDTH = ${drillType}Vector.VALUE_WIDTH;
+  
+private final ${drillType}Vector vector;
+
+public ${drillType}ColumnWriter(final ValueVector vector) {
+  <#if varWidth>
+  super(((${drillType}Vector) vector).getOffsetVector());
+  <#else>
+<#if drillType = "Decimal9" || drillType == "Decimal18" ||
+ drillType == "Decimal28Sparse" || drillType == 
"Decimal38Sparse">
+  type = vector.getField().getType();
+
+  
+  this.vector = (${drillType}Vector) vector;
+}
 
-<@getType label />
+@Override public ValueVector vector() { return vector; }
 
+<#-- All change of buffer comes through this function to allow 
capturing
+ the buffer address and capacity. Only two ways to set the 
buffer:
+ by binding to a vector in bindVector(), or by resizing the 
vector
+ in writeIndex(). -->
 @Override
-public void setNull() {
-  mutator.setNull(vectorIndex.index());
+protected final void setAddr() {
+  final DrillBuf buf = vector.getBuffer();
+  bufAddr = buf.addr();
+  <#if varWidth>
+  capacity = buf.capacity();
+  <#else>
+  <#-- Turns out that keeping track of capacity as the count of
+   values 

[jira] [Commented] (DRILL-5657) Implement size-aware result set loader

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247830#comment-16247830
 ] 

ASF GitHub Bot commented on DRILL-5657:
---

Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/914#discussion_r149764036
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/writer/BaseScalarWriter.java
 ---
@@ -0,0 +1,264 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.vector.accessor.writer;
+
+import java.math.BigDecimal;
+
+import org.apache.drill.exec.vector.accessor.ColumnWriterIndex;
+import org.apache.drill.exec.vector.accessor.impl.HierarchicalFormatter;
+import org.joda.time.Period;
+
+/**
+ * Column writer implementation that acts as the basis for the
+ * generated, vector-specific implementations. All set methods
+ * throw an exception; subclasses simply override the supported
+ * method(s).
+ * 
+ * The only tricky part to this class is understanding the
+ * state of the write indexes as the write proceeds. There are
+ * two pointers to consider:
+ * 
+ * lastWriteIndex: The position in the vector at which the
+ * client last asked us to write data. This index is maintained
+ * in this class because it depends only on the actions of this
+ * class.
+ * vectorIndex: The position in the vector at which we will
+ * write if the client chooses to write a value at this time.
+ * The vector index is shared by all columns at the same repeat
+ * level. It is incremented as the client steps through the write
+ * and is observed in this class each time a write occurs.
+ * 
+ * A repeat level is defined as any of the following:
+ * 
+ * The set of top-level scalar columns, or those within a
+ * top-level, non-repeated map, or nested to any depth within
+ * non-repeated maps rooted at the top level.
+ * The values for a single scalar array.
+ * The set of scalar columns within a repeated map, or
+ * nested within non-repeated maps within a repeated map.
+ * 
+ * Items at a repeat level index together and share a vector
+ * index. However, the columns within a repeat level
+ * do not share a last write index: some can lag further
+ * behind than others.
+ * 
+ * Let's illustrate the states. Let's focus on one column and
+ * illustrate the three states that can occur during write:
+ * 
+ * Behind: the last write index is more than one position behind
+ * the vector index. Zero-filling will be needed to catch up to
+ * the vector index.
+ * Written: the last write index is the same as the vector
+ * index because the client wrote data at this position (and previous
+ * values were back-filled with nulls, empties or zeros.)
+ * Unwritten: the last write index is one behind the vector
+ * index. This occurs when the column was written, then the client
+ * moved to the next row or array position.
+ * Restarted: The current row is abandoned (perhaps filtered
+ * out) and is to be rewritten. The last write position moves
+ * back one position. Note that, the Restarted state is
+ * indistinguishable from the unwritten state: the only real
+ * difference is that the current slot (pointed to by the
+ * vector index) contains the previous written value that must
+ * be overwritten or back-filled. But, this is fine, because we
+ * assume that unwritten values are garbage anyway.
+ * 
+ * To illustrate:
+ *  Behind  WrittenUnwrittenRestarted
+ *   |X|  |X| |X|  |X|
+ *   lw >|X|  |X| |X|  |X|
+ *   | |  |0| |0| lw > |0|
+ *v >| |  lw, v > |X|lw > |X|  v > |X|
+ *v > | |
+ * 
+ * The illustrated state transitions are:
+ * 
+ * Suppose the state starts in Behind.
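
The relationship between the two indexes described in the Javadoc above can be sketched as follows. This is a simplified illustration under my own assumptions, not the BaseScalarWriter implementation: `WriteIndexSketch`, its `int[]` backing store, and the `state()` helper are hypothetical. The state is derived purely from how far the last write index lags the vector index, and a write zero-fills any skipped positions.

```java
final class WriteIndexSketch {
  private final int[] values;
  private int lastWriteIndex = -1; // nothing written yet

  WriteIndexSketch(int capacity) {
    values = new int[capacity];
  }

  // Write at the current vector index, zero-filling positions that were skipped.
  void setInt(int vectorIndex, int value) {
    for (int i = lastWriteIndex + 1; i < vectorIndex; i++) {
      values[i] = 0;
    }
    values[vectorIndex] = value;
    lastWriteIndex = vectorIndex;
  }

  // Abandon the row at vectorIndex: the last write position moves back one slot.
  void restartRow(int vectorIndex) {
    lastWriteIndex = Math.min(lastWriteIndex, vectorIndex - 1);
  }

  // Classify the column by the lag between the two indexes.
  String state(int vectorIndex) {
    int lag = vectorIndex - lastWriteIndex;
    if (lag == 0) {
      return "Written";
    }
    if (lag == 1) {
      return "Unwritten"; // also covers Restarted, which looks identical
    }
    return "Behind";
  }

  int get(int index) {
    return values[index];
  }

  public static void main(String[] args) {
    WriteIndexSketch w = new WriteIndexSketch(8);
    w.setInt(0, 10);
    System.out.println(w.state(0)); // Written
    System.out.println(w.state(1)); // Unwritten
    System.out.println(w.state(3)); // Behind
    w.setInt(3, 40);                // back-fills positions 1 and 2 with zeros
    w.restartRow(3);
    System.out.println(w.state(3)); // Unwritten
  }
}
```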
  

[jira] [Commented] (DRILL-5657) Implement size-aware result set loader

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247832#comment-16247832
 ] 

ASF GitHub Bot commented on DRILL-5657:
---

Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/914#discussion_r149765040
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/record/MaterializedField.java 
---
@@ -168,6 +174,58 @@ public boolean equals(Object obj) {
 Objects.equals(this.type, other.type);
   }
 
+  public boolean isEquivalent(MaterializedField other) {
+if (! name.equalsIgnoreCase(other.name)) {
+  return false;
+}
+
+// Requires full type equality, including fields such as precision and 
scale.
+// But, unset fields are equivalent to 0. Can't use the 
protobuf-provided
+// isEquals(), that treats set and unset fields as different.
+
+if (type.getMinorType() != other.type.getMinorType()) {
+  return false;
+}
+if (type.getMode() != other.type.getMode()) {
+  return false;
+}
+if (type.getScale() != other.type.getScale()) {
+  return false;
+}
+if (type.getPrecision() != other.type.getPrecision()) {
+  return false;
+}
+
+// Compare children -- but only for maps, not the internal children
+// for Varchar, repeated or nullable types.
+
+if (type.getMinorType() != MinorType.MAP) {
+  return true;
+}
+
+if (children == null  ||  other.children == null) {
+  return children == other.children;
+}
+if (children.size() != other.children.size()) {
+  return false;
+}
+
+// Maps are name-based, not position. But, for our
+// purposes, we insist on identical ordering.
+
+Iterator thisIter = children.iterator();
+Iterator otherIter = other.children.iterator();
+while (thisIter.hasNext()) {
--- End diff --

isEquivalent requires identical ordering which is a stronger requirement 
than the guarantee that the children list is providing. Could we use contains() 
to find the child and then apply isEquivalent recursively?
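
One way the order-insensitive comparison could look, as a sketch only. `Field` below is a hypothetical, heavily simplified stand-in for MaterializedField (a name, a type tag, and children). It uses a name-keyed lookup rather than `contains()`, which is my substitution, and it assumes child names are unique within a map when compared case-insensitively.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class FieldEquivalenceSketch {
  // Hypothetical simplified field: not the real MaterializedField.
  static final class Field {
    final String name;
    final String type;
    final List<Field> children = new ArrayList<>();

    Field(String name, String type) {
      this.name = name;
      this.type = type;
    }

    Field add(Field child) {
      children.add(child);
      return this;
    }

    // Equivalent if name and type match and every child has an equivalent
    // child of the same name on the other side, regardless of position.
    boolean isEquivalent(Field other) {
      if (!name.equalsIgnoreCase(other.name) || !type.equals(other.type)) {
        return false;
      }
      if (children.size() != other.children.size()) {
        return false;
      }
      // Assumes unique child names, so one map entry per child.
      Map<String, Field> byName = new HashMap<>();
      for (Field child : other.children) {
        byName.put(child.name.toLowerCase(Locale.ROOT), child);
      }
      for (Field child : children) {
        Field match = byName.get(child.name.toLowerCase(Locale.ROOT));
        if (match == null || !child.isEquivalent(match)) {
          return false;
        }
      }
      return true;
    }
  }

  public static void main(String[] args) {
    Field a = new Field("m", "MAP").add(new Field("x", "INT")).add(new Field("y", "VARCHAR"));
    Field b = new Field("M", "MAP").add(new Field("Y", "VARCHAR")).add(new Field("x", "INT"));
    System.out.println(a.isEquivalent(b)); // true: same children, different order
  }
}
```

Whether the caller can tolerate reordered children is a separate question; the code comment in the diff says ordering is insisted on deliberately.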



> Implement size-aware result set loader
> --
>
> Key: DRILL-5657
> URL: https://issues.apache.org/jira/browse/DRILL-5657
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: Future
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: Future
>
>
> A recent extension to Drill's set of test tools created a "row set" 
> abstraction to allow us to create, and verify, record batches with very few 
> lines of code. Part of this work involved creating a set of "column 
> accessors" in the vector subsystem. Column readers provide a uniform API to 
> obtain data from columns (vectors), while column writers provide a uniform 
> writing interface.
> DRILL-5211 discusses a set of changes to limit value vectors to 16 MB in size 
> (to avoid memory fragmentation due to Drill's two memory allocators.) The 
> column accessors have proven to be so useful that they will be the basis for 
> the new, size-aware writers used by Drill's record readers.
> A step in that direction is to retrofit the column writers to use the 
> size-aware {{setScalar()}} and {{setArray()}} methods introduced in 
> DRILL-5517.
> Since the test framework row set classes are (at present) the only consumer 
> of the accessors, those classes must also be updated with the changes.
> This then allows us to add a new "row mutator" class that handles size-aware 
> vector writing, including the case in which a vector fills in the middle of a 
> row.





[jira] [Commented] (DRILL-5657) Implement size-aware result set loader

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247831#comment-16247831
 ] 

ASF GitHub Bot commented on DRILL-5657:
---

Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/914#discussion_r149758695
  
--- Diff: exec/vector/src/main/codegen/templates/ColumnAccessors.java ---
@@ -191,141 +180,268 @@ public void bind(RowIndex vectorIndex, ValueVector 
vector) {
 <#if accessorType=="BigDecimal">
   <#assign label="Decimal">
 
+<#if drillType == "VarChar" || drillType == "Var16Char">
+  <#assign accessorType = "byte[]">
+  <#assign label = "Bytes">
+
 <#if ! notyet>
   
//
   // ${drillType} readers and writers
 
-  public static class ${drillType}ColumnReader extends 
AbstractColumnReader {
+  public static class ${drillType}ColumnReader extends BaseScalarReader {
 
-<@bindReader "" drillType />
+<@bindReader "" drillType false />
 
-<@getType label />
+<@getType drillType label />
 
 <@get drillType accessorType label false/>
   }
 
-  public static class Nullable${drillType}ColumnReader extends 
AbstractColumnReader {
+  public static class Nullable${drillType}ColumnReader extends 
BaseScalarReader {
 
-<@bindReader "Nullable" drillType />
+<@bindReader "Nullable" drillType false />
 
-<@getType label />
+<@getType drillType label />
 
 @Override
 public boolean isNull() {
-  return accessor().isNull(vectorIndex.index());
-}
-
-<@get drillType accessorType label false/>
-  }
-
-  public static class Repeated${drillType}ColumnReader extends 
AbstractArrayReader {
-
-<@bindReader "Repeated" drillType />
-
-<@getType label />
-
-@Override
-public int size() {
-  return accessor().getInnerValueCountAt(vectorIndex.index());
+  return accessor().isNull(vectorIndex.vectorIndex());
 }
 
-<@get drillType accessorType label true/>
+<@get drillType accessorType label false />
   }
 
-  public static class ${drillType}ColumnWriter extends 
AbstractColumnWriter {
+  public static class Repeated${drillType}ColumnReader extends 
BaseElementReader {
 
-<@bindWriter "" drillType />
+<@bindReader "" drillType true />
 
-<@getType label />
+<@getType drillType label />
 
-<@set drillType accessorType label false "set" />
+<@get drillType accessorType label true />
   }
 
-  public static class Nullable${drillType}ColumnWriter extends 
AbstractColumnWriter {
-
-<@bindWriter "Nullable" drillType />
+  <#assign varWidth = drillType == "VarChar" || drillType == 
"Var16Char" || drillType == "VarBinary" />
+  <#if varWidth>
+  public static class ${drillType}ColumnWriter extends BaseVarWidthWriter {
+  <#else>
+  public static class ${drillType}ColumnWriter extends 
BaseFixedWidthWriter {
+<#if drillType = "Decimal9" || drillType == "Decimal18" ||
+ drillType == "Decimal28Sparse" || drillType == 
"Decimal38Sparse">
+private MajorType type;
+
+private static final int VALUE_WIDTH = ${drillType}Vector.VALUE_WIDTH;
+  
+private final ${drillType}Vector vector;
+
+public ${drillType}ColumnWriter(final ValueVector vector) {
+  <#if varWidth>
+  super(((${drillType}Vector) vector).getOffsetVector());
+  <#else>
+<#if drillType = "Decimal9" || drillType == "Decimal18" ||
+ drillType == "Decimal28Sparse" || drillType == 
"Decimal38Sparse">
+  type = vector.getField().getType();
+
+  
+  this.vector = (${drillType}Vector) vector;
+}
 
-<@getType label />
+@Override public ValueVector vector() { return vector; }
 
+<#-- All change of buffer comes through this function to allow 
capturing
+ the buffer address and capacity. Only two ways to set the 
buffer:
+ by binding to a vector in bindVector(), or by resizing the 
vector
+ in writeIndex(). -->
 @Override
-public void setNull() {
-  mutator.setNull(vectorIndex.index());
+protected final void setAddr() {
+  final DrillBuf buf = vector.getBuffer();
+  bufAddr = buf.addr();
+  <#if varWidth>
+  capacity = buf.capacity();
+  <#else>
+  <#-- Turns out that keeping track of capacity as the count of
+   values 

[jira] [Commented] (DRILL-5657) Implement size-aware result set loader

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247829#comment-16247829
 ] 

ASF GitHub Bot commented on DRILL-5657:
---

Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/914#discussion_r149760234
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/test/RowSetTest.java 
---
@@ -19,420 +19,648 @@
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertNull;
 import static org.junit.Assert.assertSame;
 import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import java.io.UnsupportedEncodingException;
 
-import org.apache.drill.common.types.TypeProtos.DataMode;
 import org.apache.drill.common.types.TypeProtos.MinorType;
 import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.TupleMetadata;
+import org.apache.drill.exec.vector.ValueVector;
 import org.apache.drill.exec.vector.accessor.ArrayReader;
 import org.apache.drill.exec.vector.accessor.ArrayWriter;
-import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.vector.accessor.ObjectType;
+import org.apache.drill.exec.vector.accessor.ScalarElementReader;
+import org.apache.drill.exec.vector.accessor.ScalarReader;
+import org.apache.drill.exec.vector.accessor.ScalarWriter;
+import org.apache.drill.exec.vector.accessor.TupleReader;
+import org.apache.drill.exec.vector.accessor.TupleWriter;
+import org.apache.drill.exec.vector.accessor.ValueType;
+import org.apache.drill.exec.vector.complex.MapVector;
+import org.apache.drill.exec.vector.complex.RepeatedMapVector;
 import org.apache.drill.test.SubOperatorTest;
 import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet;
-import org.apache.drill.test.rowSet.RowSet.RowSetReader;
-import org.apache.drill.test.rowSet.RowSet.RowSetWriter;
 import org.apache.drill.test.rowSet.RowSet.SingleRowSet;
 import org.apache.drill.test.rowSet.RowSetComparison;
-import org.apache.drill.test.rowSet.RowSetSchema;
-import org.apache.drill.test.rowSet.RowSetSchema.FlattenedSchema;
-import org.apache.drill.test.rowSet.RowSetSchema.PhysicalSchema;
+import org.apache.drill.test.rowSet.RowSetReader;
+import org.apache.drill.test.rowSet.RowSetWriter;
 import org.apache.drill.test.rowSet.SchemaBuilder;
+import org.bouncycastle.util.Arrays;
--- End diff --

BouncyCastle arrays instead of Java Arrays ?
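
For reference, a small sketch of the JDK alternative, assuming the test only needs byte-array comparison (the diff shown here does not reveal which method is actually called). `sameBytes` is a hypothetical helper name; `java.util.Arrays.equals(byte[], byte[])` is the standard library call.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class JdkArraysSketch {
  // Compares array contents using the JDK, with no BouncyCastle dependency.
  static boolean sameBytes(byte[] a, byte[] b) {
    return Arrays.equals(a, b);
  }

  public static void main(String[] args) {
    byte[] expected = "drill".getBytes(StandardCharsets.UTF_8);
    byte[] actual = "drill".getBytes(StandardCharsets.UTF_8);
    System.out.println(sameBytes(expected, actual)); // true
  }
}
```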







[jira] [Created] (DRILL-5951) Transaction Support (CRUD) operations on HBase and MongoDB

2017-11-10 Thread Saurabh Mahapatra (JIRA)
Saurabh Mahapatra created DRILL-5951:


 Summary: Transaction Support (CRUD) operations on HBase and MongoDB
 Key: DRILL-5951
 URL: https://issues.apache.org/jira/browse/DRILL-5951
 Project: Apache Drill
  Issue Type: Bug
Reporter: Saurabh Mahapatra


As a user, I would like support for basic CRUD operations through Drill. I 
understand that Drill is an analytics query engine that should work on 
historical data stored in my databases but there are times when I would like to 
make an update on a specific set of rows (as a batch or otherwise) and running 
an ETL job that processes the entire dataset is too cumbersome. 

Even the Apache Phoenix project is beginning to introduce these new semantics:
https://phoenix.apache.org/transactions.html

It would be good if we can have the same so that I do not have to use two 
different query engines. 

From a priority standpoint, HBase and MongoDB would be the first that I would 
request this support for because of their popularity.





[jira] [Commented] (DRILL-5922) Intermittent Memory Leaks in the ROOT allocator

2017-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247622#comment-16247622
 ] 

ASF GitHub Bot commented on DRILL-5922:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r150259975
  
--- Diff: pom.xml ---
@@ -442,7 +442,7 @@
   
-Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
   -Ddrill.test.query.printing.silent=true
   -Ddrill.catastrophic_to_standard_out=true
-  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M
+  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
--- End diff --

It may be better to open a separate PR for DRILL-5926


> Intermittent Memory Leaks in the ROOT allocator  
> -
>
> Key: DRILL-5922
> URL: https://issues.apache.org/jira/browse/DRILL-5922
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>  Labels: ready-to-commit
>
> This issue was originally found by [~ben-zvi]. I am able to consistently 
> reproduce the error on my laptop by running the following unit test:
> org.apache.drill.exec.DrillSeparatePlanningTest#testMultiMinorFragmentComplexQuery
> {code}
> java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
> child allocators.
> Allocator(ROOT) 0/1048576/10113536/3221225472 (res/actual/peak/limit)
>   child allocators: 1
> Allocator(query:26049b50-0cec-0a92-437c-bbe486e1fcbf) 
> 1048576/0/0/268435456 (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 0
>   reservations: 0
>   ledgers: 0
>   reservations: 0
>   at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:496) 
> ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256)
>  ~[classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [classes/:na]
>   at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [classes/:na]
>   at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
> [classes/:na]
>   at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:157) 
> [test-classes/:na]
>   at 
> org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:148) 
> [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.getFragmentsHelper(DrillSeparatePlanningTest.java:185)
>  [test-classes/:na]
>   at 
> org.apache.drill.exec.DrillSeparatePlanningTest.testMultiMinorFragmentComplexQuery(DrillSeparatePlanningTest.java:108)
>  [test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  [junit-4.11.jar:na]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.11.jar:na]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  [junit-4.11.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
>  [jmockit-1.3.jar:na]
>   at 
> mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
>  [jmockit-1.3.jar:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_144]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_144]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
>   at 
> mockit.internal.util.MethodReflection.invokeWithCheckedThrows(MethodReflection.java:95)
>  

[jira] [Commented] (DRILL-5948) The wrong number of batches is displayed

2017-11-10 Thread Vlad (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247577#comment-16247577
 ] 

Vlad commented on DRILL-5948:
-

Thank you, Paul. So it looks like this is expected behavior. 
Does it make sense to keep this JIRA and edit the description, per your 
suggestion, to request an option for ignoring the initial schema batch in the 
resulting profile?

> The wrong number of batches is displayed
> 
>
> Key: DRILL-5948
> URL: https://issues.apache.org/jira/browse/DRILL-5948
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.11.0
>Reporter: Vlad
> Attachments: json_profile.json
>
>
> I would expect Drill to create 1 batch when a query runs over a small amount 
> of data, but here you can see that Drill created 2 batches. I think this is 
> wrong behaviour for Drill. The full JSON file is in the attachment.
> {code:html}
> "fragmentProfile": [
>   {
>     "majorFragmentId": 0,
>     "minorFragmentProfile": [
>       {
>         "state": 3,
>         "minorFragmentId": 0,
>         "operatorProfile": [
>           {
>             "inputProfile": [
>               {
>                 "records": 1,
>                 "batches": 2,
>                 "schemas": 1
>               }
>             ],
>             "operatorId": 2,
>             "operatorType": 29,
>             "setupNanos": 0,
>             "processNanos": 1767363740,
>             "peakLocalMemoryAllocated": 639120,
>             "waitNanos": 25787
>           },
> {code}
> Steps to reproduce:
> # Create a JSON file with 1 row
> # Execute a star query against this file, for example 
> {code:sql}
> select * from dfs.`/path/to/your/file/example.json`
> {code}
> # Go to the Profile page on the UI, and open info about your query
> # Open JSON profile
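
To check the counts without reading the profile by hand, the steps above can be followed by a small script. This is only a sketch in Python: the function name input_batch_counts is hypothetical, the sample is the fragment quoted in the description (closed off so that it parses), and the "batches minus schemas" figure assumes that each schema accounts for one schema-only batch, which this thread suggests but does not confirm.

```python
import json


def input_batch_counts(profile):
    """Return one tuple per operator input stream:
    (majorFragmentId, minorFragmentId, operatorId, records, batches, schemas).
    Only keys that appear in the quoted profile fragment are read."""
    rows = []
    for major in profile.get("fragmentProfile", []):
        for minor in major.get("minorFragmentProfile", []):
            for op in minor.get("operatorProfile", []):
                for inp in op.get("inputProfile", []):
                    rows.append((
                        major.get("majorFragmentId"),
                        minor.get("minorFragmentId"),
                        op.get("operatorId"),
                        inp.get("records"),
                        inp.get("batches"),
                        inp.get("schemas"),
                    ))
    return rows


# The fragment from the issue description, with closing brackets added.
sample = json.loads("""
{
  "fragmentProfile": [
    {
      "majorFragmentId": 0,
      "minorFragmentProfile": [
        {
          "state": 3,
          "minorFragmentId": 0,
          "operatorProfile": [
            {
              "inputProfile": [
                {"records": 1, "batches": 2, "schemas": 1}
              ],
              "operatorId": 2,
              "operatorType": 29
            }
          ]
        }
      ]
    }
  ]
}
""")

rows = input_batch_counts(sample)
for major_id, minor_id, op_id, records, batches, schemas in rows:
    # Assumption: one schema-only batch per schema; not confirmed by the thread.
    data_batches = batches - schemas
    print(major_id, minor_id, op_id, records, batches, schemas, data_batches)
```

For a real profile, replace the sample with json.load() on the downloaded file (json_profile.json in the attachment).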


