[jira] [Closed] (DRILL-4545) Incorrect query plan for LIMIT 0 query

2016-03-28 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-4545.
-

> Incorrect query plan for LIMIT 0 query
> --
>
> Key: DRILL-4545
> URL: https://issues.apache.org/jira/browse/DRILL-4545
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.6.0
> Environment: 4 node cluster CentOS
> Reporter: Khurram Faraaz
> Labels: limit0
>
> The inner query has a LIMIT 1 and the outer query has a LIMIT 0. In the query 
> plan, the outer LIMIT 0 appears to be applied before the inner query's 
> LIMIT 1. This does not seem right.
> Drill 1.6.0 commit ID : fb09973e
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select * from (select * from 
> `employee.json` limit 1) limit 0;
> +--+--+
> | text | json |
> +--+--+
> | 00-00    Screen
> 00-01      Project(*=[$0])
> 00-02        SelectionVectorRemover
> 00-03          Limit(fetch=[0])
> 00-04            Limit(fetch=[1])
> 00-05              Limit(offset=[0], fetch=[0])
> 00-06                Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/employee.json, numFiles=1, columns=[`*`], files=[maprfs:///tmp/employee.json]]])
> {noformat}
> Here is the data from JSON file
> {noformat}
> [root@centos-01 ~]# cat employee.json
> {
>   "firstName": "John",
>   "lastName": "Smith",
>   "isAlive": true,
>   "age": 45,
>   "height_cm": 177.6,
>   "address": {
>     "streetAddress": "29 4th Street",
>     "city": "New York",
>     "state": "NY",
>     "postalCode": "10021-3100"
>   },
>   "phoneNumbers": [
>     {
>       "type": "home",
>       "number": "212 555-1234"
>     },
>     {
>       "type": "office",
>       "number": "646 555-4567"
>     }
>   ],
>   "children": [],
>   "hobbies": ["scuba diving","hiking","biking","rock climbing","surfing"]
> }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-4545) Incorrect query plan for LIMIT 0 query

2016-03-28 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz resolved DRILL-4545.
---
Resolution: Invalid
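
The thread does not state why the issue was resolved as Invalid. One plausible reading, offered here only as an assumption, is that the quoted plan is still correct: read bottom-up, it applies Limit(offset=[0], fetch=[0]) directly above the scan, then Limit(fetch=[1]), then Limit(fetch=[0]), and the outer LIMIT 0 returns zero rows no matter what the inner LIMIT 1 produces. A minimal Java sketch of that equivalence follows; the helper names (limit, asWritten, asPlanned) are hypothetical and are not Drill code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class LimitOrderSketch {
    // Keep at most `fetch` rows, like a LIMIT operator with no offset.
    static <T> List<T> limit(List<T> rows, int fetch) {
        return rows.stream().limit(fetch).collect(Collectors.toList());
    }

    // The query as written: outer LIMIT 0 over inner LIMIT 1.
    static <T> List<T> asWritten(List<T> rows) {
        return limit(limit(rows, 1), 0);
    }

    // The quoted plan read bottom-up: an extra LIMIT 0 above the scan,
    // then LIMIT 1, then LIMIT 0.
    static <T> List<T> asPlanned(List<T> rows) {
        return limit(limit(limit(rows, 0), 1), 0);
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row-1", "row-2", "row-3");
        // Both forms produce an empty result for any input.
        System.out.println(asWritten(rows).equals(asPlanned(rows)));
    }
}
```

Under this reading the extra limit only lets the scan stop early; it does not change the result.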



[jira] [Commented] (DRILL-4549) Add support for more truncation units in date_trunc function

2016-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215230#comment-15215230
 ] 

ASF GitHub Bot commented on DRILL-4549:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/450


> Add support for more truncation units in date_trunc function
> 
>
> Key: DRILL-4549
> URL: https://issues.apache.org/jira/browse/DRILL-4549
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.6.0
> Reporter: Venki Korukanti
> Assignee: Venki Korukanti
> Fix For: 1.7.0
>
>
> Currently we support only {{YEAR, MONTH, DAY, HOUR, MINUTE, SECOND}} truncate 
> units for types {{TIME, TIMESTAMP and DATE}}. Extend the functions to support 
> {{YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, WEEK, QUARTER, DECADE, CENTURY, 
> MILLENNIUM}} truncate units for types {{TIME, TIMESTAMP, DATE, INTERVAL DAY, 
> INTERVAL YEAR}}.
> Also get rid of the if-and-else (on truncation unit) implementation. Instead 
> resolve to a direct function based on the truncation unit in Calcite -> Drill 
> (DrillOptiq) expression conversion.
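
To illustrate the idea of resolving a truncation unit to a direct function instead of branching on the unit at evaluation time, here is a minimal Java sketch built on java.time. It is not Drill's implementation: the class name, the Unit enum, and the lookup table are hypothetical, WEEK is assumed to start on Monday (ISO convention), and DECADE is assumed to floor the year to a multiple of 10. CENTURY and MILLENNIUM are left out because their boundary conventions vary between systems.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.util.EnumMap;
import java.util.Map;
import java.util.function.UnaryOperator;

class DateTruncSketch {
    // Hypothetical subset of the truncation units named in the issue.
    enum Unit { YEAR, QUARTER, MONTH, WEEK, DECADE }

    // One function per unit, looked up once instead of an if/else chain.
    private static final Map<Unit, UnaryOperator<LocalDate>> TRUNCATORS =
            new EnumMap<>(Unit.class);

    static {
        TRUNCATORS.put(Unit.YEAR, d -> d.withDayOfYear(1));
        TRUNCATORS.put(Unit.MONTH, d -> d.withDayOfMonth(1));
        // First month of the quarter: 1, 4, 7 or 10.
        TRUNCATORS.put(Unit.QUARTER,
                d -> LocalDate.of(d.getYear(), ((d.getMonthValue() - 1) / 3) * 3 + 1, 1));
        // Assumption: weeks start on Monday.
        TRUNCATORS.put(Unit.WEEK,
                d -> d.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY)));
        // Assumption: a decade starts in a year divisible by 10.
        TRUNCATORS.put(Unit.DECADE,
                d -> LocalDate.of(Math.floorDiv(d.getYear(), 10) * 10, 1, 1));
    }

    static LocalDate truncate(Unit unit, LocalDate date) {
        return TRUNCATORS.get(unit).apply(date);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2016, 3, 30);
        for (Unit unit : Unit.values()) {
            System.out.println(unit + " -> " + truncate(unit, date));
        }
    }
}
```

The table lookup plays the role of the plan-time resolution described in the issue: once the unit is known, the per-row work is a single direct call.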





[jira] [Resolved] (DRILL-4549) Add support for more truncation units in date_trunc function

2016-03-28 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti resolved DRILL-4549.

Resolution: Fixed



[jira] [Created] (DRILL-4550) Add support more time units in extract function

2016-03-28 Thread Venki Korukanti (JIRA)
Venki Korukanti created DRILL-4550:
--

 Summary: Add support more time units in extract function
 Key: DRILL-4550
 URL: https://issues.apache.org/jira/browse/DRILL-4550
 Project: Apache Drill
  Issue Type: Improvement
  Components: Functions - Drill
Affects Versions: 1.6.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: 1.7.0


Currently the {{extract}} function supports the following units: {{YEAR, MONTH, DAY, 
HOUR, MINUTE, SECOND}}. Add support for more units: {{CENTURY, DECADE, DOW, 
DOY, EPOCH, MILLENNIUM, QUARTER, WEEK}}.

We also need changes in the SQL parser. Currently the parser only allows 
{{YEAR, MONTH, DAY, HOUR, MINUTE, SECOND}} as units.
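
As a rough picture of what some of the requested units could return, the following Java sketch computes them with java.time. It says nothing about how Drill or its SQL parser implements them. The class and enum names are hypothetical, and the conventions are assumptions: DOW counts from Sunday = 0, WEEK is the ISO week number, EPOCH is seconds since 1970-01-01 with the timestamp treated as UTC, and DECADE is the year divided by 10. CENTURY and MILLENNIUM are omitted because their numbering conventions differ between systems.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.temporal.IsoFields;

class ExtractSketch {
    // Hypothetical subset of the units listed in the issue.
    enum Unit { DECADE, DOW, DOY, EPOCH, QUARTER, WEEK }

    static long extract(Unit unit, LocalDateTime ts) {
        switch (unit) {
            case DOW:
                // java.time uses Monday=1..Sunday=7; map to Sunday=0..Saturday=6.
                return ts.getDayOfWeek().getValue() % 7;
            case DOY:
                return ts.getDayOfYear();
            case EPOCH:
                // Assumption: the timestamp is interpreted as UTC.
                return ts.toEpochSecond(ZoneOffset.UTC);
            case QUARTER:
                return ts.get(IsoFields.QUARTER_OF_YEAR);
            case WEEK:
                return ts.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
            case DECADE:
                return Math.floorDiv(ts.getYear(), 10);
            default:
                throw new UnsupportedOperationException("Unsupported unit: " + unit);
        }
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2016, 3, 28, 18, 9, 34);
        for (Unit unit : Unit.values()) {
            System.out.println(unit + " -> " + extract(unit, ts));
        }
    }
}
```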





[jira] [Commented] (DRILL-4549) Add support for more truncation units in date_trunc function

2016-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215120#comment-15215120
 ] 

ASF GitHub Bot commented on DRILL-4549:
---

Github user jacques-n commented on the pull request:

https://github.com/apache/drill/pull/450#issuecomment-202632855
  
lgtm




[jira] [Commented] (DRILL-4549) Add support for more truncation units in date_trunc function

2016-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215109#comment-15215109
 ] 

ASF GitHub Bot commented on DRILL-4549:
---

GitHub user vkorukanti opened a pull request:

https://github.com/apache/drill/pull/450

DRILL-4549: Add support for more truncation units in date_trunc function



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vkorukanti/drill DRILL-4549

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/450.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #450


commit 89fc8462497e162772b56d6d45bca0074fbe3e43
Author: vkorukanti 
Date:   2016-03-28T18:09:34Z

DRILL-4549: Add support for more truncation units in date_trunc function






[jira] [Commented] (DRILL-4548) Avro fails on schema changes even when selecting from a single file (Union related)

2016-03-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/DRILL-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215014#comment-15215014
 ] 

Stefán Baxter commented on DRILL-4548:
--


I removed all the union (["null", ..]) settings and ended up adding default 
values to all fields (empty strings and 0), which fixed this.

The Avro plugin is not handling simple Unions correctly.

> Avro fails on schema changes even when selecting from a single file (Union 
> related)
> ---
>
> Key: DRILL-4548
> URL: https://issues.apache.org/jira/browse/DRILL-4548
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Avro
> Affects Versions: 1.6.0, 1.7.0
> Reporter: Stefán Baxter
>
> Hi,
> I have reworked/refactored our Avro based logging system trying to make the 
> whole Drill + Avro->Parquet experience a bit more agreeable.
> Long story short, I'm getting this error when selecting from multiple Avro 
> files even though these files share the EXACT same schema:
> Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema 
> changes
> Fragment 0:0
> [Error Id: 00d49aa2-5564-497e-a330-e852d5889beb on swift:31010] 
> (state=,code=0)
> We are using union types but only to allow for null values as seems to be 
> supported by drill as per this comment in the Drill code: 
> // currently supporting only nullable union (optional fields) like ["null", 
> "some-type"].
> This happens for a very simple group_by + count(*) query that only uses two 
> fields in Avro; neither of them uses a Union construct, and both of 
> them contain string values in every case.
> I now think this has nothing to do with the union types, since the query uses 
> only simple strings, unless there is a full schema validation done on the 
> content of the files rather than the identical Avro schema embedded in both 
> files.





[jira] [Updated] (DRILL-4548) Avro fails on schema changes even when selecting from a single file (Union related)

2016-03-28 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/DRILL-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefán Baxter updated DRILL-4548:
-
Summary: Avro fails on schema changes even when selecting from a single 
file (Union related)  (was: Avro fails on schema changes even when selecting 
from a single file)



[jira] [Commented] (DRILL-3771) MEMORY LEAK : Concurrent query execution

2016-03-28 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214941#comment-15214941
 ] 

Deneche A. Hakim commented on DRILL-3771:
-

I couldn't reproduce the memory leak on the latest master. It's worth noting 
that in ConcurrencyTest all queries share the same connection, and each query 
closes the connection as soon as it finishes. That's what is causing the other 
queries to fail with a "Connection is already closed" error.
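
The failure mode described in this comment can be reproduced without a cluster. The Java sketch below is a simulation only: FakeConnection is a hypothetical stand-in for a JDBC connection (it is not a Drill class), and the 16 concurrent queries from the report are run one after another so the outcome is deterministic. It contrasts sharing one connection that every query closes with giving each query its own connection.

```java
class SharedConnectionSketch {
    // Hypothetical stand-in for a JDBC connection; not part of Drill.
    static final class FakeConnection implements AutoCloseable {
        private boolean closed = false;

        void runQuery() {
            if (closed) {
                throw new IllegalStateException("Connection is already closed.");
            }
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Anti-pattern: every query uses the same connection and closes it when done.
    static int failuresWithSharedConnection(int queries) {
        FakeConnection shared = new FakeConnection();
        int failures = 0;
        for (int i = 0; i < queries; i++) {
            try {
                shared.runQuery();
            } catch (IllegalStateException e) {
                failures++; // the first finisher already closed the connection
            } finally {
                shared.close();
            }
        }
        return failures;
    }

    // Each query opens and closes its own connection.
    static int failuresWithOwnConnection(int queries) {
        int failures = 0;
        for (int i = 0; i < queries; i++) {
            try (FakeConnection own = new FakeConnection()) {
                own.runQuery();
            } catch (IllegalStateException e) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("shared: " + failuresWithSharedConnection(16));
        System.out.println("own:    " + failuresWithOwnConnection(16));
    }
}
```

In the shared case only the first query succeeds; with one connection per query, none fail. An alternative that keeps a single connection would be to close it only after all queries have completed.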


[jira] [Closed] (DRILL-3771) MEMORY LEAK : Concurrent query execution

2016-03-28 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim closed DRILL-3771.
---
Resolution: Cannot Reproduce

> MEMORY LEAK : Concurrent query execution
> 
>
> Key: DRILL-3771
> URL: https://issues.apache.org/jira/browse/DRILL-3771
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
> Reporter: Khurram Faraaz
> Assignee: Deneche A. Hakim
> Priority: Critical
> Fix For: 1.7.0
>
>
> I am seeing a memory leak when I execute concurrent queries (16 threads). 
> Total number of records in the JSON file are close to ~26M. Number of records 
> that match the predicate key2 = 'm' are 1,874,177.
> I do not see the memory leak reported in the drillbit.log though.
> Query STATE is listed as CANCELLATION_REQUESTED for each of the query on the 
> Web UI's query profiles page.
> master commit ID: b525692e
> Query : select key1 , key2 from `twoKeyJsn.json` where key2 = 'm';
> I see this on the prompt from where I run the java program
> {code}
> org.apache.drill.jdbc.AlreadyClosedSqlException: Connection is already closed.
>   at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.checkNotClosed(DrillConnectionImpl.java:150)
>   at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.createStatement(DrillConnectionImpl.java:331)
>   at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.createStatement(DrillConnectionImpl.java:61)
>   at 
> net.hydromatic.avatica.AvaticaConnection.createStatement(AvaticaConnection.java:91)
>   at 
> net.hydromatic.avatica.AvaticaConnection.createStatement(AvaticaConnection.java:30)
>   at ConcurrencyTest.executeQuery(ConcurrencyTest.java:43)
>   at ConcurrencyTest.selectData(ConcurrencyTest.java:33)
>   at ConcurrencyTest.run(ConcurrencyTest.java:23)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> java.sql.SQLException: While closing connection
>   at net.hydromatic.avatica.Helper.createException(Helper.java:40)
>   at 
> net.hydromatic.avatica.AvaticaConnection.close(AvaticaConnection.java:137)
>   at ConcurrencyTest.executeQuery(ConcurrencyTest.java:52)
>   at ConcurrencyTest.selectData(ConcurrencyTest.java:33)
>   at ConcurrencyTest.run(ConcurrencyTest.java:23)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IllegalStateException: Failure while closing accountor.  
> Expected private and shared pools to be set to initial values.  However, one 
> or more were not.  Stats are
>   zone      init          allocated     delta
>   private   0             0             0
>   shared    11246501888   11246497280   4608.
>   at 
> org.apache.drill.exec.memory.AtomicRemainder.close(AtomicRemainder.java:200)
>   at org.apache.drill.exec.memory.Accountor.close(Accountor.java:390)
>   at 
> org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:187)
>   at org.apache.drill.exec.client.DrillClient.close(DrillClient.java:261)
>   at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.cleanup(DrillConnectionImpl.java:377)
>   at 
> org.apache.drill.jdbc.impl.DrillHandler.onConnectionClose(DrillHandler.java:36)
>   at 
> net.hydromatic.avatica.AvaticaConnection.close(AvaticaConnection.java:135)
>   ... 8 more
> {code}
> From drillbit.log
> {code}
> 2015-09-12 02:32:04,709 [BitServer-4] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2a0c71c7-9adc--2a97-f2f218f5b7a2:0:0: State change requested RUNNING --> 
> CANCELLATION_REQUESTED
> 2015-09-12 02:32:04,709 [BitServer-4] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 2a0c71c7-9adc--2a97-f2f218f5b7a2:0:0: State to report: 
> CANCELLATION_REQUESTED
> 2015-09-12 02:32:04,720 [UserServer-1] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.100.201:31010 <--> /10.10.100.201:53620 (user client).  
> Closing connection.
> java.io.IOException: syscall:writev(...)() failed: Broken pipe
> ...
> 2015-09-12 02:32:04,896 [UserServer-1] INFO  
> 

[jira] [Updated] (DRILL-4548) Avro fails on schema changes even when selecting from a single file

2016-03-28 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/DRILL-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefán Baxter updated DRILL-4548:
-
Summary: Avro fails on schema changes even when selecting from a single 
file  (was: Drill can not select from multiple Avro files)



[jira] [Updated] (DRILL-4549) Add support for more truncation units in date_trunc function

2016-03-28 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated DRILL-4549:
---
Summary: Add support for more truncation units in date_trunc function  
(was: Add support for more units in date_trunc function)



[jira] [Created] (DRILL-4549) Add support for more units in date_trunc function

2016-03-28 Thread Venki Korukanti (JIRA)
Venki Korukanti created DRILL-4549:
--

 Summary: Add support for more units in date_trunc function
 Key: DRILL-4549
 URL: https://issues.apache.org/jira/browse/DRILL-4549
 Project: Apache Drill
  Issue Type: Improvement
Affects Versions: 1.6.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: 1.7.0


Currently we support only {{YEAR, MONTH, DAY, HOUR, MINUTE, SECOND}} truncate 
units for types {{TIME, TIMESTAMP and DATE}}. Extend the functions to support 
{{YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, WEEK, QUARTER, DECADE, CENTURY, 
MILLENNIUM}} truncate units for types {{TIME, TIMESTAMP, DATE, INTERVAL DAY, 
INTERVAL YEAR}}.

Also get rid of the if-and-else (on truncation unit) implementation. Instead 
resolve to a direct function based on the truncation unit in Calcite -> Drill 
(DrillOptiq) expression conversion.





[jira] [Commented] (DRILL-4467) Invalid projection created using PrelUtil.getColumns

2016-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214871#comment-15214871
 ] 

ASF GitHub Bot commented on DRILL-4467:
---

Github user laurentgo commented on the pull request:

https://github.com/apache/drill/pull/404#issuecomment-202577322
  
This was fixed by @jacques-n in commit edea8b1cf4e5476d803e8b87c79e08e8c3263e04. 
I'm closing this PR.


> Invalid projection created using PrelUtil.getColumns
> 
>
> Key: DRILL-4467
> URL: https://issues.apache.org/jira/browse/DRILL-4467
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Laurent Goujon
> Assignee: Jacques Nadeau
> Priority: Critical
> Fix For: 1.6.0
>
>
> In {{DrillPushProjIntoScan}}, a new scan and a new projection are created 
> using {{PrelUtil#getColumns(RelDataType, List<RexNode>)}}.
> The returned {{ProjectPushInfo}} instance has several fields, one of which is 
> {{desiredFields}}, the list of projected fields. There's one instance 
> per {{RexNode}}, but because instances were initially added to a set, they 
> might not be in the order in which they were created.
> The issue happens in the following code:
> {code:java}
>   List<RexNode> newProjects = Lists.newArrayList();
>   for (RexNode n : proj.getChildExps()) {
>     newProjects.add(n.accept(columnInfo.getInputRewriter()));
>   }
> {code}
> {code}
> This code creates a new list of projects out of the initial ones, by mapping 
> the indices from the old projects to the new projects, but the indices of the 
> new RexNode instances might be out of order (because of the ordering of 
> desiredFields). And if indices are out of order, the check 
> {{ProjectRemoveRule.isTrivial(newProj)}} will fail.
> My guess is that desiredFields ordering should be preserved when instances 
> are added, to satisfy the condition above.
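
The ordering problem described above can be shown with plain collections. The following Java sketch is a simplified model, not Drill or Calcite code: fields are plain strings, remap stands in for the index rewriting, and isIdentity is an assumed approximation of what a trivial-projection check looks for (indices 0..n-1 in order). A TreeSet is used as a deterministic example of a set that does not keep insertion order; a LinkedHashSet keeps it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class FieldOrderSketch {
    // Map each projected field to its position in the desired-field collection.
    static List<Integer> remap(List<String> projected, Collection<String> desiredFields) {
        List<String> ordered = new ArrayList<>(desiredFields);
        List<Integer> indices = new ArrayList<>();
        for (String name : projected) {
            indices.add(ordered.indexOf(name));
        }
        return indices;
    }

    // True when the indices are exactly 0, 1, 2, ... in order.
    static boolean isIdentity(List<Integer> indices) {
        for (int i = 0; i < indices.size(); i++) {
            if (indices.get(i) != i) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> projected = Arrays.asList("b", "a", "c");

        // Insertion order kept: indices come out as [0, 1, 2].
        System.out.println(remap(projected, new LinkedHashSet<>(projected)));

        // Insertion order lost (sorted instead): indices come out as [1, 0, 2].
        System.out.println(remap(projected, new TreeSet<>(projected)));
    }
}
```

With the insertion-ordered set the rewritten projection is an identity mapping; with the reordered set it is not, which mirrors the out-of-order indices the report describes.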





[jira] [Commented] (DRILL-4467) Invalid projection created using PrelUtil.getColumns

2016-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214872#comment-15214872
 ] 

ASF GitHub Bot commented on DRILL-4467:
---

Github user laurentgo closed the pull request at:

https://github.com/apache/drill/pull/404


> Invalid projection created using PrelUtil.getColumns
> 
>
> Key: DRILL-4467
> URL: https://issues.apache.org/jira/browse/DRILL-4467
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Laurent Goujon
>Assignee: Jacques Nadeau
>Priority: Critical
> Fix For: 1.6.0
>
>
> In {{DrillPushProjIntoScan}}, a new scan and a new projection are created 
> using {{PrelUtil#getColumn(RelDataType, List)}}.
> The returned {{ProjectPushInfo}} instance has several fields, one of them is 
> {{desiredFields}} which is the list of projected fields. There's one instance 
> per {{RexNode}} but because instances were initially added to a set, they 
> might not be in the same order as the order they were created.
> The issue happens in the following code:
> {code:java}
>   List newProjects = Lists.newArrayList();
>   for (RexNode n : proj.getChildExps()) {
> newProjects.add(n.accept(columnInfo.getInputRewriter()));
>   }
> {code}
> This code creates a new list of projects out of the initial ones, by mapping 
> the indices from the old projects to the new projects, but the indices of the 
> new RexNode instances might be out of order (because of the ordering of 
> desiredFields). And if indices are out of order, the check 
> {{ProjectRemoveRule.isTrivial(newProj)}} will fail.
> My guess is that desiredFields ordering should be preserved when instances 
> are added, to satisfy the condition above.
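
To illustrate the ordering concern described above, here is a minimal, self-contained sketch. It does not use Drill or Calcite classes: {{DesiredFieldsOrderSketch}}, {{rewriteIndices}} and {{isTrivial}} are hypothetical names, and a {{TreeSet}} is only an assumed stand-in for any set that does not keep insertion order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; this is not Drill's PrelUtil or Calcite's ProjectRemoveRule.
class DesiredFieldsOrderSketch {

  // Maps each referenced input index to its position in the collected field set.
  static List<Integer> rewriteIndices(List<Integer> referenced, Set<Integer> collected) {
    List<Integer> ordered = new ArrayList<>(collected);
    List<Integer> rewritten = new ArrayList<>();
    for (int ref : referenced) {
      rewritten.add(ordered.indexOf(ref));
    }
    return rewritten;
  }

  // A projection is treated as trivial when it is the identity mapping 0, 1, 2, ...
  static boolean isTrivial(List<Integer> projection) {
    for (int i = 0; i < projection.size(); i++) {
      if (projection.get(i) != i) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Integer> referenced = Arrays.asList(7, 2, 5);
    // Sorted set: iteration order differs from creation order, so indices are shuffled.
    System.out.println(isTrivial(rewriteIndices(referenced, new TreeSet<>(referenced))));
    // Insertion-ordered set: indices come out as 0, 1, 2.
    System.out.println(isTrivial(rewriteIndices(referenced, new LinkedHashSet<>(referenced))));
  }
}
```

With the insertion-ordered set the rewritten projection is the identity, which is the condition the description expects the trivial-projection check to see.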





[jira] [Commented] (DRILL-4546) mvn deploy pushes the same zip artifact twice

2016-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214870#comment-15214870
 ] 

ASF GitHub Bot commented on DRILL-4546:
---

GitHub user laurentgo opened a pull request:

https://github.com/apache/drill/pull/449

DRILL-4546: Only generate one zip archive when using apache-release profile

Drill root pom doesn't completely override the Apache parent pom configuration
regarding assemblies, which caused a zip archive of the project to be generated
twice, and deployed to a remote server twice too.

The fix updates the Apache parent pom version, and uses the plugin properties
to override the configuration. It also removes the Drill source assembly
descriptor, as the Apache parent project provides the same one.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/laurentgo/drill laurent/fix-assembly-descriptor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/449.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #449


commit 6353b057361f40f2ec56942e77321cd84147f340
Author: Laurent Goujon 
Date:   2016-03-28T20:31:53Z

DRILL-4546: Only generate one zip archive when using apache-release profile

Drill root pom doesn't completely override the Apache parent pom configuration
regarding assemblies, which caused a zip archive of the project to be generated
twice, and deployed to a remote server twice too.

The fix updates the Apache parent pom version, and uses the plugin properties
to override the configuration. It also removes the Drill source assembly
descriptor, as the Apache parent project provides the same one.




> mvn deploy pushes the same zip artifact twice
> -
>
> Key: DRILL-4546
> URL: https://issues.apache.org/jira/browse/DRILL-4546
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Reporter: Laurent Goujon
>
> When using the apache-release profile, both Apache and Drill assembly 
> descriptors are used. This caused the zip artifact to be generated twice, and 
> pushed twice to the remote repository. Because some repositories are 
> configured to not allow the same artifact to be pushed several times, this 
> might cause the build to fail.
> Ideally, only one zip file should be built and pushed. Also, the Apache Parent 
> pom provides a descriptor to build both the tar and the zip archives.





[jira] [Created] (DRILL-4548) Drill can not select from multiple Avro files

2016-03-28 Thread JIRA
Stefán Baxter created DRILL-4548:


 Summary: Drill can not select from multiple Avro files
 Key: DRILL-4548
 URL: https://issues.apache.org/jira/browse/DRILL-4548
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Avro
Affects Versions: 1.6.0, 1.7.0
Reporter: Stefán Baxter


Hi,

I have reworked/refactored our Avro based logging system trying to make the 
whole Drill + Avro->Parquet experience a bit more agreeable.

Long story short, I'm getting this error when selecting from multiple Avro files 
even though these files share the exact same schema:

Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema 
changes
Fragment 0:0
[Error Id: 00d49aa2-5564-497e-a330-e852d5889beb on swift:31010] (state=,code=0)

We are using union types, but only to allow for null values, which seems to be 
supported by Drill as per this comment in the Drill code: 
// currently supporting only nullable union (optional fields) like ["null", 
"some-type"].

This happens for a very simple group by + count(*) query that only uses two 
fields in Avro. Neither of them uses a union construct, and both of 
them contain string values in every case.

I now think this has nothing to do with the union types, since the query uses 
only simple strings, unless there is a full schema validation done on the 
content of the files rather than on the identical Avro schema embedded in both 
files.





[jira] [Commented] (DRILL-4199) Add Support for HBase 1.X

2016-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214865#comment-15214865
 ] 

ASF GitHub Bot commented on DRILL-4199:
---

Github user adityakishore commented on the pull request:

https://github.com/apache/drill/pull/443#issuecomment-202576211
  
I have verified that, with these changes, Drill queries run fine with an HBase 
0.98 cluster too.


> Add Support for HBase 1.X
> -
>
> Key: DRILL-4199
> URL: https://issues.apache.org/jira/browse/DRILL-4199
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - HBase
>Affects Versions: 1.7.0
>Reporter: Divjot singh
>Assignee: Aditya Kishore
>
> Is there any roadmap to upgrade the HBase version to the 1.x series? Currently 
> Drill supports HBase version 0.98.





[jira] [Created] (DRILL-4547) Javadoc fails with Java8

2016-03-28 Thread Laurent Goujon (JIRA)
Laurent Goujon created DRILL-4547:
-

 Summary: Javadoc fails with Java8
 Key: DRILL-4547
 URL: https://issues.apache.org/jira/browse/DRILL-4547
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Affects Versions: 1.6.0
Reporter: Laurent Goujon


Javadoc cannot be generated when using Java 8 (likely because the parser is now 
stricter).

Here's an example of the errors reported when trying to generate javadocs in module 
{{drill-fmpp-maven-plugin}}:

{noformat}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-javadoc-plugin:2.9.1:jar (attach-javadocs) on 
project drill-fmpp-maven-plugin: MavenReportException: Error while creating 
archive:
[ERROR] Exit code: 1 - 
/Users/laurent/devel/drill/tools/fmpp/src/main/java/org/apache/drill/fmpp/mojo/FMPPMojo.java:44:
 error: unknown tag: goal
[ERROR] * @goal generate
[ERROR] ^
[ERROR] 
/Users/laurent/devel/drill/tools/fmpp/src/main/java/org/apache/drill/fmpp/mojo/FMPPMojo.java:45:
 error: unknown tag: phase
[ERROR] * @phase generate-sources
[ERROR] ^
[ERROR] 
/Users/laurent/devel/drill/tools/fmpp/target/generated-sources/plugin/org/apache/drill/fmpp/mojo/HelpMojo.java:25:
 error: unknown tag: goal
[ERROR] * @goal help
[ERROR] ^
[ERROR] 
/Users/laurent/devel/drill/tools/fmpp/target/generated-sources/plugin/org/apache/drill/fmpp/mojo/HelpMojo.java:26:
 error: unknown tag: requiresProject
[ERROR] * @requiresProject false
[ERROR] ^
[ERROR] 
/Users/laurent/devel/drill/tools/fmpp/target/generated-sources/plugin/org/apache/drill/fmpp/mojo/HelpMojo.java:27:
 error: unknown tag: threadSafe
[ERROR] * @threadSafe
[ERROR] ^
[ERROR] 
[ERROR] Command line was: 
/Library/Java/JavaVirtualMachines/jdk1.8.0_72.jdk/Contents/Home/bin/javadoc 
@options @packages
[ERROR] 
[ERROR] Refer to the generated Javadoc files in 
'/Users/laurent/devel/drill/tools/fmpp/target/apidocs' dir.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :drill-fmpp-maven-plugin
{noformat}





[jira] [Created] (DRILL-4546) mvn deploy pushes the same zip artifact twice

2016-03-28 Thread Laurent Goujon (JIRA)
Laurent Goujon created DRILL-4546:
-

 Summary: mvn deploy pushes the same zip artifact twice
 Key: DRILL-4546
 URL: https://issues.apache.org/jira/browse/DRILL-4546
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Reporter: Laurent Goujon


When using the apache-release profile, both Apache and Drill assembly 
descriptors are used. This caused the zip artifact to be generated twice, and 
pushed twice to the remote repository. Because some repositories are configured 
to not allow the same artifact to be pushed several times, this might cause the 
build to fail.

Ideally, only one zip file should be built and pushed. Also, the Apache Parent 
pom provides a descriptor to build both the tar and the zip archives.





[jira] [Commented] (DRILL-4544) Improve error messages for REFRESH TABLE METADATA command

2016-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214793#comment-15214793
 ] 

ASF GitHub Bot commented on DRILL-4544:
---

GitHub user arina-ielchiieva opened a pull request:

https://github.com/apache/drill/pull/448

DRILL-4544: Improve error messages for REFRESH TABLE METADATA command

1. Added error message when storage plugin or workspace does not exist
2. Updated error message when refresh metadata is not supported
3. Unit tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/arina-ielchiieva/drill DRILL-4544

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/448.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #448


commit 1d1edee8920563f7b5fadf760f99d96f6c68d432
Author: Arina Ielchiieva 
Date:   2016-03-28T10:55:56Z

DRILL-4544: Improve error messages for REFRESH TABLE METADATA command
1. Added error message when storage plugin or workspace does not exist
2. Updated error message when refresh metadata is not supported
3. Unit tests




> Improve error messages for REFRESH TABLE METADATA command
> -
>
> Key: DRILL-4544
> URL: https://issues.apache.org/jira/browse/DRILL-4544
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Metadata
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Minor
> Fix For: 1.7.0
>
>
> Improve the error messages thrown by REFRESH TABLE METADATA command:
> In the first case below, the problem is that maprfs.abc doesn't exist. The 
> command should report "object not found" or "workspace not found", but it 
> currently returns an unhelpful message:
> 0: jdbc:drill:> refresh table metadata maprfs.abc.`my_table`;
> +--------+--------------+
> |   ok   |   summary    |
> +--------+--------------+
> | false  | Error: null  |
> +--------+--------------+
> 1 row selected (0.355 seconds)
> In the second case below, the message says refresh table metadata is supported 
> only for single-directory-based Parquet tables, but the command works for 
> nested multi-directory Parquet files.
> 0: jdbc:drill:> refresh table metadata maprfs.vnaranammalpuram.`rfm_sales_vw`;
> +--------+---------+
> |   ok   | summary |
> +--------+---------+
> | false  | Table rfm_sales_vw does not support metadata refresh. Support is currently limited to single-directory-based Parquet tables. |
> +--------+---------+
> 1 row selected (0.418 seconds)
> 0: jdbc:drill:>





[jira] [Commented] (DRILL-1170) YARN support for Drill

2016-03-28 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214693#comment-15214693
 ] 

Billie Rinaldi commented on DRILL-1170:
---

Have you considered using [Slider|http://slider.incubator.apache.org] instead 
of writing a new AM for Drill?  Slider would take care of most of these bullets 
already, without requiring you to write new Java code.  If there end up being 
new features Drill would need, I imagine the Slider folks would be receptive to 
adding those.

> YARN support for Drill
> --
>
> Key: DRILL-1170
> URL: https://issues.apache.org/jira/browse/DRILL-1170
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Neeraja
>Assignee: Paul Rogers
> Fix For: Future
>
>
> This is a tracking item to make Drill work with YARN.
> Below are a few requirements/needs to consider.
> - Drill should run as a YARN-based application, side by side with other 
> YARN-enabled applications (on the same nodes or different nodes). Both memory 
> and CPU resources of Drill should be controlled through this mechanism.
> - As a YARN-enabled application, Drill's resource consumption should be 
> adaptive to the load on the cluster. For example, when there is no load on 
> Drill, Drill should consume no resources on the cluster. As the load on 
> Drill increases, resources permitting, usage should grow proportionally.
> - Low latency is a key requirement for Apache Drill, along with support for 
> multiple users (concurrency in 100s-1000s). This should be supported when run 
> as a YARN application as well.





[jira] [Assigned] (DRILL-2100) Drill not deleting spooling files

2016-03-28 Thread Vitalii Diravka (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka reassigned DRILL-2100:
--

Assignee: Vitalii Diravka

> Drill not deleting spooling files
> -
>
> Key: DRILL-2100
> URL: https://issues.apache.org/jira/browse/DRILL-2100
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.8.0
>Reporter: Abhishek Girish
>Assignee: Vitalii Diravka
> Fix For: Future
>
>
> Currently, forcing queries to use an external sort by switching off 
> hash join/agg causes spill-to-disk files to accumulate. 
> This causes issues with disk space availability when the spill is configured 
> to be on the local file system (/tmp/drill). It is also not optimal when 
> configured to use DFS (custom). 
> Drill must clean up all temporary files created after a query completes or 
> after a drillbit restart. 





[jira] [Commented] (DRILL-2100) Drill not deleting spooling files

2016-03-28 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214433#comment-15214433
 ] 

Vitalii Diravka commented on DRILL-2100:


[~haozhu], 

Added deletion of the whole directory for the SQL profile ID 
("/tmp/drill/spill/2aa9600f-016a-5283-f98e-ef22942981c2" for example) when 
the FileSystem is closed. 
https://github.com/vdiravka/drill/commit/a5f891dbba06c2f15c8478c7843394c809de25c0
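
As a rough illustration of the cleanup idea in the comment above, the sketch below removes a per-query spill directory and everything under it. It is an assumption-laden stand-in: it uses plain {{java.nio.file}} on the local file system rather than the Hadoop FileSystem API touched by the linked commit, and {{SpillCleanupSketch}} and {{deleteRecursively}} are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not the code from the linked commit.
class SpillCleanupSketch {

  // Deletes dir and all of its contents; does nothing if dir does not exist.
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    List<Path> entries;
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order puts children before their parent directories.
      entries = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path entry : entries) {
      Files.delete(entry);
    }
  }

  public static void main(String[] args) throws IOException {
    // Build a throwaway directory that mimics a per-query spill location.
    Path queryDir = Files.createTempDirectory("drill-spill-sketch");
    Files.createFile(Files.createDirectory(queryDir.resolve("major_fragment_0")).resolve("run0"));
    deleteRecursively(queryDir);
    System.out.println(Files.exists(queryDir));
  }
}
```

Calling such a helper from the point where the query's resources are released would cover the "after a query completes" case; the "after a drillbit restart" case would need a separate sweep at startup.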

> Drill not deleting spooling files
> -
>
> Key: DRILL-2100
> URL: https://issues.apache.org/jira/browse/DRILL-2100
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.8.0
>Reporter: Abhishek Girish
> Fix For: Future
>
>
> Currently, forcing queries to use an external sort by switching off 
> hash join/agg causes spill-to-disk files to accumulate. 
> This causes issues with disk space availability when the spill is configured 
> to be on the local file system (/tmp/drill). It is also not optimal when 
> configured to use DFS (custom). 
> Drill must clean up all temporary files created after a query completes or 
> after a drillbit restart. 





[jira] [Commented] (DRILL-4458) JDBC plugin case sensitive table names

2016-03-28 Thread Serge Harnyk (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214195#comment-15214195
 ] 

Serge Harnyk commented on DRILL-4458:
-

This is a bug in the current version of Calcite that Drill uses:

  @Override
  public TableEntry getTable(String tableName, boolean caseSensitive) {
    Table t = schema.getTable(tableName);
    if (t == null) {
      TableEntry e = tableMap.get(tableName);
      if (e != null) {
        t = e.getTable();
      }
    }
    if (t != null) {
      return new TableEntryImpl(this, tableName, t, ImmutableList.of());
    }

    return null;
  }

The caseSensitive argument isn't used anywhere in this method.

In the latest version of Calcite this bug is fixed:
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/jdbc/CalciteSchema.java
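
For illustration, a lookup that honours the flag could look like the sketch below. This is a simplified, hypothetical class ({{TableLookupSketch}}, with tables represented as plain strings), not the actual Calcite fix.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified schema; this is not Calcite's CalciteSchema.
class TableLookupSketch {
  private final Map<String, String> tables = new LinkedHashMap<>();

  void add(String name, String table) {
    tables.put(name, table);
  }

  // Returns the table registered under tableName, or null when nothing matches.
  String getTable(String tableName, boolean caseSensitive) {
    String exact = tables.get(tableName);
    if (exact != null || caseSensitive) {
      return exact;
    }
    // Case-insensitive fallback: the first name that matches ignoring case wins.
    for (Map.Entry<String, String> e : tables.entrySet()) {
      if (e.getKey().equalsIgnoreCase(tableName)) {
        return e.getValue();
      }
    }
    return null;
  }

  public static void main(String[] args) {
    TableLookupSketch schema = new TableLookupSketch();
    schema.add("AD_Role", "table:AD_Role");
    System.out.println(schema.getTable("ad_role", true));
    System.out.println(schema.getTable("ad_role", false));
  }
}
```

The exact match is tried first so that a case-sensitive hit is never shadowed by a differently cased name.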

> JDBC plugin case sensitive table names
> --
>
> Key: DRILL-4458
> URL: https://issues.apache.org/jira/browse/DRILL-4458
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JDBC
>Affects Versions: 1.5.0
> Environment: Drill embedded mode on OSX, connecting to MS SQLServer
>Reporter: Paul Mogren
>Assignee: Serge Harnyk
>Priority: Minor
>
> I just tried Drill with MS SQL Server and I found that Drill treats table
> names case-sensitively, contrary to
> https://drill.apache.org/docs/lexical-structure/ which indicates that
> table names are "case-insensitive unless enclosed in double quotation
> marks". This presents a problem for users and existing SQL scripts that
> expect table names to be case-insensitive.
> This works: select * from mysandbox.dbo.AD_Role
> This does not work: select * from mysandbox.dbo.ad_role
> Mailing list reference including stack trace: 
> http://mail-archives.apache.org/mod_mbox/drill-user/201603.mbox/%3ccajrw0otv8n5ybmvu6w_efe4npgenrdk5grmh9jtbxu9xnni...@mail.gmail.com%3e





[jira] [Commented] (DRILL-4409) projecting literal will result in an empty resultset

2016-03-28 Thread Serge Harnyk (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214071#comment-15214071
 ] 

Serge Harnyk commented on DRILL-4409:
-

Summary:
1. PostgreSQL returns the "unknown" type for literals, which breaks the SQL standard.
2. MySQL and Oracle return VARCHAR for literals.
3. Nothing to fix in Drill itself, but it seems reasonable to add a note here: 
https://drill.apache.org/docs/jdbc-storage-plugin/

> projecting literal will result in an empty resultset
> 
>
> Key: DRILL-4409
> URL: https://issues.apache.org/jira/browse/DRILL-4409
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JDBC
>Affects Versions: 1.5.0
>Reporter: N Campbell
>Assignee: Serge Harnyk
>
> A query which projects a literal as shown against a Postgres table will 
> result in an empty result set being returned. 
> select 'BB' from postgres.public.tversion





[jira] [Commented] (DRILL-4003) Tests expecting Drill OversizedAllocationException yield OutOfMemoryError

2016-03-28 Thread sonali shrivastava (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213984#comment-15213984
 ] 

sonali shrivastava commented on DRILL-4003:
---

Hi, 

I was trying to port Apache Arrow 68 to the IBM Power architecture in a virtual 
machine, and I am facing the same error while running the Java tests, even 
though I have enough memory. Can you please let me know the status, with 
details, if it has been fixed recently?
Thank you.


> Tests expecting Drill OversizedAllocationException yield OutOfMemoryError
> -
>
> Key: DRILL-4003
> URL: https://issues.apache.org/jira/browse/DRILL-4003
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Tools, Build & Test
>Reporter: Daniel Barclay (Drill)
>
> Tests that expect Drill's {{OversizedAllocationException}} (for example, 
> {{TestValueVector.testFixedVectorReallocation()}}) sometimes fail with an 
> {{OutOfMemoryError}} instead.
> (Do the tests check whether there's enough memory available for the test 
> before proceeding?)





[jira] [Created] (DRILL-4545) Incorrect query plan for LIMIT 0 query

2016-03-28 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4545:
-

 Summary: Incorrect query plan for LIMIT 0 query
 Key: DRILL-4545
 URL: https://issues.apache.org/jira/browse/DRILL-4545
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.6.0
 Environment: 4 node cluster CentOS
Reporter: Khurram Faraaz


Inner query has a LIMIT 1 and outer query has LIMIT 0. Looking at the query 
plan it looks like the outer LIMIT 0 is applied before the LIMIT 1 is applied 
to inner query. This does not seem right.

Drill 1.6.0 commit ID : fb09973e

{noformat}
0: jdbc:drill:schema=dfs.tmp> explain plan for select * from (select * from 
`employee.json` limit 1) limit 0;
+--+--+
| text | json |
+--+--+
| 00-00    Screen
00-01      Project(*=[$0])
00-02        SelectionVectorRemover
00-03          Limit(fetch=[0])
00-04            Limit(fetch=[1])
00-05              Limit(offset=[0], fetch=[0])
00-06                Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/employee.json, numFiles=1, columns=[`*`], files=[maprfs:///tmp/employee.json]]])
{noformat}
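
As a sanity check on what the two stacked limits in the query should return, here is a small sketch of how an outer LIMIT applied to the output of an inner LIMIT combines into a single offset/fetch pair. {{LimitMergeSketch}} and {{merge}} are hypothetical names; this is plain arithmetic, not Drill planner code, and it says nothing about which plan shape the planner should produce.

```java
// Hypothetical illustration only; not Drill planner code.
class LimitMergeSketch {

  // Returns {offset, fetch} equivalent to applying the outer limit
  // to the rows produced by the inner limit.
  static long[] merge(long innerOffset, long innerFetch, long outerOffset, long outerFetch) {
    // Rows of the inner result that are left after skipping the outer offset.
    long remaining = Math.max(innerFetch - outerOffset, 0);
    return new long[] {innerOffset + outerOffset, Math.min(outerFetch, remaining)};
  }

  public static void main(String[] args) {
    // select * from (select * from t limit 1) limit 0
    long[] combined = merge(0, 1, 0, 0);
    System.out.println("offset=" + combined[0] + ", fetch=" + combined[1]);
  }
}
```

For the query in this report the combined fetch is 0, so the query returns no rows whichever limit is evaluated first.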

Here is the data from JSON file

{noformat}
[root@centos-01 ~]# cat employee.json
{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 45,
  "height_cm": 177.6,
  "address": {
    "streetAddress": "29 4th Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    }
  ],
  "children": [],
  "hobbies": ["scuba diving","hiking","biking","rock climbing","surfing"]
}
{noformat}


