date:20150923

[jira] [Updated] (DRILL-3826) Concurrent Query Submission leads to Channel Closed Exception

2015-09-23 Thread Yiyi Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiyi Hu updated DRILL-3826:
---
Attachment: shell-test-drillbit.log

> Concurrent Query Submission leads to Channel Closed Exception
> -
>
> Key: DRILL-3826
> URL: https://issues.apache.org/jira/browse/DRILL-3826
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Execution - RPC
>Affects Versions: 1.1.0
> Environment: - CentOS release 6.6 (Final)
> - hadoop-2.7.1
> - hbase-1.0.1.1
> - drill-1.1.0
> - jdk-1.8.0_45
>Reporter: Yiyi Hu
>Assignee: Daniel Barclay (Drill)
>  Labels: filesystem, hadoop, hbase, jdbc, rpc
> Attachments: shell-test-drillbit.log
>
>
> Channel Closed Exception is frequently seen when running concurrent queries 
> with a relatively large LIMIT.
> Here are the details:
> SET UP:
> - Single drillbit running on a single zookeeper node
> - 4G heap size, 8G direct memory
> - Storage plugins: local filesystem, hdfs, hbase
> TEST DATA:
> - A JSON file, test.json, with 50,000,000 records and two fields, id and 
> title
> SHELL TEST:
> - Running 4 drill shells concurrently with query:
>   SELECT id, title from dfs.`test.json` LIMIT 500.
> - Queries got canceled. The channel between client and server was seen to 
> close randomly, as in the example shown below:
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> ChannelClosedException: Channel closed /192.168.4.201:31010 <--> 
> /192.168.4.201:48829.
> Fragment 0:0
> [Error Id: 0bd2b500-155e-46e0-9f26-bd89fea47a25 on TEST-101:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> JDBC TEST:
> - 6 separate threads run the same query: SELECT id, title from 
> dfs.`test.json` LIMIT 1000. Each thread maintains its own connection to 
> Drill, and the resultSet, statement and connection are all closed at the end.
> - resultSet.next() is used to iterate over the result set; nothing else is 
> done.
> - The same channel closed exception is thrown randomly. Log files are attached 
> for review.
> - Memory usage was monitored and looked fine.
> CROSS STORAGE PLUGINS:
> - The same issue is seen not only with JSON on a file system (local/hdfs), 
> but also when querying the same 50,000,000-record table in HBase.
> - The issue is not seen in a single-threaded application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (DRILL-3826) Concurrent Query Submission leads to Channel Closed Exception

2015-09-23 Thread Yiyi Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiyi Hu updated DRILL-3826:
---
Description: 
Channel Closed Exception is frequently seen when running concurrent queries 
with a relatively large LIMIT.


Here are the details:

SET UP:
- Single drillbit running on a single zookeeper node
- 4G heap size, 8G direct memory
- Storage plugins: local filesystem, hdfs, hbase


TEST DATA:
- A JSON file, test.json, with 50,000,000 records and two fields, id and 
title (approximately 3G).


SHELL TEST:
- Running 4 drill shells concurrently with query:
  SELECT id, title from dfs.`test.json` LIMIT 500.

- Queries got canceled. The channel between client and server was seen to 
close randomly, as in the example shown below:

java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
ChannelClosedException: Channel closed /192.168.4.201:31010 <--> 
/192.168.4.201:48829.

Fragment 0:0

[Error Id: 0bd2b500-155e-46e0-9f26-bd89fea47a25 on TEST-101:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)


JDBC TEST:
- 6 separate threads run the same query: SELECT id, title from 
dfs.`test.json` LIMIT 1000. Each thread maintains its own connection to Drill, 
and the resultSet, statement and connection are all closed at the end.

- resultSet.next() is used to iterate over the result set; nothing else is done.

- The same channel closed exception is thrown randomly. Log files are attached 
for review.

- Memory usage was monitored and looked fine.

CROSS STORAGE PLUGINS:
- The same issue is seen not only with JSON on a file system (local/hdfs), 
but also when querying the same 50,000,000-record table in HBase.

- The issue is not seen in a single-threaded application.


  was:
Channel Closed Exception is frequently seen when running concurrent queries 
with a relatively large LIMIT.


Here are the details:

SET UP:
- Single drillbit running on a single zookeeper node
- 4G heap size, 8G direct memory
- Storage plugins: local filesystem, hdfs, hbase


TEST DATA:
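- A JSON file, test.json, with 50,000,000 records and two fields, id and 
title


SHELL TEST:
- Running 4 drill shells concurrently with query:
  SELECT id, title from dfs.`test.json` LIMIT 500.

- Queries got canceled. The channel between client and server was seen to 
close randomly, as in the example shown below:

java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
ChannelClosedException: Channel closed /192.168.4.201:31010 <--> 
/192.168.4.201:48829.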

Fragment 0:0

[Error Id: 0bd2b500-155e-46e0-9f26-bd89fea47a25 on TEST-101:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)


JDBC TEST:
- 6 separate threads run the same query: SELECT id, title from 
dfs.`test.json` LIMIT 1000. Each thread maintains its own connection to Drill, 
and the resultSet, statement and connection are all closed at the end.

- resultSet.next() is used to iterate over the result set; nothing else is done.

- The same channel closed exception is thrown randomly. Log files are attached 
for review.

- Memory usage was monitored and looked fine.

CROSS STORAGE PLUGINS:
- The same issue is seen not only with JSON on a file system (local/hdfs), 
but also when querying the same 50,000,000-record table in HBase.

- The issue is not seen in a single-threaded application.




[jira] [Updated] (DRILL-3826) Concurrent Query Submission leads to Channel Closed Exception

2015-09-23 Thread Yiyi Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiyi Hu updated DRILL-3826:
---
Attachment: Sample2.png

> Concurrent Query Submission leads to Channel Closed Exception
> -
>
> Key: DRILL-3826
> URL: https://issues.apache.org/jira/browse/DRILL-3826
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Execution - RPC
>Affects Versions: 1.1.0
> Environment: - CentOS release 6.6 (Final)
> - hadoop-2.7.1
> - hbase-1.0.1.1
> - drill-1.1.0
> - jdk-1.8.0_45
>Reporter: Yiyi Hu
>Assignee: Daniel Barclay (Drill)
>  Labels: filesystem, hadoop, hbase, jdbc, rpc
> Attachments: Sample1.png, Sample2.png, jdbc-test-client-drillbit.log, 
> shell-sqlline.log, shell-test-drillbit.log

[jira] [Updated] (DRILL-3826) Concurrent Query Submission leads to Channel Closed Exception

2015-09-23 Thread Yiyi Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiyi Hu updated DRILL-3826:
---
Attachment: Sample1.png

> Concurrent Query Submission leads to Channel Closed Exception
> -
>
> Key: DRILL-3826
> URL: https://issues.apache.org/jira/browse/DRILL-3826
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Execution - RPC
>Affects Versions: 1.1.0
> Environment: - CentOS release 6.6 (Final)
> - hadoop-2.7.1
> - hbase-1.0.1.1
> - drill-1.1.0
> - jdk-1.8.0_45
>Reporter: Yiyi Hu
>Assignee: Daniel Barclay (Drill)
>  Labels: filesystem, hadoop, hbase, jdbc, rpc
> Attachments: Sample1.png, jdbc-test-client-drillbit.log, 
> shell-sqlline.log, shell-test-drillbit.log

[jira] [Updated] (DRILL-3826) Concurrent Query Submission leads to Channel Closed Exception

2015-09-23 Thread Yiyi Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiyi Hu updated DRILL-3826:
---
Attachment: jdbc-test-client-drillbit.log

jdbc client side log

> Concurrent Query Submission leads to Channel Closed Exception
> -
>
> Key: DRILL-3826
> URL: https://issues.apache.org/jira/browse/DRILL-3826
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Execution - RPC
>Affects Versions: 1.1.0
> Environment: - CentOS release 6.6 (Final)
> - hadoop-2.7.1
> - hbase-1.0.1.1
> - drill-1.1.0
> - jdk-1.8.0_45
>Reporter: Yiyi Hu
>Assignee: Daniel Barclay (Drill)
>  Labels: filesystem, hadoop, hbase, jdbc, rpc
> Attachments: jdbc-test-client-drillbit.log, shell-sqlline.log, 
> shell-test-drillbit.log

[jira] [Created] (DRILL-3826) Concurrent Query Submission leads to Channel Closed Exception

2015-09-23 Thread Yiyi Hu (JIRA)

Yiyi Hu created DRILL-3826:
--

 Summary: Concurrent Query Submission leads to Channel Closed 
Exception
 Key: DRILL-3826
 URL: https://issues.apache.org/jira/browse/DRILL-3826
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC, Execution - RPC
Affects Versions: 1.1.0
 Environment: - CentOS release 6.6 (Final)
- hadoop-2.7.1
- hbase-1.0.1.1
- drill-1.1.0
- jdk-1.8.0_45
Reporter: Yiyi Hu
Assignee: Daniel Barclay (Drill)


Channel Closed Exception is frequently seen when running concurrent queries 
with a relatively large LIMIT.


Here are the details:

SET UP:
- Single drillbit running on a single zookeeper node
- 4G heap size, 8G direct memory
- Storage plugins: local filesystem, hdfs, hbase


TEST DATA:
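- A JSON file, test.json, with 50,000,000 records and two fields, id and 
title


SHELL TEST:
- Running 4 drill shells concurrently with query:
  SELECT id, title from dfs.`test.json` LIMIT 500.

- Queries got canceled. The channel between client and server was seen to 
close randomly, as in the example shown below:

java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
ChannelClosedException: Channel closed /192.168.4.201:31010 <--> 
/192.168.4.201:48829.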

Fragment 0:0

[Error Id: 0bd2b500-155e-46e0-9f26-bd89fea47a25 on TEST-101:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)


JDBC TEST:
- 6 separate threads run the same query: SELECT id, title from 
dfs.`test.json` LIMIT 1000. Each thread maintains its own connection to Drill, 
and the resultSet, statement and connection are all closed at the end.

- resultSet.next() is used to iterate over the result set; nothing else is done.

- The same channel closed exception is thrown randomly. Log files are attached 
for review.

- Memory usage was monitored and looked fine.

CROSS STORAGE PLUGINS:
- The same issue is seen not only with JSON on a file system (local/hdfs), 
but also when querying the same 50,000,000-record table in HBase.

- The issue is not seen in a single-threaded application.
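
The JDBC test described above can be sketched roughly as follows. This is a 
minimal, hypothetical harness, not the reporter's actual test code: the class 
name ConcurrentQueryRepro, the helper names, and passing the JDBC URL on the 
command line are all assumptions, and the Drill JDBC driver would have to be on 
the classpath for the query to run.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical reproduction harness; names are illustrative only.
class ConcurrentQueryRepro {

    // One worker: its own connection, statement and result set,
    // all closed by try-with-resources when the task ends.
    static Callable<Long> queryTask(String jdbcUrl, String sql) {
        return () -> {
            long rows = 0;
            try (Connection conn = DriverManager.getConnection(jdbcUrl);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    rows++; // iterate only, as in the report
                }
            }
            return rows;
        };
    }

    // Submits the same task from `threads` threads and records each outcome.
    static List<String> runConcurrently(int threads, Callable<Long> task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Long>> futures = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            futures.add(pool.submit(task));
        }
        List<String> outcomes = new ArrayList<>();
        for (Future<Long> f : futures) {
            try {
                outcomes.add("ok rows=" + f.get());
            } catch (ExecutionException e) {
                // A channel closed error would surface here as the cause.
                outcomes.add("failed: " + e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                outcomes.add("interrupted");
            }
        }
        pool.shutdown();
        return outcomes;
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("usage: ConcurrentQueryRepro <jdbc-url> [threads]");
            return;
        }
        int threads = args.length > 1 ? Integer.parseInt(args[1]) : 6;
        String sql = "SELECT id, title FROM dfs.`test.json` LIMIT 1000";
        runConcurrently(threads, queryTask(args[0], sql)).forEach(System.out::println);
    }
}
```

Running it with a thread count of 1 and then 6 would mirror the comparison in 
the report between a single-threaded application and concurrent submission.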





[jira] [Updated] (DRILL-3826) Concurrent Query Submission leads to Channel Closed Exception

2015-09-23 Thread Yiyi Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiyi Hu updated DRILL-3826:
---
Attachment: shell-sqlline.log

> Concurrent Query Submission leads to Channel Closed Exception
> -
>
> Key: DRILL-3826
> URL: https://issues.apache.org/jira/browse/DRILL-3826
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Execution - RPC
>Affects Versions: 1.1.0
> Environment: - CentOS release 6.6 (Final)
> - hadoop-2.7.1
> - hbase-1.0.1.1
> - drill-1.1.0
> - jdk-1.8.0_45
>Reporter: Yiyi Hu
>Assignee: Daniel Barclay (Drill)
>  Labels: filesystem, hadoop, hbase, jdbc, rpc
> Attachments: shell-sqlline.log, shell-test-drillbit.log

[jira] [Commented] (DRILL-3826) Concurrent Query Submission leads to Channel Closed Exception

2015-09-23 Thread Khurram Faraaz (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14904573#comment-14904573
 ] 

Khurram Faraaz commented on DRILL-3826:
---

The stack trace in drillbit.log is the same as the one reported in DRILL-3763.

> Concurrent Query Submission leads to Channel Closed Exception
> -
>
> Key: DRILL-3826
> URL: https://issues.apache.org/jira/browse/DRILL-3826
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Execution - RPC
>Affects Versions: 1.1.0
> Environment: - CentOS release 6.6 (Final)
> - hadoop-2.7.1
> - hbase-1.0.1.1
> - drill-1.1.0
> - jdk-1.8.0_45
>Reporter: Yiyi Hu
>Assignee: Daniel Barclay (Drill)
>  Labels: filesystem, hadoop, hbase, jdbc, rpc
> Attachments: Sample1.png, Sample2.png, jdbc-test-client-drillbit.log, 
> shell-sqlline.log, shell-test-drillbit.log

[jira] [Closed] (DRILL-2748) Filter is not pushed down into subquery with the group by

2015-09-23 Thread Victoria Markman (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman closed DRILL-2748.
---

> Filter is not pushed down into subquery with the group by
> -
>
> Key: DRILL-2748
> URL: https://issues.apache.org/jira/browse/DRILL-2748
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0, 1.0.0, 1.1.0
>Reporter: Victoria Markman
>Assignee: Jinfeng Ni
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-2748-Improve-cost-estimation-for-Drill-logical.patch
>
>
> I'm not sure about this one; theoretically, the filter could have been pushed 
> into the subquery.
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select a1, 
> b1, avg(a1) from t1 group by a1, b1) as sq(x, y, z) where x = 10;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(x=[$0], y=[$1], z=[$2])
> 00-02        Project(x=[$0], y=[$1], z=[CAST(/(CastHigh(CASE(=($3, 0), null, $2)), $3)):ANY NOT NULL])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($0, 10)])
> 00-05              HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], agg#1=[COUNT($0)])
> 00-06                Project(a1=[$1], b1=[$0])
> 00-07                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`]]])
> {code}
> Same with distinct in subquery:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from ( select 
> distinct a1, b1, c1 from t1 ) as sq(x, y, z) where x = 10;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(x=[$0], y=[$1], z=[$2])
> 00-02        Project(x=[$0], y=[$1], z=[$2])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($0, 10)])
> 00-05              HashAgg(group=[{0, 1, 2}])
> 00-06                Project(a1=[$2], b1=[$1], c1=[$0])
> 00-07                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`, `c1`]]])
> {code}
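
The reason the filter in the plans above could sit below the HashAgg is that 
the predicate only references a grouping column. A small self-contained sketch 
of that equivalence, using plain Java collections rather than anything from 
Drill's planner (the class and method names are made up for illustration), 
might look like this:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only: rows are {a1, b1} pairs, and the "query" is
// SELECT a1, b1, AVG(a1) ... GROUP BY a1, b1 with a filter on a1.
class FilterPushdownDemo {

    // GROUP BY a1, b1 with AVG(a1).
    static Map<List<Integer>, Double> aggregate(List<int[]> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                r -> Arrays.asList(r[0], r[1]),
                Collectors.averagingInt(r -> r[0])));
    }

    // Filter applied on top of the aggregate, as in the reported plan.
    static Map<List<Integer>, Double> filterAfter(List<int[]> rows, int x) {
        return aggregate(rows).entrySet().stream()
                .filter(e -> e.getKey().get(0) == x)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    // Filter pushed below the aggregate: fewer rows reach the grouping step.
    static Map<List<Integer>, Double> filterBefore(List<int[]> rows, int x) {
        return aggregate(rows.stream()
                .filter(r -> r[0] == x)
                .collect(Collectors.toList()));
    }
}
```

Both methods return the same groups for any input, because dropping rows with 
a different a1 can only remove whole groups, never change the contents of a 
group that survives the filter.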




[jira] [Created] (DRILL-3827) Empty metadata file causes queries on the table to fail

2015-09-23 Thread Victoria Markman (JIRA)

Victoria Markman created DRILL-3827:
---

 Summary: Empty metadata file causes queries on the table to fail
 Key: DRILL-3827
 URL: https://issues.apache.org/jira/browse/DRILL-3827
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.2.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni
Priority: Critical


I ran into a situation where Drill created an empty metadata file. (That is a 
separate issue, which I will try to narrow down. The suspicion is that it 
happens when "refresh table metadata x" fails with a "permission denied" error.)

However, we need to guard against the situation where the metadata file is 
empty or corrupted. We should probably skip reading it if we encounter an 
unexpected result and continue with query planning without that information, in 
the same fashion as a partition pruning failure. It is also important to log 
this information somewhere, with drillbit.log as a start. It would be really 
nice to have a flag in the query profile that tells a user whether the metadata 
file was used for planning; that would help in debugging performance issues.

A very confusing exception is thrown if there is a zero-length metadata file in 
the directory:
{code}
[Wed Sep 23 07:45:28] # ls -la
total 2
drwxr-xr-x  2 root root   2 Sep 10 14:55 .
drwxr-xr-x 16 root root  35 Sep 15 12:54 ..
-rwxr-xr-x  1 root root 483 Jul  1 11:29 0_0_0.parquet
-rwxr-xr-x  1 root root   0 Sep 10 14:55 .drill.parquet_metadata

0: jdbc:drill:schema=dfs> select * from t1;
Error: SYSTEM ERROR: JsonMappingException: No content to map due to end-of-input
 at [Source: com.mapr.fs.MapRFsDataInputStream@342bd88d; line: 1, column: 1]
[Error Id: c97574f6-b3e8-4183-8557-c30df6ca675f on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

The workaround is trivial: remove the file. I am marking this as critical, since 
we don't have any concurrency control in place and this file can get corrupted 
as well.
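
The guard suggested above (skip an empty or corrupted metadata file, log it, 
and carry on planning without it) could be sketched along these lines. This is 
only an illustration under assumptions: MetadataCacheGuard and readIfUsable are 
hypothetical names, the parser is passed in as a plain function instead of 
Drill's actual JSON mapping code, and java.util.logging stands in for whatever 
logger the drillbit uses.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.function.Function;
import java.util.logging.Logger;

// Hypothetical helper; not part of Drill.
class MetadataCacheGuard {
    private static final Logger LOG =
            Logger.getLogger(MetadataCacheGuard.class.getName());

    // Returns the parsed metadata, or empty if the file is missing,
    // zero-length, unreadable, or rejected by the parser. On empty,
    // the caller is expected to plan the query without the cache.
    static <T> Optional<T> readIfUsable(Path file, Function<String, T> parser) {
        try {
            if (!Files.isRegularFile(file) || Files.size(file) == 0) {
                LOG.warning("Ignoring missing or empty metadata file: " + file);
                return Optional.empty();
            }
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            return Optional.ofNullable(parser.apply(text));
        } catch (IOException | RuntimeException e) {
            // Covers read failures and parse errors on corrupted content.
            LOG.warning("Ignoring unusable metadata file " + file + ": " + e);
            return Optional.empty();
        }
    }
}
```

Returning an Optional keeps the "was the metadata file used" decision visible 
to the caller, which is also the information the proposed query profile flag 
would need.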





[jira] [Commented] (DRILL-2748) Filter is not pushed down into subquery with the group by

2015-09-23 Thread Victoria Markman (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14904614#comment-14904614
 ] 

Victoria Markman commented on DRILL-2748:
-

Verified fixed in 1.2.0

#Tue Sep 22 19:46:29 UTC 2015
git.commit.id.abbrev=942d352

Added a test suite to test filter pushdown in isolation: 
Functional/Passing/filter/pushdown


[jira] [Created] (DRILL-3828) Metadata Caching + Impersonation : Inconsistency in reporting a permission error

2015-09-23 Thread Rahul Challapalli (JIRA)

Rahul Challapalli created DRILL-3828:


 Summary: Metadata Caching + Impersonation : Inconsistency in 
reporting a permission error
 Key: DRILL-3828
 URL: https://issues.apache.org/jira/browse/DRILL-3828
 Project: Apache Drill
  Issue Type: Bug
  Components: Metadata
Reporter: Rahul Challapalli
Assignee: Mehant Baid


git.commit.id.abbrev=3c89b30

User A has permissions to lineitem and lineitem/2006. He does not have access 
to lineitem/2007.

Scenario 1 : There is already a cache file created by a different user in the 
lineitem folder, then
a) refresh table metadata lineitem : succeeds
b) refresh table metadata `lineitem/2007` : fails with a permission error

Scenario 2 : There is no cache file in the lineitem folder, then
a) refresh table metadata lineitem : fails with a permission error
b) refresh table metadata `lineitem/2007` : fails with a permission error

Scenario 2 seems to be the expected result in this case, as the cache file 
contains metadata about files within the sub-directories as well.
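
The expectation behind Scenario 2 can be sketched as follows. This is a minimal Python illustration, not Drill code: `can_refresh_metadata`, `all_dirs` and `readable` are hypothetical names. It assumes a refresh should succeed only if the user can read the target directory and every sub-directory beneath it, whether or not a cache file already exists.

```python
def can_refresh_metadata(target, all_dirs, readable):
    """Return True only if the user can read `target` and every directory under it.

    Hypothetical model: `all_dirs` lists every directory of the table and
    `readable` is the subset the user may access.
    """
    prefix = target.rstrip("/") + "/"
    covered = [d for d in all_dirs if d == target or d.startswith(prefix)]
    return all(d in readable for d in covered)


all_dirs = ["lineitem", "lineitem/2006", "lineitem/2007"]
readable = {"lineitem", "lineitem/2006"}  # User A from the report

print(can_refresh_metadata("lineitem", all_dirs, readable))       # False
print(can_refresh_metadata("lineitem/2006", all_dirs, readable))  # True
print(can_refresh_metadata("lineitem/2007", all_dirs, readable))  # False
```

Under this model, case a) of Scenario 1 would also be expected to fail, which is the inconsistency being reported.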





[jira] [Reopened] (DRILL-3795) TextReader can't read .tsv file contains multiple double quotes

2015-09-23 Thread Chun Chang (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang reopened DRILL-3795:
---

This is what I get now with a newer build. It still does not seem right.

{noformat}
0: jdbc:drill:schema=dfs.tpch_maprdb> select * from dfs.tmp.`drill-3718.tsv`;
++
|columns |
++
| ["another no quote\"\"anotherwith quote\"\""]  |
| ["\"another with double quotes\"  no quotes\n"]|
++
2 rows selected (0.421 seconds)
0: jdbc:drill:schema=dfs.tpch_maprdb> select commit_id from sys.version;
+---+
| commit_id |
+---+
| 813903a34ea1c9c3fec28f2472312c8785f780c5  |
+---+
{noformat}

> TextReader can't read .tsv file contains multiple double quotes
> ---
>
> Key: DRILL-3795
> URL: https://issues.apache.org/jira/browse/DRILL-3795
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.2.0
>Reporter: Chun Chang
>Assignee: Sean Hsuan-Yi Chu
> Attachments: drill-3795.tsv
>
>
> commit_id: 69c73af54ac3d15b8e7c21e8a3c35b4a62ebc844
> I have a simple tab-delimited file that contains multiple double-quoted text fields:
> {noformat}
> another no quote""anotherwith quote""
> ""another with double quotes""  no quotes
> {noformat}
> This causes the following error:
> {noformat}
> 0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select columns[0], columns[1] 
> from dfs.tmp.`drill-3718.tsv`;
> Error: SYSTEM ERROR: TextParsingException: Error processing input: Cannot use 
> newline character within quoted string, line=2, char=61. Content parsed: [ ]
> Fragment 0:0
> [Error Id: c631eccc-038c-4d61-bda8-e7037c3677e8 on 10.10.30.166:31010] 
> (state=,code=0)
> {noformat}




[jira] [Commented] (DRILL-3807) [Regression] Query with equality join and a FALSE condition fails to plan

2015-09-23 Thread Jinfeng Ni (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14904879#comment-14904879
 ] 

Jinfeng Ni commented on DRILL-3807:
---

The problem seems to be caused by the constant_folding rule. If I disable the 
constant_folding rule, the query plans successfully.

{code}
alter session set `planner.enable_constant_folding` = false;

+---+---+
|  ok   |  summary  |
+---+---+
| true  | planner.enable_constant_folding updated.  |
+---+---+
1 row selected (0.416 seconds)
0: jdbc:drill:zk=local> explain plan for select l.l_quantity from 
cp.`tpch/lineitem.parquet` l, cp.`tpch/part.parquet` p where l.l_partkey = 
p.p_partkey and (1 = 0);
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(l_quantity=[$1])
00-02HashJoin(condition=[=($0, $2)], joinType=[inner])
00-04  SelectionVectorRemover
00-05Filter(condition=[=(1, 0)])
00-06  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=classpath:/tpch/lineitem.parquet]], 
selectionRoot=classpath:/tpch/lineitem.parquet, numFiles=1, 
columns=[`l_partkey`, `l_quantity`]]])
00-03  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=classpath:/tpch/part.parquet]], 
selectionRoot=classpath:/tpch/part.parquet, numFiles=1, columns=[`p_partkey`]]])

{code}

Can you find which change causes the constant folding rule to kick in and 
affect the join query planning?




> [Regression] Query with equality join and a FALSE condition fails to plan
> -
>
> Key: DRILL-3807
> URL: https://issues.apache.org/jira/browse/DRILL-3807
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Aman Sinha
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
>
> 1.2.0-SNAPSHOT behavior: 
> {code}
> 0: jdbc:drill:zk=local> explain plan for select l.l_quantity from 
> cp.`tpch/lineitem.parquet` l, cp.`tpch/part.parquet` p where l.l_partkey = 
> p.p_partkey and (1 = 0);
> Error: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due 
> to either a cartesian join or an inequality join
> [Error Id: f7466d86-b709-465e-bb49-d3c51ecf941b on 172.16.0.160:31010] 
> (state=,code=0)
> {code}
> The simplification of  ' l.l_partkey = p.p_partkey and (1 = 0)' to a False 
> condition is valid and accordingly Drill fails to plan due to the cartesian 
> join introduced by the False condition.   However,  in 1.1.0 apparently the 
> 1=0 was converted to a LIMIT 0 which was pushed below the Join and the query 
> successfully planned and executed: 
> 1.1.0 behavior: 
> {code}
> 0: jdbc:drill:zk=local> explain plan for select l.l_quantity from 
> cp.`tpch/lineitem.parquet` l, cp.`tpch/part.parquet` p where l.l_partkey = 
> p.p_partkey and (1 = 0);
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(l_quantity=[$1])
> 00-02HashJoin(condition=[=($0, $2)], joinType=[inner])
> 00-04  SelectionVectorRemover
> 00-05Limit(offset=[0], fetch=[0])
> 00-06  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=classpath:/tpch/lineitem.parquet]], 
> selectionRoot=classpath:/tpch/lineitem.parquet, numFiles=1, 
> columns=[`l_partkey`, `l_quantity`]]])
> 00-03  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=classpath:/tpch/part.parquet]], 
> selectionRoot=classpath:/tpch/part.parquet, numFiles=1, 
> columns=[`p_partkey`]]])
> {code}
> [~cchang] and I looked at the commit history and it appears that the 
> regression started somewhere between Aug 24 and Aug 28, which is the time 
> when we rebased on Calcite 1.4.0.  So we need to narrow down further the 
> change that may have caused this. 
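
The simplification described above can be illustrated with a small Python sketch. It is not Calcite's or Drill's rule: `fold_and` and the list-based form of the condition are hypothetical. It assumes only the standard rule that a conjunction containing a constant FALSE term is itself FALSE, which is why the equality condition disappears and the planner is left with what looks like a cartesian join.

```python
def fold_and(terms):
    """Fold a conjunction of terms.

    A term is either a Python bool (an already-evaluated constant such as
    1 = 0) or a string standing for a non-constant predicate.
    """
    if any(t is False for t in terms):
        return False                                  # x AND FALSE -> FALSE
    remaining = [t for t in terms if t is not True]   # x AND TRUE  -> x
    if not remaining:
        return True
    return remaining


join_condition = ["l.l_partkey = p.p_partkey", 1 == 0]
print(fold_and(join_condition))  # False
```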




[jira] [Resolved] (DRILL-3795) TextReader can't read .tsv file contains multiple double quotes

2015-09-23 Thread Sean Hsuan-Yi Chu (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-3795.
--
Resolution: Not A Problem

The input is not valid. Let's take the second field in the first row as an 
example.
(""another   with quote"")

The outermost pair of " marks the boundaries of a field. Thus, any " 
sitting inside this field has to be escaped with another ". So we should rewrite 
that field as """another with quote""".
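
The escaping rule can be checked with Python's standard `csv` module. It is used here only as an analogy: it follows the same quote-doubling convention, but it is not Drill's text reader. The field value is the example from this comment.

```python
import csv
import io

field = '"another with quote"'   # the value to store, including its own quotes

# Write one tab-separated row; the writer quotes the field and doubles inner quotes.
buf = io.StringIO()
csv.writer(buf, delimiter="\t", lineterminator="\n").writerow(["no quote", field])
encoded = buf.getvalue()
print(encoded)

# Reading the row back recovers the original value.
decoded = next(csv.reader(io.StringIO(encoded), delimiter="\t"))
print(decoded[1] == field)  # True
```

The written row contains the second field as `"""another with quote"""`, matching the rewrite suggested above.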




> TextReader can't read .tsv file contains multiple double quotes
> ---
>
> Key: DRILL-3795
> URL: https://issues.apache.org/jira/browse/DRILL-3795
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.2.0
>Reporter: Chun Chang
>Assignee: Sean Hsuan-Yi Chu
> Attachments: drill-3795.tsv
>
>
> commit_id: 69c73af54ac3d15b8e7c21e8a3c35b4a62ebc844
> I have a simple tab-delimited file that contains multiple double-quoted text fields:
> {noformat}
> another no quote""anotherwith quote""
> ""another with double quotes""  no quotes
> {noformat}
> This causes the following error:
> {noformat}
> 0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select columns[0], columns[1] 
> from dfs.tmp.`drill-3718.tsv`;
> Error: SYSTEM ERROR: TextParsingException: Error processing input: Cannot use 
> newline character within quoted string, line=2, char=61. Content parsed: [ ]
> Fragment 0:0
> [Error Id: c631eccc-038c-4d61-bda8-e7037c3677e8 on 10.10.30.166:31010] 
> (state=,code=0)
> {noformat}




[jira] [Commented] (DRILL-3795) TextReader can't read .tsv file contains multiple double quotes

2015-09-23 Thread Chun Chang (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14904841#comment-14904841
 ] 

Chun Chang commented on DRILL-3795:
---

With tab between columns, this is what I got: 

{noformat}
0: jdbc:drill:schema=dfs.tpch_maprdb> select columns[0], columns[1] from 
dfs.tmp.`drill-3718.tsv`;
+--+-+
|  EXPR$0  | EXPR$1  |
+--+-+
| another no quote |  ""anotherwith quote""  |
| "another with double quotes"  no quotes
  | null|
+--+-+
2 rows selected (0.419 seconds)
0: jdbc:drill:schema=dfs.tpch_maprdb> select * from dfs.tmp.`drill-3718.tsv`;
++
|  columns   |
++
| ["another no quote"," \"\"anotherwith quote\"\""]  |
| ["\"another with double quotes\"\tno quotes\n"]|
++
2 rows selected (0.428 seconds)
{noformat}

> TextReader can't read .tsv file contains multiple double quotes
> ---
>
> Key: DRILL-3795
> URL: https://issues.apache.org/jira/browse/DRILL-3795
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.2.0
>Reporter: Chun Chang
>Assignee: Sean Hsuan-Yi Chu
> Attachments: drill-3795.tsv
>
>
> commit_id: 69c73af54ac3d15b8e7c21e8a3c35b4a62ebc844
> I have a simple tab-delimited file that contains multiple double-quoted text fields:
> {noformat}
> another no quote""anotherwith quote""
> ""another with double quotes""  no quotes
> {noformat}
> This causes the following error:
> {noformat}
> 0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select columns[0], columns[1] 
> from dfs.tmp.`drill-3718.tsv`;
> Error: SYSTEM ERROR: TextParsingException: Error processing input: Cannot use 
> newline character within quoted string, line=2, char=61. Content parsed: [ ]
> Fragment 0:0
> [Error Id: c631eccc-038c-4d61-bda8-e7037c3677e8 on 10.10.30.166:31010] 
> (state=,code=0)
> {noformat}




[jira] [Updated] (DRILL-3486) Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's available

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3486:
--
Summary: Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. 
once it's available  (was: JDBC doc. pages should link to JDBC driver Javadoc 
doc. once available)

> Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's 
> available
> --
>
> Key: DRILL-3486
> URL: https://issues.apache.org/jira/browse/DRILL-3486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Bridget Bevens
>
> The Drill documentation site's JDBC pages should have a link to a copy of the 
> driver's generated Javadoc documentation once we start generating it.




[jira] [Updated] (DRILL-3485) Doc. site JDBC pages should at least point to JDBC Javadoc in source

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3485:
--
Summary: Doc. site JDBC pages should at least point to JDBC Javadoc in 
source  (was: Doc. JDBC pages should at least point to JDBC Javadoc in source)

> Doc. site JDBC pages should at least point to JDBC Javadoc in source
> 
>
> Key: DRILL-3485
> URL: https://issues.apache.org/jira/browse/DRILL-3485
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Bridget Bevens
>
> We don't yet generate and publish Javadoc documentation for Drill's JDBC 
> driver, and therefore the Drill documentation site's JDBC pages can't yet 
> link to generated Javadoc documentation as they eventually should.
> However, we have already written Javadoc source documentation for much of the 
> Drill-specific behavior and extensions in the JDBC interface.
> Since that documentation already exists, we should point users to it somehow 
> (until we provide its information to the users normally, as generated Javadoc 
> documentation).
> Therefore, in the interim, the Drill documentation site's JDBC pages should 
> at least point to the source code at 
> [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc].




[jira] [Updated] (DRILL-3485) Doc. site JDBC page(s) should at least point to JDBC Javadoc in source

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3485:
--
Summary: Doc. site JDBC page(s) should at least point to JDBC Javadoc in 
source  (was: Doc. site JDBC pages should at least point to JDBC Javadoc in 
source)

> Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
> --
>
> Key: DRILL-3485
> URL: https://issues.apache.org/jira/browse/DRILL-3485
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Bridget Bevens
>
> We don't yet generate and publish Javadoc documentation for Drill's JDBC 
> driver, and therefore the Drill documentation site's JDBC pages can't yet 
> link to generated Javadoc documentation as they eventually should.
> However, we have already written Javadoc source documentation for much of the 
> Drill-specific behavior and extensions in the JDBC interface.
> Since that documentation already exists, we should point users to it somehow 
> (until we provide its information to the users normally, as generated Javadoc 
> documentation).
> Therefore, in the interim, the Drill documentation site's JDBC pages should 
> at least point to the source code at 
> [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc].




[jira] [Closed] (DRILL-3826) Concurrent Query Submission leads to Channel Closed Exception

2015-09-23 Thread Khurram Faraaz (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-3826.
-
Resolution: Duplicate

> Concurrent Query Submission leads to Channel Closed Exception
> -
>
> Key: DRILL-3826
> URL: https://issues.apache.org/jira/browse/DRILL-3826
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Execution - RPC
>Affects Versions: 1.1.0
> Environment: - CentOS release 6.6 (Final)
> - hadoop-2.7.1
> - hbase-1.0.1.1
> - drill-1.1.0
> - jdk-1.8.0_45
>Reporter: Yiyi Hu
>Assignee: Daniel Barclay (Drill)
>  Labels: filesystem, hadoop, hbase, jdbc, rpc
> Attachments: Sample1.png, Sample2.png, jdbc-test-client-drillbit.log, 
> shell-sqlline.log, shell-test-drillbit.log
>
>
> Frequently seen CHANNEL CLOSED EXCEPTION while running concurrent queries with 
> a relatively large LIMIT.
> Here are the details,
> SET UP:
> - Single drillbit running on a single zookeeper node
> - 4G heap size, 8G direct memory
> - Storage plugins: local filesystem, hdfs, hbase
> TEST DATA:
> - A 50,000,000 records json file test.json, with two fields id, 
> title  (approximately 3G).
> SHELL TEST:
> - Running 4 drill shells concurrently with query:
>   SELECT id, title from dfs.`test.json` LIMIT 500.
> - Queries got canceled. Channel closures between client and server were seen 
> randomly, as in the example shown below:
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> ChannelClosedException: Channel closed /192.168.4.201:31010 <--> 
> /192.168.4.201:48829.
> Fragment 0:0
> [Error Id: 0bd2b500-155e-46e0-9f26-bd89fea47a25 on TEST-101:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> JDBC TEST:
> - 6 separate threads running the same query: SELECT id, title from 
> dfs.`test.json` LIMIT 1000, each maintains its own connection to drill 
> and resultSet, statement and connection are closed finally.
> - Used resultSet.next() to iterate on the result set, do nothing else.
> - Throws the same channel closed exception randomly. Log files are attached 
> for review.
> - Memory usage was monitored, all good.
> CROSS STORAGE PLUGINS:
> - The same issue can be found not only in JSON on a file system (local/hdfs), 
> but also when querying the same 50,000,000 records table in HBASE.
> - The issue is not found in a single thread application.




[jira] [Commented] (DRILL-3741) Document logging configuration for JDBC-all

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14904963#comment-14904963
 ] 

Daniel Barclay (Drill) commented on DRILL-3741:
---

Notes for documentation:

- Internally, Drill uses [SLF4J|http://www.slf4j.org/], which can log through 
different logging back ends.
- Drill's JDBC-all Jar file does not contain any logging back end for SLF4J.  
(That avoids interfering with the calling application's choice of which back 
end to use.)  (A warning about not finding 
{{org.slf4j.impl.StaticLoggerBinder}} indicates that SLF4J didn't find a back 
end.)
- A logging back end for SLF4J is typically added by adding the back end's Jar 
files to the class path and configuring the back end appropriately.
- For example, to add the [Logback|http://logback.qos.ch/] back end, add 
Logback's  {{logback-core}} and {{logback-classic}} Jar files to the class path 
and a {{logback.xml}} Logback configuration file in some classpath root (Jar 
file or directory).







> Document logging configuration for JDBC-all
> ---
>
> Key: DRILL-3741
> URL: https://issues.apache.org/jira/browse/DRILL-3741
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> Add some documentation about how to configure logging when using the JDBC-all 
> Jar file.
> (Presumably, the user needs to select an SLF4J back end, put its Jar file on 
> the class path somewhere, and configure it however that specific back end 
> supports configuration, and we link to SLF4J documentation for details.)
> Probably have something in the Javadoc for class 
> {{org.apache.drill.jdbc.Driver}} or for {{package org.apache.drill.jdbc}} and 
> something in or near the Drill site documentation page 
> [https://drill.apache.org/docs/using-the-jdbc-driver/], and have them refer to 
> each other (so that from whichever starting point, the users easily find the 
> other documentation).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (DRILL-3815) unknown suffixes .not_json and .json_not treated differently (multi-file case)

2015-09-23 Thread Jacques Nadeau (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14904914#comment-14904914
 ] 

Jacques Nadeau commented on DRILL-3815:
---

Can you also confirm that you have no default input format set?

> unknown suffixes .not_json and .json_not treated differently (multi-file case)
> --
>
> Key: DRILL-3815
> URL: https://issues.apache.org/jira/browse/DRILL-3815
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Reporter: Daniel Barclay (Drill)
>Assignee: Jacques Nadeau
>
> In scanning a directory subtree used as a table, unknown filename extensions 
> seem to be treated differently depending on whether they're similar to known 
> file extensions.  The behavior suggests that Drill checks whether a file name 
> _contains_ an extension's string rather than _ending_ with it. 
> For example, given these subtrees with almost identical leaf file names:
> {noformat}
> $ find /tmp/testext_xx_json/
> /tmp/testext_xx_json/
> /tmp/testext_xx_json/voter2.not_json
> /tmp/testext_xx_json/voter1.json
> $ find /tmp/testext_json_xx/
> /tmp/testext_json_xx/
> /tmp/testext_json_xx/voter1.json
> /tmp/testext_json_xx/voter2.json_not
> $ 
> {noformat}
> the results of trying to use them as tables differ:
> {noformat}
> 0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_xx_json`;
> Sep 21, 2015 11:41:50 AM 
> org.apache.calcite.sql.validate.SqlValidatorException 
> ...
> Error: VALIDATION ERROR: From line 1, column 17 to line 1, column 25: Table 
> 'dfs.tmp.testext_xx_json' not found
> [Error Id: 6fe41deb-0e39-43f6-beca-de27b39d276b on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_json_xx`;
> +---+
> | onecf |
> +---+
> | {"name":"someName1"}  |
> | {"name":"someName2"}  |
> +---+
> 2 rows selected (0.149 seconds)
> {noformat}
> (Other probing seems to indicate that there is also some sensitivity to 
> whether the extension contains an underscore character.)
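
The suspected difference can be shown with a short Python sketch. It does not reproduce Drill's code: `matches_by_contains` and `matches_by_suffix` are hypothetical helpers. The assumption, taken from the hypothesis in the report, is that the check looks for the string `.json` anywhere in the file name.

```python
EXTENSION = ".json"

def matches_by_contains(name):
    """Substring check: true if the extension appears anywhere in the name."""
    return EXTENSION in name

def matches_by_suffix(name):
    """Suffix check: true only if the name ends with the extension."""
    return name.endswith(EXTENSION)

for name in ["voter1.json", "voter2.not_json", "voter2.json_not"]:
    print(name, matches_by_contains(name), matches_by_suffix(name))
```

Under the substring check only `voter2.json_not` is misclassified as JSON. That would be consistent with `testext_json_xx` being readable while `testext_xx_json` is not.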




[jira] [Commented] (DRILL-3479) Sqlline from drill v1.1.0 displays version as 1.0.0

2015-09-23 Thread Parth Chandra (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905096#comment-14905096
 ] 

Parth Chandra commented on DRILL-3479:
--

As a shorter-term fix, I'm putting together a patch that reads the version from 
the jar manifest file. I think the properties file is a better idea, since it 
would also let us get off the sqlline fork. 
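
Reading a version from a jar manifest can be sketched in Python, since a jar is a zip archive with a `META-INF/MANIFEST.MF` entry. This is an illustration only, not the patch mentioned above: `manifest_version` is a hypothetical helper. It assumes the version is stored under the conventional `Implementation-Version` attribute.

```python
import io
import zipfile

def manifest_version(jar_file, attribute="Implementation-Version"):
    """Return the given main attribute from a jar's manifest, or None.

    Simplification: wrapped (continuation) manifest lines are not handled.
    """
    with zipfile.ZipFile(jar_file) as jar:
        text = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == attribute:
            return value.strip()
    return None

# Build a tiny in-memory jar to exercise the helper.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as jar:
    jar.writestr("META-INF/MANIFEST.MF",
                 "Manifest-Version: 1.0\nImplementation-Version: 1.1.0\n")
demo.seek(0)
print(manifest_version(demo))  # 1.1.0
```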

> Sqlline from drill v1.1.0 displays version as 1.0.0
> ---
>
> Key: DRILL-3479
> URL: https://issues.apache.org/jira/browse/DRILL-3479
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Krystal
>Assignee: Parth Chandra
>Priority: Minor
> Fix For: 1.2.0
>
>
> Sqlline from drill 1.1.0 displays drill version as 1.0.0
> /opt/drill/bin/sqlline
> apache drill 1.0.0 
> "start your sql engine"




[jira] [Updated] (DRILL-3578) UnsupportedOperationException: Unable to get value vector class for minor type [FIXEDBINARY] and mode [OPTIONAL]

2015-09-23 Thread Parth Chandra (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-3578:
-
Assignee: Parth Chandra  (was: Jason Altekruse)

> UnsupportedOperationException: Unable to get value vector class for minor 
> type [FIXEDBINARY] and mode [OPTIONAL]
> 
>
> Key: DRILL-3578
> URL: https://issues.apache.org/jira/browse/DRILL-3578
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.1.0
>Reporter: Hao Zhu
>Assignee: Parth Chandra
>Priority: Critical
> Fix For: 1.3.0
>
>
> The issue is that Drill fails to read the "timestamp" type in a parquet file 
> generated by Hive.
> How to reproduce:
> 1. Create an external Hive CSV table in hive 1.0:
> {code}
> create external table type_test_csv
> (
>   id1 int,
>   id2 string,
>   id3 timestamp,
>   id4 double
> )
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ','
> STORED AS TEXTFILE
> LOCATION '/xxx/testcsv';
> {code}
> 2. Put sample data for above external table:
> {code}
> 1,One,2015-01-01 00:01:00,1.0
> 2,Two,2015-01-02 00:02:00,2.0
> {code}
> 3. Create a parquet hive table:
> {code}
> create external table type_test
> (
>   id1 int,
>   id2 string,
>   id3 timestamp,
>   id4 double
> )
> STORED AS PARQUET
> LOCATION '/xxx/type_test';
> INSERT OVERWRITE TABLE type_test
>   SELECT * FROM type_test_csv;
> {code}
> 4. Then query the parquet file directly through the filesystem storage plugin:
> {code}
> > select * from dfs.`xxx/type_test`;
> Error: SYSTEM ERROR: UnsupportedOperationException: Unable to get value 
> vector class for minor type [FIXEDBINARY] and mode [OPTIONAL]
> Fragment 0:0
> [Error Id: fccfe8b2-6427-46e5-8bfd-cac639e526e8 on h3.poc.com:31010] 
> (state=,code=0)
> {code}
> 5. If the sample data is only 1 row:
> {code}
> 1,One,2015-01-01 00:01:00,1.0
> {code}
> Then the error message would become:
> {code}
> > select * from dfs.`xxx/type_test`;
> Error: SYSTEM ERROR: UnsupportedOperationException: Unsupported type:INT96
> [Error Id: b52b5d46-63a8-4be6-a11d-999a1b46c7c2 on h3.poc.com:31010] 
> (state=,code=0)
> {code}
> Using Hive storage plugin works fine. This issue only applies to filesystem 
> storage plugin.




[jira] [Commented] (DRILL-3242) Enhance RPC layer to offload all request work onto a separate thread.

2015-09-23 Thread Jacques Nadeau (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14904921#comment-14904921
 ] 

Jacques Nadeau commented on DRILL-3242:
---

I'm inclined to get this merged. What do others think?

> Enhance RPC layer to offload all request work onto a separate thread.
> -
>
> Key: DRILL-3242
> URL: https://issues.apache.org/jira/browse/DRILL-3242
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - RPC
>Reporter: Chris Westin
>Assignee: Jacques Nadeau
> Fix For: 1.2.0
>
> Attachments: DRILL-3242.patch
>
>
> Right now, the app is responsible for ensuring that only very small amounts of 
> work are done on the RPC thread.  In some cases, the app doesn't do this 
> correctly.  Additionally, in high-load situations these small amounts of work 
> become non-trivial.  As such, we need to make the RPC layer protect itself from 
> slow requests/responses.
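
The idea of keeping request work off the RPC thread can be sketched with Python's `concurrent.futures`. This is a generic illustration, not Drill's RPC layer or the attached patch: `on_message` and `handle_request` are hypothetical names. The receiving thread is assumed to do nothing except hand each request to a worker pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

workers = ThreadPoolExecutor(max_workers=4)

def handle_request(payload):
    """Potentially slow application work; runs on a pool thread."""
    return payload.upper(), threading.current_thread().name

def on_message(payload):
    """Called on the receiving (RPC) thread: only enqueue, never do the work here."""
    return workers.submit(handle_request, payload)

future = on_message("ping")
result, worker_name = future.result()
print(result)  # PING
workers.shutdown()
```

With this shape, a slow handler delays only its own pool thread, and the receiving thread stays free to accept further messages.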




[jira] [Updated] (DRILL-3815) unknown suffixes .not_json and .json_not treated differently (multi-file case)

2015-09-23 Thread Jacques Nadeau (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-3815:
--
Assignee: (was: Jacques Nadeau)

> unknown suffixes .not_json and .json_not treated differently (multi-file case)
> --
>
> Key: DRILL-3815
> URL: https://issues.apache.org/jira/browse/DRILL-3815
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Reporter: Daniel Barclay (Drill)
>
> In scanning a directory subtree used as a table, unknown filename extensions 
> seem to be treated differently depending on whether they're similar to known 
> file extensions.  The behavior suggests that Drill checks whether a file name 
> _contains_ an extension's string rather than _ending_ with it. 
> For example, given these subtrees with almost identical leaf file names:
> {noformat}
> $ find /tmp/testext_xx_json/
> /tmp/testext_xx_json/
> /tmp/testext_xx_json/voter2.not_json
> /tmp/testext_xx_json/voter1.json
> $ find /tmp/testext_json_xx/
> /tmp/testext_json_xx/
> /tmp/testext_json_xx/voter1.json
> /tmp/testext_json_xx/voter2.json_not
> $ 
> {noformat}
> the results of trying to use them as tables differ:
> {noformat}
> 0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_xx_json`;
> Sep 21, 2015 11:41:50 AM 
> org.apache.calcite.sql.validate.SqlValidatorException 
> ...
> Error: VALIDATION ERROR: From line 1, column 17 to line 1, column 25: Table 
> 'dfs.tmp.testext_xx_json' not found
> [Error Id: 6fe41deb-0e39-43f6-beca-de27b39d276b on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_json_xx`;
> +---+
> | onecf |
> +---+
> | {"name":"someName1"}  |
> | {"name":"someName2"}  |
> +---+
> 2 rows selected (0.149 seconds)
> {noformat}
> (Other probing seems to indicate that there is also some sensitivity to 
> whether the extension contains an underscore character.)




[jira] [Updated] (DRILL-3479) Sqlline from drill v1.1.0 displays version as 1.0.0

2015-09-23 Thread Parth Chandra (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-3479:
-
Assignee: Parth Chandra  (was: Jason Altekruse)

> Sqlline from drill v1.1.0 displays version as 1.0.0
> ---
>
> Key: DRILL-3479
> URL: https://issues.apache.org/jira/browse/DRILL-3479
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Krystal
>Assignee: Parth Chandra
>Priority: Minor
> Fix For: 1.2.0
>
>
> Sqlline from drill 1.1.0 displays drill version as 1.0.0
> /opt/drill/bin/sqlline
> apache drill 1.0.0 
> "start your sql engine"




[jira] [Commented] (DRILL-3822) PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905270#comment-14905270
 ] 

ASF GitHub Bot commented on DRILL-3822:
---

Github user dsbos commented on the pull request:

https://github.com/apache/drill/pull/166#issuecomment-142730429
  
@hakim, can you review and merge this (probably with your fix for 
DRILL-3874)?


> PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL
> --
>
> Key: DRILL-3822
> URL: https://issues.apache.org/jira/browse/DRILL-3822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
>Reporter: Krystal
>Assignee: Daniel Barclay (Drill)
> Attachments: squirrel.log
>
>
> git.commit.id.abbrev=3c89b30
> I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against the SQuirreL SQL 
> application.  I got the following error when trying to connect to the drill 
> data source:
> {noformat}
> ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
> Unexpected Error occurred attempting to open an SQL connection.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
> found for key 'drill.exec'
> Full error message is in the attached file. 
> {noformat}
> We turned on logging and found that the jdbc-all Jar file's 
> {{drill-module.conf}} file was not being found (explaining why the 
> configuration key {{drill.exec}} wasn't found).
> After further investigation, it seems that {{PathScanner}} directly uses the 
> system class loader, bypassing the context class loader.
> (After drill-jdbc-all-1.2.0-SNAPSHOT.jar was changed from being listed in 
> SQuirreL's "additional class paths" (presumably being loaded by a special 
> class loader) to being copied into SQuirreL's Jar file directory (and 
> therefore loaded by the system class loader), SQuirreL worked. (Apparently, 
> {{PathScanner}} was then able to find  {{drill-module.conf}} in the JDBC-all 
> Jar file and load it, so the later reference to {{drill.exec}} no longer 
> failed.) 
> Also, SQuirreL works correctly with drill-1.1's JDBC-all Jar file, and there 
> were some recent changes to {{PathScanner}} related to class loaders.)
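
The symptom described above (a resource that is visible when the Jar file is on the application's main class path but not when it is loaded through a separate loader) can be reproduced in isolation. The following is a minimal sketch, not Drill's PathScanner code: the class name `ClassLoaderLookupSketch`, its methods, and the resource name `sketch-only-module.conf` are hypothetical, and a `URLClassLoader` over a temporary directory stands in for SQuirreL's "additional class paths" loader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only; this is not Drill's PathScanner. */
final class ClassLoaderLookupSketch {

  /** Returns true if the given loader (or one of its parents) can locate the resource. */
  static boolean canSee(ClassLoader loader, String resourceName) {
    return loader.getResource(resourceName) != null;
  }

  /**
   * Writes one file into a fresh temp directory, exposes that directory through a
   * child URLClassLoader, and reports {visibleToSpecialLoader, visibleToSystemLoader}.
   * The temp directory is left behind; a real test would clean it up.
   */
  static boolean[] demo(String resourceName) {
    try {
      Path dir = Files.createTempDirectory("loader-sketch");
      Files.write(dir.resolve(resourceName), "drill.exec { }".getBytes(StandardCharsets.UTF_8));
      ClassLoader system = ClassLoader.getSystemClassLoader();
      // The "special" loader delegates to the system loader first, then searches its own URL.
      try (URLClassLoader special =
          new URLClassLoader(new URL[] {dir.toUri().toURL()}, system)) {
        return new boolean[] {canSee(special, resourceName), canSee(system, resourceName)};
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] seen = demo("sketch-only-module.conf");
    System.out.println("special loader sees it: " + seen[0]);
    System.out.println("system loader sees it:  " + seen[1]);
  }
}
```

Code that always asks one fixed loader for a resource will miss anything that only a child loader can reach, which matches the behavior reported here; which loader PathScanner should use is a decision for the patch itself.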



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905326#comment-14905326
 ] 

ASF GitHub Bot commented on DRILL-3778:
---

Github user jaltekruse commented on a diff in the pull request:

https://github.com/apache/drill/pull/158#discussion_r40260917
  
--- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/package-info.java 
---
@@ -19,7 +19,13 @@
 /**
  * JDBC driver for Drill.
  * 
- *   Drill's JDBC driver class is {@link org.apache.drill.jdbc.Driver}.
+ *   Drill's JDBC driver class is
+ *   {@link org.apache.drill.jdbc.Driver org.apache.drill.jdbc.Driver}.
--- End diff --

Sounds good, just making sure it wasn't a mistake. I'll be merging shortly.


> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Aditya Kishore
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).




[jira] [Assigned] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) reassigned DRILL-3778:
-

Assignee: Daniel Barclay (Drill)  (was: Aditya Kishore)

> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).




[jira] [Updated] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3778:
--
Assignee: Jason Altekruse  (was: Daniel Barclay (Drill))

> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Jason Altekruse
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).




[jira] [Created] (DRILL-3829) Metadata Caching : Drill should ignore a corrupted cache file

2015-09-23 Thread Rahul Challapalli (JIRA)

Rahul Challapalli created DRILL-3829:


 Summary: Metadata Caching : Drill should ignore a corrupted cache file
 Key: DRILL-3829
 URL: https://issues.apache.org/jira/browse/DRILL-3829
 Project: Apache Drill
 Issue Type: Bug
 Components: Metadata
 Reporter: Rahul Challapalli
 Assignee: Mehant Baid


git.commit.id.abbrev=3c89b30

Drill should validate the cache file structure and ignore the file if it detects any 
corruption of its contents.

I placed an empty cache file in the directory and executed a count(*) query on 
top of the directory. Below is what I got:
{code}
select count(*) from dfs.`/drill/testdata/metadata_caching/lineitem`;
Error: SYSTEM ERROR: JsonMappingException: No content to map due to end-of-input
 at [Source: com.mapr.fs.MapRFsDataInputStream@293240cd; line: 1, column: 1]


[Error Id: 88f77d37-aff3-4adc-bb0e-6c13b49e7776 on qa-node190.qa.lab:31010] 
(state=,code=0)
{code}

At the very least, we should inform the user that the cache file is corrupted.
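
The requested behavior (validate the cache file and ignore it when it is unusable) could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class `CacheFileGuardSketch` and its methods are hypothetical, and the cheap "starts with `{` and ends with `}`" test merely stands in for a real JSON parse whose mapping exception would be caught and treated as "no cache".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical guard around a metadata cache file; not Drill's actual reader. */
final class CacheFileGuardSketch {

  /**
   * Returns the cache text if the file exists, is non-empty, and passes a cheap
   * structural check. Returns empty otherwise, so the caller can fall back to
   * reading the underlying files instead of failing the query.
   */
  static Optional<String> readIfUsable(Path cacheFile) {
    try {
      if (!Files.isRegularFile(cacheFile) || Files.size(cacheFile) == 0) {
        return Optional.empty(); // missing or empty file: treat as "no cache"
      }
      String text = new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8).trim();
      return looksLikeJsonObject(text) ? Optional.of(text) : Optional.empty();
    } catch (IOException e) {
      return Optional.empty(); // unreadable file: also treat as "no cache"
    }
  }

  /** Stand-in for a full parse: only checks that the text is brace-delimited. */
  static boolean looksLikeJsonObject(String text) {
    return text.length() >= 2
        && text.charAt(0) == '{'
        && text.charAt(text.length() - 1) == '}';
  }
}
```

A caller would log a warning naming the rejected cache file before falling back, which would also cover the "at the very least" request above.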





[jira] [Commented] (DRILL-3829) Metadata Caching : Drill should ignore a corrupted cache file

2015-09-23 Thread Chun Chang (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905056#comment-14905056
 ] 

Chun Chang commented on DRILL-3829:
---

This should be a dup of DRILL-3827.

> Metadata Caching : Drill should ignore a corrupted cache file
> -
>
> Key: DRILL-3829
> URL: https://issues.apache.org/jira/browse/DRILL-3829
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Reporter: Rahul Challapalli
>Assignee: Mehant Baid
>
> git.commit.id.abbrev=3c89b30
> Drill should validate the cache file structure and ignore it if it detects 
> any corruption to its contents.
> I placed an empty cache file in the directory and executed a count(*) query 
> on top of the directory. Below is what I got
> {code}
> select count(*) from dfs.`/drill/testdata/metadata_caching/lineitem`;
> Error: SYSTEM ERROR: JsonMappingException: No content to map due to 
> end-of-input
>  at [Source: com.mapr.fs.MapRFsDataInputStream@293240cd; line: 1, column: 1]
> [Error Id: 88f77d37-aff3-4adc-bb0e-6c13b49e7776 on qa-node190.qa.lab:31010] 
> (state=,code=0)
> {code}
> At the very least, we should inform the user that the cache file is corrupted.




[jira] [Updated] (DRILL-2908) Support reading the Parquet int 96 type

2015-09-23 Thread Parth Chandra (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-2908:
-
Assignee: Parth Chandra  (was: Jason Altekruse)

> Support reading the Parquet int 96 type
> ---
>
> Key: DRILL-2908
> URL: https://issues.apache.org/jira/browse/DRILL-2908
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: Jason Altekruse
>Assignee: Parth Chandra
> Fix For: 1.2.0
>
>
> While Drill does not currently have an int96 type, it is supported by the 
> parquet format and we should be able to read files that contain columns of 
> this type. For now we will read the data into a varbinary and users will have 
> to use existing convert_from functions or write their own to interpret the 
> type of data actually stored. One example is the Impala timestamp format 
> which is encoded in an int96 column.
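
As an illustration of the "write their own" interpretation step, here is a sketch of decoding one 12-byte int96 value into a timestamp. It assumes the layout commonly described for Impala-written timestamps (8 little-endian bytes of nanoseconds within the day, followed by 4 little-endian bytes holding the Julian day number) and treats the result as UTC. The class `Int96TimestampSketch` and its methods are hypothetical and are not Drill functions.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

/** Hypothetical decoder for a 12-byte int96 timestamp; layout is an assumption (see lead-in). */
final class Int96TimestampSketch {
  private static final long JULIAN_DAY_OF_EPOCH = 2_440_588L; // Julian day number of 1970-01-01
  private static final long SECONDS_PER_DAY = 86_400L;
  private static final long NANOS_PER_SECOND = 1_000_000_000L;

  /** Decodes nanos-of-day (first 8 bytes) and Julian day (last 4 bytes), both little-endian. */
  static Instant decode(byte[] int96) {
    if (int96.length != 12) {
      throw new IllegalArgumentException("expected 12 bytes, got " + int96.length);
    }
    ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
    long nanosOfDay = buf.getLong();
    long julianDay = buf.getInt();
    long epochSeconds =
        (julianDay - JULIAN_DAY_OF_EPOCH) * SECONDS_PER_DAY + nanosOfDay / NANOS_PER_SECOND;
    return Instant.ofEpochSecond(epochSeconds, nanosOfDay % NANOS_PER_SECOND);
  }

  /** Builds a 12-byte value in the same assumed layout; useful for round-trip checks. */
  static byte[] encode(long julianDay, long nanosOfDay) {
    return ByteBuffer.allocate(12)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putLong(nanosOfDay)
        .putInt((int) julianDay)
        .array();
  }
}
```

Under these assumptions, a value encoding Julian day 2440588 with zero nanoseconds decodes to the Unix epoch.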




[jira] [Commented] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905297#comment-14905297
 ] 

ASF GitHub Bot commented on DRILL-3778:
---

Github user jaltekruse commented on the pull request:

https://github.com/apache/drill/pull/158#issuecomment-142735217
  
+1


> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Aditya Kishore
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).




[jira] [Created] (DRILL-3830) Query with aggregate window functions returns possibly wrong results on large scale data

2015-09-23 Thread Abhishek Girish (JIRA)

Abhishek Girish created DRILL-3830:
--

 Summary: Query with aggregate window functions returns possibly wrong results on large scale data
 Key: DRILL-3830
 URL: https://issues.apache.org/jira/browse/DRILL-3830
 Project: Apache Drill
 Issue Type: Bug
 Components: Execution - Relational Operators
 Affects Versions: 1.2.0
 Environment: 10 Performance Nodes
   DRILL_MAX_DIRECT_MEMORY=100g
   DRILL_INIT_HEAP="8g"
   DRILL_MAX_HEAP="8g"
   planner.memory.query_max_memory_per_node bumped up to 20 GB
   TPC-DS SF 1000 dataset (Parquet)
 Reporter: Abhishek Girish
 Assignee: Deneche A. Hakim


Results returned by the following two queries differ slightly from those 
returned by Greenplum DB. 

{code:sql}
SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) FROM 
store_sales ss LIMIT 1;

SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY 
ss.ss_store_sk) FROM store_sales ss LIMIT 2;

Drill:
9.653697131700665E9

Greenplum DB:
9.628946925860903E9

P.S. Both queries return same results
{code}

I was unable to reproduce this at a smaller scale (tried SF 1). I'll attach plans 
from both systems. 
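
To put a number on "slightly", the two values reported above differ by about 2.5E7 in absolute terms, which is roughly 0.26% of the Greenplum DB value. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and the sketch only quantifies the gap without saying anything about its cause.

```java
/** Hypothetical helper that quantifies the gap between the two reported sums. */
final class ResultDiffSketch {

  /** Relative difference of observed against reference, as a fraction of the reference. */
  static double relativeDifference(double observed, double reference) {
    return Math.abs(observed - reference) / Math.abs(reference);
  }

  public static void main(String[] args) {
    double drill = 9.653697131700665E9;     // value reported for Drill
    double greenplum = 9.628946925860903E9; // value reported for Greenplum DB
    System.out.printf("absolute difference: %.3f%n", Math.abs(drill - greenplum));
    System.out.printf("relative difference: %.4f%%%n",
        100 * relativeDifference(drill, greenplum));
  }
}
```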




[jira] [Commented] (DRILL-3829) Metadata Caching : Drill should ignore a corrupted cache file

2015-09-23 Thread Rahul Challapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905394#comment-14905394
 ] 

Rahul Challapalli commented on DRILL-3829:
--

This is definitely a duplicate.

> Metadata Caching : Drill should ignore a corrupted cache file
> -
>
> Key: DRILL-3829
> URL: https://issues.apache.org/jira/browse/DRILL-3829
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Reporter: Rahul Challapalli
>Assignee: Mehant Baid
>
> git.commit.id.abbrev=3c89b30
> Drill should validate the cache file structure and ignore it if it detects 
> any corruption to its contents.
> I placed an empty cache file in the directory and executed a count(*) query 
> on top of the directory. Below is what I got
> {code}
> select count(*) from dfs.`/drill/testdata/metadata_caching/lineitem`;
> Error: SYSTEM ERROR: JsonMappingException: No content to map due to 
> end-of-input
>  at [Source: com.mapr.fs.MapRFsDataInputStream@293240cd; line: 1, column: 1]
> [Error Id: 88f77d37-aff3-4adc-bb0e-6c13b49e7776 on qa-node190.qa.lab:31010] 
> (state=,code=0)
> {code}
> At the very least, we should inform the user that the cache file is corrupted.




[jira] [Updated] (DRILL-3691) CTAS Memory Leak : IllegalStateException

2015-09-23 Thread Rahul Challapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli updated DRILL-3691:
-
Fix Version/s: 1.2.0

> CTAS Memory Leak : IllegalStateException
> 
>
> Key: DRILL-3691
> URL: https://issues.apache.org/jira/browse/DRILL-3691
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Rahul Challapalli
> Fix For: 1.2.0
>
> Attachments: error.log
>
>
> git.commit.id.abbrev=55dfd0e
> The CTAS statement below fails with a memory leak. The query runs on top of 
> TPC-H SF100 data.
> {code}
> create table lineitem as select * from dfs.`/drill/testdata/tpch100/lineitem`;
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Failure while 
> closing accountor.  Expected private and shared pools to be set to initial 
> values.  However, one or more were not.  Stats are
> zone      init   allocated    delta
> private   100    100          0
> shared    00     9998410176   589824.
> [Error Id: ba8fedf2-be40-4488-af2e-b6034527c943 on qa-node191.qa.lab:31010]
> Aborting command set because "force" is false and command failed: "create 
> table lineitem as select * from dfs.`/drill/testdata/tpch100/lineitem`;"
> {code}
> I attached the log file. I am not uploading the data as it is too large.




[jira] [Closed] (DRILL-3691) CTAS Memory Leak : IllegalStateException

2015-09-23 Thread Rahul Challapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli closed DRILL-3691.

Resolution: Fixed

> CTAS Memory Leak : IllegalStateException
> 
>
> Key: DRILL-3691
> URL: https://issues.apache.org/jira/browse/DRILL-3691
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Rahul Challapalli
> Fix For: 1.2.0
>
> Attachments: error.log
>
>
> git.commit.id.abbrev=55dfd0e
> The CTAS statement below fails with a memory leak. The query runs on top of 
> TPC-H SF100 data.
> {code}
> create table lineitem as select * from dfs.`/drill/testdata/tpch100/lineitem`;
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Failure while 
> closing accountor.  Expected private and shared pools to be set to initial 
> values.  However, one or more were not.  Stats are
> zone      init   allocated    delta
> private   100    100          0
> shared    00     9998410176   589824.
> [Error Id: ba8fedf2-be40-4488-af2e-b6034527c943 on qa-node191.qa.lab:31010]
> Aborting command set because "force" is false and command failed: "create 
> table lineitem as select * from dfs.`/drill/testdata/tpch100/lineitem`;"
> {code}
> I attached the log file. I am not uploading the data as it is too large.




[jira] [Commented] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905295#comment-14905295
 ] 

ASF GitHub Bot commented on DRILL-3778:
---

Github user jaltekruse commented on a diff in the pull request:

https://github.com/apache/drill/pull/158#discussion_r40260046
  
--- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/package-info.java 
---
@@ -19,7 +19,13 @@
 /**
  * JDBC driver for Drill.
  * 
- *   Drill's JDBC driver class is {@link org.apache.drill.jdbc.Driver}.
+ *   Drill's JDBC driver class is
+ *   {@link org.apache.drill.jdbc.Driver org.apache.drill.jdbc.Driver}.
--- End diff --

Is this link tag written like this because you want the fully qualified 
name to be shown as the link text?


> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Aditya Kishore
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).




[jira] [Commented] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905305#comment-14905305
 ] 

ASF GitHub Bot commented on DRILL-3778:
---

Github user dsbos commented on a diff in the pull request:

https://github.com/apache/drill/pull/158#discussion_r40260511
  
--- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/package-info.java 
---
@@ -19,7 +19,13 @@
 /**
  * JDBC driver for Drill.
  * 
- *   Drill's JDBC driver class is {@link org.apache.drill.jdbc.Driver}.
+ *   Drill's JDBC driver class is
+ *   {@link org.apache.drill.jdbc.Driver org.apache.drill.jdbc.Driver}.
--- End diff --

Yes.


> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Aditya Kishore
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).




[jira] [Updated] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3778:
--
Assignee: Jason Altekruse  (was: Aditya Kishore)

> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Jason Altekruse
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).




[jira] [Updated] (DRILL-3830) Query with aggregate window functions returns possibly wrong results on large scale data

2015-09-23 Thread Abhishek Girish (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-3830:
---
Attachment: gpdb_sf1000_plan.txt
            gpdb_sf1_plan.txt
            drill_sf1_plan.txt

> Query with aggregate window functions returns possibly wrong results on large 
> scale data
> 
>
> Key: DRILL-3830
> URL: https://issues.apache.org/jira/browse/DRILL-3830
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
> Attachments: drill_sf1_plan.txt, gpdb_sf1000_plan.txt, 
> gpdb_sf1_plan.txt
>
>
> Results returned by the following two queries differ slightly from those 
> returned by Greenplum DB. 
> {code:sql}
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) FROM 
> store_sales ss LIMIT 1;
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY 
> ss.ss_store_sk) FROM store_sales ss LIMIT 2;
> Drill:
> 9.653697131700665E9
> Greenplum DB:
> 9.628946925860903E9
> P.S. Both queries return same results
> {code}
> I was unable to reproduce this at a smaller scale (tried SF 1). I'll attach 
> plans from both systems. 




[jira] [Commented] (DRILL-3822) PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905229#comment-14905229
 ] 

ASF GitHub Bot commented on DRILL-3822:
---

GitHub user dsbos opened a pull request:

https://github.com/apache/drill/pull/166

DRILL-3822:  Have PathScanner use own, not thread-context, class loader.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dsbos/incubator-drill bugs/drill-3822

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/166.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #166


commit 605a65780a38da2e1e8e35c521876e68e5901cab
Author: dbarclay 
Date:   2015-09-23T01:04:45Z

DRILL-3822:  Have PathScanner use own, not thread-context, class loader.




> PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL
> --
>
> Key: DRILL-3822
> URL: https://issues.apache.org/jira/browse/DRILL-3822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
>Reporter: Krystal
>Assignee: Daniel Barclay (Drill)
> Attachments: squirrel.log
>
>
> git.commit.id.abbrev=3c89b30
> I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against the SQuirreL SQL 
> application.  I got the following error when trying to connect to the drill 
> data source:
> {noformat}
> ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
> Unexpected Error occurred attempting to open an SQL connection.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
> found for key 'drill.exec'
> Full error message is in the attached file. 
> {noformat}
> We turned on logging and found that the jdbc-all Jar file's 
> {{drill-module.conf}} file was not being found (explaining why the 
> configuration key {{drill.exec}} wasn't found).
> After further investigation, it seems that {{PathScanner}} directly uses the 
> system class loader, bypassing the context class loader.
> (After drill-jdbc-all-1.2.0-SNAPSHOT.jar was changed from being listed in 
> SQuirreL's "additional class paths" (presumably being loaded by a special 
> class loader) to being copied into SQuirreL's Jar file directory (and 
> therefore loaded by the system class loader), SQuirreL worked. (Apparently, 
> {{PathScanner}} was then able to find  {{drill-module.conf}} in the JDBC-all 
> Jar file and load it, so the later reference to {{drill.exec}} no longer 
> failed.) 
> Also, SQuirreL works correctly with drill-1.1's JDBC-all Jar file, and there 
> were some recent changes to {{PathScanner}} related to class loaders.)




[jira] [Commented] (DRILL-3822) PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905228#comment-14905228
 ] 

ASF GitHub Bot commented on DRILL-3822:
---

Github user dsbos closed the pull request at:

https://github.com/apache/drill/pull/165


> PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL
> --
>
> Key: DRILL-3822
> URL: https://issues.apache.org/jira/browse/DRILL-3822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
>Reporter: Krystal
>Assignee: Daniel Barclay (Drill)
> Attachments: squirrel.log
>
>
> git.commit.id.abbrev=3c89b30
> I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against the SQuirreL SQL 
> application.  I got the following error when trying to connect to the drill 
> data source:
> {noformat}
> ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
> Unexpected Error occurred attempting to open an SQL connection.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
> found for key 'drill.exec'
> Full error message is in the attached file. 
> {noformat}
> We turned on logging and found that the jdbc-all Jar file's 
> {{drill-module.conf}} file was not being found (explaining why the 
> configuration key {{drill.exec}} wasn't found).
> After further investigation, it seems that {{PathScanner}} directly uses the 
> system class loader, bypassing the context class loader.
> (After drill-jdbc-all-1.2.0-SNAPSHOT.jar was changed from being listed in 
> SQuirreL's "additional class paths" (presumably being loaded by a special 
> class loader) to being copied into SQuirreL's Jar file directory (and 
> therefore loaded by the system class loader), SQuirreL worked. (Apparently, 
> {{PathScanner}} was then able to find  {{drill-module.conf}} in the JDBC-all 
> Jar file and load it, so the later reference to {{drill.exec}} no longer 
> failed.) 
> Also, SQuirreL works correctly with drill-1.1's JDBC-all Jar file, and there 
> were some recent changes to {{PathScanner}} related to class loaders.)




[jira] [Commented] (DRILL-3479) Sqlline from drill v1.1.0 displays version as 1.0.0

2015-09-23 Thread Parth Chandra (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905258#comment-14905258
 ] 

Parth Chandra commented on DRILL-3479:
--

[~adityakishore] Can you comment on the sqlline change to read the drill 
version from the manifest file?

https://github.com/parthchandra/sqlline/commit/34be310aaa87d22b60ee60620f03df9107c91eb6

I haven't made the more comprehensive change to read from a properties file 
that would also allow us to move off the fork.  We should do that as part of 
another JIRA.
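
For readers unfamiliar with the approach, reading a version string from a Jar manifest can be sketched with the standard `java.util.jar` API as below. This is not the code in the linked sqlline commit: the class `ManifestVersionSketch`, its methods, the use of the `Implementation-Version` attribute, and the fallback value are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical sketch of reading a version from manifest content. */
final class ManifestVersionSketch {

  /** Returns the Implementation-Version main attribute, or the fallback if absent or unreadable. */
  static String versionFrom(InputStream manifestStream, String fallback) {
    try {
      Manifest manifest = new Manifest(manifestStream);
      String version =
          manifest.getMainAttributes().getValue(Attributes.Name.IMPLEMENTATION_VERSION);
      return version != null ? version : fallback;
    } catch (IOException e) {
      return fallback;
    }
  }

  /** Convenience overload for manifest text held in memory (used here for demonstration). */
  static String versionFromText(String manifestText, String fallback) {
    return versionFrom(
        new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)), fallback);
  }
}
```

In a packaged application, the stream would typically come from the `META-INF/MANIFEST.MF` entry of the relevant Jar file; how sqlline locates that entry is outside this sketch.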

> Sqlline from drill v1.1.0 displays version as 1.0.0
> ---
>
> Key: DRILL-3479
> URL: https://issues.apache.org/jira/browse/DRILL-3479
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Krystal
>Assignee: Parth Chandra
>Priority: Minor
> Fix For: 1.2.0
>
>
> Sqlline from drill 1.1.0 displays drill version as 1.0.0
> /opt/drill/bin/sqlline
> apache drill 1.0.0 
> "start your sql engine"




[jira] [Commented] (DRILL-3822) PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905191#comment-14905191
 ] 

ASF GitHub Bot commented on DRILL-3822:
---

Github user dsbos commented on the pull request:

https://github.com/apache/drill/pull/165#issuecomment-142719537
  
@jacques-n, can you review and merge this?

We (Krystal and I) have verified that this patch fixes the problem seen in 
SQuirreL.


> PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL
> --
>
> Key: DRILL-3822
> URL: https://issues.apache.org/jira/browse/DRILL-3822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
>Reporter: Krystal
>Assignee: Daniel Barclay (Drill)
> Attachments: squirrel.log
>
>
> git.commit.id.abbrev=3c89b30
> I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against the SQuirreL SQL 
> application.  I got the following error when trying to connect to the drill 
> data source:
> {noformat}
> ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
> Unexpected Error occurred attempting to open an SQL connection.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
> found for key 'drill.exec'
> Full error message is in the attached file. 
> {noformat}
> We turned on logging and found that the jdbc-all Jar file's 
> {{drill-module.conf}} file was not being found (explaining why the 
> configuration key {{drill.exec}} wasn't found).
> After further investigation, it seems that {{PathScanner}} directly uses the 
> system class loader, bypassing the context class loader.
> (After drill-jdbc-all-1.2.0-SNAPSHOT.jar was changed from being listed in 
> SQuirreL's "additional class paths" (presumably being loaded by a special 
> class loader) to being copied into SQuirreL's Jar file directory (and 
> therefore loaded by the system class loader), SQuirreL worked. (Apparently, 
> {{PathScanner}} was then able to find  {{drill-module.conf}} in the JDBC-all 
> Jar file and load it, so the later reference to {{drill.exec}} no longer 
> failed.) 
> Also, SQuirreL works correctly with drill-1.1's JDBC-all Jar file, and there 
> were some recent changes to {{PathScanner}} related to class loaders.)




[jira] [Commented] (DRILL-3822) PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905261#comment-14905261
 ] 

ASF GitHub Bot commented on DRILL-3822:
---

Github user jacques-n commented on the pull request:

https://github.com/apache/drill/pull/166#issuecomment-142728658
  
+1. 

I won't be able to merge for a couple days so you might ping someone else 
on the merge.


> PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL
> --
>
> Key: DRILL-3822
> URL: https://issues.apache.org/jira/browse/DRILL-3822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
>Reporter: Krystal
>Assignee: Daniel Barclay (Drill)
> Attachments: squirrel.log
>
>
> git.commit.id.abbrev=3c89b30
> I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against the SQuirreL SQL 
> application.  I got the following error when trying to connect to the drill 
> data source:
> {noformat}
> ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
> Unexpected Error occurred attempting to open an SQL connection.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
> found for key 'drill.exec'
> Full error message is in the attached file. 
> {noformat}
> We turned on logging and found that the jdbc-all Jar file's 
> {{drill-module.conf}} file was not being found (explaining why the 
> configuration key {{drill.exec}} wasn't found).
> After further investigation, it seems that {{PathScanner}} directly uses the 
> system class loader, bypassing the context class loader.
> (After drill-jdbc-all-1.2.0-SNAPSHOT.jar was changed from being listed in 
> SQuirreL's "additional class paths" (presumably being loaded by a special 
> class loader) to being copied into SQuirreL's Jar file directory (and 
> therefore loaded by the system class loader), SQuirreL worked. (Apparently, 
> {{PathScanner}} was then able to find  {{drill-module.conf}} in the JDBC-all 
> Jar file and load it, so the later reference to {{drill.exec}} no longer 
> failed.) 
> Also, SQuirreL works correctly with drill-1.1's JDBC-all Jar file, and there 
> were some recent changes to {{PathScanner}} related to class loaders.)




[jira] [Updated] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3778:
--
Assignee: Aditya Kishore  (was: Daniel Barclay (Drill))

> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Aditya Kishore
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (DRILL-3822) PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905232#comment-14905232
 ] 

ASF GitHub Bot commented on DRILL-3822:
---

Github user dsbos commented on the pull request:

https://github.com/apache/drill/pull/166#issuecomment-142724811
  
@jacques-n, can you review and merge this?

We (Krystal and I) have verified that this patch fixes the problem seen in 
SQuirreL.



> PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL
> --
>
> Key: DRILL-3822
> URL: https://issues.apache.org/jira/browse/DRILL-3822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
>Reporter: Krystal
>Assignee: Daniel Barclay (Drill)
> Attachments: squirrel.log
>
>
> git.commit.id.abbrev=3c89b30
> I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against the SQuirreL SQL 
> application.  I got the following error when trying to connect to the drill 
> data source:
> {noformat}
> ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
> Unexpected Error occurred attempting to open an SQL connection.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
> found for key 'drill.exec'
> Full error message is in the attached file. 
> {noformat}
> We turned on logging and found that the jdbc-all Jar file's 
> {{drill-module.conf}} file was not being found (explaining why the 
> configuration key {{drill.exec}} wasn't found).
> After further investigation, it seems that {{PathScanner}} directly uses the 
> system class loader, bypassing the context class loader.
> (After drill-jdbc-all-1.2.0-SNAPSHOT.jar was changed from being listed in 
> SQuirreL's "additional class paths" (presumably being loaded by a special 
> class loader) to being copied into SQuirreL's Jar file directory (and 
> therefore loaded by the system class loader), SQuirreL worked. (Apparently, 
> {{PathScanner}} was then able to find  {{drill-module.conf}} in the JDBC-all 
> Jar file and load it, so the later reference to {{drill.exec}} no longer 
> failed.) 
> Also, SQuirreL works correctly with drill-1.1's JDBC-all Jar file, and there 
> were some recent changes to {{PathScanner}} related to class loaders.)




[jira] [Commented] (DRILL-3815) unknown suffixes .not_json and .json_not treated differently (multi-file case)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905243#comment-14905243
 ] 

Daniel Barclay (Drill) commented on DRILL-3815:
---

I haven't changed any storage plug-in configuration from the defaults.

> unknown suffixes .not_json and .json_not treated differently (multi-file case)
> --
>
> Key: DRILL-3815
> URL: https://issues.apache.org/jira/browse/DRILL-3815
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Reporter: Daniel Barclay (Drill)
>
> In scanning a directory subtree used as a table, unknown filename extensions 
> seem to be treated differently depending on whether they're similar to known 
> file extensions.  The behavior suggests that Drill checks whether a file name 
> _contains_ an extension's string rather than _ending_ with it. 
> For example, given these subtrees with almost identical leaf file names:
> {noformat}
> $ find /tmp/testext_xx_json/
> /tmp/testext_xx_json/
> /tmp/testext_xx_json/voter2.not_json
> /tmp/testext_xx_json/voter1.json
> $ find /tmp/testext_json_xx/
> /tmp/testext_json_xx/
> /tmp/testext_json_xx/voter1.json
> /tmp/testext_json_xx/voter2.json_not
> $ 
> {noformat}
> the results of trying to use them as tables differ:
> {noformat}
> 0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_xx_json`;
> Sep 21, 2015 11:41:50 AM 
> org.apache.calcite.sql.validate.SqlValidatorException 
> ...
> Error: VALIDATION ERROR: From line 1, column 17 to line 1, column 25: Table 
> 'dfs.tmp.testext_xx_json' not found
> [Error Id: 6fe41deb-0e39-43f6-beca-de27b39d276b on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_json_xx`;
> +-----------------------+
> | onecf                 |
> +-----------------------+
> | {"name":"someName1"}  |
> | {"name":"someName2"}  |
> +-----------------------+
> 2 rows selected (0.149 seconds)
> {noformat}
> (Other probing seems to indicate that there is also some sensitivity to 
> whether the extension contains an underscore character.)
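
The contrast described in the report can be sketched as follows. This is only an illustration, not Drill's format-matching code: the function names and the single known extension `json` are assumptions. The sketch shows one reading that is consistent with the output above, namely that a test for the substring `.json` accepts `voter2.json_not` but not `voter2.not_json`, while a suffix test rejects both.

```python
# Hypothetical sketch; names and the extension list are assumptions, not Drill code.
KNOWN_EXTENSIONS = ("json",)


def matches_by_substring(file_name):
    """Suspected behavior: '.<ext>' may appear anywhere in the file name."""
    return any("." + ext in file_name for ext in KNOWN_EXTENSIONS)


def matches_by_suffix(file_name):
    """Expected behavior: the file name must end with '.<ext>'."""
    return any(file_name.endswith("." + ext) for ext in KNOWN_EXTENSIONS)


if __name__ == "__main__":
    for name in ("voter1.json", "voter2.not_json", "voter2.json_not"):
        print(name, matches_by_substring(name), matches_by_suffix(name))
```

Under the suffix test, both unknown extensions are treated the same way, which is the behavior the report expects.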




[jira] [Commented] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905264#comment-14905264
 ] 

ASF GitHub Bot commented on DRILL-3778:
---

Github user dsbos commented on the pull request:

https://github.com/apache/drill/pull/158#issuecomment-142729309
  
@jaltekruse, can you review and merge this commit?


> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Aditya Kishore
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).




[jira] [Created] (DRILL-3832) Metadata Caching : There should be a way for the user to know that the cache has been leveraged

2015-09-23 Thread Rahul Challapalli (JIRA)

Rahul Challapalli created DRILL-3832:


 Summary: Metadata Caching : There should be a way for the user to 
know that the cache has been leveraged
 Key: DRILL-3832
 URL: https://issues.apache.org/jira/browse/DRILL-3832
 Project: Apache Drill
  Issue Type: Improvement
  Components: Metadata
Reporter: Rahul Challapalli
Assignee: Aman Sinha


git.commit.id.abbrev=3c89b30

Currently the user has no way of knowing that the metadata cache file has been 
leveraged, apart from comparing query times, which could be influenced by other 
factors.

It would be helpful while debugging to know whether or not the cache has been 
leveraged. This information can be added to one or a combination of the 
following places:
1. Profiles
2. Explain Plan Output
3. Log files




[jira] [Commented] (DRILL-3479) Sqlline from drill v1.1.0 displays version as 1.0.0

2015-09-23 Thread Parth Chandra (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905539#comment-14905539
 ] 

Parth Chandra commented on DRILL-3479:
--

Updated the sqlline change so that only the Drill manifest file is read:
https://github.com/parthchandra/sqlline/commit/0778985f11a72575e0d4e7e55307f450a35f8c62

I will post the corresponding Drill change as a review request once the sqlline 
change is pushed to the repo.

> Sqlline from drill v1.1.0 displays version as 1.0.0
> ---
>
> Key: DRILL-3479
> URL: https://issues.apache.org/jira/browse/DRILL-3479
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Krystal
>Assignee: Parth Chandra
>Priority: Minor
> Fix For: 1.2.0
>
>
> Sqlline from drill 1.1.0 displays drill version as 1.0.0
> /opt/drill/bin/sqlline
> apache drill 1.0.0 
> "start your sql engine"




[jira] [Updated] (DRILL-3818) Error when DISTINCT and GROUP BY is used in avro or json

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3818:
--
Component/s: (was: Client - JDBC)
 SQL Parser

> Error when DISTINCT and GROUP BY is used in avro or json
> 
>
> Key: DRILL-3818
> URL: https://issues.apache.org/jira/browse/DRILL-3818
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC, SQL Parser
>Affects Versions: 1.1.0, 1.2.0
> Environment: Linux Mint 17.1
> java version "1.7.0_80"
> Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
>Reporter: Philip Deegan
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> Data
> {noformat}
> { "a": { "b": { "c": "d" }, "e": 2 }}
> {noformat}
> Query
> {noformat}
> select DISTINCT(t.a.b.c), MAX(t.a.e)  FROM dfs.`json.json` t GROUP BY t.a.b.c 
> LIMIT 1;
> {noformat}
> Occurs on 1.1.0 and incubator-drill master
> {noformat}
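> +-------------------------------------------+
> | commit_id                                 |
> +-------------------------------------------+
> | 9f54aac33df3e783c0192ab56c7e1313dbc823fa  |
> +-------------------------------------------+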
> [Error Id: bb826851-d8cb-46f5-96c0-1ed01d3d8c45 on philix:31010]
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:290)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1359)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:74)
>   at 
> net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:338)
>   at 
> net.hydromatic.avatica.AvaticaStatement.execute(AvaticaStatement.java:69)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:86)
>   at sqlline.Commands.execute(Commands.java:841)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:737)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: VALIDATION 
> ERROR: java.lang.NullPointerException
> [Error Id: bb826851-d8cb-46f5-96c0-1ed01d3d8c45 on philix:31010]
>   at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
>   at 
> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:110)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
>   at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
>

[jira] [Updated] (DRILL-3577) Counting nested fields on CTAS-created-parquet file/s reports inaccurate results

2015-09-23 Thread Mehant Baid (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3577:
---
Fix Version/s: (was: 1.2.0)
   1.3.0

> Counting nested fields on CTAS-created-parquet file/s reports inaccurate 
> results
> 
>
> Key: DRILL-3577
> URL: https://issues.apache.org/jira/browse/DRILL-3577
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.1.0
>Reporter: Hanifi Gunes
>Assignee: Mehant Baid
>Priority: Critical
> Fix For: 1.3.0
>
>
> I have not tried this at a smaller scale, nor on the JSON file directly, but 
> the following seems to reproduce the issue:
> 1. Create an input file as follows
> 20K rows with the following - 
> {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes"}}
> 200 rows with the following - 
> {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes","additional":"last entries only"}}
> 2. CTAS as follows
> {code:sql}
> CREATE TABLE dfs.`tmp`.`tp` as select * from dfs.`data.json` t
> {code}
> This should read
> {code}
> Fragment Number of records written
> 0_0   20200
> {code}
> 3. Count on nested fields via
> {code:sql}
> select count(t.others.additional) from dfs.`tmp`.`tp` t
> OR
> select count(t.others.other) from dfs.`tmp`.`tp` t
> {code}
> reports no rows as follows
> {code}
> EXPR$0
> 0
> {code}
> While
> {code:sql}
> select count(t.`some`) from dfs.`tmp`.`tp` t where t.others.additional is not 
> null
> {code}
> reports expected 200 rows
> {code}
> EXPR$0
> 200
> {code}
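
For reference, the counts one would expect from the steps above can be worked out with a small sketch. It is plain Python over in-memory records, not Drill, and the helper names are hypothetical. It assumes "20K" means 20,000 rows and that COUNT over a column counts only non-null values.

```python
# Hypothetical helpers that model COUNT(<nested field>) over in-memory records.
def nested_get(record, path):
    """Return the value at a nested key path, or None if any key is missing."""
    current = record
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def count_non_null(records, path):
    """Count records whose nested field is present and not None."""
    return sum(1 for record in records if nested_get(record, path) is not None)


def build_rows():
    """Build the 20,000 + 200 records described in step 1."""
    base = {"some": "yes",
            "others": {"other": "true", "all": "false", "sometimes": "yes"}}
    extra = {"some": "yes",
             "others": {"other": "true", "all": "false", "sometimes": "yes",
                        "additional": "last entries only"}}
    return [base] * 20000 + [extra] * 200


if __name__ == "__main__":
    rows = build_rows()
    print(len(rows))
    print(count_non_null(rows, ["others", "additional"]))
    print(count_non_null(rows, ["others", "other"]))
```

Under these assumptions the two nested counts would be 200 and 20200, rather than the 0 reported above.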




[jira] [Commented] (DRILL-3830) Query with aggregate window functions returns possibly wrong results on large scale data

2015-09-23 Thread Deneche A. Hakim (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905465#comment-14905465
 ] 

Deneche A. Hakim commented on DRILL-3830:
-

Can you run the following query on both systems?
{noformat}
SELECT ss.ss_store_sk, SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY 
ss.ss_store_sk) FROM store_sales ss LIMIT 1;
{noformat}

Also, can you attach Drill's plan for SF 1000?

Thanks

> Query with aggregate window functions returns possibly wrong results on large 
> scale data
> 
>
> Key: DRILL-3830
> URL: https://issues.apache.org/jira/browse/DRILL-3830
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
> Attachments: drill_sf1_plan.txt, gpdb_sf1000_plan.txt, 
> gpdb_sf1_plan.txt
>
>
> Results returned by the following two queries differ slightly from those 
> returned by Greenplum DB. 
> {code:sql}
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) FROM 
> store_sales ss LIMIT 1;
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY 
> ss.ss_store_sk) FROM store_sales ss LIMIT 2;
> Drill:
> 9.653697131700665E9
> Greenplum DB:
> 9.628946925860903E9
> P.S. Both queries return the same results
> {code}
> I was unable to reproduce this on a smaller scale (tried SF 1). I'll attach 
> plans from both systems. 
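
One general reason two engines can disagree on a large floating-point SUM is that addition of doubles is not associative, so the order in which values are accumulated can change the result. The sketch below only illustrates that effect. It is not a diagnosis of this issue: the two values reported above differ by roughly 0.26%, and whether rounding alone could account for a gap of that size is not established here. The function name is hypothetical.

```python
import math


def naive_sum(values):
    """Accumulate left to right in double precision, as a simple SUM might."""
    total = 0.0
    for value in values:
        total += value
    return total


if __name__ == "__main__":
    # The same three numbers, added in two different orders.
    order_a = [1e16, 1.0, -1e16]
    order_b = [1e16, -1e16, 1.0]
    print(naive_sum(order_a))   # the 1.0 is lost when added to 1e16
    print(naive_sum(order_b))   # the large values cancel first
    print(math.fsum(order_a))   # exactly rounded sum, independent of order
```

Comparing both systems against an order-independent sum, or against a DECIMAL-typed sum if the column allows it, could help tell a rounding effect apart from rows actually being dropped or double-counted.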




[jira] [Created] (DRILL-3831) Allow null values in lists

2015-09-23 Thread Jason Altekruse (JIRA)

Jason Altekruse created DRILL-3831:
--

 Summary: Allow null values in lists
 Key: DRILL-3831
 URL: https://issues.apache.org/jira/browse/DRILL-3831
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Data Types
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.3.0


Drill currently fails to read a JSON file in which a list contains a null 
value. We have a workaround with all_text_mode for this case, but we need to 
enhance Drill to support this concept in the core ValueVector data structure 
used to represent records.

As part of this change, I am considering removing the concept of a list that 
requires all of its members to be non-null, which is effectively the only type 
of list we have today. The data that can be read today would simply be read 
into a list whose members could be nullable but all happen to be non-null. This 
would simplify the code by removing the need to cover the null and non-null 
cases explicitly.

Initially this could carry the risk of a minor performance hit, but overall our 
approach to complex data has not been heavily performance tested. Keeping the 
code simple for now will at least allow for more thorough testing of the 
smaller number of cases, and hopefully make it easier to reason about and 
improve as we evaluate the performance of Drill with complex data more 
thoroughly.
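
The idea of a single list type whose members may be null can be sketched as a values array paired with per-element validity flags. This is a toy model only: the class name and layout are assumptions and do not describe Drill's ValueVector implementation.

```python
import json


class NullableIntList:
    """Toy list of integers with a parallel validity flag per element.

    Hypothetical illustration; not Drill's ValueVector.
    """

    def __init__(self, items):
        # Null members get a placeholder value and a False validity flag.
        self.values = [0 if item is None else item for item in items]
        self.valid = [item is not None for item in items]

    def get(self, index):
        """Return the member at index, or None if it is marked as null."""
        return self.values[index] if self.valid[index] else None

    def to_list(self):
        return [self.get(i) for i in range(len(self.values))]


if __name__ == "__main__":
    with_null = NullableIntList(json.loads("[1, null, 3]"))
    all_present = NullableIntList(json.loads("[4, 5, 6]"))
    print(with_null.to_list(), with_null.valid)
    print(all_present.to_list(), all_present.valid)
```

In this model a list that can be read today is the same structure with every validity flag set, so one code path covers both cases, at the cost of carrying the flags.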




[jira] [Commented] (DRILL-3807) [Regression] Query with equality join and a FALSE condition fails to plan

2015-09-23 Thread Sean Hsuan-Yi Chu (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905470#comment-14905470
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3807:
--

Correct. The Calcite update improves constant folding, so the filter's 
condition is reduced to false.
Equivalently, the query is rewritten as 

{code}
select l.l_quantity from cp.`tpch/lineitem.parquet` l, cp.`tpch/part.parquet` p 
where false;
{code}
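
The folding step described above can be sketched with a tiny expression folder. This is an illustration only, not Calcite's implementation: the tuple encoding and the function name are assumptions. A constant comparison such as `1 = 0` folds to false, and a conjunction containing false folds to false as a whole, which is also valid under SQL's three-valued logic because `NULL AND FALSE` is `FALSE`.

```python
def fold(expr):
    """Fold constant sub-expressions of a tiny boolean expression tree.

    Hypothetical encoding: a bool literal, a string (an opaque predicate that
    cannot be folded), ("=", a, b) with literal operands, or ("and", e1, e2, ...).
    """
    if isinstance(expr, (bool, str)):
        return expr
    op, *args = expr
    if op == "=":
        left, right = args
        return left == right
    if op == "and":
        folded = [fold(arg) for arg in args]
        if any(item is False for item in folded):
            return False  # one FALSE conjunct makes the whole AND false
        rest = [item for item in folded if item is not True]
        if not rest:
            return True
        if len(rest) == 1:
            return rest[0]
        return ("and", *rest)
    raise ValueError("unsupported operator: " + str(op))


if __name__ == "__main__":
    condition = ("and", "l.l_partkey = p.p_partkey", ("=", 1, 0))
    print(fold(condition))
```

Once the whole condition folds to false, the equality predicate `l.l_partkey = p.p_partkey` is no longer available to the join, which matches the cartesian-join error reported in the issue.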

> [Regression] Query with equality join and a FALSE condition fails to plan
> -
>
> Key: DRILL-3807
> URL: https://issues.apache.org/jira/browse/DRILL-3807
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Aman Sinha
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
>
> 1.2.0-SNAPSHOT behavior: 
> {code}
> 0: jdbc:drill:zk=local> explain plan for select l.l_quantity from 
> cp.`tpch/lineitem.parquet` l, cp.`tpch/part.parquet` p where l.l_partkey = 
> p.p_partkey and (1 = 0);
> Error: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due 
> to either a cartesian join or an inequality join
> [Error Id: f7466d86-b709-465e-bb49-d3c51ecf941b on 172.16.0.160:31010] 
> (state=,code=0)
> {code}
> The simplification of  ' l.l_partkey = p.p_partkey and (1 = 0)' to a False 
> condition is valid and accordingly Drill fails to plan due to the cartesian 
> join introduced by the False condition.   However,  in 1.1.0 apparently the 
> 1=0 was converted to a LIMIT 0 which was pushed below the Join and the query 
> successfully planned and executed: 
> 1.1.0 behavior: 
> {code}
> 0: jdbc:drill:zk=local> explain plan for select l.l_quantity from 
> cp.`tpch/lineitem.parquet` l, cp.`tpch/part.parquet` p where l.l_partkey = 
> p.p_partkey and (1 = 0);
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(l_quantity=[$1])
> 00-02HashJoin(condition=[=($0, $2)], joinType=[inner])
> 00-04  SelectionVectorRemover
> 00-05Limit(offset=[0], fetch=[0])
> 00-06  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=classpath:/tpch/lineitem.parquet]], 
> selectionRoot=classpath:/tpch/lineitem.parquet, numFiles=1, 
> columns=[`l_partkey`, `l_quantity`]]])
> 00-03  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=classpath:/tpch/part.parquet]], 
> selectionRoot=classpath:/tpch/part.parquet, numFiles=1, 
> columns=[`p_partkey`]]])
> {code}
> [~cchang] and I looked at the commit history and it appears that the 
> regression started somewhere between Aug 24 and Aug 28, which is the time 
> when we rebased on Calcite 1.4.0.  So we need to narrow down further the 
> change that may have caused this. 




[jira] [Commented] (DRILL-3818) Error when DISTINCT and GROUP BY is used in avro or json

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905477#comment-14905477
 ] 

Daniel Barclay (Drill) commented on DRILL-3818:
---

Stack trace from server log ({{sqlline.log}} from {{drill-embedded}}):


{noformat}
org.apache.drill.common.exceptions.UserException: VALIDATION ERROR: 
java.lang.NullPointerException


[Error Id: d1ea15ce-e0dc-4ee8-afaf-ff9c970ffb1d ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
 ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:181)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) 
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) 
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_72]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_72]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
Caused by: org.apache.calcite.tools.ValidationException: 
java.lang.NullPointerException
at 
org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:179) 
~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.prepare.PlannerImpl.validateAndGetType(PlannerImpl.java:188) 
~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:447)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:190)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:159)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
... 5 common frames omitted
Caused by: java.lang.NullPointerException: null
at 
org.apache.calcite.sql.validate.AggregatingSelectScope.getGroupExprs(AggregatingSelectScope.java:142)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.AggregatingSelectScope.checkAggregateExpr(AggregatingSelectScope.java:221)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.AggregatingSelectScope.getOperandScope(AggregatingSelectScope.java:206)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:48)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:32)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:130) 
~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.expand(SqlValidatorImpl.java:4067)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorUtil.analyzeGroupExpr(SqlValidatorUtil.java:455)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorUtil.convertGroupSet(SqlValidatorUtil.java:426)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorUtil.analyzeGroupItem(SqlValidatorUtil.java:402)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.AggregatingSelectScope.<init>(AggregatingSelectScope.java:97)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.registerQuery(SqlValidatorImpl.java:2216)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.registerQuery(SqlValidatorImpl.java:2121)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:835)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:551)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:177) 
~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
... 10 common frames omitted
{noformat}



> Error when DISTINCT and GROUP BY is used in avro or json
>

[jira] [Updated] (DRILL-3818) Error when DISTINCT and GROUP BY is used in avro or json

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3818:
--
Assignee: Jacques Nadeau  (was: Daniel Barclay (Drill))

> Error when DISTINCT and GROUP BY is used in avro or json
> 
>
> Key: DRILL-3818
> URL: https://issues.apache.org/jira/browse/DRILL-3818
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC, SQL Parser
>Affects Versions: 1.1.0, 1.2.0
> Environment: Linux Mint 17.1
> java version "1.7.0_80"
> Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
>Reporter: Philip Deegan
>Assignee: Jacques Nadeau
> Fix For: 1.2.0
>
>
> Data
> {noformat}
> { "a": { "b": { "c": "d" }, "e": 2 }}
> {noformat}
> Query
> {noformat}
> select DISTINCT(t.a.b.c), MAX(t.a.e)  FROM dfs.`json.json` t GROUP BY t.a.b.c 
> LIMIT 1;
> {noformat}
> Occurs on 1.1.0 and incubator-drill master
> {noformat}
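> +-------------------------------------------+
> | commit_id                                 |
> +-------------------------------------------+
> | 9f54aac33df3e783c0192ab56c7e1313dbc823fa  |
> +-------------------------------------------+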
> [Error Id: bb826851-d8cb-46f5-96c0-1ed01d3d8c45 on philix:31010]
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:290)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1359)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:74)
>   at 
> net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:338)
>   at 
> net.hydromatic.avatica.AvaticaStatement.execute(AvaticaStatement.java:69)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:86)
>   at sqlline.Commands.execute(Commands.java:841)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:737)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: VALIDATION 
> ERROR: java.lang.NullPointerException
> [Error Id: bb826851-d8cb-46f5-96c0-1ed01d3d8c45 on philix:31010]
>   at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
>   at 
> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:110)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
>   at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>

[jira] [Commented] (DRILL-3826) Concurrent Query Submission leads to Channel Closed Exception

2015-09-23 Thread Yiyi Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905654#comment-14905654
 ] 

Yiyi Hu commented on DRILL-3826:


Thanks for pointing that out.

However, note that, first, the cancellation is done by Drill unexpectedly and, 
second, JDBC has the same problem.

> Concurrent Query Submission leads to Channel Closed Exception
> -
>
> Key: DRILL-3826
> URL: https://issues.apache.org/jira/browse/DRILL-3826
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Execution - RPC
>Affects Versions: 1.1.0
> Environment: - CentOS release 6.6 (Final)
> - hadoop-2.7.1
> - hbase-1.0.1.1
> - drill-1.1.0
> - jdk-1.8.0_45
>Reporter: Yiyi Hu
>Assignee: Daniel Barclay (Drill)
>  Labels: filesystem, hadoop, hbase, jdbc, rpc
> Attachments: Sample1.png, Sample2.png, jdbc-test-client-drillbit.log, 
> shell-sqlline.log, shell-test-drillbit.log
>
>
> Frequently seen CHANNEL CLOSED EXCEPTION while running concurrent queries with 
> relatively large LIMIT.
> Here are the details,
> SET UP:
> - Single drillbit running on a single zookeeper node
> - 4G heap size, 8G direct memory
> - Storage plugins: local filesystem, hdfs, hbase
> TEST DATA:
> - A 50,000,000 records json file test.json, with two fields id, 
> title  (approximately 3G).
> SHELL TEST:
> - Running 4 drill shells concurrently with query:
>   SELECT id, title from dfs.`test.json` LIMIT 500.
> - Queries got canceled. Channel closures between client and server were seen 
> randomly; an example is shown below:
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> ChannelClosedException: Channel closed /192.168.4.201:31010 <--> 
> /192.168.4.201:48829.
> Fragment 0:0
> [Error Id: 0bd2b500-155e-46e0-9f26-bd89fea47a25 on TEST-101:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> JDBC TEST:
> - 6 separate threads running the same query: SELECT id, title from 
> dfs.`test.json` LIMIT 1000, each maintains its own connection to drill 
> and resultSet, statement and connection are closed finally.
> - Used resultSet.next() to iterate on the result set, do nothing else.
> - Throws the same channel closed exception randomly. Log files are attached 
> for review.
> - Memory usage was monitored, all good.
> CROSS STORAGE PLUGINS:
> - The same issue can be found not only in JSON on a file system (local/hdfs), 
> but also when querying the same 50,000,000 records table in HBASE.
> - The issue is not found in a single thread application.



