[jira] [Commented] (DRILL-4657) Rank() will return wrong results if a frame of data is too big (more than 2 batches)

2016-05-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274958#comment-15274958
 ] 

ASF GitHub Bot commented on DRILL-4657:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/499


> Rank() will return wrong results if a frame of data is too big (more than 2 
> batches)
> 
>
> Key: DRILL-4657
> URL: https://issues.apache.org/jira/browse/DRILL-4657
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.3.0
>Reporter: Deneche A. Hakim
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.7.0
>
>
> When you run a query with RANK and one particular frame is too large to fit 
> in 2 batches of data, you will get wrong results.
> I was able to reproduce the issue in a unit test, thanks to the fact that we 
> can control the size of the batches processed by the window operator. I will 
> post a fix soon along with the unit test.
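A minimal sketch of the invariant behind this bug (plain Python, not Drill's window operator): RANK() state must survive batch boundaries, because resetting the counters per batch goes wrong exactly when a frame spans more batches than are buffered.

```python
def rank_over_batches(batches, key=lambda row: row):
    """Compute RANK() over one partition that arrives split into batches.

    The rank/row counters deliberately live OUTSIDE the batch loop; moving
    them inside reproduces the class of bug described in this issue.
    """
    rank = 0             # rank of the current peer group
    row_number = 0       # 1-based position across ALL batches
    prev_key = object()  # sentinel that never equals a real key
    ranks = []
    for batch in batches:        # state carried across batch boundaries
        for row in batch:
            row_number += 1
            k = key(row)
            if k != prev_key:    # a new peer group begins here
                rank = row_number
                prev_key = k
            ranks.append(rank)
    return ranks
```

For example, a run of equal keys that straddles a batch boundary keeps the rank of its first row, and the row after it gets its global row number as its rank.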



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-4657) Rank() will return wrong results if a frame of data is too big (more than 2 batches)

2016-05-06 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim reassigned DRILL-4657:
---

Assignee: Deneche A. Hakim  (was: Aman Sinha)

> Rank() will return wrong results if a frame of data is too big (more than 2 
> batches)
> 
>
> Key: DRILL-4657
> URL: https://issues.apache.org/jira/browse/DRILL-4657
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.3.0
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
>Priority: Critical
> Fix For: 1.7.0
>
>
> When you run a query with RANK and one particular frame is too large to fit 
> in 2 batches of data, you will get wrong results.
> I was able to reproduce the issue in a unit test, thanks to the fact that we 
> can control the size of the batches processed by the window operator. I will 
> post a fix soon along with the unit test.





[jira] [Commented] (DRILL-4657) Rank() will return wrong results if a frame of data is too big (more than 2 batches)

2016-05-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274919#comment-15274919
 ] 

ASF GitHub Bot commented on DRILL-4657:
---

Github user amansinha100 commented on the pull request:

https://github.com/apache/drill/pull/499#issuecomment-217587650
  
+1.  LGTM


> Rank() will return wrong results if a frame of data is too big (more than 2 
> batches)
> 
>
> Key: DRILL-4657
> URL: https://issues.apache.org/jira/browse/DRILL-4657
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.3.0
>Reporter: Deneche A. Hakim
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.7.0
>
>
> When you run a query with RANK and one particular frame is too large to fit 
> in 2 batches of data, you will get wrong results.
> I was able to reproduce the issue in a unit test, thanks to the fact that we 
> can control the size of the batches processed by the window operator. I will 
> post a fix soon along with the unit test.





[jira] [Updated] (DRILL-4478) binary_string cannot correctly convert buffers that do not start at offset 0

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4478:
-
Reviewer: Khurram Faraaz  (was: Aman Sinha)

> binary_string cannot correctly convert buffers that do not start at offset 0
> 
>
> Key: DRILL-4478
> URL: https://issues.apache.org/jira/browse/DRILL-4478
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Reporter: Chunhui Shi
>Assignee: Chunhui Shi
> Fix For: 1.7.0
>
>
> When binary_string is called multiple times, it only converts the first 
> buffer correctly, because that drillbuf starts at offset 0. For the second 
> and subsequent calls the drillbuf does not start at 0, so 
> DrillStringUtils.parseBinaryString cannot do the work correctly.
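The offset problem can be illustrated with a small sketch (a hypothetical parser, not DrillStringUtils itself): decoding must be bounded by the buffer's start offset and length rather than assuming the data begins at index 0.

```python
def parse_binary_string(buf, start, length):
    """Decode \\xNN escapes from buf[start:start+length].

    Honoring `start` matters: a slice of a larger allocation (as with a
    drillbuf after the first call) does not begin at index 0.
    """
    out = bytearray()
    i, end = start, start + length
    while i < end:
        if buf[i:i + 2] == b"\\x":                 # escape sequence \xNN
            out.append(int(buf[i + 2:i + 4], 16))  # two hex digits
            i += 4
        else:
            out.append(buf[i])                     # literal byte
            i += 1
    return bytes(out)
```

A parser hard-coded to scan from index 0 would decode the leading garbage instead of the intended region whenever `start` is nonzero.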





[jira] [Updated] (DRILL-3845) PartitionSender doesn't send last batch for receivers that already terminated

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-3845:
-
Reviewer: Kunal Khatua  (was: Victoria Markman)

> PartitionSender doesn't send last batch for receivers that already terminated
> -
>
> Key: DRILL-3845
> URL: https://issues.apache.org/jira/browse/DRILL-3845
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.5.0
>
> Attachments: 29c45a5b-e2b9-72d6-89f2-d49ba88e2939.sys.drill
>
>
> Even if a receiver has finished and informed the corresponding partition 
> sender, the sender will still try to send a "last batch" to the receiver when 
> it's done. In most cases this is fine, as those batches will be silently 
> dropped by the receiving DataServer, but if a receiver finished more than 10 
> minutes ago, DataServer will throw an exception because it cannot find the 
> corresponding FragmentManager (WorkEventBus has a 10-minute recentlyFinished 
> cache).
> DRILL-2274 is a reproduction for this case (after the corresponding fix is 
> applied).
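The fix idea can be sketched as follows (names are hypothetical, this is not the actual PartitionSender): record which receivers reported completion and skip the final batch for them.

```python
class PartitionSenderSketch:
    """Toy model: only receivers still listening get the final batch."""

    def __init__(self, receivers):
        self.active = set(receivers)
        self.last_batches_sent_to = []

    def receiver_finished(self, receiver):
        # The receiver told us it no longer needs data; remember that.
        self.active.discard(receiver)

    def send_last_batches(self):
        # Skip terminated receivers so no batch arrives after the
        # corresponding FragmentManager has been forgotten.
        for receiver in sorted(self.active):
            self.last_batches_sent_to.append(receiver)
```

Without the `active` bookkeeping, the last batch would be addressed to every receiver, reproducing the late-delivery exception described above.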





[jira] [Updated] (DRILL-4121) External Sort may not spill if above a receiver

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4121:
-
Reviewer: Kunal Khatua  (was: Victoria Markman)

> External Sort may not spill if above a receiver
> ---
>
> Key: DRILL-4121
> URL: https://issues.apache.org/jira/browse/DRILL-4121
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.5.0
>
>
> If the external sort is above a receiver, all received batches will contain 
> non-root buffers. The sort operator doesn't account for non-root buffers when 
> estimating how much memory it is using and whether it needs to spill to disk. 
> This may delay the spill and cause the corresponding Drillbit to use large 
> amounts of memory.
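A sketch of the accounting gap (the batch structure here is assumed for illustration): a spill check that sums only root buffers underestimates usage whenever batches also carry non-root buffers.

```python
def should_spill(batches, memory_limit, count_non_root=True):
    """Decide whether to spill, optionally ignoring non-root buffers.

    count_non_root=False models the buggy accounting described above.
    """
    total = 0
    for batch in batches:
        total += sum(batch["root_buffers"])
        if count_non_root:
            total += sum(batch["non_root_buffers"])
    return total > memory_limit
```

With a 1 MB limit and a batch holding 100 KB of root buffers plus 2 MB of non-root buffers, ignoring the latter wrongly concludes that no spill is needed.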





[jira] [Updated] (DRILL-4163) Support schema changes for MergeJoin operator.

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4163:
-
Reviewer: Khurram Faraaz  (was: Victoria Markman)

> Support schema changes for MergeJoin operator.
> --
>
> Key: DRILL-4163
> URL: https://issues.apache.org/jira/browse/DRILL-4163
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: amit hadke
>Assignee: Jason Altekruse
> Fix For: 1.5.0
>
>
> Since the external sort operator supports schema changes, allow the use of 
> union types in merge join to support schema changes.
> For now, we assume that merge join always works on record batches from the 
> sort operator. Thus merging schemas and promoting to union vectors is already 
> taken care of by the sort operator.
> Test Cases:
> 1) Only one side changes schema (join on union type and primitive type)
> 2) Both sides change schema on all columns.
> 3) Join between numeric types and string types.
> 4) Missing columns - each batch has different columns. 





[jira] [Updated] (DRILL-4187) Introduce a state to separate queries pending execution from those pending in the queue.

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4187:
-
Reviewer: Chun Chang  (was: Sudheesh Katkam)

> Introduce a state to separate queries pending execution from those pending in 
> the queue.
> 
>
> Key: DRILL-4187
> URL: https://issues.apache.org/jira/browse/DRILL-4187
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Hanifi Gunes
>Assignee: Hanifi Gunes
> Fix For: 1.5.0
>
>
> Currently, queries pending in the queue are not listed in the web UI; 
> besides, we use the state PENDING to mean pending execution. This issue 
> proposes i) to list enqueued queries in the web UI and ii) to introduce a new 
> state for queries sitting in the queue, differentiating them from those 
> pending execution.
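The proposed split could look like this (the state names are assumptions for illustration, not Drill's actual enum):

```python
from enum import Enum

class QueryState(Enum):
    ENQUEUED = "waiting in the admission queue"  # proposed new state
    PENDING = "admitted, pending execution"      # existing meaning kept
    RUNNING = "executing"

def states_shown_in_web_ui():
    # Proposal i): enqueued queries become visible alongside the others.
    return [QueryState.ENQUEUED, QueryState.PENDING, QueryState.RUNNING]
```

Keeping ENQUEUED distinct from PENDING lets the UI and operators tell admission-queue wait time apart from execution start-up time.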





[jira] [Updated] (DRILL-2517) Apply Partition pruning before reading files during planning

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-2517:
-
Reviewer: Kunal Khatua  (was: Victoria Markman)

> Apply Partition pruning before reading files during planning
> 
>
> Key: DRILL-2517
> URL: https://issues.apache.org/jira/browse/DRILL-2517
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Query Planning & Optimization
>Affects Versions: 0.7.0, 0.8.0
>Reporter: Adam Gilmore
>Assignee: Jinfeng Ni
> Fix For: 1.6.0, Future
>
>
> Partition pruning still tries to read Parquet files during the planning stage 
> even though they don't match the partition filter.
> For example, if there were an invalid Parquet file in a directory that should 
> not be queried:
> {code}
> 0: jdbc:drill:zk=local> select sum(price) from dfs.tmp.purchases where dir0 = 
> 1;
> Query failed: IllegalArgumentException: file:/tmp/purchases/4/0_0_0.parquet 
> is not a Parquet file (too small)
> {code}
> The reason is that the partition pruning happens after the Parquet plugin 
> tries to read the footer of each file.
> Ideally, partition pruning would happen first before the format plugin gets 
> involved.





[jira] [Closed] (DRILL-4589) Reduce planning time for file system partition pruning by reducing filter evaluation overhead

2016-05-06 Thread Dechang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dechang Gu closed DRILL-4589.
-

Reviewed and verified.  LGTM.

> Reduce planning time for file system partition pruning by reducing filter 
> evaluation overhead
> -
>
> Key: DRILL-4589
> URL: https://issues.apache.org/jira/browse/DRILL-4589
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
> Fix For: 1.7.0
>
>
> When Drill is used to query hundreds of thousands, or even millions, of 
> files organized into multi-level directories, the user typically provides a 
> partition filter like: dir0 = something and dir1 = something2 and ...
> For such queries, we saw that the query planning time could be unacceptably 
> long, due to three main overheads: 1) expanding and getting the list of 
> files, 2) evaluating the partition filter, and 3) getting the metadata, in 
> the case of parquet files for which a metadata cache file is not available. 
> DRILL-2517 targets the 3rd overhead. As follow-up work after DRILL-2517, we 
> plan to reduce the filter evaluation overhead. For now, the partition filter 
> evaluation is applied at the file level. In many cases, we saw that the 
> number of leaf subdirectories is significantly lower than that of files. 
> Since all the files under the same leaf subdirectory share the same 
> directory metadata, we should apply the filter evaluation at the leaf 
> subdirectory. By doing that, we could reduce the CPU overhead of evaluating 
> the filter, and the memory overhead as well.
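The optimization reads as follows in a small sketch (a hypothetical helper, not the Drill planner): group files by their leaf directory and evaluate the partition filter once per directory instead of once per file.

```python
import os
from collections import defaultdict

def prune_by_leaf_dir(files, dir_filter):
    """Keep files whose leaf directory passes the partition filter."""
    by_dir = defaultdict(list)
    for path in files:
        by_dir[os.path.dirname(path)].append(path)
    kept, evaluations = [], 0
    for directory, members in by_dir.items():
        evaluations += 1           # one check per leaf directory,
        if dir_filter(directory):  # not one per file
            kept.extend(members)
    return kept, evaluations
```

Three files spread over two directories cost two filter evaluations rather than three; with millions of files per directory the gap dominates planning time.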





[jira] [Updated] (DRILL-4490) Count(*) function returns as optional instead of required

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4490:
-
Reviewer: Krystal

> Count(*) function returns as optional instead of required
> -
>
> Key: DRILL-4490
> URL: https://issues.apache.org/jira/browse/DRILL-4490
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.6.0
>Reporter: Krystal
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.7.0
>
>
> git.commit.id.abbrev=c8a7840
> I have the following CTAS query:
> create table test as select count(*) as col1 from cp.`tpch/orders.parquet`;
> The schema of the test table shows col1 as optional:
> message root {
>   optional int64 col1;
> }





[jira] [Updated] (DRILL-4387) Improve execution side when it handles skipAll query

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4387:
-
Reviewer: Khurram Faraaz  (was: Victoria Markman)

> Improve execution side when it handles skipAll query
> 
>
> Key: DRILL-4387
> URL: https://issues.apache.org/jira/browse/DRILL-4387
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
> Fix For: 1.6.0
>
>
> DRILL-4279 changed the planner side and the RecordReader in the execution 
> side when they handle a skipAll query. However, it seems there are other 
> places in the codebase that do not handle a skipAll query efficiently. In 
> particular, in GroupScan or ScanBatchCreator, we will replace a NULL or 
> empty column list with the star column. This essentially forces the 
> execution side (RecordReader) to fetch all the columns from the data source. 
> Such behavior will lead to a big performance overhead for the SCAN operator.
> To improve Drill's performance, we should change those places as well, as 
> follow-up work after DRILL-4279.
> One simple example of this problem is:
> {code}
>SELECT DISTINCT substring(dir1, 5) from  dfs.`/Path/To/ParquetTable`;  
> {code}
> The query does not require any regular column from the parquet file. 
> However, ParquetRowGroupScan and ParquetScanBatchCreator will put the star 
> column in the column list. If the table has dozens or hundreds of columns, 
> this will make the SCAN operator much more expensive than necessary. 
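The problematic substitution can be sketched like this (a hypothetical function, not the GroupScan code): an empty projection should stay empty rather than be widened to the star column.

```python
STAR_COLUMN = "*"

def resolve_projection(columns, widen_to_star=False):
    """Resolve a scan's column list for a skipAll query.

    widen_to_star=True models the current behavior this issue complains
    about; the empty list models the cheap scan.
    """
    if not columns:  # None or empty column list means skipAll
        return [STAR_COLUMN] if widen_to_star else []
    return list(columns)
```

With the star substitution, a query needing no regular columns still pays for reading every column; returning the empty list keeps the SCAN narrow.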





[jira] [Updated] (DRILL-4577) Improve performance for query on INFORMATION_SCHEMA when HIVE is plugged in

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4577:
-
Reviewer: Dechang Gu

> Improve performance for query on INFORMATION_SCHEMA when HIVE is plugged in
> ---
>
> Key: DRILL-4577
> URL: https://issues.apache.org/jira/browse/DRILL-4577
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.7.0
>
>
> A query such as 
> {code}
> select * from INFORMATION_SCHEMA.`TABLES` 
> {code}
> is converted into calls to fetch all tables from the storage plugins. 
> When users have Hive, the calls to the hive metadata storage would be: 
> 1) get_table
> 2) get_partitions
> However, the information regarding partitions is not used in this type of 
> query. Besides, a more efficient way to fetch the tables is to use the 
> get_multi_table call.





[jira] [Updated] (DRILL-4584) JDBC/ODBC Client IP in Drill audit logs

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4584:
-
Reviewer: Krystal

> JDBC/ODBC Client IP in Drill audit logs
> ---
>
> Key: DRILL-4584
> URL: https://issues.apache.org/jira/browse/DRILL-4584
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - JDBC, Client - ODBC
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Minor
>  Labels: doc-impacting
> Fix For: 1.7.0
>
>
> Currently the Drill audit logs - sqlline_queries.json and 
> drillbit_queries.json - provide information about the client username who 
> fired the query. It would be good to also have the client IP from which the 
> query was fired.





[jira] [Updated] (DRILL-4589) Reduce planning time for file system partition pruning by reducing filter evaluation overhead

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4589:
-
Reviewer: Dechang Gu

> Reduce planning time for file system partition pruning by reducing filter 
> evaluation overhead
> -
>
> Key: DRILL-4589
> URL: https://issues.apache.org/jira/browse/DRILL-4589
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
> Fix For: 1.7.0
>
>
> When Drill is used to query hundreds of thousands, or even millions, of 
> files organized into multi-level directories, the user typically provides a 
> partition filter like: dir0 = something and dir1 = something2 and ...
> For such queries, we saw that the query planning time could be unacceptably 
> long, due to three main overheads: 1) expanding and getting the list of 
> files, 2) evaluating the partition filter, and 3) getting the metadata, in 
> the case of parquet files for which a metadata cache file is not available. 
> DRILL-2517 targets the 3rd overhead. As follow-up work after DRILL-2517, we 
> plan to reduce the filter evaluation overhead. For now, the partition filter 
> evaluation is applied at the file level. In many cases, we saw that the 
> number of leaf subdirectories is significantly lower than that of files. 
> Since all the files under the same leaf subdirectory share the same 
> directory metadata, we should apply the filter evaluation at the leaf 
> subdirectory. By doing that, we could reduce the CPU overhead of evaluating 
> the filter, and the memory overhead as well.





[jira] [Updated] (DRILL-4571) Add link to local Drill logs from the web UI

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4571:
-
Reviewer: Krystal

> Add link to local Drill logs from the web UI
> 
>
> Key: DRILL-4571
> URL: https://issues.apache.org/jira/browse/DRILL-4571
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>  Labels: doc-impacting
> Fix For: 1.7.0
>
> Attachments: display_log.JPG, log_list.JPG
>
>
> Now we have a link to the profile from the web UI.
> It will be handy for users to have a link to the local logs as well.





[jira] [Updated] (DRILL-3763) Cancel (Ctrl-C) one of concurrent queries results in ChannelClosedException

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-3763:
-
Reviewer: Khurram Faraaz

> Cancel (Ctrl-C) one of concurrent queries results in ChannelClosedException
> ---
>
> Key: DRILL-3763
> URL: https://issues.apache.org/jira/browse/DRILL-3763
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC
>Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>Assignee: Deneche A. Hakim
> Fix For: 1.7.0
>
>
> Canceling a query from a set of concurrent queries executing on the same 
> table results in a ChannelClosedException when one of the queries is 
> canceled using Ctrl-C from the sqlline prompt.
> Steps to reproduce the problem:
> 1. Start the service on 4 nodes: service map-warden start
> 2. Start eight sqlline sessions from eight different terminals: ./sqlline -u 
> "jdbc:drill:schema=dfs.tmp"
> 3. Run the below query from the eight sqlline sessions:
> select * from `twoKeyJsn.json`;
> 4. While the queries are being executed, cancel one of the queries at the 
> sqlline prompt by issuing Ctrl-C.
> 5. You will note that after a few seconds/minutes, one of the other sqlline 
> sessions reports a ChannelClosedException and that query is reported as 
> FAILED in the query profile in the Web UI.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json`;
> ...
> | 1.31643767542E9  | h|
> | 9.02780441562E8  | b|
> | 6.46524413864E8  | l|
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> ChannelClosedException: Channel closed /10.10.100.201:31010 <--> 
> /10.10.100.202:58705.
> Fragment 0:0
> [Error Id: bd765bd5-3921-4b58-89a7-cd87482f9088 on centos-01.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> 0: jdbc:drill:schema=dfs.tmp> 
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-09-10 23:23:55,019 [UserServer-1] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2a0df049-697d-f47b-86b5-1e2697946237:0:0: State change requested RUNNING --> 
> FAILED
> 2015-09-10 23:24:05,319 [2a0df049-697d-f47b-86b5-1e2697946237:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2a0df049-697d-f47b-86b5-1e2697946237:0:0: State change requested FAILED --> 
> FINISHED
> 2015-09-10 23:24:05,331 [2a0df049-697d-f47b-86b5-1e2697946237:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ChannelClosedException: 
> Channel closed /10.10.100.201:31010 <--> /10.10.100.202:58705.
> Fragment 0:0
> [Error Id: bd765bd5-3921-4b58-89a7-cd87482f9088 on centos-01.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> ChannelClosedException: Channel closed /10.10.100.201:31010 <--> 
> /10.10.100.202:58705.
> Fragment 0:0
> [Error Id: bd765bd5-3921-4b58-89a7-cd87482f9088 on centos-01.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: org.apache.drill.exec.rpc.ChannelClosedException: Channel closed 
> /10.10.100.201:31010 <--> /10.10.100.202:58705.
> at 
> org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:167)
>  

[jira] [Updated] (DRILL-3894) Directory functions (MaxDir, MinDir ..) should have optional filename parameter

2016-05-06 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-3894:
-
Reviewer: Krystal

> Directory functions (MaxDir, MinDir ..) should have optional filename 
> parameter
> ---
>
> Key: DRILL-3894
> URL: https://issues.apache.org/jira/browse/DRILL-3894
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.2.0
>Reporter: Neeraja
>Assignee: Vitalii Diravka
>  Labels: doc-impacting
> Fix For: 1.7.0
>
>
> https://drill.apache.org/docs/query-directory-functions/
> The directory functions documented above should provide ability to have 
> second parameter(file name) as optional.





[jira] [Resolved] (DRILL-3763) Cancel (Ctrl-C) one of concurrent queries results in ChannelClosedException

2016-05-06 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim resolved DRILL-3763.
-
Resolution: Fixed

DRILL-3714 should also fix this issue.

> Cancel (Ctrl-C) one of concurrent queries results in ChannelClosedException
> ---
>
> Key: DRILL-3763
> URL: https://issues.apache.org/jira/browse/DRILL-3763
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC
>Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>Assignee: Deneche A. Hakim
> Fix For: 1.7.0
>
>
> Canceling a query from a set of concurrent queries executing on the same 
> table results in a ChannelClosedException when one of the queries is 
> canceled using Ctrl-C from the sqlline prompt.
> Steps to reproduce the problem:
> 1. Start the service on 4 nodes: service map-warden start
> 2. Start eight sqlline sessions from eight different terminals: ./sqlline -u 
> "jdbc:drill:schema=dfs.tmp"
> 3. Run the below query from the eight sqlline sessions:
> select * from `twoKeyJsn.json`;
> 4. While the queries are being executed, cancel one of the queries at the 
> sqlline prompt by issuing Ctrl-C.
> 5. You will note that after a few seconds/minutes, one of the other sqlline 
> sessions reports a ChannelClosedException and that query is reported as 
> FAILED in the query profile in the Web UI.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json`;
> ...
> | 1.31643767542E9  | h|
> | 9.02780441562E8  | b|
> | 6.46524413864E8  | l|
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> ChannelClosedException: Channel closed /10.10.100.201:31010 <--> 
> /10.10.100.202:58705.
> Fragment 0:0
> [Error Id: bd765bd5-3921-4b58-89a7-cd87482f9088 on centos-01.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> 0: jdbc:drill:schema=dfs.tmp> 
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-09-10 23:23:55,019 [UserServer-1] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2a0df049-697d-f47b-86b5-1e2697946237:0:0: State change requested RUNNING --> 
> FAILED
> 2015-09-10 23:24:05,319 [2a0df049-697d-f47b-86b5-1e2697946237:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2a0df049-697d-f47b-86b5-1e2697946237:0:0: State change requested FAILED --> 
> FINISHED
> 2015-09-10 23:24:05,331 [2a0df049-697d-f47b-86b5-1e2697946237:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ChannelClosedException: 
> Channel closed /10.10.100.201:31010 <--> /10.10.100.202:58705.
> Fragment 0:0
> [Error Id: bd765bd5-3921-4b58-89a7-cd87482f9088 on centos-01.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> ChannelClosedException: Channel closed /10.10.100.201:31010 <--> 
> /10.10.100.202:58705.
> Fragment 0:0
> [Error Id: bd765bd5-3921-4b58-89a7-cd87482f9088 on centos-01.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: org.apache.drill.exec.rpc.ChannelClosedException: Channel closed 
> /10.10.100.201:31010 <--> /10.10.100.202:58705.
> at 
> org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:167)
>  

[jira] [Created] (DRILL-4658) cannot specify tab as a fieldDelimiter in table function

2016-05-06 Thread Vince Gonzalez (JIRA)
Vince Gonzalez created DRILL-4658:
-

 Summary: cannot specify tab as a fieldDelimiter in table function
 Key: DRILL-4658
 URL: https://issues.apache.org/jira/browse/DRILL-4658
 Project: Apache Drill
  Issue Type: Bug
  Components: SQL Parser
Affects Versions: 1.6.0
 Environment: Mac OS X, Java 8
Reporter: Vince Gonzalez


I can't specify a tab delimiter in the table function, perhaps because it 
counts the characters rather than interpreting the string as a character 
escape code?

{code}
0: jdbc:drill:zk=local> select columns[0] as a, cast(columns[1] as bigint) as b 
from table(dfs.tmp.`sample_cast.tsv`(type => 'text', fieldDelimiter => '\t', 
skipFirstLine => true));
Error: PARSE ERROR: Expected single character but was String: \t

table sample_cast.tsv
parameter fieldDelimiter
SQL Query null

[Error Id: 3efa82e1-3810-4d4a-b23c-32d6658dffcf on 172.30.1.144:31010] 
(state=,code=0)
{code}


