[jira] [Updated] (DRILL-2552) ZK disconnect to foreman node results in hung query on client

2015-04-15 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-2552:
--
Fix Version/s: (was: 0.9.0)
   1.0.0

> ZK disconnect to foreman node results in hung query on client
> -
>
> Key: DRILL-2552
> URL: https://issues.apache.org/jira/browse/DRILL-2552
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Ramana Inukonda Nagaraj
>Assignee: Jacques Nadeau
> Fix For: 1.0.0
>
>
> Steps taken to recreate:
> 1. Startup drillbits on multiple nodes. (They all come up and form a 8 node 
> cluster)
> 2. Start executing a long running query.
> 3. Use TCPKILL to kill all connections between foreman node and zookeeper 
> port 5181. 
> Drill seems to detect the node as gone and cancels the query but there is no 
> communication of this back to the client which is hanging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2803) Severe skew due to null values in columns even when other columns are non-null

2015-04-15 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-2803:
-

 Summary: Severe skew due to null values in columns even when other 
columns are non-null
 Key: DRILL-2803
 URL: https://issues.apache.org/jira/browse/DRILL-2803
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 0.8.0
Reporter: Aman Sinha
Assignee: Jacques Nadeau


If you have 2 columns that are hashed (either for distribution or for hash 
based operators) and one of those columns has lots of null values, it can 
result in substantial skew even if the other column has non-null values. 

In the following query the combined hash value of 2 columns is 0 even when 1 
column is non-null.   The reason is that if the starting value is null (for 
cr_reason_sk all values are null in the above query), it does not matter what 
seed is passed in.   The hash function treats the second parameter as a seed 
and not as a combiner, so it gets ignored. 

{code}
select cr_call_center_sk, cr_reason_sk, hash64(cr_reason_sk, 
hash64(cr_call_center_sk)) as hash_value from catalog_returns  where 
cr_reason_sk is null and cr_call_center_sk is not null limit 10;
+---+--++
| cr_call_center_sk | cr_reason_sk | hash_value |
+---+--++
| 1 | null | 0  |
| 1 | null | 0  |
| 4 | null | 0  |
| 1 | null | 0  |
| 4 | null | 0  |
| 2 | null | 0  |
| 2 | null | 0  |
| 2 | null | 0  |
| 2 | null | 0  |
| 2 | null | 0  |
+---+--++
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2218) Constant folding rule exposing planning bugs and not being used in plan where the constant expression is in the select list

2015-04-15 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-2218:
--
Fix Version/s: (was: 0.9.0)
   1.0.0

> Constant folding rule exposing planning bugs and not being used in plan where 
> the constant expression is in the select list
> ---
>
> Key: DRILL-2218
> URL: https://issues.apache.org/jira/browse/DRILL-2218
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Reporter: Jason Altekruse
>Assignee: Aman Sinha
> Fix For: 1.0.0
>
>
> This test method and rule is not currently in the master branch, but it does 
> appear in the patch posted for constant expression folding during planning, 
> DRILL-2060. Once it is merged, the test 
> TestConstantFolding.testConstExprFolding_InSelect() which is currently 
> ignored, will be failing. The issue is that even though the constant folding 
> rule for project is firing, and I have traced it to see that a replacement 
> project with a literal is created, it is not being selected in the final 
> plan. This seems rather odd, as there is a comment in the last line of the 
> onMatch() method of the rule that says the following. This does not appear to 
> be having the desired effect, may need to file a bug in calcite.
> {code}
> // New plan is absolutely better than old plan.
> call.getPlanner().setImportance(project, 0.0);
> {code}
> Here is the query from the test, I expect the sum to be folded in planning 
> with the newly enabled project constant folding rule.
> {code}
> select columns[0], 3+5 from cp.`test_input.csv`
> {code}
> There also some planning bugs that are exposed when this rule is enabled, 
> even if the ReduceExpressionsRule.PROJECT_INSTANCE has no impact on the plan 
> itself.
> It is causing a planning bug for the TestAggregateFunctions.testDrill2092 -as 
> well as TestProjectPushDown.testProjectPastJoinPastFilterPastJoinPushDown()-. 
> The rule's OnMatch is being called, but not modifying the plan. It seems like 
> its presence in the optimizer is making another rule fire that is creating a 
> bad plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1441) Replace sql function requires backticks to avoid parse error

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-1441:
-
 Component/s: (was: Functions - Drill)
  SQL Parser
Target Version/s: 1.1.0
   Fix Version/s: (was: 0.9.0)
  1.1.0

> Replace sql function requires backticks to avoid parse error
> 
>
> Key: DRILL-1441
> URL: https://issues.apache.org/jira/browse/DRILL-1441
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Reporter: Krystal
>Assignee: Jinfeng Ni
>Priority: Minor
> Fix For: 1.1.0
>
>
> git.commit.id.abbrev=f8d38b6
> Currently the "replace" function requires to be enclosed within backticks 
> (ie. `replace`); otherwise it would fail during the parsing.
> For example the following query fails:
> 0: jdbc:drill:schema=M7> select replace(substring(name,1,10), 'or', 'EA') 
> from `dfs.default`.voter where name like 'victor%';
> Query failed: Failure while parsing sql. Encountered "replace" at line 1, 
> column 8.
> Was expecting one of:
> "UNION" ...
> "INTERSECT" ...
> "EXCEPT" ...
> .
> .
> .
> The following modified query succeeds:
> 0: jdbc:drill:schema=M7> select `replace`(substring(name,1,10), 'or', 'EA') 
> from `dfs.default`.voter where name like 'victor%';
> ++
> |   EXPR$0   |
> ++
> | victEA tho |
> | victEA you |
> | victEA rob |
> | victEA van |
> | victEA rob |
> .
> .
> .
> The parse error was originally filed in this jira: 
> https://issues.apache.org/jira/browse/DRILL-738.  We should find a way to 
> avoid requiring backticks as this is a standard sql function.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1332) Statistics functions - regr_sxx(X, Y) regr_sxy(X, Y) regr_syy(X, Y)

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-1332:
-
Target Version/s: 1.1.0
   Fix Version/s: (was: 0.9.0)
  1.1.0

> Statistics functions - regr_sxx(X, Y) regr_sxy(X, Y) regr_syy(X, Y)
> ---
>
> Key: DRILL-1332
> URL: https://issues.apache.org/jira/browse/DRILL-1332
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Reporter: Yash Sharma
>Assignee: Jinfeng Ni
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: DRILL-1332.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1331) Aggregate Statistics function - regr_avgx(X, Y) regr_avgy(X, Y)

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-1331:
-
Component/s: (was: SQL Parser)
 Functions - Drill

> Aggregate Statistics function - regr_avgx(X, Y) regr_avgy(X, Y)
> ---
>
> Key: DRILL-1331
> URL: https://issues.apache.org/jira/browse/DRILL-1331
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Reporter: Yash Sharma
>Assignee: Jinfeng Ni
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: DRILL-1331.patch, DRILL-1331.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1331) Aggregate Statistics function - regr_avgx(X, Y) regr_avgy(X, Y)

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-1331:
-
 Component/s: (was: Functions - Drill)
  SQL Parser
Target Version/s: 1.1.0
   Fix Version/s: (was: 0.9.0)
  1.1.0

> Aggregate Statistics function - regr_avgx(X, Y) regr_avgy(X, Y)
> ---
>
> Key: DRILL-1331
> URL: https://issues.apache.org/jira/browse/DRILL-1331
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: SQL Parser
>Reporter: Yash Sharma
>Assignee: Jinfeng Ni
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: DRILL-1331.patch, DRILL-1331.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2647) NullPointerException from CONVERT_FROM given a NULL

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-2647:
-
Target Version/s: 1.0.0
   Fix Version/s: (was: 0.9.0)
  1.0.0

> NullPointerException from CONVERT_FROM given a NULL
> ---
>
> Key: DRILL-2647
> URL: https://issues.apache.org/jira/browse/DRILL-2647
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Reporter: Daniel Barclay (Drill)
>Assignee: Parth Chandra
> Fix For: 1.0.0
>
>
> CONVERT_FROM crashes when given a null value like this:
> SELECT CONVERT_FROM(CAST(NULL AS VARCHAR), 'JSON') FROM  
> INFORMATION_SCHEMA.CATALOGS;
> This fails similarly
> SELECT CONVERT_FROM(CAST(NULL AS INTEGER), 'JSON') FROM  
> INFORMATION_SCHEMA.CATALOGS;
> --
> 0: jdbc:drill:zk=local> SELECT CONVERT_FROM(CAST(NULL AS VARCHAR), 'JSON') 
> FROM  INFORMATION_SCHEMA.CATALOGS;
> Exception in thread "2ae48af0-c497-8b98-d9eb-f64353f79065:frag:0:0" 
> java.lang.RuntimeException: Error closing fragment context.
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:224)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:187)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Error 
> while converting from JSON. 
>   at 
> org.apache.drill.exec.test.generated.ProjectorGen4.doEval(ProjectorTemplate.java:38)
> Query failed: RemoteRpcException: Failure while running fragment., Error 
> while converting from JSON.  [ f0c043d2-f86f-4e4d-a864-74df93f6c79f on 
> dev-linux2:31010 ]
> [ f0c043d2-f86f-4e4d-a864-74df93f6c79f on dev-linux2:31010 ]
>   at 
> org.apache.drill.exec.test.generated.ProjectorGen4.projectRecords(ProjectorTemplate.java:62)
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:174)
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>   at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:68)
>   at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:96)
>   at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:58)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:163)
>   ... 4 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.drill.exec.vector.complex.fn.DrillBufInputStream.getStream(DrillBufInputStream.java:56)
>   at 
> org.apache.drill.exec.vector.complex.fn.JsonReader.setSource(JsonReader.java:114)
>   at 
> org.apache.drill.exec.test.generated.ProjectorGen4.doEval(ProjectorTemplate.java:34)
>   ... 14 more
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> 0: jdbc:drill:zk=local> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2554) Incorrect results for repeated values when using jdbc

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-2554:
-
Target Version/s: 1.0.0
   Fix Version/s: (was: 0.9.0)
  1.0.0

> Incorrect results for repeated values when using jdbc
> -
>
> Key: DRILL-2554
> URL: https://issues.apache.org/jira/browse/DRILL-2554
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Khurram Faraaz
>Assignee: Parth Chandra
>Priority: Critical
> Fix For: 1.0.0
>
>
> Data is missing from the output of select * from JSON data file statement. 
> Data pertaining to key2 and key3 and key4 is missing from the output of the 
> below select statement. I had enabled `store.json.all_text_mode`=true for 
> that session.
> {code}
> 0: jdbc:drill:> alter session set `store.json.all_text_mode`=true;
> +++
> | ok |  summary   |
> +++
> | true   | store.json.all_text_mode updated. |
> +++
> 1 row selected (0.022 seconds)
> 0: jdbc:drill:> select * from `testJsnData02.json`;
> ++++++
> |key |key1|key2|key3|key4|
> ++++++
> | 12345  | {} | [] | {} | [] |
> | -123456| {} | [] | {} | null   |
> | 0  | {} | [] | {} | null   |
> | -9.999 | {} | [] | {} | null   |
> | .9876 | {} | [] | {} | null   |
> | Hello World! | {} | [] | {} | null   |
> | this is a long string, not very long though! | {} | [] | {} 
> | null   |
> | true   | {} | [] | {} | null   |
> | false  | {} | [] | {} | null   |
> | null   | {} | [] | {} | null   |
> | 2147483647 | {} | [] | {} | null   |
> | 1100110010101010100101010101010101 | {} | [] | {} | 
> null   |
> | 2008-1-23 14:24:23 | {} | [] | {} | null   |
> | 2008-2-23  | {} | [] | {} | null   |
> | 10:20:30.123 | {} | null   | {} | null   |
> | -1 | {} | null   | {} | null   |
> | 3.147  | {} | null   | {} | null   |
> | null   | {"id":"1000.997"} | null   | {} | null   |
> | null   | {} | null   | {} | null   |
> | null   | {} | null   | {} | null   |
> | null   | {} | null   | {} | null   |
> | abcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ12345 
> aeiou | {} | null   | {} | null   |
> ++++++
> 22 rows selected (0.069 seconds)
> 0: jdbc:drill:> select * from sys.version;
> +++-+-++
> | commit_id  | commit_message | commit_time | build_email | build_time |
> +++-+-++
> | f658a3c513ddf7f2d1b0ad7aa1f3f65049a594fe | DRILL-2209 Insert 
> ProjectOperator with MuxExchange | 09.03.2015 @ 01:49:18 EDT | Unknown | 
> 09.03.2015 @ 04:52:49 EDT |
> +++-+-++
> 1 row selected (0.041 seconds)
> {code}
> The data that I used in my test was
> {code}
> {"key":12345}
> {"key":-123456}
> {"key":0}
> {"key":-9.999}
> {"key":.9876}
> {"key":"Hello World!"}
> {"key":"this is a long string, not very long though!"}
> {"key":true}
> {"key":false}
> {"key":null}
> {"key":2147483647}
> {"key":1100110010101010100101010101010101}
> {"key":"2008-1-23 14:24:23"}
> {"key":"2008-2-23"}
> {"key":"10:20:30.123"}
> {"key":-1}
> {"key":3.147}
> {"key1":{"id":1000.997}}
> {"key2":[1,2,3,4,-1,0,135.987,9,-.876,2147483647,"test 
> string",null,true,false]}
> {"key3":{"id":null}}
> {"key4":[null]}
> {"key":"abcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ
> 12345 aeiou"}
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1330) String aggregate function - string_agg(expression, delimiter)

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-1330:
-
Fix Version/s: (was: 0.9.0)

> String aggregate function - string_agg(expression, delimiter)
> -
>
> Key: DRILL-1330
> URL: https://issues.apache.org/jira/browse/DRILL-1330
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Reporter: Yash Sharma
>Assignee: Yash Sharma
>Priority: Minor
> Fix For: Future
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1330) String aggregate function - string_agg(expression, delimiter)

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-1330:
-
Target Version/s: Future

> String aggregate function - string_agg(expression, delimiter)
> -
>
> Key: DRILL-1330
> URL: https://issues.apache.org/jira/browse/DRILL-1330
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Reporter: Yash Sharma
>Assignee: Yash Sharma
>Priority: Minor
> Fix For: Future
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1330) String aggregate function - string_agg(expression, delimiter)

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-1330:
-
Fix Version/s: Future

> String aggregate function - string_agg(expression, delimiter)
> -
>
> Key: DRILL-1330
> URL: https://issues.apache.org/jira/browse/DRILL-1330
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Reporter: Yash Sharma
>Assignee: Yash Sharma
>Priority: Minor
> Fix For: Future
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2671) C++ Client Authentication API passing std::string across DLL boundaries

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra resolved DRILL-2671.
--
Resolution: Fixed

Fixed in e4e88cc

> C++ Client Authentication API passing std::string across DLL boundaries
> ---
>
> Key: DRILL-2671
> URL: https://issues.apache.org/jira/browse/DRILL-2671
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - C++
>Reporter: Norris Lee
>Assignee: Norris Lee
> Fix For: 0.9.0
>
>
> DrillUserProperty::setProperty is taking std::string as parameters. Memory 
> gets allocated in the client yet Drill Client tries to clean it up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2802) Projecting dir[n] by itself, results in projecting of all columns

2015-04-15 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-2802:
---

 Summary: Projecting dir[n] by itself, results in projecting of all 
columns
 Key: DRILL-2802
 URL: https://issues.apache.org/jira/browse/DRILL-2802
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 0.9.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni


{code}
0: jdbc:drill:schema=dfs> select dir1 from bigtable limit 1;
+++++
| a1 | b1 | c1 |dir1|
+++++
| 1  | a  | 2015-01-01 | 01 |
+++++
1 row selected (0.189 seconds)

0: jdbc:drill:schema=dfs> select dir0 from bigtable limit 1;
+++++
| a1 | b1 | c1 |dir0|
+++++
| 1  | a  | 2015-01-01 | 2015   |
+++++
1 row selected (0.193 seconds)
{code}

In explain plan, I don't see project:
{code}
0: jdbc:drill:schema=dfs> explain plan for select dir0 from bigtable;
+++
|text|json|
+++
| 00-00Screen
00-01  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=maprfs:/test/bigtable/2015/01/4_0_0.parquet], ReadEntryWithPath 
[path=maprfs:/test/bigtable/2015/01/3_0_0.parquet], ReadEntryWithPath 
[path=maprfs:/test/bigtable/2015/01/5_0_0.parquet], ReadEntryWithPath 
[path=maprfs:/test/bigtable/2015/01/1_0_0.parquet], ReadEntryWithPath 
[path=maprfs:/test/bigtable/2015/01/2_0_0.parquet], ReadEntryWithPath 
[path=maprfs:/test/bigtable/2015/01/0_0_0.parquet], ReadEntryWithPath 
[path=maprfs:/test/bigtable/2015/02/0_0_0.parquet], ReadEntryWithPath 
[path=maprfs:/test/bigtable/2015/03/0_0_0.parquet], ReadEntryWithPath 
[path=maprfs:/test/bigtable/2015/04/0_0_0.parquet], ReadEntryWithPath 
[path=maprfs:/test/bigtable/2016/01/parquet.file], ReadEntryWithPath 
[path=maprfs:/test/bigtable/2016/parquet.file]], selectionRoot=/test/bigtable, 
numFiles=11, columns=[`dir0`]]])
{code}

If you project both dir0 and dir1, both columns are projected with the correct 
result:

{code}
0: jdbc:drill:schema=dfs> select dir0, dir1 from bigtable;
+++
|dir0|dir1|
+++
| 2015   | 01 |
| 2015   | 01 |
| 2015   | 01 |
| 2015   | 01 |
| 2015   | 01 |
| 2015   | 01 |
| 2015   | 01 |
| 2015   | 01 |
| 2015   | 01 |
{code}

{code}
[Wed Apr 15 14:09:47 root@/mapr/vmarkman.cluster.com/test/bigtable ] # ls -R
.:
2015  2016

./2015:
01  02  03  04

./2015/01:
0_0_0.parquet  1_0_0.parquet  2_0_0.parquet  3_0_0.parquet  4_0_0.parquet  
5_0_0.parquet

./2015/02:
0_0_0.parquet

./2015/03:
0_0_0.parquet

./2015/04:
0_0_0.parquet

./2016:
01  parquet.file

./2016/01:
parquet.file
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2798) Suppress log location message from sqlline

2015-04-15 Thread Parth Chandra (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496994#comment-14496994
 ] 

Parth Chandra commented on DRILL-2798:
--

+!. Looks good to me.

> Suppress log location message from sqlline
> --
>
> Key: DRILL-2798
> URL: https://issues.apache.org/jira/browse/DRILL-2798
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 0.8.0
>Reporter: Parth Chandra
>Assignee: Parth Chandra
> Fix For: 0.9.0
>
> Attachments: DRILL-2798.1.patch.txt
>
>
> sqlline is now printing a message with the location of the log file that is 
> breaking external scripts to extract data using Drill.
> We need to add an option to suppress sqlline shell script messages (or remove 
> them).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-2798) Suppress log location message from sqlline

2015-04-15 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra reassigned DRILL-2798:


Assignee: Parth Chandra  (was: DrillCommitter)

> Suppress log location message from sqlline
> --
>
> Key: DRILL-2798
> URL: https://issues.apache.org/jira/browse/DRILL-2798
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 0.8.0
>Reporter: Parth Chandra
>Assignee: Parth Chandra
> Fix For: 0.9.0
>
> Attachments: DRILL-2798.1.patch.txt
>
>
> sqlline is now printing a message with the location of the log file that is 
> breaking external scripts to extract data using Drill.
> We need to add an option to suppress sqlline shell script messages (or remove 
> them).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-811) Selecting individual columns from views created using 'select * ......' query throws an error

2015-04-15 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli closed DRILL-811.
---

Verified that the issue is fixed. Thank You.

> Selecting individual columns from views created using 'select * ..' query 
> throws an error
> -
>
> Key: DRILL-811
> URL: https://issues.apache.org/jira/browse/DRILL-811
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Reporter: Rahul Challapalli
>Assignee: DrillCommitter
> Fix For: 0.5.0
>
> Attachments: rankings.parquet
>
>
> git.commit.id.abbrev=70fab8c
> Follow the below steps :
> create view v1 as select * from `dfs/parquet/rankings`;
> select pageRank from v1; 
> The query fails with the below error : 
> message: "Failure while parsing sql. < IndexOutOfBoundsException:[ Index: 1, 
> Size: 1 ]"
> This looks like a known issue. However I couldn't find any bug related to it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-806) Information_Schema.Views : ViewDefinition is missing

2015-04-15 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli closed DRILL-806.
---

> Information_Schema.Views  : ViewDefinition is missing
> -
>
> Key: DRILL-806
> URL: https://issues.apache.org/jira/browse/DRILL-806
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Reporter: Rahul Challapalli
>Assignee: Venki Korukanti
> Fix For: 0.4.0
>
>
> git.commit.id.abbrev=70fab8c
> Information_Schema.Views : The view_definition column currently displays 
> "TODO: GetViewDefinition"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-792) Joining a hive table with parquet file is returning an empty result set

2015-04-15 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli closed DRILL-792.
---

Verified this. Thanks for the fix

> Joining a hive table with parquet file is returning an empty result set
> ---
>
> Key: DRILL-792
> URL: https://issues.apache.org/jira/browse/DRILL-792
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Rahul Challapalli
>Priority: Critical
> Fix For: 0.5.0
>
> Attachments: 792.ddl, rankings.parquet, rankings.txt, 
> uservisits.parquet, uservisits.txt
>
>
> git.commit.id.abbrev=70fab8c
> 1. Joining a hive table with parquet results in an empty output. Check below 
> query
> select rankings.pageRank pagerank from `dfs/parquet/rankings/` rankings inner 
> join hive.uservisits uservisits on rankings.pageURL = 
> uservisits.destinationurl
> 2. Joining hive table with hive table seems to work fine
> select rankings.pagerank pagerank from hive.rankings rankings inner join 
> hive.uservisits uservisits on rankings.pageurl = uservisits.destinationurl
> I attached the parquet and text files required along with the required hive 
> ddl. Let me know if you need more information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2798) Suppress log location message from sqlline

2015-04-15 Thread Patrick Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wong updated DRILL-2798:

Assignee: DrillCommitter  (was: Patrick Wong)

> Suppress log location message from sqlline
> --
>
> Key: DRILL-2798
> URL: https://issues.apache.org/jira/browse/DRILL-2798
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 0.8.0
>Reporter: Parth Chandra
>Assignee: DrillCommitter
> Fix For: 0.9.0
>
> Attachments: DRILL-2798.1.patch.txt
>
>
> sqlline is now printing a message with the location of the log file that is 
> breaking external scripts to extract data using Drill.
> We need to add an option to suppress sqlline shell script messages (or remove 
> them).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2798) Suppress log location message from sqlline

2015-04-15 Thread Patrick Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wong updated DRILL-2798:

Attachment: DRILL-2798.1.patch.txt

DRILL-2798.1.patch.txt - don't print message about Drill log dir unless 
environment variable DRILL_LOG_DEBUG=1

> Suppress log location message from sqlline
> --
>
> Key: DRILL-2798
> URL: https://issues.apache.org/jira/browse/DRILL-2798
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 0.8.0
>Reporter: Parth Chandra
>Assignee: Patrick Wong
> Fix For: 0.9.0
>
> Attachments: DRILL-2798.1.patch.txt
>
>
> sqlline is now printing a message with the location of the log file that is 
> breaking external scripts to extract data using Drill.
> We need to add an option to suppress sqlline shell script messages (or remove 
> them).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2475) Handle IterOutcome.NONE correctly in operators

2015-04-15 Thread Sudheesh Katkam (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheesh Katkam updated DRILL-2475:
---
Target Version/s: 1.0.0  (was: 0.9.0)

> Handle IterOutcome.NONE correctly in operators
> --
>
> Key: DRILL-2475
> URL: https://issues.apache.org/jira/browse/DRILL-2475
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.8.0
>Reporter: Venki Korukanti
>Assignee: Sudheesh Katkam
> Fix For: 1.0.0
>
>
> Currently not all operators are handling the NONE (with no OK_NEW_SCHEMA) 
> correctly. This JIRA is to go through the operators and check if it handling 
> the NONE correctly or not and modify accordingly.
> (from DRILL-2453)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2475) Handle IterOutcome.NONE correctly in operators

2015-04-15 Thread Sudheesh Katkam (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheesh Katkam updated DRILL-2475:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Handle IterOutcome.NONE correctly in operators
> --
>
> Key: DRILL-2475
> URL: https://issues.apache.org/jira/browse/DRILL-2475
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.8.0
>Reporter: Venki Korukanti
>Assignee: Sudheesh Katkam
> Fix For: 1.0.0
>
>
> Currently not all operators are handling the NONE (with no OK_NEW_SCHEMA) 
> correctly. This JIRA is to go through the operators and check if it handling 
> the NONE correctly or not and modify accordingly.
> (from DRILL-2453)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2363) Support user impersonation in Drill

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2363:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Support user impersonation in Drill
> ---
>
> Key: DRILL-2363
> URL: https://issues.apache.org/jira/browse/DRILL-2363
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Execution - Flow, Metadata
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
> Fix For: 1.0.0
>
>
> Currently Drill has no impersonation. All queries are run as the user who is 
> running the Drillbit process and not as the user who issued the query from 
> JDBC/ODBC. This restricts the controlling of who can query tables/views and 
> who can create new tables/views.
> This is a umbrella JIRA to track impersonation feature in Drill.





[jira] [Resolved] (DRILL-1512) Avro Record Reader

2015-04-15 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips resolved DRILL-1512.

Resolution: Fixed

Fixed with 55a9a59 and bf3db31

> Avro Record Reader
> --
>
> Key: DRILL-1512
> URL: https://issues.apache.org/jira/browse/DRILL-1512
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Andrew
>Assignee: Steven Phillips
>Priority: Minor
>  Labels: avro, drill
> Fix For: 0.9.0
>
> Attachments: DRILL-1512.1.patch.txt, DRILL-1512.2.patch.txt, 
> DRILL-1512.3.patch.txt, DRILL-1512.4a.patch, DRILL-1512.4b.patch
>
>
> Record reader implementation for Avro data files.





[jira] [Resolved] (DRILL-2658) Add ilike and regex substring functions

2015-04-15 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips resolved DRILL-2658.

Resolution: Fixed

Fixed with 033d0df

> Add ilike and regex substring functions
> ---
>
> Key: DRILL-2658
> URL: https://issues.apache.org/jira/browse/DRILL-2658
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Functions - Drill
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Fix For: 1.0.0
>
> Attachments: DRILL-2658.patch, DRILL-2658.patch
>
>
> This will not modify the parser, so Postgres syntax such as:
> "... where c ILIKE '%ABC%'"
> will not be supported for now. It will simply be a function:
> "... where ILIKE(c, '%ABC%')"
> Same for substring:
> "select substr(c, 'abc')..."
> will be equivalent to the Postgres
> "select substr(c from 'abc')",
> but 'abc' will be treated as a Java regex pattern.
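A function-style ILIKE can be approximated by translating the SQL pattern into a case-insensitive Java regex. This is a sketch of the semantics only; the method name `ilike` and the translation are assumptions, not Drill's implementation.

```java
import java.util.regex.Pattern;

public class IlikeSketch {
    // Hypothetical sketch: translate a SQL LIKE pattern into a Java regex,
    // then match case-insensitively, mirroring what ILIKE(c, pattern) means.
    static boolean ilike(String input, String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char ch : likePattern.toCharArray()) {
            switch (ch) {
                case '%': regex.append(".*"); break;   // % matches any run of characters
                case '_': regex.append('.');  break;   // _ matches one character
                default:  regex.append(Pattern.quote(String.valueOf(ch)));
            }
        }
        return Pattern.compile(regex.toString(),
                        Pattern.CASE_INSENSITIVE | Pattern.DOTALL)
                .matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(ilike("xxABCyy", "%abc%")); // true
        System.out.println(ilike("xyz", "%abc%"));     // false
    }
}
```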





[jira] [Updated] (DRILL-2722) Query profile data not being sent/received (and web UI not updated)

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2722:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Query profile data not being sent/received (and web UI not updated)
> ---
>
> Key: DRILL-2722
> URL: https://issues.apache.org/jira/browse/DRILL-2722
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP, Execution - Flow, Execution - RPC
>Affects Versions: 0.8.0
>Reporter: Chris Westin
>Assignee: Sudheesh Katkam
> Fix For: 1.0.0
>
> Attachments: query1_foreman.log
>
>
> [~amansinha100] has a test case that shows that profile information is not 
> being received (or not being sent, I'm not sure which) for a long-running 
> query. The query appears to stop, even though cycles are being used, and it 
> looks like work is being done. This is becoming a problem for monitoring 
> query progress. We need to find out why the profile information isn't being 
> updated, and fix that.





[jira] [Updated] (DRILL-2476) Handle IterOutcome.STOP in buildSchema()

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2476:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Handle IterOutcome.STOP in buildSchema()
> 
>
> Key: DRILL-2476
> URL: https://issues.apache.org/jira/browse/DRILL-2476
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.7.0
>Reporter: Sudheesh Katkam
>Assignee: Sudheesh Katkam
> Fix For: 1.0.0
>
>
> There are some {{RecordBatch}} implementations like {{HashAggBatch}} that 
> override the {{buildSchema()}} function. The overriding functions do not 
> handle {{IterOutcome.STOP}}. This causes the {{FragmentContext}} to receive 
> two failures in some cases (linked JIRAs).





[jira] [Updated] (DRILL-2325) conf/drill-override-example.conf is outdated

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2325:

Fix Version/s: (was: 0.9.0)
   1.0.0

> conf/drill-override-example.conf is outdated
> 
>
> Key: DRILL-2325
> URL: https://issues.apache.org/jira/browse/DRILL-2325
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Reporter: Zhiyong Liu
>Assignee: Sudheesh Katkam
> Fix For: 1.0.0
>
>
> The conf/drill-override-example.conf file is outdated.  Properties have been 
> added (e.g., compile), removed (e.g., cache.hazel.subnets) or otherwise 
> modified.
> The file is statically tracked in 
> distribution/src/resources/drill-override-example.conf.  Ideally there should 
> be a way to update the file programmatically when things change.





[jira] [Updated] (DRILL-2167) Order by on a repeated index from the output of a flatten on large no of records results in incorrect results

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2167:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Order by on a repeated index from the output of a flatten on large no of 
> records results in incorrect results
> -
>
> Key: DRILL-2167
> URL: https://issues.apache.org/jira/browse/DRILL-2167
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Sudheesh Katkam
>Priority: Critical
> Fix For: 1.0.0
>
> Attachments: data.json
>
>
> git.commit.id.abbrev=3e33880
> The query below returns 26 records; based on the data set we should 
> only receive 20 records. 
> {code}
> select s.uid from (select d.uid, flatten(d.map.rm) rms from `data.json` d) s 
> order by s.rms.rptd[1].d;
> {code}
> When I removed the order by part, Drill correctly reported 20 records.
> {code}
> select s.uid from (select d.uid, flatten(d.map.rm) rms from `data.json` d) s;
> {code}
> I attached the data set with 2 records. I copied over the data set 5 
> times and ran the queries on top of it. Let me know if you have any other 
> questions.





[jira] [Updated] (DRILL-2264) Incorrect data when we use aggregate functions with flatten

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2264:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Incorrect data when we use aggregate functions with flatten
> ---
>
> Key: DRILL-2264
> URL: https://issues.apache.org/jira/browse/DRILL-2264
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
>Priority: Critical
> Fix For: 1.0.0
>
>
> git.commit.id.abbrev=6676f2d
> Data Set :
> {code}
> {
>   "uid":1,
>   "lst_lst" : [[1,2],[3,4]]
> }
> {
>   "uid":2,
>   "lst_lst" : [[1,2],[3,4]]
> }
> {code}
> The below query returns incorrect results :
> {code}
> select uid,MAX( flatten(lst_lst[1]) + flatten(lst_lst[0])) from `temp.json` 
> group by uid, flatten(lst_lst[1]), flatten(lst_lst[0]);
> +++
> |uid |   EXPR$1   |
> +++
> | 1  | 6  |
> | 1  | 6  |
> | 1  | 6  |
> | 1  | 6  |
> | 2  | 6  |
> | 2  | 6  |
> | 2  | 6  |
> | 2  | 6  |
> +++
> {code}
> However, if we use a subquery, Drill returns the right data
> {code}
> select uid, MAX(l1+l2) from (select uid,flatten(lst_lst[1]) l1, 
> flatten(lst_lst[0]) l2 from `temp.json`) sub group by uid, l1, l2;
> +++
> |uid |   EXPR$1   |
> +++
> | 1  | 4  |
> | 1  | 5  |
> | 1  | 5  |
> | 1  | 6  |
> | 2  | 4  |
> | 2  | 5  |
> | 2  | 5  |
> | 2  | 6  |
> +++
> {code}
> Also using a single flatten yields proper results
> {code}
> select uid,MAX(flatten(lst_lst[0])) from `temp.json` group by uid, 
> flatten(lst_lst[0]);
> +++
> |uid |   EXPR$1   |
> +++
> | 1  | 1  |
> | 1  | 2  |
> | 2  | 1  |
> | 2  | 2  |
> +++
> {code}
> Marked it as critical since we return incorrect data. Let me know if you 
> have any other questions.
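For reference, the rows in the correct subquery result follow from treating the two flattens as a cross product of the unnested lists. The sketch below illustrates that semantics under that assumption; it is not Drill's code.

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenCrossSketch {
    public static void main(String[] args) {
        // From the report's data set, lst_lst = [[1,2],[3,4]] for each uid.
        int[] l1 = {3, 4};  // flatten(lst_lst[1])
        int[] l2 = {1, 2};  // flatten(lst_lst[0])
        // Two flattens in one SELECT should act like a cross join of the
        // two unnested lists, which is what the subquery formulation returns.
        List<Integer> sums = new ArrayList<>();
        for (int a : l1)
            for (int b : l2)
                sums.add(a + b);   // pairs (3,1) (3,2) (4,1) (4,2)
        System.out.println(sums);  // [4, 5, 5, 6] -- matches EXPR$1 per uid
    }
}
```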





[jira] [Updated] (DRILL-1770) Flatten on top a subquery which applies flatten over kvgen results in a ClassCastException

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-1770:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Flatten on top a subquery which applies flatten over kvgen results in a 
> ClassCastException
> --
>
> Key: DRILL-1770
> URL: https://issues.apache.org/jira/browse/DRILL-1770
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
> Attachments: DRILL_1770.patch
>
>
> git.commit.id.abbrev=108d29f
> Dataset :
> {code}
> {"map":{"rm": [ {"rptd": [{ "a": "foo"}]}]}}
> {code}
> Query :
> {code}
> select flatten(sub.fk.`value`) from (select flatten(kvgen(map)) fk from 
> `nested.json`) sub;
> Query failed: Failure while running fragment., 
> org.apache.drill.exec.vector.NullableIntVector cannot be cast to 
> org.apache.drill.exec.vector.RepeatedVector
> {code}
> Let me know if you need more information.





[jira] [Updated] (DRILL-2745) Query returns IOB Exception when JSON data with empty arrays is input to flatten function

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2745:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Query returns IOB Exception when JSON data with empty arrays is input to 
> flatten function
> -
>
> Key: DRILL-2745
> URL: https://issues.apache.org/jira/browse/DRILL-2745
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.9.0
> Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: 
> Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 
> EDT 
>Reporter: Khurram Faraaz
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
>
> An IOB Exception is returned when a JSON file that has many empty arrays and 
> arrays with different types of data is passed to the flatten function.
> Tested on a 4-node CentOS cluster.
> {code}
> 0: jdbc:drill:> select flatten(outkey) from `nestedJArry.json` ;
> Query failed: RemoteRpcException: Failure while running fragment., index: 
> 176, length: 4 (expected: range(0, 176)) [ 
> 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ]
> [ 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> 0: jdbc:drill:> select outkey from `nestedJArry.json`;
> ++
> |   outkey   |
> ++
> | 
> [["100","1000","200","99","1","0","-1","10"],["a","b","c","d","e","p","o","f","m","q","d","s","v"],["2012-04-01","1998-02-20","2011-08-05","1992-01-01"],["10:30:29.123","12:29:21.999"],["sdfklgjsdlkjfghlsidhfgopiuesrtoipuertoiurtyoiurotuiydkfjlbn,bfn;waokefpqowertoipuwergklnjdfbpdsiofgoigiuewqrqiugkjehgjksdhbvkjshdfkjsdfbnlkfbkljrghljrelkhbdlkfjbgkdfjbgkndfbnkldfgklbhjdflkghjlnkoiurty984756897345609782-3458745uiyoheirluht7895e6y"],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],["null"],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],["test
>  string","hello world!","just do it!","houston we have a 
> problem"],["1","2","3","4","5","6","7","8","9","0"]] |
> ++
> 1 row selected (0.088 seconds)
> Stack trace from drillbit.log
> 2015-04-09 23:54:41,965 [2ad8eebd-adb6-6f7e-469e-4bb8ca276984:frag:0:0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
> fragment
> java.lang.IndexOutOfBoundsException: index: 176, length: 4 (expected: 
> range(0, 176))
> at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:187) 
> ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> at io.netty.buffer.DrillBuf.chk(DrillBuf.java:209) 
> ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> at io.netty.buffer.DrillBuf.setInt(DrillBuf.java:513) 
> ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> at 
> org.apache.drill.exec.vector.UInt4Vector$Mutator.set(UInt4Vector.java:363) 
> ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.RepeatedVarCharVector.splitAndTransferTo(RepeatedVarCharVector.java:173)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.RepeatedVarCharVector$TransferImpl.splitAndTransfer(RepeatedVarCharVector.java:200)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.test.generated.FlattenerGen1107.flattenRecords(FlattenTemplate.java:106)
>  ~[na:na]
> at 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.doWork(FlattenRecordBatch.java:156)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch

[jira] [Updated] (DRILL-2161) Flatten on a list within a list on a large data set results in an IOB Exception

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2161:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Flatten on a list within a list on a large data set results in an IOB 
> Exception
> ---
>
> Key: DRILL-2161
> URL: https://issues.apache.org/jira/browse/DRILL-2161
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow, Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
> Fix For: 1.0.0
>
> Attachments: data.json
>
>
> git.commit.id.abbrev=3e33880
> I attached the data set which contains 2 records.
> Below query works fine on the attached data set
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDir> select uid, flatten(d.lst_lst) lst 
> from `data.json` d;
> +++
> |uid |lst |
> +++
> | 1  | [1,2,3,4,5] |
> | 1  | [2,3,4,5,6] |
> | 2  | [1,2,3,4,5] |
> | 2  | [2,3,4,5,6] |
> +++
> {code}
> However, if I copy the same data set 50,000 times and run the same query, it 
> fails with an IOB. Below are the contents of the log file:
> {code}
> java.lang.IndexOutOfBoundsException: index: 16384, length: 4 (expected: 
> range(0, 16384))
>   at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:156) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>   at io.netty.buffer.DrillBuf.chk(DrillBuf.java:178) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>   at io.netty.buffer.DrillBuf.getInt(DrillBuf.java:447) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>   at 
> org.apache.drill.exec.vector.UInt4Vector$Accessor.get(UInt4Vector.java:309) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.RepeatedListVector.populateEmpties(RepeatedListVector.java:385)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.RepeatedListVector.access$300(RepeatedListVector.java:54)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.RepeatedListVector$Mutator.setValueCount(RepeatedListVector.java:132)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setValueCount(ProjectRecordBatch.java:248)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:181)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractReco

[jira] [Updated] (DRILL-2356) Wrong result when ROUND function is used in expression

2015-04-15 Thread Mehant Baid (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-2356:
---
Attachment: DRILL-2356.patch

> Wrong result when ROUND function is used in expression
> --
>
> Key: DRILL-2356
> URL: https://issues.apache.org/jira/browse/DRILL-2356
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Assignee: Mehant Baid
>Priority: Critical
> Fix For: 0.9.0
>
> Attachments: DRILL-2356.patch, alltypes_with_nulls
>
>
> Observe overflow in the expression SUM(ROUND ...)):
> {code}
> 0: jdbc:drill:schema=dfs> select
> . . . . . . . . . . . . > sum(c_bigint) as sum_c_bigint,
> . . . . . . . . . . . . > sum(ROUND(c_bigint/12))
> . . . . . . . . . . . . > from
> . . . . . . . . . . . . > alltypes_with_nulls
> . . . . . . . . . . . . > group by
> . . . . . . . . . . . . > c_varchar,
> . . . . . . . . . . . . > c_integer,
> . . . . . . . . . . . . > c_date,
> . . . . . . . . . . . . > c_time,
> . . . . . . . . . . . . > c_boolean;
> +--++
> | sum_c_bigint |   EXPR$1   |
> +--++
> | -3477884857818808320 | -2147483648 |
> | 0| 0  |
> | 0| 0  |
> | 4465148082249531392 | 2147483647 |
> | 4465148082249531392 | 2147483647 |
> | -3999734748766273536 | -2147483648 |
> | 0| 0  |
> | -449093763428515840 | -2147483648 |
> | -1825551161692782592 | -2147483648 |
> | -7308685202664980480 | -2147483648 |
> | -6772904422084182016 | -2147483648 |
> ...
> ...
> {code}
> Wrapping ROUND around SUM, produces incorrect result as well:
> {code}
> 0: jdbc:drill:schema=dfs> select
> . . . . . . . . . . . . > sum(c_bigint) as sum_c_bigint,
> . . . . . . . . . . . . > ROUND(sum(c_bigint/12))
> . . . . . . . . . . . . > from
> . . . . . . . . . . . . > alltypes_with_nulls
> . . . . . . . . . . . . > group by
> . . . . . . . . . . . . > c_varchar,
> . . . . . . . . . . . . > c_integer,
> . . . . . . . . . . . . > c_date,
> . . . . . . . . . . . . > c_time,
> . . . . . . . . . . . . > c_boolean;
> +--++
> | sum_c_bigint |   EXPR$1   | 
> +--++ 
> | -3477884857818808320 | -2147483648 |
> | 0| 0  | 
> | 0| 0  | 
> | 4465148082249531392 | 2147483647 |
> | 4465148082249531392 | 2147483647 |
> | -3999734748766273536 | -2147483648 |
> | 0| 0  |
> | -449093763428515840 | -2147483648 |
> | -1825551161692782592 | -2147483648 |
> | -7308685202664980480 | -2147483648 |
> | -6772904422084182016 | -2147483648 |
> ...
> ...
> {code}
> If you remove ROUND function, you get correct result:
> {code}
> 0: jdbc:drill:schema=dfs> select
> . . . . . . . . . . . . > sum(c_bigint) as sum_c_bigint,
> . . . . . . . . . . . . > sum(c_bigint/12)
> . . . . . . . . . . . . > from
> . . . . . . . . . . . . > alltypes_with_nulls
> . . . . . . . . . . . . > group by
> . . . . . . . . . . . . > c_varchar,
> . . . . . . . . . . . . > c_integer,
> . . . . . . . . . . . . > c_date,
> . . . . . . . . . . . . > c_time,
> . . . . . . . . . . . . > c_boolean;
> +--++
> | sum_c_bigint |   EXPR$1   |
> +--++
> | -3477884857818808320 | -289823738151567360 |
> | 0| 0  |
> | 0| 0  |
> | 4465148082249531392 | 372095673520794282 |
> | 4465148082249531392 | 372095673520794282 |
> | -3999734748766273536 | -11229063856128 |
> | 0| 0  |
> ...
> ...
> {code}
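The clamped values 2147483647 and -2147483648 are exactly what Java's double-to-int narrowing produces, which suggests (a hypothesis, not confirmed against Drill's generated code) that the ROUND output type was declared as INT. A minimal demonstration:

```java
public class RoundOverflowSketch {
    public static void main(String[] args) {
        long sum = 4465148082249531392L;   // a sum_c_bigint value from the report
        double divided = sum / 12.0;       // ~3.72e17, far outside int range
        // JLS 5.1.3: narrowing a double to int saturates at
        // Integer.MAX_VALUE / Integer.MIN_VALUE instead of wrapping,
        // producing exactly the clamped values seen in the query output.
        System.out.println((int) Math.rint(divided));  // 2147483647
        System.out.println((int) (-divided));          // -2147483648
    }
}
```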



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2217) Trying to flatten an empty list should return an empty result

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2217:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Trying to flatten an empty list should return an empty result
> -
>
> Key: DRILL-2217
> URL: https://issues.apache.org/jira/browse/DRILL-2217
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
> Fix For: 1.0.0
>
> Attachments: error.log
>
>
> git.commit.id.abbrev=3d863b5
> Data Set :
> {code}
> {"empty":[[],[[]]]}
> {code}
> Query :
> {code}
> select flatten(empty) from `data1.json`;
> Query failed: RemoteRpcException: Failure while running fragment.[ 
> 1b3123d9-92bc-45d5-bef8-b5f1be9def07 on qa-node191.qa.lab:31010 ]
> [ 1b3123d9-92bc-45d5-bef8-b5f1be9def07 on qa-node191.qa.lab:31010 ]
> {code}
> I also attached the error from the logs.





[jira] [Updated] (DRILL-1542) Early fragment termination causes non running intermediate fragments to error

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-1542:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Early fragment termination causes non running intermediate fragments to error
> -
>
> Key: DRILL-1542
> URL: https://issues.apache.org/jira/browse/DRILL-1542
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Jacques Nadeau
>Assignee: Chris Westin
> Fix For: 1.0.0
>
>
> Caused by: java.lang.NullPointerException: null
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.receivingFragmentFinished(FragmentExecutor.java:75)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.work.batch.ControlHandlerImpl.receivingFragmentFinished(ControlHandlerImpl.java:174)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.work.batch.ControlHandlerImpl.handle(ControlHandlerImpl.java:76)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:60) 
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:38) 
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
> at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:56) 
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:191) 
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:171) 
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
> at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>  [netty-codec-4.0.20.Final.jar:4.0.20.Final]





[jira] [Updated] (DRILL-1976) Possible Memory Leak in drill jdbc client when dealing with wide columns (5000 chars long)

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-1976:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Possible Memory Leak in drill jdbc client when dealing with wide columns 
> (5000 chars long)
> --
>
> Key: DRILL-1976
> URL: https://issues.apache.org/jira/browse/DRILL-1976
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Rahul Challapalli
>Assignee: Chris Westin
> Fix For: 1.0.0
>
> Attachments: wide-strings.sh
>
>
> git.commit.id.abbrev=b491cdb
> I am seeing an execution failure when I execute the same query multiple times 
> (<10). The data file contains 9 columns, 7 of which are wide strings 
> (4000-5000 chars long).
> {code}
> select ws.*, sub.str_var str_var1 from widestrings ws INNER JOIN (select 
> str_var, max(tinyint_var) max_ti from widestrings group by str_var) sub on 
> ws.tinyint_var = sub.max_ti
> {code}
> Below are my memory settings :
> {code}
> DRILL_MAX_DIRECT_MEMORY="32G"
> DRILL_MAX_HEAP="4G"
> {code}
> Error From the JDBC client
> {code}
> select ws.*, sub.str_var str_var1 from widestrings ws INNER JOIN (select 
> str_var, max(tinyint_var) max_ti from widestrings group by str_var) sub on 
> ws.tinyint_var = sub.max_ti
> Exception in pipeline.  Closing channel between local /10.10.100.190:38179 
> and remote qa-node191.qa.lab/10.10.100.191:31010
> io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct 
> buffer memory
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:151)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>   at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
>   at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
>   at java.nio.Bits.reserveMemory(Bits.java:658)
>   at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
>   at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
>   at 
> io.netty.buffer.PoolArena$DirectArena.newUnpooledChunk(PoolArena.java:443)
>   at io.netty.buffer.PoolArena.allocateHuge(PoolArena.java:187)
>   at io.netty.buffer.PoolArena.allocate(PoolArena.java:165)
>   at io.netty.buffer.PoolArena.reallocate(PoolArena.java:280)
>   at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:110)
>   at 
> io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251)
>   at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849)
>   at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841)
>   at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831)
>   at io.netty.buffer.WrappedByteBuf.writeBytes(WrappedByteBuf.java:600)
>   at 
> io.netty.buffer.UnsafeDirectLittleEndian.writeBytes(UnsafeDirectLittleEndian.java:25)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:144)
>   ... 13 more
> Channel closed between local /10.10.100.190:38179 and remote 
> qa-node191.qa.lab/10.10.100.191:31010
> Channel is closed, discarding remaining 255231 byte(s) in buffer.
> {code}
> The logs





[jira] [Updated] (DRILL-1942) Improve off-heap memory usage tracking

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-1942:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Improve off-heap memory usage tracking
> --
>
> Key: DRILL-1942
> URL: https://issues.apache.org/jira/browse/DRILL-1942
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Reporter: Chris Westin
>Assignee: Chris Westin
> Fix For: 1.0.0
>
>
> We're using a lot more memory than we think we should. We may be leaking it, 
> or not releasing it as soon as we could. 
> This is a call to come up with some improved tracking so that we can get 
> statistics out about exactly where we're using it, and whether or not we can 
> release it earlier.





[jira] [Updated] (DRILL-2550) Drillbit disconnect from ZK results in drillbit being lost until restart

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2550:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Drillbit disconnect from ZK results in drillbit being lost until restart
> 
>
> Key: DRILL-2550
> URL: https://issues.apache.org/jira/browse/DRILL-2550
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Ramana Inukonda Nagaraj
>Assignee: Chris Westin
>Priority: Minor
> Fix For: 1.0.0
>
>
> Not quite sure if this is an issue or even if it's important - maybe someone 
> can think of a situation where this might be a bigger issue.
> Steps taken to recreate:
> 1. Startup drillbits on multiple nodes. (They all come up and form an 8-node 
> cluster)
> 2. Start executing a long running query.
> 3. Use TCPKILL to kill all connections between one node and zookeeper port 
> 5181. 
> Drill seems to behave very gracefully here - I see a nice error message 
> saying Query failed: ForemanException: One more more nodes lost connectivity 
> during query. Identified node was atsqa6c61.qa.lab
> However, once I start allowing connections back, the node is not brought back 
> into the cluster until a drillbit restart.





[jira] [Updated] (DRILL-2166) left join with complex type throw ClassTransformationException

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2166:

Fix Version/s: (was: 0.9.0)
   1.0.0

> left join with complex type throw ClassTransformationException
> --
>
> Key: DRILL-2166
> URL: https://issues.apache.org/jira/browse/DRILL-2166
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.8.0
>Reporter: Chun Chang
>Assignee: Chris Westin
> Fix For: 1.0.0
>
>
> #Thu Jan 29 18:00:57 EST 2015
> git.commit.id.abbrev=09f7fb2
> Dataset can be downloaded from 
> https://s3.amazonaws.com/apache-drill/files/complex.json.gz
> The following query caused the exception:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select a.id, a.soa, b.sfa[0], 
> b.soa[1] from `complex.json` a left outer join `complex.json` b on 
> a.sia[0]=b.sia[0] order by a.id limit 20;
> Query failed: RemoteRpcException: Failure while running fragment., Line 35, 
> Column 32: No applicable constructor/method found for actual parameters "int, 
> int, org.apache.drill.exec.vector.complex.MapVector"; candidates are: "public 
> void org.apache.drill.exec.vector.NullableTinyIntVector.copyFromSafe(int, 
> int, org.apache.drill.exec.vector.NullableTinyIntVector)", "public void 
> org.apache.drill.exec.vector.NullableTinyIntVector.copyFromSafe(int, int, 
> org.apache.drill.exec.vector.TinyIntVector)" [ 
> fbf47be8-b5fe-4d56-9488-15d45d4224e4 on qa-node117.qa.lab:31010 ]
> [ fbf47be8-b5fe-4d56-9488-15d45d4224e4 on qa-node117.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> stack trace from drillbit.log
> {code}
> 2015-02-04 13:37:22,117 [2b2d6eee-105b-5544-9111-83a3a356285d:frag:2:6] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
> fragment
> org.apache.drill.common.exceptions.DrillRuntimeException: 
> org.apache.drill.exec.exception.SchemaChangeException: 
> org.apache.drill.exec.exception.ClassTransformationException: 
> java.util.concurrent.ExecutionException: 
> org.apache.drill.exec.exception.ClassTransformationException: Failure 
> generating transformation classes for value:
> package org.apache.drill.exec.test.generated;
> import org.apache.drill.exec.exception.SchemaChangeException;
> import org.apache.drill.exec.ops.FragmentContext;
> import org.apache.drill.exec.record.RecordBatch;
> import org.apache.drill.exec.record.VectorContainer;
> import org.apache.drill.exec.vector.NullableBigIntVector;
> import org.apache.drill.exec.vector.NullableFloat8Vector;
> import org.apache.drill.exec.vector.NullableTinyIntVector;
> import org.apache.drill.exec.vector.complex.MapVector;
> import org.apache.drill.exec.vector.complex.RepeatedMapVector;
> {code}
> from the foreman drillbit.log
> {code}
> 2015-02-04 13:37:22,189 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in pipeline.  Closing channel 
> between local /10.10.100.117:31012 and remote /10.10.100.120:56250
> io.netty.handler.codec.DecoderException: java.lang.NullPointerException
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:99)
>  [netty-codec-4.0.24.Final.jar:4.0.24.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-transport-4.0.24.Final.jar:4.0.24.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>  [netty-transport-4.0.24.Final.jar:4.0.24.Final]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>  [netty-codec-4.0.24.Final.jar:4.0.24.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-transport-4.0.24.Final.jar:4.0.24.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>  [netty-transport-4.0.24.Final.jar:4.0.24.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:161)
>  [netty-codec-4.0.24.Final.jar:4.0.24.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-transport-4.0.24.Final.jar:4.0.24.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>  [netty-transport-4.0.24.Final.jar:4.0.24.Final]
>   at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.24.Final.jar:4.0.24.Final]
>   at 
> i

[jira] [Updated] (DRILL-2074) Queries fail with OutOfMemory Exception when Hash Join & Agg are turned off

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2074:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Queries fail with OutOfMemory Exception when Hash Join & Agg are turned off
> ---
>
> Key: DRILL-2074
> URL: https://issues.apache.org/jira/browse/DRILL-2074
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Abhishek Girish
>Assignee: Chris Westin
> Fix For: 1.0.0
>
> Attachments: 05_par1000.q, 05_par1000_d6e54ab.logical.plan, 
> 05_par1000_d6e54ab.verbose.plan, drill-env.sh
>
>
> Query attached. 
> Hash Join and Hash Agg were turned off, and the following property was added 
> to drill-override.conf:
> sort: {
>   purge.threshold : 100,
>   external: {
>     batch.size : 4000,
>     spill: {
>       batch.size : 4000,
>       group.size : 100,
>       threshold : 200,
>       directories : [ "/drill_spill" ],
>       fs : "maprfs:///"
>     }
>   }
> }
> Query failed with the below error message:
> Query failed: RemoteRpcException: Failure while running fragment., Unable to 
> allocate sv2 buffer after repeated attempts [ 
> faf3044a-e14a-427b-b66d-7bcd7522ead5 on drone-42:31010 ]
> [ faf3044a-e14a-427b-b66d-7bcd7522ead5 on drone-42:31010 ]
> Log Snippets:
> 2015-01-26 20:07:33,239 atsqa8c42.qa.lab 
> [2b396307-2c1e-3486-90bc-fbaf09fbeb3e:frag:15:51] ERROR 
> o.a.d.e.w.f.AbstractStatusReporter - Error 
> faf3044a-e14a-427b-b66d-7bcd7522ead5: Failure while running fragment.
> java.lang.RuntimeException: 
> org.apache.drill.exec.memory.OutOfMemoryException: Unable to allocate sv2 
> buffer after repeated attempts
> at 
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:309)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:96)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.join.JoinStatus.nextRight(JoinStatus.java:80)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.join.JoinStatus.ensureInitial(JoinStatus.java:95)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.join.MergeJoinBatch.innerNext(MergeJoinBatch.java:147)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:96)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>

[jira] [Updated] (DRILL-2050) remote rpc exception

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2050:

Fix Version/s: (was: 0.9.0)
   1.0.0

> remote rpc exception
> 
>
> Key: DRILL-2050
> URL: https://issues.apache.org/jira/browse/DRILL-2050
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC
>Affects Versions: 0.8.0
>Reporter: Chun Chang
>Assignee: Chris Westin
> Fix For: 1.0.0
>
>
> git.branch=8d1e1affe86a5adca3bc17eeaf7520f0d379a393
> git.commit.time=20.01.2015 @ 23\:02\:03 PST
> The following tpcds-impala-sf1 automation query caused RemoteRpcException. 
> This query works most of the time, but fails on average once in every four or 
> five tries. Test data can be downloaded from
> http://apache-drill.s3.amazonaws.com/files/tpcds-sf1-parquet.tgz
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDir> select
> . . . . . . . . . . . . . . . . . . .>   *
> . . . . . . . . . . . . . . . . . . .> from
> . . . . . . . . . . . . . . . . . . .>   (select
> . . . . . . . . . . . . . . . . . . .> i.i_manufact_id as imid,
> . . . . . . . . . . . . . . . . . . .> sum(ss.ss_sales_price) sum_sales
> . . . . . . . . . . . . . . . . . . .> -- avg(sum(ss.ss_sales_price)) 
> over (partition by i.i_manufact_id) avg_quarterly_sales
> . . . . . . . . . . . . . . . . . . .>   from
> . . . . . . . . . . . . . . . . . . .> item as i,
> . . . . . . . . . . . . . . . . . . .> store_sales as ss,
> . . . . . . . . . . . . . . . . . . .> date_dim as d,
> . . . . . . . . . . . . . . . . . . .> store as s
> . . . . . . . . . . . . . . . . . . .>   where
> . . . . . . . . . . . . . . . . . . .> ss.ss_item_sk = i.i_item_sk
> . . . . . . . . . . . . . . . . . . .> and ss.ss_sold_date_sk = 
> d.d_date_sk
> . . . . . . . . . . . . . . . . . . .> and ss.ss_store_sk = s.s_store_sk
> . . . . . . . . . . . . . . . . . . .> and d.d_month_seq in (1212, 1212 + 
> 1, 1212 + 2, 1212 + 3, 1212 + 4, 1212 + 5, 1212 + 6, 1212 + 7, 1212 + 8, 1212 
> + 9, 1212 + 10, 1212 + 11)
> . . . . . . . . . . . . . . . . . . .> and ((i.i_category in ('Books', 
> 'Children', 'Electronics')
> . . . . . . . . . . . . . . . . . . .>   and i.i_class in ('personal', 
> 'portable', 'reference', 'self-help')
> . . . . . . . . . . . . . . . . . . .>   and i.i_brand in 
> ('scholaramalgamalg #14', 'scholaramalgamalg #7', 'exportiunivamalg #9', 
> 'scholaramalgamalg #9'))
> . . . . . . . . . . . . . . . . . . .> or (i.i_category in ('Women', 
> 'Music', 'Men')
> . . . . . . . . . . . . . . . . . . .>   and i.i_class in ('accessories', 
> 'classical', 'fragrances', 'pants')
> . . . . . . . . . . . . . . . . . . .>   and i.i_brand in ('amalgimporto 
> #1', 'edu packscholar #1', 'exportiimporto #1', 'importoamalg #1')))
> . . . . . . . . . . . . . . . . . . .> and ss.ss_sold_date_sk between 
> 2451911 and 2452275 -- partition key filter
> . . . . . . . . . . . . . . . . . . .>   group by
> . . . . . . . . . . . . . . . . . . .> i.i_manufact_id,
> . . . . . . . . . . . . . . . . . . .> d.d_qoy
> . . . . . . . . . . . . . . . . . . .>   ) tmp1
> . . . . . . . . . . . . . . . . . . .> -- where
> . . . . . . . . . . . . . . . . . . .> --   case when avg_quarterly_sales > 0 
> then abs (sum_sales - avg_quarterly_sales) / avg_quarterly_sales else null 
> end > 0.1
> . . . . . . . . . . . . . . . . . . .> order by
> . . . . . . . . . . . . . . . . . . .>   -- avg_quarterly_sales,
> . . . . . . . . . . . . . . . . . . .>   sum_sales,
> . . . . . . . . . . . . . . . . . . .>   tmp1.imid
> . . . . . . . . . . . . . . . . . . .> limit 100;
> +++
> |imid| sum_sales  |
> +++
> Query failed: RemoteRpcException: Failure while running fragment., Attempted 
> to close accountor with 1 buffer(s) still allocated for QueryId: 
> 2b401913-4292-26d4-18b0-f105afe06121, MajorFragmentId: 2, MinorFragmentId: 0.
>   Total 1 allocation(s) of byte size(s): 771, at stack location:
>   
> org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.takeOwnership(TopLevelAllocator.java:197)
>   
> org.apache.drill.exec.rpc.data.DataServer.handle(DataServer.java:119)
>   
> org.apache.drill.exec.rpc.data.DataServer.handle(DataServer.java:48)
>   
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:194)
>   
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:173)
>   
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>   
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>   
> io.netty.channel.Abs

[jira] [Updated] (DRILL-2140) RPC Error querying JSON with empty nested maps

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2140:

Fix Version/s: (was: 0.9.0)
   1.0.0

> RPC Error querying JSON with empty nested maps
> --
>
> Key: DRILL-2140
> URL: https://issues.apache.org/jira/browse/DRILL-2140
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.7.0
> Environment: Centos 4 node MapR cluster
>Reporter: Andries Engelbrecht
>Assignee: Chris Westin
> Fix For: 1.0.0
>
> Attachments: drillbit.log
>
>
> When querying a large number of documents in multiple directories, with 
> multiple JSON files in each, where some documents lack the top-level map used 
> in a predicate, Drill produces an RPC error in the log.
> Query
> select t.retweeted_status.`user`.name as name, 
> count(t.retweeted_status.favorited) as rt_count from `./nfl` t where 
> t.retweeted_status.`user`.name is not null group by 
> t.retweeted_status.`user`.name order by count(t.retweeted_status.favorited) 
> desc limit 10;
> Screen Error
> Query failed: Query failed: Failure while running fragment., index: 0, 
> length: 1 (expected: range(0, 0)) [ b96e3bfa-74c9-4b78-886b-9a2c3fc4ea9b on 
> se-node13.se.lab:31010 ]
> [ b96e3bfa-74c9-4b78-886b-9a2c3fc4ea9b on se-node13.se.lab:31010 ]
> Drillbit log attached





[jira] [Updated] (DRILL-2274) Unable to allocate sv2 buffer after repeated attempts : JOIN, Order by used in query

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2274:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Unable to allocate sv2 buffer after repeated attempts : JOIN, Order by used 
> in query
> 
>
> Key: DRILL-2274
> URL: https://issues.apache.org/jira/browse/DRILL-2274
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Chris Westin
> Fix For: 1.0.0
>
> Attachments: data.json
>
>
> git.commit.id.abbrev=6676f2d
> The below query fails :
> {code}
> select sub1.uid from `data.json` sub1 inner join `data.json` sub2 on sub1.uid 
> = sub2.uid order by sub1.uid;
> {code}
> Error from the logs :
> {code}
> 2015-02-20 00:24:08,431 [2b1981b0-149e-981b-f83f-512c587321d7:frag:1:2] ERROR 
> o.a.d.e.w.f.AbstractStatusReporter - Error 
> 66dba4ff-644c-4400-ab84-203256dc2600: Failure while running fragment.
>  java.lang.RuntimeException: 
> org.apache.drill.exec.memory.OutOfMemoryException: Unable to allocate sv2 
> buffer after repeated attempts
>   at 
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:307)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:96)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:97)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:116)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
>  Caused by: org.apache.drill.exec.memory.OutOfMemoryException: Unable to 
> allocate sv2 buffer after repeated attempts
>   at 
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:516)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:305)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   ... 16 common frames omitted
> {code}
> On a different drillbit in the cluster, I found the below message for the 
> same run
> {code}
> 2015-02-20 00:24:08,435 [BitServer-6] WARN  
> o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there 
> was no registered listener for that message: profile {
>state: FAILED
>error {
>  error_id: "66dba4ff-644c-4400-ab84-203256dc2600"
>  endpoint {
>address: "qa-no

[jira] [Updated] (DRILL-2278) Join on a large data set results in an NPE

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2278:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Join on a large data set results in an NPE
> --
>
> Key: DRILL-2278
> URL: https://issues.apache.org/jira/browse/DRILL-2278
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Chris Westin
> Fix For: 1.0.0
>
> Attachments: data.json
>
>
> git.commit.id.abbrev=6676f2d
> I attached the data set, which only contains 2 records. The below query works 
> fine on this data. However, if we just copy over the same 2 records 5 
> times, the same query fails with an NPE.
> {code}
> select * from `data.json` t1 inner join `data.json` t2 on t1.uid=t2.uid;
> Query failed: RemoteRpcException: Failure while running fragment.[ 
> 1cec8b9a-5c04-44d8-b1a4-bcf3a0fadcbc on qa-node190.qa.lab:31010 ]
> [ 1cec8b9a-5c04-44d8-b1a4-bcf3a0fadcbc on qa-node190.qa.lab:31010 ]
> {code}
> Projecting a specific column works as expected. 





[jira] [Updated] (DRILL-2281) Drill never returns when we use aggregate functions after a join with an order by

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2281:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Drill never returns when we use aggregate functions after a join with an 
> order by
> -
>
> Key: DRILL-2281
> URL: https://issues.apache.org/jira/browse/DRILL-2281
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Chris Westin
> Fix For: 1.0.0
>
> Attachments: data.json
>
>
> git.commit.id.abbrev=6676f2d
> The below query never returns (the order by seems to be the culprit):
> {code}
> create view v1 as select uid, flatten(events) event from `data.json`;
> create view v2 as select uid, flatten(transactions) transaction from 
> `data.json`;
> select v1.uid, MAX(v2.transaction.amount), MIN(v1.event.event_time) from v1 
> inner join v2 on v1.uid = v2.uid where v2.transaction.trans_time < 0 group by 
> v1.uid order by v1.uid;
> {code}
> There seems to be constant activity in the drillbit.log file. The below 
> message is continuously displayed in the log file
> {code}
> 2015-02-20 23:35:04,450 [2b183b65-4551-bb9a-35ca-b71b9eedc4d6:frag:1:2] INFO  
> o.a.d.exec.vector.BaseValueVector - Realloc vector null. [32768] -> [65536]
> 2015-02-20 23:35:04,451 [2b183b65-4551-bb9a-35ca-b71b9eedc4d6:frag:1:2] INFO  
> o.a.d.exec.vector.BaseValueVector - Realloc vector null. [32768] -> [65536]
> 2015-02-20 23:35:04,451 [2b183b65-4551-bb9a-35ca-b71b9eedc4d6:frag:1:2] INFO  
> o.a.d.exec.vector.BaseValueVector - Realloc vector null. [32768] -> [65536]
> {code}
> Drill returns correct data when we remove one of the agg functions or use 
> multiple aggs from the same side of the join. The below queries work:
> {code}
>  select v1.uid, MAX(v2.transaction.amount) from v1 inner join v2 on v1.uid = 
> v2.uid where v2.transaction.trans_time < 0 group by v1.uid order by v1.uid;
>  select v1.uid, MAX(v2.transaction.amount), MAX(v2.transaction.amount) from 
> v1 inner join v2 on v1.uid = v2.uid where v2.transaction.trans_time < 0 group 
> by v1.uid order by v1.uid;
> {code}
> Attached the dataset which contains 2 records. I copied over the same 2 
> records 5 times and ran the queries on the data set. Let me know if you 
> need anything else.





[jira] [Updated] (DRILL-2643) Exception during xsort.ExternalSortBatch.cleanup (possible memory leak ?)

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2643:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Exception during xsort.ExternalSortBatch.cleanup (possible memory leak ?)
> -
>
> Key: DRILL-2643
> URL: https://issues.apache.org/jira/browse/DRILL-2643
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Assignee: Chris Westin
> Fix For: 1.0.0
>
> Attachments: t1.parquet, t2.parquet
>
>
> In this case j1 and j2 are views created on top of parquet files; BOTH views 
> have an order by on multiple columns, in different order, with nulls first/last.
> Also, the table in view j1 consists of 99 parquet files. See the attached 
> views.txt file for how to create the views (make sure to create the views in a 
> different workspace; the views have the same names as the tables).
> {code}
> select DISTINCT
> COALESCE(j1.c_varchar || j2.c_varchar || 'EMPTY') as 
> concatentated_string
> from
> j1  INNER JOIN j2 ON
> (j1.d18 = j2.d18)
> ;
> {code}
> The same can be reproduced with parquet files and subqueries:
> (note that the parquet files are named the same as the views: j1, j2)
> {code}
> select DISTINCT
> COALESCE(sq1.c_varchar || sq2.c_varchar || 'EMPTY') as 
> concatentated_string
> from
> (select c_varchar, c_integer from j1 order by j1.c_varchar desc nulls 
> first ) as sq1(c_varchar, c_integer)
> INNER JOIN
> (select c_varchar, c_integer from j2 order by j2.c_varchar nulls 
> last) as sq2(c_varchar, c_integer)
> ON (sq1.c_integer = sq2.c_integer)
> {code}
> You do need a sort in order to reproduce the problem.
> This query works:
> {code}
> select DISTINCT
> COALESCE(j1.c_varchar || j2.c_varchar || 'EMPTY') as 
> concatentated_string
> from j1,j2
> where j1.c_integer = j2.c_integer;
> {code}
> {code}
> 2015-04-01 00:43:42,455 [2ae4c0c0-c408-3e66-4fb3-e7bf80a42bad:foreman] INFO  
> o.a.d.e.s.parquet.ParquetGroupScan - Load Parquet RowGroup block maps: 
> Executed 99 out of 99 using 16 threads. Time: 20ms total, 2.877318ms avg, 3ms 
> max.
> 2015-04-01 00:43:42,458 [2ae4c0c0-c408-3e66-4fb3-e7bf80a42bad:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-136.qa.lab.  Skipping affinity to that host.
> 2015-04-01 00:43:42,458 [2ae4c0c0-c408-3e66-4fb3-e7bf80a42bad:foreman] INFO  
> o.a.d.e.s.parquet.ParquetGroupScan - Load Parquet RowGroup block maps: 
> Executed 1 out of 1 using 1 threads. Time: 1ms total, 1.562620ms avg, 1ms max.
> 2015-04-01 00:43:42,485 [2ae4c0c0-c408-3e66-4fb3-e7bf80a42bad:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
> RUNNING
> 2015-04-01 00:43:45,613 [2ae4c0c0-c408-3e66-4fb3-e7bf80a42bad:frag:0:0] WARN  
> o.a.d.e.p.i.xsort.ExternalSortBatch - Starting to merge. 32 batch groups. 
> Current allocated memory: 16642330
> 2015-04-01 00:43:45,676 [2ae4c0c0-c408-3e66-4fb3-e7bf80a42bad:frag:0:0] INFO  
> o.a.d.exec.vector.BaseValueVector - Realloc vector null. [16384] -> [32768]
> 2015-04-01 00:43:45,676 [2ae4c0c0-c408-3e66-4fb3-e7bf80a42bad:frag:0:0] INFO  
> o.a.d.exec.vector.BaseValueVector - Realloc vector 
> ``c_varchar`(VARCHAR:OPTIONAL)_bits`(UINT1:REQUIRED). [4096] -> [8192]
> 2015-04-01 00:43:45,679 [2ae4c0c0-c408-3e66-4fb3-e7bf80a42bad:frag:0:0] INFO  
> o.a.d.exec.vector.BaseValueVector - Realloc vector null. [32768] -> [65536]
> 2015-04-01 00:43:45,680 [2ae4c0c0-c408-3e66-4fb3-e7bf80a42bad:frag:0:0] INFO  
> o.a.d.exec.vector.BaseValueVector - Realloc vector 
> ``c_varchar`(VARCHAR:OPTIONAL)_bits`(UINT1:REQUIRED). [8192] -> [16384]
> 2015-04-01 00:43:45,709 [2ae4c0c0-c408-3e66-4fb3-e7bf80a42bad:frag:0:0] WARN  
> o.a.d.exec.memory.AtomicRemainder - Tried to close remainder, but it has 
> already been closed
> java.lang.Exception: null
> at 
> org.apache.drill.exec.memory.AtomicRemainder.close(AtomicRemainder.java:196) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at org.apache.drill.exec.memory.Accountor.close(Accountor.java:386) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.close(TopLevelAllocator.java:298)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.cleanup(ExternalSortBatch.java:162)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.cleanup(IteratorValidatorBatchIterator.java:148)
>  [dri

[jira] [Updated] (DRILL-2421) ensure all allocators for a query are descendants of a single root

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2421:

Fix Version/s: (was: 0.9.0)
   1.0.0

> ensure all allocators for a query are descendants of a single root
> --
>
> Key: DRILL-2421
> URL: https://issues.apache.org/jira/browse/DRILL-2421
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Chris Westin
>Assignee: Chris Westin
> Fix For: 1.0.0
>
>
> In order to help improve usage tracking, allocations for a single query 
> should all roll up to a single root.
> This requires that the Foreman create that root, and label it, and then pass 
> that along to anyone else that needs to create additional sub-allocators. The 
> patch for DRILL-2406 introduces the creation of a new allocator in 
> QueryContext, but this is currently a child of the Drillbit's 
> TopLevelAllocator, violating the principle above. This is a reminder to fix 
> that after the dependencies above are available.
> As well as the known case in QueryContext, check to make sure other locations 
> aren't creating new children from the DrillbitContext, but are instead using 
> the allocator from the FragmentContext.
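The invariant the ticket asks for can be sketched as a toy allocator tree. This is a hypothetical stand-in, not Drill's actual TopLevelAllocator API: the names `Allocator`, `newChild`, and `totalAllocated` are invented for illustration. The point is that when every per-fragment and per-context allocator descends from one per-query root, the whole query's usage rolls up in a single place.

```java
import java.util.ArrayList;
import java.util.List;

public class AllocatorTreeSketch {
    static class Allocator {
        final String name;
        final Allocator parent;
        final List<Allocator> children = new ArrayList<>();
        long allocated;

        Allocator(String name, Allocator parent) {
            this.name = name;
            this.parent = parent;
            if (parent != null) {
                parent.children.add(this);  // every child is reachable from its root
            }
        }

        Allocator newChild(String childName) {
            return new Allocator(childName, this);
        }

        void allocate(long bytes) {
            allocated += bytes;
        }

        // Usage for this subtree; called on the query root, this is the
        // whole query's footprint - the roll-up the ticket wants.
        long totalAllocated() {
            long total = allocated;
            for (Allocator child : children) {
                total += child.totalAllocated();
            }
            return total;
        }
    }

    static long demo() {
        Allocator queryRoot = new Allocator("query-root", null);
        Allocator queryContext = queryRoot.newChild("query-context");
        Allocator fragment = queryRoot.newChild("fragment-1");
        queryContext.allocate(1024);
        fragment.allocate(4096);
        return queryRoot.totalAllocated();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 5120
    }
}
```

The bug described above is exactly a violation of this shape: an allocator parented to the Drillbit-wide root instead of the query's root, so its usage never shows up in the query's total.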



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2699) Collect all cleanup errors before reporting a failure to the client

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2699:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Collect all cleanup errors before reporting a failure to the client
> ---
>
> Key: DRILL-2699
> URL: https://issues.apache.org/jira/browse/DRILL-2699
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Deneche A. Hakim
>Assignee: Chris Westin
> Fix For: 1.0.0
>
>
> If a query fails, the fragments and foreman should make sure to collect all 
> failures and report them back to the client. Some known places where this 
> isn't respected:
> - If a fragment fails, it will report the failure to the foreman before 
> cleaning up. Any failure that happens in the cleanup process will be dropped 
> by the foreman.
> - If multiple fragments fail, the Foreman will only report to the user the 
> first failure it received and close immediately. All other failures will be 
> dropped.
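One way to satisfy this, sketched here with assumed names rather than Drill's actual classes, is to keep the first failure and attach every later one as a suppressed exception, so nothing is dropped before reporting:

```java
// Sketch: run all cleanups, collect every failure, report them together.
import java.util.Arrays;
import java.util.List;

public class CleanupCollector {
    public static Exception collect(List<Runnable> cleanups) {
        Exception first = null;
        for (Runnable cleanup : cleanups) {
            try {
                cleanup.run();
            } catch (Exception e) {
                if (first == null) first = e;   // keep the first failure
                else first.addSuppressed(e);    // attach the rest, don't drop
            }
        }
        return first;  // null if every cleanup succeeded
    }

    public static int countFailures() {
        List<Runnable> cleanups = Arrays.asList(
            () -> { throw new RuntimeException("fragment 1 cleanup failed"); },
            () -> { },                          // this cleanup succeeds
            () -> { throw new RuntimeException("fragment 2 cleanup failed"); });
        Exception reported = collect(cleanups);
        return 1 + reported.getSuppressed().length;  // both failures survive
    }

    public static void main(String[] args) {
        System.out.println(countFailures());
    }
}
```

The same pattern applies to the Foreman: rather than closing on the first fragment failure, it would accumulate the rest before replying to the client.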





[jira] [Updated] (DRILL-2617) Errors in the execution stack will cause DeferredException to throw an IllegalStateException

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2617:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Errors in the execution stack will cause DeferredException to throw an 
> IllegalStateException
> 
>
> Key: DRILL-2617
> URL: https://issues.apache.org/jira/browse/DRILL-2617
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Deneche A. Hakim
>Assignee: Chris Westin
> Fix For: 1.0.0
>
>
> When a query fails while executing, the following events happen:
> - the exception is added to {{FragmentContext.deferredException}}
> - the {{FragmentExecutor}} reports the failure to the client through the 
> {{Foreman}}
> - the {{FragmentExecutor}} closes the {{DeferredException}}
> - {{DeferredException.close()}} throws back the original exception
> {{FragmentExecutor.run()}} catches the exception and tries to add it to the 
> {{DeferredException}}
> - {{DeferredException.addException()}} throws an {{IllegalStateException}} 
> because it's already closed.
> You can reproduce this by querying the following json file, which contains an 
> extra ":"
> {code}
> { "a1": 0 , "b1": "a"}
> { "a1": 1 , "b1": "b"}
> { "a1": 2 , "b1": "c"}
> { "a1":: 3 , "b1": "c"}
> {code}
> Sqlline will display both the error message sent by the Foreman and the 
> IllegalStateException:
> {noformat}
> 0: jdbc:drill:zk=local> select * from `t.json`;
> Query failed: Query stopped., Unexpected character (':' (code 58)): expected 
> a valid value (number, String, array, object, 'true', 'false' or 'null')
>  at [Source: org.apache.drill.exec.vector.complex.fn.JsonReader@161188d3; 
> line: 3, column: 9] [ b55f7d53-0e88-456f-bb12-160cacae9222 on 
> administorsmbp2.attlocal.net:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> 0: jdbc:drill:zk=local> Exception in thread "WorkManager-2" 
> java.lang.IllegalStateException
>   at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:133)
>   at 
> org.apache.drill.common.DeferredException.addException(DeferredException.java:47)
>   at 
> org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:61)
>   at 
> org.apache.drill.exec.ops.FragmentContext.fail(FragmentContext.java:135)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:181)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
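The close-then-add sequence above can be reproduced with a minimal stand-in class (not Drill's actual DeferredException, whose details may differ):

```java
// Sketch of a deferred-exception collector that rejects additions after
// close(), mirroring the failure sequence described in this issue.
import java.util.ArrayList;
import java.util.List;

public class DeferredExceptionSketch {
    private final List<Throwable> causes = new ArrayList<>();
    private boolean closed = false;

    public synchronized void addThrowable(Throwable t) {
        if (closed) {
            // The check that fires when FragmentExecutor.run() tries to
            // record a failure after close() has already thrown.
            throw new IllegalStateException("already closed");
        }
        causes.add(t);
    }

    public synchronized void close() throws Exception {
        closed = true;
        if (!causes.isEmpty()) {
            throw new Exception(causes.get(0));  // rethrow original failure
        }
    }

    public static boolean reproduces() {
        DeferredExceptionSketch de = new DeferredExceptionSketch();
        try {
            de.addThrowable(new RuntimeException("query failed"));
            de.close();                        // throws the original failure
        } catch (Exception original) {
            try {
                de.addThrowable(original);     // second add, after close
            } catch (IllegalStateException ise) {
                return true;                   // the bug described above
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(reproduces());
    }
}
```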





[jira] [Updated] (DRILL-2750) Running 1 or more queries against Drillbits having insufficient DirectMem renders the Drillbits in an unusable state

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2750:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Running 1 or more queries against Drillbits having insufficient DirectMem 
> renders the Drillbits in an unusable state
> 
>
> Key: DRILL-2750
> URL: https://issues.apache.org/jira/browse/DRILL-2750
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.9.0
> Environment: RHEL 6.4
>Reporter: Kunal Khatua
>Assignee: Chris Westin
>Priority: Critical
> Fix For: 1.0.0
>
>
> When running queries against a Drill cluster with limited DirectMem; if one 
> or more queries fail due to insufficient memory, then even queries that 
> should easily run within the allocated memory fail.
> The initial failure when queries with large memory requirements fail: 
> 2015-04-10 09:57:55 [pip0] ERROR PipSQuawkling fetchRows - [ 1 / 16_par1000 ] 
> Failure while executing query.
> java.sql.SQLException: Failure while executing query.
> at org.apache.drill.jdbc.DrillCursor.next(DrillCursor.java:144)
> at 
> net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
> at org.apache.drill.jdbc.DrillResultSet.next(DrillResultSet.java:85)
> at PipSQuawkling.fetchRows(PipSQuawkling.java:319)
> at PipSQuawkling.executeTest(PipSQuawkling.java:154)
> at PipSQuawkling.run(PipSQuawkling.java:76)
> Caused by: org.apache.drill.exec.rpc.RpcException: RemoteRpcException: 
> Failure while running fragment.[ e8c657a7-93a9-415a-8641-a4fbd4836a65 on 
> ucs-node5.perf.lab:31010 ]
> [ e8c657a7-93a9-415a-8641-a4fbd4836a65 on ucs-node5.perf.lab:31010 ]
> at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:111)
> at 
> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:100)
> at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:52)
> at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:34)
> at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:57)
> at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:194)
> at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:173)
> at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:161)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
> at java.lang.Thread.run(Thread.java:744)
> After that, subsequent queries that should run fail with the following:
> 2015-04-10 09:59:29 [pip0] ERROR PipSQuawkling executeQuery - [ 2 / 
> rerun_06_par1000 ] exception

[jira] [Updated] (DRILL-2418) Memory leak during execution if comparison function is not found

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2418:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Memory leak during execution if comparison function is not found
> 
>
> Key: DRILL-2418
> URL: https://issues.apache.org/jira/browse/DRILL-2418
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Assignee: Chris Westin
> Fix For: 1.0.0
>
>
> While testing implicit cast during join, I ran into an issue where if you run 
> a query that throws an exception during execution, eventually, if you run 
> enough of those, drill will run out of memory.
> Here is a query example:
> {code}
> select count(*) from cast_tbl_1 a, cast_tbl_2 b where a.c_float = b.c_time
>  failed: RemoteRpcException: Failure while running fragment., Failure finding 
> function that runtime code generation expected.  Signature: 
> compare_to_nulls_high( TIME:OPTIONAL, FLOAT4:OPTIONAL ) returns INT:REQUIRED 
> [ 633c8ce3-1ed2-4a0a-8248-1e3d5b4f7c0a on atsqa4-133.qa.lab:31010 ]
> [ 633c8ce3-1ed2-4a0a-8248-1e3d5b4f7c0a on atsqa4-133.qa.lab:31010 ]
> Test_Failed: 2015/03/10 18:34:15.0015 - Failed to execute.
> {code}
> If you set planner.slice_target to 1, you hit out of memory after ~40 or so 
> such failures on my cluster.
> {code}
> select count(*) from cast_tbl_1 a, cast_tbl_2 b where a.d38 = b.c_double
> Query failed: OutOfMemoryException: You attempted to create a new child 
> allocator with initial reservation 300 but only 916199 bytes of memory 
> were available.
> {code}
> From the drillbit.log
> {code}
> 2015-03-10 18:34:34,588 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.store.parquet.FooterGatherer - Fetch Parquet Footers: Executed 1 out 
> of 1 using 1 threads. Time: 1ms total, 1.190007ms avg, 1ms max.
> 2015-03-10 18:34:34,591 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.store.parquet.FooterGatherer - Fetch Parquet Footers: Executed 1 out 
> of 1 using 1 threads. Time: 0ms total, 0.953679ms avg, 0ms max.
> 2015-03-10 18:34:34,627 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-136.qa.lab.  Skipping affinity to that host.
> 2015-03-10 18:34:34,627 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.s.parquet.ParquetGroupScan - Load Parquet RowGroup block maps: 
> Executed 1 out of 1 using 1 threads. Time: 1ms total, 1.609586ms avg, 1ms max.
> 2015-03-10 18:34:34,629 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-136.qa.lab.  Skipping affinity to that host.
> 2015-03-10 18:34:34,629 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.s.parquet.ParquetGroupScan - Load Parquet RowGroup block maps: 
> Executed 1 out of 1 using 1 threads. Time: 1ms total, 1.270340ms avg, 1ms max.
> 2015-03-10 18:34:34,684 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
> FAILED
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization: Failure while getting memory allocator for 
> fragment.
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:195) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: org.apache.drill.common.exceptions.ExecutionSetupException: 
> Failure while getting memory allocator for fragment.
> at 
> org.apache.drill.exec.ops.FragmentContext.<init>(FragmentContext.java:119) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.setupRootFragment(Foreman.java:535)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:307) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:511) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman

[jira] [Updated] (DRILL-2219) Concurrent modification exception in TopLevelAllocator if a child allocator is added during loop in close()

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2219:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Concurrent modification exception in TopLevelAllocator if a child allocator 
> is added during loop in close()
> ---
>
> Key: DRILL-2219
> URL: https://issues.apache.org/jira/browse/DRILL-2219
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Jason Altekruse
>Assignee: Chris Westin
>Priority: Critical
> Fix For: 1.0.0
>
> Attachments: DRILL-2219-v2.patch, DRILL-2219-v3.patch, 
> DRILL-2219-v4.patch, DRILL-2219.patch
>
>
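The hazard and one common fix can be sketched generically (names are illustrative; TopLevelAllocator's actual internals may differ):

```java
// Sketch: mutating a child list while close() iterates it throws
// ConcurrentModificationException; iterating a snapshot copy avoids it.
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class ChildCloseSketch {
    private final List<String> children = new ArrayList<>();

    public static boolean naiveCloseThrows() {
        ChildCloseSketch alloc = new ChildCloseSketch();
        alloc.children.add("child-1");
        try {
            for (String child : alloc.children) {
                alloc.children.add("late-child");  // child added mid-loop
            }
        } catch (ConcurrentModificationException e) {
            return true;  // fail-fast iterator detects the mutation
        }
        return false;
    }

    public static int safeClose() {
        ChildCloseSketch alloc = new ChildCloseSketch();
        alloc.children.add("child-1");
        // Iterate a snapshot so a concurrent registration can't break close().
        for (String child : new ArrayList<>(alloc.children)) {
            alloc.children.add("late-child");
        }
        return alloc.children.size();
    }

    public static void main(String[] args) {
        System.out.println(naiveCloseThrows() + " " + safeClose());
    }
}
```

Synchronizing registration and close on the same lock, or using a concurrent collection, are the other usual remedies.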






[jira] [Commented] (DRILL-2370) Missing cancellation acknowledgements leave orphaned cancelled queries around

2015-04-15 Thread Chris Westin (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496879#comment-14496879
 ] 

Chris Westin commented on DRILL-2370:
-

[~jnadeau], it seems like your changes with the new NodeTracker handle dead 
nodes and take care of this. Do you agree?

> Missing cancellation acknowledgements leave orphaned cancelled queries around
> -
>
> Key: DRILL-2370
> URL: https://issues.apache.org/jira/browse/DRILL-2370
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.7.0
>Reporter: Chris Westin
>Assignee: Jacques Nadeau
> Fix For: 0.9.0
>
>
> When a query is cancelled (say by Foreman.cancel()), Foreman moves to the 
> CANCELLED state, which indicates that cancellation is in progress. 
> Cancellation requests are sent to remote drill fragments. These should be 
> replied to via QueryManager.statusUpdate() (inherited from 
> FragmentStatusListener), which in turn updates Foreman. However, if any of 
> the cancellation requests are not acknowledged (due to drillbit failure, RPC 
> timeout, etc), then the being-cancelled query will stay that way 
> indefinitely. We need to have a way to find such queries and force them to be 
> cleaned up. This may require substituting stub listeners for some of the 
> objects involved in case cancellation acknowledgements still arrive even 
> later still -- these need to be safely invokable by the RPC layer even though 
> the query they referred to is gone.





[jira] [Updated] (DRILL-2370) Missing cancellation acknowledgements leave orphaned cancelled queries around

2015-04-15 Thread Chris Westin (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2370:

Assignee: Jacques Nadeau  (was: Chris Westin)

> Missing cancellation acknowledgements leave orphaned cancelled queries around
> -
>
> Key: DRILL-2370
> URL: https://issues.apache.org/jira/browse/DRILL-2370
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.7.0
>Reporter: Chris Westin
>Assignee: Jacques Nadeau
> Fix For: 0.9.0
>
>
> When a query is cancelled (say by Foreman.cancel()), Foreman moves to the 
> CANCELLED state, which indicates that cancellation is in progress. 
> Cancellation requests are sent to remote drill fragments. These should be 
> replied to via QueryManager.statusUpdate() (inherited from 
> FragmentStatusListener), which in turn updates Foreman. However, if any of 
> the cancellation requests are not acknowledged (due to drillbit failure, RPC 
> timeout, etc), then the being-cancelled query will stay that way 
> indefinitely. We need to have a way to find such queries and force them to be 
> cleaned up. This may require substituting stub listeners for some of the 
> objects involved in case cancellation acknowledgements still arrive even 
> later still -- these need to be safely invokable by the RPC layer even though 
> the query they referred to is gone.





[jira] [Updated] (DRILL-996) Build - allow other versions of Hadoop to be specified on command-line

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-996:
--
Fix Version/s: (was: 0.9.0)
   1.0.0

> Build - allow other versions of Hadoop to be specified on command-line
> --
>
> Key: DRILL-996
> URL: https://issues.apache.org/jira/browse/DRILL-996
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Reporter: Patrick Wong
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
> Attachments: DRILL-996.1.patch.txt, DRILL-996.2.patch.txt
>
>
> Right now, in order to change the version of Hadoop that Drill is built 
> against, you have to edit the POM. However, for automated build systems that 
> build Drill against a variety of possible Hadoop versions, it would be much 
> cleaner to be able to specify it on the command-line.
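The requested workflow would look something like the following; the property name `hadoop.version` is an assumption for illustration and may not match the actual POM:

```
# hypothetical invocation; the real property name depends on the POM
mvn clean install -DskipTests -Dhadoop.version=2.4.1
```

This requires the POM to reference the Hadoop dependency version through a Maven property that the `-D` flag can override.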





[jira] [Updated] (DRILL-1159) query a particular row within a csv file caused IllegalStateException

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1159:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> query a particular row within a csv file caused IllegalStateException
> -
>
> Key: DRILL-1159
> URL: https://issues.apache.org/jira/browse/DRILL-1159
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Chun Chang
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
> Attachments: jira1159
>
>
> #Mon Jul 14 10:10:52 PDT 2014
> git.commit.id.abbrev=699851b
> I have some data in a csv file format. And the following query caused 
> IllegalStateException:
> 0: jdbc:drill:schema=dfs> select * from dfs.`bugsb.csv` where columns[0]=887;
> Error: exception while executing query (state=,code=0)
> Data is sensitive so not shown here. But I will paste stack.





[jira] [Updated] (DRILL-1082) Problems with creating views using a path

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1082:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Problems with creating views using a path
> -
>
> Key: DRILL-1082
> URL: https://issues.apache.org/jira/browse/DRILL-1082
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Reporter: Krystal
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
>
> git.commit.id.abbrev=33c28f6
> I tried to create a view with a path relative to dfs.default schema:
> 0: jdbc:drill:schema=dfs> create view `dfs.default`.`views/votertsv.v2` as 
> select columns[0] voter_id, columns[1] name, columns[2] age, columns[3] 
> registration,columns[4] contributions,columns[5] voterzone,columns[6] 
> create_time from `dfs`.`root`.`./drill/testdata/tsv/voter.tsv`;
> I got the following error message:
> +--------+-------------------------------------------+
> |   ok   |                  summary                  |
> +--------+-------------------------------------------+
> | false  | Error: Failure while accessing Zookeeper  |
> +--------+-------------------------------------------+
> 1 row selected (0.595 seconds)
> However, the view is actually created:
> [root@qa-node56 ~]# hadoop fs -ls /drill/testdata/p1tests/views
> Found 1 items
> -rwxr-xr-x   3 mapr mapr683 2014-06-26 09:27 
> /drill/testdata/p1tests/views/votertsv.v2.view.drill
> I cannot query from the view:
> 0: jdbc:drill:schema=dfs> select * from `dfs.default`.`views/votertsv.v2`;
> "Failure while parsing sql. < ValidationException:[ 
> org.eigenbase.util.EigenbaseContextException: From line 1, column 15 to line 
> 1, column 27 ] < EigenbaseContextException:[ From line 1, column 15 to line 
> 1, column 27 ] < SqlValidatorException:[ Table 
> 'dfs.default.views/votertsv.v2' not found ]"





[jira] [Updated] (DRILL-1460) JsonReader fails reading files with decimal numbers and integers in the same field

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1460:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> JsonReader fails reading files with decimal numbers and integers in the same 
> field
> --
>
> Key: DRILL-1460
> URL: https://issues.apache.org/jira/browse/DRILL-1460
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Bhallamudi Venkata Siva Kamesh
>Assignee: Jason Altekruse
>Priority: Critical
> Fix For: 1.0.0
>
> Attachments: DRILL-1460.1.patch.txt, DRILL-1460.2.patch.txt
>
>
> Used the following dataset: 
> http://thecodebarbarian.wordpress.com/2014/03/28/plugging-usda-nutrition-data-into-mongodb
> Executed the following query
> {noformat}select t.nutrients from dfs.usda.`usda.json` t limit 1;{noformat}
> and it failed with following exception
> {noformat}
> 2014-09-27 17:48:39,421 [b9dfbb9b-29a9-425d-801c-2e418533525f:frag:0:0] ERROR 
> o.a.d.e.p.i.ScreenCreator$ScreenRoot - Error 
> 0568d90a-d7df-4a5d-87e9-8b9f718dffa4: Screen received stop request sent.
> java.lang.IllegalArgumentException: You tried to write a BigInt type when you 
> are using a ValueWriter of type NullableFloat8WriterImpl.
>   at 
> org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.fail(AbstractFieldWriter.java:513)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.write(AbstractFieldWriter.java:145)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.impl.NullableFloat8WriterImpl.write(NullableFloat8WriterImpl.java:88)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:257)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:310)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:204)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.fn.JsonReader.write(JsonReader.java:134) 
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.fn.JsonReaderWithState.write(JsonReaderWithState.java:65)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.easy.json.JSONRecordReader2.next(JSONRecordReader2.java:111)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
> {noformat}
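A minimal illustration of the failure mode (hypothetical data, not the USDA file): the first record binds the field to a Float8 writer, so the integer in the second record triggers the BigInt-vs-Float8 error shown above.

```json
{ "value": 0.79 }
{ "value": 2 }
```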
> {noformat}select t.nutrients[0].units from dfs.usda.`usda.json` t limit 
> 1;{noformat}
> and it failed with following exception
> {noformat}
> 2014-09-27 17:50:04,394 [9ee8a529-17fd-492f-9cba-2d1f5842eae1:frag:0:0] ERROR 
> o.a.d.e.p.i.ScreenCreator$ScreenRoot - Error 
> c4c6bffd-b62b-4878-af1e-58db64453307: Screen received stop request sent.
> java.lang.IllegalArgumentException: You tried to write a BigInt type when you 
> are using a ValueWriter of type NullableFloat8WriterImpl.
>   at 
> org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.fail(AbstractFieldWriter.java:513)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.write(AbstractFieldWriter.java:145)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.impl.NullableFloat8WriterImpl.write(NullableFloat8WriterImpl.java:88)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:257)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:310)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:204)
>  
> ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>   at 
> org.apache.drill.exec.vec

[jira] [Updated] (DRILL-1479) Hbase query using "sum" randomly fails to execute in Drill-0.6.0

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1479:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Hbase query using "sum" randomly fails to execute in Drill-0.6.0
> 
>
> Key: DRILL-1479
> URL: https://issues.apache.org/jira/browse/DRILL-1479
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
> Attachments: hbase_sum.error, votertab
>
>
> git.commit.id.abbrev=5c220e3
> The below query works fine when we run it from sqlline. However, when we run 
> this query as part of a batch which contains other queries, it fails 
> frequently with the attached error. The source data voter resides in HBase.
> Drill Query :
> {code}
> select sum(cast(threecf['contributions'] as decimal(6,2))) from voter;
> {code}
> Attached the error log and the data files





[jira] [Updated] (DRILL-1556) Querying JSON-converted-Parquet file throws parquet.io.ParquetDecodingException (Intermittent)

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1556:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Querying JSON-converted-Parquet file throws 
> parquet.io.ParquetDecodingException (Intermittent)
> --
>
> Key: DRILL-1556
> URL: https://issues.apache.org/jira/browse/DRILL-1556
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Reporter: Abhishek Girish
>Assignee: Jason Altekruse
>Priority: Critical
> Fix For: 1.0.0
>
> Attachments: drillbit.log
>
>
> Querying JSON data works at higher values for limit:
> > select * from `yelp_academic_dataset_review.json` limit 1125458;
> Querying Parquet data (converted from JSON) fails at higher values for limit:
> > create table yelp_academic_dataset_review as select * from 
> > `yelp_academic_dataset_review.json`;
> [success]
> >select * from yelp_academic_dataset_review limit 4;
> [data]
> java.lang.RuntimeException: java.sql.SQLException: Failure while trying to 
> get next result batch.
> Logs indicate an error in decoding the Parquet file. Drillbit.log is 
> attached. 
> 2014-10-20 15:21:22,739 [bf4a3f58-781b-4c89-b718-e1ef6eab6da4:frag:1:0] ERROR 
> o.a.drill.exec.ops.FragmentContext - Fragment Context received 
> failure.
> parquet.io.ParquetDecodingException: Can't read value in column [votes, 
> funny] INT64 at value 61063 out of 61063, 61063 out of 61063 in currentPage. 
> repetition level: 0, definition level: 2
> This behavior is sometimes consistent and at other times intermittent, for 
> varied values provided to the limit clause. 





[jira] [Updated] (DRILL-1586) NPE when the collection being queried for does not exist in Mongo DB

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1586:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> NPE when the collection being queried for does not exist in Mongo DB
> 
>
> Key: DRILL-1586
> URL: https://issues.apache.org/jira/browse/DRILL-1586
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - MongoDB
>Affects Versions: 0.6.0
>Reporter: Bhallamudi Venkata Siva Kamesh
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
>
> NPE when the collection being queried for does not exist in Mongo DB.
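The missing guard can be sketched generically (plain map lookup, not the Mongo driver API): a lookup for an absent collection should yield an empty result rather than a null that later dereferences as an NPE.

```java
// Sketch: treat a missing collection as an empty scan instead of null.
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class MissingCollectionSketch {
    public static List<String> scan(Map<String, List<String>> db, String collection) {
        List<String> docs = db.get(collection);  // null when collection is absent
        if (docs == null) {
            return Collections.emptyList();      // empty scan, no NPE downstream
        }
        return docs;
    }

    public static int missingCount() {
        return scan(Collections.emptyMap(), "no_such_collection").size();
    }

    public static void main(String[] args) {
        System.out.println(missingCount());
    }
}
```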





[jira] [Updated] (DRILL-1750) Querying directories with JSON files returns incomplete results

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1750:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Querying directories with JSON files returns incomplete results
> ---
>
> Key: DRILL-1750
> URL: https://issues.apache.org/jira/browse/DRILL-1750
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Reporter: Abhishek Girish
>Assignee: Jason Altekruse
>Priority: Critical
> Fix For: 1.0.0
>
> Attachments: 1.json, 2.json, 3.json, 4.json
>
>
> I happened to observe that querying (select *) a directory with json files 
> displays only fields common to all json files. All corresponding fields are 
> displayed while querying each of the json files individually. And in some 
> scenarios, querying the directory crashes sqlline.
> The example below may help make the issue clear:
> > select * from dfs.`/data/json/tmp/1.json`;
> ++++
> |   artist   |  track_id  |   title|
> ++++
> | Jonathan King | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA 
> Theme) |
> ++++
> 1 row selected (1.305 seconds)
> > select * from dfs.`/data/json/tmp/2.json`;
> +++++
> |   artist   | timestamp  |  track_id  |   title|
> +++++
> | Supersuckers | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double 
> Wide |
> +++++
> 1 row selected (0.105 seconds)
> > select * from dfs.`/data/json/tmp/3.json`;
> ++++
> | timestamp  |  track_id  |   title|
> ++++
> | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
> ++++
> 1 row selected (0.083 seconds)
> > select * from dfs.`/data/json/tmp/4.json`;
> +------------+------------+
> |  track_id  |   title    |
> +------------+------------+
> | TRAAAQN128F9353BA0 | Double Wide |
> +------------+------------+
> 1 row selected (0.076 seconds)
> > select * from dfs.`/data/json/tmp`;
> +------------+------------+
> |  track_id  |   title    |
> +------------+------------+
> | TRAAAQN128F9353BA0 | Double Wide |
> | TRAAAQN128F9353BA0 | Double Wide |
> | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) |
> | TRAAAQN128F9353BA0 | Double Wide |
> +------------+------------+
> 4 rows selected (0.121 seconds)
> JVM Crash occurs at times:
> > select * from dfs.`/data/json/tmp`;
> +------------+------------+------------+
> | timestamp  |  track_id  |   title    |
> +------------+------------+------------+
> | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7f3cb99be053, pid=13943, tid=139898808436480
> #
> # JRE version: OpenJDK Runtime Environment (7.0_65-b17) (build 
> 1.7.0_65-mockbuild_2014_07_16_06_06-b00)
> # Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 
> compressed oops)
> # Problematic frame:
> # V  [libjvm.so+0x932053]
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /tmp/jvm-13943/hs_error.log
> #
> # If you would like to submit a bug report, please include
> # instructions on how to reproduce the bug and visit:
> #   http://icedtea.classpath.org/bugzilla
> #
> Aborted



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1728) Better error messages on Drill JSON read error

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1728:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Better error messages on Drill JSON read error
> --
>
> Key: DRILL-1728
> URL: https://issues.apache.org/jira/browse/DRILL-1728
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Reporter: Tomer Shiran
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
>
> {code}
> 0: jdbc:drill:zk=localhost:2181> SELECT * FROM 
> dfs.root.`Users/tshiran/Development/demo/data/yelp/business.json` WHERE true 
> and REPEATED_CONTAINS(categories, 'Australian');
> +------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
> | business_id | full_address |   hours|open| categories |city 
>| review_count |name| longitude  |   state|   stars|  
> latitude  | attributes |type| neighborhoods |
> +------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
> Query failed: Query stopeed., You tried to start when you are using a 
> ValueWriter of type NullableBitWriterImpl. [ 
> e5bafa1e-6226-443d-80fd-51e18f330899 on 172.17.3.132:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
> query.
>   at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
>   at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
>   at sqlline.SqlLine.print(SqlLine.java:1809)
>   at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
>   at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
>   at sqlline.SqlLine.dispatch(SqlLine.java:889)
>   at sqlline.SqlLine.begin(SqlLine.java:763)
>   at sqlline.SqlLine.start(SqlLine.java:498)
>   at sqlline.SqlLine.main(SqlLine.java:460)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2711) Package native PAM library for Linux in Drill tar.gz

2015-04-15 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated DRILL-2711:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Package native PAM library for Linux in Drill tar.gz 
> -
>
> Key: DRILL-2711
> URL: https://issues.apache.org/jira/browse/DRILL-2711
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - RPC
>Affects Versions: 0.9.0
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
> Fix For: 1.0.0
>
>
> The PAM authenticator was added as part of DRILL-2674. Currently it requires 
> manually obtaining libjpam.so and setting the JVM option 
> {{java.library.path}}. This JIRA is to automate those steps.
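For context, the manual workaround the description refers to can be sketched as a small shell fragment; the library directory path and the use of DRILL_JAVA_OPTS below are illustrative assumptions, not details taken from the JIRA:

```shell
# Hypothetical manual setup that this JIRA aims to automate.
# JPAM_DIR is a placeholder; libjpam.so would have to be placed there first.
JPAM_DIR=/opt/pam/lib
# Make the JVM load native libraries (libjpam.so) from that directory,
# e.g. by adding this line to drill-env.sh before starting the drillbit.
export DRILL_JAVA_OPTS="$DRILL_JAVA_OPTS -Djava.library.path=$JPAM_DIR"
echo "$DRILL_JAVA_OPTS"
```

With something like this in place, restarting the drillbit would pick up the PAM native library; packaging the library in the tar.gz makes these steps unnecessary.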



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1751) Drill must indicate JSON All Text Mode needs to be turned on when such queries fail

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1751:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Drill must indicate JSON All Text Mode needs to be turned on when such 
> queries fail
> ---
>
> Key: DRILL-1751
> URL: https://issues.apache.org/jira/browse/DRILL-1751
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Reporter: Abhishek Girish
>Assignee: Jason Altekruse
>Priority: Critical
>  Labels: error_message_must_fix
> Fix For: 1.0.0
>
> Attachments: drillbit.log
>
>
> Although JSON All Text Mode is a documented option, it may not be obvious 
> that this option needs to be turned ON when such an error is encountered. 
> Query:
> > select * from 
> > dfs.`/data/json/lastfm/lastfm_test/A/A/A/TRAAAEA128F935A30D.json` limit 1;
> Query failed: Failure while running fragment.[ 
> 4331e9a7-c5b4-4e52-bece-214ffa5d06dd on abhi7.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> Resolution:
> I tried setting JSON Text Mode and queries began to work. 
> > alter system set `store.json.all_text_mode` = true;
> +------------+------------+
> | ok |  summary   |
> +------------+------------+
> | true   | store.json.all_text_mode updated. |
> +------------+------------+
> 1 row selected (0.136 seconds)
> 0: jdbc:drill:zk=10.10.103.34:5181> select * from 
> dfs.`/data/json/lastfm/lastfm_test/A/A/A/TRAAAEA128F935A30D.json` limit 1;
> +------------+------------+------------+------------+------------+
> 
> +------------+------------+------------+------------+------------+
> 1 row selected (0.169 seconds)
> A clear message must be included in the logs and must be displayed on 
> SQLline. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1752) Drill cluster returns error when querying Mongo shards on an unsharded collection

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-1752:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Drill cluster returns error when querying Mongo shards on an unsharded 
> collection
> -
>
> Key: DRILL-1752
> URL: https://issues.apache.org/jira/browse/DRILL-1752
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - MongoDB
>Affects Versions: 0.6.0, 0.7.0
> Environment: Drill cluster on nodes with Mongo Shards
>Reporter: Andries Engelbrecht
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
> Attachments: DRILL-1752.patch
>
>
> Query fails on a large unsharded collection in a MongoDB sharded cluster with 
> drillbits on each node that hosts Mongo shards.
> Error message:
> 0: jdbc:drill:se0:5181> select * from unshard limit 2;
> Query failed: Failure while setting up query. Incoming endpoints 1 is greater 
> than number of chunks 0 [cb2121f7-eb3e-48cd-8530-474ca76c598d]
> Error: exception while executing query: Failure while trying to get next 
> result batch. (state=,code=0)
> 0: jdbc:drill:se0:5181> explain plan for select * from unshard limit 2;
> +------------+------------+
> |text|json|
> +------------+------------+
> | 00-00Screen
> 00-01  SelectionVectorRemover
> 00-02Limit(fetch=[2])
> 00-03  Scan(groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec 
> [dbName=review_syn, collectionName=unshard, filters=null], 
> columns=[SchemaPath [`*`)
>  | {
>   "head" : {
> "version" : 1,
> "generator" : {
>   "type" : "ExplainHandler",
>   "info" : ""
> },
> "type" : "APACHE_DRILL_PHYSICAL",
> "options" : [ ],
> "queue" : 0,
> "resultMode" : "EXEC"
>   },
>   "graph" : [ {
> "pop" : "mongo-scan",
> "@id" : 3,
> "mongoScanSpec" : {
>   "dbName" : "review_syn",
>   "collectionName" : "unshard",
>   "filters" : null
> },
> "storage" : {
>   "type" : "mongo",
>   "connection" : "mongodb://se4.dmz:27017",
>   "enabled" : true
> },
> "columns" : [ "`*`" ],
> "cost" : 625000.0
>   }, {
> "pop" : "limit",
> "@id" : 2,
> "child" : 3,
> "first" : 0,
> "last" : 2,
> "initialAllocation" : 100,
> "maxAllocation" : 100,
> "cost" : 625000.0
>   }, {
> "pop" : "selection-vector-remover",
> "@id" : 1,
> "child" : 2,
> "initialAllocation" : 100,
> "maxAllocation" : 100,
> "cost" : 625000.0
>   }, {
> "pop" : "screen",
> "@id" : 0,
> "child" : 1,
> "initialAllocation" : 100,
> "maxAllocation" : 100,
> "cost" : 625000.0
>   } ]
> } |
> +------------+------------+



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2725) Faster work assignment logic

2015-04-15 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-2725:
---
Attachment: DRILL-2725.2.patch

Address review comments.

> Faster work assignment logic
> 
>
> Key: DRILL-2725
> URL: https://issues.apache.org/jira/browse/DRILL-2725
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Fix For: 1.0.0
>
> Attachments: DRILL-2725.2.patch, DRILL-2725.patch
>
>
> The current AssignmentCreator logic for assigning work to drillbits takes a 
> non-negligible amount of time once the number of work units is more than a 
> few thousand.
> We need a new algorithm that will cut this time down to less than a 
> second, even for tables with more than 100K files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2725) Faster work assignment logic

2015-04-15 Thread Steven Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-2725:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Faster work assignment logic
> 
>
> Key: DRILL-2725
> URL: https://issues.apache.org/jira/browse/DRILL-2725
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Fix For: 1.0.0
>
> Attachments: DRILL-2725.patch
>
>
> The current AssignmentCreator logic for assigning work to drillbits takes a 
> non-negligible amount of time once the number of work units is more than a 
> few thousand.
> We need a new algorithm that will cut this time down to less than a 
> second, even for tables with more than 100K files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2161) Flatten on a list within a list on a large data set results in an IOB Exception

2015-04-15 Thread Jason Altekruse (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496730#comment-14496730
 ] 

Jason Altekruse commented on DRILL-2161:


Many of these have been fixed in various commits; I will update soon with a 
smaller list of the queries that are still failing.

> Flatten on a list within a list on a large data set results in an IOB 
> Exception
> ---
>
> Key: DRILL-2161
> URL: https://issues.apache.org/jira/browse/DRILL-2161
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow, Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
> Fix For: 0.9.0
>
> Attachments: data.json
>
>
> git.commit.id.abbrev=3e33880
> I attached the data set, which contains 2 records.
> The below query works fine on the attached data set:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDir> select uid, flatten(d.lst_lst) lst 
> from `data.json` d;
> +------------+------------+
> |uid |lst |
> +------------+------------+
> | 1  | [1,2,3,4,5] |
> | 1  | [2,3,4,5,6] |
> | 2  | [1,2,3,4,5] |
> | 2  | [2,3,4,5,6] |
> +------------+------------+
> {code}
> However, if I copy the same data set 50,000 times and run the same query, it 
> fails with an IOB. Below are the contents of the log file:
> {code}
> java.lang.IndexOutOfBoundsException: index: 16384, length: 4 (expected: 
> range(0, 16384))
>   at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:156) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>   at io.netty.buffer.DrillBuf.chk(DrillBuf.java:178) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>   at io.netty.buffer.DrillBuf.getInt(DrillBuf.java:447) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>   at 
> org.apache.drill.exec.vector.UInt4Vector$Accessor.get(UInt4Vector.java:309) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.RepeatedListVector.populateEmpties(RepeatedListVector.java:385)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.RepeatedListVector.access$300(RepeatedListVector.java:54)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.vector.complex.RepeatedListVector$Mutator.setValueCount(RepeatedListVector.java:132)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setValueCount(ProjectRecordBatch.java:248)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:181)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java

[jira] [Updated] (DRILL-2208) Error message must be updated when query contains operations on a flattened column

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-2208:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Error message must be updated when query contains operations on a flattened 
> column
> --
>
> Key: DRILL-2208
> URL: https://issues.apache.org/jira/browse/DRILL-2208
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Abhishek Girish
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
> Attachments: drillbit_flatten.log
>
>
> Currently I observe that if a flatten/kvgen operation is applied to a 
> column, no further operations can be performed on that column unless it 
> is wrapped inside a nested query. 
> Consider a simple flatten/kvgen operation on a complex JSON file :
> > select flatten(kvgen(f.`people`)) as p from `factbook/world.json` f limit 1;
> +------------+
> | p  |
> +------------+
> | {"key":"languages","value":{"text":"Mandarin Chinese 12.44%, Spanish 4.85%, 
> English 4.83%, Arabic 3.25%, Hindi 2.68%, Bengali 2.66%, Portuguese 2.62%, 
> Russian 2.12%, Japanese 1.8%, Standard German 1.33%, Javanese 1.25% (2009 
> est.)","note_1":"percents are for \"first language\" speakers only; the six 
> UN languages - Arabic, Chinese (Mandarin), English, French, Russian, and 
> Spanish (Castilian) - are the mother tongue or second language of about half 
> of the world's population, and are the official languages in more than half 
> the states in the world; some 150 to 200 languages have more than a million 
> speakers","note_2":"all told, there are an estimated 7,100 languages spoken 
> in the world; approximately 80% of these languages are spoken by less than 
> 100,000 people; about 50 languages are spoken by only 1 person; communities 
> that are isolated from each other in mountainous regions often develop 
> multiple languages; Papua New Guinea, for example, boasts about 836 separate 
> languages","note_3":"approximately 2,300 languages are spoken in Asia, 2,150, 
> in Africa, 1,311 in the Pacific, 1,060 in the Americas, and 280 in Europe"}} |
> | {"key":"religions","value":{"text":"Christian 33.39% (of which Roman 
> Catholic 16.85%, Protestant 6.15%, Orthodox 3.96%, Anglican 1.26%), Muslim 
> 22.74%, Hindu 13.8%, Buddhist 6.77%, Sikh 0.35%, Jewish 0.22%, Baha'i 0.11%, 
> other religions 10.95%, non-religious 9.66%, atheists 2.01% (2010 est.)"}} |
> | {"key":"population","value":{"text":"7,095,217,980 (July 2013 
> est.)","top_ten_most_populous_countries_in_millions":"China 1,349.59; India 
> 1,220.80; United States 316.67; Indonesia 251.16; Brazil 201.01; Pakistan 
> 193.24; Nigeria 174.51; Bangladesh 163.65; Russia 142.50; Japan 127.25"}} |
> | {"key":"age_structure","value":{"0_14_years":"26% (male 953,496,513/female 
> 890,372,474)","15_24_years":"16.8% (male 614,574,389/female 
> 579,810,490)","25_54_years":"40.6% (male 1,454,831,900/female 
> 1,426,721,773)","55_64_years":"8.4% (male 291,435,881/female 
> 305,185,398)","65_years_and_over":"8.2% (male 257,035,416/female 321,753,746) 
> (2013 est.)"}} |
> | {"key":"dependency_ratios","value":{"total_dependency_ratio":"52 
> %","youth_dependency_ratio":"39.9 %","elderly_dependency_ratio":"12.1 
> %","potential_support_ratio":"8.3 (2013)"}} |
> +------------+
> *Adding a WHERE clause with conditions on this column fails:*
> > select flatten(kvgen(f.`people`)) as p from `factbook/world.json` f where 
> > f.p.`key` = 'languages';
> Query failed: RemoteRpcException: Failure while running fragment., languages 
> [ 686bcd40-c23b-448c-93d8-b98a3b092657 on abhi5.qa.lab:31010 ]
> [ 686bcd40-c23b-448c-93d8-b98a3b092657 on abhi5.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> Logs indicate a NumberFormat Exception in the above case.
> *And query fails to parse in the below case*
> > select flatten(kvgen(f.`people`)).`value` as p from `factbook/world.json` f 
> > limit 5;
> Query failed: ParseException: Encountered "." at line 1, column 34.
> Was expecting one of:
> "FROM" ...
> "," ...
> "AS" ...
>  
>  
> "OVER" ...
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> Rewriting using an inner query succeeds:
> select g.p.`value`.`note_3` from (select flatten(kvgen(f.`people`)) as p from 
> `factbook/world.json` f) g where g.p.`key`='languages';
> +------------+
> |   EXPR$0   |
> +------------+
> | approximately 2,300 languages are spoken in Asia, 2,150, in Africa, 1,311 
> in the Pacific, 1,060 in the Americas, and 280 in Europe |
> +------------+
> *In both the fail

[jira] [Created] (DRILL-2801) ORDER BY produces extra records

2015-04-15 Thread Sudheesh Katkam (JIRA)
Sudheesh Katkam created DRILL-2801:
--

 Summary: ORDER BY produces extra records
 Key: DRILL-2801
 URL: https://issues.apache.org/jira/browse/DRILL-2801
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 0.8.0
Reporter: Sudheesh Katkam
Assignee: Chris Westin
Priority: Critical
 Attachments: data.csv

Running in embedded mode on my mac.
{code}
$ wc -w data.csv
   5 data.csv
{code}
Here's the query:
{code}
0: jdbc:drill:zk=local> SELECT count(*) FROM dfs.`data.csv`;
+------------+
|   EXPR$0   |
+------------+
| 5  |
+------------+
1 row selected (0.223 seconds)
0: jdbc:drill:zk=local> SELECT columns[0] FROM dfs.`data.csv` ORDER BY 
columns[0];
+------------+
|   EXPR$0   |
+------------+
...
| 6  |
+------------+
50,001 rows selected (0.928 seconds)
0: jdbc:drill:zk=local> SELECT tab.col, COUNT(tab.col) FROM (SELECT columns[0] 
col FROM dfs.`data.csv` ORDER BY columns[0]) tab GROUP BY tab.col;
+------------+------------+
| col  |   EXPR$1   |
+------------+------------+
| 2  | 1  |
| 3  | 1  |
| 4  | 1  |
| 5  | 10001  |
| 6  | 1  |
+------------+------------+
5 rows selected (0.704 seconds)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2801) ORDER BY produces extra records

2015-04-15 Thread Sudheesh Katkam (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheesh Katkam updated DRILL-2801:
---
Attachment: data.csv

> ORDER BY produces extra records
> ---
>
> Key: DRILL-2801
> URL: https://issues.apache.org/jira/browse/DRILL-2801
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 0.8.0
>Reporter: Sudheesh Katkam
>Assignee: Chris Westin
>Priority: Critical
> Attachments: data.csv
>
>
> Running in embedded mode on my mac.
> {code}
> $ wc -w data.csv
>5 data.csv
> {code}
> Here's the query:
> {code}
> 0: jdbc:drill:zk=local> SELECT count(*) FROM dfs.`data.csv`;
> +------------+
> |   EXPR$0   |
> +------------+
> | 5  |
> +------------+
> 1 row selected (0.223 seconds)
> 0: jdbc:drill:zk=local> SELECT columns[0] FROM dfs.`data.csv` ORDER BY 
> columns[0];
> +------------+
> |   EXPR$0   |
> +------------+
> ...
> | 6  |
> +------------+
> 50,001 rows selected (0.928 seconds)
> 0: jdbc:drill:zk=local> SELECT tab.col, COUNT(tab.col) FROM (SELECT 
> columns[0] col FROM dfs.`data.csv` ORDER BY columns[0]) tab GROUP BY tab.col;
> +------------+------------+
> | col  |   EXPR$1   |
> +------------+------------+
> | 2  | 1  |
> | 3  | 1  |
> | 4  | 1  |
> | 5  | 10001  |
> | 6  | 1  |
> +------------+------------+
> 5 rows selected (0.704 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-1964) Missing key elements in returned array of maps

2015-04-15 Thread Hanifi Gunes (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496707#comment-14496707
 ] 

Hanifi Gunes commented on DRILL-1964:
-

This is an intended feature of Drill. We don't differentiate null fields from 
non-existent ones. Outputting null is semantically the same as outputting 
nothing. I am going to close this issue.

> Missing key elements in returned array of maps
> --
>
> Key: DRILL-1964
> URL: https://issues.apache.org/jira/browse/DRILL-1964
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Chun Chang
>Assignee: Hanifi Gunes
>Priority: Minor
> Fix For: 0.9.0
>
>
> #Wed Jan 07 18:54:07 EST 2015
> git.commit.id.abbrev=35a350f
> For an array of maps, if the schema for each map is not identical, with 
> today's implementation, we are supposed to display each map with all elements 
> (keys) from all maps. This is not happening. For example, I have the 
> following data:
> {code}
> {
> "id": 2,
> "oooa": {
> "oa": {
> "oab": {
> "oabc": [
> {
> "rowId": 2
> },
> {
> "rowValue1": [{"rv1":1, "rv2":2}, {"rva1":3, 
> "rva2":4}],
> "rowValue2": [{"rw1":1, "rw2":2}, {"rwa1":3, 
> "rwa2":4}]
> }
> ]
> }
> }
> }
> }
> {code}
> The following query gives:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.oooa.oa.oab.oabc from 
> `jira2file/jira1.json` t;
> +------------+
> |   EXPR$0   |
> +------------+
> | 
> [{"rowId":2,"rowValue1":[],"rowValue2":[]},{"rowValue1":[{"rv1":1,"rv2":2},{"rva1":3,"rva2":4}],"rowValue2":[{"rw1":1,"rw2":2},{"rwa1":3,"rwa2":4}]}]
>  |
> +------------+
> {code}
> The returned result in a nicely formatted json form:
> {code}
> [
> {
> "rowId": 2,
> "rowValue1": [],
> "rowValue2": []
> },
> {
> "rowValue1": [
> {
> "rv1": 1,
> "rv2": 2
> },
> {
> "rva1": 3,
> "rva2": 4
> }
> ],
> "rowValue2": [
> {
> "rw1": 1,
> "rw2": 2
> },
> {
> "rwa1": 3,
> "rwa2": 4
> }
> ]
> }
> ]
> {code}
> Notice the first map includes all keys from all maps. But the second map is 
> missing the "rowId" key.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-1964) Missing key elements in returned array of maps

2015-04-15 Thread Hanifi Gunes (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanifi Gunes closed DRILL-1964.
---
Resolution: Not A Problem

> Missing key elements in returned array of maps
> --
>
> Key: DRILL-1964
> URL: https://issues.apache.org/jira/browse/DRILL-1964
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Chun Chang
>Assignee: Hanifi Gunes
>Priority: Minor
> Fix For: 0.9.0
>
>
> #Wed Jan 07 18:54:07 EST 2015
> git.commit.id.abbrev=35a350f
> For an array of maps, if the schema for each map is not identical, with 
> today's implementation, we are supposed to display each map with all elements 
> (keys) from all maps. This is not happening. For example, I have the 
> following data:
> {code}
> {
> "id": 2,
> "oooa": {
> "oa": {
> "oab": {
> "oabc": [
> {
> "rowId": 2
> },
> {
> "rowValue1": [{"rv1":1, "rv2":2}, {"rva1":3, 
> "rva2":4}],
> "rowValue2": [{"rw1":1, "rw2":2}, {"rwa1":3, 
> "rwa2":4}]
> }
> ]
> }
> }
> }
> }
> {code}
> The following query gives:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.oooa.oa.oab.oabc from 
> `jira2file/jira1.json` t;
> +------------+
> |   EXPR$0   |
> +------------+
> | 
> [{"rowId":2,"rowValue1":[],"rowValue2":[]},{"rowValue1":[{"rv1":1,"rv2":2},{"rva1":3,"rva2":4}],"rowValue2":[{"rw1":1,"rw2":2},{"rwa1":3,"rwa2":4}]}]
>  |
> +------------+
> {code}
> The returned result in a nicely formatted json form:
> {code}
> [
> {
> "rowId": 2,
> "rowValue1": [],
> "rowValue2": []
> },
> {
> "rowValue1": [
> {
> "rv1": 1,
> "rv2": 2
> },
> {
> "rva1": 3,
> "rva2": 4
> }
> ],
> "rowValue2": [
> {
> "rw1": 1,
> "rw2": 2
> },
> {
> "rwa1": 3,
> "rwa2": 4
> }
> ]
> }
> ]
> {code}
> Notice the first map includes all keys from all maps. But the second map is 
> missing the "rowId" key.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2800) Performance regression introduced with commit: a6df26a (Patch for DRILL-2512)

2015-04-15 Thread Kunal Khatua (JIRA)
Kunal Khatua created DRILL-2800:
---

 Summary: Performance regression introduced with commit: a6df26a  
(Patch for DRILL-2512)
 Key: DRILL-2800
 URL: https://issues.apache.org/jira/browse/DRILL-2800
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 0.9.0
 Environment: RHEL 6.4
TPCH Data Set: SF100 (Uncompressed Parquet)
Reporter: Kunal Khatua
 Fix For: 0.9.0


TPCH 06 (Cached Run) was used as a reference to identify the regressive commit.

DRILL-2613: 2-Core: Impl. ResultSet.getXxx(...) number-to-number data [fe11e86]
3,902 msec

DRILL-2668: Fix: CAST(1.1 AS FLOAT) was yielding DOUBLE. [49042bc]
5,606 msec

DRILL-2512: Shuffle the list of Drill endpoints before connecting [a6df26a] 
10,506 msec 
(Rerun 9,678 msec)

Here are comparisons from the last complete run (Cached runs):
Commit  d7e37f4 a6df26a
tpch 01  12,232  16,693 
tpch 03  23,374  30,062 
tpch 04  42,144  23,749 
tpch 05  32,247  41,648 
tpch 06  4,665   10,506 
tpch 07  29,322  34,315 
tpch 08  35,478  42,120 
tpch 09  43,959  49,262 
tpch 10  24,439  26,136 
tpch 12  Timeout 18,866 
tpch 13  18,226  20,863 
tpch 14  11,760  11,884 
tpch 16  10,676  15,032 
tpch 18  34,153  39,058 
tpch 19  Timeout 32,909 
tpch 20  99,788  22,890 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2022) Parquet engine falls back to "new" Parquet reader unnecessarily

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-2022:
---
Fix Version/s: (was: 0.9.0)
   1.0.0

> Parquet engine falls back to "new" Parquet reader unnecessarily
> ---
>
> Key: DRILL-2022
> URL: https://issues.apache.org/jira/browse/DRILL-2022
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Affects Versions: 0.8.0
>Reporter: Adam Gilmore
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.0.0
>
> Attachments: DRILL-2022.1.patch.txt, DRILL-2022.2.patch.txt
>
>
> The Parquet engine falls back to the "new" Parquet reader whenever a Parquet 
> file that is "complex" (i.e. not purely primitive types) is found.
> The engine should still use the faster reader when all the projected columns 
> are primitive types and only fall back to the other reader when columns 
> containing complex types are selected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-746) Union all operator not working with tables in hive

2015-04-15 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli closed DRILL-746.
---

Verified the fix and everything looks good

> Union all operator not working with tables in hive
> --
>
> Key: DRILL-746
> URL: https://issues.apache.org/jira/browse/DRILL-746
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Rahul Challapalli
>Assignee: Aman Sinha
>Priority: Blocker
> Fix For: 0.4.0
>
> Attachments: students.txt, unionall.ddl
>
>
> Attached the DDL and a small data set that I used. The below query fails to 
> be parsed
> select * from hive.hivestudents1 union all select * from hive.hivestudents2;
> Error: exception while executing query (state=,code=0)
> Query failed: org.apache.drill.exec.rpc.RpcException: Remote failure while 
> running query.[error_id: "2602c8e6-2654-40bd-b000-3f015e7d699f"
> endpoint {
>   address: "qa-node191.qa.lab"
>   user_port: 31010
>   control_port: 31011
>   data_port: 31012
> }
> error_type: 0
> message: "Failure while parsing sql. < CannotPlanException:[ Node 
> [rel#125:Subset#8.PHYSICAL.SINGLETON([]).[]] could not be implemented; 
> planner state:
> Root: rel#125:Subset#8.PHYSICAL.SINGLETON([]).[]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2162) Multiple flattens on a list within a list results in violating the incoming batch size limit

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2162.

Resolution: Fixed

> Multiple flattens on a list within a list results in violating the incoming 
> batch size limit
> 
>
> Key: DRILL-2162
> URL: https://issues.apache.org/jira/browse/DRILL-2162
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
> Fix For: 0.9.0
>
> Attachments: data.json, drill-2162.patch
>
>
> git.commit.id.abbrev=3e33880
> I attached the data set with 2 records.
> The below query succeeds on top of the attached data set. However, when I 
> copied over the same data set 5 times, the same query failed:
> {code}
> select uid, flatten(d.lst_lst[1]) lst1, flatten(d.lst_lst[0]) lst0, 
> flatten(d.lst_lst) lst from `data.json` d;
> Query failed: RemoteRpcException: Failure while running fragment., Incoming 
> batch of org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch has 
> size 102375, which is beyond the limit of 65536 [ 
> ef16dd95-40e2-4b66-ba30-8650ddb99812 on qa-node190.qa.lab:31010 ]
> [ ef16dd95-40e2-4b66-ba30-8650ddb99812 on qa-node190.qa.lab:31010 ]
> {code}
> Error from the logs :
> {code}
> java.lang.IllegalStateException: Incoming batch of 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch has size 
> 102375, which is beyond the limit of 65536
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:129)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(Itera

[jira] [Commented] (DRILL-2162) Multiple flattens on a list within a list results in violating the incoming batch size limit

2015-04-15 Thread Jason Altekruse (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496678#comment-14496678
 ] 

Jason Altekruse commented on DRILL-2162:


The fix for this issue was actually included in the patch for DRILL-2695, so it 
was fixed in 314e5a2a8f476f059153fde1b7e7da7d882db94e

> Multiple flattens on a list within a list results in violating the incoming 
> batch size limit
> 
>
> Key: DRILL-2162
> URL: https://issues.apache.org/jira/browse/DRILL-2162
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
> Fix For: 0.9.0
>
> Attachments: data.json, drill-2162.patch
>
>
> git.commit.id.abbrev=3e33880
> I attached the data set with 2 records.
> The query below succeeds on the attached data set. However, when I 
> copied the same data set over 5 times, the same query failed:
> {code}
> select uid, flatten(d.lst_lst[1]) lst1, flatten(d.lst_lst[0]) lst0, 
> flatten(d.lst_lst) lst from `data.json` d;
> Query failed: RemoteRpcException: Failure while running fragment., Incoming 
> batch of org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch has 
> size 102375, which is beyond the limit of 65536 [ 
> ef16dd95-40e2-4b66-ba30-8650ddb99812 on qa-node190.qa.lab:31010 ]
> [ ef16dd95-40e2-4b66-ba30-8650ddb99812 on qa-node190.qa.lab:31010 ]
> {code}
> Error from the logs :
> {code}
> java.lang.IllegalStateException: Incoming batch of 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch has size 
> 102375, which is beyond the limit of 65536
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:129)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142

[jira] [Updated] (DRILL-2725) Faster work assignment logic

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-2725:
---
Assignee: Steven Phillips  (was: Jason Altekruse)

> Faster work assignment logic
> 
>
> Key: DRILL-2725
> URL: https://issues.apache.org/jira/browse/DRILL-2725
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Fix For: 0.9.0
>
> Attachments: DRILL-2725.patch
>
>
> The current AssignmentCreator logic for assigning work to drillbits takes a 
> non-negligible amount of time once the number of work units exceeds a few 
> thousand.
> We need a new algorithm that will cut this time down to less than a second, 
> even for tables with more than 100K files.
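One linear-time approach is a simple round-robin pass over the work units. This is a hypothetical sketch (`RoundRobinAssign` is an illustrative name, not Drill's actual AssignmentCreator), showing how assignment cost can grow linearly with the number of units:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: assign each work unit to the next endpoint in a
// single pass, so 100K files cost 100K constant-time steps.
public class RoundRobinAssign {
    static Map<String, List<Integer>> assign(List<String> endpoints, int units) {
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        for (String e : endpoints) {
            out.put(e, new ArrayList<>());
        }
        for (int u = 0; u < units; u++) {
            // Cycle through endpoints so the work stays evenly balanced.
            out.get(endpoints.get(u % endpoints.size())).add(u);
        }
        return out;
    }
}
```

A real implementation would also weigh data locality, which this sketch ignores for brevity.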





[jira] [Updated] (DRILL-2611) Value vectors report invalid value count

2015-04-15 Thread Hanifi Gunes (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanifi Gunes updated DRILL-2611:

Assignee: Mehant Baid  (was: Hanifi Gunes)

> Value vectors report invalid value count
> 
>
> Key: DRILL-2611
> URL: https://issues.apache.org/jira/browse/DRILL-2611
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Hanifi Gunes
>Assignee: Mehant Baid
>Priority: Critical
> Fix For: 0.9.0
>
>
> We maintain an exclusive value count variable in fixed vectors; however, we 
> don't update it upon calling set/setSafe. The accessor reports the value count 
> from the variable, ignoring values that are already in the buffer or written 
> via set/setSafe. This causes execution failures, manifested as IOOB, when the 
> underlying data is sparse. We should either remove the variable and report the 
> value count directly by investigating the buffer (if not computationally 
> expensive), or update the variable each time we write to the vector.
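The second option can be sketched as follows. This is a minimal illustration with hypothetical names (`FixedVectorSketch` is not Drill's real ValueVector API): a cached value count that each write keeps in sync, so sparse writes no longer under-count.

```java
// Hypothetical sketch of the proposed fix: update the cached count on
// every write instead of leaving it stale.
public class FixedVectorSketch {
    private final int[] data = new int[4096];
    private int valueCount;  // the count the accessor reports

    void setSafe(int index, int value) {
        data[index] = value;
        // Keep the cached count in sync with each write, so sparse data
        // no longer leads to under-counting and IOOB on read.
        valueCount = Math.max(valueCount, index + 1);
    }

    int getValueCount() {
        return valueCount;
    }

    int get(int index) {
        if (index >= valueCount) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }
}
```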





[jira] [Updated] (DRILL-1785) Infer operator return type from function templates

2015-04-15 Thread Hanifi Gunes (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanifi Gunes updated DRILL-1785:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Infer operator return type from function templates
> --
>
> Key: DRILL-1785
> URL: https://issues.apache.org/jira/browse/DRILL-1785
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Reporter: Hanifi Gunes
>Assignee: Hanifi Gunes
>Priority: Minor
> Fix For: 1.0.0
>
>






[jira] [Updated] (DRILL-2539) NullReader should allocate an empty vector in copy* methods

2015-04-15 Thread Hanifi Gunes (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanifi Gunes updated DRILL-2539:

Fix Version/s: (was: 0.9.0)
   1.0.0

> NullReader should allocate an empty vector in copy* methods
> ---
>
> Key: DRILL-2539
> URL: https://issues.apache.org/jira/browse/DRILL-2539
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Execution - Flow
>Reporter: Hanifi Gunes
>Assignee: Hanifi Gunes
> Fix For: 1.0.0
>
>
> Projecting a non-existent field from a repeated type fails with an NPE, 
> mainly because the projected vector is not allocated and the underlying buffer 
> is dead. This issue proposes to allocate an empty vector in NullReader's copy* 
> methods.





[jira] [Updated] (DRILL-2005) Create table fails to write out a parquet file created from hive- read works fine

2015-04-15 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-2005:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Create table fails to write out a parquet file created from hive- read works 
> fine
> -
>
> Key: DRILL-2005
> URL: https://issues.apache.org/jira/browse/DRILL-2005
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Writer
>Affects Versions: 0.7.0
>Reporter: Ramana Inukonda Nagaraj
>Assignee: Deneche A. Hakim
> Fix For: 1.0.0
>
> Attachments: hive_alltypes.parquet
>
>
> Created a parquet file in hive having the following DDL
> hive> desc alltypesparquet; 
> OK
> c1 int 
> c2 boolean 
> c3 double 
> c4 string 
> c5 array 
> c6 map 
> c7 map 
> c8 struct
> c9 tinyint 
> c10 smallint 
> c11 float 
> c12 bigint 
> c13 array>  
> c15 struct>
> c16 array,n:int>> 
> Time taken: 0.076 seconds, Fetched: 15 row(s)
> Now tried to write the file out using drill
> 0: jdbc:drill:> create table hive_alltypesparquet as select * from 
> `/user/hive/warehouse/alltypesparquet`;
> Query failed: Query failed: Failure while running fragment., Attempted to 
> access index 2147483646 when value capacity is 4095 [ 
> 7c030695-9dee-4e15-b9a7-7c807dbd3e5f on 10.10.30.167:31010 ]
> [ 7c030695-9dee-4e15-b9a7-7c807dbd3e5f on 10.10.30.167:31010 ]
> Drill is able to read the file (reasonably well- see 1997,1999,2000)
> 0: jdbc:drill:> select * from `/user/hive/warehouse/alltypesparquet`;
> ++++++++++++++++
> | c1 | c2 | c3 | c4 | c5 | c6 
> | c7 | c8 | c9 |c10 |c11 |c12 
> |c13 |c15 |c16 |
> ++++++++++++++++
> | null   | null   | null   | null   | {"bag":[]} | {"map":[]} 
> | {"map":[]} | {} | null   | null   | null   | null   
> | {"bag":[]} | {"s":{}}   | {"bag":[]} |
> | -1 | false  | -1.1   | [B@62369833 | {"bag":[]} | 
> {"map":[]} | {"map":[]} | {} | -1 | -1 | -1.0   | 
> -1 | {"bag":[]} | {"s":{}}   | {"bag":[]} |
> | 1  | true   | 1.1| [B@6e426ea4 | 
> {"bag":[{"array_element":1},{"array_element":2}]} | 
> {"map":[{"key":1,"value":"eA=="},{"key":2,"value":"eQ=="}]} | 
> {"map":[{"key":"aw==","value":"dg=="}]} | {"r":"YQ==","s":9,"t":2.2} | 1  
> | 1  | 1.0| 1  | 
> {"bag":[{"array_element":{"bag":[{"array_element":"YQ=="},{"array_element":"Yg=="}]}},{"array_element":{"bag":[{"array_element":"Yw=="},{"array_element":"ZA=="}]}}]}
>  | {"r":1,"s":{"a":2,"b":"eA=="}} | 
> {"bag":[{"array_element":{"m":{"map":[]},"n":1}},{"array_element":{"m":{"map":[{"key":"YQ==","value":"Yg=="},{"key":"Yw==","value":"ZA=="}]},"n":2}}]}
>  |
> ++++++++++++++++
> 3 rows selected (0.101 seconds)





[jira] [Updated] (DRILL-2662) Exception type not being included when propagating exception message

2015-04-15 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-2662:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Exception type not being included when propagating exception message
> 
>
> Key: DRILL-2662
> URL: https://issues.apache.org/jira/browse/DRILL-2662
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Daniel Barclay (Drill)
>Assignee: Deneche A. Hakim
> Fix For: 1.0.0
>
>
> A query that tries to cast a non-numeric string (e.g., "col4") to an integer 
> fails (as expected), with the root exception being a NumberFormatException 
> whose exception trace printout would begin with:
>   java.lang.NumberFormatException: "col4"
> However, one of the higher-level chained/wrapping exceptions shows up like 
> this:
>   Query failed: RemoteRpcException: Failure while running fragment., "col4" [ 
> 99343f97-5c70-4454-b67f-ae550b2252fb on dev-linux2:31013 ]
> In particular, note that the most important information, that there was a 
> numeric syntax error, is not present in the message, even though some details 
> (the string with the invalid syntax) are present.
> This usually comes from taking getMessage() of an exception rather than 
> toString() when making a higher-level message.
> The toString() method normally includes the class name--and frequently the 
> class name contains key information that is not given in the exception 
> message.  (Maybe Sun/Oracle should have always put the full information in 
> the message part, but they didn't.)
> _If_ all our exceptions were just for developers, then I'd suggest always 
> wrapping exceptions like this:
>   throw new WrappingException( "higher-level problem: " + e, e );
> rather than
>   throw new WrappingException( "higher-level problem: " + e.getMessage(), e );
> to avoid losing information.  (Then the top-most exception's message string 
> always includes all the information from the lower-level exception's message 
> strings.)
> However, since that would inject class names (irrelevant to users) into 
> message strings (shown to users), for Drill we should probably make sure that 
> exceptions like NumberFormatException (for expected conversion errors) are 
> always wrapped in or replaced by exceptions that are meant for users (e.g., 
> an InvalidIntegerFormatDataException (from standard SQL exception conditions 
> like "data exception — invalid datetime format") whose message string stands 
> on its own (independent of whether the class name appears with it)).  
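The getMessage()-vs-toString() distinction described above can be demonstrated directly (`WrapDemo` is a hypothetical name for illustration; the actual message text produced by `Integer.parseInt` may vary by JDK):

```java
// Sketch of the wrapping pitfall: wrapping with getMessage() drops the
// exception class name, while wrapping with toString() keeps it.
public class WrapDemo {
    static String wrapWithMessage(Throwable e) {
        return "higher-level problem: " + e.getMessage();
    }

    static String wrapWithToString(Throwable e) {
        return "higher-level problem: " + e;  // string concat calls e.toString()
    }

    public static void main(String[] args) {
        try {
            Integer.parseInt("col4");  // fails like the CAST in the report
        } catch (NumberFormatException e) {
            // Loses the fact that this was a numeric-syntax error:
            System.out.println(wrapWithMessage(e));
            // Keeps "java.lang.NumberFormatException" in the message:
            System.out.println(wrapWithToString(e));
        }
    }
}
```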





[jira] [Updated] (DRILL-1245) Drill should pinpoint to the "Problem Record" when it fails to parse a json file

2015-04-15 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-1245:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Drill should pinpoint to the "Problem Record" when it fails to parse a json 
> file
> 
>
> Key: DRILL-1245
> URL: https://issues.apache.org/jira/browse/DRILL-1245
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Reporter: Rahul Challapalli
>Assignee: Deneche A. Hakim
> Fix For: 1.0.0
>
> Attachments: DRILL-1245.1.patch.txt
>
>
> git.commit.id.abbrev=98b208e
> Data :
> {code}
> {"name":"name1", "id":1}
> {"name":"name2", "id":2}
> {"name":"name3", "id":3}
> {"name":"name4", "id":04}
> {"name":"name5", "id":5}
> {code}
> Query :
> {code}
>  select * from cp.`file.json`;
> Query failed: Screen received stop request sent. Invalid numeric value: 
> Leading zeroes not allowed
>  at [Source: java.io.BufferedReader@202fbdb4; line: 1, column: 24] 
> [c11a17bd-1a3a-4eed-a848-6d79225399d3]
> Error: exception while executing query: Failure while trying to get next 
> result batch. (state=,code=0)
> {code}
> The message should point to the exact record causing the problem, as it is 
> hard to look through the data and find the problem record.
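The kind of error context requested above can be sketched as follows. This is a toy illustration (`ProblemRecordSketch` and its leading-zero regex are assumptions standing in for Drill's real JSON reader): the error carries the record number and the record text, not only a line/column offset.

```java
import java.util.List;
import java.util.regex.Pattern;

// Toy sketch: report which record has a leading-zero numeric value,
// including the record number and its text.
public class ProblemRecordSketch {
    private static final Pattern LEADING_ZERO = Pattern.compile(":\\s*0\\d");

    static String findProblemRecord(List<String> records) {
        for (int i = 0; i < records.size(); i++) {
            if (LEADING_ZERO.matcher(records.get(i)).find()) {
                return "Leading zeroes not allowed at record " + (i + 1)
                        + ": " + records.get(i);
            }
        }
        return null;  // no problem record found
    }
}
```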





[jira] [Updated] (DRILL-2430) Improve Error Propagation (Umbrella)

2015-04-15 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-2430:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Improve Error Propagation (Umbrella)
> 
>
> Key: DRILL-2430
> URL: https://issues.apache.org/jira/browse/DRILL-2430
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Affects Versions: 0.9.0
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.0.0
>
>
> In many cases, when an exception is thrown, it will be reported in the logs 
> but the client will actually get a different and more generic message that 
> doesn't give enough information about the problem.
> Our goal is to provide the user with better error messages. To do so we will 
> separate user and system exceptions:
> - for user exceptions the server returns enough information about the error 
> to the client
> - system exceptions only contain the necessary information to help developers 
> debug the error, e.g. error ID + drillbit IP





[jira] [Updated] (DRILL-2658) Add ilike and regex substring functions

2015-04-15 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-2658:

Assignee: Steven Phillips  (was: Deneche A. Hakim)

> Add ilike and regex substring functions
> ---
>
> Key: DRILL-2658
> URL: https://issues.apache.org/jira/browse/DRILL-2658
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Functions - Drill
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Fix For: 1.0.0
>
> Attachments: DRILL-2658.patch, DRILL-2658.patch
>
>
> This will not modify the parser, so Postgres syntax such as:
> "... where c ILIKE '%ABC%'"
> will not currently be supported. It will simply be a function:
> "... where ILIKE(c, '%ABC%')"
> Same for substring:
> "select substr(c, 'abc')..."
> will be equivalent to the Postgres
> "select substr(c from 'abc')",
> but 'abc' will be treated as a Java regex pattern.
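The function form described above can be sketched like this (`ILikeSketch` is a hypothetical name, not the Drill implementation): ILIKE(c, pattern) as a case-insensitive SQL LIKE, translating % and _ into a Java regex.

```java
import java.util.regex.Pattern;

// Sketch: translate a SQL LIKE pattern into a Java regex and match it
// case-insensitively.
public class ILikeSketch {
    static boolean ilike(String input, String sqlPattern) {
        StringBuilder regex = new StringBuilder();
        for (char ch : sqlPattern.toCharArray()) {
            if (ch == '%') {
                regex.append(".*");   // % matches any run of characters
            } else if (ch == '_') {
                regex.append('.');    // _ matches exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(ch)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE)
                      .matcher(input).matches();
    }
}
```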





[jira] [Commented] (DRILL-2658) Add ilike and regex substring functions

2015-04-15 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496638#comment-14496638
 ] 

Deneche A. Hakim commented on DRILL-2658:
-

+1

> Add ilike and regex substring functions
> ---
>
> Key: DRILL-2658
> URL: https://issues.apache.org/jira/browse/DRILL-2658
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Functions - Drill
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Fix For: 1.0.0
>
> Attachments: DRILL-2658.patch, DRILL-2658.patch
>
>
> This will not modify the parser, so Postgres syntax such as:
> "... where c ILIKE '%ABC%'"
> will not currently be supported. It will simply be a function:
> "... where ILIKE(c, '%ABC%')"
> Same for substring:
> "select substr(c, 'abc')..."
> will be equivalent to the Postgres
> "select substr(c from 'abc')",
> but 'abc' will be treated as a Java regex pattern.





[jira] [Updated] (DRILL-1891) Error message does not get propagated correctly when reading from JSON file

2015-04-15 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-1891:

Fix Version/s: (was: 0.9.0)
   1.0.0

> Error message does not get propagated correctly when reading from JSON file
> ---
>
> Key: DRILL-1891
> URL: https://issues.apache.org/jira/browse/DRILL-1891
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 0.7.0
>Reporter: Victoria Markman
>Assignee: Deneche A. Hakim
> Fix For: 1.0.0
>
>
> I made a mistake in t.json file (extra colon in the last row):
> {code}
> { "a1": 0 , "b1": "a"}
> { "a1": 1 , "b1": "b"}
> { "a1": 2 , "b1": "c"}
> { "a1":: 3 , "b1": "c"}
> {code}
> The error message below pretty much tells me everything that went wrong.
> {code}
> 0: jdbc:drill:schema=dfs> select a1 from `t.json` where a1 is not null;
> Query failed: Query stopped., Unexpected character (':' (code 58)): expected 
> a valid value (number, String, array, object, 'true', 'false' or 'null')
>  at [Source: org.apache.drill.exec.vector.complex.fn.JsonReader@53c10ede; 
> line: 3, column: 9] [ 64182782-ebba-4c6a-a963-005b8cb48339 on 
> atsqa4-133.qa.lab:31010 ]
> {code}
> However, if a result of query above is an input to any other operator, I get 
> this error message:
> {code}
> 0: jdbc:drill:schema=dfs> select a1 from `t.json` where a1 is not null group 
> by a1;
> Query failed: Query failed: Failure while running fragment., You tried to do 
> a batch data read operation when you were in a state of STOP.  You can only 
> do this type of operation when you are in a state of OK or OK_NEW_SCHEMA. [ 
> 955aac65-5e43-4430-baf6-ed6bb8a020d9 on atsqa4-133.qa.lab:31010 ]
> [ 955aac65-5e43-4430-baf6-ed6bb8a020d9 on atsqa4-133.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> Very painful for the user if the query is really complex.
> The same behavior occurs if the file does not exist.





[jira] [Commented] (DRILL-2799) Query fails if directory contains .DS_Store

2015-04-15 Thread Abhishek Girish (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496637#comment-14496637
 ] 

Abhishek Girish commented on DRILL-2799:


Git.Commit.ID: 314e5a2 (Apr 15, 2015)

> Query fails if directory contains .DS_Store
> ---
>
> Key: DRILL-2799
> URL: https://issues.apache.org/jira/browse/DRILL-2799
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Reporter: Abhishek Girish
>Assignee: Jacques Nadeau
>
> On accessing a folder, Mac OS X writes .DS_Store (some metadata) into it. See 
> http://en.wikipedia.org/wiki/.DS_Store 
> When querying such a folder, Drill throws an error. Drill should ignore this 
> file, and this behavior should be configurable. 
> {code:sql}
> > select * from dfs.`/data/json/factbook` limit 1;
> Query failed: DATA_READ ERROR: Error reading JSON. - Invalid UTF-32 character 
> 0x42756431(above 10)  at char #1, byte #7)
> Filename: /data/json/factbook/.DS_Store
> Record: 1
> [f73266e5-3171-4134-a0a8-671af037ddd9 on abhi6.qa.lab:31010]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> Removing the file results in successfully querying the directory. 
> Log Snippet:
> {code}
> 2015-04-15 11:14:04,256 [2ad15592-eb25-eb7a-5e7b-1c93e68171c6:frag:0:0] ERROR 
> o.a.drill.exec.ops.FragmentContext - Fragment Context received failure -- 
> Fragment: 0:0
> org.apache.drill.common.exceptions.DrillUserException: DATA_READ ERROR: Error 
> reading JSON. - Invalid UTF-32 character 0x42756431(above 10)  at char 
> #1, byte #7)
> Filename: /data/json/factbook/.DS_Store
> Record: 1
> [f73266e5-3171-4134-a0a8-671af037ddd9 on abhi6.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.DrillUserException$Builder.build(DrillUserException.java:115)
>  ~[drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.easy.json.JSONRecordReader.handleAndRaise(JSONRecordReader.java:171)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.easy.json.JSONRecordReader.next(JSONRecordReader.java:218)
>  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:170) 
> ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
> {code}





[jira] [Created] (DRILL-2799) Query fails if directory contains .DS_Store

2015-04-15 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-2799:
--

 Summary: Query fails if directory contains .DS_Store
 Key: DRILL-2799
 URL: https://issues.apache.org/jira/browse/DRILL-2799
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Other
Reporter: Abhishek Girish
Assignee: Jacques Nadeau


On accessing a folder, Mac OS X writes .DS_Store (some metadata) into it. See 
http://en.wikipedia.org/wiki/.DS_Store 

When querying such a folder, Drill throws an error. Drill should ignore this 
file, and this behavior should be configurable. 

{code:sql}
> select * from dfs.`/data/json/factbook` limit 1;
Query failed: DATA_READ ERROR: Error reading JSON. - Invalid UTF-32 character 
0x42756431(above 10)  at char #1, byte #7)
Filename: /data/json/factbook/.DS_Store
Record: 1
[f73266e5-3171-4134-a0a8-671af037ddd9 on abhi6.qa.lab:31010]
Error: exception while executing query: Failure while executing query. 
(state=,code=0)
{code}

Removing the file results in successfully querying the directory. 

Log Snippet:
{code}
2015-04-15 11:14:04,256 [2ad15592-eb25-eb7a-5e7b-1c93e68171c6:frag:0:0] ERROR 
o.a.drill.exec.ops.FragmentContext - Fragment Context received failure -- 
Fragment: 0:0
org.apache.drill.common.exceptions.DrillUserException: DATA_READ ERROR: Error 
reading JSON. - Invalid UTF-32 character 0x42756431(above 10)  at char #1, 
byte #7)
Filename: /data/json/factbook/.DS_Store
Record: 1

[f73266e5-3171-4134-a0a8-671af037ddd9 on abhi6.qa.lab:31010]

at 
org.apache.drill.common.exceptions.DrillUserException$Builder.build(DrillUserException.java:115)
 ~[drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
at 
org.apache.drill.exec.store.easy.json.JSONRecordReader.handleAndRaise(JSONRecordReader.java:171)
 ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
at 
org.apache.drill.exec.store.easy.json.JSONRecordReader.next(JSONRecordReader.java:218)
 ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:170) 
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
{code}
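The usual fix is to skip hidden files when expanding a directory. This is a hedged sketch (`HiddenFileFilter` is a hypothetical name): exclude names starting with a dot or an underscore, the convention Hadoop's default input-path filter also follows.

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch: filter out dot-files (like .DS_Store) and underscore-files
// (like _logs) before handing paths to the reader.
public class HiddenFileFilter {
    static boolean isVisible(String fileName) {
        return !fileName.startsWith(".") && !fileName.startsWith("_");
    }

    static List<String> visibleOnly(List<String> fileNames) {
        return fileNames.stream()
                        .filter(HiddenFileFilter::isVisible)
                        .collect(Collectors.toList());
    }
}
```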






[jira] [Updated] (DRILL-2675) Implement a subset of User Exceptions to improve how errors are reported to the user

2015-04-15 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-2675:

Attachment: DRILL-2675.fix.patch.txt

Incremental patch that adds the missing changes from master.

> Implement a subset of User Exceptions to improve how errors are reported to 
> the user
> 
>
> Key: DRILL-2675
> URL: https://issues.apache.org/jira/browse/DRILL-2675
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Flow
>Affects Versions: 0.9.0
>Reporter: Deneche A. Hakim
>Assignee: Jacques Nadeau
> Fix For: 0.9.0
>
> Attachments: DRILL-2675.0.patch.txt, DRILL-2675.1.patch.txt, 
> DRILL-2675.2.patch.txt, DRILL-2675.3.patch.txt, DRILL-2675.4.patch.txt, 
> DRILL-2675.5.patch.txt, DRILL-2675.6.patch.txt, DRILL-2675.7.patch.txt, 
> DRILL-2675.fix.patch.txt
>
>
> Implement a set of the most needed User Exceptions. Each user exception will 
> contain:
> - a meaningful error message
> - optional context information (e.g. which file/row caused the error)
> - errorID and drillbit name, to help users and developers find the original 
> error in the logs
> - a set of suppressed exceptions that were thrown along the original error
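The shape described above can be sketched as a small exception class (`UserExceptionSketch` is a hypothetical name, not Drill's real class): message, optional context, an error id, and the drillbit name travel together to the client.

```java
import java.util.UUID;

// Sketch: a user-facing exception carrying the context pieces listed
// above, with the original error kept as the cause for the logs.
public class UserExceptionSketch extends RuntimeException {
    final String errorId = UUID.randomUUID().toString();
    final String context;
    final String drillbit;

    UserExceptionSketch(String message, String context, String drillbit,
                        Throwable cause) {
        super(message, cause);
        this.context = context;
        this.drillbit = drillbit;
    }

    @Override
    public String getMessage() {
        // Mirror the "[errorId on drillbit]" tail seen in Drill's messages.
        return super.getMessage() + "\n" + context
                + "\n[" + errorId + " on " + drillbit + "]";
    }
}
```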





[jira] [Comment Edited] (DRILL-2680) Assertions, thrown AssertionErrors should include some message text [umbrella/tracking bug]

2015-04-15 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496606#comment-14496606
 ] 

Daniel Barclay (Drill) edited comment on DRILL-2680 at 4/15/15 6:03 PM:


Partial info:

{{grep -r '\bassert\b[^.][^:]*;' . --include="*.java" | grep -v "/target/"}} 
yields about 250 raw locations (raw in the sense of not filtering out cases 
where a message probably isn't really needed)

At the moment I can't find cases with {{throw new AssertionError()}} that I 
thought I remembered seeing.

However, in searching now, I need a number of cases of {{throw new 
UnsupportedOperationException()}} that could use a message (and some of which 
should be some kind of "unexpected-case" exception (e.g., AssertionError)), for 
example:

{noformat}
  public static MajorType overrideMinorType(final MajorType originalMajorType, 
final MinorType overrideMinorType) {
switch (originalMajorType.getMode()) {
  case REPEATED:
return repeated(overrideMinorType);
  case OPTIONAL:
return optional(overrideMinorType);
  case REQUIRED:
return required(overrideMinorType);
  default:
throw new UnsupportedOperationException();
}
  }
{noformat}
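
A minimal sketch of the suggested fix is below: the default case reports the unexpected value instead of throwing a bare exception. The enum and method names here are stand-ins for the real Drill types (MajorType/DataMode), not the actual code.

```java
// Sketch only: DataMode and describe() are illustrative stand-ins for
// Drill's MajorType machinery.
public class ModeSwitchSketch {
    enum DataMode { REPEATED, OPTIONAL, REQUIRED }

    static String describe(DataMode mode) {
        switch (mode) {
            case REPEATED: return "repeated";
            case OPTIONAL: return "optional";
            case REQUIRED: return "required";
            default:
                // Include the unexpected value in the message so the log
                // points directly at the failing case.
                throw new AssertionError("Unexpected DataMode: " + mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(DataMode.OPTIONAL));
    }
}
```

With the value embedded in the message, a stack trace from the field immediately tells the developer which case fell through, instead of only where.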





was (Author: dsbos):
Partial info:

{{grep -r '\bassert\b[^.][^:]*;' . --include="*.java" | grep -v "/target/"}} 
yields about 250 raw locations (raw in the sense of not filtering out cases 
where a message probably isn't really needed)

At the moment I can't find cases with {{throw new AssertionError()}} that I 
thought I remembered seeing.

However, in searching now, I see a number of cases of {{throw new 
UnsupportedOperationException()}} that could use a message (and some of which 
should be some kind of "unexpected-case" exception (e.g., AssertionError)), for 
example:
{{noformat}}
  public static MajorType overrideMinorType(final MajorType originalMajorType, 
final MinorType overrideMinorType) {
switch (originalMajorType.getMode()) {
  case REPEATED:
return repeated(overrideMinorType);
  case OPTIONAL:
return optional(overrideMinorType);
  case REQUIRED:
return required(overrideMinorType);
  default:
throw new UnsupportedOperationException();
}
  }
{{noformat}}




> Assertions, thrown AssertionErrors should include some message text 
> [umbrella/tracking bug]
> ---
>
> Key: DRILL-2680
> URL: https://issues.apache.org/jira/browse/DRILL-2680
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> Drill seems to have a lot of "assert ..." statements and "throw new 
> AssertionError()" cases that include no message text.
> Many of them should at least include a phrase or two about what's wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2680) Assertions, thrown AssertionErrors should include some message text [umbrella/tracking bug]

2015-04-15 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496606#comment-14496606
 ] 

Daniel Barclay (Drill) commented on DRILL-2680:
---

Partial info:

{{grep -r '\bassert\b[^.][^:]*;' . --include="*.java" | grep -v "/target/"}} 
yields about 250 raw locations (raw in the sense of not filtering out cases 
where a message probably isn't really needed)

At the moment I can't find cases with {{throw new AssertionError()}} that I 
thought I remembered seeing.

However, in searching now, I see a number of cases of {{throw new 
UnsupportedOperationException()}} that could use a message (and some of which 
should be some kind of "unexpected-case" exception (e.g., AssertionError)), for 
example:
{{noformat}}
  public static MajorType overrideMinorType(final MajorType originalMajorType, 
final MinorType overrideMinorType) {
switch (originalMajorType.getMode()) {
  case REPEATED:
return repeated(overrideMinorType);
  case OPTIONAL:
return optional(overrideMinorType);
  case REQUIRED:
return required(overrideMinorType);
  default:
throw new UnsupportedOperationException();
}
  }
{{noformat}}




> Assertions, thrown AssertionErrors should include some message text 
> [umbrella/tracking bug]
> ---
>
> Key: DRILL-2680
> URL: https://issues.apache.org/jira/browse/DRILL-2680
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> Drill seems to have a lot of "assert ..." statements and "throw new 
> AssertionError()" cases that include no message text.
> Many of them should at least include a phrase or two about what's wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2798) Suppress log location message from sqlline

2015-04-15 Thread Parth Chandra (JIRA)
Parth Chandra created DRILL-2798:


 Summary: Suppress log location message from sqlline
 Key: DRILL-2798
 URL: https://issues.apache.org/jira/browse/DRILL-2798
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - CLI
Affects Versions: 0.8.0
Reporter: Parth Chandra
Assignee: Patrick Wong
 Fix For: 0.9.0


sqlline now prints a message with the location of the log file, which breaks 
external scripts that use Drill to extract data.
We need to add an option to suppress sqlline shell script messages (or remove 
them).





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2680) Assertions, thrown AssertionErrors should include some message text [umbrella/tracking bug]

2015-04-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-2680:
--
Summary: Assertions, thrown AssertionErrors should include some message 
text [umbrella/tracking bug]  (was: Assertions and thrown AssertionsErrors 
should include some message text)

> Assertions, thrown AssertionErrors should include some message text 
> [umbrella/tracking bug]
> ---
>
> Key: DRILL-2680
> URL: https://issues.apache.org/jira/browse/DRILL-2680
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> Drill seems to have a lot of "assert ..." statements and "throw new 
> AssertionError()" cases that include no message text.
> Many of them should at least include a phrase or two about what's wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2680) Assertions and thrown AssertionsErrors should include some message text

2015-04-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-2680:
--
Issue Type: Task  (was: Bug)

> Assertions and thrown AssertionsErrors should include some message text
> ---
>
> Key: DRILL-2680
> URL: https://issues.apache.org/jira/browse/DRILL-2680
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> Drill seems to have a lot of "assert ..." statements and "throw new 
> AssertionError()" cases that include no message text.
> Many of them should at least include a phrase or two about what's wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

