[jira] [Created] (DRILL-4011) Refactor CopyUtils and remove repeated code

2015-11-03 Thread amit hadke (JIRA)
amit hadke created DRILL-4011:
-

 Summary: Refactor CopyUtils and remove repeated code
 Key: DRILL-4011
 URL: https://issues.apache.org/jira/browse/DRILL-4011
 Project: Apache Drill
  Issue Type: Task
Reporter: amit hadke
Assignee: amit hadke
Priority: Minor


Many operators share the same code to set up copying between incoming and outgoing batches.
Refactor CopyUtils and use it in all operators where possible.
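As an illustration only (Drill's actual CopyUtils and vector APIs differ; all names below are hypothetical), the duplicated setup amounts to pairing each incoming field with an outgoing field and copying values across:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the per-operator copy setup the issue wants to
// centralize: pair an incoming "vector" with an outgoing one and copy
// value by value. Real Drill operators generate this code per vector type.
public class CopyUtilSketch {
    // Copy every value from the incoming batch into a fresh outgoing batch.
    static List<Object> copyAll(List<Object> incoming) {
        List<Object> outgoing = new ArrayList<>(incoming.size());
        for (Object value : incoming) {
            outgoing.add(value); // the loop each operator re-implements today
        }
        return outgoing;
    }

    public static void main(String[] args) {
        System.out.println(copyAll(List.of(1, 2, 3))); // [1, 2, 3]
    }
}
```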



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3769) to_date function with one argument returns wrong data type

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986882#comment-14986882
 ] 

ASF GitHub Bot commented on DRILL-3769:
---

Github user hsuanyi closed the pull request at:

https://github.com/apache/drill/pull/205


> to_date function with one argument returns wrong data type
> --
>
> Key: DRILL-3769
> URL: https://issues.apache.org/jira/browse/DRILL-3769
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Victoria Markman
>Assignee: Sean Hsuan-Yi Chu
> Fix For: Future
>
>
> 1. The to_date function is not part of the SQL standard according to my 
> research (I checked ISO/IEC 9075-2), so implementations may vary from 
> database to database (our implementation of to_date differs from Postgres's, 
> for example).
> 2. Our documentation only describes to_date with two parameters: a format and 
> the string to be converted to a date.
> 3. Calcite does not seem to have to_date, which makes me think this is a 
> Drill UDF.
> 4. Apparently, if you invoke to_date() with one argument in Drill, it runs.
>
> So there are two possibilities: either we implemented to_date with one 
> argument to be compatible with some other SQL engine (Hive?), or it is a bug 
> and we should throw an error.
> You can use to_date with one argument in a simple query:
> {code}
> 0: jdbc:drill:schema=dfs> select to_date(c1 + interval '1' day) from t1 limit 
> 1;
> +-------------+
> |   EXPR$0    |
> +-------------+
> | 2015-01-02  |
> +-------------+
> 1 row selected (0.242 seconds)
> {code}
> However, since the return type is varbinary, joins, aggregations, and CTAS 
> are going to be problematic.
> Here is to_date used in a join to illustrate this (c1 is a date column):
> {code}
> 0: jdbc:drill:schema=dfs> select * from t1, t2 where to_date(t1.c1) = t2.c2;
> Error: SYSTEM ERROR: DrillRuntimeException: Join only supports implicit casts 
> between 1. Numeric data
>  2. Varchar, Varbinary data 3. Date, Timestamp data Left type: DATE, Right 
> type: VAR16CHAR. Add explicit casts to avoid this error
> Fragment 0:0
> [Error Id: 66ac8248-56c5-401a-aa53-de90cb828bc4 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> Since we don't support casting between varbinary and date, an attempt to 
> cast results in:
> {code}
> 0: jdbc:drill:schema=dfs> select * from t1, t2 where cast(to_date(t1.c1) as 
> date) = t2.c2;
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
> Error in expression at index -1.  Error: Missing function implementation: 
> [castBIGINT(VAR16CHAR-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
> Fragment 0:0
> [Error Id: deeb040a-f1d3-4ea0-8849-7ced29508576 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> Same with CTAS:
> {code}
> 0: jdbc:drill:schema=dfs> create table x(a1) as select to_date(c1) from t1;
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 10                         |
> +-----------+----------------------------+
> 1 row selected (0.4 seconds)
> 0: jdbc:drill:schema=dfs> select * from x;
> +--------------+
> |      a1      |
> +--------------+
> | [B@28b5395d  |
> | [B@11c91d8c  |
> | [B@2ab2db73  |
> | [B@446570eb  |
> | [B@5fd87761  |
> | [B@7c85b26f  |
> | [B@2d85d547  |
> | [B@2d753faa  |
> | null         |
> | [B@6ca6c936  |
> +--------------+
> 10 rows selected (0.183 seconds)
> 0: jdbc:drill:schema=dfs> select cast(a1 as date) from x;
> Error: SYSTEM ERROR: IllegalFieldValueException: Value 0 for monthOfYear must 
> be in the range [1,12]
> Fragment 0:0
> [Error Id: 71d8cd8f-6c88-4a13-9d24-b06ef52f6572 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}





[jira] [Commented] (DRILL-3229) Create a new EmbeddedVector

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986926#comment-14986926
 ] 

ASF GitHub Bot commented on DRILL-3229:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/180


> Create a new EmbeddedVector
> ---
>
> Key: DRILL-3229
> URL: https://issues.apache.org/jira/browse/DRILL-3229
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Codegen, Execution - Data Types, Execution - 
> Relational Operators, Functions - Drill
>Reporter: Jacques Nadeau
>Assignee: Hanifi Gunes
> Fix For: Future
>
>
> Embedded Vector will leverage a binary encoding to hold type information for 
> each individual field.





[jira] [Commented] (DRILL-3232) Modify existing vectors to allow type promotion

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986925#comment-14986925
 ] 

ASF GitHub Bot commented on DRILL-3232:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/207


> Modify existing vectors to allow type promotion
> ---
>
> Key: DRILL-3232
> URL: https://issues.apache.org/jira/browse/DRILL-3232
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Codegen, Execution - Data Types, Execution - 
> Relational Operators, Functions - Drill
>Reporter: Steven Phillips
>Assignee: Hanifi Gunes
> Fix For: 1.3.0
>
>
> Support the ability for existing vectors to be promoted, following the 
> supported implicit casting rules.
> For example:
> INT > DOUBLE > STRING > EMBEDDED
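Read the ordering above as a promotion ladder in which the wider type wins. A minimal, hypothetical sketch of that rule (the ladder comes from the issue text; the class and method are illustrative, not Drill's implementation):

```java
import java.util.List;

// Illustrative promotion ladder taken from the issue text; not Drill code.
public class PromotionSketch {
    static final List<String> LADDER = List.of("INT", "DOUBLE", "STRING", "EMBEDDED");

    // Return the wider of two types: the one further along the ladder.
    static String promote(String a, String b) {
        int ia = LADDER.indexOf(a);
        int ib = LADDER.indexOf(b);
        if (ia < 0 || ib < 0) {
            throw new IllegalArgumentException("unknown type");
        }
        return LADDER.get(Math.max(ia, ib));
    }

    public static void main(String[] args) {
        System.out.println(promote("INT", "DOUBLE"));  // DOUBLE
        System.out.println(promote("STRING", "INT"));  // STRING
    }
}
```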





[jira] [Updated] (DRILL-4011) Refactor CopyUtils and remove duplicate code

2015-11-03 Thread amit hadke (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

amit hadke updated DRILL-4011:
--
Summary: Refactor CopyUtils and remove duplicate code  (was: Refactor 
CopyUtils and remove repeated code)

> Refactor CopyUtils and remove duplicate code
> 
>
> Key: DRILL-4011
> URL: https://issues.apache.org/jira/browse/DRILL-4011
> Project: Apache Drill
>  Issue Type: Task
>Reporter: amit hadke
>Assignee: amit hadke
>Priority: Minor
>
> Many operators share the same code to set up copying between incoming and 
> outgoing batches. Refactor CopyUtils and use it in all operators where 
> possible.





[jira] [Created] (DRILL-4012) Limit 0 on top of query with kvgen/flatten results in assert

2015-11-03 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-4012:
---

 Summary: Limit 0 on top of query with kvgen/flatten results in assert
 Key: DRILL-4012
 URL: https://issues.apache.org/jira/browse/DRILL-4012
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Reporter: Victoria Markman


missing-map.json
{code}
{
"id": 1,
"m": {"a":1,"b":2}
}
{
"id": 2
}
{
"id": 3,
"m": {"c":3,"d":4}
}
{code}

'limit 0' results in an assert:
{code}
0: jdbc:drill:schema=dfs> select * from (select id, flatten(kvgen(m)) from 
`missing-map.json`) limit 0;
Error: SYSTEM ERROR: ClassCastException: Cannot cast 
org.apache.drill.exec.vector.NullableIntVector to 
org.apache.drill.exec.vector.complex.RepeatedValueVector
Fragment 0:0
[Error Id: 046bb4d4-2c54-43ab-9577-cf21542ff8ef on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

'limit 1' works:
{code}
0: jdbc:drill:schema=dfs> select * from (select id, flatten(kvgen(m)) from 
`missing-map.json`) limit 1;
+-----+------------------------+
| id  | EXPR$1                 |
+-----+------------------------+
| 1   | {"key":"a","value":1}  |
+-----+------------------------+
1 row selected (0.247 seconds)
{code}

No limit, subquery only: works
{code}
0: jdbc:drill:schema=dfs> select * from (select id, flatten(kvgen(m)) from 
`json_kvgenflatten/missing-map.json`);
+-----+------------------------+
| id  | EXPR$1                 |
+-----+------------------------+
| 1   | {"key":"a","value":1}  |
| 1   | {"key":"b","value":2}  |
| 3   | {"key":"c","value":3}  |
| 3   | {"key":"d","value":4}  |
+-----+------------------------+
4 rows selected (0.247 seconds)
{code}
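For reference, the semantics of kvgen and flatten used above can be sketched outside Drill (this is illustrative Java, not Drill's implementation): kvgen turns a map into a list of {key, value} entries, and flatten emits one row per entry, which is why id 2 (whose map is missing) contributes no rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch (not Drill code) of what kvgen + flatten compute for the
// repro data: kvgen turns a map into a list of {key, value} entries, and
// flatten emits one output row per entry.
public class KvgenFlattenSketch {
    static List<Map<String, Object>> kvgen(Map<String, Object> m) {
        List<Map<String, Object>> out = new ArrayList<>();
        if (m == null) { return out; }  // missing map -> empty list -> no rows
        for (Map.Entry<String, Object> e : m.entrySet()) {
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("key", e.getKey());
            entry.put("value", e.getValue());
            out.add(entry);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        // flatten: one row per entry
        for (Map<String, Object> row : kvgen(m)) {
            System.out.println(row); // {key=a, value=1} then {key=b, value=2}
        }
    }
}
```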

drillbit.log
{code}
2015-11-03 15:23:20,943 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
atsqa4-136.qa.lab.  Skipping affinity to that host.
2015-11-03 15:23:20,944 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
atsqa4-134.qa.lab.  Skipping affinity to that host.
2015-11-03 15:23:20,944 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Time: 1ms total, 1.530719ms avg, 1ms max.
2015-11-03 15:23:20,944 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Earliest start: 1.744000 μs, Latest start: 1.744000 μs, Average 
start: 1.744000 μs .
2015-11-03 15:23:20,968 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
2015-11-03 15:23:20,968 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: 
State to report: RUNNING
2015-11-03 15:23:20,974 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] WARN  
o.a.d.e.e.ExpressionTreeMaterializer - Unable to find value vector of path 
`EXPR$3`, returning null instance.
2015-11-03 15:23:20,975 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] WARN  
o.a.d.e.e.ExpressionTreeMaterializer - Unable to find value vector of path 
`EXPR$3`, returning null instance.
2015-11-03 15:23:20,976 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: 
State change requested RUNNING --> FAILED
2015-11-03 15:23:20,976 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: 
State change requested FAILED --> FINISHED
2015-11-03 15:23:20,978 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ClassCastException: Cannot 
cast org.apache.drill.exec.vector.NullableIntVector to 
org.apache.drill.exec.vector.complex.RepeatedValueVector

Fragment 0:0

[Error Id: c82c7f59-4dad-47ad-8901-5a2261c81279 on atsqa4-133.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
ClassCastException: Cannot cast org.apache.drill.exec.vector.NullableIntVector 
to org.apache.drill.exec.vector.complex.RepeatedValueVector

Fragment 0:0
[Error Id: c82c7f59-4dad-47ad-8901-5a2261c81279 on atsqa4-133.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
 ~[drill-common-1.2.0.jar:1.2.0]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
 [drill-java-exec-1.2.0.jar:1.2.0]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
 [drill-java-exec-1.2.0.jar:1.2.0]
at 

[jira] [Commented] (DRILL-951) CSV header row should be parsed

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987505#comment-14987505
 ] 

ASF GitHub Bot commented on DRILL-951:
--

Github user jacques-n commented on a diff in the pull request:

https://github.com/apache/drill/pull/232#discussion_r43765441
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/CompliantTextRecordReader.java
 ---
@@ -71,15 +79,14 @@ public CompliantTextRecordReader(FileSplit split, 
DrillFileSystem dfs, FragmentC
   // checks to see if we are querying all columns(star) or individual 
columns
   @Override
   public boolean isStarQuery() {
-if(settings.isUseRepeatedVarChar()) {
-  return super.isStarQuery() || Iterables.tryFind(getColumns(), new 
Predicate<SchemaPath>() {
-@Override
-public boolean apply(@Nullable SchemaPath path) {
-  return path.equals(RepeatedVarCharOutput.COLUMNS);
-}
-  }).isPresent();
-}
-return super.isStarQuery();
+if (super.isStarQuery()) { return true; }
--- End diff --

If we're in header extraction mode, requesting the `columns` column shouldn't 
mean a request for a star query. Only a * should.
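The behavior described in that comment can be sketched as follows (a simplified, hypothetical stand-in, not the actual CompliantTextRecordReader code): a query is a star query if * was selected, and a request for the special `columns` column counts as star only when the reader is in repeated-varchar mode, i.e., not extracting headers:

```java
import java.util.List;

// Simplified stand-in for the star-query check being discussed; the real
// reader works on SchemaPath objects, not strings.
public class StarQuerySketch {
    static boolean isStarQuery(boolean starSelected, boolean useRepeatedVarChar,
                               List<String> columns) {
        if (starSelected) { return true; }
        // In header-extraction mode, `columns` is just a column name, not star.
        if (!useRepeatedVarChar) { return false; }
        return columns.contains("columns");
    }

    public static void main(String[] args) {
        System.out.println(isStarQuery(false, true, List.of("columns")));  // true
        System.out.println(isStarQuery(false, false, List.of("columns"))); // false
    }
}
```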


> CSV header row should be parsed
> ---
>
> Key: DRILL-951
> URL: https://issues.apache.org/jira/browse/DRILL-951
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Text & CSV
>Reporter: Tomer Shiran
>Assignee: Abhijit Pol
> Fix For: Future
>
>
> The CSV reader currently treats the header row like a regular data row. There 
> should be a way (optional?) to treat the header row as the column names.
> I exported this dataset to a CSV: 
> https://data.sfgov.org/Public-Safety/SFPD-Incidents-Previous-Three-Months/tmnf-yvry





[jira] [Commented] (DRILL-4006) As json reader reads a field with empty lists, IOOB could happen

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987640#comment-14987640
 ] 

ASF GitHub Bot commented on DRILL-4006:
---

Github user hsuanyi closed the pull request at:

https://github.com/apache/drill/pull/233


> As json reader reads a field with empty lists, IOOB could happen
> 
>
> Key: DRILL-4006
> URL: https://issues.apache.org/jira/browse/DRILL-4006
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
> Attachments: a.json, b.json, c.json
>
>
> If a field in a json file has many empty lists before a non-empty list, an 
> IOOB (IndexOutOfBounds) exception can occur.
> Running the following query against a folder containing the attached files 
> reproduces the problem:
> {code}
> select a from `folder`
> {code}
> Exception:
> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: 
> index: 4448, length: 4 





[jira] [Updated] (DRILL-4019) Limit 0 with two flatten operators in a query fails with unsupported exception

2015-11-03 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-4019:

Affects Version/s: (was: 1.2.0)

> Limit 0 with two flatten operators in a query fails with unsupported exception
> --
>
> Key: DRILL-4019
> URL: https://issues.apache.org/jira/browse/DRILL-4019
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Victoria Markman
>  Labels: private_branch
>
> I think this is a manifestation of DRILL-2256 in a different SQL scenario.
> test.json
> {code}
> {
> "c1" : [[1,2,3],[10,20,30]]
> }
> {code}
> Two flatten operators with limit zero fail:
> {code}
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]), flatten(c1[1]) from 
> `test.json` limit 0;
> Error: SYSTEM ERROR: UnsupportedOperationException: Unable to get value 
> vector class for minor type [LATE] and mode [OPTIONAL]
> Fragment 0:0
> [Error Id: 0f19b566-ca53-43c7-ae0f-fec39c909cfd on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> Single flatten: works
> {code}
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]) from `test.json` limit 0;
> +---------+
> | EXPR$0  |
> +---------+
> +---------+
> No rows selected (0.258 seconds)
> {code}
> Without limit 0:
> {code}
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]) from `test.json`;
> +---------+
> | EXPR$0  |
> +---------+
> | 1       |
> | 2       |
> | 3       |
> +---------+
> 3 rows selected (0.268 seconds)
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]), flatten(c1[1]) from 
> `test.json`;
> +---------+---------+
> | EXPR$0  | EXPR$1  |
> +---------+---------+
> | 1       | 10      |
> | 1       | 20      |
> | 1       | 30      |
> | 2       | 10      |
> | 2       | 20      |
> | 2       | 30      |
> | 3       | 10      |
> | 3       | 20      |
> | 3       | 30      |
> +---------+---------+
> 9 rows selected (0.26 seconds)
> {code}
> drillbit.log
> {code}
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-135.qa.lab.  Skipping affinity to that host.
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-134.qa.lab.  Skipping affinity to that host.
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 
> using 1 threads. Time: 1ms total, 1.717980ms avg, 1ms max.
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 
> using 1 threads. Earliest start: 1.82 μs, Latest start: 1.82 μs, 
> Average start: 1.82 μs .
> 2015-11-03 17:13:19,263 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2015-11-03 17:13:19,263 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: State to report: RUNNING
> 2015-11-03 17:13:19,269 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: State change requested RUNNING --> 
> FAILED
> 2015-11-03 17:13:19,269 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: State change requested FAILED --> 
> FINISHED
> 2015-11-03 17:13:19,271 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: 
> UnsupportedOperationException: Unable to get value vector class for minor 
> type [LATE] and mode [OPTIONAL]
> Fragment 0:0
> [Error Id: 95cd27b5-065d-4714-a3e0-cd8d8ff6d519 on atsqa4-133.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> UnsupportedOperationException: Unable to get value vector class for minor 
> type [LATE] and mode [OPTIONAL]
> Fragment 0:0
> [Error Id: 95cd27b5-065d-4714-a3e0-cd8d8ff6d519 on atsqa4-133.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
>  ~[drill-common-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
>  [drill-java-exec-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
>  [drill-java-exec-1.2.0.jar:1.2.0]
> at 
> 

[jira] [Updated] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Attachment: DRILL-4020.patch

> The not-equal operator returns incorrect results when used on the HBase row 
> key
> ---
>
> Key: DRILL-4020
> URL: https://issues.apache.org/jira/browse/DRILL-4020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.2.0
> Environment: Drill Sandbox
>Reporter: Akihiko Kusanagi
>Priority: Critical
> Attachments: DRILL-4020.1.patch.txt, DRILL-4020.patch, 
> DRILL-4020.patch, DRILL-4020.patch
>
>
> Create a test HBase table:
> {noformat}
> hbase> create 'table', 'f'
> hbase> put 'table', 'row1', 'f:c', 'value1'
> hbase> put 'table', 'row2', 'f:c', 'value2'
> hbase> put 'table', 'row3', 'f:c', 'value3'
> {noformat}
> The table looks like this:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table`;
> +---------+
> | EXPR$0  |
> +---------+
> | row1    |
> | row2    |
> | row3    |
> +---------+
> 3 rows selected (4.596 seconds)
> {noformat}
> However, this query returns incorrect results when a not-equal operator is 
> used on the row key:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1';
> +---------+
> | EXPR$0  |
> +---------+
> | row1    |
> | row2    |
> | row3    |
> +---------+
> 3 rows selected (0.573 seconds)
> {noformat}
> In the query plan, there is no RowFilter:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=null], columns=[`row_key`]]])
> {noformat}
> When the query has multiple not-equal operators, it works fine:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1' AND row_key <> 'row2';
> +---------+
> | EXPR$0  |
> +---------+
> | row3    |
> +---------+
> 1 row selected (0.255 seconds)
> {noformat}
> In the query plan, a FilterList has two RowFilters with NOT_EQUAL operators:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=FilterList AND (2/2): 
> [RowFilter (NOT_EQUAL, row1), RowFilter (NOT_EQUAL, row2)]], 
> columns=[`row_key`]]])
> {noformat}





[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessarily fired multiple times.

2015-11-03 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987748#comment-14987748
 ] 

Jinfeng Ni commented on DRILL-3765:
---

Caching the result will help. But I suspect it will not help much in cases 
where intermediate filters are pushed down and the partition rule is fired 
against those intermediate filters. Ideally, we only want to apply the 
partition rule against the final filter.

> Partition prune rule is unnecessarily fired multiple times.
> --
>
> Key: DRILL-3765
> URL: https://issues.apache.org/jira/browse/DRILL-3765
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
>
> It seems that the partition prune rule may be fired multiple times, even 
> after the first rule execution has pushed the filter into the scan operator. 
> Since partition pruning has to build vectors to hold the partition / file / 
> directory information, invoking the rule unnecessarily may lead to large 
> memory overhead.
> The Drill planner should avoid unnecessary partition prune rule executions, 
> in order to reduce the chance of hitting an OOM exception while the rule is 
> executed.





[jira] [Commented] (DRILL-4019) Limit 0 with two flatten operators in a query fails with unsupported exception

2015-11-03 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987818#comment-14987818
 ] 

Victoria Markman commented on DRILL-4019:
-

It does work with Drill 1.3.0 before Steven's check-in for the union type 
vector (if this is what you mean, Jacques):

{code}
0: jdbc:drill:schema=dfs> select * from sys.version;
version         1.3.0-SNAPSHOT
commit_id       e4b94a78487f844be4fe71c4b9bf88b16c7f42f7
commit_message  DRILL-3937: Handle the case where min/max columns in metadata cache file are string or binary values.
commit_time     30.10.2015 @ 01:57:33 UTC
build_email     Unknown
build_time      31.10.2015 @ 01:10:48 UTC
1 row selected (0.7 seconds)
{code}

> Limit 0 with two flatten operators in a query fails with unsupported exception
> --
>
> Key: DRILL-4019
> URL: https://issues.apache.org/jira/browse/DRILL-4019
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>
> I think this is a manifestation of DRILL-2256 in a different SQL scenario.
> test.json
> {code}
> {
> "c1" : [[1,2,3],[10,20,30]]
> }
> {code}
> Two flatten operators with limit zero fail:
> {code}
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]), flatten(c1[1]) from 
> `test.json` limit 0;
> Error: SYSTEM ERROR: UnsupportedOperationException: Unable to get value 
> vector class for minor type [LATE] and mode [OPTIONAL]
> Fragment 0:0
> [Error Id: 0f19b566-ca53-43c7-ae0f-fec39c909cfd on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> Single flatten: works
> {code}
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]) from `test.json` limit 0;
> +---------+
> | EXPR$0  |
> +---------+
> +---------+
> No rows selected (0.258 seconds)
> {code}
> Without limit 0:
> {code}
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]) from `test.json`;
> +---------+
> | EXPR$0  |
> +---------+
> | 1       |
> | 2       |
> | 3       |
> +---------+
> 3 rows selected (0.268 seconds)
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]), flatten(c1[1]) from 
> `test.json`;
> +---------+---------+
> | EXPR$0  | EXPR$1  |
> +---------+---------+
> | 1       | 10      |
> | 1       | 20      |
> | 1       | 30      |
> | 2       | 10      |
> | 2       | 20      |
> | 2       | 30      |
> | 3       | 10      |
> | 3       | 20      |
> | 3       | 30      |
> +---------+---------+
> 9 rows selected (0.26 seconds)
> {code}
> drillbit.log
> {code}
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-135.qa.lab.  Skipping affinity to that host.
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-134.qa.lab.  Skipping affinity to that host.
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 
> using 1 threads. Time: 1ms total, 1.717980ms avg, 1ms max.
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 
> using 1 threads. Earliest start: 1.82 μs, Latest start: 1.82 μs, 
> Average start: 1.82 μs .
> 2015-11-03 17:13:19,263 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2015-11-03 17:13:19,263 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: State to report: RUNNING
> 2015-11-03 17:13:19,269 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> 

[jira] [Updated] (DRILL-4013) Umbrella: Add Support for client warnings

2015-11-03 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-4013:
--
Assignee: Abhijit Pol

> Umbrella: Add Support for client warnings
> -
>
> Key: DRILL-4013
> URL: https://issues.apache.org/jira/browse/DRILL-4013
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Client - ODBC, Execution - RPC
>Reporter: Jacques Nadeau
>Assignee: Abhijit Pol
> Fix For: 1.4.0
>
>
> We need to add the capability to support JDBC and ODBC warnings, and we 
> should come up with a design proposal. For v1, I propose that we focus on 
> Screen operator warnings. Future enhancements could leverage this 
> functionality to provide sideband warnings as well.





[jira] [Commented] (DRILL-4006) As json reader reads a field with empty lists, IOOB could happen

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987639#comment-14987639
 ] 

ASF GitHub Bot commented on DRILL-4006:
---

Github user hsuanyi commented on the pull request:

https://github.com/apache/drill/pull/233#issuecomment-153421763
  
Will do after I rebase it


> As json reader reads a field with empty lists, IOOB could happen
> 
>
> Key: DRILL-4006
> URL: https://issues.apache.org/jira/browse/DRILL-4006
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
> Attachments: a.json, b.json, c.json
>
>
> If a field in a json file has many empty lists before a non-empty list, an 
> IOOB (IndexOutOfBounds) exception can occur.
> Running the following query against a folder containing the attached files 
> reproduces the problem:
> {code}
> select a from `folder`
> {code}
> Exception:
> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: 
> index: 4448, length: 4 





[jira] [Commented] (DRILL-4006) As json reader reads a field with empty lists, IOOB could happen

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987644#comment-14987644
 ] 

ASF GitHub Bot commented on DRILL-4006:
---

Github user jacques-n commented on a diff in the pull request:

https://github.com/apache/drill/pull/233#discussion_r43777175
  
--- Diff: exec/java-exec/src/main/codegen/templates/MapWriters.java ---
@@ -110,9 +110,9 @@ public ListWriter list(String name) {
 FieldWriter writer = fields.get(name.toLowerCase());
 if(writer == null) {
   writer = new SingleListWriter(name, container, this);
-  writer.setPosition(${index});
   fields.put(name.toLowerCase(), writer);
 }
+writer.setPosition(${index});
--- End diff --

Your change seems to violate the design pattern of the writers. Position is 
managed via the setPosition call at line 152. If you're seeing problems, you 
should figure out who isn't calling that method. I believe the code before 
your change is correct: the only reason we should have to reposition a writer 
when we retrieve it is that it didn't previously exist and we had to create a 
new one. This is the pattern used across the board for positioning all the 
writers. Position is managed externally, and only creation of a new field 
should cause a position to be set locally.
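The pattern being described can be sketched as follows (class and field names are simplified stand-ins for the generated writer code): position is assigned by the caller through setPosition, and the lookup method positions a writer only when it has to create one:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the writer-position pattern: positions are managed
// externally, and only a brand-new writer is positioned inside the lookup.
public class WriterCacheSketch {
    static class Writer {
        int position;
        void setPosition(int p) { position = p; }
    }

    final Map<String, Writer> fields = new HashMap<>();
    int index; // externally managed current position

    Writer list(String name) {
        Writer w = fields.get(name.toLowerCase());
        if (w == null) {
            w = new Writer();
            w.setPosition(index);             // only a new writer is positioned here
            fields.put(name.toLowerCase(), w);
        }
        return w; // existing writers keep their externally-set position
    }

    public static void main(String[] args) {
        WriterCacheSketch cache = new WriterCacheSketch();
        cache.index = 5;
        System.out.println(cache.list("a").position); // 5
        cache.index = 9;
        // still 5: retrieval does not reposition an existing writer
        System.out.println(cache.list("a").position);
    }
}
```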


> As json reader reads a field with empty lists, IOOB could happen
> 
>
> Key: DRILL-4006
> URL: https://issues.apache.org/jira/browse/DRILL-4006
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
> Attachments: a.json, b.json, c.json
>
>
> If a field in a json file has many empty lists before a non-empty list, an 
> IOOB (IndexOutOfBounds) exception can occur.
> Running the following query against a folder containing the attached files 
> reproduces the problem:
> {code}
> select a from `folder`
> {code}
> Exception:
> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: 
> index: 4448, length: 4 





[jira] [Created] (DRILL-4019) Limit 0 with two flatten operators in a query fails with unsupported exception

2015-11-03 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-4019:
---

 Summary: Limit 0 with two flatten operators in a query fails with 
unsupported exception
 Key: DRILL-4019
 URL: https://issues.apache.org/jira/browse/DRILL-4019
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.2.0
Reporter: Victoria Markman


I think this is manifestation of DRILL-2256 in a different SQL scenario.

test.json
{code}
{
"c1" : [[1,2,3],[10,20,30]]
}
{code}

Two flatten operators with limit zero fails:
{code}
0: jdbc:drill:schema=dfs> select flatten(c1[0]), flatten(c1[1]) from 
`test.json` limit 0;
Error: SYSTEM ERROR: UnsupportedOperationException: Unable to get value vector 
class for minor type [LATE] and mode [OPTIONAL]
Fragment 0:0
[Error Id: 0f19b566-ca53-43c7-ae0f-fec39c909cfd on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

Single flatten: works
{code}
0: jdbc:drill:schema=dfs> select flatten(c1[0]) from `test.json` limit 0;
+---------+
| EXPR$0  |
+---------+
+---------+
No rows selected (0.258 seconds)
{code}

Without limit 0:
{code}
0: jdbc:drill:schema=dfs> select flatten(c1[0]) from `test.json`;
+---------+
| EXPR$0  |
+---------+
| 1       |
| 2       |
| 3       |
+---------+
3 rows selected (0.268 seconds)

0: jdbc:drill:schema=dfs> select flatten(c1[0]), flatten(c1[1]) from 
`test.json`;
+---------+---------+
| EXPR$0  | EXPR$1  |
+---------+---------+
| 1       | 10      |
| 1       | 20      |
| 1       | 30      |
| 2       | 10      |
| 2       | 20      |
| 2       | 30      |
| 3       | 10      |
| 3       | 20      |
| 3       | 30      |
+---------+---------+
9 rows selected (0.26 seconds)
{code}

drillbit.log
{code}
2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
atsqa4-135.qa.lab.  Skipping affinity to that host.
2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
atsqa4-134.qa.lab.  Skipping affinity to that host.
2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Time: 1ms total, 1.717980ms avg, 1ms max.
2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Earliest start: 1.82 μs, Latest start: 1.82 μs, Average 
start: 1.82 μs .
2015-11-03 17:13:19,263 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
2015-11-03 17:13:19,263 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: 
State to report: RUNNING
2015-11-03 17:13:19,269 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: 
State change requested RUNNING --> FAILED
2015-11-03 17:13:19,269 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: 
State change requested FAILED --> FINISHED
2015-11-03 17:13:19,271 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: 
UnsupportedOperationException: Unable to get value vector class for minor type 
[LATE] and mode [OPTIONAL]

Fragment 0:0

[Error Id: 95cd27b5-065d-4714-a3e0-cd8d8ff6d519 on atsqa4-133.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
UnsupportedOperationException: Unable to get value vector class for minor type 
[LATE] and mode [OPTIONAL]

Fragment 0:0

[Error Id: 95cd27b5-065d-4714-a3e0-cd8d8ff6d519 on atsqa4-133.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
 ~[drill-common-1.2.0.jar:1.2.0]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
 [drill-java-exec-1.2.0.jar:1.2.0]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
 [drill-java-exec-1.2.0.jar:1.2.0]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
 [drill-java-exec-1.2.0.jar:1.2.0]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.2.0.jar:1.2.0]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_71]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_71]
at 

[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-03 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987809#comment-14987809
 ] 

Jacques Nadeau commented on DRILL-3765:
---

got it. thx

> Partition prune rule is unnecessary fired multiple times. 
> --
>
> Key: DRILL-3765
> URL: https://issues.apache.org/jira/browse/DRILL-3765
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
>
> It seems that the partition prune rule may be fired multiple times, even 
> after the first rule execution has pushed the filter into the scan operator. 
> Since partition pruning has to build vectors to hold the partition / file / 
> directory information, invoking the partition prune rule unnecessarily may 
> lead to significant memory overhead.
> The Drill planner should avoid firing the partition prune rule unnecessarily, 
> in order to reduce the chance of hitting an OOM exception while the partition 
> prune rule is executed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4012) Limit 0 on top of query with kvg/flatten results in assert

2015-11-03 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987854#comment-14987854
 ] 

Victoria Markman commented on DRILL-4012:
-

This works for me in 1.3.0 as well (pre-union type fix):

{code}
0: jdbc:drill:schema=dfs> select * from (select id, flatten(kvgen(m)) from 
`json_kvgenflatten/missing-map.json`) limit 0;
+-----+---------+
| id  | EXPR$1  |
+-----+---------+
+-----+---------+
No rows selected (0.8 seconds)
{code}

> Limit 0 on top of query with kvg/flatten results in assert
> --
>
> Key: DRILL-4012
> URL: https://issues.apache.org/jira/browse/DRILL-4012
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Victoria Markman
>
> I've found couple of bugs that are very similar, but none of them are quite 
> the same:
> missing-map.json
> {code}
> {
> "id": 1,
> "m": {"a":1,"b":2}
> }
> {
> "id": 2
> }
> {
> "id": 3,
> "m": {"c":3,"d":4}
> }
> {code}
> 'limit 0' results in an assert:
> {code}
> 0: jdbc:drill:schema=dfs> select * from (select id, flatten(kvgen(m)) from 
> `missing-map.json`) limit 0;
> Error: SYSTEM ERROR: ClassCastException: Cannot cast 
> org.apache.drill.exec.vector.NullableIntVector to 
> org.apache.drill.exec.vector.complex.RepeatedValueVector
> Fragment 0:0
> [Error Id: 046bb4d4-2c54-43ab-9577-cf21542ff8ef on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> 'limit 1' works:
> {code}
> 0: jdbc:drill:schema=dfs> select * from (select id, flatten(kvgen(m)) from 
> `missing-map.json`) limit 1;
> +-----+------------------------+
> | id  | EXPR$1                 |
> +-----+------------------------+
> | 1   | {"key":"a","value":1}  |
> +-----+------------------------+
> 1 row selected (0.247 seconds)
> {code}
> No limit, just in subquery: works
> {code}
> 0: jdbc:drill:schema=dfs> select * from (select id, flatten(kvgen(m)) from 
> `json_kvgenflatten/missing-map.json`);
> +-----+------------------------+
> | id  | EXPR$1                 |
> +-----+------------------------+
> | 1   | {"key":"a","value":1}  |
> | 1   | {"key":"b","value":2}  |
> | 3   | {"key":"c","value":3}  |
> | 3   | {"key":"d","value":4}  |
> +-----+------------------------+
> 4 rows selected (0.247 seconds)
> {code}
> drillbit.log
> {code}
> 2015-11-03 15:23:20,943 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-136.qa.lab.  Skipping affinity to that host.
> 2015-11-03 15:23:20,944 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-134.qa.lab.  Skipping affinity to that host.
> 2015-11-03 15:23:20,944 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 
> using 1 threads. Time: 1ms total, 1.530719ms avg, 1ms max.
> 2015-11-03 15:23:20,944 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 
> using 1 threads. Earliest start: 1.744000 μs, Latest start: 1.744000 μs, 
> Average start: 1.744000 μs .
> 2015-11-03 15:23:20,968 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2015-11-03 15:23:20,968 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: State to report: RUNNING
> 2015-11-03 15:23:20,974 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] WARN  
> o.a.d.e.e.ExpressionTreeMaterializer - Unable to find value vector of path 
> `EXPR$3`, returning null instance.
> 2015-11-03 15:23:20,975 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] WARN  
> o.a.d.e.e.ExpressionTreeMaterializer - Unable to find value vector of path 
> `EXPR$3`, returning null instance.
> 2015-11-03 15:23:20,976 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: State change requested RUNNING --> 
> FAILED
> 2015-11-03 15:23:20,976 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: State change requested FAILED --> 
> FINISHED
> 2015-11-03 15:23:20,978 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ClassCastException: 
> Cannot cast org.apache.drill.exec.vector.NullableIntVector to 
> org.apache.drill.exec.vector.complex.RepeatedValueVector
> Fragment 0:0
> [Error Id: c82c7f59-4dad-47ad-8901-5a2261c81279 on atsqa4-133.qa.lab:31010]
> 

[jira] [Issue Comment Deleted] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Comment: was deleted

(was: Created reviewboard )

> The not-equal operator returns incorrect results when used on the HBase row 
> key
> ---
>
> Key: DRILL-4020
> URL: https://issues.apache.org/jira/browse/DRILL-4020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.2.0
> Environment: Drill Sandbox
>Reporter: Akihiko Kusanagi
>Priority: Critical
> Attachments: DRILL-4020.1.patch.txt, DRILL-4020.patch, 
> DRILL-4020.patch, DRILL-4020.patch
>
>
> Create a test HBase table:
> {noformat}
> hbase> create 'table', 'f'
> hbase> put 'table', 'row1', 'f:c', 'value1'
> hbase> put 'table', 'row2', 'f:c', 'value2'
> hbase> put 'table', 'row3', 'f:c', 'value3'
> {noformat}
> The table looks like this:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table`;
> +---------+
> | EXPR$0  |
> +---------+
> | row1    |
> | row2    |
> | row3    |
> +---------+
> 1 row selected (4.596 seconds)
> {noformat}
> However, this query returns incorrect results when a not-equal operator is 
> used on the row key:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1';
> +---------+
> | EXPR$0  |
> +---------+
> | row1    |
> | row2    |
> | row3    |
> +---------+
> 1 row selected (0.573 seconds)
> {noformat}
> In the query plan, there is no RowFilter:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=null], columns=[`row_key`]]])
> {noformat}
> When the query has multiple not-equal operators, it works fine:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1' AND row_key <> 'row2';
> +---------+
> | EXPR$0  |
> +---------+
> | row3    |
> +---------+
> 1 row selected (0.255 seconds)
> {noformat}
> In the query plan, a FilterList has two RowFilters with NOT_EQUAL operators:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=FilterList AND (2/2): 
> [RowFilter (NOT_EQUAL, row1), RowFilter (NOT_EQUAL, row2)]], 
> columns=[`row_key`]]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Attachment: (was: DRILL-4020.patch)

> The not-equal operator returns incorrect results when used on the HBase row 
> key
> ---
>
> Key: DRILL-4020
> URL: https://issues.apache.org/jira/browse/DRILL-4020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.2.0
> Environment: Drill Sandbox
>Reporter: Akihiko Kusanagi
>Priority: Critical
> Attachments: DRILL-4020.1.patch.txt, DRILL-4020.patch, 
> DRILL-4020.patch, DRILL-4020.patch
>
>
> Create a test HBase table:
> {noformat}
> hbase> create 'table', 'f'
> hbase> put 'table', 'row1', 'f:c', 'value1'
> hbase> put 'table', 'row2', 'f:c', 'value2'
> hbase> put 'table', 'row3', 'f:c', 'value3'
> {noformat}
> The table looks like this:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table`;
> +---------+
> | EXPR$0  |
> +---------+
> | row1    |
> | row2    |
> | row3    |
> +---------+
> 1 row selected (4.596 seconds)
> {noformat}
> However, this query returns incorrect results when a not-equal operator is 
> used on the row key:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1';
> +---------+
> | EXPR$0  |
> +---------+
> | row1    |
> | row2    |
> | row3    |
> +---------+
> 1 row selected (0.573 seconds)
> {noformat}
> In the query plan, there is no RowFilter:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=null], columns=[`row_key`]]])
> {noformat}
> When the query has multiple not-equal operators, it works fine:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1' AND row_key <> 'row2';
> +---------+
> | EXPR$0  |
> +---------+
> | row3    |
> +---------+
> 1 row selected (0.255 seconds)
> {noformat}
> In the query plan, a FilterList has two RowFilters with NOT_EQUAL operators:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=FilterList AND (2/2): 
> [RowFilter (NOT_EQUAL, row1), RowFilter (NOT_EQUAL, row2)]], 
> columns=[`row_key`]]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4013) Umbrella: Add Support for client warnings

2015-11-03 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987601#comment-14987601
 ] 

Jacques Nadeau commented on DRILL-4013:
---

[~apol], can you propose an initial design?

> Umbrella: Add Support for client warnings
> -
>
> Key: DRILL-4013
> URL: https://issues.apache.org/jira/browse/DRILL-4013
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Client - ODBC, Execution - RPC
>Reporter: Jacques Nadeau
>Assignee: Abhijit Pol
> Fix For: 1.4.0
>
>
> We need to add a capability to support JDBC and ODBC warnings. We should come 
> up with a proposal for a design. For v1, I propose that we focus on Screen 
> operator warnings. Future enhancements could leverage this functionality to 
> provide sideband warnings as well. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.

2015-11-03 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987626#comment-14987626
 ] 

Jacques Nadeau commented on DRILL-3765:
---

Wouldn't another option be caching the result? It seems like that could be a 
simpler solution.

> Partition prune rule is unnecessary fired multiple times. 
> --
>
> Key: DRILL-3765
> URL: https://issues.apache.org/jira/browse/DRILL-3765
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
>
> It seems that the partition prune rule may be fired multiple times, even 
> after the first rule execution has pushed the filter into the scan operator. 
> Since partition pruning has to build vectors to hold the partition / file / 
> directory information, invoking the partition prune rule unnecessarily may 
> lead to significant memory overhead.
> The Drill planner should avoid firing the partition prune rule unnecessarily, 
> in order to reduce the chance of hitting an OOM exception while the partition 
> prune rule is executed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4006) As json reader reads a field with empty lists, IOOB could happen

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987627#comment-14987627
 ] 

ASF GitHub Bot commented on DRILL-4006:
---

GitHub user hsuanyi opened a pull request:

https://github.com/apache/drill/pull/233

DRILL-4006: Ensure the position of MapWriters is set

This can help eliminate potential IOOB exception

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsuanyi/incubator-drill DRILL-4006

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/233.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #233


commit 4c0048090f56c6e017ffbad20170af46cbf53b41
Author: Hsuan-Yi Chu 
Date:   2015-11-01T00:14:34Z

DRILL-4006: Ensure the position of MapWriters is set

This can help eliminate potential IOOB exception




> As json reader reads a field with empty lists, IOOB could happen
> 
>
> Key: DRILL-4006
> URL: https://issues.apache.org/jira/browse/DRILL-4006
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
> Attachments: a.json, b.json, c.json
>
>
> If a field in a json file has many empty lists before a non-empty list, there 
> could be an IOOB exception.
> Running the following query on the folder with files in the attachment can 
> reproduce the observation:
> {code}
> select a from `folder`
> {code}
> Exception:
> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: 
> index: 4448, length: 4 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Attachment: DRILL-4020.1.patch.txt

Uploaded a patch. This prevents parseTree() from removing a RowFilter when a 
NOT_EQUAL operator is used.
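The failure mode the patch addresses can be illustrated with a much-simplified, purely hypothetical sketch (these are not Drill's actual `HBaseFilterBuilder`/`parseTree` classes): if filter-tree reduction silently drops a lone NOT_EQUAL predicate instead of emitting a row filter for it, the scan degenerates into an unfiltered full scan and returns every row, which matches the incorrect results reported above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a row-key predicate; "EQUAL" and "NOT_EQUAL"
// mirror the comparison operators discussed in the issue.
class RowPredicate {
    final String op;
    final String value;

    RowPredicate(String op, String value) {
        this.op = op;
        this.value = value;
    }

    boolean matches(String rowKey) {
        boolean eq = rowKey.equals(value);
        return op.equals("EQUAL") ? eq : !eq;
    }
}

class ScanSketch {
    // The corrected behavior: every predicate is applied, including a single
    // NOT_EQUAL one. The bug class described above corresponds to the filter
    // list arriving here empty because the lone NOT_EQUAL was pruned away.
    static List<String> scan(List<String> rows, List<RowPredicate> filters) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            boolean pass = true;
            for (RowPredicate f : filters) {
                if (!f.matches(row)) {
                    pass = false;
                    break;
                }
            }
            if (pass) {
                out.add(row);
            }
        }
        return out;
    }
}
```

With the predicate kept, `row_key <> 'row1'` over rows {row1, row2, row3} yields only row2 and row3, matching the expected result in the report.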

> The not-equal operator returns incorrect results when used on the HBase row 
> key
> ---
>
> Key: DRILL-4020
> URL: https://issues.apache.org/jira/browse/DRILL-4020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.2.0
> Environment: Drill Sandbox
>Reporter: Akihiko Kusanagi
>Priority: Critical
> Attachments: DRILL-4020.1.patch.txt
>
>
> Create a test HBase table:
> {noformat}
> hbase> create 'table', 'f'
> hbase> put 'table', 'row1', 'f:c', 'value1'
> hbase> put 'table', 'row2', 'f:c', 'value2'
> hbase> put 'table', 'row3', 'f:c', 'value3'
> {noformat}
> The table looks like this:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table`;
> +---------+
> | EXPR$0  |
> +---------+
> | row1    |
> | row2    |
> | row3    |
> +---------+
> 1 row selected (4.596 seconds)
> {noformat}
> However, this query returns incorrect results when a not-equal operator is 
> used on the row key:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1';
> +---------+
> | EXPR$0  |
> +---------+
> | row1    |
> | row2    |
> | row3    |
> +---------+
> 1 row selected (0.573 seconds)
> {noformat}
> In the query plan, there is no RowFilter:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=null], columns=[`row_key`]]])
> {noformat}
> When the query has multiple not-equal operators, it works fine:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1' AND row_key <> 'row2';
> +---------+
> | EXPR$0  |
> +---------+
> | row3    |
> +---------+
> 1 row selected (0.255 seconds)
> {noformat}
> In the query plan, a FilterList has two RowFilters with NOT_EQUAL operators:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=FilterList AND (2/2): 
> [RowFilter (NOT_EQUAL, row1), RowFilter (NOT_EQUAL, row2)]], 
> columns=[`row_key`]]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3845) Partition sender shouldn't send the "last batch" to a receiver that sent a "receiver finished" to the sender

2015-11-03 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987825#comment-14987825
 ] 

Deneche A. Hakim commented on DRILL-3845:
-

The patch fixes the partition sender so it doesn't send the "last batch" to 
receivers that have already finished, but the real problem seems to be in the 
unordered receiver.

Looking at the senders and receivers, the assumption is that when a receiving 
fragment finishes (e.g. limit cancellation), the receiver sends a "receiver 
finished" message to its sender(s) but still waits for a "last batch" message 
before closing.

The unordered receiver doesn't wait for the "last batch" message. Most of the 
time this is fine because the rpc layer (data server) gracefully handles 
batches that are sent to closed receivers, but in the case of DRILL-2274 the 
"last batch" is sent more than 10 minutes after the receiving fragment closed, 
which causes a "Data not accepted downstream" error because the data server 
couldn't find a receiving fragment (we have a 10-minute cache that keeps 
recently finished fragments).

A proper fix is to make sure the unordered receiver waits for the "last batch" 
before closing.
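The protocol change suggested here can be sketched as follows. This is a hedged illustration with hypothetical names (`UnorderedReceiverSketch`, `onBatch`, `finish` are not Drill's actual classes or methods): after signaling "receiver finished" upstream, the receiver keeps the channel open until the sender's terminal batch arrives, so a late "last batch" never hits a fragment that has already been evicted from the recently-finished cache.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the receiver defers teardown until the sender's
// "last batch" marker has been observed.
class UnorderedReceiverSketch {
    private final CountDownLatch lastBatch = new CountDownLatch(1);
    private volatile boolean closed;

    // Called by the rpc layer for every incoming batch; non-terminal batches
    // arriving after finish() was requested would simply be dropped.
    void onBatch(boolean isLastBatch) {
        if (isLastBatch) {
            lastBatch.countDown();
        }
    }

    // Called when the receiving fragment finishes early (e.g. LIMIT reached).
    void finish() {
        // (omitted) send "receiver finished" upstream so senders stop early
        try {
            // Wait for the sender's terminal batch before tearing down; the
            // timeout is only a safety net against a sender that never replies.
            lastBatch.await(10, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}
```

The design point is ordering, not the latch itself: teardown happens only after the terminal marker, so the sender can never address a receiver that no longer exists.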

> Partition sender shouldn't send the "last batch" to a receiver that sent a 
> "receiver finished" to the sender
> 
>
> Key: DRILL-3845
> URL: https://issues.apache.org/jira/browse/DRILL-3845
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.3.0
>
>
> Even if a receiver has finished and informed the corresponding partition 
> sender, the sender will still try to send a "last batch" to the receiver when 
> it's done. In most cases this is fine as those batches will be silently 
> dropped by the receiving DataServer, but if a receiver has finished +10 
> minutes ago, DataServer will throw an exception as it couldn't find the 
> corresponding FragmentManager (WorkEventBus has a 10 minutes recentlyFinished 
> cache).
> DRILL-2274 is a reproduction for this case (after the corresponding fix is 
> applied).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4019) Limit 0 with two flatten operators in a query fails with unsupported exception

2015-11-03 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-4019:

Labels: private_branch  (was: )

> Limit 0 with two flatten operators in a query fails with unsupported exception
> --
>
> Key: DRILL-4019
> URL: https://issues.apache.org/jira/browse/DRILL-4019
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Victoria Markman
>  Labels: private_branch
>
> I think this is manifestation of DRILL-2256 in a different SQL scenario.
> test.json
> {code}
> {
> "c1" : [[1,2,3],[10,20,30]]
> }
> {code}
> Two flatten operators with limit zero fails:
> {code}
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]), flatten(c1[1]) from 
> `test.json` limit 0;
> Error: SYSTEM ERROR: UnsupportedOperationException: Unable to get value 
> vector class for minor type [LATE] and mode [OPTIONAL]
> Fragment 0:0
> [Error Id: 0f19b566-ca53-43c7-ae0f-fec39c909cfd on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> Single flatten: works
> {code}
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]) from `test.json` limit 0;
> +---------+
> | EXPR$0  |
> +---------+
> +---------+
> No rows selected (0.258 seconds)
> {code}
> Without limit 0:
> {code}
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]) from `test.json`;
> +---------+
> | EXPR$0  |
> +---------+
> | 1       |
> | 2       |
> | 3       |
> +---------+
> 3 rows selected (0.268 seconds)
> 0: jdbc:drill:schema=dfs> select flatten(c1[0]), flatten(c1[1]) from 
> `test.json`;
> +---------+---------+
> | EXPR$0  | EXPR$1  |
> +---------+---------+
> | 1       | 10      |
> | 1       | 20      |
> | 1       | 30      |
> | 2       | 10      |
> | 2       | 20      |
> | 2       | 30      |
> | 3       | 10      |
> | 3       | 20      |
> | 3       | 30      |
> +---------+---------+
> 9 rows selected (0.26 seconds)
> {code}
> drillbit.log
> {code}
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-135.qa.lab.  Skipping affinity to that host.
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-134.qa.lab.  Skipping affinity to that host.
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 
> using 1 threads. Time: 1ms total, 1.717980ms avg, 1ms max.
> 2015-11-03 17:13:19,237 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 
> using 1 threads. Earliest start: 1.82 μs, Latest start: 1.82 μs, 
> Average start: 1.82 μs .
> 2015-11-03 17:13:19,263 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2015-11-03 17:13:19,263 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: State to report: RUNNING
> 2015-11-03 17:13:19,269 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: State change requested RUNNING --> 
> FAILED
> 2015-11-03 17:13:19,269 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 29c714cf-fdc7-2ac9-5fd4-c4fdae636550:0:0: State change requested FAILED --> 
> FINISHED
> 2015-11-03 17:13:19,271 [29c714cf-fdc7-2ac9-5fd4-c4fdae636550:frag:0:0] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: 
> UnsupportedOperationException: Unable to get value vector class for minor 
> type [LATE] and mode [OPTIONAL]
> Fragment 0:0
> [Error Id: 95cd27b5-065d-4714-a3e0-cd8d8ff6d519 on atsqa4-133.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> UnsupportedOperationException: Unable to get value vector class for minor 
> type [LATE] and mode [OPTIONAL]
> Fragment 0:0
> [Error Id: 95cd27b5-065d-4714-a3e0-cd8d8ff6d519 on atsqa4-133.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
>  ~[drill-common-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
>  [drill-java-exec-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
>  [drill-java-exec-1.2.0.jar:1.2.0]
> at 
> 

[jira] [Updated] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Attachment: (was: DRILL-4020.patch)

> The not-equal operator returns incorrect results when used on the HBase row 
> key
> ---
>
> Key: DRILL-4020
> URL: https://issues.apache.org/jira/browse/DRILL-4020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.2.0
> Environment: Drill Sandbox
>Reporter: Akihiko Kusanagi
>Priority: Critical
> Attachments: DRILL-4020.1.patch.txt
>
>
> Create a test HBase table:
> {noformat}
> hbase> create 'table', 'f'
> hbase> put 'table', 'row1', 'f:c', 'value1'
> hbase> put 'table', 'row2', 'f:c', 'value2'
> hbase> put 'table', 'row3', 'f:c', 'value3'
> {noformat}
> The table looks like this:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table`;
> +---------+
> | EXPR$0  |
> +---------+
> | row1    |
> | row2    |
> | row3    |
> +---------+
> 1 row selected (4.596 seconds)
> {noformat}
> However, this query returns incorrect results when a not-equal operator is 
> used on the row key:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1';
> +---------+
> | EXPR$0  |
> +---------+
> | row1    |
> | row2    |
> | row3    |
> +---------+
> 1 row selected (0.573 seconds)
> {noformat}
> In the query plan, there is no RowFilter:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=null], columns=[`row_key`]]])
> {noformat}
> When the query has multiple not-equal operators, it works fine:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1' AND row_key <> 'row2';
> +---------+
> | EXPR$0  |
> +---------+
> | row3    |
> +---------+
> 1 row selected (0.255 seconds)
> {noformat}
> In the query plan, a FilterList has two RowFilters with NOT_EQUAL operators:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=FilterList AND (2/2): 
> [RowFilter (NOT_EQUAL, row1), RowFilter (NOT_EQUAL, row2)]], 
> columns=[`row_key`]]])
> {noformat}
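For reference, the semantics the missing RowFilter should enforce can be sketched client-side. This is a minimal illustration of what a pushed-down RowFilter (NOT_EQUAL, row1) must return, not Drill's actual filter-builder code; the class and method names here are invented:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NotEqualRowKeyFilter {

    /**
     * Mirrors the semantics of HBase's RowFilter(NOT_EQUAL, value) that should
     * be pushed down for a single `row_key <> 'value'` predicate.
     */
    static List<String> filterNotEqual(List<String> rowKeys, String excluded) {
        List<String> kept = new ArrayList<>();
        for (String key : rowKeys) {
            if (!key.equals(excluded)) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row1", "row2", "row3");
        // WHERE row_key <> 'row1' must exclude row1, unlike the buggy plan above
        System.out.println(filterNotEqual(rows, "row1")); // prints [row2, row3]
    }
}
```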



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Attachment: (was: DRILL-4020.patch)



[jira] [Updated] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Attachment: DRILL-4020.patch



[jira] [Commented] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987857#comment-14987857
 ] 

Akihiko Kusanagi commented on DRILL-4020:
-

Created reviewboard 



[jira] [Created] (DRILL-4014) Update Client RPC Protocol to support warnings

2015-11-03 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-4014:
-

 Summary: Update Client RPC Protocol to support warnings
 Key: DRILL-4014
 URL: https://issues.apache.org/jira/browse/DRILL-4014
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Execution - RPC
Reporter: Jacques Nadeau
Assignee: Abhijit Pol






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4018) Schema Change: Screen Operator should filter behaviors unsupported by particular clients

2015-11-03 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-4018:
-

 Summary: Schema Change: Screen Operator should filter behaviors 
unsupported by particular clients
 Key: DRILL-4018
 URL: https://issues.apache.org/jira/browse/DRILL-4018
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Relational Operators
Reporter: Jacques Nadeau
Assignee: amit hadke


In some situations, a client may not be able to accommodate everything Drill can 
produce. We need to update the Screen operator to have configurable behavior 
according to what a client supports. One example is schema change: while Drill 
supports schema changes, many clients can't deal with them. We need to come up 
with a set of schema-change criteria that map to coercion, warning, or error, 
and then implement that filtering within the Screen operator.
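As a rough illustration of the kind of mapping this issue asks for, a screen-side policy could classify each schema change into one of the three outcomes. The criteria names below are hypothetical placeholders, not an agreed design:

```java
public class ScreenSchemaChangePolicy {

    enum Action { COERCE, WARN, ERROR }

    /**
     * Hypothetical classification: pick an outcome for a schema change based on
     * what the connected client tolerates. Defining the real criteria is the
     * work this JIRA proposes.
     */
    static Action classify(boolean clientSupportsSchemaChange, boolean losslesslyCoercible) {
        if (clientSupportsSchemaChange) {
            return Action.WARN;   // client copes; surface a warning and continue
        }
        if (losslesslyCoercible) {
            return Action.COERCE; // e.g. widen INT to BIGINT transparently
        }
        return Action.ERROR;      // client cannot handle the change at all
    }

    public static void main(String[] args) {
        System.out.println(classify(true, false));  // WARN
        System.out.println(classify(false, true));  // COERCE
        System.out.println(classify(false, false)); // ERROR
    }
}
```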



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4006) As json reader reads a field with empty lists, IOOB could happen

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987635#comment-14987635
 ] 

ASF GitHub Bot commented on DRILL-4006:
---

Github user zfong commented on the pull request:

https://github.com/apache/drill/pull/233#issuecomment-153421208
  
Is it possible to add a unit test for this problem?  Seems like a pretty 
straightforward repro.  Thanks.


> As json reader reads a field with empty lists, IOOB could happen
> 
>
> Key: DRILL-4006
> URL: https://issues.apache.org/jira/browse/DRILL-4006
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
> Attachments: a.json, b.json, c.json
>
>
> If a field in a json file has many empty lists before a non-empty list, there 
> could be an IOOB exception.
> Running the following query on the folder with files in the attachment can 
> reproduce the observation:
> {code}
> select a from `folder`
> {code}
> Exception:
> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: 
> index: 4448, length: 4 
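The attached files (a.json, b.json, c.json) are not reproduced in this digest, but from the description the failing input presumably has the shape below: many records whose field holds an empty list, followed by one with a non-empty list. This is a guess at the data, not the actual attachments:

```json
{"a": []}
{"a": []}
{"a": []}
{"a": [1, 2, 3]}
```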



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-3845) Partition sender shouldn't send the "last batch" to a receiver that sent a "receiver finished" to the sender

2015-11-03 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim reassigned DRILL-3845:
---

Assignee: Deneche A. Hakim  (was: Venki Korukanti)

> Partition sender shouldn't send the "last batch" to a receiver that sent a 
> "receiver finished" to the sender
> 
>
> Key: DRILL-3845
> URL: https://issues.apache.org/jira/browse/DRILL-3845
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.3.0
>
>
> Even if a receiver has finished and informed the corresponding partition 
> sender, the sender will still try to send a "last batch" to the receiver when 
> it's done. In most cases this is fine, as those batches are silently dropped 
> by the receiving DataServer, but if the receiver finished more than 10 
> minutes earlier, DataServer will throw an exception because it can't find the 
> corresponding FragmentManager (WorkEventBus has a 10-minute recentlyFinished 
> cache).
> DRILL-2274 is a reproduction of this case (after the corresponding fix is 
> applied).
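A minimal sketch of the proposed behavior: the sender tracks which receivers have already reported finished and skips the terminal batch for them. The names are illustrative, not Drill's actual partition-sender internals:

```java
import java.util.HashSet;
import java.util.Set;

public class LastBatchGate {

    private final Set<Integer> finishedReceivers = new HashSet<>();

    /** Called when a "receiver finished" message arrives from a receiver. */
    void markFinished(int receiverId) {
        finishedReceivers.add(receiverId);
    }

    /** The sender should only ship the last batch to receivers still listening. */
    boolean shouldSendLastBatch(int receiverId) {
        return !finishedReceivers.contains(receiverId);
    }

    public static void main(String[] args) {
        LastBatchGate gate = new LastBatchGate();
        gate.markFinished(1);
        System.out.println(gate.shouldSendLastBatch(1)); // false: already finished
        System.out.println(gate.shouldSendLastBatch(2)); // true: still listening
    }
}
```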



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Attachment: DRILL-4020.patch



[jira] [Issue Comment Deleted] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Comment: was deleted

(was: Created reviewboard )



[jira] [Issue Comment Deleted] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Comment: was deleted

(was: Created reviewboard )



[jira] [Created] (DRILL-4013) Umbrella: Add Support for client warnings

2015-11-03 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-4013:
-

 Summary: Umbrella: Add Support for client warnings
 Key: DRILL-4013
 URL: https://issues.apache.org/jira/browse/DRILL-4013
 Project: Apache Drill
  Issue Type: New Feature
  Components: Client - JDBC, Client - ODBC, Execution - RPC
Reporter: Jacques Nadeau
 Fix For: 1.4.0


We need to add a capability to support JDBC and ODBC warnings. We should come 
up with a proposal for a design. For v1, I propose that we focus on Screen 
operator warnings. Future enhancements could leverage this functionality to 
provide sideband warnings as well. 
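On the JDBC side the standard vehicle for this already exists: `java.sql.SQLWarning` chains exposed via `Statement.getWarnings()`. A small sketch of building and walking such a chain, with invented warning texts:

```java
import java.sql.SQLWarning;

public class WarningChainDemo {

    /** Walks a JDBC warning chain and joins the messages. */
    static String describe(SQLWarning warning) {
        StringBuilder sb = new StringBuilder();
        for (SQLWarning w = warning; w != null; w = w.getNextWarning()) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(w.getMessage());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SQLWarning first = new SQLWarning("schema change on column `a`");
        first.setNextWarning(new SQLWarning("implicit cast applied"));
        System.out.println(describe(first));
    }
}
```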



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4016) Update C++ Client to support warnings

2015-11-03 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-4016:
-

 Summary: Update C++ Client to support warnings
 Key: DRILL-4016
 URL: https://issues.apache.org/jira/browse/DRILL-4016
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Client - C++
Reporter: Jacques Nadeau






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4015) Update DrillClient and JDBC driver to expose warnings provided via RPC layer

2015-11-03 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-4015:
-

 Summary: Update DrillClient and JDBC driver to expose warnings 
provided via RPC layer
 Key: DRILL-4015
 URL: https://issues.apache.org/jira/browse/DRILL-4015
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Client - JDBC, Execution - RPC
Reporter: Jacques Nadeau
Assignee: Abhijit Pol






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)
Akihiko Kusanagi created DRILL-4020:
---

 Summary: The not-equal operator returns incorrect results when 
used on the HBase row key
 Key: DRILL-4020
 URL: https://issues.apache.org/jira/browse/DRILL-4020
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HBase
Affects Versions: 1.2.0
 Environment: Drill Sandbox
Reporter: Akihiko Kusanagi
Priority: Critical


Create a test HBase table:

hbase> create 'table', 'f'
hbase> put 'table', 'row1', 'f:c', 'value1'
hbase> put 'table', 'row2', 'f:c', 'value2'
hbase> put 'table', 'row3', 'f:c', 'value3'

The table looks like this:

0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
hbase.`table`;
+---------+
| EXPR$0  |
+---------+
| row1    |
| row2    |
| row3    |
+---------+
3 rows selected (4.596 seconds)

However, this query returns incorrect results when a not-equal operator is used 
on the row key:

0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
hbase.`table` WHERE row_key <> 'row1';
+---------+
| EXPR$0  |
+---------+
| row1    |
| row2    |
| row3    |
+---------+
3 rows selected (0.573 seconds)

In the query plan, there is no RowFilter:

00-00Screen
00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
[tableName=table, startRow=, stopRow=, filter=null], columns=[`row_key`]]])

When the query has multiple not-equal operators, it works fine:

0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
hbase.`table` WHERE row_key <> 'row1' AND row_key <> 'row2';
+---------+
| EXPR$0  |
+---------+
| row3    |
+---------+
1 row selected (0.255 seconds)

In the query plan, a FilterList has two RowFilters with NOT_EQUAL operators:

00-00Screen
00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
[tableName=table, startRow=, stopRow=, filter=FilterList AND (2/2): [RowFilter 
(NOT_EQUAL, row1), RowFilter (NOT_EQUAL, row2)]], columns=[`row_key`]]])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987905#comment-14987905
 ] 

Akihiko Kusanagi commented on DRILL-4020:
-

Created reviewboard 



[jira] [Commented] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987925#comment-14987925
 ] 

Akihiko Kusanagi commented on DRILL-4020:
-

Created reviewboard 



[jira] [Commented] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987930#comment-14987930
 ] 

Akihiko Kusanagi commented on DRILL-4020:
-

Created reviewboard 

> The not-equal operator returns incorrect results when used on the HBase row 
> key
> ---
>
> Key: DRILL-4020
> URL: https://issues.apache.org/jira/browse/DRILL-4020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.2.0
> Environment: Drill Sandbox
>Reporter: Akihiko Kusanagi
>Priority: Critical
> Attachments: DRILL-4020.1.patch.txt, DRILL-4020.patch, 
> DRILL-4020.patch, DRILL-4020.patch, DRILL-4020.patch
>
>
> Create a test HBase table:
> {noformat}
> hbase> create 'table', 'f'
> hbase> put 'table', 'row1', 'f:c', 'value1'
> hbase> put 'table', 'row2', 'f:c', 'value2'
> hbase> put 'table', 'row3', 'f:c', 'value3'
> {noformat}
> The table looks like this:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table`;
> +-+
> | EXPR$0  |
> +-+
> | row1|
> | row2|
> | row3|
> +-+
> 1 row selected (4.596 seconds)
> {noformat}
> However, this query returns incorrect results when a not-equal operator is 
> used on the row key:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1';
> +-+
> | EXPR$0  |
> +-+
> | row1|
> | row2|
> | row3|
> +-+
> 1 row selected (0.573 seconds)
> {noformat}
> In the query plan, there is no RowFilter:
> {noformat}
> 00-00Screen
> 00-01  Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=null], columns=[`row_key`]]])
> {noformat}
> When the query has multiple not-equal operators, it works fine:
> {noformat}
> 0: jdbc:drill:zk=maprdemo:5181> SELECT CONVERT_FROM(row_key, 'UTF8') FROM 
> hbase.`table` WHERE row_key <> 'row1' AND row_key <> 'row2';
> +-+
> | EXPR$0  |
> +-+
> | row3|
> +-+
> 1 row selected (0.255 seconds)
> {noformat}
> In the query plan, a FilterList has two RowFilters with NOT_EQUAL operators:
> {noformat}
> 00-00    Screen
> 00-01      Project(EXPR$0=[CONVERT_FROMUTF8($0)])
> 00-02        Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=table, startRow=, stopRow=, filter=FilterList AND (2/2): 
> [RowFilter (NOT_EQUAL, row1), RowFilter (NOT_EQUAL, row2)]], 
> columns=[`row_key`]]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
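The expected behavior of the single not-equal predicate can be sketched in plain Java. This is a toy model of row-key predicate evaluation, not Drill's HBase pushdown code; `RowKeyFilterSketch` and `notEqual` are illustrative names standing in for the RowFilter the plan above is missing:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RowKeyFilterSketch {
    // Toy stand-in for RowFilter(NOT_EQUAL, ...): keep every row key
    // except the excluded one.
    static List<String> notEqual(List<String> rowKeys, String excluded) {
        return rowKeys.stream()
                      .filter(k -> !k.equals(excluded))
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("row1", "row2", "row3");
        // Expected result of WHERE row_key <> 'row1'
        // (the reported bug returns all three rows instead)
        System.out.println(notEqual(keys, "row1")); // [row2, row3]
    }
}
```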


[jira] [Updated] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Attachment: DRILL-4020.patch

> The not-equal operator returns incorrect results when used on the HBase row 
> key
> ---
>
> Key: DRILL-4020
> URL: https://issues.apache.org/jira/browse/DRILL-4020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.2.0
> Environment: Drill Sandbox
>Reporter: Akihiko Kusanagi
>Priority: Critical
> Attachments: DRILL-4020.1.patch.txt, DRILL-4020.patch, 
> DRILL-4020.patch, DRILL-4020.patch, DRILL-4020.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (DRILL-3765) Partition prune rule is unnecessarily fired multiple times.

2015-11-03 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha reopened DRILL-3765:
---

Re-opening since this is becoming a performance issue.  Discussed briefly with 
[~jni].  The Volcano planner's repeated invocations of the PruneScanRule during 
search exploration are too heavy-weight, since the rule itself is complex.  We 
should ideally move the pruning rules to the Hep planner, as long as we ensure 
that associated rules such as filter push-down and projection push-down are 
also applied before and after pruning.  

> Partition prune rule is unnecessarily fired multiple times. 
> --
>
> Key: DRILL-3765
> URL: https://issues.apache.org/jira/browse/DRILL-3765
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
>
> It seems that the partition prune rule may be fired multiple times, even 
> after the first rule execution has pushed the filter into the scan operator. 
> Since partition pruning has to build vectors to hold the partition / file 
> / directory information, invoking the partition prune rule unnecessarily may 
> lead to significant memory overhead.
> The Drill planner should avoid unnecessary partition prune rule firings, in 
> order to reduce the chance of hitting an OOM exception while the rule is 
> executed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
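One way to picture the fix is a guard that lets the expensive pruning work run at most once per scan, however many times the planner fires the rule. This is a hypothetical sketch, not Drill's planner API; the scan-identity string and the `onMatch` name are assumptions:

```java
import java.util.HashSet;
import java.util.Set;

public class PruneOnceSketch {
    // Remember which scans have already been pruned so a repeated rule
    // invocation becomes a cheap no-op instead of rebuilding the
    // partition/file/directory vectors each time.
    private final Set<String> prunedScans = new HashSet<>();
    int expensiveRuns = 0;

    boolean onMatch(String scanId) {
        if (!prunedScans.add(scanId)) {
            return false;          // already pruned: skip the heavy work
        }
        expensiveRuns++;           // stands in for building partition vectors
        return true;
    }

    public static void main(String[] args) {
        PruneOnceSketch rule = new PruneOnceSketch();
        // The Volcano planner may fire the rule several times for one scan
        rule.onMatch("scan-1");
        rule.onMatch("scan-1");
        rule.onMatch("scan-1");
        System.out.println(rule.expensiveRuns); // 1
    }
}
```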


[jira] [Created] (DRILL-4017) Create new control status message to alert foreman of warnings from fragments.

2015-11-03 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-4017:
-

 Summary: Create new control status message to alert foreman of 
warnings from fragments.
 Key: DRILL-4017
 URL: https://issues.apache.org/jira/browse/DRILL-4017
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Execution - Flow
Reporter: Jacques Nadeau






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (DRILL-4020) The not-equal operator returns incorrect results when used on the HBase row key

2015-11-03 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4020:

Comment: was deleted

(was: Created reviewboard )

> The not-equal operator returns incorrect results when used on the HBase row 
> key
> ---
>
> Key: DRILL-4020
> URL: https://issues.apache.org/jira/browse/DRILL-4020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.2.0
> Environment: Drill Sandbox
>Reporter: Akihiko Kusanagi
>Priority: Critical
> Attachments: DRILL-4020.1.patch.txt, DRILL-4020.patch, 
> DRILL-4020.patch, DRILL-4020.patch, DRILL-4020.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3662) Improve DefaultFrameTemplate structure

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988244#comment-14988244
 ] 

ASF GitHub Bot commented on DRILL-3662:
---

Github user amansinha100 commented on the pull request:

https://github.com/apache/drill/pull/222#issuecomment-153504706
  
+1.  Per previous comments, there is room for improvement in terms of code 
organization and reducing the number of passes, which I understand will be 
covered by DRILL-3662.  


> Improve DefaultFrameTemplate structure
> --
>
> Key: DRILL-3662
> URL: https://issues.apache.org/jira/browse/DRILL-3662
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
>  Labels: window_function
> Fix For: 1.3.0
>
>
> as part of the review for DRILL-3536 two main comments were left aside and 
> will be addressed by this JIRA:
> - refactor DefaultFrameTemplate into separate templates, each one handles a 
> different "family" of window functions (e.g. aggregate vs ranking)
> - setupEvaluatePeer() should not be called more than once per incoming batch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-3171) Storage Plugins : Two processes tried to update the storage plugin at the same time

2015-11-03 Thread Hanifi Gunes (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanifi Gunes reassigned DRILL-3171:
---

Assignee: Hanifi Gunes

> Storage Plugins : Two processes tried to update the storage plugin at the 
> same time
> ---
>
> Key: DRILL-3171
> URL: https://issues.apache.org/jira/browse/DRILL-3171
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Affects Versions: 1.0.0
>Reporter: Rahul Challapalli
>Assignee: Hanifi Gunes
>  Labels: test
> Fix For: Future
>
>
> Commit Id# : bd8ac4fca03ad5043bca27fbc7e0dec5a35ac474
> We have seen this issue happen with the below steps
>1. Clear out the zookeeper
>2. Update the storage plugin using the rest API on one of the node
>3. Submit 10 queries concurrently
> With randomized foreman node selection, the node executing the query might 
> not have the updated storage plugins info. This could be causing the issue.
> - Rahul



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3423) Add New HTTPD format plugin

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988295#comment-14988295
 ] 

ASF GitHub Bot commented on DRILL-3423:
---

GitHub user kingmesal opened a pull request:

https://github.com/apache/drill/pull/234

DRILL-3423

This pull request is waiting for maven central to have the latest parser 
library (2.3) which it depends on: 
http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22httpdlog-parser%22

This was passing all tests in my build environment.

This storage format plugin has complete support for:
- Full Parsing Pushdown
- Type Remapping
- Maps (e.g. like query string parameters)
- Multiple log formats in the same storage definition

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kingmesal/drill DRILL-3423

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/234.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #234






> Add New HTTPD format plugin
> ---
>
> Key: DRILL-3423
> URL: https://issues.apache.org/jira/browse/DRILL-3423
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Reporter: Jacques Nadeau
>Assignee: Jim Scott
> Fix For: 1.3.0
>
>
> Add an HTTPD logparser-based format plugin.  The author has been kind enough 
> to release the logparser project under the Apache License.  You can 
> find it here:
> 
> {code}
> <dependency>
>   <groupId>nl.basjes.parse.httpdlog</groupId>
>   <artifactId>httpdlog-parser</artifactId>
>   <version>2.0</version>
> </dependency>
> {code}
> 
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4022) Negation of an interval results in a NPE

2015-11-03 Thread Krystal (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krystal updated DRILL-4022:
---
Labels: interval  (was: )

> Negation of an interval results in a NPE
> 
>
> Key: DRILL-4022
> URL: https://issues.apache.org/jira/browse/DRILL-4022
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Krystal
>  Labels: interval
>
> Putting a minus sign before "interval" in a query causes NPE.
> select -interval '4:59' hour to minute from basic limit 1;
> Error: SYSTEM ERROR: NullPointerException



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
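For reference, the semantics the failing query should have can be expressed with `java.time`, which negates a 4-hour-59-minute interval without trouble. This uses the JDK API, not Drill's interval vectors:

```java
import java.time.Duration;

public class IntervalNegation {
    public static void main(String[] args) {
        // Expected meaning of: select -interval '4:59' hour to minute ...
        Duration interval = Duration.ofHours(4).plusMinutes(59);
        Duration negated = interval.negated();
        System.out.println(negated); // PT-4H-59M
    }
}
```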


[jira] [Commented] (DRILL-4021) Cannot subtract or add between two timestamps

2015-11-03 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988403#comment-14988403
 ] 

Julian Hyde commented on DRILL-4021:


I'll ask a rhetorical question: If you subtract two timestamps, what units 
would you expect the result to have? Milliseconds? Days?

Standard SQL says that if you subtract two date-time values you need to say 
what type of interval you want back (e.g. an interval in seconds). Thus you must 
write {code}t2 - t1 second{code} or {code}t2 - t1 year to month{code}.

Standard SQL does not allow you to add two date-time values at all. What would you 
expect 2015-01-01 + 2015-01-01 to return? I suppose you could say 'a value 
twice as far from the 1970-01-01 epoch as 2015-01-01' but then you are assuming 
an epoch.

If you want to add to a date-time value, add an interval. {code}date 
'2015-01-01' + interval '2' year{code} should work.

So, this is not a bug.

> Cannot subtract or add between two timestamps
> 
>
> Key: DRILL-4021
> URL: https://issues.apache.org/jira/browse/DRILL-4021
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Krystal
>
> The following subtraction between two now() functions works:
> select now() - now() from voter_hive limit 1;
> +---------+
> | EXPR$0  |
> +---------+
> | PT0S    |
> +---------+
>  
> However, the following queries fail:
> select now() - create_time from voter_hive where voter_id=1;
> Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 26: Cannot 
> apply '-' to arguments of type ' - '. Supported form(s): 
> ' - '
> ' - '
> ' - '
> select create_time - cast('1997-02-12 15:18:31.072' as timestamp) from 
> voter_hive where voter_id=1;
> Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 65: Cannot 
> apply '-' to arguments of type ' - '. Supported 
> form(s): ' - '
> ' - '
> ' - '



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
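Julian's point about choosing a result unit is exactly what `java.time` makes explicit: subtracting two timestamps yields a `Duration`, and you then ask for the unit you want. A minimal sketch using JDK APIs only, with illustrative timestamps:

```java
import java.time.Duration;
import java.time.LocalDateTime;

public class TimestampDifference {
    // Subtracting two timestamps only makes sense once a result unit is
    // chosen; java.time returns a Duration and defers the unit to the caller.
    static Duration difference(String t1, String t2) {
        return Duration.between(LocalDateTime.parse(t1),
                                LocalDateTime.parse(t2));
    }

    public static void main(String[] args) {
        Duration diff = difference("1997-02-12T15:18:31.072",
                                   "1997-02-13T15:18:31.072");
        System.out.println(diff.toHours()); // 24
        // Note there is no Duration "addition" of two timestamps -- only
        // timestamp + interval, mirroring standard SQL.
    }
}
```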


[jira] [Created] (DRILL-4023) getDriverVersion of DatabaseMetaData interface does not return JDBC driver version

2015-11-03 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4023:
-

 Summary: getDriverVersion of DatabaseMetaData interface does not 
return JDBC driver version
 Key: DRILL-4023
 URL: https://issues.apache.org/jira/browse/DRILL-4023
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC
Affects Versions: 1.3.0
Reporter: Khurram Faraaz


getDriverVersion of the DatabaseMetaData interface does not return the JDBC 
driver version; instead it reports that the apache-drill-jdbc.properties file 
is not loaded.

{code}

DatabaseMetaData meta = conn.getMetaData();
System.out.println("JDBC driver version is " + meta.getDriverVersion());

Output returned by call to getDriverVersion function is,


Properties file, apache-drill-jdbc.properties does not exist in the Drill 
project 

Admin:drill kfaraaz$ find . -name *.properties
./common/target/classes/git.properties
./contrib/data/target/classes/git.properties
./contrib/data/tpch-sample-data/target/classes/git.properties
./contrib/target/classes/git.properties
./exec/java-exec/target/classes/git.properties
./exec/target/classes/git.properties
./git.properties
./protocol/target/classes/git.properties
./target/classes/git.properties
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
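A sketch of the kind of lookup involved: reading a driver version from a bundled properties resource and falling back when the file is absent from the classpath. The resource name comes from the report above; the `driver.version` key and all class and method names here are assumptions, not Drill's actual implementation:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class DriverVersionSketch {
    // Read a version string from a properties resource on the classpath,
    // falling back gracefully when the resource (here, the missing
    // apache-drill-jdbc.properties) is not packaged into the jar.
    static String driverVersion(String resource) {
        try (InputStream in =
                 DriverVersionSketch.class.getResourceAsStream(resource)) {
            if (in == null) {
                return "unknown (" + resource + " not found)";
            }
            Properties props = new Properties();
            props.load(in);
            return props.getProperty("driver.version", "unknown");
        } catch (IOException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // The resource is absent, so the fallback branch is taken
        System.out.println(driverVersion("/apache-drill-jdbc.properties"));
    }
}
```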


[jira] [Updated] (DRILL-3952) Improve Window Functions performance when not all batches are required to process the current batch

2015-11-03 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-3952:
--
Assignee: Deneche A. Hakim  (was: Aman Sinha)

> Improve Window Functions performance when not all batches are required to 
> process the current batch
> ---
>
> Key: DRILL-3952
> URL: https://issues.apache.org/jira/browse/DRILL-3952
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.3.0
>
>
> Currently, the window operator blocks until all batches of the current 
> partition are available. For some queries this is necessary (e.g. an 
> aggregate with no order-by in the window definition), but in other cases the 
> window operator can process and pass the current batch downstream sooner.
> Implementing this should help the window operator use less memory and run 
> faster, especially in the presence of a limit operator.
> The purpose of this JIRA is to improve the window operator in the following 
> cases:
> - aggregate, when order-by clause is available in window definition, can 
> process current batch as soon as it receives the last peer row
> - lead can process current batch as soon as it receives 1 more batch
> - lag can process current batch immediately
> - first_value can process current batch immediately
> - last_value, when order-by clause is available in window definition, can 
> process current batch as soon as it receives the last peer row
> - row_number, rank and dense_rank can process current batch immediately 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
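The "process immediately" cases can be illustrated with row_number: each incoming batch can be numbered and emitted in a single pass, with no buffering of the whole partition. A toy model in plain Java, not the window operator's real code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StreamingWindowSketch {
    // row_number needs only a running counter, so every batch can be
    // processed and emitted as soon as it arrives.
    static List<String> rowNumbers(List<List<String>> batches) {
        List<String> out = new ArrayList<>();
        long rowNum = 0;
        for (List<String> batch : batches) {      // one pass per batch
            for (String row : batch) {
                out.add(row + ":" + (++rowNum));  // emit without buffering
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> batches =
            Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"));
        System.out.println(rowNumbers(batches)); // [a:1, b:2, c:3]
    }
}
```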


[jira] [Created] (DRILL-4022) Negation of an interval results in a NPE

2015-11-03 Thread Krystal (JIRA)
Krystal created DRILL-4022:
--

 Summary: Negation of an interval results in a NPE
 Key: DRILL-4022
 URL: https://issues.apache.org/jira/browse/DRILL-4022
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Reporter: Krystal


Putting a minus sign before "interval" in a query causes NPE.

select -interval '4:59' hour to minute from basic limit 1;
Error: SYSTEM ERROR: NullPointerException



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4021) Cannot subtract or add between two timestamps

2015-11-03 Thread Krystal (JIRA)
Krystal created DRILL-4021:
--

 Summary: Cannot subtract or add between two timestamps
 Key: DRILL-4021
 URL: https://issues.apache.org/jira/browse/DRILL-4021
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Reporter: Krystal


The following subtraction between two now() functions works:
select now() - now() from voter_hive limit 1;
+---------+
| EXPR$0  |
+---------+
| PT0S    |
+---------+
 
However, the following queries fail:
select now() - create_time from voter_hive where voter_id=1;
Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 26: Cannot 
apply '-' to arguments of type ' - '. Supported form(s): 
' - '
' - '
' - '

select create_time - cast('1997-02-12 15:18:31.072' as timestamp) from 
voter_hive where voter_id=1;
Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 65: Cannot 
apply '-' to arguments of type ' - '. Supported 
form(s): ' - '
' - '
' - '



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3994) Build Fails on Windows after DRILL-3742

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988037#comment-14988037
 ] 

ASF GitHub Bot commented on DRILL-3994:
---

Github user julienledem commented on the pull request:

https://github.com/apache/drill/pull/226#issuecomment-153475589
  
@jacques-n thanks, I have fixed this one too.



> Build Fails on Windows after DRILL-3742
> ---
>
> Key: DRILL-3994
> URL: https://issues.apache.org/jira/browse/DRILL-3994
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Reporter: Sudheesh Katkam
>Assignee: Julien Le Dem
>Priority: Critical
> Fix For: 1.3.0
>
>
> Build fails on Windows on the latest master:
> {code}
> c:\drill> mvn clean install -DskipTests 
> ...
> [INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 
> approved: 169 licence.
> [INFO] 
> [INFO] <<< exec-maven-plugin:1.2.1:java (default) < validate @ drill-common 
> <<<
> [INFO] 
> [INFO] --- exec-maven-plugin:1.2.1:java (default) @ drill-common ---
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See 
> http://www.slf4j.org/codes.html#StaticLoggerBinder
>  for further details.
> Scanning: C:\drill\common\target\classes
> [WARNING] 
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: 
> file:C:/drill/common/target/classes/ not in 
> [file:/C:/drill/common/target/classes/]
>   at 
> org.apache.drill.common.scanner.BuildTimeScan.main(BuildTimeScan.java:129)
>   ... 6 more
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO] 
> [INFO] Apache Drill Root POM .. SUCCESS [ 10.016 
> s]
> [INFO] tools/Parent Pom ... SUCCESS [  1.062 
> s]
> [INFO] tools/freemarker codegen tooling ... SUCCESS [  6.922 
> s]
> [INFO] Drill Protocol . SUCCESS [ 10.062 
> s]
> [INFO] Common (Logical Plan, Base expressions)  FAILURE [  9.954 
> s]
> [INFO] contrib/Parent Pom . SKIPPED
> [INFO] contrib/data/Parent Pom  SKIPPED
> [INFO] contrib/data/tpch-sample-data .. SKIPPED
> [INFO] exec/Parent Pom  SKIPPED
> [INFO] exec/Java Execution Engine . SKIPPED
> [INFO] exec/JDBC Driver using dependencies  SKIPPED
> [INFO] JDBC JAR with all dependencies . SKIPPED
> [INFO] contrib/mongo-storage-plugin ... SKIPPED
> [INFO] contrib/hbase-storage-plugin ... SKIPPED
> [INFO] contrib/jdbc-storage-plugin  SKIPPED
> [INFO] contrib/hive-storage-plugin/Parent Pom . SKIPPED
> [INFO] contrib/hive-storage-plugin/hive-exec-shaded ... SKIPPED
> [INFO] contrib/hive-storage-plugin/core ... SKIPPED
> [INFO] contrib/drill-gis-plugin ... SKIPPED
> [INFO] Packaging and Distribution Assembly  SKIPPED
> [INFO] contrib/sqlline  SKIPPED
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 38.813 s
> [INFO] Finished at: 2015-10-28T12:17:19-07:00
> [INFO] Final Memory: 67M/466M
> [INFO] 
> 
> [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java 
> (default) on project drill-common: An exception occured while executing the 
> Java class. null: InvocationTargetException: 
> file:C:/drill/common/target/classes/ not in 
> [file:/C:/drill/common/target/classes/] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> 

[jira] [Commented] (DRILL-3994) Build Fails on Windows after DRILL-3742

2015-11-03 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988040#comment-14988040
 ] 

Julien Le Dem commented on DRILL-3994:
--

I have fixed the problem and verified it works.
I have run the build on windows and this part works.
I have other test failures that seem unrelated.
 

> Build Fails on Windows after DRILL-3742
> ---
>
> Key: DRILL-3994
> URL: https://issues.apache.org/jira/browse/DRILL-3994
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Reporter: Sudheesh Katkam
>Assignee: Julien Le Dem
>Priority: Critical
> Fix For: 1.3.0
>
>
> Build fails on Windows on the latest master:
> {code}
> c:\drill> mvn clean install -DskipTests 
> ...
> [INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 
> approved: 169 licence.
> [INFO] 
> [INFO] <<< exec-maven-plugin:1.2.1:java (default) < validate @ drill-common 
> <<<
> [INFO] 
> [INFO] --- exec-maven-plugin:1.2.1:java (default) @ drill-common ---
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See 
> http://www.slf4j.org/codes.html#StaticLoggerBinder
>  for further details.
> Scanning: C:\drill\common\target\classes
> [WARNING] 
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: 
> file:C:/drill/common/target/classes/ not in 
> [file:/C:/drill/common/target/classes/]
>   at 
> org.apache.drill.common.scanner.BuildTimeScan.main(BuildTimeScan.java:129)
>   ... 6 more
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO] 
> [INFO] Apache Drill Root POM .. SUCCESS [ 10.016 
> s]
> [INFO] tools/Parent Pom ... SUCCESS [  1.062 
> s]
> [INFO] tools/freemarker codegen tooling ... SUCCESS [  6.922 
> s]
> [INFO] Drill Protocol . SUCCESS [ 10.062 
> s]
> [INFO] Common (Logical Plan, Base expressions)  FAILURE [  9.954 
> s]
> [INFO] contrib/Parent Pom . SKIPPED
> [INFO] contrib/data/Parent Pom  SKIPPED
> [INFO] contrib/data/tpch-sample-data .. SKIPPED
> [INFO] exec/Parent Pom  SKIPPED
> [INFO] exec/Java Execution Engine . SKIPPED
> [INFO] exec/JDBC Driver using dependencies  SKIPPED
> [INFO] JDBC JAR with all dependencies . SKIPPED
> [INFO] contrib/mongo-storage-plugin ... SKIPPED
> [INFO] contrib/hbase-storage-plugin ... SKIPPED
> [INFO] contrib/jdbc-storage-plugin  SKIPPED
> [INFO] contrib/hive-storage-plugin/Parent Pom . SKIPPED
> [INFO] contrib/hive-storage-plugin/hive-exec-shaded ... SKIPPED
> [INFO] contrib/hive-storage-plugin/core ... SKIPPED
> [INFO] contrib/drill-gis-plugin ... SKIPPED
> [INFO] Packaging and Distribution Assembly  SKIPPED
> [INFO] contrib/sqlline  SKIPPED
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 38.813 s
> [INFO] Finished at: 2015-10-28T12:17:19-07:00
> [INFO] Final Memory: 67M/466M
> [INFO] 
> 
> [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java 
> (default) on project drill-common: An exception occured while executing the 
> Java class. null: InvocationTargetException: 
> file:C:/drill/common/target/classes/ not in 
> [file:/C:/drill/common/target/classes/] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> [ERROR] 
> 
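The mismatch in the error message ("file:C:/... not in [file:/C:/...]") comes down to URI shape: without a slash after the scheme the URI is opaque, with one it is hierarchical, and the two spellings never compare equal. A small JDK-only demonstration:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class FileUriMismatch {
    // "file:C:/..." (no slash after the scheme) parses as an opaque URI,
    // while "file:/C:/..." is hierarchical, so an equality check between
    // the two spellings fails -- the shape of the DRILL-3994 error.
    static boolean sameUri(String a, String b) {
        try {
            return new URI(a).equals(new URI(b));
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) throws URISyntaxException {
        String noSlash = "file:C:/drill/common/target/classes/";
        String withSlash = "file:/C:/drill/common/target/classes/";
        System.out.println(new URI(noSlash).isOpaque());  // true
        System.out.println(sameUri(noSlash, withSlash));  // false
    }
}
```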

[jira] [Commented] (DRILL-3972) Vectorize Parquet Writer

2015-11-03 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987974#comment-14987974
 ] 

Julien Le Dem commented on DRILL-3972:
--

yes just create a branch from master and rebase it as needed.
Tests should pass. If you see flaky tests or things that do not work out of the 
box please report them.
I think you need to create a subclass of CloseableRecordBatch yes.

> Vectorize Parquet Writer
> 
>
> Key: DRILL-3972
> URL: https://issues.apache.org/jira/browse/DRILL-3972
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Reporter: Julien Le Dem
>Assignee: Andrew Musselman
>  Labels: pick-me-up
>
> Currently the 
> [ParquetRecordWriter|https://github.com/apache/drill/blob/a98da39dd5a8fa368afd8765f4e981826bbfcc0f/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java]
>  receives one record at a time and then turns that into columns.
> Which means we convert from Drill columns to rows and then to Parquet columns.
> Instead we could directly convert the Drill columns into Parquet columns in a 
> vectorized manner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4001) Empty vectors from previous batch left by MapVector.load(...)/RecordBatchLoader.load(...)

2015-11-03 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-4001:
--
Description: 
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  This causes 
{{MapVector.getObject(int)}} to fail, saying 
"{{java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 
0)}}" (one of the errors seen in fixing DRILL-2288).

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch-\-other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.



  was:
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  This caused 
IndexOutOfBoundsException errors saying (roughly) "

  (This caused some of the {{IndexOutOfBoundException}} errors seen in fixing 
DRILL-2288.)

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch-\-other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.




> Empty vectors from previous batch left by 
> MapVector.load(...)/RecordBatchLoader.load(...)
> -
>
> Key: DRILL-4001
> URL: https://issues.apache.org/jira/browse/DRILL-4001
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Daniel Barclay (Drill)
>
> In certain cases, {{MapVector.load(...)}} (called by 
> {{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
> length of zero instead of having a length matching the length of sibling 
> vectors and the number of records in the batch.  This causes 
> {{MapVector.getObject(int)}} to fail, saying 
> "{{java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: 
> range(0, 0)}}" (one of the errors seen in fixing DRILL-2288).
> The condition seems to be that a child field (e.g., an HBase column in a 
> HBase column family) appears in an earlier batch and does not appear in a 
> later batch.  
> (The HBase column's child vector gets created (in the MapVector for the HBase 
> column family) during loading of the earlier batch.  During loading of the 
> later batch, all vectors get reset to zero length, and then only vectors for 
> fields _appearing in the batch message being loaded_ get loaded and set to 
> the length of the batch; other vectors created from earlier 
> messages/{{load}} calls are left with a length of zero (instead of, say, 
> being filled with nulls to the length of their siblings and the current 
> record batch).)
> See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2288) ScanBatch violates IterOutcome protocol for zero-row sources [was: missing JDBC metadata (schema) for 0-row results...]

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987981#comment-14987981
 ] 

ASF GitHub Bot commented on DRILL-2288:
---

Github user dsbos commented on a diff in the pull request:

https://github.com/apache/drill/pull/228#discussion_r43797898
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/MapVector.java
 ---
@@ -309,7 +309,14 @@ public Object getObject(int index) {
   Map<String, Object> vv = new JsonStringHashMap<>();
   for (String child:getChildFieldNames()) {
 ValueVector v = getChild(child);
-if (v != null) {
+// TODO(DRILL-4001):  Resolve this hack:
--- End diff --

Added mention of MapVector.getObject(int) and exception message 
"java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 
0))" to DRILL-4001 report.


> ScanBatch violates IterOutcome protocol for zero-row sources [was: missing 
> JDBC metadata (schema) for 0-row results...]
> ---
>
> Key: DRILL-2288
> URL: https://issues.apache.org/jira/browse/DRILL-2288
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
> Attachments: Drill2288NoResultSetMetadataWhenZeroRowsTest.java
>
>
> The ResultSetMetaData object from getMetadata() of a ResultSet is not set up 
> (getColumnCount() returns zero, and trying to access any other metadata 
> throws IndexOutOfBoundsException) for a result set with zero rows, at least 
> for one from DatabaseMetaData.getColumns(...).





[jira] [Commented] (DRILL-2288) ScanBatch violates IterOutcome protocol for zero-row sources [was: missing JDBC metadata (schema) for 0-row results...]

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988153#comment-14988153
 ] 

ASF GitHub Bot commented on DRILL-2288:
---

Github user dsbos commented on a diff in the pull request:

https://github.com/apache/drill/pull/228#discussion_r43808779
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java
 ---
@@ -93,13 +93,15 @@ public void ensureAtLeastOneField(ComplexWriter writer) 
{
 if (!atLeastOneWrite) {
   // if we had no columns, create one empty one so we can return some 
data for count purposes.
   SchemaPath sp = columns.get(0);
-  PathSegment root = sp.getRootSegment();
+  PathSegment fieldPath = sp.getRootSegment();
   BaseWriter.MapWriter fieldWriter = writer.rootAsMap();
-  while (root.getChild() != null && !root.getChild().isArray()) {
-fieldWriter = fieldWriter.map(root.getNameSegment().getPath());
-root = root.getChild();
+  while (fieldPath.getChild() != null && ! 
fieldPath.getChild().isArray()) {
+fieldWriter = 
fieldWriter.map(fieldPath.getNameSegment().getPath());
+fieldPath = fieldPath.getChild();
+  }
+  if (fieldWriter.isEmptyMap()) {
--- End diff --

Recall that ensureAtLeastOneField used to create that dummy 
NullableIntVector field/column (so that the schema didn't end up with zero 
fields/columns) _regardless_ of whether it needed to (i.e., even if the schema 
already had a column or columns from a previous JsonRecordReader for the same 
ScanBatch).  

Under the failure conditions, that blind vector setting frequently replaced 
a NullableVarCharVector with that NullableIntVector, causing either explicit 
"schema change not supported" user-targeted errors or internal errors about the 
mismatch between the NullableIntVector and the expectation of a 
NullableVarCharVector.

The additional check prevents ensureAtLeastOneField from overriding a 
column in an existing non-empty schema and causing those schema changes.
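That guard can be illustrated with a small, hypothetical sketch; a string-keyed map stands in for the real schema/writer machinery, and none of these names are Drill's actual API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: the schema is modeled as a map of column name to
// type name, and the "dummy" column stands in for the NullableIntVector that
// ensureAtLeastOneField creates for count-only queries.
public class EnsureOneFieldDemo {

    // Adds a dummy INT column only when the schema is still empty, so a
    // column set up by a previous reader (e.g. a NullableVarChar) is never
    // replaced -- avoiding spurious "schema change" errors downstream.
    static void ensureAtLeastOneField(Map<String, String> schema, String dummyName) {
        if (schema.isEmpty()) {
            schema.put(dummyName, "NULLABLE_INT");
        }
    }

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("name", "NULLABLE_VARCHAR");   // set by a previous reader
        ensureAtLeastOneField(schema, "name");
        System.out.println(schema.get("name"));   // still NULLABLE_VARCHAR

        Map<String, String> empty = new LinkedHashMap<>();
        ensureAtLeastOneField(empty, "dummy");
        System.out.println(empty.get("dummy"));   // NULLABLE_INT
    }
}
```

The check before the write is the whole fix: the unconditional version overwrote the existing column's type, which is what surfaced as the "schema change not supported" errors.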



> ScanBatch violates IterOutcome protocol for zero-row sources [was: missing 
> JDBC metadata (schema) for 0-row results...]
> ---
>
> Key: DRILL-2288
> URL: https://issues.apache.org/jira/browse/DRILL-2288
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
> Attachments: Drill2288NoResultSetMetadataWhenZeroRowsTest.java
>
>
> The ResultSetMetaData object from getMetadata() of a ResultSet is not set up 
> (getColumnCount() returns zero, and trying to access any other metadata 
> throws IndexOutOfBoundsException) for a result set with zero rows, at least 
> for one from DatabaseMetaData.getColumns(...).





[jira] [Commented] (DRILL-2288) ScanBatch violates IterOutcome protocol for zero-row sources [was: missing JDBC metadata (schema) for 0-row results...]

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987956#comment-14987956
 ] 

ASF GitHub Bot commented on DRILL-2288:
---

Github user dsbos commented on a diff in the pull request:

https://github.com/apache/drill/pull/228#discussion_r43796603
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
 ---
@@ -325,10 +325,13 @@ public AggOutcome doWork() {
 if (EXTRA_DEBUG_1) {
   logger.debug("Received new schema.  Batch has {} 
records.", incoming.getRecordCount());
 }
-newSchema = true;
-this.cleanup();
-// TODO: new schema case needs to be handled appropriately
-return AggOutcome.UPDATE_AGGREGATOR;
+final BatchSchema newIncomingSchema = incoming.getSchema();
+if ((! newIncomingSchema.equals(schema)) && schema != 
null) {
--- End diff --

> You shouldn't put a hack in to ignore a valid schema change.

Yeah, I'm reverting this HashAggTemplate change.

>  I don't know all the details but if this is related to HBase, my thought 
is you probably need
>  to resolve 4010 to avoid spurious schema changes. 

It's not clear whether it's related to HBase or not.  (The relevant test 
failure did occur in an HBase test, but it was in a Jenkins-run cluster test 
that has been hard to reproduce elsewhere.  At one point (I thought) I had a 
reproduction using just JSON files, but now that doesn't seem to work.)

> You should make sure to avoid propagating schema changes that are not 
real/required. 

Related to that, ExternalSortBatch seems to return two OK_NEW_SCHEMA 
batches with schemas differing only in their selection vectors: 

11:23:50.149 [main] TRACE o.a.d.e.p.i.v.IteratorValidatorBatchIterator - 
[#3; on ExternalSortBatch]: incoming next() return: #records = 204, 
  schema:
BatchSchema [fields=[`key1`(INT:REQUIRED), `alt`(BIGINT:REQUIRED)], 
selectionVector=FOUR_BYTE], 
  prev. new (not equal):
BatchSchema [fields=[`key1`(INT:REQUIRED), `alt`(BIGINT:REQUIRED)], 
selectionVector=NONE]

(The earlier schema ("prev. new:") seems to come from setting up the sort, 
and the latter one ("schema:") seems to come from completing the sort (it was 
triggered by a NONE from upstream).)

Is that correct?  (Should the selection vectors have been the same?  Should 
the second return have been OK instead of OK_NEW_SCHEMA?) 

If that is correct (e.g., vectors were changed and should have been), how 
should downstream operators, or at least aggregation, handle that?

Should they be able to handle a "technical" schema change (OK_NEW_SCHEMA 
with possibly-changed vectors or certain details of the schema) that is not a 
logical schema change (not a change in the logical fields or types in the 
aggregated value), even if arbitrary logical schema changes can't be handled 
(e.g., because some don't even make sense)?
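One possible shape for such a "logical" comparison, sketched with hypothetical types (the trace above shows the two schemas differing only in their selection-vector mode, so strict equality reports a change while a fields-only comparison would not):

```java
import java.util.List;
import java.util.Objects;

// Hypothetical model: a schema is its field list plus a selection-vector mode.
public class SchemaCompareDemo {

    enum SelectionVectorMode { NONE, TWO_BYTE, FOUR_BYTE }

    // record equals() compares both components, mirroring a strict comparison.
    record BatchSchema(List<String> fields, SelectionVectorMode svMode) {}

    // "Logical" comparison for downstream operators that only care about the
    // fields and types, not the selection-vector packaging of the batch.
    static boolean sameLogicalSchema(BatchSchema a, BatchSchema b) {
        return Objects.equals(a.fields(), b.fields());
    }

    public static void main(String[] args) {
        BatchSchema before = new BatchSchema(List.of("key1:INT", "alt:BIGINT"),
                                             SelectionVectorMode.NONE);
        BatchSchema after  = new BatchSchema(List.of("key1:INT", "alt:BIGINT"),
                                             SelectionVectorMode.FOUR_BYTE);
        System.out.println(before.equals(after));             // false: strict
        System.out.println(sameLogicalSchema(before, after)); // true: logical
    }
}
```

An operator that used the fields-only comparison could treat the second OK_NEW_SCHEMA as a "technical" change (re-bind vectors, keep aggregation state), while still rejecting genuine changes to fields or types.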




> ScanBatch violates IterOutcome protocol for zero-row sources [was: missing 
> JDBC metadata (schema) for 0-row results...]
> ---
>
> Key: DRILL-2288
> URL: https://issues.apache.org/jira/browse/DRILL-2288
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
> Attachments: Drill2288NoResultSetMetadataWhenZeroRowsTest.java
>
>
> The ResultSetMetaData object from getMetadata() of a ResultSet is not set up 
> (getColumnCount() returns zero, and trying to access any other metadata 
> throws IndexOutOfBoundsException) for a result set with zero rows, at least 
> for one from DatabaseMetaData.getColumns(...).





[jira] [Updated] (DRILL-4001) Empty vectors from previous batch left by MapVector.load(...)/RecordBatchLoader.load(...)

2015-11-03 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-4001:
--
Description: 
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  This caused 
{{IndexOutOfBoundsException}} errors saying (roughly) 
"{{java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))}}".

  (This caused some of the {{IndexOutOfBoundsException}} errors seen in fixing 
DRILL-2288.)

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch; other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.



  was:
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  (This caused some of the 
{{IndexOutOfBoundsException}} errors seen in fixing DRILL-2288.)

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch; other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.




> Empty vectors from previous batch left by 
> MapVector.load(...)/RecordBatchLoader.load(...)
> -
>
> Key: DRILL-4001
> URL: https://issues.apache.org/jira/browse/DRILL-4001
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Daniel Barclay (Drill)
>
> In certain cases, {{MapVector.load(...)}} (called by 
> {{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
> length of zero instead of having a length matching the length of sibling 
> vectors and the number of records in the batch.  This caused 
> {{IndexOutOfBoundsException}} errors saying (roughly) 
> "{{java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))}}".
>   (This caused some of the {{IndexOutOfBoundsException}} errors seen in fixing 
> DRILL-2288.)
> The condition seems to be that a child field (e.g., an HBase column in a 
> HBase column family) appears in an earlier batch and does not appear in a 
> later batch.  
> (The HBase column's child vector gets created (in the MapVector for the HBase 
> column family) during loading of the earlier batch.  During loading of the 
> later batch, all vectors get reset to zero length, and then only vectors for 
> fields _appearing in the batch message being loaded_ get loaded and set to 
> the length of the batch; other vectors created from earlier 
> messages/{{load}} calls are left with a length of zero (instead of, say, 
> being filled with nulls to the length of their siblings and the current 
> record batch).)
> See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.





[jira] [Created] (DRILL-4026) CTAS Auto Partition on a wide varchar column is giving an IllegalReferenceCountException

2015-11-03 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-4026:


 Summary: CTAS Auto Partition on a wide varchar column is giving an 
IllegalReferenceCountException
 Key: DRILL-4026
 URL: https://issues.apache.org/jira/browse/DRILL-4026
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types, Storage - Writer
Reporter: Rahul Challapalli


git.commit.id.abbrev=bb69f22

The below query fails
{code}
create table vc_part partition by (a) as select cast(columns[0] as 
varchar(6000)) a, columns[1] b from dfs.`/drill/testdata/abc.tbl`;
Error: SYSTEM ERROR: IllegalReferenceCountException: refCnt: 0

Fragment 0:0

[Error Id: 8bbfcadb-07bb-468c-a772-24c85cecbcf6 on qa-node191.qa.lab:31010]

  (io.netty.util.IllegalReferenceCountException) refCnt: 0
io.netty.buffer.AbstractByteBuf.ensureAccessible():1178
io.netty.buffer.DrillBuf.checkIndexD():184
io.netty.buffer.DrillBuf.checkBytes():205
org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.compare():101
org.apache.drill.exec.test.generated.ProjectorGen2.doEval():49
org.apache.drill.exec.test.generated.ProjectorGen2.projectRecords():62
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork():173
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():93

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():130
org.apache.drill.exec.record.AbstractRecordBatch.next():156

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():113
org.apache.drill.exec.record.AbstractRecordBatch.next():103
org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():91
org.apache.drill.exec.record.AbstractRecordBatch.next():156

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():113
org.apache.drill.exec.record.AbstractRecordBatch.next():103
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():130
org.apache.drill.exec.record.AbstractRecordBatch.next():156

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
org.apache.drill.exec.physical.impl.BaseRootExec.next():104
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():80
org.apache.drill.exec.physical.impl.BaseRootExec.next():94
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():256
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():250
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():415
org.apache.hadoop.security.UserGroupInformation.doAs():1595
org.apache.drill.exec.work.fragment.FragmentExecutor.run():250
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():745 (state=,code=0)
{code}

The data set contains a widestring (5000 chars) as the first column





[jira] [Commented] (DRILL-4026) CTAS Auto Partition on a wide varchar column is giving an IllegalReferenceCountException

2015-11-03 Thread Rahul Challapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988651#comment-14988651
 ] 

Rahul Challapalli commented on DRILL-4026:
--

Changing the length of the varchar to 7000 instead of 6000 made the query run successfully.

> CTAS Auto Partition on a wide varchar column is giving an 
> IllegalReferenceCountException
> 
>
> Key: DRILL-4026
> URL: https://issues.apache.org/jira/browse/DRILL-4026
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - Writer
>Reporter: Rahul Challapalli
> Attachments: abc.tbl
>
>
> git.commit.id.abbrev=bb69f22
> The below query fails
> {code}
> create table vc_part partition by (a) as select cast(columns[0] as 
> varchar(6000)) a, columns[1] b from dfs.`/drill/testdata/abc.tbl`;
> Error: SYSTEM ERROR: IllegalReferenceCountException: refCnt: 0
> Fragment 0:0
> [Error Id: 8bbfcadb-07bb-468c-a772-24c85cecbcf6 on qa-node191.qa.lab:31010]
>   (io.netty.util.IllegalReferenceCountException) refCnt: 0
> io.netty.buffer.AbstractByteBuf.ensureAccessible():1178
> io.netty.buffer.DrillBuf.checkIndexD():184
> io.netty.buffer.DrillBuf.checkBytes():205
> org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.compare():101
> org.apache.drill.exec.test.generated.ProjectorGen2.doEval():49
> org.apache.drill.exec.test.generated.ProjectorGen2.projectRecords():62
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork():173
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():93
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():130
> org.apache.drill.exec.record.AbstractRecordBatch.next():156
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():113
> org.apache.drill.exec.record.AbstractRecordBatch.next():103
> org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():91
> org.apache.drill.exec.record.AbstractRecordBatch.next():156
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():113
> org.apache.drill.exec.record.AbstractRecordBatch.next():103
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():130
> org.apache.drill.exec.record.AbstractRecordBatch.next():156
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():80
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():256
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():250
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():250
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745 (state=,code=0)
> {code}
> The data set contains a widestring (5000 chars) as the first column





[jira] [Created] (DRILL-4028) Merge Drill parquet modifications back into the mainline project

2015-11-03 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4028:
--

 Summary: Merge Drill parquet modifications back into the mainline 
project
 Key: DRILL-4028
 URL: https://issues.apache.org/jira/browse/DRILL-4028
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.3.0


Drill has been maintaining a fork of Parquet for over a year. Those changes need 
to be merged back into the main repository so we don't have to keep merging 
new changes from the upstream master into the fork.





[jira] [Commented] (DRILL-4028) Merge Drill parquet modifications back into the mainline project

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988748#comment-14988748
 ] 

ASF GitHub Bot commented on DRILL-4028:
---

Github user jaltekruse commented on the pull request:

https://github.com/apache/drill/pull/236#issuecomment-153550233
  
I need to update the POM version number once the parquet changes are 
actually merged and we can publish a new version of the "fork" (which will just 
be us hosting the current tip of parquet with a hard maven version number 
rather than a snapshot; we will upgrade to an official release of parquet when 
one is published by the parquet community). I am doing a final run of the 
regression tests now and will push any changes needed to address any remaining 
failures. I believe this is close to the final version of this work, so I wanted 
to make it available to start the review process. You can see the work done 
to get the parquet fork merged into the mainline here: 
https://github.com/apache/parquet-mr/pull/267


> Merge Drill parquet modifications back into the mainline project
> 
>
> Key: DRILL-4028
> URL: https://issues.apache.org/jira/browse/DRILL-4028
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Reporter: Jason Altekruse
>Assignee: Jason Altekruse
> Fix For: 1.3.0
>
>
> Drill has been maintaining a fork of Parquet for over a year. Those changes 
> need to be merged back into the main repository so we don't have to keep 
> merging new changes from the upstream master into the fork.





[jira] [Commented] (DRILL-3791) Test JDBC plugin with MySQL

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988512#comment-14988512
 ] 

ASF GitHub Bot commented on DRILL-3791:
---

GitHub user aleph-zero opened a pull request:

https://github.com/apache/drill/pull/235

DRILL-3791: JDBC Tests

Adds tests for the JDBC plugin using MySQL and Derby.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aleph-zero/drill issues/JDBC-Testing

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/235.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #235


commit 41d02784b18f6bb25b922d63916978ad43cfa4d2
Author: aleph-zero 
Date:   2015-11-03T23:55:42Z

DRILL-3791: JDBC Tests

Adds tests for the JDBC plugin using MySQL and Derby.




> Test JDBC plugin with MySQL
> ---
>
> Key: DRILL-3791
> URL: https://issues.apache.org/jira/browse/DRILL-3791
> Project: Apache Drill
>  Issue Type: Test
>  Components: Storage - Other
>Reporter: Andrew
>Assignee: Andrew
>Priority: Blocker
>
> Testing the new JDBC storage plugin against MySQL.





[jira] [Commented] (DRILL-4021) Cannot subtract or add between two timestamps

2015-11-03 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988604#comment-14988604
 ] 

Victoria Markman commented on DRILL-4021:
-

Julian,

We need to remove addition from the title; it is not allowed, period :)
However, according to SQL:2011 (ISO/IEC 9075-2:2011(E)), subtraction between 
two timestamps is allowed (see the attached excerpt; hopefully I'm 
understanding it correctly).

It is supported in Postgres 9.3
{code}
postgres=# select c_timestamp, now() - c_timestamp as interval from test;
     c_timestamp     |        interval         
---------------------+-------------------------
 2015-03-01 00:11:15 | 247 days 16:11:45.11652
 2015-03-01 00:20:46 | 247 days 16:02:14.11652
 2015-03-01 00:38:12 | 247 days 15:44:48.11652
 2015-03-01 00:53:00 | 247 days 15:30:00.11652
 2015-03-01 01:07:15 | 247 days 15:15:45.11652
 2015-03-01 01:14:14 | 247 days 15:08:46.11652
 2015-03-01 01:30:41 | 247 days 14:52:19.11652
 2015-03-01 01:42:47 | 247 days 14:40:13.11652
 2015-03-01 01:49:46 | 247 days 14:33:14.11652
 | 
(10 rows)
{code}

It is also supported in drill:
{code}
0: jdbc:drill:schema=dfs> select c_timestamp, now() - c_timestamp from j2;
+------------------------+-------------------+
|      c_timestamp       |      EXPR$1       |
+------------------------+-------------------+
| 2015-03-01 00:11:15.0  | P247DT86100.498S  |
| 2015-03-01 00:20:46.0  | P247DT85529.498S  |
| 2015-03-01 00:38:12.0  | P247DT84483.498S  |
| 2015-03-01 00:53:00.0  | P247DT83595.498S  |
| 2015-03-01 01:07:15.0  | P247DT82740.498S  |
| 2015-03-01 01:14:14.0  | P247DT82321.498S  |
| 2015-03-01 01:30:41.0  | P247DT81334.498S  |
| 2015-03-01 01:42:47.0  | P247DT80608.498S  |
| 2015-03-01 01:49:46.0  | P247DT80189.498S  |
| null                   | null              |
+------------------------+-------------------+
10 rows selected (0.283 seconds)
{code}

j2 is a parquet file:
{code}
[Tue Nov 03 16:10:29 ] # ~/parquet-tools/parquet-schema 0_0_0.parquet 
message root {
  optional binary c_varchar (UTF8);
  optional int32 c_integer;
  optional int64 c_bigint;
  optional float c_float;
  optional double c_double;
  optional int32 c_date (DATE);
  optional int32 c_time (TIME_MILLIS);
  optional int64 c_timestamp (TIMESTAMP_MILLIS);
  optional boolean c_boolean;
  optional double d9;
  optional double d18;
  optional double d28;
  optional double d38;
}
{code}

The reason this case failed for Krystal, I believe, is that a Hive table is a 
"strongly typed data source", and this behavior is inconsistent with the rest 
of our product.

In fact, if you create a drill view with cast to timestamp, you are going to 
get the same error:
{code}
0: jdbc:drill:schema=dfs> create or replace view test_timestamp(c1) as select 
CAST(c_timestamp as TIMESTAMP) from j2;
+-------+------------------------------------------------------------------------+
|  ok   |                                summary                                 |
+-------+------------------------------------------------------------------------+
| true  | View 'test_timestamp' created successfully in 'dfs.subqueries' schema  |
+-------+------------------------------------------------------------------------+
1 row selected (0.337 seconds)

0: jdbc:drill:schema=dfs> select c1, now() - c1 from test_timestamp;
Error: VALIDATION ERROR: From line 1, column 12 to line 1, column 21: Cannot 
apply '-' to arguments of type ' - '. Supported form(s): 
' - '
' - '
' - '
[Error Id: 5928b2fc-0650-4096-a0a3-8be8337a1b8e on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

We do have a problem: Drill behaves inconsistently. [~jnadeau] mentioned during 
our last hangout that when he tried to fix Parquet types to be used during 
planning, he got tons of errors. Jacques, you probably got hundreds of those?

> Cannot subtract or add between two timestamps
> 
>
> Key: DRILL-4021
> URL: https://issues.apache.org/jira/browse/DRILL-4021
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Krystal
>
> The following subtraction between 2 now() function works:
> select now() - now()from voter_hive limit 1;
> +-+
> | EXPR$0  |
> +-+
> | PT0S|
> +-+
>  
> However, the following queries fail:
> select now() - create_time from voter_hive where voter_id=1;
> Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 26: Cannot 
> apply '-' to arguments of type ' - '. Supported form(s): 
> ' - '
> ' - '
> ' - '
> select create_time - cast('1997-02-12 15:18:31.072' as timestamp) from 
> voter_hive where voter_id=1;
> Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 65: Cannot 
> apply '-' to arguments of type ' - '. Supported 
> 

[jira] [Updated] (DRILL-4026) CTAS Auto Partition on a wide varchar column is giving an IllegalReferenceCountException

2015-11-03 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli updated DRILL-4026:
-
Attachment: abc.tbl

> CTAS Auto Partition on a wide varchar column is giving an 
> IllegalReferenceCountException
> 
>
> Key: DRILL-4026
> URL: https://issues.apache.org/jira/browse/DRILL-4026
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - Writer
>Reporter: Rahul Challapalli
> Attachments: abc.tbl
>
>
> git.commit.id.abbrev=bb69f22
> The below query fails
> {code}
> create table vc_part partition by (a) as select cast(columns[0] as 
> varchar(6000)) a, columns[1] b from dfs.`/drill/testdata/abc.tbl`;
> Error: SYSTEM ERROR: IllegalReferenceCountException: refCnt: 0
> Fragment 0:0
> [Error Id: 8bbfcadb-07bb-468c-a772-24c85cecbcf6 on qa-node191.qa.lab:31010]
>   (io.netty.util.IllegalReferenceCountException) refCnt: 0
> io.netty.buffer.AbstractByteBuf.ensureAccessible():1178
> io.netty.buffer.DrillBuf.checkIndexD():184
> io.netty.buffer.DrillBuf.checkBytes():205
> org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.compare():101
> org.apache.drill.exec.test.generated.ProjectorGen2.doEval():49
> org.apache.drill.exec.test.generated.ProjectorGen2.projectRecords():62
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork():173
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():93
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():130
> org.apache.drill.exec.record.AbstractRecordBatch.next():156
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():113
> org.apache.drill.exec.record.AbstractRecordBatch.next():103
> org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():91
> org.apache.drill.exec.record.AbstractRecordBatch.next():156
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():113
> org.apache.drill.exec.record.AbstractRecordBatch.next():103
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():130
> org.apache.drill.exec.record.AbstractRecordBatch.next():156
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():80
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():256
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():250
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():250
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745 (state=,code=0)
> {code}
> The data set contains a widestring (5000 chars) as the first column



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4028) Merge Drill parquet modifications back into the mainline project

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988742#comment-14988742
 ] 

ASF GitHub Bot commented on DRILL-4028:
---

GitHub user jaltekruse opened a pull request:

https://github.com/apache/drill/pull/236

DRILL-4028: Get off parquet fork



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jaltekruse/incubator-drill 
parquet-update-squash

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/236.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #236


commit afb72c81bbba69346c48c77852f2429bae47dea4
Author: Jason Altekruse 
Date:   2015-09-04T18:09:23Z

DRILL-4028: Part 1 - Remove references to the shaded version of a Jackson 
@JsonCreator annotation from parquet, replace with proper fasterxml version.

commit 0f51a6bf341699aa7f14457b2c49097e84fff936
Author: Jason Altekruse 
Date:   2015-09-04T18:17:21Z

DRILL-4028: Part 2 - Fixing imports using the wrong parquet packages after 
rebase.

clean up imports in generated source template

commit 4feb538da813f2f1a974337f5e6874866c3cd350
Author: Jason Altekruse 
Date:   2015-09-14T18:13:04Z

DRILL-4028: Part 3 - Fixing issues with the Drill parquet read and write path 
after merging the Drill parquet fork back into mainline.

Fixed the issue with the writer, needed to flush the RecordConsumer in the 
ParquetRecordWriter.

Consolidate page reading code

Fix buffer sizes, uncompressed and compressed sizes were backwards

The issue was a mismatch in the usage of byte buffers. Even though the 
position of a buffer was being set, that seemed to be ignored in the setSafe 
method on the varbinary vector. I needed to pass in the offset as it seems to 
just read from the beginning of the buffer. I'm not sure this is how 
ByteBuffers are supposed to be used, but we seem to make use of this pattern 
commonly so I'm not sure it could be easily refactored.
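
The position-vs-offset mismatch described in this commit message can be reproduced with plain java.nio (an illustration only, not Drill code): absolute reads on a ByteBuffer ignore the position that relative reads honor, so a consumer that does absolute reads needs the offset passed in explicitly.

```java
import java.nio.ByteBuffer;

// Illustration only (not Drill code): absolute ByteBuffer reads ignore the
// buffer's position, which is why an explicit offset had to be passed.
public class ByteBufferOffsetDemo {

    // Relative read: honors the buffer's current position.
    static byte relativeRead() {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {10, 20, 30, 40});
        buf.position(2);        // caller intends the data to start at byte 30
        return buf.get();       // returns 30
    }

    // Absolute read: indexes from the start of the buffer, ignoring position.
    static byte absoluteRead() {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {10, 20, 30, 40});
        buf.position(2);        // position is set ...
        return buf.get(0);      // ... but ignored: returns 10
    }

    public static void main(String[] args) {
        System.out.println(relativeRead()); // 30
        System.out.println(absoluteRead()); // 10
    }
}
```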

Added some test to print out some additional context when an ordered 
comparison of two datasets fails in a test.

Removing usage of Drill classes from DirectCodecFactory, getting it ready 
to be moved into the parquet codebase.

Fix up parquet API usage in Hive Module.

Fix dictionary reading; I think the changes may speed up reading 
dictionary-encoded files by avoiding an extra copy.

Adding unit test to read and write all types in parquet; the decimal types 
and interval year have some issues.

Use direct codec factory from new package in the parquet library now that 
it has been moved.

Moving the test for Direct Codec Factory out of the Drill source as the 
class itself has been moved.

Small fix after consolidating two different ByteBuffer based 
implementations of BytesInput.

Small fixes to accommodate interface changes.

Small changes to remove direct references to DirectCodecFactory, this class 
is not accessible outside of parquet, but an instance with the same contract is 
now accessible with a new factory method on CodecFactory.

Fixed failing test using miniDFS when reading a larger parquet file.




> Merge Drill parquet modifications back into the mainline project
> 
>
> Key: DRILL-4028
> URL: https://issues.apache.org/jira/browse/DRILL-4028
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Reporter: Jason Altekruse
>Assignee: Jason Altekruse
> Fix For: 1.3.0
>
>
> Drill has been maintaining a fork of Parquet for over a year. The changes 
> need to make it back into the main repository so we don't have to bother 
> merging in all of the new changes from the master repository into the fork.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4024) CTAS with auto partition gives an NPE when the partition column has null values in it

2015-11-03 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli updated DRILL-4024:
-
Attachment: error.log
fewtypes_null.parquet

> CTAS with auto partition gives an NPE when the partition column has null 
> values in it
> -
>
> Key: DRILL-4024
> URL: https://issues.apache.org/jira/browse/DRILL-4024
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Writer
>Reporter: Rahul Challapalli
> Attachments: error.log, fewtypes_null.parquet
>
>
> git.commit.id.abbrev=522eb81
> The data set used contains null values in the partition column. This causes 
> the below query to fail with an NPE
> {code}
> create table fewtypesnull_varcharpartition partition by (varchar_col) as 
> select * from dfs.`/drill/testdata/cross-sources/fewtypes_null.parquet`;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 0:0
> [Error Id: 6ef352c0-a12d-477c-bba8-e4a747a6b78e on qa-node190.qa.lab:31010] 
> (state=,code=0)
> {code}
> I attached the data set and error log. Let me know if more information is 
> needed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4021) Cannot subract or add between two timestamps

2015-11-03 Thread Krystal (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988593#comment-14988593
 ] 

Krystal commented on DRILL-4021:


I tried this and got the same error:
select cast((cast('2004-05-01 12:03:34' as timestamp) - cast('2004-04-29 
11:57:23' as timestamp)) as interval day to second) from basic where c_row=1;

From Postgres, I got the following result:
2 days 00:06:11
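
For reference, the interval Postgres reports can be reproduced with java.time (an illustration only; Drill would need to produce the equivalent INTERVAL DAY TO SECOND value):

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Illustration only: computes the day-to-second interval the SQL subtraction
// above should yield, matching the Postgres result "2 days 00:06:11".
public class TimestampDiffDemo {

    static Duration diff() {
        LocalDateTime later   = LocalDateTime.parse("2004-05-01T12:03:34");
        LocalDateTime earlier = LocalDateTime.parse("2004-04-29T11:57:23");
        return Duration.between(earlier, later);
    }

    static long diffSeconds() {
        return diff().getSeconds();
    }

    public static void main(String[] args) {
        Duration d = diff();
        System.out.printf("%d days %02d:%02d:%02d%n",
                d.toDays(), d.toHoursPart(), d.toMinutesPart(), d.toSecondsPart());
        // prints: 2 days 00:06:11
    }
}
```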


> Cannot subract or add between two timestamps
> 
>
> Key: DRILL-4021
> URL: https://issues.apache.org/jira/browse/DRILL-4021
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Krystal
>
> The following subtraction between two now() functions works:
> select now() - now() from voter_hive limit 1;
> +-+
> | EXPR$0  |
> +-+
> | PT0S|
> +-+
>  
> However, the following queries fail:
> select now() - create_time from voter_hive where voter_id=1;
> Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 26: Cannot 
> apply '-' to arguments of type ' - '. Supported form(s): 
> ' - '
> ' - '
> ' - '
> select create_time - cast('1997-02-12 15:18:31.072' as timestamp) from 
> voter_hive where voter_id=1;
> Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 65: Cannot 
> apply '-' to arguments of type ' - '. Supported 
> form(s): ' - '
> ' - '
> ' - '



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4021) Cannot subract or add between two timestamps

2015-11-03 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-4021:

Attachment: Screen Shot 2015-11-03 at 3.44.30 PM.png

> Cannot subract or add between two timestamps
> 
>
> Key: DRILL-4021
> URL: https://issues.apache.org/jira/browse/DRILL-4021
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Krystal
> Attachments: Screen Shot 2015-11-03 at 3.44.30 PM.png
>
>
> The following subtraction between two now() functions works:
> select now() - now() from voter_hive limit 1;
> +-+
> | EXPR$0  |
> +-+
> | PT0S|
> +-+
>  
> However, the following queries fail:
> select now() - create_time from voter_hive where voter_id=1;
> Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 26: Cannot 
> apply '-' to arguments of type ' - '. Supported form(s): 
> ' - '
> ' - '
> ' - '
> select create_time - cast('1997-02-12 15:18:31.072' as timestamp) from 
> voter_hive where voter_id=1;
> Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 65: Cannot 
> apply '-' to arguments of type ' - '. Supported 
> form(s): ' - '
> ' - '
> ' - '



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4024) CTAS with auto partition gives an NPE when the partition column has null values in it

2015-11-03 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-4024:


 Summary: CTAS with auto partition gives an NPE when the partition 
column has null values in it
 Key: DRILL-4024
 URL: https://issues.apache.org/jira/browse/DRILL-4024
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Writer
Reporter: Rahul Challapalli


git.commit.id.abbrev=522eb81

The data set used contains null values in the partition column. This causes the 
below query to fail with an NPE

{code}
create table fewtypesnull_varcharpartition partition by (varchar_col) as select 
* from dfs.`/drill/testdata/cross-sources/fewtypes_null.parquet`;
Error: SYSTEM ERROR: NullPointerException

Fragment 0:0

[Error Id: 6ef352c0-a12d-477c-bba8-e4a747a6b78e on qa-node190.qa.lab:31010] 
(state=,code=0)
{code}

I attached the data set and error log. Let me know if more information is needed
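
A guess at the failure mode, sketched below (hypothetical, not Drill's actual code): a partition-boundary check that calls value.equals(previous) throws an NPE on null partition values, while a null-safe comparison does not.

```java
import java.util.Objects;

// Hypothetical sketch of the suspected failure mode (not Drill's actual
// code): detecting a partition boundary with value.equals(previous) throws
// an NPE on null partition values; Objects.equals() tolerates them.
public class PartitionBoundary {

    // True when the partition key changes between adjacent rows.
    static boolean isNewPartition(String previous, String current) {
        return !Objects.equals(previous, current);   // null-safe comparison
    }

    public static void main(String[] args) {
        System.out.println(isNewPartition("a", null));   // true, no NPE
        System.out.println(isNewPartition(null, null));  // false
    }
}
```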



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4025) Don't invoke getFileStatus() when metadata cache is available

2015-11-03 Thread Mehant Baid (JIRA)
Mehant Baid created DRILL-4025:
--

 Summary: Don't invoke getFileStatus() when metadata cache is 
available
 Key: DRILL-4025
 URL: https://issues.apache.org/jira/browse/DRILL-4025
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Mehant Baid
Assignee: Mehant Baid


Currently we invoke getFileStatus() to list all the files under a directory 
even when we have the metadata cache file. The information is already present 
in the cache so we don't need to perform this operation.
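
The proposed optimization can be sketched as follows (names are illustrative, not Drill's API): consult the metadata cache first and only fall back to getFileStatus() when no cache file is available.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch (names are illustrative, not Drill's API): consult the
// parquet metadata cache first and only fall back to getFileStatus() when no
// cache file is available.
public class FileListing {

    static class MetadataCache {
        final List<String> files;
        MetadataCache(List<String> files) { this.files = files; }
    }

    static List<String> listFiles(MetadataCache cache, Supplier<List<String>> getFileStatus) {
        if (cache != null) {
            return cache.files;        // cache hit: no filesystem call needed
        }
        return getFileStatus.get();    // cache miss: list the directory
    }
}
```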



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (DRILL-4026) CTAS Auto Partition on a wide varchar column is giving an IllegalReferenceCountException

2015-11-03 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli updated DRILL-4026:
-
Comment: was deleted

(was: Changing the length of the varchar to 7000 instead of 6000 made the query 
run)

> CTAS Auto Partition on a wide varchar column is giving an 
> IllegalReferenceCountException
> 
>
> Key: DRILL-4026
> URL: https://issues.apache.org/jira/browse/DRILL-4026
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - Writer
>Reporter: Rahul Challapalli
> Attachments: abc.tbl
>
>
> git.commit.id.abbrev=bb69f22
> The below query fails
> {code}
> create table vc_part partition by (a) as select cast(columns[0] as 
> varchar(6000)) a, columns[1] b from dfs.`/drill/testdata/abc.tbl`;
> Error: SYSTEM ERROR: IllegalReferenceCountException: refCnt: 0
> Fragment 0:0
> [Error Id: 8bbfcadb-07bb-468c-a772-24c85cecbcf6 on qa-node191.qa.lab:31010]
>   (io.netty.util.IllegalReferenceCountException) refCnt: 0
> io.netty.buffer.AbstractByteBuf.ensureAccessible():1178
> io.netty.buffer.DrillBuf.checkIndexD():184
> io.netty.buffer.DrillBuf.checkBytes():205
> org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.compare():101
> org.apache.drill.exec.test.generated.ProjectorGen2.doEval():49
> org.apache.drill.exec.test.generated.ProjectorGen2.projectRecords():62
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork():173
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():93
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():130
> org.apache.drill.exec.record.AbstractRecordBatch.next():156
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():113
> org.apache.drill.exec.record.AbstractRecordBatch.next():103
> org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():91
> org.apache.drill.exec.record.AbstractRecordBatch.next():156
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():113
> org.apache.drill.exec.record.AbstractRecordBatch.next():103
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():130
> org.apache.drill.exec.record.AbstractRecordBatch.next():156
> 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():119
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():80
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():256
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():250
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():250
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745 (state=,code=0)
> {code}
> The data set contains a widestring (5000 chars) as the first column



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4027) We need to log changes to configuration parameters to drillbit.log file

2015-11-03 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4027:
-

 Summary: We need to log changes to configuration parameters to 
drillbit.log file
 Key: DRILL-4027
 URL: https://issues.apache.org/jira/browse/DRILL-4027
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.3.0
 Environment: 4 node cluster on CentOS
Reporter: Khurram Faraaz
Priority: Critical


Currently we do not log any changes that were made to configuration parameters 
that are available in sys.options table. We can SET/RESET values for these 
parameters using SET and RESET commands.

This JIRA is reported to ensure that we log any changes made to 
configuration parameters, at either the SYSTEM or SESSION level, into the 
drillbit.log file.
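
A minimal sketch of what such logging could look like (names are illustrative, not Drill's option framework): format one line per option change so it can be routed to drillbit.log.

```java
// Hypothetical sketch (names are illustrative, not Drill's option framework):
// format a log line for every option change so it lands in drillbit.log.
public class OptionChangeLogger {

    enum Scope { SYSTEM, SESSION }

    static String logLine(Scope scope, String name, String oldValue, String newValue) {
        return String.format("Option '%s' changed at %s level: %s -> %s",
                name, scope, oldValue, newValue);
    }

    public static void main(String[] args) {
        // In Drill this string would go through the drillbit.log logger.
        System.out.println(logLine(Scope.SYSTEM, "planner.slice_target", "100000", "1"));
    }
}
```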



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4029) Non admin users should not be allowed to execute RESET ALL at SYSTEM level

2015-11-03 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4029:
-

 Summary: Non admin users should not be allowed to execute RESET 
ALL at SYSTEM level
 Key: DRILL-4029
 URL: https://issues.apache.org/jira/browse/DRILL-4029
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.3.0
 Environment: 4 node cluster CentOS
Reporter: Khurram Faraaz
Priority: Critical


With MAPR_IMPERSONATION_ENABLED=false, connecting to Drill as user test (which 
is not an admin user), I was able to RESET all options at SYSTEM level; this 
does not look right.

{code}
[root@centos bin]# ./sqlline -u "jdbc:drill:schema=dfs.tmp -n test -p test"
apache drill 1.3.0-SNAPSHOT
"say hello to my little drill"
0: jdbc:drill:schema=dfs.tmp> ALTER SYSTEM RESET ALL;
+---+---+
|  ok   |summary|
+---+---+
| true  | ALL updated.  |
+---+---+
1 row selected (2.013 seconds)
0: jdbc:drill:schema=dfs.tmp> !q
Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl

[root@centos bin]# clush -g khurram grep "MAPR_IMPERSONATION_ENABLED" 
/opt/mapr/drill/drill-1.3.0/conf/drill-env.sh
: export MAPR_IMPERSONATION_ENABLED=false
: export MAPR_IMPERSONATION_ENABLED=false
: export MAPR_IMPERSONATION_ENABLED=false
: export MAPR_IMPERSONATION_ENABLED=false

[root@centos bin]# clush -g khurram tail -n 5 
/opt/mapr/drill/drill-1.3.0/conf/drill-override.conf
:
: drill.exec: {
:   cluster-id: "my_cluster_com-drillbits",
:   zk.connect: "10.10.100.201:5181"
: }
:
: drill.exec: {
:   cluster-id: "my_cluster_com-drillbits",
:   zk.connect: "10.10.100.201:5181"
: }
:
: drill.exec: {
:   cluster-id: "my_cluster_com-drillbits",
:   zk.connect: "10.10.100.201:5181"
: }
:
: drill.exec: {
:   cluster-id: "my_cluster_com-drillbits",
:   zk.connect: "10.10.100.201:5181"
: }
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4029) Non admin users should not be allowed to execute RESET ALL at SYSTEM level

2015-11-03 Thread Sudheesh Katkam (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988909#comment-14988909
 ] 

Sudheesh Katkam commented on DRILL-4029:


Also see [DRILL-3622|https://issues.apache.org/jira/browse/DRILL-3622].

> Non admin users should not be allowed to execute RESET ALL at SYSTEM level
> --
>
> Key: DRILL-4029
> URL: https://issues.apache.org/jira/browse/DRILL-4029
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.3.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>Priority: Critical
>
> Set MAPR_IMPERSONATION_ENABLED=false and connect to Drill as user test (which 
> is not admin user) I was able to RESET all options at SYSTEM level, this does 
> not look right.
> {code}
> [root@centos bin]# ./sqlline -u "jdbc:drill:schema=dfs.tmp -n test -p test"
> apache drill 1.3.0-SNAPSHOT
> "say hello to my little drill"
> 0: jdbc:drill:schema=dfs.tmp> ALTER SYSTEM RESET ALL;
> +---+---+
> |  ok   |summary|
> +---+---+
> | true  | ALL updated.  |
> +---+---+
> 1 row selected (2.013 seconds)
> 0: jdbc:drill:schema=dfs.tmp> !q
> Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl
> [root@centos bin]# clush -g khurram grep "MAPR_IMPERSONATION_ENABLED" 
> /opt/mapr/drill/drill-1.3.0/conf/drill-env.sh
> : export MAPR_IMPERSONATION_ENABLED=false
> : export MAPR_IMPERSONATION_ENABLED=false
> : export MAPR_IMPERSONATION_ENABLED=false
> : export MAPR_IMPERSONATION_ENABLED=false
> [root@centos bin]# clush -g khurram tail -n 5 
> /opt/mapr/drill/drill-1.3.0/conf/drill-override.conf
> :
> : drill.exec: {
> :   cluster-id: "my_cluster_com-drillbits",
> :   zk.connect: "10.10.100.201:5181"
> : }
> :
> : drill.exec: {
> :   cluster-id: "my_cluster_com-drillbits",
> :   zk.connect: "10.10.100.201:5181"
> : }
> :
> : drill.exec: {
> :   cluster-id: "my_cluster_com-drillbits",
> :   zk.connect: "10.10.100.201:5181"
> : }
> :
> : drill.exec: {
> :   cluster-id: "my_cluster_com-drillbits",
> :   zk.connect: "10.10.100.201:5181"
> : }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2139) Star is not expanded correctly in "select distinct" query

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988934#comment-14988934
 ] 

ASF GitHub Bot commented on DRILL-2139:
---

GitHub user hsuanyi opened a pull request:

https://github.com/apache/drill/pull/237

DRILL-2139: Support distinct over star column



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsuanyi/incubator-drill DRILL-2139

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/237.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #237


commit f21a226b832810d98d675c0898c0c4b9e5660639
Author: Hsuan-Yi Chu 
Date:   2015-03-17T21:42:32Z

DRILL-2139: Support distinct over star column




> Star is not expanded correctly in "select distinct" query
> -
>
> Key: DRILL-2139
> URL: https://issues.apache.org/jira/browse/DRILL-2139
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.3.0
>
>
> {code}
> 0: jdbc:drill:schema=dfs> select distinct * from t1;
> ++
> | *  |
> ++
> | null   |
> ++
> 1 row selected (0.14 seconds)
> 0: jdbc:drill:schema=dfs> select distinct * from `test.json`;
> ++
> | *  |
> ++
> | null   |
> ++
> 1 row selected (0.163 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4029) Non admin users should not be allowed to execute RESET ALL at SYSTEM level

2015-11-03 Thread Sudheesh Katkam (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988778#comment-14988778
 ] 

Sudheesh Katkam commented on DRILL-4029:


[This 
functionality|https://drill.apache.org/docs/configuring-user-authentication/#administrator-privileges]
 is available only when impersonation is enabled.
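
The behavior being discussed can be sketched as a guard (hypothetical, not Drill's API): today the admin check only applies when impersonation is enabled, which is exactly what the reporter questions.

```java
// Hypothetical sketch (not Drill's API): how a guard for SYSTEM-scoped resets
// could behave. Per the comment above, the admin check today only applies
// when impersonation is enabled, which is the behavior the reporter questions.
public class SystemOptionGuard {

    static boolean mayResetSystemOptions(boolean impersonationEnabled, boolean isAdmin) {
        if (!impersonationEnabled) {
            return true;     // current behavior: no check without impersonation
        }
        return isAdmin;      // with impersonation on, only admins may reset
    }
}
```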

> Non admin users should not be allowed to execute RESET ALL at SYSTEM level
> --
>
> Key: DRILL-4029
> URL: https://issues.apache.org/jira/browse/DRILL-4029
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.3.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>Priority: Critical
>
> Set MAPR_IMPERSONATION_ENABLED=false and connect to Drill as user test (which 
> is not admin user) I was able to RESET all options at SYSTEM level, this does 
> not look right.
> {code}
> [root@centos bin]# ./sqlline -u "jdbc:drill:schema=dfs.tmp -n test -p test"
> apache drill 1.3.0-SNAPSHOT
> "say hello to my little drill"
> 0: jdbc:drill:schema=dfs.tmp> ALTER SYSTEM RESET ALL;
> +---+---+
> |  ok   |summary|
> +---+---+
> | true  | ALL updated.  |
> +---+---+
> 1 row selected (2.013 seconds)
> 0: jdbc:drill:schema=dfs.tmp> !q
> Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl
> [root@centos bin]# clush -g khurram grep "MAPR_IMPERSONATION_ENABLED" 
> /opt/mapr/drill/drill-1.3.0/conf/drill-env.sh
> : export MAPR_IMPERSONATION_ENABLED=false
> : export MAPR_IMPERSONATION_ENABLED=false
> : export MAPR_IMPERSONATION_ENABLED=false
> : export MAPR_IMPERSONATION_ENABLED=false
> [root@centos bin]# clush -g khurram tail -n 5 
> /opt/mapr/drill/drill-1.3.0/conf/drill-override.conf
> :
> : drill.exec: {
> :   cluster-id: "my_cluster_com-drillbits",
> :   zk.connect: "10.10.100.201:5181"
> : }
> :
> : drill.exec: {
> :   cluster-id: "my_cluster_com-drillbits",
> :   zk.connect: "10.10.100.201:5181"
> : }
> :
> : drill.exec: {
> :   cluster-id: "my_cluster_com-drillbits",
> :   zk.connect: "10.10.100.201:5181"
> : }
> :
> : drill.exec: {
> :   cluster-id: "my_cluster_com-drillbits",
> :   zk.connect: "10.10.100.201:5181"
> : }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-2288) ScanBatch violates IterOutcome protocol for zero-row sources [was: missing JDBC metadata (schema) for 0-row results...]

2015-11-03 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983410#comment-14983410
 ] 

Daniel Barclay (Drill) edited comment on DRILL-2288 at 11/4/15 5:42 AM:


Chain of bugs and problems encountered and (partially) addressed:

1.  {{ScanBatch.next()}} returned {{NONE}} without ever returning 
{{OK_NEW_SCHEMA}} for a source having zero rows (so downstream operators didn't 
get its schema, even for static-schema sources, or even get triggered to update 
their own schema).

2.  {{RecordBatch.IterOutcome}}, especially the allowed sequence of values, was 
not documented clearly (so developers didn't know correctly what to expect or 
provide).

3.  {{IteratorValidatorBatchIterator}} didn't validate the sequence of 
{{IterOutcome values}} (so developers weren't notified about incorrect results).

4.  {{UnionAllRecordBatch}} did not interpret {{NONE}} and {{OK_NEW_SCHEMA}} 
correctly (so it reported spurious/incorrect schema-change and/or 
empty-/non-empty input exceptions).

5.  {{ScanBatch.Mutator.isNewSchema()}} didn't handle a short-circuit OR 
("{{||}}") correctly in calling {{SchemaChangeCallBack.getSchemaChange()}} (so 
it didn't reset nested schema-change state, and so caused spurious 
{{OK_NEW_SCHEMA}} notifications and downstream exceptions).

6.  {{JsonRecordReader.ensureAtLeastOneField()}} didn't check whether any field 
already existed in the batch (so in that case it forcibly changed the type to 
{{NullableIntVector}}, causing schema changes and downstream exceptions). 
\[Note:  DRILL-2288 does not address other problems with {{NullableIntVector}} 
dummy columns from {{JsonRecordReader}}.]

7.  HBase tests used only one table region, ignoring known problems with 
multi-region HBase tables (so latent {{HBaseRecordReader}} problems were left 
undetected and unresolved.)   \[Note: DRILL-2288 addresses only one test table 
(increasing the number of regions on the other test tables exposed at least one 
other problem; others remain).]

8.  {{HBaseRecordReader}} didn't create a {{MapVector}} for every HBase column 
family and every requested HBase column (so {{NullableIntVector}} dummy columns 
got created, causing spurious schema changes and downstream exceptions).

9.  Some {{RecordBatch}} classes didn't reset their record counts to zero 
({{OrderedPartitionRecordBatch.recordCount}}, 
{{ProjectRecordBatch.recordCount}}, and/or {{TopNBatch.recordCount}}) (so 
downstream code tried to access elements of (correctly) empty vectors, yielding 
{{IndexOutOfBoundsException}} (with ~"... {{range (0, 0)}}") ).

10.  {{RecordBatchLoader}}'s record count was not reset to zero by 
{{UnorderedReceiverBatch}} (so, again, downstream code tried to access elements 
of (correctly) empty vectors, yielding {{IndexOutOfBoundsException}} (with ~"... 
{{range (0, 0)}}") ).

11.  {{MapVector.load(...)}} left some existing vectors empty, not matching the 
returned length and the length of sibling vectors (so 
{{MapVector.getObject(int)}} got {{IndexOutOfBoundsException}} (with ~"... 
{{range (0, 0)}}").  \[Note: DRILL-2288 does not address the root problem.]

12. {{BaseTestQuery.printResult(...)}} skipped deallocation calls in the case 
of a zero-record record batch (so when it read a zero-row record batch, it 
caused a memory leak reported at Drillbit shutdown time).

13. {{TestHBaseProjectPushDown.testRowKeyAndColumnPushDown()}} used delimited 
identifiers of a form (with a period) that Drill can't handle (so the test 
failed when the test ran with multiple fragments).
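
The validation missing in item 3 above can be sketched as a small state machine (hypothetical, not Drill's IteratorValidatorBatchIterator): reject any sequence in which OK or NONE arrives before the first OK_NEW_SCHEMA.

```java
// Hypothetical sketch of item 3's missing check: a validator that rejects an
// IterOutcome sequence in which OK or NONE arrives before the first
// OK_NEW_SCHEMA. The enum mirrors part of Drill's RecordBatch.IterOutcome.
public class IterOutcomeValidator {

    enum IterOutcome { OK_NEW_SCHEMA, OK, NONE }

    private boolean schemaSeen = false;

    // Returns false when the outcome is illegal at this point in the sequence.
    boolean accept(IterOutcome outcome) {
        if (outcome == IterOutcome.OK_NEW_SCHEMA) {
            schemaSeen = true;
            return true;
        }
        return schemaSeen;   // OK and NONE require a prior schema announcement
    }
}
```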






[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988958#comment-14988958
 ] 

ASF GitHub Bot commented on DRILL-3912:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/189


> Common subexpression elimination in code generation
> ---
>
> Key: DRILL-3912
> URL: https://issues.apache.org/jira/browse/DRILL-3912
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Steven Phillips
>Assignee: Jinfeng Ni
>
> Drill currently will evaluate the full expression tree, even if there are 
> redundant subtrees. Many of these redundant evaluations can be eliminated by 
> reusing the results from previously evaluated expression trees.
> For example,
> {code}
> select a + 1, (a + 1)* (a - 1) from t
> {code}
> Will compute the entire (a + 1) expression twice. With CSE, it will only be 
> evaluated once.
> The benefit will be reducing the work done when evaluating expressions, as 
> well as reducing the amount of code that is generated, which could also lead 
> to better JIT optimization.
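
The idea can be sketched at runtime with a memoizing evaluator (hypothetical; Drill's actual implementation eliminates redundant subtrees in generated code, not via a map):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the idea: identical subexpressions are keyed by
// their textual form and evaluated once. Drill's actual implementation
// eliminates redundant subtrees in generated code, not via a runtime map.
public class CseDemo {

    static int evaluations = 0;   // counts real evaluations performed

    static int eval(String expr, int a, Map<String, Integer> cache) {
        Integer cached = cache.get(expr);
        if (cached != null) {
            return cached;        // reuse the earlier result
        }
        evaluations++;
        int result;
        if (expr.equals("a+1")) {
            result = a + 1;
        } else if (expr.equals("a-1")) {
            result = a - 1;
        } else {
            throw new IllegalArgumentException(expr);
        }
        cache.put(expr, result);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = new HashMap<>();
        int a = 5;
        // select a + 1, (a + 1) * (a - 1): "a+1" occurs twice but runs once.
        int first = eval("a+1", a, cache);
        int second = eval("a+1", a, cache) * eval("a-1", a, cache);
        System.out.println(first + " " + second + " " + evaluations); // 6 24 2
    }
}
```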



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3994) Build Fails on Windows after DRILL-3742

2015-11-03 Thread Aditya Kishore (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988316#comment-14988316
 ] 

Aditya Kishore commented on DRILL-3994:
---

You'll need to extract {{hadoop.dll}} and {{winutils.exe}} from under 
{{winutils/bin/}} in Drill's tarball and put them somewhere in the Windows 
program search path.

> Build Fails on Windows after DRILL-3742
> ---
>
> Key: DRILL-3994
> URL: https://issues.apache.org/jira/browse/DRILL-3994
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Reporter: Sudheesh Katkam
>Assignee: Julien Le Dem
>Priority: Critical
> Fix For: 1.3.0
>
>
> Build fails on Windows on the latest master:
> {code}
> c:\drill> mvn clean install -DskipTests 
> ...
> [INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 
> approved: 169 licence.
> [INFO] 
> [INFO] <<< exec-maven-plugin:1.2.1:java (default) < validate @ drill-common 
> <<<
> [INFO] 
> [INFO] --- exec-maven-plugin:1.2.1:java (default) @ drill-common ---
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See 
> http://www.slf4j.org/codes.html#StaticLoggerBinder
>  for further details.
> Scanning: C:\drill\common\target\classes
> [WARNING] 
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: 
> file:C:/drill/common/target/classes/ not in 
> [file:/C:/drill/common/target/classes/]
>   at 
> org.apache.drill.common.scanner.BuildTimeScan.main(BuildTimeScan.java:129)
>   ... 6 more
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO] 
> [INFO] Apache Drill Root POM .. SUCCESS [ 10.016 
> s]
> [INFO] tools/Parent Pom ... SUCCESS [  1.062 
> s]
> [INFO] tools/freemarker codegen tooling ... SUCCESS [  6.922 
> s]
> [INFO] Drill Protocol . SUCCESS [ 10.062 
> s]
> [INFO] Common (Logical Plan, Base expressions)  FAILURE [  9.954 
> s]
> [INFO] contrib/Parent Pom . SKIPPED
> [INFO] contrib/data/Parent Pom  SKIPPED
> [INFO] contrib/data/tpch-sample-data .. SKIPPED
> [INFO] exec/Parent Pom  SKIPPED
> [INFO] exec/Java Execution Engine . SKIPPED
> [INFO] exec/JDBC Driver using dependencies  SKIPPED
> [INFO] JDBC JAR with all dependencies . SKIPPED
> [INFO] contrib/mongo-storage-plugin ... SKIPPED
> [INFO] contrib/hbase-storage-plugin ... SKIPPED
> [INFO] contrib/jdbc-storage-plugin  SKIPPED
> [INFO] contrib/hive-storage-plugin/Parent Pom . SKIPPED
> [INFO] contrib/hive-storage-plugin/hive-exec-shaded ... SKIPPED
> [INFO] contrib/hive-storage-plugin/core ... SKIPPED
> [INFO] contrib/drill-gis-plugin ... SKIPPED
> [INFO] Packaging and Distribution Assembly  SKIPPED
> [INFO] contrib/sqlline  SKIPPED
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 38.813 s
> [INFO] Finished at: 2015-10-28T12:17:19-07:00
> [INFO] Final Memory: 67M/466M
> [INFO] 
> 
> [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java 
> (default) on project drill-common: An exception occured while executing the 
> Java class. null: InvocationTargetException: 
> file:C:/drill/common/target/classes/ not in 
> [file:/C:/drill/common/target/classes/] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
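The "file:C:/... not in [file:/C:/...]" message suggests the failure mode: a guess at the root cause (illustrative, not the actual fix) is that on Windows one API renders the classes directory as {{file:C:/...}} while another renders it {{file:/C:/...}}, so a plain string prefix check decides the directory is "not in" the list of scanned paths.

```java
// Naive URL-string membership check of the kind that trips over the
// "file:C:/..." vs "file:/C:/..." spelling difference on Windows.
public class PrefixMismatch {
    // Membership test comparing raw URL strings directly.
    static boolean naiveContains(String url, String[] markedPaths) {
        for (String p : markedPaths) {
            if (url.startsWith(p)) {
                return true; // string comparison, not a normalized URI comparison
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String fromClassLoader = "file:C:/drill/common/target/classes/";    // no slash before the drive
        String[] configured = { "file:/C:/drill/common/target/classes/" };  // leading slash
        // Same directory, two spellings: the naive check fails.
        System.out.println(naiveContains(fromClassLoader, configured)); // prints false
    }
}
```

Normalizing both sides through the same path API before comparing would make the two spellings agree.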

[jira] [Assigned] (DRILL-4010) In HBase reader, create child vectors for referenced HBase columns to avoid spurious schema changes

2015-11-03 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) reassigned DRILL-4010:
-

Assignee: Daniel Barclay (Drill)

> In HBase reader, create child vectors for referenced HBase columns to avoid 
> spurious schema changes
> ---
>
> Key: DRILL-4010
> URL: https://issues.apache.org/jira/browse/DRILL-4010
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - HBase
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> {{HBaseRecordReader}} needs to create child vectors for all 
> referenced/requested columns.
> Currently, if a fragment reads only HBase rows that don't have a particular 
> referenced column (within a given column family), downstream code adds a 
> dummy column of type {{NullableIntVector}} (as a child in the {{MapVector}} 
> for the containing HBase column family).
> If any other fragment reads an HBase row that _does_ contain the referenced 
> column, that fragment's reader will create a child 
> {{NullableVarBinaryVector}} for the referenced column.
> When the data from those two fragments comes together, Drill detects a schema 
> change, even though logically there isn't really any schema change.
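A simplified picture of the mismatch (illustrative types, not Drill's actual BatchSchema classes): model the column family's schema as a map from child column name to vector type. One fragment ends up with the dummy NullableInt child, the other with NullableVarBinary, and a structural comparison reports a schema change even though the logical schema is the same.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of the spurious schema change: two fragments build the
// same column family with different child vector types, so a structural
// equality check flags a schema change.
public class SchemaChangeSketch {
    // Schema of a column family with a single referenced child column.
    static Map<String, String> familySchema(String childVectorType) {
        Map<String, String> cf = new LinkedHashMap<>();
        cf.put("col", childVectorType);
        return cf;
    }

    public static void main(String[] args) {
        Map<String, String> fragA = familySchema("NullableIntVector");       // fragment saw no values
        Map<String, String> fragB = familySchema("NullableVarBinaryVector"); // fragment saw values
        System.out.println("schema change detected: " + !fragA.equals(fragB)); // prints true
        // Creating NullableVarBinary children up front for every referenced
        // column would make both fragments produce the same schema.
    }
}
```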



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4012) Limit 0 on top of query with kvg/flatten results in assert

2015-11-03 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-4012:

Description: 
I've found a couple of bugs that are very similar, but none of them is quite the 
same:

missing-map.json
{code}
{
"id": 1,
"m": {"a":1,"b":2}
}
{
"id": 2
}
{
"id": 3,
"m": {"c":3,"d":4}
}
{code}

'limit 0' results in an assert:
{code}
0: jdbc:drill:schema=dfs> select * from (select id, flatten(kvgen(m)) from 
`missing-map.json`) limit 0;
Error: SYSTEM ERROR: ClassCastException: Cannot cast 
org.apache.drill.exec.vector.NullableIntVector to 
org.apache.drill.exec.vector.complex.RepeatedValueVector
Fragment 0:0
[Error Id: 046bb4d4-2c54-43ab-9577-cf21542ff8ef on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

'limit 1' works:
{code}
0: jdbc:drill:schema=dfs> select * from (select id, flatten(kvgen(m)) from 
`missing-map.json`) limit 1;
+-----+------------------------+
| id  |         EXPR$1         |
+-----+------------------------+
| 1   | {"key":"a","value":1}  |
+-----+------------------------+
1 row selected (0.247 seconds)
{code}

No limit, just in subquery: works
{code}
0: jdbc:drill:schema=dfs> select * from (select id, flatten(kvgen(m)) from 
`json_kvgenflatten/missing-map.json`);
+-----+------------------------+
| id  |         EXPR$1         |
+-----+------------------------+
| 1   | {"key":"a","value":1}  |
| 1   | {"key":"b","value":2}  |
| 3   | {"key":"c","value":3}  |
| 3   | {"key":"d","value":4}  |
+-----+------------------------+
4 rows selected (0.247 seconds)
{code}
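For reference, a rough model of what kvgen + flatten do to these rows (an illustration, not Drill code): kvgen turns the map column into a list of {key, value} entries, and flatten emits one output row per entry; a row whose map is absent, like id 2, contributes none.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Rough model of kvgen + flatten: one output row per map entry, paired with
// the row's id; rows with no map are dropped.
public class KvgenFlattenSketch {
    static List<String> flattenKvgen(List<Integer> ids, List<Map<String, Integer>> maps) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < ids.size(); i++) {
            Map<String, Integer> m = maps.get(i);
            if (m == null) continue; // id 2 has no "m" map, so it is dropped
            for (Map.Entry<String, Integer> e : m.entrySet()) {
                rows.add(ids.get(i) + " | {\"key\":\"" + e.getKey()
                        + "\",\"value\":" + e.getValue() + "}");
            }
        }
        return rows;
    }

    // The three rows of missing-map.json from the report.
    static List<String> sampleRows() {
        Map<String, Integer> m1 = new LinkedHashMap<>();
        m1.put("a", 1); m1.put("b", 2);
        Map<String, Integer> m3 = new LinkedHashMap<>();
        m3.put("c", 3); m3.put("d", 4);
        return flattenKvgen(Arrays.asList(1, 2, 3), Arrays.asList(m1, null, m3));
    }

    public static void main(String[] args) {
        sampleRows().forEach(System.out::println); // four rows: ids 1, 1, 3, 3
    }
}
```

Both working queries above match this model; only the 'limit 0' plan, which short-circuits execution to derive the schema, hits the vector-type mismatch.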

drillbit.log
{code}
2015-11-03 15:23:20,943 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
atsqa4-136.qa.lab.  Skipping affinity to that host.
2015-11-03 15:23:20,944 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
atsqa4-134.qa.lab.  Skipping affinity to that host.
2015-11-03 15:23:20,944 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Time: 1ms total, 1.530719ms avg, 1ms max.
2015-11-03 15:23:20,944 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Earliest start: 1.744000 μs, Latest start: 1.744000 μs, Average 
start: 1.744000 μs .
2015-11-03 15:23:20,968 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
2015-11-03 15:23:20,968 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: 
State to report: RUNNING
2015-11-03 15:23:20,974 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] WARN  
o.a.d.e.e.ExpressionTreeMaterializer - Unable to find value vector of path 
`EXPR$3`, returning null instance.
2015-11-03 15:23:20,975 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] WARN  
o.a.d.e.e.ExpressionTreeMaterializer - Unable to find value vector of path 
`EXPR$3`, returning null instance.
2015-11-03 15:23:20,976 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: 
State change requested RUNNING --> FAILED
2015-11-03 15:23:20,976 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29c72e96-c9f6-9fce-ecf1-14eaa145f72b:0:0: 
State change requested FAILED --> FINISHED
2015-11-03 15:23:20,978 [29c72e96-c9f6-9fce-ecf1-14eaa145f72b:frag:0:0] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ClassCastException: Cannot 
cast org.apache.drill.exec.vector.NullableIntVector to 
org.apache.drill.exec.vector.complex.RepeatedValueVector

Fragment 0:0

[Error Id: c82c7f59-4dad-47ad-8901-5a2261c81279 on atsqa4-133.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
ClassCastException: Cannot cast org.apache.drill.exec.vector.NullableIntVector 
to org.apache.drill.exec.vector.complex.RepeatedValueVector

Fragment 0:0
[Error Id: c82c7f59-4dad-47ad-8901-5a2261c81279 on atsqa4-133.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
 ~[drill-common-1.2.0.jar:1.2.0]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
 [drill-java-exec-1.2.0.jar:1.2.0]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
 [drill-java-exec-1.2.0.jar:1.2.0]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
 [drill-java-exec-1.2.0.jar:1.2.0]
at 

[jira] [Commented] (DRILL-3987) Create a POC VV extraction

2015-11-03 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987519#comment-14987519
 ] 

Jacques Nadeau commented on DRILL-3987:
---

Now that DRILL-3229 and related items are merged, I'll make a second attempt at 
the split and post a proposed patch/PR. I'm inclined to start by extracting 
this into a module that sits between common and java-exec. Then we can look 
at what else we can separate out and how we can remove the dependency on common 
& protocol.

> Create a POC VV extraction
> --
>
> Key: DRILL-3987
> URL: https://issues.apache.org/jira/browse/DRILL-3987
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Jacques Nadeau
>Assignee: Jacques Nadeau
>
> I'd like to start by looking at an extraction that pulls out the base 
> concepts of:
> buffer allocation, value vectors and complexwriter/fieldreader.
> I need to figure out how to resolve some of the cross-dependency issues (such 
> as the jdbc accessor connections).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3987) Create a POC VV extraction

2015-11-03 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-3987:
--
Fix Version/s: 1.4.0

> Create a POC VV extraction
> --
>
> Key: DRILL-3987
> URL: https://issues.apache.org/jira/browse/DRILL-3987
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Jacques Nadeau
>Assignee: Jacques Nadeau
> Fix For: 1.4.0
>
>
> I'd like to start by looking at an extraction that pulls out the base 
> concepts of:
> buffer allocation, value vectors and complexwriter/fieldreader.
> I need to figure out how to resolve some of the cross-dependency issues (such 
> as the jdbc accessor connections).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)