[jira] [Updated] (HIVE-10867) ArrayIndexOutOfBoundsException LazyBinaryUtils.byteArrayToLong with Hive on Tez

2016-04-26 Thread Alina Abramova (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alina Abramova updated HIVE-10867:
--
Attachment: HIVE-10867.patch

I created this patch based on the fix for https://issues.apache.org/jira/browse/HIVE-9517
I see that the same fix works for Tez too.
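
For readers unfamiliar with this failure mode, here is a minimal plain-Java sketch of how decoding a fixed-width 8-byte value from a buffer that holds fewer bytes raises an ArrayIndexOutOfBoundsException. The class FixedLongReadSketch and the helper readFixedLong are hypothetical names used only for illustration; this is not the Hive source, and it is only an assumption that the Tez failure arises in this way. What the query itself shows is that the three UNION ALL branches produce diff as bigint, bigint and double respectively.

```java
public class FixedLongReadSketch {

    // Hypothetical helper (not Hive code): assembles eight big-endian
    // bytes starting at offset into a single long.
    static long readFixedLong(byte[] bytes, int offset) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            value = (value << 8) | (bytes[offset + i] & 0xFFL);
        }
        return value;
    }

    public static void main(String[] args) {
        // A double written as eight big-endian bytes decodes correctly.
        long bits = Double.doubleToLongBits(-1600.0);
        byte[] full = new byte[8];
        for (int i = 0; i < 8; i++) {
            full[i] = (byte) (bits >>> (56 - 8 * i));
        }
        System.out.println(Double.longBitsToDouble(readFixedLong(full, 0)));

        // A buffer holding fewer than eight bytes cannot be decoded the
        // same way: the loop runs past the end of the array.
        byte[] tooShort = new byte[6];
        try {
            readFixedLong(tooShort, 0);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("short buffer: " + e.getClass().getSimpleName());
        }
    }
}
```

The point of the sketch is only that a reader expecting a fixed eight bytes fails when the writer emitted fewer, which is one way a type disagreement between union branches could surface at read time.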

> ArrayIndexOutOfBoundsException LazyBinaryUtils.byteArrayToLong with Hive on 
> Tez
> ---
>
> Key: HIVE-10867
> URL: https://issues.apache.org/jira/browse/HIVE-10867
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Tez
>Affects Versions: 0.14.0, 1.0.0
> Environment: Hortonworks distribution 2.2.4-2
> Hive 0.14.0
> Tez 0.5.2.2.2.4.2-2 on cluster
> Tez 0.7.0 in local setup
>Reporter: Per Ullberg
>Assignee: Alina Abramova
> Attachments: HIVE-10867.patch
>
>
> Hi, 
> The following query runs fine on the MapReduce engine, but when 
> hive.execution.engine is set to tez it produces an ArrayIndexOutOfBoundsException.
> Query
> {code}
> create external table table_1 (id string, date string, amount bigint);
> insert into table table_1 values (305,'2013-03-02',3790);
> create external table table_2 (id string);
> insert into table table_2 VALUES (305);
> create external table table_3 (id string, date_3 string, amount_3 bigint);
> insert into table table_3 values (305,'2013-03-01',-1600);
> create external table table_4 (id bigint, str_4 string, amount_4 bigint);
> create table table_5
> as
>   SELECT
>     c.diff
>   FROM (
>     SELECT
>       id AS id,
>       date AS create_date,
>       -amount AS diff
>     FROM table_1
>     UNION ALL
>     SELECT
>       p.id AS id,
>       p.str_4 AS create_date,
>       -p.amount_4 AS diff
>     FROM table_4 p
>     UNION ALL
>     SELECT
>       id,
>       create_date,
>       diff
>     FROM (
>       SELECT
>         i.id AS id,
>         tp.date_3 AS create_date,
>         cast(amount_3 as double) AS diff
>       FROM table_3 tp
>       INNER JOIN table_2 i ON cast(tp.id as string) = cast(i.id as string)
>     ) fees
>   ) c
>   INNER JOIN table_2 i ON cast(c.id as string) = cast(i.id as string);
> {code}
> Results with map reduce engine:
> {code}
> hive> select * from table_5;
> OK
> -1600.0
> -3790.0
> Time taken: 0.061 seconds, Fetched: 2 row(s)
> {code}
> Exception with tez engine:
> {code}
> Status: Failed
> Vertex failed, vertexName=Reducer 4, vertexId=vertex_1432809678493_0891_4_06, diagnostics=[Task failed, taskId=task_1432809678493_0891_4_06_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"305"},"value":{"_col1":-1600.0}}
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
>   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"305"},"value":{"_col1":-1600.0}}
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:337)
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:218)
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:168)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
>   ... 13 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 6
>   at 
> 

[jira] [Updated] (HIVE-10867) ArrayIndexOutOfBoundsException LazyBinaryUtils.byteArrayToLong with Hive on Tez

2016-04-26 Thread Alina Abramova (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alina Abramova updated HIVE-10867:
--
Affects Version/s: 1.0.0
   Status: Patch Available  (was: In Progress)

> ArrayIndexOutOfBoundsException LazyBinaryUtils.byteArrayToLong with Hive on 
> Tez
> ---
>
> Key: HIVE-10867
> URL: https://issues.apache.org/jira/browse/HIVE-10867
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Tez
>Affects Versions: 1.0.0, 0.14.0
> Environment: Hortonworks distribution 2.2.4-2
> Hive 0.14.0
> Tez 0.5.2.2.2.4.2-2 on cluster
> Tez 0.7.0 in local setup
>Reporter: Per Ullberg
>Assignee: Alina Abramova
> Attachments: HIVE-10867.patch
>
>
> Hi, 
> The following query runs fine on the MapReduce engine, but when 
> hive.execution.engine is set to tez it produces an ArrayIndexOutOfBoundsException.
> Query
> {code}
> create external table table_1 (id string, date string, amount bigint);
> insert into table table_1 values (305,'2013-03-02',3790);
> create external table table_2 (id string);
> insert into table table_2 VALUES (305);
> create external table table_3 (id string, date_3 string, amount_3 bigint);
> insert into table table_3 values (305,'2013-03-01',-1600);
> create external table table_4 (id bigint, str_4 string, amount_4 bigint);
> create table table_5
> as
>   SELECT
>     c.diff
>   FROM (
>     SELECT
>       id AS id,
>       date AS create_date,
>       -amount AS diff
>     FROM table_1
>     UNION ALL
>     SELECT
>       p.id AS id,
>       p.str_4 AS create_date,
>       -p.amount_4 AS diff
>     FROM table_4 p
>     UNION ALL
>     SELECT
>       id,
>       create_date,
>       diff
>     FROM (
>       SELECT
>         i.id AS id,
>         tp.date_3 AS create_date,
>         cast(amount_3 as double) AS diff
>       FROM table_3 tp
>       INNER JOIN table_2 i ON cast(tp.id as string) = cast(i.id as string)
>     ) fees
>   ) c
>   INNER JOIN table_2 i ON cast(c.id as string) = cast(i.id as string);
> {code}
> Results with map reduce engine:
> {code}
> hive> select * from table_5;
> OK
> -1600.0
> -3790.0
> Time taken: 0.061 seconds, Fetched: 2 row(s)
> {code}
> Exception with tez engine:
> {code}
> Status: Failed
> Vertex failed, vertexName=Reducer 4, vertexId=vertex_1432809678493_0891_4_06, diagnostics=[Task failed, taskId=task_1432809678493_0891_4_06_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"305"},"value":{"_col1":-1600.0}}
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
>   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"305"},"value":{"_col1":-1600.0}}
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:337)
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:218)
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:168)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
>   ... 13 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 6
>   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.byteArrayToLong(LazyBinaryUtils.java:84)
>   at 
>