[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2020-06-15 Thread Jl.Yang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136371#comment-17136371
 ] 

Jl.Yang commented on HIVE-14564:


Hi, in version 0.98 I run a SELECT with a LEFT JOIN, and in the first subquery I 
use the ORDER BY keyword to sort the query result. It hits the same error; when I 
remove the ORDER BY keyword, the error disappears. I want to know whether this is 
the same issue with serialization by LazyBinarySerDe. 

2020-06-16 11:01:21,128 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report 
from attempt_1587475402360_2625484_m_000384_3: Error: 
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
Hive Runtime Error while processing row 
{"_col0":29,"_col1":"_vivoX6PlusDEES8F6IFIFLBPFVK\u001dN_vivoX6PlusDEES8F6IFIFLBPFVK�]:�ڈ\u0011r
 
\u�F�{\nu","_col2":"�K�\u0006&\u0006�\u00014<�70AE�\u0006��\u0011�#��Dp\u�&��\u0011���\u0006�\u00014<�\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u","_col3":70,"_col4":76,"_col5":66,"_col6":null,"_col7":null,"_col8":null,"_col9":null,"_col10":null,"_col11":null,"_col12":70,"_col13":null,"_col14":86,"_col15":null,"_col16":156413}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row 
{"_col0":29,"_col1":"_vivoX6PlusDEES8F6IFIFLBPFVK\u001dN_vivoX6PlusDEES8F6IFIFLBPFVK�]:�ڈ\u0011r
 
\u�F�{\nu","_col2":"�K�\u0006&\u0006�\u00014<�70AE�\u0006��\u0011�#��Dp\u�&��\u0011���\u0006�\u00014<�\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u","_col3":70,"_col4":76,"_col5":66,"_col6":null,"_col7":null,"_col8":null,"_col9":null,"_col10":null,"_col11":null,"_col12":70,"_col13":null,"_col14":86,"_col15":null,"_col16":156413}
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ArrayIndexOutOfBoundsException
at 
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:329)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:539)
... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at org.apache.hadoop.io.Text.set(Text.java:225)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:261)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:199)
at 
org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
at 
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.populateCachedDistributionKeys(ReduceSinkOperator.java:349)
at 
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:282)
... 13 more

> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query 

[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2017-04-05 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958359#comment-15958359
 ] 

zhihai xu commented on HIVE-14564:
--

[~ashutoshc], sorry for the long delay; I was struggling with the test case. 
I finally found a way to reproduce this issue with a good test case. I created a 
new .q test case, "select_column_pruning.q", in the patch HIVE-14564.003.patch, 
which triggers this issue. Without the patch, the test fails with the 
following exception:
{code}
2017-04-05T22:21:28,861 ERROR [LocalJobRunner Map Task Executor #0] 
mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
Error while processing row [Error getting row data with exception 
java.lang.ArrayIndexOutOfBoundsException: 140
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:183)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:202)
at 
org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:366)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:202)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:188)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.toErrorMessage(MapOperator.java:588)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
 ]
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:562)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ArrayIndexOutOfBoundsException: 140
at 
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:397)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:148)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:547)
... 10 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 140
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:183)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:202)
at 
org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:95)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
at 
org.apache.hadoop.hive.ql.exec.

[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2017-04-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959793#comment-15959793
 ] 

Hive QA commented on HIVE-14564:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12862236/HIVE-14564.003.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 149 failed/errored test(s), 10580 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[drop_with_concurrency]
 (batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[escape_comments] 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_8] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl_dp] 
(batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_tbllvl] 
(batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[complex_alias] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constant_prop_3] 
(batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[correlationoptimizer13] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_udf] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[display_colstats_tbllvl] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[distinct_windowing_no_cbo]
 (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic2] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynamic_rdd_cache] 
(batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[except_all] (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby9] (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_join_pushdown] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_position] 
(batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[having2] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_update] 
(batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[limit_pushdown_negative] 
(batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_insert_gby3] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multigroupby_singlemr] 
(batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nested_column_pruning] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_non_dictionary_encoding_vectorization]
 (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_vectorization]
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_gby2] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_gby] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing1] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ptfgroupbyjoin] 
(batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reduce_deduplicate_extended2]
 (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_in_having] 
(batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_notexists] 
(batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_notexists_having]
 (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_notin_having] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_unqualcolumnrefs]
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_display_colstats_tbllvl]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_binarysetfunctions] 
(batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate]
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round_2] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3] 
(batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] 
(batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5] 
(batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_15] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_gby2] 
(batchId=33)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_partition_pruning_2]
 (batchId=139)
org.ap

[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2017-04-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959921#comment-15959921
 ] 

Ashutosh Chauhan commented on HIVE-14564:
-

Thanks for the test case. The fix looks good to me. You may use a Set for colNames, 
since the only thing it is doing is looking up the name.
It also seems you need to update the golden files for the tests.
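
A minimal Java sketch of the data-structure change being suggested here, assuming colNames is only probed with contains()-style lookups (the class and variable names below are illustrative, not the actual patch):
{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of "use Set for colNames": membership tests against a
// HashSet are O(1), while List.contains() scans the whole list on every lookup.
public class ColNamesLookupSketch {
  public static void main(String[] args) {
    List<String> colNamesList = Arrays.asList("_col0", "_col3", "_col16");

    // Before: linear scan per lookup.
    boolean neededViaList = colNamesList.contains("_col3");

    // After: build the Set once, then each lookup is constant time.
    Set<String> colNames = new HashSet<>(colNamesList);
    boolean neededViaSet = colNames.contains("_col3");

    System.out.println(neededViaList + " " + neededViaSet); // true true
  }
}
{code}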

> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: HIVE-14564.000.patch, HIVE-14564.001.patch, 
> HIVE-14564.002.patch, HIVE-14564.003.patch
>
>
> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
>   ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at java.lang.System.arraycopy(Native Method)
>   at org.apache.hadoop.io.Text.set(Text.java:225)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
>   ... 13 more
> {code}
> The exception occurs because the serialization and the deserialization don't match.
> The serialization by LazyBinarySerDe in the previous MapReduce job used a 
> different order of columns. When the current MapReduce job deserializes the 
> intermediate sequence file generated by the previous MapReduce job, it gets 
> corrupted data because LazyBinaryStruct deserializes with the wrong order of 
> columns. The column mismatch between serialization and deserialization is 
> caused by the SelectOperator's column pruning in {{ColumnPrunerSelectProc}}.
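
To make the failure mode above concrete, here is a toy Java sketch (it does not use the real LazyBinary wire format or any Hive API, only an assumed simplified layout): a row is written with one column order and read back assuming another, so a field that is really an int gets interpreted as a string length and the copy overruns the buffer, just like the System.arraycopy frame in the stack traces.
{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustration only: writer and reader disagree on column order, so the reader
// misreads an int column as a string length and System.arraycopy throws
// ArrayIndexOutOfBoundsException, the same symptom reported in this issue.
public class ColumnOrderMismatchDemo {
  public static void main(String[] args) {
    byte[] str = "_vivoX6Plus".getBytes(StandardCharsets.UTF_8);

    // Writer's column order: (string length, string bytes, int value).
    ByteBuffer writer = ByteBuffer.allocate(4 + str.length + 4);
    writer.putInt(str.length).put(str).putInt(156413);
    byte[] row = writer.array();

    // Reader assumes a different column order and treats the trailing int
    // (156413) as the string length.
    int wrongLength = ByteBuffer.wrap(row, row.length - 4, 4).getInt();
    byte[] text = new byte[wrongLength];

    // Copying 156413 bytes out of a 19-byte row overruns the source array.
    System.arraycopy(row, 4, text, 0, wrongLength); // ArrayIndexOutOfBoundsException
  }
}
{code}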



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2017-04-07 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961475#comment-15961475
 ] 

zhihai xu commented on HIVE-14564:
--

[~ashutoshc], thanks for the review! Yes, using a Set will be better. I uploaded a 
new patch, HIVE-14564.004.patch, which updates the golden files for the tests and 
uses a Set for colNames. Please review it.
It looks like the following 5 test failures are related to my change:
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteVarchar 
(batchId=178)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[drop_with_concurrency]
 (batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[escape_comments] 
(batchId=235)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hive.jdbc.TestJdbcDriver2.testResultSetMetaData (batchId=221)


> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: HIVE-14564.000.patch, HIVE-14564.001.patch, 
> HIVE-14564.002.patch, HIVE-14564.003.patch, HIVE-14564.004.patch
>
>
> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
>   ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at java.lang.System.arraycopy(Native Method)
>   at org.apache.hadoop.io.Text.set(Text.java:225)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
>   ... 13 more
> {code}
> The exception occurs because the serialization and the deserialization don't match.
> The serialization by LazyBinarySerDe in the previous MapReduce job used a 
> different order of columns. When the current MapReduce job deserializes the 
> intermediate sequence file generated by the previous MapReduce job, it gets 
> corrupted data because LazyBinaryStruct deserializes with the wrong order of 
> columns. The column mismatch between serialization and deserialization is 
> caused by the SelectOperator's 

[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2017-04-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961558#comment-15961558
 ] 

Hive QA commented on HIVE-14564:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12862536/HIVE-14564.004.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10589 tests 
executed
*Failed tests:*
{noformat}
TestSSL - did not produce a TEST-*.xml file (likely timed out) (batchId=220)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[drop_with_concurrency]
 (batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[escape_comments] 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_skewtable] 
(batchId=76)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hive.jdbc.TestJdbcDriver2.testResultSetMetaData (batchId=221)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4614/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4614/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4614/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12862536 - PreCommit-HIVE-Build

> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: HIVE-14564.000.patch, HIVE-14564.001.patch, 
> HIVE-14564.002.patch, HIVE-14564.003.patch, HIVE-14564.004.patch
>
>
> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
>   ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at java.lang.System.arraycopy(Native Method)
>   at org.apache.hadoop.io.Text.set(Text.java:225)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.eval

[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2017-04-07 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961615#comment-15961615
 ] 

zhihai xu commented on HIVE-14564:
--

None of the test failures are related to my change.

> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: HIVE-14564.000.patch, HIVE-14564.001.patch, 
> HIVE-14564.002.patch, HIVE-14564.003.patch, HIVE-14564.004.patch
>
>
> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
>   ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at java.lang.System.arraycopy(Native Method)
>   at org.apache.hadoop.io.Text.set(Text.java:225)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
>   ... 13 more
> {code}
> The exception occurs because the serialization and the deserialization don't match.
> The serialization by LazyBinarySerDe in the previous MapReduce job used a 
> different order of columns. When the current MapReduce job deserializes the 
> intermediate sequence file generated by the previous MapReduce job, it gets 
> corrupted data because LazyBinaryStruct deserializes with the wrong order of 
> columns. The column mismatch between serialization and deserialization is 
> caused by the SelectOperator's column pruning in {{ColumnPrunerSelectProc}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2017-04-08 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961886#comment-15961886
 ] 

zhihai xu commented on HIVE-14564:
--

Thanks [~ashutoshc] for reviewing and committing the patch!

> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-14564.000.patch, HIVE-14564.001.patch, 
> HIVE-14564.002.patch, HIVE-14564.003.patch, HIVE-14564.004.patch
>
>
> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
>   ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at java.lang.System.arraycopy(Native Method)
>   at org.apache.hadoop.io.Text.set(Text.java:225)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
>   ... 13 more
> {code}
> The exception occurs because the serialization and the deserialization don't match.
> The serialization by LazyBinarySerDe in the previous MapReduce job used a 
> different order of columns. When the current MapReduce job deserializes the 
> intermediate sequence file generated by the previous MapReduce job, it gets 
> corrupted data because LazyBinaryStruct deserializes with the wrong order of 
> columns. The column mismatch between serialization and deserialization is 
> caused by the SelectOperator's column pruning in {{ColumnPrunerSelectProc}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2016-08-17 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15425668#comment-15425668
 ] 

zhihai xu commented on HIVE-14564:
--

I uploaded a patch, HIVE-14564.000.patch, which preserves the order of 
columns in the input schema when the SelectOperator does column pruning.
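
As a rough sketch of that idea (illustrative Java, not the actual ColumnPrunerSelectProc change): the pruned column list is derived by walking the operator's original schema order and keeping only the needed columns, so the relative order of the surviving columns never changes.
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical order-preserving pruning: the output order follows the input
// schema, not the iteration order of the "needed" collection.
public class OrderPreservingPruneSketch {
  static List<String> prune(List<String> schemaOrder, Set<String> needed) {
    List<String> pruned = new ArrayList<>();
    for (String col : schemaOrder) {   // walk the original schema order
      if (needed.contains(col)) {
        pruned.add(col);               // keep the relative order intact
      }
    }
    return pruned;
  }

  public static void main(String[] args) {
    List<String> schema = Arrays.asList("_col0", "_col1", "_col2", "_col3");
    Set<String> needed = new HashSet<>(Arrays.asList("_col3", "_col0"));
    System.out.println(prune(schema, needed)); // [_col0, _col3]
  }
}
{code}
With the pruned order matching the schema order, the serializer in the previous job and the deserializer in the current job agree on column positions, which is exactly what the stack traces above show going wrong.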

> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: HIVE-14564.000.patch
>
>
> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
>   ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at java.lang.System.arraycopy(Native Method)
>   at org.apache.hadoop.io.Text.set(Text.java:225)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
>   ... 13 more
> {code}
> The exception occurs because the serialization and the deserialization don't match.
> The serialization by LazyBinarySerDe in the previous MapReduce job used a 
> different order of columns. When the current MapReduce job deserializes the 
> intermediate sequence file generated by the previous MapReduce job, it gets 
> corrupted data because LazyBinaryStruct deserializes with the wrong order of 
> columns. The column mismatch between serialization and deserialization is 
> caused by the SelectOperator's column pruning in {{ColumnPrunerSelectProc}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2016-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426343#comment-15426343
 ] 

Hive QA commented on HIVE-14564:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12824260/HIVE-14564.000.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 132 failed/errored test(s), 10442 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_groupby]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl_dp]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_tbllvl]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[complex_alias]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constant_prop_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[correlationoptimizer13]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_udf]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[display_colstats_tbllvl]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynamic_rdd_cache]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby9]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_join_pushdown]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_position]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[having2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_update]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lateral_view]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[limit_pushdown3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[limit_pushdown]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[limit_pushdown_negative]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_insert_gby3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_insert_lateral_view]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multigroupby_singlemr]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[offset_limit_ppd_optimizer]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_gby2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_gby]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ptf]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ptfgroupbyjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reduce_deduplicate_extended2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_in_having]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_notexists]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_notexists_having]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_unqualcolumnrefs]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_display_colstats_tbllvl]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_udf]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_0]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_15]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_short_regress]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_ptf]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_gby2]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_partition_pruning_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[limit_pushdown]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[ptf]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vector_decim

[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2016-08-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427266#comment-15427266
 ] 

Ashutosh Chauhan commented on HIVE-14564:
-

[~zxu] Thanks for the patch. Can you add a test case to demonstrate the problem 
you are facing here?

> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: HIVE-14564.000.patch
>
>
> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
>   ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at java.lang.System.arraycopy(Native Method)
>   at org.apache.hadoop.io.Text.set(Text.java:225)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
>   ... 13 more
> {code}
> The exception occurs because the serialization and the deserialization don't match.
> The serialization by LazyBinarySerDe in the previous MapReduce job used a 
> different order of columns. When the current MapReduce job deserializes the 
> intermediate sequence file generated by the previous MapReduce job, it gets 
> corrupted data because LazyBinaryStruct deserializes with the wrong order of 
> columns. The column mismatch between serialization and deserialization is 
> caused by the SelectOperator's column pruning in {{ColumnPrunerSelectProc}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2016-08-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15445001#comment-15445001
 ] 

Hive QA commented on HIVE-14564:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12825960/HIVE-14564.001.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1030/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1030/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1030/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-1030/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at cb534ab HIVE-14515: Schema evolution uses slow INSERT INTO .. 
VALUES (Matt McCline, reviewed by Prasanth Jayachandran)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at cb534ab HIVE-14515: Schema evolution uses slow INSERT INTO .. 
VALUES (Matt McCline, reviewed by Prasanth Jayachandran)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12825960 - PreCommit-HIVE-MASTER-Build

> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: HIVE-14564.000.patch, HIVE-14564.001.patch
>
>
> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache

[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2016-08-28 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15445006#comment-15445006
 ] 

zhihai xu commented on HIVE-14564:
--

Thanks for the review, [~ashutoshc]! A lot of test cases were updated to adapt 
to this patch; it looks like all of these cases can verify it.

> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: HIVE-14564.000.patch, HIVE-14564.001.patch
>
>
> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
>   ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at java.lang.System.arraycopy(Native Method)
>   at org.apache.hadoop.io.Text.set(Text.java:225)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
>   ... 13 more
> {code}
> The exception occurs because serialization and deserialization do not match.
> The serialization by LazyBinarySerDe in the previous MapReduce job used a 
> different column order. When the current MapReduce job deserializes the 
> intermediate sequence file generated by the previous job, it reads corrupted 
> data because LazyBinaryStruct assumes the wrong column order. The mismatch in 
> columns between serialization and deserialization is caused by the 
> SelectOperator's column pruning ({{ColumnPrunerSelectProc}}).
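
As a rough, self-contained illustration of why a writer/reader column-order 
mismatch surfaces as an ArrayIndexOutOfBoundsException inside System.arraycopy 
(a toy sketch only, not Hive's LazyBinarySerDe/LazyBinaryStruct code; all names 
and values below are invented): the writer emits (int, string) while the reader 
assumes the string comes first, so the int value is misread as a string length 
and the copy runs past the end of the buffer, much like Text.set() in the trace 
above.

{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ColumnOrderMismatchSketch {

  // Writer: column order is (int id, string name), with a length prefix for the string.
  static byte[] writeIntThenString(int id, String name) {
    byte[] s = name.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buf = ByteBuffer.allocate(4 + 4 + s.length);
    buf.putInt(id);        // column 0: int
    buf.putInt(s.length);  // column 1: string length prefix
    buf.put(s);            //           string bytes
    return buf.array();
  }

  // Reader: wrongly assumes column 0 is the string, so it treats the int as a length.
  static String readAssumingStringFirst(byte[] bytes) {
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    int len = buf.getInt();      // actually the int column, e.g. 123456
    byte[] dst = new byte[len];
    // Copying 'len' bytes out of a much smaller source array fails inside arraycopy,
    // analogous to Text.set()/LazyBinaryString.init() in the stack trace above.
    System.arraycopy(bytes, 4, dst, 0, len);
    return new String(dst, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] row = writeIntThenString(123456, "hello");
    // Throws ArrayIndexOutOfBoundsException at System.arraycopy.
    System.out.println(readAssumingStringFirst(row));
  }
}
{code}

The bytes themselves are fine; it is the disagreement about column order between 
the stage that serializes and the stage that deserializes that turns a valid row 
into an out-of-bounds read.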



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2016-08-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447139#comment-15447139
 ] 

Hive QA commented on HIVE-14564:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12826046/HIVE-14564.002.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 62 failed/errored test(s), 10466 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_udf]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_0]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_15]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_short_regress]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_ptf]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_gby2]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_1]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[limit_pushdown]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[ptf]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_decimal_round_2]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_decimal_udf]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_groupby_3]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_groupby_reduce]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_interval_2]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_orderby_5]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vectorization_0]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vectorization_13]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vectorization_15]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vectorization_short_regress]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vectorized_parquet]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vectorized_parquet_types]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vectorized_ptf]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[windowing_gby]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_limit]
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query17]
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query72]
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query85]
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query89]
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query91]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[dynamic_rdd_cache]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby9]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_position]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multi_insert_gby3]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multi_insert_lateral_view]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multigroupby_singlemr]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ptf]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_groupby_3]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_orderby_5]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_0]
org.

[jira] [Commented] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

2016-08-29 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447322#comment-15447322
 ] 

Ashutosh Chauhan commented on HIVE-14564:
-

I would still like to see a query which reproduces this problem, since we 
haven't seen it in the past. Also, a test will be useful for regression purposes.

> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> ---
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: HIVE-14564.000.patch, HIVE-14564.001.patch, 
> HIVE-14564.002.patch
>
>
> Column Pruning generates out of order columns in SelectOperator which cause 
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
>   ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at java.lang.System.arraycopy(Native Method)
>   at org.apache.hadoop.io.Text.set(Text.java:225)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
>   ... 13 more
> {code}
> The exception occurs because serialization and deserialization do not match.
> The serialization by LazyBinarySerDe in the previous MapReduce job used a 
> different column order. When the current MapReduce job deserializes the 
> intermediate sequence file generated by the previous job, it reads corrupted 
> data because LazyBinaryStruct assumes the wrong column order. The mismatch in 
> columns between serialization and deserialization is caused by the 
> SelectOperator's column pruning ({{ColumnPrunerSelectProc}}).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)