[ https://issues.apache.org/jira/browse/HIVE-18817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16381447#comment-16381447 ]
Jason Dere commented on HIVE-18817:
-----------------------------------

+1 pending tests

> ArrayIndexOutOfBounds exception during read of ACID table.
> ----------------------------------------------------------
>
>                 Key: HIVE-18817
>                 URL: https://issues.apache.org/jira/browse/HIVE-18817
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC, Transactions
>            Reporter: Jason Dere
>            Assignee: Eugene Koifman
>            Priority: Major
>         Attachments: HIVE-18817.01.patch, HIVE-18817.02.patch, HIVE-18817.03.patch, repro.patch
>
>
> Seeing some users hitting the following stack trace:
> {noformat}
> 2018-02-26 05:49:45,876 [ERROR] [TezChild] |tez.TezProcessor|: java.lang.RuntimeException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 7
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:196)
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:142)
>         at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:66)
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
>         at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
>         at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 7
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:258)
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)
>         ... 19 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
>         at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.discoverKeyBounds(OrcRawRecordMerger.java:388)
>         at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:457)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1456)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1342)
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:255)
>         ... 20 more
> {noformat}
> Have a JUnit test that appears to produce a similar stack trace - looks like
> this occurs if there is an OrcSplit of an ACID table where the split offset
> is beyond the starting offset of the last stripe in the ORC file.
> cc [~ekoifman]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
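
The condition described in the issue (a split offset past the start of the last stripe) can be illustrated with a minimal, hypothetical Java sketch. It is not the `OrcRawRecordMerger` code and not the attached patch; the class `StripeBoundsSketch`, the method `stripesBefore`, and the sample offsets and keys are all assumptions made for illustration. It only shows one way that counting stripes which start before the split offset can yield an index equal to the number of stripes, which is out of range for any per-stripe array.

```java
/** Hypothetical illustration only; not taken from Hive's OrcRawRecordMerger. */
public class StripeBoundsSketch {

    /**
     * Counts the stripes whose start offset lies strictly before splitOffset.
     * When splitOffset is past the start of the last stripe, the result equals
     * stripeOffsets.length, i.e. one past the last valid stripe index.
     */
    public static int stripesBefore(long[] stripeOffsets, long splitOffset) {
        int count = 0;
        for (long start : stripeOffsets) {
            if (splitOffset > start) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed sample data: three stripes and one key per stripe.
        long[] stripeOffsets = {3L, 1000L, 2000L};
        long[] perStripeKeys = {10L, 20L, 30L};

        // A split that starts after the last stripe's start offset.
        int index = stripesBefore(stripeOffsets, 2500L);

        try {
            // Using the count directly as an array index overruns the array.
            System.out.println("key = " + perStripeKeys[index]);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("index " + index + " is out of bounds for "
                    + perStripeKeys.length + " stripes");
        }
    }
}
```

Whether the real code path fails in exactly this way would need to be confirmed against the attached repro.patch.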