[
https://issues.apache.org/jira/browse/HIVE-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Prasanth J updated HIVE-6320:
-----------------------------
Description:
The ORC data reader crashes with a BufferUnderflowException while trying to read data row-by-row with predicate push-down enabled on current trunk.
Stack trace:
{code}
Caused by: java.nio.BufferUnderflowException
at java.nio.Buffer.nextGetIndex(Buffer.java:472)
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:117)
at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:207)
at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:240)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:53)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:288)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:510)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1581)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2707)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:125)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:101)
{code}
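The trace shows HeapByteBuffer.get() failing inside CompressedStream.read(), i.e. the stream asked its current chunk buffer for a byte after position had already reached limit. For reference, a minimal standalone sketch of that JDK behavior (plain java.nio semantics, nothing ORC-specific):
{code}
import java.nio.ByteBuffer;

public class UnderflowDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(4);
        buf.position(buf.limit()); // nothing remaining between position and limit
        buf.get(); // throws java.nio.BufferUnderflowException via Buffer.nextGetIndex()
    }
}
{code}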
Or it could be:
{code}
Caused by: java.lang.IndexOutOfBoundsException
at java.nio.ByteBuffer.wrap(ByteBuffer.java:352)
at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:180)
at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:197)
at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:252)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:59)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:300)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:475)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1159)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2198)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:108)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:57)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
... 15 more
{code}
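Here readHeader() at InStream.java:180 dies in ByteBuffer.wrap(), which rejects an (array, offset, length) triple that falls outside the backing array. Again a minimal standalone sketch of the JDK behavior, not of the ORC code path itself:
{code}
import java.nio.ByteBuffer;

public class WrapDemo {
    public static void main(String[] args) {
        byte[] chunk = new byte[16];
        // offset + length = 20 exceeds chunk.length = 16, so wrap() rejects it
        ByteBuffer.wrap(chunk, 10, 10); // throws java.lang.IndexOutOfBoundsException
    }
}
{code}
In both cases the reader has apparently ended up with buffer offsets that disagree with the bytes actually loaded for the stream, which is consistent with PPD row-group skipping being the trigger.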
The query run is
{code}
set hive.vectorized.execution.enabled=false;
set hive.optimize.index.filter=true;
insert overwrite directory '/tmp/foo' select * from lineitem where l_orderkey is not null;
{code}
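For context: hive.optimize.index.filter=true is the switch that enables predicate push-down into the ORC reader, and disabling vectorized execution forces the row-by-row reader named in the summary; both settings are needed to hit this path. Presumably setting hive.optimize.index.filter=false works around the crash until a fix lands.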
Reason:
was:
The ORC data reader crashes with a BufferUnderflowException while trying to read data row-by-row with predicate push-down enabled on current trunk.
{code}
Caused by: java.nio.BufferUnderflowException
at java.nio.Buffer.nextGetIndex(Buffer.java:472)
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:117)
at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:207)
at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:240)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:53)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:288)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:510)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1581)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2707)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:125)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:101)
{code}
The query run is
{code}
set hive.vectorized.execution.enabled=false;
set hive.optimize.index.filter=true;
insert overwrite directory '/tmp/foo' select * from lineitem where l_orderkey is not null;
{code}
> Row-based ORC reader with PPD turned on dies on BufferUnderflowException/IndexOutOfBoundsException
> ---------------------------------------------------------------------------------------------------
>
> Key: HIVE-6320
> URL: https://issues.apache.org/jira/browse/HIVE-6320
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Affects Versions: 0.13.0
> Reporter: Gopal V
> Assignee: Prasanth J
> Labels: orcfile
> Fix For: 0.13.0
>
> Attachments: HIVE-6320.1.patch, HIVE-6320.2.patch, HIVE-6320.2.patch, HIVE-6320.3.patch
>
>
> The ORC data reader crashes with a BufferUnderflowException while trying to read data row-by-row with predicate push-down enabled on current trunk.
> Stack trace:
> {code}
> Caused by: java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:472)
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:117)
> at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:207)
> at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450)
> at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:240)
> at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:53)
> at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:288)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:510)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1581)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2707)
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:125)
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:101)
> {code}
> Or it could be:
> {code}
> Caused by: java.lang.IndexOutOfBoundsException
> at java.nio.ByteBuffer.wrap(ByteBuffer.java:352)
> at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:180)
> at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:197)
> at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450)
> at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:252)
> at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:59)
> at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:300)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:475)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1159)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2198)
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:108)
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:57)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
> ... 15 more
> {code}
> The query run is
> {code}
> set hive.vectorized.execution.enabled=false;
> set hive.optimize.index.filter=true;
> insert overwrite directory '/tmp/foo' select * from lineitem where l_orderkey is not null;
> {code}
> Reason:
--
This message was sent by Atlassian JIRA
(v6.2#6252)