Hello everyone,
I'm encountering an error that I can't find any information on. I've inserted
data into a table with storage in S3 in ORC ZLIB format. I can query this data
directly without issues. Running a query that requires LLAP causes the
following error:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:
java.io.IOException: java.io.IOException:
java.lang.UnsupportedOperationException
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
java.io.IOException: java.io.IOException:
java.lang.UnsupportedOperationException
at
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
at
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
... 15 more
Caused by: java.io.IOException: java.io.IOException:
java.lang.UnsupportedOperationException
at
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
at
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
at
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
at
org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
... 17 more
Caused by: java.io.IOException: java.lang.UnsupportedOperationException
at
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readIndexStreams(EncodedReaderImpl.java:1954)
at
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:384)
at
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
at
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
at
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
... 6 more
Caused by: java.lang.UnsupportedOperationException
at java.nio.ByteBuffer.array(ByteBuffer.java:994)
at org.apache.orc.impl.ZlibCodec.decompress(ZlibCodec.java:94)
at
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.decompressChunk(EncodedReaderImpl.java:1283)
at
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:902)
at
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readIndexStreams(EncodedReaderImpl.java:1918)
This is specific to ORC ZLIB files on S3 being processed through LLAP. I can
query other types of files in S3 through LLAP, I can query the ORC ZLIB data on
S3 directly (select * from orc_zlib_on_s3_table limit 10) and I can execute the
same query that fails in LLAP in Native Tez containers. Does anyone have any
suggestions as to what the problem might be or how to debug it?
Thanks,
Aaron