Actually, it looks like LLAP is trying to get the ByteBuffer array from a direct byte buffer. Turning off direct byte buffers on read should fix the problem.
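To illustrate the failure mode, here is a minimal sketch in plain Java. It is not ORC or LLAP code, and the class and helper names (DirectBufferDemo, arrayThrows, toBytes) are made up for this example. It shows only the documented java.nio behaviour: array() works on a heap buffer, throws UnsupportedOperationException on a direct buffer, and hasArray() tells the two apart.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of ORC, Hive, or LLAP.
class DirectBufferDemo {

    /** Returns true if calling array() on the buffer throws UnsupportedOperationException. */
    static boolean arrayThrows(ByteBuffer buf) {
        try {
            buf.array();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    /** Copies the remaining bytes of any buffer (heap or direct) without calling array(). */
    static byte[] toBytes(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate();      // leave the caller's position untouched
        byte[] out = new byte[view.remaining()];
        view.get(out);                          // bulk get works for direct buffers too
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(3);
        ByteBuffer direct = ByteBuffer.allocateDirect(3);
        direct.put(new byte[] {1, 2, 3}).flip();

        System.out.println("heap.hasArray()     = " + heap.hasArray());
        System.out.println("direct.hasArray()   = " + direct.hasArray());
        System.out.println("direct.array() throws: " + arrayThrows(direct));
        System.out.println("bytes copied from direct buffer: " + toBytes(direct).length);
    }
}
```

Code that has to accept both kinds of buffer can check hasArray() first and fall back to a copy like toBytes(); the alternative is to make sure only heap buffers reach the code path that calls array().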
.. Owen

On Thu, Jun 25, 2020 at 7:27 AM Aaron Grubb <aaron.gr...@clearpier.com> wrote:

> This appears to have been caused by orc.write.variable.length.blocks=true
> which I had set for HDFS-based tables. Setting this to false and inserting
> data into the S3 table appears to have fixed this problem.
>
> *From:* Aaron Grubb <aaron.gr...@clearpier.com>
> *Sent:* Wednesday, June 24, 2020 4:04 PM
> *To:* user@hive.apache.org
> *Subject:* LLAP can't read ORC ZLIB files from S3
>
> Hello everyone,
>
> I’m encountering an error that I can’t find any information on. I’ve
> inserted data into a table with storage in S3 in ORC ZLIB format. I can
> query this data directly without issues. Running a query that requires LLAP
> causes the following error:
>
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: java.lang.UnsupportedOperationException
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: java.lang.UnsupportedOperationException
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>     ... 15 more
> Caused by: java.io.IOException: java.io.IOException: java.lang.UnsupportedOperationException
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>     at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>     at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>     at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>     at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>     ... 17 more
> Caused by: java.io.IOException: java.lang.UnsupportedOperationException
>     at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readIndexStreams(EncodedReaderImpl.java:1954)
>     at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:384)
>     at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
>     at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
>     at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
>     ... 6 more
> Caused by: java.lang.UnsupportedOperationException
>     at java.nio.ByteBuffer.array(ByteBuffer.java:994)
>     at org.apache.orc.impl.ZlibCodec.decompress(ZlibCodec.java:94)
>     at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.decompressChunk(EncodedReaderImpl.java:1283)
>     at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:902)
>     at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readIndexStreams(EncodedReaderImpl.java:1918)
>
> This is specific to ORC ZLIB files on S3 being processed through LLAP. I
> can query other types of files in S3 through LLAP, I can query the ORC ZLIB
> data on S3 directly (select * from orc_zlib_on_s3_table limit 10) and I can
> execute the same query that fails in LLAP in Native Tez containers. Does
> anyone have any suggestions as to what the problem might be or how to debug
> it?
>
> Thanks,
>
> Aaron