RE: LLAP can't read ORC ZLIB files from S3

2020-06-28 Thread Aaron Grubb
Hi Owen, the problem disappeared when I stopped using 
orc.write.variable.length.blocks but how would I go about turning off direct 
byte buffers on read out of curiosity? I can’t find any settings to control 
this.

Thanks,
Aaron

From: Owen O'Malley 
Sent: Thursday, June 25, 2020 1:21 PM
To: user@hive.apache.org
Subject: Re: LLAP can't read ORC ZLIB files from S3

Actually, it looks like LLAP is trying to get the ByteBuffer array from a 
direct byte buffer. Turning off direct byte buffers on read should fix the 
problem.

.. Owen

On Thu, Jun 25, 2020 at 7:27 AM Aaron Grubb 
mailto:aaron.gr...@clearpier.com>> wrote:
This appears to have been caused by orc.write.variable.length.blocks=true which 
I had set for HDFS-based tables. Setting this to false and inserting data into 
the S3 table appears to have fixed this problem.

From: Aaron Grubb mailto:aaron.gr...@clearpier.com>>
Sent: Wednesday, June 24, 2020 4:04 PM
To: user@hive.apache.org<mailto:user@hive.apache.org>
Subject: LLAP can't read ORC ZLIB files from S3

Hello everyone,

I’m encountering an error that I can’t find any information on. I’ve inserted 
data into a table with storage in S3 in ORC ZLIB format. I can query this data 
directly without issues. Running a query that requires LLAP causes the 
following error:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.io.IOException: 
java.lang.UnsupportedOperationException
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.io.IOException: 
java.lang.UnsupportedOperationException
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
... 15 more
Caused by: java.io.IOException: java.io.IOException: 
java.lang.UnsupportedOperationException
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
... 17 more
Caused by: java.io.IOException: java.lang.UnsupportedOperationException
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readIndexStreams(EncodedReaderImpl.java:1954)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:384)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncod

Re: LLAP can't read ORC ZLIB files from S3

2020-06-25 Thread Owen O'Malley
Actually, it looks like LLAP is trying to get the ByteBuffer array from a
direct byte buffer. Turning off direct byte buffers on read should fix the
problem.

.. Owen

On Thu, Jun 25, 2020 at 7:27 AM Aaron Grubb 
wrote:

> This appears to have been caused by orc.write.variable.length.blocks=true
> which I had set for HDFS-based tables. Setting this to false and inserting
> data into the S3 table appears to have fixed this problem.
>
>
>
> *From:* Aaron Grubb 
> *Sent:* Wednesday, June 24, 2020 4:04 PM
> *To:* user@hive.apache.org
> *Subject:* LLAP can't read ORC ZLIB files from S3
>
>
>
> Hello everyone,
>
>
>
> I’m encountering an error that I can’t find any information on. I’ve
> inserted data into a table with storage in S3 in ORC ZLIB format. I can
> query this data directly without issues. Running a query that requires LLAP
> causes the following error:
>
>
>
> java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException:
> java.io.IOException: java.lang.UnsupportedOperationException
>
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>
> at
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>
> at
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:422)
>
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>
> at
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>
> at
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>
> at
> org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>
> at
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> at java.lang.Thread.run(Thread.java:748)
>
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
> java.io.IOException: java.io.IOException:
> java.lang.UnsupportedOperationException
>
> at
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
>
> at
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
>
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>
> ... 15 more
>
> Caused by: java.io.IOException: java.io.IOException:
> java.lang.UnsupportedOperationException
>
> at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>
> at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>
> at
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>
> at
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>
> at
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>
> at
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>
> at
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>
> at
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
>
> at
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>
> ... 17 more
>
> Caused by: java.io.IOException: java.lang.UnsupportedOperationException
>
> at
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readIndexStreams(EncodedReaderImpl.java:1954)
>
> at
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:384)
>
> at
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
>
> at
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:422)
>
> at
> 

RE: LLAP can't read ORC ZLIB files from S3

2020-06-25 Thread Aaron Grubb
This appears to have been caused by orc.write.variable.length.blocks=true which 
I had set for HDFS-based tables. Setting this to false and inserting data into 
the S3 table appears to have fixed this problem.

From: Aaron Grubb 
Sent: Wednesday, June 24, 2020 4:04 PM
To: user@hive.apache.org
Subject: LLAP can't read ORC ZLIB files from S3

Hello everyone,

I'm encountering an error that I can't find any information on. I've inserted 
data into a table with storage in S3 in ORC ZLIB format. I can query this data 
directly without issues. Running a query that requires LLAP causes the 
following error:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.io.IOException: 
java.lang.UnsupportedOperationException
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.io.IOException: 
java.lang.UnsupportedOperationException
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
... 15 more
Caused by: java.io.IOException: java.io.IOException: 
java.lang.UnsupportedOperationException
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
... 17 more
Caused by: java.io.IOException: java.lang.UnsupportedOperationException
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readIndexStreams(EncodedReaderImpl.java:1954)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:384)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
... 6 more
Caused by: java.lang.UnsupportedOperationException
at java.nio.ByteBuffer.array(ByteBuffer.java:994)
at org.apache.orc.impl.ZlibCodec.decompress(ZlibCodec.java:94)