[ https://issues.apache.org/jira/browse/HIVE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108883#comment-15108883 ]
Demeter Sztanko commented on HIVE-6347:
---------------------------------------

Hello,

I am running my cluster on Hadoop 2.7.1 (checksum fc0a1a23fc1868e4d5ee7fa2b28a58a), using Hive 1.2.1 (checksum ab480aca41b24a9c3751b8c023338231), and hive.exec.orc.zerocopy tends to cause failures. All my Hive queries run fine, but once I enable zerocopy, the job fails with what looks like a problem with the native libraries:

{code}
set hive.exec.orc.zerocopy = true;
<execute my query>
Hadoop job information for Stage-1: number of mappers: 316; number of reducers: 90
2016-01-20 16:37:54,479 Stage-1 map = 0%, reduce = 0%
2016-01-20 16:38:21,061 Stage-1 map = 100%, reduce = 100%
2016-01-20 16:39:21,246 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1452780282075_23380 with errors
Error during job, obtaining debugging information...
Diagnostic Messages for this Task:
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
    ... 11 more
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect()I
    at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect(Native Method)
    at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressDirect(SnappyDecompressor.java:305)
    at org.apache.hadoop.io.compress.snappy.SnappyDecompressor$SnappyDirectDecompressor.decompress(SnappyDecompressor.java:341)
    at org.apache.hadoop.hive.shims.ZeroCopyShims$DirectDecompressorAdapter.decompress(ZeroCopyShims.java:101)
    at org.apache.hadoop.hive.ql.io.orc.SnappyCodec.directDecompress(SnappyCodec.java:100)
    at org.apache.hadoop.hive.ql.io.orc.SnappyCodec.decompress(SnappyCodec.java:67)
    at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:214)
    at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:227)
    at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:54)
    at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:302)
    at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringDictionaryTreeReader.readDictionaryLengthStream(TreeReaderFactory.java:1674)
    at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringDictionaryTreeReader.startStripe(TreeReaderFactory.java:1654)
    at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringTreeReader.startStripe(TreeReaderFactory.java:1382)
    at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.startStripe(TreeReaderFactory.java:2040)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:795)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:986)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1019)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:205)
    at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539)
    at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.<init>(VectorizedOrcInputFormat.java:71)
    at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat.getRecordReader(VectorizedOrcInputFormat.java:156)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createVectorizedReader(OrcInputFormat.java:1088)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1102)
    at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
    ... 16 more

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
{code}

> ZeroCopy read path for ORC RecordReader
> ---------------------------------------
>
>                 Key: HIVE-6347
>                 URL: https://issues.apache.org/jira/browse/HIVE-6347
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: tez-branch
>            Reporter: Gopal V
>            Assignee: Gopal V
>             Fix For: tez-branch
>
>         Attachments: HIVE-6347.1.patch, HIVE-6347.2-tez.patch,
> HIVE-6347.3-tez.patch, HIVE-6347.4-tez.patch, HIVE-6347.5-tez.patch
>
>
> ORC can use the new HDFS Caching APIs and the ZeroCopy readers to avoid extra
> data copies into memory while scanning files.
> Implement ORC zcr codepath and a hive.orc.zerocopy flag.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)