[ https://issues.apache.org/jira/browse/HIVE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109350#comment-15109350 ]
Gopal V commented on HIVE-6347:
-------------------------------

{code}
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect()I
{code}

It does look like the tasks are unable to locate the libhadoop.so binary (*or* the Hadoop build was done without snappy-dev available). Zero-copy readers don't work if libhadoop.so is missing anyway (verification and config sketches follow the quoted issue text below).

To fill in some later developments: turning on zerocopy=true needs a cluster-wide config change to enable YARN-1775. Without that change, YARN's memory accounting counts memory-mapped files as container memory, so you might see containers killed for using too much memory as you scale past the terabyte level.

> ZeroCopy read path for ORC RecordReader
> ---------------------------------------
>
>                 Key: HIVE-6347
>                 URL: https://issues.apache.org/jira/browse/HIVE-6347
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: tez-branch
>            Reporter: Gopal V
>            Assignee: Gopal V
>             Fix For: tez-branch
>
>         Attachments: HIVE-6347.1.patch, HIVE-6347.2-tez.patch, HIVE-6347.3-tez.patch, HIVE-6347.4-tez.patch, HIVE-6347.5-tez.patch
>
>
> ORC can use the new HDFS Caching APIs and the ZeroCopy readers to avoid extra data copies into memory while scanning files.
> Implement ORC zcr codepath and a hive.orc.zerocopy flag.
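
For reference, a minimal sketch of the diagnosis above, assuming Hadoop 2.x where org.apache.hadoop.util.NativeCodeLoader exposes isNativeCodeLoaded() and buildSupportsSnappy(); it hits the same UnsatisfiedLinkError when libhadoop.so is not on java.library.path:

{code}
import org.apache.hadoop.util.NativeCodeLoader;

public class NativeSnappyCheck {
  public static void main(String[] args) {
    // True only if libhadoop.so was found and loaded from java.library.path.
    System.out.println("libhadoop loaded: " + NativeCodeLoader.isNativeCodeLoaded());
    try {
      // Native call: throws UnsatisfiedLinkError when libhadoop.so is missing,
      // returns false when libhadoop was built without snappy-dev.
      System.out.println("snappy in build: " + NativeCodeLoader.buildSupportsSnappy());
    } catch (UnsatisfiedLinkError e) {
      System.out.println("native snappy unavailable: " + e);
    }
  }
}
{code}

Running {{hadoop checknative -a}} on the task nodes reports the same information from the command line.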
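
A sketch of the configs being referred to; the YARN property below is, to my understanding, the one YARN-1775 added for smaps-based memory accounting, and hive.exec.orc.zerocopy is the flag name as it ended up in released HiveConf (the description calls it hive.orc.zerocopy):

{code}
<!-- yarn-site.xml, on every NodeManager: read /proc/<pid>/smaps so that
     shared mmap'd pages (e.g. zero-copy ORC reads) are not billed as
     private container memory -->
<property>
  <name>yarn.nodemanager.container-monitor.procfs-tree.smaps-based-rss.enabled</name>
  <value>true</value>
</property>
{code}

{code}
-- Hive session, once the cluster-side setting is in place:
set hive.exec.orc.zerocopy=true;
{code}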
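
For the read path itself, a rough sketch of the HDFS zero-copy API (HDFS-4953) that the ORC reader builds on: FSDataInputStream.read(ByteBufferPool, int, EnumSet<ReadOption>) hands back an mmap'd slice of the local block file when it can, and falls back to a copying read otherwise. The file path here is a placeholder:

{code}
import java.nio.ByteBuffer;
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.ReadOption;
import org.apache.hadoop.io.ElasticByteBufferPool;

public class ZeroCopyReadSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    ElasticByteBufferPool pool = new ElasticByteBufferPool();
    try (FSDataInputStream in = fs.open(new Path("/tmp/example.orc"))) {
      // Zero-copy read: an mmap of the block file when the data is local,
      // a pooled-buffer copy otherwise; null signals EOF.
      ByteBuffer buf = in.read(pool, 1 << 20, EnumSet.of(ReadOption.SKIP_CHECKSUMS));
      if (buf != null) {
        System.out.println("read " + buf.remaining() + " bytes zero-copy");
        in.releaseBuffer(buf); // mmap-backed buffers must be released, not GC'd
      }
    }
  }
}
{code}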