Hello guys, I have a Spark job that reads compressed (Snappy) data, but when I run it, it throws the error "native snappy library not available: this version of libhadoop was built without snappy support". I followed the instructions here, but they did not resolve the issue: https://community.hortonworks.com/questions/18903/this-version-of-libhadoop-was-built-without-snappy.html
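As far as I understand it, the fix suggested there boils down to passing the native library directory through to the driver and executors, so what I tried was roughly this (the path matches where checknative finds the libraries on my cluster, see below):

spark-submit \
  --driver-library-path /usr/lib/hadoop/lib/native \
  --conf spark.executor.extraLibraryPath=/usr/lib/hadoop/lib/native \
  ... (rest of my usual submit arguments)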
The checknative command shows that Snappy is installed:

hadoop checknative
16/10/04 21:01:30 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
16/10/04 21:01:30 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
lz4:     true revision:99
bzip2:   true /lib64/libbz2.so.1
openssl: true /usr/lib64/libcrypto.so

I also have code in the job that checks whether native Snappy is loaded, and it returns true (a minimal version of that check is at the end of this post). So I have no idea why I am getting this error. I also had no issue reading Snappy data with a MapReduce job on the same cluster. Could anyone tell me what is wrong? Thank you.

Stack trace:

java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
    at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
    at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193)
    at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178)
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:111)
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
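As mentioned above, here is a minimal version of the Snappy check I run inside the job (simplified for this post; it uses Hadoop's standard NativeCodeLoader API, and both calls return true for me):

import org.apache.hadoop.util.NativeCodeLoader

// true => libhadoop.so was found and loaded by this JVM
val nativeLoaded = NativeCodeLoader.isNativeCodeLoaded()
// true => that libhadoop build was compiled with Snappy support
val snappySupported = NativeCodeLoader.buildSupportsSnappy()
println(s"libhadoop loaded: $nativeLoaded, snappy supported: $snappySupported")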