Hi Yulia,

Thanks for your response.
I see an LZO library only for Impala:

[root@xxxxxxx ~]# locate *lzo*.so*
/opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29/lib/impala/lib/libimpalalzo.so
/usr/lib64/liblzo2.so.2
/usr/lib64/liblzo2.so.2.0.0

/opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29/lib/hadoop/lib/native has:

-rwxr-xr-x 1 cloudera-scm cloudera-scm 22918 Oct 4 2017 libgplcompression.a
-rwxr-xr-x 1 cloudera-scm cloudera-scm  1204 Oct 4 2017 libgplcompression.la
-rwxr-xr-x 1 cloudera-scm cloudera-scm  1205 Oct 4 2017 libgplcompression.lai
-rwxr-xr-x 1 cloudera-scm cloudera-scm 15760 Oct 4 2017 libgplcompression.so
-rwxr-xr-x 1 cloudera-scm cloudera-scm 15768 Oct 4 2017 libgplcompression.so.0
-rwxr-xr-x 1 cloudera-scm cloudera-scm 15768 Oct 4 2017 libgplcompression.so.0.0.0

and /opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29/lib/spark-netlib/lib has:

-rw-r--r-- 1 cloudera-scm cloudera-scm    8673 Oct 4 2017 jniloader-1.1.jar
-rw-r--r-- 1 cloudera-scm cloudera-scm   53249 Oct 4 2017 native_ref-java-1.1.jar
-rw-r--r-- 1 cloudera-scm cloudera-scm   53295 Oct 4 2017 native_system-java-1.1.jar
-rw-r--r-- 1 cloudera-scm cloudera-scm 1732268 Oct 4 2017 netlib-native_ref-linux-x86_64-1.1-natives.jar
-rw-r--r-- 1 cloudera-scm cloudera-scm  446694 Oct 4 2017 netlib-native_system-linux-x86_64-1.1-natives.jar

Note: the issue occurs only with the Spark job; the MapReduce job works fine.

On Thu, May 3, 2018 at 9:17 PM, yuliya Feldman <yufeld...@yahoo.com> wrote:

> The jar is not enough; you need the native library (*.so). See whether your
> "native" directory contains it:
>
> drwxr-xr-x 2 cloudera-scm cloudera-scm 4096 Oct 4 2017 native
>
> and whether java.library.path or LD_LIBRARY_PATH points to or includes the
> directory where your *.so library resides.
>
> On Thursday, May 3, 2018, 5:06:35 AM PDT, Fawze Abujaber <fawz...@gmail.com> wrote:
>
> Hi Guys,
>
> I'm running into an issue where my Spark jobs are failing with the error
> below. I'm using Spark 1.6.0 with CDH 5.13.0.
>
> I tried to figure it out with no success.
>
> I will appreciate any help, or a direction on how to attack this issue.
>
> User class threw exception: org.apache.spark.SparkException: Job aborted
> due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent
> failure: Lost task 0.3 in stage 1.0 (TID 3, xxxxxx, executor 1):
> *java.lang.RuntimeException: native-lzo library not available*
> *at com.hadoop.compression.lzo.LzoCodec.getDecompressorType(LzoCodec.java:193)*
> at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:181)
> at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1995)
> at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1881)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1830)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1844)
> at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54)
> at com.liveperson.dallas.lp.utils.incremental.DallasGenericTextFileRecordReader.initialize(DallasGenericTextFileRecordReader.java:64)
> at com.liveperson.hadoop.fs.inputs.LPCombineFileRecordReaderWrapper.initialize(LPCombineFileRecordReaderWrapper.java:38)
> at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initialize(CombineFileRecordReader.java:63)
> at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:168)
> at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:133)
> at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace:
>
> I see the LZO jar at GPLEXTRAS:
>
> ll
> total 104
> -rw-r--r-- 1 cloudera-scm cloudera-scm 35308 Oct 4 2017 COPYING.hadoop-lzo
> -rw-r--r-- 1 cloudera-scm cloudera-scm 62268 Oct 4 2017 hadoop-lzo-0.4.15-cdh5.13.0.jar
> lrwxrwxrwx 1 cloudera-scm cloudera-scm    31 May 3 07:23 hadoop-lzo.jar -> hadoop-lzo-0.4.15-cdh5.13.0.jar
> drwxr-xr-x 2 cloudera-scm cloudera-scm  4096 Oct 4 2017 native
>
> --
> Take Care
> Fawze Abujaber

--
Take Care
Fawze Abujaber
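For anyone hitting the same error, a minimal sketch of the check Yulia describes, plus how to hand the directory to Spark. The `check_native_lzo` helper and the `NATIVE_DIR` value are illustrative (the path is the parcel directory from the listing above; adjust it to your GPLEXTRAS version). The `spark.driver.extraLibraryPath` / `spark.executor.extraLibraryPath` options are standard Spark configuration; unlike MapReduce, which typically gets the native directory from the cluster-managed environment, Spark will not see it unless it is passed explicitly.

```shell
# check_native_lzo: report whether the hadoop-lzo native binding (.so) is
# present in the given directory (illustrative helper, not CDH tooling).
check_native_lzo() {
  if [ -e "$1/libgplcompression.so" ]; then
    echo "found"
  else
    echo "missing"
  fi
}

# Parcel path taken from the listing earlier in this thread.
NATIVE_DIR=/opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29/lib/hadoop/lib/native
check_native_lzo "$NATIVE_DIR"

# If the .so is present but Spark still fails, pass the directory explicitly
# so it lands on the driver's and executors' java.library.path:
#   spark-submit \
#     --conf spark.driver.extraLibraryPath="$NATIVE_DIR" \
#     --conf spark.executor.extraLibraryPath="$NATIVE_DIR" \
#     ...
```

In Spark 1.6 these confs can also be set in spark-defaults.conf, which avoids repeating them per job.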