This seems to be a Cloudera environment issue, and you might get a faster and more reliable answer in Cloudera forums.
On Fri, May 4, 2018 at 3:39 PM, Fawze Abujaber <fawz...@gmail.com> wrote:

> Hi Yulia,
>
> Thanks for your response.
>
> I see lzo only for Impala:
>
> [root@xxxxxxx ~]# locate *lzo*.so*
> /opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29/lib/impala/lib/libimpalalzo.so
> /usr/lib64/liblzo2.so.2
> /usr/lib64/liblzo2.so.2.0.0
>
> /opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29/lib/hadoop/lib/native has:
>
> -rwxr-xr-x 1 cloudera-scm cloudera-scm 22918 Oct  4  2017 libgplcompression.a
> -rwxr-xr-x 1 cloudera-scm cloudera-scm  1204 Oct  4  2017 libgplcompression.la
> -rwxr-xr-x 1 cloudera-scm cloudera-scm  1205 Oct  4  2017 libgplcompression.lai
> -rwxr-xr-x 1 cloudera-scm cloudera-scm 15760 Oct  4  2017 libgplcompression.so
> -rwxr-xr-x 1 cloudera-scm cloudera-scm 15768 Oct  4  2017 libgplcompression.so.0
> -rwxr-xr-x 1 cloudera-scm cloudera-scm 15768 Oct  4  2017 libgplcompression.so.0.0.0
>
> and /opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29/lib/spark-netlib/lib has:
>
> -rw-r--r-- 1 cloudera-scm cloudera-scm    8673 Oct  4  2017 jniloader-1.1.jar
> -rw-r--r-- 1 cloudera-scm cloudera-scm   53249 Oct  4  2017 native_ref-java-1.1.jar
> -rw-r--r-- 1 cloudera-scm cloudera-scm   53295 Oct  4  2017 native_system-java-1.1.jar
> -rw-r--r-- 1 cloudera-scm cloudera-scm 1732268 Oct  4  2017 netlib-native_ref-linux-x86_64-1.1-natives.jar
> -rw-r--r-- 1 cloudera-scm cloudera-scm  446694 Oct  4  2017 netlib-native_system-linux-x86_64-1.1-natives.jar
>
> Note: the issue occurs only with the Spark job; the MapReduce job works fine.
> On Thu, May 3, 2018 at 9:17 PM, yuliya Feldman <yufeld...@yahoo.com> wrote:
>
>> The jar is not enough; you need the native library (*.so) - see if your "native"
>> directory contains it:
>>
>> drwxr-xr-x 2 cloudera-scm cloudera-scm 4096 Oct  4  2017 native
>>
>> and whether java.library.path or LD_LIBRARY_PATH points to / includes the
>> directory where your *.so library resides.
>>
>> On Thursday, May 3, 2018, 5:06:35 AM PDT, Fawze Abujaber <fawz...@gmail.com> wrote:
>>
>> Hi Guys,
>>
>> I'm running into an issue where my Spark jobs are failing with the error
>> below. I'm using Spark 1.6.0 with CDH 5.13.0.
>>
>> I tried to figure it out with no success.
>>
>> I will appreciate any help or a direction on how to attack this issue.
>>
>> User class threw exception: org.apache.spark.SparkException: Job aborted
>> due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent
>> failure: Lost task 0.3 in stage 1.0 (TID 3, xxxxxx, executor 1):
>> *java.lang.RuntimeException: native-lzo library not available*
>> *at com.hadoop.compression.lzo.LzoCodec.getDecompressorType(LzoCodec.java:193)*
>> at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:181)
>> at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1995)
>> at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1881)
>> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1830)
>> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1844)
>> at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54)
>> at com.liveperson.dallas.lp.utils.incremental.DallasGenericTextFileRecordReader.initialize(DallasGenericTextFileRecordReader.java:64)
>> at com.liveperson.hadoop.fs.inputs.LPCombineFileRecordReaderWrapper.initialize(LPCombineFileRecordReaderWrapper.java:38)
>> at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initialize(CombineFileRecordReader.java:63)
>> at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:168)
>> at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:133)
>> at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>> Driver stacktrace:
>>
>> I see the LZO at GPextras:
>>
>> ll
>> total 104
>> -rw-r--r-- 1 cloudera-scm cloudera-scm 35308 Oct  4  2017 COPYING.hadoop-lzo
>> -rw-r--r-- 1 cloudera-scm cloudera-scm 62268 Oct  4  2017 hadoop-lzo-0.4.15-cdh5.13.0.jar
>> lrwxrwxrwx 1 cloudera-scm cloudera-scm    31 May  3 07:23 hadoop-lzo.jar -> hadoop-lzo-0.4.15-cdh5.13.0.jar
>> drwxr-xr-x 2 cloudera-scm cloudera-scm  4096 Oct  4  2017 native
>>
>> --
>> Take Care
>> Fawze Abujaber
>
> --
> Take Care
> Fawze Abujaber

--
Best Regards,
Ayan Guha
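Editor's note: the stack trace shows the executor JVM failing to load the hadoop-lzo native code (libgplcompression.so) even though it exists under the GPLEXTRAS parcel. MapReduce picks up Hadoop's java.library.path from the cluster configuration, but Spark drivers and executors do not inherit it, which matches "MapReduce works, Spark fails". A common workaround is to pass the native directory and the hadoop-lzo jar explicitly via spark-submit. This is only a sketch: the parcel path is taken from the listings above, the jar location under lib/hadoop/lib and the job class/jar names are assumptions to verify on your cluster.

```shell
# Hypothetical spark-submit invocation: put the GPLEXTRAS native directory
# (where libgplcompression.so lives) on the library path of both driver and
# executors, and the hadoop-lzo jar on their classpaths.
GPL_HOME=/opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29

spark-submit \
  --conf spark.driver.extraLibraryPath="$GPL_HOME/lib/hadoop/lib/native" \
  --conf spark.executor.extraLibraryPath="$GPL_HOME/lib/hadoop/lib/native" \
  --conf spark.driver.extraClassPath="$GPL_HOME/lib/hadoop/lib/hadoop-lzo.jar" \
  --conf spark.executor.extraClassPath="$GPL_HOME/lib/hadoop/lib/hadoop-lzo.jar" \
  --class com.example.MyJob myjob.jar   # placeholder class/jar
```

The same settings can be applied cluster-wide (e.g. through the Spark client configuration / spark-defaults.conf) instead of per job, which avoids repeating the parcel version in every submission.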