[ 
https://issues.apache.org/jira/browse/SPARK-25928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787996#comment-16787996
 ] 

Abhinay commented on SPARK-25928:
---------------------------------

We are running into the same issue with EMR 5.20.0 with Oozie 5.0.0 and Spark 
2.4.0. The workaround seemed to be working for now but it does require some 
additional setup either converting to Snappy or deleting the lz 1.4.0 in Oozie 
share lib. 

 

The feedback from AWS support team was it was more of an upstream issue with 
Spark. Can we know when this issue will be resolved?

 

> NoSuchMethodError 
> net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-25928
>                 URL: https://issues.apache.org/jira/browse/SPARK-25928
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.3.1
>         Environment: EMR 5.17 which is using oozie 5.0.0 and spark 2.3.1
>            Reporter: Jerry Chabot
>            Priority: Major
>
> I am not sure if this is an Oozie problem, a Spark problem or a user error. 
> It is blocking our upcoming release.
> We are upgrading from Amazon's EMR 5.7 to EMR 5.17. The version changes are:
>     oozie 4.3.0 -> 5.0.0
>      spark 2.1.1 -> 2.3.1
> All our Oozie/Spark jobs were working in EMR 5.7. After ugprading, some of 
> our jobs which use a spark action are failing with the NoSuchMethod as shown 
> further in the description. It seems like conflicting classes.
> I noticed the spark share lib directory has two versions of the LZ4 jar.
> sudo -u hdfs hadoop fs -ls /user/oozie/share/lib/lib_20181029182704/spark/*lz*
>  -rw-r--r--   3 oozie oozie      79845 2018-10-29 18:27 
> /user/oozie/share/lib/lib_20181029182704/spark/compress-lzf-1.0.3.jar
>  -rw-r--r--   3 hdfs  oozie     236880 2018-11-01 18:22 
> /user/oozie/share/lib/lib_20181029182704/spark/lz4-1.3.0.jar
>  -rw-r--r--   3 oozie oozie     370119 2018-10-29 18:27 
> /user/oozie/share/lib/lib_20181029182704/spark/lz4-java-1.4.0.jar
> But, both of these jars have the constructor 
> LZ4BlockInputStream(java/io/InputStream). The spark/jars directory has only 
> lz4-java-1.4.0.jar. share lib seems to be getting it from the 
> /usr/lib/oozie/oozie-sharelib.tar.gz.
> Unfortunately, my team member that knows most about Spark is on vacation. 
> Does anyone have any suggestions on how best to troubleshoot this problem?
> Here is the strack trace.
>       diagnostics: User class threw exception: 
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
> stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 
> (TID 3, ip-172-27-113-49.ec2.internal, executor 2): 
> java.lang.NoSuchMethodError: 
> net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
>         at 
> org.apache.spark.io.LZ4CompressionCodec.compressedInputStream(CompressionCodec.scala:122)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$$anonfun$6.apply(TorrentBroadcast.scala:304)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$$anonfun$6.apply(TorrentBroadcast.scala:304)
>         at scala.Option.map(Option.scala:146)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:304)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1$$anonfun$apply$2.apply(TorrentBroadcast.scala:235)
>         at scala.Option.getOrElse(Option.scala:121)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:211)
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1346)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:207)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
>         at 
> org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
>         at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
>         at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:86)
>         at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>         at org.apache.spark.scheduler.Task.run(Task.scala:109)
>         at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to