Hi Guys,
As part of debugging this native library error in our environment, it
would be great if somebody could help me with this question: what kind of
temp, scratch, and staging directories does Spark need and use on the slave
nodes in YARN cluster mode?
Thanks,
Aravind
On Mon, Nov 3, 2014 at 4:11 PM, Aravind Srinivasan arav...@altiscale.com
wrote:
Team,
We are running a build of Spark 1.1.1 for Hadoop 2.2. We can't get the
code to read LZO or Snappy files in YARN; it fails to find the native libs.
I have tried many different ways of defining the lib path:
LD_LIBRARY_PATH, --driver-class-path, spark.executor.extraLibraryPath in
spark-defaults.conf, --driver-java-options, and SPARK_LIBRARY_PATH, but
none of them seem to take effect. What am I missing? Or is this a known
issue?
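For reference, in YARN cluster mode the executor JVMs are launched by the NodeManagers, so an LD_LIBRARY_PATH exported in the submitting shell generally does not reach them; the path usually has to travel through Spark's own properties. A minimal spark-defaults.conf sketch, assuming the native libs live under /opt/hadoop/lib/native (the path from the command in this thread; it may differ per cluster):

```
# spark-defaults.conf -- hypothetical paths; point these at the directory
# that actually contains libhadoop.so / libsnappy.so on every node
spark.executor.extraLibraryPath   /opt/hadoop/lib/native
spark.driver.extraLibraryPath     /opt/hadoop/lib/native
```

Both properties exist in Spark 1.1; whether they are being picked up can be checked in the executor launch command logged by the NodeManager.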
The example below (HdfsTest) works with plain text in both cluster and
local mode. LZO and Snappy files work in local mode, but both fail in
YARN cluster mode:
LD_LIBRARY_PATH=/opt/hadoop/lib/native/ MASTER=yarn \
SPARK_EXAMPLES_JAR=./examples/target/spark-examples_2.10-1.1.1.jar \
./bin/run-example HdfsTest /user/input/part-r-0.snappy
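An equivalent spark-submit invocation that passes the library path explicitly to both the driver and the executors might look like this (a sketch, again assuming /opt/hadoop/lib/native is the right directory on the cluster nodes):

```shell
# Hedged sketch: set the native library path for driver and executors
./bin/spark-submit --master yarn-cluster \
  --driver-library-path /opt/hadoop/lib/native \
  --conf spark.executor.extraLibraryPath=/opt/hadoop/lib/native \
  --class org.apache.spark.examples.HdfsTest \
  ./examples/target/spark-examples_2.10-1.1.1.jar \
  /user/input/part-r-0.snappy
```

On the Hadoop side, if your build ships it, `hadoop checknative -a` run on each node reports whether libhadoop and libsnappy can be loaded at all, which separates a missing-library problem from a path-propagation problem.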
Stack Trace:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 101-26-03.sc1.verticloud.com): ExecutorLostFailure (executor lost)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Thanks,
Aravind