Hi,

AFAIK, the external shuffle service makes sense mostly to delegate the shuffle to MapReduce (as the MapReduce shuffle is faster than the Spark shuffle most of the time).
As you run in standalone mode, the shuffle service will use the Spark shuffle.
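
In case it helps, here is a minimal sketch of how the shuffle service is usually enabled together with dynamic allocation (which is the main reason to turn it on). The property names are the standard Spark configuration keys; the master URL, class name, and jar are placeholders to adapt to your setup:

```shell
# Enable the external shuffle service and dynamic allocation at submit time.
# In standalone mode the shuffle service runs inside each Worker process,
# so the Workers must also be started with spark.shuffle.service.enabled=true.
spark-submit \
  --master spark://master-host:7077 \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.maxExecutors=10 \
  --class your.main.Class your-app.jar
```

The same properties can go in conf/spark-defaults.conf instead of --conf flags.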

Not 100% sure, though.

Regards
JB

On 10/13/2015 04:23 PM, saif.a.ell...@wellsfargo.com wrote:
Has anyone tried the shuffle service in standalone cluster mode? I want to
enable it for dynamic allocation, but my jobs never start when I submit them.
This happens with all my jobs.
15/10/13 08:29:45 INFO DAGScheduler: Job 0 failed: json at DataLoader.scala:86, took 16.318615 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 7, 162.101.194.47): ExecutorLostFailure (executor 4 lost)
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1822)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1942)
        at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:1003)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
        at org.apache.spark.rdd.RDD.reduce(RDD.scala:985)
        at org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1114)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
        at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1091)
        at org.apache.spark.sql.execution.datasources.json.InferSchema$.apply(InferSchema.scala:58)
        at org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$6.apply(JSONRelation.scala:105)
        at org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$6.apply(JSONRelation.scala:100)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.sql.execution.datasources.json.JSONRelation.dataSchema$lzycompute(JSONRelation.scala:100)
        at org.apache.spark.sql.execution.datasources.json.JSONRelation.dataSchema(JSONRelation.scala:99)
        at org.apache.spark.sql.sources.HadoopFsRelation.schema$lzycompute(interfaces.scala:561)
        at org.apache.spark.sql.sources.HadoopFsRelation.schema(interfaces.scala:560)
        at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:31)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:104)
        at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:219)
        at org.apache.saif.loaders.DataLoader$.load_json(DataLoader.scala:86)
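
For what it's worth, in standalone mode the external shuffle service runs inside the Worker process, so one plausible cause of an ExecutorLostFailure like the one above is that the application was submitted with spark.shuffle.service.enabled=true while the Workers were started without it: the executors then cannot register with a shuffle service and die. A sketch of the fix, assuming spark-defaults.conf is read by the Workers (the path and the use of start-all.sh/stop-all.sh are illustrative):

```shell
# On every worker machine: make sure the Worker itself starts with the
# shuffle service enabled, then restart the standalone daemons.
echo "spark.shuffle.service.enabled true" >> "$SPARK_HOME/conf/spark-defaults.conf"
"$SPARK_HOME/sbin/stop-all.sh"
"$SPARK_HOME/sbin/start-all.sh"
```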

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
