[ https://issues.apache.org/jira/browse/SPARK-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142474#comment-14142474 ]

Matthew Farrellee commented on SPARK-537:
-----------------------------------------

This should be resolved by a number of fixes in 1.0. Please re-open if it still reproduces.
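
For reference, the secondary stack trace in the quoted report (the EOFException thrown from spark.SimpleJob.taskLost via spark.JavaSerializerInstance.deserialize) is consistent with deserializing empty task-status data: java.io.ObjectInputStream throws EOFException from readStreamHeader as soon as it is constructed over an empty stream. A minimal standalone Scala sketch of that failure mode (not Spark code; the empty array is a hypothetical stand-in for a status update that carries no payload):

```scala
import java.io.{ByteArrayInputStream, ObjectInputStream}

// Standalone sketch: building an ObjectInputStream over empty bytes fails in
// readStreamHeader with java.io.EOFException, matching the top frames of the
// quoted trace. The empty array stands in for task-status data with no payload.
object EmptyDeserializeSketch {
  def main(args: Array[String]): Unit = {
    val emptyBytes = Array.empty[Byte]
    val in = new ObjectInputStream(new ByteArrayInputStream(emptyBytes)) // throws java.io.EOFException here
    println(in.readObject())
  }
}
```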

> driver.run() returned with code DRIVER_ABORTED
> ----------------------------------------------
>
>                 Key: SPARK-537
>                 URL: https://issues.apache.org/jira/browse/SPARK-537
>             Project: Spark
>          Issue Type: Bug
>            Reporter: yshaw
>
> Hi there,
> When I try to run Spark on Mesos as a cluster, an error like the following occurs:
> ```
>  ./run spark.examples.SparkPi *.*.*.*:5050
> 12/09/07 14:49:28 INFO spark.BoundedMemoryCache: BoundedMemoryCache.maxBytes = 994836480
> 12/09/07 14:49:28 INFO spark.CacheTrackerActor: Registered actor on port 7077
> 12/09/07 14:49:28 INFO spark.CacheTrackerActor: Started slave cache (size 948.8MB) on shawpc
> 12/09/07 14:49:28 INFO spark.MapOutputTrackerActor: Registered actor on port 7077
> 12/09/07 14:49:28 INFO spark.ShuffleManager: Shuffle dir: /tmp/spark-local-81220c47-bc43-4809-ac48-5e3e8e023c8a/shuffle
> 12/09/07 14:49:28 INFO server.Server: jetty-7.5.3.v20111011
> 12/09/07 14:49:28 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:57595 STARTING
> 12/09/07 14:49:28 INFO spark.ShuffleManager: Local URI: http://127.0.1.1:57595
> 12/09/07 14:49:28 INFO server.Server: jetty-7.5.3.v20111011
> 12/09/07 14:49:28 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:60113 STARTING
> 12/09/07 14:49:28 INFO broadcast.HttpBroadcast: Broadcast server started at http://127.0.1.1:60113
> 12/09/07 14:49:28 INFO spark.MesosScheduler: Temp directory for JARs: /tmp/spark-d541f37c-ae35-476c-b2fc-9908b0739f50
> 12/09/07 14:49:28 INFO server.Server: jetty-7.5.3.v20111011
> 12/09/07 14:49:28 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:50511 STARTING
> 12/09/07 14:49:28 INFO spark.MesosScheduler: JAR server started at http://127.0.1.1:50511
> 12/09/07 14:49:28 INFO spark.MesosScheduler: Registered as framework ID 201209071448-846324308-5050-26925-0000
> 12/09/07 14:49:29 INFO spark.SparkContext: Starting job...
> 12/09/07 14:49:29 INFO spark.CacheTracker: Registering RDD ID 1 with cache
> 12/09/07 14:49:29 INFO spark.CacheTrackerActor: Registering RDD 1 with 2 partitions
> 12/09/07 14:49:29 INFO spark.CacheTracker: Registering RDD ID 0 with cache
> 12/09/07 14:49:29 INFO spark.CacheTrackerActor: Registering RDD 0 with 2 partitions
> 12/09/07 14:49:29 INFO spark.CacheTrackerActor: Asked for current cache locations
> 12/09/07 14:49:29 INFO spark.MesosScheduler: Final stage: Stage 0
> 12/09/07 14:49:29 INFO spark.MesosScheduler: Parents of final stage: List()
> 12/09/07 14:49:29 INFO spark.MesosScheduler: Missing parents: List()
> 12/09/07 14:49:29 INFO spark.MesosScheduler: Submitting Stage 0, which has no missing parents
> 12/09/07 14:49:29 INFO spark.MesosScheduler: Got a job with 2 tasks
> 12/09/07 14:49:29 INFO spark.MesosScheduler: Adding job with ID 0
> 12/09/07 14:49:29 INFO spark.SimpleJob: Starting task 0:0 as TID 0 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
> 12/09/07 14:49:29 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 52 ms to serialize by spark.JavaSerializerInstance
> 12/09/07 14:49:29 INFO spark.SimpleJob: Starting task 0:1 as TID 1 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
> 12/09/07 14:49:29 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
> 12/09/07 14:49:30 INFO spark.SimpleJob: Lost TID 0 (task 0:0)
> 12/09/07 14:49:30 INFO spark.SimpleJob: Starting task 0:0 as TID 2 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
> 12/09/07 14:49:30 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 0 ms to serialize by spark.JavaSerializerInstance
> 12/09/07 14:49:30 INFO spark.SimpleJob: Lost TID 1 (task 0:1)
> 12/09/07 14:49:30 INFO spark.SimpleJob: Lost TID 2 (task 0:0)
> 12/09/07 14:49:30 INFO spark.SimpleJob: Starting task 0:0 as TID 3 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
> 12/09/07 14:49:30 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 2 ms to serialize by spark.JavaSerializerInstance
> 12/09/07 14:49:32 INFO spark.SimpleJob: Starting task 0:1 as TID 4 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
> 12/09/07 14:49:32 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
> 12/09/07 14:49:32 INFO spark.SimpleJob: Lost TID 3 (task 0:0)
> 12/09/07 14:49:32 INFO spark.SimpleJob: Starting task 0:0 as TID 5 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
> 12/09/07 14:49:32 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 0 ms to serialize by spark.JavaSerializerInstance
> 12/09/07 14:49:32 INFO spark.SimpleJob: Lost TID 4 (task 0:1)
> 12/09/07 14:49:32 INFO spark.SimpleJob: Lost TID 5 (task 0:0)
> 12/09/07 14:49:32 INFO spark.SimpleJob: Starting task 0:0 as TID 6 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
> 12/09/07 14:49:32 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 0 ms to serialize by spark.JavaSerializerInstance
> 12/09/07 14:49:34 INFO spark.SimpleJob: Starting task 0:1 as TID 7 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
> 12/09/07 14:49:34 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 2 ms to serialize by spark.JavaSerializerInstance
> 12/09/07 14:49:34 INFO spark.SimpleJob: Lost TID 6 (task 0:0)
> 12/09/07 14:49:34 ERROR spark.SimpleJob: Task 0:0 failed more than 4 times; aborting job
> Exception in thread "Thread-50" java.io.EOFException
>       at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2280)
>       at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2749)
>       at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:779)
>       at java.io.ObjectInputStream.<init>(ObjectInputStream.java:279)
>       at spark.JavaSerializerInstance$$anon$2.<init>(JavaSerializer.scala:39)
>       at spark.JavaSerializerInstance.deserialize(JavaSerializer.scala:39)
>       at spark.SimpleJob.taskLost(SimpleJob.scala:296)
>       at spark.SimpleJob.statusUpdate(SimpleJob.scala:207)
>       at spark.MesosScheduler.statusUpdate(MesosScheduler.scala:287)
> 12/09/07 14:49:34 INFO spark.SimpleJob: Starting task 0:0 as TID 8 on slave 201209071448-846324308-5050-26925-0: shawpc (preferred)
> 12/09/07 14:49:34 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
> 12/09/07 14:49:34 INFO spark.SimpleJob: Lost TID 7 (task 0:1)
> 12/09/07 14:49:34 INFO spark.MesosScheduler: driver.run() returned with code DRIVER_ABORTED
> ```
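
For context, spark.examples.SparkPi (launched by the ./run command in the quoted report) estimates pi by Monte Carlo sampling across the cluster. Below is a rough sketch of that kind of job against the 2012-era API (package spark, SparkContext(master, jobName) constructor); it is an approximation for illustration, not the exact bundled example:

```scala
import spark.SparkContext

// Approximate sketch: sample random points in the unit square on the cluster,
// count how many fall inside the unit circle, and estimate pi from the ratio.
object SparkPiSketch {
  def main(args: Array[String]): Unit = {
    // args(0) is the Mesos master, e.g. the masked *.*.*.*:5050 in the report
    val sc = new SparkContext(args(0), "SparkPi")
    val slices = 2
    val n = 100000 * slices
    val count = sc.parallelize(1 to n, slices).map { _ =>
      val x = math.random * 2 - 1
      val y = math.random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
  }
}
```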



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
