Hi, I am running a 10-node standalone Spark cluster on AWS and loading 100 GB of data into HDFS. I first do a groupBy operation, and then generate pairs from the grouped RDD: from records like (key, [a1, b1]) and (key, [a, b, c]) I generate the value pairs (a1, b1), (a, b), (a, c), ..., so the resulting PairRDD gets very large.
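A simplified sketch of the pair-generation step, so the pattern is concrete (the HDFS path, the line parsing, and the use of groupByKey plus itertools.combinations are placeholders for illustration, not my exact script):

    from itertools import combinations
    from pyspark import SparkContext

    sc = SparkContext(appName="PairGeneration")

    # (key, value) records parsed from the input on HDFS
    # (hypothetical path and parsing).
    records = (sc.textFile("hdfs:///data/input")
                 .map(lambda line: tuple(line.split(",", 1))))

    # groupByKey pulls every value for a key into one list,
    # e.g. (key, [a1, b1]) and (key, [a, b, c]).
    grouped = records.groupByKey()

    # Emit all value pairs within each group: [a, b, c] ->
    # (a, b), (a, c), (b, c). A group of n values yields
    # n*(n-1)/2 pairs, which is why the PairRDD blows up.
    pairs = grouped.flatMap(lambda kv: combinations(list(kv[1]), 2))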
Some stats from the UI at the point where the errors start and the script finally fails:

Details for Stage 1 (Attempt 0)
  Total Time Across All Tasks: 1.3 h
  Shuffle Read: 4.4 GB / 1402058 records
  Shuffle Spill (Memory): 73.1 GB
  Shuffle Spill (Disk): 3.6 GB

I get the following stack trace:

WARN scheduler.TaskSetManager: Lost task 0.3 in stage 1.0 (TID 943, 10.239.131.154): org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
        at org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:175)
        at org.apache.spark.api.python.PythonRDD$$anon$1.<init>(PythonRDD.scala:179)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:97)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
        at org.apache.spark.scheduler.Task.run(Task.scala:70)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:111)
        ... 10 more
15/10/22 16:30:17 ERROR scheduler.TaskSetManager: Task 0 in stage 1.0 failed 4 times; aborting job
15/10/22 16:30:17 INFO scheduler.TaskSchedulerImpl: Cancelling stage 1