Re: ExecutorLostFailure kills sparkcontext

2014-09-30 Thread Akhil Das
I also had a similar problem while joining a dataset. After digging into the
worker logs, I found that it was throwing a CancelledKeyException. I'm not
sure of the cause.
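
A minimal sketch of how the worker or driver logs could be scanned for the failures mentioned in this thread. It is only an illustration: the function names are made up for this example, and it assumes each log entry sits on a single line, as in the unwrapped versions of the messages quoted below.

```python
import re
from collections import Counter

# Failure markers taken from the log lines quoted in this thread.
FAILURE_PATTERNS = {
    "FetchFailed": re.compile(r"FetchFailed\(BlockManagerId\((\d+), ([^,]+),"),
    "ExecutorLostFailure": re.compile(r"ExecutorLostFailure"),
    "CancelledKeyException": re.compile(r"CancelledKeyException"),
}


def summarize_failures(lines):
    """Count how many log lines contain each failure marker."""
    counts = Counter()
    for line in lines:
        for name, pattern in FAILURE_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def fetch_failed_hosts(lines):
    """Return the host named in each FetchFailed(BlockManagerId(...)) entry."""
    hosts = []
    for line in lines:
        match = FAILURE_PATTERNS["FetchFailed"].search(line)
        if match:
            hosts.append(match.group(2).strip())
    return hosts
```

Passing an open log file (for example `summarize_failures(open(path))`, where `path` is whatever log file you collected) gives a quick count per failure type, and `fetch_failed_hosts` shows which nodes the failed shuffle fetches were pointing at.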

Thanks
Best Regards

On Tue, Sep 30, 2014 at 5:15 AM, jamborta jambo...@gmail.com wrote:

 hi all,

 I have a problem with my application when I increase the data size beyond 5GB
 (the cluster has about 100GB of memory to handle that). First I get this
 warning:

  WARN TaskSetManager: Lost task 10.1 in stage 4.1 (TID 408, backend-node1): FetchFailed(BlockManagerId(3, backend-node0, 41484, 0), shuffleId=1, mapId=0, reduceId=18)

 then this one:

 14/09/29 23:26:44 WARN TaskSetManager: Lost task 2.0 in stage 5.2 (TID 418, backend-node1): ExecutorLostFailure (executor lost)

 A few seconds later, all the executors shut down:

 14/09/29 23:26:53 ERROR YarnClientSchedulerBackend: Yarn application already ended: FINISHED
 14/09/29 23:26:53 INFO SparkUI: Stopped Spark web UI at http://backend-node0:4040
 14/09/29 23:26:53 INFO YarnClientSchedulerBackend: Shutting down all executors

 Even the SparkContext stops.

 I'm not sure how to debug this; there is nothing else in the logs. I
 have given enough memory to all executors.

 thanks for the help,




 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/ExecutorLostFailure-kills-sparkcontext-tp15370.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




ExecutorLostFailure kills sparkcontext

2014-09-29 Thread jamborta
hi all,

I have a problem with my application when I increase the data size beyond 5GB
(the cluster has about 100GB of memory to handle that). First I get this
warning:

 WARN TaskSetManager: Lost task 10.1 in stage 4.1 (TID 408, backend-node1): FetchFailed(BlockManagerId(3, backend-node0, 41484, 0), shuffleId=1, mapId=0, reduceId=18)

then this one:

14/09/29 23:26:44 WARN TaskSetManager: Lost task 2.0 in stage 5.2 (TID 418, backend-node1): ExecutorLostFailure (executor lost)

A few seconds later, all the executors shut down:

14/09/29 23:26:53 ERROR YarnClientSchedulerBackend: Yarn application already ended: FINISHED
14/09/29 23:26:53 INFO SparkUI: Stopped Spark web UI at http://backend-node0:4040
14/09/29 23:26:53 INFO YarnClientSchedulerBackend: Shutting down all executors

Even the SparkContext stops.

I'm not sure how to debug this; there is nothing else in the logs. I
have given enough memory to all executors.

thanks for the help,



