[jira] [Comment Edited] (SPARK-2019) Spark workers die/disappear when job fails for nearly any reason
[ https://issues.apache.org/jira/browse/SPARK-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166555#comment-14166555 ]

Denis Serduik edited comment on SPARK-2019 at 10/10/14 8:39 AM:

I have noticed the same problem with worker behavior. My installation: Spark 1.0.2-hadoop2.0.0-mr1-cdh4.2.0 on Mesos 0.13. In my case, workers fail when there is an error while serializing the closure. Also please note that we run Spark in coarse-grained mode.

was (Author: dmaverick):
I have noticed the same problem with workers behavior. My installation: Spark 1.0.2-hadoop2.0.0-mr1-cdh4.2.0 on Mesos 0.13. In my case, workers fail when there was an error while serialization the closure.

Spark workers die/disappear when job fails for nearly any reason

Key: SPARK-2019
URL: https://issues.apache.org/jira/browse/SPARK-2019
Project: Spark
Issue Type: Bug
Affects Versions: 0.9.0
Reporter: sam

We either have to reboot all the nodes or run 'sudo service spark-worker restart' across our cluster. I don't think this should happen - the job failures are often not even that bad. There is a 5-upvoted SO question here: http://stackoverflow.com/questions/22031006/spark-0-9-0-worker-keeps-dying-in-standalone-mode-when-job-fails

We shouldn't be giving restart privileges to our devs, so our sysadmin has to restart the workers frequently. When the sysadmin is not around, there is nothing our devs can do.

Many thanks
[jira] [Comment Edited] (SPARK-2019) Spark workers die/disappear when job fails for nearly any reason
[ https://issues.apache.org/jira/browse/SPARK-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166555#comment-14166555 ]

Denis Serduik edited comment on SPARK-2019 at 10/10/14 8:40 AM:

I have noticed the same problem with worker behavior. My installation: Spark 1.0.2-hadoop2.0.0-mr1-cdh4.2.0 on Mesos 0.13. In my case, workers fail when there is an error while serializing the closure. Also please note that we run Spark in coarse-grained mode.

was (Author: dmaverick):
I have noticed the same problem with workers behavior. My installation: Spark 1.0.2-hadoop2.0.0-mr1-cdh4.2.0 on Mesos 0.13. In my case, workers fail when there was an error while serialization the closure. Also please notice that we run Spark in coarse-grained mode
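For illustration, a closure-serialization error of the kind described in the comment above typically surfaces when a transformation's closure captures a non-serializable object. The following is a minimal sketch only; the ConnectionPool class and the job code are hypothetical stand-ins, not code from the reporter's installation, and it shows the serialization failure itself rather than the worker crash reported in this ticket:

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical helper that is NOT Serializable (e.g. it wraps a network resource).
class ConnectionPool {
  def lookup(x: Int): Int = x * 2
}

object ClosureSerializationSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("closure-serialization-sketch"))

    val pool = new ConnectionPool // captured by the map closure below

    // Because `pool` is referenced inside the closure and is not Serializable,
    // Spark cannot serialize the task and throws
    // org.apache.spark.SparkException: Task not serializable
    // (caused by java.io.NotSerializableException) when the job is submitted.
    val doubled = sc.parallelize(1 to 10).map(x => pool.lookup(x)).collect()

    println(doubled.mkString(", "))
    sc.stop()
  }
}
{code}

A common way to avoid this class of error is to construct non-serializable objects inside the closure (for example per partition with mapPartitions) so that only serializable state is shipped to the executors.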
[jira] [Comment Edited] (SPARK-2019) Spark workers die/disappear when job fails for nearly any reason
[ https://issues.apache.org/jira/browse/SPARK-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018048#comment-14018048 ]

sam edited comment on SPARK-2019 at 6/5/14 9:47 AM:

Sorry. It's -0.9.1- 0.9.0

was (Author: sams):
Sorry. Its 0.9.1