[ https://issues.apache.org/jira/browse/SPARK-6415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044110#comment-15044110 ]
Yoel Amram commented on SPARK-6415:
-----------------------------------

It would be useful to let the application decide when to fail fast. Several use cases seem relevant:

1. One or more workers run out of memory (an unbalanced load could cause only one or a few partitions to fail).
2. All workers fail because the destination is unreachable (e.g. network issues with the target database).

It may also be worth having a configuration similar to spark.task.maxFailures (i.e. spark.job.maxFailures) to retry job execution several times, possibly with a backoff period, before exiting the application. A rough sketch of such an application-level retry follows the quoted issue below.


> Spark Streaming fail-fast: Stop scheduling jobs when a batch fails, and kills
> the app
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-6415
>                 URL: https://issues.apache.org/jira/browse/SPARK-6415
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>            Reporter: Hari Shreedharan
>
> Of course, this would have to be done as a configurable param, but such a
> fail-fast is useful; otherwise it is painful to figure out what is happening
> when there are cascading failures. In some cases, the SparkContext shuts down
> and streaming keeps scheduling jobs.
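For illustration only, here is a minimal sketch of the kind of application-level retry the comment above describes. Everything in it is hypothetical: spark.job.maxFailures is a proposed setting, not an existing Spark configuration, and retryWithBackoff / writeBatch / FailFastSketch are made-up names rather than Spark APIs.

{code:scala}
import scala.util.{Failure, Success, Try}

object FailFastSketch {

  // Run an action up to maxFailures times, doubling the sleep between attempts.
  // If every attempt fails, rethrow the last error so the driver can exit (fail fast).
  @annotation.tailrec
  def retryWithBackoff[T](maxFailures: Int, backoffMs: Long)(action: => T): T =
    Try(action) match {
      case Success(result) => result
      case Failure(_) if maxFailures > 1 =>
        Thread.sleep(backoffMs)
        retryWithBackoff(maxFailures - 1, backoffMs * 2)(action)
      case Failure(e) => throw e
    }

  def main(args: Array[String]): Unit = {
    // writeBatch is a hypothetical stand-in for the per-batch output action,
    // e.g. the body of a foreachRDD that writes to the target database.
    def writeBatch(): Unit = println("writing batch to target database")

    // Give up after 3 attempts, starting with a 1-second backoff.
    retryWithBackoff(maxFailures = 3, backoffMs = 1000L) {
      writeBatch()
    }
  }
}
{code}

In a streaming application the same helper would typically wrap the output action inside foreachRDD, so a batch that keeps failing against an unreachable database eventually propagates its error and lets the driver decide whether to stop the StreamingContext.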