[ https://issues.apache.org/jira/browse/SPARK-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Wendell updated SPARK-2425:
-----------------------------------
    Priority: Critical  (was: Major)

> Standalone Master is too aggressive in removing Applications
> ------------------------------------------------------------
>
>                 Key: SPARK-2425
>                 URL: https://issues.apache.org/jira/browse/SPARK-2425
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Mark Hamstra
>            Assignee: Mark Hamstra
>            Priority: Critical
>
> When standalone Executors trying to run a particular Application fail a
> cumulative ApplicationState.MAX_NUM_RETRY times, the Master removes the
> Application. This happens even if a number of Executors are successfully
> running the Application. It makes long-running standalone-mode Applications
> in particular unnecessarily vulnerable to limited failures in the cluster.
> For example, a single bad node on which Executors repeatedly fail for any
> reason can prevent an Application from starting, or can result in a running
> Application being removed even though it could continue to run successfully
> (just without making use of all potential Workers and Executors).
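
The behavior described above can be illustrated with a small toy model. This is only a sketch, not Spark's Master code: the class `AppModel`, its fields, and the helper `simulate` are hypothetical names, `max_num_retry` merely stands in for ApplicationState.MAX_NUM_RETRY, and the "lenient" policy is one possible alternative assumed here for contrast, not necessarily the fix adopted for this issue.

```python
from dataclasses import dataclass


@dataclass
class AppModel:
    """Toy stand-in for per-application bookkeeping (hypothetical, not Spark code)."""
    max_num_retry: int          # stands in for ApplicationState.MAX_NUM_RETRY
    running_executors: int = 0  # executors currently running the application
    retry_count: int = 0        # cumulative failure count, never reset in this model
    removed: bool = False

    def on_failure_reported(self) -> None:
        # Behavior as described in the report: the application is removed as
        # soon as the cumulative failure count reaches the limit, regardless
        # of how many executors are still running.
        self.retry_count += 1
        if self.retry_count >= self.max_num_retry:
            self.removed = True

    def on_failure_lenient(self) -> None:
        # Assumed alternative: remove the application only when the limit is
        # reached AND no executor is still running it.
        self.retry_count += 1
        if self.retry_count >= self.max_num_retry and self.running_executors == 0:
            self.removed = True


def simulate(lenient: bool, failures: int, running_executors: int,
             max_num_retry: int) -> bool:
    """Apply `failures` executor failures and report whether the app was removed."""
    app = AppModel(max_num_retry=max_num_retry,
                   running_executors=running_executors)
    for _ in range(failures):
        if lenient:
            app.on_failure_lenient()
        else:
            app.on_failure_reported()
    return app.removed


if __name__ == "__main__":
    # One bad node fails three times while five executors run fine elsewhere.
    print("reported policy removes app:", simulate(False, 3, 5, 3))
    print("lenient policy removes app: ", simulate(True, 3, 5, 3))
```

In this model, repeated failures on a single bad node are enough to remove the application under the reported policy, while the assumed lenient policy keeps it alive as long as at least one executor is still running.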