[ https://issues.apache.org/jira/browse/SPARK-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-5697:
-----------------------------------

    Assignee: Apache Spark

> Allow Spark driver to wait longer before giving up connecting to the master
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-5697
>                 URL: https://issues.apache.org/jira/browse/SPARK-5697
>             Project: Spark
>          Issue Type: Improvement
>          Components: Deploy
>    Affects Versions: 1.1.1, 1.2.0
>            Reporter: Matt Cheah
>            Assignee: Apache Spark
>
> In the AppClient class, the driver is configured to attempt connecting to the
> master 3 times, with 20 second gaps, before giving up and killing the job.
> In reality, some clusters may have high amounts of traffic and resource
> contention, and in such environments jobs may wish to wait longer before
> giving up. This reduces the user's overhead of needing to resubmit jobs that
> simply had to wait for too long. An unreliable busy network may also cause
> messages to take a longer time to propagate.
> I suggest simply allowing the timeout and the number of retries for driver
> registration to be configurable.
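
For illustration, the proposal could be sketched roughly as follows. This is a minimal Python sketch, not Spark's AppClient code: the configuration key names, the register_with_master helper, and the try_register callback are all hypothetical, and the loop simplifies the behavior to "sleep between failed attempts". Only the defaults (3 attempts, 20 seconds apart) come from the issue description.

```python
import time
from typing import Callable, Mapping

# Hypothetical configuration keys, invented for this sketch; not actual Spark settings.
RETRIES_KEY = "example.driver.registration.retries"
TIMEOUT_KEY = "example.driver.registration.timeoutSeconds"

# Defaults mirror the values described in the issue: 3 attempts, 20 seconds apart.
DEFAULT_RETRIES = 3
DEFAULT_TIMEOUT_SECONDS = 20.0


def register_with_master(
    try_register: Callable[[], bool],
    conf: Mapping[str, str],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to register up to the configured number of times.

    try_register is a hypothetical callback that returns True once the
    master has acknowledged the registration. Returns False if every
    attempt fails, at which point the caller would give up on the job.
    """
    retries = int(conf.get(RETRIES_KEY, DEFAULT_RETRIES))
    timeout = float(conf.get(TIMEOUT_KEY, DEFAULT_TIMEOUT_SECONDS))
    if retries < 1:
        raise ValueError("number of registration attempts must be at least 1")

    for attempt in range(1, retries + 1):
        if try_register():
            return True
        # Wait before the next attempt, but not after the final one.
        if attempt < retries:
            sleep(timeout)
    return False
```

On a busy cluster, a job could then pass something like {RETRIES_KEY: "10", TIMEOUT_KEY: "60"} to wait longer before giving up, while jobs that set nothing keep the current behavior.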