GitHub user srini-daruna opened a pull request: https://github.com/apache/spark/pull/18414
Update status of application to RUNNING if executors are accepted and RUNNING (SPARK-21169)

## What changes were proposed in this pull request?

In Spark standalone HA, after an active master failure, a standby master is chosen and the workers and applications are re-registered with the new master. However, the application state does not move from WAITING to RUNNING.

In the `completeRecovery` method of the `org.apache.spark.deploy.master.Master` class, where cleanup of the workers and applications is done, this change moves an application to the RUNNING state after recovery if it has at least one executor and all of its executors are in the RUNNING state. In some cases executors will be in the LOADING state, but those cannot be counted toward moving the application to RUNNING, since an executor may also remain in LOADING due to resource unavailability.

## How was this patch tested?

Manually. To reproduce the existing bug:

1. Created a ZooKeeper cluster.
2. Configured Spark with recovery mode ZOOKEEPER and updated spark-env.sh with the recovery mode settings.
3. Updated spark-defaults on both workers and masters to list both masters: spark://<host1>:7077,<host2>:7077
4. Started spark master1, spark master2, and the workers, in that order.
5. master1 showed as ACTIVE and master2 as STANDBY.
6. Started a sample streaming application.
7. Killed spark-master1 and waited for the workers and applications to appear on master2. They appeared, but the job showed up in the WAITING state.

To verify the implemented fix:

1. Built spark-core with this change.
2. Removed the spark-core jar in the SPARK_HOME/jars folder and replaced it with the newly built one.
3. Performed the same steps as above and checked the job status.
4. Checked the spark-master logs to ensure the new log message got printed.
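The gist of the proposed change can be sketched as follows. This is a simplified, self-contained sketch, not the actual patch: `AppState`, `ExecState`, and `recoveredState` are hypothetical stand-ins for Spark's internal `ApplicationState`, `ExecutorState`, and the logic added inside `Master.completeRecovery`.

```scala
// Simplified sketch of the recovery-state decision described in this PR.
// These enums are stand-ins for Spark's internal ApplicationState and
// ExecutorState; the real change lives in Master.completeRecovery.
object RecoverySketch {
  object AppState extends Enumeration { val WAITING, RUNNING = Value }
  object ExecState extends Enumeration { val LAUNCHING, LOADING, RUNNING = Value }

  // After recovery, a WAITING application moves to RUNNING only if it has
  // at least one executor and every executor is RUNNING. Executors stuck
  // in LOADING do not count, since LOADING can also indicate resource
  // unavailability.
  def recoveredState(current: AppState.Value,
                     executors: Seq[ExecState.Value]): AppState.Value =
    if (current == AppState.WAITING &&
        executors.nonEmpty &&
        executors.forall(_ == ExecState.RUNNING)) AppState.RUNNING
    else current
}
```

With this rule, an application with executors `[RUNNING, RUNNING]` is promoted to RUNNING, while `[RUNNING, LOADING]` or an empty executor list leaves it in WAITING.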
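For reference, the HA setup in test steps 2–3 uses Spark's standalone ZooKeeper recovery properties. A minimal spark-env.sh sketch; the ZooKeeper ensemble address and the `/spark` znode directory are placeholders, not values taken from this PR:

```shell
# spark-env.sh on both masters: enable ZooKeeper-based standby recovery.
# zk1:2181,zk2:2181,zk3:2181 is a placeholder for the actual ZooKeeper quorum.
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
 -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
 -Dspark.deploy.zookeeper.dir=/spark"

# spark-defaults.conf on workers and clients: list both masters, as in step 3.
# spark.master  spark://<host1>:7077,<host2>:7077
```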
Screenshots:

![master1_ui_before_fix](https://user-images.githubusercontent.com/5573733/27513119-c73d7e76-5928-11e7-90c1-4a0151e6094b.png)
![master2_ui_before_fix](https://user-images.githubusercontent.com/5573733/27513120-c73db698-5928-11e7-8684-9a6d8adeabac.png)
![master2_ui_after_master1_is_killed_before_fix](https://user-images.githubusercontent.com/5573733/27513118-c73d7624-5928-11e7-8854-2f6a4aaceeb9.png)
![master1_ui_with_fix](https://user-images.githubusercontent.com/5573733/27513121-c73e23f8-5928-11e7-995b-2ebda7aad86c.png)
![master2_ui_after_recovery_with_fix](https://user-images.githubusercontent.com/5573733/27513117-c73cc10c-5928-11e7-89d6-039c1410b43c.png)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srini-daruna/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18414.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #18414

----

commit e48a0b002d128128c2b351b492de7f36dfcc67a9
Author: Daruna, Srinivasarao <srinivasarao.dar...@capitalone.com>
Date: 2017-06-25T00:18:42Z

    Update status of application to RUNNING if executors are accepted and RUNNING (SPARK-21169)

commit 703742c2d937bca4459edab1b3aac3b01c788a39
Author: Daruna, Srinivasarao <srinivasarao.dar...@capitalone.com>
Date: 2017-06-25T01:59:11Z

    Adding changes necessary to move application state to RUNNING, if executors are accepted and running after recovery