GitHub user srini-daruna opened a pull request:

    https://github.com/apache/spark/pull/18414

    Update status of application to RUNNING if executors are accepted and 
RUNNING SPARK-21169

    ## What changes were proposed in this pull request?
    
    In Spark HA, after the active master fails, a standby master is chosen and 
workers and applications are re-registered with the new master.
    However, the application state does not move from the WAITING state to the 
RUNNING state.
    
    This code change checks applications after recovery and, if all of an 
application's executors are RUNNING, moves the application to the RUNNING state.
    
    In the completeRecovery method of the org.apache.spark.deploy.master.Master 
class, where cleanup of the workers and applications is done,
    I have added code to move an application to the RUNNING state if the 
application has more than one executor and all of them are in RUNNING status.
    
    In some cases, executors will be in LOADING status, but those cannot be 
counted toward moving the application to RUNNING, because an executor can also 
be in LOADING status due to resource unavailability.
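
The rule described above can be sketched roughly as follows. This is a minimal, self-contained model and not the code in the patch: `RecoverySketch`, `App`, `promoteAfterRecovery`, and the simplified state objects are hypothetical stand-ins for the master's real bookkeeping, and the `size > 1` threshold simply follows the wording "more than one executor" in this description.

```scala
// Hypothetical stand-ins; these are NOT Spark's ApplicationInfo/ExecutorState classes.
object RecoverySketch {
  sealed trait ExecutorState
  case object Launching extends ExecutorState
  case object Loading extends ExecutorState
  case object Running extends ExecutorState

  sealed trait AppState
  case object Waiting extends AppState
  case object AppRunning extends AppState

  final case class App(id: String, state: AppState, executors: Seq[ExecutorState])

  // Promote a WAITING application only when it has more than one executor
  // (threshold taken literally from the description) and every executor is RUNNING.
  // LOADING executors do not count, since LOADING can also mean resources are unavailable.
  def promoteAfterRecovery(app: App): App = {
    val allRunning = app.executors.size > 1 && app.executors.forall(_ == Running)
    if (app.state == Waiting && allRunning) app.copy(state = AppRunning) else app
  }
}
```

Under these assumptions, an application with executors in RUNNING and LOADING stays in WAITING after recovery, while one whose executors are all RUNNING is promoted.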
    
    ## How was this patch tested?
    
    To reproduce the existing bug:
        1) Created a ZooKeeper cluster.
        2) Configured Spark with ZooKeeper recovery mode and updated 
spark-env.sh with the recovery mode settings.
        3) Updated spark-defaults on both the workers and the masters to list 
both masters: spark://<host1>:7077,<host2>:7077
        4) Started Spark master1, Spark master2, and the workers, in that 
order.
        5) master1 was ACTIVE and master2 showed as STANDBY.
        6) Started a sample streaming application.
        7) Killed spark-master1 and waited for the workers and applications to 
appear in master2. They appeared, and the job showed up in WAITING state.
    
    
    To check the implemented fix:
        1) Built spark-core.
        2) Removed the spark-core jar from the SPARK_HOME/jars folder and 
replaced it with the newly built one.
        3) Performed the same steps as above and checked the job status.
        4) Checked the spark-master logs to ensure the log message got printed and
    
    
    Please review http://spark.apache.org/contributing.html before opening a 
pull request.
    
![master2_ui_after_recovery_with_fix](https://user-images.githubusercontent.com/5573733/27513117-c73cc10c-5928-11e7-89d6-039c1410b43c.png)
    
![master1_ui_with_fix](https://user-images.githubusercontent.com/5573733/27513121-c73e23f8-5928-11e7-995b-2ebda7aad86c.png)
    
![master2_ui_after_master1_is_killed_before_fix](https://user-images.githubusercontent.com/5573733/27513118-c73d7624-5928-11e7-8854-2f6a4aaceeb9.png)
    
![master2_ui_before_fix](https://user-images.githubusercontent.com/5573733/27513120-c73db698-5928-11e7-8684-9a6d8adeabac.png)
    
![master1_ui_before_fix](https://user-images.githubusercontent.com/5573733/27513119-c73d7e76-5928-11e7-90c1-4a0151e6094b.png)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srini-daruna/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18414.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18414
    
----
commit e48a0b002d128128c2b351b492de7f36dfcc67a9
Author: Daruna, Srinivasarao <srinivasarao.dar...@capitalone.com>
Date:   2017-06-25T00:18:42Z

    Update status of application to RUNNING if executors are accepted and 
RUNNING SPARK-21169

commit 703742c2d937bca4459edab1b3aac3b01c788a39
Author: Daruna, Srinivasarao <srinivasarao.dar...@capitalone.com>
Date:   2017-06-25T01:59:11Z

    Adding changes necessary to move application state to RUNNING,if executors 
are accepted and running after recovery

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
