GitHub user jianjianjiao opened a pull request:

    https://github.com/apache/spark/pull/22444

    implement incremental loading and add a flag to load incomplete or not

    ## What changes were proposed in this pull request?
    
    1.  Instead of loading all event logs on every load, load only a limited
        number of event logs each time. With tens of thousands of event logs,
        loading all of them takes a long time.
    2.  When Spark runs on YARN, information about running Spark jobs can be
        obtained from the YARN Application Master, so there is no need to load
        incomplete applications. Add a flag to skip loading them.
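
The two changes above can be illustrated with a minimal sketch. It is not taken from the patch: `EventLog`, `next_batch`, `batch_size`, and `load_incomplete` are hypothetical names, and loading the newest logs first is an assumption made only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class EventLog:
    """Hypothetical stand-in for one event log entry."""
    path: str
    modified_ms: int   # last-modified time, in milliseconds
    completed: bool    # False while the application is still running


def next_batch(logs: Iterable[EventLog],
               already_loaded: Set[str],
               batch_size: int,
               load_incomplete: bool) -> List[EventLog]:
    """Pick at most `batch_size` logs that have not been loaded yet.

    Incomplete logs are skipped when `load_incomplete` is False.
    Newest logs come first (an assumption of this sketch).
    """
    candidates = [
        log for log in logs
        if log.path not in already_loaded
        and (load_incomplete or log.completed)
    ]
    candidates.sort(key=lambda log: log.modified_ms, reverse=True)
    return candidates[:batch_size]
```

Calling `next_batch` repeatedly, adding each returned path to `already_loaded`, walks through the backlog in bounded steps instead of scanning everything in one pass.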
    
    ## How was this patch tested?
    This was tested manually in our production cluster.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jianjianjiao/spark speedUpSparkHistoryLoading

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22444.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22444
    
----
commit 1190ffcb109025bd62c909059b0cf16e6a748de9
Author: Rong Tang <rotang@...>
Date:   2018-09-17T22:00:23Z

    implement incremental loading and add a flag to load incomplete or not

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
