GitHub user jianjianjiao opened a pull request: https://github.com/apache/spark/pull/22444
    implement incremental loading and add a flag to load incomplete or not

    ## What changes were proposed in this pull request?

    1. Instead of loading all event logs on every scan, load only a limited number of event logs each time. With tens of thousands of event logs, loading all of them takes a long time.
    2. When Spark runs on YARN, information about running jobs can be obtained from the YARN ApplicationMaster, so there is no need to load incomplete applications. Add a flag that controls whether they are loaded.

    ## How was this patch tested?

    This was tested manually in our production cluster.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jianjianjiao/spark speedUpSparkHistoryLoading

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22444.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22444

----
commit 1190ffcb109025bd62c909059b0cf16e6a748de9
Author: Rong Tang <rotang@...>
Date:   2018-09-17T22:00:23Z

    implement incremental loading and add a flag to load incomplete or not

----
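The two changes described in this pull request can be illustrated with a minimal sketch. This is not the code from the patch and not Spark's history provider API: `EventLog`, `select_logs_to_load`, `max_per_scan`, and `load_incomplete` are hypothetical names chosen for illustration, and the newest-first ordering is an assumption the description does not state. The sketch only shows the idea of loading a bounded batch per scan and optionally skipping incomplete applications.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class EventLog:
    """Hypothetical stand-in for one event log file in the history directory."""
    name: str
    modified: int     # last-modified timestamp, larger is newer
    completed: bool   # False while the application is still running


def select_logs_to_load(
    logs: Iterable[EventLog],
    already_loaded: Set[str],
    max_per_scan: int,
    load_incomplete: bool,
) -> List[EventLog]:
    """Pick at most max_per_scan not-yet-loaded logs for this scan."""
    candidates = [
        log for log in logs
        if log.name not in already_loaded
        and (load_incomplete or log.completed)  # the flag filters running apps
    ]
    # Assumption: newest logs are loaded first; the PR text does not say.
    candidates.sort(key=lambda log: log.modified, reverse=True)
    return candidates[:max_per_scan]


logs = [
    EventLog("app-1", modified=100, completed=True),
    EventLog("app-2", modified=200, completed=False),
    EventLog("app-3", modified=300, completed=True),
    EventLog("app-4", modified=400, completed=True),
]
loaded: Set[str] = set()

# Two consecutive scans, each limited to two logs, skipping incomplete apps.
first = select_logs_to_load(logs, loaded, max_per_scan=2, load_incomplete=False)
loaded.update(log.name for log in first)
second = select_logs_to_load(logs, loaded, max_per_scan=2, load_incomplete=False)
loaded.update(log.name for log in second)

print([log.name for log in first], [log.name for log in second])
```

Each scan does a bounded amount of work, so the remaining logs are picked up over later scans rather than in one long pass. With the flag off, the incomplete `app-2` is never selected.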