GitHub user LiShuMing opened a pull request: https://github.com/apache/spark/pull/21856
[SPARK-24738] [HistoryServer] FsHistoryProvider clean outdated event …

Current FsHistoryProvider policy:

1. FsHistoryProvider creates a checkEvent thread and a cleanLog thread; they share one thread pool to avoid conflicts.
2. `checkThread` replays all event logs first.
3. `cleanThread` iterates over all events, checks for outdated ones, and removes them.

Why doesn't the check thread clean the outdated event logs at start (in step 2)? Outdated logs would then never need to be replayed, which saves time.

## How was this patch tested?

TODO

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/LiShuMing/spark SPARK-24738

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21856.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21856

----
commit dc86b509226b8f13c21f67c038bfda42f15cfc7c
Author: åä¸ <shuming.lsm@...>
Date:   2018-07-24T07:13:13Z

    [SPARK-24738] [HistoryServer] FsHistoryProvider clean outdated event logs at start

----
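
For reviewers, the reordering proposed in this pull request (drop outdated event logs before replaying anything) can be illustrated with a minimal Java sketch. It is not the code in the patch and does not use any Spark API: `EventLog`, `dropOutdated`, and `startup` are hypothetical names, logs are modeled as in-memory objects rather than files, and `maxAgeMs` only plays the role that a setting such as `spark.history.fs.cleaner.maxAge` would play in a real deployment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class CleanBeforeReplaySketch {

    /** Hypothetical stand-in for an event log file: a name and a last-modified time. */
    static final class EventLog {
        final String name;
        final long lastModifiedMs;

        EventLog(String name, long lastModifiedMs) {
            this.name = name;
            this.lastModifiedMs = lastModifiedMs;
        }
    }

    /** Keeps only logs whose age (nowMs - lastModifiedMs) is at most maxAgeMs. */
    static List<EventLog> dropOutdated(List<EventLog> logs, long nowMs, long maxAgeMs) {
        List<EventLog> fresh = new ArrayList<>();
        for (EventLog log : logs) {
            if (nowMs - log.lastModifiedMs <= maxAgeMs) {
                fresh.add(log);
            }
        }
        return fresh;
    }

    /**
     * Startup order suggested in the PR description: clean first, then replay
     * only the surviving logs. Returns the names of the logs that were replayed.
     */
    static List<String> startup(List<EventLog> logs, long nowMs, long maxAgeMs) {
        List<String> replayed = new ArrayList<>();
        for (EventLog log : dropOutdated(logs, nowMs, maxAgeMs)) {
            // Placeholder for the expensive replay of one event log.
            replayed.add(log.name);
        }
        return replayed;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long maxAge = TimeUnit.DAYS.toMillis(7); // assumed retention window

        List<EventLog> logs = Arrays.asList(
            new EventLog("app-1", now - TimeUnit.DAYS.toMillis(10)), // outdated
            new EventLog("app-2", now - TimeUnit.DAYS.toMillis(1))); // still fresh

        // Only app-2 is replayed; app-1 is filtered out before any replay work.
        System.out.println(startup(logs, now, maxAge));
    }
}
```

With the current policy described above, both logs would be replayed and the outdated one removed only afterwards; in the sketch the outdated log is never replayed at all.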