[GitHub] [spark] zhouyejoe commented on pull request #29392: [SPARK-32574][CORE] Race condition in FsHistoryProvider listing iteration

2020-08-12 Thread GitBox
zhouyejoe commented on pull request #29392: URL: https://github.com/apache/spark/pull/29392#issuecomment-673037833 @yanxiaole From the stack trace, there is indeed a race condition between the replayTask and checkForLogs()/cleanLogs(), where the latter two have code that transfers the whole

[GitHub] [spark] zhouyejoe commented on pull request #29392: [SPARK-32574][CORE] Race condition in FsHistoryProvider listing iteration

2020-08-11 Thread GitBox
zhouyejoe commented on pull request #29392: URL: https://github.com/apache/spark/pull/29392#issuecomment-672598759 Hi, @yanxiaole. I double-checked the code in checkForLogs(). I think SPARK-29043 does actually handle the race condition by filtering out the stale.filterNot(isProcessing),

[GitHub] [spark] zhouyejoe commented on pull request #29392: [SPARK-32574][CORE] Race condition in FsHistoryProvider listing iteration

2020-08-11 Thread GitBox
zhouyejoe commented on pull request #29392: URL: https://github.com/apache/spark/pull/29392#issuecomment-672428047 @yanxiaole I don't think there will be a race condition between the checkForLogs() and cleanLogs() threads. These two threads are launched from the same pool, and the thread