yanxiaole commented on pull request #29392: URL: https://github.com/apache/spark/pull/29392#issuecomment-670993081
This happens when the underlying data is modified during iteration; `delete` is the main culprit. Take the `LevelDB` implementation as an example: the iterator's `next` function first checks `hasNext` and then calls `get`. If the entry is deleted between those two calls, `get` throws a `NoSuchElementException`, which interrupts the calling workflow. In our cluster, it either terminates the `cleanLogs` thread or interrupts `checkForLogs`, which introduces inconsistency between the logs and the app listing.

https://github.com/apache/spark/blob/2ec9b866285fc059cae6816033babca64b4da7ec/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java#L125
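
The check-then-get race described in this comment can be reproduced with a minimal sketch. The code below is not the Spark `KVStore` or `LevelDBIterator` code: `SketchStore`, `CheckThenGetSketch`, and `deleteBetweenCallsThrows` are hypothetical names, the store is an in-memory `TreeMap`, and the deletion is done on the same thread to keep the result deterministic. It only assumes what the comment states, namely that `next` checks `hasNext` and then looks the entry up again with `get`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

public class CheckThenGetSketch {

  // Hypothetical stand-in for a key-value store; NOT the Spark KVStore API.
  static class SketchStore {
    private final Map<String, String> data = new TreeMap<>();

    void put(String key, String value) {
      data.put(key, value);
    }

    void delete(String key) {
      data.remove(key);
    }

    // Like the lookup described in the comment: a missing key is an error.
    String get(String key) {
      String value = data.get(key);
      if (value == null) {
        throw new NoSuchElementException(key);
      }
      return value;
    }

    // Iterates over a snapshot of the keys, but reads each value lazily,
    // so an entry can disappear between hasNext() and next().
    Iterator<String> iterator() {
      final List<String> keys = new ArrayList<>(data.keySet());
      return new Iterator<String>() {
        private int pos = 0;

        @Override
        public boolean hasNext() {
          return pos < keys.size();
        }

        @Override
        public String next() {
          if (!hasNext()) {
            throw new NoSuchElementException();
          }
          // The second lookup is where a deleted entry surfaces.
          return SketchStore.this.get(keys.get(pos++));
        }
      };
    }
  }

  // Returns true if deleting an entry between hasNext() and next()
  // makes next() throw NoSuchElementException.
  static boolean deleteBetweenCallsThrows() {
    SketchStore store = new SketchStore();
    store.put("app-1", "log-1");
    Iterator<String> it = store.iterator();
    if (!it.hasNext()) {
      return false;
    }
    store.delete("app-1"); // simulates a concurrent delete
    try {
      it.next();
      return false;
    } catch (NoSuchElementException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(deleteBetweenCallsThrows()); // prints "true"
  }
}
```

In a real deployment the delete would come from another thread, so the window between the two calls is hit only occasionally, which matches the intermittent failures described for `cleanLogs` and `checkForLogs`.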