Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/21096 As is, this PR isn't acceptable on multiple levels. Even if I were convinced (which I am not presently) that the sequence of `getShuffleDependencies` calls covered in this PR is the only one possible and therefore can be cached/memoized as this PR does, a complete lack of test coverage for the new code is not something we want in something as critical as the DAGScheduler. And I can tell you right now that there is at least one missing test that this PR will fail. You are adding a new data structure to the DAGScheduler, so at a bare minimum you also need to add that HashMap to `assertDataStructuresEmpty` in the DAGSchedulerSuite. Never deleting unneeded elements from a DAGScheduler data structure is an unacceptable resource leak.
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org