Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/21096
As is, this PR isn't acceptable on multiple levels. Even if I were
convinced (which I am not presently) that the sequence of
`getShuffleDependencies` calls covered in this PR is the only one possible and
therefore can be cached/memoized as this PR does, a complete lack of test
coverage for the new code is not something we want in something as critical as
the DAGScheduler. And I can tell you right now that there is at least one
missing test that this PR will fail. You are adding a new data structure to the
DAGScheduler, so at a bare minimum you also need to add that HashMap to
`assertDataStructuresEmpty` in the DAGSchedulerSuite. Never deleting unneeded
elements from a DAGScheduler data structure is an unacceptable resource leak.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org