GitHub user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/126#issuecomment-37484105

It's true that finalizers are not predictable, but what this patch does is no worse than before. The cleanup() logic is decoupled from finalize(), so it can still be called explicitly elsewhere. finalize() just adds one more case in which cleanup() is called: when the RDD / block goes out of scope and is garbage collected. Also, the "severe performance penalty" in the article refers to a relatively small increase (on the order of milliseconds), which is not really an issue since these cleanups don't happen very often.
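To make the decoupling concrete, here is a minimal Java sketch of the pattern, not the code in this patch. The class name `TrackedResource` and its helper methods are hypothetical stand-ins for an RDD / block handle. The idea is that cleanup() is idempotent, so an explicit call and a later call from finalize() cannot release anything twice.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an RDD / block handle; not Spark's actual class.
public class TrackedResource {
    private final AtomicBoolean cleaned = new AtomicBoolean(false);
    private final AtomicInteger cleanupRuns = new AtomicInteger(0);

    /** Idempotent cleanup: safe to call explicitly and again from finalize(). */
    public void cleanup() {
        // Only the first caller wins the compare-and-set and does the work.
        if (cleaned.compareAndSet(false, true)) {
            cleanupRuns.incrementAndGet();
            // Release shuffle files, cached blocks, etc. here (assumed).
        }
    }

    public boolean isCleaned() {
        return cleaned.get();
    }

    /** Number of times the cleanup work actually ran (0 or 1). */
    public int cleanupRuns() {
        return cleanupRuns.get();
    }

    // Fallback path only: the JVM may call this late or never, so callers
    // that need timely cleanup should still call cleanup() themselves.
    // finalize() is deprecated on newer JDKs, hence the suppression.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            cleanup();
        } finally {
            super.finalize();
        }
    }
}
```

With this shape, code that knows when it is done with the resource calls cleanup() directly, and the finalizer only covers handles that were dropped without an explicit call.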