[ https://issues.apache.org/jira/browse/SPARK-30227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-30227.
---------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 26855
[https://github.com/apache/spark/pull/26855]

> Add close() on DataWriter interface
> -----------------------------------
>
>                 Key: SPARK-30227
>                 URL: https://issues.apache.org/jira/browse/SPARK-30227
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Major
>             Fix For: 3.0.0
>
>
> If the scaladoc of DataWriter is correct, the lifecycle of a DataWriter
> instance ends at either commit() or abort(). That leads data source
> implementors to feel they can place resource cleanup on either side, but
> abort() can also be called when commit() fails, so they have to ensure they
> don't clean up twice if cleanup is not idempotent.
> So I'm proposing to add close() on DataWriter explicitly, as "the place"
> for resource cleanup. The lifecycle of a DataWriter instance will (and
> should) end at close().
> I've checked some callers to see whether they can apply "try-catch-finally"
> to ensure close() is called at the end of the DataWriter lifecycle, and
> they can.
> The change is backward incompatible, but given that the interface is marked
> as Evolving and we're already making backward incompatible changes in Spark
> 3.0, I feel it may not matter.
> I've raised this for discussion and the feedback has been positive:
> https://lists.apache.org/thread.html/bfdb989fa83bc4d774804473610bd0cfcaa1dd5a020ca9a522f3510c%40%3Cdev.spark.apache.org%3E

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
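The try-catch-finally lifecycle the issue proposes can be sketched as below. This is a minimal illustration, not Spark code: SimpleDataWriter is a trimmed-down stand-in for the DSv2 DataWriter interface, and DataWriterLifecycle.run is a hypothetical caller; the point is that commit()/abort() decide the task's fate while close() in the finally block is the single place for resource cleanup.

```java
// Simplified stand-in for Spark's DataWriter, for illustration only.
interface SimpleDataWriter {
    void write(String record) throws Exception;
    void commit() throws Exception;  // called once if all writes succeed
    void abort() throws Exception;   // called if write() or commit() failed
    void close() throws Exception;   // always called last: the one place for cleanup
}

public class DataWriterLifecycle {
    // Hypothetical caller driving a writer through the proposed lifecycle:
    // commit on success, abort on failure, and close() exactly once.
    static void run(SimpleDataWriter writer, Iterable<String> records) throws Exception {
        try {
            for (String r : records) {
                writer.write(r);
            }
            writer.commit();
        } catch (Exception e) {
            writer.abort();  // abort may follow a failed commit()
            throw e;
        } finally {
            writer.close();  // cleanup lives here, not in commit()/abort()
        }
    }
}
```

With this shape, an implementor no longer needs idempotent cleanup in both commit() and abort(): even when commit() throws and abort() runs, close() is still invoked exactly once afterwards.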