[ 
https://issues.apache.org/jira/browse/SPARK-30227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-30227:
-----------------------------------

    Assignee: Jungtaek Lim

> Add close() on DataWriter interface
> -----------------------------------
>
>                 Key: SPARK-30227
>                 URL: https://issues.apache.org/jira/browse/SPARK-30227
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Major
>
> If the scaladoc of DataWriter is correct, the lifecycle of a DataWriter 
> instance ends at either commit() or abort(). That leads datasource 
> implementors to feel they can place resource cleanup in either method, but 
> abort() can also be called when commit() fails, so they have to ensure they 
> don't double-clean if the cleanup is not idempotent.
> So I'm proposing to explicitly add close() to DataWriter as "the place" for 
> resource cleanup. The lifecycle of a DataWriter instance will (and should) 
> end at close().
> I've checked some callers to see whether they can apply "try-catch-finally" 
> to ensure close() is called at the end of the DataWriter lifecycle, and they 
> appear able to do so.
> The change would be backward incompatible, but given that the interface is 
> marked as Evolving and we're already making backward-incompatible changes in 
> Spark 3.0, I feel it may not matter.
> I've raised the discussion around this issue and the feedback is positive: 
> https://lists.apache.org/thread.html/bfdb989fa83bc4d774804473610bd0cfcaa1dd5a020ca9a522f3510c%40%3Cdev.spark.apache.org%3E
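The proposed caller-side pattern can be sketched as follows. This is a minimal, hypothetical illustration: the interface and class names (DataWriter, ExampleWriter, WriterLifecycle) mimic the shape of Spark's DSv2 DataWriter but are NOT the actual Spark API, and the record type is simplified to String.

```java
import java.io.Closeable;
import java.util.List;

// Hypothetical, simplified version of the proposed interface: close() is the
// single, well-defined end of the lifecycle, separate from commit()/abort().
interface DataWriter<T> extends Closeable {
    void write(T record) throws Exception;
    void commit() throws Exception;  // may itself fail
    void abort() throws Exception;   // called after write()/commit() failures
    @Override void close();          // resource cleanup happens only here
}

// Toy implementation that records whether close() was invoked.
class ExampleWriter implements DataWriter<String> {
    private boolean closed = false;
    public void write(String record) { /* buffer the record */ }
    public void commit() { /* flush and finalize the output */ }
    public void abort() { /* discard partial output; no resource cleanup */ }
    public void close() { closed = true; /* release resources exactly once */ }
    boolean isClosed() { return closed; }
}

public class WriterLifecycle {
    // The "try-catch-finally" shape the issue describes: abort() runs on
    // failure, but close() is guaranteed by finally on every path, so
    // implementors no longer need idempotent cleanup in commit()/abort().
    static void run(DataWriter<String> writer, Iterable<String> records)
            throws Exception {
        try {
            for (String r : records) writer.write(r);
            writer.commit();
        } catch (Exception e) {
            writer.abort();
            throw e;
        } finally {
            writer.close();  // lifecycle always ends here
        }
    }

    public static void main(String[] args) throws Exception {
        ExampleWriter w = new ExampleWriter();
        run(w, List.of("a", "b"));
        System.out.println("closed=" + w.isClosed());
    }
}
```

Because DataWriter extends Closeable here, callers could equally rely on try-with-resources; the explicit finally above just makes the commit/abort/close ordering visible.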



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
