Sounds good to me, too.

On Wed, Dec 11, 2019 at 1:18 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
> Thanks for the quick response, Wenchen!
>
> I'll leave this thread open until early tomorrow so that someone in a US
> timezone can chime in, and will craft a patch if no one objects.
>
> On Wed, Dec 11, 2019 at 4:41 PM Wenchen Fan <cloud0...@gmail.com> wrote:
>
>> PartitionReader extends Closeable; it seems reasonable to me to do the
>> same for DataWriter.
>>
>> On Wed, Dec 11, 2019 at 1:35 PM Jungtaek Lim <
>> kabhwan.opensou...@gmail.com> wrote:
>>
>>> Hi devs,
>>>
>>> I'd like to propose adding close() to DataWriter explicitly, as the
>>> designated place for resource cleanup.
>>>
>>> The rationale for the proposal comes from the lifecycle of DataWriter.
>>> If the scaladoc of DataWriter is correct, the lifecycle of a DataWriter
>>> instance ends at either commit() or abort(). That leads datasource
>>> implementors to feel they can place resource cleanup on both sides, but
>>> abort() can be called when commit() fails; so they have to ensure they
>>> don't do double cleanup if their cleanup is not idempotent.
>>>
>>> I've checked some callers to see whether they can apply
>>> "try-catch-finally" to ensure close() is called at the end of the
>>> lifecycle of DataWriter, and it looks like they can, but I might be
>>> missing something.
>>>
>>> What do you think? It would be a backward-incompatible change, but
>>> given that the interface is marked as Evolving and we're already making
>>> backward-incompatible changes in Spark 3.0, I feel it may not matter.
>>>
>>> Would love to hear your thoughts.
>>>
>>> Thanks in advance,
>>> Jungtaek Lim (HeartSaVioR)

--
Ryan Blue
Software Engineer
Netflix
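[Editor's note] The lifecycle discussed in the thread can be sketched as follows. This is a hypothetical, simplified stand-in for illustration only, not Spark's actual DSv2 code: the DataWriter interface here is trimmed to the three lifecycle methods, and runTask mimics the caller-side try-catch-finally pattern Jungtaek describes, so that close() becomes the single cleanup point regardless of whether commit() or abort() ran.

```java
import java.io.Closeable;
import java.io.IOException;

public class WriterLifecycleSketch {

    // Simplified stand-in for Spark's DataWriter, extended with Closeable
    // as proposed. Not the real interface.
    interface DataWriter<T> extends Closeable {
        void write(T record) throws IOException;
        void commit() throws IOException;
        void abort() throws IOException;
    }

    // Hypothetical caller: abort on failure, but close() in finally either
    // way, so implementors put all resource cleanup in close() and never
    // need to worry about double cleanup from commit()/abort().
    static <T> void runTask(DataWriter<T> writer, Iterable<T> records)
            throws IOException {
        try {
            for (T r : records) {
                writer.write(r);
            }
            writer.commit();
        } catch (IOException e) {
            writer.abort();
            throw e;
        } finally {
            writer.close(); // single cleanup point, called exactly once
        }
    }

    public static void main(String[] args) throws IOException {
        StringBuilder log = new StringBuilder();
        DataWriter<String> w = new DataWriter<String>() {
            public void write(String s) { log.append("write(").append(s).append(") "); }
            public void commit() { log.append("commit "); }
            public void abort() { log.append("abort "); }
            public void close() { log.append("close"); }
        };
        runTask(w, java.util.Arrays.asList("a", "b"));
        System.out.println(log); // write(a) write(b) commit close
    }
}
```

On the failure path, abort() runs from the catch block and close() still runs from the finally block, which is why cleanup placed only in close() need not be idempotent across commit()/abort().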