Nice, thanks for the answer! I'll craft a PR soon. Thanks again.

On Thu, Dec 12, 2019 at 3:32 AM Ryan Blue <[email protected]> wrote:

> Sounds good to me, too.
>
> On Wed, Dec 11, 2019 at 1:18 AM Jungtaek Lim <[email protected]> wrote:
>
>> Thanks for the quick response, Wenchen!
>>
>> I'll leave this thread open until early tomorrow so that someone in the US
>> timezone can chime in, and craft a patch if no one objects.
>>
>> On Wed, Dec 11, 2019 at 4:41 PM Wenchen Fan <[email protected]> wrote:
>>
>>> PartitionReader extends Closeable, so it seems reasonable to me to do
>>> the same for DataWriter.
>>>
>>> On Wed, Dec 11, 2019 at 1:35 PM Jungtaek Lim <[email protected]> wrote:
>>>
>>>> Hi devs,
>>>>
>>>> I'd like to propose adding close() explicitly on DataWriter, as the
>>>> place for resource cleanup.
>>>>
>>>> The rationale for the proposal is the lifecycle of DataWriter. If the
>>>> scaladoc of DataWriter is correct, the lifecycle of a DataWriter
>>>> instance ends at either commit() or abort(). That leads datasource
>>>> implementors to believe they can place resource cleanup in either one,
>>>> but abort() can be called when commit() fails, so they have to ensure
>>>> they don't do double cleanup if cleanup is not idempotent.
>>>>
>>>> I've checked some callers to see whether they can apply
>>>> "try-catch-finally" to ensure close() is called at the end of the
>>>> DataWriter's lifecycle, and it looks like they can, but I might be
>>>> missing something.
>>>>
>>>> What do you think? It would be a backward-incompatible change, but
>>>> given that the interface is marked as Evolving and we're making
>>>> backward-incompatible changes in Spark 3.0, I feel it may not matter.
>>>>
>>>> Would love to hear your thoughts.
>>>>
>>>> Thanks in advance,
>>>> Jungtaek Lim (HeartSaVioR)
>
> --
> Ryan Blue
> Software Engineer
> Netflix
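The lifecycle proposed in the thread can be sketched in Java. This is a hypothetical, self-contained illustration, not Spark's actual DataWriter interface: the `SketchDataWriter` name, the caller-side `run` helper, and the counting writer are all made up for the example. It shows the "try-catch-finally" pattern the proposal relies on, where abort() handles a failed commit and close() in the finally block is the single place for resource cleanup, so implementors no longer need idempotent cleanup in both commit() and abort().

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical sketch of the proposed shape: a writer interface that
// extends Closeable, mirroring what PartitionReader already does.
interface SketchDataWriter<T> extends Closeable {
    void write(T record) throws IOException;
    void commit() throws IOException;
    void abort() throws IOException;
}

public class DataWriterLifecycle {
    // Caller-side pattern from the thread: abort() only on failure,
    // close() always, exactly once, regardless of which path was taken.
    static <T> void run(SketchDataWriter<T> writer, Iterable<T> records)
            throws IOException {
        try {
            for (T r : records) {
                writer.write(r);
            }
            writer.commit();
        } catch (IOException e) {
            writer.abort();   // undo partial output; not resource cleanup
            throw e;
        } finally {
            writer.close();   // resource cleanup lives here, called once
        }
    }

    public static void main(String[] args) throws IOException {
        final int[] closes = {0};
        SketchDataWriter<String> writer = new SketchDataWriter<String>() {
            public void write(String r) {}
            public void commit() {}
            public void abort() {}
            public void close() { closes[0]++; }
        };
        run(writer, Arrays.asList("a", "b"));
        System.out.println("close() calls: " + closes[0]); // prints 1
    }
}
```

With this split of responsibilities, commit() and abort() stay concerned only with the fate of the written data, and whether cleanup is idempotent no longer matters to correctness.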
