PartitionReader extends Closeable, so it seems reasonable to me to do the same for DataWriter.
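
For concreteness, here is a minimal sketch of what that shape could look like. The trait below is a simplified, illustrative stand-in for the DSv2 DataWriter interface, not the actual Spark API:

import java.io.Closeable

// Illustrative only: a simplified stand-in for the DataWriter interface,
// extended with Closeable as proposed, so callers get a single,
// unconditional cleanup hook regardless of whether commit() or abort()
// was reached.
trait DataWriter[T] extends Closeable {
  def write(record: T): Unit
  def commit(): WriterCommitMessage
  def abort(): Unit

  // close() becomes the dedicated resource-cleanup point; commit() and
  // abort() no longer need to double as cleanup.
  override def close(): Unit
}

// Simplified placeholder for the commit message type.
trait WriterCommitMessage extends Serializable
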
On Wed, Dec 11, 2019 at 1:35 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
> Hi devs,
>
> I'd like to propose adding an explicit close() to DataWriter, as the
> place for resource cleanup.
>
> The rationale for the proposal is the lifecycle of DataWriter. If the
> scaladoc of DataWriter is correct, the lifecycle of a DataWriter
> instance ends at either commit() or abort(). That leads datasource
> implementors to feel they can place resource cleanup in either method,
> but abort() can be called when commit() fails, so they have to ensure
> they don't do double cleanup if the cleanup is not idempotent.
>
> I've checked some callers to see whether they can apply
> "try-catch-finally" to ensure close() is called at the end of the
> DataWriter lifecycle, and it appears they can, but I might be missing
> something.
>
> What do you think? It would be a backward-incompatible change, but the
> interface is marked as Evolving and we're already making
> backward-incompatible changes in Spark 3.0, so I feel it may not matter.
>
> Would love to hear your thoughts.
>
> Thanks in advance,
> Jungtaek Lim (HeartSaVioR)
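
A hedged sketch of the caller-side pattern described above, assuming the simplified DataWriter trait from the earlier sketch; the WriteTaskSketch object and runTask name are hypothetical, the point is only that close() runs exactly once in a finally block:

object WriteTaskSketch {
  // Hypothetical task-side loop illustrating the try-catch-finally shape:
  // commit() on success, abort() on failure, and close() exactly once in
  // finally, so writer implementations can keep cleanup non-idempotent.
  def runTask[T](writer: DataWriter[T], records: Iterator[T]): WriterCommitMessage = {
    try {
      records.foreach(writer.write)
      writer.commit()
    } catch {
      case t: Throwable =>
        writer.abort()
        throw t
    } finally {
      writer.close()
    }
  }
}
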