[ https://issues.apache.org/jira/browse/PIG-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892551#action_12892551 ]
Thejas M Nair commented on PIG-1519:
------------------------------------
As part of these changes, we should consider having each bag keep a (weak?)
reference to every iterator it has created, and calling clear() (a new method
in the iterator implementation class) on each of them to close its
DataInputStream and invalidate the iterator.
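
A rough sketch of that idea follows, in case it helps the discussion. The
class and method names (TrackedBag, BagIterator, isCleared) are made up for
illustration and are not Pig's actual classes; a byte array stands in for the
spill file so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a spillable bag; NOT Pig's DefaultAbstractBag.
class TrackedBag {
    // Stands in for the contents of a spill file (a sequence of ints).
    private final byte[] spilled;
    // Weak references, so the bag does not keep abandoned iterators alive.
    private final List<WeakReference<BagIterator>> iterators = new ArrayList<>();

    TrackedBag(byte[] spilled) {
        this.spilled = spilled;
    }

    synchronized BagIterator iterator() {
        // Forget iterators that have already been garbage collected.
        iterators.removeIf(ref -> ref.get() == null);
        BagIterator it = new BagIterator(
                new DataInputStream(new ByteArrayInputStream(spilled)));
        iterators.add(new WeakReference<>(it));
        return it;
    }

    // Closes the stream of every iterator that is still reachable.
    synchronized void clear() {
        for (WeakReference<BagIterator> ref : iterators) {
            BagIterator it = ref.get();
            if (it != null) {
                it.clear();
            }
        }
        iterators.clear();
    }

    static final class BagIterator implements Iterator<Integer> {
        private DataInputStream in;

        BagIterator(DataInputStream in) {
            this.in = in;
        }

        @Override
        public boolean hasNext() {
            if (in == null) {
                return false; // invalidated by clear()
            }
            try {
                // Good enough for an in-memory stream; a file-backed
                // iterator would need a real end-of-data check.
                return in.available() > 0;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            try {
                return in.readInt();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // The proposed new method: close the stream and invalidate the iterator.
        void clear() {
            if (in == null) {
                return;
            }
            try {
                in.close();
            } catch (IOException ignored) {
                // Nothing useful to do if close fails during cleanup.
            } finally {
                in = null;
            }
        }

        boolean isCleared() {
            return in == null;
        }
    }
}
```

The references are weak so that tracking the iterators does not itself prevent
them from being collected. The sketch does not synchronize the iterator's own
state, which a real implementation would have to consider.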
> Stop relying on finalize() to delete files, close filehandles in bag
> implementations
> ------------------------------------------------------------------------------------
>
> Key: PIG-1519
> URL: https://issues.apache.org/jira/browse/PIG-1519
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Priority: Minor
>
> In DefaultAbstractBag and its subclasses, the files used for spilling to disk
> are deleted in finalize().
> The iterators associated with these bags use DataInputStreams but don't call
> close on them, and the underlying FileInputStream.close() is called only
> through FileInputStream.finalize().
> The use of finalize has performance implications and also makes it hard to
> predict when the resources will get freed.
> WeakReferences can be used to avoid the use of finalize(). See
> http://java.sun.com/developer/technicalArticles/javase/finalization/ (look
> for "An Alternative to Finalization").
> I have marked the priority as minor because objects that have a finalize()
> method are allocated only for large bags that spill to disk (see related
> jira - PIG-1516), so the performance impact of the use of finalize is not
> likely to be significant. Also, I haven't come across any case where we have
> run out of these resources because the finalizer thread has not freed them
> yet.
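
For reference, the reference-queue approach mentioned in the description could
look roughly like the sketch below. It is only an illustration under stated
assumptions: SpillFileReaper and its methods are hypothetical names, not
existing Pig code, and it uses PhantomReference with a ReferenceQueue as one
possible variant of the technique.

```java
import java.io.File;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper; NOT part of Pig.
final class SpillFileReaper {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The reference objects themselves must stay strongly reachable,
    // otherwise they could be collected before they are ever enqueued.
    private final Set<FileRef> pending = ConcurrentHashMap.newKeySet();

    // Remembers which file belongs to the (possibly collected) owner.
    private static final class FileRef extends PhantomReference<Object> {
        final File file;

        FileRef(Object owner, File file, ReferenceQueue<Object> queue) {
            super(owner, queue);
            this.file = file;
        }
    }

    // Called when a bag creates a spill file. The owner must not be
    // reachable from the File, or it would never be collected.
    void register(Object owner, File spillFile) {
        pending.add(new FileRef(owner, spillFile, queue));
    }

    // Deletes the files of owners that have been garbage collected.
    // Returns how many files were actually deleted.
    int drain() {
        int deleted = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            FileRef fileRef = (FileRef) ref;
            pending.remove(fileRef);
            if (fileRef.file.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

drain() could be called whenever a new spill file is created, or from a
dedicated thread. When the file is deleted still depends on the garbage
collector, so an explicit cleanup call remains the only way to make the timing
predictable. On Java 9 and later, java.lang.ref.Cleaner packages up the same
pattern.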