[ https://issues.apache.org/jira/browse/CRUNCH-53?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shawn Smith updated CRUNCH-53:
------------------------------
Attachment: CRUNCH-53-autoclose.patch
I've attached a patch that closes the input files as long as the calling code
loops through the entire iterable (until Iterator.hasNext() returns false).
This should handle most situations.
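
For illustration only, here is a minimal sketch of the close-on-exhaustion
idea. It is not the attached patch: AutoCloseDemo and AutoClosingIterator are
hypothetical names, a plain java.io.Closeable stands in for the Avro
DataFileReader, and Java 8 or later is assumed.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

class AutoCloseDemo {
    /** Hypothetical wrapper: closes the resource once the delegate is exhausted. */
    static class AutoClosingIterator<T> implements Iterator<T> {
        private final Iterator<T> delegate;
        private final Closeable resource; // stand-in for the open file reader
        private boolean closed = false;

        AutoClosingIterator(Iterator<T> delegate, Closeable resource) {
            this.delegate = delegate;
            this.resource = resource;
        }

        @Override
        public boolean hasNext() {
            if (closed) {
                return false; // already exhausted and closed
            }
            boolean more = delegate.hasNext();
            if (!more) {
                close(); // first time we see the end: release the resource
            }
            return more;
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return delegate.next();
        }

        private void close() {
            closed = true;
            try {
                resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        Closeable resource = () -> System.out.println("closed");
        Iterator<String> it =
            new AutoClosingIterator<>(Arrays.asList("a", "b").iterator(), resource);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

The resource is released only when hasNext() first reports false, so a caller
that stops early never triggers the close.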
It doesn't fix the situation where the client doesn't loop through to
completion because of an early termination case or an exception being thrown.
That's actually the scenario that leads to the jets3t warning in the ticket
description. In those cases it will be left to finalizers to close files.
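
One possible way to cover the early-termination case, sketched under the
assumption that the caller is handed an iterator that is also Closeable. The
patch does not provide this: EarlyExitDemo, CloseableIterator and findFirst
are hypothetical names, and a Runnable stands in for closing the file.

```java
import java.io.Closeable;
import java.util.Arrays;
import java.util.Iterator;

class EarlyExitDemo {
    /** Hypothetical iterator that the caller can close explicitly. */
    static class CloseableIterator<T> implements Iterator<T>, Closeable {
        private final Iterator<T> delegate;
        private final Runnable onClose; // stand-in for closing the input file
        private boolean closed = false;

        CloseableIterator(Iterator<T> delegate, Runnable onClose) {
            this.delegate = delegate;
            this.onClose = onClose;
        }

        @Override
        public boolean hasNext() {
            return !closed && delegate.hasNext();
        }

        @Override
        public T next() {
            return delegate.next();
        }

        @Override
        public void close() {
            if (!closed) { // idempotent: run the close action only once
                closed = true;
                onClose.run();
            }
        }
    }

    /** Returns the first record with the given prefix, or null; always closes the iterator. */
    static String findFirst(CloseableIterator<String> records, String prefix) {
        // try-with-resources closes on early return and on exceptions alike
        try (CloseableIterator<String> it = records) {
            while (it.hasNext()) {
                String record = it.next();
                if (record.startsWith(prefix)) {
                    return record; // early termination
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        CloseableIterator<String> it = new CloseableIterator<>(
            Arrays.asList("a", "b", "c").iterator(),
            () -> System.out.println("closed"));
        System.out.println(findFirst(it, "b"));
    }
}
```

With this shape the close no longer depends on finalizers, at the cost of
requiring callers to manage the iterator's lifetime themselves.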
> AvroFileReaderFactory does not close input files
> ------------------------------------------------
>
> Key: CRUNCH-53
> URL: https://issues.apache.org/jira/browse/CRUNCH-53
> Project: Crunch
> Issue Type: Bug
> Components: IO
> Reporter: Shawn Smith
> Priority: Minor
> Attachments: CRUNCH-53-autoclose.patch
>
>
> The AvroFileReaderFactory read() method does not close its DataFileReader.
> With the Hadoop NativeS3FileSystem this can lead to the following warning:
> org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream:
> Successfully released HttpMethod in finalize(). You were lucky this time...
> Please ensure S3 response data streams are always fully consumed or closed.
> WARN [2012-08-28 19:26:16,035]
> org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream:
> Attempting to release HttpMethod in finalize() as its response data stream
> has gone out of scope. This attempt will not always succeed and cannot be
> relied upon! Please ensure S3 response data streams are always fully consumed
> or closed to avoid HTTP connection starvation.