[
https://issues.apache.org/jira/browse/HADOOP-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
[EMAIL PROTECTED] updated HADOOP-882:
-------------------------------------
Attachment: jets3t-upgrade.patch
jets3t-0.5.0.jar
Here's a patch that makes the minor changes necessary so the s3 implementation
can use the new 0.5.0 jets3t 'retrying' lib. It also exposes fs.s3.block.size
in hadoop-default.xml with a note about how to set the jets3t
RepeatableInputStream buffer size by adding a jets3t.properties to
${HADOOP_HOME}/conf. Setting this buffer to the same size as the s3 block
size avoids failures of the kind 'Input stream is not repeatable as 1048576
bytes have been written, exceeding the available buffer size of 131072'.
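To illustrate why the buffer has to be at least as large as the block, here is
a minimal Java sketch of a stream that can only be rewound while everything
read so far still fits in a fixed-size buffer. BoundedReplayStream and
canReplay are hypothetical names invented for this example. This is not the
jets3t RepeatableInputStream implementation, only a rough model of the failure
mode quoted above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical model of a retry-capable stream; NOT the jets3t class.
class BoundedReplayStream extends InputStream {
    private final InputStream source;
    private final byte[] buffer;   // holds the first buffer.length bytes for replay
    private long total = 0;        // bytes read from the source so far
    private int replayPos = -1;    // >= 0 while serving bytes from the buffer

    BoundedReplayStream(InputStream source, int bufferSize) {
        this.source = source;
        this.buffer = new byte[bufferSize];
    }

    @Override
    public int read() throws IOException {
        int stored = (int) Math.min(total, buffer.length);
        if (replayPos >= 0 && replayPos < stored) {
            return buffer[replayPos++] & 0xff;   // replay a buffered byte
        }
        replayPos = -1;                          // replay done, back to the source
        int b = source.read();
        if (b < 0) {
            return -1;
        }
        if (total < buffer.length) {
            buffer[(int) total] = (byte) b;      // remember it while there is room
        }
        total++;
        return b;
    }

    @Override
    public void reset() throws IOException {
        // Once more bytes have passed through than the buffer can hold,
        // the start of the data is gone and a retry cannot rewind.
        if (total > buffer.length) {
            throw new IOException("Input stream is not repeatable as " + total
                    + " bytes have been written, exceeding the available buffer"
                    + " size of " + buffer.length);
        }
        replayPos = 0;
    }

    // Reads blockSize bytes, then tries to rewind the way a retry would.
    static boolean canReplay(int blockSize, int bufferSize) {
        InputStream in = new BoundedReplayStream(
                new ByteArrayInputStream(new byte[blockSize]), bufferSize);
        try {
            for (int i = 0; i < blockSize; i++) {
                in.read();
            }
            in.reset();
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Under these assumptions, canReplay(1048576, 131072) returns false, which
corresponds to the 1MB block and 131072-byte buffer in the error above, while
canReplay(1048576, 1048576) returns true.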
The downside of this patch's approach is that if you want to match block and
buffer size, you need to set the same value in two places: once in
hadoop-site.xml and again in jets3t.properties. This seemed to me better than
the alternative, a tighter coupling that bubbles the main jets3t properties up
into the hadoop-*.xml filesystem section as fs.s3.jets3t.XXX properties, with
the init of the s3 filesystem setting the values into
org.jets3t.service.Jets3tProperties.
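For comparison, the tighter-coupling alternative might look roughly like the
Java sketch below. It is only an illustration of the idea, not code from the
patch: Jets3tPassThroughSketch and extract are hypothetical names, a plain Map
stands in for the Hadoop configuration, and a plain java.util.Properties
stands in for org.jets3t.service.Jets3tProperties.

```java
import java.util.Map;
import java.util.Properties;

class Jets3tPassThroughSketch {
    // Hypothetical prefix, following the fs.s3.jets3t.XXX naming described above.
    static final String PREFIX = "fs.s3.jets3t.";

    // Copies every prefixed entry into a Properties object, dropping the prefix.
    // Entries without the prefix (such as fs.s3.block.size) are ignored.
    static Properties extract(Map<String, String> hadoopConf) {
        Properties jets3t = new Properties();
        for (Map.Entry<String, String> e : hadoopConf.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                jets3t.setProperty(e.getKey().substring(PREFIX.length()),
                        e.getValue());
            }
        }
        return jets3t;
    }
}
```

With this scheme, a hadoop-*.xml entry named fs.s3.jets3t.foo would reach
jets3t as foo, so both sizes could live in one file, at the cost of the s3
filesystem needing to know about jets3t's configuration.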
I didn't change the default S3 block size from 1MB. Setting it to 64MB seems
too far from the default jets3t RepeatableInputStream buffer size, which is
only about 100k (131072 bytes in the error above).
I've included the 0.5.0 jets3t lib as part of the upload (there doesn't seem to
be a way to include binaries using svn diff). Its license is Apache 2.
Tom White, thanks for pointing me at the unit test. Also, I'd go along with
closing this issue with the update of the jets3t lib and opening another issue
to track the S3 filesystems implementing a general, 'traffic-level' hadoop
retry mechanism.
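As a rough idea of what such a general mechanism could look like, here is a
minimal Java sketch that retries an operation a fixed number of times on
IOException, as the issue description asks. RetrySketch, IOOperation and
withRetries are hypothetical names for illustration only. They are not part
of Hadoop or jets3t, and a real implementation would probably also want a
delay between attempts.

```java
import java.io.IOException;

class RetrySketch {
    // An operation that may fail with a communication problem.
    interface IOOperation<T> {
        T run() throws IOException;
    }

    // Runs op up to maxAttempts times; rethrows the last IOException
    // if every attempt fails.
    static <T> T withRetries(int maxAttempts, IOOperation<T> op)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.run();
            } catch (IOException e) {
                last = e;   // remember the failure and try again
            }
        }
        throw last;
    }
}
```

A caller would wrap each S3 call, for example
RetrySketch.withRetries(3, () -> someS3Call()), where someS3Call is a
placeholder for whatever operation talks to S3.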
> S3FileSystem should retry if there is a communication problem with S3
> ---------------------------------------------------------------------
>
> Key: HADOOP-882
> URL: https://issues.apache.org/jira/browse/HADOOP-882
> Project: Hadoop
> Issue Type: Improvement
> Components: fs
> Affects Versions: 0.10.1
> Reporter: Tom White
> Assigned To: Tom White
> Attachments: jets3t-0.5.0.jar, jets3t-upgrade.patch
>
>
> File system operations currently fail if there is a communication problem
> (IOException) with S3. All operations that communicate with S3 should retry a
> fixed number of times before failing.