For the s3:// filesystem, files are split into 64MB blocks that are
sent to S3 individually. Rather than increasing the jets3t.properties
retry buffer and retry count, it is better to change the Hadoop
properties fs.s3.maxRetries and fs.s3.sleepTimeSeconds: the
Hadoop-level retry mechanism retries the whole block transfer, and
since each block is staged on local disk it doesn't consume memory.
(The jets3t mechanism is still useful for retrying metadata
operations.) See https://issues.apache.org/jira/browse/HADOOP-997 for
background.
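In case it helps, here is a minimal sketch of setting those two
properties programmatically before copying a local file up to S3. The
bucket name, paths, and retry values are placeholders, not
recommendations:

  import java.net.URI;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class S3BlockPut {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();

      // Hadoop-level retry settings for the s3:// block filesystem.
      // Each retry re-sends a whole 64MB block, which is staged on
      // local disk, so raising these doesn't increase memory use.
      conf.setInt("fs.s3.maxRetries", 10);        // illustrative value
      conf.setInt("fs.s3.sleepTimeSeconds", 30);  // illustrative value

      // Roughly equivalent to: bin/hadoop fs -put <local> s3://my-bucket/<dest>
      FileSystem s3 = FileSystem.get(new URI("s3://my-bucket/"), conf);
      s3.copyFromLocalFile(new Path(args[0]), new Path(args[1]));
    }
  }

If you're using bin/hadoop fs -put directly, the same two properties
can instead go in your hadoop-site.xml.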

Tom

On Tue, Sep 2, 2008 at 4:23 PM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
> Actually not if you're using s3:// as opposed to s3n:// ...
>
> Thanks,
> Ryan
>
>
> On Tue, Sep 2, 2008 at 11:21 AM, James Moore <[EMAIL PROTECTED]> wrote:
>> On Mon, Sep 1, 2008 at 1:32 PM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
>>> Hello,
>>>
>>> I'm trying to upload a fairly large file (18GB or so) to my AWS S3
>>> account via bin/hadoop fs -put ... s3://...
>>
>> Isn't the maximum size of a file on s3 5GB?
>>
>> --
>> James Moore | [EMAIL PROTECTED]
>> Ruby and Ruby on Rails consulting
>> blog.restphone.com
>>
>
