Re: Error while uploading large file to S3 via Hadoop 0.18
For the s3:// filesystem, files are split into 64MB blocks which are sent to S3 individually. Rather than increasing the jets3t.properties retry buffer and retry count, it is better to change the Hadoop properties fs.s3.maxRetries and fs.s3.sleepTimeSeconds: the Hadoop-level retry mechanism retries the whole block transfer, and since the block is stored on disk it doesn't consume memory. (The jets3t mechanism is still useful for retrying metadata operations.) See https://issues.apache.org/jira/browse/HADOOP-997 for background.

Tom

On Tue, Sep 2, 2008 at 4:23 PM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
> Actually not if you're using the s3:// as opposed to s3n:// ...
>
> Thanks,
> Ryan
>
> On Tue, Sep 2, 2008 at 11:21 AM, James Moore <[EMAIL PROTECTED]> wrote:
>> On Mon, Sep 1, 2008 at 1:32 PM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
>>> Hello,
>>>
>>> I'm trying to upload a fairly large file (18GB or so) to my AWS S3
>>> account via bin/hadoop fs -put ... s3://...
>>
>> Isn't the maximum size of a file on s3 5GB?
>>
>> --
>> James Moore | [EMAIL PROTECTED]
>> Ruby and Ruby on Rails consulting
>> blog.restphone.com
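(For anyone finding this thread later: the two Hadoop properties Tom mentions go in your hadoop-site.xml. The values below are illustrative, not recommendations; check the defaults shipped with your Hadoop version.)

```xml
<!-- hadoop-site.xml: retry settings for the s3:// block filesystem.
     A failed block transfer is retried from the on-disk block, so
     raising these costs time, not memory. Values here are examples. -->
<property>
  <name>fs.s3.maxRetries</name>
  <value>10</value>
</property>
<property>
  <name>fs.s3.sleepTimeSeconds</name>
  <value>30</value>
</property>
```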
Re: Error while uploading large file to S3 via Hadoop 0.18
Actually not if you're using the s3:// as opposed to s3n:// ...

Thanks,
Ryan

On Tue, Sep 2, 2008 at 11:21 AM, James Moore <[EMAIL PROTECTED]> wrote:
> On Mon, Sep 1, 2008 at 1:32 PM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
>> Hello,
>>
>> I'm trying to upload a fairly large file (18GB or so) to my AWS S3
>> account via bin/hadoop fs -put ... s3://...
>
> Isn't the maximum size of a file on s3 5GB?
>
> --
> James Moore | [EMAIL PROTECTED]
> Ruby and Ruby on Rails consulting
> blog.restphone.com
Re: Error while uploading large file to S3 via Hadoop 0.18
On Mon, Sep 1, 2008 at 1:32 PM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I'm trying to upload a fairly large file (18GB or so) to my AWS S3
> account via bin/hadoop fs -put ... s3://...

Isn't the maximum size of a file on s3 5GB?

--
James Moore | [EMAIL PROTECTED]
Ruby and Ruby on Rails consulting
blog.restphone.com
Re: Error while uploading large file to S3 via Hadoop 0.18
Thanks, trying it now!

Ryan

On Mon, Sep 1, 2008 at 6:04 PM, Albert Chern <[EMAIL PROTECTED]> wrote:
> Increase the retry buffer size in jets3t.properties and maybe up the
> number of retries while you're at it. If there is no template file
> included in Hadoop's conf dir you can find it at the jets3t web site.
> Make sure that it's from the same version that your copy of Hadoop is
> using.
>
> On Mon, Sep 1, 2008 at 1:32 PM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
>> Hello,
>>
>> I'm trying to upload a fairly large file (18GB or so) to my AWS S3
>> account via bin/hadoop fs -put ... s3://...
>>
>> It copies for a good 15 or 20 minutes, and then eventually errors out
>> with a failed retry attempt (saying that it can't retry since it has
>> already written a certain number of bytes; sorry, I don't have the
>> original error message at the moment). Has anyone experienced anything
>> similar? Can anyone suggest a workaround or a way to specify retries?
>> Should I use another tool for uploading large files to s3?
>>
>> Thanks,
>> Ryan
Re: Error while uploading large file to S3 via Hadoop 0.18
Increase the retry buffer size in jets3t.properties, and maybe up the number of retries while you're at it. If there is no template file included in Hadoop's conf dir, you can find one on the jets3t web site; make sure it's from the same jets3t version your copy of Hadoop is using.

On Mon, Sep 1, 2008 at 1:32 PM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I'm trying to upload a fairly large file (18GB or so) to my AWS S3
> account via bin/hadoop fs -put ... s3://...
>
> It copies for a good 15 or 20 minutes, and then eventually errors out
> with a failed retry attempt (saying that it can't retry since it has
> already written a certain number of bytes; sorry, I don't have the
> original error message at the moment). Has anyone experienced anything
> similar? Can anyone suggest a workaround or a way to specify retries?
> Should I use another tool for uploading large files to s3?
>
> Thanks,
> Ryan
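(Later note for the archive: a sketch of the jets3t.properties lines Albert is referring to. The property names below are assumptions based on the jets3t 0.6.x documentation and the values are examples only; verify both against the docs for the jets3t version bundled with your Hadoop.)

```properties
# jets3t.properties -- illustrative values, property names per jets3t 0.6.x docs.
# Size (bytes) of the buffer used to replay a partially-sent request on retry;
# the upload fails once more data has been sent than this buffer can hold.
s3service.stream-retry-buffer-size=1048576
# Maximum number of times jets3t retries a failed request.
httpclient.retry-max=10
```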
Error while uploading large file to S3 via Hadoop 0.18
Hello,

I'm trying to upload a fairly large file (18GB or so) to my AWS S3
account via bin/hadoop fs -put ... s3://...

It copies for a good 15 or 20 minutes, and then eventually errors out
with a failed retry attempt (saying that it can't retry since it has
already written a certain number of bytes; sorry, I don't have the
original error message at the moment). Has anyone experienced anything
similar? Can anyone suggest a workaround or a way to specify retries?
Should I use another tool for uploading large files to s3?

Thanks,
Ryan