[ http://issues.apache.org/jira/browse/HADOOP-574?page=comments#action_12449188 ]

Tom White commented on HADOOP-574:
----------------------------------

Thanks Doug. Collaboration sounds good: I'll contact Jim directly.

Regarding HADOOP-571 I agree it makes sense to tackle this in conjunction. I'll 
have a look at it after we get the basics of the S3 filesystem working.

As far as the design goes I agree that (like DFS) the S3 filesystem should 
divide things into blocks and buffer them to disk before writing them to S3. 
I'm not sure about putting the block number at the end of the filename 
(using a delimiter), since this makes renames very inefficient: S3 has no 
rename operation, so every block would have to be written again under the 
new name. Instead I have opted for a level of indirection whereby the 
S3 object at the filename is a metadata file which lists the block IDs that 
hold the data. A rename then is simply a re-PUT of the metadata. What do you 
think?
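
To make the indirection concrete, here is a rough sketch of the idea, not the 
actual patch. Everything in it is an assumption for illustration: FakeBucket 
is an in-memory stand-in for S3 (PUT, GET and DELETE only), IndirectFileStore 
and the "block_<n>" key naming are made up, and the metadata format is just 
one block ID per line. I have also assumed a rename deletes the old metadata 
object after the re-PUT.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for an S3 bucket: no rename operation. */
class FakeBucket {
    private final Map<String, byte[]> objects = new HashMap<>();
    int requests = 0; // counts simulated HTTP requests

    void put(String key, byte[] data) { requests++; objects.put(key, data); }
    byte[] get(String key) { requests++; return objects.get(key); }
    void delete(String key) { requests++; objects.remove(key); }
    boolean exists(String key) { return objects.containsKey(key); } // local check, not counted
}

/** Hypothetical store: the object at the file's path only lists block IDs. */
class IndirectFileStore {
    private final FakeBucket bucket;
    private final int blockSize;
    private long nextBlockId = 0;

    IndirectFileStore(FakeBucket bucket, int blockSize) {
        this.bucket = bucket;
        this.blockSize = blockSize;
    }

    /** Splits data into blocks, PUTs each block, then PUTs the metadata. */
    void write(String path, byte[] data) {
        List<String> ids = new ArrayList<>();
        for (int off = 0; off < data.length; off += blockSize) {
            int end = Math.min(off + blockSize, data.length);
            String id = "block_" + (nextBlockId++); // assumed naming scheme
            bucket.put(id, Arrays.copyOfRange(data, off, end));
            ids.add(id);
        }
        // Assumed metadata format: one block ID per line.
        bucket.put(path, String.join("\n", ids).getBytes(StandardCharsets.UTF_8));
    }

    /** GETs the metadata, then GETs each listed block in order. */
    byte[] read(String path) {
        String meta = new String(bucket.get(path), StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!meta.isEmpty()) {
            for (String id : meta.split("\n")) {
                byte[] block = bucket.get(id);
                out.write(block, 0, block.length);
            }
        }
        return out.toByteArray();
    }

    /** Moves only the metadata object; the data blocks are never touched. */
    void rename(String from, String to) {
        byte[] meta = bucket.get(from);
        bucket.put(to, meta);
        bucket.delete(from); // assumption: old name is removed after the re-PUT
    }
}
```

In this sketch a rename costs three requests however many blocks the file 
has, whereas with the block number in the filename the cost would grow with 
the number of blocks.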

The other aspect I haven't put much thought into yet is locking. Keeping the 
number of HTTP requests to a minimum will be an interesting challenge.


> want FileSystem implementation for Amazon S3
> --------------------------------------------
>
>                 Key: HADOOP-574
>                 URL: http://issues.apache.org/jira/browse/HADOOP-574
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Doug Cutting
>
> An S3-based Hadoop FileSystem would make a great addition to Hadoop.
> It would facilitate use of Hadoop on Amazon's EC2 computing grid, as 
> discussed here:
> http://www.mail-archive.com/[email protected]/msg00318.html
> This is related to HADOOP-571, which would make Hadoop's FileSystem 
> considerably easier to extend.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
