[ https://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525165 ]

Doug Cutting commented on HADOOP-1700:
--------------------------------------

> sameer: solve the problem of detecting replicas from deleted files

Is that actually a problem?  We'll only generate such garbage blocks when nodes 
fail, and only a few then, so they don't accumulate at a high rate, and the 
current blockreport mechanism already handles them.  Block versioning won't let 
us get rid of the blockreport mechanism, so I still don't see this as an 
advantage of block versioning.  Am I missing something?
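A toy illustration of the blockreport reconciliation referred to above: on each
report the namenode checks every reported replica against the blocks still owned
by files, and anything unmatched is scheduled for deletion on the reporting
datanode.  The class and method names below are hypothetical, a simplified
sketch rather than the actual FSNamesystem code.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Set;

    /**
     * Hypothetical sketch (not the real namenode code): replicas whose block
     * id is no longer mapped to any file are treated as garbage from deleted
     * files and scheduled for invalidation.
     */
    public class BlockReportSketch {

      /** Block ids the namespace currently maps to live files (assumed input). */
      private final Set<Long> blocksBelongingToFiles;

      public BlockReportSketch(Set<Long> blocksBelongingToFiles) {
        this.blocksBelongingToFiles = blocksBelongingToFiles;
      }

      /**
       * Returns the block ids from a datanode's report that should be
       * invalidated because the file that owned them has been deleted.
       */
      public List<Long> findGarbageReplicas(long[] reportedBlockIds) {
        List<Long> toInvalidate = new ArrayList<Long>();
        for (long blockId : reportedBlockIds) {
          if (!blocksBelongingToFiles.contains(blockId)) {
            toInvalidate.add(blockId);   // replica of a deleted file
          }
        }
        return toInvalidate;
      }
    }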

> eric: see updates at a reasonable rate, i.e. soon after each flush or every 64k 
> bytes or so, with less than a second's latency

That's a new requirement that I'd not heard before.  The mechanism I'd proposed 
would not make appends visible until the file is closed, which is not what 
Eric's asking for.  So this may make the case for block versioning.  If we 
really must support what Eric's asking, then non-persistent timestamps on the 
blocks may be required.
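A sketch of the writer-side behaviour Eric is describing, assuming a
hypothetical append()/sync() pair on the client API (neither exists in HDFS at
the time of this discussion): a flush should make the appended bytes readable by
other clients within about a second, well before close().

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    /**
     * Sketch of the desired visibility semantics: appended bytes become
     * readable shortly after each flush, not only at close().  Both append()
     * and sync() are assumed APIs here.
     */
    public class VisibleAppendSketch {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path log = new Path("/logs/current");

        FSDataOutputStream out = fs.append(log);          // assumed API
        out.write("one more record\n".getBytes("UTF-8"));
        out.sync();   // assumed: push data to datanodes and make it visible

        // A concurrent reader should now see the new record within ~1 second,
        // without waiting for out.close().
      }
    }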


> Append to files in HDFS
> -----------------------
>
>                 Key: HADOOP-1700
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1700
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: stack
>
> Request for being able to append to files in HDFS has been raised a couple of 
> times on the list of late.  For one example, see 
> http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193.
> Other mail describes folks' workarounds because this feature is lacking: 
> e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480 
> (later in that thread, Jim Kellerman re-raises the HBase need for this 
> feature).  HADOOP-337 'DFS files should be appendable' mentions file 
> append, but it was opened early in the life of HDFS when the focus was more on 
> implementing the basics than on adding new features, and interest fizzled.  
> Because HADOOP-337 is also a bit of a grab-bag -- it includes truncation and 
> being able to concurrently read/write -- rather than try to breathe new life 
> into it, here is a new issue focused on file append.  
> Ultimately, being able to do as the Google GFS paper describes -- having 
> multiple concurrent clients make 'Atomic Record Append' to a single file -- 
> would be sweet, but at least for a first cut at this feature, IMO, a single 
> client appending to a single HDFS file, letting the application manage the 
> access, would be sufficient.
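
A sketch of the "single writer, application-managed access" first cut described
in the issue above.  The append() call is the assumed new API; the lock-file
convention is purely illustrative of the application arbitrating access itself
rather than HDFS doing so.

    import java.io.IOException;

    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    /**
     * Single-writer append sketch: take an advisory lock (a create-if-absent
     * marker file) before appending, so HDFS never has to arbitrate between
     * concurrent appenders.  fs.append() is an assumed API.
     */
    public class SingleWriterAppendSketch {

      public static void appendRecord(FileSystem fs, Path file, byte[] record)
          throws IOException {
        Path lock = new Path(file.getParent(), file.getName() + ".lock");
        FSDataOutputStream lockStream;
        try {
          // create(path, overwrite=false) fails if another writer holds the lock
          lockStream = fs.create(lock, false);
        } catch (IOException alreadyLocked) {
          throw new IOException("another writer is appending to " + file);
        }
        try {
          FSDataOutputStream out = fs.append(file);   // assumed append API
          try {
            out.write(record);
          } finally {
            out.close();
          }
        } finally {
          lockStream.close();
          fs.delete(lock, false);
        }
      }
    }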

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
