[ https://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530219 ]

dhruba borthakur commented on HADOOP-1700:
------------------------------------------

{noformat}

Here is a slightly more detailed description of a proposal to support appending 
writes to files.
There is a DataGenerationStamp associated with every block. It is persisted by 
the namenode and by the datanode(s).
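
For illustration only, the per-block record that both sides would persist
might look like the following Java sketch; the class and field names here are
hypothetical, not existing Hadoop code.

// Hypothetical sketch of the per-block record; not actual Hadoop code.
class BlockRecord {
    final long blockId;        // assigned once by the namenode
    long dataGenerationStamp;  // starts at 0, incremented on pipeline failure

    BlockRecord(long blockId, long dataGenerationStamp) {
        this.blockId = blockId;
        this.dataGenerationStamp = dataGenerationStamp;
    }
}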

*The Writer*
1. The Client requests a new block from the Namenode. The Namenode generates a 
    new blockid and associates a DataGenerationStamp of 0 with this block. It 
    persists the blockid in the inode and the DataGenerationStamp in the 
    BlocksMap. The Namenode returns the blockid, the DataGenerationStamp, and 
    the block locations to the Client.
2. The Client sends the blockid and DataGenerationStamp to all the datanodes in 
    the pipeline. The Datanodes record the blockid and the DataGenerationStamp 
    persistently and return. In case of error, go to Step 3a.
3. The Client then starts streaming data to the Datanodes in the pipeline.
    If the Client notices that any datanode in the pipeline has encountered an 
    error, it recovers as follows (a code sketch of this recovery path appears 
    after step 5):
    3a. The Client removes the bad datanode from the pipeline.
    3b. The Client requests a new DataGenerationStamp for this block from the 
NameNode. The Client also informs the Namenode 
          of the bad Datanode.
    3c. The Namenode removes the bad datanode as a valid block location for 
this block. The Namenode increments the current 
          DataGenerationStamp by one, persists it, and returns it to the Client.
    3d. The Client sends the new DataGenerationStamp to all remaining datanodes 
in the pipeline.
    3e. The Datanodes receive the new DataGenerationStamp and persist it.
    3f. The Client can now continue, go back to Step 3 above.
4. The Datanode sends block confirmations to the namenode when the full block 
is received. The block confirmation has the 
    blockid and DataGenerationStamp in it.
5. The Namenode receives a block confirmation from a Datanode. If the 
    DataGenerationStamp does not match the value stored in the BlocksMap, the 
    Namenode refuses to consider that Datanode a valid replica location and 
    sends a block delete command to that Datanode.
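
To make the control flow concrete, here is a hedged Java sketch of the
recovery path (steps 3a-3f). All interface and method names (NamenodeSketch,
DatanodeSketch, nextGenerationStamp, persistStamp) are invented for
illustration; they are not existing Hadoop RPC interfaces.

import java.io.IOException;
import java.util.List;

// Hypothetical interfaces; the real namenode/datanode protocols differ.
interface NamenodeSketch {
    // Steps 3b/3c: report the bad datanode; the namenode removes it as a
    // block location, increments and persists the stamp, and returns it.
    long nextGenerationStamp(long blockId, String badDatanode) throws IOException;
}

interface DatanodeSketch {
    String getName();
    // Steps 2 and 3e: the datanode records the blockid and stamp persistently.
    void persistStamp(long blockId, long stamp) throws IOException;
}

class PipelineRecoverySketch {
    // Steps 3a-3f: drop the bad node, obtain a fresh stamp, push it to the
    // surviving datanodes, then let the caller resume streaming at step 3.
    static long recover(List<DatanodeSketch> pipeline, DatanodeSketch bad,
                        long blockId, NamenodeSketch namenode) throws IOException {
        pipeline.remove(bad);                                              // 3a
        long stamp = namenode.nextGenerationStamp(blockId, bad.getName()); // 3b, 3c
        for (DatanodeSketch dn : pipeline) {
            dn.persistStamp(blockId, stamp);                               // 3d, 3e
        }
        return stamp;                                                      // 3f
    }
}

The property this relies on is that a stamp is never reused: every pipeline
failure yields a strictly larger stamp, so a replica that missed the update
can always be detected by a simple comparison (step 5).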

*Reader (concurrent reading while file is being appended to)*
1.  A reader that opens a file gets the list of blocks from the Namenode. Each 
     block includes its locations and its DataGenerationStamp.
2.  A client sends the DataGenerationStamp along with every read request to a 
     datanode. The datanode refuses to serve the data if the 
     DataGenerationStamp does not match the value in its persistent store; in 
     that case, the client fails over to another datanode (a sketch of this 
     check follows).
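
A minimal sketch of the datanode-side check in step 2, with the same caveat
that the names are hypothetical; the in-memory map stands in for the
datanode's persistent blockid-to-stamp store.

import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class DatanodeReadCheckSketch {
    // Stand-in for the datanode's persistent blockid -> stamp store.
    private final Map<Long, Long> persistedStamps = new ConcurrentHashMap<>();

    // Step 2: refuse to serve data when the reader's stamp does not match;
    // the client is expected to catch this and fail over to another replica.
    byte[] readBlock(long blockId, long readerStamp, long offset, int length)
            throws IOException {
        Long local = persistedStamps.get(blockId);
        if (local == null || local != readerStamp) {
            throw new IOException("generation stamp mismatch for block " + blockId);
        }
        return readFromLocalStorage(blockId, offset, length);
    }

    private byte[] readFromLocalStorage(long blockId, long offset, int length) {
        return new byte[length];  // placeholder for the actual on-disk read
    }
}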

This algorithm came out of a discussion with Sameer. This solution does not 
solve the problem of duplicate blockids that can result when datanodes that 
were down for a long time reappear.

{noformat}

> Append to files in HDFS
> -----------------------
>
>                 Key: HADOOP-1700
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1700
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: stack
>
> Request for being able to append to files in HDFS has been raised a couple 
> of times on the list of late. For one example, see 
> http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193.
>   Other mail describes folks' workarounds because this feature is lacking: 
> e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480 
> (Later on this thread, Jim Kellerman re-raises HBase's need for this 
> feature).  HADOOP-337 'DFS files should be appendable' makes mention of file 
> append, but it was opened early in the life of HDFS when the focus was more 
> on implementing the basics rather than adding new features.  Interest 
> fizzled.  Because HADOOP-337 is also a bit of a grab-bag -- it includes 
> truncation and being able to concurrently read/write -- rather than try to 
> breathe new life into HADOOP-337, here is a new issue focused on file 
> append.  Ultimately, being able to do as the Google GFS paper describes -- 
> having multiple concurrent clients making 'Atomic Record Append' to a single 
> file -- would be sweet, but at least for a first cut at this feature, IMO, a 
> single client appending to a single HDFS file, letting the application 
> manage the access, would be sufficient.
