[ https://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546839 ]

Ruyue Ma commented on HADOOP-1700:
----------------------------------

Hi everyone,

We are making our best effort to improve the Hadoop Distributed
Filesystem (HDFS). Our current plan is to implement single-client
append and truncate on top of Hadoop version 0.15.0.

Our implementation will follow the design of HADOOP-1700. In the
latest version of HADOOP-1700 we found some points (or bugs) that
should be corrected or improved. Here are the points:

0. We recommend that the BlockID not be a random 64-bit ID. Instead
   we keep a global variable NextBlockID: a 64-bit counter,
   initialized to 0 and recorded in the transaction log. When a new
   BlockID is needed, the current NextBlockID becomes the new
   BlockID, NextBlockID is incremented by 1, and the new NextBlockID
   is recorded in the transaction log (see the sketch below).
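
   A minimal sketch of such an allocator (NextBlockIdAllocator and
   EditLog are our illustrative names, not Hadoop 0.15.0 APIs):

    /** Hypothetical transaction-log hook; not a real Hadoop 0.15.0 API. */
    interface EditLog {
        void logNextBlockId(long nextBlockId);
    }

    /** Sequential 64-bit block ID allocator, as proposed in point 0. */
    public class NextBlockIdAllocator {
        private long nextBlockId = 0;  // initialized to 0, restored from the log on restart
        private final EditLog editLog;

        public NextBlockIdAllocator(EditLog editLog) {
            this.editLog = editLog;
        }

        /** Hands out the current NextBlockID and persists the incremented counter. */
        public synchronized long allocateBlockId() {
            long id = nextBlockId;
            nextBlockId += 1;
            editLog.logNextBlockId(nextBlockId);  // record the new NextBlockID in the transaction log
            return id;
        }
    }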

1. In section _The Writer_ of HADOOP-1700, the original text is:

____________________________________________________________________________________________________________________________
   ..........
 The Writer requests the Namenode to create a new file or open an
existing file with an intention of appending to it.
The Namenode generates a new blockId and a new GenerationStamp for
this block. Let's call the GenerationStamp that
is associated with a block as BlockGenerationStamp. A new
BlockGenerationStamp is generated by incrementing the
global GenerationStamp by one and storing the global GenerationStamp
back into the transaction log. It records
the blockId, block locations and the BlockGenerationStamp in the
BlocksMap. The Namenode returns the blockId,
BlockGenerationStamp and block locations to the Client.
   .........
_______________________________________________________________________________________________________________________________
 our comment:
 The Writer requests the Namenode to create a new file or open an
existing file with an intention of appending to it. The Namenode
generates a new blockId and a new GenerationStamp for this block.
Let's call the GenerationStamp that is associated with a block the
BlockGenerationStamp. A new BlockGenerationStamp is generated by
incrementing the global GenerationStamp by one and storing the global
GenerationStamp back into the transaction log. ____If the block is a
new block, the Namenode returns the blockID, BlockGenerationStamp and
block locations to the client, but it does not record the blockID,
block locations and BlockGenerationStamp in the BlocksMap. If the
block is an existing block (the last block is not full), we generate
a new BlockGenerationStamp and return the BlockID, block locations,
old BlockGenerationStamp and new BlockGenerationStamp to the client;
the Namenode does not record the new BlockGenerationStamp yet.____

Only when the client has successfully updated all the datanodes where
the replicas of the block are located does it tell the Namenode so,
along with the information (BlockID, block locations, new
BlockGenerationStamp). The Namenode then records the blockId, block
locations and the new BlockGenerationStamp in the BlocksMap, and
creates the OpenFile transaction log entry.
____________________________________________________________________________________________________________________________

*  Note: our method can tolerate the following failure. If the
   Namenode recorded the new BlockGenerationStamp in the BlocksMap
   right away, then when the client fails to update the datanodes, or
   the client (writer) crashes, the Namenode would start lease
   recovery. At that moment the replica version may be smaller than
   the version recorded in the Namenode, so the Namenode would
   discard the replicas. That is not the expected result. Our method
   tolerates this failure; a sketch of the deferred update follows.
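
A minimal Namenode-side sketch of this two-step protocol
(DeferredBlockRegistry, grantBlock and confirmBlock are hypothetical
names for illustration, not Hadoop APIs):

    import java.util.HashMap;
    import java.util.Map;

    /** Sketch: defer the BlocksMap update until the client confirms
        that every replica has been updated. */
    public class DeferredBlockRegistry {

        /** What the Namenode returns to the writer before recording
            anything in the BlocksMap. */
        public static class BlockGrant {
            public final long blockId;
            public final long oldGenerationStamp;  // -1 for a brand-new block
            public final long newGenerationStamp;
            public final String[] locations;

            BlockGrant(long blockId, long oldGs, long newGs, String[] locations) {
                this.blockId = blockId;
                this.oldGenerationStamp = oldGs;
                this.newGenerationStamp = newGs;
                this.locations = locations;
            }
        }

        // blockId -> BlockGenerationStamp currently recorded in the BlocksMap
        private final Map<Long, Long> blocksMap = new HashMap<Long, Long>();
        private long generationStamp = 0;  // global stamp, also written to the transaction log

        /** Step 1: grant a new or reopened block, recording nothing yet. */
        public synchronized BlockGrant grantBlock(Long existingBlockId,
                                                  long newBlockId,
                                                  String[] locations) {
            long newGs = ++generationStamp;  // logged to the transaction log
            if (existingBlockId == null) {
                return new BlockGrant(newBlockId, -1, newGs, locations);
            }
            long oldGs = blocksMap.get(existingBlockId);  // an existing block is already in the map
            return new BlockGrant(existingBlockId, oldGs, newGs, locations);
        }

        /** Step 2: commit only after the client reports that all
            replicas carry the new stamp. */
        public synchronized void confirmBlock(BlockGrant grant) {
            blocksMap.put(grant.blockId, grant.newGenerationStamp);
            // the Namenode would also write the OpenFile transaction log entry here
        }
    }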

2. In HADOOP-1700, the OpenFile transaction log records the whole
   block list. If a client invokes the flush operation frequently,
   the overhead on the Namenode may be very heavy. So we adopt the
   following method: we record only the block that is changing, and
   do not record the blocks that are not being modified. If a new
   block is created, we record its blockID and BlockGenerationStamp.
   If the BlockGenerationStamp of a block is modified, we record only
   the new BlockGenerationStamp for that block. By this method we
   reduce the Namenode's overhead when it creates the OpenFile
   transaction log (see the sketch below).
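
   For illustration, a sketch of what such incremental records could
   look like (BlockEditRecord and its subclasses are hypothetical
   names, not Hadoop types):

    /** Incremental edit-log records: unmodified blocks produce no
        record, so a flush costs O(changed blocks) instead of
        O(all blocks in the file). */
    public abstract class BlockEditRecord {

        /** Logged when a new block is created. */
        public static final class NewBlock extends BlockEditRecord {
            final long blockId;
            final long blockGenerationStamp;
            NewBlock(long blockId, long blockGenerationStamp) {
                this.blockId = blockId;
                this.blockGenerationStamp = blockGenerationStamp;
            }
        }

        /** Logged when only the BlockGenerationStamp of an existing
            block changes. */
        public static final class StampUpdate extends BlockEditRecord {
            final long blockId;
            final long newBlockGenerationStamp;
            StampUpdate(long blockId, long newBlockGenerationStamp) {
                this.blockId = blockId;
                this.newBlockGenerationStamp = newBlockGenerationStamp;
            }
        }
    }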


We look forward to getting help from the Hadoop community. Any
advice will be appreciated.


> Append to files in HDFS
> -----------------------
>
>                 Key: HADOOP-1700
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1700
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: stack
>         Attachments: Appends.doc, Appends.doc, Appends.html
>
>
> Request for being able to append to files in HDFS has been raised a couple of 
> times on the list of late.   For one example, see 
> http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193.
>   Other mail describes folks' workarounds because this feature is lacking: 
> e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480 
> (Later on this thread, Jim Kellerman re-raises the HBase need of this 
> feature).  HADOOP-337 'DFS files should be appendable' makes mention of file 
> append but it was opened early in the life of HDFS when the focus was more on 
> implementing the basics rather than adding new features.  Interest fizzled.  
> Because HADOOP-337 is also a bit of a grab-bag -- it includes truncation and 
> being able to concurrently read/write -- rather than try and breathe new life 
> into HADOOP-337, instead, here is a new issue focused on file append.  
> Ultimately, being able to do as the google GFS paper describes -- having 
> multiple concurrent clients making 'Atomic Record Append' to a single file 
> would be sweet but at least for a first cut at this feature, IMO, a single 
> client appending to a single HDFS file letting the application manage the 
> access would be sufficient.
