[
https://issues.apache.org/jira/browse/HADOOP-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533895
]
dhruba borthakur commented on HADOOP-1707:
------------------------------------------
If the primary datanode fails, the client can still replay the last-flushed data
buffer to the remaining datanodes. The client has to specify the offset in the
block at which the buffer's contents are to be written. Given this
offset-in-block, the datanode can determine whether to perform the write or
whether it has already been done. The prerequisite is that the client holds on
to a buffer until the write is complete on all known good datanodes.
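A minimal sketch of that decision on the datanode side, assuming a hypothetical
helper (shouldApplyWrite is not existing DataNode code) that compares the
client-supplied offset-in-block against the bytes already persisted for the block:
{code:java}
// Hypothetical illustration, not actual DataNode code: decide whether a
// replayed buffer still needs to be written at the given offset-in-block.
class BlockReplayCheck {
  /**
   * @param offsetInBlock offset at which the client wants to (re)write the buffer
   * @param bufferLength  length of the replayed buffer
   * @param bytesOnDisk   bytes of this block already persisted on this datanode
   * @return true if the write should be applied, false if it was already done
   */
  static boolean shouldApplyWrite(long offsetInBlock, int bufferLength, long bytesOnDisk) {
    if (offsetInBlock + bufferLength <= bytesOnDisk) {
      return false;   // buffer already fully persisted: acknowledge without rewriting
    }
    if (offsetInBlock > bytesOnDisk) {
      // the client is asking to write past the current end of the block;
      // something other than a simple replay happened
      throw new IllegalStateException("gap in block: offset " + offsetInBlock
          + " is past current end " + bytesOnDisk);
    }
    return true;      // apply (or re-apply) the write starting at offsetInBlock
  }
}
{code}
In other words, a replayed buffer whose end falls at or before the bytes already
on disk is a duplicate and can be acknowledged without rewriting anything.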
Another option would be to return an error to the application if the primary
datanode fails. Do you think that is acceptable?
I think HADOOP-1927 says that if a non-primary datanode dies, the client should
detect it and possibly take appropriate action. Currently, the client has no way
of knowing whether any secondary datanodes have died.
> DFS client can allow user to write data to the next block while uploading
> previous block to HDFS
> ------------------------------------------------------------------------------------------------
>
> Key: HADOOP-1707
> URL: https://issues.apache.org/jira/browse/HADOOP-1707
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
>
> The DFS client currently uses a staging file on local disk to cache all
> user writes to a file. When the staging file accumulates one block's worth of
> data, its contents are flushed to an HDFS datanode. These operations occur
> sequentially.
> A simple optimization that lets the user write to a second staging file while
> the contents of the first are uploaded to HDFS would improve file-upload
> performance.
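To illustrate the proposed overlap, here is a minimal double-buffering sketch
(hypothetical, not the actual DFSClient; uploadBlock stands in for the existing
flush-to-datanode path): the caller keeps filling a fresh staging buffer while a
background thread uploads the previous one.
{code:java}
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of overlapping user writes with block upload.
class DoubleBufferedWriter {
  private static final int BLOCK_SIZE = 64 * 1024 * 1024;  // one HDFS block (default 64 MB)

  // capacity 1: at most one full block waits for upload while the caller fills the next
  private final BlockingQueue<byte[]> uploadQueue = new ArrayBlockingQueue<byte[]>(1);
  private final Thread uploader;
  private byte[] staging = new byte[BLOCK_SIZE];
  private int filled = 0;

  DoubleBufferedWriter() {
    uploader = new Thread(new Runnable() {
      public void run() {
        try {
          while (true) {
            byte[] block = uploadQueue.take();   // wait for a filled buffer
            if (block.length == 0) break;        // empty buffer = end-of-stream marker
            uploadBlock(block);                  // existing flush-to-datanode path
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    });
    uploader.start();
  }

  // Called by the user; blocks only when the uploader is a full block behind.
  void write(byte[] data, int off, int len) throws InterruptedException {
    while (len > 0) {
      int n = Math.min(len, BLOCK_SIZE - filled);
      System.arraycopy(data, off, staging, filled, n);
      filled += n; off += n; len -= n;
      if (filled == BLOCK_SIZE) {
        uploadQueue.put(staging);            // hand the full block to the uploader
        staging = new byte[BLOCK_SIZE];      // keep writing into a fresh buffer
        filled = 0;
      }
    }
  }

  void close() throws InterruptedException {
    if (filled > 0) uploadQueue.put(Arrays.copyOf(staging, filled));  // last partial block
    uploadQueue.put(new byte[0]);            // signal the uploader to exit
    uploader.join();
  }

  private void uploadBlock(byte[] block) { /* placeholder for the real upload */ }
}
{code}
The queue capacity of one keeps the caller at most one full block ahead of the
upload, so memory use stays bounded at two staging buffers.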
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.