[ https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841754#action_12841754 ]
dhruba borthakur commented on HDFS-826:
---------------------------------------

Hi Hairong, thanks for the review. I would really like to continue with the approach of making getNumCurrentReplicas() return the default replication factor when there isn't a pipeline. One way to think about it is that the app can visualize a file as one huge, ever-expanding block: as long as there are no pipeline errors, the app always sees the default replication factor from getNumCurrentReplicas(). I checked this with a few HBase folks and it appears simpler from an application programmer's perspective. I hope this is ok with you, and thanks for reviewing it.

> Allow a mechanism for an application to detect that datanode(s) have died in
> the write pipeline
> ------------------------------------------------------------------------------------------------
>
>          Key: HDFS-826
>          URL: https://issues.apache.org/jira/browse/HDFS-826
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>   Components: hdfs client
>     Reporter: dhruba borthakur
>     Assignee: dhruba borthakur
>  Attachments: Replicable4.txt, ReplicableHdfs.txt, ReplicableHdfs2.txt, ReplicableHdfs3.txt
>
> HDFS does not replicate the last block of a file that is currently being
> written by an application. Every datanode death in the write pipeline
> decreases the reliability of that last block. This situation can be improved
> if the application is notified of a datanode death in the write pipeline;
> the application can then decide the right course of action to take on this
> event.
> In our use-case, the application can close the file on the first datanode
> death and start writing to a newly created file. This keeps the effective
> replication of every block close to 3 at all times.
> One idea is to make DFSOutputStream.write() throw an exception if the number
> of datanodes in the write pipeline falls below the minimum.replication.factor
> that is set on the client (this is backward compatible).
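To illustrate the application-side pattern discussed here, below is a minimal sketch of an app that polls the replica count after writes and rolls to a new file when the pipeline degrades. This is not Hadoop code: `OutputStreamStub`, `writeRecord`, and `killOneDatanode` are hypothetical stand-ins for `DFSOutputStream` and its `getNumCurrentReplicas()` behavior described in this issue, so the roll-on-degradation logic can run self-contained.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineMonitorSketch {

    /** Hypothetical stand-in for DFSOutputStream; the real class lives in Hadoop. */
    static class OutputStreamStub {
        private int currentReplicas;
        OutputStreamStub(int replicas) { currentReplicas = replicas; }
        int getNumCurrentReplicas() { return currentReplicas; }
        void killOneDatanode() { currentReplicas--; } // simulate a pipeline death
    }

    static final int DEFAULT_REPLICATION = 3;

    /** Write two records, with one simulated datanode death in between. */
    public static String simulate() {
        List<String> events = new ArrayList<>();
        OutputStreamStub out = new OutputStreamStub(DEFAULT_REPLICATION);

        // Healthy pipeline: getNumCurrentReplicas() equals the default factor.
        out = writeRecord(out, "r1", events);

        // A datanode dies in the pipeline; the app now sees fewer replicas.
        out.killOneDatanode();
        out = writeRecord(out, "r2", events);

        return String.join("|", events);
    }

    /** Roll to a new file (fresh pipeline) whenever the replica count drops. */
    static OutputStreamStub writeRecord(OutputStreamStub out, String record,
                                        List<String> events) {
        if (out.getNumCurrentReplicas() < DEFAULT_REPLICATION) {
            events.add("rolled file before: " + record);
            out = new OutputStreamStub(DEFAULT_REPLICATION); // reopen => fresh pipeline
        }
        events.add("wrote: " + record);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

The same check-and-roll loop is what an HBase-style log writer would do around its real `DFSOutputStream`: compare the current replica count to the configured default after each sync, and close and recreate the file when it drops.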