[ https://issues.apache.org/jira/browse/HADOOP-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur updated HADOOP-6450:
-------------------------------------

    Status: Patch Available  (was: Open)

Can somebody please review this patch?

> Enhance FSDataOutputStream to allow retrieving the current number of replicas of current block
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6450
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6450
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: Replicable.txt, Replicable.txt
>
>
> The current HDFS implementation has the limitation that it does not replicate
> the last partial block of a file that is being written to until the file
> is closed. Some long-running applications (e.g. HBase) write transaction
> logs into HDFS. If datanodes in the write pipeline die, the application
> has no knowledge of it until all of the datanodes fail and the application
> gets an IO error.
> These applications would benefit a lot if they could determine the number of
> live replicas of the block they are currently writing to. For example,
> an application can decide that when one of the datanodes in the write
> pipeline fails, it will close the file and start writing to a new file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
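
The policy described in the issue (close the file and start a new one when the write pipeline loses a datanode) could look roughly like the Java sketch below. This is an illustration only, not the attached patch: `ReplicaAware`, `getNumCurrentReplicas()`, `FakeLogStream`, and `RollingLogWriter` are hypothetical names, and the stream is an in-memory stand-in rather than a real HDFS output stream.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface: a stream that reports live replicas of its current block. */
interface ReplicaAware {
    int getNumCurrentReplicas();
}

/** In-memory stand-in for an HDFS output stream (assumption; not a Hadoop class). */
class FakeLogStream implements ReplicaAware {
    private int liveReplicas;
    private boolean closed = false;
    final List<String> records = new ArrayList<>();

    FakeLogStream(int liveReplicas) {
        this.liveReplicas = liveReplicas;
    }

    void write(String record) {
        if (closed) {
            throw new IllegalStateException("stream is closed");
        }
        records.add(record);
    }

    void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    /** Simulates one datanode dropping out of the write pipeline. */
    void simulateDatanodeFailure() {
        if (liveReplicas > 0) {
            liveReplicas--;
        }
    }

    @Override
    public int getNumCurrentReplicas() {
        return liveReplicas;
    }
}

/** Rolls to a new log file when the current block has too few live replicas. */
class RollingLogWriter {
    private final int minReplicas;
    private final int replicasForNewFile;
    private FakeLogStream current;
    private int filesOpened = 1;

    RollingLogWriter(int minReplicas, int replicasForNewFile) {
        this.minReplicas = minReplicas;
        this.replicasForNewFile = replicasForNewFile;
        this.current = new FakeLogStream(replicasForNewFile);
    }

    void append(String record) {
        // Check the replica count before each write; roll instead of waiting for an IO error.
        if (current.getNumCurrentReplicas() < minReplicas) {
            current.close();
            current = new FakeLogStream(replicasForNewFile);
            filesOpened++;
        }
        current.write(record);
    }

    FakeLogStream currentStream() {
        return current;
    }

    int filesOpened() {
        return filesOpened;
    }
}
```

With `minReplicas` set to 3, a single simulated datanode failure makes the next `append` close the current stream and open a new one. Checking before every write is the simplest choice for a sketch; a real application might check less often to limit overhead.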