[ https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinayakumar B updated HDFS-11674:
---------------------------------
    Attachment: HDFS-11674-01.patch

Attached the patch. Please review.

> reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11674
>                 URL: https://issues.apache.org/jira/browse/HDFS-11674
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>            Priority: Critical
>         Attachments: HDFS-11674-01.patch
>
>
> Scenario:
> 1. 3-node cluster with "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
>    A block is written with x data.
> 2. One of the DataNodes, NOT the first DN, goes down.
> 3. Client tries to append data to the block and fails since one DN is down.
> 4. Client calls recoverLease() on the file.
> 5. Recovery completes successfully.
> Issue:
> 1. DNs that the client had connected to before hitting the downed mirror will have reservedSpaceForReplicas incremented, BUT never decremented.
> 2. So in the long run, all of the DN's space ends up counted in reservedSpaceForReplicas, resulting in OutOfSpace errors.
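The accounting problem described in the scenario can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions: it is not the attached patch and does not use any Hadoop classes. The names `Volume`, `Replica`, `openForAppend`, `recover`, and `reservedAfterFailedAppend` are hypothetical stand-ins for the DataNode's reservation bookkeeping.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

public class ReservedSpaceSketch {

    /** Hypothetical stand-in for a DataNode volume's reservation counter. */
    static final class Volume {
        private final AtomicLong reservedForReplicas = new AtomicLong();

        void reserve(long bytes) {
            reservedForReplicas.addAndGet(bytes);
        }

        void release(long bytes) {
            // Clamp at zero so a double release cannot drive the counter negative.
            reservedForReplicas.updateAndGet(v -> Math.max(0L, v - bytes));
        }

        long reserved() {
            return reservedForReplicas.get();
        }
    }

    /** Hypothetical replica that remembers how much space it reserved. */
    static final class Replica {
        private final Volume volume;
        private long bytesReserved;

        Replica(Volume volume) {
            this.volume = volume;
        }

        /** Reserves space first, then fails if the downstream mirror is down. */
        void openForAppend(long bytes, boolean mirrorDown) throws IOException {
            volume.reserve(bytes);
            bytesReserved += bytes;
            if (mirrorDown) {
                throw new IOException("mirror DataNode is down");
            }
        }

        /** Models replica recovery; the leak corresponds to releaseOnRecovery == false. */
        void recover(boolean releaseOnRecovery) {
            if (releaseOnRecovery) {
                volume.release(bytesReserved);
            }
            bytesReserved = 0;
        }
    }

    /** Runs one failed append followed by recovery and returns what is still reserved. */
    static long reservedAfterFailedAppend(long bytes, boolean releaseOnRecovery) {
        Volume volume = new Volume();
        Replica replica = new Replica(volume);
        try {
            replica.openForAppend(bytes, true);
        } catch (IOException expected) {
            // In the reported scenario the client would now call recoverLease().
        }
        replica.recover(releaseOnRecovery);
        return volume.reserved();
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;
        System.out.println("leaky: " + reservedAfterFailedAppend(blockSize, false));
        System.out.println("fixed: " + reservedAfterFailedAppend(blockSize, true));
    }
}
```

In this model the leaky path leaves the full reservation on the volume after recovery, while the path that releases during recovery returns the counter to zero. Repeating the leaky path is what would eventually exhaust the volume's usable space.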