[ https://issues.apache.org/jira/browse/HDFS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980939#action_12980939 ]
Suresh Srinivas commented on HDFS-1547:
---------------------------------------

Thinking a bit more about the problem, I think there could be issues in some cases:

Consider a cluster with N nodes, L live and D decommissioned, with transceiver load on each datanode {X1, X2, ... XN}. A datanode is not good for write when Xi > 2 * X /(L+D)

That means when D > L, a lot of the nodes will not be eligible for writes. The remaining nodes that are good will have to take the write load, which will push X higher. Read traffic, which is not subject to the above condition, will also push X higher. In the worst-case scenario, if the load on every node is equal to X and write load dominates reads, then very few or no nodes are good for writes!

Some observations:
# This problem becomes severe as D approaches and exceeds N/2.
# Decommissioning such a large number of datanodes has several issues:
#* It reduces the cluster's available free storage for writes. Writes could simply fail because there is no free storage. The decommissioning may not complete because of the lack of free storage.
#* Further, when this happens, the number of nodes available for writes is significantly reduced (as writes are not done to D nodes).
#* Note that this problem also exists while decommissioning is in progress for a large number of nodes.

Given this, I am leaning towards not handling this case.

> Improve decommission mechanism
> ------------------------------
>
>                 Key: HDFS-1547
>                 URL: https://issues.apache.org/jira/browse/HDFS-1547
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1547.1.patch, HDFS-1547.patch
>
>
> The current decommission mechanism, driven by the exclude file, has several issues.
> This bug proposes some changes in the mechanism for better manageability. See
> the proposal in the next comment for more details.

--
This message is automatically generated by JIRA.
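
The threshold argument in the comment above can be checked with a few lines of arithmetic. The following is only a rough sketch, not the NameNode's actual code. It assumes that X is the total transceiver load across the cluster, that decommissioned nodes carry no load themselves but are still counted in the L+D denominator, and that a node is eligible when Xi <= 2 * X / (L+D). The function name `write_eligible` is made up for illustration.

```python
def write_eligible(live_loads, num_decommissioned):
    """Return indices of live nodes that pass the Xi <= 2 * X / (L + D) check.

    live_loads: transceiver load Xi of each live node.
    num_decommissioned: D. Assumed to add nothing to the total load X,
    while still being counted in the denominator.
    """
    total_load = sum(live_loads)                       # X (assumed: cluster-wide total)
    node_count = len(live_loads) + num_decommissioned  # L + D
    threshold = 2 * total_load / node_count
    return [i for i, load in enumerate(live_loads) if load <= threshold]


equal = [10, 10, 10, 10]              # L = 4, every live node equally loaded
print(write_eligible(equal, 0))       # threshold 20 -> [0, 1, 2, 3]
print(write_eligible(equal, 4))       # D == L: threshold 10 -> [0, 1, 2, 3]
print(write_eligible(equal, 6))       # D > L: threshold 8 -> []

skewed = [4, 8, 12, 16]
print(write_eligible(skewed, 6))      # threshold 8 -> [0, 1]
```

Under these assumptions, once D > L the threshold 2 * X / (L+D) drops below the live-node average X / L, so evenly loaded live nodes all fail the check. That is the "very few or no nodes are good for writes" case described above.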