[ https://issues.apache.org/jira/browse/HDFS-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413034#comment-13413034 ]
Harsh J commented on HDFS-3645:
-------------------------------

Hi Suresh,

Thank you, your question got me thinking some more. I filed this JIRA as a thought dump while going through the current policy implementation. Sorry for the lack of clarification. Let me explain the case I imagine may exist with this specific check:

# node.getXceiverCount() is a total 'socket' count. It includes writes _and_ reads.
# Consider a cluster situation such as this when computing the average (it may sound a little hypothetical, but a near-enough case is possible in some situations): 100 DNs are present. The average is about 250, but a very few nodes have much higher xceiver counts, at about 600-800. A likely cause of such a state is that these nodes are serving a very hot, local-block region (a bad HBase case, but quite plausible).
# Now consider that one of these DNs is a candidate for a block allocation. We compute the xceiver average and find it to be 250, then check the node's count, which is 700. 700 > 250 leads to it not getting selected, because we ignore the fact that most of the "700" were actually reads and not writes. Perhaps it would have been OK to do a write in this case, if we knew the ratio of reads:writes in addition to count(reads+writes) on the DN?

I've not seen any major issues with this way of write selection at all, but it does seem to expose a certain edge case. Do you think we should account for such a scenario, or leave it as-is and keep the load count aggregated? If not, let us close this out.
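To make the edge case concrete, here is a minimal Java sketch of the aggregated check quoted in the issue description, next to a write-only variant. It is not the actual placement policy code: the class name BusyNodeSketch, both method names, the separate write-xceiver count, and the average write load are all hypothetical, and the sketch assumes the DN could report reads and writes separately. The write figures (50 and 100) are made up for illustration; only the 700, the 250 and the 2.0 factor come from the discussion.

```java
/** Hypothetical sketch; not the actual HDFS block placement policy code. */
final class BusyNodeSketch {
    // Factor taken from the check quoted in the issue description.
    static final double LOAD_FACTOR = 2.0;

    /** Mirrors the quoted check: total xceivers (reads + writes) vs. cluster average. */
    static boolean isBusyByTotal(int xceiverCount, double avgLoad) {
        return xceiverCount > LOAD_FACTOR * avgLoad;
    }

    /** Assumed variant: compare only write xceivers against an average write load. */
    static boolean isBusyByWrites(int writeXceivers, double avgWriteLoad) {
        return writeXceivers > LOAD_FACTOR * avgWriteLoad;
    }

    public static void main(String[] args) {
        double avgLoad = 250.0;       // cluster-wide average from the scenario
        int hotNodeTotal = 700;       // hot DN, mostly serving reads
        int hotNodeWrites = 50;       // assumed write share, for illustration only
        double avgWriteLoad = 100.0;  // assumed average write load, for illustration only

        System.out.println("busy by total count: " + isBusyByTotal(hotNodeTotal, avgLoad));
        System.out.println("busy by write count: " + isBusyByWrites(hotNodeWrites, avgWriteLoad));
    }
}
```

With these numbers the aggregated check rejects the hot DN while the write-only check would accept it, which is the edge case described above. Whether the second check is actually a better measure is exactly the open question here.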
> Improve the way we do detection of a busy DN in the cluster, when choosing it for a block write
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3645
>                 URL: https://issues.apache.org/jira/browse/HDFS-3645
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Priority: Minor
>
> Right now, I think we do too naive a computation for detecting if a chosen DN target is busy by itself. We currently do {{node.getXceiverCount() > (2.0 * avgLoad)}}.
> We should improve on this computation with a more realistic measure of if a DN is really busy by itself or not (rather than checking against cluster average, where there's a good chance the value can be wrong to compare with, for some cases)