[ https://issues.apache.org/jira/browse/HDFS-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778828#action_12778828 ]
Steve Loughran commented on HDFS-775:
-------------------------------------

Here's the code

{code}
long getCapacity() throws IOException {
  if (reserved > usage.getCapacity()) { //FIRST CALL
    return 0;
  }
  return usage.getCapacity()-reserved; //SECOND CALL
}
{code}

It looks like the method intends to return capacity as a number >=0, but if the second invocation triggers a shell exec, the capacity could decrease and the return value could then be negative, which could have implications elsewhere.

Looking at the usages, FSVolumeSet can get confused by this, as it adds the capacities of all volumes together, with no checks for values below zero.

{code}
long getCapacity() throws IOException {
  long capacity = 0L;
  for (int idx = 0; idx < volumes.length; idx++) {
    capacity += volumes[idx].getCapacity();
  }
  return capacity;
}
{code}

A negative capacity from one volume would make the entire datanode capacity appear smaller than it is.

> FSDataset calls getCapacity() twice -bug?
> -----------------------------------------
>
>                 Key: HDFS-775
>                 URL: https://issues.apache.org/jira/browse/HDFS-775
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.22.0
>            Reporter: Steve Loughran
>
> I'm not sure this is a bug or "as intended", but I thought I'd mention it.
> FSDataset.getCapacity() calls DF.getCapacity() twice, when evaluating its
> capacity. Although there is caching to stop the shell being exec'd twice in a
> row, there is a risk that the first call doesn't run the shell, and the
> second does -so the value changes during the method.
> If that is not intended, it is better to cache the first value for the whole
> method
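
One way the fix suggested in the issue description might look is sketched below: read the underlying capacity once per call, so the comparison and the subtraction see the same value. This is only an illustration under stated assumptions. VolumeSketch and the LongSupplier standing in for usage.getCapacity() are hypothetical names, not the actual FSDataset or DF code.

```java
import java.util.function.LongSupplier;

// Hypothetical stand-in for a datanode volume; not the real FSDataset class.
class VolumeSketch {
    // Assumed stand-in for usage.getCapacity(), whose value may change
    // between calls when the cached df result is refreshed.
    private final LongSupplier rawCapacity;
    private final long reserved;

    VolumeSketch(LongSupplier rawCapacity, long reserved) {
        this.rawCapacity = rawCapacity;
        this.reserved = reserved;
    }

    long getCapacity() {
        // Single read: the check and the subtraction use the same snapshot,
        // so the result cannot go negative if the source shrinks mid-method.
        long total = rawCapacity.getAsLong();
        if (reserved > total) {
            return 0;
        }
        return total - reserved;
    }
}
```

With a source that reports 100 on one read and 40 on the next, and reserved set to 50, the two-call version could return 40 - 50 = -10, whereas this version works from a single reading per call and never returns a value below zero.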