available = usage.getAvailable() - reserved

It is incorrect to subtract reserved from usage.getAvailable() above, since the 
reserved space, which is the space set aside for non-HDFS use, may already be 
occupied by non-HDFS files rather than being empty space.
In the pre-HDFS-5215 calculation, the non-DFS used is effectively "unplanned 
non-DFS used", while the "planned non-DFS used" is the reserved space.
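To illustrate the double counting described above, here is a small sketch with made-up numbers. The class and method names are hypothetical, and adjustedAvailable() is only one possible way to avoid the double counting, not something proposed in this thread.

```java
final class AvailableSpaceExample {
    /** The calculation quoted above: subtract the full reservation from the free space. */
    static long quotedAvailable(long free, long reserved) {
        return free - reserved;
    }

    /**
     * Hypothetical alternative (an assumption, not from the thread): hold back only
     * the part of the reservation that non-DFS files have not already occupied.
     */
    static long adjustedAvailable(long free, long reserved, long nonDfsUsed) {
        long unfilledReservation = Math.max(0L, reserved - nonDfsUsed);
        return Math.max(0L, free - unfilledReservation);
    }

    public static void main(String[] args) {
        // Illustrative values in GB: 100 capacity, 50 DFS used, 8 non-DFS used, so 42 free.
        long free = 42, reserved = 10, nonDfsUsed = 8;
        System.out.println(quotedAvailable(free, reserved));               // 32
        System.out.println(adjustedAvailable(free, reserved, nonDfsUsed)); // 40
    }
}
```

With these numbers the quoted formula withholds all 10 GB, although 8 GB of the reservation is already filled by non-DFS files.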
Tsz-Wo

 

    On Wednesday, April 13, 2016 2:35 PM, Brahma Reddy Battula 
<brahmareddy.batt...@huawei.com> wrote:
 
 

 Gentle reminder!


--Brahma Reddy Battula

From: Brahma Reddy Battula
Sent: 28 March 2016 12:26
To: hdfs-dev@hadoop.apache.org
Cc: 'aagar...@hortonworks.com'; 'cnaur...@hortonworks.com'; 
'vinayakum...@apache.org'
Subject: [HDFS-9038] Non-Dfs used Calculation

Hi All,

Chris Nauroth, Arpit, Vinay and I have been discussing this calculation.

There is a disagreement on the definition of non-DFS used space, because of 
which the issue is not making progress.
Essentially, it's a question of whether this metric means "Raw Non-DFS Used" or 
"Unplanned Non-DFS Used".


Here is the summary of the conversation, by Arpit.

The pre HDFS-5215 calculation had two bugs.

 1. It incorrectly subtracted reserved space from the non-DFS used (net 
negative). Chris suggests this is not really an issue, as non-DFS used should be 
shown as zero unless it exceeds the DFS reserved value.

 2. It used File#getUsableSpace to calculate the volume free space instead of 
File#getFreeSpace (net positive).

The net effect was that non-DFS used was displayed as zero unless the actual 
non-DFS used exceeded DFS reserved - system reserved.

HDFS-5215 fixed the first issue and the value that is now erroneously counted 
towards non-DFS used is in fact the system reserved 5%.

From testing, it was found that "Ext derivatives hold back 5% free space 
while XFS does not."
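The gap between the two java.io.File calls mentioned above can be observed directly. The following is a minimal sketch; the class name and the heldBackBytes() helper are hypothetical, and the result depends on the file system and on the user running the JVM.

```java
import java.io.File;

final class VolumeSpaceProbe {
    /**
     * Bytes that are free on the volume but not usable by this JVM,
     * for example blocks a file system holds back for privileged users.
     */
    static long heldBackBytes(File dir) {
        return Math.max(0L, dir.getFreeSpace() - dir.getUsableSpace());
    }

    public static void main(String[] args) {
        // Pass a data directory as the first argument; defaults to the current directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println("total     = " + dir.getTotalSpace());
        System.out.println("free      = " + dir.getFreeSpace());
        System.out.println("usable    = " + dir.getUsableSpace());
        System.out.println("held back = " + heldBackBytes(dir));
    }
}
```

On a volume that holds back free space for privileged users, a non-privileged process would be expected to see a non-zero "held back" value; otherwise the value would typically be zero.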


Proposed calculation to report the exact Non-DFS Usage:

  non-DFS used = getCapacity() + reserved - getDfsUsed() - totalFreeSpace
               = usage.getCapacity() - reserved + reserved - getDfsUsed() - totalFreeSpace
               = usage.getCapacity() - getDfsUsed() - totalFreeSpace
               = File#getTotalSpace - getDfsUsed() - File#getFreeSpace
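The last line of the derivation above can be sketched in Java as follows. The class and method names are hypothetical, dfsUsed is assumed to be supplied by the caller, and clamping the result at zero is an added assumption, not part of the proposal.

```java
import java.io.File;

final class RawNonDfsUsed {
    /** non-DFS used = total space - DFS used - free space, clamped at zero (assumption). */
    static long compute(long totalSpace, long dfsUsed, long freeSpace) {
        return Math.max(0L, totalSpace - dfsUsed - freeSpace);
    }

    /** The same calculation, reading total and free space from the volume itself. */
    static long compute(File volume, long dfsUsed) {
        return compute(volume.getTotalSpace(), dfsUsed, volume.getFreeSpace());
    }

    public static void main(String[] args) {
        // Illustrative values in GB: 100 total, 50 DFS used, 42 free.
        System.out.println(compute(100, 50, 42)); // 8
    }
}
```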

Chris Nauroth thinks we should subtract "dfs.datanode.du.reserved" from non-DFS 
used, because that makes it possible to monitor for unexpected non-zero non-DFS 
usage and react.
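For comparison, the "Unplanned Non-DFS Used" interpretation could be sketched as below. The names are hypothetical, reserved stands for the configured dfs.datanode.du.reserved value, and the clamp at zero follows the earlier point that the value should be shown as zero unless it exceeds the reservation.

```java
final class UnplannedNonDfsUsed {
    /** Reports only the non-DFS usage that exceeds the configured reservation. */
    static long compute(long rawNonDfsUsed, long reserved) {
        return Math.max(0L, rawNonDfsUsed - reserved);
    }

    public static void main(String[] args) {
        // Illustrative values in GB with a 10 GB reservation.
        System.out.println(compute(8, 10));  // 0: usage is within the reservation
        System.out.println(compute(15, 10)); // 5: usage exceeds the reservation by 5
    }
}
```

Under this definition, any non-zero value means non-DFS files have grown past the planned reservation, which is what makes it usable as an alert signal.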

Akira has also given "+0" on the above calculation.

We would like your input so that we can make progress on this issue.

Please let me know your thoughts on this issue.

Thanks
--Brahma Reddy Battula


 
  
