BELUGA BEHR created HDFS-9560:
---------------------------------

    Summary: Fair AvailableSpaceVolumeChoosingPolicy
    Key: HDFS-9560
    URL: https://issues.apache.org/jira/browse/HDFS-9560
    Project: Hadoop HDFS
    Issue Type: Improvement
    Reporter: BELUGA BEHR
    Priority: Minor

I took a look at AvailableSpaceVolumeChoosingPolicy. It seems like overkill, and it depends on configuration items whose values seem arbitrary, with no clear guidance on how to use them effectively:

_dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction_
_dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold_

I have created an alternative implementation that requires no external configuration, is thread-safe, and requires no synchronization. The approach is "Weighted Randomized Ordering":

http://stackoverflow.com/questions/23971365/weighted-randomized-ordering

Conceptually, a dartboard is divided into several wedges, one per disk volume. The more available space a volume has relative to the other volumes, the larger its wedge. A dart is then thrown at the board, and the incoming block is assigned to the volume whose wedge the dart lands on. Over time, the wedges balance out and all volumes have an equal chance of being "hit."
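
To make the dartboard idea concrete, here is a minimal sketch of a weighted random pick over volumes. It is not the patch attached to this issue: the class name WeightedVolumeChooser, the methods wedgeFor and chooseVolume, and the use of a plain long[] of available bytes (instead of the DataNode's volume objects) are all assumptions made for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

/**
 * Hypothetical sketch of the "dartboard" selection described above.
 * Each volume's wedge is as wide as its available space in bytes.
 * Names and signatures are illustrative, not taken from Hadoop.
 */
public final class WeightedVolumeChooser {

    private WeightedVolumeChooser() {}

    /**
     * Maps a dart position in [0, sum of availableBytes) to the index of
     * the wedge (volume) that contains it, using a running cumulative sum.
     */
    public static int wedgeFor(long[] availableBytes, long dart) {
        long cumulative = 0;
        for (int i = 0; i < availableBytes.length; i++) {
            cumulative += availableBytes[i];
            // A volume with 0 bytes adds no width, so it can never match.
            if (dart < cumulative) {
                return i;
            }
        }
        throw new IllegalArgumentException("dart is outside the board: " + dart);
    }

    /**
     * Throws one dart and returns the index of the chosen volume.
     * No shared mutable state is touched, and ThreadLocalRandom is
     * per-thread, so callers need no locking around this method.
     */
    public static int chooseVolume(long[] availableBytes) {
        long total = 0;
        for (long bytes : availableBytes) {
            if (bytes < 0) {
                throw new IllegalArgumentException("negative available space");
            }
            total = Math.addExact(total, bytes); // fail loudly on overflow
        }
        if (total == 0) {
            throw new IllegalStateException("no volume has available space");
        }
        long dart = ThreadLocalRandom.current().nextLong(total);
        return wedgeFor(availableBytes, dart);
    }
}
```

With available space of 100, 300, 0 and 600 bytes, the wedges cover dart positions 0-99, 100-399, none, and 400-999, so the last volume should receive roughly 60% of incoming blocks until its free space shrinks. Because the weights are recomputed from current free space on every call, fuller volumes are picked less often, which is the balancing effect described above.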