Max Lapan created HDFS-9014:
-------------------------------

             Summary: Block placement policy with respect to DN free space
                 Key: HDFS-9014
                 URL: https://issues.apache.org/jira/browse/HDFS-9014
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: namenode
            Reporter: Max Lapan

The default block allocation policy (also known as the 'replication policy') implemented in the NN selects randomly from suitable candidates (rack-local or 'other rack'). This is fine when all DNs in a cluster have nearly equal amounts of storage, but it leads to problems when some DNs are significantly larger than others. In that situation, when the NN places new blocks at random, the extra space becomes almost unusable and, in the extreme case, all the 'small' DNs can reach 100% usage while the 'large' ones remain almost empty, which can cause various HDFS and MR problems. Having datanodes of different sizes is quite realistic in large, long-lived systems, where different generations of machines are put into a single cluster.

To overcome this, I implemented a different block allocation policy that places blocks with respect to the free space available on a DN. Please consider it for inclusion in the HDFS codebase.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
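
To illustrate the idea of placing blocks with respect to free space, here is a minimal standalone sketch. It is not the code attached to this issue and does not use any Hadoop API. The class FreeSpaceWeightedChooser, the Node type and the choose method are hypothetical names introduced only for this example. It also assumes that "with respect to free space" means picking a candidate with probability proportional to its free bytes, which the description above does not state.

```java
import java.util.List;
import java.util.Random;

// Hypothetical illustration only: not a Hadoop class and not the proposed patch.
class FreeSpaceWeightedChooser {

    /** Hypothetical stand-in for a datanode and its remaining capacity. */
    public static final class Node {
        public final String name;
        public final long freeBytes;

        public Node(String name, long freeBytes) {
            this.name = name;
            this.freeBytes = freeBytes;
        }
    }

    private final Random random;

    public FreeSpaceWeightedChooser(Random random) {
        this.random = random;
    }

    /**
     * Picks one candidate with probability proportional to its free space
     * (an assumed weighting). Returns null if no candidate has free space.
     */
    public Node choose(List<Node> candidates) {
        long totalFree = 0;
        for (Node n : candidates) {
            totalFree += Math.max(0L, n.freeBytes);
        }
        if (totalFree <= 0) {
            return null;
        }

        // Pick a point in [0, totalFree) and walk the candidates until the
        // running free-space total passes it.
        long point = (long) (random.nextDouble() * totalFree);
        Node lastWithSpace = null;
        for (Node n : candidates) {
            long free = Math.max(0L, n.freeBytes);
            if (free == 0) {
                continue; // full nodes are never selected
            }
            lastWithSpace = n;
            if (point < free) {
                return n;
            }
            point -= free;
        }
        // Only reached if floating-point rounding pushed the point to the end.
        return lastWithSpace;
    }
}
```

With two candidates that have 100 and 900 free bytes, this sketch would pick the second one roughly nine times out of ten, so a larger and emptier DN fills faster than a small one instead of at the same rate. A real policy would still have to apply the rack-local and 'other rack' constraints before weighting the remaining candidates.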