[ 
https://issues.apache.org/jira/browse/HDFS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753073#comment-13753073
 ] 

Eli Collins commented on HDFS-4113:
-----------------------------------

There are some issues with the balancer workaround:
- Users have to manage a separate process (e.g. run it via cron and keep it HA)
- You write more data than you need to (costing disk & net bandwidth, not space)
- There may not be much cluster "downtime" in which to run the balancer

Hosts often have a decent multiple more disk bandwidth than network bandwidth, 
and for many workloads write bandwidth does not need to scale as much as read 
bandwidth. Given that, is it unreasonable to allow a pluggable policy (via 
HDFS-385) that fills the cluster evenly at the cost of lower write bandwidth? 
It seems reasonable to me. 
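
To make the "fill the cluster evenly" idea concrete, here is a minimal sketch 
of one way such a policy could choose targets: pick datanodes at random, 
weighted by remaining free space. This is not the HDFS-385 interface or any 
existing HDFS code; Node, choose_targets and their parameters are hypothetical 
names used only for illustration, and rack awareness is ignored.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a datanode (not an HDFS class)."""
    name: str
    capacity: int  # total bytes
    used: int = 0  # bytes already written

    @property
    def remaining(self) -> int:
        return self.capacity - self.used


def choose_targets(nodes, replicas, block_size, rng=random):
    """Pick up to `replicas` distinct nodes, weighted by remaining space.

    Nodes that cannot hold one more block are never candidates, so a
    node with twice the free space is chosen about twice as often.
    """
    candidates = [n for n in nodes if n.remaining >= block_size]
    chosen = []
    while candidates and len(chosen) < replicas:
        weights = [n.remaining for n in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)  # keep replicas on distinct nodes
    return chosen
```

The trade-off is the one described above: larger nodes absorb a larger share 
of the writes, so aggregate write bandwidth is lower than with a uniform 
choice, but nodes fill at roughly the same percentage rate.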

Either way, the jira summary says that puts failed after nodes filled up. Was 
this because there was free space but no available location that satisfied the 
policy constraints? If so, that seems like a bug we should fix: it is better to 
write in violation of the policy than for the write to fail.
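
The fallback behaviour suggested above could look roughly like the sketch 
below: prefer locations that satisfy the policy, but relax the policy before 
failing the write. The function place_block, the dict-based node records and 
the satisfies_policy callback are all assumptions made for illustration, not 
HDFS APIs.

```python
def place_block(nodes, block_size, satisfies_policy):
    """Return a node for one block, preferring policy-compliant nodes.

    `nodes` is a list of dicts with at least a "free" key (bytes free).
    `satisfies_policy` is a hypothetical predicate for the placement
    constraints (e.g. rack rules). The write fails only when no node
    at all has room for the block.
    """
    with_space = [n for n in nodes if n["free"] >= block_size]
    if not with_space:
        raise OSError("no datanode has enough free space")
    compliant = [n for n in with_space if satisfies_policy(n)]
    # Fall back to any node with space rather than failing the write.
    pool = compliant or with_space
    return max(pool, key=lambda n: n["free"])
```

With this ordering, a policy violation happens only when the alternative 
would be a failed put.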


                
> When adding datanodes with less disk capacity to an existing cluster, the new 
> DNs fill up faster and subsequently cause errors during put operations
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-4113
>                 URL: https://issues.apache.org/jira/browse/HDFS-4113
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode
>            Reporter: Stephen Fritz
>            Priority: Minor
>
> The request is that the allocation strategy be modified so that it allocates 
> equally on a 'free space percentage' basis between datanodes, i.e. disks that 
> are twice as big should have twice as much data written to them per unit time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
