Throttle replication speed in case of datanode failure

2013-01-17 Thread Brennon Church
Hello, Is there a way to throttle the speed at which under-replicated blocks are copied across a cluster? Either limiting the bandwidth or the number of blocks per time period would work. I'm currently running Hadoop v1.0.1. I think the dfs.namenode.replication.work.multiplier.per.iteration opt

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Jean-Daniel Cryans
Since this is a Hadoop question, it should be sent to user@hadoop.apache.org (which I'm now sending this to, and I put user@hbase in BCC). J-D On Thu, Jan 17, 2013 at 9:54 AM, Brennon Church wrote: > Hello, > > Is there a way to throttle the speed at which under-replicated blocks are > copied across

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Harsh J
You can limit the bandwidth in bytes/second values applied via dfs.balance.bandwidthPerSec in each DN's hdfs-site.xml. Default is 1 MB/s (1048576). Also, unsure if your version already has it, but it can be applied at runtime too via the dfsadmin -setBalancerBandwidth command. On Thu, Jan 17, 20

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Brennon Church
That doesn't seem to work for under-replicated blocks such as when decommissioning (or losing) a node, just for the balancer. I've got mine currently set to 10MB/s, but am seeing rates of 3-4 times that after decommissioning a node while it works on bringing things back up to the proper replic

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Harsh J
Not true per the sources, it controls all DN->DN copy/move rates, although the property name is misleading. Are you noticing a consistent rise in the rate or is it spiky? On Fri, Jan 18, 2013 at 2:20 AM, Brennon Church wrote: > That doesn't seem to work for under-replicated blocks such as when

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Brennon Church
Pretty spiky. I'll throttle it back to 1MB/s and see if it reduces things as expected. Thanks! --Brennon On 1/17/13 1:41 PM, Harsh J wrote: Not true per the sources, it controls all DN->DN copy/move rates, although the property name is misleading. Are you noticing a consistent rise in the r

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Harsh J
One reason (for spikes) may be that the throttler actually runs periodically (instead of controlling the rate at source, we detect and block work if we exceed limits, at regular intervals). However, this period is pretty short so it generally does not cause any ill effects on the cluster. On Fri,
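
A rough sketch of the period-based throttling described in the last message, to show why short-term rates can look spiky even though the average stays near the limit. This is not Hadoop's actual code: the class name `PeriodThrottler`, the 0.5 second period, and the injectable `clock`/`sleep` parameters are all assumptions made for illustration.

```python
import time


class PeriodThrottler:
    """Illustrative period-based bandwidth throttler (hypothetical, not Hadoop's code)."""

    def __init__(self, bytes_per_sec, period=0.5, clock=time.monotonic, sleep=time.sleep):
        # The period length is an assumed value chosen for the example.
        self.period = period
        self.bytes_per_period = bytes_per_sec * period
        self.clock = clock
        self.sleep = sleep
        self.period_start = clock()
        self.budget = self.bytes_per_period

    def throttle(self, num_bytes):
        """Account for bytes that were already sent; block if the budget is overdrawn."""
        self.budget -= num_bytes
        while self.budget < 0:
            now = self.clock()
            period_end = self.period_start + self.period
            if now < period_end:
                # Over the limit for this period: wait until the period ends.
                self.sleep(period_end - now)
            # Start a new period and add its allowance to the budget.
            self.period_start = max(period_end, now)
            self.budget += self.bytes_per_period


if __name__ == "__main__":
    # 1048576 bytes/s is the default limit quoted earlier in the thread.
    throttler = PeriodThrottler(bytes_per_sec=1048576)
    chunk = 64 * 1024
    start = time.monotonic()
    for _ in range(32):  # 2 MiB in total
        throttler.throttle(chunk)  # a real sender would call this after each write
    print(f"accounted for 2 MiB in {time.monotonic() - start:.2f} s")
```

Because the bytes are counted after they have been sent, a sender can run at full speed until the budget for the current period is used up and only then gets blocked. Measured over intervals shorter than the period, the rate can therefore exceed the configured limit, while the longer-term average stays close to it.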