Hi Andrew,

Thank you for the quick response. I changed the bandwidth using the
"hadoop dfsadmin -setBalancerBandwidth" command and it works like a charm! The
data transfer time is now inversely proportional to the bandwidth I set, as
expected.
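
In case it helps anyone else, this is roughly what I ran (the bandwidth value
below is just an illustrative example in bytes per second, not the exact figure
from my cluster):

  # Raise the balancer bandwidth used by the datanodes at runtime
  # (value is in bytes/sec; 104857600 is about 100 MB/s)
  hadoop dfsadmin -setBalancerBandwidth 104857600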

Thanks again!

Best,
Karthiek


On Wed, Dec 18, 2013 at 6:23 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:

> Hi Karthiek,
>
> I haven't checked 1.0.4, but in 2.2.0 and onwards, there's a setting you
> can tweak:
>
> dfs.datanode.balance.bandwidthPerSec
>
> By default, it's set to just 1 MB/s, which is pretty slow. Again, at least in
> 2.2.0, there's also `hdfs dfsadmin -setBalancerBandwidth`, which can be used
> to adjust this config property at runtime.
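>
> For example, something like this (the 100 MB/s figure is just an
> illustration; pick a value that suits your network):
>
>   # Print the configured value of the property (bytes/sec);
>   # the default 1048576 corresponds to 1 MB/s
>   hdfs getconf -confKey dfs.datanode.balance.bandwidthPerSec
>
>   # Raise it cluster-wide at runtime, e.g. to roughly 100 MB/s
>   hdfs dfsadmin -setBalancerBandwidth 104857600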
>
> Best,
> Andrew
>
>
> On Wed, Dec 18, 2013 at 2:40 PM, Karthiek C <karthi...@gmail.com> wrote:
>
> > Hi all,
> >
> > I am working on a research project where we are looking at algorithms to
> > "optimally" distribute data blocks across HDFS nodes. The definition of
> > what is optimal is omitted for brevity.
> >
> > I want to move specific blocks of a file that is *already* in HDFS. I am
> > able to achieve this using the data transfer protocol (I took cues from
> > the "Balancer" module), but the operation turns out to be very
> > time-consuming. In my cluster setup, moving one block of data
> > (approximately 60 MB) from data-node-1 to data-node-2 takes nearly 60
> > seconds. A "dfs -put" operation that copies the same file from
> > data-node-1's local file system to data-node-2 takes just 1.4 seconds.
> >
> > Any suggestions on how to speed up the movement of specific blocks?
> > Bringing down the running time is very important for us because this
> > operation may happen while executing a job.
> >
> > I am using Hadoop version 1.0.4.
> >
> > Thanks in advance!
> >
> > Best,
> > Karthiek
> >
>
