Re: How do I configure a Partitioner in the new API?

2011-05-05 Thread Harsh J
Hello,

On Wed, May 4, 2011 at 9:25 PM, W.P. McNeill bill...@gmail.com wrote:
 I only wanted the context as a way of getting at the configuration, so
 making the class implement Configurable will solve my problem.

Good to know. I believe this ought to be documented. I've opened
https://issues.apache.org/jira/browse/MAPREDUCE-2474 for the same.

-- 
Harsh J


Use different process method to process data by input file name?

2011-05-05 Thread 王志强
Hi, guys
  As the topic says, how can I use different processing methods for the data 
in my map function, depending on the input file name? That is, can I get the 
name of the input file that the line currently being processed belongs to?
  Austin


Re: Use different process method to process data by input file name?

2011-05-05 Thread Harsh J
Moving this to mapreduce-user@ since that is more appropriate for
hadoop-mapreduce questions (bcc: common-user@).

2011/5/5 王志强 wangzhiqi...@360.cn:
 Hi, guys
  As the topic shows, how can I use different process methods to process 
 data according to input file name in map function? Ie, May I get the input 
 file name that current process line belong to?
  Austin


You can get the filename from the additional per-task configuration
properties that are set for the MapTasks: 
http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Task+JVM+Reuse
[Look at the table right below the link anchor for all the properties, and
at map.input.file in particular.]
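
For illustration, here is a minimal sketch of an old-API mapper that picks
that property up in configure(). The class name and the file-name check are
made up for the example:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class FileAwareMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private String inputFile;

  @Override
  public void configure(JobConf job) {
    // Set by the framework for each task; may be null for input formats
    // that do not read from files.
    inputFile = job.get("map.input.file");
  }

  @Override
  public void map(LongWritable key, Text value,
      OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    if (inputFile != null && inputFile.contains("typeA")) {
      // handle records from "typeA" files one way ...
    } else {
      // ... and everything else another way
    }
  }
}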

-- 
Harsh J


Re: Use different process method to process data by input file name?

2011-05-05 Thread Itzhak Pan
Hi,

I think MultipleInputs may fit your need.
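
For example, a rough sketch using the old-API MultipleInputs; the paths and
the LogMapper/CsvMapper classes are placeholders you would supply yourself:

// imports: org.apache.hadoop.fs.Path, org.apache.hadoop.mapred.JobConf,
// org.apache.hadoop.mapred.TextInputFormat,
// org.apache.hadoop.mapred.lib.MultipleInputs
JobConf conf = new JobConf(MyDriver.class);
// Each input path gets its own mapper class; both mappers must emit the
// same output key/value types.
MultipleInputs.addInputPath(conf, new Path("/data/logs"),
    TextInputFormat.class, LogMapper.class);
MultipleInputs.addInputPath(conf, new Path("/data/csv"),
    TextInputFormat.class, CsvMapper.class);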

Yaozhen

On 2011-05-05 17:43, 王志强 wangzhiqi...@360.cn wrote:
 Hi, guys
 As the topic shows, how can I use different process methods to process
data according to input file name in map function? Ie, May I get the input
file name that current process line belong to?
 Austin


How does Hadoop parse input files into (Key, Value) pairs?

2011-05-05 Thread praveenesh kumar
Hi,

As we know, a Hadoop mapper takes its input as (Key, Value) pairs and generates
intermediate (Key, Value) pairs, and usually we give our Mapper a text file as
input.
How does Hadoop understand this and parse our input text file into (Key, Value)
pairs?

Usually our mapper looks like this:

public void map(LongWritable key, Text value,
    OutputCollector<Text, Text> outputCollector, Reporter reporter)
    throws IOException {

  String word = value.toString();

  // Some lines of code

}

So if I pass a text file as input, it takes every line as the VALUE to the
Mapper, on which I do some processing and put the result into the OutputCollector.
But how did Hadoop parse my text file into (Key, Value) pairs, and how can we
tell Hadoop what (key, value) it should give to the mapper?
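
In case it helps, a small old-API sketch of where those pairs come from: the
job's InputFormat (through its RecordReader) decides what the mapper sees, so
changing the input format changes the (key, value) the mapper receives.
MyDriver is a placeholder; the input format classes are the stock Hadoop ones:

// imports: org.apache.hadoop.mapred.JobConf,
// org.apache.hadoop.mapred.TextInputFormat,
// org.apache.hadoop.mapred.KeyValueTextInputFormat
JobConf conf = new JobConf(MyDriver.class);

// TextInputFormat (the default for plain text) uses LineRecordReader, so the
// mapper receives (LongWritable byte offset of the line, Text line contents).
conf.setInputFormat(TextInputFormat.class);

// KeyValueTextInputFormat instead splits each line at the first tab and
// hands the mapper (Text key, Text value).
// conf.setInputFormat(KeyValueTextInputFormat.class);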

Thanks.


Re: Cluster hard drive ratios

2011-05-05 Thread Steve Loughran

On 04/05/11 19:59, Matt Goeke wrote:

Mike,

Thanks for the response. It looks like this discussion forked on the CDH
list so I have two different conversations now. Also, you're dead on
that one of the presentations I was referencing was Ravi's.

With your setup I agree that it would have made no sense to go the 2.5"
drive route, given it would have forced you into the 500-750GB SATA
drives, and all it would allow is more spindles but less capacity at a
higher cost. The servers we have been considering are actually the
R710s, so dual hexacore with 12 spindles of actual capacity is more of a
1:1 ratio of cores to spindles vs the 2:1 I have been reviewing. My
original issue attempted to focus more on at what point you actually see
a plateau in write performance for a given cores:spindles ratio, but since
we are headed that direction anyway it looks like it was more to sate
curiosity than to drive specifications.


Some people are using this as it gives the best storage density. You can 
also go for single hexacore servers, as in a big cluster the savings 
there translate into even more storage. It all depends on the application.



As to your point, I forgot to include the issue of rebalancing in the
original email, but you are absolutely right. That was another major
concern, especially as we would get closer to filling the capacity of a 24TB
box. I think the original plan was bonded GBe, but I think our
infrastructure team has told us 10GBe would be standard.



1. If you want to play with bonded GBe then I have some notes I can send 
you - it's harder than you think.
2. I don't know anyone who is running 10 GBe + Hadoop, though I see 
hints that StumbleUpon are doing this with Arista switches. You'd have 
to have a very chatty app or 10GBe on the mainboard to justify it.
3. I do know of installations with 24TB HDD and GBe; yes, the overhead 
of a node failure is higher, but with fewer nodes, P(failure) may be 
lower. The big fear is loss-of-rack, which can come from ToR switch 
failure or from network config errors. Hadoop isn't partition aware and 
will treat a rack outage as the loss of 40+ servers, try to replicate 
all that data, and that's when you're in trouble (look at the AWS EBS 
outage for an example of a cascading failure).
4. There are JIRA issues for better handling of drive failure, including 
hotswapping and rebalancing data within a single machine.
5. I'd like support for the ability to say a node is going down, don't 
replicate, and the same for a rack, to ease maintenance.


-Steve


distcp performing much better for rebalancing than dedicated balancer

2011-05-05 Thread Ferdy Galema

Hi,

On our 15-node cluster (1Gb ethernet and 4x1TB disks per node) I noticed 
that distcp does a much better job at rebalancing than the dedicated 
balancer does. We needed to decommission 11 nodes, so that prior to 
rebalancing we had 4 used and 11 empty nodes. The 4 used nodes had about 
25% usage each. Most of our files are of average size: we have about 
500K files in 280K blocks and 800K blocks total (blocksize is 64MB).


So I changed dfs.balance.bandwidthPerSec to 800100100 and restarted the 
cluster. I started the balancer tool and noticed that it moved about 
200GB in 1 hour. (I grepped the balancer log for "Need to move".)


After stopping the balancer I started a distcp. This tool copied 900GB 
in just 45 minutes, with an average replication of 2, so its total 
throughput was around 2.4 TB/hour. Fair enough, it is not purely 
rebalancing because the 4 overused nodes also get new blocks, but it 
still performs much better. Munin confirms the much higher disk/ethernet 
throughput of the distcp.


Are these characteristics to be expected? Either way, can the balancer 
be boosted even more? (Aside from the dfs.balance.bandwidthPerSec property.)


Ferdy.


Re: distcp performing much better for rebalancing than dedicated balancer

2011-05-05 Thread Mathias Herberts
Did you explicitly start a balancer, or did you decommission the nodes
using dfs.hosts.exclude and a dfsadmin -refreshNodes?

On Thu, May 5, 2011 at 14:30, Ferdy Galema ferdy.gal...@kalooga.com wrote:
 Hi,

 On our 15node cluster (1GB ethernet and 4x1TB disk per node) I noticed that
 distcp does a much better job at rebalancing than the dedicated balancer
 does. We needed to decommision 11 nodes, so that prior to rebalancing we had
 4 used and 11 empty nodes. The 4 used nodes had about 25% usage each. Most
 of our files are of average size: We have about 500K files in 280K blocks
 and 800K blocks total (blocksize is 64MB).

 So I changed dfs.balance.bandwidthPerSec to 800100100 and restarted the
 cluster. Started the balancer tool and I noticed that the it moved about
 200GB in 1 hour. (I grepped the balancer log for Need to move).

 After stopping the balancer I started a distcp.  This tool copied 900GB in
 just 45 minutes, with an average replication of 2 so it's total throughput
 was around 2.4 TB/hour. Fair enough, it is not purely rebalancing because
 the 4 overused nodes also get new blocks, still it performs much better.
 Munin confirms the much higher disk/ethernet throughputs of the distcp.

 Are these characteristics to be expected? Either way, can the balancer be
 boosted even more? (Aside the dfs.balance.bandwidthPerSec property).

 Ferdy.



including jar files in hadoop job

2011-05-05 Thread Nakul Chakrapani
Hi all,

   Can anyone tell me how to include jar files while running a Hadoop job? The 
version is 0.21.0.
   I used the -libjars option in Hadoop 0.20.2 and it was working, but in 
this version it is not working.
   Please help me in this regard.
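
For reference, -libjars is handled by GenericOptionsParser, so it only takes
effect when the job's driver runs through ToolRunner. A minimal sketch of that
wiring (class and jar names below are made up; this shows the usual setup, not
necessarily a fix for the 0.21.0 problem):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJob extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    // getConf() already contains whatever GenericOptionsParser parsed,
    // including the -libjars entries.
    Job job = new Job(getConf(), "my job");
    job.setJarByClass(MyJob.class);
    // ... set mapper, reducer, input and output paths here ...
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    // Invoked as: hadoop jar myjob.jar MyJob -libjars dep1.jar,dep2.jar <in> <out>
    System.exit(ToolRunner.run(new Configuration(), new MyJob(), args));
  }
}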

Thanks & Regards 
Nakul chakrapani

Can we access NameNode HDFS from slave nodes?

2011-05-05 Thread praveenesh kumar
hey,

Can we access the NameNode's HDFS from our slave machines?

I am just running the command hadoop dfs -ls on my slave machine (which runs a
TaskTracker and a DataNode), and it's giving me the following output:

hadoop@ub12:~$ hadoop dfs -ls
11/05/05 18:31:54 INFO ipc.Client: Retrying connect to server: ub13/
162.192.100.53:54310. Already tried 0 time(s).
11/05/05 18:31:55 INFO ipc.Client: Retrying connect to server: ub13/
162.192.100.53:54310. Already tried 1 time(s).
11/05/05 18:31:56 INFO ipc.Client: Retrying connect to server: ub13/
162.192.100.53:54310. Already tried 2 time(s).
11/05/05 18:31:57 INFO ipc.Client: Retrying connect to server: ub13/
162.192.100.53:54310. Already tried 3 time(s).
11/05/05 18:31:58 INFO ipc.Client: Retrying connect to server: ub13/
162.192.100.53:54310. Already tried 4 time(s).
11/05/05 18:31:59 INFO ipc.Client: Retrying connect to server: ub13/
162.192.100.53:54310. Already tried 5 time(s).
11/05/05 18:32:00 INFO ipc.Client: Retrying connect to server: ub13/
162.192.100.53:54310. Already tried 6 time(s).
11/05/05 18:32:01 INFO ipc.Client: Retrying connect to server: ub13/
162.192.100.53:54310. Already tried 7 time(s).
11/05/05 18:32:02 INFO ipc.Client: Retrying connect to server: ub13/
162.192.100.53:54310. Already tried 8 time(s).
11/05/05 18:32:03 INFO ipc.Client: Retrying connect to server: ub13/
162.192.100.53:54310. Already tried 9 time(s).
Bad connection to FS. command aborted.
I just restarted my master node (and ran start-all.sh).

The output on my master node is

hadoop@ub13:/usr/local/hadoop$ start-all.sh
starting namenode, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-namenode-ub13.out
ub11: starting datanode, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-ub11.out
ub10: starting datanode, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-ub10.out
ub12: starting datanode, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-ub12.out
ub13: starting datanode, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-ub13.out
ub13: starting secondarynamenode, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-ub13.out
starting jobtracker, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-jobtracker-ub13.out
ub10: starting tasktracker, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-ub10.out
ub11: starting tasktracker, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-ub11.out
ub12: starting tasktracker, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-ub12.out
ub13: starting tasktracker, logging to
/usr/local/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-ub13.out
hadoop@ub13:/usr/local/hadoop$ jps
6471 NameNode
7070 Jps
6875 JobTracker
6632 DataNode
7030 TaskTracker
6795 SecondaryNameNode
Thanks,
Praveenesh


Re: distcp performing much better for rebalancing than dedicated balancer

2011-05-05 Thread Ferdy Galema
The decommissioning was performed solely with refreshNodes, but that's 
somewhat irrelevant because the balancing tests were performed after I 
re-added the 11 empty nodes. (FYI, the drives were formatted with another 
unix fs.) Though I did notice that the decommissioning showed about the 
same metrics as the balancer test afterwards; not very fast, 
that is.


On 05/05/2011 02:57 PM, Mathias Herberts wrote:

Did you explicitely start a balancer or did you decommission the nodes
using dfs.hosts.exclude and a dfsadmin -refreshNodes?

On Thu, May 5, 2011 at 14:30, Ferdy Galemaferdy.gal...@kalooga.com  wrote:

Hi,

On our 15node cluster (1GB ethernet and 4x1TB disk per node) I noticed that
distcp does a much better job at rebalancing than the dedicated balancer
does. We needed to decommision 11 nodes, so that prior to rebalancing we had
4 used and 11 empty nodes. The 4 used nodes had about 25% usage each. Most
of our files are of average size: We have about 500K files in 280K blocks
and 800K blocks total (blocksize is 64MB).

So I changed dfs.balance.bandwidthPerSec to 800100100 and restarted the
cluster. Started the balancer tool and I noticed that the it moved about
200GB in 1 hour. (I grepped the balancer log for Need to move).

After stopping the balancer I started a distcp.  This tool copied 900GB in
just 45 minutes, with an average replication of 2 so it's total throughput
was around 2.4 TB/hour. Fair enough, it is not purely rebalancing because
the 4 overused nodes also get new blocks, still it performs much better.
Munin confirms the much higher disk/ethernet throughputs of the distcp.

Are these characteristics to be expected? Either way, can the balancer be
boosted even more? (Aside the dfs.balance.bandwidthPerSec property).

Ferdy.



Re: distcp performing much better for rebalancing than dedicated balancer

2011-05-05 Thread Ferdy Galema
I figured out what caused the slow balancing. Starting the balancer with 
too small a threshold decreases the speed dramatically:


./start-balancer.sh -threshold 0.01
2011-05-05 17:17:04,132 INFO 
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes 
in this iteration
2011-05-05 17:17:36,684 INFO 
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes 
in this iteration
2011-05-05 17:18:09,737 INFO 
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes 
in this iteration
2011-05-05 17:18:41,977 INFO 
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes 
in this iteration


as opposed to:

./start-balancer.sh
2011-05-05 17:19:01,676 INFO 
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 40 GBbytes in 
this iteration
2011-05-05 17:21:36,800 INFO 
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 30 GBbytes in 
this iteration
2011-05-05 17:24:13,191 INFO 
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 30 GBbytes in 
this iteration


I'd expect setting the granularity would not affect speed, just the 
stopping threshold. Perhaps a bug?


On 05/05/2011 03:43 PM, Ferdy Galema wrote:
The decommissioning was performed with solely refreshNodes, but that's 
somewhat irrelevant because the balancing tests were performed after I 
re-added the 11 empty nodes. (FYI the drives were formatted with 
another unix fs). Though I did notice that the decommissioning shows 
about the same metrics as that of the balancer test afterwards,  not 
very fast that is.


On 05/05/2011 02:57 PM, Mathias Herberts wrote:

Did you explicitely start a balancer or did you decommission the nodes
using dfs.hosts.exclude and a dfsadmin -refreshNodes?

On Thu, May 5, 2011 at 14:30, Ferdy Galemaferdy.gal...@kalooga.com  
wrote:

Hi,

On our 15node cluster (1GB ethernet and 4x1TB disk per node) I 
noticed that
distcp does a much better job at rebalancing than the dedicated 
balancer
does. We needed to decommision 11 nodes, so that prior to 
rebalancing we had
4 used and 11 empty nodes. The 4 used nodes had about 25% usage 
each. Most
of our files are of average size: We have about 500K files in 280K 
blocks

and 800K blocks total (blocksize is 64MB).

So I changed dfs.balance.bandwidthPerSec to 800100100 and restarted the
cluster. Started the balancer tool and I noticed that the it moved 
about

200GB in 1 hour. (I grepped the balancer log for Need to move).

After stopping the balancer I started a distcp.  This tool copied 
900GB in
just 45 minutes, with an average replication of 2 so it's total 
throughput
was around 2.4 TB/hour. Fair enough, it is not purely rebalancing 
because
the 4 overused nodes also get new blocks, still it performs much 
better.

Munin confirms the much higher disk/ethernet throughputs of the distcp.

Are these characteristics to be expected? Either way, can the 
balancer be

boosted even more? (Aside the dfs.balance.bandwidthPerSec property).

Ferdy.



Re: distcp performing much better for rebalancing than dedicated balancer

2011-05-05 Thread Eric Fiala
Ferdy - that is interesting.
I would expect lower threshold = more data to move around (or equal to
default 10%)

Try with a whole integer; we regularly run the balancer with -threshold 1 (to
balance to 1%). Maybe the decimal is throwing a wrench at Hadoop.

EF

On 5 May 2011 09:27, Ferdy Galema ferdy.gal...@kalooga.com wrote:

 I figured out what caused the slow balancing. Starting the balancer with a
 too small threshold will decrease the speed dramatically:

 ./start-balancer.sh -threshold 0.01
 2011-05-05 17:17:04,132 INFO
 org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes in
 this iteration
 2011-05-05 17:17:36,684 INFO
 org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes in
 this iteration
 2011-05-05 17:18:09,737 INFO
 org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes in
 this iteration
 2011-05-05 17:18:41,977 INFO
 org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes in
 this iteration

 as opposed to:

 ./start-balancer.sh
 2011-05-05 17:19:01,676 INFO
 org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 40 GBbytes in
 this iteration
 2011-05-05 17:21:36,800 INFO
 org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 30 GBbytes in
 this iteration
 2011-05-05 17:24:13,191 INFO
 org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 30 GBbytes in
 this iteration

 I'd expect setting the granularity would not affect speed, just the
 stopping threshold. Perhaps a bug?


 On 05/05/2011 03:43 PM, Ferdy Galema wrote:

 The decommissioning was performed with solely refreshNodes, but that's
 somewhat irrelevant because the balancing tests were performed after I
 re-added the 11 empty nodes. (FYI the drives were formatted with another
 unix fs). Though I did notice that the decommissioning shows about the same
 metrics as that of the balancer test afterwards,  not very fast that is.

 On 05/05/2011 02:57 PM, Mathias Herberts wrote:

 Did you explicitely start a balancer or did you decommission the nodes
 using dfs.hosts.exclude and a dfsadmin -refreshNodes?

 On Thu, May 5, 2011 at 14:30, Ferdy Galemaferdy.gal...@kalooga.com
  wrote:

 Hi,

 On our 15node cluster (1GB ethernet and 4x1TB disk per node) I noticed
 that
 distcp does a much better job at rebalancing than the dedicated balancer
 does. We needed to decommision 11 nodes, so that prior to rebalancing we
 had
 4 used and 11 empty nodes. The 4 used nodes had about 25% usage each.
 Most
 of our files are of average size: We have about 500K files in 280K
 blocks
 and 800K blocks total (blocksize is 64MB).

 So I changed dfs.balance.bandwidthPerSec to 800100100 and restarted the
 cluster. Started the balancer tool and I noticed that the it moved about
 200GB in 1 hour. (I grepped the balancer log for Need to move).

 After stopping the balancer I started a distcp.  This tool copied 900GB
 in
 just 45 minutes, with an average replication of 2 so it's total
 throughput
 was around 2.4 TB/hour. Fair enough, it is not purely rebalancing
 because
 the 4 overused nodes also get new blocks, still it performs much better.
 Munin confirms the much higher disk/ethernet throughputs of the distcp.

 Are these characteristics to be expected? Either way, can the balancer
 be
 boosted even more? (Aside the dfs.balance.bandwidthPerSec property).

 Ferdy.




Re: How do I configure a Partitioner in the new API?

2011-05-05 Thread W.P. McNeill
The other thing you want to document is that the setConf() function is
called by the Hadoop reflection utilities whenever it creates a Configurable
object. I guess that's the only way it could happen, but I still wasn't sure
how this worked until I stepped through it in the debugger.
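
For the record, here is a minimal sketch of the pattern this thread describes:
a new-API Partitioner that implements Configurable, so the reflection
utilities call setConf() when the framework instantiates it. The property name
example.partition.buckets is made up for the example:

import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class ConfiguredPartitioner extends Partitioner<Text, Text>
    implements Configurable {

  private Configuration conf;
  private int buckets = 1;

  @Override
  public void setConf(Configuration conf) {
    // Called by the reflection utilities right after instantiation,
    // because this class implements Configurable.
    this.conf = conf;
    buckets = conf.getInt("example.partition.buckets", 1);
  }

  @Override
  public Configuration getConf() {
    return conf;
  }

  @Override
  public int getPartition(Text key, Text value, int numPartitions) {
    int limit = Math.max(1, Math.min(buckets, numPartitions));
    return (key.hashCode() & Integer.MAX_VALUE) % limit;
  }
}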

You should also document under what circumstances getConf() would be called.
I'm still not clear what the scenario would be.

On Thu, May 5, 2011 at 2:16 AM, Harsh J ha...@cloudera.com wrote:

 Hello,

 On Wed, May 4, 2011 at 9:25 PM, W.P. McNeill bill...@gmail.com wrote:
  I only wanted the context as a way of getting at the configuration, so
  making the class implement Configurable will solve my problem.

 Good to know. I believe this ought to be documented. I've opened
 https://issues.apache.org/jira/browse/MAPREDUCE-2474 for the same.

 --
 Harsh J



Re: Cluster hard drive ratios

2011-05-05 Thread Matthew Foley
"A node (or rack) is going down, don't replicate" == DataNode Decommissioning.

This feature is available.  The current usage is to add the hosts to be 
decommissioned to the exclusion file named in dfs.hosts.exclude, then use 
DFSAdmin to invoke -refreshNodes.  (Search for decommission in DFSAdmin 
source code.)  NN will stop using these servers as replication targets, and 
will re-replicate all their replicas to other hosts that are still in service.  
The count of nodes that are in the process of being decommissioned is reported 
on the NN status web page.
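
A sketch of that flow on the command line (file locations and the host name
here are illustrative; dfs.hosts.exclude must already point at the exclude
file in the NameNode's configuration):

# add the host to the exclude file named by dfs.hosts.exclude
echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude
# tell the NameNode to re-read the include/exclude lists
hadoop dfsadmin -refreshNodes
# then watch the NN status web page until the node is reported as decommissioned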

--Matt


On May 5, 2011, at 4:48 AM, Steve Loughran wrote:

On 04/05/11 19:59, Matt Goeke wrote:
 Mike,
 
 Thanks for the response. It looks like this discussion forked on the CDH
 list so I have two different conversations now. Also, you're dead on
 that one of the presentations I was referencing was Ravi's.
 
 With your setup I agree that it would have made no sense to go the 2.5
 drive route given it would have forced you into the 500-750GB SATA
 drives and all it would allow is more spindles but less capacity at a
 higher cost. The servers we have been considering are actually the
 R710's so dual hexacore with 12 spindles of actual capacity is more of a
 1:1 in terms of cores to spindles vs the 2:1 I have been reviewing. My
 original issue attempted to focus more around at what point do you
 actually see a plateau in write performance of cores:spindles but since
 we are headed that direction anyway it looks like it was more to sate
 curiosity than driving specifications.

some people are using this as it gives best storage density. You can 
also go for single hexacore servers as in a big cluster the savings 
there translate into even more storage. It all depends on the application.

 As to your point, I forgot to include the issue of rebalancing in the
 original email but you are absolutely right. That was another major
 concern especially as we would get closer to filling capacity of a 24TB
 box. I think the original plan was bonded GBe but I think our
 infrastructure team has told us 10GBe would be standard.


1. If you want to play with bonded GBe then I have some notes I can send 
you -its harder than you think.
2. I don't know anyone who is running 10 GBe + Hadoop, though I see 
hints that StumbleUpon are doing this with Arista switches. You'd have 
to have a very chatty app or 10GBe on the mainboard to justify it.
3. I do know of installations with 24TB HDD and GBe, yes, the overhead 
of a node failure is higher. But with less nodes, P(failure) may be 
lower. The big fear is loss-of-rack, which can come from ToR switch 
failure or from network config errors. Hadoop isn't partition aware and 
will treat a rack outage as the loss of 40+ servers, try to replicate 
all that data, and that's when you're in trouble (look at the AWS EBS 
outage for an example cascade failure).
4. There are JIRA issues for better handling of drive failure, including 
hotswapping and rebalancing data within a single machine.
5. I'd like support for the ability to say a node is going down, don't 
replicate, and the same for a rack, to ease maintenance.

-Steve



Re: distcp performing much better for rebalancing than dedicated balancer

2011-05-05 Thread Ferdy Galema
I actually tried 1% right after I ran the balancer with the default 
threshold. The data moved was the same as with the default. So in short, 
this is what I tried:


default: fast (lots of data moved; 30 to 40GB every iteration)
1%: fast (same as above)
0.01%: slow (it moves only 1.26GB in an iteration, the exact same amount 
for hours on end)


At the moment the cluster is already fully balanced.

On 05/05/2011 05:46 PM, Eric Fiala wrote:

Ferdy - that is interesting.
I would expect lower threshold = more data to move around (or equal to
default 10%)

Try with a whole integer, we regularly run balancer, -threshold 1 (to
balance to 1%), maybe the decimal is throwing a wrench at hadoop.

EF

On 5 May 2011 09:27, Ferdy Galemaferdy.gal...@kalooga.com  wrote:


I figured out what caused the slow balancing. Starting the balancer with a
too small threshold will decrease the speed dramatically:

./start-balancer.sh -threshold 0.01
2011-05-05 17:17:04,132 INFO
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes in
this iteration
2011-05-05 17:17:36,684 INFO
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes in
this iteration
2011-05-05 17:18:09,737 INFO
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes in
this iteration
2011-05-05 17:18:41,977 INFO
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 1.26 GBbytes in
this iteration

as opposed to:

./start-balancer.sh
2011-05-05 17:19:01,676 INFO
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 40 GBbytes in
this iteration
2011-05-05 17:21:36,800 INFO
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 30 GBbytes in
this iteration
2011-05-05 17:24:13,191 INFO
org.apache.hadoop.hdfs.server.balancer.Balancer: Will move 30 GBbytes in
this iteration

I'd expect setting the granularity would not affect speed, just the
stopping threshold. Perhaps a bug?


On 05/05/2011 03:43 PM, Ferdy Galema wrote:


The decommissioning was performed with solely refreshNodes, but that's
somewhat irrelevant because the balancing tests were performed after I
re-added the 11 empty nodes. (FYI the drives were formatted with another
unix fs). Though I did notice that the decommissioning shows about the same
metrics as that of the balancer test afterwards,  not very fast that is.

On 05/05/2011 02:57 PM, Mathias Herberts wrote:


Did you explicitely start a balancer or did you decommission the nodes
using dfs.hosts.exclude and a dfsadmin -refreshNodes?

On Thu, May 5, 2011 at 14:30, Ferdy Galemaferdy.gal...@kalooga.com
  wrote:


Hi,

On our 15node cluster (1GB ethernet and 4x1TB disk per node) I noticed
that
distcp does a much better job at rebalancing than the dedicated balancer
does. We needed to decommision 11 nodes, so that prior to rebalancing we
had
4 used and 11 empty nodes. The 4 used nodes had about 25% usage each.
Most
of our files are of average size: We have about 500K files in 280K
blocks
and 800K blocks total (blocksize is 64MB).

So I changed dfs.balance.bandwidthPerSec to 800100100 and restarted the
cluster. Started the balancer tool and I noticed that the it moved about
200GB in 1 hour. (I grepped the balancer log for Need to move).

After stopping the balancer I started a distcp.  This tool copied 900GB
in
just 45 minutes, with an average replication of 2 so it's total
throughput
was around 2.4 TB/hour. Fair enough, it is not purely rebalancing
because
the 4 overused nodes also get new blocks, still it performs much better.
Munin confirms the much higher disk/ethernet throughputs of the distcp.

Are these characteristics to be expected? Either way, can the balancer
be
boosted even more? (Aside the dfs.balance.bandwidthPerSec property).

Ferdy.




London Hadoop May Meetup: Clusters in the Cloud

2011-05-05 Thread Dan
Hey,

Our May meet-up is next Thursday the 12th at Skills Matter, this time
with talks on using Hadoop clusters in the cloud.

We've got Matt Wood from Amazon Web Services coming to talk about
their Elastic MapReduce service and how you can use it to get
on-demand Hadoop clusters. Lars George from Cloudera will also be
coming over to give us a talk on the Apache Whirr project and how you
can use it to orchestrate Hadoop and other distributed systems on EC2
and other cloud providers.

We'll also have pizza and beer again this time, thanks to Cloudera, one
of our sponsors (sorry about the lack of these last month..)

If you would like to come, you will need to register here:
http://skillsmatter.com/event/cloud-grid/hadoopug-may/js-1766

I'm also starting to organise June's meet-up and still need some talks
for the evening, so please let me know if you are interested in giving
a talk on anything Hadoop-related.

Thanks,
Dan Harvey