Hi Vikas,

The size imbalance does create an imbalance in MapReduce, but with your configuration it may not be a big issue. The balancer places data blocks on nodes based on their available-space percentage, so the bigger datanodes will inevitably end up with more blocks. That means more mappers will get spawned on the bigger node; but if the total map capacity is the same for every node, then once all the smaller datanodes have finished processing their local blocks and the bigger node is busy to its map capacity, those smaller datanodes will have to pull blocks off the big one to run mappers.

I don't know if this will work well, but you can try increasing the max map capacity of the bigger datanode and reducing its max reduce capacity. Let's say your default is 8+8 (maps+reduces) for every node; you could make the big node 12+4, or even try 14+2. Let us know how that works.

-Ayon
See My Photos on Flickr
Also check out my Blog for answers to commonly asked questions.
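For what it's worth, Ayon's 12+4 example corresponds to the per-TaskTracker slot settings in a Hadoop 0.20/1.x-era cluster (current at the time of this thread). A minimal sketch of the change, applied to mapred-site.xml on the big datanode only, might look like this (the slot values are just the example numbers from above; the TaskTracker needs a restart to pick them up):

```xml
<!-- mapred-site.xml on the bigger datanode only: skew its slots toward maps.
     Values 12 and 4 are Ayon's example numbers, not a recommendation. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>12</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
```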
________________________________
From: Vikas Srivastava <vikas.srivast...@one97.net>
To: user@hive.apache.org; Ayon Sinha <ayonsi...@yahoo.com>; sonalgoy...@gmail.com
Sent: Tuesday, September 13, 2011 11:04 PM
Subject: Re: Data migration in Hadoop

Thanks Ayon and Sonal,

One more question: does a size imbalance among the datanodes in the cluster create a problem or have any bad impact?

Going by what you are saying, my cluster would be 10 datanodes of 2 TB HDD each and 1 datanode of 8 TB HDD. Does this have any bad impact? Please suggest. All of this is configured with 16 GB RAM.

Regards
Vikas Srivastava

On Tue, Sep 13, 2011 at 11:20 PM, Ayon Sinha <ayonsi...@yahoo.com> wrote:

What you can do for each node:
>1. Decommission the node (or 2 nodes if you want to do this faster). You can do this with the excludes file.
>2. Wait for the blocks to be moved off the decommissioned node(s).
>3. Replace the disks and put the node back in service.
>4. Repeat until done.
>
>-Ayon
>See My Photos on Flickr
>Also check out my Blog for answers to commonly asked questions.
>
>________________________________
>From: Vikas Srivastava <vikas.srivast...@one97.net>
>To: user@hive.apache.org
>Sent: Tuesday, September 13, 2011 5:27 AM
>Subject: Re: Data migration in Hadoop
>
>Hey Sonal!
>
>Actually, right now we have an 11-node cluster, each node having 8 disks of 300 GB and 8 GB RAM.
>
>Now what we want to do is replace those 300 GB disks with 1 TB disks so that we can have more space per server.
>
>We have replication factor 2.
>
>My suggestion is:
>1. Add a node of 8 TB to the cluster and run the balancer to balance the load.
>2. Free any one node (the replacement node)...
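To see why the big node ends up with more blocks (and hence more locally spawned mappers), here is a small back-of-the-envelope sketch. This is not the actual HDFS balancer code, just the proportional-to-capacity placement idea Ayon describes, applied to the 10 × 2 TB + 1 × 8 TB layout from this thread; the block total is an arbitrary illustrative number:

```python
# Illustrative model only: if every node is kept at the same used
# percentage, blocks end up distributed in proportion to capacity.
# (The real HDFS balancer equalizes utilization; this is just the arithmetic.)

def blocks_per_node(capacities_tb, total_blocks):
    """Distribute total_blocks proportionally to each node's capacity."""
    total_capacity = sum(capacities_tb)
    return [total_blocks * c / total_capacity for c in capacities_tb]

# 10 datanodes with 2 TB each, plus 1 datanode with 8 TB (the planned cluster)
capacities = [2] * 10 + [8]
counts = blocks_per_node(capacities, total_blocks=28000)

small, big = counts[0], counts[-1]
print(small, big)      # 2000.0 8000.0 -- the 8 TB node holds 4x the blocks
print(big / small)     # 4.0
```

With the same 8-slot map capacity everywhere, the big node exhausts its local map slots long before its 4x share of blocks is processed, which is why the smaller nodes end up pulling blocks over the network, and why the reply above suggests giving the big node extra map slots.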
>
>Question: does a size imbalance among the datanodes in the cluster create a problem or have any bad impact?
>
>Regards
>Vikas Srivastava
>
>On Tue, Sep 13, 2011 at 5:37 PM, Sonal Goyal <sonalgoy...@gmail.com> wrote:
>
>Hi Vikas,
>>
>>This was discussed in the groups recently:
>>
>>http://lucene.472066.n3.nabble.com/Fixing-a-bad-HD-tt2863634.html#none
>>
>>Are you looking at replacing all your datanodes, or only a few? How big is your cluster?
>>
>>Best Regards,
>>Sonal
>>Crux: Reporting for HBase
>>Nube Technologies
>>
>>On Tue, Sep 13, 2011 at 1:52 PM, Vikas Srivastava <vikas.srivast...@one97.net> wrote:
>>
>>Hi,
>>>
>>>Can anyone tell me how we can migrate Hadoop data, or replace the old hard disks with new, bigger HDDs?
>>>
>>>Actually, I need to replace old HDDs of 300 GB with 1 TB ones, so how can I do this efficiently?
>>>
>>>The problem is migrating the data from one HDD to another.
>>>
>>>--
>>>With Regards
>>>Vikas Srivastava
>>>
>>>DWH & Analytics Team
>>>Mob: +91 9560885900
>>>One97 | Let's get talking!
>>>
>>
>
>--
>With Regards
>Vikas Srivastava
>
>DWH & Analytics Team
>Mob: +91 9560885900
>One97 | Let's get talking!
>
>

--
With Regards
Vikas Srivastava

DWH & Analytics Team
Mob: +91 9560885900
One97 | Let's get talking!