Re: effect on data after topology change

2012-01-17 Thread Todd Lipcon
Hi Ravi,

You'll probably need to up the replication level of the affected files
and then drop it back down to the desired level. Current versions of
HDFS do not automatically repair rack policy violations if they're
introduced in this manner.
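
A rough sketch of that raise-then-restore procedure, in case it helps. It only builds the two `hadoop fs -setrep` command lines and does not run them. The path `/data/affected`, the helper name `setrep_commands`, and the choice to raise by one replica are illustrative assumptions, not something stated in this thread.

```python
import shlex


def setrep_commands(path, desired, bump=1):
    """Build the two FsShell commands for a raise-then-restore cycle.

    Assumption: temporarily raising the replication factor by `bump`
    makes the NameNode place new replicas under the current topology.
    """
    if desired < 1 or bump < 1:
        raise ValueError("desired and bump must both be at least 1")
    # Step 1: raise the replication factor; -w waits until it is reached.
    raise_cmd = ["hadoop", "fs", "-setrep", "-w", str(desired + bump), path]
    # Step 2: drop back to the desired factor so excess replicas are removed.
    restore_cmd = ["hadoop", "fs", "-setrep", str(desired), path]
    return [raise_cmd, restore_cmd]


if __name__ == "__main__":
    # Hypothetical path; print the commands instead of executing them.
    for cmd in setrep_commands("/data/affected", 3):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run` on a host with a configured Hadoop client. Running `hadoop fsck` on the path before and after is a reasonable way to check whether the rack placement actually changed.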

-Todd

On Mon, Jan 16, 2012 at 3:53 PM, rk vishu talk2had...@gmail.com wrote:
 Hello All,

 If I change the rack ID for some nodes and restart the NameNode, will data
 be rearranged accordingly? Do I need to run the rebalancer?

 Any information on this would be appreciated.

 Thanks and Regards
 Ravi



-- 
Todd Lipcon
Software Engineer, Cloudera


Re: effect on data after topology change

2012-01-17 Thread rk vishu
Thank you very much, Todd. I hope future versions of the Hadoop rebalancer
will include this check.

I have one more question.

If we are adding nodes incrementally in a different rack (say rack-2), and
rack-2 is only 25% the size of rack-1, how would data be balanced (with the
default implementation)?
That is, will Hadoop prefer to balance across all nodes, or will it try to
obey the topology first, which could fill up rack-2 quickly? I am positive
that it will try to balance across all nodes, but I want to be sure.
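
For what it is worth, the arithmetic behind this question can be sketched as follows. The sketch assumes replication factor 3 and the commonly documented default placement (first replica on the writer's rack, second and third replicas together on the other rack), and it ignores the NameNode's checks on free space and node load, so the result is a rough estimate under stated assumptions, not a description of what any given cluster will do. The function name `rack2_replica_share` is made up for illustration.

```python
from fractions import Fraction


def rack2_replica_share(writers_on_rack1):
    """Expected share of all replicas that land on rack-2.

    Assumes two racks, replication factor 3, and that a block written
    from one rack gets 1 replica there and 2 replicas on the other rack.
    `writers_on_rack1` is the fraction of blocks written from rack-1.
    """
    p = Fraction(writers_on_rack1)
    if not 0 <= p <= 1:
        raise ValueError("writers_on_rack1 must be between 0 and 1")
    # Writer on rack-1: 2 of 3 replicas go to rack-2.
    # Writer on rack-2: 1 of 3 replicas stays on rack-2.
    return (p * 2 + (1 - p) * 1) / 3


if __name__ == "__main__":
    # rack-2 is 25% the size of rack-1, so rack-1 holds 4/5 of the nodes.
    # Assume writers are spread in proportion to node count.
    print(rack2_replica_share(Fraction(4, 5)))
```

Under these assumptions rack-2 never receives less than one third of the replicas, whatever its size, which is why the fill-up concern seems worth checking on a real cluster.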

Thanks and Regards
Ravi
On Tue, Jan 17, 2012 at 10:41 AM, Todd Lipcon t...@cloudera.com wrote:

 Hi Ravi,

 You'll probably need to up the replication level of the affected files
 and then drop it back down to the desired level. Current versions of
 HDFS do not automatically repair rack policy violations if they're
 introduced in this manner.

 -Todd

 On Mon, Jan 16, 2012 at 3:53 PM, rk vishu talk2had...@gmail.com wrote:
  Hello All,
 
  If I change the rack ID for some nodes and restart the NameNode, will data
  be rearranged accordingly? Do I need to run the rebalancer?
 
  Any information on this would be appreciated.
 
  Thanks and Regards
  Ravi



 --
 Todd Lipcon
 Software Engineer, Cloudera



effect on data after topology change

2012-01-16 Thread rk vishu
Hello All,

If I change the rack ID for some nodes and restart the NameNode, will data
be rearranged accordingly? Do I need to run the rebalancer?

Any information on this would be appreciated.

Thanks and Regards
Ravi