Can you share the logs to confirm things?

Regards
Ram

-----Original Message-----
From: Matthias Hofschen [mailto:hofsc...@gmail.com] 
Sent: Wednesday, November 16, 2011 4:10 PM
To: hbase-u...@hadoop.apache.org
Subject: balancer stopped working

Hi,
we had a case today where the load balancer stopped working
(cloudera-cdh3-u1, 52 nodes). Basically we had a hot region that we moved to
another node. Shortly thereafter the regionserver of that region was
stopped. In the master logs we see that the master is trying to contact this
regionserver to move the region. At the same time we had about 1000 regions
missing because of the stopped regionserver. These were not assigned to
another regionserver. I suspect that the load balancer on the master
machine was blocked by trying to move the one region, which was not
possible. We then restarted the master, which solved the problem.

Is this a known problem? Should I log an issue? I do have a stack trace from
the master machine.

Cheers Matthias
