20 nodes is good enough to begin with. How much memory do you have on each node? IMO you should keep 1 GB per daemon and 1 GB for the MR job, like Andrew suggested. You don't necessarily have to separate the datanodes and tasktrackers as long as you have enough resources. 10,000 rows isn't big at all from an HBase standpoint. What kind of computation are you doing before dumping data into HBase? And what versions of Hadoop and HBase are you running?
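To make that memory split concrete, here's a rough sketch of where those heap sizes live; the exact values and file locations are my assumptions, so adjust them to whatever your instances can actually spare:

  # conf/hadoop-env.sh -- heap (in MB) for each Hadoop daemon (namenode, datanode, tasktracker)
  export HADOOP_HEAPSIZE=1000

  # conf/hbase-env.sh -- heap (in MB) for each HBase daemon (master, region servers)
  export HBASE_HEAPSIZE=1000

  <!-- conf/hadoop-site.xml: cap each map/reduce child JVM at roughly 1 GB -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>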
There's another thing you should do: increase the DataXceivers limit to 2048 (that's what I use). If you have root privileges on the cluster, also raise the open-file limit to 32k (see the HBase FAQ for details); I've put a sketch of both settings below the quoted message. Try this out and see how it goes.

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

On Tue, Apr 7, 2009 at 2:45 AM, Rakhi Khatwani <[email protected]> wrote:

> Hi,
> I have a 20 node cluster on ec2 (small instance).... i have a set of
> tables which store huge amount of data (tried wid 10,000 rows... more to be
> added).... but during my map reduce jobs, some of the region servers shut
> down thereby causing data loss, stop in my program execution and infact one
> of my tables got damaged. when ever i scan the table, i get the could not
> obtain block error.
>
> 1. i want to make the cluster more robust. since it contains a lot of data.
> and its really important that they remain stable.
> 2. if one of my tables gets damaged (even after restarting dfs n hbase),
> how do i go about recovering it?
>
> my ec2 cluster mostly has the default configuration.
> with hadoop-site n hbase-site have some entries pertaining to map-reduce
> (for example. num of map tasks, mapred.task.timeout etc).
>
> Your help will be greatly appreciated.
> Thanks,
> Raakhi Khatwani
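As mentioned above, here's a rough sketch of those two settings; treat the file paths and the assumption that the daemons run as a "hadoop" user as things to verify against your own setup:

  <!-- conf/hadoop-site.xml on every datanode (yes, the property name really is
       spelled "xcievers"); restart the datanodes after changing it -->
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2048</value>
  </property>

  # /etc/security/limits.conf -- raise the open-file limit for the user
  # that runs the Hadoop/HBase daemons (assumed here to be "hadoop")
  hadoop  -  nofile  32768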
