Hi Amandeep,

Following is my EC2 cluster configuration: High-CPU Medium Instance, 1.7 GB
of memory, 5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units
each), 350 GB of instance storage, 32-bit platform. So I don't think I have
much of an option on the memory front. However, is there any way I can make
use of the 5 EC2 Compute Units to increase performance?

Regarding the table splits, I don't see HBase doing the splits
automatically. After loading about 17,000 rows into Table1, I still see it
as one region (checked on the web UI); that's why I had to split it
manually. Or is there any configuration/setting I have to change to ensure
that tables are split automatically?

I will increase the dataXceivers to 2048 and the ulimit to 32k, as you
suggested. Sketches of what I plan to change are below.
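On the compute units: with 2 virtual cores per node, the knob that should
matter most is how many map tasks each tasktracker runs at once. A minimal
hadoop-site.xml snippet (assuming the stock 0.19 single-file config; the
value 2 is just a starting point matched to the 2 cores):

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
  <description>Concurrent map tasks per tasktracker, roughly one per
  core.</description>
</property>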
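On the splits: as I understand it, HBase only splits a region once a store
file passes hbase.hregion.max.filesize (256MB by default), so 17,000 small
rows staying in a single region is expected rather than a misconfiguration.
If earlier splits are wanted for test-sized data, one option is lowering
the threshold in hbase-site.xml (64MB below is only an illustration):

<property>
  <name>hbase.hregion.max.filesize</name>
  <value>67108864</value>
  <description>Split a region once a store file exceeds this many
  bytes.</description>
</property>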
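And for the two limits, this is what I plan to set: the dataXceiver cap
goes in hadoop-site.xml (note Hadoop's own misspelling of the property
name), and the 32k file limit goes in /etc/security/limits.conf for
whichever user runs the daemons (the "hadoop" account below is an
assumption):

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>2048</value>
</property>

# /etc/security/limits.conf
hadoop  soft  nofile  32768
hadoop  hard  nofile  32768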
Thanks a ton,
Rakhi

> > Hi Amandeep,
> > I have 1GB of memory on each node of the EC2 cluster (C1 Medium).
> > I am using hadoop-0.19.0 and hbase-0.19.0.
> > Well, we are starting with 10,000 rows, but later it will go up to
> > 100,000 rows.
>
> 1GB is too low. You need around 4GB to get a stable system.
>
> > My map task basically reads an HBase table 'Table1', performs analysis
> > on each row, and dumps the analysis results into another HBase table
> > 'Table2'. Each analysis task takes about 3-4 minutes when tested on a
> > local machine (the algorithm part, without the map reduce).
> >
> > I have divided 'Table1' into 30 regions before sending it to the map,
> > and set the maximum number of map tasks to 20.
>
> Let HBase do the division into regions. Leave the table as it is in the
> default state.
>
> > I have set dataXceivers to 1024 and ulimit to 1024.
>
> Yes, increase these: 2048 dataXceivers and a 32k ulimit.
>
> > I am able to process about 300 rows in an hour, which I feel is quite
> > slow. How do I increase the performance?
>
> The reasons are mentioned above.
>
> > Meanwhile I will try setting the dataXceivers to 2048 and increasing
> > the file limit as you mentioned.
> >
> > Thanks,
> > Rakhi
> >
> > On Wed, Apr 8, 2009 at 11:40 AM, Amandeep Khurana <[email protected]>
> > wrote:
> >
> > > 20 nodes is good enough to begin with. How much memory do you have on
> > > each node? IMO, you should keep 1GB per daemon and 1GB for the MR
> > > job, like Andrew suggested.
> > > You don't necessarily have to separate the datanodes and tasktrackers
> > > as long as you have enough resources.
> > > 10,000 rows isn't big at all from an HBase standpoint. What kind of
> > > computation are you doing before dumping data into HBase? And what
> > > versions of Hadoop and HBase are you running?
> > >
> > > There's another thing you should do: increase the dataXceivers limit
> > > to 2048 (that's what I use).
> > >
> > > If you have root privilege over the cluster, then increase the file
> > > limit to 32k (see the HBase FAQ for details).
> > >
> > > Try this out and see how it goes.
> > >
> > > Amandeep Khurana
> > > Computer Science Graduate Student
> > > University of California, Santa Cruz
> > >
> > > On Tue, Apr 7, 2009 at 2:45 AM, Rakhi Khatwani <[email protected]>
> > > wrote:
> > >
> > > > Hi,
> > > > I have a 20 node cluster on EC2 (small instance). I have a set of
> > > > tables which store a huge amount of data (tried with 10,000 rows;
> > > > more to be added). But during my map reduce jobs, some of the
> > > > region servers shut down, thereby causing data loss and stopping my
> > > > program execution; in fact, one of my tables got damaged. Whenever
> > > > I scan that table, I get the "could not obtain block" error.
> > > >
> > > > 1. I want to make the cluster more robust, since it contains a lot
> > > > of data and it's really important that it remains stable.
> > > > 2. If one of my tables gets damaged (even after restarting DFS and
> > > > HBase), how do I go about recovering it?
> > > >
> > > > My EC2 cluster mostly has the default configuration, with
> > > > hadoop-site and hbase-site having some entries pertaining to
> > > > map-reduce (for example, the number of map tasks,
> > > > mapred.task.timeout, etc.).
> > > >
> > > > Your help will be greatly appreciated.
> > > > Thanks,
> > > > Raakhi Khatwani
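
(For reference, the Table1 -> Table2 job discussed above would look roughly
like the sketch below against the 0.19-era mapred API. Class and method
names are from memory, so treat the exact signatures as assumptions; the
"data:"/"analysis:result" columns and analyze() are hypothetical
placeholders for the real per-row analysis.)

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.io.Cell;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.io.RowResult;
import org.apache.hadoop.hbase.mapred.IdentityTableReduce;
import org.apache.hadoop.hbase.mapred.TableMap;
import org.apache.hadoop.hbase.mapred.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class AnalysisJob {

  // Reads each row of Table1, runs the analysis, and emits a BatchUpdate
  // destined for Table2.
  public static class AnalysisMap extends MapReduceBase
      implements TableMap<ImmutableBytesWritable, BatchUpdate> {

    public void map(ImmutableBytesWritable row, RowResult value,
        OutputCollector<ImmutableBytesWritable, BatchUpdate> output,
        Reporter reporter) throws IOException {
      BatchUpdate bu = new BatchUpdate(row.get());
      bu.put("analysis:result", analyze(value)); // hypothetical output column
      output.collect(row, bu);
    }

    // Placeholder for the real 3-4 minute per-row computation.
    private byte[] analyze(RowResult value) {
      Cell cell = value.get(Bytes.toBytes("data:raw")); // hypothetical input column
      return cell == null ? new byte[0] : cell.getValue();
    }
  }

  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(new HBaseConfiguration(), AnalysisJob.class);
    conf.setJobName("table1-analysis");
    // Scan the "data:" family of Table1 into AnalysisMap ...
    TableMapReduceUtil.initTableMapJob("Table1", "data:", AnalysisMap.class,
        ImmutableBytesWritable.class, BatchUpdate.class, conf);
    // ... and write the collected BatchUpdates into Table2 unchanged.
    TableMapReduceUtil.initTableReduceJob("Table2", IdentityTableReduce.class,
        conf);
    JobClient.runJob(conf);
  }
}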
