Some comments inline below.

On Mon, Dec 13, 2010 at 8:45 AM, baggio liu <baggi...@gmail.com> wrote:
> Hi Anze,
>   Our production cluster runs HBase 0.20.6 and HDFS (CDH3b2), and we have
> been working on its stability for about a month. We have met some issues
> that may be helpful to you.
>

Thanks for writing back to the list with your experiences.

> HDFS:
>    1.  HBase files have a shorter life cycle than map-reduce files; at
> times there are many blocks to delete, so we had to tune the speed at
> which HDFS invalidates blocks.


This can be true.  Yes.  What are you suggesting here?  What should we tune?


>    2. The hadoop 0.20 branch cannot deal with disk failure; HDFS-630 will
> be helpful.


HDFS-630 has been applied to the branch-0.20-append branch (it's also
in CDH, IIRC).


>    3. The region server cannot handle IOExceptions correctly. When the
> DFSClient meets a network error it throws an IOException, which may not
> be fatal for the region server, so these IOExceptions MUST be reviewed.


Usually if the RegionServer has issues getting to HDFS, it'll shut itself
down.  This is 'normal', perhaps overly-defensive, behavior.  The story
should be better in 0.90, but I would be interested in any list you might
have of exceptions where you think we should be able to catch and continue.


>    4. In a large-scale scan, there are many concurrent readers in a short
> time.


Just FYI, HBase opens all files on startup and keeps them open.
There'll be pressure on file handles and on datanode threads as soon
as you start up an HBase instance.  Scans use the already-opened files,
so whether there is 1 or N ongoing Scans, the pressure on HDFS is the same.

> We must raise the datanode dataxceiver count to a large number, and the
> file handle limit should be tuned. In addition, connection reuse between
> the DFSClient and datanode should be implemented.
>

Yes.  This is in our requirements for HBase.  Here is the latest from
the 0.90.0RC HBase 'book':
http://people.apache.org/~stack/hbase-0.90.0-candidate-1/docs/notsoquick.html#ulimit
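
For reference, a minimal sketch of the knobs in question (plain Java
against the stock Hadoop Configuration API; the property name is the
0.20-era one, and the values are illustrative, not recommendations):

    import org.apache.hadoop.conf.Configuration;

    public class XceiverSettingsSketch {
      public static void main(String[] args) {
        // These normally live in hdfs-site.xml on each datanode; set
        // programmatically here only to name them. Note the historical
        // misspelling "xcievers" in the property key.
        Configuration conf = new Configuration();
        conf.setInt("dfs.datanode.max.xcievers", 4096);
        // Each open block eats a file descriptor, so the OS limit
        // (ulimit -n) for the datanode user must be raised in tandem.
        System.out.println("xcievers = "
            + conf.getInt("dfs.datanode.max.xcievers", 256));
      }
    }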

What do you mean by connection reuse?


> HBase
>    1. Single-threaded compaction limits the speed of compaction; it should
> be made multi-threaded. (During multi-threaded compaction we should limit
> the network bandwidth that compaction uses.)

True, but in 0.90 the compaction algorithm is also smarter; there is less to do.


>    2. Single-threaded HLog splitting (reading the HLog) makes HBase down
> time longer; making it multi-threaded could limit HBase down time.


True in 0.20, but in 0.90 splits are much faster; daughter regions come up
immediately on the regionserver that hosted the parent region that split,
rather than going back to the master for the master to assign out the new
daughter regions.

>    3.  Additionally, some tools should be built, such as a meta region
> checker, fixer, and so on.


Yes.  In 0.90, we have the hbck tool (run as ./bin/hbase hbck) to run
checks and report on inconsistencies.


>    4.  The zookeeper session timeout should be tuned according to the load
> on your HBase cluster.

Yes.  The ZooKeeper ping is the regionserver's lifeline to the cluster.  If
it goes amiss, the regionserver is considered lost and the master will
take restorative action.
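
If useful, a minimal sketch of bumping the timeout (assumes the 0.90
HBaseConfiguration API; normally this goes in hbase-site.xml, and the
60-second value is illustrative, not a recommendation):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class SessionTimeoutSketch {
      public static void main(String[] args) {
        // "zookeeper.session.timeout" governs how long ZooKeeper waits
        // before declaring a regionserver's session dead.
        Configuration conf = HBaseConfiguration.create();
        conf.setInt("zookeeper.session.timeout", 60000); // 60 seconds
        System.out.println("session timeout = "
            + conf.get("zookeeper.session.timeout") + "ms");
      }
    }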


>    5.  The GC strategy should be tuned on your region servers/HMaster.
>


Yes.  Any suggestions from your experience?
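
One quick sanity check is to verify which GC flags the regionserver JVM
actually got, since hbase-env.sh edits are easy to get wrong; a minimal
sketch using the standard JVM runtime MBean (the CMS flags named in the
comment are a common starting point, not a prescription):

    import java.lang.management.ManagementFactory;

    public class GcFlagsCheck {
      public static void main(String[] args) {
        // Prints the arguments this JVM was started with, e.g. to confirm
        // that flags like -XX:+UseConcMarkSweepGC and
        // -XX:CMSInitiatingOccupancyFraction=70 made it in via hbase-env.sh.
        for (String arg :
            ManagementFactory.getRuntimeMXBean().getInputArguments()) {
          System.out.println(arg);
        }
      }
    }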


>    Besides the above, in a production cluster the data loss issue should
> be fixed as well. (Currently the hadoop 0.20-append branch and the CDH3b2
> hadoop can be used.)


Yes.  Here is the 0.90 doc. on hadoop versions:
http://people.apache.org/~stack/hbase-0.90.0-candidate-1/docs/notsoquick.html#hadoop


>    Because HDFS is heavily optimized for throughput, much tuning of, and
> change to, HDFS should be done for an application like HBase (many random
> reads/writes).

Do you have suggestions?  A list?

Thanks for writing the list, Baggio,
St.Ack


>    Hope this experience can be helpful to you.
>
>
> Thanks & Best regard
> Baggio
>
>
> 2010/12/14 Todd Lipcon <t...@cloudera.com>
>
>> HI Anze,
>>
>> In a word, yes - 0.20.4 is not that stable in my experience, and
>> upgrading to the latest CDH3 beta (which includes HBase 0.89.20100924)
>> should give you a huge improvement in stability.
>>
>> You'll still need to do a bit of tuning of settings, but once it's
>> well tuned it should be able to hold up under load without crashing.
>>
>> -Todd
>>
>> On Mon, Dec 13, 2010 at 2:41 AM, Anze <anzen...@volja.net> wrote:
>> > Hi all!
>> >
>> > We have been using HBase 0.20.4 (cdh3b1) in production on 2 nodes for a
>> > few months now and we are having constant issues with it. We fell into
>> > all the standard traps (like "Too many open files", network
>> > configuration problems, ...). All in all, we had about one crash every
>> > week or so.
>> > Fortunately we are still using it just for background processing so our
>> > service didn't suffer directly, but we have lost huge amounts of time
>> > just fixing the data errors that resulted from data not being written
>> > to permanent storage. Not to mention fixing the issues.
>> > As you can probably understand, we are very frustrated with this and are
>> > seriously considering moving to another bigtable.
>> >
>> > Right now, HBase crashes whenever we run a very intensive rebuild of a
>> > secondary index (a normal table, but we use it as a secondary index) on
>> > a huge table. I have found this:
>> > http://wiki.apache.org/hadoop/Hbase/Troubleshooting
>> > (see problem 9)
>> > One of the lines reads:
>> > "Make sure you give plenty of RAM (in hbase-env.sh), the default of 1GB
>> > won't be able to sustain long running imports."
>> >
>> > So, if I understand correctly, no matter how HBase is set up, if I run
>> > an intensive enough application, it will choke? I would expect it to be
>> > slower when under (too much) pressure, but not to crash.
>> >
>> > Of course, we will somehow solve this issue (working on it), but... :(
>> >
>> > What are your experiences with HBase? Is it stable? Is it just us and
>> > the way we set it up?
>> >
>> > Also, would upgrading to 0.89 (cdh3b3) help?
>> >
>> > Thanks,
>> >
>> > Anze
>> >
>> >
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>
