On Thu, Mar 4, 2010 at 6:02 PM, Eran Kutner <e...@gigya.com> wrote:
> Hi,
> I'm evaluating HBase as a NoSQL DB for a large-scale, interactive web
> service with very high uptime requirements, and have a few questions for
> the community.
>
> 1. I assume you've seen the benchmark by Yahoo
> (http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf and
> http://www.brianfrankcooper.net/pubs/ycsb.pdf). They show three main
> problems: latency goes up quite significantly when doing more operations,
> operations/sec are capped at about half of the other tested platforms, and
> adding new nodes interrupts the normal operation of the cluster for a
> while. Do you consider these results a problem, and if so, are there any
> plans to address them?
> 2. While running our tests (most were done using 0.20.2) we've had a few
> incidents where a table went into "transition" without ever coming out of
> it. We had to restart the cluster to release the stuck tables. Is this a
> common issue?
> 3. If I understand correctly, any major upgrade requires completely
> shutting down the cluster while doing the upgrade, as well as deploying a
> new version of the application compiled against the new client version.
> Did I get that right? Is there any strategy for upgrading while the
> cluster is still running?
> 4. This is more a bug report than a question, but it seems that in 0.20.3
> the master server doesn't stop cleanly and has to be killed manually. Is
> anyone else seeing this too?
> 5. Are there any performance benchmarks for the Thrift gateway? Do you
> have an estimate of the performance penalty of using the gateway compared
> to using the native API?
I also have concerns about Thrift gateway performance. Maybe I misunderstand, but after a glance at the code of the HBase Thrift gateway, it appears there is only one Thrift server running. All requests are sent to the Thrift server first and then forwarded to the region servers. Isn't this a single point of failure and a potential performance bottleneck?

> 6. Right now, my biggest concern about HBase is its administration
> complexity and cost. If anyone can share their experience that would be a
> huge help. How many servers do you have in the cluster? How much ongoing
> effort does it take to administer it? What uptime levels are you seeing
> (including upgrades)? Do you have any good strategy for running one
> cluster across two data centers, or for replicating between two clusters
> in two different DCs? Did you have any serious problems/crashes/downtime
> with HBase?
>
> Thanks a lot,
> Eran Kutner

Best,
- Hua