On Thu, Mar 4, 2010 at 10:44 AM, Patrick Hunt <ph...@apache.org> wrote:
>> Please see our answer
>>
>> http://www.search-hadoop.com/m?id=7c962aed1002091610q14f2d6f0gc420ddade319f...@mail.gmail.com
>
> Any ETA on when updated results will be available?
>

Not sure. I'm working with Adam tomorrow. Hopefully soon after that.

St.Ack
> Patrick
>
> Jean-Daniel Cryans wrote:
>>
>> Inline.
>>
>> J-D
>>
>>> 1. I assume you've seen this benchmark by Yahoo (
>>> http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf and
>>> http://www.brianfrankcooper.net/pubs/ycsb.pdf). They show three main
>>> problems: latency goes up quite significantly when doing more operations,
>>> operations/sec are capped at about half of the other tested platforms, and
>>> adding new nodes interrupts the normal operation of the cluster for a
>>> while. Do you consider these results a problem, and if so, are there any
>>> plans to address them?
>>
>> Please see our answer:
>>
>> http://www.search-hadoop.com/m?id=7c962aed1002091610q14f2d6f0gc420ddade319f...@mail.gmail.com
>>
>>> 2. While running our tests (most were done using 0.20.2) we've had a few
>>> incidents where a table went into "transition" without ever coming out of
>>> it. We had to restart the cluster to release the stuck tables. Is this a
>>> common issue?
>>
>> 0.20.3 has a much better story, and 0.20.4 will include even more
>> reliability fixes.
>>
>>> 3. If I understand correctly, any major upgrade requires completely
>>> shutting down the cluster while doing the upgrade, as well as deploying a
>>> new version of the application compiled against the new client version.
>>> Did I get that right? Is there any strategy for upgrading while the
>>> cluster is still running?
>>
>> There are lots of different reasons why: Hadoop RPC is versioned, a new
>> Hadoop major version requires filesystem upgrades, etc.
>>
>> So for HBase, you can currently do rolling restarts between minor
>> versions until told otherwise (in the release notes). See
>> http://wiki.apache.org/hadoop/Hbase/RollingRestart
>>
>> Also, Hadoop RPC will probably be replaced with Avro in the future, and
>> by then all releases should be backward compatible (we hope).
>>
>>> 4. This is more a bug report than a question, but it seems that in 0.20.3
>>> the master server doesn't stop cleanly and has to be killed manually. Is
>>> anyone else seeing this too?
>>
>> Can you provide more details? Logs and stack traces appreciated.
>>
>>> 5. Are there any performance benchmarks for the Thrift gateway? Do you
>>> have an estimate of the performance penalty of using the gateway compared
>>> to using the native API?
>>
>> The good thing with Thrift servers is that they have long-lived
>> clients, so their cache is always full and HotSpot does its magic. In
>> our tests (we use Thrift servers in production here at StumbleUpon),
>> it adds maybe 1 or 2 ms per request.
>>
>>> 6. Right now, my biggest concern about HBase is its administration
>>> complexity and cost. If anyone can share their experience, that would be
>>> a huge help. How many servers do you have in the cluster? How much
>>> ongoing effort does it take to administer it? What uptime levels are you
>>> seeing (including upgrades)? Do you have any good strategy for running
>>> one cluster across two data centers, or replicating between two clusters
>>> in two different DCs? Did you have any serious problems/crashes/downtime
>>> with HBase?
>>
>> HBase does require a knowledgeable admin, but which DB doesn't when used
>> at a very large scale? We have a full-time DBA here for our MySQL
>> clusters, but the difference is that those are easier to find than
>> HBase admins, right? Some stats that we can make public:
>>
>> - We have a production cluster, another one for processing, and a few
>> others for dev and testing (we have 3 HBase committers on staff, so...
>> we need machines!). The production clusters have somewhat beefy nodes:
>> i7s with 24GB of RAM and 4x1TB in JBOD. None has more than 40 nodes.
>>
>> - Cluster replication is actually a feature I'm working on. See
>> http://issues.apache.org/jira/browse/HBASE-1295. We currently have 2
>> clusters replicating to each other, each hosted in a different city,
>> and around 50M rows are sent each day (we aren't replicating
>> everything tho).
>>
>> - We did have some good crashes, and we even run unofficial releases
>> sometimes, but since we are very knowledgeable we are able to fix
>> those problems, and we always get the fixes committed.
>>
>> - I can't disclose our uptime since it would give hints about the
>> uptime of one of our products. I can say tho that it's getting better
>> with every release, but eh, HBase is still very bleeding edge.
>>
>>>
>>> Thanks a lot,
>>> Eran Kutner
>>>
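
For anyone wanting to reproduce the "1 or 2 ms per request" Thrift-gateway
overhead numbers quoted above, a minimal sketch of a latency harness. The
`call` argument is a stand-in for a real client call (e.g. a native get vs.
the same get through the Thrift gateway); the names here are illustrative,
not part of any HBase API:

```python
import time

def mean_latency_ms(call, n=1000):
    """Invoke call() n times and return the mean latency in milliseconds."""
    start = time.perf_counter()
    for _ in range(n):
        call()
    return (time.perf_counter() - start) / n * 1000.0

# Usage sketch (hypothetical callables):
#   native  = lambda: table.get(row)          # direct client call
#   gateway = lambda: thrift_client.get(row)  # same call via the gateway
# The difference mean_latency_ms(gateway) - mean_latency_ms(native)
# estimates the per-request gateway overhead.
```

Remember to run a warm-up pass first, since (as noted above) long-lived
clients benefit from full caches and HotSpot compilation.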