On Thu, Mar 4, 2010 at 10:44 AM, Patrick Hunt <ph...@apache.org> wrote:
>> Please see our answer
>>
>> http://www.search-hadoop.com/m?id=7c962aed1002091610q14f2d6f0gc420ddade319f...@mail.gmail.com
>
> Any ETA on when updated results will be available?
>

Not sure. I'm working with Adam tomorrow. Hopefully soon after that.

St.Ack
> Patrick
>
> Jean-Daniel Cryans wrote:
>>
>> Inline.
>>
>> J-D
>>
>>> 1. I assume you've seen this benchmark by Yahoo (
>>> http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf and
>>> http://www.brianfrankcooper.net/pubs/ycsb.pdf). They show three main
>>> problems: latency goes up quite significantly when doing more operations,
>>> operations/sec are capped at about half of the other tested platforms, and
>>> adding new nodes interrupts the normal operation of the cluster for a
>>> while. Do you consider these results a problem, and if so, are there any
>>> plans to address them?
>>
>> Please see our answer:
>>
>> http://www.search-hadoop.com/m?id=7c962aed1002091610q14f2d6f0gc420ddade319f...@mail.gmail.com
>>
>>> 2. While running our tests (most were done using 0.20.2) we've had a few
>>> incidents where a table went into "transition" without ever coming out of
>>> it. We had to restart the cluster to release the stuck tables. Is this a
>>> common issue?
>>
>> 0.20.3 has a much better story, and 0.20.4 will include even more
>> reliability fixes.
>>
>>> 3. If I understand correctly, any major upgrade requires completely
>>> shutting down the cluster while doing the upgrade, as well as deploying a
>>> new version of the application compiled against the new client version.
>>> Did I get that right? Is there any strategy for upgrading while the
>>> cluster is still running?
>>
>> There are lots of different reasons why: Hadoop RPC is versioned, a new
>> Hadoop major version requires filesystem upgrades, etc.
>>
>> So for HBase, you can currently do rolling restarts between minor
>> versions until told otherwise (in the release notes). See
>> http://wiki.apache.org/hadoop/Hbase/RollingRestart
>>
>> Also, Hadoop RPC will probably be replaced with Avro in the future, and
>> by then all releases should be backward compatible (we hope).
>>
>>> 4. This is more a bug report than a question, but it seems that in 0.20.3
>>> the master server doesn't stop cleanly and has to be killed manually. Is
>>> anyone else seeing this too?
>>
>> Can you provide more details? Logs and stack traces appreciated.
>>
>>> 5. Are there any performance benchmarks for the Thrift gateway? Do you
>>> have an estimate of the performance penalty of using the gateway compared
>>> to using the native API?
>>
>> The good thing with Thrift servers is that they have long-lived
>> clients, so their cache is always full and HotSpot does its magic. In
>> our tests (we use Thrift servers in production here at StumbleUpon),
>> it adds maybe 1 or 2 ms per request.
>>
>>> 6. Right now, my biggest concern about HBase is its administration
>>> complexity and cost. If anyone can share their experience, that would be
>>> a huge help. How many servers do you have in the cluster? How much
>>> ongoing effort does it take to administer it? What uptime levels are you
>>> seeing (including upgrades)? Do you have any good strategy for running
>>> one cluster across two data centers, or replicating between two clusters
>>> in two different DCs? Did you have any serious problems/crashes/downtime
>>> with HBase?
>>
>> HBase does require a knowledgeable admin, but which DB doesn't when used
>> at a very large scale? We have a full-time DBA here for our MySQL
>> clusters, but the difference is that those are easier to find than
>> HBase admins, right? Some stats that we can make public:
>>
>> - We have a production cluster, another one for processing, and a few
>> others for dev and testing (we have 3 HBase committers on staff, so...
>> we need machines!). The production clusters have somewhat beefy nodes:
>> i7s with 24GB of RAM and 4x1TB in JBOD. None has more than 40 nodes.
>>
>> - Cluster replication is actually a feature I'm working on. See
>> http://issues.apache.org/jira/browse/HBASE-1295. We currently have 2
>> clusters replicating to each other, each hosted in a different city,
>> and around 50M rows are sent each day (we aren't replicating
>> everything tho).
>>
>> - We did have some good crashes, and we even run unofficial releases
>> sometimes, but since we are very knowledgeable we are able to fix
>> those problems, and we always get the fixes committed.
>>
>> - I can't disclose our uptime since it would give hints about the
>> uptime of one of our products. I can say tho that it's getting better
>> with every release, but eh, HBase is still very bleeding edge.
>>
>>>
>>> Thanks a lot,
>>> Eran Kutner
>>>
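
For anyone wanting to reproduce the "1 or 2 ms per request" Thrift-gateway
overhead numbers quoted above, a minimal sketch of a latency harness. The
`call` argument is a stand-in for a real client call (e.g. a native get vs.
the same get through the Thrift gateway); the names here are illustrative,
not part of any HBase API:

```python
import time

def mean_latency_ms(call, n=1000):
    """Invoke call() n times and return the mean latency in milliseconds."""
    start = time.perf_counter()
    for _ in range(n):
        call()
    return (time.perf_counter() - start) / n * 1000.0

# Usage sketch (hypothetical callables):
#   native  = lambda: table.get(row)          # direct client call
#   gateway = lambda: thrift_client.get(row)  # same call via the gateway
# The difference mean_latency_ms(gateway) - mean_latency_ms(native)
# estimates the per-request gateway overhead.
```

Remember to run a warm-up pass first, since (as noted above) long-lived
clients benefit from full caches and HotSpot compilation.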