2010/4/26 Michał Podsiadłowski <podsiadlow...@gmail.com>

> Hi Todd,
>
> Thanks for your input. Your words are making me sad, though. I'm using
> 0.20.4 taken from trunk around the beginning of April; I can tell you the
> exact version tomorrow.
> With respect to 1), we are only shutting down, not even killing, the region
> servers, and the datanodes are still working. This is not the first time we
> have managed to break the whole cluster just by shutting down region
> servers.
>
Hi Michal,

I agree that this use case should not cause the cluster to fail.

By "just shutting down" do you mean you are running hbase-daemon.sh stop
regionserver on 3 of the nodes? Are you doing all three at once or in quick
succession? I'd like to try to reproduce your problem so we can get it fixed
for 0.20.5.

Thanks
-Todd

>
> 2010/4/26 Todd Lipcon <t...@cloudera.com>:
> > Hi Michal,
> >
> > What version of HBase are you running?
> >
> > All currently released versions of HBase have known bugs with recovery
> > under crash scenarios, many of which have to do with the lack of a sync()
> > feature in released versions of HDFS.
> >
> > The goal for HBase 0.20.5, due out in the next couple of months, is to
> > fix all of these issues to achieve cluster stability under failure.
> >
> > I'm working full time on this branch, and I'm happy to report that as of
> > yesterday I have a 40-threaded client inserting records into a cluster
> > where I am killing a region server once every 1-2 minutes, and it is
> > recovering completely and correctly through every failure. The test has
> > been running for about 24 hours, and no regions have been lost, etc.
> >
> > My next step is to start testing under 2-node failure scenarios, master
> > failure scenarios, etc.
> >
> > Regarding your specific questions:
> >
> > 1) When you have a simultaneous failure of 3 nodes, blocks will become
> > unavailable in the underlying HDFS. HBase then has no way to continue
> > operating correctly, since its data won't be accessible and any edit logs
> > being written to that set of 3 nodes will fail to append. So I don't
> > think we can reasonably expect to recover from this situation. We should
> > shut down the cluster in such a way that, after HDFS has been restored,
> > we can restart HBase without missing regions, etc. There are probably
> > bugs here currently, but this is lower on the priority list compared to
> > more common scenarios.
> >
> > 2) When a region is being reassigned, it does take some time to recover.
> > In my experience, the loss of a region server hosting META takes about 2
> > minutes to fully reassign; the loss of a region server not holding META
> > takes about 1 minute. This is with a 1-minute ZK session timeout. With
> > shorter timeouts you will detect failure faster, but you are more likely
> > to get false failure detections due to GC pauses, etc. We're working on
> > improving this for 0.21.
> >
> > Regarding the suitability of this for a real-time workload, there are
> > some ideas floating around for future work that would make the regions
> > available very quickly in a read-only/stale-data mode while the logs are
> > split and recovered. This is probably not going to happen in the short
> > term, as it will be tricky to do correctly and there are more pressing
> > issues.
> >
> > Thanks
> > -Todd
> >
> >
> > 2010/4/26 Michał Podsiadłowski <podsiadlow...@gmail.com>
> >
> >> Hi Edward,
> >>
> >> This is not good news for us. If you get 30 seconds under low load,
> >> our 3 minutes are quite normal, especially because your records are
> >> quite big and there are lots of removals and inserts. I just wonder
> >> whether our use-case scenarios are outside the sweet spot of HBase, or
> >> whether HBase availability is simply low. Do you have any knowledge
> >> about changes to the architecture in 0.21? As far as I can see, part of
> >> the problem is in dividing the logs from the dead data node into the
> >> per-table log files.
> >> Is there any way we could speed up recovery? And can someone explain
> >> what happened when we shut down 3 of our 6 region servers? Why did the
> >> cluster get into an inconsistent state with so many missing regions? Is
> >> this such an unusual situation that HBase can't handle it?
> >>
> >> Thanks,
> >> Michal
> >>
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>

--
Todd Lipcon
Software Engineer, Cloudera
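
For anyone trying to reproduce the scenario Todd asks about above, the two
shutdown patterns look roughly like this. This is only a sketch: the hostnames
rs1-rs3 and the /opt/hbase install path are placeholders, and only the
"hbase-daemon.sh stop regionserver" command itself comes from the thread; the
datanodes are left running throughout.

    # All three region servers at (roughly) the same time:
    for host in rs1 rs2 rs3; do
      ssh "$host" "/opt/hbase/bin/hbase-daemon.sh stop regionserver" &
    done
    wait

    # ...versus one at a time, giving the master time to reassign regions
    # (about 1-2 minutes per region server, per Todd's figures above):
    for host in rs1 rs2 rs3; do
      ssh "$host" "/opt/hbase/bin/hbase-daemon.sh stop regionserver"
      sleep 120
    done

The roughly one-minute detection delay Todd mentions corresponds to the
ZooKeeper session timeout, which HBase reads from the zookeeper.session.timeout
property in hbase-site.xml; lowering it detects failures faster at the cost of
more false positives during long GC pauses.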