On 2010-08-13, at 5:04 PM, Stack wrote:

> On Fri, Aug 13, 2010 at 3:59 PM, James Kennedy <[email protected]> wrote:
>> We've recently updated the hbase-transactional-tableindexed extension to
>> work with the latest 0.89.20100726 version of HBase (still to be pushed).
>
> Good.  Please update the list when you push out the change.

Will do. We're just working on getting an HBase patch up to snuff for
review. We needed to make some things, like the HLog split code, more
extensible.

> Are you sure the master doesn't notice the server gone?  Maybe the RS
> is dead in all but the ZK Client session thread so its lease is not
> timing out against the ZK ensemble?

Will verify.

> The other issue that can get in the way of a RS recovery is replay of
> logs; the master needs to grab the lease on the WAL log that the dead
> regionserver was writing.  This can take a little while.  It won't
> work at all if you are not using an hdfs that doesn't support
> 'append'; i.e. the hadoop that is in hbase lib dir or build your own
> out of apache hadoop branch-0.20-append branch (or get cdh3b2).  Is
> this the issue (Tail the log and look for failure to obtain file
> lease).

Yeah, we definitely have appends enabled. We did at one point have a
separate issue where that lease recovery loop would never exit. The
problem was that our transactional extension was failing to close its
own THLog HDFS file when the region server was aborted. We fixed that.
But it seemed to me that the comment in the loop about possibly
continuing after the lease expiry time should be followed up on (maybe
it has been in trunk since I last looked).

> An incomplete server crash recovery will show the exceptions you pasted below.
>
>> Could it really be that sudden region server death is not handled in hbase?
>> Or more likely is this a failure of the testing framework to adequately
>> simulate kill -9?
>
> Its kinda hard to simulate kill -9 in unit tests.  The 'kill' method
> above was a pale attempt at it in our unit testing context.  I've
> manually kill -9'd nodes -- RS and DN -- to simulate node death
> testing 0.89.x builds and all seems to be basically working.

Ok, we'll do some more manual testing and explore gremlins as JD Cryans
suggests.
I don't think kill -9 testing is critical for our HBase patch, but it
will be for guaranteeing that our transactional extension works.

> On making it so master notices downed servers the faster so it'll
> start up recovery the sooner, thats a case of playing with the zk
> session timeout but coming in the other direction is tuning GC so GC
> pause is not > ZK session expiration.

In the case I mentioned, I waited 5 min, which was well over the set
60 sec ZK lease time.

> You also want to ensure that there is not too much of a backlog of WAL
> logs else recovery which involves splitting all WALs that had
> outstanding edits on the crashed server has less work to do when it
> runs.  Default is maximum backlog of 32 WALs.  You might want to tune
> this down though there is a currently an issue because it seems like
> we can overrun this limit.

What happens if the maximum is exceeded? Are logs lost? Our system is
data critical and we can't afford to lose anything.

Thanks again,
James
