On Tue, Nov 8, 2011 at 2:06 PM, Stack <[email protected]> wrote:
> What happened above between 00:43:40 and 00:44:31?

Not much, judging by the logs. In fact, I think that's part of the issue here.

> A big old GC?

Unlikely -- the RS had tons of heap, but of course anything's possible.

> This is a standalone instance with all running in the on VM?

That's a small cluster running on EC2. So at the most fundamental level these are VMs, yes. But for all practical purposes it is a fully distributed set of standalone servers.

> The YouAreDeadException happens usually when the master has figured
> the RegionServer is dead before the RegionServer has figured it out.
> This can happen when say, the RS has GC paused and first thing it does
> when it comes out of the pause is it heartbeats the master (Meantime
> its probably running the zookeeper session expiration code
> concurrently).

Right. I'll try to look into that in my testing. I also bumped the timeout up to a minute (which I'm really nervous about, though). Let's see...

Thanks,
Roman.
