Hello,

We've recently had a problem where regions get stuck in transition for a long period of time. In fact, they never appear to get out of transition unless we take manual action. Last time this happened I restarted the master and they were cleared out. This time I wanted to consult the list first.

I checked the admin UI for all 24 of our servers, and the region does not appear to be hosted anywhere. If I look in HDFS, I do see the region there, and it has 2 files.

The first instance of this region in my HMaster logs is:

12/04/15 17:48:06 INFO master.HMaster: balance hri=visitor-activities-a2,\x00\x02EG120909,1333750824238.703fed4411f2d6ff4b3ea80506fb635e., src=XXXXXXXXX.ec2.internal,60020,1334064456919, dest=XXXXXXXX.ec2.internal,60020,1334064197946
12/04/15 17:48:06 INFO master.AssignmentManager: Server serverName=XXXXXXXX.ec2.internal,60020,1334064456919, load=(requests=0, regions=0, usedHeap=0, maxHeap=0) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for visitor-activities-a2,\x00\x02EG120909,1333750824238.703fed4411f2d6ff4b3ea80506fb635e. but we are not serving it for 703fed4411f2d6ff4b3ea80506fb635e

It then keeps repeating the same few log lines every ~30 minutes:

12/04/15 18:18:18 INFO master.AssignmentManager: Regions in transition timed out: visitor-activities-a2,\x00\x02EG120909,1333750824238.703fed4411f2d6ff4b3ea80506fb635e. state=PENDING_CLOSE, ts=1334526491544, server=null
12/04/15 18:18:18 INFO master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=visitor-activities-a2,\x00\x02EG120909,1333750824238.703fed4411f2d6ff4b3ea80506fb635e.
12/04/15 18:18:18 INFO master.AssignmentManager: Server serverName=XXXXXXXXX.ec2.internal,60020,1334064456919, load=(requests=0, regions=0, usedHeap=0, maxHeap=0) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for visitor-activities-a2,\x00\x02EG120909,1333750824238.703fed4411f2d6ff4b3ea80506fb635e. but we are not serving it for 703fed4411f2d6ff4b3ea80506fb635e

Any ideas on how I can avoid this, or is there a better solution than restarting the HMaster?

Thanks,
Bryan