Hi,

I ran the disable table command, and after a while two RegionServers shut
down. From the log, a compaction was still running on one of the regions when
that region was closed (see the logs below).

I checked the code: when a region is closed, it first sets
writestate.writesEnabled to false. If a compaction is still running, this
setting can interrupt it with an InterruptedIOException. When HRegion catches
this exception, the compaction fails. Could this be the cause of the
RegionServers going down? If so, this may be a problem.

The relevant code:
      if (!this.region.areWritesEnabled()) {
        writer.close();
        fs.delete(writer.getPath(), false);
        throw new InterruptedIOException(
            "Aborting compaction of store " + this +
            " in region " + this.region +
            " because user requested stop.");
      }
    } catch (InterruptedIOException iioe) {
      LOG.info("compaction interrupted by user: ", iioe);
    } finally {
      long now = EnvironmentEdgeManager.currentTimeMillis();
      LOG.info(((completed) ? "completed" : "aborted")
          + " compaction on region " + this
          + " after " + StringUtils.formatTimeDiff(now, startTime));
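To make the question concrete, here is a minimal standalone sketch of the pattern in the excerpt above: a close clears a writes-enabled flag, the compaction loop notices it and throws InterruptedIOException, and the caller catches the exception and reports the compaction as aborted. This is not the HBase code. The class CloseDuringCompactionSketch, the simplified Region class, and the totalSteps and closeAtStep parameters are all hypothetical and exist only for illustration. It models only the quoted code path and says nothing about what else the RegionServer does while closing.

```java
import java.io.InterruptedIOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified model of the quoted code path; not HBase code.
final class CloseDuringCompactionSketch {

    // Stands in for the region's writestate.writesEnabled flag (assumption).
    static final class Region {
        private final AtomicBoolean writesEnabled = new AtomicBoolean(true);

        boolean areWritesEnabled() {
            return writesEnabled.get();
        }

        // What a region close is assumed to do first: disable writes.
        void disableWrites() {
            writesEnabled.set(false);
        }
    }

    // Runs a fake compaction of totalSteps steps. If closeAtStep is in
    // range, the region is "closed" at that step. Returns true when the
    // compaction completed and false when it was aborted.
    static boolean compact(Region region, int totalSteps, int closeAtStep) {
        boolean completed = false;
        try {
            for (int step = 0; step < totalSteps; step++) {
                if (step == closeAtStep) {
                    region.disableWrites(); // simulates a close arriving mid-compaction
                }
                if (!region.areWritesEnabled()) {
                    throw new InterruptedIOException(
                        "Aborting compaction because user requested stop.");
                }
                // A real compaction would rewrite store files here.
            }
            completed = true;
        } catch (InterruptedIOException iioe) {
            // Caught here, as in the excerpt; it does not propagate further.
            System.out.println("compaction interrupted by user: " + iioe.getMessage());
        }
        return completed;
    }

    public static void main(String[] args) {
        boolean interrupted = compact(new Region(), 5, 2);
        System.out.println((interrupted ? "completed" : "aborted") + " compaction");

        boolean undisturbed = compact(new Region(), 5, -1);
        System.out.println((undisturbed ? "completed" : "aborted") + " compaction");
    }
}
```

In this simplified model the exception is handled inside compact(), so the caller only sees an aborted compaction. Whether the same holds for the real RegionServer is exactly the open question.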
Some logs:
2011-04-18 14:00:56,468 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for ufdr,1000286138199982#0129000,1302767272113.80928bc54c94a029b76098ce04c22572. because Region has too many store files; priority=6, compaction queue size=0
2011-04-18 14:01:06,569 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing close of ufdr,1000286138199982#0129000,1302767272113.80928bc54c94a029b76098ce04c22572.
2011-04-18 14:01:06,569 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing ufdr,1000286138199982#0129000,1302767272113.80928bc54c94a029b76098ce04c22572.: disabling compactions & flushes
2011-04-18 14:01:06,569 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: waiting for compaction to complete for region ufdr,1000286138199982#0129000,1302767272113.80928bc54c94a029b76098ce04c22572.
2011-04-18 14:01:06,714 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction interrupted by user:
java.io.InterruptedIOException: Aborting compaction of store value in region ufdr,1000286138199982#0129000,1302767272113.80928bc54c94a029b76098ce04c22572. because user requested stop.
2011-04-18 14:01:06,714 INFO org.apache.hadoop.hbase.regionserver.HRegion: aborted compaction on region ufdr,1000286138199982#0129000,1302767272113.80928bc54c94a029b76098ce04c22572. after 10sec
2011-04-18 14:01:06,714 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: regionserver60020.compactor exiting
2011-04-18 14:01:07,532 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020 closing leases
2011-04-18 14:01:07,532 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020 closed leases
2011-04-18 14:01:07,600 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020 exiting
Zhou Shuaifeng(Frank)