Splits getting blocked by compactions causeing region to be offline for the 
length of the compaction 10-15 mins
---------------------------------------------------------------------------------------------------------------

                 Key: HADOOP-2587
                 URL: https://issues.apache.org/jira/browse/HADOOP-2587
             Project: Hadoop
          Issue Type: Bug
          Components: contrib/hbase
         Environment: hadoop subversion 611087
            Reporter: Billy Pearson
             Fix For: 0.16.0


The below is cut out of one of my region servers logs full log attached

What is happening is there is one region on a this region server and its is 
under heave insert load so compaction are back to back one one finishes a new 
one starts the problem starts when its time to split the region. 

A compaction starts just millsecs before the split starts blocking the split 
but the split closes the region before the compaction is finished. Causing the 
region to be offline until the compaction is done. Once the compaction is done 
the split finishes and all is returned to normal but this is a big problem for 
production if the region is offline for 10-15 mins.

The solution would be not to let the split thread to issue the below line while 
a compaction on that region is happening.
2008-01-11 16:22:01,247 DEBUG org.apache.hadoop.hbase.HRegionServer: 
webdata,,1200085987488 closing (Adding to retiringRegions)

The only time I have seen this bug is when there is only one region on a region 
server because if more then one then the compaction happens to the other 
region(s) after the first one is done compaction and the split can do what it 
needs on the first region with out getting blocked.

{code}
2008-01-11 16:22:01,020 INFO org.apache.hadoop.hbase.HRegion: compaction 
completed on region webdata,,1200085987488. Took 16mins, 10sec
2008-01-11 16:22:01,020 DEBUG org.apache.hadoop.hbase.HStore: compaction for 
HStore webdata,,1200085987488/size needed.
2008-01-11 16:22:01,020 DEBUG org.apache.hadoop.hbase.HRegion: 1773667150/size 
needs compaction
2008-01-11 16:22:01,021 INFO org.apache.hadoop.hbase.HRegion: starting 
compaction on region webdata,,1200085987488
2008-01-11 16:22:01,021 DEBUG org.apache.hadoop.hbase.HStore: started 
compaction of 14 files using 
/gfs_storage/hadoop-root/hbase/hregion_1773667150/compaction.dir/hregion_1773667150/size
 for webdata,,1200085987488/size
2008-01-11 16:22:01,123 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache 
flush for region webdata,,1200085987488. Size 31.2m
2008-01-11 16:22:01,232 INFO org.apache.hadoop.hbase.HRegion: Splitting 
webdata,,1200085987488 because largest aggregate size is 100.7m and desired 
size is 64.0m
2008-01-11 16:22:01,247 DEBUG org.apache.hadoop.hbase.HRegionServer: 
webdata,,1200085987488 closing (Adding to retiringRegions)
...
lots of NotServingRegionException's
...
2008-01-11 16:32:59,876 INFO org.apache.hadoop.hbase.HRegion: compaction 
completed on region webdata,,1200085987488. Took 10mins, 58sec
...

2008-01-11 16:33:02,193 DEBUG org.apache.hadoop.hbase.HRegion: Cleaned up 
/gfs_storage/hadoop-root/hbase/hregion_1773667150/splits true
2008-01-11 16:33:02,194 INFO org.apache.hadoop.hbase.HRegion: Region split of 
webdata,,1200085987488 complete; new regions: webdata,,1200090121237, 
webdata,com.tom.ent/2008-01-04/0PGM/09034104.html:http,1200090121239. Split 
took 11mins, 0sec
2008-01-11 16:33:02,227 DEBUG 
org.apache.hadoop.hbase.HConnectionManager$TableServers: No servers for .META.. 
Doing a find...
2008-01-11 16:33:02,283 DEBUG 
org.apache.hadoop.hbase.HConnectionManager$TableServers: Found 1 region(s) for 
.META. at address: 10.0.0.4:60020, regioninfo: regionname: -ROOT-,,0, startKey: 
<>, encodedName(70236052) tableDesc: {name: -ROOT-, families: {info:={name: 
info, max versions: 1, compression: NONE, in memory: false, max length: 
2147483647, bloom filter: none}}}
2008-01-11 16:33:02,284 INFO org.apache.hadoop.hbase.HRegionServer: Updating 
.META. with region split info
2008-01-11 16:33:02,290 DEBUG org.apache.hadoop.hbase.HRegionServer: Reporting 
region split to master
2008-01-11 16:33:02,291 INFO org.apache.hadoop.hbase.HRegionServer: region 
split, META update, and report to master all successful. Old 
region=webdata,,1200085987488, new regions: webdata,,1200090121237, 
webdata,com.tom.ent/2008-01-04/0PGM/09034104.html:http,1200090121239
{code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to