[ 
https://issues.apache.org/jira/browse/HADOOP-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Kellerman updated HADOOP-2587:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Tests passed. Committed.

> Splits getting blocked by compactions causeing region to be offline for the 
> length of the compaction 10-15 mins
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2587
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2587
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>         Environment: hadoop subversion 611087
>            Reporter: Billy Pearson
>            Assignee: Jim Kellerman
>             Fix For: 0.16.0
>
>         Attachments: hbase-root-regionserver-PE1750-3.log, patch.txt
>
>
> The below is cut out of one of my region servers logs full log attached
> What is happening is there is one region on a this region server and its is 
> under heave insert load so compaction are back to back one one finishes a new 
> one starts the problem starts when its time to split the region. 
> A compaction starts just millsecs before the split starts blocking the split 
> but the split closes the region before the compaction is finished. Causing 
> the region to be offline until the compaction is done. Once the compaction is 
> done the split finishes and all is returned to normal but this is a big 
> problem for production if the region is offline for 10-15 mins.
> The solution would be not to let the split thread to issue the below line 
> while a compaction on that region is happening.
> 2008-01-11 16:22:01,247 DEBUG org.apache.hadoop.hbase.HRegionServer: 
> webdata,,1200085987488 closing (Adding to retiringRegions)
> The only time I have seen this bug is when there is only one region on a 
> region server because if more then one then the compaction happens to the 
> other region(s) after the first one is done compaction and the split can do 
> what it needs on the first region with out getting blocked.
> {code}
> 2008-01-11 16:22:01,020 INFO org.apache.hadoop.hbase.HRegion: compaction 
> completed on region webdata,,1200085987488. Took 16mins, 10sec
> 2008-01-11 16:22:01,020 DEBUG org.apache.hadoop.hbase.HStore: compaction for 
> HStore webdata,,1200085987488/size needed.
> 2008-01-11 16:22:01,020 DEBUG org.apache.hadoop.hbase.HRegion: 
> 1773667150/size needs compaction
> 2008-01-11 16:22:01,021 INFO org.apache.hadoop.hbase.HRegion: starting 
> compaction on region webdata,,1200085987488
> 2008-01-11 16:22:01,021 DEBUG org.apache.hadoop.hbase.HStore: started 
> compaction of 14 files using 
> /gfs_storage/hadoop-root/hbase/hregion_1773667150/compaction.dir/hregion_1773667150/size
>  for webdata,,1200085987488/size
> 2008-01-11 16:22:01,123 DEBUG org.apache.hadoop.hbase.HRegion: Started 
> memcache flush for region webdata,,1200085987488. Size 31.2m
> 2008-01-11 16:22:01,232 INFO org.apache.hadoop.hbase.HRegion: Splitting 
> webdata,,1200085987488 because largest aggregate size is 100.7m and desired 
> size is 64.0m
> 2008-01-11 16:22:01,247 DEBUG org.apache.hadoop.hbase.HRegionServer: 
> webdata,,1200085987488 closing (Adding to retiringRegions)
> ...
> lots of NotServingRegionException's
> ...
> 2008-01-11 16:32:59,876 INFO org.apache.hadoop.hbase.HRegion: compaction 
> completed on region webdata,,1200085987488. Took 10mins, 58sec
> ...
> 2008-01-11 16:33:02,193 DEBUG org.apache.hadoop.hbase.HRegion: Cleaned up 
> /gfs_storage/hadoop-root/hbase/hregion_1773667150/splits true
> 2008-01-11 16:33:02,194 INFO org.apache.hadoop.hbase.HRegion: Region split of 
> webdata,,1200085987488 complete; new regions: webdata,,1200090121237, 
> webdata,com.tom.ent/2008-01-04/0PGM/09034104.html:http,1200090121239. Split 
> took 11mins, 0sec
> 2008-01-11 16:33:02,227 DEBUG 
> org.apache.hadoop.hbase.HConnectionManager$TableServers: No servers for 
> .META.. Doing a find...
> 2008-01-11 16:33:02,283 DEBUG 
> org.apache.hadoop.hbase.HConnectionManager$TableServers: Found 1 region(s) 
> for .META. at address: 10.0.0.4:60020, regioninfo: regionname: -ROOT-,,0, 
> startKey: <>, encodedName(70236052) tableDesc: {name: -ROOT-, families: 
> {info:={name: info, max versions: 1, compression: NONE, in memory: false, max 
> length: 2147483647, bloom filter: none}}}
> 2008-01-11 16:33:02,284 INFO org.apache.hadoop.hbase.HRegionServer: Updating 
> .META. with region split info
> 2008-01-11 16:33:02,290 DEBUG org.apache.hadoop.hbase.HRegionServer: 
> Reporting region split to master
> 2008-01-11 16:33:02,291 INFO org.apache.hadoop.hbase.HRegionServer: region 
> split, META update, and report to master all successful. Old 
> region=webdata,,1200085987488, new regions: webdata,,1200090121237, 
> webdata,com.tom.ent/2008-01-04/0PGM/09034104.html:http,1200090121239
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to