[
https://issues.apache.org/jira/browse/HADOOP-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jim Kellerman updated HADOOP-2587:
----------------------------------
Status: Patch Available (was: Open)
> Splits getting blocked by compactions causeing region to be offline for the
> length of the compaction 10-15 mins
> ---------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-2587
> URL: https://issues.apache.org/jira/browse/HADOOP-2587
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/hbase
> Affects Versions: 0.16.0
> Environment: hadoop subversion 611087
> Reporter: Billy Pearson
> Assignee: Jim Kellerman
> Fix For: 0.16.0
>
> Attachments: hbase-root-regionserver-PE1750-3.log, log.log,
> patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt
>
>
> The below is cut out of one of my region servers logs full log attached
> What is happening is there is one region on a this region server and its is
> under heave insert load so compaction are back to back one one finishes a new
> one starts the problem starts when its time to split the region.
> A compaction starts just millsecs before the split starts blocking the split
> but the split closes the region before the compaction is finished. Causing
> the region to be offline until the compaction is done. Once the compaction is
> done the split finishes and all is returned to normal but this is a big
> problem for production if the region is offline for 10-15 mins.
> The solution would be not to let the split thread to issue the below line
> while a compaction on that region is happening.
> 2008-01-11 16:22:01,247 DEBUG org.apache.hadoop.hbase.HRegionServer:
> webdata,,1200085987488 closing (Adding to retiringRegions)
> The only time I have seen this bug is when there is only one region on a
> region server because if more then one then the compaction happens to the
> other region(s) after the first one is done compaction and the split can do
> what it needs on the first region with out getting blocked.
> {code}
> 2008-01-11 16:22:01,020 INFO org.apache.hadoop.hbase.HRegion: compaction
> completed on region webdata,,1200085987488. Took 16mins, 10sec
> 2008-01-11 16:22:01,020 DEBUG org.apache.hadoop.hbase.HStore: compaction for
> HStore webdata,,1200085987488/size needed.
> 2008-01-11 16:22:01,020 DEBUG org.apache.hadoop.hbase.HRegion:
> 1773667150/size needs compaction
> 2008-01-11 16:22:01,021 INFO org.apache.hadoop.hbase.HRegion: starting
> compaction on region webdata,,1200085987488
> 2008-01-11 16:22:01,021 DEBUG org.apache.hadoop.hbase.HStore: started
> compaction of 14 files using
> /gfs_storage/hadoop-root/hbase/hregion_1773667150/compaction.dir/hregion_1773667150/size
> for webdata,,1200085987488/size
> 2008-01-11 16:22:01,123 DEBUG org.apache.hadoop.hbase.HRegion: Started
> memcache flush for region webdata,,1200085987488. Size 31.2m
> 2008-01-11 16:22:01,232 INFO org.apache.hadoop.hbase.HRegion: Splitting
> webdata,,1200085987488 because largest aggregate size is 100.7m and desired
> size is 64.0m
> 2008-01-11 16:22:01,247 DEBUG org.apache.hadoop.hbase.HRegionServer:
> webdata,,1200085987488 closing (Adding to retiringRegions)
> ...
> lots of NotServingRegionException's
> ...
> 2008-01-11 16:32:59,876 INFO org.apache.hadoop.hbase.HRegion: compaction
> completed on region webdata,,1200085987488. Took 10mins, 58sec
> ...
> 2008-01-11 16:33:02,193 DEBUG org.apache.hadoop.hbase.HRegion: Cleaned up
> /gfs_storage/hadoop-root/hbase/hregion_1773667150/splits true
> 2008-01-11 16:33:02,194 INFO org.apache.hadoop.hbase.HRegion: Region split of
> webdata,,1200085987488 complete; new regions: webdata,,1200090121237,
> webdata,com.tom.ent/2008-01-04/0PGM/09034104.html:http,1200090121239. Split
> took 11mins, 0sec
> 2008-01-11 16:33:02,227 DEBUG
> org.apache.hadoop.hbase.HConnectionManager$TableServers: No servers for
> .META.. Doing a find...
> 2008-01-11 16:33:02,283 DEBUG
> org.apache.hadoop.hbase.HConnectionManager$TableServers: Found 1 region(s)
> for .META. at address: 10.0.0.4:60020, regioninfo: regionname: -ROOT-,,0,
> startKey: <>, encodedName(70236052) tableDesc: {name: -ROOT-, families:
> {info:={name: info, max versions: 1, compression: NONE, in memory: false, max
> length: 2147483647, bloom filter: none}}}
> 2008-01-11 16:33:02,284 INFO org.apache.hadoop.hbase.HRegionServer: Updating
> .META. with region split info
> 2008-01-11 16:33:02,290 DEBUG org.apache.hadoop.hbase.HRegionServer:
> Reporting region split to master
> 2008-01-11 16:33:02,291 INFO org.apache.hadoop.hbase.HRegionServer: region
> split, META update, and report to master all successful. Old
> region=webdata,,1200085987488, new regions: webdata,,1200090121237,
> webdata,com.tom.ent/2008-01-04/0PGM/09034104.html:http,1200090121239
> {code}
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.