[ 
https://issues.apache.org/jira/browse/HBASE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-4579:
--------------------------------------

    Summary: CST.requestCompaction semantics changed, logs are now spammed when 
too many store files  (was: Compactions that can't run are requested forever)

I spent some time debugging this and the debug I added really helped:

{quote}
2011-10-13 22:04:16,633 DEBUG 
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Large Compaction 
requested: 
regionName=TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6.,
 storeName=info, fileCount=4, fileSize=219.6m (54.8m, 55.0m, 55.0m, 54.8m), 
priority=-1, time=6062002855079899; Because: regionserver62023.cacheFlusher; 
compaction_queue=(1:0), split_queue=0
2011-10-13 22:04:21,999 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
Flush requested on 
TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6.
2011-10-13 22:04:21,999 WARN 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region 
TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6. has too 
many store files; delaying flush up to 90000ms
2011-10-13 22:04:21,999 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: 
no store files to compact 1724203bae0dda87cddfaac9700b9dc6
2011-10-13 22:04:22,000 ERROR org.apache.hadoop.hbase.regionserver.Store:
java.lang.Exception
        at 
org.apache.hadoop.hbase.regionserver.Store.compactSelection(Store.java:1007)
        at 
org.apache.hadoop.hbase.regionserver.Store.requestCompaction(Store.java:908)
        at 
org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:204)
        at 
org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:177)
        at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:360)
        at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:239)
...
2011-10-13 22:05:52,584 INFO 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Waited 90585ms on a 
compaction to clean up 'too many store files'; waited long enough... proceeding 
with flush of 
TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6.
{quote}

The exception is there just to show me where the call was coming from.

So what we see is that compaction is grabbing all the store files and during 
that time the MemStoreFlusher is trying to flush but hits the "too many store 
files" issue. It used to be that requesting a compaction was just putting 
something on a queue silently, but now it's actually going all the way down 
into Store and that's why it's now spammy.

Fixes I think are required:

 - For debugging purposes, show the encoded name when printing the "info: no 
store files to compact" message because it's pretty much useless if you don't 
know what it belongs to.
 - Require the compaction only once, the code already knows when it's the first 
time you spin so no need to ask it again and again.
 - Set a higher priority for the compaction
                
> CST.requestCompaction semantics changed, logs are now spammed when too many 
> store files
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-4579
>                 URL: https://issues.apache.org/jira/browse/HBASE-4579
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.92.0
>
>
> Another bug I'm not so sure what's going on. I see this in my log:
> {quote}
> 2011-10-12 00:23:43,435 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> info: no store files to compact
> 2011-10-12 00:23:44,335 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> info: no store files to compact
> 2011-10-12 00:23:45,236 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> info: no store files to compact
> 2011-10-12 00:23:46,136 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> info: no store files to compact
> 2011-10-12 00:23:47,036 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> info: no store files to compact
> 2011-10-12 00:23:47,936 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> info: no store files to compact
> {quote}
> It spams for a while, and a little later instead I get:
> {quote}
> 2011-10-12 00:26:52,139 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
> compaction criteria.
> 2011-10-12 00:26:53,040 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
> compaction criteria.
> 2011-10-12 00:26:53,940 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
> compaction criteria.
> 2011-10-12 00:26:54,840 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
> compaction criteria.
> 2011-10-12 00:26:55,741 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
> compaction criteria.
> 2011-10-12 00:26:56,641 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
> compaction criteria.
> {quote}
> I believe I also saw something like that for flushes, but the region was 
> closing so at least I know why it was spamming (would be nice if it just 
> unrequested the flush):
> {quote}
> 2011-10-12 00:26:40,693 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Flush requested on 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
> flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Flush requested on 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
> flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Flush requested on 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
> flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Flush requested on 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
> flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,921 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Flush requested on 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,922 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
> flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,923 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Flush requested on 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,923 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
> flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,923 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Flush requested on 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,923 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
> flushing=false, writesEnabled=false
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to