[ 
https://issues.apache.org/jira/browse/HBASE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qgxiaozhan updated HBASE-16394:
-------------------------------
    Description: 
One regionserver in my cluster died because of "Compaction is trying to add a
bad range".
Here is the log:
[2016-08-09T18:30:19.094+08:00] [INFO] regionserver.ReplicationSource : Log hdfs://athene/hbase/oldWALs/MJQ-HBASE-ATHENE-11139%2C16020%2C1470729882622.default.1470736608897 was moved to hdfs://athene/hbase/oldWALs/MJQ-HBASE-ATHENE%2C16020%2C1470729882622.default.1470736608897
[2016-08-09T18:30:30.225+08:00] [INFO] regionserver.MemStoreFlusher : Waited 90070ms on a compaction to clean up 'TOO MANY STORE FILES'; waited long enough... proceeding with flush of tjs4:popt_info,160608008474430,1470737160711.7900baab5204e4f36fa49379c30cd584.
[2016-08-09T18:30:30.226+08:00] [INFO] regionserver.HRegion : Started memstore flush for tjs4:popt_info,160608008474430,1470737160711.7900baab5204e4f36fa49379c30cd584., current region memstore size 769.41 MB, and 1/1 column families' memstores are being flushed.
[2016-08-09T18:30:30.549+08:00] [INFO] regionserver.StripeStoreFileManager : 3 conflicting files (likely created by a flush)  of size 156153021 are moved to L0 due to concurrent stripe change
[2016-08-09T18:30:31.199+08:00] [INFO] regionserver.HStore : Completed compaction of 203 file(s) in c of tjs4:popt_info,160608008474430,1470737160711.7900baab5204e4f36fa49379c30cd584. into 20347d203d09442cac30c42b424adda6(size=3.0 G), ded362eab9cf4a819675cd35992d4974(size=3.0 G), 281b1039ed2643679e5b0a3820f5059d(size=2.4 G), total size for store is 8.6 G. This selection was in queue for 0sec, and took 10mins, 16sec to execute.
[2016-08-09T18:30:31.200+08:00] [INFO] regionserver.CompactSplitThread : Completed compaction: Request = regionName=tjs4:popt_info,160608008474430,1470737160711.7900baab5204e4f36fa49379c30cd584., storeName=c, fileCount=203, fileSize=7.2 G, priority=-3, time=5388162916126535; duration=10mins, 16sec
[2016-08-09T18:30:31.201+08:00] [INFO] regionserver.HRegion : Starting compaction on ci in region ad_union:union_click,3487f383ad484bcbb5cef727b69cec2a,1466484980245.c5772fc60c54f64cc977ba9cc01d74ad.
[2016-08-09T18:30:31.201+08:00] [INFO] regionserver.HStore : Starting compaction of 14 file(s) in ci of ad_union:union_click,3487f383ad484bcbb5cef727b69cec2a,1466484980245.c5772fc60c54f64cc977ba9cc01d74ad. into tmpdir=hdfs://athene/hbase/data/ad_union/union_click/c5772fc60c54f64cc977ba9cc01d74ad/.tmp, totalSize=75.0 M
[2016-08-09T18:30:31.206+08:00] [INFO] hfile.CacheConfig : blockCache=org.apache.hadoop.hbase.io.hfile.CombinedBlockCache@52659482, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
[2016-08-09T18:30:32.893+08:00] [INFO] regionserver.ReplicationSource : Log hdfs://athene/hbase/oldWALs/MJQ-HBASE-ATHENE-11139l%2C16020%2C1470729882622.default.1470736612825 was moved to hdfs://athene/hbase/oldWALs/MJQ-HBASE-ATHENE-11139.%2C16020%2C1470729882622.default.1470736612825
[2016-08-09T18:30:34.373+08:00] [INFO] regionserver.HStore : Added hdfs://athene/hbase/data/tjs4/popt_info/7900baab5204e4f36fa49379c30cd584/c/775e8956cd2a48aaae70b9eded4457e9, entries=4336457, sequenceid=582528, filesize=48.7 M
[2016-08-09T18:30:34.373+08:00] [FATAL] regionserver.HRegionServer : ABORTING region server MJQ-HBASE-ATHENE-11139.,16020,1470729882622: Replay of WAL required. Forcing server shutdown
org.apache.hadoop.hbase.DroppedSnapshotException: region: tjs4:popt_info,160608008474430,1470737160711.7900baab5204e4f36fa49379c30cd584.
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2354)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2057)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2019)
    at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1911)
    at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1837)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Compaction is trying to add a bad range.
    at org.apache.hadoop.hbase.regionserver.StripeStoreFileManager$CompactionOrFlushMergeCopy.processNewCandidateStripes(StripeStoreFileManager.java:837)
    at org.apache.hadoop.hbase.regionserver.StripeStoreFileManager$CompactionOrFlushMergeCopy.mergeResults(StripeStoreFileManager.java:672)
    at org.apache.hadoop.hbase.regionserver.StripeStoreFileManager.insertNewFiles(StripeStoreFileManager.java:144)
    at org.apache.hadoop.hbase.regionserver.HStore.updateStorefiles(HStore.java:1052)
    at org.apache.hadoop.hbase.regionserver.HStore.access$500(HStore.java:128)
    at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.commit(HStore.java:2231)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2315)



> What causes "Compaction is trying to add a bad range", and should it stop the
> regionserver?
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-16394
>                 URL: https://issues.apache.org/jira/browse/HBASE-16394
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: 1.1.2
>         Environment: hadoop-2.6.1 hbase-1.1.2
>            Reporter: qgxiaozhan
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
