On Mon, May 30, 2016 at 7:03 PM, 聪聪 <[email protected]> wrote:

> HI ALL:       Recently,I met a strange problem,  the first Region’s
> Memstore of one table (the only one) often blocked when flushing.(Both
> Version:  hbase-0.98.6-cdh5.2.0  and 1.0.1.1, I updated 0.98 to
> 1.0.1.1,hope to solve the problem,But failed)
>


Is region always on same machine or do you see this phenomenon on more than
one machine?



>        On the web UI, I can see the status shows:  ABORTED(since XXsec
> ago), Not flushing since already flushing.
>


The RS is ABORTED? Because it can't flush? Is that what it says in the log?
Can we see the log message around the ABORT?



>        But it will never flush success, and the usage of the disk will
> increase very high.Now other regionservers just use 30% of the disk
> capacity, the problematic region server will increase to 100%,unless
> killing the process.
>


100% disk only? The CPU does not go up too?

Can we see a thread dump? Do jstack -l  PID if you can.



>        What’s more, the region server process cannot be shutdown
> normally,every time I have to use the KILL -9 command.
>        I check the log,the reason why cannot flush is one of the
> MemstoreFlusher cannot exiting.
>        The log is like blow:
>        2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13]
> regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
>  2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6]
> regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
>  2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16]
> regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
>  2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33]
> regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
>  2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25]
> regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31]
> regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29]
> regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23]
> regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32]
> regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1]
> regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38]
> regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.10]
> regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.7]
> regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.12]
> regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.21]
> regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting
>  2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.37]
> regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting
>  2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.24]
> regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting
>  2016-05-29 19:54:16,806 INFO  [MemStoreFlusher.39]
> regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting
>  2016-05-29 19:54:16,908 INFO  [MemStoreFlusher.17]
> regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.3]
> regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.27]
> regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.18]
> regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.4]
> regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.19]
> regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.22]
> regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.9]
> regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting
>  2016-05-29 19:54:17,137 INFO  [MemStoreFlusher.36]
> regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting
>  2016-05-29 19:54:17,263 INFO  [MemStoreFlusher.11]
> regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.14]
> regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.2]
> regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.26]
> regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.0]
> regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.30]
> regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.28]
> regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting
>
>
>  here is the jstack log when I try to shutdown the region server process
> normally:
>  we can see the MemstoreFlusher.34 meet some problems, it can not exiting
> normally
>  (By the way, we can see “locked at snappy”, but it not always locked at
> snappy,sometime it will locked at other place,so I think snappy is ok, Here
> I just want to show the MemStoreFlusher meet some problem).
>
>
> Any other special configurations going on on this install? Phoenix or use
of Coprocessors?



>   "MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable
> [0x00007f08941c3000]
>     java.lang.Thread.State: RUNNABLE
>  at
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native
> Method)
>  at
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
>  - locked <0x00007f0a5f85c430> (a
> org.apache.hadoop.io.compress.snappy.SnappyCompressor)
>  at
> org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
>  at
> org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
>  at
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
>  at
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
>  at
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
>  at
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
>  at
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
>  - locked <0x00007f09fdcccda0> (a java.lang.Object)
>  at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
>  at
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
>  at java.lang.Thread.run(Thread.java:744)
>
>

If you thread dump a few times, it is always stuck here?

Thanks,
St.Ack



>
>  "MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on
> condition [0x00007f08942c5000]
>     java.lang.Thread.State: TIMED_WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for  <0x00007f09b8e20f80> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>  at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
>  at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
>  at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
>  at java.lang.Thread.run(Thread.java:744)
>
>
>
>
>
>     I do not know why the memstore cannot flush.
>     Is this a bug?
>
>
>
>    下面是被转发的邮件:
>
>   发件人: quan <[email protected]>
>
>   主题: memstore flush blocked
>
>   日期: 2016年5月30日 GMT+8 上午10:47:43
>
>   收件人: [email protected]
>
>   抄送: 蒲聪 <[email protected]>
>
>
>    HI ALL:       Recently,I met a strange problem,  the first Region’s
> Memstore of one table (the only one) often blocked when flushing.(Both
> Version:  hbase-0.98.6-cdh5.2.0  and 1.0.1.1, I updated 0.98 to
> 1.0.1.1,hope to solve the problem,But failed)
>        On the web UI, I can see the status shows:  ABORTED(since XXsec
> ago), Not flushing since already flushing.
>        But it will never flush success, and the usage of the disk will
> increase very high.Now other regionservers just use 30% of the disk
> capacity, the problematic region server will increase to 100%,unless
> killing the process.
>        What’s more, the region server process cannot be shutdown
> normally,every time I have to use the KILL -9 command.
>        I check the log,the reason why cannot flush is one of the
> MemstoreFlusher cannot exiting.
>        The log is like blow:
>        2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13]
> regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
>  2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6]
> regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
>  2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16]
> regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
>  2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33]
> regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
>  2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25]
> regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31]
> regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29]
> regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23]
> regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32]
> regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1]
> regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38]
> regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting
>  2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.10]
> regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.7]
> regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.12]
> regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting
>  2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.21]
> regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting
>  2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.37]
> regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting
>  2016-05-29 19:54:16,622 INFO  [MemStoreFlusher.24]
> regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting
>  2016-05-29 19:54:16,806 INFO  [MemStoreFlusher.39]
> regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting
>  2016-05-29 19:54:16,908 INFO  [MemStoreFlusher.17]
> regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.3]
> regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.27]
> regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.18]
> regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.4]
> regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.19]
> regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.22]
> regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting
>  2016-05-29 19:54:16,952 INFO  [MemStoreFlusher.9]
> regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting
>  2016-05-29 19:54:17,137 INFO  [MemStoreFlusher.36]
> regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting
>  2016-05-29 19:54:17,263 INFO  [MemStoreFlusher.11]
> regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.14]
> regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.2]
> regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.26]
> regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.0]
> regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.30]
> regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting
>  2016-05-29 19:54:17,499 INFO  [MemStoreFlusher.28]
> regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting
>
>
>  here is the jstack log when I try to shutdown the region server process
> normally:
>  we can see the MemstoreFlusher.34 meet some problems, it can not exiting
> normally
>  (By the way, we can see “locked at snappy”, but it not always locked at
> snappy,sometime it will locked at other place,so I think snappy is ok, Here
> I just want to show the MemStoreFlusher meet some problem).
>
>
>   "MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable
> [0x00007f08941c3000]
>     java.lang.Thread.State: RUNNABLE
>  at
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native
> Method)
>  at
> org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232)
>  - locked <0x00007f0a5f85c430> (a
> org.apache.hadoop.io.compress.snappy.SnappyCompressor)
>  at
> org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149)
>  at
> org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142)
>  at
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238)
>  at
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840)
>  at
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367)
>  at
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
>  at
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
>  at
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70)
>  - locked <0x00007f09fdcccda0> (a java.lang.Object)
>  at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829)
>  at
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
>  at java.lang.Thread.run(Thread.java:744)
>
>
>  "MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on
> condition [0x00007f08942c5000]
>     java.lang.Thread.State: TIMED_WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for  <0x00007f09b8e20f80> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>  at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
>  at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264)
>  at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68)
>  at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233)
>  at java.lang.Thread.run(Thread.java:744)
>
>
>
>
>
>     I do not know why the memstore cannot flush.
>     Is this a bug?

Reply via email to