On Mon, May 30, 2016 at 7:03 PM, 聪聪 <[email protected]> wrote: > HI ALL: Recently,I met a strange problem, the first Region’s > Memstore of one table (the only one) often blocked when flushing.(Both > Version: hbase-0.98.6-cdh5.2.0 and 1.0.1.1, I updated 0.98 to > 1.0.1.1,hope to solve the problem,But failed) >
Is region always on same machine or do you see this phenomenon on more than one machine? > On the web UI, I can see the status shows: ABORTED(since XXsec > ago), Not flushing since already flushing. > The RS is ABORTED? Because it can't flush? Is that what it says in the log? Can we see the log message around the ABORT? > But it will never flush success, and the usage of the disk will > increase very high.Now other regionservers just use 30% of the disk > capacity, the problematic region server will increase to 100%,unless > killing the process. > 100% disk only? The CPU does not go up too? Can we see a thread dump? Do jstack -l PID if you can. > What’s more, the region server process cannot be shutdown > normally,every time I have to use the KILL -9 command. > I check the log,the reason why cannot flush is one of the > MemstoreFlusher cannot exiting. > The log is like blow: > 2016-05-29 19:54:11,982 INFO [MemStoreFlusher.13] > regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting > 2016-05-29 19:54:13,016 INFO [MemStoreFlusher.6] > regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting > 2016-05-29 19:54:13,260 INFO [MemStoreFlusher.16] > regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting > 2016-05-29 19:54:16,032 INFO [MemStoreFlusher.33] > regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting > 2016-05-29 19:54:16,341 INFO [MemStoreFlusher.25] > regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting > 2016-05-29 19:54:16,620 INFO [MemStoreFlusher.31] > regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.29] > regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.23] > regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.32] > regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.1] > regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.38] > regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.10] > regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting > 2016-05-29 19:54:16,620 INFO [MemStoreFlusher.7] > regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting > 2016-05-29 19:54:16,620 INFO [MemStoreFlusher.12] > regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting > 2016-05-29 19:54:16,620 INFO [MemStoreFlusher.21] > regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting > 2016-05-29 19:54:16,622 INFO [MemStoreFlusher.37] > regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting > 2016-05-29 19:54:16,622 INFO [MemStoreFlusher.24] > regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting > 2016-05-29 19:54:16,806 INFO [MemStoreFlusher.39] > regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting > 2016-05-29 19:54:16,908 INFO [MemStoreFlusher.17] > regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.3] > regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.27] > regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.18] > regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.4] > regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.19] > regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.22] > regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.9] > regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting > 2016-05-29 19:54:17,137 INFO [MemStoreFlusher.36] > regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting > 2016-05-29 19:54:17,263 INFO [MemStoreFlusher.11] > regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.14] > regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.2] > regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.26] > regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.0] > regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.30] > regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.28] > regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting > > > here is the jstack log when I try to shutdown the region server process > normally: > we can see the MemstoreFlusher.34 meet some problems, it can not exiting > normally > (By the way, we can see “locked at snappy”, but it not always locked at > snappy,sometime it will locked at other place,so I think snappy is ok, Here > I just want to show the MemStoreFlusher meet some problem). > > > Any other special configurations going on on this install? Phoenix or use of Coprocessors? > "MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable > [0x00007f08941c3000] > java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native > Method) > at > org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232) > - locked <0x00007f0a5f85c430> (a > org.apache.hadoop.io.compress.snappy.SnappyCompressor) > at > org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149) > at > org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142) > at > org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238) > at > org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840) > at > org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367) > at > org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974) > at > org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69) > at > org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70) > - locked <0x00007f09fdcccda0> (a java.lang.Object) > at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829) > at > org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662) > at > org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253) > at java.lang.Thread.run(Thread.java:744) > > If you thread dump a few times, it is always stuck here? Thanks, St.Ack > > "MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on > condition [0x00007f08942c5000] > java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00007f09b8e20f80> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082) > at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264) > at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233) > at java.lang.Thread.run(Thread.java:744) > > > > > > I do not know why the memstore cannot flush. > Is this a bug? > > > > 下面是被转发的邮件: > > 发件人: quan <[email protected]> > > 主题: memstore flush blocked > > 日期: 2016年5月30日 GMT+8 上午10:47:43 > > 收件人: [email protected] > > 抄送: 蒲聪 <[email protected]> > > > HI ALL: Recently,I met a strange problem, the first Region’s > Memstore of one table (the only one) often blocked when flushing.(Both > Version: hbase-0.98.6-cdh5.2.0 and 1.0.1.1, I updated 0.98 to > 1.0.1.1,hope to solve the problem,But failed) > On the web UI, I can see the status shows: ABORTED(since XXsec > ago), Not flushing since already flushing. > But it will never flush success, and the usage of the disk will > increase very high.Now other regionservers just use 30% of the disk > capacity, the problematic region server will increase to 100%,unless > killing the process. > What’s more, the region server process cannot be shutdown > normally,every time I have to use the KILL -9 command. > I check the log,the reason why cannot flush is one of the > MemstoreFlusher cannot exiting. > The log is like blow: > 2016-05-29 19:54:11,982 INFO [MemStoreFlusher.13] > regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting > 2016-05-29 19:54:13,016 INFO [MemStoreFlusher.6] > regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting > 2016-05-29 19:54:13,260 INFO [MemStoreFlusher.16] > regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting > 2016-05-29 19:54:16,032 INFO [MemStoreFlusher.33] > regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting > 2016-05-29 19:54:16,341 INFO [MemStoreFlusher.25] > regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting > 2016-05-29 19:54:16,620 INFO [MemStoreFlusher.31] > regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.29] > regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.23] > regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.32] > regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.1] > regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.38] > regionserver.MemStoreFlusher: MemStoreFlusher.38 exiting > 2016-05-29 19:54:16,621 INFO [MemStoreFlusher.10] > regionserver.MemStoreFlusher: MemStoreFlusher.10 exiting > 2016-05-29 19:54:16,620 INFO [MemStoreFlusher.7] > regionserver.MemStoreFlusher: MemStoreFlusher.7 exiting > 2016-05-29 19:54:16,620 INFO [MemStoreFlusher.12] > regionserver.MemStoreFlusher: MemStoreFlusher.12 exiting > 2016-05-29 19:54:16,620 INFO [MemStoreFlusher.21] > regionserver.MemStoreFlusher: MemStoreFlusher.21 exiting > 2016-05-29 19:54:16,622 INFO [MemStoreFlusher.37] > regionserver.MemStoreFlusher: MemStoreFlusher.37 exiting > 2016-05-29 19:54:16,622 INFO [MemStoreFlusher.24] > regionserver.MemStoreFlusher: MemStoreFlusher.24 exiting > 2016-05-29 19:54:16,806 INFO [MemStoreFlusher.39] > regionserver.MemStoreFlusher: MemStoreFlusher.39 exiting > 2016-05-29 19:54:16,908 INFO [MemStoreFlusher.17] > regionserver.MemStoreFlusher: MemStoreFlusher.17 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.3] > regionserver.MemStoreFlusher: MemStoreFlusher.3 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.27] > regionserver.MemStoreFlusher: MemStoreFlusher.27 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.18] > regionserver.MemStoreFlusher: MemStoreFlusher.18 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.4] > regionserver.MemStoreFlusher: MemStoreFlusher.4 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.19] > regionserver.MemStoreFlusher: MemStoreFlusher.19 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.22] > regionserver.MemStoreFlusher: MemStoreFlusher.22 exiting > 2016-05-29 19:54:16,952 INFO [MemStoreFlusher.9] > regionserver.MemStoreFlusher: MemStoreFlusher.9 exiting > 2016-05-29 19:54:17,137 INFO [MemStoreFlusher.36] > regionserver.MemStoreFlusher: MemStoreFlusher.36 exiting > 2016-05-29 19:54:17,263 INFO [MemStoreFlusher.11] > regionserver.MemStoreFlusher: MemStoreFlusher.11 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.14] > regionserver.MemStoreFlusher: MemStoreFlusher.14 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.2] > regionserver.MemStoreFlusher: MemStoreFlusher.2 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.26] > regionserver.MemStoreFlusher: MemStoreFlusher.26 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.0] > regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.30] > regionserver.MemStoreFlusher: MemStoreFlusher.30 exiting > 2016-05-29 19:54:17,499 INFO [MemStoreFlusher.28] > regionserver.MemStoreFlusher: MemStoreFlusher.28 exiting > > > here is the jstack log when I try to shutdown the region server process > normally: > we can see the MemstoreFlusher.34 meet some problems, it can not exiting > normally > (By the way, we can see “locked at snappy”, but it not always locked at > snappy,sometime it will locked at other place,so I think snappy is ok, Here > I just want to show the MemStoreFlusher meet some problem). > > > "MemStoreFlusher.34" prio=10 tid=0x00007f20b8939000 nid=0x511b runnable > [0x00007f08941c3000] > java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect(Native > Method) > at > org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress(SnappyCompressor.java:232) > - locked <0x00007f0a5f85c430> (a > org.apache.hadoop.io.compress.snappy.SnappyCompressor) > at > org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:149) > at > org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:142) > at > org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncoding(HFileBlockDefaultEncodingContext.java:238) > at > org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.compressAfterEncodingWithBlockType(HFileBlockDefaultEncodingContext.java:149) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlock(HFileBlock.java:783) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.ensureBlockReady(HFileBlock.java:761) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.finishBlockAndWriteHeaderAndData(HFileBlock.java:878) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:864) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:948) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:931) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:840) > at > org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:367) > at > org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974) > at > org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69) > at > org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:70) > - locked <0x00007f09fdcccda0> (a java.lang.Object) > at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:829) > at > org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1989) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1779) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1662) > at > org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1575) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:461) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253) > at java.lang.Thread.run(Thread.java:744) > > > "MemStoreFlusher.33" prio=10 tid=0x00007f20b8937000 nid=0x511a waiting on > condition [0x00007f08942c5000] > java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00007f09b8e20f80> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082) > at java.util.concurrent.DelayQueue.poll(DelayQueue.java:264) > at java.util.concurrent.DelayQueue.poll(DelayQueue.java:68) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:233) > at java.lang.Thread.run(Thread.java:744) > > > > > > I do not know why the memstore cannot flush. > Is this a bug?
