Assuming RS_CLOSE_REGION-hdfs-ix03.se-ix.delta.prod,60020,1424687995350-1 is the thread that got stuck, restarting the server might cause data loss, since some data would not have been flushed.

Cheers

On Sat, Mar 14, 2015 at 2:58 PM, Kristoffer Sjögren <sto...@gmail.com> wrote:

> I think I found the thread that is stuck. Is restarting the server harmless
> in this state?
>
> "RS_CLOSE_REGION-hdfs-ix03.se-ix.delta.prod,60020,1424687995350-1" prio=10 tid=0x00007f75a0008000 nid=0x23ee in Object.wait() [0x00007f757d30b000]
>   java.lang.Thread.State: WAITING (on object monitor)
>     at java.lang.Object.wait(Native Method)
>     at java.lang.Object.wait(Object.java:503)
>     at org.apache.hadoop.hdfs.DFSOutputStream.waitAndQueueCurrentPacket(DFSOutputStream.java:1411)
>     - locked <0x00000007544573e8> (a java.util.LinkedList)
>     at org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:1479)
>     - locked <0x0000000756780218> (a org.apache.hadoop.hdfs.DFSOutputStream)
>     at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:173)
>     at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:116)
>     at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:102)
>     - locked <0x0000000756780218> (a org.apache.hadoop.hdfs.DFSOutputStream)
>     at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
>     at java.io.DataOutputStream.write(DataOutputStream.java:107)
>     - locked <0x00000007543ef268> (a org.apache.hadoop.hdfs.client.HdfsDataOutputStream)
>     at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:1061)
>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:1047)
>     at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:952)
>     at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:935)
>     at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:844)
>     at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:403)
>     at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:1272)
>     at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:835)
>     - locked <0x000000075d8b2110> (a java.lang.Object)
>     at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:746)
>     at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:2348)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1580)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1479)
>     at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:992)
>     at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:956)
>     - locked <0x000000075d97b628> (a java.lang.Object)
>     at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:119)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> On Sat, Mar 14, 2015 at 9:43 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> > bq. flush the region manually using shell?
> >
> > I doubt that would work - you can give it a try.
> > Please take jstack of region server in case you need to restart the server.
> >
> > BTW HBASE-10499 didn't go into 0.94 (maybe it should have). Please consider
> > upgrading.
> >
> > Cheers
> >
> > On Sat, Mar 14, 2015 at 1:30 PM, Kristoffer Sjögren <sto...@gmail.com> wrote:
> >
> > > Hi Ted
> > >
> > > Sorry I forgot to mention, hbase-0.94.6 cdh 4.4.
> > >
> > > Yeah, it was a pretty write intensive scenario that I think triggered it
> > > (importing a lot of datapoints into opentsdb).
> > >
> > > Do I flush the region manually using shell?
> > >
> > > Cheers,
> > > -Kristoffer
> > >
> > > On Sat, Mar 14, 2015 at 9:22 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > >
> > > > Which release of HBase are you using ?
> > > >
> > > > I wonder if your cluster was hit with HBASE-10499.
> > > >
> > > > Cheers
> > > >
> > > > On Sat, Mar 14, 2015 at 1:13 PM, Kristoffer Sjögren <sto...@gmail.com> wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > It seems one of our region servers has been stuck closing a region for
> > > > > almost 22 hours. Puts or gets eventually fail with an exception [1].
> > > > >
> > > > > Is there any safe way to release the region like restarting the region
> > > > > server?
> > > > >
> > > > > Cheers,
> > > > > -Kristoffer
> > > > >
> > > > > [1]
> > > > >
> > > > > 2015-03-14 21:02:24,316 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > > > > Failed to unblock updates for region
> > > > > tsdb,\x00\x00\x9ETU\xAC@
> > > > > \x00\x00\x01\x00\x00\xAD\x00\x00\x05\x00\x00\xA7,1426282871862.4512f92b3d81e9142542d3b458223b63.
> > > > > 'IPC Server handler 9 on 60020' in 60000ms. The region is still busy.
> > > > > 2015-03-14 21:02:24,316 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> > > > > org.apache.hadoop.hbase.RegionTooBusyException: region is flushing
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2731)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:2002)
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:2114)
> > > > >     at sun.reflect.GeneratedMethodAccessor109.invoke(Unknown Source)
> > > > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > > >     at java.lang.reflect.Method.invoke(Method.java:606)
> > > > >     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> > > > >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
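
The stuck close-region thread quoted in this thread can also be located mechanically in a saved jstack dump. The following is a minimal sketch, assuming the dump is plain jstack text in which thread blocks are separated by blank lines and each block starts with the quoted thread name. The names `ThreadDump` and `find_threads` are hypothetical helpers written for illustration; they are not part of HBase or the JDK.

```python
import re
from typing import List, NamedTuple


class ThreadDump(NamedTuple):
    """One thread block from a jstack dump (hypothetical helper type)."""
    name: str
    state: str
    frames: List[str]


def find_threads(dump: str, name_prefix: str, frame_substring: str = "") -> List[ThreadDump]:
    """Return threads whose name starts with name_prefix.

    If frame_substring is non-empty, a thread is kept only when at least
    one of its stack frames contains that substring.
    """
    matches = []
    # Assumption: thread blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", dump.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        # Each thread block is expected to start with the quoted thread name.
        if not lines or not lines[0].startswith('"'):
            continue
        name = lines[0].split('"')[1]
        state = ""
        frames = []
        for line in lines[1:]:
            if line.startswith("java.lang.Thread.State:"):
                state = line.split(":", 1)[1].strip()
            elif line.startswith("at "):
                frames.append(line[3:])
            # Lines such as "- locked <...>" are ignored in this sketch.
        if not name.startswith(name_prefix):
            continue
        if frame_substring and not any(frame_substring in f for f in frames):
            continue
        matches.append(ThreadDump(name, state, frames))
    return matches
```

Calling `find_threads(text, "RS_CLOSE_REGION", "DFSOutputStream")` on the contents of a saved dump would list close-region handler threads that have a DFSOutputStream frame on their stack, which is the pattern seen in the trace quoted in this thread. Comparing the result across two dumps taken some minutes apart is one way to check whether a thread is still sitting in the same place.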