Xiaolin Ha created HBASE-25507:
----------------------------------
Summary: Leak of ESTABLISHED sockets when encountered
"java.io.IOException: Invalid HFile block magic"
Key: HBASE-25507
URL: https://issues.apache.org/jira/browse/HBASE-25507
Project: HBase
Issue Type: Improvement
Reporter: Xiaolin Ha
Assignee: Xiaolin Ha
Attachments: errorlogs.png,
increasing-of-established-sockets-image.png, problem-region-move-logs.png
Recently, we found socket leaks on our production cluster. The leaked sockets
are in the ESTABLISHED state. From our analysis of monitoring metrics and logs,
the leak occurs on the RS that hosts one particular region. RSs that do not
host this region work normally.
On the RS that hosts this region, we found the following exceptions:
{code:java}
java.io.IOException: java.io.IOException: Could not seek StoreFileScanner[org.apache.hadoop.hbase.io.HalfStoreFileReader$1@5388be2f, cur=null] to key org.apache.hadoop.hbase.CellUtil$FirstOnRowDeleteFamilyCell@25aa56fd
	at org.apache.hadoop.hbase.regionserver.StoreScanner.parallelSeek(StoreScanner.java:1128)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:437)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:329)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:302)
	at org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:806)
	at org.apache.hadoop.hbase.regionserver.compactions.StripeCompactor$StripeInternalScannerFactory.createScanner(StripeCompactor.java:82)
	at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:316)
	at org.apache.hadoop.hbase.regionserver.compactions.StripeCompactor.compact(StripeCompactor.java:120)
	at org.apache.hadoop.hbase.regionserver.compactions.StripeCompactionPolicy$SplitStripeCompactionRequest.execute(StripeCompactionPolicy.java:662)
	at org.apache.hadoop.hbase.regionserver.StripeStoreEngine$StripeCompaction.compact(StripeStoreEngine.java:114)
	at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1461)
	at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2121)
	at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:519)
	at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:555)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Could not seek StoreFileScanner[org.apache.hadoop.hbase.io.HalfStoreFileReader$1@5388be2f, cur=null] to key org.apache.hadoop.hbase.CellUtil$FirstOnRowDeleteFamilyCell@25aa56fd
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:229)
	at org.apache.hadoop.hbase.regionserver.handler.ParallelSeekHandler.process(ParallelSeekHandler.java:56)
	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
	... 3 more
Caused by: java.io.IOException: Invalid HFile block magic: \x00\x00\x00\x00\x00\x00\x00\x00
	at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:159)
	at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:171)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock.createFromBuff(HFileBlock.java:333)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1753)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1552)
	at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:539)
	at org.apache.hadoop.hbase.io.hfile.HFileScannerImpl.readAndUpdateNewBlock(HFileScannerImpl.java:737)
	at org.apache.hadoop.hbase.io.hfile.HFileScannerImpl.seekTo(HFileScannerImpl.java:726)
	at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:161)
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:315)
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:211)
	... 5 more
{code}
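The innermost cause in the trace is a failed block-magic check: the 8 bytes read where a block header was expected were all zero. The following is a simplified sketch of that kind of check, not HBase's actual BlockType code. The class name BlockMagicSketch and its parse method are hypothetical, and only one magic value ("DATABLK*", the data block magic) is listed for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, simplified stand-in for a block-magic check.
class BlockMagicSketch {
    // Assumption: a single known magic; a real HFile has one per block type.
    static final byte[] DATA_MAGIC = "DATABLK*".getBytes(StandardCharsets.US_ASCII);

    // Returns a block type name, or throws if the 8 bytes at offset are unknown.
    static String parse(byte[] buf, int offset) throws IOException {
        byte[] magic = Arrays.copyOfRange(buf, offset, offset + DATA_MAGIC.length);
        if (Arrays.equals(magic, DATA_MAGIC)) {
            return "DATA";
        }
        // Render the unexpected bytes as \xNN escapes, as in the logged message.
        StringBuilder sb = new StringBuilder();
        for (byte b : magic) {
            sb.append(String.format("\\x%02X", b));
        }
        throw new IOException("Invalid HFile block magic: " + sb);
    }
}
```

Calling BlockMagicSketch.parse(new byte[8], 0) throws an IOException whose message matches the one in the trace, which is consistent with the reader having received a zero-filled buffer instead of a block header.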
The number of ESTABLISHED sockets keeps increasing, as shown in the attached images:
!increasing-of-established-sockets-image.png!
!problem-region-move-logs.png!
!errorlogs.png!
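For anyone who wants to reproduce the socket-count observation on an RS host without a metrics system, the following is a minimal sketch. It assumes Linux and reads /proc/net/tcp, where the state column value 01 means ESTABLISHED. The class name EstablishedSocketCounter is hypothetical and is not part of HBase. IPv6 connections are listed separately in /proc/net/tcp6 and are not counted here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: counts ESTABLISHED TCP sockets from /proc/net/tcp lines.
class EstablishedSocketCounter {
    // Each data line looks like: "sl local_address rem_address st ...";
    // after splitting on whitespace, index 3 is the hex state ("01" = ESTABLISHED).
    static long countEstablished(List<String> procNetTcpLines) {
        return procNetTcpLines.stream()
                .map(line -> line.trim().split("\\s+"))
                .filter(fields -> fields.length > 3 && fields[3].equals("01"))
                .count();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: running on Linux, where this file exists.
        List<String> lines = Files.readAllLines(Paths.get("/proc/net/tcp"));
        System.out.println("ESTABLISHED sockets: " + countEstablished(lines));
    }
}
```

Sampling this count periodically on the affected RS and on a healthy RS would show whether the growth is specific to the RS hosting the problem region.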