[
https://issues.apache.org/jira/browse/HBASE-28065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762367#comment-17762367
]
Nick Dimiduk commented on HBASE-28065:
--------------------------------------
{noformat}
2023-08-25T01:29:07,243 [regionserver/hostname:60020-shortCompactions-0] ERROR
org.apache.hadoop.hbase.regionserver.CompactSplit: Compaction failed
region=someregion,1662852329076.46bb5107cf8dbd6355811caa601acb87.,
storeName=46bb5107cf8dbd6355811caa601acb87/0, priority=11,
startTime=1692926947046
java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for reader
reader=hdfs://hostname:8020/hbase/data/default/table/46bb5107cf8dbd6355811caa601acb87/0/a3b88280bada4223a4ab170539f92f71,
compression=gz, cacheConf=cacheDataOnRead=true, cacheDataOnWrite=false,
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false,
cacheDataCompressed=false, prefetchOnOpen=false,
firstKey=Optional[rowkey/Put/seqid=0], lastKey=Optional[rowkey/Put/seqid=0],
avgKeyLen=60, avgValueLen=15, entries=188344, length=2529826,
cur=org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$OffheapDecodedExtendedCell@2b6cbce3]
at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:204)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:118)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:692)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:440)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:363)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:64)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:122)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1145)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2287)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:667)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:716)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
~[?:?]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
Caused by: java.io.IOException: Passed in onDiskSizeWithHeader=24745 !=
1805126624, offset=22426, fileContext=[usesHBaseChecksum=true,
checksumType=CRC32C, bytesPerChecksum=16384, blocksize=65536, encoding=NONE,
indexBlockEncoding=NONE, includesMvcc=true, includesTags=false,
compressAlgo=GZ, compressTags=false, cryptoContext=[cipher=NONE keyHash=NONE],
name=a3b88280bada4223a4ab170539f92f71,
cellComparator=org.apache.hadoop.hbase.CellComparatorImpl@7e05e3b5]
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.verifyOnDiskSizeMatchesHeader(HFileBlock.java:1579)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1710)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1503)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1331)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1252)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.readNextDataBlock(HFileReaderImpl.java:754)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$EncodedScanner.next(HFileReaderImpl.java:1520)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:195)
~[hbase-server-2.5-hubspot-20230418.183923-40.jar:2.5-hubspot-SNAPSHOT]
... 13 more
{noformat}
> Corrupt HFile data is mishandled in several cases
> -------------------------------------------------
>
> Key: HBASE-28065
> URL: https://issues.apache.org/jira/browse/HBASE-28065
> Project: HBase
> Issue Type: Bug
> Components: HFile
> Affects Versions: 2.5.2
> Reporter: Nick Dimiduk
> Priority: Major
>
> While riding over a spate of HDFS data corruption issues, we've observed
> several places in the read path that do not fall back to HDFS checksum
> appropriately. These failures manifest during client reads and during
> compactions. Sometimes failure is detected by the fallback
> {{verifyOnDiskSizeMatchesHeader}}, sometimes we attempt to allocate a buffer
> with a negative size, and sometimes we read through to a failure from block
> decompression.
> After code study, I think that all three cases arise from using a block
> header that was read without checksum validation.
> Will post the stack traces in the comments. Not sure if we'll want a
> single patch or multiple.
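
To illustrate the suspected root cause (a size field taken from a block header that was never checksum-validated), here is a minimal, self-contained sketch. It is not HBase code: the class {{HeaderChecksumSketch}}, its methods, and the toy block layout (a 4-byte length, the payload, then a 4-byte CRC32C trailer) are all hypothetical and chosen only for illustration. The real HFile block header and the real fallback to HDFS checksums are more involved.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32C;

// Hypothetical toy model, NOT the HBase implementation.
// Assumed layout: [4-byte payload length][payload][4-byte CRC32C of header + payload]
class HeaderChecksumSketch {
    static final int HEADER_LEN = 4;
    static final int CHECKSUM_LEN = 4;

    // Build a well-formed toy block around the given payload.
    static byte[] buildBlock(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LEN + payload.length + CHECKSUM_LEN);
        buf.putInt(payload.length);
        buf.put(payload);
        CRC32C crc = new CRC32C();
        crc.update(buf.array(), 0, HEADER_LEN + payload.length);
        buf.putInt((int) crc.getValue());
        return buf.array();
    }

    // Trusts the header blindly: a corrupt header yields a garbage size,
    // which a caller might then use to allocate a buffer or compare sizes.
    static int readSizeUnchecked(byte[] block) {
        return ByteBuffer.wrap(block).getInt(0);
    }

    // Validates the checksum over header + payload before using any header field.
    // On mismatch the caller learns the header is untrustworthy and can retry
    // through another verification path (in HBase's case, HDFS checksums).
    static int readSizeValidated(byte[] block) throws IOException {
        int dataLen = block.length - CHECKSUM_LEN;
        CRC32C crc = new CRC32C();
        crc.update(block, 0, dataLen);
        int stored = ByteBuffer.wrap(block).getInt(dataLen);
        if ((int) crc.getValue() != stored) {
            throw new IOException("checksum mismatch; header fields are untrusted");
        }
        return readSizeUnchecked(block);
    }

    public static void main(String[] args) throws IOException {
        byte[] block = buildBlock("some cell data".getBytes());
        System.out.println("clean size: " + readSizeValidated(block));

        block[0] ^= 0x6b; // simulate on-disk corruption inside the size field
        System.out.println("unchecked size from corrupt header: " + readSizeUnchecked(block));
        try {
            readSizeValidated(block);
        } catch (IOException e) {
            System.out.println("validated read rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the unchecked read returns whatever bytes happen to sit in the header, which is the kind of value that could plausibly lead to a size mismatch, a negative buffer allocation, or a later decompression failure, while the validated read fails early and leaves room for a fallback.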