[ https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336108#comment-14336108 ]
stack commented on HBASE-12949:
-------------------------------

So, we'd be getting IllegalStateException instead of BufferUnderflowException. Will the RS treat the two exceptions differently? Thanks [~jerryhe]

> Scanner can be stuck in infinite loop if the HFile is corrupted
> ---------------------------------------------------------------
>
>                 Key: HBASE-12949
>                 URL: https://issues.apache.org/jira/browse/HBASE-12949
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.3, 0.98.10
>            Reporter: Jerry He
>         Attachments: HBASE-12949-master-v2 (1).patch, HBASE-12949-master-v2.patch, HBASE-12949-master-v2.patch, HBASE-12949-master.patch
>
>
> We've encountered a problem where compaction hangs and never completes.
> After looking into it further, we found that the compaction scanner was stuck in an infinite loop. See stack below.
> {noformat}
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296)
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257)
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697)
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672)
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
> {noformat}
> We identified the hfile that seems to be corrupted. Running the HFile tool on it shows the following:
> {noformat}
> [biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k -m -f /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
> 15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> 15/01/23 11:53:18 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
> 15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
> 15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> Scanning -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
> WARNING, previous row is greater then current row
> filename -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
> previous -> \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00
> current ->
> Exception in thread "main" java.nio.BufferUnderflowException
>     at java.nio.Buffer.nextGetIndex(Buffer.java:489)
>     at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347)
>     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856)
>     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768)
>     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362)
>     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262)
>     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539)
>     at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802)
> {noformat}
> Enabling Java assertions shows the following:
> {noformat}
> Exception in thread "main" java.lang.AssertionError: Key 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0 followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes
>     at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672)
> {noformat}
> This suggests that the hfile is corrupted -- the keys don't seem to be right.
> However, the scanner does not report a meaningful error; instead it gets stuck in an infinite loop here:
> {code}
> KeyValueHeap.generalizedSeek()
> while ((scanner = heap.poll()) != null) {
> }
> {code}
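
To make the IllegalStateException vs. BufferUnderflowException question concrete, here is a minimal sketch of a length check that fails fast with a descriptive message. It is not the attached patch and not HBase code: the class KeyValueLenCheck, its readKeyValueLen method, and the assumed cell layout (a 4-byte key length followed by a 4-byte value length, then the key and value bytes) are all hypothetical names and assumptions chosen for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not part of HBase.
final class KeyValueLenCheck {
    // Assumed layout: 4-byte key length, then 4-byte value length.
    private static final int HEADER_BYTES = 2 * Integer.BYTES;

    /**
     * Reads the key and value lengths at the buffer's current position
     * without advancing it. Throws IllegalStateException with context
     * instead of letting ByteBuffer throw BufferUnderflowException.
     */
    static int[] readKeyValueLen(ByteBuffer block) {
        int remaining = block.remaining();
        if (remaining < HEADER_BYTES) {
            throw new IllegalStateException("Truncated cell header: " + remaining
                + " byte(s) left at position " + block.position());
        }
        int pos = block.position();
        // Absolute reads: the buffer position is left untouched.
        int keyLen = block.getInt(pos);
        int valueLen = block.getInt(pos + Integer.BYTES);
        // Use long arithmetic so corrupt, very large lengths cannot overflow.
        long needed = (long) HEADER_BYTES + keyLen + valueLen;
        if (keyLen < 0 || valueLen < 0 || needed > remaining) {
            throw new IllegalStateException("Invalid keyLen=" + keyLen
                + ", valueLen=" + valueLen + " at position " + pos
                + " with " + remaining + " byte(s) remaining");
        }
        return new int[] {keyLen, valueLen};
    }

    public static void main(String[] args) {
        // A buffer holding only half of the header, as a corrupt block might.
        ByteBuffer truncated = ByteBuffer.allocate(4);
        truncated.putInt(15);
        truncated.flip();
        try {
            readKeyValueLen(truncated);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Both exceptions are unchecked, so a check like this mainly changes the message and the point of failure. Whether the RS handles the two differently further up the call stack is the open question in the comment, and the sketch does not answer it.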