[ https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13912584#comment-13912584 ]
stack commented on HBASE-10615:
-------------------------------

lgtm

> Make LoadIncrementalHFiles skip reference files
> -----------------------------------------------
>
>                 Key: HBASE-10615
>                 URL: https://issues.apache.org/jira/browse/HBASE-10615
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.96.0
>            Reporter: Jerry He
>            Assignee: Jerry He
>            Priority: Minor
>         Attachments: HBASE-10615-trunk.patch
>
>
> There is a use case where the source of hfiles for LoadIncrementalHFiles is
> a FileSystem copy-out/backup of an HBase table or of archived hfiles. For example:
> 1. Copy-out of hbase.rootdir, table dir, region dir (after disable) or
> archive dir.
> 2. ExportSnapshot
> It is possible that there are reference files in the family dir in these
> cases.
> We have such use cases; when trying to load the files back into HBase, we get
> {code}
> Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://HDFS-AMR/tmp/restoreTemp/117182adfe861c5d2b607da91d60aa8a/info/aed3d01648384b31b29e5bad4cd80bec.d179ab341fc68e7612fcd74eaf7cafbd
>     at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:570)
>     at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:594)
>     at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:636)
>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:472)
>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:393)
>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:391)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:314)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
>     at java.lang.Thread.run(Thread.java:738)
> Caused by: java.lang.IllegalArgumentException: Invalid HFile version: 16715777 (expected to be between 2 and 2)
>     at org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:927)
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:426)
>     at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:568)
> {code}
> It is desirable and safe to skip these reference files since they don't
> contain any real data for bulk load purposes.
>

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
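
The skip requested in the issue description can be sketched by filtering family-dir entries by name before any of them is opened as an HFile. This is only an illustration, not the code in HBASE-10615-trunk.patch: the class `ReferenceFileFilter`, its method names, and the simplified name pattern are all assumptions made for this example, based on the failing file in the trace, whose name has the form `<hfile name>.<region name>`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of HBase.
final class ReferenceFileFilter {

    // Simplified approximation (an assumption, not HBase's exact pattern):
    // a hex store file name, an optional bulk-load "_SeqId_<n>_" suffix,
    // then a dot followed by the name of the referenced region.
    private static final Pattern REF_NAME =
        Pattern.compile("^[0-9a-f]+(?:_SeqId_[0-9]+_)?\\.(.+)$");

    // Returns true when the file name has the shape of a reference file.
    static boolean looksLikeReference(String fileName) {
        return REF_NAME.matcher(fileName).matches();
    }

    // Returns only the names that should be handed to the bulk loader,
    // dropping anything that looks like a reference file.
    static List<String> loadableFiles(List<String> fileNames) {
        List<String> result = new ArrayList<>();
        for (String name : fileNames) {
            if (looksLikeReference(name)) {
                continue; // a reference holds no data of its own, so skip it
            }
            result.add(name);
        }
        return result;
    }
}
```

With this filter, the file named in the stack trace above would be skipped, while a plain store file such as `aed3d01648384b31b29e5bad4cd80bec` would still be loaded.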