[ https://issues.apache.org/jira/browse/HBASE-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618181#comment-13618181 ]
Chenghao Jiang commented on HBASE-8192:
---------------------------------------

There seems to be another problem, in LoadIncrementalHFiles.java.

LoadIncrementalHFiles.doBulkLoad(Path hfofDir, final HTable table) calls discoverLoadQueue(queue, hfofDir) to build a queue containing all the HFiles. At that point we can check whether every family in the queue exists in the HTable. If one does not, an exception should be thrown, since this bulk load can never succeed, just as TableNotFoundException is thrown when the table does not exist.

Without this check, splitting an HFile whose data belongs to a nonexistent family will fail, because copyHFileHalf() uses information about the family.

> wrong logic in HRegion.bulkLoadHFiles(List)
> -------------------------------------------
>
>                 Key: HBASE-8192
>                 URL: https://issues.apache.org/jira/browse/HBASE-8192
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.94.4, 0.94.5
>            Reporter: Chenghao Jiang
>              Labels: patch
>             Fix For: 0.94.7
>
>         Attachments: 8192.txt, 8192-v2-with-a-test-case.txt,
> 8192-v3-with-a-test-case.txt, 8192-v4-with-a-test-case.txt,
> 8192-v5-with-a-test-case.txt
>
>
> The wrong logic is here:
> when a ColumnFamily does not exist, getStore() returns a null store object, and the code then does
> ioes.add(ioe); failures.add(p)
> But the check below, if (failures.size() != 0), logs a warning and
> returns false, so execution never reaches the if (ioes.size() != 0) check after it.
> No IOException is thrown, and the client keeps retrying forever.
> The same thing happens with store.assertBulkLoadHFileOk: if any
> WrongRegionException is caught and failures.add(p) runs, every other
> IOException thrown by assertBulkLoadHFileOk is ignored.
> So I think if (failures.size() != 0) {} should be handled after
> if (ioes.size() != 0) {}.
> {code}
> for (Pair<byte[], String> p : familyPaths) {
>   byte[] familyName = p.getFirst();
>   String path = p.getSecond();
>   Store store = getStore(familyName);
>   if (store == null) {
>     IOException ioe = new DoNotRetryIOException(
>         "No such column family " + Bytes.toStringBinary(familyName));
>     ioes.add(ioe);
>     failures.add(p);
>   } else {
>     try {
>       store.assertBulkLoadHFileOk(new Path(path));
>     } catch (WrongRegionException wre) {
>       // recoverable (file doesn't fit in region)
>       failures.add(p);
>     } catch (IOException ioe) {
>       // unrecoverable (hdfs problem)
>       ioes.add(ioe);
>     }
>   }
> }
> // validation failed, bail out before doing anything permanent.
> if (failures.size() != 0) {
>   StringBuilder list = new StringBuilder();
>   for (Pair<byte[], String> p : failures) {
>     list.append("\n").append(Bytes.toString(p.getFirst())).append(" : ")
>         .append(p.getSecond());
>   }
>   // problem when validating
>   LOG.warn("There was a recoverable bulk load failure likely due to a" +
>       " split. These (family, HFile) pairs were not loaded: " + list);
>   return false;
> }
> // validation failed because of some sort of IO problem.
> if (ioes.size() != 0) {
>   LOG.error("There were IO errors when checking if bulk load is ok. " +
>       "throwing exception!");
>   throw MultipleIOException.createIOException(ioes);
> }
> {code}
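
To make the proposed ordering concrete, here is a minimal sketch in plain Java. It is not the HBase code and not the attached patch: the class BulkLoadValidationSketch, the Outcome enum, and the decide method are hypothetical names, and the two lists are simplified stand-ins for the failures and ioes lists in the quoted snippet. It only models the order of the two checks, with unrecoverable errors examined first.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: plain-Java stand-ins, not the HBase classes.
final class BulkLoadValidationSketch {

    /** Hypothetical result type; the quoted HRegion code returns a boolean or throws. */
    enum Outcome {
        PROCEED,      // nothing failed validation
        RETRY_LATER,  // stands in for "log a warning and return false"
        FAIL          // stands in for "throw MultipleIOException.createIOException(ioes)"
    }

    /**
     * Decides what to do after the validation loop has filled both lists.
     * Unrecoverable errors are checked first, so a non-empty failures list
     * can no longer hide them.
     */
    static Outcome decide(List<String> failures, List<Exception> ioes) {
        if (!ioes.isEmpty()) {
            return Outcome.FAIL;
        }
        if (!failures.isEmpty()) {
            return Outcome.RETRY_LATER;
        }
        return Outcome.PROCEED;
    }

    public static void main(String[] args) {
        List<String> failures = new ArrayList<>();
        List<Exception> ioes = new ArrayList<>();

        // A missing column family is recorded in both lists, as in the quoted loop.
        // The family name and path here are made up for the example.
        failures.add("nosuchfamily : /bulk/nosuchfamily/hfile1");
        ioes.add(new IOException("No such column family nosuchfamily"));

        System.out.println(decide(failures, ioes));
    }
}
```

With the original ordering, the missing-family case would take the "return false" branch because failures is non-empty. In this sketch the same input reaches the FAIL branch, which corresponds to throwing the exception so the client stops retrying.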