[ https://issues.apache.org/jira/browse/HBASE-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015041#comment-13015041 ]
Ted Yu commented on HBASE-3721:
-------------------------------

LoadIncrementalHFiles may split a StoreFile. The above proposal only works if no such splitting occurs.

> Speedup LoadIncrementalHFiles
> -----------------------------
>
>                 Key: HBASE-3721
>                 URL: https://issues.apache.org/jira/browse/HBASE-3721
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>            Reporter: Ted Yu
>
> From Adam Phelps:
> from the logs it looks like <1% of the hfiles we're loading have to be split.
> Looking at the code for LoadIncrementalHFiles (hbase v0.90.1), I'm actually
> thinking our problem is that this code loads the hfiles sequentially. Our
> largest table has over 2500 regions and the data being loaded is fairly well
> distributed across them, so there end up being around 2500 HFiles for each
> load period. At 1-2 seconds per HFile that means the loading process is very
> time consuming.
> Currently server.bulkLoadHFile() is a blocking call.
> We can utilize ExecutorService to achieve better parallelism.
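The ExecutorService idea from the issue description could be sketched roughly as below. This is only an illustration under stated assumptions, not the HBase implementation: `HFileLoader` is a hypothetical interface standing in for the blocking server.bulkLoadHFile() call, and `ParallelBulkLoadSketch`, `loadAll`, and the `threads` parameter are names invented here. The sketch does not handle HFiles that need splitting; it only reports which loads failed so the caller can deal with them separately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelBulkLoadSketch {

    /** Hypothetical stand-in for the blocking server.bulkLoadHFile() call. */
    public interface HFileLoader {
        void load(String hfilePath) throws Exception;
    }

    /**
     * Submits one blocking load per HFile to a fixed-size thread pool and
     * waits for all of them. Returns the paths whose load threw an exception,
     * in the same order as the input list.
     */
    public static List<String> loadAll(List<String> hfilePaths,
                                       HFileLoader loader,
                                       int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> failed = new ArrayList<>();
        try {
            // Queue every load up front so the pool can run them concurrently.
            List<Future<?>> futures = new ArrayList<>();
            for (String path : hfilePaths) {
                futures.add(pool.submit((Callable<Void>) () -> {
                    loader.load(path);
                    return null;
                }));
            }
            // Block on each result; a failed load surfaces as ExecutionException.
            for (int i = 0; i < futures.size(); i++) {
                try {
                    futures.get(i).get();
                } catch (ExecutionException e) {
                    failed.add(hfilePaths.get(i));
                }
            }
        } finally {
            pool.shutdown();
        }
        return failed;
    }
}
```

With roughly 2500 HFiles at 1-2 seconds each, a pool of N threads would cut wall-clock time by at most a factor of N, assuming the region servers can absorb the concurrent requests. A caller could then process the returned failures sequentially, for example the small fraction of files that must be split first.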