[ 
https://issues.apache.org/jira/browse/HBASE-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030224#comment-13030224
 ] 

Hudson commented on HBASE-3721:
-------------------------------

Integrated in HBase-TRUNK #1909 (see
[https://builds.apache.org/hudson/job/HBase-TRUNK/1909/])


> Speedup LoadIncrementalHFiles
> -----------------------------
>
>                 Key: HBASE-3721
>                 URL: https://issues.apache.org/jira/browse/HBASE-3721
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 0.92.0
>
>         Attachments: 3721-v2.txt, 3721-v3.txt, 3721-v4.txt, 3721-v6.patch, 
> 3721.txt, LoadIncrementalHFiles.java
>
>
> From Adam Phelps:
> From the logs it looks like <1% of the HFiles we're loading have to be split. 
>  Looking at the code for LoadIncrementalHFiles (hbase v0.90.1), I'm actually 
> thinking our problem is that this code loads the HFiles sequentially.  Our 
> largest table has over 2500 regions and the data being loaded is fairly well 
> distributed across them, so there end up being around 2500 HFiles for each 
> load period.  At 1-2 seconds per HFile that means the loading process is very 
> time consuming.
> Currently server.bulkLoadHFile() is a blocking call.
> We can use an ExecutorService to load HFiles in parallel and make better use 
> of multi-core machines.
> A new configuration parameter, "hbase.loadincremental.threads.max", is 
> introduced; it sets the maximum number of threads for parallel bulk load.
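
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the HFileLoader interface, the loadAll method, and the class name are hypothetical stand-ins, and the sleep in main only simulates a blocking per-HFile call. The thread limit is passed in as a plain int; in HBase it would presumably be read from "hbase.loadincremental.threads.max".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelBulkLoadSketch {

    /** Hypothetical stand-in for the blocking per-HFile load call; not the HBase API. */
    interface HFileLoader {
        void load(String hfilePath) throws Exception;
    }

    /** Loads every HFile with at most maxThreads workers and returns how many were loaded. */
    static int loadAll(List<String> hfilePaths, HFileLoader loader, int maxThreads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            // One task per HFile; each task wraps a single blocking load.
            List<Callable<String>> tasks = new ArrayList<>();
            for (String path : hfilePaths) {
                tasks.add(() -> {
                    loader.load(path);
                    return path;
                });
            }
            int loaded = 0;
            // invokeAll blocks until every task has completed.
            for (Future<String> result : pool.invokeAll(tasks)) {
                result.get(); // rethrows a worker failure as ExecutionException
                loaded++;
            }
            return loaded;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> hfiles = Arrays.asList("hfile-a", "hfile-b", "hfile-c", "hfile-d");
        // Assumed default for this sketch only: one worker per available core.
        int maxThreads = Runtime.getRuntime().availableProcessors();
        // The sleep simulates a blocking load that takes some time per HFile.
        int loaded = loadAll(hfiles, path -> Thread.sleep(100), maxThreads);
        System.out.println("loaded " + loaded + " HFiles");
    }
}
```

With a fixed-size pool, the wall-clock time for N blocking loads drops from roughly N times the per-file latency toward N divided by the thread count, provided the region servers can absorb the concurrent requests.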

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
