[jira] [Commented] (HBASE-20526) multithreads bulkload performance

Ted Yu (JIRA) Sun, 06 May 2018 07:11:10 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465143#comment-16465143
 ]


Ted Yu commented on HBASE-20526:
--------------------------------

The change makes sense.
{code}
  public void doBulkLoad(Path hfofDir, final Admin admin, Table table,
      final Pair<byte[][], byte[][]> startEndKeys) throws TableNotFoundException, IOException {
{code}
Please complete the javadoc for parameters.

> multithreads bulkload performance
> ---------------------------------
>
>                 Key: HBASE-20526
>                 URL: https://issues.apache.org/jira/browse/HBASE-20526
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce, Zookeeper
>    Affects Versions: 1.2.5, 1.3.2
>         Environment: hbase-server-1.2.0-cdh5.12.1 
> spark version 1.6
>            Reporter: Key Hutu
>            Assignee: Key Hutu
>            Priority: Minor
>              Labels: performance
>             Fix For: 1.3.2
>
>         Attachments: HBASE-20526-branch-1.3.V1.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When doing a bulk load, the interaction with ZooKeeper to get the region key 
> range can take a long time.
> In a multithreaded environment, this can take 5 minutes or more.
> In the executor log, lines like 'Reading reply sessionid:0x262fb37f4a07080 , 
> packet:: clientPath:null server ...' appear many times.
>  
> A new bulk load method could be provided so that the caller can cache the key 
> range outside of it.
>  
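
A minimal sketch of the caching idea in the description, using only the Java standard library. It is not the code from the attached patch: `KeyRanges`, `fetchKeyRanges`, `loadAll`, and this two-argument `doBulkLoad` are hypothetical names, and `fetchKeyRanges` only stands in for the real region key range lookup. The point is that the lookup runs once and every worker thread reuses the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class CachedKeyRangeSketch {

    /** Hypothetical stand-in for Pair<byte[][], byte[][]> (region start keys, end keys). */
    static final class KeyRanges {
        final byte[][] startKeys;
        final byte[][] endKeys;

        KeyRanges(byte[][] startKeys, byte[][] endKeys) {
            this.startKeys = startKeys;
            this.endKeys = endKeys;
        }
    }

    /** Counts how many times the (assumed expensive) lookup was executed. */
    static final AtomicInteger lookups = new AtomicInteger();

    /** Hypothetical lookup; a real one would ask the cluster for the region boundaries. */
    static KeyRanges fetchKeyRanges() {
        lookups.incrementAndGet();
        // Two made-up regions: [ "", "m" ) and [ "m", "" ).
        return new KeyRanges(new byte[][] {{}, {'m'}}, new byte[][] {{'m'}, {}});
    }

    /** Hypothetical bulk load that takes pre-fetched key ranges; returns the region count. */
    static int doBulkLoad(String hfofDir, KeyRanges ranges) {
        return ranges.startKeys.length;
    }

    /** Fetches the key ranges once, then shares them with every worker thread. */
    static int loadAll(List<String> dirs, int threads) throws Exception {
        final KeyRanges cached = fetchKeyRanges(); // single lookup, outside the workers
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (final String dir : dirs) {
                Callable<Integer> task = () -> doBulkLoad(dir, cached);
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // propagates any worker failure
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int total = loadAll(Arrays.asList("/tmp/hfiles/a", "/tmp/hfiles/b", "/tmp/hfiles/c"), 2);
        System.out.println("regions seen: " + total + ", lookups: " + lookups.get());
    }
}
```

The cached ranges go stale if regions split or merge during the load, so a caller taking this approach would presumably need to refresh them on failure.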



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
