[ 
https://issues.apache.org/jira/browse/HBASE-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605101#comment-14605101
 ] 

Ted Yu commented on HBASE-13985:
--------------------------------

{code}
       * Skip reference, HFileLink, files starting with "_" and non-valid hfiles.
{code}
The last part of the above comment should be revised to take the new parameter
into account.
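
As a rough illustration of the behavior the revised comment would need to
describe, here is a minimal sketch of the skip logic with a validation flag
applied. The class name, the validateFormat flag and the formatCheck predicate
are hypothetical stand-ins, not the names used in the patch, and the reference
and HFileLink checks are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for the bulk-load file filter; this is not the HBase API.
final class BulkLoadFileFilter {
    // Assumed configuration flag; true keeps the existing validation behavior.
    private final boolean validateFormat;
    // Stand-in for the expensive per-file HFile format check.
    private final Predicate<String> formatCheck;

    BulkLoadFileFilter(boolean validateFormat, Predicate<String> formatCheck) {
        this.validateFormat = validateFormat;
        this.formatCheck = formatCheck;
    }

    /**
     * Skip files starting with "_" and, only when validation is enabled,
     * files that fail the format check.
     */
    List<String> select(List<String> fileNames) {
        List<String> accepted = new ArrayList<>();
        for (String name : fileNames) {
            if (name.startsWith("_")) {
                continue; // marker files are always skipped
            }
            if (validateFormat && !formatCheck.test(name)) {
                continue; // invalid files are skipped only when validation is on
            }
            accepted.add(name);
        }
        return accepted;
    }
}
```

With the flag off, the format check is never invoked, so files that would have
been rejected are passed through to the load step unchanged.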

Do you know what percentage of the total time HFile validation took for your workload?

Thanks

> Add configuration to skip validating HFile format when bulk loading millions 
> of HFiles
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-13985
>                 URL: https://issues.apache.org/jira/browse/HBASE-13985
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.98.13
>            Reporter: Victor Xu
>            Assignee: Victor Xu
>            Priority: Minor
>              Labels: regionserver
>             Fix For: 0.98.14
>
>         Attachments: HBASE-13985.patch
>
>
> When bulk loading millions of HFiles into one HTable, checking the HFile
> format is the most time-consuming phase. Maybe we could use a parallel
> mechanism to increase the speed, but when it comes to millions of HFiles, it
> may still cost dozens of minutes. So I think it's necessary to add an option
> for advanced users to bulk load without checking the HFile format at all.
> Of course, the default value of this option should be true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)