[ https://issues.apache.org/jira/browse/HBASE-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605101#comment-14605101 ]
Ted Yu commented on HBASE-13985:
--------------------------------

{code}
179    * Skip reference, HFileLink, files starting with "_" and non-valid hfiles.
{code}
The last part of the above comment should be revised to take the new parameter into account.

Do you know what percentage of the time HFile validation took for your workload?

Thanks

> Add configuration to skip validating HFile format when bulk loading millions of HFiles
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-13985
>                 URL: https://issues.apache.org/jira/browse/HBASE-13985
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.98.13
>            Reporter: Victor Xu
>            Assignee: Victor Xu
>            Priority: Minor
>              Labels: regionserver
>             Fix For: 0.98.14
>
>         Attachments: HBASE-13985.patch
>
>
> When bulk loading millions of HFiles into one HTable, checking the HFile format is
> the most time-consuming phase. Maybe we could use a parallel mechanism to
> increase the speed, but when it comes to millions of HFiles, it may still
> cost dozens of minutes. So I think it's necessary to add an option for
> advanced users to bulk load without checking the HFile format at all.
> Of course, the default value of this option should be true.
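
For readers following the thread, here is a rough sketch of the kind of switch the description asks for. It is not the attached patch: the class, the method `selectFiles`, the property key `bulkload.validate.hfile.format`, and the validity predicate are all hypothetical, and `java.util.Properties` stands in for the HBase configuration object. It also assumes that "default true" means validation stays enabled unless an advanced user turns it off. The reference and HFileLink checks mentioned in the quoted code comment are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.function.Predicate;

public class BulkLoadValidationSketch {
    // Hypothetical property key, chosen for illustration only.
    static final String VALIDATE_KEY = "bulkload.validate.hfile.format";

    /**
     * Returns the file names that would be queued for bulk load.
     * Files starting with "_" are always skipped. Files that fail the
     * format check are skipped only when validation is enabled.
     */
    static List<String> selectFiles(List<String> names,
                                    Properties conf,
                                    Predicate<String> isValidHFile) {
        // Assumed default: validation on, so existing behavior is unchanged.
        boolean validate =
            Boolean.parseBoolean(conf.getProperty(VALIDATE_KEY, "true"));

        List<String> accepted = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith("_")) {
                continue; // marker files such as _SUCCESS
            }
            // The potentially expensive per-file check runs only when enabled.
            if (validate && !isValidHFile.test(name)) {
                continue;
            }
            accepted.add(name);
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Stand-in validity check: names ending in ".bad" count as invalid.
        Predicate<String> isValid = n -> !n.endsWith(".bad");
        List<String> names = List.of("_SUCCESS", "hfile1", "hfile2.bad");

        Properties conf = new Properties();
        System.out.println(selectFiles(names, conf, isValid));

        conf.setProperty(VALIDATE_KEY, "false");
        System.out.println(selectFiles(names, conf, isValid));
    }
}
```

With validation disabled, a malformed file is no longer filtered out up front, so the option only suits workloads where the HFiles are already known to be well formed.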