Currently, when bulk loading from a webhdfs filesystem, files are copied rather 
than renamed even if they reside on the same cluster [1]. The copy is 
unnecessary in that case and makes the bulk load slower than it needs to be.



It seems that the configured webhdfs namenodes could be compared against the 
namenodes of the cluster being bulk loaded to; if they are the same, the bulk 
loaded files could be renamed rather than copied.
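
To make the idea concrete, here is a rough sketch of the comparison using only 
java.net.URI. This is not HBase code: the class name SameClusterCheck, the 
method names, and the idea of passing the destination namenode hosts in as a 
plain set are all hypothetical, and a real patch would need to read those hosts 
from the cluster configuration and handle HA nameservices. Ports are ignored on 
purpose, on the assumption that the webhdfs HTTP port differs from the namenode 
RPC port.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of HBase or Hadoop.
final class SameClusterCheck {
    private SameClusterCheck() {}

    /** Returns the lower-cased host of the URI, or null if it has none. */
    static String hostOf(URI uri) {
        String host = uri.getHost();
        return host == null ? null : host.toLowerCase(Locale.ROOT);
    }

    /**
     * Returns true if the source is a webhdfs/swebhdfs URI whose host is one
     * of the destination cluster's namenode hosts.
     *
     * Assumption: destNameNodeHosts holds lower-cased host names that the
     * caller obtained from the destination cluster's configuration.
     */
    static boolean sameCluster(URI source, Set<String> destNameNodeHosts) {
        String scheme = source.getScheme();
        if (scheme == null) {
            return false;
        }
        String s = scheme.toLowerCase(Locale.ROOT);
        if (!s.equals("webhdfs") && !s.equals("swebhdfs")) {
            return false;
        }
        String host = hostOf(source);
        return host != null && destNameNodeHosts.contains(host);
    }
}
```

If sameCluster(...) returned true, the bulk load could take the rename path; 
otherwise it would fall back to the existing copy.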



I was able to locate a JIRA comment bringing up this use case [2], but wasn't 
able to find a follow-up comment or a JIRA with a resolution.



If this issue and proposed solution are acceptable, I would be happy to log a 
JIRA and work on a patch. Please let me know how to proceed.



[1] 
https://github.com/apache/hbase/blob/rel/2.1.2/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SecureBulkLoadManager.java#L369-L383

[2] 
https://issues.apache.org/jira/browse/HBASE-8304?focusedCommentId=13923197&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13923197


