[ 
https://issues.apache.org/jira/browse/HBASE-20579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-20579:
---------------------------------
    Release Note: 
This patch adds FSUtil.copyFilesParallel() to copy files in parallel. It 
returns the paths of all directories and files traversed. When ExportSnapshot 
copies the manifest, it can therefore copy reference files concurrently and 
use the returned paths for setOwner and setPermission. 
The size of the thread pool is set by the configuration 
snapshot.export.copy.references.threads, and its default value is the number 
of available runtime processors.
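
A rough sketch of the approach described above, not the patch itself. It uses 
java.nio.file on a local filesystem instead of Hadoop's FileSystem API. The 
class name ParallelCopySketch, the method signature, and the use of a JVM 
system property as a stand-in for the HBase configuration lookup are all 
assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration; not the HBase implementation.
public class ParallelCopySketch {

  // Copies src to dst with a fixed-size pool and returns every
  // destination path (directories and files) that was traversed.
  public static List<Path> copyFilesParallel(Path src, Path dst, int threads)
      throws IOException, InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Path> sources;
      try (Stream<Path> walk = Files.walk(src)) {
        sources = walk.collect(Collectors.toList());
      }
      List<Path> traversed = new ArrayList<>();
      List<Future<?>> pending = new ArrayList<>();
      for (Path s : sources) {
        Path target = dst.resolve(src.relativize(s).toString());
        traversed.add(target);
        if (Files.isDirectory(s)) {
          // Files.walk is pre-order, so a directory is created here
          // before any copy task for its children is submitted.
          Files.createDirectories(target);
        } else {
          pending.add(pool.submit(() -> {
            Files.copy(s, target, StandardCopyOption.REPLACE_EXISTING);
            return null;
          }));
        }
      }
      // Wait for all file copies and surface the first failure.
      for (Future<?> f : pending) {
        try {
          f.get();
        } catch (ExecutionException e) {
          throw new IOException("copy failed", e.getCause());
        }
      }
      return traversed;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for the configuration lookup: a system property with the
    // same name, defaulting to the number of available processors.
    int threads = Integer.getInteger(
        "snapshot.export.copy.references.threads",
        Runtime.getRuntime().availableProcessors());

    Path src = Files.createTempDirectory("manifest-src");
    Files.createDirectories(src.resolve("region"));
    Files.write(src.resolve("region").resolve("ref"), "r".getBytes());
    Path dst = Files.createTempDirectory("manifest-dst").resolve("copy");

    List<Path> traversed = copyFilesParallel(src, dst, threads);
    // The returned paths are what a caller would then pass to the
    // owner and permission updates.
    for (Path p : traversed) {
      System.out.println(p);
    }
  }
}
```

Directories are created on the calling thread and only file copies go to the 
pool, which keeps the ordering simple: a parent directory always exists before 
a copy into it starts.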

> Improve snapshot manifest copy in ExportSnapshot
> ------------------------------------------------
>
>                 Key: HBASE-20579
>                 URL: https://issues.apache.org/jira/browse/HBASE-20579
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 1.4.0, 1.5.0, 2.0.0
>            Reporter: Jingyun Tian
>            Assignee: Jingyun Tian
>            Priority: Minor
>             Fix For: 3.0.0, 2.1.0
>
>         Attachments: HBASE-20579.master.001.patch, 
> HBASE-20579.master.002.patch
>
>
> ExportSnapshot needs to copy the snapshot manifest to the destination 
> cluster first, then call setOwner and setPermission on those paths. But this 
> is done with a single thread, which makes job submission take a long time if 
> the snapshot is big. I made these steps run in parallel, which can 
> dramatically reduce the total submission time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
