[ https://issues.apache.org/jira/browse/HDFS-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jing Zhao updated HDFS-7535:
----------------------------
       Resolution: Fixed
    Fix Version/s: 2.7.0
           Status: Resolved  (was: Patch Available)

Thanks again for the review, Nicholas! I've committed this to trunk and branch-2.

> Utilize Snapshot diff report for distcp
> ---------------------------------------
>
>                 Key: HDFS-7535
>                 URL: https://issues.apache.org/jira/browse/HDFS-7535
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: distcp, snapshots
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>             Fix For: 2.7.0
>
>         Attachments: HDFS-7535.000.patch, HDFS-7535.001.patch,
> HDFS-7535.002.patch, HDFS-7535.003.patch, HDFS-7535.004.patch
>
>
> Currently, the HDFS snapshot diff report can identify file/directory creation,
> deletion, rename, and modification under a snapshottable directory. We can use
> the diff report for distcp between the primary cluster and a backup cluster
> to avoid unnecessary data copying. This is especially useful when a large
> directory is renamed in the primary cluster: the current distcp cannot
> detect the rename operation, so the rename usually leads to a large amount
> of real data being copied.
> More details of the approach will come in the first comment.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)