[ 
https://issues.apache.org/jira/browse/HDFS-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15530595#comment-15530595
 ] 

Yongjun Zhang commented on HDFS-10314:
--------------------------------------

I had a discussion with [~jingzhao], and we reached the following agreement:

1. For now, he will be fine with option 2 stated in

https://issues.apache.org/jira/browse/HDFS-10314?focusedCommentId=15524359&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15524359

as long as we document it well, even though it's not his favorite. In that 
case, we can continue to work on HDFS-9820. 

2. When creating a new tool in the future (HDFS-10314), we need to do the 
following: 
* Refactor the DistCp code to separate the snapshot sync part (which handles 
rename/delete per snapshot diff) and the copyList calculation part into their 
own class, e.g., DistCpPrepare. 
* Let both DistCp and DistSync call DistCpPrepare for the functionality they 
need.
* Modify DistCp to take an optional new argument, copyListing.
* Let DistSync call DistCpPrepare to do the snapshot sync part and the 
copyListing creation part, and then pass the copyListing to DistCp.
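
To make the proposed split concrete, here is a rough sketch of how the three 
pieces could fit together. It is only an illustration under simplifying 
assumptions: it does not use any Hadoop classes, a snapshot diff is modeled as 
a plain list of entries, and every method name and signature below 
(snapshotSync, buildCopyListing, run, DiffEntry) is hypothetical. Only the 
class names DistCpPrepare, DistCp and DistSync come from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the proposed refactoring.
// None of these signatures are taken from the actual Hadoop code base.

/** One snapshot-diff entry; type is "RENAME", "DELETE", "CREATE" or "MODIFY". */
final class DiffEntry {
    final String type;
    final String path;
    final String target; // only used for RENAME, otherwise null

    DiffEntry(String type, String path, String target) {
        this.type = type;
        this.path = path;
        this.target = target;
    }
}

/** Shared preparation logic: snapshot sync plus copy-listing calculation. */
class DistCpPrepare {
    /** Returns the rename/delete operations the diff implies, in diff order. */
    List<String> snapshotSync(List<DiffEntry> diff) {
        List<String> ops = new ArrayList<>();
        for (DiffEntry e : diff) {
            if (e.type.equals("RENAME")) {
                ops.add("rename " + e.path + " -> " + e.target);
            } else if (e.type.equals("DELETE")) {
                ops.add("delete " + e.path);
            }
        }
        return ops;
    }

    /** Returns the paths whose contents still have to be copied. */
    List<String> buildCopyListing(List<DiffEntry> diff) {
        List<String> listing = new ArrayList<>();
        for (DiffEntry e : diff) {
            if (e.type.equals("CREATE") || e.type.equals("MODIFY")) {
                listing.add(e.path);
            }
        }
        return listing;
    }
}

/** DistCp with an optional, precomputed copyListing argument. */
class DistCp {
    private final DistCpPrepare prepare = new DistCpPrepare();

    /** A null copyListing means "compute it here via DistCpPrepare". */
    List<String> run(List<DiffEntry> diff, List<String> copyListing) {
        List<String> listing =
            (copyListing != null) ? copyListing : prepare.buildCopyListing(diff);
        // A real tool would launch the copy job; the sketch just reports
        // which paths would be copied.
        return listing;
    }
}

/** The new tool: prepares everything itself, then delegates the copy to DistCp. */
class DistSync {
    private final DistCpPrepare prepare = new DistCpPrepare();
    private final DistCp distCp = new DistCp();
    List<String> lastSyncOps = new ArrayList<>();

    List<String> run(List<DiffEntry> diff) {
        lastSyncOps = prepare.snapshotSync(diff);                   // sync part
        List<String> listing = prepare.buildCopyListing(diff);      // copyListing part
        return distCp.run(diff, listing);                           // hand over to DistCp
    }
}

class DistSyncDemo {
    public static void main(String[] args) {
        List<DiffEntry> diff = List.of(
            new DiffEntry("RENAME", "/a", "/b"),
            new DiffEntry("DELETE", "/c", null),
            new DiffEntry("MODIFY", "/d", null));
        DistSync sync = new DistSync();
        System.out.println("copied:   " + sync.run(diff));
        System.out.println("sync ops: " + sync.lastSyncOps);
    }
}
```

In this sketch DistCp still works on its own when no copyListing is passed, 
which is the point of making the new argument optional.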

Please feel free to correct or add to this if I was inaccurate or missed anything.

Thanks much Jing.


> A new tool to sync current HDFS view to specified snapshot
> ----------------------------------------------------------
>
>                 Key: HDFS-10314
>                 URL: https://issues.apache.org/jira/browse/HDFS-10314
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-10314.001.patch
>
>
> HDFS-9820 proposed adding an -rdiff switch to distcp, as the reverse 
> operation of the -diff switch. 
> Upon discussion with [~jingzhao], we will introduce a new tool that wraps 
> around distcp to achieve the same purpose.
> I'm thinking about calling the new tool "rsync", similar to the Unix/Linux 
> command "rsync". The "r" here means remote.
> The syntax that simulates the -rdiff behavior proposed in HDFS-9820 is
> {code}
> rsync <fromSnapshotName>  <toSnapshotName>  <source> <target>
> {code}
> This command ensures that <fromSnapshotName> is newer than <toSnapshotName>.
> I think that, in the future, we can add another command that provides the 
> functionality of distcp's -diff switch:
> {code}
> sync <fromSnapshotName>  <toSnapshotName>  <source> <target>
> {code}
> that ensures <fromSnapshotName>  is older than <toSnapshotName>.
> Thanks [~jingzhao].



