[ 
https://issues.apache.org/jira/browse/HBASE-13639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218582#comment-15218582
 ] 

Enis Soztutar commented on HBASE-13639:
---------------------------------------

[~davelatham] this is a great explanation. Do you think that we can simplify 
the whole usage by the tool so that it does these multiple steps (run hashtable 
+ distcp + synctable) automatically? The tool can take all the arguments 
needed, and coordinate running these steps so that it is easier to use and 
harder to make mistakes. Shall we open a follow up issue?  

> SyncTable - rsync for HBase tables
> ----------------------------------
>
>                 Key: HBASE-13639
>                 URL: https://issues.apache.org/jira/browse/HBASE-13639
>             Project: HBase
>          Issue Type: New Feature
>          Components: mapreduce, Operability, tooling
>            Reporter: Dave Latham
>            Assignee: Dave Latham
>              Labels: tooling
>             Fix For: 2.0.0, 0.98.14, 1.2.0
>
>         Attachments: HBASE-13639-0.98-addendum-hadoop-1.patch, 
> HBASE-13639-0.98.patch, HBASE-13639-v1.patch, HBASE-13639-v2.patch, 
> HBASE-13639-v3-0.98.patch, HBASE-13639-v3.patch, HBASE-13639.patch
>
>
> Given HBase tables in remote clusters with similar but not identical data, 
> efficiently update a target table such that the data in question is identical 
> to a source table.  Efficiency in this context means using far less network 
> traffic than would be required to ship all the data from one cluster to the 
> other.  Takes inspiration from rsync.
> Design doc: 
> https://docs.google.com/document/d/1-2c9kJEWNrXf5V4q_wBcoIXfdchN7Pxvxv1IO6PW0-U/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to