*Simple procedure:*

1. Stop writing to the source cluster.
2. For each table, export a snapshot as described below and restore it on the destination cluster.
3. Start writing to the destination cluster.

*Not-so-simple procedure (zero downtime)*

*If your source cluster is > 2.1 (has serial replication)*, you can do the following in sequence:

1. Export a snapshot to the destination cluster. Say this starts at time *t1* and takes an hour, for example.

    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot $snapshot_name -copy-to hdfs://$dest_hdfs_ip:8020/hbase -bandwidth $bandwidth_in_MB

2. Set up a peer to the destination cluster at, say, time *t2* that replicates the table(s) to be migrated (works only if the source cluster has serial replication, i.e. HBase >= 2.1). Make sure replication scope is 1 on the column families you want to copy over.

    add_peer 'cluster_dest', CLUSTER_KEY => "zk1,zk2,zk3:2182:/hbase", TABLE_CFS => { "table1" => ["cf1", "cf2"] }, STATE => "ENABLED", SERIAL => true, REPLICATE_ALL => false

3. Make sure replication is propagating data, then disable the peer temporarily. Edits will start piling up in the oldWALs directory, and the logs are kept in the ZooKeeper replication queues.

    disable_peer 'cluster_dest'

4. Copy the diff data from time *t1* to the present using CopyTable. Say this starts at time *t3* (> *t2*) and takes 10 minutes, for example.

    hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=$copy_starttime --endtime=$copy_endtime --peer.adr=$dest_zk_addr '$full_table_name'

5. Re-enable the peer to ensure data is replicated from time *t2* onward.

    enable_peer 'cluster_dest'

6. Stop writes to the old cluster by marking the tables read-only.

    alter '$full_table_name', {METHOD => 'table_att', READONLY => true}

7. (Optional) Verify that the data is the same on the two clusters for the migration window. (Note: this comes with caveats. If the same rows are being updated, or TTLs are expiring data, some row differences are expected.)
On the source cluster:

    hbase org.apache.hadoop.hbase.mapreduce.HashTable --starttime=$verify_starttime --endtime=$verify_endtime '$full_table_name' $hash_table_full_path

On the destination cluster:

    hbase org.apache.hadoop.hbase.mapreduce.SyncTable --sourcezkcluster=$src_zk_addr *--dryrun=true* hdfs://$src_hdfs_ip:8020$hash_table_full_path '$full_table_name' '$full_table_name'

*If your source cluster is < 2.1 (no serial replication)*, you will have some write downtime. You can skip steps 3 and 5, and repeat step 4 as many times as you want until the downtime window is very small. Otherwise it is largely the same.

---
Mallikarjun

On Thu, Apr 15, 2021 at 7:23 PM Sajid Mohammed <sajid.had...@gmail.com> wrote:

> Hello All
>
> Any one know simple procedure to migrate all Hbase Tables from one cluster
> to another in one go ?
>
>
> Thanks
> Sajid.
>
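P.S. Since the question was about moving all tables "in one go": the simple procedure can be looped over the table list. A minimal sketch, assuming placeholder cluster addresses, a hardcoded table list, and a `_migration` snapshot-naming convention of my own (it only prints the commands; pipe them to `hbase shell -n` / run the ExportSnapshot lines yourself once you have checked them):

```shell
#!/usr/bin/env bash
# Sketch: generate snapshot -> export -> restore commands for each table.
# DEST_HDFS_IP, BANDWIDTH_MB and TABLES are placeholders, not real values.
set -eu

DEST_HDFS_IP="10.0.0.1"     # placeholder: destination NameNode address
BANDWIDTH_MB=100            # throttle so the copy does not saturate the network
TABLES="table1 table2"      # placeholder: tables to migrate

for table in $TABLES; do
  snapshot="${table}_migration"
  # 1. Take a snapshot on the source cluster (hbase shell command).
  echo "snapshot '${table}', '${snapshot}'"
  # 2. Ship the snapshot to the destination cluster (CLI command).
  echo "hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot ${snapshot} -copy-to hdfs://${DEST_HDFS_IP}:8020/hbase -bandwidth ${BANDWIDTH_MB}"
  # 3. Restore on the destination cluster (hbase shell command, run there).
  echo "restore_snapshot '${snapshot}'"
done
```

Writes still have to be stopped before step 1 and redirected after step 3, exactly as in the simple procedure above.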