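Yes, you can run distcp to copy data from one cluster to another. distcp also has an option (-delete, used together with -update or -overwrite) that controls whether files on the destination are deleted when they are NOT on the source. If you leave that option off, data you remove from Cluster A will stay on Cluster B.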
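
As a rough sketch of how a scheduled copy could be scripted, the Python below only assembles the distcp command line. The function name, the NameNode hosts, and the paths are made up for illustration. The -update and -delete flags are real distcp options, and the sketch assumes the hadoop client is on the PATH of whatever host runs the job.

```python
def build_distcp_command(src, dst, mirror_deletes=False):
    """Build the argument list for a `hadoop distcp` run.

    -update copies only files that are missing or different on the target.
    -delete (valid only together with -update or -overwrite) also removes
    target files that no longer exist on the source.
    """
    cmd = ["hadoop", "distcp", "-update"]
    if mirror_deletes:
        # Leave this off for an archival cluster that must keep old data.
        cmd.append("-delete")
    cmd += [src, dst]
    return cmd


# Hypothetical NameNode addresses and paths, replace with your own.
SRC = "hdfs://cluster-a-nn:8020/data"
DST = "hdfs://cluster-b-nn:8020/data"

if __name__ == "__main__":
    # Archival copy from A to B: nothing is ever deleted on B.
    print(" ".join(build_distcp_command(SRC, DST)))
```

The returned list can be handed to subprocess.run(cmd, check=True) from cron or a workflow scheduler. With mirror_deletes left at False, expiring old data on Cluster A does not touch the copies already on Cluster B.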

From: Vivek Singh Raghuwanshi [mailto:[email protected]]
Sent: Wednesday, February 10, 2016 1:16 PM
To: [email protected]
Subject: Hadoop Backup and Archival Cluster

Hi Friends,

I am planning to set up a Hadoop cluster (A) with a replica cluster (B), so that once data reaches Cluster A it is replicated to Cluster B.

I have one question: if I delete data from Cluster A based on age, for example data older than one month, is it also removed from Cluster B? If yes, how can I avoid this?

What I want to achieve:

1. Once data reaches Cluster A, it is automatically replicated to Cluster B.
2. Data older than one year is removed automatically from Cluster A, but not from Cluster B.
3. Anyone who wants to query the latest data can use Cluster A, while older data remains available on Cluster B.

Regards
--
ViVek Raghuwanshi
Mobile -+91-09595950504
Skype - vivek_raghuwanshi
IRC vivekraghuwanshi
http://vivekraghuwanshi.wordpress.com/
http://in.linkedin.com/in/vivekraghuwanshi
