For HDFS data, as Davide mentioned above, you can use distcp ( https://hadoop.apache.org/docs/r2.6.5/hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistCp.html )
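As a minimal sketch (the NameNode addresses and paths below are placeholders, not from the thread), a DistCp run that copies HDFS data between the clusters while preserving the folder structure might look like:

```shell
# Copy HDFS data between clusters, keeping the same directory layout.
# -p       preserves file attributes (permissions, replication, etc.)
# -update  copies only files that are missing or changed on the target
hadoop distcp -p -update \
    hdfs://old-namenode:8020/user/data \
    hdfs://new-namenode:8020/user/data
```

Since both clusters run the same Hadoop version (2.6), the plain hdfs:// scheme should work on both sides; when copying between incompatible Hadoop versions, the usual practice is to read the source over webhdfs:// instead.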
Jignesh Patel <[email protected]> 于2023年5月12日周五 20:15写道: > Thank you. > What about exporting files from HDFS with the same folder structure and > importing back to HDFS back with the same folder structure. > > On Wed, May 10, 2023 at 10:11 PM 杨光 <[email protected]> wrote: > > > Hi Jignesh, for online service hbase cluster (it means there is data > > written during transforming), I prefer to use ExportSnapshot to copy data > > from cluster to cluster, because of the less performance impact, and also > > you can use *-bandwidth* parameter to control network costs. You can also > > make snapshot for each hbase table individually. After that you can use > > CopyTable to copy data to new cluster which is written during > transforming > > period. But if your cluster is offline service, I think Export is also > > fine. > > > > About the usage of these tools for 0.98.7, you can check this link: > > *https://devdoc.net/bigdata/hbase-0.98.7-hadoop1/book/ops_mgt.html#tools > > <https://devdoc.net/bigdata/hbase-0.98.7-hadoop1/book/ops_mgt.html#tools > >* > > > > Jignesh Patel <[email protected]> 于2023年5月10日周三 01:09写道: > > > > > So which one is better approach exportsnapshot or export each table > > > individually? > > > > > > On Tue, May 9, 2023 at 8:54 AM Jignesh Patel <[email protected]> > > > wrote: > > > > > > > don't know the size of the data asI don't know the command to check. > > > > > > > > But can we follow this blog to export and then import > > > > > > > > > > > > > > https://blog.clairvoyantsoft.com/hbase-incremental-table-backup-and-disaster-recovery-using-aws-s3-storage-aa2bc1b40744 > > > > > > > > On Thu, May 4, 2023 at 11:57 AM Davide Vergari < > > [email protected] > > > > > > > > wrote: > > > > > > > >> If hbase tables you can create a snapshot for each table then > export > > > with > > > >> the ExportSnapshot mapreduce job (should be already available on > > > 0.98.x). 
> > > >> For data that are not in hbase you can use distcp > > > >> > > > >> Il giorno gio 4 mag 2023 alle ore 17:13 <[email protected]> ha > scritto: > > > >> > > > >> > Jignesh, how much data? Is the data currently in hbase > > format? > > > >> > Very kindly, Sean > > > >> > > > > >> > > > > >> > > On 05/04/2023 11:03 AM Jignesh Patel <[email protected]> > > > wrote: > > > >> > > > > > >> > > > > > >> > > We are in the process of having hadoop os, however we are using > a > > > very > > > >> > old > > > >> > > version of hadoop. > > > >> > > Hadoop 2.6 > > > >> > > and HBase 0.98.7. > > > >> > > > > > >> > > So how do we export and import the data from the cluster with > the > > > old > > > >> OS > > > >> > to > > > >> > > the new OS. We are trying to use the same hadoop/hbase version. > > > >> > > > > > >> > > -Jignesh > > > >> > > > > >> > > > > > > > > > >
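The snapshot-based migration described in the thread can be sketched roughly as follows. The table, snapshot, cluster, and ZooKeeper names are placeholders, and the exact flags should be verified against the 0.98.7 documentation linked above:

```shell
# 1. In the hbase shell on the SOURCE cluster, snapshot each table:
#      hbase> snapshot 'my_table', 'my_table-snap'

# 2. Export the snapshot to the new cluster's HBase root directory,
#    throttled to ~50 MB/s per mapper to limit network impact:
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -snapshot my_table-snap \
    -copy-to hdfs://new-namenode:8020/hbase \
    -bandwidth 50 \
    -mappers 8

# 3. On the DESTINATION cluster, restore the table from the snapshot:
#      hbase> clone_snapshot 'my_table-snap', 'my_table'

# 4. Catch up on rows written to the source during the migration window
#    with CopyTable, filtering by timestamp (placeholder value shown):
hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
    --starttime=1683000000000 \
    --peer.adr=new-zk1,new-zk2,new-zk3:2181:/hbase \
    my_table
```

CopyTable's timestamp filter only catches Puts whose timestamps fall in the window, so note the time at which the snapshot was taken and use it as the --starttime for the catch-up pass.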
