Wow, I guess it's a lot simpler than I thought it would be. I'll give it a try. Thank you both for your helpful and quick responses!
-James

On Tue, 2010-02-09 at 21:07 -0600, Ryan Rawson wrote:
> You can also use distcp instead of the copyToLocal && scp &&
> copyFromLocal chain you have there.
>
> On Tue, Feb 9, 2010 at 7:02 PM, Dan Washusen <[email protected]> wrote:
> > +1
> >
> > I use this method when performance testing on different data sets. I have
> > several datasets to test on (varying sizes, etc). When I want to switch
> > datasets I just shut down hbase and rename the /hbase directory...
> >
> > e.g. (assuming hbase is not running)
> > hadoop/bin/hadoop fs -mv /hbase /hbase.small
> > hadoop/bin/hadoop fs -mv /hbase.large /hbase
> >
> > When I want to move my data between clusters I use:
> > hadoop/bin/hadoop fs -copyToLocal /hbase.large /tmp/hbase.large
> > scp -r /tmp/hbase.large u...@host:/tmp
> > ssh u...@host
> > hadoop/bin/hadoop fs -put /tmp/hbase.large /hbase
> >
> >
> > Very handy :)
> >
> >
> > On 10 February 2010 13:50, Ryan Rawson <[email protected]> wrote:
> >
> >> If you stop the source cluster then you can distcp the /hbase to the
> >> other cluster. Done. A perfect copy.
> >>
> >> That is probably the most efficient/highest performing way.
> >>
> >> On Tue, Feb 9, 2010 at 6:47 PM, James Baldassari <[email protected]> wrote:
> >> > Hi,
> >> >
> >> > I'm wondering if it's possible to export all data from one HBase cluster
> >> > and import it into another. We have a lot of data that we've imported
> >> > into our staging HBase environment, and rather than repeating the
> >> > lengthy import process in our production environment we would prefer to
> >> > just copy all the data directly from HBase/HDFS in staging into
> >> > production. Is there an easy way to do this? I know Hadoop has some
> >> > distributed copy functionality, but I don't know if this will work with
> >> > HBase. The number of region servers and the replication factor will be
> >> > the same in the source and destination environments, but the
> >> > hostnames/IPs will be different. The production environment is
> >> > completely empty right now, so we don't need to worry about overwriting
> >> > data.
> >> >
> >> > I came across these links while searching for information HBase
> >> > export/import:
> >> >
> >> > http://issues.apache.org/jira/browse/HBASE-897
> >> > http://issues.apache.org/jira/browse/HBASE-1684
> >> > http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapreduce/Export.html
> >> >
> >> > Has anyone used these tools? Is there a better way?
> >> >
> >> > Thanks,
> >> > James
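
For reference, the distcp approach Ryan describes could be sketched roughly as follows. This is only a sketch under assumptions: the NameNode addresses are hypothetical placeholders, `hadoop` is assumed to be on the PATH (the thread invokes it as hadoop/bin/hadoop), and the script only prints the command instead of running it. As the thread says, HBase should be stopped on the source cluster before copying.

```shell
#!/bin/sh
# Sketch: build the distcp command that copies /hbase between two clusters.
# The host:port values below are hypothetical; substitute your own NameNodes.

build_distcp_cmd() {
    # $1 = source NameNode host:port, $2 = destination NameNode host:port
    printf 'hadoop distcp hdfs://%s/hbase hdfs://%s/hbase\n' "$1" "$2"
}

# Defaults are placeholders and can be overridden from the environment.
SRC_NN="${SRC_NN:-staging-nn.example.com:8020}"
DST_NN="${DST_NN:-prod-nn.example.com:8020}"

# Dry run: print the command so it can be reviewed before being executed.
build_distcp_cmd "$SRC_NN" "$DST_NN"
```

Once the printed command looks right, it can be run by hand on a node that can reach both clusters, and HBase on the destination can be started afterwards.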
