This is also how you can run multiple HBase instances on top of a single HDFS spanning all:

  One HBase rootdir at hdfs://host:port/hbase-foo
  Another HBase rootdir at hdfs://host:port/hbase-bar
  etc.

And you can run each HBase instance under a different user account and use HDFS permissions as a very coarse way to keep the data of one HBase private from another. This is a (lame) solution for multitenancy until HBASE-1697.

Just a random thought,

   - Andy

----- Original Message ----
> From: Dan Washusen <[email protected]>
> To: [email protected]
> Sent: Tue, February 9, 2010 7:02:22 PM
> Subject: Re: HBase export/import
>
> +1
>
> I use this method when performance testing on different data sets. I have
> several datasets to test on (varying sizes, etc). When I want to switch
> datasets I just shut down hbase and rename the /hbase directory...
>
> e.g. (assuming hbase is not running)
> hadoop/bin/hadoop fs -mv /hbase /hbase.small
> hadoop/bin/hadoop fs -mv /hbase.large /hbase
>
> When I want to move my data between clusters I use:
> hadoop/bin/hadoop fs -copyToLocal /hbase.large /tmp/hbase.large
> scp -r /tmp/hbase.large u...@host:/tmp
> ssh u...@host
> hadoop/bin/hadoop fs -put /tmp/hbase.large /hbase
>
>
> Very handy :)
>
>
> On 10 February 2010 13:50, Ryan Rawson wrote:
>
> > If you stop the source cluster then you can distcp the /hbase to the
> > other cluster. Done. A perfect copy.
> >
> > That is probably the most efficient/highest performing way.
> >
> > On Tue, Feb 9, 2010 at 6:47 PM, James Baldassari wrote:
> > > Hi,
> > >
> > > I'm wondering if it's possible to export all data from one HBase cluster
> > > and import it into another. We have a lot of data that we've imported
> > > into our staging HBase environment, and rather than repeating the
> > > lengthy import process in our production environment we would prefer to
> > > just copy all the data directly from HBase/HDFS in staging into
> > > production. Is there an easy way to do this? I know Hadoop has some
> > > distributed copy functionality, but I don't know if this will work with
> > > HBase. The number of region servers and the replication factor will be
> > > the same in the source and destination environments, but the
> > > hostnames/IPs will be different. The production environment is
> > > completely empty right now, so we don't need to worry about overwriting
> > > data.
> > >
> > > I came across these links while searching for information HBase
> > > export/import:
> > >
> > > http://issues.apache.org/jira/browse/HBASE-897
> > > http://issues.apache.org/jira/browse/HBASE-1684
> > >
> > > http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapreduce/Export.html
> > >
> > > Has anyone used these tools? Is there a better way?
> > >
> > > Thanks,
> > > James
> > >
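
For reference, the dataset swap Dan describes above could be wrapped in a small helper along these lines. This is only a rough sketch: the function name swap_dataset and the HADOOP and DRY_RUN variables are made up for illustration, the hadoop path is the one from the quoted mail, and the script assumes HBase has already been stopped (it does not check). By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: swap the active HBase dataset by renaming directories in HDFS.
# Assumes HBase is already shut down; nothing here verifies that.
# HADOOP and DRY_RUN are hypothetical knobs introduced for this example.
HADOOP="${HADOOP:-hadoop/bin/hadoop}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# swap_dataset CURRENT_NAME NEXT_NAME
# Parks /hbase as /hbase.CURRENT_NAME, then promotes /hbase.NEXT_NAME to /hbase.
# The second rename is skipped if the first one fails.
swap_dataset() {
    run "$HADOOP" fs -mv /hbase "/hbase.$1" &&
    run "$HADOOP" fs -mv "/hbase.$2" /hbase
}
```

Calling swap_dataset small large with the defaults prints the same two "fs -mv" commands shown in the quoted mail; setting DRY_RUN=0 would actually run them.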
