On Mon, Sep 22, 2008 at 8:20 PM, Ding, Hui <[EMAIL PROTECTED]> wrote:
> This should be something the operators of your data store worry about.
> E.g., say HDFS uses three replicas: one should be on a local rack,
> another on a different rack (to protect against power outage), and a
> third in a remote data center...
>
> If you have only a small cluster, then maybe use a UPS to guard against
> power outages and watch out for storms?
> After all, what are the chances that a meteorite hits your data center?

Well, knowing my luck :)
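The three-replica setup described above is HDFS's default behaviour. As a minimal sketch, the replica count is set in hdfs-site.xml (the property name is the stock Hadoop one; the surrounding rack-awareness setup is assumed, not shown):

```xml
<!-- hdfs-site.xml: keep three copies of every block.
     With rack awareness configured, the default placement policy
     spreads the replicas across more than one rack. -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```

Note this only protects against rack- and node-level failures inside one cluster; placing a replica in a remote data center, as suggested above, is an operational decision on top of this.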
There are times when moving data from one cluster to another is
important. Moving data from a development cluster to a production one is
another useful feature. I know some clusters are so vast it's not
practical, or not so important, depending on what the data represents.

Charlie M

> -----Original Message-----
> From: Charles Mason [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 22, 2008 12:13 PM
> To: [email protected]
> Subject: [LIKELY JUNK]Back Up Strategies
>
> Hi All,
>
> I was wondering what the options are for backing up and dumping an
> HBase database. I appreciate that having it run on top of an HDFS
> cluster can protect against individual node failure. However, that
> still doesn't protect against the massive but thankfully rare
> disasters which take out whole server racks: fire, floods, etc.
>
> As far as I can tell there are two options:
>
> 1. Scan each table and dump every row to some external location,
> like mysqldump does for MySQL. Then to recover, simply put the
> data back. I am sure the performance of this is going to be fairly
> bad.
>
> 2. Image the data stored on the HDFS cluster. Aren't there some big
> issues with it not grabbing a consistent image, since some updates
> won't be flushed? Is there any way to force that, or to make it
> consistent some other way, perhaps via snapshotting?
>
> Have I missed anything? Anyone got any suggestions?
>
> Charlie M
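To make option 1 from the quoted message concrete, here is a minimal sketch of the scan-and-dump cycle. It stands in a plain Python dict for a live HBase table (the row keys, column names, and file name are invented for illustration); a real dump would walk the table with the HBase client's scanner, and the consistency caveat from option 2 still applies, since unflushed updates may be missed.

```python
import json
import os
import tempfile

# Toy stand-in for an HBase table: row key -> {column: value}.
# A real implementation would iterate a scanner over the live table.
table = {
    "row-001": {"info:name": "alice", "info:city": "london"},
    "row-002": {"info:name": "bob", "info:city": "paris"},
}

def dump_table(table, path):
    """Write every row to `path`, one JSON record per line (mysqldump-style)."""
    with open(path, "w") as f:
        for row_key in sorted(table):
            f.write(json.dumps({"row": row_key, "columns": table[row_key]}) + "\n")

def restore_table(path):
    """Rebuild the row dict from a dump file produced by dump_table."""
    restored = {}
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            restored[record["row"]] = record["columns"]
    return restored

# Round-trip the toy table through a dump file.
dump_path = os.path.join(tempfile.mkdtemp(), "hbase_dump.jsonl")
dump_table(table, dump_path)
round_trip = restore_table(dump_path)
```

As the message predicts, this approach is slow for large tables (it streams every row through a single client), but the dump file is portable to another cluster, which also covers the dev-to-production copy mentioned above.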
