Hi all,

I was wondering what options there are for backing up and dumping an HBase database. I appreciate that running it on top of an HDFS cluster protects against individual node failure. However, that still doesn't protect against the massive but thankfully rare disasters that take out whole server racks: fire, floods, etc.
As far as I can tell there are two options:

1. Scan each table and dump every row to some external location, as mysqldump does for MySQL. To recover, simply load the dumped data back in. I expect the performance of this to be fairly poor.

2. Image the data stored on the HDFS cluster. Aren't there big issues with this not capturing a consistent image, since some updates won't have been flushed yet? Is there any way to force a flush, or to make the image consistent some other way, perhaps via snapshotting?

Have I missed anything? Has anyone got any suggestions?

Charlie M
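
P.S. To make option 1 concrete, here is a rough Python sketch of the dump-and-reload idea. It does not use any real HBase API: the in-memory dict stands in for a table, and the names dump_table and restore_table, the JSON-lines layout, and the base64 encoding are all my own assumptions for illustration. A real version would feed dump_table from a table scan and pass restore_table a function that writes a row back.

```python
import base64
import io
import json


def _enc(raw: bytes) -> str:
    """Encode arbitrary bytes as ASCII text so they survive JSON."""
    return base64.b64encode(raw).decode("ascii")


def _dec(text: str) -> bytes:
    """Reverse of _enc."""
    return base64.b64decode(text)


def dump_table(rows, out):
    """Write each (row_key, {column: value}) pair as one JSON line.

    `rows` is any iterable of such pairs (hypothetical stand-in for a scan).
    Returns the number of rows written.
    """
    count = 0
    for row_key, cells in rows:
        record = {
            "row": _enc(row_key),
            "cells": {_enc(col): _enc(val) for col, val in cells.items()},
        }
        out.write(json.dumps(record) + "\n")
        count += 1
    return count


def restore_table(inp, put):
    """Read JSON lines and call put(row_key, cells) for each row.

    `put` is a caller-supplied function (hypothetical stand-in for a write).
    Returns the number of rows restored.
    """
    count = 0
    for line in inp:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the dump file
        record = json.loads(line)
        cells = {_dec(c): _dec(v) for c, v in record["cells"].items()}
        put(_dec(record["row"]), cells)
        count += 1
    return count


# Round trip through an in-memory buffer, using a dict as the fake table.
source = {
    b"row1": {b"cf:a": b"1"},
    b"row2": {b"cf:a": b"2", b"cf:b": b"\x00\xff"},
}
buf = io.StringIO()
dumped = dump_table(sorted(source.items()), buf)
buf.seek(0)
restored = {}
restore_table(buf, restored.__setitem__)
```

Note that this only captures whatever the scan sees while it runs, so it says nothing about the consistency question in option 2.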
