Thanks Dejan. Currently, I am running in standalone mode, but in parallel I am working towards running it in distributed mode, with a single node for the time being and more slaves added over the next couple of weeks. So I want to be prepared for that.
thanks
vinod

On Wed, Sep 28, 2011 at 11:52 AM, Dejan Menges <[email protected]> wrote:

> Also, I don't see a lot of point in protection, as this is running only in
> single mode, if I got it correctly.
>
> In that case, you can use an M/R job to export/import your data, but just
> note that it won't export the schema too, only the data.
>
> On Wed, Sep 28, 2011 at 2:49 PM, Li Pi <[email protected]> wrote:
>
> > What kind of situations are you looking to guard against? Partial
> > hardware failure, full hardware failure (of the live cluster), or
> > accidentally deleting all data?
> >
> > HDFS provides replication that already guards against partial hardware
> > failure - if this is all you need, an ephemeral store should be fine.
> >
> > Also, HBase can use S3 directly as a datastore. You can choose the raw
> > mode, in which HBase treats S3 as a disk. There used to be a block
> > based mode as well, but now that S3 has increased the object size limit
> > to 5 TB, this isn't needed anymore. (Somebody correct me if I'm wrong.)
> >
> > On Wed, Sep 28, 2011 at 9:15 AM, Vinod Gupta Tankala
> > <[email protected]> wrote:
> > > Hi,
> > > Can someone answer these basic but important questions for me?
> > > We are using HBase for our datastore and want to safeguard ourselves
> > > from data corruption/data loss. Also, we are hosted on AWS EC2.
> > > Currently, I only have a single node but want to prepare for scale
> > > right away, as things are going to change starting in the next couple
> > > of weeks. Also, I am currently using the ephemeral store for HBase data.
> > >
> > > 1) What is the recommended AWS data store method for HBase? Should you
> > > use the ephemeral store and do S3 backups, or use EBS? I read and heard
> > > that EBS can be expensive and also unreliable in terms of read/write
> > > latency. Of course, it provides data replication and protection, so you
> > > don't have to worry about that.
> > >
> > > 2) What is the recommended backup/restore method for HBase? I would
> > > like to take periodic data snapshots and then have an import utility
> > > that will incrementally import data in case I lose some regions due to
> > > corruption or table inconsistencies. Also, if something catastrophic
> > > happens, I can restore the whole data.
> > >
> > > 3) While we are at it, what are the recommended EC2 instance types for
> > > running master/zookeeper/region servers? I get conflicting answers from
> > > Google search - ranging from c1.xlarge to m1.xlarge.
> > >
> > > I would really appreciate it if someone could help me.
> > >
> > > thanks
> > > vinod
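
For the M/R export/import route mentioned in the thread, here is a minimal sketch of how periodic, time-windowed exports might be scripted. It only builds the command lines for HBase's bundled Export and Import MapReduce jobs; it does not run them. The table name, backup path, and helper function names are hypothetical, and it assumes the `hbase` launcher script is on the PATH. As noted above, the export carries data only, so the target table would need to be created with its schema before an import.

```python
from datetime import datetime, timezone

# Class names of the MapReduce utilities that ship with HBase.
EXPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Export"
IMPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Import"


def to_millis(dt):
    """Convert a timezone-aware datetime to epoch milliseconds (HBase cell timestamps)."""
    return int(dt.timestamp() * 1000)


def export_command(table, output_dir, start, end, versions=1):
    """Build the argv for exporting cells written in the window [start, end).

    Positional arguments follow the Export usage:
    <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
    """
    return [
        "hbase", EXPORT_CLASS,
        table, output_dir,
        str(versions), str(to_millis(start)), str(to_millis(end)),
    ]


def import_command(table, input_dir):
    """Build the argv for importing a previous export into an existing table."""
    return ["hbase", IMPORT_CLASS, table, input_dir]


if __name__ == "__main__":
    # Hypothetical one-day window for a table called "mytable".
    start = datetime(2011, 9, 27, tzinfo=timezone.utc)
    end = datetime(2011, 9, 28, tzinfo=timezone.utc)
    print(" ".join(export_command("mytable", "/backup/mytable/2011-09-27", start, end)))
    print(" ".join(import_command("mytable", "/backup/mytable/2011-09-27")))
```

The returned lists could be passed to `subprocess.run` from a scheduled job, with each run exporting only the window since the previous run. Whether that is enough for recovering individual regions is a separate question that the sketch does not address.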
