Also, I don't see much point in extra protection, as this is running only in
single-node mode, if I understood correctly.

In that case, you can use an M/R job to export/import your data, but note
that it won't export the schema, only the data.
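
A rough sketch of what that could look like, assuming the stock Export and
Import MapReduce tools that ship with HBase and an `hbase` launcher on the
PATH. The helper names here are made up for illustration. Since only the data
is exported, the target table has to be created with the same column families
before running the import.

```python
import subprocess

# MapReduce driver classes bundled with HBase.
EXPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Export"
IMPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Import"


def export_cmd(table, out_dir, versions=None, start_ms=None, end_ms=None):
    """Build the Export command line (hypothetical helper).

    The optional arguments are positional, in the order
    versions, starttime, endtime, so a later one is only
    passed when the earlier ones are given.
    """
    cmd = ["hbase", EXPORT_CLASS, table, out_dir]
    if versions is not None:
        cmd.append(str(versions))
        if start_ms is not None:
            cmd.append(str(start_ms))
            if end_ms is not None:
                cmd.append(str(end_ms))
    return cmd


def import_cmd(table, in_dir):
    """Build the Import command line (hypothetical helper).

    The table must already exist with matching column families.
    """
    return ["hbase", IMPORT_CLASS, table, in_dir]


def run(cmd):
    """Run the job and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)
```

For example, run(export_cmd("mytable", "/backups/mytable")) would dump the
table, and run(import_cmd("mytable", "/backups/mytable")) would load it back.
Passing a start and end time to the export limits it to cells written in that
window, which is one way to approximate incremental backups.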

On Wed, Sep 28, 2011 at 2:49 PM, Li Pi <[email protected]> wrote:

> What kinds of situations are you looking to guard against? Partial
> hardware failure, full hardware failure (of the live cluster), or
> accidentally deleting all data?
>
> HDFS provides replication that already guards against partial hardware
> failure - if this is all you need, an ephemeral store should be fine.
>
> Also, HBase can use S3 directly as a datastore. You can choose the raw
> mode, in which HBase treats S3 as a disk. There used to be a block-based
> mode as well, but now that S3 has increased the object size limit
> to 5 TB, this isn't needed anymore. (Somebody correct me if I'm wrong.)
>
> On Wed, Sep 28, 2011 at 9:15 AM, Vinod Gupta Tankala
> <[email protected]> wrote:
> > Hi,
> > Can someone answer these basic but important questions for me.
> > We are using hbase for our datastore and want to safeguard ourselves from
> > data corruption/data loss. Also we are hosted on aws ec2. Currently, I only
> > have a single node but want to prepare for scale right away as things are
> > going to change starting next couple of weeks. Also, I am currently using
> > ephemeral store for hbase data.
> >
> > 1) What is the recommended aws data store method for hbase? Should you use
> > ephemeral store and do S3 backups or use EBS? I read and heard that EBS can
> > be expensive and also unreliable in terms of read/write latency. Of course,
> > it provides data replication and protection, so you don't have to worry
> > about that.
> >
> > 2) What is the recommended backup/restore method for hbase? I would like to
> > take periodic data snapshots and then have an import utility that will
> > incrementally import data in case I lose some regions due to corruption or
> > table inconsistencies. Also, if something catastrophic happens, I can
> > restore the whole data.
> >
> > 3) While we are at it, what are the recommended ec2 instance types for
> > running master/zookeeper/region servers? I get conflicting answers from
> > google search - ranging from c1.xlarge to m1.xlarge.
> >
> > I would really appreciate it if someone could help me.
> >
> > thanks
> > vinod
> >
>
