The new scripts in trunk at src/contrib/ec2 will offer this approach soon. 
Right now they simply back HDFS with (volatile) instance storage and rely on 
fewer instances than the HDFS replication factor (default = 3) crashing or 
terminating at one time. Using EBS is a big win for its persistence and 
transparent background snapshot facility. One thing our scripts will still 
have to deal with, though, is how to back a cluster of ~100 or so nodes with 
EBS volumes while also supporting elastic operation, creating volumes on the 
fly as necessary. 
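
To make the replication point above concrete, here is a minimal sketch of the arithmetic. It assumes every replica of a block sits on a different instance, so a block backed only by instance storage can disappear only when all of its replicas are lost at once. The function names are made up for illustration and are not part of the EC2 scripts.

```python
def max_tolerated_failures(replication_factor: int) -> int:
    """Largest number of simultaneous instance losses that cannot
    destroy every replica of a block (assumes one replica per instance)."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor - 1


def block_may_be_lost(simultaneous_failures: int, replication_factor: int = 3) -> bool:
    """True if this many instances failing together could take out
    all replicas of some block."""
    return simultaneous_failures > max_tolerated_failures(replication_factor)


if __name__ == "__main__":
    # With the default replication factor of 3, two simultaneous losses
    # are survivable but three are not guaranteed to be.
    for failures in (1, 2, 3):
        print(failures, block_may_be_lost(failures))
```

With EBS-backed storage the volumes outlive the instances, so this bound matters much less.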

Also in the cards is performance and stability testing with the HBase root 
filesystem on Hadoop's S3N fs (http://wiki.apache.org/hadoop/AmazonS3). I tried 
some limited testing with the S3 fs option just for basic filesystem operations 
-- albeit on a 209 GB file -- and had an unhappy result, so I will avoid that 
for now. Some time ago Clint Morgan ran a simple performance comparison, and 
here were his results: http://markmail.org/message/xqhwgdw25oi7u3rb
"So to summarize:
loading data: almost twice as slow
A long scan is about 1.5 times slower
short scans are over an order of magnitude slower
and random reads (done on the sorted "scan") are over 2 orders of
magnitude slower"

In fairly short order we should have a replacement for the HBase S3-related 
page up on the wiki. In the meantime you may consider perusing 
http://www.google.com/search?hl=en&q=hbase+s3

    - Andy




________________________________
From: Vaibhav Puranik <vpura...@gmail.com>
To: hbase-user@hadoop.apache.org
Sent: Mon, November 16, 2009 9:46:52 AM
Subject: Re: Hbase on Amazon S3?

We have had HBase 0.20.0 running on EC2 with EBS volumes since July 2009.
We are using m1.large machines for all 4 nodes.

All of our data resides on EBS volumes. This helps us back up the
data. It also helps us bring up a separate cluster with the same
data for QA purposes.

So far no problems.

If you have any specific questions please let us know.

Regards,
Vaibhav Puranik
Gumgum



On Mon, Nov 16, 2009 at 9:38 AM, Something Something <
mailinglist...@gmail.com> wrote:

> Anyone installed HBase on S3 (or EC2 for that matter)?  Any pointers would
> be greatly appreciated.  Thanks.
>



      
