Hi David,

Alexander already gave you a good rundown on EC2 and Riak, but let me add some 
of my own experiences running databases on EC2 in general. 

The short answer is, Riak is certainly successfully used in production on EC2, 
so nothing should hold you back from testing a setup on EC2. But there's a 
whole bunch of things you should keep in mind.

First, it's probably a good idea to avoid using ephemeral storage as persistent 
storage. Even though it rarely happens, instances can crash on EC2 for all 
kinds of reasons, most often a hardware failure of the underlying host.

Cluster compute instances offer especially high CPU power, but what you really 
want is really fast and reliable storage I/O, persisted for eternity if need 
be. CC instances are certainly a lot better than any other instance type in 
terms of general I/O (see [2] for a comparison), but fall prey to similar 
limitations in terms of network storage I/O as other instance types, see below.

The RAID 0'd ephemeral storage on the cluster compute instances may sound good 
in theory in terms of performance, but in practice it takes away data 
durability in case of a single disk failure. One disk fails, and the data on 
that node is gone. Depending on what kinds of seeks you're doing, an EBS setup may 
even turn out to be faster. See [6] and [4] for a comparison and some initial 
and extended measurements, and [7] for another comparison. But certainly, the 
cluster compute instance's ephemeral storage can achieve a good amount of 
throughput, see [5] for some pretty graphs comparing both RAID and non-RAID 
setups.

As Alexander pointed out, multiple instance failures can make this scenario a 
real killer, though you end up with the same risks as running on raw iron 
servers. Neither ephemeral storage nor EBS makes the need for a proper backup 
disappear. You could, for example, run off ephemeral storage, relying on both 
Riak's replication and a good backup, e.g. to an EBS volume or to S3.

EBS on the other hand is prone to a large variance in network latency, making 
performance at any point unpredictable and unreliable. Every measurement you 
take is likely to be different an hour later. This may sound extreme, but it 
turns out to be a very big issue for databases where there's lots of disk I/O 
involved to read and write data, as is the case with Riak's Bitcask storage.
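
If you want to see that variance for yourself, here is a minimal sketch in 
Python that times small synchronous writes and summarizes the spread. The 
function names, the sample count and the 4 KB block size are my own 
illustrative choices, not part of any Riak or AWS tooling, and a real benchmark 
would also need to cover reads and run repeatedly over several hours.

```python
import os
import statistics
import tempfile
import time


def sample_write_latencies(directory, samples=50, block_size=4096):
    """Time `samples` synchronous writes of `block_size` bytes in `directory`.

    Returns a list of latencies in milliseconds. The defaults are
    hypothetical values chosen for illustration only.
    """
    payload = os.urandom(block_size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to push the write down to the device
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Return the median, an approximate 99th percentile and the maximum."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "median_ms": statistics.median(ordered),
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
    }


# Example: point this at the mount point of the volume under test
# (here just the system temp directory) and compare runs an hour apart.
if __name__ == "__main__":
    print(summarize(sample_write_latencies(tempfile.gettempdir())))
```

A large gap between the median and the 99th percentile, or between two runs 
taken an hour apart, is the kind of variance I mean.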

You can increase the performance and reliability of EBS by using a RAID of 
volumes. Preferably go for a RAID 5 or RAID 10 to add redundancy. There are 
mixed opinions on whether that's really necessary on EBS, with Amazon keeping 
the data redundant on their end as well, but in general, it's a good tradeoff 
between increased performance through striping and increased redundancy through 
mirroring. [1] has a good summary of when it's better to choose RAID 5 vs. 10.

RAID 0 will obviously bring the best performance, and it's certainly a valid 
setup. We've been running RAID 0 setups with 4 volumes, and got great 
improvements over a single volume. You're also likely to achieve more 
throughput on bigger instances with a setup like this. The caveat once again is 
that one corrupted volume is enough to make a RAID 0 setup unusable.
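
To put rough numbers on that tradeoff, here is a small Python sketch of the 
standard RAID arithmetic: usable capacity, the number of volume failures each 
level is guaranteed to survive, and the chance of losing a RAID 0 array. The 
per-volume failure probability is a made-up input, and the calculation assumes 
volumes fail independently, which may well not hold for EBS volumes sharing the 
same underlying infrastructure.

```python
def raid_summary(level, volumes, volume_size_gb):
    """Return (usable_gb, guaranteed_failures_survived) for RAID 0, 5 or 10."""
    if level == 0:
        if volumes < 2:
            raise ValueError("RAID 0 needs at least 2 volumes")
        return volumes * volume_size_gb, 0
    if level == 5:
        if volumes < 3:
            raise ValueError("RAID 5 needs at least 3 volumes")
        # One volume's worth of space goes to parity.
        return (volumes - 1) * volume_size_gb, 1
    if level == 10:
        if volumes < 4 or volumes % 2:
            raise ValueError("RAID 10 needs an even number of volumes, 4 or more")
        # Half the volumes are mirrors. More than one failure is survivable
        # only if the failures hit different mirror pairs, so 1 is the guarantee.
        return (volumes // 2) * volume_size_gb, 1
    raise ValueError("unsupported RAID level: %r" % (level,))


def raid0_loss_probability(volumes, p_volume_failure):
    """Probability that at least one of `volumes` independent volumes fails,
    which is enough to lose a RAID 0 array."""
    return 1.0 - (1.0 - p_volume_failure) ** volumes


if __name__ == "__main__":
    for level in (0, 5, 10):
        print(level, raid_summary(level, 4, 100))
    # Hypothetical 1% failure chance per volume over some period.
    print(raid0_loss_probability(4, 0.01))
```

With 4 volumes, the chance of losing a RAID 0 array is roughly four times the 
chance of losing a single volume, which is the price you pay for the extra 
throughput.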

Another crazy thought is to set up a RAID striping across a bunch of ephemeral 
drives and EBS volumes, maximizing throughput on both local and network 
storage. But know what you're getting yourself into with this kind of setup, 
especially when your write load is a lot heavier than the available network 
bandwidth can handle, a scenario where your network volumes will never be able 
to catch up with the local storage.

All that said, EBS I/O sure is reasonably fast, but it depends on your 
particular use case and performance requirements. It's also worth noting that 
the I/O capabilities of EBS increase with the instance size. The bigger your 
instance, the more throughput you'll achieve (see [3]). Bigger instances tend 
to have better network throughput in general, with cluster compute instances 
obviously having some of the highest bandwidth available.

All this turns out to be much less of a problem when data can be held in memory 
very easily, e.g. with Innostore, where you can read and write to/from cache 
buffers first and then have InnoDB take care of flushing to disk.

Personally, I don't think you're overcomplicating things with multiple 
availability zones. It's a good idea when the highest possible availability is 
your goal, as it's usually just a single availability zone that's affected by 
increased latency or network timeouts. But as Alexander said, you should think 
about cross-datacenter replication in that scenario, as availability zones 
are data centers in different physical locations. Usually they're not that far 
apart, but far enough to increase latency considerably. As always, it depends 
on your particular use case.

Now, after all this real talk, here's the kicker. Riak's way of replicating 
data can make both scenarios work, as long as it's ensured that your data is 
replicated on more than one node. You could use ephemeral storage and be 
somewhat safe because data will reside on multiple nodes. The same is true for 
EBS volumes, as potential variances in I/O or even minutes of total 
unavailability (as seen in the recent Reddit outage) can be recovered from a 
lot more easily thanks to handoff and read repair. You can increase the number 
of replicas (n_val) to increase your tolerance of instance failure, just make 
sure 
that n_val is less than the number of nodes in your cluster.
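
As a rough illustration of that rule of thumb, here is a Python sketch of the 
arithmetic. The helper names are hypothetical and this is not Riak's API. It 
assumes a majority quorum of floor(n_val / 2) + 1 for reads and writes, which 
as far as I know is Riak's default, and it counts replicas rather than nodes, 
since replicas of a key are not strictly guaranteed to land on distinct 
physical nodes.

```python
def n_val_fits_cluster(n_val, node_count):
    """True if n_val is smaller than the number of nodes in the cluster."""
    if n_val < 1 or node_count < 1:
        raise ValueError("n_val and node_count must be positive")
    return n_val < node_count


def majority_quorum(n_val):
    """Majority of n_val replicas (assumed default for r and w)."""
    return n_val // 2 + 1


def replicas_tolerated(n_val, quorum=None):
    """How many replicas of a key can be unavailable while a request that
    needs `quorum` responses can still succeed."""
    q = majority_quorum(n_val) if quorum is None else quorum
    if not 1 <= q <= n_val:
        raise ValueError("quorum must be between 1 and n_val")
    return n_val - q


if __name__ == "__main__":
    # Hypothetical 5-node cluster with three replicas per key.
    print(n_val_fits_cluster(3, 5))  # True
    print(replicas_tolerated(3))     # 1
    print(replicas_tolerated(5))     # 2
```

So going from an n_val of 3 to 5 takes you from tolerating one unavailable 
replica to two under majority quorum, at the cost of storing every object five 
times.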

Don't get me wrong, I love EC2 and EBS. Being able to spin up servers at any 
time and to attach more storage to a running instance is extremely powerful, 
when you can handle the downsides. But if very low latency is what you're 
looking for, raw iron with lots of memory and SSD as storage device thrown on 
top is hard to beat.

When in doubt, start with a RAID 0 setup on EBS with 4 volumes, and compare it 
with a RAID 5 in terms of performance. Both are known to give good enough 
performance in a lot of cases. If you decide to go with a RAID, be sure to add 
LVM on top for simpler snapshotting. Getting consistent snapshots of a bunch of 
striped volumes using just EBS snapshots is quite painful, if not impossible.

Let us know if you have more questions. There are lots of details involved when 
you're going under the hood, but this should cover the most important bases.

Mathias Meyer
Developer Advocate, Basho Technologies

[2] http://blog.cloudharmony.com/2010/09/benchmarking-of-ec2s-new-cluster.html
[3] http://blog.cloudharmony.com/2010/06/disk-io-benchmarking-in-cloud.html
[7] http://victortrac.com/EC2_Ephemeral_Disks_vs_EBS_Volumes

On Wednesday, 30 March 2011 at 18:29, David Dawson wrote:
> I am not sure if this has already been discussed, but I am looking at the 
> feasibility of running RIAK in a EC2 cloud, as we have a requirement that may 
> require us to scale up and down quite considerably on a month by month basis. 
> After some initial testing and investigation we have come to the conclusion 
> that there are 2 solutions although both have their downsides in my opinion:
> 1. Run multiple cluster compute( cc1.4xlarge ) instances ( 23 GB RAM, 10 
> Gigabit ethernet, 2 x 845 GB disks running RAID 0 )
> 2. Same as above but using EBS as the storage instead of the local disks.
> The problems I see are as follows with solution 1: 
> - A instance failure results in complete loss of data on that machine, as the 
> disks are ephemeral storage ( e.g. they only exist whilst the machine is up ).
> The problems I see are as follows with solution 2:
> - EBS is slower than the local disks and from what I have read is susceptible 
> to latency depending on factors out of your control.
> - There has been a bit of press lately about availability problems with EBS, 
> so we would have to use multiple availability zones although there are only 4 
> in total and it just seems as though I am over complicating things.
> Has anyone used EC2 and RIAK in production and if so what are their 
> experiences?
> Otherwise has anyone used RackSpace or Joyent? as these are alternatives 
> although the Joyent solution seems very expensive, and what are their 
> experiences?
> Dave
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
