I’m not sure about DataStax’s official stance, but the SSD-backed instances (e.g. i2.2xlarge, c3.4xlarge) greatly outperform the m2.2xlarge. Also, since DataStax is pro-SSD, I doubt they would still recommend staying on magnetic disks.
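As a rough aside before getting into the IOPS numbers below: IOPS figures translate into bandwidth only once you fix an I/O size. A back-of-the-envelope sketch (the 4 KB I/O size here is my assumption, not something measured in this thread):

```python
# Rough IOPS -> throughput conversion (sketch; the 4 KB I/O size is an assumption).
def iops_to_mib_per_s(iops, io_size_bytes=4096):
    """Convert an IOPS figure to MiB/s for a given I/O size."""
    return iops * io_size_bytes / (1024 * 1024)

# ~4,000-5,000 IOPS observed per node vs. the drives' 40,000 IOPS ceiling.
observed = iops_to_mib_per_s(5000)    # ~19.5 MiB/s
ceiling = iops_to_mib_per_s(40000)    # 156.25 MiB/s
print(round(observed, 2), round(ceiling, 2))
```

At larger I/O sizes (Cassandra compactions do big sequential writes) the same IOPS numbers correspond to much higher throughput, so take the conversion as illustrative only.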
That said, I have benchmarked all the way up to the c3.8xlarge instances. The most IOPS I could get out of each node was around 4,000-5,000. This seemed to be because context switching was preventing Cassandra from stressing the SSD drives to their maximum of 40,000 IOPS. Since the SSD-backed EBS volumes offer up to 4,000 IOPS, the speed of the disk would not be an issue. You would, however, still be sharing network resources, so without a proper benchmark you would still be rolling the dice.

The best bang for the buck I’ve seen is the i2 instances. They offer more ephemeral disk space at a lower cost than the c3, albeit with less CPU. We currently use the i2.xlarge and they are working out great.

On August 19, 2014 at 10:09:26 AM, Brian Tarbox (briantar...@gmail.com) wrote:

The last guidance I heard from DataStax was to use m2.2xlarge's on AWS and put data on the ephemeral drive... have they changed this guidance?

Brian

On Tue, Aug 19, 2014 at 9:41 AM, Oleg Dulin <oleg.du...@gmail.com> wrote:

Distinguished Colleagues:

Our current Cassandra cluster on AWS looks like this: 3 nodes in N. Virginia, one per zone. RF=3. Each node is a c3.4xlarge with 2x160G SSDs in RAID-0 (~300 Gig of SSD on each node). Works great; I find it the most optimal configuration for a Cassandra node. But the time is coming soon when I need to expand storage capacity. I have the following options in front of me:

1) Add 3 more c3.4xlarge nodes. This keeps the amount of data on each node reasonable, and all repairs and other tasks can complete in a reasonable amount of time. The downside is that c3.4xlarge instances are pricey.

2) Add provisioned EBS volumes. These days I can get SSD-backed EBS with up to 4,000 provisioned IOPS. I can add those volumes to the "data_file_directories" list in the YAML, and I expect Cassandra can deal with that JBOD-style... The upside is that it is much cheaper than option #1 above; the downside is that it is a much slower configuration and repairs can take longer.
I'd appreciate any input on this topic.

Thanks in advance,
Oleg

--
http://about.me/BrianTarbox
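For what it's worth, option #2 maps to the `data_file_directories` setting in cassandra.yaml; Cassandra spreads SSTables across all listed directories JBOD-style. A minimal sketch, assuming hypothetical mount points for the ephemeral RAID and the added EBS volume:

```yaml
# cassandra.yaml (fragment) -- JBOD layout; mount points are hypothetical.
# Cassandra distributes SSTables across every listed data directory.
data_file_directories:
    - /mnt/ephemeral-raid0/cassandra/data   # existing 2x160G SSD RAID-0
    - /mnt/ebs-piops/cassandra/data         # added provisioned-IOPS EBS volume
# Keep the commitlog on its own device where possible:
commitlog_directory: /mnt/ephemeral-raid0/cassandra/commitlog
```

Note that mixing devices with very different latency characteristics in one JBOD set means compactions and reads hit the slower volume too, which lines up with the concern above about repairs taking longer.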