Re: EC2 storage options for C*

Ben Bromhead Wed, 03 Feb 2016 12:18:07 -0800

For what it's worth we've tried d2 instances and they encourage terrible
things like super dense nodes (increases your replacement time). In terms
of useable storage I would go with gp2 EBS on an m4 based instance.


On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <jack.krupan...@gmail.com> wrote:

> Ah, yes, the good old days of m1.large.
>
> -- Jack Krupansky
>
> On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
> wrote:
>
>> A lot of people use the old gen instances (m1 in particular) because they
>> came with a ton of effectively free ephemeral storage (up to 1.6TB).
>> Whether or not they’re viable is a decision for each user to make. They’re
>> very, very commonly used for C*, though. At a time when EBS was not
>> sufficiently robust or reliable, a cluster of m1 instances was the de facto
>> standard.
>>
>> The canonical “best practice” in 2015 was i2. We believe we’ve made a
>> compelling argument to use m4 or c4 instead of i2. There exists a company
>> we know currently testing d2 at scale, though I’m not sure they have much
>> in terms of concrete results at this time.
>>
>> - Jeff
>>
>> From: Jack Krupansky
>> Reply-To: "user@cassandra.apache.org"
>> Date: Monday, February 1, 2016 at 1:55 PM
>>
>> To: "user@cassandra.apache.org"
>> Subject: Re: EC2 storage options for C*
>>
>> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
>> Dense Storage".
>>
>> The remaining question is whether any of the "Previous Generation
>> Instances" should be publicly recommended going forward.
>>
>> And whether non-SSD instances should be recommended going forward as
>> well. Sure, technically, someone could use the legacy instances, but the
>> question is what we should be recommending as best practice going forward.
>>
>> Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sroben...@highwire.org>
>> wrote:
>>
>>> Hi Jack,
>>>
>>> At the bottom of the instance-types page, there is a link to the
>>> previous generations, which includes the older series (m1, m2, etc), many
>>> of which have HDD options.
>>>
>>> There are also the d2 (Dense Storage) instances in the current
>>> generation that include various combos of local HDDs.
>>>
>>> The i2 series has good sized SSDs available, and has the enhanced
>>> networking option, which is also useful for Cassandra. The enhanced
>>> networking is available with other instance types as well, as you'll see on
>>> the feature list under each type.
>>>
>>> Steve
>>>
>>>
>>>
>>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <jack.krupan...@gmail.com
>>> > wrote:
>>>
>>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic
>>>> question, it seems like magnetic (HDD) is no longer a recommended storage
>>>> option for databases on AWS. In particular, only the C2 Dense Storage
>>>> instances have local magnetic storage - all the other instance types are
>>>> SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data
>>>> Access."
>>>>
>>>> For the record, that AWS doc has Cassandra listed as a use case for i2
>>>> instance types.
>>>>
>>>> Also, the AWS doc lists EBS io1 for the NoSQL database use case and gp2
>>>> only for the "small to medium databases" use case.
>>>>
>>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is
>>>> the doc simply for any newly started instances?
>>>>
>>>> See:
>>>> https://aws.amazon.com/ec2/instance-types/
>>>> http://aws.amazon.com/ebs/details/
>>>>
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>>>> wrote:
>>>>
>>>>> > My apologies if my questions are actually answered on the video or
>>>>> slides, I just did a quick scan of the slide text.
>>>>>
>>>>> Virtually all of them are covered.
>>>>>
>>>>> > I'm curious where the EBS physical devices actually reside - are
>>>>> they in the same rack, the same data center, same availability zone? I
>>>>> mean, people try to minimize network latency between nodes, so how exactly
>>>>> is EBS able to avoid network latency?
>>>>>
>>>>> Not published, and probably not a straightforward answer (probably
>>>>> have redundancy cross-az, if it matches some of their other published
>>>>> behaviors). The promise they give you is ‘iops’, with a certain block
>>>>> size. Some instance types are optimized for dedicated, ebs-only network
>>>>> interfaces. Like most things in cassandra / cloud, the only way to know
>>>>> for sure is to test it yourself and see if observed latency is acceptable
>>>>> (or trust our testing, if you assume we’re sufficiently smart and honest).
>>>>>
>>>>> > Did your test use Amazon EBS–Optimized Instances?
>>>>>
>>>>> We tested dozens of instance type/size combinations (literally). The
>>>>> best performance was clearly with ebs-optimized instances that also have
>>>>> enhanced networking (c4, m4, etc) - slide 43
>>>>>
>>>>> > SSD or magnetic or does it make any difference?
>>>>>
>>>>> SSD, GP2 (slide 64)
>>>>>
>>>>> > What info is available on EBS performance at peak times, when
>>>>> multiple AWS customers have spikes of demand?
>>>>>
>>>>> Not published, but experiments show that we can hit 10k iops all day
>>>>> every day with only trivial noisy neighbor problems, not enough to impact
>>>>> a real cluster (slide 58)
>>>>>
>>>>> > Is RAID much of a factor or help at all using EBS?
>>>>>
>>>>> You can use RAID to get higher IOPS than you’d normally get by default
>>>>> (GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need
>>>>> more than 10k, you can stripe volumes together up to the ebs network link
>>>>> max) (hinted at in slide 64)
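
To make the arithmetic above concrete, here is a minimal Python sketch of gp2
sizing. It assumes the 2016-era gp2 baseline of 3 IOPS per GiB with a 100 IOPS
floor, plus the 10k per-volume cap quoted above. The function names and the
link_cap_iops parameter are made up for illustration, and the real ceiling for
a stripe depends on the instance's EBS bandwidth, so treat the numbers as
assumptions and check the current AWS docs before relying on them.

```python
from typing import Optional

# Assumed 2016-era gp2 limits; these are not taken from a live AWS API.
GP2_IOPS_PER_GIB = 3      # assumed baseline ratio
GP2_MIN_IOPS = 100        # assumed floor for small volumes
GP2_MAX_IOPS = 10_000     # per-volume cap quoted in this thread


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a single gp2 volume of the given size."""
    return min(GP2_MAX_IOPS, max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib))


def striped_iops(size_gib: int, volumes: int,
                 link_cap_iops: Optional[int] = None) -> int:
    """Aggregate baseline IOPS of a RAID 0 stripe of identical gp2 volumes.

    link_cap_iops is a hypothetical, instance-specific ceiling standing in
    for the EBS network link limit; None means no cap is applied.
    """
    total = gp2_baseline_iops(size_gib) * volumes
    return total if link_cap_iops is None else min(total, link_cap_iops)
```

Under these assumptions, gp2_baseline_iops(1000) gives 3000, and striping two
4000 GiB volumes gives 20000 before any instance-level cap is applied.
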
>>>>>
>>>>> > How exactly is EBS provisioned in terms of its own HA - I mean, with
>>>>> a properly configured Cassandra cluster RF provides HA, so what is the
>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>
>>>>> There is HA, I’m not sure that AWS publishes specifics. Occasionally
>>>>> specific volumes will have issues (hypervisor’s dedicated ethernet link to
>>>>> EBS network fails, for example). Occasionally instances will have issues.
>>>>> The volume-specific issues seem to be less common than the instance-store
>>>>> “instance retired” or “instance is running on degraded hardware” events.
>>>>> Stop/Start and you’ve recovered (possible with EBS, not possible with
>>>>> instance store). The assurances are in AWS’ SLA – if the SLA is
>>>>> insufficient (and it probably is insufficient), use more than one AZ
>>>>> and/or AWS region or cloud vendor.
>>>>>
>>>>> > For multi-data center operation, what configuration options assure
>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>
>>>>> It used to be true that EBS control plane for a given region spanned
>>>>> AZs. That’s no longer true. AWS asserts that failure modes for each AZ are
>>>>> isolated (data may replicate between AZs, but a full outage in us-east-1a
>>>>> shouldn’t affect running ebs volumes in us-east-1b or us-east-1c). Slide 65
>>>>>
>>>>> > In terms of syncing data for the commit log, if the OS call to sync
>>>>> an EBS volume returns, is the commit log data absolutely 100% synced at
>>>>> the hardware level on the EBS end, such that a power failure of the
>>>>> systems on which the EBS volumes reside will still guarantee availability
>>>>> of the fsynced data? As well, is return from fsync an absolute guarantee
>>>>> of sstable durability when Cassandra is about to delete the commit log,
>>>>> including when the two are on different volumes? In practice, we would
>>>>> like some significant degree of pipelining of data, such as during the
>>>>> full processing of flushing memtables, but for the fsync at the end a
>>>>> solid guarantee is needed.
>>>>>
>>>>> Most of the answers in this block are “probably not 100%, you should
>>>>> be writing to more than one host/AZ/DC/vendor to protect your organization
>>>>> from failures”. AWS targets something like 0.1% annual failure rate per
>>>>> volume and 99.999% availability (slide 66). We believe they’re exceeding
>>>>> those goals (at least based on the petabytes of data we have on gp2
>>>>> volumes).
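
For a sense of scale, the two targets quoted above can be turned into rough
yearly numbers. This is only a back-of-the-envelope Python sketch: it takes
the 0.1% annual failure rate and 99.999% availability figures from this
thread at face value, treats volume failures as independent, and uses
function names invented for the example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def expected_volume_failures(volumes: int, afr: float = 0.001) -> float:
    """Expected number of volume failures per year across a fleet,
    assuming independent failures at the given annual failure rate."""
    return volumes * afr


def downtime_minutes_per_year(availability: float = 0.99999) -> float:
    """Minutes per year a volume may be unavailable at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR
```

With those inputs, a fleet of 1000 volumes would expect about one volume
failure per year, and 99.999% availability allows a little over five minutes
of unavailability per volume per year.
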
>>>>>
>>>>>
>>>>>
>>>>> From: Jack Krupansky
>>>>> Reply-To: "user@cassandra.apache.org"
>>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>>
>>>>> To: "user@cassandra.apache.org"
>>>>> Subject: Re: EC2 storage options for C*
>>>>>
>>>>> I'm not a fan of video - this appears to be the slideshare corresponding
>>>>> to the video:
>>>>>
>>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>>
>>>>> My apologies if my questions are actually answered on the video or
>>>>> slides, I just did a quick scan of the slide text.
>>>>>
>>>>> I'm curious where the EBS physical devices actually reside - are they
>>>>> in the same rack, the same data center, same availability zone? I mean,
>>>>> people try to minimize network latency between nodes, so how exactly is
>>>>> EBS able to avoid network latency?
>>>>>
>>>>> Did your test use Amazon EBS–Optimized Instances?
>>>>>
>>>>> SSD or magnetic or does it make any difference?
>>>>>
>>>>> What info is available on EBS performance at peak times, when multiple
>>>>> AWS customers have spikes of demand?
>>>>>
>>>>> Is RAID much of a factor or help at all using EBS?
>>>>>
>>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with a
>>>>> properly configured Cassandra cluster RF provides HA, so what is the
>>>>> equivalent for EBS? If I have RF=3, what assurance is there that those
>>>>> three EBS volumes aren't all in the same physical rack?
>>>>>
>>>>> For multi-data center operation, what configuration options assure
>>>>> that the EBS volumes for each DC are truly physically separated?
>>>>>
>>>>> In terms of syncing data for the commit log, if the OS call to sync an
>>>>> EBS volume returns, is the commit log data absolutely 100% synced at the
>>>>> hardware level on the EBS end, such that a power failure of the systems
>>>>> on which the EBS volumes reside will still guarantee availability of the
>>>>> fsynced data? As well, is return from fsync an absolute guarantee of
>>>>> sstable durability when Cassandra is about to delete the commit log,
>>>>> including when the two are on different volumes? In practice, we would
>>>>> like some significant degree of pipelining of data, such as during the
>>>>> full processing of flushing memtables, but for the fsync at the end a
>>>>> solid guarantee is needed.
>>>>>
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <eric.pl...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Jeff,
>>>>>>
>>>>>> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not
>>>>>> discounting EBS, but prior outages are worrisome.
>>>>>>
>>>>>>
>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Free to choose what you'd like, but EBS outages were also addressed
>>>>>>> in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't
>>>>>>> the same as 2011 EBS.
>>>>>>>
>>>>>>> --
>>>>>>> Jeff Jirsa
>>>>>>>
>>>>>>>
>>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <eric.pl...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Thank you all for the suggestions. I'm torn between GP2 vs
>>>>>>> Ephemeral. GP2 after testing is a viable contender for our workload. The
>>>>>>> only worry I have is EBS outages, which have happened.
>>>>>>>
>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Also in that video - it's long but worth watching
>>>>>>>>
>>>>>>>> We tested up to 1M reads/second as well, blowing out page cache to
>>>>>>>> ensure we weren't "just" reading from memory
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jeff Jirsa
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <
>>>>>>>> jack.krupan...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> How about reads? Any differences between read-intensive and
>>>>>>>> write-intensive workloads?
>>>>>>>>
>>>>>>>> -- Jack Krupansky
>>>>>>>>
>>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <
>>>>>>>> jeff.ji...@crowdstrike.com> wrote:
>>>>>>>>
>>>>>>>>> Hi John,
>>>>>>>>>
>>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M
>>>>>>>>> writes per second on 60 nodes, we didn’t come close to hitting even
>>>>>>>>> 50% utilization (10k is more than enough for most workloads). PIOPS
>>>>>>>>> is not necessary.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: John Wong
>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>
>>>>>>>>> For production I'd stick with ephemeral disks (aka instance
>>>>>>>>> storage) if you are running a lot of transactions.
>>>>>>>>> However, for a regular small testing/qa cluster, or something you
>>>>>>>>> know you want to reload often, EBS is definitely good enough and we
>>>>>>>>> haven't had issues 99% of the time. The 1% is a kind of anomaly
>>>>>>>>> where flushes are blocked.
>>>>>>>>>
>>>>>>>>> But Jeff, kudos that you are able to use EBS. I didn't go through
>>>>>>>>> the video, do you actually use PIOPS or just standard GP2 in your
>>>>>>>>> production cluster?
>>>>>>>>>
>>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <
>>>>>>>>> br...@blockcypher.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yep, that motivated my question "Do you have any idea what kind
>>>>>>>>>> of disk performance you need?". If you need the performance, it's
>>>>>>>>>> hard to beat ephemeral SSD in RAID 0 on EC2, and it's a solid,
>>>>>>>>>> battle-tested configuration. If you don't, though, EBS GP2 will
>>>>>>>>>> save a _lot_ of headache.
>>>>>>>>>>
>>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found
>>>>>>>>>> our choice of instance dictated much more by the balance of price,
>>>>>>>>>> CPU, and memory. We're using GP2 SSD and we find that for our
>>>>>>>>>> patterns the disk is rarely the bottleneck. YMMV, of course.
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <
>>>>>>>>>> jeff.ji...@crowdstrike.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>>>>>> instances with GP2 EBS.  When you don’t care about replacing a node
>>>>>>>>>>> because of an instance failure, go with i2+ephemerals. Until then,
>>>>>>>>>>> GP2 EBS is capable of amazing things, and greatly simplifies life.
>>>>>>>>>>>
>>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It’s
>>>>>>>>>>> very much a viable option, despite any old documents online that say
>>>>>>>>>>> otherwise.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> From: Eric Plowe
>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>>
>>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We
>>>>>>>>>>> are thinking about going with ephemeral SSDs. The question is this:
>>>>>>>>>>> Should we put two in RAID 0 or just go with one? We currently run a
>>>>>>>>>>> cluster in our data center with 2 250gig Samsung 850 EVO's in RAID 0
>>>>>>>>>>> and we are happy with the performance we are seeing thus far.
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> Eric
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Steve Robenalt
>>> Software Architect
>>> sroben...@highwire.org <bza...@highwire.org>
>>> (office/cell): 916-505-1785
>>>
>>> HighWire Press, Inc.
>>> 425 Broadway St, Redwood City, CA 94063
>>> www.highwire.org
>>>
>>> Technology for Scholarly Communication
>>>
>>
>>
> --
Ben Bromhead
CTO | Instaclustr
+1 650 284 9692
