Latency SLAs are very much *not* Cassandra's sweet spot; scaling throughput and storage is where C*'s strengths shine. If all you care about is median latency, you'll find things a bit more amenable to modeling, but not if you have 2 nines, and particularly not 3 nines, SLA expectations. Basically, the harder you push on the nodes, the more you get sporadic but non-ignorable timing artifacts from garbage collection and from IO stalls when the flushing of writes chokes out disk reads. Also, running in AWS, you'll find that noisy neighbors are a routine issue no matter what the specifics of your use are.
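If you want to watch that tail behavior directly rather than try to model it, here's a minimal sketch that polls one node's coordinator-level read latencies over JMX. It assumes the default JMX port 7199 with no JMX auth, and the standard ClientRequest metric MBean names; the host and the 5-second interval are just placeholders:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ReadLatencyTail {
        public static void main(String[] args) throws Exception {
            String host = args.length > 0 ? args[0] : "127.0.0.1";  // placeholder node
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
            try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
                // Coordinator-level read latency, exposed via Cassandra's metrics library.
                ObjectName readLatency = new ObjectName(
                        "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency");
                while (true) {
                    // Latency values are reported in microseconds.
                    Object mean = mbsc.getAttribute(readLatency, "Mean");
                    Object p99 = mbsc.getAttribute(readLatency, "99thPercentile");
                    System.out.printf("read latency: mean=%s us, p99=%s us%n", mean, p99);
                    Thread.sleep(5000);  // sample every 5 s; spikes here are the GC/IO artifacts
                }
            }
        }
    }

nodetool proxyhistograms will show you roughly the same percentiles from the command line if you'd rather not write code. The reason to poll over time is that GC pauses and flush-induced IO stalls show up as spikes in the p99 that the mean hides almost completely.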
What your actual data model is, what your read and write patterns look like, the impact of deletes and TTLs requiring tombstone cleanup, etc., all dramatically change the picture. If you aren't already aware of it, there is a tool called cassandra-stress that can help you run some experiments. The challenge, though, is determining whether the experiments are representative of what your actual usage will be. Because of the GC issues in anything implemented on a JVM or interpreter, it's pretty easy to fall off the cliff of relevance. TLP wrote an article about some of the challenges of doing this with cassandra-stress: https://thelastpickle.com/blog/2017/02/08/Modeling-real-life-workloads-with-cassandra-stress.html

Note that one way to care less about variable latency is to make use of speculative retry (there's a small client-side sketch at the very bottom of this message, below the quoted thread). Basically you're trading off some of your median throughput to help achieve a latency SLA. The benefit of that tradeoff breaks down by the time you get to 3 nines.

I'm actually hoping to start on some modeling of what the latency surface looks like under different assumptions in the new year, not because I expect the specific numbers to translate to anybody else, but just to show how the underlying dynamics evidence themselves in metrics when C* nodes are under duress.

R

From: Fred Habash <fmhab...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, December 10, 2019 at 9:57 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Predicting Read/Write Latency as a Function of Total Requests & Cluster Size

I'm looking for an empirical way to answer these two questions:

1. If I increase application workload (read/write requests) by some percentage, how is it going to affect read/write latency? Of course, all other factors remaining constant, e.g. ec2 instance class, ssd specs, number of nodes, etc.
2. How many nodes do I have to add to maintain a given read/write latency?

Are there any methods or instruments out there that can help answer these questions?

----------------------------------------
Thank you
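P.S. Here is the client-side speculative retry sketch promised above, using the DataStax Java driver 3.x speculative execution policy. The contact point, keyspace, table, and the 120 ms hedge delay are all made-up placeholders; you'd typically pin the delay near your observed p99. The server-side counterpart is the per-table option, e.g. ALTER TABLE ... WITH speculative_retry = '99PERCENTILE'.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;
    import com.datastax.driver.core.Statement;
    import com.datastax.driver.core.policies.ConstantSpeculativeExecutionPolicy;

    public class SpeculativeRead {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")  // placeholder contact point
                    // If a node hasn't answered within 120 ms, hedge with up to
                    // 2 extra executions against other replicas.
                    .withSpeculativeExecutionPolicy(
                            new ConstantSpeculativeExecutionPolicy(120, 2))
                    .build();
                 Session session = cluster.connect("my_ks")) {  // placeholder keyspace
                Statement read = new SimpleStatement(
                        "SELECT * FROM my_table WHERE id = ?", 42)  // placeholder query
                        // The driver only hedges statements explicitly marked idempotent.
                        .setIdempotent(true);
                ResultSet rs = session.execute(read);
                System.out.println(rs.one());
            }
        }
    }

The caveat is the one mentioned above: every speculative execution is extra load on the cluster, which is exactly where the median-throughput cost comes from.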