Howdy,

As a follow-up to my earlier thread, I wanted to share some of our experimental data from using an auto-tuning agent to find optimal thread pool settings for C*. The results surprised me: they invalidated a lot of previous thinking about tuning C* and call into question some of the decisions that were made in the 3.x timeline.
https://vorstella.com/blog/autotuning-cassandra-to-reduce-latencies/

I want to flag that some of this research is more than a year old, and the blog post focuses on the 2.2 branch. We also tested with 3.x and found that the results do transfer, but Avinash didn't have those graphs handy when we went to write it up.

We were inspired by the OtterTune blog post. Using that and a couple of other papers, I hacked together an agent that's able to spin up multiple Cassandra clusters in Kubernetes (k8s), run a load-generating container against each cluster, observe the results, and deploy a new configuration recommended by the ML model. The hope was that we could find a smooth, multi-dimensional performance surface that would let us observe a workload and make a settings recommendation.

We tested around 20 tunable knobs, including heap settings, and applied dimensionality reduction techniques to arrive at a few dominant knobs to use in demos. What we found is that the relationship between many of the configurable variables is non-linear, and that the optimal settings for thread pools depend heavily not only on the read/write mix but also on request sizes. *We were able to demonstrate a 43% latency reduction and an 80% throughput increase over documented best practices.*

Additionally, we found that there is no single good value for MCT, and that the decision to embed a simplistic hard-coded model in 3.x was probably a mistake.

We used the same model/methodology paired with a Gatling container that mimicked a customer workload and were able to demonstrate a 2x lift in throughput and a 60% reduction in latency with a DataStax Search customer; these performance gains allowed them to deploy to EBS on AWS successfully.

We haven't moved this model to production yet, having chosen to focus on a couple of other items first. But if you're interested in engaging further or have questions about our research, we'd be happy to chat.
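To give a feel for the shape of the agent, here's a minimal, self-contained sketch of the propose→deploy→observe loop. This is NOT our production code: random search stands in for the actual ML model, a toy latency surface stands in for deploying to k8s and running the load generator, and while the three knob names are real cassandra.yaml settings, the ranges and the interaction terms are purely illustrative.

```python
import random

# A subset of the ~20 knobs we explored. The names are real cassandra.yaml
# settings; the search ranges here are illustrative, not recommendations.
KNOBS = {
    "concurrent_reads": (8, 128),
    "concurrent_writes": (8, 128),
    "native_transport_max_threads": (32, 512),
}

def benchmark(config):
    """Stand-in for deploying `config` to a cluster and running the load
    generator; returns a synthetic p99 latency (lower is better). The real
    agent measured a live cluster -- this toy surface only mimics the
    non-linear, interacting behavior we observed between thread pools."""
    r = config["concurrent_reads"]
    w = config["concurrent_writes"]
    t = config["native_transport_max_threads"]
    # Interacting terms: the best transport thread count depends on reads.
    return abs(r - 48) * 0.4 + abs(w - 64) * 0.3 + abs(t - r * 4) * 0.05

def random_config():
    """Propose a candidate configuration (the ML model's job in the real agent)."""
    return {k: random.randint(lo, hi) for k, (lo, hi) in KNOBS.items()}

def tune(iterations=200, seed=0):
    """The core loop: propose a config, 'deploy' it, observe, keep the best."""
    random.seed(seed)
    best_cfg, best_lat = None, float("inf")
    for _ in range(iterations):
        cfg = random_config()
        lat = benchmark(cfg)
        if lat < best_lat:
            best_cfg, best_lat = cfg, lat
    return best_cfg, best_lat

if __name__ == "__main__":
    cfg, lat = tune()
    print(cfg, round(lat, 2))
```

In the real system, each `benchmark` call was a full cycle of rolling out a config to a Cassandra cluster in k8s and running a load container against it, and the proposal step was the trained model rather than uniform sampling; the loop structure is the same.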