Joseph Lynch created CASSANDRA-14764:
----------------------------------------

             Summary: Evaluate 12 Node Breaking Point, compression=none, 
encryption=none, coalescing=off
                 Key: CASSANDRA-14764
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14764
             Project: Cassandra
          Issue Type: Sub-task
            Reporter: Joseph Lynch


*Setup:*
 * Cassandra: 12 (2*6) node i3.xlarge AWS instances (4 CPU cores, 30GB RAM each) 
running Cassandra trunk off of jasobrown/14503 jdd7ec5a2 (Jason's patched 
internode messaging branch) vs the same footprint running 3.0.17
 * Two datacenters with 100ms latency between them
 * No compression, encryption, or coalescing turned on

*Test #1:*

ndbench sent 1.5k QPS at the coordinator level to one datacenter (RF=3*2 = 6 so 
3k global replica QPS) of 4KB single-partition BATCH mutations at LOCAL_ONE. 
This represents about 250 QPS per coordinator in the first datacenter, or about 
60 QPS per core. The goal was to observe P99 write and read latencies at 
various QPS levels.

*Result:*

The good news is that since the CASSANDRA-14503 changes, instead of keeping the 
mutations on heap we put the messages into hints and don't run out of 
memory. The bad news is that the {{MessagingService-NettyOutbound-Thread}} 
threads would occasionally enter a degraded state where they would just spin on 
a core. I've attached flame graphs showing the CPU state as [~jasobrown] applied 
fixes to the {{OutboundMessagingConnection}} class.


*Follow Ups:*
[~jasobrown] has committed a number of fixes onto his 
{{jasobrown/14503-collab}} branch including:
1. Limiting the amount of time spent dequeuing messages if they are expired 
(previously, if messages entered the queue faster than we could dequeue them, 
we'd just loop infinitely on the consumer side)
2. Not calling {{dequeueMessages}} from within callbacks created by 
{{dequeueMessages}}.
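
To make the two fixes above concrete, here is a minimal single-threaded sketch 
of the general idea. It is not the code on the {{jasobrown/14503-collab}} 
branch: the class {{BoundedDequeueSketch}}, its {{Message}} type, the injected 
clock, the {{onSent}} hook and the re-entrancy flag are all hypothetical names 
made up for illustration, and the sketch bounds the whole dequeue pass rather 
than only the expired-message path. It assumes a single consumer thread, as on 
a Netty event loop.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.LongSupplier;

/** Hypothetical sketch only; not the real OutboundMessagingConnection. */
class BoundedDequeueSketch {
    /** Minimal stand-in for a queued internode message. */
    static final class Message {
        final String payload;
        final long expiresAtNanos;

        Message(String payload, long expiresAtNanos) {
            this.payload = payload;
            this.expiresAtNanos = expiresAtNanos;
        }
    }

    final Queue<Message> queue = new ArrayDeque<>();
    final List<String> sent = new ArrayList<>();
    Runnable onSent = () -> {};          // completion callback, run after each send
    int reentrantCallsSkipped = 0;       // how often the guard absorbed a nested call

    private final LongSupplier clock;    // injected so the example is deterministic
    private final long budgetNanos;      // max time one pass may spend dequeuing
    private boolean running = false;     // true while a pass is on the stack

    BoundedDequeueSketch(LongSupplier clock, long budgetNanos) {
        this.clock = clock;
        this.budgetNanos = budgetNanos;
    }

    /**
     * Drains the queue, dropping expired messages and sending live ones.
     * Returns true if the pass gave up with messages still queued, in which
     * case the caller is expected to schedule another pass later.
     */
    boolean dequeueMessages() {
        if (running) {                   // fix 2: callbacks must not recurse
            reentrantCallsSkipped++;
            return false;
        }
        running = true;
        try {
            long deadline = clock.getAsLong() + budgetNanos;
            Message m;
            while ((m = queue.peek()) != null) {
                long now = clock.getAsLong();
                if (now - deadline >= 0) {
                    return true;         // fix 1: out of budget, yield the thread
                }
                queue.poll();
                if (now - m.expiresAtNanos >= 0) {
                    continue;            // expired: drop without sending
                }
                sent.add(m.payload);
                onSent.run();            // may call dequeueMessages(); the guard absorbs it
            }
            return false;
        } finally {
            running = false;
        }
    }
}
```

With a budget in place, a producer that enqueues expired messages faster than 
the consumer can drop them can no longer pin the thread in one pass; the pass 
returns and the remaining messages wait for the next one. The flag is one 
possible way to keep callbacks from re-entering the drain loop; the actual fix 
may restructure the callbacks instead.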

We're continuing to use CPU flame graphs to figure out where we're looping, and 
fixing bugs as we find them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
