[ https://issues.apache.org/jira/browse/CASSANDRA-19534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843376#comment-17843376 ]
Jon Haddad commented on CASSANDRA-19534:
----------------------------------------

I'm running these two workloads concurrently:

{noformat}
easy-cass-stress run RandomPartitionAccess --rate 50k -r .5 -d 10h --workload.rows=100000 --workload.select=partition
easy-cass-stress run KeyValue --rate 50k -r .5 -d 24h -p 10m --populate 100m
{noformat}

In this screenshot, the top node is running the branch; the other two are running 5.0-HEAD. The first node has completed more native transport requests and has a significantly less backed-up queue:

!screenshot-1.png!

The cluster has reached a point where it's failing a ton, so I've stopped the workload to see how fast it recovers. The cassandra0 node with the branch recovered almost immediately. The other nodes took approximately 10 seconds.

I restarted the above two workloads and added a third:

{noformat}
easy-cass-stress run KeyValue --keyspace test1 --field.keyvalue.value='random(1024,2048)' -p 1m -r .5 --populate 1m
{noformat}

The mix of expensive and cheap reads is an easy way to create a deep queue for NTR. It wasn't long before I got to this:

!screenshot-2.png!

It looks like load is being shed much faster off cassandra0:

!screenshot-3.png!

Within 10 seconds the first node had fully recovered; it took about 10 additional seconds for the other two nodes to recover as well.

!screenshot-4.png!

I've rerun this several times now and am consistently finding [~ifesdjeen]'s patched version recovers quicker. The boxes are all running at 99+% CPU, and cassandra0 each time continues to complete more requests, maintain a shallower queue, and recover first.

!screenshot-5.png!

Starting my test with all 3 nodes running the patch.
> unbounded queues in native transport requests lead to node instability
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-19534
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19534
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Local Write-Read Paths
>            Reporter: Jon Haddad
>            Assignee: Alex Petrov
>            Priority: Normal
>             Fix For: 4.1.x, 5.0-rc, 5.x
>
>         Attachments: Scenario 1 - QUEUE + Backpressure.jpg, Scenario 1 - QUEUE.jpg, Scenario 1 - Stock.jpg, Scenario 2 - QUEUE + Backpressure.jpg, Scenario 2 - QUEUE.jpg, Scenario 2 - Stock.jpg, ci_summary.html, screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png, screenshot-5.png
>
>          Time Spent: 20m
>   Remaining Estimate: 0h
>
> When a node is under pressure, hundreds of thousands of requests can show up in the native transport queue, and it looks like it can take way longer to time out than is configured. We should be shedding load much more aggressively and use a bounded queue for incoming work.
> This is extremely evident when we combine a resource-consuming workload with a smaller one.
>
> Running 5.0 HEAD on a single node as of today:
>
> {noformat}
> # populate only
> easy-cass-stress run RandomPartitionAccess -p 100 -r 1 --workload.rows=100000 --workload.select=partition --maxrlat 100 --populate 10m --rate 50k -n 1
>
> # workload 1 - larger reads
> easy-cass-stress run RandomPartitionAccess -p 100 -r 1 --workload.rows=100000 --workload.select=partition --rate 200 -d 1d
>
> # second workload - small reads
> easy-cass-stress run KeyValue -p 1m --rate 20k -r .5 -d 24h
> {noformat}
>
> It appears our results don't time out at the requested server time either:
>
> {noformat}
>          Writes                              Reads                               Deletes                             Errors
>  Count    Latency (p99)  1min (req/s) | Count    Latency (p99)  1min (req/s) | Count  Latency (p99)  1min (req/s) | Count    1min (errors/s)
>  950286   70403.93       634.77       | 789524   70442.07       426.02       | 0      0              0            | 9580484  18980.45
>  952304   70567.62       640.1        | 791072   70634.34       428.36       | 0      0              0            | 9636658  18969.54
>  953146   70767.34       640.1        | 791400   70767.76       428.36       | 0      0              0            | 9695272  18969.54
>  956833   71171.28       623.14       | 794009   71175.6        412.79       | 0      0              0            | 9749377  19002.44
>  959627   71312.58       656.93       | 795703   71349.87       435.56       | 0      0              0            | 9804907  18943.11
> {noformat}
>
> After stopping the load test altogether, it took nearly a minute before the requests were no longer queued.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
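The core idea in the description — replace an unbounded incoming-work queue with a bounded one and shed requests that don't fit — can be sketched in a few lines of Java. This is an illustrative toy, not Cassandra's actual native-transport code; the class {{BoundedQueueShed}}, the {{CAPACITY}} constant, and the {{accepted}} helper are hypothetical names for the demo:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch only: shows how a bounded queue sheds load once full, in contrast
// to an unbounded queue that accepts everything and backs up. Not the
// actual Cassandra NTR implementation.
public class BoundedQueueShed {
    static final int CAPACITY = 4; // hypothetical small bound for the demo

    // Try to enqueue `requests` items; return how many were accepted.
    public static int accepted(BlockingQueue<Integer> queue, int requests) {
        int ok = 0;
        for (int i = 0; i < requests; i++) {
            // offer() returns false immediately when the queue is full,
            // instead of blocking; a server would reject the request with
            // an overloaded-style error at this point rather than queue it.
            if (queue.offer(i)) {
                ok++;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        BlockingQueue<Integer> q = new ArrayBlockingQueue<>(CAPACITY);
        int ok = accepted(q, 10);
        // Only CAPACITY requests fit; the remaining 6 are shed immediately.
        System.out.println("accepted=" + ok + " shed=" + (10 - ok));
    }
}
```

The point of the bound is that rejection happens at admission time, so a client sees an error (and can back off) within milliseconds, instead of its request sitting in a deep queue until it times out long after the configured limit — which matches the slow-recovery behavior described above.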