[ 
https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-14746:
-------------------------------------
    Description: 
Before we release 4.0, let's ensure that the internode messaging refactor is 
100% solid. Because internode messaging is used in many code paths and is 
widely configurable, a large number of cluster and test configurations must be 
vetted.

We plan to vary the following:
 * Cassandra version: 3.0.17 vs 4.0-alpha
 * Cluster sizes, with *multi-dc* deployments ranging from 6 to 100 nodes
 * Client request rates between 1k QPS and 100k QPS, with requests of varying 
sizes and shapes (BATCH, INSERT, SELECT point, SELECT range, etc.)
 * Internode compression
 * Internode SSL (as well as openssl vs jdk)
 * Internode Coalescing options
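
The dimensions above multiply into a sizeable test matrix. As a rough sketch 
(the concrete values are illustrative assumptions, only the endpoints of each 
range come from this ticket, and `build_matrix` is a hypothetical helper), the 
combinations could be enumerated like this:

```python
from itertools import product

# Illustrative values only: the ticket gives ranges, not an exact list.
DIMENSIONS = {
    "version": ["3.0.17", "4.0-alpha"],
    "cluster_size": [6, 100],             # endpoints of the 6 to 100 node range
    "client_qps": [1_000, 100_000],       # endpoints of the 1k to 100k QPS range
    "internode_compression": [False, True],
    "internode_ssl": ["none", "jdk", "openssl"],
    "coalescing": ["disabled", "enabled"],  # placeholder labels, not real option names
}


def build_matrix(dimensions):
    """Return one dict per combination of the given dimension values."""
    keys = list(dimensions)
    return [dict(zip(keys, combo)) for combo in product(*dimensions.values())]


matrix = build_matrix(DIMENSIONS)
print(len(matrix))
```

Even with only two values per range this yields 96 combinations, so in 
practice a subset of the matrix would probably have to be chosen.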

We are looking to measure the following as appropriate:
 * Latency distributions of reads and writes (lower is better)
 * Scaling limit, i.e. the maximum throughput before violating a p99 latency 
deadline of 100 ms, on a fixed hardware deployment for 100% writes, 100% reads 
and 50-50 writes+reads (higher is better)
 * Thread counts (lower is better)
 * Context switches (lower is better)
 * On-CPU time of tasks (longer periods without a context switch are better)
 * GC allocation rates / throughput for a fixed size heap (lower allocation is 
better)
 * Streaming recovery time for a single node failure, i.e. whether Cassandra 
can saturate the NIC
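
To make the scaling limit definition concrete, here is a minimal sketch of how 
it could be computed from load test results. It assumes the load is stepped 
through increasing request rates and that per-request latencies are recorded 
for each step; `p99` and `scaling_limit` are hypothetical helpers, not part of 
any existing tool, and the nearest-rank percentile is just one possible choice.

```python
P99_DEADLINE_MS = 100.0  # deadline stated in the ticket


def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    # ceil(0.99 * n) in integer arithmetic, as a 1-based rank
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def scaling_limit(runs, deadline_ms=P99_DEADLINE_MS):
    """Highest offered QPS reached before the first deadline violation.

    `runs` maps offered QPS -> list of observed latencies in ms.
    Returns None if even the lowest rate violates the deadline.
    """
    limit = None
    for qps in sorted(runs):
        if p99(runs[qps]) > deadline_ms:
            break
        limit = qps
    return limit
```

The same function would be applied separately to the 100% write, 100% read 
and 50-50 workloads, and the resulting limits compared between 3.0.17 and 
4.0-alpha on identical hardware.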

 

The goal is that 4.0 should have better latency, more throughput, fewer 
threads, fewer context switches, less GC allocation, and faster recovery time. 
I'm putting Jason Brown as the reviewer since he implemented most of the 
internode refactor.

Current collaborators driving this QA task: Dinesh Joshi (Apple), Jordan West 
(Apple), Joey Lynch (Netflix), Vinay Chella (Netflix)

Owning committer(s): Jason Brown (Apple)

  was:
Before we release 4.0, let's ensure that the internode messaging refactor is 
100% solid. Because internode messaging is used in many code paths and is 
widely configurable, a large number of cluster and test configurations must be 
vetted.

We plan to vary the following:
 * Cassandra version: 3.0.17 vs 4.0-alpha
 * Cluster sizes, with *multi-dc* deployments ranging from 6 to 100 nodes
 * Client request rates between 1k QPS and 100k QPS, with requests of varying 
sizes and shapes (BATCH, INSERT, SELECT point, SELECT range, etc.)
 * Internode compression
 * Internode SSL (as well as openssl vs jdk)
 * Internode Coalescing options

We are looking to measure the following as appropriate:
 * Latency distributions of reads and writes (lower is better)
 * Scaling limit, i.e. the maximum throughput before violating a p99 latency 
deadline of 100 ms, on a fixed hardware deployment for 100% writes, 100% reads 
and 50-50 writes+reads (higher is better)
 * Thread counts (lower is better)
 * Context switches (lower is better)
 * On-CPU time of tasks (longer periods without a context switch are better)
 * GC allocation rates / throughput for a fixed size heap (lower allocation is 
better)
 * Streaming recovery time for a single node failure, i.e. whether Cassandra 
can saturate the NIC

 

The goal is that 4.0 should have better latency, more throughput, fewer 
threads, fewer context switches, less GC allocation, and faster recovery time. 
I'm putting Jason Brown as the reviewer since he implemented most of the 
internode refactor.


> Ensure Netty Internode Messaging Refactor is Solid
> --------------------------------------------------
>
>                 Key: CASSANDRA-14746
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14746
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>            Reporter: Joseph Lynch
>            Priority: Major
>              Labels: 4.0-QA
>             Fix For: 4.0
>
>
> Before we release 4.0, let's ensure that the internode messaging refactor is 
> 100% solid. Because internode messaging is used in many code paths and is 
> widely configurable, a large number of cluster and test configurations must 
> be vetted.
> We plan to vary the following:
>  * Cassandra version: 3.0.17 vs 4.0-alpha
>  * Cluster sizes, with *multi-dc* deployments ranging from 6 to 100 nodes
>  * Client request rates between 1k QPS and 100k QPS, with requests of varying 
> sizes and shapes (BATCH, INSERT, SELECT point, SELECT range, etc.)
>  * Internode compression
>  * Internode SSL (as well as openssl vs jdk)
>  * Internode Coalescing options
> We are looking to measure the following as appropriate:
>  * Latency distributions of reads and writes (lower is better)
>  * Scaling limit, i.e. the maximum throughput before violating a p99 latency 
> deadline of 100 ms, on a fixed hardware deployment for 100% writes, 100% 
> reads and 50-50 writes+reads (higher is better)
>  * Thread counts (lower is better)
>  * Context switches (lower is better)
>  * On-CPU time of tasks (longer periods without a context switch are better)
>  * GC allocation rates / throughput for a fixed size heap (lower allocation 
> is better)
>  * Streaming recovery time for a single node failure, i.e. whether Cassandra 
> can saturate the NIC
>  
> The goal is that 4.0 should have better latency, more throughput, fewer 
> threads, fewer context switches, less GC allocation, and faster recovery 
> time. I'm putting Jason Brown as the reviewer since he implemented most of 
> the internode refactor.
> Current collaborators driving this QA task: Dinesh Joshi (Apple), Jordan West 
> (Apple), Joey Lynch (Netflix), Vinay Chella (Netflix)
> Owning committer(s): Jason Brown (Apple)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

