[ 
https://issues.apache.org/jira/browse/CASSANDRA-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349355#comment-14349355
 ] 

Ariel Weisberg commented on CASSANDRA-8692:
-------------------------------------------

Are you testing with or without vnodes? Last time we talked it was without 
vnodes. I would expect that without vnodes you would get the performance boost 
even at large cluster sizes because each individual node should be frequently 
messaging with up to RF - 1 other nodes and there should be an opportunity to 
coalesce.
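
To make the coalescing opportunity concrete, here is a minimal sketch of a 
time-window coalescer for one outbound connection. It is not the code from the 
patch: the function name `coalesce`, the parameters `window_s` and `max_batch`, 
and their default values are hypothetical and chosen only for illustration.

```python
import queue
import time


def coalesce(q, window_s=0.0002, max_batch=128):
    """Block for one message, then gather more until the window closes.

    Hypothetical illustration only; the names and defaults are assumptions.
    """
    batch = [q.get()]  # wait until at least one message is available
    deadline = time.monotonic() + window_s
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # window expired, flush what we have
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break  # nothing else arrived within the window
    return batch
```

The fewer peers a node talks to, the more messages land on each connection's 
queue within one window, so each flush can carry a larger batch.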

If you aren't seeing any benefit then I think there may be other bottlenecks in 
play. At 96 nodes are you able to reach CPU saturation at each node? I am not 
able to saturate in AWS and there is the weird step from 500% utilization to 
2000% with no change in throughput or latency. I think there is something in 
the way there that I want to look into. You are also measuring workloads that 
could be bottlenecked on other things. You really have to be specific about 
what you are measuring when determining whether this optimization is expected 
to help.

With vnodes I wouldn't expect much benefit as you grow the cluster. I also 
didn't see a huge boost in GCE; the results are in an earlier comment.

You also don't get the most mileage out of this change until you combine it 
with CASSANDRA-8789, which routes more messages over the same socket, doubling 
the opportunity to coalesce if you are doing reads.

I don't feel passionate about getting this into 2.1, but I would like to move 
on to investigating the utilization issues I have found.


> Coalesce intra-cluster network messages
> ---------------------------------------
>
>                 Key: CASSANDRA-8692
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8692
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.4
>
>         Attachments: batching-benchmark.png
>
>
> While researching CASSANDRA-8457 we found that coalescing is effective and 
> can be done without introducing additional latency at low 
> concurrency/throughput.
> The patch from that was used and found to be useful in a real life scenario 
> so I propose we implement this in 2.1 in addition to 3.0.
> The change set is a single file and is small enough to be reviewable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
