[ 
https://issues.apache.org/jira/browse/CASSANDRA-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331211#comment-16331211
 ] 

Jason Brown commented on CASSANDRA-14174:
-----------------------------------------

Here's a patch for trunk:

||trunk||
|[branch|https://github.com/jasobrown/cassandra/tree/14174-trunk]|
|[utests & dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/14174-trunk]|


> Remove GossipDigestSynVerbHandler#doSort()
> ------------------------------------------
>
>                 Key: CASSANDRA-14174
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14174
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>             Fix For: 4.x
>
>
> I have personally tripped up on this function a couple of times over the 
> years, believing that it contributes to bugs in some way or another. While I 
> have not found that (necessarily!) to be the case, I feel this function is 
> completely useless in the grand scope of things.
> Going back through the mists of time (that is, {{git log}}), it appears this 
> function was part of the original code drop from Facebook when they open 
> sourced Cassandra. Looking at the {{#doSort()}} method, all it does is sort 
> the incoming list of {{GossipDigest}} instances by the difference between the 
> remote node's maxValue for a given peer and the local node's maxValue.
> The only universe where this is actually an optimization is if you go back 
> and read the 
> [Scuttlebutt paper|https://www.cs.cornell.edu/home/rvr/papers/flowgossip.pdf] 
> (upon which Cassandra's Gossip anti-entropy reconciliation is based). The end 
> of section 3.2 describes ordering of the incoming digests such that, in the 
> case where you do not return all of the differences (because you are 
> optimizing for the return message size), you can gather the differences for 
> the peers which are most out of sync. The ordering implemented in Cassandra 
> is the second ordering described in the paper, called "scuttle depth".
> As we always send all differences between two nodes (message size be damned), 
> this optimization, borrowed from the paper, is largely irrelevant for 
> Cassandra's purposes.
> Thus, I propose we remove this method for the following gains:
>  - less garbage created
>  - less CPU (sure, it's mostly trivial; see next point)
>  - less time spent on unnecessary functionality on the *single threaded* 
> gossip stage.
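
To make the argument in the description concrete, here is a minimal sketch of the kind of ordering it describes, and of why that ordering has no effect when every difference is returned. This is not the Cassandra source: the {{Digest}} class, the {{DigestSortSketch}} class, its method names, and the {{localMaxVersions}} map are all hypothetical simplifications, and treating an unknown peer as version 0 is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for a gossip digest: a peer name plus
// the highest version the remote node has seen for that peer.
final class Digest {
    final String endpoint;
    final int maxVersion;

    Digest(String endpoint, int maxVersion) {
        this.endpoint = endpoint;
        this.maxVersion = maxVersion;
    }
}

final class DigestSortSketch {
    // Order digests so the peers that are most out of sync come first.
    // localMaxVersions is an assumed lookup of the local node's max version
    // per peer; unknown peers are treated as version 0 (an assumption).
    static List<Digest> sortByVersionGap(List<Digest> remote,
                                         Map<String, Integer> localMaxVersions) {
        List<Digest> sorted = new ArrayList<>(remote);
        sorted.sort(Comparator.comparingInt(
                (Digest d) -> Math.abs(localMaxVersions.getOrDefault(d.endpoint, 0) - d.maxVersion))
                .reversed());
        return sorted;
    }

    // Collect every peer whose version differs, with no cap on the reply size.
    static Set<String> peersNeedingSync(List<Digest> digests,
                                        Map<String, Integer> localMaxVersions) {
        Set<String> out = new HashSet<>();
        for (Digest d : digests) {
            if (localMaxVersions.getOrDefault(d.endpoint, 0) != d.maxVersion) {
                out.add(d.endpoint);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> local = Map.of("a", 10, "b", 3, "c", 7);
        List<Digest> remote = List.of(
                new Digest("a", 11), new Digest("b", 40), new Digest("c", 7));

        List<Digest> sorted = sortByVersionGap(remote, local);
        // The peer with the largest version gap is first after sorting.
        System.out.println(sorted.get(0).endpoint);
        // With no size cap, sorted and unsorted input select the same peers.
        System.out.println(
                peersNeedingSync(remote, local).equals(peersNeedingSync(sorted, local)));
    }
}
```

In this sketch the sort would only matter if {{peersNeedingSync}} stopped after some maximum number of entries; since it visits every digest, the result is the same set either way, which is the point the description makes about always sending all differences.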



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

