[ 
https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512597#comment-14512597
 ] 

Benedict commented on CASSANDRA-8789:
-------------------------------------

FTR, my current perception of this is:

* it does look to me like the increased throughput of the new code is a 
plausible cause of server degradation in these localhost tests, since we know 
that the server has no extra shedding logic in place beyond the normal timeout. 
** improved shedding should be addressed separately, e.g. CASSANDRA-8518
* that doesn't mean head of line blocking isn't a real concern, especially for 
low bandwidth links
** it is likely already an issue in 2.0/2.1 to a greater or lesser degree, 
given that gossip is already combined with read response data
** however this does change the exposure profile, and especially for smart 
routed clients it might exacerbate this problem in certain cases
** I don't think the exposure profile is sufficiently different to consider 
this a regression or to revert the other positive improvements delivered by 
this change
* I do think we can quite easily manage this by opening a new connection, 
managed by netty or raw NIO, over which we communicate only gossip messages (or 
other low frequency, high urgency messages) 
** this would in the typical case mean we are using no more connections than 
2.1 (though with large mutations/responses we may end up using 50% more 
connections), but:
*** these connections would not have significant threading impacts
*** nor would they have any impact on the improved throughput delivered by 
coalescing
* CASSANDRA-9237 is IMO a good place to continue this discussion
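
To make the proposal above concrete, here is a minimal Java sketch of how an 
outbound message might be assigned to a connection: by size for ordinary 
traffic, plus one extra connection reserved for gossip and other low 
frequency, high urgency messages. All names and the size threshold are 
hypothetical and chosen for illustration; they are not taken from the 
Cassandra source or from the attached patch.

```java
// Hypothetical sketch; none of these names come from the Cassandra code base.
final class ConnectionSelector {
    // SMALL and LARGE are the size-based connections; URGENT is the extra
    // connection reserved for gossip and similar messages.
    enum Channel { SMALL, LARGE, URGENT }

    // Assumed cutoff between "small" and "large" messages; illustrative only.
    static final int LARGE_MESSAGE_THRESHOLD_BYTES = 64 * 1024;

    // Urgent messages bypass size-based routing entirely, so they never queue
    // behind a large mutation or read response on the same socket.
    static Channel select(boolean isUrgent, int serializedSizeBytes) {
        if (isUrgent) {
            return Channel.URGENT;
        }
        return serializedSizeBytes > LARGE_MESSAGE_THRESHOLD_BYTES
                ? Channel.LARGE
                : Channel.SMALL;
    }

    public static void main(String[] args) {
        System.out.println(select(true, 200));       // a gossip-style message
        System.out.println(select(false, 200));      // a small acknowledgement
        System.out.println(select(false, 1 << 20));  // a 1 MiB mutation
    }
}
```

With this shape, a node that never sends large messages uses two outbound 
connections (SMALL and URGENT), and a third (LARGE) appears only when large 
mutations or responses are in flight.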

> OutboundTcpConnectionPool should route messages to sockets by size not type
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8789
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 3.0
>
>         Attachments: 8789.diff
>
>
> I was looking at this trying to understand what messages flow over which 
> connection.
> For reads the request goes out over the command connection and the response 
> comes back over the ack connection.
> For writes the request goes out over the command connection and the response 
> comes back over the command connection.
> Reads get a dedicated socket for responses. Mutation commands and responses 
> both travel over the same socket along with read requests.
> Sockets are used unidirectionally, so there are actually four sockets in play 
> and four threads at each node (2 inbound, 2 outbound).
> CASSANDRA-488 doesn't leave a record of what the impact of this change was. 
> If someone remembers what situations were made better it would be good to 
> know.
> I am not clear on when/how this is helpful. The consumer side shouldn't be 
> blocking so the only head of line blocking issue is the time it takes to 
> transfer data over the wire.
> If message size is the cause of blocking issues then the current design mixes 
> small messages and large messages on the same connection retaining the head 
> of line blocking.
> Read requests share the same connection as write requests (which are large), 
> and write acknowledgments (which are small) share the same connections as 
> write requests. The only winner is read acknowledgements.
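
The type-based routing described in the issue text above can be summarized in 
a small Java sketch. The enum and method names are hypothetical stand-ins, 
not the actual Cassandra identifiers; only the mapping itself follows the 
description.

```java
// Hypothetical sketch of the routing described in the issue text.
final class TypeBasedRouting {
    enum MessageKind { READ_REQUEST, READ_RESPONSE, WRITE_REQUEST, WRITE_RESPONSE }
    enum Connection { COMMAND, ACK }

    // Only read responses travel over the ack connection; read requests,
    // write requests and write responses all share the command connection.
    static Connection connectionFor(MessageKind kind) {
        return kind == MessageKind.READ_RESPONSE ? Connection.ACK : Connection.COMMAND;
    }

    public static void main(String[] args) {
        for (MessageKind kind : MessageKind.values()) {
            System.out.println(kind + " -> " + connectionFor(kind));
        }
    }
}
```

The mapping shows the concern raised in the description: small write 
acknowledgements and read requests share a connection with write requests, so 
they can still queue behind them.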



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
