[ 
https://issues.apache.org/jira/browse/CASSANDRA-13630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138305#comment-16138305
 ] 

Jason Brown commented on CASSANDRA-13630:
-----------------------------------------

bq. I thought worst case memory amplification from this NIO approach was 2x 
message size which is worse than our current 1x message size, but it's not, 
it's cluster size * message size if a message is fanned out to all nodes in the 
cluster. 

We do not have 1x amplification in pre-4.0 code; it's always been messageSize 
times the number of target peers. In `OutboundTcpConnection` we wrote into a 
[backing buffer of 
64k|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L457]
 for each outbound peer and flushed when the buffer filled up (see 
`BufferedDataOutputStreamPlus`). The cost of the amplification is hidden by 
that reusable backing buffer, but it's still there.
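
To make the arithmetic concrete, here is a minimal sketch of the pre-4.0 shape, not the actual Cassandra code. `FanOutAmplification` and `CountingSink` are hypothetical names, and `java.io.BufferedOutputStream` stands in for `BufferedDataOutputStreamPlus`. Each peer has a fixed 64k buffer, yet the bytes pushed through serialization still total messageSize times the number of peers.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration; none of these names come from the Cassandra code base.
class FanOutAmplification {
    static final int BUFFER_SIZE = 64 * 1024; // 64k backing buffer per outbound peer

    // Stand-in for a socket: only counts the bytes that reach it.
    static final class CountingSink extends OutputStream {
        long bytes;
        @Override public void write(int b) { bytes++; }
        @Override public void write(byte[] b, int off, int len) { bytes += len; }
    }

    // Serializes the same message once per peer, each through its own buffer,
    // and returns the total number of bytes that went through serialization.
    static long totalBytesSerialized(int messageSize, int peers) {
        long total = 0;
        byte[] chunk = new byte[1024]; // pretend serializer output, 1 KiB at a time
        for (int p = 0; p < peers; p++) {
            CountingSink sink = new CountingSink();
            try (BufferedOutputStream out = new BufferedOutputStream(sink, BUFFER_SIZE)) {
                for (int written = 0; written < messageSize; written += chunk.length)
                    out.write(chunk, 0, Math.min(chunk.length, messageSize - written));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            total += sink.bytes;
        }
        return total;
    }

    // Memory held by the reusable buffers: bounded per peer, independent of message size.
    static long residentBufferBytes(int peers) {
        return (long) peers * BUFFER_SIZE;
    }
}
```

In this sketch a 1 MiB message fanned out to 10 peers passes 10 MiB through serialization, while the buffers themselves hold 640 KiB regardless of message size.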

With CASSANDRA-8457, everything gets its own distinct buffer, allocated once 
per-message, which is serialized to and then flushed. With this ticket we'll 
move back to the previous model where there's a backing buffer that's used for 
aggregating small messages or chunks of larger messages. That buffer, of 
course, is not reused, but that's because of the asynchronous nature of NIO vs 
blocking IO. 

(FTR, I have thought about moving serialization outside of the "outbound 
connections" (either `OutboundTcpConnection` or netty handlers) - where we 
serialize before sending to the outbound channels and send a slice of a buffer 
to those channels. That way you only serialize once (less repetitive CPU work), 
as well as potentially consume less memory. But I think that's a different 
ticket.)
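
A rough sketch of that serialize-once idea, using only `java.nio.ByteBuffer`. `SerializeOnce`, `serialize`, and `viewsForPeers` are hypothetical names for illustration; nothing here is taken from the patch, and buffer lifetime management (e.g. releasing the shared buffer once every peer has sent it) is left out entirely.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not how the ticket's patch is structured.
class SerializeOnce {
    // Pretend serializer: runs once per message, not once per peer.
    static ByteBuffer serialize(String payload) {
        return ByteBuffer.wrap(payload.getBytes(StandardCharsets.UTF_8)).asReadOnlyBuffer();
    }

    // Hands each peer its own view of the same bytes. duplicate() shares the
    // content but gives every view an independent position and limit.
    static List<ByteBuffer> viewsForPeers(ByteBuffer serialized, int peers) {
        List<ByteBuffer> views = new ArrayList<>(peers);
        for (int i = 0; i < peers; i++)
            views.add(serialized.duplicate());
        return views;
    }
}
```

Each outbound channel can then drain its view at its own pace without copying or re-serializing the payload.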

bq. I really wonder if that be a shared pool of threads and we size it 
generously

Yeah, I thought about this. The problem is that because the deserialization is 
blocking, you basically need one thread in the pool for each "blocker"; else 
you starve some deserialization activities. Hence, I just used a background 
thread. Not my favorite choice, but I'm not sure a "well-sized" pool will be 
sufficient. 
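
The starvation concern can be reproduced with a plain `java.util.concurrent` pool. This is a minimal sketch under stated assumptions: `BlockingPoolStarvation` is a hypothetical name, and a latch stands in for a deserializer blocked waiting on more input. With fewer threads than blocked tasks, the extra tasks cannot even start.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a fixed pool shared by blocking tasks.
class BlockingPoolStarvation {
    // Submits `tasks` blocking tasks to a pool of `poolSize` threads and returns
    // how many have started once every pool thread is stuck waiting for input.
    static int startedWhileAllThreadsBlocked(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger started = new AtomicInteger();
        CountDownLatch poolSaturated = new CountDownLatch(Math.min(poolSize, tasks));
        CountDownLatch inputArrived = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    started.incrementAndGet();
                    poolSaturated.countDown();
                    try {
                        inputArrived.await(); // stands in for a blocking read
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            poolSaturated.await(); // every pool thread is now blocked
            return started.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            inputArrived.countDown(); // unblock everything so the pool can drain
            pool.shutdown();
        }
    }
}
```

With a pool of 2 and 3 blocked deserializations, only 2 ever start until one of them gets its input; the third just sits in the queue.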

Reading over your comments on the code itself this morning.


> support large internode messages with netty
> -------------------------------------------
>
>                 Key: CASSANDRA-13630
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13630
>             Project: Cassandra
>          Issue Type: Task
>          Components: Streaming and Messaging
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>             Fix For: 4.0
>
>
> As part of CASSANDRA-8457, we decided to punt on large messages to reduce the 
> scope of that ticket. However, we still need that functionality to ship a 
> correctly operating internode messaging subsystem.



