[ 
https://issues.apache.org/jira/browse/CASSANDRA-15066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860129#comment-16860129
 ] 

Aleksey Yeschenko commented on CASSANDRA-15066:
-----------------------------------------------

Agreed on readiness to commit in current state. To complete the list 
(non-exhaustively), below are some notable changes on my part.

To start with, the largest change has been the redesign of large message
handling, suggested by [~xedin] during review. Previously, a companion thread
deserialized the large message as new frames kept coming. Now we accumulate
all the frames needed to deserialize the large message, and then schedule a
task directly on that message's verb's {{Stage}}; the task deserializes the
message and executes the verb handler in one go. This not only simplifies the
logic in {{InboundMessageHandler}}, but also increases locality and reduces
the lifetime of large messages on the heap.
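
As a rough illustration of the accumulate-then-schedule idea (not the actual {{InboundMessageHandler}} code), the sketch below buffers frames until the full message has arrived and only then hands a single deserialize-and-handle task to an executor standing in for the verb's {{Stage}}. The class name {{LargeMessageAccumulator}}, its methods, and the copy into a single byte array are all assumptions made for brevity.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical sketch; names and structure are illustrative, not from the patch.
final class LargeMessageAccumulator {
    private final int expectedSize;                       // total payload size, assumed known from the message header
    private final List<ByteBuffer> frames = new ArrayList<>();
    private int received;                                 // bytes buffered so far

    LargeMessageAccumulator(int expectedSize) {
        this.expectedSize = expectedSize;
    }

    /** Buffers one frame; returns true once every byte of the message has arrived. */
    boolean supply(ByteBuffer frame) {
        received += frame.remaining();
        frames.add(frame);
        return received >= expectedSize;
    }

    /**
     * Schedules a single task on the given executor (standing in for the verb's Stage).
     * The task both deserializes and handles the message, so no companion thread is needed.
     */
    void schedule(Executor stage, Consumer<byte[]> deserializeAndHandle) {
        byte[] payload = new byte[received];
        int offset = 0;
        for (ByteBuffer frame : frames) {
            int length = frame.remaining();
            frame.duplicate().get(payload, offset, length); // duplicate() leaves the frame's position untouched
            offset += length;
        }
        stage.execute(() -> deserializeAndHandle.accept(payload));
    }
}
```

A caller would invoke {{supply()}} for each incoming frame and call {{schedule()}} once it returns true. Because nothing is deserialized until the last frame arrives, the deserialized message only exists on the heap while its handler runs.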

Other changes include: 
- Fixed a bug with double-release of permits on deserialization exceptions in 
{{InboundMessageHandler}}
- Fixed forgetting to signal a {{WaitQueue}} when releasing permits back in 
case of partial allocate failure
- Fixed {{FrameDecoder}} not propagating {{channelClose()}} to 
{{InboundMessageHandler}}
- Fixed several legacy handshake issues
- Fixed legacy LZ4 frame encoder and decoder performance (broken Netty xxhash 
behaviour)
- Fixed mutation forwarding to remote DCs mistakenly including the picked 
forwarder node itself (spotted by [~jmeredithco])
- Started immediately expiring callbacks for all forwarded mutation 
destinations when failing to send to the forwarder
- Introduced inbound backpressure counters (throttled count and nanos)
- Started treating all deserialize exceptions as non-fatal, to prevent 
unnecessary message loss and reconnects
- Factored out header fields from {{Message}} into a standalone {{Header}} 
class to prevent double-deserialization of some fields and to clean up callback 
signatures
- Introduced max message size config param, akin to max mutation size - set to 
endpoint reserve capacity by default
- Introduced an MPSC linked queue with volatile offer semantics and 
non-blocking {{poll()}} and {{drain()}} and used it to fix visibility issues or 
blocking behaviour in {{OutboundMessageQueue}}, 
{{InboundMessageHandler.WaitQueue}}, and Netty's event loops; then used it to 
minimise the amount of signalling done when an {{InboundMessageHandler}} gets 
registered on the wait queue
- Refactored callbacks and callback map ({{RequestCallbacks}}) to allow reusing 
the same request ID for multiple messages, got rid of an extra object per entry
- Building on the refactoring above, reduced and mostly eliminated allocation 
of extra {{Message}} objects, saving on {{serializedSize}} invocations and 
some garbage
- Reworked integration between {{InboundMessageHandler}} and {{FrameDecoder}} 
for clarity and performance
- Fixed {{FrameDecoder}} over-issuing {{channel.read()}} calls in some 
circumstances
- Refactored {{InboundMessageHandler}} frame handling and callbacks
- Pushed processing exception handling to callbacks/message sink
- Added a lot of comments/documentation, tests, made various logging 
improvements, better thread names
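
For readers unfamiliar with the queue shape mentioned in the list above, here is a minimal sketch of a multi-producer, single-consumer linked queue with a volatile publish on {{offer()}} and non-blocking {{poll()}} and {{drain()}}. It follows the well-known exchange-the-tail design and is not the class introduced by the patch; {{MpscLinkedQueue}} and its members are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch of an MPSC linked queue; not the implementation from the patch.
final class MpscLinkedQueue<T> {
    private static final class Node<T> {
        T item;
        volatile Node<T> next;   // volatile so a linked node is visible to the consumer
        Node(T item) { this.item = item; }
    }

    private final AtomicReference<Node<T>> tail;  // swapped atomically by any producer
    private Node<T> head;                         // touched by the single consumer only

    MpscLinkedQueue() {
        Node<T> stub = new Node<>(null);          // sentinel; head always points at a consumed node
        head = stub;
        tail = new AtomicReference<>(stub);
    }

    /** Safe to call from any thread; never blocks. */
    void offer(T item) {
        if (item == null) throw new NullPointerException("null items are not supported");
        Node<T> node = new Node<>(item);
        Node<T> prev = tail.getAndSet(node);      // claim the tail slot
        prev.next = node;                         // volatile write publishes the node
    }

    /**
     * Consumer thread only. Returns null if the queue is empty, or if a producer
     * has claimed the tail but not yet linked its node; it never waits.
     */
    T poll() {
        Node<T> next = head.next;
        if (next == null) return null;
        T item = next.item;
        next.item = null;                         // the node becomes the new sentinel
        head = next;
        return item;
    }

    /** Consumer thread only. Passes every currently visible item to the consumer; returns the count. */
    int drain(Consumer<? super T> consumer) {
        int drained = 0;
        for (T item; (item = poll()) != null; drained++)
            consumer.accept(item);
        return drained;
    }
}
```

The trade-off in this design is that {{poll()}} can briefly report an empty queue while an {{offer()}} is in flight, which is acceptable when the consumer is re-signalled or re-polls later.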

Also, some changes made by [~ifesdjeen] directly - in addition to his many 
helpful review corrections:
- Introduced in-JVM proxy to test expirations and closure, added tests for 
inbound expirations
- Fixed a bug in the outbound virtual table (overflow_count/overflow_bytes 
values swapped), and the same bug in outbound metrics
- Introduced {{UnknownColumnsException}} to more places instead of 
{{RuntimeException}}
- Fixed {{Message.Builder.builder(Message)}} to copy over original flags

> Improvements to Internode Messaging
> -----------------------------------
>
>                 Key: CASSANDRA-15066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Internode
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: High
>             Fix For: 4.0
>
>         Attachments: 20k_backfill.png, 60k_RPS.png, 
> 60k_RPS_CPU_bottleneck.png, backfill_cass_perf_ft_msg_tst.svg, 
> baseline_patch_vs_30x.png, increasing_reads_latency.png, 
> many_reads_cass_perf_ft_msg_tst.svg
>
>
> CASSANDRA-8457 introduced asynchronous networking to internode messaging, but 
> there have been several follow-up endeavours to improve some semantic issues. 
>  CASSANDRA-14503 and CASSANDRA-13630 are the latest such efforts, and were 
> combined some months ago into a single overarching refactor of the original 
> work, to address some of the issues that have been discovered.  Given the 
> criticality of this work to the project, we wanted to bring some more eyes to 
> bear to ensure the release goes ahead smoothly.  In doing so, we uncovered a 
> number of issues with messaging, some of them long-standing, that we felt 
> needed to be addressed.  This patch widens the scope of CASSANDRA-14503 and 
> CASSANDRA-13630 in an effort to close the book on the messaging service, at 
> least for the foreseeable future.
> The patch includes a number of clarifying refactors that touch outside of the 
> {{net.async}} package, and a number of semantic changes to the {{net.async}} 
> package itself.  We believe it clarifies the intent and behaviour of the 
> code while improving system stability, which we will outline in comments 
> below.
> https://github.com/belliottsmith/cassandra/tree/messaging-improvements



