[ https://issues.apache.org/jira/browse/CASSANDRA-15066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860129#comment-16860129 ]
Aleksey Yeschenko commented on CASSANDRA-15066:
-----------------------------------------------

Agreed on readiness to commit in current state. To complete the list (non-exhaustively), below are some notable changes on my part.

To start with, the largest change has been the redesign of large message handling - suggested by [~xedin] during review. Whereas previously we'd have a companion thread deserializing the large message as new frames kept coming, now, instead, we accumulate all the frames needed for the large message deserialization - and then schedule a task directly on that message's verb's {{Stage}}, a task that deserializes the message and executes the verb handler in one go. This not only simplifies the logic in {{InboundMessageHandler}}, but also increases locality and reduces the lifetime of large messages on heap.

Other changes include:
- Fixed a bug with double-release of permits on deserialization exceptions in {{InboundMessageHandler}}
- Fixed forgetting to signal a {{WaitQueue}} when releasing permits back in case of partial allocate failure
- Fixed {{FrameDecoder}} not propagating {{channelClose()}} to {{InboundMessageHandler}}
- Fixed several legacy handshake issues
- Fixed legacy LZ4 frame encoder and decoder performance (broken Netty xxhash behaviour)
- Fixed mutation forwarding to remote DCs mistakenly including the picked forwarder node itself (spotted by [~jmeredithco])
- Started immediately expiring callbacks for all forwarded mutation destinations when failing to send to the forwarder
- Introduced inbound backpressure counters (throttled count and nanos)
- Started treating all deserialization exceptions as non-fatal, to prevent unnecessary message loss and reconnects
- Factored out header fields from {{Message}} into a standalone {{Header}} class to prevent double-deserialization of some fields and to clean up callback signatures
- Introduced max message size config param, akin to max mutation size - set to endpoint reserve capacity by default
- Introduced an MPSC linked queue with volatile offer semantics and non-blocking {{poll()}} and {{drain()}}, and used it to fix visibility issues or blocking behaviour in {{OutboundMessageQueue}}, {{InboundMessageHandler.WaitQueue}}, and Netty's event loops; then used it to minimise the amount of signalling done when an {{InboundMessageHandler}} gets registered on the wait queue
- Refactored callbacks and callback map ({{RequestCallbacks}}) to allow reusing the same request ID for multiple messages, got rid of an extra object per entry
- Building on the refactoring above, reduced and mostly eliminated allocation of extra {{Message}} objects, saving on {{serializedSize}} invocations and some garbage
- Reworked integration between {{InboundMessageHandler}} and {{FrameDecoder}} for clarity and performance
- Fixed {{FrameDecoder}} over-issuing {{channel.read()}} calls in some circumstances
- Refactored {{InboundMessageHandler}} frame handling and callbacks
- Pushed processing exception handling to callbacks/message sink
- Added a lot of comments/documentation and tests, made various logging improvements, better thread names

Also, some changes made by [~ifesdjeen] directly - in addition to his many helpful review corrections:
- Introduced in-JVM proxy to test expirations and closure, added tests for inbound expirations
- Fixed a bug in outbound virtual table (overflow_count/overflow_bytes swapped values), and same in outbound metrics
- Introduced {{UnknownColumnsException}} to more places instead of {{RuntimeException}}
- Fixed {{Message.Builder.builder(Message)}} to copy over original flags

> Improvements to Internode Messaging
> -----------------------------------
>
>                 Key: CASSANDRA-15066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Internode
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: High
>             Fix For: 4.0
>
>         Attachments: 20k_backfill.png, 60k_RPS.png,
> 60k_RPS_CPU_bottleneck.png, backfill_cass_perf_ft_msg_tst.svg,
> baseline_patch_vs_30x.png, increasing_reads_latency.png,
> many_reads_cass_perf_ft_msg_tst.svg
>
>
> CASSANDRA-8457 introduced asynchronous networking to internode messaging, but
> there have been several follow-up endeavours to improve some semantic issues.
> CASSANDRA-14503 and CASSANDRA-13630 are the latest such efforts, and were
> combined some months ago into a single overarching refactor of the original
> work, to address some of the issues that have been discovered. Given the
> criticality of this work to the project, we wanted to bring some more eyes to
> bear to ensure the release goes ahead smoothly. In doing so, we uncovered a
> number of issues with messaging, some of them long-standing, that we felt
> needed to be addressed. This patch widens the scope of CASSANDRA-14503 and
> CASSANDRA-13630 in an effort to close the book on the messaging service, at
> least for the foreseeable future.
> The patch includes a number of clarifying refactors that touch outside of the
> {{net.async}} package, and a number of semantic changes to the {{net.async}}
> package itself. We believe it clarifies the intent and behaviour of the
> code while improving system stability, which we will outline in comments
> below.
> https://github.com/belliottsmith/cassandra/tree/messaging-improvements



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
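
For readers unfamiliar with the MPSC linked queue mentioned in the comment above, here is a minimal sketch of the general technique: a multi-producer, single-consumer linked queue whose {{offer()}} publishes with volatile semantics and whose {{poll()}} and {{drain()}} never block. It is NOT the code from the patch. The class name {{MpscLinkedQueueSketch}} and all of its members are hypothetical, and the design follows the well-known stub-node MPSC queue layout, which is assumed here only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical illustration only; not the queue introduced by CASSANDRA-15066.
final class MpscLinkedQueueSketch<T> {
    private static final class Node<T> {
        T item;
        volatile Node<T> next; // volatile link: publishes the node to the consumer
        Node(T item) { this.item = item; }
    }

    // Producers atomically swap themselves in as the new tail.
    private final AtomicReference<Node<T>> tail;
    // Touched only by the single consumer; always points at an already-consumed stub node.
    private Node<T> head;

    MpscLinkedQueueSketch() {
        Node<T> stub = new Node<>(null);
        head = stub;
        tail = new AtomicReference<>(stub);
    }

    // Safe to call from any number of producer threads.
    void offer(T item) {
        Objects.requireNonNull(item, "null is reserved to mean 'empty'");
        Node<T> node = new Node<>(item);
        Node<T> prev = tail.getAndSet(node); // claim the tail slot
        prev.next = node;                    // volatile write links the node in
    }

    // Single consumer only. Returns null if nothing is visible yet; never waits,
    // even if a producer has swapped the tail but not yet linked its node.
    T poll() {
        Node<T> next = head.next;
        if (next == null)
            return null;
        T item = next.item;
        next.item = null; // the consumed node becomes the new stub
        head = next;
        return item;
    }

    // Single consumer only. Hands every currently visible item to the consumer
    // and returns how many items were drained.
    int drain(Consumer<? super T> consumer) {
        int count = 0;
        for (T item; (item = poll()) != null; count++)
            consumer.accept(item);
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        MpscLinkedQueueSketch<String> queue = new MpscLinkedQueueSketch<>();
        Thread a = new Thread(() -> queue.offer("from-a"));
        Thread b = new Thread(() -> queue.offer("from-b"));
        a.start(); b.start();
        a.join(); b.join();

        List<String> seen = new ArrayList<>();
        int drained = queue.drain(seen::add);
        System.out.println(drained + " items drained: " + seen);
    }
}
```

Because {{poll()}} simply returns null when the next link is not yet visible, the consumer never stalls behind a slow producer, and it can tell "nothing to do right now" from "work available" without taking a lock. That is the kind of property that lets a caller decide whether any signalling is needed at all.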