[ 
https://issues.apache.org/jira/browse/CASSANDRA-15066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810247#comment-16810247
 ] 

Benedict commented on CASSANDRA-15066:
--------------------------------------

Thanks for the feedback.

I'm sorry we failed to get back to you as promptly as we had promised.  We 
began review of CASSANDRA-14503 with the expectation it would be a 2-3 week 
project, with perhaps some minor improvements and bugfixes identified.  
Unfortunately, as we dug in, we came to the conclusion that there were issues 
requiring significant work.

For context, as part of our effort toward stability in Cassandra 4.0, we 
committed to completing a source audit of the messaging system.  Reviewing 
14503 and trunk together made sense, given that 14503 included necessary fixes 
and constituted significant changes to trunk.  We were also under the 
impression that you intended to reduce your involvement in the project in your 
new role, and didn’t expect this change would allow time for the amount of 
collaboration needed.  

So, we began preparing a candidate patch to resolve issues that we found.  As 
we progressed, we found it difficult to produce a small patch in whose 
correctness we were confident, so the scope of modifications expanded until we 
could be confident in the result.  As the work continued, it became apparent 
that we could also, at little cost, better tailor the system's characteristics 
to Cassandra's needs by integrating knowledge we had gained during the wider 
review.  This also permitted, in our view, an overall risk reduction to the 
project by: 1) reducing the need for further major modifications in the next 
release; and 2) reducing the maintenance risk through clarified semantics.

Producing work for the community is a complex dance that is impossible to 
conduct perfectly.  In our view, a patchwork approach wouldn’t have yielded 
high confidence in the result, and would have reintroduced long-standing 
defects to our 4.0 release.  We are also, as a project, clearly under 
significant time pressure to produce a 4.0 release that we can endorse.

Given the sole purpose and scope of this work is fixing stability issues that 
would land in 4.0, we feel that it is in the spirit of the feature freeze and 
our collective focus on stability.  As a project, we have promised not to 
repeat the mistake of rushing out an unsafe .0 release, and we feel this work 
is important for achieving that.  This is of course all open for debate, and it 
may be that through discussion together we revise our proposal for what’s to be 
merged, versus held for a future release.  We look forward to hearing the 
community's thoughts on this, which we hope to solicit from a wider discussion 
on the mailing list.

As always, we are eager to receive feedback on our work.  Do you have any 
specific technical concerns with the approach taken?


> Improvements to Internode Messaging
> -----------------------------------
>
>                 Key: CASSANDRA-15066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Internode
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Normal
>             Fix For: 4.0
>
>
> CASSANDRA-8457 introduced asynchronous networking to internode messaging, but 
> there have been several follow-up endeavours to improve some semantic issues. 
>  CASSANDRA-14503 and CASSANDRA-13630 are the latest such efforts, and were 
> combined some months ago into a single overarching refactor of the original 
> work, to address some of the issues that have been discovered.  Given the 
> criticality of this work to the project, we wanted to bring some more eyes to 
> bear to ensure the release goes ahead smoothly.  In doing so, we uncovered a 
> number of issues with messaging, some of them long-standing, that we felt 
> needed to be addressed.  This patch widens the scope of CASSANDRA-14503 and 
> CASSANDRA-13630 in an effort to close the book on the messaging service, at 
> least for the foreseeable future.
> The patch includes a number of clarifying refactors that touch outside of the 
> {{net.async}} package, and a number of semantic changes to the {{net.async}} 
> package itself.  We believe it clarifies the intent and behaviour of the 
> code while improving system stability, which we will outline in comments 
> below.
> https://github.com/belliottsmith/cassandra/tree/messaging-improvements



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
