To clarify: I don't think we should try to get rid of all forking, and I didn't 
read either our prior discussion or this one as that absolute. I think some 
reasons for forking are healthy, and others are redundant and wasteful. We 
should celebrate the former and try to root out the latter.

For example: if someone has written a feature against a GA branch and upstreamed 
it, and then we don't cut a release for multiple years and don't cut alphas from 
trunk, I'd contend our processes are encouraging low-value, redundant, wasteful 
fork maintenance. Either that, or they are forcing people to qualify bespoke 
releases from arbitrary SHAs on trunk and run the database on those builds, 
which I expect most people wouldn't be comfortable doing.

Hence my desire to map out why people are forking. For instance: I think most 
people could agree that we shouldn't aim to completely get rid of forks for 
reason #1: "You need bespoke code to integrate with internal infrastructure". 
We could instead take that as a sign to improve users' lives there by making 
integration points easier to write to, formalizing some APIs, documenting them, 
etc.

On Tue, Oct 21, 2025, at 2:07 PM, C. Scott Andreas wrote:
> There’s a common motivation at the root of any fork: having at least one 
> patch that matters to you – perhaps one you’ve written yourself – and having 
> complete and total control over your ability to run it. Isaac's example of 
> C-20749 is a good example of this.
> 
> This is the classic story of open source, and for me it’s a positive one.
> 
> Releases published by the Apache Cassandra project are solid and stable. It’s 
> *also* true that many users pretty aggressively editorialize distributions of 
> the database they deploy. These can include urgent fixes for bugs they’ve 
> identified and patched; removal/disabling of features they haven’t qualified; 
> or incubation of patches they intend to contribute upstream. These are all 
> positive motivations, and something that open source enables.
> 
> I don’t agree that the existence of “forks” is a negative or that eliminating 
> them is a desirable or achievable goal. Folks will always want or need to be 
> able to apply a patch that matters to them to solve a problem or scratch an 
> itch - and they should.
> 
> If and as we learn that many users of a particular release have a common 
> challenge, our process of DISCUSS + backport provides a path to meet that 
> need, and I think we should exercise it more often. I see Isaac's example of 
> C-20749 as a good candidate for that as well.
> 
> – Scott
> 
>> On Oct 21, 2025, at 9:12 AM, Isaac Reath <[email protected]> wrote:
>> 
>> 
>> I'd say the biggest reasons to me are (1) and (2). For example, we recently 
>> worked on CASSANDRA-20749 
>> <https://issues.apache.org/jira/browse/CASSANDRA-20749> due to an internal 
>> need for this functionality, and fortunately we’ve been able to bring it 
>> upstream. But, since we run 4.1 and 5.0, we brought this patch back to these 
>> versions so that we are able to use this feature today. 
>> 
>> 
>> To your point in (3), I'd say it's easier to qualify a release built off of 
>> a stable GA than trunk, especially once the GA release has gotten to where 
>> 4.1 is. I’d love to get to the point where we’re qualifying and running 
>> trunk builds in production, but even when we get there I still see a world 
>> where we still need to run the latest GA alongside trunk which would still 
>> motivate us to bring patches back to the latest GA where we need them. 
>> 
>> 
>> On Mon, Oct 20, 2025 at 3:24 PM Josh McKenzie <[email protected]> wrote:
>>> We had a long conversation about the potential of piloting a supported 
>>> backport release branch here: 
>>> https://lists.apache.org/thread/xbxt21rttsqvhmh8ds9vs2cr7fx27w3k
>>> 
>>> When I tried to summarize the thread and identify next steps, one 
>>> observation stood out: I think we did a good job establishing the shape of 
>>> the challenge we'd like to address (people want to work on OSS, not 
>>> maintain private forks), but I don't think we got to the root of why this 
>>> challenge exists. If we take action now we run the risk of having the wrong 
>>> solution to the right problem.
>>> 
>>> So: why are people running forks? Some reasons I've seen brought up:
>>>  1. You need bespoke code to integrate with internal infrastructure
>>>  2. You've written a new feature targeting your internal version, 
>>> upstreamed the code, and have to wait for the feature to be in a GA release
>>>  3. Someone else has contributed a feature to trunk that's attractive and 
>>> it's less work or more palatable to back-port it to your private fork and 
>>> maintain the diff than to qualify a custom release off trunk
>>>  4. You have stability concerns with GA releases or running a release based 
>>> off trunk
>>> The backport branch we discussed in the previous thread would primarily 
>>> address #4 (stability concerns) and secondarily #2 and #3 (feature 
>>> availability). All four motivations could be addressed in other 
>>> ways—ideally by reducing the pressure to fork in the first place, rather 
>>> than accommodating forks as inevitable.
>>> 
>>> If you're running a fork and open to sharing your experience, do these 
>>> reasons match yours, or is something else at play? The more detail we can 
>>> gather, the better we can target improvements where they'll actually help.
>>> 
>>> I know these conversations take time and can be hard; I appreciate everyone 
>>> taking the time and energy to help us collectively improve.
>>> 
>>> ~Josh
>>> 
> 
