Re: [DISCUSS] Understanding fork motivations (backport branch follow-up)

Alex Petrov Fri, 24 Oct 2025 23:06:11 -0700

> On the other hand the idea of moving the project towards always having a 
> releasable trunk with no new things being enabled by default is intriguing to 
> me, as that is something which could possibly reduce our need to fork.


I am very supportive of the idea of keeping trunk releasable. With our current
arsenal of test tooling (in-jvm dtests, targeted fuzzers, simulator, harry, ast
test) we were able to gain confidence both in new features and in features that
had been around but that we had little experience with. At least from my
perspective, there's very little missing for us to release trunk more often. Is
there anything we can change, from your perspective, to make it easier to
release it more frequently?

On Fri, Oct 24, 2025, at 11:43 PM, Jeremiah Jordan wrote:
> To get back to the original goal of this thread: Understanding fork 
> motivations.
> 
> At DataStax (an IBM Company) we are currently maintaining a long term fork 
> for a few reasons.
> 
> We run a micro services based DBaaS that uses Apache Cassandra code as a 
> library to implement different parts of the service.  To do this we have had 
> to add different hooks, abstractions, interface points and some behavior 
> changes to the code that do not make sense when running as the monolithic 
> service that is Apache Cassandra.  So we maintain those in our fork.
> 
> Then the bigger reason for our fork: as a DBaaS we have a very different 
> release cycle and maintenance burden from the main project.  We keep our main 
> branch releasable, and release from main every few months.  Our main branch 
> tracks an Apache Cassandra GA branch, with our features on top of it.  So we 
> maintain a fork where we can put the new stuff we are working on for our next 
> release (which will be within a couple months at most), while also 
> contributing things upstream in parallel (where it may be a year or more 
> before it releases).  How much we maintain in our fork goes up and down over 
> time: as the Apache Cassandra project releases new GA versions we are able to 
> bring those in and drop anything from our fork that we upstreamed during the 
> previous cycle.  We will then accumulate new differences until the next 
> Apache Cassandra GA happens and we bring that in.
> 
> So for DataStax I do not think a “back ports” branch would do much for us in 
> terms of making it so we don’t need to maintain our own fork.  The things we 
> put in our fork are not restricted to “stuff that doesn’t change how the DB 
> works”.  We release from our main branch every few months, and all features 
> worked on go there.
> 
> On the other hand the idea of moving the project towards always having a 
> releasable trunk with no new things being enabled by default is intriguing to 
> me, as that is something which could possibly reduce our need to fork.  We 
> could just always be releasing from trunk, with all our development work only 
> happening on trunk, and not needing to be on our fork and also trunk in 
> parallel.  Or at the least letting our fork more closely follow trunk, rather 
> than following a previous GA branch, with us only needing to put things in 
> the fork that only make sense because of the DBaaS bits.
> 
> -Jeremiah
> 
> 
> On Oct 20, 2025 at 2:24:05 PM, Josh McKenzie wrote:
>> 
>> We had a long conversation about the potential of piloting a supported 
>> backport release branch here: 
>> https://lists.apache.org/thread/xbxt21rttsqvhmh8ds9vs2cr7fx27w3k
>> 
>> When I tried to summarize the thread and identify next steps, one 
>> observation stood out: I think we did a good job establishing the shape of 
>> the challenge we'd like to address (people want to work on OSS, not maintain 
>> private forks), but I don't think we got to the root of why this challenge 
>> exists. If we take action now we run the risk of having the wrong solution 
>> to the right problem.
>> 
>> So: why are people running forks? Some reasons I've seen brought up:
>>  1. You need bespoke code to integrate with internal infrastructure
>>  2. You've written a new feature targeting your internal version, upstreamed 
>> the code, and have to wait for the feature to be in a GA release
>>  3. Someone else has contributed a feature to trunk that's attractive and 
>> it's less work or more palatable to back-port it to your private fork and 
>> maintain the diff than to qualify a custom release off trunk
>>  4. You have stability concerns with GA releases or running a release based 
>> off trunk
>> 
>> The backport branch we discussed in the previous thread would primarily 
>> address #4 (stability concerns) and secondarily #2 and #3 (feature 
>> availability). All four motivations could be addressed in other ways—ideally 
>> by reducing the pressure to fork in the first place, rather than 
>> accommodating forks as inevitable.
>> 
>> If you're running a fork and open to sharing your experience, do these 
>> reasons match yours, or is something else at play? The more detail we can 
>> gather, the better we can target improvements where they'll actually help.
>> 
>> I know these conversations take time and can be hard; I appreciate everyone 
>> taking the time and energy to help us collectively improve.
>> 
>> ~Josh
>>
