We have been talking a lot about the branch cutting date, but I agree with Benedict here: I think we should actually be talking about the expected release date.

If we truly believe that we can release within 1-2 months of cutting the branch, and many people I have talked to think that is possible, then a May branch cut means we release by July. That would be only 7 months after the 4.1 release, which seems a little fast to me. IIRC the last time we had release cadence discussions, most people were for keeping to a release cadence of around 12 months, and many were against a 6 month cadence.

So if we want to have a goal of “around 12 months” and also have a goal of “release before summit in December”, I would suggest we put our release date goal in October to give some runway for being late and still getting out by December.

So if the release date goal is October, we can also hedge with the longer 2 month estimate on “time after branching” to again make sure we make our goals. This would put the branching in August. If we do release in October, that gives us 10 months since 4.1, which, while still shorter than 12, is much closer than 7 months would be.

If people feel 1 month post branch cut is feasible we could cut the branch in September.
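
To make the month counts above easy to check, here is a minimal sketch of the arithmetic. It assumes 4.1 shipped in December 2022 (implied by the "7 months" figure, not stated outright), and the helper names are purely illustrative.

```python
from datetime import date

# Assumption: 4.1 was released in December 2022 (implied by "7 months" to July).
RELEASE_4_1 = date(2022, 12, 1)


def add_months(d: date, n: int) -> date:
    """Return the first day of the month n months after d."""
    total = d.year * 12 + (d.month - 1) + n
    return date(total // 12, total % 12 + 1, 1)


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def release_gap(branch_cut: date, months_after_branch: int) -> int:
    """Months between 4.1 and a release that ships months_after_branch after the cut."""
    release = add_months(branch_cut, months_after_branch)
    return months_between(RELEASE_4_1, release)


# The three scenarios discussed above: (branch cut, months from cut to release).
scenarios = {
    "May cut, 2 months": (date(2023, 5, 1), 2),
    "August cut, 2 months": (date(2023, 8, 1), 2),
    "September cut, 1 month": (date(2023, 9, 1), 1),
}

for name, (cut, lag) in scenarios.items():
    print(f"{name}: {release_gap(cut, lag)} months after 4.1")
```

Under that assumption, the May scenario lands 7 months after 4.1, and both the August and September scenarios land in October, 10 months after 4.1.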

-Jeremiah

On Mar 1, 2023, at 10:34 AM, Henrik Ingo <henrik.i...@datastax.com> wrote:


Hi 

Those are great questions Mick. It's good to recognize this discussion impacts a broad range of contributors and users, and not all of them might be aware of the discussion in the first place.

More generally I would say that your questions brought to mind two fundamental principles with a "train model" release schedule:

  1. If a feature isn't ready by the cut-off date, there's no reason to delay the release, because the next release is guaranteed to be just around the corner.
  2. If there is a really important feature that won't make it, rather than delaying the planned release, we should (also) consider the opposite: we can do the next release earlier if there is a compelling feature ready to go. (Answers question 2b from Mick.)

I have arguments both for and against moving the release date:


The argument for sticking with the current plan is that we have a lot of big features now landing in trunk. If we delay the release for one feature, it will delay the GA of all the other features that were ready by May. For example, while SAI code is still being massaged based on review comments, we fully expect it to merge before May. Same for the work on tries, which is on its final stretch. Arguably Java 17 support can't come soon enough either. And so on... For some users it can be a simple feature, like just one specific guardrail, that they are waiting on. So as excited as we may be to wait for that one feature to be ready and make it in, I'm just as unexcited about the prospect of delaying the GA of several other features. If we had just one big feature that everyone was working on, this would be easier to decide...

Note also that postponing the release for a single feature that is still in development is a risky bet, because you never know what unknowns are still ahead once the work is code complete and put to more serious testing. At first it might sound reasonable to delay 1-3 months, but what if in that 3rd month some unforeseen work is discovered, and now we need to discuss delaying another 3 months? Such a risk is inherent to any software project, and we should anticipate it happening. Scott's re-telling of CASSANDRA-18110 is a great example: these delays can happen due to a single issue, and it can be hard to speed things up by e.g. assigning more engineers to the work. So, when we say that we'd like to move the branching date from May to August, and specifically in order for some feature to be ready by then, what do we do if it's not ready in August? It's presumably closer to being ready at that point, so the temptation to wait just a little bit more is always there. (And this is also my answer to Mick's question 1b.)



Now, let me switch to arguing the opposite opinion:

My instinct here would be to stick to early May as the cut-off date, but also allow for exceptions. I'm curious to hear how this proposal is received. If this were a startup, there could be a CEO or, let's say, a build manager who could make these kinds of case-by-case decisions expediently. But I realize that in a consensus-based open source project like Cassandra, we may also have to consider issues like fairness: Why would some feature be allowed a later date than others? How do we choose which work gets such exceptions?

Anyway, the fact is that we have several huge bodies of work in flight now. The Accord patch was about 28k lines of code when I looked at it, and note that this doesn't even include "accord itself", which is in a different repository. SAI, Tries (independently for memtable and sstables) and UCS are all in the 10k range too. And I presume the Java 17 support and transactional metadata are the same. Each of these pieces of code alone represents years of engineering work. For context, Cassandra as a whole has about 1 million lines of code. So each of these features is replacing or adding about 1-3% of the codebase.

With that in mind, I feel like having a hard deadline on a single day doesn't really do justice to these features. In fact, most of them are not merged in a single PR either, but as a series of PRs, each of which is independently huge too. This makes me ask: what if some feature has already merged 3 patches, but still has 2 to go? Can we allow extra time to merge the last two, or do we work on reverting the first 3? (Obviously not the latter...)

So it seems to me we should keep May Xth as the beginning of the cutoff, but where the actual cutoff is a fuzzy deadline rather than a hard one. For most work it would be early May, but for the big features a window of a few weeks or even months is ok.

This kind of flexible approach would still help advancing toward a release, since it would quiet down the release branch significantly, and for most contributors focus would shift to testing. (Alternatively, focus could shift to help review and test the features that are still being worked on.)

Mick and Benjamin have been good at reminding me that we can't expect to merge all of this work in the last week of April anyway. So from my point of view, just as we have worked hard to get some of these big features in earlier, it would not be completely wrong to allow some to finish their work in the days and weeks after the official cutoff date. It seems this is my answer to Mick's question 2a.


In contrast, I fear that if we postpone the branch date altogether, it will delay everything and we will just have this same discussion in September again.


For the remaining questions, I would also be interested to hear answers to questions #1 and #2.

henrik



On Wed, Mar 1, 2023 at 3:38 PM Mick Semb Wever <m...@apache.org> wrote:
My thoughts don't touch on CEPs inflight. 



For the sake of broadening the discussion, additional questions I think worthwhile to raise are…

1. What third parties, or other initiatives, are invested in and/or working against the May deadline? And what are their views on changing it?
  1a. If we push branching back to September, how confident are we that we'll get to GA before the December Summit?
2. What CEPs look like not landing by May that we consider a must-have this year?
  2a. Is it just tail-end commits in those CEPs that won't make it? Can these land (with or without a waiver) during the alpha phase?
  2b. If the final components to specified CEPs are not approved/appropriate to land during alpha, would it be better if the project commits to a one-off half-year release later in the year?


--

Henrik Ingo

c. +358 40 569 7354 

w. www.datastax.com

     

