Re: [DISCUSS] Merging incremental feature work

Henrik Ingo Fri, 03 Feb 2023 06:20:33 -0800

I've been an unusually active debater recently, so it might be appropriate
to start with a reminder/disclaimer that I'm not actually a Cassandra
contributor in any way(*), but I choose to share some thoughts where I feel
that sharing experiences from other open source database projects can be of
use to the discussion.

*) No. Even if I agree with the idea that there are many types of
contributions other than code, the only thing I have contributed is
opinions :-D

As I think I mentioned in an Accord related thread, there are two desirable
goals at odds here:
1. It is indeed better to merge smaller increments of work into trunk. And
to do this more frequently.
2. On the other hand if we want to have an always shippable trunk with a
fixed date for feature freeze, it implies that only complete, or at least
harmless/inactive units of work can be merged to trunk.

To give a specific example, I'll tell the story of the MySQL transactional
storage engine called Falcon... (Cue Disney soundtrack featuring a harp,
and blurring image...)

As Oracle acquired the only transactional storage engine for MySQL, InnoDB,
it became a strategic priority to develop a new, in-house engine that could
replace InnoDB in functionality. This project was codenamed Falcon. Since
it was the top priority for MySQL 6.0, it was developed in the "main" v6.0
branch, because it was the definition of MySQL 6. The release date would be
whenever Falcon v1 was ready. Over time other features, such as
partitioning and transactional backups, were also built into the main
feature branch, and they depended on Falcon or were Falcon-only.

There was only one problem: Falcon never worked. In the end the v6.0
development branch just had to be abandoned, and to this day MySQL has
never released a version 6. They had to skip a major version number because
of this.

(As an epilogue that I personally was very amused by: When Oracle had to
argue their case with the EU Commission that they should be allowed to
acquire MySQL, one of the commitments Oracle's lawyers made was to
continue to develop *and release* MySQL version 6.0. I read the
almost-legally-binding statement, and thought yeah good luck with that!)

I wasn't involved in the active development, but possibly a similar example
could be the TPC architecture introduced in DSE 6.0. (A Cassandra derived
proprietary product that I worked on some time ago.) At least by the time I
was involved, I can't say that the performance resulting from that work
would have been a net positive to the full population of workloads. But
because it had been developed directly in the main 6.0 branch, and because
it is so invasive and core to everything else, it also wasn't possible to
roll back its introduction.

Finally, in an open source project, it's good to remember that we are all
volunteers here, in some sense, and sometimes it could happen that
development of a feature stops half way because its developer disappears
and nobody else picks it up.

So, returning to this day and this database... Basically what you want
to avoid is painting yourself into a corner, and particularly the wrong
corner. So the way I would answer this question for large bodies of
work is:
 - Refactoring that is a) harmless, and/or b) improves the codebase anyway,
should be merged early into trunk.
 - The main body of the new functionality should be developed in a feature
branch up until some kind of MVP stage. This means that by the time it is
proposed for merge, it has been tested both to be of good quality and to
actually provide the benefit it set out to deliver. Merging it to trunk
will then always be a net improvement.
 - After that first big MVP merge, additional functionality of course could
be developed directly against trunk.
 - For patches that are very clean and self contained, for example in their
own Java package, it doesn't really matter, because they are easy to roll
back if needed. They can be developed against trunk.
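
To make the rule of thumb above concrete, here is a minimal sketch in
Python. The names (Change, where_to_develop) and the three boolean fields
are hypothetical and only mirror the four bullets above; this is not an
existing tool or an agreed project policy.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical description of a body of work (illustration only)."""
    harmless_refactoring: bool = False  # harmless, or improves the codebase anyway
    self_contained: bool = False        # e.g. in its own Java package, easy to roll back
    mvp_merged: bool = False            # the first big MVP merge has already happened


def where_to_develop(change: Change) -> str:
    """Return "trunk" or "feature branch" following the bullets above."""
    if change.harmless_refactoring:
        return "trunk"   # merge early
    if change.self_contained:
        return "trunk"   # easy to roll back if needed
    if change.mvp_merged:
        return "trunk"   # follow-up work after the MVP merge
    return "feature branch"  # main body of new functionality, up to MVP
```

For example, where_to_develop(Change(self_contained=True)) gives "trunk",
while a plain Change() with no flags set gives "feature branch".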

So applying this to Josh's examples:

1) I assume JDK17 support is invasive, so that would suggest a feature
branch. However, the next question is: is there any risk involved in this
work (like Falcon for MySQL)? Hypothetically it could be that Java 17 has
worse performance than Java 11, or some other blocking problem is
encountered. But in practice we probably estimate that this risk is small.
In such a case JDK17 support could indeed be developed with small patches
directly against trunk, but this would be an exception to the rule!

2) To take an example of an approved CEP, surprisingly enough, the
humongous Accord patch is actually very clean and self contained, and would
be easy to remove (or disable with a feature flag, which has been done). So
it could have been developed against trunk. (But I'm not sure that was
obvious in the beginning of development?)

3) The work on SSTable tries and Memtable tries was even explicitly split
into separate CEPs for the API refactor and the new functionality.

Perhaps Linus Torvalds said the above more succinctly than I did:

*So the name of the game is to _avoid_ decisions, at least the big and
painful ones. Making small and non-consequential decisions is fine, and
makes you look like you know what you're doing, so what a kernel manager
needs to do is to turn the big and painful ones into small things where
nobody really cares. It helps to realize that the key difference between a
big decision and a small one is whether you can fix your decision
afterwards. Any decision can be made small by just always making sure that
if you were wrong (and you _will_ be wrong), you can always undo the damage
later by backtracking. Suddenly, you get to be doubly managerial for making
_two_ inconsequential decisions - the wrong one _and_ the right one.*

https://www.openlife.cc/onlinebook/epilogue-linux-kernel-management-style-linus-torvalds

(I particularly like the last sentence!)

henrik

On Fri, Feb 3, 2023 at 2:06 PM Josh McKenzie <jmcken...@apache.org> wrote:

> The topic of how we handle merging large complex bodies of work came up
> recently with the CEP-15 merge and JDK17, and we've faced this question in
> the past as well (CASSANDRA-8099 comes to mind).
>
> The times we've done large bodies of work separately from trunk and then
> merged them in have their own benefits and costs, and the examples I can
> think of where we've merged in work to trunk incrementally with something
> flagged experimental have markedly different cost/benefits. Further, the
> two approaches have shaped the *way* we approached work quite differently
> with how we architected and tested things.
>
> My current thinking: I'd like to propose we all agree to move to merge
> work into trunk incrementally if it's either:
> 1) New JDK support
> 2) An approved CEP
>
> The bar for merging anything into trunk should remain:
> 1) 2 +1's from committers
> 2) Green CI (presently circle or ASF, in the future ideally ASF or an ASF
> analog env)
>
> I don't know if this is a generally held opinion and we just haven't
> discussed it and switched our general behavior yet, or if this is more
> controversial, so I won't burden this email with enumerating pros and cons
> of the two approaches until I get a gauge of the community's temperature.
>
> So - what do we think?
>

-- 

Henrik Ingo

c. +358 40 569 7354

w. www.datastax.com

<https://www.facebook.com/datastax>  <https://twitter.com/datastax>
<https://www.linkedin.com/company/datastax/>  <https://github.com/datastax/>
