I guess option 5 is what we have today in cep-15: have the build file grab the relevant SHA for the library. This way you maintain a precise SHA for builds, and scripts don’t have to be modified.
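
To make the "build file grabs the relevant SHA" idea concrete, here is a minimal sketch of what such a helper script could look like. It is an illustration under assumptions, not what cep-15 actually does: the pin format (`name=<sha>` lines), the function names, and the example URL are all hypothetical. It only relies on plain `git clone` and `git checkout --detach`.

```python
import re
import subprocess

# Assumption: pins are kept as "name=<full 40-hex SHA>" lines, e.g. in a
# hypothetical file checked in next to the build file.
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def parse_pins(text):
    """Parse 'name=sha' lines, skipping blank lines and '#' comments."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, _, sha = line.partition("=")
        name, sha = name.strip(), sha.strip()
        # Reject branch names and abbreviated SHAs so the build stays precise.
        if not SHA_RE.match(sha):
            raise ValueError(f"{name}: expected a full 40-character SHA, got {sha!r}")
        pins[name] = sha
    return pins


def checkout_commands(repo_url, sha, dest):
    """Return the git commands that place exactly `sha` of `repo_url` in `dest`."""
    return [
        ["git", "clone", "--quiet", repo_url, dest],
        ["git", "-C", dest, "checkout", "--quiet", "--detach", sha],
    ]


def fetch_library(repo_url, sha, dest):
    """Run the commands; raises CalledProcessError if any git step fails."""
    for cmd in checkout_commands(repo_url, sha, dest):
        subprocess.run(cmd, check=True)
```

The build would call `fetch_library(url, parse_pins(pin_text)["accord"], "lib/accord")` before compiling. Bumping a library is then a one-line change to the pin on the relevant branch, and nobody has to remember extra clone or pull flags.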
I believe this is also possible with git submodules, but I’m happy to bake this into our build file instead with a script.

> As the library itself no longer has an explicit version, what I presume you
> meant by logical version.

I mean that we don’t want to duplicate work and risk diverging functionality by maintaining what is logically (meant to be) the same code. As a developer, managing all of the branches is already a pain. Libraries naturally have a different development cadence to the main project, and tying their development to C* versions is just an unnecessary ongoing burden (and risk) that we can avoid.

There’s also an additional penalty: we reduce the likelihood of outside contributions to the libraries only. Accord in particular I hope will attract outside interest if it is maintained as a separate library, as it has broad applicability and is likely of academic interest. Tying it to a C* version and coupling it more tightly with the C* codebase makes that less likely. We might also see folk interested in our utilities, or our simulator framework, if they were to be maintained separately, which could be valuable.

> On 16 Jan 2023, at 10:49, Mick Semb Wever <m...@apache.org> wrote:
>
>
>> I think (4) is the only sensible option. It permits different development
>> branches to easily reference different versions of a library and also to
>> easily co-develop them - from within the same IDE project, even.
>
>
> I've only heard horror stories about submodules. The challenges they bring
> should be listed and checked.
>
> Some examples
> - you can no longer just `git clone …` (and we clone automatically in a
> number of places)
> - same with `git pull …` (easy to be left with out-of-sync submodules)
> - permanence from a git SHA no longer exists
> - our releases get more complicated (our source tarballs are the ASF
> releases)
> - handling patches that cover submodules
> - switching branches, and using git worktrees, during dev
>
> I see (4) as a valid option, but I am concerned with the amount of work required
> to adapt to it, and whether it will only make it more complicated for the new
> contributor to the project. For example the first two points are addressed by
> remembering to do `git clone --recurse-submodules …` . And who would be
> fixing our build/test/release scripts to accommodate?
>
> Not blockers, just concerns we need to raise and address.
>
>
>> We might even be able to avoid additional release votes as a matter of
>> course, by compiling the library source as part of the C* release, so that
>> they adopt the C* release vote (or else we may periodically release the
>> library as we do other releases)
>
>
> Yes. Today we do a combination of first (3) and then (1). Having to make a
> release of these libraries every time a patch (/feature branch) is completing
> is a horror story in itself.
>
>> I might be missing something, does anyone have any other bright ideas for
>> approaching this problem? I’m sure there are plenty of opinions out there.
>
>
> Looking at the problem with these libraries,
> - we don't need releases
> - we don't have a clean version/branch parity to in-tree
> - codebase parity between branches is important for upgrade tests (shared
> classloaders)
>
> For (2) you mention drift of the "same" version, isn't this only a problem
> for dtest-api in the way it requires the "same version" of a codebase for
> compatibility when running upgrade tests? As the library itself no longer has
> an explicit version, what I presume you meant by logical version.
>
> To begin with, I'm leaning towards (2) because it is a cognitive re-use of
> our release branches, and the problems around classpath compatibility can be
> solved with tests. I'm sure I'm not seeing the whole picture though…
>