I guess option 5 is what we have today in cep-15: the build file grabs the 
relevant SHA for the library. This way you maintain a precise SHA for builds, 
and scripts don’t have to be modified.

I believe this is also possible with git submodules, but I’m happy to bake this 
into our build file instead with a script.
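
To make the idea concrete, here is a rough sketch of the kind of script the build file could call. It is not what cep-15 does today; the pin format (`name = url @ sha`), the function names, and the example URL are all made up for illustration. It only relies on plain `git clone` and `git checkout --detach`.

```python
import subprocess
from pathlib import Path


def parse_pins(text):
    """Parse lines of the (hypothetical) form 'name = url @ sha'.

    Blank lines are skipped and '#' starts a comment.
    """
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, rest = line.partition("=")
        # Split on the last '@' so URLs such as git@host:repo.git still work.
        url, _, sha = rest.rpartition("@")
        pins[name.strip()] = (url.strip(), sha.strip())
    return pins


def checkout_commands(url, sha, dest):
    """Return the git commands that place `dest` at exactly `sha`."""
    return [
        ["git", "clone", "--no-checkout", url, str(dest)],
        ["git", "-C", str(dest), "checkout", "--detach", sha],
    ]


def fetch_pinned(pins_file, libs_dir):
    """Clone every pinned library into libs_dir at its recorded SHA."""
    pins = parse_pins(Path(pins_file).read_text())
    for name, (url, sha) in pins.items():
        dest = Path(libs_dir) / name
        if dest.exists():
            continue  # assume an existing checkout is managed by the developer
        for cmd in checkout_commands(url, sha, dest):
            subprocess.run(cmd, check=True)
```

The pins file would live in the C* tree, so each C* branch records exactly which library commit it builds against, and bumping a library is a one-line change.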

> As the library itself no longer has an explicit version, what I presume you 
> meant by logical version.

I mean that we don’t want to duplicate work and risk diverging functionality 
maintaining what is logically (meant to be) the same code. As a developer, 
managing all of the branches is already a pain. Libraries naturally have a 
different development cadence to the main project, and tying the development to 
C* versions is just an unnecessary ongoing burden (and risk) that we can avoid.

There’s also an additional penalty: we reduce the likelihood of outside 
contributions that target only the libraries. Accord in particular I hope will 
attract outside interest if it is maintained as a separate library, as it has 
broad applicability and is likely of academic interest. Tying it to C* versions 
and coupling it more tightly to the C* codebase makes that less likely. We might 
also see folk interested in our utilities, or our simulator framework, if they 
were to be maintained separately, which could be valuable.




> On 16 Jan 2023, at 10:49, Mick Semb Wever <m...@apache.org> wrote:
> 
> 
>> I think (4) is the only sensible option. It permits different development 
>> branches to easily reference different versions of a library and also to 
>> easily co-develop them - from within the same IDE project, even.
> 
> 
> I've only heard horror stories about submodules. The challenges they bring 
> should be listed and checked.
> 
> Some examples
>  - you can no longer just `git clone …`  (and we clone automatically in a 
> number of places)
>  - same with `git pull …` (easy to be left with out-of-sync submodules)
>  - permanence from a git SHA no longer exists
>  - our releases get more complicated (our source tarballs are the asf 
> releases)
>  - handling patches that cover submodules
>  - switching branches, and using git worktrees, during dev
> 
> I see (4) as a valid option, but concerned with the amount of work required 
> to adapt to it, and whether it will only make it more complicated for the new 
> contributor to the project. For example the first two points are addressed by 
> remembering to do `git clone --recurse-submodules …` . And who would be 
> fixing our build/test/release scripts to accommodate?
> 
> Not blockers, just concerns we need to raise and address.
> 
>  
>> We might even be able to avoid additional release votes as a matter of 
>> course, by compiling the library source as part of the C* release, so that 
>> they adopt the C* release vote (or else we may periodically release the 
>> library as we do other releases)
> 
> 
> Yes. Today we do a combination of first (3) and then (1). Having to make a 
> release of these libraries every time a patch (/feature branch) is completing 
> is a horror story in itself.
> 
>> I might be missing something, does anyone have any other bright ideas for 
>> approaching this problem? I’m sure there are plenty of opinions out there.
> 
> 
> Looking at the problem with these libraries, 
>  - we don't need releases
>  - we don't have a clean version/branch parity to in-tree
>  - codebase parity between branches is important for upgrade tests (shared 
> classloaders)
> 
>  For (2) you mention drift of the "same" version, isn't this only a problem 
> for dtest-api in the way it requires the "same version" of a codebase for 
> compatibility when running upgrade tests? As the library itself no longer has 
> an explicit version, what I presume you meant by logical version.
> 
> To begin with, I'm leaning towards (2) because it is a cognitive re-use of 
> our release branches, and the problems around classpath compatibility can be 
> solved with tests. I'm sure I'm not seeing the whole picture though…
> 
