Re: Intra-project dependencies

Josh McKenzie Mon, 16 Jan 2023 04:30:49 -0800

>  - permanence from a git SHA no longer exists
With the caveat that I haven't worked with submodules before and only know about 
them from a cursory search, it looks like `git submodule status` would show us 
the SHA for each submodule, and parent projects could reference specific 
SHAs to pull when building with submodules? 
https://git-scm.com/docs/git-submodule/#Documentation/git-submodule.txt-status--cached--recursive--ltpathgt82308203


It seems like our use case is one of the primary ones git submodules are 
designed to address.
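
To make that concrete, here is a minimal sketch (not something we have today) of 
how a build script could check that submodules sit at the SHAs the parent expects. 
It assumes the usual `git submodule status` line layout: a one-character state 
prefix, the SHA, the path, and an optional `git describe` suffix. The submodule 
paths, SHAs and function names below are hypothetical, and paths containing 
spaces are not handled.

```python
import subprocess
from typing import Dict, List, NamedTuple


class SubmoduleEntry(NamedTuple):
    state: str  # ' ' in sync, '-' uninitialized, '+' SHA differs from index, 'U' conflicts
    sha: str
    path: str


def parse_submodule_status(output: str) -> List[SubmoduleEntry]:
    """Parse the text printed by `git submodule status`."""
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # The first character is the state prefix; a space means "in sync".
        state = line[0] if line[0] in "+-U" else " "
        fields = line.lstrip(" +-U").split()
        entries.append(SubmoduleEntry(state, fields[0], fields[1]))
    return entries


def find_drift(entries: List[SubmoduleEntry], pinned: Dict[str, str]) -> List[str]:
    """Return paths whose checked-out SHA differs from the pinned SHA (hypothetical helper)."""
    return [e.path for e in entries if pinned.get(e.path) != e.sha]


def read_status(repo_dir: str = ".") -> str:
    """Run git in repo_dir; requires git on PATH and a repository with submodules."""
    result = subprocess.run(
        ["git", "submodule", "status", "--recursive"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample output, used here instead of calling git.
SAMPLE = (
    " 1111111111111111111111111111111111111111 modules/accord (heads/trunk)\n"
    "+2222222222222222222222222222222222222222 modules/dtest-api (v0.0.13)\n"
)
```

A build would call `find_drift(parse_submodule_status(read_status()), pinned)` and 
fail if the returned list is non-empty, where `pinned` is whatever mapping of path 
to SHA the parent project keeps.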

On Mon, Jan 16, 2023, at 6:40 AM, Benedict wrote:
> 
> I guess option 5 is what we have today in cep-15, have the build file grab 
> the relevant SHA for the library. This way you maintain a precise SHA for 
> builds and scripts don’t have to be modified.
> 
> I believe this is also possible with git submodules, but I’m happy to bake 
> this into our build file instead with a script.
> 
> > As the library itself no longer has an explicit version, what I presume you 
> > meant by logical version.
> 
> I mean that we don’t want to duplicate work and risk diverging functionality 
> maintaining what is logically (meant to be) the same code. As a developer, 
> managing all of the branches is already a pain. Libraries naturally have a 
> different development cadence to the main project, and tying the development 
> to C* versions is just an unnecessary ongoing burden (and risk) that we can 
> avoid.
> 
> There’s also an additional penalty: we reduce the likelihood of outside 
> contributions to the libraries only. Accord in particular I hope will attract 
> outside interest if it is maintained as a separate library, as it has broad 
> applicability, and is likely of academic interest. Tying it to C* version and 
> more tightly coupling with C* codebase makes that less likely. We might also 
> see folk interested in our utilities, or our simulator framework, if they 
> were to be maintained separately, which could be valuable.
> 
> 
> 
> 
>> On 16 Jan 2023, at 10:49, Mick Semb Wever <m...@apache.org> wrote:
>> 
>>> I think (4) is the only sensible option. It permits different development 
>>> branches to easily reference different versions of a library and also to 
>>> easily co-develop them - from within the same IDE project, even.
>>> 
>> 
>> 
>> I've only heard horror stories about submodules. The challenges they bring 
>> should be listed and checked.
>> 
>> Some examples
>>  - you can no longer just `git clone …`  (and we clone automatically in a 
>> number of places)
>>  - same with `git pull …` (easy to be left with out-of-sync submodules)
>>  - permanence from a git SHA no longer exists
>>  - our releases get more complicated (our source tarballs are the asf 
>> releases)
>>  - handling patches that cover submodules
>>  - switching branches, and using git worktrees, during dev
>> 
>> I see (4) as a valid option, but I'm concerned about the amount of work 
>> required to adapt to it, and whether it will make things more complicated 
>> for new contributors to the project. For example the first two points are 
>> addressed by remembering to do `git clone --recurse-submodules …`. And who 
>> would be fixing our build/test/release scripts to accommodate?
>> 
>> Not blockers, just concerns we need to raise and address.
>> 
>>  
>>> We might even be able to avoid additional release votes as a matter of 
>>> course, by compiling the library source as part of the C* release, so that 
>>> they adopt the C* release vote (or else we may periodically release the 
>>> library as we do other releases)
>>> 
>> 
>> 
>> Yes. Today we do a combination of first (3) and then (1). Having to make a 
>> release of these libraries every time a patch (/feature branch) is 
>> completed is a horror story in itself.
>> 
>> 
>>> I might be missing something, does anyone have any other bright ideas for 
>>> approaching this problem? I’m sure there are plenty of opinions out there.
>>> 
>> 
>> 
>> Looking at the problem with these libraries, 
>>  - we don't need releases
>>  - we don't have a clean version/branch parity to in-tree
>>  - codebase parity between branches is important for upgrade tests (shared 
>> classloaders)
>> 
>>  For (2) you mention drift of the "same" version. Isn't this only a problem 
>> for dtest-api in the way it requires the "same version" of a codebase for 
>> compatibility when running upgrade tests? As the library itself no longer 
>> has an explicit version, that is what I presume you meant by logical version.
>> 
>> To begin with, I'm leaning towards (2) because it is a cognitive re-use of 
>> our release branches, and the problems around classpath compatibility can be 
>> solved with tests. I'm sure I'm not seeing the whole picture though…
>>
