Re: [GNC-dev] Git branches

john Tue, 15 Nov 2022 21:08:47 -0800

I didn't completely follow all of dymitruk's essay either, but it seems clear 
to me that he's working in a much larger team than we are. His suggestion for 
handling merge conflicts was a shared git rerere cache; I understand the 
principle but I'm not completely clear about the implementation. I had the 
impression that release branches were like feature branches: used once by the 
release team and discarded, and that his team uses Atlassian's Jira to keep 
track of what branches are merged into each release. He didn't go into a lot of 
detail about how to do it, and absent us adopting Jira I don't think that would 
matter much. Anyway I didn't mean to suggest that we should take up that whole 
rather complicated process; I just thought it a useful outline of the 
single-branch strategy.


Not only do we not do semantic versioning, I don't think we even know how. 
https://en.wikipedia.org/wiki/Software_versioning#Semantic_versioning has a 
brief description that says that you bump the major number if you remove or 
change some existing API, the minor number if you add API, and the patch number 
(which we got rid of with 3.0) for all other changes. That doesn't make any 
sense at all for GnuCash the application; it's for libraries. For GnuCash right 
now it would really just be for bindings, and maybe only the Guile bindings.
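
To make that rule concrete, here is a rough Python sketch of the bump logic as 
the Wikipedia summary describes it. The names (Change, bump) are made up for 
illustration and nothing like this exists in GnuCash.

```python
from enum import Enum


class Change(Enum):
    """Kinds of change, following the semantic-versioning summary above."""
    API_REMOVED_OR_CHANGED = "major"
    API_ADDED = "minor"
    OTHER = "patch"


def bump(version: str, change: Change) -> str:
    """Return the next "major.minor.patch" string for the given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change is Change.API_REMOVED_OR_CHANGED:
        # Existing API was removed or changed: bump major, reset the rest.
        return f"{major + 1}.0.0"
    if change is Change.API_ADDED:
        # New API only: bump minor, reset patch.
        return f"{major}.{minor + 1}.0"
    # Everything else (bug fixes and the like): bump patch.
    return f"{major}.{minor}.{patch + 1}"
```

The decision depends only on what happened to the API, which is why the scheme 
fits a library or a set of bindings better than an end-user application.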

Glib has some nice macros, based somewhat on Apple's Availability Macros, in 
https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/gversionmacros.h.in that 
can be used in function declarations to emit or suppress deprecation warnings 
based on a target version of any Glib-based library. We *could* use those 
directly or we could pinch them and adapt them to however we want to manage 
deprecations. Glib and several other GNOME libraries also have deprecated 
directories, e.g. 
https://gitlab.gnome.org/GNOME/glib/-/tree/main/glib/deprecated that they use 
when they deprecate whole classes.
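
The Glib macros are C preprocessor code, but the idea can be sketched in 
Python as an analogy: as I understand it, a warning is emitted only when the 
consumer declares a target version at or past the version in which the symbol 
was deprecated. Everything below (make_deprecation_marker, the version tuples) 
is hypothetical and is not how Glib implements it.

```python
import functools
import warnings


def make_deprecation_marker(min_required):
    """Build a decorator factory for a consumer targeting `min_required`.

    `min_required` is a (major, minor) tuple standing in for the target
    version a consumer would declare at build time.
    """
    def deprecated_in(version, replacement):
        def decorate(func):
            if min_required < version:
                # Consumer targets a release older than the deprecation:
                # leave the function untouched so no warning is raised.
                return func

            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                since = ".".join(str(part) for part in version)
                warnings.warn(
                    f"{func.__name__} is deprecated since {since}; "
                    f"use {replacement} instead",
                    DeprecationWarning,
                    stacklevel=2,
                )
                return func(*args, **kwargs)

            return wrapper
        return decorate
    return deprecated_in
```

A consumer that builds a marker with (4, 0) sees no warning for a function 
deprecated in (4, 8), while one that builds it with (5, 0) does.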

The SQL backend has that per-table version check and built in update queries to 
update a database when the version changes, but since it doesn't write out a 
new DB on every save it has a much greater need for that feature than does the 
XML backend. There's also GNUCASH_RESAVE_VERSION that's available to indicate 
that the database needs to be purged and rewritten from memory. Fortunately we 
haven't needed to change it. The XML backend does have versions on each 
top-level element. They're all 2.0.0, suggesting that either we haven't been 
very good about updating them when we change something or that the schema is a 
lot more stable than we think it is. From a design standpoint I'm not sure that 
versioning every entry is all that useful considering that everything is 
written out fresh with every save.
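
For anyone not familiar with the per-table version check, the general pattern 
looks roughly like the sketch below. It uses Python's sqlite3 module and an 
invented "widgets" table; the table and column names, the upgrade statements 
and the version numbers are all assumptions for illustration, not the actual 
GnuCash SQL backend code.

```python
import sqlite3

# Hypothetical upgrade steps: (table, from_version) -> SQL statements that
# bring that table from from_version to from_version + 1.
UPGRADES = {
    ("widgets", 1): ["ALTER TABLE widgets ADD COLUMN notes TEXT"],
    ("widgets", 2): ["ALTER TABLE widgets ADD COLUMN hidden INTEGER DEFAULT 0"],
}

# Hypothetical table versions that this release of the program writes.
CURRENT = {"widgets": 3}


def upgrade_tables(conn):
    """Step every known table up to the version this release expects."""
    for table, target in CURRENT.items():
        row = conn.execute(
            "SELECT table_version FROM versions WHERE table_name = ?",
            (table,),
        ).fetchone()
        if row is None:
            raise ValueError(f"no version recorded for table {table}")
        version = row[0]
        if version > target:
            # Written by a newer release than this one: refuse to touch it.
            raise ValueError(f"{table} is at version {version}, newer than {target}")
        while version < target:
            # Apply one isolated step at a time and record the new version.
            for statement in UPGRADES[(table, version)]:
                conn.execute(statement)
            version += 1
            conn.execute(
                "UPDATE versions SET table_version = ? WHERE table_name = ?",
                (version, table),
            )
    conn.commit()
```

Each step only knows how to go from one version to the next, so a database 
that skipped several releases simply runs several steps in a row.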

Regards,
John Ralls


> On Nov 15, 2022, at 9:25 AM, Geert Janssens <geert.gnuc...@kobaltwit.be> 
> wrote:
> 
> Op maandag 14 november 2022 19:59:24 CET schreef john:
> > I guess we could do that as long as we continue the no-backports policy, but
> > it's something you argued against when we started using git-flow a few
> > years ago.
> >
> 
> I don't have a clear memory of what I argued against way back then. It 
> doesn't matter much. In reality we have continued to avoid backporting 
> anyway, which is just fine for the small team that we are.
> 
> > But what about the opposite approach, having only one permanent branch and
> > no major releases? Instead of 5.0 next spring we'll release 2023.1 and the
> > spring after that 2024.1, with .2 in June, .3 in September, and .4 in
> > December every year? Major changes, like c++options, get merged when ready;
> > we might do a beta release (e.g. 2023.2beta) a month before a release with
> > a major change to get better user testing. We'd have to work out policies
> > for API and schema changes because it would blow up the file upgrade path
> > for users who've skipped some releases. There's a very dense exposition on
> > this pattern at http://dymitruk.com/blog/2012/02/05/branch-per-feature/.
> 
> It's actually a branch and release pattern I had been considering but was 
> hesitant to bring up as perhaps too radical. Since you now bring it up for 
> consideration, let's evaluate it after all.
> 
> 1. I like the idea of only a single release branch and all development 
> happening on feature or bugfix branches that get merged into this release 
> branch when ready.
> 
> 2. I also like the idea of dropping the distinction between a stable and 
> development series. It would bring improvements to users much faster in 
> general - it will be released when ready, not queued for the next major 
> release (which could be only in 2 years worst case).
> It's a bit like what fast-moving projects such as web browsers currently do.
> 
> 3. Year based release numbering is also very clear. And always gives a 
> reasonable indication of how old a given version of gnucash is.
> 
> 
> On the flip side
> 1. This does do away with semantic versioning completely. But that's the 
> whole point of having only one release branch. Each release can be a mix of 
> bugfixes and new features.
> 2. I imagine this only works well if newly added code (features or bugfixes 
> alike) is well tested, implying having tests written for it. And that the 
> existing code base is well tested as well. While slowly improving, the 
> gnucash code is still not very well covered.
> 
> I also read through the dymitruk article you linked to. There are a few other 
> elements that are not fully clear to me yet:
> 
> * he talks about an integration branch. Is that a branch that people continue 
> to merge their new work in, and that just serves
> a. to discover and resolve merge conflicts early on and
> b. to run an integration test suite on
> Will this branch ever be cleaned or just merge upon merge be added to it ? I 
> have no clear example of how such a branch is used really.
> 
> * there's a separate release branch. Which can be reset from time to time if 
> bad features are to be skipped for the most recent release. Resetting a 
> branch seems to conflict with distributed repositories in my mind. But 
> perhaps this is not a problem if it's commonly known this is a resettable 
> branch. And no devs except for the release manager should really check out 
> this branch and then even only while preparing for releases ? It's a bit 
> vague to me.
> 
> * handling merge conflicts and sharing the resolutions seem to be an 
> important part of the solution. Otherwise these conflicts continue to trip up 
> different devs. There was a suggestion as to how to do this, but nothing 
> concrete. Something to figure out as well.
> 
> 
> 
> As for the API and schema changes, that would indeed require some 
> reconsideration.
> 
> I have a few first thoughts, but nothing well structured:
> 
> * For API the important change to keep in mind is deprecation. New API won't 
> be an issue.  Do we support function signature changes or should a new 
> function be defined in that case ?
> Current policy is that we deprecate in a stable series and remove in a future 
> major release. As our current schedule is a two-year cycle for major 
> releases, we could make the policy "a deprecated feature/function will be 
> kept around for 2 years, after which it will be definitely removed". Other 
> durations can be chosen as well, as long as it's clear. So consumers of the 
> api could at most jump two years ahead from the version they currently use 
> with a guarantee their own code continues to work. At that point they should 
> do the work of updating their code to cope with deprecated api.
> 
> * Alternatively we could maintain a list of deprecated symbols and write a 
> small tool around it that consumers of our API can run to test if their own 
> code still depends on these deprecated symbols. Or not really remove the 
> deprecated symbols but move them to a separate source file that prints out 
> error messages when this removed symbol is still used. These messages could 
> indicate in which version of gnucash the symbol was removed to help users go 
> back to a version that still works with their code. I have not thought this 
> through very deeply to imagine whether this is even feasible.
> 
> * It could also be a mixture of both, keep a removal list for a reasonable 
> amount of time to guide users to compatible older releases, but then finally 
> drop the removed symbol after all. I think much of this could be mostly 
> automated, similarly to how I wrote xml files to keep track of deprecated 
> GSettings schemas.
> 
> * As for schema changes again I see two possibilities.
> 
> 1. The first is to extend slightly on how we do things now, namely if the 
> user's data contains bits that require adjustment, just do them the first 
> time this data is loaded or effectively used. This has slowly become a 
> spaghetti throughout the engine and from time to time we drop bits of this to 
> keep the code manageable (and hence older data can't be upgraded any more). 
> If we start to effectively use the version component of our schemas (both xml 
> and dbi backends have this), we could also require a minimum version of a 
> schema for a certain version of gnucash. Each time we drop a bit of 
> conversion code, we can bump this minimum required schema version (and 
> conveniently guide the user to the last version of gnucash that wrote this 
> version of the schema). So each version of gnucash would have a minimum and a 
> maximum schema version it supports and these versions are updated as we add 
> or remove schema related code.
> 
> 2. The other approach is to implement a separate bit of code that's solely 
> responsible for tracking schema changes (akin to the idea of having a 
> scrubbing function specifically for this reason). Whenever a data file is to 
> be opened, this code can check the schema version and do all the data 
> transformations necessary to bring it to the current schema. The advantage of 
> this would be that we unclutter the rest of the code. The schema changes 
> could also be in separate subfunctions, eg one to do the changes from schema 
> version 1 to 2, a second to do the changes from schema version 2 to 3 and so 
> on. This list could become very long over time, yet stay very clear as each 
> step is nicely isolated and easily readable. For that matter these steps 
> could be in appropriate data transformation languages (xslt or an sql data 
> modelling language) rather than code or even a mixture of both depending on 
> what we need. It would allow us to convert very old data files (well, 
> starting from where we implement this at least) over time without having 
> bits of data conversion code all over the engine code.
> With such a piece of code in place that only grows over time with schema 
> changes we can easily support converting older data files to the most recent 
> schema over very long periods of time.
> 
> All of this also comes with extra work of course which takes developer time 
> away from other interesting improvements. So it's worth evaluating whether this 
> alternative versioning scheme brings enough benefits to be worth the effort.
> 
> Geert

_______________________________________________
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
