Re: [Plplot-devel] PLplot 6 and git

Alan W. Irwin Sun, 24 May 2015 15:02:31 -0700

Hi Phil:

This is long, but you have given me lots to respond to.  :-)

On 2015-05-24 09:37+0100 Phil Rosenberg wrote:

> Hi Alan and Dave
>
> Some specific comments first, then some general ones after.
>
>> Fundamentally, the git world is split on the rebase-only
>> versus merge-only question
> As it happens I fall in the merge camp. But for the work we have been
> doing up to now I don't feel there has been much difference either
> way. But this is personal and I know you feel strongly about rebase
> only so I have no intention to push you on this.

Just to explain further why I am being so conservative about this....
The advice I got from Brad King (who has experience advising a large
number of software projects on the svn to git transition) is to stick
with our current rebase-only workflow until all developers are up to
speed with git.  I would argue we are not there yet since some of the
PLplot developers who were active in the svn era have not contributed
a single commit yet in the git era.  Some of those might just be
missing in action for other reasons, but I know of at least one case
where intimidation concerning git has played a significant role in
delaying participation.  I am hoping he will overcome his fears and
become an active developer for PLplot again.
So this is definitely not a good time to start fooling with the
workflow.

Once we do get to the stage of being up to speed with git as a
development team, Brad went on to argue that moving to a merge-only
workflow that preserved a clean first-parent shape of history was not
for the faint-hearted and would require a merge czar (his current role
in CMake development) to fight through all the complicated merge
issues for the master branch with that merge czar being essentially
the only one in control of the master branch.  So the choices for the
merge-only model seem to be either to have a merge czar or to abandon
all workflow rules, which would leave the history DAG extremely
chaotic.

I don't like either of those choices.

I think none of us, including me, are qualified to be the merge czar,
and in any case I think that is bad politics for such a small
development community to have just one or two gatekeepers for the
master branch.  In my view it is much better for all our active
developers to feel responsible for the quality of PLplot including our
git history, and the rule for enforcing rebase-only workflow (no merge
commits allowed) is in principle a lot easier to understand than the
much more complicated rule required to have a good first-parent shape
for the history.

I think providing a meaningful history is really important for such
development work as git bisection to find regressions.  To expand on
that concern, I just skimmed an interesting paper called "Fighting
regressions with git bisect" by Christian Couder which I highly
recommend to others here.  (You can probably find it with a google
search, but for Debian wheezy it appears as 
<file:///usr/share/doc/git/html/git-bisect-lk2009.html>.) The
principal conclusion I drew from this paper was git bisection results
are more reliable the simpler the history.  And git bisection really
is a killer app that I and others used quite a bit in the last release
cycle to find regressions, and I would personally hate to compromise
that killer app by allowing our history to be chaotic due to an
uncontrolled merge workflow.
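
To illustrate the bisection point, here is a minimal sketch run in a
scratch repository with a strictly linear history.  The file names,
the commit messages, and the "grep" check are all made up for the
demonstration; nothing here touches the PLplot tree or its test
scripts.

```sh
#!/bin/sh
set -e
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Scratch repository (illustrative only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Linear history of 8 commits; commit 5 introduces the "regression".
for i in 1 2 3 4 5 6 7 8; do
    if [ "$i" -lt 5 ]; then echo pass > status.txt; else echo fail > status.txt; fi
    echo "change $i" >> log.txt
    git add -A
    git commit -q -m "commit $i"
done

# Bisect between the root commit (known good) and HEAD (known bad).
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null
# "git bisect run" marks a revision good when the command exits 0,
# bad when it exits with 1..127 (except 125, which means "skip").
git bisect run grep -q pass status.txt >/dev/null
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```

With a linear history every revision that bisection visits lies on the
single path between the good and bad commits, which is what keeps the
result easy to interpret.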

>> However, we could allow for deliberate rebasing (e.g., to
>> propagate a must-have fix from master to throwaway_plplot6), but that
>> would have to be scheduled a couple of days in advance
> I feel very strongly against this. If somebody misses the deadline
> because they are off email or their work isn't in a state to commit
> (i.e. it would break the build) then we could easily lose large chunks
> of work that somebody has created. In my opinion we absolutely must
> not rebase a branch we are working on. Ever.

I think that restriction is too strong.  Instead, in response to your
concern, I think we could establish the rule that the guy who is
proposing the rebase could simply wait for a positive OK from
everybody who is actively working on the public topic branch (which
might mean in practice the rebase only occurs right at the end when
the topic has matured, see below).

>
> So perhaps some general points now
>
> In the last development cycle I tried to work simultaneously on my
> Windows machine and two Linux machines to test my personal branches on
> all three. The rebase only workflow made this essentially impossible.
> I continually ended up in situations where I broke my repos because I
> rebased some work that existed in another repo and this caused massive
> issues. This was one person with three checked out versions of the
> code and it was a nightmare. If we have more than one person then I
> can guarantee we will break people's repos with almost every rebase.

I think this is an illustration of the point I made previously in this
thread that pushing between multiple servers is virtually a guarantee
of merge commits which are prohibited under the rebase-only workflow.

However, I think you could have made the above work quite simply as
follows (or at least I would like to see you try this method next
time):

1. Always keep master on each server exactly the same as master on
our official SF server.

2. Always rebase your topic branch on each server on that (common)
master branch.

Those two rules mean every topic branch on each of your servers is
identical except for additional development you have made on that
topic branch on one of your servers.  But then it should be trivial
using the "git format-patch"/"git am" method to update all your
servers' topic branches to be identical with the one where you have
done additional recent development.  Note especially I have found the
--interactive option of "git am" and "git log --oneline" to be quite
effective in selecting the commits that will be applied from a series
generated by "git format-patch".
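
A minimal sketch of that method, using scratch repositories only.
The directory names "official", "box_a" and "box_b" are hypothetical
stand-ins for the SF repository and two of your machines, and the
branch name "topic" and the file contents are likewise made up.

```sh
#!/bin/sh
set -e
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# "official" stands in for the SF repository.
git init -q official
cd official
git symbolic-ref HEAD refs/heads/master
echo base > file.txt
git add file.txt
git commit -q -m "base"
cd ..

# Rule 1: master on each machine is identical to the official master.
git clone -q official box_a
git clone -q official box_b

# Rule 2: the topic branch on each machine sits on that common master.
git -C box_a checkout -q -b topic master
git -C box_b checkout -q -b topic master

# New development happens on box_a only.
cd box_a
echo one >> file.txt; git commit -q -a -m "topic: step one"
echo two >> file.txt; git commit -q -a -m "topic: step two"
# One patch file per commit that is on topic but not on master.
git format-patch -q -o "$work/patches" master..topic

# Apply the series on box_b; no merge commit is created.
cd ../box_b
git am -q "$work"/patches/*.patch
cd ..

tree_a=$(git -C box_a rev-parse 'topic^{tree}')
tree_b=$(git -C box_b rev-parse 'topic^{tree}')
echo "box_a tree: $tree_a"
echo "box_b tree: $tree_b"
```

The sketch compares tree IDs rather than commit IDs because "git am"
records a new committer date, so the commit IDs on the two machines
need not be identical even though the content is.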

> One question that might have an impact. Do we wish to continue
> supporting PLplot 5 for some time with bug fixes? It might be that
> some users have legacy software that relies on v5 API. So maybe we
> should consider having permanently separate v5 and v6 branches? I'm
> not sure what this does to our development model.

See below for an answer to this question.

>
> I'm not sure this is right, but I would assume that if we apply a bug
> fix to the v5 branch then create a patch of this commit and apply that
> to the v6 branch then if we ever merge (or rebase) the branches then
> git is clever enough to not create a conflict. Is this correct?

I don't think we should limit how we develop on throwaway-plplot6 by
trying to avoid in advance rebase conflict issues.  So using patches
from master to throwaway-plplot6 or rebasing (if you can get complete
agreement to that step for all active developers at the time) should
be fine.  Of course, when we do our final rebase before the merge (see
below), we will just have to deal with conflicts the way that is
described in "git help rebase".

>
> So in my opinion we have limited options (in no particular order)
> 1)We just don't run a parallel v6 branch.
> 2)We run a parallel branch permanently and if we have commits we wish
> to apply to both v5 and v6 we do so with a patch
> 3)We run a parallel branch permanently and if we have commits we wish
> to apply to both v5 and v6 we do a rebase (I think this would be very
> bad!!!)
> 4)We move to a merge workflow
> 5)We hide our v6 branch so we only break our own when we rebase only
> once when v6 is ready (already discounted by Alan)
>
> Out of all those perhaps the idea of having a v5 and v6 branch that we
> actually never merge together, and use patches to commit to both gives
> us the advantage of parallel branches and also rebase workflow?
>

That last is pretty close to one of the two options I proposed so I
think we are quite close to consensus here.  In my proposal the names
of the two public branches would be throwaway-plplot6 for PLplot 6
development and master for PLplot 5 development.  And even if you are
uncomfortable with the rebase method I proposed above to deal with the
developer who is temporarily out of e-mail contact, I think it is
important to rebase at least when throwaway-plplot6 has matured to
make sure all innovations and bug fixes on the master branch that are
relevant to PLplot 6 are carried over when throwaway-plplot6 is
merged into master.

To make that proposal more specific we should do the following once 
throwaway-plplot6 has matured.

1. Tag the tip of the master branch (with a name like
plplot5-branchpoint for easy future reference).

2. Rebase throwaway-plplot6 onto master (making sure that
everyone is aware of this so that nobody is left behind
by this change).

3. Merge throwaway-plplot6 into master with "git merge --ff-only".

4. Delete throwaway-plplot6.
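
The four steps above can be sketched as follows in a scratch
repository.  The file names and commit messages are made up, the tag
is a lightweight one (an annotated tag would serve equally well), and
the two branches touch different files so that the rebase in this toy
example has no conflicts to resolve.

```sh
#!/bin/sh
set -e
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > core.txt; git add core.txt; git commit -q -m "base"

# PLplot 6 work on the public topic branch.
git checkout -q -b throwaway-plplot6
echo v6 > v6.txt; git add v6.txt; git commit -q -m "plplot6: new core"

# Meanwhile a bug fix lands on master.
git checkout -q master
echo fix >> core.txt; git commit -q -a -m "plplot5: bug fix"

# Step 1: tag the tip of master.
git tag plplot5-branchpoint
# Step 2: rebase the topic branch onto master.
git rebase -q master throwaway-plplot6
# Step 3: fast-forward master; --ff-only refuses to make a merge commit.
git checkout -q master
git merge -q --ff-only throwaway-plplot6
# Step 4: delete the topic branch.
git branch -q -d throwaway-plplot6

merges=$(git rev-list --merges --count master)
echo "merge commits on master: $merges"
```

Because step 3 uses --ff-only, it fails outright if step 2 was skipped
or master has moved since the rebase, so the no-merge-commits rule is
checked by git itself rather than by convention.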

In other words, we generally treat the public throwaway-plplot6 branch
the same as we ordinarily treat a small private topic branch except
that the rebasing of throwaway-plplot6 will probably be more limited
due to the developer coordination reasons discussed above.

In sum, I think this is a good compromise git development proposal
that works well for our rebase-only workflow, and which also satisfies
your concerns with ease of collaboration for many developers working
on a private topic branch.

Assuming you agree with this proposal and nobody else can find some
git issue with it, then I would like to flesh out how I visualize the
transition between PLplot 5 and PLplot 6.

The maturation stages that PLplot 6 will go through are something
like the following:

1. Just beginning to work, i.e., parts of it work on one specific
platform. (I assume it would take you only a week or so to achieve
this with a C++ plplot core with decent error propagation and a
corresponding C wrapper.)

2. Partly working, i.e., most components work on most platforms we
have access to.  We would want this to be an all-out effort
concentrating strictly on introducing all backwards-incompatible
changes we want for PLplot 6.  That is, I hope this can be done in a
month or so rather than dragging it out for years, which would invite
the feature creep that tends to be an issue for unreleased software.

3. Mostly working, i.e., the comprehensive test script works for
everyone on all platforms we can access as well as it does for PLplot
5.  The amount of time and effort going into this part of the effort
will depend strongly on how much test platform coverage we already
have achieved for PLplot 5, but I am hoping we achieve large coverage
for both PLplot 5 very soon now and PLplot 6 when it is ready so that
users of either will have a smooth ride on most platforms.

4. Completely works for us, e.g., the comprehensive test script without
any special options (i.e., testing everything) works for all of us
on the various platforms we have access to.

Note, I continue to strive for stage 4 with PLplot 5 which makes stage
3 a moving target for PLplot 6.  For example, with my help both in
selecting packages and with any further build system issues we run
into I hope Arjen will be able to achieve perfect comprehensive test
results in the next few weeks for a Cygwin platform where all possible
PLplot prerequisites have been installed.  And once that goal is
achieved and with willingness of our developers to run the tests on
the various platforms accessible to them, it should be fairly
straightforward to achieve similar good results on MinGW-w64/MSYS2 (as
a close analog of the Cygwin success) and Mac OS X (as a fairly close
Unix analog of our Linux success). But, I am pretty sure the
equivalent comprehensive testing success on MSVC is going to be more
difficult to achieve (even for the limited prerequisites that are
available on that platform) so I expect the distinction between stages
3 and 4 for PLplot 6 is going to be a real one for some time to come.

Also note that PLplot 6 is going to be backwards-incompatible with
PLplot 5 so making that transition is going to be painful for our
users without much to gain from their perspective other than a
superior response to errors. So we want to reduce their pain as much
as possible by avoiding a release of PLplot 6 before it is ready.
Furthermore, we have limited manpower so we want to minimize the
length of time we have to support both PLplot 5 and PLplot 6, and the
best way to do that is to not release PLplot 6 until it is ready, i.e.,
it has _completely_ achieved stage 3 above.  So what I have in mind is
only when stage 3 has been achieved do we do the above steps to merge
throwaway-plplot6 onto master and release PLplot-5.99.y releases (6.0
release candidates) soon after for our users to evaluate followed by
the 6.0.0 release just as soon as we have no more user complaints
about those release candidates.

Note, I am emphasizing speed of development here and reliability
rather than features.  However, it sounds from what you have said that
propagating all error conditions will be trivial so it will be nice to
have that feature right away as (say) the sole initial selling point
of PLplot 6.0.0 to help pay for the user pain of the transition
from PLplot 5 to 6.

If we follow that model it should be possible in theory to go smoothly
from the last PLplot 5.x release (where x < 99) to the release of
6.0.0 with no overlap in support between PLplot 6 and 5, but we should
definitely tag the last PLplot 5 commit (as I stated in the details
above), and subsequently (if needed) use that tag as the origin of a
semi-permanent public plplot5 branch if we need to do further PLplot
5.x bug-fix releases. Obviously in this case the plplot5 branch would
never be rebased or merged with the master branch since PLplot 5 would
be a dead-end branch of development (only devoted to minimal bug
fixing) by design.
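
A minimal sketch of starting such a plplot5 branch from the tag, again
in a scratch repository with made-up file names and commit messages.
The tag and branch names are the ones used in this message.

```sh
#!/bin/sh
set -e
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > core.txt; git add core.txt
git commit -q -m "last PLplot 5 commit"
git tag plplot5-branchpoint

# master moves on to PLplot 6.
echo v6 > v6.txt; git add v6.txt; git commit -q -m "PLplot 6 work"

# Later: a PLplot 5.x bug-fix release turns out to be needed.
git checkout -q -b plplot5 plplot5-branchpoint
echo fix >> core.txt; git commit -q -a -m "PLplot 5 bug fix"

# The two branches share history only up to the tagged commit.
fork=$(git merge-base master plplot5)
tagged=$(git rev-parse 'plplot5-branchpoint^{commit}')
echo "fork point:  $fork"
echo "tagged tip:  $tagged"
```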

Sorry this has been so long, but there is a lot to think about beyond
just the programming aspects when planning our move from PLplot 5 to
PLplot 6!

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________

