RE: Regression in git-subtree.sh, introduced in 2.20.1, after 315a84f9aa0e2e629b0680068646b0032518ebed

Strain, Roger L. Thu, 03 Jan 2019 07:43:13 -0800

> -----Original Message-----
> From: Johannes Schindelin <[email protected]>
> 
> Hi Roger,
> 
> 
> On Wed, 2 Jan 2019, Strain, Roger L. wrote:
> 
> > TL;DR: Current script uses git rev-list to retrieve all commits which
> > are reachable from HEAD but not from <abc123>. Is there a syntax that
> > will instead return all commits reachable from HEAD, but stop traversing
> > when <abc123> is encountered? It's a subtle distinction, but important.
> 
> Maybe you are looking for the --ancestry-path option? Essentially, `git
> rev-list --ancestry-path A..B` will list only commits that are reachable
> from B, not reachable from A, but that *can* reach A (i.e. that are
> descendants of A).
> 
> Ciao,
> Johannes


Thanks for the suggestion, but I don't think that one does quite what is needed 
here. It did provide a good sample graph to consider, though. Subtree needs to 
rebuild history and tie things in to previously reconstructed commits. Here's 
the sample graph from the --ancestry-path portion of the git-rev-list manpage:

            D---E-------F
           /     \       \
          B---C---G---H---I---J
         /                     \
        A-------K---------------L--M

Subtree maps mainline commits to known subtree commits, so let's assume we have 
a mapping of D to D'. As documented, if we were to rev-list D..M normally, we'd 
get all commits except D itself, and D's ancestors B and A. So the "normal" 
result would be:

                E-------F
                 \       \
              C---G---H---I---J
                               \
                K---------------L--M

This is bad for subtree, because commit C's parent is B, which is not a known 
commit to subtree, and which wasn't included in the list of commits to convert. 
It therefore assumes C is an initial commit, which is wrong. Likewise K's 
parent A isn't in the list to convert, so K is assumed to be an initial commit, 
which also is wrong. (E is okay here, because E's parent is D, and D maps to 
D', so we can stitch that history together properly.)
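
As a rough illustration, the problem can be reproduced with a toy Python model 
of the manpage graph. This is only a sketch: PARENTS, plain_range and orphaned 
are hypothetical names invented here, not anything in git or git-subtree.sh, 
and the set difference merely imitates what `rev-list D..M` selects.

```python
# Toy model of the manpage graph: each commit maps to its list of parents.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"], "F": ["E"],
    "G": ["C", "E"], "H": ["G"], "I": ["H", "F"], "J": ["I"],
    "K": ["A"], "L": ["K", "J"], "M": ["L"],
}

def ancestors_inclusive(tip, parents):
    """Return tip plus every commit reachable from it via parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def plain_range(exclude, tip, parents):
    """Imitate `rev-list exclude..tip`: reachable from tip, minus reachable from exclude."""
    return ancestors_inclusive(tip, parents) - ancestors_inclusive(exclude, parents)

def orphaned(selected, known, parents):
    """Selected commits with a parent that is neither selected nor already mapped."""
    return sorted(
        c for c in selected
        if any(p not in selected and p not in known for p in parents[c])
    )

selected = plain_range("D", "M", PARENTS)
print(sorted(selected))                    # ['C', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M']
print(orphaned(selected, {"D"}, PARENTS))  # ['C', 'K']
```

C and K are the two commits that would be mistaken for initial commits; E is 
not reported because its parent D is in the known set.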

By using --ancestry-path, we would instead get only the commits that are both 
descendants of D and ancestors of M, as documented:

                E-------F
                 \       \
                  G---H---I---J
                               \
                                L--M

This actually moves us in the wrong direction, as now both G and L have one 
known parent and one unknown parent; I'm not sure how the script would handle 
this, but we actually end up with less information.
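
The same kind of toy model shows why this is worse. The sketch below assumes 
that --ancestry-path keeps only commits in D..M that have D among their 
ancestors, which is how the manpage describes it; PARENTS, reachable and 
ancestry_path are hypothetical names used only for illustration.

```python
# Toy model of the manpage graph: each commit maps to its list of parents.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"], "F": ["E"],
    "G": ["C", "E"], "H": ["G"], "I": ["H", "F"], "J": ["I"],
    "K": ["A"], "L": ["K", "J"], "M": ["L"],
}

def reachable(tip, parents):
    """Return tip plus every commit reachable from it via parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def ancestry_path(bottom, tip, parents):
    """Commits in bottom..tip that are also descendants of bottom."""
    in_range = reachable(tip, parents) - reachable(bottom, parents)
    return {c for c in in_range if bottom in reachable(c, parents)}

path = ancestry_path("D", "M", PARENTS)
known = {"D"}
# Commits with at least one parent that is neither selected nor mapped.
dangling = sorted(
    c for c in path
    if any(p not in path and p not in known for p in PARENTS[c])
)
print(sorted(path))  # ['E', 'F', 'G', 'H', 'I', 'J', 'L', 'M']
print(dangling)      # ['G', 'L']
```

G loses its parent C and L loses its parent K, so both merges are left with 
one parent that cannot be resolved.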

In this case, what I need is a way to trace back history along all merge 
parents, stopping only when I hit one of multiple known commits that I can 
directly tie back to. In this instance, subtree *knows* what D maps to, so any 
time D is encountered, we can stop tracing back. But if I can get to one of D's 
ancestors through another path, I need to keep following that path. Here's what 
I need for this to work properly:

                E-------F
                 \       \
          B---C---G---H---I---J
         /                     \
        A-------K---------------L--M

To give one more example (since removing a single commit frankly isn't very 
interesting) let's say that I have known subtree mappings for both D = D' and G 
= G'. I would therefore need to find all commits which are ancestors of M, but 
stop tracing history when I reach *either* D or G. Note that if I can reach a 
commit from any other path, I still need to know about it. Here's what we 
ultimately would want to find:

                E-------F
                         \
                      H---I---J
                               \
        A-------K---------------L--M

In this case, commit E references known commit D as a parent, which maps to 
D', so E is good. Commit H references known commit G as a parent, which maps 
to G', so H is good. Commit K references A, which is itself an initial commit 
and so is converted to A' (just as it has been on previous subtree runs), so K 
is good as well.
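
The traversal described above can be sketched in a few lines of Python. This 
is a toy model under stated assumptions, not a proposal for the script itself: 
PARENTS and walk_until_known are hypothetical names, and the walk simply 
refuses to step onto (or past) any commit in the known set while still 
following every other parent link.

```python
# Toy model of the manpage graph: each commit maps to its list of parents.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"], "F": ["E"],
    "G": ["C", "E"], "H": ["G"], "I": ["H", "F"], "J": ["I"],
    "K": ["A"], "L": ["K", "J"], "M": ["L"],
}

def walk_until_known(tip, known, parents):
    """Collect commits reachable from tip, stopping at each known commit.

    A known commit is neither collected nor walked through, but its
    ancestors are still collected if some other path reaches them.
    """
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit in seen or commit in known:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return seen

only_d = walk_until_known("M", {"D"}, PARENTS)
d_and_g = walk_until_known("M", {"D", "G"}, PARENTS)
print(sorted(only_d))   # ['A', 'B', 'C', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M']
print(sorted(d_and_g))  # ['A', 'E', 'F', 'H', 'I', 'J', 'K', 'L', 'M']
```

With only D known, B and A are still found through C and K. With both D and G 
known, B and C drop out because every path to them passes through D or G, 
while A remains reachable through K.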

I'll keep digging around a little bit, but I'm starting to think the necessary 
plumbing for this operation might not exist. If I can't find it, I'll see if 
there's some way to unroll that recursive call.

-- 
Roger
