Re: Bug when git rev-list options "--first-parent" and "--ancestry-path" are used together?
On 05/24/2013 08:12 PM, Junio C Hamano wrote: > Michael Haggerty writes: > >> Now assume a slightly more complicated situation, in which master has >> been merged to feature branch at some point: >> >> o - 0 - 1 - 2 - 3 - 4← master >> \ \ >> A - B - C - D ← branch >>\ / >> X - Y >> >> Now when we do an incremental merge branch into master, how do we >> generate the list of commits to put one each axis of the table? The >> merge base is "2", so I think the best answer is >> >> 1- 2 - 3 - 4 ← master >>| || >>C - C3 - C4 >>| || >>D - D3 - D4 ← final merge >>↑ >> branch > > I am not sure if that is the best answer, though. > > After managing to create Cn, if a change between C and D (which come > from X and Y) is too complex, wouldn't you want to break down the > task to come up with Dn recursively into a smaller subtask of > merging X first and then Y on top and finally D? OK, so let's assume that C3 is done and D3 is giving us problems: o - 0 - 1 - 2 - 3 - 4← master \ \ \ \ \ C3- ? \ \ / / A - B - C - D ← branch \ / X - Y Your proposal is not to merge D directly but rather to merge X then Y. o - 0 - 1 - 2 - 3 - 4← master \ \ \ \ \ C3 --- X3 - Y3 - D3 \ \ / /// A - B - C - D- / -- / ---' ← the lines here... \ / // X - Y- / ---' ← ...and here don't intersect \/ `--' The problem is that the merges that would be required are not able to take advantage of the conflict resolution that was done in D: The merge base for X3 would be B. This is a little bit wrong because X3 includes C among its ancestors, so creating X3 requires some of the conflicts between X and C (which were resolved once in D) to be resolved again. The merge base for Y3 would be X. Note that Y3 already includes C and Y among its ancestors. Therefore, resolving Y3 involves resolving the same conflicts between Y and C that were already resolved in D. But since merge Y3 doesn't know about D, the user would be forced to resolve those conflicts again (albeit maybe helped by something like rerere). And merge D3 would have two merge bases, C and Y. This is related to the fact that there are now two independent known resolutions for merging C and Y, namely D and Y3. Given that Y3 in the above scenario needs to include include C (via C3) and also Y, it seems to me that this merge is superfluous. It should have exactly the same content as D3, assuming that the conflicts are resolved the same way. Therefore one could skip Y3 and proceed directly to D3: o - 0 - 1 - 2 - 3 - 4← master \ \ \ \ \ C3 --- X3 - D3 \ \ / // A - B - C - D- / ---'← the lines here... \ / / X - Y / ← ...and here don't intersect \/ `--' This merge could take advantage of the conflict resolution that was done in D. It would have an unambiguous merge base, C. But I still think that this approach is not as clean as an incremental merge of two linear branches, because X3 requires some of the same conflicts to be resolved as were already resolved in D. Incidentally, if merge D had been done incrementally and the full incremental merge resolution had been kept, then we would have the missing merge CX that would allow us to compute D3 incrementally: o - 0 - 1 - 2 - 3 - 4← master | | | A - B - C - C3 | | | X - CX- C3X | | | Y - D - D3 I think that all of the required merges have sane merge-bases and take advantage of all of the merges that have been done previously. This is another case where an incremental merges contains information that can be useful for the future. >> The simplest way I can think of to generate the list C,D is >> >> git rev-list --first-parent --ancestry-path 2..D >> >> We need --ancestry-path to avoid getting commits A and B. It's still >> not clear that this is always the best approach but at least it seems >> safe. > > Hmm, while I agree that A and B will be omitted by using ancestry > path on the example topology, I need to be convinced that it is > impossible to end up with disjoint segments of a history in any > ancestry graph by combining -f-p and -a-p that way to feel "safe". If by "disjoint" you mean that the history contains gaps, that is exactly what happens in the case I described in my last email: o - 0 - 1 - 2 - 3 - 4← master \ \ A - B - C --. \ \ X - Y - D← branch The result of "git rev-list --first-parent --ancestry-path 2..D" would be only D and commit C would disappear into the "gap". I am willing to accept that for my application, becau
Re: Bug when git rev-list options "--first-parent" and "--ancestry-path" are used together?
[Junio, sorry for the dup; somehow I failed to CC the first version to the mailing list.] On 05/23/2013 07:20 PM, Junio C Hamano wrote: > Michael Haggerty writes: > >> It seems to me that >> >> git rev-list --first-parent --ancestry-path A..B >> >> is well-defined and should list the commits in the intersection between >> >> git rev-list --first-parent A..B >> >> and >> >> git rev-list--ancestry-path A..B >> >> But in many cases the first command doesn't provide any output even >> though there are commits common to the output of the last two commands. >> >> For example, take as an example the DAG from test t6019: >> >> # D---E---F >> # / \ \ >> #B---C---G---H---I---J >> # / \ >> # A---K---L--M >> >> (The merges are always downwards; e.g., the first parent of commit L is >> K.) The command >> >> git rev-list --first-parent --ancestry-path D..J >> >> doesn't generate any output, whereas I would expect it to output "H I >> J". > > As I do not see how "only show first-parent chains from near the tip > but stop immediately when the chain deviates from the ancestry path" > could be a sensible operation (in other words, I do not offhand > think of examples of what useful things you can do with that > information), I actually expect that "-f-p -a-p D..J" should error > out, instead of giving no output. > > You are correct to point out that sometimes -f-p and -a-p _could_ be > compatible, e.g. "-f-p -a-p A..M", or "-f-p -a-p B..M". But I think > the only case that they are compatible is when "-f-p" output is a > strict subset of what "-a-p" without "-f-p" would give. I guess I should tell you the application that motivated the use of these options together. Maybe you can suggest a better approach. I'm trying to find the most general conditions where an incremental merge [1,2] makes sense. The simple case is merging two linear branches, call them "branch" and "master": o - 0 - 1 - 2 - 3 - 4← master \ A - B - C - D ← branch My tool git-imerge does this (conceptually) by constructing the pairwise merges between each commit on master and each commit on branch: o - 0 - 1 - 2 - 3 - 4 ← master | |||| A - A1 - A2 - A3 - A4 | |||| B - B1 - B2 - B3 - B4 | |||| C - C1 - C2 - C3 - C4 | |||| D - D1 - D2 - D3 - D4 ← final merge ↑ branch Each of the new items [ABCD][1234] is a merge commit between its neighbor above and its neighbor to the left. The idea is that the pairwise merges are less likely to conflict, and if they conflict are likely to be easy to reconcile. I am trying to generalize this approach to a more complicated situation where the two branches are not linear. The current incremental merge algorithm requires that each commit 0,1,2,3,4 and 0,A,B,C be a direct descendant of its predecessors. That's trivial for linear branches. But if the situation looks like this: o - 0 - 1 - 2 - 3 - 4← master \ A - B - C - D ← branch \ / X - Y then there is no correct ordering of 0,A,B,C,D,X,Y that has the descendancy property. So currently I always take use the first-parent list of commits on each branch (i.e., omitting X and Y), which does have that property. Now assume a slightly more complicated situation, in which master has been merged to feature branch at some point: o - 0 - 1 - 2 - 3 - 4← master \ \ A - B - C - D ← branch \ / X - Y Now when we do an incremental merge branch into master, how do we generate the list of commits to put one each axis of the table? The merge base is "2", so I think the best answer is 1- 2 - 3 - 4 ← master | || C - C3 - C4 | || D - D3 - D4 ← final merge ↑ branch The simplest way I can think of to generate the list C,D is git rev-list --first-parent --ancestry-path 2..D We need --ancestry-path to avoid getting commits A and B. It's still not clear that this is always the best approach but at least it seems safe. For example, suppose that the first parent of D is Y rather than C: o - 0 - 1 - 2 - 3 - 4← master \ \ A - B - C --. \ \ X - Y - D← branch In this case, the above command (it it worked correctly) would just output "D". This is correct and could be used for a (degenerate, in this case) incremental merge [3]. This is my reason for wanting --first-parent and --ancestry-path to work together. For now I'm just running rev-list twice and computing the intersection by hand, but it was surprising that git could not do this for me. Michael [1] https://github.com/mhagger/git-imerge [2] http://softwareswirl.blogspot.com/2013/05/git-imerge-practical-introduction.html (and references therein) [3] It might be more helpful to use C,D in this
Re: Bug when git rev-list options "--first-parent" and "--ancestry-path" are used together?
Michael Haggerty writes: > It seems to me that > > git rev-list --first-parent --ancestry-path A..B > > is well-defined and should list the commits in the intersection between > > git rev-list --first-parent A..B > > and > > git rev-list--ancestry-path A..B > > But in many cases the first command doesn't provide any output even > though there are commits common to the output of the last two commands. > > For example, take as an example the DAG from test t6019: > > # D---E---F > # / \ \ > #B---C---G---H---I---J > # / \ > # A---K---L--M > > (The merges are always downwards; e.g., the first parent of commit L is > K.) The command > > git rev-list --first-parent --ancestry-path D..J > > doesn't generate any output, whereas I would expect it to output "H I > J". As I do not see how "only show first-parent chains from near the tip but stop immediately when the chain deviates from the ancestry path" could be a sensible operation (in other words, I do not offhand think of examples of what useful things you can do with that information), I actually expect that "-f-p -a-p D..J" should error out, instead of giving no output. You are correct to point out that sometimes -f-p and -a-p _could_ be compatible, e.g. "-f-p -a-p A..M", or "-f-p -a-p B..M". But I think the only case that they are compatible is when "-f-p" output is a strict subset of what "-a-p" without "-f-p" would give. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Bug when git rev-list options "--first-parent" and "--ancestry-path" are used together?
It seems to me that git rev-list --first-parent --ancestry-path A..B is well-defined and should list the commits in the intersection between git rev-list --first-parent A..B and git rev-list--ancestry-path A..B But in many cases the first command doesn't provide any output even though there are commits common to the output of the last two commands. For example, take as an example the DAG from test t6019: # D---E---F # / \ \ #B---C---G---H---I---J # / \ # A---K---L--M (The merges are always downwards; e.g., the first parent of commit L is K.) The command git rev-list --first-parent --ancestry-path D..J doesn't generate any output, whereas I would expect it to output "H I J". Similarly, git rev-list --first-parent --ancestry-path D..M doesn't generate any output, whereas I would expect it to output "L M". For fun, the attached script computes the output for all commit pairs in this DAG and outputs the discrepancies that it finds. (It should be run in directory "t/trash directory.t6019-rev-list-ancestry-path" after t6019 was run with "-d".) Is this a bug or are my expectations wrong? Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ x.sh Description: Bourne shell script