Re: Bug when git rev-list options --first-parent and --ancestry-path are used together?

2013-05-25 Thread Michael Haggerty
On 05/24/2013 08:12 PM, Junio C Hamano wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:
 
 Now assume a slightly more complicated situation, in which master has
 been merged to feature branch at some point:

 o - 0 - 1 - 2 - 3 - 4        ← master
      \       \
       A - B - C - D          ← branch
            \     /
             X - Y

 Now when we do an incremental merge of branch into master, how do we
 generate the list of commits to put on each axis of the table?  The
 merge base is 2, so I think the best answer is

 1 - 2 - 3  - 4   ← master
     |   |    |
     C - C3 - C4
     |   |    |
     D - D3 - D4  ← final merge
     ↑
   branch
 
 I am not sure if that is the best answer, though.
 
 After managing to create Cn, if a change between C and D (which come
 from X and Y) is too complex, wouldn't you want to break down the
 task to come up with Dn recursively into a smaller subtask of
 merging X first and then Y on top and finally D?

OK, so let's assume that C3 is done and D3 is giving us problems:

o - 0 - 1 - 2 - 3 - 4        ← master
     \       \   \
      \       \   C3- ?
       \       \ /   /
        A - B - C - D        ← branch
             \     /
              X - Y

Your proposal is not to merge D directly but rather to merge X then Y.

o - 0 - 1 - 2 - 3 - 4                 ← master
     \       \   \
      \       \   C3 --- X3 - Y3 - D3
       \       \ /      /    /    /
        A - B - C - D- / -- / ---'    ← the lines here...
             \     /  /    /
              X - Y- / ---'           ← ...and here don't intersect
               \    /
                `--'

The problem is that the merges that would be required are not able to
take advantage of the conflict resolution that was done in D:

The merge base for X3 would be B.  This is a little bit wrong because X3
includes C among its ancestors, so creating X3 requires some of the
conflicts between X and C (which were resolved once in D) to be resolved
again.

The merge base for Y3 would be X.  Note that Y3 already includes C and Y
among its ancestors.  Therefore, resolving Y3 involves resolving the
same conflicts between Y and C that were already resolved in D.  But
since merge Y3 doesn't know about D, the user would be forced to resolve
those conflicts again (albeit maybe helped by something like rerere).

And merge D3 would have two merge bases, C and Y.  This is related to
the fact that there are now two independent known resolutions for
merging C and Y, namely D and Y3.
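
As a rough cross-check of the merge bases listed above, here is a minimal Python sketch. It is not git code: the parent map is transcribed by hand from the diagram, the names PARENTS, ancestors and merge_bases are made up for illustration, and the parent order chosen for C3, X3 and Y3 is an assumption (it does not matter for merge bases). The result is meant to mirror what "git merge-base --all" would report for the two sides of each proposed merge.

```python
# Hypothetical parent map for the diagram above (first entry = first parent).
# Assumption: the parent order of C3, X3 and Y3 is arbitrary here.
PARENTS = {
    "o": [], "0": ["o"], "1": ["0"], "2": ["1"], "3": ["2"], "4": ["3"],
    "A": ["0"], "B": ["A"], "C": ["B", "2"],
    "X": ["B"], "Y": ["X"], "D": ["C", "Y"],
    "C3": ["3", "C"], "X3": ["C3", "X"], "Y3": ["X3", "Y"],
}


def ancestors(tip):
    """Return tip together with every commit reachable from it."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def merge_bases(x, y):
    """Common ancestors of x and y that are not ancestors of another common ancestor."""
    common = ancestors(x) & ancestors(y)
    return {c for c in common
            if not any(c != d and c in ancestors(d) for d in common)}


# Each proposed merge, keyed by its name, with the two commits being merged.
MERGES = {"X3": ("C3", "X"), "Y3": ("X3", "Y"), "D3": ("Y3", "D")}
for name, (ours, theirs) in MERGES.items():
    print(name, sorted(merge_bases(ours, theirs)))
```

Under these assumptions the sketch reports B for X3, X for Y3, and both C and Y for D3.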

Given that Y3 in the above scenario needs to include C (via C3)
and also Y, it seems to me that this merge is superfluous.  It should
have exactly the same content as D3, assuming that the conflicts are
resolved the same way.  Therefore one could skip Y3 and proceed directly
to D3:

o - 0 - 1 - 2 - 3 - 4                 ← master
     \       \   \
      \       \   C3 --- X3 - D3
       \       \ /      /    /
        A - B - C - D- / ---'         ← the lines here...
             \     /  /
              X - Y  /                ← ...and here don't intersect
               \    /
                `--'

This merge could take advantage of the conflict resolution that was done
in D.  It would have an unambiguous merge base, C.

But I still think that this approach is not as clean as an incremental
merge of two linear branches, because X3 requires some of the same
conflicts to be resolved as were already resolved in D.

Incidentally, if merge D had been done incrementally and the full
incremental merge resolution had been kept, then we would have the
missing merge CX that would allow us to compute D3 incrementally:

o - 0 - 1 - 2 - 3 - 4    ← master
    |       |   |
    A - B - C - C3
        |   |   |
        X - CX- C3X
        |   |   |
        Y - D - D3

I think that all of the required merges have sane merge-bases and take
advantage of all of the merges that have been done previously.  This is
another case where an incremental merge contains information that can
be useful for the future.

 The simplest way I can think of to generate the list C,D is

 git rev-list --first-parent --ancestry-path 2..D

 We need --ancestry-path to avoid getting commits A and B.  It's still
 not clear that this is always the best approach but at least it seems
 safe.
 
 Hmm, while I agree that A and B will be omitted by using ancestry
 path on the example topology, I need to be convinced that it is
 impossible to end up with disjoint segments of a history in any
 ancestry graph by combining -f-p and -a-p that way to feel safe.

If by disjoint you mean that the history contains gaps, that is
exactly what happens in the case I described in my last email:

o - 0 - 1 - 2 - 3 - 4    ← master
     \       \
      A - B - C --.
           \       \
            X - Y - D    ← branch

The result of git rev-list --first-parent --ancestry-path 2..D would
be only D and commit C would disappear into the gap.
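
To make the gap concrete, here is a small Python sketch that models the intersection of the first-parent chain with the ancestry path. It is not git's implementation; it only encodes the semantics I would expect from the combined options. The helper names (ancestors, first_parent_chain, on_ancestry_path, expected) are made up for illustration, and having A fork from commit 0 is my reading of the diagrams. The two parent maps differ only in which parent of D comes first.

```python
def ancestors(parents, tip):
    """Return tip together with every commit reachable from it."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def first_parent_chain(parents, a, b):
    """Commits on b's first-parent chain that are not ancestors of a."""
    excluded, chain, commit = ancestors(parents, a), [], b
    while commit is not None and commit not in excluded:
        chain.append(commit)
        commit = parents[commit][0] if parents[commit] else None
    return chain


def on_ancestry_path(parents, a, b):
    """Ancestors of b (outside a's history) that are descendants of a."""
    excluded = ancestors(parents, a)
    return {c for c in ancestors(parents, b)
            if c not in excluded and a in ancestors(parents, c)}


def expected(parents, a, b):
    """Intersection of the two listings, in first-parent order from the tip."""
    path = on_ancestry_path(parents, a, b)
    return [c for c in first_parent_chain(parents, a, b) if c in path]


# Shared history; assumption: A forks from commit 0 (first entry = first parent).
BASE = {
    "o": [], "0": ["o"], "1": ["0"], "2": ["1"], "3": ["2"], "4": ["3"],
    "A": ["0"], "B": ["A"], "C": ["B", "2"], "X": ["B"], "Y": ["X"],
}
c_first = dict(BASE, D=["C", "Y"])  # X-Y was merged into the branch at D
y_first = dict(BASE, D=["Y", "C"])  # C was merged into X-Y at D

print(expected(c_first, "2", "D"))
print(expected(y_first, "2", "D"))
```

With C as the first parent of D the model lists D and C; with Y as the first parent it lists only D, because the first-parent chain leaves the ancestry path immediately below D.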

I am willing to accept that for my application, because the incremental
merge algorithm can work despite gaps 

Bug when git rev-list options --first-parent and --ancestry-path are used together?

2013-05-23 Thread Michael Haggerty
It seems to me that

 git rev-list --first-parent --ancestry-path A..B

is well-defined and should list the commits in the intersection between

 git rev-list --first-parent A..B

and

 git rev-list --ancestry-path A..B

But in many cases the first command doesn't provide any output even
though there are commits common to the output of the last two commands.

For example, take as an example the DAG from test t6019:

#          D---E-------F
#         /     \       \
#    B---C---G---H---I---J
#   /                     \
#  A-------K---------------L--M

(The merges are always downwards; e.g., the first parent of commit L is
K.)  The command

git rev-list --first-parent --ancestry-path D..J

doesn't generate any output, whereas I would expect it to output H I
J.  Similarly,

git rev-list --first-parent --ancestry-path D..M

doesn't generate any output, whereas I would expect it to output L M.
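
To spell out the semantics I have in mind, here is a minimal Python model of the intersection. It does not call git and is not how rev-list is implemented; the parent map is transcribed by hand from the t6019 diagram (merges pointing downwards), and the function names are made up for illustration.

```python
# Parent lists for the t6019 DAG; the first entry is the first parent.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"], "F": ["E"],
    "G": ["C"], "H": ["G", "E"], "I": ["H"], "J": ["I", "F"],
    "K": ["A"], "L": ["K", "J"], "M": ["L"],
}


def ancestors(tip):
    """Return tip together with every commit reachable from it."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def first_parent(a, b):
    """Model of 'rev-list --first-parent a..b': follow first parents from b."""
    excluded, chain, commit = ancestors(a), [], b
    while commit is not None and commit not in excluded:
        chain.append(commit)
        commit = PARENTS[commit][0] if PARENTS[commit] else None
    return chain


def ancestry_path(a, b):
    """Model of 'rev-list --ancestry-path a..b': descendants of a among b's ancestors."""
    excluded = ancestors(a)
    return {c for c in ancestors(b) if c not in excluded and a in ancestors(c)}


def expected(a, b):
    """The intersection, listed from the tip downwards."""
    path = ancestry_path(a, b)
    return [c for c in first_parent(a, b) if c in path]


print(expected("D", "J"))
print(expected("D", "M"))
```

In this model D..J yields J, I, H (G is on the first-parent chain but is not a descendant of D) and D..M yields M, L.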

For fun, the attached script computes the output for all commit pairs in
this DAG and outputs the discrepancies that it finds.  (It should be run
in directory "t/trash directory.t6019-rev-list-ancestry-path" after
t6019 was run with "-d".)

Is this a bug or are my expectations wrong?

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/


x.sh
Description: Bourne shell script


Re: Bug when git rev-list options --first-parent and --ancestry-path are used together?

2013-05-23 Thread Junio C Hamano
Michael Haggerty mhag...@alum.mit.edu writes:

 It seems to me that

  git rev-list --first-parent --ancestry-path A..B

 is well-defined and should list the commits in the intersection between

  git rev-list --first-parent A..B

 and

  git rev-list --ancestry-path A..B

 But in many cases the first command doesn't provide any output even
 though there are commits common to the output of the last two commands.

 For example, take as an example the DAG from test t6019:

 #          D---E-------F
 #         /     \       \
 #    B---C---G---H---I---J
 #   /                     \
 #  A-------K---------------L--M

 (The merges are always downwards; e.g., the first parent of commit L is
 K.)  The command

 git rev-list --first-parent --ancestry-path D..J

 doesn't generate any output, whereas I would expect it to output H I
 J.

As I do not see how "only show first-parent chains from near the tip
but stop immediately when the chain deviates from the ancestry path"
could be a sensible operation (in other words, I do not offhand
think of examples of what useful things you can do with that
information), I actually expect that -f-p -a-p D..J should error
out, instead of giving no output.

You are correct to point out that sometimes -f-p and -a-p _could_ be
compatible, e.g. -f-p -a-p A..M, or -f-p -a-p B..M.  But I think
the only case that they are compatible is when -f-p output is a
strict subset of what -a-p without -f-p would give.