Re: [RFC] Looking at multiple ancestors in merge

2005-08-26 Thread Daniel Barkalow
On Fri, 26 Aug 2005, Junio C Hamano wrote:

> Daniel Barkalow <[EMAIL PROTECTED]> writes:
>
> > I've started this, and have gotten as far as having read-tree accept > 3
> > trees and ignore everything but the last 3. Am I correct in assuming that
> > if I break read-tree in any way, some test will fail?
>
> If some test fails you would know you broke it, but the inverse
> is probably not always true.
>
> I think the current read-tree test suite has reasonably wide
> coverage of all the interesting cases.  But the definition of
> "interesting" was derived from the current world order (IOW, the
> test suite was designed around the way we do things right now as
> a whitebox test, not a blackbox test).  I would not be surprised
> if some of them did not catch breakage you may introduce during
> the development.

Okay; I think the only thing that I'm going to change with respect to how
it makes decisions will be with 4+ trees, and those will obviously need
new tests,

> I wonder however if extending the current way of doing things in
> the cache is the right thing.  Right now we use two bits out of
> the top four bits for recording stage, one bit for the update
> bit, so you have only one extra bit to extend the number of
> stages, which means you could hold at most 7 trees at once.
>
> You "ignore things but the last 3", so this may not be too much
> of a problem, but I am a bit puzzled what you meant by it
> though.  Are you talking about reading more than 3 trees and
> keeping only the 3 to be merged, discarding the rest, doing the
> selection per path?

For each path, I intend to look at all the entries and make trivial merge
judgements on them, but then only leave the usual stage 2 and stage 3, and
a chosen stage 1. The way I'm writing the changes is:

In the argument parsing loop, just form a list of the tree objects, and
actually read them after the whole list is ready. If there are more than
3, ignore all but the last 3. This lets you give an arbitary number of
common ancestors to read-tree, and it won't mess up, but it will only use
one of them. I've done this.

Next, scan through the tree entry lists for all the trees together, and
generate cache entries for the same path in the different trees at the
same time. I've written this, but I've got a few bugs, and the 3way merge
tests are dutifully failing.

Then, I'll do the trivial merge on tree entries rather than cache
entries.

Finally, I'll extend the trivial merge to use the extra ancestors.

Since merge(1) doesn't handle multiple common ancestors, having more than
3 stages in the cache after the trivial merge isn't going to be useful for
now.

-Daniel
*This .sig left intentionally blank*
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Looking at multiple ancestors in merge

2005-08-26 Thread Junio C Hamano
Daniel Barkalow <[EMAIL PROTECTED]> writes:

> I've started this, and have gotten as far as having read-tree accept > 3
> trees and ignore everything but the last 3. Am I correct in assuming that
> if I break read-tree in any way, some test will fail?

If some test fails you would know you broke it, but the inverse
is probably not always true.

I think the current read-tree test suite has reasonably wide
coverage of all the interesting cases.  But the definition of
"interesting" was derived from the current world order (IOW, the
test suite was designed around the way we do things right now as
a whitebox test, not a blackbox test).  I would not be surprised
if some of them did not catch breakage you may introduce during
the development.

I wonder however if extending the current way of doing things in
the cache is the right thing.  Right now we use two bits out of
the top four bits for recording stage, one bit for the update
bit, so you have only one extra bit to extend the number of
stages, which means you could hold at most 7 trees at once.

You "ignore things but the last 3", so this may not be too much
of a problem, but I am a bit puzzled what you meant by it
though.  Are you talking about reading more than 3 trees and
keeping only the 3 to be merged, discarding the rest, doing the
selection per path?


-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Looking at multiple ancestors in merge

2005-08-25 Thread Daniel Barkalow
On Wed, 24 Aug 2005, Daniel Barkalow wrote:

> Of course, this is going to take a bit of work, because read-tree
> currently puts all of its arguments into the cache and then works on
> merging, and taking multiple ancestors requires putting them somewhere
> else, because they won't fit in the cache.

I've started this, and have gotten as far as having read-tree accept > 3
trees and ignore everything but the last 3. Am I correct in assuming that
if I break read-tree in any way, some test will fail?

-Daniel
*This .sig left intentionally blank*
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Looking at multiple ancestors in merge

2005-08-25 Thread Matthias Urlichs
Hi, Daniel Barkalow wrote:

> My proposal is actually to detect when a merge is ambiguous. In order to
> determine that, however, you have to evaluate multiple potential outcomes
> and see if they are actually different. I'm working on an efficient way to
> do that.

Good.

There's also a related problem which I've hit last month or so, where one
view has the same file (or at least one with the same name) added to both
sides of the branch, but the other view doesn't.

Unfortunately, that can happen to both branches independently, so you
really need to choose a merge base per file instead of globally. I suspect
that many of the problem cases simply go away when you do that.

-- 
Matthias Urlichs   |   {M:U} IT Design @ m-u-it.de   |  [EMAIL PROTECTED]
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
Jones' Second Law:
The man who smiles when things go wrong has thought of someone
to blame it on.


-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Looking at multiple ancestors in merge

2005-08-24 Thread Daniel Barkalow
On Wed, 24 Aug 2005, A Large Angry SCM wrote:

> Daniel Barkalow wrote:
> > I'm starting to work on letting the merging process see multiple
> > ancestors, and I think it's messy enough that I should actually discuss
> > it.
> >
> > Review of the issue:
> >
> > It is possible to lost reverts in cases when merging two commits with
> > multiple ancestors, in the following pattern: (letters representing blobs
> > at some filename, children to the right)
> >
> > a-b-b-a-?
> >  \ X   /
> >   a-b-b
> >
> [Lots of stuff deleted]
>
> There seems to be a lot of effort being put into auto-magically choosing
> the "right" merge in the presence of multiple possible merge bases.
> Unfortunately, most (all?) of the proposals are attempting to divine
> intent, and so, are guaranteed to be 100% wrong at least some of the time.
>
> Wouldn't it be better, instead, to detect that current merge being
> attempted is ambiguous and require the user to specify the correct merge
> base? The alternative is a tool that appears to work all of the time but
> does the wrong thing some of the time.

My proposal is actually to detect when a merge is ambiguous. In order to
determine that, however, you have to evaluate multiple potential outcomes
and see if they are actually different. I'm working on an efficient way to
do that.

Then further work could look into eliminating possibilities when
information about the history excludes them. There were two issues in the
case that Tony hit: it ignored a potential correct outcome for the merge,
and it didn't ignore an outcome which could be demonstrated to be
incorrect. The priority is to resolve the first, but things which improve
the second or help with solutions to the second are worth understanding.

-Daniel
*This .sig left intentionally blank*
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Looking at multiple ancestors in merge

2005-08-24 Thread A Large Angry SCM
Daniel Barkalow wrote:
> I'm starting to work on letting the merging process see multiple
> ancestors, and I think it's messy enough that I should actually discuss
> it.
> 
> Review of the issue:
> 
> It is possible to lost reverts in cases when merging two commits with
> multiple ancestors, in the following pattern: (letters representing blobs
> at some filename, children to the right)
> 
> a-b-b-a-?
>  \ X   /
>   a-b-b
> 
[Lots of stuff deleted]

There seems to be a lot of effort being put into auto-magically choosing
the "right" merge in the presence of multiple possible merge bases.
Unfortunately, most (all?) of the proposals are attempting to divine
intent, and so, are guaranteed to be 100% wrong at least some of the time.

Wouldn't it be better, instead, to detect that current merge being
attempted is ambiguous and require the user to specify the correct merge
base? The alternative is a tool that appears to work all of the time but
does the wrong thing some of the time.
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC] Looking at multiple ancestors in merge

2005-08-24 Thread Daniel Barkalow
I'm starting to work on letting the merging process see multiple
ancestors, and I think it's messy enough that I should actually discuss
it.

Review of the issue:

It is possible to lost reverts in cases when merging two commits with
multiple ancestors, in the following pattern: (letters representing blobs
at some filename, children to the right)

a-b-b-a-?
 \ X   /
  a-b-b

You form a branch with unrelated changes, apply a patch in the top line,
separately merge both ways, do unrelated development in the bottom line,
and revert the patch in the top line. Then you're trying to merge the two
lines. There are two candidates for the common ancestor, the two in the
second column. If you pick the top one, you get the revert; if you pick
the bottom one, you don't. This is a bug, because it ignores the 'a'
version due to it being "unchanged", but it actually did change and
changed back.

Note that the revert is going to also be ignored if there isn't the "X" in
the middle of that diagram and the a->b change on the bottom is due to
independantly applying the same patch. Users are more likely to expect
this, however, than the situation above, where the side that is causing
the patch to be included never applied it explicitly at all; it just
merged at an unfortunate moment.

My theory is that we should handle merges by passing all of the ancestors
to read-tree, and having it use the following additions to the rules for
trivial merges:

 - If any of the ancestors matches a side, don't use that side
 - If you eliminate both side, don't do the trivial merge

(The first of these also means that it'll pick the best combination of
ancestors for maximizing trivial merges, as a nice side effect; the second
means that it'll avoid messing up with reverts when it has a chance of
understanding them)

If it doesn't do the trivial merge, it just puts the blob from the first
listed ancestor in stage 1, rather than trying anything fancy.

(As a further improvement, we could actually look through the history for
reasons to disregard a similarity, which would determine that there isn't
a continuous line of similarity from the recent 'a' to the common ancestor
'a', and therefore that it should be retained; but I'll be satisfied for
now with having it just not do the incorrect trivial merge.)

Of course, this is going to take a bit of work, because read-tree
currently puts all of its arguments into the cache and then works on
merging, and taking multiple ancestors requires putting them somewhere
else, because they won't fit in the cache.

-Daniel
*This .sig left intentionally blank*
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html