Re: [PATCH 3/2] merge-trees script for Linus git

2005-04-16 Thread Junio C Hamano
>>>>> "LT" == Linus Torvalds [EMAIL PROTECTED] writes:

LT> Damn, my cunning plan is some good stuff. 

I really like this a lot.  It is *so* *simple*, clear, flexible
and an example of elegance.  This is one of the things I would
happily say "Sheesh!  Why didn't *I* think of *THAT*
first!!!" to.

LT> NOTE NOTE NOTE! I could make read-tree do some of these nontrivial 
LT> merges, but I ended up deciding that only the "matches in all three 
LT> states" thing collapses by default.

 * Understood and agreed.

LT> Damn, I'm good.

 * Agreed ;-). Wholeheartedly.

So what's next?  Certainly I'd immediately drop (and I would
imagine you would as well) both the C and Perl versions of
merge-tree(s).

The userland merge policies need ways to extract the stage
information and manipulate them.  Am I correct to say that you
mean by "ls-files -l" the extracting part?

LT> I should make "ls-files" have a "-l" format, which shows the
LT> index and the mode for each file too.

You probably meant ls-tree.  You used the word "mode" but it
already shows the mode so I take it to mean "stage".  Perhaps
something like this?

$ ls-tree -l -r 49c200191ba2e3cd61978672a59c90e392f54b8b
100644  blob  fe2a4177a760fd110e78788734f167bd633be8de    COPYING
100644  blob  b39b4ea37586693dd707d1d0750a9b580350ec50:1  man/frotz.6
100644  blob  b39b4ea37586693dd707d1d0750a9b580350ec50:2  man/frotz.6
100664  blob  eeed997e557fb079f38961354473113ca0d0b115:3  man/frotz.6
 ...

The above example shows that COPYING has merged successfully,
and O and A have the same contents and B has something different
at man/frotz.6.
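
To illustrate how a merge-policy script might consume a listing in this shape, here is a minimal Python sketch.  It assumes the "sha1:stage" suffix proposed in this message, which no existing tool is known to emit; the names Entry and parse_staged_listing are made up for the example.

```python
from __future__ import annotations

from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    mode: str
    kind: str
    sha1: str
    stage: int  # 0 = merged, 1/2/3 = unmerged stages (assumed convention)
    path: str


def parse_staged_listing(text: str) -> dict[str, list[Entry]]:
    """Group listing lines by path; a missing ":N" suffix is read as stage 0."""
    by_path: dict[str, list[Entry]] = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split into at most four fields so that paths may contain spaces.
        mode, kind, ident, path = line.split(None, 3)
        sha1, _, stage = ident.partition(":")
        by_path[path].append(Entry(mode, kind, sha1, int(stage or 0), path))
    return dict(by_path)


SAMPLE = """\
100644 blob fe2a4177a760fd110e78788734f167bd633be8de COPYING
100644 blob b39b4ea37586693dd707d1d0750a9b580350ec50:1 man/frotz.6
100644 blob b39b4ea37586693dd707d1d0750a9b580350ec50:2 man/frotz.6
100664 blob eeed997e557fb079f38961354473113ca0d0b115:3 man/frotz.6
"""

entries = parse_staged_listing(SAMPLE)
# Paths that still carry a non-zero stage need attention from the merge policy.
unmerged = sorted(p for p, es in entries.items() if any(e.stage for e in es))
print(unmerged)  # ['man/frotz.6']
```

A policy script could then walk only the unmerged paths and compare the sha1 recorded at each stage.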

Assuming that you would be working on that, I'd like to take the
dircache manipulation part.  Let's think about the minimally
necessary set of operations:

 * The merge policy decides to take one of the existing stages.

   In this case we need a way to register a known mode/sha1 at a
   path.  We already have this as "update-cache --cacheinfo".
   We just need to make sure that when "update-cache" puts
   things at stage 0 it clears other stages as well.

 * The merge policy comes up with a desired blob somewhere on
   the filesystem (perhaps by running an external merge
   program).  It wants to register it as the result of the
   merge.

   We could do this today by first storing the desired blob
   in a temporary file somewhere in the path the dircache
   controls, "update-cache --add" the temporary file, ls-tree to
   find its mode/sha1, "update-cache --remove" the temporary
   file and finally "update-cache --cacheinfo" the mode/sha1.
   This is workable but clumsy.  How about:

   $ update-cache --graft [--add] desired-blob path

   to say "I want to register mode/sha1 from desired-blob, which
   may not be of verify_path() satisfying name, at path in the
   dircache"?

 * The merge policy decides to delete the path.

   We could do this today by first stashing away the file at the
   path if it exists, "update-cache --remove" it, and restore
   if necessary.  This is again workable but clumsy.  How about:

   $ update-cache --force-remove path

   to mean "I want to remove the path from dircache even though
   it may exist in my working tree"?
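
A rough Python model of two of the operations listed above may make the intended semantics easier to discuss.  This is only a sketch under stated assumptions: the dircache is represented as a plain dict keyed by (path, stage), and the function names cacheinfo and force_remove are illustrative stand-ins, not the actual update-cache code.  The --graft case is left out because it depends on how blobs are hashed.

```python
# Hypothetical in-memory stand-in for the dircache:
#   (path, stage) -> (mode, sha1)


def cacheinfo(index: dict, mode: str, sha1: str, path: str) -> None:
    """Register mode/sha1 at stage 0 and clear stages 1-3 for the same path."""
    for stage in (1, 2, 3):
        index.pop((path, stage), None)
    index[(path, 0)] = (mode, sha1)


def force_remove(index: dict, path: str) -> None:
    """Drop every entry for path without looking at any working tree."""
    for stage in (0, 1, 2, 3):
        index.pop((path, stage), None)


index = {
    ("man/frotz.6", 1): ("100644", "b39b4ea37586693dd707d1d0750a9b580350ec50"),
    ("man/frotz.6", 2): ("100644", "b39b4ea37586693dd707d1d0750a9b580350ec50"),
    ("man/frotz.6", 3): ("100664", "eeed997e557fb079f38961354473113ca0d0b115"),
}

# The merge policy decides to take the stage 3 version as the result.
mode, sha1 = index[("man/frotz.6", 3)]
cacheinfo(index, mode, sha1, "man/frotz.6")
print(sorted(index))  # [('man/frotz.6', 0)]

# The merge policy decides to delete the path instead.
force_remove(index, "man/frotz.6")
print(index)  # {}
```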

So it all boils down to update-cache.  The new things to be
introduced are:

 * An explicit update-cache always removes stage 1/2/3 entries
   associated with the named path.

 * update-cache --graft

 * update-cache --force-remove

Am I on the right track?

You might want to go even lower level by letting them say
something like:

 * update-cache --register-stage mode sha1 stage path

   Registers the mode/sha1 at stage for path.  Does not look at
   the working tree.  stage is [0-3]
 
 * update-cache --delete-stage stage-list path

   Removes the entry at named stages for path.  Does not look at
   the working tree.  stage-list is either [0-3](,[0-3])+ or
   bitmask (i.e. (1 << stage-number) ORed together).  The former
   would probably be easier to work with by scripts.

 * write-blob path

   Hashes and registers the file at path (regardless of what
   verify_path() says) and writes the resulting blob's mode/sha1
   to the standard output.
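
The two spellings of stage-list suggested for --delete-stage carry the same information, with each stage number contributing one bit to the mask.  A small Python sketch of the conversion follows; the helper names parse_stage_list and stages_in are hypothetical, and the range check is an assumption based on the stated 0-3 range.

```python
def parse_stage_list(text: str) -> int:
    """Turn a comma-separated stage list such as "1,3" into a bitmask."""
    mask = 0
    for item in text.split(","):
        stage = int(item)
        if not 0 <= stage <= 3:
            raise ValueError(f"stage out of range: {item!r}")
        mask |= 1 << stage  # one bit per stage number
    return mask


def stages_in(mask: int) -> list:
    """Inverse direction: list the stage numbers whose bit is set."""
    return [stage for stage in range(4) if mask & (1 << stage)]


print(parse_stage_list("1,3"))  # 10, i.e. (1 << 1) | (1 << 3)
print(stages_in(0b1110))        # [1, 2, 3]
```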

If you take this lower-level approach, an explicit update-cache
would not clear stage1/2/3.

My preference is the former, not so low-level, interface.
Guidance?

-
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/2] merge-trees script for Linus git

2005-04-16 Thread Linus Torvalds


On Sat, 16 Apr 2005, Junio C Hamano wrote:
 
> LT> NOTE NOTE NOTE! I could make read-tree do some of these nontrivial 
> LT> merges, but I ended up deciding that only the "matches in all three 
> LT> states" thing collapses by default.
> 
>  * Understood and agreed.

Having slept on it, I think I'll merge all the trivial cases that don't 
involve a file going away or being added. Ie if the file is in all three 
trees, but it's the same in two of them, we know what to do.

That way we'll leave things where the tree itself changed (files added or 
removed at any point) and/or cases where you actually need a 3-way merge.
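
The rule described above can be sketched in a few lines of Python.  This is an illustration of the stated rule only, not the read-tree implementation; the function name trivial_merge and the use of None for "path absent from that tree" are assumptions made for the example.

```python
from typing import Optional


def trivial_merge(orig: Optional[str],
                  new1: Optional[str],
                  new2: Optional[str]) -> Optional[str]:
    """Return the sha1 to keep, or None if the path must stay unmerged.

    Each argument is the sha1 of the path in one tree, or None if the
    path does not exist there.
    """
    if orig is None or new1 is None or new2 is None:
        return None   # a file was added or removed somewhere: leave it alone
    if new1 == new2:
        return new1   # both sides agree (untouched, or same modification)
    if orig == new1:
        return new2   # only the second side changed the file
    if orig == new2:
        return new1   # only the first side changed the file
    return None       # all three differ: a real 3-way merge is needed


print(trivial_merge("aaa", "aaa", "bbb"))  # bbb
print(trivial_merge("aaa", "bbb", "ccc"))  # None
```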

> The userland merge policies need ways to extract the stage
> information and manipulate them.  Am I correct to say that you
> mean by "ls-files -l" the extracting part?

No, I meant show-files, since we need to show the index, not a tree (no 
valid tree can ever have the modes information, since (a) it doesn't 
have the space for it anyway and (b) we refuse to write out a dirty index 
file).



 
> LT> I should make "ls-files" have a "-l" format, which shows the
> LT> index and the mode for each file too.
> 
> You probably meant ls-tree.  You used the word "mode" but it
> already shows the mode so I take it to mean "stage".  Perhaps
> something like this?
> 
> $ ls-tree -l -r 49c200191ba2e3cd61978672a59c90e392f54b8b
> 100644  blob  fe2a4177a760fd110e78788734f167bd633be8de    COPYING
> 100644  blob  b39b4ea37586693dd707d1d0750a9b580350ec50:1  man/frotz.6
> 100644  blob  b39b4ea37586693dd707d1d0750a9b580350ec50:2  man/frotz.6
> 100664  blob  eeed997e557fb079f38961354473113ca0d0b115:3  man/frotz.6

Apart from the fact that it would be

show-files -l

since there are no tree objects that can have anything but fully merged
state, yes.

> Assuming that you would be working on that, I'd like to take the
> dircache manipulation part.  Let's think about the minimally
> necessary set of operations:
> 
>  * The merge policy decides to take one of the existing stages.
> 
>    In this case we need a way to register a known mode/sha1 at a
>    path.  We already have this as "update-cache --cacheinfo".
>    We just need to make sure that when "update-cache" puts
>    things at stage 0 it clears other stages as well.
> 
>  * The merge policy comes up with a desired blob somewhere on
>    the filesystem (perhaps by running an external merge
>    program).  It wants to register it as the result of the
>    merge.
> 
>    We could do this today by first storing the desired blob
>    in a temporary file somewhere in the path the dircache
>    controls, "update-cache --add" the temporary file, ls-tree to
>    find its mode/sha1, "update-cache --remove" the temporary
>    file and finally "update-cache --cacheinfo" the mode/sha1.
>    This is workable but clumsy.  How about:
> 
>    $ update-cache --graft [--add] desired-blob path
> 
>    to say "I want to register mode/sha1 from desired-blob, which
>    may not be of verify_path() satisfying name, at path in the
>    dircache"?
> 
>  * The merge policy decides to delete the path.
> 
>    We could do this today by first stashing away the file at the
>    path if it exists, "update-cache --remove" it, and restore
>    if necessary.  This is again workable but clumsy.  How about:
> 
>    $ update-cache --force-remove path
> 
>    to mean "I want to remove the path from dircache even though
>    it may exist in my working tree"?

Yes.

> Am I on the right track?

Exactly.

> You might want to go even lower level by letting them say
> something like:
> 
>  * update-cache --register-stage mode sha1 stage path
> 
>    Registers the mode/sha1 at stage for path.  Does not look at
>    the working tree.  stage is [0-3]

I'd prefer not. I'd avoid playing games with the stages at any other level
than the full tree level until we show a real need for it.

Let's go with the known-needed minimal cases that are high-level enough to
make the scripting simple, and see if there is any reason to ever touch
the tree any other way.

Linus


Re: [PATCH 3/2] merge-trees script for Linus git

2005-04-16 Thread Linus Torvalds


On Sat, 16 Apr 2005, Linus Torvalds wrote:
 
> Having slept on it, I think I'll merge all the trivial cases that don't 
> involve a file going away or being added. Ie if the file is in all three 
> trees, but it's the same in two of them, we know what to do.

Junio, I pushed this out, along with the two patches from you. It's still
more anal than my original tree-diff algorithm, in that it refuses to
touch anything where the name isn't the same in all three versions
(original, new1 and new2), but now it does the "if two of them match, just
select the result directly" trivial merges.

I really cannot see any sane case where user policy might dictate doing
anything else, but if somebody can come up with an argument for a merge
algorithm that wouldn't do what that trivial merge does, we can make a
flag for "don't merge at all".

The reason I do want to merge at all in "read-tree" is that I want to
avoid having to write out a huge index-file (it's 1.6MB on the kernel, so
if you don't do _any_ trivial merges, it would be 4.8MB after reading
three trees) and then having people read it and parse it just to do stuff
that is obvious. Touching 5MB of data isn't cheap, even if you don't do a 
whole lot to it.
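
The arithmetic behind those numbers can be written down as a back-of-the-envelope Python sketch.  It assumes that every index entry takes roughly the same space, that a trivially merged path keeps one entry and an unmerged path keeps three; the function name index_size_mb and the 1% figure are purely illustrative.

```python
def index_size_mb(base_mb: float, unresolved_fraction: float) -> float:
    """Estimate the index size after reading three trees.

    base_mb is the size of a fully merged index; unresolved_fraction is
    the share of paths that still carry three staged entries.
    """
    if not 0.0 <= unresolved_fraction <= 1.0:
        raise ValueError("unresolved_fraction must be between 0 and 1")
    resolved = 1.0 - unresolved_fraction
    return base_mb * (resolved + 3.0 * unresolved_fraction)


print(f"{index_size_mb(1.6, 1.0):.2f} MB")   # no trivial merges at all
print(f"{index_size_mb(1.6, 0.01):.2f} MB")  # 1% of paths left unmerged
```

With no trivial merges the estimate is three times the base size, which matches the 1.6MB versus 4.8MB figures above.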

Anyway, with the modified read-tree, as far as I can tell it will now 
merge all the cases where one side has done something to a file, and the 
other side has left it alone (or where both sides have done the exact same 
modification). That should _really_ cut down the cases to just a few files 
for most of the kernel merges I can think of. 

Does it do the right thing for your tests?

Linus


[PATCH 3/2] merge-trees script for Linus git

2005-04-15 Thread Junio C Hamano
Linus,

the merge-trees I sent you earlier was expecting the old
diff-tree behaviour, and I did not realize that I need an
explicit -z flag now.  Here is a fix.

Signed-off-by: Junio C Hamano [EMAIL PROTECTED]
---
 merge-trees |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

--- merge-trees 2005-04-15 13:21:35.0 -0700
+++ merge-trees+ 2005-04-15 16:27:34.0 -0700
@@ -78,8 +78,8 @@
     local ($_, $/);
     $/ = "\0"; 
     my %path;
-    open $fhi, '-|', 'diff-tree', '-r', @tree
-        or die "$!: diff-tree -r @tree";
+    open $fhi, '-|', 'diff-tree', '-r', '-z', @tree
+        or die "$!: diff-tree -r -z @tree";
     while (<$fhi>) {
         chomp;
         if (/^\*($reM)->($reM)\tblob\t($reID)->($reID)\t(.*)$/so) {



Re: [PATCH 3/2] merge-trees script for Linus git

2005-04-15 Thread Linus Torvalds


On Fri, 15 Apr 2005, Junio C Hamano wrote:
 
> I'd take the hint, but I would say the current Perl version
> would be far more usable than the C version I would come up with
> by the end of this weekend because:

Actually, it turns out that I have a cunning plan.

I'm full of cunning plans, in fact. It turns out that I can do merges even
more simply, if I just allow the notion of "state" into an index entry,
and allow multiple index entries with the same name as long as they differ
in state.

And that means that I can do all the merging in the regular index tree, 
using very simple rules.
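
As a way of picturing the idea, here is a minimal Python sketch of an index that allows several entries with the same name as long as their state differs.  It is a guess at the shape of the data, not the code being written: the function name read_three_trees, the numbering of states as 1, 2 and 3, and the short placeholder ids are all assumptions.

```python
def read_three_trees(orig: dict, new1: dict, new2: dict) -> dict:
    """Load three trees (path -> id) into one index keyed by (path, state)."""
    index = {}
    for state, tree in enumerate((orig, new1, new2), start=1):
        for path, ident in tree.items():
            # The same path may appear up to three times, once per state.
            index[(path, state)] = ident
    return index


orig = {"COPYING": "aaa", "man/frotz.6": "bbb"}
new1 = {"COPYING": "aaa", "man/frotz.6": "bbb"}
new2 = {"COPYING": "aaa", "man/frotz.6": "ccc"}

index = read_three_trees(orig, new1, new2)
for (path, state), ident in sorted(index.items()):
    print(path, state, ident)
```

Sorting by (path, state) keeps all entries for one name next to each other, so a later pass can look at them as a group.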

Let's see how that works out. I'm writing the code now.

Linus