[PATCH 0/4] Merging merge-trees changes to pasky-0.4

2005-04-15 Thread Junio C Hamano
I finally sync'ed up with Pasky 0.4.  Reviewing the diffs
between Linus tree and Pasky tree for the core part you seem to
have picked up some good changes (especially the byteorder one),
so I decided to rebase my changes.  So here it comes...

What follows are the 3 patches to the core part to support the
three-tree merge script, and another to introduce the script
itself.  I used to call it git-merge.perl, but now it is called
merge-trees (per request from Pasky to drop git- prefix, and
Linus has merge-tree that does not recurse while this one does
subdirectories).  The core functionality has not changed much.
The changes from the previous version are still code and
interface cleanup only.

My next step will be to make it possible to tell it to do
nothing but output the recipe.

[PATCH 1/4] Add --cacheinfo option to update-cache
[PATCH 2/4] Add -z option to show-files
[PATCH 3/4] Add -r and -z options to ls-tree
[PATCH 4/4] Makefile change and merge-trees script itself.

The patches are against 516f2a088903a7b5f5a542de96b6a70c17856314

-
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Remove need to untrack before tracking new branch

2005-04-15 Thread Paul Jackson
 No, '' have a higher priority (weight?) than ''.

 has a higher precedence than 

  C Operator Precedence and Associativity
  http://www.difranco.net/cop2220/op-prec.htm

and many others -- google for 'c operator precedence'

Where the bitops &, | and ^ bite you is that they have
lower precedence than many other ops, including '=='.
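
A minimal C sketch of that pitfall follows. The flag value and both function names are made up for illustration; only the precedence rule itself comes from the C language.

```c
#include <stdbool.h>

#define FLAG_DIRTY 0x4u   /* hypothetical flag bit, for illustration only */

/* Buggy: '==' binds tighter than '&', so this parses as
 * flags & (FLAG_DIRTY == 0), i.e. flags & 0, which is always 0. */
bool is_clean_buggy(unsigned flags)
{
	return flags & FLAG_DIRTY == 0;
}

/* Intended: parenthesize the bitwise test before comparing. */
bool is_clean(unsigned flags)
{
	return (flags & FLAG_DIRTY) == 0;
}
```

Most compilers can warn about the first form (gcc does so under -Wparentheses), which is a cheap way to catch it.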

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson [EMAIL PROTECTED] 1.650.933.1373, 
1.925.600.0401


Re: Merge with git-pasky II.

2005-04-15 Thread Junio C Hamano
>>>>> "CL" == Christopher Li [EMAIL PROTECTED] writes:

>> - Result is this object $SHA1 with mode $mode at $path (takes
>> one of the trees); you can do update-cache --cacheinfo (if
>> you want to muck with dircache) or cat-file blob (if you want
>> to get the file) or both.

CL> Is that SHA1 for tree or the file object?

I am talking about a single file here.




Re: Merge with git-pasky II.

2005-04-15 Thread Christopher Li
On Fri, Apr 15, 2005 at 12:43:47AM -0700, Junio C Hamano wrote:
> >>>>> "CL" == Christopher Li [EMAIL PROTECTED] writes:
>
> CL> Is that SHA1 for tree or the file object?
>
> I am talking about a single file here.

Then do you emit the entry for its parent directory?

e.g. /foo/bar gets created, but foo doesn't exist. You have
to create foo first, and you don't have mode information for
foo yet. If it gives the top-level tree, the SCM can check it
out by tree and hopefully get the mode on the directory right.
Well, if they care about those little details.

Chris
 


Re: Merge with git-pasky II.

2005-04-15 Thread David Woodhouse
On Fri, 2005-04-15 at 11:36 +0200, Ingo Molnar wrote:
> do such cases occur frequently? In the kernel at least it's not too
> typical.

Isn't it? I thought it was a fairly accurate representation of the
process: I make a whole bunch of changes to files I maintain, pulling
from Linus while occasionally asking him to pull from my tree. Sometimes
my files are changed by someone else in Linus' tree, and sometimes I
change files that I don't actually own.

-- 
dwmw2




[PATCH] trivial argument parsing patches

2005-04-15 Thread Paul Mackerras
In perusing the git code, I noticed some errors in argument parsing,
which the patch below fixes.  The show-diff error (checking argv[1]
each time around the loop) probably doesn't actually cause any real
problem, but it could be confusing for a novice if show-diff x
produces an error but show-diff -s x doesn't (and ignores the extra
argument).

Signed-off-by: Paul Mackerras [EMAIL PROTECTED]

rev-tree.c:  7bf9e9a92f528485360f374239809714ce7a19f5
--- rev-tree.c
+++ rev-tree.c  2005-04-15 21:17:16.0 +1000
@@ -189,8 +189,8 @@
 		char *arg = argv[i];
 
 		if (!strcmp(arg, "--cache")) {
-			read_cache_file(argv[2]);
 			i++;
+			read_cache_file(argv[i]);
 			continue;
 		}
 
show-diff.c:  a531ca4078525d1c8dcf84aae0bfa89fed6e5d96
--- show-diff.c
+++ show-diff.c 2005-04-15 21:22:28.0 +1000
@@ -61,12 +61,10 @@
 	int entries = read_cache();
 	int i;
 
-	while (argc-- > 1) {
-		if (!strcmp(argv[1], "-s")) {
-			silent = 1;
-			continue;
-		}
-		usage("show-diff [-s]");
+	if (argc > 1) {
+		if (argc > 2 || strcmp(argv[1], "-s"))
+			usage("show-diff [-s]");
+		silent = 1;
 	}
 
 	if (entries < 0) {


Re: Merge with git-pasky II.

2005-04-15 Thread Theodore Ts'o
On Fri, Apr 15, 2005 at 02:03:08PM +0200, Johannes Schindelin wrote:
> I disagree. In order to be trusted, this thing has to catch the following
> scenario:
>
> Skywalker and Solo start from the same base. They commit quite a lot to
> their trees. In between, Skywalker commits a tree, where the function
> kazoom() has been added to the file deathstar.c, but Solo also added
> this function, but to the file moon.c. A file-based merge would have no
> problem merging each file, such that in the end, kazoom() is defined
> twice.
>
> The same problems arise when one tries to merge line-wise, i.e. when for
> each line a (possibly different) merge-parent is sought.

Be careful.  There is a very big tradeoff between 100% perfection in
catching these sorts of errors, and usability.  There exist SCMs
where you are not allowed to commit such merges until you do a test
compile, or run a regression test suite (that being the only way to
catch these sorts of problems when we merge two branches like this).

BitKeeper never caught this sort of thing, and we trusted it.  In
practice it was also rarely a problem.

I'll also note that BitKeeper doesn't restrict you from
committing a changeset when you have modified files that have yet to
be checked in to the tree.  Same issue; you can accidentally check in
changesets that result in trees that won't build, but if we added this
kind of SCM-by-straightjacket philosophy it would decrease our
productivity and people would simply not use such an SCM, thus negating
its effectiveness.

- Ted


Re: another perspective on renames.

2005-04-15 Thread C. Scott Ananian
On Thu, 14 Apr 2005, Paul Jackson wrote:
> To me, rename is a special case of the more general case of a
> big chunk of code (a portion of a file) that was in one place
> either being moved or copied to another place.
> I wonder if there might be someway to use the tools that biologists use
> to analyze DNA sequences, to track the evolution of source code,
> identifying things like common chunks of code that differ in just a few
> mutations, and presenting the history of the evolution, at selectable
> levels of detail.

The rsync algorithm (http://samba.anu.edu.au/rsync/tech_report/node2.html) 
is probably a good place to start, although it is relatively sensitive to 
mutations.  It will be able to efficiently detect identical blocks larger 
than some block size N (512 bytes or so for rsync).  You might well 
consider smaller blocks to be irrelevant.  The data can be made 
considerably more useful to developers by canonicalizing before searching 
(ie, compressing whitespace to ' ', etc)[*].  Note that the identical 
regions do *not* have to line up on block boundaries; see the rsync 
algorithm for more detail.
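
To make the "identical regions do not have to line up on block boundaries" point concrete, here is a rough C sketch of the weak rolling checksum in the style of the rsync technical report. It is a simplification under assumptions: the struct and function names are hypothetical, sums are kept modulo 2^16, and the strong checksum that rsync uses to confirm a match is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

struct rolling_sum {
	uint32_t a;   /* sum of the bytes in the window, mod 2^16 */
	uint32_t b;   /* position-weighted sum of the bytes, mod 2^16 */
	size_t len;   /* window length N */
};

/* Compute the weak checksum of buf[0..len-1] from scratch. */
void rolling_init(struct rolling_sum *rs, const unsigned char *buf, size_t len)
{
	uint32_t a = 0, b = 0;

	for (size_t i = 0; i < len; i++) {
		a += buf[i];
		b += (uint32_t)(len - i) * buf[i];  /* first byte weighs len, last weighs 1 */
	}
	rs->a = a & 0xffff;
	rs->b = b & 0xffff;
	rs->len = len;
}

/* Slide the window one byte: drop 'out' at the front, append 'in'. */
void rolling_roll(struct rolling_sum *rs, unsigned char out, unsigned char in)
{
	rs->a = (rs->a - out + in) & 0xffff;
	rs->b = (rs->b - (uint32_t)rs->len * out + rs->a) & 0xffff;
}

/* Combine both halves into one 32-bit value usable as a hash key. */
uint32_t rolling_digest(const struct rolling_sum *rs)
{
	return rs->a | (rs->b << 16);
}
```

Because rolling_roll costs O(1), the checksum can be evaluated at every byte offset of a file, which is what lets a matching block be found wherever it starts.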

I think Linus has made a persuasive case that the 'developer-friendly' 
features of an SCM (ie annotate, log, and friends) can be built *on top* 
of GIT.   This is a perfect example.  Since the computation is non-trivial 
(although linear in the number of lines of code involved in the history of 
a file; ie doesn't depend on the unrelated size of the archive), it might 
make sense for the front-end SCM to maintain its own caches --- for 
example, of the block and rolling checksums for each file required by the 
rsync algorithm.  The key point being that these are just *caches*, not 
essential history information, and can always be wiped and regenerated.

The nice 'feature' of this system (some may disagree, I guess) is that it 
does *not* depend on extensive programmer annotation of file changes (ie, 
chunk A in file B came from lines C-D of file D, or file E was once named 
F, etc).  By inferring history from content-similar files and blocks, it 
seems that it would be more able to generate useful results after 
importing third-party sources, which may come in distinct 'releases' but 
lack explicit history annotations.
  --scott

[*] in general, i will be *glad* to see source-management move away from 
CVS' line-oriented style; there's no good reason we should still be worrying
about whitespace changes, etc.  When we build 'developer-friendly' tools 
we should make every effort to auto-detect source code, image formats, 
etc, and automatically perform appropriate canonicalization and 
beautification of diffs, because this can be/should be/is entirely 
separate from git's underlying storage representation.

Mk 48 PANCHO ZPSECANT MKDELTA SCRANTON D5 SLBM JMTRAX Delta Force 
MI6 SGUAT Khaddafi SMOTH interception mail drop SECANT PBSUCCESS Cocaine
 ( http://cscott.net/ )


Re: Merge with git-pasky II.

2005-04-15 Thread Junio C Hamano
>>>>> "PB" == Petr Baudis [EMAIL PROTECTED] writes:

PB> I can't see the conflicts between what I want and what Linus wants.
PB> After all, Linus says that I can use the directory cache in any way I
PB> please (well, the user can, but I'm speaking for him ;-). So I'm doing
PB> so, and with your tool I would get into problems, since it is suddenly
PB> imposing a policy on what should be in the index.

I think our misunderstanding is coming from the use of the word
merge tree.  I think you have been assuming that I wanted you
to run merge-trees -o ,,merge --- which would certainly cause
me to muck with your dircache there.  I totally agree with you
that that is a *BAD* *THING*.  No question there.

However, my assumption has been different.  I was assuming that
you would run merge-trees -o merge~tree (i.e. different from
your merge tree), so that you can get the merge results in a
form parsable by you.  And then, using that information, you can
make your changes in ,,merge.  After you are done with that
information, you can remove merge~tree, of course.

The format I chose for the "merge result in a form parsable by
you" happens to be a dircache in merge~tree, with a minimum
number of files checked out when the merge cannot be automatically
done safely.  In the simplest case of not having any conflicting
merge between $C and $merged, Cogito can immediately run
write-tree in merge~tree (not ,,merge) to obtain its tree-ID
$T, so that it can feed it to diff-tree to compare it with
whatever tree state Cogito wants to apply the merges between $C
and $merged to.

I still do not understand what you do in ,,merge directory, but
here is one way you can update the user working directory
in-place without having a ,,merge directory [*2*].  You can run
your git diff between $C and $T [*1*].  The result is the diff
you need to apply on top of your user's working files.  If the
user does not like the result of running that diff, it can
easily be reversed.

If a manual merge were needed between $C and $merged, Cogito
could guide the user through that manual edit in merge~tree,
and run update-cache on those hand merged files in merge~tree,
before running write-tree in merge~tree to obtain $T; after
that, everything else is the same.

You make interesting points in other parts of your message that I
need to digest for a while, so I will not comment on them
in this message.

[Footnote]

*1* I really like the convenience of being able to use tree-ID
and commit-ID interchangeably there.  Thanks.

*2* I understand that this would change the user's git-tools
experience a bit.  The user will no longer be told to go to ,,merge
and commit there, which will then be reflected back to your working
tree.  Instead the merge happens in-place.  Committing, not
committing, or further hand-fixing the merge is up to the user.
I suspect this change might even be for the better.



Re: write-tree is pasky-0.4

2005-04-15 Thread Junio C Hamano
>>>>> "CSA" == C Scott Ananian [EMAIL PROTECTED] writes:

CSA> On Fri, 15 Apr 2005, Junio C Hamano wrote:
>> to yours is no problem for me.  Currently I see your HEAD is at
>> 461aef08823a18a6c69d472499ef5257f8c7f6c8, so I will generate a
>> set of patches against it.

CSA> Have you considered using an s/key-like system to make these hashes
CSA> more human-readable?  Using the S/Key translation (11-bit chunks map
CSA> to a 1-4 letter word), Linus' HEAD is at:
CSA>   WOW-SCAN-NAVE-AUK-JILL-BASH-HI-LACE-LID-RIDE-RUSE-LINE-GLEE-WICK-A
CSA> ...which is a little longer, but speaking of branch wow-scan (which
CSA> gives 22 bits of disambiguation) is probably less error-prone than
CSA> discussing branch '461...' (only 12 bits).

I understand monotone folks have the same issue and they let you
use an unambiguous prefix string.  And why do you stop counting at
461 in your example?  To my eyes, 461aef in this particular
string stands out and is easily typable, which gives me 24 bits
;-).
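
The arithmetic behind those bit counts, and the "unambiguous prefix" idea, can be sketched in C as below. This is only an illustration: both function names are hypothetical, and nothing here is taken from monotone or git.

```c
#include <stddef.h>
#include <string.h>

/* Each hex digit encodes 4 bits, so "461" gives 12 bits and "461aef" 24. */
size_t prefix_bits(const char *hex_prefix)
{
	return 4 * strlen(hex_prefix);
}

/* Return the index of the only ID starting with 'prefix',
 * -1 if none matches, or -2 if the prefix is ambiguous. */
int resolve_prefix(const char *const *ids, size_t n, const char *prefix)
{
	size_t plen = strlen(prefix);
	int found = -1;

	for (size_t i = 0; i < n; i++) {
		if (strncmp(ids[i], prefix, plen) != 0)
			continue;
		if (found >= 0)
			return -2;   /* a second match: prefix is not unique */
		found = (int)i;
	}
	return found;
}
```

A front end could keep asking for a longer prefix whenever resolve_prefix reports ambiguity.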

But seriously, I doubt the hex format needs to be shown to
humans very often, e-mail communications like this one being a
very special exception.  I do not expect people to be
saying things like "Hey, Junio's patch against 461aef... from
Linus is total crap".

The only reason I mentioned his then-HEAD by hex is because I do
not have a public archive for him to pull from, and I wanted to
make it easy for him to do:

 $ export SHA1_FILE_DIRECTORY
 $ mkdir junk && cd junk && mkdir .git &&
   read-tree `cat-file commit 461aef... | sed -e 's/^tree //;q'`
 $ patch < ../stupid-patch-from-junio-01
 $ show-diff

(it might have been better if I used the tree ID for this purpose).

For Cogito users the hex format does not matter.  "git pull"
will get whatever HEAD is recorded in the file on the sending end,
and the end user does not even have to know about it.

CSA> This is obviously a cogito issue, rather than a git-fs thing.

Yes.



[PATCH 3/2] merge-trees script for Linus git

2005-04-15 Thread Junio C Hamano
Linus,

the merge-trees I sent you earlier was expecting the old
diff-tree behaviour, and I did not realize that I need an
explicit -z flag now.  Here is a fix.

Signed-off-by: Junio C Hamano [EMAIL PROTECTED]
---
 merge-trees |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

--- merge-trees 2005-04-15 13:21:35.0 -0700
+++ merge-trees+2005-04-15 16:27:34.0 -0700
@@ -78,8 +78,8 @@
 local ($_, $/);
 $/ = "\0";
 my %path;
-open $fhi, '-|', 'diff-tree', '-r', @tree
-    or die "$!: diff-tree -r @tree";
+open $fhi, '-|', 'diff-tree', '-r', '-z', @tree
+    or die "$!: diff-tree -r -z @tree";
 while (<$fhi>) {
     chomp;
     if (/^\*($reM)->($reM)\tblob\t($reID)->($reID)\t(.*)$/so) {



Re: Re: Re: write-tree is pasky-0.4

2005-04-15 Thread Daniel Barkalow
On Fri, 15 Apr 2005, Linus Torvalds wrote:

> I think I've explained my name tracking worries.  When it comes to "how to
> merge", there's three issues:
>
>  - we do commonly have merge clashes where both trees have applied the
>    exact same patch. That should merge perfectly well using the 3-way
>    merge from a common parent that Junio has, but not your current "bring
>    patches forward" kind of strategy.

I think 3-way merge is probably the best starting point, but I think that
there might be value in being able to identify the commits of each side
involved in a conflict. I think this would help with cases where both
sides pick up an identical patch, and then each side makes a further
change to a different part of the changed region (you find out that the
other guy's change was supposed to follow the patch, and doesn't conflict
with it).

>  - I _do_ actually sometimes merge with dirty state in my working
>    directory, which is why I want the merge to take place in a separate
>    (and temporary) directory, which allows for a failed merge without
>    having any major cleanup. If the merge fails, it's not a big deal, and
>    I can just blow the merge directory away without losing the work I had
>    in my "real" working directory.

Is there some reason you don't commit before merging? All of the current
merge theory seems to want to merge two commits, using the information git
keeps about them. It should be cheap to get a new clean working directory
to merge in, too, particularly if we add a cache of hardlinkable expanded
blobs.

>  - reliability. I care much less for "clever" than I care for "guaranteed
>    to never do the wrong thing". If I have to fix up some stuff by hand,
>    I'll happily do so. But if I can't trust the merge and have to _check_
>    things by hand afterwards, that will make me leery of the merges, and
>    _that_ is bad.
>
> The third point is why I'm going to the ultra-conservative "three-way
> merge from the common parent". It's not fancy, but it's something I feel
> comfortable with as a merge strategy. For example, arch (and in particular
> darcs) seems to want to try to be "clever" about the merges, and I'd
> always live in fear.
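
For readers following along, the conservative per-path rule behind a three-way merge from a common parent can be sketched in C roughly as below. This is an illustration under assumptions, not the code under discussion: merge_one and the enum names are hypothetical, entries are compared by their 20-byte SHA-1 only, and added or deleted paths and mode changes are ignored.

```c
#include <string.h>

enum merge_result { TAKE_OURS, TAKE_THEIRS, CONFLICT };

/* Decide one path from three 20-byte SHA-1 object names:
 * the common parent ('base') and the two heads being merged. */
enum merge_result merge_one(const unsigned char *base,
			    const unsigned char *ours,
			    const unsigned char *theirs)
{
	if (!memcmp(ours, theirs, 20))
		return TAKE_OURS;    /* both sides agree, e.g. same patch applied */
	if (!memcmp(base, ours, 20))
		return TAKE_THEIRS;  /* only their side changed the file */
	if (!memcmp(base, theirs, 20))
		return TAKE_OURS;    /* only our side changed the file */
	return CONFLICT;             /* both changed it differently */
}
```

Only the CONFLICT case needs a content-level merge of the file; every other case is settled by comparing object names, which is why both sides applying the exact same patch merges without trouble.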

How much do you care about the situation where there is no best common
ancestor (which can happen if you're merging two main lines, each of which
has merged with both of a pair of minor trees)? I think that arch is even
more conservative, in that it doesn't look for a common ancestor, and
reports conflicts whenever changes overlap at all. Of course, reliability
by virtue of never working without help is not a big win over living in
fear; you always have to check over it, not because you're afraid, but
because it needs you to.

> And, finally, there's obviously performance. I _think_ a normal merge with
> nary a conflict and just a few tens of files changed should be possible in
> a second. I realize that sounds crazy to some people, but I think it's
> entirely doable. Half of that is writing the new tree out (that is a
> relative costly op due to the compression). The other half is the "work".

I think that the time spent on I/O will be overwhelmed by the time spent
issuing the command at that rate. It might matter if you start getting
into merging lots of things at once, but that's more like a minute for a
merge group with 600 changes rather than a second per merge; we could
potentially save a lot of time based on having a bunch of information left
over from the previous merge when starting merge number 2. So 15 seconds
plus half a second per merge might be better than a second per merge in
the case that matters.

-Daniel
*This .sig left intentionally blank*



[PATCH] Add '-z' to merge-tree.c

2005-04-15 Thread Junio C Hamano
Linus,

this adds '-z' to merge-tree and changes its default line
termination to LF to make it consistent with your other recent
changes.

The patch is against
commit 028c5948257e763b3deb391e567b624eb7975ec2
tree   6b866e10b16183e630db8449c64899f6810d4270

Signed-off-by: Junio C Hamano [EMAIL PROTECTED]
---
 merge-tree.c |   23 ---
 1 files changed, 20 insertions(+), 3 deletions(-)

--- ,,linus/merge-tree.c2005-04-15 18:09:29.0 -0700
+++ ./merge-tree.c  2005-04-15 17:55:42.0 -0700
@@ -1,5 +1,7 @@
 #include "cache.h"
 
+static int line_termination = '\n';
+
 struct tree_entry {
 	unsigned mode;
 	unsigned char *sha1;
@@ -35,7 +37,8 @@ static struct tree_entry *read_tree(unsi
 
 static void show(const struct tree_entry *a, const char *path)
 {
-	printf("select %o %s %s%c", a->mode, sha1_to_hex(a->sha1), path, 0);
+	printf("select %o %s %s%c", a->mode, sha1_to_hex(a->sha1), path,
+	       line_termination);
 }
 
 static void merge(const struct tree_entry *a, const struct tree_entry *b, const struct tree_entry *c, const char *path)
@@ -46,7 +49,7 @@ static void merge(const struct tree_entr
 	strcpy(hex_c, sha1_to_hex(c->sha1));
 	printf("merge %o->%o,%o %s->%s,%s %s%c",
 		a->mode, b->mode, c->mode,
-		hex_a, hex_b, hex_c, path, 0);
+		hex_a, hex_b, hex_c, path, line_termination);
 }
 
 static int same(const struct tree_entry *a, const struct tree_entry *b)
@@ -114,15 +117,29 @@ static void merge_tree(struct tree_entry
}
 }
 
+static const char *merge_tree_usage =
+"merge-tree [-z] src dst1 dst2";
+
 int main(int argc, char **argv)
 {
 	unsigned char src[20], dst1[20], dst2[20];
 
+	while ((1 < argc) && argv[1][0] == '-') {
+		switch (argv[1][1]) {
+		case 'z':
+			line_termination = 0;
+			break;
+		default:
+			usage(merge_tree_usage);
+		}
+		argc--; argv++;
+	}
+
 	if (argc != 4 ||
 	    get_sha1_hex(argv[1], src) ||
 	    get_sha1_hex(argv[2], dst1) ||
 	    get_sha1_hex(argv[3], dst2))
-		usage("merge-tree src dst1 dst2");
+		usage(merge_tree_usage);
 	merge_tree(read_tree(src), read_tree(dst1), read_tree(dst2));
 	return 0;
 }




Re: Re: Re: write-tree is pasky-0.4

2005-04-15 Thread Linus Torvalds


On Fri, 15 Apr 2005, Daniel Barkalow wrote:
 
> Is there some reason you don't commit before merging? All of the current
> merge theory seems to want to merge two commits, using the information git
> keeps about them.

Note that the 3-way merge would _only_ merge the committed state. The 
thing is, 99% of all merges end up touching files that I never touch 
myself (ie other architectures), so me being able to merge them even when 
_I_ am in the middle of something is a good thing.

So even when I have dirty state, the merge would only merge the clean
state. And then before the merge information is put back into my working
directory, I'd do a check-files on the result, making sure that nothing
that got changed by the merge isn't up-to-date.

> How much do you care about the situation where there is no best common
> ancestor

I care. Even if the best common parent is 3 months ago, I care. I'd much 
rather get a big explicit conflict than a clean merge that ends up being 
debatable because people played games with per-file merging or something 
questionable like that.

> I think that the time spent on I/O will be overwhelmed by the time spent
> issuing the command at that rate.

There is no time at all spent on IO.

All my email is local, and if this all ends up working out well, I can
track other people's object trees in local subdirectories with some
daily rsyncs. And I have enough memory in my machines that there is
basically no disk IO: the only trees I normally touch are the kernel
trees, and they all stay in cache.

Linus


Re: Re: Add clone support to lntree

2005-04-15 Thread Linus Torvalds


On Sat, 16 Apr 2005, Petr Baudis wrote:
 
> I'm wondering, whether each tree should be fixed to a certain branch.

I'm wondering why you talk about branches at all.

No such thing should exist. There are no branches. There are just
repositories. You can track somebody else's repository, but you should
track it by location, not by any branch name.

And you track it by just merging it.

Yeah, we don't have really usable merges yet, but..

Linus


Re: Re: Re: write-tree is pasky-0.4

2005-04-15 Thread Daniel Barkalow
On Fri, 15 Apr 2005, Linus Torvalds wrote:

> On Fri, 15 Apr 2005, Daniel Barkalow wrote:
> >
> > So you want to merge someone else's tree into your committed state, and
> > then merge the result with your working directory to get the working
> > directory you continue with, provided that the second merge is trivial?
>
> No, you don't even merge the working directory.
>
> The low-level tools should entirely ignore the working directory. To a
> low-level merge, the working directory doesn't even exist. It just gets
> three commits (or trees) and merges two of them with the third as a
> parent, and does all of it in its own temporary merge working
> directory.

It seems like users won't expect there to be a new working directory for
the merge in which they are supposed to resolve the conflicts, but where
they don't see their uncommitted changes. In any case, the low-level tools
have to care about *some* working directory, even if it isn't the parent
of .git, and the parent of .git seems like where other similar things
happen. If we're being conservative about merging, we're likely to report
a lot of conflicts, at least until we work out better techniques than a
simple 3-way merge.

> > For the latter, there are sometimes multiple ancestors which fit this
> > criterion
>
> Yes. Let's just pick one at random (or more likely, the latest one by
> date - let's not actually be _random_ random) at first.

Okay; I've currently got the one where the number of generations it is
away from the further head is the smallest, and of equal ones, an
arbitrary choice. If people are generally similar in the amount they
diverge before committing, this should be the most similar ancestor.
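
As a rough sketch of that selection rule in C (not the actual patch; the struct, its fields, and the function name are hypothetical, and the generation counts are assumed to have been computed already):

```c
#include <stddef.h>

struct candidate {
	unsigned gen_a;  /* generations from this ancestor to head A */
	unsigned gen_b;  /* generations from this ancestor to head B */
};

/* Return the index of the candidate whose distance to the further head
 * is smallest; ties keep the first one seen, which stands in for the
 * arbitrary choice. Returns -1 if there are no candidates. */
int pick_ancestor(const struct candidate *c, size_t n)
{
	int best = -1;
	unsigned best_further = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned further = c[i].gen_a > c[i].gen_b ? c[i].gen_a : c[i].gen_b;

		if (best < 0 || further < best_further) {
			best = (int)i;
			best_further = further;
		}
	}
	return best;
}
```

Swapping in a different heuristic, such as the most recent commit date, would only change the comparison inside the loop.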

> There are other heuristics we can try, ie if it turns out that it's common
> to have a couple of alternatives (but no more than some small number, say
> five or so), we can literally just -try- to do a tree-only merge, and see
> how many lines of common output you get from diff-tree.
>
> Because that "how many files do we need to merge" is the number you want
> to minimize, and doing a couple of extra "diff-tree" + "join" operations
> should be so fast that nobody will notice that we actually tried five
> different merges to see which one looked the best.
>
> But hey, especially if the merge fails with real clashes (ie there are
> changes in common and running "merge" leaves conflicts), and there were
> other alternate parents to choose, there's nothing wrong with just
> printing them out and saying "you might try to specify one of these
> manually".

I think we should be able to get good results out of doing the 5 merges
and reporting a conflict only if there's a conflict in all of them; it
shouldn't be possible for two to succeed but give different results (if it
did, clearly our current algorithm is unsafe, since it would give some
undesired output if it happened to use the wrong ancestor).

I'm thinking of not actually calling merge(1) for this at all; it just
calls diff3, and diff3 is only 1745 lines including option parsing. We can
probably arrange to look around for better ancestors in case of conflicts
we'd otherwise have to report, and get this all tidy and more efficient
than having diff3 re-read files. And if we only go to other ancestors in
case of conflicts, we're going to be a lot faster total than getting a
reaction from the user, almost no matter what we do.

> I really don't think we should worry too much about this until we've
> actually used the system for a while and seen what it does. So just start
> with "nearest common parent with most recent date". Which I think you
> already implemented, no?

I've got something like that (see above); did you want it in some form
other than the patch I sent you?

-Daniel
*This .sig left intentionally blank*



Re: [PATCH 3/2] merge-trees script for Linus git

2005-04-15 Thread Linus Torvalds


On Fri, 15 Apr 2005, Junio C Hamano wrote:
 
> I'd take the hint, but I would say the current Perl version
> would be far more usable than the C version I would come up with
> by the end of this weekend because:

Actually, it turns out that I have a cunning plan.

I'm full of cunning plans, in fact. It turns out that I can do merges even
more simply, if I just allow the notion of "state" into an index entry,
and allow multiple index entries with the same name as long as they differ
in "state".

And that means that I can do all the merging in the regular index tree, 
using very simple rules.
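
One way to picture that idea in C is sketched below. This is a guess at the shape, not the code being written: the struct, the numbering of the state values, and the rule that state 0 means "resolved" are all assumptions made for illustration, since the message does not spell them out.

```c
#include <stddef.h>
#include <string.h>

struct index_entry {
	const char *name;
	unsigned state;   /* assumed: 0 = resolved, non-zero = one side of a merge */
};

/* Order entries by name first, then by state, so that all states of
 * one path sit next to each other in a sorted index. Two entries are
 * duplicates only if both name and state match. */
int index_entry_cmp(const struct index_entry *a, const struct index_entry *b)
{
	int c = strcmp(a->name, b->name);

	if (c)
		return c;
	return (a->state > b->state) - (a->state < b->state);
}

/* Under the assumption above, a path still needs merging while any
 * of its entries carries a non-zero state. */
int path_is_unmerged(const struct index_entry *entries, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(entries[i].name, name) && entries[i].state != 0)
			return 1;
	return 0;
}
```

With a layout like this, a merge could proceed by loading each tree under its own state and then collapsing the entries of each path back to a single resolved one.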

Let's see how that works out. I'm writing the code now.

Linus