Re: git merge-tree: bug report and some feature requests
Thanks, Ed. I think I'll pursue the libgit2 route; sounds promising.

>> But the alternative appears to be punting entirely, as libgit2 does,
>> and merely providing something akin to three index entries.
>
> Indeed, when I added merge to libgit2, we put the higher-level conflict
> analysis into application code because there was not much interest in it
> at the time. I've been meaning to add this to `git_status` in libgit2,
> but it's not been a high priority.

Is your conflict analysis application code public? I might be game to
do some of the legwork to get it into libgit2's git_status (although
I'm probably not the right person to do the API design). At a minimum,
it would be helpful as a reference, as I'm probably about to recreate
some subset of it myself.

-josh
Re: git merge-tree: bug report and some feature requests
>> I'm experimenting with some new porcelain for interactive rebase. One
>> goal is to leave the work tree untouched for most operations. It looks
>> to me like 'git merge-tree' may be the right plumbing command for
>> doing the merge part of the pick work of the todo list, one commit at
>> a time. If I'm wrong about this, I'd love pointers; what follows may
>> still be interesting anyway.
>
> I don't have a concrete alternative (yet?) but here are some pointers
> to two alternate merge-without-touching-working-tree possibilities, if
> your current route doesn't pan out as well as you like:
>
> I posted some patches last year to make merge-recursive.c be able to
> do merges without touching the working tree. Adding a few flags would
> then enable it for any of 'merge', 'cherry-pick', 'am', or
> 'rebase'...though for unsuccessful merges, there's a clear question of
> what/how conflicts should be reported to the user. That probably
> depends a fair amount on the precise use-case.
>
> Although that series was placed on the backburner due to the immediate
> driver of the feature going away, I'm still interested in such a
> change, though I think it would fall out as a nice side effect of
> implementing Junio's proposed ideal-world-merge-recursive rewrite[1].
> I have started looking into that[2], but no guarantees about how
> quickly I'll find time to finish or even whether I will.
>
> [1] https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com
> [2] https://github.com/newren/git/blob/ort/ort-cover-letter contains
> overview of ideas and notes to myself about what I was hoping to
> accomplish; currently it doesn't even compile or do anything

Thanks for the pointer. That does seem promising.

And yes, I see now that serialization of conflicts is decidedly
challenging. More on that below.

>> 4. API suggestion
>>
>> Here's what I really want 'git merge-tree' to output. :)
> ...
>> If the merge had conflicts, write the "as merged as possible" tree to
>
> You'd need to define "as merged as possible" more carefully, because I
> thought you meant a tree containing all the three-way merge conflict
> markers and such being present in the "resolved" file, but from your
> parenthetical note below it appears you think that is a different tree
> that would also be useful to diff against the first one. That leaves
> me wondering what the first tree is. (Is it just the tree where for
> each path, if that path had no conflicts associated with it then it's
> the merge-resolved-file, and otherwise it's the file contents from the
> merge-base?).

FWIW, the parenthetical suggestion was indeed what I had in mind. But
non-content conflicts appear to make that a non-starter. Or at least
woefully incomplete.

> Both of these trees are actually rather non-trivial to define. The
> wording above isn't actually sufficient, because content conflicts
> aren't the only kind of conflict. More on that below.
>
> There is already a bunch of code in merge-recursive.c to create a
> forcibly-merged-accepting-conflict-markers-in-the-resolution and
> record it as a tree (this is used for creating virtual merge bases in
> the recursive case, namely when there isn't a single merge-base for
> the two branches you are merging). It might be reusable for what you
> want here, but it's not immediately clear whether all the things it
> does are appropriate; someone would have to consider the non-content
> (path-based) conflicts carefully.

Ack. I assume this is also the code that generates the existing 'git
merge-tree' patches, which includes conflict markers.

>> the object database and give me its sha, and then also give me the
>> three-way merge diff output for all conflicts, as a regular patch
>> against that tree, using full path names and shas. (Alternatively,
>> maybe better, give me a second sha for a tree containing all the
>> three-way merge diff patches applied, which I can diff against the
>> first tree to find the conflict patches.)
>
> As far as I can tell, you're assuming that it's possible with two
> trees that are crafted "just right", that you can tell where the merge
> conflicts are, with binary files being your only difficulty. Content
> conflicts aren't the only type that exist; there are also path-based
> conflicts. These type of conflicts also make it difficult to know how
> the two trees you are requesting should even be created.
>
> For example, if there is a modify/delete conflict, how can that be
> determined from just two trees? If the first tree has the base
> version of the file, then the second tree either has a file at the
> same position or it doesn't. Neither case looks like a conflict, but
> the original merge had one. You need more information. The exact
> same thing can be said for rename/delete conflicts.
>
> Similarly, rename/add (one side renames an existing file to some new
> path (say, "new_path"), and the other adds a brand new file at
> "new_path), or rename/rename(2to1) (each s
git merge-tree: bug report and some feature requests
Hi, all.

I'm experimenting with some new porcelain for interactive rebase. One
goal is to leave the work tree untouched for most operations. It looks
to me like 'git merge-tree' may be the right plumbing command for
doing the merge part of the pick work of the todo list, one commit at
a time. If I'm wrong about this, I'd love pointers; what follows may
still be interesting anyway.

I've encountered some bumps with 'git merge-tree'. A bug report and
some feature requests follow. Apologies for the long email.


1. Bug

When a binary file containing NUL is added on only one side, the
resulting patch is malformed. Reproduction script:

    mkdir test
    cd test
    git init .
    touch shared
    git add shared
    git commit -m "base"
    git checkout -b left
    echo "left" > x
    printf '\1\0\1\n' > binary
    git add x binary
    git commit -m "left"
    git checkout master
    git checkout -b right
    echo "right" > x
    git add x
    git commit -m "right"
    git merge-tree master right left
    git merge-tree master right left | xxd
    cd ..
    rm -rf test

The merge-tree results I get with 2.15.1 are:

    added in remote
      their  100644 ddc50ce55647db1421b18aa33417442e29f63d2f binary
    @@ -0,0 +1 @@
    +added in both
      our    100644 c376d892e8b105bd712d06ec5162b5f31ce949c3 x
      their  100644 45cf141ba67d59203f02a54f03162f3fcef57830 x
    @@ -1 +1,5 @@
    +<<<<<<< .our
     right
    +=======
    +left
    +>>>>>>> .their

    0000: 6164 6465 6420 696e 2072 656d 6f74 650a  added in remote.
    0010: 2020 7468 6569 7220 2031 3030 3634 3420    their  100644 
    0020: 6464 6335 3063 6535 3536 3437 6462 3134  ddc50ce55647db14
    0030: 3231 6231 3861 6133 3334 3137 3434 3265  21b18aa33417442e
    0040: 3239 6636 3364 3266 2062 696e 6172 790a  29f63d2f binary.
    0050: 4040 202d 302c 3020 2b31 2040 400a 2b01  @@ -0,0 +1 @@.+.
    0060: 6164 6465 6420 696e 2062 6f74 680a 2020  added in both.  
    0070: 6f75 7220 2020 2031 3030 3634 3420 6333  our    100644 c3
    0080: 3736 6438 3932 6538 6231 3035 6264 3731  76d892e8b105bd71
    0090: 3264 3036 6563 3531 3632 6235 6633 3163  2d06ec5162b5f31c
    00a0: 6539 3439 6333 2078 0a20 2074 6865 6972  e949c3 x.  their
    00b0: 2020 3130 3036 3434 2034 3563 6631 3431    100644 45cf141
    00c0: 6261 3637 6435 3932 3033 6630 3261 3534  ba67d59203f02a54
    00d0: 6630 3331 3632 6633 6663 6566 3537 3833  f03162f3fcef5783
    00e0: 3020 780a 4040 202d 3120 2b31 2c35 2040  0 x.@@ -1 +1,5 @
    00f0: 400a 2b3c 3c3c 3c3c 3c3c 202e 6f75 720a  @.+<<<<<<< .our.
    0100: 2072 6967 6874 0a2b 3d3d 3d3d 3d3d 3d0a   right.+=======.
    0110: 2b6c 6566 740a 2b3e 3e3e 3e3e 3e3e 202e  +left.+>>>>>>> .
    0120: 7468 6569 720a                           their.

Note that the "added in both" explanation appears to be part of the
diff for binary. The diff line should be '\1\0\1\n', but it is only
'\1', obviously suggesting a C string operation gone awry.

I haven't checked whether regular 'git diff' operations contain a
similar bug. (The NUL would have to be pretty far into the file, to
confuse the binary file detection heuristic, but that is possible.)

I don't see any particularly good work-arounds. Looking for all
possible explanations as a trigger to detect a malformed patch runs
into false positives with the explanation "merged", which occurs in
regular code.


2. Feature suggestion

Related to the bug, may I suggest a flag to omit unnecessary patches?
For "added in remote" and "deleted in remote", I don't actually need
the patch--I can grab the blob contents from the SHA myself if needed.
These cases need special handling anyway (to create/delete the file),
so the (often large) patch doesn't add much anyway. This would provide
a workaround for the bug.


3. Feature suggestion

There's no direct indication of whether any given file's merge
succeeded. Currently I sniff for merge conflicts by looking for
"+<<<<<<< .our", which feels like an ugly kludge. Could we provide an
explicit indicator? (And maybe also one for binary vs text
processing?)

Note that binary file merge conflicts don't generate patches with
three-way merge markers but instead say "warning: Cannot merge binary
files: binary (.our vs. .their)". Looking for this case even further
complicates the output parser.


4. API suggestion

Here's what I really want 'git merge-tree' to output. :)

If the merge had no conflicts, write the resulting tree to the object
database and give me its sha. I can always diff that tree against
branch1's tree if I want to see what has changed.

If the merge had conflicts, write the "as merged as possible" tree to
the object database and give me its sha, and then also give me the
three-way merge diff output for all conflicts, as a regular patch
against that tree, using full path names and shas. (Alternatively,
maybe better, give me a second sha for a tree containing all the
three-way merge diff patches applied, which I can diff against the
first tree to find the conflict patches.)

I'm not sure what to do about binary merge conflicts, since they
aren't representable with three-way markers. Maybe j
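
For reference, the conflict sniffing described in item 3 above could look roughly like the following. This is only a minimal sketch of the kludge, not a robust parser. The function name classify_merge_tree_output is made up for illustration, and the sketch assumes the marker line and the binary warning appear in the stream it reads exactly as quoted in this message. A tracked file whose own content adds a line reading "<<<<<<< .our" would be misreported as a conflict.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): classify 'git merge-tree'
# output read from stdin as "binary-conflict", "conflict", or "clean".
classify_merge_tree_output() {
    # Buffer to a temporary file rather than a shell variable, since
    # the output may contain NUL bytes that variables cannot hold.
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    if grep -q '^warning: Cannot merge binary files:' "$tmp"; then
        result=binary-conflict
    elif grep -q '^+<<<<<<< \.our$' "$tmp"; then
        result=conflict
    else
        result=clean
    fi
    rm -f "$tmp"
    echo "$result"
}
```

It would be used as 'git merge-tree master right left | classify_merge_tree_output'. Having to match on patch content like this is the reason an explicit per-file indicator would be preferable.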
Re: Bug report: $program_name in error message
>> To reproduce, run 'git submodule' from within a bare repo. Result:
>>
>> $ git submodule
>> fatal: $program_name cannot be used without a working tree.
>>
>> Looks like the intent was for $program_name to be interpolated.
>
> Which version of git do you use?

$ git version
git version 2.11.0

>> As an aside, I sent a message a few days ago about a segfault when
>> working with a filesystem with direct_io on, but it appears not to
>> have made it to the archives on marc.info. Am I perhaps still
>> greylisted?
>
> Both emails show up in my mailbox (subscribed to the mailing list),
> so I think that just nobody answered your first email as the answer
> may be non trivial.

Thanks for the confirmation.

-josh
Bug report: $program_name in error message
To reproduce, run 'git submodule' from within a bare repo. Result:

$ git submodule
fatal: $program_name cannot be used without a working tree.

Looks like the intent was for $program_name to be interpolated.

As an aside, I sent a message a few days ago about a segfault when
working with a filesystem with direct_io on, but it appears not to
have made it to the archives on marc.info. Am I perhaps still
greylisted?

Thanks,
Josh
Segfault in git_config_set_multivar_in_file_gently with direct_io in FUSE filesystem
I am using git with a simple in-memory FUSE filesystem. When I enable direct_io, I get a segfault from git_config_set_multivar_in_file_gently during git clone. I have full reproduction instructions using Go and macOS at https://github.com/josharian/gitbug. It also includes a stack trace in case anyone wants to try blind debugging. Happy to provide more info if it will help. Thanks, Josh
Bug report: checkout-index --temp path is not always relative to the current directory
In the section on using --temp, 'git help checkout-index' says: "The
path field is always relative to the current directory and the
temporary file names are always relative to the top level directory."

However, this can be false when an absolute path to the file is
provided. Reproduction:

~$ git version
git version 2.2.1
~$ mkdir demo
~$ cd demo
~/demo$ mkdir a
~/demo$ echo "a" > a/f
~/demo$ mkdir b
~/demo$ echo "b" > b/f
~/demo$ git init .
Initialized empty Git repository in ~/demo/.git/
~/demo$ git add .
~/demo$ git commit -m "init"
[master (root-commit) 2afa910] init
 2 files changed, 2 insertions(+)
 create mode 100644 a/f
 create mode 100644 b/f
~/demo$ echo "b2" > b/f
~/demo$ cd a
~/demo/a$ git checkout-index --temp -- `git rev-parse --show-toplevel`/b/f
.merge_file_xm8RTd f
~/demo/a$ cat ../.merge_file_xm8RTd
b

Note that if f in the checkout-index output is interpreted as relative
to the current directory, it would refer to a/f, whereas in fact it is
b/f.

This led to https://github.com/golang/go/issues/9476.

Thanks,
Josh
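
As a stopgap on the caller's side, the temporary file can still be located from the listing, since only the path field is misleading here. The sketch below is a minimal illustration, not a fix: resolve_temp is a hypothetical helper name, and it assumes the single-stage listing format documented for --temp (temporary file name, a tab, then the path) and that the temporary file name is relative to the top-level directory, as the help text states and as the 'cat ../.merge_file_xm8RTd' step above suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of git).
# $1 = top-level directory of the work tree
# $2 = one listing line from 'git checkout-index --temp',
#      assumed to be "<tempname><TAB><path>"
# Prints the full path of the temporary file. The path field is
# deliberately ignored, since it is the part that is ambiguous
# when an absolute path was given on the command line.
resolve_temp() {
    toplevel=$1
    line=$2
    tab=$(printf '\t')
    tempname=${line%%"$tab"*}
    printf '%s/%s\n' "$toplevel" "$tempname"
}
```

A caller would pass the output of 'git rev-parse --show-toplevel' as the first argument and keep its own record of which file it asked for, rather than trusting the path field.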