date:20171110

cherry-pick very slow on big repository

2017-11-10 Thread Peter Krefting


Hi!

On a big repository (57000 files, 2,5 gigabytes in .git/objects), git 
cherry-pick is very slow for me (v2.15.0). This is cherry-picking a 
one-file change, where the file is in the same place on both branches, 
and which applies cleanly (I am backporting a few fixes to a 
maintenance version):


$ time git cherry-pick -x 717eb328940ca2e33f14ed27576e656327854b7b
[redacted 391454f16d] Redacted
 Author: Redacted 
 Date: Mon Oct 16 15:58:05 2017 +0200
 1 file changed, 2 insertions(+), 2 deletions(-)

real6m9,054s
user5m49,432s
sys 0m2,292s

Something is not how it should be here. The repo shares objects 
(.git/objects/info/alternates) with another repository (I have run 
"git gc" on both repositories).


Running strace, it seems like it is doing lstat(), open(), mmap(), 
close() and munmap() on every single file in the repository, which 
takes a lot of time.


I thought it was just updating the status, but "git status" returns 
immediately, while cherry-picking takes several minutes for every 
cherry-pick I do.


--
\\// Peter - http://www.softwolves.pp.se/

Re: Bug - Status - Space in Filename

2017-11-10 Thread Jeff King

On Fri, Nov 10, 2017 at 09:52:16AM +0900, Junio C Hamano wrote:

> > That said, if this is the only place that has this funny quoting, it may
> > not be worth polluting the rest of the code with the idea that quoting
> > spaces is a good thing to do.
> 
> Sounds sane.  We can probably use a helper like this:
> 
> static char *quote_path_with_sp(const char *in, const char *prefix, struct 
> strbuf *out)
> {
>   const char dq = '"';
> 
>   quote_path(in, prefix, out);
>   if (out->buf[0] != dq && strchr(out->buf, ' ') != NULL) {
>   strbuf_insert(out, 0, &dq, 1);
>   strbuf_addch(out, dq);
>   }
>   return out->buf;
> }
> 
> which allows the current users like shortstatus_status() to become a
> lot shorter.

Are there callers who don't just print the result? If not, we could just
always emit. That's slightly more efficient since it drops the expensive
strbuf_insert (though there are already so many copies going on in
quote_path_relative that it hardly matters). But it also drops the need
for the caller to know about the strbuf at all.

Like:
diff --git a/wt-status.c b/wt-status.c
index 937a87bbd5..4f4706a6e2 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1703,6 +1703,18 @@ static void wt_shortstatus_unmerged(struct 
string_list_item *it,
}
 }
 
+static void emit_path(const char *in, const char *prefix)
+{
+   struct strbuf buf = STRBUF_INIT;
+   quote_path(in, prefix, &buf);
+   if (buf.buf[0] != '"' && strchr(buf.buf, ' ') != NULL) {
+   putchar('"');
+   strbuf_addch(&buf, '"');
+   }
+   fwrite(buf.buf, 1, buf.len, stdout);
+   strbuf_release(&buf);
+}
+
 static void wt_shortstatus_status(struct string_list_item *it,
 struct wt_status *s)
 {
@@ -1722,26 +1734,12 @@ static void wt_shortstatus_status(struct 
string_list_item *it,
if (d->head_path)
fprintf(stdout, "%s%c", d->head_path, 0);
} else {
-   struct strbuf onebuf = STRBUF_INIT;
-   const char *one;
if (d->head_path) {
-   one = quote_path(d->head_path, s->prefix, &onebuf);
-   if (*one != '"' && strchr(one, ' ') != NULL) {
-   putchar('"');
-   strbuf_addch(&onebuf, '"');
-   one = onebuf.buf;
-   }
-   printf("%s -> ", one);
-   strbuf_release(&onebuf);
+   emit_path(d->head_path, s->prefix);
+   printf(" -> ");
}
-   one = quote_path(it->string, s->prefix, &onebuf);
-   if (*one != '"' && strchr(one, ' ') != NULL) {
-   putchar('"');
-   strbuf_addch(&onebuf, '"');
-   one = onebuf.buf;
-   }
-   printf("%s\n", one);
-   strbuf_release(&onebuf);
+   emit_path(it->string, s->prefix);
+   putchar('\n');
}
 }
 

Though really I am fine with any solution that puts this pattern into a
helper function rather than repeating it inline.

-Peff

Re: cherry-pick very slow on big repository

2017-11-10 Thread Jeff King

On Fri, Nov 10, 2017 at 10:39:39AM +0100, Peter Krefting wrote:

> Running strace, it seems like it is doing lstat(), open(), mmap(), close()
> and munmap() on every single file in the repository, which takes a lot of
> time.
> 
> I thought it was just updating the status, but "git status" returns
> immediately, while cherry-picking takes several minutes for every
> cherry-pick I do.

It kind of sounds like a temporary index is being refreshed that doesn't
have the proper stat information.

Can you get a backtrace? I'd do something like:

  - gdb --args git cherry-pick ...
  - 'r' to run
  - give it a few seconds to hit the CPU heavy part, then ^C
  - 'bt' to generate the backtrace

which should give a sense of which code path is leading to the slowdown
(or of course use real profiling tools, but if the slow path is taking 6
minutes, you'll be likely to stop in the middle of it ;) ).

-Peff

[RFC] cover-at-tip

2017-11-10 Thread Nicolas Morey-Chaisemartin

Hi,

I'm starting to look into the cover-at-tip topic that I found in the leftover 
bits (http://www.spinics.net/lists/git/msg259573.html)

Here's a first draft of a patch that adds support for format-patch 
--cover-at-tip. It compiles and works in my nice and user firnedly test case.
Just wanted to make sure I was going roughly in the right direction here.


I was wondering where is the right place to put a commit_is_cover_at_tip() as 
the test will be needed in other place as the feature is extended to git 
am/merge/pull.

Feel free to comment. I know the help is not clear at this point and there's 
still some work to do on option handling (add a config option, probably have 
--cover-at-tip imply --cover-letter, etc) and
some testing :)


---
 Documentation/git-format-patch.txt |  4 
 builtin/log.c  | 38 +++---
 2 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-format-patch.txt 
b/Documentation/git-format-patch.txt
index 6cbe462a7..0ac9d4b71 100644
--- a/Documentation/git-format-patch.txt
+++ b/Documentation/git-format-patch.txt
@@ -228,6 +228,10 @@ feeding the result to `git send-email`.
containing the branch description, shortlog and the overall diffstat.  
You can
fill in a description in the file before sending it out.
 
+--[no-]cover-letter-at-tip::
+   Use the tip of the series as a cover letter if it is an empty commit.
+If no cover-letter is to be sent, the tip is ignored.
+
 --notes[=]::
Append the notes (see linkgit:git-notes[1]) for the commit
after the three-dash line.
diff --git a/builtin/log.c b/builtin/log.c
index 6c1fa896a..a0e9e61a3 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -986,11 +986,11 @@ static void make_cover_letter(struct rev_info *rev, int 
use_stdout,
  struct commit *origin,
  int nr, struct commit **list,
  const char *branch_name,
- int quiet)
+ int quiet,
+ struct commit *cover_at_tip_commit)
 {
const char *committer;
const char *body = "*** SUBJECT HERE ***\n\n*** BLURB HERE ***\n";
-   const char *msg;
struct shortlog log;
struct strbuf sb = STRBUF_INIT;
int i;
@@ -1021,14 +1021,18 @@ static void make_cover_letter(struct rev_info *rev, int 
use_stdout,
if (!branch_name)
branch_name = find_branch_name(rev);
 
-   msg = body;
pp.fmt = CMIT_FMT_EMAIL;
pp.date_mode.type = DATE_RFC2822;
pp.rev = rev;
pp.print_email_subject = 1;
-   pp_user_info(&pp, NULL, &sb, committer, encoding);
-   pp_title_line(&pp, &msg, &sb, encoding, need_8bit_cte);
-   pp_remainder(&pp, &msg, &sb, 0);
+
+   if (!cover_at_tip_commit) {
+   pp_user_info(&pp, NULL, &sb, committer, encoding);
+   pp_title_line(&pp, &body, &sb, encoding, need_8bit_cte);
+   pp_remainder(&pp, &body, &sb, 0);
+   } else {
+   pretty_print_commit(&pp, cover_at_tip_commit, &sb);
+   }
add_branch_description(&sb, branch_name);
fprintf(rev->diffopt.file, "%s\n", sb.buf);
 
@@ -1409,6 +1413,8 @@ int cmd_format_patch(int argc, const char **argv, const 
char *prefix)
int just_numbers = 0;
int ignore_if_in_upstream = 0;
int cover_letter = -1;
+   int cover_at_tip = -1;
+   struct commit *cover_at_tip_commit = NULL;
int boundary_count = 0;
int no_binary_diff = 0;
int zero_commit = 0;
@@ -1437,6 +1443,8 @@ int cmd_format_patch(int argc, const char **argv, const 
char *prefix)
N_("print patches to standard out")),
OPT_BOOL(0, "cover-letter", &cover_letter,
N_("generate a cover letter")),
+   OPT_BOOL(0, "cover-at-tip", &cover_at_tip,
+   N_("fill the cover letter with the tip of the 
branch")),
OPT_BOOL(0, "numbered-files", &just_numbers,
N_("use simple number sequence for output file 
names")),
OPT_STRING(0, "suffix", &fmt_patch_suffix, N_("sfx"),
@@ -1698,6 +1706,21 @@ int cmd_format_patch(int argc, const char **argv, const 
char *prefix)
if (ignore_if_in_upstream && has_commit_patch_id(commit, &ids))
continue;
 
+   if (!nr && cover_at_tip == 1 && !cover_at_tip_commit) {
+   /* Check that it is a candidate to be a cover at tip
+* Meaning:
+* - a single parent (merge commits are not eligible)
+* - tree oid == parent->tree->oid (no diff to the tree)
+*/
+   if (commit->parents && !commit->parents->next &&
+   !oidcmp(&com

`git commit -a` stages ignored submodules?

2017-11-10 Thread Sergey Sharybin

Hello everyone,

There seems to be a difference how Git 2.15.0 handles submodules in
comparison with 2.14.2.

In Git 2.14.2 `git commit -a` will not stage submodules which has
`ignore = all` set in their .gitmodule section. However, in Git 2.15.0
`git commit -a` will stage all submodules no matter what their
"ignore" setting is set to, and also no matter if submodule path is in
.gitignore.

Didn't see anything explicit about this in release logs. Is it an
expected new behavior?It is till possible to keep `git commit -a` to
ignore submodules which are configured to be ignored?

-- 
With best regards, Sergey Sharybin

Re: [PATCH] notes: add `rm` and `delete` commands

2017-11-10 Thread Adam Dinwoodie

On Friday 10 November 2017 at 12:14 pm +0900, Junio C Hamano wrote:
> Eric Sunshine  writes:
> 
> > or against the change:
> >
> > - synonym bloat; balloons documentation
> > - steals command verbs from potential future features
> > - ...
> 
> I tend to agree with these two (and if I were to replace the "..."
> with something concrete, this change is not desirable because it
> will force us to add all these three when we introduce "remove"
> elsewhere in the system).
> 
> Having said that, this remids me of an age-old topic we discussed in
> a distant past but stalled due to lack of strong drive, which is:
> 
> Some git subcommands take command verb (e.g. "git notes 'remove'")
> while others take command mode flags (e.g. "git branch --delete"),
> and the users need to remember them, which is not ideal.
> 
> I think the consensus back then, if we were to aim for consistency
> by uniformly using only one of the two, is to use the command mode
> flags, simply because existing commands that have the primary mode
> and take an optional command mode flag [*1*] cannot be migrated to
> the command verb UI safely and sanely.
> 
> And then, if we are not careful when we standardize all subcommands
> to take command mode flags, we might end up with a situation where
> some subcommand say "--remove" while other say "--delete", and some
> users will wish to rectify that situation by adding an alias defined
> for these flags---I view this patch as a part of that possible
> future topic ;-).

My motivation was entirely "these were the commands I tried before I
tried `remove` when I wanted to delete a note."  I think both have
precedence in Git with the likes of `git rm` and `git branch --delete`,
although you're entirely correct that I should have at least added that
justification in the commit message.

The matter of which verbs and which flags work where is clearly a much
broader concern.  However at the moment Git uses `rm`, `remove` and
`delete` with no apparent (at least to me) logic about which is used
where, and it seems to me that adding each as synonyms of the other
across the entire suite would improve usability.

In particular, I don't think I buy the argument that it steals verbs: I
can't imagine wanting to use multiple of those verbs in the same context
with different meanings, as that seems liable to cause significant
confusion.

That said, I do get the argument about UI and documentation bloat.  From
my perspective, the cost of that is less than the benefit of having a
more consistent interface, and this is something we can do to have a
more consistent interface *now* without waiting for a larger future
consistency project, but YMMV, and I'm very willing to bow to everyone
else's expertise on this.

Unify annotated and non-annotated tags

2017-11-10 Thread anatoly techtonik

Hi,

It is hard to work with Git tags, because on low level hash
of non-annotated tag is pointing to commit, but hash for
annotated tag is pointing to tag metadata.

On low level that means that there is no way to get commit
hash from tag in a single step. If tag is annotated, you need
to find and parse ^{} string of show-ref, if not, then look for
string without ^{}.

So, why not just make all tags work the same so that every
tag has its own hash and you need to dereference it in the
same way to get commit hash?

This way I could get all commit hashes with just:

git show-ref --tags -d | grep "\^{}"

or abandon ^{} completely and show commit hashes on -d:

git show-ref --tags --dereference

-- 
anatoly t.

[PATCH v2 2/9] commit: move empty message checks to libgit

2017-11-10 Thread Phillip Wood

From: Phillip Wood 

Move the functions that check for empty messages from bulitin/commit.c
to sequencer.c so they can be shared with other commands. The
functions are refactored to take an explicit cleanup mode and template
filename passed by the caller.

Signed-off-by: Phillip Wood 
---

Notes:
changes since v1:
 - prefix cleanup_mode enum and constants with commit_msg_

 builtin/commit.c | 99 +++-
 sequencer.c  | 61 ++
 sequencer.h  | 11 +++
 3 files changed, 91 insertions(+), 80 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 
605ea8c0e9663726c64950dba07afe21516c9b26..dbc160c525e7a9249b7c7df2180495a4c7102296
 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -128,12 +128,7 @@ static char *sign_commit;
  * if editor is used, and only the whitespaces if the message
  * is specified explicitly.
  */
-static enum {
-   CLEANUP_SPACE,
-   CLEANUP_NONE,
-   CLEANUP_SCISSORS,
-   CLEANUP_ALL
-} cleanup_mode;
+static enum commit_msg_cleanup_mode cleanup_mode;
 static const char *cleanup_arg;
 
 static enum commit_whence whence;
@@ -673,7 +668,7 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
struct strbuf sb = STRBUF_INIT;
const char *hook_arg1 = NULL;
const char *hook_arg2 = NULL;
-   int clean_message_contents = (cleanup_mode != CLEANUP_NONE);
+   int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
int old_display_comment_prefix;
 
/* This checks and barfs if author is badly specified */
@@ -812,7 +807,7 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
struct ident_split ci, ai;
 
if (whence != FROM_COMMIT) {
-   if (cleanup_mode == CLEANUP_SCISSORS)
+   if (cleanup_mode == COMMIT_MSG_CLEANUP_SCISSORS)
wt_status_add_cut_line(s->fp);
status_printf_ln(s, GIT_COLOR_NORMAL,
whence == FROM_MERGE
@@ -832,14 +827,15 @@ static int prepare_to_commit(const char *index_file, 
const char *prefix,
}
 
fprintf(s->fp, "\n");
-   if (cleanup_mode == CLEANUP_ALL)
+   if (cleanup_mode == COMMIT_MSG_CLEANUP_ALL)
status_printf(s, GIT_COLOR_NORMAL,
_("Please enter the commit message for your 
changes."
  " Lines starting\nwith '%c' will be ignored, 
and an empty"
  " message aborts the commit.\n"), 
comment_line_char);
-   else if (cleanup_mode == CLEANUP_SCISSORS && whence == 
FROM_COMMIT)
+   else if (cleanup_mode == COMMIT_MSG_CLEANUP_SCISSORS &&
+whence == FROM_COMMIT)
wt_status_add_cut_line(s->fp);
-   else /* CLEANUP_SPACE, that is. */
+   else /* COMMIT_MSG_CLEANUP_SPACE, that is. */
status_printf(s, GIT_COLOR_NORMAL,
_("Please enter the commit message for your 
changes."
  " Lines starting\n"
@@ -984,65 +980,6 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
return 1;
 }
 
-static int rest_is_empty(struct strbuf *sb, int start)
-{
-   int i, eol;
-   const char *nl;
-
-   /* Check if the rest is just whitespace and Signed-off-by's. */
-   for (i = start; i < sb->len; i++) {
-   nl = memchr(sb->buf + i, '\n', sb->len - i);
-   if (nl)
-   eol = nl - sb->buf;
-   else
-   eol = sb->len;
-
-   if (strlen(sign_off_header) <= eol - i &&
-   starts_with(sb->buf + i, sign_off_header)) {
-   i = eol;
-   continue;
-   }
-   while (i < eol)
-   if (!isspace(sb->buf[i++]))
-   return 0;
-   }
-
-   return 1;
-}
-
-/*
- * Find out if the message in the strbuf contains only whitespace and
- * Signed-off-by lines.
- */
-static int message_is_empty(struct strbuf *sb)
-{
-   if (cleanup_mode == CLEANUP_NONE && sb->len)
-   return 0;
-   return rest_is_empty(sb, 0);
-}
-
-/*
- * See if the user edited the message in the editor or left what
- * was in the template intact
- */
-static int template_untouched(struct strbuf *sb)
-{
-   struct strbuf tmpl = STRBUF_INIT;
-   const char *start;
-
-   if (cleanup_mode == CLEANUP_NONE && sb->len)
-   return 0;
-
-   if (!template_file || strbuf_read_file(&tmpl, template_file, 0) <= 0)
-   return 0;
-
-   strbuf_stripspace(&tmpl, cleanup_mode == CLEANUP_ALL);
-   if (!skip_prefix(sb->buf

[PATCH v2 6/9] sequencer: don't die in print_commit_summary()

2017-11-10 Thread Phillip Wood

From: Phillip Wood 

Return an error rather than dying so that the sequencer can exit
cleanly once it starts committing without forking 'git commit'

Signed-off-by: Phillip Wood 
---
 builtin/commit.c |  3 ++-
 sequencer.c  | 17 +++--
 sequencer.h  |  4 ++--
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 
22d260197e84a828f984575cfa92789391531ca3..e924d8a14c8532a78d072420aeb662ce616181a4
 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1598,7 +1598,8 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
flags |= SUMMARY_INITIAL_COMMIT;
if (author_date_is_interesting())
flags |= SUMMARY_SHOW_AUTHOR_DATE;
-   print_commit_summary(prefix, &oid, flags);
+   if (print_commit_summary(prefix, &oid, flags))
+   exit(128);
}
 
UNLEAK(err);
diff --git a/sequencer.c b/sequencer.c
index 
03207db71821c0c3d1c442d3b01b11e3b010367e..61bb75352202cf896ffec41e972f366e94d569c1
 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -877,8 +877,8 @@ static const char *implicit_ident_advice(void)
 
 }
 
-void print_commit_summary(const char *prefix, const struct object_id *oid,
- unsigned int flags)
+int print_commit_summary(const char *prefix, const struct object_id *oid,
+unsigned int flags)
 {
struct rev_info rev;
struct commit *commit;
@@ -887,12 +887,13 @@ void print_commit_summary(const char *prefix, const 
struct object_id *oid,
struct pretty_print_context pctx = {0};
struct strbuf author_ident = STRBUF_INIT;
struct strbuf committer_ident = STRBUF_INIT;
+   int ret = 0;
 
commit = lookup_commit(oid);
if (!commit)
-   die(_("couldn't look up newly created commit"));
+   return error(_("couldn't look up newly created commit"));
if (parse_commit(commit))
-   die(_("could not parse newly created commit"));
+   return error(_("could not parse newly created commit"));
 
strbuf_addstr(&format, "format:%h] %s");
 
@@ -936,8 +937,10 @@ void print_commit_summary(const char *prefix, const struct 
object_id *oid,
diff_setup_done(&rev.diffopt);
 
head = resolve_ref_unsafe("HEAD", 0, NULL, NULL);
-   if (!head)
-   die_errno(_("unable to resolve HEAD after creating commit"));
+   if (!head) {
+   ret = error_errno(_("unable to resolve HEAD after creating 
commit"));
+   goto out;
+   }
if (!strcmp(head, "HEAD"))
head = _("detached HEAD");
else
@@ -950,7 +953,9 @@ void print_commit_summary(const char *prefix, const struct 
object_id *oid,
log_tree_commit(&rev, commit);
}
 
+out:
strbuf_release(&format);
+   return ret;
 }
 
 static int is_original_commit_empty(struct commit *commit)
diff --git a/sequencer.h b/sequencer.h
index 
8e0dbe59bbc1ccd3847454faf8bd09d5ca1ff93a..e4040cac08cb69c9b91db3fe77ae392c8d6d0f50
 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -78,6 +78,6 @@ void commit_post_rewrite(const struct commit *current_head,
 
 #define SUMMARY_INITIAL_COMMIT   (1 << 0)
 #define SUMMARY_SHOW_AUTHOR_DATE (1 << 1)
-void print_commit_summary(const char *prefix, const struct object_id *oid,
- unsigned int flags);
+int print_commit_summary(const char *prefix, const struct object_id *oid,
+unsigned int flags);
 #endif
-- 
2.15.0

[PATCH v2 1/9] t3404: check intermediate squash messages

2017-11-10 Thread Phillip Wood

From: Phillip Wood 

When there is more than one squash/fixup command in a row check the
intermediate messages are correct.

Signed-off-by: Phillip Wood 
---
 t/t3404-rebase-interactive.sh | 4 
 1 file changed, 4 insertions(+)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 
6a82d1ed876dd5d1073dc63be8ba5720adbf12e3..9ed0a244e6cdf34c7caca8232f0c0a8cf4864c42
 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -453,6 +453,10 @@ test_expect_success C_LOCALE_OUTPUT 'squash and fixup 
generate correct log messa
git rebase -i $base &&
git cat-file commit HEAD | sed -e 1,/^\$/d > actual-squash-fixup &&
test_cmp expect-squash-fixup actual-squash-fixup &&
+   git cat-file commit HEAD@{2} |
+   grep "^# This is a combination of 3 commits\."  &&
+   git cat-file commit HEAD@{3} |
+   grep "^# This is a combination of 2 commits\."  &&
git checkout to-be-rebased &&
git branch -D squash-fixup
 '
-- 
2.15.0

[PATCH v2 5/9] commit: move print_commit_summary() to libgit

2017-11-10 Thread Phillip Wood

From: Phillip Wood 

Move print_commit_summary() from builtin/commit.c to sequencer.c so it
can be shared with other commands. The function is modified by
changing the last argument to a flag so callers can specify whether
they want to show the author date in addition to specifying if this is
an initial commit.

Signed-off-by: Phillip Wood 
---

Notes:
changes since v1:
 - convert flags passed to print_commit_summary() to unsigned int

 builtin/commit.c | 128 ---
 sequencer.c  | 117 ++
 sequencer.h  |   5 +++
 3 files changed, 131 insertions(+), 119 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 
3bc5dff2c00c9556e6c032381d5baf18246bf8dc..22d260197e84a828f984575cfa92789391531ca3
 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -43,31 +43,6 @@ static const char * const builtin_status_usage[] = {
NULL
 };
 
-static const char implicit_ident_advice_noconfig[] =
-N_("Your name and email address were configured automatically based\n"
-"on your username and hostname. Please check that they are accurate.\n"
-"You can suppress this message by setting them explicitly. Run the\n"
-"following command and follow the instructions in your editor to edit\n"
-"your configuration file:\n"
-"\n"
-"git config --global --edit\n"
-"\n"
-"After doing this, you may fix the identity used for this commit with:\n"
-"\n"
-"git commit --amend --reset-author\n");
-
-static const char implicit_ident_advice_config[] =
-N_("Your name and email address were configured automatically based\n"
-"on your username and hostname. Please check that they are accurate.\n"
-"You can suppress this message by setting them explicitly:\n"
-"\n"
-"git config --global user.name \"Your Name\"\n"
-"git config --global user.email y...@example.com\n"
-"\n"
-"After doing this, you may fix the identity used for this commit with:\n"
-"\n"
-"git commit --amend --reset-author\n");
-
 static const char empty_amend_advice[] =
 N_("You asked to amend the most recent commit, but doing so would make\n"
 "it empty. You can repeat your command with --allow-empty, or you can\n"
@@ -1355,98 +1330,6 @@ int cmd_status(int argc, const char **argv, const char 
*prefix)
return 0;
 }
 
-static const char *implicit_ident_advice(void)
-{
-   char *user_config = expand_user_path("~/.gitconfig", 0);
-   char *xdg_config = xdg_config_home("config");
-   int config_exists = file_exists(user_config) || file_exists(xdg_config);
-
-   free(user_config);
-   free(xdg_config);
-
-   if (config_exists)
-   return _(implicit_ident_advice_config);
-   else
-   return _(implicit_ident_advice_noconfig);
-
-}
-
-static void print_summary(const char *prefix, const struct object_id *oid,
- int initial_commit)
-{
-   struct rev_info rev;
-   struct commit *commit;
-   struct strbuf format = STRBUF_INIT;
-   const char *head;
-   struct pretty_print_context pctx = {0};
-   struct strbuf author_ident = STRBUF_INIT;
-   struct strbuf committer_ident = STRBUF_INIT;
-
-   commit = lookup_commit(oid);
-   if (!commit)
-   die(_("couldn't look up newly created commit"));
-   if (parse_commit(commit))
-   die(_("could not parse newly created commit"));
-
-   strbuf_addstr(&format, "format:%h] %s");
-
-   format_commit_message(commit, "%an <%ae>", &author_ident, &pctx);
-   format_commit_message(commit, "%cn <%ce>", &committer_ident, &pctx);
-   if (strbuf_cmp(&author_ident, &committer_ident)) {
-   strbuf_addstr(&format, "\n Author: ");
-   strbuf_addbuf_percentquote(&format, &author_ident);
-   }
-   if (author_date_is_interesting()) {
-   struct strbuf date = STRBUF_INIT;
-   format_commit_message(commit, "%ad", &date, &pctx);
-   strbuf_addstr(&format, "\n Date: ");
-   strbuf_addbuf_percentquote(&format, &date);
-   strbuf_release(&date);
-   }
-   if (!committer_ident_sufficiently_given()) {
-   strbuf_addstr(&format, "\n Committer: ");
-   strbuf_addbuf_percentquote(&format, &committer_ident);
-   if (advice_implicit_identity) {
-   strbuf_addch(&format, '\n');
-   strbuf_addstr(&format, implicit_ident_advice());
-   }
-   }
-   strbuf_release(&author_ident);
-   strbuf_release(&committer_ident);
-
-   init_revisions(&rev, prefix);
-   setup_revisions(0, NULL, &rev, NULL);
-
-   rev.diff = 1;
-   rev.diffopt.output_format =
-   DIFF_FORMAT_SHORTSTAT | DIFF_FORMAT_SUMMARY;
-
-   rev.verbose_header = 1;
-   rev.show_root_diff = 1;
-   get_commit_format(format.buf, &rev);
-   rev.always_show_header = 0;
-   rev.diffopt.detect_ren

[PATCH v2 4/9] commit: move post-rewrite code to libgit

2017-11-10 Thread Phillip Wood

From: Phillip Wood 

Move run_rewrite_hook() from bulitin/commit.c to sequencer.c so it can
be shared with other commands and add a new function
commit_post_rewrite() based on the code in builtin/commit.c that
encapsulates rewriting notes and running the post-rewrite hook. Once
the sequencer learns how to create commits without forking 'git
commit' these functions will be used when squashing commits.

Signed-off-by: Phillip Wood 
---

Notes:
changes since v1:
 - reword commit message to explain why the code is being moved

 builtin/commit.c | 42 +-
 sequencer.c  | 47 +++
 sequencer.h  |  2 ++
 3 files changed, 50 insertions(+), 41 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 
7c2816644a665f7b7d8f4b6c0a5855aef01b..3bc5dff2c00c9556e6c032381d5baf18246bf8dc
 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -31,9 +31,7 @@
 #include "gpg-interface.h"
 #include "column.h"
 #include "sequencer.h"
-#include "notes-utils.h"
 #include "mailmap.h"
-#include "sigchain.h"
 
 static const char * const builtin_commit_usage[] = {
N_("git commit [] [--] ..."),
@@ -1478,37 +1476,6 @@ static int git_commit_config(const char *k, const char 
*v, void *cb)
return git_status_config(k, v, s);
 }
 
-static int run_rewrite_hook(const struct object_id *oldoid,
-   const struct object_id *newoid)
-{
-   struct child_process proc = CHILD_PROCESS_INIT;
-   const char *argv[3];
-   int code;
-   struct strbuf sb = STRBUF_INIT;
-
-   argv[0] = find_hook("post-rewrite");
-   if (!argv[0])
-   return 0;
-
-   argv[1] = "amend";
-   argv[2] = NULL;
-
-   proc.argv = argv;
-   proc.in = -1;
-   proc.stdout_to_stderr = 1;
-
-   code = start_command(&proc);
-   if (code)
-   return code;
-   strbuf_addf(&sb, "%s %s\n", oid_to_hex(oldoid), oid_to_hex(newoid));
-   sigchain_push(SIGPIPE, SIG_IGN);
-   write_in_full(proc.in, sb.buf, sb.len);
-   close(proc.in);
-   strbuf_release(&sb);
-   sigchain_pop(SIGPIPE);
-   return finish_command(&proc);
-}
-
 int run_commit_hook(int editor_is_used, const char *index_file, const char 
*name, ...)
 {
struct argv_array hook_env = ARGV_ARRAY_INIT;
@@ -1739,14 +1706,7 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
rerere(0);
run_commit_hook(use_editor, get_index_file(), "post-commit", NULL);
if (amend && !no_post_rewrite) {
-   struct notes_rewrite_cfg *cfg;
-   cfg = init_copy_notes_for_rewrite("amend");
-   if (cfg) {
-   /* we are amending, so current_head is not NULL */
-   copy_note_for_rewrite(cfg, ¤t_head->object.oid, 
&oid);
-   finish_copy_notes_for_rewrite(cfg, "Notes added by 'git 
commit --amend'");
-   }
-   run_rewrite_hook(¤t_head->object.oid, &oid);
+   commit_post_rewrite(current_head, &oid);
}
if (!quiet)
print_summary(prefix, &oid, !current_head);
diff --git a/sequencer.c b/sequencer.c
index 
fcd8e92531a3fb1f0bc1294925e87b3624ada909..5529e5df1cbe9045b000ecbff6c28a0bee736ccc
 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -21,6 +21,8 @@
 #include "log-tree.h"
 #include "wt-status.h"
 #include "hashmap.h"
+#include "notes-utils.h"
+#include "sigchain.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -789,6 +791,51 @@ int update_head_with_reflog(const struct commit *old_head,
return ret;
 }
 
+static int run_rewrite_hook(const struct object_id *oldoid,
+   const struct object_id *newoid)
+{
+   struct child_process proc = CHILD_PROCESS_INIT;
+   const char *argv[3];
+   int code;
+   struct strbuf sb = STRBUF_INIT;
+
+   argv[0] = find_hook("post-rewrite");
+   if (!argv[0])
+   return 0;
+
+   argv[1] = "amend";
+   argv[2] = NULL;
+
+   proc.argv = argv;
+   proc.in = -1;
+   proc.stdout_to_stderr = 1;
+
+   code = start_command(&proc);
+   if (code)
+   return code;
+   strbuf_addf(&sb, "%s %s\n", oid_to_hex(oldoid), oid_to_hex(newoid));
+   sigchain_push(SIGPIPE, SIG_IGN);
+   write_in_full(proc.in, sb.buf, sb.len);
+   close(proc.in);
+   strbuf_release(&sb);
+   sigchain_pop(SIGPIPE);
+   return finish_command(&proc);
+}
+
+void commit_post_rewrite(const struct commit *old_head,
+const struct object_id *new_head)
+{
+   struct notes_rewrite_cfg *cfg;
+
+   cfg = init_copy_notes_for_rewrite("amend");
+   if (cfg) {
+   /* we are amending, so old_head is not NULL */
+   copy_note_for_rewrite(cfg, &old_head->object.oid, new_head);
+   finish_copy_notes_for_rewrite(cfg, "Notes added by 'git commi

[PATCH v2 9/9] sequencer: try to commit without forking 'git commit'

2017-11-10 Thread Phillip Wood

From: Phillip Wood 

If the commit message does not need to be edited then create the
commit without forking 'git commit'. Taking the best time of ten runs
with a warm cache this reduces the time taken to cherry-pick 10
commits by 27% (from 282ms to 204ms), and the time taken by 'git
rebase --continue' to pick 10 commits by 45% (from 386ms to 212ms) on
my computer running linux. Some of greater saving for rebase is
because it no longer wastes time creating the commit summary just to
throw it away.

The code to create the commit is based on builtin/commit.c. It is
simplified as it doesn't have to deal with merges and modified so that
it does not die but returns an error to make sure the sequencer exits
cleanly, as it would when forking 'git commit'

Even when not forking 'git commit' the commit message is written to a
file and CHERRY_PICK_HEAD is created unnecessarily. This could be
eliminated in future. I hacked up a version that does not write these
files and just passed an strbuf (with the wrong message for fixup and
squash commands) to do_commit() but I couldn't measure any significant
time difference when running cherry-pick or rebase. I think
eliminating the writes properly for rebase would require a bit of
effort as the code would need to be restructured.

Signed-off-by: Phillip Wood 
---

Notes:
changes since v1:
 - added comments to explain return value of try_to_commit()
 - removed unnecessary NULL tests before calling free()
 - style cleanups
 - corrected commit message
 - prefixed cleanup_mode constants to reflect the changes to patch 2
   in this series

 sequencer.c | 178 +++-
 1 file changed, 176 insertions(+), 2 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 
0fd98391cb7dfcff6976271f44a56e897593e2c0..1f65e82696043fd9fb5ebb9f2b78445aa5a9599e
 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -592,6 +592,18 @@ static int read_env_script(struct argv_array *env)
return 0;
 }
 
+static char *get_author(const char* message)
+{
+   size_t len;
+   const char *a;
+
+   a = find_commit_header(message, "author", &len);
+   if (a)
+   return xmemdupz(a, len);
+
+   return NULL;
+}
+
 static const char staged_changes_advice[] =
 N_("you have staged changes in your working tree\n"
 "If these changes are meant to be squashed into the previous commit, run:\n"
@@ -987,6 +999,160 @@ int print_commit_summary(const char *prefix, const struct 
object_id *oid,
return ret;
 }
 
+static int parse_head(struct commit **head)
+{
+   struct commit *current_head;
+   struct object_id oid;
+
+   if (get_oid("HEAD", &oid)) {
+   current_head = NULL;
+   } else {
+   current_head = lookup_commit_reference(&oid);
+   if (!current_head)
+   return error(_("could not parse HEAD"));
+   if (oidcmp(&oid, ¤t_head->object.oid)) {
+   warning(_("HEAD %s is not a commit!"),
+   oid_to_hex(&oid));
+   }
+   if (parse_commit(current_head))
+   return error(_("could not parse HEAD commit"));
+   }
+   *head = current_head;
+
+   return 0;
+}
+
+/*
+ * Try to commit without forking 'git commit'. In some cases we need
+ * to run 'git commit' to display an error message
+ *
+ * Returns:
+ *  -1 - error unable to commit
+ *   0 - success
+ *   1 - run 'git commit'
+ */
+static int try_to_commit(struct strbuf *msg, const char *author,
+struct replay_opts *opts, unsigned int flags,
+struct object_id *oid)
+{
+   struct object_id tree;
+   struct commit *current_head;
+   struct commit_list *parents = NULL;
+   struct commit_extra_header *extra = NULL;
+   struct strbuf err = STRBUF_INIT;
+   struct strbuf amend_msg = STRBUF_INIT;
+   char *amend_author = NULL;
+   const char *gpg_sign;
+   enum commit_msg_cleanup_mode cleanup;
+   int res = 0;
+
+   if (parse_head(¤t_head))
+   return -1;
+
+   if (flags & AMEND_MSG) {
+   const char *exclude_gpgsig[] = { "gpgsig", NULL };
+   const char *out_enc = get_commit_output_encoding();
+   const char *message = logmsg_reencode(current_head, NULL,
+ out_enc);
+
+   if (!msg) {
+   const char *orig_message = NULL;
+
+   find_commit_subject(message, &orig_message);
+   msg = &amend_msg;
+   strbuf_addstr(msg, orig_message);
+   }
+   author = amend_author = get_author(message);
+   unuse_commit_buffer(current_head, message);
+   if (!author) {
+   res = error(_("unable to parse commit author"));
+   goto out;

[PATCH v2 8/9] sequencer: load commit related config

2017-11-10 Thread Phillip Wood

From: Phillip Wood 

Load default values for message cleanup and gpg signing of commits in
preparation for committing without forking 'git commit'.

Signed-off-by: Phillip Wood 
---

Notes:
changes since v1:
 - renamed git_revert_config() to common_config()
 - prefixed cleanup_mode constants to reflect the changes to patch 2
   in this series

 builtin/rebase--helper.c | 13 -
 builtin/revert.c | 15 +--
 sequencer.c  | 34 ++
 sequencer.h  |  1 +
 4 files changed, 60 insertions(+), 3 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 
f8519363a393862b6857acab037e74367c7f2134..68194d3aed950f327a8bc624fa1991478dfea01e
 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -9,6 +9,17 @@ static const char * const builtin_rebase_helper_usage[] = {
NULL
 };
 
+static int git_rebase_helper_config(const char *k, const char *v, void *cb)
+{
+   int status;
+
+   status = git_sequencer_config(k, v, NULL);
+   if (status)
+   return status;
+
+   return git_default_config(k, v, NULL);
+}
+
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
struct replay_opts opts = REPLAY_OPTS_INIT;
@@ -39,7 +50,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
OPT_END()
};
 
-   git_config(git_default_config, NULL);
+   git_config(git_rebase_helper_config, NULL);
 
opts.action = REPLAY_INTERACTIVE_REBASE;
opts.allow_ff = 1;
diff --git a/builtin/revert.c b/builtin/revert.c
index 
b9d927eb09c9ed87c84681df1396f4e6d9b13c97..1938825efa6ad20ede5aba57f097863aeb33d1d5
 100644
--- a/builtin/revert.c
+++ b/builtin/revert.c
@@ -31,6 +31,17 @@ static const char * const cherry_pick_usage[] = {
NULL
 };
 
+static int common_config(const char *k, const char *v, void *cb)
+{
+   int status;
+
+   status = git_sequencer_config(k, v, NULL);
+   if (status)
+   return status;
+
+   return git_default_config(k, v, NULL);
+}
+
 static const char *action_name(const struct replay_opts *opts)
 {
return opts->action == REPLAY_REVERT ? "revert" : "cherry-pick";
@@ -208,7 +219,7 @@ int cmd_revert(int argc, const char **argv, const char 
*prefix)
if (isatty(0))
opts.edit = 1;
opts.action = REPLAY_REVERT;
-   git_config(git_default_config, NULL);
+   git_config(common_config, NULL);
res = run_sequencer(argc, argv, &opts);
if (res < 0)
die(_("revert failed"));
@@ -221,7 +232,7 @@ int cmd_cherry_pick(int argc, const char **argv, const char 
*prefix)
int res;
 
opts.action = REPLAY_PICK;
-   git_config(git_default_config, NULL);
+   git_config(common_config, NULL);
res = run_sequencer(argc, argv, &opts);
if (res < 0)
die(_("cherry-pick failed"));
diff --git a/sequencer.c b/sequencer.c
index 
497764ea5db3a3ba2802387316c1d5024d47d7eb..0fd98391cb7dfcff6976271f44a56e897593e2c0
 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -688,6 +688,40 @@ static int run_git_commit(const char *defmsg, struct 
replay_opts *opts,
return run_command(&cmd);
 }
 
+static enum commit_msg_cleanup_mode default_msg_cleanup =
+   COMMIT_MSG_CLEANUP_NONE;
+static char *default_gpg_sign;
+
+int git_sequencer_config(const char *k, const char *v, void *cb)
+{
+   if (!strcmp(k, "commit.cleanup")) {
+   int status;
+   const char *s;
+
+   status = git_config_string(&s, k, v);
+   if (status)
+   return status;
+
+   if (!strcmp(s, "verbatim"))
+   default_msg_cleanup = COMMIT_MSG_CLEANUP_NONE;
+   else if (!strcmp(s, "whitespace"))
+   default_msg_cleanup = COMMIT_MSG_CLEANUP_SPACE;
+   else if (!strcmp(s, "strip"))
+   default_msg_cleanup = COMMIT_MSG_CLEANUP_ALL;
+   else if (!strcmp(s, "scissors"))
+   default_msg_cleanup = COMMIT_MSG_CLEANUP_NONE;
+
+   return status;
+   }
+
+   if (!strcmp(k, "commit.gpgsign")) {
+   default_gpg_sign = git_config_bool(k, v) ? "" : NULL;
+   return 0;
+   }
+
+   return git_gpg_config(k, v, NULL);
+}
+
 static int rest_is_empty(const struct strbuf *sb, int start)
 {
int i, eol;
diff --git a/sequencer.h b/sequencer.h
index 
e4040cac08cb69c9b91db3fe77ae392c8d6d0f50..27f34be4003589790be686c901a19059f4642c22
 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -57,6 +57,7 @@ extern const char sign_off_header[];
 
 void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
 void append_conflicts_hint(struct strbuf *msgbuf);
+int git_sequencer_config(const char *k, const char *v, void

[PATCH v2 0/9] sequencer: dont't fork git commit

2017-11-10 Thread Phillip Wood

From: Phillip Wood 

Thanks for the feedback on v1 I've updated the patches as
suggested. See the comments on each patch for what has changed. I've
added a patch to the start of the series to test the commit messages
of intermediate squashes. I've added this as the RFC version of this
series did not create these correctly but the test suite passed.

Here's the summary from the previous version
These patches teach the sequencer to create commits without forking
git commit when the commit message does not need to be edited. This
speeds up cherry picking 10 commits by 26% and picking 10 commits with
rebase --continue by 44%. The first few patches move bits of
builtin/commit.c to sequencer.c. The last two patches actually
implement creating commits in sequencer.c.

Phillip Wood (9):
  t3404: check intermediate squash messages
  commit: move empty message checks to libgit
  Add a function to update HEAD after creating a commit
  commit: move post-rewrite code to libgit
  commit: move print_commit_summary() to libgit
  sequencer: don't die in print_commit_summary()
  sequencer: simplify adding Signed-off-by: trailer
  sequencer: load commit related config
  sequencer: try to commit without forking 'git commit'

 builtin/commit.c  | 290 +++--
 builtin/rebase--helper.c  |  13 +-
 builtin/revert.c  |  15 +-
 sequencer.c   | 489 +-
 sequencer.h   |  23 ++
 t/t3404-rebase-interactive.sh |   4 +
 6 files changed, 565 insertions(+), 269 deletions(-)

-- 
2.15.0

[PATCH v2 3/9] Add a function to update HEAD after creating a commit

2017-11-10 Thread Phillip Wood

From: Phillip Wood 

Add update_head() based on the code that updates HEAD after committing
in builtin/commit.c that can be called by 'git commit' and other
commands.

Signed-off-by: Phillip Wood 
---

Notes:
changes since v1:
 - rename update_head() to update_head_with_reflog()

 builtin/commit.c | 20 ++--
 sequencer.c  | 39 ++-
 sequencer.h  |  4 
 3 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 
dbc160c525e7a9249b7c7df2180495a4c7102296..7c2816644a665f7b7d8f4b6c0a5855aef01b
 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1591,13 +1591,11 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
struct strbuf sb = STRBUF_INIT;
struct strbuf author_ident = STRBUF_INIT;
const char *index_file, *reflog_msg;
-   char *nl;
struct object_id oid;
struct commit_list *parents = NULL;
struct stat statbuf;
struct commit *current_head = NULL;
struct commit_extra_header *extra = NULL;
-   struct ref_transaction *transaction;
struct strbuf err = STRBUF_INIT;
 
if (argc == 2 && !strcmp(argv[1], "-h"))
@@ -1720,25 +1718,11 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
strbuf_release(&author_ident);
free_commit_extra_headers(extra);
 
-   nl = strchr(sb.buf, '\n');
-   if (nl)
-   strbuf_setlen(&sb, nl + 1 - sb.buf);
-   else
-   strbuf_addch(&sb, '\n');
-   strbuf_insert(&sb, 0, reflog_msg, strlen(reflog_msg));
-   strbuf_insert(&sb, strlen(reflog_msg), ": ", 2);
-
-   transaction = ref_transaction_begin(&err);
-   if (!transaction ||
-   ref_transaction_update(transaction, "HEAD", &oid,
-  current_head
-  ? ¤t_head->object.oid : &null_oid,
-  0, sb.buf, &err) ||
-   ref_transaction_commit(transaction, &err)) {
+   if (update_head_with_reflog(current_head, &oid, reflog_msg, &sb,
+   &err)) {
rollback_index_files();
die("%s", err.buf);
}
-   ref_transaction_free(transaction);
 
unlink(git_path_cherry_pick_head());
unlink(git_path_revert_head());
diff --git a/sequencer.c b/sequencer.c
index 
23c250f16cfb7620215bb9c99b8e12d726dc9191..fcd8e92531a3fb1f0bc1294925e87b3624ada909
 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1,10 +1,10 @@
 #include "cache.h"
 #include "config.h"
 #include "lockfile.h"
-#include "sequencer.h"
 #include "dir.h"
 #include "object.h"
 #include "commit.h"
+#include "sequencer.h"
 #include "tag.h"
 #include "run-command.h"
 #include "exec_cmd.h"
@@ -752,6 +752,43 @@ int template_untouched(const struct strbuf *sb, const char 
*template_file,
return rest_is_empty(sb, start - sb->buf);
 }
 
+int update_head_with_reflog(const struct commit *old_head,
+   const struct object_id *new_head,
+   const char *action, const struct strbuf *msg,
+   struct strbuf *err)
+{
+   struct ref_transaction *transaction;
+   struct strbuf sb = STRBUF_INIT;
+   const char *nl;
+   int ret = 0;
+
+   if (action) {
+   strbuf_addstr(&sb, action);
+   strbuf_addstr(&sb, ": ");
+   }
+
+   nl = strchr(msg->buf, '\n');
+   if (nl) {
+   strbuf_add(&sb, msg->buf, nl + 1 - msg->buf);
+   } else {
+   strbuf_addbuf(&sb, msg);
+   strbuf_addch(&sb, '\n');
+   }
+
+   transaction = ref_transaction_begin(err);
+   if (!transaction ||
+   ref_transaction_update(transaction, "HEAD", new_head,
+  old_head ? &old_head->object.oid : &null_oid,
+  0, sb.buf, err) ||
+   ref_transaction_commit(transaction, err)) {
+   ret = -1;
+   }
+   ref_transaction_free(transaction);
+   strbuf_release(&sb);
+
+   return ret;
+}
+
 static int is_original_commit_empty(struct commit *commit)
 {
const struct object_id *ptree_oid;
diff --git a/sequencer.h b/sequencer.h
index 
82e57713a2940c5d65ccac013c3f42c55cc12baf..82035ced129d35ff8ba64710477531a9a713b631
 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -69,4 +69,8 @@ int message_is_empty(const struct strbuf *sb,
 enum commit_msg_cleanup_mode cleanup_mode);
 int template_untouched(const struct strbuf *sb, const char *template_file,
   enum commit_msg_cleanup_mode cleanup_mode);
+int update_head_with_reflog(const struct commit *old_head,
+   const struct object_id *new_head,
+   const char* action, const struct strbuf *msg,
+   struct strbuf *err);
 #endif
-- 
2.15.0

[PATCH v2 7/9] sequencer: simplify adding Signed-off-by: trailer

2017-11-10 Thread Phillip Wood

From: Phillip Wood 

Add the Signed-off-by: trailer in one place rather than adding it to
the message when doing a recursive merge and specifying '--signoff'
when running 'git commit'. This means that if there are conflicts when
merging with a strategy other than 'recursive' the Signed-off-by:
trailer will be added if the user commits the resolution themselves
without passing '--signoff' to 'git commit'. It also simplifies the
in-process commit that is about to be added to the sequencer.

Signed-off-by: Phillip Wood 
---
 sequencer.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 
61bb75352202cf896ffec41e972f366e94d569c1..497764ea5db3a3ba2802387316c1d5024d47d7eb
 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -477,9 +477,6 @@ static int do_recursive_merge(struct commit *base, struct 
commit *next,
_(action_name(opts)));
rollback_lock_file(&index_lock);
 
-   if (opts->signoff)
-   append_signoff(msgbuf, 0, 0);
-
if (!clean)
append_conflicts_hint(msgbuf);
 
@@ -657,8 +654,6 @@ static int run_git_commit(const char *defmsg, struct 
replay_opts *opts,
argv_array_push(&cmd.args, "--amend");
if (opts->gpg_sign)
argv_array_pushf(&cmd.args, "-S%s", opts->gpg_sign);
-   if (opts->signoff)
-   argv_array_push(&cmd.args, "-s");
if (defmsg)
argv_array_pushl(&cmd.args, "-F", defmsg, NULL);
if ((flags & CLEANUP_MSG))
@@ -1350,6 +1345,9 @@ static int do_pick_commit(enum todo_command command, 
struct commit *commit,
}
}
 
+   if (opts->signoff)
+   append_signoff(&msgbuf, 0, 0);
+
if (is_rebase_i(opts) && write_author_script(msg.message) < 0)
res = -1;
else if (!opts->strategy || !strcmp(opts->strategy, "recursive") || 
command == TODO_REVERT) {
-- 
2.15.0

Re: [PATCH v2 Outreachy] mru: use double-linked list from list.h

2017-11-10 Thread Оля Тележная

> We hung back on it to leave it as low-hanging fruit for other Outreachy
> applicants. Perhaps Olga would like to pick it up now that the
> application period is over.

It's absolutely not a problem for me, I can do that as one more
warm-up exercise in the beginning of the internship.

Thanks!

I wait for your prompt response.

2017-11-10 Thread SAM AZADA

Good day,

I am Mr. Sam Azada from Burkina Faso  a Minister confide on me to look
for foreign partner who will assist him to invest the sum of  Fifty
Million  Dollars  ($50,000,000) in your country.

He has investment interest in mining, exotic properties for commercial
resident, development properties, hotels and any other viable
investment opportunities in your country based on your recommendation
will be highly welcomed.

Hence your co -operation is highly needed to actualize this investment project

I wait for your prompt response.

Sincerely yours

Mr Sam Azada.

Re: cherry-pick very slow on big repository

2017-11-10 Thread Peter Krefting


Jeff King:


Can you get a backtrace? I'd do something like:


Seems that it spends most time in diffcore_count_changes(), that is 
where it hits whenever I hit Ctrl+C (various line numbers 199-207 in 
diffcore-delta.c; this is on the v2.15.0 tag).


(gdb) bt
#0  diffcore_count_changes (src=src@entry=0x5db99970,
dst=dst@entry=0x5d6a4810,
src_count_p=src_count_p@entry=0x5db8,
dst_count_p=dst_count_p@entry=0x5d6a4838,
src_copied=src_copied@entry=0x7fffd3e0,
literal_added=literal_added@entry=0x7fffd3f0)
at diffcore-delta.c:203
#1  0x556dee1a in estimate_similarity (minimum_score=3,
dst=0x5d6a4810, src=0x5db99970) at diffcore-rename.c:193
#2  diffcore_rename (options=options@entry=0x7fffd4f0)
at diffcore-rename.c:560
#3  0x55623d83 in diffcore_std (
options=options@entry=0x7fffd4f0) at diff.c:5846
#4  0x5564ab46 in get_renames (o=o@entry=0x7fffd850,
tree=tree@entry=0x559d1b98,
o_tree=o_tree@entry=0x559d1bc0,
a_tree=a_tree@entry=0x559d1b98,
b_tree=b_tree@entry=0x559d1b70,
entries=entries@entry=0x59351d20) at merge-recursive.c:554
#5  0x5564e7d9 in merge_trees (o=o@entry=0x7fffd850,
head=head@entry=0x559d1b98, merge=,
merge@entry=0x559d1b70, common=,
common@entry=0x559d1bc0, result=result@entry=0x7fffd830)
at merge-recursive.c:1985
#6  0x5569b2cc in do_recursive_merge (opts=0x7fffdf70,
msgbuf=0x7fffd810, head=0x7fffd7f0,
next_label=, base_label=,
next=, base=0x559c1ba0) at sequencer.c:459
#7  do_pick_commit (command=TODO_PICK,
commit=commit@entry=0x559c1b60,
opts=opts@entry=0x7fffdf70, final_fixup=final_fixup@entry=0)
at sequencer.c:1088
#8  0x5569e324 in single_pick (opts=0x7fffdf70,
cmit=0x559c1b60) at sequencer.c:2306
#9  sequencer_pick_revisions (opts=0x7fffdf70)
at sequencer.c:2355
#10 0x555d4097 in run_sequencer (argc=1, argc@entry=3,
argv=argv@entry=0x7fffe320, opts=,
opts@entry=0x7fffdf70) at builtin/revert.c:200
#11 0x555d449a in cmd_cherry_pick (argc=3,
argv=0x7fffe320, prefix=)
at builtin/revert.c:225
#12 0x55567a38 in run_builtin (argv=,
argc=, p=) at git.c:346
#13 handle_builtin (argc=3, argv=0x7fffe320) at git.c:554
#14 0x55567cf6 in run_argv (argv=0x7fffe0e0,
argcp=0x7fffe0ec) at git.c:606
#15 cmd_main (argc=, argv=)
at git.c:683
#16 0x55566e01 in main (argc=4, argv=0x7fffe318)
at common-main.c:43

--
\\// Peter - http://www.softwolves.pp.se/

Re: cherry-pick very slow on big repository

2017-11-10 Thread Derrick Stolee


On 11/10/2017 7:37 AM, Peter Krefting wrote:

Jeff King:


Can you get a backtrace? I'd do something like:


Seems that it spends most time in diffcore_count_changes(), that is 
where it hits whenever I hit Ctrl+C (various line numbers 199-207 in 
diffcore-delta.c; this is on the v2.15.0 tag).


(gdb) bt
#0  diffcore_count_changes (src=src@entry=0x5db99970,
    dst=dst@entry=0x5d6a4810,
    src_count_p=src_count_p@entry=0x5db8,
    dst_count_p=dst_count_p@entry=0x5d6a4838,
    src_copied=src_copied@entry=0x7fffd3e0,
    literal_added=literal_added@entry=0x7fffd3f0)
    at diffcore-delta.c:203
#1  0x556dee1a in estimate_similarity (minimum_score=3,
    dst=0x5d6a4810, src=0x5db99970) at diffcore-rename.c:193
#2  diffcore_rename (options=options@entry=0x7fffd4f0)
    at diffcore-rename.c:560
#3  0x55623d83 in diffcore_std (
    options=options@entry=0x7fffd4f0) at diff.c:5846
...


Git is spending time detecting renames, which implies you probably 
renamed a folder or added and deleted a large number of files. This 
rename detection is quadratic (# adds times # deletes).


You can remove this rename detection by running your cherry-pick with 
`git -c diff.renameLimit=1 cherry-pick ...`


See https://git-scm.com/docs/diff-config#diff-config-diffrenameLimit

Thanks,
-Stolee

[PATCH v3 2/8] t0021/rot13-filter: refactor packet reading functions

2017-11-10 Thread Christian Couder

To make it possible in a following commit to move packet
reading and writing functions into a Packet.pm module,
let's refactor these functions, so they don't handle
printing debug output and exiting.

While at it let's create packet_key_val_read()
to still handle erroring out in a common case.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 43 +++
 1 file changed, 31 insertions(+), 12 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 4b2d9087d4..b1909699f4 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -74,8 +74,7 @@ sub packet_bin_read {
my $bytes_read = read STDIN, $buffer, 4;
if ( $bytes_read == 0 ) {
# EOF - Git stopped talking to us!
-   print $debug "STOP\n";
-   exit();
+   return ( -1, "" );
}
elsif ( $bytes_read != 4 ) {
die "invalid packet: '$buffer'";
@@ -99,10 +98,20 @@ sub packet_bin_read {
 
 sub packet_txt_read {
my ( $res, $buf ) = packet_bin_read();
-   unless ( $buf eq '' or $buf =~ s/\n$// ) {
-   die "A non-binary line MUST be terminated by an LF.";
+   if ( $res == -1 or $buf eq '' or $buf =~ s/\n$// ) {
+   return ( $res, $buf );
+   }
+   die "A non-binary line MUST be terminated by an LF.";
+}
+
+# Read a text line and check that it is in the form "key=value"
+sub packet_key_val_read {
+   my ( $key ) = @_;
+   my ( $res, $buf ) = packet_txt_read();
+   if ( $res == -1 or ( $buf =~ s/^$key=// and $buf ne '' ) ) {
+   return ( $res, $buf );
}
-   return ( $res, $buf );
+   die "bad $key: '$buf'";
 }
 
 sub packet_bin_write {
@@ -152,13 +161,18 @@ print $debug "init handshake complete\n";
 $debug->flush();
 
 while (1) {
-   my ( $command ) = packet_txt_read() =~ /^command=(.+)$/;
+   my ( $res, $command ) = packet_key_val_read("command");
+   if ( $res == -1 ) {
+   print $debug "STOP\n";
+   exit();
+   }
print $debug "IN: $command";
$debug->flush();
 
if ( $command eq "list_available_blobs" ) {
# Flush
-   packet_bin_read();
+   packet_compare_lists([1, ""], packet_bin_read()) ||
+   die "bad list_available_blobs end";
 
foreach my $pathname ( sort keys %DELAY ) {
if ( $DELAY{$pathname}{"requested"} >= 1 ) {
@@ -184,14 +198,13 @@ while (1) {
packet_flush();
}
else {
-   my ( $pathname ) = packet_txt_read() =~ /^pathname=(.+)$/;
+   my ( $res, $pathname ) = packet_key_val_read("pathname");
+   if ( $res == -1 ) {
+   die "unexpected EOF while expecting pathname";
+   }
print $debug " $pathname";
$debug->flush();
 
-   if ( $pathname eq "" ) {
-   die "bad pathname '$pathname'";
-   }
-
# Read until flush
my ( $done, $buffer ) = packet_txt_read();
while ( $buffer ne '' ) {
@@ -205,6 +218,9 @@ while (1) {
 
( $done, $buffer ) = packet_txt_read();
}
+   if ( $done == -1 ) {
+   die "unexpected EOF after pathname '$pathname'";
+   }
 
my $input = "";
{
@@ -215,6 +231,9 @@ while (1) {
( $done, $buffer ) = packet_bin_read();
$input .= $buffer;
}
+   if ( $done == -1 ) {
+   die "unexpected EOF while reading input for 
'$pathname'";
+   }   
print $debug " " . length($input) . " [OK] -- ";
$debug->flush();
}
-- 
2.15.0.132.g7ad97d78be

[PATCH v3 0/8] Create Git/Packet.pm

2017-11-10 Thread Christian Couder

Goal


Packet related functions in Perl can be useful to write new filters or
to debug or test existing filters. They might also in the future be
used by other software using the same packet line protocol. So instead
of having them in t0021/rot13-filter.pl, let's extract them into a new
Git/Packet.pm module.

Changes since the previous version
~~

There are only a few small changes compared to v2:

Patch 2/8 has the following change:

  - packet_required_key_val_read() is renamed packet_key_val_read()
and a comment is added before the function,

Patch 2/8 and 6/8 have the following change:

  - "if" is used instead of "unless" to make the logic easier to
understand

The diff with the previous version is:

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
index 255b28c098..1682403ffc 100644
--- a/perl/Git/Packet.pm
+++ b/perl/Git/Packet.pm
@@ -17,7 +17,7 @@ our @EXPORT = qw(
packet_compare_lists
packet_bin_read
packet_txt_read
-   packet_required_key_val_read
+   packet_key_val_read
packet_bin_write
packet_txt_write
packet_flush
@@ -68,28 +68,29 @@ sub packet_bin_read {
 
 sub remove_final_lf_or_die {
my $buf = shift;
-   unless ( $buf =~ s/\n$// ) {
-   die "A non-binary line MUST be terminated by an LF.\n"
-   . "Received: '$buf'";
+   if ( $buf =~ s/\n$// ) {
+   return $buf;
}
-   return $buf;
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
 }

 sub packet_txt_read {
my ( $res, $buf ) = packet_bin_read();
-   unless ( $res == -1 or $buf eq '' ) {
+   if ( $res != -1 and $buf ne '' ) {
$buf = remove_final_lf_or_die($buf);
}
return ( $res, $buf );
 }
 
-sub packet_required_key_val_read {
+# Read a text line and check that it is in the form "key=value"
+sub packet_key_val_read {
my ( $key ) = @_;
my ( $res, $buf ) = packet_txt_read();
-   unless ( $res == -1 or ( $buf =~ s/^$key=// and $buf ne '' ) ) {
-   die "bad $key: '$buf'";
+   if ( $res == -1 or ( $buf =~ s/^$key=// and $buf ne '' ) ) {
+   return ( $res, $buf );
}
-   return ( $res, $buf );
+   die "bad $key: '$buf'";
 }
 
 sub packet_bin_write {
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 6fd7fa476b..f1678851de 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -70,7 +70,7 @@ print $debug "init handshake complete\n";
 $debug->flush();
 
 while (1) {
-   my ( $res, $command ) = packet_required_key_val_read("command");
+   my ( $res, $command ) = packet_key_val_read("command");
if ( $res == -1 ) {
print $debug "STOP\n";
exit();
@@ -106,7 +106,7 @@ while (1) {
packet_txt_write("status=success");
packet_flush();
} else {
-   my ( $res, $pathname ) = 
packet_required_key_val_read("pathname");
+   my ( $res, $pathname ) = packet_key_val_read("pathname");
if ( $res == -1 ) {
die "unexpected EOF while expecting pathname";
}

Links
~

This patch series is on the following branch:

https://github.com/chriscool/git/commits/gl-prep-external-odb

Version 1 and 2 of this patch series are on the mailing list here:

https://public-inbox.org/git/20171019123030.17338-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20171105213836.11717-1-chrisc...@tuxfamily.org/

They are also available in the following branch:

https://github.com/chriscool/git/commits/gl-prep-external-odb1
https://github.com/chriscool/git/commits/gl-prep-external-odb14

This patch series was extracted from previous "Add initial
experimental external ODB support" patch series.

Version 1, 2, 3, 4, 5 and 6 of this previous series are on the mailing
list here:

https://public-inbox.org/git/20160613085546.11784-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20160628181933.24620-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20161130210420.15982-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20170620075523.26961-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20170803091926.1755-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20170916080731.13925-1-chrisc...@tuxfamily.org/

They are also available in the following branches:

https://github.com/chriscool/git/commits/gl-external-odb12
https://github.com/chriscool/git/commits/gl-external-odb22
https://github.com/chriscool/git/commits/gl-external-odb61
https://github.com/chriscool/git/commits/gl-external-odb239
https://github.com/chriscool/git/commits/gl-external-odb373
https://github.com/chriscool/git/commits/gl-external-odb411


Christian Cou

[PATCH v3 1/8] t0021/rot13-filter: fix list comparison

2017-11-10 Thread Christian Couder

Since edcc8581 ("convert: add filter..process
option", 2016-10-16) when t0021/rot13-filter.pl was created, list
comparison in this perl script have been quite broken.

packet_txt_read() returns a 2-element list, and the right hand
side of "eq" also has a list with (two, elements), but "eq" takes
the last element of the list on each side, and compares them. The
first elements (0 or 1) on the right hand side lists do not matter,
which means we do not require to see a flush at the end of the
version -- a simple empty string or an EOF would do, which is
definitely not what we want.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 35 ---
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index ad685d92f8..4b2d9087d4 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -55,6 +55,20 @@ sub rot13 {
return $str;
 }
 
+sub packet_compare_lists {
+   my ($expect, @result) = @_;
+   my $ix;
+   if (scalar @$expect != scalar @result) {
+   return undef;
+   }
+   for ($ix = 0; $ix < $#result; $ix++) {
+   if ($expect->[$ix] ne $result[$ix]) {
+   return undef;
+   }
+   }
+   return 1;
+}
+
 sub packet_bin_read {
my $buffer;
my $bytes_read = read STDIN, $buffer, 4;
@@ -110,18 +124,25 @@ sub packet_flush {
 print $debug "START\n";
 $debug->flush();
 
-( packet_txt_read() eq ( 0, "git-filter-client" ) ) || die "bad initialize";
-( packet_txt_read() eq ( 0, "version=2" ) ) || die "bad version";
-( packet_bin_read() eq ( 1, "" ) )  || die "bad version end";
+packet_compare_lists([0, "git-filter-client"], packet_txt_read()) ||
+   die "bad initialize";
+packet_compare_lists([0, "version=2"], packet_txt_read()) ||
+   die "bad version";
+packet_compare_lists([1, ""], packet_bin_read()) ||
+   die "bad version end";
 
 packet_txt_write("git-filter-server");
 packet_txt_write("version=2");
 packet_flush();
 
-( packet_txt_read() eq ( 0, "capability=clean" ) )  || die "bad capability";
-( packet_txt_read() eq ( 0, "capability=smudge" ) ) || die "bad capability";
-( packet_txt_read() eq ( 0, "capability=delay" ) )  || die "bad capability";
-( packet_bin_read() eq ( 1, "" ) )  || die "bad capability 
end";
+packet_compare_lists([0, "capability=clean"], packet_txt_read()) ||
+   die "bad capability";
+packet_compare_lists([0, "capability=smudge"], packet_txt_read()) ||
+   die "bad capability";
+packet_compare_lists([0, "capability=delay"], packet_txt_read()) ||
+   die "bad capability";
+packet_compare_lists([1, ""], packet_bin_read()) ||
+   die "bad capability end";
 
 foreach (@capabilities) {
packet_txt_write( "capability=" . $_ );
-- 
2.15.0.132.g7ad97d78be

[PATCH v3 3/8] t0021/rot13-filter: improve 'if .. elsif .. else' style

2017-11-10 Thread Christian Couder

Before further refactoring the "t0021/rot13-filter.pl" script,
let's modernize the style of its 'if .. elsif .. else' clauses
to improve its readability by making it more similar to our
other perl scripts.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 39 +--
 1 file changed, 13 insertions(+), 26 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index b1909699f4..8bba97af1a 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -75,23 +75,20 @@ sub packet_bin_read {
if ( $bytes_read == 0 ) {
# EOF - Git stopped talking to us!
return ( -1, "" );
-   }
-   elsif ( $bytes_read != 4 ) {
+   } elsif ( $bytes_read != 4 ) {
die "invalid packet: '$buffer'";
}
my $pkt_size = hex($buffer);
if ( $pkt_size == 0 ) {
return ( 1, "" );
-   }
-   elsif ( $pkt_size > 4 ) {
+   } elsif ( $pkt_size > 4 ) {
my $content_size = $pkt_size - 4;
$bytes_read = read STDIN, $buffer, $content_size;
if ( $bytes_read != $content_size ) {
die "invalid packet ($content_size bytes expected; 
$bytes_read bytes read)";
}
return ( 0, $buffer );
-   }
-   else {
+   } else {
die "invalid packet size: $pkt_size";
}
 }
@@ -196,8 +193,7 @@ while (1) {
$debug->flush();
packet_txt_write("status=success");
packet_flush();
-   }
-   else {
+   } else {
my ( $res, $pathname ) = packet_key_val_read("pathname");
if ( $res == -1 ) {
die "unexpected EOF while expecting pathname";
@@ -241,17 +237,13 @@ while (1) {
my $output;
if ( exists $DELAY{$pathname} and exists 
$DELAY{$pathname}{"output"} ) {
$output = $DELAY{$pathname}{"output"}
-   }
-   elsif ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
+   } elsif ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
$output = "";
-   }
-   elsif ( $command eq "clean" and grep( /^clean$/, @capabilities 
) ) {
+   } elsif ( $command eq "clean" and grep( /^clean$/, 
@capabilities ) ) {
$output = rot13($input);
-   }
-   elsif ( $command eq "smudge" and grep( /^smudge$/, 
@capabilities ) ) {
+   } elsif ( $command eq "smudge" and grep( /^smudge$/, 
@capabilities ) ) {
$output = rot13($input);
-   }
-   else {
+   } else {
die "bad command '$command'";
}
 
@@ -260,25 +252,21 @@ while (1) {
$debug->flush();
packet_txt_write("status=error");
packet_flush();
-   }
-   elsif ( $pathname eq "abort.r" ) {
+   } elsif ( $pathname eq "abort.r" ) {
print $debug "[ABORT]\n";
$debug->flush();
packet_txt_write("status=abort");
packet_flush();
-   }
-   elsif ( $command eq "smudge" and
+   } elsif ( $command eq "smudge" and
exists $DELAY{$pathname} and
-   $DELAY{$pathname}{"requested"} == 1
-   ) {
+   $DELAY{$pathname}{"requested"} == 1 ) {
print $debug "[DELAYED]\n";
$debug->flush();
packet_txt_write("status=delayed");
packet_flush();
$DELAY{$pathname}{"requested"} = 2;
$DELAY{$pathname}{"output"} = $output;
-   }
-   else {
+   } else {
packet_txt_write("status=success");
packet_flush();
 
@@ -298,8 +286,7 @@ while (1) {
print $debug ".";
if ( length($output) > $MAX_PACKET_CONTENT_SIZE 
) {
$output = substr( $output, 
$MAX_PACKET_CONTENT_SIZE );
-   }
-   else {
+   } else {
$output = "";
}
}
-- 
2.15.0.132.g7ad97d78be

[PATCH v3 4/8] t0021/rot13-filter: improve error message

2017-11-10 Thread Christian Couder

If there is no new line at the end of something it receives,
the packet_txt_read() function die()s, but it's difficult to
debug without much context.

Let's give a bit more information when that happens.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 8bba97af1a..55b6e17034 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -98,7 +98,8 @@ sub packet_txt_read {
if ( $res == -1 or $buf eq '' or $buf =~ s/\n$// ) {
return ( $res, $buf );
}
-   die "A non-binary line MUST be terminated by an LF.";
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
 }
 
 # Read a text line and check that it is in the form "key=value"
-- 
2.15.0.132.g7ad97d78be

[PATCH v3 5/8] t0021/rot13-filter: add packet_initialize()

2017-11-10 Thread Christian Couder

Let's refactor the code to initialize communication into its own
packet_initialize() function, so that we can reuse this
functionality in following patches.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 55b6e17034..9e18be66b6 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -128,19 +128,25 @@ sub packet_flush {
STDOUT->flush();
 }
 
+sub packet_initialize {
+   my ($name, $version) = @_;
+
+   packet_compare_lists([0, $name . "-client"], packet_txt_read()) ||
+   die "bad initialize";
+   packet_compare_lists([0, "version=" . $version], packet_txt_read()) ||
+   die "bad version";
+   packet_compare_lists([1, ""], packet_bin_read()) ||
+   die "bad version end";
+
+   packet_txt_write( $name . "-server" );
+   packet_txt_write( "version=" . $version );
+   packet_flush();
+}
+
 print $debug "START\n";
 $debug->flush();
 
-packet_compare_lists([0, "git-filter-client"], packet_txt_read()) ||
-   die "bad initialize";
-packet_compare_lists([0, "version=2"], packet_txt_read()) ||
-   die "bad version";
-packet_compare_lists([1, ""], packet_bin_read()) ||
-   die "bad version end";
-
-packet_txt_write("git-filter-server");
-packet_txt_write("version=2");
-packet_flush();
+packet_initialize("git-filter", 2);
 
 packet_compare_lists([0, "capability=clean"], packet_txt_read()) ||
die "bad capability";
-- 
2.15.0.132.g7ad97d78be

[PATCH v3 6/8] t0021/rot13-filter: refactor checking final lf

2017-11-10 Thread Christian Couder

As checking for a lf character at the end of a buffer
will be useful in another function, let's refactor this
functionality into a small remove_final_lf_or_die()
helper function.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 9e18be66b6..8f255b6131 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -93,15 +93,23 @@ sub packet_bin_read {
}
 }
 
-sub packet_txt_read {
-   my ( $res, $buf ) = packet_bin_read();
-   if ( $res == -1 or $buf eq '' or $buf =~ s/\n$// ) {
-   return ( $res, $buf );
+sub remove_final_lf_or_die {
+   my $buf = shift;
+   if ( $buf =~ s/\n$// ) {
+   return $buf;
}
die "A non-binary line MUST be terminated by an LF.\n"
. "Received: '$buf'";
 }
 
+sub packet_txt_read {
+   my ( $res, $buf ) = packet_bin_read();
+   if ( $res != -1 and $buf ne '' ) {
+   $buf = remove_final_lf_or_die($buf);
+   }
+   return ( $res, $buf );
+}
+
 # Read a text line and check that it is in the form "key=value"
 sub packet_key_val_read {
my ( $key ) = @_;
-- 
2.15.0.132.g7ad97d78be

[PATCH v3 8/8] Add Git/Packet.pm from parts of t0021/rot13-filter.pl

2017-11-10 Thread Christian Couder

And while at it let's simplify t0021/rot13-filter.pl by
using Git/Packet.pm.

This will make it possible to reuse packet related
functions in other test scripts.

Signed-off-by: Christian Couder 
---
 perl/Git/Packet.pm  | 169 
 perl/Makefile   |   1 +
 t/t0021/rot13-filter.pl | 141 +---
 3 files changed, 173 insertions(+), 138 deletions(-)
 create mode 100644 perl/Git/Packet.pm

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
new file mode 100644
index 00..1682403ffc
--- /dev/null
+++ b/perl/Git/Packet.pm
@@ -0,0 +1,169 @@
+package Git::Packet;
+use 5.008;
+use strict;
+use warnings;
+BEGIN {
+   require Exporter;
+   if ($] < 5.008003) {
+   *import = \&Exporter::import;
+   } else {
+   # Exporter 5.57 which supports this invocation was
+   # released with perl 5.8.3
+   Exporter->import('import');
+   }
+}
+
+our @EXPORT = qw(
+   packet_compare_lists
+   packet_bin_read
+   packet_txt_read
+   packet_key_val_read
+   packet_bin_write
+   packet_txt_write
+   packet_flush
+   packet_initialize
+   packet_read_capabilities
+   packet_read_and_check_capabilities
+   packet_check_and_write_capabilities
+   );
+our @EXPORT_OK = @EXPORT;
+
+sub packet_compare_lists {
+   my ($expect, @result) = @_;
+   my $ix;
+   if (scalar @$expect != scalar @result) {
+   return undef;
+   }
+   for ($ix = 0; $ix < $#result; $ix++) {
+   if ($expect->[$ix] ne $result[$ix]) {
+   return undef;
+   }
+   }
+   return 1;
+}
+
+sub packet_bin_read {
+   my $buffer;
+   my $bytes_read = read STDIN, $buffer, 4;
+   if ( $bytes_read == 0 ) {
+   # EOF - Git stopped talking to us!
+   return ( -1, "" );
+   } elsif ( $bytes_read != 4 ) {
+   die "invalid packet: '$buffer'";
+   }
+   my $pkt_size = hex($buffer);
+   if ( $pkt_size == 0 ) {
+   return ( 1, "" );
+   } elsif ( $pkt_size > 4 ) {
+   my $content_size = $pkt_size - 4;
+   $bytes_read = read STDIN, $buffer, $content_size;
+   if ( $bytes_read != $content_size ) {
+   die "invalid packet ($content_size bytes expected; 
$bytes_read bytes read)";
+   }
+   return ( 0, $buffer );
+   } else {
+   die "invalid packet size: $pkt_size";
+   }
+}
+
+sub remove_final_lf_or_die {
+   my $buf = shift;
+   if ( $buf =~ s/\n$// ) {
+   return $buf;
+   }
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
+}
+
+sub packet_txt_read {
+   my ( $res, $buf ) = packet_bin_read();
+   if ( $res != -1 and $buf ne '' ) {
+   $buf = remove_final_lf_or_die($buf);
+   }
+   return ( $res, $buf );
+}
+
+# Read a text line and check that it is in the form "key=value"
+sub packet_key_val_read {
+   my ( $key ) = @_;
+   my ( $res, $buf ) = packet_txt_read();
+   if ( $res == -1 or ( $buf =~ s/^$key=// and $buf ne '' ) ) {
+   return ( $res, $buf );
+   }
+   die "bad $key: '$buf'";
+}
+
+sub packet_bin_write {
+   my $buf = shift;
+   print STDOUT sprintf( "%04x", length($buf) + 4 );
+   print STDOUT $buf;
+   STDOUT->flush();
+}
+
+sub packet_txt_write {
+   packet_bin_write( $_[0] . "\n" );
+}
+
+sub packet_flush {
+   print STDOUT sprintf( "%04x", 0 );
+   STDOUT->flush();
+}
+
+sub packet_initialize {
+   my ($name, $version) = @_;
+
+   packet_compare_lists([0, $name . "-client"], packet_txt_read()) ||
+   die "bad initialize";
+   packet_compare_lists([0, "version=" . $version], packet_txt_read()) ||
+   die "bad version";
+   packet_compare_lists([1, ""], packet_bin_read()) ||
+   die "bad version end";
+
+   packet_txt_write( $name . "-server" );
+   packet_txt_write( "version=" . $version );
+   packet_flush();
+}
+
+sub packet_read_capabilities {
+   my @cap;
+   while (1) {
+   my ( $res, $buf ) = packet_bin_read();
+   if ( $res == -1 ) {
+   die "unexpected EOF when reading capabilities";
+   }
+   return ( $res, @cap ) if ( $res != 0 );
+   $buf = remove_final_lf_or_die($buf);
+   unless ( $buf =~ s/capability=// ) {
+   die "bad capability buf: '$buf'";
+   }
+   push @cap, $buf;
+   }
+}
+
+# Read remote capabilities and check them against capabilities we requi

[PATCH v3 7/8] t0021/rot13-filter: add capability functions

2017-11-10 Thread Christian Couder

These function help read and write capabilities.

To make them more generic and make it easy to reuse them,
the following changes are made:

- we don't require capabilities to come in a fixed order,
- we allow duplicates,
- we check that the remote supports the capabilities we
  advertise,
- we don't check if the remote declares any capability we
  don't know about.

The reason behind the last change is that the protocol
should work using only the capabilities that both ends
support, and it should not stop working if one end starts
to advertise a new capability.

Despite those changes, we can still require a set of
capabilities, and die if one of them is not supported.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 58 ++---
 1 file changed, 45 insertions(+), 13 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 8f255b6131..6cd9ecffee 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -151,24 +151,56 @@ sub packet_initialize {
packet_flush();
 }
 
+sub packet_read_capabilities {
+   my @cap;
+   while (1) {
+   my ( $res, $buf ) = packet_bin_read();
+   if ( $res == -1 ) {
+   die "unexpected EOF when reading capabilities";
+   }
+   return ( $res, @cap ) if ( $res != 0 );
+   $buf = remove_final_lf_or_die($buf);
+   unless ( $buf =~ s/capability=// ) {
+   die "bad capability buf: '$buf'";
+   }
+   push @cap, $buf;
+   }
+}
+
+# Read remote capabilities and check them against capabilities we require
+sub packet_read_and_check_capabilities {
+   my @required_caps = @_;
+   my ($res, @remote_caps) = packet_read_capabilities();
+   my %remote_caps = map { $_ => 1 } @remote_caps;
+   foreach (@required_caps) {
+   unless (exists($remote_caps{$_})) {
+   die "required '$_' capability not available from 
remote" ;
+   }
+   }
+   return %remote_caps;
+}
+
+# Check our capabilities we want to advertise against the remote ones
+# and then advertise our capabilities
+sub packet_check_and_write_capabilities {
+   my ($remote_caps, @our_caps) = @_;
+   foreach (@our_caps) {
+   unless (exists($remote_caps->{$_})) {
+   die "our capability '$_' is not available from remote"
+   }
+   packet_txt_write( "capability=" . $_ );
+   }
+   packet_flush();
+}
+
 print $debug "START\n";
 $debug->flush();
 
 packet_initialize("git-filter", 2);
 
-packet_compare_lists([0, "capability=clean"], packet_txt_read()) ||
-   die "bad capability";
-packet_compare_lists([0, "capability=smudge"], packet_txt_read()) ||
-   die "bad capability";
-packet_compare_lists([0, "capability=delay"], packet_txt_read()) ||
-   die "bad capability";
-packet_compare_lists([1, ""], packet_bin_read()) ||
-   die "bad capability end";
-
-foreach (@capabilities) {
-   packet_txt_write( "capability=" . $_ );
-}
-packet_flush();
+my %remote_caps = packet_read_and_check_capabilities("clean", "smudge", 
"delay");
+packet_check_and_write_capabilities(\%remote_caps, @capabilities);
+
 print $debug "init handshake complete\n";
 $debug->flush();
 
-- 
2.15.0.132.g7ad97d78be

Re: [PATCH v1 2/2] log: add option to choose which refs to decorate

2017-11-10 Thread Rafael Ascensão

On 07/11/17 00:18, Junio C Hamano wrote:
> Jacob Keller  writes:
> 
> I would have to say that the describe's one is wrong if it does not
> match what for_each_glob_ref() does for the log family of commands'
> "--branches=" etc.  describe.c::get_name() uses positive
> and negative patterns, just like log-tree.c::add_ref_decoration()
> would with the patch we are discussing, so perhaps the items in
> these lists should get the same "normalize" treatment the patch 1/2
> of this series brings in to make things consistent?
> 

I agree that describe should receive the "normalize" treatment. However,
and following the same reasoning, why should describe users adopt the
rules imposed by --glob? I could argue they're also used to the way it
works now.

That being said, the suggestion I mentioned earlier would allow to keep
both current behaviors consistent at the expense of the extra call to
refs.c::ref_exists().

+if (!has_glob_specials(pattern) && !ref_exists(normalized_pattern->buf)) {
+/* Append implied '/' '*' if not present. */
+strbuf_complete(normalized_pattern, '/');
+/* No need to check for '*', there is none. */
+strbuf_addch(normalized_pattern, '*');
+}

But I don't have enough expertise to decide if this consistency is worth 
the extra call to refs.c::ref_exists() or if there are other side-effects
I am not considering.

>> That being said, if we think the extra glob would not cause
>> problems and generally do what people mean... I guess consistent
>> with --glob would be good... But it's definitely not what I'd
>> expect at first glance.

My position is that consistency is good, but the "first glance
expectation" is definitely something important we should take into
consideration.

is there a stylistic preference for a trailing "--" on a command?

2017-11-10 Thread Robert P. J. Day


  just noticed these examples in "man git-bisect":

EXAMPLES
  $ git bisect start HEAD v1.2 --  # HEAD is bad, v1.2 is good
  ...
  $ git bisect start HEAD origin --# HEAD is bad, origin is good
  ...
  $ git bisect start HEAD HEAD~10 --   # culprit is among the last 10

is there some rationale or stylistic significance to those trailing
"--" on those commands? i assume they have no effect, just curious as
to why they're there.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday

Re: cherry-pick very slow on big repository

2017-11-10 Thread Peter Krefting


Derrick Stolee:

Git is spending time detecting renames, which implies you probably 
renamed a folder or added and deleted a large number of files. This 
rename detection is quadratic (# adds times # deletes).


Yes, a couple of directories with a lot of template files have been 
renamed (and some removed, some added) between the current development 
branch and this old maintenance branch. I get the "Performing inexact 
rename detection" a lot when merging changes in the other direction.


However, none of them applies to these particular commits, which only 
touches files that are in the exact same location on both branches.


You can remove this rename detection by running your cherry-pick 
with `git -c diff.renameLimit=1 cherry-pick ...`


That didn't work, actually it failed to finish with this setting in 
effect, it hangs in such a way that I can't stop it with Ctrl+C 
(neither when running from the command line, nor when running inside 
gdb). It didn't finish in the 20 minutes I gave it.


I also tried with diff.renames=false, which also seemed to fail.

--
\\// Peter - http://www.softwolves.pp.se/

Re: [PATCH v1 5/8] sequencer: don't die in print_commit_summary()

2017-11-10 Thread Phillip Wood

On 07/11/17 15:13, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
>> And this step is going in the right direction, but I am not sure if
>> this made the function safe enough to be called repeatedly from the
>> rebase machinery and we are ready to unleash this to the end users
>> and tell them it is safe to use it.
> 
> Another possibility perhaps is that the function is safe to reuse
> already even without this patch, of course ;-).
> 
Hmm, maybe it is. Looking at pick_commits() and do_pick_commit() if the
sequencer dies in print_commit_summary() (which can only happen when
cherry-picking or reverting) then neither the todo list or the abort
safety file are updated to reflect the commit that was just made.

As I understand it print_commit_summary() dies because: (i) it cannot
resolve HEAD either because some other process is updating it (which is
bad news in the middle of a cherry-pick); (ii) because something went
wrong HEAD is corrupt; or (iii) log_tree_commit() cannot read some
objects. In all those cases dying will leave the sequencer in a sane
state for aborting - 'git cherry-pick --abort' will rewind HEAD to the
last successful commit before there was a problem with HEAD or the
object database. If the user somehow fixes the problem and runs 'git
cherry-pick --continue' then the sequencer will try and pick the same
commit again which may or may not be what the user wants depending on
what caused print_commit_summary() to die.

[PATCH v4] doc/SubmittingPatches: correct subject guidance

2017-11-10 Thread Adam Dinwoodie

The examples and common practice for adding markers such as "RFC" or
"v2" to the subject of patch emails is to have them within the same
brackets as the "PATCH" text, not after the closing bracket.  Further,
the practice of `git format-patch` and the like, as well as what appears
to be the more common pratice on the mailing list, is to use "[RFC
PATCH]", not "[PATCH/RFC]".

Update the SubmittingPatches article to match and to reference the
`format-patch` helper arguments, and also make some minor text
clarifications in the area.

Signed-off-by: Adam Dinwoodie 
Helped-by: Eric Sunshine 
---

Notes:
Changes since v3:
- Clarified meaning of "RFC" per Eric's suggestion
- Made the impact of --subject-prefix and friends clearer per Eric's
  suggestion

Thank you for your nitpicking, Eric, it's useful and very much
appreciated :)

 Documentation/SubmittingPatches | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 558d465b6..89f239071 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -184,21 +184,26 @@ lose tabs that way if you are not careful.
 
 It is a common convention to prefix your subject line with
 [PATCH].  This lets people easily distinguish patches from other
-e-mail discussions.  Use of additional markers after PATCH and
-the closing bracket to mark the nature of the patch is also
-encouraged.  E.g. [PATCH/RFC] is often used when the patch is
-not ready to be applied but it is for discussion, [PATCH v2],
-[PATCH v3] etc. are often seen when you are sending an update to
-what you have previously sent.
+e-mail discussions.  Use of markers in addition to PATCH within
+the brackets to describe the nature of the patch is also
+encouraged.  E.g. [RFC PATCH] (where RFC stands for "request for
+comments") is often used to indicate a patch needs further
+discussion before being accepted, [PATCH v2], [PATCH v3] etc.
+are often seen when you are sending an update to what you have
+previously sent.
 
-"git format-patch" command follows the best current practice to
+The "git format-patch" command follows the best current practice to
 format the body of an e-mail message.  At the beginning of the
 patch should come your commit message, ending with the
 Signed-off-by: lines, and a line that consists of three dashes,
 followed by the diffstat information and the patch itself.  If
 you are forwarding a patch from somebody else, optionally, at
 the beginning of the e-mail message just before the commit
 message starts, you can put a "From: " line to name that person.
+To change the default "[PATCH]" in the subject to "[]", use
+`git format-patch --subject-prefix=`.  As a shortcut, you
+can use `--rfc` instead of `--subject-prefix="RFC PATCH"`, or
+`-v ` instead of `--subject-prefix="PATCH v"`.
 
 You often want to add additional explanation about the patch,
 other than the commit message itself.  Place such "cover letter"
-- 
2.14.3

Re: [RFC] cover-at-tip

2017-11-10 Thread Nicolas Morey-Chaisemartin



Le 10/11/2017 à 11:24, Nicolas Morey-Chaisemartin a écrit :
> Hi,
>
> I'm starting to look into the cover-at-tip topic that I found in the leftover 
> bits (http://www.spinics.net/lists/git/msg259573.html)
>
> Here's a first draft of a patch that adds support for format-patch 
> --cover-at-tip. It compiles and works in my nice and user firnedly test case.
> Just wanted to make sure I was going roughly in the right direction here.
>
>
> I was wondering where is the right place to put a commit_is_cover_at_tip() as 
> the test will be needed in other place as the feature is extended to git 
> am/merge/pull.
>
> Feel free to comment. I know the help is not clear at this point and there's 
> still some work to do on option handling (add a config option, probably have 
> --cover-at-tip imply --cover-letter, etc) and
> some testing :)
>
>
> ---

Leaving some more updates and questions before the week end:

I started on git am --cover-at-tip.

The proposed patch for format-patch does not output any "---" to signal the end 
of the commit log and the begining of the patch in the cover letter.
This means that the log summary, the diffstat and the git footer ( --\n) is seen as part of the commit log. Which is just wrong.

Removing them would solve the issue but I feel they bring some useful info (or 
they would not be here).
Adding a "---" between the commit log and those added infos poses another 
problem: git am does not see an empty patch anymore.
I would need to add "some" level of parsing to am.c to make sure the patch 
content is just garbage and that there are no actual hunks for that.

I did not find any public API that would allow me to do that, although 
apply_path/parse_chunk would fit the bill.
Is that the right way to approach this ?

My branch is here if anyone want to give a look: 
https://github.com/nmorey/git/tree/dev/cover-at-tip

Nicolas

Bug: cherry-pick & submodule

2017-11-10 Thread Ефимов Василий


I faced an unexpected behaviour during cherry-picking
a commit to a branch with a submodule.

Git graph:

A -- B [master]
 \
  `- C -- D [test]

Both branches have a file with name `a_file`.
It has been added by commit A.
Commits B and C add a folder with name `a_submodule` to the respective 
branches.

Commit C does it regularly by adding a file with name `a_submodule/content`.
Commit B adds a submodule with name `a_submodule`.
Commit D only modifies `a_file`.
Note that `a_file` and `a_submodule` are not related.

[repo]
  |- a_file
  `- a_submodule

When I trying to cherry pick commit D on commit B, I got
a conflict with `a_submodule`. Changes of `a_file` are
successfully cherry-picked.

I expected, that there would be no conflicts.

A bash script reproducing the bug is attached.

Vasily.


bug.sh
Description: application/shellscript

proper patch prefix for tweaking both git-bisect.sh and git-bisect.txt?

2017-11-10 Thread Robert P. J. Day


  digging through both git-bisect.sh and git-bisect.txt, and it seems
pretty clear they're both a bit out of date WRT documenting the newer
alternatives "old"/"new" as opposed to the older "good"/"bad" terms,
and a few other things.

  first, trivially, neither the script nor the man page mention "view"
as an alternative to "visualize", but that's easy to fix. however,
most of the inconsistency involves that good/bad/old/new stuff.

  the man page reads (in part):

  git bisect (bad|new|) []
  git bisect (good|old|) [...]

which i assume should actually read:

  git bisect (bad|new||) []
  git bisect (good|old||) [...]

unless there's some implicit assumption that isn't mentioned there.

  also from the man page, i'm guessing that:

  git bisect terms [--term-good | --term-bad]

might need to say:

  git bisect terms [--term-good | --term-bad | --term-new | --term-old]

and so on, and so on (again, unless the generality of those terms is
understood).

  so given that all of that is, technically, documentation (even the
usage message in the script), if one submits a patch to change both
files appropriately, what is the subject line prefix to use?

  anyway, maybe i'll do this in bite-size pieces to keep it
manageable. my first patch to the code base ... whoo hoo!

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday

[PATCH] bisect: mention "view" as an alternative to "visualize"

2017-11-10 Thread Robert P. J. Day

Tweak a number of files to mention "view" as an alternative to
"visualize":

 Documentation/git-bisect.txt   | 9 -
 Documentation/user-manual.txt  | 3 ++-
 builtin/bisect--helper.c   | 2 +-
 contrib/completion/git-completion.bash | 2 +-
 git-bisect.sh  | 4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

Signed-off-by: Robert P. J. Day 

---

  here's hoping i have the right format for this patch ... are there
any parts of this that are inappropriate, such as extending the bash
completion?

diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt
index 6c42abf07..89e6f9667 100644
--- a/Documentation/git-bisect.txt
+++ b/Documentation/git-bisect.txt
@@ -23,7 +23,7 @@ on the subcommand:
  git bisect terms [--term-good | --term-bad]
  git bisect skip [(|)...]
  git bisect reset []
- git bisect visualize
+ git bisect visualize|view
  git bisect replay 
  git bisect log
  git bisect run ...
@@ -196,15 +196,14 @@ of `git bisect good` and `git bisect bad` to mark commits.
 Bisect visualize
 

-To see the currently remaining suspects in 'gitk', issue the following
-command during the bisection process:
+To see the currently remaining suspects in 'gitk', issue either of the
+following equivalent commands during the bisection process:

 
 $ git bisect visualize
+$ git bisect view
 

-`view` may also be used as a synonym for `visualize`.
-
 If the `DISPLAY` environment variable is not set, 'git log' is used
 instead.  You can also give command-line options such as `-p` and
 `--stat`.
diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index 3a03e63eb..55ec58986 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -538,10 +538,11 @@ Note that the version which `git bisect` checks out for 
you at each
 point is just a suggestion, and you're free to try a different
 version if you think it would be a good idea.  For example,
 occasionally you may land on a commit that broke something unrelated;
-run
+run either of the equivalent commands

 -
 $ git bisect visualize
+$ git bisect view
 -

 which will run gitk and label the commit it chose with a marker that
diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 35d2105f9..4b5fadcbe 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -46,7 +46,7 @@ static int check_term_format(const char *term, const char 
*orig_term)
return error(_("'%s' is not a valid term"), term);

if (one_of(term, "help", "start", "skip", "next", "reset",
-   "visualize", "replay", "log", "run", "terms", NULL))
+   "visualize", "view", "replay", "log", "run", "terms", 
NULL))
return error(_("can't use the builtin command '%s' as a term"), 
term);

/*
diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index fdd984d34..52f68c922 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1162,7 +1162,7 @@ _git_bisect ()
 {
__git_has_doubledash && return

-   local subcommands="start bad good skip reset visualize replay log run"
+   local subcommands="start bad good skip reset visualize view replay log 
run"
local subcommand="$(__git_find_on_cmdline "$subcommands")"
if [ -z "$subcommand" ]; then
__git_find_repo_path
diff --git a/git-bisect.sh b/git-bisect.sh
index 0138a8860..e8b622a47 100755
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -1,6 +1,6 @@
 #!/bin/sh

-USAGE='[help|start|bad|good|new|old|terms|skip|next|reset|visualize|replay|log|run]'
+USAGE='[help|start|bad|good|new|old|terms|skip|next|reset|visualize|view|replay|log|run]'
 LONG_USAGE='git bisect help
print this long help message.
 git bisect start [--term-{old,good}= --term-{new,bad}=]
@@ -20,7 +20,7 @@ git bisect next
find next bisection to test and check it out.
 git bisect reset []
finish bisection search and go back to commit.
-git bisect visualize
+git bisect visualize|view
show bisect status in gitk.
 git bisect replay 
replay bisection log.


-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday

Bug: cherry-pick & submodule

2017-11-10 Thread Ефимов Василий


I faced an unexpected behaviour during cherry-picking
a commit to a branch with a submodule.

Git graph:

A -- B [master]
 \
  `- C -- D [test]

Both branches have a file with name `a_file`.
It has been added by commit A.
Commits B and C add a folder with name `a_submodule` to the respective 
branches.

Commit C does it regularly by adding a file with name `a_submodule/content`.
Commit B adds a submodule with name `a_submodule`.
Commit D only modifies `a_file`.
Note that `a_file` and `a_submodule` are not related.

[repo]
  |- a_file
  `- a_submodule

When I trying to cherry pick commit D on commit B, I got
a conflict with `a_submodule`. Changes of `a_file` are
successfully cherry-picked.

I expected, that there would be no conflicts.

A bash script reproducing the bug is listed below.

Vasily.


rm -rf a_submodule a_repo

mkdir a_submodule
cd a_submodule
git init
touch a_file_in_a_submodule
git add a_file_in_a_submodule
git commit -m "add a file in a submodule"
cd ..

mkdir a_repo
cd a_repo
git init

touch a_file
git add a_file
git commit -m "add a file"

git branch test
git checkout test

mkdir a_submodule
touch a_submodule/content
git add a_submodule/content
git commit -m "add a regular folder with name a_submodule"

echo "123" > a_file
git add a_file
git commit -m "modify a file"

git checkout master

git submodule add ../a_submodule a_submodule
git submodule update a_submodule
git commit -m "add a submodule info folder with name a_submodule"

# Trying to cherry-pick modification of a file from test branch.
git cherry-pick test

# some debug
git status

RE: cherry-pick very slow on big repository

2017-11-10 Thread Kevin Willford

Since this is happening during a merge, you might need to use merge.renameLimit
or the merge strategy option of -Xno-renames.  Although the code does fallback
to use the diff.renameLimit but there is still a lot that is done before even 
checking
the rename limit so I would first try getting renames turned off.

Thanks,
Kevin

> -Original Message-
> From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On Behalf
> Of Peter Krefting
> Sent: Friday, November 10, 2017 7:05 AM
> To: Derrick Stolee 
> Cc: Jeff King ; Git Mailing List 
> Subject: Re: cherry-pick very slow on big repository
> 
> Derrick Stolee:
> 
> > Git is spending time detecting renames, which implies you probably
> > renamed a folder or added and deleted a large number of files. This
> > rename detection is quadratic (# adds times # deletes).
> 
> Yes, a couple of directories with a lot of template files have been
> renamed (and some removed, some added) between the current development
> branch and this old maintenance branch. I get the "Performing inexact
> rename detection" a lot when merging changes in the other direction.
> 
> However, none of them applies to these particular commits, which only
> touches files that are in the exact same location on both branches.
> 
> > You can remove this rename detection by running your cherry-pick
> > with `git -c diff.renameLimit=1 cherry-pick ...`
> 
> That didn't work, actually it failed to finish with this setting in
> effect, it hangs in such a way that I can't stop it with Ctrl+C
> (neither when running from the command line, nor when running inside
> gdb). It didn't finish in the 20 minutes I gave it.
> 
> I also tried with diff.renames=false, which also seemed to fail.
> 
> --
> \\// Peter -
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.softw
> olves.pp.se%2F&data=02%7C01%7Ckewillf%40microsoft.com%7C6b831a75739e4
> 0428d3808d52844106c%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636
> 459195209466999&sdata=kJtNLAs1LSoPy%2B%2BNADJkuEBPMZVcxkSkKzOEEeIG
> VpM%3D&reserved=0

Urgent Message.

2017-11-10 Thread Western Union

Dear Western Union Customer,
 
You have been awarded with the sum of $50,000 USD by our office, as one of our 
customers who use Western Union in their daily business transaction. This award 
was been selected through the internet, where your e-mail address was indicated 
and notified. Please provide Mr. James Udo with the following details listed 
below so that your fund will be remited to you through Western Union.
 
1. Name:
2. Address:
3. Country:
4. Phone Number:
5. Occupation:
6. Sex:
7. Age:
 
Mr. James Udo E-mail: westerrnunion2...@outlook.com
 
As soon as these details are received and verified, your fund will be 
transferred to you.Thank you, for using western union.

JPMorgan Chase & Co.

2017-11-10 Thread Chase Bank


JPMorgan Chase & Co.
270 Park Avenue,
Midtown Manhattan
New York City,

Attention:

We are pleased to inform you about your fund which was seized by International 
Monetary Fund (IMF) due to your failure to provide necessary credentials which 
state the legitimacy of your fund, the fund was said to be transferred from BOA 
before its seizure by IMF. Presently the fund is in JPMorgan Chase bank for 
immediate remittance to your nominated bank account, below is the account 
details we have about you kindly reconfirm if all the details are correct and 
update before we make the transfer.

NOTE: no one will ask you for transfer or any form of charge to transfer your 
fund if ever happen please report to us because it’s never from us.

Acct. name: Doreen Koehler.
Bank name: Bank of America
WIRE#026009593
ACC.898046532236

Yours sincerely
John Mour
(917) 900-0351

Re: cherry-pick very slow on big repository

2017-11-10 Thread Elijah Newren

Interesting timing.  I have some performance patches specifically
developed because rename detection during merges made a small
cherry-pick in a large repo rather slow...in my case, I dropped the
time for the cherry pick by a factor of about 30 (no guarantees you'll
see the same; it's very history-specific).  I was just about to start
sending my three series of patches, the performance one being the
third...

On Fri, Nov 10, 2017 at 6:05 AM, Peter Krefting  wrote:
> Derrick Stolee:
>
>> Git is spending time detecting renames, which implies you probably renamed
>> a folder or added and deleted a large number of files. This rename detection
>> is quadratic (# adds times # deletes).
>
>
> Yes, a couple of directories with a lot of template files have been renamed
> (and some removed, some added) between the current development branch and
> this old maintenance branch. I get the "Performing inexact rename detection"
> a lot when merging changes in the other direction.
>
> However, none of them applies to these particular commits, which only
> touches files that are in the exact same location on both branches.
>
>> You can remove this rename detection by running your cherry-pick with `git
>> -c diff.renameLimit=1 cherry-pick ...`
>
>
> That didn't work, actually it failed to finish with this setting in effect,
> it hangs in such a way that I can't stop it with Ctrl+C (neither when
> running from the command line, nor when running inside gdb). It didn't
> finish in the 20 minutes I gave it.
>
> I also tried with diff.renames=false, which also seemed to fail.
>
>
> --
> \\// Peter - http://www.softwolves.pp.se/

Re: [PATCH v1 2/2] log: add option to choose which refs to decorate

2017-11-10 Thread Junio C Hamano

Rafael Ascensão  writes:

> I agree that describe should receive the "normalize" treatment. However,
> and following the same reasoning, why should describe users adopt the
> rules imposed by --glob? I could argue they're also used to the way it
> works now.
>
> That being said, the suggestion I mentioned earlier would allow to keep
> both current behaviors consistent at the expense of the extra call to
> refs.c::ref_exists().

In any case, updating the "describe" for consistency is something we
can and should leave for later, to be done as a separate topic.

While I agree with you that the consistent behaviour between
commands is desirable, and also I agree with you that given a
pattern $X that does not have any glob char, trying to match $X when
a ref whose name exactly is $X exists and trying to match $X/*
otherwise would give us a consistent semantics without hurting any
existing uses, I do not think you need to pay any extra expense of
calling ref_exists() at all to achieve that.

That is because when $X exists, you already know $X/otherthing does
not exist.  And when $X does not exist, $X/otherthing might.  So a
naive implementation would be just to add two patterns $X and $X/*
to the filter list and be done with it.  If you exactly have
refs/heads/master, even with the naive logic may throw both
refs/heads/master and refs/heads/master/* to the filter list,
nothing will match the latter to contaminate your result (and vice
versa).

A bit more clever implementation "just throw in two items" would go
like this.  It is not all that involved:

 - In load_ref_decorations(), before running add_ref_decoration for
   each ref and head ref, iterate over the elements in the refname
   filter list.  For each element:

   - if item->string has a trailing '/', trim that.

   - store NULL in the item->util field for item whose string field
 has a glob char.

   - store something non-NULL (e.g. item->string) for item whose
 string field does not have a glob char.

 - In add_ref_decoration(), where your previous round iterates over
   filter->{include,exclude}, get rid of normalize_glob_ref() and
   use of real_pattern.  Instead do something like:

matched = 0;
if (item->util == NULL) {
if (!wildmatch(item->string, refname, 0))
matched = 1;
} else {
const char *rest;
if (skip_prefix(refname, item->string, &rest) &&
(!*rest || *rest == '/'))
matched = 1;
}
if (matched)
...

   Of course, you would probably want to encapsulate the logic to
   set matched = 1/0 in a helper function, e.g.

static int match_ref_pattern(const char *refname,
 const struct string_list_item *item) {
int matched = 0;
... do either wildmatch or head match with tail validation
... depending on the item->util's NULLness (see above)
return matched;
}

   and call that from the two loops for exclude and include list.

Hmm?

[PATCH 2/4] Remove silent clamp of renameLimit

2017-11-10 Thread Elijah Newren

In commit 0024a5492 (Fix the rename detection limit checking; 2007-09-14),
the renameLimit was clamped to 32767.  This appears to have been to simply
avoid integer overflow in the following computation:

   num_create * num_src <= rename_limit * rename_limit

although it also could be viewed as a hardcoded bound on the amount of CPU
time we're willing to allow users to tell git to spend on handling
renames.  An upper bound may make sense, particularly as the computation
is O(rename_limit^2), but only if the bound is documented and communicated
to the user -- neither of which were true.

In fact, the silent clamping of the renameLimit to a smaller value and
lack of reporting of the needed renameLimit when it was too large made it
appear to the user as though they had used a high enough value; however,
git would proceed to mess up the merge or cherry-pick badly based on the
lack of rename detection.  Some hardy folks, despite the lack of feedback
on the correct limit to choose, were desperate enough to repeatedly retry
their cherry-picks with increasingly larger renameLimit values (going
orders of magnitude beyond the built-in limit of 32767), but were
consistently met with the same failure.

Although large limits can make things slow, we have users who would be
ecstatic to have a small five file change be correctly cherry picked even
if they have to manually specify a large limit and it took git ten minutes
to compute it.

Signed-off-by: Elijah Newren 
---
 diff.c|  2 +-
 diffcore-rename.c | 11 ---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/diff.c b/diff.c
index 6fd288420b..c6597e3231 100644
--- a/diff.c
+++ b/diff.c
@@ -5524,7 +5524,7 @@ void diff_warn_rename_limit(const char *varname, int 
needed, int degraded_cc)
warning(_(rename_limit_warning));
else
return;
-   if (0 < needed && needed < 32767)
+   if (0 < needed)
warning(_(rename_limit_advice), varname, needed);
 }
 
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 0d8c3d2ee4..7f9a463f5a 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -391,14 +391,10 @@ static int too_many_rename_candidates(int num_create,
 * growing larger than a "rename_limit" square matrix, ie:
 *
 *num_create * num_src > rename_limit * rename_limit
-*
-* but handles the potential overflow case specially (and we
-* assume at least 32-bit integers)
 */
-   if (rename_limit <= 0 || rename_limit > 32767)
-   rename_limit = 32767;
if ((num_create <= rename_limit || num_src <= rename_limit) &&
-   (num_create * num_src <= rename_limit * rename_limit))
+   ((double)num_create * (double)num_src
+<= (double)rename_limit * (double)rename_limit))
return 0;
 
options->needed_rename_limit =
@@ -415,7 +411,8 @@ static int too_many_rename_candidates(int num_create,
num_src++;
}
if ((num_create <= rename_limit || num_src <= rename_limit) &&
-   (num_create * num_src <= rename_limit * rename_limit))
+   ((double)num_create * (double)num_src
+<= (double)rename_limit * (double)rename_limit))
return 2;
return 1;
 }
-- 
2.15.0.5.g9567be9905

[PATCH 1/4] sequencer: Warn when internal merge may be suboptimal due to renameLimit

2017-11-10 Thread Elijah Newren

Having renamed files silently treated as deleted/modified conflicts
instead of correctly resolving the renamed/modified case has caused lots
of pain for some users.  Eventually, some of them figured out to set
merge.renameLimit to something higher, but without feedback about what
value it should have, they were just repeatedly guessing and retrying.

Signed-off-by: Elijah Newren 
---
 sequencer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sequencer.c b/sequencer.c
index f2a10cc4f2..2b4cecb617 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -462,6 +462,7 @@ static int do_recursive_merge(struct commit *base, struct 
commit *next,
if (is_rebase_i(opts) && clean <= 0)
fputs(o.obuf.buf, stdout);
strbuf_release(&o.obuf);
+   diff_warn_rename_limit("merge.renamelimit", o.needed_rename_limit, 0);
if (clean < 0)
return clean;
 
-- 
2.15.0.5.g9567be9905

[PATCH 3/4] progress: Fix progress meters when dealing with lots of work

2017-11-10 Thread Elijah Newren

The possibility of setting merge.renameLimit beyond 2^16 raises the
possibility that the values passed to progress can exceed 2^32.  For my
usecase of interest, I only needed to pass a value a little over 2^31.  If
I were only interested in fixing my usecase, I could have simply changed
last_value from int to unsigned, and casted each of rename_dst_nr and
rename_src_nr (in merge-recursive.c) from int to unsigned just before
multiplying them.  However, as long as we're making changes to allow
larger progress meters, we may as well make a little more room in general.
Use uint64_t, because it "ought to be enough for anybody".  :-)

Signed-off-by: Elijah Newren 
---
 diffcore-rename.c |  4 ++--
 progress.c| 22 +++---
 progress.h|  8 
 3 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 7f9a463f5a..6ba6157c61 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -531,7 +531,7 @@ void diffcore_rename(struct diff_options *options)
if (options->show_rename_progress) {
progress = start_delayed_progress(
_("Performing inexact rename detection"),
-   rename_dst_nr * rename_src_nr);
+   (uint64_t)rename_dst_nr * 
(uint64_t)rename_src_nr);
}
 
mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_create), sizeof(*mx));
@@ -568,7 +568,7 @@ void diffcore_rename(struct diff_options *options)
diff_free_filespec_blob(two);
}
dst_cnt++;
-   display_progress(progress, (i+1)*rename_src_nr);
+   display_progress(progress, 
(uint64_t)(i+1)*(uint64_t)rename_src_nr);
}
stop_progress(&progress);
 
diff --git a/progress.c b/progress.c
index 289678d43d..7e4a2f9532 100644
--- a/progress.c
+++ b/progress.c
@@ -30,8 +30,8 @@ struct throughput {
 
 struct progress {
const char *title;
-   int last_value;
-   unsigned total;
+   uint64_t last_value;
+   uint64_t total;
unsigned last_percent;
unsigned delay;
unsigned delayed_percent_threshold;
@@ -79,7 +79,7 @@ static int is_foreground_fd(int fd)
return tpgrp < 0 || tpgrp == getpgid(0);
 }
 
-static int display(struct progress *progress, unsigned n, const char *done)
+static int display(struct progress *progress, uint64_t n, const char *done)
 {
const char *eol, *tp;
 
@@ -106,7 +106,7 @@ static int display(struct progress *progress, unsigned n, 
const char *done)
if (percent != progress->last_percent || progress_update) {
progress->last_percent = percent;
if (is_foreground_fd(fileno(stderr)) || done) {
-   fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
+   fprintf(stderr, "%s: %3u%% (%lu/%lu)%s%s",
progress->title, percent, n,
progress->total, tp, eol);
fflush(stderr);
@@ -116,7 +116,7 @@ static int display(struct progress *progress, unsigned n, 
const char *done)
}
} else if (progress_update) {
if (is_foreground_fd(fileno(stderr)) || done) {
-   fprintf(stderr, "%s: %u%s%s",
+   fprintf(stderr, "%s: %lu%s%s",
progress->title, n, tp, eol);
fflush(stderr);
}
@@ -127,7 +127,7 @@ static int display(struct progress *progress, unsigned n, 
const char *done)
return 0;
 }
 
-static void throughput_string(struct strbuf *buf, off_t total,
+static void throughput_string(struct strbuf *buf, uint64_t total,
  unsigned int rate)
 {
strbuf_reset(buf);
@@ -138,7 +138,7 @@ static void throughput_string(struct strbuf *buf, off_t 
total,
strbuf_addstr(buf, "/s");
 }
 
-void display_throughput(struct progress *progress, off_t total)
+void display_throughput(struct progress *progress, uint64_t total)
 {
struct throughput *tp;
uint64_t now_ns;
@@ -200,12 +200,12 @@ void display_throughput(struct progress *progress, off_t 
total)
display(progress, progress->last_value, NULL);
 }
 
-int display_progress(struct progress *progress, unsigned n)
+int display_progress(struct progress *progress, uint64_t n)
 {
return progress ? display(progress, n, NULL) : 0;
 }
 
-static struct progress *start_progress_delay(const char *title, unsigned total,
+static struct progress *start_progress_delay(const char *title, uint64_t total,
 unsigned percent_threshold, 
unsigned delay)
 {
struct progress *progress = malloc(sizeof(*progress));
@@ -227,12 +227,12 @@ static struct progress *start_progress_delay(const char 
*title, unsigned total,

[PATCH 4/4] sequencer: Show rename progress during cherry picks

2017-11-10 Thread Elijah Newren

When trying to cherry-pick a change that has lots of renames, it is
somewhat unsettling to wait a really long time without any feedback.

Signed-off-by: Elijah Newren 
---
 sequencer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sequencer.c b/sequencer.c
index 2b4cecb617..247d93f363 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -448,6 +448,7 @@ static int do_recursive_merge(struct commit *base, struct 
commit *next,
o.branch2 = next ? next_label : "(empty tree)";
if (is_rebase_i(opts))
o.buffer_output = 2;
+   o.show_rename_progress = 1;
 
head_tree = parse_tree_indirect(head);
next_tree = next ? next->tree : empty_tree();
-- 
2.15.0.5.g9567be9905

Re: [PATCH v4] doc/SubmittingPatches: correct subject guidance

2017-11-10 Thread Josh Triplett

On Fri, Nov 10, 2017 at 03:02:50PM +, Adam Dinwoodie wrote:
> The examples and common practice for adding markers such as "RFC" or
> "v2" to the subject of patch emails is to have them within the same
> brackets as the "PATCH" text, not after the closing bracket.  Further,
> the practice of `git format-patch` and the like, as well as what appears
> to be the more common pratice on the mailing list, is to use "[RFC
> PATCH]", not "[PATCH/RFC]".
> 
> Update the SubmittingPatches article to match and to reference the
> `format-patch` helper arguments, and also make some minor text
> clarifications in the area.
> 
> Signed-off-by: Adam Dinwoodie 
> Helped-by: Eric Sunshine 

This looks great! Thank you for updating this documentation.

Reviewed-by: Josh Triplett 

> ---
> 
> Notes:
> Changes since v3:
> - Clarified meaning of "RFC" per Eric's suggestion
> - Made the impact of --subject-prefix and friends clearer per Eric's
>   suggestion
> 
> Thank you for your nitpicking, Eric, it's useful and very much
> appreciated :)
> 
>  Documentation/SubmittingPatches | 19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
> index 558d465b6..89f239071 100644
> --- a/Documentation/SubmittingPatches
> +++ b/Documentation/SubmittingPatches
> @@ -184,21 +184,26 @@ lose tabs that way if you are not careful.
>  
>  It is a common convention to prefix your subject line with
>  [PATCH].  This lets people easily distinguish patches from other
> -e-mail discussions.  Use of additional markers after PATCH and
> -the closing bracket to mark the nature of the patch is also
> -encouraged.  E.g. [PATCH/RFC] is often used when the patch is
> -not ready to be applied but it is for discussion, [PATCH v2],
> -[PATCH v3] etc. are often seen when you are sending an update to
> -what you have previously sent.
> +e-mail discussions.  Use of markers in addition to PATCH within
> +the brackets to describe the nature of the patch is also
> +encouraged.  E.g. [RFC PATCH] (where RFC stands for "request for
> +comments") is often used to indicate a patch needs further
> +discussion before being accepted, [PATCH v2], [PATCH v3] etc.
> +are often seen when you are sending an update to what you have
> +previously sent.
>  
> -"git format-patch" command follows the best current practice to
> +The "git format-patch" command follows the best current practice to
>  format the body of an e-mail message.  At the beginning of the
>  patch should come your commit message, ending with the
>  Signed-off-by: lines, and a line that consists of three dashes,
>  followed by the diffstat information and the patch itself.  If
>  you are forwarding a patch from somebody else, optionally, at
>  the beginning of the e-mail message just before the commit
>  message starts, you can put a "From: " line to name that person.
> +To change the default "[PATCH]" in the subject to "[]", use
> +`git format-patch --subject-prefix=`.  As a shortcut, you
> +can use `--rfc` instead of `--subject-prefix="RFC PATCH"`, or
> +`-v ` instead of `--subject-prefix="PATCH v"`.
>  
>  You often want to add additional explanation about the patch,
>  other than the commit message itself.  Place such "cover letter"
> -- 
> 2.14.3
>

Re: [Query] Separate hooks for Git worktrees

2017-11-10 Thread Stefan Beller

On Thu, Nov 9, 2017 at 9:00 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> We have no worktree specific config yet, though patches for
>> this were floated on the mailing list.
>>
>> Though recent versions of git learned to conditionally include
>> config files. (look for includeIf in man git-config), which I think
>> could be used to set the option gerrit.createChangeId  depending
>> on the worktree you are in.
>>
>>> Any idea how I can get around this problem without having separate
>>> repositories for kernel and android ?
>>
>> The proposed approach above might be hacky but sounds as if
>> it should work?
>
> If you meant "conditional include" by "proposed approach above", I
> do not see which part you found possibly hacky.

Compared to a per-worktree configuration that you can setup via

git config --for-worktree=X key value

the setup using conditional includes seems hacky to me.
(I just realize that these conditional includes can be set using
regular git-config, so maybe it is not as hacky as I thought.)

>  It is to allow
> different set of configurations to apply depending on where you are
> working at, which I think was invented exactly for something like
> this.

>From a UX perspective, I can imagine a way easier workflow,
but the data model seems to make sense.

> It certainly is not any hackier than using the same repository to
> house separately manged projects even if they may be related
> projects.

Well it is the same project with different upstream workflows.
For example I would imagine that Viresh wants to cherry-pick
from one branch to another, or even send the same patch
(just with different commit messages, with or without the
ChangeId) to the different upstreams?

> Where does the aversion of "having separate repositories" primarily
> come from?  Is it bad due to disk consumption?  Is it bad because
> you cannot do "git diff android-branch kernel-branch"?  Something
> else?

Yeah, that is an interesting question!
(I suspect workflow related things, diff/cherry-pick)

> If it is purely disk consumption that is an issue, perhaps the real
> solution is to make it easier to maintain separate repositories
> while sharing as much disk space as possible.  GC may have to be
> made a lot more robust in the presense of alternate object stores,
> for example.

Re: [PATCH v1 5/8] sequencer: don't die in print_commit_summary()

2017-11-10 Thread Junio C Hamano

Phillip Wood  writes:

> On 07/11/17 15:13, Junio C Hamano wrote:
> ...
>> Another possibility perhaps is that the function is safe to reuse
>> already even without this patch, of course ;-).
>> 
> Hmm, maybe it is. Looking at pick_commits() and do_pick_commit() if the
> sequencer dies in print_commit_summary() (which can only happen when
> cherry-picking or reverting) then neither the todo list or the abort
> safety file are updated to reflect the commit that was just made.
> 
> As I understand it print_commit_summary() dies because: (i) it cannot
> resolve HEAD either because some other process is updating it (which is
> bad news in the middle of a cherry-pick); (ii) because something went
> wrong HEAD is corrupt; or (iii) log_tree_commit() cannot read some
> objects. In all those cases dying will leave the sequencer in a sane
> state for aborting - 'git cherry-pick --abort' will rewind HEAD to the
> last successful commit before there was a problem with HEAD or the
> object database. If the user somehow fixes the problem and runs 'git
> cherry-pick --continue' then the sequencer will try and pick the same
> commit again which may or may not be what the user wants depending on
> what caused print_commit_summary() to die.

The above is all good analysis---thanks for your diligence.  Perhaps
some if not all of it can go to the log message?

Re: is there a stylistic preference for a trailing "--" on a command?

2017-11-10 Thread Stefan Beller

On Fri, Nov 10, 2017 at 5:57 AM, Robert P. J. Day  wrote:
>
>   just noticed these examples in "man git-bisect":
>
> EXAMPLES
>   $ git bisect start HEAD v1.2 --  # HEAD is bad, v1.2 is good
>   ...
>   $ git bisect start HEAD origin --# HEAD is bad, origin is good
>   ...
>   $ git bisect start HEAD HEAD~10 --   # culprit is among the last 10
>
> is there some rationale or stylistic significance to those trailing
> "--" on those commands? i assume they have no effect, just curious as
> to why they're there.

By having the -- there, it is clear that the strings are ref specs and not files
of such a name. (Who would want to store a file named HEAD~10 in their
repo?)

[PATCH 0/4] Fix issues with rename detection limits

2017-11-10 Thread Elijah Newren

In a repo with around 60k files with deep directory hierarchies (due to
being predominantly java code) and which had a few high level directories
moved around (making it appear to git as dozens of thousands of individual
file renames), attempts to cherry-pick commits across product versions
resulted in non-sensical delete/modify conflicts (rather than
rename/modify as expected).  We had a few teams who were in the unenviable
position of having to backport hundreds or thousands of commits across
such product versions, and the result was months of pain, which could have
been alleviated were it not for a few small git bugs:

  * When you try to cherry-pick changes, unlike when you merge two
branches, git will not notify you when you need to set a higher
merge.renameLimit.

  * If you know git internals well enough, you can try to trick git into
telling you the correct renameLimit by doing a merge instead of a
cherry-pick.  If you do that, and have a renameLimit that is too
small, git will let you know the value you need.  You can then undo
the merge and proceed with the cherry-pick.  Except that...

  * If you are performing a merge and specify a large renameLimit that
isn't large enough, and the necessary renameLimit you need is greater
than 32767, then git tells you nothing, leading you to believe that
the limit you specified is high enough, but only to watch it still
mess up the merge badly.

  * If you happen to specify a merge.renameLimit that *is* high enough,
but just happens to be greater than 32767, then git will silently
pretend you specified 32767, determine that 32767 is not high enough,
not tell you anything about it's silent clamping, and then proceed to
mess up the merge or cherry-pick badly.  Not only do you get no
feedback, the clamping to 32767 isn't documented anywhere even if you
tried to read every manual page.

Folks did discover the merge.renameLimit and tried setting it to various
values, spread over the spectrum from 1000 (the default) up to 9
(or maybe more, that's just the highest number I heard), completely
unaware that most their attempts were ignored and wondering why git
couldn't handle things and couldn't explain why either.

Eventually folks wrote scripts that would run the output of format-patch
through a bunch of sed commands to pretend patches were against the
filename on the other side of history and then pipe back through git-am.
Such scripts grew as more and more rename rules were added.

I eventually learned of one of these scripts and said something close to,
"You don't need these pile of rename rules; just set merge.renameLimit to
something higher."  When they responded that merge.renameLimit didn't
work, I didn't believe them.  This patch series, along with two others
that I will be sending shortly, represent my attempt to continue to not
believe them.  :-)

Elijah Newren (4):
  sequencer: Warn when internal merge may be suboptimal due to
renameLimit
  Remove silent clamp of renameLimit
  progress: Fix progress meters when dealing with lots of work
  sequencer: Show rename progress during cherry picks

 diff.c|  2 +-
 diffcore-rename.c | 15 ++-
 progress.c| 22 +++---
 progress.h|  8 
 sequencer.c   |  2 ++
 5 files changed, 24 insertions(+), 25 deletions(-)

-- 
2.15.0.5.g9567be9905

[PATCH 0/4] Fix issues with rename detection limits

2017-11-10 Thread Elijah Newren

In a repo with around 60k files with deep directory hierarchies (due to
being predominantly java code) and which had a few high level directories
moved around (making it appear to git as dozens of thousands of individual
file renames), attempts to cherry-pick commits across product versions
resulted in non-sensical delete/modify conflicts (rather than
rename/modify as expected).  We had a few teams who were in the unenviable
position of having to backport hundreds or thousands of commits across
such product versions, and the result was months of pain, which could have
been alleviated were it not for a few small git bugs:

  * When you try to cherry-pick changes, unlike when you merge two
branches, git will not notify you when you need to set a higher
merge.renameLimit.

  * If you know git internals well enough, you can try to trick git into
telling you the correct renameLimit by doing a merge instead of a
cherry-pick.  If you do that, and have a renameLimit that is too
small, git will let you know the value you need.  You can then undo
the merge and proceed with the cherry-pick.  Except that...

  * If you are performing a merge and specify a large renameLimit that
isn't large enough, and the necessary renameLimit you need is greater
than 32767, then git tells you nothing, leading you to believe that
the limit you specified is high enough, but only to watch it still
mess up the merge badly.

  * If you happen to specify a merge.renameLimit that *is* high enough,
but just happens to be greater than 32767, then git will silently
pretend you specified 32767, determine that 32767 is not high enough,
not tell you anything about it's silent clamping, and then proceed to
mess up the merge or cherry-pick badly.  Not only do you get no
feedback, the clamping to 32767 isn't documented anywhere even if you
tried to read every manual page.

Folks did discover the merge.renameLimit and tried setting it to various
values, spread over the spectrum from 1000 (the default) up to 9
(or maybe more, that's just the highest number I heard), completely
unaware that most their attempts were ignored and wondering why git
couldn't handle things and couldn't explain why either.

Eventually folks wrote scripts that would run the output of format-patch
through a bunch of sed commands to pretend patches were against the
filename on the other side of history and then pipe back through git-am.
Such scripts grew as more and more rename rules were added.

I eventually learned of one of these scripts and said something close to,
"You don't need these pile of rename rules; just set merge.renameLimit to
something higher."  When they responded that merge.renameLimit didn't
work, I didn't believe them.  This patch series, along with two others
that I will be sending shortly, represent my attempt to continue to not
believe them.  :-)

Elijah Newren (4):
  sequencer: Warn when internal merge may be suboptimal due to
renameLimit
  Remove silent clamp of renameLimit
  progress: Fix progress meters when dealing with lots of work
  sequencer: Show rename progress during cherry picks

 diff.c|  2 +-
 diffcore-rename.c | 15 ++-
 progress.c| 22 +++---
 progress.h|  8 
 sequencer.c   |  2 ++
 5 files changed, 24 insertions(+), 25 deletions(-)

-- 
2.15.0.5.g9567be9905

Re: Bug - Status - Space in Filename

2017-11-10 Thread Junio C Hamano

Jeff King  writes:

> Are there callers who don't just print the result? If not, we could just
> always emit. That's slightly more efficient since it drops the expensive
> strbuf_insert (though there are already so many copies going on in
> quote_path_relative that it hardly matters). But it also drops the need
> for the caller to know about the strbuf at all.

I am fine by that, too.  Consider that I'm still suffering from the
trauma caused by some patches (not in this area but others) that
changed output we have long been directly emiting to stdio to
instead capture to strings ;-)

This might also be a good bite-sized material for the weekend thing
;-)

Re: [RFC] cover-at-tip

2017-11-10 Thread Junio C Hamano

Nicolas Morey-Chaisemartin  writes:

> I would need to add "some" level of parsing to am.c to make sure
> the patch content is just garbage and that there are no actual
> hunks for that.
>
> I did not find any public API that would allow me to do that,
> although apply_path/parse_chunk would fit the bill.  Is that the
> right way to approach this ?

I do not think you would want this non-patch cruft seen at the apply
layer at all.  Reading a mailbox, with the help of mailsplit and
mailinfo, and being the driver to create a series of commits is what
"am" is about, and it would have to notice that the non-patch cruft
at the beginning is not a patch at all and defer creation of an
empty commit with that cover material at the end.  For each of the
other messages in the series that has patches, it will need to call
apply to update the index and the working tree so that it can make a
commit, but there is NO reason whatsoever to ask help from apply, whose
sole purpose is to read a patch and make modifications to the index
and the working tree, to handle the cover material.

Re: [PATCH 2/4] Remove silent clamp of renameLimit

2017-11-10 Thread Stefan Beller

On Fri, Nov 10, 2017 at 9:39 AM, Elijah Newren  wrote:
> In commit 0024a5492 (Fix the rename detection limit checking; 2007-09-14),
> the renameLimit was clamped to 32767.  This appears to have been to simply
> avoid integer overflow in the following computation:
>
>num_create * num_src <= rename_limit * rename_limit
>
> although it also could be viewed as a hardcoded bound on the amount of CPU
> time we're willing to allow users to tell git to spend on handling
> renames.  An upper bound may make sense, particularly as the computation
> is O(rename_limit^2), but only if the bound is documented and communicated
> to the user -- neither of which were true.
>
> In fact, the silent clamping of the renameLimit to a smaller value and
> lack of reporting of the needed renameLimit when it was too large made it
> appear to the user as though they had used a high enough value; however,
> git would proceed to mess up the merge or cherry-pick badly based on the
> lack of rename detection.  Some hardy folks, despite the lack of feedback
> on the correct limit to choose, were desperate enough to repeatedly retry
> their cherry-picks with increasingly larger renameLimit values (going
> orders of magnitude beyond the built-in limit of 32767), but were
> consistently met with the same failure.
>
> Although large limits can make things slow, we have users who would be
> ecstatic to have a small five file change be correctly cherry picked even
> if they have to manually specify a large limit and it took git ten minutes
> to compute it.
>
> Signed-off-by: Elijah Newren 
> ---
>  diff.c|  2 +-
>  diffcore-rename.c | 11 ---
>  2 files changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index 6fd288420b..c6597e3231 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -5524,7 +5524,7 @@ void diff_warn_rename_limit(const char *varname, int 
> needed, int degraded_cc)
> warning(_(rename_limit_warning));
> else
> return;
> -   if (0 < needed && needed < 32767)
> +   if (0 < needed)
> warning(_(rename_limit_advice), varname, needed);
>  }
>
> diff --git a/diffcore-rename.c b/diffcore-rename.c
> index 0d8c3d2ee4..7f9a463f5a 100644
> --- a/diffcore-rename.c
> +++ b/diffcore-rename.c
> @@ -391,14 +391,10 @@ static int too_many_rename_candidates(int num_create,
>  * growing larger than a "rename_limit" square matrix, ie:
>  *
>  *num_create * num_src > rename_limit * rename_limit
> -*
> -* but handles the potential overflow case specially (and we
> -* assume at least 32-bit integers)
>  */
> -   if (rename_limit <= 0 || rename_limit > 32767)
> -   rename_limit = 32767;
> if ((num_create <= rename_limit || num_src <= rename_limit) &&
> -   (num_create * num_src <= rename_limit * rename_limit))
> +   ((double)num_create * (double)num_src
> +<= (double)rename_limit * (double)rename_limit))
> return 0;

>From a technical perspective, I would think that if
(num_create <= rename_limit || num_src <= rename_limit)
holds true, that the double-cast condition would also be always true?
Could we just remove that last check?

Or phrased differently, if we can cast to double and extend the check
here, do we have to adapt code at other places as well?

>
> options->needed_rename_limit =
> @@ -415,7 +411,8 @@ static int too_many_rename_candidates(int num_create,
> num_src++;
> }
> if ((num_create <= rename_limit || num_src <= rename_limit) &&
> -   (num_create * num_src <= rename_limit * rename_limit))
> +   ((double)num_create * (double)num_src
> +<= (double)rename_limit * (double)rename_limit))
> return 2;
> return 1;
>  }
> --
> 2.15.0.5.g9567be9905
>

Re: [PATCH 0/4] Fix issues with rename detection limits

2017-11-10 Thread Elijah Newren

On Fri, Nov 10, 2017 at 10:13 AM, Elijah Newren  wrote:
> In a repo with around 60k files with deep directory hierarchies (due to

> Elijah Newren (4):
>   sequencer: Warn when internal merge may be suboptimal due to
> renameLimit
>   Remove silent clamp of renameLimit
>   progress: Fix progress meters when dealing with lots of work
>   sequencer: Show rename progress during cherry picks

Sorry for the duplicate send.

I noticed the cover letter didn't appear on in the email list,
double-checked https://public-inbox.org/git/, waited over half an hour
and double checked both email and public-inbox again and still didn't
see it, so I re-sent.  Just as soon as I did, it seems that the
original and the new cover letter emails suddenly showed up right
then.  *shrug*.

Re: [RFC] cover-at-tip

2017-11-10 Thread Jonathan Tan

On Fri, 10 Nov 2017 16:37:49 +0100
Nicolas Morey-Chaisemartin  wrote:

> > Hi,
> >
> > I'm starting to look into the cover-at-tip topic that I found in the 
> > leftover bits (http://www.spinics.net/lists/git/msg259573.html)

Thanks - I personally would find this very useful.

> > Here's a first draft of a patch that adds support for format-patch 
> > --cover-at-tip. It compiles and works in my nice and user firnedly test 
> > case.
> > Just wanted to make sure I was going roughly in the right direction here.
> >
> >
> > I was wondering where is the right place to put a commit_is_cover_at_tip() 
> > as the test will be needed in other place as the feature is extended to git 
> > am/merge/pull.

I think you can put this in (root)/commit.c, especially since that test
operates on a "struct commit *".

> > Feel free to comment. I know the help is not clear at this point and 
> > there's still some work to do on option handling (add a config option, 
> > probably have --cover-at-tip imply --cover-letter, etc) and
> > some testing :)

Both are good ideas. You should probably use a
--cover-letter={no,auto,yes} instead of the current boolean, so that the
config can use the same options and configuring it to "auto" (to use a
cover letter if the tip is empty and singly-parented, and not to use a
cover letter otherwise) is meaningful.

> The proposed patch for format-patch does not output any "---" to signal the 
> end of the commit log and the begining of the patch in the cover letter.
> This means that the log summary, the diffstat and the git footer ( --\n version>) is seen as part of the commit log. Which is just wrong.
> 
> Removing them would solve the issue but I feel they bring some useful info 
> (or they would not be here).
> Adding a "---" between the commit log and those added infos poses another 
> problem: git am does not see an empty patch anymore.
> I would need to add "some" level of parsing to am.c to make sure the patch 
> content is just garbage and that there are no actual hunks for that.

Could you just take the message from the commit and put that in the
cover letter? The summary and diffstat do normally have useful info, but
if the commit is specifically made to be used only for the cover letter,
I think that is no longer true.

Re: [PATCH 2/4] Remove silent clamp of renameLimit

2017-11-10 Thread Elijah Newren

Thanks for taking a look!

On Fri, Nov 10, 2017 at 10:26 AM, Stefan Beller  wrote:

>> -   if (rename_limit <= 0 || rename_limit > 32767)
>> -   rename_limit = 32767;
>> if ((num_create <= rename_limit || num_src <= rename_limit) &&
>> -   (num_create * num_src <= rename_limit * rename_limit))
>> +   ((double)num_create * (double)num_src
>> +<= (double)rename_limit * (double)rename_limit))
>> return 0;
>
> From a technical perspective, I would think that if
> (num_create <= rename_limit || num_src <= rename_limit)
> holds true, that the double-cast condition would also be always true?
> Could we just remove that last check?

Not necessarily.  For example, if num_create = rename_limit-1 and
num_src = rename_limit+2, then the first condition will be satisfied
but the second won't.  If it was && rather than ||, then the second
condition would be superfluous.

> Or phrased differently, if we can cast to double and extend the check
> here, do we have to adapt code at other places as well?

Good point, and yes.  Perhaps I should have re-ordered my patch series
because I came back to it later and realized that the progress code
was broken due to overflow/wraparound, and a patch 3 fixed that.

Further, the later patch used uint64_t instead of double.  While
double works, perhaps I should change the double here to uint64_t for
consistency?

Re: [PATCH v2 3/9] Add a function to update HEAD after creating a commit

2017-11-10 Thread Junio C Hamano

Phillip Wood  writes:

> From: Phillip Wood 
>
> Add update_head() based on the code that updates HEAD after committing
> in builtin/commit.c that can be called by 'git commit' and other
> commands.
>
> Signed-off-by: Phillip Wood 
> ---
>
> Notes:
> changes since v1:
>  - rename update_head() to update_head_with_reflog()

The above proposed log message still calls it with the old name
though.

Re: [PATCH v2 2/9] commit: move empty message checks to libgit

2017-11-10 Thread Ramsay Jones



On 10/11/17 11:09, Phillip Wood wrote:
> From: Phillip Wood 
> 
> Move the functions that check for empty messages from bulitin/commit.c
> to sequencer.c so they can be shared with other commands. The
> functions are refactored to take an explicit cleanup mode and template
> filename passed by the caller.
> 
> Signed-off-by: Phillip Wood 
> ---
> 
> Notes:
> changes since v1:
>  - prefix cleanup_mode enum and constants with commit_msg_
> 
>  builtin/commit.c | 99 
> +++-
>  sequencer.c  | 61 ++
>  sequencer.h  | 11 +++
>  3 files changed, 91 insertions(+), 80 deletions(-)
> 

Just an idle thought - why are these functions moving to
sequencer.[ch] rather than commit[.ch]?

Similar comments for other patches in the series which moves
code from builtin/commit.c to sequencer.[ch].

ATB,
Ramsay Jones

Re: [PATCH v3 0/8] Create Git/Packet.pm

2017-11-10 Thread Junio C Hamano

Christian Couder  writes:

> There are only a few small changes compared to v2:

Please go incremental as v2 is already in 'next', and small changes
are easier to reiew and understand when given as follow-up "polish
this minor glitch in the initial effort" patches.

Thanks.

[PATCH 01/30] Tighten and correct a few testcases for merging and cherry-picking

2017-11-10 Thread Elijah Newren

t3501 had a testcase originally added to ensure cherry-pick wouldn't
segfault when working with a dirty file involved in a rename.  While
the segfault was fixed, there was another problem this test demonstrated:
namely, that git would overwrite a dirty file involved in a rename.
Further, the test encoded a "successful merge" and overwriting of this
file as correct behavior.  Modify the test so that it would still catch
the segfault, but to require the correct behavior.

t7607 had a test specific to looking for a merge overwriting a dirty file
involved in a rename, but it too actually encoded what I would term
incorrect behavior: it expected the merge to succeed.  Fix that, and add
a few more checks to make sure that the merge really does produce the
expected results.

Signed-off-by: Elijah Newren 
---
 t/t3501-revert-cherry-pick.sh | 7 +--
 t/t7607-merge-overwrite.sh| 5 -
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/t/t3501-revert-cherry-pick.sh b/t/t3501-revert-cherry-pick.sh
index 4f2a263b63..783bdbf59d 100755
--- a/t/t3501-revert-cherry-pick.sh
+++ b/t/t3501-revert-cherry-pick.sh
@@ -141,7 +141,7 @@ test_expect_success 'cherry-pick "-" works with arguments' '
test_cmp expect actual
 '
 
-test_expect_success 'cherry-pick works with dirty renamed file' '
+test_expect_failure 'cherry-pick works with dirty renamed file' '
test_commit to-rename &&
git checkout -b unrelated &&
test_commit unrelated &&
@@ -150,7 +150,10 @@ test_expect_success 'cherry-pick works with dirty renamed 
file' '
test_tick &&
git commit -m renamed &&
echo modified >renamed &&
-   git cherry-pick refs/heads/unrelated
+   test_must_fail git cherry-pick refs/heads/unrelated >out &&
+   test_i18ngrep "Refusing to lose dirty file at renamed" out &&
+   test $(git rev-parse :0:renamed) = $(git rev-parse HEAD^:to-rename.t) &&
+   grep -q "^modified$" renamed
 '
 
 test_done
diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
index 9444d6a9b9..00617dadf8 100755
--- a/t/t7607-merge-overwrite.sh
+++ b/t/t7607-merge-overwrite.sh
@@ -97,7 +97,10 @@ test_expect_failure 'will not overwrite unstaged changes in 
renamed file' '
git mv c1.c other.c &&
git commit -m rename &&
cp important other.c &&
-   git merge c1a &&
+   test_must_fail git merge c1a >out &&
+   test_i18ngrep "Refusing to lose dirty file at other.c" out &&
+   test -f other.c~HEAD &&
+   test $(git hash-object other.c~HEAD) = $(git rev-parse c1a:c1.c) &&
test_cmp important other.c
 '
 
-- 
2.15.0.5.g9567be9905

[PATCH 00/30] Add directory rename detection to git

2017-11-10 Thread Elijah Newren

[This series is entirely independent of my rename detection limits series.
However, I have a separate rename detection performance series that depends
on both this series and the rename detection limits series.]

In this patchset, I introduce directory rename detection to merge-recursive,
predominantly so that when files are added to directories on one side of
history and those directories are renamed on the other side of history, the
files will end up in the proper location after a merge or cherry-pick.

However, this isn't limited to that simplistic case.  More interesting
possibilities exist, such as:

  * a file being renamed into a directory which is renamed on the other
side of history, causing the need for a transitive rename.

  * two (or three or N) directories being merged (with no conflicts so
long as files/directories within the merged directory have different
names), and the "merging" being detected as a directory rename for
each original directory.

  * not all files in a directory being renamed to the same location;
i.e. perhaps the directory was renamed, but some files within it were
renamed to a different location

  * a directory being renamed, which also contained a subdirectory that
was renamed to some entirely different location.  (And perhaps the
inner directory itself contained inner directories that were renamed
to yet other locations).

Also, I found it useful to allow all files within the directory being
renamed to themselves be renamed and still detect the directory rename.
For example, if goal/a and goal/b are renamed to priority/alpha and
priority/bravo, we can detect that goal/ was renamed to priority/, so that
if someone adds goal/c on the other side of history, after the merge we'll
end up with priority/c.  (In the absence of a readily available
libmindread.so library that I can link to, we can't rename directly from
goal/c to priority/charlie automatically, and will need to have priority/c
suffice.)

Naturally, an attempt to do all of the above brings up all kinds of
interesting edge and corner cases, some of which result in conflicts
that cannot be represented in the index, and others of which might be
considered too complex for users to understand and resolve.  For
example:

  * An add/add/add/.../add conflict, all on one side of history (see
testcase 9e in the new t6043, or any of the testcases in section 5)

  * Doubly, triply, or N-fold transitive renames (testcases 9c & 9d)

In order to prevent such problems, I introduce a couple basic rules that
limit when directory rename detection applies:

  1) If a subset of to-be-renamed files have a file or directory in the
 way (or would be in the way of each other), "turn off" the directory
 rename for those specific sub-paths and report the conflict to the
 user.

  2) If the other side of history did a directory rename to a path that
 your side of history renamed away, then ignore that particular
 rename from the other side of history for any implicit directory
 renames (but warn the user).

Further, there's a basic question about when directory rename detection
should be applied at all.  I have a simple rule:

  3) If a given directory still exists on both sides of a merge, we do
 not consider it to have been renamed.

Rule 3 may sound obvious at first, but it will probably arise as a
question for some users -- what if someone "mostly" moved a directory but
still left some files around, or, equivalently (from the perspective of the
three-way merge that merge-recursive performs), fully renamed a directory
in one commmit and then recreated that directory in a later commit adding
some new files and then tried to merge?  See the big comment in section 4
of the new t6043 for further discussion of this rule.

This set of rules seems to be reasonably easy to explain, is
self-consistent, allows all conflict cases to be represented without
changing any on-disk data structures or introducing new terminology or
commands for users, prevents excessively complex conflicts that users
might struggle to understand, and brings peace to the middle east.
Actually, maybe not that last one.

While I feel that this directory rename detection reduces the number of
suboptimal merges and cherry-picks that git performs, there are sadly
still a number of cases that remain suboptimal, or that even newly appear
to be not-quite-consistent with other cases.  The fact that one file
layout might trigger some of the rules above while another "slightly"
different file layout doesn't might occasionally cause some user
grumblings.  I've tried to explore and document these cases in section 8
of the new t6043-merge-rename-directories.sh

Finally, from an implementation perspective, there's another strong
advantage to the ruleset above: it means that any path to which we want
to apply an implicit directory rename will have a free and open spot
for us to move it into.  Thus, we can just adjust the

[PATCH 08/30] directory rename detection: files/directories in the way of some renames

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 303 
 1 file changed, 303 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index ec054b210a..d15153c652 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -750,4 +750,307 @@ test_expect_success '4a-check: Directory split, with 
original directory still pr
 #   detection.)  But, sadly, see testcase 8b.
 ###
 
+
+###
+# SECTION 5: Files/directories in the way of subset of to-be-renamed paths
+#
+# Implicitly renaming files due to a detected directory rename could run
+# into problems if there are files or directories in the way of the paths
+# we want to rename.  Explore such cases in this section.
+###
+
+# Testcase 5a, Merge directories, other side adds files to original and target
+#   Commit A: z/{b,c},   y/d
+#   Commit B: z/{b,c,e_1,f}, y/{d,e_2}
+#   Commit C: y/{b,c,d}
+#   Expected: z/e_1, y/{b,c,d,e_2,f} + CONFLICT warning
+#   NOTE: While directory rename detection is active here causing z/f to
+# become y/f, we did not apply this for z/e_1 because that would
+# give us an add/add conflict for y/e_1 vs y/e_2.  This problem with
+# this add/add, is that both versions of y/e are from the same side
+# of history, giving us no way to represent this conflict in the
+# index.
+
+test_expect_success '5a-setup: Merge directories, other side adds files to 
original and target' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   echo e1 >z/e &&
+   echo f >z/f &&
+   echo e2 >y/e &&
+   git add z/e z/f y/e &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '5a-check: Merge directories, other side adds files to 
original and target' '
+   git checkout B^0 &&
+
+   test_must_fail git merge -s recursive C^0 >out &&
+
+   test 6 -eq $(git ls-files -s | wc -l) &&
+   test 0 -eq $(git ls-files -u | wc -l) &&
+   test 1 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse :0:y/d) = $(git rev-parse A:y/d) &&
+
+   test $(git rev-parse :0:y/e) = $(git rev-parse B:y/e) &&
+   test $(git rev-parse :0:z/e) = $(git rev-parse B:z/e) &&
+
+   test $(git rev-parse :0:y/f) = $(git rev-parse B:z/f) &&
+
+   test_i18ngrep "CONFLICT.*implicit dir rename" out
+'
+
+# Testcase 5b, Rename/delete in order to get add/add/add conflict
+#   (Related to testcase 8d; these may appear slightly inconsistent to users;
+#Also related to testcases 7d and 7e)
+#   Commit A: z/{b,c,d_1}
+#   Commit B: y/{b,c,d_2}
+#   Commit C: z/{b,c,d_1,e}, y/d_3
+#   Expected: y/{b,c,e}, CONFLICT(add/add: y/d_2 vs. y/d_3)
+#   NOTE: If z/d_1 in commit C were to be involved in dir rename detection, as
+# we normaly would since z/ is being renamed to y/, then this would be
+# a rename/delete (z/d_1 -> y/d_1 vs. deleted) AND an add/add/add
+# conflict of y/d_1 vs. y/d_2 vs. y/d_3.  Add/add/add is not
+# representable in the index, so the existence of y/d_3 needs to
+# cause us to bail on directory rename detection for that path, falling
+# back to git behavior without the directory rename detection.
+
+test_expect_success '5b-setup: Rename/delete in order to get add/add/add 
conflict' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d1 >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git rm z/d &&
+   git mv z y &&
+   echo d2 >y/d &&
+   git add y/d &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   mkdir y &&
+   echo d3 >y/d &&
+   echo e >z/e &&
+   git add y/d z/e &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '5b-check: Rename/delete in order to get add/add/add 
conflict' '
+   git checkout B^0 &&
+
+

[PATCH 16/30] merge-recursive: Introduce new functions to handle rename logic

2017-11-10 Thread Elijah Newren

The amount of logic in merge_trees() relative to renames was just a few
lines, but split it out into new handle_renames() and cleanup_renames()
functions to prepare for additional logic to be added to each.  No code
or logic changes, just a new place to put stuff for when the rename
detection gains additional checks.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 48 ++--
 1 file changed, 38 insertions(+), 10 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 382016508b..49710c0964 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1638,6 +1638,38 @@ static int process_renames(struct merge_options *o,
return clean_merge;
 }
 
+struct rename_info {
+   struct string_list *head_renames;
+   struct string_list *merge_renames;
+};
+
+static struct rename_info *handle_renames(struct merge_options *o,
+ struct tree *common,
+ struct tree *head,
+ struct tree *merge,
+ struct string_list *entries,
+ int *clean)
+{
+   struct rename_info *rei = xcalloc(1, sizeof(struct rename_info));
+
+   rei->head_renames  = get_renames(o, head, common, head, merge, entries);
+   rei->merge_renames = get_renames(o, merge, common, head, merge, 
entries);
+   *clean = process_renames(o, rei->head_renames, rei->merge_renames);
+
+   return rei;
+}
+
+static void cleanup_renames(struct rename_info *re_info)
+{
+   string_list_clear(re_info->head_renames, 0);
+   string_list_clear(re_info->merge_renames, 0);
+
+   free(re_info->head_renames);
+   free(re_info->merge_renames);
+
+   free(re_info);
+}
+
 static struct object_id *stage_oid(const struct object_id *oid, unsigned mode)
 {
return (is_null_oid(oid) || mode == 0) ? NULL: (struct object_id *)oid;
@@ -1989,7 +2021,8 @@ int merge_trees(struct merge_options *o,
}
 
if (unmerged_cache()) {
-   struct string_list *entries, *re_head, *re_merge;
+   struct string_list *entries;
+   struct rename_info *re_info;
int i;
/*
 * Only need the hashmap while processing entries, so
@@ -2003,9 +2036,7 @@ int merge_trees(struct merge_options *o,
get_files_dirs(o, merge);
 
entries = get_unmerged();
-   re_head  = get_renames(o, head, common, head, merge, entries);
-   re_merge = get_renames(o, merge, common, head, merge, entries);
-   clean = process_renames(o, re_head, re_merge);
+   re_info = handle_renames(o, common, head, merge, entries, 
&clean);
record_df_conflict_files(o, entries);
if (clean < 0)
goto cleanup;
@@ -2030,16 +2061,13 @@ int merge_trees(struct merge_options *o,
}
 
 cleanup:
-   string_list_clear(re_merge, 0);
-   string_list_clear(re_head, 0);
+   cleanup_renames(re_info);
+
string_list_clear(entries, 1);
+   free(entries);
 
hashmap_free(&o->current_file_dir_set, 1);
 
-   free(re_merge);
-   free(re_head);
-   free(entries);
-
if (clean < 0)
return clean;
}
-- 
2.15.0.5.g9567be9905

[PATCH 24/30] merge-recursive: Add computation of collisions due to dir rename & merging

2017-11-10 Thread Elijah Newren

directory renaming and merging can cause one or more files to be moved to
where an existing file is, or to cause several files to all be moved to
the same (otherwise vacant) location.  Add checking and reporting for such
cases, falling back to no-directory-rename handling for such paths.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 118 +-
 1 file changed, 116 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1858686c35..251c4cc7fa 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1408,6 +1408,30 @@ static int tree_has_path(struct tree *tree, const char 
*path)
   hashy, &mode_o);
 }
 
+/*
+ * Return a new string that replaces the beginning portion (which matches
+ * entry->dir), with entry->new_dir.  In perl-speak:
+ *   new_path_name = (old_path =~ s/entry->dir/entry->new_dir/);
+ */
+static char *apply_dir_rename(struct dir_rename_entry *entry,
+  const char *old_path) {
+   char *new_path;
+   int oldlen, newlen;
+
+   if (entry->non_unique_new_dir)
+   return NULL;
+
+   oldlen = strlen(entry->dir);
+   assert(strncmp(entry->dir, old_path, oldlen) == 0 &&
+  old_path[oldlen] == '/');
+   newlen = strlen(entry->new_dir) + (strlen(old_path) - oldlen) + 1;
+   new_path = malloc(newlen);
+   strcpy(new_path, entry->new_dir);
+   strcpy(&new_path[strlen(new_path)], &old_path[oldlen]);
+
+   return new_path;
+}
+
 static void get_renamed_dir_portion(const char *old_path, const char *new_path,
char **old_dir, char **new_dir) {
*old_dir = NULL;
@@ -1625,6 +1649,82 @@ static struct hashmap *get_directory_renames(struct 
diff_queue_struct *pairs,
return dir_renames;
 }
 
+static struct dir_rename_entry *check_dir_renamed(const char *path,
+ struct hashmap *dir_renames) {
+   char temp[PATH_MAX];
+   char *end;
+   struct dir_rename_entry *entry;
+
+   strcpy(temp, path);
+   while ((end = strrchr(temp, '/'))) {
+   *end = '\0';
+   entry = dir_rename_find_entry(dir_renames, temp);
+   if (entry)
+   return entry;
+   }
+   return NULL;
+}
+
+static void compute_collisions(struct hashmap *collisions,
+  struct hashmap *dir_renames,
+  struct diff_queue_struct *pairs)
+{
+   int i;
+
+   /*
+* Multiple files can be mapped to the same path due to directory
+* renames done by the other side of history.  Since that other
+* side of history could have merged multiple directories into one,
+* if our side of history added the same file basename to each of
+* those directories, then all N of them would get implicitly
+* renamed by the directory rename detection into the same path,
+* and we'd get an add/add/.../add conflict, and all those adds
+* from *this* side of history.  This is not representable in the
+* index, and users aren't going to easily be able to make sense of
+* it.  So we need to provide a good warning about what's
+* happening, and fall back to no-directory-rename detection
+* behavior for those paths.
+*
+* See testcases 9e and all of section 5 from t6043 for examples.
+*/
+   collision_init(collisions);
+
+   for (i = 0; i < pairs->nr; ++i) {
+   struct dir_rename_entry *dir_rename_ent;
+   struct collision_entry *collision_ent;
+   char *new_path;
+   struct diff_filepair *pair = pairs->queue[i];
+
+   if (pair->status == 'D')
+   continue;
+   dir_rename_ent = check_dir_renamed(pair->two->path, 
dir_renames);
+   if (!dir_rename_ent)
+   continue;
+
+   new_path = apply_dir_rename(dir_rename_ent, pair->two->path);
+   if (!new_path)
+   /*
+* dir_rename_ent->non_unique_new_path is true, which
+* means there is no directory rename for us to use,
+* which means it won't cause us any additional
+* collisions.
+*/
+   continue;
+   collision_ent = collision_find_entry(collisions, new_path);
+   if (!collision_ent) {
+   collision_ent = xcalloc(1,
+   sizeof(struct collision_entry));
+   hashmap_entry_init(collision_ent, strhash(new_path));
+   hashmap_put(collisions, collision_ent);
+   collision_ent->target_file = new_path;
+   } else {
+

[PATCH 23/30] merge-recursive: Add a new hashmap for storing file collisions

2017-11-10 Thread Elijah Newren

Directory renames with the ability to merge directories opens up the
possibility of add/add/add/.../add conflicts, if each of the N
directories being merged into one target directory all had a file with
the same name.  We need a way to check for and report on such
collisions; this hashmap will be used for this purpose.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 23 +++
 merge-recursive.h |  7 +++
 2 files changed, 30 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 3633be0123..1858686c35 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -73,6 +73,29 @@ static void dir_rename_init(struct hashmap *map)
hashmap_init(map, (hashmap_cmp_fn) dir_rename_cmp, NULL, 0);
 }
 
+static struct collision_entry *collision_find_entry(struct hashmap *hashmap,
+   char *target_file)
+{
+   struct collision_entry key;
+
+   hashmap_entry_init(&key, strhash(target_file));
+   key.target_file = target_file;
+   return hashmap_get(hashmap, &key, NULL);
+}
+
+static int collision_cmp(void *unused_cmp_data,
+const struct collision_entry *e1,
+const struct collision_entry *e2,
+const void *unused_keydata)
+{
+   return strcmp(e1->target_file, e2->target_file);
+}
+
+static void collision_init(struct hashmap *map)
+{
+   hashmap_init(map, (hashmap_cmp_fn) collision_cmp, NULL, 0);
+}
+
 static void flush_output(struct merge_options *o)
 {
if (o->buffer_output < 2 && o->obuf.len) {
diff --git a/merge-recursive.h b/merge-recursive.h
index a024949739..e02c1e1243 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -37,6 +37,13 @@ struct dir_rename_entry {
struct string_list possible_new_dirs;
 };
 
+struct collision_entry {
+   struct hashmap_entry ent; /* must be the first member! */
+   char *target_file;
+   struct string_list source_files;
+   unsigned reported_already:1;
+};
+
 /* merge_trees() but with recursive ancestor consolidation */
 int merge_recursive(struct merge_options *o,
struct commit *h1,
-- 
2.15.0.5.g9567be9905

[PATCH 11/30] directory rename detection: testcases exploring possibly suboptimal merges

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 371 
 1 file changed, 371 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 115d0d2622..bdfd943c88 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1683,4 +1683,375 @@ test_expect_failure '7e-check: transitive rename in 
rename/delete AND dirs in th
test $(git hash-object y/d~C^0) = $(git rev-parse A:x/d)
 '
 
+
+###
+# SECTION 8: Suboptimal merges
+#
+# As alluded to in the last section, the ruleset we have built up for
+# detecting directory renames unfortunately has some special cases where it
+# results in slightly suboptimal or non-intuitive behavior.  This section
+# explores these cases.
+#
+# To be fair, we already had non-intuitive or suboptimal behavior for most
+# of these cases in git before introducing implicit directory rename
+# detection, but it'd be nice if there was a modified ruleset out there
+# that handled these cases a bit better.
+###
+
+# Testcase 8a, Dual-directory rename, one into the others' way
+#   Commit A. x/{a,b},   y/{c,d}
+#   Commit B. x/{a,b,e}, y/{c,d,f}
+#   Commit C. y/{a,b},   z/{c,d}
+#
+# Possible Resolutions:
+#   Previous git: y/{a,b,f},   z/{c,d},   x/e
+#   Expected: y/{a,b,e,f}, z/{c,d}
+#   Preferred:y/{a,b,e},   z/{c,d,f}
+#
+# Note: Both x and y got renamed and it'd be nice to detect both, and we do
+# better with directory rename detection than git did previously, but the
+# simple rule from section 5 prevents me from handling this as optimally as
+# we potentially could.
+
+test_expect_success '8a-setup: Dual-directory rename, one into the others way' 
'
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir x &&
+   mkdir y &&
+   echo a >x/a &&
+   echo b >x/b &&
+   echo c >y/c &&
+   echo d >y/d &&
+   git add x y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   echo e >x/e &&
+   echo f >y/f &&
+   git add x/e y/f &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   git mv y z &&
+   git mv x y &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '8a-check: Dual-directory rename, one into the others way' 
'
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 6 -eq $(git ls-files -s | wc -l) &&
+   test 0 -eq $(git ls-files -u | wc -l) &&
+   test 0 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse HEAD:y/a) = $(git rev-parse A:x/a) &&
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:x/b) &&
+   test $(git rev-parse HEAD:y/e) = $(git rev-parse B:x/e) &&
+   test $(git rev-parse HEAD:y/f) = $(git rev-parse B:y/f) &&
+   test $(git rev-parse HEAD:z/c) = $(git rev-parse A:y/c) &&
+   test $(git rev-parse HEAD:z/d) = $(git rev-parse A:y/d)
+'
+
+# Testcase 8b, Dual-directory rename, one into the others' way, with 
conflicting filenames
+#   Commit A. x/{a_1,b_1}, y/{a_2,b_2}
+#   Commit B. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
+#   Commit C. y/{a_1,b_1}, z/{a_2,b_2}
+#
+# Possible Resolutions:
+#   Previous git: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
+#   Scary:y/{a_1,b_1}, z/{a_2,b_2}, CONFLICT(add/add, e_1 vs. e_2)
+#   Preferred:y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
+#
+# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x and
+# y, both are named 'e'.  Without directory rename detection, neither file
+# moves directories.  Implment directory rename detection suboptimally, and
+# you get an add/add conflict, but both files were added in commit B, so this
+# is an add/add conflict where one side of history added both files --
+# something we can't represent in the index.  Obviously, we'd prefer the last
+# resolution, but our previous rules are too coarse to allow it.  Using both
+# the rules from section 4 and section 5 save us from the Scary resolution,
+# making us fall back to pre-directory-rename-detection behavior for both
+# e_1 and e_2.
+
+test_expect_success '8b-setup: Dual-directory rename, one into the others way, 
with conflicting filenames' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir x &&
+   mkdir y &&
+   echo a1 >x/a &&
+   echo b1 >x/b &&
+   echo a2 >y/a &&
+   echo b2 >y/b &&
+   git add x y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   echo e1 >x/e &&
+   echo e2 >y/e &&
+   git add x

[PATCH 09/30] directory rename detection: testcases checking which side did the rename

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 283 
 1 file changed, 283 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index d15153c652..157299105f 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1053,4 +1053,287 @@ test_expect_failure '5d-check: Directory/file/file 
conflict due to directory ren
 #   back to old handling.  But, sadly, see testcases 8a and 8b.
 ###
 
+
+###
+# SECTION 6: Same side of the merge was the one that did the rename
+#
+# It may sound obvious that you only want to apply implicit directory
+# renames to directories if the _other_ side of history did the renaming.
+# If you did make an implementation that didn't explicitly enforce this
+# rule, the majority of cases that would fall under this section would
+# also be solved by following the rules from the above sections.  But
+# there are still a few that stick out, so this section covers them just
+# to make sure we also get them right.
+###
+
+# Testcase 6a, Tricky rename/delete
+#   Commit A: z/{b,c,d}
+#   Commit B: z/b
+#   Commit C: y/{b,c}, z/d
+#   Expected: y/b, CONFLICT(rename/delete, z/c -> y/c vs. NULL)
+#   Note: We're just checking here that the rename of z/b and z/c to put
+# them under y/ doesn't accidentally catch z/d and make it look like
+# it is also involved in a rename/delete conflict.
+
+test_expect_success '6a-setup: Tricky rename/delete' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git rm z/c &&
+   git rm z/d &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   mkdir y &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_success '6a-check: Tricky rename/delete' '
+   git checkout B^0 &&
+
+   test_must_fail git merge -s recursive C^0 >out &&
+   test_i18ngrep "CONFLICT (rename/delete).*z/c.*y/c" out &&
+
+   test 2 -eq $(git ls-files -s | wc -l) &&
+   test 1 -eq $(git ls-files -u | wc -l) &&
+   test 1 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse :3:y/c) = $(git rev-parse A:z/c)
+'
+
+# Testcase 6b, Same rename done on both sides
+#   (Related to testcases 6c and 8e)
+#   Commit A: z/{b,c}
+#   Commit B: y/{b,c}
+#   Commit C: y/{b,c}, z/d
+#   Note: If we did directory rename detection here, we'd move z/d into y/,
+# but C did that rename and still decided to put the file into z/,
+# so we probably shouldn't apply directory rename detection for it.
+
+test_expect_success '6b-setup: Same rename done on both sides' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   git mv z y &&
+   mkdir z &&
+   echo d >z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_success '6b-check: Same rename done on both sides' '
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 3 -eq $(git ls-files -s | wc -l) &&
+   test 0 -eq $(git ls-files -u | wc -l) &&
+   test 0 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse HEAD:z/d) = $(git rev-parse C:z/d)
+'
+
+# Testcase 6c, Rename only done on same side
+#   (Related to testcases 6b and 8e)
+#   Commit A: z/{b,c}
+#   Commit B: z/{b,c} (no change)
+#   Commit C: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   NOTE: Seems obvious, but just checking that the implementation doesn't
+# "accidentally detect a rename" and give us y/{b,c,d}.
+
+test_expect_success '6c-setup: Rename only done on same side' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m

[PATCH 06/30] directory rename detection: testcases to avoid taking detection too far

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 137 
 1 file changed, 137 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 00811f512a..021513ec00 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -513,4 +513,141 @@ test_expect_success '2b-check: Directory split into two 
on one side, with equal
 #   messages are handled correctly.
 ###
 
+
+###
+# SECTION 3: Path in question is the source path for some rename already
+#
+# Combining cases from Section 1 and trying to handle them could lead to
+# directory renaming detection being over-applied.  So, this section
+# provides some good testcases to check that the implementation doesn't go
+# too far.
+###
+
+# Testcase 3a, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 1c and 1f)
+#   Commit A: z/{b,c,d}
+#   Commit B: z/{b,c,d} (no change)
+#   Commit C: y/{b,c}, x/d
+#   Expected: y/{b,c}, x/d
+test_expect_success '3a-setup: Avoid implicit rename if involved as source on 
other side' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   test_tick &&
+   git commit --allow-empty -m "B" &&
+
+   git checkout C &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_success '3a-check: Avoid implicit rename if involved as source on 
other side' '
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 3 -eq $(git ls-files -s | wc -l) &&
+
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse HEAD:x/d) = $(git rev-parse A:z/d)
+'
+
+# Testcase 3b, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 5c and 7c, also kind of 1e and 1f)
+#   Commit A: z/{b,c,d}
+#   Commit B: y/{b,c}, x/d
+#   Commit C: z/{b,c}, w/d
+#   Expected: y/{b,c}, CONFLICT:(z/d -> x/d vs. w/d)
+#   NOTE: We're particularly checking that since z/d is already involved as
+# a source in a file rename on the same side of history, that we don't
+# get it involved in directory rename detection.  If it were, we might
+# end up with CONFLICT:(z/d -> y/d vs. x/d vs. w/d), i.e. a
+# rename/rename/rename(1to3) conflict, which is just weird.
+test_expect_success '3b-setup: Avoid implicit rename if involved as source on 
current side' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   mkdir w &&
+   git mv z/d w/ &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_success '3b-check: Avoid implicit rename if involved as source on 
current side' '
+   git checkout B^0 &&
+
+   test_must_fail git merge -s recursive C^0 >out &&
+
+   test 5 -eq $(git ls-files -s | wc -l) &&
+   test 3 -eq $(git ls-files -u | wc -l) &&
+   test 1 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+
+   test $(git rev-parse :1:z/d) = $(git rev-parse A:z/d) &&
+   test $(git rev-parse :2:x/d) = $(git rev-parse A:z/d) &&
+   test $(git rev-parse :3:w/d) = $(git rev-parse A:z/d) &&
+   test ! -f z/d &&
+   test $(git hash-object x/d) = $(git rev-parse A:z/d) &&
+   test $(git hash-object w/d) = $(git rev-parse A:z/d) &&
+
+   test_i18ngrep CONFLICT.*rename/rename.*z/d.*x/d.*w/d out &&
+   ! test_i18ngrep CONFLICT.*rename/rename.*y/d
+'
+
+###
+# Rules suggested by section 3:
+#
+#   Avoid directory-rename-detection for a path, if that path is the source
+#   of a rename on either side of a merge

[PATCH 07/30] directory rename detection: partially renamed directory testcase/discussion

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 100 
 1 file changed, 100 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 021513ec00..ec054b210a 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -650,4 +650,104 @@ test_expect_success '3b-check: Avoid implicit rename if 
involved as source on cu
 #   of a rename on either side of a merge.
 ###
 
+
+###
+# SECTION 4: Partially renamed directory; still exists on both sides of merge
+#
+# What if we were to attempt to do directory rename detection when someone
+# "mostly" moved a directory but still left some files around, or,
+# equivalently, fully renamed a directory in one commmit and then recreated
+# that directory in a later commit adding some new files and then tried to
+# merge?
+#
+# It's hard to divine user intent in these cases, because you can make an
+# argument that, depending on the intermediate history of the side being
+# merged, that some users will want files in that directory to
+# automatically be detected and renamed, while users with a different
+# intermediate history wouldn't want that rename to happen.
+#
+# I think that it is best to simply not have directory rename detection
+# apply to such cases.  My reasoning for this is four-fold: (1) it's
+# easiest for users in general to figure out what happened if we don't
+# apply directory rename detection in any such case, (2) it's an easy rule
+# to explain ["We don't do directory rename detection if the directory
+# still exists on both sides of the merge"], (3) we can get some hairy
+# edge/corner cases that would be really confusing and possibly not even
+# representable in the index if we were to even try, and [related to 3] (4)
+# attempting to resolve this issue of divining user intent by examining
+# intermediate history goes against the spirit of three-way merges and is a
+# path towards crazy corner cases that are far more complex than what we're
+# already dealing with.
+#
+# This section contains a test for this partially-renamed-directory case.
+###
+
+# Testcase 4a, Directory split, with original directory still present
+#   (Related to testcase 1f)
+#   Commit A: z/{b,c,d,e}
+#   Commit B: y/{b,c,d}, z/e
+#   Commit C: z/{b,c,d,e,f}
+#   Expected: y/{b,c,d}, z/{e,f}
+#   NOTE: Even though most files from z moved to y, we don't want f to follow.
+
+test_expect_success '4a-setup: Directory split, with original directory still 
present' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   echo e >z/e &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   mkdir y &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d y/ &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   echo f >z/f &&
+   git add z/f &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_success '4a-check: Directory split, with original directory still 
present' '
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 5 -eq $(git ls-files -s | wc -l) &&
+   test 0 -eq $(git ls-files -u | wc -l) &&
+   test 0 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse HEAD:y/d) = $(git rev-parse A:z/d) &&
+   test $(git rev-parse HEAD:z/e) = $(git rev-parse A:z/e) &&
+   test $(git rev-parse HEAD:z/f) = $(git rev-parse C:z/f)
+'
+
+###
+# Rules suggested by section 4:
+#
+#   Directory-rename-detection should be turned off for any directories (as
+#   a source for renames) that exist on both sides of the merge.  (The "as
+#   a source for renames" clarification is due to cases like 1c where
+#   the target directory exists on both sides and we do want the rename
+#   detection.)  But, sadly, see testcase 8b.
+###
+
 test_done
-- 
2.15.0.5.g9567be9905

[PATCH 26/30] merge-recursive: When comparing files, don't include trees

2017-11-10 Thread Elijah Newren

get_renames() would look up stage data that already existed (populated
in get_unmerged(), taken from whatever unpack_trees() created), and if
it didn't exist, would call insert_stage_data() to create the necessary
entry for the given file.  The insert_stage_data() fallback becomes
much more important for directory rename detection, because that creates
a mechanism to have a file in the resulting merge that didn't exist on
either side of history.  However, insert_stage_data(), due to calling
get_tree_entry() loaded up trees as readily as files.  We aren't
interested in comparing trees to files; the D/F conflict handling is
done elsewhere.  This code is just concerned with what entries existed
for a given path on the different sides of the merge, so create a
get_tree_entry_if_blob() helper function and use it.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 7c2c29bb51..2a7258f6bb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -407,6 +407,21 @@ static void get_files_dirs(struct merge_options *o, struct 
tree *tree)
read_tree_recursive(tree, "", 0, 0, &match_all, save_files_dirs, o);
 }
 
+static int get_tree_entry_if_blob(const unsigned char *tree,
+ const char *path,
+ unsigned char *hashy,
+ unsigned *mode_o)
+{
+   int ret;
+
+   ret = get_tree_entry(tree, path, hashy, mode_o);
+   if (S_ISDIR(*mode_o)) {
+   hashcpy(hashy, null_sha1);
+   *mode_o = 0;
+   }
+   return ret;
+}
+
 /*
  * Returns an index_entry instance which doesn't have to correspond to
  * a real cache entry in Git's index.
@@ -417,12 +432,12 @@ static struct stage_data *insert_stage_data(const char 
*path,
 {
struct string_list_item *item;
struct stage_data *e = xcalloc(1, sizeof(struct stage_data));
-   get_tree_entry(o->object.oid.hash, path,
-   e->stages[1].oid.hash, &e->stages[1].mode);
-   get_tree_entry(a->object.oid.hash, path,
-   e->stages[2].oid.hash, &e->stages[2].mode);
-   get_tree_entry(b->object.oid.hash, path,
-   e->stages[3].oid.hash, &e->stages[3].mode);
+   get_tree_entry_if_blob(o->object.oid.hash, path,
+  e->stages[1].oid.hash, &e->stages[1].mode);
+   get_tree_entry_if_blob(a->object.oid.hash, path,
+  e->stages[2].oid.hash, &e->stages[2].mode);
+   get_tree_entry_if_blob(b->object.oid.hash, path,
+  e->stages[3].oid.hash, &e->stages[3].mode);
item = string_list_insert(entries, path);
item->util = e;
return e;
-- 
2.15.0.5.g9567be9905

[PATCH 05/30] directory rename detection: directory splitting testcases

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 125 
 1 file changed, 125 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index b737b0a105..00811f512a 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -388,4 +388,129 @@ test_expect_failure '1f-check: Split a directory into two 
other directories' '
 #   in section 2, plus testcases 3a and 4a.
 ###
 
+
+###
+# SECTION 2: Split into multiple directories, with equal number of paths
+#
+# Explore the splitting-a-directory rules a bit; what happens in the
+# edge cases?
+#
+# Note that there is a closely related case of a directory not being
+# split on either side of history, but being renamed differently on
+# each side.  See testcase 8e for that.
+###
+
+# Testcase 2a, Directory split into two on one side, with equal numbers of 
paths
+#   Commit A: z/{b,c}
+#   Commit B: y/b, w/c
+#   Commit C: z/{b,c,d}
+#   Expected: y/b, w/c, z/d, with warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2a-setup: Directory split into two on one side, with 
equal numbers of paths' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir w &&
+   git mv z/b y/ &&
+   git mv z/c w/ &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   echo d >z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '2a-check: Directory split into two on one side, with 
equal numbers of paths' '
+   git checkout B^0 &&
+
+   test_must_fail git merge -s recursive C^0 >out &&
+
+   test 3 -eq $(git ls-files -s | wc -l) &&
+   test 0 -eq $(git ls-files -u | wc -l) &&
+   test 1 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse :0:w/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse :0:z/d) = $(git rev-parse C:z/d) &&
+   test_i18ngrep "CONFLICT.*directory rename split" out
+'
+
+# Testcase 2b, Directory split into two on one side, with equal numbers of 
paths
+#   Commit A: z/{b,c}
+#   Commit B: y/b, w/c
+#   Commit C: z/{b,c}, x/d
+#   Expected: y/b, w/c, x/d; No warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2b-setup: Directory split into two on one side, with 
equal numbers of paths' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir w &&
+   git mv z/b y/ &&
+   git mv z/c w/ &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   mkdir x &&
+   echo d >x/d &&
+   git add x/d &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_success '2b-check: Directory split into two on one side, with 
equal numbers of paths' '
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 >out &&
+
+   test 3 -eq $(git ls-files -s | wc -l) &&
+   test 0 -eq $(git ls-files -u | wc -l) &&
+   test 1 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse :0:w/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse :0:x/d) = $(git rev-parse C:x/d) &&
+   ! test_i18ngrep "CONFLICT.*directory rename split" out
+'
+
+###
+# Rules suggested by section 2:
+#
+#   None; the rule was already covered in section 1.  These testcases are
+#   here just to make sure the conflict resolution and necessary warning
+#   messages are handled correctly.
+###
+
 test_done
-- 
2.15.0.5.g9567be9905

[PATCH 12/30] directory rename detection: miscellaneous testcases to complete coverage

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 505 
 1 file changed, 505 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index bdfd943c88..bb179b16c8 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -265,6 +265,7 @@ test_expect_failure '1d-check: Directory renames cause a 
rename/rename(2to1) con
 '
 
 # Testcase 1e, Renamed directory, with all filenames being renamed too
+#   (Related to testcases 9f & 9g)
 #   Commit A: z/{oldb,oldc}
 #   Commit B: y/{newb,newc}
 #   Commit C: z/{oldb,oldc,d}
@@ -2054,4 +2055,508 @@ test_expect_failure '8e-check: Both sides rename, one 
side adds to original dire
test_i18ngrep CONFLICT.*rename/rename.*z/b.*y/b.*w/b out
 '
 
+###
+# SECTION 9: Other testcases
+#
+# I came up with the testcases in the first eight sections before coding up
+# the implementation.  The testcases in this section were mostly ones I
+# thought of while coding/debugging, and which I was too lazy to insert
+# into the previous sections because I didn't want to re-label with all the
+# testcase references.  :-)
+###
+
+# Testcase 9a, Inner renamed directory within outer renamed directory
+#   (Related to testcase 1f)
+#   Commit A: z/{b,c,d/{e,f,g}}
+#   Commit B: y/{b,c}, x/w/{e,f,g}
+#   Commit C: z/{b,c,d/{e,f,g,h},i}
+#   Expected: y/{b,c,i}, x/w/{e,f,g,h}
+#   NOTE: The only reason this one is interesting is because when a directory
+# is split into multiple other directories, we determine by the weight
+# of which one had the most paths going to it.  A naive implementation
+# of that could take the new file in commit C at z/i to x/w/i or x/i.
+
+test_expect_success '9a-setup: Inner renamed directory within outer renamed 
directory' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir -p z/d &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo e >z/d/e &&
+   echo f >z/d/f &&
+   echo g >z/d/g &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   mkdir x &&
+   git mv z/d x/w &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   echo h >z/d/h &&
+   echo i >z/i &&
+   git add z &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '9a-check: Inner renamed directory within outer renamed 
directory' '
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 7 -eq $(git ls-files -s | wc -l) &&
+   test 0 -eq $(git ls-files -u | wc -l) &&
+   test 0 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse HEAD:y/i) = $(git rev-parse C:z/i) &&
+
+   test $(git rev-parse HEAD:x/w/e) = $(git rev-parse A:z/d/e) &&
+   test $(git rev-parse HEAD:x/w/f) = $(git rev-parse A:z/d/f) &&
+   test $(git rev-parse HEAD:x/w/g) = $(git rev-parse A:z/d/g) &&
+   test $(git rev-parse HEAD:x/w/h) = $(git rev-parse C:z/d/h)
+'
+
+# Testcase 9b, Transitive rename with content merge
+#   (Related to testcase 1c)
+#   Commit A: z/{b,c},   x/d_1
+#   Commit B: y/{b,c},   x/d_2
+#   Commit C: z/{b,c,d_3}
+#   Expected: y/{b,c,d_merged}
+
+test_expect_success '9b-setup: Transitive rename with content merge' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir x &&
+   test_seq 1 10 >x/d &&
+   git add z x &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git mv z y &&
+   test_seq 1 11 >x/d &&
+   git add x/d &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   test_seq 0 10 >x/d &&
+   git mv x/d z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '9b-check: Transitive rename with content merge' '
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 3 -eq $(git ls-files -s | wc -l) &&
+
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+   test_must_fail git rev-parse HEAD:x/d &&
+   test_must_fail git rev-parse HEAD:z/d &&
+   test ! -f z/d &&
+
+   test $(git rev-parse HEAD:y/d) != $(git rev-parse A:x/d) &&
+   test $(git rev-parse HEAD:y/d

[RFC PATCH 29/30] merge-recursive: Fix overwriting dirty files involved in renames

2017-11-10 Thread Elijah Newren

This fixes an issue that existed before my directory rename detection
patches that affects both normal renames and renames implied by
directory rename detection.  Additional codepaths that only affect
overwriting of directy files that are involved in directory rename
detection will be added in a subsequent commit.

Signed-off-by: Elijah Newren 
---
Seems kinda hacky, in multiple different ways.  It seems like there
should be a better way to handle lots of different things about this
patch, though I have a hard time seeing how without doing a bigger
rewrite of the whole interface between unpack_trees and
merge-recursive (possibly rewriting them so there isn't an interface
but some piece of code that does both functions).  Alternate simpler
suggestions?


 merge-recursive.c   | 81 -
 merge-recursive.h   |  2 +
 t/t3501-revert-cherry-pick.sh   |  2 +-
 t/t6043-merge-rename-directories.sh |  2 +-
 t/t7607-merge-overwrite.sh  |  2 +-
 unpack-trees.c  |  4 +-
 unpack-trees.h  |  4 ++
 7 files changed, 73 insertions(+), 24 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1b3ee5b9fb..86ddb89727 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -323,32 +323,32 @@ static void init_tree_desc_from_tree(struct tree_desc 
*desc, struct tree *tree)
init_tree_desc(desc, tree->buffer, tree->size);
 }
 
-static int git_merge_trees(int index_only,
+static int git_merge_trees(struct merge_options *o,
   struct tree *common,
   struct tree *head,
   struct tree *merge)
 {
int rc;
struct tree_desc t[3];
-   struct unpack_trees_options opts;
 
-   memset(&opts, 0, sizeof(opts));
-   if (index_only)
-   opts.index_only = 1;
+   memset(&o->unpack_opts, 0, sizeof(o->unpack_opts));
+   if (o->call_depth)
+   o->unpack_opts.index_only = 1;
else
-   opts.update = 1;
-   opts.merge = 1;
-   opts.head_idx = 2;
-   opts.fn = threeway_merge;
-   opts.src_index = &the_index;
-   opts.dst_index = &the_index;
-   setup_unpack_trees_porcelain(&opts, "merge");
+   o->unpack_opts.update = 1;
+   o->unpack_opts.merge = 1;
+   o->unpack_opts.head_idx = 2;
+   o->unpack_opts.fn = threeway_merge;
+   o->unpack_opts.src_index = &the_index;
+   o->unpack_opts.dst_index = &the_index;
+   setup_unpack_trees_porcelain(&o->unpack_opts, "merge");
 
init_tree_desc_from_tree(t+0, common);
init_tree_desc_from_tree(t+1, head);
init_tree_desc_from_tree(t+2, merge);
 
-   rc = unpack_trees(3, t, &opts);
+   rc = unpack_trees(3, t, &o->unpack_opts);
+   o->unpack_opts.src_index = &the_index; // unpack_trees NULLifies 
src_index, but it's used in verify_uptodate, so set it back
cache_tree_free(&active_cache_tree);
return rc;
 }
@@ -783,6 +783,20 @@ static int would_lose_untracked(const char *path)
return !was_tracked(path) && file_exists(path);
 }
 
+static int was_dirty(struct merge_options *o, const char *path)
+{
+   struct cache_entry *ce;
+
+   int dirty = 1;
+   if (o->call_depth || !was_tracked(path))
+ return !dirty;
+
+   ce = cache_file_exists(path, strlen(path), ignore_case);
+   dirty = (ce->ce_stat_data.sd_mtime.sec > 0 &&
+verify_uptodate(ce, &o->unpack_opts) != 0);
+   return dirty;
+}
+
 static int make_room_for_path(struct merge_options *o, const char *path)
 {
int status, i;
@@ -2635,6 +2649,7 @@ static int handle_modify_delete(struct merge_options *o,
 
 static int merge_content(struct merge_options *o,
 const char *path,
+int file_in_way,
 struct object_id *o_oid, int o_mode,
 struct object_id *a_oid, int a_mode,
 struct object_id *b_oid, int b_mode,
@@ -2709,7 +2724,7 @@ static int merge_content(struct merge_options *o,
return -1;
}
 
-   if (df_conflict_remains) {
+   if (df_conflict_remains || file_in_way) {
char *new_path;
if (o->call_depth) {
remove_file_from_cache(path);
@@ -2743,6 +2758,31 @@ static int merge_content(struct merge_options *o,
return mfi.clean;
 }
 
+static int conflict_rename_normal(struct merge_options *o,
+ const char *path,
+ struct object_id *o_oid, unsigned o_mode,
+ struct object_id *a_oid, unsigned a_mode,
+ struct object_id *b_oid, unsigned b_mode,
+ struct rename_conflict_info *ci)
+{
+   int clean_merge;
+   int file_in_the_way = 0;
+
+

[PATCH 15/30] merge-recursive: Move the get_renames() function

2017-11-10 Thread Elijah Newren

I want to re-use some other functions in the file without moving those other
functions or dealing with a handful of annoying split function declarations
and definitions.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 138 +++---
 1 file changed, 69 insertions(+), 69 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 3526c8d0b8..382016508b 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -540,75 +540,6 @@ struct rename {
unsigned processed:1;
 };
 
-/*
- * Get information of all renames which occurred between 'o_tree' and
- * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
- * 'b_tree') to be able to associate the correct cache entries with
- * the rename information. 'tree' is always equal to either a_tree or b_tree.
- */
-static struct string_list *get_renames(struct merge_options *o,
-  struct tree *tree,
-  struct tree *o_tree,
-  struct tree *a_tree,
-  struct tree *b_tree,
-  struct string_list *entries)
-{
-   int i;
-   struct string_list *renames;
-   struct diff_options opts;
-
-   renames = xcalloc(1, sizeof(struct string_list));
-   if (!o->detect_rename)
-   return renames;
-
-   diff_setup(&opts);
-   DIFF_OPT_SET(&opts, RECURSIVE);
-   DIFF_OPT_CLR(&opts, RENAME_EMPTY);
-   opts.detect_rename = DIFF_DETECT_RENAME;
-   opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
-   o->diff_rename_limit >= 0 ? o->diff_rename_limit :
-   1000;
-   opts.rename_score = o->rename_score;
-   opts.show_rename_progress = o->show_rename_progress;
-   opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-   diff_setup_done(&opts);
-   diff_tree_oid(&o_tree->object.oid, &tree->object.oid, "", &opts);
-   diffcore_std(&opts);
-   if (opts.needed_rename_limit > o->needed_rename_limit)
-   o->needed_rename_limit = opts.needed_rename_limit;
-   for (i = 0; i < diff_queued_diff.nr; ++i) {
-   struct string_list_item *item;
-   struct rename *re;
-   struct diff_filepair *pair = diff_queued_diff.queue[i];
-   if (pair->status != 'R') {
-   diff_free_filepair(pair);
-   continue;
-   }
-   re = xmalloc(sizeof(*re));
-   re->processed = 0;
-   re->pair = pair;
-   item = string_list_lookup(entries, re->pair->one->path);
-   if (!item)
-   re->src_entry = insert_stage_data(re->pair->one->path,
-   o_tree, a_tree, b_tree, entries);
-   else
-   re->src_entry = item->util;
-
-   item = string_list_lookup(entries, re->pair->two->path);
-   if (!item)
-   re->dst_entry = insert_stage_data(re->pair->two->path,
-   o_tree, a_tree, b_tree, entries);
-   else
-   re->dst_entry = item->util;
-   item = string_list_insert(renames, pair->one->path);
-   item->util = re;
-   }
-   opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-   diff_queued_diff.nr = 0;
-   diff_flush(&opts);
-   return renames;
-}
-
 static int update_stages(struct merge_options *opt, const char *path,
 const struct diff_filespec *o,
 const struct diff_filespec *a,
@@ -1383,6 +1314,75 @@ static int conflict_rename_rename_2to1(struct 
merge_options *o,
return ret;
 }
 
+/*
+ * Get information of all renames which occurred between 'o_tree' and
+ * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
+ * 'b_tree') to be able to associate the correct cache entries with
+ * the rename information. 'tree' is always equal to either a_tree or b_tree.
+ */
+static struct string_list *get_renames(struct merge_options *o,
+  struct tree *tree,
+  struct tree *o_tree,
+  struct tree *a_tree,
+  struct tree *b_tree,
+  struct string_list *entries)
+{
+   int i;
+   struct string_list *renames;
+   struct diff_options opts;
+
+   renames = xcalloc(1, sizeof(struct string_list));
+   if (!o->detect_rename)
+   return renames;
+
+   diff_setup(&opts);
+   DIFF_OPT_SET(&opts, RECURSIVE);
+   DIFF_OPT_CLR(&opts, RENAME_EMPTY);
+   opts.detect_rename = DIFF_DETECT_RENAME;
+   opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
+

[PATCH 10/30] directory rename detection: more involved edge/corner testcases

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 347 
 1 file changed, 347 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 157299105f..115d0d2622 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1336,4 +1336,351 @@ test_expect_success '6e-check: Add/add from one side' '
test $(git rev-parse HEAD:z/d) = $(git rev-parse C:z/d)
 '
 
+
+###
+# SECTION 7: More involved Edge/Corner cases
+#
+# The ruleset we have generated in the above sections seems to provide
+# well-defined merges.  But can we find edge/corner cases that either (a)
+# are harder for users to understand, or (b) have a resolution that is
+# non-intuitive or suboptimal?
+#
+# The testcases in this section dive into cases that I've tried to craft in
+# a way to find some that might be surprising to users or difficult for
+# them to understand (the next section will look at non-intuitive or
+# suboptimal merge results).  Some of the testcases are similar to ones
+# from past sections, but have been simplified to try to highlight error
+# messages using a "modified" path (due to the directory rename).  Are
+# users okay with these?
+#
+# In my opinion, testcases that are difficult to understand from this
+# section is due to difficulty in the testcase rather than the directory
+# renaming (similar to how t6042 and t6036 have difficult resolutions due
+# to the problem setup itself being complex).  And I don't think the
+# error messages are a problem.
+#
+# On the other hand, the testcases in section 8 worry me slightly more...
+###
+
+# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
+#   Commit A: z/{b,c}
+#   Commit B: y/{b,c}
+#   Commit C: w/b, x/c, z/d
+#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
+#   NOTE: There's a rename of z/ here, y/ has more renames, so z/d -> y/d.
+
+test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) 
PLUS add-other-file' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   mkdir w &&
+   mkdir x &&
+   git mv z/b w/ &&
+   git mv z/c x/ &&
+   echo d > z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) 
PLUS add-other-file' '
+   git checkout B^0 &&
+
+   test_must_fail git merge -s recursive C^0 >out &&
+   test_i18ngrep "CONFLICT (rename/rename).*z/b.*y/b.*w/b" out &&
+   test_i18ngrep "CONFLICT (rename/rename).*z/c.*y/c.*x/c" out &&
+
+   test 7 -eq $(git ls-files -s | wc -l) &&
+   test 6 -eq $(git ls-files -u | wc -l) &&
+   test 1 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse :0:y/d) = $(git rev-parse C:z/d) &&
+
+   test $(git rev-parse :1:z/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse :2:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse :3:w/b) = $(git rev-parse A:z/b) &&
+
+   test $(git rev-parse :1:z/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse :2:y/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse :3:x/c) = $(git rev-parse A:z/c) &&
+
+   test $(git hash-object y/b) = $(git rev-parse A:z/b) &&
+   test $(git hash-object w/b) = $(git rev-parse A:z/b) &&
+   test $(git hash-object y/c) = $(git rev-parse A:z/c) &&
+   test $(git hash-object x/c) = $(git rev-parse A:z/c)
+'
+
+# Testcase 7b, rename/rename(2to1), but only due to transitive rename
+#   (Related to testcase 1d)
+#   Commit A: z/{b,c}, x/d_1, w/d_2
+#   Commit B: y/{b,c,d_2}, x/d_1
+#   Commit C: z/{b,c,d_1},w/d_2
+#   Expected: y/{b,c}, CONFLICT(rename/rename(2to1): x/d_1, w/d_2 -> y_d)
+
+test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive 
rename' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   mkdir x &&
+   mkdir w &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d1 > x/d &&
+   echo d2 > w/d &&
+   git add z x w &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git mv z y &&
+   git mv w/d y/ &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+

[PATCH 14/30] directory rename detection: tests for handling overwriting dirty files

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 401 
 1 file changed, 401 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 7af8962512..4066b08767 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2873,4 +2873,405 @@ test_expect_failure '10e-check: Does git complain about 
untracked file that is n
test "random" = "$(cat z/c)"
 '
 
+###
+# SECTION 11: Handling dirty (not up-to-date) files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling.  This was true even of normal renames, but there are additional
+# codepaths that need special handling with directory renames.  Add
+# testcases for both renamed-by-directory-rename-detection and standard
+# rename cases.
+###
+
+# Testcase 11a, Avoid losing dirty contents with simple rename
+#   Commit A: z/{a,b_v1},
+#   Commit B: z/{a,c_v1}, and z/c_v1 has uncommitted mods
+#   Commit C: z/{a,b_v2}
+#   Expected: ERROR_MSG(Refusing to lose dirty file at z/c) +
+# z/a, staged version of z/c has sha1sum matching C:z/b_v2,
+# z/c~HEAD with contents of C:z/b_v2,
+# z/c with uncommitted mods on top of B:z/c_v1
+
+test_expect_success '11a-setup: Avoid losing dirty contents with simple 
rename' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo a >z/a &&
+   test_seq 1 10 >z/b &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git mv z/b z/c &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   echo 11 >>z/b &&
+   git add z/b &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '11a-check: Avoid losing dirty contents with simple 
rename' '
+   git checkout B^0 &&
+   echo stuff >>z/c &&
+
+   test_must_fail git merge -s recursive C^0 >out 2>err &&
+   test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+   test_seq 1 10 >expected &&
+   echo stuff >>expected &&
+
+   test 2 -eq $(git ls-files -s | wc -l) &&
+   test 1 -eq $(git ls-files -u | wc -l) &&
+   test 4 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse :0:z/a) = $(git rev-parse A:z/a) &&
+   test $(git rev-parse :2:z/c) = $(git rev-parse C:z/b) &&
+
+   test "$(git hash-object z/c~HEAD)" = $(git rev-parse C:z/b) &&
+   test_cmp expected z/c
+'
+
+# Testcase 11b, Avoid losing dirty file involved in directory rename
+#   Commit A: z/a, x/{b,c_v1}
+#   Commit B: z/{a,c_v1},  x/b,   and z/c_v1 has uncommitted mods
+#   Commit C: y/a, x/{b,c_v2}
+#   Expected: y/{a,c_v2}, x/b, z/c_v1 with uncommited mods untracked,
+# ERROR_MSG(Refusing to lose dirty file at z/c)
+
+
+test_expect_success '11b-setup: Avoid losing dirty file involved in directory 
rename' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z x &&
+   echo a >z/a &&
+   echo b >x/b &&
+   test_seq 1 10 >x/c &&
+   git add z x &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git mv x/c z/c &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   git mv z y &&
+   echo 11 >>x/c &&
+   git add x/c &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '11b-check: Avoid losing dirty file involved in directory 
rename' '
+   git checkout B^0 &&
+   echo stuff >>z/c &&
+
+   git merge -s recursive C^0 >out 2>err &&
+   test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+   grep -q stuff */* &&
+   test_seq 1 10 >expected &&
+   echo stuff >>expected &&
+
+   test 3 -eq $(git ls-files -s | wc -l) &&
+   test 0 -eq $(git ls-files -u | wc -l) &&
+   test 0 -eq $(git ls-files -m | wc -l) &&
+   test 4 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse :0:x/b) = $(git rev-parse A:x/b) &&
+   test $(git rev-parse :0:y/a) = $(git rev-parse A:z/a) &&
+   test $(git rev-parse :0:y/c) = $(git rev-parse C:x/c) &&
+
+   test "$(git hash-object y/c)" = $(git rev-parse C:x/c) &&
+   test_cm

[PATCH 22/30] merge-recursive: Check for directory level conflicts

2017-11-10 Thread Elijah Newren

Before trying to apply directory renames to paths within the given
directories, we want to make sure that there aren't conflicts at the
directory level.  There will be additional checks at the individual
file level too, which will be added later.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 112 ++
 1 file changed, 112 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index b5770d3d7f..3633be0123 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1376,6 +1376,15 @@ static struct diff_queue_struct *get_diffpairs(struct 
merge_options *o,
return ret;
 }
 
+static int tree_has_path(struct tree *tree, const char *path)
+{
+   unsigned char hashy[20];
+   unsigned mode_o;
+
+   return !get_tree_entry(tree->object.oid.hash, path,
+  hashy, &mode_o);
+}
+
 static void get_renamed_dir_portion(const char *old_path, const char *new_path,
char **old_dir, char **new_dir) {
*old_dir = NULL;
@@ -1425,6 +1434,105 @@ static void get_renamed_dir_portion(const char 
*old_path, const char *new_path,
}
 }
 
+/*
+ * There are a couple things we want to do at the directory level:
+ *   1. Check for both sides renaming to the same thing, in order to avoid
+ *  implicit renaming of files that should be left in place.  (See
+ *  testcase 6b in t6043 for details.)
+ *   2. Prune directory renames if there are still files left in the
+ *  the original directory.  These represent a partial directory rename,
+ *  i.e. a rename where only some of the files within the directory
+ *  were renamed elsewhere.  (Technically, this could be done earlier
+ *  in get_directory_renames(), except that would prevent us from
+ *  doing the previous check and thus failing testcase 6b.)
+ *   3. Check for rename/rename(1to2) conflicts (at the directory level).
+ *  In the future, we could potentially record this info as well and
+ *  omit reporting rename/rename(1to2) conflicts for each path within
+ *  the affected directories, thus cleaning up the merge output.
+ *   NOTE: We do NOT check for rename/rename(2to1) conflicts at the
+ * directory level, because merging directories is fine.  If it
+ * causes conflicts for files within those merged directories, then
+ * that should be detected at the individual path level.
+ */
+static void handle_directory_level_conflicts(struct merge_options *o,
+struct hashmap *dir_re_head,
+struct tree *head,
+struct hashmap *dir_re_merge,
+struct tree *merge)
+{
+   struct hashmap_iter iter;
+   struct dir_rename_entry *head_ent;
+   struct dir_rename_entry *merge_ent;
+   int i;
+
+   struct string_list remove_from_head = STRING_LIST_INIT_NODUP;
+   struct string_list remove_from_merge = STRING_LIST_INIT_NODUP;
+
+   hashmap_iter_init(dir_re_head, &iter);
+   while ((head_ent = hashmap_iter_next(&iter))) {
+   merge_ent = dir_rename_find_entry(dir_re_merge, head_ent->dir);
+   if (merge_ent &&
+   !head_ent->non_unique_new_dir &&
+   !merge_ent->non_unique_new_dir &&
+   !strcmp(head_ent->new_dir, merge_ent->new_dir)) {
+   /* 1. Renamed identically; remove it from both sides */
+   string_list_append(&remove_from_head,
+  head_ent->dir)->util = head_ent;
+   free(head_ent->new_dir);
+   string_list_append(&remove_from_merge,
+  merge_ent->dir)->util = merge_ent;
+   free(merge_ent->new_dir);
+   } else if (tree_has_path(head, head_ent->dir)) {
+   /* 2. This wasn't a directory rename after all */
+   string_list_append(&remove_from_head,
+  head_ent->dir)->util = head_ent;
+   free(head_ent->new_dir);
+   }
+   }
+
+   hashmap_iter_init(dir_re_merge, &iter);
+   while ((merge_ent = hashmap_iter_next(&iter))) {
+   head_ent = dir_rename_find_entry(dir_re_head, merge_ent->dir);
+   if (tree_has_path(merge, merge_ent->dir)) {
+   /* 2. This wasn't a directory rename after all */
+   string_list_append(&remove_from_merge,
+  merge_ent->dir)->util = merge_ent;
+   } else if (head_ent &&
+  !head_ent->non_unique_new_dir &&
+  !merge_ent->non_unique_new_dir) {
+   /* 3. rename/rename(1to2) */
+

[PATCH 17/30] merge-recursive: Fix leaks of allocated renames and diff_filepairs

2017-11-10 Thread Elijah Newren

get_renames() has always zero'ed out diff_queued_diff.nr while only
manually free'ing diff_filepairs that did not correspond to renames.
Further, it allocated struct renames that were tucked away in the
return string_list.  Make sure all of these are deallocated when we
are done with them.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 49710c0964..7a3402e50c 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1661,10 +1661,21 @@ static struct rename_info *handle_renames(struct 
merge_options *o,
 
 static void cleanup_renames(struct rename_info *re_info)
 {
-   string_list_clear(re_info->head_renames, 0);
-   string_list_clear(re_info->merge_renames, 0);
+   const struct rename *re;
+   int i;
 
+   for (i = 0; i < re_info->head_renames->nr; i++) {
+   re = re_info->head_renames->items[i].util;
+   diff_free_filepair(re->pair);
+   }
+   string_list_clear(re_info->head_renames, 1);
free(re_info->head_renames);
+
+   for (i = 0; i < re_info->merge_renames->nr; i++) {
+   re = re_info->merge_renames->items[i].util;
+   diff_free_filepair(re->pair);
+   }
+   string_list_clear(re_info->merge_renames, 1);
free(re_info->merge_renames);
 
free(re_info);
-- 
2.15.0.5.g9567be9905

[PATCH 25/30] merge-recursive: Check for file level conflicts then get new name

2017-11-10 Thread Elijah Newren

Before trying to apply directory renames to paths within the given
directories, we want to make sure that there aren't conflicts at the
file level either.  If there aren't any, then get the new name from
any directory renames.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c   | 184 ++--
 t/t6043-merge-rename-directories.sh |   2 +-
 2 files changed, 176 insertions(+), 10 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 251c4cc7fa..7c2c29bb51 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1481,6 +1481,102 @@ static void get_renamed_dir_portion(const char 
*old_path, const char *new_path,
}
 }
 
+/*
+ * Write:
+ *   element1, element2, element3, ..., elementN
+ * to str.  If only one element, just write "element1" to str.
+ */
+static void comma_separated_list(char *str, struct string_list *slist) {
+   int i;
+   for (i=0; inr; i++) {
+   str += sprintf(str, "%s", slist->items[i].string);
+   if (i < slist->nr-1)
+   str += sprintf(str, ", ");
+   }
+}
+
+/*
+ * See if there is a directory rename for path, and if there are any file
+ * level conflicts for the renamed location.  If there is a rename and
+ * there are no conflicts, return the new name.  Otherwise, return NULL.
+ */
+static char* handle_path_level_conflicts(struct merge_options *o,
+const char *path,
+struct dir_rename_entry *entry,
+struct hashmap *collisions,
+struct tree *tree)
+{
+   char *new_path = NULL;
+   struct collision_entry *collision_ent;
+   int clean = 1;
+
+   /*
+* entry has the mapping of old directory name to new directory name
+* that we want to apply to path.
+*/
+   new_path = apply_dir_rename(entry, path);
+
+   if (!new_path) {
+   /* This should only happen when entry->non_unique_new_dir set */
+   assert(entry->non_unique_new_dir);
+   output(o, 1, _("CONFLICT (directory rename split): "
+  "Unclear where to place %s because directory "
+  "%s was renamed to multiple other directories, "
+  "with no destination getting a majority of the "
+  "files."),
+  path, entry->dir);
+   clean = 0;
+   return NULL;
+   }
+
+   /*
+* The caller needs to have ensured that it has pre-populated
+* collisions with all paths that map to new_path.  Do a quick check
+* to ensure that's the case.
+ */
+   collision_ent = collision_find_entry(collisions, new_path);
+   assert(collision_ent != NULL);
+
+   /*
+* Check for one-sided add/add/.../add conflicts, i.e.
+* where implicit renames from the other side doing
+* directory rename(s) can affect this side of history
+* to put multiple paths into the same location.  Warn
+* and bail on directory renames for such paths.
+*/
+   char collision_paths[(PATH_MAX+2) * collision_ent->source_files.nr];
+   if (collision_ent->reported_already) {
+   clean = 0;
+   } else if (tree_has_path(tree, new_path)) {
+   collision_ent->reported_already = 1;
+   comma_separated_list(collision_paths,
+&collision_ent->source_files);
+   output(o, 1, _("CONFLICT (implicit dir rename): Existing "
+  "file/dir at %s in the way of implicit "
+  "directory rename(s) putting the following "
+  "path(s) there: %s."),
+  new_path, collision_paths);
+   clean = 0;
+   } else if (collision_ent->source_files.nr > 1) {
+   collision_ent->reported_already = 1;
+   comma_separated_list(collision_paths,
+&collision_ent->source_files);
+   output(o, 1, _("CONFLICT (implicit dir rename): Cannot map "
+  "more than one path to %s; implicit directory "
+  "renames tried to put these paths there: %s"),
+  new_path, collision_paths);
+   clean = 0;
+   }
+
+   /* Free memory we no longer need */
+   if (!clean && new_path) {
+   free(new_path);
+   return NULL;
+   }
+
+   return new_path;
+}
+
 /*
  * There are a couple things we want to do at the directory level:
  *   1. Check for both sides renaming to the same thing, in order to avoid
@@ -1725,6 +1821,58 @@ static void compute_collisions(struct hashmap 
*collisions,
}
 }
 
+static char *check_for_directory_rename(s

[PATCH 28/30] merge-recursive: Avoid clobbering untracked files with directory renames

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 merge-recursive.c   | 39 +++--
 t/t6043-merge-rename-directories.sh |  6 +++---
 2 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 838bfd32ec..1b3ee5b9fb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1129,6 +1129,25 @@ static int conflict_rename_dir(struct merge_options *o,
 {
const struct diff_filespec *dest = pair->two;
 
+   if (!o->call_depth && would_lose_untracked(dest->path)) {
+   char *alt_path = unique_path(o, dest->path, rename_branch);
+   output(o, 1, _("Error: Refusing to lose untracked file at %s; "
+  "writing to %s instead."),
+  dest->path, alt_path);
+   /*
+* Write the file in worktree at alt_path, but not in the
+* index.  Instead, write to dest->path for the index but
+* only at the higher appropriate stage.
+*/
+   if (update_file(o, 0, &dest->oid, dest->mode, alt_path))
+   return -1;
+   free(alt_path);
+   return update_stages(o, dest->path, NULL,
+rename_branch == o->branch1 ? dest : NULL,
+rename_branch == o->branch1 ? NULL : dest);
+   }
+
+   /* Update dest->path both in index and in worktree */
if (update_file(o, 1, &dest->oid, dest->mode, dest->path))
return -1;
return 0;
@@ -1147,7 +1166,8 @@ static int handle_change_delete(struct merge_options *o,
const char *update_path = path;
int ret = 0;
 
-   if (dir_in_way(path, !o->call_depth, 0)) {
+   if (dir_in_way(path, !o->call_depth, 0) ||
+   (!o->call_depth && would_lose_untracked(path))) {
update_path = alt_path = unique_path(o, path, change_branch);
}
 
@@ -1273,6 +1293,10 @@ static int handle_file(struct merge_options *o,
dst_name = unique_path(o, rename->path, cur_branch);
output(o, 1, _("%s is a directory in %s adding as %s 
instead"),
   rename->path, other_branch, dst_name);
+   } else if (!o->call_depth && 
would_lose_untracked(rename->path)) {
+   dst_name = unique_path(o, rename->path, cur_branch);
+   output(o, 1, _("Refusing to lose untracked file at %s; 
adding as %s instead"),
+  rename->path, dst_name);
}
}
if ((ret = update_file(o, 0, &rename->oid, rename->mode, dst_name)))
@@ -1398,7 +1422,18 @@ static int conflict_rename_rename_2to1(struct 
merge_options *o,
char *new_path2 = unique_path(o, path, ci->branch2);
output(o, 1, _("Renaming %s to %s and %s to %s instead"),
   a->path, new_path1, b->path, new_path2);
-   remove_file(o, 0, path, 0);
+   if (would_lose_untracked(path))
+   /*
+* Only way we get here is if both renames were from
+* a directory rename AND user had an untracked file
+* at the location where both files end up after the
+* two directory renames.  See testcase 10d of t6043.
+*/
+   output(o, 1, _("Refusing to lose untracked file at "
+  "%s, even though it's in the way."),
+  path);
+   else
+   remove_file(o, 0, path, 0);
ret = update_file(o, 0, &mfi_c1.oid, mfi_c1.mode, new_path1);
if (!ret)
ret = update_file(o, 0, &mfi_c2.oid, mfi_c2.mode,
diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index e737bad2c5..6db764a1b6 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2660,7 +2660,7 @@ test_expect_success '10b-setup: Overwrite untracked with 
dir rename + delete' '
git commit -m "C"
 '
 
-test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+test_expect_success '10b-check: Overwrite untracked with dir rename + delete' '
git checkout B^0 &&
echo very >y/c &&
echo important >y/d &&
@@ -2727,7 +2727,7 @@ test_expect_success '10c-setup: Overwrite untracked with 
dir rename/rename(1to2)
git commit -m "C"
 '
 
-test_expect_failure '10c-check: Overwrite untracked with dir 
rename/rename(1to2)' '
+test_expect_success '10c-check: Overwrite untracked with dir 
rename/rename(1to2)' '
git checkout B^0 &&
echo important >y/c &&
 
@@ -2793,7 +2793,7 @@ test_expect_success '10d-setup: Delete untracked with dir 
rename/rename(2to1)' '
git c

[PATCH 20/30] merge-recursive: Add a new hashmap for storing directory renames

2017-11-10 Thread Elijah Newren

This just adds dir_rename_entry and the associated functions; code using
these will be added in subsequent commits.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 24 
 merge-recursive.h |  8 
 2 files changed, 32 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 8c9543d85c..89a9b32635 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -49,6 +49,30 @@ static unsigned int path_hash(const char *path)
return ignore_case ? strihash(path) : strhash(path);
 }
 
+static struct dir_rename_entry *dir_rename_find_entry(struct hashmap *hashmap, 
char *dir)
+{
+   struct dir_rename_entry key;
+
+   if (dir == NULL)
+   return NULL;
+   hashmap_entry_init(&key, strhash(dir));
+   key.dir = dir;
+   return hashmap_get(hashmap, &key, NULL);
+}
+
+static int dir_rename_cmp(void *unused_cmp_data,
+ const struct dir_rename_entry *e1,
+ const struct dir_rename_entry *e2,
+ const void *unused_keydata)
+{
+   return strcmp(e1->dir, e2->dir);
+}
+
+static void dir_rename_init(struct hashmap *map)
+{
+   hashmap_init(map, (hashmap_cmp_fn) dir_rename_cmp, NULL, 0);
+}
+
 static void flush_output(struct merge_options *o)
 {
if (o->buffer_output < 2 && o->obuf.len) {
diff --git a/merge-recursive.h b/merge-recursive.h
index 80d69d1401..a024949739 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -29,6 +29,14 @@ struct merge_options {
struct string_list df_conflict_file_set;
 };
 
+struct dir_rename_entry {
+   struct hashmap_entry ent; /* must be the first member! */
+   char *dir;
+   unsigned non_unique_new_dir:1;
+   char *new_dir;
+   struct string_list possible_new_dirs;
+};
+
 /* merge_trees() but with recursive ancestor consolidation */
 int merge_recursive(struct merge_options *o,
struct commit *h1,
-- 
2.15.0.5.g9567be9905

[PATCH 30/30] merge-recursive: Fix remaining directory rename + dirty overwrite cases

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 merge-recursive.c   | 26 +++---
 t/t6043-merge-rename-directories.sh |  8 
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 86ddb89727..6ef1d52f0a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1296,11 +1296,23 @@ static int handle_file(struct merge_options *o,
 
add = filespec_from_entry(&other, dst_entry, stage ^ 1);
if (add) {
+   int ren_src_was_dirty = was_dirty(o, rename->path);
char *add_name = unique_path(o, rename->path, other_branch);
if (update_file(o, 0, &add->oid, add->mode, add_name))
return -1;
 
-   remove_file(o, 0, rename->path, 0);
+   if (ren_src_was_dirty) {
+   output(o, 1, _("Refusing to lose dirty file at %s"),
+  rename->path);
+   }
+   /*
+* Stupid double negatives in remove_file; it somehow manages
+* to repeatedly mess me up.  So, just for myself:
+*1) update_wd iff !ren_src_was_dirty.
+*2) no_wd iff !update_wd
+*3) so, no_wd == !!ren_src_was_dirty == ren_src_was_dirty
+*/
+   remove_file(o, 0, rename->path, ren_src_was_dirty);
dst_name = unique_path(o, rename->path, cur_branch);
} else {
if (dir_in_way(rename->path, !o->call_depth, 0)) {
@@ -1436,7 +1448,10 @@ static int conflict_rename_rename_2to1(struct 
merge_options *o,
char *new_path2 = unique_path(o, path, ci->branch2);
output(o, 1, _("Renaming %s to %s and %s to %s instead"),
   a->path, new_path1, b->path, new_path2);
-   if (would_lose_untracked(path))
+   if (was_dirty(o, path))
+   output(o, 1, _("Refusing to lose dirty file at %s"),
+  path);
+   else if (would_lose_untracked(path))
/*
 * Only way we get here is if both renames were from
 * a directory rename AND user had an untracked file
@@ -2002,6 +2017,7 @@ static void apply_directory_rename_modifications(struct 
merge_options *o,
 {
struct string_list_item *item;
int stage = (tree == a_tree ? 2 : 3);
+   int update_wd;
 
/*
 * In all cases where we can do directory rename detection,
@@ -2012,7 +2028,11 @@ static void apply_directory_rename_modifications(struct 
merge_options *o,
 * saying the file would have been overwritten), but it might
 * be dirty, though.
 */
-   remove_file(o, 1, pair->two->path, 0 /* no_wd */);
+   update_wd = !was_dirty(o, pair->two->path);
+   if (!update_wd)
+   output(o, 1, _("Refusing to lose dirty file at %s"),
+  pair->two->path);
+   remove_file(o, 1, pair->two->path, !update_wd);
 
/* Find or create a new re->dst_entry */
item = string_list_lookup(entries, new_path);
diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 02c97c9823..676e72e9e6 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2985,7 +2985,7 @@ test_expect_success '11b-setup: Avoid losing dirty file 
involved in directory re
git commit -m "C"
 '
 
-test_expect_failure '11b-check: Avoid losing dirty file involved in directory 
rename' '
+test_expect_success '11b-check: Avoid losing dirty file involved in directory 
rename' '
git checkout B^0 &&
echo stuff >>z/c &&
 
@@ -3109,7 +3109,7 @@ test_expect_success '11d-setup: Avoid losing not-uptodate 
with rename + D/F conf
git commit -m "C"
 '
 
-test_expect_failure '11d-check: Avoid losing not-uptodate with rename + D/F 
conflict' '
+test_expect_success '11d-check: Avoid losing not-uptodate with rename + D/F 
conflict' '
git checkout B^0 &&
echo stuff >>z/c &&
 
@@ -3178,7 +3178,7 @@ test_expect_success '11e-setup: Avoid deleting 
not-uptodate with dir rename/rena
git commit -m "C"
 '
 
-test_expect_failure '11e-check: Avoid deleting not-uptodate with dir 
rename/rename(1to2)/add' '
+test_expect_success '11e-check: Avoid deleting not-uptodate with dir 
rename/rename(1to2)/add' '
git checkout B^0 &&
echo mods >>y/c &&
 
@@ -3247,7 +3247,7 @@ test_expect_success '11f-setup: Avoid deleting 
not-uptodate with dir rename/rena
git commit -m "C"
 '
 
-test_expect_failure '11f-check: Avoid deleting not-uptodate with dir 
rename/rename(2to1)' '
+test_expect_success '11f-check: Avoid deleting not-uptodate with dir 
rename/rename(2to1)' '
git checkout B^0 &&
echo important >>y/wham &&
 
-- 
2.15.0.5.g9567be9905

[PATCH 18/30] merge-recursive: Make !o->detect_rename codepath more obvious

2017-11-10 Thread Elijah Newren

Previously, if !o->detect_rename then get_renames() would return an
empty string_list, and then process_renames() would have nothing to
iterate over.  It seems more straightforward to simply avoid calling
either function in that case.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 7a3402e50c..f40c70990c 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1332,8 +1332,6 @@ static struct string_list *get_renames(struct 
merge_options *o,
struct diff_options opts;
 
renames = xcalloc(1, sizeof(struct string_list));
-   if (!o->detect_rename)
-   return renames;
 
diff_setup(&opts);
DIFF_OPT_SET(&opts, RECURSIVE);
@@ -1652,6 +1650,10 @@ static struct rename_info *handle_renames(struct 
merge_options *o,
 {
struct rename_info *rei = xcalloc(1, sizeof(struct rename_info));
 
+   *clean = 1;
+   if (!o->detect_rename)
+   return NULL;
+
rei->head_renames  = get_renames(o, head, common, head, merge, entries);
rei->merge_renames = get_renames(o, merge, common, head, merge, 
entries);
*clean = process_renames(o, rei->head_renames, rei->merge_renames);
@@ -1664,6 +1666,9 @@ static void cleanup_renames(struct rename_info *re_info)
const struct rename *re;
int i;
 
+   if (!re_info)
+   return;
+
for (i = 0; i < re_info->head_renames->nr; i++) {
re = re_info->head_renames->items[i].util;
diff_free_filepair(re->pair);
-- 
2.15.0.5.g9567be9905

[PATCH 04/30] directory rename detection: basic testcases

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 391 
 1 file changed, 391 insertions(+)
 create mode 100755 t/t6043-merge-rename-directories.sh

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
new file mode 100755
index 00..b737b0a105
--- /dev/null
+++ b/t/t6043-merge-rename-directories.sh
@@ -0,0 +1,391 @@
+#!/bin/sh
+
+test_description="recursive merge with directory renames"
+# includes checking of many corner cases, with a similar methodology to:
+#   t6042: corner cases with renames but not criss-cross merges
+#   t6036: corner cases with both renames and criss-cross merges
+#
+# The setup for all of them, pictorially, is:
+#
+#  B
+#  o
+# / \
+#  A o   ?
+# \ /
+#  o
+#  C
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits A, B, and C contain.
+#
+# Notation:
+#z/{b,c}   means  files z/b and z/c both exist
+#x/d_1 means  file x/d exists with content d1.  (Purpose of the
+# underscore notation is to differentiate different
+# files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###
+# SECTION 1: Basic cases we should be able to handle
+###
+
+# Testcase 1a, Basic directory rename.
+#   Commit A: z/{b,c}
+#   Commit B: y/{b,c}
+#   Commit C: z/{b,c,d,e/f}
+#   Expected: y/{b,c,d,e/f}
+
+test_expect_success '1a-setup: Simple directory rename detection' '
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   echo d >z/d &&
+   mkdir z/e &&
+   echo f >z/e/f &&
+   git add z/d z/e/f &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '1a-check: Simple directory rename detection' '
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 4 -eq $(git ls-files -s | wc -l) &&
+
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse HEAD:y/d) = $(git rev-parse C:z/d) &&
+   test "$(git hash-object y/d)" = $(git rev-parse C:z/d) &&
+   test $(git rev-parse HEAD:y/e/f) = $(git rev-parse C:z/e/f) &&
+   test_must_fail git rev-parse HEAD:z/d &&
+   test_must_fail git rev-parse HEAD:z/e/f &&
+   test ! -d z/d &&
+   test ! -d z/e/f
+'
+
+# Testcase 1b, Merge a directory with another
+#   Commit A: z/{b,c},   y/d
+#   Commit B: z/{b,c,e}, y/d
+#   Commit C: y/{b,c,d}
+#   Expected: y/{b,c,d,e}
+
+test_expect_success '1b-setup: Merge a directory with another' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   git mv z/b y &&
+   git mv z/c y &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '1b-check: Merge a directory with another' '
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 4 -eq $(git ls-files -s | wc -l) &&
+
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse HEAD:y/d) = $(git rev-parse A:y/d) &&
+   test $(git rev-parse HEAD:y/e) = $(git rev-parse B:z/e) &&
+   test_must_fail git rev-parse HEAD:z/e
+'
+
+# Testcase 1c, Transitive renaming
+#   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
+#   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   Commit A: z/{b,c},   x/d
+#   Commit B: y/{b,c},   x/d
+#   Commit C: z/{b,c,d}
+#   Expected: y/{b,c,d}  (because x/d -> z/d -> y/d)
+
+test_expect_success '1c-setup: Transitive renaming' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir x &&
+   echo d >x/d &&
+   git add z x &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &

[PATCH 27/30] merge-recursive: Apply necessary modifications for directory renames

2017-11-10 Thread Elijah Newren

This commit hooks together all the directory rename logic by making the
necessary changes to the rename struct, it's dst_entry, and the
diff_filepair under consideration.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c   | 195 +++-
 t/t6043-merge-rename-directories.sh |  50 -
 2 files changed, 219 insertions(+), 26 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 2a7258f6bb..838bfd32ec 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -166,6 +166,7 @@ static int oid_eq(const struct object_id *a, const struct 
object_id *b)
 
 enum rename_type {
RENAME_NORMAL = 0,
+   RENAME_DIR,
RENAME_DELETE,
RENAME_ONE_FILE_TO_ONE,
RENAME_ONE_FILE_TO_TWO,
@@ -599,6 +600,7 @@ struct rename {
 */
struct stage_data *src_entry;
struct stage_data *dst_entry;
+   unsigned add_turned_into_rename:1;
unsigned processed:1;
 };
 
@@ -633,6 +635,26 @@ static int update_stages(struct merge_options *opt, const 
char *path,
return 0;
 }
 
+static int update_stages_for_stage_data(struct merge_options *opt,
+   const char *path,
+   const struct stage_data *stage_data)
+{
+   struct diff_filespec o, a, b;
+   o.mode = stage_data->stages[1].mode;
+   oidcpy(&o.oid, &stage_data->stages[1].oid);
+
+   a.mode = stage_data->stages[2].mode;
+   oidcpy(&a.oid, &stage_data->stages[2].oid);
+
+   b.mode = stage_data->stages[3].mode;
+   oidcpy(&b.oid, &stage_data->stages[3].oid);
+
+   return update_stages(opt, path,
+is_null_sha1(o.oid.hash) ? NULL : &o,
+is_null_sha1(a.oid.hash) ? NULL : &a,
+is_null_sha1(b.oid.hash) ? NULL : &b);
+}
+
 static void update_entry(struct stage_data *entry,
 struct diff_filespec *o,
 struct diff_filespec *a,
@@ -1100,6 +1122,18 @@ static int merge_file_one(struct merge_options *o,
return merge_file_1(o, &one, &a, &b, branch1, branch2, mfi);
 }
 
+static int conflict_rename_dir(struct merge_options *o,
+  struct diff_filepair *pair,
+  const char *rename_branch,
+  const char *other_branch)
+{
+   const struct diff_filespec *dest = pair->two;
+
+   if (update_file(o, 1, &dest->oid, dest->mode, dest->path))
+   return -1;
+   return 0;
+}
+
 static int handle_change_delete(struct merge_options *o,
 const char *path, const char *old_path,
 const struct object_id *o_oid, int o_mode,
@@ -1369,6 +1403,24 @@ static int conflict_rename_rename_2to1(struct 
merge_options *o,
if (!ret)
ret = update_file(o, 0, &mfi_c2.oid, mfi_c2.mode,
  new_path2);
+   /*
+* unpack_trees() actually populates the index for us for
+* "normal" rename/rename(2to1) situtations so that the
+* correct entries are at the higher stages, which would
+* make the call below to update_stages_for_stage_data
+* unnecessary.  However, if either of the renames came
+* from a directory rename, then unpack_trees() will not
+* have gotten the right data loaded into the index, so we
+* need to do so now.  (While it'd be tempting to move this
+* call to update_stages_for_stage_data() to
+* apply_directory_rename_modifications(), that would break
+* our intermediate calls to would_lose_untracked() since
+* those rely on the current in-memory index.  See also the
+* big "NOTE" in update_stages()).
+*/
+   if (update_stages_for_stage_data(o, path, ci->dst_entry1))
+   ret = -1;
+
free(new_path2);
free(new_path1);
}
@@ -1888,6 +1940,120 @@ static char *check_for_directory_rename(struct 
merge_options *o,
return new_path;
 }
 
+static void apply_directory_rename_modifications(struct merge_options *o,
+struct diff_filepair *pair,
+char *new_path,
+struct rename *re,
+struct tree *tree,
+struct tree *o_tree,
+struct tree *a_tree,
+struct tree *b_tree,
+struct string_list *entries,
+

[PATCH 02/30] merge-recursive: Fix logic ordering issue

2017-11-10 Thread Elijah Newren

merge_trees() did a variety of work, including:
  * Calling get_unmerged() to get unmerged entries
  * Calling record_df_conflict_files() with all unmerged entries to
do some work to ensure we could handle D/F conflicts correctly
  * Calling get_renames() to check for renames.

An easily overlooked issue is that get_renames() can create more
unmerged entries and add them to the list, which have the possibility of
being involved in D/F conflicts.  So the call to
record_df_conflict_files() should really be moved after all the rename
detection.  I didn't come up with any testcases demonstrating any bugs
with the old ordering, but I suspect there were some for both normal
renames and for directory renames.  Fix the ordering.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1d3f8f0d22..52521faf09 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1981,10 +1981,10 @@ int merge_trees(struct merge_options *o,
get_files_dirs(o, merge);
 
entries = get_unmerged();
-   record_df_conflict_files(o, entries);
re_head  = get_renames(o, head, common, head, merge, entries);
re_merge = get_renames(o, merge, common, head, merge, entries);
clean = process_renames(o, re_head, re_merge);
+   record_df_conflict_files(o, entries);
if (clean < 0)
goto cleanup;
for (i = entries->nr-1; 0 <= i; i--) {
-- 
2.15.0.5.g9567be9905

[PATCH 03/30] merge-recursive: Add explanation for src_entry and dst_entry

2017-11-10 Thread Elijah Newren

If I have to walk through the debugger and inspect the values found in
here in order to figure out their meaning, despite having known these
things inside and out some years back, then they probably need a comment
for the casual reader to explain their purpose.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 52521faf09..3526c8d0b8 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -513,6 +513,28 @@ static void record_df_conflict_files(struct merge_options 
*o,
 
 struct rename {
struct diff_filepair *pair;
+   /*
+* Because I keep forgetting every few years what src_entry and
+* dst_entry are and have to walk through a debugger and puzzle
+* through it to remind myself...
+*
+* If 'before' is renamed to 'after' then src_entry will contain
+* the versions of 'before' from the merge_base, HEAD, and MERGE in
+* stages 1, 2, and 3; dst_entry will contain the versions of
+* 'after' from the merge_base, HEAD, and MERGE in stages 1, 2, and
+* 3.  Thus, we have a total of six modes and oids, though some
+* will be null.  (Stage 0 is ignored; we're interested in handling
+* conflicts.)
+*
+* Since we don't turn on break-rewrites by default, neither
+* src_entry nor dst_entry can have all three of their stages have
+* non-null oids, meaning at most four of the six will be non-null.
+* Also, since this is a rename, both src_entry and dst_entry will
+* have at least one non-null oid, meaning at least two will be
+* non-null.  Of the six oids, a typical rename will have three be
+* non-null.  Only two implies a rename/delete, and four implies a
+* rename/add.
+*/
struct stage_data *src_entry;
struct stage_data *dst_entry;
unsigned processed:1;
-- 
2.15.0.5.g9567be9905

[PATCH 21/30] merge-recursive: Add get_directory_renames()

2017-11-10 Thread Elijah Newren

This populates a list of directory renames for us.  The list of
directory renames is not yet used, but will be in subsequent commits.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 146 ++
 1 file changed, 146 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 89a9b32635..b5770d3d7f 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1376,6 +1376,124 @@ static struct diff_queue_struct *get_diffpairs(struct 
merge_options *o,
return ret;
 }
 
+static void get_renamed_dir_portion(const char *old_path, const char *new_path,
+   char **old_dir, char **new_dir) {
+   *old_dir = NULL;
+   *new_dir = NULL;
+
+   /* For
+*"a/b/c/d/foo.c" -> "a/b/something-else/d/foo.c"
+* the "d/foo.c" part is the same, we just want to know that
+*"a/b/c" was renamed to "a/b/something-else"
+* so, for this example, this function returns "a/b/c" in
+* *old_dir and "a/b/something-else" in *new_dir.
+*
+* Also, if the basename of the file changed, we don't care.  We
+* want to know which portion of the directory, if any, changed.
+*/
+   char *end_of_old = strrchr(old_path, '/');
+   char *end_of_new = strrchr(new_path, '/');
+   if (end_of_old == NULL || end_of_new == NULL)
+   return;
+   while (*--end_of_new == *--end_of_old &&
+  end_of_old != old_path &&
+  end_of_new != new_path)
+   ; // Do nothing; all in the while loop
+   /*
+* We've found the first non-matching character in the directory
+* paths.  That means the current directory we were comparing
+* represents the rename.  Move end_of_old and end_of_new back
+* to the full directory name.
+*/
+   if (*end_of_old == '/')
+   end_of_old++;
+   if (*end_of_old != '/')
+   end_of_new++;
+   end_of_old = strchr(end_of_old, '/');
+   end_of_new = strchr(end_of_new, '/');
+
+   /*
+* It may have been the case that old_path and new_path were the same
+* directory all along.  Don't claim a rename if they're the same.
+*/
+   int old_len = end_of_old - old_path;
+   int new_len = end_of_new - new_path;
+
+   if (old_len != new_len || strncmp(old_path, new_path, old_len)) {
+   *old_dir = strndup(old_path, old_len);
+   *new_dir = strndup(new_path, new_len);
+   }
+}
+
+static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
+struct tree *tree) {
+   struct hashmap *dir_renames;
+   struct hashmap_iter iter;
+   struct dir_rename_entry *entry;
+   int i;
+
+   dir_renames = malloc(sizeof(struct hashmap));
+   dir_rename_init(dir_renames);
+   for (i = 0; i < pairs->nr; ++i) {
+   struct string_list_item *item;
+   int *count;
+   struct diff_filepair *pair = pairs->queue[i];
+
+   char *old_dir, *new_dir;
+   get_renamed_dir_portion(pair->one->path, pair->two->path,
+   &old_dir,&new_dir);
+   if (!old_dir)
+   // Directory didn't change at all; ignore this one.
+   continue;
+
+   entry = dir_rename_find_entry(dir_renames, old_dir);
+   if (!entry) {
+   entry = xcalloc(1, sizeof(struct dir_rename_entry));
+   hashmap_entry_init(entry, strhash(old_dir));
+   hashmap_put(dir_renames, entry);
+   entry->dir = old_dir;
+   } else {
+   free(old_dir);
+   }
+   item = string_list_lookup(&entry->possible_new_dirs, new_dir);
+   if (!item) {
+   item = string_list_insert(&entry->possible_new_dirs, 
new_dir);
+   item->util = xcalloc(1, sizeof(int));
+   } else {
+   free(new_dir);
+   }
+   count = item->util;
+   *count += 1;
+   }
+
+   hashmap_iter_init(dir_renames, &iter);
+   while ((entry = hashmap_iter_next(&iter))) {
+   int max = 0;
+   int bad_max = 0;
+   char *best = NULL;
+   for (i = 0; i < entry->possible_new_dirs.nr; i++) {
+   int *count = entry->possible_new_dirs.items[i].util;
+   if (*count == max)
+   bad_max = max;
+   else if (*count > max) {
+   max = *count;
+   best = entry->possible_new_dirs.items[i].string;
+   }
+   }
+   if (bad_max == max)
+   en

[PATCH 13/30] directory rename detection: tests for handling overwriting untracked files

2017-11-10 Thread Elijah Newren

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 314 
 1 file changed, 314 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index bb179b16c8..7af8962512 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2559,4 +2559,318 @@ test_expect_failure '9g-check: Renamed directory that 
only contained immediate s
 #   side of history for any implicit directory renames.
 ###
 
+###
+# SECTION 10: Handling untracked files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling, at least in the case of directory renames.
+###
+
+# Testcase 10a, Overwrite untracked: normal rename/delete
+#   Commit A: z/{b,c_1}
+#   Commit B: z/b + untracked z/c + untracked z/d
+#   Commit C: z/{b,d_1}
+#   Expected: Aborted Merge +
+#   ERROR_MSG(untracked working tree files would be overwritten by merge)
+
+test_expect_success '10a-setup: Overwrite untracked with normal rename/delete' 
'
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git rm z/c &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   git mv z/c z/d &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_success '10a-check: Overwrite untracked with normal rename/delete' 
'
+   git checkout B^0 &&
+   echo very >z/c &&
+   echo important >z/d &&
+
+   test_must_fail git merge -s recursive C^0 >out 2>err &&
+   test_i18ngrep "The following untracked working tree files would be 
overwritten by merge" err &&
+
+   test 1 -eq $(git ls-files -s | wc -l) &&
+   test 4 -eq $(git ls-files -o | wc -l) &&
+
+   test "very" = "$(cat z/c)" &&
+   test "important" = "$(cat z/d)" &&
+   test $(git rev-parse HEAD:z/b) = $(git rev-parse A:z/b)
+'
+
+# Testcase 10b, Overwrite untracked: dir rename + delete
+#   Commit A: z/{b,c_1}
+#   Commit B: y/b + untracked y/{c,d,e}
+#   Commit C: z/{b,d_1,e}
+#   Expected: Failed Merge; y/b + untracked y/c + untracked y/d on disk +
+# z/c_1 -> z/d_1 rename recorded at stage 3 for y/d +
+#   ERROR_MSG(refusing to lose untracked file at 'y/d')
+
+test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git rm z/c &&
+   git mv z/ y/ &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   git mv z/c z/d &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+   git checkout B^0 &&
+   echo very >y/c &&
+   echo important >y/d &&
+   echo contents >y/e &&
+
+   test_must_fail git merge -s recursive C^0 >out 2>err &&
+   test_i18ngrep "CONFLICT (rename/delete).*Version C^0 of y/d left in 
tree at y/d~C^0" out &&
+   test_i18ngrep "Error: Refusing to lose untracked file at y/e; writing 
to y/e~C^0 instead" out &&
+
+   test 3 -eq $(git ls-files -s | wc -l) &&
+   test 2 -eq $(git ls-files -u | wc -l) &&
+   test 5 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+   test "very" = "$(cat y/c)" &&
+
+   test "important" = "$(cat y/d)" &&
+   test "important" != "$(git rev-parse :3:y/d)" &&
+   test $(git rev-parse :3:y/d) = $(git rev-parse A:z/c) &&
+
+   test "contents" = "$(cat y/e)" &&
+   test "contents" != "$(git rev-parse :3:y/e)" &&
+   test $(git rev-parse :3:y/e) = $(git rev-parse C:z/e)
+'
+
+# Testcase 10c, Overwrite untracked: dir rename/rename(1to2)
+#   Commit A: z/{a,b}, x/{c,d}
+#   Commit B: y/{a,b}, w/c, x/d + different untracked y/c
+#   Commit C: z/{a,b,c}, x/d
+#   Expected: Failed Merge; y/{a,b} + x/d + u

[PATCH 19/30] merge-recursive: Split out code for determining diff_filepairs

2017-11-10 Thread Elijah Newren

Create a new function, get_diffpairs() to compute the diff_filepairs
between two trees.  While these are currently only used in
get_renames(), I want them to be available to some new functions.  No
actual logic changes yet.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 81 ---
 1 file changed, 60 insertions(+), 21 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index f40c70990c..8c9543d85c 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1315,24 +1315,15 @@ static int conflict_rename_rename_2to1(struct 
merge_options *o,
 }
 
 /*
- * Get information of all renames which occurred between 'o_tree' and
- * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
- * 'b_tree') to be able to associate the correct cache entries with
- * the rename information. 'tree' is always equal to either a_tree or b_tree.
+ * Get the diff_filepairs changed between o_tree and tree.
  */
-static struct string_list *get_renames(struct merge_options *o,
-  struct tree *tree,
-  struct tree *o_tree,
-  struct tree *a_tree,
-  struct tree *b_tree,
-  struct string_list *entries)
+static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
+  struct tree *o_tree,
+  struct tree *tree)
 {
-   int i;
-   struct string_list *renames;
+   struct diff_queue_struct *ret;
struct diff_options opts;
 
-   renames = xcalloc(1, sizeof(struct string_list));
-
diff_setup(&opts);
DIFF_OPT_SET(&opts, RECURSIVE);
DIFF_OPT_CLR(&opts, RENAME_EMPTY);
@@ -1348,10 +1339,43 @@ static struct string_list *get_renames(struct 
merge_options *o,
diffcore_std(&opts);
if (opts.needed_rename_limit > o->needed_rename_limit)
o->needed_rename_limit = opts.needed_rename_limit;
-   for (i = 0; i < diff_queued_diff.nr; ++i) {
+
+   ret = malloc(sizeof(struct diff_queue_struct));
+   ret->queue = diff_queued_diff.queue;
+   ret->nr = diff_queued_diff.nr;
+   // Ignore diff_queued_diff.alloc; we won't be changing the size at all
+
+   opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+   diff_queued_diff.nr = 0;
+   diff_queued_diff.queue = NULL;
+   diff_flush(&opts);
+   return ret;
+}
+
+/*
+ * Get information of all renames which occurred in 'pairs', making use of
+ * any implicit directory renames inferred from the other side of history.
+ * We need the three trees in the merge ('o_tree', 'a_tree' and 'b_tree')
+ * to be able to associate the correct cache entries with the rename
+ * information; tree is always equal to either a_tree or b_tree.
+ */
+static struct string_list *get_renames(struct merge_options *o,
+  struct diff_queue_struct *pairs,
+  struct tree *tree,
+  struct tree *o_tree,
+  struct tree *a_tree,
+  struct tree *b_tree,
+  struct string_list *entries)
+{
+   int i;
+   struct string_list *renames;
+
+   renames = xcalloc(1, sizeof(struct string_list));
+
+   for (i = 0; i < pairs->nr; ++i) {
struct string_list_item *item;
struct rename *re;
-   struct diff_filepair *pair = diff_queued_diff.queue[i];
+   struct diff_filepair *pair = pairs->queue[i];
if (pair->status != 'R') {
diff_free_filepair(pair);
continue;
@@ -1375,9 +1399,6 @@ static struct string_list *get_renames(struct 
merge_options *o,
item = string_list_insert(renames, pair->one->path);
item->util = re;
}
-   opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-   diff_queued_diff.nr = 0;
-   diff_flush(&opts);
return renames;
 }
 
@@ -1649,15 +1670,33 @@ static struct rename_info *handle_renames(struct 
merge_options *o,
  int *clean)
 {
struct rename_info *rei = xcalloc(1, sizeof(struct rename_info));
+   struct diff_queue_struct *head_pairs, *merge_pairs;
 
*clean = 1;
if (!o->detect_rename)
return NULL;
 
-   rei->head_renames  = get_renames(o, head, common, head, merge, entries);
-   rei->merge_renames = get_renames(o, merge, common, head, merge, 
entries);
+   head_pairs = get_diffpairs(o, common, head);
+   merge_pairs = get_diffpairs(o, common, merge);
+
+   rei->head_renames  = get_renames(o, head_pairs, head,
+common, head, merge, entries);
+   r

Re: [PATCH v2 0/9] sequencer: dont't fork git commit

2017-11-10 Thread Junio C Hamano

Phillip Wood  writes:

> Here's the summary from the previous version
> These patches teach the sequencer to create commits without forking
> git commit when the commit message does not need to be edited. This
> speeds up cherry picking 10 commits by 26% and picking 10 commits with
> rebase --continue by 44%. The first few patches move bits of
> builtin/commit.c to sequencer.c. The last two patches actually
> implement creating commits in sequencer.c.

Thanks.  The changes since the initial iteration seems quite small
and I didn't find much objectionable.

Here are some style fixes I needed to add on top to make the output
of "diff master HEAD" checkpatch.pl-clean.  I think 3/9 and 9/9 are
the culprits.

diff --git a/sequencer.c b/sequencer.c
index 1f65e82696..a989588ee5 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -592,7 +592,7 @@ static int read_env_script(struct argv_array *env)
return 0;
 }
 
-static char *get_author(const char* message)
+static char *get_author(const char *message)
 {
size_t len;
const char *a;
@@ -1104,7 +1104,7 @@ static int try_to_commit(struct strbuf *msg, const char 
*author,
}
 
if (update_head_with_reflog(current_head, oid,
-   getenv("GIT_REFLOG_ACTION"), msg, &err)){
+   getenv("GIT_REFLOG_ACTION"), msg, &err)) {
res = error("%s", err.buf);
goto out;
}
@@ -1121,7 +1121,7 @@ static int try_to_commit(struct strbuf *msg, const char 
*author,
return res;
 }
 
-static int do_commit(const char *msg_file, const char* author,
+static int do_commit(const char *msg_file, const char *author,
 struct replay_opts *opts, unsigned int flags)
 {
int res = 1;
@@ -1521,7 +1521,7 @@ static int do_pick_commit(enum todo_command command, 
struct commit *commit,
strbuf_addstr(&msgbuf, oid_to_hex(&commit->object.oid));
strbuf_addstr(&msgbuf, ")\n");
}
-   if (!is_fixup (command))
+   if (!is_fixup(command))
author = get_author(msg.message);
}
 
diff --git a/sequencer.h b/sequencer.h
index 27f34be400..e0be354301 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -72,7 +72,7 @@ int template_untouched(const struct strbuf *sb, const char 
*template_file,
   enum commit_msg_cleanup_mode cleanup_mode);
 int update_head_with_reflog(const struct commit *old_head,
const struct object_id *new_head,
-   const char* action, const struct strbuf *msg,
+   const char *action, const struct strbuf *msg,
struct strbuf *err);
 void commit_post_rewrite(const struct commit *current_head,
 const struct object_id *new_head);

Re: [RFD] Long term plan with submodule refs?

2017-11-10 Thread Stefan Beller

>
>>> Basically, a workflow where it's easier to have each submodule checked
>>> out at master, and we can still keep track of historical relationship
>>> of what commit was the submodule at some time ago, but without causing
>>> some of these headaches.
>>
>> So essentially a repo or otherwise parallel workflow just with the versioning
>> happening magically behind your back?
>
> Ideally, my developers would like to just have each submodule checked
> out at master.
>
> Ideally, I'd like to be able to checkout an old version of the parent
> project and have it recorded what version of the shared submodule was
> at at the time.

This sounds as if a "passive superproject" would work best for you, i.e.
each commit in a submodule is bubbled up into the superproject,
making a commit potentially even behind the scenes, such that the
user interaction with the superproject would be none.

However this approach also sounds careless, as there is no precondition
that e.g. the superproject builds with all the submodules as is; it is a mere
tracking of "at this time we have the submodules arranged as such",
whereas for the versioning aspect, you would want to have commit messages
in the superproject saying *why* you bumped up a specific submodule.
The user may not like to give such an explanation as they already wrote
a commit message for the individual project.

Also this approach sounds like a local approach, as it is not clear to me,
why you'd want to share the superproject history.

> Ideally, my developers don't want to have to worry about knowing that
> they shouldn't "git add -a" or "git commit -a" when they have a
> submodule checked out at a different location from the parent projects
> gitlink.
>
> Thanks,
> Jake
>

Re: [RFC] protocol version 2

2017-11-10 Thread Jonathan Tan

On Fri, 20 Oct 2017 10:18:39 -0700
Brandon Williams  wrote:

> Some of the pain points with the current protocol spec are:

After some in-office discussion, I think that the most important pain
point is that we have to implement each protocol twice: once for
HTTP(S), and once for SSH (and friends) that support bidirectional byte
streams.

If it weren't for this, I think that what is discussed in this document
(e.g. ls-refs, fetch-object) can be less invasively accomplished with
v1, specifying "extra parameters" (explained in this e-mail [1]) to
merely tweak the output of upload-pack instead of replacing it nearly
completely, thus acting more as optimizations than changing the mode of
operation entirely.

[1] 
https://public-inbox.org/git/20171010193956.168385-1-jonathanta...@google.com/

>   * The server's initial response is the ref advertisement.  This
> advertisement cannot be omitted and can become an issue due to the
> sheer number of refs that can be sent with large repositories.  For
> example, when contacting the internal equivalent of
> `https://android.googlesource.com/`, the server will send
> approximately 1 million refs totaling 71MB.  This is data that is
> sent during each and every fetch and is not scalable.

For me, this is not a compelling one, because we can provide a ref
whitelist as an "extra parameter" in v1.

>   * Capabilities were implemented as a hack and are hidden behind a NUL
> byte after the first ref sent from the server during the ref
> advertisement:
> 
>\0  
> 
> Since they are sent in the context of a pkt-line they are also subject
> to the same length limitations (1k bytes with old clients).  While we
> may not be close to hitting this limitation with capabilities alone, it
> has become a problem when trying to abuse capabilities for other
> purposes (e.g. 
> [symrefs](https://public-inbox.org/git/20160816161838.klvjhhoxsftvkfmd@x/)).
> 
>   * Various other technical debt (e.g. abusing capabilities to
> communicate agent and symref data, service name set using a query
> parameter).

I think these 2 are the same - I would emphasize the fact that we cannot
add more stuff here, rather than the fact that we're putting this behind
NUL.

>  Special Packets
> -
> 
> In protocol v2 these special packets will have the following semantics:
> 
>   * '' Flush Packet (flush-pkt) - indicates the end of a message
>   * '0001' End-of-List delimiter (delim-pkt) - indicates the end of a list

To address the pain point of HTTP(S) being different from the others
(mentioned above), I think the packet semantics should be further
qualified:

 - Communications must be divided up into packets terminated by a
   flush-pkt. Also, each side must be implemented without knowing
   whether packets-in-progress can or cannot be seen by the other side.
 - Each request packet must have a corresponding, possibly empty,
   response packet.
 - A request packet may be sent even if a response packet corresponding
   to a previously sent request packet is awaited. (This allows us to
   retain the existing optimization in fetch-pack wherein, during
   negotiation, the "have" request-response packet pairs are
   interleaved.)

This will allow us to more easily share code between HTTP(S) and the
others.

In summary, I think that we need a big motivation to make the jump from
v1 to v2, instead of merely making small changes to v1 (and I do think
that the proposed new commands, such as "ls-refs" and "fetch-object",
can be implemented merely by small changes). And I think that the
ability to better share code between HTTP(S) and others provides that
motivation.

[PATCH v1] fsmonitor: simplify determining the git worktree under Windows

2017-11-10 Thread Ben Peart

I haven't tested the non Windows paths but the patch looks reasonable.

This inspired me to get someone more familiar with perl (thanks Johannes)
to revisit this code for the Windows side as well.  The logic for
determining the git worktree when running on Windows is more complex
than necessary.  It also spawns multiple processes (uname and cygpath)
which slows things down.

Simplify and speed up the process of finding the git worktree when
running on Windows by keeping it in perl and avoiding spawning helper
processes.

Signed-off-by: Ben Peart 
Signed-off-by: Johannes Schindelin 
---

Notes:
Base Ref:
Web-Diff: https://github.com/benpeart/git/commit/20affe124b
Checkout: git fetch https://github.com/benpeart/git fsmonitor_splitindex-v1 
&& git checkout 20affe124b

 t/t7519/fsmonitor-watchman | 13 +++--
 templates/hooks--fsmonitor-watchman.sample | 13 +++--
 2 files changed, 6 insertions(+), 20 deletions(-)

diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index 5fe72cefaf..5514edcf68 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -29,17 +29,10 @@ if ($version == 1) {
"Falling back to scanning...\n";
 }
 
-# Convert unix style paths to escaped Windows style paths when running
-# in Windows command prompt
-
-my $system = `uname -s`;
-$system =~ s/[\r\n]+//g;
 my $git_work_tree;
-
-if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
-   $git_work_tree = `cygpath -aw "\$PWD"`;
-   $git_work_tree =~ s/[\r\n]+//g;
-   $git_work_tree =~ s,\\,/,g;
+if ($^O =~ 'msys' || $^O =~ 'cygwin') {
+   $git_work_tree = Win32::GetCwd();
+   $git_work_tree =~ tr/\\/\//;
 } else {
require Cwd;
$git_work_tree = Cwd::cwd();
diff --git a/templates/hooks--fsmonitor-watchman.sample 
b/templates/hooks--fsmonitor-watchman.sample
index ba6d88c5f8..e673bb3980 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -28,17 +28,10 @@ if ($version == 1) {
"Falling back to scanning...\n";
 }
 
-# Convert unix style paths to escaped Windows style paths when running
-# in Windows command prompt
-
-my $system = `uname -s`;
-$system =~ s/[\r\n]+//g;
 my $git_work_tree;
-
-if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
-   $git_work_tree = `cygpath -aw "\$PWD"`;
-   $git_work_tree =~ s/[\r\n]+//g;
-   $git_work_tree =~ s,\\,/,g;
+if ($^O =~ 'msys' || $^O =~ 'cygwin') {
+   $git_work_tree = Win32::GetCwd();
+   $git_work_tree =~ tr/\\/\//;
 } else {
require Cwd;
$git_work_tree = Cwd::cwd();

base-commit: f9d9e50b62094689773dccc5f9493fa15e30d592
-- 
2.15.0.windows.1

Re: [PATCH] bisect: mention "view" as an alternative to "visualize"

2017-11-10 Thread Eric Sunshine

Thanks for the patch. Some comments below...

On Fri, Nov 10, 2017 at 11:32 AM, Robert P. J. Day
 wrote:
> Tweak a number of files to mention "view" as an alternative to
> "visualize":

You probably want to end this sentence with a period, not a colon.

>  Documentation/git-bisect.txt   | 9 -
>  Documentation/user-manual.txt  | 3 ++-
>  builtin/bisect--helper.c   | 2 +-
>  contrib/completion/git-completion.bash | 2 +-
>  git-bisect.sh  | 4 ++--
>  5 files changed, 10 insertions(+), 10 deletions(-)

The diffstat belongs below the "---" separator, otherwise this text
will (undesirably) become part of the commit message proper.

> Signed-off-by: Robert P. J. Day 
>
> ---
>
>   here's hoping i have the right format for this patch ... are there
> any parts of this that are inappropriate, such as extending the bash
> completion?

This is the correct place for your commentary. The diffstat should
appear below your commentary.

> diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt
> @@ -23,7 +23,7 @@ on the subcommand:
>   git bisect terms [--term-good | --term-bad]
>   git bisect skip [(|)...]
>   git bisect reset []
> - git bisect visualize
> + git bisect visualize|view

I think you need parentheses around these terms (see "git bisect
skip", for example):

git bisect (visualize | view)

However, in this case, it might be easier for readers if each is
presented on its own line (and subsequent discussion can make it clear
that they are synonyms).

git bisect visualize
git bisect view

But, that's a matter of taste...

> @@ -196,15 +196,14 @@ of `git bisect good` and `git bisect bad` to mark 
> commits.
>  Bisect visualize
>  
>
> -To see the currently remaining suspects in 'gitk', issue the following
> -command during the bisection process:
> +To see the currently remaining suspects in 'gitk', issue either of the
> +following equivalent commands during the bisection process:
>
>  
>  $ git bisect visualize
> +$ git bisect view
>  
>
> -`view` may also be used as a synonym for `visualize`.

Honestly, I think the original was clearer and placed a bit less
cognitive load on the reader. Moreover, for someone scanning the
documentation without reading it too deeply, the revised example makes
it seem as if it is necessary to invoke both commands rather than one
or the other.

> diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
> @@ -538,10 +538,11 @@ Note that the version which `git bisect` checks out for 
> you at each
>  point is just a suggestion, and you're free to try a different
>  version if you think it would be a good idea.  For example,
>  occasionally you may land on a commit that broke something unrelated;
> -run
> +run either of the equivalent commands
>
>  -
>  $ git bisect visualize
> +$ git bisect view
>  -

Same observation as above. This has the potential to confuse someone
quickly scanning the documentation into thinking that both commands
must be invoked. Merely stating in prose that one is the alias of the
other might be better.

>  which will run gitk and label the commit it chose with a marker that
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index fdd984d34..52f68c922 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -1162,7 +1162,7 @@ _git_bisect ()
>  {
> __git_has_doubledash && return
>
> -   local subcommands="start bad good skip reset visualize replay log run"
> +   local subcommands="start bad good skip reset visualize view replay 
> log run"

People using muscle memory to type "git bisect v"  or
"...vi" might find it annoying for this to suddenly become
ambiguous. Just an observation; no strong opinion on it...

> diff --git a/git-bisect.sh b/git-bisect.sh
> @@ -20,7 +20,7 @@ git bisect next
> find next bisection to test and check it out.
>  git bisect reset []
> finish bisection search and go back to commit.
> -git bisect visualize
> +git bisect visualize|view
> show bisect status in gitk.

Again, this might be easier to read if split over two lines:

git bisect visualize
git bisect view
show bisect status in gitk.

in which case it's plenty clear that the commands are synonyms.

1 2 >

1 - 100 of 129 matches

Mail list logo