Re: [PATCH 1/1] doc(stash): clarify the description of `save`

2019-10-10 Thread Thomas Gummerer
On 10/10, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin 
> 
> The original phrasing of this paragraph made at least one person stumble
> over the word "form" (thinking that it was a typo and "from" was
> intended), and other readers chimed in, agreeing that it was confusing:
> https://public-inbox.org/git/0102016b8d597569-c1f6cfdc-cb45-4428-8737-cb1bc30655d8-000...@eu-west-1.amazonses.com/#t
> 
> Let's rewrite that paragraph for clarity.
> 
> Inspired-by-a-patch-by: Catalin Criste 
> Signed-off-by: Johannes Schindelin 

Thanks for picking this thread up again, I had already forgotten about
it.  The updated wording sounds like an improvement to me.

> ---
>  Documentation/git-stash.txt | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
> index 8fbe12c66c..53e1a1205d 100644
> --- a/Documentation/git-stash.txt
> +++ b/Documentation/git-stash.txt
> @@ -87,8 +87,9 @@ The `--patch` option implies `--keep-index`.  You can use
>  save [-p|--patch] [-k|--[no-]keep-index] [-u|--include-untracked] [-a|--all] 
> [-q|--quiet] []::
>  
>   This option is deprecated in favour of 'git stash push'.  It
> - differs from "stash push" in that it cannot take pathspecs,
> - and any non-option arguments form the message.
> + differs from "stash push" in that it cannot take pathspecs.
> + Instead, all non-option arguments are concatenated to form the stash
> + message.
>  
>  list []::
>  
> -- 
> gitgitgadget


[PATCH v2] range-diff: don't segfault with mode-only changes

2019-10-08 Thread Thomas Gummerer
In ef283b3699 ("apply: make parse_git_diff_header public", 2019-07-11)
the 'parse_git_diff_header' function was made public and useable by
callers outside of apply.c.

However, it was missed that its (then) only caller, 'find_header', did
some error handling and completed 'struct patch' appropriately.

range-diff then started using this function, and tried to handle this
appropriately itself, but fell short in some cases.  This in turn
would lead to range-diff segfaulting when there are mode-only changes
in a range.

Move the error handling and completing of the struct into the
'parse_git_diff_header' function, so other callers can take advantage
of it.  This fixes the segfault in 'git range-diff'.

Reported-by: Uwe Kleine-König 
Signed-off-by: Thomas Gummerer 
---

Thanks Junio and Dscho for your reviews.  I decided to lift the whole
error handling behaviour from find_header into parse_git_diff_header,
instead of just filling the two names with xstrdup(def_name) if
(!old_name && !new_name && !!def_name).  I think the additional
information presented there can be useful.  For example we would have
gotten some "error: git diff header lacks filename information"
instead of a segfault for the problem described in
https://public-inbox.org/git/20191002141615.gb17...@kitsune.suse.cz/T/#me576615d7a151cf2ed46186c482fbd88f9959914.
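
To make the behaviour being moved a bit more concrete, here is a
minimal standalone sketch of the name completion and the two error
cases.  It is only an illustration under simplifying assumptions:
'struct patch_names', 'dup_string()' and 'complete_names()' are
made-up stand-ins (not git's 'struct patch' or 'xstrdup()'), and the
error text is shortened compared to the pluralised message in apply.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the fields of git's 'struct patch'. */
struct patch_names {
	char *old_name;
	char *new_name;
	char *def_name;
	int is_new;
	int is_delete;
};

/* Stand-in for xstrdup(): duplicate a string or abort on allocation failure. */
static char *dup_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (!copy)
		abort();
	memcpy(copy, s, n);
	return copy;
}

/*
 * Complete the names after the header has been parsed.  Returns 0 on
 * success and -128 if the header does not carry enough filename
 * information, mirroring the return value used in the patch.
 */
int complete_names(struct patch_names *p, int linenr)
{
	if (!p->old_name && !p->new_name) {
		if (!p->def_name) {
			fprintf(stderr, "error: git diff header lacks filename "
				"information (line %d)\n", linenr);
			return -128;
		}
		/* e.g. a mode-only change: both names come from the default */
		p->old_name = dup_string(p->def_name);
		p->new_name = dup_string(p->def_name);
	}
	if ((!p->new_name && !p->is_delete) ||
	    (!p->old_name && !p->is_new)) {
		fprintf(stderr, "error: git diff header lacks filename "
			"information (line %d)\n", linenr);
		return -128;
	}
	return 0;
}
```

With only 'def_name' set, both names end up as copies of it; with
nothing set at all, the caller gets -128 and an error message rather
than a NULL pointer to trip over later.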

Dscho, I didn't re-use your test case here as I had already written
one, and think what I have is slightly nicer in that it follows what
most other range-diff tests do in using the fast-exported history.  It
also expands the test coverage slightly, as we currently don't have
any coverage of the mode-change header, but will with this test.

The downside is of course that the fast export script is harder to
understand than the test you had, at least for me, but I think the
tradeoff of having the additional test coverage, and having it similar
to the rest of the test script is worth it.  If you strongly prefer
your test though I'm not going to be unhappy to use that :)

 apply.c                | 43 +-
 t/t3206-range-diff.sh  | 40 +++
 t/t3206/history.export | 31 +-
 3 files changed, 92 insertions(+), 22 deletions(-)

diff --git a/apply.c b/apply.c
index 57a61f2881..f8a046a6a5 100644
--- a/apply.c
+++ b/apply.c
@@ -1361,11 +1361,32 @@ int parse_git_diff_header(struct strbuf *root,
if (check_header_line(*linenr, patch))
return -1;
if (res > 0)
-   return offset;
+   goto done;
break;
}
}
 
+done:
+   if (!patch->old_name && !patch->new_name) {
+   if (!patch->def_name) {
+   error(Q_("git diff header lacks filename information when removing "
+"%d leading pathname component (line %d)",
+"git diff header lacks filename information when removing "
+"%d leading pathname components (line %d)",
+parse_hdr_state.p_value),
+ parse_hdr_state.p_value, *linenr);
+   return -128;
+   }
+   patch->old_name = xstrdup(patch->def_name);
+   patch->new_name = xstrdup(patch->def_name);
+   }
+   if ((!patch->new_name && !patch->is_delete) ||
+   (!patch->old_name && !patch->is_new)) {
+   error(_("git diff header lacks filename information "
+   "(line %d)"), *linenr);
+   return -128;
+   }
+   patch->is_toplevel_relative = 1;
return offset;
 }
 
@@ -1546,26 +1567,6 @@ static int find_header(struct apply_state *state,
return -128;
if (git_hdr_len <= len)
continue;
-   if (!patch->old_name && !patch->new_name) {
-   if (!patch->def_name) {
-   error(Q_("git diff header lacks 
filename information when removing "
-   "%d leading pathname 
component (line %d)",
-   "git diff header lacks 
filename information when removing "
-   "%d leading pathname 
components (line %d)",
-   state->p_value),
-  

Re: [PATCH v3 00/13] ci: include a Visual Studio build & test in our Azure Pipeline

2019-10-07 Thread Thomas Gummerer
On 10/07, Junio C Hamano wrote:
> Johannes Schindelin  writes:
> 
> > Date:   Fri, 04 Oct 2019 08:09:25 -0700 (PDT)
> > [...]
> > X-Google-Original-Date: Fri, 04 Oct 2019 15:09:10 GMT
> > [...]
> >
> > I am fairly certain that the latter is the actual `Date:` line sent to
> > GMail, and GMail just decides that it will not respect it.
> 
> If the submitting program said "Fri, 04 Oct 2019 15:09:10 +0000
> (GMT)" instead of "Fri, 04 Oct 2019 15:09:10 GMT", that would match
> the format the MTA produced itself, I guess.  I am kind-of surprised
> if the problem is the use of the obs-zone format (RFC 2822 page 31),
> but anything is possible with GMail X-<.

Yeah, the obs-zone format did seem to be the problem.  I just dug up
the previous thread we had about this, where I confirmed that +0000 as
the timezone worked just fine in my setup through GMail [*1*].  Not
sure if the (GMT) would cause any problems, but I'd agree with
avoiding it as you mention below to make sure GMail doesn't do
anything funny with it.
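
For illustration, a date with a numeric zone and no textual zone name
could be produced along these lines.  This is a rough sketch in C,
loosely modelled on the format string of send-email's
format_2822_time; 'format_2822_date()', 'has_numeric_zone()' and the
'offset_minutes' parameter are names made up for this example and are
not part of git or GitGitGadget.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Format 't' as a date with a numeric zone, e.g.
 * "Fri,  4 Oct 2019 15:09:10 +0000".  'offset_minutes' is the zone's
 * offset from UTC in minutes (an assumption of this sketch).
 * Returns what snprintf() returns, or -1 if the time cannot be converted.
 */
int format_2822_date(char *buf, size_t len, time_t t, int offset_minutes)
{
	static const char *const days[] = {
		"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
	};
	static const char *const months[] = {
		"Jan", "Feb", "Mar", "Apr", "May", "Jun",
		"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
	};
	/* shift to local wall-clock time, then break it down as if it were UTC */
	time_t shifted = t + (time_t)offset_minutes * 60;
	const struct tm *tm = gmtime(&shifted);
	int abs_off = abs(offset_minutes);

	assert(buf != NULL && len > 0);
	if (!tm)
		return -1;
	return snprintf(buf, len, "%s, %2d %s %d %02d:%02d:%02d %c%02d%02d",
			days[tm->tm_wday], tm->tm_mday, months[tm->tm_mon],
			tm->tm_year + 1900, tm->tm_hour, tm->tm_min, tm->tm_sec,
			offset_minutes >= 0 ? '+' : '-',
			abs_off / 60, abs_off % 60);
}

/*
 * Return 1 if 'date' ends in a plain numeric zone such as "+0000" or
 * "-0700", and 0 otherwise (for example for a trailing "GMT").
 */
int has_numeric_zone(const char *date)
{
	size_t n = strlen(date);
	const char *z;

	if (n < 6)
		return 0;
	z = date + n - 5;
	if (z[-1] != ' ' || (z[0] != '+' && z[0] != '-'))
		return 0;
	return isdigit((unsigned char)z[1]) && isdigit((unsigned char)z[2]) &&
	       isdigit((unsigned char)z[3]) && isdigit((unsigned char)z[4]);
}
```

Calling format_2822_date() with an offset of 0 yields the "+0000"
form, which has_numeric_zone() accepts, while it rejects a date
ending in the obs-zone "GMT".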

*1*: https://public-inbox.org/git/20190318214842.ga32...@hank.intra.tgummerer.com/

> How does send-email write that date header?  Matching that would be
> probably the most appropriate, if possible, given that GGG was
> written for send-email refugees, I guess ;-)
> 
> Here is what its format_2822_time sub does, so +0000 without any
> textual zone name, it is.
> 
>   return sprintf("%s, %2d %s %d %02d:%02d:%02d %s%02d%02d",
>  qw(Sun Mon Tue Wed Thu Fri Sat)[$localtm[6]],
>  $localtm[3],
>  qw(Jan Feb Mar Apr May Jun
> Jul Aug Sep Oct Nov Dec)[$localtm[4]],
>  $localtm[5]+1900,
>  $localtm[2],
>  $localtm[1],
>  $localtm[0],
>  ($offset >= 0) ? '+' : '-',
>  abs($offhour),
>  $offmin,
>  );
> 
> 


Re: Regression in v2.23

2019-10-07 Thread Thomas Gummerer
 GitHub as well if you prefer that to applying
the patch yourself: https://github.com/tgummerer/git 
tg/range-diff-mode-only-change

--- >8 ---
Subject: [PATCH] range-diff: don't segfault with mode-only changes

If we don't have a new file, deleted file or renamed file in a diff,
we currently add 'patch.new_name' to the range-diff header.  This
works well for files that are changed.  However if we have a pure mode
change, 'patch.new_name' is NULL, and thus range-diff segfaults.

We can however rely on 'patch.def_name' in that case, which is
extracted from the 'diff --git' line and should be equal to
'patch.new_name'.  Use that instead to avoid the segfault.

Signed-off-by: Thomas Gummerer 
---
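
As a standalone illustration of the selection logic described in the
commit message, the following sketch picks the header label and the
name to remember as the current filename.  It assumes simplified
stand-ins: 'struct patch_info', 'dup_string()' and
'format_header_label()' are hypothetical names for this example and
do not exist in range-diff.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced view of the fields range-diff looks at. */
struct patch_info {
	const char *old_name;
	const char *new_name;
	const char *def_name;
	int is_new;
	int is_delete;
	int is_rename;
};

/* Stand-in for xstrdup(): duplicate a string or abort on allocation failure. */
static char *dup_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (!copy)
		abort();
	memcpy(copy, s, n);
	return copy;
}

/*
 * Write the label for the " ## " header line into 'buf' and return a
 * newly allocated copy of the name to track as the current filename.
 * The caller owns (and frees) the returned string.
 */
char *format_header_label(char *buf, size_t len, const struct patch_info *p)
{
	const char *current;

	if (p->is_new > 0) {
		snprintf(buf, len, "%s (new)", p->new_name);
		current = p->new_name;
	} else if (p->is_delete > 0) {
		snprintf(buf, len, "%s (deleted)", p->old_name);
		current = p->old_name;
	} else if (p->is_rename) {
		snprintf(buf, len, "%s => %s", p->old_name, p->new_name);
		current = p->new_name;
	} else {
		/* plain or mode-only change: new_name may be NULL here */
		assert(p->def_name != NULL);
		snprintf(buf, len, "%s", p->def_name);
		current = p->def_name;
	}
	return dup_string(current);
}
```

For a pure mode change only 'def_name' is set, and the last branch is
the one that avoids dereferencing a NULL 'new_name'.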
 range-diff.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index ba1e9a4265..d8d906b3c6 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -116,20 +116,20 @@ static int read_patches(const char *range, struct string_list *list)
if (len < 0)
die(_("could not parse git header '%.*s'"), (int)len, line);
strbuf_addstr(&buf, " ## ");
-   if (patch.is_new > 0)
+   free(current_filename);
+   if (patch.is_new > 0) {
strbuf_addf(&buf, "%s (new)", patch.new_name);
-   else if (patch.is_delete > 0)
+   current_filename = xstrdup(patch.new_name);
+   } else if (patch.is_delete > 0) {
strbuf_addf(&buf, "%s (deleted)", 
patch.old_name);
-   else if (patch.is_rename)
-   strbuf_addf(&buf, "%s => %s", patch.old_name, 
patch.new_name);
-   else
-   strbuf_addstr(&buf, patch.new_name);
-
-   free(current_filename);
-   if (patch.is_delete > 0)
current_filename = xstrdup(patch.old_name);
-   else
+   } else if (patch.is_rename) {
+   strbuf_addf(&buf, "%s => %s", patch.old_name, 
patch.new_name);
current_filename = xstrdup(patch.new_name);
+   } else {
+   strbuf_addstr(&buf, patch.def_name);
+   current_filename = xstrdup(patch.def_name);
+   }
 
if (patch.new_mode && patch.old_mode &&
patch.old_mode != patch.new_mode)
-- 
2.23.0.501.gb744c3af07



Re: Bi-Weekly Standup - Time/timezone in calendar?

2019-09-30 Thread Thomas Gummerer
On 09/28, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> >> I thought it was to be 1700 UTC, which currently is 1800 BST here in UK, and
> >> 1900 CEST in Europe.
> >
> > That's my recollection as well, and what the calendar should say.
> > Thanks for flagging this!
> >
> > I don't know.  I'd be happy to keep it at 17:00 UTC, but that might be
> > a bit early for folks living on the west coast.  I don't have a strong
> > opinion on this, but I'm happy to update the calendar (or not
> > depending on what we decide) once the decision is made.
> 
> By the way, this is sort of off-topic, but should I add this to
> tinyurl.com/gitCal (or even better, should I add you as another
> editor of the said calendar), so that people have one fewer
> calendars to follow?

Yeah, I think that would be awesome, thanks for offering.  One less
calendar should definitely make things easier for people (and give the
standup some more visibility).

And I'd be happy to take care of adding/updating the events if you add
me as editor to the calendar.

Thanks!


Re: [git issue] git am failed for patches of converting the format of source codes from dos to unix

2019-09-27 Thread Thomas Gummerer
On 09/27, Beyondhorizon Zheng wrote:
> [git issue] git am failed for patches of converting the file format of
> source codes from dos to unix
> 
> Git version: git version 2.23.0
> Host PC: ubuntu 16.04.10
> Reporter: Shuang Zheng
> 
> I have submitted a patch which converts the file format of a source file
> from dos to unix with command:
> dos2unix misc/acrn-config/config_app/controller.py
> then submit a patch with this change:
> git add controller.py
> git commit -m “change file format”
> git format-patch HEAD~1

Here you are preparing a patch of your last commit.  Good.

> git am 0001-change-file-format.patch

Running this right after 'git format-patch' will try to apply the
patch, but your latest commit already includes those changes.  So
there is nothing to apply, and 'git am' fails.  Normally you would not
use 'format-patch' and then try to apply the patch again, but rather
send it to someone else, who could then apply your changes.

You could use 'git am -3 0001-change-file-format.patch', which falls
back to a three-way merge, which should tell you something like:

No changes -- Patch already applied.

If you want to check that the patch applies correctly, you can use
'git reset --hard HEAD~1' (make sure you don't have any uncommitted
changes you don't want to lose first), and then use the 'git am'
command from above.

Hope that helps.

> errors as below:
> Applying: change file format
> error: patch failed: misc/acrn-config/config_app/controller.py:1
> error: misc/acrn-config/config_app/controller.py: patch does not apply
> Patch failed at 0001 fix format
> hint: Use 'git am --show-current-patch' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".


Re: Bi-Weekly Standup - Time/timezone in calendar?

2019-09-25 Thread Thomas Gummerer
On 09/25, Philip Oakley wrote:
> Hi,
> 
> At the Virtual Git Contributors Summit we discussed (#13) the bi-weekly
> standup meetings (mentioned in the Git Rev News edition 55 under
> 'News/Various').
> 
> The Git Events calendar [1] that's linked from the Rev News doesn't actually
> say what time zone to use for the stand-up start time, so at first glance
> one can get confused by summer time and national time zone differences.
> 
> Currently it's saying (when clicked on, via 'more details')
> 
> Git Standup
> Monday, September 30⋅6:00 – 7:00pm
> Every 2 weeks on Monday
> 
> I thought it was to be 1700 UTC, which currently is 1800 BST here in UK, and
> 1900 CEST in Europe.

That's my recollection as well, and what the calendar should say.
Thanks for flagging this!

> If I hover over the event (have to restart the calendar), I (depending on
> view) do get an indicator in the lower left status bar that
> "Events shown in time zone: Coordinated Universal Time". but with a 5pm
> indication in the calendar sheet.
> 
> Is this google calendar trying to be too clever, or should "1700 UTC" be
> included in the event details? I don't use Google calendar except for
> occasional reference.

Yeah, I think the embed view of the Google Calendar is trying to be
too clever here.  I did set the event up to be at 17:00 UTC, and I'm
not sure why it showed up otherwise for you.

I have now updated the event with a few more details, including that
it is happening at 17:00 UTC, and added that information to the title
as well.  Hope that helps!

Also replying to your last question in 
https://github.com/git/git.github.io/issues/394#issuecomment-535030810:

> Also given Dscho's comment at the summit about late evenings, are we
> changing the time for those dark winter nights soon to come (N
> Hemisphere)?

I don't know.  I'd be happy to keep it at 17:00 UTC, but that might be
a bit early for folks living on the west coast.  I don't have a strong
opinion on this, but I'm happy to update the calendar (or not
depending on what we decide) once the decision is made.

> Philip
> 
> 
> [1] 
> https://calendar.google.com/calendar/embed?src=nk8ph2kh4p5tgfcctb8i7dm6d4%40group.calendar.google.com
> 


Re: [PATCH] add a Code of Conduct document

2019-09-24 Thread Thomas Gummerer
On 09/24, Jeff King wrote:
> We've never had a formally written Code of Conduct document. Though it
> has been discussed off and on over the years, for the most part the
> behavior on the mailing list has been good enough that nobody felt the
> need to push one forward.
> 
> However, even if there aren't specific problems now, it's a good idea to
> have a document:
> 
>   - it puts everybody on the same page with respect to expectations.
> This might avoid poor behavior, but also makes it easier to handle
> it if it does happen.
> 
>   - it publicly advertises that good conduct is important to us and will
> be enforced, which may make some people more comfortable with
> joining our community
> 
>   - it may be a good time to cement our expectations when things are
> quiet, since it gives everybody some distance rather than focusing
> on a current contentious issue
> 
> This patch adapts the Contributor Covenant Code of Conduct. As opposed
> to writing our own from scratch, this uses common and well-accepted
> language, and strikes a good balance between illustrating expectations
> and avoiding a laundry list of behaviors. It's also the same document
> used by the Git for Windows project.
> 
> The text is taken mostly verbatim from:
> 
>   https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
> 
> I also stole a very nice introductory paragraph from the Git for Windows
> version of the file.
> 
> There are a few subtle points, though:
> 
>   - the document refers to "the project maintainers". For the code, we
> generally only consider there to be one maintainer: Junio C Hamano.
> But for dealing with community issues, it makes sense to involve
> more people to spread the responsibility. I've listed the project
> committee address of g...@sfconservancy.org as the contact point.
> 
>   - the document mentions banning from the community, both in the intro
> paragraph and in "Our Responsibilities". The exact mechanism here is
> left vague. I can imagine it might start with social enforcement
> (not accepting patches, ignoring emails) and could escalate to
> technical measures if necessary (asking vger admins to block an
> address). It probably make sense _not_ to get too specific at this
> point, and deal with specifics as they come up.
> 
> Signed-off-by: Jeff King 

I don't have much to add to this: the commit message nicely spells out
all the reasons why we should have this, and I wholeheartedly agree
with introducing it and with choosing the Contributor Covenant as our
template.  So I'm adding my ACK to the others that have been coming in
already.  Thanks for submitting this!


Re: What's cooking in git.git (Sep 2019, #02; Wed, 18)

2019-09-19 Thread Thomas Gummerer
On 09/18, Junio C Hamano wrote:
> * tg/stash-refresh-index (2019-09-05) 3 commits
>  - stash: make sure to write refreshed cache
>  - merge: use refresh_and_write_cache
>  - factor out refresh_and_write_cache function
> 
>  "git stash" learned to write refreshed index back to disk.
> 
>  Needs coordination with js/builtin-add-i topic, as they both want
>  the same kind of enhancement to the same API function.

I have sent an updated version of this, that integrates the changes
the js/builtin-add-i topic needs in [*1*].  I think it would be ok to
pick up that version and keep js/builtin-add-i out of pu until it's
rebased on top of that.

Dscho: to help reduce the amount of work for you (and to double check
that my series works well with the builtin-add-i series), I have rebased
js/builtin-add-i on top of my series, and pushed the result to
https://github.com/tgummerer/git js/builtin-add-i.  Feel free to use
that if it helps :)

*1*: https://public-inbox.org/git/20190911182027.41284-1-t.gumme...@gmail.com/.


Re: Git standup on IRC

2019-09-16 Thread Thomas Gummerer
On 09/16, Jonathan Nieder wrote:
> > (In case it was not clear: those standups are meant to offer a really
> > informal venue to talk about patches you are working (or planning to
> > work) on, _especially_ for people who feel intimidated by this here
> > mailing list...)
> 
> Thanks to Thomas for setting up the calendar
> https://calendar.google.com/calendar/embed?src=nk8ph2kh4p5tgfcctb8i7dm6d4%40group.calendar.google.com

Here's a link to it in .ics format as well, as that may be easier to
subscribe to, depending on which calendar apps folks are using:
https://calendar.google.com/calendar/ical/nk8ph2kh4p5tgfcctb8i7dm6d4%40group.calendar.google.com/public/basic.ics


[PATCH v4 2/3] merge: use refresh_and_write_cache

2019-09-11 Thread Thomas Gummerer
Use the 'refresh_and_write_cache()' convenience function introduced in
the last commit, instead of refreshing and writing the index manually
in merge.c

Signed-off-by: Thomas Gummerer 
---
 builtin/merge.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index e2ccbc44e2..83e42fcb10 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -688,16 +688,13 @@ static int try_merge_strategy(const char *strategy, 
struct commit_list *common,
  struct commit_list *remoteheads,
  struct commit *head)
 {
-   struct lock_file lock = LOCK_INIT;
const char *head_arg = "HEAD";
 
-   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock,
-  COMMIT_LOCK | SKIP_IF_UNCHANGED))
+   if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED, 0) < 0)
return error(_("Unable to write index."));
 
if (!strcmp(strategy, "recursive") || !strcmp(strategy, "subtree")) {
+   struct lock_file lock = LOCK_INIT;
int clean, x;
struct commit *result;
struct commit_list *reversed = NULL;
@@ -860,12 +857,8 @@ static int merge_trivial(struct commit *head, struct 
commit_list *remoteheads)
 {
struct object_id result_tree, result_commit;
struct commit_list *parents, **pptr = &parents;
-   struct lock_file lock = LOCK_INIT;
 
-   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock,
-  COMMIT_LOCK | SKIP_IF_UNCHANGED))
+   if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED, 0) < 0)
return error(_("Unable to write index."));
 
write_tree_trivial(&result_tree);
-- 
2.23.0.rc2.194.ge5444969c9



[PATCH v4 1/3] factor out refresh_and_write_cache function

2019-09-11 Thread Thomas Gummerer
Getting the lock for the index, refreshing it and then writing it is a
pattern that happens more than once throughout the codebase, and isn't
trivial to get right.  Factor out the refresh_and_write_cache function
from builtin/am.c to read-cache.c, so it can be re-used in other
places in a subsequent commit.

Note that we return different error codes for failing to refresh the
cache, and failing to write the index.  The current caller only cares
about failing to write the index.  However for other callers we're
going to convert in subsequent patches we will need this distinction.

Helped-by: Martin Ågren 
Helped-by: Johannes Schindelin 
Signed-off-by: Thomas Gummerer 
---
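
For readers who want to see the return value convention in isolation,
here is a small standalone sketch of the control flow.  It is not the
code from read-cache.c: 'struct index_ops', 'refresh_and_write()' and
the stub functions are hypothetical stand-ins that replace the real
locking, refreshing and writing with callbacks, and the final
rollback of the lockfile is left out.

```c
#include <assert.h>

/* Hypothetical callbacks standing in for the real index operations. */
struct index_ops {
	int (*lock)(void);    /* returns a descriptor >= 0, or -1 on failure */
	int (*refresh)(void); /* returns non-zero if refreshing reports an error */
	int (*write)(void);   /* returns non-zero if writing the index fails */
};

/*
 * Returns 1 if refreshing reported an error, -1 if locking (without
 * 'gentle') or writing failed, and 0 on success.  A write failure
 * takes precedence over a refresh error.
 */
int refresh_and_write(const struct index_ops *ops, int gentle)
{
	int fd, ret = 0;

	fd = ops->lock();
	if (!gentle && fd < 0)
		return -1;
	if (ops->refresh())
		ret = 1;
	/* the index is still written if refreshing failed, unless locking failed */
	if (fd >= 0 && ops->write())
		ret = -1;
	return ret;
}

/* Stubs used to exercise the different outcomes. */
int lock_ok(void) { return 3; }
int lock_fail(void) { return -1; }
int refresh_ok(void) { return 0; }
int refresh_err(void) { return 1; }
int write_ok(void) { return 0; }
int write_fail(void) { return 1; }
```

A caller that only cares about write failures can test for "< 0",
while a caller that also wants to know about refresh errors can test
for any non-zero value.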
 builtin/am.c | 16 ++--
 cache.h  | 18 ++
 read-cache.c | 21 +
 3 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 1aea657a7f..92e0e70069 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1071,19 +1071,6 @@ static const char *msgnum(const struct am_state *state)
return sb.buf;
 }
 
-/**
- * Refresh and write index.
- */
-static void refresh_and_write_cache(void)
-{
-   struct lock_file lock_file = LOCK_INIT;
-
-   hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
-   die(_("unable to write index file"));
-}
-
 /**
  * Dies with a user-friendly message on how to proceed after resolving the
  * problem. This message can be overridden with state->resolvemsg.
@@ -1703,7 +1690,8 @@ static void am_run(struct am_state *state, int resume)
 
unlink(am_path(state, "dirtyindex"));
 
-   refresh_and_write_cache();
+   if (refresh_and_write_cache(REFRESH_QUIET, 0, 0) < 0)
+   die(_("unable to write index file"));
 
if (repo_index_has_changes(the_repository, NULL, &sb)) {
write_state_bool(state, "dirtyindex", 1);
diff --git a/cache.h b/cache.h
index b1da1ab08f..68a54f50ac 100644
--- a/cache.h
+++ b/cache.h
@@ -414,6 +414,7 @@ extern struct index_state the_index;
 #define add_file_to_cache(path, flags) add_file_to_index(&the_index, (path), (flags))
 #define chmod_cache_entry(ce, flip) chmod_index_entry(&the_index, (ce), (flip))
 #define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL, NULL)
+#define refresh_and_write_cache(refresh_flags, write_flags, gentle) repo_refresh_and_write_index(the_repository, (refresh_flags), (write_flags), (gentle), NULL, NULL, NULL)
 #define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), (options))
 #define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), (options))
 #define cache_dir_exists(name, namelen) index_dir_exists(&the_index, (name), (namelen))
@@ -812,6 +813,23 @@ void fill_stat_cache_info(struct index_state *istate, 
struct cache_entry *ce, st
 #define REFRESH_IN_PORCELAIN   0x0020  /* user friendly output, not "needs 
update" */
 #define REFRESH_PROGRESS   0x0040  /* show progress bar if stderr is tty */
 int refresh_index(struct index_state *, unsigned int flags, const struct 
pathspec *pathspec, char *seen, const char *header_msg);
+/*
+ * Refresh the index and write it to disk.
+ *
+ * 'refresh_flags' is passed directly to 'refresh_index()', while
+ * 'COMMIT_LOCK | write_flags' is passed to 'write_locked_index()', so
+ * the lockfile is always either committed or rolled back.
+ *
+ * If 'gentle' is passed, errors locking the index are ignored.
+ *
+ * Return 1 if refreshing the index returns an error, -1 if writing
+ * the index to disk fails, 0 on success.
+ *
+ * Note that if refreshing the index returns an error, we still write
+ * out the index (unless locking fails).
+ */
+int repo_refresh_and_write_index(struct repository*, unsigned int 
refresh_flags, unsigned int write_flags, int gentle, const struct pathspec *, 
char *seen, const char *header_msg);
+
 struct cache_entry *refresh_cache_entry(struct index_state *, struct 
cache_entry *, unsigned int);
 
 void set_alternate_index_output(const char *);
diff --git a/read-cache.c b/read-cache.c
index 52ffa8a313..7e646e06c2 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1472,6 +1472,27 @@ static void show_file(const char * fmt, const char * 
name, int in_porcelain,
printf(fmt, name);
 }
 
+int repo_refresh_and_write_index(struct repository *repo,
+unsigned int refresh_flags,
+unsigned int write_flags,
+int gentle,
+const struct pathspec *pathspec,
+char *seen, const char *header_msg)
+{
+   struct lock_file lock_file = LOCK_INIT;
+   int fd, ret = 0;
+
+   fd = repo_hold_locke

[PATCH v4 3/3] stash: make sure to write refreshed cache

2019-09-11 Thread Thomas Gummerer
When converting stash into C, calls to 'git update-index --refresh'
were replaced with the 'refresh_cache()' function.  That is fine as
long as the index is only needed in-core, and not re-read from disk.

However in many cases we do actually need the refreshed index to be
written to disk, for example 'merge_recursive_generic()' discards the
in-core index before re-reading it from disk, and in the case of 'apply
--quiet', the 'refresh_cache()' we currently have is pointless without
writing the index to disk.

Always write the index after refreshing it to ensure there are no
regressions in this compared to the scripted stash.  In the future we
can consider avoiding the write where possible after making sure none
of the subsequent calls actually need the refreshed cache, and it is
not expected to be refreshed after stash exits or it is written
somewhere else already.

Reported-by: Jeff King 
Signed-off-by: Thomas Gummerer 
---
 builtin/stash.c  | 11 +++
 t/t3903-stash.sh | 16 
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index b5a301f24d..ab30d1e920 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -396,7 +396,7 @@ static int do_apply_stash(const char *prefix, struct 
stash_info *info,
const struct object_id *bases[1];
 
read_cache_preload(NULL);
-   if (refresh_cache(REFRESH_QUIET))
+   if (refresh_and_write_cache(REFRESH_QUIET, 0, 0))
return -1;
 
if (write_cache_as_tree(&c_tree, 0, NULL))
@@ -485,7 +485,7 @@ static int do_apply_stash(const char *prefix, struct 
stash_info *info,
}
 
if (quiet) {
-   if (refresh_cache(REFRESH_QUIET))
+   if (refresh_and_write_cache(REFRESH_QUIET, 0, 0))
warning("could not refresh index");
} else {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1129,7 +1129,10 @@ static int do_create_stash(const struct pathspec *ps, 
struct strbuf *stash_msg_b
prepare_fallback_ident("git stash", "git@stash");
 
read_cache_preload(NULL);
-   refresh_cache(REFRESH_QUIET);
+   if (refresh_and_write_cache(REFRESH_QUIET, 0, 0) < 0) {
+   ret = -1;
+   goto done;
+   }
 
if (get_oid("HEAD", &info->b_commit)) {
if (!quiet)
@@ -1290,7 +1293,7 @@ static int do_push_stash(const struct pathspec *ps, const 
char *stash_msg, int q
free(ps_matched);
}
 
-   if (refresh_cache(REFRESH_QUIET)) {
+   if (refresh_and_write_cache(REFRESH_QUIET, 0, 0)) {
ret = -1;
goto done;
}
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index b8e337893f..392954d6dd 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1241,4 +1241,20 @@ test_expect_success 'stash --keep-index with file 
deleted in index does not resu
test_path_is_missing to-remove
 '
 
+test_expect_success 'stash apply should succeed with unmodified file' '
+   echo base >file &&
+   git add file &&
+   git commit -m base &&
+
+   # now stash a modification
+   echo modified >file &&
+   git stash &&
+
+   # make the file stat dirty
+   cp file other &&
+   mv other file &&
+
+   git stash apply
+'
+
 test_done
-- 
2.23.0.rc2.194.ge5444969c9



[PATCH v4 0/3] make sure stash refreshes the index properly

2019-09-11 Thread Thomas Gummerer
Compared to the previous round this round introduces a gentle flag for
refresh_and_write_{index,cache}, which should make this function
suitable for use in Dscho's builtin-add-i series.  The latter will have to
be 

I have also pushed this to https://github.com/tgummerer/git 
tg/stash-refresh-index

Range-diff below:

1:  7cc9f5fff4 ! 1:  2a7bebb20f factor out refresh_and_write_cache function
@@ Commit message
 about failing to write the index.  However for other callers we're
 going to convert in subsequent patches we will need this distinction.
 
+Helped-by: Martin Ågren 
+Helped-by: Johannes Schindelin 
 Signed-off-by: Thomas Gummerer 
 
  ## builtin/am.c ##
@@ builtin/am.c: static void am_run(struct am_state *state, int resume)
unlink(am_path(state, "dirtyindex"));
  
 -  refresh_and_write_cache();
-+  if (refresh_and_write_cache(REFRESH_QUIET, 0) < 0)
++  if (refresh_and_write_cache(REFRESH_QUIET, 0, 0) < 0)
 +  die(_("unable to write index file"));
  
if (repo_index_has_changes(the_repository, NULL, &sb)) {
@@ cache.h: extern struct index_state the_index;
  #define add_file_to_cache(path, flags) add_file_to_index(&the_index, 
(path), (flags))
  #define chmod_cache_entry(ce, flip) chmod_index_entry(&the_index, (ce), 
(flip))
  #define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, 
NULL, NULL)
-+#define refresh_and_write_cache(refresh_flags, write_flags) 
repo_refresh_and_write_index(the_repository, (refresh_flags), (write_flags), 
NULL, NULL, NULL)
++#define refresh_and_write_cache(refresh_flags, write_flags, gentle) 
repo_refresh_and_write_index(the_repository, (refresh_flags), (write_flags), 
(gentle), NULL, NULL, NULL)
  #define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), 
(st), (options))
  #define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), 
(options))
  #define cache_dir_exists(name, namelen) index_dir_exists(&the_index, 
(name), (namelen))
@@ cache.h: void fill_stat_cache_info(struct index_state *istate, struct 
cache_entr
 + * 'COMMIT_LOCK | write_flags' is passed to 'write_locked_index()', so
 + * the lockfile is always either committed or rolled back.
 + *
++ * If 'gentle' is passed, errors locking the index are ignored.
++ *
 + * Return 1 if refreshing the index returns an error, -1 if writing
 + * the index to disk fails, 0 on success.
 + *
-+ * Note that if refreshing the index returns an error, we don't write
-+ * the result to disk.
++ * Note that if refreshing the index returns an error, we still write
++ * out the result (unless locking failed).
 + */
-+int repo_refresh_and_write_index(struct repository*, unsigned int 
refresh_flags, unsigned int write_flags, const struct pathspec *, char *seen, 
const char *header_msg);
++int repo_refresh_and_write_index(struct repository*, unsigned int 
refresh_flags, unsigned int write_flags, int gentle, const struct pathspec *, 
char *seen, const char *header_msg);
 +
  struct cache_entry *refresh_cache_entry(struct index_state *, struct 
cache_entry *, unsigned int);
  
@@ read-cache.c: static void show_file(const char * fmt, const char * name, 
int in_
printf(fmt, name);
  }
  
-+int repo_refresh_and_write_index(struct  repository *repo,
++int repo_refresh_and_write_index(struct repository *repo,
 +   unsigned int refresh_flags,
 +   unsigned int write_flags,
++   int gentle,
 +   const struct pathspec *pathspec,
 +   char *seen, const char *header_msg)
 +{
 +  struct lock_file lock_file = LOCK_INIT;
++  int fd, ret = 0;
 +
-+  repo_hold_locked_index(repo, &lock_file, LOCK_DIE_ON_ERROR);
-+  if (refresh_index(repo->index, refresh_flags, pathspec, seen, 
header_msg)) {
-+  rollback_lock_file(&lock_file);
-+  return 1;
-+  }
-+  if (write_locked_index(repo->index, &lock_file, COMMIT_LOCK | 
write_flags))
++  fd = repo_hold_locked_index(repo, &lock_file, 0);
++  if (!gentle && fd < 0)
 +  return -1;
-+  return 0;
++  if (refresh_index(repo->index, refresh_flags, pathspec, seen, 
header_msg))
++  ret = 1;
++  if (0 <= fd && write_locked_index(repo->index, &lock_file, COMMIT_LOCK 
| write_flags))
++  ret = -1;
++  rollback_lock_file(&lock_file);
++  return ret;
 +}
 +
 +
2:  0367d938b1 ! 2:  555c982eae merge: use refresh_and_write_cache
@@ builtin/merge.c: static int try_merge_strategy(const char *strategy, 
struc

Re: [PATCH v3 1/3] factor out refresh_and_write_cache function

2019-09-11 Thread Thomas Gummerer
On 09/11, Johannes Schindelin wrote:
> Hi Thomas,
> 
> On Fri, 6 Sep 2019, Thomas Gummerer wrote:
> > Oops, I didn't realize there was another series in flight that also
> > introduces 'repo_refresh_and_write_index'.  Probably should have done
> > a test merge of this with pu.
> 
> Yep, our patches clash. I would not mind placing my patch series on top
> of yours, provided that you can make a few changes that I need ;-)

Sounds good.  Looking ahead further I don't mind these changes at all!

> > Right, and if gentle is set to false, it avoids writing the index,
> > which seems fine from my perspective.
> 
> This also suggests that it would make sense to avoid
> `LOCK_DIE_ON_ERROR`, _in particular_ because this is supposed to be a
> library function, not just a helper function for a one-shot built-in
> (don't you like how this idea "it is okay to use exit() to clean up
> after us, we don't care" comes back to bite us?).

Yup, returning an error for this definitely makes sense, especially
for future proofing.
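
As a rough illustration of returning an error instead of dying, here is a minimal standalone sketch. It does not use git's real API: 'struct demo_index', 'demo_try_lock()' and 'demo_refresh_and_write()' are hypothetical stand-ins, and the behaviour of 'gentle' shown here (a lock failure only skips the write) is an assumption about what the combined function could look like.

```c
#include <assert.h>

/* Hypothetical stand-in for the index state; not git's 'struct index_state'. */
struct demo_index {
	int locked_by_other; /* non-zero: somebody else holds the lock */
	int refreshed;       /* set once the in-core refresh ran */
	int written;         /* set once the result was "written to disk" */
};

/* Report a lock failure to the caller instead of exiting the process. */
static int demo_try_lock(const struct demo_index *idx)
{
	return idx->locked_by_other ? -1 : 0;
}

/*
 * Return -1 if the lock cannot be taken and 'gentle' is not set,
 * 0 otherwise.  With 'gentle', a lock failure only skips the write.
 */
static int demo_refresh_and_write(struct demo_index *idx, int gentle)
{
	int fd;

	assert(idx);
	fd = demo_try_lock(idx);
	if (!gentle && fd < 0)
		return -1;
	idx->refreshed = 1;       /* the in-core refresh runs from here on */
	if (fd >= 0)
		idx->written = 1; /* only write when we hold the lock */
	return 0;
}
```

The point of the sketch is that the caller, not the library function, decides whether a lock failure is fatal.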

> > >  - This version allows to pass pathspec, seen and header_msg, while
> > >the one in builtin-add-i cannot limit the part of the index
> > >getting refreshed with pathspec.  It wouldn't be a brain surgery
> > >to use this version and adjust the caller (there only is one) in
> > >the builtin-add-i topic.
> >
> > 'pathspec', 'seen' and 'header_msg' are not used in my version either,
> > I just implemented it for completeness and compatibility.  So I'd be
> > fine to do without them.
> 
> Oh, why not keep them? I'd rather keep them and adjust the caller in
> `builtin-add-i`.

Great, I'm happy to keep them.

> > There's two more differences between the versions:
> >
> >  - The version in my series allows passing in write_flags to be passed
> >to write_locked_index, which is required to convert the callers in
> >builtin/merge.c.
> 
> I can always pass in 0 as `write_flags`.
> 
> >  - Dscho's version also calls 'repo_read_index_preload()', which I
> >don't do in mine.  Some callers don't need to do that, so I think it
> >would be nice to keep that outside of the
> >'repo_refresh_and_write_index()' function.
> 
> Agreed.
> 
> > I can think of a few ways forward here:
> >
> >  - I incorporate features that are needed for the builtin-add-i series
> >here, and that is rebased on top of this series.
> 
> I'd prefer this way forward. The `builtin-add-i` patch series is
> evolving more slowly than yours.

Great!  I'll send an updated version of my series soon.  Thanks!


Re: [PATCH v3 1/3] factor out refresh_and_write_cache function

2019-09-06 Thread Thomas Gummerer
On 09/05, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > Getting the lock for the index, refreshing it and then writing it is a
> > pattern that happens more than once throughout the codebase, and isn't
> > trivial to get right.  Factor out the refresh_and_write_cache function
> > from builtin/am.c to read-cache.c, so it can be re-used in other
> > places in a subsequent commit.
> >
> > Note that we return different error codes for failing to refresh the
> > cache, and failing to write the index.  The current caller only cares
> > about failing to write the index.  However for other callers we're
> > going to convert in subsequent patches we will need this distinction.
> >
> > Signed-off-by: Thomas Gummerer 
> > ---
> >  builtin/am.c | 16 ++--
> >  cache.h  | 16 
> >  read-cache.c | 19 +++
> >  3 files changed, 37 insertions(+), 14 deletions(-)
> 
> I think this goes in the right direction, but obviously conflicts
> with what Dscho wants to do in the builtin-add-i series, and needs
> to be reconciled by working better together.

Oops, I didn't realize there was another series in flight that also
introduces 'repo_refresh_and_write_index'.  Probably should have done
a test merge of this with pu.

> For now, I'll eject builtin-add-i and queue this for a few days to
> give it a bit more exposure, but after that requeue builtin-add-i
> and discard these three patches.  By that time, hopefully you two
> would have a rerolled version of this one and builtin-add-i that
> agree what kind of refresh-and-write-index behaviour they both want.
>
> The differences I see that need reconciling are:

Thanks for writing these down.

>  - builtin-add-i seems to allow 'gentle' and allow returning an
>error when we cannot open the index for writing by passing false
>to 'gentle'; this feature is not used yet, though.

Right, and if gentle is set to false, it avoids writing the index,
which seems fine from my perspective.

>  - This version allows to pass pathspec, seen and header_msg, while
>the one in builtin-add-i cannot limit the part of the index
>getting refreshed with pathspec.  It wouldn't be a brain surgery
>to use this version and adjust the caller (there only is one) in
>the builtin-add-i topic.

'pathspec', 'seen' and 'header_msg' are not used in my version either;
I just implemented them for completeness and compatibility.  So I'd be
fine to do without them.

>  - This version does not write the index back when refresh_index()
>returns non-zero, but the one in builtin-add-i ignores the
>returned value.  I think, as a performance measure, it probably
>is a better idea to write it back, even when the function returns
>non-zero (the local variable's name is has_errors, but having an
>entry in the index that does not get refreshed is *not* an error;
>e.g. an unmerged entry is a normal thing in the index, and as
>long as we refreshed other entries while having an unmerged and
>unrefreshable entry, we are making progress that is worth writing
>out).

I'm happy with writing the index back even if there are errors.
However I think we still need the option to get the return code from
'refresh_index()', as some callers where I'm using
'refresh_and_write_index()' in this series behave differently
depending on its return code.
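
To make the return-code convention concrete, here is a minimal standalone sketch. It does not call git's real functions: 'mock_refresh()', 'mock_write()' and 'refresh_and_write()' are hypothetical stand-ins for 'refresh_index()', 'write_locked_index()' and the new helper, and the sketch assumes the write is attempted even when the refresh reports errors.

```c
#include <assert.h>

/* Hypothetical stand-in for the index; not git's 'struct index_state'. */
struct mock_index {
	int unrefreshable_entries; /* e.g. unmerged entries */
	int disk_is_writable;
	int written;               /* set to 1 once "written to disk" */
};

/* Return non-zero if some entries could not be refreshed. */
static int mock_refresh(const struct mock_index *idx)
{
	return idx->unrefreshable_entries > 0;
}

/* Return 0 on success, -1 on failure. */
static int mock_write(struct mock_index *idx)
{
	if (!idx->disk_is_writable)
		return -1;
	idx->written = 1;
	return 0;
}

/*
 * Return 1 if the refresh reported errors, -1 if writing failed,
 * 0 on success.  The write is attempted after either refresh outcome.
 */
static int refresh_and_write(struct mock_index *idx)
{
	int ret = 0;

	assert(idx);
	if (mock_refresh(idx))
		ret = 1;
	if (mock_write(idx))
		ret = -1;
	return ret;
}
```

A caller that only cares about write failures can test for '< 0', while a caller that also cares about the refresh result can test for any non-zero value.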

There are two more differences between the versions:

 - The version in my series allows write_flags to be passed through
   to write_locked_index, which is required to convert the callers in
   builtin/merge.c.

 - Dscho's version also calls 'repo_read_index_preload()', which I
   don't do in mine.  Some callers don't need to do that, so I think it
   would be nice to keep that outside of the
   'repo_refresh_and_write_index()' function.

I can think of a few ways forward here:

 - I incorporate features that are needed for the builtin-add-i series
   here, and that is rebased on top of this series.

 - We drop the first two patches of this series, so we only fix the
   problems in 'git stash' for now.  Later we can have a refactoring
   series that uses repo_refresh_and_write_index in the places we
   converted here, once the dust of the builtin-add-i series has settled.

 - I rebase this on top of builtin-add-i.

I'm happy with either of the first two, but less so with the last
option.  I was hoping this series could potentially go to maint as it
was a bugfix, which we obviously can't do with that option.

Dscho, what do you think? :)

> Thanks.
> 
> > +int repo_refresh_and_write_index(struct  reposito

[PATCH v3 2/3] merge: use refresh_and_write_cache

2019-09-03 Thread Thomas Gummerer
Use the 'refresh_and_write_cache()' convenience function introduced in
the last commit, instead of refreshing and writing the index manually
in builtin/merge.c.

Signed-off-by: Thomas Gummerer 
---
 builtin/merge.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index e2ccbc44e2..0148d938c9 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -688,16 +688,13 @@ static int try_merge_strategy(const char *strategy, 
struct commit_list *common,
  struct commit_list *remoteheads,
  struct commit *head)
 {
-   struct lock_file lock = LOCK_INIT;
const char *head_arg = "HEAD";
 
-   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock,
-  COMMIT_LOCK | SKIP_IF_UNCHANGED))
+   if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED) < 0)
return error(_("Unable to write index."));
 
if (!strcmp(strategy, "recursive") || !strcmp(strategy, "subtree")) {
+   struct lock_file lock = LOCK_INIT;
int clean, x;
struct commit *result;
struct commit_list *reversed = NULL;
@@ -860,12 +857,8 @@ static int merge_trivial(struct commit *head, struct 
commit_list *remoteheads)
 {
struct object_id result_tree, result_commit;
struct commit_list *parents, **pptr = &parents;
-   struct lock_file lock = LOCK_INIT;
 
-   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock,
-  COMMIT_LOCK | SKIP_IF_UNCHANGED))
+   if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED) < 0)
return error(_("Unable to write index."));
 
write_tree_trivial(&result_tree);
-- 
2.23.0.rc2.194.ge5444969c9



[PATCH v3 3/3] stash: make sure to write refreshed cache

2019-09-03 Thread Thomas Gummerer
When converting stash into C, calls to 'git update-index --refresh'
were replaced with the 'refresh_cache()' function.  That is fine as
long as the index is only needed in-core, and not re-read from disk.

However in many cases we do actually need the refreshed index to be
written to disk, for example 'merge_recursive_generic()' discards the
in-core index before re-reading it from disk, and in the case of 'apply
--quiet', the 'refresh_cache()' we currently have is pointless without
writing the index to disk.

Always write the index after refreshing it to ensure there are no
regressions in this compared to the scripted stash.  In the future we
can consider avoiding the write where possible after making sure none
of the subsequent calls actually need the refreshed cache, and it is
not expected to be refreshed after stash exits or it is written
somewhere else already.

Reported-by: Jeff King 
Signed-off-by: Thomas Gummerer 
---
 builtin/stash.c  | 11 +++
 t/t3903-stash.sh | 16 
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index b5a301f24d..da1260ca8e 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -396,7 +396,7 @@ static int do_apply_stash(const char *prefix, struct 
stash_info *info,
const struct object_id *bases[1];
 
read_cache_preload(NULL);
-   if (refresh_cache(REFRESH_QUIET))
+   if (refresh_and_write_cache(REFRESH_QUIET, 0))
return -1;
 
if (write_cache_as_tree(&c_tree, 0, NULL))
@@ -485,7 +485,7 @@ static int do_apply_stash(const char *prefix, struct 
stash_info *info,
}
 
if (quiet) {
-   if (refresh_cache(REFRESH_QUIET))
+   if (refresh_and_write_cache(REFRESH_QUIET, 0))
warning("could not refresh index");
} else {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1129,7 +1129,10 @@ static int do_create_stash(const struct pathspec *ps, 
struct strbuf *stash_msg_b
prepare_fallback_ident("git stash", "git@stash");
 
read_cache_preload(NULL);
-   refresh_cache(REFRESH_QUIET);
+   if (refresh_and_write_cache(REFRESH_QUIET, 0) < 0) {
+   ret = -1;
+   goto done;
+   }
 
if (get_oid("HEAD", &info->b_commit)) {
if (!quiet)
@@ -1290,7 +1293,7 @@ static int do_push_stash(const struct pathspec *ps, const 
char *stash_msg, int q
free(ps_matched);
}
 
-   if (refresh_cache(REFRESH_QUIET)) {
+   if (refresh_and_write_cache(REFRESH_QUIET, 0)) {
ret = -1;
goto done;
}
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index b8e337893f..392954d6dd 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1241,4 +1241,20 @@ test_expect_success 'stash --keep-index with file 
deleted in index does not resu
test_path_is_missing to-remove
 '
 
+test_expect_success 'stash apply should succeed with unmodified file' '
+   echo base >file &&
+   git add file &&
+   git commit -m base &&
+
+   # now stash a modification
+   echo modified >file &&
+   git stash &&
+
+   # make the file stat dirty
+   cp file other &&
+   mv other file &&
+
+   git stash apply
+'
+
 test_done
-- 
2.23.0.rc2.194.ge5444969c9
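
The new test makes the file "stat dirty": the content is unchanged, but the stat data no longer matches what was recorded earlier. A minimal standalone sketch of that effect, assuming a typical POSIX filesystem; 'write_text()' and 'replace_with_copy()' are hypothetical helpers written for this illustration, not part of git or its test suite.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Write 'text' to 'path', replacing any previous content.  0 on success. */
static int write_text(const char *path, const char *text)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputs(text, f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Mimic "cp file other && mv other file" for a file containing 'text'.
 * Return 1 if the inode changed while the size stayed the same,
 * 0 if not, -1 on error.
 */
static int replace_with_copy(const char *path, const char *tmp,
			     const char *text)
{
	struct stat before, after;

	assert(path && tmp && text);
	if (write_text(path, text) || stat(path, &before))
		return -1;
	/* "cp": a second file with identical content, next to the first */
	if (write_text(tmp, text))
		return -1;
	/* "mv": replace the original with the copy */
	if (rename(tmp, path) || stat(path, &after))
		return -1;
	return before.st_ino != after.st_ino &&
	       before.st_size == after.st_size;
}
```

Because both files exist side by side before the rename, the replacement carries a different inode number even though its content is byte-for-byte identical.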



[PATCH v3 0/3] make sure stash refreshes the index properly

2019-09-03 Thread Thomas Gummerer
Thanks Martin and Junio for the comments on the previous round.

Changes compared to the previous round:
- Document that when failing to refresh the index, the result won't be
  written to disk.
- Roll back the lock file if refreshing the index fails, so we don't
  end up with a lock file that can't be rolled back or committed after
  the function returns.
- Some small tweaks in the commit message and documentation of the
  function.

Range-diff below:

1:  1f25fe227c ! 1:  7cc9f5fff4 factor out refresh_and_write_cache function
@@ Commit message
 factor out refresh_and_write_cache function
 
 Getting the lock for the index, refreshing it and then writing it is a
-pattern that happens more than once throughout the codebase.  Factor
-out the refresh_and_write_cache function from builtin/am.c to
-read-cache.c, so it can be re-used in other places in a subsequent
-commit.
+pattern that happens more than once throughout the codebase, and isn't
+trivial to get right.  Factor out the refresh_and_write_cache function
+from builtin/am.c to read-cache.c, so it can be re-used in other
+places in a subsequent commit.
 
 Note that we return different error codes for failing to refresh the
 cache, and failing to write the index.  The current caller only cares
@@ cache.h: void fill_stat_cache_info(struct index_state *istate, struct 
cache_entr
 + * 'COMMIT_LOCK | write_flags' is passed to 'write_locked_index()', so
 + * the lockfile is always either committed or rolled back.
 + *
-+ * Return 1 if refreshing the cache failed, -1 if writing the cache to
-+ * disk failed, 0 on success.
++ * Return 1 if refreshing the index returns an error, -1 if writing
++ * the index to disk fails, 0 on success.
++ *
++ * Note that if refreshing the index returns an error, we don't write
++ * the result to disk.
 + */
 +int repo_refresh_and_write_index(struct repository*, unsigned int 
refresh_flags, unsigned int write_flags, const struct pathspec *, char *seen, 
const char *header_msg);
 +
@@ read-cache.c: static void show_file(const char * fmt, const char * name, 
int in_
 +  struct lock_file lock_file = LOCK_INIT;
 +
 +  repo_hold_locked_index(repo, &lock_file, LOCK_DIE_ON_ERROR);
-+  if (refresh_index(repo->index, refresh_flags, pathspec, seen, 
header_msg))
++  if (refresh_index(repo->index, refresh_flags, pathspec, seen, 
header_msg)) {
++  rollback_lock_file(&lock_file);
 +  return 1;
++  }
 +  if (write_locked_index(repo->index, &lock_file, COMMIT_LOCK | 
write_flags))
 +  return -1;
 +  return 0;
2:  148a65d649 = 2:  0367d938b1 merge: use refresh_and_write_cache
3:  e0f6815192 = 3:  8ed3df9fec stash: make sure to write refreshed cache

Thomas Gummerer (3):
  factor out refresh_and_write_cache function
  merge: use refresh_and_write_cache
  stash: make sure to write refreshed cache

 builtin/am.c | 16 ++--
 builtin/merge.c  | 13 +++--
 builtin/stash.c  | 11 +++
 cache.h  | 16 
 read-cache.c | 19 +++
 t/t3903-stash.sh | 16 
 6 files changed, 63 insertions(+), 28 deletions(-)

-- 
2.23.0.rc2.194.ge5444969c9


[PATCH v3 1/3] factor out refresh_and_write_cache function

2019-09-03 Thread Thomas Gummerer
Getting the lock for the index, refreshing it and then writing it is a
pattern that happens more than once throughout the codebase, and isn't
trivial to get right.  Factor out the refresh_and_write_cache function
from builtin/am.c to read-cache.c, so it can be re-used in other
places in a subsequent commit.

Note that we return different error codes for failing to refresh the
cache, and failing to write the index.  The current caller only cares
about failing to write the index.  However for other callers we're
going to convert in subsequent patches we will need this distinction.

Signed-off-by: Thomas Gummerer 
---
 builtin/am.c | 16 ++--
 cache.h  | 16 
 read-cache.c | 19 +++
 3 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 1aea657a7f..ddedd2b9d4 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1071,19 +1071,6 @@ static const char *msgnum(const struct am_state *state)
return sb.buf;
 }
 
-/**
- * Refresh and write index.
- */
-static void refresh_and_write_cache(void)
-{
-   struct lock_file lock_file = LOCK_INIT;
-
-   hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
-   die(_("unable to write index file"));
-}
-
 /**
  * Dies with a user-friendly message on how to proceed after resolving the
  * problem. This message can be overridden with state->resolvemsg.
@@ -1703,7 +1690,8 @@ static void am_run(struct am_state *state, int resume)
 
unlink(am_path(state, "dirtyindex"));
 
-   refresh_and_write_cache();
+   if (refresh_and_write_cache(REFRESH_QUIET, 0) < 0)
+   die(_("unable to write index file"));
 
if (repo_index_has_changes(the_repository, NULL, &sb)) {
write_state_bool(state, "dirtyindex", 1);
diff --git a/cache.h b/cache.h
index b1da1ab08f..2b14768bea 100644
--- a/cache.h
+++ b/cache.h
@@ -414,6 +414,7 @@ extern struct index_state the_index;
 #define add_file_to_cache(path, flags) add_file_to_index(&the_index, (path), 
(flags))
 #define chmod_cache_entry(ce, flip) chmod_index_entry(&the_index, (ce), (flip))
 #define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL, 
NULL)
+#define refresh_and_write_cache(refresh_flags, write_flags) 
repo_refresh_and_write_index(the_repository, (refresh_flags), (write_flags), 
NULL, NULL, NULL)
 #define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), 
(options))
 #define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), 
(options))
 #define cache_dir_exists(name, namelen) index_dir_exists(&the_index, (name), 
(namelen))
@@ -812,6 +813,21 @@ void fill_stat_cache_info(struct index_state *istate, 
struct cache_entry *ce, st
 #define REFRESH_IN_PORCELAIN   0x0020  /* user friendly output, not "needs 
update" */
 #define REFRESH_PROGRESS   0x0040  /* show progress bar if stderr is tty */
 int refresh_index(struct index_state *, unsigned int flags, const struct 
pathspec *pathspec, char *seen, const char *header_msg);
+/*
+ * Refresh the index and write it to disk.
+ *
+ * 'refresh_flags' is passed directly to 'refresh_index()', while
+ * 'COMMIT_LOCK | write_flags' is passed to 'write_locked_index()', so
+ * the lockfile is always either committed or rolled back.
+ *
+ * Return 1 if refreshing the index returns an error, -1 if writing
+ * the index to disk fails, 0 on success.
+ *
+ * Note that if refreshing the index returns an error, we don't write
+ * the result to disk.
+ */
+int repo_refresh_and_write_index(struct repository*, unsigned int 
refresh_flags, unsigned int write_flags, const struct pathspec *, char *seen, 
const char *header_msg);
+
 struct cache_entry *refresh_cache_entry(struct index_state *, struct 
cache_entry *, unsigned int);
 
 void set_alternate_index_output(const char *);
diff --git a/read-cache.c b/read-cache.c
index 52ffa8a313..2ad96677ae 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1472,6 +1472,25 @@ static void show_file(const char * fmt, const char * 
name, int in_porcelain,
printf(fmt, name);
 }
 
+int repo_refresh_and_write_index(struct  repository *repo,
+unsigned int refresh_flags,
+unsigned int write_flags,
+const struct pathspec *pathspec,
+char *seen, const char *header_msg)
+{
+   struct lock_file lock_file = LOCK_INIT;
+
+   repo_hold_locked_index(repo, &lock_file, LOCK_DIE_ON_ERROR);
+   if (refresh_index(repo->index, refresh_flags, pathspec, seen, 
header_msg)) {
+   rollback_lock_file(&lock_file);
+   return 1;
+   }
+   if (write_locked_index(rep

[PATCH] push: disallow --all and refspecs when remote..mirror is set

2019-09-02 Thread Thomas Gummerer
On 08/20, Filippo Valsorda wrote:
> When used in a repository cloned with --mirror, git push with refs on
> the command line deletes all unmentioned refs.
> 
> This was investigated by @saleemrash1d on Twitter. I'm reporting
> their findings here and a reproduction below.
> 
> > seems to be a regression.
> > TRANSPORT_PUSH_MIRROR is set by remote->mirror
> > (https://github.com/git/git/blob/5fa0f5238b0/builtin/push.c#L410)
> > AFTER the check for refspecs provided on the command-line
> > (https://github.com/git/git/blob/5fa0f5238/builtin/push.c#L615).
> > introduced by 800a4ab399e954b8970897076b327bf1cf18c0ac.
> 
> > it's mirroring only the refspecs you provided on the command-line to
> > the server. i.e. all local refs that aren't stated on the command-line
> > will still be deleted
> 
> > this unexpected behaviour is why --mirror isn't allowed to be used
> > when refspecs are specified on the command-line. but the above commit
> > moves the sanity check so it doesn't catch the implied --mirror when
> > remote..mirror is set (i.e. cloned with --mirror)
> 
> https://twitter.com/saleemrash1d/status/1163963105849876481

Thanks for the report.  This indeed looks like a regression, as
pointed out by @saleemrash1d.

Here's a patch to fix it:

--- >8 ---
Pushes with --all or with refspecs are disallowed when --mirror is given
to 'git push', or when 'remote..mirror' is set in the config of
the repository, because they can have surprising
effects.  800a4ab399 ("push: check for errors earlier", 2018-05-16)
refactored this code to do that check earlier, so we can explicitly
check for the presence of flags, instead of their side effects.

However, when 'remote..mirror' is set in the config, the
TRANSPORT_PUSH_MIRROR flag would only be set after calling
'do_push()', so the checks would miss it entirely.

This leads to surprises for users [*1*].

Fix this by making sure we set the flag (if appropriate) before
checking for compatibility of the various options.

*1*: https://twitter.com/FiloSottile/status/1163918701462249472
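
The ordering problem can be sketched in isolation as follows. This is a simplified standalone model, not the code in builtin/push.c: 'DEMO_PUSH_MIRROR', 'struct demo_remote' and the 'demo_*()' functions are hypothetical names introduced only for this illustration.

```c
#include <assert.h>

#define DEMO_PUSH_MIRROR (1 << 0) /* hypothetical flag bit */

/* Hypothetical stand-in for the configured remote. */
struct demo_remote {
	int mirror; /* non-zero when the mirror setting is enabled */
};

/* Return -1 if mirroring is combined with explicit refspecs, 0 otherwise. */
static int demo_check_options(int flags, int nr_refspecs)
{
	if ((flags & DEMO_PUSH_MIRROR) && nr_refspecs > 0)
		return -1;
	return 0;
}

/* Old order: validate first, derive the mirror flag from the config later. */
static int demo_check_then_set(const struct demo_remote *remote,
			       int flags, int nr_refspecs)
{
	int ret;

	assert(remote);
	ret = demo_check_options(flags, nr_refspecs);
	if (remote->mirror)
		flags |= DEMO_PUSH_MIRROR; /* too late, the check already ran */
	(void)flags;
	return ret;
}

/* Fixed order: derive the flag first, then validate. */
static int demo_set_then_check(const struct demo_remote *remote,
			       int flags, int nr_refspecs)
{
	assert(remote);
	if (remote->mirror)
		flags |= DEMO_PUSH_MIRROR;
	return demo_check_options(flags, nr_refspecs);
}
```

With a mirror remote and one refspec, the old order lets the push through, while the fixed order rejects it just as an explicit --mirror would.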

Reported-by: Filippo Valsorda 
Helped-by: Saleem Rashid
Signed-off-by: Thomas Gummerer 
---
 builtin/push.c | 69 ++
 t/t5517-push-mirror.sh | 10 ++
 2 files changed, 46 insertions(+), 33 deletions(-)

diff --git a/builtin/push.c b/builtin/push.c
index 021dd3b1e4..3742daf7b0 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -385,30 +385,14 @@ static int push_with_options(struct transport *transport, 
struct refspec *rs,
 }
 
 static int do_push(const char *repo, int flags,
-  const struct string_list *push_options)
+  const struct string_list *push_options,
+  struct remote *remote)
 {
int i, errs;
-   struct remote *remote = pushremote_get(repo);
const char **url;
int url_nr;
struct refspec *push_refspec = &rs;
 
-   if (!remote) {
-   if (repo)
-   die(_("bad repository '%s'"), repo);
-   die(_("No configured push destination.\n"
-   "Either specify the URL from the command-line or configure 
a remote repository using\n"
-   "\n"
-   "git remote add  \n"
-   "\n"
-   "and then push using the remote name\n"
-   "\n"
-   "git push \n"));
-   }
-
-   if (remote->mirror)
-   flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);
-
if (push_options->nr)
flags |= TRANSPORT_PUSH_OPTIONS;
 
@@ -548,6 +532,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
struct string_list push_options_cmdline = STRING_LIST_INIT_DUP;
struct string_list *push_options;
const struct string_list_item *item;
+   struct remote *remote;
 
struct option options[] = {
OPT__VERBOSITY(&verbosity),
@@ -602,20 +587,6 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
die(_("--delete is incompatible with --all, --mirror and 
--tags"));
if (deleterefs && argc < 2)
die(_("--delete doesn't make sense without any refs"));
-   if (flags & TRANSPORT_PUSH_ALL) {
-   if (tags)
-   die(_("--all and --tags are incompatible"));
-   if (argc >= 2)
-   die(_("--all can't be combined with refspecs"));
-   }
-   if (flags & TRANSPORT_PUSH_MIRROR) {
-   if (tags)
-   die(_(&q

Re: [PATCH v2 1/3] factor out refresh_and_write_cache function

2019-09-02 Thread Thomas Gummerer
On 08/30, Junio C Hamano wrote:
> Martin Ågren  writes:
> 
> > There's a difference in behavior that I'm not sure about: We used
> > to ignore the return value of `refresh_cache()`, i.e. we didn't care
> > whether it had any errors. I have no idea whether that's safe to do --
> > especially as we go on to write the index. So I don't know whether this
> > patch fixes a bug by introducing the early return. Or if it *introduces*
> > a bug by bailing too aggressively. Do you know more?
> 
> One common reason why refresh_cache() fails is because the index is
> unmerged (i.e. has one or more higher-stage entries).  After an
> attempt to refresh, this would not write out the index in such a
> case, which might even be more correct thing to do than the original
> in the original context of "git am" implementation.  The next thing
> that happens after the caller calls this function is to ask
> repo_index_has_changes(), and we'd say "the index is dirty" whether
> the index is written back or not from such a state.

Looking at the other callsites, we seem to do something similar
everywhere, and usually fail if the index has unmerged entries.  So
the refreshed index would only not be written out in the case where
there are unmerged entries, and we fail later, which I think is okay.

> > The above makes me think that once this new function is in good shape,
> > the commit introducing it could sell it as "this is hard to get right --
> > let's implement it correctly once and for all". ;-)
> 
> Yes, that is a more severe issue.

With this do you mean what you quoted above, or that the lockfile is
not rolled back?  I agree that the lockfile not being rolled back if
'refresh_cache()' fails is indeed the bigger issue, and I'll fix that
in v3.  I can also add something like the above to the commit message,
just wanted to make sure I'm not missing something subtle in what you
quoted above.
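
The lockfile issue can be shown with a small standalone model. It does not use git's lockfile API: 'struct mock_lock' and the functions below are hypothetical stand-ins, and releasing the lock on success stands in for committing it.

```c
#include <assert.h>

/* Hypothetical stand-in for a lock file; not git's 'struct lock_file'. */
struct mock_lock {
	int held;
};

static void mock_take(struct mock_lock *lk)
{
	assert(lk && !lk->held); /* taking a held lock is a bug in this model */
	lk->held = 1;
}

static void mock_release(struct mock_lock *lk)
{
	lk->held = 0;
}

/* Early return without a rollback: the lock stays held on failure. */
static int refresh_leaky(struct mock_lock *lk, int refresh_fails)
{
	mock_take(lk);
	if (refresh_fails)
		return 1; /* lock is still held here */
	mock_release(lk); /* stands in for committing the lock */
	return 0;
}

/* Roll back before the early return, so no exit path keeps the lock. */
static int refresh_fixed(struct mock_lock *lk, int refresh_fails)
{
	mock_take(lk);
	if (refresh_fails) {
		mock_release(lk); /* stands in for the rollback */
		return 1;
	}
	mock_release(lk); /* stands in for committing the lock */
	return 0;
}
```

In the leaky variant the caller gets back a return code but has no handle on the lock any more, so it can neither commit nor roll it back.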


Re: git range-diff throws Segmentation fault

2019-08-30 Thread Thomas Gummerer
On 08/30, van den Berg, Kasper wrote:
> Hello,
>
> `git range-diff  ` prints "segmentation fault" to
> the console and nothing else.  It happens in git version
> 2.23.0.windows.1 and only occurs for some branches in my repository.
> I have not exactly determined when it does happen and when it does
> not (I'm not familiar with git's codebase).  These are my results:

Thanks for your bug report.

This is probably related to my patch series that aimed to give
nicer output in 'git range-diff', which likely introduced a bug.

> Status | Version          | Config | Result
> ✘ | 2.23.0.windows.1 | 64-bit, local | Segmentation fault
> ✘ | 2.23.0.windows.1 | 64-bit, local, different ranges related to my current work | Segmentation fault
> ✔ | 2.23.0.windows.1 | 64-bit, local, different ranges completely different from my current work | Expected range-diff output
> ✘ | 2.23.0.windows.1 | Remote connection to same workdir | Segmentation fault
> ✔ | 2.23.0.windows.1 | 64-bit, local, fresh clone, different ranges completely different from my current work | Expected range-diff output
> ✘ | 2.23.0.windows.1 | 32-bit, local | Segmentation fault
> ✔ | 2.21.0.windows.1 | 64-bit, local | Expected range-diff output
> ✔ | 2.21.0.windows.1 | 64-bit, remote connection to same workdir | Expected range-diff output
>
> Both  and  comprise between 213 and 270 commits; git
> gc counts 140394 objects (Total 140394 (delta 110638), reused 137275
> (delta 107799)).  I have not preserved the offending branch names.
> However, they contain 2 or more slashes and perhaps an at-sign
> (e.g. 'feature/',
> 'tmp/old@{''}/feature/', 'develop', and
> 'tmp/project-base').

I don't think branch names would cause this, but rather some diff that
range-diff doesn't handle correctly.  Would you be able to gather a
backtrace from the segfault (not sure how to do this on Windows), or
share the repository where this segmentation fault occurs? 

> To avoid the problem I returned to using git version 2.21.0.windows.1
> 
> Is there something I should take into account when doing `git
> range-diff` on a workdir or on a range of commits?  What things
> should I look for?  How can I repair the broken ranges?  Why does
> version 2.21.0.windows.1 work while version 2.23.0.windows.1 does
> not?

I don't think there's anything you should do here, this sounds like a
bug in Git that should be fixed.  It would be great if you could help
with some more details per above though, otherwise it's going to be
hard to track this down.

> With kind regards,
> Kasper van den Berg


[PATCH v2 1/3] factor out refresh_and_write_cache function

2019-08-29 Thread Thomas Gummerer
Getting the lock for the index, refreshing it and then writing it is a
pattern that happens more than once throughout the codebase.  Factor
out the refresh_and_write_cache function from builtin/am.c to
read-cache.c, so it can be re-used in other places in a subsequent
commit.

Note that we return different error codes for failing to refresh the
cache, and failing to write the index.  The current caller only cares
about failing to write the index.  However for other callers we're
going to convert in subsequent patches we will need this distinction.

Signed-off-by: Thomas Gummerer 
---
 builtin/am.c | 16 ++--
 cache.h  | 13 +
 read-cache.c | 17 +
 3 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 1aea657a7f..ddedd2b9d4 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1071,19 +1071,6 @@ static const char *msgnum(const struct am_state *state)
return sb.buf;
 }
 
-/**
- * Refresh and write index.
- */
-static void refresh_and_write_cache(void)
-{
-   struct lock_file lock_file = LOCK_INIT;
-
-   hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
-   die(_("unable to write index file"));
-}
-
 /**
  * Dies with a user-friendly message on how to proceed after resolving the
  * problem. This message can be overridden with state->resolvemsg.
@@ -1703,7 +1690,8 @@ static void am_run(struct am_state *state, int resume)
 
unlink(am_path(state, "dirtyindex"));
 
-   refresh_and_write_cache();
+   if (refresh_and_write_cache(REFRESH_QUIET, 0) < 0)
+   die(_("unable to write index file"));
 
if (repo_index_has_changes(the_repository, NULL, &sb)) {
write_state_bool(state, "dirtyindex", 1);
diff --git a/cache.h b/cache.h
index b1da1ab08f..987d289e8f 100644
--- a/cache.h
+++ b/cache.h
@@ -414,6 +414,7 @@ extern struct index_state the_index;
 #define add_file_to_cache(path, flags) add_file_to_index(&the_index, (path), (flags))
 #define chmod_cache_entry(ce, flip) chmod_index_entry(&the_index, (ce), (flip))
 #define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL, NULL)
+#define refresh_and_write_cache(refresh_flags, write_flags) repo_refresh_and_write_index(the_repository, (refresh_flags), (write_flags), NULL, NULL, NULL)
 #define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), (options))
 #define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), (options))
 #define cache_dir_exists(name, namelen) index_dir_exists(&the_index, (name), (namelen))
@@ -812,6 +813,18 @@ void fill_stat_cache_info(struct index_state *istate, struct cache_entry *ce, st
 #define REFRESH_IN_PORCELAIN   0x0020  /* user friendly output, not "needs update" */
 #define REFRESH_PROGRESS   0x0040  /* show progress bar if stderr is tty */
 int refresh_index(struct index_state *, unsigned int flags, const struct pathspec *pathspec, char *seen, const char *header_msg);
+/*
+ * Refresh the index and write it to disk.
+ *
+ * 'refresh_flags' is passed directly to 'refresh_index()', while
+ * 'COMMIT_LOCK | write_flags' is passed to 'write_locked_index()', so
+ * the lockfile is always either committed or rolled back.
+ *
+ * Return 1 if refreshing the cache failed, -1 if writing the cache to
+ * disk failed, 0 on success.
+ */
+int repo_refresh_and_write_index(struct repository*, unsigned int refresh_flags, unsigned int write_flags, const struct pathspec *, char *seen, const char *header_msg);
+
 struct cache_entry *refresh_cache_entry(struct index_state *, struct cache_entry *, unsigned int);
 
 void set_alternate_index_output(const char *);
diff --git a/read-cache.c b/read-cache.c
index 52ffa8a313..72662df077 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1472,6 +1472,23 @@ static void show_file(const char * fmt, const char * name, int in_porcelain,
printf(fmt, name);
 }
 
+int repo_refresh_and_write_index(struct  repository *repo,
+unsigned int refresh_flags,
+unsigned int write_flags,
+const struct pathspec *pathspec,
+char *seen, const char *header_msg)
+{
+   struct lock_file lock_file = LOCK_INIT;
+
+   repo_hold_locked_index(repo, &lock_file, LOCK_DIE_ON_ERROR);
+   if (refresh_index(repo->index, refresh_flags, pathspec, seen, header_msg))
+   return 1;
+   if (write_locked_index(repo->index, &lock_file, COMMIT_LOCK | write_flags))
+   return -1;
+   return 0;
+}
+
+
 int refresh_index(struct index_state *istate, unsigned int flags,
  const struct pathspec *pathspec,
  char *seen, const char *header_msg)
-- 
2.23.0.rc2.194.ge5444969c9



[PATCH v2 3/3] stash: make sure to write refreshed cache

2019-08-29 Thread Thomas Gummerer
When converting stash into C, calls to 'git update-index --refresh'
were replaced with the 'refresh_cache()' function.  That is fine as
long as the index is only needed in-core, and not re-read from disk.

However in many cases we do actually need the refreshed index to be
written to disk, for example 'merge_recursive_generic()' discards the
in-core index before re-reading it from disk, and in the case of 'apply
--quiet', the 'refresh_cache()' we currently have is pointless without
writing the index to disk.
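
As a rough illustration of the problem described above, here is a toy
model (not git code; 'struct toy_index' and all helper names are made
up for this sketch).  It only shows that a refresh applied to the
in-core copy is lost once the index is discarded and re-read from disk,
unless it was written out first:

```c
#include <assert.h>

/* Toy model only: one flag stands in for "stat data is up to date". */
struct toy_index {
    int stat_fresh;
};

/* Models a refresh that touches the in-core copy only. */
static void toy_refresh(struct toy_index *in_core)
{
    assert(in_core);
    in_core->stat_fresh = 1;
}

/* Models writing the in-core index out to disk. */
static void toy_write(const struct toy_index *in_core, struct toy_index *on_disk)
{
    assert(in_core && on_disk);
    *on_disk = *in_core;
}

/* Models discarding the in-core index and re-reading it from disk. */
static void toy_reread(struct toy_index *in_core, const struct toy_index *on_disk)
{
    assert(in_core && on_disk);
    *in_core = *on_disk;
}

/* Returns 1 if the refreshed state is still there after a re-read. */
static int refresh_survives_reread(int write_after_refresh)
{
    struct toy_index on_disk = { 0 };
    struct toy_index in_core = { 0 };

    toy_refresh(&in_core);
    if (write_after_refresh)
        toy_write(&in_core, &on_disk);
    toy_reread(&in_core, &on_disk);
    return in_core.stat_fresh;
}
```

In this model 'refresh_survives_reread(0)' corresponds to the current
behaviour and 'refresh_survives_reread(1)' to what this patch does.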

Always write the index after refreshing it to ensure there are no
regressions in this compared to the scripted stash.  In the future we
can consider avoiding the write where possible after making sure none
of the subsequent calls actually need the refreshed cache, and it is
not expected to be refreshed after stash exits or it is written
somewhere else already.

Reported-by: Jeff King 
Signed-off-by: Thomas Gummerer 
---
 builtin/stash.c  | 11 +++
 t/t3903-stash.sh | 16 
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index b5a301f24d..da1260ca8e 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -396,7 +396,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
const struct object_id *bases[1];
 
read_cache_preload(NULL);
-   if (refresh_cache(REFRESH_QUIET))
+   if (refresh_and_write_cache(REFRESH_QUIET, 0))
return -1;
 
if (write_cache_as_tree(&c_tree, 0, NULL))
@@ -485,7 +485,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
}
 
if (quiet) {
-   if (refresh_cache(REFRESH_QUIET))
+   if (refresh_and_write_cache(REFRESH_QUIET, 0))
warning("could not refresh index");
} else {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1129,7 +1129,10 @@ static int do_create_stash(const struct pathspec *ps, struct strbuf *stash_msg_b
prepare_fallback_ident("git stash", "git@stash");
 
read_cache_preload(NULL);
-   refresh_cache(REFRESH_QUIET);
+   if (refresh_and_write_cache(REFRESH_QUIET, 0) < 0) {
+   ret = -1;
+   goto done;
+   }
 
if (get_oid("HEAD", &info->b_commit)) {
if (!quiet)
@@ -1290,7 +1293,7 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q
free(ps_matched);
}
 
-   if (refresh_cache(REFRESH_QUIET)) {
+   if (refresh_and_write_cache(REFRESH_QUIET, 0)) {
ret = -1;
goto done;
}
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index b8e337893f..392954d6dd 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1241,4 +1241,20 @@ test_expect_success 'stash --keep-index with file deleted in index does not resu
test_path_is_missing to-remove
 '
 
+test_expect_success 'stash apply should succeed with unmodified file' '
+   echo base >file &&
+   git add file &&
+   git commit -m base &&
+
+   # now stash a modification
+   echo modified >file &&
+   git stash &&
+
+   # make the file stat dirty
+   cp file other &&
+   mv other file &&
+
+   git stash apply
+'
+
 test_done
-- 
2.23.0.rc2.194.ge5444969c9



[PATCH v2 0/3] make sure stash refreshes the index properly

2019-08-29 Thread Thomas Gummerer
  if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK))
++  if (refresh_and_write_cache(REFRESH_QUIET, 0))
warning("could not refresh index");
} else {
struct child_process cp = CHILD_PROCESS_INIT;
@@ builtin/stash.c: static int do_create_stash(const struct pathspec *ps, struct st
  
read_cache_preload(NULL);
 -  refresh_cache(REFRESH_QUIET);
-+  if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK) < 0) {
++  if (refresh_and_write_cache(REFRESH_QUIET, 0) < 0) {
 +  ret = -1;
 +  goto done;
 +  }
@@ builtin/stash.c: static int do_push_stash(const struct pathspec *ps, const char
    }
  
 -  if (refresh_cache(REFRESH_QUIET)) {
-+  if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK)) {
++  if (refresh_and_write_cache(REFRESH_QUIET, 0)) {
ret = -1;
goto done;
}

Thomas Gummerer (3):
  factor out refresh_and_write_cache function
  merge: use refresh_and_write_cache
  stash: make sure to write refreshed cache

 builtin/am.c | 16 ++--
 builtin/merge.c  | 17 +
 builtin/stash.c  | 11 +++
 cache.h  | 13 +
 read-cache.c | 17 +
 t/t3903-stash.sh | 16 
 6 files changed, 60 insertions(+), 30 deletions(-)

-- 
2.23.0.rc2.194.ge5444969c9


[PATCH v2 2/3] merge: use refresh_and_write_cache

2019-08-29 Thread Thomas Gummerer
Use the 'refresh_and_write_cache()' convenience function introduced in
the last commit, instead of refreshing and writing the index manually
in merge.c.

Signed-off-by: Thomas Gummerer 
---
 builtin/merge.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index e2ccbc44e2..0148d938c9 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -688,16 +688,13 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
  struct commit_list *remoteheads,
  struct commit *head)
 {
-   struct lock_file lock = LOCK_INIT;
const char *head_arg = "HEAD";
 
-   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock,
-  COMMIT_LOCK | SKIP_IF_UNCHANGED))
+   if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED) < 0)
return error(_("Unable to write index."));
 
if (!strcmp(strategy, "recursive") || !strcmp(strategy, "subtree")) {
+   struct lock_file lock = LOCK_INIT;
int clean, x;
struct commit *result;
struct commit_list *reversed = NULL;
@@ -860,12 +857,8 @@ static int merge_trivial(struct commit *head, struct commit_list *remoteheads)
 {
struct object_id result_tree, result_commit;
struct commit_list *parents, **pptr = &parents;
-   struct lock_file lock = LOCK_INIT;
 
-   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock,
-  COMMIT_LOCK | SKIP_IF_UNCHANGED))
+   if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED) < 0)
return error(_("Unable to write index."));
 
write_tree_trivial(&result_tree);
-- 
2.23.0.rc2.194.ge5444969c9



Re: [PATCH 2/3] merge: use refresh_and_write_cache

2019-08-29 Thread Thomas Gummerer
On 08/28, Martin Ågren wrote:
> On Tue, 27 Aug 2019 at 12:15, Thomas Gummerer  wrote:
> 
> > struct lock_file lock = LOCK_INIT;
> > const char *head_arg = "HEAD";
> >
> > -   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
> > -   refresh_cache(REFRESH_QUIET);
> > -   if (write_locked_index(&the_index, &lock,
> > -  COMMIT_LOCK | SKIP_IF_UNCHANGED))
> > -   return error(_("Unable to write index."));
> > +   if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK | SKIP_IF_UNCHANGED) < 0)
> > +   return -1;
> 
> I wondered why you didn't drop the `struct lock_file`, but it turns out
> we still need it further down.
> 
> > if (!strcmp(strategy, "recursive") || !strcmp(strategy, "subtree")) {
> > int clean, x;
> 
> What you could do, I guess, is to move its declaration to around here.
> Probably not worth a re-roll.

I'll re-roll anyway for the things you spotted in the first patch, so
I'll drop it down here while I'm at it, thanks!

> > @@ -860,13 +857,9 @@ static int merge_trivial(struct commit *head, struct commit_list *remoteheads)
> >  {
> > struct object_id result_tree, result_commit;
> > struct commit_list *parents, **pptr = &parents;
> > -   struct lock_file lock = LOCK_INIT;
> >
> > -   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
> > -   refresh_cache(REFRESH_QUIET);
> > -   if (write_locked_index(&the_index, &lock,
> > -  COMMIT_LOCK | SKIP_IF_UNCHANGED))
> > -   return error(_("Unable to write index."));
> > +   if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK | SKIP_IF_UNCHANGED) < 0)
> > +   return -1;
> 
> Here you do drop the `struct lock_file` entirely, ok.
> 
> 
> 
> Martin


Re: [PATCH 1/3] factor out refresh_and_write_cache function

2019-08-29 Thread Thomas Gummerer
On 08/28, Martin Ågren wrote:
> On Tue, 27 Aug 2019 at 12:14, Thomas Gummerer  wrote:
> >
> > Getting the lock for the index, refreshing it and then writing it is a
> > pattern that happens more than once throughout the codebase.  Factor
> > out the refresh_and_write_cache function from builtin/am.c to
> > read-cache.c, so it can be re-used in other places in a subsequent
> > commit.
> 
> > +/*
> > + * Refresh the index and write it to disk.
> > + *
> > + * Return 1 if refreshing the cache failed, -1 if writing the cache to
> > + * disk failed, 0 on success.
> > + */
> 
> Thank you for documenting. :-) Should we say something about how this
> doesn't explicitly print any error in case refreshing fails (that is, we
> leave it to `refresh_index()`), but that we *do* explicitly print an
> error if writing the index fails? That caught me off-guard as I looked
> at how you convert the callers.
> 
> And do we actually want that asymmetry? Maybe we do.

I think I needed the error for something while I went through a few
iterations of how to best structure this function, but I don't
remember for what exactly now.  I think it might actually be better to
just return -1 here, and let the caller distinguish and show the error
message if they need to.  That also avoids duplicating the error in
case the caller wants to die on error.
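
To make the calling convention concrete, here is a small standalone
sketch.  It is not git code: 'toy_refresh_and_write()' is a hypothetical
stand-in that only mimics the return values discussed here (1 if the
refresh failed, -1 if the write failed, 0 on success) and prints
nothing itself, so each caller decides what to report:

```c
#include <assert.h>

/* Hypothetical outcomes used to drive the stand-in below. */
enum toy_outcome { TOY_OK, TOY_REFRESH_FAILS, TOY_WRITE_FAILS };

/* Stand-in for the helper: 1 = refresh failed, -1 = write failed, 0 = ok. */
static int toy_refresh_and_write(enum toy_outcome outcome)
{
    assert(outcome == TOY_OK || outcome == TOY_REFRESH_FAILS ||
           outcome == TOY_WRITE_FAILS);
    if (outcome == TOY_REFRESH_FAILS)
        return 1;
    if (outcome == TOY_WRITE_FAILS)
        return -1;
    return 0;
}

/* A caller that only treats a failed write as an error ("< 0"). */
static int fails_on_write_only(enum toy_outcome outcome)
{
    return toy_refresh_and_write(outcome) < 0;
}

/* A caller that treats any non-zero return as an error. */
static int fails_on_any_error(enum toy_outcome outcome)
{
    return toy_refresh_and_write(outcome) != 0;
}
```

The first style matches a check written as "< 0", the second a plain
"if (refresh_and_write_cache(...))".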

> Might be worth pointing out as you convert the callers how some (all?)
> of them now emit different error messages from before, but that it
> shouldn't matter(?) and it makes sense to unify those messages.

Yeah, I don't think changing the error message should matter, but
unifying them is not actually a goal of this series.  So with what you
pointed out above, I think I'll leave them as they are.

> > +int repo_refresh_and_write_index(struct repository*, unsigned int refresh_flags, unsigned int write_flags, const struct pathspec *, char *seen, const char *header_msg);
> 
> > +int repo_refresh_and_write_index(struct  repository *repo,
> > +unsigned int refresh_flags,
> > +unsigned int write_flags,
> > +const struct pathspec *pathspec,
> > +char *seen, const char *header_msg)
> > +{
> > +   struct lock_file lock_file = LOCK_INIT;
> > +
> > +   repo_hold_locked_index(repo, &lock_file, LOCK_DIE_ON_ERROR);
> > +   if (refresh_index(repo->index, refresh_flags, pathspec, seen, header_msg))
> > +   return 1;
> > +   if (write_locked_index(repo->index, &lock_file, write_flags))
> > +   return error(_("unable to write index file"));
> > +   return 0;
> > +}
> 
> If `flags` doesn't contain `COMMIT_LOCK`, the lockfile will be closed
> "gently", meaning we still need to either commit it, or roll it back. Or
> let the exit handler roll it back, which is what would happen here, no?
> We lose our handle on the stack and there's no way for anyone to say
> "ok, now I'm done, commit it please" (or "roll it back").
> 
> In short, I think calling this function without providing `COMMIT_LOCK`
> would be useless at best. We should probably let this function provide
> `COMMIT_LOCK | write_flags` or `COMMIT_LOCK | extra_write_flags` or
> whatever. Most callers would just provide "0". Hm?
> 
> Or, we could BUG if the COMMIT_LOCK bit isn't set, but that seems like a
> less good choice to me. If we're so adamant about the bit being set --
> which we should be, IMHO -- we might as well set it ourselves.

Yeah, you're right, making this function use `COMMIT_LOCK | write_flags`
would probably be the best option.  I'll change that, and document it
as well.
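
Roughly, the idea is the following (a standalone sketch; the flag
names and values are placeholders chosen for this example, not copied
from git's headers):

```c
#include <assert.h>

/* Placeholder flag values, assumed for this sketch only. */
#define TOY_COMMIT_LOCK       (1u << 0)
#define TOY_SKIP_IF_UNCHANGED (1u << 1)

/*
 * Compute the flags the helper would hand to the write step: the
 * commit bit is always set, whatever extra flags the caller passes.
 */
static unsigned int effective_write_flags(unsigned int write_flags)
{
    unsigned int flags = TOY_COMMIT_LOCK | write_flags;

    assert(flags & TOY_COMMIT_LOCK);
    return flags;
}
```

With that, most callers can pass 0, and a caller that wants to skip
the write when nothing changed passes only that one extra flag.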

Thanks for your review!

> 
> 
> Martin


[PATCH 3/3] stash: make sure to write refreshed cache

2019-08-27 Thread Thomas Gummerer
When converting stash into C, calls to 'git update-index --refresh'
were replaced with the 'refresh_cache()' function.  That is fine as
long as the index is only needed in-core, and not re-read from disk.

However in many cases we do actually need the refreshed index to be
written to disk, for example 'merge_recursive_generic()' discards the
in-core index before re-reading it from disk, and in the case of 'apply
--quiet', the 'refresh_cache()' we currently have is pointless without
writing the index to disk.

Always write the index after refreshing it to ensure there are no
regressions in this compared to the scripted stash.  In the future we
can consider avoiding the write where possible after making sure none
of the subsequent calls actually need the refreshed cache, and it is
not expected to be refreshed after stash exits or it is written
somewhere else already.

Reported-by: Jeff King 
Signed-off-by: Thomas Gummerer 
---
 builtin/stash.c  | 11 +++
 t/t3903-stash.sh | 16 
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index b5a301f24d..b36aada644 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -396,7 +396,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
const struct object_id *bases[1];
 
read_cache_preload(NULL);
-   if (refresh_cache(REFRESH_QUIET))
+   if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK))
return -1;
 
if (write_cache_as_tree(&c_tree, 0, NULL))
@@ -485,7 +485,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
}
 
if (quiet) {
-   if (refresh_cache(REFRESH_QUIET))
+   if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK))
warning("could not refresh index");
} else {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1129,7 +1129,10 @@ static int do_create_stash(const struct pathspec *ps, struct strbuf *stash_msg_b
prepare_fallback_ident("git stash", "git@stash");
 
read_cache_preload(NULL);
-   refresh_cache(REFRESH_QUIET);
+   if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK) < 0) {
+   ret = -1;
+   goto done;
+   }
 
if (get_oid("HEAD", &info->b_commit)) {
if (!quiet)
@@ -1290,7 +1293,7 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q
free(ps_matched);
}
 
-   if (refresh_cache(REFRESH_QUIET)) {
+   if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK)) {
ret = -1;
goto done;
}
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index b8e337893f..392954d6dd 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1241,4 +1241,20 @@ test_expect_success 'stash --keep-index with file deleted in index does not resu
test_path_is_missing to-remove
 '
 
+test_expect_success 'stash apply should succeed with unmodified file' '
+   echo base >file &&
+   git add file &&
+   git commit -m base &&
+
+   # now stash a modification
+   echo modified >file &&
+   git stash &&
+
+   # make the file stat dirty
+   cp file other &&
+   mv other file &&
+
+   git stash apply
+'
+
 test_done
-- 
2.23.0.rc2.194.ge5444969c9



[PATCH 0/3] make sure stash refreshes the index properly

2019-08-27 Thread Thomas Gummerer
Thanks Peff for spotting the bug!  Here's a series that fixes it.

> And before the third one, introduction of a new entry point that
> makes merge-recursive machinery inherit the already populated
> in-core index, happens, I think the right solution is to write the
> in-core index out---the write is not pointless.

Yup, I agree with that.  In fact there are some other places where we
just call 'refresh_cache()' as a replacement for 'git update-index
--refresh'.  At least the other one in 'do_apply_stash()' also seems
like a bug, as I assume the original intention (and behaviour) was
that the index is refreshed after 'stash apply -q' finishes.

I think in do_push_stash and do_create_stash we might be able to get
away without the write, but I wasn't 100% sure, so I made them write
the index after refreshing it as well, which is what the shell script
did.

The first patch is a small refactoring that makes the actual fix a bit
easier, while the second patch is a cleanup that I found while there.

Thomas Gummerer (3):
  factor out refresh_and_write_cache function
  merge: use refresh_and_write_cache
  stash: make sure to write refreshed cache

 builtin/am.c | 16 ++--
 builtin/merge.c  | 15 ---
 builtin/stash.c  | 11 +++
 cache.h  |  9 +
 read-cache.c | 17 +
 t/t3903-stash.sh | 16 
 6 files changed, 55 insertions(+), 29 deletions(-)

-- 
2.23.0.rc2.194.ge5444969c9



[PATCH 1/3] factor out refresh_and_write_cache function

2019-08-27 Thread Thomas Gummerer
Getting the lock for the index, refreshing it and then writing it is a
pattern that happens more than once throughout the codebase.  Factor
out the refresh_and_write_cache function from builtin/am.c to
read-cache.c, so it can be re-used in other places in a subsequent
commit.

Note that we return different error codes for failing to refresh the
cache, and failing to write the index.  The current caller only cares
about failing to write the index.  However for other callers we're
going to convert in subsequent patches we will need this distinction.

Signed-off-by: Thomas Gummerer 
---
 builtin/am.c | 16 ++--
 cache.h  |  9 +
 read-cache.c | 17 +
 3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 1aea657a7f..e00410e4d7 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1071,19 +1071,6 @@ static const char *msgnum(const struct am_state *state)
return sb.buf;
 }
 
-/**
- * Refresh and write index.
- */
-static void refresh_and_write_cache(void)
-{
-   struct lock_file lock_file = LOCK_INIT;
-
-   hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
-   die(_("unable to write index file"));
-}
-
 /**
  * Dies with a user-friendly message on how to proceed after resolving the
  * problem. This message can be overridden with state->resolvemsg.
@@ -1703,7 +1690,8 @@ static void am_run(struct am_state *state, int resume)
 
unlink(am_path(state, "dirtyindex"));
 
-   refresh_and_write_cache();
+   if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK) < 0)
+   die(_("failed to refresh cache"));
 
if (repo_index_has_changes(the_repository, NULL, &sb)) {
write_state_bool(state, "dirtyindex", 1);
diff --git a/cache.h b/cache.h
index b1da1ab08f..f72392f32b 100644
--- a/cache.h
+++ b/cache.h
@@ -414,6 +414,7 @@ extern struct index_state the_index;
 #define add_file_to_cache(path, flags) add_file_to_index(&the_index, (path), (flags))
 #define chmod_cache_entry(ce, flip) chmod_index_entry(&the_index, (ce), (flip))
 #define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL, NULL)
+#define refresh_and_write_cache(refresh_flags, write_flags) repo_refresh_and_write_index(the_repository, (refresh_flags), (write_flags), NULL, NULL, NULL)
 #define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), (options))
 #define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), (options))
 #define cache_dir_exists(name, namelen) index_dir_exists(&the_index, (name), (namelen))
@@ -812,6 +813,14 @@ void fill_stat_cache_info(struct index_state *istate, struct cache_entry *ce, st
 #define REFRESH_IN_PORCELAIN   0x0020  /* user friendly output, not "needs update" */
 #define REFRESH_PROGRESS   0x0040  /* show progress bar if stderr is tty */
 int refresh_index(struct index_state *, unsigned int flags, const struct pathspec *pathspec, char *seen, const char *header_msg);
+/*
+ * Refresh the index and write it to disk.
+ *
+ * Return 1 if refreshing the cache failed, -1 if writing the cache to
+ * disk failed, 0 on success.
+ */
+int repo_refresh_and_write_index(struct repository*, unsigned int refresh_flags, unsigned int write_flags, const struct pathspec *, char *seen, const char *header_msg);
+
 struct cache_entry *refresh_cache_entry(struct index_state *, struct cache_entry *, unsigned int);
 
 void set_alternate_index_output(const char *);
diff --git a/read-cache.c b/read-cache.c
index 52ffa8a313..905d2ddd10 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1472,6 +1472,23 @@ static void show_file(const char * fmt, const char * name, int in_porcelain,
printf(fmt, name);
 }
 
+int repo_refresh_and_write_index(struct  repository *repo,
+unsigned int refresh_flags,
+unsigned int write_flags,
+const struct pathspec *pathspec,
+char *seen, const char *header_msg)
+{
+   struct lock_file lock_file = LOCK_INIT;
+
+   repo_hold_locked_index(repo, &lock_file, LOCK_DIE_ON_ERROR);
+   if (refresh_index(repo->index, refresh_flags, pathspec, seen, header_msg))
+   return 1;
+   if (write_locked_index(repo->index, &lock_file, write_flags))
+   return error(_("unable to write index file"));
+   return 0;
+}
+
+
 int refresh_index(struct index_state *istate, unsigned int flags,
  const struct pathspec *pathspec,
  char *seen, const char *header_msg)
-- 
2.23.0.rc2.194.ge5444969c9



[PATCH 2/3] merge: use refresh_and_write_cache

2019-08-27 Thread Thomas Gummerer
Use the 'refresh_and_write_cache()' convenience function introduced in
the last commit, instead of refreshing and writing the index manually
in merge.c.

Signed-off-by: Thomas Gummerer 
---
 builtin/merge.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index e2ccbc44e2..b5e31ce283 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -691,11 +691,8 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
struct lock_file lock = LOCK_INIT;
const char *head_arg = "HEAD";
 
-   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock,
-  COMMIT_LOCK | SKIP_IF_UNCHANGED))
-   return error(_("Unable to write index."));
+   if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK | SKIP_IF_UNCHANGED) < 0)
+   return -1;
 
if (!strcmp(strategy, "recursive") || !strcmp(strategy, "subtree")) {
int clean, x;
@@ -860,13 +857,9 @@ static int merge_trivial(struct commit *head, struct commit_list *remoteheads)
 {
struct object_id result_tree, result_commit;
struct commit_list *parents, **pptr = &parents;
-   struct lock_file lock = LOCK_INIT;
 
-   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
-   refresh_cache(REFRESH_QUIET);
-   if (write_locked_index(&the_index, &lock,
-  COMMIT_LOCK | SKIP_IF_UNCHANGED))
-   return error(_("Unable to write index."));
+   if (refresh_and_write_cache(REFRESH_QUIET, COMMIT_LOCK | SKIP_IF_UNCHANGED) < 0)
+   return -1;
 
write_tree_trivial(&result_tree);
printf(_("Wonderful.\n"));
-- 
2.23.0.rc2.194.ge5444969c9



[PATCH v2] t0021: make sure clean filter runs

2019-08-22 Thread Thomas Gummerer
In t0021.15 one of the things we are checking is that the clean filter
is run when checking out empty-branch.  The clean filter needs to be
run to make sure there are no modifications on the file system for the
test.r file, and thus it isn't dangerous to overwrite it.

However in the current test setup it is not always necessary to run
the clean filter, and thus the test sometimes fails, as debug.log
isn't written.

This happens when test.r has an older mtime than the index itself.
That mtime is also recorded as stat data for test.r in the index, and
based on the heuristic we're using for index entries, git correctly
assumes this file is up-to-date.

Usually this test succeeds because the mtime of test.r is the same as
the mtime of the index.  In this case test.r is racily clean, so git
actually checks the contents, for which the clean filter is run.

Fix the test by updating the mtime of test.r, so git is forced to
check the contents of the file, and the clean filter is run as the
test expects.

Signed-off-by: Thomas Gummerer 
---

v2 adds the comment as suggested by Szeder.

Junio: I saw this is marked as "merged to 'next'" in the What's
cooking, so if it got merged already I'm fine with just keeping v1,
but otherwise I think adding the comment would be nice.

 t/t0021-conversion.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index e10f5f787f..c954c709ad 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -390,6 +390,9 @@ test_expect_success PERL 'required process filter should filter data' '
EOF
test_cmp_exclude_clean expected.log debug.log &&
 
+   # Make sure that the file appears dirty, so checkout below has to
+   # run the configured filter.
+   touch test.r &&
filter_git checkout --quiet --no-progress empty-branch &&
cat >expected.log <<-EOF &&
START
-- 
2.23.0.rc2.194.ge5444969c9



Re: [PATCH] t0021: make sure clean filter runs

2019-08-22 Thread Thomas Gummerer
On 08/22, SZEDER Gábor wrote:
> On Wed, Aug 21, 2019 at 08:23:23PM +0200, Johannes Sixt wrote:
> > Am 21.08.19 um 16:56 schrieb Thomas Gummerer:
> > > On 08/20, Johannes Sixt wrote:
> > >> Am 20.08.19 um 08:56 schrieb Thomas Gummerer:
> > >>> Fix the test by updating the mtime of test.r, ...
> > >>
> > >>> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> > >>> index e10f5f787f..66f75005d5 100755
> > >>> --- a/t/t0021-conversion.sh
> > >>> +++ b/t/t0021-conversion.sh
> > >>> @@ -390,6 +390,7 @@ test_expect_success PERL 'required process filter should filter data' '
> > >>> EOF
> > >>> test_cmp_exclude_clean expected.log debug.log &&
> > >>>  
> > >>> +   touch test.r &&
> > >>
> > >>  test-tool chmtime +10 test.r
> > >>
> > >> would be more reliable.
> > > 
> > > Hmm, is touch unreliable on some platforms?  I didn't think of
> > > 'test-tool chmtime', but I'm also not sure it's better than touch in
> > > this case.
> > > 
> > > To me the 'touch' signifies that the timestamp must be updated after
> > > the previous checkout, so git thinks it could possibly have been
> > > changed, which I think is clearer in this case than setting the mtime
> > > to a future time.
> > 
> > touch does not guarantee that the current time is different from the
> > timestamp that the file already carries, particularly not when the
> > filesystem stores just a resolution of 1 second, and commands are
> > executed quickly.
> 
> This 'touch' must ensure that the timestamp of the file is not older
> than the timestamp of the index, and to achieve that it doesn't
> necessarily have to modify the timestamp.
> 
> The file is modified first, then the index is updated, and finally
> comes this 'touch'.  Consequently, if 'touch' doesn't modify the
> timestamp of the file, then it must have the same timestamp as the
> index, IOW it's racily clean, and the subsequent 'git checkout' has to
> look at the file content and has to run the filter, and that's what we
> want to see here.

Right.

> However, I'm not sure what would happen if the system clock were to
> jump back in between, but since it's only a test I don't think it's
> worth caring about.

Yeah, I think much of git wouldn't work correctly if that happens, so
I think it's fairly safe to ignore in the test suite.

> > But when we use test-tool chmtime +10, then the timestamp is definitely
> > different.
> 
> 'test-tool chmtime +10' adjusts the timestamp of the file relative to
> its current timestamp.  So yeah, the file's timestamp definitely
> changes, but that's not enough, because it doesn't ensure that the new
> timestamp is not older than the timestamp of the index.  Just imagine
> the arguably pathological situation that right after the file was last
> modified the system miraculously comes to a complete stall, and only
> manages to resume after 15 seconds to continue with updating the
> index.  This means that the timestamp of the file will be 15s older
> than the index, and after that 'chmtime +10' it will still be 5s
> older.  Consequently, 'git checkout' will think that the file is
> clean, it won't run the filter that we expect, and the test will fail.
> So instead of '+10' it should be '=+10' to set the new timestamp
> relative to the current time, but I'm not too keen about the timestamp
> in the future either (though the file is about to be deleted anyway).

Right, the above is why I think 'touch' is a good idea here.  Short of
system clocks jumping around, which will most likely break more than
this test anyway, it guarantees that the timestamp is equal to or
greater than the timestamp of the index, which is what we need here.
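
To spell out the arithmetic from the stall scenario above, here is a
small standalone sketch.  It is a simplified model, not git code: it
assumes, as described in this thread, that the contents are checked
(and the filter run) whenever the file's mtime is not older than the
index's, and the helper names are made up for illustration:

```c
#include <assert.h>

/* Simplified rule from this thread: check contents unless the file is older. */
static int contents_checked(long file_mtime, long index_mtime)
{
    return file_mtime >= index_mtime;
}

/* Models 'test-tool chmtime +N': relative to the file's current mtime. */
static long chmtime_relative(long file_mtime, long delta)
{
    return file_mtime + delta;
}

/* Models 'touch': the mtime becomes the current time. */
static long touch_now(long now)
{
    assert(now >= 0);
    return now;
}

/* File written at 'file_mtime', index written 'stall' seconds later. */
static int relative_bump_is_enough(long file_mtime, long stall, long delta)
{
    long index_mtime = file_mtime + stall;

    return contents_checked(chmtime_relative(file_mtime, delta), index_mtime);
}

/* 'touch' runs at or after the index write, so 'now' >= index mtime. */
static int touch_is_enough(long index_mtime, long now)
{
    assert(now >= index_mtime);
    return contents_checked(touch_now(now), index_mtime);
}
```

With a 15 second stall and '+10', the file ends up 5 seconds older
than the index, so 'relative_bump_is_enough(100, 15, 10)' is false in
this model, while 'touch_is_enough()' holds for any clock that does
not jump backwards.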

> I think it would be best to explicitly set the timestamp of the file
> and the index sort-of relative to each other and add an in-code
> comment as well, e.g.:
> 
>   # Make sure that the file appears dirty, so checkout below has to
>   # run the configured filter.
>   test-tool chmtime =-10 .git/index &&
>   test-tool chmtime =+0 test.r &&

I think the comment is a good idea.  I personally still prefer just
using 'touch' though, as I find it slightly easier to read (I had to
go look up what the =-/=+ in 'test-tool chmtime' does, while I knew
what touch would be doing :)

That said, that's a minor preference for me; if people have a strong
opinion that test-tool chmtime is really better here I'm fine with
changing it.


Re: [PATCH] t0021: make sure clean filter runs

2019-08-21 Thread Thomas Gummerer
On 08/20, Johannes Sixt wrote:
> Am 20.08.19 um 08:56 schrieb Thomas Gummerer:
> > Fix the test by updating the mtime of test.r, ...
> 
> > diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> > index e10f5f787f..66f75005d5 100755
> > --- a/t/t0021-conversion.sh
> > +++ b/t/t0021-conversion.sh
> > @@ -390,6 +390,7 @@ test_expect_success PERL 'required process filter should filter data' '
> > EOF
> > test_cmp_exclude_clean expected.log debug.log &&
> >  
> > +   touch test.r &&
> 
>   test-tool chmtime +10 test.r
> 
> would be more reliable.

Hmm, is touch unreliable on some platforms?  I didn't think of
'test-tool chmtime', but I'm also not sure it's better than touch in
this case.

To me the 'touch' signifies that the timestamp must be updated after
the previous checkout, so git thinks it could possibly have been
changed, which I think is clearer in this case than setting the mtime
to a future time.

But I'm happy to change it if there's something I'm missing why
'test-tool chmtime' is better in this case.

> > filter_git checkout --quiet --no-progress empty-branch &&
> > cat >expected.log <<-EOF &&
> > START
> > 
> 
> -- Hannes


Re: [PATCH] t0021: make sure clean filter runs

2019-08-21 Thread Thomas Gummerer
On 08/20, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > Fix the test by updating the mtime of test.r, so git is forced to
> > check the contents of the file, and the clean filter is run as the
> > test expects.
> 
> Hmph, depending on the timestamp granularity, with this patch,
> test.r would have mtime that is the same or a bit later than that of
> the index file.  Is it sufficient to really "force" Git to check the
> contents, or does it just make the likelihood that it would choose
> to check a bit bigger (in other words, are we solving the race, or
> merely making the race window smaller)?

This test only worked until now because git checks the contents if the
mtime of the file and the index are the same.  This is because of
racy-git.  I tried to describe this in the commit message, but looks
like it wasn't clear enough.  Do you have any suggestions on how to
make it clearer?

It will also check the contents if the mtime is greater than the
timestamp of the index, so the 'touch' here would also cover that.

So the changes here do solve the race completely.

> Thanks.
> 
> >
> > Signed-off-by: Thomas Gummerer 
> > ---
> >  t/t0021-conversion.sh | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> > index e10f5f787f..66f75005d5 100755
> > --- a/t/t0021-conversion.sh
> > +++ b/t/t0021-conversion.sh
> > @@ -390,6 +390,7 @@ test_expect_success PERL 'required process filter 
> > should filter data' '
> > EOF
> > test_cmp_exclude_clean expected.log debug.log &&
> >  
> > +   touch test.r &&
> > filter_git checkout --quiet --no-progress empty-branch &&
> > cat >expected.log <<-EOF &&
> > START


[PATCH] t0021: make sure clean filter runs

2019-08-19 Thread Thomas Gummerer
In t0021.15 one of the things we are checking is that the clean filter
is run when checking out empty-branch.  The clean filter needs to be
run to make sure there are no modifications on the file system for the
test.r file, and thus it isn't dangerous to overwrite it.

However in the current test setup it is not always necessary to run
the clean filter, and thus the test sometimes fails, as debug.log
isn't written.

This happens when test.r has an older mtime than the index itself.
That mtime is also recorded as stat data for test.r in the index, and
based on the heuristic we're using for index entries, git correctly
assumes this file is up-to-date.

Usually this test succeeds because the mtime of test.r is the same as
the mtime of the index.  In this case test.r is racily clean, so git
actually checks the contents, for which the clean filter is run.

Fix the test by updating the mtime of test.r, so git is forced to
check the contents of the file, and the clean filter is run as the
test expects.

Signed-off-by: Thomas Gummerer 
---
 t/t0021-conversion.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index e10f5f787f..66f75005d5 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -390,6 +390,7 @@ test_expect_success PERL 'required process filter should 
filter data' '
EOF
test_cmp_exclude_clean expected.log debug.log &&
 
+   touch test.r &&
filter_git checkout --quiet --no-progress empty-branch &&
cat >expected.log <<-EOF &&
START
-- 
2.23.0.rc2.194.ge5444969c9
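
As an illustration only, the three timestamp cases described in the
commit message can be modelled with a small shell function.  This is a
simplified sketch, not git's actual code: classify_entry and its integer
arguments are hypothetical, and it assumes the stat data recorded in the
index otherwise matches the file on disk.

```shell
#!/bin/sh
# Simplified model of the cases in the commit message above.
# $1: mtime of the work-tree file, $2: mtime of the index (integers).
classify_entry () {
	file_mtime=$1
	index_mtime=$2
	if test "$file_mtime" -lt "$index_mtime"
	then
		# Older than the index: assumed up to date, so the
		# contents are not read and the clean filter is skipped.
		echo "trusted"
	elif test "$file_mtime" -eq "$index_mtime"
	then
		# Same timestamp: racily clean, contents are checked
		# and the clean filter runs.
		echo "racily-clean"
	else
		# Newer than the index (e.g. after 'touch'): contents
		# are checked and the clean filter runs.
		echo "newer"
	fi
}

classify_entry 99 100
classify_entry 100 100
classify_entry 101 100
```

Only the first case skips the content check, which is the case the
'touch' in the patch is meant to rule out.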



[PATCH v2] stash: fix handling removed files with --keep-index

2019-07-16 Thread Thomas Gummerer
git stash push --keep-index is supposed to keep all changes that have
been added to the index, both in the index and on disk.

Currently this doesn't behave correctly when a file is removed from
the index.  Instead of keeping it deleted on disk, --keep-index
currently restores the file.

Fix that behaviour by using 'git checkout' in no-overlay mode which
can faithfully restore the index and working tree.  This also
simplifies the code.

Note that this will overwrite untracked files if the untracked file
has the same name as a file that has been deleted in the index.

Signed-off-by: Thomas Gummerer 
---

This would be the version using 'git checkout' instead of 'git
restore'.  Still not doing everything in-core though, as mentioned in
the previous email.

 builtin/stash.c  | 32 +---
 t/t3903-stash.sh |  7 +++
 2 files changed, 16 insertions(+), 23 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index fde6397caa..b5a301f24d 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1391,30 +1391,16 @@ static int do_push_stash(const struct pathspec *ps, 
const char *stash_msg, int q
}
 
if (keep_index == 1 && !is_null_oid(&info.i_tree)) {
-   struct child_process cp_ls = CHILD_PROCESS_INIT;
-   struct child_process cp_checkout = CHILD_PROCESS_INIT;
-   struct strbuf out = STRBUF_INIT;
-
-   if (reset_tree(&info.i_tree, 0, 1)) {
-   ret = -1;
-   goto done;
-   }
-
-   cp_ls.git_cmd = 1;
-   argv_array_pushl(&cp_ls.args, "ls-files", "-z",
-"--modified", "--", NULL);
-
-   add_pathspecs(&cp_ls.args, ps);
-   if (pipe_command(&cp_ls, NULL, 0, &out, 0, NULL, 0)) {
-   ret = -1;
-   goto done;
-   }
+   struct child_process cp = CHILD_PROCESS_INIT;
 
-   cp_checkout.git_cmd = 1;
-   argv_array_pushl(&cp_checkout.args, "checkout-index",
-"-z", "--force", "--stdin", NULL);
-   if (pipe_command(&cp_checkout, out.buf, out.len, NULL,
-0, NULL, 0)) {
+   cp.git_cmd = 1;
+   argv_array_pushl(&cp.args, "checkout", "--no-overlay",
+oid_to_hex(&info.i_tree), "--", NULL);
+   if (!ps->nr)
+   argv_array_push(&cp.args, ":/");
+   else
+   add_pathspecs(&cp.args, ps);
+   if (run_command(&cp)) {
ret = -1;
goto done;
}
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index b22e671608..b8e337893f 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1234,4 +1234,11 @@ test_expect_success 'stash works when user.name and 
user.email are not set' '
)
 '
 
+test_expect_success 'stash --keep-index with file deleted in index does not 
resurrect it on disk' '
+   test_commit to-remove to-remove &&
+   git rm to-remove &&
+   git stash --keep-index &&
+   test_path_is_missing to-remove
+'
+
 test_done
-- 
2.22.0.599.gf5cf68d754


Re: [PATCH] stash: fix handling removed files with --keep-index

2019-07-16 Thread Thomas Gummerer
On 07/11, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > Fix that behaviour by using 'git restore' which can faithfully restore
> > the index and working tree.  This also simplifies the code.
> 
> Hmph.  I would have preferred to see we stayed away from 'restore'
> (and used 'checkout' instead, if you must use a Porcelain command),
> so that the "fix" can go to maintenance tracks, if distro packagers
> choose to backport it.

Fair enough.  I thought this wouldn't even go to 'maint', since the
bug has existed for a while, so 'git restore' would be fine, but I
didn't think of distro packagers.  I'm happy to use 'checkout' here instead.

> Isn't the machinery for "git status" (in wt-status.c) mature enough
> to allow us to learn what got changed all in-core, without spawning
> an external process these days, though?

Maybe, I'm not all that familiar with that machinery.  My longer term
hope was actually to libify the checkout machinery, and to use that
here to do all this (and the 'add', 'diff-index' and 'apply' dance
above) in core.  But maybe it's worth looking at the "git status"
machinery for that as well?

I probably won't have enough time to do that in the next few weeks
though, so my preference would be to just use checkout for this (I'll
send an updated patch) to fix the bug in the next release.  As we're
already spawning two external processes and would replace that with
just spawning one it wouldn't make anything worse at least.

Then we can try to do this all in-core at some point later, which I
think is a bit more work, and probably wouldn't be ready for the next
release (at least I won't have time to work on it).

> > if (keep_index == 1 && !is_null_oid(&info.i_tree)) {
> > -   struct child_process cp_ls = CHILD_PROCESS_INIT;
> > -   struct child_process cp_checkout = CHILD_PROCESS_INIT;
> > -   struct strbuf out = STRBUF_INIT;
> > -
> > -   if (reset_tree(&info.i_tree, 0, 1)) {
> > -   ret = -1;
> > -   goto done;
> > -   }
> > -
> > -   cp_ls.git_cmd = 1;
> > -   argv_array_pushl(&cp_ls.args, "ls-files", "-z",
> > -"--modified", "--", NULL);
> > -
> > -   add_pathspecs(&cp_ls.args, ps);
> > -   if (pipe_command(&cp_ls, NULL, 0, &out, 0, NULL, 0)) {
> > -   ret = -1;
> > -   goto done;
> > -   }
> > -
> > -   cp_checkout.git_cmd = 1;
> > -   argv_array_pushl(&cp_checkout.args, "checkout-index",
> > -"-z", "--force", "--stdin", NULL);
> > -   if (pipe_command(&cp_checkout, out.buf, out.len, NULL,
> > -0, NULL, 0)) {
> > +   struct child_process cp_restore = CHILD_PROCESS_INIT;
> > +
> > +   cp_restore.git_cmd = 1;
> > +   argv_array_pushl(&cp_restore.args, "restore", 
> > "--source", oid_to_hex(&info.i_tree),
> > +"--staged", "--worktree", NULL);
> > +   if (!ps->nr)
> > +   argv_array_push(&cp_restore.args, ".");
> > +   else
> > +   add_pathspecs(&cp_restore.args, ps);
> > +   if (run_command(&cp_restore)) {
> > ret = -1;
> > goto done;
> > }
> > diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> > index b22e671608..b8e337893f 100755
> > --- a/t/t3903-stash.sh
> > +++ b/t/t3903-stash.sh
> > @@ -1234,4 +1234,11 @@ test_expect_success 'stash works when user.name and 
> > user.email are not set' '
> > )
> >  '
> >  
> > +test_expect_success 'stash --keep-index with file deleted in index does 
> > not resurrect it on disk' '
> > +   test_commit to-remove to-remove &&
> > +   git rm to-remove &&
> > +   git stash --keep-index &&
> > +   test_path_is_missing to-remove
> > +'
> > +
> >  test_done


[PATCH] stash: fix handling removed files with --keep-index

2019-07-11 Thread Thomas Gummerer
On 07/11, Martin Nicolay wrote:
> Hi!
> 
> I don't know if this is a software or documentation bug.
> 
> man git-stash says about --keep-index:
> If the --keep-index option is used, all changes already added to
> the index are left intact.
> 
> If a file is deleted and this deletion is in the index a following
> $ git stash push --keep-index
> keeps this deletion in the index but not in the working-tree.
> 
> If a file is changed and this change is in the index a following
> $ git stash push --keep-index
> keeps this change in the index and also in the working-tree.
> 
> This is inconsistent.

Thanks for your report.  This has come up before in
https://public-inbox.org/git/1555437849815.60...@rasenplanscher.info/,
which I first thought was expected behaviour, but that was just me
misunderstanding the --keep-index option.  So I believe this is indeed
a bug.

Luckily I had some more time to actually look at this this time
around, so below is a potential fix.

This comes with a small caveat of overwriting untracked files if they
have been removed from the index, and replaced with a file that has
not been added yet.  I think that's okay as that happens in other
places as well in stash, but wanted to point it out anyway.

--- >8 ---
Subject: [PATCH] stash: fix handling removed files with --keep-index

git stash push --keep-index is supposed to keep all changes that have
been added to the index, both in the index and on disk.

Currently this doesn't behave correctly when a file is removed from
the index.  Instead of keeping it deleted on disk, --keep-index
currently restores the file.

Fix that behaviour by using 'git restore' which can faithfully restore
the index and working tree.  This also simplifies the code.

Note that this will overwrite untracked files if the untracked file
has the same name as a file that has been deleted in the index.

Signed-off-by: Thomas Gummerer 
---
 builtin/stash.c  | 34 ++
 t/t3903-stash.sh |  7 +++
 2 files changed, 17 insertions(+), 24 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index fde6397caa..2a58c007e1 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1391,30 +1391,16 @@ static int do_push_stash(const struct pathspec *ps, 
const char *stash_msg, int q
}
 
if (keep_index == 1 && !is_null_oid(&info.i_tree)) {
-   struct child_process cp_ls = CHILD_PROCESS_INIT;
-   struct child_process cp_checkout = CHILD_PROCESS_INIT;
-   struct strbuf out = STRBUF_INIT;
-
-   if (reset_tree(&info.i_tree, 0, 1)) {
-   ret = -1;
-   goto done;
-   }
-
-   cp_ls.git_cmd = 1;
-   argv_array_pushl(&cp_ls.args, "ls-files", "-z",
-"--modified", "--", NULL);
-
-   add_pathspecs(&cp_ls.args, ps);
-   if (pipe_command(&cp_ls, NULL, 0, &out, 0, NULL, 0)) {
-   ret = -1;
-   goto done;
-   }
-
-   cp_checkout.git_cmd = 1;
-   argv_array_pushl(&cp_checkout.args, "checkout-index",
-"-z", "--force", "--stdin", NULL);
-   if (pipe_command(&cp_checkout, out.buf, out.len, NULL,
-0, NULL, 0)) {
+   struct child_process cp_restore = CHILD_PROCESS_INIT;
+
+   cp_restore.git_cmd = 1;
+   argv_array_pushl(&cp_restore.args, "restore", 
"--source", oid_to_hex(&info.i_tree),
+"--staged", "--worktree", NULL);
+   if (!ps->nr)
+   argv_array_push(&cp_restore.args, ".");
+   else
+   add_pathspecs(&cp_restore.args, ps);
+   if (run_command(&cp_restore)) {
ret = -1;
goto done;
}
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index b22e671608..b8e337893f 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1234,4 +1234,11 @@ test_expect_success 'stash works when user.name and 
user.email are not set' '
)
 '
 
+test_expect_success 'stash --keep-index with file deleted in index does not 
resurrect it on disk' '
+   test_commit to-remove to-remove &&
+   git rm to-remove &&
+   git stash --keep-index &&
+   test_path_is_missing to-remove
+'
+
 test_done
-- 
2.22.0.599.gf5cf68d754


[PATCH v4 11/14] range-diff: suppress line count in outer diff

2019-07-11 Thread Thomas Gummerer
The line count in the outer diff's hunk headers of a range diff is not
all that interesting.  It merely shows how far along the inner diffs
are on both sides.  That number is of no use for human readers, and
range-diffs are not meant to be machine readable.

In a subsequent commit we're going to add some more contextual
information such as the filename corresponding to the diff to the hunk
headers.  Remove the unnecessary information, and just keep the "@@"
to indicate that a new hunk of the outer diff is starting.

Signed-off-by: Thomas Gummerer 
---
 diff.c|  5 -
 diff.h|  1 +
 range-diff.c  |  1 +
 t/t3206-range-diff.sh | 16 
 4 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/diff.c b/diff.c
index ec5c095199..9c28ff0a92 100644
--- a/diff.c
+++ b/diff.c
@@ -1672,7 +1672,10 @@ static void emit_hunk_header(struct emit_callback 
*ecbdata,
if (ecbdata->opt->flags.dual_color_diffed_diffs)
strbuf_addstr(&msgbuf, reverse);
strbuf_addstr(&msgbuf, frag);
-   strbuf_add(&msgbuf, line, ep - line);
+   if (ecbdata->opt->flags.suppress_hunk_header_line_count)
+   strbuf_add(&msgbuf, atat, sizeof(atat));
+   else
+   strbuf_add(&msgbuf, line, ep - line);
strbuf_addstr(&msgbuf, reset);
 
/*
diff --git a/diff.h b/diff.h
index c9db9825bb..49913049f9 100644
--- a/diff.h
+++ b/diff.h
@@ -98,6 +98,7 @@ struct diff_flags {
unsigned stat_with_summary;
unsigned suppress_diff_headers;
unsigned dual_color_diffed_diffs;
+   unsigned suppress_hunk_header_line_count;
 };
 
 static inline void diff_flags_or(struct diff_flags *a,
diff --git a/range-diff.c b/range-diff.c
index a5202d8b6c..f4a90b33b8 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -486,6 +486,7 @@ int show_range_diff(const char *range1, const char *range2,
opts.output_format = DIFF_FORMAT_PATCH;
opts.flags.suppress_diff_headers = 1;
opts.flags.dual_color_diffed_diffs = dual_color;
+   opts.flags.suppress_hunk_header_line_count = 1;
opts.output_prefix = output_prefix_cb;
strbuf_addstr(&indent, "");
opts.output_prefix_data = &indent;
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index aebd4e3693..9f89af7178 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -99,7 +99,7 @@ test_expect_success 'changed commit' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@ -10,7 +10,7 @@
+   @@
  9
  10
 -11
@@ -109,7 +109,7 @@ test_expect_success 'changed commit' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@ -8,7 +8,7 @@
+   @@
 @@ A
  9
  10
@@ -158,7 +158,7 @@ test_expect_success 'changed commit with sm config' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@ -10,7 +10,7 @@
+   @@
  9
  10
 -11
@@ -168,7 +168,7 @@ test_expect_success 'changed commit with sm config' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@ -8,7 +8,7 @@
+   @@
 @@ A
  9
  10
@@ -191,7 +191,7 @@ test_expect_success 'changed message' '
sed s/Z/\ /g >expected <<-EOF &&
1:  4de457d = 1:  f686024 s/5/A/
2:  fccce22 ! 2:  4ab067d s/4/A/
-   @@ -2,6 +2,8 @@
+   @@
Z
Zs/4/A/
Z
@@ -210,7 +210,7 @@ test_expect_success 'dual-coloring' '
sed -e "s|^:||" >expect <<-\EOF &&
:1:  a4b = 1:  f686024 s/5/A/
:2:  f51d370 ! 2:  
4ab067d s/4/A/
-   :@@ -2,6 +2,8 @@
+   :@@
: 
: s/4/A/
: 
@@ -220,7 +220,7 @@ test_expect_success 'dual-coloring' '
:  --- a/file
:  +++ b/file
:3:  0559556 ! 3:  
b9cb956 s/11/B/
-   :@@ -10,7 +10,7 @@
+   :@@
:  9
:  10
: -11
@@ -230,7 +230,7 @@ test_expect_success 'dual-coloring' '
:  13
:  14
:4:  d966c5c ! 4:  
8add5f1 s/12/B/
-   :@@ -8,7 +8,7 @@
+   :@@
: @@ A
:  9
:  10
-- 
2.22.0.510.g264f2c817a



[PATCH v4 14/14] range-diff: add headers to the outer hunk header

2019-07-11 Thread Thomas Gummerer
Add the section headers/hunk headers we introduced in the previous
commits to the outer diff's hunk headers.  This makes it easier to
understand which change we are actually looking at.  For example an
outer hunk header might now look like:

@@  Documentation/config/interactive.txt

while previously it would have only been

@@

which doesn't give a lot of context for the change that follows.

For completeness also add section headers for the commit metadata and
the commit message, although they are arguably less important.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c  |  9 ++---
 t/t3206-range-diff.sh | 41 ++---
 2 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 7a96a587f1..ba1e9a4265 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -139,8 +139,10 @@ static int read_patches(const char *range, struct 
string_list *list)
strbuf_addstr(&buf, " ##");
} else if (in_header) {
if (starts_with(line, "Author: ")) {
+   strbuf_addstr(&buf, " ## Metadata ##\n");
strbuf_addstr(&buf, line);
strbuf_addstr(&buf, "\n\n");
+   strbuf_addstr(&buf, " ## Commit message ##\n");
} else if (starts_with(line, "")) {
p = line + len - 2;
while (isspace(*p) && p >= line)
@@ -402,8 +404,9 @@ static void output_pair_header(struct diff_options *diffopt,
fwrite(buf->buf, buf->len, 1, diffopt->file);
 }
 
-static struct userdiff_driver no_func_name = {
-   .funcname = { "$^", 0 }
+static struct userdiff_driver section_headers = {
+   .funcname = { "^ ## (.*) ##$\n"
+ "^.?@@ (.*)$", REG_EXTENDED }
 };
 
 static struct diff_filespec *get_filespec(const char *name, const char *p)
@@ -415,7 +418,7 @@ static struct diff_filespec *get_filespec(const char *name, 
const char *p)
spec->size = strlen(p);
spec->should_munmap = 0;
spec->is_stdin = 1;
-   spec->driver = &no_func_name;
+   spec->driver = &section_headers;
 
return spec;
 }
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index d4de270979..ec548654ce 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -99,7 +99,7 @@ test_expect_success 'changed commit' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@
+   @@ file: A
  9
  10
 -11
@@ -109,7 +109,7 @@ test_expect_success 'changed commit' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@
+   @@ file
 @@ file: A
  9
  10
@@ -158,7 +158,7 @@ test_expect_success 'changed commit with sm config' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@
+   @@ file: A
  9
  10
 -11
@@ -168,7 +168,7 @@ test_expect_success 'changed commit with sm config' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@
+   @@ file
 @@ file: A
  9
  10
@@ -186,9 +186,10 @@ test_expect_success 'renamed file' '
sed s/Z/\ /g >expected <<-EOF &&
1:  4de457d = 1:  f258d75 s/5/A/
2:  fccce22 ! 2:  017b62d s/4/A/
-   @@
+   @@ Metadata
ZAuthor: Thomas Rast 
Z
+   Z ## Commit message ##
-s/4/A/
+s/4/A/ + rename file
Z
@@ -198,8 +199,8 @@ test_expect_success 'renamed file' '
Z 1
Z 2
3:  147e64e ! 3:  3ce7af6 s/11/B/
-   @@
-   Z
+   @@ Metadata
+   Z ## Commit message ##
Zs/11/B/
Z
- ## file ##
@@ -210,8 +211,8 @@ test_expect_success 'renamed file' '
Z 9
Z 10
4:  a63e992 ! 4:  1e6226b s/12/B/
-   @@
-   Z
+   @@ Metadata
+   Z ## Commit message ##
Zs/12/B/
Z
- ## file ##
@@ -230,30 +231,32 @@ test_expect_success 'file added and later removed' '
sed s/Z/\ /g >expected <<-EOF &&
1:  4de457d = 1:  096b1ba s/5/A/
2:  fccce22 ! 2:  d92e698 s/4/A/
-   @@
+   @@ Metadata
ZAuthor: Thomas Rast 
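
The funcname pattern in the patch above holds two regular expressions
separated by a newline, and each is tried against a line in turn.  A
rough way to see which inner-diff lines they would pick up is the sed
sketch below.  The helper name outer_header_candidates and the sample
input are made up for illustration, and the sketch lists every matching
line rather than choosing the one nearest to a hunk as the diff
machinery does.

```shell
#!/bin/sh
# Print the captured text of every line that matches one of the two
# patterns from the patch: " ## <section> ##" or an inner "@@ <text>"
# line with at most one leading character in front of the "@@".
outer_header_candidates () {
	sed -nE -e 's/^ ## (.*) ##$/\1/p' -e 's/^.?@@ (.*)$/\1/p'
}

# Hypothetical inner-diff lines, roughly as range-diff prepares them.
candidates=$(printf '%s\n' \
	' ## Commit message ##' \
	'    s/4/A/' \
	' @@ file: A' \
	' 9' | outer_header_candidates)
echo "$candidates"
```

With this input the two header lines yield "Commit message" and
"file: A", while the message body and the context line are ignored.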
 

[PATCH v4 12/14] range-diff: add section header instead of diff header

2019-07-11 Thread Thomas Gummerer
Currently range-diff keeps the diff header of the inner diff
intact (apart from stripping lines starting with index).  This diff
header is somewhat useful, especially when files get different
names in different ranges.

However there is no real need to keep the whole diff header for that.
The main reason we currently do that is probably because it is easy to
do.

Introduce a new range diff hunk header, that's enclosed by "##",
similar to how line numbers in diff hunks are enclosed by "@@", and
give human readable information of what exactly happened to the file,
including the file name.

This improves the readability of the range-diff by giving more concise
information to the users.  For example if a file was renamed in one
iteration, but not in another, the diff of the headers would be quite
noisy.  However the diff of a single line is concise and should be
easier to understand.

Additionally, this allows us to add these range diff section headers to
the outer diff's hunk headers using a custom userdiff pattern, which
should help make the range-diff more readable.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c   | 34 
 t/t3206-range-diff.sh  | 91 +++---
 t/t3206/history.export | 84 --
 3 files changed, 192 insertions(+), 17 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index f4a90b33b8..5f64380fe4 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -10,6 +10,7 @@
 #include "commit.h"
 #include "pretty.h"
 #include "userdiff.h"
+#include "apply.h"
 
 struct patch_util {
/* For the search for an exact match */
@@ -101,12 +102,35 @@ static int read_patches(const char *range, struct 
string_list *list)
}
 
if (starts_with(line, "diff --git")) {
+   struct patch patch = { 0 };
+   struct strbuf root = STRBUF_INIT;
+   int linenr = 0;
+
in_header = 0;
strbuf_addch(&buf, '\n');
if (!util->diff_offset)
util->diff_offset = buf.len;
-   strbuf_addch(&buf, ' ');
-   strbuf_addstr(&buf, line);
+   line[len - 1] = '\n';
+   len = parse_git_diff_header(&root, &linenr, 1, line,
+   len, size, &patch);
+   if (len < 0)
+   die(_("could not parse git header '%.*s'"), 
(int)len, line);
+   strbuf_addstr(&buf, " ## ");
+   if (patch.is_new > 0)
+   strbuf_addf(&buf, "%s (new)", patch.new_name);
+   else if (patch.is_delete > 0)
+   strbuf_addf(&buf, "%s (deleted)", 
patch.old_name);
+   else if (patch.is_rename)
+   strbuf_addf(&buf, "%s => %s", patch.old_name, 
patch.new_name);
+   else
+   strbuf_addstr(&buf, patch.new_name);
+
+   if (patch.new_mode && patch.old_mode &&
+   patch.old_mode != patch.new_mode)
+   strbuf_addf(&buf, " (mode change %06o => %06o)",
+   patch.old_mode, patch.new_mode);
+
+   strbuf_addstr(&buf, " ##");
} else if (in_header) {
if (starts_with(line, "Author: ")) {
strbuf_addstr(&buf, line);
@@ -122,17 +146,13 @@ static int read_patches(const char *range, struct 
string_list *list)
} else if (skip_prefix(line, "@@ ", &p)) {
p = strstr(p, "@@");
strbuf_addstr(&buf, p ? p : "@@");
-   } else if (!line[0] || starts_with(line, "index "))
+   } else if (!line[0])
/*
 * A completely blank (not ' \n', which is context)
 * line is not valid in a diff.  We skip it
 * silently, because this neatly handles the blank
 * separator line between commits in git-log
 * output.
-*
-* We also want to ignore the diff's `index` lines
-* because they contain exact blob hashes in which
-* we are not interested.
 */
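
The section header formats introduced by this patch can be summarised
with a small sketch.  The function section_header and its argument
layout are hypothetical and only mirror the branches in the C code
above; modes are passed as already formatted octal strings, and an
empty string stands for "no mode recorded".

```shell
#!/bin/sh
# $1: old name, $2: new name, $3: new|deleted|renamed|modified,
# $4: old mode, $5: new mode (both may be empty).
section_header () {
	old_name=$1
	new_name=$2
	kind=$3
	old_mode=$4
	new_mode=$5
	case $kind in
	new)     title="$new_name (new)" ;;
	deleted) title="$old_name (deleted)" ;;
	renamed) title="$old_name => $new_name" ;;
	*)       title=$new_name ;;
	esac
	# A mode change is only reported when both modes are known
	# and they differ.
	if test -n "$old_mode" && test -n "$new_mode" &&
	   test "$old_mode" != "$new_mode"
	then
		title="$title (mode change $old_mode => $new_mode)"
	fi
	printf ' ## %s ##\n' "$title"
}

section_header file renamed-file renamed "" ""
section_header file file modified 100644 100755
```

A rename therefore collapses into the single line
" ## file => renamed-file ##" instead of a multi-line diff header.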

[PATCH v4 10/14] range-diff: don't remove funcname from inner diff

2019-07-11 Thread Thomas Gummerer
When postprocessing the inner diff in range-diff, we currently replace
the whole hunk header line with just "@@".  This matches how 'git
tbdiff' used to handle hunk headers as well.

Most likely this is being done because line numbers in the hunk header
are not relevant without other changes.  They can for example easily
change if a range is rebased, and lines are added/removed before a
change that we actually care about in our ranges.

However it can still be useful to have the function name that 'git
diff' extracts as additional context for the change.

Note that it is not guaranteed that the hunk header actually shows up
in the range-diff, and this change only aims to improve the case where
a hunk header would already be included in the final output.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c  | 7 ---
 t/t3206-range-diff.sh | 6 +++---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 784fac301b..a5202d8b6c 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -119,9 +119,10 @@ static int read_patches(const char *range, struct 
string_list *list)
strbuf_addch(&buf, '\n');
}
continue;
-   } else if (starts_with(line, "@@ "))
-   strbuf_addstr(&buf, "@@");
-   else if (!line[0] || starts_with(line, "index "))
+   } else if (skip_prefix(line, "@@ ", &p)) {
+   p = strstr(p, "@@");
+   strbuf_addstr(&buf, p ? p : "@@");
+   } else if (!line[0] || starts_with(line, "index "))
/*
 * A completely blank (not ' \n', which is context)
 * line is not valid in a diff.  We skip it
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 048feaf6dd..aebd4e3693 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -110,7 +110,7 @@ test_expect_success 'changed commit' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@ -8,7 +8,7 @@
-@@
+@@ A
  9
  10
- B
@@ -169,7 +169,7 @@ test_expect_success 'changed commit with sm config' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@ -8,7 +8,7 @@
-@@
+@@ A
  9
  10
- B
@@ -231,7 +231,7 @@ test_expect_success 'dual-coloring' '
:  14
:4:  d966c5c ! 4:  
8add5f1 s/12/B/
:@@ -8,7 +8,7 @@
-   : @@
+   : @@ A
:  9
:  10
:- BB
-- 
2.22.0.510.g264f2c817a
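
For readers who want to see the effect of the hunk header change in
isolation, here is a rough shell sketch of the same transformation.
The function name strip_line_numbers is hypothetical; like the C code,
it keeps whatever follows the first closing "@@" and falls back to a
bare "@@" when there is none.

```shell
#!/bin/sh
# Drop the line numbers from an inner-diff hunk header but keep the
# function name that follows the closing "@@".
strip_line_numbers () {
	line=$1
	case $line in
	"@@ "*)
		rest=${line#"@@ "}
		case $rest in
		*@@*) printf '@@%s\n' "${rest#*@@}" ;;  # text after first "@@"
		*)    printf '@@\n' ;;                  # malformed: no closing "@@"
		esac
		;;
	*)
		printf '%s\n' "$line"                   # not a hunk header
		;;
	esac
}

strip_line_numbers "@@ -8,7 +8,7 @@ A"
strip_line_numbers "@@ -10,7 +10,7 @@"
```

The first call prints "@@ A" and the second a bare "@@", matching the
updated expectations in t3206.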



[PATCH v4 13/14] range-diff: add filename to inner diff

2019-07-11 Thread Thomas Gummerer
In a range-diff it's not always clear which file a certain funcname of
the inner diff belongs to, because the diff header (or section header
as added in a previous commit) is not always visible in the
range-diff.

Add the filename to the inner diff's header, so it's always visible to
users.

This also allows us to add the filename + the funcname to the outer
diff's hunk headers using a custom userdiff pattern, which will be done
in the next commit.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c  | 15 +--
 t/t3206-range-diff.sh | 16 ++--
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 5f64380fe4..7a96a587f1 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -46,7 +46,7 @@ static int read_patches(const char *range, struct string_list 
*list)
struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
struct patch_util *util = NULL;
int in_header = 1;
-   char *line;
+   char *line, *current_filename = NULL;
int offset, len;
size_t size;
 
@@ -125,6 +125,12 @@ static int read_patches(const char *range, struct 
string_list *list)
else
strbuf_addstr(&buf, patch.new_name);
 
+   free(current_filename);
+   if (patch.is_delete > 0)
+   current_filename = xstrdup(patch.old_name);
+   else
+   current_filename = xstrdup(patch.new_name);
+
if (patch.new_mode && patch.old_mode &&
patch.old_mode != patch.new_mode)
strbuf_addf(&buf, " (mode change %06o => %06o)",
@@ -145,7 +151,11 @@ static int read_patches(const char *range, struct 
string_list *list)
continue;
} else if (skip_prefix(line, "@@ ", &p)) {
p = strstr(p, "@@");
-   strbuf_addstr(&buf, p ? p : "@@");
+   strbuf_addstr(&buf, "@@");
+   if (current_filename && p[2])
+   strbuf_addf(&buf, " %s:", current_filename);
+   if (p)
+   strbuf_addstr(&buf, p + 2);
} else if (!line[0])
/*
 * A completely blank (not ' \n', which is context)
@@ -177,6 +187,7 @@ static int read_patches(const char *range, struct 
string_list *list)
if (util)
string_list_append(list, buf.buf)->util = util;
strbuf_release(&buf);
+   free(current_filename);
 
if (finish_command(&cp))
return -1;
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index c277756057..d4de270979 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -110,7 +110,7 @@ test_expect_success 'changed commit' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@
-@@ A
+@@ file: A
  9
  10
- B
@@ -169,7 +169,7 @@ test_expect_success 'changed commit with sm config' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@
-@@ A
+@@ file: A
  9
  10
- B
@@ -203,20 +203,24 @@ test_expect_success 'renamed file' '
Zs/11/B/
Z
- ## file ##
+   -@@ file: A
+ ## renamed-file ##
-   Z@@ A
+   +@@ renamed-file: A
Z 8
Z 9
+   Z 10
4:  a63e992 ! 4:  1e6226b s/12/B/
@@
Z
Zs/12/B/
Z
- ## file ##
+   -@@ file: A
+ ## renamed-file ##
-   Z@@ A
+   +@@ renamed-file: A
Z 9
Z 10
+   Z B
EOF
test_cmp expected actual
 '
@@ -248,7 +252,7 @@ test_expect_success 'file added and later removed' '
+s/11/B/ + remove file
Z
Z ## file ##
-   Z@@ A
+   Z@@ file: A
@@
Z 12
Z 13
@@ -310,7 +314,7 @@ test_expect_success 'dual-coloring' '
:  14
:4:  d966c5c ! 4:  
8add5f1 s/12/B/
:@@
-   : @@ A
+   : @@ file: A
:  9
:  10
:- BB
-- 
2.22.0.510.g264f2c817a
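
The resulting inner hunk header format can be sketched in shell as
follows.  The function inner_hunk_header is a hypothetical illustration
of the intended output, not a translation of the C code: the filename is
only added when a function name is present, and a header without a
closing "@@" is reduced to a bare "@@".

```shell
#!/bin/sh
# $1: current filename, $2: original hunk header line of the inner diff.
inner_hunk_header () {
	filename=$1
	line=$2
	rest=${line#"@@ "}
	case $rest in
	*@@*) funcname=${rest#*@@} ;;  # text after the closing "@@", may be empty
	*)    funcname= ;;
	esac
	if test -n "$filename" && test -n "$funcname"
	then
		# funcname still carries its leading space
		printf '@@ %s:%s\n' "$filename" "$funcname"
	else
		printf '@@%s\n' "$funcname"
	fi
}

inner_hunk_header file "@@ -8,7 +8,7 @@ A"
inner_hunk_header file "@@ -10,7 +10,7 @@"
```

This prints "@@ file: A" for the first header and "@@" for the second,
in line with the updated test expectations.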



[PATCH v4 07/14] apply: make parse_git_diff_header public

2019-07-11 Thread Thomas Gummerer
Make 'parse_git_header()' (renamed to 'parse_git_diff_header()') a
"public" function in apply.h, so we can re-use it in range-diff in a
subsequent commit.  We're renaming the function to make it clearer in
other parts of the codebase that we're talking about a diff header and
not just any header.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 69 -
 apply.h | 48 +++
 2 files changed, 67 insertions(+), 50 deletions(-)

diff --git a/apply.c b/apply.c
index 034d134dd5..d68a6acad7 100644
--- a/apply.c
+++ b/apply.c
@@ -207,40 +207,6 @@ struct fragment {
 #define BINARY_DELTA_DEFLATED  1
 #define BINARY_LITERAL_DEFLATED 2
 
-/*
- * This represents a "patch" to a file, both metainfo changes
- * such as creation/deletion, filemode and content changes represented
- * as a series of fragments.
- */
-struct patch {
-   char *new_name, *old_name, *def_name;
-   unsigned int old_mode, new_mode;
-   int is_new, is_delete;  /* -1 = unknown, 0 = false, 1 = true */
-   int rejected;
-   unsigned ws_rule;
-   int lines_added, lines_deleted;
-   int score;
-   int extension_linenr; /* first line specifying delete/new/rename/copy */
-   unsigned int is_toplevel_relative:1;
-   unsigned int inaccurate_eof:1;
-   unsigned int is_binary:1;
-   unsigned int is_copy:1;
-   unsigned int is_rename:1;
-   unsigned int recount:1;
-   unsigned int conflicted_threeway:1;
-   unsigned int direct_to_threeway:1;
-   unsigned int crlf_in_old:1;
-   struct fragment *fragments;
-   char *result;
-   size_t resultsize;
-   char old_oid_prefix[GIT_MAX_HEXSZ + 1];
-   char new_oid_prefix[GIT_MAX_HEXSZ + 1];
-   struct patch *next;
-
-   /* three-way fallback result */
-   struct object_id threeway_stage[3];
-};
-
 static void free_fragment_list(struct fragment *list)
 {
while (list) {
@@ -1320,12 +1286,13 @@ static int check_header_line(int linenr, struct patch *patch)
return 0;
 }
 
-/* Verify that we recognize the lines following a git header */
-static int parse_git_header(struct apply_state *state,
-   const char *line,
-   int len,
-   unsigned int size,
-   struct patch *patch)
+int parse_git_diff_header(struct strbuf *root,
+ int *linenr,
+ int p_value,
+ const char *line,
+ int len,
+ unsigned int size,
+ struct patch *patch)
 {
unsigned long offset;
struct gitdiff_data parse_hdr_state;
@@ -1340,21 +1307,21 @@ static int parse_git_header(struct apply_state *state,
 * or removing or adding empty files), so we get
 * the default name from the header.
 */
-   patch->def_name = git_header_name(state->p_value, line, len);
-   if (patch->def_name && state->root.len) {
-   char *s = xstrfmt("%s%s", state->root.buf, patch->def_name);
+   patch->def_name = git_header_name(p_value, line, len);
+   if (patch->def_name && root->len) {
+   char *s = xstrfmt("%s%s", root->buf, patch->def_name);
free(patch->def_name);
patch->def_name = s;
}
 
line += len;
size -= len;
-   state->linenr++;
-   parse_hdr_state.root = &state->root;
-   parse_hdr_state.linenr = state->linenr;
-   parse_hdr_state.p_value = state->p_value;
+   (*linenr)++;
+   parse_hdr_state.root = root;
+   parse_hdr_state.linenr = *linenr;
+   parse_hdr_state.p_value = p_value;
 
-   for (offset = len ; size > 0 ; offset += len, size -= len, line += len, state->linenr++) {
+   for (offset = len ; size > 0 ; offset += len, size -= len, line += len, (*linenr)++) {
static const struct opentry {
const char *str;
int (*fn)(struct gitdiff_data *, const char *, struct 
patch *);
@@ -1391,7 +1358,7 @@ static int parse_git_header(struct apply_state *state,
res = p->fn(&parse_hdr_state, line + oplen, patch);
if (res < 0)
return -1;
-   if (check_header_line(state->linenr, patch))
+   if (check_header_line(*linenr, patch))
return -1;
if (res > 0)
return offset;
@@ -1572,7 +1539,9 @@ static int find_header(struct apply_state *state,
 * or mode change, so we handle that specially
 */
if (!

[PATCH v4 06/14] apply: only pass required data to gitdiff_* functions

2019-07-11 Thread Thomas Gummerer
Currently the 'gitdiff_*()' functions take 'struct apply_state' as
parameter, even though they only need the root, linenr and p_value
from that struct.

These functions are in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

As these functions are called in a loop using their function pointers,
each function needs to be passed all the parameters even if only one
of the functions actually needs it.  We therefore pass this data along
in a struct to avoid adding too many unused parameters to each
function and making the code very verbose in the process.
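As a rough illustration of the pattern described above (not code from this series), here is a minimal sketch of handlers that are called through function pointers and share one small context struct. The names 'struct hdr_ctx', 'struct hdr_result', 'on_old_mode()', 'on_new_mode()' and 'dispatch_header_line()' are hypothetical stand-ins; only the idea of a three-field struct holding root, linenr and p_value comes from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for 'struct gitdiff_data': context shared by all handlers. */
struct hdr_ctx {
	const char *root;
	int linenr;
	int p_value;
};

/* Hypothetical stand-in for the few 'struct patch' fields this sketch fills in. */
struct hdr_result {
	unsigned int old_mode, new_mode;
	int bad_linenr;	/* line number of the last unparsable mode line */
};

/* Every handler has the same signature so it can live in one table. */
typedef int (*hdr_fn)(const struct hdr_ctx *, const char *, struct hdr_result *);

static int parse_mode(const struct hdr_ctx *ctx, const char *line,
		      unsigned int *mode, struct hdr_result *res)
{
	char *end;
	unsigned long value = strtoul(line, &end, 8);

	if (end == line) {
		/* Only this helper needs linenr; the others just carry it along. */
		res->bad_linenr = ctx->linenr;
		return -1;
	}
	*mode = (unsigned int)value;
	return 0;
}

static int on_old_mode(const struct hdr_ctx *ctx, const char *line,
		       struct hdr_result *res)
{
	return parse_mode(ctx, line, &res->old_mode, res);
}

static int on_new_mode(const struct hdr_ctx *ctx, const char *line,
		       struct hdr_result *res)
{
	return parse_mode(ctx, line, &res->new_mode, res);
}

/* Returns 1 if a handler accepted the line, 0 if no handler matched, -1 on error. */
static int dispatch_header_line(const struct hdr_ctx *ctx, const char *line,
				struct hdr_result *res)
{
	static const struct {
		const char *str;
		hdr_fn fn;
	} optable[] = {
		{ "old mode ", on_old_mode },
		{ "new mode ", on_new_mode },
	};
	size_t i;

	assert(ctx && line && res);
	for (i = 0; i < sizeof(optable) / sizeof(optable[0]); i++) {
		size_t oplen = strlen(optable[i].str);

		if (strncmp(line, optable[i].str, oplen))
			continue;
		return optable[i].fn(ctx, line + oplen, res) < 0 ? -1 : 1;
	}
	return 0;
}
```

Adding a fourth piece of shared data in such a layout would mean touching the struct once, rather than the signature of every handler in the table.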

Signed-off-by: Thomas Gummerer 
---
 apply.c | 59 ++---
 1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/apply.c b/apply.c
index 3cd4e3d3b3..034d134dd5 100644
--- a/apply.c
+++ b/apply.c
@@ -22,6 +22,12 @@
 #include "rerere.h"
 #include "apply.h"
 
+struct gitdiff_data {
+   struct strbuf *root;
+   int linenr;
+   int p_value;
+};
+
 static void git_apply_config(void)
 {
git_config_get_string_const("apply.whitespace", &apply_default_whitespace);
@@ -914,7 +920,7 @@ static int parse_traditional_patch(struct apply_state *state,
return 0;
 }
 
-static int gitdiff_hdrend(struct apply_state *state,
+static int gitdiff_hdrend(struct gitdiff_data *state,
  const char *line,
  struct patch *patch)
 {
@@ -933,14 +939,14 @@ static int gitdiff_hdrend(struct apply_state *state,
 #define DIFF_OLD_NAME 0
 #define DIFF_NEW_NAME 1
 
-static int gitdiff_verify_name(struct apply_state *state,
+static int gitdiff_verify_name(struct gitdiff_data *state,
   const char *line,
   int isnull,
   char **name,
   int side)
 {
if (!*name && !isnull) {
-   *name = find_name(&state->root, line, NULL, state->p_value, TERM_TAB);
+   *name = find_name(state->root, line, NULL, state->p_value, TERM_TAB);
return 0;
}
 
@@ -949,7 +955,7 @@ static int gitdiff_verify_name(struct apply_state *state,
if (isnull)
return error(_("git apply: bad git-diff - expected 
/dev/null, got %s on line %d"),
 *name, state->linenr);
-   another = find_name(&state->root, line, NULL, state->p_value, TERM_TAB);
+   another = find_name(state->root, line, NULL, state->p_value, TERM_TAB);
if (!another || strcmp(another, *name)) {
free(another);
return error((side == DIFF_NEW_NAME) ?
@@ -965,7 +971,7 @@ static int gitdiff_verify_name(struct apply_state *state,
return 0;
 }
 
-static int gitdiff_oldname(struct apply_state *state,
+static int gitdiff_oldname(struct gitdiff_data *state,
   const char *line,
   struct patch *patch)
 {
@@ -974,7 +980,7 @@ static int gitdiff_oldname(struct apply_state *state,
   DIFF_OLD_NAME);
 }
 
-static int gitdiff_newname(struct apply_state *state,
+static int gitdiff_newname(struct gitdiff_data *state,
   const char *line,
   struct patch *patch)
 {
@@ -992,21 +998,21 @@ static int parse_mode_line(const char *line, int linenr, unsigned int *mode)
return 0;
 }
 
-static int gitdiff_oldmode(struct apply_state *state,
+static int gitdiff_oldmode(struct gitdiff_data *state,
   const char *line,
   struct patch *patch)
 {
return parse_mode_line(line, state->linenr, &patch->old_mode);
 }
 
-static int gitdiff_newmode(struct apply_state *state,
+static int gitdiff_newmode(struct gitdiff_data *state,
   const char *line,
   struct patch *patch)
 {
return parse_mode_line(line, state->linenr, &patch->new_mode);
 }
 
-static int gitdiff_delete(struct apply_state *state,
+static int gitdiff_delete(struct gitdiff_data *state,
  const char *line,
  struct patch *patch)
 {
@@ -1016,7 +1022,7 @@ static int gitdiff_delete(struct apply_state *state,
return gitdiff_oldmode(state, line, patch);
 }
 
-static int gitdiff_newfile(struct apply_state *state,
+static int gitdiff_newfile(struct gitdiff_data *state,
   const char *line,
   

[PATCH v4 08/14] range-diff: fix function parameter indentation

2019-07-11 Thread Thomas Gummerer
Fix the indentation of the function parameters for a couple of
functions, to match the style in the rest of the file.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 48b0e1b4ce..9242b8975f 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -148,7 +148,7 @@ static int read_patches(const char *range, struct string_list *list)
 }
 
 static int patch_util_cmp(const void *dummy, const struct patch_util *a,
-const struct patch_util *b, const char *keydata)
+ const struct patch_util *b, const char *keydata)
 {
return strcmp(a->diff, keydata ? keydata : b->diff);
 }
@@ -373,7 +373,7 @@ static struct diff_filespec *get_filespec(const char *name, const char *p)
 }
 
 static void patch_diff(const char *a, const char *b,
- struct diff_options *diffopt)
+  struct diff_options *diffopt)
 {
diff_queue(&diff_queued_diff,
   get_filespec("a", a), get_filespec("b", b));
-- 
2.22.0.510.g264f2c817a



[PATCH v4 09/14] range-diff: split lines manually

2019-07-11 Thread Thomas Gummerer
Currently range-diff uses the 'strbuf_getline()' function for doing
its line by line processing.  In a future patch we want to do parts of
that parsing using the 'parse_git_diff_header()' function.  That
function does its own line by line reading of the input, and doesn't
use strbufs.  This doesn't match with how we do the line-by-line
processing in range-diff currently.

Switch range-diff to do our own line by line parsing, so we can re-use
the 'parse_git_diff_header()' function later.
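To make the in-place approach a bit more concrete, here is a small sketch (not the code from this patch) of walking a buffer line by line without a FILE or a per-line strbuf. The helper names 'end_of_line()' and 'count_prefixed_lines()' are hypothetical, and the sketch assumes the buffer is followed by a NUL byte, as a strbuf's contents would be.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * NUL-terminate the first line of 'buffer' in place and return the number
 * of bytes consumed, including the newline if there was one.
 */
static size_t end_of_line(char *buffer, size_t size)
{
	char *eol = memchr(buffer, '\n', size);

	if (!eol)
		return size;	/* last line, no trailing newline */
	*eol = '\0';
	return (size_t)(eol + 1 - buffer);
}

/* Count the lines in 'buf' that start with 'prefix'; modifies 'buf' in place. */
static int count_prefixed_lines(char *buf, size_t size, const char *prefix)
{
	size_t plen = strlen(prefix);
	char *line = buf;
	int count = 0;

	/* Assumption: the caller guarantees a NUL right after the data. */
	assert(buf[size] == '\0');
	while (size > 0) {
		size_t len = end_of_line(line, size);

		if (!strncmp(line, prefix, plen))
			count++;
		line += len;
		size -= len;
	}
	return count;
}
```

Because each line ends up as an ordinary NUL-terminated string inside the original buffer, a pointer to it can be handed to a parser that expects a 'const char *' plus a remaining size.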

Signed-off-by: Thomas Gummerer 
---
 range-diff.c | 68 
 1 file changed, 42 insertions(+), 26 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 9242b8975f..784fac301b 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -24,6 +24,17 @@ struct patch_util {
struct object_id oid;
 };
 
+static size_t find_end_of_line(char *buffer, unsigned long size)
+{
+   char *eol = memchr(buffer, '\n', size);
+
+   if (!eol)
+   return size;
+
+   *eol = '\0';
+   return eol + 1 - buffer;
+}
+
 /*
  * Reads the patches into a string list, with the `util` field being populated
  * as struct object_id (will need to be free()d).
@@ -31,10 +42,12 @@ struct patch_util {
 static int read_patches(const char *range, struct string_list *list)
 {
struct child_process cp = CHILD_PROCESS_INIT;
-   FILE *in;
-   struct strbuf buf = STRBUF_INIT, line = STRBUF_INIT;
+   struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
struct patch_util *util = NULL;
int in_header = 1;
+   char *line;
+   int offset, len;
+   size_t size;
 
argv_array_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
"--reverse", "--date-order", "--decorate=no",
@@ -54,17 +67,20 @@ static int read_patches(const char *range, struct string_list *list)
 
if (start_command(&cp))
return error_errno(_("could not start `log`"));
-   in = fdopen(cp.out, "r");
-   if (!in) {
+   if (strbuf_read(&contents, cp.out, 0) < 0) {
error_errno(_("could not read `log` output"));
finish_command(&cp);
return -1;
}
 
-   while (strbuf_getline(&line, in) != EOF) {
+   line = contents.buf;
+   size = contents.len;
+   for (offset = 0; size > 0; offset += len, size -= len, line += len) {
const char *p;
 
-   if (skip_prefix(line.buf, "commit ", &p)) {
+   len = find_end_of_line(line, size);
+   line[len - 1] = '\0';
+   if (skip_prefix(line, "commit ", &p)) {
if (util) {
string_list_append(list, buf.buf)->util = util;
strbuf_reset(&buf);
@@ -75,8 +91,7 @@ static int read_patches(const char *range, struct string_list *list)
free(util);
string_list_clear(list, 1);
strbuf_release(&buf);
-   strbuf_release(&line);
-   fclose(in);
+   strbuf_release(&contents);
finish_command(&cp);
return -1;
}
@@ -85,26 +100,28 @@ static int read_patches(const char *range, struct string_list *list)
continue;
}
 
-   if (starts_with(line.buf, "diff --git")) {
+   if (starts_with(line, "diff --git")) {
in_header = 0;
strbuf_addch(&buf, '\n');
if (!util->diff_offset)
util->diff_offset = buf.len;
strbuf_addch(&buf, ' ');
-   strbuf_addbuf(&buf, &line);
+   strbuf_addstr(&buf, line);
} else if (in_header) {
-   if (starts_with(line.buf, "Author: ")) {
-   strbuf_addbuf(&buf, &line);
+   if (starts_with(line, "Author: ")) {
+   strbuf_addstr(&buf, line);
strbuf_addstr(&buf, "\n\n");
-   } else if (starts_with(line.buf, "")) {
-   strbuf_rtrim(&line);
-   strbuf_addbuf(&buf, &line);
+   } else if (starts_with(line, "")) {
+   p = line + len - 2;
+ 

[PATCH v4 05/14] apply: only pass required data to find_name_*

2019-07-11 Thread Thomas Gummerer
Currently the 'find_name_*()' functions take 'struct apply_state' as
parameter, even though they only need the 'root' member from that
struct.

These functions are in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 48 
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/apply.c b/apply.c
index 1602fd5db0..3cd4e3d3b3 100644
--- a/apply.c
+++ b/apply.c
@@ -469,7 +469,7 @@ static char *squash_slash(char *name)
return name;
 }
 
-static char *find_name_gnu(struct apply_state *state,
+static char *find_name_gnu(struct strbuf *root,
   const char *line,
   int p_value)
 {
@@ -495,8 +495,8 @@ static char *find_name_gnu(struct apply_state *state,
}
 
strbuf_remove(&name, 0, cp - name.buf);
-   if (state->root.len)
-   strbuf_insert(&name, 0, state->root.buf, state->root.len);
+   if (root->len)
+   strbuf_insert(&name, 0, root->buf, root->len);
return squash_slash(strbuf_detach(&name, NULL));
 }
 
@@ -659,7 +659,7 @@ static size_t diff_timestamp_len(const char *line, size_t len)
return line + len - end;
 }
 
-static char *find_name_common(struct apply_state *state,
+static char *find_name_common(struct strbuf *root,
  const char *line,
  const char *def,
  int p_value,
@@ -702,30 +702,30 @@ static char *find_name_common(struct apply_state *state,
return squash_slash(xstrdup(def));
}
 
-   if (state->root.len) {
-   char *ret = xstrfmt("%s%.*s", state->root.buf, len, start);
+   if (root->len) {
+   char *ret = xstrfmt("%s%.*s", root->buf, len, start);
return squash_slash(ret);
}
 
return squash_slash(xmemdupz(start, len));
 }
 
-static char *find_name(struct apply_state *state,
+static char *find_name(struct strbuf *root,
   const char *line,
   char *def,
   int p_value,
   int terminate)
 {
if (*line == '"') {
-   char *name = find_name_gnu(state, line, p_value);
+   char *name = find_name_gnu(root, line, p_value);
if (name)
return name;
}
 
-   return find_name_common(state, line, def, p_value, NULL, terminate);
+   return find_name_common(root, line, def, p_value, NULL, terminate);
 }
 
-static char *find_name_traditional(struct apply_state *state,
+static char *find_name_traditional(struct strbuf *root,
   const char *line,
   char *def,
   int p_value)
@@ -734,7 +734,7 @@ static char *find_name_traditional(struct apply_state *state,
size_t date_len;
 
if (*line == '"') {
-   char *name = find_name_gnu(state, line, p_value);
+   char *name = find_name_gnu(root, line, p_value);
if (name)
return name;
}
@@ -742,10 +742,10 @@ static char *find_name_traditional(struct apply_state *state,
len = strchrnul(line, '\n') - line;
date_len = diff_timestamp_len(line, len);
if (!date_len)
-   return find_name_common(state, line, def, p_value, NULL, TERM_TAB);
+   return find_name_common(root, line, def, p_value, NULL, TERM_TAB);
len -= date_len;
 
-   return find_name_common(state, line, def, p_value, line + len, 0);
+   return find_name_common(root, line, def, p_value, line + len, 0);
 }
 
 /*
@@ -759,7 +759,7 @@ static int guess_p_value(struct apply_state *state, const char *nameline)
 
if (is_dev_null(nameline))
return -1;
-   name = find_name_traditional(state, nameline, NULL, 0);
+   name = find_name_traditional(&state->root, nameline, NULL, 0);
if (!name)
return -1;
cp = strchr(name, '/');
@@ -883,17 +883,17 @@ static int parse_traditional_patch(struct apply_state *state,
if (is_dev_null(first)) {
patch->is_new = 1;
patch->is_delete = 0;
-   name = find_name_traditional(state, second, NULL, 
state->p_value);
+   name = find_name_traditional(&state->root, second, NULL, 
state-&

[PATCH v4 04/14] apply: only pass required data to check_header_line

2019-07-11 Thread Thomas Gummerer
Currently the 'check_header_line()' function takes 'struct
apply_state' as parameter, even though it only needs the linenr from
that struct.

This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.
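A small sketch of what the narrowed interface buys us: with only a line number and the patch as inputs, the check can be exercised without constructing a whole 'struct apply_state'. The struct 'mini_patch' and the name 'check_extensions()' are hypothetical stand-ins, and plain fprintf() is used here in place of git's error() helper.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for the 'struct patch' fields used here. */
struct mini_patch {
	int is_new, is_delete;	/* -1 = unknown, 0 = false, 1 = true */
	int is_rename, is_copy;
	int extension_linenr;	/* first line specifying delete/new/rename/copy */
};

/* Only the current line number is passed in, not the whole apply state. */
static int check_extensions(int linenr, struct mini_patch *patch)
{
	int extensions;

	assert(patch != NULL);
	extensions = (patch->is_delete == 1) + (patch->is_new == 1) +
		     (patch->is_rename == 1) + (patch->is_copy == 1);
	if (extensions > 1) {
		fprintf(stderr, "error: inconsistent header lines %d and %d\n",
			patch->extension_linenr, linenr);
		return -1;
	}
	/* Remember where the first extension header was seen. */
	if (extensions && !patch->extension_linenr)
		patch->extension_linenr = linenr;
	return 0;
}
```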

Signed-off-by: Thomas Gummerer 
---
 apply.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/apply.c b/apply.c
index ac668e754d..1602fd5db0 100644
--- a/apply.c
+++ b/apply.c
@@ -1302,15 +1302,15 @@ static char *git_header_name(int p_value,
}
 }
 
-static int check_header_line(struct apply_state *state, struct patch *patch)
+static int check_header_line(int linenr, struct patch *patch)
 {
int extensions = (patch->is_delete == 1) + (patch->is_new == 1) +
 (patch->is_rename == 1) + (patch->is_copy == 1);
if (extensions > 1)
return error(_("inconsistent header lines %d and %d"),
-patch->extension_linenr, state->linenr);
+patch->extension_linenr, linenr);
if (extensions && !patch->extension_linenr)
-   patch->extension_linenr = state->linenr;
+   patch->extension_linenr = linenr;
return 0;
 }
 
@@ -1380,7 +1380,7 @@ static int parse_git_header(struct apply_state *state,
res = p->fn(state, line + oplen, patch);
if (res < 0)
return -1;
-   if (check_header_line(state, patch))
+   if (check_header_line(state->linenr, patch))
return -1;
if (res > 0)
return offset;
-- 
2.22.0.510.g264f2c817a



[PATCH v4 03/14] apply: only pass required data to git_header_name

2019-07-11 Thread Thomas Gummerer
Currently the 'git_header_name()' function takes 'struct apply_state'
as parameter, even though it only needs the p_value from that struct.

This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/apply.c b/apply.c
index fc7083fcbc..ac668e754d 100644
--- a/apply.c
+++ b/apply.c
@@ -1164,7 +1164,7 @@ static const char *skip_tree_prefix(int p_value,
  * creation or deletion of an empty file.  In any of these cases,
  * both sides are the same name under a/ and b/ respectively.
  */
-static char *git_header_name(struct apply_state *state,
+static char *git_header_name(int p_value,
 const char *line,
 int llen)
 {
@@ -1184,7 +1184,7 @@ static char *git_header_name(struct apply_state *state,
goto free_and_fail1;
 
/* strip the a/b prefix including trailing slash */
-   cp = skip_tree_prefix(state->p_value, first.buf, first.len);
+   cp = skip_tree_prefix(p_value, first.buf, first.len);
if (!cp)
goto free_and_fail1;
strbuf_remove(&first, 0, cp - first.buf);
@@ -1201,7 +1201,7 @@ static char *git_header_name(struct apply_state *state,
if (*second == '"') {
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail1;
-   cp = skip_tree_prefix(state->p_value, sp.buf, sp.len);
+   cp = skip_tree_prefix(p_value, sp.buf, sp.len);
if (!cp)
goto free_and_fail1;
/* They must match, otherwise ignore */
@@ -1212,7 +1212,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted second */
-   cp = skip_tree_prefix(state->p_value, second, line + llen - second);
+   cp = skip_tree_prefix(p_value, second, line + llen - second);
if (!cp)
goto free_and_fail1;
if (line + llen - cp != first.len ||
@@ -1227,7 +1227,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted first name */
-   name = skip_tree_prefix(state->p_value, line, llen);
+   name = skip_tree_prefix(p_value, line, llen);
if (!name)
return NULL;
 
@@ -1243,7 +1243,7 @@ static char *git_header_name(struct apply_state *state,
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail2;
 
-   np = skip_tree_prefix(state->p_value, sp.buf, sp.len);
+   np = skip_tree_prefix(p_value, sp.buf, sp.len);
if (!np)
goto free_and_fail2;
 
@@ -1287,7 +1287,7 @@ static char *git_header_name(struct apply_state *state,
 */
if (!name[len + 1])
return NULL; /* no postimage name */
-   second = skip_tree_prefix(state->p_value, name + len + 1,
+   second = skip_tree_prefix(p_value, name + len + 1,
  line_len - (len + 1));
if (!second)
return NULL;
@@ -1333,7 +1333,7 @@ static int parse_git_header(struct apply_state *state,
 * or removing or adding empty files), so we get
 * the default name from the header.
 */
-   patch->def_name = git_header_name(state, line, len);
+   patch->def_name = git_header_name(state->p_value, line, len);
if (patch->def_name && state->root.len) {
char *s = xstrfmt("%s%s", state->root.buf, patch->def_name);
free(patch->def_name);
-- 
2.22.0.510.g264f2c817a



[PATCH v4 02/14] apply: only pass required data to skip_tree_prefix

2019-07-11 Thread Thomas Gummerer
Currently the 'skip_tree_prefix()' function takes 'struct apply_state'
as parameter, even though it only needs the p_value from that struct.

This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.
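For readers who want to see the effect of the plain 'int p_value' parameter, here is a standalone sketch of a prefix-skipping helper in that style. The name 'skip_components()' is hypothetical, and the two return statements after the part of the loop visible in the hunk below are my assumption about how such a function would finish, so treat this as an illustration rather than a copy of apply.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Skip 'p_value' leading path components of 'line' (of length 'llen').
 * Returns a pointer into 'line', or NULL for absolute paths or paths
 * with too few components.
 */
static const char *skip_components(int p_value, const char *line, int llen)
{
	int nslash;
	int i;

	assert(line != NULL && llen >= 0);
	if (!p_value)
		return (llen && line[0] == '/') ? NULL : line;

	nslash = p_value;
	for (i = 0; i < llen; i++) {
		if (line[i] == '/' && --nslash <= 0)
			/* Assumption: a leading '/' is rejected. */
			return (i == 0) ? NULL : &line[i + 1];
	}
	return NULL;	/* Assumption: fewer than p_value components. */
}
```

With a signature like this, a caller only has to know the strip count (the equivalent of the '-p' value), which is what makes the function usable outside of the apply machinery.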

Signed-off-by: Thomas Gummerer 
---
 apply.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/apply.c b/apply.c
index 599cf8956f..fc7083fcbc 100644
--- a/apply.c
+++ b/apply.c
@@ -1137,17 +1137,17 @@ static int gitdiff_unrecognized(struct apply_state *state,
  * Skip p_value leading components from "line"; as we do not accept
  * absolute paths, return NULL in that case.
  */
-static const char *skip_tree_prefix(struct apply_state *state,
+static const char *skip_tree_prefix(int p_value,
const char *line,
int llen)
 {
int nslash;
int i;
 
-   if (!state->p_value)
+   if (!p_value)
return (llen && line[0] == '/') ? NULL : line;
 
-   nslash = state->p_value;
+   nslash = p_value;
for (i = 0; i < llen; i++) {
int ch = line[i];
if (ch == '/' && --nslash <= 0)
@@ -1184,7 +1184,7 @@ static char *git_header_name(struct apply_state *state,
goto free_and_fail1;
 
/* strip the a/b prefix including trailing slash */
-   cp = skip_tree_prefix(state, first.buf, first.len);
+   cp = skip_tree_prefix(state->p_value, first.buf, first.len);
if (!cp)
goto free_and_fail1;
strbuf_remove(&first, 0, cp - first.buf);
@@ -1201,7 +1201,7 @@ static char *git_header_name(struct apply_state *state,
if (*second == '"') {
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail1;
-   cp = skip_tree_prefix(state, sp.buf, sp.len);
+   cp = skip_tree_prefix(state->p_value, sp.buf, sp.len);
if (!cp)
goto free_and_fail1;
/* They must match, otherwise ignore */
@@ -1212,7 +1212,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted second */
-   cp = skip_tree_prefix(state, second, line + llen - second);
+   cp = skip_tree_prefix(state->p_value, second, line + llen - second);
if (!cp)
goto free_and_fail1;
if (line + llen - cp != first.len ||
@@ -1227,7 +1227,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted first name */
-   name = skip_tree_prefix(state, line, llen);
+   name = skip_tree_prefix(state->p_value, line, llen);
if (!name)
return NULL;
 
@@ -1243,7 +1243,7 @@ static char *git_header_name(struct apply_state *state,
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail2;
 
-   np = skip_tree_prefix(state, sp.buf, sp.len);
+   np = skip_tree_prefix(state->p_value, sp.buf, sp.len);
if (!np)
goto free_and_fail2;
 
@@ -1287,7 +1287,7 @@ static char *git_header_name(struct apply_state *state,
 */
if (!name[len + 1])
return NULL; /* no postimage name */
-   second = skip_tree_prefix(state, name + len + 1,
+   second = skip_tree_prefix(state->p_value, name + len + 1,
  line_len - (len + 1));
if (!second)
return NULL;
-- 
2.22.0.510.g264f2c817a



[PATCH v4 01/14] apply: replace marc.info link with public-inbox

2019-07-11 Thread Thomas Gummerer
public-inbox.org links include the whole message ID by default.  This
means the message can still be found even if the site goes away, which
is not the case with the marc.info link.  Replace the marc.info link
with a more future proof one.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/apply.c b/apply.c
index f15afa9f6a..599cf8956f 100644
--- a/apply.c
+++ b/apply.c
@@ -478,7 +478,7 @@ static char *find_name_gnu(struct apply_state *state,
 
/*
 * Proposed "new-style" GNU patch/diff format; see
-* http://marc.info/?l=git&m=112927316408690&w=2
+* https://public-inbox.org/git/7vll0wvb2a@assigned-by-dhcp.cox.net/
 */
if (unquote_c_style(&name, line, NULL)) {
strbuf_release(&name);
-- 
2.22.0.510.g264f2c817a



[PATCH v4 00/14] output improvements for git range-diff

2019-07-11 Thread Thomas Gummerer
Thanks Junio for the comment on the previous round [1].  This round
renames the struct we're using in apply.c to 'struct gitdiff_data',
and updates the commit message of 7/14 to reflect the new name of the
renamed function.

[1]: https://public-inbox.org/git/20190708163315.29912-1-t.gumme...@gmail.com/

Thomas Gummerer (14):
  apply: replace marc.info link with public-inbox
  apply: only pass required data to skip_tree_prefix
  apply: only pass required data to git_header_name
  apply: only pass required data to check_header_line
  apply: only pass required data to find_name_*
  apply: only pass required data to gitdiff_* functions
  apply: make parse_git_diff_header public
  range-diff: fix function parameter indentation
  range-diff: split lines manually
  range-diff: don't remove funcname from inner diff
  range-diff: suppress line count in outer diff
  range-diff: add section header instead of diff header
  range-diff: add filename to inner diff
  range-diff: add headers to the outer hunk header

 apply.c| 186 ++---
 apply.h|  48 +++
 diff.c |   5 +-
 diff.h |   1 +
 range-diff.c   | 124 +++
 t/t3206-range-diff.sh  | 124 ++-
 t/t3206/history.export |  84 ++-
 7 files changed, 409 insertions(+), 163 deletions(-)

Range-diff against v3:
 1:  ef2245edda =  1:  ef2245edda apply: replace marc.info link with public-inbox
 2:  94578fa45c =  2:  94578fa45c apply: only pass required data to skip_tree_prefix
 3:  988269a68e =  3:  988269a68e apply: only pass required data to git_header_name
 4:  a2c1ef3f5f =  4:  a2c1ef3f5f apply: only pass required data to check_header_line
 5:  0f4cfe21cb =  5:  0f4cfe21cb apply: only pass required data to find_name_*
 6:  07a271518d !  6:  42665e5295 apply: only pass required data to gitdiff_* functions
@@ -28,7 +28,7 @@
  #include "rerere.h"
  #include "apply.h"
  
-+struct parse_git_header_state {
++struct gitdiff_data {
 +  struct strbuf *root;
 +  int linenr;
 +  int p_value;
@@ -42,7 +42,7 @@
  }
  
 -static int gitdiff_hdrend(struct apply_state *state,
-+static int gitdiff_hdrend(struct parse_git_header_state *state,
++static int gitdiff_hdrend(struct gitdiff_data *state,
  const char *line,
  struct patch *patch)
  {
@@ -51,7 +51,7 @@
  #define DIFF_NEW_NAME 1
  
 -static int gitdiff_verify_name(struct apply_state *state,
-+static int gitdiff_verify_name(struct parse_git_header_state *state,
++static int gitdiff_verify_name(struct gitdiff_data *state,
   const char *line,
   int isnull,
   char **name,
@@ -77,7 +77,7 @@
  }
  
 -static int gitdiff_oldname(struct apply_state *state,
-+static int gitdiff_oldname(struct parse_git_header_state *state,
++static int gitdiff_oldname(struct gitdiff_data *state,
   const char *line,
   struct patch *patch)
  {
@@ -86,7 +86,7 @@
  }
  
 -static int gitdiff_newname(struct apply_state *state,
-+static int gitdiff_newname(struct parse_git_header_state *state,
++static int gitdiff_newname(struct gitdiff_data *state,
   const char *line,
   struct patch *patch)
  {
@@ -95,7 +95,7 @@
  }
  
 -static int gitdiff_oldmode(struct apply_state *state,
-+static int gitdiff_oldmode(struct parse_git_header_state *state,
++static int gitdiff_oldmode(struct gitdiff_data *state,
   const char *line,
   struct patch *patch)
  {
@@ -103,7 +103,7 @@
  }
  
 -static int gitdiff_newmode(struct apply_state *state,
-+static int gitdiff_newmode(struct parse_git_header_state *state,
++static int gitdiff_newmode(struct gitdiff_data *state,
   const char *line,
   struct patch *patch)
  {
@@ -111,7 +111,7 @@
  }
  
 -static int gitdiff_delete(struct apply_state *state,
-+static int gitdiff_delete(struct parse_git_header_state *state,
++static int gitdiff_delete(struct gitdiff_data *state,
  const char *line,
  struct patch *patch)
  {
@@ -120,7 +120,7 @@
  }
  
 -static int gitdiff_newfile(struct apply_state *state,
-+static int gitdiff_newfile(struct parse_git_header_state *state,
++static int gitdiff_newfile(struct gitdiff_data *state,
   const char *line,
   struct patch *patch)
  {
@@ -129,7 +129,7 @@
  }
  
 -static int gitdiff_copysrc(str

Re: [PATCH v3 07/14] apply: make parse_git_header public

2019-07-10 Thread Thomas Gummerer
On 07/09, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > Maybe it would be even better to name it 'struct gitdiff_data', as
> > it's really only used for those few functions?
> 
> Is it really the case where "these three are only used by the
> codepath you made public"?  If so, I agree that "gitdiff_data" is a
> perfectly good name for it.
> 
> I however had an impression that it is the opposite, i.e. "the
> codepath you made public only needs these three, but these three are
> used by other (still private) parts, too."  If this is the case,
> then "gitdiff_data" is a misnomer, if we were to embed an instance
> inside apply_state.

Yeah, that's correct.  What I meant was that since we're only using
this struct for the private 'gitdiff_*()' functions, which are called
from 'parse_git_diff_header()', 'struct gitdiff_data' would be a
better name than 'struct parse_git_diff_header_data'.

I do agree that it wouldn't be a good name if we were to embed it
inside 'struct apply_state', and as mentioned in the previous email
I'd have a hard time coming up with a good name if we were to do that.

> It seems that it is not a good idea to do such embedding, and if
> that is the case, "gitdiff_data" is fine for the three-field
> struct.

Yeah, I think that's the best way forward, thanks.

> Thanks.


Re: [PATCH v3 07/14] apply: make parse_git_header public

2019-07-09 Thread Thomas Gummerer
On 07/09, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > Make parse_git_header a "public" function in apply.h, so we can re-use
> > it in range-diff in a subsequent commit.

Eek, I just noticed that I forgot to update the name here.  This and
the Subject should say 'parse_git_diff_header()' now, instead of
parse_git_header of course.  Will fix that in the reroll.

> > Signed-off-by: Thomas Gummerer 
> > ---
> 
> Thanks for these refactoring patches on "apply" machinery in the
> early part of the series.  I noticed two small things, though.
> 
>  - The apply_state instance *does* represent a state and various
>fields get updated as we read and process the patch.  The smaller
>structure you invented, on the other hand, does not carry any
>"state" at all.  Even its "linenr" field does not get incremented
>as we read/process---you create a new copy to take a snapshot of
>the current state from apply_state.  parse_git_header_data may
>have been a name that reflects the nature of the structure
>better.

Yeah, I think that's better.  Will change, thanks!

Maybe it would be even better to name it 'struct gitdiff_data', as
it's really only used for those few functions?

>  - I wonder if it makes the concept clearer if you did not create a
>new instance outside the apply_state, but instead replaced the
>three fields in the apply_state with an instance of this new
>structure.  When you call an API function with shrunk interface,
>you'd pass a pointer to a field inside the apply_state instance,
>instead of copying three fields manually.

I had considered that.  However I struggled to come up with a name
that makes sense in both as an interface to 'parse_git_diff_header()',
and inside 'struct apply_state'.  'linenr' is not specific to parsing
git diff headers (or even parsing any type of diff header), but is
used all over the apply code.  So 'parse_git_header_data' doesn't make
sense as a name anymore (and gets complicated to explain to the
readers of the code I think).

At that point the name should also be _state again, because
we do update the linenr inside 'parse_git_diff_header()', just not
inside any of the 'gitdiff_*' functions, though that is only a minor
point.

So unless there's a good name for this struct that I couldn't think
of, I think it's better to pass in the variables separately to
'parse_git_diff_header()', and then pass the struct just to the
'gitdiff_*' functions, as it's done currently.
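To make the split described above concrete, here is a minimal sketch of the idea, not the code from the series.  The parser takes the three values separately and owns the line counter through a pointer, while the callbacks only see a read-only snapshot.  All names here ('root_buf', 'gitdiff_data', 'callback_sees_line', 'parse_header_lines') are hypothetical stand-ins, and refreshing the snapshot before every callback is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for git's 'struct strbuf'; only what the sketch needs. */
struct root_buf {
	const char *buf;
	size_t len;
};

/* Read-only data handed to the per-line callbacks (the 'gitdiff_data' idea). */
struct gitdiff_data {
	const struct root_buf *root;
	int linenr;
	int p_value;
};

/* Stand-in for a 'gitdiff_*' callback: it only reads the data. */
static int callback_sees_line(const struct gitdiff_data *data)
{
	return data->linenr;
}

/*
 * Stand-in for the header parser: it takes the three values separately,
 * advances the caller's line counter through the pointer, and refreshes
 * the snapshot before each callback.  Returns the line number seen by the
 * last callback, or -1 if there were no lines.
 */
static int parse_header_lines(const struct root_buf *root, int *linenr,
			      int p_value, int nr_lines)
{
	struct gitdiff_data data = { root, 0, p_value };
	int last_seen = -1;
	int i;

	assert(root && linenr);
	for (i = 0; i < nr_lines; i++) {
		data.linenr = *linenr;
		last_seen = callback_sees_line(&data);
		(*linenr)++;
	}
	return last_seen;
}
```

With this shape only the parser mutates anything, which is why a name ending in _data rather than _state fits the small struct.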

> But other than that, I think these patches are generally moving bits
> in the right direction.

Thanks for the review!

> I do not have strong opinions on the later part of the series on
> range-diff proper.
> 
> Thanks.


[PATCH v3 08/14] range-diff: fix function parameter indentation

2019-07-08 Thread Thomas Gummerer
Fix the indentation of the function parameters for a couple of
functions, to match the style in the rest of the file.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 48b0e1b4ce..9242b8975f 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -148,7 +148,7 @@ static int read_patches(const char *range, struct 
string_list *list)
 }
 
 static int patch_util_cmp(const void *dummy, const struct patch_util *a,
-const struct patch_util *b, const char *keydata)
+ const struct patch_util *b, const char *keydata)
 {
return strcmp(a->diff, keydata ? keydata : b->diff);
 }
@@ -373,7 +373,7 @@ static struct diff_filespec *get_filespec(const char *name, 
const char *p)
 }
 
 static void patch_diff(const char *a, const char *b,
- struct diff_options *diffopt)
+  struct diff_options *diffopt)
 {
diff_queue(&diff_queued_diff,
   get_filespec("a", a), get_filespec("b", b));
-- 
2.22.0.510.g264f2c817a



[PATCH v3 05/14] apply: only pass required data to find_name_*

2019-07-08 Thread Thomas Gummerer
Currently the 'find_name_*()' functions take 'struct apply_state' as
parameter, even though they only need the 'root' member from that
struct.

These functions are in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 48 
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/apply.c b/apply.c
index 1602fd5db0..3cd4e3d3b3 100644
--- a/apply.c
+++ b/apply.c
@@ -469,7 +469,7 @@ static char *squash_slash(char *name)
return name;
 }
 
-static char *find_name_gnu(struct apply_state *state,
+static char *find_name_gnu(struct strbuf *root,
   const char *line,
   int p_value)
 {
@@ -495,8 +495,8 @@ static char *find_name_gnu(struct apply_state *state,
}
 
strbuf_remove(&name, 0, cp - name.buf);
-   if (state->root.len)
-   strbuf_insert(&name, 0, state->root.buf, state->root.len);
+   if (root->len)
+   strbuf_insert(&name, 0, root->buf, root->len);
return squash_slash(strbuf_detach(&name, NULL));
 }
 
@@ -659,7 +659,7 @@ static size_t diff_timestamp_len(const char *line, size_t 
len)
return line + len - end;
 }
 
-static char *find_name_common(struct apply_state *state,
+static char *find_name_common(struct strbuf *root,
  const char *line,
  const char *def,
  int p_value,
@@ -702,30 +702,30 @@ static char *find_name_common(struct apply_state *state,
return squash_slash(xstrdup(def));
}
 
-   if (state->root.len) {
-   char *ret = xstrfmt("%s%.*s", state->root.buf, len, start);
+   if (root->len) {
+   char *ret = xstrfmt("%s%.*s", root->buf, len, start);
return squash_slash(ret);
}
 
return squash_slash(xmemdupz(start, len));
 }
 
-static char *find_name(struct apply_state *state,
+static char *find_name(struct strbuf *root,
   const char *line,
   char *def,
   int p_value,
   int terminate)
 {
if (*line == '"') {
-   char *name = find_name_gnu(state, line, p_value);
+   char *name = find_name_gnu(root, line, p_value);
if (name)
return name;
}
 
-   return find_name_common(state, line, def, p_value, NULL, terminate);
+   return find_name_common(root, line, def, p_value, NULL, terminate);
 }
 
-static char *find_name_traditional(struct apply_state *state,
+static char *find_name_traditional(struct strbuf *root,
   const char *line,
   char *def,
   int p_value)
@@ -734,7 +734,7 @@ static char *find_name_traditional(struct apply_state 
*state,
size_t date_len;
 
if (*line == '"') {
-   char *name = find_name_gnu(state, line, p_value);
+   char *name = find_name_gnu(root, line, p_value);
if (name)
return name;
}
@@ -742,10 +742,10 @@ static char *find_name_traditional(struct apply_state 
*state,
len = strchrnul(line, '\n') - line;
date_len = diff_timestamp_len(line, len);
if (!date_len)
-   return find_name_common(state, line, def, p_value, NULL, 
TERM_TAB);
+   return find_name_common(root, line, def, p_value, NULL, 
TERM_TAB);
len -= date_len;
 
-   return find_name_common(state, line, def, p_value, line + len, 0);
+   return find_name_common(root, line, def, p_value, line + len, 0);
 }
 
 /*
@@ -759,7 +759,7 @@ static int guess_p_value(struct apply_state *state, const 
char *nameline)
 
if (is_dev_null(nameline))
return -1;
-   name = find_name_traditional(state, nameline, NULL, 0);
+   name = find_name_traditional(&state->root, nameline, NULL, 0);
if (!name)
return -1;
cp = strchr(name, '/');
@@ -883,17 +883,17 @@ static int parse_traditional_patch(struct apply_state 
*state,
if (is_dev_null(first)) {
patch->is_new = 1;
patch->is_delete = 0;
-   name = find_name_traditional(state, second, NULL, 
state->p_value);
+   name = find_name_traditional(&state->root, second, NULL, 
state-&

[PATCH v3 07/14] apply: make parse_git_header public

2019-07-08 Thread Thomas Gummerer
Make parse_git_header a "public" function in apply.h, so we can re-use
it in range-diff in a subsequent commit.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 69 -
 apply.h | 48 +++
 2 files changed, 67 insertions(+), 50 deletions(-)

diff --git a/apply.c b/apply.c
index 468f1d3fee..32b5b072ee 100644
--- a/apply.c
+++ b/apply.c
@@ -207,40 +207,6 @@ struct fragment {
 #define BINARY_DELTA_DEFLATED  1
 #define BINARY_LITERAL_DEFLATED 2
 
-/*
- * This represents a "patch" to a file, both metainfo changes
- * such as creation/deletion, filemode and content changes represented
- * as a series of fragments.
- */
-struct patch {
-   char *new_name, *old_name, *def_name;
-   unsigned int old_mode, new_mode;
-   int is_new, is_delete;  /* -1 = unknown, 0 = false, 1 = true */
-   int rejected;
-   unsigned ws_rule;
-   int lines_added, lines_deleted;
-   int score;
-   int extension_linenr; /* first line specifying delete/new/rename/copy */
-   unsigned int is_toplevel_relative:1;
-   unsigned int inaccurate_eof:1;
-   unsigned int is_binary:1;
-   unsigned int is_copy:1;
-   unsigned int is_rename:1;
-   unsigned int recount:1;
-   unsigned int conflicted_threeway:1;
-   unsigned int direct_to_threeway:1;
-   unsigned int crlf_in_old:1;
-   struct fragment *fragments;
-   char *result;
-   size_t resultsize;
-   char old_oid_prefix[GIT_MAX_HEXSZ + 1];
-   char new_oid_prefix[GIT_MAX_HEXSZ + 1];
-   struct patch *next;
-
-   /* three-way fallback result */
-   struct object_id threeway_stage[3];
-};
-
 static void free_fragment_list(struct fragment *list)
 {
while (list) {
@@ -1320,12 +1286,13 @@ static int check_header_line(int linenr, struct patch 
*patch)
return 0;
 }
 
-/* Verify that we recognize the lines following a git header */
-static int parse_git_header(struct apply_state *state,
-   const char *line,
-   int len,
-   unsigned int size,
-   struct patch *patch)
+int parse_git_diff_header(struct strbuf *root,
+ int *linenr,
+ int p_value,
+ const char *line,
+ int len,
+ unsigned int size,
+ struct patch *patch)
 {
unsigned long offset;
struct parse_git_header_state parse_hdr_state;
@@ -1340,21 +1307,21 @@ static int parse_git_header(struct apply_state *state,
 * or removing or adding empty files), so we get
 * the default name from the header.
 */
-   patch->def_name = git_header_name(state->p_value, line, len);
-   if (patch->def_name && state->root.len) {
-   char *s = xstrfmt("%s%s", state->root.buf, patch->def_name);
+   patch->def_name = git_header_name(p_value, line, len);
+   if (patch->def_name && root->len) {
+   char *s = xstrfmt("%s%s", root->buf, patch->def_name);
free(patch->def_name);
patch->def_name = s;
}
 
line += len;
size -= len;
-   state->linenr++;
-   parse_hdr_state.root = &state->root;
-   parse_hdr_state.linenr = state->linenr;
-   parse_hdr_state.p_value = state->p_value;
+   (*linenr)++;
+   parse_hdr_state.root = root;
+   parse_hdr_state.linenr = *linenr;
+   parse_hdr_state.p_value = p_value;
 
-   for (offset = len ; size > 0 ; offset += len, size -= len, line += len, 
state->linenr++) {
+   for (offset = len ; size > 0 ; offset += len, size -= len, line += len, 
(*linenr)++) {
static const struct opentry {
const char *str;
int (*fn)(struct parse_git_header_state *, const char 
*, struct patch *);
@@ -1391,7 +1358,7 @@ static int parse_git_header(struct apply_state *state,
res = p->fn(&parse_hdr_state, line + oplen, patch);
if (res < 0)
return -1;
-   if (check_header_line(state->linenr, patch))
+   if (check_header_line(*linenr, patch))
return -1;
if (res > 0)
return offset;
@@ -1572,7 +1539,9 @@ static int find_header(struct apply_state *state,
 * or mode change, so we handle that specially
 */
if (!memcmp("diff --git ", line, 11)) {
-   int git_hdr_len = parse_git_header(state, line, len, 
size, patch);
+   int git_hdr_len

[PATCH v3 06/14] apply: only pass required data to gitdiff_* functions

2019-07-08 Thread Thomas Gummerer
Currently the 'gitdiff_*()' functions take 'struct apply_state' as
parameter, even though they only need the root, linenr and p_value
from that struct.

These functions are in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

As these functions are called in a loop using their function pointers,
each function needs to be passed all the parameters even if only one
of the functions actually needs it.  We therefore pass this data along
in a struct to avoid adding too many unused parameters to each
function and making the code very verbose in the process.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 59 ++---
 1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/apply.c b/apply.c
index 3cd4e3d3b3..468f1d3fee 100644
--- a/apply.c
+++ b/apply.c
@@ -22,6 +22,12 @@
 #include "rerere.h"
 #include "apply.h"
 
+struct parse_git_header_state {
+   struct strbuf *root;
+   int linenr;
+   int p_value;
+};
+
 static void git_apply_config(void)
 {
git_config_get_string_const("apply.whitespace", 
&apply_default_whitespace);
@@ -914,7 +920,7 @@ static int parse_traditional_patch(struct apply_state 
*state,
return 0;
 }
 
-static int gitdiff_hdrend(struct apply_state *state,
+static int gitdiff_hdrend(struct parse_git_header_state *state,
  const char *line,
  struct patch *patch)
 {
@@ -933,14 +939,14 @@ static int gitdiff_hdrend(struct apply_state *state,
 #define DIFF_OLD_NAME 0
 #define DIFF_NEW_NAME 1
 
-static int gitdiff_verify_name(struct apply_state *state,
+static int gitdiff_verify_name(struct parse_git_header_state *state,
   const char *line,
   int isnull,
   char **name,
   int side)
 {
if (!*name && !isnull) {
-   *name = find_name(&state->root, line, NULL, state->p_value, 
TERM_TAB);
+   *name = find_name(state->root, line, NULL, state->p_value, 
TERM_TAB);
return 0;
}
 
@@ -949,7 +955,7 @@ static int gitdiff_verify_name(struct apply_state *state,
if (isnull)
return error(_("git apply: bad git-diff - expected 
/dev/null, got %s on line %d"),
 *name, state->linenr);
-   another = find_name(&state->root, line, NULL, state->p_value, 
TERM_TAB);
+   another = find_name(state->root, line, NULL, state->p_value, 
TERM_TAB);
if (!another || strcmp(another, *name)) {
free(another);
return error((side == DIFF_NEW_NAME) ?
@@ -965,7 +971,7 @@ static int gitdiff_verify_name(struct apply_state *state,
return 0;
 }
 
-static int gitdiff_oldname(struct apply_state *state,
+static int gitdiff_oldname(struct parse_git_header_state *state,
   const char *line,
   struct patch *patch)
 {
@@ -974,7 +980,7 @@ static int gitdiff_oldname(struct apply_state *state,
   DIFF_OLD_NAME);
 }
 
-static int gitdiff_newname(struct apply_state *state,
+static int gitdiff_newname(struct parse_git_header_state *state,
   const char *line,
   struct patch *patch)
 {
@@ -992,21 +998,21 @@ static int parse_mode_line(const char *line, int linenr, 
unsigned int *mode)
return 0;
 }
 
-static int gitdiff_oldmode(struct apply_state *state,
+static int gitdiff_oldmode(struct parse_git_header_state *state,
   const char *line,
   struct patch *patch)
 {
return parse_mode_line(line, state->linenr, &patch->old_mode);
 }
 
-static int gitdiff_newmode(struct apply_state *state,
+static int gitdiff_newmode(struct parse_git_header_state *state,
   const char *line,
   struct patch *patch)
 {
return parse_mode_line(line, state->linenr, &patch->new_mode);
 }
 
-static int gitdiff_delete(struct apply_state *state,
+static int gitdiff_delete(struct parse_git_header_state *state,
  const char *line,
  struct patch *patch)
 {
@@ -1016,7 +1022,7 @@ static int gitdiff_delete(struct apply_state *state,
return gitdiff_oldmode(state, line, patch);
 }
 
-static int gitdiff_newfile(struct apply_state *state,
+static int gitdiff_newfile(s

[PATCH v3 11/14] range-diff: suppress line count in outer diff

2019-07-08 Thread Thomas Gummerer
The line count in the outer diff's hunk headers of a range diff is not
all that interesting.  It merely shows how far along the inner diffs
are on both sides.  That number is of no use for human readers, and
range-diffs are not meant to be machine readable.

In a subsequent commit we're going to add some more contextual
information such as the filename corresponding to the diff to the hunk
headers.  Remove the unnecessary information, and just keep the "@@"
to indicate that a new hunk of the outer diff is starting.

Signed-off-by: Thomas Gummerer 
---
 diff.c|  5 -
 diff.h|  1 +
 range-diff.c  |  1 +
 t/t3206-range-diff.sh | 16 
 4 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/diff.c b/diff.c
index ec5c095199..9c28ff0a92 100644
--- a/diff.c
+++ b/diff.c
@@ -1672,7 +1672,10 @@ static void emit_hunk_header(struct emit_callback 
*ecbdata,
if (ecbdata->opt->flags.dual_color_diffed_diffs)
strbuf_addstr(&msgbuf, reverse);
strbuf_addstr(&msgbuf, frag);
-   strbuf_add(&msgbuf, line, ep - line);
+   if (ecbdata->opt->flags.suppress_hunk_header_line_count)
+   strbuf_add(&msgbuf, atat, sizeof(atat));
+   else
+   strbuf_add(&msgbuf, line, ep - line);
strbuf_addstr(&msgbuf, reset);
 
/*
diff --git a/diff.h b/diff.h
index c9db9825bb..49913049f9 100644
--- a/diff.h
+++ b/diff.h
@@ -98,6 +98,7 @@ struct diff_flags {
unsigned stat_with_summary;
unsigned suppress_diff_headers;
unsigned dual_color_diffed_diffs;
+   unsigned suppress_hunk_header_line_count;
 };
 
 static inline void diff_flags_or(struct diff_flags *a,
diff --git a/range-diff.c b/range-diff.c
index a5202d8b6c..f4a90b33b8 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -486,6 +486,7 @@ int show_range_diff(const char *range1, const char *range2,
opts.output_format = DIFF_FORMAT_PATCH;
opts.flags.suppress_diff_headers = 1;
opts.flags.dual_color_diffed_diffs = dual_color;
+   opts.flags.suppress_hunk_header_line_count = 1;
opts.output_prefix = output_prefix_cb;
strbuf_addstr(&indent, "");
opts.output_prefix_data = &indent;
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index aebd4e3693..9f89af7178 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -99,7 +99,7 @@ test_expect_success 'changed commit' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@ -10,7 +10,7 @@
+   @@
  9
  10
 -11
@@ -109,7 +109,7 @@ test_expect_success 'changed commit' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@ -8,7 +8,7 @@
+   @@
 @@ A
  9
  10
@@ -158,7 +158,7 @@ test_expect_success 'changed commit with sm config' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@ -10,7 +10,7 @@
+   @@
  9
  10
 -11
@@ -168,7 +168,7 @@ test_expect_success 'changed commit with sm config' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@ -8,7 +8,7 @@
+   @@
 @@ A
  9
  10
@@ -191,7 +191,7 @@ test_expect_success 'changed message' '
sed s/Z/\ /g >expected <<-EOF &&
1:  4de457d = 1:  f686024 s/5/A/
2:  fccce22 ! 2:  4ab067d s/4/A/
-   @@ -2,6 +2,8 @@
+   @@
Z
Zs/4/A/
Z
@@ -210,7 +210,7 @@ test_expect_success 'dual-coloring' '
sed -e "s|^:||" >expect <<-\EOF &&
:1:  a4b = 1:  f686024 s/5/A/
:2:  f51d370 ! 2:  
4ab067d s/4/A/
-   :@@ -2,6 +2,8 @@
+   :@@
: 
: s/4/A/
: 
@@ -220,7 +220,7 @@ test_expect_success 'dual-coloring' '
:  --- a/file
:  +++ b/file
:3:  0559556 ! 3:  
b9cb956 s/11/B/
-   :@@ -10,7 +10,7 @@
+   :@@
:  9
:  10
: -11
@@ -230,7 +230,7 @@ test_expect_success 'dual-coloring' '
:  13
:  14
:4:  d966c5c ! 4:  
8add5f1 s/12/B/
-   :@@ -8,7 +8,7 @@
+   :@@
: @@ A
:  9
:  10
-- 
2.22.0.510.g264f2c817a
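
As an illustration of the effect of the new flag, the following standalone sketch reduces a hunk header to a bare "@@" when the line count is suppressed.  It is a simplification under stated assumptions: 'outer_hunk_header' is a hypothetical helper, it works on plain C strings instead of strbufs, and it ignores colouring entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the range part of a hunk header into 'out'.  With
 * 'suppress_line_count' set (or when no closing "@@" is found), only
 * "@@" is written; otherwise everything up to and including the closing
 * "@@" is copied.
 */
static void outer_hunk_header(char *out, size_t outsz, const char *line,
			      int suppress_line_count)
{
	const char *ep = NULL;

	assert(out && outsz >= 3);
	if (!strncmp(line, "@@ ", 3))
		ep = strstr(line + 3, "@@");

	if (suppress_line_count || !ep)
		snprintf(out, outsz, "@@");
	else
		snprintf(out, outsz, "%.*s", (int)(ep + 2 - line), line);
}
```

Calling it on "@@ -10,7 +10,7 @@ A" yields "@@" with the flag set and "@@ -10,7 +10,7 @@" without it, which mirrors the change in the test expectations above.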



[PATCH v3 12/14] range-diff: add section header instead of diff header

2019-07-08 Thread Thomas Gummerer
Currently range-diff keeps the diff header of the inner diff
intact (apart from stripping lines starting with index).  This diff
header is somewhat useful, especially when files get different
names in different ranges.

However there is no real need to keep the whole diff header for that.
The main reason we currently do that is probably because it is easy to
do.

Introduce a new range diff hunk header, that's enclosed by "##",
similar to how line numbers in diff hunks are enclosed by "@@", and
give human readable information of what exactly happened to the file,
including the file name.

This improves the readability of the range-diff by giving more concise
information to the users.  For example if a file was renamed in one
iteration, but not in another, the diff of the headers would be quite
noisy.  However the diff of a single line is concise and should be
easier to understand.

Additionally, this allows us to add these range diff section headers to
the outer diff's hunk headers using a custom userdiff pattern, which
should help make the range-diff more readable.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c   | 34 
 t/t3206-range-diff.sh  | 91 +++---
 t/t3206/history.export | 84 --
 3 files changed, 192 insertions(+), 17 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index f4a90b33b8..5f64380fe4 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -10,6 +10,7 @@
 #include "commit.h"
 #include "pretty.h"
 #include "userdiff.h"
+#include "apply.h"
 
 struct patch_util {
/* For the search for an exact match */
@@ -101,12 +102,35 @@ static int read_patches(const char *range, struct 
string_list *list)
}
 
if (starts_with(line, "diff --git")) {
+   struct patch patch = { 0 };
+   struct strbuf root = STRBUF_INIT;
+   int linenr = 0;
+
in_header = 0;
strbuf_addch(&buf, '\n');
if (!util->diff_offset)
util->diff_offset = buf.len;
-   strbuf_addch(&buf, ' ');
-   strbuf_addstr(&buf, line);
+   line[len - 1] = '\n';
+   len = parse_git_diff_header(&root, &linenr, 1, line,
+   len, size, &patch);
+   if (len < 0)
+   die(_("could not parse git header '%.*s'"), 
(int)len, line);
+   strbuf_addstr(&buf, " ## ");
+   if (patch.is_new > 0)
+   strbuf_addf(&buf, "%s (new)", patch.new_name);
+   else if (patch.is_delete > 0)
+   strbuf_addf(&buf, "%s (deleted)", 
patch.old_name);
+   else if (patch.is_rename)
+   strbuf_addf(&buf, "%s => %s", patch.old_name, 
patch.new_name);
+   else
+   strbuf_addstr(&buf, patch.new_name);
+
+   if (patch.new_mode && patch.old_mode &&
+   patch.old_mode != patch.new_mode)
+   strbuf_addf(&buf, " (mode change %06o => %06o)",
+   patch.old_mode, patch.new_mode);
+
+   strbuf_addstr(&buf, " ##");
} else if (in_header) {
if (starts_with(line, "Author: ")) {
strbuf_addstr(&buf, line);
@@ -122,17 +146,13 @@ static int read_patches(const char *range, struct 
string_list *list)
} else if (skip_prefix(line, "@@ ", &p)) {
p = strstr(p, "@@");
strbuf_addstr(&buf, p ? p : "@@");
-   } else if (!line[0] || starts_with(line, "index "))
+   } else if (!line[0])
/*
 * A completely blank (not ' \n', which is context)
 * line is not valid in a diff.  We skip it
 * silently, because this neatly handles the blank
 * separator line between commits in git-log
 * output.
-*
-* We also want to ignore the diff's `index` lines
-* because they contain exact blob hashes in which
-* we are not interested.
 */
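
For readers who want to see the section header format in isolation, here is a small sketch that follows the same branching as the hunk above.  It is not the range-diff code: 'struct patch_summary' is a hypothetical cut-down stand-in for 'struct patch', and it formats into fixed-size buffers with snprintf() rather than a strbuf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of the fields that the section header looks at. */
struct patch_summary {
	const char *old_name, *new_name;
	int is_new, is_delete, is_rename;
	unsigned int old_mode, new_mode;
};

/* Format " ## <what happened to the file> ##" into 'out'. */
static void format_section_header(char *out, size_t outsz,
				  const struct patch_summary *p)
{
	char what[256];
	char mode[64] = "";

	assert(out && p);
	if (p->is_new > 0)
		snprintf(what, sizeof(what), "%s (new)", p->new_name);
	else if (p->is_delete > 0)
		snprintf(what, sizeof(what), "%s (deleted)", p->old_name);
	else if (p->is_rename)
		snprintf(what, sizeof(what), "%s => %s", p->old_name, p->new_name);
	else
		snprintf(what, sizeof(what), "%s", p->new_name);

	/* Only mention modes when both are known and they differ. */
	if (p->new_mode && p->old_mode && p->old_mode != p->new_mode)
		snprintf(mode, sizeof(mode), " (mode change %06o => %06o)",
			 p->old_mode, p->new_mode);

	snprintf(out, outsz, " ## %s%s ##", what, mode);
}
```

A rename from "file" to "renamed-file" comes out as " ## file => renamed-file ##", a single line that diffs cleanly between iterations.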

[PATCH v3 09/14] range-diff: split lines manually

2019-07-08 Thread Thomas Gummerer
Currently range-diff uses the 'strbuf_getline()' function for doing
its line by line processing.  In a future patch we want to do parts of
that parsing using the 'parse_git_diff_header()' function.  That
function does its own line by line reading of the input, and doesn't
use strbufs.  This doesn't match with how we do the line-by-line
processing in range-diff currently.

Switch range-diff to do our own line by line parsing, so we can re-use
the 'parse_git_diff_header()' function later.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c | 68 
 1 file changed, 42 insertions(+), 26 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 9242b8975f..784fac301b 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -24,6 +24,17 @@ struct patch_util {
struct object_id oid;
 };
 
+static size_t find_end_of_line(char *buffer, unsigned long size)
+{
+   char *eol = memchr(buffer, '\n', size);
+
+   if (!eol)
+   return size;
+
+   *eol = '\0';
+   return eol + 1 - buffer;
+}
+
 /*
  * Reads the patches into a string list, with the `util` field being populated
  * as struct object_id (will need to be free()d).
@@ -31,10 +42,12 @@ struct patch_util {
 static int read_patches(const char *range, struct string_list *list)
 {
struct child_process cp = CHILD_PROCESS_INIT;
-   FILE *in;
-   struct strbuf buf = STRBUF_INIT, line = STRBUF_INIT;
+   struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
struct patch_util *util = NULL;
int in_header = 1;
+   char *line;
+   int offset, len;
+   size_t size;
 
argv_array_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
"--reverse", "--date-order", "--decorate=no",
@@ -54,17 +67,20 @@ static int read_patches(const char *range, struct 
string_list *list)
 
if (start_command(&cp))
return error_errno(_("could not start `log`"));
-   in = fdopen(cp.out, "r");
-   if (!in) {
+   if (strbuf_read(&contents, cp.out, 0) < 0) {
error_errno(_("could not read `log` output"));
finish_command(&cp);
return -1;
}
 
-   while (strbuf_getline(&line, in) != EOF) {
+   line = contents.buf;
+   size = contents.len;
+   for (offset = 0; size > 0; offset += len, size -= len, line += len) {
const char *p;
 
-   if (skip_prefix(line.buf, "commit ", &p)) {
+   len = find_end_of_line(line, size);
+   line[len - 1] = '\0';
+   if (skip_prefix(line, "commit ", &p)) {
if (util) {
string_list_append(list, buf.buf)->util = util;
strbuf_reset(&buf);
@@ -75,8 +91,7 @@ static int read_patches(const char *range, struct string_list 
*list)
free(util);
string_list_clear(list, 1);
strbuf_release(&buf);
-   strbuf_release(&line);
-   fclose(in);
+   strbuf_release(&contents);
finish_command(&cp);
return -1;
}
@@ -85,26 +100,28 @@ static int read_patches(const char *range, struct 
string_list *list)
continue;
}
 
-   if (starts_with(line.buf, "diff --git")) {
+   if (starts_with(line, "diff --git")) {
in_header = 0;
strbuf_addch(&buf, '\n');
if (!util->diff_offset)
util->diff_offset = buf.len;
strbuf_addch(&buf, ' ');
-   strbuf_addbuf(&buf, &line);
+   strbuf_addstr(&buf, line);
} else if (in_header) {
-   if (starts_with(line.buf, "Author: ")) {
-   strbuf_addbuf(&buf, &line);
+   if (starts_with(line, "Author: ")) {
+   strbuf_addstr(&buf, line);
strbuf_addstr(&buf, "\n\n");
-   } else if (starts_with(line.buf, "")) {
-   strbuf_rtrim(&line);
-   strbuf_addbuf(&buf, &line);
+   } else if (starts_with(line, "")) {
+   p = line + len - 2;
+ 
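
The manual line splitting can be tried outside of git with a sketch like the one below.  'find_end_of_line' has the same shape as the helper added in this patch; 'count_prefixed_lines' is a hypothetical driver written only for this illustration.  Unlike the loop in the patch, the sketch does not write to 'line[len - 1]' unconditionally, so a final line without a trailing newline is left untouched; that difference is a choice of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * NUL-terminate the first line in 'buffer' and return how many bytes it
 * occupies, including the terminator.  Without a newline, the whole
 * remainder is the line.
 */
static size_t find_end_of_line(char *buffer, size_t size)
{
	char *eol = memchr(buffer, '\n', size);

	if (!eol)
		return size;

	*eol = '\0';
	return eol + 1 - buffer;
}

/* Walk 'contents' line by line and count the lines starting with 'prefix'. */
static int count_prefixed_lines(char *contents, size_t size, const char *prefix)
{
	size_t prefix_len = strlen(prefix);
	char *line = contents;
	int count = 0;

	assert(contents && prefix);
	while (size > 0) {
		size_t len = find_end_of_line(line, size);
		/* Exclude the terminator written above, if there is one. */
		size_t text_len = line[len - 1] == '\0' ? len - 1 : len;

		if (text_len >= prefix_len && !memcmp(line, prefix, prefix_len))
			count++;
		line += len;
		size -= len;
	}
	return count;
}
```

Because the buffer is split in place, each line can be handed to a parser that expects a pointer and a length rather than a strbuf.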

[PATCH v3 14/14] range-diff: add headers to the outer hunk header

2019-07-08 Thread Thomas Gummerer
Add the section headers/hunk headers we introduced in the previous
commits to the outer diff's hunk headers.  This makes it easier to
understand which change we are actually looking at.  For example an
outer hunk header might now look like:

@@  Documentation/config/interactive.txt

while previously it would have only been

@@

which doesn't give a lot of context for the change that follows.

For completeness also add section headers for the commit metadata and
the commit message, although they are arguably less important.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c  |  9 ++---
 t/t3206-range-diff.sh | 41 ++---
 2 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 7a96a587f1..ba1e9a4265 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -139,8 +139,10 @@ static int read_patches(const char *range, struct 
string_list *list)
strbuf_addstr(&buf, " ##");
} else if (in_header) {
if (starts_with(line, "Author: ")) {
+   strbuf_addstr(&buf, " ## Metadata ##\n");
strbuf_addstr(&buf, line);
strbuf_addstr(&buf, "\n\n");
+   strbuf_addstr(&buf, " ## Commit message ##\n");
} else if (starts_with(line, "")) {
p = line + len - 2;
while (isspace(*p) && p >= line)
@@ -402,8 +404,9 @@ static void output_pair_header(struct diff_options *diffopt,
fwrite(buf->buf, buf->len, 1, diffopt->file);
 }
 
-static struct userdiff_driver no_func_name = {
-   .funcname = { "$^", 0 }
+static struct userdiff_driver section_headers = {
+   .funcname = { "^ ## (.*) ##$\n"
+ "^.?@@ (.*)$", REG_EXTENDED }
 };
 
 static struct diff_filespec *get_filespec(const char *name, const char *p)
@@ -415,7 +418,7 @@ static struct diff_filespec *get_filespec(const char *name, 
const char *p)
spec->size = strlen(p);
spec->should_munmap = 0;
spec->is_stdin = 1;
-   spec->driver = &no_func_name;
+   spec->driver = &section_headers;
 
return spec;
 }
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index d4de270979..ec548654ce 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -99,7 +99,7 @@ test_expect_success 'changed commit' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@
+   @@ file: A
  9
  10
 -11
@@ -109,7 +109,7 @@ test_expect_success 'changed commit' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@
+   @@ file
 @@ file: A
  9
  10
@@ -158,7 +158,7 @@ test_expect_success 'changed commit with sm config' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@
+   @@ file: A
  9
  10
 -11
@@ -168,7 +168,7 @@ test_expect_success 'changed commit with sm config' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@
+   @@ file
 @@ file: A
  9
  10
@@ -186,9 +186,10 @@ test_expect_success 'renamed file' '
sed s/Z/\ /g >expected <<-EOF &&
1:  4de457d = 1:  f258d75 s/5/A/
2:  fccce22 ! 2:  017b62d s/4/A/
-   @@
+   @@ Metadata
ZAuthor: Thomas Rast 
Z
+   Z ## Commit message ##
-s/4/A/
+s/4/A/ + rename file
Z
@@ -198,8 +199,8 @@ test_expect_success 'renamed file' '
Z 1
Z 2
3:  147e64e ! 3:  3ce7af6 s/11/B/
-   @@
-   Z
+   @@ Metadata
+   Z ## Commit message ##
Zs/11/B/
Z
- ## file ##
@@ -210,8 +211,8 @@ test_expect_success 'renamed file' '
Z 9
Z 10
4:  a63e992 ! 4:  1e6226b s/12/B/
-   @@
-   Z
+   @@ Metadata
+   Z ## Commit message ##
Zs/12/B/
Z
- ## file ##
@@ -230,30 +231,32 @@ test_expect_success 'file added and later removed' '
sed s/Z/\ /g >expected <<-EOF &&
1:  4de457d = 1:  096b1ba s/5/A/
2:  fccce22 ! 2:  d92e698 s/4/A/
-   @@
+   @@ Metadata
ZAuthor: Thomas Rast 
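
The two funcname patterns from this patch can be exercised with plain POSIX regcomp()/regexec(), as in the sketch below.  It assumes that the newline in the funcname string separates two alternative patterns, which is how I understand userdiff to treat it, and so it compiles them one at a time.  'match_section_title' is a hypothetical helper, not part of git.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/*
 * Try both patterns against 'line'.  On a match, copy the first capture
 * group into 'out' and return 1; return 0 on no match and -1 if a
 * pattern fails to compile.
 */
static int match_section_title(const char *line, char *out, size_t outsz)
{
	static const char *patterns[] = { "^ ## (.*) ##$", "^.?@@ (.*)$" };
	size_t i;

	assert(line && out && outsz > 0);
	for (i = 0; i < sizeof(patterns) / sizeof(patterns[0]); i++) {
		regex_t re;
		regmatch_t m[2];
		int hit;

		if (regcomp(&re, patterns[i], REG_EXTENDED))
			return -1;
		hit = !regexec(&re, line, 2, m, 0);
		regfree(&re);
		if (hit) {
			snprintf(out, outsz, "%.*s",
				 (int)(m[1].rm_eo - m[1].rm_so),
				 line + m[1].rm_so);
			return 1;
		}
	}
	return 0;
}
```

A line such as " ## Commit message ##" gives "Commit message", and " @@ file: A" gives "file: A", which are the strings that end up after the outer "@@" in the test expectations.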
 

[PATCH v3 13/14] range-diff: add filename to inner diff

2019-07-08 Thread Thomas Gummerer
In a range-diff it's not always clear which file a certain funcname of
the inner diff belongs to, because the diff header (or section header
as added in a previous commit) is not always visible in the
range-diff.

Add the filename to the inner diff's header, so it's always visible to
users.

This also allows us to add the filename + the funcname to the outer
diff's hunk headers using a custom userdiff pattern, which will be done
in the next commit.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c  | 15 +--
 t/t3206-range-diff.sh | 16 ++--
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 5f64380fe4..7a96a587f1 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -46,7 +46,7 @@ static int read_patches(const char *range, struct string_list 
*list)
struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
struct patch_util *util = NULL;
int in_header = 1;
-   char *line;
+   char *line, *current_filename = NULL;
int offset, len;
size_t size;
 
@@ -125,6 +125,12 @@ static int read_patches(const char *range, struct 
string_list *list)
else
strbuf_addstr(&buf, patch.new_name);
 
+   free(current_filename);
+   if (patch.is_delete > 0)
+   current_filename = xstrdup(patch.old_name);
+   else
+   current_filename = xstrdup(patch.new_name);
+
if (patch.new_mode && patch.old_mode &&
patch.old_mode != patch.new_mode)
strbuf_addf(&buf, " (mode change %06o => %06o)",
@@ -145,7 +151,11 @@ static int read_patches(const char *range, struct 
string_list *list)
continue;
} else if (skip_prefix(line, "@@ ", &p)) {
p = strstr(p, "@@");
-   strbuf_addstr(&buf, p ? p : "@@");
+   strbuf_addstr(&buf, "@@");
+   if (current_filename && p[2])
+   strbuf_addf(&buf, " %s:", current_filename);
+   if (p)
+   strbuf_addstr(&buf, p + 2);
} else if (!line[0])
/*
 * A completely blank (not ' \n', which is context)
@@ -177,6 +187,7 @@ static int read_patches(const char *range, struct 
string_list *list)
if (util)
string_list_append(list, buf.buf)->util = util;
strbuf_release(&buf);
+   free(current_filename);
 
if (finish_command(&cp))
return -1;
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index c277756057..d4de270979 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -110,7 +110,7 @@ test_expect_success 'changed commit' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@
-@@ A
+@@ file: A
  9
  10
- B
@@ -169,7 +169,7 @@ test_expect_success 'changed commit with sm config' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@
-@@ A
+@@ file: A
  9
  10
- B
@@ -203,20 +203,24 @@ test_expect_success 'renamed file' '
Zs/11/B/
Z
- ## file ##
+   -@@ file: A
+ ## renamed-file ##
-   Z@@ A
+   +@@ renamed-file: A
Z 8
Z 9
+   Z 10
4:  a63e992 ! 4:  1e6226b s/12/B/
@@
Z
Zs/12/B/
Z
- ## file ##
+   -@@ file: A
+ ## renamed-file ##
-   Z@@ A
+   +@@ renamed-file: A
Z 9
Z 10
+   Z B
EOF
test_cmp expected actual
 '
@@ -248,7 +252,7 @@ test_expect_success 'file added and later removed' '
+s/11/B/ + remove file
Z
Z ## file ##
-   Z@@ A
+   Z@@ file: A
@@
Z 12
Z 13
@@ -310,7 +314,7 @@ test_expect_success 'dual-coloring' '
:  14
:4:  d966c5c ! 4:  
8add5f1 s/12/B/
:@@
-   : @@ A
+   : @@ file: A
:  9
:  10
:- BB
-- 
2.22.0.510.g264f2c817a
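
The rewriting of the inner hunk header can be sketched on its own as follows.  This is an illustration under assumptions rather than the code above: 'inner_hunk_header' is a hypothetical helper that works on C strings, and it checks for a missing closing "@@" before looking at what follows it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Turn "@@ -a,b +c,d @@ funcname" into "@@ filename: funcname".  The
 * line numbers are dropped; the filename is only added when there is a
 * funcname to attach it to.
 */
static void inner_hunk_header(char *out, size_t outsz, const char *line,
			      const char *filename)
{
	const char *p = NULL;
	const char *funcname;

	assert(out && line);
	if (!strncmp(line, "@@ ", 3))
		p = strstr(line + 3, "@@");
	/* Everything after the closing "@@", including its leading space. */
	funcname = p ? p + 2 : "";

	if (filename && *funcname)
		snprintf(out, outsz, "@@ %s:%s", filename, funcname);
	else
		snprintf(out, outsz, "@@%s", funcname);
}
```

For "@@ -8,7 +8,7 @@ A" and the filename "file" this gives "@@ file: A", matching the updated test expectations; a header without a funcname stays "@@".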



[PATCH v3 10/14] range-diff: don't remove funcname from inner diff

2019-07-08 Thread Thomas Gummerer
When postprocessing the inner diff in range-diff, we currently replace
the whole hunk header line with just "@@".  This matches how 'git
tbdiff' used to handle hunk headers as well.

Most likely this is being done because line numbers in the hunk header
are not relevant without other changes.  They can for example easily
change if a range is rebased, and lines are added/removed before a
change that we actually care about in our ranges.

However it can still be useful to have the function name that 'git
diff' extracts as additional context for the change.

Note that it is not guaranteed that the hunk header actually shows up
in the range-diff, and this change only aims to improve the case where
a hunk header would already be included in the final output.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c  | 7 ---
 t/t3206-range-diff.sh | 6 +++---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 784fac301b..a5202d8b6c 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -119,9 +119,10 @@ static int read_patches(const char *range, struct string_list *list)
strbuf_addch(&buf, '\n');
}
continue;
-   } else if (starts_with(line, "@@ "))
-   strbuf_addstr(&buf, "@@");
-   else if (!line[0] || starts_with(line, "index "))
+   } else if (skip_prefix(line, "@@ ", &p)) {
+   p = strstr(p, "@@");
+   strbuf_addstr(&buf, p ? p : "@@");
+   } else if (!line[0] || starts_with(line, "index "))
/*
 * A completely blank (not ' \n', which is context)
 * line is not valid in a diff.  We skip it
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 048feaf6dd..aebd4e3693 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -110,7 +110,7 @@ test_expect_success 'changed commit' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@ -8,7 +8,7 @@
-@@
+@@ A
  9
  10
- B
@@ -169,7 +169,7 @@ test_expect_success 'changed commit with sm config' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@ -8,7 +8,7 @@
-@@
+@@ A
  9
  10
- B
@@ -231,7 +231,7 @@ test_expect_success 'dual-coloring' '
:  14
:4:  d966c5c ! 4:  8add5f1 s/12/B/
:@@ -8,7 +8,7 @@
-   : @@
+   : @@ A
:  9
:  10
:- BB
-- 
2.22.0.510.g264f2c817a
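
The hunk header handling in the patch above can be sketched in isolation as follows.  This is only a minimal illustration using plain C strings rather than git's strbuf API; the helper name 'reduce_hunk_header' and the fixed-size output buffer are assumptions made for this example and are not part of range-diff.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): reduce a unified-diff hunk
 * header such as "@@ -8,7 +8,7 @@ A" to "@@ A".  The line numbers are
 * dropped and the trailing function name is kept.  If no closing "@@"
 * is found, fall back to a bare "@@", as the v3 patch does.
 */
static void reduce_hunk_header(const char *line, char *out, size_t outsz)
{
	const char *p = NULL;

	/* Only look for the closing "@@" after the leading "@@ " prefix. */
	if (strncmp(line, "@@ ", 3) == 0)
		p = strstr(line + 3, "@@");

	snprintf(out, outsz, "%s", p ? p : "@@");
}
```

Falling back to a bare "@@" keeps the parsing no stricter than it was before the change, which is what the v3 cover letter asks for.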



[PATCH v3 03/14] apply: only pass required data to git_header_name

2019-07-08 Thread Thomas Gummerer
Currently the 'git_header_name()' function takes 'struct apply_state'
as parameter, even though it only needs the p_value from that struct.

This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/apply.c b/apply.c
index fc7083fcbc..ac668e754d 100644
--- a/apply.c
+++ b/apply.c
@@ -1164,7 +1164,7 @@ static const char *skip_tree_prefix(int p_value,
  * creation or deletion of an empty file.  In any of these cases,
  * both sides are the same name under a/ and b/ respectively.
  */
-static char *git_header_name(struct apply_state *state,
+static char *git_header_name(int p_value,
 const char *line,
 int llen)
 {
@@ -1184,7 +1184,7 @@ static char *git_header_name(struct apply_state *state,
goto free_and_fail1;
 
/* strip the a/b prefix including trailing slash */
-   cp = skip_tree_prefix(state->p_value, first.buf, first.len);
+   cp = skip_tree_prefix(p_value, first.buf, first.len);
if (!cp)
goto free_and_fail1;
strbuf_remove(&first, 0, cp - first.buf);
@@ -1201,7 +1201,7 @@ static char *git_header_name(struct apply_state *state,
if (*second == '"') {
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail1;
-   cp = skip_tree_prefix(state->p_value, sp.buf, sp.len);
+   cp = skip_tree_prefix(p_value, sp.buf, sp.len);
if (!cp)
goto free_and_fail1;
/* They must match, otherwise ignore */
@@ -1212,7 +1212,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted second */
-   cp = skip_tree_prefix(state->p_value, second, line + llen - second);
+   cp = skip_tree_prefix(p_value, second, line + llen - second);
if (!cp)
goto free_and_fail1;
if (line + llen - cp != first.len ||
@@ -1227,7 +1227,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted first name */
-   name = skip_tree_prefix(state->p_value, line, llen);
+   name = skip_tree_prefix(p_value, line, llen);
if (!name)
return NULL;
 
@@ -1243,7 +1243,7 @@ static char *git_header_name(struct apply_state *state,
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail2;
 
-   np = skip_tree_prefix(state->p_value, sp.buf, sp.len);
+   np = skip_tree_prefix(p_value, sp.buf, sp.len);
if (!np)
goto free_and_fail2;
 
@@ -1287,7 +1287,7 @@ static char *git_header_name(struct apply_state *state,
 */
if (!name[len + 1])
return NULL; /* no postimage name */
-   second = skip_tree_prefix(state->p_value, name + len + 1,
+   second = skip_tree_prefix(p_value, name + len + 1,
  line_len - (len + 1));
if (!second)
return NULL;
@@ -1333,7 +1333,7 @@ static int parse_git_header(struct apply_state *state,
 * or removing or adding empty files), so we get
 * the default name from the header.
 */
-   patch->def_name = git_header_name(state, line, len);
+   patch->def_name = git_header_name(state->p_value, line, len);
if (patch->def_name && state->root.len) {
char *s = xstrfmt("%s%s", state->root.buf, patch->def_name);
free(patch->def_name);
-- 
2.22.0.510.g264f2c817a



[PATCH v3 04/14] apply: only pass required data to check_header_line

2019-07-08 Thread Thomas Gummerer
Currently the 'check_header_line()' function takes 'struct
apply_state' as parameter, even though it only needs the linenr from
that struct.

This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/apply.c b/apply.c
index ac668e754d..1602fd5db0 100644
--- a/apply.c
+++ b/apply.c
@@ -1302,15 +1302,15 @@ static char *git_header_name(int p_value,
}
 }
 
-static int check_header_line(struct apply_state *state, struct patch *patch)
+static int check_header_line(int linenr, struct patch *patch)
 {
int extensions = (patch->is_delete == 1) + (patch->is_new == 1) +
 (patch->is_rename == 1) + (patch->is_copy == 1);
if (extensions > 1)
return error(_("inconsistent header lines %d and %d"),
-patch->extension_linenr, state->linenr);
+patch->extension_linenr, linenr);
if (extensions && !patch->extension_linenr)
-   patch->extension_linenr = state->linenr;
+   patch->extension_linenr = linenr;
return 0;
 }
 
@@ -1380,7 +1380,7 @@ static int parse_git_header(struct apply_state *state,
res = p->fn(state, line + oplen, patch);
if (res < 0)
return -1;
-   if (check_header_line(state, patch))
+   if (check_header_line(state->linenr, patch))
return -1;
if (res > 0)
return offset;
-- 
2.22.0.510.g264f2c817a
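
To illustrate the effect of the narrowed signature, here is a standalone sketch of 'check_header_line()' that compiles without any of apply.c.  The 'struct patch_flags' type is a hypothetical stand-in for the handful of 'struct patch' members the function touches, and the error reporting uses fprintf() instead of git's error() helper; both are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical stand-in for the 'struct patch' members used here. */
struct patch_flags {
	int is_delete, is_new, is_rename, is_copy;
	int extension_linenr;
};

/*
 * Takes only the current line number, not a whole apply state.
 * Returns -1 if more than one extended header kind was seen,
 * 0 otherwise.  Records the line of the first extended header.
 */
static int check_header_line(int linenr, struct patch_flags *patch)
{
	int extensions = (patch->is_delete == 1) + (patch->is_new == 1) +
			 (patch->is_rename == 1) + (patch->is_copy == 1);

	if (extensions > 1) {
		fprintf(stderr, "inconsistent header lines %d and %d\n",
			patch->extension_linenr, linenr);
		return -1;
	}
	if (extensions && !patch->extension_linenr)
		patch->extension_linenr = linenr;
	return 0;
}
```

Because the function now depends on an int and a patch only, a caller that has no 'struct apply_state' at all can still use it.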



[PATCH v3 02/14] apply: only pass required data to skip_tree_prefix

2019-07-08 Thread Thomas Gummerer
Currently the 'skip_tree_prefix()' function takes 'struct apply_state'
as parameter, even though it only needs the p_value from that struct.

This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/apply.c b/apply.c
index 599cf8956f..fc7083fcbc 100644
--- a/apply.c
+++ b/apply.c
@@ -1137,17 +1137,17 @@ static int gitdiff_unrecognized(struct apply_state *state,
  * Skip p_value leading components from "line"; as we do not accept
  * absolute paths, return NULL in that case.
  */
-static const char *skip_tree_prefix(struct apply_state *state,
+static const char *skip_tree_prefix(int p_value,
const char *line,
int llen)
 {
int nslash;
int i;
 
-   if (!state->p_value)
+   if (!p_value)
return (llen && line[0] == '/') ? NULL : line;
 
-   nslash = state->p_value;
+   nslash = p_value;
for (i = 0; i < llen; i++) {
int ch = line[i];
if (ch == '/' && --nslash <= 0)
@@ -1184,7 +1184,7 @@ static char *git_header_name(struct apply_state *state,
goto free_and_fail1;
 
/* strip the a/b prefix including trailing slash */
-   cp = skip_tree_prefix(state, first.buf, first.len);
+   cp = skip_tree_prefix(state->p_value, first.buf, first.len);
if (!cp)
goto free_and_fail1;
strbuf_remove(&first, 0, cp - first.buf);
@@ -1201,7 +1201,7 @@ static char *git_header_name(struct apply_state *state,
if (*second == '"') {
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail1;
-   cp = skip_tree_prefix(state, sp.buf, sp.len);
+   cp = skip_tree_prefix(state->p_value, sp.buf, sp.len);
if (!cp)
goto free_and_fail1;
/* They must match, otherwise ignore */
@@ -1212,7 +1212,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted second */
-   cp = skip_tree_prefix(state, second, line + llen - second);
+   cp = skip_tree_prefix(state->p_value, second, line + llen - second);
if (!cp)
goto free_and_fail1;
if (line + llen - cp != first.len ||
@@ -1227,7 +1227,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted first name */
-   name = skip_tree_prefix(state, line, llen);
+   name = skip_tree_prefix(state->p_value, line, llen);
if (!name)
return NULL;
 
@@ -1243,7 +1243,7 @@ static char *git_header_name(struct apply_state *state,
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail2;
 
-   np = skip_tree_prefix(state, sp.buf, sp.len);
+   np = skip_tree_prefix(state->p_value, sp.buf, sp.len);
if (!np)
goto free_and_fail2;
 
@@ -1287,7 +1287,7 @@ static char *git_header_name(struct apply_state *state,
 */
if (!name[len + 1])
return NULL; /* no postimage name */
-   second = skip_tree_prefix(state, name + len + 1,
+   second = skip_tree_prefix(state->p_value, name + len + 1,
  line_len - (len + 1));
if (!second)
return NULL;
-- 
2.22.0.510.g264f2c817a
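
As a rough illustration of what the function looks like once it only takes 'p_value', here is a standalone sketch.  The first half follows the hunk shown in the diff above; the two return statements at the end of the loop are not visible in the diff context and are filled in here as an assumption about how the function finishes.

```c
#include <stddef.h>

/*
 * Skip p_value leading path components from "line".  Absolute paths
 * are not accepted, so NULL is returned in that case, and also when
 * the line has fewer than p_value components.
 */
static const char *skip_tree_prefix(int p_value, const char *line, int llen)
{
	int nslash;
	int i;

	if (!p_value)
		return (llen && line[0] == '/') ? NULL : line;

	nslash = p_value;
	for (i = 0; i < llen; i++) {
		int ch = line[i];
		if (ch == '/' && --nslash <= 0)
			/* Assumed tail: reject a leading '/', else skip past it. */
			return (i == 0) ? NULL : &line[i + 1];
	}
	return NULL;
}
```

With p_value set to 1, "a/file" yields "file", which is the usual stripping of the a/ and b/ prefixes in git diffs.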



[PATCH v3 01/14] apply: replace marc.info link with public-inbox

2019-07-08 Thread Thomas Gummerer
public-inbox.org links include the whole message ID by default.  This
means the message can still be found even if the site goes away, which
is not the case with the marc.info link.  Replace the marc.info link
with a more future-proof one.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/apply.c b/apply.c
index f15afa9f6a..599cf8956f 100644
--- a/apply.c
+++ b/apply.c
@@ -478,7 +478,7 @@ static char *find_name_gnu(struct apply_state *state,
 
/*
 * Proposed "new-style" GNU patch/diff format; see
-* http://marc.info/?l=git&m=112927316408690&w=2
+* https://public-inbox.org/git/7vll0wvb2a@assigned-by-dhcp.cox.net/
 */
if (unquote_c_style(&name, line, NULL)) {
strbuf_release(&name);
-- 
2.22.0.510.g264f2c817a



[PATCH v3 00/14] output improvements for git range-diff

2019-07-08 Thread Thomas Gummerer
Thanks Dscho for the review of the previous round [1].  This round
addresses all the comments from that review.

In particular

- update commit messages where necessary.
- rename the function in apply.c to 'parse_git_diff_header()'
- code cleanups in 09/14
- fix a memory leak introduced in 09/14
- be less strict about parsing hunk headers, so the new code isn't
  more strict than it was before
- give more information when we are unable to parse the git diff
  header

[1]: https://public-inbox.org/git/20190705170630.27500-1-t.gumme...@gmail.com/

Thomas Gummerer (14):
  apply: replace marc.info link with public-inbox
  apply: only pass required data to skip_tree_prefix
  apply: only pass required data to git_header_name
  apply: only pass required data to check_header_line
  apply: only pass required data to find_name_*
  apply: only pass required data to gitdiff_* functions
  apply: make parse_git_header public
  range-diff: fix function parameter indentation
  range-diff: split lines manually
  range-diff: don't remove funcname from inner diff
  range-diff: suppress line count in outer diff
  range-diff: add section header instead of diff header
  range-diff: add filename to inner diff
  range-diff: add headers to the outer hunk header

 apply.c| 186 ++---
 apply.h|  48 +++
 diff.c |   5 +-
 diff.h |   1 +
 range-diff.c   | 124 +++
 t/t3206-range-diff.sh  | 124 ++-
 t/t3206/history.export |  84 ++-
 7 files changed, 409 insertions(+), 163 deletions(-)

Range-diff against v2:
 1:  ef2245edda =  1:  ef2245edda apply: replace marc.info link with public-inbox
 2:  94578fa45c =  2:  94578fa45c apply: only pass required data to skip_tree_prefix
 3:  988269a68e =  3:  988269a68e apply: only pass required data to git_header_name
 4:  a2c1ef3f5f =  4:  a2c1ef3f5f apply: only pass required data to check_header_line
 5:  0f4cfe21cb =  5:  0f4cfe21cb apply: only pass required data to find_name_*
 6:  7f1d7a4569 !  6:  07a271518d apply: only pass required data to gitdiff_* functions
@@ Commit message
 we want functions in the callchain of 'parse_git_header()' to only
 take arguments they really need.
 
+As these functions are called in a loop using their function pointers,
+each function needs to be passed all the parameters even if only one
+of the functions actually needs it.  We therefore pass this data along
+in a struct to avoid adding too many unused parameters to each
+function and making the code very verbose in the process.
+
 Signed-off-by: Thomas Gummerer 
 
  ## apply.c ##
 7:  a5af8b0845 !  7:  9cb6732a5f apply: make parse_git_header public
@@ apply.c: struct fragment {
  {
while (list) {
 @@ apply.c: static int check_header_line(int linenr, struct patch *patch)
+   return 0;
  }
  
- /* Verify that we recognize the lines following a git header */
+-/* Verify that we recognize the lines following a git header */
 -static int parse_git_header(struct apply_state *state,
 -  const char *line,
 -  int len,
 -  unsigned int size,
 -  struct patch *patch)
-+int parse_git_header(struct strbuf *root,
-+   int *linenr,
-+   int p_value,
-+   const char *line,
-+   int len,
-+   unsigned int size,
-+   struct patch *patch)
++int parse_git_diff_header(struct strbuf *root,
++int *linenr,
++int p_value,
++const char *line,
++int len,
++unsigned int size,
++struct patch *patch)
  {
unsigned long offset;
struct parse_git_header_state parse_hdr_state;
@@ apply.c: static int find_header(struct apply_state *state,
 */
if (!memcmp("diff --git ", line, 11)) {
 -  int git_hdr_len = parse_git_header(state, line, len, size, patch);
-+  int git_hdr_len = parse_git_header(&state->root, &state->linenr,
-+ state->p_value, line, len,
-+ size, patch);
++  int git_hdr_len = parse_git_diff_header(&state->root, &state->linenr,
++  state->p_value, line, len,
++  size, patch);
if (git_hdr_len < 0)
ret

Re: [PATCH v2 12/14] range-diff: add section header instead of diff header

2019-07-08 Thread Thomas Gummerer
>expected <<-EOF &&
> > +   1:  4de457d = 1:  f258d75 s/5/A/
> > +   2:  fccce22 ! 2:  017b62d s/4/A/
> > +   @@
> > +   ZAuthor: Thomas Rast 
> > +   Z
> > +   -s/4/A/
> > +   +s/4/A/ + rename file
> > +   Z
> > +   - ## file ##
> > +   + ## file => renamed-file ##
> 
> I guess there is no good way to suppress the `- ## file ##` line in this
> case? It is a bit distracting...

No, I can't think of a good way.  I'm also not sure it would be right
to remove it.  In this case it means that in the previous version this
was only called 'file', while in the new version it was renamed in
this patch to 'renamed-file', so it does give some useful information.

Not sure how else we could represent that.  If we just had a 
'## file => renamed-file ##' section header, I'd expect that 'file'
has been renamed to 'renamed-file' in both versions.

> > @@ -216,9 +295,9 @@ test_expect_success 'dual-coloring' '
> > : 
> > :+Also a silly comment here!
> > :+
> > -   :  diff --git a/file b/file
> > -   :  --- a/file
> > -   :  +++ b/file
> > +   :  ## file ##
> > +   : @@
> > +   :  1
> 
> I am a bit confused where these last two lines come from all of a
> sudden... They were not there before, and I do not see any code change in
> this patch that would be responsible for them, either...
> 
> Could you help me understand?

Sure.  The actual change (in the range-diff) here is that "Also a
silly comment here!" was added to the commit message.  The diff header
lines are context lines following that change.

We now replace the diff header with the new "section header", which is
only a single line, so we get a couple of additional lines of the
context of the subsequent inner diff.  
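
To make the single-line "section header" concrete, here is a small sketch of how such a line could be built from the old and new file names.  It only covers the two shapes visible in the test output quoted above (" ## file ##" and " ## file => renamed-file ##"); the function name 'format_section_header' and its signature are hypothetical and not taken from range-diff.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): format a one-line section
 * header for the inner diff.  A rename is shown as "old => new",
 * otherwise only the file name is shown.
 */
static void format_section_header(const char *old_name, const char *new_name,
				  char *out, size_t outsz)
{
	if (strcmp(old_name, new_name) == 0)
		snprintf(out, outsz, " ## %s ##", new_name);
	else
		snprintf(out, outsz, " ## %s => %s ##", old_name, new_name);
}
```

Since the header is one line instead of the three or more lines of a full diff header, the same amount of context in the outer diff reaches further into the following inner hunk.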


> > :3:  0559556 ! 3:  
> > b9cb956 s/11/B/
> > :@@
> > :  9
> > diff --git a/t/t3206/history.export b/t/t3206/history.export
> > index b80940..7bb3814962 100644
> > --- a/t/t3206/history.export
> > +++ b/t/t3206/history.export
> > @@ -22,8 +22,8 @@ data 51
> >  19
> >  20
> >
> > -reset refs/heads/removed
> > -commit refs/heads/removed
> > +reset refs/heads/renamed-file
> > +commit refs/heads/renamed-file
> 
> Hmm. Is the `removed` ref no longer required by the 'removed a commit'
> test case?

It is, and it still exists.  I'm not entirely familiar with the format
for fast-export/fast-import scripts.  What I did was just fast-import
the existing script, create the new refs that were required for the
tests and then fast-export'ed it again.

So not sure exactly why this changed, but the 'removed' ref still
exists :)

> >  mark :2
> >  author Thomas Rast  1374424921 +0200
> >  committer Thomas Rast  1374484724 +0200
> > @@ -599,6 +599,82 @@ s/12/B/
> >  from :46
> >  M 100644 :28 file
> >
> > -reset refs/heads/removed
> > -from :47
> > +commit refs/heads/added-removed
> > +mark :48
> > +author Thomas Rast  1374485014 +0200
> > +committer Thomas Gummerer  1556574151 +0100
> 
> Neat ;-)
> 
> > +data 7
> > +s/5/A/
> > +from :2
> > +M 100644 :3 file
> > +
> > +blob
> > +mark :49
> > +data 0
> > +
> > +commit refs/heads/added-removed
> > +mark :50
> > +author Thomas Rast  1374485024 +0200
> > +committer Thomas Gummerer  1556574177 +0100
> > +data 18
> > +s/4/A/ + new-file
> > +from :48
> > +M 100644 :5 file
> > +M 100644 :49 new-file
> > +
> > +commit refs/heads/added-removed
> > +mark :51
> > +author Thomas Rast  1374485036 +0200
> > +committer Thomas Gummerer  1556574177 +0100
> > +data 22
> > +s/11/B/ + remove file
> > +from :50
> > +M 100644 :7 file
> > +D new-file
> > +
> > +commit refs/heads/added-removed
> > +mark :52
> > +author Thomas Rast  1374485044 +0200
> > +committer Thomas Gummerer  1556574177 +0100
> > +data 8
> > +s/12/B/
> > +from :51
> > +M 100644 :9 file
> > +
> > +commit refs/heads/renamed-file
> > +mark :53
> > +author Thomas Rast  1374485014 +0200
> > +committer Thomas Gummerer  1556574309 +0100
> > +data 7
> > +s/5/A/
> > +from :2
> > +M 100644 :3 file
> > +
> > +commit refs/heads/renamed-file
> > +mark :54
> > +author Thomas Rast  1374485024 +0200
> > +committer Thomas Gummerer  1556574312 +0100
> > +data 21
> > +s/4/A/ + rename file
> > +from :53
> > +D file
> > +M 100644 :5 renamed-file
> > +
> > +commit refs/heads/renamed-file
> > +mark :55
> > +author Thomas Rast  1374485036 +0200
> > +committer Thomas Gummerer  1556574319 +0100
> > +data 8
> > +s/11/B/
> > +from :54
> > +M 100644 :7 renamed-file
> > +
> > +commit refs/heads/renamed-file
> > +mark :56
> > +author Thomas Rast  1374485044 +0200
> > +committer Thomas Gummerer  1556574319 +0100
> > +data 8
> > +s/12/B/
> > +from :55
> > +M 100644 :9 renamed-file
> 
> I have to admit that I allowed myself not to study this script too
> closely, trusting that the range-diff explains better what commit history
> it creates than the fast-import script.
> 
> Thanks,
> Dscho
> 
> >
> > --
> > 2.22.0.510.g264f2c817a
> >
> >


Re: [PATCH v2 00/14] output improvements for git range-diff

2019-07-08 Thread Thomas Gummerer
On 07/05, Johannes Schindelin wrote:
> Hi Thomas,
> 
> On Fri, 5 Jul 2019, Thomas Gummerer wrote:
> 
> > It's been quite a while since I sent the RFC [1] (thanks all for the
> > comments on that), and the series changed shapes quite a bit since the
> > last round.
> >
> > Since it's been such a long time, just to remind everyone, the goal of
> > this series is to make the range-diff output clearer, by showing
> > information about the filenames to which the current diff belongs.
> 
> Thank you for that reminder ;-)
> 
> > In the previous round, we did this using "section headers" that
> > include information about the current file and adding that to the
> > outer diff's hunk headers.
> >
> > In this round we still keep the section headers (with slightly more
> > information), but in addition we also add the filename to the inner
> > diff hunk headers.  In the outer diff hunk headers we then display
> > either the section header or the inner diff hunk header using a
> > userdiff pattern.
> 
> 
> I like this idea!
> 
> > In terms of code changes the biggest change is that we're now re-using
> > the 'parse_git_header' function from the apply code to parse the diff
> > headers, instead of trying to parse them with some hacky parsing code
> > in range-diff.c.  This way we are sure that the diff headers are
> > properly parsed.
> 
> Yep, very good.
> 
> > I had also considered just outputting the section headers directly
> > from 'git log', but then decided against that.  Parsing the headers
> > allows a future enhancement of range-diff, where we would allow
> > parsing an mbox file [2].
> 
> > Thank you for your consideration; I still would like to have the option
> at some stage to compare a patch series from public-inbox.org/git to the
> commits in `pu`, without having to fiddle with finding a valid base commit
> to apply the patches on.

Yeah, I would like that as well ;)

> > I split the "only pass required data" commits up, in the hopes of
> > making them easier to review, but I'm also happy to squash them if
> > people feel like that makes it easier to review them.
> 
> I found it very easy to review in the current form, thank you for putting
> in the extra effort.
> 
> > An added advantage of this is that we're also getting rid of things
> > like the similarity index, which are not important in the range-diff,
> > and are thus not represented in the "section header".
> >
> > One thing that did not change is that the new/deleted strings are not
> > translated in this version either.  This is similar to the regular
> > diff output, where we also don't translate these.  We can still
> > consider translating them in the future though.
> >
> > [1]: https://public-inbox.org/git/20190414210933.20875-1-t.gumme...@gmail.com/
> > [2]: https://github.com/gitgitgadget/git/issues/207
> >
> > I'm including the range-diff between this version of the series and
> > the RFC, and a diff between the range diff and the range-diff without
> > these changes below.  Probably not useful in reviewing the code, but
> > good to show off the changes made in this series.
> 
> Indeed!
> 
> I very much like the idea, and the current iteration. I offered a couple
> of minor suggestions, in the hope that you find them helpful.

Thanks for your review!  I did find the suggestions very helpful
indeed :)


Re: [PATCH v2 09/14] range-diff: split lines manually

2019-07-08 Thread Thomas Gummerer
On 07/05, Johannes Schindelin wrote:
> Hi Thomas,
> 
> 
> On Fri, 5 Jul 2019, Thomas Gummerer wrote:
> 
> > Currently range-diff uses the 'strbuf_getline()' function for doing
> > its line by line processing.  In a future patch we want to do parts of
> > that parsing using the 'parse_git_header()' function, which does
> 
> If you like my suggestion in patch 7/14, this commit message needs to talk
> about the new name, too.

Thanks for the reminder here!  I do indeed like the new name, but
would probably have forgotten to change it in the commit message here.

> > requires reading parts of the input from that function, which doesn't
> 
> s/requires/require/
> 
> > use strbufs.
> >
> > Switch range-diff to do our own line by line parsing, so we can re-use
> > the parse_git_header function later.
> >
> > Signed-off-by: Thomas Gummerer 
> > ---
> >
> > Longer term it might be better to have both range-diff and apply code
> > use strbufs.  However I didn't feel it's worth making that change for
> > this patch series.
> 
> Makes sense.
> 
> >  range-diff.c | 69 +---
> >  1 file changed, 39 insertions(+), 30 deletions(-)
> >
> > diff --git a/range-diff.c b/range-diff.c
> > index 9242b8975f..916afa44c0 100644
> > --- a/range-diff.c
> > +++ b/range-diff.c
> > @@ -24,6 +24,17 @@ struct patch_util {
> > struct object_id oid;
> >  };
> >
> > +static unsigned long linelen(const char *buffer, unsigned long size)
> 
> Shouldn't this be `size_t`?
> 
> > +{
> > +   unsigned long len = 0;
> 
> Likewise.
> 
> > +   while (size--) {
> > +   len++;
> > +   if (*buffer++ == '\n')
> > +   break;
> > +   }
> > +   return len;
> 
> How about
> 
>   const char *eol = memchr(buffer, '\n', size);
> 
>   return !eol ? size : eol + 1 - buffer;
> 
> instead?
> 
> For an extra brownie point, you could even rename this function to
> `find_end_of_line()` and replace the LF by a NUL:
> 
>   if (!eol)
>   return size;
> 
>   *eol = '\0';
>   return eol + 1 - buffer;

I like this, thank you!
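
Putting the two suggestions quoted above together, the helper could look roughly like the sketch below.  It uses size_t as suggested in the review, and the buffer has to be writable because the LF is overwritten with a NUL; the explicit cast on the return value is an addition made for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the length of the first line in buffer, including its LF,
 * and replace that LF with a NUL so the line can be used as a C
 * string.  If there is no LF within size bytes, return size and
 * leave the buffer untouched.
 */
static size_t find_end_of_line(char *buffer, size_t size)
{
	char *eol = memchr(buffer, '\n', size);

	if (!eol)
		return size;

	*eol = '\0';
	return (size_t)(eol + 1 - buffer);
}
```

A caller would advance its line pointer by the returned length and shrink the remaining size by the same amount, as the for loop in the patch does.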

> > +}
> > +
> >  /*
> >   * Reads the patches into a string list, with the `util` field being populated
> >   * as struct object_id (will need to be free()d).
> > @@ -31,10 +42,12 @@ struct patch_util {
> >  static int read_patches(const char *range, struct string_list *list)
> >  {
> > struct child_process cp = CHILD_PROCESS_INIT;
> > -   FILE *in;
> > -   struct strbuf buf = STRBUF_INIT, line = STRBUF_INIT;
> > +   struct strbuf buf = STRBUF_INIT, file = STRBUF_INIT;
> 
> This puzzled me. I'd like to suggest s/file/contents/

Thanks, will change.

> > struct patch_util *util = NULL;
> > int in_header = 1;
> > +   char *line;
> > +   int offset, len;
> > +   size_t size;
> >
> > argv_array_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
> > "--reverse", "--date-order", "--decorate=no",
> > @@ -54,17 +67,15 @@ static int read_patches(const char *range, struct string_list *list)
> >
> > if (start_command(&cp))
> > return error_errno(_("could not start `log`"));
> > -   in = fdopen(cp.out, "r");
> > -   if (!in) {
> > -   error_errno(_("could not read `log` output"));
> > -   finish_command(&cp);
> > -   return -1;
> > -   }
> > +   strbuf_read(&file, cp.out, 0);
> 
> Shouldn't we handle a negative return value here, erroring out with "could
> not read `log` output" as before?

Yeah, that was an oversight, we should definitely still handle errors
here.

> >
> > -   while (strbuf_getline(&line, in) != EOF) {
> > +   line = strbuf_detach(&file, &size);
> 
> I strongly suspect this to leak, given that `line` is subsequently
> advanced, and there is no backup copy.
> 
> Maybe
> 
>   line = file.buf;
>   size = file.len;
> 
> would make more sense here?

Hmm good point, that makes more sense indeed.
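
The leak concern can be shown with plain malloc()/free() instead of strbufs.  The sketch below is only an analogy under that assumption: 'count_lines_copy' is a hypothetical function, and 'base' plays the role of the strbuf that still owns the memory while 'line' is the cursor that gets advanced.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical example (not git code): walk a heap buffer line by
 * line with a moving cursor.  The allocation is released through
 * 'base', which is never advanced; freeing through 'line' after the
 * loop would be wrong, and dropping 'base' would leak the buffer.
 */
static size_t count_lines_copy(const char *text)
{
	size_t size = strlen(text);
	char *base = malloc(size + 1);
	char *line;
	size_t count = 0;

	if (!base)
		return 0;
	memcpy(base, text, size + 1);

	line = base;
	while (size > 0) {
		char *eol = memchr(line, '\n', size);
		size_t len = eol ? (size_t)(eol + 1 - line) : size;

		count++;
		line += len;
		size -= len;
	}

	free(base);	/* release via the owner, not the cursor */
	return count;
}
```

In the strbuf version, keeping the strbuf as the owner and reading through file.buf and file.len serves the same purpose: strbuf_release() can still free the memory at the end.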

> > +   for (offset = 0; size > 0; offset += len, size -= len, line += len) {
> > const char *p;
> >
> > -   if (skip_prefix(line.buf, "commit ", &p)) {
> > +   len = linelen(line, size);

[PATCH v2 10/14] range-diff: don't remove funcname from inner diff

2019-07-05 Thread Thomas Gummerer
When postprocessing the inner diff in range-diff, we currently replace
the whole hunk header line with just "@@".  This matches how 'git
tbdiff' used to handle hunk headers as well.

Most likely this is being done because line numbers in the hunk header
are not relevant without other changes.  They can for example easily
change if a range is rebased, and lines are added/removed before a
change that we actually care about in our ranges.

However it can still be useful to have the function name that 'git
diff' extracts as additional context for the change.

Note that it is not guaranteed that the hunk header actually shows up
in the range-diff, and this change only aims to improve the case where
a hunk header would already be included in the final output.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c  | 8 +---
 t/t3206-range-diff.sh | 6 +++---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 916afa44c0..484b1ec5a9 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -113,9 +113,11 @@ static int read_patches(const char *range, struct string_list *list)
strbuf_addch(&buf, '\n');
}
continue;
-   } else if (starts_with(line, "@@ "))
-   strbuf_addstr(&buf, "@@");
-   else if (!line[0] || starts_with(line, "index "))
+   } else if (skip_prefix(line, "@@ ", &p)) {
+   if (!(p = strstr(p, "@@")))
+   die(_("invalid hunk header in inner diff"));
+   strbuf_addstr(&buf, p);
+   } else if (!line[0] || starts_with(line, "index "))
/*
 * A completely blank (not ' \n', which is context)
 * line is not valid in a diff.  We skip it
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 048feaf6dd..aebd4e3693 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -110,7 +110,7 @@ test_expect_success 'changed commit' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@ -8,7 +8,7 @@
-@@
+@@ A
  9
  10
- B
@@ -169,7 +169,7 @@ test_expect_success 'changed commit with sm config' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@ -8,7 +8,7 @@
-@@
+@@ A
  9
  10
- B
@@ -231,7 +231,7 @@ test_expect_success 'dual-coloring' '
:  14
:4:  d966c5c ! 4:  8add5f1 s/12/B/
:@@ -8,7 +8,7 @@
-   : @@
+   : @@ A
:  9
:  10
:- BB
-- 
2.22.0.510.g264f2c817a



[PATCH v2 03/14] apply: only pass required data to git_header_name

2019-07-05 Thread Thomas Gummerer
Currently the 'git_header_name()' function takes 'struct apply_state'
as parameter, even though it only needs the p_value from that struct.

This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/apply.c b/apply.c
index fc7083fcbc..ac668e754d 100644
--- a/apply.c
+++ b/apply.c
@@ -1164,7 +1164,7 @@ static const char *skip_tree_prefix(int p_value,
  * creation or deletion of an empty file.  In any of these cases,
  * both sides are the same name under a/ and b/ respectively.
  */
-static char *git_header_name(struct apply_state *state,
+static char *git_header_name(int p_value,
 const char *line,
 int llen)
 {
@@ -1184,7 +1184,7 @@ static char *git_header_name(struct apply_state *state,
goto free_and_fail1;
 
/* strip the a/b prefix including trailing slash */
-   cp = skip_tree_prefix(state->p_value, first.buf, first.len);
+   cp = skip_tree_prefix(p_value, first.buf, first.len);
if (!cp)
goto free_and_fail1;
strbuf_remove(&first, 0, cp - first.buf);
@@ -1201,7 +1201,7 @@ static char *git_header_name(struct apply_state *state,
if (*second == '"') {
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail1;
-   cp = skip_tree_prefix(state->p_value, sp.buf, sp.len);
+   cp = skip_tree_prefix(p_value, sp.buf, sp.len);
if (!cp)
goto free_and_fail1;
/* They must match, otherwise ignore */
@@ -1212,7 +1212,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted second */
-   cp = skip_tree_prefix(state->p_value, second, line + llen - second);
+   cp = skip_tree_prefix(p_value, second, line + llen - second);
if (!cp)
goto free_and_fail1;
if (line + llen - cp != first.len ||
@@ -1227,7 +1227,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted first name */
-   name = skip_tree_prefix(state->p_value, line, llen);
+   name = skip_tree_prefix(p_value, line, llen);
if (!name)
return NULL;
 
@@ -1243,7 +1243,7 @@ static char *git_header_name(struct apply_state *state,
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail2;
 
-   np = skip_tree_prefix(state->p_value, sp.buf, sp.len);
+   np = skip_tree_prefix(p_value, sp.buf, sp.len);
if (!np)
goto free_and_fail2;
 
@@ -1287,7 +1287,7 @@ static char *git_header_name(struct apply_state *state,
 */
if (!name[len + 1])
return NULL; /* no postimage name */
-   second = skip_tree_prefix(state->p_value, name + len + 1,
+   second = skip_tree_prefix(p_value, name + len + 1,
  line_len - (len + 1));
if (!second)
return NULL;
@@ -1333,7 +1333,7 @@ static int parse_git_header(struct apply_state *state,
 * or removing or adding empty files), so we get
 * the default name from the header.
 */
-   patch->def_name = git_header_name(state, line, len);
+   patch->def_name = git_header_name(state->p_value, line, len);
if (patch->def_name && state->root.len) {
char *s = xstrfmt("%s%s", state->root.buf, patch->def_name);
free(patch->def_name);
-- 
2.22.0.510.g264f2c817a



[PATCH v2 05/14] apply: only pass required data to find_name_*

2019-07-05 Thread Thomas Gummerer
Currently the 'find_name_*()' functions take 'struct apply_state' as a
parameter, even though they only need the 'root' member from that
struct.

These functions are in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 48 
 1 file changed, 24 insertions(+), 24 deletions(-)
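
For readers following along, here is a minimal standalone sketch of the
"prepend root, then squash slashes" step that the find_name_*() helpers
perform once they only receive the root.  'struct mini_strbuf',
'squash_slash_sketch()' and 'prefix_root_sketch()' are hypothetical
stand-ins written for this illustration (the real code uses git's
strbuf API and xstrfmt()), and the slash-squashing body is an
assumption about what squash_slash() does, not a copy of it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two strbuf members used here. */
struct mini_strbuf {
	const char *buf;
	size_t len;
};

/* Collapse runs of '/' in place (assumed behaviour of squash_slash()). */
static char *squash_slash_sketch(char *name)
{
	size_t i = 0, j = 0;

	if (!name)
		return NULL;
	while (name[i]) {
		if ((name[j++] = name[i++]) == '/')
			while (name[i] == '/')
				i++;
	}
	name[j] = '\0';
	return name;
}

/*
 * Prepend root (if it is non-empty) to the first len bytes of start.
 * Only the root is needed, not a whole apply state.  Caller frees.
 */
static char *prefix_root_sketch(const struct mini_strbuf *root,
				const char *start, size_t len)
{
	size_t root_len = root->len;
	char *ret = malloc(root_len + len + 1);

	if (!ret)
		return NULL;
	if (root_len)
		memcpy(ret, root->buf, root_len);
	memcpy(ret + root_len, start, len);
	ret[root_len + len] = '\0';
	return squash_slash_sketch(ret);
}
```

A caller that used to pass 'state' would pass '&state->root' instead,
which is the shape of the change in the diff below.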

diff --git a/apply.c b/apply.c
index 1602fd5db0..3cd4e3d3b3 100644
--- a/apply.c
+++ b/apply.c
@@ -469,7 +469,7 @@ static char *squash_slash(char *name)
return name;
 }
 
-static char *find_name_gnu(struct apply_state *state,
+static char *find_name_gnu(struct strbuf *root,
   const char *line,
   int p_value)
 {
@@ -495,8 +495,8 @@ static char *find_name_gnu(struct apply_state *state,
}
 
strbuf_remove(&name, 0, cp - name.buf);
-   if (state->root.len)
-   strbuf_insert(&name, 0, state->root.buf, state->root.len);
+   if (root->len)
+   strbuf_insert(&name, 0, root->buf, root->len);
return squash_slash(strbuf_detach(&name, NULL));
 }
 
@@ -659,7 +659,7 @@ static size_t diff_timestamp_len(const char *line, size_t len)
return line + len - end;
 }
 
-static char *find_name_common(struct apply_state *state,
+static char *find_name_common(struct strbuf *root,
  const char *line,
  const char *def,
  int p_value,
@@ -702,30 +702,30 @@ static char *find_name_common(struct apply_state *state,
return squash_slash(xstrdup(def));
}
 
-   if (state->root.len) {
-   char *ret = xstrfmt("%s%.*s", state->root.buf, len, start);
+   if (root->len) {
+   char *ret = xstrfmt("%s%.*s", root->buf, len, start);
return squash_slash(ret);
}
 
return squash_slash(xmemdupz(start, len));
 }
 
-static char *find_name(struct apply_state *state,
+static char *find_name(struct strbuf *root,
   const char *line,
   char *def,
   int p_value,
   int terminate)
 {
if (*line == '"') {
-   char *name = find_name_gnu(state, line, p_value);
+   char *name = find_name_gnu(root, line, p_value);
if (name)
return name;
}
 
-   return find_name_common(state, line, def, p_value, NULL, terminate);
+   return find_name_common(root, line, def, p_value, NULL, terminate);
 }
 
-static char *find_name_traditional(struct apply_state *state,
+static char *find_name_traditional(struct strbuf *root,
   const char *line,
   char *def,
   int p_value)
@@ -734,7 +734,7 @@ static char *find_name_traditional(struct apply_state *state,
size_t date_len;
 
if (*line == '"') {
-   char *name = find_name_gnu(state, line, p_value);
+   char *name = find_name_gnu(root, line, p_value);
if (name)
return name;
}
@@ -742,10 +742,10 @@ static char *find_name_traditional(struct apply_state *state,
len = strchrnul(line, '\n') - line;
date_len = diff_timestamp_len(line, len);
if (!date_len)
-   return find_name_common(state, line, def, p_value, NULL, TERM_TAB);
+   return find_name_common(root, line, def, p_value, NULL, TERM_TAB);
len -= date_len;
 
-   return find_name_common(state, line, def, p_value, line + len, 0);
+   return find_name_common(root, line, def, p_value, line + len, 0);
 }
 
 /*
@@ -759,7 +759,7 @@ static int guess_p_value(struct apply_state *state, const char *nameline)
 
if (is_dev_null(nameline))
return -1;
-   name = find_name_traditional(state, nameline, NULL, 0);
+   name = find_name_traditional(&state->root, nameline, NULL, 0);
if (!name)
return -1;
cp = strchr(name, '/');
@@ -883,17 +883,17 @@ static int parse_traditional_patch(struct apply_state *state,
if (is_dev_null(first)) {
patch->is_new = 1;
patch->is_delete = 0;
-   name = find_name_traditional(state, second, NULL, state->p_value);
+   name = find_name_traditional(&state->root, second, NULL, 
state-&

[PATCH v2 04/14] apply: only pass required data to check_header_line

2019-07-05 Thread Thomas Gummerer
Currently the 'check_header_line()' function takes 'struct
apply_state' as a parameter, even though it only needs the linenr from
that struct.

This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
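
As an illustration of why the line number alone is enough, here is a
minimal standalone sketch of the check.  'struct header_flags' is a
hypothetical subset of 'struct patch' made up for this example, and
fprintf() stands in for git's error() helper; the logic otherwise
follows the function as shown in the diff below.

```c
#include <stdio.h>

/* Hypothetical subset of 'struct patch' with only the fields used here. */
struct header_flags {
	int is_new, is_delete, is_rename, is_copy;
	int extension_linenr;
};

/*
 * Returns -1 if more than one "extension" header (new, delete, rename,
 * copy) was seen, otherwise records the line of the first one.
 */
static int check_header_line_sketch(int linenr, struct header_flags *patch)
{
	int extensions = (patch->is_delete == 1) + (patch->is_new == 1) +
			 (patch->is_rename == 1) + (patch->is_copy == 1);

	if (extensions > 1) {
		fprintf(stderr, "inconsistent header lines %d and %d\n",
			patch->extension_linenr, linenr);
		return -1;
	}
	if (extensions && !patch->extension_linenr)
		patch->extension_linenr = linenr;
	return 0;
}
```

Nothing in the body touches any other part of the apply state, so a
caller outside apply.c only has to keep track of a line counter.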

diff --git a/apply.c b/apply.c
index ac668e754d..1602fd5db0 100644
--- a/apply.c
+++ b/apply.c
@@ -1302,15 +1302,15 @@ static char *git_header_name(int p_value,
}
 }
 
-static int check_header_line(struct apply_state *state, struct patch *patch)
+static int check_header_line(int linenr, struct patch *patch)
 {
int extensions = (patch->is_delete == 1) + (patch->is_new == 1) +
 (patch->is_rename == 1) + (patch->is_copy == 1);
if (extensions > 1)
return error(_("inconsistent header lines %d and %d"),
-patch->extension_linenr, state->linenr);
+patch->extension_linenr, linenr);
if (extensions && !patch->extension_linenr)
-   patch->extension_linenr = state->linenr;
+   patch->extension_linenr = linenr;
return 0;
 }
 
@@ -1380,7 +1380,7 @@ static int parse_git_header(struct apply_state *state,
res = p->fn(state, line + oplen, patch);
if (res < 0)
return -1;
-   if (check_header_line(state, patch))
+   if (check_header_line(state->linenr, patch))
return -1;
if (res > 0)
return offset;
-- 
2.22.0.510.g264f2c817a



[PATCH v2 14/14] range-diff: add headers to the outer hunk header

2019-07-05 Thread Thomas Gummerer
Add the section headers/hunk headers we introduced in the previous
commits to the outer diff's hunk headers.  This makes it easier to
understand which change we are actually looking at.  For example an
outer hunk header might now look like:

@@  Documentation/config/interactive.txt

while previously it would have only been

@@

which doesn't give a lot of context for the change that follows.

For completeness also add section headers for the commit metadata and
the commit message, although they are arguably less important.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c  |  9 ++---
 t/t3206-range-diff.sh | 41 ++---
 2 files changed, 28 insertions(+), 22 deletions(-)
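
To see what the two patterns in the new userdiff driver pick out, here
is a small standalone sketch using POSIX regcomp()/regexec().  It
assumes that the newline in the funcname string separates two
alternative patterns and that the first capture group is what ends up
in the outer hunk header; 'match_group1()' and 'outer_hunk_context()'
are hypothetical helpers for this illustration and not part of git.

```c
#include <regex.h>
#include <string.h>

/*
 * Match one POSIX extended regex against line and copy capture group 1
 * into out.  Returns 0 on a match, -1 otherwise.
 */
static int match_group1(const char *pattern, const char *line,
			char *out, size_t n)
{
	regex_t re;
	regmatch_t m[2];
	int ret = -1;

	if (!n || regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	if (!regexec(&re, line, 2, m, 0) && m[1].rm_so >= 0) {
		size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);

		if (len >= n)
			len = n - 1;
		memcpy(out, line + m[1].rm_so, len);
		out[len] = '\0';
		ret = 0;
	}
	regfree(&re);
	return ret;
}

/* Try the section header pattern first, then the inner hunk header. */
static int outer_hunk_context(const char *line, char *out, size_t n)
{
	if (!match_group1("^ ## (.*) ##$", line, out, n))
		return 0;
	return match_group1("^.?@@ (.*)$", line, out, n);
}
```

With these patterns a line such as " ## Metadata ##" yields "Metadata",
and an inner hunk header such as "+@@ file: A" yields "file: A", which
matches the expectations in the updated tests.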

diff --git a/range-diff.c b/range-diff.c
index 09cb1ddbb1..f43b229031 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -134,8 +134,10 @@ static int read_patches(const char *range, struct string_list *list)
strbuf_addstr(&buf, " ##");
} else if (in_header) {
if (starts_with(line, "Author: ")) {
+   strbuf_addstr(&buf, " ## Metadata ##\n");
strbuf_addstr(&buf, line);
strbuf_addstr(&buf, "\n\n");
+   strbuf_addstr(&buf, " ## Commit message ##\n");
} else if (starts_with(line, "")) {
p = line + len - 2;
while (isspace(*p) && p >= line)
@@ -396,8 +398,9 @@ static void output_pair_header(struct diff_options *diffopt,
fwrite(buf->buf, buf->len, 1, diffopt->file);
 }
 
-static struct userdiff_driver no_func_name = {
-   .funcname = { "$^", 0 }
+static struct userdiff_driver section_headers = {
+   .funcname = { "^ ## (.*) ##$\n"
+ "^.?@@ (.*)$", REG_EXTENDED }
 };
 
 static struct diff_filespec *get_filespec(const char *name, const char *p)
@@ -409,7 +412,7 @@ static struct diff_filespec *get_filespec(const char *name, const char *p)
spec->size = strlen(p);
spec->should_munmap = 0;
spec->is_stdin = 1;
-   spec->driver = &no_func_name;
+   spec->driver = &section_headers;
 
return spec;
 }
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index d4de270979..ec548654ce 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -99,7 +99,7 @@ test_expect_success 'changed commit' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@
+   @@ file: A
  9
  10
 -11
@@ -109,7 +109,7 @@ test_expect_success 'changed commit' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@
+   @@ file
 @@ file: A
  9
  10
@@ -158,7 +158,7 @@ test_expect_success 'changed commit with sm config' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@
+   @@ file: A
  9
  10
 -11
@@ -168,7 +168,7 @@ test_expect_success 'changed commit with sm config' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@
+   @@ file
 @@ file: A
  9
  10
@@ -186,9 +186,10 @@ test_expect_success 'renamed file' '
sed s/Z/\ /g >expected <<-EOF &&
1:  4de457d = 1:  f258d75 s/5/A/
2:  fccce22 ! 2:  017b62d s/4/A/
-   @@
+   @@ Metadata
ZAuthor: Thomas Rast 
Z
+   Z ## Commit message ##
-s/4/A/
+s/4/A/ + rename file
Z
@@ -198,8 +199,8 @@ test_expect_success 'renamed file' '
Z 1
Z 2
3:  147e64e ! 3:  3ce7af6 s/11/B/
-   @@
-   Z
+   @@ Metadata
+   Z ## Commit message ##
Zs/11/B/
Z
- ## file ##
@@ -210,8 +211,8 @@ test_expect_success 'renamed file' '
Z 9
Z 10
4:  a63e992 ! 4:  1e6226b s/12/B/
-   @@
-   Z
+   @@ Metadata
+   Z ## Commit message ##
Zs/12/B/
Z
- ## file ##
@@ -230,30 +231,32 @@ test_expect_success 'file added and later removed' '
sed s/Z/\ /g >expected <<-EOF &&
1:  4de457d = 1:  096b1ba s/5/A/
2:  fccce22 ! 2:  d92e698 s/4/A/
-   @@
+   @@ Metadata
ZAuthor: Thomas Rast 
 

[PATCH v2 01/14] apply: replace marc.info link with public-inbox

2019-07-05 Thread Thomas Gummerer
public-inbox.org links include the whole message ID by default.  This
means the message can still be found even if the site goes away, which
is not the case with the marc.info link.  Replace the marc.info link
with a more future-proof one.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/apply.c b/apply.c
index f15afa9f6a..599cf8956f 100644
--- a/apply.c
+++ b/apply.c
@@ -478,7 +478,7 @@ static char *find_name_gnu(struct apply_state *state,
 
/*
 * Proposed "new-style" GNU patch/diff format; see
-* http://marc.info/?l=git&m=112927316408690&w=2
+* https://public-inbox.org/git/7vll0wvb2a@assigned-by-dhcp.cox.net/
 */
if (unquote_c_style(&name, line, NULL)) {
strbuf_release(&name);
-- 
2.22.0.510.g264f2c817a



[PATCH v2 08/14] range-diff: fix function parameter indentation

2019-07-05 Thread Thomas Gummerer
Fix the indentation of the function parameters for a couple of
functions, to match the style in the rest of the file.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 48b0e1b4ce..9242b8975f 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -148,7 +148,7 @@ static int read_patches(const char *range, struct string_list *list)
 }
 
 static int patch_util_cmp(const void *dummy, const struct patch_util *a,
-const struct patch_util *b, const char *keydata)
+ const struct patch_util *b, const char *keydata)
 {
return strcmp(a->diff, keydata ? keydata : b->diff);
 }
@@ -373,7 +373,7 @@ static struct diff_filespec *get_filespec(const char *name, const char *p)
 }
 
 static void patch_diff(const char *a, const char *b,
- struct diff_options *diffopt)
+  struct diff_options *diffopt)
 {
diff_queue(&diff_queued_diff,
   get_filespec("a", a), get_filespec("b", b));
-- 
2.22.0.510.g264f2c817a



[PATCH v2 09/14] range-diff: split lines manually

2019-07-05 Thread Thomas Gummerer
Currently range-diff uses the 'strbuf_getline()' function for doing
its line by line processing.  In a future patch we want to do parts of
that parsing using the 'parse_git_header()' function.  That function
does not use strbufs, and it needs to read parts of the input itself.

Switch range-diff to do our own line by line parsing, so we can re-use
the parse_git_header function later.

Signed-off-by: Thomas Gummerer 
---

Longer term it might be better to have both range-diff and apply code
use strbufs.  However I didn't feel it's worth making that change for
this patch series.
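
For illustration, here is a minimal standalone sketch of the manual
line splitting.  'linelen()' is the helper added by the diff below;
'count_lines_with_prefix()' is a hypothetical caller written for this
example.  It assumes the buffer is writable and has a NUL after its
last byte (as a detached strbuf does), and unlike the patch it only
overwrites the last byte of a line when that byte is a newline.

```c
#include <stddef.h>
#include <string.h>

/* Length of the next line in buffer, including the trailing newline. */
static unsigned long linelen(const char *buffer, unsigned long size)
{
	unsigned long len = 0;
	while (size--) {
		len++;
		if (*buffer++ == '\n')
			break;
	}
	return len;
}

/*
 * Walk buf line by line, NUL-terminating each line in place, and count
 * the lines that start with prefix.
 */
static int count_lines_with_prefix(char *buf, size_t size, const char *prefix)
{
	size_t plen = strlen(prefix);
	char *line = buf;
	unsigned long len;
	int count = 0;

	for (; size > 0; size -= len, line += len) {
		len = linelen(line, size);
		if (line[len - 1] == '\n')
			line[len - 1] = '\0';
		if (!strncmp(line, prefix, plen))
			count++;
	}
	return count;
}
```

Because each line stays inside the original buffer, a parser that wants
a pointer plus the remaining size can be handed 'line' and 'size'
directly, which is not possible with a per-line strbuf.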

 range-diff.c | 69 +---
 1 file changed, 39 insertions(+), 30 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 9242b8975f..916afa44c0 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -24,6 +24,17 @@ struct patch_util {
struct object_id oid;
 };
 
+static unsigned long linelen(const char *buffer, unsigned long size)
+{
+   unsigned long len = 0;
+   while (size--) {
+   len++;
+   if (*buffer++ == '\n')
+   break;
+   }
+   return len;
+}
+
 /*
  * Reads the patches into a string list, with the `util` field being populated
  * as struct object_id (will need to be free()d).
@@ -31,10 +42,12 @@ struct patch_util {
 static int read_patches(const char *range, struct string_list *list)
 {
struct child_process cp = CHILD_PROCESS_INIT;
-   FILE *in;
-   struct strbuf buf = STRBUF_INIT, line = STRBUF_INIT;
+   struct strbuf buf = STRBUF_INIT, file = STRBUF_INIT;
struct patch_util *util = NULL;
int in_header = 1;
+   char *line;
+   int offset, len;
+   size_t size;
 
argv_array_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
"--reverse", "--date-order", "--decorate=no",
@@ -54,17 +67,15 @@ static int read_patches(const char *range, struct string_list *list)
 
if (start_command(&cp))
return error_errno(_("could not start `log`"));
-   in = fdopen(cp.out, "r");
-   if (!in) {
-   error_errno(_("could not read `log` output"));
-   finish_command(&cp);
-   return -1;
-   }
+   strbuf_read(&file, cp.out, 0);
 
-   while (strbuf_getline(&line, in) != EOF) {
+   line = strbuf_detach(&file, &size);
+   for (offset = 0; size > 0; offset += len, size -= len, line += len) {
const char *p;
 
-   if (skip_prefix(line.buf, "commit ", &p)) {
+   len = linelen(line, size);
+   line[len - 1] = '\0';
+   if (skip_prefix(line, "commit ", &p)) {
if (util) {
string_list_append(list, buf.buf)->util = util;
strbuf_reset(&buf);
@@ -75,8 +86,6 @@ static int read_patches(const char *range, struct string_list *list)
free(util);
string_list_clear(list, 1);
strbuf_release(&buf);
-   strbuf_release(&line);
-   fclose(in);
finish_command(&cp);
return -1;
}
@@ -85,26 +94,28 @@ static int read_patches(const char *range, struct string_list *list)
continue;
}
 
-   if (starts_with(line.buf, "diff --git")) {
+   if (starts_with(line, "diff --git")) {
in_header = 0;
strbuf_addch(&buf, '\n');
if (!util->diff_offset)
util->diff_offset = buf.len;
strbuf_addch(&buf, ' ');
-   strbuf_addbuf(&buf, &line);
+   strbuf_addstr(&buf, line);
} else if (in_header) {
-   if (starts_with(line.buf, "Author: ")) {
-   strbuf_addbuf(&buf, &line);
+   if (starts_with(line, "Author: ")) {
+   strbuf_addstr(&buf, line);
strbuf_addstr(&buf, "\n\n");
-   } else if (starts_with(line.buf, "")) {
-   strbuf_rtrim(&line);
-   strbuf_addbuf(&buf, &line);
+   } else if (starts_with(line, "")) {
+   p = line + len - 2;
+

[PATCH v2 02/14] apply: only pass required data to skip_tree_prefix

2019-07-05 Thread Thomas Gummerer
Currently the 'skip_tree_prefix()' function takes 'struct apply_state'
as a parameter, even though it only needs the p_value from that struct.

This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)
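
As a standalone illustration of the narrowed interface, the sketch
below strips p_value leading path components given nothing but an int.
The first half follows the function as shown in the diff below; the
return statements inside and after the loop are not visible in this
hunk and are an assumption made for the example, and the '_sketch'
suffix marks the name as hypothetical.

```c
#include <stddef.h>

/*
 * Skip p_value leading components from line; absolute paths are not
 * accepted, so NULL is returned in that case.
 */
static const char *skip_tree_prefix_sketch(int p_value,
					   const char *line,
					   int llen)
{
	int nslash;
	int i;

	if (!p_value)
		return (llen && line[0] == '/') ? NULL : line;

	nslash = p_value;
	for (i = 0; i < llen; i++) {
		int ch = line[i];
		if (ch == '/' && --nslash <= 0)
			/* assumed: reject a leading '/', else skip past it */
			return (i == 0) ? NULL : &line[i + 1];
	}
	/* assumed: fewer than p_value components means no match */
	return NULL;
}
```

With p_value 1, "a/foo.c" becomes "foo.c"; a caller that has no apply
state at all can pass the strip count directly.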

diff --git a/apply.c b/apply.c
index 599cf8956f..fc7083fcbc 100644
--- a/apply.c
+++ b/apply.c
@@ -1137,17 +1137,17 @@ static int gitdiff_unrecognized(struct apply_state *state,
  * Skip p_value leading components from "line"; as we do not accept
  * absolute paths, return NULL in that case.
  */
-static const char *skip_tree_prefix(struct apply_state *state,
+static const char *skip_tree_prefix(int p_value,
const char *line,
int llen)
 {
int nslash;
int i;
 
-   if (!state->p_value)
+   if (!p_value)
return (llen && line[0] == '/') ? NULL : line;
 
-   nslash = state->p_value;
+   nslash = p_value;
for (i = 0; i < llen; i++) {
int ch = line[i];
if (ch == '/' && --nslash <= 0)
@@ -1184,7 +1184,7 @@ static char *git_header_name(struct apply_state *state,
goto free_and_fail1;
 
/* strip the a/b prefix including trailing slash */
-   cp = skip_tree_prefix(state, first.buf, first.len);
+   cp = skip_tree_prefix(state->p_value, first.buf, first.len);
if (!cp)
goto free_and_fail1;
strbuf_remove(&first, 0, cp - first.buf);
@@ -1201,7 +1201,7 @@ static char *git_header_name(struct apply_state *state,
if (*second == '"') {
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail1;
-   cp = skip_tree_prefix(state, sp.buf, sp.len);
+   cp = skip_tree_prefix(state->p_value, sp.buf, sp.len);
if (!cp)
goto free_and_fail1;
/* They must match, otherwise ignore */
@@ -1212,7 +1212,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted second */
-   cp = skip_tree_prefix(state, second, line + llen - second);
+   cp = skip_tree_prefix(state->p_value, second, line + llen - second);
if (!cp)
goto free_and_fail1;
if (line + llen - cp != first.len ||
@@ -1227,7 +1227,7 @@ static char *git_header_name(struct apply_state *state,
}
 
/* unquoted first name */
-   name = skip_tree_prefix(state, line, llen);
+   name = skip_tree_prefix(state->p_value, line, llen);
if (!name)
return NULL;
 
@@ -1243,7 +1243,7 @@ static char *git_header_name(struct apply_state *state,
if (unquote_c_style(&sp, second, NULL))
goto free_and_fail2;
 
-   np = skip_tree_prefix(state, sp.buf, sp.len);
+   np = skip_tree_prefix(state->p_value, sp.buf, sp.len);
if (!np)
goto free_and_fail2;
 
@@ -1287,7 +1287,7 @@ static char *git_header_name(struct apply_state *state,
 */
if (!name[len + 1])
return NULL; /* no postimage name */
-   second = skip_tree_prefix(state, name + len + 1,
+   second = skip_tree_prefix(state->p_value, name + len + 1,
  line_len - (len + 1));
if (!second)
return NULL;
-- 
2.22.0.510.g264f2c817a



[PATCH v2 00/14] output improvements for git range-diff

2019-07-05 Thread Thomas Gummerer
  4:  a63e992 ! 4:  d966c5c s/12/B/
+   @@ -8,7 +8,7 @@
+-   @@
++   @@ A
+ 9
+ 10
+   - B
+@@ t/t3206-range-diff.sh: test_expect_success 'changed commit with sm config' '
+ 14
+   4:  a63e992 ! 4:  d966c5c s/12/B/
+   @@ -8,7 +8,7 @@
+-   @@
++   @@ A
+ 9
+ 10
+   - B
+@@ t/t3206-range-diff.sh: test_expect_success 'dual-coloring' '
+   :  14
+   :4:  d966c5c ! 4:  8add5f1 s/12/B/
+   :@@ -8,7 +8,7 @@
+-  : @@
++  : @@ A
+   :  9
+   :  10
+   :- BB
 3:  71679cb747 <  -:  -- range-diff: add section header instead of diff header
 -:  -- > 11:  69654fe76d range-diff: suppress line count in outer diff
 -:  -- > 12:  c38f929b9a range-diff: add section header instead of diff header
 -:  -- > 13:  6df03ecdcf range-diff: add filename to inner diff
 4:  8f45b6a995 ! 14:  5ceef49035 range-diff: add section headers to the outer hunk header
@@ Metadata
 Author: Thomas Gummerer 
 
  ## Commit message ##
-range-diff: add section headers to the outer hunk header
+range-diff: add headers to the outer hunk header
 
-Add the section headers we introduced in the previous commit to the
-outer diff's hunk headers.  This makes it easier to understand which
-change we are actually looking at.  For example an outer hunk header
-might now look like:
+Add the section headers/hunk headers we introduced in the previous
+commits to the outer diff's hunk headers.  This makes it easier to
+understand which change we are actually looking at.  For example an
+outer hunk header might now look like:
 
-@@ -77,15 +78,43 @@ modified file 
Documentation/config/interactive.txt
+@@  Documentation/config/interactive.txt
 
 while previously it would have only been
 
-@@ -77,15 +78,43 @@
+@@
 
 which doesn't give a lot of context for the change that follows.
 
@@ Commit message
 
  ## range-diff.c ##
 @@ range-diff.c: static int read_patches(const char *range, struct string_list *list)
-   strbuf_addstr(&buf, " ##\n");
+   strbuf_addstr(&buf, " ##");
} else if (in_header) {
-   if (starts_with(line.buf, "Author: ")) {
+   if (starts_with(line, "Author: ")) {
 +  strbuf_addstr(&buf, " ## Metadata ##\n");
-   strbuf_addbuf(&buf, &line);
+   strbuf_addstr(&buf, line);
strbuf_addstr(&buf, "\n\n");
 +  strbuf_addstr(&buf, " ## Commit message ##\n");
-   } else if (starts_with(line.buf, "")) {
-   strbuf_rtrim(&line);
-   strbuf_addbuf(&buf, &line);
+   } else if (starts_with(line, "")) {
+   p = line + len - 2;
+   while (isspace(*p) && p >= line)
 @@ range-diff.c: static void output_pair_header(struct diff_options *diffopt,
fwrite(buf->buf, buf->len, 1, diffopt->file);
  }
@@ range-diff.c: static void output_pair_header(struct diff_options *diffopt,
 -static struct userdiff_driver no_func_name = {
 -  .funcname = { "$^", 0 }
 +static struct userdiff_driver section_headers = {
-+  .funcname = { "^ ## (.*) ##$", REG_EXTENDED }
++  .funcname = { "^ ## (.*) ##$\n"
++"^.?@@ (.*)$", REG_EXTENDED }
  };
  
  static struct diff_filespec *get_filespec(const char *name, const char *p)
@@ range-diff.c: static struct diff_filespec *get_filespec(const char *name, const
  
return spec;
  }
+
+ ## t/t3206-range-diff.sh ##
+@@ t/t3206-range-diff.sh: test_expect_success 'changed commit' '
+   1:  4de457d = 1:  a4b s/5/A/
+   2:  fccce22 = 2:  f51d370 s/4/A/
+   3:  147e64e ! 3:  0559556 s/11/B/
+-  @@
++  @@ file: A
+ 9
+ 10
+-11
+@@ t/t3206-range-diff.sh: test_expect_success 'changed commit' '
+ 13
+ 14
+   4:  a63e992 ! 4:  d966c5c s/12/B/
+-  @@
++  @@ file
+@@ file: A
+ 9
+ 10
+@@ t/t3206-range-diff.sh: test_expect_success 'changed commit with sm config' '
+   1:  4de457d = 1:  a4b s/5/A/
  

[PATCH v2 12/14] range-diff: add section header instead of diff header

2019-07-05 Thread Thomas Gummerer
Currently range-diff keeps the diff header of the inner diff
intact (apart from stripping lines starting with index).  This diff
header is somewhat useful, especially when files get different
names in different ranges.

However there is no real need to keep the whole diff header for that.
The main reason we currently do that is probably because it is easy to
do.

Introduce a new range diff hunk header, that's enclosed by "##",
similar to how line numbers in diff hunks are enclosed by "@@", and
give human readable information of what exactly happened to the file,
including the file name.

This improves the readability of the range-diff by giving more concise
information to the users.  For example if a file was renamed in one
iteration, but not in another, the diff of the headers would be quite
noisy.  However the diff of a single line is concise and should be
easier to understand.

Additionally, this allows us to add these range diff section headers to
the outer diff's hunk headers using a custom userdiff pattern, which
should help make the range-diff more readable.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c   | 35 
 t/t3206-range-diff.sh  | 91 +++---
 t/t3206/history.export | 84 --
 3 files changed, 193 insertions(+), 17 deletions(-)
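
To make the new header format concrete, here is a minimal standalone
sketch that builds the " ## ... ##" line from a parsed header.
'struct section_patch' is a hypothetical subset of 'struct patch' and
'section_header_sketch()' is a made-up name; snprintf() stands in for
the strbuf calls, and the branch order follows the diff below.  The
sketch assumes the name fields that a branch uses are non-NULL.

```c
#include <stdio.h>

/* Hypothetical subset of 'struct patch' with only the fields used here. */
struct section_patch {
	const char *old_name, *new_name;
	int is_new, is_delete, is_rename;
	unsigned int old_mode, new_mode;
};

/* Write the range-diff section header for p into out (n bytes). */
static void section_header_sketch(char *out, size_t n,
				  const struct section_patch *p)
{
	char name[256];
	char mode[64] = "";

	if (p->is_new > 0)
		snprintf(name, sizeof(name), "%s (new)", p->new_name);
	else if (p->is_delete > 0)
		snprintf(name, sizeof(name), "%s (deleted)", p->old_name);
	else if (p->is_rename)
		snprintf(name, sizeof(name), "%s => %s",
			 p->old_name, p->new_name);
	else
		snprintf(name, sizeof(name), "%s", p->new_name);

	/* Only mention the mode when both sides are known and differ. */
	if (p->new_mode && p->old_mode && p->old_mode != p->new_mode)
		snprintf(mode, sizeof(mode), " (mode change %06o => %06o)",
			 p->old_mode, p->new_mode);

	snprintf(out, n, " ## %s%s ##", name, mode);
}
```

A rename of "file" to "renamed-file" then collapses to the single line
" ## file => renamed-file ##" instead of a multi-line diff header.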

diff --git a/range-diff.c b/range-diff.c
index b31fbab026..cc01f7f573 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -10,6 +10,7 @@
 #include "commit.h"
 #include "pretty.h"
 #include "userdiff.h"
+#include "apply.h"
 
 struct patch_util {
/* For the search for an exact match */
@@ -95,12 +96,36 @@ static int read_patches(const char *range, struct string_list *list)
}
 
if (starts_with(line, "diff --git")) {
+   struct patch patch;
+   struct strbuf root = STRBUF_INIT;
+   int linenr = 0;
+
in_header = 0;
strbuf_addch(&buf, '\n');
if (!util->diff_offset)
util->diff_offset = buf.len;
-   strbuf_addch(&buf, ' ');
-   strbuf_addstr(&buf, line);
+   memset(&patch, 0, sizeof(patch));
+   line[len - 1] = '\n';
+   len = parse_git_header(&root, &linenr, 1, line,
+  len, size, &patch);
+   if (len < 0)
+   die(_("could not parse git header"));
+   strbuf_addstr(&buf, " ## ");
+   if (patch.is_new > 0)
+   strbuf_addf(&buf, "%s (new)", patch.new_name);
+   else if (patch.is_delete > 0)
+   strbuf_addf(&buf, "%s (deleted)", patch.old_name);
+   else if (patch.is_rename)
+   strbuf_addf(&buf, "%s => %s", patch.old_name, patch.new_name);
+   else
+   strbuf_addstr(&buf, patch.new_name);
+
+   if (patch.new_mode && patch.old_mode &&
+   patch.old_mode != patch.new_mode)
+   strbuf_addf(&buf, " (mode change %06o => %06o)",
+   patch.old_mode, patch.new_mode);
+
+   strbuf_addstr(&buf, " ##");
} else if (in_header) {
if (starts_with(line, "Author: ")) {
strbuf_addstr(&buf, line);
@@ -117,17 +142,13 @@ static int read_patches(const char *range, struct string_list *list)
if (!(p = strstr(p, "@@")))
die(_("invalid hunk header in inner diff"));
strbuf_addstr(&buf, p);
-   } else if (!line[0] || starts_with(line, "index "))
+   } else if (!line[0])
/*
 * A completely blank (not ' \n', which is context)
 * line is not valid in a diff.  We skip it
 * silently, because this neatly handles the blank
 * separator line between commits in git-log
 * output.
-*
-* We also want to ignore the diff's `index` lines
-* because they contain exact blob hashes in which
-* we are not interested.
 */

[PATCH v2 06/14] apply: only pass required data to gitdiff_* functions

2019-07-05 Thread Thomas Gummerer
Currently the 'gitdiff_*()' functions take 'struct apply_state' as a
parameter, even though they only need the root, linenr and p_value
from that struct.

These functions are in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit.  To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.

Signed-off-by: Thomas Gummerer 
---
 apply.c | 59 ++---
 1 file changed, 35 insertions(+), 24 deletions(-)
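
As a rough illustration of the pattern, the sketch below dispatches
header lines through a table of handlers that all take the same small
state struct.  'struct header_state_sketch' mirrors the three members
of the new 'struct parse_git_header_state' (with a plain string for
root); 'struct mode_patch', the handler names and
'dispatch_header_line()' are hypothetical, only two of the header
keywords are covered, and fprintf() stands in for git's error().

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The only data the handlers get to see. */
struct header_state_sketch {
	const char *root;
	int linenr;
	int p_value;
};

/* Hypothetical subset of 'struct patch'. */
struct mode_patch {
	unsigned int old_mode, new_mode;
};

typedef int (*header_fn)(struct header_state_sketch *state,
			 const char *line, struct mode_patch *patch);

/* Parse an octal mode followed by whitespace or end of string. */
static int parse_mode_sketch(const char *line, int linenr, unsigned int *mode)
{
	char *end;

	*mode = (unsigned int)strtoul(line, &end, 8);
	if (end == line || (*end && !isspace((unsigned char)*end))) {
		fprintf(stderr, "invalid mode on line %d: %s\n", linenr, line);
		return -1;
	}
	return 0;
}

static int handle_old_mode(struct header_state_sketch *state,
			   const char *line, struct mode_patch *patch)
{
	return parse_mode_sketch(line, state->linenr, &patch->old_mode);
}

static int handle_new_mode(struct header_state_sketch *state,
			   const char *line, struct mode_patch *patch)
{
	return parse_mode_sketch(line, state->linenr, &patch->new_mode);
}

/* Returns the handler's result, or 1 if no keyword matched. */
static int dispatch_header_line(struct header_state_sketch *state,
				const char *line, struct mode_patch *patch)
{
	static const struct {
		const char *str;
		header_fn fn;
	} optable[] = {
		{ "old mode ", handle_old_mode },
		{ "new mode ", handle_new_mode },
	};
	size_t i;

	for (i = 0; i < sizeof(optable) / sizeof(optable[0]); i++) {
		size_t oplen = strlen(optable[i].str);

		if (strncmp(line, optable[i].str, oplen))
			continue;
		return optable[i].fn(state, line + oplen, patch);
	}
	return 1;
}
```

Since every handler shares one signature, narrowing the state type in
one place is enough to decouple all of them from 'struct apply_state'.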

diff --git a/apply.c b/apply.c
index 3cd4e3d3b3..468f1d3fee 100644
--- a/apply.c
+++ b/apply.c
@@ -22,6 +22,12 @@
 #include "rerere.h"
 #include "apply.h"
 
+struct parse_git_header_state {
+   struct strbuf *root;
+   int linenr;
+   int p_value;
+};
+
 static void git_apply_config(void)
 {
git_config_get_string_const("apply.whitespace", &apply_default_whitespace);
@@ -914,7 +920,7 @@ static int parse_traditional_patch(struct apply_state *state,
return 0;
 }
 
-static int gitdiff_hdrend(struct apply_state *state,
+static int gitdiff_hdrend(struct parse_git_header_state *state,
  const char *line,
  struct patch *patch)
 {
@@ -933,14 +939,14 @@ static int gitdiff_hdrend(struct apply_state *state,
 #define DIFF_OLD_NAME 0
 #define DIFF_NEW_NAME 1
 
-static int gitdiff_verify_name(struct apply_state *state,
+static int gitdiff_verify_name(struct parse_git_header_state *state,
   const char *line,
   int isnull,
   char **name,
   int side)
 {
if (!*name && !isnull) {
-   *name = find_name(&state->root, line, NULL, state->p_value, TERM_TAB);
+   *name = find_name(state->root, line, NULL, state->p_value, TERM_TAB);
return 0;
}
 
@@ -949,7 +955,7 @@ static int gitdiff_verify_name(struct apply_state *state,
if (isnull)
return error(_("git apply: bad git-diff - expected /dev/null, got %s on line %d"),
 *name, state->linenr);
-   another = find_name(&state->root, line, NULL, state->p_value, TERM_TAB);
+   another = find_name(state->root, line, NULL, state->p_value, TERM_TAB);
if (!another || strcmp(another, *name)) {
free(another);
return error((side == DIFF_NEW_NAME) ?
@@ -965,7 +971,7 @@ static int gitdiff_verify_name(struct apply_state *state,
return 0;
 }
 
-static int gitdiff_oldname(struct apply_state *state,
+static int gitdiff_oldname(struct parse_git_header_state *state,
   const char *line,
   struct patch *patch)
 {
@@ -974,7 +980,7 @@ static int gitdiff_oldname(struct apply_state *state,
   DIFF_OLD_NAME);
 }
 
-static int gitdiff_newname(struct apply_state *state,
+static int gitdiff_newname(struct parse_git_header_state *state,
   const char *line,
   struct patch *patch)
 {
@@ -992,21 +998,21 @@ static int parse_mode_line(const char *line, int linenr, unsigned int *mode)
return 0;
 }
 
-static int gitdiff_oldmode(struct apply_state *state,
+static int gitdiff_oldmode(struct parse_git_header_state *state,
   const char *line,
   struct patch *patch)
 {
return parse_mode_line(line, state->linenr, &patch->old_mode);
 }
 
-static int gitdiff_newmode(struct apply_state *state,
+static int gitdiff_newmode(struct parse_git_header_state *state,
   const char *line,
   struct patch *patch)
 {
return parse_mode_line(line, state->linenr, &patch->new_mode);
 }
 
-static int gitdiff_delete(struct apply_state *state,
+static int gitdiff_delete(struct parse_git_header_state *state,
  const char *line,
  struct patch *patch)
 {
@@ -1016,7 +1022,7 @@ static int gitdiff_delete(struct apply_state *state,
return gitdiff_oldmode(state, line, patch);
 }
 
-static int gitdiff_newfile(struct apply_state *state,
+static int gitdiff_newfile(struct parse_git_header_state *state,
   const char *line,
   struct patch *patch)
 {
@@ -1026,47 +1032,47 @@ static int gitdiff_newfile(struct apply_state *state,
return gitdiff_newmode(state, line, patch);
 }
 
-static int gitdiff_copysrc(struct apply_state *state,
+static int gitdi

[PATCH v2 13/14] range-diff: add filename to inner diff

2019-07-05 Thread Thomas Gummerer
In a range-diff it's not always clear which file a certain funcname of
the inner diff belongs to, because the diff header (or section header
as added in a previous commit) is not always visible in the
range-diff.

Add the filename to the inner diff's header, so it's always visible to
users.

This also allows us to add the filename + the funcname to the outer
diff's hunk headers using a custom userdiff pattern, which will be done
in the next commit.

Signed-off-by: Thomas Gummerer 
---
 range-diff.c  | 14 --
 t/t3206-range-diff.sh | 16 ++--
 2 files changed, 22 insertions(+), 8 deletions(-)
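
For illustration, here is a minimal standalone sketch of how an inner
hunk header is rewritten.  'inner_hunk_header_sketch()' is a
hypothetical name; it uses snprintf() instead of strbufs and returns -1
where the patch calls die(), but otherwise follows the logic in the
diff below: the line numbers are dropped, and the filename is only
added when a funcname follows the closing "@@".

```c
#include <stdio.h>
#include <string.h>

/*
 * Rewrite "@@ -a,b +c,d @@ funcname" as "@@ filename: funcname", or as
 * a bare "@@" when there is no funcname.  Returns 0 on success and -1
 * if line is not a hunk header.
 */
static int inner_hunk_header_sketch(char *out, size_t n,
				    const char *line, const char *filename)
{
	const char *p;

	if (strncmp(line, "@@ ", 3))
		return -1;
	p = strstr(line + 3, "@@");
	if (!p)
		return -1;
	if (filename && p[2])
		snprintf(out, n, "@@ %s:%s", filename, p + 2);
	else
		snprintf(out, n, "@@%s", p + 2);
	return 0;
}
```

Given "@@ -8,7 +8,7 @@ A" and the filename "file", the result is
"@@ file: A", which is the form the updated tests expect.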

diff --git a/range-diff.c b/range-diff.c
index cc01f7f573..09cb1ddbb1 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -46,7 +46,7 @@ static int read_patches(const char *range, struct string_list *list)
struct strbuf buf = STRBUF_INIT, file = STRBUF_INIT;
struct patch_util *util = NULL;
int in_header = 1;
-   char *line;
+   char *line, *current_filename = NULL;
int offset, len;
size_t size;
 
@@ -120,6 +120,12 @@ static int read_patches(const char *range, struct string_list *list)
else
strbuf_addstr(&buf, patch.new_name);
 
+   free(current_filename);
+   if (patch.is_delete > 0)
+   current_filename = xstrdup(patch.old_name);
+   else
+   current_filename = xstrdup(patch.new_name);
+
if (patch.new_mode && patch.old_mode &&
patch.old_mode != patch.new_mode)
strbuf_addf(&buf, " (mode change %06o => %06o)",
@@ -141,7 +147,10 @@ static int read_patches(const char *range, struct string_list *list)
} else if (skip_prefix(line, "@@ ", &p)) {
if (!(p = strstr(p, "@@")))
die(_("invalid hunk header in inner diff"));
-   strbuf_addstr(&buf, p);
+   strbuf_addstr(&buf, "@@");
+   if (current_filename && p[2])
+   strbuf_addf(&buf, " %s:", current_filename);
+   strbuf_addstr(&buf, p + 2);
} else if (!line[0])
/*
 * A completely blank (not ' \n', which is context)
@@ -172,6 +181,7 @@ static int read_patches(const char *range, struct string_list *list)
if (util)
string_list_append(list, buf.buf)->util = util;
strbuf_release(&buf);
+   free(current_filename);
 
if (finish_command(&cp))
return -1;
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index c277756057..d4de270979 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -110,7 +110,7 @@ test_expect_success 'changed commit' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@
-@@ A
+@@ file: A
  9
  10
- B
@@ -169,7 +169,7 @@ test_expect_success 'changed commit with sm config' '
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
@@
-@@ A
+@@ file: A
  9
  10
- B
@@ -203,20 +203,24 @@ test_expect_success 'renamed file' '
Zs/11/B/
Z
- ## file ##
+   -@@ file: A
+ ## renamed-file ##
-   Z@@ A
+   +@@ renamed-file: A
Z 8
Z 9
+   Z 10
4:  a63e992 ! 4:  1e6226b s/12/B/
@@
Z
Zs/12/B/
Z
- ## file ##
+   -@@ file: A
+ ## renamed-file ##
-   Z@@ A
+   +@@ renamed-file: A
Z 9
Z 10
+   Z B
EOF
test_cmp expected actual
 '
@@ -248,7 +252,7 @@ test_expect_success 'file added and later removed' '
+s/11/B/ + remove file
Z
Z ## file ##
-   Z@@ A
+   Z@@ file: A
@@
Z 12
Z 13
@@ -310,7 +314,7 @@ test_expect_success 'dual-coloring' '
:  14
:4:  d966c5c ! 4:  8add5f1 s/12/B/
:@@
-   : @@ A
+   : @@ file: A
:  9
:  10
:- BB
-- 
2.22.0.510.g264f2c817a
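
For illustration, the hunk-header rewrite this patch performs can be sketched outside of git roughly as follows.  This is only a sketch under stated assumptions: `rewrite_hunk_header` is a hypothetical helper, and it writes into a fixed caller-supplied buffer instead of a strbuf.  The logic follows the three strbuf calls in the diff above: emit "@@", add " <filename>:" only when a funcname follows the closing "@@", then append the rest of the line.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: rewrite an inner-diff hunk
 * header such as "@@ -8,7 +8,7 @@ A" into "@@ file: A".
 * Returns 0 on success, -1 if the line is not a well-formed hunk header.
 */
int rewrite_hunk_header(const char *line, const char *filename,
			char *out, size_t outlen)
{
	const char *p;

	if (strncmp(line, "@@ ", 3))
		return -1;
	/* find the "@@" that closes the line-range part */
	p = strstr(line + 3, "@@");
	if (!p)
		return -1;
	/* p[2] is non-NUL only when something (the funcname) follows */
	if (filename && p[2])
		snprintf(out, outlen, "@@ %s:%s", filename, p + 2);
	else
		snprintf(out, outlen, "@@%s", p + 2);
	return 0;
}
```

With the input "@@ -8,7 +8,7 @@ A" and the filename "file", the helper produces "@@ file: A", which is the form the updated expectations in t3206 use.  A header without a funcname is reduced to a bare "@@".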



[PATCH v2 07/14] apply: make parse_git_header public

2019-07-05 Thread Thomas Gummerer
Make parse_git_header a "public" function in apply.h, so we can re-use
it in range-diff in a subsequent commit.

Signed-off-by: Thomas Gummerer 
---

I considered creating a separate struct for only the metadata here,
and embedding that in 'struct patch'.  As struct patch is mostly
metadata fields though, I decided against that to avoid more code
churn here.

 apply.c | 68 -
 apply.h | 48 
 2 files changed, 67 insertions(+), 49 deletions(-)

diff --git a/apply.c b/apply.c
index 468f1d3fee..04319c233f 100644
--- a/apply.c
+++ b/apply.c
@@ -207,40 +207,6 @@ struct fragment {
 #define BINARY_DELTA_DEFLATED  1
 #define BINARY_LITERAL_DEFLATED 2
 
-/*
- * This represents a "patch" to a file, both metainfo changes
- * such as creation/deletion, filemode and content changes represented
- * as a series of fragments.
- */
-struct patch {
-   char *new_name, *old_name, *def_name;
-   unsigned int old_mode, new_mode;
-   int is_new, is_delete;  /* -1 = unknown, 0 = false, 1 = true */
-   int rejected;
-   unsigned ws_rule;
-   int lines_added, lines_deleted;
-   int score;
-   int extension_linenr; /* first line specifying delete/new/rename/copy */
-   unsigned int is_toplevel_relative:1;
-   unsigned int inaccurate_eof:1;
-   unsigned int is_binary:1;
-   unsigned int is_copy:1;
-   unsigned int is_rename:1;
-   unsigned int recount:1;
-   unsigned int conflicted_threeway:1;
-   unsigned int direct_to_threeway:1;
-   unsigned int crlf_in_old:1;
-   struct fragment *fragments;
-   char *result;
-   size_t resultsize;
-   char old_oid_prefix[GIT_MAX_HEXSZ + 1];
-   char new_oid_prefix[GIT_MAX_HEXSZ + 1];
-   struct patch *next;
-
-   /* three-way fallback result */
-   struct object_id threeway_stage[3];
-};
-
 static void free_fragment_list(struct fragment *list)
 {
while (list) {
@@ -1321,11 +1287,13 @@ static int check_header_line(int linenr, struct patch *patch)
 }
 
 /* Verify that we recognize the lines following a git header */
-static int parse_git_header(struct apply_state *state,
-   const char *line,
-   int len,
-   unsigned int size,
-   struct patch *patch)
+int parse_git_header(struct strbuf *root,
+int *linenr,
+int p_value,
+const char *line,
+int len,
+unsigned int size,
+struct patch *patch)
 {
unsigned long offset;
struct parse_git_header_state parse_hdr_state;
@@ -1340,21 +1308,21 @@ static int parse_git_header(struct apply_state *state,
 * or removing or adding empty files), so we get
 * the default name from the header.
 */
-   patch->def_name = git_header_name(state->p_value, line, len);
-   if (patch->def_name && state->root.len) {
-   char *s = xstrfmt("%s%s", state->root.buf, patch->def_name);
+   patch->def_name = git_header_name(p_value, line, len);
+   if (patch->def_name && root->len) {
+   char *s = xstrfmt("%s%s", root->buf, patch->def_name);
free(patch->def_name);
patch->def_name = s;
}
 
line += len;
size -= len;
-   state->linenr++;
-   parse_hdr_state.root = &state->root;
-   parse_hdr_state.linenr = state->linenr;
-   parse_hdr_state.p_value = state->p_value;
+   (*linenr)++;
+   parse_hdr_state.root = root;
+   parse_hdr_state.linenr = *linenr;
+   parse_hdr_state.p_value = p_value;
 
-   for (offset = len ; size > 0 ; offset += len, size -= len, line += len, state->linenr++) {
+   for (offset = len ; size > 0 ; offset += len, size -= len, line += len, (*linenr)++) {
static const struct opentry {
const char *str;
int (*fn)(struct parse_git_header_state *, const char *, struct patch *);
@@ -1391,7 +1359,7 @@ static int parse_git_header(struct apply_state *state,
res = p->fn(&parse_hdr_state, line + oplen, patch);
if (res < 0)
return -1;
-   if (check_header_line(state->linenr, patch))
+   if (check_header_line(*linenr, patch))
return -1;
if (res > 0)
return offset;
@@ -1572,7 +1540,9 @@ static int find_header(struct apply_state *state,
 * or mode change, so we handle that specially
 */
if (!memcmp("diff --git ", line, 1

[PATCH v2 11/14] range-diff: suppress line count in outer diff

2019-07-05 Thread Thomas Gummerer
The line count in the outer diff's hunk headers of a range-diff is not
all that interesting.  It merely shows how far along the inner diffs
are on both sides.  That number is of no use for human readers, and
range-diffs are not meant to be machine readable.

In a subsequent commit we're going to add some more contextual
information such as the filename corresponding to the diff to the hunk
headers.  Remove the unnecessary information, and just keep the "@@"
to indicate that a new hunk of the outer diff is starting.

Signed-off-by: Thomas Gummerer 
---
 diff.c|  5 -
 diff.h|  1 +
 range-diff.c  |  1 +
 t/t3206-range-diff.sh | 16 
 4 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/diff.c b/diff.c
index ec5c095199..9c28ff0a92 100644
--- a/diff.c
+++ b/diff.c
@@ -1672,7 +1672,10 @@ static void emit_hunk_header(struct emit_callback *ecbdata,
if (ecbdata->opt->flags.dual_color_diffed_diffs)
strbuf_addstr(&msgbuf, reverse);
strbuf_addstr(&msgbuf, frag);
-   strbuf_add(&msgbuf, line, ep - line);
+   if (ecbdata->opt->flags.suppress_hunk_header_line_count)
+   strbuf_add(&msgbuf, atat, sizeof(atat));
+   else
+   strbuf_add(&msgbuf, line, ep - line);
strbuf_addstr(&msgbuf, reset);
 
/*
diff --git a/diff.h b/diff.h
index c9db9825bb..49913049f9 100644
--- a/diff.h
+++ b/diff.h
@@ -98,6 +98,7 @@ struct diff_flags {
unsigned stat_with_summary;
unsigned suppress_diff_headers;
unsigned dual_color_diffed_diffs;
+   unsigned suppress_hunk_header_line_count;
 };
 
 static inline void diff_flags_or(struct diff_flags *a,
diff --git a/range-diff.c b/range-diff.c
index 484b1ec5a9..b31fbab026 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -480,6 +480,7 @@ int show_range_diff(const char *range1, const char *range2,
opts.output_format = DIFF_FORMAT_PATCH;
opts.flags.suppress_diff_headers = 1;
opts.flags.dual_color_diffed_diffs = dual_color;
+   opts.flags.suppress_hunk_header_line_count = 1;
opts.output_prefix = output_prefix_cb;
strbuf_addstr(&indent, "");
opts.output_prefix_data = &indent;
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index aebd4e3693..9f89af7178 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -99,7 +99,7 @@ test_expect_success 'changed commit' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@ -10,7 +10,7 @@
+   @@
  9
  10
 -11
@@ -109,7 +109,7 @@ test_expect_success 'changed commit' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@ -8,7 +8,7 @@
+   @@
 @@ A
  9
  10
@@ -158,7 +158,7 @@ test_expect_success 'changed commit with sm config' '
1:  4de457d = 1:  a4b s/5/A/
2:  fccce22 = 2:  f51d370 s/4/A/
3:  147e64e ! 3:  0559556 s/11/B/
-   @@ -10,7 +10,7 @@
+   @@
  9
  10
 -11
@@ -168,7 +168,7 @@ test_expect_success 'changed commit with sm config' '
  13
  14
4:  a63e992 ! 4:  d966c5c s/12/B/
-   @@ -8,7 +8,7 @@
+   @@
 @@ A
  9
  10
@@ -191,7 +191,7 @@ test_expect_success 'changed message' '
sed s/Z/\ /g >expected <<-EOF &&
1:  4de457d = 1:  f686024 s/5/A/
2:  fccce22 ! 2:  4ab067d s/4/A/
-   @@ -2,6 +2,8 @@
+   @@
Z
Zs/4/A/
Z
@@ -210,7 +210,7 @@ test_expect_success 'dual-coloring' '
sed -e "s|^:||" >expect <<-\EOF &&
:1:  a4b = 1:  f686024 s/5/A/
:2:  f51d370 ! 2:  4ab067d s/4/A/
-   :@@ -2,6 +2,8 @@
+   :@@
: 
: s/4/A/
: 
@@ -220,7 +220,7 @@ test_expect_success 'dual-coloring' '
:  --- a/file
:  +++ b/file
:3:  0559556 ! 3:  b9cb956 s/11/B/
-   :@@ -10,7 +10,7 @@
+   :@@
:  9
:  10
: -11
@@ -230,7 +230,7 @@ test_expect_success 'dual-coloring' '
:  13
:  14
:4:  d966c5c ! 4:  8add5f1 s/12/B/
-   :@@ -8,7 +8,7 @@
+   :@@
: @@ A
:  9
:  10
-- 
2.22.0.510.g264f2c817a



Re: What's cooking in git.git (Jun 2019, #06; Wed, 26)

2019-06-28 Thread Thomas Gummerer
On 06/26, Junio C Hamano wrote:
> * ra/cherry-pick-revert-skip (2019-06-24) 6 commits
>  - cherry-pick/revert: advise using --skip
>  - cherry-pick/revert: add --skip option
>  - sequencer: use argv_array in reset_merge
>  - sequencer: rename reset_for_rollback to reset_merge
>  - sequencer: add advice for revert
>  - advice: add sequencerInUse config variable
> 
>  "git cherry-pick/revert" learned a new "--skip" action.
> 
>  Is this one ready for 'next'?

Yes, I believe this is ready for 'next'.  I had a look at the latest
round, and only had a minor comment on the organization of the patch
series that is probably not worth a re-roll.

I also added Phillip to Cc, as he's been heavily involved in reviewing
this series, in case he has any more comments.


Re: [PATCH] doc: fix form -> from typo

2019-06-26 Thread Thomas Gummerer
On 06/25, Martin Ågren wrote:
> Hi Catalin
> 
> Welcome to the list!
> 
> On Tue, 25 Jun 2019 at 09:43, Catalin Criste  wrote:
> 
> > @@ -88,7 +88,7 @@ save [-p|--patch] [-k|--[no-]keep-index] 
> > [-u|--include-untracked] [-a|--all] [-q
> >
> > This option is deprecated in favour of 'git stash push'.  It
> > differs from "stash push" in that it cannot take pathspecs,
> > -   and any non-option arguments form the message.
> > +   and any non-option arguments from the message.
> 
> I think this is actually intended as "form". It took me a couple of
> readings, but what this paragraph wants to say is that any non-option
> arguments will be used to form (construct) the message.
> 
> Do you have any suggestions as to how this could be made clearer?
> There are at least two of us that have stumbled on this. :-)

Even though I originally wrote this, I had to do a double take on it.
Maybe what you're saying above would be good?  "and any non-option
arguments are used to form the message" sounds a little clearer to
me.  Or maybe even use the word construct?  Dunno.
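
To make the intended meaning concrete: with something like 'git stash save fix the parser', the three words are not pathspecs but together become the stash message.  Below is a minimal C sketch of that concatenation, assuming the arguments are joined with single spaces.  `join_args` is a hypothetical helper written for this illustration, not git's implementation.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: join argv[0..argc-1] with single spaces into
 * out.  Returns 0 on success, -1 if the result would not fit.
 */
int join_args(int argc, const char **argv, char *out, size_t outlen)
{
	size_t used = 0;
	int i;

	if (!outlen)
		return -1;
	out[0] = '\0';
	for (i = 0; i < argc; i++) {
		/* prefix every argument but the first with a space */
		int n = snprintf(out + used, outlen - used, "%s%s",
				 i ? " " : "", argv[i]);
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += n;
	}
	return 0;
}
```

Called with the arguments "fix", "the" and "parser", the helper fills the buffer with "fix the parser".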


Re: [PATCH v2 00/10] Add 'ls-files --debug-json' to dump the index in json

2019-06-25 Thread Thomas Gummerer
On 06/25, Duy Nguyen wrote:
> On Tue, Jun 25, 2019 at 1:00 AM Johannes Schindelin
>  wrote:
> > especially when we offer this as a better way
> > for 3rd-party applications to interact with Git (which I think will be the
> > use case for this feature that will be _far_ more common than using it for
> > debugging).
> 
> We may have conflicting goals. For me, first priority is the debug
> tool for Git developers. 3rd-party support is a stretch. I could move
> all this back to test-tool, then you can provide a 3rd-party API if
> you want. Or I'll withdraw this series and go back to my original
> plan.

FWIW, I am very much in favor of this series, and would have found
something like this very useful many times in the past when I was
digging into the index code.  So I'd be more than happy to just have
this as a debug tool, rather than as a 3rd-party API.

I'd also be fine with this living in test-tool.  As long as it's
somewhere in git.git where it's easily usable, I'd find it helpful.

Thanks for working on this!


Re: [GSoC][PATCH v7 1/6] advice: add sequencerInUse config variable

2019-06-25 Thread Thomas Gummerer
On 06/24, Rohit Ashiwal wrote:
> Calls to advise() which are not guarded by advice.* config variables
> are "bad" as they do not let the user say, "I've learned this part
> of Git enough, please don't tell me what to do verbosely.". Add a
> configuration variable "sequencerInUse" which controls whether to
> display advice when any sequencer command is in progress.

It would be nice if this patch not only introduced this config
variable, but also started making use of it.  That would make it
immediately clear why this variable is useful.  Otherwise the commit
message should state that this is only useful in a future commit.

Not sure that's worth a reroll by itself though.
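
For reference, the usual pattern for an advice.* variable is to check it right before calling advise().  Below is a self-contained sketch of what such a use could look like.  The `advise()` function here is a stand-in that only counts and prints, and `report_sequencer_in_use` is a hypothetical caller.  How the later patches in the series actually consume the variable may differ.

```c
#include <stdio.h>

/* Stand-in for the variable the patch adds to advice.c (default: on). */
int advice_sequencer_in_use = 1;

/* Stand-in for git's advise(); counts calls so the effect is observable. */
int advise_calls;

void advise(const char *msg)
{
	advise_calls++;
	fprintf(stderr, "hint: %s\n", msg);
}

/* Hypothetical caller: the error is always shown, the hint is optional. */
void report_sequencer_in_use(void)
{
	fprintf(stderr, "error: cherry-pick is already in progress\n");
	if (advice_sequencer_in_use)
		advise("try \"git cherry-pick (--continue | --abort | --quit)\"");
}
```

Setting advice.sequencerInUse to false in the configuration would correspond to clearing the flag, after which only the error line is printed.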

> Signed-off-by: Rohit Ashiwal 
> ---
>  Documentation/config/advice.txt | 2 ++
>  advice.c| 2 ++
>  advice.h| 1 +
>  3 files changed, 5 insertions(+)
> 
> diff --git a/Documentation/config/advice.txt b/Documentation/config/advice.txt
> index ec4f6ae658..1cd9096c98 100644
> --- a/Documentation/config/advice.txt
> +++ b/Documentation/config/advice.txt
> @@ -57,6 +57,8 @@ advice.*::
>   resolveConflict::
>   Advice shown by various commands when conflicts
>   prevent the operation from being performed.
> + sequencerInUse::
> + Advice shown when a sequencer command is already in progress.
>   implicitIdentity::
>   Advice on how to set your identity configuration when
>   your information is guessed from the system username and
> diff --git a/advice.c b/advice.c
> index ce5f374ecd..b101f0c264 100644
> --- a/advice.c
> +++ b/advice.c
> @@ -15,6 +15,7 @@ int advice_status_u_option = 1;
>  int advice_commit_before_merge = 1;
>  int advice_reset_quiet_warning = 1;
>  int advice_resolve_conflict = 1;
> +int advice_sequencer_in_use = 1;
>  int advice_implicit_identity = 1;
>  int advice_detached_head = 1;
>  int advice_set_upstream_failure = 1;
> @@ -71,6 +72,7 @@ static struct {
>   { "commitBeforeMerge", &advice_commit_before_merge },
>   { "resetQuiet", &advice_reset_quiet_warning },
>   { "resolveConflict", &advice_resolve_conflict },
> + { "sequencerInUse", &advice_sequencer_in_use },
>   { "implicitIdentity", &advice_implicit_identity },
>   { "detachedHead", &advice_detached_head },
>   { "setupStreamFailure", &advice_set_upstream_failure },
> diff --git a/advice.h b/advice.h
> index e50f02cdfe..ebc838d7bc 100644
> --- a/advice.h
> +++ b/advice.h
> @@ -15,6 +15,7 @@ extern int advice_status_u_option;
>  extern int advice_commit_before_merge;
>  extern int advice_reset_quiet_warning;
>  extern int advice_resolve_conflict;
> +extern int advice_sequencer_in_use;
>  extern int advice_implicit_identity;
>  extern int advice_detached_head;
>  extern int advice_set_upstream_failure;
> -- 
> 2.21.0
> 


Re: [PATCH v2 01/10] ls-files: add --json to dump the index

2019-06-25 Thread Thomas Gummerer
On 06/24, Nguyễn Thái Ngọc Duy wrote:
> So far we don't have a command to basically dump the index file out,
> with all its glory details. Checking some info, for example, stat
> time, usually involves either writing new code or firing up "xxd" and
> decoding values by yourself.
> 
> This --json is supposed to help that. It dumps the index in a human
> readable format but also easy to be processed with tools. And it will
> print almost enough info to reconstruct the index later.
> 
> In this patch we only dump the main part, not extensions. But at the
> end of the series, the entire index is dumped. The end result could be
> very verbose even on a small repository such as git.git.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  Documentation/git-ls-files.txt|  5 +++
>  builtin/ls-files.c| 38 +---
>  cache.h   |  2 +
>  json-writer.c | 22 ++
>  json-writer.h | 23 ++
>  read-cache.c  | 72 ++-
>  t/t3011-ls-files-json.sh (new +x) | 44 +++
>  t/t3011/basic (new)   | 67 
>  8 files changed, 265 insertions(+), 8 deletions(-)
>
> [...]
>
> diff --git a/t/t3011-ls-files-json.sh b/t/t3011-ls-files-json.sh
> new file mode 100755
> index 00..97bcd814be
> --- /dev/null
> +++ b/t/t3011-ls-files-json.sh
> @@ -0,0 +1,44 @@
> +#!/bin/sh
> +
> +test_description='ls-files dumping json'
> +
> +. ./test-lib.sh
> +
> +strip_number() {
> + for name; do
> + echo 's/\("'$name'":\) [0-9]\+/\1 /' >>filter.sed
> + done
> +}
> +
> +strip_string() {
> + for name; do
> + echo 's/\("'$name'":\) ".*"/\1 /' >>filter.sed
> + done
> +}
> +
> +compare_json() {
> + git ls-files --debug-json >json &&
> + sed -f filter.sed json >filtered &&
> + test_cmp "$TEST_DIRECTORY"/t3011/"$1" filtered
> +}
> +
> +test_expect_success 'setup' '
> + mkdir sub &&
> + echo one >one &&
> + git add one &&
> + echo 2 >sub/two &&
> + git add sub/two &&
> +
> + echo intent-to-add >ita &&
> + git add -N ita &&
> +
> + strip_number ctime_sec ctime_nsec mtime_sec mtime_nsec &&
> + strip_number device inode uid gid file_offset ext_size &&
> + strip_string oid ident
> +'
> +
> +test_expect_success 'ls-files --json, main entries' '
> + compare_json basic
> +'
> +
> +test_done
> diff --git a/t/t3011/basic b/t/t3011/basic
> new file mode 100644
> index 00..9436445d90
> --- /dev/null
> +++ b/t/t3011/basic
> @@ -0,0 +1,67 @@
> +{
> +  "version": 3,

This will break the test suite when 'GIT_TEST_INDEX_VERSION' is set to
4 for example.  I think this applies to a few other tests in later
patches as well.

> +  "oid": ,
> +  "mtime_sec": ,
> +  "mtime_nsec": ,
> +  "entries": [
> +{
> +  "id": 0,
> +  "name": "ita",
> +  "mode": "100644",
> +  "flags": 536887296,
> +  "extended_flags": true,
> +  "intent_to_add": true,
> +  "oid": ,
> +  "stat": {
> +"ctime_sec": ,
> +"ctime_nsec": ,
> +"mtime_sec": ,
> +"mtime_nsec": ,
> +"device": ,
> +"inode": ,
> +"uid": ,
> +"gid": ,
> +"size": 0
> +  },
> +  "file_offset": 
> +},
> +{
> +  "id": 1,
> +  "name": "one",
> +  "mode": "100644",
> +  "flags": 0,
> +  "oid": ,
> +  "stat": {
> +"ctime_sec": ,
> +"ctime_nsec": ,
> +"mtime_sec": ,
> +"mtime_nsec": ,
> +"device": ,
> +"inode": ,
> +"uid": ,
> +"gid": ,
> +"size": 4
> +  },
> +  "file_offset": 
> +},
> +{
> +  "id": 2,
> +  "name": "sub/two",
> +  "mode": "100644",
> +  "flags": 0,
> +  "oid": ,
> +  "stat": {
> +"ctime_sec": ,
> +"ctime_nsec": ,
> +"mtime_sec": ,
> +"mtime_nsec": ,
> +"device": ,
> +"inode": ,
> +"uid": ,
> +"gid": ,
> +"size": 2
> +  },
> +  "file_offset": 
> +}
> +  ]
> +}
> -- 
> 2.22.0.rc0.322.g2b0371e29a
> 


Re: [GSoC][PATCH v4 0/4] [GSoC][PATCH 0/3] Teach cherry-pick/revert to skip commits

2019-06-17 Thread Thomas Gummerer
On 06/16, Rohit Ashiwal wrote:
> Yet another iteration of my patch. We have changed the series a little bit. We
> now have a commit that rename `reset_for_rollback` to `reset_merge`. A lot of
> nit-picks were handled in this revision.

Thanks for your work!  I allowed myself to nitpick a bit more at this
stage :)

One other thing I wanted to point out here is range-diff, which can be
helpful to include for the benefit of reviewers that saw this series
before.  See 'man git-range-diff' or the --range-diff flag in 'git
format-patch' for more info on how it works.  It helps to see at a
glance what changed between versions of a series.

For the benefit of other reviewers that might find it helpful, here's
one I generated between v3 and v4 of the series:

1:  8f29142755 ! 1:  99279e617c sequencer: add advice for revert
@@ -25,8 +25,8 @@
 -  error(_("a cherry-pick or revert is already in progress"));
 -  advise(_("try \"git cherry-pick (--continue | --quit | --abort)\""));
 +  enum replay_action action;
-+  const char *in_progress_advice;
 +  const char *in_progress_error = NULL;
++  const char *in_progress_advice = NULL;
 +
 +  if (!sequencer_get_last_command(r, &action)) {
 +  switch (action) {
@@ -41,7 +41,7 @@
 +  _("try \"git cherry-pick (--continue | --abort | --quit)\"");
 +  break;
 +  default:
-+  BUG(_("the control must not reach here"));
++  BUG(_("unexpected action in create_seq_dir"));
 +  }
 +  }
 +  if (in_progress_error) {
-:  -- > 2:  c64aabf2d2 sequencer: rename reset_for_rollback to reset_merge
2:  3bc8678df4 ! 3:  8b483815ca cherry-pick/revert: add --skip option
@@ -27,7 +27,7 @@
 -'git cherry-pick' --continue
 -'git cherry-pick' --quit
 -'git cherry-pick' --abort
-+'git cherry-pick' --continue | --skip | --abort | --quit
++'git cherry-pick' (--continue | --skip | --abort | --quit)
  
  DESCRIPTION
  ---
@@ -42,7 +42,7 @@
 -'git revert' --continue
 -'git revert' --quit
 -'git revert' --abort
-+'git revert' --continue | --skip | --abort | --quit
++'git revert' (--continue | --skip | --abort | --quit)
  
  DESCRIPTION
  ---
@@ -97,10 +97,11 @@
  +++ b/sequencer.c
 @@
  
- static int reset_for_rollback(const struct object_id *oid)
+ static int reset_merge(const struct object_id *oid)
  {
 -  const char *argv[4];/* reset --merge  + NULL */
-+  struct argv_array argv = ARGV_ARRAY_INIT;   /* reset --merge  + NULL */
++  int ret;
++  struct argv_array argv = ARGV_ARRAY_INIT;
  
 -  argv[0] = "reset";
 -  argv[1] = "--merge";
@@ -112,34 +113,29 @@
 +  if (!is_null_oid(oid))
 +  argv_array_push(&argv, oid_to_hex(oid));
 +
-+  return run_command_v_opt(argv.argv, RUN_GIT_CMD);
++  ret = run_command_v_opt(argv.argv, RUN_GIT_CMD);
++  argv_array_clear(&argv);
++
++  return ret;
  }
  
--static int rollback_single_pick(struct repository *r)
-+static int rollback_single_pick(struct repository *r, unsigned int is_skip)
- {
-   struct object_id head_oid;
- 
-   if (!file_exists(git_path_cherry_pick_head(r)) &&
--  !file_exists(git_path_revert_head(r)))
-+  !file_exists(git_path_revert_head(r)) && !is_skip)
-   return error(_("no cherry-pick or revert in progress"));
-   if (read_ref_full("HEAD", 0, &head_oid, NULL))
-   return error(_("cannot resolve HEAD"));
--  if (is_null_oid(&head_oid))
-+  if (is_null_oid(&head_oid) && !is_skip)
-   return error(_("cannot abort from a branch yet to be born"));
-   return reset_for_rollback(&head_oid);
- }
+ static int rollback_single_pick(struct repository *r)
 @@
-* If CHERRY_PICK_HEAD or REVERT_HEAD indicates
-* a single-cherry-pick in progress, abort that.
-*/
--  return rollback_single_pick(r);
-+  return rollback_single_pick(r, 0);
-   }
-   if (!f)
-   return error_errno(_("cannot open '%s'"), git_path_head_file());
+   return reset_merge(&head_oid);
+ }
+ 
++static int skip_single_pick(void)
++{
++  struct object_id head;
++
++  if (read_ref_full("HEAD", 0, &head, NULL))
++  return error(_("cannot resolve HEAD"));
++  return reset_merge(&head);
++}
++
+ int sequencer_rollback(struct repository *r, struct replay_opts *opts)
+ {
+   FILE *f;
 @@
return -1;
  }
@@ -149,13 +145,35 @@
 +  enum replay_action action = -1;
 +  sequencer_get_last_command(r, &action);
 +
++  /*
++   * opts->action tells us which subcommand requested to skip
++   * the commit.
++   */
 +  sw

Re: [GSoC][PATCH v4 3/4] cherry-pick/revert: add --skip option

2019-06-17 Thread Thomas Gummerer
On 06/16, Rohit Ashiwal wrote:
> git am or rebase have a --skip flag to skip the current commit if the
> user wishes to do so. During a cherry-pick or revert a user could
> likewise skip a commit, but needs to use 'git reset' (or in the case
> of conflicts 'git reset --merge'), followed by 'git (cherry-pick |
> revert) --continue' to skip the commit. This is more annoying and
> sometimes confusing on the users' part. Add a `--skip` option to make
> skipping commits easier for the user and to make the commands more
> consistent.
> 
> In the next commit, we will change the advice messages and some tests
> hence finishing the process of teaching revert and cherry-pick
> "how to skip commits".

Changing the advice messages and some tests sounds to me like we're
not changing the tests here even though we should.  I know that is not
the case, and in fact we're only adding another test for something
that we're introducing in the next patch.

I think this whole paragraph could be dropped, but at least the "and
some tests" part should be, as it's slightly misleading imo.

> Signed-off-by: Rohit Ashiwal 
> ---
> changes:
> - Introduce '('s around documentation/help
> - Introduce a wrapper function skip_single_pick to reset_merge
> - Add comments to sequencer_skip
> - Change tests to use test_i18ncmp instead of test_cmp to not fail under
>   GETTEXT_POISON
> 
>  Documentation/git-cherry-pick.txt |   4 +-
>  Documentation/git-revert.txt  |   4 +-
>  Documentation/sequencer.txt   |   4 ++
>  builtin/revert.c  |   5 ++
>  sequencer.c   |  94 +--
>  sequencer.h   |   1 +
>  t/t3510-cherry-pick-sequence.sh   | 102 ++
>  7 files changed, 202 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/git-cherry-pick.txt 
> b/Documentation/git-cherry-pick.txt
> index 754b16ce0c..83ce51aedf 100644
> --- a/Documentation/git-cherry-pick.txt
> +++ b/Documentation/git-cherry-pick.txt
> @@ -10,9 +10,7 @@ SYNOPSIS
>  [verse]
>  'git cherry-pick' [--edit] [-n] [-m parent-number] [-s] [-x] [--ff]
> [-S[]] ...
> -'git cherry-pick' --continue
> -'git cherry-pick' --quit
> -'git cherry-pick' --abort
> +'git cherry-pick' (--continue | --skip | --abort | --quit)
>  
>  DESCRIPTION
>  ---
> diff --git a/Documentation/git-revert.txt b/Documentation/git-revert.txt
> index 0c82ca5bc0..665e065ee3 100644
> --- a/Documentation/git-revert.txt
> +++ b/Documentation/git-revert.txt
> @@ -9,9 +9,7 @@ SYNOPSIS
>  
>  [verse]
>  'git revert' [--[no-]edit] [-n] [-m parent-number] [-s] [-S[]] 
> ...
> -'git revert' --continue
> -'git revert' --quit
> -'git revert' --abort
> +'git revert' (--continue | --skip | --abort | --quit)
>  
>  DESCRIPTION
>  ---
> diff --git a/Documentation/sequencer.txt b/Documentation/sequencer.txt
> index 5a57c4a407..3bceb56474 100644
> --- a/Documentation/sequencer.txt
> +++ b/Documentation/sequencer.txt
> @@ -3,6 +3,10 @@
>   `.git/sequencer`.  Can be used to continue after resolving
>   conflicts in a failed cherry-pick or revert.
>  
> +--skip::
> + Skip the current commit and continue with the rest of the
> + sequence.
> +
>  --quit::
>   Forget about the current operation in progress.  Can be used
>   to clear the sequencer state after a failed cherry-pick or
> diff --git a/builtin/revert.c b/builtin/revert.c
> index d4dcedbdc6..5dc5891ea2 100644
> --- a/builtin/revert.c
> +++ b/builtin/revert.c
> @@ -102,6 +102,7 @@ static int run_sequencer(int argc, const char **argv, 
> struct replay_opts *opts)
>   OPT_CMDMODE(0, "quit", &cmd, N_("end revert or cherry-pick 
> sequence"), 'q'),
>   OPT_CMDMODE(0, "continue", &cmd, N_("resume revert or 
> cherry-pick sequence"), 'c'),
>   OPT_CMDMODE(0, "abort", &cmd, N_("cancel revert or cherry-pick 
> sequence"), 'a'),
> + OPT_CMDMODE(0, "skip", &cmd, N_("skip current commit and 
> continue"), 's'),
>   OPT_CLEANUP(&cleanup_arg),
>   OPT_BOOL('n', "no-commit", &opts->no_commit, N_("don't 
> automatically commit")),
>   OPT_BOOL('e', "edit", &opts->edit, N_("edit the commit 
> message")),
> @@ -151,6 +152,8 @@ static int run_sequencer(int argc, const char **argv, 
> struct replay_opts *opts)
>   this_operation = "--quit";
>   else if (cmd == 'c')
>   this_operation = "--continue";
> + else if (cmd == 's')
> + this_operation = "--skip";
>   else {
>   assert(cmd == 'a');
>   this_operation = "--abort";
> @@ -210,6 +213,8 @@ static int run_sequencer(int argc, const char **argv, 
> struct replay_opts *opts)
>   return sequencer_continue(the_repository, opts);
>   if (cmd == 'a')
>   return sequencer_rollback(the_repository, opts);
> + if (c

Re: [PATCH] stash: fix show referencing stash index

2019-06-16 Thread Thomas Gummerer
On 06/15, Andrei Rybak wrote:
> On 6/15/19 1:26 PM, Thomas Gummerer wrote:
> > diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> > index ea30d5f6a0..3973cbda0e 100755
> > --- a/t/t3903-stash.sh
> > +++ b/t/t3903-stash.sh
> > @@ -708,6 +708,24 @@ test_expect_success 'invalid ref of the form "n", n >= 
> > N' '
> > git stash drop
> >  '
> >  
> > +test_expect_success 'valid ref of the form "n", n >= N' '
> 
> If ref is valid, 'n < N' was probably meant here.

Yes, indeed.  Thanks!


Re: [GSoC][PATCH v4 1/4] sequencer: add advice for revert

2019-06-16 Thread Thomas Gummerer
On 06/16, Rohit Ashiwal wrote:
> In the case of merge conflicts, while performing a revert, we are
> currently advised to use `git cherry-pick --`
> of which --continue is incompatible for continuing the revert.
> Introduce a separate advice message for `git revert`. Also change
> the signature of `create_seq_dir` to handle which advice to display
> selectively.
> 
> Signed-off-by: Rohit Ashiwal 
> ---
> changes:
> - change BUG()'s message under create_seq_dir
> 
>  sequencer.c | 34 --
>  1 file changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index f88a97fb10..d80e1c3fbb 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2650,15 +2650,37 @@ static int walk_revs_populate_todo(struct todo_list 
> *todo_list,
>   return 0;
>  }
>  
> -static int create_seq_dir(void)
> +static int create_seq_dir(struct repository *r)
>  {
> - if (file_exists(git_path_seq_dir())) {
> - error(_("a cherry-pick or revert is already in progress"));
> - advise(_("try \"git cherry-pick (--continue | --quit | 
> --abort)\""));
> + enum replay_action action;
> + const char *in_progress_error = NULL;
> + const char *in_progress_advice = NULL;
> +
> + if (!sequencer_get_last_command(r, &action)) {
> + switch (action) {
> + case REPLAY_REVERT:
> + in_progress_error = _("revert is already in progress");
> + in_progress_advice =
> + _("try \"git revert (--continue | --abort | --quit)\"");
> + break;
> + case REPLAY_PICK:
> + in_progress_error = _("cherry-pick is already in 
> progress");
> + in_progress_advice =
> + _("try \"git cherry-pick (--continue | --abort | 
> --quit)\"");
> + break;
> + default:
> + BUG(_("unexpected action in create_seq_dir"));

As Phillip mentioned in the previous round, this doesn't need to be
translated.  I'd go one step further and say this should not be
translated.  Translating it just adds extra work for translators for a
message that the user will (hopefully) never see.

In fact if we look through the rest of the codebase, BUG() messages
are never translated anywhere, and neither is the one in 3/4.
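
A small self-contained sketch of the distinction: user-facing messages go through _(), while the BUG() message stays a plain string.  The `_()` and `BUG()` macros, the enum and the helper `in_progress_error_for` below are stand-ins written for this illustration, not git's definitions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-ins so the sketch compiles on its own. */
#define _(msgid) (msgid)	/* the real macro would look up a translation */
#define BUG(msg) do { fprintf(stderr, "BUG: %s\n", msg); abort(); } while (0)

enum replay_action { REPLAY_REVERT, REPLAY_PICK, REPLAY_INTERACTIVE_REBASE };

/* Hypothetical helper condensed from the switch in the quoted patch. */
const char *in_progress_error_for(enum replay_action action)
{
	switch (action) {
	case REPLAY_REVERT:
		/* shown to users: marked for translation */
		return _("revert is already in progress");
	case REPLAY_PICK:
		return _("cherry-pick is already in progress");
	default:
		/* programmer error: plain string, no _() */
		BUG("unexpected action in create_seq_dir");
	}
	return NULL; /* not reached */
}
```

Only the two strings a user can actually see end up in the translators' workload this way.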

> + }
> + }
> + if (in_progress_error) {
> + error("%s", in_progress_error);
> + advise("%s", in_progress_advice);
>   return -1;
> - } else if (mkdir(git_path_seq_dir(), 0777) < 0)
> + }
> + if (mkdir(git_path_seq_dir(), 0777) < 0)
>   return error_errno(_("could not create sequencer directory 
> '%s'"),
>  git_path_seq_dir());
> +
>   return 0;
>  }
>  
> @@ -4237,7 +4259,7 @@ int sequencer_pick_revisions(struct repository *r,
>*/
>  
>   if (walk_revs_populate_todo(&todo_list, opts) ||
> - create_seq_dir() < 0)
> + create_seq_dir(r) < 0)
>   return -1;
>   if (get_oid("HEAD", &oid) && (opts->action == REPLAY_REVERT))
>   return error(_("can't revert as initial commit"));
> -- 
> 2.21.0
> 

