date:20180717

Receiving console output from GIT 10mins after abort/termination?

2018-07-17 Thread Frank Wolf

Hi @ll,

I hope I'm posting to the right group (not sure if it's Windows related) but 
I've got
a weird problem using GIT:

By accident I've tried to push a repository (containing an already
commited but not yet pushed submodule reference).
This fails immediately with an error of course BUT

after 10 mins I get an output on the console though the command exited!?
(... $Received disconnect from : User session has timed out 
idling after 600 ms)

Does anyone have an explanation why I still get an output after the command was 
aborted?

/Frank

Re: [PATCH] sequencer.c: terminate the last line of author-script properly

2018-07-17 Thread Akinori MUSHA

That's perfectly fine with me.  I just thought each test case would
run in a separate shell process and that's why I chose not to use a
subshell for the last lines.  I've learned a lot from feedback from
you all.  Thanks!

On Wed, 18 Jul 2018 08:25:22 +0900,
Junio C Hamano wrote:
> 
> I'll squash the following in (which I have been carrying in 'pu' for
> the past few days) unless I hear otherwise soonish to correct the
> issues raised during the review.
> 
> Thanks.
> 
>  t/t3404-rebase-interactive.sh | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index 2d189da2f1..b0cef509ab 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -81,11 +81,13 @@ test_expect_success 'rebase -i writes out 
> .git/rebase-merge/author-script in "ed
>   set_fake_editor &&
>   FAKE_LINES="edit 1" git rebase -i HEAD^ &&
>   test -f .git/rebase-merge/author-script &&
> - unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE &&
> - eval "$(cat .git/rebase-merge/author-script)" &&
> - test "$(git show --quiet --pretty=format:%an)" = "$GIT_AUTHOR_NAME" &&
> - test "$(git show --quiet --pretty=format:%ae)" = "$GIT_AUTHOR_EMAIL" &&
> - test "$(git show --quiet --date=raw --pretty=format:@%ad)" = 
> "$GIT_AUTHOR_DATE"
> + (
> + sane_unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE &&
> + eval "$(cat .git/rebase-merge/author-script)" &&
> + test "$(git show --quiet --pretty=format:%an)" = 
> "$GIT_AUTHOR_NAME" &&
> + test "$(git show --quiet --pretty=format:%ae)" = 
> "$GIT_AUTHOR_EMAIL" &&
> + test "$(git show --quiet --date=raw --pretty=format:@%ad)" = 
> "$GIT_AUTHOR_DATE"
> + )
>  '
>  
>  test_expect_success 'rebase -i with the exec command' '
> -- 
> 2.18.0-129-ge3331758f1
>

[PATCH] diff.c: offer config option to control ws handling in move detection

2018-07-17 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---

This is the cherry on the cake named sb/diff-color-move-more.

 Documentation/config.txt   | 5 +
 Documentation/diff-options.txt | 7 +--
 diff.c | 9 +
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb37..6ca7118b018 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1122,6 +1122,11 @@ diff.colorMoved::
true the default color mode will be used. When set to false,
moved lines are not colored.
 
+diff.colorMovedWS::
+   When moved lines are colored using e.g. the `diff.colorMoved` setting,
+   this option controls the `` how spaces are treated
+   for details of valid modes see '--color-moved-ws' in 
linkgit:git-diff[1].
+
 color.diff.::
Use customized color for diff colorization.  `` specifies
which part of the patch to use the specified color, and is one
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 143acd9417e..8da7fed4e22 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -294,8 +294,11 @@ dimmed_zebra::
 
 --color-moved-ws=::
This configures how white spaces are ignored when performing the
-   move detection for `--color-moved`. These modes can be given
-   as a comma separated list:
+   move detection for `--color-moved`.
+ifdef::git-diff[]
+   It can be set by the `diff.colorMovedWS` configuration setting.
+endif::git-diff[]
+   These modes can be given as a comma separated list:
 +
 --
 ignore-space-at-eol::
diff --git a/diff.c b/diff.c
index f51f0ac32f4..9de917108d8 100644
--- a/diff.c
+++ b/diff.c
@@ -35,6 +35,7 @@ static int diff_rename_limit_default = 400;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
 static int diff_color_moved_default;
+static int diff_color_moved_ws_default;
 static int diff_context_default = 3;
 static int diff_interhunk_context_default;
 static const char *diff_word_regex_cfg;
@@ -332,6 +333,13 @@ int git_diff_ui_config(const char *var, const char *value, 
void *cb)
diff_color_moved_default = cm;
return 0;
}
+   if (!strcmp(var, "diff.colormovedws")) {
+   int cm = parse_color_moved_ws(value);
+   if (cm < 0)
+   return -1;
+   diff_color_moved_ws_default = cm;
+   return 0;
+   }
if (!strcmp(var, "diff.context")) {
diff_context_default = git_config_int(var, value);
if (diff_context_default < 0)
@@ -4327,6 +4335,7 @@ void diff_setup(struct diff_options *options)
}
 
options->color_moved = diff_color_moved_default;
+   options->color_moved_ws_handling = diff_color_moved_ws_default;
 }
 
 void diff_setup_done(struct diff_options *options)
-- 
2.18.0.233.g985f88cf7e-goog

Re: [RFC] push: add documentation on push v2

2018-07-17 Thread Stefan Beller

On Tue, Jul 17, 2018 at 2:09 PM Brandon Williams  wrote:
>
> Signed-off-by: Brandon Williams 
> ---
>
> Since introducing protocol v2 and enabling fetch I've been thinking
> about what its inverse 'push' would look like.  After talking with a
> number of people I have a longish list of things that could be done to
> improve push and I think I've been able to distill the core features we
> want in push v2.

It would be nice to know which things you want to improve.

>  Thankfully (due to the capability system) most of the
> other features/improvements can be added later with ease.
>
> What I've got now is a rough design for a more flexible push, more
> flexible because it allows for the server to do what it wants with the
> refs that are pushed and has the ability to communicate back what was
> done to the client.  The main motivation for this is to work around
> issues when working with Gerrit and other code-review systems where you
> need to have Change-Ids in the commit messages (now the server can just
> insert them for you and send back new commits) and you need to push to
> magic refs to get around various limitations (now a Gerrit server should
> be able to communicate that pushing to 'master' doesn't update master
> but instead creates a refs/changes/ ref).

Well Gerrit is our main motivation, but this allows for other workflows as well.
For example Facebook uses hg internally and they have a
"rebase-on-the-server-after-push" workflow IIRC as pushing to a single repo
brings up quite some contention. The protocol outlined below would allow
for such a workflow as well? (This might be an easier sell to the Git
community as most are not quite familiar with Gerrit)

> Before actually moving to write any code I'm hoping to get some feedback
> on if we think this is an acceptable base design for push (other
> features like atomic-push, signed-push, etc can be added as
> capabilities), so any comments are appreciated.
>
>  Documentation/technical/protocol-v2.txt | 76 +
>  1 file changed, 76 insertions(+)
>
> diff --git a/Documentation/technical/protocol-v2.txt 
> b/Documentation/technical/protocol-v2.txt
> index 49bda76d23..16c1ce60dd 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -403,6 +403,82 @@ header.
> 2 - progress messages
> 3 - fatal error message just before stream aborts
>
> + push
> +~~
> +
> +`push` is the command used to push ref-updates and a packfile to a remote
> +server in v2.
> +
> +Additional features not supported in the base command will be advertised
> +as the value of the command in the capability advertisement in the form
> +of a space separated list of features: "= "
> +
> +The format of a push request is as follows:
> +
> +request = *section
> +section = (ref-updates | packfile)

This reads as if a request consists of sections, which
each can be a "ref-updates" or a packfile, no order given,
such that multiple ref-update sections mixed with packfiles
are possible.

I would assume we'd only want to allow for ref-updates
followed by the packfile.

Given the example above for "rebase-on-push" though
it is better to first send the packfile (as that is assumed to
take longer) and then send the ref updates, such that the
rebasing could be faster and has no bottleneck.

> +  (delim-pkt | flush-pkt)

> +
> +ref-updates = PKT-LINE("ref-updates" LF)
> + *PKT-Line(update/force-update LF)
> +
> +update = txn_id SP action SP refname SP old_oid SP new_oid
> +force-update = txn_id SP "force" SP action SP refname SP new_oid

So we insert "force" after the transaction id if we want to force it.
When adding the atomic capability later we could imagine another insert here

  1 atomic create refs/heads/new-ref <0-hash> 
  1 atomic delete refs/heads/old-ref  <0-hash>

which would look like a "rename" that we could also add instead.
The transaction numbers are an interesting concept, how do you
envision them to be used? In the example I put them both in the same
transaction to demonstrate the "atomic-ness", but one could also
imagine different transactions numbers per ref (i.e. exactly one
ref per txn_id) to have a better understanding of what the server did
to each individual ref.

> +action = ("create" | "delete" | "update")
> +txn_id = 1*DIGIT
> +
> +packfile = PKT-LINE("packfile" LF)
> +  *PKT-LINE(*%x00-ff)
> +
> +ref-updates section
> +   * Transaction id's allow for mapping what was requested to what the
> + server actually did with the ref-update.

this would imply the client ought to have at most one ref per transaction id.
Is the client allowed to put multiple refs per id?

Are new capabilities attached to ref updates or transactions?
Unlike the example above, stating "atomic" on each line, you could just
say "transaction 1 should be atomic" in another line, that would address
all refs

Re: [PATCH] sequencer.c: terminate the last line of author-script properly

2018-07-17 Thread Junio C Hamano

I'll squash the following in (which I have been carrying in 'pu' for
the past few days) unless I hear otherwise soonish to correct the
issues raised during the review.

Thanks.

 t/t3404-rebase-interactive.sh | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 2d189da2f1..b0cef509ab 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -81,11 +81,13 @@ test_expect_success 'rebase -i writes out 
.git/rebase-merge/author-script in "ed
set_fake_editor &&
FAKE_LINES="edit 1" git rebase -i HEAD^ &&
test -f .git/rebase-merge/author-script &&
-   unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE &&
-   eval "$(cat .git/rebase-merge/author-script)" &&
-   test "$(git show --quiet --pretty=format:%an)" = "$GIT_AUTHOR_NAME" &&
-   test "$(git show --quiet --pretty=format:%ae)" = "$GIT_AUTHOR_EMAIL" &&
-   test "$(git show --quiet --date=raw --pretty=format:@%ad)" = 
"$GIT_AUTHOR_DATE"
+   (
+   sane_unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE &&
+   eval "$(cat .git/rebase-merge/author-script)" &&
+   test "$(git show --quiet --pretty=format:%an)" = 
"$GIT_AUTHOR_NAME" &&
+   test "$(git show --quiet --pretty=format:%ae)" = 
"$GIT_AUTHOR_EMAIL" &&
+   test "$(git show --quiet --date=raw --pretty=format:@%ad)" = 
"$GIT_AUTHOR_DATE"
+   )
 '
 
 test_expect_success 'rebase -i with the exec command' '
-- 
2.18.0-129-ge3331758f1

[PATCH 0/2] RFC ref store to repository migration

2018-07-17 Thread Stefan Beller

Stolee said (privately):

I also ran into an issue where some of the "arbitrary repository"
patches don't fully connect. Jonathan's test demonstrated this
issue when I connected some things in an in-process patch 
Work in progress: https://github.com/gitgitgadget/git/pull/11
Specifically: 
https://github.com/gitgitgadget/git/pull/11/commits/287ec6c1cd4bf4da76c05698373aee15749d011a

And I dislike the approach taken in
https://github.com/gitgitgadget/git/pull/11/commits/287ec6c1cd4bf4da76c05698373aee15749d011a
and would prefer another approach shown at
https://github.com/stefanbeller/git/tree/object-store-convert-refstore-partial
or in this series.

This approach doesn't have the ugliness of having the repository around twice,
e.g.

int register_replace_ref(const char *refname, ...
{
  struct repository *r = cb_data;
  ...
}

...  

for_each_replace_ref(r, register_replace_ref, r);

which would iterate over the refs of the first "r" and have "r" as a call back
data point for register_replace_ref.

This approach also takes the gentle hint of Junio to not replace well used 
functions
throughout the whole code base (using coccinelle), but for now exposes just
one example in the second patch.

These patches have been developed on top of jt/commit-graph-per-object-store.

Opinions?

Thanks,
Stefan   

Stefan Beller (2):
  refs.c: migrate internal ref iteration to pass thru repository
argument
  refs.c: upgrade for_each_replace_ref to be a each_repo_ref_fn callback

 builtin/replace.c|  3 ++-
 refs.c   | 48 +---
 refs.h   | 12 ++-
 refs/iterator.c  |  6 +++---
 refs/refs-internal.h |  5 +++--
 replace-object.c |  3 ++-
 6 files changed, 62 insertions(+), 15 deletions(-)

-- 
2.18.0.233.g985f88cf7e-goog

[PATCH 2/2] refs.c: upgrade for_each_replace_ref to be a each_repo_ref_fn callback

2018-07-17 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---
 builtin/replace.c | 3 ++-
 refs.c| 9 -
 refs.h| 2 +-
 replace-object.c  | 3 ++-
 4 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/builtin/replace.c b/builtin/replace.c
index ef22d724bbc..5f34659071f 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -39,7 +39,8 @@ struct show_data {
enum replace_format format;
 };
 
-static int show_reference(const char *refname, const struct object_id *oid,
+static int show_reference(struct repository *r, const char *refname,
+ const struct object_id *oid,
  int flag, void *cb_data)
 {
struct show_data *data = cb_data;
diff --git a/refs.c b/refs.c
index 2513f77acb3..5700cd4683f 100644
--- a/refs.c
+++ b/refs.c
@@ -1478,12 +1478,11 @@ int refs_for_each_fullref_in(struct ref_store *refs, 
const char *prefix,
return do_for_each_ref(refs, prefix, fn, 0, flag, cb_data);
 }
 
-int for_each_replace_ref(struct repository *r, each_ref_fn fn, void *cb_data)
+int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void 
*cb_data)
 {
-   return do_for_each_ref(get_main_ref_store(r),
-  git_replace_ref_base, fn,
-  strlen(git_replace_ref_base),
-  DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
+   return do_for_each_repo_ref(r, git_replace_ref_base, fn,
+   strlen(git_replace_ref_base),
+   DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
 }
 
 int for_each_namespaced_ref(each_ref_fn fn, void *cb_data)
diff --git a/refs.h b/refs.h
index 80eec8bbc68..a0a18223a14 100644
--- a/refs.h
+++ b/refs.h
@@ -317,7 +317,7 @@ int for_each_fullref_in(const char *prefix, each_ref_fn fn, 
void *cb_data,
 int for_each_tag_ref(each_ref_fn fn, void *cb_data);
 int for_each_branch_ref(each_ref_fn fn, void *cb_data);
 int for_each_remote_ref(each_ref_fn fn, void *cb_data);
-int for_each_replace_ref(struct repository *r, each_ref_fn fn, void *cb_data);
+int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void 
*cb_data);
 int for_each_glob_ref(each_ref_fn fn, const char *pattern, void *cb_data);
 int for_each_glob_ref_in(each_ref_fn fn, const char *pattern,
 const char *prefix, void *cb_data);
diff --git a/replace-object.c b/replace-object.c
index 801b5c16789..01a5a59a35a 100644
--- a/replace-object.c
+++ b/replace-object.c
@@ -6,7 +6,8 @@
 #include "repository.h"
 #include "commit.h"
 
-static int register_replace_ref(const char *refname,
+static int register_replace_ref(struct repository *r,
+   const char *refname,
const struct object_id *oid,
int flag, void *cb_data)
 {
-- 
2.18.0.233.g985f88cf7e-goog

[PATCH 1/2] refs.c: migrate internal ref iteration to pass thru repository argument

2018-07-17 Thread Stefan Beller

In 60ce76d3581 (refs: add repository argument to for_each_replace_ref,
2018-04-11) and 0d296c57aec (refs: allow for_each_replace_ref to handle
arbitrary repositories, 2018-04-11), for_each_replace_ref learned how
to iterate over refs by a given arbitrary repository.
New attempts in the object store conversion have shown that it is useful
to have the repository handle available that the refs iteration is
currently iterating over.

To achieve this goal we will need to add a repository argument to
each_ref_fn in refs.h. However as many callers rely on the signature
such a patch would be too large.

So convert the internals of the ref subsystem first to pass through a
repository argument without exposing the change to the user. Assume
the_repository for the passed through repository, although it is not
used anywhere yet.

Signed-off-by: Stefan Beller 
---
 refs.c   | 39 +--
 refs.h   | 10 ++
 refs/iterator.c  |  6 +++---
 refs/refs-internal.h |  5 +++--
 4 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/refs.c b/refs.c
index fcfd3171e83..2513f77acb3 100644
--- a/refs.c
+++ b/refs.c
@@ -1390,17 +1390,50 @@ struct ref_iterator *refs_ref_iterator_begin(
  * non-zero value, stop the iteration and return that value;
  * otherwise, return 0.
  */
+static int do_for_each_repo_ref(struct repository *r, const char *prefix,
+   each_repo_ref_fn fn, int trim, int flags,
+   void *cb_data)
+{
+   struct ref_iterator *iter;
+   struct ref_store *refs = get_main_ref_store(r);
+
+   if (!refs)
+   return 0;
+
+   iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
+
+   return do_for_each_repo_ref_iterator(r, iter, fn, cb_data);
+}
+
+struct do_for_each_ref_help {
+   each_ref_fn *fn;
+   void *cb_data;
+};
+
+static int do_for_each_ref_helper(struct repository *r,
+ const char *refname,
+ const struct object_id *oid,
+ int flags,
+ void *cb_data)
+{
+   struct do_for_each_ref_help *hp = cb_data;
+
+   return hp->fn(refname, oid, flags, hp->cb_data);
+}
+
 static int do_for_each_ref(struct ref_store *refs, const char *prefix,
   each_ref_fn fn, int trim, int flags, void *cb_data)
 {
struct ref_iterator *iter;
+   struct do_for_each_ref_help hp = { fn, cb_data };
 
if (!refs)
return 0;
 
iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
 
-   return do_for_each_ref_iterator(iter, fn, cb_data);
+   return do_for_each_repo_ref_iterator(the_repository, iter,
+   do_for_each_ref_helper, &hp);
 }
 
 int refs_for_each_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
@@ -2029,10 +2062,12 @@ int refs_verify_refname_available(struct ref_store 
*refs,
 int refs_for_each_reflog(struct ref_store *refs, each_ref_fn fn, void *cb_data)
 {
struct ref_iterator *iter;
+   struct do_for_each_ref_help hp = { fn, cb_data };
 
iter = refs->be->reflog_iterator_begin(refs);
 
-   return do_for_each_ref_iterator(iter, fn, cb_data);
+   return do_for_each_repo_ref_iterator(the_repository, iter,
+do_for_each_ref_helper, &hp);
 }
 
 int for_each_reflog(each_ref_fn fn, void *cb_data)
diff --git a/refs.h b/refs.h
index cc2fb4c68c0..80eec8bbc68 100644
--- a/refs.h
+++ b/refs.h
@@ -274,6 +274,16 @@ struct ref_transaction;
 typedef int each_ref_fn(const char *refname,
const struct object_id *oid, int flags, void *cb_data);
 
+/*
+ * The same as each_ref_fn, but also with a repository argument that
+ * contains the repository associated with the callback.
+ */
+typedef int each_repo_ref_fn(struct repository *r,
+const char *refname,
+const struct object_id *oid,
+int flags,
+void *cb_data);
+
 /*
  * The following functions invoke the specified callback function for
  * each reference indicated.  If the function ever returns a nonzero
diff --git a/refs/iterator.c b/refs/iterator.c
index 2ac91ac3401..629e00a122a 100644
--- a/refs/iterator.c
+++ b/refs/iterator.c
@@ -407,15 +407,15 @@ struct ref_iterator *prefix_ref_iterator_begin(struct 
ref_iterator *iter0,
 
 struct ref_iterator *current_ref_iter = NULL;
 
-int do_for_each_ref_iterator(struct ref_iterator *iter,
-each_ref_fn fn, void *cb_data)
+int do_for_each_repo_ref_iterator(struct repository *r, struct ref_iterator 
*iter,
+ each_repo_ref_fn fn, void *cb_data)
 {
int retval = 0, ok;
struct ref_iterator *old_ref_iter = current_ref_iter;
 
current_ref_iter = iter;

Re: [PATCH v2 4/4] ref-filter: use oid_object_info() to get object

2018-07-17 Thread Junio C Hamano

Оля Тележная   writes:

>> Hmph, doesn't this belong to the previous step?  In other words,
>> isn't the result of applying 1-3/4 has a bug that can leave eaten
>> uninitialized (and base decision to free(buf) later on it), and
>> isn't this change a fix for it?
>
> Oh. I was thinking that it was new bug created by me. Now I see that
> previously we had the same problem.

The original said something like:

int eaten;
void *buf = get_obj(..., &eaten);
...
if (!eaten)
free(buf);

and get_obj() left eaten untouched when it returned NULL.  As a
random uninitialized cruft in eaten that happened to be "true" would
just cause free(NULL) on many archs, there was no practical problem
in such a code, but it is undefined behaviour nevertheless.

And the previous step made it a bit more alarming ;-)

Re: [PATCH v3 4/4] builtin/rebase: support running "git rebase "

2018-07-17 Thread Beat Bolli

On 17.07.18 23:49, Beat Bolli wrote:
> On 06.07.18 14:08, Pratik Karki wrote:
>> +static GIT_PATH_FUNC(apply_dir, "rebase-apply");
>> +static GIT_PATH_FUNC(merge_dir, "rebase-merge");
> 
> Maybe fix this up with
> 
> -static GIT_PATH_FUNC(apply_dir, "rebase-apply");
> -static GIT_PATH_FUNC(merge_dir, "rebase-merge");
> +static GIT_PATH_FUNC(apply_dir, "rebase-apply")
> +static GIT_PATH_FUNC(merge_dir, "rebase-merge")
> 
> ?

Sorry, this should have been a reply to [PATCH v4 4/4]. The remark still
applies, though.

> (See https://public-inbox.org/git/20180709192537.18564-5-dev+...@drbeat.li/#t)

Cheers, Beat

Re: [GSoC] GSoC with git, week 11

2018-07-17 Thread Paul-Sebastian Ungureanu


Hello,

On 17.07.2018 21:41, Alban Gruin wrote:

Hi,

I published a new blog post about last week:

 https://blog.pa1ch.fr/posts/2018/07/17/en/gsoc2018-week11.html

Cheers,
Alban



Great entry!

Here I am too, with two updates on the blog:

* The weekly blog.

https://ungps.github.io/2018/07/15/week-11/

* Some of my thoughts on GSoC.

https://ungps.github.io/2018/07/16/About-GSoC/

Best,
Paul

Re: [PATCH v3 4/4] builtin/rebase: support running "git rebase "

2018-07-17 Thread Beat Bolli

On 06.07.18 14:08, Pratik Karki wrote:
> +static GIT_PATH_FUNC(apply_dir, "rebase-apply");
> +static GIT_PATH_FUNC(merge_dir, "rebase-merge");

Maybe fix this up with

-static GIT_PATH_FUNC(apply_dir, "rebase-apply");
-static GIT_PATH_FUNC(merge_dir, "rebase-merge");
+static GIT_PATH_FUNC(apply_dir, "rebase-apply")
+static GIT_PATH_FUNC(merge_dir, "rebase-merge")

?

(See
https://public-inbox.org/git/20180709192537.18564-5-dev+...@drbeat.li/#t)

Cheers, Beat

Re: [PATCH v2 6/6] submodule--helper: introduce new update-module-mode helper

2018-07-17 Thread Junio C Hamano

SZEDER Gábor  writes:

>> +fprintf(stdout, submodule_strategy_to_string(&update_strategy));
>
> Various compilers warn about the potential insecurity of the above
> call:
>
>   CC builtin/submodule--helper.o
>   builtin/submodule--helper.c: In function ‘module_update_module_mode’:
>   builtin/submodule--helper.c:1502:2: error: format not a string literal and 
> no format arguments [-Werror=format-security]
> fprintf(stdout, submodule_strategy_to_string(&update_strategy));
> ^
>   cc1: all warnings being treated as errors
>   Makefile:2261: recipe for target 'builtin/submodule--helper.o' failed
>   make: *** [builtin/submodule--helper.o] Error 1
>
> I think it should either use an explicit format string:
>
>   fprintf(stdout, "%s", submodule_strategy_to_string(&update_strategy));
>
> or, perhaps better yet, simply use fputs().

Sounds good.

Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C

2018-07-17 Thread Junio C Hamano

Stefan Beller  writes:

>> As I most often edit the log message and material below three-dash
>> lines (long) _after_ format-patch produced files, I do not think it
>> is a win to force me to push and ask to pull
>
> Ah, that is an interesting workflow. Do you keep patch files/emails
> around locally, only to (long after) add a message and resend it?

The time I "finish" working on a series and commit is typically much
closer to the time I format-patch the result, but it will take time
for me to actually mail out these files.  It will take time to
re-review and re-think if the explanation in them makes sense to
readers, and sending half-edited patch is something I try to avoid
as it would be a waste of time for everybody involved (including me
who would then need to respond to messages that helpfully point out
silly typoes, in addition to those who spots them).

I am trained myself not to touch these files after sending them out,
as comparing them with a newer iteration of the correspoinding files
was one way to see what has (and hasn't) changed between iterations
before I learned "git tbdiff", and that habit persists.

Re: [PATCH v4 2/7] t/t7510: check the validation of the new config gpg.format

2018-07-17 Thread Junio C Hamano

Henning Schild  writes:

> Test setting gpg.format to both invalid and valid values.
>
> Signed-off-by: Henning Schild 
> ---
>  t/t7510-signed-commit.sh | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/t/t7510-signed-commit.sh b/t/t7510-signed-commit.sh
> index 6e2015ed9..7bdad570c 100755
> --- a/t/t7510-signed-commit.sh
> +++ b/t/t7510-signed-commit.sh
> @@ -227,4 +227,13 @@ test_expect_success GPG 'log.showsignature behaves like 
> --show-signature' '
>   grep "gpg: Good signature" actual
>  '
>  
> +test_expect_success GPG 'check config gpg.format values' '
> + test_config gpg.format openpgp &&
> + git commit -S --amend -m "success" &&
> + test_config gpg.format OpEnPgP &&
> + test_must_fail git commit -S --amend -m "fail" &&

This second one is a good demonstration that the value for this
variable is case sensitive.

> + test_config gpg.format malformed &&
> + test_must_fail git commit -S --amend -m "fail"

And there is no longer a good reason to try another one.  Let's drop
this last/third case.

> +'
> +
>  test_done

Re: [PATCH v4 7/7] gpg-interface t: extend the existing GPG tests with GPGSM

2018-07-17 Thread Junio C Hamano

Henning Schild  writes:

> diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
> index a5d3b2cba..3fe02876c 100755
> --- a/t/lib-gpg.sh
> +++ b/t/lib-gpg.sh
> @@ -38,7 +38,33 @@ then
>   "$TEST_DIRECTORY"/lib-gpg/ownertrust &&
>   gpg --homedir "${GNUPGHOME}" /dev/null 2>&1 \
>   --sign -u commit...@example.com &&
> - test_set_prereq GPG
> + test_set_prereq GPG &&

We do not mind making GPGSM dependent on GPG, hence this && is justified.

> + # Available key info:
> + # * see t/lib-gpg/gpgsm-gen-key.in
> + # To generate new certificate:
> + #  * no passphrase
> + #   gpgsm --homedir /tmp/gpghome/ \
> + #   -o /tmp/gpgsm.crt.user \
> + #   --generate-key \
> + #   --batch t/lib-gpg/gpgsm-gen-key.in
> + # To import certificate:
> + #   gpgsm --homedir /tmp/gpghome/ \
> + #   --import /tmp/gpgsm.crt.user
> + # To export into a .p12 we can later import:
> + #   gpgsm --homedir /tmp/gpghome/ \
> + #   -o t/lib-gpg/gpgsm_cert.p12 \
> + #   --export-secret-key-p12 "commit...@example.com"
> + echo | gpgsm --homedir "${GNUPGHOME}" 2>/dev/null \
> + --passphrase-fd 0 --pinentry-mode loopback \
> + --import "$TEST_DIRECTORY"/lib-gpg/gpgsm_cert.p12 &&
> + gpgsm --homedir "${GNUPGHOME}" 2>/dev/null -K \
> + | grep fingerprint: | cut -d" " -f4 | tr -d '\n' > \
> + ${GNUPGHOME}/trustlist.txt &&
> + echo " S relax" >> ${GNUPGHOME}/trustlist.txt &&
> + (gpgconf --kill gpg-agent >/dev/null 2>&1 || : ) &&
> + echo hello | gpgsm --homedir "${GNUPGHOME}" >/dev/null \
> + -u commit...@example.com -o /dev/null --sign - 2>&1 &&
> + test_set_prereq GPGSM

And when any of the above fails, we refrain from setting GPGSM
prereq.  Otherwise we are prepared to perform tests with gpgsm
and get the prereq.

> diff --git a/t/t4202-log.sh b/t/t4202-log.sh
> index 25b1f8cc7..f57781e39 100755
> --- a/t/t4202-log.sh
> +++ b/t/t4202-log.sh
> @@ -1556,12 +1556,28 @@ test_expect_success GPG 'setup signed branch' '
>   git commit -S -m signed_commit
>  '
>  
> +test_expect_success GPGSM 'setup signed branch x509' '
> + test_when_finished "git reset --hard && git checkout master" &&
> + git checkout -b signed-x509 master &&
> + echo foo >foo &&
> + git add foo &&
> + test_config gpg.format x509 &&
> + test_config user.signingkey $GIT_COMMITTER_EMAIL &&
> + git commit -S -m signed_commit
> +'

OK.

> +test_expect_success GPGSM 'log --graph --show-signature x509' '
> + git log --graph --show-signature -n1 signed-x509 >actual &&
> + grep "^| gpgsm: Signature made" actual &&
> + grep "^| gpgsm: Good signature" actual
> +'

OK.

> @@ -1581,6 +1597,29 @@ test_expect_success GPG 'log --graph --show-signature 
> for merged tag' '
>   grep "^| | gpg: Good signature" actual
>  '
>  
> +test_expect_success GPGSM 'log --graph --show-signature for merged tag x509' 
> '
> + test_when_finished "git reset --hard && git checkout master" &&
> + test_config gpg.format x509 &&
> + test_config user.signingkey $GIT_COMMITTER_EMAIL &&
> + git checkout -b plain-x509 master &&
> + echo aaa >bar &&
> + git add bar &&
> + git commit -m bar_commit &&
> + git checkout -b tagged-x509 master &&
> + echo bbb >baz &&
> + git add baz &&
> + git commit -m baz_commit &&
> + git tag -s -m signed_tag_msg signed_tag_x509 &&
> + git checkout plain-x509 &&
> + git merge --no-ff -m msg signed_tag_x509 &&
> + git log --graph --show-signature -n1 plain-x509 >actual &&
> + grep "^|\\\  merged tag" actual &&
> + grep "^| | gpgsm: Signature made" actual &&
> + grep "^| | gpgsm: Good signature" actual &&
> + git config --unset gpg.format &&
> + git config --unset user.signingkey

You are using test_config early enough in this test; doesn't that
take care of the last two steps for you, even when an earlier step
failed?  If that is the case, then remove the last two line (and &&
at the end of the line before).

> diff --git a/t/t5534-push-signed.sh b/t/t5534-push-signed.sh
> index 1cea758f7..a3a12bd05 100755
> --- a/t/t5534-push-signed.sh
> +++ b/t/t5534-push-signed.sh
> @@ -218,4 +218,56 @@ test_expect_success GPG 'fail without key and heed 
> user.signingkey' '
>   test_cmp expect dst/push-cert-status
>  '
>  
> +test_expect_success GPGSM 'fail without key and heed user.signingkey x509' '
> + test_config gpg.format x509 &&
> + env | grep GIT > envfile &&

The "envfile" is unused, no?  Remove this line.

> + prepare_dst &&
> + mkdir -p dst/.git/hooks &&
> +

[RFC] push: add documentation on push v2

2018-07-17 Thread Brandon Williams

Signed-off-by: Brandon Williams 
---

Since introducing protocol v2 and enabling fetch I've been thinking
about what its inverse 'push' would look like.  After talking with a
number of people I have a longish list of things that could be done to
improve push and I think I've been able to distill the core features we
want in push v2.  Thankfully (due to the capability system) most of the
other features/improvements can be added later with ease.

What I've got now is a rough design for a more flexible push, more
flexible because it allows for the server to do what it wants with the
refs that are pushed and has the ability to communicate back what was
done to the client.  The main motivation for this is to work around
issues when working with Gerrit and other code-review systems where you
need to have Change-Ids in the commit messages (now the server can just
insert them for you and send back new commits) and you need to push to
magic refs to get around various limitations (now a Gerrit server should
be able to communicate that pushing to 'master' doesn't update master
but instead creates a refs/changes/ ref).

Before actually moving to write any code I'm hoping to get some feedback
on if we think this is an acceptable base design for push (other
features like atomic-push, signed-push, etc can be added as
capabilities), so any comments are appreciated.

 Documentation/technical/protocol-v2.txt | 76 +
 1 file changed, 76 insertions(+)

diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
index 49bda76d23..16c1ce60dd 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -403,6 +403,82 @@ header.
2 - progress messages
3 - fatal error message just before stream aborts
 
+ push
+~~
+
+`push` is the command used to push ref-updates and a packfile to a remote
+server in v2.
+
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features: "= "
+
+The format of a push request is as follows:
+
+request = *section
+section = (ref-updates | packfile)
+  (delim-pkt | flush-pkt)
+
+ref-updates = PKT-LINE("ref-updates" LF)
+ *PKT-Line(update/force-update LF)
+
+update = txn_id SP action SP refname SP old_oid SP new_oid
+force-update = txn_id SP "force" SP action SP refname SP new_oid
+action = ("create" | "delete" | "update")
+txn_id = 1*DIGIT
+
+packfile = PKT-LINE("packfile" LF)
+  *PKT-LINE(*%x00-ff)
+
+ref-updates section
+   * Transaction id's allow for mapping what was requested to what the
+ server actually did with the ref-update.
+   * Normal ref-updates require that the old value of a ref is supplied so
+ that the server can verify that the reference that is being updated
+ hasn't changed while the request was being processed.
+   * Forced ref-updates only include the new value of a ref as we don't
+ care what the old value was.
+
+packfile section
+   * A packfile MAY not be included if only delete commands are used or if
+ an update only incorperates objects the server already has
+
+The server will receive the packfile, unpack it, then validate each ref-update,
+and it will run any update hooks to make sure that the update is acceptable.
+If all of that is fine, the server will then update the references.
+
+The format of a push response is as follows:
+
+response = *section
+section = (unpack-error | ref-update-status | packfile)
+ (delim-pkt | flush-pkt)
+
+unpack-error = PKT-LINE("ERR" SP error-msg LF)
+
+ref-update-status = *(update-result | update-error)
+update-result = *PKT-LINE(txn_id SP result LF)
+result = ("created" | "deleted" | "updated") SP refname SP old_oid SP 
new_oid
+update-error = PKT-LINE(txn_id SP "error" SP error-msg LF)
+
+packfile = PKT-LINE("packfile" LF)
+  *PKT-LINE(*%x00-ff)
+
+ref-update-status section
+   * This section is always included unless there was an error unpacking
+ the packfile sent in the request.
+   * The server is given the freedom to do what it wants with the
+ ref-updates provided in the reqeust.  This means that an update sent
+ from the server may result in the creation of a ref or rebasing the
+ update on the server.
+   * If a server creates any new objects due to a ref-update, a packfile
+ MUST be sent back in the response.
+
+packfile section
+   * This section is included if the server decided to do something with
+ the ref-updates that involved creating new objects.
+
  server-option
 ~~~
 
-- 
2.18.0.203.gfac676dfb9-goog

Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C

2018-07-17 Thread Stefan Beller

On Tue, Jul 17, 2018 at 12:52 PM Junio C Hamano  wrote:

> > (A) This sign off is inherent to the workflow. So we could
> > change the workflow, i.e. you pull series instead of applying them.
> > I think this "more in git, less in email" workflow would find supporters,
> > such as DScho (cc'd).
>
> Sign-off is inherent to the project, in the sense that we want to
> see how the change flowed recorded in the commits.
>
> With a pull-request based workflow, until Git is somehow improved so
> that a "pull" becomes "fetch and rebase what got fetched on top of
> my stuff, and add my sign-off while at it" (which is the opposite of
> the usual "pull --rebase"),

and here is where our thoughts did not align.
I imagined a git-pull as of today (fetch + merge), where you in the role
of the maintainer tells me in the role of an author to base my series
on top of $X, such that there is no need for rebase on your end,
(you would also not sign off each commit, but only the resulting merge)
Even merge conflicts could be handed off to the authors instead of
burdening you.

>  we would lose the ability to fully "use"
> Git to run this project, as we would lose most sign-offs, unlike the
> e-mail based workflow, which we can use Git fully to have it do its
> job of recording necessary information.

I think all needed information would still be there, but there would be an
actual change as authors would be committers (as they commit locally
and we keep it all in Git to get it to you and you keep it in Git to put it
into the blessed repository)

> And much more importantly, when "pull-request" based workflow is
> improved enough so that your original without my sign-off (and you
> shouldn't, unless you are relaying my changes) becomes what I
> pulled, which does have my sign-off, range-diff that compares both
> histories does need to deal with a pair of commits with only one
> side having a sign-off.  So switching the tool is not a proper
> solution to work around the "sign-off noise" we observed.

I do not view it as work around, but "another proper workflow that
has advantages and disadvantages, one of the advantages is that it
would enable us to work with this tool".

>  One side
> having a sign-off while the other side does not is inherent to what
> we actively want,

[in the current workflow that has proven good for 10 years]

> and you are letting your tail wag your dog by
> suggesting to discard it, which is disappointing.

I am suggesting to continue thinking about workflows in general, as there
are many; all of them having advantages and disadvantages.
I am not sure if workflows can be adapted easily via improving the current
workflow continually or if sometimes a workflow has to be rethought to to
changes in the landscape of available tools.

When the Git project started, an email based workflow was chosen,
precisely because Git was not available.

Now that it has gained wide spread adoption (among its developers at least)
the workflow could adapt to that.

> > The other (2) downside is that everyone else (authors, reviewers) have
> > to adapt as well. For authors this might be easy to adapt (push instead
> > of sending email sounds like a win).
>
> As I most often edit the log message and material below three-dash
> lines (long) _after_ format-patch produced files, I do not think it
> is a win to force me to push and ask to pull

Ah, that is an interesting workflow. Do you keep patch files/emails
around locally, only to (long after) add a message and resend it?
I try to keep any contribution of mine in Git as long as possible as that
helps me tracking and fixing errors in my code and log messages.

> > (B) The other point of view that I can offer is that we teach range-diff
> > to ignore certain patterns. Maybe in combination with interpret-trailers
> > this can be an easy configurable thing, or even a default to ignore
> > all sign offs?
>
> That indicates the same direction as I alluded to in the message you
> are responding to, I guess, which is a good thing.

Yes, I imagined this is the approach we'll be taking.
I think we would want to give %(trailers:none) or rather
"ignore-sign-offs" to the inner diffs.

Thanks,
Stefan

Re: [PATCH v4 4/7] gpg-interface: do not hardcode the key string len anymore

2018-07-17 Thread Junio C Hamano

Henning Schild  writes:

> gnupg does print the keyid followed by a space and the signer comes
> next. The same pattern is also used in gpgsm, but there the key length
> would be 40 instead of 16. Instead of hardcoding the expected length,
> find the first space and calculate it.
> Input that does not match the expected format will be ignored now,
> before we jumped to found+17 which might have been behind the end of an
> unexpected string.
>
> Signed-off-by: Henning Schild 
> ---
>  gpg-interface.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Again, really nice.

>
> diff --git a/gpg-interface.c b/gpg-interface.c
> index a02db7658..51cad9081 100644
> --- a/gpg-interface.c
> +++ b/gpg-interface.c
> @@ -95,10 +95,11 @@ static void parse_gpg_output(struct signature_check *sigc)
>   sigc->result = sigcheck_gpg_status[i].result;
>   /* The trust messages are not followed by key/signer 
> information */
>   if (sigc->result != 'U') {
> - sigc->key = xmemdupz(found, 16);
> + next = strchrnul(found, ' ');
> + sigc->key = xmemdupz(found, next - found);
>   /* The ERRSIG message is not followed by signer 
> information */
> - if (sigc-> result != 'E') {
> - found += 17;
> + if (*next && sigc-> result != 'E') {
> + found = next + 1;
>   next = strchrnul(found, '\n');
>   sigc->signer = xmemdupz(found, next - found);
>   }

Re: [PATCH v4 3/7] gpg-interface: introduce an abstraction for multiple gpg formats

2018-07-17 Thread Junio C Hamano

Henning Schild  writes:

> Create a struct that holds the format details for the supported formats.
> At the moment that is still just "openpgp". This commit prepares for the
> introduction of more formats, that might use other programs and match
> other signatures.
>
> Signed-off-by: Henning Schild 
> ---

I didn't spot anything questionable in this version.  Looking good.

Re: [PATCH] gc: do not warn about too many loose objects

2018-07-17 Thread Jeff King

On Tue, Jul 17, 2018 at 10:59:50AM +0200, Ævar Arnfjörð Bjarmason wrote:

> That doesn't make sense to me. Just because git itself is happily
> ignoring the exit code I don't think that should mean there shouldn't be
> a meaningful exit code. Why don't you just ignore the exit code in the
> repo tool as well?
> 
> Now e.g. automation I have to see if git-gc ---auto is having issues
> can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet
> of servers, but will need to start caring if stderr was emitted to.

If you're daemonizing gc, though, there are a number of cases where the
exit code is not propagated. If you really care about the outcome,
you're probably better off either:

  - doing synchronous gc's, which will still return a meaningful code
after Jonathan's patches

  - inspecting the log yourself. I know that comes close to the un-unixy
stderr thing. But in this case, the presence of a non-empty log is
literally the on-disk bit for "the previous run failed". And doing
so catches all of the daemonized cases, even the "first one" that
you'd miss by paying attention to the immediate exit code.

This will treat the zero-exit-code "warning" case as an error. I'm
not against propagating the true original error code, if somebody
wants to work on it. But I think Jonathan's patches are a strict
improvement over the current situation, and a patch to propagate
could come on top.

> I think you're conflating two things here in a way that (if I'm reading
> this correctly) produces a pretty bad regression for users.
> 
>  a) If we have something in the gc.log we keep yelling at users until we
> reach the gc.logExpiry, this was the subject of my thread back in
> January. This sucks, and should be fixed somehow.
> 
> Maybe we should only emit the warning once, e.g. creating a
> .git/gc.log.wasemitted marker and as long as its ctime is later than
> the mtime on .git/gc.log we don't emit it again).
> 
> But that also creates the UX problem that it's easy to miss this
> message in the middle of some huge "fetch" output. Perhaps we should
> just start emitting this as part of "git status" or something (or
> solve the root cause, as Peff notes...).

I kind of like that "git status" suggestion. Of course many users run
"git status" more often than "git commit", so it may actually spam them
more!

>  b) We *also* use this presence of a gc.log as a marker for "we failed
> too recently, let's not try again until after a day".
> 
> The semantics of b) are very useful, and just because they're tied up
> with a) right now, let's not throw out b) just because we're trying to
> solve a).

Yeah. In general my concern was the handling of (b), which I think this
last crop of patches is fine with. As far as showing the message
repeatedly or not, I don't have a strong opinion (it could even be
configurable, once it's split from the "marker" behavior).

> Right now one thing we do right is we always try to look at the actual
> state of the repo with too_many_packs() and too_many_loose_objects().
> 
> We don't assume the state of your repo hasn't drastically changed
> recently. That means that we do the right thing and gc when the repo
> needs it, not just because we GC'd recently enough.
> 
> With these proposed semantics we'd skip a needed GC (potentially for
> days, depending on the default) just because we recently ran one.

Yeah, I agree that deferring repeated gc's based on time is going to run
into pathological corner cases. OTOH, what we've found at GitHub is that
"gc --auto" is quite insufficient for scheduling maintenance anyway
(e.g., if a machine gets pushes to 100 separate repositories in quick
succession, you probably want to queue and throttle any associated gc).

But there are probably more mid-size sites that are big enough to have
corner cases, but not so big that "gc --auto" is hopeless. ;)

-Peff

Re: [PATCH 3/3] gc: do not return error for prior errors in daemonized mode

2018-07-17 Thread Jeff King

On Mon, Jul 16, 2018 at 11:57:40PM -0700, Jonathan Nieder wrote:

> External tools like repo[1], though, do care about the exit status
> from "git gc --auto".  In non-daemonized mode, the exit status is
> straightforward: if there is an error, it is nonzero, but after a
> warning like the above, the status is zero.  The daemonized mode, as a
> side effect of the other properties provided, offers a very strange
> exit code convention:
> 
>  - if no housekeeping was required, the exit status is 0
> 
>  - the first real run, after forking into the background, returns exit
>status 0 unconditionally.  The parent process has no way to know
>whether gc will succeed.
> 
>  - if there is any diagnostic output in gc.log, subsequent runs return
>a nonzero exit status to indicate that gc was not triggered.
> 
> There's nothing for the calling program to act on on the basis of that
> error.  Use status 0 consistently instead, to indicate that we decided
> not to run a gc (just like if no housekeeping was required).  This
> way, repo and similar tools can get the benefit of the same behavior
> as tools like "git fetch" that ignore the exit status from gc --auto.

I think this is a good change.

In theory it might be useful to propagate the original exit code (_not_
"did we see a warning or an error", but the true original exit code. But
as you note, it's not deterministic anyway (we'd miss that exit code on
the first run, or even any simultaneous runs that exit early due to lock
contention). So it's clear that callers can't really do anything robust
based on the exit code of a daemonized "gc --auto".

I still think that "repo" should probably stop respecting the exit code.
But that's no excuse for Git not to have a sensible exit code in the
first place.

The patch itself looks good overall. A few comments (none of which I
think even requires a fix):

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 1cc18a828c..5eaa4aaa7d 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should 
> go below
>  gc.autoPackLimit and gc.bigPackThreshold should be respected again.
>  
>  gc.logExpiry::
> - If the file gc.log exists, then `git gc --auto` won't run
> + If the file gc.log exists, then `git gc --auto` will print
> + its content and exit with status zero instead of running
>   unless that file is more than 'gc.logExpiry' old.  Default is
>   "1.day".  See `gc.pruneExpire` for more ways to specify its
>   value.

Yeah, this is definitely worth documenting. I was surprised there's no
discussion of daemonization at all in git-gc(1). I don't think adding it
is a blocker for this series, though.

>  static void gc_before_repack(void)
> @@ -561,7 +576,13 @@ int cmd_gc(int argc, const char **argv, const char 
> *prefix)
>   fprintf(stderr, _("See \"git help gc\" for manual 
> housekeeping.\n"));
>   }
>   if (detach_auto) {
> - report_last_gc_error(); /* dies on error */
> + int ret = report_last_gc_error();
> + if (ret < 0)
> + /* an I/O error occured, already reported */
> + exit(128);

We have a few exit(128)'s sprinkled throughout the code-base, and I
always wonder if they will one day go stale if we change the code that
die() uses. But it probably doesn't matter, and anyway I don't think
there is a better way to do this currently.

I would have written "return 128" since the other case arm also returns,
but I really cannot think of a reason to prefer one over the other.

> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index c474a94a9f..a222efdbe1 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -116,11 +116,11 @@ test_expect_success 'background auto gc does not run if 
> gc.log is present and re
>   test_config gc.autopacklimit 1 &&
>   test_config gc.autodetach true &&
>   echo fleem >.git/gc.log &&
> - test_must_fail git gc --auto 2>err &&
> - test_i18ngrep "^fatal:" err &&
> + git gc --auto 2>err &&
> + test_i18ngrep "^warning:" err &&
>   test_config gc.logexpiry 5.days &&
>   test-tool chmtime =-345600 .git/gc.log &&
> - test_must_fail git gc --auto &&
> + git gc --auto &&

Nice. At first I thought this was changing an existing test to cover the
new case (which I usually frown on), but it is just that your patch is
intentionally changing the case covered here. So this is the right thing
to do.

-Peff

Re: [PATCH 2/3] gc: exit with status 128 on failure

2018-07-17 Thread Jeff King

On Mon, Jul 16, 2018 at 11:54:16PM -0700, Jonathan Nieder wrote:

> A value of -1 returned from cmd_gc gets propagated to exit(),
> resulting in an exit status of 255.  Use die instead for a clearer
> error message and a controlled exit.

This feels a little funny because we know we're going to turn some of
these back in the next patch. That said, I'm OK with it, since this
version is already written.

-Peff

Re: [PATCH 1/3] gc: improve handling of errors reading gc.log

2018-07-17 Thread Jeff King

On Mon, Jul 16, 2018 at 11:53:21PM -0700, Jonathan Nieder wrote:

> A collection of minor error handling fixes:
> 
> - use an error message in lower case, following the usual style
> - quote filenames in error messages to make them easier to read and to
>   decrease translation load by matching other 'stat' error messages
> - check for and report errors from 'read', too
> - avoid being confused by a gc.log larger than INT_MAX bytes
> 
> Noticed by code inspection.
> 
> Signed-off-by: Jonathan Nieder 

Thanks, this all looks obviously good.

-Peff

Re: Are clone/checkout operations deterministic?

2018-07-17 Thread Jeff King

On Tue, Jul 17, 2018 at 11:48:45AM +0200, Ævar Arnfjörð Bjarmason wrote:

> In practice I think clone, checkout, reset etc. always work in the same
> order you see with `git ls-tree -r --name-only HEAD`, but as far as I
> know this has never been guaranteed or documented, and shouldn't be
> relied on.

I think this paragraph is correct in general (and I agree with the
sentiment that this is subject to change in future versions).

There is one concrete case I know that has non-deterministic order in
current versions: long-lived clean/smudge filters can defer their
response. The LFS filter uses this to tell Git "no, I'm still
downloading the content", at which point Git will proceed with checking
out other local files (or even other LFS files that happen to arrive
sooner).

Depending on what one wants to do with the determinism, it may be OK to
ignore that case. ;)

-Peff

Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C

2018-07-17 Thread Junio C Hamano

Stefan Beller  writes:

> Actually I thought it was really cool, i.e. when using your queued branch
> instead of my last sent branch, I can see any edits *you* did
> (including fixing up typos or applying at slightly different bases).

Absolutely.  I did not say that there needs a mode to ignore log
message.

> The sign offs are a bit unfortunate as they are repetitive.
> I have two conflicting points of view on that:
>
> (A) This sign off is inherent to the workflow. So we could
> change the workflow, i.e. you pull series instead of applying them.
> I think this "more in git, less in email" workflow would find supporters,
> such as DScho (cc'd).

Sign-off is inherent to the project, in the sense that we want to
see how the change flowed recorded in the commits.

With a pull-request based workflow, until Git is somehow improved so
that a "pull" becomes "fetch and rebase what got fetched on top of
my stuff, and add my sign-off while at it" (which is the opposite of
the usual "pull --rebase"), we would lose the ability to fully "use"
Git to run this project, as we would lose most sign-offs, unlike the
e-mail based workflow, which we can use Git fully to have it do its
job of recording necessary information.

And much more importantly, when "pull-request" based workflow is
improved enough so that your original without my sign-off (and you
shouldn't, unless you are relaying my changes) becomes what I
pulled, which does have my sign-off, range-diff that compares both
histories does need to deal with a pair of commits with only one
side having a sign-off.  So switching the tool is not a proper
solution to work around the "sign-off noise" we observed.  One side
having a sign-off while the other side does not is inherent to what
we actively want, and you are letting your tail wag your dog by
suggesting to discard it, which is disappointing.

> The other (2) downside is that everyone else (authors, reviewers) have
> to adapt as well. For authors this might be easy to adapt (push instead
> of sending email sounds like a win).

As I most often edit the log message and material below three-dash
lines (long) _after_ format-patch produced files, I do not think it
is a win to force me to push and ask to pull.

> (B) The other point of view that I can offer is that we teach range-diff
> to ignore certain patterns. Maybe in combination with interpret-trailers
> this can be an easy configurable thing, or even a default to ignore
> all sign offs?

That indicates the same direction as I alluded to in the message you
are responding to, I guess, which is a good thing.

Re: [PATCH 2/2] repack -ad: prune the list of shallow commits

2018-07-17 Thread Jeff King

On Tue, Jul 17, 2018 at 07:28:07PM +0200, Duy Nguyen wrote:

> On Mon, Jul 16, 2018 at 7:38 PM Jeff King  wrote:
> > in a user-visible way. And that's really my question: is pruning here
> > going to bite people unexpectedly (not rhetorical -- I really don't
> > know)?
> 
> If shallow points are no longer reachable, removing them should not
> bite anybody, I think.

I slept on this to see if I could brainstorm any other ways.

The only thing I really came up with is that removing shallows is racy
with respect to the reachability check.  For instance, if "prune" runs
at the same as an incoming shallow fetch, we'll compute the reachability
without holding the shallow lock. Somebody else may write an entry for
an object we've never heard of (because it just showed up), and we'd
erase it, making the repository appear corrupt.

But note that I used "prune" in the above example, because this bug
already exists. So probably we're not changing anything material here.
Though if we wanted to fix it, I think it would require holding the
shallow lock during the reachability analysis, which is probably not
something that repack would want to do. Unlike prune, it then does
a lot of other expensive operations, like delta compression and writing
out the pack, before it would get to the shallow prune.

Although even holding the lock for the duration of "prune" is more than
we need. Shallow commits must be commits, so we don't need to do a full
graph walk (and in my experience there's often an order-of-magnitude
difference between the two; even more so if we're using Stolee's
commit-graph cache).

> > I think in the case of git-prune and prune_shallow(), it's a little
> > tricky, though. It looks like prune_shallow() relies on somebody having
> > marked the SEEN flag, which implies doing a full reachability walk.
> 
> Sorry, my bad for hiding this SEEN flag deep in library code like this
> in eab3296c7e (prune: clean .git/shallow after pruning objects -
> 2013-12-05) But in my defense I didn't realize the horror of sharing
> object flags a year later and added the "object flag allocation" in
> object.h

Sort of an aside to the patch under discussion, but I think it may make
sense for prune_shallow() to take a callback function for determining
whether a commit is reachable.

I have an old patch that teaches git-prune to lazily do the reachability
check, since in many cases "git repack" will have just packed all of the
loose objects. But it just occurred to me that this patch is totally
broken with respect to prune_shallow(), because it would not set the
SEEN flag (I've literally been running with it for years, which goes to
show how often I use the shallow feature).

And if we were to have repack do a prune_shallow(), it may want to use a
different method than traversing and marking each object SEEN.

> > So either we have to do a reachability walk (which is expensive), or we
> > have to rely on some other command (like prune) that is doing a
> > reachability walk and reuse that work.
> 
> Since "git prune" took care of loose objects, if we're going to delete
> some redundant packs, we can perform a reasonably cheap "reachability"
> test in repack, I think. We have the list of new packs from
> pack-objects which should contain all reachable objects (and even some
> unreachable ones). If we traverse all of their idx files and mark as
> SEEN, then whatever shallow points that are not marked SEEN _and_ not
> loose objects could be safely removed.

Hmm. I don't immediately see any reason that would not work with the
current code. But I am a little uncomfortable adding these sorts of
subtle dependencies and assumptions. It feels like we're building a
house of cards.

> I don't see any problem with this either, but then I've not worked on
> shallow code for quite some time. The only potential problem is where
> we do this check. If we check (and drop) shallow points when we read
> .git/info/shallow, then when we write it down some time in the future
> we accidentally prune them. Which might be ok but I feel safer when we
> only prune at one place (prune/gc/repack) where we can be more careful
> and can do more testing.
> 
> If we read the shallow list as-is, then just not send "shallow" lines
> in fetch-pack for shallow points that do not exist, I don't see a
> problem, but it may be a bit more work and we could get to some
> confusing state where upload-pack gives us the same shallow point that
> we have but ignore because we don't have objects behind it. Hm...
> actually I might see a problem :)

Yeah, I agree if we were to do this it would definitely be in a
"read-only" way: let the current command skip those grafts for its
operation, but do not impact the on-disk shallow file. Then races or
other assumptions can at worst impact that operation, and not cause a
lasting corruption (and in particular I think we'd be subject to the
race I described at the start of this mail).

-Peff

Re: [PATCH v2 0/2] repack -ad: fix after `fetch --prune` in a shallow repository

2018-07-17 Thread Jeff King

On Tue, Jul 17, 2018 at 03:15:31PM -0400, Jeff King wrote:

> > - Also trigger `prune_shallow()` when `--unpack-unreachable=` 
> > was passed to `git repack`.
> > - No need to trigger `prune_shallow()` when `git repack` was called with 
> > `-k`.
> 
> I think you might have missed the bigger problem I pointed at, as it was
> buried deep within my later reply. Try applying this patch on top, which
> tests the opposite case (reachable shallow commits are retained):

By the way, I notice that the patches themselves are cc'd to you, but
the cover letter isn't. So my reply went only to gitgitgadget, which
(AFAIK) has not yet learned to work as a mail-to-comment gateway.

So I'm copying this to you directly to make sure you see it, but also
because I'm not sure if the gitgitgadget cc behavior is intentional or
not.

-Peff

Re: [PATCH v2 0/2] repack -ad: fix after `fetch --prune` in a shallow repository

2018-07-17 Thread Jeff King

On Tue, Jul 17, 2018 at 06:51:39AM -0700, Johannes Schindelin via GitGitGadget 
wrote:

> Under certain circumstances, commits that were reachable can be made
> unreachable, e.g. via `git fetch --prune`. These commits might have
> been packed already, in which case `git repack -adlf` will just drop
> them without giving them the usual grace period before an eventual
> `git prune` (via `git gc`) prunes them.
> 
> This is a problem when the `shallow` file still lists them, which is
> the reason why `git prune` edits that file. And with the proposed
> changes, `git repack -ad` will now do the same.
> 
> Reported by Alejandro Pauly.
> 
> Changes since v1:
> 
> - Also trigger `prune_shallow()` when `--unpack-unreachable=` 
> was passed to `git repack`.
> - No need to trigger `prune_shallow()` when `git repack` was called with `-k`.

I think you might have missed the bigger problem I pointed at, as it was
buried deep within my later reply. Try applying this patch on top, which
tests the opposite case (reachable shallow commits are retained):

diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
index d32ba20f9d..911e457ae1 100755
--- a/t/t5537-fetch-shallow.sh
+++ b/t/t5537-fetch-shallow.sh
@@ -186,17 +186,20 @@ EOF
test_cmp expect actual
 '
 
-test_expect_success '.git/shallow is edited by repack' '
+test_expect_success 'set up shallow server' '
git init shallow-server &&
test_commit -C shallow-server A &&
test_commit -C shallow-server B &&
git -C shallow-server checkout -b branch &&
test_commit -C shallow-server C &&
test_commit -C shallow-server E &&
test_commit -C shallow-server D &&
d="$(git -C shallow-server rev-parse --verify D)" &&
-   git -C shallow-server checkout master &&
+   git -C shallow-server checkout master
+'
 
+test_expect_success 'repack drops unreachable objects from .git/shallow' '
+   test_when_finished "rm -rf shallow-client" &&
git clone --depth=1 --no-tags --no-single-branch \
"file://$PWD/shallow-server" shallow-client &&
 
@@ -213,4 +216,13 @@ test_expect_success '.git/shallow is edited by repack' '
origin "+refs/heads/*:refs/remotes/origin/*"
 '
 
+test_expect_success 'repack keeps reachable objects in .git/shallow' '
+   test_when_finished "rm -rf shallow-client" &&
+   git clone --depth=1 --no-tags --no-single-branch \
+   "file://$PWD/shallow-server" shallow-client &&
+
+   git -C shallow-client repack -adfl &&
+   grep $d shallow-client/.git/shallow
+'
+
 test_done

Re: [RFC PATCH 5/6] sha1-name: Teach `get_oid_with_context[_1]()` to be gentle

2018-07-17 Thread Junio C Hamano

Paul-Sebastian Ungureanu  writes:

> @@ -1769,7 +1831,8 @@ void maybe_die_on_misspelt_object_name(const char 
> *name, const char *prefix)
>  
>  int get_oid_with_context(const char *str, unsigned flags, struct object_id 
> *oid, struct object_context *oc)
>  {
> - if (flags & GET_OID_FOLLOW_SYMLINKS && flags & GET_OID_ONLY_TO_DIE)
> + if (flags & (GET_OID_FOLLOW_SYMLINKS | GET_OID_GENTLY) &&
> + flags & GET_OID_ONLY_TO_DIE)
>   BUG("incompatible flags for get_sha1_with_context");
>   return get_oid_with_context_1(str, flags, NULL, oid, oc);
>  }

This points us back to "only-to-die" which was "gently" before
2e83b66c ("fix overslow :/no-such-string-ever-existed diagnostics",
2011-05-10).  I think we have to keep them both, as only-to-die
means more than just being not gentle, and we cannot revert the
renaming s/!gently/only-to-die/ done by 2e83b66c and teach GENTLY to
more codepaths, I think.  But I might be mistaken and we may be able
to get rid of only-to-die at the end of this series.  I dunno.

In any case, what's the reason why this new "gentle" option is
incompatible with "only-to-die"?

Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C

2018-07-17 Thread Eric Sunshine

On Tue, Jul 17, 2018 at 2:53 PM Stefan Beller  wrote:
> > A tangent.
> >
> > Because this "-- " is a conventional signature separator, MUAs like
> > Emacs message-mode seems to omit everything below it from the quote
> > while responding, making it cumbersome to comment on the tbdiff.
> >
> > Something to think about if somebody is contemplating on adding more
> > to format-patch's cover letter.
>
> +cc Eric who needs to think about this tangent, then.
> https://public-inbox.org/git/20180530080325.37520-1-sunsh...@sunshineco.com/

The "git-format-patch --range-diff" option implemented by that patch
series (and its upcoming re-roll) place the range-diff before the "--
" signature line, so this isn't a problem.

I think Junio's tangent was targeted more at humans blindly plopping
the copy/pasted range-diff at the end of the cover letter without
taking the "-- " signature line into account. (Indeed, Gmail hides
everything below the "-- " line by default, as well.)

Re: [RFC PATCH 2/6] tree-walk: Add three new gentle helpers

2018-07-17 Thread Junio C Hamano

Paul-Sebastian Ungureanu  writes:

> Add `get_tree_entry_gently()`, `find_tree_entry_gently()`
> and `get_tree_entry_follow_symlinks_gently()`, which will
> make `get_oid()` to be more gently.
>
> Since `get_tree_entry()` is used in more than 20 places,
> adding a new parameter will make this commit harder to read.
> In every place it is called there will need to be an additional
> 0 parameter at the end of the call. The solution to avoid this is
> to rename the function in `get_tree_entry_gently()` which gets
> an additional `flags` variable. A new `get_tree_entry()`
> will call `get_tree_entry_gently()` with `flags` being 0.
> This way, no additional changes will be needed.

And that is the right way to introduce a new feature to existing API
with many callers in general.

I wonder if the GENTLY option should apply to update_tree_entry()
the same way as it would to the other codepaths that currently die
to express "we were handed this string by the caller and told to
give back object ID the string represents, and we found no good
answer".  In this one (and the "bad ref" one), the existing failures
in these two codepaths are not "we got a string and that does not
resolve to an object name", but "we didn't have the data to work on
to begin with (either a corrupt tree object or a corrupt ref").

In other words, it's not like "We were given HEAD:no-such-path and
there is no such path in that tree"; it is "We tried to read HEAD:
tree for no-such-path in it, but the tree was corrupt and we couldn't
even tell if such a path is or is not in it", no?

> Signed-off-by: Paul-Sebastian Ungureanu 
> ---
>  sha1-name.c |   2 +-
>  tree-walk.c | 108 +++-
>  tree-walk.h |   3 +-
>  3 files changed, 94 insertions(+), 19 deletions(-)
>
> diff --git a/sha1-name.c b/sha1-name.c
> index 60d9ef3c7..d741e1129 100644
> --- a/sha1-name.c
> +++ b/sha1-name.c
> @@ -1721,7 +1721,7 @@ static int get_oid_with_context_1(const char *name,
>   if (flags & GET_OID_FOLLOW_SYMLINKS) {
>   ret = get_tree_entry_follow_symlinks(&tree_oid,
>   filename, oid, &oc->symlink_path,
> - &oc->mode);
> + &oc->mode, flags);
>   } else {
>   ret = get_tree_entry(&tree_oid, filename, oid,
>&oc->mode);
> diff --git a/tree-walk.c b/tree-walk.c
> index 8f5090862..2925eaec2 100644
> --- a/tree-walk.c
> +++ b/tree-walk.c
> @@ -491,7 +491,9 @@ struct dir_state {
>   struct object_id oid;
>  };
>  
> -static int find_tree_entry(struct tree_desc *t, const char *name, struct 
> object_id *result, unsigned *mode)
> +static int find_tree_entry(struct tree_desc *t, const char *name,
> +   struct object_id *result, unsigned *mode,
> +   int flags)
>  {
>   int namelen = strlen(name);
>   while (t->size) {
> @@ -501,7 +503,11 @@ static int find_tree_entry(struct tree_desc *t, const 
> char *name, struct object_
>  
>   oid = tree_entry_extract(t, &entry, mode);
>   entrylen = tree_entry_len(&t->entry);
> - update_tree_entry(t);
> +
> + if (!(flags & GET_OID_GENTLY))
> + update_tree_entry(t);
> + else if (update_tree_entry_gently(t))
> + return -1;
>   if (entrylen > namelen)
>   continue;
>   cmp = memcmp(name, entry, entrylen);
> @@ -521,19 +527,28 @@ static int find_tree_entry(struct tree_desc *t, const 
> char *name, struct object_
>   oidcpy(result, oid);
>   return 0;
>   }
> - return get_tree_entry(oid, name + entrylen, result, mode);
> + return get_tree_entry_gently(oid, name + entrylen, result, 
> mode, flags);
>   }
>   return -1;
>  }
>  
> -int get_tree_entry(const struct object_id *tree_oid, const char *name, 
> struct object_id *oid, unsigned *mode)
> +int get_tree_entry_gently(const struct object_id *tree_oid, const char *name,
> +   struct object_id *oid, unsigned *mode, int flags)
>  {
>   int retval;
>   void *tree;
>   unsigned long size;
>   struct object_id root;
>  
> - tree = read_object_with_reference(tree_oid, tree_type, &size, &root);
> + if (!(flags & GET_OID_GENTLY)) {
> + tree = read_object_with_reference(tree_oid, tree_type, &size, 
> &root);
> + } else {
> + struct object_info oi = OBJECT_INFO_INIT;
> +
> + oi.contentp = tree;
> + if (oid_object_info_extended(the_repository, tree_oid, &oi, 0) 
> < 0)
> + return -1;
> + }
>   if (!tree)
>   return -1;
>  
> @@ -547,13 +562,27 @@ int get_tree_entry(const struct object_

Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C

2018-07-17 Thread Stefan Beller

> A tangent.
>
> Because this "-- " is a conventional signature separator, MUAs like
> Emacs message-mode seems to omit everything below it from the quote
> while responding, making it cumbersome to comment on the tbdiff.
>
> Something to think about if somebody is contemplating on adding more
> to format-patch's cover letter.

+cc Eric who needs to think about this tangent, then.
https://public-inbox.org/git/20180530080325.37520-1-sunsh...@sunshineco.com/

>
> >> 2.18.0.203.gfac676dfb9-goog
> >>
> >> 1:  d4e1ec45740 ! 1:  bbc8697a8ca git-submodule.sh: align error reporting 
> >> for update mode to use path
> >> @@ -6,7 +6,6 @@
> >>  on its path, so let's do that for invalid update modes, too.
> >>
> >>  Signed-off-by: Stefan Beller 
> >> -Signed-off-by: Junio C Hamano 
> >>
> >>  diff --git a/git-submodule.sh b/git-submodule.sh
> >>  --- a/git-submodule.sh
>
> This is quite unfortunate.  I wonder if it is easy to tell
> range-diff that certain differences in the log message are to be
> ignored so that we can show that the first patch is unchanged in a
> case like this.  This series has 4 really changed ones with 2
> otherwise unchanged ones shown all as changed, which is not too bad,
> but for a series like sb/diff-colro-move-more reroll that has 9
> patches, out of only two have real updated patches, showing
> otherwise unchanged 7 as changed like this hunk does would make the
> cover letter useless.  It is a shame that adding range-diff to the
> cover does have so much potential.

Actually I thought it was really cool, i.e. when using your queued branch
instead of my last sent branch, I can see any edits *you* did
(including fixing up typos or applying at slightly different bases).

The sign offs are a bit unfortunate as they are repetitive.
I have two conflicting points of view on that:

(A) This sign off is inherent to the workflow. So we could
change the workflow, i.e. you pull series instead of applying them.
I think this "more in git, less in email" workflow would find supporters,
such as DScho (cc'd).

The downside is that (1) you'd have to change your
workflow, i.e. instead of applying the patches at the base you think is
best for maintenance you'd have to tell the author "please rebase to $X";
but that also has upsides, such as "If you want to have your series integrated
please merge with $Y and $Z" (looking at the object store stuff).

The other (2) downside is that everyone else (authors, reviewers) have
to adapt as well. For authors this might be easy to adapt (push instead
of sending email sounds like a win). For reviewers we'd need to have
an easy way to review things "stored in git" and not exposed via email,
which is not obvious how to do.

(B) The other point of view that I can offer is that we teach range-diff
to ignore certain patterns. Maybe in combination with interpret-trailers
this can be an easy configurable thing, or even a default to ignore
all sign offs?

Thanks,
Stefan

[GSoC] GSoC with git, week 11

2018-07-17 Thread Alban Gruin

Hi,

I published a new blog post about last week:

https://blog.pa1ch.fr/posts/2018/07/17/en/gsoc2018-week11.html

Cheers,
Alban

Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C

2018-07-17 Thread Junio C Hamano

Stefan Beller  writes:

>> v2:
>> addressed review comments, renaming the struct, improving the commit message.
>>
>> v1:
>> https://public-inbox.org/git/20180712194754.71979-1-sbel...@google.com/
>> I thought about writing it all in one go, but the series got too large,
>> so let's chew one bite at a time.
>>
>> Thanks,
>> Stefan
>>
>> Stefan Beller (6):
>>   git-submodule.sh: align error reporting for update mode to use path
>>   git-submodule.sh: rename unused variables
>>   builtin/submodule--helper: factor out submodule updating
>>   builtin/submodule--helper: store update_clone information in a struct
>>   builtin/submodule--helper: factor out method to update a single
>> submodule
>>   submodule--helper: introduce new update-module-mode helper
>>
>>  builtin/submodule--helper.c | 152 
>>  git-submodule.sh|  22 +-
>>  2 files changed, 122 insertions(+), 52 deletions(-)
>>
>> -- 

A tangent.

Because this "-- " is a conventional signature separator, MUAs like
Emacs message-mode seems to omit everything below it from the quote
while responding, making it cumbersome to comment on the tbdiff.

Something to think about if somebody is contemplating on adding more
to format-patch's cover letter.

>> 2.18.0.203.gfac676dfb9-goog
>>
>> 1:  d4e1ec45740 ! 1:  bbc8697a8ca git-submodule.sh: align error reporting 
>> for update mode to use path
>> @@ -6,7 +6,6 @@
>>  on its path, so let's do that for invalid update modes, too.
>>  
>>  Signed-off-by: Stefan Beller 
>> -Signed-off-by: Junio C Hamano 
>>  
>>  diff --git a/git-submodule.sh b/git-submodule.sh
>>  --- a/git-submodule.sh

This is quite unfortunate.  I wonder if it is easy to tell
range-diff that certain differences in the log message are to be
ignored so that we can show that the first patch is unchanged in a
case like this.  This series has 4 really changed ones with 2
otherwise unchanged ones shown all as changed, which is not too bad,
but for a series like sb/diff-colro-move-more reroll that has 9
patches, out of only two have real updated patches, showing
otherwise unchanged 7 as changed like this hunk does would make the
cover letter useless.  It is a shame that adding range-diff to the
cover does have so much potential.

Re: [PATCH 2/3] gc: exit with status 128 on failure

2018-07-17 Thread Junio C Hamano

Jonathan Nieder  writes:

> A value of -1 returned from cmd_gc gets propagated to exit(),
> resulting in an exit status of 255.  Use die instead for a clearer
> error message and a controlled exit.
>
> Signed-off-by: Jonathan Nieder 
> ---
> As in
> https://public-inbox.org/git/20180716225639.gk11...@aiede.svl.corp.google.com/.
> The only change is splitting out patch 1/3.
>
>  builtin/gc.c  | 35 ++-
>  t/t6500-gc.sh |  2 +-
>  2 files changed, 15 insertions(+), 22 deletions(-)

I double checked "return' in cmd_gc() and this patch catches them
all.  Good clean-up.

Thanks.

> diff --git a/builtin/gc.c b/builtin/gc.c
> index d69fc4c0b0..95c8afd07b 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -438,10 +438,9 @@ static const char *lock_repo_for_gc(int force, pid_t* 
> ret_pid)
>   return NULL;
>  }
>  
> -static int report_last_gc_error(void)
> +static void report_last_gc_error(void)
>  {
>   struct strbuf sb = STRBUF_INIT;
> - int ret = 0;
>   ssize_t len;
>   struct stat st;
>   char *gc_log_path = git_pathdup("gc.log");
> @@ -450,8 +449,7 @@ static int report_last_gc_error(void)
>   if (errno == ENOENT)
>   goto done;
>  
> - ret = error_errno(_("cannot stat '%s'"), gc_log_path);
> - goto done;
> + die_errno(_("cannot stat '%s'"), gc_log_path);
>   }
>  
>   if (st.st_mtime < gc_log_expire_time)
> @@ -459,9 +457,9 @@ static int report_last_gc_error(void)
>  
>   len = strbuf_read_file(&sb, gc_log_path, 0);
>   if (len < 0)
> - ret = error_errno(_("cannot read '%s'"), gc_log_path);
> + die_errno(_("cannot read '%s'"), gc_log_path);
>   else if (len > 0)
> - ret = error(_("The last gc run reported the following. "
> + die(_("The last gc run reported the following. "
>  "Please correct the root cause\n"
>  "and remove %s.\n"
>  "Automatic cleanup will not be performed "
> @@ -471,20 +469,18 @@ static int report_last_gc_error(void)
>   strbuf_release(&sb);
>  done:
>   free(gc_log_path);
> - return ret;
>  }
>  
> -static int gc_before_repack(void)
> +static void gc_before_repack(void)
>  {
>   if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD))
> - return error(FAILED_RUN, pack_refs_cmd.argv[0]);
> + die(FAILED_RUN, pack_refs_cmd.argv[0]);
>  
>   if (prune_reflogs && run_command_v_opt(reflog.argv, RUN_GIT_CMD))
> - return error(FAILED_RUN, reflog.argv[0]);
> + die(FAILED_RUN, reflog.argv[0]);
>  
>   pack_refs = 0;
>   prune_reflogs = 0;
> - return 0;
>  }
>  
>  int cmd_gc(int argc, const char **argv, const char *prefix)
> @@ -565,13 +561,11 @@ int cmd_gc(int argc, const char **argv, const char 
> *prefix)
>   fprintf(stderr, _("See \"git help gc\" for manual 
> housekeeping.\n"));
>   }
>   if (detach_auto) {
> - if (report_last_gc_error())
> - return -1;
> + report_last_gc_error(); /* dies on error */
>  
>   if (lock_repo_for_gc(force, &pid))
>   return 0;
> - if (gc_before_repack())
> - return -1;
> + gc_before_repack(); /* dies on failure */
>   delete_tempfile(&pidfile);
>  
>   /*
> @@ -611,12 +605,11 @@ int cmd_gc(int argc, const char **argv, const char 
> *prefix)
>   atexit(process_log_file_at_exit);
>   }
>  
> - if (gc_before_repack())
> - return -1;
> + gc_before_repack();
>  
>   if (!repository_format_precious_objects) {
>   if (run_command_v_opt(repack.argv, RUN_GIT_CMD))
> - return error(FAILED_RUN, repack.argv[0]);
> + die(FAILED_RUN, repack.argv[0]);
>  
>   if (prune_expire) {
>   argv_array_push(&prune, prune_expire);
> @@ -626,18 +619,18 @@ int cmd_gc(int argc, const char **argv, const char 
> *prefix)
>   argv_array_push(&prune,
>   "--exclude-promisor-objects");
>   if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
> - return error(FAILED_RUN, prune.argv[0]);
> + die(FAILED_RUN, prune.argv[0]);
>   }
>   }
>  
>   if (prune_worktrees_expire) {
>   argv_array_push(&prune_worktrees, prune_worktrees_expire);
>   if (run_command_v_opt(prune_worktrees.argv, RUN_GIT_CMD))
> - return error(FAILED_RUN, prune_worktrees.argv[0]);
> + die(FAILED_RUN, prune_worktrees.argv[0]);
>   }
>  
>   if (run_comma

Re: [PATCH 1/3] gc: improve handling of errors reading gc.log

2018-07-17 Thread Junio C Hamano

Jonathan Nieder  writes:

> - avoid being confused by a gc.log larger than INT_MAX bytes

;-)

>
> Noticed by code inspection.
>
> Signed-off-by: Jonathan Nieder 
> ---

Looks good.

Re: [PATCH] gc: do not warn about too many loose objects

2018-07-17 Thread Junio C Hamano

Jonathan Nieder  writes:

> That doesn't really solve the problem:
>
>  1. "gc --auto" would produce noise *every time it runs* until gc.log
> is removed, for example via expiry
>
>  2. "gc --auto" would not do any garbage collection until gc.log is
> removed, for example by expiry
>
>  3. "gc --auto" would still not ratelimit itself, for example when
> there are a large number of loose unreachable objects that is not
> large enough to hit the loose object threshold.
>
> but maybe it's better than the present state.
>
> To solve (1) and (2), we could introduce a gc.warnings file that
> behaves like gc.log except that it only gets printed once and then
> self-destructs, allowing gc --auto to proceed.  

That makes it not rate-limit, no?

> To solve (3), we could
> introduce a gc.lastrun file that is touched whenever "gc --auto" runs
> successfully and make "gc --auto" a no-op for a while after each run.

Does absolute time-interval make sense here?  Some repositories may
be busily churning new objects and for them 5-minute interval may be
too infrequent, while other repositories may be relatively slow and
once a day may be sufficient for them.  I wonder if we can somehow
auto-tune this.

> To avoid wasteful repeated fruitless gcs, when gc.log is present, the
> subsequent "gc --auto" would die after print its contents.  Most git

s/print/&ing/

> commands, such as "git fetch", ignore the exit status from "git gc
> --auto" so all is well at this point: the user gets to see the error
> message, and the fetch succeeds, without a wasteful additional attempt
> at an automatic gc.

The above, by saying "Most git commands", leaves readers want to
know "then what are minority git commands that do not correctly do
so?"  If you are not going to answer that question, it probably is
better not to even say "Most".

> External tools like repo[1], though, do care about the exit status
> from "git gc --auto".  In non-daemonized mode, the exit status is
> straightforward: if there is an error, it is nonzero, but after a
> warning like the above, the status is zero.  The daemonized mode, as a
> side effect of the other properties provided, offers a very strange
> exit code convention:
>
>  - if no housekeeping was required, the exit status is 0

OK.

>  - the first real run, after forking into the background, returns exit
>status 0 unconditionally.  The parent process has no way to know
>whether gc will succeed.

Understandable; allowing the parent to exit before we know the
outcome of the gc is the whole point of backgrounding.

>  - if there is any diagnostic output in gc.log, subsequent runs return
>a nonzero exit status to indicate that gc was not triggered.

This is unfortunate.

> There's nothing for the calling program to act on on the basis of that
> error.  Use status 0 consistently instead, to indicate that we decided
> not to run a gc (just like if no housekeeping was required).  

s/.$/ in the last case./  And I fully agree with the reasoning.

> This
> way, repo and similar tools can get the benefit of the same behavior
> as tools like "git fetch" that ignore the exit status from gc --auto.
>
> Once the period of time described by gc.pruneExpire elapses, the
> unreachable loose objects will be removed by "git gc --auto"
> automatically.
>
> [1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/
>
> Reported-by: Andrii Dehtiarov 
> Helped-by: Jeff King 
> Signed-off-by: Jonathan Nieder 
> ---
>  Documentation/config.txt |  3 ++-
>  builtin/gc.c | 16 +++-
>  t/t6500-gc.sh|  6 +++---
>  3 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 1cc18a828c..5eaa4aaa7d 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should 
> go below
>  gc.autoPackLimit and gc.bigPackThreshold should be respected again.
>  
>  gc.logExpiry::
> - If the file gc.log exists, then `git gc --auto` won't run
> + If the file gc.log exists, then `git gc --auto` will print
> + its content and exit with status zero instead of running
>   unless that file is more than 'gc.logExpiry' old.  Default is
>   "1.day".  See `gc.pruneExpire` for more ways to specify its
>   value.
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 2bebc52bda..484ab21b8c 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -438,7 +438,7 @@ static const char *lock_repo_for_gc(int force, pid_t* 
> ret_pid)
>   return NULL;
>  }
>  
> -static void report_last_gc_error(void)
> +static int report_last_gc_error(void)
>  {
>   struct strbuf sb = STRBUF_INIT;
>   ssize_t ret;
> @@ -449,7 +449,7 @@ static void report_last_gc_error(void)
>   if (errno == ENOENT)
>   goto done;
>  
> - die_errno(_("cannot stat '%s'"), gc_log_path);
> + return error_errno(_

Re: Are clone/checkout operations deterministic?

2018-07-17 Thread Duy Nguyen

On Tue, Jul 17, 2018 at 11:50 AM Ævar Arnfjörð Bjarmason
 wrote:
> In practice I think clone, checkout, reset etc. always work in the same
> order you see with `git ls-tree -r --name-only HEAD`, but as far as I
> know this has never been guaranteed or documented, and shouldn't be
> relied on.
>
> E.g. there's probably cases where writing files in parallel is going to
> be faster than writing them sequentially. We don't have such a mode just
> because nobody's written a patch for it, but having that patch would

Well I did have some patches [1] but as usually I did not follow
through. Interestingly there's actually concern about timestamp order
[2] even back then

[1] https://public-inbox.org/git/20160415095139.GA3985@lanh/
[2] 
https://public-inbox.org/git/cap8ufd0wzhriy340eh3k6ygzb0txnot+xay8+c2k+n2x9ub...@mail.gmail.com/

> break any assumptions of our current order.
-- 
Duy

Re: Are clone/checkout operations deterministic?

2018-07-17 Thread Stefan Beller

On Tue, Jul 17, 2018 at 2:48 AM Ævar Arnfjörð Bjarmason
 wrote:
>
>
> On Tue, Jul 17 2018, J. Paul Reed wrote:
>
> > Hey Git Devs,
> >
> > I have a bit of an odd question: do git clone/checkout operations have a
> > deterministic ordering?
> >
> > That is: are files guaranteed to be laid down onto disk in any specific
> > (and deterministic) order as a clone and/or checkout operations occurs?
> > (And if so, is it alphabetical order? Depth-first? Something else?)
> >
> > In case the answer is different (and I'd guess that it might be?), I'm
> > mostly interested in the initial clone case... but it would be great to
> > know if, indeed, the answer is different for just-checkouts too.
> >
> > I did some cursory googling, but nothing hopped out at me as an answer to
> > this question.
>
> In practice I think clone, checkout, reset etc. always work in the same
> order you see with `git ls-tree -r --name-only HEAD`, but as far as I
> know this has never been guaranteed or documented, and shouldn't be
> relied on.

The transmission of packfiles is non-deterministic, as the packfiles
(which are packed for each clone separately when using core git as
a server) are not packed in a deterministic fashion, but in a threaded
environment which allows different packing orders.

If you clone from a server that gives you exactly the same pack at
all times (assuming the remote repo doesn't change refs), then
checkout is currently deterministic in unpacking files to the working tree.

>
> E.g. there's probably cases where writing files in parallel is going to
> be faster than writing them sequentially. We don't have such a mode just
> because nobody's written a patch for it, but having that patch would
> break any assumptions of our current order.

+cc Ben who is looking into that, but hasn't spoken up on the mailing list yet.

Re: [PATCH 11/25] t7400: fix broken "submodule add/reconfigure --force" test

2018-07-17 Thread Junio C Hamano

Junio C Hamano  writes:

>> So, this SQUASH looks like the correct way forward. Hopefully, Junio
>> can just squash it so I don't have to flood the list again with this
>> long series.
>
> The topic already has another dependent topic so rerolling or
> squashing would be a bit cumbersome.  I'll see what I could do but
> it may not be until tomorrow's pushout.

OK, there was only one topic that forked fro this one, so squash
with rebase are all in the 'pu'.

Thanks, all.

Re: [PATCH v3 09/20] range-diff: adjust the output of the commit pairs

2018-07-17 Thread Stefan Beller

Hi Johannes,

> > It's nice to see that the bulk of the range-diff functionality has
> > been libified in this re-roll (residing in range-diff.c rather than
>
> Can we *please* stop calling it "re-roll"? Thanks.

Fun fact of the day:

First appearance of "reroll" in the public archive is (09 Dec 2007)
https://public-inbox.org/git/7vy7c3ogu2@gitster.siamese.dyndns.org/
which is predated by "re-roll" (05 May 2006)
https://public-inbox.org/git/7vr738w8t4@assigned-by-dhcp.cox.net/

Stefan

Re: [RFC PATCH 0/6] Add gentle alternative for `get_oid()`

2018-07-17 Thread Duy Nguyen

On Tue, Jul 17, 2018 at 2:10 PM Paul-Sebastian Ungureanu
 wrote:
>
> At the moment, `get_oid()` might call `die()` in some cases. To
> prevent that from happening, this patches introduces a new flag
> called `GET_OID_GENTLY` and a new function `get_oid_gently()`,
> which passes the mentioned flag further to `get_oid_with_context()`.

Since get_oid() callers must handle failure when it returns non-zero,
I would say "gently" is already implied by get_oid() and we could just
convert those die() to error() or warning(). Unless some of those
die() are very special that we need to choose which call sites should
go "even gentler" where some sites should still die()?
-- 
Duy

Re: [PATCH v3 1/3] t7501: add merge conflict tests for dry run

2018-07-17 Thread Junio C Hamano

Junio C Hamano  writes:

> But by splitting these into separate tests, the patch makes such a
> potential failure with "git commit --short" break the later steps.
>
> Not very nice.
>
> It may be a better change to just do in the original one
>
>   git add test-file &&
>   git commit --dry-run &&
> + git commit --short &&
> + git commit --long &&
> + git commit --porcelain &&
>   git commit -m "conflicts fixed from merge."
>
> without adding these new and separate tests, and then mark that one
> to expect a failure (because it would pass up to the --dry-run
> commit, but the --short commit would fail) at this step, perhaps?

Of course, if you want to be more thorough, anticipating that other
people in their future updates may break --short but not --long or
--porcelain, testing each option in separate test_expect_success is
a necessary way to do so, but then you'd need to actually be more
thorough, by not merely running each of them in separate
test_expect_success block but also arranging that each of them start
in an expected state to try the thing we want it to try.  That is

for opt in --dry-run --short --long --porcelain
do
test_expect_success "commit $opt" '
set up the conflicted state after merge &&
git commit $opt
'
done

where the "set up the state" part makes sure it can tolerate
potential mistakes of previous run of "git commit $opt" (e.g. it
by mistake made a commit, making the index identical to HEAD and
taking us out of "merge in progress" state).

But from your 1/3 I did not get the impression that you particularly
want to be more thorough, and from your 3/3 I did not get the
impression that you anticipate --short/--long/--porcelain may get
broken independently.  And if that is the case, then chaining all of
them together like the above is a more honest way to express that we
are only doing a minimum set of testing.

Thanks.

Re: [PATCH v2 2/2] repack -ad: prune the list of shallow commits

2018-07-17 Thread Eric Sunshine

On Tue, Jul 17, 2018 at 9:51 AM Johannes Schindelin via GitGitGadget
 wrote:
> While it is true that we never add unreachable commits into pack files
> intentionally (as `git repack`'s documentation states), we must not
> forget that a `git fetch --prune` (or even a `git fetch` when a ref was
> force-pushed in the meantime) can make a commit unreachable that was
> reachable before.
>
> Therefore it is not safe to assume that a `git repack -adlf` will keep
> unreachable commits alone (under the assumption that they had not been
> packed in the first place).
>
> This is particularly important to keep in mind when looking at the
> `.git/shallow` file: if any commits listed in that file become
> unreachable, it is not a problem, but if they go missing, it *is* a
> problem. One symptom of this problem is that a deepening fetch may now
> fail with
>
> fatal: error in object: unshallow 
>
> To avoid this problem, let's prune the shallow list in `git repack` when
> the `-d` option is passed, unless `-A` is passed, too (which would force
> the now-unreachable objects to be turned into loose objects instead of
> being deleted). Additionally, e also need to take `--keep-reachable` and

s/, e/, we/

> `--unpack-unreachable=` into account.
>
> Signed-off-by: Johannes Schindelin

Re: [PATCH v3 3/3] commit: fix exit code for --short/--porcelain

2018-07-17 Thread Junio C Hamano

Samuel Lijin  writes:

> diff --git a/wt-status.c b/wt-status.c
> index 75d389944..4ba657978 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -718,6 +718,39 @@ static void wt_status_collect_untracked(struct wt_status 
> *s)
>   s->untracked_in_ms = (getnanotime() - t_begin) / 100;
>  }
>  
> +static int has_unmerged(const struct wt_status *s)
> +{
> + int i;
> +
> + for (i = 0; i < s->change.nr; i++) {
> + struct wt_status_change_data *d;
> + d = s->change.items[i].util;
> + if (d->stagemask)
> + return 1;
> + }
> + return 0;
> +}
> +
> +static void wt_status_mark_committable(
> + struct wt_status *s, const struct wt_status_state *state)
> +{
> + int i;
> +
> + if (state->merge_in_progress && !has_unmerged(s)) {
> + s->committable = 1;
> + return;
> + }

Is this trying to say:

During/after a merge, if there is no higher stage entry in
the index, we can commit.

I am wondering if we also should say:

During/after a merge, if there is any unresolved conflict in
the index, we cannot commit.

in which case the above becomes more like this:

if (state->merge_in_progress) {
s->committable = !has_unmerged(s);
return;
}

But with your patch, with no remaining conflict in the index during
a merge, the control comes here and goes into the next loop.

> + for (i = 0; i < s->change.nr; i++) {
> + struct wt_status_change_data *d = (s->change.items[i]).util;
> +
> + if (d->index_status && d->index_status != DIFF_STATUS_UNMERGED) 
> {
> + s->committable = 1;
> + return;
> + }
> + }

The loop seems to say "As long as there is one entry in the index
that is not in conflict and is different from the HEAD, then we can
commit".  Is that correct?  

Imagine there are two paths A and B in the branches involved in a
merge, and A cleanly resolves (say, we take their version because
our history did not touch it since we diverged) while B has
conflict.  We'll come to this loop (because we are in a merge but
have some unmerged paths) and we find that A is different from HEAD,
happily set committable bit and return.

I _think_ with the change to "what happens during merge" above that
I suggested, this loop automatically becomes correct, but I didn't
think it through.  If there are ways other than .merge_in_progress
that place conflicted entries in the index, then this loop is still
incorrect and would want to be more like:

for (i = 0; i < s->change.nr; i++) {
struct wt_status_change_data *d = (s->change.items[i]).util;

if (d->index_status == DIFF_STATUS_UNMERGED) {
s->committable = 0;
return;
}
if (d->index_status)
s->committable = 1;
}

i.e. we declare "not ready to commit" if there is *any* conflicted
entry, but otherwise set committable to 1 if we see any entry that
is different from HEAD (to declare succcess once we successfully
loop through to the last entry without seeing any conflict).

>  void wt_status_collect(struct wt_status *s, const struct wt_status_state 
> *state)
>  {
>   wt_status_collect_changes_worktree(s);
> @@ -728,6 +761,8 @@ void wt_status_collect(struct wt_status *s, const struct 
> wt_status_state *state)
>   wt_status_collect_changes_index(s);
>  
>   wt_status_collect_untracked(s);
> +
> + wt_status_mark_committable(s, state);
>  }
>  
>  static void wt_longstatus_print_unmerged(const struct wt_status *s)
> @@ -753,28 +788,28 @@ static void wt_longstatus_print_unmerged(const struct 
> wt_status *s)
>  
>  }
>  
> -static void wt_longstatus_print_updated(struct wt_status *s)
> +static void wt_longstatus_print_updated(const struct wt_status *s)
>  {
> - int shown_header = 0;
>   int i;
>  
> + if (!s->committable) {
> + return;
> + }

No need to have {} around a single statement.  Especially when you
know you won't be touching the line (e.g. to later add more
statements in the block) in this last patch in a series.

> + wt_longstatus_print_cached_header(s);
> +

Re: clean filter run in top of repo with wrong GIT_WORK_TREE

2018-07-17 Thread Joey Hess

The clean filter can work around this problem by chdir GIT_PREFIX,
but needing to do this in unusual cases seems to be asking for bugs.

-- 
see shy jo


signature.asc
Description: PGP signature

Re: [PATCH 2/2] repack -ad: prune the list of shallow commits

2018-07-17 Thread Duy Nguyen

On Mon, Jul 16, 2018 at 7:38 PM Jeff King  wrote:
> in a user-visible way. And that's really my question: is pruning here
> going to bite people unexpectedly (not rhetorical -- I really don't
> know)?

If shallow points are no longer reachable, removing them should not
bite anybody, I think.

> For instance, at GitHub we do not ever run "git gc", but have our
> maintenance regimen that builds around "git repack". I don't think this
> patch will matter to us either way, because we would never have a
> shallow repository in the first place. But I'm wondering if people in a
> similar situation might.
>
> I'm also not entirely sure if people _care_ if their shallow list is
> pruned. Maybe it's not a big deal, and should just be quietly cleaned
> up.
>
> And I know that you're probably coming at it from the opposite angle.
> Somebody isn't running git-gc but doing a "repack -adl" and they _do_
> want these shallow files cleaned up (because their repo is broken after
> dropping the objects). I just want to make sure we don't regress some
> other case.
>
> > > I.e., it seems unexpected that "git repack" is going to tweak your
> > > shallow lists. If we were designing from scratch, the sane behavior
> > > seems to me to be:
> > >
> > >   1. Shallow pruning should be its own separate command (distinct from
> > >  either repacking or loose object pruning), and should be triggered
> > >  as part of a normal git-gc.
> >
> > I fail to see the value in a separate command.
>
> The value of a separate command is that you can run those other commands
> without triggering this behavior. I actually think "git prune" does too
> much already (e.g., imagine that I do not ever want to prune objects,
> but I _do_ want to get rid of tmp_pack and tmp_obj cruft; what command
> do I run?). But that is definitely not a new problem. And I'm OK with
> not fixing it for now. My main concern is that we are using the presence
> of that mistake to justify making it again.
>
> > And: `git gc` already calls `git prune`, which *does* prune the shallow
> > list.
>
> Right, I was trying to propose that each individual cleanup step which
> _could_ be done independently should be done so, but that "gc" should
> continue to do them all.
>
> I think in the case of git-prune and prune_shallow(), it's a little
> tricky, though. It looks like prune_shallow() relies on somebody having
> marked the SEEN flag, which implies doing a full reachability walk.

Sorry, my bad for hiding this SEEN flag deep in library code like this
in eab3296c7e (prune: clean .git/shallow after pruning objects -
2013-12-05) But in my defense I didn't realize the horror of sharing
object flags a year later and added the "object flag allocation" in
object.h

> So it's really amortizing the work already being done by prune.
>
> Speaking of which, I don't see how your patch can work as-is. The repack
> command does not do such a walk, so it has no idea which commits are
> SEEN or not, and will delete all of them! Try this:
>
>   [shallow of clone of git.git (or any repo)]
>   $ git clone --no-local --depth=1 /path/to/git tmp
>   ...
>   $ cd tmp
>
>   [we have a commit]
>   $ git log --oneline -1
>   de46fca (grafted, HEAD) repack -ad: prune the list of shallow commits
>
>   [repacking with existing git is fine]
>   $ git repack -adl
>   ...
>   $ git log --oneline -1
>   de46fca (grafted, HEAD) repack -ad: prune the list of shallow commits
>
>   [repacking with your patch empties the shallow file, which
>makes the repository look corrupt]
>   $ /path/to/git repack -adl
>   $ git log --oneline -1
>   error: Could not read a2df50675979af639cf9490f7fc9b86fa18fd907
>   fatal: Failed to traverse parents of commit 
> de46fca5b43f47f3c5c6f9232a17490d39fc80b1
>
> So either we have to do a reachability walk (which is expensive), or we
> have to rely on some other command (like prune) that is doing a
> reachability walk and reuse that work.

Since "git prune" took care of loose objects, if we're going to delete
some redundant packs, we can perform a reasonably cheap "reachability"
test in repack, I think. We have the list of new packs from
pack-objects which should contain all reachable objects (and even some
unreachable ones). If we traverse all of their idx files and mark as
SEEN, then whatever shallow points that are not marked SEEN _and_ not
loose objects could be safely removed.

> > >   2b. It's OK for shallow objects to be missing, and the shallow code
> > >   should be more resilient to missing objects. Neither repack nor
> > >   prune needs to know or care.
> >
> > That would be possible. Kind of like saying: we do have this list of
> > shallow commits, but oh, BTW, it is likely incorrect, so we painstakingly
> > verify for every entry during every fetch and push that those commits
> > objects are still there.
>
> Obviously verifying reachability on each fetch is a bad idea. But my
> understanding of the shallow list is that it says "this is a graft point
> wher

Re: [PATCH v3 2/3] wt-status: teach wt_status_collect about merges in progress

2018-07-17 Thread Junio C Hamano

Samuel Lijin  writes:

> To fix the breakages documented by t7501, the next patch in this series
> will teach wt_status_collect() to set the committable bit, instead of
> having wt_longstatus_print_updated() and show_merge_in_progress() set it
> (which is what currently happens). Unfortunately, wt_status_collect()
> needs to know whether or not there is a merge in progress to set the bit
> correctly,

s/correctly,/correctly (a brief desription of why),/

would be nicer.  The description might be

(after a merge, it is OK for the result to be identical to HEAD,
which usually causes a "nothing to commit" error)

or something like that.

> so teach its (two) callers to create, initialize, and pass
> in instances of wt_status_state, which records this metadata.
>
> Since wt_longstatus_print() and show_merge_in_progress() are in the same
> callpaths and currently create and init copies of wt_status_state,
> remove that logic and instead pass wt_status_state through.

OK.  Sounds like a good clean-up.

> Make wt_status_get_state easier to use, add a helper method to clean up

Your description so far marked function names with trailing ();
let's do so consistently for wt_status_get_state(), too.

> wt_status_state, const-ify as many struct pointers in method signatures
> as possible, and add a FIXME for a struct pointer which should be const
> but isn't (that this patch series will not address).

"should be but isn't" because...?  I am wondering if it is better to
leave _all_ constifying to a later effort, if we are leaving some of
them behind anyway.  It would be better only if it will make this
patch easier to read if we did so.

Also you did s/commitable/committable/ everywhere, which was
somewhat distracting.  It would have been nicer to follow if that
were a separate preparatory clean-up patch.

>
> Signed-off-by: Samuel Lijin 
> ---
>  builtin/commit.c |  32 
>  ref-filter.c |   3 +-
>  wt-status.c  | 188 +++
>  wt-status.h  |  13 ++--
>  4 files changed, 120 insertions(+), 116 deletions(-)
>
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 37fcb55ab..79ef4f11a 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -463,6 +463,7 @@ static const char *prepare_index(int argc, const char 
> **argv, const char *prefix
>  static int run_status(FILE *fp, const char *index_file, const char *prefix, 
> int nowarn,
> struct wt_status *s)
>  {
> + struct wt_status_state state;
>   struct object_id oid;
>  
>   if (s->relative_paths)
> @@ -482,10 +483,12 @@ static int run_status(FILE *fp, const char *index_file, 
> const char *prefix, int
>   s->status_format = status_format;
>   s->ignore_submodule_arg = ignore_submodule_arg;
>  
> - wt_status_collect(s);
> - wt_status_print(s);
> + wt_status_get_state(s, &state);
> + wt_status_collect(s, &state);
> + wt_status_print(s, &state);
> + wt_status_clear_state(&state);
>  
> - return s->commitable;
> + return s->committable;
>  }
>  
>  static int is_a_merge(const struct commit *current_head)
> @@ -631,7 +634,7 @@ static int prepare_to_commit(const char *index_file, 
> const char *prefix,
>  {
>   struct stat statbuf;
>   struct strbuf committer_ident = STRBUF_INIT;
> - int commitable;
> + int committable;
>   struct strbuf sb = STRBUF_INIT;
>   const char *hook_arg1 = NULL;
>   const char *hook_arg2 = NULL;
> @@ -848,7 +851,7 @@ static int prepare_to_commit(const char *index_file, 
> const char *prefix,
>  
>   saved_color_setting = s->use_color;
>   s->use_color = 0;
> - commitable = run_status(s->fp, index_file, prefix, 1, s);
> + committable = run_status(s->fp, index_file, prefix, 1, s);
>   s->use_color = saved_color_setting;
>   } else {
>   struct object_id oid;
> @@ -866,7 +869,7 @@ static int prepare_to_commit(const char *index_file, 
> const char *prefix,
>   for (i = 0; i < active_nr; i++)
>   if (ce_intent_to_add(active_cache[i]))
>   ita_nr++;
> - commitable = active_nr - ita_nr > 0;
> + committable = active_nr - ita_nr > 0;
>   } else {
>   /*
>* Unless the user did explicitly request a submodule
> @@ -882,7 +885,7 @@ static int prepare_to_commit(const char *index_file, 
> const char *prefix,
>   if (ignore_submodule_arg &&
>   !strcmp(ignore_submodule_arg, "all"))
>   flags.ignore_submodules = 1;
> - commitable = index_differs_from(parent, &flags, 1);
> + committable = index_differs_from(parent, &flags, 1);
>   }
>   }
>   strbuf_release(&committer_ident);
> @@ -894,7 +897,7 @@ static

clean filter run in top of repo with wrong GIT_WORK_TREE

2018-07-17 Thread Joey Hess

When git is running inside a subdirectory of the repository, 
and needs to run the clean filter, it runs it chdired back to the top of
the repository. However, if git was run with a relative --work-tree,
it passes that relative path in GIT_WORK_TREE on to the clean filter.

If git was run with eg, "--work-tree=..", the clean filter sees a work
tree that is outside the repository. It might then read files located
outside the repository. That seems like it could have security
consequences, but it's certianly a surprising problem to need to deal
with when writing a clean filter.

Brian posted a fix for a very similar bug in sequencer.c on the 14th, 
so it seems likely there are other occurances of the same problem
elsewhere.

Demonstration of this bug:

joey@darkstar:~/tmp/repo>cat .gitattributes
* filter=foo
joey@darkstar:~/tmp/repo>git config filter.foo.clean
clean-filter %f
joey@darkstar:~/tmp/repo>cat ~/bin/clean-filter 
#!/bin/sh
pwd >&2
echo $GIT_WORK_TREE >&2
ls "$GIT_WORK_TREE/$1"
joey@darkstar:~/tmp/repo>cd foo/bar/
joey@darkstar:~/tmp/repo/foo/bar>ls
x
joey@darkstar:~/tmp/repo/foo/bar>touch x
joey@darkstar:~/tmp/repo/foo/bar>git --work-tree=../.. ls-files --modified
/home/joey/tmp/repo
../..
ls: cannot access '../../foo/bar/x': No such file or directory

git version 2.18.0

-- 
see shy jo


signature.asc
Description: PGP signature

Re: [PATCH v3 1/3] t7501: add merge conflict tests for dry run

2018-07-17 Thread Junio C Hamano

Samuel Lijin  writes:

> The behavior of git commit when doing a dry run changes if there are
> unfixed/fixed merge conflits, but the test suite currently only asserts
> that `git commit --dry-run` succeeds when all merge conflicts are fixed.
>
> Add tests to document the behavior of all flags which imply a dry run
> when (1) there is at least one unfixed merge conflict and (2) when all
> merge conflicts are all fixed.

s/conflits/conflicts/
s/fixed/resolved/g  (both above and in the patch text)
s/unfixed/unresolved/g  (both above and in the patch text)

> Signed-off-by: Samuel Lijin 
> ---
>  t/t7501-commit.sh | 45 -
>  1 file changed, 40 insertions(+), 5 deletions(-)
>
> diff --git a/t/t7501-commit.sh b/t/t7501-commit.sh
> index fa61b1a4e..be087e73f 100755
> --- a/t/t7501-commit.sh
> +++ b/t/t7501-commit.sh
> @@ -652,7 +652,8 @@ test_expect_success '--only works on to-be-born branch' '
>   test_cmp expected actual
>  '
>  
> -test_expect_success '--dry-run with conflicts fixed from a merge' '
> +# set up env for tests of --dry-run given fixed/unfixed merge conflicts
> +test_expect_success 'setup env with unfixed merge conflicts' '
>   # setup two branches with conflicting information
>   # in the same file, resolve the conflict,
>   # call commit with --dry-run
> @@ -665,11 +666,45 @@ test_expect_success '--dry-run with conflicts fixed 
> from a merge' '
>   git checkout -b branch-2 HEAD^1 &&
>   echo "commit-2-state" >test-file &&
>   git commit -m "commit 2" -i test-file &&
> - ! $(git merge --no-commit commit-1) &&
> - echo "commit-2-state" >test-file &&
> + test_expect_code 1 git merge --no-commit commit-1

The original is bad and also embarrassing.  Whatever comes out of
the standard output of "git merge" is $IFS split and executed as a
shell command (which likely results in "no such command" failure)
and it tries to make sure that a failure happens.

The right way to write that line (without your enhancement in this
patch) would have been:

test_must_fail git merge --no-commit commit-1 &&

I doubt it is a good idea to hardcode exit status of 1 by using
test_expect_code, though.  "git merge --help" does not say anything
about "1 means this failure, 2 means that failure, 3 means that
other failure".  And my quick forward scan of this series does not
tell me that you are trying to declare that from here on we _will_
make that promise to the end users by carving the exit status(es) in
stone.  The same about "git commit"'s exit code in the following
four tests.

> +'
> +
> +test_expect_success '--dry-run with unfixed merge conflicts' '
> + test_expect_code 1 git commit --dry-run
> +'
> +
> +test_expect_success '--short with unfixed merge conflicts' '
> + test_expect_code 1 git commit --short
> +'
> +
> +test_expect_success '--porcelain with unfixed merge conflicts' '
> + test_expect_code 1 git commit --porcelain
> +'
> +
> +test_expect_success '--long with unfixed merge conflicts' '
> + test_expect_code 1 git commit --long
> +'
> +
> +test_expect_success '--dry-run with conflicts fixed from a merge' '
> + echo "merge-conflicts-fixed" >test-file &&

The original test pretended that we resolved favouring the current
state with "commit-2-state" in the file, as if we ran "-s ours".
Is there a reason why we now use a different contents, or is this
just a change based on subjective preference?  

Not saying that the latter is necessrily bad; just trying to
understand why we are making this change.

>   git add test-file &&
> - git commit --dry-run &&
> - git commit -m "conflicts fixed from merge."
> + git commit --dry-run

OK, the original tried --dry-run to ensure it exited with 0 status
(i.e. have something to commit) and then did a commit to record the
updated state with a message.  You are checking only the dry-run
part, leaving the check of the final commit's status to another
test.

> +'
> +
> +test_expect_failure '--short with conflicts fixed from a merge' '
> + git commit --short
> +'

With "test_expect_failure", you are saying that "--short" _should_
exit with 0 but currently it does not.  An untold expectation is
that even with the breakage with the exit code, the command still
honors the (implicit) --dry-run correctly and does not create a
new commit.

That was actually tested in the original.  By &&-chaining like this

git commit --dry-run &&
git commit -m "conflicts fixed from merge."

we would have noticed if a newly introduced bug caused the first
step "commit --dry-run" to return non-zero status (because then the
step would fail), or if it stopped being dry-run and made a commit
(because then the next step would fail with "nothing to commit").

But by splitting these into separate tests, the patch makes such a
potential failure with "git commit --short" break the later steps.

Not very nice.

It may be a better change to just do in the

Re: [PATCH 2/2] repack -ad: prune the list of shallow commits

2018-07-17 Thread Duy Nguyen

On Tue, Jul 17, 2018 at 6:39 PM Duy Nguyen  wrote:
>
> On Fri, Jul 13, 2018 at 10:19 PM Johannes Schindelin via GitGitGadget
>  wrote:
> >
> > From: Johannes Schindelin 
> >
> > While it is true that we never add unreachable commits into pack files
> > intentionally (as `git repack`'s documentation states), we must not
> > forget that a `git fetch --prune` (or even a `git fetch` when a ref was
> > force-pushed in the meantime) can make a commit unreachable that was
> > reachable before.
> >
> > Therefore it is not safe to assume that a `git repack -adlf` will keep
> > unreachable commits alone (under the assumption that they had not been
> > packed in the first place).
> >
> > This is particularly important to keep in mind when looking at the
> > `.git/shallow` file: if any commits listed in that file become
> > unreachable, it is not a problem, but if they go missing, it *is* a
> > problem. One symptom of this problem is that a deepening fetch may now
> > fail with
> >
> > fatal: error in object: unshallow 
> >
>
> Could you elaborate a bit more?

Never mind. The problem is described in another patch.. sigh.. I
understand you want to flip that "failure" to "success" but personally
I don't think it worth it to look back in history and have "what?"
moments like when I read this patch alone.
-- 
Duy

Re: [PATCH v3] sequencer: use configured comment character

2018-07-17 Thread Johannes Schindelin

Hi Daniel,

On Mon, 16 Jul 2018, Daniel Harding wrote:

> On Mon, 16 Jul 2018 at 18:59:03 +0300, Johannes Schindelin wrote:
> > 
> > On Mon, 16 Jul 2018, Aaron Schrab wrote:
> > >
> > > Looking into that a bit further, it does seem like my explanation above
> > > was incorrect.  Here's another attempt to explain why setting
> > > core.commentChar=auto isn't a problem for this change.
> > >
> > > 8< -
> > >
> > > Use the configured comment character when generating comments about
> > > branches in a todo list.  Failure to honor this configuration causes a
> > > failure to parse the resulting todo list.
> > >
> > > Setting core.commentChar to "auto" will not be honored here, and the
> > > previously configured or default value will be used instead. But, since
> > > the todo list will consist of only generated content, there should not
> > > be any non-comment lines beginning with that character.
> > 
> > How about this instead?
> > 
> >  If core.commentChar is set to "auto", the intention is to
> >  determine the comment line character from whatever content is there
> >  already.
> > 
> >  As the code path in question is the one *generating* the todo list
> >  from scratch, it will automatically use whatever core.commentChar
> >  has been configured before the "auto" (and fall back to "#" if none
> >  has been configured explicitly), which is consistent with users'
> >  expectations.
> 
> Honestly, the above still doesn't read clearly to me.  I've take a stab at it
> myself - let me know what you think:
> 
> If core.commentChar is set to "auto", the comment_line_char global
> variable will be initialized to '#'.  The only time
> comment_line_char gets changed to an automatic value is when the
> prepare_to_commit() function (in commit.c) calls
> adjust_comment_line_char().  This does not happen when generating
> the todo list, so '#' will be used as the comment character in the
> todo list if core.commentChar is set to "auto".

There is a concocted way to have core.commentChar = auto *and* to override
the comment char: if you use `git config --add core.commentChar auto`, or
if you have it set in $HOME/.gitconfig and in .git/config.

I tried to cover that in my suggestion, but that was probably trying to be
too precise, rather than being useful...

Ciao,
Johannes

Re: [PATCH v3 20/20] range-diff: make --dual-color the default mode

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Mon, 16 Jul 2018, Eric Sunshine wrote:

> On Tue, Jul 3, 2018 at 7:26 AM Johannes Schindelin via GitGitGadget
>  wrote:
> > After using this command extensively for the last two months, this
> > developer came to the conclusion that even if the dual color mode still
> > leaves a lot of room for confusion what was actually changed, the
> 
> "...confusion _about_ what..."
> 
> > non-dual color mode is substantially worse in that regard.
> >
> > Therefore, we really want to make the dual color mode the default.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> > diff --git a/Documentation/git-range-diff.txt 
> > b/Documentation/git-range-diff.txt
> > @@ -31,11 +31,14 @@ all of their ancestors have been shown.
> > ---dual-color::
> > -   When the commit diffs differ, recreate the original diffs'
> > -   coloring, and add outer -/+ diff markers with the *background*
> > -   being red/green to make it easier to see e.g. when there was a
> > -   change in what exact lines were added.
> > +--no-dual-color::
> > +   When the commit diffs differ, `git range-diff` recreates the
> > +   original diffs' coloring, and add outer -/+ diff markers with
> 
> s/add/adds/
> 
> > +   the *background* being red/green to make it easier to see e.g.
> > +   when there was a change in what exact lines were added. This is
> > +   known to `range-diff` as "dual coloring". Use `--no-dual-color`
> > +   to revert to color all lines according to the outer diff markers
> > +   (and completely ignore the inner diff when it comes to color).

Yep, thank you very much!
Dscho

Re: [PATCH 2/2] repack -ad: prune the list of shallow commits

2018-07-17 Thread Duy Nguyen

On Fri, Jul 13, 2018 at 10:19 PM Johannes Schindelin via GitGitGadget
 wrote:
>
> From: Johannes Schindelin 
>
> While it is true that we never add unreachable commits into pack files
> intentionally (as `git repack`'s documentation states), we must not
> forget that a `git fetch --prune` (or even a `git fetch` when a ref was
> force-pushed in the meantime) can make a commit unreachable that was
> reachable before.
>
> Therefore it is not safe to assume that a `git repack -adlf` will keep
> unreachable commits alone (under the assumption that they had not been
> packed in the first place).
>
> This is particularly important to keep in mind when looking at the
> `.git/shallow` file: if any commits listed in that file become
> unreachable, it is not a problem, but if they go missing, it *is* a
> problem. One symptom of this problem is that a deepening fetch may now
> fail with
>
> fatal: error in object: unshallow 
>

Could you elaborate a bit more? I don't quite see the connection
between the reachability in the first two paragraphs and .git/shallow
in the third one. I'm guessing we drop objects in between because
"they go missing"? Where? How does prune_shallow() in prune.c play any
role in this (if it does)?
-- 
Duy

Re: [PATCH v3 17/20] range-diff: add a man page

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Mon, 16 Jul 2018, Eric Sunshine wrote:

> On Tue, Jul 3, 2018 at 7:27 AM Johannes Schindelin via GitGitGadget
>  wrote:
> > The bulk of this patch consists of a heavily butchered version of
> > tbdiff's README written by Thomas Rast and Thomas Gummerer, lifted from
> > https://github.com/trast/tbdiff.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> > diff --git a/Documentation/git-range-diff.txt 
> > b/Documentation/git-range-diff.txt
> > @@ -0,0 +1,235 @@
> > +To that end, it first finds pairs of commits from both commit ranges
> > +that correspond with each other. Two commits are said to correspond when
> > +the diff between their patches (i.e. the author information, the commit
> > +message and the commit diff) is reasonably small compared to the
> > +patches' size. See ``Algorithm` below for details.
> 
> Unbalanced number of backticks on "Algorithm".

Of course!

Thanks,
Dscho

> 
> > +Finally, the list of matching commits is shown in the order of the
> > +second commit range, with unmatched commits being inserted just after
> > +all of their ancestors have been shown.
>

Re: [PATCH v3 11/20] range-diff: add tests

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Mon, 16 Jul 2018, Eric Sunshine wrote:

> On Tue, Jul 3, 2018 at 7:26 AM Thomas Rast via GitGitGadget
>  wrote:
> > These are essentially lifted from https://github.com/trast/tbdiff, with
> > light touch-ups to account for the command now being an option of `git
> > branch`.
> 
> The "option of `git branch`" mention is outdated. Perhaps just drop
> everything after "...touch-ups" (or mention "range-diff" instead).

Ah, the line break made my `grep` fail :-(

> > Apart from renaming `tbdiff` to `range-diff`, only one test case needed
> > to be adjusted: 11 - 'changed message'.
> > [...]
> > Signed-off-by: Johannes Schindelin 
> > ---
> > diff --git a/t/.gitattributes b/t/.gitattributes
> > @@ -18,5 +18,6 @@ t[0-9][0-9][0-9][0-9]/* -whitespace
> >  /t7500/* eol=lf
> > +/t7910/* eol=lf
> 
> Does this need to be changed to t3206?

Absolutely.

> 
> >  /t8005/*.txt eol=lf
> > diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> > new file mode 100755
> > index 0..2237c7f4a
> > --- /dev/null
> > +++ b/t/t3206-range-diff.sh

Thanks for your thorough review,
Dscho

Re: [PATCH 2/2] repack -ad: prune the list of shallow commits

2018-07-17 Thread Junio C Hamano

Jeff King  writes:

> I'm OK with having a partial fix, or one that fixes immediate pain
> without doing a big cleanup, as long as it doesn't make anything _worse_
> in a user-visible way. And that's really my question: is pruning here
> going to bite people unexpectedly (not rhetorical -- I really don't
> know)?

Yeah, that matches the general guideline I follow when reviewing a
patch that claims to make existing things better.  And I do not
think I can explain to a third person why pruning here is a good
idea and won't cause problems, after seeing these patches and
the discussion from the sideline.

> Right, I think "git gc" is absolutely the place to do such cleanups. My
> only complaint was that having it as part of repack may be unexpected.

Re: [PATCH v3 09/20] range-diff: adjust the output of the commit pairs

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Mon, 16 Jul 2018, Eric Sunshine wrote:

> On Tue, Jul 3, 2018 at 7:26 AM Johannes Schindelin via GitGitGadget
>  wrote:
> > This change brings `git range-diff` yet another step closer to
> > feature parity with tbdiff: it now shows the oneline, too, and indicates
> > with `=` when the commits have identical diffs.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> > diff --git a/range-diff.c b/range-diff.c
> > @@ -251,9 +253,57 @@ static void get_correspondences(struct string_list *a, 
> > struct string_list *b,
> > +static void output_pair_header(struct strbuf *buf,
> > +  struct patch_util *a_util,
> > +  struct patch_util *b_util)
> >  {
> > -   return find_unique_abbrev(&util->oid, DEFAULT_ABBREV);
> > +   static char *dashes;
> > +   struct object_id *oid = a_util ? &a_util->oid : &b_util->oid;
> > +   struct commit *commit;
> > +
> > +   if (!dashes) {
> > +   char *p;
> > +
> > +   dashes = xstrdup(find_unique_abbrev(oid, DEFAULT_ABBREV));
> 
> It's nice to see that the bulk of the range-diff functionality has
> been libified in this re-roll (residing in range-diff.c rather than

Can we *please* stop calling it "re-roll"? Thanks.

(Or are you really "never gonna give you up, never gonna let you down"?)

> builtin/range-diff.c as in earlier versions), so it's somewhat
> surprising to see libified code holding onto the 'dashes' buffer like
> this in a static variable. An alternative would have been for the
> caller to pass in the same buffer to output_pair_header() for re-use,
> and then dispose of it at the end of processing.

Sure, to be honest, I had completely forgotten about what I did there, and
had to read up on it to fix it.

> 
> > +   for (p = dashes; *p; p++)
> > +   *p = '-';
> > +   }
> > +
> > +   strbuf_reset(buf);
> 
> ...much like 'buf' is allocated by the caller, passed in and re-used
> for each invocation, then released by the caller at the end.

Yep, I now pass in another strbuf, `dashes`.

Thanks,
Dscho

Re: [PATCH] gc: do not warn about too many loose objects

2018-07-17 Thread Duy Nguyen

On Tue, Jul 17, 2018 at 3:53 AM Jonathan Nieder  wrote:
> Subject: gc: do not return error for prior errors in daemonized mode
>
> Some build machines started failing to fetch updated source using
> "repo sync", with error
>
>   error: The last gc run reported the following. Please correct the root cause
>   and remove /build/.repo/projects/tools/git.git/gc.log.
>   Automatic cleanup will not be performed until the file is removed.
>
>   warning: There are too many unreachable loose objects; run 'git prune' to 
> remove them.
>
> The cause takes some time to describe.
>
> In v2.0.0-rc0~145^2 (gc: config option for running --auto in
> background, 2014-02-08), "git gc --auto" learned to run in the
> background instead of blocking the invoking command.  In this mode, it
> closed stderr to avoid interleaving output with any subsequent
> commands, causing warnings like the above to be swallowed; v2.6.3~24^2
> (gc: save log from daemonized gc --auto and print it next time,
> 2015-09-19) addressed this by storing any diagnostic output in
> .git/gc.log and allowing the next "git gc --auto" run to print it.
>
> To avoid wasteful repeated fruitless gcs, when gc.log is present, the
> subsequent "gc --auto" would die after print its contents.  Most git
> commands, such as "git fetch", ignore the exit status from "git gc
> --auto" so all is well at this point: the user gets to see the error
> message, and the fetch succeeds, without a wasteful additional attempt
> at an automatic gc.
>
> External tools like repo[1], though, do care about the exit status
> from "git gc --auto".  In non-daemonized mode, the exit status is
> straightforward: if there is an error, it is nonzero, but after a
> warning like the above, the status is zero.  The daemonized mode, as a
> side effect of the other properties provided, offers a very strange
> exit code convention:
>
>  - if no housekeeping was required, the exit status is 0
>
>  - the first real run, after forking into the background, returns exit
>status 0 unconditionally.  The parent process has no way to know
>whether gc will succeed.
>
>  - if there is any diagnostic output in gc.log, subsequent runs return
>a nonzero exit status to indicate that gc was not triggered.
>
> There's nothing for the calling program to act on on the basis of that
> error.  Use status 0 consistently instead, to indicate that we decided
> not to run a gc (just like if no housekeeping was required).  This
> way, repo and similar tools can get the benefit of the same behavior
> as tools like "git fetch" that ignore the exit status from gc --auto.

This background gc thing is pretty much designed for interactive use.
When it's scripted, you probably should avoid it. Then you can fork a
new process by yourself and have much better control if you still want
"background" gc. So an alternative here is to turn on or off
gc.autodetach based on interactiveness (i.e. having tty). We can add a
new (and default) value "auto" to gc.autodetach for this purpose.

In automated scripts, it will always run in non-damonized mode. You
will get non-zero exit code when real errors occur. You don't worry
about hanging processes. Rate limit is also thrown out in this mode if
I'm not mistaken, but it could be added back and more tailored for
server needs.

> Once the period of time described by gc.pruneExpire elapses, the
> unreachable loose objects will be removed by "git gc --auto"
> automatically.
>
> [1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/
>
> Reported-by: Andrii Dehtiarov 
> Helped-by: Jeff King 
> Signed-off-by: Jonathan Nieder 
> ---
>  Documentation/config.txt |  3 ++-
>  builtin/gc.c | 16 +++-
>  t/t6500-gc.sh|  6 +++---
>  3 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 1cc18a828c..5eaa4aaa7d 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should 
> go below
>  gc.autoPackLimit and gc.bigPackThreshold should be respected again.
>
>  gc.logExpiry::
> -   If the file gc.log exists, then `git gc --auto` won't run
> +   If the file gc.log exists, then `git gc --auto` will print
> +   its content and exit with status zero instead of running
> unless that file is more than 'gc.logExpiry' old.  Default is
> "1.day".  See `gc.pruneExpire` for more ways to specify its
> value.
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 2bebc52bda..484ab21b8c 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -438,7 +438,7 @@ static const char *lock_repo_for_gc(int force, pid_t* 
> ret_pid)
> return NULL;
>  }
>
> -static void report_last_gc_error(void)
> +static int report_last_gc_error(void)
>  {
> struct strbuf sb = STRBUF_INIT;
> ssize_t ret;
> @@ -449,7 +449,7 @@ static void report_last_gc_error(void)
> if (err

Re: [PATCH] gc: do not warn about too many loose objects

2018-07-17 Thread Ævar Arnfjörð Bjarmason

On Tue, Jul 17 2018, Jonathan Nieder wrote:

> Hi Ævar,
>
> Ævar Arnfjörð Bjarmason wrote:
>> On Tue, Jul 17 2018, Jonathan Nieder wrote:
>
>>> That suggests a possible improvement.  If all callers should be
>>> ignoring the exit code, can we make the exit code in daemonize mode
>>> unconditionally zero in this kind of case?
>>
>> That doesn't make sense to me. Just because git itself is happily
>> ignoring the exit code I don't think that should mean there shouldn't be
>> a meaningful exit code. Why don't you just ignore the exit code in the
>> repo tool as well?
>
> I don't maintain the repo tool.  I am just trying to improve git to
> make some users' lives better.
>
> repo worked fine for years, until gc's daemonized mode broke it.  I
> don't think it makes sense for us to declare that it has been holding
> git gc wrong for all that time before then.

Before the sequence of commits noted at the bottom of my
https://public-inbox.org/git/87inc89j38@evledraar.gmail.com/ plus
a831c06a2b ("gc: ignore old gc.log files", 2017-02-10) we wouldn't do
that, that's true.

We'd just happily run GC again pointlessly even though it wasn't going
to accomplish anything, in cases where you really did have ~>6700 loose
objects that weren't going to be pruned.

I don't think it makes sense to revert those patches and go back to the
much more naïve (and user waiting there twiddling his thumbs while it
runs) method, but I also don't think making no distinction between the
following states:

 1. gc --auto has nothing to do
 2. gc --auto has something to do, will fork and try to do it
 3. gc --auto has something to do, but notices that gc has been failing
before and can't do anything now.

I think #3 should exit with non-zero.

Yes, before this whole backgrounding etc. we wouldn't have exited with
non-zero either, we'd just print a thing to STDERR.

But with backgrounding we implicitly introduced a new state, which is we
won't do *any* maintenance at all, including consolidating packfiles
(very important for servers), so I think it makes sense to exit with
non-zero since gc can't run at all.

>> Now e.g. automation I have to see if git-gc ---auto is having issues
>> can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet
>> of servers, but will need to start caring if stderr was emitted to.
>
> Thanks for bringing this up.  The thing is, the current exit code is
> not useful for that purpose.  It doesn't report a failure until the
> *next* `git gc --auto` run, when it is too late to be useful.
>
> See the commit message on the proposed patch
> https://public-inbox.org/git/20180717065740.gd177...@aiede.svl.corp.google.com/
> for more on that subject.

Right. I know. What I mean is now I can (and do) use it to run 'git gc
--auto' across our server fleet and see whether I have any of #3, or
whether it's all #1 or #2. If there's nothing to do in #1 that's fine,
and it just so happens that I'll run gc due to #2 that's also fine, but
I'd like to see if gc really is stuck.

This of course relies on them having other users / scripts doing normal
git commands which would trigger previous 'git gc --auto' runs.

>> I don't care if we e.g. have a 'git gc --auto --exit-code' similar to
>> what git-diff does, but it doesn't make sense to me that we *know* we
>> can't background the gc due to a previous error and then always return
>> 0. Having to parse STDERR to see if it *really* failed is un-unixy,
>> let's use exit codes. That's what they're for.
>
> Do you know of anyone trying to use the exit code from gc --auto in
> this way?
>
> It sounds to me like for what you're proposing, a separate
>
>   git gc --auto --last-run-failed
>
> command that a tool could use to check the outcome from the previous
> gc --auto run without triggering a new one would be best.

Yeah this is admittedly a bit of a poweruser thing I'm doing, so I don't
mind if it's hidden behind something like that in principle, but it
seems wrong to exit with zero in this (#3) case:

$ git gc --auto; echo $?
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
error: The last gc run reported the following. Please correct the root cause
and remove .git/gc.log.
Automatic cleanup will not be performed until the file is removed.

warning: There are too many unreachable loose objects; run 'git prune' to 
remove them.

255

As a previous (good) patch in this series notes that shouldn't be 255,
let's fix that, but I don't see how emitting an "error" and "warning"
saying we're broken and can't gc at all here in conjunction with exiting
with zero makes sense.

> [...]
>> I think you're conflating two things here in a way that (if I'm reading
>> this correctly) produces a pretty bad regression for users.
>
> The patch doesn't touch the "ratelimiting" behavior at all, so I'm not
> sure what I'm doing to regress users.

Sorry about being unclear again, this is not a comm

Re: [PATCH] gc: do not warn about too many loose objects

2018-07-17 Thread Jonathan Nieder

Hi Ævar,

Ævar Arnfjörð Bjarmason wrote:
> On Tue, Jul 17 2018, Jonathan Nieder wrote:

>> That suggests a possible improvement.  If all callers should be
>> ignoring the exit code, can we make the exit code in daemonize mode
>> unconditionally zero in this kind of case?
>
> That doesn't make sense to me. Just because git itself is happily
> ignoring the exit code I don't think that should mean there shouldn't be
> a meaningful exit code. Why don't you just ignore the exit code in the
> repo tool as well?

I don't maintain the repo tool.  I am just trying to improve git to
make some users' lives better.

repo worked fine for years, until gc's daemonized mode broke it.  I
don't think it makes sense for us to declare that it has been holding
git gc wrong for all that time before then.

> Now e.g. automation I have to see if git-gc ---auto is having issues
> can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet
> of servers, but will need to start caring if stderr was emitted to.

Thanks for bringing this up.  The thing is, the current exit code is
not useful for that purpose.  It doesn't report a failure until the
*next* `git gc --auto` run, when it is too late to be useful.

See the commit message on the proposed patch
https://public-inbox.org/git/20180717065740.gd177...@aiede.svl.corp.google.com/
for more on that subject.

> I don't care if we e.g. have a 'git gc --auto --exit-code' similar to
> what git-diff does, but it doesn't make sense to me that we *know* we
> can't background the gc due to a previous error and then always return
> 0. Having to parse STDERR to see if it *really* failed is un-unixy,
> let's use exit codes. That's what they're for.

Do you know of anyone trying to use the exit code from gc --auto in
this way?

It sounds to me like for what you're proposing, a separate

git gc --auto --last-run-failed

command that a tool could use to check the outcome from the previous
gc --auto run without triggering a new one would be best.

[...]
> I think you're conflating two things here in a way that (if I'm reading
> this correctly) produces a pretty bad regression for users.

The patch doesn't touch the "ratelimiting" behavior at all, so I'm not
sure what I'm doing to regress users.

[...]
>> To solve (3), we could
>> introduce a gc.lastrun file that is touched whenever "gc --auto" runs
>> successfully and make "gc --auto" a no-op for a while after each run.
>
> This would work around my concern with b) above in most cases by
> introducing a purely time-based rate limit, but I'm uneasy about this
> change in git-gc semantics.
[..]
> With these proposed semantics we'd skip a needed GC (potentially for
> days, depending on the default) just because we recently ran one.
>
> In many environments, such as on busy servers, we could have tens of
> thousands of packs by this point, since this facility would (presumably)
> bypass both gc.autoPackLimit and gc.auto in favor of only running gc at
> a maximum of every N minutes, similarly in a local checkout I could have
> a crapload of loose objects because I ran a big rebase or a
> filter-branch on one of my branches.

I think we all agree that getting rid of the hack that 'explodes'
unreachable objects into loose objects is the best way forward, long
term.

Even in that future, some ratelimiting may be useful, though.  I also
suspect that we're never going to achieve a perfect set of defaults
that works well for both humans and servers, though we can try.

Thanks and hope that helps,
Jonathan

[PATCH v2 1/2] repack: point out a bug handling stale shallow info

2018-07-17 Thread Johannes Schindelin via GitGitGadget

From: Johannes Schindelin 

A `git fetch --prune` can turn previously-reachable objects unreachable,
even commits that are in the `shallow` list. A subsequent `git repack
-ad` will then unceremoniously drop those unreachable commits, and the
`shallow` list will become stale. This means that when we try to fetch
with a larger `--depth` the next time, we may end up with:

fatal: error in object: unshallow 

Reported by Alejandro Pauly.

Signed-off-by: Johannes Schindelin 
---
 t/t5537-fetch-shallow.sh | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
index 943231af9..561485d31 100755
--- a/t/t5537-fetch-shallow.sh
+++ b/t/t5537-fetch-shallow.sh
@@ -186,4 +186,31 @@ EOF
test_cmp expect actual
 '
 
+test_expect_failure '.git/shallow is edited by repack' '
+   git init shallow-server &&
+   test_commit -C shallow-server A &&
+   test_commit -C shallow-server B &&
+   git -C shallow-server checkout -b branch &&
+   test_commit -C shallow-server C &&
+   test_commit -C shallow-server E &&
+   test_commit -C shallow-server D &&
+   d="$(git -C shallow-server rev-parse --verify D)" &&
+   git -C shallow-server checkout master &&
+
+   git clone --depth=1 --no-tags --no-single-branch \
+   "file://$PWD/shallow-server" shallow-client &&
+
+   : now remove the branch and fetch with prune &&
+   git -C shallow-server branch -D branch &&
+   git -C shallow-client fetch --prune --depth=1 \
+   origin "+refs/heads/*:refs/remotes/origin/*" &&
+   git -C shallow-client repack -adfl &&
+   test_must_fail git -C shallow-client rev-parse --verify $d^0 &&
+   ! grep $d shallow-client/.git/shallow &&
+
+   git -C shallow-server branch branch-orig D^0 &&
+   git -C shallow-client fetch --prune --depth=2 \
+   origin "+refs/heads/*:refs/remotes/origin/*"
+'
+
 test_done
-- 
gitgitgadget

[PATCH v2 2/2] repack -ad: prune the list of shallow commits

2018-07-17 Thread Johannes Schindelin via GitGitGadget

From: Johannes Schindelin 

While it is true that we never add unreachable commits into pack files
intentionally (as `git repack`'s documentation states), we must not
forget that a `git fetch --prune` (or even a `git fetch` when a ref was
force-pushed in the meantime) can make a commit unreachable that was
reachable before.

Therefore it is not safe to assume that a `git repack -adlf` will keep
unreachable commits alone (under the assumption that they had not been
packed in the first place).

This is particularly important to keep in mind when looking at the
`.git/shallow` file: if any commits listed in that file become
unreachable, it is not a problem, but if they go missing, it *is* a
problem. One symptom of this problem is that a deepening fetch may now
fail with

fatal: error in object: unshallow 

To avoid this problem, let's prune the shallow list in `git repack` when
the `-d` option is passed, unless `-A` is passed, too (which would force
the now-unreachable objects to be turned into loose objects instead of
being deleted). Additionally, e also need to take `--keep-reachable` and
`--unpack-unreachable=` into account.

Signed-off-by: Johannes Schindelin 
---
 builtin/repack.c | 6 ++
 t/t5537-fetch-shallow.sh | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 6c636e159..4caf57221 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -444,6 +444,12 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
if (!quiet && isatty(2))
opts |= PRUNE_PACKED_VERBOSE;
prune_packed_objects(opts);
+
+   if (!keep_unreachable &&
+   (!(pack_everything & LOOSEN_UNREACHABLE) ||
+unpack_unreachable) &&
+   is_repository_shallow())
+   prune_shallow(0);
}
 
if (!no_update_server_info)
diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
index 561485d31..d32ba20f9 100755
--- a/t/t5537-fetch-shallow.sh
+++ b/t/t5537-fetch-shallow.sh
@@ -186,7 +186,7 @@ EOF
test_cmp expect actual
 '
 
-test_expect_failure '.git/shallow is edited by repack' '
+test_expect_success '.git/shallow is edited by repack' '
git init shallow-server &&
test_commit -C shallow-server A &&
test_commit -C shallow-server B &&
-- 
gitgitgadget

[PATCH v2 0/2] repack -ad: fix after `fetch --prune` in a shallow repository

2018-07-17 Thread Johannes Schindelin via GitGitGadget

Under certain circumstances, commits that were reachable can be made 
unreachable, e.g. via `git fetch --prune`. These commits might have been packed 
already, in which case `git repack -adlf` will just drop them without giving 
them the usual grace period before an eventual `git prune` (via `git gc`) 
prunes them.

This is a problem when the `shallow` file still lists them, which is the reason 
why `git prune` edits that file. And with the proposed changes, `git repack 
-ad` will now do the same.

Reported by Alejandro Pauly.

Changes since v1:

- Also trigger `prune_shallow()` when `--unpack-unreachable=` was 
passed to `git repack`.
- No need to trigger `prune_shallow()` when `git repack` was called with `-k`.

Johannes Schindelin (2):
  repack: point out a bug handling stale shallow info
  repack -ad: prune the list of shallow commits

 builtin/repack.c |  6 ++
 t/t5537-fetch-shallow.sh | 27 +++
 2 files changed, 33 insertions(+)


base-commit: e3331758f12da22f4103eec7efe1b5304a9be5e9
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-9%2Fdscho%2Fshallow-and-fetch-prune-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-9/dscho/shallow-and-fetch-prune-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/9

Range-diff vs v1:

 1:  d2be40131 = 1:  d2be40131 repack: point out a bug handling stale shallow 
info
 2:  b4e01a963 ! 2:  c7ee6e008 repack -ad: prune the list of shallow commits
 @@ -23,7 +23,8 @@
  To avoid this problem, let's prune the shallow list in `git repack` 
when
  the `-d` option is passed, unless `-A` is passed, too (which would 
force
  the now-unreachable objects to be turned into loose objects instead of
 -being deleted).
 +being deleted). Additionally, e also need to take `--keep-reachable` 
and
 +`--unpack-unreachable=` into account.
  
  Signed-off-by: Johannes Schindelin 
  
 @@ -35,7 +36,9 @@
opts |= PRUNE_PACKED_VERBOSE;
prune_packed_objects(opts);
  +
 -+ if (!(pack_everything & LOOSEN_UNREACHABLE) &&
 ++ if (!keep_unreachable &&
 ++ (!(pack_everything & LOOSEN_UNREACHABLE) ||
 ++  unpack_unreachable) &&
  + is_repository_shallow())
  + prune_shallow(0);
}

-- 
gitgitgadget

[PATCH v4 6/7] gpg-interface: introduce new signature format "x509" using gpgsm

2018-07-17 Thread Henning Schild

This commit allows git to create and check x509 type signatures using
gpgsm.

Signed-off-by: Henning Schild 
---
 Documentation/config.txt |  5 +++--
 gpg-interface.c  | 15 +++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0e871346a..ff1d4a76c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1830,12 +1830,13 @@ gpg.program::
 
 gpg.format::
Specifies which key format to use when signing with `--gpg-sign`.
-   Default is "openpgp", that is also the only supported value.
+   Default is "openpgp" and another possible value is "x509".
 
 gpg..program::
Use this to customize the program used for the signing format you
chose. (see `gpg.program` and `gpg.format`) `gpg.program` can still
-   be used as a legacy synonym for `gpg.openpgp.program`.
+   be used as a legacy synonym for `gpg.openpgp.program`. The default
+   value for `gpg.x509.program` is "gpgsm".
 
 gui.commitMsgWidth::
Defines how wide the commit message window is in the
diff --git a/gpg-interface.c b/gpg-interface.c
index a158f08c1..bb8ea668b 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -24,11 +24,23 @@ static const char *openpgp_sigs[] = {
NULL
 };
 
+static const char *x509_verify_args[] = {
+   NULL
+};
+static const char *x509_sigs[] = {
+   "-BEGIN SIGNED MESSAGE-",
+   NULL
+};
+
 static struct gpg_format gpg_format[] = {
{ .name = "openpgp", .program = "gpg",
  .verify_args = openpgp_verify_args,
  .sigs = openpgp_sigs
},
+   { .name = "x509", .program = "gpgsm",
+ .verify_args = x509_verify_args,
+ .sigs = x509_sigs
+   },
 };
 
 static struct gpg_format *use_format = &gpg_format[0];
@@ -192,6 +204,9 @@ int git_gpg_config(const char *var, const char *value, void 
*cb)
if (!strcmp(var, "gpg.program") || !strcmp(var, "gpg.openpgp.program"))
fmtname = "openpgp";
 
+   if (!strcmp(var, "gpg.x509.program"))
+   fmtname = "x509";
+
if (fmtname) {
fmt = get_format_by_name(fmtname);
return git_config_string(&fmt->program, var, value);
-- 
2.16.4

[PATCH v4 7/7] gpg-interface t: extend the existing GPG tests with GPGSM

2018-07-17 Thread Henning Schild

Add test cases to cover the new X509/gpgsm support. Most of them
resemble existing ones. They just switch the format to x509 and set the
signingkey when creating signatures. Validation of signatures does not
need any configuration of git, it does need gpgsm to be configured to
trust the key(-chain).
Several of the testcases build on top of existing gpg testcases.
The commit ships a self-signed key for commit...@example.com and
configures gpgsm to trust it.

Signed-off-by: Henning Schild 
---
 t/lib-gpg.sh   |  28 +++-
 t/lib-gpg/gpgsm-gen-key.in |   8 +++
 t/lib-gpg/gpgsm_cert.p12   | Bin 0 -> 2652 bytes
 t/t4202-log.sh |  39 ++
 t/t5534-push-signed.sh |  52 +
 t/t7004-tag.sh |  13 
 t/t7030-verify-tag.sh  |  33 
 7 files changed, 172 insertions(+), 1 deletion(-)
 create mode 100644 t/lib-gpg/gpgsm-gen-key.in
 create mode 100644 t/lib-gpg/gpgsm_cert.p12

diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
index a5d3b2cba..3fe02876c 100755
--- a/t/lib-gpg.sh
+++ b/t/lib-gpg.sh
@@ -38,7 +38,33 @@ then
"$TEST_DIRECTORY"/lib-gpg/ownertrust &&
gpg --homedir "${GNUPGHOME}" /dev/null 2>&1 \
--sign -u commit...@example.com &&
-   test_set_prereq GPG
+   test_set_prereq GPG &&
+   # Available key info:
+   # * see t/lib-gpg/gpgsm-gen-key.in
+   # To generate new certificate:
+   #  * no passphrase
+   #   gpgsm --homedir /tmp/gpghome/ \
+   #   -o /tmp/gpgsm.crt.user \
+   #   --generate-key \
+   #   --batch t/lib-gpg/gpgsm-gen-key.in
+   # To import certificate:
+   #   gpgsm --homedir /tmp/gpghome/ \
+   #   --import /tmp/gpgsm.crt.user
+   # To export into a .p12 we can later import:
+   #   gpgsm --homedir /tmp/gpghome/ \
+   #   -o t/lib-gpg/gpgsm_cert.p12 \
+   #   --export-secret-key-p12 "commit...@example.com"
+   echo | gpgsm --homedir "${GNUPGHOME}" 2>/dev/null \
+   --passphrase-fd 0 --pinentry-mode loopback \
+   --import "$TEST_DIRECTORY"/lib-gpg/gpgsm_cert.p12 &&
+   gpgsm --homedir "${GNUPGHOME}" 2>/dev/null -K \
+   | grep fingerprint: | cut -d" " -f4 | tr -d '\n' > \
+   ${GNUPGHOME}/trustlist.txt &&
+   echo " S relax" >> ${GNUPGHOME}/trustlist.txt &&
+   (gpgconf --kill gpg-agent >/dev/null 2>&1 || : ) &&
+   echo hello | gpgsm --homedir "${GNUPGHOME}" >/dev/null \
+   -u commit...@example.com -o /dev/null --sign - 2>&1 &&
+   test_set_prereq GPGSM
;;
esac
 fi
diff --git a/t/lib-gpg/gpgsm-gen-key.in b/t/lib-gpg/gpgsm-gen-key.in
new file mode 100644
index 0..a7fd87c06
--- /dev/null
+++ b/t/lib-gpg/gpgsm-gen-key.in
@@ -0,0 +1,8 @@
+Key-Type: RSA
+Key-Length: 2048
+Key-Usage: sign
+Serial: random
+Name-DN: CN=C O Mitter, O=Example, SN=C O, GN=Mitter
+Name-Email: commit...@example.com
+Not-Before: 1970-01-01 00:00:00
+Not-After: 3000-01-01 00:00:00
diff --git a/t/lib-gpg/gpgsm_cert.p12 b/t/lib-gpg/gpgsm_cert.p12
new file mode 100644
index 
..94ffad0d31a3b6c4e849b29d960762e485ba7ea8
GIT binary patch
literal 2652
zcmY+Ec{mh|8pQ{*m?2^;S+ivrVaAfiQnHSnA|b+zofwQYAI3haA=!75ZX!!|$-Y*$
zWS5XVvW}3h?sM<`?)~F^&U?;zp7ZAqMS|U-rJ+NSVEkYxG8!9AJx2qf$s@s-fg~8i
zSqwpufKGo`;5-uW&RJwiO9MC)gTEUZ6fYR|?*&F0Fp3FCzs?3aZK`58prxe;gpq&(
zm?nam#=;<-Y>ihd?+EMByF}ef4%bKLwbO{e4U=0LG4jP<8x$`N#5#&@0as@!cSSK>
zh}oaTUi^XPMV(3;kKAjFB5Yq)zn_vnExr>`s&?M}i{A>>$j-{u*Jo)riK{fW!LXet
z)hkG3xgvIpDClY8=o4q?_bD%htwXG!_=;NxVjU=3#{TU0Q@=ZMc%6wes*Uy&CAPf=JB(a
zb<9VW4e)@R+qGEgfnewABCPf}mtJOYxL>*jtu9yKk?Fu3UWr~zJ#<*I(Ey>fhA6~_gUvg(8n<*TM
zdXo$0jMn2G$GcM484J8MUx*x=@XWq^uvCnI1jp>#Y_2Y<+T_*yX!uTA$vm7Ugs56O
z@!~>Y?E@FA@{5@ZDHD`TkEeHQ{M2n_Slp%By{89uVia3%>w?IZdvBBX$K*3UzCM7`
zxGZ-dbh^KlF#?RZaS4_^R|%@F5xX0j)vdE&#M1Ch^W+WzQjVH^m|vu9jE@aRtlbcP
z^NxHpt!%n59KJ!@-IHzjWw9l;)N}J^LLwLT#Xz!Tq4vWkHTO5#j_1_nvA>#u
zBX5jy&r_5h7PKFH26B8-1BMryQYu8HiW#k_Nq!qB4-5#YH|RDUL_&Ug
z+Y4DwCgmPTpSqu5lU#dxgcf>N8-aXsM}TDNd_~tW$92abZG)0q=Xzr)t;M39D`u~9
zdbi)dKxTXdE3OHN@sCmOGfdNEKC*BgS&ku$LNb1M>x6#MZ_37F<5gz)d_&6)%S0zP
z5_lFp_A5n(wMlQ+_2n%!z1tomK+~=DQ9MC|QDxZ1?2?A{(Dqi(*TE;q4g+&HH;vvX
zI%M?`5*^D+i5eC~tth)c%;)Q^n`$=Nt8=_jLdLg)*d^-BKJ2siLCaLDMJwSxfF|uO
zc}bc4oW*xw>!m!o6BG%Q_CG+$BZ1<8Bv8~@9Da5oV21zT1x7=A#-YtK0ImHWb?E+3
z$NK7MLMBq{?jPy^Nx&Yxu7#tyt@t3@=F07Bf;BHh5A|9FYRL9bIo$1a44s9^Wk41>
zq-0_p!?=F1wYc$wC8sgycQ~IHO0x4@BODYL?(uFu5qaXSm+YP$_!_RrSrzJ#*f(fy
zF)|i|!Ac}}R(rT~U3@SDll^wcX#c1XS5VcvP3@>e?Qfbr5MU*oxy@`fDnT0wr
zw!P*fjd#ErDfYvcz210jWuO

[PATCH v4 3/7] gpg-interface: introduce an abstraction for multiple gpg formats

2018-07-17 Thread Henning Schild

Create a struct that holds the format details for the supported formats.
At the moment that is still just "openpgp". This commit prepares for the
introduction of more formats, that might use other programs and match
other signatures.

Signed-off-by: Henning Schild 
---
 gpg-interface.c | 88 +++--
 1 file changed, 67 insertions(+), 21 deletions(-)

diff --git a/gpg-interface.c b/gpg-interface.c
index b39a27980..a02db7658 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -7,11 +7,52 @@
 #include "tempfile.h"
 
 static char *configured_signing_key;
-static const char *gpg_format = "openpgp";
-static const char *gpg_program = "gpg";
+struct gpg_format {
+   const char *name;
+   const char *program;
+   const char **verify_args;
+   const char **sigs;
+};
+
+static const char *openpgp_verify_args[] = {
+   "--keyid-format=long",
+   NULL
+};
+static const char *openpgp_sigs[] = {
+   "-BEGIN PGP SIGNATURE-",
+   "-BEGIN PGP MESSAGE-",
+   NULL
+};
+
+static struct gpg_format gpg_format[] = {
+   { .name = "openpgp", .program = "gpg",
+ .verify_args = openpgp_verify_args,
+ .sigs = openpgp_sigs
+   },
+};
+
+static struct gpg_format *use_format = &gpg_format[0];
 
-#define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-"
-#define PGP_MESSAGE "-BEGIN PGP MESSAGE-"
+static struct gpg_format *get_format_by_name(const char *str)
+{
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(gpg_format); i++)
+   if (!strcmp(gpg_format[i].name, str))
+   return gpg_format + i;
+   return NULL;
+}
+
+static struct gpg_format *get_format_by_sig(const char *sig)
+{
+   int i, j;
+
+   for (i = 0; i < ARRAY_SIZE(gpg_format); i++)
+   for (j = 0; gpg_format[i].sigs[j]; j++)
+   if (starts_with(sig, gpg_format[i].sigs[j]))
+   return gpg_format + i;
+   return NULL;
+}
 
 void signature_check_clear(struct signature_check *sigc)
 {
@@ -102,12 +143,6 @@ void print_signature_buffer(const struct signature_check 
*sigc, unsigned flags)
fputs(output, stderr);
 }
 
-static int is_gpg_start(const char *line)
-{
-   return starts_with(line, PGP_SIGNATURE) ||
-   starts_with(line, PGP_MESSAGE);
-}
-
 size_t parse_signature(const char *buf, size_t size)
 {
size_t len = 0;
@@ -115,7 +150,7 @@ size_t parse_signature(const char *buf, size_t size)
while (len < size) {
const char *eol;
 
-   if (is_gpg_start(buf + len))
+   if (get_format_by_sig(buf + len))
match = len;
 
eol = memchr(buf + len, '\n', size - len);
@@ -132,6 +167,9 @@ void set_signing_key(const char *key)
 
 int git_gpg_config(const char *var, const char *value, void *cb)
 {
+   struct gpg_format *fmt = NULL;
+   char *fmtname = NULL;
+
if (!strcmp(var, "user.signingkey")) {
if (!value)
return config_error_nonbool(var);
@@ -142,17 +180,20 @@ int git_gpg_config(const char *var, const char *value, 
void *cb)
if (!strcmp(var, "gpg.format")) {
if (!value)
return config_error_nonbool(var);
-   if (strcmp(value, "openpgp"))
+   fmt = get_format_by_name(value);
+   if (!fmt)
return error("unsupported value for %s: %s",
 var, value);
-   return git_config_string(&gpg_format, var, value);
+   use_format = fmt;
+   return 0;
}
 
-   if (!strcmp(var, "gpg.program")) {
-   if (!value)
-   return config_error_nonbool(var);
-   gpg_program = xstrdup(value);
-   return 0;
+   if (!strcmp(var, "gpg.program"))
+   fmtname = "openpgp";
+
+   if (fmtname) {
+   fmt = get_format_by_name(fmtname);
+   return git_config_string(&fmt->program, var, value);
}
 
return 0;
@@ -173,7 +214,7 @@ int sign_buffer(struct strbuf *buffer, struct strbuf 
*signature, const char *sig
struct strbuf gpg_status = STRBUF_INIT;
 
argv_array_pushl(&gpg.args,
-gpg_program,
+use_format->program,
 "--status-fd=2",
 "-bsau", signing_key,
 NULL);
@@ -211,6 +252,7 @@ int verify_signed_buffer(const char *payload, size_t 
payload_size,
 struct strbuf *gpg_output, struct strbuf *gpg_status)
 {
struct child_process gpg = CHILD_PROCESS_INIT;
+   struct gpg_format *fmt;
struct tempfile *temp;
int ret;
struct strbuf buf = STRBUF_INIT;
@@ -226,10 +268,14 @@ int verify_signed_buffer(const char *payload, size_t 
payload_siz

Re: [PATCH v3 1/7] gpg-interface: add new config to select how to sign a commit

2018-07-17 Thread Henning Schild

Am Mon, 16 Jul 2018 13:14:34 -0700
schrieb Junio C Hamano :

> Henning Schild  writes:
> 
> > Add "gpg.format" where the user can specify which type of signature
> > to use for commits. At the moment only "openpgp" is supported and
> > the value is not even used. This commit prepares for a new types of
> > signatures.
> >
> > Signed-off-by: Henning Schild 
> > ---
> >  Documentation/config.txt | 4 
> >  gpg-interface.c  | 7 +++
> >  2 files changed, 11 insertions(+)
> >
> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index 1cc18a828..ac373e3f4 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -1828,6 +1828,10 @@ gpg.program::
> > signed, and the program is expected to send the result to
> > its standard output.
> >  
> > +gpg.format::
> > +   Specifies which key format to use when signing with
> > `--gpg-sign`.
> > +   Default is "openpgp", that is also the only supported
> > value. +
> >  gui.commitMsgWidth::
> > Defines how wide the commit message window is in the
> > linkgit:git-gui[1]. "75" is the default.
> > diff --git a/gpg-interface.c b/gpg-interface.c
> > index 09ddfbc26..960c40086 100644
> > --- a/gpg-interface.c
> > +++ b/gpg-interface.c
> > @@ -7,6 +7,7 @@
> >  #include "tempfile.h"
> >  
> >  static char *configured_signing_key;
> > +static const char *gpg_format = "openpgp";
> >  static const char *gpg_program = "gpg";
> >  
> >  #define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-"
> > @@ -138,6 +139,12 @@ int git_gpg_config(const char *var, const char
> > *value, void *cb) return 0;
> > }
> >  
> > +   if (!strcmp(var, "gpg.format")) {
> > +   if (value && strcasecmp(value, "openpgp"))
> > +   return error("malformed value for %s: %s",
> > var, value);
> > +   return git_config_string(&gpg_format, var,
> > value);  
> 
> I may be mistaken (sorry, I read so many topics X-<) but I think the
> consensus was to accept only "openpgp" so that we can ensure that
> 
>   [gpg "openpgp"] program = foo
> 
> etc. can be parsed more naturally without having to manually special
> case the subsection name to be case insensitive.  In other words,
> strcasecmp() should just be strcmp().  Otherwise, people would get a
> wrong expectation that
> 
>   [gpg] format = OpenPGP
>   [gpg "openpgp"] program = foo
> 
> should refer to the same and single OpenPGP, but that would violate
> the usual configuration syntax rules.

Ok, also having read the other mails i think we are still at
gpg.format and gpg..program, so i switched the two patches from
strcasecmp back to strcmp.

> The structure of checking the error looks quite suboptimal even when
> we initially support the single "openpgp" at this step.  I would
> have expected you to be writing the above like so:
> 
>   if (!value)
>   return config_error_nonbool(var);
>   if (strcmp(value, "openpgp"))
>   return error("unsupported value for %s: %s", var,
> value); return git_config_string(...);
> 
> That would make it more clear that the variable is "nonbool", which
> does not change across additional support for new formats in later
> steps.

Right, at one point (not mailed) i had that but expected it would not
pass the review, since git_config_string contains that check as well.
Changed.

Henning
 
> > +   }
> > +
> > if (!strcmp(var, "gpg.program")) {
> > if (!value)
> > return config_error_nonbool(var);

[PATCH v4 1/7] gpg-interface: add new config to select how to sign a commit

2018-07-17 Thread Henning Schild

Add "gpg.format" where the user can specify which type of signature to
use for commits. At the moment only "openpgp" is supported and the value is
not even used. This commit prepares for a new types of signatures.

Signed-off-by: Henning Schild 
---
 Documentation/config.txt |  4 
 gpg-interface.c  | 10 ++
 2 files changed, 14 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1cc18a828..ac373e3f4 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1828,6 +1828,10 @@ gpg.program::
signed, and the program is expected to send the result to its
standard output.
 
+gpg.format::
+   Specifies which key format to use when signing with `--gpg-sign`.
+   Default is "openpgp", that is also the only supported value.
+
 gui.commitMsgWidth::
Defines how wide the commit message window is in the
linkgit:git-gui[1]. "75" is the default.
diff --git a/gpg-interface.c b/gpg-interface.c
index 09ddfbc26..b39a27980 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -7,6 +7,7 @@
 #include "tempfile.h"
 
 static char *configured_signing_key;
+static const char *gpg_format = "openpgp";
 static const char *gpg_program = "gpg";
 
 #define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-"
@@ -138,6 +139,15 @@ int git_gpg_config(const char *var, const char *value, 
void *cb)
return 0;
}
 
+   if (!strcmp(var, "gpg.format")) {
+   if (!value)
+   return config_error_nonbool(var);
+   if (strcmp(value, "openpgp"))
+   return error("unsupported value for %s: %s",
+var, value);
+   return git_config_string(&gpg_format, var, value);
+   }
+
if (!strcmp(var, "gpg.program")) {
if (!value)
return config_error_nonbool(var);
-- 
2.16.4

[PATCH v4 5/7] gpg-interface: introduce new config to select per gpg format program

2018-07-17 Thread Henning Schild

Supporting multiple signing formats we will have the need to configure a
custom program each. Add a new config value to cater for that.

Signed-off-by: Henning Schild 
---
 Documentation/config.txt | 5 +
 gpg-interface.c  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ac373e3f4..0e871346a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1832,6 +1832,11 @@ gpg.format::
Specifies which key format to use when signing with `--gpg-sign`.
Default is "openpgp", that is also the only supported value.
 
+gpg..program::
+   Use this to customize the program used for the signing format you
+   chose. (see `gpg.program` and `gpg.format`) `gpg.program` can still
+   be used as a legacy synonym for `gpg.openpgp.program`.
+
 gui.commitMsgWidth::
Defines how wide the commit message window is in the
linkgit:git-gui[1]. "75" is the default.
diff --git a/gpg-interface.c b/gpg-interface.c
index 51cad9081..a158f08c1 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -189,7 +189,7 @@ int git_gpg_config(const char *var, const char *value, void 
*cb)
return 0;
}
 
-   if (!strcmp(var, "gpg.program"))
+   if (!strcmp(var, "gpg.program") || !strcmp(var, "gpg.openpgp.program"))
fmtname = "openpgp";
 
if (fmtname) {
-- 
2.16.4

[PATCH v4 0/7] X509 (gpgsm) commit signing support

2018-07-17 Thread Henning Schild

Changes in v4:
 - make gpg.format matching case sensitive
 - (final?) coding style and wording changes

Changes in v3:
 - drop patches 1 and 2 known from < v3, see pu hs/push-cert-check-cleanup
 - dropped some testcases from p6, added two t7004 bad key/sig
 - ship a binary x509 certificate for tests, no generation anymore
 - silence gpgsm in tests
 - switch all tests to use test_config instead of "git config"
 - p4 deal with invalid input
 - p3 several refactorings, see "PATCH v2 5/9" discussion

Changes in v2:
 - removed trailing commas in array initializers and add leading space
 - replaced assert(0) with BUG in p5
 - consolidated 2 format lookups reusing get_format_data p5
 - changed from format "PGP" to "openpgp", later X509 to "x509"
 - use strcasecmp instead of strcmp for format matching
 - introduce gpg..program in p8, no gpg.programX509 anymore
 - moved t/7510 patch inbetween the two patches changing validation code
 - changed t/7510 patch to use test_config and test_must_fail

This series adds support for signing commits with gpgsm.

The first 5 patches prepare for the introduction of the actual feature in
patch 6.
Finally patch 7 extends the testsuite to cover the new feature.

This series can be seen as a follow up of a series that appeared under
the name "gpg-interface: Multiple signing tools" in april 2018 [1]. After
that series was not merged i decided to get my patches ready. The
original series aimed at being generic for any sort of signing tool, while
this series just introduced the X509 variant of gpg. (gpgsm)
I collected authors and reviewers of that first series and already put them
on cc.

[1] https://public-inbox.org/git/20180409204129.43537-1-mastahy...@gmail.com/

Henning Schild (7):
  gpg-interface: add new config to select how to sign a commit
  t/t7510: check the validation of the new config gpg.format
  gpg-interface: introduce an abstraction for multiple gpg formats
  gpg-interface: do not hardcode the key string len anymore
  gpg-interface: introduce new config to select per gpg format program
  gpg-interface: introduce new signature format "x509" using gpgsm
  gpg-interface t: extend the existing GPG tests with GPGSM

 Documentation/config.txt   |  10 +
 gpg-interface.c| 108 +
 t/lib-gpg.sh   |  28 +++-
 t/lib-gpg/gpgsm-gen-key.in |   8 
 t/lib-gpg/gpgsm_cert.p12   | Bin 0 -> 2652 bytes
 t/t4202-log.sh |  39 
 t/t5534-push-signed.sh |  52 ++
 t/t7004-tag.sh |  13 ++
 t/t7030-verify-tag.sh  |  33 ++
 t/t7510-signed-commit.sh   |   9 
 10 files changed, 281 insertions(+), 19 deletions(-)
 create mode 100644 t/lib-gpg/gpgsm-gen-key.in
 create mode 100644 t/lib-gpg/gpgsm_cert.p12

-- 
2.16.4

Re: [PATCH v3 5/7] gpg-interface: introduce new config to select per gpg format program

2018-07-17 Thread Henning Schild

Am Mon, 16 Jul 2018 13:45:40 -0700
schrieb Junio C Hamano :

> Henning Schild  writes:
> 
> > +gpg..program::
> > +   Use this to customize the program used for the signing
> > format you
> > +   chose. (see gpg.program) gpg.openpgp.program is a synonym
> > for the
> > +   legacy gpg.program.  
> 
> I _think_ you meant "see gpg.format", but I am not 100% sure.

No i actually meant program, the next version just refers to both
config options for further reading.

> When X is a synonym for Y, Y is also a synonym for X, so technically
> speaking this does not matter, but when we talk about backward
> compatibility fallback, I think we say "OLDway is retained as a
> legacy synonym for NEWway", i.e. the other way around.
> 
> Also, `typeset in tt` what end-users would type literally, like
> configuration variable names, i.e.
> 
>   Use this to customize the rpogram used for the signing
>   format you chose (see `gpg.format`).  `gpg.program` can
>   still be used as a legacy synonym for `gpg.openpgp.program`.

Used that second sentence.

Henning

> >  gui.commitMsgWidth::
> > Defines how wide the commit message window is in the
> > linkgit:git-gui[1]. "75" is the default.
> > diff --git a/gpg-interface.c b/gpg-interface.c
> > index 93bd0fb32..f3c22b551 100644
> > --- a/gpg-interface.c
> > +++ b/gpg-interface.c
> > @@ -182,7 +182,7 @@ int git_gpg_config(const char *var, const char
> > *value, void *cb) return 0;
> > }
> >  
> > -   if (!strcmp(var, "gpg.program"))
> > +   if (!strcmp(var, "gpg.program") || !strcmp(var,
> > "gpg.openpgp.program")) fmtname = "openpgp";
> >  
> > if (fmtname) {

[PATCH v4 4/7] gpg-interface: do not hardcode the key string len anymore

2018-07-17 Thread Henning Schild

gnupg does print the keyid followed by a space and the signer comes
next. The same pattern is also used in gpgsm, but there the key length
would be 40 instead of 16. Instead of hardcoding the expected length,
find the first space and calculate it.
Input that does not match the expected format will be ignored now,
before we jumped to found+17 which might have been behind the end of an
unexpected string.

Signed-off-by: Henning Schild 
---
 gpg-interface.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gpg-interface.c b/gpg-interface.c
index a02db7658..51cad9081 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -95,10 +95,11 @@ static void parse_gpg_output(struct signature_check *sigc)
sigc->result = sigcheck_gpg_status[i].result;
/* The trust messages are not followed by key/signer 
information */
if (sigc->result != 'U') {
-   sigc->key = xmemdupz(found, 16);
+   next = strchrnul(found, ' ');
+   sigc->key = xmemdupz(found, next - found);
/* The ERRSIG message is not followed by signer 
information */
-   if (sigc-> result != 'E') {
-   found += 17;
+   if (*next && sigc-> result != 'E') {
+   found = next + 1;
next = strchrnul(found, '\n');
sigc->signer = xmemdupz(found, next - found);
}
-- 
2.16.4

[PATCH v4 2/7] t/t7510: check the validation of the new config gpg.format

2018-07-17 Thread Henning Schild

Test setting gpg.format to both invalid and valid values.

Signed-off-by: Henning Schild 
---
 t/t7510-signed-commit.sh | 9 +
 1 file changed, 9 insertions(+)

diff --git a/t/t7510-signed-commit.sh b/t/t7510-signed-commit.sh
index 6e2015ed9..7bdad570c 100755
--- a/t/t7510-signed-commit.sh
+++ b/t/t7510-signed-commit.sh
@@ -227,4 +227,13 @@ test_expect_success GPG 'log.showsignature behaves like 
--show-signature' '
grep "gpg: Good signature" actual
 '
 
+test_expect_success GPG 'check config gpg.format values' '
+   test_config gpg.format openpgp &&
+   git commit -S --amend -m "success" &&
+   test_config gpg.format OpEnPgP &&
+   test_must_fail git commit -S --amend -m "fail" &&
+   test_config gpg.format malformed &&
+   test_must_fail git commit -S --amend -m "fail"
+'
+
 test_done
-- 
2.16.4

Re: [PATCH v3 3/7] gpg-interface: introduce an abstraction for multiple gpg formats

2018-07-17 Thread Henning Schild

Am Mon, 16 Jul 2018 13:40:32 -0700
schrieb Junio C Hamano :

> Henning Schild  writes:
> 
> > Create a struct that holds the format details for the supported
> > formats. At the moment that is still just "openpgp". This commit
> > prepares for the introduction of more formats, that might use other
> > programs and match other signatures.
> >
> > Signed-off-by: Henning Schild 
> > ---
> >  gpg-interface.c | 84
> > ++--- 1 file
> > changed, 63 insertions(+), 21 deletions(-)
> >
> > diff --git a/gpg-interface.c b/gpg-interface.c
> > index 960c40086..699651fd9 100644
> > --- a/gpg-interface.c
> > +++ b/gpg-interface.c
> > @@ -7,11 +7,46 @@
> >  #include "tempfile.h"
> >  
> >  static char *configured_signing_key;
> > -static const char *gpg_format = "openpgp";
> > -static const char *gpg_program = "gpg";
> > +struct gpg_format {
> > +   const char *name;
> > +   const char *program;
> > +   const char **extra_args_verify;
> > +   const char **sigs;
> > +};
> > +
> > +static const char *openpgp_verify_args[] =
> > { "--keyid-format=long", NULL };  
> 
> Let's write it like this, even if the current line is short enough:
> 
> static const char *openpgp_verify_args[] = {
>   "--keyid-format=long",
>   NULL
> };
> 
> Ditto for the next one.  Even if you do not expect these two arrays
> to get longer, people tend to copy & paste to imitate existing code
> when making new things, and we definitely would not want them to be
> doing so on a code like the above (or below).  When they need to add
> new elements to their arrays, having the terminating NULL on its own
> line means they will have to see less patch noise.

Ok, for consistency a later patch will introduce { NULL }; on three
lines.

> > +static const char *openpgp_sigs[] = {
> > +   "-BEGIN PGP SIGNATURE-",
> > +   "-BEGIN PGP MESSAGE-", NULL };
> > +
> > +static struct gpg_format gpg_formats[] = {
> > +   { .name = "openpgp", .program = "gpg",
> > + .extra_args_verify = openpgp_verify_args,  
> 
> If the underlying aray is "verify_args" (not "extra"), perhaps the
> field name should also be .verify_args to match.

Renamed extra_args_verify to verify_args.

> Giving an array of things a name "things[]" is a disease, unless the
> array is very often passed around as a whole to represent the
> collection of everything.  When the array is mostly accessed one
> element at a time, being able to say thing[3] to mean the third
> thing is much more pleasant.
> 
> So, from that point of view, I pretty much agree with
> openpgp_verify_args[] and openpgp_sigs[], but am against
> gpg_formats[].  The last one should be gpg_format[], for which we
> happen to have only one variant right now.

renamed gpg_formats[] to gpg_format[].

> > + .sigs = openpgp_sigs
> > +   },
> > +};
> > +static struct gpg_format *current_format = &gpg_formats[0];  
> 
> Have a blank line before the above one.
> 
> What does "current_format" mean?  Is it different from "format to be
> used"?  As we will overwrite the variable upon reading the config,
> we cannot afford to call it default_format, but somehow "current"
> feels misleading, at least to me.  I expected to find "old format"
> elsewhere and there may be some format conversion or something from
> the variable name.
> 
> static struct gpg_format *use_format = &gpg_format[0];
> 
> perhaps?

renamed current_format to use_format.

> > +
> > +static struct gpg_format *get_format_by_name(const char *str)
> > +{
> > +   int i;
> > +
> > +   for (i = 0; i < ARRAY_SIZE(gpg_formats); i++)
> > +   if (!strcasecmp(gpg_formats[i].name, str))  
> 
> As [1/7], this would become strcmp(), I presume?
> 
> > +   return gpg_formats + i;
> > +   return NULL;
> > +}
> >  
> > -#define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-"
> > -#define PGP_MESSAGE "-BEGIN PGP MESSAGE-"
> > +static struct gpg_format *get_format_by_sig(const char *sig)
> > +{
> > +   int i, j;
> > +
> > +   for (i = 0; i < ARRAY_SIZE(gpg_formats); i++)
> > +   for (j = 0; gpg_formats[i].sigs[j]; j++)
> > +   if (starts_with(sig,
> > gpg_formats[i].sigs[j]))
> > +   return gpg_formats + i;
> > +   return NULL;
> > +}  
> 
> OK.
> 
> > @@ -140,18 +172,23 @@ int git_gpg_config(const char *var, const
> > char *value, void *cb) }
> >  
> > if (!strcmp(var, "gpg.format")) {
> > -   if (value && strcasecmp(value, "openpgp"))
> > -   return error("malformed value for %s: %s",
> > var, value);
> > -   return git_config_string(&gpg_format, var, value);
> > -   }
> > -
> > -   if (!strcmp(var, "gpg.program")) {
> > if (!value)
> > return config_error_nonbool(var);
> > -   gpg_program = xstrdup(value);
> > +   fmt = get_format_by_name(value);
> > +   if (!fmt)
> > +   return error("malformed value for %s: %s",
> > var, value);

[RFC PATCH 0/6] Add gentle alternative for `get_oid()`

2018-07-17 Thread Paul-Sebastian Ungureanu

At the moment, `get_oid()` might call `die()` in some cases. To
prevent that from happening, this patches introduces a new flag
called `GET_OID_GENTLY` and a new function `get_oid_gently()`,
which passes the mentioned flag further to `get_oid_with_context()`.

The call graph of `get_oid()` is pretty complex and I hope I covered
all the cases where `exit()` might be called. Although I believe this
series of patches will not introduce any regression and work as
intended, I think that is better to mark it with [RFC].

This patch would be useful for converting `git stash` to C. At the
moment, `git stash` spawns a child process to avoid `get_oid()` to
die. If this series turns out to be good enough to be accepted, do
I need to wait until it gets merged in `master` to use it in the
other project mentioned before?

Thanks,
Paul

Paul-Sebastian Ungureanu (6):
  sha1-name: Add `GET_OID_GENTLY` flag
  tree-walk: Add three new gentle helpers
  refs.c: Teach `read_ref_at()` to accept `GET_OID_GENTLY` flag
  sha1-name: Teach `get_oid_basic()` to be gentle
  sha1-name: Teach `get_oid_with_context[_1]()` to be gentle
  sha1-name: Add gentle alternative for `get_oid()`

 cache.h |   2 +
 refs.c  |   2 +
 sha1-name.c | 127 +---
 tree-walk.c | 108 +---
 tree-walk.h |   3 +-
 5 files changed, 199 insertions(+), 43 deletions(-)

-- 
2.18.0.rc2.184.ga79db55c2.dirty

[RFC PATCH 3/6] refs.c: Teach `read_ref_at()` to accept `GET_OID_GENTLY` flag

2018-07-17 Thread Paul-Sebastian Ungureanu

This commit introduces a way to call `read_ref_at()` without
exiting on failure.

Signed-off-by: Paul-Sebastian Ungureanu 
---
 refs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/refs.c b/refs.c
index 0eb379f93..4a470158e 100644
--- a/refs.c
+++ b/refs.c
@@ -932,6 +932,8 @@ int read_ref_at(const char *refname, unsigned int flags, 
timestamp_t at_time, in
for_each_reflog_ent_reverse(refname, read_ref_at_ent, &cb);
 
if (!cb.reccnt) {
+   if (flags & GET_OID_GENTLY)
+   return -1;
if (flags & GET_OID_QUIETLY)
exit(128);
else
-- 
2.18.0.rc2.184.ga79db55c2.dirty

[RFC PATCH 1/6] sha1-name: Add `GET_OID_GENTLY` flag

2018-07-17 Thread Paul-Sebastian Ungureanu

The current API does not provide a method to call
`get_oid()` and avoid `exit()` to be called. This commit
intention is to introduce a flag in order to make `get_oid()`
able to get the sha1 safely, without exiting the program.

Since `get_oid()` calls a lot of functions, which call other
functions as well (and so on), there are a lot of cases in which
`exit()` could be called. To make this idea more clear, here
is one example, which could cause `get_oid()` to die.

  get_oid() -> get_oid_with_context() -> get_oid_with_context_1()
  -> get_oid_1() -> read_ref_at() -> exit()

Where `function1() -> function2()` means that `function1()` might
call `function2()` at some point.

Signed-off-by: Paul-Sebastian Ungureanu 
---
 cache.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/cache.h b/cache.h
index d49092d94..cb8803e2f 100644
--- a/cache.h
+++ b/cache.h
@@ -1314,6 +1314,7 @@ struct object_context {
 #define GET_OID_FOLLOW_SYMLINKS 0100
 #define GET_OID_RECORD_PATH 0200
 #define GET_OID_ONLY_TO_DIE04000
+#define GET_OID_GENTLY   01
 
 #define GET_OID_DISAMBIGUATORS \
(GET_OID_COMMIT | GET_OID_COMMITTISH | \
-- 
2.18.0.rc2.184.ga79db55c2.dirty

[RFC PATCH 4/6] sha1-name: Teach `get_oid_basic()` to be gentle

2018-07-17 Thread Paul-Sebastian Ungureanu

After teaching `read_ref_at()` we need to teach `get_oid_basic()`
that `read_ref_at()` might not call `exit()`, but report an
error through the return value.

Signed-off-by: Paul-Sebastian Ungureanu 
---
 sha1-name.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index d741e1129..74ecbd550 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -778,6 +778,7 @@ static int get_oid_basic(const char *str, int len, struct 
object_id *oid,
timestamp_t at_time;
timestamp_t co_time;
int co_tz, co_cnt;
+   int ret;
 
/* Is it asking for N-th entry, or approxidate? */
for (i = nth = 0; 0 <= nth && i < reflog_len; i++) {
@@ -802,8 +803,12 @@ static int get_oid_basic(const char *str, int len, struct 
object_id *oid,
return -1;
}
}
-   if (read_ref_at(real_ref, flags, at_time, nth, oid, NULL,
-   &co_time, &co_tz, &co_cnt)) {
+
+   ret = read_ref_at(real_ref, flags, at_time, nth, oid, NULL,
+   &co_time, &co_tz, &co_cnt);
+   if (ret == -1)
+   return -1;
+   if (ret) {
if (!len) {
if (starts_with(real_ref, "refs/heads/")) {
str = real_ref + 11;
@@ -821,9 +826,12 @@ static int get_oid_basic(const char *str, int len, struct 
object_id *oid,
show_date(co_time, co_tz, 
DATE_MODE(RFC2822)));
}
} else {
-   if (flags & GET_OID_QUIETLY) {
-   exit(128);
+   if (flags & GET_OID_GENTLY) {
+   free(real_ref);
+   return -1;
}
+   if (flags & GET_OID_QUIETLY)
+   exit(128);
die("Log for '%.*s' only has %d entries.",
len, str, co_cnt);
}
-- 
2.18.0.rc2.184.ga79db55c2.dirty

[RFC PATCH 2/6] tree-walk: Add three new gentle helpers

2018-07-17 Thread Paul-Sebastian Ungureanu

Add `get_tree_entry_gently()`, `find_tree_entry_gently()`
and `get_tree_entry_follow_symlinks_gently()`, which will
make `get_oid()` to be more gently.

Since `get_tree_entry()` is used in more than 20 places,
adding a new parameter will make this commit harder to read.
In every place it is called there will need to be an additional
0 parameter at the end of the call. The solution to avoid this is
to rename the function in `get_tree_entry_gently()` which gets
an additional `flags` variable. A new `get_tree_entry()`
will call `get_tree_entry_gently()` with `flags` being 0.
This way, no additional changes will be needed.

Signed-off-by: Paul-Sebastian Ungureanu 
---
 sha1-name.c |   2 +-
 tree-walk.c | 108 +++-
 tree-walk.h |   3 +-
 3 files changed, 94 insertions(+), 19 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index 60d9ef3c7..d741e1129 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -1721,7 +1721,7 @@ static int get_oid_with_context_1(const char *name,
if (flags & GET_OID_FOLLOW_SYMLINKS) {
ret = get_tree_entry_follow_symlinks(&tree_oid,
filename, oid, &oc->symlink_path,
-   &oc->mode);
+   &oc->mode, flags);
} else {
ret = get_tree_entry(&tree_oid, filename, oid,
 &oc->mode);
diff --git a/tree-walk.c b/tree-walk.c
index 8f5090862..2925eaec2 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -491,7 +491,9 @@ struct dir_state {
struct object_id oid;
 };
 
-static int find_tree_entry(struct tree_desc *t, const char *name, struct 
object_id *result, unsigned *mode)
+static int find_tree_entry(struct tree_desc *t, const char *name,
+ struct object_id *result, unsigned *mode,
+ int flags)
 {
int namelen = strlen(name);
while (t->size) {
@@ -501,7 +503,11 @@ static int find_tree_entry(struct tree_desc *t, const char 
*name, struct object_
 
oid = tree_entry_extract(t, &entry, mode);
entrylen = tree_entry_len(&t->entry);
-   update_tree_entry(t);
+
+   if (!(flags & GET_OID_GENTLY))
+   update_tree_entry(t);
+   else if (update_tree_entry_gently(t))
+   return -1;
if (entrylen > namelen)
continue;
cmp = memcmp(name, entry, entrylen);
@@ -521,19 +527,28 @@ static int find_tree_entry(struct tree_desc *t, const 
char *name, struct object_
oidcpy(result, oid);
return 0;
}
-   return get_tree_entry(oid, name + entrylen, result, mode);
+   return get_tree_entry_gently(oid, name + entrylen, result, 
mode, flags);
}
return -1;
 }
 
-int get_tree_entry(const struct object_id *tree_oid, const char *name, struct 
object_id *oid, unsigned *mode)
+int get_tree_entry_gently(const struct object_id *tree_oid, const char *name,
+ struct object_id *oid, unsigned *mode, int flags)
 {
int retval;
void *tree;
unsigned long size;
struct object_id root;
 
-   tree = read_object_with_reference(tree_oid, tree_type, &size, &root);
+   if (!(flags & GET_OID_GENTLY)) {
+   tree = read_object_with_reference(tree_oid, tree_type, &size, 
&root);
+   } else {
+   struct object_info oi = OBJECT_INFO_INIT;
+
+   oi.contentp = tree;
+   if (oid_object_info_extended(the_repository, tree_oid, &oi, 0) 
< 0)
+   return -1;
+   }
if (!tree)
return -1;
 
@@ -547,13 +562,27 @@ int get_tree_entry(const struct object_id *tree_oid, 
const char *name, struct ob
retval = -1;
} else {
struct tree_desc t;
-   init_tree_desc(&t, tree, size);
-   retval = find_tree_entry(&t, name, oid, mode);
+   if (!(flags & GET_OID_GENTLY)) {
+   init_tree_desc(&t, tree, size);
+   } else {
+   if (init_tree_desc_gently(&t, tree, size)) {
+   retval = -1;
+   goto done;
+   }
+   }
+   retval = find_tree_entry(&t, name, oid, mode, flags);
}
+done:
free(tree);
return retval;
 }
 
+int get_tree_entry(const struct object_id *tree_oid, const char *name,
+  struct object_id *oid, unsigned *mode)
+{
+   return get_tree_entry_gently(tree_oid, name, oid, mode, 0);
+}
+
 /*
  * This is Linux's built-in max for the number of symlinks to follow.
  * That limit, of course, does not affect

[RFC PATCH 5/6] sha1-name: Teach `get_oid_with_context[_1]()` to be gentle

2018-07-17 Thread Paul-Sebastian Ungureanu

This commit makes `get_oid_with_context()` and `get_oid_with_context_1()`
to recognize the `GET_OID_GENTLY` flag.

The `gentle` flag does not imply `quiet` and we might need to reconsider
whether we should display any message if `GET_OID_GENTLY` is given.

Signed-off-by: Paul-Sebastian Ungureanu 
---
 sha1-name.c | 103 ++--
 1 file changed, 83 insertions(+), 20 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index 74ecbd550..a5d4e0dc7 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -1521,11 +1521,12 @@ int get_oid_blob(const char *name, struct object_id 
*oid)
 }
 
 /* Must be called only when object_name:filename doesn't exist. */
-static void diagnose_invalid_oid_path(const char *prefix,
+static int diagnose_invalid_oid_path(const char *prefix,
  const char *filename,
  const struct object_id *tree_oid,
  const char *object_name,
- int object_name_len)
+ int object_name_len,
+ int gentle)
 {
struct object_id oid;
unsigned mode;
@@ -1533,12 +1534,19 @@ static void diagnose_invalid_oid_path(const char 
*prefix,
if (!prefix)
prefix = "";
 
-   if (file_exists(filename))
+   if (file_exists(filename)) {
+   if (gentle)
+   return -1;
die("Path '%s' exists on disk, but not in '%.*s'.",
filename, object_name_len, object_name);
+   }
if (is_missing_file_error(errno)) {
char *fullname = xstrfmt("%s%s", prefix, filename);
-
+   if (gentle) {
+   warning(_("%s or %s does not exist."), fullname,
+   filename);
+   return -1;
+   }
if (!get_tree_entry(tree_oid, fullname, &oid, &mode)) {
die("Path '%s' exists, but not '%s'.\n"
"Did you mean '%.*s:%s' aka '%.*s:./%s'?",
@@ -1552,12 +1560,14 @@ static void diagnose_invalid_oid_path(const char 
*prefix,
die("Path '%s' does not exist in '%.*s'",
filename, object_name_len, object_name);
}
+   return 0;
 }
 
 /* Must be called only when :stage:filename doesn't exist. */
-static void diagnose_invalid_index_path(int stage,
+static int diagnose_invalid_index_path(int stage,
const char *prefix,
-   const char *filename)
+   const char *filename,
+   int gentle)
 {
const struct cache_entry *ce;
int pos;
@@ -1574,11 +1584,20 @@ static void diagnose_invalid_index_path(int stage,
if (pos < active_nr) {
ce = active_cache[pos];
if (ce_namelen(ce) == namelen &&
-   !memcmp(ce->name, filename, namelen))
+   !memcmp(ce->name, filename, namelen)) {
+   if (gentle) {
+   warning("Path '%s' is in the index "
+   "but not at stage %d.\n"
+   "Did you mean ':%d:%s'?",
+   filename, stage,
+   ce_stage(ce), filename);
+   return -1;
+   }
die("Path '%s' is in the index, but not at stage %d.\n"
"Did you mean ':%d:%s'?",
filename, stage,
ce_stage(ce), filename);
+   }
}
 
/* Confusion between relative and absolute filenames? */
@@ -1590,31 +1609,58 @@ static void diagnose_invalid_index_path(int stage,
if (pos < active_nr) {
ce = active_cache[pos];
if (ce_namelen(ce) == fullname.len &&
-   !memcmp(ce->name, fullname.buf, fullname.len))
+   !memcmp(ce->name, fullname.buf, fullname.len)) {
+   if (gentle)
+   return -1;
die("Path '%s' is in the index, but not '%s'.\n"
"Did you mean ':%d:%s' aka ':%d:./%s'?",
fullname.buf, filename,
ce_stage(ce), fullname.buf,
ce_stage(ce), filename);
+   }
}
 
-   if (file_exists(filename))
+   if (file_exists(filename)) {
+   if (gentle)
+   return -1;
die("Path '%s' exists on disk, but not in the index.", 
filename);
-   if (is_missing_file_error(errno))
+   }
+   if (is_missing_file_error(errno)) {
+   i

[RFC PATCH 6/6] sha1-name: Add gentle alternative for `get_oid()`

2018-07-17 Thread Paul-Sebastian Ungureanu

Add `get_oid_gently()` to be a gentle alternative to `get_oid()`.

Signed-off-by: Paul-Sebastian Ungureanu 
---
 cache.h | 1 +
 sha1-name.c | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/cache.h b/cache.h
index cb8803e2f..36e196202 100644
--- a/cache.h
+++ b/cache.h
@@ -1321,6 +1321,7 @@ struct object_context {
GET_OID_TREE | GET_OID_TREEISH | \
GET_OID_BLOB)
 
+extern int get_oid_gently(const char *str, struct object_id *oid);
 extern int get_oid(const char *str, struct object_id *oid);
 extern int get_oid_commit(const char *str, struct object_id *oid);
 extern int get_oid_committish(const char *str, struct object_id *oid);
diff --git a/sha1-name.c b/sha1-name.c
index a5d4e0dc7..6ee48fdf7 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -1474,6 +1474,12 @@ int get_oid(const char *name, struct object_id *oid)
return get_oid_with_context(name, 0, oid, &unused);
 }
 
+int get_oid_gently(const char *name, struct object_id *oid)
+{
+   struct object_context unused;
+   return get_oid_with_context(name, GET_OID_GENTLY, oid, &unused);
+}
+
 
 /*
  * Many callers know that the user meant to name a commit-ish by
-- 
2.18.0.rc2.184.ga79db55c2.dirty

Re: [RFC PATCH 5/5] format-patch: add --creation-weight tweak for --range-diff

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Wed, 30 May 2018, Eric Sunshine wrote:

> When generating a range-diff, matching up commits between two version of
> a patch series involves heuristics, thus may give unexpected results.
> git-branch-diff allows tweaking the heuristic via --creation-weight.
> Follow suit by accepting --creation-weight in combination with
> --range-diff when generating a range-diff for a cover-letter.

Since I opted to change this to `--creation-factor`, taking an integer
between 0 and 100 (essentially), this patch will need heavy adjustment ;-=)

Thanks,
Dscho

> Signed-off-by: Eric Sunshine 
> ---
>  Documentation/git-format-patch.txt |  8 +++-
>  builtin/log.c  | 19 +++
>  2 files changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/git-format-patch.txt 
> b/Documentation/git-format-patch.txt
> index 25026ae26e..7ed9ec9dae 100644
> --- a/Documentation/git-format-patch.txt
> +++ b/Documentation/git-format-patch.txt
> @@ -23,7 +23,7 @@ SYNOPSIS
>  [(--reroll-count|-v) ]
>  [--to=] [--cc=]
>  [--[no-]cover-letter] [--quiet] [--notes[=]]
> -[--range-diff=]
> +[--range-diff= [--creation-weight=]]
>  [--progress]
>  []
>  [  |  ]
> @@ -240,6 +240,12 @@ feeding the result to `git send-email`.
>   disjoint (for example `git format-patch --cover-letter
>   --range-diff=feature/v1~3..feature/v1 -3 feature/v2`).
>  
> +--creation-weight=::
> + Used with `--range-diff`, tweak the heuristic which matches up commits
> + between the previous and current series of patches by adjusting the
> + creation/deletion cost fudge factor. See linkgit:git-branch-diff[1])
> + for details.
> +
>  --notes[=]::
>   Append the notes (see linkgit:git-notes[1]) for the commit
>   after the three-dash line.
> diff --git a/builtin/log.c b/builtin/log.c
> index 3089d3a50a..2c49011b51 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -1012,12 +1012,16 @@ static void infer_diff_ranges(struct argv_array *args,
>  }
>  
>  static int get_range_diff(struct strbuf *sb,
> -   const struct argv_array *ranges)
> +   const struct argv_array *ranges,
> +   const char *creation_weight)
>  {
>   struct child_process cp = CHILD_PROCESS_INIT;
>  
>   cp.git_cmd = 1;
>   argv_array_pushl(&cp.args, "branch-diff", "--no-color", NULL);
> + if (creation_weight)
> + argv_array_pushf(&cp.args,
> +  "--creation-weight=%s", creation_weight);
>   argv_array_pushv(&cp.args, ranges->argv);
>   return capture_command(&cp, sb, 0);
>  }
> @@ -1047,7 +1051,8 @@ static void make_cover_letter(struct rev_info *rev, int 
> use_stdout,
> int nr, struct commit **list,
> const char *branch_name,
> int quiet,
> -   const char *range_diff)
> +   const char *range_diff,
> +   const char *creation_weight)
>  {
>   const char *committer;
>   const char *body = "*** SUBJECT HERE ***\n\n*** BLURB HERE ***\n";
> @@ -1070,7 +1075,7 @@ static void make_cover_letter(struct rev_info *rev, int 
> use_stdout,
>   if (range_diff) {
>   struct argv_array ranges = ARGV_ARRAY_INIT;
>   infer_diff_ranges(&ranges, range_diff, origin, head);
> - if (get_range_diff(&diff, &ranges))
> + if (get_range_diff(&diff, &ranges, creation_weight))
>   die(_("failed to generate range-diff"));
>   argv_array_clear(&ranges);
>   }
> @@ -1495,6 +1500,7 @@ int cmd_format_patch(int argc, const char **argv, const 
> char *prefix)
>   int show_progress = 0;
>   struct progress *progress = NULL;
>   const char *range_diff = NULL;
> + const char *creation_weight = NULL;
>  
>   const struct option builtin_format_patch_options[] = {
>   { OPTION_CALLBACK, 'n', "numbered", &numbered, NULL,
> @@ -1570,6 +1576,8 @@ int cmd_format_patch(int argc, const char **argv, const 
> char *prefix)
>N_("show progress while generating patches")),
>   OPT_STRING(0, "range-diff", &range_diff, N_("rev-range"),
>  N_("show changes against  in cover 
> letter")),
> + OPT_STRING(0, "creation-weight", &creation_weight, N_("factor"),
> +N_("fudge factor by which creation is weighted")),
>   OPT_END()
>   };
>  
> @@ -1664,6 +1672,9 @@ int cmd_format_patch(int argc, const char **argv, const 
> char *prefix)
>   die (_("--subject-prefix/--rfc and -k are mutually 
> exclusive."));
>   rev.preserve_subject = keep_subject;
>  
> + if (creation_weight && !range_diff)
> + die(_("--creation-

Re: [RFC PATCH 3/5] format-patch: extend --range-diff to accept revision range

2018-07-17 Thread Eric Sunshine

On Tue, Jul 17, 2018 at 6:45 AM Johannes Schindelin
 wrote:
> On Wed, 30 May 2018, Eric Sunshine wrote:
> > + if (strstr(prev, "..")) {
> > + if (!origin)
> > + die(_("failed to infer range-diff ranges"));
> > + argv_array_push(args, prev);
> > + argv_array_pushf(args, "%s..%s",
> > +  oid_to_hex(&origin->object.oid),
> > +  oid_to_hex(&head->object.oid));
> > + } else {
> > + argv_array_pushf(args, "%s...%s", prev,
> > +  oid_to_hex(&head->object.oid));
> > + }
>
> This would be easier to read if the order was inverted:
>
> if (!strstr(...))
> ...
> else if (!origin)
> die(...)
> else {
> ...
> }
>
> Otherwise, it makes sense to me.

Thanks, I re-wrote it that way in the re-roll already.

Re: [RFC PATCH 2/5] format-patch: add --range-diff option to embed diff in cover letter

2018-07-17 Thread Eric Sunshine

Thanks for the review comments. In the new version of the series
(almost complete), almost the entire implementation has changed,
including most of the stuff on which you commented. Anyhow, see my
responses to your comments below...

On Tue, Jul 17, 2018 at 6:31 AM Johannes Schindelin
 wrote:
> On Wed, 30 May 2018, Eric Sunshine wrote:
> > +static int get_range_diff(struct strbuf *sb,
>
> If you rename `sb` to `out`, it makes it more obvious to the casual reader
> that this strbuf will receive the output.

This is gone in the re-roll.

> > + cp.git_cmd = 1;
> > + argv_array_pushl(&cp.args, "branch-diff", "--no-color", NULL);
>
> Obviously, this needs to be "range-diff" now.

The re-roll takes advantage of the libified range-diff rather than
invoking it as a command.

> > + argv_array_pushv(&cp.args, ranges->argv);
>
> As there must be exactly two ranges, it would make more sense to pass them
> explicitly. And then you can use one single call to `argv_array_pushl()`
> instead.

Gone in the re-roll.

> > + struct strbuf diff = STRBUF_INIT;
>
> I guess you want to call it `diff` instead of `range_diff` because a
> future change will reuse this for the interdiff instead? And then also to
> avoid a naming conflict with the parameter.
>
> Dunno. I would still call it `range_diff` until the day comes (if ever)
> when `--interdiff` is implemented. And I would call the `range_diff`
> parameter `range_diff_opt` instead, or some such.

Sharing the variable between interdiff and range-diff was the idea
initially, however, I later decided that the --range-diff and
--interdiff options didn't need to be mutually-exclusive, so, in the
re-roll, all variables now have distinct names (no commonality between
them).

> > + if (range_diff) {
> > + struct argv_array ranges = ARGV_ARRAY_INIT;
> > + infer_diff_ranges(&ranges, range_diff, head);
> > + if (get_range_diff(&diff, &ranges))
> > + die(_("failed to generate range-diff"));
>
> BTW I like to have an extra space in front of all the range-diff lines, to
> make it easier to discern them from the rest.

I'm not sure what you mean. Perhaps I'm misreading your comment.

> >   if (!use_stdout &&
> >   open_next_file(NULL, rev->numbered_files ? NULL : "cover-letter", 
> > rev, quiet))
> > - return;
> > + goto done;
>
> Hmm. If you think you will add more `goto done`s in the future, I guess
> this is okay. Otherwise, it would probably make sense to go ahead and
> simply `strbuf_release(&diff)` before `return`ing.

In the re-roll, there are several more things which need to be cleaned
up, so the 'goto' makes life easier.

> > @@ -1438,6 +1480,7 @@ int cmd_format_patch(int argc, const char **argv, 
> > const char *prefix)
> > + const char *range_diff = NULL;
>
> Maybe `range_diff_opt`? It's not exactly the range diff that is contained
> in this variable.

I could, though I was trying to keep it shorter rather than longer.
This is still the same in the re-roll, but I can rename it if you
insist.

> > + if (range_diff && !cover_letter)
> > + die(_("--range-diff requires --cover-letter"));
>
> I guess this will be changed in a future patch, allowing a single patch to
> ship with a range diff, too?

Yes, that's already the case in the re-roll.

> > +format_patch () {
> > + title=$1 &&
> > + range=$2 &&
> > + test_expect_success "format-patch --range-diff ($title)" '
> > + git format-patch --stdout --cover-letter --range-diff=$range \
> > + master..unmodified >actual &&
> > + grep "= 1: .* s/5/A" actual &&
> > + grep "= 2: .* s/4/A" actual &&
> > + grep "= 3: .* s/11/B" actual &&
> > + grep "= 4: .* s/12/B" actual
>
> I guess this might make sense if `format_patch` was not a function, but it
> is specifically marked as a function... so... shouldn't these `grep`s also
> be using function parameters?

A later patch adds a second test which specifies the same ranges but
in a different way, so the result will be the same, hence the
hard-coded grep'ing. The function avoids repetition across the two
tests. I suppose I could do this a bit differently, though, to avoid
pretending it's a general-purpose function.

Re: [RFC PATCH 3/5] format-patch: extend --range-diff to accept revision range

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Wed, 30 May 2018, Eric Sunshine wrote:

> diff --git a/builtin/log.c b/builtin/log.c
> index 460c31a293..e38cf06050 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -995,10 +995,20 @@ static char *find_branch_name(struct rev_info *rev)
>  
>  static void infer_diff_ranges(struct argv_array *args,
> const char *prev,
> +   struct commit *origin,
> struct commit *head)
>  {
> - argv_array_pushf(args, "%s...%s", prev,
> -  oid_to_hex(&head->object.oid));
> + if (strstr(prev, "..")) {
> + if (!origin)
> + die(_("failed to infer range-diff ranges"));
> + argv_array_push(args, prev);
> + argv_array_pushf(args, "%s..%s",
> +  oid_to_hex(&origin->object.oid),
> +  oid_to_hex(&head->object.oid));
> + } else {
> + argv_array_pushf(args, "%s...%s", prev,
> +  oid_to_hex(&head->object.oid));
> + }

This would be easier to read if the order was inverted:

if (!strstr(...))
...
else if (!origin)
die(...)
else {
...
}

Otherwise, it makes sense to me.

Thanks,
Dscho

>  }
>  
>  static int get_range_diff(struct strbuf *sb,
> @@ -1059,7 +1069,7 @@ static void make_cover_letter(struct rev_info *rev, int 
> use_stdout,
>   /* might die from bad user input so try before creating cover letter */
>   if (range_diff) {
>   struct argv_array ranges = ARGV_ARRAY_INIT;
> - infer_diff_ranges(&ranges, range_diff, head);
> + infer_diff_ranges(&ranges, range_diff, origin, head);
>   if (get_range_diff(&diff, &ranges))
>   die(_("failed to generate range-diff"));
>   argv_array_clear(&ranges);
> diff --git a/t/t7910-branch-diff.sh b/t/t7910-branch-diff.sh
> index edbd69b6f8..c0e8668ed9 100755
> --- a/t/t7910-branch-diff.sh
> +++ b/t/t7910-branch-diff.sh
> @@ -155,5 +155,6 @@ format_patch () {
>  }
>  
>  format_patch 'B...C' topic
> +format_patch 'A..B A..C' master..topic
>  
>  test_done
> -- 
> 2.17.1.1235.ge295dfb56e
> 
>

Re: [RFC PATCH 2/5] format-patch: add --range-diff option to embed diff in cover letter

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Wed, 30 May 2018, Eric Sunshine wrote:

> diff --git a/builtin/log.c b/builtin/log.c
> index e01a256c11..460c31a293 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -28,6 +28,7 @@
>  #include "mailmap.h"
>  #include "gpg-interface.h"
>  #include "progress.h"
> +#include "run-command.h"
>  
>  #define MAIL_DEFAULT_WRAP 72
>  
> @@ -992,6 +993,25 @@ static char *find_branch_name(struct rev_info *rev)
>   return branch;
>  }
>  
> +static void infer_diff_ranges(struct argv_array *args,
> +   const char *prev,
> +   struct commit *head)
> +{
> + argv_array_pushf(args, "%s...%s", prev,
> +  oid_to_hex(&head->object.oid));
> +}
> +
> +static int get_range_diff(struct strbuf *sb,

If you rename `sb` to `out`, it makes it more obvious to the casual reader
that this strbuf will receive the output.

> +   const struct argv_array *ranges)
> +{
> + struct child_process cp = CHILD_PROCESS_INIT;
> +
> + cp.git_cmd = 1;
> + argv_array_pushl(&cp.args, "branch-diff", "--no-color", NULL);

Obviously, this needs to be "range-diff" now.

> + argv_array_pushv(&cp.args, ranges->argv);

As there must be exactly two ranges, it would make more sense to pass them
explicitly. And then you can use one single call to `argv_array_pushl()`
instead.

> + return capture_command(&cp, sb, 0);
> +}
> +
>  static void emit_diffstat(struct rev_info *rev,
> struct commit *origin, struct commit *head)
>  {
> @@ -1016,7 +1036,8 @@ static void make_cover_letter(struct rev_info *rev, int 
> use_stdout,
> struct commit *origin,
> int nr, struct commit **list,
> const char *branch_name,
> -   int quiet)
> +   int quiet,
> +   const char *range_diff)
>  {
>   const char *committer;
>   const char *body = "*** SUBJECT HERE ***\n\n*** BLURB HERE ***\n";
> @@ -1028,15 +1049,25 @@ static void make_cover_letter(struct rev_info *rev, 
> int use_stdout,
>   int need_8bit_cte = 0;
>   struct pretty_print_context pp = {0};
>   struct commit *head = list[0];
> + struct strbuf diff = STRBUF_INIT;

I guess you want to call it `diff` instead of `range_diff` because a
future change will reuse this for the interdiff instead? And then also to
avoid a naming conflict with the parameter.

Dunno. I would still call it `range_diff` until the day comes (if ever)
when `--interdiff` is implemented. And I would call the `range_diff`
parameter `range_diff_opt` instead, or some such.

Or maybe use `extra_footer` instead of `diff`.

>   if (!cmit_fmt_is_mail(rev->commit_format))
>   die(_("Cover letter needs email format"));
>  
>   committer = git_committer_info(0);
>  
> + /* might die from bad user input so try before creating cover letter */
> + if (range_diff) {
> + struct argv_array ranges = ARGV_ARRAY_INIT;
> + infer_diff_ranges(&ranges, range_diff, head);
> + if (get_range_diff(&diff, &ranges))
> + die(_("failed to generate range-diff"));

BTW I like to have an extra space in front of all the range-diff lines, to
make it easier to discern them from the rest.

> + argv_array_clear(&ranges);
> + }
> +
>   if (!use_stdout &&
>   open_next_file(NULL, rev->numbered_files ? NULL : "cover-letter", 
> rev, quiet))
> - return;
> + goto done;

Hmm. If you think you will add more `goto done`s in the future, I guess
this is okay. Otherwise, it would probably make sense to go ahead and
simply `strbuf_release(&diff)` before `return`ing.

>   log_write_email_headers(rev, head, &pp.after_subject, &need_8bit_cte);
>  
> @@ -1077,6 +1108,17 @@ static void make_cover_letter(struct rev_info *rev, 
> int use_stdout,
>   /* We can only do diffstat with a unique reference point */
>   if (origin)
>   emit_diffstat(rev, origin, head);
> +
> + if (diff.len) {
> + FILE *fp = rev->diffopt.file;
> + fputs(_("Changes since previous version:"), fp);
> + fputs("\n\n", fp);
> + fputs(diff.buf, fp);
> + fputc('\n', fp);
> + }
> +
> +done:
> + strbuf_release(&diff);
>  }
>  
>  static const char *clean_message_id(const char *msg_id)
> @@ -1438,6 +1480,7 @@ int cmd_format_patch(int argc, const char **argv, const 
> char *prefix)
>   struct base_tree_info bases;
>   int show_progress = 0;
>   struct progress *progress = NULL;
> + const char *range_diff = NULL;

Maybe `range_diff_opt`? It's not exactly the range diff that is contained
in this variable.

>   const struct option builtin_format_patch_options[] = {
>   { OPTION_CALLBACK, 'n', "numbered", &numbered, NULL,
> @@ -1511,6 +1554,8 @@ int cmd_format_patch(in

Re: [RFC PATCH 1/5] format-patch: allow additional generated content in make_cover_letter()

2018-07-17 Thread Eric Sunshine

On Tue, Jul 17, 2018 at 6:15 AM Johannes Schindelin
 wrote:
> On Wed, 30 May 2018, Eric Sunshine wrote:
> > make_cover_letter() returns early when it lacks sufficient state to emit
> > a diffstat, which makes it difficult to extend the function to reliably
> > emit additional generated content. Work around this shortcoming by
> > factoring diffstat-printing logic out to its own function and calling it
> > as needed without otherwise inhibiting normal control flow.
> >
> > Signed-off-by: Eric Sunshine 
>
> Makes sense.

Thanks, but it's probably not worth spending time reviewing this RFC
series. I already have a new series in the works (in fact, mostly
finished) in which the implementation is drastically changed from this
one. Aside from adding an --interdiff option to git-format-patch (in
addition to a --range-diff option) and allowing interdiff and
range-diff to be added as commentary to a single-patch, the new series
also takes advantage of the newly-libified range-diff engine rather
than running git-range-diff as a command. So, most or all of the code
has changed.

(Though, perhaps it wouldn't hurt to review the documentation changes
in this RFC series to see if I botched how I described the option.)

Re: [RFC PATCH 1/5] format-patch: allow additional generated content in make_cover_letter()

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Wed, 30 May 2018, Eric Sunshine wrote:

> make_cover_letter() returns early when it lacks sufficient state to emit
> a diffstat, which makes it difficult to extend the function to reliably
> emit additional generated content. Work around this shortcoming by
> factoring diffstat-printing logic out to its own function and calling it
> as needed without otherwise inhibiting normal control flow.
> 
> Signed-off-by: Eric Sunshine 

Makes sense.

Ciao,
Dscho

> ---
>  builtin/log.c | 43 +++
>  1 file changed, 23 insertions(+), 20 deletions(-)
> 
> diff --git a/builtin/log.c b/builtin/log.c
> index 71f68a3e4f..e01a256c11 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -992,6 +992,26 @@ static char *find_branch_name(struct rev_info *rev)
>   return branch;
>  }
>  
> +static void emit_diffstat(struct rev_info *rev,
> +   struct commit *origin, struct commit *head)
> +{
> + struct diff_options opts;
> +
> + memcpy(&opts, &rev->diffopt, sizeof(opts));
> + opts.output_format = DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT;
> + opts.stat_width = MAIL_DEFAULT_WRAP;
> +
> + diff_setup_done(&opts);
> +
> + diff_tree_oid(&origin->tree->object.oid,
> +   &head->tree->object.oid,
> +   "", &opts);
> + diffcore_std(&opts);
> + diff_flush(&opts);
> +
> + fprintf(rev->diffopt.file, "\n");
> +}
> +
>  static void make_cover_letter(struct rev_info *rev, int use_stdout,
> struct commit *origin,
> int nr, struct commit **list,
> @@ -1005,7 +1025,6 @@ static void make_cover_letter(struct rev_info *rev, int 
> use_stdout,
>   struct strbuf sb = STRBUF_INIT;
>   int i;
>   const char *encoding = "UTF-8";
> - struct diff_options opts;
>   int need_8bit_cte = 0;
>   struct pretty_print_context pp = {0};
>   struct commit *head = list[0];
> @@ -1055,25 +1074,9 @@ static void make_cover_letter(struct rev_info *rev, 
> int use_stdout,
>  
>   shortlog_output(&log);
>  
> - /*
> -  * We can only do diffstat with a unique reference point
> -  */
> - if (!origin)
> - return;
> -
> - memcpy(&opts, &rev->diffopt, sizeof(opts));
> - opts.output_format = DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT;
> - opts.stat_width = MAIL_DEFAULT_WRAP;
> -
> - diff_setup_done(&opts);
> -
> - diff_tree_oid(&origin->tree->object.oid,
> -   &head->tree->object.oid,
> -   "", &opts);
> - diffcore_std(&opts);
> - diff_flush(&opts);
> -
> - fprintf(rev->diffopt.file, "\n");
> + /* We can only do diffstat with a unique reference point */
> + if (origin)
> + emit_diffstat(rev, origin, head);
>  }
>  
>  static const char *clean_message_id(const char *msg_id)
> -- 
> 2.17.1.1235.ge295dfb56e
> 
>

Re: [PATCH 11/25] t7400: fix broken "submodule add/reconfigure --force" test

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Mon, 16 Jul 2018, Eric Sunshine wrote:

> On Mon, Jul 16, 2018 at 02:37:32PM -0700, Junio C Hamano wrote:
> > Eric Sunshine  writes:
> > > On Mon, Jul 16, 2018 at 11:51 AM Johannes Schindelin
> > >  wrote:
> > >> On Mon, 16 Jul 2018, Johannes Schindelin wrote:
> > >> > > -   git submodule add --force bogus-url submod &&
> > >> > > +   git submodule add --force /bogus-url submod &&
> > >> > This breaks on Windows because of the absolute Unix-y path having to be
> > >> > translated to a Windows path:
> > >> > I could imagine that using "$(pwd)/bogus-url" (which will generate a
> > >> > Windows-y absolute path on Windows) would work, though.
> > >>
> > >> Yes, this works indeed, see the patch below. Could you please squash it 
> > >> in
> > >> if you send another iteration of your patch series? Junio, could you
> > >> please add this as a SQUASH??? commit so that `pu` is fixed on Windows?
> > >
> > > So, this SQUASH looks like the correct way forward. Hopefully, Junio
> > > can just squash it so I don't have to flood the list again with this
> > > long series.
> > 
> > The topic already has another dependent topic so rerolling or
> > squashing would be a bit cumbersome.  I'll see what I could do but
> > it may not be until tomorrow's pushout.
> 
> No problem. Here's Dscho's fix in the form of a proper patch if
> that makes it easier for you (it just needs his sign-off):
> 
> --- >8 ---
> From: Johannes Schindelin 
> Subject: [PATCH] t7400: make "submodule add/reconfigure --force" work on
>  Windows
> 
> On Windows, the "Unixy" path /bogus-url gets translated automatically to
> a Windows-style path (such as C:/git-sdk/bogus_url). This is normally
> not problem, since it's still a valid and usable path in that form,

s/not problem/not a problem/

> however, this test wants to do a comparison against the original path.
> $(pwd) was invented exactly for this case, so use it to make the path
> comparison work.
> 
> [es: commit message]
> 

Signed-off-by: Johannes Schindelin 

> Signed-off-by: Eric Sunshine 
> ---
>  t/t7400-submodule-basic.sh | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
> index 76cf522a08..bfb09dd566 100755
> --- a/t/t7400-submodule-basic.sh
> +++ b/t/t7400-submodule-basic.sh
> @@ -171,10 +171,11 @@ test_expect_success 'submodule add to .gitignored path 
> with --force' '
>  test_expect_success 'submodule add to reconfigure existing submodule with 
> --force' '
>   (
>   cd addtest-ignore &&
> - git submodule add --force /bogus-url submod &&
> + bogus_url="$(pwd)/bogus-url" &&
> + git submodule add --force "$bogus_url" submod &&
>   git submodule add --force -b initial "$submodurl" submod-branch 
> &&
> - test "/bogus-url" = "$(git config -f .gitmodules 
> submodule.submod.url)" &&
> - test "/bogus-url" = "$(git config submodule.submod.url)" &&
> + test "$bogus_url" = "$(git config -f .gitmodules 
> submodule.submod.url)" &&
> + test "$bogus_url" = "$(git config submodule.submod.url)" &&
>   # Restore the url
>   git submodule add --force "$submodurl" submod &&
>   test "$submodurl" = "$(git config -f .gitmodules 
> submodule.submod.url)" &&
> -- 
> 2.18.0.233.g985f88cf7e

Thanks!
Dscho

Re: [RFC PATCH 0/5] format-patch: automate cover letter range-diff

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Wed, 30 May 2018, Eric Sunshine wrote:

> Dscho recently implemented a 'tbdiff' replacement as a Git builtin named
> git-branch-diff[1] which computes differences between two versions of a
> patch series. Such a diff can be a useful aid for reviewers when
> inserted into a cover letter. However, doing so requires manual
> generation (invoking git-branch-diff) and copy/pasting the result into
> the cover letter.
> 
> This series fully automates the process by adding a --range-diff option
> to git-format-patch.

Nice!

> It is RFC for a couple reasons:
> 
> * The final name for the 'tbdiff' replacement has not yet been nailed
>   down. The name git-branch-diff is moribund[2]; Dscho favors merging
>   the functionality into git-branch as a new --diff option[3]; others
>   prefer a standalone command named git-range-diff or
>   git-series-diff[4] or similar.

I think this has been settled now: range-diff it is.

> * I have some ideas for future enhancements and want to be careful not
>   to lock in a UI which doesn't mesh well with them (though I think the
>   current UI turned out reasonably well). First, I foresee a
>   complementary --interdiff option for inserting an interdiff-style diff
>   for cases when that style is easier to read or simply more
>   appropriate. Second, although the current patch series only supports
>   --range-diff for the cover letter, some people insert interdiff- or
>   tbdiff-style diffs (indented) into the commentary section of
>   standalone patches. Although this often makes for a noisy mess, it is
>   periodically useful.

Sure.

> This series is built atop js/branch-diff in 'pu'.
> 
> [1]: 
> https://public-inbox.org/git/cover.1525448066.git.johannes.schinde...@gmx.de/
> [2]: 
> https://public-inbox.org/git/nycvar.qro.7.76.6.1805061401260...@tvgsbejvaqbjf.bet/
> [3]: 
> https://public-inbox.org/git/nycvar.qro.7.76.6.1805062155120...@tvgsbejvaqbjf.bet/
> [4]: https://public-inbox.org/git/xmqqin7gzbkb@gitster-ct.c.googlers.com/
> 
> Eric Sunshine (5):
>   format-patch: allow additional generated content in
> make_cover_letter()
>   format-patch: add --range-diff option to embed diff in cover letter
>   format-patch: extend --range-diff to accept revision range
>   format-patch: teach --range-diff to respect -v/--reroll-count
>   format-patch: add --creation-weight tweak for --range-diff
> 
>  Documentation/git-format-patch.txt |  18 +
>  builtin/log.c  | 119 -
>  t/t7910-branch-diff.sh |  16 
>  3 files changed, 132 insertions(+), 21 deletions(-)
> 
> -- 
> 2.17.1.1235.ge295dfb56e

Thanks!
Dscho

Re: [PATCH v3 03/20] range-diff: first rudimentary implementation

2018-07-17 Thread Johannes Schindelin

Hi Eric,

On Mon, 16 Jul 2018, Eric Sunshine wrote:

> On Tue, Jul 3, 2018 at 7:27 AM Johannes Schindelin via GitGitGadget
>  wrote:
> > At this stage, `git range-diff` can determine corresponding commits
> > of two related commit ranges. This makes use of the recently introduced
> > implementation of the Hungarian algorithm.
> 
> Did you want s/Hungarian/Jonker-Volgenant/ here? (Not worth a re-roll.)

It is worth a new iteration, and I'd rather say "linear assignment" than
either Hungarian or Jonker-Volgenant. Thanks for pointing this out.

> > The core of this patch is a straight port of the ideas of tbdiff, the
> > apparently dormant project at https://github.com/trast/tbdiff.
> > [...]
> > Signed-off-by: Johannes Schindelin 
> > ---
> > diff --git a/builtin/range-diff.c b/builtin/range-diff.c
> > @@ -17,9 +18,49 @@ int cmd_range_diff(int argc, const char **argv, const 
> > char *prefix)
> > +   int res = 0;
> > +   struct strbuf range1 = STRBUF_INIT, range2 = STRBUF_INIT;
> >
> > -   argc = parse_options(argc, argv, NULL, options,
> > -builtin_range_diff_usage, 0);
> > +   argc = parse_options(argc, argv, NULL, options, 
> > builtin_range_diff_usage,
> > +0);
> 
> This parse_options() change appears to be merely a re-wrapping of the
> line between patches 2 and 3.

True, and it is a bad change because it makes the line longer than 80
columns.

Fixed.

> > -   return 0;
> > +   if (argc == 2) {
> > +   if (!strstr(argv[0], ".."))
> > +   warning(_("no .. in range: '%s'"), argv[0]);
> > +   strbuf_addstr(&range1, argv[0]);
> > +
> > +   if (!strstr(argv[1], ".."))
> > +   warning(_("no .. in range: '%s'"), argv[1]);
> > +   strbuf_addstr(&range2, argv[1]);
> 
> Should these die() (like the "..." case below) rather than warning()?
> Warning and continuing doesn't seem like intended behavior. When I
> test this with on git.git and omit the "..", git sits for a long, long
> time consuming the CPU. I guess it's git-log'ing pretty much the
> entire history.

I had to go back to `git-tbdiff.py` to see how it handles this, and you
are right: it should die().

Fixed.

(Technically, it is conceivable that some user wants to compare two
independent commit histories, e.g. when a repository was imported from a
different SCM two times, independently. I guess when that happens, we can
always implement a `range-diff --root  ` or some such.)

> % GIT_TRACE=1 git range-diff v1 v2
> warning: no .. in range: 'v1'
> warning: no .. in range: 'v2'
> trace: git log --no-color -p --no-merges --reverse \
> --date-order --decorate=no --no-abbrev-commit v1
> ^C
> %
> 
> > +   } else if (argc == 3) {
> > +   strbuf_addf(&range1, "%s..%s", argv[0], argv[1]);
> > +   strbuf_addf(&range2, "%s..%s", argv[0], argv[2]);
> > +   } else if (argc == 1) {
> > +   const char *b = strstr(argv[0], "..."), *a = argv[0];
> > +   int a_len;
> > +
> > +   if (!b)
> > +   die(_("single arg format requires a symmetric 
> > range"));
> > diff --git a/range-diff.c b/range-diff.c
> > @@ -0,0 +1,307 @@
> > +static int read_patches(const char *range, struct string_list *list)
> > +{
> > +   while (strbuf_getline(&line, in) != EOF) {
> > +   if (skip_prefix(line.buf, "commit ", &p)) {
> > +   [...]
> > +   in_header = 1;
> > +   continue;
> > +   }
> > +   if (starts_with(line.buf, "diff --git")) {
> > +   in_header = 0;
> > +   [...]
> > +   } else if (in_header) {
> > +   if (starts_with(line.buf, "Author: ")) {
> > +   [...]
> > +   } else if (starts_with(line.buf, "")) {
> > +   [...]
> > +   }
> > +   continue;
> > +   } else if (starts_with(line.buf, "@@ "))
> > +   strbuf_addstr(&buf, "@@");
> > +   else if (line.buf[0] && !starts_with(line.buf, "index "))
> > +   /*
> > +* A completely blank (not ' \n', which is context)
> > +* line is not valid in a diff.  We skip it
> > +* silently, because this neatly handles the blank
> > +* separator line between commits in git-log
> > +* output.
> > +*/
> > +   strbuf_addbuf(&buf, &line);
> 
> This comment had me confused for a bit since it doesn't seem to agree
> with the 'then' part of the 'if', but rather applies more to the
> 'else'.  Had it been split into two parts (one for 'then' and one for
> 'else'), it migh

Re: Are clone/checkout operations deterministic?

2018-07-17 Thread Ævar Arnfjörð Bjarmason

On Tue, Jul 17 2018, J. Paul Reed wrote:

> Hey Git Devs,
>
> I have a bit of an odd question: do git clone/checkout operations have a
> deterministic ordering?
>
> That is: are files guaranteed to be laid down onto disk in any specific
> (and deterministic) order as a clone and/or checkout operations occurs?
> (And if so, is it alphabetical order? Depth-first? Something else?)
>
> In case the answer is different (and I'd guess that it might be?), I'm
> mostly interested in the initial clone case... but it would be great to
> know if, indeed, the answer is different for just-checkouts too.
>
> I did some cursory googling, but nothing hopped out at me as an answer to
> this question.

In practice I think clone, checkout, reset etc. always work in the same
order you see with `git ls-tree -r --name-only HEAD`, but as far as I
know this has never been guaranteed or documented, and shouldn't be
relied on.

E.g. there's probably cases where writing files in parallel is going to
be faster than writing them sequentially. We don't have such a mode just
because nobody's written a patch for it, but having that patch would
break any assumptions of our current order.

Re: Git 2.18: RUNTIME_PREFIX... is it working?

2018-07-17 Thread Johannes Schindelin

Hi Jonathan,

[had to drop Perry Hutchinson, as the email address is apparently invalid,
and I only realized now that I never sent this out.]

On Tue, 10 Jul 2018, Jonathan Nieder wrote:

> If this [/proc issue] is the main obstacle to enabling RUNTIME_PREFIX by
> default, one option would be to make RUNTIME_PREFIX fall back to a
> hard-coded path when and only when git is not able to find the path from
> which it was run.

That is already the case. Look for FALLBACK_RUNTIME_PREFIX in the code.

Ciao,
Dscho

Are clone/checkout operations deterministic?

2018-07-17 Thread J. Paul Reed



Hey Git Devs,

I have a bit of an odd question: do git clone/checkout operations have a
deterministic ordering?

That is: are files guaranteed to be laid down onto disk in any specific
(and deterministic) order as a clone and/or checkout operations occurs?
(And if so, is it alphabetical order? Depth-first? Something else?)

In case the answer is different (and I'd guess that it might be?), I'm
mostly interested in the initial clone case... but it would be great to
know if, indeed, the answer is different for just-checkouts too.

I did some cursory googling, but nothing hopped out at me as an answer to
this question.

Thanks!

-preed
-- 
J. Paul Reed  PGP: 0xDF8708F8
=
I've never seen an airplane yet that can read the type ratings on your
pilot's license.   -- Chuck Boedecker

Re: [PATCH] gc: do not warn about too many loose objects

2018-07-17 Thread Ævar Arnfjörð Bjarmason

On Tue, Jul 17 2018, Jonathan Nieder wrote:

> Hi,
>
> Jeff King wrote:
>> On Mon, Jul 16, 2018 at 03:56:39PM -0700, Jonathan Nieder wrote:
>
>>> The calling command in the motivating example is Android's "repo" tool:
>>>
>>> bare_git.gc('--auto')
>>>
>>> from https://gerrit-review.googlesource.com/c/git-repo/+/10598/.  I
>>> think it's reasonable that it expects a status code of 0 in the normal
>>> case.  So life is less simple than I hoped.
>>
>> IMHO it should ignore the return code, since that's what Git does
>> itself. And at any rate, you'd miss errors from daemonized gc's (at
>> least until the _next_ one runs and propagates the error code).

I've read this whole thread, and as Peff noted I've barked up this
particular tree before[1] without coming up with a solution myself.

So please don't take the following as critique of any way of moving
forward, I'm just trying to poke holes in what you're doing to make sure
we don't have regressions to the currently (sucky) logic.

> That suggests a possible improvement.  If all callers should be
> ignoring the exit code, can we make the exit code in daemonize mode
> unconditionally zero in this kind of case?

That doesn't make sense to me. Just because git itself is happily
ignoring the exit code I don't think that should mean there shouldn't be
a meaningful exit code. Why don't you just ignore the exit code in the
repo tool as well?

Now e.g. automation I have to see if git-gc ---auto is having issues
can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet
of servers, but will need to start caring if stderr was emitted to.

I don't care if we e.g. have a 'git gc --auto --exit-code' similar to
what git-diff does, but it doesn't make sense to me that we *know* we
can't background the gc due to a previous error and then always return
0. Having to parse STDERR to see if it *really* failed is un-unixy,
let's use exit codes. That's what they're for.

> That doesn't really solve the problem:
>
>  1. "gc --auto" would produce noise *every time it runs* until gc.log
> is removed, for example via expiry
>
>  2. "gc --auto" would not do any garbage collection until gc.log is
> removed, for example by expiry
>
>  3. "gc --auto" would still not ratelimit itself, for example when
> there are a large number of loose unreachable objects that is not
> large enough to hit the loose object threshold.
>
> but maybe it's better than the present state.
>
> To solve (1) and (2), we could introduce a gc.warnings file that
> behaves like gc.log except that it only gets printed once and then
> self-destructs, allowing gc --auto to proceed.

I think you're conflating two things here in a way that (if I'm reading
this correctly) produces a pretty bad regression for users.

 a) If we have something in the gc.log we keep yelling at users until we
reach the gc.logExpiry, this was the subject of my thread back in
January. This sucks, and should be fixed somehow.

Maybe we should only emit the warning once, e.g. creating a
.git/gc.log.wasemitted marker and as long as its ctime is later than
the mtime on .git/gc.log we don't emit it again).

But that also creates the UX problem that it's easy to miss this
message in the middle of some huge "fetch" output. Perhaps we should
just start emitting this as part of "git status" or something (or
solve the root cause, as Peff notes...).

 b) We *also* use this presence of a gc.log as a marker for "we failed
too recently, let's not try again until after a day".

The semantics of b) are very useful, and just because they're tied up
with a) right now, let's not throw out b) just because we're trying to
solve a).

We have dev machines with limited I/O & CPU/memory that occasionally
break the gc.auto limit, I really don't want those churning a background
"git gc" on every fetch/commit etc. until we're finally below the
gc.auto limit (possibly days later), which would be a side-effect of
printing the .git/gc.log once and then removing it.

> To solve (3), we could
> introduce a gc.lastrun file that is touched whenever "gc --auto" runs
> successfully and make "gc --auto" a no-op for a while after each run.

This would work around my concern with b) above in most cases by
introducing a purely time-based rate limit, but I'm uneasy about this
change in git-gc semantics.

Right now one thing we do right is we always try to look at the actual
state of the repo with too_many_packs() and too_many_loose_objects().

We don't assume the state of your repo hasn't drastically changed
recently. That means that we do the right thing and gc when the repo
needs it, not just because we GC'd recently enough.

With these proposed semantics we'd skip a needed GC (potentially for
days, depending on the default) just because we recently ran one.

In many environments, such as on busy servers, we could have tens of
thousands of packs by this point, since this facility would (presumably)
bypass b

[PATCH v3 1/5] ref-filter: add info_source to valid_atom

2018-07-17 Thread Olga Telezhnaya

Add the source of object data to prevent parsing of unneeded data.
The goal is to improve performance by avoiding calling expensive
functions when we don't need the information they provide
or when we could get it by using a cheaper function.

It is stored in valid_atoms because it depends on the atoms we are
interested in.

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 82 +++-
 1 file changed, 43 insertions(+), 39 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index fa3685d91f046..8611c24fd57d1 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -41,6 +41,7 @@ void setup_ref_filter_porcelain_msg(void)
 
 typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
 typedef enum { COMPARE_EQUAL, COMPARE_UNEQUAL, COMPARE_NONE } cmp_status;
+typedef enum { SOURCE_NONE = 0, SOURCE_OBJ, SOURCE_OTHER } info_source;
 
 struct align {
align_type position;
@@ -73,6 +74,7 @@ struct refname_atom {
 static struct used_atom {
const char *name;
cmp_type type;
+   info_source source;
union {
char color[COLOR_MAXLEN];
struct align align;
@@ -380,49 +382,50 @@ static int head_atom_parser(const struct ref_format 
*format, struct used_atom *a
 
 static struct {
const char *name;
+   info_source source;
cmp_type cmp_type;
int (*parser)(const struct ref_format *format, struct used_atom *atom,
  const char *arg, struct strbuf *err);
 } valid_atom[] = {
-   { "refname" , FIELD_STR, refname_atom_parser },
-   { "objecttype" },
-   { "objectsize", FIELD_ULONG },
-   { "objectname", FIELD_STR, objectname_atom_parser },
-   { "tree" },
-   { "parent" },
-   { "numparent", FIELD_ULONG },
-   { "object" },
-   { "type" },
-   { "tag" },
-   { "author" },
-   { "authorname" },
-   { "authoremail" },
-   { "authordate", FIELD_TIME },
-   { "committer" },
-   { "committername" },
-   { "committeremail" },
-   { "committerdate", FIELD_TIME },
-   { "tagger" },
-   { "taggername" },
-   { "taggeremail" },
-   { "taggerdate", FIELD_TIME },
-   { "creator" },
-   { "creatordate", FIELD_TIME },
-   { "subject", FIELD_STR, subject_atom_parser },
-   { "body", FIELD_STR, body_atom_parser },
-   { "trailers", FIELD_STR, trailers_atom_parser },
-   { "contents", FIELD_STR, contents_atom_parser },
-   { "upstream", FIELD_STR, remote_ref_atom_parser },
-   { "push", FIELD_STR, remote_ref_atom_parser },
-   { "symref", FIELD_STR, refname_atom_parser },
-   { "flag" },
-   { "HEAD", FIELD_STR, head_atom_parser },
-   { "color", FIELD_STR, color_atom_parser },
-   { "align", FIELD_STR, align_atom_parser },
-   { "end" },
-   { "if", FIELD_STR, if_atom_parser },
-   { "then" },
-   { "else" },
+   { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+   { "objecttype", SOURCE_OTHER },
+   { "objectsize", SOURCE_OTHER, FIELD_ULONG },
+   { "objectname", SOURCE_OTHER, FIELD_STR, objectname_atom_parser },
+   { "tree", SOURCE_OBJ },
+   { "parent", SOURCE_OBJ },
+   { "numparent", SOURCE_OBJ, FIELD_ULONG },
+   { "object", SOURCE_OBJ },
+   { "type", SOURCE_OBJ },
+   { "tag", SOURCE_OBJ },
+   { "author", SOURCE_OBJ },
+   { "authorname", SOURCE_OBJ },
+   { "authoremail", SOURCE_OBJ },
+   { "authordate", SOURCE_OBJ, FIELD_TIME },
+   { "committer", SOURCE_OBJ },
+   { "committername", SOURCE_OBJ },
+   { "committeremail", SOURCE_OBJ },
+   { "committerdate", SOURCE_OBJ, FIELD_TIME },
+   { "tagger", SOURCE_OBJ },
+   { "taggername", SOURCE_OBJ },
+   { "taggeremail", SOURCE_OBJ },
+   { "taggerdate", SOURCE_OBJ, FIELD_TIME },
+   { "creator", SOURCE_OBJ },
+   { "creatordate", SOURCE_OBJ, FIELD_TIME },
+   { "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
+   { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
+   { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
+   { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+   { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+   { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+   { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+   { "flag", SOURCE_NONE },
+   { "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
+   { "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
+   { "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
+   { "end", SOURCE_NONE },
+   { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
+   { "then", SOURCE_NONE },
+   { "else", SOURCE_NONE },
 };
 
 #define REF_FORMATTING_STATE_INIT  { 0, NULL }
@@ -498,6 +501,7 @@ static int parse_ref_filter_atom(const struct ref_format 
*format,
REALLOC_ARRAY(used_atom, used_atom

[PATCH v3 5/5] ref-filter: use oid_object_info() to get object

2018-07-17 Thread Olga Telezhnaya

Use oid_object_info_extended() to get object info instead of
read_object_file().
It will help to handle some requests faster (e.g., we do not need to
parse whole object if we need to know %(objectsize)).
It could also help us to add new atoms such as %(objectsize:disk)
and %(deltabase).

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 120 +++
 1 file changed, 87 insertions(+), 33 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 2b401a17c4689..112955f006648 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -61,6 +61,17 @@ struct refname_atom {
int lstrip, rstrip;
 };
 
+static struct expand_data {
+   struct object_id oid;
+   enum object_type type;
+   unsigned long size;
+   off_t disk_size;
+   struct object_id delta_base_oid;
+   void *content;
+
+   struct object_info info;
+} oi, oi_deref;
+
 /*
  * An atom is a valid field atom listed below, possibly prefixed with
  * a "*" to denote deref_tag().
@@ -202,6 +213,30 @@ static int remote_ref_atom_parser(const struct ref_format 
*format, struct used_a
return 0;
 }
 
+static int objecttype_atom_parser(const struct ref_format *format, struct 
used_atom *atom,
+ const char *arg, struct strbuf *err)
+{
+   if (arg)
+   return strbuf_addf_ret(err, -1, _("%%(objecttype) does not take 
arguments"));
+   if (*atom->name == '*')
+   oi_deref.info.typep = &oi_deref.type;
+   else
+   oi.info.typep = &oi.type;
+   return 0;
+}
+
+static int objectsize_atom_parser(const struct ref_format *format, struct 
used_atom *atom,
+ const char *arg, struct strbuf *err)
+{
+   if (arg)
+   return strbuf_addf_ret(err, -1, _("%%(objectsize) does not take 
arguments"));
+   if (*atom->name == '*')
+   oi_deref.info.sizep = &oi_deref.size;
+   else
+   oi.info.sizep = &oi.size;
+   return 0;
+}
+
 static int body_atom_parser(const struct ref_format *format, struct used_atom 
*atom,
const char *arg, struct strbuf *err)
 {
@@ -388,8 +423,8 @@ static struct {
  const char *arg, struct strbuf *err);
 } valid_atom[] = {
{ "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-   { "objecttype", SOURCE_OTHER },
-   { "objectsize", SOURCE_OTHER, FIELD_ULONG },
+   { "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
+   { "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
{ "objectname", SOURCE_OTHER, FIELD_STR, objectname_atom_parser },
{ "tree", SOURCE_OBJ },
{ "parent", SOURCE_OBJ },
@@ -502,6 +537,12 @@ static int parse_ref_filter_atom(const struct ref_format 
*format,
used_atom[at].name = xmemdupz(atom, ep - atom);
used_atom[at].type = valid_atom[i].cmp_type;
used_atom[at].source = valid_atom[i].source;
+   if (used_atom[at].source == SOURCE_OBJ) {
+   if (*atom == '*')
+   oi_deref.info.contentp = &oi_deref.content;
+   else
+   oi.info.contentp = &oi.content;
+   }
if (arg) {
arg = used_atom[at].name + (arg - atom) + 1;
if (!*arg) {
@@ -817,7 +858,7 @@ static int grab_objectname(const char *name, const struct 
object_id *oid,
 }
 
 /* See grab_values */
-static void grab_common_values(struct atom_value *val, int deref, struct 
object *obj, void *buf, unsigned long sz)
+static void grab_common_values(struct atom_value *val, int deref, struct 
expand_data *oi)
 {
int i;
 
@@ -829,13 +870,13 @@ static void grab_common_values(struct atom_value *val, 
int deref, struct object
if (deref)
name++;
if (!strcmp(name, "objecttype"))
-   v->s = type_name(obj->type);
+   v->s = type_name(oi->type);
else if (!strcmp(name, "objectsize")) {
-   v->value = sz;
-   v->s = xstrfmt("%lu", sz);
+   v->value = oi->size;
+   v->s = xstrfmt("%lu", oi->size);
}
else if (deref)
-   grab_objectname(name, &obj->oid, v, &used_atom[i]);
+   grab_objectname(name, &oi->oid, v, &used_atom[i]);
}
 }
 
@@ -1194,7 +1235,6 @@ static void fill_missing_values(struct atom_value *val)
  */
 static void grab_values(struct atom_value *val, int deref, struct object *obj, 
void *buf, unsigned long sz)
 {
-   grab_common_values(val, deref, obj, buf, sz);
switch (obj->type) {
case OBJ_TAG:
grab_tag_values(val, deref, obj, buf, sz);
@@ -1418,29 +1458,36 @@ static const char *get_refname(struct used_atom *atom, 
struct ref_array_item *re
return show_ref(&atom->u.refname, re

[PATCH v3 3/5] ref-filter: initialize eaten variable

2018-07-17 Thread Olga Telezhnaya

Initialize variable `eaten` before its using. We may initialize it in
parse_object_buffer(), but there are cases when we do not reach this
invocation.

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ref-filter.c b/ref-filter.c
index 27733ef013bed..8db7ca95b12c0 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1439,7 +1439,8 @@ static const char *get_refname(struct used_atom *atom, 
struct ref_array_item *re
 static int get_object(struct ref_array_item *ref, const struct object_id *oid,
   int deref, struct object **obj, struct strbuf *err)
 {
-   int eaten;
+   /* parse_object_buffer() will set eaten to 0 if free() will be needed */
+   int eaten = 1;
int ret = 0;
unsigned long size;
void *buf = get_obj(oid, obj, &size, &eaten);

--
https://github.com/git/git/pull/520

1 2 >

1 - 100 of 105 matches

Mail list logo