date:20180709

Re: [PATCH 0/6] Add merge recursive testcases with undetected conflicts

2018-07-09 Thread Jeff King

On Mon, Jul 09, 2018 at 01:22:29PM -0700, Elijah Newren wrote:

> Oh, I didn't know about test-lint.  Is there a place that documents
> the various checks you run, so I can avoid slowing you down?  Ones I
> know about:
> 
> Already documented:
>   * `make DEVELOPER=1` (from CodingGuidelines)
>   * running tests (from SubmittingPatches)
> 
> Stuff I've seen you mention in emails over time:
>   * linux/scripts/checkpatch.pl
>   * git grep -e '\' --and --not -e 'static inline' -- \*.h
>   * make -C t/ test-lint

test-lint is supposed to be run automatically as part of "make test" (or
"make prove"), unless you've specifically disabled it by setting
TEST_LINT. And it does complain for me with your patches. If it doesn't
for you, then we have a bug to fix. :)

I won't be surprised, though, if you just ran "./t6036" manually before
sending, since your patches literally didn't touch any other files.

In theory we could push some of the linting down into the test scripts
themselves (some of it, like the &&-linter, is there already by
necessity). But it might also end up annoying, since usually dropping
down to manual single-test runs means you're trying to debug something,
and extra linting processes could get in the way.

> Are there others?

I like:

  make SANITIZE=address,undefined test

though it's pretty heavy-weight (but not nearly as much as valgrind).
You probably also need BLK_SHA1=Yes, since the default DC_SHA1 has some
unaligned loads that make UBSan complain. We should maybe teach the
Makefile to do that by default.

I've also been playing with clang's scan-build. It _did_ find a real bug
recently, but it has a bunch of false positives.

Stefan runs Coverity against pu periodically. IIRC It's a pain to run
yourself, but the shared results can be mailed to you, or you can poke
around at https://scan.coverity.com/projects/git. That _also_ has a ton
of false positives, but it's good about cataloguing them so the periodic
email usually just mentions the new ones.

-Peff

[PATCH v2 0/2] de-confuse git cherry-pick --author

2018-07-09 Thread Jeff King

On Mon, Jul 09, 2018 at 10:15:05PM -0400, Jeff King wrote:

> > Should this not rather be
> > 
> > -   if (!cmit || get_revision(opts->revs))
> > -   return error("BUG: expected exactly one commit from 
> > walk");
> > +   if (!cmit)
> > +   return error(_("empty commit set passed"));
> > +   if (get_revision(opts->revs))
> > +   return error(_("unexpected extra commit from walk"));
> 
> Yeah, you're right. I'm not sure how a single rev with no-walk would
> ever turn up more than one commit, though. So I think we should probably
> go with:
> 
>   if (!cmit)
>   return error(_("empty commit set passed"));
>   if (get_revision(opts->revs))
>   BUG("unexpected extra commit from walk");
> 
> And then if we ever see that case, we can decide from there what the
> right action is (though _probably_ it's just to emit an error like you
> have above, it might be a sign that our single-pick logic is wrong).
> 
> I'll re-roll in that direction, and discuss further in the commit
> message.

After poking at it a bit more, I've convinced myself that this is the
right thing, as options like "--branches" which expand into multiple
tips already push us into the other code path.

So here's a re-roll. The first one is identical except for the typo-fix
in the commit message.

  [1/2]: sequencer: handle empty-set cases consistently
  [2/2]: sequencer: don't say BUG on bogus input

 sequencer.c | 12 
 t/t3510-cherry-pick-sequence.sh |  7 ++-
 2 files changed, 14 insertions(+), 5 deletions(-)

-Peff

[PATCH v2 2/2] sequencer: don't say BUG on bogus input

2018-07-09 Thread Jeff King

When cherry-picking a single commit, we go through a special
code path that avoids creating a sequencer todo list at all.
This path expects our revision parsing to turn up exactly
one commit, and dies with a BUG if it doesn't.

But it's actually quite easy to fool. For example:

  $ git cherry-pick --author=no.such.person HEAD
  error: BUG: expected exactly one commit from walk
  fatal: cherry-pick failed

This isn't a bug; it's just bogus input.

The condition to trigger this message actually has two
parts:

  1. We saw no commits. That's the case in the example
 above. Let's drop the "BUG" here to make it clear that
 the input is the problem. And let's also use the phrase
 "empty commit set passed", which matches what we say
 when we do a real revision walk and it turns up empty.

  2. We saw more than one commit. That one _should_ be
 impossible to trigger, since we fed at most one tip and
 provided the no_walk option (and we'll have already
 expanded options like "--branches" that can turn into
 multiple tips). If this ever triggers, it's an
 indication that the conditional added by 7acaaac275
 (revert: allow single-pick in the middle of cherry-pick
 sequence, 2011-12-10) needs to more carefully define
 the single-pick case.

 So this can remain a bug, but we'll upgrade it to use
 the BUG() macro, which would make it easier to detect
 and analyze if it does trigger.

Signed-off-by: Jeff King 
---
 sequencer.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index f692b2ef44..8f0a015160 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3636,8 +3636,10 @@ int sequencer_pick_revisions(struct replay_opts *opts)
if (prepare_revision_walk(opts->revs))
return error(_("revision walk setup failed"));
cmit = get_revision(opts->revs);
-   if (!cmit || get_revision(opts->revs))
-   return error("BUG: expected exactly one commit from 
walk");
+   if (!cmit)
+   return error(_("empty commit set passed"));
+   if (get_revision(opts->revs))
+   BUG("unexpected extra commit from walk");
return single_pick(cmit, opts);
}
 
-- 
2.18.0.400.g702e398724

[PATCH v2 1/2] sequencer: handle empty-set cases consistently

2018-07-09 Thread Jeff King

If the user gives us a set that prepare_revision_walk()
takes to be empty, like:

  git cherry-pick base..base

then we report an error. It's nonsense, and there's nothing
to pick.

But if they use revision options that later cull the list,
like:

  git cherry-pick --author=nobody base~2..base

then we quietly create an empty todo list and return
success.

Arguably either behavior is acceptable, but we should
definitely be consistent about it. Reporting an error
seems to match the original intent, which dates all the way
back to 7e2bfd3f99 (revert: allow cherry-picking more than
one commit, 2010-06-02). That in turn was trying to match
the single-commit case that existed before then (and which
continues to issue an error).

Signed-off-by: Jeff King 
---
 sequencer.c | 6 --
 t/t3510-cherry-pick-sequence.sh | 7 ++-
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 5354d4d51e..f692b2ef44 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1863,8 +1863,6 @@ static int prepare_revs(struct replay_opts *opts)
if (prepare_revision_walk(opts->revs))
return error(_("revision walk setup failed"));
 
-   if (!opts->revs->commits)
-   return error(_("empty commit set passed"));
return 0;
 }
 
@@ -2317,6 +2315,10 @@ static int walk_revs_populate_todo(struct todo_list 
*todo_list,
short_commit_name(commit), subject_len, subject);
unuse_commit_buffer(commit, commit_buffer);
}
+
+   if (!todo_list->nr)
+   return error(_("empty commit set passed"));
+
return 0;
 }
 
diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
index b42cd66d3a..3505b6aa14 100755
--- a/t/t3510-cherry-pick-sequence.sh
+++ b/t/t3510-cherry-pick-sequence.sh
@@ -480,11 +480,16 @@ test_expect_success 'malformed instruction sheet 2' '
test_expect_code 128 git cherry-pick --continue
 '
 
-test_expect_success 'empty commit set' '
+test_expect_success 'empty commit set (no commits to walk)' '
pristine_detach initial &&
test_expect_code 128 git cherry-pick base..base
 '
 
+test_expect_success 'empty commit set (culled during walk)' '
+   pristine_detach initial &&
+   test_expect_code 128 git cherry-pick -2 --author=no.such.author base
+'
+
 test_expect_success 'malformed instruction sheet 3' '
pristine_detach initial &&
test_expect_code 1 git cherry-pick base..anotherpick &&
-- 
2.18.0.400.g702e398724

Re: Git 2.18: RUNTIME_PREFIX... is it working?

2018-07-09 Thread Jeff King

On Mon, Jul 09, 2018 at 03:58:22PM -0400, Jeff King wrote:

> On Sun, Jul 08, 2018 at 11:52:22PM +0200, Johannes Schindelin wrote:
> 
> > Now, if you care to have a look at Dan's (and my) patches to implement
> > RUNTIME_PREFIX so that it looks for a directory *relative to the Git
> > binary*, you will see that it is far from portable. In fact, it is very
> > definitely not portable, and needs specific support for *every single
> > supported Operating System*. And while we covered a lot, we did not cover
> > all of them.
> > 
> > So unfortunately, it is impossible to make it the default, I am afraid.
> 
> Would it be reasonable to make RUNTIME_PREFIX the default on systems
> where we _do_ have that support? AFAIK there is no downside to having it
> enabled (minus a few syscalls to find the prefix, I suppose, but I
> assume that's negligible).

Brainstorming a little more on "what could be the possible downsides".

If I understand correctly, the Linux implementation requires reading
from /proc. So an executable that only did RUNTIME_PREFIX (with no
fallback to static paths) would be unhappy inside a chroot or other
container that didn't mount /proc.

-Peff

Re: Git 2.18: RUNTIME_PREFIX... is it working?

2018-07-09 Thread Jeff King

On Mon, Jul 09, 2018 at 10:26:54PM +0200, Johannes Schindelin wrote:

> > Would it be reasonable to make RUNTIME_PREFIX the default on systems
> > where we _do_ have that support? AFAIK there is no downside to having it
> > enabled (minus a few syscalls to find the prefix, I suppose, but I
> > assume that's negligible).
> > 
> > I.e., a patch to config.mak.uname (and possibly better support for
> > _disabling_ it, though I think "make RUNTIME_PREFIX=" would probably
> > work).
> 
> The obvious downside is that we would be a lot more likely to break one
> side of the equation. At least right now, we have Git for Windows being a
> prime user of RUNTIME_PREFIX (so breakages should be caught relatively
> quickly), and macOS/Linux *not* being users of that feature (so breakages
> in the non-RUNTIME_PREFIX code paths should be caught even quicker). By
> turning on RUNTIME_PREFIX for the major platforms, the fringe platforms
> are even further out on their own.

That's true. On the other hand, we have a zillion compat features for
fringe platforms already, so there already is an expectation that people
on those platforms would need to occasionally report and fix
system-specific bugs. Perhaps thinking of it not as an feature to opt
into, but rather as a compat for "your system has not caught up to the
modern world by implementing RUNTIME_PREFIX" would encourage people on
those platforms to implement the necessary scaffolding.

I also have a gut feeling that it is much easier for static-path devs to
break RUNTIME_PREFIX folks, rather than the other way around, simply
because RUNTIME_PREFIX has a lot more moving parts. But I admit that's
just a feeling.

-Peff

Re: [RFC PATCH 2/6] refs/refs-internal.h: avoid forward declaration of an enum

2018-07-09 Thread Jeff King

On Mon, Jul 09, 2018 at 09:30:12PM +0200, Beat Bolli wrote:

> > Other than this minor quibble, the whole series looks good to me, modulo
> > the existing review.
> 
> Ooosp, I've just sent the non-RFC reroll without this change.
> 
> Junio, would you squash this into [1/6] and [2/6], please (if you agree,
> of course :-)

:) I'd be happy with that squash, but not the end of the world if it
doesn't make it. Thanks for the whole series.

-Peff

Re: [PATCH 2/2] sequencer: don't say BUG on bogus input

2018-07-09 Thread Jeff King

On Mon, Jul 09, 2018 at 10:24:25PM +0200, Johannes Schindelin wrote:

> > diff --git a/sequencer.c b/sequencer.c
> > index f692b2ef44..234666b980 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -3637,7 +3637,7 @@ int sequencer_pick_revisions(struct replay_opts *opts)
> > return error(_("revision walk setup failed"));
> > cmit = get_revision(opts->revs);
> > if (!cmit || get_revision(opts->revs))
> > -   return error("BUG: expected exactly one commit from 
> > walk");
> > +   return error(_("empty commit set passed"));
> 
> Should this not rather be
> 
> - if (!cmit || get_revision(opts->revs))
> - return error("BUG: expected exactly one commit from 
> walk");
> + if (!cmit)
> + return error(_("empty commit set passed"));
> + if (get_revision(opts->revs))
> + return error(_("unexpected extra commit from walk"));

Yeah, you're right. I'm not sure how a single rev with no-walk would
ever turn up more than one commit, though. So I think we should probably
go with:

  if (!cmit)
return error(_("empty commit set passed"));
  if (get_revision(opts->revs))
BUG("unexpected extra commit from walk");

And then if we ever see that case, we can decide from there what the
right action is (though _probably_ it's just to emit an error like you
have above, it might be a sign that our single-pick logic is wrong).

I'll re-roll in that direction, and discuss further in the commit
message.

-Peff

Re: [PATCH v2 6/6] commit-graph: add repo arg to graph readers

2018-07-09 Thread Derrick Stolee


On 7/9/2018 4:44 PM, Jonathan Tan wrote:

Add a struct repository argument to the functions in commit-graph.h that
read the commit graph. (This commit does not affect functions that write
commit graphs.)

Because the commit graph functions can now read the commit graph of any
repository, the global variable core_commit_graph has been removed.
Instead, the config option core.commitGraph is now read on the first
time in a repository that a commit is attempted to be parsed using its
commit graph.

This commit includes a test that exercises the functionality on an
arbitrary repository that is not the_repository.

Signed-off-by: Jonathan Tan 
---
  Makefile   |  1 +
  builtin/fsck.c |  2 +-
  cache.h|  1 -
  commit-graph.c | 60 ++
  commit-graph.h |  7 +--
  commit.c   |  6 +--
  config.c   |  5 ---
  environment.c  |  1 -
  ref-filter.c   |  2 +-
  t/helper/test-repository.c | 88 ++
  t/helper/test-tool.c   |  1 +
  t/helper/test-tool.h   |  1 +
  t/t5318-commit-graph.sh| 35 +++
  13 files changed, 168 insertions(+), 42 deletions(-)
  create mode 100644 t/helper/test-repository.c

diff --git a/Makefile b/Makefile
index 0cb6590f24..bb8bd67201 100644
--- a/Makefile
+++ b/Makefile
@@ -719,6 +719,7 @@ TEST_BUILTINS_OBJS += test-prio-queue.o
  TEST_BUILTINS_OBJS += test-read-cache.o
  TEST_BUILTINS_OBJS += test-ref-store.o
  TEST_BUILTINS_OBJS += test-regex.o
+TEST_BUILTINS_OBJS += test-repository.o
  TEST_BUILTINS_OBJS += test-revision-walking.o
  TEST_BUILTINS_OBJS += test-run-command.o
  TEST_BUILTINS_OBJS += test-scrap-cache-tree.o
diff --git a/builtin/fsck.c b/builtin/fsck.c
index eca7900ee0..2abfb2d782 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -825,7 +825,7 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
  
  	check_connectivity();
  
-	if (core_commit_graph) {

+   if (!git_config_get_bool("core.commitgraph", &i) && i) {
struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
const char *verify_argv[] = { "commit-graph", "verify", NULL, 
NULL, NULL };
  
diff --git a/cache.h b/cache.h

index d49092d94d..8dc59bedba 100644
--- a/cache.h
+++ b/cache.h
@@ -813,7 +813,6 @@ extern char *git_replace_ref_base;
  
  extern int fsync_object_files;

  extern int core_preload_index;
-extern int core_commit_graph;
  extern int core_apply_sparse_checkout;
  extern int precomposed_unicode;
  extern int protect_hfs;
diff --git a/commit-graph.c b/commit-graph.c
index af97a10603..8eab199b1b 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -183,15 +183,15 @@ struct commit_graph *load_commit_graph_one(const char 
*graph_file)
exit(1);
  }
  
-static void prepare_commit_graph_one(const char *obj_dir)

+static void prepare_commit_graph_one(struct repository *r, const char *obj_dir)
  {
char *graph_name;
  
-	if (the_repository->objects->commit_graph)

+   if (r->objects->commit_graph)
return;
  
  	graph_name = get_commit_graph_filename(obj_dir);

-   the_repository->objects->commit_graph =
+   r->objects->commit_graph =
load_commit_graph_one(graph_name);
  
  	FREE_AND_NULL(graph_name);

@@ -203,26 +203,34 @@ static void prepare_commit_graph_one(const char *obj_dir)
   * On the first invocation, this function attemps to load the commit
   * graph if the_repository is configured to have one.
   */
-static int prepare_commit_graph(void)
+static int prepare_commit_graph(struct repository *r)
  {
struct alternate_object_database *alt;
char *obj_dir;
+   int config_value;
  
-	if (the_repository->objects->commit_graph_attempted)

-   return !!the_repository->objects->commit_graph;
-   the_repository->objects->commit_graph_attempted = 1;
+   if (r->objects->commit_graph_attempted)
+   return !!r->objects->commit_graph;
+   r->objects->commit_graph_attempted = 1;
  
-	if (!core_commit_graph)

+   if (repo_config_get_bool(r, "core.commitgraph", &config_value) ||
+   !config_value)
+   /*
+* This repository is not configured to use commit graphs, so
+* do not load one. (But report commit_graph_attempted anyway
+* so that commit graph loading is not attempted again for this
+* repository.)
+*/


I reacted first to complain about this extra config lookup, but it is 
only run once per repository, so that should be fine.



return 0;
  
-	obj_dir = get_object_directory();

-   prepare_commit_graph_one(obj_dir);
-   prepare_alt_odb(the_repository);
-   for (alt = the_repository->objects->alt_odb_list;
-!the_repository->objects->commit_graph && alt;
+   obj_dir = r->objects->objectdir;
+   prepare_c

Re: [PATCH v2 on ds/commit-graph-fsck 0/6] Object store refactoring: commit graph

2018-07-09 Thread Derrick Stolee




On 7/9/2018 8:30 PM, Derrick Stolee wrote:

On 7/9/2018 6:27 PM, Junio C Hamano wrote:

Jonathan Tan  writes:


This is on ds/commit-graph-fsck.

I saw that ds/commit-graph-fsck has been updated to the latest version
(v7, including "gc.writeCommitGraph"), so I've rebased my changes on 
top

of that branch. There were some mechanical changes needed during the
rebase, so I'm sending the rebased patches out.

I've also added a patch (patch 1) that removes some duplication of
implementation that Junio talked about in [1].

[1] 
https://public-inbox.org/git/xmqqefgtmrgi@gitster-ct.c.googlers.com/ 


While attempting to merge this topic to 'pu', I noticed that you and
Derrick are perhaps playing a game of whack-a-mole by you getting
rid of core_commit_graph global and making it a per in-core
repository instance, while Derrick adding core_multi_pack_index,
making it necessary for yet another round of similar clean-up?


We did have collisions with Jonathan's v1, but this v2 is on my latest 
commit-graph things so should not have conflicts.


The core_commit_graph variable appears to still be global (do we have 
config storage in the_repository yet?) so core_multi_pack_index is 
similar.


I do put the multi_pack_index pointer inside the_repository->objects, 
so the equivalent of this series will not be necessary for the MIDX 
series.



This series looks good to me, so please add "Reviewed-by: Derrick 
Stolee "


I think we are set for another series on top of this one that lets the 
commit-graph feature handle arbitrary repositories (pass a 'struct 
repository *r' in all of the functions).




Of course, after I send this message I see that my inbox had poorly 
threaded this patch and item 6/6 was below these messages. Patch 6/6 
does convert this variable and passes an arbitrary repository.


I'll update the MIDX patch in v4 to use that model for the config setting.

Re: [PATCH v2 on ds/commit-graph-fsck 0/6] Object store refactoring: commit graph

2018-07-09 Thread Derrick Stolee


On 7/9/2018 6:27 PM, Junio C Hamano wrote:

Jonathan Tan  writes:


This is on ds/commit-graph-fsck.

I saw that ds/commit-graph-fsck has been updated to the latest version
(v7, including "gc.writeCommitGraph"), so I've rebased my changes on top
of that branch. There were some mechanical changes needed during the
rebase, so I'm sending the rebased patches out.

I've also added a patch (patch 1) that removes some duplication of
implementation that Junio talked about in [1].

[1] https://public-inbox.org/git/xmqqefgtmrgi@gitster-ct.c.googlers.com/

While attempting to merge this topic to 'pu', I noticed that you and
Derrick are perhaps playing a game of whack-a-mole by you getting
rid of core_commit_graph global and making it a per in-core
repository instance, while Derrick adding core_multi_pack_index,
making it necessary for yet another round of similar clean-up?


We did have collisions with Jonathan's v1, but this v2 is on my latest 
commit-graph things so should not have conflicts.


The core_commit_graph variable appears to still be global (do we have 
config storage in the_repository yet?) so core_multi_pack_index is similar.


I do put the multi_pack_index pointer inside the_repository->objects, so 
the equivalent of this series will not be necessary for the MIDX series.



This series looks good to me, so please add "Reviewed-by: Derrick Stolee 
"


I think we are set for another series on top of this one that lets the 
commit-graph feature handle arbitrary repositories (pass a 'struct 
repository *r' in all of the functions).


Thanks,

-Stolee

Re: [PATCH v2 1/6] commit-graph: refactor preparing commit graph

2018-07-09 Thread Derrick Stolee


On 7/9/2018 5:41 PM, Stefan Beller wrote:

Hi Jonathan,
On Mon, Jul 9, 2018 at 1:44 PM Jonathan Tan  wrote:

Two functions in the code (1) check if the repository is configured for
commit graphs, (2) call prepare_commit_graph(), and (3) check if the
graph exists. Move (1) and (3) into prepare_commit_graph(), reducing
duplication of code.

Signed-off-by: Jonathan Tan 




  static int prepare_commit_graph_run_once = 0;
-static void prepare_commit_graph(void)
+
+/*
+ * Return 1 if commit_graph is non-NULL, and 0 otherwise.
+ *
+ * On the first invocation, this function attemps to load the commit
+ * graph if the_repository is configured to have one.

and as we talk about in-memory commit graph (and not some
stale file that may still be around on the fs), we can assertly return
0 when core_commit_graph is false.

Makes sense!


@@ -337,22 +348,17 @@ static int parse_commit_in_graph_one(struct commit_graph 
*g, struct commit *item

  int parse_commit_in_graph(struct commit *item)
  {
-   if (!core_commit_graph)
+   if (!prepare_commit_graph())
 return 0;
-
-   prepare_commit_graph();
-   if (commit_graph)
-   return parse_commit_in_graph_one(commit_graph, item);
-   return 0;
+   return parse_commit_in_graph_one(commit_graph, item);

Makes sense.


  }

  void load_commit_graph_info(struct commit *item)
  {
 uint32_t pos;
-   if (!core_commit_graph)
+   if (!prepare_commit_graph())
 return;
-   prepare_commit_graph();
-   if (commit_graph && find_commit_in_graph(item, commit_graph, &pos))
+   if (find_commit_in_graph(item, commit_graph, &pos))
 fill_commit_graph_info(item, commit_graph, pos);

here too,

This is
Reviewed-by: Stefan Beller 


These changes make sense. Thanks!

Re: I can do past and feature commits. It is a bug?

2018-07-09 Thread Jonathan Nieder

+the public git mailing list, git-secur...@googlegroups.com -> bcc
Hi,

Lin Terry wrote:

> I can do past and feature commits. It is a bug?
>
> Check out my gitgub page.
> https://github.com/terrylinooo
>
> You can see a LOVE, they are past commits I commited yesterday.

The ability to set the author and committer date freely is an
intentional capability.  See the 'Commit Information' section of "git
help commit-tree", for example.

If you'd like the ability to prove that an object was pushed by a
particular person at a particular time, you might like the push
certificates feature.  The pusher is authenticated using a GPG
signature; the push time can be recorded by the server in its
pre-receive or post-receive hook (so you still have to trust the
server for that).  See the description of --signed in "git help push"
and of hooks in "git help receive-pack" for more details on that
subject.

Thanks and hope that helps,
Jonathan

Re: [PATCH 2/2] t3430: update to test with custom commentChar

2018-07-09 Thread brian m. carlson

On Mon, Jul 09, 2018 at 09:48:55PM +0300, Daniel Harding wrote:
> Hello brian,
> 
> On Mon, 09 Jul 2018 at 00:02:00 +0300, brian m. carlson wrote:
> > Should this affect the "# Merge the topic branch" line (and the "# C",
> > "# E", and "# H" lines in the next test) that appears below this?  It
> > would seem those would qualify as comments as well.
> 
> I intentionally did not change that behavior for two reasons:
> 
> a) from a Git perspective, comment characters are only effectual for
> comments if they are the first character in a line
> 
> and
> 
> b) there are places where a '#' character from the todo list is actually
> parsed and used e.g. [0] and [1].  I have not yet gotten to the point of
> grokking what is going on there, so I didn't want to risk breaking something
> I didn't understand.  Perhaps Johannes could shed some light on whether the
> cases you mentioned should be changed to use the configured commentChar or
> not.

Fair enough.  Thanks for the explanation.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: [PATCH 07/17] commit: increase commit message buffer size

2018-07-09 Thread brian m. carlson

On Mon, Jul 09, 2018 at 10:45:33AM -0700, Junio C Hamano wrote:
> Derrick Stolee  writes:
> 
> > On 7/8/2018 7:36 PM, brian m. carlson wrote:
> >> diff --git a/refs/files-backend.c b/refs/files-backend.c
> >> index a9a066dcfb..252f835bae 100644
> >> --- a/refs/files-backend.c
> >> +++ b/refs/files-backend.c
> >> @@ -1587,7 +1587,7 @@ static int log_ref_write_fd(int fd, const struct 
> >> object_id *old_oid,
> >>char *logrec;
> >>msglen = msg ? strlen(msg) : 0;
> >> -  maxlen = strlen(committer) + msglen + 100;
> >> +  maxlen = strlen(committer) + msglen + 200;
> >>logrec = xmalloc(maxlen);
> >>len = xsnprintf(logrec, maxlen, "%s %s %s\n",
> >>oid_to_hex(old_oid),
> >
> > nit: 100 is not enough anymore, but wasn't a very descriptive
> > value. 200 may be enough now, but I'm not sure why.

200 is definitely enough.  Suppose we had a message consisting entirely
of SHA-1 hashes (5, at 20 bytes a piece).  If our new hash is 32 bytes
long, then it would require at most 160 bytes.

I only noticed this because the old code segfaulted.  My approach to
using a 32-byte hash was to set it up, do some basic tests, find out
what crashed, and fix it.  Most of this series is the basics necessary
to get the most rudimentary functionality out of a 32-byte Git,
excluding the index pieces, which are necessarily inelegant.

I didn't include them because there are other ways to implement the
changes which are more elegant in some ways and less elegant in other
ways, and I want to think more about it before I send them in.

> As Brandon alludes to downthread, we probably should use strbuf for
> things like this these days, so a preliminary clean-up to do so is
> probably a welcome change to sneak in and rebase this series on top
> of.

Sure, I agree that would be a better change, and I'm happy to reroll
with that.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

signature.asc
Description: PGP signature

Re: [PATCH 01/17] cache: update object ID functions for the_hash_algo

2018-07-09 Thread brian m. carlson

On Sun, Jul 08, 2018 at 09:31:42PM -0700, Jacob Keller wrote:
> On Sun, Jul 8, 2018 at 9:05 PM Eric Sunshine  wrote:
> >
> > On Sun, Jul 8, 2018 at 10:38 PM Jacob Keller  wrote:
> > > On Sun, Jul 8, 2018 at 4:39 PM brian m. carlson
> > >  wrote:
> > > >  static inline int oidcmp(const struct object_id *oid1, const struct 
> > > > object_id *oid2)
> > > >  {
> > > > -   return hashcmp(oid1->hash, oid2->hash);
> > > > +   return memcmp(oid1->hash, oid2->hash, the_hash_algo->rawsz);
> > > >  }
> > >
> > > Just curious, what's the reasoning for not using the hashcmp anymore?
> >
> > hashcmp() is specific to SHA-1 (for instance, it hardocdes
> > GIT_SHA1_RAWSZ). oidcmp() is meant as the hash-agnostic replacement
> > for hashcmp(), so it doesn't make sense to continue implementing
> > oidcmp() in terms of hashcmp() (the latter of which will eventually be
> > retired, presumably).
> 
> Fair. I just saw that hashcmp was also updated to use the_hash_algo,
> but if we're going to drop it eventually, then there's zero reason to
> keep implementing oidcmp in terms of it, so... makes sense to me!

Actually, this reminded me that I have a patch that I had forgotten
about in my next series that updates hashcmp.  I'll squash that in in my
reroll, and undo this change.

As a bonus, it also has a nicer commit message which I will include
explaining why this is necessary.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: Subscribing Apple people to git-secur...@googlegroups.com

2018-07-09 Thread Jonathan Nieder

+git@vger
Jeff King wrote:
> On Tue, Jul 03, 2018 at 08:48:14AM -0700, Jonathan Nieder wrote:

>> Administrivia: do you mind if I bounce these messages to some archived
>> list, either git@vger.kernel.org or git-security?  Or if we'd prefer
>> to avoid the noise from that, do you mind if I work with Eric Wong to
>> get them injected in the https://public-inbox.org/ archive?
>
> I don't mind at all. I'm actually going to work later today on preparing
> other messages from the security list to go to the public-inbox.org
> archive, so that might pave the way.

Thanks, done.

This doesn't work as a trial run for
https://public-inbox.org/meta/20180703160910.gb51...@aiede.svl.corp.google.com,
since I just bounced the mails.  So it won't work for threads like the
326-message thread you mentioned, but it should work in more modest
cases.

It seems that vger only delivered some of the messages.  I suspect
it's related to DMARC.

Jonathan

Re: [PATCH 0/4] Use oid_object_info() instead of read_object_file()

2018-07-09 Thread Junio C Hamano

Оля Тележная   writes:

> Hello everyone,
> This is my new attempt to start using oid_object_info_extended() in
> ref-filter. You could look at previous one [1] [2] but it is not
> necessary.

Yup, it sounds like a sensible thing to do to try asking object-info
helper instead of reading the whole object in-core and inspecting it
ourselves when we can avoid it.

Re: [PATCH 2/4] ref-filter: add empty values to technical fields

2018-07-09 Thread Junio C Hamano

Olga Telezhnaya  writes:

> Atoms like "align" or "end" do not have string representation.
> Earlier we had to go and parse whole object with a hope that we
> could fill their string representations. It's easier to fill them
> with an empty string before we start to work with whole object.
>
> Signed-off-by: Olga Telezhnaia 
> ---

Just being curious, but is there any meaningful relationship between
what was labelled as SOURCE_NONE in the previous step and what this
step calls "technical fields"?  Things like "upstream" (which is not
affected by the contents of the object, but is affected by the ref
in question) and "if" (which merely exists to construct the language
syntax) would fall into quite different category, so one might be
subset/superset of the other, but I am wondering if we can take
advantage of more table-driven approach taken by the previous step. 


>  ref-filter.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index 8611c24fd57d1..27733ef013bed 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1497,6 +1497,7 @@ static int populate_value(struct ref_array_item *ref, 
> struct strbuf *err)
>   refname = get_symref(atom, ref);
>   else if (starts_with(name, "upstream")) {
>   const char *branch_name;
> + v->s = "";
>   /* only local branches may have an upstream */
>   if (!skip_prefix(ref->refname, "refs/heads/",
>&branch_name))
> @@ -1509,6 +1510,7 @@ static int populate_value(struct ref_array_item *ref, 
> struct strbuf *err)
>   continue;
>   } else if (atom->u.remote_ref.push) {
>   const char *branch_name;
> + v->s = "";
>   if (!skip_prefix(ref->refname, "refs/heads/",
>&branch_name))
>   continue;
> @@ -1549,22 +1551,26 @@ static int populate_value(struct ref_array_item *ref, 
> struct strbuf *err)
>   continue;
>   } else if (starts_with(name, "align")) {
>   v->handler = align_atom_handler;
> + v->s = "";
>   continue;
>   } else if (!strcmp(name, "end")) {
>   v->handler = end_atom_handler;
> + v->s = "";
>   continue;
>   } else if (starts_with(name, "if")) {
>   const char *s;
> -
> + v->s = "";
>   if (skip_prefix(name, "if:", &s))
>   v->s = xstrdup(s);
>   v->handler = if_atom_handler;
>   continue;
>   } else if (!strcmp(name, "then")) {
>   v->handler = then_atom_handler;
> + v->s = "";
>   continue;
>   } else if (!strcmp(name, "else")) {
>   v->handler = else_atom_handler;
> + v->s = "";
>   continue;
>   } else
>   continue;
>
> --
> https://github.com/git/git/pull/520

Re: Subscribing Apple people to git-secur...@googlegroups.com

2018-07-09 Thread Jeff King

On Tue, Jul 03, 2018 at 08:48:14AM -0700, Jonathan Nieder wrote:

> Administrivia: do you mind if I bounce these messages to some archived
> list, either git@vger.kernel.org or git-security?  Or if we'd prefer
> to avoid the noise from that, do you mind if I work with Eric Wong to
> get them injected in the https://public-inbox.org/ archive?

I don't mind at all. I'm actually going to work later today on preparing
other messages from the security list to go to the public-inbox.org
archive, so that might pave the way.

-Peff

Re: Subscribing Apple people to git-secur...@googlegroups.com

2018-07-09 Thread Jeff King

On Mon, Jul 02, 2018 at 09:29:41PM +0200, Christian Couder wrote:

> When people complained a month ago about the MacOS package on
> https://git-scm.com/ not being up-to-date after the Git security
> release, I got in touch with Apple people GitLab has been working with
> to see if they could help on this.

Unfortunately I don't think this will quite solve the issue we had, just
because people get their copy of Git in various ways. So Homebrew
updated pretty promptly, but people going to git-scm.com to find a
binary package were left without help. Likewise, this will help people
getting Git as part of XCode, but not people gettin the package from
git-scm.com.

All that said, I'm happy to get as many binary packagers into the loop
as early as possible. It can only help, even if it doesn't solve all
problems. :)

> Please add these addresses to the git-security mailing list:
> jerem...@apple.com
> akils...@apple.com
> dt-...@group.apple.com

Done.

> Please add these GitHub accounts to the cabal repo:
> jeremyhu

Done.

> productsecurityOSSapple

I couldn't find that account. Is it maybe a team name within the apple
org or something?

> I am also personally very happy with the Apple developers' willingness
> to get involved and help.

Yes, welcome aboard!

I hope that maybe they're also interested in reducing the overall diff
between upstream Git and what ships with XCode. Last time I looked
(which was admittedly a while ago), a lot of the changes seemed like
things that could probably be considered upstream.

-Peff

Re: Subscribing Apple people to git-secur...@googlegroups.com

2018-07-09 Thread Jeff King

On Mon, Jul 02, 2018 at 01:15:19PM -0700, Jeremy Huddleston Sequoia wrote:

> > I hope that maybe they're also interested in reducing the overall
> > diff between upstream Git and what ships with XCode. Last time I
> > looked (which was admittedly a while ago), a lot of the changes
> > seemed like things that could probably be considered upstream.
> 
> I'm very very interested in having reduced differences between what we
> ship in Xcode and what is upstream.  I've been maintaining a repo with
> our patches that I rebase as we move forward, in the hope that these
> changes might be useful to others and a derivative of them might
> eventually be accepted upstream.  See
> https://github.com/jeremyhu/git/commits/master for the current set of
> changes that are in our shipping git (currently on top of 2.17.1).

Thanks for sharing. Skimming over it, I see:

 - several of the changes look related to run-time relocation. There was
   a series that shipped in v2.18.0 related to this, so that may reduce
   your diff once you rebase.

 - The xcode_gitattributes() bits aren't likely to go upstream as-is.
   But possibly these could ship as a default $sysconfdir/gitattributes?

 - the rest look like assorted little fixes that probably could go
   upstream

-Peff

Re: Subscribing Apple people to git-secur...@googlegroups.com

2018-07-09 Thread Jeff King

On Mon, Jul 02, 2018 at 01:58:21PM -0700, Akilsrin wrote:

> Could “ProdsecOSS "  also be added to the
> git-security mailing list. It’s another account I control to ensure my
> team and I track open source bugs.
> 
> The git repo: could you add https://github.com/product-security-OSS
>  and
> https://github.com/Akilsrin  to the
> GitHub cabal repo please.

Done, done, and done.

-Peff

Re: Subscribing Apple people to git-secur...@googlegroups.com

2018-07-09 Thread Jonathan Nieder

Administrivia: do you mind if I bounce these messages to some archived
list, either git@vger.kernel.org or git-security?  Or if we'd prefer
to avoid the noise from that, do you mind if I work with Eric Wong to
get them injected in the https://public-inbox.org/ archive?

Hi,

Jeff King wrote:
> On Mon, Jul 02, 2018 at 01:15:19PM -0700, Jeremy Huddleston Sequoia wrote:

>> I'm very very interested in having reduced differences between what we
>> ship in Xcode and what is upstream.
[...]
> Thanks for sharing. Skimming over it, I see:
>
>  - several of the changes look related to run-time relocation. There was
>a series that shipped in v2.18.0 related to this, so that may reduce
>your diff once you rebase.
>
>  - The xcode_gitattributes() bits aren't likely to go upstream as-is.
>But possibly these could ship as a default $sysconfdir/gitattributes?
>
>  - the rest look like assorted little fixes that probably could go
>upstream

I agree with Peff's assessment.  I'd also like to emphasize that
upstream is happy to see an [FYI/PATCH] when you have a divergence,
which would provide a thread to reply to to figure out whether there's
some generalization that is suitable for upstream.  (For example,
maybe we want some Makefile knob to allow setting some baked-in
attributes.)

Thanks,
Jonathan

Re: Subscribing Apple people to git-secur...@googlegroups.com

2018-07-09 Thread Jeremy Huddleston Sequoia

+Akila

Hi,

Replies inline.

> On Jul 2, 2018, at 12:50, Jeff King  wrote:
> 
> On Mon, Jul 02, 2018 at 09:29:41PM +0200, Christian Couder wrote:
> 
>> When people complained a month ago about the MacOS package on
>> https://git-scm.com/ not being up-to-date after the Git security
>> release, I got in touch with Apple people GitLab has been working with
>> to see if they could help on this.
> 
> Unfortunately I don't think this will quite solve the issue we had, just
> because people get their copy of Git in various ways. So Homebrew
> updated pretty promptly, but people going to git-scm.com to find a
> binary package were left without help. Likewise, this will help people
> getting Git as part of XCode, but not people gettin the package from
> git-scm.com.
> 
> All that said, I'm happy to get as many binary packagers into the loop
> as early as possible. It can only help, even if it doesn't solve all
> problems. :)
> 
>> Please add these addresses to the git-security mailing list:
>>jerem...@apple.com
>>akils...@apple.com
>>dt-...@group.apple.com
> 
> Done.
> 
>> Please add these GitHub accounts to the cabal repo:
>>jeremyhu
> 
> Done.
> 
>>productsecurityOSSapple
> 
> I couldn't find that account. Is it maybe a team name within the apple
> org or something?

This is the account name I got from Akila.  Akila, can you please work with 
Jeff to get this sorted?  In the mean time, I have access.

> I am also personally very happy with the Apple developers' willingness
>> to get involved and help.
> 
> Yes, welcome aboard!
> 
> I hope that maybe they're also interested in reducing the overall diff
> between upstream Git and what ships with XCode. Last time I looked
> (which was admittedly a while ago), a lot of the changes seemed like
> things that could probably be considered upstream.

I'm very very interested in having reduced differences between what we ship in 
Xcode and what is upstream.  I've been maintaining a repo with our patches that 
I rebase as we move forward, in the hope that these changes might be useful to 
others and a derivative of them might eventually be accepted upstream.  See 
https://github.com/jeremyhu/git/commits/master for the current set of changes 
that are in our shipping git (currently on top of 2.17.1).

Thanks,
Jeremy


> 
> -Peff

Subscribing Apple people to git-secur...@googlegroups.com

2018-07-09 Thread Christian Couder

Hi Peff (and Jonathan),

When people complained a month ago about the MacOS package on
https://git-scm.com/ not being up-to-date after the Git security
release, I got in touch with Apple people GitLab has been working with
to see if they could help on this.

As the urgent issue was resolved when Tim Harper eventually generated
a new MacOS package, it took some time, but now the Apple persons in
CC are asking for the following:

Please add these addresses to the git-security mailing list:
jerem...@apple.com
akils...@apple.com
dt-...@group.apple.com

Please add these GitHub accounts to the cabal repo:
jeremyhu
productsecurityOSSapple

I think it could significantly improve things, even if Tim Harper
continues to prepare the MacOS package published on
https://git-scm.com/, as Apple people also release their own Git
versions at least along with XCode.

I am also personally very happy with the Apple developers' willingness
to get involved and help.

Thanks in advance,
Christian.

Re: [PATCH v2 on ds/commit-graph-fsck 0/6] Object store refactoring: commit graph

2018-07-09 Thread Junio C Hamano

Jonathan Tan  writes:

> This is on ds/commit-graph-fsck.
>
> I saw that ds/commit-graph-fsck has been updated to the latest version
> (v7, including "gc.writeCommitGraph"), so I've rebased my changes on top
> of that branch. There were some mechanical changes needed during the
> rebase, so I'm sending the rebased patches out.
>
> I've also added a patch (patch 1) that removes some duplication of
> implementation that Junio talked about in [1].
>
> [1] https://public-inbox.org/git/xmqqefgtmrgi@gitster-ct.c.googlers.com/

While attempting to merge this topic to 'pu', I noticed that you and
Derrick are perhaps playing a game of whack-a-mole by you getting
rid of core_commit_graph global and making it a per in-core
repository instance, while Derrick adding core_multi_pack_index,
making it necessary for yet another round of similar clean-up?

Re: [PATCH v3 01/20] linear-assignment: a function to solve least-cost assignment problems

2018-07-09 Thread Junio C Hamano

Johannes Schindelin  writes:

> Speaking of GitGitGadget: I just encoutered a problem with your
> `refs/notes/amlog` and I hope you can help me with that.
> ...
> When I ask `git notes --ref=refs/notes/gitster-amlog show
> 4cec3986f017d84c8d6a2c4233d2eba4a3ffa60d` (the SHA-1 is the one
> corresponding to `Message-Id: <...>` for that mail), it insists on
> outputting
>
>   5902152ab02291af4454f24a8ccaf2adddefc306

It is not uncommon for me to have to do "am" the same patch twice
when attempting to find the right branch/commit to base a change on,
so the reverse direction that abuses the notes mechanism to map
message id to resulting commits would be unreliable, especially
given that they may need to further go through "rebase -i" or manual
"cherry-pick " depending on the situation.

I am kind of surprised that the message-to-commit mapping still
records any data that is remotely useful (these days, I only use it
to run "show --notes=amlog" for commit-to-message mapping).  I do
not think I have anything special when amending the commit, but
amlog notes should be updated in both diretions for its entries to
stay correct across amending, I would think.

refs/notes/amlog problems, was Re: [PATCH v3 01/20] linear-assignment: a function to solve least-cost assignment problems

2018-07-09 Thread Johannes Schindelin

Hi Junio,

On Sun, 8 Jul 2018, Johannes Schindelin wrote:

> I just encoutered a problem with your `refs/notes/amlog` and I hope you
> can help me with that.
> 
> Concretely, I want GitGitGadget to be able to identify the commit that
> corresponds to a given mail that contained a patch (if it ever made it
> into `pu`), to automate all kinds of tedious things that I currently have
> to perform manually.
> 
> And here I hit a block: I am looking for the commit corresponding to
> aca087479b35cbcbd7c84c7ca3bcf556133d0548.1530274571.git.gitgitgad...@gmail.com
> 
> When I ask `git notes --ref=refs/notes/gitster-amlog show
> 4cec3986f017d84c8d6a2c4233d2eba4a3ffa60d` (the SHA-1 is the one
> corresponding to `Message-Id: <...>` for that mail), it insists on
> outputting
> 
>   5902152ab02291af4454f24a8ccaf2adddefc306
> 
> However, I cannot find that commit anywhere.
> 
> When I look for the commit in the same manual, tedious way that I want to
> automate, I find that it *is* in `pu`, but as
> 
>   5cf8e064747be2026bb23be37f84f2f0b2a31781
> 
> Even curiouser: when I now ask for the commit notes for both of those
> SHA-1s, I get back the correct, same Message-Id *for both of them*, which
> makes me think that it was recorded correctly, but then overwritten due to
> some process I don't understand.
> 
> Would you be able to shed light into this?

I think I reconstructed the culprit:

In https://github.com/git/git/commit/a7cddab6e8, your post-applypatch hook
added the note for commit 5902152ab02291af4454f24a8ccaf2adddefc306 that it
was generated from Message-Id:
,
and then https://github.com/git/git/commit/ff28c8f9283 added the note to
map that Message-Id back to that commit.

So far, so good!

But then, https://github.com/git/git/commit/81b08c718e9 indicates that you
ran an interactive rebase and amended the commit
5902152ab02291af4454f24a8ccaf2adddefc306 and the result was a new commit
5cf8e064747be2026bb23be37f84f2f0b2a31781 that was then also mapped to that
Message-Id.

And obviously, you lack a post-rewrite hook a la

```sh
refopt=--ref=refs/notes/amlog
while read old new rest
do
mid="$(git notes $refopt show $old 2>/dev/null)" &&
git notes $refopt set -m "$mid" $new
done
```

I was pretty happy to figure that out all on my own, and already on my way
to come up with that post-rewrite hook and a script to parse all of the
commits in refs/notes/amlog whose commit message contains `commit --amend`
to fix those problems, but before starting, I wanted to sanity check the
oldest such commit: https://github.com/git/git/commit/49bc3858e3c

You will be readily able to verify that it maps the commit
73bfebd43e14bcc1502577c0933b6a16ad540b99 to Message-Id:
<20170619175605.27864-3-phillip.w...@talktalk.net>, but that 7c1a3dcf23e
(which corresponds to that Message-Id) maps to
f64760904766db662badf1256923532b9e1a6ebd. So yes, there is the same
problem with this mapping, and we need to fix it.

*However*. Neither https://github.com/git/git/commit/73bfebd43e1 nor
https://github.com/git/git/commit/f6476090476 show any commit!

Does that mean that the patch with that Message-Id never made it into
`master` and was simply dropped and gc'ed at some stage?

Actually, no:
https://public-inbox.org/git/20170619175605.27864-3-phillip.w...@talktalk.net/
corresponds quite clearly to
https://github.com/git/git/commit/1ceb9dfab7e

Now, that commit message was clearly edited by you (I note the capital "A"
in Phillip's "Add" vs your lower-case "a" in "add"), but the patch
quite obviously made it into our code based in its original shape.

So I looked for the commit notes for that commit, but there aren't any!

To summarize, there are two commits recorded for that Message-Id, the
later one not mapped back, and neither is the correct commit that made it
into `master`.

It would be nice to figure out what went wrong there, and how to fix it
for the future (and also to fix up the existing mis-mappings in `amlog`).

However, at this stage I really have not enough information at my hands,
even with as much effort as I spent so far to figure out where my patch
went (which started this bug hunt). Could you kindly spend some time on
that? Otherwise, `amlog` is a lot less useful than it could otherwise be.

Thanks,
Dscho

P.S.: funny side note: it would appear that the rewritten notes all get
the author of the patch author, look e.g. at the author of
https://github.com/git/git/commit/81b08c718e97

Re: [PATCH 0/6] Compile cleanly in pedantic mode

2018-07-09 Thread Beat Bolli

On 09.07.18 23:45, Junio C Hamano wrote:
> Beat Bolli  writes:
> 
>> While developing 6aaded550 ("builtin/config: work around an unsized
>> array forward declaration", 2018-07-05), I have compiled Git with
>> CFLAGS="-std=c99 -pedantic".
> 
> Nicely done.  
> 
> With these 6 patches and the USE_PARENCE_AROUND_GETTEXT_N hack, the
> forward decl of the unsized static array you dealt with separately
> becomes the only remaining violation in the codebase, which is good.
> 
> Will queue.  Thanks.

Thanks!

Beat

Re: [PATCH v2 on ds/commit-graph-fsck 0/6] Object store refactoring: commit graph

2018-07-09 Thread Stefan Beller

Hi Jonathan,

> This is on ds/commit-graph-fsck.
>
[...]
> I've also added a patch (patch 1) that removes some duplication of
> implementation that Junio talked about in [1].

I think this series is good;
Thanks,
Stefan

Re: [PATCH 0/6] Compile cleanly in pedantic mode

2018-07-09 Thread Junio C Hamano

Beat Bolli  writes:

> While developing 6aaded550 ("builtin/config: work around an unsized
> array forward declaration", 2018-07-05), I have compiled Git with
> CFLAGS="-std=c99 -pedantic".

Nicely done.  

With these 6 patches and the USE_PARENCE_AROUND_GETTEXT_N hack, the
forward decl of the unsized static array you dealt with separately
becomes the only remaining violation in the codebase, which is good.

Will queue.  Thanks.

Re: [PATCH v2 1/6] commit-graph: refactor preparing commit graph

2018-07-09 Thread Stefan Beller

Hi Jonathan,
On Mon, Jul 9, 2018 at 1:44 PM Jonathan Tan  wrote:
>
> Two functions in the code (1) check if the repository is configured for
> commit graphs, (2) call prepare_commit_graph(), and (3) check if the
> graph exists. Move (1) and (3) into prepare_commit_graph(), reducing
> duplication of code.
>
> Signed-off-by: Jonathan Tan 



>  static int prepare_commit_graph_run_once = 0;
> -static void prepare_commit_graph(void)
> +
> +/*
> + * Return 1 if commit_graph is non-NULL, and 0 otherwise.
> + *
> + * On the first invocation, this function attemps to load the commit
> + * graph if the_repository is configured to have one.

and as we talk about in-memory commit graph (and not some
stale file that may still be around on the fs), we can assertly return
0 when core_commit_graph is false.

Makes sense!

> @@ -337,22 +348,17 @@ static int parse_commit_in_graph_one(struct 
> commit_graph *g, struct commit *item
>
>  int parse_commit_in_graph(struct commit *item)
>  {
> -   if (!core_commit_graph)
> +   if (!prepare_commit_graph())
> return 0;
> -
> -   prepare_commit_graph();
> -   if (commit_graph)
> -   return parse_commit_in_graph_one(commit_graph, item);
> -   return 0;
> +   return parse_commit_in_graph_one(commit_graph, item);

Makes sense.

>  }
>
>  void load_commit_graph_info(struct commit *item)
>  {
> uint32_t pos;
> -   if (!core_commit_graph)
> +   if (!prepare_commit_graph())
> return;
> -   prepare_commit_graph();
> -   if (commit_graph && find_commit_in_graph(item, commit_graph, &pos))
> +   if (find_commit_in_graph(item, commit_graph, &pos))
> fill_commit_graph_info(item, commit_graph, pos);

here too,

This is
Reviewed-by: Stefan Beller

Re: [PATCH 4/6] sequencer.c: avoid empty statements at top level

2018-07-09 Thread Junio C Hamano

Beat Bolli  writes:

> The macro GIT_PATH_FUNC expands to a function definition that ends with
> a closing brace. Remove two extra semicolons.

Good.  Thanks.

>
> While at it, fix the example in path.h.
>
> Signed-off-by: Beat Bolli 
> ---
>  path.h  | 2 +-
>  sequencer.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/path.h b/path.h
> index 1ccd0373c9..fc9d3487a0 100644
> --- a/path.h
> +++ b/path.h
> @@ -147,7 +147,7 @@ extern void report_linked_checkout_garbage(void);
>  /*
>   * You can define a static memoized git path like:
>   *
> - *static GIT_PATH_FUNC(git_path_foo, "FOO");
> + *static GIT_PATH_FUNC(git_path_foo, "FOO")
>   *
>   * or use one of the global ones below.
>   */
> diff --git a/sequencer.c b/sequencer.c
> index 5354d4d51e..66e7073995 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -62,12 +62,12 @@ static GIT_PATH_FUNC(rebase_path_done, 
> "rebase-merge/done")
>   * The file to keep track of how many commands were already processed (e.g.
>   * for the prompt).
>   */
> -static GIT_PATH_FUNC(rebase_path_msgnum, "rebase-merge/msgnum");
> +static GIT_PATH_FUNC(rebase_path_msgnum, "rebase-merge/msgnum")
>  /*
>   * The file to keep track of how many commands are to be processed in total
>   * (e.g. for the prompt).
>   */
> -static GIT_PATH_FUNC(rebase_path_msgtotal, "rebase-merge/end");
> +static GIT_PATH_FUNC(rebase_path_msgtotal, "rebase-merge/end")
>  /*
>   * The commit message that is planned to be used for any changes that
>   * need to be committed following a user interaction.

Re: [RFC PATCH 4/6] sequencer.c: avoid empty statements at top level

2018-07-09 Thread Beat Bolli

On 09.07.18 23:34, Junio C Hamano wrote:
> Beat Bolli  writes:
> 
>> The marco GIT_PATH_FUNC expands to a complete statement including the
>> semicolon. Remove two extra trailing semicolons.
> 
> Wait a bit.  The observation in the log message and the
> implementation of GIT_PATH_FUNC() do not match.
> 
> #define GIT_PATH_FUNC(func, filename) \
> const char *func(void) \
> { \
> static char *ret; \
> if (!ret) \
> ret = git_pathdup(filename); \
> return ret; \
> }
> 
> The code generated does "include semicolon" but that is not why the
> caller should place semicolon after the closing parens.  Perhaps
> replace "including the semicolon." with something else, like ", and
> adding a semicolon after it not only is unnecessary but is wrong."
> or soemthing like that?

This message is fixed in the non-RFC series that I sent at 19:25 UTC. I
noticed the error after the message from Philip Oakley.

Beat

Re: [RFC PATCH 4/6] sequencer.c: avoid empty statements at top level

2018-07-09 Thread Junio C Hamano

Beat Bolli  writes:

> The marco GIT_PATH_FUNC expands to a complete statement including the
> semicolon. Remove two extra trailing semicolons.

Wait a bit.  The observation in the log message and the
implementation of GIT_PATH_FUNC() do not match.

#define GIT_PATH_FUNC(func, filename) \
const char *func(void) \
{ \
static char *ret; \
if (!ret) \
ret = git_pathdup(filename); \
return ret; \
}

The code generated does "include semicolon" but that is not why the
caller should place semicolon after the closing parens.  Perhaps
replace "including the semicolon." with something else, like ", and
adding a semicolon after it not only is unnecessary but is wrong."
or soemthing like that?

It is a bit unfortunate that we need to live with a slight uglyness
of the resulting source code, unlike e.g. define_commit_slab() that
can (and must) end with a semicolon, which gives us a more natural
look.  But that is a separate issue.

>
> Signed-off-by: Beat Bolli 
> ---
>  sequencer.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index 5354d4d51e..66e7073995 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -62,12 +62,12 @@ static GIT_PATH_FUNC(rebase_path_done, 
> "rebase-merge/done")
>   * The file to keep track of how many commands were already processed (e.g.
>   * for the prompt).
>   */
> -static GIT_PATH_FUNC(rebase_path_msgnum, "rebase-merge/msgnum");
> +static GIT_PATH_FUNC(rebase_path_msgnum, "rebase-merge/msgnum")
>  /*
>   * The file to keep track of how many commands are to be processed in total
>   * (e.g. for the prompt).
>   */
> -static GIT_PATH_FUNC(rebase_path_msgtotal, "rebase-merge/end");
> +static GIT_PATH_FUNC(rebase_path_msgtotal, "rebase-merge/end")
>  /*
>   * The commit message that is planned to be used for any changes that
>   * need to be committed following a user interaction.

Re: [PATCH 1/3] t7405: add a file/submodule conflict

2018-07-09 Thread Stefan Beller

On Sat, Jul 7, 2018 at 1:44 PM Elijah Newren  wrote:
>
> In the case of a file/submodule conflict, although both cannot exist at
> the same path, we expect both to be present somewhere for the user to be
> able to resolve the conflict with.  Add a testcase for this.
>
> Signed-off-by: Elijah Newren 
> ---
>  t/t7405-submodule-merge.sh | 56 ++
>  1 file changed, 56 insertions(+)
>
> diff --git a/t/t7405-submodule-merge.sh b/t/t7405-submodule-merge.sh
> index 7bfb2f498..95fd05d83 100755
> --- a/t/t7405-submodule-merge.sh
> +++ b/t/t7405-submodule-merge.sh
> @@ -279,4 +279,60 @@ test_expect_success 'recursive merge with submodule' '
>  grep "$(cat expect3)" actual > /dev/null)
>  '
>
> +# File/submodule conflict
> +#   Commit O: 
> +#   Commit A: path (submodule)
> +#   Commit B: path
> +#   Expected: path/ is submodule and file contents for B's path are somewhere
> +
> +test_expect_success 'setup file/submodule conflict' '
> +   test_create_repo file-submodule &&
> +   (
> +   cd file-submodule &&
> +
> +   git commit --allow-empty -m O &&
> +
> +   git branch A &&
> +   git branch B &&
> +
> +   git checkout B &&
> +   echo contents >path &&
> +   git add path &&
> +   git commit -m B &&
> +
> +   git checkout A &&
> +   test_create_repo path &&
> +   (
> +   cd path &&
> +
> +   echo hello >world &&

test_commit -C path &&

or do we need a dirty submodule specifically?
If so what is important, the untracked file or
would changes in the index be sufficient?

> +   git add world &&

when observing this one in verbose mode , you may be
asked to use 'git submodule add' instead
https://github.com/git/git/blob/master/builtin/add.c#L323

> +   git commit -m "submodule"

Not sure if we'd need the shell it is only test_commit,
the [submodule] add and commit, so maybe we can get away with
3 times -C ?

> +test_expect_failure 'file/submodule conflict' '

Which part fails here?

> +   test_when_finished "git -C file-submodule reset --hard" &&
> +   (
> +   cd file-submodule &&
> +
> +   git checkout A^0 &&
> +   test_must_fail git merge B^0 >out 2>err &&
> +
> +   git ls-files -s >out &&
> +   test_line_count = 2 out &&
> +   git ls-files -u >out &&
> +   test_line_count = 2 out &&
> +
> +   # path/ is still a submodule
> +   test_path_is_dir path/.git &&
> +
> +   echo Checking if contents from B:path showed up anywhere &&

This could be a comment instead?

> +   grep -q content * 2>/dev/null

Ah that is why we needed the specific echo above.

Sorry for being dense here, I am not quite following the last part of the test,
as it is unclear what ought to fail in this test.

Thanks,
Stefan

Re: [PATCH 2/2] sequencer: don't say BUG on bogus input

2018-07-09 Thread Junio C Hamano

Johannes Schindelin  writes:

> Hi Peff,
>
> On Mon, 9 Jul 2018, Jeff King wrote:
>
>> diff --git a/sequencer.c b/sequencer.c
>> index f692b2ef44..234666b980 100644
>> --- a/sequencer.c
>> +++ b/sequencer.c
>> @@ -3637,7 +3637,7 @@ int sequencer_pick_revisions(struct replay_opts *opts)
>>  return error(_("revision walk setup failed"));
>>  cmit = get_revision(opts->revs);
>>  if (!cmit || get_revision(opts->revs))
>> -return error("BUG: expected exactly one commit from 
>> walk");
>> +return error(_("empty commit set passed"));
>
> Should this not rather be
>
> - if (!cmit || get_revision(opts->revs))
> - return error("BUG: expected exactly one commit from 
> walk");
> + if (!cmit)
> + return error(_("empty commit set passed"));
> + if (get_revision(opts->revs))
> + return error(_("unexpected extra commit from walk"));

Good eyes.

Re: [PATCH] gc --auto: release pack files before auto packing

2018-07-09 Thread Junio C Hamano

Duy Nguyen  writes:

> On Sun, Jul 8, 2018 at 1:16 AM Kim Gybels  wrote:
>> Should I post a v3 that goes back to the original fix, but uses
>> test_i18ngrep instead of grep?
>
> Yes please. In my comment I did write we didn't need the repo anymore
> (or something along that line) which turns out to be wrong.
>
>> In addition to not breaking any tests, close_all_packs is already used
>> in a similar way in am and fetch just before running "gc --auto".
>>
>> -Kim

Sound good.  

I recall that "clear repo should treat the_repository special" was
discussed when we saw the patch that became 74373b5f ("repository:
fix free problem with repo_clear(the_repository)", 2018-05-10),
instead of treating only the index portion specially.  Perhaps it
was a more correct approach after all?

Re: Git 2.18: RUNTIME_PREFIX... is it working?

2018-07-09 Thread Junio C Hamano

Jeff King  writes:

> On Sun, Jul 08, 2018 at 11:52:22PM +0200, Johannes Schindelin wrote:
>
>> Now, if you care to have a look at Dan's (and my) patches to implement
>> RUNTIME_PREFIX so that it looks for a directory *relative to the Git
>> binary*, you will see that it is far from portable. In fact, it is very
>> definitely not portable, and needs specific support for *every single
>> supported Operating System*. And while we covered a lot, we did not cover
>> all of them.
>> 
>> So unfortunately, it is impossible to make it the default, I am afraid.
>
> Would it be reasonable to make RUNTIME_PREFIX the default on systems
> where we _do_ have that support? AFAIK there is no downside to having it
> enabled (minus a few syscalls to find the prefix, I suppose, but I
> assume that's negligible).
>
> I.e., a patch to config.mak.uname (and possibly better support for
> _disabling_ it, though I think "make RUNTIME_PREFIX=" would probably
> work).

I think that is a sensible approach.

Re: [PATCH v3 16/20] range-diff --dual-color: work around bogus white-space warning

2018-07-09 Thread Junio C Hamano

Stefan Beller  writes:

> On Tue, Jul 3, 2018 at 4:26 AM Johannes Schindelin via GitGitGadget
>  wrote:
>>
>> From: Johannes Schindelin 
>>
>> When displaying a diff of diffs, it is possible that there is an outer
>> `+` before a context line. That happens when the context changed between
>> old and new commit. When that context line starts with a tab (after the
>> space that marks it as context line), our diff machinery spits out a
>> white-space error (space before tab), but in this case, that is
>> incorrect.
>>
>> Work around this by detecting that situation and simply *not* printing
>> the space in that case.
>
> ok. If that is the workaround that you deem to be the right thing for now.
> (I do not have an opinion if that is the right approach, or if we'd want
> to s/// for example.)
>
>> This is slightly improper a fix because it is conceivable that an
>> output_prefix might be configured with *just* the right length to let
>> that tab jump to a different tab stop depending whether we emit that
>> space or not.
>>
>> However, the proper fix would be relatively ugly and intrusive because
>> it would have to *weaken* the WS_SPACE_BEFORE_TAB option in ws.c.

I agree that weaking the error checking is a wrong solution.  Is the
root cause of this whole problem because for a diff of diff e.g.

  context that did not change between iterations
- context in old interation
-+whatever new contents added by old iteration
+ context in new interation updated by earlier step
++whatever new contents added by new iteration

there needs to be a way to tell the ws.c whitespace breakage
checking logic that the very first column is not interesting at all,
and the "+" before "whatever" and " " before "context" should be
considered to actually sit at the first (or zero-th) column of the
diff output to be checked, but there is no interface to tell the
machinery that wish, because there is no such need when inspecting a
diff of contents?  If the word "context" above were indented with HT,
I can understand that the one common between iterations would
trigger SP+HT violation that way.  Is that what is happening here?

Adding a way to tell that the apparent first column is to be ignored
to ws.c machinery (or arranging the caller to skip the first column)
may be more intrusive than it is worth, only to support this tool.
Ignoring the problem altogether and live with an incorrectly colored
SP-before-HT might be a less noisy but still acceptable solution
from that point of view, though.

I also wonder if we should be feeding the context lines to ws.c
machinery in the first place though.  In the above hypothetical
diff-of-diff output, I _think_ the only two lines we want to check
for ws.c breakage are the ones that begin with "whatever".  We may
find that both iterations are trying to introduce a ws breakage, or
we may find that old one had violation which the new one corrected.
A whitespace breakage on "context" lines, whether they are the ones
being removed by the patch or the ones staying the same across the
patch, is not worth painting---the normal diff-of-contents do not
by default show them as violation, no?

Re: Unexpected behavior with :/ references

2018-07-09 Thread William Chargin

Yep, I agree with your analysis. I'd be happy to test and commit this,
probably later today.

Thanks for your input, and for the patch!

Best,
WC

[PATCH v2 6/6] commit-graph: add repo arg to graph readers

2018-07-09 Thread Jonathan Tan

Add a struct repository argument to the functions in commit-graph.h that
read the commit graph. (This commit does not affect functions that write
commit graphs.)

Because the commit graph functions can now read the commit graph of any
repository, the global variable core_commit_graph has been removed.
Instead, the config option core.commitGraph is now read on the first
time in a repository that a commit is attempted to be parsed using its
commit graph.

This commit includes a test that exercises the functionality on an
arbitrary repository that is not the_repository.

Signed-off-by: Jonathan Tan 
---
 Makefile   |  1 +
 builtin/fsck.c |  2 +-
 cache.h|  1 -
 commit-graph.c | 60 ++
 commit-graph.h |  7 +--
 commit.c   |  6 +--
 config.c   |  5 ---
 environment.c  |  1 -
 ref-filter.c   |  2 +-
 t/helper/test-repository.c | 88 ++
 t/helper/test-tool.c   |  1 +
 t/helper/test-tool.h   |  1 +
 t/t5318-commit-graph.sh| 35 +++
 13 files changed, 168 insertions(+), 42 deletions(-)
 create mode 100644 t/helper/test-repository.c

diff --git a/Makefile b/Makefile
index 0cb6590f24..bb8bd67201 100644
--- a/Makefile
+++ b/Makefile
@@ -719,6 +719,7 @@ TEST_BUILTINS_OBJS += test-prio-queue.o
 TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-ref-store.o
 TEST_BUILTINS_OBJS += test-regex.o
+TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
 TEST_BUILTINS_OBJS += test-run-command.o
 TEST_BUILTINS_OBJS += test-scrap-cache-tree.o
diff --git a/builtin/fsck.c b/builtin/fsck.c
index eca7900ee0..2abfb2d782 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -825,7 +825,7 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
 
check_connectivity();
 
-   if (core_commit_graph) {
+   if (!git_config_get_bool("core.commitgraph", &i) && i) {
struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
const char *verify_argv[] = { "commit-graph", "verify", NULL, 
NULL, NULL };
 
diff --git a/cache.h b/cache.h
index d49092d94d..8dc59bedba 100644
--- a/cache.h
+++ b/cache.h
@@ -813,7 +813,6 @@ extern char *git_replace_ref_base;
 
 extern int fsync_object_files;
 extern int core_preload_index;
-extern int core_commit_graph;
 extern int core_apply_sparse_checkout;
 extern int precomposed_unicode;
 extern int protect_hfs;
diff --git a/commit-graph.c b/commit-graph.c
index af97a10603..8eab199b1b 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -183,15 +183,15 @@ struct commit_graph *load_commit_graph_one(const char 
*graph_file)
exit(1);
 }
 
-static void prepare_commit_graph_one(const char *obj_dir)
+static void prepare_commit_graph_one(struct repository *r, const char *obj_dir)
 {
char *graph_name;
 
-   if (the_repository->objects->commit_graph)
+   if (r->objects->commit_graph)
return;
 
graph_name = get_commit_graph_filename(obj_dir);
-   the_repository->objects->commit_graph =
+   r->objects->commit_graph =
load_commit_graph_one(graph_name);
 
FREE_AND_NULL(graph_name);
@@ -203,26 +203,34 @@ static void prepare_commit_graph_one(const char *obj_dir)
  * On the first invocation, this function attemps to load the commit
  * graph if the_repository is configured to have one.
  */
-static int prepare_commit_graph(void)
+static int prepare_commit_graph(struct repository *r)
 {
struct alternate_object_database *alt;
char *obj_dir;
+   int config_value;
 
-   if (the_repository->objects->commit_graph_attempted)
-   return !!the_repository->objects->commit_graph;
-   the_repository->objects->commit_graph_attempted = 1;
+   if (r->objects->commit_graph_attempted)
+   return !!r->objects->commit_graph;
+   r->objects->commit_graph_attempted = 1;
 
-   if (!core_commit_graph)
+   if (repo_config_get_bool(r, "core.commitgraph", &config_value) ||
+   !config_value)
+   /*
+* This repository is not configured to use commit graphs, so
+* do not load one. (But report commit_graph_attempted anyway
+* so that commit graph loading is not attempted again for this
+* repository.)
+*/
return 0;
 
-   obj_dir = get_object_directory();
-   prepare_commit_graph_one(obj_dir);
-   prepare_alt_odb(the_repository);
-   for (alt = the_repository->objects->alt_odb_list;
-!the_repository->objects->commit_graph && alt;
+   obj_dir = r->objects->objectdir;
+   prepare_commit_graph_one(r, obj_dir);
+   prepare_alt_odb(r);
+   for (alt = r->objects->alt_odb_list;
+!r->objects->commit_graph && alt;
 alt = alt->next)
-

[PATCH v2 5/6] commit-graph: store graph in struct object_store

2018-07-09 Thread Jonathan Tan

Instead of storing commit graphs in static variables, store them in
struct object_store. There are no changes to the signatures of existing
functions - they all still only support the_repository, and support for
other instances of struct repository will be added in a subsequent
commit.

Signed-off-by: Jonathan Tan 
---
 commit-graph.c | 40 +++-
 object-store.h |  3 +++
 object.c   |  5 +
 3 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 6d1bc4ff7c..af97a10603 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -183,24 +183,20 @@ struct commit_graph *load_commit_graph_one(const char 
*graph_file)
exit(1);
 }
 
-/* global storage */
-static struct commit_graph *commit_graph = NULL;
-
 static void prepare_commit_graph_one(const char *obj_dir)
 {
char *graph_name;
 
-   if (commit_graph)
+   if (the_repository->objects->commit_graph)
return;
 
graph_name = get_commit_graph_filename(obj_dir);
-   commit_graph = load_commit_graph_one(graph_name);
+   the_repository->objects->commit_graph =
+   load_commit_graph_one(graph_name);
 
FREE_AND_NULL(graph_name);
 }
 
-static int prepare_commit_graph_run_once = 0;
-
 /*
  * Return 1 if commit_graph is non-NULL, and 0 otherwise.
  *
@@ -212,9 +208,9 @@ static int prepare_commit_graph(void)
struct alternate_object_database *alt;
char *obj_dir;
 
-   if (prepare_commit_graph_run_once)
-   return !!commit_graph;
-   prepare_commit_graph_run_once = 1;
+   if (the_repository->objects->commit_graph_attempted)
+   return !!the_repository->objects->commit_graph;
+   the_repository->objects->commit_graph_attempted = 1;
 
if (!core_commit_graph)
return 0;
@@ -223,16 +219,16 @@ static int prepare_commit_graph(void)
prepare_commit_graph_one(obj_dir);
prepare_alt_odb(the_repository);
for (alt = the_repository->objects->alt_odb_list;
-!commit_graph && alt;
+!the_repository->objects->commit_graph && alt;
 alt = alt->next)
prepare_commit_graph_one(alt->path);
-   return !!commit_graph;
+   return !!the_repository->objects->commit_graph;
 }
 
 static void close_commit_graph(void)
 {
-   free_commit_graph(commit_graph);
-   commit_graph = NULL;
+   free_commit_graph(the_repository->objects->commit_graph);
+   the_repository->objects->commit_graph = NULL;
 }
 
 static int bsearch_graph(struct commit_graph *g, struct object_id *oid, 
uint32_t *pos)
@@ -342,7 +338,7 @@ int parse_commit_in_graph(struct commit *item)
 {
if (!prepare_commit_graph())
return 0;
-   return parse_commit_in_graph_one(commit_graph, item);
+   return parse_commit_in_graph_one(the_repository->objects->commit_graph, 
item);
 }
 
 void load_commit_graph_info(struct commit *item)
@@ -350,8 +346,8 @@ void load_commit_graph_info(struct commit *item)
uint32_t pos;
if (!prepare_commit_graph())
return;
-   if (find_commit_in_graph(item, commit_graph, &pos))
-   fill_commit_graph_info(item, commit_graph, pos);
+   if (find_commit_in_graph(item, the_repository->objects->commit_graph, 
&pos))
+   fill_commit_graph_info(item, 
the_repository->objects->commit_graph, pos);
 }
 
 static struct tree *load_tree_for_commit(struct commit_graph *g, struct commit 
*c)
@@ -379,7 +375,7 @@ static struct tree *get_commit_tree_in_graph_one(struct 
commit_graph *g,
 
 struct tree *get_commit_tree_in_graph(const struct commit *c)
 {
-   return get_commit_tree_in_graph_one(commit_graph, c);
+   return 
get_commit_tree_in_graph_one(the_repository->objects->commit_graph, c);
 }
 
 static void write_graph_chunk_fanout(struct hashfile *f,
@@ -696,15 +692,17 @@ void write_commit_graph(const char *obj_dir,
 
if (append) {
prepare_commit_graph_one(obj_dir);
-   if (commit_graph)
-   oids.alloc += commit_graph->num_commits;
+   if (the_repository->objects->commit_graph)
+   oids.alloc += 
the_repository->objects->commit_graph->num_commits;
}
 
if (oids.alloc < 1024)
oids.alloc = 1024;
ALLOC_ARRAY(oids.list, oids.alloc);
 
-   if (append && commit_graph) {
+   if (append && the_repository->objects->commit_graph) {
+   struct commit_graph *commit_graph =
+   the_repository->objects->commit_graph;
for (i = 0; i < commit_graph->num_commits; i++) {
const unsigned char *hash = 
commit_graph->chunk_oid_lookup +
commit_graph->hash_len * i;
diff --git a/object-store.h b/object-store.h
index f0b77146e4..39f4a3d5b9 100644
--- a/object-store.h
+++ b/object-store.h
@@ -106,6 +106,9

[PATCH v2 4/6] commit-graph: add free_commit_graph

2018-07-09 Thread Jonathan Tan

Signed-off-by: Jonathan Tan 
---
 builtin/commit-graph.c |  2 ++
 commit-graph.c | 24 ++--
 commit-graph.h |  2 ++
 3 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index c7d0db5ab4..0bf0c48657 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -115,6 +115,8 @@ static int graph_read(int argc, const char **argv)
printf(" large_edges");
printf("\n");
 
+   free_commit_graph(graph);
+
return 0;
 }
 
diff --git a/commit-graph.c b/commit-graph.c
index 5ba60f63f9..6d1bc4ff7c 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -231,16 +231,8 @@ static int prepare_commit_graph(void)
 
 static void close_commit_graph(void)
 {
-   if (!commit_graph)
-   return;
-
-   if (commit_graph->graph_fd >= 0) {
-   munmap((void *)commit_graph->data, commit_graph->data_len);
-   commit_graph->data = NULL;
-   close(commit_graph->graph_fd);
-   }
-
-   FREE_AND_NULL(commit_graph);
+   free_commit_graph(commit_graph);
+   commit_graph = NULL;
 }
 
 static int bsearch_graph(struct commit_graph *g, struct object_id *oid, 
uint32_t *pos)
@@ -1033,3 +1025,15 @@ int verify_commit_graph(struct repository *r, struct 
commit_graph *g)
 
return verify_commit_graph_error;
 }
+
+void free_commit_graph(struct commit_graph *g)
+{
+   if (!g)
+   return;
+   if (g->graph_fd >= 0) {
+   munmap((void *)g->data, g->data_len);
+   g->data = NULL;
+   close(g->graph_fd);
+   }
+   free(g);
+}
diff --git a/commit-graph.h b/commit-graph.h
index 674052bef4..94defb04a9 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -58,4 +58,6 @@ void write_commit_graph(const char *obj_dir,
 
 int verify_commit_graph(struct repository *r, struct commit_graph *g);
 
+void free_commit_graph(struct commit_graph *);
+
 #endif
-- 
2.18.0.203.gfac676dfb9-goog

[PATCH v2 3/6] commit-graph: add missing forward declaration

2018-07-09 Thread Jonathan Tan

Signed-off-by: Jonathan Tan 
---
 commit-graph.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/commit-graph.h b/commit-graph.h
index 506cb45fb1..674052bef4 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -5,6 +5,8 @@
 #include "repository.h"
 #include "string-list.h"
 
+struct commit;
+
 char *get_commit_graph_filename(const char *obj_dir);
 
 /*
-- 
2.18.0.203.gfac676dfb9-goog

[PATCH v2 on ds/commit-graph-fsck 0/6] Object store refactoring: commit graph

2018-07-09 Thread Jonathan Tan

This is on ds/commit-graph-fsck.

I saw that ds/commit-graph-fsck has been updated to the latest version
(v7, including "gc.writeCommitGraph"), so I've rebased my changes on top
of that branch. There were some mechanical changes needed during the
rebase, so I'm sending the rebased patches out.

I've also added a patch (patch 1) that removes some duplication of
implementation that Junio talked about in [1].

[1] https://public-inbox.org/git/xmqqefgtmrgi@gitster-ct.c.googlers.com/

Jonathan Tan (6):
  commit-graph: refactor preparing commit graph
  object-store: add missing include
  commit-graph: add missing forward declaration
  commit-graph: add free_commit_graph
  commit-graph: store graph in struct object_store
  commit-graph: add repo arg to graph readers

 Makefile   |   1 +
 builtin/commit-graph.c |   2 +
 builtin/fsck.c |   2 +-
 cache.h|   1 -
 commit-graph.c | 108 +
 commit-graph.h |  11 ++--
 commit.c   |   6 +--
 config.c   |   5 --
 environment.c  |   1 -
 object-store.h |   6 +++
 object.c   |   5 ++
 ref-filter.c   |   2 +-
 t/helper/test-repository.c |  88 ++
 t/helper/test-tool.c   |   1 +
 t/helper/test-tool.h   |   1 +
 t/t5318-commit-graph.sh|  35 
 16 files changed, 213 insertions(+), 62 deletions(-)
 create mode 100644 t/helper/test-repository.c

-- 
2.18.0.203.gfac676dfb9-goog

[PATCH v2 1/6] commit-graph: refactor preparing commit graph

2018-07-09 Thread Jonathan Tan

Two functions in the code (1) check if the repository is configured for
commit graphs, (2) call prepare_commit_graph(), and (3) check if the
graph exists. Move (1) and (3) into prepare_commit_graph(), reducing
duplication of code.

Signed-off-by: Jonathan Tan 
---
 commit-graph.c | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 212232e752..5ba60f63f9 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -200,15 +200,25 @@ static void prepare_commit_graph_one(const char *obj_dir)
 }
 
 static int prepare_commit_graph_run_once = 0;
-static void prepare_commit_graph(void)
+
+/*
+ * Return 1 if commit_graph is non-NULL, and 0 otherwise.
+ *
+ * On the first invocation, this function attemps to load the commit
+ * graph if the_repository is configured to have one.
+ */
+static int prepare_commit_graph(void)
 {
struct alternate_object_database *alt;
char *obj_dir;
 
if (prepare_commit_graph_run_once)
-   return;
+   return !!commit_graph;
prepare_commit_graph_run_once = 1;
 
+   if (!core_commit_graph)
+   return 0;
+
obj_dir = get_object_directory();
prepare_commit_graph_one(obj_dir);
prepare_alt_odb(the_repository);
@@ -216,6 +226,7 @@ static void prepare_commit_graph(void)
 !commit_graph && alt;
 alt = alt->next)
prepare_commit_graph_one(alt->path);
+   return !!commit_graph;
 }
 
 static void close_commit_graph(void)
@@ -337,22 +348,17 @@ static int parse_commit_in_graph_one(struct commit_graph 
*g, struct commit *item
 
 int parse_commit_in_graph(struct commit *item)
 {
-   if (!core_commit_graph)
+   if (!prepare_commit_graph())
return 0;
-
-   prepare_commit_graph();
-   if (commit_graph)
-   return parse_commit_in_graph_one(commit_graph, item);
-   return 0;
+   return parse_commit_in_graph_one(commit_graph, item);
 }
 
 void load_commit_graph_info(struct commit *item)
 {
uint32_t pos;
-   if (!core_commit_graph)
+   if (!prepare_commit_graph())
return;
-   prepare_commit_graph();
-   if (commit_graph && find_commit_in_graph(item, commit_graph, &pos))
+   if (find_commit_in_graph(item, commit_graph, &pos))
fill_commit_graph_info(item, commit_graph, pos);
 }
 
-- 
2.18.0.203.gfac676dfb9-goog

[PATCH v2 2/6] object-store: add missing include

2018-07-09 Thread Jonathan Tan

Signed-off-by: Jonathan Tan 
---
 object-store.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/object-store.h b/object-store.h
index d683112fd7..f0b77146e4 100644
--- a/object-store.h
+++ b/object-store.h
@@ -2,6 +2,9 @@
 #define OBJECT_STORE_H
 
 #include "oidmap.h"
+#include "list.h"
+#include "sha1-array.h"
+#include "strbuf.h"
 
 struct alternate_object_database {
struct alternate_object_database *next;
-- 
2.18.0.203.gfac676dfb9-goog

Re: [PATCH v3 17/20] range-diff: add a man page

2018-07-09 Thread Johannes Schindelin

Hi Stefan,

On Mon, 9 Jul 2018, Stefan Beller wrote:

> On Mon, Jul 9, 2018 at 1:00 PM Johannes Schindelin
>  wrote:
> >
> > On Mon, 9 Jul 2018, Stefan Beller wrote:
> >
> > > On Tue, Jul 3, 2018 at 4:26 AM Johannes Schindelin via GitGitGadget
> > >  wrote:
> > >
> > > > +'git range-diff' [--color=[]] [--no-color] []
> > > > +   [--dual-color] [--creation-factor=]
> > > > +   (   | ... |)
> > > > +
> > > > +DESCRIPTION
> > > > +---
> > > > +
> > > > +This command shows the differences between two versions of a patch
> > > > +series, or more generally, two commit ranges (ignoring merges).
> > >
> > > Does it completely ignore merges or does it die("not supported"), how is
> > > the user expected to cope with the accidental merge in the given range?
> >
> > It ignores merges. It does not reject them. It simply ignores them and
> > won't talk about them as a consequence.
> >
> > Could you suggest an improved way to say that?
> 
> Well that is what the patch said already; I was just dense in reading.
> I just tested it, and the commit with more than one parent itself is
> ignored (not showing up in the output), but the commits that are merged
> in are still considered.

So a more accurate wording would be "ignoring merge commits" rather than
"ignoring merges".

Makes sense?

> So giving range1 as f7761a5a065..0a5677f6f68 with
> 
>   0a5677f6f68 (Merge branch 'js/branch-diff' into pu, 2018-07-06)
>   f7761a5a065 (Merge branch 'jk/fsck-gitmodules-gently' into jch, 2018-07-06)
> 
> still produces cool output and with --word-diff it is even more amazing, as it
> just tells me a large part was s/branch-diff/range-diff/ :-)

I never thought about using `--word-diff` with `range-diff`.

Sadly, for me it gives rather stupid output, e.g.

4:  6927c11a311 ! 4:  ebf3fea2517 range-diff: make --dual-color the default mode
@@ -14,6 +14,15 @@
diff --git a/Documentation/git-range-diff.txt 
b/Documentation/git-range-diff.txt
--- a/Documentation/git-range-diff.txt
+++ b/Documentation/git-range-diff.txt
{+@@+}
{+ +}
{+ [verse]+}
{+ 'git range-diff' [--color=[]] [--no-color] []+}
{+- [--dual-color] [--creation-factor=]+}
{++ [--no-dual-color] [--creation-factor=]+}
{+  (   | ... |)+}
{+ +}
{+ DESCRIPTION+}
@@

 OPTIONS

> > > > +Algorithm
> > > > +-
> > > > +
> > > > +The general idea is this: we generate a cost matrix between the commits
> > > > +in both commit ranges, then solve the least-cost assignment.
> > >
> > > Can you say more about the generation of the cost matrix?
> > > I assume that it counts the number of lines added/deleted to make
> > > one patch into the other patch.
> >
> > I think that is correct.
> >
> > *reading the patch*
> >
> > Actually, no, I was wrong. For the cost matrix, the *length* of the diff
> > *of the diffs* is computed. Think of it as
> >
> > git diff --no-index <(git diff A^!) <(git diff B^!) | wc -l
> 
> So the matching is based only on diffs, but the output still takes
> the commit messages into account. So when diffing my series to the
> series that Junio applies, (merely adding his sign off,) would be a
> "cost of 0" in this context, but I still have output.

Exactly.

> > > Another spot to look at is further metadata, such as author and
> > > author-date, which are kept the same in a rebase workflow.
> >
> > I encourage you to offer that as an add-on patch series. Because what you
> > suggest is not necessary for my use cases, so I'd rather not spend time on
> > it.
> 
> Makes sense. When I stumble about this yet theoretical problem to
> materialize in practice I will send a patch. In my mind this is not
> another use case, but just an improved matching, with the matching that
> this series provides being good enough for now.

Sure.

Ciao,
Dscho

[PATCH v3] gc --auto: release pack files before auto packing

2018-07-09 Thread Kim Gybels

Teach gc --auto to release pack files before auto packing the repository
to prevent failures when removing them.

Also teach the test 'fetching with auto-gc does not lock up' to complain
when it is no longer triggering an auto packing of the repository.

Fixes https://github.com/git-for-windows/git/issues/500

Signed-off-by: Kim Gybels 
---

Changes since v2:
- revert fix back to v1: use close_all_packs instead of repo_clear
- use test_i18ngrep only for translated string

 builtin/gc.c | 1 +
 t/t5510-fetch.sh | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index ccfb1ceaeb..63b62ab57c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -612,6 +612,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
return -1;
 
if (!repository_format_precious_objects) {
+   close_all_packs(the_repository->objects);
if (run_command_v_opt(repack.argv, RUN_GIT_CMD))
return error(FAILED_RUN, repack.argv[0]);
 
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index e402aee6a2..6f1450eb69 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -828,9 +828,11 @@ test_expect_success 'fetching with auto-gc does not lock 
up' '
test_commit test2 &&
(
cd auto-gc &&
+   git config fetch.unpackLimit 1 &&
git config gc.autoPackLimit 1 &&
git config gc.autoDetach false &&
GIT_ASK_YESNO="$D/askyesno" git fetch >fetch.out 2>&1 &&
+   test_i18ngrep "Auto packing the repository" fetch.out &&
! grep "Should I try again" fetch.out
)
 '
-- 
2.18.0.windows.1

Re: [PATCH v4] grep.c: teach 'git grep --only-matching'

2018-07-09 Thread Taylor Blau

On Mon, Jul 09, 2018 at 03:33:47PM -0500, Taylor Blau wrote:
> [ ... ]
> ---
>  Documentation/git-grep.txt |  7 +-
>  builtin/grep.c |  6 +
>  grep.c | 51 ++
>  grep.h |  1 +
>  t/t7810-grep.sh| 15 +++
>  5 files changed, 63 insertions(+), 17 deletions(-)
>
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index 0de3493b80..a3049af1a3 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -17,7 +17,7 @@ SYNOPSIS
>  [-l | --files-with-matches] [-L | --files-without-match]
>  [(-O | --open-files-in-pager) []]
>  [-z | --null]
> -[-c | --count] [--all-match] [-q | --quiet]
> +[ -o | --only-matching ] [-c | --count] [--all-match] [-q | --quiet]
>  [--max-depth ]
>  [--color[=] | --no-color]
>  [--break] [--heading] [-p | --show-function]
> @@ -201,6 +201,11 @@ providing this option will cause it to die.
>   Output \0 instead of the character that normally follows a
>   file name.
>
> +-o::
> +--only-matching::
> + Print only the matched (non-empty) parts of a matching line, with each 
> such
> + part on a separate output line.

OK, it seems as if the consensus is (1) take the description as-is from
GNU grep, and (2) don't change the existing behavior of "git grep -o
'^'".

This patch does both of those things, and can be queued as 2/2 in this
series.

Thanks, everybody :-).

> --
> 2.18.0

Thanks,
Taylor

Re: [PATCH v2 2/2] fetch: send "refs/tags/" prefix upon CLI refspecs

2018-07-09 Thread Junio C Hamano

Brandon Williams  writes:

> On 07/09, Junio C Hamano wrote:
>> Brandon Williams  writes:
>> 
>> > I agree with this observation, though I'm a bit sad about it.  I think
>> > that having tag auto-following the default is a little confusing (and
>> > hurts perf[1] when using proto v2) but since thats the way its always been
>> > we'll have to live with it for now.  I think exploring changing the
>> > defaults might be a good thing to do in the future.  But for now we've
>> > had enough people comment on this lacking functionality that we should
>> > fix it.
>> >
>> > [1] Thankfully it doesn't completely undo what protocol v2 did, as we
>> > still are able to eliminate refs/changes or refs/pull or other various
>> > refs which significantly add to the number of refs advertised during
>> > fetch.
>> 
>> I thought JTan's <20180618231642.174650-1-jonathanta...@google.com>
>> showed us a way forward to reduce the overhead even further without
>> having to be sad ;-).  Am I mistaken?
>
> That's true, what Jonathan mentioned there would avoid having to send
> "refs/tags/*" when requesting the refs.  The question is do we wait on
> implementing that functionality (as another feature to fetch) or do we
> fix this now?

It's not like the earlier v2 protocol used to be super efficient and
correct, whose performance benefit is destroyed with this "fix"
irreversibly.  It was a fast but sometimes incorrect implementation,
and we'd protect users of the still-under-development feature with
these two patches while updating the protocol further in order to
become truly efficient and correct, so I do not see it a wrong move
to take a hit like this patch does in the meantime.

What I would see a wrong move would be to leave it very long as-is
after we apply this fix, and declare to flip the default not to
follow tags, using the performance hit as an excuse.

So I do not care too deeply either way, whether we wait to regain
efficiency or taking this safe fix as the first step, as long as it
is fixed in the longer term.  I had an impression that the sentiment
in the thread was that it was OK to accept the inefficiency for now
and fix it later, and I am personally fine with that approach.

[PATCH v4] grep.c: teach 'git grep --only-matching'

2018-07-09 Thread Taylor Blau

Teach 'git grep --only-matching', a new option to only print the
matching part(s) of a line.

For instance, a line containing the following (taken from README.md:27):

  (`man gitcvs-migration` or `git help cvs-migration` if git is

Is printed as follows:

  $ git grep --line-number --column --only-matching -e git -- \
README.md | grep ":27"
  README.md:27:7:git
  README.md:27:16:git
  README.md:27:38:git

The patch works mostly as one would expect, with the exception of a few
considerations that are worth mentioning here.

Like GNU grep, this patch ignores --only-matching when --invert (-v) is
given. There is a sensible answer here, but parity with the behavior of
other tools is preferred.

Because a line might contain more than one match, there are special
considerations pertaining to when to print line headers, newlines, and
how to increment the match column offset. The line header and newlines
are handled as a special case within the main loop to avoid polluting
the surrounding code with conditionals that have large blocks.

Signed-off-by: Taylor Blau 
---
 Documentation/git-grep.txt |  7 +-
 builtin/grep.c |  6 +
 grep.c | 51 ++
 grep.h |  1 +
 t/t7810-grep.sh| 15 +++
 5 files changed, 63 insertions(+), 17 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 0de3493b80..a3049af1a3 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -17,7 +17,7 @@ SYNOPSIS
   [-l | --files-with-matches] [-L | --files-without-match]
   [(-O | --open-files-in-pager) []]
   [-z | --null]
-  [-c | --count] [--all-match] [-q | --quiet]
+  [ -o | --only-matching ] [-c | --count] [--all-match] [-q | --quiet]
   [--max-depth ]
   [--color[=] | --no-color]
   [--break] [--heading] [-p | --show-function]
@@ -201,6 +201,11 @@ providing this option will cause it to die.
Output \0 instead of the character that normally follows a
file name.
 
+-o::
+--only-matching::
+   Print only the matched (non-empty) parts of a matching line, with each 
such
+   part on a separate output line.
+
 -c::
 --count::
Instead of showing every matched line, show the number of
diff --git a/builtin/grep.c b/builtin/grep.c
index 61bcaf6e58..228b83990f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -843,6 +843,8 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
OPT_BOOL_F('z', "null", &opt.null_following_name,
   N_("print NUL after filenames"),
   PARSE_OPT_NOCOMPLETE),
+   OPT_BOOL('o', "only-matching", &opt.only_matching,
+   N_("show only matching parts of a line")),
OPT_BOOL('c', "count", &opt.count,
N_("show the number of matches instead of matching 
lines")),
OPT__COLOR(&opt.color, N_("highlight matches")),
@@ -962,6 +964,10 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
if (!opt.pattern_list)
die(_("no pattern given."));
 
+   /* --only-matching has no effect with --invert. */
+   if (opt.invert)
+   opt.only_matching = 0;
+
/*
 * We have to find "--" in a separate pass, because its presence
 * influences how we will parse arguments that come before it.
diff --git a/grep.c b/grep.c
index 4ff8a73043..49a744f96b 100644
--- a/grep.c
+++ b/grep.c
@@ -51,6 +51,7 @@ void init_grep_defaults(void)
color_set(opt->color_match_selected, GIT_COLOR_BOLD_RED);
color_set(opt->color_selected, "");
color_set(opt->color_sep, GIT_COLOR_CYAN);
+   opt->only_matching = 0;
opt->color = -1;
opt->output = std_output;
 }
@@ -158,6 +159,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
opt->pattern_tail = &opt->pattern_list;
opt->header_tail = &opt->header_list;
 
+   opt->only_matching = def->only_matching;
opt->color = def->color;
opt->extended_regexp_option = def->extended_regexp_option;
opt->pattern_type_option = def->pattern_type_option;
@@ -1446,7 +1448,8 @@ static void show_line(struct grep_opt *opt, char *bol, 
char *eol,
  const char *name, unsigned lno, ssize_t cno, char sign)
 {
int rest = eol - bol;
-   const char *match_color, *line_color = NULL;
+   const char *match_color = NULL;
+   const char *line_color = NULL;
 
if (opt->file_break && opt->last_shown == 0) {
if (opt->show_hunk_mark)
@@ -1462,39 +1465,55 @@ static void show_line(struct grep_opt *opt, char *bol, 
char *eol,
opt->output(opt, "\n", 1);
}
}
-   show_line_header(opt, name, lno, cno, sign);
-   if (opt->color) {
+   if (!opt->only_matching) {
+

Re: Git 2.18: RUNTIME_PREFIX... is it working?

2018-07-09 Thread Johannes Schindelin

Hi Peff,

On Mon, 9 Jul 2018, Jeff King wrote:

> On Sun, Jul 08, 2018 at 11:52:22PM +0200, Johannes Schindelin wrote:
> 
> > Now, if you care to have a look at Dan's (and my) patches to implement
> > RUNTIME_PREFIX so that it looks for a directory *relative to the Git
> > binary*, you will see that it is far from portable. In fact, it is very
> > definitely not portable, and needs specific support for *every single
> > supported Operating System*. And while we covered a lot, we did not cover
> > all of them.
> > 
> > So unfortunately, it is impossible to make it the default, I am afraid.
> 
> Would it be reasonable to make RUNTIME_PREFIX the default on systems
> where we _do_ have that support? AFAIK there is no downside to having it
> enabled (minus a few syscalls to find the prefix, I suppose, but I
> assume that's negligible).
> 
> I.e., a patch to config.mak.uname (and possibly better support for
> _disabling_ it, though I think "make RUNTIME_PREFIX=" would probably
> work).

The obvious downside is that we would be a lot more likely to break one
side of the equation. At least right now, we have Git for Windows being a
prime user of RUNTIME_PREFIX (so breakages should be caught relatively
quickly), and macOS/Linux *not* being users of that feature (so breakages
in the non-RUNTIME_PREFIX code paths should be caught even quicker). By
turning on RUNTIME_PREFIX for the major platforms, the fringe platforms
are even further out on their own.

Ciao,
Dscho

Re: [PATCH 0/6] Compile cleanly in pedantic mode

2018-07-09 Thread Beat Bolli

On 09.07.18 21:25, Beat Bolli wrote:
> While developing 6aaded550 ("builtin/config: work around an unsized
> array forward declaration", 2018-07-05), I have compiled Git with
> CFLAGS="-std=c99 -pedantic".
> 
> This series fixes a few compiler warnings when compiling with these
> options.

As a small aside, I have also compiled all of Git with -std=c11 using
gcc 8.1. This didn't turn up any new warnings, so we're looking pretty
future-proof in this regard.

Cheers,
Beat

Re: [PATCH v3 17/20] range-diff: add a man page

2018-07-09 Thread Stefan Beller

On Mon, Jul 9, 2018 at 1:00 PM Johannes Schindelin
 wrote:
>
> Hi Stefan,
>
> On Mon, 9 Jul 2018, Stefan Beller wrote:
>
> > On Tue, Jul 3, 2018 at 4:26 AM Johannes Schindelin via GitGitGadget
> >  wrote:
> >
> > > +'git range-diff' [--color=[]] [--no-color] []
> > > +   [--dual-color] [--creation-factor=]
> > > +   (   | ... |)
> > > +
> > > +DESCRIPTION
> > > +---
> > > +
> > > +This command shows the differences between two versions of a patch
> > > +series, or more generally, two commit ranges (ignoring merges).
> >
> > Does it completely ignore merges or does it die("not supported"), how is
> > the user expected to cope with the accidental merge in the given range?
>
> It ignores merges. It does not reject them. It simply ignores them and
> won't talk about them as a consequence.
>
> Could you suggest an improved way to say that?

Well that is what the patch said already; I was just dense in reading.
I just tested it, and the commit with more than one parent itself is
ignored (not showing up in the output), but the commits that are merged
in are still considered. So giving range1 as
f7761a5a065..0a5677f6f68 with

  0a5677f6f68 (Merge branch 'js/branch-diff' into pu, 2018-07-06)
  f7761a5a065 (Merge branch 'jk/fsck-gitmodules-gently' into jch, 2018-07-06)

still produces cool output and with --word-diff it is even more amazing, as it
just tells me a large part was s/branch-diff/range-diff/ :-)

12:  cbc752c57ce ! 12:  7273cc64797 branch-diff: use color for the commit pairs
@@ -1,31 +1,28 @@
Author: Johannes Schindelin 

[-branch-diff:-]{+range-diff:+} use color for the commit pairs

Arguably the most important part of [-branch-diff's-]{+`git
range-diff`'s+} output is the
list of commits in the two branches, together with their relationships.

For that reason, tbdiff introduced color-coding that is pretty
intuitive, especially for unchanged patches (all dim yellow, like the
first line in `git show`'s output) vs modified patches (old commit is
red, new commit is green). Let's imitate that color scheme.

[-While at it, also copy tbdiff's change of the fragment color
to magenta.-]

Signed-off-by: Johannes Schindelin 
[-Signed-off-by: Junio C Hamano -]

Sorry for being offtopic here; I do not have a better suggestion than what
is already said.

> > I presume this is a boolean option, and can be turned off with
> > --no-dual-color, but not with --dual-color=no. Would it be worth to
> > give the --no-option here as well.
> > The more pressing question I had when reading this, is whether this
> > is the default.
>
> In the final patch (which I mulled about adding or not for a couple of
> weeks), the `--dual-color` mode is the default, and the man page talks
> about `--no-dual-color`.
>
> Do you want me to change this intermediate commit, even if that change
> will be reverted anyway?

No, I just wasn't aware of that part, yet, as I have seen some patch series
that add the man page/documentation as their final patch. This looked
so similar that I assumed this is the final man page. My bad!

> The coloring gives a strong hint of "pre" vs "post", i.e. old vs new: the
> changes that are only in the "old" patches are marked with a minus with a
> red background color, which only really makes sense if you think about
> these changes as "dropped" or "removed" from the "new" changes.
>
> So yes, it is really considered an older version, in my mind.
>
> Again, if you have suggestions how to improve my patch (giving rise to a
> "new" patch :-)), let's hear them.

I will send patches as I get more used to this new tool.

>
> > I think this comes from the notion of e.g. patch 4 ("range-diff: improve the
> > order of the shown commits "), that assume the user wants the range-diff
> > to be expressed with range2 as its "base range".
>
> No, it is motivated by the fact that we use -/+ markers to indicate
> differences between the "old" and the "new" patches.

And at this point in time we do not want to question the use of -/+ markers
for the next layer of abstraction. While +/- are well understood on
the patch level
we could argue for a different set of characters for diffs of diffs,
as that helps
to differentiate between the layers ("is it added in the diff or the
diff of the diff
due to e.g. different context?"), but then it would not recurse. (I am
not sure I
want to read diff of diffs of diffs)

Okay, for now I accept the terms of old and new patches, as I have no
better idea.

>
> > > +Algorithm
> > > +-
> > > +
> > > +The general idea is this: we generate a cost matrix between the commits
> > > +in both commit ranges, then solve the least-cost assignment.
> >
> > Can you say more about the generation of the cost matrix?
> > I assume that it counts the number of lines added/deleted to make
> > one patch into the other patch.
>
> I think that is correct.
>
> *reading the patch*
>
> Actually, no, I was wron

Re: [PATCH 2/2] sequencer: don't say BUG on bogus input

2018-07-09 Thread Johannes Schindelin

Hi Peff,

On Mon, 9 Jul 2018, Jeff King wrote:

> diff --git a/sequencer.c b/sequencer.c
> index f692b2ef44..234666b980 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -3637,7 +3637,7 @@ int sequencer_pick_revisions(struct replay_opts *opts)
>   return error(_("revision walk setup failed"));
>   cmit = get_revision(opts->revs);
>   if (!cmit || get_revision(opts->revs))
> - return error("BUG: expected exactly one commit from 
> walk");
> + return error(_("empty commit set passed"));

Should this not rather be

-   if (!cmit || get_revision(opts->revs))
-   return error("BUG: expected exactly one commit from 
walk");
+   if (!cmit)
+   return error(_("empty commit set passed"));
+   if (get_revision(opts->revs))
+   return error(_("unexpected extra commit from walk"));

Ciao,
Dscho

Re: [PATCH 0/6] Add merge recursive testcases with undetected conflicts

2018-07-09 Thread Elijah Newren

On Mon, Jul 9, 2018 at 10:53 AM, Junio C Hamano  wrote:
> Elijah Newren  writes:
>
>> When a merge succeeds, we expect the resulting contents to depend only
>> upon the trees and blobs of the branches involved and of their merge
>> base(s).  Unfortunately, there are currently about half a dozen cases
>> where the contents of a "successful" merge depend on the relative
>> commit timestamps of the merge bases.  Document these with testcases.
>>
>> (This series came out of looking at modifying how file collision
>> conflict types are handled, as discussed at [1].  I discovered these
>> issues while working on that topic.)
>
> I have a topic branch for this series but not merged to 'pu' as
> test-lint gives these:
>
> t6036-recursive-corner-cases.sh:1222: error: "export FOO=bar" is not portable 
> (please use FOO=bar && export FOO):   echo "export 
> PATH=~/bin:$PATH" >source_me.bash &&
> t6036-recursive-corner-cases.sh:1227: error: "export FOO=bar" is not portable 
> (please use FOO=bar && export FOO):   echo "export 
> PATH=~/bin:$PATH" >source_me.bash &&
> Makefile:77: recipe for target 'test-lint-shell-syntax' failed
> make: *** [test-lint-shell-syntax] Error 1
>
> Arguably these are false positives because "source_me.bash" file is
> a mere payload to go through the merge process to be munged and we
> never intend to actually execute its contents with bash, but then
> the test payload probably does not even have to be a string that
> triggers such a false positive to begin with ;-)

Oh, I didn't know about test-lint.  Is there a place that documents the various 
checks you run, so I can avoid slowing you down?  Ones I know about:

Already documented:
  * `make DEVELOPER=1` (from CodingGuidelines)
  * running tests (from SubmittingPatches)

Stuff I've seen you mention in emails over time:
  * linux/scripts/checkpatch.pl
  * git grep -e '\' --and --not -e 'static inline' -- \*.h
  * make -C t/ test-lint

Are there others?


Also, here's a fixup to the topic; as you pointed out, the exact contents
of the script being written were actually irrelevant; it was just an
input to a merge.

-- 8< --
Subject: [PATCH] fixup! t6036: add a failed conflict detection case: regular
 files, different modes

---
 t/t6036-recursive-corner-cases.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t6036-recursive-corner-cases.sh 
b/t/t6036-recursive-corner-cases.sh
index f8f7b30460..5a8fe061ab 100755
--- a/t/t6036-recursive-corner-cases.sh
+++ b/t/t6036-recursive-corner-cases.sh
@@ -1219,12 +1219,12 @@ test_expect_success 'setup conflicting modes for 
regular file' '
git tag A &&
 
git checkout -b B A &&
-   echo "export PATH=~/bin:$PATH" >source_me.bash &&
+   echo "command_to_run" >source_me.bash &&
git add source_me.bash &&
git commit -m B &&
 
git checkout -b C A &&
-   echo "export PATH=~/bin:$PATH" >source_me.bash &&
+   echo "command_to_run" >source_me.bash &&
git add source_me.bash &&
test_chmod +x source_me.bash &&
git commit -m C &&
-- 
2.18.0.135.gd4ea5491ab

Re: [PATCH 1/2] sequencer: handle empty-set cases consistently

2018-07-09 Thread Johannes Schindelin

Hi Peff,

On Mon, 9 Jul 2018, Jeff King wrote:

> If the user gives us a set that prepare_revision_walk()
> takes to be empty, like:
> 
>   git cherry-pick base..base
> 
> then we report an error. It's nonsense, and there's nothing
> to pick.
> 
> But if they use revision options that later cull the list,
> like:
> 
>   git cherry-pick --author=nobody base~2..base
> 
> then we quietly create an empty todo list and return
> success.
> 
> Arguably either behavior is acceptable, but we should
> definitely be consistent about it. Reporting an error
> seems to match the original intent, which dates all the way
> back to 7e2bfd3f99 (revert: allow cherry-picking more than
> one commit, 2010-06-02). That in turn was trying to match
> the single-commit case that exited before then (and which
> continues to issue an error).
> 
> Signed-off-by: Jeff King 

Makes sense to me.

Thanks,
Dscho

Re: [PATCH 1/2] sequencer: handle empty-set cases consistently

2018-07-09 Thread Junio C Hamano

Jeff King  writes:

> If the user gives us a set that prepare_revision_walk()
> takes to be empty, like:
>
>   git cherry-pick base..base
>
> then we report an error. It's nonsense, and there's nothing
> to pick.
>
> But if they use revision options that later cull the list,
> like:
>
>   git cherry-pick --author=nobody base~2..base
>
> then we quietly create an empty todo list and return
> success.
>
> Arguably either behavior is acceptable, but we should
> definitely be consistent about it. Reporting an error
> seems to match the original intent, which dates all the way
> back to 7e2bfd3f99 (revert: allow cherry-picking more than
> one commit, 2010-06-02). That in turn was trying to match
> the single-commit case that exited before then (and which
> continues to issue an error).

Other than s/exited/existed/, the above looks perfect ;-)  I wish
all analysis part of proposed log messages were written like this.



>
> Signed-off-by: Jeff King 
> ---
>  sequencer.c | 6 --
>  t/t3510-cherry-pick-sequence.sh | 7 ++-
>  2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index 5354d4d51e..f692b2ef44 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1863,8 +1863,6 @@ static int prepare_revs(struct replay_opts *opts)
>   if (prepare_revision_walk(opts->revs))
>   return error(_("revision walk setup failed"));
>  
> - if (!opts->revs->commits)
> - return error(_("empty commit set passed"));
>   return 0;
>  }
>  
> @@ -2317,6 +2315,10 @@ static int walk_revs_populate_todo(struct todo_list 
> *todo_list,
>   short_commit_name(commit), subject_len, subject);
>   unuse_commit_buffer(commit, commit_buffer);
>   }
> +
> + if (!todo_list->nr)
> + return error(_("empty commit set passed"));
> +
>   return 0;
>  }
>  
> diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
> index b42cd66d3a..3505b6aa14 100755
> --- a/t/t3510-cherry-pick-sequence.sh
> +++ b/t/t3510-cherry-pick-sequence.sh
> @@ -480,11 +480,16 @@ test_expect_success 'malformed instruction sheet 2' '
>   test_expect_code 128 git cherry-pick --continue
>  '
>  
> -test_expect_success 'empty commit set' '
> +test_expect_success 'empty commit set (no commits to walk)' '
>   pristine_detach initial &&
>   test_expect_code 128 git cherry-pick base..base
>  '
>  
> +test_expect_success 'empty commit set (culled during walk)' '
> + pristine_detach initial &&
> + test_expect_code 128 git cherry-pick -2 --author=no.such.author base
> +'
> +
>  test_expect_success 'malformed instruction sheet 3' '
>   pristine_detach initial &&
>   test_expect_code 1 git cherry-pick base..anotherpick &&

Re: [PATCH 0/2] Avoiding errors when partial cloning a tagged blob

2018-07-09 Thread Jonathan Tan

> > An argument could be made that we should not merge patch 2 just yet due
> > to the fact that some server implementations (such as Git and JGit)
> > still exhibit the old behavior, and the resulting clones (albeit failing
> > fsck) are still usable, because when attempting to load the blob, Git
> > will automatically fetch it. I'm on the fence about this, and have
> > included patch 2 in this patch set nevertheless for completeness.
> 
> I think the latter is probably a good thing to nudge the server
> implementations in the right direction by gently poking them ;-)

OK, I'll keep patch 2 then.

> The patches textually apply cleanly on 'master' but apparently it
> needs some fixes in jt/partial-clone-fsck-connectivity topic to
> function correctly?

I forgot to mention in the original e-mail that this is on
bw/ref-in-want. (I mentioned it in an e-mail I sent later [1].) I've
checked and basing it on both bw/ref-in-want and
jt/partial-clone-fsck-connectivity (which is based on bw/ref-in-want
itself) both work, so you can choose either one.

[1] 
https://public-inbox.org/git/20180706193847.160161-1-jonathanta...@google.com/

Re: [PATCH 2/2] t3430: update to test with custom commentChar

2018-07-09 Thread Junio C Hamano

Daniel Harding  writes:

> One question about my original patch - there I had replaced a "grep
> -v" call with a "git stripspace" call in the 'generate correct todo
> list' test.  Is relying on "git stripspace" in a test acceptable, or
> should external text manipulation tools like grep, sed etc. be
> preferred?

I think trusting stripspace is OK here.  Even though this test is
not about stripspace (i.e. trying to catch breakage in stripspace),
if we broke it, we'll notice it as a side effect of running this
test ;-).

Re: [RFC PATCH 6/6] utf8.c: avoid char overflow

2018-07-09 Thread Johannes Schindelin

Hi Beat,

On Mon, 9 Jul 2018, Beat Bolli wrote:

> Am 09.07.2018 15:14, schrieb Johannes Schindelin:
> > 
> > On Sun, 8 Jul 2018, Beat Bolli wrote:
> > 
> > > In ISO C, char constants must be in the range -128..127. Change the BOM
> > > constants to unsigned char to avoid overflow.
> > > 
> > > Signed-off-by: Beat Bolli 
> > > ---
> > >  utf8.c | 10 +-
> > >  1 file changed, 5 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/utf8.c b/utf8.c
> > > index d55e20c641..833ce00617 100644
> > > --- a/utf8.c
> > > +++ b/utf8.c
> > > @@ -561,15 +561,15 @@ char *reencode_string_len(const char *in, int insz,
> > >  #endif
> > > 
> > > static int has_bom_prefix(const char *data, size_t len,
> > > -   const char *bom, size_t bom_len)
> > > +   const unsigned char *bom, size_t bom_len)
> > >  {
> > >   return data && bom && (len >= bom_len) && !memcmp(data, bom, bom_len);
> > >  }
> > > 
> > > -static const char utf16_be_bom[] = {0xFE, 0xFF};
> > > -static const char utf16_le_bom[] = {0xFF, 0xFE};
> > > -static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
> > > -static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
> > > +static const unsigned char utf16_be_bom[] = {0xFE, 0xFF};
> > > +static const unsigned char utf16_le_bom[] = {0xFF, 0xFE};
> > > +static const unsigned char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
> > > +static const unsigned char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
> > 
> > An alternative approach that might be easier to read (and avoids the
> > confusion arising from our use of (signed) chars for strings pretty much
> > everywhere):
> > 
> > #define FE ((char)0xfe)
> > #define FF ((char)0xff)
> > 
> > ...
> 
> I have tried this first (without the macros, though), and thought it looked
> really ugly.

Yep, I would totally agree that it would be very ugly without the macros.

Which is why I suggested the macros instead, in which case it looks
relatively elegant to my eyes.

Ciao,
Dscho

Re: [PATCH v1 1/2] convert: refactor conversion driver config parsing

2018-07-09 Thread Junio C Hamano

Lars Schneider  writes:

>> On Jul 8, 2018, at 8:30 PM, larsxschnei...@gmail.com wrote:
>> 
>> From: Lars Schneider 
>> 
>> Refactor conversion driver config parsing to ease the parsing of new
>> configs in a subsequent patch.
>> 
>> No functional change intended.
>> 
>> Signed-off-by: Lars Schneider 
>> ---
>> convert.c | 64 +++
>> 1 file changed, 32 insertions(+), 32 deletions(-)
>> 
>> diff --git a/convert.c b/convert.c
>> index 64d0d30e08..949bc783e4 100644
>> --- a/convert.c
>> +++ b/convert.c
>> @@ -1003,43 +1003,43 @@ static int read_convert_config(const char *var, 
>> const char *value, void *cb)
>>  int namelen;
>>  struct convert_driver *drv;
>> 
>> ...
>> 
>> -/*
>> - * filter..smudge and filter..clean specifies
>> - * the command line:
>> - *
>> - *  command-line
>> - *
>> - * The command-line will not be interpolated in any way.
>> - */
>> +/*
>> + * filter..smudge and filter..clean specifies
>> + * the command line:
>> + *
>> + *  command-line
>> + *
>> + * The command-line will not be interpolated in any way.
>> + */
>
> I stumbled over this comment introduced in aa4ed402c9 
> ("Add 'filter' attribute and external filter driver definition.", 2007-04-21).
>
> Is the middle "command-line" intentional?

I think it was a deliberate but ineffective attempt to emphasize the
fact that the command line is used as-is, and does not get split at
SP nor goes through interpolation of placeholders using API such as
strbuf_expand().

Re: [PATCH v3 17/20] range-diff: add a man page

2018-07-09 Thread Johannes Schindelin

Hi Stefan,

On Mon, 9 Jul 2018, Stefan Beller wrote:

> On Tue, Jul 3, 2018 at 4:26 AM Johannes Schindelin via GitGitGadget
>  wrote:
> 
> > +'git range-diff' [--color=[]] [--no-color] []
> > +   [--dual-color] [--creation-factor=]
> > +   (   | ... |)
> > +
> > +DESCRIPTION
> > +---
> > +
> > +This command shows the differences between two versions of a patch
> > +series, or more generally, two commit ranges (ignoring merges).
> 
> Does it completely ignore merges or does it die("not supported"), how is
> the user expected to cope with the accidental merge in the given range?

It ignores merges. It does not reject them. It simply ignores them and
won't talk about them as a consequence.

Could you suggest an improved way to say that?

> > +To that end, it first finds pairs of commits from both commit ranges
> > +that correspond with each other. Two commits are said to correspond when
> > +the diff between their patches (i.e. the author information, the commit
> > +message and the commit diff) is reasonably small compared to the
> > +patches' size. See ``Algorithm` below for details.
> > +
> > +Finally, the list of matching commits is shown in the order of the
> > +second commit range, with unmatched commits being inserted just after
> > +all of their ancestors have been shown.
> > +
> > +
> > +OPTIONS
> > +---
> > +--dual-color::
> > +   When the commit diffs differ, recreate the original diffs'
> > +   coloring, and add outer -/+ diff markers with the *background*
> > +   being red/green to make it easier to see e.g. when there was a
> > +   change in what exact lines were added.
> 
> I presume this is a boolean option, and can be turned off with
> --no-dual-color, but not with --dual-color=no. Would it be worth to
> give the --no-option here as well.
> The more pressing question I had when reading this, is whether this
> is the default.

In the final patch (which I mulled about adding or not for a couple of
weeks), the `--dual-color` mode is the default, and the man page talks
about `--no-dual-color`.

Do you want me to change this intermediate commit, even if that change
will be reverted anyway?

> > +--creation-factor=::
> > +   Set the creation/deletion cost fudge factor to ``.
> > +   Defaults to 60. Try a larger value if `git range-diff` erroneously
> > +   considers a large change a total rewrite (deletion of one commit
> > +   and addition of another), and a smaller one in the reverse case.
> > +   See the ``Algorithm`` section below for an explanation why this is
> > +   needed.
> > +
> > + ::
> > +   Compare the commits specified by the two ranges, where
> > +   `` is considered an older version of ``.
> 
> Is it really older? How does that help the user?

It is important to get your ducks in a row, so to speak, when looking at
range-diffs. They are even more unintuitive than diffs, so it makes sense
to have a very clear mental picture of what you are trying to compare
here.

The coloring gives a strong hint of "pre" vs "post", i.e. old vs new: the
changes that are only in the "old" patches are marked with a minus with a
red background color, which only really makes sense if you think about
these changes as "dropped" or "removed" from the "new" changes.

So yes, it is really considered an older version, in my mind.

Again, if you have suggestions how to improve my patch (giving rise to a
"new" patch :-)), let's hear them.

> I think this comes from the notion of e.g. patch 4 ("range-diff: improve the
> order of the shown commits "), that assume the user wants the range-diff
> to be expressed with range2 as its "base range".

No, it is motivated by the fact that we use -/+ markers to indicate
differences between the "old" and the "new" patches.

> > +Algorithm
> > +-
> > +
> > +The general idea is this: we generate a cost matrix between the commits
> > +in both commit ranges, then solve the least-cost assignment.
> 
> Can you say more about the generation of the cost matrix?
> I assume that it counts the number of lines added/deleted to make
> one patch into the other patch.

I think that is correct.

*reading the patch*

Actually, no, I was wrong. For the cost matrix, the *length* of the diff
*of the diffs* is computed. Think of it as

git diff --no-index <(git diff A^!) <(git diff B^!) | wc -l

> If that assumption was correct, an edit of a commit message adding one
> line is just as costly as adding one line in the diff.

Nope, editing a commit message does not have any influence on the
algorithm's idea whether the commit matches or not. Only the content
changes associated with the commit have any say over this.

> Further I would assume that the context lines are ignored?

No.

> I think this is worth spelling out.

Sure.

> Another spot to look at is further metadata, such as author and
> author-date, which are kept the same in a rebase workflow.

I encourage you to offer that as an add-on patch series. B

Re: Git 2.18: RUNTIME_PREFIX... is it working?

2018-07-09 Thread Jeff King

On Sun, Jul 08, 2018 at 11:52:22PM +0200, Johannes Schindelin wrote:

> Now, if you care to have a look at Dan's (and my) patches to implement
> RUNTIME_PREFIX so that it looks for a directory *relative to the Git
> binary*, you will see that it is far from portable. In fact, it is very
> definitely not portable, and needs specific support for *every single
> supported Operating System*. And while we covered a lot, we did not cover
> all of them.
> 
> So unfortunately, it is impossible to make it the default, I am afraid.

Would it be reasonable to make RUNTIME_PREFIX the default on systems
where we _do_ have that support? AFAIK there is no downside to having it
enabled (minus a few syscalls to find the prefix, I suppose, but I
assume that's negligible).

I.e., a patch to config.mak.uname (and possibly better support for
_disabling_ it, though I think "make RUNTIME_PREFIX=" would probably
work).

-Peff

[PATCH 2/2] sequencer: don't say BUG on bogus input

2018-07-09 Thread Jeff King

When cherry-picking a single commit, we go through a special
code path that avoids creating a sequencer todo list at all.
This path expects our revision parsing to turn up exactly
one commit, and dies with a BUG if it doesn't.

But it's actually quite easy to fool. For example:

  $ git cherry-pick --author=no.such.person HEAD
  error: BUG: expected exactly one commit from walk
  fatal: cherry-pick failed

This isn't a bug; it's just bogus input.

Let's drop the "BUG" to make it clear that the input is the
problem. And let's also use the phrase "empty commit set
passed", which matches what we say when we do a real
revision walk and it turns up empty.

This BUG dates back to 7acaaac275 (revert: allow single-pick
in the middle of cherry-pick sequence, 2011-12-10), and
could be triggered in the same way even then. So clearly
this outcome is unexpected. Another approach would be to
make the conditional from 7acaaac275 smarter, and avoid even
entering this single-pick case.  But since the action is
identical either way (we have nothing to pick, so we exit)
there's not much point in trying to distinguish the two.

Signed-off-by: Jeff King 
---
 sequencer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sequencer.c b/sequencer.c
index f692b2ef44..234666b980 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3637,7 +3637,7 @@ int sequencer_pick_revisions(struct replay_opts *opts)
return error(_("revision walk setup failed"));
cmit = get_revision(opts->revs);
if (!cmit || get_revision(opts->revs))
-   return error("BUG: expected exactly one commit from 
walk");
+   return error(_("empty commit set passed"));
return single_pick(cmit, opts);
}
 
-- 
2.18.0.400.g702e398724

[PATCH 1/2] sequencer: handle empty-set cases consistently

2018-07-09 Thread Jeff King

If the user gives us a set that prepare_revision_walk()
takes to be empty, like:

  git cherry-pick base..base

then we report an error. It's nonsense, and there's nothing
to pick.

But if they use revision options that later cull the list,
like:

  git cherry-pick --author=nobody base~2..base

then we quietly create an empty todo list and return
success.

Arguably either behavior is acceptable, but we should
definitely be consistent about it. Reporting an error
seems to match the original intent, which dates all the way
back to 7e2bfd3f99 (revert: allow cherry-picking more than
one commit, 2010-06-02). That in turn was trying to match
the single-commit case that exited before then (and which
continues to issue an error).

Signed-off-by: Jeff King 
---
 sequencer.c | 6 --
 t/t3510-cherry-pick-sequence.sh | 7 ++-
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 5354d4d51e..f692b2ef44 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1863,8 +1863,6 @@ static int prepare_revs(struct replay_opts *opts)
if (prepare_revision_walk(opts->revs))
return error(_("revision walk setup failed"));
 
-   if (!opts->revs->commits)
-   return error(_("empty commit set passed"));
return 0;
 }
 
@@ -2317,6 +2315,10 @@ static int walk_revs_populate_todo(struct todo_list 
*todo_list,
short_commit_name(commit), subject_len, subject);
unuse_commit_buffer(commit, commit_buffer);
}
+
+   if (!todo_list->nr)
+   return error(_("empty commit set passed"));
+
return 0;
 }
 
diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
index b42cd66d3a..3505b6aa14 100755
--- a/t/t3510-cherry-pick-sequence.sh
+++ b/t/t3510-cherry-pick-sequence.sh
@@ -480,11 +480,16 @@ test_expect_success 'malformed instruction sheet 2' '
test_expect_code 128 git cherry-pick --continue
 '
 
-test_expect_success 'empty commit set' '
+test_expect_success 'empty commit set (no commits to walk)' '
pristine_detach initial &&
test_expect_code 128 git cherry-pick base..base
 '
 
+test_expect_success 'empty commit set (culled during walk)' '
+   pristine_detach initial &&
+   test_expect_code 128 git cherry-pick -2 --author=no.such.author base
+'
+
 test_expect_success 'malformed instruction sheet 3' '
pristine_detach initial &&
test_expect_code 1 git cherry-pick base..anotherpick &&
-- 
2.18.0.400.g702e398724

[PATCH 0/2] de-confuse git cherry-pick --author behavior

2018-07-09 Thread Jeff King

On Mon, Jul 09, 2018 at 03:16:07PM -0400, Jeff King wrote:

> I agree that having something similar to commit's "--author" (or even
> just "--reset-author") would be useful. When I've had to do this before,
> I usually just cherry-pick and then follow-up with "commit --amend
> --author" (or use "rebase -i" if there are several commits). You can
> also do "cherry-pick -n $commit" followed by "commit -c $commit".

I'll leave adding any options there as an exercise if somebody is
interested. This series just focuses on the inconsistent error behavior
you found.

  [1/2]: sequencer: handle empty-set cases consistently
  [2/2]: sequencer: don't say BUG on bogus input

 sequencer.c | 8 +---
 t/t3510-cherry-pick-sequence.sh | 7 ++-
 2 files changed, 11 insertions(+), 4 deletions(-)

-Peff

[PATCH] unicode: update the width tables to Unicode 11

2018-07-09 Thread Beat Bolli

Now that Unicode 11 has been announced[0], update the character
width tables to the new version.

[0] http://blog.unicode.org/2018/06/announcing-unicode-standard-version-110.html

Signed-off-by: Beat Bolli 
---
 unicode-width.h | 41 -
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/unicode-width.h b/unicode-width.h
index 6dee2c77ce..7c643760f8 100644
--- a/unicode-width.h
+++ b/unicode-width.h
@@ -20,12 +20,13 @@ static const struct interval zero_width[] = {
 { 0x0730, 0x074A },
 { 0x07A6, 0x07B0 },
 { 0x07EB, 0x07F3 },
+{ 0x07FD, 0x07FD },
 { 0x0816, 0x0819 },
 { 0x081B, 0x0823 },
 { 0x0825, 0x0827 },
 { 0x0829, 0x082D },
 { 0x0859, 0x085B },
-{ 0x08D4, 0x0902 },
+{ 0x08D3, 0x0902 },
 { 0x093A, 0x093A },
 { 0x093C, 0x093C },
 { 0x0941, 0x0948 },
@@ -37,6 +38,7 @@ static const struct interval zero_width[] = {
 { 0x09C1, 0x09C4 },
 { 0x09CD, 0x09CD },
 { 0x09E2, 0x09E3 },
+{ 0x09FE, 0x09FE },
 { 0x0A01, 0x0A02 },
 { 0x0A3C, 0x0A3C },
 { 0x0A41, 0x0A42 },
@@ -63,6 +65,7 @@ static const struct interval zero_width[] = {
 { 0x0BC0, 0x0BC0 },
 { 0x0BCD, 0x0BCD },
 { 0x0C00, 0x0C00 },
+{ 0x0C04, 0x0C04 },
 { 0x0C3E, 0x0C40 },
 { 0x0C46, 0x0C48 },
 { 0x0C4A, 0x0C4D },
@@ -182,6 +185,7 @@ static const struct interval zero_width[] = {
 { 0xA825, 0xA826 },
 { 0xA8C4, 0xA8C5 },
 { 0xA8E0, 0xA8F1 },
+{ 0xA8FF, 0xA8FF },
 { 0xA926, 0xA92D },
 { 0xA947, 0xA951 },
 { 0xA980, 0xA982 },
@@ -219,19 +223,22 @@ static const struct interval zero_width[] = {
 { 0x10A38, 0x10A3A },
 { 0x10A3F, 0x10A3F },
 { 0x10AE5, 0x10AE6 },
+{ 0x10D24, 0x10D27 },
+{ 0x10F46, 0x10F50 },
 { 0x11001, 0x11001 },
 { 0x11038, 0x11046 },
 { 0x1107F, 0x11081 },
 { 0x110B3, 0x110B6 },
 { 0x110B9, 0x110BA },
 { 0x110BD, 0x110BD },
+{ 0x110CD, 0x110CD },
 { 0x11100, 0x11102 },
 { 0x11127, 0x1112B },
 { 0x1112D, 0x11134 },
 { 0x11173, 0x11173 },
 { 0x11180, 0x11181 },
 { 0x111B6, 0x111BE },
-{ 0x111CA, 0x111CC },
+{ 0x111C9, 0x111CC },
 { 0x1122F, 0x11231 },
 { 0x11234, 0x11234 },
 { 0x11236, 0x11237 },
@@ -239,13 +246,14 @@ static const struct interval zero_width[] = {
 { 0x112DF, 0x112DF },
 { 0x112E3, 0x112EA },
 { 0x11300, 0x11301 },
-{ 0x1133C, 0x1133C },
+{ 0x1133B, 0x1133C },
 { 0x11340, 0x11340 },
 { 0x11366, 0x1136C },
 { 0x11370, 0x11374 },
 { 0x11438, 0x1143F },
 { 0x11442, 0x11444 },
 { 0x11446, 0x11446 },
+{ 0x1145E, 0x1145E },
 { 0x114B3, 0x114B8 },
 { 0x114BA, 0x114BA },
 { 0x114BF, 0x114C0 },
@@ -264,8 +272,9 @@ static const struct interval zero_width[] = {
 { 0x1171D, 0x1171F },
 { 0x11722, 0x11725 },
 { 0x11727, 0x1172B },
-{ 0x11A01, 0x11A06 },
-{ 0x11A09, 0x11A0A },
+{ 0x1182F, 0x11837 },
+{ 0x11839, 0x1183A },
+{ 0x11A01, 0x11A0A },
 { 0x11A33, 0x11A38 },
 { 0x11A3B, 0x11A3E },
 { 0x11A47, 0x11A47 },
@@ -285,6 +294,10 @@ static const struct interval zero_width[] = {
 { 0x11D3C, 0x11D3D },
 { 0x11D3F, 0x11D45 },
 { 0x11D47, 0x11D47 },
+{ 0x11D90, 0x11D91 },
+{ 0x11D95, 0x11D95 },
+{ 0x11D97, 0x11D97 },
+{ 0x11EF3, 0x11EF4 },
 { 0x16AF0, 0x16AF4 },
 { 0x16B30, 0x16B36 },
 { 0x16F8F, 0x16F92 },
@@ -355,7 +368,7 @@ static const struct interval double_width[] = {
 { 0x3000, 0x303E },
 { 0x3041, 0x3096 },
 { 0x3099, 0x30FF },
-{ 0x3105, 0x312E },
+{ 0x3105, 0x312F },
 { 0x3131, 0x318E },
 { 0x3190, 0x31BA },
 { 0x31C0, 0x31E3 },
@@ -375,7 +388,7 @@ static const struct interval double_width[] = {
 { 0xFF01, 0xFF60 },
 { 0xFFE0, 0xFFE6 },
 { 0x16FE0, 0x16FE1 },
-{ 0x17000, 0x187EC },
+{ 0x17000, 0x187F1 },
 { 0x18800, 0x18AF2 },
 { 0x1B000, 0x1B11E },
 { 0x1B170, 0x1B2FB },
@@ -410,13 +423,15 @@ static const struct interval double_width[] = {
 { 0x1F6CC, 0x1F6CC },
 { 0x1F6D0, 0x1F6D2 },
 { 0x1F6EB, 0x1F6EC },
-{ 0x1F6F4, 0x1F6F8 },
+{ 0x1F6F4, 0x1F6F9 },
 { 0x1F910, 0x1F93E },
-{ 0x1F940, 0x1F94C },
-{ 0x1F950, 0x1F96B },
-{ 0x1F980, 0x1F997 },
-{ 0x1F9C0, 0x1F9C0 },
-{ 0x1F9D0, 0x1F9E6 },
+{ 0x1F940, 0x1F970 },
+{ 0x1F973, 0x1F976 },
+{ 0x1F97A, 0x1F97A },
+{ 0x1F97C, 0x1F9A2 },
+{ 0x1F9B0, 0x1F9B9 },
+{ 0x1F9C0, 0x1F9C2 },
+{ 0x1F9D0, 0x1F9FF },
 { 0x2, 0x2FFFD },
 { 0x3, 0x3FFFD }
 };
-- 
2.18.0.203.gfac676dfb9

Re: [PATCH v3 16/20] range-diff --dual-color: work around bogus white-space warning

2018-07-09 Thread Stefan Beller

On Tue, Jul 3, 2018 at 4:26 AM Johannes Schindelin via GitGitGadget
 wrote:
>
> From: Johannes Schindelin 
>
> When displaying a diff of diffs, it is possible that there is an outer
> `+` before a context line. That happens when the context changed between
> old and new commit. When that context line starts with a tab (after the
> space that marks it as context line), our diff machinery spits out a
> white-space error (space before tab), but in this case, that is
> incorrect.
>
> Work around this by detecting that situation and simply *not* printing
> the space in that case.

ok. If that is the workaround that you deem to be the right thing for now.
(I do not have an opinion if that is the right approach, or if we'd want
to s/// for example.)

> This is slightly improper a fix because it is conceivable that an
> output_prefix might be configured with *just* the right length to let
> that tab jump to a different tab stop depending whether we emit that
> space or not.
>
> However, the proper fix would be relatively ugly and intrusive because
> it would have to *weaken* the WS_SPACE_BEFORE_TAB option in ws.c.
> Besides, we do not expose the --dual-color option in cases other than
> the `git range-diff` command, which only uses a hard-coded
> output_prefix of four spaces (which misses the problem by one
> column... ;-)).

That makes sense!

> Signed-off-by: Johannes Schindelin 
> ---
>  diff.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/diff.c b/diff.c
> index 26445ffa1..325007167 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1093,6 +1093,12 @@ static void emit_diff_symbol_from_struct(struct 
> diff_options *o,
> set = diff_get_color_opt(o, DIFF_FRAGINFO);
> else if (c != '+')
> set = diff_get_color_opt(o, DIFF_CONTEXT);
> +   /* Avoid space-before-tab warning */
> +   if (c == ' ' && (len < 2 || line[1] == '\t' ||
> +line[1] == '\r' || line[1] == '\n')) 
> {
> +   line++;
> +   len--;
> +   }
> }

And this is inside the check for 'o->flags.dual_color_diffed_diffs',
so that is protected against other diffs.

Thanks,
Stefan

Re: [PATCH 0/2] Avoiding errors when partial cloning a tagged blob

2018-07-09 Thread Junio C Hamano

Jonathan Tan  writes:

> When cloning a repository with a tagged blob (like the Git repository)
> with --filter=blob:none, the following message appears:
>
> error: missing object referenced by 'refs/tags/junio-gpg-pub'
>
> and the resulting repository also fails fsck.
>
> Patch 1 fixes the protocol documentation and the server side of Git, and
> patch 2 makes clone error out when such a situation occurs.
>
> An argument could be made that we should not merge patch 2 just yet due
> to the fact that some server implementations (such as Git and JGit)
> still exhibit the old behavior, and the resulting clones (albeit failing
> fsck) are still usable, because when attempting to load the blob, Git
> will automatically fetch it. I'm on the fence about this, and have
> included patch 2 in this patch set nevertheless for completeness.

I think the latter is probably a good thing to nudge the server
implementations in the right direction by gently poking them ;-)

The patches textually apply cleanly on 'master' but apparently it
needs some fixes in jt/partial-clone-fsck-connectivity topic to
function correctly?

Re: [RFC PATCH 2/6] refs/refs-internal.h: avoid forward declaration of an enum

2018-07-09 Thread Beat Bolli

On 09.07.18 20:46, Jeff King wrote:
> On Sun, Jul 08, 2018 at 04:43:38PM +0200, Beat Bolli wrote:
> 
>> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
>> index dd834314bd..a78b5cb803 100644
>> --- a/refs/refs-internal.h
>> +++ b/refs/refs-internal.h
>> @@ -1,6 +1,8 @@
>>  #ifndef REFS_REFS_INTERNAL_H
>>  #define REFS_REFS_INTERNAL_H
>>  
>> +#include "iterator.h"   /* for enum iterator_selection */
> 
> IMHO this kind of comment does more harm than good, because it is so
> prone to going stale (nobody is going to bother updating it when they
> add new dependencies on iterator.h). Anybody who is interested in the
> original reason can use "git blame" to dig up your commit message. And
> anybody who is thinking about deleting that line would need to dig into
> whether anything had been added in the meantime that also requires the
> include.
> 
> So at best it's redundant, and at worst it's slightly misleading. :)
> 
> Not worth a re-roll by itself, but it looked like you had a few other
> bits in the other patches to address.
> 
> Other than this minor quibble, the whole series looks good to me, modulo
> the existing review.
> 
> -Peff
> 

Ooosp, I've just sent the non-RFC reroll without this change.

Junio, would you squash this into [1/6] and [2/6], please (if you agree,
of course :-)

Beat

Re: [PATCH v3 14/20] diff: add an internal option to dual-color diffs of diffs

2018-07-09 Thread Stefan Beller

On Tue, Jul 3, 2018 at 4:27 AM Johannes Schindelin via GitGitGadget
 wrote:
>
> From: Johannes Schindelin 
>
> When diffing diffs, it can be quite daunting to figure out what the heck
> is going on, as there are nested +/- signs.
>
> Let's make this easier by adding a flag in diff_options that allows
> color-coding the outer diff sign with inverted colors, so that the
> preimage and postimage is colored like the diff it is.
>
> Of course, this really only makes sense when the preimage and postimage
> *are* diffs. So let's not expose this flag via a command-line option for
> now.
>
> This is a feature that was invented by git-tbdiff, and it will be used
> by `git range-diff` in the next commit, by offering it via a new option:
> `--dual-color`.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  diff.c | 83 +++---
>  diff.h |  1 +
>  2 files changed, 69 insertions(+), 15 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index 8c568cbe0..26445ffa1 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -562,14 +562,18 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t 
> *mf2,
> ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
>  }
>
> -static void emit_line_0(struct diff_options *o, const char *set, const char 
> *reset,
> +static void emit_line_0(struct diff_options *o,
> +   const char *set, unsigned reverse, const char *reset,
> int first, const char *line, int len)
>  {
> int has_trailing_newline, has_trailing_carriage_return;
> int nofirst;
> FILE *file = o->file;
>
> -   fputs(diff_line_prefix(o), file);
> +   if (first)
> +   fputs(diff_line_prefix(o), file);
> +   else if (!len)
> +   return;

This case is not a problem for empty lines in e.g. "git-log --line-prefix"
because first would contain the LF.

> if (len == 0) {
> has_trailing_newline = (first == '\n');
> @@ -587,8 +591,10 @@ static void emit_line_0(struct diff_options *o, const 
> char *set, const char *res
> }
>
> if (len || !nofirst) {
> +   if (reverse && want_color(o->use_color))
> +   fputs(GIT_COLOR_REVERSE, file);

Would it make sense to have the function signature take a char* for reverse
and we pass in diff_get_color(o, GIT_COLOR_REVERSE), that would align
with the set and reset color passed in?

> fputs(set, file);
> -   if (!nofirst)
> +   if (first && !nofirst)
> fputc(first, file);

'first' is line[0] and comes from user data, so I think this could change
the output of diffs that has lines with the NUL character first in a line
as then that character would be silently eaten?

The 'nofirst' (which is a bad name) is used to detect if we do not
want to double print the first character in case of an empty line.
(Before this series we always had 'first' as a valid character, now we also
have 0 encoded for "do not print anything?"

> @@ -962,7 +968,8 @@ static void dim_moved_lines(struct diff_options *o)
>
>  static void emit_line_ws_markup(struct diff_options *o,
> const char *set, const char *reset,
> -   const char *line, int len, char sign,
> +   const char *line, int len,
> +   const char *set_sign, char sign,
> unsigned ws_rule, int blank_at_eof)
>  {
> const char *ws = NULL;
> @@ -973,14 +980,20 @@ static void emit_line_ws_markup(struct diff_options *o,
> ws = NULL;
> }
>
> -   if (!ws)
> -   emit_line_0(o, set, reset, sign, line, len);
> -   else if (blank_at_eof)
> +   if (!ws && !set_sign)
> +   emit_line_0(o, set, 0, reset, sign, line, len);
> +   else if (!ws) {
> +   /* Emit just the prefix, then the rest. */
> +   emit_line_0(o, set_sign ? set_sign : set, !!set_sign, reset,
> +   sign, "", 0);
> +   emit_line_0(o, set, 0, reset, 0, line, len);

(FYI:)
My long term vision for the emit_line_* functions was to have them actually
line oriented, and here we observe that the preimage already breaks
this assumption but just uses it as it sees fit.
I added that wart when refactoring the diff code to use the emit_ functionality
as I wanted to stay backwards compatible.

The actual issue is that each emit_line_0 will encapsulate its content
with its designated color and then end it with a reset. Looking at t4015
a common occurrence is output like:

  +{

which we'd add one more layer to it now when set_sign is set.

I think this is ok for now (I value having this series land over insisting
on the perfect code), but just wanted to note my concern.

Ideally we'd refactor to only call emit_line once per line and when
the set sign (and the newly introduced set_sign

[PATCH 5/6] string-list.c: avoid conversion from void * to function pointer

2018-07-09 Thread Beat Bolli

ISO C forbids the conversion of void pointers to function pointers.
Introduce a context struct that encapsulates the function pointer.

Signed-off-by: Beat Bolli 
---
 string-list.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/string-list.c b/string-list.c
index a0cf0cfe88..771c455098 100644
--- a/string-list.c
+++ b/string-list.c
@@ -224,18 +224,28 @@ struct string_list_item *string_list_append(struct 
string_list *list,
list->strdup_strings ? xstrdup(string) : (char 
*)string);
 }
 
+/*
+ * Encapsulate the compare function pointer because ISO C99 forbids
+ * casting from void * to a function pointer and vice versa.
+ */
+struct string_list_sort_ctx
+{
+   compare_strings_fn cmp;
+};
+
 static int cmp_items(const void *a, const void *b, void *ctx)
 {
-   compare_strings_fn cmp = ctx;
+   struct string_list_sort_ctx *sort_ctx = ctx;
const struct string_list_item *one = a;
const struct string_list_item *two = b;
-   return cmp(one->string, two->string);
+   return sort_ctx->cmp(one->string, two->string);
 }
 
 void string_list_sort(struct string_list *list)
 {
-   QSORT_S(list->items, list->nr, cmp_items,
-   list->cmp ? list->cmp : strcmp);
+   struct string_list_sort_ctx sort_ctx = {list->cmp ? list->cmp : strcmp};
+
+   QSORT_S(list->items, list->nr, cmp_items, &sort_ctx);
 }
 
 struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
-- 
2.18.0.203.gfac676dfb9

[PATCH 4/6] sequencer.c: avoid empty statements at top level

2018-07-09 Thread Beat Bolli

The macro GIT_PATH_FUNC expands to a function definition that ends with
a closing brace. Remove two extra semicolons.

While at it, fix the example in path.h.

Signed-off-by: Beat Bolli 
---
 path.h  | 2 +-
 sequencer.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/path.h b/path.h
index 1ccd0373c9..fc9d3487a0 100644
--- a/path.h
+++ b/path.h
@@ -147,7 +147,7 @@ extern void report_linked_checkout_garbage(void);
 /*
  * You can define a static memoized git path like:
  *
- *static GIT_PATH_FUNC(git_path_foo, "FOO");
+ *static GIT_PATH_FUNC(git_path_foo, "FOO")
  *
  * or use one of the global ones below.
  */
diff --git a/sequencer.c b/sequencer.c
index 5354d4d51e..66e7073995 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -62,12 +62,12 @@ static GIT_PATH_FUNC(rebase_path_done, "rebase-merge/done")
  * The file to keep track of how many commands were already processed (e.g.
  * for the prompt).
  */
-static GIT_PATH_FUNC(rebase_path_msgnum, "rebase-merge/msgnum");
+static GIT_PATH_FUNC(rebase_path_msgnum, "rebase-merge/msgnum")
 /*
  * The file to keep track of how many commands are to be processed in total
  * (e.g. for the prompt).
  */
-static GIT_PATH_FUNC(rebase_path_msgtotal, "rebase-merge/end");
+static GIT_PATH_FUNC(rebase_path_msgtotal, "rebase-merge/end")
 /*
  * The commit message that is planned to be used for any changes that
  * need to be committed following a user interaction.
-- 
2.18.0.203.gfac676dfb9

[PATCH 6/6] utf8.c: avoid char overflow

2018-07-09 Thread Beat Bolli

In ISO C, char constants must be in the range -128..127. Change the BOM
constants to char literals to avoid overflow.

Signed-off-by: Beat Bolli 
---
 utf8.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/utf8.c b/utf8.c
index d55e20c641..982217eec9 100644
--- a/utf8.c
+++ b/utf8.c
@@ -566,10 +566,10 @@ static int has_bom_prefix(const char *data, size_t len,
return data && bom && (len >= bom_len) && !memcmp(data, bom, bom_len);
 }
 
-static const char utf16_be_bom[] = {0xFE, 0xFF};
-static const char utf16_le_bom[] = {0xFF, 0xFE};
-static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
-static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
+static const char utf16_be_bom[] = {'\xFE', '\xFF'};
+static const char utf16_le_bom[] = {'\xFF', '\xFE'};
+static const char utf32_be_bom[] = {'\0', '\0', '\xFE', '\xFF'};
+static const char utf32_le_bom[] = {'\xFF', '\xFE', '\0', '\0'};
 
 int has_prohibited_utf_bom(const char *enc, const char *data, size_t len)
 {
-- 
2.18.0.203.gfac676dfb9

[PATCH 2/6] refs/refs-internal.h: avoid forward declaration of an enum

2018-07-09 Thread Beat Bolli

Include iterator.h to define enum iterator_selection.

Signed-off-by: Beat Bolli 
---
 refs/refs-internal.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index dd834314bd..a78b5cb803 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -1,6 +1,8 @@
 #ifndef REFS_REFS_INTERNAL_H
 #define REFS_REFS_INTERNAL_H
 
+#include "iterator.h"   /* for enum iterator_selection */
+
 /*
  * Data structures and functions for the internal use of the refs
  * module. Code outside of the refs module should use only the public
-- 
2.18.0.203.gfac676dfb9

[PATCH 3/6] convert.c: replace "\e" escapes with "\033".

2018-07-09 Thread Beat Bolli

The "\e" escape is not defined in ISO C.

While on this line, add a missing space after the comma.

Signed-off-by: Beat Bolli 
---
 convert.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/convert.c b/convert.c
index 64d0d30e08..edebb946f5 100644
--- a/convert.c
+++ b/convert.c
@@ -334,7 +334,7 @@ static void trace_encoding(const char *context, const char 
*path,
strbuf_addf(&trace, "%s (%s, considered %s):\n", context, path, 
encoding);
for (i = 0; i < len && buf; ++i) {
strbuf_addf(
-   &trace,"| \e[2m%2i:\e[0m %2x \e[2m%c\e[0m%c",
+   &trace, "| \033[2m%2i:\033[0m %2x \033[2m%c\033[0m%c",
i,
(unsigned char) buf[i],
(buf[i] > 32 && buf[i] < 127 ? buf[i] : ' '),
-- 
2.18.0.203.gfac676dfb9

[PATCH 1/6] connect.h: avoid forward declaration of an enum

2018-07-09 Thread Beat Bolli

Include protocol.h to define enum protocol_version.

Signed-off-by: Beat Bolli 
---
 connect.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/connect.h b/connect.h
index 0e69c6709c..c86f862f2f 100644
--- a/connect.h
+++ b/connect.h
@@ -1,6 +1,8 @@
 #ifndef CONNECT_H
 #define CONNECT_H
 
+#include "protocol.h"   /* for enum protocol_version */
+
 #define CONNECT_VERBOSE   (1u << 0)
 #define CONNECT_DIAG_URL  (1u << 1)
 #define CONNECT_IPV4  (1u << 2)
-- 
2.18.0.203.gfac676dfb9

[PATCH 0/6] Compile cleanly in pedantic mode

2018-07-09 Thread Beat Bolli

While developing 6aaded550 ("builtin/config: work around an unsized
array forward declaration", 2018-07-05), I have compiled Git with
CFLAGS="-std=c99 -pedantic".

This series fixes a few compiler warnings when compiling with these
options.

Note that all warnings were produced by -pedantic; the C99 standard
option by itself didn't cause any of them.

The warnings were:

1) Char arrays initialized from a parenthesized string.

Suppressed by defining USE_PARENS_AROUND_GETTEXT_N to 0
globally. This was done just to keep the amount of warnings
manageable; this series leaves that knob alone. The advantage of
not mistakenly concatenating two translated strings is greater.

2) connect.h, refs/refs-internal.h: Forward reference to an enum.

Added two #includes that define the enums. This was already
(inconclusively) talked about in [0].

3) convert.c: Invalid escape sequence "\e".

Replaced with "\033".

4) sequencer.c: Empty statements at top level.

Removed the extra semicolons.

5) string-list.c: Forbidden to cast from void * to a function pointer and
   vice versa.

Encapsulated the function pointer in a context struct. This is
controversial because it has a performance impact, namely one
additional pointer dereference per string comparison. An
alternative might be to use multiple casts via intptr_t. But
I'm not sure if this is worth the trouble.

6) utf8.c: overflow of char values.

Used proper char literals for the BOM constants.

This series has patches for 2) to 6).

Regards,
Beat

[0] 
https://public-inbox.org/git/53ab8626-f862-a732-b369-abeab69a4...@ramsayjones.plus.com/T/


Beat Bolli (6):
  connect.h: avoid forward declaration of an enum
  refs/refs-internal.h: avoid forward declaration of an enum
  convert.c: replace "\e" escapes with "\033".
  sequencer.c: avoid empty statements at top level
  string-list.c: avoid conversion from void * to function pointer
  utf8.c: avoid char overflow

 connect.h|  2 ++
 convert.c|  2 +-
 path.h   |  2 +-
 refs/refs-internal.h |  2 ++
 sequencer.c  |  4 ++--
 string-list.c| 18 ++
 utf8.c   |  8 
 7 files changed, 26 insertions(+), 12 deletions(-)


Interdiff from the RFC series:

diff --git a/path.h b/path.h
index 1ccd0373c9..fc9d3487a0 100644
--- a/path.h
+++ b/path.h
@@ -147,7 +147,7 @@ extern void report_linked_checkout_garbage(void);
 /*
  * You can define a static memoized git path like:
  *
- *static GIT_PATH_FUNC(git_path_foo, "FOO");
+ *static GIT_PATH_FUNC(git_path_foo, "FOO")
  *
  * or use one of the global ones below.
  */
diff --git a/utf8.c b/utf8.c
index 833ce00617..982217eec9 100644
--- a/utf8.c
+++ b/utf8.c
@@ -561,15 +561,15 @@ char *reencode_string_len(const char *in, int insz,
 #endif

 static int has_bom_prefix(const char *data, size_t len,
- const unsigned char *bom, size_t bom_len)
+ const char *bom, size_t bom_len)
 {
return data && bom && (len >= bom_len) && !memcmp(data, bom, bom_len);
 }

-static const unsigned char utf16_be_bom[] = {0xFE, 0xFF};
-static const unsigned char utf16_le_bom[] = {0xFF, 0xFE};
-static const unsigned char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
-static const unsigned char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
+static const char utf16_be_bom[] = {'\xFE', '\xFF'};
+static const char utf16_le_bom[] = {'\xFF', '\xFE'};
+static const char utf32_be_bom[] = {'\0', '\0', '\xFE', '\xFF'};
+static const char utf32_le_bom[] = {'\xFF', '\xFE', '\0', '\0'};

 int has_prohibited_utf_bom(const char *enc, const char *data, size_t len)
 {

-- 
2.18.0.203.gfac676dfb9

Re: [BUG] git cherry-pick does not complain about unknown options

2018-07-09 Thread Jeff King

On Mon, Jul 09, 2018 at 04:16:16PM +0200, Andrei Rybak wrote:

> I was trying to cherry pick commits, while simultaneously changing the
> author.  Unfortunately, cherry-pick doesn't have the same --author
> option as git-commit.  However, instead of complaining about unknown
> option:

Yeah, its "--author" option is interpreted by revision.c similar to "log
--author" (because we just pass all unknown options to the traversal
machinery). So you could say:

  git cherry-pick --author=peff v1.0..v2.0

to pick all of my commits between the two versions. I'm not sure if that
would ever be _useful_, but that's how it has behaved since 7e2bfd3f99
(revert: allow cherry-picking more than one commit, 2010-06-02), I
think.

I agree that having something similar to commit's "--author" (or even
just "--reset-author") would be useful. When I've had to do this before,
I usually just cherry-pick and then follow-up with "commit --amend
--author" (or use "rebase -i" if there are several commits). You can
also do "cherry-pick -n $commit" followed by "commit -c $commit".

> All commits in tests existed in repository:
> 
> $ git cherry-pick --author='TEST'  # case 1
> error: BUG: expected exactly one commit from walk
> fatal: cherry-pick failed
> $ echo $?
> 128

I think it's reasonable for this to issue an error. But a BUG is
definitely wrong. The error is more like "empty commit set passed",
which is what we get for something like:

  $ git cherry-pick HEAD..HEAD
  error: empty commit set passed
  fatal: cherry-pick failed

that triggers the revision walker but has no positive commits. But we
don't trigger that code because the single-rev case has some magic to
allow a cherry-pick during another sequencer operation.

So I think that BUG message needs to be softened.

> $ git cherry-pick --author='TEST'# case 2
> $ echo $?
> 0

And here we come up with an empty set, but we don't flag it as an error
at all. I could even accept that philosophically as "empty input,
nothing to do", but we clearly do flag it as an error in the case I
showed above.

I think the problem is that we catch the error in
sequencer.c:prepare_revs(), after prepare_revision_walk() returns no
commits. But we _do_ have a starting commit for this case, and it's not
culled until we actually ask get_revisions() to start walking.

I think we'd want something like this:

diff --git a/sequencer.c b/sequencer.c
index 4034c0461b..b978157399 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1860,8 +1860,6 @@ static int prepare_revs(struct replay_opts *opts)
if (prepare_revision_walk(opts->revs))
return error(_("revision walk setup failed"));

-   if (!opts->revs->commits)
-   return error(_("empty commit set passed"));
return 0;
 }

@@ -2314,6 +2312,10 @@ static int walk_revs_populate_todo(struct todo_list 
*todo_list,
short_commit_name(commit), subject_len, subject);
unuse_commit_buffer(commit, commit_buffer);
}
+
+   if (!todo_list->nr)
+   return error(_("empty commit set passed"));
+
return 0;
 }

-Peff

Re: [PATCH 2/2] t3430: update to test with custom commentChar

2018-07-09 Thread Johannes Schindelin

Hi Daniel,

On Mon, 9 Jul 2018, Daniel Harding wrote:

> On Mon, 09 Jul 2018 at 00:02:00 +0300, brian m. carlson wrote:
> > On Sun, Jul 08, 2018 at 09:41:11PM +0300, Daniel Harding wrote:
> > > Signed-off-by: Daniel Harding 
> > 
> > > diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
> > > index 78f7c9958..ff474d033 100755
> > > --- a/t/t3430-rebase-merges.sh
> > > +++ b/t/t3430-rebase-merges.sh
> > > @@ -56,12 +56,12 @@ test_expect_success 'create completely different
> > > structure' '
> > >cat >script-from-scratch <<-\EOF &&
> > >label onto
> > >   -   # onebranch
> > > +  > onebranch
> > >pick G
> > >pick D
> > >label onebranch
> > >   -   # second
> > > +  > second
> > >reset onto
> > >pick B
> > >label second
> > 
> > Should this affect the "# Merge the topic branch" line (and the "# C",
> > "# E", and "# H" lines in the next test) that appears below this?  It
> > would seem those would qualify as comments as well.
> 
> I intentionally did not change that behavior for two reasons:
> 
> a) from a Git perspective, comment characters are only effectual for comments
> if they are the first character in a line
> 
> and
> 
> b) there are places where a '#' character from the todo list is actually
> parsed and used e.g. [0] and [1].  I have not yet gotten to the point of
> grokking what is going on there, so I didn't want to risk breaking something I
> didn't understand.  Perhaps Johannes could shed some light on whether the
> cases you mentioned should be changed to use the configured commentChar or
> not.
> 
> [0]
> https://github.com/git/git/blob/53f9a3e157dbbc901a02ac2c73346d375e24978c/sequencer.c#L2869
> [1]
> https://github.com/git/git/blob/53f9a3e157dbbc901a02ac2c73346d375e24978c/sequencer.c#L3797

These are related. The first one tries to support

merge -C cafecafe second-branch third-branch # Octopus 2nd/3rd branch

i.e. use '#' to separate between the commit(s) to merge and the oneline
(the latter for the reader's pleasure, just like the onelines in the `pick
 ` lines.

The second ensures that there is no valid label `#`.

I have not really thought about the ramifications of changing this to
comment_line_char, but I guess it *could* work if both locations were
changed.

Ciao,
Johannes

Re: [PATCH 2/2] t3430: update to test with custom commentChar

2018-07-09 Thread Johannes Schindelin

Hi Daniel,

On Mon, 9 Jul 2018, Daniel Harding wrote:

> One question about my original patch - there I had replaced a "grep -v"
> call with a "git stripspace" call in the 'generate correct todo list'
> test.  Is relying on "git stripspace" in a test acceptable, or should
> external text manipulation tools like grep, sed etc. be preferred?

I personally have no strong preference there; I tend to trust the
"portability" of Git's tools more than the idiosyncrasies of whatever
`grep` any setup happens to sport. But I am relatively certain that it is
the exact opposite for, say, Junio (Git's maintainer), so...

Ciao,
Johannes

Re: [PATCH v3 06/24] multi-pack-index: load into memory

2018-07-09 Thread Junio C Hamano

Derrick Stolee  writes:

> +struct multi_pack_index *load_multi_pack_index(const char *object_dir)
> +{
> + struct multi_pack_index *m = NULL;
> + int fd;
> + struct stat st;
> + size_t midx_size;
> + void *midx_map = NULL;
> + uint32_t hash_version;
> + char *midx_name = get_midx_filename(object_dir);
> +
> + fd = git_open(midx_name);
> +
> + if (fd < 0)
> + goto cleanup_fail;
> + if (fstat(fd, &st)) {
> + error_errno(_("failed to read %s"), midx_name);
> + goto cleanup_fail;
> + }
> +
> + midx_size = xsize_t(st.st_size);
> +
> + if (midx_size < MIDX_MIN_SIZE) {
> + close(fd);

With the use of "do things normally and jump to cleanup-fail label"
pattern, I think you do not want the close() here (unless you also
assign -1 to fd yourself, but that is a pointless workaround).
Another goto we see above after fstat() failure correctly omits it.

> + error(_("multi-pack-index file %s is too small"), midx_name);
> + goto cleanup_fail;
> + }
> +
> + FREE_AND_NULL(midx_name);

This correctly calls free-and-null not just free (otherwise we'd
break the cleanup-fail procedure below), which is good.

> + midx_map = xmmap(NULL, midx_size, PROT_READ, MAP_PRIVATE, fd, 0);
> +
> + m = xcalloc(1, sizeof(*m) + strlen(object_dir) + 1);
> + strcpy(m->object_dir, object_dir);

Hmph, I thought we had FLEX_ALLOC_*() convenience functions exactly
for doing things like this more safely.

> + m->fd = fd;
> + m->data = midx_map;
> + m->data_len = midx_size;
> +
> + m->signature = get_be32(m->data);
> + if (m->signature != MIDX_SIGNATURE) {
> + error(_("multi-pack-index signature 0x%08x does not match 
> signature 0x%08x"),
> +   m->signature, MIDX_SIGNATURE);
> + goto cleanup_fail;
> + }
> +
> + m->version = m->data[MIDX_BYTE_FILE_VERSION];
> + if (m->version != MIDX_VERSION) {
> + error(_("multi-pack-index version %d not recognized"),
> +   m->version);
> + goto cleanup_fail;
> + }
> +
> + hash_version = m->data[MIDX_BYTE_HASH_VERSION];
> + if (hash_version != MIDX_HASH_VERSION) {
> + error(_("hash version %u does not match"), hash_version);
> + goto cleanup_fail;
> + }
> + m->hash_len = MIDX_HASH_LEN;
> +
> + m->num_chunks = m->data[MIDX_BYTE_NUM_CHUNKS];
> +
> + m->num_packs = get_be32(m->data + MIDX_BYTE_NUM_PACKS);
> +
> + return m;

> +cleanup_fail:
> + /* no need to check for NULL when freeing */

I wonder who the target reader of this comment is.  We certainly are
not in the business of C language newbies.

If this _were_ a commit that looked like this:

-   if (ptr)
-   free(ptr);
+   /* no need to check for NULL when freeing */
+   free(ptr);

then it might be more understandable, but it still is wrong (such a
comment does not help understanding the new code, which is the only
thing the people who read the comment sees, without knowing what was
there previously---it belongs to the commit log message as a rationale
to make that change).

> diff --git a/midx.h b/midx.h
> index dbdbe9f873..2d83dd9ec1 100644
> --- a/midx.h
> +++ b/midx.h
> @@ -1,6 +1,10 @@
>  #ifndef __MIDX_H__
>  #define __MIDX_H__
>  
> +struct multi_pack_index;

I actually was quite surprised that this struct is defined in
object-store.h and not here.  It feels the other way around.

The raw_object_store needs to know that such an in-core structure
might exist as an optional feature in an object store, but as an
optional feature, I suspect that it has a pointer to an instance of
multi_pack_index, instead of embedding the struct itself in it, so I
would have expected to see an "I am only telling you that there is a
struct with this name, but I am leaving it opaque as you do not have
any business looking inside the struct yourself.  You only need to
be aware of the type's existence and a pointer to it so that you can
call helpers that know what's inside and that should be sufficient
for your needs." decl like this in object-store.h and instead an
actual implementation in here.

Re: [PATCH 2/2] t3430: update to test with custom commentChar

2018-07-09 Thread Daniel Harding


Hello brian,

On Mon, 09 Jul 2018 at 00:02:00 +0300, brian m. carlson wrote:

On Sun, Jul 08, 2018 at 09:41:11PM +0300, Daniel Harding wrote:

Signed-off-by: Daniel Harding 



diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index 78f7c9958..ff474d033 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -56,12 +56,12 @@ test_expect_success 'create completely different structure' 
'
cat >script-from-scratch <<-\EOF &&
label onto
  
-	# onebranch

+   > onebranch
pick G
pick D
label onebranch
  
-	# second

+   > second
reset onto
pick B
label second


Should this affect the "# Merge the topic branch" line (and the "# C",
"# E", and "# H" lines in the next test) that appears below this?  It
would seem those would qualify as comments as well.


I intentionally did not change that behavior for two reasons:

a) from a Git perspective, comment characters are only effectual for 
comments if they are the first character in a line


and

b) there are places where a '#' character from the todo list is actually 
parsed and used e.g. [0] and [1].  I have not yet gotten to the point of 
grokking what is going on there, so I didn't want to risk breaking 
something I didn't understand.  Perhaps Johannes could shed some light 
on whether the cases you mentioned should be changed to use the 
configured commentChar or not.


[0] 
https://github.com/git/git/blob/53f9a3e157dbbc901a02ac2c73346d375e24978c/sequencer.c#L2869
[1] 
https://github.com/git/git/blob/53f9a3e157dbbc901a02ac2c73346d375e24978c/sequencer.c#L3797


Thanks,

Daniel Harding

Re: [RFC PATCH 2/6] refs/refs-internal.h: avoid forward declaration of an enum

2018-07-09 Thread Jeff King

On Sun, Jul 08, 2018 at 04:43:38PM +0200, Beat Bolli wrote:

> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
> index dd834314bd..a78b5cb803 100644
> --- a/refs/refs-internal.h
> +++ b/refs/refs-internal.h
> @@ -1,6 +1,8 @@
>  #ifndef REFS_REFS_INTERNAL_H
>  #define REFS_REFS_INTERNAL_H
>  
> +#include "iterator.h"   /* for enum iterator_selection */

IMHO this kind of comment does more harm than good, because it is so
prone to going stale (nobody is going to bother updating it when they
add new dependencies on iterator.h). Anybody who is interested in the
original reason can use "git blame" to dig up your commit message. And
anybody who is thinking about deleting that line would need to dig into
whether anything had been added in the meantime that also requires the
include.

So at best it's redundant, and at worst it's slightly misleading. :)

Not worth a re-roll by itself, but it looked like you had a few other
bits in the other patches to address.

Other than this minor quibble, the whole series looks good to me, modulo
the existing review.

-Peff

Re: [PATCH v2 2/2] fetch: send "refs/tags/" prefix upon CLI refspecs

2018-07-09 Thread Brandon Williams

On 07/09, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > I agree with this observation, though I'm a bit sad about it.  I think
> > that having tag auto-following the default is a little confusing (and
> > hurts perf[1] when using proto v2) but since thats the way its always been
> > we'll have to live with it for now.  I think exploring changing the
> > defaults might be a good thing to do in the future.  But for now we've
> > had enough people comment on this lacking functionality that we should
> > fix it.
> >
> > [1] Thankfully it doesn't completely undo what protocol v2 did, as we
> > still are able to eliminate refs/changes or refs/pull or other various
> > refs which significantly add to the number of refs advertised during
> > fetch.
> 
> I thought JTan's <20180618231642.174650-1-jonathanta...@google.com>
> showed us a way forward to reduce the overhead even further without
> having to be sad ;-).  Am I mistaken?

That's true, what Jonathan mentioned there would avoid having to send
"refs/tags/*" when requesting the refs.  The question is do we wait on
implementing that functionality (as another feature to fetch) or do we
fix this now?

-- 
Brandon Williams

Re: [PATCH v2 2/2] fetch: send "refs/tags/" prefix upon CLI refspecs

2018-07-09 Thread Junio C Hamano

Brandon Williams  writes:

> I agree with this observation, though I'm a bit sad about it.  I think
> that having tag auto-following the default is a little confusing (and
> hurts perf[1] when using proto v2) but since thats the way its always been
> we'll have to live with it for now.  I think exploring changing the
> defaults might be a good thing to do in the future.  But for now we've
> had enough people comment on this lacking functionality that we should
> fix it.
>
> [1] Thankfully it doesn't completely undo what protocol v2 did, as we
> still are able to eliminate refs/changes or refs/pull or other various
> refs which significantly add to the number of refs advertised during
> fetch.

I thought JTan's <20180618231642.174650-1-jonathanta...@google.com>
showed us a way forward to reduce the overhead even further without
having to be sad ;-).  Am I mistaken?

Re: [PATCH 2/2] t3430: update to test with custom commentChar

2018-07-09 Thread Daniel Harding


Hi Johannes,

On Mon, 09 Jul 2018 at 10:52:13 +0300, Johannes Schindelin wrote:

Hi Brian,

On Sun, 8 Jul 2018, brian m. carlson wrote:


On Sun, Jul 08, 2018 at 09:41:11PM +0300, Daniel Harding wrote:

Signed-off-by: Daniel Harding 


I think maybe, as you suggested, a separate test for this would be
beneficial.  It might be as simple as modifying 'script-from-scratch' by
doing "sed 's/#/>/'".


It might be even simpler if you come up with a new "fake editor" to merely
copy the todo list, then run a rebase without overridden
commentChar, then one with overridden commentChar, then pipe the todo list
of the first through that `sed` call:


 write_script copy-todo-list.sh <<-\EOF &&
 cp "$1" todo-list.copy
 EOF
test_config sequence.editor \""$PWD"/copy-todo-list.sh\" &&
git rebase -r  &&
sed "s/#/%/" expect &&
test_config core.commentChar % &&
git rebase -r  &&
test_cmp expect todo-list.copy


Indeed, as I thought about it more, using a "no-op" todo editor seemed 
like a good approach.  Thanks for giving me a head start - I'll play 
with that and try to get a new patch with an improved test posted in the 
next couple of days.


One question about my original patch - there I had replaced a "grep -v" 
call with a "git stripspace" call in the 'generate correct todo list' 
test.  Is relying on "git stripspace" in a test acceptable, or should 
external text manipulation tools like grep, sed etc. be preferred?


Thanks,

Daniel Harding

Re: [PATCH v3 17/20] range-diff: add a man page

2018-07-09 Thread Stefan Beller

On Tue, Jul 3, 2018 at 4:26 AM Johannes Schindelin via GitGitGadget
 wrote:

> +'git range-diff' [--color=[]] [--no-color] []
> +   [--dual-color] [--creation-factor=]
> +   (   | ... |)
> +
> +DESCRIPTION
> +---
> +
> +This command shows the differences between two versions of a patch
> +series, or more generally, two commit ranges (ignoring merges).

Does it completely ignore merges or does it die("not supported"), how is the
user expected to cope with the accidental merge in the given range?

> +To that end, it first finds pairs of commits from both commit ranges
> +that correspond with each other. Two commits are said to correspond when
> +the diff between their patches (i.e. the author information, the commit
> +message and the commit diff) is reasonably small compared to the
> +patches' size. See ``Algorithm` below for details.
> +
> +Finally, the list of matching commits is shown in the order of the
> +second commit range, with unmatched commits being inserted just after
> +all of their ancestors have been shown.
> +
> +
> +OPTIONS
> +---
> +--dual-color::
> +   When the commit diffs differ, recreate the original diffs'
> +   coloring, and add outer -/+ diff markers with the *background*
> +   being red/green to make it easier to see e.g. when there was a
> +   change in what exact lines were added.

I presume this is a boolean option, and can be turned off with
--no-dual-color, but not with --dual-color=no. Would it be worth to
give the --no-option here as well.
The more pressing question I had when reading this, is whether this
is the default.

> +--creation-factor=::
> +   Set the creation/deletion cost fudge factor to ``.
> +   Defaults to 60. Try a larger value if `git range-diff` erroneously
> +   considers a large change a total rewrite (deletion of one commit
> +   and addition of another), and a smaller one in the reverse case.
> +   See the ``Algorithm`` section below for an explanation why this is
> +   needed.
> +
> + ::
> +   Compare the commits specified by the two ranges, where
> +   `` is considered an older version of ``.

Is it really older? How does that help the user?
I think this comes from the notion of e.g. patch 4 ("range-diff: improve the
order of the shown commits "), that assume the user wants the range-diff
to be expressed with range2 as its "base range".

> +...::
> +   Equivalent to passing `..` and `..`.

That is cool.

> +Algorithm
> +-
> +
> +The general idea is this: we generate a cost matrix between the commits
> +in both commit ranges, then solve the least-cost assignment.

Can you say more about the generation of the cost matrix?
I assume that it counts the number of lines added/deleted to make
one patch into the other patch.

If that assumption was correct, an edit of a commit message adding one
line is just as costly as adding one line in the diff.

Further I would assume that the context lines are ignored?

I think this is worth spelling out.

Another spot to look at is further metadata, such as author and author-date,
which are kept the same in a rebase workflow.

Maybe worth noting that this algorithm doesn't pay special attention to these,
but a change in them would be strong signal that the two patches compared are
not the same?

I like the example below, thanks!
Stefan

Re: [RFC PATCH 6/6] utf8.c: avoid char overflow

2018-07-09 Thread Junio C Hamano

Junio C Hamano  writes:

> Beat Bolli  writes:
>
 -static const char utf16_be_bom[] = {0xFE, 0xFF};
 -static const char utf16_le_bom[] = {0xFF, 0xFE};
 -static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
 -static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
 +static const unsigned char utf16_be_bom[] = {0xFE, 0xFF};
 +static const unsigned char utf16_le_bom[] = {0xFF, 0xFE};
 +static const unsigned char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
 +static const unsigned char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
>>>
>>> An alternative approach that might be easier to read (and avoids the
>>> confusion arising from our use of (signed) chars for strings pretty
>>> much
>>> everywhere):
>>>
>>> #define FE ((char)0xfe)
>>> #define FF ((char)0xff)
>>>
>>> ...
>>
>> I have tried this first (without the macros, though), and thought
>> it looked really ugly. That's why I chose this solution. The usage
>> is pretty local and close to function has_bom_prefix().
>
> I found that what you posted was already OK, as has_bom_prefix()
> appears only locally in this file and that is the only thing that
> cares about these foo_bom[] constants.  Casting the elements in
> these arrays to (char) type is also fine and not all that ugly,
> I think, and between the two (but without the macro) I have no
> strong preference.  I wonder if writing them as '\376' and '\377'
> as old timers would helps the compiler, though.

Heh, I should have read the remainder of my mailbox to learn that
you came to the same conclusion in response to Dscho's comment
before saying the above.  Hex is OK as all our compilers would be
capable of gloking '\xff' (there is a rather unusual '\x20' already
in builtin/clone.c).

Thanks.

Re: What's cooking in git.git (Jun 2018, #07; Thu, 28)

2018-07-09 Thread Junio C Hamano

Jonathan Tan  writes:

>> * jt/fetch-pack-negotiator (2018-06-15) 7 commits
>>  - fetch-pack: introduce negotiator API
>>  - fetch-pack: move common check and marking together
>>  - fetch-pack: make negotiation-related vars local
>>  - fetch-pack: use ref adv. to prune "have" sent
>>  - fetch-pack: directly end negotiation if ACK ready
>>  - fetch-pack: clear marks before re-marking
>>  - fetch-pack: split up everything_local()
>>  (this branch is used by jt/fetch-nego-tip.)
>> 
>>  Code restructuring and a small fix to transport protocol v2 during
>>  fetching.
>> 
>>  Is this ready for 'next'?
>
> Sorry for the late reply - I think that this is ready for "next". I have
> addressed all the comments that you, Jonathan Nieder and Brandon
> Williams have given, and Brandon has given his OK [1].
>
> [1] https://public-inbox.org/git/20180625182434.ge19...@google.com/

Thanks, I've re-read it over the weekend and agree that we are in a
good shape here.

Re: Unexpected behavior with :/ references

2018-07-09 Thread Jeff King

On Sun, Jul 08, 2018 at 10:13:12PM -0700, William Chargin wrote:

> After further investigation, it appears that ":/foo" indeed resolves to
> the commit with message "foobar" (in the above example) if the commits
> are not all created at the same time: e.g., by adding `sleep 1` between
> the commit commands, or exporting `GIT_AUTHOR_DATE`.

I suspect it's actually GIT_COMMITTER_DATE (but I didn't dig into it, so
I could be wrong). "Youngest" in terms of commits generally refers to
the committer date.

> This leaves only the question of why :/ references don't resolve
> to commits reachable only from HEAD, and whether this is the best
> behavior.

I agree that it would make sense to look at HEAD here. The point of ":/"
is to search everything reachable, and HEAD is part of that
reachability. _Usually_ it doesn't matter because HEAD is pointing to
another ref, but the detached case is an exception.

Doing this would bring us in line with the behavior of other parts of
Git. E.g., f0298cf1c6 (revision walker: include a detached HEAD
in --all, 2009-01-16).

The patch is probably this:

diff --git a/sha1-name.c b/sha1-name.c
index 60d9ef3c7e..641ca12f91 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -1650,6 +1650,7 @@ static int get_oid_with_context_1(const char *name,
struct commit_list *list = NULL;

for_each_ref(handle_one_ref, &list);
+   head_ref(handle_one_ref, &list);
commit_list_sort_by_date(&list);
return get_oid_oneline(name + 2, oid, list);
}

but I didn't test it at all. Do you want to try adding a case for this
to our test suite (probably in t4208) and wrapping it up with a commit
message?

-Peff

Re: [PATCH v2 2/2] fetch: send "refs/tags/" prefix upon CLI refspecs

2018-07-09 Thread Brandon Williams

On 07/09, Jonathan Nieder wrote:
> Hi,
> 
> Jonathan Tan wrote:
> 
> > --- a/builtin/fetch.c
> > +++ b/builtin/fetch.c
> > @@ -359,7 +359,7 @@ static struct ref *get_ref_map(struct transport 
> > *transport,
> > refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
> >  
> > if (ref_prefixes.argc &&
> > -   (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
> > +   (tags == TAGS_SET || tags == TAGS_DEFAULT)) {
> 
> Makes a lot of sense.
> 
> This means that if I run
> 
>   git fetch origin master
> 
> then the ls-refs step will now include refs/tags/* in addition to
> refs/heads/master, erasing some of the gain that protocol v2 brought.
> But since the user didn't pass --no-tags, that is what the user asked
> for.
> 
> One could argue that the user didn't *mean* to ask for that.  In other
> words, I wonder if the following would better match the user intent:
> 
>  - introduce a new tagopt value, --tags=auto or something, that means
>that tags are only followed when no refspec is supplied.  In other
>words,
> 
>   git fetch --tags=auto 
> 
>would perform tag auto-following, while
> 
>   git fetch --tags=auto  
> 
>would act like fetch --no-tags.
> 
>  - make --tags=auto the default for new clones.
> 
> What do you think?
> 
> Some related thoughts on tag auto-following:
> 
> It would be tempting to not use tag auto-following at all in the
> default configuration and just include refs/tags/*:refs/tags/* in the
> default clone refspec.  Unfortunately, that would be a pretty
> significant behavior change from the present behavior, since
> 
>  - it would fetch tags pointing to objects unrelated to the fetched
>history
>  - if we have ref-in-want *with* pattern support, it would mean we
>try to fetch all tags on every fetch.  As discussed in the
>thread [1], this is expensive and difficult to get right.
>  - if an unannotated tag is fast-forwarded, it would allow existing
>tags to be updated
>  - it interacts poorly with --prune
>  - ...
> 
> I actually suspect it's a good direction in the long term, but until
> we address those downsides (see also the occasional discussions on tag
> namespaces), we're not ready for it.  The --tags=auto described above
> is a sort of half approximation.
> 
> Anyway, this is all a long way of saying that despite the performance
> regression I believe this patch heads in the right direction.  The
> performance regression is inevitable in what the user is
> (unintentionally) requesting, and we can address it separately by
> changing the defaults to better match what the user intends to
> request.

I agree with this observation, though I'm a bit sad about it.  I think
that having tag auto-following the default is a little confusing (and
hurts perf[1] when using proto v2) but since thats the way its always been
we'll have to live with it for now.  I think exploring changing the
defaults might be a good thing to do in the future.  But for now we've
had enough people comment on this lacking functionality that we should
fix it.

[1] Thankfully it doesn't completely undo what protocol v2 did, as we
still are able to eliminate refs/changes or refs/pull or other various
refs which significantly add to the number of refs advertised during
fetch.

-- 
Brandon Williams

Re: [RFC PATCH 6/6] utf8.c: avoid char overflow

2018-07-09 Thread Beat Bolli

On 09.07.18 18:33, Junio C Hamano wrote:
> Beat Bolli  writes:
> 
 -static const char utf16_be_bom[] = {0xFE, 0xFF};
 -static const char utf16_le_bom[] = {0xFF, 0xFE};
 -static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
 -static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
 +static const unsigned char utf16_be_bom[] = {0xFE, 0xFF};
 +static const unsigned char utf16_le_bom[] = {0xFF, 0xFE};
 +static const unsigned char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
 +static const unsigned char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
>>>
>>> An alternative approach that might be easier to read (and avoids the
>>> confusion arising from our use of (signed) chars for strings pretty
>>> much
>>> everywhere):
>>>
>>> #define FE ((char)0xfe)
>>> #define FF ((char)0xff)
>>>
>>> ...
>>
>> I have tried this first (without the macros, though), and thought
>> it looked really ugly. That's why I chose this solution. The usage
>> is pretty local and close to function has_bom_prefix().
> 
> I found that what you posted was already OK, as has_bom_prefix()
> appears only locally in this file and that is the only thing that
> cares about these foo_bom[] constants.  Casting the elements in
> these arrays to (char) type is also fine and not all that ugly,
> I think, and between the two (but without the macro) I have no
> strong preference.  I wonder if writing them as '\376' and '\377'
> as old timers would helps the compiler, though.
> 

Yes, it does, as I found out in
https://public-inbox.org/git/e3df2644b59b170e26b2a7c0d3978...@drbeat.li/

But I prefer hex; it's closer to the usual definition of the BOM bytes.

Beat

Re: [PATCH 0/6] Add merge recursive testcases with undetected conflicts

2018-07-09 Thread Junio C Hamano

Elijah Newren  writes:

> SPOILER ALERT: This series contains answers to the "fun puzzle" at
>   
> https://public-inbox.org/git/CABPp-BFc1OLYKzS5rauOehvEugPc0oGMJp-NMEAmVMW7QR=4...@mail.gmail.com/
>
> When a merge succeeds, we expect the resulting contents to depend only
> upon the trees and blobs of the branches involved and of their merge
> base(s).  Unfortunately, there are currently about half a dozen cases
> where the contents of a "successful" merge depend on the relative
> commit timestamps of the merge bases.  Document these with testcases.
>
> (This series came out of looking at modifying how file collision
> conflict types are handled, as discussed at [1].  I discovered these
> issues while working on that topic.)

I have a topic branch for this series but not merged to 'pu' as
test-lint gives these:

t6036-recursive-corner-cases.sh:1222: error: "export FOO=bar" is not portable 
(please use FOO=bar && export FOO):   echo "export 
PATH=~/bin:$PATH" >source_me.bash &&
t6036-recursive-corner-cases.sh:1227: error: "export FOO=bar" is not portable 
(please use FOO=bar && export FOO):   echo "export 
PATH=~/bin:$PATH" >source_me.bash &&
Makefile:77: recipe for target 'test-lint-shell-syntax' failed
make: *** [test-lint-shell-syntax] Error 1

Arguably these are false positives because "source_me.bash" file is
a mere payload to go through the merge process to be munged and we
never intend to actually execute its contents with bash, but then
the test payload probably does not even have to be a string that
triggers such a false positive to begin with ;-)

Re: [PATCH 07/17] commit: increase commit message buffer size

2018-07-09 Thread Junio C Hamano

Derrick Stolee  writes:

> On 7/8/2018 7:36 PM, brian m. carlson wrote:
>> 100 bytes is not sufficient to ensure we can write a commit message
>> buffer when using a 32-byte hash algorithm.  Increase the buffer size to
>> ensure we have sufficient space.
>>
>> Signed-off-by: brian m. carlson 
>> ---
>>   refs/files-backend.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index a9a066dcfb..252f835bae 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> @@ -1587,7 +1587,7 @@ static int log_ref_write_fd(int fd, const struct 
>> object_id *old_oid,
>>  char *logrec;
>>  msglen = msg ? strlen(msg) : 0;
>> -maxlen = strlen(committer) + msglen + 100;
>> +maxlen = strlen(committer) + msglen + 200;
>>  logrec = xmalloc(maxlen);
>>  len = xsnprintf(logrec, maxlen, "%s %s %s\n",
>>  oid_to_hex(old_oid),
>
> nit: 100 is not enough anymore, but wasn't a very descriptive
> value. 200 may be enough now, but I'm not sure why.

As Brandon alludes to downthread, we probably should use strbuf for
things like this these days, so a preliminary clean-up to do so is
probably a welcome change to sneak in and rebase this series on top
of.

"%s %s %s\n" with old and new commit object name and the message
will be "2 * len(hash_in_hex) + 4" bytes long (counting the three
whitespaces and the terminating NUL), and Shawn's original in
6de08ae6 ("Log ref updates to logs/refs/", 2006-05-17) actually
computed this one as "strlen(...) + 2*40+4".

100 was merely me being sloppier than Shawn at 8ac65937 ("Make sure
we do not write bogus reflog entries.", 2007-01-26), preferring
being sufficient over not wasting even a single byte.

Re: [PATCH v2 2/2] fetch: send "refs/tags/" prefix upon CLI refspecs

2018-07-09 Thread Jonathan Nieder

Hi,

Jonathan Tan wrote:

> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -359,7 +359,7 @@ static struct ref *get_ref_map(struct transport 
> *transport,
>   refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
>  
>   if (ref_prefixes.argc &&
> - (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
> + (tags == TAGS_SET || tags == TAGS_DEFAULT)) {

Makes a lot of sense.

This means that if I run

git fetch origin master

then the ls-refs step will now include refs/tags/* in addition to
refs/heads/master, erasing some of the gain that protocol v2 brought.
But since the user didn't pass --no-tags, that is what the user asked
for.

One could argue that the user didn't *mean* to ask for that.  In other
words, I wonder if the following would better match the user intent:

 - introduce a new tagopt value, --tags=auto or something, that means
   that tags are only followed when no refspec is supplied.  In other
   words,

git fetch --tags=auto 

   would perform tag auto-following, while

git fetch --tags=auto  

   would act like fetch --no-tags.

 - make --tags=auto the default for new clones.

What do you think?

Some related thoughts on tag auto-following:

It would be tempting to not use tag auto-following at all in the
default configuration and just include refs/tags/*:refs/tags/* in the
default clone refspec.  Unfortunately, that would be a pretty
significant behavior change from the present behavior, since

 - it would fetch tags pointing to objects unrelated to the fetched
   history
 - if we have ref-in-want *with* pattern support, it would mean we
   try to fetch all tags on every fetch.  As discussed in the
   thread [1], this is expensive and difficult to get right.
 - if an unannotated tag is fast-forwarded, it would allow existing
   tags to be updated
 - it interacts poorly with --prune
 - ...

I actually suspect it's a good direction in the long term, but until
we address those downsides (see also the occasional discussions on tag
namespaces), we're not ready for it.  The --tags=auto described above
is a sort of half approximation.

Anyway, this is all a long way of saying that despite the performance
regression I believe this patch heads in the right direction.  The
performance regression is inevitable in what the user is
(unintentionally) requesting, and we can address it separately by
changing the defaults to better match what the user intends to
request.

Thanks,
Jonathan

[1] https://public-inbox.org/git/20180605175144.4225-1-bmw...@google.com/

1 2 >

1 - 100 of 143 matches

Mail list logo