Re: What's cooking in git.git (May 2017, #03; Wed, 10)

2017-05-11 Thread Junio C Hamano
Johannes Schindelin  writes:

> Hi Junio,
>
> On Wed, 10 May 2017, Junio C Hamano wrote:
>
>> * jc/bundle (2016-03-03) 6 commits
>>  - index-pack: --clone-bundle option
>>  - Merge branch 'jc/index-pack' into jc/bundle
>>  - bundle v3: the beginning
>>  - bundle: keep a copy of bundle file name in the in-core bundle header
>>  - bundle: plug resource leak
>>  - bundle doc: 'verify' is not about verifying the bundle
>> 
>>  The beginning of "split bundle", which could be one of the
>>  ingredients to allow "git clone" traffic off of the core server
>>  network to CDN.
>> 
>>  This was surrected from a "to be discarded" pile, as from time to
>>  time people wonder about resumable clone that can be primed without
>>  bothering Git servers with dynamic packfile creation, and some
>>  people seem to think that the topic could serve as a useful
>>  building block for that goal.  But nothing seem to have happend.
>>  Unless people really want it, I am inclined to discard this topic.
>>  Opinions?
>
> The primary concern that wants to be solved by these patches is the
> resumable clone, right?
>
> If so, I think that we may want to rethink that approach. If your
> bandwidth is flakey and your repository is large, the upcoming work to
> support fetching objects incrementally (there are three competing
> proposals about this IIUC, hopefully they will settle into a single
> approach soon) may actually be the better way forward.

In short, these won't help, those who asked them to be kept a bit
longer in my tree were mistaken, and nobody will miss them if I just
discarded this topic?

I'm all for that ;-)  The smaller number of patches I need to carry
around, the better.

Thanks.


Re: [PATCH v2 0/6] convert pathspec.c to take an index parameter

2017-05-11 Thread Junio C Hamano
Brandon Williams  writes:

> The main difference in v2 is that instead of piping through an index_state
> struct into parse_pathspec, I ripped out the logic that needed to access the
> index and either removed it completely if it wasn't needed anymore (stripping
> submodule slash) or factored it out into its own function which can be called
> after initializing a pathspec object (dying if a path descends into a
> submodule).
>
> Brandon Williams (6):
>   pathspec: provide a more descriptive die message
>   submodule: add die_in_unpopulated_submodule function
>   pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE flag
>   ls-files: prevent prune_cache from overeagerly pruning submodules
>   pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP
>   pathspec: convert find_pathspecs_matching_against_index to take an
> index

This was (perhaps surprisingly) quite a pleasant read, even compared
to the other approach ;-).


Re: [PATCH] C style: use standard style for "TRANSLATORS" comments

2017-05-11 Thread Johannes Sixt

Am 11.05.2017 um 23:20 schrieb Ævar Arnfjörð Bjarmason:

diff --git a/builtin/notes.c b/builtin/notes.c
index 7b891471c4..fb856e53b6 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -340,8 +340,10 @@ static struct notes_tree *init_notes_check(const char 
*subcommand,
  
  	ref = (flags & NOTES_INIT_WRITABLE) ? t->update_ref : t->ref;

if (!starts_with(ref, "refs/notes/"))
-   /* TRANSLATORS: the first %s will be replaced by a
-  git notes command: 'add', 'merge', 'remove', etc.*/
+   /*
+* TRANSLATORS: the first %s will be replaced by a git
+* notes command: 'add', 'merge', 'remove', etc.
+*/


Rewrapping lines is generally frowned upon because it makes it difficult 
to see whether something was changed. Keeping the line wrapping will 
also reduce the noise in the next .pot commit, I think (not sure if that 
is a worthwhile goal, though).



I hate it when J. Random Developer insists in a particular line length 
and when they have their editor do the wrapping, logical entities are 
suddenly split into two lines: it is "git notes", one logical thing; not 
two words "git" "notes" that happen to occur in sequence.



-- Hannes


Re: [PATCH 11/29] grep: add a test helper function for less verbose -f \0 tests

2017-05-11 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> Add a helper function to make the tests which check for patterns with
> \0 in them more succinct. Right now this isn't a big win, but
> subsequent commits will add a lot more of these tests.
>
> The helper is based on the match() function in t3070-wildmatch.sh.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  t/t7008-grep-binary.sh | 58 
> +-
>  1 file changed, 29 insertions(+), 29 deletions(-)
>
> diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
> index 9c9c378119..6c1952eafa 100755
> --- a/t/t7008-grep-binary.sh
> +++ b/t/t7008-grep-binary.sh
> @@ -4,6 +4,29 @@ test_description='git grep in binary files'
>  
>  . ./test-lib.sh
>  
> +nul_match() {

Micronit: "nul_match () {"

> + status=$1
> + flags=$2
> + pattern=$3
> + pattern_human=$(echo $pattern | sed 's/Q//g')

Double quote around "$pattern"?

> +
> + if test $status = "1"

Double quote around "$status" and drop double quote around "1"
(which is clearly a literal string without any funnies) instead?

> + then
> + test_expect_success "git grep -f f $flags '$pattern_human' a" "
> + printf '$pattern' | q_to_nul >f &&
> + git grep -f f $flags a
> + "
> + elif test $status = "0"
> + then
> + test_expect_success "git grep -f f $flags '$pattern_human' a" "
> + printf '$pattern' | q_to_nul >f &&
> + test_must_fail git grep -f f $flags a
> + "

It somehow was unintuitive that 0 expected failure and 1 expected
success, but it probably was just me.


[PATCH 2/3] commit.c: add is_scissors_line

2017-05-11 Thread Brian Malehorn
Move is_scissors_line to commit.c and expose it through commit.h.
This is needed in commit.c, and mailinfo.c shouldn't really own it.
---
 commit.c   | 52 
 commit.h   |  1 +
 mailinfo.c | 53 +
 3 files changed, 54 insertions(+), 52 deletions(-)

diff --git a/commit.c b/commit.c
index fab826973..041cfa5a9 100644
--- a/commit.c
+++ b/commit.c
@@ -1646,6 +1646,58 @@ const char *find_commit_header(const char *msg, const 
char *key, size_t *out_len
return NULL;
 }
 
+int is_scissors_line(const char *line)
+{
+   const char *c;
+   int scissors = 0, gap = 0;
+   const char *first_nonblank = NULL, *last_nonblank = NULL;
+   int visible, perforation = 0, in_perforation = 0;
+
+   for (c = line; *c != '\n'; c++) {
+   if (isspace(*c)) {
+   if (in_perforation) {
+   perforation++;
+   gap++;
+   }
+   continue;
+   }
+   last_nonblank = c;
+   if (first_nonblank == NULL)
+   first_nonblank = c;
+   if (*c == '-') {
+   in_perforation = 1;
+   perforation++;
+   continue;
+   }
+   if ((!memcmp(c, ">8", 2) || !memcmp(c, "8<", 2) ||
+!memcmp(c, ">%", 2) || !memcmp(c, "%<", 2))) {
+   in_perforation = 1;
+   perforation += 2;
+   scissors += 2;
+   c++;
+   continue;
+   }
+   in_perforation = 0;
+   }
+
+   /*
+* The mark must be at least 8 bytes long (e.g. "-- >8 --").
+* Even though there can be arbitrary cruft on the same line
+* (e.g. "cut here"), in order to avoid misidentification, the
+* perforation must occupy more than a third of the visible
+* width of the line, and dashes and scissors must occupy more
+* than half of the perforation.
+*/
+
+   if (first_nonblank && last_nonblank)
+   visible = last_nonblank - first_nonblank + 1;
+   else
+   visible = 0;
+   return (scissors && 8 <= visible &&
+   visible < perforation * 3 &&
+   gap * 2 < perforation);
+}
+
 /*
  * Inspect the given string and determine the true "end" of the log message, in
  * order to find where to put a new Signed-off-by: line.  Ignored are
diff --git a/commit.h b/commit.h
index 9c12abb91..58cbab1cd 100644
--- a/commit.h
+++ b/commit.h
@@ -353,6 +353,7 @@ extern void free_commit_extra_headers(struct 
commit_extra_header *extra);
  */
 extern const char *find_commit_header(const char *msg, const char *key,
  size_t *out_len);
+extern int is_scissors_line(const char *line);
 
 /* Find the end of the log message, the right place for a new trailer. */
 extern int ignore_non_trailer(const char *buf, size_t len);
diff --git a/mailinfo.c b/mailinfo.c
index eadd0597f..52af800a5 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "utf8.h"
 #include "strbuf.h"
+#include "commit.h"
 #include "mailinfo.h"
 
 static void cleanup_space(struct strbuf *sb)
@@ -654,58 +655,6 @@ static inline int patchbreak(const struct strbuf *line)
return 0;
 }
 
-static int is_scissors_line(const char *line)
-{
-   const char *c;
-   int scissors = 0, gap = 0;
-   const char *first_nonblank = NULL, *last_nonblank = NULL;
-   int visible, perforation = 0, in_perforation = 0;
-
-   for (c = line; *c != '\n'; c++) {
-   if (isspace(*c)) {
-   if (in_perforation) {
-   perforation++;
-   gap++;
-   }
-   continue;
-   }
-   last_nonblank = c;
-   if (first_nonblank == NULL)
-   first_nonblank = c;
-   if (*c == '-') {
-   in_perforation = 1;
-   perforation++;
-   continue;
-   }
-   if ((!memcmp(c, ">8", 2) || !memcmp(c, "8<", 2) ||
-!memcmp(c, ">%", 2) || !memcmp(c, "%<", 2))) {
-   in_perforation = 1;
-   perforation += 2;
-   scissors += 2;
-   c++;
-   continue;
-   }
-   in_perforation = 0;
-   }
-
-   /*
-* The mark must be at least 8 bytes long (e.g. "-- >8 --").
-* Even though there can be arbitrary cruft on the same line
-* (e.g. "cut here"), in order to avoid misidentification, the
-* perforation must occupy more than a third of the 

[PATCH 3/3] commit.c: skip scissors when computing trailers

2017-05-11 Thread Brian Malehorn
"scissors" ("- >8 -") can be automatically added to commit
messages by setting commit.verbose = true. Prevent this from interfering
with trailer calculations by automatically skipping over scissors,
instead of (usually) treating them as a comment.
---
 commit.c  | 13 +
 t/t7513-interpret-trailers.sh | 17 +
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/commit.c b/commit.c
index 041cfa5a9..9a7b41d09 100644
--- a/commit.c
+++ b/commit.c
@@ -1701,10 +1701,10 @@ int is_scissors_line(const char *line)
 /*
  * Inspect the given string and determine the true "end" of the log message, in
  * order to find where to put a new Signed-off-by: line.  Ignored are
- * trailing comment lines and blank lines, and also the traditional
- * "Conflicts:" block that is not commented out, so that we can use
- * "git commit -s --amend" on an existing commit that forgot to remove
- * it.
+ * trailing comment lines and blank lines.  To support "git commit -s
+ * --amend" on an existing commit, we also ignore "Conflicts:".  To
+ * support "git commit -v", we truncate at " >8 " and similar
+ * scissors lines.
  *
  * Returns the number of bytes from the tail to ignore, to be fed as
  * the second parameter to append_signoff().
@@ -1723,6 +1723,11 @@ int ignore_non_trailer(const char *buf, size_t len)
else
next_line++;
 
+   if (is_scissors_line([bol])) {
+   if (!boc)
+   boc = bol;
+   break;
+   }
if (buf[bol] == comment_line_char || buf[bol] == '\n') {
/* is this the first of the run of comments? */
if (!boc)
diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh
index 4dd1d7c52..d88d4a4ff 100755
--- a/t/t7513-interpret-trailers.sh
+++ b/t/t7513-interpret-trailers.sh
@@ -1258,4 +1258,21 @@ test_expect_success 'with no command and no key' '
test_cmp expected actual
 '
 
+test_expect_success 'with scissors' '
+   cat >expected <<-EOF &&
+   my subject
+
+   review: Brian
+   sign: A U Thor 
+   #  >8 
+   ignore this
+   EOF
+   git interpret-trailers --trailer review:Brian >actual <<-EOF &&
+   my subject
+   #  >8 
+   ignore this
+   EOF
+   test_cmp expected actual
+'
+
 test_done
-- 
2.12.3.3.g39c96af



[PATCH 1/3] mailinfo.c: is_scissors_line ends on newline

2017-05-11 Thread Brian Malehorn
Needed to work with git interpret-trailers. Since "line" is, of course,
a line, it will always end with "\n\0" and therefore we can safely end
on "\n".
---
 mailinfo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mailinfo.c b/mailinfo.c
index a489d9d0f..eadd0597f 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -661,7 +661,7 @@ static int is_scissors_line(const char *line)
const char *first_nonblank = NULL, *last_nonblank = NULL;
int visible, perforation = 0, in_perforation = 0;
 
-   for (c = line; *c; c++) {
+   for (c = line; *c != '\n'; c++) {
if (isspace(*c)) {
if (in_perforation) {
perforation++;
-- 
2.12.3.3.g39c96af



[PATCH 0/3] interpret-trailers + commit -v bugfix

2017-05-11 Thread Brian Malehorn

Hi all,

This patch series addresses a bug in interpret-trailers. If the commit
that is being editted is "verbose", it will contain a scissors string
("-- >8 --") and a diff. interpret-trailers doesn't interpret the
scissors and therefore places trailer information after the diff. A
simple reproduction is:

git config commit.verbose true
GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
git commit --amend

This patch series fixes the issue by obeying scissors when computing
trailers location.

P.S. This is my first patch series to the git mailing list, to feel free
to point out any mistakes I made when submitting.

 commit.c  | 65 ---
 commit.h  |  1 +
 mailinfo.c| 53 +--
 t/t7513-interpret-trailers.sh | 17 +++
 4 files changed, 80 insertions(+), 56 deletions(-)


Re: [PATCH 09/29] grep: amend submodule recursion test for regex engine testing

2017-05-11 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> Amend the submodule recursion test to prepare it for subsequent tests
> of whether it passes along the grep.patternType to the submodule
> greps.
>
> This is the result of searching & replacing:
>
> foobar -> (1|2)d(3|4)
> foo-> (1|2)
> bar-> (3|4)
> ...
>  test_expect_success 'grep and multiple patterns' '
>   cat >expect <<-\EOF &&
> - b/b:bar
> + b/b:(3|4)
>   EOF
>  
> - git grep -e "bar" --and --not -e "foo" --recurse-submodules >actual &&
> + git grep -e "(3|4)" --and --not -e "(1|2)d" --recurse-submodules 
> >actual &&


This breaks the promise "foo maps to (1|2)"; I do not think you need
to add 'd' in order to make the test to succeed, so I am not sure
what is going on here.



Re: [PATCH 04/29] log: add exhaustive tests for pattern style options & config

2017-05-11 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> Add exhaustive tests for how the different grep.patternType options &
> the corresponding command-line options affect git-log.
> ...
> The patterns being passed to fixed/basic/extended/PCRE are carefully
> crafted to return the wrong thing if the grep engine were to pick any
> other matching method than the one it's told to use.

That is good; it may be even more helpful to the readers if they are
given a brief in-code comment.  I had to scratch head while reading
them.

>
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  t/t4202-log.sh | 77 
> +-
>  1 file changed, 76 insertions(+), 1 deletion(-)
>
> diff --git a/t/t4202-log.sh b/t/t4202-log.sh
> index f577990716..6d1411abea 100755
> --- a/t/t4202-log.sh
> +++ b/t/t4202-log.sh
> @@ -262,7 +262,23 @@ test_expect_success 'log --grep -i' '
>  
>  test_expect_success 'log -F -E --grep= uses ere' '
>   echo second >expect &&
> - git log -1 --pretty="tformat:%s" -F -E --grep=s.c.nd >actual &&
# BRE would need \(s\) to do the same.
> + git log -1 --pretty="tformat:%s" -F -E --grep="(s).c.nd" >actual &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success PCRE 'log -F -E --perl-regexp --grep= uses PCRE' '
> + test_when_finished "rm -rf num_commits" &&
> + git init num_commits &&
> + (
> + cd num_commits &&
> + test_commit 1d &&
> + test_commit 2e
> + ) &&
> + echo 2e >expect &&
# In PCRE \d in [\d] is like saying "0-9", and match '2' in 2e
> + git -C num_commits log -1 --pretty="tformat:%s" -F -E --perl-regexp 
> --grep="[\d]" >actual &&
> + test_cmp expect actual &&
> + echo 1d >expect &&
# In ERE [\d] is bs or letter 'd' and would not match '2' in '2e'
# but does match 'd' in '1d'
> + git -C num_commits log -1 --pretty="tformat:%s" -F -E --grep="[\d]" 
> >actual &&
>   test_cmp expect actual
>  '
>  
> @@ -280,6 +296,65 @@ test_expect_success 'log with grep.patternType 
> configuration and command line' '
>   test_cmp expect actual
>  '
>  
> +test_expect_success 'log with various grep.patternType configurations & 
> command-lines' '
> + git init pattern-type &&
> + (
> + cd pattern-type &&
> + test_commit 1 file A &&
> + test_commit "(1|2)" file B &&
> +
> + echo "(1|2)" >expect.fixed &&
> + cp expect.fixed expect.basic &&
> + cp expect.fixed expect.extended &&
> + cp expect.fixed expect.perl &&
> +
# Fixed finds these literally
> + git -c grep.patternType=fixed log --pretty=tformat:%s \
> + --grep="(1|2)" >actual.fixed &&
# BRE matches (, |, and ) literally
> + git -c grep.patternType=basic log --pretty=tformat:%s \
> + --grep="(.|.)" >actual.basic &&
# ERE needs | quoted with bs to match | literally
> + git -c grep.patternType=extended log --pretty=tformat:%s \
> + --grep="\|2" >actual.extended &&

If we use BRE by mistake, wouldn't this pattern actually find
"(1|2)" anyway, without skipping it and show "1 file A" instead?

> + if test_have_prereq PCRE
> + then
> + git -c grep.patternType=perl log --pretty=tformat:%s \
> + --grep="[\d]\|" >actual.perl
# Only PCRE would match [\d]\| with "(1|2)" due to [\d]
> + fi &&
> + test_cmp expect.fixed actual.fixed &&
> + test_cmp expect.basic actual.basic &&
> + test_cmp expect.extended actual.extended &&
> + if test_have_prereq PCRE
> + then
> + test_cmp expect.perl actual.perl
> + fi &&

It could be just a style thing, but I would actually find it easier
to follow if the above two "only with PCRE" tests, i.e. running test
and taking its output in actual.perl and comparing it with
expect.perl, were inside a single "if test_have_prereq PCRE" block.


Re: [PATCH v5] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jeff King
On Thu, May 11, 2017 at 03:30:54PM -0700, Jonathan Tan wrote:

> fetch-pack, when fetching a literal SHA-1 from a server that is not
> configured with uploadpack.allowtipsha1inwant (or similar), always
> returns an error message of the form "Server does not allow request for
> unadvertised object %s". However, it is sometimes the case that such
> object is advertised. This situation would occur, for example, if a user
> or a script was provided a SHA-1 instead of a branch or tag name for
> fetching, and wanted to invoke "git fetch" or "git fetch-pack" using
> that SHA-1.
> 
> Teach fetch-pack to also check the SHA-1s of the refs in the received
> ref advertisement if a literal SHA-1 was given by the user.
> 
> Helped-by: Jeff King 
> Signed-off-by: Jonathan Tan 
> ---

This looks good to me. There's one minor nit that I don't think I saw
mentioned, and I don't think needs to hold up the patch. But I wanted to
mention it just in case I'm wrong that it doesn't matter.

>  static void filter_refs(struct fetch_pack_args *args,
>   struct ref **refs,
>   struct ref **sought, int nr_sought)
>  {
>   struct ref *newlist = NULL;
>   struct ref **newtail = 
> + struct ref *unmatched = NULL;
>   struct ref *ref, *next;
> + struct oidset tip_oids = OIDSET_INIT;
>   int i;
>  
>   i = 0;
> @@ -631,7 +651,8 @@ static void filter_refs(struct fetch_pack_args *args,
>   ref->next = NULL;
>   newtail = >next;
>   } else {
> - free(ref);
> + ref->next = unmatched;
> + unmatched = ref;
>   }

The incoming "refs" list is sorted, and we rely on that sorting to do
the linear walk. Likewise, we append to newlist via newtail, so it
remains sorted (and I suspect the consumers of the list rely on that).
But your new "unmatched" list is done by prepending, so it's in reverse
order.

I don't think that matters for our purposes here, and the list doesn't
escape our function. So there's no bug, but I just wonder if it might
end up biting somebody in the future. I'm OK with leaving it, though.

-Peff


Re: [PATCH v5] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jeff King
On Thu, May 11, 2017 at 03:46:39PM -0700, Jonathan Nieder wrote:

> > +static void add_refs_to_oidset(struct oidset *oids, struct ref *refs)
> > +{
> > +   for (; refs; refs = refs->next)
> > +   oidset_insert(oids, >old_oid);
> > +}
> > +
> > +static int tip_oids_contain(struct oidset *tip_oids,
> > +   struct ref *unmatched, struct ref *newlist,
> > +   const struct object_id *id)
> > +{
> > +   if (!tip_oids->map.cmpfn) {
> 
> This feels like a layering violation.  Could it be e.g. a static inline
> function oidset_is_initialized in oidset.h?

It definitely is a layering violation, though we normally are pretty
"open" with letting callers peek in at our data structures.

I'd be fine with it as-is or with the helper function you suggested.
But...

> +/**
> + * Returns true iff "set" has been initialized (for example by inserting
> + * an entry). An oidset is considered uninitialized if it hasn't had any
> + * oids inserted since it was last cleared.
> + */
> +static inline int oidset_initialized(const struct oidset *set)
> +{
> + return set->map.cmpfn ? 1 : 0;
> +}

Now we're committing our own layering violation. I was surprised to find
that hashmap_free() clears "cmpfn", and I'm not sure if we would want to
count on that (not that it probably matters all that much, but it is
required for the description you gave to be accurate).

The hashmap documentation suggests using "tablesize" for checking
whether something is initialized. Maybe we ought to use that.

-Peff


Re: [RFC] send-email: support validate hook

2017-05-11 Thread Junio C Hamano
Jonathan Tan  writes:

> diff --git a/git-send-email.perl b/git-send-email.perl
> index eea0a517f..7de91ca7c 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -27,6 +27,7 @@ use Term::ANSIColor;
>  use File::Temp qw/ tempdir tempfile /;
>  use File::Spec::Functions qw(catfile);
>  use Error qw(:try);
> +use Cwd qw(abs_path cwd);
>  use Git;
>  use Git::I18N;
>  
> @@ -628,9 +629,24 @@ if (@rev_list_opts) {
>  @files = handle_backup_files(@files);
>  
>  if ($validate) {
> + my @hook = ($repo->repo_path().'/hooks/sendemail-validate', '');
> + my $use_hook = -x $hook[0];
> + if ($use_hook) {
> + # The hook needs a correct GIT_DIR.
> + $ENV{"GIT_DIR"} = $repo->repo_path();
> + }
>   foreach my $f (@files) {
>   unless (-p $f) {
> - my $error = validate_patch($f);
> + my $error;
> + if ($use_hook) {
> + $hook[1] = abs_path($f);
> + my $cwd_save = cwd();
> + chdir($repo->wc_path() or $repo->repo_path());
> + $error = "rejected by sendemail-validate hook"
> + unless system(@hook) == 0;
> + chdir($cwd_save);
> + }
> + $error = validate_patch($f) unless $error;
>   $error and die sprintf(__("fatal: %s: %s\nwarning: no 
> patches were sent\n"),
> $f, $error);
>   }

This is not about "Perl code" but I find it somewhat strange to add
this code to outside validate_patch() when we have a helper function 
specifically designed to "validate patch" and the new code is about
teaching the program an additional way to "validate patch".

Also I am not sure if setting and exporting GIT_DIR for the entire
program, only when we run this hook is a sensible thing to do.

Either the remainder of the program is safe to see the GIT_DIR
pointing at the correct location (in which case $ENV{GIT_DIR}
assignment can be done outside "if ($validate && $use_hook)" to
affect everything), or the rest of the program is not prepared to
see such a forced assignment and export (in which case we should
limit the setting of the environment to the subprocess that runs
system(@hook) without affecting anything else).  Doing something in
the middle conditionally feels like it is inviting future troubles
coming from the inconsistency (e.g. "this feature totally unrelated
to the validate hook works when validate hook is in use but
otherwise it doesn't, because $GIT_DIR is sometimes set and
sometimes not").


> diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
> index 60a80f60b..f3f238d40 100755
> --- a/t/t9001-send-email.sh
> +++ b/t/t9001-send-email.sh
> @@ -1913,4 +1913,44 @@ test_expect_success $PREREQ 'leading and trailing 
> whitespaces are removed' '
>   test_cmp expected-list actual-list
>  '
>  
> +test_expect_success $PREREQ 'invoke hook' '
> + mkdir -p .git/hooks &&
> +
> + write_script .git/hooks/sendemail-validate <<-\EOF &&
> + # test that we have the correct environment variable, pwd, and
> + # argument
> + case "$GIT_DIR" in
> + *.git)
> + true
> + ;;
> + *)
> + false
> + ;;
> + esac &&

Case arms indented one level too deep.

> + test -e 0001-add-master.patch &&

Do we only care about existence and do not mind if it is a
directory?  Otherwise using "-f" would be more sensible, perhaps?


Re: [PATCH] compat/regex: fix compilation on Windows

2017-05-11 Thread Junio C Hamano
Johannes Schindelin  writes:

> The real issue here is that GNU awk's regex implementation assumes a bit
> too much about the relative sizes of pointers and long integers. What they
> really want is to use intptr_t.

Good.  I got annoyed enough to do the same myself before opening my
mailbox.  One thing that is curious about your version is that it
still has "#include " behind HAVE_STDINT_H; when I tried
to compile an equivalent of your change last night, the compilation
failed because intptr_t wasn't available without exposing 

Where is Windows build getting its intptr_t, I wonder.


Re: [PATCH 00/11] Start retiring .git/remotes/ and .git/branches/ for good

2017-05-11 Thread Junio C Hamano
Johannes Schindelin  writes:

> Git uses the config for remote/upstream information in favor of the
> previously-used .git/remotes/ and .git/branches/ for a decade now.

The last time I thought about trying this several years ago, I found
that people who need to grab things from many places still do use
.git/branches/ and their use case is hard to migrate to .git/config,
primarily because the former is "one per file" and it is easy to
add/remove/tweak without affecting others.  Ask akpm@ if he still
prefers to use .git/branches/ for example.

Is it really hurting us having to support these old information
sources we treat as read-only?


Re: What's cooking in git.git (May 2017, #03; Wed, 10)

2017-05-11 Thread Junio C Hamano
Jeff King  writes:

> On Thu, May 11, 2017 at 05:31:45PM -0400, Marc Branchaud wrote:
>
>> > * mb/diff-default-to-indent-heuristics (2017-05-09) 4 commits
>> [...]
>> >  Kicked out of next; it seems it is still getting review suggestions?
>> 
>> I believe v4 of this one is ready to cook.
>
> Yeah, I think it's ready, too (and I agree with you on the text for the
> release notes).

Yeah, what is on 'pu' is v4 and I think everybody on the list
discussion was happy with it.  Thanks.


Re: [PATCH 8/8] pathspec: convert parse_pathspec to take an index

2017-05-11 Thread Junio C Hamano
Brandon Williams  writes:

> ...  Note that if we go with the route to not pass
> in an index now, it doesn't necessarily mean that down the line we won't
> have to pass a 'repository' instance into parse_pathspec().

Correct.  

An attribute annotated pathspec may want to check if the attributes
used in it are used in the repository at all for validation or
optimization purposes, for example.


Re: [PATCH 1/7] compat/regex: add a README with a maintenance guide

2017-05-11 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> diff --git a/compat/regex/README b/compat/regex/README
> new file mode 100644
> index 00..345d322d8c
> --- /dev/null
> +++ b/compat/regex/README
> @@ -0,0 +1,21 @@
> +This is the Git project's copy of the GNU awk (Gawk) regex
> +engine. It's used when Git is build with e.g. NO_REGEX=NeedsStartEnd,
> +or when the C library's regular expression functions are otherwise
> +deficient.
> +
> +This is not a fork, but a source code copy. Upstream is the Gawk
> +project, and the sources should be periodically updated from their
> +copy, which can be done with:
> +
> +for f in $(find . -name '*.[ch]' -printf "%f\n"); do wget 
> http://git.savannah.gnu.org/cgit/gawk.git/plain/support/$f -O $f; done
> +
> +For ease of maintenance, and to intentionally make it inconvenient to
> +diverge from upstream (since it makes it harder to re-merge) any local
> +changes should be stored in the patches/ directory, which after doing
> +the above can be applied as:
> +
> +for p in patches/*; do patch -p3 < $p; done
> +
> +For any changes that aren't specific to the git.git copy please submit
> +a patch to the Gawk project and/or to the GNU C library (the Gawk
> +regex engine is a periodically & forked copy from glibc.git).

I am not a huge fan of placing patch files under version control.

If I were doing the "code drop from the outside world from time to
time", I'd rather do the following every time we update:

 - have a topic branch for importing version N+1, and in its first
   commit, replace compat/regex/ with the pristine copy of the files
   we'll borrow from version N+1.

 - ask "git log -p compat/regex/" to grab all changes made to the
   directory, and stop at the commit that imported the pristine copy
   of the files we borrowed from version N.  These are the changes
   we made to the pristine copy of version N to adjust it to our
   needs.

 - cherry-pick these patches on the topic branch; some of them
   hopefully have been upstreamed, the remainder of the patches are
   presumably to adjust the code to our local needs.

 - make more changes, while still on the topic branch, to adjust the
   code to our local and current needs.

 - once the result becomes buildable and tests OK, merge it back to
   the mainline.

This may break bisectability, but I think it is OK (you should be
able to skip and test only first-parent chain, treating as if these
are squashed together into a single change).  The patch files your
approach is keeping will become the individual patches on the topic
branch, and will be explained and justified the same way as any
other patches in their commit log message.

Having said all that, since I am not expecting to be the primary one
working in this area, I'll let you (who I take to be volunteering to
be the one) pick the approach that you would find the easiest and
least error prone to handle this task.

Thanks.


Re: [PATCH 0/7] Update the compat/regex engine from upstream

2017-05-11 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> See the first patch for motivation & why.
> ...

I do not necessarily agree with the upgrading strategy outlined in
1/7, but that is a separate issue.  There may be some other bits
that needs resurrecting out of "git log -p master compat/regex/",
but I cannot test them myself (the part changed by the following
patch have no effect in an environment where long is intptr_t, so
"make NO_REGEX=YesPlease" build does not fail for me), so I'm
letting Dscho's Windows builder find it out via Travis.

-- >8 --
Date: Fri, 12 May 2017 09:00:07 +0900
Subject: [PATCH] compat/regex: make it compile with -Werror=int-to-pointer-cast

The change by 56a1a3ab ("Silence GCC's "cast of pointer to integer
of a different size" warning", 2015-10-26) may need resurrecting; I
do not think an unprotected #include  is the best way to
do this, but for the purpose of places that needs further work,
thishsould do.

Signed-off-by: Junio C Hamano 
---
 compat/regex/regcomp.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/compat/regex/regcomp.c b/compat/regex/regcomp.c
index 5e9ea26cd4..5688a639bf 100644
--- a/compat/regex/regcomp.c
+++ b/compat/regex/regcomp.c
@@ -24,9 +24,7 @@
License along with the GNU C Library; if not, see
.  */
 
-#ifdef HAVE_STDINT_H
 #include 
-#endif
 
 #ifdef HAVE_STRINGS_H
 #include 
@@ -2641,7 +2639,7 @@ parse_dup_op (bin_tree_t *elem, re_string_t *regexp, 
re_dfa_t *dfa,
 old_tree = NULL;
 
   if (elem->token.type == SUBEXP)
-postorder (elem, mark_opt_subexp, (void *) (long) elem->token.opr.idx);
+postorder (elem, mark_opt_subexp, (void *) (intptr_t) elem->token.opr.idx);
 
   tree = create_tree (dfa, elem, NULL, (end == -1 ? OP_DUP_ASTERISK : OP_ALT));
   if (BE (tree == NULL, 0))
@@ -3868,7 +3866,7 @@ create_token_tree (re_dfa_t *dfa, bin_tree_t *left, 
bin_tree_t *right,
 static reg_errcode_t
 mark_opt_subexp (void *extra, bin_tree_t *node)
 {
-  int idx = (int) (long) extra;
+  int idx = (int) (intptr_t) extra;
   if (node->token.type == SUBEXP && node->token.opr.idx == idx)
 node->token.opt_subexp = 1;
 
-- 
2.13.0-334-gbb1c091dbc



Re: Possible bug in includeIf / conditional includes on non git initialised directories

2017-05-11 Thread Jeff King
On Thu, May 11, 2017 at 10:31:14PM +0200, Sebastian Schuberth wrote:

> On 2017-05-11 20:53, Raphael Stolt wrote:
> 
> > I might have stumbled this time over a real bug in includeIf / conditional 
> > includes or maybe it's just as intended.
> > 1) Given I have a correct configured includeIf and I’m issuing `git config 
> > --show-origin --get user.email` against an directory which hasn’t been `git 
> > init`ed I get the user.email configured globally.
> 
> I don't think that's a bug surprise: The condition in the conditional
> include is "gitdir:". Before running "git init", it simply *is* no
> gitdir.

Yeah, I think all is working as intended. A "cwd:" conditional seems
like it would be useful, but I think it would have a lot of corner
cases. It may change over the course of a program, and you have
weirdness with things like "git --git-dir=/some/other/path", where your
cwd and the git repository you're looking at are totally unrelated.

-Peff


Re: What's cooking in git.git (May 2017, #03; Wed, 10)

2017-05-11 Thread Jeff King
On Thu, May 11, 2017 at 05:31:45PM -0400, Marc Branchaud wrote:

> > * mb/diff-default-to-indent-heuristics (2017-05-09) 4 commits
> [...]
> >  Kicked out of next; it seems it is still getting review suggestions?
> 
> I believe v4 of this one is ready to cook.

Yeah, I think it's ready, too (and I agree with you on the text for the
release notes).

-Peff


Re: Best "triangle" workflow setup?

2017-05-11 Thread Jeff King
On Thu, May 11, 2017 at 04:23:03PM -0500, Robert Dailey wrote:

> On Thu, May 11, 2017 at 3:17 PM, Jeff King  wrote:
> > I think you want:
> >
> >   [push]
> >   default = current
> >   [remote]
> >   pushDefault = myfork
> >
> > to make "git push" do what you want. And then generally have branches
> > mark their counterparts on "origin" (which you can do either at creation
> > time, or probably by using "git push -u origin my-topic" when you push
> > them).
> 
> So without the `pushDefault` setting, `current` will default to a
> remote named `origin` if there is no tracking branch set, correct? So
> `pushDefault` is effectively overriding this built-in default? In
> addition, it seems like since this overrides `branch.name.remote`,
> that this effectively makes the remote tracking branch *only* for
> `pull`. Is this a correct understanding?

Right. The general idea of a triangular workflow is that where you pull
from is not the same as where you push to. We have branch.*.pushremote
if you really wanted to do it on a per-branch basis, but in my
experience you almost always want to use "myfork", because you can't
push to "origin" in the first place. :)

> > This is similar to what I do for my git.git workflow, though I usually
> > have origin/master as the branch's upstream. I.e., I'd create them with:
> >
> >   git checkout -b my-topic origin
> 
> I'm looking through the `git checkout` and `git branch` documentation,
> but I don't see any mention of it being valid to use a remote name as
> the  parameter (you're using `origin` in the above
> example). Am I misunderstanding? Did you mean origin/my-topic?

Using "origin" there will resolve to "origin/HEAD", i.e., origin/master.
So basically I am saying that all of my topic branches are based on
master, and if I were to rebase them (for example), I'd want to rebase
the whole thing.

If I were to "git pull", they'd also pull from master, which may or may
not be what you want (though with pull.rebase, perhaps). I don't
generally use "git pull" at all for my git.git workflow.

> > And then rebasing always happens on top of master (because "origin"
> > doesn't even have my topic branch at all). If I want to compare with
> > what I've pushed to my fork, I'd use "@{push}".
> 
> Can you explain more about how your rebase chooses master instead of
> your same-named remote tracking branch? Maybe provide some examples of
> your rebase command and respective configuration (unless what you've
> already provided is sufficient). As for @{push}, I haven't used this
> before, so I'll dig in the docs and learn about it.

The default for "git rebase" (if you don't specify a base) is the
configured upstream, which in my case is origin/master. Most of my
rebasing is "rebase -i" to rewrite bits, so it automatically picks all
the commits on my topic branch.

Maybe it would help to set up a trivial example:

  # just a helper to make dummy commits
  commit() { echo "$1" >"$1" && git add "$1" && git commit -m "$1"; }

  # some parent repo
  git init parent
  (cd parent && commit one)

  # and imagine you have a public fork, too
  git clone --bare parent myfork.git

  # and then you have your local clone; in real life this is obviously
  # the only one that would actually be on your machine, but this is a
  # toy example
  git clone parent local
  cd local

  # set up our triangular config
  git remote add myfork ../myfork.git
  git config remote.pushdefault myfork
  git config push.default current

  # now let's try a topic branch
  git checkout -b topic origin
  commit two
  commit three

  # config will show our topic based on origin/master:
  #  [branch "topic"]
  # remote = origin
  # merge = refs/heads/master
  less .git/config

  # this should default to all the commits in our topic (i.e., two, three)
  git rebase -i

  # let's imagine upstream makes more commits on master. We can "pull
  # --rebase" to put our work on top
  (cd ../parent && commit four)
  git pull --rebase

  # pushes go to the matching branch on myfork
  git push

  # if you want to see what you haven't pushed yet, you can use @{push}
  commit five
  git log @{push}..

  # likewise, if you wanted to rebase only commits that you've been
  # working on since your last push:
  git rebase -i @{push}

  # Now imagine "origin" picks up your branch...
  (cd ../parent && git fetch ../myfork.git topic:topic)

  # Depending on your project's workflow, you may want to consider that
  # the new base for further development (and never rebase or rewrite
  # any commits that origin has). You do that by re-pointing your
  # @{upstream} config.
  git fetch
  git branch --set-upstream-to=origin/topic topic

  # now a "rebase -i" would show only the commits origin doesn't have
  # (five and six in this case)
  commit six
  git rebase -i


Hopefully that shows off some of the ways you can use the upstream and
push config in practice.  Some people may not be as excited about the
"rebase" default as I am. 

Re: [PATCH v5] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jonathan Nieder
Jonathan Tan wrote:

[...]
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -15,6 +15,7 @@
>  #include "version.h"
>  #include "prio-queue.h"
>  #include "sha1-array.h"
> +#include "oidset.h"
>  
>  static int transfer_unpack_limit = -1;
>  static int fetch_unpack_limit = -1;
> @@ -592,13 +593,32 @@ static void mark_recent_complete_commits(struct 
> fetch_pack_args *args,
>   }
>  }
>  
> +static void add_refs_to_oidset(struct oidset *oids, struct ref *refs)
> +{
> + for (; refs; refs = refs->next)
> + oidset_insert(oids, >old_oid);
> +}
> +
> +static int tip_oids_contain(struct oidset *tip_oids,
> + struct ref *unmatched, struct ref *newlist,
> + const struct object_id *id)
> +{
> + if (!tip_oids->map.cmpfn) {

This feels like a layering violation.  Could it be e.g. a static inline
function oidset_is_initialized in oidset.h?

> + add_refs_to_oidset(tip_oids, unmatched);
> + add_refs_to_oidset(tip_oids, newlist);
> + }
> + return oidset_contains(tip_oids, id);
> +}

The rest looks good.

With or without that change,
Reviewed-by: Jonathan Nieder 

Thanks for your patient work.

diff --git i/fetch-pack.c w/fetch-pack.c
index 9dd430a65a..0394580434 100644
--- i/fetch-pack.c
+++ w/fetch-pack.c
@@ -603,7 +603,7 @@ static int tip_oids_contain(struct oidset *tip_oids,
struct ref *unmatched, struct ref *newlist,
const struct object_id *id)
 {
-   if (!tip_oids->map.cmpfn) {
+   if (!oidset_initialized(tip_oids)) {
add_refs_to_oidset(tip_oids, unmatched);
add_refs_to_oidset(tip_oids, newlist);
}
diff --git i/oidset.c w/oidset.c
index ac169f05d3..f2a6753b8a 100644
--- i/oidset.c
+++ w/oidset.c
@@ -18,7 +18,7 @@ int oidset_contains(const struct oidset *set, const struct 
object_id *oid)
 {
struct hashmap_entry key;
 
-   if (!set->map.cmpfn)
+   if (!oidset_initialized(set))
return 0;
 
hashmap_entry_init(, sha1hash(oid->hash));
@@ -29,7 +29,7 @@ int oidset_insert(struct oidset *set, const struct object_id 
*oid)
 {
struct oidset_entry *entry;
 
-   if (!set->map.cmpfn)
+   if (!oidset_initialized(set))
hashmap_init(>map, oidset_hashcmp, 0);
 
if (oidset_contains(set, oid))
diff --git i/oidset.h w/oidset.h
index b7eaab5b88..2e7d889770 100644
--- i/oidset.h
+++ w/oidset.h
@@ -22,6 +22,16 @@ struct oidset {
 
 #define OIDSET_INIT { { NULL } }
 
+/**
+ * Returns true iff "set" has been initialized (for example by inserting
+ * an entry). An oidset is considered uninitialized if it hasn't had any
+ * oids inserted since it was last cleared.
+ */
+static inline int oidset_initialized(const struct oidset *set)
+{
+   return set->map.cmpfn ? 1 : 0;
+}
+
 /**
  * Returns true iff `set` contains `oid`.
  */


[PATCH v5] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jonathan Tan
fetch-pack, when fetching a literal SHA-1 from a server that is not
configured with uploadpack.allowtipsha1inwant (or similar), always
returns an error message of the form "Server does not allow request for
unadvertised object %s". However, it is sometimes the case that such
object is advertised. This situation would occur, for example, if a user
or a script was provided a SHA-1 instead of a branch or tag name for
fetching, and wanted to invoke "git fetch" or "git fetch-pack" using
that SHA-1.

Teach fetch-pack to also check the SHA-1s of the refs in the received
ref advertisement if a literal SHA-1 was given by the user.

Helped-by: Jeff King 
Signed-off-by: Jonathan Tan 
---

Change from v4: incorporated Jonathan Nieder's suggestion about using
another function. There is no oidset_is_empty, so I checked for a
presence of a member variable instead.
---
 fetch-pack.c  | 34 --
 t/t5500-fetch-pack.sh | 35 +++
 2 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index afb8b0502..9dd430a65 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -15,6 +15,7 @@
 #include "version.h"
 #include "prio-queue.h"
 #include "sha1-array.h"
+#include "oidset.h"
 
 static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
@@ -592,13 +593,32 @@ static void mark_recent_complete_commits(struct 
fetch_pack_args *args,
}
 }
 
+static void add_refs_to_oidset(struct oidset *oids, struct ref *refs)
+{
+   for (; refs; refs = refs->next)
+   oidset_insert(oids, >old_oid);
+}
+
+static int tip_oids_contain(struct oidset *tip_oids,
+   struct ref *unmatched, struct ref *newlist,
+   const struct object_id *id)
+{
+   if (!tip_oids->map.cmpfn) {
+   add_refs_to_oidset(tip_oids, unmatched);
+   add_refs_to_oidset(tip_oids, newlist);
+   }
+   return oidset_contains(tip_oids, id);
+}
+
 static void filter_refs(struct fetch_pack_args *args,
struct ref **refs,
struct ref **sought, int nr_sought)
 {
struct ref *newlist = NULL;
struct ref **newtail = 
+   struct ref *unmatched = NULL;
struct ref *ref, *next;
+   struct oidset tip_oids = OIDSET_INIT;
int i;
 
i = 0;
@@ -631,7 +651,8 @@ static void filter_refs(struct fetch_pack_args *args,
ref->next = NULL;
newtail = >next;
} else {
-   free(ref);
+   ref->next = unmatched;
+   unmatched = ref;
}
}
 
@@ -648,7 +669,9 @@ static void filter_refs(struct fetch_pack_args *args,
continue;
 
if ((allow_unadvertised_object_request &
-   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
+(ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1)) ||
+   tip_oids_contain(_oids, unmatched, newlist,
+>old_oid)) {
ref->match_status = REF_MATCHED;
*newtail = copy_ref(ref);
newtail = &(*newtail)->next;
@@ -656,6 +679,13 @@ static void filter_refs(struct fetch_pack_args *args,
ref->match_status = REF_UNADVERTISED_NOT_ALLOWED;
}
}
+
+   oidset_clear(_oids);
+   for (ref = unmatched; ref; ref = next) {
+   next = ref->next;
+   free(ref);
+   }
+
*refs = newlist;
 }
 
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index b5865b385..80a1a3239 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -547,6 +547,41 @@ test_expect_success 'fetch-pack can fetch a raw sha1' '
git fetch-pack hidden $(git -C hidden rev-parse refs/hidden/one)
 '
 
+test_expect_success 'fetch-pack can fetch a raw sha1 that is advertised as a 
ref' '
+   rm -rf server client &&
+   git init server &&
+   test_commit -C server 1 &&
+
+   git init client &&
+   git -C client fetch-pack ../server \
+   $(git -C server rev-parse refs/heads/master)
+'
+
+test_expect_success 'fetch-pack can fetch a raw sha1 overlapping a named ref' '
+   rm -rf server client &&
+   git init server &&
+   test_commit -C server 1 &&
+   test_commit -C server 2 &&
+
+   git init client &&
+   git -C client fetch-pack ../server \
+   $(git -C server rev-parse refs/tags/1) refs/tags/1
+'
+
+test_expect_success 'fetch-pack cannot fetch a raw sha1 that is not advertised 
as a ref' '
+   rm -rf server &&
+
+   git init server &&
+   test_commit -C server 5 &&
+   git -C server tag -d 5 &&
+   test_commit -C server 6 &&
+
+   git init client &&
+   test_must_fail git -C client 

[PATCH v2 5/6] pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP

2017-05-11 Thread Brandon Williams
Since (ae8d08242 pathspec: pass directory indicator to
match_pathspec_item()) the path matching logic has been able to cope
with submodules without needing to strip off a trailing slash if a path
refers to a submodule.

Since stripping the slash is no longer necessary, remove the
PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag.

Signed-off-by: Brandon Williams 
---
 builtin/reset.c |  1 -
 builtin/rm.c|  3 +--
 builtin/submodule--helper.c |  3 +--
 pathspec.c  | 15 ---
 pathspec.h  | 10 --
 5 files changed, 6 insertions(+), 26 deletions(-)

diff --git a/builtin/reset.c b/builtin/reset.c
index fc3b906c4..5db2adc4c 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -236,7 +236,6 @@ static void parse_args(struct pathspec *pathspec,
 
parse_pathspec(pathspec, 0,
   PATHSPEC_PREFER_FULL |
-  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP |
   (patch_mode ? PATHSPEC_PREFIX_ORIGIN : 0),
   prefix, argv);
 }
diff --git a/builtin/rm.c b/builtin/rm.c
index fb79dcab1..7c323d012 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -271,8 +271,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
die(_("index file corrupt"));
 
parse_pathspec(, 0,
-  PATHSPEC_PREFER_CWD |
-  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP,
+  PATHSPEC_PREFER_CWD,
   prefix, argv);
refresh_index(_index, REFRESH_QUIET, , NULL, NULL);
 
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 566a5b6a6..8cc648d85 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -233,8 +233,7 @@ static int module_list_compute(int argc, const char **argv,
int i, result = 0;
char *ps_matched = NULL;
parse_pathspec(pathspec, 0,
-  PATHSPEC_PREFER_FULL |
-  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP,
+  PATHSPEC_PREFER_FULL,
   prefix, argv);
 
if (pathspec->nr)
diff --git a/pathspec.c b/pathspec.c
index e42431278..1e5df2316 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -386,18 +386,6 @@ static const char *parse_element_magic(unsigned *magic, 
int *prefix_len,
return parse_short_magic(magic, elem);
 }
 
-static void strip_submodule_slash_cheap(struct pathspec_item *item)
-{
-   if (item->len >= 1 && item->match[item->len - 1] == '/') {
-   int i = cache_name_pos(item->match, item->len - 1);
-
-   if (i >= 0 && S_ISGITLINK(active_cache[i]->ce_mode)) {
-   item->len--;
-   item->match[item->len] = '\0';
-   }
-   }
-}
-
 /*
  * Perform the initialization of a pathspec_item based on a pathspec element.
  */
@@ -470,9 +458,6 @@ static void init_pathspec_item(struct pathspec_item *item, 
unsigned flags,
item->original = xstrdup(elt);
}
 
-   if (flags & PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP)
-   strip_submodule_slash_cheap(item);
-
if (magic & PATHSPEC_LITERAL) {
item->nowildcard_len = item->len;
} else {
diff --git a/pathspec.h b/pathspec.h
index 3729efa85..6671b9577 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -58,19 +58,17 @@ struct pathspec {
 #define PATHSPEC_PREFER_CWD (1<<0) /* No args means match cwd */
 #define PATHSPEC_PREFER_FULL (1<<1) /* No args means match everything */
 #define PATHSPEC_MAXDEPTH_VALID (1<<2) /* max_depth field is valid */
-/* strip the trailing slash if the given path is a gitlink */
-#define PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP (1<<3)
 /* die if a symlink is part of the given path's directory */
-#define PATHSPEC_SYMLINK_LEADING_PATH (1<<4)
-#define PATHSPEC_PREFIX_ORIGIN (1<<5)
-#define PATHSPEC_KEEP_ORDER (1<<6)
+#define PATHSPEC_SYMLINK_LEADING_PATH (1<<3)
+#define PATHSPEC_PREFIX_ORIGIN (1<<4)
+#define PATHSPEC_KEEP_ORDER (1<<5)
 /*
  * For the callers that just need pure paths from somewhere else, not
  * from command line. Global --*-pathspecs options are ignored. No
  * magic is parsed in each pathspec either. If PATHSPEC_LITERAL is
  * allowed, then it will automatically set for every pathspec.
  */
-#define PATHSPEC_LITERAL_PATH (1<<7)
+#define PATHSPEC_LITERAL_PATH (1<<6)
 
 extern void parse_pathspec(struct pathspec *pathspec,
   unsigned magic_mask,
-- 
2.13.0.rc2.291.g57267f2277-goog



clone vs submodule operation with HTTP cURL

2017-05-11 Thread Jean-Francois Bouchard
Hello everyone,

In our usage of GSSAPI via HTTPS, all our operation with git are very
well handle, however, when trying to update a submodule, git seems to
be managing cURL differently. cURL drop the ball quickly.

Example (No other setup needed on the client) :
git clone HTTPrepo -> GET -> 401 -> GET -> 401 -> GET (this time with
Authorization: Negotiate)  -> 200 OK

git submodule update -> GET -> 401 -> git prompt for username

Is the codepath for clone regarding cURL is different than with submodule ?

Using : 2.13.0, I have also tried the emptyAuth stuff with no avail.

Thanks,
JF

-- 


Avis de confidentialité

Les informations contenues dans le présent message et dans toute pièce qui 
lui est jointe sont confidentielles et peuvent être protégées par le secret 
professionnel. Ces informations sont à l’usage exclusif de son ou de ses 
destinataires. Si vous recevez ce message par erreur, veuillez s’il vous 
plait communiquer immédiatement avec l’expéditeur et en détruire tout 
exemplaire. De plus, il vous est strictement interdit de le divulguer, de 
le distribuer ou de le reproduire sans l’autorisation de l’expéditeur. 
Merci.

Confidentiality notice

This e-mail message and any attachment hereto contain confidential 
information which may be privileged and which is intended for the exclusive 
use of its addressee(s). If you receive this message in error, please 
inform sender immediately and destroy any copy thereof. Furthermore, any 
disclosure, distribution or copying of this message and/or any attachment 
hereto without the consent of the sender is strictly prohibited. Thank you.


[PATCH v2 6/6] pathspec: convert find_pathspecs_matching_against_index to take an index

2017-05-11 Thread Brandon Williams
Convert find_pathspecs_matching_against_index to take an index
parameter.

In addition mark pathspec.c with NO_THE_INDEX_COMPATIBILITY_MACROS now
that it doesn't use any cache macros or reference 'the_index'.

Signed-off-by: Brandon Williams 
---
 builtin/add.c  |  4 ++--
 builtin/check-ignore.c |  2 +-
 pathspec.c | 11 +++
 pathspec.h |  7 +--
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 86770d6af..0365a5209 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -136,7 +136,7 @@ static char *prune_directory(struct dir_struct *dir, struct 
pathspec *pathspec,
*dst++ = entry;
}
dir->nr = dst - dir->entries;
-   add_pathspec_matches_against_index(pathspec, seen);
+   add_pathspec_matches_against_index(pathspec, _index, seen);
return seen;
 }
 
@@ -418,7 +418,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
int i;
 
if (!seen)
-   seen = find_pathspecs_matching_against_index();
+   seen = find_pathspecs_matching_against_index(, 
_index);
 
/*
 * file_exists() assumes exact match
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index 91040a4b0..7629018a5 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -98,7 +98,7 @@ static int check_ignore(struct dir_struct *dir,
 * should not be ignored, in order to be consistent with
 * 'git status', 'git add' etc.
 */
-   seen = find_pathspecs_matching_against_index();
+   seen = find_pathspecs_matching_against_index(, _index);
for (i = 0; i < pathspec.nr; i++) {
full_path = pathspec.items[i].match;
exclude = NULL;
diff --git a/pathspec.c b/pathspec.c
index 1e5df2316..828405021 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -1,3 +1,4 @@
+#define NO_THE_INDEX_COMPATIBILITY_MACROS
 #include "cache.h"
 #include "dir.h"
 #include "pathspec.h"
@@ -17,6 +18,7 @@
  * to use find_pathspecs_matching_against_index() instead.
  */
 void add_pathspec_matches_against_index(const struct pathspec *pathspec,
+   const struct index_state *istate,
char *seen)
 {
int num_unmatched = 0, i;
@@ -32,8 +34,8 @@ void add_pathspec_matches_against_index(const struct pathspec 
*pathspec,
num_unmatched++;
if (!num_unmatched)
return;
-   for (i = 0; i < active_nr; i++) {
-   const struct cache_entry *ce = active_cache[i];
+   for (i = 0; i < istate->cache_nr; i++) {
+   const struct cache_entry *ce = istate->cache[i];
ce_path_match(ce, pathspec, seen);
}
 }
@@ -46,10 +48,11 @@ void add_pathspec_matches_against_index(const struct 
pathspec *pathspec,
  * nature of the "closest" (i.e. most specific) matches which each of the
  * given pathspecs achieves against all items in the index.
  */
-char *find_pathspecs_matching_against_index(const struct pathspec *pathspec)
+char *find_pathspecs_matching_against_index(const struct pathspec *pathspec,
+   const struct index_state *istate)
 {
char *seen = xcalloc(pathspec->nr, 1);
-   add_pathspec_matches_against_index(pathspec, seen);
+   add_pathspec_matches_against_index(pathspec, istate, seen);
return seen;
 }
 
diff --git a/pathspec.h b/pathspec.h
index 6671b9577..60e650040 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -96,7 +96,10 @@ static inline int ps_strcmp(const struct pathspec_item *item,
return strcmp(s1, s2);
 }
 
-extern char *find_pathspecs_matching_against_index(const struct pathspec 
*pathspec);
-extern void add_pathspec_matches_against_index(const struct pathspec 
*pathspec, char *seen);
+extern void add_pathspec_matches_against_index(const struct pathspec *pathspec,
+  const struct index_state *istate,
+  char *seen);
+extern char *find_pathspecs_matching_against_index(const struct pathspec 
*pathspec,
+  const struct index_state 
*istate);
 
 #endif /* PATHSPEC_H */
-- 
2.13.0.rc2.291.g57267f2277-goog



[PATCH v2 1/6] pathspec: provide a more descriptive die message

2017-05-11 Thread Brandon Williams
The current message displayed upon an internal error in
'init_pathspec_item()' isn't very descriptive and doesn't provide much
context to where the error occurred.  Update the error message to
provide more context to where the error occured.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pathspec.c b/pathspec.c
index 50f76fff4..904cfb739 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -555,7 +555,7 @@ static void init_pathspec_item(struct pathspec_item *item, 
unsigned flags,
 * would trigger that.
 */
die_inside_submodule_path(item);
-   die ("BUG: item->nowildcard_len > item->len || item->prefix > 
item->len)");
+   die ("BUG: error initializing pathspec_item");
}
 }
 
-- 
2.13.0.rc2.291.g57267f2277-goog



[PATCH v2 2/6] submodule: add die_in_unpopulated_submodule function

2017-05-11 Thread Brandon Williams
Currently 'git add' is the only command which dies when launched from an
unpopulated submodule (the place-holder directory for a submodule which
hasn't been checked out).  This is triggered implicitly by passing the
PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE flag to 'parse_pathspec()'.

Instead make this desire more explicit by creating a function
'die_in_unpopulated_submodule()' which dies if the provided 'prefix' has
a leading path component which matches a submodule in the the index.

Signed-off-by: Brandon Williams 
---
 builtin/add.c|  3 +++
 pathspec.c   | 29 -
 submodule.c  | 30 ++
 submodule.h  |  2 ++
 t/t6134-pathspec-in-submodule.sh |  6 +-
 5 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 9f53f020d..ec58e3679 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -17,6 +17,7 @@
 #include "revision.h"
 #include "bulk-checkin.h"
 #include "argv-array.h"
+#include "submodule.h"
 
 static const char * const builtin_add_usage[] = {
N_("git add [] [--] ..."),
@@ -379,6 +380,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
if (read_cache() < 0)
die(_("index file corrupt"));
 
+   die_in_unpopulated_submodule(_index, prefix);
+
/*
 * Check the "pathspec '%s' did not match any files" block
 * below before enabling new magic.
diff --git a/pathspec.c b/pathspec.c
index 904cfb739..9b7634c5a 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -424,27 +424,6 @@ static void strip_submodule_slash_expensive(struct 
pathspec_item *item)
}
 }
 
-static void die_inside_submodule_path(struct pathspec_item *item)
-{
-   int i;
-
-   for (i = 0; i < active_nr; i++) {
-   struct cache_entry *ce = active_cache[i];
-   int ce_len = ce_namelen(ce);
-
-   if (!S_ISGITLINK(ce->ce_mode))
-   continue;
-
-   if (item->len < ce_len ||
-   !(item->match[ce_len] == '/' || item->match[ce_len] == 
'\0') ||
-   memcmp(ce->name, item->match, ce_len))
-   continue;
-
-   die(_("Pathspec '%s' is in submodule '%.*s'"),
-   item->original, ce_len, ce->name);
-   }
-}
-
 /*
  * Perform the initialization of a pathspec_item based on a pathspec element.
  */
@@ -547,14 +526,6 @@ static void init_pathspec_item(struct pathspec_item *item, 
unsigned flags,
/* sanity checks, pathspec matchers assume these are sane */
if (item->nowildcard_len > item->len ||
item->prefix > item->len) {
-   /*
-* This case can be triggered by the user pointing us to a
-* pathspec inside a submodule, which is an input error.
-* Detect that here and complain, but fallback in the
-* non-submodule case to a BUG, as we have no idea what
-* would trigger that.
-*/
-   die_inside_submodule_path(item);
die ("BUG: error initializing pathspec_item");
}
 }
diff --git a/submodule.c b/submodule.c
index d3299e29c..885663c42 100644
--- a/submodule.c
+++ b/submodule.c
@@ -282,6 +282,36 @@ int is_submodule_populated_gently(const char *path, int 
*return_error_code)
return ret;
 }
 
+/*
+ * Dies if the provided 'prefix' corresponds to an unpopulated submodule
+ */
+void die_in_unpopulated_submodule(const struct index_state *istate,
+ const char *prefix)
+{
+   int i, prefixlen;
+
+   if (!prefix)
+   return;
+
+   prefixlen = strlen(prefix);
+
+   for (i = 0; i < istate->cache_nr; i++) {
+   struct cache_entry *ce = istate->cache[i];
+   int ce_len = ce_namelen(ce);
+
+   if (!S_ISGITLINK(ce->ce_mode))
+   continue;
+   if (prefixlen <= ce_len)
+   continue;
+   if (strncmp(ce->name, prefix, ce_len))
+   continue;
+   if (prefix[ce_len] != '/')
+   continue;
+
+   die(_("in unpopulated submodule '%s'"), ce->name);
+   }
+}
+
 int parse_submodule_update_strategy(const char *value,
struct submodule_update_strategy *dst)
 {
diff --git a/submodule.h b/submodule.h
index 1277480ad..d11b4da40 100644
--- a/submodule.h
+++ b/submodule.h
@@ -49,6 +49,8 @@ extern int is_submodule_initialized(const char *path);
  * Otherwise the return error code is the same as of resolve_gitdir_gently.
  */
 extern int is_submodule_populated_gently(const char *path, int 
*return_error_code);
+extern void die_in_unpopulated_submodule(const struct index_state *istate,
+const char *prefix);
 extern int 

[PATCH v2 0/6] convert pathspec.c to take an index parameter

2017-05-11 Thread Brandon Williams
The main difference in v2 is that instead of piping through an index_state
struct into parse_pathspec, I ripped out the logic that needed to access the
index and either removed it completely if it wasn't needed anymore (stripping
submodule slash) or factored it out into its own function which can be called
after initializing a pathspec object (dying if a path descends into a
submodule).

Brandon Williams (6):
  pathspec: provide a more descriptive die message
  submodule: add die_in_unpopulated_submodule function
  pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE flag
  ls-files: prevent prune_cache from overeagerly pruning submodules
  pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP
  pathspec: convert find_pathspecs_matching_against_index to take an
index

 builtin/add.c| 12 --
 builtin/check-ignore.c   |  6 ++-
 builtin/ls-files.c   | 31 ---
 builtin/reset.c  |  1 -
 builtin/rm.c |  3 +-
 builtin/submodule--helper.c  |  3 +-
 pathspec.c   | 86 
 pathspec.h   | 25 +---
 submodule.c  | 63 +
 submodule.h  |  4 ++
 t/t6134-pathspec-in-submodule.sh |  6 +--
 11 files changed, 124 insertions(+), 116 deletions(-)

-- 
2.13.0.rc2.291.g57267f2277-goog



[PATCH v2 3/6] pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE flag

2017-05-11 Thread Brandon Williams
Since (ae8d08242 pathspec: pass directory indicator to
match_pathspec_item()) the path matching logic has been able to cope
with submodules without needing to strip off a trailing slash if a path
refers to a submodule.

Since the stripping the trailing slash is no longer necessary, remove
the PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE flag.  In addition, factor
out the logic which dies if a path decends into a submodule so that it
can still be used as a check after a pathspec struct has been
initialized.

Signed-off-by: Brandon Williams 
---
 builtin/add.c  |  5 +++--
 builtin/check-ignore.c |  4 +++-
 pathspec.c | 29 -
 pathspec.h | 14 +++---
 submodule.c| 33 +
 submodule.h|  2 ++
 6 files changed, 44 insertions(+), 43 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index ec58e3679..86770d6af 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -388,10 +388,11 @@ int cmd_add(int argc, const char **argv, const char 
*prefix)
 */
parse_pathspec(, 0,
   PATHSPEC_PREFER_FULL |
-  PATHSPEC_SYMLINK_LEADING_PATH |
-  PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE,
+  PATHSPEC_SYMLINK_LEADING_PATH,
   prefix, argv);
 
+   die_path_inside_submodule(_index, );
+
if (add_new_files) {
int baselen;
 
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index 1d73d3ca3..91040a4b0 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -4,6 +4,7 @@
 #include "quote.h"
 #include "pathspec.h"
 #include "parse-options.h"
+#include "submodule.h"
 
 static int quiet, verbose, stdin_paths, show_non_matching, no_index;
 static const char * const check_ignore_usage[] = {
@@ -87,10 +88,11 @@ static int check_ignore(struct dir_struct *dir,
parse_pathspec(,
   PATHSPEC_ALL_MAGIC & ~PATHSPEC_FROMTOP,
   PATHSPEC_SYMLINK_LEADING_PATH |
-  PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE |
   PATHSPEC_KEEP_ORDER,
   prefix, argv);
 
+   die_path_inside_submodule(_index, );
+
/*
 * look for pathspecs matching entries in the index, since these
 * should not be ignored, in order to be consistent with
diff --git a/pathspec.c b/pathspec.c
index 9b7634c5a..e42431278 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -398,32 +398,6 @@ static void strip_submodule_slash_cheap(struct 
pathspec_item *item)
}
 }
 
-static void strip_submodule_slash_expensive(struct pathspec_item *item)
-{
-   int i;
-
-   for (i = 0; i < active_nr; i++) {
-   struct cache_entry *ce = active_cache[i];
-   int ce_len = ce_namelen(ce);
-
-   if (!S_ISGITLINK(ce->ce_mode))
-   continue;
-
-   if (item->len <= ce_len || item->match[ce_len] != '/' ||
-   memcmp(ce->name, item->match, ce_len))
-   continue;
-
-   if (item->len == ce_len + 1) {
-   /* strip trailing slash */
-   item->len--;
-   item->match[item->len] = '\0';
-   } else {
-   die(_("Pathspec '%s' is in submodule '%.*s'"),
-   item->original, ce_len, ce->name);
-   }
-   }
-}
-
 /*
  * Perform the initialization of a pathspec_item based on a pathspec element.
  */
@@ -499,9 +473,6 @@ static void init_pathspec_item(struct pathspec_item *item, 
unsigned flags,
if (flags & PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP)
strip_submodule_slash_cheap(item);
 
-   if (flags & PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE)
-   strip_submodule_slash_expensive(item);
-
if (magic & PATHSPEC_LITERAL) {
item->nowildcard_len = item->len;
} else {
diff --git a/pathspec.h b/pathspec.h
index 55e976972..3729efa85 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -62,23 +62,15 @@ struct pathspec {
 #define PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP (1<<3)
 /* die if a symlink is part of the given path's directory */
 #define PATHSPEC_SYMLINK_LEADING_PATH (1<<4)
-/*
- * This is like a combination of ..LEADING_PATH and .._SLASH_CHEAP
- * (but not the same): it strips the trailing slash if the given path
- * is a gitlink but also checks and dies if gitlink is part of the
- * leading path (i.e. the given path goes beyond a submodule). It's
- * safer than _SLASH_CHEAP and also more expensive.
- */
-#define PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE (1<<5)
-#define PATHSPEC_PREFIX_ORIGIN (1<<6)
-#define PATHSPEC_KEEP_ORDER (1<<7)
+#define PATHSPEC_PREFIX_ORIGIN (1<<5)
+#define PATHSPEC_KEEP_ORDER (1<<6)
 /*
  * For the callers that just need pure paths from somewhere else, not
  * from command line. Global 

[PATCH v2 4/6] ls-files: prevent prune_cache from overeagerly pruning submodules

2017-05-11 Thread Brandon Williams
Since (ae8d08242 pathspec: pass directory indicator to
match_pathspec_item()) the path matching logic has been able to cope
with submodules without needing to strip off a trailing slash if a path
refers to a submodule.

ls-files is the only caller of 'parse_pathspec()' which relies on the
behavior of the PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag because it
uses the result to construct a common prefix of all provided pathspecs
which is then used to prune the index of all entries which don't have
that prefix.  Since submodules entries in the index don't have a
trailing slash 'prune_cache()' will be overeager and prune a submodule
'sub' if the common prefix is 'sub/'.  To correct this behavior, only
prune entries which don't match up to, but not including, a trailing
slash of the common prefix.

This is in preparation to remove the
PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag in a later patch.

Signed-off-by: Brandon Williams 
---
 builtin/ls-files.c | 31 +--
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index a6c70dbe9..1f3d47844 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -97,7 +97,7 @@ static void show_dir_entry(const char *tag, struct dir_entry 
*ent)
 {
int len = max_prefix_len;
 
-   if (len >= ent->len)
+   if (len > ent->len)
die("git ls-files: internal error - directory entry not 
superset of prefix");
 
if (!dir_path_match(ent, , len, ps_matched))
@@ -238,7 +238,7 @@ static void show_ce_entry(const char *tag, const struct 
cache_entry *ce)
strbuf_addstr(, super_prefix);
strbuf_addstr(, ce->name);
 
-   if (len >= ce_namelen(ce))
+   if (len > ce_namelen(ce))
die("git ls-files: internal error - cache entry not superset of 
prefix");
 
if (recurse_submodules && S_ISGITLINK(ce->ce_mode) &&
@@ -403,6 +403,25 @@ static void prune_cache(const char *prefix, size_t 
prefixlen)
active_nr = last - pos;
 }
 
+static int get_common_prefix_len(const char *common_prefix)
+{
+   int common_prefix_len;
+
+   if (!common_prefix)
+   return 0;
+
+   common_prefix_len = strlen(common_prefix);
+
+   /*
+* If the prefix has a trailing slash, strip it so that submodules wont
+* be pruned from the index.
+*/
+   if (common_prefix[common_prefix_len - 1] == '/')
+   common_prefix_len--;
+
+   return common_prefix_len;
+}
+
 /*
  * Read the tree specified with --with-tree option
  * (typically, HEAD) into stage #1 and then
@@ -624,8 +643,7 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
"--error-unmatch");
 
parse_pathspec(, 0,
-  PATHSPEC_PREFER_CWD |
-  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP,
+  PATHSPEC_PREFER_CWD,
   prefix, argv);
 
/*
@@ -637,7 +655,9 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
max_prefix = NULL;
else
max_prefix = common_prefix();
-   max_prefix_len = max_prefix ? strlen(max_prefix) : 0;
+   max_prefix_len = get_common_prefix_len(max_prefix);
+
+   prune_cache(max_prefix, max_prefix_len);
 
/* Treat unmatching pathspec elements as errors */
if (pathspec.nr && error_unmatch)
@@ -651,7 +671,6 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
  show_killed || show_modified || show_resolve_undo))
show_cached = 1;
 
-   prune_cache(max_prefix, max_prefix_len);
if (with_tree) {
/*
 * Basic sanity check; show-stages and show-unmerged
-- 
2.13.0.rc2.291.g57267f2277-goog



Re: [PATCH v4] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jeff King
On Thu, May 11, 2017 at 02:35:17PM -0700, Jonathan Nieder wrote:

> This structure could be simplified by putting the lazy-initializing
> tip_oids lookup in a separate function.  For example:
> 
>   int tip_oids_contain(struct oidset *tip_oids,
>   struct ref *unmatched, struct ref *newlist,
>   const struct oid *id)
>   {
>   if (oidset_is_empty(tip_oids)) {
>   add_refs_to_oidset(tip_oids, unmatched);
>   add_refs_to_oidset(tip_oids, newlist);
>   }
>   return oidset_contains(tip_oids, id);
>   }

Yeah, I started to write it that way, but in the empty-ref case it will
try to add the empty refs over and over. I guess that's not that big a
deal, though, as we know the noop add is O(1) then. :)

> That way, the caller could be kept simple (eliminating can_append
> and the repeated if).

Yes, shoving it all into the sub-function is a big win for readability.

-Peff


Re: [PATCH] C style: use standard style for "TRANSLATORS" comments

2017-05-11 Thread Brandon Williams
On 05/11, Jonathan Nieder wrote:
> Hi,
> 
> Ævar Arnfjörð Bjarmason wrote:
> 
> > Change all the "TRANSLATORS: [...]" comments in the C code to use the
> > regular Git coding style, and amend the style guide so that the
> > example there uses that style.
> 
> Hooray!
> 
> [...]
> > --- a/Documentation/CodingGuidelines
> > +++ b/Documentation/CodingGuidelines
> > @@ -256,12 +256,12 @@ For C programs:
> >  
> > Note however that a comment that explains a translatable string to
> 
> The "Note however" isn't needed since it's not contradicting
> the previous point any more.  This can be an entirely separate item:
> 
>  - A comment that explains a translatable string to translators
>uses a convention of starting with a magic token "TRANSLATORS: "
>[etc]
> 
> It might even make sense to remove the explanation of TRANSLATORS
> comments from this file altogether, since they're intuitive to use.
> A more common place to want to learn about them is po/README, which
> already explains them.
> 
> [...]
> > --- a/bisect.c
> > +++ b/bisect.c
> > @@ -995,8 +995,10 @@ int bisect_next_all(const char *prefix, int 
> > no_checkout)
> >  
> > steps_msg = xstrfmt(Q_("(roughly %d step)", "(roughly %d steps)",
> >   steps), steps);
> > -   /* TRANSLATORS: the last %s will be replaced with
> > -  "(roughly %d steps)" translation */
> > +   /*
> > +* TRANSLATORS: the last %s will be replaced with "(roughly %d
> > +* steps)" translation.
> > +*/
> 
> Nice.
> 
> [...]
> > +++ b/ref-filter.c
> > @@ -1251,13 +1251,17 @@ char *get_head_description(void)
> > state.branch);
> > else if (state.detached_from) {
> > if (state.detached_at)
> > -   /* TRANSLATORS: make sure this matches
> > -  "HEAD detached at " in wt-status.c */
> > +   /*
> > +* TRANSLATORS: make sure this matches "HEAD
> > +* detached at " in wt-status.c
> > +*/
> 
> optional: could treat "HEAD detached at " as an unbreakable phrase
> for the sake of line-breaking, for easier grepping.
> 
> But what's here is also perfectly fine.
> 
> [...]
> > -   /* TRANSLATORS: make sure this matches
> > -  "HEAD detached from " in wt-status.c */
> > +   /*
> > +* TRANSLATORS: make sure this matches "HEAD
> > +* detached from " in wt-status.c
> > +*/
> 
> Likewise.
> 
> The rest also look good. This is great.

I agree with Jonathan.  I like having everything more uniform :)

-- 
Brandon Williams


Re: [PATCH] C style: use standard style for "TRANSLATORS" comments

2017-05-11 Thread Jonathan Nieder
Hi,

Ævar Arnfjörð Bjarmason wrote:

> Change all the "TRANSLATORS: [...]" comments in the C code to use the
> regular Git coding style, and amend the style guide so that the
> example there uses that style.

Hooray!

[...]
> --- a/Documentation/CodingGuidelines
> +++ b/Documentation/CodingGuidelines
> @@ -256,12 +256,12 @@ For C programs:
>  
> Note however that a comment that explains a translatable string to

The "Note however" isn't needed since it's not contradicting
the previous point any more.  This can be an entirely separate item:

 - A comment that explains a translatable string to translators
   uses a convention of starting with a magic token "TRANSLATORS: "
   [etc]

It might even make sense to remove the explanation of TRANSLATORS
comments from this file altogether, since they're intuitive to use.
A more common place to want to learn about them is po/README, which
already explains them.

[...]
> --- a/bisect.c
> +++ b/bisect.c
> @@ -995,8 +995,10 @@ int bisect_next_all(const char *prefix, int no_checkout)
>  
>   steps_msg = xstrfmt(Q_("(roughly %d step)", "(roughly %d steps)",
> steps), steps);
> - /* TRANSLATORS: the last %s will be replaced with
> -"(roughly %d steps)" translation */
> + /*
> +  * TRANSLATORS: the last %s will be replaced with "(roughly %d
> +  * steps)" translation.
> +  */

Nice.

[...]
> +++ b/ref-filter.c
> @@ -1251,13 +1251,17 @@ char *get_head_description(void)
>   state.branch);
>   else if (state.detached_from) {
>   if (state.detached_at)
> - /* TRANSLATORS: make sure this matches
> -"HEAD detached at " in wt-status.c */
> + /*
> +  * TRANSLATORS: make sure this matches "HEAD
> +  * detached at " in wt-status.c
> +  */

optional: could treat "HEAD detached at " as an unbreakable phrase
for the sake of line-breaking, for easier grepping.

But what's here is also perfectly fine.

[...]
> - /* TRANSLATORS: make sure this matches
> -"HEAD detached from " in wt-status.c */
> + /*
> +  * TRANSLATORS: make sure this matches "HEAD
> +  * detached from " in wt-status.c
> +  */

Likewise.

The rest also look good. This is great.

Thanks,
Jonathan


Re: [PATCH v4] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jonathan Nieder
Hi,

Jonathan Tan wrote:

> Thanks, peff. I've incorporated your suggestions - I don't feel very
> strongly about this, but I guess it's worthwhile to avoid the quadratic
> behavior if we can.
>
> Also incorporated Jonathan Nieder's suggestion about the placement of
> the last line. The relevant function is also factored out (following
> peff's suggestion).

Thanks.  The structure still seems more complicated than it needs to
be.  More details below.

[...]
> +++ b/fetch-pack.c
[...]
> @@ -592,13 +593,22 @@ static void mark_recent_complete_commits(struct 
> fetch_pack_args *args,
>   }
>  }
>  
> +static void add_refs_to_oidset(struct oidset *oids, const struct ref *refs)
> +{
> + for (; refs; refs = refs->next)
> + oidset_insert(oids, >old_oid);

Makes sense.

[...]
>   /* Append unmatched requests to the list */
>   for (i = 0; i < nr_sought; i++) {
>   unsigned char sha1[20];
> + int can_append = 0;
>  
>   ref = sought[i];
>   if (ref->match_status != REF_NOT_MATCHED)
> @@ -649,6 +661,21 @@ static void filter_refs(struct fetch_pack_args *args,
>  
>   if ((allow_unadvertised_object_request &
>   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
> + can_append = 1;
> + } else {
> + if (!tip_oids_initialized) {
> + /*
> +  * Check all refs, including those already
> +  * matched
> +  */
> + add_refs_to_oidset(_oids, unmatched);
> + add_refs_to_oidset(_oids, newlist);
> + tip_oids_initialized = 1;
> + }
> + can_append = oidset_contains(_oids, >old_oid);
> + }
> +
> + if (can_append) {

This structure could be simplified by putting the lazy-initializing
tip_oids lookup in a separate function.  For example:

int tip_oids_contain(struct oidset *tip_oids,
struct ref *unmatched, struct ref *newlist,
const struct oid *id)
{
if (oidset_is_empty(tip_oids)) {
add_refs_to_oidset(tip_oids, unmatched);
add_refs_to_oidset(tip_oids, newlist);
}
return oidset_contains(tip_oids, id);
}

That way, the caller could be kept simple (eliminating can_append
and the repeated if).

Thanks,
Jonathan


Re: What's cooking in git.git (May 2017, #03; Wed, 10)

2017-05-11 Thread Marc Branchaud

On 2017-05-10 01:18 AM, Junio C Hamano wrote:


* mb/diff-default-to-indent-heuristics (2017-05-09) 4 commits
 - add--interactive: drop diff.indentHeuristic handling
 - diff: enable indent heuristic by default
 - diff: have the diff-* builtins configure diff before initializing revisions
 - diff: make the indent heuristic part of diff's basic configuration

 Make the "indent" heuristics the default in "diff" and diff.indentHeuristics
 configuration variable an escape hatch for those who do no want it.


Typo fixes:
s/heuristics/heuristic/  (both places)
s/do no want/do not want/


 Kicked out of next; it seems it is still getting review suggestions?


I believe v4 of this one is ready to cook.

The most salient aspect of the review discussion was about where to go 
after this topic is applied.  We also concluded that the topic deserves 
a release note about breaking patch ID backwards-compatibility.  I think 
such a note should mention rerere, so I would suggest the following:


The diff "indent" heuristic is now enabled by default.  This changes the 
patch IDs calculated by git-patch-id (and used by git-rerere and 
git-cherry), which could affect workflows that rely on 
previously-computed patch IDs.  The heuristic can be disabled by setting 
diff.indentHeuristic to false.


M.



Re: Best "triangle" workflow setup?

2017-05-11 Thread Robert Dailey
On Thu, May 11, 2017 at 3:17 PM, Jeff King  wrote:
> I think you want:
>
>   [push]
>   default = current
>   [remote]
>   pushDefault = myfork
>
> to make "git push" do what you want. And then generally have branches
> mark their counterparts on "origin" (which you can do either at creation
> time, or probably by using "git push -u origin my-topic" when you push
> them).

So without the `pushDefault` setting, `current` will default to a
remote named `origin` if there is no tracking branch set, correct? So
`pushDefault` is effectively overriding this built-in default? In
addition, it seems like since this overrides `branch.name.remote`,
that this effectively makes the remote tracking branch *only* for
`pull`. Is this a correct understanding?

> This is similar to what I do for my git.git workflow, though I usually
> have origin/master as the branch's upstream. I.e., I'd create them with:
>
>   git checkout -b my-topic origin

I'm looking through the `git checkout` and `git branch` documentation,
but I don't see any mention of it being valid to use a remote name as
the  parameter (you're using `origin` in the above
example). Am I misunderstanding? Did you mean origin/my-topic?

> And then rebasing always happens on top of master (because "origin"
> doesn't even have my topic branch at all). If I want to compare with
> what I've pushed to my fork, I'd use "@{push}".

Can you explain more about how your rebase chooses master instead of
your same-named remote tracking branch? Maybe provide some examples of
your rebase command and respective configuration (unless what you've
already provided is sufficient). As for @{push}, I haven't used this
before, so I'll dig in the docs and learn about it.

Thanks for your advice, so far I like this direction but seems like
there is more for me to learn!


[PATCH] C style: use standard style for "TRANSLATORS" comments

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Change all the "TRANSLATORS: [...]" comments in the C code to use the
regular Git coding style, and amend the style guide so that the
example there uses that style.

This custom style was necessary back in 2010 when the gettext support
was initially added, and was subsequently documented in commit
cbcfd4e3ea ("i18n: mention "TRANSLATORS:" marker in
Documentation/CodingGuidelines", 2014-04-18).

GNU xgettext hasn't had the parsing limitation that necessitated this
exception for almost 3 years. Since its 0.19 release on 2014-06-02
it's been able to recognize TRANSLATOR comments in the standard Git
comment syntax[1].

Usually we'd like to keep compatibility with software that's that
young, but in this case literally the only person who needs to be
using a gettext newer than 3 years old is Jiang Xin (the only person
who runs & commits "make pot" results), so I think in this case we can
make an exception.

This xgettext parsing feature was added after a thread on the Git
mailing list[2] which continued on the bug-gettext[3] list, but we
never subsequently changed our style & styleguide, do so.

There are already longstanding changes in git that use the standard
comment style & have their TRANSLATORS comments extracted properly
without getting the literal "*"'s mixed up in the text, as would
happen before xgettext 0.19.

Commit 7ff2683253 ("builtin-am: implement -i/--interactive",
2015-08-04) added one such comment, which in commit df0617bfa7 ("l10n:
git.pot: v2.6.0 round 1 (123 new, 41 removed)", 2015-09-05) got picked
up in the po/git.pot file with the right format, showing that Jiang
already runs a modern xgettext.

The xgettext parser does not handle the sort of non-standard comment
style that I'm amending here in sequencer.c, but that isn't standard
Git comment syntax anyway. With this change to sequencer.c & "make
pot" the comment in the pot file is now correct:

 #. TRANSLATORS: %s will be "revert", "cherry-pick" or
-#. * "rebase -i".
+#. "rebase -i".

1. http://git.savannah.gnu.org/cgit/gettext.git/commit/?id=10af7fe6bd
2. 
<2ce9ec406501d112e032c8208417f8100bed04c6.1397712142.git.worldhello@gmail.com>
   
(https://public-inbox.org/git/2ce9ec406501d112e032c8208417f8100bed04c6.1397712142.git.worldhello@gmail.com/)
3. https://lists.gnu.org/archive/html/bug-gettext/2014-04/msg00016.html

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

On Thu, May 11, 2017 at 10:43 PM, Brandon Williams  wrote:
> On 05/11, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, May 11, 2017 at 10:21 PM, Brandon Williams  wrote:
>> > On 05/11, Ævar Arnfjörð Bjarmason wrote:
>> >> [...]
>> >> +#ifdef NO_PTHREADS
>> >> + else if (num_threads && num_threads != 1) {
>> >> + /* TRANSLATORS: %s is the configuration
>> >> +variable for tweaking threads, currently
>> >> +grep.threads */
>> >
>> > nit: this comment isn't formatted properly:
>> >   /*
>> >* ... comment ...
>> >*/
>>
>> Comments for translators use a different style, see cbcfd4e3ea ("i18n:
>> mention "TRANSLATORS:" marker in Documentation/CodingGuidelines",
>> 2014-04-18). Otherwise the "*" gets interpolated into the string the
>> translators see in their UI.
>>
>
> Ah got it, I wasn't aware of that.

As it turns out this is just something we've been cargo-culting for
years for no reason. Will fix this comment in my v2, but first let's
do this.

 Documentation/CodingGuidelines | 10 +-
 bisect.c   |  6 --
 builtin/blame.c| 15 +--
 builtin/notes.c|  6 --
 builtin/remote.c   |  7 +--
 notes-utils.c  |  7 +--
 parse-options.c|  6 --
 ref-filter.c   | 12 
 sequencer.c|  3 ++-
 9 files changed, 46 insertions(+), 26 deletions(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index a4191aa388..9fd7383819 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -256,12 +256,12 @@ For C programs:
 
Note however that a comment that explains a translatable string to
translators uses a convention of starting with a magic token
-   "TRANSLATORS: " immediately after the opening delimiter, even when
-   it spans multiple lines.  We do not add an asterisk at the beginning
-   of each line, either.  E.g.
+   "TRANSLATORS: ", e.g.
 
-   /* TRANSLATORS: here is a comment that explains the string
-  to be translated, that follows immediately after it */
+   /*
+* TRANSLATORS: here is a comment that explains the string to
+* be translated, that follows immediately after it.
+*/
_("Here is a translatable string explained by the above.");
 
  - Double negation is often harder to understand than no negation
diff --git a/bisect.c b/bisect.c
index 

[PATCH v4] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jonathan Tan
fetch-pack, when fetching a literal SHA-1 from a server that is not
configured with uploadpack.allowtipsha1inwant (or similar), always
returns an error message of the form "Server does not allow request for
unadvertised object %s". However, it is sometimes the case that such
object is advertised. This situation would occur, for example, if a user
or a script was provided a SHA-1 instead of a branch or tag name for
fetching, and wanted to invoke "git fetch" or "git fetch-pack" using
that SHA-1.

Teach fetch-pack to also check the SHA-1s of the refs in the received
ref advertisement if a literal SHA-1 was given by the user.

Helped-by: Jeff King 
Signed-off-by: Jonathan Tan 
---

Thanks, peff. I've incorporated your suggestions - I don't feel very
strongly about this, but I guess it's worthwhile to avoid the quadratic
behavior if we can.

Also incorporated Jonathan Nieder's suggestion about the placement of
the last line. The relevant function is also factored out (following
peff's suggestion).
---
 fetch-pack.c  | 36 +++-
 t/t5500-fetch-pack.sh | 35 +++
 2 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index afb8b0502..1adb1a6c2 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -15,6 +15,7 @@
 #include "version.h"
 #include "prio-queue.h"
 #include "sha1-array.h"
+#include "oidset.h"
 
 static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
@@ -592,13 +593,22 @@ static void mark_recent_complete_commits(struct 
fetch_pack_args *args,
}
 }
 
+static void add_refs_to_oidset(struct oidset *oids, const struct ref *refs)
+{
+   for (; refs; refs = refs->next)
+   oidset_insert(oids, >old_oid);
+}
+
 static void filter_refs(struct fetch_pack_args *args,
struct ref **refs,
struct ref **sought, int nr_sought)
 {
struct ref *newlist = NULL;
struct ref **newtail = 
+   struct ref *unmatched = NULL;
struct ref *ref, *next;
+   struct oidset tip_oids = OIDSET_INIT;
+   int tip_oids_initialized = 0;
int i;
 
i = 0;
@@ -631,13 +641,15 @@ static void filter_refs(struct fetch_pack_args *args,
ref->next = NULL;
newtail = >next;
} else {
-   free(ref);
+   ref->next = unmatched;
+   unmatched = ref;
}
}
 
/* Append unmatched requests to the list */
for (i = 0; i < nr_sought; i++) {
unsigned char sha1[20];
+   int can_append = 0;
 
ref = sought[i];
if (ref->match_status != REF_NOT_MATCHED)
@@ -649,6 +661,21 @@ static void filter_refs(struct fetch_pack_args *args,
 
if ((allow_unadvertised_object_request &
(ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
+   can_append = 1;
+   } else {
+   if (!tip_oids_initialized) {
+   /*
+* Check all refs, including those already
+* matched
+*/
+   add_refs_to_oidset(_oids, unmatched);
+   add_refs_to_oidset(_oids, newlist);
+   tip_oids_initialized = 1;
+   }
+   can_append = oidset_contains(_oids, >old_oid);
+   }
+
+   if (can_append) {
ref->match_status = REF_MATCHED;
*newtail = copy_ref(ref);
newtail = &(*newtail)->next;
@@ -656,6 +683,13 @@ static void filter_refs(struct fetch_pack_args *args,
ref->match_status = REF_UNADVERTISED_NOT_ALLOWED;
}
}
+
+   oidset_clear(_oids);
+   for (ref = unmatched; ref; ref = next) {
+   next = ref->next;
+   free(ref);
+   }
+
*refs = newlist;
 }
 
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index b5865b385..80a1a3239 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -547,6 +547,41 @@ test_expect_success 'fetch-pack can fetch a raw sha1' '
git fetch-pack hidden $(git -C hidden rev-parse refs/hidden/one)
 '
 
+test_expect_success 'fetch-pack can fetch a raw sha1 that is advertised as a 
ref' '
+   rm -rf server client &&
+   git init server &&
+   test_commit -C server 1 &&
+
+   git init client &&
+   git -C client fetch-pack ../server \
+   $(git -C server rev-parse refs/heads/master)
+'
+
+test_expect_success 'fetch-pack can fetch a raw sha1 overlapping a named ref' '
+   rm -rf server client &&
+   git init server &&
+   test_commit -C server 

Re: [PATCH] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jeff King
On Thu, May 11, 2017 at 12:03:54PM -0700, Jonathan Nieder wrote:

> > But even leaving all the refs/pull stuff aside, allowAnySHA1InWant does
> > seem to increase that confusion, and I don't see a way around it short
> > of never sharing objects between repositories at all. So I think at most
> > we'd do allowReachableSHA1InWant.
> 
> I had guessed you didn't want to do allowReachableSHA1InWant for
> performance reasons.  (I haven't checked to what extent we are already
> taking advantage of bitmaps to avoid a slow reachability check.)  If I
> was wrong and allowReachableSHA1InWant is on the table then it is of
> course even better. :)

Performance is definitely a consideration for resolving sha1's via the
website, because those can be any object, and you have to look in all
the trees. Traversing all the commits in linux.git is ~5s of CPU. Doing
the same for the whole object graph is more like ~45s.

Our experiments did involve bitmaps, but there are some corner cases
where we might not have good bitmap coverage of certain refs (I think
this is something that could be improved; the commit selection in the
current bitmap writer is fairly naive).

But I think upload-pack's reachability check only handles commits
anyway. Adding 5s of CPU to a request isn't great, but it's pretty easy
to convince Git to use 5s of CPU already. And this would only kick in
when the client asked for a non-tip ref anyway, and in general I'd
expect requested sha1's to be near the ref tips.

I'll open a discussion internally about enabling it and measuring the
performance.

-Peff


Re: [PATCH v3] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jeff King
On Thu, May 11, 2017 at 10:51:37AM -0700, Jonathan Tan wrote:

> > This is inside the nr_sought loop. So if I were to do:
> > 
> >   git fetch origin $(git ls-remote origin | awk '{print $1}')
> > 
> > we're back to being quadratic. I realize that's probably a silly thing
> > to do, but in the general case, you're O(m*n), where "n" is number of
> > unmatched remote refs and "m" is the number of SHA-1s you're looking
> > for.
> 
> The original patch was quadratic regardless of whether we're fetching names
> or SHA-1s, which can be said to be bad since it is a regression in an
> existing and common use case (and I agree). This one is O(m*n), as you said
> - the "quadratic-ness" only kicks in if you fetch SHA-1s, which before this
> patch is impossible.

Yeah, no question this is an improvement over the original. I think this
still "regresses" a certain request performance-wise, but it's a request
that could never have succeeded in the original. Mostly I'd just rather
not leave a hidden quadratic loop that will blow up in somebody's face
a year or two down the road.

> Having a sha1_array or oidset would require allocation (O(n log n) time, I
> think, in addition to O(n) space) and this cost would be incurred regardless
> of how many SHA-1s were actually fetched (if m is an order of magnitude less
> than log n, for example, having a sha1_array might be actually worse). Also,
> presumably we don't want to incur this cost if we are fetching zero SHA-1s,
> so we would need to do some sort of pre-check. I'm inclined to leave it the
> way it is and consider this only if the use case of fetching many SHA1-s
> comes up.

An oidset should be O(n) in both time and space. Which we're already
spending in the earlier loop (and in general, when we allocate O(n) ref
structs in the first place). I don't think we need to care too much
about micro-optimizing bumps to the constant-factor here; we just need
to make sure we don't end up in an algorithmic explosion.

And if we initialize the oidset lazily, then we incur that cost only
when we would be doing the linear walk in your patch anyway. See the
patch below.

> > AIUI, you could also avoid creating the unmatched list entirely when the
> > server advertises tip/reachable sha1s. That's a small optimization, but
> > I think it may actually make the logic clearer.
> 
> If you mean adding an "if" block at the point where we add the unmatched ref
> to the unmatched list (that either adds it to the list or immediately frees
> it), I think it makes the logic slightly more complicated. Or maybe you had
> something else in mind?

I was just thinking that it does not leave a case where we create a
data structure (the unmatched list) even though we know right then that
it will not ever be used. So it's an extra line of logic there, but the
overall function may be clearer. I dunno, it's probably not a big deal
either way.

-Peff

---
diff --git a/fetch-pack.c b/fetch-pack.c
index 5cace7458..a660331e6 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -15,6 +15,7 @@
 #include "version.h"
 #include "prio-queue.h"
 #include "sha1-array.h"
+#include "oidset.h"
 
 static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
@@ -592,6 +593,12 @@ static void mark_recent_complete_commits(struct 
fetch_pack_args *args,
}
 }
 
+static void add_refs_to_oidset(struct oidset *oids, const struct ref *refs)
+{
+   for (; refs; refs = refs->next)
+   oidset_insert(oids, >old_oid);
+}
+
 static void filter_refs(struct fetch_pack_args *args,
struct ref **refs,
struct ref **sought, int nr_sought)
@@ -600,6 +607,8 @@ static void filter_refs(struct fetch_pack_args *args,
struct ref **newtail = 
struct ref *unmatched = NULL;
struct ref *ref, *next;
+   struct oidset tip_oids = OIDSET_INIT;
+   int tip_oids_initialized = 0;
int i;
 
i = 0;
@@ -654,22 +663,15 @@ static void filter_refs(struct fetch_pack_args *args,
(ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
can_append = 1;
} else {
-   struct ref *u;
-   /* Check all refs, including those already matched */
-   for (u = unmatched; u; u = u->next) {
-   if (!oidcmp(>old_oid, >old_oid)) {
-   can_append = 1;
-   goto can_append;
-   }
-   }
-   for (u = newlist; u; u = u->next) {
-   if (!oidcmp(>old_oid, >old_oid)) {
-   can_append = 1;
-   break;
-   }
+   if (!tip_oids_initialized) {
+   /* Check all refs, including those already 
matched */
+  

Re: [PATCH 18/29] grep: catch a missing enum in switch statement

2017-05-11 Thread Ævar Arnfjörð Bjarmason
On Thu, May 11, 2017 at 10:40 PM, Stefan Beller  wrote:
> On Thu, May 11, 2017 at 1:08 PM, Brandon Williams  wrote:
>> On 05/11, Ævar Arnfjörð Bjarmason wrote:
>>> Add a die(...) to a default case for the switch statement selecting
>>> between grep pattern types under --recurse-submodules.
>>>
>>> Normally this would be caught by -Wswitch, but the grep_pattern_type
>>> type is converted to int by going through parse_options(). Changing
>>> the argument type passed to compile_submodule_options() won't work,
>>> the value will just get coerced.
>>>
>>> Thus catching this at runtime is the least worst option. This won't
>>> ever trigger in practice, but if a new pattern type were to be added
>>> this catches an otherwise silent bug during development.
>>>
>>> See commit 0281e487fd ("grep: optionally recurse into submodules",
>>> 2016-12-16) for the initial addition of this code.
>>>
>>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>>> ---
>>>  builtin/grep.c | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/builtin/grep.c b/builtin/grep.c
>>> index 3ffb5b4e81..1c0adb30f3 100644
>>> --- a/builtin/grep.c
>>> +++ b/builtin/grep.c
>>> @@ -495,6 +495,12 @@ static void compile_submodule_options(const struct 
>>> grep_opt *opt,
>>>   break;
>>>   case GREP_PATTERN_TYPE_UNSPECIFIED:
>>>   break;
>>> + default:
>>> + /*
>>> +  * -Wswitch doesn't catch this due to casting &
>>> +  * -Wswitch-default is too noisy.
>>> +  */
>>> + die("BUG: Added a new grep pattern type without updating 
>>> switch statement");
>
> I am not sure if the comment is of enough value for the used screen
> real estate value.
> People interested in the existence of the default would just use
> blame/log to find out.

Thanks, I was on the fence about it, will remove it (and add the
mention of -Wswitch-default to the commit message).


Re: [PATCH 28/29] grep: given --threads with NO_PTHREADS=YesPlease, warn

2017-05-11 Thread Brandon Williams
On 05/11, Ævar Arnfjörð Bjarmason wrote:
> On Thu, May 11, 2017 at 10:21 PM, Brandon Williams  wrote:
> > On 05/11, Ævar Arnfjörð Bjarmason wrote:
> >> Add a warning about missing thread support when grep.threads or
> >> --threads is set to a non 0 (default) or 1 (no parallelism) value
> >> under NO_PTHREADS=YesPlease.
> >>
> >> This is for consistency with the index-pack & pack-objects commands,
> >> which also take a --threads option & are configurable via
> >> pack.threads, and have long warned about the same under
> >> NO_PTHREADS=YesPlease.
> >>
> >> Signed-off-by: Ævar Arnfjörð Bjarmason 
> >> ---
> >>  builtin/grep.c  | 11 +++
> >>  t/t7810-grep.sh | 18 ++
> >>  2 files changed, 29 insertions(+)
> >>
> >> diff --git a/builtin/grep.c b/builtin/grep.c
> >> index 1c0adb30f3..f4e08dd2b6 100644
> >> --- a/builtin/grep.c
> >> +++ b/builtin/grep.c
> >> @@ -289,6 +289,15 @@ static int grep_cmd_config(const char *var, const 
> >> char *value, void *cb)
> >>   if (num_threads < 0)
> >>   die(_("invalid number of threads specified (%d) for 
> >> %s"),
> >>   num_threads, var);
> >> +#ifdef NO_PTHREADS
> >> + else if (num_threads && num_threads != 1) {
> >> + /* TRANSLATORS: %s is the configuration
> >> +variable for tweaking threads, currently
> >> +grep.threads */
> >
> > nit: this comment isn't formatted properly:
> >   /*
> >* ... comment ...
> >*/
> 
> Comments for translators use a different style, see cbcfd4e3ea ("i18n:
> mention "TRANSLATORS:" marker in Documentation/CodingGuidelines",
> 2014-04-18). Otherwise the "*" gets interpolated into the string the
> translators see in their UI.
> 

Ah got it, I wasn't aware of that.

> >> + warning(_("no threads support, ignoring %s"), var);
> >> + num_threads = 0;
> >> + }
> >> +#endif
> >>   }
> >>
> >>   return st;
> >> @@ -1233,6 +1242,8 @@ int cmd_grep(int argc, const char **argv, const char 
> >> *prefix)
> >>   else if (num_threads < 0)
> >>   die(_("invalid number of threads specified (%d)"), 
> >> num_threads);
> >>  #else
> >> + if (num_threads)
> >> + warning(_("no threads support, ignoring --threads"));
> >>   num_threads = 0;
> >>  #endif
> >>
> >> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> >> index 561709ef6a..f106387820 100755
> >> --- a/t/t7810-grep.sh
> >> +++ b/t/t7810-grep.sh
> >> @@ -791,6 +791,24 @@ do
> >>   "
> >>  done
> >>
> >> +test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'grep --threads=N or 
> >> pack.threads=N warns when no pthreads' '
> >> + git grep --threads=2 Hello hello_world 2>err &&
> >> + grep ^warning: err >warnings &&
> >> + test_line_count = 1 warnings &&
> >> + grep -F "no threads support, ignoring --threads" err &&
> >> + git -c grep.threads=2 grep Hello hello_world 2>err &&
> >> + grep ^warning: err >warnings &&
> >> + test_line_count = 1 warnings &&
> >> + grep -F "no threads support, ignoring grep.threads" err &&
> >> + git -c grep.threads=2 grep --threads=4 Hello hello_world 2>err &&
> >> + grep ^warning: err >warnings &&
> >> + test_line_count = 2 warnings &&
> >> + grep -F "no threads support, ignoring --threads" err &&
> >> + grep -F "no threads support, ignoring grep.threads" err &&
> >> + git -c grep.threads=0 grep --threads=0 Hello hello_world 2>err &&
> >> + test_line_count = 0 err
> >> +'
> >> +
> >
> > Same bit about doing the correct checks on the error strings to account
> > for translation.
> 
> Do you mean why not use test_i18ngrep? The test is guarded by
> C_LOCALE_OUTPUT which does the same thing, the whole thing is testing
> output so no point in doing just parts of it IMO, unlike some other
> tests that just end in "let's compare the output" but actually test
> other stuff.
> 
> I could e.g. test that there's something on stderr under poison, but
> no point in doing so.

Fair enough, and I didn't notice the C_LOCALE_OUTPUT.

> 
> >>  test_expect_success 'grep from a subdirectory to search wider area (1)' '
> >>   mkdir -p s &&
> >>   (
> >> --
> >> 2.11.0
> >>
> >
> > --
> > Brandon Williams

-- 
Brandon Williams


Re: [PATCH 18/29] grep: catch a missing enum in switch statement

2017-05-11 Thread Stefan Beller
On Thu, May 11, 2017 at 1:08 PM, Brandon Williams  wrote:
> On 05/11, Ævar Arnfjörð Bjarmason wrote:
>> Add a die(...) to a default case for the switch statement selecting
>> between grep pattern types under --recurse-submodules.
>>
>> Normally this would be caught by -Wswitch, but the grep_pattern_type
>> type is converted to int by going through parse_options(). Changing
>> the argument type passed to compile_submodule_options() won't work,
>> the value will just get coerced.
>>
>> Thus catching this at runtime is the least worst option. This won't
>> ever trigger in practice, but if a new pattern type were to be added
>> this catches an otherwise silent bug during development.
>>
>> See commit 0281e487fd ("grep: optionally recurse into submodules",
>> 2016-12-16) for the initial addition of this code.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  builtin/grep.c | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/builtin/grep.c b/builtin/grep.c
>> index 3ffb5b4e81..1c0adb30f3 100644
>> --- a/builtin/grep.c
>> +++ b/builtin/grep.c
>> @@ -495,6 +495,12 @@ static void compile_submodule_options(const struct 
>> grep_opt *opt,
>>   break;
>>   case GREP_PATTERN_TYPE_UNSPECIFIED:
>>   break;
>> + default:
>> + /*
>> +  * -Wswitch doesn't catch this due to casting &
>> +  * -Wswitch-default is too noisy.
>> +  */
>> + die("BUG: Added a new grep pattern type without updating 
>> switch statement");

I am not sure if the comment is of enough value for the used screen
real estate value.
People interested in the existence of the default would just use
blame/log to find out.

Thanks,
Stefan


Re: [RFC] send-email: support validate hook

2017-05-11 Thread Brandon Williams
On 05/11, Jonathan Tan wrote:
> Currently, send-email has support for rudimentary e-mail validation.
> Allow the user to add support for more validation by providing a
> sendemail-validate hook.
> 
> Signed-off-by: Jonathan Tan 
> ---
> 
> This is motivated by situations in which we discuss a work in progress
> using Gerrit (which requires Change-Id trailers in patches), and then,
> forgetting to remove the Change-Id trailers, send them to the Git
> mailing list (which does not want such trailers). I can envision such
> functionality being useful in other situations, hence this patch
> submission.
> 
> I'm not very familiar with Perl, and "There Is More Than One Way To Do
> It", so advice on Perl style is appreciated.

I can't attest to the Perl code itself (I prefer to not touch it :D) but
I've tested this and it works for my purposes!

> ---
>  Documentation/git-send-email.txt |  1 +
>  Documentation/githooks.txt   |  8 
>  git-send-email.perl  | 18 +-
>  t/t9001-send-email.sh| 40 
> 
>  4 files changed, 66 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/git-send-email.txt 
> b/Documentation/git-send-email.txt
> index 9d66166f6..bb23b02ca 100644
> --- a/Documentation/git-send-email.txt
> +++ b/Documentation/git-send-email.txt
> @@ -377,6 +377,7 @@ have been specified, in which case default to 'compose'.
>   Currently, validation means the following:
>  +
>  --
> + *   Invoke the sendemail-validate hook if present (see 
> linkgit:githooks[5]).
>   *   Warn of patches that contain lines longer than 998 
> characters; this
>   is due to SMTP limits as described by 
> http://www.ietf.org/rfc/rfc2821.txt.
>  --
> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> index 706091a56..b2514f4d4 100644
> --- a/Documentation/githooks.txt
> +++ b/Documentation/githooks.txt
> @@ -447,6 +447,14 @@ rebase::
>  The commits are guaranteed to be listed in the order that they were
>  processed by rebase.
>  
> +sendemail-validate
> +~~
> +
> +This hook is invoked by 'git send-email'.  It takes a single parameter,
> +the name of the file that holds the e-mail to be sent.  Exiting with a
> +non-zero status causes 'git send-email' to abort before sending any
> +e-mails.
> +
>  
>  GIT
>  ---
> diff --git a/git-send-email.perl b/git-send-email.perl
> index eea0a517f..7de91ca7c 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -27,6 +27,7 @@ use Term::ANSIColor;
>  use File::Temp qw/ tempdir tempfile /;
>  use File::Spec::Functions qw(catfile);
>  use Error qw(:try);
> +use Cwd qw(abs_path cwd);
>  use Git;
>  use Git::I18N;
>  
> @@ -628,9 +629,24 @@ if (@rev_list_opts) {
>  @files = handle_backup_files(@files);
>  
>  if ($validate) {
> + my @hook = ($repo->repo_path().'/hooks/sendemail-validate', '');
> + my $use_hook = -x $hook[0];
> + if ($use_hook) {
> + # The hook needs a correct GIT_DIR.
> + $ENV{"GIT_DIR"} = $repo->repo_path();
> + }
>   foreach my $f (@files) {
>   unless (-p $f) {
> - my $error = validate_patch($f);
> + my $error;
> + if ($use_hook) {
> + $hook[1] = abs_path($f);
> + my $cwd_save = cwd();
> + chdir($repo->wc_path() or $repo->repo_path());
> + $error = "rejected by sendemail-validate hook"
> + unless system(@hook) == 0;
> + chdir($cwd_save);
> + }
> + $error = validate_patch($f) unless $error;
>   $error and die sprintf(__("fatal: %s: %s\nwarning: no 
> patches were sent\n"),
> $f, $error);
>   }
> diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
> index 60a80f60b..f3f238d40 100755
> --- a/t/t9001-send-email.sh
> +++ b/t/t9001-send-email.sh
> @@ -1913,4 +1913,44 @@ test_expect_success $PREREQ 'leading and trailing 
> whitespaces are removed' '
>   test_cmp expected-list actual-list
>  '
>  
> +test_expect_success $PREREQ 'invoke hook' '
> + mkdir -p .git/hooks &&
> +
> + write_script .git/hooks/sendemail-validate <<-\EOF &&
> + # test that we have the correct environment variable, pwd, and
> + # argument
> + case "$GIT_DIR" in
> + *.git)
> + true
> + ;;
> + *)
> + false
> + ;;
> + esac &&
> + test -e 0001-add-master.patch &&
> + grep "add master" "$1"
> + EOF
> +
> + mkdir subdir &&
> + (
> + # 

Re: [PATCH 26/29] pack-objects & index-pack: add test for --threads warning

2017-05-11 Thread Ævar Arnfjörð Bjarmason
On Thu, May 11, 2017 at 10:17 PM, Brandon Williams  wrote:
> On 05/11, Ævar Arnfjörð Bjarmason wrote:
>> Add a test for the warning that's emitted when --threads or
>> pack.threads is provided under NO_PTHREADS=YesPlease. This uses the
>> new PTHREADS prerequisite.
>>
>> The assertion for C_LOCALE_OUTPUT in the latter test is currently
>> redundant, since unlike index-pack the pack-objects warnings aren't
>> i18n'd. However they might be changed to be i18n'd in the future, and
>> there's no harm in future-proofing the test.
>>
>> There's an existing bug in the implementation of pack-objects which
>> this test currently tests for as-is. Details about the bug & the fix
>> are included in a follow-up change.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  t/t5300-pack-object.sh | 34 ++
>>  1 file changed, 34 insertions(+)
>>
>> diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
>> index 43a672c345..1629fa80b0 100755
>> --- a/t/t5300-pack-object.sh
>> +++ b/t/t5300-pack-object.sh
>> @@ -421,6 +421,40 @@ test_expect_success 'index-pack  works in 
>> non-repo' '
>>   test_path_is_file foo.idx
>>  '
>>
>> +test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'index-pack --threads=N or 
>> pack.threads=N warns when no pthreads' '
>> + test_must_fail git index-pack --threads=2 2>err &&
>> + grep ^warning: err >warnings &&
>> + test_line_count = 1 warnings &&
>> + grep -F "no threads support, ignoring --threads=2" err &&
>> + test_must_fail git -c pack.threads=2 index-pack 2>err &&
>> + grep ^warning: err >warnings &&
>> + test_line_count = 1 warnings &&
>> + grep -F "no threads support, ignoring pack.threads" err &&
>> + test_must_fail git -c pack.threads=2 index-pack --threads=4 2>err &&
>> + grep ^warning: err >warnings &&
>> + test_line_count = 2 warnings &&
>> + grep -F "no threads support, ignoring --threads=4" err &&
>> + grep -F "no threads support, ignoring pack.threads" err
>> +'
>> +
>> +test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'pack-objects --threads=N or 
>> pack.threads=N warns when no pthreads' '
>> + git pack-objects --threads=2 --stdout --all /dev/null 
>> 2>err &&
>> + grep ^warning: err >warnings &&
>> + test_line_count = 1 warnings &&
>> + grep -F "no threads support, ignoring --threads" err &&
>> + git -c pack.threads=2 pack-objects --stdout --all > >/dev/null 2>err &&
>> + cat err &&
>> + grep ^warning: err >warnings &&
>> + test_line_count = 2 warnings &&
>> + grep -F "no threads support, ignoring --threads" err &&
>> + grep -F "no threads support, ignoring pack.threads" err &&
>> + git -c pack.threads=2 pack-objects --threads=4 --stdout --all 
>> /dev/null 2>err &&
>> + grep ^warning: err >warnings &&
>> + test_line_count = 2 warnings &&
>> + grep -F "no threads support, ignoring --threads" err &&
>> + grep -F "no threads support, ignoring pack.threads" err
>> +'
>> +
>
> Some of these tests you might want to rewrite using test_i18ncmp to
> ensure that the messages match in other languages.  That is assuming
> this error message is translated (which it should be).

[Mostly for my own notes so I see I covered this]

Covered in a side-thread, the test is guarded by C_LOCALE_OUTPUT which
does the same thing.

>>  #
>>  # WARNING!
>>  #
>> --
>> 2.11.0
>>
>
> --
> Brandon Williams


Re: [PATCH 28/29] grep: given --threads with NO_PTHREADS=YesPlease, warn

2017-05-11 Thread Ævar Arnfjörð Bjarmason
On Thu, May 11, 2017 at 10:21 PM, Brandon Williams  wrote:
> On 05/11, Ævar Arnfjörð Bjarmason wrote:
>> Add a warning about missing thread support when grep.threads or
>> --threads is set to a non 0 (default) or 1 (no parallelism) value
>> under NO_PTHREADS=YesPlease.
>>
>> This is for consistency with the index-pack & pack-objects commands,
>> which also take a --threads option & are configurable via
>> pack.threads, and have long warned about the same under
>> NO_PTHREADS=YesPlease.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  builtin/grep.c  | 11 +++
>>  t/t7810-grep.sh | 18 ++
>>  2 files changed, 29 insertions(+)
>>
>> diff --git a/builtin/grep.c b/builtin/grep.c
>> index 1c0adb30f3..f4e08dd2b6 100644
>> --- a/builtin/grep.c
>> +++ b/builtin/grep.c
>> @@ -289,6 +289,15 @@ static int grep_cmd_config(const char *var, const char 
>> *value, void *cb)
>>   if (num_threads < 0)
>>   die(_("invalid number of threads specified (%d) for 
>> %s"),
>>   num_threads, var);
>> +#ifdef NO_PTHREADS
>> + else if (num_threads && num_threads != 1) {
>> + /* TRANSLATORS: %s is the configuration
>> +variable for tweaking threads, currently
>> +grep.threads */
>
> nit: this comment isn't formatted properly:
>   /*
>* ... comment ...
>*/

Comments for translators use a different style, see cbcfd4e3ea ("i18n:
mention "TRANSLATORS:" marker in Documentation/CodingGuidelines",
2014-04-18). Otherwise the "*" gets interpolated into the string the
translators see in their UI.

>> + warning(_("no threads support, ignoring %s"), var);
>> + num_threads = 0;
>> + }
>> +#endif
>>   }
>>
>>   return st;
>> @@ -1233,6 +1242,8 @@ int cmd_grep(int argc, const char **argv, const char 
>> *prefix)
>>   else if (num_threads < 0)
>>   die(_("invalid number of threads specified (%d)"), 
>> num_threads);
>>  #else
>> + if (num_threads)
>> + warning(_("no threads support, ignoring --threads"));
>>   num_threads = 0;
>>  #endif
>>
>> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
>> index 561709ef6a..f106387820 100755
>> --- a/t/t7810-grep.sh
>> +++ b/t/t7810-grep.sh
>> @@ -791,6 +791,24 @@ do
>>   "
>>  done
>>
>> +test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'grep --threads=N or 
>> pack.threads=N warns when no pthreads' '
>> + git grep --threads=2 Hello hello_world 2>err &&
>> + grep ^warning: err >warnings &&
>> + test_line_count = 1 warnings &&
>> + grep -F "no threads support, ignoring --threads" err &&
>> + git -c grep.threads=2 grep Hello hello_world 2>err &&
>> + grep ^warning: err >warnings &&
>> + test_line_count = 1 warnings &&
>> + grep -F "no threads support, ignoring grep.threads" err &&
>> + git -c grep.threads=2 grep --threads=4 Hello hello_world 2>err &&
>> + grep ^warning: err >warnings &&
>> + test_line_count = 2 warnings &&
>> + grep -F "no threads support, ignoring --threads" err &&
>> + grep -F "no threads support, ignoring grep.threads" err &&
>> + git -c grep.threads=0 grep --threads=0 Hello hello_world 2>err &&
>> + test_line_count = 0 err
>> +'
>> +
>
> Same bit about doing the correct checks on the error strings to account
> for translation.

Do you mean why not use test_i18ngrep? The test is guarded by
C_LOCALE_OUTPUT which does the same thing, the whole thing is testing
output so no point in doing just parts of it IMO, unlike some other
tests that just end in "let's compare the output" but actually test
other stuff.

I could e.g. test that there's something on stderr under poison, but
no point in doing so.

>>  test_expect_success 'grep from a subdirectory to search wider area (1)' '
>>   mkdir -p s &&
>>   (
>> --
>> 2.11.0
>>
>
> --
> Brandon Williams


Re: Possible bug in includeIf / conditional includes on non git initialised directories

2017-05-11 Thread Sebastian Schuberth
On 2017-05-11 20:53, Raphael Stolt wrote:

> I might have stumbled this time over a real bug in includeIf / conditional 
> includes or maybe it's just as intended.
> 1) Given I have a correct configured includeIf and I’m issuing `git config 
> --show-origin --get user.email` against an directory which hasn’t been `git 
> init`ed I get the user.email configured globally.

I don't think that's a bug surprise: The condition in the conditional include 
is "gitdir:". Before running "git init", it simply *is* no gitdir.

-- 
Sebastian Schuberth


Re: [PATCH 21/29] grep: factor test for \0 in grep patterns into a function

2017-05-11 Thread Ævar Arnfjörð Bjarmason
On Thu, May 11, 2017 at 10:15 PM, Brandon Williams  wrote:
> On 05/11, Ævar Arnfjörð Bjarmason wrote:
>> Factor the test for \0 in grep patterns into a function. Since commit
>> 9eceddeec6 ("Use kwset in grep", 2011-08-21) any pattern containing a
>> \0 is considered fixed as regcomp() can't handle it.
>>
>> This limitation was never documented, and other some regular
>> expression engines are capable of compiling a pattern containing a
>> \0. Factoring this out makes a subsequent change which does that
>> smaller.
>>
>> See a previous commit in this series ("grep: add tests to fix blind
>> spots with \0 patterns", 2017-04-21) for further details & rationale.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  grep.c | 19 ---
>>  1 file changed, 12 insertions(+), 7 deletions(-)
>>
>> diff --git a/grep.c b/grep.c
>> index bf6c2494fd..27de615209 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -394,12 +394,6 @@ static int is_fixed(const char *s, size_t len)
>>  {
>>   size_t i;
>>
>> - /* regcomp cannot accept patterns with NULs so we
>> -  * consider any pattern containing a NUL fixed.
>> -  */
>> - if (memchr(s, 0, len))
>> - return 1;
>> -
>>   for (i = 0; i < len; i++) {
>>   if (is_regex_special(s[i]))
>>   return 0;
>> @@ -408,6 +402,17 @@ static int is_fixed(const char *s, size_t len)
>>   return 1;
>>  }
>>
>> +static int has_null(const char *s, size_t len)
>> +{
>> + /* regcomp cannot accept patterns with NULs so when using it
>> +  * we consider any pattern containing a NUL fixed.
>> +  */
>
> I commented on a later patch but really the comment should be fixed
> here.  And why not simply move this to where you intend it to be at the
> end of the series now?

Just losing the forest for the trees in rebasing this giant, willdo in
v2, i.e. just make this a function in the right place in this change.

>> + if (memchr(s, 0, len))
>> + return 1;
>> +
>> + return 0;
>> +}
>> +
>>  static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>>  {
>>   struct strbuf sb = STRBUF_INIT;
>> @@ -451,7 +456,7 @@ static void compile_regexp(struct grep_pat *p, struct 
>> grep_opt *opt)
>>* simple string match using kws.  p->fixed tells us if we
>>* want to use kws.
>>*/
>> - if (opt->fixed || is_fixed(p->pattern, p->patternlen))
>> + if (opt->fixed || has_null(p->pattern, p->patternlen) || 
>> is_fixed(p->pattern, p->patternlen))
>>   p->fixed = !icase || ascii_only;
>>   else
>>   p->fixed = 0;
>> --
>> 2.11.0
>>
>
> --
> Brandon Williams


Re: [PATCH 28/29] grep: given --threads with NO_PTHREADS=YesPlease, warn

2017-05-11 Thread Brandon Williams
On 05/11, Ævar Arnfjörð Bjarmason wrote:
> Add a warning about missing thread support when grep.threads or
> --threads is set to a non 0 (default) or 1 (no parallelism) value
> under NO_PTHREADS=YesPlease.
> 
> This is for consistency with the index-pack & pack-objects commands,
> which also take a --threads option & are configurable via
> pack.threads, and have long warned about the same under
> NO_PTHREADS=YesPlease.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  builtin/grep.c  | 11 +++
>  t/t7810-grep.sh | 18 ++
>  2 files changed, 29 insertions(+)
> 
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 1c0adb30f3..f4e08dd2b6 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -289,6 +289,15 @@ static int grep_cmd_config(const char *var, const char 
> *value, void *cb)
>   if (num_threads < 0)
>   die(_("invalid number of threads specified (%d) for 
> %s"),
>   num_threads, var);
> +#ifdef NO_PTHREADS
> + else if (num_threads && num_threads != 1) {
> + /* TRANSLATORS: %s is the configuration
> +variable for tweaking threads, currently
> +grep.threads */

nit: this comment isn't formatted properly:
  /*
   * ... comment ...
   */

> + warning(_("no threads support, ignoring %s"), var);
> + num_threads = 0;
> + }
> +#endif
>   }
>  
>   return st;
> @@ -1233,6 +1242,8 @@ int cmd_grep(int argc, const char **argv, const char 
> *prefix)
>   else if (num_threads < 0)
>   die(_("invalid number of threads specified (%d)"), num_threads);
>  #else
> + if (num_threads)
> + warning(_("no threads support, ignoring --threads"));
>   num_threads = 0;
>  #endif
>  
> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> index 561709ef6a..f106387820 100755
> --- a/t/t7810-grep.sh
> +++ b/t/t7810-grep.sh
> @@ -791,6 +791,24 @@ do
>   "
>  done
>  
> +test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'grep --threads=N or 
> pack.threads=N warns when no pthreads' '
> + git grep --threads=2 Hello hello_world 2>err &&
> + grep ^warning: err >warnings &&
> + test_line_count = 1 warnings &&
> + grep -F "no threads support, ignoring --threads" err &&
> + git -c grep.threads=2 grep Hello hello_world 2>err &&
> + grep ^warning: err >warnings &&
> + test_line_count = 1 warnings &&
> + grep -F "no threads support, ignoring grep.threads" err &&
> + git -c grep.threads=2 grep --threads=4 Hello hello_world 2>err &&
> + grep ^warning: err >warnings &&
> + test_line_count = 2 warnings &&
> + grep -F "no threads support, ignoring --threads" err &&
> + grep -F "no threads support, ignoring grep.threads" err &&
> + git -c grep.threads=0 grep --threads=0 Hello hello_world 2>err &&
> + test_line_count = 0 err
> +'
> +

Same bit about doing the correct checks on the error strings to account
for translation.

>  test_expect_success 'grep from a subdirectory to search wider area (1)' '
>   mkdir -p s &&
>   (
> -- 
> 2.11.0
> 

-- 
Brandon Williams


Re: [PATCH 24/29] grep: move two functions to avoid forward declaration

2017-05-11 Thread Ævar Arnfjörð Bjarmason
On Thu, May 11, 2017 at 10:14 PM, Brandon Williams  wrote:
> On 05/11, Ævar Arnfjörð Bjarmason wrote:
>> Move the is_fixed() and has_null() functions which are currently only
>> used in compile_regexp() earlier so they can be used in the PCRE
>> family of functions in a later change.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  grep.c | 46 +++---
>>  1 file changed, 23 insertions(+), 23 deletions(-)
>>
>> diff --git a/grep.c b/grep.c
>> index 4507765811..5c808f8971 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -321,6 +321,29 @@ static NORETURN void compile_regexp_failed(const struct 
>> grep_pat *p,
>>   die("%s'%s': %s", where, p->pattern, error);
>>  }
>>
>> +static int is_fixed(const char *s, size_t len)
>> +{
>> + size_t i;
>> +
>> + for (i = 0; i < len; i++) {
>> + if (is_regex_special(s[i]))
>> + return 0;
>> + }
>> +
>> + return 1;
>> +}
>> +
>> +static int has_null(const char *s, size_t len)
>> +{
>> + /* regcomp cannot accept patterns with NULs so when using it
>> +  * we consider any pattern containing a NUL fixed.
>> +  */
>
> Since you're already moving these functions, mind cleaning up the
> comment to conform to our style guide?
Willdo.
>> + if (memchr(s, 0, len))
>> + return 1;
>> +
>> + return 0;
>> +}
>> +
>>  #ifdef USE_LIBPCRE1
>>  static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt 
>> *opt)
>>  {
>> @@ -390,29 +413,6 @@ static void free_pcre1_regexp(struct grep_pat *p)
>>  }
>>  #endif /* !USE_LIBPCRE1 */
>>
>> -static int is_fixed(const char *s, size_t len)
>> -{
>> - size_t i;
>> -
>> - for (i = 0; i < len; i++) {
>> - if (is_regex_special(s[i]))
>> - return 0;
>> - }
>> -
>> - return 1;
>> -}
>> -
>> -static int has_null(const char *s, size_t len)
>> -{
>> - /* regcomp cannot accept patterns with NULs so when using it
>> -  * we consider any pattern containing a NUL fixed.
>> -  */
>> - if (memchr(s, 0, len))
>> - return 1;
>> -
>> - return 0;
>> -}
>> -
>>  static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>>  {
>>   struct strbuf sb = STRBUF_INIT;
>> --
>> 2.11.0
>>
>
> --
> Brandon Williams


Re: [PATCH 26/29] pack-objects & index-pack: add test for --threads warning

2017-05-11 Thread Brandon Williams
On 05/11, Ævar Arnfjörð Bjarmason wrote:
> Add a test for the warning that's emitted when --threads or
> pack.threads is provided under NO_PTHREADS=YesPlease. This uses the
> new PTHREADS prerequisite.
> 
> The assertion for C_LOCALE_OUTPUT in the latter test is currently
> redundant, since unlike index-pack the pack-objects warnings aren't
> i18n'd. However they might be changed to be i18n'd in the future, and
> there's no harm in future-proofing the test.
> 
> There's an existing bug in the implementation of pack-objects which
> this test currently tests for as-is. Details about the bug & the fix
> are included in a follow-up change.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  t/t5300-pack-object.sh | 34 ++
>  1 file changed, 34 insertions(+)
> 
> diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
> index 43a672c345..1629fa80b0 100755
> --- a/t/t5300-pack-object.sh
> +++ b/t/t5300-pack-object.sh
> @@ -421,6 +421,40 @@ test_expect_success 'index-pack  works in 
> non-repo' '
>   test_path_is_file foo.idx
>  '
>  
> +test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'index-pack --threads=N or 
> pack.threads=N warns when no pthreads' '
> + test_must_fail git index-pack --threads=2 2>err &&
> + grep ^warning: err >warnings &&
> + test_line_count = 1 warnings &&
> + grep -F "no threads support, ignoring --threads=2" err &&
> + test_must_fail git -c pack.threads=2 index-pack 2>err &&
> + grep ^warning: err >warnings &&
> + test_line_count = 1 warnings &&
> + grep -F "no threads support, ignoring pack.threads" err &&
> + test_must_fail git -c pack.threads=2 index-pack --threads=4 2>err &&
> + grep ^warning: err >warnings &&
> + test_line_count = 2 warnings &&
> + grep -F "no threads support, ignoring --threads=4" err &&
> + grep -F "no threads support, ignoring pack.threads" err
> +'
> +
> +test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'pack-objects --threads=N or 
> pack.threads=N warns when no pthreads' '
> + git pack-objects --threads=2 --stdout --all /dev/null 2>err 
> &&
> + grep ^warning: err >warnings &&
> + test_line_count = 1 warnings &&
> + grep -F "no threads support, ignoring --threads" err &&
> + git -c pack.threads=2 pack-objects --stdout --all /dev/null 
> 2>err &&
> + cat err &&
> + grep ^warning: err >warnings &&
> + test_line_count = 2 warnings &&
> + grep -F "no threads support, ignoring --threads" err &&
> + grep -F "no threads support, ignoring pack.threads" err &&
> + git -c pack.threads=2 pack-objects --threads=4 --stdout --all 
> /dev/null 2>err &&
> + grep ^warning: err >warnings &&
> + test_line_count = 2 warnings &&
> + grep -F "no threads support, ignoring --threads" err &&
> + grep -F "no threads support, ignoring pack.threads" err
> +'
> +

Some of these tests you might want to rewrite using test_i18ncmp to
ensure that the messages match in other languages.  That is assuming
this error message is translated (which it should be).

>  #
>  # WARNING!
>  #
> -- 
> 2.11.0
> 

-- 
Brandon Williams


Re: Best "triangle" workflow setup?

2017-05-11 Thread Jeff King
On Thu, May 11, 2017 at 02:41:41PM -0500, Robert Dailey wrote:

> What I want (as a default) is for `git pull` to pull from the
> same-named branch on the upstream repository, but for `git push` to
> push to the same-named branch on the fork repository. However to
> override this behavior for when I want to push directly to upstream
> repo, I should be able to use an explicit `git push origin my-topic`
> (but `git push` by default will act as `git push fork my-topic`).

I think you want:

  [push]
  default = current
  [remote]
  pushDefault = myfork

to make "git push" do what you want. And then generally have branches
mark their counterparts on "origin" (which you can do either at creation
time, or probably by using "git push -u origin my-topic" when you push
them).

This is similar to what I do for my git.git workflow, though I usually
have origin/master as the branch's upstream. I.e., I'd create them with:

  git checkout -b my-topic origin

And then rebasing always happens on top of master (because "origin"
doesn't even have my topic branch at all). If I want to compare with
what I've pushed to my fork, I'd use "@{push}".

-Peff


Re: [PATCH 21/29] grep: factor test for \0 in grep patterns into a function

2017-05-11 Thread Brandon Williams
On 05/11, Ævar Arnfjörð Bjarmason wrote:
> Factor the test for \0 in grep patterns into a function. Since commit
> 9eceddeec6 ("Use kwset in grep", 2011-08-21) any pattern containing a
> \0 is considered fixed as regcomp() can't handle it.
> 
> This limitation was never documented, and other some regular
> expression engines are capable of compiling a pattern containing a
> \0. Factoring this out makes a subsequent change which does that
> smaller.
> 
> See a previous commit in this series ("grep: add tests to fix blind
> spots with \0 patterns", 2017-04-21) for further details & rationale.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  grep.c | 19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/grep.c b/grep.c
> index bf6c2494fd..27de615209 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -394,12 +394,6 @@ static int is_fixed(const char *s, size_t len)
>  {
>   size_t i;
>  
> - /* regcomp cannot accept patterns with NULs so we
> -  * consider any pattern containing a NUL fixed.
> -  */
> - if (memchr(s, 0, len))
> - return 1;
> -
>   for (i = 0; i < len; i++) {
>   if (is_regex_special(s[i]))
>   return 0;
> @@ -408,6 +402,17 @@ static int is_fixed(const char *s, size_t len)
>   return 1;
>  }
>  
> +static int has_null(const char *s, size_t len)
> +{
> + /* regcomp cannot accept patterns with NULs so when using it
> +  * we consider any pattern containing a NUL fixed.
> +  */

I commented on a later patch but really the comment should be fixed
here.  And why not simply move this to where you intend it to be at the
end of the series now?

> + if (memchr(s, 0, len))
> + return 1;
> +
> + return 0;
> +}
> +
>  static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>  {
>   struct strbuf sb = STRBUF_INIT;
> @@ -451,7 +456,7 @@ static void compile_regexp(struct grep_pat *p, struct 
> grep_opt *opt)
>* simple string match using kws.  p->fixed tells us if we
>* want to use kws.
>*/
> - if (opt->fixed || is_fixed(p->pattern, p->patternlen))
> + if (opt->fixed || has_null(p->pattern, p->patternlen) || 
> is_fixed(p->pattern, p->patternlen))
>   p->fixed = !icase || ascii_only;
>   else
>   p->fixed = 0;
> -- 
> 2.11.0
> 

-- 
Brandon Williams


Re: [PATCH 24/29] grep: move two functions to avoid forward declaration

2017-05-11 Thread Brandon Williams
On 05/11, Ævar Arnfjörð Bjarmason wrote:
> Move the is_fixed() and has_null() functions which are currently only
> used in compile_regexp() earlier so they can be used in the PCRE
> family of functions in a later change.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  grep.c | 46 +++---
>  1 file changed, 23 insertions(+), 23 deletions(-)
> 
> diff --git a/grep.c b/grep.c
> index 4507765811..5c808f8971 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -321,6 +321,29 @@ static NORETURN void compile_regexp_failed(const struct 
> grep_pat *p,
>   die("%s'%s': %s", where, p->pattern, error);
>  }
>  
> +static int is_fixed(const char *s, size_t len)
> +{
> + size_t i;
> +
> + for (i = 0; i < len; i++) {
> + if (is_regex_special(s[i]))
> + return 0;
> + }
> +
> + return 1;
> +}
> +
> +static int has_null(const char *s, size_t len)
> +{
> + /* regcomp cannot accept patterns with NULs so when using it
> +  * we consider any pattern containing a NUL fixed.
> +  */

Since you're already moving these functions, mind cleaning up the
comment to conform to our style guide?

> + if (memchr(s, 0, len))
> + return 1;
> +
> + return 0;
> +}
> +
>  #ifdef USE_LIBPCRE1
>  static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt 
> *opt)
>  {
> @@ -390,29 +413,6 @@ static void free_pcre1_regexp(struct grep_pat *p)
>  }
>  #endif /* !USE_LIBPCRE1 */
>  
> -static int is_fixed(const char *s, size_t len)
> -{
> - size_t i;
> -
> - for (i = 0; i < len; i++) {
> - if (is_regex_special(s[i]))
> - return 0;
> - }
> -
> - return 1;
> -}
> -
> -static int has_null(const char *s, size_t len)
> -{
> - /* regcomp cannot accept patterns with NULs so when using it
> -  * we consider any pattern containing a NUL fixed.
> -  */
> - if (memchr(s, 0, len))
> - return 1;
> -
> - return 0;
> -}
> -
>  static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>  {
>   struct strbuf sb = STRBUF_INIT;
> -- 
> 2.11.0
> 

-- 
Brandon Williams


Re: [PATCH 22/29] grep: change the internal PCRE macro names to be PCRE1

2017-05-11 Thread Brandon Williams
On 05/11, Ævar Arnfjörð Bjarmason wrote:
> Change the internal USE_LIBPCRE define, & build options flag to use a
> naming convention ending in PCRE1, without changing the long-standing
> USE_LIBPCRE Makefile flag which enables this code.
> 
> This is for preparation for libpcre2 support where having things like
> USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely
> need to for backwards compatibility with old Makefile arguments would
> be confusing.
> 
> In some ways it would be better to change everything that now uses
> USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying
> USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time
> burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their
> build scripts.
> 
> However I'd like to leave the door open to making
> USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease,
> i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it
> the default.

I know this isn't what you initially wanted but its probably the safest
option.  Thanks.

-- 
Brandon Williams


Re: [PATCH 18/29] grep: catch a missing enum in switch statement

2017-05-11 Thread Brandon Williams
On 05/11, Ævar Arnfjörð Bjarmason wrote:
> Add a die(...) to a default case for the switch statement selecting
> between grep pattern types under --recurse-submodules.
> 
> Normally this would be caught by -Wswitch, but the grep_pattern_type
> type is converted to int by going through parse_options(). Changing
> the argument type passed to compile_submodule_options() won't work,
> the value will just get coerced.
> 
> Thus catching this at runtime is the least worst option. This won't
> ever trigger in practice, but if a new pattern type were to be added
> this catches an otherwise silent bug during development.
> 
> See commit 0281e487fd ("grep: optionally recurse into submodules",
> 2016-12-16) for the initial addition of this code.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  builtin/grep.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 3ffb5b4e81..1c0adb30f3 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -495,6 +495,12 @@ static void compile_submodule_options(const struct 
> grep_opt *opt,
>   break;
>   case GREP_PATTERN_TYPE_UNSPECIFIED:
>   break;
> + default:
> + /*
> +  * -Wswitch doesn't catch this due to casting &
> +  * -Wswitch-default is too noisy.
> +  */
> + die("BUG: Added a new grep pattern type without updating switch 
> statement");
>   }

Thanks for adding this, as I got it wrong while developing this bit of
code.

>  
>   for (pattern = opt->pattern_list; pattern != NULL;
> -- 
> 2.11.0
> 

-- 
Brandon Williams


Re: [PATCH 07/29] grep: change non-ASCII -i test to stop using --debug

2017-05-11 Thread Brandon Williams
On 05/11, Ævar Arnfjörð Bjarmason wrote:
> Thanks a lot for looking this giant deluge of patches over.

No problem, I'm no expert in some of the areas you're touching but
at least I can catch little things like this :)

-- 
Brandon Williams


Re: [PATCH] pull: optionally rebase submodules

2017-05-11 Thread Philip Oakley

From: "Brandon Williams" 

Teach pull to optionally update submodules when '--recurse-submodules'
is provided.  This will teach pull to run 'submodule update --rebase'
when the '--recurse-submodules' and '--rebase' flags are given.

Signed-off-by: Brandon Williams 
---

Pull is already a shortcut for running fetch followed by a merge/rebase, 
so why

not have it be a shortcut for running 'submodule update --rebase' when the
recurse flag is given!

builtin/pull.c| 30 ++-
t/t5572-pull-submodule.sh | 97 
+++


Shouldn't this also touch the documentation to say that this has been 
taught?

--
Philip



2 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/builtin/pull.c b/builtin/pull.c
index dd1a4a94e..d73d654e6 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -77,6 +77,7 @@ static const char * const pull_usage[] = {
/* Shared options */
static int opt_verbosity;
static char *opt_progress;
+static int recurse_submodules;

/* Options passed to git-merge or git-rebase */
static enum rebase_type opt_rebase = -1;
@@ -532,6 +533,17 @@ static int pull_into_void(const struct object_id 
*merge_head,

 return 0;
}

+static int  update_submodules(void)
+{
+ struct child_process cp = CHILD_PROCESS_INIT;
+ cp.git_cmd = 1;
+
+ argv_array_pushl(, "submodule", "update", "--recursive", NULL);
+ argv_array_push(, "--rebase");
+
+ return run_command();
+}
+
/**
 * Runs git-merge, returning its exit status.
 */
@@ -816,6 +828,14 @@ int cmd_pull(int argc, const char **argv, const char 
*prefix)

 oidclr(_fork_point);
 }

+ if (opt_recurse_submodules &&
+ !strcmp(opt_recurse_submodules, "--recurse-submodules")) {
+ recurse_submodules = 1;
+
+ if (!opt_rebase)
+ die(_("--recurse-submodules is only valid with --rebase"));
+ }
+
 if (run_fetch(repo, refspecs))
 return 1;

@@ -862,6 +882,7 @@ int cmd_pull(int argc, const char **argv, const char 
*prefix)

 die(_("Cannot rebase onto multiple branches."));

 if (opt_rebase) {
+ int ret = 0;
 struct commit_list *list = NULL;
 struct commit *merge_head, *head;

@@ -871,9 +892,14 @@ int cmd_pull(int argc, const char **argv, const char 
*prefix)

 if (is_descendant_of(merge_head, list)) {
 /* we can fast-forward this without invoking rebase */
 opt_ff = "--ff-only";
- return run_merge();
+ ret = run_merge();
 }
- return run_rebase(_head, merge_heads.oid, _fork_point);
+ ret = run_rebase(_head, merge_heads.oid, _fork_point);
+
+ if (!ret && recurse_submodules)
+ ret = update_submodules();
+
+ return ret;
 } else {
 return run_merge();
 }
diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
index accfa5cc0..234a22315 100755
--- a/t/t5572-pull-submodule.sh
+++ b/t/t5572-pull-submodule.sh
@@ -42,4 +42,101 @@ 
KNOWN_FAILURE_NOFF_MERGE_DOESNT_CREATE_EMPTY_SUBMODULE_DIR=1

KNOWN_FAILURE_NOFF_MERGE_ATTEMPTS_TO_MERGE_REMOVED_SUBMODULE_FILES=1
test_submodule_switch "git_pull_noff"

+test_expect_success 'pull --recurse-submodule setup' '
+ git init child &&
+ (
+ cd child &&
+ echo "bar" >file &&
+ git add file &&
+ git commit -m "initial commit"
+ ) &&
+ git init parent &&
+ (
+ cd parent &&
+ echo "foo" >file &&
+ git add file &&
+ git commit -m "Initial commit" &&
+ git submodule add ../child sub &&
+ git commit -m "add submodule"
+ ) &&
+ git clone --recurse-submodule parent super &&
+ git -C super/sub checkout master
+'
+
+test_expect_success 'pull recursive fails without --rebase' '
+ test_must_fail git -C super pull --recurse-submodules 2>actual &&
+ test_i18ngrep "recurse-submodules is only valid with --rebase" actual
+'
+
+test_expect_success 'pull basic recurse' '
+ (
+ cd child &&
+ echo "foobar" >>file &&
+ git add file &&
+ git commit -m "update file"
+ ) &&
+ (
+ cd parent &&
+ git -C sub pull &&
+ git add sub &&
+ git commit -m "update submodule"
+ ) &&
+
+ git -C parent rev-parse master >expect_super &&
+ git -C child rev-parse master >expect_sub &&
+
+ git -C super pull --rebase --recurse-submodules &&
+ git -C super rev-parse master >actual_super &&
+ git -C super/sub rev-parse master >actual_sub &&
+ test_cmp expect_super actual_super &&
+ test_cmp expect_sub actual_sub
+'
+
+test_expect_success 'pull basic rebase recurse' '
+ (
+ cd child &&
+ echo "a" >file &&
+ git add file &&
+ git commit -m "update file"
+ ) &&
+ (
+ cd parent &&
+ git -C sub pull &&
+ git add sub &&
+ git commit -m "update submodule"
+ ) &&
+ (
+ cd super/sub &&
+ echo "b" >file2 &&
+ git add file2 &&
+ git commit -m "add file2"
+ ) &&
+
+ git -C parent rev-parse master >expect_super &&
+ git -C child rev-parse master >expect_sub &&
+
+ git -C super pull --rebase --recurse-submodules &&
+ git -C super rev-parse master >actual_super &&
+ git -C super/sub rev-parse master^ >actual_sub &&
+ test_cmp expect_super actual_super &&
+ test_cmp expect_sub actual_sub &&
+
+ echo "a" >expect &&
+ test_cmp expect super/sub/file &&
+ echo "b" >expect &&
+ test_cmp expect 

Best "triangle" workflow setup?

2017-05-11 Thread Robert Dailey
I know Git has evolved to support the "triangle" workflow model in
different ways, with the goal of making it better. However because
there are so many different options from separate push URLs for
remotes to various ways to manage tracking branches, it's not clear to
me which specific configuration suits this workflow the best.

So my situation is that I have 3 repositories: The original upstream
repository, a fork of that repository (also remote), and my local
clone of the upstream repository.

What I want (as a default) is for `git pull` to pull from the
same-named branch on the upstream repository, but for `git push` to
push to the same-named branch on the fork repository. However to
override this behavior for when I want to push directly to upstream
repo, I should be able to use an explicit `git push origin my-topic`
(but `git push` by default will act as `git push fork my-topic`).

What is the best way to achieve this? Is there a different workflow
from what I'm imagining that works a little better (in other words, I
don't need it to work *exactly* as I've described, mainly I just want
to avoid accidentally pushing changes to the upstream repo in the
default case when I want to push to the fork instead for pull
request)?

Thanks in advance for any advice.


[RFC] send-email: support validate hook

2017-05-11 Thread Jonathan Tan
Currently, send-email has support for rudimentary e-mail validation.
Allow the user to add support for more validation by providing a
sendemail-validate hook.

Signed-off-by: Jonathan Tan 
---

This is motivated by situations in which we discuss a work in progress
using Gerrit (which requires Change-Id trailers in patches), and then,
forgetting to remove the Change-Id trailers, send them to the Git
mailing list (which does not want such trailers). I can envision such
functionality being useful in other situations, hence this patch
submission.

I'm not very familiar with Perl, and "There Is More Than One Way To Do
It", so advice on Perl style is appreciated.
---
 Documentation/git-send-email.txt |  1 +
 Documentation/githooks.txt   |  8 
 git-send-email.perl  | 18 +-
 t/t9001-send-email.sh| 40 
 4 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 9d66166f6..bb23b02ca 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -377,6 +377,7 @@ have been specified, in which case default to 'compose'.
Currently, validation means the following:
 +
 --
+   *   Invoke the sendemail-validate hook if present (see 
linkgit:githooks[5]).
*   Warn of patches that contain lines longer than 998 
characters; this
is due to SMTP limits as described by 
http://www.ietf.org/rfc/rfc2821.txt.
 --
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 706091a56..b2514f4d4 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -447,6 +447,14 @@ rebase::
 The commits are guaranteed to be listed in the order that they were
 processed by rebase.
 
+sendemail-validate
+~~
+
+This hook is invoked by 'git send-email'.  It takes a single parameter,
+the name of the file that holds the e-mail to be sent.  Exiting with a
+non-zero status causes 'git send-email' to abort before sending any
+e-mails.
+
 
 GIT
 ---
diff --git a/git-send-email.perl b/git-send-email.perl
index eea0a517f..7de91ca7c 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -27,6 +27,7 @@ use Term::ANSIColor;
 use File::Temp qw/ tempdir tempfile /;
 use File::Spec::Functions qw(catfile);
 use Error qw(:try);
+use Cwd qw(abs_path cwd);
 use Git;
 use Git::I18N;
 
@@ -628,9 +629,24 @@ if (@rev_list_opts) {
 @files = handle_backup_files(@files);
 
 if ($validate) {
+   my @hook = ($repo->repo_path().'/hooks/sendemail-validate', '');
+   my $use_hook = -x $hook[0];
+   if ($use_hook) {
+   # The hook needs a correct GIT_DIR.
+   $ENV{"GIT_DIR"} = $repo->repo_path();
+   }
foreach my $f (@files) {
unless (-p $f) {
-   my $error = validate_patch($f);
+   my $error;
+   if ($use_hook) {
+   $hook[1] = abs_path($f);
+   my $cwd_save = cwd();
+   chdir($repo->wc_path() or $repo->repo_path());
+   $error = "rejected by sendemail-validate hook"
+   unless system(@hook) == 0;
+   chdir($cwd_save);
+   }
+   $error = validate_patch($f) unless $error;
$error and die sprintf(__("fatal: %s: %s\nwarning: no 
patches were sent\n"),
  $f, $error);
}
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 60a80f60b..f3f238d40 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -1913,4 +1913,44 @@ test_expect_success $PREREQ 'leading and trailing 
whitespaces are removed' '
test_cmp expected-list actual-list
 '
 
+test_expect_success $PREREQ 'invoke hook' '
+   mkdir -p .git/hooks &&
+
+   write_script .git/hooks/sendemail-validate <<-\EOF &&
+   # test that we have the correct environment variable, pwd, and
+   # argument
+   case "$GIT_DIR" in
+   *.git)
+   true
+   ;;
+   *)
+   false
+   ;;
+   esac &&
+   test -e 0001-add-master.patch &&
+   grep "add master" "$1"
+   EOF
+
+   mkdir subdir &&
+   (
+   # Test that it works even if we are not at the root of the
+   # working tree
+   cd subdir &&
+   git send-email \
+   --from="Example " \
+   --to=nob...@example.com \
+   

Re: [PATCH 08/29] grep: add tests for --threads=N and grep.threads

2017-05-11 Thread Ævar Arnfjörð Bjarmason
On Thu, May 11, 2017 at 8:36 PM, Brandon Williams  wrote:
> On 05/11, Ævar Arnfjörð Bjarmason wrote:
>> Add tests for --threads=N being supplied on the command-line, or when
>> grep.threads=N being supplied in the configuration.
>>
>> When the threading support was made run-time configurable in commit
>> 89f09dd34e ("grep: add --threads= option and grep.threads
>> configuration", 2015-12-15) no tests were added for it.
>>
>> In developing a change to the grep code I was able to make
>> '--threads=1 ` segfault, while the test suite still passed. This
>> change fixes that blind spot in the tests.
>>
>> In addition to asserting that asking for N threads shouldn't segfault,
>> test that the grep output given any N is the same.
>>
>> The choice to test only 1..10 as opposed to 1..8 or 1..16 or whatever
>> is arbitrary. Testing 1..1024 works locally for me (but gets
>> noticeably slower as more threads are spawned). Given the structure of
>> the code there's no reason to test an arbitrary number of threads,
>> only 0, 1 and >=2 are special modes of operation.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  t/t7810-grep.sh | 16 
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
>> index daa906b9b0..561709ef6a 100755
>> --- a/t/t7810-grep.sh
>> +++ b/t/t7810-grep.sh
>> @@ -775,6 +775,22 @@ test_expect_success 'grep -W with userdiff' '
>>   test_cmp expected actual
>>  '
>>
>> +for threads in $(test_seq 0 10)
>> +do
>> + test_expect_success "grep --threads=$threads & -c 
>> grep.threads=$threads" "
>> + git grep --threads=$threads . >actual.$threads &&
>> + if test $threads -ge 1
>> + then
>> + test_cmp actual.\$(($threads - 1)) actual.$threads
>> + fi &&
>> + git -c grep.threads=$threads grep . >actual.$threads &&
>> + if test $threads -ge 1
>> + then
>> + test_cmp actual.\$(($threads - 1)) actual.$threads
>> + fi
>> + "
>> +done
>> +
>
> Is there a test condition to require PTHREADS?  Otherwise this might
> break if git is compiled with NO_PTHREADS.

Good catch, this test works and I'll leave it like it is in a v2, but
explain it better in the commit message.

We just ignore --threads= with NO_PTHREADS, however later in this
series I introduce a warning for --threads when no threads are
supported, see "grep: given --threads with NO_PTHREADS=YesPlease,
warn".

But --threads=N still works, so this as a side-benefit tests that
--threads=N still works with NO_PTHREADS, and tests don't fail if we
spew to stderr.

The commit that adds the warning then tests for --threads getting a
warning with NO_THREADS=Y.


Re: [PATCH v4 2/4] diff: have the diff-* builtins configure diff before initializing revisions

2017-05-11 Thread Marc Branchaud

On 2017-05-08 11:22 PM, Jeff King wrote:

On Mon, May 08, 2017 at 12:03:37PM -0400, Marc Branchaud wrote:


This matches how the diff Porcelain works.  It makes the plumbing commands
respect diff's configuration options, such as indentHeuristic, because
init_revisions() calls diff_setup() which fills in the diff_options struct.


I don't know if you want to note here that this is only _some_ options.
I.e., ones that we handle by copying via diff_setup(). Maybe it's
obvious from the description already (it's hard for me to tell because I
already know it either way :) ).


(shrug)  I'm fine with the way it is, but I'd also be OK with "respect 
some of diff's configuration options".


Junio, please feel free to reword the message if you like.  Or I can 
send out a v5, if that's easier for you.


M.



Re: [PATCH 11/11] PREVIEW: remove support for .git/remotes/ and .git/branches/

2017-05-11 Thread Johannes Schindelin
Hi Stefan,

On Thu, 11 May 2017, Stefan Beller wrote:

> On Thu, May 11, 2017 at 6:48 AM, Johannes Schindelin
>  wrote:
> > At long last, after a cycle or three of warning users who *still* use
> > the ancient feature of .git/remotes/ and .git/branches/, it is time to
> > retire the code.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> 
> The PREVIEW patches also look good to me, though I just
> skimmed them and other may chime in on the timing of them.

Thanks for the review!

I agree that the PREVIEW part is not quite as important as the first half.
I only bothered with it because I did want to make sure that the first
half is correct, that it does cover all the places we will need to delete
in the end.

In other words: the PREVIEW part should become its own patch series, that
I hope Junio will carry in pu (or next) for a while, similar to the
changes that removed the historical wart of the funny argument order in
`git merge`.

Ciao,
Dscho


Re: [PATCH v4 0/2] perf: show that wildmatch() regressed for pathological cases in v2.0

2017-05-11 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:

> Ævar Arnfjörð Bjarmason (2):
>   perf: add function to setup a fresh test repo
>   perf: add test showing exponential growth in path globbing
> 
>  t/perf/README|  1 +
>  t/perf/p0100-globbing.sh | 43 +++
>  t/perf/perf-lib.sh   | 19 +++
>  3 files changed, 59 insertions(+), 4 deletions(-)
>  create mode 100755 t/perf/p0100-globbing.sh

Reviewed-by: Jonathan Nieder 

Thanks.


Re: [PATCH 01/11] git-parse-remote: fix highly misleading man page

2017-05-11 Thread Johannes Schindelin
Hi Stefan,


On Thu, 11 May 2017, Stefan Beller wrote:

> On Thu, May 11, 2017 at 6:47 AM, Johannes Schindelin
>  wrote:
> > The man page still talked about the .git/remotes/ directory (which is
> > no longer in use, as of 75c384efb52 (Do not create $GIT_DIR/remotes/
> > directory anymore., 2006-12-19)).
> >
> > Let's just revamp it almost completely to reflect the *purpose* of
> > that scriptlet, as opposed to its implementation details.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  Documentation/git-parse-remote.txt | 7 +++
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/git-parse-remote.txt 
> > b/Documentation/git-parse-remote.txt
> > index a45ea1ece81..7f865f33898 100644
> > --- a/Documentation/git-parse-remote.txt
> > +++ b/Documentation/git-parse-remote.txt
> > @@ -3,7 +3,7 @@ git-parse-remote(1)
> >
> >  NAME
> >  
> > -git-parse-remote - Routines to help parsing remote repository access 
> > parameters
> > +git-parse-remote - Routines to help parsing remote repository information
> 
> Today I learned about git-parse-remote. Upon further inspection it is
> just a lib, not anything useful for a user. (The man page with or
> without this patch is not very helpful to me)

Yes, I figured as much when I read the man page. It shows how much most of
our man pages of the olden days improved when you find an unchanged one...

> Only git-rebase as well as git-submodule include this lib, the
> submodules only need get_default_remote (4 lines of sh), which is also
> available in the submodule--helper, we'd just have to expose it and make
> it accessible.
> 
> I wonder if we'd want to retire this script in the long run.

I do not think that we can. Just like git-sh-setup, we advertised it for
use in custom scripts.

> I also wonder if rewriting the man page is good use of (your) time, as
> the last contribution specifically to Documentation/git-parse-remote.txt
> is 62d955fd43 (parse-remote: remove unused functions, 2009-06-12), which
> has been a while since.

1) we still advertise those "shell library files" for consumption in
   users' scripts, so I think we have to keep the man page, too, and

2) The fact that the man page is stale just shows how dearly it is in need
   of being edited, no? ;-)

> The outline in the coverletter promised more than just rewording, but
> I am fine with the change as is; it's a good start.

You mean the commit message (I do not think I talked about the
git-parse-remote man page in the cover letter)?

If so, I think I only promised to completely revamp it and to stop talking
about implementation details...

Ciao,
Dscho


Re: [PATCH v3 2/3] read-tree -m: make error message for merging 0 trees less smart aleck

2017-05-11 Thread Jonathan Nieder
Hi,

Jean-Noel Avila wrote:

> Signed-off-by: Jean-Noel Avila 
> Signed-off-by: Jonathan Nieder 

Please remove my sign-off.  I didn't write or carry this patch.

If you want to acknowledge my contribution, you can use something
like Helped-by, but it's not necessary.

[...]
> +++ b/Documentation/git-read-tree.txt
> @@ -135,10 +135,10 @@ OPTIONS
>  
>  Merging
>  ---
> -If `-m` is specified, 'git read-tree' can perform 3 kinds of
> -merge, a single tree merge if only 1 tree is given, a
> -fast-forward merge with 2 trees, or a 3-way merge if 3 trees are
> -provided.
> +If `-m` is specified, at least one tree must be given on the command
> +line.

As I mentioned before, this sentence feels redundant and doesn't fix
the real problem of the `-m` reference elsewhere in this file not
pointing to this section.

[...]
> --- a/builtin/read-tree.c
> +++ b/builtin/read-tree.c
> @@ -132,7 +132,7 @@ int cmd_read_tree(int argc, const char **argv, const char 
> *unused_prefix)
>   OPT_BOOL(0, "empty", _empty,
>   N_("only empty the index")),
>   OPT__VERBOSE(_update, N_("be verbose")),
> - OPT_GROUP(N_("Merging")),
> + OPT_GROUP(N_("Merging (needs at least one tree-ish")),

This also seems a little too much of a special detail to put in the
prominent section title.  If you run "git read-tree -h", where would
you expect to find this information?

The "git read-tree -h" output turns out to not be useful for much more
than a reminder of supported options --- it doesn't give a useful
overview of the usage, since the usage string at the start is very
long.  That's unfortunate but it seems outside the scope of this
patch.  Probably the simplest thing is to drop this hunk from the
patch.


[...]
> @@ -226,9 +226,10 @@ int cmd_read_tree(int argc, const char **argv, const 
> char *unused_prefix)
>   setup_work_tree();
>  
>   if (opts.merge) {
> - if (stage < 2)
> - die("just how do you expect me to merge %d trees?", 
> stage-1);
>   switch (stage - 1) {
> + case 0:
> + die(_("you must specify at least one tree to merge"));
> + break;

This part looks good.

Thanks for your patient work.
Jonathan


Re: [PATCH] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jonathan Nieder
Jeff King wrote:
> On Wed, May 10, 2017 at 10:00:44AM -0700, Jonathan Nieder wrote:
>> Jeff King wrote:

>>> [1] The reachability checks from upload-pack don't actually do much on
>>> GitHub, because you can generally access the objects via the API or
>>> the web site anyway.
[...]
>> Given that, what would make me really happy is if github enables
>> uploadpack.allowAnySHA1InWant.  That would be useful for me, at least.
>
> One of my hesitations is that we've actually considered moving in the
> opposite direction. The object storage for all of the repositories in a
> network is shared, so I can fork git.git, push up malicious crap, and
> then point people to:
>
>   https://github.com/git/git/commit/$sha1
>
> and it resolves. Obviously there's a social-engineering component to any
> such attack, but it's not great. And even without security in mind, it's
> potentially confusing.
[...]
> But even leaving all the refs/pull stuff aside, allowAnySHA1InWant does
> seem to increase that confusion, and I don't see a way around it short
> of never sharing objects between repositories at all. So I think at most
> we'd do allowReachableSHA1InWant.

I had guessed you didn't want to do allowReachableSHA1InWant for
performance reasons.  (I haven't checked to what extent we are already
taking advantage of bitmaps to avoid a slow reachability check.)  If I
was wrong and allowReachableSHA1InWant is on the table then it is of
course even better. :)

Thanks,
Jonathan


Possible bug in includeIf / conditional includes on non git initialised directories

2017-05-11 Thread Raphael Stolt
Hi there,

I might have stumbled this time over a real bug in includeIf / conditional 
includes or maybe it's just as intended.
1) Given I have a correct configured includeIf and I’m issuing `git config 
--show-origin --get user.email` against an directory which hasn’t been `git 
init`ed I get the user.email configured globally.
2) Given I have a correct configured includeIf and I’m issuing `git config 
--show-origin --get user.email` against an directory which has been `git 
init`ed I get the user.email configured conditionally.
For 1) I would probably expect to get the user.email configured conditionally 
even for a plain directory.

More details see this 
(http://stackoverflow.com/questions/43919191/git-2-13-conditional-config-on-windows/)
 Stack Overflow question.
Best regards,
Raphael Stolt







Re: [PATCH 08/29] grep: add tests for --threads=N and grep.threads

2017-05-11 Thread Brandon Williams
On 05/11, Ævar Arnfjörð Bjarmason wrote:
> Add tests for --threads=N being supplied on the command-line, or when
> grep.threads=N being supplied in the configuration.
> 
> When the threading support was made run-time configurable in commit
> 89f09dd34e ("grep: add --threads= option and grep.threads
> configuration", 2015-12-15) no tests were added for it.
> 
> In developing a change to the grep code I was able to make
> '--threads=1 ` segfault, while the test suite still passed. This
> change fixes that blind spot in the tests.
> 
> In addition to asserting that asking for N threads shouldn't segfault,
> test that the grep output given any N is the same.
> 
> The choice to test only 1..10 as opposed to 1..8 or 1..16 or whatever
> is arbitrary. Testing 1..1024 works locally for me (but gets
> noticeably slower as more threads are spawned). Given the structure of
> the code there's no reason to test an arbitrary number of threads,
> only 0, 1 and >=2 are special modes of operation.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  t/t7810-grep.sh | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> index daa906b9b0..561709ef6a 100755
> --- a/t/t7810-grep.sh
> +++ b/t/t7810-grep.sh
> @@ -775,6 +775,22 @@ test_expect_success 'grep -W with userdiff' '
>   test_cmp expected actual
>  '
>  
> +for threads in $(test_seq 0 10)
> +do
> + test_expect_success "grep --threads=$threads & -c 
> grep.threads=$threads" "
> + git grep --threads=$threads . >actual.$threads &&
> + if test $threads -ge 1
> + then
> + test_cmp actual.\$(($threads - 1)) actual.$threads
> + fi &&
> + git -c grep.threads=$threads grep . >actual.$threads &&
> + if test $threads -ge 1
> + then
> + test_cmp actual.\$(($threads - 1)) actual.$threads
> + fi
> + "
> +done
> +

Is there a test condition to require PTHREADS?  Otherwise this might
break if git is compiled with NO_PTHREADS.

-- 
Brandon Williams


Re: [PATCH 07/29] grep: change non-ASCII -i test to stop using --debug

2017-05-11 Thread Ævar Arnfjörð Bjarmason
On Thu, May 11, 2017 at 8:31 PM, Brandon Williams  wrote:
> On 05/11, Ęvar Arnfjörš Bjarmason wrote:
>> Change a non-ASCII case-insensitive test case to stop using --debug,
>> and instead simply test for the expected results.
>>
>> The test coverage remains the same with this change, but the test
>> won't break due to internal refactoring.
>>
>> This test was added in commit 793dc676e0 ("grep/icase: avoid kwsset
>> when -F is specified", 2016-06-25). It was asserting that the regex
>> must be compiled with compile_fixed_regexp(), instead test for the
>> expected results, allowing the underlying implementation to change.
>>
>> Signed-off-by: Ęvar Arnfjörš Bjarmason 
>> ---
>>  t/t7812-grep-icase-non-ascii.sh | 25 +
>>  1 file changed, 5 insertions(+), 20 deletions(-)
>>
>> diff --git a/t/t7812-grep-icase-non-ascii.sh 
>> b/t/t7812-grep-icase-non-ascii.sh
>> index 04a61cb8e0..969e7c0dda 100755
>> --- a/t/t7812-grep-icase-non-ascii.sh
>> +++ b/t/t7812-grep-icase-non-ascii.sh
>> @@ -36,29 +36,14 @@ test_expect_success GETTEXT_LOCALE,PCRE 'grep pcre utf-8 
>> string with "+"' '
>>  '
>>
>>  test_expect_success REGEX_LOCALE 'grep literal string, with -F' '
>> - git grep --debug -i -F "TILRAUN: Halló Heimur!"  2>&1 >/dev/null |
>> -  grep fixed >debug1 &&
>> - test_write_lines "fixed TILRAUN: Halló Heimur!" >expect1 &&
>> - test_cmp expect1 debug1 &&
>> -
>> - git grep --debug -i -F "TILRAUN: HALLÓ HEIMUR!"  2>&1 >/dev/null |
>> -  grep fixed >debug2 &&
>> - test_write_lines "fixed TILRAUN: HALLÓ HEIMUR!" >expect2 &&
>> - test_cmp expect2 debug2
>> + git grep -i -F "TILRAUN: Halló Heimur!" &&
>> + git grep -i -F "TILRAUN: HALLÓ HEIMUR!"
>>  '
>>
>>  test_expect_success REGEX_LOCALE 'grep string with regex, with -F' '
>> - test_write_lines "^*TILR^AUN:.* \\Halló \$He[]imur!\$" >file &&
>> -
>> - git grep --debug -i -F "^*TILR^AUN:.* \\Halló \$He[]imur!\$" 2>&1 
>> >/dev/null |
>> -  grep fixed >debug1 &&
>> - test_write_lines "fixed \\^*TILR^AUN:\\.\\* Halló 
>> \$He\\[]imur!\\\$" >expect1 &&
>> - test_cmp expect1 debug1 &&
>> -
>> - git grep --debug -i -F "^*TILR^AUN:.* \\HALLÓ \$HE[]IMUR!\$"  2>&1 
>> >/dev/null |
>> -  grep fixed >debug2 &&
>> - test_write_lines "fixed \\^*TILR^AUN:\\.\\* HALLÓ 
>> \$HE\\[]IMUR!\\\$" >expect2 &&
>> - test_cmp expect2 debug2
>> + test_write_lines "TILRAUN: Halló Heimur [abc]!" >file3 &&
>> + git add file3 &&
>> + git grep --debug -i -F "TILRAUN: Halló Heimur [abc]!" file3
>>  '
>
> Your commit message leads me to believe that you are reformatting the
> tests such that you don't use the '--dubug' flag yet this last line uses
> it.  Is this intentional?

Nope, my mistake. Removing it is functionally equivalent (we discard
stderr there). Will queue up a fix locally & send eventually in a v2.

Thanks a lot for looking this giant deluge of patches over.


Re: [PATCH 07/29] grep: change non-ASCII -i test to stop using --debug

2017-05-11 Thread Brandon Williams
On 05/11, Ævar Arnfjörð Bjarmason wrote:
> Change a non-ASCII case-insensitive test case to stop using --debug,
> and instead simply test for the expected results.
> 
> The test coverage remains the same with this change, but the test
> won't break due to internal refactoring.
> 
> This test was added in commit 793dc676e0 ("grep/icase: avoid kwsset
> when -F is specified", 2016-06-25). It was asserting that the regex
> must be compiled with compile_fixed_regexp(), instead test for the
> expected results, allowing the underlying implementation to change.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  t/t7812-grep-icase-non-ascii.sh | 25 +
>  1 file changed, 5 insertions(+), 20 deletions(-)
> 
> diff --git a/t/t7812-grep-icase-non-ascii.sh b/t/t7812-grep-icase-non-ascii.sh
> index 04a61cb8e0..969e7c0dda 100755
> --- a/t/t7812-grep-icase-non-ascii.sh
> +++ b/t/t7812-grep-icase-non-ascii.sh
> @@ -36,29 +36,14 @@ test_expect_success GETTEXT_LOCALE,PCRE 'grep pcre utf-8 
> string with "+"' '
>  '
>  
>  test_expect_success REGEX_LOCALE 'grep literal string, with -F' '
> - git grep --debug -i -F "TILRAUN: Halló Heimur!"  2>&1 >/dev/null |
> -  grep fixed >debug1 &&
> - test_write_lines "fixed TILRAUN: Halló Heimur!" >expect1 &&
> - test_cmp expect1 debug1 &&
> -
> - git grep --debug -i -F "TILRAUN: HALLÓ HEIMUR!"  2>&1 >/dev/null |
> -  grep fixed >debug2 &&
> - test_write_lines "fixed TILRAUN: HALLÓ HEIMUR!" >expect2 &&
> - test_cmp expect2 debug2
> + git grep -i -F "TILRAUN: Halló Heimur!" &&
> + git grep -i -F "TILRAUN: HALLÓ HEIMUR!"
>  '
>  
>  test_expect_success REGEX_LOCALE 'grep string with regex, with -F' '
> - test_write_lines "^*TILR^AUN:.* \\Halló \$He[]imur!\$" >file &&
> -
> - git grep --debug -i -F "^*TILR^AUN:.* \\Halló \$He[]imur!\$" 2>&1 
> >/dev/null |
> -  grep fixed >debug1 &&
> - test_write_lines "fixed \\^*TILR^AUN:\\.\\* Halló 
> \$He\\[]imur!\\\$" >expect1 &&
> - test_cmp expect1 debug1 &&
> -
> - git grep --debug -i -F "^*TILR^AUN:.* \\HALLÓ \$HE[]IMUR!\$"  2>&1 
> >/dev/null |
> -  grep fixed >debug2 &&
> - test_write_lines "fixed \\^*TILR^AUN:\\.\\* HALLÓ 
> \$HE\\[]IMUR!\\\$" >expect2 &&
> - test_cmp expect2 debug2
> + test_write_lines "TILRAUN: Halló Heimur [abc]!" >file3 &&
> + git add file3 &&
> + git grep --debug -i -F "TILRAUN: Halló Heimur [abc]!" file3
>  '

Your commit message leads me to believe that you are reformatting the
tests such that you don't use the '--dubug' flag yet this last line uses
it.  Is this intentional?

-- 
Brandon Williams


Re: [PATCH 11/11] PREVIEW: remove support for .git/remotes/ and .git/branches/

2017-05-11 Thread Stefan Beller
On Thu, May 11, 2017 at 6:48 AM, Johannes Schindelin
 wrote:
> At long last, after a cycle or three of warning users who *still* use
> the ancient feature of .git/remotes/ and .git/branches/, it is time to
> retire the code.
>
> Signed-off-by: Johannes Schindelin 
> ---

The PREVIEW patches also look good to me, though I just
skimmed them and other may chime in on the timing of them.

Thanks,
Stefan


Re: [PATCH] pull: optionally rebase submodules

2017-05-11 Thread Stefan Beller
On Thu, May 11, 2017 at 10:24 AM, Brandon Williams  wrote:
> Teach pull to optionally update submodules when '--recurse-submodules'
> is provided.  This will teach pull to run 'submodule update --rebase'
> when the '--recurse-submodules' and '--rebase' flags are given.
>
> Signed-off-by: Brandon Williams 
> ---
>
> Pull is already a shortcut for running fetch followed by a merge/rebase, so 
> why
> not have it be a shortcut for running 'submodule update --rebase' when the
> recurse flag is given!

I have not thought about the implications of this shortcut, as opposed to
actually implementing it in C (which presumably would contain more checks).
Will do.

>
>  builtin/pull.c| 30 ++-
>  t/t5572-pull-submodule.sh | 97 
> +++
>  2 files changed, 125 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/pull.c b/builtin/pull.c
> index dd1a4a94e..d73d654e6 100644
> --- a/builtin/pull.c
> +++ b/builtin/pull.c
> @@ -77,6 +77,7 @@ static const char * const pull_usage[] = {
>  /* Shared options */
>  static int opt_verbosity;
>  static char *opt_progress;
> +static int recurse_submodules;
>
>  /* Options passed to git-merge or git-rebase */
>  static enum rebase_type opt_rebase = -1;
> @@ -532,6 +533,17 @@ static int pull_into_void(const struct object_id 
> *merge_head,
> return 0;
>  }
>
> +static int  update_submodules(void)

Maybe s/update_submodules/rebase_submodules/ ?

> +{
> +   struct child_process cp = CHILD_PROCESS_INIT;
> +   cp.git_cmd = 1;
> +
> +   argv_array_pushl(, "submodule", "update", "--recursive", 
> NULL);
> +   argv_array_push(, "--rebase");

The --rebase could be part of the _pushl ?
Also we could set
no_stdin = 1
we do need stdout/err though.


> +
> +   return run_command();
> +}
> +
>  /**
>   * Runs git-merge, returning its exit status.
>   */
> @@ -816,6 +828,14 @@ int cmd_pull(int argc, const char **argv, const char 
> *prefix)
> oidclr(_fork_point);
> }
>
> +   if (opt_recurse_submodules &&
> +   !strcmp(opt_recurse_submodules, "--recurse-submodules")) {

So this checks if we pass --recurse-submodules to fetch and if it is not
a on-demand/no.
Maybe we'd want to use the same infrastructure as fetch does, such that
parse_fetch_recurse makes the decision. (Then "--recurse-submodules=TrUe"
would work as well, IIUC)


> +   recurse_submodules = 1;
> +
> +   if (!opt_rebase)
> +   die(_("--recurse-submodules is only valid with 
> --rebase"));

I wonder if there are existing users of "git pull --recurse --merge";
as of now this would fetch the submodules (on-demand) and merge
in the local commits of the superprojects. It sounds like a useful workflow
which we'd be blocking here? Maybe just do nothing in case of !opt_rebase,
i.e. make it part of the first condition added in this hunk?

> +   ret = run_rebase(_head, merge_heads.oid, 
> _fork_point);
> +
> +   if (!ret && recurse_submodules)
> +   ret = update_submodules();

Instead of doing the rebase of submodules here, we may just want to pass
'recurse_submodules' into run_rebase, which can do it, too. (It already has
a 'ret' value, so the main cmd is not as cluttered.

---
Before reviewing the tests, let's step a bit back and talk about the design
and what is useful to the user. From reading the code, we
  1) perform a fetch in the superproject
  2) rebase the superproject (not rewriting any submodule pointers,
but that may be ok for now)
  3) sequentially:
  3a) fetch each submodule on demand
  3b) run rebase inside of it.


(A) On the sequence:
If in a normal pull --rebase we have a failure, we can fixup the failure
and then continue via "git rebase --continue"; now if we have a failure
in 3), we would need to fixup the submodule ourselves and then as
we lost all progress in the superproject, rerun "pull --rebase --recurse"?

(B)
As discussed offline this produces bad results if we have a non-ff
operation in the superproject, that also has submodule pointer updates.
So additionally we would need to walk the superprojects local commits
and check if any submodule is touched.


>
> +test_expect_success 'pull --recurse-submodule setup' '
> +   git init child &&

test_create_repo child

> +   (
> +   cd child &&
> +   echo "bar" >file &&
> +   git add file &&
> +   git commit -m "initial commit"

test_create_commit -C child

> +   ) &&
> +   git init parent &&
> +   (
> +   cd parent &&
> +   echo "foo" >file &&
> +   git add file &&
> +   git commit -m "Initial commit" &&
> +   git submodule add ../child sub &&
> +   git commit -m "add submodule"
> +   ) &&

Same setup comment as for the child


> +   git clone 

[PATCH/RFC 5/6] grep: support regex patterns containing \0 via PCRE v2

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Support regex patterns with embedded \0's, as an earlier commit[1]
notes this was previously impossible due to an internal limitation.

Before this change any regex metacharacters in patterns containing \0
were silently ignored and the pattern matched as if it were a
--fixed-strings pattern.

Now these patterns will be matched with PCRE instead, which supports
combining regex metacharacters with patterns containing \0.

A side-effect of this change is that these patterns which previously
would be considered --fixed-strings patterns regardless of the engine
requested now all implicitly become --perl-regexp instead.

A subsequent change introduces a POSIX to PCRE syntax converter, and
could be used to be 100% truthful to our documentation by using POSIX
basic syntax (which we haven't been in quite some time with kwset).

But due to a chicken & egg issue with this change being easier to
implement stand-alone first, the subsequent change depending on a SVN
trunk version of PCRE, but most importantly I don't think anyone will
mind this change, so I'm leaving it as it is.

This implementation is faster than the previous kwset implementation,
but I haven't bothered to come up with a \0-specific fixed-string
performance test.

See the next change in this series for a change which optionally
expands the PCRE v2 use to use it for all fixed-string patterns, the
performance tests for those will be applicable to these patterns as
well, since PCRE matches \0 like any other character.

1. ("grep: factor test for \0 in grep patterns into a function",
   2017-05-08)

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 grep.c | 24 
 t/t7008-grep-binary.sh | 74 ++
 2 files changed, 75 insertions(+), 23 deletions(-)

diff --git a/grep.c b/grep.c
index 2ff4e253ff..5db614cf80 100644
--- a/grep.c
+++ b/grep.c
@@ -613,6 +613,30 @@ static void compile_regexp(struct grep_pat *p, struct 
grep_opt *opt)
icase  = opt->regflags & REG_ICASE || p->ignore_case;
ascii_only = !has_non_ascii(p->pattern);
 
+#ifdef USE_LIBPCRE2
+   if (has_null(p->pattern, p->patternlen)) {
+   struct strbuf sb = STRBUF_INIT;
+   if (icase)
+   strbuf_add(, "(?i)", 4);
+   if (opt->fixed)
+   strbuf_add(, "\\Q", 2);  
+   strbuf_add(, p->pattern, p->patternlen);
+   if (opt->fixed)
+   strbuf_add(, "\\E", 2);
+
+   p->pattern = sb.buf;
+   p->patternlen = sb.len;
+
+   /* FIXME: Check in compile_pcre2_pattern() that we're
+* using basic rx using !opt->pcre2 && 
+*/
+   opt->pcre2 = 1;
+
+   compile_pcre2_pattern(p, opt);
+   return;
+   }
+#endif
+
/*
 * Even when -F (fixed) asks us to do a non-regexp search, we
 * may not be able to correctly case-fold when -i
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index ba3db06501..fc86ed5fce 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -124,35 +124,63 @@ nul_match 0 '-F' '[æ]Qð'
 nul_match 0 '-Fi' 'ÆQ[Ð]'
 nul_match 0 '-Fi' '[Æ]QÐ'
 
-# kwset is disabled on -i & non-ASCII. No way to match non-ASCII \0
-# patterns case-insensitively.
-nul_match T1 '-i' 'ÆQÐ'
-
-# \0 implicitly disables regexes. This is an undocumented internal
-# limitation.
-nul_match T1 '' 'yQ[f]'
-nul_match T1 '' '[y]Qf'
-nul_match T1 '-i' 'YQ[F]'
-nul_match T1 '-i' '[Y]Qf'
-nul_match T1 '' 'æQ[ð]'
-nul_match T1 '' '[æ]Qð'
-nul_match T1 '-i' 'ÆQ[Ð]'
-
-# ... because of \0 implicitly disabling regexes regexes that
-# should/shouldn't match don't do the right thing.
-nul_match T1 '' 'eQm.*cQ'
-nul_match T1 '-i' 'EQM.*cQ'
-nul_match T0 '' 'eQm[*]c'
-nul_match T0 '-i' 'EQM[*]C'
+if test_have_prereq LIBPCRE2
+then
+   # Regex patterns that should match without -F
+   nul_match 1 '' 'yQ[f]'
+   nul_match 1 '' '[y]Qf'
+   nul_match 1 '-i' 'YQ[F]'
+   nul_match 1 '-i' '[Y]Qf'
+   nul_match 1 '' 'æQ[ð]'
+   nul_match 1 '' '[æ]Qð'
+   nul_match 0 '-i' '[Æ]Qð'
+   nul_match 1 '' 'eQm.*cQ'
+   nul_match 1 '-i' 'EQM.*cQ'
+   nul_match 0 '' 'eQm[*]c'
+   nul_match 0 '-i' 'EQM[*]C'
+
+   # These should also match, but don't due to some heisenbug,
+   # they succeed when run manually!
+   nul_match T1 '-i' 'ÆQÐ'
+   nul_match T1 '-i' 'ÆQ[Ð]'
+else
+   # \0 implicitly disables regexes. This is an undocumented
+   # internal limitation.
+   nul_match T1 '' 'yQ[f]'
+   nul_match T1 '' '[y]Qf'
+   nul_match T1 '-i' 'YQ[F]'
+   nul_match T1 '-i' '[Y]Qf'
+   nul_match T1 '' 'æQ[ð]'
+   nul_match T1 '' '[æ]Qð'
+   nul_match T1 '-i' 'ÆQ[Ð]'
+
+   # ... because of \0 implicitly disabling regexes regexes that
+   # should/shouldn't match don't do 

[PATCH/RFC 2/6] Makefile & compat/pcre2: add dependency on pcre2_convert.c

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Add a dependency on the experimental pcre2_convert.c. This only exists
in svn trunk of pcre2 currently, and allows for converting POSIX
basic/extended & glob patterns to patterns accepted by PCRE[1][2].

1. https://bugs.exim.org/show_bug.cgi?id=2106
2. https://bugs.exim.org/show_bug.cgi?id=2107

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile  | 1 +
 compat/pcre2/get-pcre2.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/Makefile b/Makefile
index b18867196e..e437fa011c 100644
--- a/Makefile
+++ b/Makefile
@@ -1541,6 +1541,7 @@ endif
compat/pcre2/src/pcre2_compile.o \
compat/pcre2/src/pcre2_config.o \
compat/pcre2/src/pcre2_context.o \
+   compat/pcre2/src/pcre2_convert.o \
compat/pcre2/src/pcre2_error.o \
compat/pcre2/src/pcre2_find_bracket.o \
compat/pcre2/src/pcre2_jit_compile.o \
diff --git a/compat/pcre2/get-pcre2.sh b/compat/pcre2/get-pcre2.sh
index f1796cb518..7679fba8e4 100755
--- a/compat/pcre2/get-pcre2.sh
+++ b/compat/pcre2/get-pcre2.sh
@@ -26,6 +26,7 @@ for srcfile in \
pcre2_compile.c \
pcre2_config.c \
pcre2_context.c \
+   pcre2_convert.c \
pcre2_error.c \
pcre2_find_bracket.c \
pcre2_jit_compile.c \
-- 
2.11.0



[PATCH/RFC 6/6] grep: use PCRE v2 under the hood for -G & -E for amazing speedup

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Change the underlying engine powering POSIX basic & extended patterns
to be PCRE v2 under the hood.

This relies on an experimental SVN-trunk only PCRE v2 API which Philip
Hazel (the PCRE maintainer) wrote up in response to a feature request
I filed1[1].

This allows us to use pcre2_pattern_convert() to power all grep regex
matches by converting the POSIX patterns into PCRE syntax before
compiling them.

Due to PCRE generally being faster than POSIX, but most importantly
due to its JIT feature (where available) this speeds up grep by
a *lot*.

The improvements to the "perl" tests are already a part of this
series, but all the other benchmarks show improvements made by this
change alone:

$ GIT_PERF_REPEAT_COUNT=30 GIT_PERF_LARGE_REPO=~/g/linux [see note #2 for 
the GIT_PERF_MAKE_COMMAND ...]
[...]
Test   v2.13.0 HEAD

-
7810.1: grep worktree, cheap regex 0.19(0.39+0.52) 
0.19(0.26+0.56) +0.0%
7810.2: grep worktree, expensive regex 5.11(29.75+0.33)
1.07(5.36+0.34) -79.1%
7810.3: grep --cached, cheap regex 2.91(2.77+0.12) 
2.85(2.78+0.06) -2.1%
7810.4: grep --cached, expensive regex 21.00(20.89+0.08)   
6.25(6.18+0.06) -70.2%
7820.1: basic grep how.to  0.32(1.20+0.43) 
0.19(0.26+0.56) -40.6%
7820.2: extended grep how.to   0.32(1.12+0.51) 
0.19(0.22+0.60) -40.6%
7820.3: perl grep how.to   0.31(1.11+0.45) 
0.19(0.30+0.54) -38.7%
7820.5: basic grep ^how to 0.31(1.09+0.51) 
0.19(0.24+0.57) -38.7%
7820.6: extended grep ^how to  0.31(1.14+0.46) 
0.19(0.29+0.52) -38.7%
7820.7: perl grep ^how to  0.57(2.63+0.38) 
0.19(0.25+0.56) -66.7%
7820.9: basic grep [how] to0.49(2.19+0.36) 
0.22(0.36+0.54) -55.1%
7820.10: extended grep [how] to0.49(2.16+0.41) 
0.22(0.41+0.50) -55.1%
7820.11: perl grep [how] to0.57(2.55+0.40) 
0.22(0.35+0.55) -61.4%
7820.13: basic grep \(e.t[^ ]*\|v.ry\) rare0.65(3.18+0.38) 
0.22(0.44+0.52) -66.2%
7820.14: extended grep (e.t[^ ]*|v.ry) rare0.65(3.17+0.40) 
0.21(0.47+0.52) -67.7%
7820.15: perl grep (e.t[^ ]*|v.ry) rare1.05(5.64+0.34) 
0.22(0.46+0.53) -79.0%
7820.17: basic grep m\(ú\|u\)ult.b\(æ\|y\)te   0.33(1.33+0.38) 
0.19(0.31+0.51) -42.4%
7820.18: extended grep m(ú|u)ult.b(æ|y)te  0.33(1.27+0.44) 
0.19(0.32+0.50) -42.4%
7820.19: perl grep m(ú|u)ult.b(æ|y)te  0.37(1.58+0.40) 
0.19(0.30+0.53) -48.6%
7821.1: fixed grep int 0.53(1.70+0.60) 
0.43(1.13+0.66) -18.9%
7821.2: basic grep int 0.55(1.62+0.59) 
0.46(1.08+0.64) -16.4%
7821.3: extended grep int  0.54(1.65+0.59) 
0.45(1.17+0.56) -16.7%
7821.4: perl grep int  0.54(1.63+0.62) 
0.46(1.12+0.60) -14.8%
7821.6: fixed grep -i int  0.58(1.93+0.54) 
0.48(1.40+0.52) -17.2%
7821.7: basic grep -i int  0.83(1.91+0.60) 
0.57(1.23+0.67) -31.3%
7821.8: extended grep -i int   0.59(1.80+0.66) 
0.48(1.33+0.59) -18.6%
7821.9: perl grep -i int   0.61(1.91+0.56) 
0.52(1.28+0.63) -14.8%
7821.11: fixed grep æ  0.34(1.25+0.45) 
0.19(0.29+0.51) -44.1%
7821.12: basic grep æ  0.34(1.26+0.43) 
0.19(0.28+0.53) -44.1%
7821.13: extended grep æ   0.34(1.22+0.48) 
0.19(0.29+0.53) -44.1%
7821.14: perl grep æ   0.34(1.30+0.41) 
0.19(0.26+0.57) -44.1%
7821.16: fixed grep -i æ   0.27(0.88+0.46) 
0.19(0.30+0.51) -29.6%
7821.17: basic grep -i æ   0.27(0.88+0.44) 
0.19(0.27+0.54) -29.6%
7821.18: extended grep -i æ0.27(0.90+0.42) 
0.19(0.22+0.59) -29.6%
7821.19: perl grep -i æ0.25(0.74+0.51) 
0.18(0.27+0.58) -28.0%
7821.1: fixed grep int 0.53(1.70+0.60) 
0.43(1.13+0.66) -18.9%
7821.2: basic grep int 0.55(1.62+0.59) 
0.46(1.08+0.64) -16.4%
7821.3: extended grep int  0.54(1.65+0.59) 
0.45(1.17+0.56) -16.7%
7821.4: perl grep int  0.54(1.63+0.62) 
0.46(1.12+0.60) -14.8%
7821.6: fixed grep -i int  0.58(1.93+0.54) 
0.48(1.40+0.52) -17.2%
7821.7: basic grep -i int  0.83(1.91+0.60) 
0.57(1.23+0.67) -31.3%
7821.8: extended grep -i int   0.59(1.80+0.66) 
0.48(1.33+0.59) -18.6%
7821.9: perl grep 

[PATCH/RFC 4/6] test-lib: add LIBPCRE1 & LIBPCRE2 prerequisites

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Add LIBPCRE1 and LIBPCRE2 prerequisites which are true when git is
compiled with USE_LIBPCRE1=YesPlease or USE_LIBPCRE2=YesPlease,
respectively.

There are various edge cases or version-specific features that need to
be tested for.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/README  | 12 
 t/test-lib.sh |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/t/README b/t/README
index 2f95860369..1ff612ca65 100644
--- a/t/README
+++ b/t/README
@@ -808,6 +808,18 @@ use these, and "test_set_prereq" for how to define your 
own.
Git was compiled with support for PCRE. Wrap any tests
that use git-grep --perl-regexp or git-grep -P in these.
 
+ - LIBPCRE1
+
+   Git was compiled with PCRE v1 support via
+   USE_LIBPCRE1=YesPlease. Wrap any PCRE using tests that for some
+   reason need v1 of the PCRE library instead of v2 in these.
+
+ - LIBPCRE2
+
+   Git was compiled with PCRE v2 support via
+   USE_LIBPCRE2=YesPlease. Wrap any PCRE using tests that for some
+   reason need v2 of the PCRE library instead of v1 in these.
+
  - CASE_INSENSITIVE_FS
 
Test is run on a case insensitive file system.
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 44d4679384..13ed81dc16 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1012,6 +1012,8 @@ test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PTHREADS" && test_set_prereq PTHREADS
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE1$USE_LIBPCRE2" && test_set_prereq PCRE
+test -n "$USE_LIBPCRE1" && test_set_prereq LIBPCRE1
+test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.11.0



[PATCH/RFC 1/6] Makefile & compat/pcre2: add ability to build an embedded PCRE

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Add a USE_LIBPCRE2_BUNDLED=YesIHaveNoPackagedVersion flag to the
Makefile which'll use the PCRE v2 shipped in compat/pcre2 instead of
trying to find it via -lpcre2-8 on the installed system.

As covered in a previous commits ("grep: add support for PCRE v2",
2017-04-08) there are major benefits to using a bleeding edge PCRE v2,
but more importantly I'd like to experiment with making PCRE a
mandatory dependency to power various internal features of grep/log
without the user being aware that they're using the library under the
hood, similar to how we use kwset now for fixed-string searches.

Imposing that hard dependency on everyone using git would bother a lot
of people, whereas if git itself ships PCRE it's no more bothersome
than the code using kwset, i.e. it can be invisible to the builder &
user, and allow git to target newer PCRE APIs without worrying about
versioning.

See [1] for a mostly one-sided pcre-dev mailing list thread discussing
how embed the library.

Implementation details:

 * I configured PCRE v2 with ./configure --enable-jit --enable-utf

 * It sets a lot of -DHAVE_* but these are used by the subset of the
   files I copied, many are either unused or only used by pcre2test.c
   which isn't brought in by the script.

 * These -DHAVE_* flags are something we have already by default &
   assume in other git.git code, so it should be fine to define it.

 * -DPCRE2_CODE_UNIT_WIDTH=8 only compiles the functions linking to
   -lpcre2-8 would have gotten us.

 * All the limits / sizes are the PCRE defaults, the
   MATCH_LIMIT_RECURSION define is a synonym for MATCH_LIMIT_DEPTH in
   older versions, it allows building against older (currently
   release) versions of the library.

 * -DNEWLINE_DEFAULT=2 means only \n is recognized as a newline. This
corresponds to the --enable-newline-is-lf option. It's also
possible to set this to CR, CRLF, any of CR, LF, or CRLF, or any
Unicode newline character being recognized as \n.

This *might* have to be customized on Windows, but I think the
grep machinery always splits on newlines for us already, so this
probably works on Windows as-is, but needs testing.

1. https://lists.exim.org/lurker/thread/20170507.223619.fbee8f00.en.html

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile  | 52 
 compat/pcre2/get-pcre2.sh | 67 +++
 2 files changed, 119 insertions(+)
 create mode 100755 compat/pcre2/get-pcre2.sh

diff --git a/Makefile b/Makefile
index d77ca4c1a5..b18867196e 100644
--- a/Makefile
+++ b/Makefile
@@ -34,6 +34,11 @@ all::
 # library. The USE_LIBPCRE flag will likely be changed to mean v2 by
 # default in future releases.
 #
+# Define USE_LIBPCRE2_BUNDLED=YesIHaveNoPackagedVersion in addition to
+# USE_LIBPCRE2=YesPlease if you'd like to use a copy of PCRE version 2
+# bunded with Git. This is for setups where getting a hold of a
+# packaged PCRE is inconvenient.
+#
 # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
 #
@@ -1105,8 +1110,10 @@ endif
 
 ifdef USE_LIBPCRE2
BASIC_CFLAGS += -DUSE_LIBPCRE2
+ifndef USE_LIBPCRE2_BUNDLED
EXTLIBS += -lpcre2-8
 endif
+endif
 
 ifdef LIBPCREDIR
BASIC_CFLAGS += -I$(LIBPCREDIR)/include
@@ -1505,6 +1512,50 @@ ifdef NO_REGEX
COMPAT_CFLAGS += -Icompat/regex
COMPAT_OBJS += compat/regex/regex.o
 endif
+ifdef USE_LIBPCRE2_BUNDLED
+ifndef USE_LIBPCRE2
+$(error please set USE_LIBPCRE2=YesPlease when setting \
+USE_LIBPCRE2_BUNDLED=$(USE_LIBPCRE2_BUNDLED))
+endif
+   COMPAT_CFLAGS += \
+   -Icompat/pcre2/src \
+   -DHAVE_BCOPY=1 \
+   -DHAVE_INTTYPES_H=1 \
+   -DHAVE_MEMMOVE=1 \
+   -DHAVE_STDINT_H=1 \
+   -DPCRE2_CODE_UNIT_WIDTH=8 \
+   -DLINK_SIZE=2 \
+   -DHEAP_LIMIT=2000 \
+   -DMATCH_LIMIT=1000 \
+   -DMATCH_LIMIT_DEPTH=1000 \
+   -DMATCH_LIMIT_RECURSION=1000 \
+   -DMAX_NAME_COUNT=1 \
+   -DMAX_NAME_SIZE=32 \
+   -DPARENS_NEST_LIMIT=250 \
+   -DNEWLINE_DEFAULT=2 \
+   -DSUPPORT_JIT \
+   -DSUPPORT_UNICODE
+   COMPAT_OBJS += \
+   compat/pcre2/src/pcre2_auto_possess.o \
+   compat/pcre2/src/pcre2_chartables.o \
+   compat/pcre2/src/pcre2_compile.o \
+   compat/pcre2/src/pcre2_config.o \
+   compat/pcre2/src/pcre2_context.o \
+   compat/pcre2/src/pcre2_error.o \
+   compat/pcre2/src/pcre2_find_bracket.o \
+   compat/pcre2/src/pcre2_jit_compile.o \
+   compat/pcre2/src/pcre2_maketables.o \
+   compat/pcre2/src/pcre2_match.o \
+   compat/pcre2/src/pcre2_match_data.o \
+   compat/pcre2/src/pcre2_newline.o 

Re: [PATCH v3] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Jonathan Tan
Thanks for your suggestions. I'll hold off on sending out a new patch 
(following Jonathan Nieder's suggestions [1]) until we decide if further 
optimizations (for example, as suggested by Peff) need to be done.


[1] <20170510232231.gc28...@aiede.svl.corp.google.com>

On 05/11/2017 02:46 AM, Jeff King wrote:

On Wed, May 10, 2017 at 03:11:57PM -0700, Jonathan Tan wrote:


After looking at ways to solve jrnieder's performance concerns, if we're
going to need to manage one more item of state within the function, I
might as well use my earlier idea of storing unmatched refs in its own
list instead of immediately freeing them. This version of the patch
should have much better performance characteristics.


Hrm. So the problem in your original was that the loop became quadratic
in the number of refs when fetching all of them (because the original
relies on the sorting to essentially do a list-merge). Are there any
quadratic bits left?


@@ -649,6 +652,25 @@ static void filter_refs(struct fetch_pack_args *args,

if ((allow_unadvertised_object_request &
(ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
+   can_append = 1;
+   } else {
+   struct ref *u;
+   /* Check all refs, including those already matched */
+   for (u = unmatched; u; u = u->next) {
+   if (!oidcmp(>old_oid, >old_oid)) {
+   can_append = 1;
+   goto can_append;
+   }
+   }


This is inside the nr_sought loop. So if I were to do:

  git fetch origin $(git ls-remote origin | awk '{print $1}')

we're back to being quadratic. I realize that's probably a silly thing
to do, but in the general case, you're O(m*n), where "n" is number of
unmatched remote refs and "m" is the number of SHA-1s you're looking
for.


The original patch was quadratic regardless of whether we're fetching 
names or SHA-1s, which can be said to be bad since it is a regression in 
an existing and common use case (and I agree). This one is O(m*n), as 
you said - the "quadratic-ness" only kicks in if you fetch SHA-1s, which 
before this patch is impossible.



Doing better would require either sorting both lists, or storing the
oids in something that has better than linear-time lookup.  Perhaps a
sha1_array or an oidset? We don't actually need to know anything about
the unmatched refs after the first loop. We just need the list of oids
that let us do can_append.


Having a sha1_array or oidset would require allocation (O(n log n) time, 
I think, in addition to O(n) space) and this cost would be incurred 
regardless of how many SHA-1s were actually fetched (if m is an order of 
magnitude less than log n, for example, having a sha1_array might be 
actually worse). Also, presumably we don't want to incur this cost if we 
are fetching zero SHA-1s, so we would need to do some sort of pre-check. 
I'm inclined to leave it the way it is and consider this only if the use 
case of fetching many SHA1-s comes up.



AIUI, you could also avoid creating the unmatched list entirely when the
server advertises tip/reachable sha1s. That's a small optimization, but
I think it may actually make the logic clearer.


If you mean adding an "if" block at the point where we add the unmatched 
ref to the unmatched list (that either adds it to the list or 
immediately frees it), I think it makes the logic slightly more 
complicated. Or maybe you had something else in mind?


[PATCH/RFC 0/6] Speed up git-grep by using PCRE v2 as a backend

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Thought I'd send this to the list too. This is first of my WIP
post-PCRE v2 inclusion series's.

In addition to the huge speed improvements for grep -P noted in the
culmination of that series[1], this speeds up all other types of grep
invocations (fixed string & POSIX basic/extended) by using an
experimental PCRE API to translate those patterns to PCRE syntax.

Fixed string grep is sped up by ~15-50%, and any greps containing
regexes by 40-70%, with 50% seeming to be the average for most normal
patterns.

It isn't ready for the reasons noted in the last patch in the series,
and currently brings most of PCRE into git in compat/pcre2 since it
uses an experimental API.

The 5/6 patch is pretty much ready though and works on stock PCRE, it
fixes TODO tests for patterns that contain a \0, and enables regex
metacharacters in such patterns (right now they're all implicitly
fixed strings), see the discussion in that patch for some of the
caveats.

That patch will most likely be dropped by the list, it can be
retrieved from https://github.com/avar/git
avar/grep-and-pcre-and-more, or the whole series viewed at
https://github.com/git/git/compare/master...avar:avar/grep-and-pcre-and-more.

1. <20170511170142.15934-8-ava...@gmail.com>
   (https://public-inbox.org/git/20170511170142.15934-8-ava...@gmail.com/)

Ævar Arnfjörð Bjarmason (6):
  Makefile & compat/pcre2: add ability to build an embedded PCRE
  Makefile & compat/pcre2: add dependency on pcre2_convert.c
  compat/pcre2: import pcre2 from svn trunk
  test-lib: add LIBPCRE1 & LIBPCRE2 prerequisites
  grep: support regex patterns containing \0 via PCRE v2
  grep: use PCRE v2 under the hood for -G & -E for amazing speedup

 Makefile   |53 +
 compat/pcre2/get-pcre2.sh  |68 +
 compat/pcre2/src/pcre2.h   |   832 ++
 compat/pcre2/src/pcre2_auto_possess.c  |  1291 ++
 compat/pcre2/src/pcre2_chartables.c| 1 +
 compat/pcre2/src/pcre2_chartables.c.dist   |   198 +
 compat/pcre2/src/pcre2_compile.c   |  9626 +++
 compat/pcre2/src/pcre2_config.c|   222 +
 compat/pcre2/src/pcre2_context.c   |   450 +
 compat/pcre2/src/pcre2_convert.c   |   724 ++
 compat/pcre2/src/pcre2_error.c |   327 +
 compat/pcre2/src/pcre2_find_bracket.c  |   218 +
 compat/pcre2/src/pcre2_internal.h  |  1967 +++
 compat/pcre2/src/pcre2_intmodedep.h|   884 ++
 compat/pcre2/src/pcre2_jit_compile.c   | 12307 +++
 compat/pcre2/src/pcre2_jit_match.c |   189 +
 compat/pcre2/src/pcre2_jit_misc.c  |   227 +
 compat/pcre2/src/pcre2_maketables.c|   157 +
 compat/pcre2/src/pcre2_match.c |  6826 ++
 compat/pcre2/src/pcre2_match_data.c|   147 +
 compat/pcre2/src/pcre2_newline.c   |   243 +
 compat/pcre2/src/pcre2_ord2utf.c   |   120 +
 compat/pcre2/src/pcre2_string_utils.c  |   201 +
 compat/pcre2/src/pcre2_study.c |  1624 +++
 compat/pcre2/src/pcre2_tables.c|   765 ++
 compat/pcre2/src/pcre2_ucd.c   |  3761 ++
 compat/pcre2/src/pcre2_ucp.h   |   268 +
 compat/pcre2/src/pcre2_valid_utf.c |   398 +
 compat/pcre2/src/pcre2_xclass.c|   271 +
 compat/pcre2/src/sljit/sljitConfig.h   |   145 +
 compat/pcre2/src/sljit/sljitConfigInternal.h   |   725 ++
 compat/pcre2/src/sljit/sljitExecAllocator.c|   312 +
 compat/pcre2/src/sljit/sljitLir.c  |  2224 
 compat/pcre2/src/sljit/sljitLir.h  |  1392 +++
 compat/pcre2/src/sljit/sljitNativeARM_32.c |  2326 
 compat/pcre2/src/sljit/sljitNativeARM_64.c |  2104 
 compat/pcre2/src/sljit/sljitNativeARM_T2_32.c  |  1987 +++
 compat/pcre2/src/sljit/sljitNativeMIPS_32.c|   437 +
 compat/pcre2/src/sljit/sljitNativeMIPS_64.c|   539 +
 compat/pcre2/src/sljit/sljitNativeMIPS_common.c|  2110 
 compat/pcre2/src/sljit/sljitNativePPC_32.c |   276 +
 compat/pcre2/src/sljit/sljitNativePPC_64.c |   447 +
 compat/pcre2/src/sljit/sljitNativePPC_common.c |  2421 
 compat/pcre2/src/sljit/sljitNativeSPARC_32.c   |   165 +
 compat/pcre2/src/sljit/sljitNativeSPARC_common.c   |  1471 +++
 compat/pcre2/src/sljit/sljitNativeTILEGX-encoder.c | 10159 +++
 compat/pcre2/src/sljit/sljitNativeTILEGX_64.c  |  2555 
 compat/pcre2/src/sljit/sljitNativeX86_32.c |   602 +
 compat/pcre2/src/sljit/sljitNativeX86_64.c |   742 ++
 compat/pcre2/src/sljit/sljitNativeX86_common.c |  2921 +
 compat/pcre2/src/sljit/sljitProtExecAllocator.c|   421 +
 compat/pcre2/src/sljit/sljitUtils.c  

Re: [PATCH 8/8] pathspec: convert parse_pathspec to take an index

2017-05-11 Thread Brandon Williams
On 05/11, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > ls-files is the only command (that I know of) which does cache pruning
> > based on the common prefix of all pathspecs given.  If there is a
> > submodule named 'sub' and you provided a pathspec 'sub/', the matching
> > logic can handle this but the cache pruning logic would prune all
> > entries from the index which don't have a leading 'sub/' which is the
> > common prefix of all pathspecs (if you didn't strip off the slash).
> > Meaning you'd prune the submodule 'sub' when  you really didn't want to.
> > This could probably be fixed to have the cache pruning logic to prune by
> > ignoring the trailing slash always.
> >
> > So there's another option here if you don't feel quite right about
> > piping through an index into parse_pathspec just yet.
> 
> Oh, don't get me wrong---I do not think passing an index_state
> instance throughout the callchain (and perhaps later we may pass an
> instance of "repository" instead) is a wrong move in the longer term
> for various APIs.  I was just wondering if we have callers that can
> benefit from this change immediately---manipulators like "commit" do
> already use multiple instances of index_state structure.

I didn't get the feeling you thought it was a bad change.  I really
appreciate your thoughts since they are things which I also was
wondering about.

> But perhaps you are right---it may be wrong for the contents of the
> current index (or any index) to affect how a pathspec element is
> parsed in the first place.  If the current code (before this series)
> uses the_index only for error checking, we may want to separate that
> out of the parse_pathspec() callchain, so that it does not even look
> at any index (not just the_index).  And that may be a better change
> overall.

I'll polish up the other version of the series which does the processing
(using an index) outside of parse_pathspec() and let you see how you
feel about those changes.  Note that if we go with the route to not pass
in an index now, it doesn't necessarily mean that down the line we won't
have to pass a 'repository' instance into parse_pathspec().  Just food
for thought.

-- 
Brandon Williams


Re: [PATCH 05/11] Revert "Revert "Don't create the $GIT_DIR/branches directory on init""

2017-05-11 Thread Stefan Beller
On Thu, May 11, 2017 at 6:47 AM, Johannes Schindelin
 wrote:
> A long, long, long time ago, we stored the "upstream" information of
> branches in files inside the .git/branches/ directory. We don't do this
> anymore, though. Since 5751f49010e (Move remote parsing into a library
> file out of builtin-push., 2007-05-12), to be precise.
>
> This is sort of a sibling to 75c384efb52 (Do not create $GIT_DIR/remotes/
> directory anymore., 2006-12-19).
>
> The tests t5505-remote and t5516-fetch-push need to be adjusted now, as
> they expect to find a .git/branches/ directory.
>
> This reverts c8a58ac5a52 (Revert "Don't create the $GIT_DIR/branches
> directory on init", 2009-10-31).
>
> Signed-off-by: Johannes Schindelin 
> ---

This (and the previous patches) makes sense to me.

Thanks,
Stefan


[PATCH] pull: optionally rebase submodules

2017-05-11 Thread Brandon Williams
Teach pull to optionally update submodules when '--recurse-submodules'
is provided.  This will teach pull to run 'submodule update --rebase'
when the '--recurse-submodules' and '--rebase' flags are given.

Signed-off-by: Brandon Williams 
---

Pull is already a shortcut for running fetch followed by a merge/rebase, so why
not have it be a shortcut for running 'submodule update --rebase' when the
recurse flag is given!

 builtin/pull.c| 30 ++-
 t/t5572-pull-submodule.sh | 97 +++
 2 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/builtin/pull.c b/builtin/pull.c
index dd1a4a94e..d73d654e6 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -77,6 +77,7 @@ static const char * const pull_usage[] = {
 /* Shared options */
 static int opt_verbosity;
 static char *opt_progress;
+static int recurse_submodules;
 
 /* Options passed to git-merge or git-rebase */
 static enum rebase_type opt_rebase = -1;
@@ -532,6 +533,17 @@ static int pull_into_void(const struct object_id 
*merge_head,
return 0;
 }
 
+static int  update_submodules(void)
+{
+   struct child_process cp = CHILD_PROCESS_INIT;
+   cp.git_cmd = 1;
+
+   argv_array_pushl(, "submodule", "update", "--recursive", NULL);
+   argv_array_push(, "--rebase");
+
+   return run_command();
+}
+
 /**
  * Runs git-merge, returning its exit status.
  */
@@ -816,6 +828,14 @@ int cmd_pull(int argc, const char **argv, const char 
*prefix)
oidclr(_fork_point);
}
 
+   if (opt_recurse_submodules &&
+   !strcmp(opt_recurse_submodules, "--recurse-submodules")) {
+   recurse_submodules = 1;
+
+   if (!opt_rebase)
+   die(_("--recurse-submodules is only valid with 
--rebase"));
+   }
+
if (run_fetch(repo, refspecs))
return 1;
 
@@ -862,6 +882,7 @@ int cmd_pull(int argc, const char **argv, const char 
*prefix)
die(_("Cannot rebase onto multiple branches."));
 
if (opt_rebase) {
+   int ret = 0;
struct commit_list *list = NULL;
struct commit *merge_head, *head;
 
@@ -871,9 +892,14 @@ int cmd_pull(int argc, const char **argv, const char 
*prefix)
if (is_descendant_of(merge_head, list)) {
/* we can fast-forward this without invoking rebase */
opt_ff = "--ff-only";
-   return run_merge();
+   ret = run_merge();
}
-   return run_rebase(_head, merge_heads.oid, 
_fork_point);
+   ret = run_rebase(_head, merge_heads.oid, 
_fork_point);
+
+   if (!ret && recurse_submodules)
+   ret = update_submodules();
+
+   return ret;
} else {
return run_merge();
}
diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
index accfa5cc0..234a22315 100755
--- a/t/t5572-pull-submodule.sh
+++ b/t/t5572-pull-submodule.sh
@@ -42,4 +42,101 @@ KNOWN_FAILURE_NOFF_MERGE_DOESNT_CREATE_EMPTY_SUBMODULE_DIR=1
 KNOWN_FAILURE_NOFF_MERGE_ATTEMPTS_TO_MERGE_REMOVED_SUBMODULE_FILES=1
 test_submodule_switch "git_pull_noff"
 
+test_expect_success 'pull --recurse-submodule setup' '
+   git init child &&
+   (
+   cd child &&
+   echo "bar" >file &&
+   git add file &&
+   git commit -m "initial commit"
+   ) &&
+   git init parent &&
+   (
+   cd parent &&
+   echo "foo" >file &&
+   git add file &&
+   git commit -m "Initial commit" &&
+   git submodule add ../child sub &&
+   git commit -m "add submodule"
+   ) &&
+   git clone --recurse-submodule parent super &&
+   git -C super/sub checkout master
+'
+
+test_expect_success 'pull recursive fails without --rebase' '
+   test_must_fail git -C super pull --recurse-submodules 2>actual &&
+   test_i18ngrep "recurse-submodules is only valid with --rebase" actual
+'
+
+test_expect_success 'pull basic recurse' '
+   (
+   cd child &&
+   echo "foobar" >>file &&
+   git add file &&
+   git commit -m "update file"
+   ) &&
+   (
+   cd parent &&
+   git -C sub pull &&
+   git add sub &&
+   git commit -m "update submodule"
+   ) &&
+
+   git -C parent rev-parse master >expect_super &&
+   git -C child rev-parse master >expect_sub &&
+
+   git -C super pull --rebase --recurse-submodules &&
+   git -C super rev-parse master >actual_super &&
+   git -C super/sub rev-parse master >actual_sub &&
+   test_cmp expect_super actual_super &&
+   test_cmp expect_sub actual_sub
+'
+
+test_expect_success 'pull basic rebase recurse' '
+   (
+   cd 

Re: [PATCH 01/11] git-parse-remote: fix highly misleading man page

2017-05-11 Thread Stefan Beller
On Thu, May 11, 2017 at 6:47 AM, Johannes Schindelin
 wrote:
> The man page still talked about the .git/remotes/ directory (which is no
> longer in use, as of 75c384efb52 (Do not create $GIT_DIR/remotes/
> directory anymore., 2006-12-19)).
>
> Let's just revamp it almost completely to reflect the *purpose* of that
> scriptlet, as opposed to its implementation details.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  Documentation/git-parse-remote.txt | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-parse-remote.txt 
> b/Documentation/git-parse-remote.txt
> index a45ea1ece81..7f865f33898 100644
> --- a/Documentation/git-parse-remote.txt
> +++ b/Documentation/git-parse-remote.txt
> @@ -3,7 +3,7 @@ git-parse-remote(1)
>
>  NAME
>  
> -git-parse-remote - Routines to help parsing remote repository access 
> parameters
> +git-parse-remote - Routines to help parsing remote repository information

Today I learned about git-parse-remote. Upon further inspection it is
just a lib,
not anything useful for a user. (The man page with or without this patch
is not very helpful to me)

Only git-rebase as well as git-submodule include this lib, the submodules
only need get_default_remote (4 lines of sh), which is also available
in the submodule--helper, we'd just have to expose it and make it accessible.

I wonder if we'd want to retire this script in the long run.

I also wonder if rewriting the man page is good use of (your) time, as the
last contribution specifically to Documentation/git-parse-remote.txt
is 62d955fd43 (parse-remote: remove unused functions, 2009-06-12),
which has been a while since.

The outline in the coverletter promised more than just rewording, but
I am fine with the change as is; it's a good start.

Thanks,
Stefan


[PATCH 4/7] grep: add support for the PCRE v1 JIT API

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Change the grep PCRE v1 code to use JIT when available. When PCRE
support was initially added in commit 63e7e9d8b6 ("git-grep: Learn
PCRE", 2011-05-09) PCRE had no JIT support, it was integrated into
8.20 released on 2011-10-21.

Enabling JIT support usually improves performance by more than
40%. The pattern compilation times are relatively slower, but those
relative numbers are tiny, and are easily made back in all but the
most trivial cases of grep. Detailed benchmarks & overview of
compilation times is at: http://sljit.sourceforge.net/pcre.html

With this change the difference in a t/perf/p7820-grep-engines.sh run
is, with just the /perl/ tests shown:

$ GIT_PERF_REPEAT_COUNT=30 GIT_PERF_LARGE_REPO=~/g/linux 
GIT_PERF_MAKE_OPTS='-j8 USE_LIBPCRE=YesPlease CC=~/perl5/installed/bin/gcc 
NO_R_TO_GCC_LINKER=YesPlease CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre/inst 
LDFLAGS=-Wl,-rpath,/home/avar/g/pcre/inst/lib' ./run HEAD~ HEAD 
p7820-grep-engines.sh
Test   HEAD~ HEAD

---
7820.3: perl grep how.to   0.31(1.11+0.44)   
0.20(0.35+0.57) -35.5%
7820.7: perl grep ^how to  0.57(2.66+0.38)   
0.23(0.65+0.46) -59.6%
7820.11: perl grep [how] to0.55(2.54+0.43)   
0.29(0.86+0.52) -47.3%
7820.15: perl grep (e.t[^ ]*|v.ry) rare1.05(5.54+0.33)   
0.30(1.10+0.44) -71.4%
7820.19: perl grep m(ú|u)ult.b(æ|y)te  0.37(1.53+0.43)   
0.24(0.70+0.47) -35.1%

The conditional support for JIT is implemented as suggested in the
pcrejit(3) man page. E.g. defining PCRE_STUDY_JIT_COMPILE to 0 if it's
not present.

The implementation is relatively verbose because even if
PCRE_CONFIG_JIT is defined only a call to pcre_config() can determine
if the JIT is available, and if so the faster pcre_jit_exec() function
should be called instead of pcre_exec(), and a different (but not
complimentary!) function needs to be called to free pcre1_extra_info.

There's no graceful fallback if pcre_jit_stack_alloc() fails under
PCRE_CONFIG_JIT, instead the program will simply abort. I don't think
this is worth handling gracefully, it'll only fail in cases where
malloc() doesn't work, in which case we're screwed anyway.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 grep.c | 37 -
 grep.h |  6 ++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index 5c808f8971..593e72f92a 100644
--- a/grep.c
+++ b/grep.c
@@ -350,6 +350,9 @@ static void compile_pcre1_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
const char *error;
int erroffset;
int options = PCRE_MULTILINE;
+#ifdef PCRE_CONFIG_JIT
+   int canjit;
+#endif
 
if (opt->ignore_case) {
if (has_non_ascii(p->pattern))
@@ -364,9 +367,20 @@ static void compile_pcre1_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
if (!p->pcre1_regexp)
compile_regexp_failed(p, error);
 
-   p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, );
+   p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 
PCRE_STUDY_JIT_COMPILE, );
if (!p->pcre1_extra_info && error)
die("%s", error);
+
+#ifdef PCRE_CONFIG_JIT
+   pcre_config(PCRE_CONFIG_JIT, );
+   if (canjit == 1) {
+   p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024);
+   if (!p->pcre1_jit_stack)
+   die("BUG: Couldn't allocate PCRE JIT stack");
+   pcre_assign_jit_stack(p->pcre1_extra_info, NULL, 
p->pcre1_jit_stack);
+   p->pcre1_jit_on = 1;
+   }
+#endif
 }
 
 static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
@@ -377,8 +391,20 @@ static int pcre1match(struct grep_pat *p, const char 
*line, const char *eol,
if (eflags & REG_NOTBOL)
flags |= PCRE_NOTBOL;
 
+#ifdef PCRE_CONFIG_JIT
+   if (p->pcre1_jit_on)
+   ret = pcre_jit_exec(p->pcre1_regexp, p->pcre1_extra_info, line,
+   eol - line, 0, flags, ovector,
+   ARRAY_SIZE(ovector), p->pcre1_jit_stack);
+   else
+   ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line,
+   eol - line, 0, flags, ovector,
+   ARRAY_SIZE(ovector));
+#else
ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line,
0, flags, ovector, ARRAY_SIZE(ovector));
+#endif
+
if (ret < 0 && ret != PCRE_ERROR_NOMATCH)
die("pcre_exec failed with error code %d", ret);
if (ret > 0) {
@@ -393,7 +419,16 @@ static int pcre1match(struct grep_pat *p, const char 
*line, const char *eol,
 static void free_pcre1_regexp(struct grep_pat *p)
 {

[PATCH 5/7] grep: un-break building with PCRE < 8.32

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Amend my change earlier in this series ("grep: add support for the
PCRE v1 JIT API", 2017-04-11) to un-break the build on PCRE v1
versions earlier than 8.32.

The JIT support was added in version 8.20 released on 2011-10-21, but
it wasn't until 8.32 released on 2012-11-30 that the fast code path to
use the JIT via pcre_jit_exec() was added[1] (see also [2]).

This means that versions 8.20 through 8.31 could still use the JIT,
but supporting it on those versions would add to the already verbose
macro soup around JIT support it, and I don't expect that the use-case
of compiling a brand new git against a 5 year old PCRE is particularly
common, and if someone does that they can just get the existing
pre-JIT slow codepath.

So just take the easy way out and disable the JIT on any version older
than 8.32.

The reason this change isn't part of the initial change PCRE JIT
support is because possibly slightly annoying someone who's bisecting
with an ancient PCRE is worth it to have a cleaner history showing
which parts of the implementation are only used for ancient PCRE
versions. This also makes it easier to revert this change if we ever
decide to stop supporting those old versions.

1. http://www.pcre.org/original/changelog.txt ("28. Introducing a
   native interface for JIT. Through this interface, the
   compiled[...]")
2. https://bugs.exim.org/show_bug.cgi?id=2121

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 grep.c | 8 
 grep.h | 5 +
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/grep.c b/grep.c
index 593e72f92a..9ba9aab1a9 100644
--- a/grep.c
+++ b/grep.c
@@ -350,7 +350,7 @@ static void compile_pcre1_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
const char *error;
int erroffset;
int options = PCRE_MULTILINE;
-#ifdef PCRE_CONFIG_JIT
+#ifdef GIT_PCRE1_CAN_DO_MODERN_JIT
int canjit;
 #endif
 
@@ -371,7 +371,7 @@ static void compile_pcre1_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
if (!p->pcre1_extra_info && error)
die("%s", error);
 
-#ifdef PCRE_CONFIG_JIT
+#ifdef GIT_PCRE1_CAN_DO_MODERN_JIT
pcre_config(PCRE_CONFIG_JIT, );
if (canjit == 1) {
p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024);
@@ -391,7 +391,7 @@ static int pcre1match(struct grep_pat *p, const char *line, 
const char *eol,
if (eflags & REG_NOTBOL)
flags |= PCRE_NOTBOL;
 
-#ifdef PCRE_CONFIG_JIT
+#ifdef GIT_PCRE1_CAN_DO_MODERN_JIT
if (p->pcre1_jit_on)
ret = pcre_jit_exec(p->pcre1_regexp, p->pcre1_extra_info, line,
eol - line, 0, flags, ovector,
@@ -419,7 +419,7 @@ static int pcre1match(struct grep_pat *p, const char *line, 
const char *eol,
 static void free_pcre1_regexp(struct grep_pat *p)
 {
pcre_free(p->pcre1_regexp);
-#ifdef PCRE_CONFIG_JIT
+#ifdef GIT_PCRE1_CAN_DO_MODERN_JIT
if (p->pcre1_jit_on) {
pcre_free_study(p->pcre1_extra_info);
pcre_jit_stack_free(p->pcre1_jit_stack);
diff --git a/grep.h b/grep.h
index 14f47189f9..73ef0ef8ec 100644
--- a/grep.h
+++ b/grep.h
@@ -3,6 +3,11 @@
 #include "color.h"
 #ifdef USE_LIBPCRE1
 #include 
+#ifdef PCRE_CONFIG_JIT
+#if PCRE_MAJOR >= 8 && PCRE_MINOR >= 32
+#define GIT_PCRE1_CAN_DO_MODERN_JIT
+#endif
+#endif
 #ifndef PCRE_STUDY_JIT_COMPILE
 #define PCRE_STUDY_JIT_COMPILE 0
 #endif
-- 
2.11.0



[PATCH 6/7] grep: un-break building with PCRE < 8.20

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Amend my change earlier in this series ("grep: add support for the
PCRE v1 JIT API", 2017-04-11) to un-break the build on PCRE v1
versions earlier than 8.20.

The 8.20 release was the first release to have JIT & pcre_jit_stack in
the headers, so a mock type needs to be provided for it on those
releases.

Now git should compile with all PCRE versions that it supported before
my JIT change.

I've tested it as far back as version 7.5 released on 2008-01-10, once
I got down to version 7.0 it wouldn't build anymore with GCC 7.1.1,
and I couldn't be bothered to anything older than 7.5 as I'm confident
that if the build breaks on those older versions it's not because of
my JIT change.

See the "un-break" change in this series ("grep: un-break building
with PCRE < 8.32", 2017-05-10) for why this isn't squashed into the
main PCRE JIT commit.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 grep.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/grep.h b/grep.h
index 73ef0ef8ec..b7b9d487b0 100644
--- a/grep.h
+++ b/grep.h
@@ -11,6 +11,9 @@
 #ifndef PCRE_STUDY_JIT_COMPILE
 #define PCRE_STUDY_JIT_COMPILE 0
 #endif
+#if PCRE_MAJOR <= 8 && PCRE_MINOR < 20
+typedef int pcre_jit_stack;
+#endif
 #else
 typedef int pcre;
 typedef int pcre_extra;
-- 
2.11.0



[PATCH 7/7] grep: add support for PCRE v2

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Add support for v2 of the PCRE API. This is a new major version of
PCRE that came out in early 2015[1].

The regular expression syntax is the same, but while the API is
similar, pretty much every function is either renamed or takes
different arguments. Thus using it via entirely new functions makes
sense, as opposed to trying to e.g. have one compile_pcre_pattern()
that would call either PCRE v1 or v2 functions.

Git can now be compiled with either USE_LIBPCRE1=YesPlease or
USE_LIBPCRE2=YesPlease, with USE_LIBPCRE=YesPlease currently being a
synonym for the former. Providing both is a compile-time error.

With earlier patches to enable JIT for PCRE v1 the performance of the
release versions of both libraries is almost exactly the same, with
PCRE v2 being around 1% slower.

However after I reported this to the pcre-dev mailing list[2] I got a
lot of help with the API use from Zoltán Herczeg, he subsequently
optimized some of the JIT functionality in v2 of the library.

Running the p7820-grep-engines.sh performance test against the latest
Subversion trunk of both, with both them and git compiled as -O3, and
the test run against linux.git, gives the following results. Just the
/perl/ tests shown:

$ GIT_PERF_REPEAT_COUNT=30 GIT_PERF_LARGE_REPO=~/g/linux 
GIT_PERF_MAKE_COMMAND='grep -q LIBPCRE2 Makefile && make -j8 
USE_LIBPCRE2=YesPlease CC=~/perl5/installed/bin/gcc 
NO_R_TO_GCC_LINKER=YesPlease CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre2/inst 
LDFLAGS=-Wl,-rpath,/home/avar/g/pcre2/inst/lib || make -j8 
USE_LIBPCRE=YesPlease CC=~/perl5/installed/bin/gcc NO_R_TO_GCC_LINKER=YesPlease 
CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre/inst 
LDFLAGS=-Wl,-rpath,/home/avar/g/pcre/inst/lib' ./run HEAD~2 HEAD~ HEAD 
p7820-grep-engines.sh
[...]
Test   HEAD~2HEAD~  
  HEAD


7820.2: perl grep how.to   0.30(1.07+0.48)   
0.20(0.36+0.54) -33.3%   0.20(0.26+0.60) -33.3%
7820.5: perl grep ^how to  0.56(2.62+0.42)   
0.25(0.61+0.48) -55.4%   0.19(0.27+0.61) -66.1%
7820.8: perl grep [how] to 0.57(2.54+0.41)   
0.29(0.95+0.43) -49.1%   0.22(0.36+0.58) -61.4%
7820.11: perl grep (e.t[^ ]*|v.ry) rare1.03(5.57+0.34)   
0.30(1.07+0.47) -70.9%   0.23(0.54+0.44) -77.7%
7820.14: perl grep m(ú|u)ult.b(æ|y)te  0.37(1.60+0.37)   
0.25(0.73+0.44) -32.4%   0.20(0.29+0.55) -45.9%

See commit ("perf: add a performance comparison test of grep -G, -E
and -P", 2017-04-19) for further details on the machine the above test
run was executed on.

Here HEAD~2 is git with PCRE v1 without JIT, HEAD~ is PCRE v1 with
JIT, and HEAD is PCRE v2 (also with JIT). See previous commits of mine
mentioning p7820-grep-engines.sh for more details on the test setup.

For ease of readability, a different run just of HEAD~ (PCRE v1 with
JIT v.s. PCRE v2), again with just the /perl/ tests shown:

Test   HEAD~ HEAD

---
7820.2: perl grep how.to   0.20(0.40+0.48)   
0.20(0.31+0.55) +0.0%
7820.5: perl grep ^how to  0.24(0.62+0.50)   
0.19(0.30+0.58) -20.8%
7820.8: perl grep [how] to 0.29(0.98+0.39)   
0.22(0.35+0.61) -24.1%
7820.11: perl grep (e.t[^ ]*|v.ry) rare0.30(1.16+0.38)   
0.22(0.50+0.50) -26.7%
7820.14: perl grep m(ú|u)ult.b(æ|y)te  0.24(0.74+0.43)   
0.20(0.31+0.54) -16.7%

I.e. the two are either neck-to-neck, but PCRE v2 usually pulls ahead,
when it does it's around 20% faster.

A brief note on thread safety: As noted in pcre2api(3) & pcre2jit(3)
the compiled pattern can be shared between threads, but not some of
the JIT context, however the grep threading support does all pattern &
JIT compilation in separate threads, so this code doesn't need to
concern itself with thread safety.

See commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) for the
initial addition of PCRE v1. This change follows some of the same
patterns it did (and which were discussed on list at the time),
e.g. mocking up types with typedef instead of ifdef-ing them out when
USE_LIBPCRE2 isn't defined. This adds some trivial memory use to the
program, but makes the code look nicer.

1. https://lists.exim.org/lurker/message/20150105.162835.0666407a.en.html
2. https://lists.exim.org/lurker/thread/20170419.172322.833ee099.en.html

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile  |  30 +---
 configure.ac  |  77 ++-
 grep.c| 143 ++
 grep.h|  17 +++
 t/test-lib.sh |   2 +-
 5 files changed, 250 insertions(+), 19 deletions(-)

diff --git a/Makefile 

[PATCH 3/7] log: add -P as a synonym for --perl-regexp

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Add a short -P option as a synonym for the longer --perl-regexp, for
consistency with the options the corresponding grep invocations
accept.

This was intentionally omitted in commit 727b6fc3ed ("log --grep:
accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified
future use.

Make it consistent with "grep" rather than to keep it open for future
use, and to avoid the confusion of -P meaning different things for
grep & log, as is the case with the -G option.

As noted in the aforementioned commit the --basic-regexp option can't
have a corresponding -G argument, as the log command already uses that
for -G.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/rev-list-options.txt |  1 +
 revision.c |  2 +-
 t/t4202-log.sh | 12 
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index a46f70c2b1..9c44eae55d 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -91,6 +91,7 @@ endif::git-rev-list[]
Consider the limiting patterns to be fixed strings (don't interpret
pattern as a regular expression).
 
+-P::
 --perl-regexp::
Consider the limiting patterns to be Perl-compatible regular
expressions.
diff --git a/revision.c b/revision.c
index 7ff61ff5f7..03a3a012de 100644
--- a/revision.c
+++ b/revision.c
@@ -1995,7 +1995,7 @@ static int handle_revision_opt(struct rev_info *revs, int 
argc, const char **arg
DIFF_OPT_SET(>diffopt, PICKAXE_IGNORE_CASE);
} else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED;
-   } else if (!strcmp(arg, "--perl-regexp")) {
+   } else if (!strcmp(arg, "--perl-regexp") || !strcmp(arg, "-P")) {
revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE;
} else if (!strcmp(arg, "--all-match")) {
revs->grep_filter.all_match = 1;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index b0122a1991..a87b7f6089 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -331,8 +331,20 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
--grep="(1|2)" >actual.fixed.short-arg &&
git log --pretty=tformat:%s -E \
--grep="\|2" >actual.extended.short-arg &&
+   if test_have_prereq PCRE
+   then
+   git log --pretty=tformat:%s -P \
+   --grep="[\d]\|" >actual.perl.short-arg
+   else
+   test_must_fail git log -P \
+   --grep="[\d]\|"
+   fi &&
test_cmp expect.fixed actual.fixed.short-arg &&
test_cmp expect.extended actual.extended.short-arg &&
+   if test_have_prereq PCRE
+   then
+   test_cmp expect.perl actual.perl.short-arg
+   fi &&
 
git log --pretty=tformat:%s --fixed-strings \
--grep="(1|2)" >actual.fixed.long-arg &&
-- 
2.11.0



[PATCH 2/7] grep: skip pthreads overhead when using one thread

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Skip the administrative overhead of using pthreads when only using one
thread. Instead take the non-threaded path which would be taken under
NO_PTHREADS.

The threading support was initially added in commit
5b594f457a ("Threaded grep", 2010-01-25) with a hardcoded compile-time
number of 8 threads. Later the number of threads was made configurable
in commit 89f09dd34e ("grep: add --threads= option and
grep.threads configuration", 2015-12-15).

That change did not add any special handling for --threads=1. Now we
take a slightly faster path by skipping thread handling entirely when
1 thread is requested.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/grep.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 7baa4778b7..9c0d1ecc12 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1240,6 +1240,8 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
num_threads = GREP_NUM_THREADS_DEFAULT;
else if (num_threads < 0)
die(_("invalid number of threads specified (%d)"), num_threads);
+   if (num_threads == 1)
+   num_threads = 0;
 #else
if (num_threads)
warning(_("no threads support, ignoring --threads"));
-- 
2.11.0



[PATCH 1/7] grep: don't redundantly compile throwaway patterns under threading

2017-05-11 Thread Ævar Arnfjörð Bjarmason
Change the pattern compilation logic under threading so that grep
doesn't compile a pattern it never ends up using on the non-threaded
code path, only to compile it again N times for N threads which will
each use their own copy, ignoring the initially compiled pattern.

This redundant compilation dates back to the initial introduction of
the threaded grep in commit 5b594f457a ("Threaded grep",
2010-01-25).

There was never any reason for doing this redundant work other than an
oversight in the initial commit. Jeff King suggested on-list in
<20170414212325.fefrl3qdjigwy...@sigill.intra.peff.net> that this
might be needed to check the pattern for sanity before threaded
execution commences.

That's not the case. The pattern is compiled under threading in
start_threads() before any concurrent execution has started by calling
pthread_create(), so if the pattern contains an error we still do the
right thing. I.e. die with one error before any threaded execution has
commenced, instead of e.g. spewing out an error for each N threads,
which could be a regression a change like this might inadvertently
introduce.

The undocumented --debug mode added in commit 17bf35a3c7 ("grep: teach
--debug option to dump the parse tree", 2012-09-13) still works
properly with this change. It only emits debugging info during pattern
compilation, which is now dumped by the pattern compiled just before
the first thread is started.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/grep.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 50e4bd2cd2..7baa4778b7 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -224,7 +224,8 @@ static void start_threads(struct grep_opt *opt)
int err;
struct grep_opt *o = grep_opt_dup(opt);
o->output = strbuf_out;
-   o->debug = 0;
+   if (i)
+   o->debug = 0;
compile_grep_patterns(o);
err = pthread_create([i], NULL, run, o);
 
@@ -1169,8 +1170,6 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
if (!opt.fixed && opt.ignore_case)
opt.regflags |= REG_ICASE;
 
-   compile_grep_patterns();
-
/*
 * We have to find "--" in a separate pass, because its presence
 * influences how we will parse arguments that come before it.
@@ -1247,6 +1246,15 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
num_threads = 0;
 #endif
 
+   if (!num_threads)
+   /*
+* The compiled patterns on the main path are only
+* used when not using threading. Otherwise
+* start_threads() below calls compile_grep_patterns()
+* for each thread.
+*/
+   compile_grep_patterns();
+
 #ifndef NO_PTHREADS
if (num_threads) {
if (!(opt.name_only || opt.unmatch_name_only || opt.count)
-- 
2.11.0



[PATCH 0/7] PCRE v2, PCRE v1 JIT, log -P & fixes

2017-05-11 Thread Ævar Arnfjörð Bjarmason
This goes on top of the 29 patch series of "Easy to review grep &
pre-PCRE changes" (<20170511091829.5634-1-ava...@gmail.com>;
https://public-inbox.org/git/20170511091829.5634-1-ava...@gmail.com/).

This could be split into 3 unrelated things, but I have think it's
probably easier for everyone to bundle these up, since they all go on
top of the other series. Comments below:

Ævar Arnfjörð Bjarmason (7):
  grep: don't redundantly compile throwaway patterns under threading
  grep: skip pthreads overhead when using one thread

Internal changes to grep to not redundantly spawn threads. No
functional changes, just internal cleanup.

  log: add -P as a synonym for --perl-regexp

Trivial change to add -P.

  grep: add support for the PCRE v1 JIT API
  grep: un-break building with PCRE < 8.32
  grep: un-break building with PCRE < 8.20

I tested ancient versions of PCRE, which turned up build issues that
are fixed this time around.

  grep: add support for PCRE v2

The main point of this whole thing.

 Documentation/rev-list-options.txt |   1 +
 Makefile   |  30 +--
 builtin/grep.c |  16 +++-
 configure.ac   |  77 +---
 grep.c | 180 -
 grep.h |  31 +++
 revision.c |   2 +-
 t/t4202-log.sh |  12 +++
 t/test-lib.sh  |   2 +-
 9 files changed, 327 insertions(+), 24 deletions(-)

-- 
2.11.0



Re: [PATCH v3] fetch-pack: always allow fetching of literal SHA1s

2017-05-11 Thread Brandon Williams
On 05/11, Jeff King wrote:
> On Wed, May 10, 2017 at 03:11:57PM -0700, Jonathan Tan wrote:
> 
> > fetch-pack, when fetching a literal SHA-1 from a server that is not
> > configured with uploadpack.allowtipsha1inwant (or similar), always
> > returns an error message of the form "Server does not allow request for
> > unadvertised object %s". However, it is sometimes the case that such
> > object is advertised. This situation would occur, for example, if a user
> > or a script was provided a SHA-1 instead of a branch or tag name for
> > fetching, and wanted to invoke "git fetch" or "git fetch-pack" using
> > that SHA-1.
> > 
> > Teach fetch-pack to also check the SHA-1s of the refs in the received
> > ref advertisement if a literal SHA-1 was given by the user.
> 
> Stepping back a bit, what does this mean for a world where we implement
> protocol extensions to let the client specify a set of refspecs to limit
> the advertisement?
> 
> If we give the server our usual set of fetch refspecs, then we might
> fail to fulfill a request that would have been advertised outside of
> that set. And the behavior is confusing and non-transparent to the user.
> I don't think that really makes sense, though; the advertisement we ask
> for from the server should include only the bits we're interested in for
> _this_ fetch.
> 
> If we tell the server "we are interested in abcd1234", then it's not
> going to find any matching ref by name, obviously. So should servers
> then treat 40-hex names in the incoming refspecs as a request to show
> any names which have a matching sha1? That works against any server-side
> optimizations to avoid looking at the complete set of refs, but it would
> only have to kick in when the user actually specified a single SHA-1
> (and even then only when allowAnySHA1 isn't on). So that's probably
> workable.
> 
> None of this is your problem now either way; the advertisement-limiting
> extension is still vaporware, albeit one we've discussed a lot. I just
> wanted to make sure we weren't painting ourselves into any corners. And
> I think this case could probably be handled.

I can't remember, is there any reason why it is still vaporware? As in
what is holding us back from doing the advertisement-limiting (or doing
away with the initial ref  advertisement)?

-- 
Brandon Williams


Re: [PATCH 1/4] docs/config: clarify include/includeIf relationship

2017-05-11 Thread Ramsay Jones


On 11/05/17 10:10, Jeff King wrote:
> The "includeIf" directives behave exactly like include ones,
> except they only kick in when the conditional is true. That
> was mentioned in the "conditional" section, but let's make
> it more clear for the whole "includes" section, since people
> don't necessarily read the documentation top to bottom.
> 
> Signed-off-by: Jeff King 
> ---
>  Documentation/config.txt | 17 +++--
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 475e874d5..d5a453ed3 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -79,14 +79,20 @@ escape sequences) are invalid.
>  Includes
>  
>  
> +The `include` and `includeIf` sections allow you include config

s/you include/you to include/

ATB,
Ramsay Jones


Re: What's cooking in git.git (May 2017, #03; Wed, 10)

2017-05-11 Thread Lars Schneider

> On 11 May 2017, at 14:09, Johannes Schindelin  
> wrote:
> 
> Hi,
> 
> [making sure Lars reads this]
> 
> On Wed, 10 May 2017, Junio C Hamano wrote:
> 
>> * ls/filter-process-delayed (2017-03-06) 1 commit
>> - convert: add "status=delayed" to filter process protocol
>> 
>> What's the status of this one???
>> cf. 
> 

Thanks Dscho! I was holding back my new round as I wanted
to wait for 2.13. Right now I am traveling but I will send
out the new round ASAP (next week).

Cheers,
Lars

Re: Article on the branch strategy in gitworkflows(7)

2017-05-11 Thread Raman Gupta
FYI - the article mentioned below has been published @
https://medium.com/@ramangupta/how-the-creators-of-git-do-branches-e6fcc57270fb

Thank you to the two or three people that I know of that took the time
to read the draft.

Regards,
Raman

On 22/03/17 12:41 PM, Raman Gupta wrote:
> Several years ago, I contributed [1] to the gitworkflows(7)
> documentation, because I thought the process by which git.git does
> branching was really interesting.
> 
> Since then, I have found it odd that gitworkflows has mostly remained
> under the radar. Other, in my opinion, lesser flows, have become very
> popular, like GitFlow [2]. I have written an article explaining the
> "why" of gitworkflow in contrast to GitFlow and others:
> 
> https://docs.google.com/document/d/1cGNujRNVzeLV2SXkVlKwai6qJmlVT3LwlOsVYNr0FZo/edit?usp=sharing
> 
> This article is not published yet -- its still a DRAFT and only
> visible via the above URL.
> 
> I'd love to have input on the draft from the experts in this
> community. Feel free to suggest changes and add comments in the
> article via Google Docs.
> 
> Also, some explanatory illustrations to accompany the article would be
> excellent, if anyone feels like contributing. Visuals are not my
> strong suit. I'd love to include something similar to the graphic [3]
> that was a big part of making GitFlow so popular. The article can be
> partially rewritten to match illustrations, if any are contributed.
> 
> [1]
> https://github.com/git/git/commit/382e54312220ac02586a3cbb06b0e4eb7789f043
> 
> [2] http://nvie.com/posts/a-successful-git-branching-model/
> 
> [3] http://nvie.com/img/git-mo...@2x.png
> 
> Regards,
> Raman Gupta
> 



[BUG] fast-export --anonymize does not maintain fixup! commits

2017-05-11 Thread Sebastian Schuberth
Hi,

I just tried to created an anonymized repo to allow the list to
reproduce a bug in "rebase -i" I discovered in Git 2.13 for Linux
(Windows is not affected). However, in order to reproduce the bug it's
important to keep the "fixup!" prefixes as part of the commit
messages. Unfortunately, "fast-export --anonymize" does not maintain
these (or any other command prefixes in commit messages). Given that
the --anonymize option is explicitly designed to help reproducing
bugs, I consider this to be a bug in the --anonymize option itself.

-- 
Sebastian Schuberth


Re: [PATCH] compat/regex: fix compilation on Windows

2017-05-11 Thread Ævar Arnfjörð Bjarmason
On Thu, May 11, 2017 at 3:50 PM, Johannes Schindelin
 wrote:
> The real issue here is that GNU awk's regex implementation assumes a bit
> too much about the relative sizes of pointers and long integers. What they
> really want is to use intptr_t.

Thanks, looks good!

> This patch recapitulates what 56a1a3ab449 (Silence GCC's "cast of pointer
> to integer of a different size" warning, 2015-10-26) did to our previous
> copy of GNU awk's regex engine.
>
> Signed-off-by: Johannes Schindelin 
> ---
> Published-As: https://github.com/dscho/git/releases/tag/compat-regex-fixes-v1
> Fetch-It-Via: git fetch https://github.com/dscho/git compat-regex-fixes-v1
>
>  .../0003-Use-intptr_t-instead-of-long.patch| 22 
> ++
>  compat/regex/regcomp.c |  4 ++--
>  2 files changed, 24 insertions(+), 2 deletions(-)
>  create mode 100644 
> compat/regex/patches/0003-Use-intptr_t-instead-of-long.patch
>
> diff --git a/compat/regex/patches/0003-Use-intptr_t-instead-of-long.patch 
> b/compat/regex/patches/0003-Use-intptr_t-instead-of-long.patch
> new file mode 100644
> index 000..246ff256fb8
> --- /dev/null
> +++ b/compat/regex/patches/0003-Use-intptr_t-instead-of-long.patch
> @@ -0,0 +1,22 @@
> +diff --git a/compat/regex/regcomp.c b/compat/regex/regcomp.c
> +index 5e9ea26cd46..e6469167a80 100644
> +--- a/compat/regex/regcomp.c
>  b/compat/regex/regcomp.c
> +@@ -2641,7 +2641,7 @@ parse_dup_op (bin_tree_t *elem, re_string_t *regexp, 
> re_dfa_t *dfa,
> + old_tree = NULL;
> +
> +   if (elem->token.type == SUBEXP)
> +-postorder (elem, mark_opt_subexp, (void *) (long) elem->token.opr.idx);
> ++postorder (elem, mark_opt_subexp, (void *) (intptr_t) 
> elem->token.opr.idx);
> +
> +   tree = create_tree (dfa, elem, NULL, (end == -1 ? OP_DUP_ASTERISK : 
> OP_ALT));
> +   if (BE (tree == NULL, 0))
> +@@ -3868,7 +3868,7 @@ create_token_tree (re_dfa_t *dfa, bin_tree_t *left, 
> bin_tree_t *right,
> + static reg_errcode_t
> + mark_opt_subexp (void *extra, bin_tree_t *node)
> + {
> +-  int idx = (int) (long) extra;
> ++  int idx = (int) (intptr_t) extra;
> +   if (node->token.type == SUBEXP && node->token.opr.idx == idx)
> + node->token.opt_subexp = 1;
> +
> diff --git a/compat/regex/regcomp.c b/compat/regex/regcomp.c
> index 5e9ea26cd46..e6469167a80 100644
> --- a/compat/regex/regcomp.c
> +++ b/compat/regex/regcomp.c
> @@ -2641,7 +2641,7 @@ parse_dup_op (bin_tree_t *elem, re_string_t *regexp, 
> re_dfa_t *dfa,
>  old_tree = NULL;
>
>if (elem->token.type == SUBEXP)
> -postorder (elem, mark_opt_subexp, (void *) (long) elem->token.opr.idx);
> +postorder (elem, mark_opt_subexp, (void *) (intptr_t) 
> elem->token.opr.idx);
>
>tree = create_tree (dfa, elem, NULL, (end == -1 ? OP_DUP_ASTERISK : 
> OP_ALT));
>if (BE (tree == NULL, 0))
> @@ -3868,7 +3868,7 @@ create_token_tree (re_dfa_t *dfa, bin_tree_t *left, 
> bin_tree_t *right,
>  static reg_errcode_t
>  mark_opt_subexp (void *extra, bin_tree_t *node)
>  {
> -  int idx = (int) (long) extra;
> +  int idx = (int) (intptr_t) extra;
>if (node->token.type == SUBEXP && node->token.opr.idx == idx)
>  node->token.opt_subexp = 1;
>
>
> base-commit: 4e23cefb4da69a2d884c2d5a303825f40008ca42
> --
> 2.12.2.windows.2.800.gede8f145e06


  1   2   >