Re: Fetch/push lets a malicious server steal the targets of "have" lines

2016-10-28 Thread Matt McCutchen
On Fri, 2016-10-28 at 18:11 -0700, Junio C Hamano wrote:
> Ah, I see.  My immediate reaction is that you can do worse things in
> the reverse direction compared to this, but your scenario does sound
> bad already.

Are you saying that clients connecting to untrusted servers already
face worse risks that people should know about, so there is no point in
documenting this one?  I guess I don't know about the other risks aside
from accepting a corrupt object, which should be preventable by
enabling fetch.fsckObjects.  It seems we need either a statement that
connecting to untrusted servers is officially unsupported or a
description of the specific risks.

Matt


Re: [PATCH v1 16/19] read-cache: unlink old sharedindex files

2016-10-28 Thread Duy Nguyen
On Thu, Oct 27, 2016 at 11:13 PM, Junio C Hamano  wrote:
>> git-gc just can't match this because while it's running, somebody else
>> may be updating $GIT_DIR/index. Handling races would be a lot harder.
>
> It could attempt to take a lock on the primary index while it runs,
> and refrain to do anything if it can't take the lock ("gc --auto"
> may want to silently retry), and then the race is no longer an
> issue, no?

No, I thought of that. But if gc is holding the lock, another program
that wants to update the index may fail. And git-gc is supposed to be
non-intrusive
-- 
Duy


Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-28 Thread Junio C Hamano
Linus Torvalds  writes:

> Apparently windows doesn't even support it, at least not mingw

Assuming that the above was a misunderstanding, assuming that we can
do O_CLOEXEC (but not FD_CLOEXEC) on Windows just fine, where stray
file descriptors held open in the children matter more, and ...

> In contrast, O_NOATIME isn't a maintenance problem, since it's purely
> an optimization and has no semantic difference, so it not existing on
> some platform is immaterial.

... assuming that I didn't screw up my use of fcntl() to set
O_NOATIME on a fd opened with O_CLOEXEC, are you OK with the
approach in patch *1*?  We can drop *2* to keep O_NOATIME, which has
been working for those with old kernels with many loose objects, and
it is not too much code to keep.

*1* http://public-inbox.org/git/xmqqh97w38gj@gitster.mtv.corp.google.com
*2* http://public-inbox.org/git/xmqqd1ik38f4@gitster.mtv.corp.google.com


Re: [PATCH 0/4] Make other git commands use trailer layout

2016-10-28 Thread Junio C Hamano
Jonathan Tan  writes:

> This is built off jt/trailer-with-cruft (commit 60ef86a).
>
> This patch set makes "commit -s", "cherry-pick -x", and
> "format-patch --signoff" use the new trailer definition implemented in
> jt/trailer-with-cruft, with some refactoring along the way. With this
> patch set, the aforementioned commands would now handle trailers like
> those described in [1].
>
> [1] <84f28caa-2e4b-1231-1a76-3b7e765c0...@google.com>

Ooooh.  Looks delicious ;-)


Re: Fetch/push lets a malicious server steal the targets of "have" lines

2016-10-28 Thread Junio C Hamano
Matt McCutchen  writes:

> Then the server generates a commit X3 that lists Y2 as a parent, even
> though it doesn't have Y2, and advances 'x' to X3.  The victim fetches
> 'x':
>
>victim  server
>
>  Y1---Y2  (Y2)
> /   \ \ 
> ---O---O---X1---X2---X3   ---O---O---X1---X2---X3
>
> Then the server rolls back 'x' to X2:
>
>victim  server
>
>  Y1---Y2
> /   \
> ---O---O---X1---X2---X3   ---O---O---X1---X2

Ah, I see.  My immediate reaction is that you can do worse things in
the reverse direction compared to this, but your scenario does sound
bad already.



[PATCH 3/4] trailer: have function to describe trailer layout

2016-10-28 Thread Jonathan Tan
Create a function that, taking a string, describes the position of its
trailer block (if available) and the contents thereof, and make trailer
use it. This makes it easier for other Git components, in the future, to
interpret trailer blocks in the same way as trailer.

In a subsequent patch, another component will be made to use this.

Signed-off-by: Jonathan Tan 
---
 trailer.c | 120 +++---
 trailer.h |  25 +
 2 files changed, 108 insertions(+), 37 deletions(-)

diff --git a/trailer.c b/trailer.c
index d4d9e10..2f5c815 100644
--- a/trailer.c
+++ b/trailer.c
@@ -46,6 +46,8 @@ static LIST_HEAD(conf_head);
 
 static char *separators = ":";
 
+static int configured = 0;
+
 #define TRAILER_ARG_STRING "$ARG"
 
 static const char *git_generated_prefixes[] = {
@@ -549,6 +551,17 @@ static int git_trailer_config(const char *conf_key, const 
char *value, void *cb)
return 0;
 }
 
+static void ensure_configured(void)
+{
+   if (configured)
+   return;
+
+   /* Default config must be setup first */
+   git_config(git_trailer_default_config, NULL);
+   git_config(git_trailer_config, NULL);
+   configured = 1;
+}
+
 static const char *token_from_item(struct arg_item *item, char *tok)
 {
if (item->conf.key)
@@ -872,59 +885,43 @@ static int process_input_file(FILE *outfile,
  const char *str,
  struct list_head *head)
 {
-   int patch_start, trailer_start, trailer_end;
+   struct trailer_info info;
struct strbuf tok = STRBUF_INIT;
struct strbuf val = STRBUF_INIT;
-   struct trailer_item *last = NULL;
-   struct strbuf *trailer, **trailer_lines, **ptr;
+   int i;
 
-   patch_start = find_patch_start(str);
-   trailer_end = find_trailer_end(str, patch_start);
-   trailer_start = find_trailer_start(str, trailer_end);
+   trailer_info_get(, str);
 
/* Print lines before the trailers as is */
-   fwrite(str, 1, trailer_start, outfile);
+   fwrite(str, 1, info.trailer_start - str, outfile);
 
-   if (!ends_with_blank_line(str, trailer_start))
+   if (!info.blank_line_before_trailer)
fprintf(outfile, "\n");
 
-   /* Parse trailer lines */
-   trailer_lines = strbuf_split_buf(str + trailer_start, 
-trailer_end - trailer_start,
-'\n',
-0);
-   for (ptr = trailer_lines; *ptr; ptr++) {
+   for (i = 0; i < info.trailer_nr; i++) {
int separator_pos;
-   trailer = *ptr;
-   if (trailer->buf[0] == comment_line_char)
+   char *trailer = info.trailers[i];
+   if (trailer[0] == comment_line_char)
continue;
-   if (last && isspace(trailer->buf[0])) {
-   struct strbuf sb = STRBUF_INIT;
-   strbuf_addf(, "%s\n%s", last->value, trailer->buf);
-   strbuf_strip_suffix(, "\n");
-   free(last->value);
-   last->value = strbuf_detach(, NULL);
-   continue;
-   }
-   separator_pos = find_separator(trailer->buf, separators);
+   separator_pos = find_separator(trailer, separators);
if (separator_pos >= 1) {
-   parse_trailer(, , NULL, trailer->buf,
- separator_pos);
-   last = add_trailer_item(head,
-   strbuf_detach(, NULL),
-   strbuf_detach(, NULL));
+   parse_trailer(, , NULL, trailer,
+ separator_pos);
+   add_trailer_item(head,
+strbuf_detach(, NULL),
+strbuf_detach(, NULL));
} else {
-   strbuf_addbuf(, trailer);
+   strbuf_addstr(, trailer);
strbuf_strip_suffix(, "\n");
add_trailer_item(head,
 NULL,
 strbuf_detach(, NULL));
-   last = NULL;
}
}
-   strbuf_list_free(trailer_lines);
 
-   return trailer_end;
+   trailer_info_release();
+
+   return info.trailer_end - str;
 }
 
 static void free_all(struct list_head *head)
@@ -975,9 +972,7 @@ void process_trailers(const char *file, int in_place, int 
trim_empty, struct str
int trailer_end;
FILE *outfile = stdout;
 
-   /* Default config must be setup first */
-   git_config(git_trailer_default_config, NULL);
-   

[PATCH 4/4] sequencer: use trailer's trailer layout

2016-10-28 Thread Jonathan Tan
Make sequencer use trailer.c's trailer layout definition, as opposed to
parsing the footer by itself. This makes "commit -s", "cherry-pick -x",
and "format-patch --signoff" consistent with trailer, allowing
non-trailer lines and multiple-line trailers in trailer blocks under
certain conditions, and therefore suppressing the extra newline in those
cases.

Consistency with trailer extends to respecting trailer configs.  Tests
have been included to show that.

Signed-off-by: Jonathan Tan 
---
 sequencer.c  | 75 +---
 t/t3511-cherry-pick-x.sh | 16 +--
 t/t4014-format-patch.sh  | 40 +-
 t/t7501-commit.sh| 36 +++
 4 files changed, 96 insertions(+), 71 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index eec8a60..ec0157d 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -15,6 +15,7 @@
 #include "merge-recursive.h"
 #include "refs.h"
 #include "argv-array.h"
+#include "trailer.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -26,30 +27,6 @@ static GIT_PATH_FUNC(git_path_opts_file, SEQ_OPTS_FILE)
 static GIT_PATH_FUNC(git_path_seq_dir, SEQ_DIR)
 static GIT_PATH_FUNC(git_path_head_file, SEQ_HEAD_FILE)
 
-static int is_rfc2822_line(const char *buf, int len)
-{
-   int i;
-
-   for (i = 0; i < len; i++) {
-   int ch = buf[i];
-   if (ch == ':')
-   return 1;
-   if (!isalnum(ch) && ch != '-')
-   break;
-   }
-
-   return 0;
-}
-
-static int is_cherry_picked_from_line(const char *buf, int len)
-{
-   /*
-* We only care that it looks roughly like (cherry picked from ...)
-*/
-   return len > strlen(cherry_picked_prefix) + 1 &&
-   starts_with(buf, cherry_picked_prefix) && buf[len - 1] == ')';
-}
-
 /*
  * Returns 0 for non-conforming footer
  * Returns 1 for conforming footer
@@ -59,49 +36,25 @@ static int is_cherry_picked_from_line(const char *buf, int 
len)
 static int has_conforming_footer(struct strbuf *sb, struct strbuf *sob,
int ignore_footer)
 {
-   char prev;
-   int i, k;
-   int len = sb->len - ignore_footer;
-   const char *buf = sb->buf;
-   int found_sob = 0;
-
-   /* footer must end with newline */
-   if (!len || buf[len - 1] != '\n')
-   return 0;
+   struct trailer_info info;
+   int i;
+   int found_sob = 0, found_sob_last = 0;
 
-   prev = '\0';
-   for (i = len - 1; i > 0; i--) {
-   char ch = buf[i];
-   if (prev == '\n' && ch == '\n') /* paragraph break */
-   break;
-   prev = ch;
-   }
+   trailer_info_get(, sb->buf);
 
-   /* require at least one blank line */
-   if (prev != '\n' || buf[i] != '\n')
+   if (info.trailer_start == info.trailer_end)
return 0;
 
-   /* advance to start of last paragraph */
-   while (i < len - 1 && buf[i] == '\n')
-   i++;
-
-   for (; i < len; i = k) {
-   int found_rfc2822;
-
-   for (k = i; k < len && buf[k] != '\n'; k++)
-   ; /* do nothing */
-   k++;
+   for (i = 0; i < info.trailer_nr; i++)
+   if (sob && !strncmp(info.trailers[i], sob->buf, sob->len)) {
+   found_sob = 1;
+   if (i == info.trailer_nr - 1)
+   found_sob_last = 1;
+   }
 
-   found_rfc2822 = is_rfc2822_line(buf + i, k - i - 1);
-   if (found_rfc2822 && sob &&
-   !strncmp(buf + i, sob->buf, sob->len))
-   found_sob = k;
+   trailer_info_release();
 
-   if (!(found_rfc2822 ||
- is_cherry_picked_from_line(buf + i, k - i - 1)))
-   return 0;
-   }
-   if (found_sob == i)
+   if (found_sob_last)
return 3;
if (found_sob)
return 2;
diff --git a/t/t3511-cherry-pick-x.sh b/t/t3511-cherry-pick-x.sh
index 9cce5ae..bf0a5c9 100755
--- a/t/t3511-cherry-pick-x.sh
+++ b/t/t3511-cherry-pick-x.sh
@@ -25,9 +25,8 @@ Signed-off-by: B.U. Thor "
 
 mesg_broken_footer="$mesg_no_footer
 
-The signed-off-by string should begin with the words Signed-off-by followed
-by a colon and space, and then the signers name and email address. e.g.
-Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+This is not recognized as a footer because Myfooter is not a recognized token.
+Myfooter: A.U. Thor "
 
 mesg_with_footer_sob="$mesg_with_footer
 Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
@@ -112,6 +111,17 @@ test_expect_success 'cherry-pick -s inserts blank line 
after non-conforming foot
test_cmp expect actual
 '
 
+test_expect_success 'cherry-pick -s recognizes trailer config' '
+   

[PATCH 1/4] commit: make ignore_non_trailer take buf/len

2016-10-28 Thread Jonathan Tan
Make ignore_non_trailer take a buf/len pair instead of struct strbuf.

Signed-off-by: Jonathan Tan 
---
 builtin/commit.c |  2 +-
 commit.c | 22 +++---
 commit.h |  2 +-
 trailer.c|  2 +-
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 1cba3b7..c26b939 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -790,7 +790,7 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
strbuf_stripspace(, 0);
 
if (signoff)
-   append_signoff(, ignore_non_trailer(), 0);
+   append_signoff(, ignore_non_trailer(sb.buf, sb.len), 0);
 
if (fwrite(sb.buf, 1, sb.len, s->fp) < sb.len)
die_errno(_("could not write commit template"));
diff --git a/commit.c b/commit.c
index 856fd4a..f70835a 100644
--- a/commit.c
+++ b/commit.c
@@ -1649,7 +1649,7 @@ const char *find_commit_header(const char *msg, const 
char *key, size_t *out_len
 }
 
 /*
- * Inspect sb and determine the true "end" of the log message, in
+ * Inspect the given string and determine the true "end" of the log message, in
  * order to find where to put a new Signed-off-by: line.  Ignored are
  * trailing comment lines and blank lines, and also the traditional
  * "Conflicts:" block that is not commented out, so that we can use
@@ -1659,37 +1659,37 @@ const char *find_commit_header(const char *msg, const 
char *key, size_t *out_len
  * Returns the number of bytes from the tail to ignore, to be fed as
  * the second parameter to append_signoff().
  */
-int ignore_non_trailer(struct strbuf *sb)
+int ignore_non_trailer(const char *buf, size_t len)
 {
int boc = 0;
int bol = 0;
int in_old_conflicts_block = 0;
 
-   while (bol < sb->len) {
-   char *next_line;
+   while (bol < len) {
+   const char *next_line;
 
-   if (!(next_line = memchr(sb->buf + bol, '\n', sb->len - bol)))
-   next_line = sb->buf + sb->len;
+   if (!(next_line = memchr(buf + bol, '\n', len - bol)))
+   next_line = buf + len;
else
next_line++;
 
-   if (sb->buf[bol] == comment_line_char || sb->buf[bol] == '\n') {
+   if (buf[bol] == comment_line_char || buf[bol] == '\n') {
/* is this the first of the run of comments? */
if (!boc)
boc = bol;
/* otherwise, it is just continuing */
-   } else if (starts_with(sb->buf + bol, "Conflicts:\n")) {
+   } else if (starts_with(buf + bol, "Conflicts:\n")) {
in_old_conflicts_block = 1;
if (!boc)
boc = bol;
-   } else if (in_old_conflicts_block && sb->buf[bol] == '\t') {
+   } else if (in_old_conflicts_block && buf[bol] == '\t') {
; /* a pathname in the conflicts block */
} else if (boc) {
/* the previous was not trailing comment */
boc = 0;
in_old_conflicts_block = 0;
}
-   bol = next_line - sb->buf;
+   bol = next_line - buf;
}
-   return boc ? sb->len - boc : 0;
+   return boc ? len - boc : 0;
 }
diff --git a/commit.h b/commit.h
index afd14f3..9c12abb 100644
--- a/commit.h
+++ b/commit.h
@@ -355,7 +355,7 @@ extern const char *find_commit_header(const char *msg, 
const char *key,
  size_t *out_len);
 
 /* Find the end of the log message, the right place for a new trailer. */
-extern int ignore_non_trailer(struct strbuf *sb);
+extern int ignore_non_trailer(const char *buf, size_t len);
 
 typedef void (*each_mergetag_fn)(struct commit *commit, struct 
commit_extra_header *extra,
 void *cb_data);
diff --git a/trailer.c b/trailer.c
index d19a92c..6d8375b 100644
--- a/trailer.c
+++ b/trailer.c
@@ -828,7 +828,7 @@ static int find_trailer_end(struct strbuf **lines, int 
patch_start)
 
for (i = 0; i < patch_start; i++)
strbuf_addbuf(, lines[i]);
-   ignore_bytes = ignore_non_trailer();
+   ignore_bytes = ignore_non_trailer(sb.buf, sb.len);
strbuf_release();
for (i = patch_start - 1; i >= 0 && ignore_bytes > 0; i--)
ignore_bytes -= lines[i]->len;
-- 
2.8.0.rc3.226.g39d4020



[PATCH 2/4] trailer: avoid unnecessary splitting on lines

2016-10-28 Thread Jonathan Tan
trailer.c currently splits lines while processing a buffer (and also
rejoins lines when needing to invoke ignore_non_trailer).

Avoid such line splitting, except when generating the strings
corresponding to trailers (for ease of use by clients - a subsequent
patch will allow other components to obtain the layout of a trailer
block in a buffer, including the trailers themselves). The main purpose
of this is to make it easy to return pointers into the original buffer
(for a subsequent patch), but this also significantly reduces the number
of memory allocations required.

Signed-off-by: Jonathan Tan 
---
 trailer.c | 215 +-
 1 file changed, 116 insertions(+), 99 deletions(-)

diff --git a/trailer.c b/trailer.c
index 6d8375b..d4d9e10 100644
--- a/trailer.c
+++ b/trailer.c
@@ -102,12 +102,12 @@ static int same_trailer(struct trailer_item *a, struct 
arg_item *b)
return same_token(a, b) && same_value(a, b);
 }
 
-static inline int contains_only_spaces(const char *str)
+static inline int is_blank_line(const char *str)
 {
const char *s = str;
-   while (*s && isspace(*s))
+   while (*s && *s != '\n' && isspace(*s))
s++;
-   return !*s;
+   return !*s || *s == '\n';
 }
 
 static inline void strbuf_replace(struct strbuf *sb, const char *a, const char 
*b)
@@ -566,13 +566,18 @@ static int token_matches_item(const char *tok, struct 
arg_item *item, int tok_le
 }
 
 /*
- * Return the location of the first separator in line, or -1 if there is no
- * separator.
+ * Return the location of the first separator in the given line, or -1 if there
+ * is no separator.
+ *
+ * The interests parameter must contain the acceptable separators and may
+ * contain '\n'. If interests contains '\n', the input line is considered to
+ * end at the first newline encountered. Otherwise, the input line is
+ * considered to end at its null terminator.
  */
-static int find_separator(const char *line, const char *separators)
+static int find_separator(const char *line, const char *interests)
 {
-   int loc = strcspn(line, separators);
-   if (!line[loc])
+   int loc = strcspn(line, interests);
+   if (!line[loc] || line[loc] == '\n')
return -1;
return loc;
 }
@@ -688,51 +693,71 @@ static void process_command_line_args(struct list_head 
*arg_head,
free(cl_separators);
 }
 
-static struct strbuf **read_input_file(const char *file)
+static void read_input_file(struct strbuf *sb, const char *file)
 {
-   struct strbuf **lines;
-   struct strbuf sb = STRBUF_INIT;
-
if (file) {
-   if (strbuf_read_file(, file, 0) < 0)
+   if (strbuf_read_file(sb, file, 0) < 0)
die_errno(_("could not read input file '%s'"), file);
} else {
-   if (strbuf_read(, fileno(stdin), 0) < 0)
+   if (strbuf_read(sb, fileno(stdin), 0) < 0)
die_errno(_("could not read from stdin"));
}
+}
 
-   lines = strbuf_split(, '\n');
+static const char *next_line(const char *str)
+{
+   const char *nl = strchrnul(str, '\n');
+   return nl + !!*nl;
+}
 
-   strbuf_release();
+/*
+ * Return the position of the start of the last line. If len is 0, return -1.
+ */
+static int last_line(const char *buf, size_t len)
+{
+   int i;
+   if (len == 0)
+   return -1;
+   if (len == 1)
+   return 0;
+   /*
+* Skip the last character (in addition to the null terminator),
+* because if the last character is a newline, it is considered as part
+* of the last line anyway.
+*/
+   i = len - 2;
 
-   return lines;
+   for (; i >= 0; i--) {
+   if (buf[i] == '\n')
+   return i + 1;
+   }
+   return 0;
 }
 
 /*
- * Return the (0 based) index of the start of the patch or the line
- * count if there is no patch in the message.
+ * Return the position of the start of the patch or the length of str if there
+ * is no patch in the message.
  */
-static int find_patch_start(struct strbuf **lines, int count)
+static int find_patch_start(const char *str)
 {
-   int i;
+   const char *s;
 
-   /* Get the start of the patch part if any */
-   for (i = 0; i < count; i++) {
-   if (starts_with(lines[i]->buf, "---"))
-   return i;
+   for (s = str; *s; s = next_line(s)) {
+   if (starts_with(s, "---"))
+   return s - str;
}
 
-   return count;
+   return s - str;
 }
 
 /*
- * Return the (0 based) index of the first trailer line or count if
- * there are no trailers. Trailers are searched only in the lines from
- * index (count - 1) down to index 0.
+ * Return the position of the first trailer line or len if there are no
+ * trailers.
  */
-static int find_trailer_start(struct 

[PATCH 0/4] Make other git commands use trailer layout

2016-10-28 Thread Jonathan Tan
This is built off jt/trailer-with-cruft (commit 60ef86a).

This patch set makes "commit -s", "cherry-pick -x", and
"format-patch --signoff" use the new trailer definition implemented in
jt/trailer-with-cruft, with some refactoring along the way. With this
patch set, the aforementioned commands would now handle trailers like
those described in [1].

[1] <84f28caa-2e4b-1231-1a76-3b7e765c0...@google.com>

Jonathan Tan (4):
  commit: make ignore_non_trailer take buf/len
  trailer: avoid unnecessary splitting on lines
  trailer: have function to describe trailer layout
  sequencer: use trailer's trailer layout

 builtin/commit.c |   2 +-
 commit.c |  22 ++--
 commit.h |   2 +-
 sequencer.c  |  75 +++-
 t/t3511-cherry-pick-x.sh |  16 ++-
 t/t4014-format-patch.sh  |  40 +--
 t/t7501-commit.sh|  36 ++
 trailer.c| 295 ---
 trailer.h|  25 
 9 files changed, 313 insertions(+), 200 deletions(-)

-- 
2.8.0.rc3.226.g39d4020



Re: [PATCHv2 00/36] Revamp the attr subsystem!

2016-10-28 Thread Ramsay Jones


On 28/10/16 19:54, Stefan Beller wrote:
> previous discussion at 
> https://public-inbox.org/git/20161022233225.8883-1-sbel...@google.com
> 
> This implements the discarded series':
> jc/attr
> jc/attr-more
> sb/pathspec-label
> sb/submodule-default-paths
> 
> This includes
> * The fixes for windows
> * Junios latest suggestion to use git_attr_check_initv instead of
>   alloc/append.
> 
> * I implemented the thread safe attr API in patch 27 (attr: convert to new 
> threadsafe API)
> * patch 28 (attr: keep attr stack for each check) makes it actually possible
>   to run in a multithreaded environment.
> * I added a test for the multithreaded when it is introduced in patch 32
>   (pathspec: allow querying for attributes) as well as a test to disallow
>   multiple "attr"s in a pathspec.

By the end of this series, 'git_attr_counted()' and 'git_attr()' are
both file local symbols and can be marked with static. (I gave up the
search for which actual patch should change the symbols to static).

Also, 'git_attr()' ends up with a single caller, so maybe inline that
call?

I was about to have a moan about PTHREAD_MUTEX_INITIALIZER, since it
causes sparse to issue some warnings, but I see that you have decided
not to use it. So, phew! ;-)

ATB,
Ramsay Jones



Re: [PATCH] Documenation: fmt-merge-msg: fix markup in example

2016-10-28 Thread Stefan Christ
Hi, 

On Fri, Oct 28, 2016 at 08:15:57AM -0700, Junio C Hamano wrote:
> Jeff King  writes:
> 
> > On Fri, Oct 28, 2016 at 12:01:26PM +0200, Stefan Christ wrote:
> >
> >> diff --git a/Documentation/git-fmt-merge-msg.txt 
> >> b/Documentation/git-fmt-merge-msg.txt
> >> index 6526b17..44892c4 100644
> >> --- a/Documentation/git-fmt-merge-msg.txt
> >> +++ b/Documentation/git-fmt-merge-msg.txt
> >> @@ -60,10 +60,10 @@ merge.summary::
> >>  EXAMPLE
> >>  ---
> >>  
> >> ---
> >> +-
> >>  $ git fetch origin master
> >>  $ git fmt-merge-msg --log <$GIT_DIR/FETCH_HEAD
> >> ---
> >> +-
> >
> > Thanks. Asciidoc generally requires at least 4 delimiter characters to
> > open a delimited block (including a ListingBlock, which is what we want
> > here). There is one exception, "--", which is a generic OpenBlock, which
> > is just used for grouping, and not any special syntactic meaning (so
> > that's why this _didn't_ render the "--", but did render the contents
> > without line breaks).
> >
> > So looks good, modulo the typo in the subject that somebody else pointed
> > out.
> 
> Thanks, both.  I queued with a bit more enhanced log message while
> fixing the typo.
> 

I'm totally ok with these changes. Thanks.

Kind Regards,
Stefan Christ


Re: [PATCHv2 27/36] attr: convert to new threadsafe API

2016-10-28 Thread Junio C Hamano
Stefan Beller  writes:

> On Fri, Oct 28, 2016 at 3:06 PM, Junio C Hamano  wrote:
>> Probably this needs to be squashed in, now the MinGW discussion has
>> settled.
>
> I was about to propose this (and resend it non-rebased).
>
> So I do not resend, but rather ask you to squash this patch?

That's OK.  I've queued a "SQUASH???" separately for now and if we
need futher changes, you may want to resend, but I can locally
squash before merging it down to 'next' if it turns out that there
is no more changes necessary.

Thanks.


Re: Fetch/push lets a malicious server steal the targets of "have" lines

2016-10-28 Thread Matt McCutchen
On Fri, 2016-10-28 at 15:00 -0700, Junio C Hamano wrote:
> Let me see if I understood your scenario correctly.
> 
> Suppose we start from this history where 'O' are common, your victim
> has a 'Y' branch with two commits that are private to it, as well as
> a 'X' branch on which it has X1 that it previously obtained from the
> server.  On the other hand, the server does not know about Y1 or Y2,
> and it added one commit X2 to the branch 'x' the victim is
> following:
> 
>    victimserver
> 
>  Y1---Y2   
> /  
> ---O---O---X1   ---O---O---X1---X2
> 
> Then when victim wants to fetch 'x' from the server, it would say
> 
> have X1, have Y2, have Y1, have O
> 
> and gets told to shut up by the server who heard enough.  The
> histories on these two parties will then become like this:
> 
> 
>    victimserver
> 
>  Y1---Y2   
> /  
> ---O---O---X1---X2  ---O---O---X1---X2

Then the server generates a commit X3 that lists Y2 as a parent, even
though it doesn't have Y2, and advances 'x' to X3.  The victim fetches
'x':

           victim                  server

 Y1---Y2                      (Y2)
/           \                         \ 
---O---O---X1---X2---X3   ---O---O---X1---X2---X3

Then the server rolls back 'x' to X2:

           victim                  server

             Y1---Y2
/           \
---O---O---X1---X2---X3   ---O---O---X1---X2

And the victim pushes:

           victim                  server

             Y1---Y2               Y1---Y2
/           \             /           \
---O---O---X1---X2---X3   ---O---O---X1---X2---X3

Now the server has the content of Y2.

If the victim is fetching and pulling a whole "directory" of refs, e.g:

fetch: refs/heads/*:refs/remotes/server1/*
push: refs/heads/for-server1/*:refs/heads/*

then instead of generating a merge commit, the server can just generate
another ref 'xx' pointing to Y2, assuming it can entice the victim to
set up a corresponding local branch refs/heads/for-server1/xx and push
it back.  Or if the victim is for some reason just mirroring back and
forth:

fetch: refs/heads/*:refs/heads/for-server1/*
push: refs/heads/for-
server1/*:refs/heads/*

then it doesn't have to set up a local branch as separate step.

Matt


Re: [PATCHv2 27/36] attr: convert to new threadsafe API

2016-10-28 Thread Stefan Beller
On Fri, Oct 28, 2016 at 3:06 PM, Junio C Hamano  wrote:
> Probably this needs to be squashed in, now the MinGW discussion has
> settled.

I was about to propose this (and resend it non-rebased).

So I do not resend, but rather ask you to squash this patch?

Thanks,
Stefan


Re: [PATCHv2 27/36] attr: convert to new threadsafe API

2016-10-28 Thread Junio C Hamano
Probably this needs to be squashed in, now the MinGW discussion has
settled.

 attr.c | 2 +-
 common-main.c  | 2 ++
 compat/mingw.c | 4 
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/attr.c b/attr.c
index 082b5ed343..961218a0d5 100644
--- a/attr.c
+++ b/attr.c
@@ -50,7 +50,7 @@ static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 #ifndef NO_PTHREADS
 
-static pthread_mutex_t attr_mutex = PTHREAD_MUTEX_INITIALIZER;
+static pthread_mutex_t attr_mutex;
 #define attr_lock()pthread_mutex_lock(_mutex)
 #define attr_unlock()  pthread_mutex_unlock(_mutex)
 void attr_start(void) { pthread_mutex_init(_mutex, NULL); }
diff --git a/common-main.c b/common-main.c
index 44a29e8b13..d4699cd404 100644
--- a/common-main.c
+++ b/common-main.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "exec_cmd.h"
+#include "attr.h"
 
 /*
  * Many parts of Git have subprograms communicate via pipe, expect the
@@ -32,6 +33,7 @@ int main(int argc, const char **argv)
sanitize_stdfds();
 
git_setup_gettext();
+   attr_start();
 
argv[0] = git_extract_argv0_path(argv[0]);
 
diff --git a/compat/mingw.c b/compat/mingw.c
index 51ed76326b..3fbfda5978 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -5,7 +5,6 @@
 #include "../strbuf.h"
 #include "../run-command.h"
 #include "../cache.h"
-#include "../attr.h"
 
 #define HCAST(type, handle) ((type)(intptr_t)handle)
 
@@ -2233,9 +2232,6 @@ void mingw_startup(void)
/* initialize critical section for waitpid pinfo_t list */
InitializeCriticalSection(_cs);
 
-   /* initialize critical sections in the attr code */
-   attr_start();
-
/* set up default file mode and file modes for stdin/out/err */
_fmode = _O_BINARY;
_setmode(_fileno(stdin), _O_BINARY);


Re: [PATCHv2 00/36] Revamp the attr subsystem!

2016-10-28 Thread Stefan Beller
On Fri, Oct 28, 2016 at 2:43 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> previous discussion at 
>> https://public-inbox.org/git/20161022233225.8883-1-sbel...@google.com
>>
>> This implements the discarded series':
>> jc/attr
>> jc/attr-more
>> sb/pathspec-label
>> sb/submodule-default-paths
>>
>> This includes
>> * The fixes for windows
>> * Junios latest suggestion to use git_attr_check_initv instead of
>>   alloc/append.
>>
>> * I implemented the thread safe attr API in patch 27 (attr: convert to new 
>> threadsafe API)
>> * patch 28 (attr: keep attr stack for each check) makes it actually possible
>>   to run in a multithreaded environment.
>> * I added a test for the multithreaded when it is introduced in patch 32
>>   (pathspec: allow querying for attributes) as well as a test to disallow
>>   multiple "attr"s in a pathspec.
>
> I'd appreciate if you didn't unnecessarily rebase the series.  It
> would make comparing the new round with the previous one a lot
> easier.
>
> Thanks.

I can resend on the original base if you want to; I'd need to reroll anyway
now that the agreement is to put the attr_start call not in the Windows
specific parts.


Re: Fetch/push lets a malicious server steal the targets of "have" lines

2016-10-28 Thread Junio C Hamano
Matt McCutchen  writes:

> I was studying the fetch protocol and I realized that in a scenario in
> which a client regularly fetches a set of refs from a server and pushes
> them back without careful scrutiny, the server can steal the targets of
> unrelated refs from the client repository by fabricating its own refs
> to the "have" objects specified by the client during the fetch.

Let me see if I understood your scenario correctly.

Suppose we start from this history where 'O' are common, your victim
has a 'Y' branch with two commits that are private to it, as well as
a 'X' branch on which it has X1 that it previously obtained from the
server.  On the other hand, the server does not know about Y1 or Y2,
and it added one commit X2 to the branch 'x' the victim is
following:

   victimserver

 Y1---Y2   
/  
---O---O---X1   ---O---O---X1---X2

Then when victim wants to fetch 'x' from the server, it would say

have X1, have Y2, have Y1, have O

and gets told to shut up by the server who heard enough.  The
histories on these two parties will then become like this:


   victimserver

 Y1---Y2   
/  
---O---O---X1---X2  ---O---O---X1---X2

Victim wishes to keep Y1 and Y2 private, but pushes some other
branch (perhaps builds X3 on top of X2 and pushes 'x').  On push
protocol, the server would lie to the victim that it has Y2 without
knowing what they are.

Is that how your attack scenario goes?


Re: [PATCHv2 00/36] Revamp the attr subsystem!

2016-10-28 Thread Junio C Hamano
Stefan Beller  writes:

> previous discussion at 
> https://public-inbox.org/git/20161022233225.8883-1-sbel...@google.com
>
> This implements the discarded series':
> jc/attr
> jc/attr-more
> sb/pathspec-label
> sb/submodule-default-paths
>
> This includes
> * The fixes for windows
> * Junios latest suggestion to use git_attr_check_initv instead of
>   alloc/append.
>
> * I implemented the thread safe attr API in patch 27 (attr: convert to new 
> threadsafe API)
> * patch 28 (attr: keep attr stack for each check) makes it actually possible
>   to run in a multithreaded environment.
> * I added a test for the multithreaded when it is introduced in patch 32
>   (pathspec: allow querying for attributes) as well as a test to disallow
>   multiple "attr"s in a pathspec.

I'd appreciate if you didn't unnecessarily rebase the series.  It
would make comparing the new round with the previous one a lot
easier.

Thanks.


Fetch/push lets a malicious server steal the targets of "have" lines

2016-10-28 Thread Matt McCutchen
I was studying the fetch protocol and I realized that in a scenario in
which a client regularly fetches a set of refs from a server and pushes
them back without careful scrutiny, the server can steal the targets of
unrelated refs from the client repository by fabricating its own refs
to the "have" objects specified by the client during the fetch.  This
is the reverse of attack #1 described in the "SECURITY" section of the
gitnamespaces(7) man page, with the addition that the server doesn't
have to know the object IDs in advance.  Is this supposed to be well-
known?  I've been using git since 2006 and it was a surprise to me.

Hopefully it isn't very common for a user to fetch and push with a
server they don't trust to have all the data in their repository.  I
don't think I have any such cases myself; I have unfinished work that
isn't meant for scrutiny by others, but nothing really damaging if it
were released to the server.  This attack presents no new risks if a
user already runs code fetched from the server in such a way that it
can read the repository.  But there might be some users who just review
embargoed security fixes from multiple sources (or something like that)
without running code themselves, and their security expectations might
be violated.

If my analysis is correct, I'd argue for documenting the issue in a
"SECURITY" section in the git-fetch man page.  Shall I submit a patch?

Thanks for your attention.

Matt


Re: [PATCHv2 28/36] attr: keep attr stack for each check

2016-10-28 Thread Junio C Hamano
Stefan Beller  writes:

> Instead of having a global attr stack, attach the stack to each check.
> This allows to use the attr in a multithreaded way.
>
>
>
> Signed-off-by: Stefan Beller 
> ---

>  attr.c| 101 
> +++---
>  attr.h|   4 ++-
>  hashmap.h |   2 ++
>  3 files changed, 69 insertions(+), 38 deletions(-)

This looks surprisingly simple ;-)  I like it.

I briefly wondered if the addition of lock/unlock surrounding
git_check_attrs() function belongs to [27/36], but that step is not
about making things thread-safe and is primarily to prepare existing
users to use an updated API that can be made thread-safe in later
steps.  This [28/36] is the step to have these---so the addition is
not out-of-space at all.  

Nicely done.

As this starts to pass a fully populated check object down to the
callchain that begins at bootstrap_attr_stack(), it makes it easier
to add the per-check optimization to read and keep only the relevant
entries from the attribute files later, by passing check also to the
read_attr_from_file() function.

The "set-direction" thing is not yet thread-safe, but I am not sure
what the best way to go there offhand.  It somehow feels unnecessary
to allow some thread to be going in the GIT_ATTR_CHECKIN direction
while others to be going in the GIT_ATTR_CHECKOUT direction, so we
probably can leave it at a lower priority for now.


Re: [PATCH] pre-receive.sample: Mark executable

2016-10-28 Thread Junio C Hamano
Anders Kaseorg  writes:

> For consistency with other hooks, allow the hook to be activated by
> renaming pre-receive.sample to pre-receive without a separate step to
> mark it executable.
>
> Signed-off-by: Anders Kaseorg 
> ---

Makes sense.  Thanks.


>  templates/hooks--pre-receive.sample | 0
>  1 file changed, 0 insertions(+), 0 deletions(-)
>  mode change 100644 => 100755 templates/hooks--pre-receive.sample
>
> diff --git a/templates/hooks--pre-receive.sample 
> b/templates/hooks--pre-receive.sample
> old mode 100644
> new mode 100755


RE: Can't get git to stop outputting to StdErr

2016-10-28 Thread Scott R. Chamberlain
This is talking to a visualstudio.com git repo. I will file a bug with them.

Scott Chamberlain
Software Engineer 
ImproMed, LLC
(800) 925-7171
www.impromed.com


-Original Message-
From: Junio C Hamano [mailto:gits...@pobox.com] 
Sent: Friday, October 28, 2016 4:07 PM
To: Scott R. Chamberlain 
Cc: git@vger.kernel.org
Subject: Re: Can't get git to stop outputting to StdErr

"Scott R. Chamberlain"  writes:

> The line I do is:
>
> git push -q binaryRepo HEAD:"$Env:BUILD_SOURCEBRANCH"

This would 

 (1) squelch the output from the sending side (i.e. local), and

 (2) ask "quiet" to the receiving side (i.e. remote), if they know
 how to be quiet.

> But I get the following in my log after the build
>
> 2016-10-28T20:05:32.3179442Z ##[error]remote: 
> remote: Analyzing objects... (3/3) (657 ms)
> remote: Storing packfile... done (40 ms)
> remote: Storing index... done (42 ms)

These three lines prefixed with "remote:" are coming from the software that 
runs on the remote machine that accepts your push, but the way it says these 
three things do not look familiar to me.  Is it possible that the remote 
machine is running a Git server that is not ours, which lacks the support for 
"quiet" capability?  If that is the case, the symptom is understandable.

A quick archive search tells me that you are seeing the same issue as this one:

https://public-inbox.org/git/20160516133731.ga6...@sigill.intra.peff.net/

where the concluding remark, to which I agree, is:

The server side here is clearly not stock git, from the content
of those progress messages (some googling shows it looks like
whatever visualstudio.com is running, but I don't know what that
is). So either the server implementation doesn't support the
"quiet" protocol extension, or it is ignoring it. It might be
worth filing a bug with them.



-- 
Rely On Us.
ImproMed LLC
Henry Schein Animal Health
--



Re: Can't get git to stop outputting to StdErr

2016-10-28 Thread Junio C Hamano
"Scott R. Chamberlain"  writes:

> The line I do is:
>
> git push -q binaryRepo HEAD:"$Env:BUILD_SOURCEBRANCH"

This would 

 (1) squelch the output from the sending side (i.e. local), and

 (2) ask "quiet" to the receiving side (i.e. remote), if they know
 how to be quiet.

> But I get the following in my log after the build
>
> 2016-10-28T20:05:32.3179442Z ##[error]remote: 
> remote: Analyzing objects... (3/3) (657 ms)
> remote: Storing packfile... done (40 ms)
> remote: Storing index... done (42 ms)

These three lines prefixed with "remote:" are coming from the
software that runs on the remote machine that accepts your push, but
the way it says these three things do not look familiar to me.  Is
it possible that the remote machine is running a Git server that is
not ours, which lacks the support for "quiet" capability?  If that
is the case, the symptom is understandable.

A quick archive search tells me that you are seeing the same issue
as this one:

https://public-inbox.org/git/20160516133731.ga6...@sigill.intra.peff.net/

where the concluding remark, to which I agree, is:

The server side here is clearly not stock git, from the content
of those progress messages (some googling shows it looks like
whatever visualstudio.com is running, but I don't know what that
is). So either the server implementation doesn't support the
"quiet" protocol extension, or it is ignoring it. It might be
worth filing a bug with them.




Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-28 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 28.10.2016 um 22:29 schrieb Junio C Hamano:
>> Johannes Sixt  writes:
>>
>>> Another problem with the proposed patch is that there is no
>>> declaration for attr_start() before the call in compat/mingw.c. We
>>> would have to move the declaration of attr_start() to cache.h (for
>>> example), because #including attr.h in compat/mingw.c is plainly
>>> wrong. However, it would not be a major offense to #include attr.h in
>>> common-main.c. But when we do that, we can certainly spare the few
>>> cycles to call pthread_mutex_init.
>>
>> That sounds like a good argument to have it in common-main.c.
>>
>> Would it mean that the code that defines the mutex needs to have
>> #ifdef that defines a no-op attr_start() and defines the mutex with
>> PTHREAD_MUTEX_INITILIZER with #else that just defines the mutex
>> without initializatin, with the real attr_start(), though?
>
> No. My intent was to call pthread_mutex_init for all platforms.

Ah, OK, so there was no magic plan.  That's OK.



Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-28 Thread Johannes Sixt

Am 28.10.2016 um 22:29 schrieb Junio C Hamano:

Johannes Sixt  writes:


Another problem with the proposed patch is that there is no
declaration for attr_start() before the call in compat/mingw.c. We
would have to move the declaration of attr_start() to cache.h (for
example), because #including attr.h in compat/mingw.c is plainly
wrong. However, it would not be a major offense to #include attr.h in
common-main.c. But when we do that, we can certainly spare the few
cycles to call pthread_mutex_init.


That sounds like a good argument to have it in common-main.c.

Would it mean that the code that defines the mutex needs to have
#ifdef that defines a no-op attr_start() and defines the mutex with
PTHREAD_MUTEX_INITILIZER with #else that just defines the mutex
without initializatin, with the real attr_start(), though?


No. My intent was to call pthread_mutex_init for all platforms. We 
already call open("/dev/null") unconditionally on every program startup. 
The few cycles spent in pthread_mutex_init should be the least of our 
worries.


-- Hannes



Can't get git to stop outputting to StdErr

2016-10-28 Thread Scott R. Chamberlain
I am working on some build scripts that get run on TFS. During the build 
process I need to check in the changes that where done during the build process 
(A set of binaries other projects depend on).

I would really like to leave the option "Fail on Standard Error" enabled for 
the script in TFS, however my push keeps writing to standard error even though 
I told it not to.

The line I do is:

git push -q binaryRepo HEAD:"$Env:BUILD_SOURCEBRANCH"

But I get the following in my log after the build

2016-10-28T20:05:32.3179442Z ##[error]remote: 
remote: Analyzing objects... (3/3) (657 ms)
remote: Storing packfile... done (40 ms)
remote: Storing index... done (42 ms)

2016-10-28T20:05:32.3209423Z Done
2016-10-28T20:05:32.4019436Z ##[error]Process completed with exit code 0 
and had 1 error(s) written to the error stream.
2016-10-28T20:05:32.4029436Z ##[debug]System.Exception: Process completed 
with exit code 0 and had 1 error(s) written to the error stream.

Why am I still getting output to standard error when I included the `-q` switch?

For reference, `git version` reports `2.10.0.windows.1` and HEAD is a detached 
HEAD.


Scott Chamberlain
Software Engineer 
ImproMed, LLC



-- 
Rely On Us.
ImproMed LLC
Henry Schein Animal Health
--



Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-28 Thread Junio C Hamano
Stefan Beller  writes:

> On Fri, Oct 28, 2016 at 1:29 PM, Junio C Hamano  wrote:
>> Johannes Sixt  writes:
>>
>>> Another problem with the proposed patch is that there is no
>>> declaration for attr_start() before the call in compat/mingw.c. We
>>> would have to move the declaration of attr_start() to cache.h (for
>>> example), because #including attr.h in compat/mingw.c is plainly
>>> wrong. However, it would not be a major offense to #include attr.h in
>>> common-main.c. But when we do that, we can certainly spare the few
>>> cycles to call pthread_mutex_init.
>>
>> That sounds like a good argument to have it in common-main.c.
>
> If we're going that route, I would get rid of PTHREAD_MUTEX_INITILIZER
> and call a pthread_mutex_init platform independently.

Yup, but earlier j6t was trying hard not to penalize any single
platform, and that would certainly penalize the ones with static
initialization, so I was hoping to hear if there is a clever
workaround to avoid that.

>> Would it mean that the code that defines the mutex needs to have
>> #ifdef that defines a no-op attr_start() and defines the mutex with
>> PTHREAD_MUTEX_INITILIZER with #else that just defines the mutex
>> without initializatin, with the real attr_start(), though?


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-28 Thread Stefan Beller
On Fri, Oct 28, 2016 at 1:29 PM, Junio C Hamano  wrote:
> Johannes Sixt  writes:
>
>> Another problem with the proposed patch is that there is no
>> declaration for attr_start() before the call in compat/mingw.c. We
>> would have to move the declaration of attr_start() to cache.h (for
>> example), because #including attr.h in compat/mingw.c is plainly
>> wrong. However, it would not be a major offense to #include attr.h in
>> common-main.c. But when we do that, we can certainly spare the few
>> cycles to call pthread_mutex_init.
>
> That sounds like a good argument to have it in common-main.c.

If we're going that route, I would get rid of PTHREAD_MUTEX_INITILIZER
and call a pthread_mutex_init platform independently.

>
> Would it mean that the code that defines the mutex needs to have
> #ifdef that defines a no-op attr_start() and defines the mutex with
> PTHREAD_MUTEX_INITILIZER with #else that just defines the mutex
> without initializatin, with the real attr_start(), though?


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-28 Thread Junio C Hamano
Johannes Sixt  writes:

> Another problem with the proposed patch is that there is no
> declaration for attr_start() before the call in compat/mingw.c. We
> would have to move the declaration of attr_start() to cache.h (for
> example), because #including attr.h in compat/mingw.c is plainly
> wrong. However, it would not be a major offense to #include attr.h in
> common-main.c. But when we do that, we can certainly spare the few
> cycles to call pthread_mutex_init.

That sounds like a good argument to have it in common-main.c.  

Would it mean that the code that defines the mutex needs to have
#ifdef that defines a no-op attr_start() and defines the mutex with
PTHREAD_MUTEX_INITILIZER with #else that just defines the mutex
without initializatin, with the real attr_start(), though?


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-28 Thread Johannes Sixt

Am 28.10.2016 um 20:48 schrieb Jacob Keller:

On Fri, Oct 28, 2016 at 4:58 AM, Johannes Schindelin
 wrote:

Hi Jake,

On Thu, 27 Oct 2016, Jacob Keller wrote:


I agree with Stefan that there isn't really a great place to put a
dynamic initialization.


Ummm. Wait. What???

https://github.com/git/git/blob/v2.10.1/common-main.c#L25-L41


I think Stefan has since provided a better suggestion of isolating the
dynamic initialization into the MINGW startup section. However, you
are right either place works, though I think platforms which support
static initialization should use it.


It's a pity that this new patch introduces the first use of 
PTHREAD_MUTEX_INITILIZER. (I do not count the one in compat/nedmalloc, 
it's in platform specific code.) We have to introduce a dummy definition 
on Windows to even compile the code.


Now we have a double-edged sword: Other coders who are not aware of this 
thread might think it works. But it does not.


Another problem with the proposed patch is that there is no declaration 
for attr_start() before the call in compat/mingw.c. We would have to 
move the declaration of attr_start() to cache.h (for example), because 
#including attr.h in compat/mingw.c is plainly wrong. However, it would 
not be a major offense to #include attr.h in common-main.c. But when we 
do that, we can certainly spare the few cycles to call pthread_mutex_init.


-- Hannes



Re: [PATCH 5/5] grep: enable recurse-submodules to work on objects

2016-10-28 Thread Brandon Williams
On 10/27, Brandon Williams wrote:
> diff --git a/tree-walk.c b/tree-walk.c
> index 828f4356b..b3f996174 100644
> --- a/tree-walk.c
> +++ b/tree-walk.c
> @@ -999,10 +999,11 @@ static enum interesting do_match(const struct 
> name_entry *entry,
>   return entry_interesting;
>  
>   /*
> -  * Match all directories. We'll try to
> -  * match files later on.
> +  * Match all directories and gitlinks. We'll
> +  * try to match files later on.
>*/
> - if (ps->recursive && S_ISDIR(entry->mode))
> + if (ps->recursive && (S_ISDIR(entry->mode) ||
> +   S_ISGITLINK(entry->mode)))
>   return entry_interesting;
>   }
>  
> @@ -1043,13 +1044,13 @@ static enum interesting do_match(const struct 
> name_entry *entry,
>   strbuf_setlen(base, base_offset + baselen);
>  
>   /*
> -  * Match all directories. We'll try to match files
> -  * later on.
> -  * max_depth is ignored but we may consider support it
> -  * in future, see
> +  * Match all directories and gitlinks. We'll try to match files
> +  * later on.  max_depth is ignored but we may consider support
> +  * it in future, see
>* 
> http://thread.gmane.org/gmane.comp.version-control.git/163757/focus=163840
>*/
> - if (ps->recursive && S_ISDIR(entry->mode))
> + if (ps->recursive && (S_ISDIR(entry->mode) ||
> +   S_ISGITLINK(entry->mode)))
>   return entry_interesting;
>   }
>   return never_interesting; /* No matches */

Looks like this change actually breaks a test in t4010-diff-pathspec.sh.
I think I'll have to add a flag to optionally let through submodules as
apposed to just treating them like directories.

-- 
Brandon Williams


[PATCH] push: do not use potentially ambiguous default refspec

2016-10-28 Thread Junio C Hamano
When the user does the lazy "git push" with no parameters with
push.default set to either "upstream", "simple" or "current",
we internally generated a refspec that has the current branch name
on the source side and used it to push.

However, the branch name (say "test") may be an ambiguous refname in
the context of the source repository---there may be a tag with the
same name, for example.  This would trigger an unnecessary error
without any fault on the end-user's side.

Be explicit and give a full refname as the source side to avoid the
ambiguity.  The destination side when pushing with the "current"
sent only the name of the branch and forcing the receiving end to
guess, which is the same issue.  Be explicit there as well.

Reported-by: Kannan Goundan 
Signed-off-by: Junio C Hamano 
---

 * It is appreciated if somebody with spare cycles can add a test or
   two for this in t/t5523-push-upstream.sh or somewhere nearby.

 builtin/push.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/builtin/push.c b/builtin/push.c
index 4e9e4dbab2..00a1b68483 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -194,15 +194,18 @@ static void setup_push_upstream(struct remote *remote, 
struct branch *branch,
die_push_simple(branch, remote);
}
 
-   strbuf_addf(, "%s:%s", branch->name, branch->merge[0]->src);
+   strbuf_addf(, "%s:%s", branch->refname, branch->merge[0]->src);
add_refspec(refspec.buf);
 }
 
 static void setup_push_current(struct remote *remote, struct branch *branch)
 {
+   struct strbuf refspec = STRBUF_INIT;
+
if (!branch)
die(_(message_detached_head_die), remote->name);
-   add_refspec(branch->name);
+   strbuf_addf(, "%s:%s", branch->refname, branch->refname);
+   add_refspec(refspec.buf);
 }
 
 static int is_workflow_triangular(struct remote *remote)
-- 
2.10.2-722-gd3f6888e25



Re: "git push" says "src refspec XYZ matches more than one" even without explicit XYZ argument.

2016-10-28 Thread Junio C Hamano
Kannan Goundan  writes:

> 1. My repo has a branch named 'v1' that is tracking 'origin/v1'.
> 2. My repo has a tag named 'v1'.
> 3. I have "push.default" set to "upstream".
>
> I made a commit on branch 'v1' and tried doing a push:
>
> # git push
> error: src refspec v1 matches more than one.
> error: failed to push some refs to 'g...@github.com:whatever/ns1-go.git'
>
> If I rename the branch to 'v1-dev', then the push goes through.
>
> I understand why the command "git push origin/v1 v1" is ambiguous.
> But if I do a plain "git push", I thought Git would know to push my
> current branch.
>
> [Git version 2.10.1 from Homebrew on Mac OS 10.11.6.]

Thanks.  

You are right that the refspec Git internally create for this case
should not be v1:refs/heads/v1, which would notice that the source
side (e.g. "v1") is ambiguous.  Instead we should spell that out.

Perhaps something like this would fix it for you?

 builtin/push.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/push.c b/builtin/push.c
index 3bb9d6b7e6..02fd235742 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -194,7 +194,7 @@ static void setup_push_upstream(struct remote *remote, 
struct branch *branch,
die_push_simple(branch, remote);
}
 
-   strbuf_addf(, "%s:%s", branch->name, branch->merge[0]->src);
+   strbuf_addf(, "%s:%s", branch->refname, branch->merge[0]->src);
add_refspec(refspec.buf);
 }
 


[PATCHv2 12/36] attr: convert git_all_attrs() to use "struct git_attr_check"

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

This updates the other two ways the attribute check is done via an
array of "struct git_attr_check_elem" elements.  These two niches
appear only in "git check-attr".

 * The caller does not know offhand what attributes it wants to ask
   about and cannot use git_attr_check_initl() to prepare the
   git_attr_check structure.

 * The caller may not know what attributes it wants to ask at all,
   and instead wants to learn everything that the given path has.

Such a caller can call git_attr_check_alloc() to allocate an empty
git_attr_check, and then call git_attr_check_append() to add
attribute names one by one.  A new attribute can be appended until
git_attr_check structure is "finalized", which happens when it is
used to ask for attributes for any path by calling git_check_attr()
or git_all_attrs().  A git_attr_check structure that is initialized
by git_attr_check_initl() is already finalized when it is returned.

I am not at all happy with the way git_all_attrs() API turned out to
be, but it is only to support one niche caller ("check-attr --all"),
so I'll stop here for now.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c   | 75 ++--
 attr.h   | 16 ++-
 builtin/check-attr.c | 51 ++-
 3 files changed, 90 insertions(+), 52 deletions(-)

diff --git a/attr.c b/attr.c
index 20d64b34f4..7ffa19875b 100644
--- a/attr.c
+++ b/attr.c
@@ -728,6 +728,11 @@ static int macroexpand_one(int nr, int rem)
return rem;
 }
 
+static int attr_check_is_dynamic(const struct git_attr_check *check)
+{
+   return (void *)(check->check) != (void *)(check + 1);
+}
+
 /*
  * Collect attributes for path into the array pointed to by
  * check_all_attr. If num is non-zero, only attributes in check[] are
@@ -793,32 +798,21 @@ int git_check_attrs(const char *path, int num, struct 
git_attr_check_elem *check
return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct git_attr_check_elem 
**check)
+void git_all_attrs(const char *path, struct git_attr_check *check)
 {
-   int i, count, j;
+   int i;
 
+   git_attr_check_clear(check);
collect_some_attrs(path, 0, NULL);
 
-   /* Count the number of attributes that are set. */
-   count = 0;
-   for (i = 0; i < attr_nr; i++) {
-   const char *value = check_all_attr[i].value;
-   if (value != ATTR__UNSET && value != ATTR__UNKNOWN)
-   ++count;
-   }
-   *num = count;
-   ALLOC_ARRAY(*check, count);
-   j = 0;
for (i = 0; i < attr_nr; i++) {
+   const char *name = check_all_attr[i].attr->name;
const char *value = check_all_attr[i].value;
-   if (value != ATTR__UNSET && value != ATTR__UNKNOWN) {
-   (*check)[j].attr = check_all_attr[i].attr;
-   (*check)[j].value = value;
-   ++j;
-   }
+   if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
+   continue;
+   git_attr_check_append(check, git_attr(name));
+   check->check[check->check_nr - 1].value = value;
}
-
-   return 0;
 }
 
 void git_attr_set_direction(enum git_attr_direction new, struct index_state 
*istate)
@@ -836,6 +830,7 @@ void git_attr_set_direction(enum git_attr_direction new, 
struct index_state *ist
 
 int git_check_attr(const char *path, struct git_attr_check *check)
 {
+   check->finalized = 1;
return git_check_attrs(path, check->check_nr, check->check);
 }
 
@@ -853,17 +848,57 @@ struct git_attr_check *git_attr_check_initl(const char 
*one, ...)
check = xcalloc(1,
sizeof(*check) + cnt * sizeof(*(check->check)));
check->check_nr = cnt;
+   check->finalized = 1;
check->check = (struct git_attr_check_elem *)(check + 1);
 
check->check[0].attr = git_attr(one);
va_start(params, one);
for (cnt = 1; cnt < check->check_nr; cnt++) {
+   struct git_attr *attr;
param = va_arg(params, const char *);
if (!param)
die("BUG: counted %d != ended at %d",
check->check_nr, cnt);
-   check->check[cnt].attr = git_attr(param);
+   attr = git_attr(param);
+   if (!attr)
+   die("BUG: %s: not a valid attribute name", param);
+   check->check[cnt].attr = attr;
}
va_end(params);
return check;
 }
+
+struct git_attr_check *git_attr_check_alloc(void)
+{
+   return xcalloc(1, sizeof(struct git_attr_check));
+}
+
+struct git_attr_check_elem *git_attr_check_append(struct git_attr_check *check,
+ const struct git_attr *attr)

[PATCHv2 11/36] attr: (re)introduce git_check_attr() and struct git_attr_check

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

A common pattern to check N attributes for many paths is to

 (1) prepare an array A of N git_attr_check_elem items;
 (2) call git_attr() to intern the N attribute names and fill A;
 (3) repeatedly call git_check_attrs() for path with N and A;

A look-up for these N attributes for a single path P scans the
entire attr_stack, starting from the .git/info/attributes file and
then .gitattributes file in the directory the path P is in, going
upwards to find .gitattributes file found in parent directories.

An earlier commit 06a604e6 (attr: avoid heavy work when we know the
specified attr is not defined, 2014-12-28) tried to optimize out
this scanning for one trivial special case: when the attribute being
sought is known not to exist, we do not have to scan for it.  While
this may be a cheap and effective heuristic, it would not work well
when N is (much) more than 1.

What we would want is a more customized way to skip irrelevant
entries in the attribute stack, and the definition of irrelevance
is tied to the set of attributes passed to git_check_attrs() call,
i.e. the set of attributes being sought.  The data necessary for
this optimization needs to live alongside the set of attributes, but
a simple array of git_attr_check_elem simply does not have any place
for that.

Introduce "struct git_attr_check" that contains N, the number of
attributes being sought, and A, the array that holds N
git_attr_check_elem items, and a function git_check_attr() that
takes a path P and this structure as its parameters.  This structure
can later be extended to hold extra data necessary for optimization.

Also, to make it easier to write the first two steps in common
cases, introduce git_attr_check_initl() helper function, which takes
a NULL-terminated list of attribute names and initialize this
structure.

As an illustration of this new API, convert archive.c that asks for
export-subst and export-ignore attributes for each paths.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 archive.c | 24 ++--
 attr.c| 34 ++
 attr.h|  9 +
 3 files changed, 49 insertions(+), 18 deletions(-)

diff --git a/archive.c b/archive.c
index 2dc8d6ca57..11e3951371 100644
--- a/archive.c
+++ b/archive.c
@@ -87,19 +87,6 @@ void *sha1_file_to_archive(const struct archiver_args *args,
return buffer;
 }
 
-static void setup_archive_check(struct git_attr_check_elem *check)
-{
-   static struct git_attr *attr_export_ignore;
-   static struct git_attr *attr_export_subst;
-
-   if (!attr_export_ignore) {
-   attr_export_ignore = git_attr("export-ignore");
-   attr_export_subst = git_attr("export-subst");
-   }
-   check[0].attr = attr_export_ignore;
-   check[1].attr = attr_export_subst;
-}
-
 struct directory {
struct directory *up;
struct object_id oid;
@@ -123,7 +110,7 @@ static int write_archive_entry(const unsigned char *sha1, 
const char *base,
struct archiver_context *c = context;
struct archiver_args *args = c->args;
write_archive_entry_fn_t write_entry = c->write_entry;
-   struct git_attr_check_elem check[2];
+   static struct git_attr_check *check;
const char *path_without_prefix;
int err;
 
@@ -137,11 +124,12 @@ static int write_archive_entry(const unsigned char *sha1, 
const char *base,
strbuf_addch(, '/');
path_without_prefix = path.buf + args->baselen;
 
-   setup_archive_check(check);
-   if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
-   if (ATTR_TRUE(check[0].value))
+   if (!check)
+   check = git_attr_check_initl("export-ignore", "export-subst", 
NULL);
+   if (!git_check_attr(path_without_prefix, check)) {
+   if (ATTR_TRUE(check->check[0].value))
return 0;
-   args->convert = ATTR_TRUE(check[1].value);
+   args->convert = ATTR_TRUE(check->check[1].value);
}
 
if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
diff --git a/attr.c b/attr.c
index ff7f0a59eb..20d64b34f4 100644
--- a/attr.c
+++ b/attr.c
@@ -833,3 +833,37 @@ void git_attr_set_direction(enum git_attr_direction new, 
struct index_state *ist
drop_attr_stack();
use_index = istate;
 }
+
+int git_check_attr(const char *path, struct git_attr_check *check)
+{
+   return git_check_attrs(path, check->check_nr, check->check);
+}
+
+struct git_attr_check *git_attr_check_initl(const char *one, ...)
+{
+   struct git_attr_check *check;
+   int cnt;
+   va_list params;
+   const char *param;
+
+   va_start(params, one);
+   for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
+   ;
+   va_end(params);
+   check = xcalloc(1,
+   sizeof(*check) + cnt * 

[PATCHv2 07/36] attr.c: simplify macroexpand_one()

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

The double-loop wants to do an early return immediately when one
matching macro is found.  Eliminate the extra variable 'a' used for
that purpose and rewrite the "assign the found item to 'a' to make
it non-NULL and force the loop(s) to terminate" with a direct return
from there.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/attr.c b/attr.c
index 17297fffee..e42f931b35 100644
--- a/attr.c
+++ b/attr.c
@@ -705,24 +705,21 @@ static int fill(const char *path, int pathlen, int 
basename_offset,
 static int macroexpand_one(int nr, int rem)
 {
struct attr_stack *stk;
-   struct match_attr *a = NULL;
int i;
 
if (check_all_attr[nr].value != ATTR__TRUE ||
!check_all_attr[nr].attr->maybe_macro)
return rem;
 
-   for (stk = attr_stack; !a && stk; stk = stk->prev)
-   for (i = stk->num_matches - 1; !a && 0 <= i; i--) {
+   for (stk = attr_stack; stk; stk = stk->prev) {
+   for (i = stk->num_matches - 1; 0 <= i; i--) {
struct match_attr *ma = stk->attrs[i];
if (!ma->is_macro)
continue;
if (ma->u.attr->attr_nr == nr)
-   a = ma;
+   return fill_one("expand", ma, rem);
}
-
-   if (a)
-   rem = fill_one("expand", a, rem);
+   }
 
return rem;
 }
-- 
2.10.1.714.ge3da0db



[PATCHv2 09/36] attr.c: plug small leak in parse_attr_line()

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

If any error is noticed after the match_attr structure is allocated,
we shouldn't just return NULL from this function.

Add a fail_return label that frees the allocated structure and
returns NULL, and consistently jump there when we want to return
NULL after cleaning up.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/attr.c b/attr.c
index f7cf7ae306..d180c7833e 100644
--- a/attr.c
+++ b/attr.c
@@ -223,7 +223,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
if (!macro_ok) {
fprintf(stderr, "%s not allowed: %s:%d\n",
name, src, lineno);
-   return NULL;
+   goto fail_return;
}
is_macro = 1;
name += strlen(ATTRIBUTE_MACRO_PREFIX);
@@ -233,7 +233,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
fprintf(stderr,
"%.*s is not a valid attribute name: %s:%d\n",
namelen, name, src, lineno);
-   return NULL;
+   goto fail_return;
}
}
else
@@ -246,7 +246,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
for (cp = states, num_attr = 0; *cp; num_attr++) {
cp = parse_attr(src, lineno, cp, NULL);
if (!cp)
-   return NULL;
+   goto fail_return;
}
 
res = xcalloc(1,
@@ -267,7 +267,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
if (res->u.pat.flags & EXC_FLAG_NEGATIVE) {
warning(_("Negative patterns are ignored in git 
attributes\n"
  "Use '\\!' for literal leading 
exclamation."));
-   return NULL;
+   goto fail_return;
}
}
res->is_macro = is_macro;
@@ -283,6 +283,10 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
}
 
return res;
+
+fail_return:
+   free(res);
+   return NULL;
 }
 
 /*
-- 
2.10.1.714.ge3da0db



[PATCHv2 25/36] attr.c: outline the future plans by heavily commenting

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 40 +++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 92b3130f1e..10f2042fbb 100644
--- a/attr.c
+++ b/attr.c
@@ -30,6 +30,11 @@ static const char git_attr__unknown[] = "(builtin)unknown";
 #define DEBUG_ATTR 0
 #endif
 
+/*
+ * NEEDSWORK: the global dictionary of the interned attributes
+ * must stay a singleton even after we become thread-ready.
+ * Access to these must be surrounded with mutex when it happens.
+ */
 struct git_attr {
struct git_attr *next;
unsigned h;
@@ -39,10 +44,19 @@ struct git_attr {
char name[FLEX_ARRAY];
 };
 static int attr_nr;
+static struct git_attr *(git_attr_hash[HASHSIZE]);
+
+/*
+ * NEEDSWORK: maybe-real, maybe-macro are not property of
+ * an attribute, as it depends on what .gitattributes are
+ * read.  Once we introduce per git_attr_check attr_stack
+ * and check_all_attr, the optimization based on them will
+ * become unnecessary and can go away.  So is this variable.
+ */
 static int cannot_trust_maybe_real;
 
+/* NEEDSWORK: This will become per git_attr_check */
 static struct git_attr_check_elem *check_all_attr;
-static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 const char *git_attr_name(const struct git_attr *attr)
 {
@@ -117,6 +131,11 @@ struct git_attr *git_attr_counted(const char *name, size_t 
len)
a->maybe_real = 0;
git_attr_hash[pos] = a;
 
+   /*
+* NEEDSWORK: per git_attr_check check_all_attr
+* will be initialized a lot more lazily, not
+* like this, and not here.
+*/
REALLOC_ARRAY(check_all_attr, attr_nr);
check_all_attr[a->attr_nr].attr = a;
check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
@@ -329,6 +348,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
+/* NEEDSWORK: This will become per git_attr_check */
 static struct attr_stack {
struct attr_stack *prev;
char *origin;
@@ -393,6 +413,24 @@ static struct attr_stack *read_attr_from_array(const char 
**list)
return res;
 }
 
+/*
+ * NEEDSWORK: these two are tricky.  The callers assume there is a
+ * single, system-wide global state "where we read attributes from?"
+ * and when the state is flipped by calling git_attr_set_direction(),
+ * attr_stack is discarded so that subsequent attr_check will lazily
+ * read from the right place.  And they do not know or care who called
+ * by them uses the attribute subsystem, hence have no knowledge of
+ * existing git_attr_check instances or future ones that will be
+ * created).
+ *
+ * Probably we need a thread_local that holds these two variables,
+ * and a list of git_attr_check instances (which need to be maintained
+ * by hooking into git_attr_check_alloc(), git_attr_check_initl(), and
+ * git_attr_check_clear().  Then git_attr_set_direction() updates the
+ * fields in that thread_local for these two variables, iterate over
+ * all the active git_attr_check instances and discard the attr_stack
+ * they hold.  Yuck, but it sounds doable.
+ */
 static enum git_attr_direction direction;
 static struct index_state *use_index;
 
-- 
2.10.1.714.ge3da0db



[PATCHv2 15/36] attr: add counted string version of git_check_attr()

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Often a potential caller has  pair that
represents the path it wants to ask attributes for; when
path[pathlen] is not NUL, the caller has to xmemdupz()
only to call git_check_attr().

Add git_check_attr_counted() that takes such a counted
string instead of "const char *path".

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 23 ++-
 attr.h |  1 +
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/attr.c b/attr.c
index 30953c905b..dc9d61b411 100644
--- a/attr.c
+++ b/attr.c
@@ -738,20 +738,19 @@ static int attr_check_is_dynamic(const struct 
git_attr_check *check)
  * check_all_attr. If num is non-zero, only attributes in check[] are
  * collected. Otherwise all attributes are collected.
  */
-static void collect_some_attrs(const char *path, int num,
+static void collect_some_attrs(const char *path, int pathlen, int num,
   struct git_attr_check_elem *check)
 
 {
struct attr_stack *stk;
-   int i, pathlen, rem, dirlen;
+   int i, rem, dirlen;
const char *cp, *last_slash = NULL;
int basename_offset;
 
-   for (cp = path; *cp; cp++) {
+   for (cp = path; cp < path + pathlen; cp++) {
if (*cp == '/' && cp[1])
last_slash = cp;
}
-   pathlen = cp - path;
if (last_slash) {
basename_offset = last_slash + 1 - path;
dirlen = last_slash - path;
@@ -782,12 +781,12 @@ static void collect_some_attrs(const char *path, int num,
rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-static int git_check_attrs(const char *path, int num,
+static int git_check_attrs(const char *path, int pathlen, int num,
   struct git_attr_check_elem *check)
 {
int i;
 
-   collect_some_attrs(path, num, check);
+   collect_some_attrs(path, pathlen, num, check);
 
for (i = 0; i < num; i++) {
const char *value = 
check_all_attr[check[i].attr->attr_nr].value;
@@ -804,7 +803,7 @@ void git_all_attrs(const char *path, struct git_attr_check 
*check)
int i;
 
git_attr_check_clear(check);
-   collect_some_attrs(path, 0, NULL);
+   collect_some_attrs(path, strlen(path), 0, NULL);
 
for (i = 0; i < attr_nr; i++) {
const char *name = check_all_attr[i].attr->name;
@@ -829,10 +828,16 @@ void git_attr_set_direction(enum git_attr_direction new, 
struct index_state *ist
use_index = istate;
 }
 
-int git_check_attr(const char *path, struct git_attr_check *check)
+int git_check_attr_counted(const char *path, int pathlen,
+  struct git_attr_check *check)
 {
check->finalized = 1;
-   return git_check_attrs(path, check->check_nr, check->check);
+   return git_check_attrs(path, pathlen, check->check_nr, check->check);
+}
+
+int git_check_attr(const char *path, struct git_attr_check *check)
+{
+   return git_check_attr_counted(path, strlen(path), check);
 }
 
 struct git_attr_check *git_attr_check_initl(const char *one, ...)
diff --git a/attr.h b/attr.h
index 506db0ca74..c84f164b8e 100644
--- a/attr.h
+++ b/attr.h
@@ -38,6 +38,7 @@ struct git_attr_check {
 
 extern struct git_attr_check *git_attr_check_initl(const char *, ...);
 extern int git_check_attr(const char *path, struct git_attr_check *);
+extern int git_check_attr_counted(const char *, int, struct git_attr_check *);
 
 extern struct git_attr_check *git_attr_check_alloc(void);
 extern struct git_attr_check_elem *git_attr_check_append(struct git_attr_check 
*, const struct git_attr *);
-- 
2.10.1.714.ge3da0db



[PATCHv2 18/36] attr: support quoting pathname patterns in C style

2016-10-28 Thread Stefan Beller
From: Nguyễn Thái Ngọc Duy 

Full pattern must be quoted. So 'pat"t"ern attr' will give exactly
'pat"t"ern', not 'pattern'. Also clarify that leading whitespaces are
not part of the pattern and document comment syntax.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 Documentation/gitattributes.txt |  8 +---
 attr.c  | 15 +--
 t/t0003-attributes.sh   | 26 ++
 3 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 7aff940202..8a061af0cc 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -21,9 +21,11 @@ Each line in `gitattributes` file is of form:
pattern attr1 attr2 ...
 
 That is, a pattern followed by an attributes list,
-separated by whitespaces.  When the pattern matches the
-path in question, the attributes listed on the line are given to
-the path.
+separated by whitespaces. Leading and trailing whitespaces are
+ignored. Lines that begin with '#' are ignored. Patterns
+that begin with a double quote are quoted in C style.
+When the pattern matches the path in question, the attributes
+listed on the line are given to the path.
 
 Each attribute can be in one of these states for a given path:
 
diff --git a/attr.c b/attr.c
index 33021cc857..c8c4936f36 100644
--- a/attr.c
+++ b/attr.c
@@ -13,6 +13,7 @@
 #include "attr.h"
 #include "dir.h"
 #include "utf8.h"
+#include "quote.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -225,12 +226,21 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
const char *cp, *name, *states;
struct match_attr *res = NULL;
int is_macro;
+   struct strbuf pattern = STRBUF_INIT;
 
cp = line + strspn(line, blank);
if (!*cp || *cp == '#')
return NULL;
name = cp;
-   namelen = strcspn(name, blank);
+
+   if (*cp == '"' && !unquote_c_style(, name, )) {
+   name = pattern.buf;
+   namelen = pattern.len;
+   } else {
+   namelen = strcspn(name, blank);
+   states = name + namelen;
+   }
+
if (strlen(ATTRIBUTE_MACRO_PREFIX) < namelen &&
starts_with(name, ATTRIBUTE_MACRO_PREFIX)) {
if (!macro_ok) {
@@ -250,7 +260,6 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
else
is_macro = 0;
 
-   states = name + namelen;
states += strspn(states, blank);
 
/* First pass to count the attr_states */
@@ -293,9 +302,11 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
cannot_trust_maybe_real = 1;
}
 
+   strbuf_release();
return res;
 
 fail_return:
+   strbuf_release();
free(res);
return NULL;
 }
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index f0fbb42554..f19ae4f8cc 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -13,10 +13,31 @@ attr_check () {
test_line_count = 0 err
 }
 
+attr_check_quote () {
+
+   path="$1"
+   quoted_path="$2"
+   expect="$3"
+
+   git check-attr test -- "$path" >actual &&
+   echo "\"$quoted_path\": test: $expect" >expect &&
+   test_cmp expect actual
+
+}
+
+test_expect_success 'open-quoted pathname' '
+   echo "\"a test=a" >.gitattributes &&
+   test_must_fail attr_check a a
+'
+
+
 test_expect_success 'setup' '
mkdir -p a/b/d a/c b &&
(
echo "[attr]notest !test"
+   echo "\" d \"   test=d"
+   echo " etest=e"
+   echo " e\"  test=e"
echo "f test=f"
echo "a/i test=a/i"
echo "onoff test -test"
@@ -69,6 +90,11 @@ test_expect_success 'command line checks' '
 '
 
 test_expect_success 'attribute test' '
+
+   attr_check " d " d &&
+   attr_check e e &&
+   attr_check_quote e\" e\\\" e &&
+
attr_check f f &&
attr_check a/f f &&
attr_check a/c/f f &&
-- 
2.10.1.714.ge3da0db



[PATCHv2 14/36] attr: retire git_check_attrs() API

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Since nobody uses the old API, make it file-scope static, and update
the documentation to describe the new API.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 Documentation/technical/api-gitattributes.txt | 82 ++-
 attr.c|  3 +-
 attr.h|  2 -
 3 files changed, 56 insertions(+), 31 deletions(-)

diff --git a/Documentation/technical/api-gitattributes.txt 
b/Documentation/technical/api-gitattributes.txt
index 2602668677..92fc32a0ff 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -16,10 +16,15 @@ Data Structure
of no interest to the calling programs.  The name of the
attribute can be retrieved by calling `git_attr_name()`.
 
+`struct git_attr_check_elem`::
+
+   This structure represents one attribute and its value.
+
 `struct git_attr_check`::
 
-   This structure represents a set of attributes to check in a call
-   to `git_check_attr()` function, and receives the results.
+   This structure represents a collection of `git_attr_check_elem`.
+   It is passed to `git_check_attr()` function, specifying the
+   attributes to check, and receives their values.
 
 
 Attribute Values
@@ -48,49 +53,51 @@ value of the attribute for the path.
 Querying Specific Attributes
 
 
-* Prepare an array of `struct git_attr_check` to define the list of
-  attributes you would want to check.  To populate this array, you would
-  need to define necessary attributes by calling `git_attr()` function.
+* Prepare `struct git_attr_check` using git_attr_check_initl()
+  function, enumerating the names of attributes whose values you are
+  interested in, terminated with a NULL pointer.  Alternatively, an
+  empty `struct git_attr_check` can be prepared by calling
+  `git_attr_check_alloc()` function and then attributes you want to
+  ask about can be added to it with `git_attr_check_append()`
+  function.
 
 * Call `git_check_attr()` to check the attributes for the path.
 
-* Inspect `git_attr_check` structure to see how each of the attribute in
-  the array is defined for the path.
+* Inspect `git_attr_check` structure to see how each of the
+  attribute in the array is defined for the path.
 
 
 Example
 ---
 
-To see how attributes "crlf" and "indent" are set for different paths.
+To see how attributes "crlf" and "ident" are set for different paths.
 
-. Prepare an array of `struct git_attr_check` with two elements (because
-  we are checking two attributes).  Initialize their `attr` member with
-  pointers to `struct git_attr` obtained by calling `git_attr()`:
+. Prepare a `struct git_attr_check` with two elements (because
+  we are checking two attributes):
 
 
-static struct git_attr_check check[2];
+static struct git_attr_check *check;
 static void setup_check(void)
 {
-   if (check[0].attr)
+   if (check)
return; /* already done */
-   check[0].attr = git_attr("crlf");
-   check[1].attr = git_attr("ident");
+   check = git_attr_check_initl("crlf", "ident", NULL);
 }
 
 
-. Call `git_check_attr()` with the prepared array of `struct git_attr_check`:
+. Call `git_check_attr()` with the prepared `struct git_attr_check`:
 
 
const char *path;
 
setup_check();
-   git_check_attr(path, ARRAY_SIZE(check), check);
+   git_check_attr(path, check);
 
 
-. Act on `.value` member of the result, left in `check[]`:
+. Act on `.value` member of the result, left in `check->check[]`:
 
 
-   const char *value = check[0].value;
+   const char *value = check->check[0].value;
 
if (ATTR_TRUE(value)) {
The attribute is Set, by listing only the name of the
@@ -109,20 +116,39 @@ static void setup_check(void)
}
 
 
+To see how attributes in argv[] are set for different paths, only
+the first step in the above would be different.
+
+
+static struct git_attr_check *check;
+static void setup_check(const char **argv)
+{
+   check = git_attr_check_alloc();
+   while (*argv) {
+   struct git_attr *attr = git_attr(*argv);
+   git_attr_check_append(check, attr);
+   argv++;
+   }
+}
+
+
 
 Querying All Attributes
 ---
 
 To get the values of all attributes associated with a file:
 
-* Call `git_all_attrs()`, which returns an array of `git_attr_check`
-  structures.
+* Prepare an empty `git_attr_check` structure by calling
+  `git_attr_check_alloc()`.
+
+* Call `git_all_attrs()`, which populates the `git_attr_check`
+  with the attributes attached to the path.
 
-* Iterate over the `git_attr_check` array to examine the attribute
-  names and values.  The name of the 

[PATCHv2 28/36] attr: keep attr stack for each check

2016-10-28 Thread Stefan Beller
Instead of having a global attr stack, attach the stack to each check.
This allows to use the attr in a multithreaded way.



Signed-off-by: Stefan Beller 
---
 attr.c| 101 +++---
 attr.h|   4 ++-
 hashmap.h |   2 ++
 3 files changed, 69 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index 8c217ae593..a143807698 100644
--- a/attr.c
+++ b/attr.c
@@ -375,15 +375,17 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
  * .gitignore file and info/excludes file as a fallback.
  */
 
-/* NEEDSWORK: This will become per git_attr_check */
-static struct attr_stack {
+struct attr_stack {
struct attr_stack *prev;
char *origin;
size_t originlen;
unsigned num_matches;
unsigned alloc;
struct match_attr **attrs;
-} *attr_stack;
+};
+
+static struct hashmap all_attr_stacks;
+static int all_attr_stacks_init;
 
 static void free_attr_elem(struct attr_stack *e)
 {
@@ -564,11 +566,23 @@ static void debug_set(const char *what, const char 
*match, struct git_attr *attr
 
 static void drop_attr_stack(void)
 {
-   while (attr_stack) {
-   struct attr_stack *elem = attr_stack;
-   attr_stack = elem->prev;
-   free_attr_elem(elem);
+   struct hashmap_iter iter;
+   struct git_attr_check *check;
+
+   attr_lock();
+   if (!all_attr_stacks_init) {
+   attr_unlock();
+   return;
}
+   hashmap_iter_init(_attr_stacks, );
+   while ((check = hashmap_iter_next())) {
+   while (check->attr_stack) {
+   struct attr_stack *elem = check->attr_stack;
+   check->attr_stack = elem->prev;
+   free_attr_elem(elem);
+   }
+   }
+   attr_unlock();
 }
 
 static const char *git_etc_gitattributes(void)
@@ -598,30 +612,31 @@ static void push_stack(struct attr_stack **attr_stack_p,
}
 }
 
-static void bootstrap_attr_stack(void)
+static void bootstrap_attr_stack(struct git_attr_check *check)
 {
struct attr_stack *elem;
 
-   if (attr_stack)
+   if (check->attr_stack)
return;
 
-   push_stack(_stack, read_attr_from_array(builtin_attr), NULL, 0);
+   push_stack(>attr_stack,
+  read_attr_from_array(builtin_attr), NULL, 0);
 
if (git_attr_system())
-   push_stack(_stack,
+   push_stack(>attr_stack,
   read_attr_from_file(git_etc_gitattributes(), 1),
   NULL, 0);
 
if (!git_attributes_file)
git_attributes_file = xdg_config_home("attributes");
if (git_attributes_file)
-   push_stack(_stack,
+   push_stack(>attr_stack,
   read_attr_from_file(git_attributes_file, 1),
   NULL, 0);
 
if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
elem = read_attr(GITATTRIBUTES_FILE, 1);
-   push_stack(_stack, elem, xstrdup(""), 0);
+   push_stack(>attr_stack, elem, xstrdup(""), 0);
debug_push(elem);
}
 
@@ -632,10 +647,11 @@ static void bootstrap_attr_stack(void)
 
if (!elem)
elem = xcalloc(1, sizeof(*elem));
-   push_stack(_stack, elem, NULL, 0);
+   push_stack(>attr_stack, elem, NULL, 0);
 }
 
-static void prepare_attr_stack(const char *path, int dirlen)
+static void prepare_attr_stack(const char *path, int dirlen,
+  struct git_attr_check *check)
 {
struct attr_stack *elem, *info;
const char *cp;
@@ -655,13 +671,13 @@ static void prepare_attr_stack(const char *path, int 
dirlen)
 * .gitattributes in deeper directories to shallower ones,
 * and finally use the built-in set as the default.
 */
-   bootstrap_attr_stack();
+   bootstrap_attr_stack(check);
 
/*
 * Pop the "info" one that is always at the top of the stack.
 */
-   info = attr_stack;
-   attr_stack = info->prev;
+   info = check->attr_stack;
+   check->attr_stack = info->prev;
 
/*
 * Pop the ones from directories that are not the prefix of
@@ -669,17 +685,17 @@ static void prepare_attr_stack(const char *path, int 
dirlen)
 * the root one (whose origin is an empty string "") or the builtin
 * one (whose origin is NULL) without popping it.
 */
-   while (attr_stack->origin) {
-   int namelen = strlen(attr_stack->origin);
+   while (check->attr_stack->origin) {
+   int namelen = strlen(check->attr_stack->origin);
 
-   elem = attr_stack;
+   elem = check->attr_stack;
if (namelen <= dirlen &&
!strncmp(elem->origin, path, namelen) &&
(!namelen 

[PATCHv2 34/36] submodule update: add `--init-default-path` switch

2016-10-28 Thread Stefan Beller
The new switch `--init-default-path` initializes the submodules which are
configured in `submodule.defaultUpdatePath` instead of those given as
command line arguments before updating. In the first implementation this
is made incompatible with further command line arguments as it is
unclear what the user means by

git submodule update --init --init-default-path 

This new switch allows to record more complex patterns as it saves
retyping them whenever you invoke update.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/config.txt|  5 
 Documentation/git-submodule.txt | 17 +
 git-submodule.sh| 21 +---
 t/t7400-submodule-basic.sh  | 53 +
 4 files changed, 89 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 27069ac032..72901ef90e 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2886,6 +2886,11 @@ submodule.alternateErrorStrategy
as computed via `submodule.alternateLocation`. Possible values are
`ignore`, `info`, `die`. Default is `die`.
 
+submodule.defaultUpdatePath::
+   Specifies a set of submodules to initialize when calling
+   `git submodule --init-default-group` by using the pathspec
+   syntax.
+
 tag.forceSignAnnotated::
A boolean to specify whether annotated tags created should be GPG 
signed.
If `--annotate` is specified on the command line, it takes
diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index d841573475..98a6d36c78 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -14,10 +14,10 @@ SYNOPSIS
 'git submodule' [--quiet] status [--cached] [--recursive] [--] [...]
 'git submodule' [--quiet] init [--] [...]
 'git submodule' [--quiet] deinit [-f|--force] (--all|[--] ...)
-'git submodule' [--quiet] update [--init] [--remote] [-N|--no-fetch]
- [--[no-]recommend-shallow] [-f|--force] [--rebase|--merge]
- [--reference ] [--depth ] [--recursive]
- [--jobs ] [--] [...]
+'git submodule' [--quiet] update [--init[-default-path]] [--remote] 
[-N|--no-fetch]
+ [--[no-]recommend-shallow]
+ [-f|--force] [--rebase|--merge] [--reference ]
+ [--depth ] [--recursive] [--jobs ] [--] [...]
 'git submodule' [--quiet] summary [--cached|--files] [(-n|--summary-limit) ]
  [commit] [--] [...]
 'git submodule' [--quiet] foreach [--recursive] 
@@ -194,6 +194,10 @@ If the submodule is not yet initialized, and you just want 
to use the
 setting as stored in .gitmodules, you can automatically initialize the
 submodule with the `--init` option.
 
+You can configure a set of submodules using pathspec syntax in
+submodule.defaultUpdatePath you can use `--init-default-path` to initialize
+those before updating.
+
 If `--recursive` is specified, this command will recurse into the
 registered submodules, and update any nested submodules within.
 --
@@ -363,6 +367,11 @@ the submodule itself.
Initialize all submodules for which "git submodule init" has not been
called so far before updating.
 
+--init-default-path::
+   This option is only valid for the update command.
+   Initialize all submodules configured in "`submodule.defaultUpdatePath`"
+   that have not been updated before.
+
 --name::
This option is only valid for the add command. It sets the submodule's
name to the given string instead of defaulting to its path. The name
diff --git a/git-submodule.sh b/git-submodule.sh
index a024a135d6..334cecc62d 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -9,7 +9,7 @@ USAGE="[--quiet] add [-b ] [-f|--force] [--name ] 
[--reference ...]
or: $dashless [--quiet] init [--] [...]
or: $dashless [--quiet] deinit [-f|--force] (--all| [--] ...)
-   or: $dashless [--quiet] update [--init] [--remote] [-N|--no-fetch] 
[-f|--force] [--checkout|--merge|--rebase] [--[no-]recommend-shallow] 
[--reference ] [--recursive] [--] [...]
+   or: $dashless [--quiet] update [--init[-default-path]] [--remote] 
[-N|--no-fetch] [-f|--force] [--checkout|--merge|--rebase] 
[--[no-]recommend-shallow] [--reference ] [--recursive] [--] 
[...]
or: $dashless [--quiet] summary [--cached|--files] [--summary-limit ] 
[commit] [--] [...]
or: $dashless [--quiet] foreach [--recursive] 
or: $dashless [--quiet] sync [--recursive] [--] [...]"
@@ -503,7 +503,12 @@ cmd_update()
progress="--progress"
;;
-i|--init)
-   init=1
+   test -z $init || test $init = by_args || die "$(gettext 
"Only one of --init or --init-default-path may be used.")"
+   init=by_args
+   ;;
+   --init-default-path)
+   test -z 

[PATCHv2 29/36] Documentation: fix a typo

2016-10-28 Thread Stefan Beller
Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/gitattributes.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 8a061af0cc..5b317971c0 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -88,7 +88,7 @@ is either not set or empty, $HOME/.config/git/attributes is 
used instead.
 Attributes for all users on a system should be placed in the
 `$(prefix)/etc/gitattributes` file.
 
-Sometimes you would need to override an setting of an attribute
+Sometimes you would need to override a setting of an attribute
 for a path to `Unspecified` state.  This can be done by listing
 the name of the attribute prefixed with an exclamation point `!`.
 
-- 
2.10.1.714.ge3da0db



[PATCHv2 13/36] attr: convert git_check_attrs() callers to use the new API

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

The remaining callers are all simple "I have N attributes I am
interested in.  I'll ask about them with various paths one by one".

After this step, no caller to git_check_attrs() remains.  After
removing it, we can extend "struct git_attr_check" struct with data
that can be used in optimizing the query for the specific N
attributes it contains.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 builtin/pack-objects.c | 19 +--
 convert.c  | 18 +++---
 ll-merge.c | 33 ++---
 userdiff.c | 19 ---
 ws.c   | 19 ++-
 5 files changed, 40 insertions(+), 68 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 3cb38ed68e..3918c07188 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -896,24 +896,15 @@ static void write_pack_file(void)
written, nr_result);
 }
 
-static void setup_delta_attr_check(struct git_attr_check_elem *check)
-{
-   static struct git_attr *attr_delta;
-
-   if (!attr_delta)
-   attr_delta = git_attr("delta");
-
-   check[0].attr = attr_delta;
-}
-
 static int no_try_delta(const char *path)
 {
-   struct git_attr_check_elem check[1];
+   static struct git_attr_check *check;
 
-   setup_delta_attr_check(check);
-   if (git_check_attrs(path, ARRAY_SIZE(check), check))
+   if (!check)
+   check = git_attr_check_initl("delta", NULL);
+   if (git_check_attr(path, check))
return 0;
-   if (ATTR_FALSE(check->value))
+   if (ATTR_FALSE(check->check[0].value))
return 1;
return 0;
 }
diff --git a/convert.c b/convert.c
index e1e47d2367..4eca0b5dda 100644
--- a/convert.c
+++ b/convert.c
@@ -779,24 +779,20 @@ struct conv_attrs {
int ident;
 };
 
-static const char *conv_attr_name[] = {
-   "crlf", "ident", "filter", "eol", "text",
-};
-#define NUM_CONV_ATTRS ARRAY_SIZE(conv_attr_name)
-
 static void convert_attrs(struct conv_attrs *ca, const char *path)
 {
-   int i;
-   static struct git_attr_check_elem ccheck[NUM_CONV_ATTRS];
+   static struct git_attr_check *check;
 
-   if (!ccheck[0].attr) {
-   for (i = 0; i < NUM_CONV_ATTRS; i++)
-   ccheck[i].attr = git_attr(conv_attr_name[i]);
+   if (!check) {
+   check = git_attr_check_initl("crlf", "ident",
+"filter", "eol", "text",
+NULL);
user_convert_tail = _convert;
git_config(read_convert_config, NULL);
}
 
-   if (!git_check_attrs(path, NUM_CONV_ATTRS, ccheck)) {
+   if (!git_check_attr(path, check)) {
+   struct git_attr_check_elem *ccheck = check->check;
ca->crlf_action = git_path_check_crlf(ccheck + 4);
if (ca->crlf_action == CRLF_UNDEFINED)
ca->crlf_action = git_path_check_crlf(ccheck + 0);
diff --git a/ll-merge.c b/ll-merge.c
index eb2c37ea92..bc6479ce7f 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -336,15 +336,6 @@ static const struct ll_merge_driver 
*find_ll_merge_driver(const char *merge_attr
return _merge_drv[LL_TEXT_MERGE];
 }
 
-static int git_path_check_merge(const char *path, struct git_attr_check_elem 
check[2])
-{
-   if (!check[0].attr) {
-   check[0].attr = git_attr("merge");
-   check[1].attr = git_attr("conflict-marker-size");
-   }
-   return git_check_attrs(path, 2, check);
-}
-
 static void normalize_file(mmfile_t *mm, const char *path)
 {
struct strbuf strbuf = STRBUF_INIT;
@@ -362,7 +353,7 @@ int ll_merge(mmbuffer_t *result_buf,
 mmfile_t *theirs, const char *their_label,
 const struct ll_merge_options *opts)
 {
-   static struct git_attr_check_elem check[2];
+   static struct git_attr_check *check;
static const struct ll_merge_options default_opts;
const char *ll_driver_name = NULL;
int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
@@ -376,10 +367,14 @@ int ll_merge(mmbuffer_t *result_buf,
normalize_file(ours, path);
normalize_file(theirs, path);
}
-   if (!git_path_check_merge(path, check)) {
-   ll_driver_name = check[0].value;
-   if (check[1].value) {
-   marker_size = atoi(check[1].value);
+
+   if (!check)
+   check = git_attr_check_initl("merge", "conflict-marker-size", 
NULL);
+
+   if (!git_check_attr(path, check)) {
+   ll_driver_name = check->check[0].value;
+   if (check->check[1].value) {
+   marker_size = atoi(check->check[1].value);
if (marker_size <= 0)
   

[PATCHv2 35/36] clone: add --init-submodule= switch

2016-10-28 Thread Stefan Beller
The new switch passes the pathspec to `git submodule update --init`
which is called after the actual clone is done.

Additionally this configures the submodule.defaultUpdatePath to
be the given pathspec, such that any future invocation of
`git submodule update --init-default-paths` will keep up
with the pathspec.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/git-clone.txt | 23 +
 builtin/clone.c | 36 ++--
 t/t7400-submodule-basic.sh  | 81 +
 3 files changed, 131 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 35cc34b2fb..1089f3812c 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -15,7 +15,8 @@ SYNOPSIS
  [--dissociate] [--separate-git-dir ]
  [--depth ] [--[no-]single-branch]
  [--recursive | --recurse-submodules] [--[no-]shallow-submodules]
- [--jobs ] [--]  []
+ [--init-submodule ] [--jobs ] [--]
+  []
 
 DESCRIPTION
 ---
@@ -217,12 +218,20 @@ objects from the source repository into a pack in the 
cloned repository.
 
 --recursive::
 --recurse-submodules::
-   After the clone is created, initialize all submodules within,
-   using their default settings. This is equivalent to running
-   `git submodule update --init --recursive` immediately after
-   the clone is finished. This option is ignored if the cloned
-   repository does not have a worktree/checkout (i.e. if any of
-   `--no-checkout`/`-n`, `--bare`, or `--mirror` is given)
+   After the clone is created, initialize and clone all submodules
+   within, using their default settings. This is equivalent to
+   running `git submodule update --recursive --init `
+   immediately after the clone is finished. This option is ignored
+   if the cloned repository does not have a worktree/checkout (i.e.
+   if any of `--no-checkout`/`-n`, `--bare`, or `--mirror` is given)
+
+--init-submodule::
+   After the clone is created, initialize and clone specified
+   submodules within, using their default settings. It is possible
+   to give multiple specifications by giving this argument multiple
+   times. This is equivalent to configure `submodule.defaultUpdateGroup`
+   and then running `git submodule update --init-default-path`
+   immediately after the clone is finished.
 
 --[no-]shallow-submodules::
All submodules which are cloned will be shallow with a depth of 1.
diff --git a/builtin/clone.c b/builtin/clone.c
index 6c76a6ed66..748e7c0c19 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -56,6 +56,16 @@ static struct string_list option_required_reference = 
STRING_LIST_INIT_NODUP;
 static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
 static int option_dissociate;
 static int max_jobs = -1;
+static struct string_list init_submodules;
+
+static int init_submodules_cb(const struct option *opt, const char *arg, int 
unset)
+{
+   if (unset)
+   return -1;
+
+   string_list_append((struct string_list *)opt->value, arg);
+   return 0;
+}
 
 static struct option builtin_clone_options[] = {
OPT__VERBOSITY(_verbosity),
@@ -112,6 +122,9 @@ static struct option builtin_clone_options[] = {
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
TRANSPORT_FAMILY_IPV6),
+   OPT_CALLBACK(0, "init-submodule", _submodules, N_(""),
+   N_("clone specific submodules. Pass multiple times for 
complex pathspecs"),
+   init_submodules_cb),
OPT_END()
 };
 
@@ -733,13 +746,21 @@ static int checkout(int submodule_progress)
err |= run_hook_le(NULL, "post-checkout", sha1_to_hex(null_sha1),
   sha1_to_hex(sha1), "1", NULL);
 
-   if (!err && option_recursive) {
+   if (!err && (option_recursive || init_submodules.nr > 0)) {
struct argv_array args = ARGV_ARRAY_INIT;
-   argv_array_pushl(, "submodule", "update", "--init", 
"--recursive", NULL);
+   argv_array_pushl(, "submodule", "update", NULL);
+
+   if (init_submodules.nr > 0)
+   argv_array_pushf(, "--init-default-path");
+   else
+   argv_array_pushf(, "--init");
 
if (option_shallow_submodules == 1)
argv_array_push(, "--depth=1");
 
+   if (option_recursive)
+   argv_array_pushf(, "--recursive");
+
if (max_jobs != -1)
argv_array_pushf(, "--jobs=%d", max_jobs);
 
@@ -887,6 +908,17 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
option_no_checkout = 1;
}
 
+   if 

[PATCHv2 22/36] attr.c: correct ugly hack for git_all_attrs()

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

The collect_some_attrs() function has an ugly hack since

06a604e6 (attr: avoid heavy work when we know the specified attr is
not defined, 2014-12-28) added an optimization that relies on the
fact that the caller knows what attributes it is interested in, so
that we can leave once we know the final answer for all the
attributes the caller asked.

git_all_attrs() that asks "what attributes are on this path?"
however does not know what attributes it is interested in, other
than the vague "we are interested in all of them", which is not a
very useful thing to say.  As a way to disable this optimization
for this caller, the said commit added a code to skip it when
the caller passes a NULL for the check structure.

However, it skipped the optimization not when check is NULL, but
when the number of attributes being checked is 0, which is
unnecessarily pessimistic.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 24 
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/attr.c b/attr.c
index c85685da29..94476de3d2 100644
--- a/attr.c
+++ b/attr.c
@@ -752,8 +752,8 @@ static int attr_check_is_dynamic(const struct 
git_attr_check *check)
 
 /*
  * Collect attributes for path into the array pointed to by
- * check_all_attr. If num is non-zero, only attributes in check[] are
- * collected. Otherwise all attributes are collected.
+ * check_all_attr.  If check is not NULL, only attributes in
+ * check[] are collected. Otherwise all attributes are collected.
  */
 static void collect_some_attrs(const char *path, int pathlen,
   struct git_attr_check *check)
@@ -763,17 +763,6 @@ static void collect_some_attrs(const char *path, int 
pathlen,
int i, rem, dirlen;
const char *cp, *last_slash = NULL;
int basename_offset;
-   int num;
-   struct git_attr_check_elem *celem;
-
-   if (!check) {
-   /* Yuck - ugly git_all_attrs() hack! */
-   celem = NULL;
-   num = 0;
-   } else {
-   celem = check->check;
-   num = check->check_nr;
-   }
 
for (cp = path; cp < path + pathlen; cp++) {
if (*cp == '/' && cp[1])
@@ -790,9 +779,12 @@ static void collect_some_attrs(const char *path, int 
pathlen,
prepare_attr_stack(path, dirlen);
for (i = 0; i < attr_nr; i++)
check_all_attr[i].value = ATTR__UNKNOWN;
-   if (num && !cannot_trust_maybe_real) {
+
+   if (check && !cannot_trust_maybe_real) {
+   struct git_attr_check_elem *celem = check->check;
+
rem = 0;
-   for (i = 0; i < num; i++) {
+   for (i = 0; i < check->check_nr; i++) {
if (!celem[i].attr->maybe_real) {
struct git_attr_check_elem *c;
c = check_all_attr + celem[i].attr->attr_nr;
@@ -800,7 +792,7 @@ static void collect_some_attrs(const char *path, int 
pathlen,
rem++;
}
}
-   if (rem == num)
+   if (rem == check->check_nr)
return;
}
 
-- 
2.10.1.714.ge3da0db



[PATCHv2 17/36] attr: expose validity check for attribute names

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Export attr_name_valid() function, and a helper function that
returns the message to be given when a given  pair
is not a good name for an attribute.

We could later update the message to exactly spell out what the
rules for a good attribute name are, etc.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 39 +--
 attr.h | 10 ++
 2 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/attr.c b/attr.c
index 7058e1c9fa..33021cc857 100644
--- a/attr.c
+++ b/attr.c
@@ -59,23 +59,38 @@ static unsigned hash_name(const char *name, int namelen)
return val;
 }
 
-static int invalid_attr_name(const char *name, int namelen)
+int attr_name_valid(const char *name, size_t namelen)
 {
/*
 * Attribute name cannot begin with '-' and must consist of
 * characters from [-A-Za-z0-9_.].
 */
if (namelen <= 0 || *name == '-')
-   return -1;
+   return 0;
while (namelen--) {
char ch = *name++;
if (! (ch == '-' || ch == '.' || ch == '_' ||
   ('0' <= ch && ch <= '9') ||
   ('a' <= ch && ch <= 'z') ||
   ('A' <= ch && ch <= 'Z')) )
-   return -1;
+   return 0;
}
-   return 0;
+   return 1;
+}
+
+void invalid_attr_name_message(struct strbuf *err, const char *name, int len)
+{
+   strbuf_addf(err, _("%.*s is not a valid attribute name"),
+   len, name);
+}
+
+static void report_invalid_attr(const char *name, size_t len,
+   const char *src, int lineno)
+{
+   struct strbuf err = STRBUF_INIT;
+   invalid_attr_name_message(, name, len);
+   fprintf(stderr, "%s: %s:%d\n", err.buf, src, lineno);
+   strbuf_release();
 }
 
 struct git_attr *git_attr_counted(const char *name, size_t len)
@@ -90,7 +105,7 @@ struct git_attr *git_attr_counted(const char *name, size_t 
len)
return a;
}
 
-   if (invalid_attr_name(name, len))
+   if (!attr_name_valid(name, len))
return NULL;
 
FLEX_ALLOC_MEM(a, name, name, len);
@@ -176,17 +191,15 @@ static const char *parse_attr(const char *src, int 
lineno, const char *cp,
cp++;
len--;
}
-   if (invalid_attr_name(cp, len)) {
-   fprintf(stderr,
-   "%.*s is not a valid attribute name: %s:%d\n",
-   len, cp, src, lineno);
+   if (!attr_name_valid(cp, len)) {
+   report_invalid_attr(cp, len, src, lineno);
return NULL;
}
} else {
/*
 * As this function is always called twice, once with
 * e == NULL in the first pass and then e != NULL in
-* the second pass, no need for invalid_attr_name()
+* the second pass, no need for attr_name_valid()
 * check here.
 */
if (*cp == '-' || *cp == '!') {
@@ -229,10 +242,8 @@ static struct match_attr *parse_attr_line(const char 
*line, const char *src,
name += strlen(ATTRIBUTE_MACRO_PREFIX);
name += strspn(name, blank);
namelen = strcspn(name, blank);
-   if (invalid_attr_name(name, namelen)) {
-   fprintf(stderr,
-   "%.*s is not a valid attribute name: %s:%d\n",
-   namelen, name, src, lineno);
+   if (!attr_name_valid(name, namelen)) {
+   report_invalid_attr(name, namelen, src, lineno);
goto fail_return;
}
}
diff --git a/attr.h b/attr.h
index bcedf928b5..fe26f3a588 100644
--- a/attr.h
+++ b/attr.h
@@ -13,6 +13,16 @@ extern struct git_attr *git_attr(const char *);
 /* The same, but with counted string */
 extern struct git_attr *git_attr_counted(const char *, size_t);
 
+/*
+ * Return the name of the attribute represented by the argument.  The
+ * return value is a pointer to a null-delimited string that is part
+ * of the internal data structure; it should not be modified or freed.
+ */
+extern const char *git_attr_name(const struct git_attr *);
+
+extern int attr_name_valid(const char *name, size_t namelen);
+extern void invalid_attr_name_message(struct strbuf *, const char *, int);
+
 /* Internal use */
 extern const char git_attr__true[];
 extern const char git_attr__false[];
-- 
2.10.1.714.ge3da0db



[PATCHv2 30/36] pathspec: move long magic parsing out of prefix_pathspec

2016-10-28 Thread Stefan Beller
`prefix_pathspec` is quite a lengthy function and we plan on adding more.
Split it up for better readability. As we want to add code into the
inner loop of the long magic parsing, we also benefit from lower
indentation.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 pathspec.c | 84 +++---
 1 file changed, 47 insertions(+), 37 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 22ca74a126..ad13556c82 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -88,6 +88,52 @@ static void prefix_short_magic(struct strbuf *sb, int 
prefixlen,
strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
+static void eat_long_magic(struct pathspec_item *item, const char *elt,
+   unsigned *magic, int *pathspec_prefix,
+   const char **copyfrom_, const char **long_magic_end)
+{
+   int i;
+   const char *copyfrom = *copyfrom_;
+   /* longhand */
+   const char *nextat;
+   for (copyfrom = elt + 2;
+*copyfrom && *copyfrom != ')';
+copyfrom = nextat) {
+   size_t len = strcspn(copyfrom, ",)");
+   if (copyfrom[len] == ',')
+   nextat = copyfrom + len + 1;
+   else
+   /* handle ')' and '\0' */
+   nextat = copyfrom + len;
+   if (!len)
+   continue;
+   for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+   if (strlen(pathspec_magic[i].name) == len &&
+   !strncmp(pathspec_magic[i].name, copyfrom, len)) {
+   *magic |= pathspec_magic[i].bit;
+   break;
+   }
+   if (starts_with(copyfrom, "prefix:")) {
+   char *endptr;
+   *pathspec_prefix = strtol(copyfrom + 7,
+ , 10);
+   if (endptr - copyfrom != len)
+   die(_("invalid parameter for pathspec 
magic 'prefix'"));
+   /* "i" would be wrong, but it does not matter */
+   break;
+   }
+   }
+   if (ARRAY_SIZE(pathspec_magic) <= i)
+   die(_("Invalid pathspec magic '%.*s' in '%s'"),
+   (int) len, copyfrom, elt);
+   }
+   if (*copyfrom != ')')
+   die(_("Missing ')' at the end of pathspec magic in '%s'"), elt);
+   *long_magic_end = copyfrom;
+   copyfrom++;
+   *copyfrom_ = copyfrom;
+}
+
 /*
  * Take an element of a pathspec and check for magic signatures.
  * Append the result to the prefix. Return the magic bitmap.
@@ -150,43 +196,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
(flags & PATHSPEC_LITERAL_PATH)) {
; /* nothing to do */
} else if (elt[1] == '(') {
-   /* longhand */
-   const char *nextat;
-   for (copyfrom = elt + 2;
-*copyfrom && *copyfrom != ')';
-copyfrom = nextat) {
-   size_t len = strcspn(copyfrom, ",)");
-   if (copyfrom[len] == ',')
-   nextat = copyfrom + len + 1;
-   else
-   /* handle ')' and '\0' */
-   nextat = copyfrom + len;
-   if (!len)
-   continue;
-   for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
-   if (strlen(pathspec_magic[i].name) == len &&
-   !strncmp(pathspec_magic[i].name, copyfrom, 
len)) {
-   magic |= pathspec_magic[i].bit;
-   break;
-   }
-   if (starts_with(copyfrom, "prefix:")) {
-   char *endptr;
-   pathspec_prefix = strtol(copyfrom + 7,
-, 10);
-   if (endptr - copyfrom != len)
-   die(_("invalid parameter for 
pathspec magic 'prefix'"));
-   /* "i" would be wrong, but it does not 
matter */
-   break;
-   }
-   }
-   if (ARRAY_SIZE(pathspec_magic) <= i)
-   die(_("Invalid pathspec magic '%.*s' in '%s'"),
-   (int) len, copyfrom, elt);
-   }
-   if (*copyfrom != ')')
-   

[PATCHv2 36/36] completion: clone can initialize specific submodules

2016-10-28 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 contrib/completion/git-completion.bash | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 21016bf8df..90eb772652 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1138,6 +1138,7 @@ _git_clone ()
--single-branch
--branch
--recurse-submodules
+   --init-submodule
"
return
;;
-- 
2.10.1.714.ge3da0db



[PATCHv2 31/36] pathspec: move prefix check out of the inner loop

2016-10-28 Thread Stefan Beller
The prefix check is not related the check of pathspec magic; also there
is no code that is relevant after we'd break the loop on a match for
"prefix:". So move the check before the loop and shortcircuit the outer
loop.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 pathspec.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index ad13556c82..83b46c1910 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -107,21 +107,22 @@ static void eat_long_magic(struct pathspec_item *item, 
const char *elt,
nextat = copyfrom + len;
if (!len)
continue;
+
+   if (starts_with(copyfrom, "prefix:")) {
+   char *endptr;
+   *pathspec_prefix = strtol(copyfrom + 7,
+ , 10);
+   if (endptr - copyfrom != len)
+   die(_("invalid parameter for pathspec magic 
'prefix'"));
+   continue;
+   }
+
for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
if (strlen(pathspec_magic[i].name) == len &&
!strncmp(pathspec_magic[i].name, copyfrom, len)) {
*magic |= pathspec_magic[i].bit;
break;
}
-   if (starts_with(copyfrom, "prefix:")) {
-   char *endptr;
-   *pathspec_prefix = strtol(copyfrom + 7,
- , 10);
-   if (endptr - copyfrom != len)
-   die(_("invalid parameter for pathspec 
magic 'prefix'"));
-   /* "i" would be wrong, but it does not matter */
-   break;
-   }
}
if (ARRAY_SIZE(pathspec_magic) <= i)
die(_("Invalid pathspec magic '%.*s' in '%s'"),
-- 
2.10.1.714.ge3da0db



[PATCHv2 24/36] attr.c: always pass check[] to collect_some_attrs()

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

This function used to be called with check=NULL to signal it to
collect all attributes in the global check_all_attr[] array.

Because the longer term plan is to allocate check_all_attr[] and
attr_stack data structures per git_attr_check instance (i.e. "check"
here) to make the attr subsystem thread-safe, it is unacceptable.

Pass "Are we grabbing all attributes defined in the system?" bit as
a separate argument and pass it from the callers.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 37 +++--
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/attr.c b/attr.c
index 1098300e54..92b3130f1e 100644
--- a/attr.c
+++ b/attr.c
@@ -760,11 +760,12 @@ static void empty_attr_check_elems(struct git_attr_check 
*check)
 
 /*
  * Collect attributes for path into the array pointed to by
- * check_all_attr.  If check is not NULL, only attributes in
- * check[] are collected. Otherwise all attributes are collected.
+ * check_all_attr.  If collect_all is zero, only attributes in
+ * check[] are collected.  Otherwise, check[] is cleared and
+ * any and all attributes that are visible are collected in it.
  */
 static void collect_some_attrs(const char *path, int pathlen,
-  struct git_attr_check *check)
+  struct git_attr_check *check, int collect_all)
 
 {
struct attr_stack *stk;
@@ -785,10 +786,11 @@ static void collect_some_attrs(const char *path, int 
pathlen,
}
 
prepare_attr_stack(path, dirlen);
+
for (i = 0; i < attr_nr; i++)
check_all_attr[i].value = ATTR__UNKNOWN;
 
-   if (check && !cannot_trust_maybe_real) {
+   if (!collect_all && !cannot_trust_maybe_real) {
struct git_attr_check_elem *celem = check->check;
 
rem = 0;
@@ -807,6 +809,17 @@ static void collect_some_attrs(const char *path, int 
pathlen,
rem = attr_nr;
for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
rem = fill(path, pathlen, basename_offset, stk, rem);
+
+   if (collect_all) {
+   empty_attr_check_elems(check);
+   for (i = 0; i < attr_nr; i++) {
+   const struct git_attr *attr = check_all_attr[i].attr;
+   const char *value = check_all_attr[i].value;
+   if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
+   continue;
+   git_attr_check_append(check, attr)->value = value;
+   }
+   }
 }
 
 static int git_check_attrs(const char *path, int pathlen,
@@ -815,7 +828,7 @@ static int git_check_attrs(const char *path, int pathlen,
int i;
struct git_attr_check_elem *celem = check->check;
 
-   collect_some_attrs(path, pathlen, check);
+   collect_some_attrs(path, pathlen, check, 0);
 
for (i = 0; i < check->check_nr; i++) {
const char *value = 
check_all_attr[celem[i].attr->attr_nr].value;
@@ -829,19 +842,7 @@ static int git_check_attrs(const char *path, int pathlen,
 
 void git_all_attrs(const char *path, struct git_attr_check *check)
 {
-   int i;
-
-   git_attr_check_clear(check);
-   collect_some_attrs(path, strlen(path), NULL);
-
-   for (i = 0; i < attr_nr; i++) {
-   const char *name = check_all_attr[i].attr->name;
-   const char *value = check_all_attr[i].value;
-   if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
-   continue;
-   git_attr_check_append(check, git_attr(name));
-   check->check[check->check_nr - 1].value = value;
-   }
+   collect_some_attrs(path, strlen(path), check, 1);
 }
 
 void git_attr_set_direction(enum git_attr_direction new, struct index_state 
*istate)
-- 
2.10.1.714.ge3da0db



[PATCHv2 19/36] attr.c: add push_stack() helper

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

There are too many repetitious "I have this new attr_stack element;
push it at the top of the stack" sequence.  The new helper function
push_stack() gives us a way to express what is going on at these
places, and as a side effect, halves the number of times we mention
the attr_stack global variable.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 71 +++---
 1 file changed, 33 insertions(+), 38 deletions(-)

diff --git a/attr.c b/attr.c
index c8c4936f36..0a8ed5b582 100644
--- a/attr.c
+++ b/attr.c
@@ -521,6 +521,18 @@ static int git_attr_system(void)
 
 static GIT_PATH_FUNC(git_path_info_attributes, INFOATTRIBUTES_FILE)
 
+static void push_stack(struct attr_stack **attr_stack_p,
+  struct attr_stack *elem, char *origin, size_t originlen)
+{
+   if (elem) {
+   elem->origin = origin;
+   if (origin)
+   elem->originlen = originlen;
+   elem->prev = *attr_stack_p;
+   *attr_stack_p = elem;
+   }
+}
+
 static void bootstrap_attr_stack(void)
 {
struct attr_stack *elem;
@@ -528,37 +540,23 @@ static void bootstrap_attr_stack(void)
if (attr_stack)
return;
 
-   elem = read_attr_from_array(builtin_attr);
-   elem->origin = NULL;
-   elem->prev = attr_stack;
-   attr_stack = elem;
-
-   if (git_attr_system()) {
-   elem = read_attr_from_file(git_etc_gitattributes(), 1);
-   if (elem) {
-   elem->origin = NULL;
-   elem->prev = attr_stack;
-   attr_stack = elem;
-   }
-   }
+   push_stack(_stack, read_attr_from_array(builtin_attr), NULL, 0);
+
+   if (git_attr_system())
+   push_stack(_stack,
+  read_attr_from_file(git_etc_gitattributes(), 1),
+  NULL, 0);
 
if (!git_attributes_file)
git_attributes_file = xdg_config_home("attributes");
-   if (git_attributes_file) {
-   elem = read_attr_from_file(git_attributes_file, 1);
-   if (elem) {
-   elem->origin = NULL;
-   elem->prev = attr_stack;
-   attr_stack = elem;
-   }
-   }
+   if (git_attributes_file)
+   push_stack(_stack,
+  read_attr_from_file(git_attributes_file, 1),
+  NULL, 0);
 
if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
elem = read_attr(GITATTRIBUTES_FILE, 1);
-   elem->origin = xstrdup("");
-   elem->originlen = 0;
-   elem->prev = attr_stack;
-   attr_stack = elem;
+   push_stack(_stack, elem, xstrdup(""), 0);
debug_push(elem);
}
 
@@ -569,15 +567,12 @@ static void bootstrap_attr_stack(void)
 
if (!elem)
elem = xcalloc(1, sizeof(*elem));
-   elem->origin = NULL;
-   elem->prev = attr_stack;
-   attr_stack = elem;
+   push_stack(_stack, elem, NULL, 0);
 }
 
 static void prepare_attr_stack(const char *path, int dirlen)
 {
struct attr_stack *elem, *info;
-   int len;
const char *cp;
 
/*
@@ -637,20 +632,21 @@ static void prepare_attr_stack(const char *path, int 
dirlen)
 
assert(attr_stack->origin);
while (1) {
-   len = strlen(attr_stack->origin);
+   size_t len = strlen(attr_stack->origin);
+   char *origin;
+
if (dirlen <= len)
break;
cp = memchr(path + len + 1, '/', dirlen - len - 1);
if (!cp)
cp = path + dirlen;
-   strbuf_add(, path, cp - path);
-   strbuf_addch(, '/');
-   strbuf_addstr(, GITATTRIBUTES_FILE);
+   strbuf_addf(,
+   "%.*s/%s", (int)(cp - path), path,
+   GITATTRIBUTES_FILE);
elem = read_attr(pathbuf.buf, 0);
strbuf_setlen(, cp - path);
-   elem->origin = strbuf_detach(, 
>originlen);
-   elem->prev = attr_stack;
-   attr_stack = elem;
+   origin = strbuf_detach(, );
+   push_stack(_stack, elem, origin, len);
debug_push(elem);
}
 
@@ -660,8 +656,7 @@ static void prepare_attr_stack(const char *path, int dirlen)
/*
 * Finally push the "info" one at the top of the stack.
 */
-   info->prev = attr_stack;
-   

[PATCHv2 33/36] pathspec: allow escaped query values

2016-10-28 Thread Stefan Beller
In our own .gitattributes file we have attributes such as:

*.[ch] whitespace=indent,trail,space

When querying for attributes we want to be able to ask for the exact
value, i.e.

git ls-files :(attr:whitespace=indent,trail,space)

should work, but the commas are used in the attr magic to introduce
the next attr, such that this query currently fails with

fatal: Invalid pathspec magic 'trail' in ':(attr:whitespace=indent,trail,space)'

This change allows escaping characters by a backslash, such that the query

git ls-files :(attr:whitespace=indent\,trail\,space)

will match all path that have the value "indent,trail,space" for the
whitespace attribute. To accomplish this, we need to modify two places.
First `eat_long_magic` needs to not stop early upon seeing a comma or
closing paren that is escaped. As a second step we need to remove any
escaping from the attr value.

Helped-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 pathspec.c  | 53 +
 t/t6134-pathspec-with-labels.sh | 10 
 2 files changed, 58 insertions(+), 5 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 478449eec5..e8d20a0962 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -90,13 +90,57 @@ static void prefix_short_magic(struct strbuf *sb, int 
prefixlen,
strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
+static size_t strcspn_escaped(const char *s, const char *stop)
+{
+   const char *i;
+
+   for (i = s; *i; i++) {
+   /* skip the escaped character */
+   if (i[0] == '\\' && i[1]) {
+   i++;
+   continue;
+   }
+
+   if (strchr(stop, *i))
+   break;
+   }
+   return i - s;
+}
+
+static inline int invalid_value_char(const char ch)
+{
+   if (isalnum(ch) || strchr(",-_", ch))
+   return 0;
+   return -1;
+}
+
+static char *attr_value_unescape(const char *value)
+{
+   const char *src;
+   char *dst, *ret;
+
+   ret = xmallocz(strlen(value));
+   for (src = value, dst = ret; *src; src++, dst++) {
+   if (*src == '\\') {
+   if (!src[1])
+   die(_("Escape character '\\' not allowed as "
+ "last character in attr value"));
+   src++;
+   }
+   if (invalid_value_char(*src))
+   die("cannot use '%c' for value matching", *src);
+   *dst = *src;
+   }
+   *dst = '\0';
+   return ret;
+}
+
 static void parse_pathspec_attr_match(struct pathspec_item *item, const char 
*value)
 {
struct string_list_item *si;
struct string_list list = STRING_LIST_INIT_DUP;
struct argv_array attrs = ARGV_ARRAY_INIT;
 
-
if (!value || !strlen(value))
die(_("attr spec must not be empty"));
 
@@ -128,10 +172,9 @@ static void parse_pathspec_attr_match(struct pathspec_item 
*item, const char *va
if (attr[attr_len] != '=')
am->match_mode = MATCH_SET;
else {
+   const char *v = [attr_len + 1];
am->match_mode = MATCH_VALUE;
-   am->value = xstrdup([attr_len + 1]);
-   if (strchr(am->value, '\\'))
-   die(_("attr spec values must not 
contain backslashes"));
+   am->value = attr_value_unescape(v);
}
break;
}
@@ -168,7 +211,7 @@ static void eat_long_magic(struct pathspec_item *item, 
const char *elt,
for (copyfrom = elt + 2;
 *copyfrom && *copyfrom != ')';
 copyfrom = nextat) {
-   size_t len = strcspn(copyfrom, ",)");
+   size_t len = strcspn_escaped(copyfrom, ",)");
if (copyfrom[len] == ',')
nextat = copyfrom + len + 1;
else
diff --git a/t/t6134-pathspec-with-labels.sh b/t/t6134-pathspec-with-labels.sh
index 643e8ac35a..3d03e0a6e0 100755
--- a/t/t6134-pathspec-with-labels.sh
+++ b/t/t6134-pathspec-with-labels.sh
@@ -172,4 +172,14 @@ test_expect_success 'abort on asking for wrong magic' '
test_must_fail git ls-files . ":(attr:!label=foo)"
 '
 
+test_expect_success 'check attribute list' '
+   cat <<-EOF >>.gitattributes &&
+   * whitespace=indent,trail,space
+   EOF
+   cat .gitattributes &&
+   git ls-files ":(attr:whitespace=indent\,trail\,space)" >actual &&
+   git ls-files >expect &&
+   test_cmp expect actual
+'
+
 test_done
-- 
2.10.1.714.ge3da0db



[PATCHv2 27/36] attr: convert to new threadsafe API

2016-10-28 Thread Stefan Beller
This revamps the API of the attr subsystem to be thread safe.
Before we had the question and its results in one struct type.
The typical usage of the API was

static struct git_attr_check *check;

if (!check)
check = git_attr_check_initl("text", NULL);

git_check_attr(path, check);
act_on(check->value[0]);

This has a couple of issues when it comes to thread safety:

* the initialization is racy in this implementation. To make it
  thread safe, we need to acquire a mutex, such that only one
  thread is executing the code in git_attr_check_initl.
  As we do not want to introduce a mutex at each call site,
  this is best done in the attr code. However to do so, we need
  to have access to the `check` variable, i.e. the API has to
  look like
git_attr_check_initl(struct git_attr_check**, ...);
  Then one of the threads calling git_attr_check_initl will
  acquire the mutex and init the `check`, while all other threads
  will wait on the mutex just to realize they're late to the
  party and they'll return with no work done.

* While the check for attributes to be questioned only need to
  be initalized once as that part will be read only after its
  initialisation, the answer may be different for each path.
  Because of that we need to decouple the check and the answer,
  such that each thread can obtain an answer for the path it
  is currently processing.

This commit changes the API and adds locking in
git_attr_check_initl that provides the thread safety for constructing
`struct git_attr_check`.

The usage of the new API will be:

/*
 * The initl call will thread-safely check whether the
 * struct git_attr_check has been initialized. We only
 * want to do the initialization work once, hence we do
 * that work inside a thread safe environment.
 */
static struct git_attr_check *check;
git_attr_check_initl(, "text", NULL);

/* We're just asking for one attribute "text". */
git_attr_result myresult[1];

/* Perform the check and act on it: */
git_check_attr(path, check, myresult);
act_on(myresult[0].value);

/*
 * No need to free the check as it is static, hence doesn't leak
 * memory. The result is also static, so no need to free there either.
 */

Signed-off-by: Stefan Beller 
---
 Documentation/technical/api-gitattributes.txt | 136 +++-
 archive.c |  11 +-
 attr.c| 172 +-
 attr.h|  73 +--
 builtin/check-attr.c  |  50 
 builtin/pack-objects.c|  16 +--
 compat/mingw.c|   4 +
 convert.c |  40 +++---
 ll-merge.c|  24 ++--
 userdiff.c|  16 +--
 ws.c  |   8 +-
 11 files changed, 324 insertions(+), 226 deletions(-)

diff --git a/Documentation/technical/api-gitattributes.txt 
b/Documentation/technical/api-gitattributes.txt
index 92fc32a0ff..7dd6616a24 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -10,21 +10,23 @@ Data Structure
 
 `struct git_attr`::
 
-   An attribute is an opaque object that is identified by its name.
-   Pass the name to `git_attr()` function to obtain the object of
-   this type.  The internal representation of this structure is
+   The internal representation of this structure is
of no interest to the calling programs.  The name of the
attribute can be retrieved by calling `git_attr_name()`.
 
-`struct git_attr_check_elem`::
-
-   This structure represents one attribute and its value.
-
 `struct git_attr_check`::
 
-   This structure represents a collection of `git_attr_check_elem`.
+   This structure represents a collection of `struct git_attrs`.
It is passed to `git_check_attr()` function, specifying the
-   attributes to check, and receives their values.
+   attributes to check, and receives their values into a corresponding
+   `struct git_attr_result`.
+
+`struct git_attr_result`::
+
+   This structure represents one results for a check, such that an
+   array of `struct git_attr_results` corresponds to a
+   `struct git_attr_check`. The answers given in that array are in
+   the the same order as the check struct.
 
 
 Attribute Values
@@ -32,7 +34,7 @@ Attribute Values
 
 An attribute for a path can be in one of four states: Set, Unset,
 Unspecified or set to a string, and `.value` member of `struct
-git_attr_check` records it.  There are three macros to check these:
+git_attr_result` records it.  There are three macros to check these:
 
 `ATTR_TRUE()`::
 
@@ -53,19 +55,28 @@ value of the attribute for the path.
 Querying Specific 

[PATCHv2 26/36] attr: make git_check_attr_counted static

2016-10-28 Thread Stefan Beller
It's not used outside the attr code, so let's keep it private.

Signed-off-by: Stefan Beller 
---
 attr.c | 4 ++--
 attr.h | 1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index 10f2042fbb..9f2da42e5f 100644
--- a/attr.c
+++ b/attr.c
@@ -896,8 +896,8 @@ void git_attr_set_direction(enum git_attr_direction new, 
struct index_state *ist
use_index = istate;
 }
 
-int git_check_attr_counted(const char *path, int pathlen,
-  struct git_attr_check *check)
+static int git_check_attr_counted(const char *path, int pathlen,
+ struct git_attr_check *check)
 {
check->finalized = 1;
return git_check_attrs(path, pathlen, check);
diff --git a/attr.h b/attr.h
index fe26f3a588..292d56fe0f 100644
--- a/attr.h
+++ b/attr.h
@@ -51,7 +51,6 @@ struct git_attr_check {
 
 extern struct git_attr_check *git_attr_check_initl(const char *, ...);
 extern int git_check_attr(const char *path, struct git_attr_check *);
-extern int git_check_attr_counted(const char *, int, struct git_attr_check *);
 
 extern struct git_attr_check *git_attr_check_alloc(void);
 extern struct git_attr_check_elem *git_attr_check_append(struct git_attr_check 
*, const struct git_attr *);
-- 
2.10.1.714.ge3da0db



[PATCHv2 20/36] attr.c: pass struct git_attr_check down the callchain

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

The callchain that starts from git_check_attrs() down to
collect_some_attrs() used to take an array of git_attr_check_elem
as their parameters.  Pass the enclosing git_attr_check instance
instead, so that they will have access to new fields we will add to
the data structure.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 36 
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/attr.c b/attr.c
index 0a8ed5b582..47ab59c157 100644
--- a/attr.c
+++ b/attr.c
@@ -755,14 +755,25 @@ static int attr_check_is_dynamic(const struct 
git_attr_check *check)
  * check_all_attr. If num is non-zero, only attributes in check[] are
  * collected. Otherwise all attributes are collected.
  */
-static void collect_some_attrs(const char *path, int pathlen, int num,
-  struct git_attr_check_elem *check)
+static void collect_some_attrs(const char *path, int pathlen,
+  struct git_attr_check *check)
 
 {
struct attr_stack *stk;
int i, rem, dirlen;
const char *cp, *last_slash = NULL;
int basename_offset;
+   int num;
+   struct git_attr_check_elem *celem;
+
+   if (!check) {
+   /* Yuck - ugly git_all_attrs() hack! */
+   celem = NULL;
+   num = 0;
+   } else {
+   celem = check->check;
+   num = check->check_nr;
+   }
 
for (cp = path; cp < path + pathlen; cp++) {
if (*cp == '/' && cp[1])
@@ -782,9 +793,9 @@ static void collect_some_attrs(const char *path, int 
pathlen, int num,
if (num && !cannot_trust_maybe_real) {
rem = 0;
for (i = 0; i < num; i++) {
-   if (!check[i].attr->maybe_real) {
+   if (!celem[i].attr->maybe_real) {
struct git_attr_check_elem *c;
-   c = check_all_attr + check[i].attr->attr_nr;
+   c = check_all_attr + celem[i].attr->attr_nr;
c->value = ATTR__UNSET;
rem++;
}
@@ -798,18 +809,19 @@ static void collect_some_attrs(const char *path, int 
pathlen, int num,
rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-static int git_check_attrs(const char *path, int pathlen, int num,
-  struct git_attr_check_elem *check)
+static int git_check_attrs(const char *path, int pathlen,
+  struct git_attr_check *check)
 {
int i;
+   struct git_attr_check_elem *celem = check->check;
 
-   collect_some_attrs(path, pathlen, num, check);
+   collect_some_attrs(path, pathlen, check);
 
-   for (i = 0; i < num; i++) {
-   const char *value = 
check_all_attr[check[i].attr->attr_nr].value;
+   for (i = 0; i < check->check_nr; i++) {
+   const char *value = 
check_all_attr[celem[i].attr->attr_nr].value;
if (value == ATTR__UNKNOWN)
value = ATTR__UNSET;
-   check[i].value = value;
+   celem[i].value = value;
}
 
return 0;
@@ -820,7 +832,7 @@ void git_all_attrs(const char *path, struct git_attr_check 
*check)
int i;
 
git_attr_check_clear(check);
-   collect_some_attrs(path, strlen(path), 0, NULL);
+   collect_some_attrs(path, strlen(path), NULL);
 
for (i = 0; i < attr_nr; i++) {
const char *name = check_all_attr[i].attr->name;
@@ -849,7 +861,7 @@ int git_check_attr_counted(const char *path, int pathlen,
   struct git_attr_check *check)
 {
check->finalized = 1;
-   return git_check_attrs(path, pathlen, check->check_nr, check->check);
+   return git_check_attrs(path, pathlen, check);
 }
 
 int git_check_attr(const char *path, struct git_attr_check *check)
-- 
2.10.1.714.ge3da0db



[PATCHv2 16/36] attr: add counted string version of git_attr()

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Often a potential caller has  pair that
represents the name it wants to create an attribute out of.

When name[namelen] is not NUL, the caller has to xmemdupz()
only to call git_attr().

Add git_attr_counted() that takes such a counted string instead of
"const char *name".

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 8 
 attr.h | 5 -
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/attr.c b/attr.c
index dc9d61b411..7058e1c9fa 100644
--- a/attr.c
+++ b/attr.c
@@ -78,7 +78,7 @@ static int invalid_attr_name(const char *name, int namelen)
return 0;
 }
 
-static struct git_attr *git_attr_internal(const char *name, int len)
+struct git_attr *git_attr_counted(const char *name, size_t len)
 {
unsigned hval = hash_name(name, len);
unsigned pos = hval % HASHSIZE;
@@ -109,7 +109,7 @@ static struct git_attr *git_attr_internal(const char *name, 
int len)
 
 struct git_attr *git_attr(const char *name)
 {
-   return git_attr_internal(name, strlen(name));
+   return git_attr_counted(name, strlen(name));
 }
 
 /* What does a matched pattern decide? */
@@ -199,7 +199,7 @@ static const char *parse_attr(const char *src, int lineno, 
const char *cp,
else {
e->setto = xmemdupz(equals + 1, ep - equals - 1);
}
-   e->attr = git_attr_internal(cp, len);
+   e->attr = git_attr_counted(cp, len);
}
return ep + strspn(ep, blank);
 }
@@ -254,7 +254,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
  sizeof(struct attr_state) * num_attr +
  (is_macro ? 0 : namelen + 1));
if (is_macro) {
-   res->u.attr = git_attr_internal(name, namelen);
+   res->u.attr = git_attr_counted(name, namelen);
res->u.attr->maybe_macro = 1;
} else {
char *p = (char *)&(res->state[num_attr]);
diff --git a/attr.h b/attr.h
index c84f164b8e..bcedf928b5 100644
--- a/attr.h
+++ b/attr.h
@@ -8,7 +8,10 @@ struct git_attr;
  * Given a string, return the gitattribute object that
  * corresponds to it.
  */
-struct git_attr *git_attr(const char *);
+extern struct git_attr *git_attr(const char *);
+
+/* The same, but with counted string */
+extern struct git_attr *git_attr_counted(const char *, size_t);
 
 /* Internal use */
 extern const char git_attr__true[];
-- 
2.10.1.714.ge3da0db



[PATCHv2 21/36] attr.c: rename a local variable check

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Throughout this series, we are trying to use "check" to name an
instance of "git_attr_check" structure; let's rename a "check" that
refers to an array whose elements are git_attr_check_elem to avoid
confusion.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/attr.c b/attr.c
index 47ab59c157..c85685da29 100644
--- a/attr.c
+++ b/attr.c
@@ -686,12 +686,12 @@ static int macroexpand_one(int attr_nr, int rem);
 
 static int fill_one(const char *what, struct match_attr *a, int rem)
 {
-   struct git_attr_check_elem *check = check_all_attr;
+   struct git_attr_check_elem *celem = check_all_attr;
int i;
 
for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
struct git_attr *attr = a->state[i].attr;
-   const char **n = &(check[attr->attr_nr].value);
+   const char **n = &(celem[attr->attr_nr].value);
const char *v = a->state[i].setto;
 
if (*n == ATTR__UNKNOWN) {
-- 
2.10.1.714.ge3da0db



[PATCHv2 08/36] attr.c: tighten constness around "git_attr" structure

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

It holds an interned string, and git_attr_name() is a way to peek
into it.  Make sure the involved pointer types are pointer-to-const.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 2 +-
 attr.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index e42f931b35..f7cf7ae306 100644
--- a/attr.c
+++ b/attr.c
@@ -43,7 +43,7 @@ static int cannot_trust_maybe_real;
 static struct git_attr_check *check_all_attr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
 
-char *git_attr_name(struct git_attr *attr)
+const char *git_attr_name(const struct git_attr *attr)
 {
return attr->name;
 }
diff --git a/attr.h b/attr.h
index 8b08d33af8..00d7a662c9 100644
--- a/attr.h
+++ b/attr.h
@@ -25,7 +25,7 @@ extern const char git_attr__false[];
  * Unset one is returned as NULL.
  */
 struct git_attr_check {
-   struct git_attr *attr;
+   const struct git_attr *attr;
const char *value;
 };
 
@@ -34,7 +34,7 @@ struct git_attr_check {
  * return value is a pointer to a null-delimited string that is part
  * of the internal data structure; it should not be modified or freed.
  */
-char *git_attr_name(struct git_attr *);
+extern const char *git_attr_name(const struct git_attr *);
 
 int git_check_attr(const char *path, int, struct git_attr_check *);
 
-- 
2.10.1.714.ge3da0db



[PATCHv2 32/36] pathspec: allow querying for attributes

2016-10-28 Thread Stefan Beller
The pathspec mechanism is extended via the new
":(attr:eol=input)pattern/to/match" syntax to filter paths so that it
requires paths to not just match the given pattern but also have the
specified attrs attached for them to be chosen.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/glossary-content.txt |  20 +
 dir.c  |  35 
 pathspec.c | 105 +-
 pathspec.h |  15 
 t/t6134-pathspec-with-labels.sh| 175 +
 5 files changed, 346 insertions(+), 4 deletions(-)
 create mode 100755 t/t6134-pathspec-with-labels.sh

diff --git a/Documentation/glossary-content.txt 
b/Documentation/glossary-content.txt
index 8ad29e61a9..f90bd45395 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -384,6 +384,26 @@ full pathname may have special meaning:
 +
 Glob magic is incompatible with literal magic.
 
+attr;;
+After `attr:` comes a space separated list of "attribute
+requirements", all of which must be met in order for the
+path to be considered a match; this is in addition to the
+usual non-magic pathspec pattern matching.
++
+Each of the attribute requirements for the path takes one of
+these forms:
+
+- "`ATTR`" requires that the attribute `ATTR` must be set.
+
+- "`-ATTR`" requires that the attribute `ATTR` must be unset.
+
+- "`ATTR=VALUE`" requires that the attribute `ATTR` must be
+  set to the string `VALUE`.
+
+- "`!ATTR`" requires that the attribute `ATTR` must be
+  unspecified.
++
+
 exclude;;
After a path matches any non-exclude pathspec, it will be run
through all exclude pathspec (magic signature: `!`). If it
diff --git a/dir.c b/dir.c
index bfa8c8a9a5..7d2c5c0925 100644
--- a/dir.c
+++ b/dir.c
@@ -9,6 +9,7 @@
  */
 #include "cache.h"
 #include "dir.h"
+#include "attr.h"
 #include "refs.h"
 #include "wildmatch.h"
 #include "pathspec.h"
@@ -207,6 +208,37 @@ int within_depth(const char *name, int namelen,
return 1;
 }
 
+static int match_attrs(const char *name, int namelen,
+  const struct pathspec_item *item)
+{
+   int i;
+   struct git_attr_result *res = git_attr_result_alloc(item->attr_check);
+
+   git_check_attr(name, item->attr_check, res);
+   for (i = 0; i < item->attr_match_nr; i++) {
+   const char *value;
+   int matched;
+   enum attr_match_mode match_mode;
+
+   value = res[i].value;
+   match_mode = item->attr_match[i].match_mode;
+
+   if (ATTR_TRUE(value))
+   matched = (match_mode == MATCH_SET);
+   else if (ATTR_FALSE(value))
+   matched = (match_mode == MATCH_UNSET);
+   else if (ATTR_UNSET(value))
+   matched = (match_mode == MATCH_UNSPECIFIED);
+   else
+   matched = (match_mode == MATCH_VALUE &&
+  !strcmp(item->attr_match[i].value, value));
+   if (!matched)
+   return 0;
+   }
+
+   return 1;
+}
+
 #define DO_MATCH_EXCLUDE   (1<<0)
 #define DO_MATCH_DIRECTORY (1<<1)
 #define DO_MATCH_SUBMODULE (1<<2)
@@ -263,6 +295,9 @@ static int match_pathspec_item(const struct pathspec_item 
*item, int prefix,
strncmp(item->match, name - prefix, item->prefix))
return 0;
 
+   if (item->attr_match_nr && !match_attrs(name, namelen, item))
+   return 0;
+
/* If the match was just the prefix, we matched */
if (!*match)
return MATCHED_RECURSIVELY;
diff --git a/pathspec.c b/pathspec.c
index 83b46c1910..478449eec5 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -1,6 +1,8 @@
 #include "cache.h"
 #include "dir.h"
 #include "pathspec.h"
+#include "attr.h"
+#include "argv-array.h"
 
 /*
  * Finds which of the given pathspecs match items in the index.
@@ -88,12 +90,79 @@ static void prefix_short_magic(struct strbuf *sb, int 
prefixlen,
strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
+static void parse_pathspec_attr_match(struct pathspec_item *item, const char 
*value)
+{
+   struct string_list_item *si;
+   struct string_list list = STRING_LIST_INIT_DUP;
+   struct argv_array attrs = ARGV_ARRAY_INIT;
+
+
+   if (!value || !strlen(value))
+   die(_("attr spec must not be empty"));
+
+   string_list_split(, value, ' ', -1);
+   string_list_remove_empty_items(, 0);
+
+   ALLOC_GROW(item->attr_match, item->attr_match_nr + list.nr, 
item->attr_match_alloc);
+
+   for_each_string_list_item(si, ) {
+   size_t attr_len;
+
+   int j = item->attr_match_nr++;
+   const char *attr = si->string;
+   struct attr_match *am = >attr_match[j];
+
+   switch (*attr) {
+   

[PATCHv2 10/36] attr: rename function and struct related to checking attributes

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

The traditional API to check attributes is to prepare an N-element
array of "struct git_attr_check" and pass N and the array to the
function "git_check_attr()" as arguments.

In preparation to revamp the API to pass a single structure, in
which these N elements are held, rename the type used for these
individual array elements to "struct git_attr_check_elem" and rename
the function to "git_check_attrs()".

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 archive.c  |  6 +++---
 attr.c | 12 ++--
 attr.h |  8 
 builtin/check-attr.c   | 19 ++-
 builtin/pack-objects.c |  6 +++---
 convert.c  | 12 ++--
 ll-merge.c | 10 +-
 userdiff.c |  4 ++--
 ws.c   |  6 +++---
 9 files changed, 42 insertions(+), 41 deletions(-)

diff --git a/archive.c b/archive.c
index dde1ab4c79..2dc8d6ca57 100644
--- a/archive.c
+++ b/archive.c
@@ -87,7 +87,7 @@ void *sha1_file_to_archive(const struct archiver_args *args,
return buffer;
 }
 
-static void setup_archive_check(struct git_attr_check *check)
+static void setup_archive_check(struct git_attr_check_elem *check)
 {
static struct git_attr *attr_export_ignore;
static struct git_attr *attr_export_subst;
@@ -123,7 +123,7 @@ static int write_archive_entry(const unsigned char *sha1, 
const char *base,
struct archiver_context *c = context;
struct archiver_args *args = c->args;
write_archive_entry_fn_t write_entry = c->write_entry;
-   struct git_attr_check check[2];
+   struct git_attr_check_elem check[2];
const char *path_without_prefix;
int err;
 
@@ -138,7 +138,7 @@ static int write_archive_entry(const unsigned char *sha1, 
const char *base,
path_without_prefix = path.buf + args->baselen;
 
setup_archive_check(check);
-   if (!git_check_attr(path_without_prefix, ARRAY_SIZE(check), check)) {
+   if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
if (ATTR_TRUE(check[0].value))
return 0;
args->convert = ATTR_TRUE(check[1].value);
diff --git a/attr.c b/attr.c
index d180c7833e..ff7f0a59eb 100644
--- a/attr.c
+++ b/attr.c
@@ -40,7 +40,7 @@ struct git_attr {
 static int attr_nr;
 static int cannot_trust_maybe_real;
 
-static struct git_attr_check *check_all_attr;
+static struct git_attr_check_elem *check_all_attr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
 
 const char *git_attr_name(const struct git_attr *attr)
@@ -669,7 +669,7 @@ static int macroexpand_one(int attr_nr, int rem);
 
 static int fill_one(const char *what, struct match_attr *a, int rem)
 {
-   struct git_attr_check *check = check_all_attr;
+   struct git_attr_check_elem *check = check_all_attr;
int i;
 
for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
@@ -734,7 +734,7 @@ static int macroexpand_one(int nr, int rem)
  * collected. Otherwise all attributes are collected.
  */
 static void collect_some_attrs(const char *path, int num,
-  struct git_attr_check *check)
+  struct git_attr_check_elem *check)
 
 {
struct attr_stack *stk;
@@ -762,7 +762,7 @@ static void collect_some_attrs(const char *path, int num,
rem = 0;
for (i = 0; i < num; i++) {
if (!check[i].attr->maybe_real) {
-   struct git_attr_check *c;
+   struct git_attr_check_elem *c;
c = check_all_attr + check[i].attr->attr_nr;
c->value = ATTR__UNSET;
rem++;
@@ -777,7 +777,7 @@ static void collect_some_attrs(const char *path, int num,
rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-int git_check_attr(const char *path, int num, struct git_attr_check *check)
+int git_check_attrs(const char *path, int num, struct git_attr_check_elem 
*check)
 {
int i;
 
@@ -793,7 +793,7 @@ int git_check_attr(const char *path, int num, struct 
git_attr_check *check)
return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
+int git_all_attrs(const char *path, int *num, struct git_attr_check_elem 
**check)
 {
int i, count, j;
 
diff --git a/attr.h b/attr.h
index 00d7a662c9..dd3c4a3aef 100644
--- a/attr.h
+++ b/attr.h
@@ -20,11 +20,11 @@ extern const char git_attr__false[];
 #define ATTR_UNSET(v) ((v) == NULL)
 
 /*
- * Send one or more git_attr_check to git_check_attr(), and
+ * Send one or more git_attr_check to git_check_attrs(), and
  * each 'value' member tells what its value is.
  * Unset one is returned as NULL.
  */
-struct git_attr_check {
+struct git_attr_check_elem {
const struct 

[PATCHv2 23/36] attr.c: introduce empty_attr_check_elems()

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

One codepath needs to just empty the git_attr_check_elem array in
the git_attr_check structure, without releasing the entire resource.
Introduce a helper to do so and rewrite git_attr_check_clear() using
it.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/attr.c b/attr.c
index 94476de3d2..1098300e54 100644
--- a/attr.c
+++ b/attr.c
@@ -750,6 +750,14 @@ static int attr_check_is_dynamic(const struct 
git_attr_check *check)
return (void *)(check->check) != (void *)(check + 1);
 }
 
+static void empty_attr_check_elems(struct git_attr_check *check)
+{
+   if (!attr_check_is_dynamic(check))
+   die("BUG: emptying a statically initialized git_attr_check");
+   check->check_nr = 0;
+   check->finalized = 0;
+}
+
 /*
  * Collect attributes for path into the array pointed to by
  * check_all_attr.  If check is not NULL, only attributes in
@@ -916,12 +924,11 @@ struct git_attr_check_elem *git_attr_check_append(struct 
git_attr_check *check,
 
 void git_attr_check_clear(struct git_attr_check *check)
 {
+   empty_attr_check_elems(check);
if (!attr_check_is_dynamic(check))
die("BUG: clearing a statically initialized git_attr_check");
free(check->check);
-   check->check_nr = 0;
check->check_alloc = 0;
-   check->finalized = 0;
 }
 
 void git_attr_check_free(struct git_attr_check *check)
-- 
2.10.1.714.ge3da0db



[PATCHv2 03/36] attr.c: update a stale comment on "struct match_attr"

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

When 82dce998 (attr: more matching optimizations from .gitignore,
2012-10-15) changed a pointer to a string "*pattern" into an
embedded "struct pattern" in struct match_attr, it forgot to update
the comment that describes the structure.

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index 04d24334e8..007f1a2995 100644
--- a/attr.c
+++ b/attr.c
@@ -131,9 +131,8 @@ struct pattern {
  * If is_macro is true, then u.attr is a pointer to the git_attr being
  * defined.
  *
- * If is_macro is false, then u.pattern points at the filename pattern
- * to which the rule applies.  (The memory pointed to is part of the
- * memory block allocated for the match_attr instance.)
+ * If is_macro is false, then u.pat is the filename pattern to which the
+ * rule applies.
  *
  * In either case, num_attr is the number of attributes affected by
  * this rule, and state is an array listing them.  The attributes are
-- 
2.10.1.714.ge3da0db



[PATCHv2 05/36] attr.c: complete a sentence in a comment

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 6b55a57ef7..9bdf87a6fe 100644
--- a/attr.c
+++ b/attr.c
@@ -300,7 +300,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
  * directory (again, reading the file from top to bottom) down to the
  * current directory, and then scan the list backwards to find the first match.
  * This is exactly the same as what is_excluded() does in dir.c to deal with
- * .gitignore
+ * .gitignore file and info/excludes file as a fallback.
  */
 
 static struct attr_stack {
-- 
2.10.1.714.ge3da0db



[PATCHv2 06/36] attr.c: mark where #if DEBUG ends more clearly

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 9bdf87a6fe..17297fffee 100644
--- a/attr.c
+++ b/attr.c
@@ -469,7 +469,7 @@ static void debug_set(const char *what, const char *match, 
struct git_attr *attr
 #define debug_push(a) do { ; } while (0)
 #define debug_pop(a) do { ; } while (0)
 #define debug_set(a,b,c,d) do { ; } while (0)
-#endif
+#endif /* DEBUG_ATTR */
 
 static void drop_attr_stack(void)
 {
-- 
2.10.1.714.ge3da0db



[PATCHv2 02/36] attr.c: use strchrnul() to scan for one line

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/attr.c b/attr.c
index 1fcf042b87..04d24334e8 100644
--- a/attr.c
+++ b/attr.c
@@ -402,8 +402,8 @@ static struct attr_stack *read_attr_from_index(const char 
*path, int macro_ok)
for (sp = buf; *sp; ) {
char *ep;
int more;
-   for (ep = sp; *ep && *ep != '\n'; ep++)
-   ;
+
+   ep = strchrnul(sp, '\n');
more = (*ep == '\n');
*ep = '\0';
handle_attr_line(res, sp, path, ++lineno, macro_ok);
-- 
2.10.1.714.ge3da0db



[PATCHv2 01/36] commit.c: use strchrnul() to scan for one line

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 commit.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index 856fd4aeef..41b2fdd335 100644
--- a/commit.c
+++ b/commit.c
@@ -415,8 +415,7 @@ int find_commit_subject(const char *commit_buffer, const 
char **subject)
p++;
if (*p) {
p = skip_blank_lines(p + 2);
-   for (eol = p; *eol && *eol != '\n'; eol++)
-   ; /* do nothing */
+   eol = strchrnul(p, '\n');
} else
eol = p;
 
-- 
2.10.1.714.ge3da0db



[PATCHv2 04/36] attr.c: explain the lack of attr-name syntax check in parse_attr()

2016-10-28 Thread Stefan Beller
From: Junio C Hamano 

Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 attr.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/attr.c b/attr.c
index 007f1a2995..6b55a57ef7 100644
--- a/attr.c
+++ b/attr.c
@@ -183,6 +183,12 @@ static const char *parse_attr(const char *src, int lineno, 
const char *cp,
return NULL;
}
} else {
+   /*
+* As this function is always called twice, once with
+* e == NULL in the first pass and then e != NULL in
+* the second pass, no need for invalid_attr_name()
+* check here.
+*/
if (*cp == '-' || *cp == '!') {
e->setto = (*cp == '-') ? ATTR__FALSE : ATTR__UNSET;
cp++;
-- 
2.10.1.714.ge3da0db



[PATCHv2 00/36] Revamp the attr subsystem!

2016-10-28 Thread Stefan Beller
previous discussion at 
https://public-inbox.org/git/20161022233225.8883-1-sbel...@google.com

This implements the discarded series':
jc/attr
jc/attr-more
sb/pathspec-label
sb/submodule-default-paths

This includes
* The fixes for windows
* Junios latest suggestion to use git_attr_check_initv instead of
  alloc/append.

* I implemented the thread safe attr API in patch 27 (attr: convert to new 
threadsafe API)
* patch 28 (attr: keep attr stack for each check) makes it actually possible
  to run in a multithreaded environment.
* I added a test for the multithreaded when it is introduced in patch 32
  (pathspec: allow querying for attributes) as well as a test to disallow
  multiple "attr"s in a pathspec.

Thanks,
Stefan

Junio C Hamano (24):
  commit.c: use strchrnul() to scan for one line
  attr.c: use strchrnul() to scan for one line
  attr.c: update a stale comment on "struct match_attr"
  attr.c: explain the lack of attr-name syntax check in parse_attr()
  attr.c: complete a sentence in a comment
  attr.c: mark where #if DEBUG ends more clearly
  attr.c: simplify macroexpand_one()
  attr.c: tighten constness around "git_attr" structure
  attr.c: plug small leak in parse_attr_line()
  attr: rename function and struct related to checking attributes
  attr: (re)introduce git_check_attr() and struct git_attr_check
  attr: convert git_all_attrs() to use "struct git_attr_check"
  attr: convert git_check_attrs() callers to use the new API
  attr: retire git_check_attrs() API
  attr: add counted string version of git_check_attr()
  attr: add counted string version of git_attr()
  attr: expose validity check for attribute names
  attr.c: add push_stack() helper
  attr.c: pass struct git_attr_check down the callchain
  attr.c: rename a local variable check
  attr.c: correct ugly hack for git_all_attrs()
  attr.c: introduce empty_attr_check_elems()
  attr.c: always pass check[] to collect_some_attrs()
  attr.c: outline the future plans by heavily commenting

Nguyễn Thái Ngọc Duy (1):
  attr: support quoting pathname patterns in C style

Stefan Beller (11):
  attr: make git_check_attr_counted static
  attr: convert to new threadsafe API
  attr: keep attr stack for each check
  Documentation: fix a typo
  pathspec: move long magic parsing out of prefix_pathspec
  pathspec: move prefix check out of the inner loop
  pathspec: allow querying for attributes
  pathspec: allow escaped query values
  submodule update: add `--init-default-path` switch
  clone: add --init-submodule= switch
  completion: clone can initialize specific submodules

 Documentation/config.txt  |   5 +
 Documentation/git-clone.txt   |  23 +-
 Documentation/git-submodule.txt   |  17 +-
 Documentation/gitattributes.txt   |  10 +-
 Documentation/glossary-content.txt|  20 +
 Documentation/technical/api-gitattributes.txt | 140 ---
 archive.c |  25 +-
 attr.c| 525 ++
 attr.h|  71 ++--
 builtin/check-attr.c  |  74 ++--
 builtin/clone.c   |  36 +-
 builtin/pack-objects.c|  27 +-
 commit.c  |   3 +-
 compat/mingw.c|   4 +
 contrib/completion/git-completion.bash|   1 +
 convert.c |  46 ++-
 dir.c |  35 ++
 git-submodule.sh  |  21 +-
 hashmap.h |   2 +
 ll-merge.c|  35 +-
 pathspec.c| 227 +--
 pathspec.h|  15 +
 t/t0003-attributes.sh |  26 ++
 t/t6134-pathspec-with-labels.sh   | 185 +
 t/t7400-submodule-basic.sh| 134 +++
 userdiff.c|  21 +-
 ws.c  |  19 +-
 27 files changed, 1336 insertions(+), 411 deletions(-)
 create mode 100755 t/t6134-pathspec-with-labels.sh

-- 
2.10.1.714.ge3da0db



Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-28 Thread Jacob Keller
On Fri, Oct 28, 2016 at 6:01 AM, Philip Oakley  wrote:
> From: "Johannes Sixt" 
>>
>>
>> One point is that the DCLP idiom must be implemented correctly. There are
>> solutions, of course, and when the initialization is over, we have a
>> miniscule overhead at each pthread_mutex_lock call.
>>
>
> I had to look up DCLP ( = Double Checked Locking Patterns), and found a good
> write up on the issues..
>
> http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf "C++ and the
> Perils of Double-Checked Locking", which include 'C' issues, and
> multi-thread, multi-processor issues. Not an easy issue when fighting
> optimisers..
>
> --
>
> Philip


Yep, this is why we have memory barriers. Ofcourse languages like C
don't really allow you to express them in the language and we restore
to various platform specific methods.

Thanks,
Jake


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-28 Thread Jacob Keller
On Fri, Oct 28, 2016 at 4:58 AM, Johannes Schindelin
 wrote:
> Hi Jake,
>
> On Thu, 27 Oct 2016, Jacob Keller wrote:
>
>> I agree with Stefan that there isn't really a great place to put a
>> dynamic initialization.
>
> Ummm. Wait. What???
>
> https://github.com/git/git/blob/v2.10.1/common-main.c#L25-L41
>
> Ciao,
> Johannes

I think Stefan has since provided a better suggestion of isolating the
dynamic initialization into the MINGW startup section. However, you
are right either place works, though I think platforms which support
static initialization should use it.

Thanks,
Jake


[PATCH] pre-receive.sample: Mark executable

2016-10-28 Thread Anders Kaseorg
For consistency with other hooks, allow the hook to be activated by
renaming pre-receive.sample to pre-receive without a separate step to
mark it executable.

Signed-off-by: Anders Kaseorg 
---
 templates/hooks--pre-receive.sample | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 mode change 100644 => 100755 templates/hooks--pre-receive.sample

diff --git a/templates/hooks--pre-receive.sample 
b/templates/hooks--pre-receive.sample
old mode 100644
new mode 100755
-- 
2.10.1



Re: [PATCH] attr: convert to new threadsafe API

2016-10-28 Thread Johannes Sixt

Am 28.10.2016 um 00:15 schrieb Stefan Beller:

* use attr_start on Windows to dynamically initialize the Single Big Attr Mutex


I'm sorry, I didn't find time to test the patch on Windows. I won't be 
back at my Windows box before Wednesday.


-- Hannes



Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-28 Thread Junio C Hamano
Linus Torvalds  writes:

> On Fri, Oct 28, 2016 at 9:48 AM, Junio C Hamano  wrote:
>>
>> Setting FD_CLOEXEC with fcntl(2) may be racy, but in what way
>> it is broken to open(2) with O_CLOEXEC?
>
> Apparently windows doesn't even support it, at least not mingw. So
> you'll have to have some supprt for just explicitly closing things at
> execve anyway. And if you do that for windows, then what's the point
> of using O_CLOEXEC on Linux, and having two totally different paths
> that will just mean maintenance headache.

Ah, that's where your reaction comes from.  If I understood Dscho
correctly, what he said was that I cannot set FD_CLOEXEC via fcntl()
emulation they have.  It wasn't an objection to O_CLOEXEC.  In fact,
since 05d1ed6148 ("mingw: ensure temporary file handles are not
inherited by child processes", 2016-08-22), we have been relying on
open(2) with O_CLOEXEC even on Windows (by emulating it with
O_NOINHERIT, which Windows has) on some codepaths.




Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-28 Thread Linus Torvalds
On Fri, Oct 28, 2016 at 9:48 AM, Junio C Hamano  wrote:
>
> Excuse me, but care to elaborate a bit more?

It just seems entirely pointless to change the games with O_xyz.

> Setting FD_CLOEXEC with fcntl(2) may be racy, but in what way
> it is broken to open(2) with O_CLOEXEC?

Apparently windows doesn't even support it, at least not mingw. So
you'll have to have some supprt for just explicitly closing things at
execve anyway. And if you do that for windows, then what's the point
of using O_CLOEXEC on Linux, and having two totally different paths
that will just mean maintenance headache.

In contrast, O_NOATIME isn't a maintenance problem, since it's purely
an optimization and has no semantic difference, so it not existing on
some platform is immaterial.

End result: leave O_NOATIME as it is (working), and just forget about O_CLOEXEC.

I have no actual idea about mingw support for this, but Johannes'
email certainly implies it's painful on Windows. Some googling also
shows people doing things like

  #ifdef O_CLOEXEC
  flags |= O_CLOEXEC;
  #endif

or

  #ifndef O_CLOEXEC
  #define O_CLOEXEC 0
  #endif

so O_CLOEXEC definitely isn't considered portable. Do you really want
to rely on it?

  Linus


Re: [PATCH] attr: convert to new threadsafe API

2016-10-28 Thread Junio C Hamano
Junio C Hamano  writes:

>> +if (check)
>> +return; /* already done */
>>  check = git_attr_check_alloc();
>>  while (*argv) {
>>  struct git_attr *attr = git_attr(*argv);
>>  git_attr_check_append(check, attr);

I thought you made git_attr() constructor unavailable, so
check_append() would just get a "const char *" instead?

>>  argv++;
>>  }
>> +struct git_attr_result *result = git_attr_result_alloc(check);
>
> This does not look like thread-safe.
>
> I could understand it if the calling convention were like this,
> though:
>
>   if (git_attr_check_alloc()) {
>   while (*argv) {
>   ... append ...
>   }
>   git_attr_check_finished_appending();
>   }
>   result = result_alloc();
>
> In this variant, git_attr_check_alloc() is responsible for ensuring
> that the "check" is allocated only once just like _initl() is, and
> at the same time, it makes simultanous callers wait until the first
> caller who appends to the singleton check instance declares that it
> finished appending.  The return value signals if you are the first
> caller (who is responsible for populating the check and for
> declaring the check is ready to use at the end of appending).
> Everybody else waits while the first caller is doing the if (...) {
> } thing, and then receives false, at which time everybody (including
> the first caller) goes on and allocating its own result and start
> making queries.

Having said that, how flexible does the "alloc then append" side of
API have to be in the envisioned set of callers?  Is it expected
that it wouldn't be too hard to arrange them so that they have an
array of "const char *"s when they need to initialize/populate a
check?  If that is the case, instead of the "alloc and have others
wait" illustrated above, it may be a lot simpler to give _initv()
variant to them and be done with it, i.e. the above sample caller
would become just:

git_attr_check_initv(, argv);
result = git_attr_result_alloc();

would that be too limiting?  The use of "alloc then append" pattern
in "git check-attr" done in your patch seems to fit the _initv()
well, and the "pathspec with attr match" that appears later in the
series also has all the attributes it needs to query available by
the time it wants to allocate and append to create an instance of
"check", I would think, so the _initv() might be sufficient as a
public API.


Re: [RFC PATCH 0/5] recursively grep across submodules

2016-10-28 Thread Junio C Hamano
Brandon Williams  writes:

>> Just a few brief comments, before reading the patches carefully.
>> 
>>  * It is somewhat surprising that [1/5] is even needed (in other
>>words, I would have expected something like this to be already
>>there, and my knee-jerk reaction was "Heh, how does 'git status'
>>know how to show submodules that are and are not initialized
>>differently without this?"  
>
> Yeah I was also surprised to find that this kind of functionality didn't
> already exist.  Though I guess there are still many builtin's that don't
> play nice with submodules so maybe this kind of functionality just
> wasn't needed until now.
>
>>  * It is somewhat surprising that [4/5] does not even use the
>>previous ls-files to find out the paths.
>
> The first attempt I made at this series used ls-files to produce a list
> of files which was then fed to the grep machinery.  The problem I found
> with this approach was when I started moving to work on grepping
> history, at that point it seemed to make more sense to have a process
> for each submodule.

Yup, that makes sense to me.



Re: [PATCH] attr: convert to new threadsafe API

2016-10-28 Thread Junio C Hamano
Stefan Beller  writes:

> +* Prepare a `struct git_attr_check` using `git_attr_check_initl()`
>function, enumerating the names of attributes whose values you are
>interested in, terminated with a NULL pointer.  Alternatively, an
> -  empty `struct git_attr_check` can be prepared by calling
> -  `git_attr_check_alloc()` function and then attributes you want to
> -  ask about can be added to it with `git_attr_check_append()`
> -  function.
> -
> -* Call `git_check_attr()` to check the attributes for the path.
> -
> -* Inspect `git_attr_check` structure to see how each of the
> -  attribute in the array is defined for the path.
> -
> +  empty `struct git_attr_check` as allocated by git_attr_check_alloc()

Need to drop "as allocated by git_attr_check_alloc()" here.

> +  can be prepared by calling `git_attr_check_alloc()` function and
> +  then attributes you want to ask about can be added to it with
> +  `git_attr_check_append()` function.
> +  Both ways with `git_attr_check_initl()` as well as the
> +  alloc and append route are thread safe, i.e. you can call it
> +  from different threads at the same time; when check determines
> +  the initialization is still needed, the threads will use a
> +  single global mutex to perform the initialization just once, the
> +  others will wait on the the thread to actually perform the
> +  initialization.

I have some comments on the example in the doc on the "alloc-append"
side.  _initl() side looks OK.

> + static struct git_attr_check *check;
> + git_attr_check_initl(, "crlf", "ident", NULL);

OK.

>   const char *path;
> + struct git_attr_result result[2];
>  
> + git_check_attr(path, check, result);

OK.  The above two may be easier to understand if they were a single
example, though.

> +. Act on `result.value[]`:
>  
>  
> - const char *value = check->check[0].value;
> + const char *value = result.value[0];

OK.

> @@ -123,12 +135,15 @@ the first step in the above would be different.
>  static struct git_attr_check *check;
>  static void setup_check(const char **argv)
>  {
> + if (check)
> + return; /* already done */
>   check = git_attr_check_alloc();
>   while (*argv) {
>   struct git_attr *attr = git_attr(*argv);
>   git_attr_check_append(check, attr);
>   argv++;
>   }
> + struct git_attr_result *result = git_attr_result_alloc(check);

This does not look like thread-safe.

I could understand it if the calling convention were like this,
though:

if (git_attr_check_alloc()) {
while (*argv) {
... append ...
}
git_attr_check_finished_appending();
}
result = result_alloc();

In this variant, git_attr_check_alloc() is responsible for ensuring
that the "check" is allocated only once just like _initl() is, and
at the same time, it makes simultanous callers wait until the first
caller who appends to the singleton check instance declares that it
finished appending.  The return value signals if you are the first
caller (who is responsible for populating the check and for
declaring the check is ready to use at the end of appending).
Everybody else waits while the first caller is doing the if (...) {
} thing, and then receives false, at which time everybody (including
the first caller) goes on and allocating its own result and start
making queries.

> +* Setup a local variables for the question
> +  `struct git_attr_check` as well as a pointer where the result
> +  `struct git_attr_result` will be stored. Both should be initialized
> +  to NULL.
> +
> +
> +  struct git_attr_check *check = NULL;
> +  struct git_attr_result *result = NULL;
> +
> +
> +* Call `git_all_attrs()`.
>  
> +
> +  git_all_attrs(full_path, , );
> +

OK.

Thanks.


Re: [RFC PATCH 0/5] recursively grep across submodules

2016-10-28 Thread Brandon Williams
On 10/27, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > As for the rest of the series, it should be ready for review or comments.
> 
> Just a few brief comments, before reading the patches carefully.
> 
>  * It is somewhat surprising that [1/5] is even needed (in other
>words, I would have expected something like this to be already
>there, and my knee-jerk reaction was "Heh, how does 'git status'
>know how to show submodules that are and are not initialized
>differently without this?"  

Yeah I was also surprised to find that this kind of functionality didn't
already exist.  Though I guess there are still many builtin's that don't
play nice with submodules so maybe this kind of functionality just
wasn't needed until now.

>  * It is somewhat surprising that [4/5] does not even use the
>previous ls-files to find out the paths.

The first attempt I made at this series used ls-files to produce a list
of files which was then fed to the grep machinery.  The problem I found
with this approach was when I started moving to work on grepping
history, at that point it seemed to make more sense to have a process
for each submodule.

-- 
Brandon Williams


Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-28 Thread Junio C Hamano
Linus Torvalds  writes:

> On Fri, Oct 28, 2016 at 4:11 AM, Johannes Schindelin
>  wrote:
>>
>> You guys. I mean: You guys! You sure make my life hard. A brief look at
>> mingw.h could have answered your implicit question:
>
> So here's what you guys should do:
>
>  - leave O_NOATIME damn well alone. It works. It has worked for 10+
> years. Stop arguing against it, people who do.
>
>  - get rid of all O_CLOEXEC games. They don't work. If you want to
> close file descriptors at execve(), you - gasp - close the file
> descriptor before doing an execve.
>
> So O_CLOEXEC or FD_CLOEXEC is broken.
>
> DO NOT BREAK O_NOATIME JUST TO ADD COMPLETELY NEW BREAKAGE.

Excuse me, but care to elaborate a bit more?

Did I botch the way I ask fcntl(2) to set O_NOATIME in *1*?  Its
follow-up in *2* is optional and as peff said in *3*, I do not think
we mind dropping that step and keeping O_NOATIME.  It is not that
much code to keep.

Setting FD_CLOEXEC with fcntl(2) may be racy, but in what way
it is broken to open(2) with O_CLOEXEC?

If we want to close file descriptors we opened at execve() time
ourselves, we'd need to somehow keep track of which file descriptors
are meant to be closed upon execve() time, and one way to mark that
may be to use O_CLOEXEC, but once we mark a file descriptor with the
bit, execve() would take care of "closing" it for us---wouldn't that
be the whole point of that bit?


*1* http://public-inbox.org/git/xmqqh97w38gj@gitster.mtv.corp.google.com
*2* http://public-inbox.org/git/xmqqd1ik38f4@gitster.mtv.corp.google.com
*3* 
http://public-inbox.org/git/20161028075104.la24zydnr3ogb...@sigill.intra.peff.net


Re: [RFC PATCH 0/5] recursively grep across submodules

2016-10-28 Thread Philip Oakley

From: "Junio C Hamano" 


I hate it when people become overly defensive and start making
excuses when given harmless observations.



Hi Junio,

It can sometimes be difficult for readers to appreciate which way comments 
are meant to be interpreted, especially as one cannot usually 'see' the 
issue being raised with one's personal work, no matter who writes them. I 
too have to supportively review the work of others (as a volunteer), who 
then don't always respond or understand, as hoped, and it can be 
frustrating.


It can be very hard to write a reasonable write up that gets the balance 
between being on the one hand patronising (e.g. over-explained) and on the 
other too terse, and yet still not be too nuanced that the points are 
missed. The responder has the similar problem, especially if they have 
misunderstood the comment, and then end up just end up digging the hole 
deeper by over-explaining their position. Extricating the discussion from 
the trap can be tricky.


Thank you for your reviews.
--
Philip 



Re: feature request

2016-10-28 Thread Philip Oakley

From: "David Lang" 

On Thu, 27 Oct 2016, John Rood wrote:


Thanks, I think changing the default for windows is a good idea.


notepad doesn't work well with unix line endings, wordpad handles the 
files much more cleanly.


David Lang



Notepad++ does work well, but isn't a standard part of Windows.

[core]
editor = 'C:/Program 
Files/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noplugin


.. is one of the standard StackOverflow recipes.
--
Philip 



Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-28 Thread Philip Oakley

From: "Johannes Sixt" 


One point is that the DCLP idiom must be implemented correctly. There are 
solutions, of course, and when the initialization is over, we have a 
miniscule overhead at each pthread_mutex_lock call.




I had to look up DCLP ( = Double Checked Locking Patterns), and found a good 
write up on the issues..


http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf "C++ and the 
Perils of Double-Checked Locking", which include 'C' issues, and 
multi-thread, multi-processor issues. Not an easy issue when fighting 
optimisers..


--

Philip 



Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-28 Thread Linus Torvalds
On Fri, Oct 28, 2016 at 4:11 AM, Johannes Schindelin
 wrote:
>
> You guys. I mean: You guys! You sure make my life hard. A brief look at
> mingw.h could have answered your implicit question:

So here's what you guys should do:

 - leave O_NOATIME damn well alone. It works. It has worked for 10+
years. Stop arguing against it, people who do.

 - get rid of all O_CLOEXEC games. They don't work. If you want to
close file descriptors at execve(), you - gasp - close the file
descriptor before doing an execve.

So O_CLOEXEC or FD_CLOEXEC is broken.

DO NOT BREAK O_NOATIME JUST TO ADD COMPLETELY NEW BREAKAGE.

 Linus


Re: [PATCH] Fix typo in 2.11.0 RelNotes

2016-10-28 Thread Henrik Ahlgren
On Fri, 2016-10-28 at 06:03 -0700, Junio C Hamano wrote:

> If the original text were like the following, would it have been
> clear enough that prevented you from sending your patch?
> 
>  * An empty string used as a pathspec element has always meant
>'everything matches', but it is too easy to write a script that
>finds a path to remove in $path and run 'git rm "$paht"' by 
>mistake (when the user meant to give "$path"), which ends up
>removing everything.  This release starts warning about the
>use of an empty string that is used for 'everything matches' and
>asks users to use a more explicit '.' for that instead.

Oops. Yes, but not sure if it really needs to be clear enough even for
idiots like me. :-) Sorry.




Re: [PATCH] Documenation: fmt-merge-msg: fix markup in example

2016-10-28 Thread Junio C Hamano
Jeff King  writes:

> On Fri, Oct 28, 2016 at 12:01:26PM +0200, Stefan Christ wrote:
>
>> diff --git a/Documentation/git-fmt-merge-msg.txt 
>> b/Documentation/git-fmt-merge-msg.txt
>> index 6526b17..44892c4 100644
>> --- a/Documentation/git-fmt-merge-msg.txt
>> +++ b/Documentation/git-fmt-merge-msg.txt
>> @@ -60,10 +60,10 @@ merge.summary::
>>  EXAMPLE
>>  ---
>>  
>> ---
>> +-
>>  $ git fetch origin master
>>  $ git fmt-merge-msg --log <$GIT_DIR/FETCH_HEAD
>> ---
>> +-
>
> Thanks. Asciidoc generally requires at least 4 delimiter characters to
> open a delimited block (including a ListingBlock, which is what we want
> here). There is one exception, "--", which is a generic OpenBlock, which
> is just used for grouping, and not any special syntactic meaning (so
> that's why this _didn't_ render the "--", but did render the contents
> without line breaks).
>
> So looks good, modulo the typo in the subject that somebody else pointed
> out.

Thanks, both.  I queued with a bit more enhanced log message while
fixing the typo.

-- >8 --
From: Stefan Christ 
Date: Fri, 28 Oct 2016 12:01:26 +0200
Subject: [PATCH] Documentation/fmt-merge-msg: fix markup in example

Use at least 4 delimiting dashes that are required for
ListingBlock to get this block rendered as verbatim text.

Signed-off-by: Stefan Christ 
Signed-off-by: Junio C Hamano 
---
 Documentation/git-fmt-merge-msg.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fmt-merge-msg.txt 
b/Documentation/git-fmt-merge-msg.txt
index 6526b178e8..44892c447e 100644
--- a/Documentation/git-fmt-merge-msg.txt
+++ b/Documentation/git-fmt-merge-msg.txt
@@ -60,10 +60,10 @@ merge.summary::
 EXAMPLE
 ---
 
---
+-
 $ git fetch origin master
 $ git fmt-merge-msg --log <$GIT_DIR/FETCH_HEAD
---
+-
 
 Print a log message describing a merge of the "master" branch from
 the "origin" remote.
-- 
2.10.1-791-g404733b9cf



[PATCH v2] Documentation: fmt-merge-msg: fix markup in example

2016-10-28 Thread Stefan Christ
The example was not rendered as verbatim text. Fix it.

Signed-off-by: Stefan Christ 
---
v2: fix misspelling in subject

Hi,

thanks to rday for spotting the typo in the subject line (I missed to
autoenable the spell checking in vim) and thanks to Peff for the asciidoc
explanation.

Kind regards,
Stefan Christ
---
 Documentation/git-fmt-merge-msg.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fmt-merge-msg.txt 
b/Documentation/git-fmt-merge-msg.txt
index 6526b17..44892c4 100644
--- a/Documentation/git-fmt-merge-msg.txt
+++ b/Documentation/git-fmt-merge-msg.txt
@@ -60,10 +60,10 @@ merge.summary::
 EXAMPLE
 ---
 
---
+-
 $ git fetch origin master
 $ git fmt-merge-msg --log <$GIT_DIR/FETCH_HEAD
---
+-
 
 Print a log message describing a merge of the "master" branch from
 the "origin" remote.
-- 
2.7.3



Is the entire working copy “at one branch”?

2016-10-28 Thread Stefan Monov
This is a very basic question but maybe I just don't know how to
phrase my query well enough to find an answer.

Let me begin with a SVN example. SVN doesn't have a "current branch"
as a property of a working copy, so I'll give the example with
revisions instead. Consider a working copy with the dirs `foo` and
`bar` in it. In SVN one can `update` foo to an earlier version, but
keep `bar` at `HEAD`.

Is it possible in an analogical git working copy to checkout one
branch in dir `foo` without changing the branch checked out in `bar`?

If not, why not?

TIA,
Stefan Monov


Git crash course for SVN users - which of the 2?

2016-10-28 Thread Stefan Monov
I googled and found these two official-looking results:

https://git.wiki.kernel.org/index.php/GitSvnCrashCourse
https://git-scm.com/course/svn.html

Which of the two should I use? And I'd ask for the other one to be
deleted so there's no such further confusion for other user.


Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-28 Thread Junio C Hamano
Junio C Hamano  writes:

> Junio C Hamano  writes:
>
>> Hmph.  This may not fly well in practice, though.  
>
> We can flip the order around, and then it becomes simpler to later
> drop atime support.  The first step goes like this.

And the second step would be like this.

-- >8 --
Subject: [PATCH] sha1_file: stop opening files with O_NOATIME

When we open object files, we try to do so with O_NOATIME.
This dates back to 144bde78e9 (Use O_NOATIME when opening
the sha1 files., 2005-04-23), which is an optimization to
avoid creating a bunch of dirty inodes when we're accessing
many objects.  But a few things have changed since then:

  1. In June 2005, git learned about packfiles, which means
 we would do a lot fewer atime updates (rather than one
 per object access, we'd generally get one per packfile).

  2. In late 2006, Linux learned about "relatime", which is
 generally the default on modern installs. So
 performance around atimes updates is a non-issue there
 these days.

 All the world isn't Linux, but as it turns out, Linux
 is the only platform to implement O_NOATIME in the
 first place.

So it's very unlikely that this code is helping anybody
these days.

Helped-by: Jeff King 
[jc: took idea and log message from peff]
Signed-off-by: Junio C Hamano 
---
 cache.h |  2 +-
 sha1_file.c | 21 -
 2 files changed, 1 insertion(+), 22 deletions(-)

diff --git a/cache.h b/cache.h
index 6f1c21a352..f440d3fd1e 100644
--- a/cache.h
+++ b/cache.h
@@ -1123,7 +1123,7 @@ extern int hash_sha1_file_literally(const void *buf, 
unsigned long len, const ch
 extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned 
char *);
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
 extern int git_open_cloexec(const char *name, int flags);
-extern int git_open(const char *name);
+#define git_open(name) git_open_cloexec(name, O_RDONLY)
 extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, 
unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
diff --git a/sha1_file.c b/sha1_file.c
index dfbf398183..25d9359402 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -27,14 +27,6 @@
 #include "list.h"
 #include "mergesort.h"
 
-#ifndef O_NOATIME
-#if defined(__linux__) && (defined(__i386__) || defined(__PPC__))
-#define O_NOATIME 0100
-#else
-#define O_NOATIME 0
-#endif
-#endif
-
 #define SZ_FMT PRIuMAX
 static inline uintmax_t sz_fmt(size_t s) { return s; }
 
@@ -1572,19 +1564,6 @@ int git_open_cloexec(const char *name, int flags)
return fd;
 }
 
-int git_open(const char *name)
-{
-   static int noatime = O_NOATIME;
-   int fd = git_open_cloexec(name, O_RDONLY);
-
-   if (0 <= fd && (noatime & O_NOATIME)) {
-   int flags = fcntl(fd, F_GETFL);
-   if (fcntl(fd, F_SETFL, flags | noatime))
-   noatime = 0;
-   }
-   return fd;
-}
-
 static int stat_sha1_file(const unsigned char *sha1, struct stat *st)
 {
struct alternate_object_database *alt;
-- 
2.10.1-791-g404733b9cf



Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-28 Thread Junio C Hamano
Junio C Hamano  writes:

> Hmph.  This may not fly well in practice, though.  

We can flip the order around, and then it becomes simpler to later
drop atime support.  The first step goes like this.

-- >8 --
From: Junio C Hamano 
Date: Fri, 28 Oct 2016 06:23:07 -0700
Subject: [PATCH] git_open(): untangle possible NOATIME and CLOEXEC interactions

The way we structured the fallback/retry mechanism for opening with
O_NOATIME and O_CLOEXEC meant that if we failed due to lack of
support to open the file with O_NOATIME option (i.e. EINVAL), we
would still try to drop O_CLOEXEC first and retry, and then drop
O_NOATIME.  A platform on which O_NOATIME is defined in the header
without support from the kernel wouldn't have a chance to open with
O_CLOEXEC option due to this code structure.

Arguably, O_CLOEXEC is more important than O_NOATIME, as the latter
is mostly about performance, while the former can affect correctness.

Instead use O_CLOEXEC to open the file, and then use fcntl(2) to set
O_NOATIME on the resulting file descriptor.  open(2) itself does not
cause atime to be updated according to Linus [*1*].

The helper to do the former can be usable in the codepath in
ce_compare_data() that was recently added to open a file descriptor
with O_CLOEXEC; use it while we are at it.

*1* 

Signed-off-by: Junio C Hamano 
---
 cache.h  |  1 +
 read-cache.c |  9 +
 sha1_file.c  | 39 +++
 3 files changed, 21 insertions(+), 28 deletions(-)

diff --git a/cache.h b/cache.h
index a902ca1f8e..6f1c21a352 100644
--- a/cache.h
+++ b/cache.h
@@ -1122,6 +1122,7 @@ extern int write_sha1_file(const void *buf, unsigned long 
len, const char *type,
 extern int hash_sha1_file_literally(const void *buf, unsigned long len, const 
char *type, unsigned char *sha1, unsigned flags);
 extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned 
char *);
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
+extern int git_open_cloexec(const char *name, int flags);
 extern int git_open(const char *name);
 extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, 
unsigned long mapsize, void *buffer, unsigned long bufsiz);
diff --git a/read-cache.c b/read-cache.c
index db5d910642..c27d3e240b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -156,14 +156,7 @@ void fill_stat_cache_info(struct cache_entry *ce, struct 
stat *st)
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 {
int match = -1;
-   static int cloexec = O_CLOEXEC;
-   int fd = open(ce->name, O_RDONLY | cloexec);
-
-   if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) {
-   /* Try again w/o O_CLOEXEC: the kernel might not support it */
-   cloexec &= ~O_CLOEXEC;
-   fd = open(ce->name, O_RDONLY | cloexec);
-   }
+   int fd = git_open_cloexec(ce->name, O_RDONLY);
 
if (fd >= 0) {
unsigned char sha1[20];
diff --git a/sha1_file.c b/sha1_file.c
index 09045df1dc..dfbf398183 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1559,31 +1559,30 @@ int check_sha1_signature(const unsigned char *sha1, 
void *map,
return hashcmp(sha1, real_sha1) ? -1 : 0;
 }
 
-int git_open(const char *name)
+int git_open_cloexec(const char *name, int flags)
 {
-   static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
-
-   for (;;) {
-   int fd;
-
-   errno = 0;
-   fd = open(name, O_RDONLY | sha1_file_open_flag);
-   if (fd >= 0)
-   return fd;
+   static int cloexec = O_CLOEXEC;
+   int fd = open(name, flags | cloexec);
 
+   if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) {
/* Try again w/o O_CLOEXEC: the kernel might not support it */
-   if ((sha1_file_open_flag & O_CLOEXEC) && errno == EINVAL) {
-   sha1_file_open_flag &= ~O_CLOEXEC;
-   continue;
-   }
+   cloexec &= ~O_CLOEXEC;
+   fd = open(name, flags | cloexec);
+   }
+   return fd;
+}
 
-   /* Might the failure be due to O_NOATIME? */
-   if (errno != ENOENT && (sha1_file_open_flag & O_NOATIME)) {
-   sha1_file_open_flag &= ~O_NOATIME;
-   continue;
-   }
-   return -1;
+int git_open(const char *name)
+{
+   static int noatime = O_NOATIME;
+   int fd = git_open_cloexec(name, O_RDONLY);
+
+   if (0 <= fd && (noatime & O_NOATIME)) {
+   int flags = fcntl(fd, F_GETFL);
+   if (fcntl(fd, F_SETFL, flags | noatime))
+   noatime = 0;
}
+   return fd;
 }
 
 static int 

Re: [PATCH] Fix typo in 2.11.0 RelNotes

2016-10-28 Thread Junio C Hamano
Henrik Ahlgren  writes:

> Signed-off-by: Henrik Ahlgren 
> ---
>  Documentation/RelNotes/2.11.0.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/RelNotes/2.11.0.txt 
> b/Documentation/RelNotes/2.11.0.txt
> index 3590620..1d3a07d 100644
> --- a/Documentation/RelNotes/2.11.0.txt
> +++ b/Documentation/RelNotes/2.11.0.txt
> @@ -5,7 +5,7 @@ Backward compatibility notes.
>  
>   * An empty string used as a pathspec element has always meant
> 'everything matches', but it is too easy to write a script that
> -   finds a path to remove in $path and run 'git rm "$paht"', which
> +   finds a path to remove in $path and run 'git rm "$path"', which
> ends up removing everything.  This release starts warning about the
> use of an empty string that is used for 'everything matches' and
> asks users to use a more explicit '.' for that instead.

What you spotted is certainly a typo, but it is a deliberate one
that must not be fixed like this.  "..., but it is too easy to ..."
is illustrating a scenario in which an empty string is accidentally
given to "git rm" as a pathspec by mistake, and the example it uses
is for the user to prepare a path to be removed in variable $path,
and referring to it as its typoed $paht by mistake.  Fixing that typo
in this paragraph defeats the whole point of the example.

But the fact that you spotted the typo (which is good; we want the
deliberate typo in the example to be clearly visible) and thought
that the writer of the paragraph must have meant a non-typoed
version there (which is bad) indicates that the paragraph needs
improvement to save readers from making the same mis-reading as you
did.

If the original text were like the following, would it have been
clear enough that prevented you from sending your patch?

 * An empty string used as a pathspec element has always meant
   'everything matches', but it is too easy to write a script that
   finds a path to remove in $path and run 'git rm "$paht"' by 
   mistake (when the user meant to give "$path"), which ends up
   removing everything.  This release starts warning about the
   use of an empty string that is used for 'everything matches' and
   asks users to use a more explicit '.' for that instead.

Thanks.


Re: [PATCH] valgrind: support test helpers

2016-10-28 Thread Junio C Hamano
René Scharfe  writes:

> Tests run with --valgrind call git commands through a wrapper script
> that invokes valgrind on them.  This script (valgrind.sh) is in turn
> invoked through symlinks created for each command in t/valgrind/bin/.
>
> Since e6e7530d (test helpers: move test-* to t/helper/ subdirectory)
> these symlinks have been broken for test helpers -- they point to the
> old locations in the root of the build directory.  Fix that by teaching
> the code for creating the links about the new location of the binaries,
> and do the same in the wrapper script to allow it to find its payload.
>
> Signed-off-by: Rene Scharfe 
> ---

Hmph.  I somehow thought this was supposed to have been fixed by
503e224180 ("t/test-lib.sh: fix running tests with --valgrind",
2016-07-11) already.



>  t/test-lib.sh  |  9 -
>  t/valgrind/valgrind.sh | 12 ++--
>  2 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index b859db6..a724181 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -809,7 +809,14 @@ then
>   return;
>  
>   base=$(basename "$1")
> - symlink_target=$GIT_BUILD_DIR/$base
> + case "$base" in
> + test-*)
> + symlink_target="$GIT_BUILD_DIR/t/helper/$base"
> + ;;
> + *)
> + symlink_target="$GIT_BUILD_DIR/$base"
> + ;;
> + esac
>   # do not override scripts
>   if test -x "$symlink_target" &&
>   test ! -d "$symlink_target" &&
> diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh
> index 4215303..669ebaf 100755
> --- a/t/valgrind/valgrind.sh
> +++ b/t/valgrind/valgrind.sh
> @@ -1,11 +1,19 @@
>  #!/bin/sh
>  
>  base=$(basename "$0")
> +case "$base" in
> +test-*)
> + program="$GIT_VALGRIND/../../t/helper/$base"
> + ;;
> +*)
> + program="$GIT_VALGRIND/../../$base"
> + ;;
> +esac
>  
>  TOOL_OPTIONS='--leak-check=no'
>  
>  test -z "$GIT_VALGRIND_ENABLED" &&
> -exec "$GIT_VALGRIND"/../../"$base" "$@"
> +exec "$program" "$@"
>  
>  case "$GIT_VALGRIND_MODE" in
>  memcheck-fast)
> @@ -29,4 +37,4 @@ exec valgrind -q --error-exitcode=126 \
>   --log-fd=4 \
>   --input-fd=4 \
>   $GIT_VALGRIND_OPTIONS \
> - "$GIT_VALGRIND"/../../"$base" "$@"
> + "$program" "$@"


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-28 Thread Johannes Schindelin
Hi Jake,

On Thu, 27 Oct 2016, Jacob Keller wrote:

> I agree with Stefan that there isn't really a great place to put a
> dynamic initialization.

Ummm. Wait. What???

https://github.com/git/git/blob/v2.10.1/common-main.c#L25-L41

Ciao,
Johannes


[PATCH] Fix typo in 2.11.0 RelNotes

2016-10-28 Thread Henrik Ahlgren
Signed-off-by: Henrik Ahlgren 
---
 Documentation/RelNotes/2.11.0.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/RelNotes/2.11.0.txt 
b/Documentation/RelNotes/2.11.0.txt
index 3590620..1d3a07d 100644
--- a/Documentation/RelNotes/2.11.0.txt
+++ b/Documentation/RelNotes/2.11.0.txt
@@ -5,7 +5,7 @@ Backward compatibility notes.
 
  * An empty string used as a pathspec element has always meant
'everything matches', but it is too easy to write a script that
-   finds a path to remove in $path and run 'git rm "$paht"', which
+   finds a path to remove in $path and run 'git rm "$path"', which
ends up removing everything.  This release starts warning about the
use of an empty string that is used for 'everything matches' and
asks users to use a more explicit '.' for that instead.
-- 
2.1.4



Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-28 Thread Johannes Schindelin
Hi,

On Thu, 27 Oct 2016, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > Linus Torvalds  writes:
> >
> >> On Thu, Oct 27, 2016 at 4:36 PM, Junio C Hamano  wrote:
> >>>
> >>> Would the best endgame shape for this function be to open with
> >>> O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2)
> >>> but ignoring an error from it, I guess?  That would be the closest
> >>> to what we historically had, I would think.
> >>
> >> I think that's the best model.
> >
> > OK, so perhaps like this.
> 
> Hmph.  This may not fly well in practice, though.  
> 
> To Unix folks, CLOEXEC is not a huge correctness issue.  A child
> process may hold onto an open file descriptor a bit longer than the
> lifetime of the parent but as long as the child eventually exits,
> nothing is affected.  Over there, things are different.  The parent
> cannot even rename(2) or unlink(2) a file it created and closed
> while the child is still holding the file descriptor open and the
> lack of CLOEXEC will make the parent fail.  I do not know how well
> fcntl(2) emulation works on Windows, but I would not be surprised
> if J6t or Dscho comes back and says that FD_CLOEXEC given to F_SETFD
> would not work while O_CLOEXEC given to open(2) does.

You guys. I mean: You guys! You sure make my life hard. A brief look at
mingw.h could have answered your implicit question:

static inline int fcntl(int fd, int cmd, ...)
{
if (cmd == F_GETFD || cmd == F_SETFD)
return 0;
errno = EINVAL;
return -1;
}

So while you discuss in your Linux Ivory Tower how to optimize Git for
Linux, and Linux only, I'll have to drop everything else and spend the
rest of my Friday trying to find a way to adjust a file handle
*immediately after opening it with undesired flags* (when it could have
been opened with the desired flags, as suggested, to begin with).

Ciao,
Johannes


Re: [PATCH] Documenation: fmt-merge-msg: fix markup in example

2016-10-28 Thread Jeff King
On Fri, Oct 28, 2016 at 12:01:26PM +0200, Stefan Christ wrote:

> diff --git a/Documentation/git-fmt-merge-msg.txt 
> b/Documentation/git-fmt-merge-msg.txt
> index 6526b17..44892c4 100644
> --- a/Documentation/git-fmt-merge-msg.txt
> +++ b/Documentation/git-fmt-merge-msg.txt
> @@ -60,10 +60,10 @@ merge.summary::
>  EXAMPLE
>  ---
>  
> ---
> +-
>  $ git fetch origin master
>  $ git fmt-merge-msg --log <$GIT_DIR/FETCH_HEAD
> ---
> +-

Thanks. Asciidoc generally requires at least 4 delimiter characters to
open a delimited block (including a ListingBlock, which is what we want
here). There is one exception, "--", which is a generic OpenBlock, which
is just used for grouping, and not any special syntactic meaning (so
that's why this _didn't_ render the "--", but did render the contents
without line breaks).

So looks good, modulo the typo in the subject that somebody else pointed
out.

-Peff


Re: [PATCH] Documenation: fmt-merge-msg: fix markup in example

2016-10-28 Thread Robert P. J. Day

  "Documenation"?

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday




  1   2   >