[PATCH 5/5] index: offer advice for unknown index extensions

2018-11-19 Thread Jonathan Nieder
It is not unusual for multiple distinct versions of Git to act on a
single repository.  For example, some IDEs bundle a particular version
of Git, which can be a different version from the system copy of Git,
or on a Mac, /usr/bin/git quickly goes out of sync with the Homebrew
git in /usr/local/bin/git.

When a newer version of Git writes an index file that an older version
of Git does not know how to make full use of, this is a teaching
opportunity.  The user may not be aware of what version of Git they
are using.  Print an advice message to help the user to use the most
full featured version of Git (e.g. by futzing with their PATH).

  warning: ignoring optional IEOT index extension
  hint: This is likely due to the file having been written by a newer
  hint: version of Git than is reading it.  You can upgrade Git to
  hint: take advantage of performance improvements from the updated
  hint: file format.
  hint:
  hint: You can run "git config advice.unknownIndexExtension false"
  hint: to suppress this message.

This replaces the message

  ignoring IEOT extension

that existed previously and did not provide enough detail for a user
to act on it or suppress it.

Helped-by: Junio C Hamano 
Signed-off-by: Jonathan Nieder 
---
New, based on Junio's hints about the message removed in patch 3/5.

That's the end of the series.  Thanks for reading, and thanks again
for your help so far.

 advice.c |  2 ++
 advice.h |  1 +
 read-cache.c | 11 +++
 3 files changed, 14 insertions(+)

diff --git a/advice.c b/advice.c
index 5f35656409..91a55046fd 100644
--- a/advice.c
+++ b/advice.c
@@ -24,6 +24,7 @@ int advice_add_embedded_repo = 1;
 int advice_ignored_hook = 1;
 int advice_waiting_for_editor = 1;
 int advice_graft_file_deprecated = 1;
+int advice_unknown_index_extension = 1;
 int advice_checkout_ambiguous_remote_branch_name = 1;
 
 static int advice_use_color = -1;
@@ -78,6 +79,7 @@ static struct {
{ "ignoredHook", _ignored_hook },
{ "waitingForEditor", _waiting_for_editor },
{ "graftFileDeprecated", _graft_file_deprecated },
+   { "unknownIndexExtension", _unknown_index_extension },
{ "checkoutAmbiguousRemoteBranchName", 
_checkout_ambiguous_remote_branch_name },
 
/* make this an alias for backward compatibility */
diff --git a/advice.h b/advice.h
index 696bf0e7d2..8da0845cfc 100644
--- a/advice.h
+++ b/advice.h
@@ -24,6 +24,7 @@ extern int advice_add_embedded_repo;
 extern int advice_ignored_hook;
 extern int advice_waiting_for_editor;
 extern int advice_graft_file_deprecated;
+extern int advice_unknown_index_extension;
 extern int advice_checkout_ambiguous_remote_branch_name;
 
 int git_default_advice_config(const char *var, const char *value);
diff --git a/read-cache.c b/read-cache.c
index 002ed2c1e4..d1d903e5a1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1727,6 +1727,17 @@ static int read_index_extension(struct index_state 
*istate,
return error("index uses %.4s extension, which we do 
not understand",
 ext);
trace_printf("ignoring %.4s extension\n", ext);
+   if (advice_unknown_index_extension) {
+   warning(_("ignoring optional %.4s index extension"), 
ext);
+   advise(_("This is likely due to the file having been 
written by a newer\n"
+"version of Git than is reading it. You can 
upgrade Git to\n"
+"take advantage of performance improvements 
from the updated\n"
+"file format.\n"
+"\n"
+"Run \"%s\"\n"
+"to suppress this message."),
+  "git config advice.unknownIndexExtension false");
+   }
break;
}
return 0;
-- 
2.20.0.rc0.387.gc7a69e6b6c



[PATCH 4/5] index: make index.threads=true enable ieot and eoie

2018-11-19 Thread Jonathan Nieder
If a user explicitly sets

[index]
threads = true

to read the index using multiple threads, ensure that index writes
include the offset table by default to make that possible.  This
ensures that the user's intent of turning on threading is respected.

In other words, permit the following configurations:

- index.threads and index.recordOffsetTable unspecified: do not write
  the offset table yet (to avoid alarming the user with "ignoring IEOT
  extension" messages when an older version of Git accesses the
  repository) but do make use of multiple threads to read the index if
  the supporting offset table is present.

  This can also be requested explicitly by setting index.threads=true,
  0, or >1 and index.recordOffsetTable=false.

- index.threads=false or 1: do not write the offset table, and do not
  make use of the offset table.

  One can set index.recordOffsetTable=false as well, to be more
  explicit.

- index.threads=true, 0, or >1 and index.recordOffsetTable unspecified:
  write the offset table and make use of threads at read time.

  This can also be requested by setting index.threads=true, 0, >1, or
  unspecified and index.recordOffsetTable=true.

Fortunately the complication is temporary: once most Git installations
have upgraded to a version with support for the IEOT and EOIE
extensions, we can flip the defaults for index.recordEndOfIndexEntries
and index.recordOffsetTable to true and eliminate the settings.

Helped-by: Ben Peart 
Signed-off-by: Jonathan Nieder 
---
New, based on Ben Peart's feedback.  Turned out simpler than I feared
--- thanks, Ben, for pushing for this.

 Documentation/config/index.txt |  6 --
 config.c   | 17 ++---
 config.h   |  2 +-
 read-cache.c   | 23 +--
 4 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/Documentation/config/index.txt b/Documentation/config/index.txt
index de44183235..f181503041 100644
--- a/Documentation/config/index.txt
+++ b/Documentation/config/index.txt
@@ -3,14 +3,16 @@ index.recordEndOfIndexEntries::
Entry" section. This reduces index load time on multiprocessor
machines but produces a message "ignoring EOIE extension" when
reading the index using Git versions before 2.20. Defaults to
-   'false'.
+   'true' if index.threads has been explicitly enabled, 'false'
+   otherwise.
 
 index.recordOffsetTable::
Specifies whether the index file should include an "Index Entry
Offset Table" section. This reduces index load time on
multiprocessor machines but produces a message "ignoring IEOT
extension" when reading the index using Git versions before 2.20.
-   Defaults to 'false'.
+   Defaults to 'true' if index.threads has been explicitly enabled,
+   'false' otherwise.
 
 index.threads::
Specifies the number of threads to spawn when loading the index.
diff --git a/config.c b/config.c
index 04286f7717..ff521eb27a 100644
--- a/config.c
+++ b/config.c
@@ -2294,22 +2294,25 @@ int git_config_get_fsmonitor(void)
return 0;
 }
 
-int git_config_get_index_threads(void)
+int git_config_get_index_threads(int *dest)
 {
-   int is_bool, val = 0;
+   int is_bool, val;
 
val = git_env_ulong("GIT_TEST_INDEX_THREADS", 0);
-   if (val)
-   return val;
+   if (val) {
+   *dest = val;
+   return 0;
+   }
 
if (!git_config_get_bool_or_int("index.threads", _bool, )) {
if (is_bool)
-   return val ? 0 : 1;
+   *dest = val ? 0 : 1;
else
-   return val;
+   *dest = val;
+   return 0;
}
 
-   return 0; /* auto */
+   return 1;
 }
 
 NORETURN
diff --git a/config.h b/config.h
index a06027e69b..ee5d3fa7b4 100644
--- a/config.h
+++ b/config.h
@@ -246,11 +246,11 @@ extern int git_config_get_bool(const char *key, int 
*dest);
 extern int git_config_get_bool_or_int(const char *key, int *is_bool, int 
*dest);
 extern int git_config_get_maybe_bool(const char *key, int *dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
+extern int git_config_get_index_threads(int *dest);
 extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
 extern int git_config_get_max_percent_split_change(void);
 extern int git_config_get_fsmonitor(void);
-extern int git_config_get_index_threads(void);
 
 /* This dies if the configured or default date is in the future */
 extern int git_config_get_expiry(const char *key, const char **output);
diff --git a/read-cache.c b/read-cache.c
index 83d24357a6..002ed2c1e4 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2176,7 +2176,8 @@ int do_read_index(struct index_state *istate, const char 
*path, int must_exist)
 
src_offset = sizeof(*hdr);
 
-   nr_threads = 

Re: [PATCH 1/1] revision.c: use new topo-order logic in tests

2018-11-19 Thread Junio C Hamano
"Derrick Stolee via GitGitGadget"  writes:

> @@ -3143,6 +3144,9 @@ int prepare_revision_walk(struct rev_info *revs)
>   commit_list_sort_by_date(>commits);
>   if (revs->no_walk)
>   return 0;
> + if (revs->limited &&
> + git_env_bool(GIT_TEST_COMMIT_GRAPH, 0))
> + revs->limited = 0;
>   if (revs->limited) {
>   if (limit_list(revs) < 0)
>   return -1;

That is equivalent to say

if (git_env_bool(GIT_TEST_COMMIT_GRAPH, 0))
revs->limited = 0;

Wouldn't that make the codepath that involves limit_list()
completely unreachable while testing, though?

The title only mentions "topo-order" logic, but the topo-order is
not the only reason why limited bit can be set, is it?  When showing
merges, simplifying merges, or post-processing to show ancestry
path, or showing a bottom-limited revision range, the limited bit is
automatically set because all of these depend on first calling
limit_list() and then postprocessing its result.  Doesn't it hurt
these cases to unconditionally drop limited bit?



[PATCH 3/5] index: do not warn about unrecognized extensions

2018-11-19 Thread Jonathan Nieder
Documentation/technical/index-format explains:

 4-byte extension signature. If the first byte is 'A'..'Z' the
 extension is optional and can be ignored.

This allows gracefully introducing a new index extension without
having to rely on all readers having support for it.  Mandatory
extensions start with a lowercase letter and optional ones start with
a capital.  Thus the versions of Git acting on a shared local
repository do not have to upgrade in lockstep.

We almost obey that convention, but there is a problem: when
encountering an unrecognized optional extension, we write

ignoring FNCY extension

to stderr, which alarms users.  This means that in practice we have
had to introduce index extensions in two steps: first add read
support, and then a while later, start writing by default.  This
delays when users can benefit from improvements to the index format.

We cannot change the past, but for index extensions of the future,
there is a straightforward improvement: silence that message except
when tracing.  This way, the message is still available when
debugging, but in everyday use it does not show up so (once most Git
users have this patch) we can turn on new optional extensions right
away without alarming people.

Signed-off-by: Jonathan Nieder 
---
Unchanged.

 read-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index f3d5638d9e..83d24357a6 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1726,7 +1726,7 @@ static int read_index_extension(struct index_state 
*istate,
if (*ext < 'A' || 'Z' < *ext)
return error("index uses %.4s extension, which we do 
not understand",
 ext);
-   fprintf(stderr, "ignoring %.4s extension\n", ext);
+   trace_printf("ignoring %.4s extension\n", ext);
break;
}
return 0;
-- 
2.20.0.rc0.387.gc7a69e6b6c



[PATCH 2/5] ieot: default to not writing IEOT section

2018-11-19 Thread Jonathan Nieder
As with EOIE, popular versions of Git do not support the new IEOT
extension yet.  When accessing a Git repository written by a more
modern version of Git, they correctly ignore the unrecognized section,
but in the process they loudly warn

ignoring IEOT extension

resulting in confusion for users.  Introduce the index extension more
gently by not writing it yet in this first version with support for
it.  Soon, once sufficiently many users are running a modern version
of Git, we can flip the default so users benefit from this index
extension by default.

Introduce a '[index] recordOffsetTable' configuration variable to
control whether the new index extension is written.

Signed-off-by: Jonathan Nieder 
---
As with patch 1/5, no change from v1 other than rebasing.

 Documentation/config/index.txt |  7 +++
 read-cache.c   | 11 ++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/index.txt b/Documentation/config/index.txt
index 8e138aba7a..de44183235 100644
--- a/Documentation/config/index.txt
+++ b/Documentation/config/index.txt
@@ -5,6 +5,13 @@ index.recordEndOfIndexEntries::
reading the index using Git versions before 2.20. Defaults to
'false'.
 
+index.recordOffsetTable::
+   Specifies whether the index file should include an "Index Entry
+   Offset Table" section. This reduces index load time on
+   multiprocessor machines but produces a message "ignoring IEOT
+   extension" when reading the index using Git versions before 2.20.
+   Defaults to 'false'.
+
 index.threads::
Specifies the number of threads to spawn when loading the index.
This is meant to reduce index load time on multiprocessor machines.
diff --git a/read-cache.c b/read-cache.c
index 1e9c772603..f3d5638d9e 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2698,6 +2698,15 @@ static int record_eoie(void)
return 0;
 }
 
+static int record_ieot(void)
+{
+   int val;
+
+   if (!git_config_get_bool("index.recordoffsettable", ))
+   return val;
+   return 0;
+}
+
 /*
  * On success, `tempfile` is closed. If it is the temporary file
  * of a `struct lock_file`, we will therefore effectively perform
@@ -2761,7 +2770,7 @@ static int do_write_index(struct index_state *istate, 
struct tempfile *tempfile,
else
nr_threads = 1;
 
-   if (nr_threads != 1) {
+   if (nr_threads != 1 && record_ieot()) {
int ieot_blocks, cpus;
 
/*
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog



[PATCH 1/5] eoie: default to not writing EOIE section

2018-11-19 Thread Jonathan Nieder
Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension,
2018-10-10) Git defaults to writing the new EOIE section when writing
out an index file.  Usually that is a good thing because it improves
threaded performance, but when a Git repository is shared with older
versions of Git, it produces a confusing warning:

  $ git status
  ignoring EOIE extension
  HEAD detached at 371ed0defa
  nothing to commit, working tree clean

Let's introduce the new index extension more gently.  First we'll roll
out the new version of Git that understands it, and then once
sufficiently many users are using such a version, we can flip the
default to writing it by default.

Introduce a '[index] recordEndOfIndexEntries' configuration variable
to allow interested users to benefit from this index extension early.

Signed-off-by: Jonathan Nieder 
---
Rebased.  No other change from v1.

As Jonathan pointed out, it would be nice to have tests here.  Ben,
any advice for how I could write some in a followup change?  E.g. does
Derrick Stolee's tracing based testing trick apply here?

 Documentation/config/index.txt |  7 +++
 read-cache.c   | 11 ++-
 t/t1700-split-index.sh | 11 +++
 3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/Documentation/config/index.txt b/Documentation/config/index.txt
index 4b94b6bedc..8e138aba7a 100644
--- a/Documentation/config/index.txt
+++ b/Documentation/config/index.txt
@@ -1,3 +1,10 @@
+index.recordEndOfIndexEntries::
+   Specifies whether the index file should include an "End Of Index
+   Entry" section. This reduces index load time on multiprocessor
+   machines but produces a message "ignoring EOIE extension" when
+   reading the index using Git versions before 2.20. Defaults to
+   'false'.
+
 index.threads::
Specifies the number of threads to spawn when loading the index.
This is meant to reduce index load time on multiprocessor machines.
diff --git a/read-cache.c b/read-cache.c
index 4ca81286c0..1e9c772603 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2689,6 +2689,15 @@ void update_index_if_able(struct index_state *istate, 
struct lock_file *lockfile
rollback_lock_file(lockfile);
 }
 
+static int record_eoie(void)
+{
+   int val;
+
+   if (!git_config_get_bool("index.recordendofindexentries", ))
+   return val;
+   return 0;
+}
+
 /*
  * On success, `tempfile` is closed. If it is the temporary file
  * of a `struct lock_file`, we will therefore effectively perform
@@ -2936,7 +2945,7 @@ static int do_write_index(struct index_state *istate, 
struct tempfile *tempfile,
 * read.  Write it out regardless of the strip_extensions parameter as 
we need it
 * when loading the shared index.
 */
-   if (offset) {
+   if (offset && record_eoie()) {
struct strbuf sb = STRBUF_INIT;
 
write_eoie_extension(, _c, offset);
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 2ac47aa0e4..0cbac64e28 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -25,14 +25,17 @@ test_expect_success 'enable split index' '
git update-index --split-index &&
test-tool dump-split-index .git/index >actual &&
indexversion=$(test-tool index-version <.git/index) &&
+
+   # NEEDSWORK: Stop hard-coding checksums.
if test "$indexversion" = "4"
then
-   own=3527df833c6c100d3d1d921a9a782d62a8be4b58
-   base=746f7ab2ed44fb839efdfbffcf399d0b113fb4cb
+   own=432ef4b63f32193984f339431fd50ca796493569
+   base=508851a7f0dfa8691e9f69c7f055865389012491
else
-   own=5e9b60117ece18da410ddecc8b8d43766a0e4204
-   base=4370042739b31cd17a5c5cd6043a77c9a00df113
+   own=8299b0bcd1ac364e5f1d7768efb62fa2da79a339
+   base=39d890139ee5356c7ef572216cebcd27aa41f9df
fi &&
+
cat >expect <<-EOF &&
own $own
base $base
-- 
2.20.0.rc0.387.gc7a69e6b6c



[PATCH v2 0/5] Avoid confusing messages from new index extensions

2018-11-19 Thread Jonathan Nieder
Junio C Hamano wrote:
> Ben Peart  writes:

>> Why introduce a new setting to disable writing the IEOT extension
>> instead of just using the existing index.threads setting?
>
> But index.threads is about what the reader does, not about the
> writer who does not even know who will be reading the resulting
> index, no?

It affects the writer, too, since it affects the number of blocks, but
from an end user's point of view, I agree.

Here's an updated version of the series.

Patches 1-3 are as before, except that they are rebased to avoid
conflicting with nd/config-split.

Patch 4 allows enabling the new index extensions with a single config
setting, to address the feedback above.

Patch 5 revives the noisiness when encountering an unknown index
extension, guarded with an advice setting.

Sorry for the delay in getting this out.  Thoughts of all kinds
welcome, as always.

Sincerely,
Jonathan Nieder (5):
  eoie: default to not writing EOIE section
  ieot: default to not writing IEOT section
  index: do not warn about unrecognized extensions
  index: make index.threads=true enable ieot and eoie
  index: offer advice for unknown index extensions

 Documentation/config/index.txt | 16 ++
 advice.c   |  2 ++
 advice.h   |  1 +
 config.c   | 17 ++-
 config.h   |  2 +-
 read-cache.c   | 54 +-
 t/t1700-split-index.sh | 11 ---
 7 files changed, 84 insertions(+), 19 deletions(-)


Re: [PATCH v3 2/5] pretty: allow showing specific trailers

2018-11-19 Thread Junio C Hamano
Anders Waldenborg  writes:

> +  followed by a colon and zero or more comma-separated options:
> +  ** 'only': omit non-trailer lines from the trailer block.
> +  ** 'unfold': make it behave as if interpret-trailer's `--unfold`
> + option was given.
> +  ** 'key=': only show trailers with specified key. Matching is
> + done case-insensitively and trailing colon is optional. If option
> + is given multiple times only last one is used.

It would be good to allow multiple keys, as

%(trailers:key=signed-off-by,key=helped-by)

and

%(trailers:key=signed-off-by)%(trailers:key=helped-by)

would mean quite a different thing.  The former can preserve the
order of these sign-offs and helped-bys in the original, while the
latter would have to show all the sign-offs before showing the
helped-bys, and I am not convinced if the latter is the only valid
use case.

Also, use of 'key=' automatically turns on 'only' as described, and
I tend to agree that it would a convenient default mode (i.e. when
picking certain trailers only with this mechanism, it is likely that
the user is willing to use %(subject) etc. to fill in what was lost
by the implicit use of 'only'), but at the same time, it makes me
wonder if we need to have a way to countermand an 'only' (or
'unfold' for that matter) that was given earlier, e.g.

%(trailers:key=signed-off-by,only=no)

Thanks.


Re: [PATCH v3 2/5] pretty: allow showing specific trailers

2018-11-19 Thread Junio C Hamano
Anders Waldenborg  writes:

> +  followed by a colon and zero or more comma-separated options:
> +  ** 'only': omit non-trailer lines from the trailer block.
> +  ** 'unfold': make it behave as if interpret-trailer's `--unfold`
> + option was given.
> +  ** 'key=': only show trailers with specified key. Matching is
> + done case-insensitively and trailing colon is optional. If option
> + is given multiple times only last one is used.
> +  ** Examples: `%(trailers:only,unfold)` unfolds and shows all trailer
> + lines, `%(trailers:key=Reviewed-by,unfold)` unfolds and shows
> + trailer lines with key `Reviewed-by`.

The last "Examples" item does not logically belong to the other
three, which bothers me a bit.

> diff --git a/pretty.c b/pretty.c
> index aa03d5b23..aaedc8447 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -1074,6 +1074,17 @@ static int match_placeholder_arg(const char *to_parse, 
> const char *candidate,
>   return 0;
>  }
>  
> +struct format_trailer_match_data {
> + const char *trailer;
> + size_t trailer_len;
> +};
> +
> +static int format_trailer_match_cb(const struct strbuf *sb, void *ud)
> +{
> + struct format_trailer_match_data *data = ud;
> + return data->trailer_len == sb->len && !strncasecmp (data->trailer, 
> sb->buf, sb->len);
> +}

>   if (skip_prefix(placeholder, "(trailers", )) {
>   struct process_trailer_options opts = 
> PROCESS_TRAILER_OPTIONS_INIT;
> + struct format_trailer_match_data filter_data;
>   size_t ret = 0;
>  
>   opts.no_divider = 1;
> @@ -1323,7 +1335,24 @@ static size_t format_commit_one(struct strbuf *sb, /* 
> in UTF-8 */
>   opts.only_trailers = 1;
>   else if (match_placeholder_arg(arg, "unfold", 
> ))
>   opts.unfold = 1;
> - else
> + else if (skip_prefix(arg, "key=", )) {
> + const char *end = arg + strcspn(arg, 
> ",)");
> +
> + filter_data.trailer = arg;
> + filter_data.trailer_len = end - arg;
> +
> + if (filter_data.trailer_len &&
> + 
> filter_data.trailer[filter_data.trailer_len - 1] == ':')
> + filter_data.trailer_len--;
> +
> + opts.filter = format_trailer_match_cb;
> + opts.filter_data = _data;
> + opts.only_trailers = 1;
> +
> + arg = end;
> + if (*arg == ',')
> + arg++;
> + } else
>   break;
>   }

This is part of a loop that is entered after seeing "%(trailers:"
and existing one looks for 'unfold' and 'only' by using the
match_placeholder_arg() helper, which

 - returns false if the keyword is not what is being sought after;
 - returns 1 if it finds the keyword, followed by ',' or ')', and
   updates the end pointer to point at either the closing ')' or one
   place after the ','.

The new part cannot directly reuse the same helper because it
expects some non-constant string after "key=", but shouldn't we be
introducing a similar helper with extended feature to help this part
of the code to stay readable?  Exposing the minute details of the
logic to parse "key=,..." while hiding the counterpart to
parse "(only|unfold),..." makes the implementation of the above loop
uneven and hard to follow.

I wonder if a helper like this would help:

static int match_placeholder(const char *to_parse, const char *keyword,
 const char **value, size_t *valuelen,
 const char **end)
{
const char *p;

if (!(skip_prefix(to_parse, keyword, )))
return 0;

if (value && valuelen) {
/* expect "=" */
size_t vlen;
if (*p++ != '=')
return 1;
vlen = strcspn(p, ",)");
if (!p[vlen])
return 0;
*value = p;
*valuelen = vlen;
p = p + vlen;
}

if (*p == ',') {
*end = p + 1;
return 1;
}
if (*p == ')') {
*end = p;
return 1;
}
return 0;
}

which would allow the existing one to become a thin wrapper

static int match_placeholder_arg(const char *a, const char *b, const char **c)
{
return match_placeholder(a, b, NULL, NULL, c);
}

or can simply be eliminated by updating its only two callsites.

In the version you wrote, it is not 

Re: grep issues

2018-11-19 Thread Torsten Bögershausen
On Sun, Nov 11, 2018 at 03:17:50PM +0200, Orgad Shaneh wrote:
> Hi,
> 
> I found 2 bugs in grep, using Git for Windows 2.19.1 (but noticed
> these several versions ago):
> 
> 1. git grep --recursive on a worktree (without rev) always matches
> against the submodule's HEAD, not its worktree, as it should.
> 2. When core.autocrlf (or eol=crlf) is used, and a file in the
> worktree has CRLF, git grep fails to match $ against EOL.
> 
> For example:
> git init
> echo 'file eol=crlf' > .gitattributes
> echo ABCD > file
> git add file
> git commit -m 'CRLF'
> rm file
> git checkout -f file
> git grep 'D$' file # Nothing
> git grep 'D$' HEAD -- file # Found!
> 
> - Orgad

I can confirm the "2. When core.autocrlf" bug, or should we call
it a non-implemented feature.
It seems as if a convert_to_git() is needed in grep.c,
but I haven't found the time to dig deeper.
Does anybody wants to work on this ?


Re: [PATCH/RFC v2 1/1] Use size_t instead of 'unsigned long' for data in memory

2018-11-19 Thread Torsten Bögershausen
On Tue, Nov 20, 2018 at 06:04:54AM +0100, tbo...@web.de wrote:
> From: Torsten Bögershausen 
> 
> Currently the length of data which is stored in memory is stored
> in "unsigned long" at many places in the code base.
> This is OK when both "unsigned long" and size_t are 32 bits,
> (and is OK when both are 64 bits).
> On a 64 bit Windows system am "unsigned long" is 32 bit, and
> that may be too short to measure the size of objects in memory,
> a size_t is the natural choice.
> 
> Improve the code base in "small steps", as small as possible.
> The smallest step seems to be much bigger than expected.

Ops, it seems as if I send this message out twice -
please ignore the "PATCH/RFC v2 1/1"
Sorry for the noise.


[PATCH v2 1/1] Use size_t instead of 'unsigned long' for data in memory

2018-11-19 Thread tboegi
From: Torsten Bögershausen 

Currently the length of data which is stored in memory is stored
in "unsigned long" at many places in the code base.
This is OK when both "unsigned long" and size_t are 32 bits,
(and is OK when both are 64 bits).
On a 64 bit Windows system am "unsigned long" is 32 bit, and
that may be too short to measure the size of objects in memory,
a size_t is the natural choice.

Improve the code base in "small steps", as small as possible.
The smallest step seems to be much bigger than expected.
See this code-snippet from convert.c:

const char *ret;
unsigned long sz;
void *data = read_blob_data_from_index(istate, path, );
ret = gather_convert_stats_ascii(data, sz);

The corrected version looks like this:
const char *ret;
size_t sz;
void *data = read_blob_data_from_index(istate, path, );
ret = gather_convert_stats_ascii(data, sz);

However, when the Git code base is compiled with a compiler that
complains that "unsigned long" is different from size_t, we end
up in this huge patch, before the code base cleanly compiles.

Signed-off-by: Torsten Bögershausen 
---

Thanks for all the comments on V1.
Changes since V1:
- Make the motivation somewhat clearer in the commit message
- Rebase to the November 19 pu

What we really need for this patch to fly are this branches:
mk/use-size-t-in-zlib
tb/print-size-t-with-uintmax-format

And then it is rebased on top of all cooking stuff, too many branches
to be mentioned here.

It may be usefull to examine all "unsigned long" which are left after
this patch and turn them into (what ? unsigned int? size_t? uint32_t ?).
And once they are settled, re-do this patch with help of a coccinelle script.
I don't know.
I probably will rebase it until Junio says stop or someone else comes with
a better solution.

apply.c  | 14 -
 archive-tar.c| 18 +--
 archive-zip.c|  2 +-
 archive.c|  2 +-
 archive.h|  2 +-
 bisect.c |  2 +-
 blame.c  |  6 ++--
 blame.h  |  2 +-
 builtin/cat-file.c   | 10 +++---
 builtin/difftool.c   |  2 +-
 builtin/fast-export.c|  6 ++--
 builtin/fmt-merge-msg.c  |  3 +-
 builtin/fsck.c   |  6 ++--
 builtin/grep.c   |  8 ++---
 builtin/index-pack.c | 27 
 builtin/log.c|  4 +--
 builtin/ls-tree.c|  2 +-
 builtin/merge-tree.c |  6 ++--
 builtin/mktag.c  |  4 +--
 builtin/notes.c  |  6 ++--
 builtin/pack-objects.c   | 56 +-
 builtin/reflog.c |  2 +-
 builtin/replace.c|  2 +-
 builtin/tag.c|  4 +--
 builtin/unpack-file.c|  2 +-
 builtin/unpack-objects.c | 35 ++---
 builtin/verify-commit.c  |  4 +--
 bundle.c |  2 +-
 cache.h  | 10 +++---
 combine-diff.c   | 11 ---
 commit.c | 22 +++---
 commit.h | 10 +++---
 config.c |  2 +-
 convert.c| 18 +--
 delta.h  | 20 ++--
 diff-delta.c |  4 +--
 diff.c   | 30 +-
 diff.h   |  2 +-
 diffcore-pickaxe.c   |  4 +--
 diffcore.h   |  2 +-
 dir.c|  6 ++--
 dir.h|  2 +-
 entry.c  |  4 +--
 fast-import.c| 26 
 fsck.c   | 12 
 fsck.h   |  2 +-
 fuzz-pack-headers.c  |  4 +--
 grep.h   |  2 +-
 http-push.c  |  2 +-
 list-objects-filter.c|  2 +-
 mailmap.c|  2 +-
 match-trees.c|  4 +--
 merge-blobs.c|  6 ++--
 merge-blobs.h|  2 +-
 merge-recursive.c|  4 +--
 notes-cache.c|  2 +-
 notes-merge.c|  4 +--
 notes.c  |  6 ++--
 object-store.h   | 20 ++--
 object.c |  4 +--
 object.h |  2 +-
 pack-check.c |  2 +-
 pack-objects.h   | 14 -
 pack.h   |  2 +-
 packfile.c   | 40 
 packfile.h   |  8 ++---
 patch-delta.c|  8 ++---
 range-diff.c |  2 +-
 read-cache.c | 48 ++---
 ref-filter.c | 30 +-
 remote-testsvn.c |  4 +--
 rerere.c |  2 +-
 sha1-file.c  | 66 
 sha1dc_git.c |  2 +-
 sha1dc_git.h |  2 +-
 streaming.c  | 12 
 streaming.h  |  2 +-
 submodule-config.c   |  2 +-
 t/helper/test-delta.c|  2 +-
 tag.c|  6 ++--
 tag.h|  2 +-
 tree-walk.c  | 14 -
 tree.c  

[PATCH/RFC v2 1/1] Use size_t instead of 'unsigned long' for data in memory

2018-11-19 Thread tboegi
From: Torsten Bögershausen 

Currently the length of data which is stored in memory is stored
in "unsigned long" at many places in the code base.
This is OK when both "unsigned long" and size_t are 32 bits,
(and is OK when both are 64 bits).
On a 64 bit Windows system am "unsigned long" is 32 bit, and
that may be too short to measure the size of objects in memory,
a size_t is the natural choice.

Improve the code base in "small steps", as small as possible.
The smallest step seems to be much bigger than expected.
See this code-snippet from convert.c:

const char *ret;
unsigned long sz;
void *data = read_blob_data_from_index(istate, path, );
ret = gather_convert_stats_ascii(data, sz);

The corrected version looks like this:
const char *ret;
size_t sz;
void *data = read_blob_data_from_index(istate, path, );
ret = gather_convert_stats_ascii(data, sz);

However, when the Git code base is compiled with a compiler that
complains that "unsigned long" is different from size_t, we end
up in this huge patch, before the code base cleanly compiles.

Signed-off-by: Torsten Bögershausen 
---

Thanks for all the comments on V1.
Changes since V1:
- Make the motivation somewhat clearer in the commit message
- Rebase to the November 19 pu

What we really need for this patch to fly are this branches:
mk/use-size-t-in-zlib
tb/print-size-t-with-uintmax-format

And then it is rebased on top of all cooking stuff, too many branches
to be mentioned here.

It may be usefull to examine all "unsigned long" which are left after
this patch and turn them into (what ? unsigned int? size_t? uint32_t ?).
And once they are settled, re-do this patch with help of a coccinelle script.
I don't know.
I probably will rebase it until Junio says stop or someone else comes with
a better solution.

apply.c  | 14 -
 archive-tar.c| 18 +--
 archive-zip.c|  2 +-
 archive.c|  2 +-
 archive.h|  2 +-
 bisect.c |  2 +-
 blame.c  |  6 ++--
 blame.h  |  2 +-
 builtin/cat-file.c   | 10 +++---
 builtin/difftool.c   |  2 +-
 builtin/fast-export.c|  6 ++--
 builtin/fmt-merge-msg.c  |  3 +-
 builtin/fsck.c   |  6 ++--
 builtin/grep.c   |  8 ++---
 builtin/index-pack.c | 27 
 builtin/log.c|  4 +--
 builtin/ls-tree.c|  2 +-
 builtin/merge-tree.c |  6 ++--
 builtin/mktag.c  |  4 +--
 builtin/notes.c  |  6 ++--
 builtin/pack-objects.c   | 56 +-
 builtin/reflog.c |  2 +-
 builtin/replace.c|  2 +-
 builtin/tag.c|  4 +--
 builtin/unpack-file.c|  2 +-
 builtin/unpack-objects.c | 35 ++---
 builtin/verify-commit.c  |  4 +--
 bundle.c |  2 +-
 cache.h  | 10 +++---
 combine-diff.c   | 11 ---
 commit.c | 22 +++---
 commit.h | 10 +++---
 config.c |  2 +-
 convert.c| 18 +--
 delta.h  | 20 ++--
 diff-delta.c |  4 +--
 diff.c   | 30 +-
 diff.h   |  2 +-
 diffcore-pickaxe.c   |  4 +--
 diffcore.h   |  2 +-
 dir.c|  6 ++--
 dir.h|  2 +-
 entry.c  |  4 +--
 fast-import.c| 26 
 fsck.c   | 12 
 fsck.h   |  2 +-
 fuzz-pack-headers.c  |  4 +--
 grep.h   |  2 +-
 http-push.c  |  2 +-
 list-objects-filter.c|  2 +-
 mailmap.c|  2 +-
 match-trees.c|  4 +--
 merge-blobs.c|  6 ++--
 merge-blobs.h|  2 +-
 merge-recursive.c|  4 +--
 notes-cache.c|  2 +-
 notes-merge.c|  4 +--
 notes.c  |  6 ++--
 object-store.h   | 20 ++--
 object.c |  4 +--
 object.h |  2 +-
 pack-check.c |  2 +-
 pack-objects.h   | 14 -
 pack.h   |  2 +-
 packfile.c   | 40 
 packfile.h   |  8 ++---
 patch-delta.c|  8 ++---
 range-diff.c |  2 +-
 read-cache.c | 48 ++---
 ref-filter.c | 30 +-
 remote-testsvn.c |  4 +--
 rerere.c |  2 +-
 sha1-file.c  | 66 
 sha1dc_git.c |  2 +-
 sha1dc_git.h |  2 +-
 streaming.c  | 12 
 streaming.h  |  2 +-
 submodule-config.c   |  2 +-
 t/helper/test-delta.c|  2 +-
 tag.c|  6 ++--
 tag.h|  2 +-
 tree-walk.c  | 14 -
 tree.c  

Re: help.autoCorrect prefix selection considered a bit dangerous

2018-11-19 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> I don't have time to poke at this now, but wonder if:
>
>  1) The correction facility shouldn't at least have a list of "this does
> stuff over the wire" commands and would then use a more conservative
> estimate.

Not limited to 'over the wire' but 'can have consequences that might
cause regret' would be a reasonable list to have.

On a similar topic, it would be a disaster for "git reset --h"
to complete to "--hard" instead of "--help", for example.  Perhaps
parse-options API also needs to learn a list of possibly regrettable
options.


Re: [PATCH/RFC] checkout: print something when checking out paths

2018-11-19 Thread Junio C Hamano
Duy Nguyen  writes:

>> > I see this at the same level as "Switched to branch 'foo'" which is
>> > also protected by opts->quiet. If we start hiding messages based on
>> > tty it should be done for other messages in update_refs_for_switch()
>> > too, I guess?
>
> No let's drop this for now. I promise to come back with something
> better (at least it still sounds better in my mind). If that idea does
> not work out, we can come back and see if we can improve this.

Let's leave it in 'pu'.

I do agree that this is similar to existing messages that talk about
checkout out a branch to work on etc., and I think giving feedback
when checkout paths out _is_ a good thing to do for interactive
users.

If we were to squelch such "notice" output for non-interactive use,
we should do so to the "notice" messages for checking out a branch,
as well, and also to other subcommands that report what they did
with these "notice" output.  And that is a separate topic.

The primary reason why I was annoyed was because "make test" (I
think I have DEFAULT_TEST_TARGET=prove, if it matters) output was
littered with these "checked out N paths", even though I am not
asking for verbose output.  

It could be that the real cause of that is perhaps because we have
too many "git checkout" that is outside test_expect_* block, in
which case we should fix these tests to perform the checkout inside
a test_expect_success block for test set-up, and my annoyance was
only shooting at the messenger.

For example, the attached patch illustrates the right problem but
addresses it in a wrong way.  This checkout_files() helper does too
many things outside (and before) the test_expect_success block it
has (other helpers like commit_chk_wrnNNO share the same problem),
and making "git checkout" noisy will reveal that as a problem by
leaking the noisy output directly to the tester.  But the real fix
is to enclose the set-up step inside a test_expect_success block,
which is not done by the following illustration patch, which instead
just squelches the message.

 t/t0027-auto-crlf.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index beb5927f77..3587e454f1 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -293,9 +293,9 @@ checkout_files () {
do
rm crlf_false_attr__$f.txt &&
if test -z "$ceol"; then
-   git checkout crlf_false_attr__$f.txt
+   git checkout -- crlf_false_attr__$f.txt
else
-   git -c core.eol=$ceol checkout crlf_false_attr__$f.txt
+   git -c core.eol=$ceol checkout -- 
crlf_false_attr__$f.txt
fi
done
 



[L10N] Kickoff for Git 2.20.0 round 1

2018-11-19 Thread Jiang Xin
Hi,

Git v2.20.0-rc0 has been released, and it's time to start new round of git l10n.
This time there are 254 updated messages need to be translated since last
update:

l10n: git.pot: v2.20.0 round 1 (254 new, 27 removed)

Generate po/git.pot from v2.20.0-rc0-23-gbb75be6cb9 for git v2.20.0 l10n
round 1.

Signed-off-by: Jiang Xin https://github.com/git-l10n/git-po/

As how to update your XX.po and help to translate Git, please see
"Updating a XX.po file" and other sections in “po/README" file.

--
Jiang Xin


Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Junio C Hamano
Duy Nguyen  writes:

>  
> - for (i = 0; i < state->istate->cache_nr; i++) {
> + for (i = 0; i < trust_ino && state->istate->cache_nr; i++) {

There is some typo here, but modulo that this looks like the right
thing to do.

> @@ -419,10 +419,24 @@ static void mark_colliding_entries(const struct 
> checkout *state,
>   if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
>   continue;
>  
> - if ((trust_ino && dup->ce_stat_data.sd_ino == st->st_ino) ||
> - (!trust_ino && !fspathcmp(ce->name, dup->name))) {
> + if (dup->ce_stat_data.sd_ino == (unsigned int)st->st_ino) {

This is slightly unfortunate but is the best we can do for now.  

The reason why the design of the "cached stat info" mechanism allows
the sd_* fields to be narrower than the underlying fields is because
they are used only as an early-culling measure (if the value saved
with truncation is different from the current value with truncation,
then they cannot possibly be the same, so we know that the file
changed without looking at the contents).

This use however is different.  Equality of truncated values
immediately declare CE_MATCHED here, producing false negative, which
is not what we want, no?

>   dup->ce_flags |= CE_MATCHED;
> + return;
> + }
> + }



Re: [PATCH] msvc: Directly use MS version (_stricmp) of strcasecmp

2018-11-19 Thread Junio C Hamano
Sven Strickroth  writes:

> This also removes an implicit conversion from size_t (unsigned) to int 
> (signed).
>
> _stricmp as well as _strnicmp are both available since VS2012.
>
> Signed-off-by: Sven Strickroth 
> ---
>  compat/msvc.h | 8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)

Will apply, thanks.

The substition from ftello with _ftelli64 does not appear in our
codebase yet, but it was easy enough to adjust the patch myself, so
no need to resend this patch.

> diff --git a/compat/msvc.h b/compat/msvc.h
> index e6e1a6bbf7..2d558bae14 100644
> --- a/compat/msvc.h
> +++ b/compat/msvc.h
> @@ -14,18 +14,12 @@
>  #define inline __inline
>  #define __inline__ __inline
>  #define __attribute__(x)
> +#define strcasecmp   _stricmp
>  #define strncasecmp  _strnicmp
>  #define ftruncate_chsize
>  #define strtoull _strtoui64
>  #define strtoll  _strtoi64
>  
> -static __inline int strcasecmp (const char *s1, const char *s2)
> -{
> - int size1 = strlen(s1);
> - int sisz2 = strlen(s2);
> - return _strnicmp(s1, s2, sisz2 > size1 ? sisz2 : size1);
> -}
> -
>  #undef ERROR
>  
>  #define ftello _ftelli64


Re: [PATCH/RFC v1 1/1] Use size_t instead of unsigned long

2018-11-19 Thread Junio C Hamano
Torsten Bögershausen  writes:

> The only problematic system is Win64, where "unsigned long" is 32 bit,
> and therefore we must use size_t to address data in memory.
> This is not to be confused with off_t, which is used for "data on disk"
> (and nothing else) or timestamp_t which is used for timestamps (and nothing 
> else).
>
> I haven't followed the "coccinelle script" development at all, if someone
> makes a patch do replace "unsigned long" with size_t, that could replace
> my whole patch. (Some of them may be downgraded to "unsigned int" ?)

This paragraph makes it sound as if this patch is s/ulong/size_t/,
but that contradicts with the previous paragraph, no?  It is much
better to leave a ulong that is not about the size of a memory
region as-is, to be turned into appropriate and correct type later,
than changing it into another wrong type (size_t).

In short, we could do ulong to size_t with Coccinelle, but I do not
think we want to blindly rewrite all.



Re: [PATCH] technical doc: add a design doc for the evolve command

2018-11-19 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:
> On Thu, Nov 15 2018, sxe...@google.com wrote:

>> +Parent-type
>> +---
>> +The “parent-type” field in the commit header identifies a commit as a
>> +meta-commit and indicates the meaning for each of its parents. It is never
>> +present for normal commits.
[...]
> I think it's worth pointing out for those that are rusty on commit
> object details (but I checked) is that the reason for it not being:
>
> tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
> parent aa7ce55545bf2c14bef48db91af1a74e2347539a
> parent-type content
> parent d64309ee51d0af12723b6cb027fc9f195b15a5e9
> parent-type obsolete
> parent 7e1bbcd3a0fa854a7a9eac9bf1eea6465de98136
> parent-type origin
> author Stefan Xenos  1540841596 -0700
> committer Stefan Xenos  1540841596 -0700
>
> Which would be easier to read, is that we're very sensitive to the order
> of the first few fields (tree -> parent -> author -> committer) and fsck
> will error out if we interjected a new field.

By the way, in the spirit of limiting the initial scope, I wonder
whether the parent-type fields can be stored in the commit message
initially.

Elsewhere in this thread it was mentioned that the parent-type is a
field to allow tools like "git fsck" to understand the meaning of
these parent relationships (for example, to forbid a commit
referencing a meta-commit).  The same could be done using special
commit message text, though.

The advantage of such an approach would be that we could experiment
without changing the official object format at all.  If experiments
revealed a different set of information to store, we could update the
format without having to maintain the memory of the older format in
"git fsck"'s understanding of commit object fields.  So even though I
think that in the end we would want to put this information in the
commit object header, I'm tempted to suspect that the benefits of
putting it in the commit message to start outweigh the costs (in
particular, of having to migrate to another format later).

Thanks,
Jonathan


Re: [PATCH] technical doc: add a design doc for the evolve command

2018-11-19 Thread Jonathan Nieder
Hi,

Stefan Xenos wrote:

> But since several comments have focused on the commands, let's brainstorm!
>
> Here's some options that occur to me:
>
> 1. Three commands: evolve + change + obslog as top-level commands (the
> current proposal). Change gets a bunch of subcommands for manipulating
> the change graph and metas/ namespace.
>
> 2. All top-level: evolve + lschange + mkchange + rmchange +
> prunechange + obslog. None of the commands get subcommands.
>
> 3. Everything under change: "change evolve", "change obslog" become
> new change subcommands.
>
> 4. Evolve as a rebase argument, obslog as a log argument. Use "rebase
> --evolve" to initiate evolve and use "log --obslog" to initiate
> obslog. The change command stays as it is in the proposal.
>
> 5. Two commands: evolve + change. obslog becomes a "log" argument.
>
> Note that there will be more "evolve"-specific arguments in the
> future. For most transformations that evolve uses, there will be a
> matching argument to enable or disable that transformation and as we
> add transformations we'll also add arguments.
>
> If we go with option 3, it would make for a very cluttered help page.
> For example, if you're looking for information on how to use evolve,
> you'd have to scroll past a bunch of formatting information that are
> only relevant to obslog... and if you're looking for the formatting
> options, you'd have to scroll past a bunch of
> transformation-enablement options that are only relevant to evolve.
>
> Based on your log feedback above, I'm thinking #5 may make sense.

(5) sounds good to me, too.  Thanks, both, for your thoughtfulness.

Jonathan


Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Carlo Arenas
ok 99 - colliding file detection

as well in macOS with APFS

Carlo


Re: [PATCH] technical doc: add a design doc for the evolve command

2018-11-19 Thread Junio C Hamano
Stefan Xenos  writes:

>> But it is not immediately obvious to me how it would help to have
>> "Z was cherry-picked from W" in "evolve".
>
> The evolve command would use it for handling the
> obsolescence-over-cherry-pick (OOCP) feature. If someone cherry-picks
> a commit and then amends the original, the evolve command would give
> you the option of applying the same amendment to the cherry-picked
> version.

Yeah, I missed that case when I was formulating my thought on how we
can start smaller and simpler to get the ball rolling.  And for
"this commit and anything built on top of it need to be adjusted
since that other commit, which this commit was made by cherry-picking
it, has been obsoleted" to work, the "origin" commit pointed at by
the meta commit must be made available.

> Are you claiming that this is undesirable, or are you claiming that
> this could be accomplished without origin parents?

I was trying to see if this is something we can leave out to limit
the initial scope.


Re: Cygwin Git with Windows paths

2018-11-19 Thread Steven Penny
On Sun, Nov 18, 2018 at 11:21 PM Torsten Bögershausen wrote:
> If nothing works,
> it may help to add some fprintf(stderr,...) in the functions used
> by 05b458c104708141d2f:
>
> strip_last_component(),
> get_next_component()
> real_path_internal()

I didnt see any "real_path_internal" in the current codebase - however i added
some "printf" to the other 2 and got this:

$ git clone git://github.com/benhoyt/goawk 'C:\cygwin64\tmp\goawk'
get_next_component, next, []
get_next_component, remaining, [C:\cygwin64\tmp\goawk]
Cloning into 'C:\cygwin64\tmp\goawk'...
get_next_component, next, []
get_next_component, remaining, [C:\cygwin64\tmp\goawk/.git]
fatal: Invalid path '/usr/local/cache/git/C:\cygwin64\tmp\goawk': No such file
or directory


Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Ramsay Jones



On 19/11/2018 23:29, Ramsay Jones wrote:
> 
> 
> On 19/11/2018 21:03, Duy Nguyen wrote:
>> First of all, Ramsay, it would be great if you could test the below
>> patch and see if it works on Cygwin. I assume since Cygwin shares the
>> underlying filesystem, it will share the same "no trusting inode"
>> issue with native builds (or it calculates inodes anyway using some
>> other source?).
> 
> Hmm, I have no idea why you would like me to try this patch - care
> to explain? [I just saw, "Has this been tested on cygwin?" and, since
> it has been happily passing for some time, responded yes!]
> 
> Just for the giggles, I removed the !CYGWIN prerequisite from the
> test and when, as expected, the test failed, had a look around:
> 
> $ pwd
> /home/ramsay/git/t/trash directory.t5601-clone
> $ cat icasefs/warning 
> Cloning into 'bogus'...
> done.
> warning: the following paths have collided (e.g. case-sensitive paths
> on a case-insensitive filesystem) and only one from the same
> colliding group is in the working tree:
> 
>   'x'
> $ cd icasefs/bogus
> $ ls -l
> total 0
> -rw-r--r-- 1 ramsay None 0 Nov 19 22:40 x
> $ git ls-files --debug
> ignoring EOIE extension
> X
>   ctime: 1542667201:664036600
>   mtime: 1542667201:663055400
>   dev: 2378432ino: 324352
>   uid: 1001   gid: 513
>   size: 0 flags: 0
> x
>   ctime: 1542667201:665026800
>   mtime: 1542667201:665026800
>   dev: 2378432ino: 324352
>   uid: 1001   gid: 513
>   size: 0 flags: 0
> $ 
> 
> So, both X and x are in the index with the same inode number.
> 
> Does that help?

Well, I haven't even looked at the patch, but when I apply it to
the current 'pu' branch (just what I happened to have checked out)
and run that one test:

$ ./t5601-clone.sh
...
ok 96 - shallow clone locally
ok 97 - GIT_TRACE_PACKFILE produces a usable pack
ok 98 - clone on case-insensitive fs
ok 99 - colliding file detection
ok 100 - partial clone
ok 101 - partial clone: warn if server does not support object filtering
ok 102 - batch missing blob request during checkout
ok 103 - batch missing blob request does not inadvertently try to fetch gitlinks
# passed all 103 test(s)
# SKIP no web server found at '/usr/sbin/apache2'
1..103
$ 

... the colliding file detection test passes!

ATB,
Ramsay Jones




Re: [PATCH] tests: send "bug in the test script" errors to the script's stderr

2018-11-19 Thread SZEDER Gábor
On Mon, Nov 19, 2018 at 02:44:32PM -0500, Jeff King wrote:
> On Mon, Nov 19, 2018 at 02:13:26PM +0100, SZEDER Gábor wrote:
> 
> > Send these "bug in the test script" error messages directly to the
> > test scripts standard error and thus to the terminal, so those bugs
> > will be much harder to overlook.  Instead of updating all ~20 such
> > 'error' calls with a redirection, let's add a BUG() function to
> > 'test-lib.sh', wrapping an 'error' call with the proper redirection
> > and also including the common prefix of those error messages, and
> > convert all those call sites [4] to use this new BUG() function
> > instead.
> 
> Yes, I think this is an improvement.
> 
> > +BUG () {
> > +   error >&7 "bug in the test script: $*"
> > +}
> 
> I naively expected this to go to >&4, but of course that is the
> conditional "stderr or /dev/null, depending on --verbose" descriptor. 

The first version of this patch did send the error message to fd 4 ;)

> I do notice that many of the existing "FATAL:" errors use descriptor 5,
> which goes to stdout. I'm not sure if those should actually be going to
> stderr (or if there's some TAP significance to those lines).

I chose to send these messages to standard error, because they are,
well, errors.  TAP only cares about stdout, but by aborting the test
script in BUG(), error() or die() we are already violating TAP anyway,
so I don't think it matters whether we send "bug in test script" or
"FATAL" errors to stdout or stderr.

BTW, TAP understands "Bail out!" as bail out from the _entire_ test
suite, but that's not what we want here, I'd think.

https://testanything.org/tap-specification.html#bail-out



Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Ramsay Jones



On 19/11/2018 21:03, Duy Nguyen wrote:
> First of all, Ramsay, it would be great if you could test the below
> patch and see if it works on Cygwin. I assume since Cygwin shares the
> underlying filesystem, it will share the same "no trusting inode"
> issue with native builds (or it calculates inodes anyway using some
> other source?).

Hmm, I have no idea why you would like me to try this patch - care
to explain? [I just saw, "Has this been tested on cygwin?" and, since
it has been happily passing for some time, responded yes!]

Just for the giggles, I removed the !CYGWIN prerequisite from the
test and when, as expected, the test failed, had a look around:

$ pwd
/home/ramsay/git/t/trash directory.t5601-clone
$ cat icasefs/warning 
Cloning into 'bogus'...
done.
warning: the following paths have collided (e.g. case-sensitive paths
on a case-insensitive filesystem) and only one from the same
colliding group is in the working tree:

  'x'
$ cd icasefs/bogus
$ ls -l
total 0
-rw-r--r-- 1 ramsay None 0 Nov 19 22:40 x
$ git ls-files --debug
ignoring EOIE extension
X
  ctime: 1542667201:664036600
  mtime: 1542667201:663055400
  dev: 2378432  ino: 324352
  uid: 1001 gid: 513
  size: 0   flags: 0
x
  ctime: 1542667201:665026800
  mtime: 1542667201:665026800
  dev: 2378432  ino: 324352
  uid: 1001 gid: 513
  size: 0   flags: 0
$ 

So, both X and x are in the index with the same inode number.

Does that help?

ATB,
Ramsay Jones


Re: [PATCH] commit-graph: split up close_reachable() progress output

2018-11-19 Thread SZEDER Gábor
On Mon, Nov 19, 2018 at 08:23:00PM +, Ævar Arnfjörð Bjarmason wrote:
> Amend the progress output added in 7b0f229222 ("commit-graph write:
> add progress output", 2018-09-17) so that the total numbers it reports
> aren't higher than the total number of commits anymore. See [1] for a
> bug report pointing that out.

Please make the commit message more self-contained, i.e. describe the
issue this patch fixes in more detail, so readers won't have to follow
links to understand the problem.

> When I added this I wasn't intending to provide an accurate count, but
> just have some progress output to show the user the command wasn't
> hanging[2]. But since we are showing numbers, let's make them
> accurate. The progress descriptions were suggested by Derrick Stolee
> in [3].
> 
> As noted in [2] we are unlikely to show anything except the "Expanding
> reachable..." message even on fairly large repositories such as
> linux.git. On a test repository I have with north of 7 million commits
> all of these are displayed. Two of them don't show up for long, but as
> noted in [5] future-proofing this for if the loops become more
> expensive in the future makes sense.

In my opinion this is rather one of those "we'll cross that bridge
when (or if ever) we get there" situations.

> 1. https://public-inbox.org/git/20181010203738.ge23...@szeder.dev/
> 2. https://public-inbox.org/git/87pnwhea8y@evledraar.gmail.com/
> 3. 
> https://public-inbox.org/git/f7a0cbee-863c-61d3-4959-5cec8b43c...@gmail.com/
> 4. https://public-inbox.org/git/20181015160545.gg19...@szeder.dev/
> 5. https://public-inbox.org/git/87murle8da@evledraar.gmail.com/
> 
> Reported-by: SZEDER Gábor 
> Helped-by: Derrick Stolee 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
> 
> On Mon, Nov 19 2018, SZEDER Gábor wrote:
> 
> > Ping?
> >
> > We are at -rc0, this progress output is a new feature since v2.19.0,
> > and the numbers shown are still way off.
> 
> I was under the impression after your
> https://public-inbox.org/git/20181015160545.gg19...@szeder.dev/ that
> you were going to do some more digging & report back, so I put it on
> my "waiting for feedback" list and then forgot about it.

No, after I managed to "get my numbers straight" I sent another bug
report in

  https://public-inbox.org/git/20181015165447.gh19...@szeder.dev/

but as a reply to your original patch.  Sorry about the confusion.

> But here's a patch that should address the issue you pointed out, but
> I don't know if it fixes whatever you were alluding to in the linked
> E-Mail above.

I'm afraid this patch doesn't address that issue, as it's limited to
close_reachable(), and that issue is related to the progress output in
add_packed_commits().

>  commit-graph.c | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index 40c855f185..9c0d6914be 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -641,26 +641,29 @@ static void add_missing_parents(struct packed_oid_list 
> *oids, struct commit *com
>  
>  static void close_reachable(struct packed_oid_list *oids, int 
> report_progress)
>  {
> - int i;
> + int i, j;
>   struct commit *commit;
>   struct progress *progress = NULL;
> - int j = 0;
>  
>   if (report_progress)
>   progress = start_delayed_progress(
> - _("Annotating commits in commit graph"), 0);
> + _("Loading known commits in commit graph"), j = 0);
>   for (i = 0; i < oids->nr; i++) {
>   display_progress(progress, ++j);
>   commit = lookup_commit(the_repository, >list[i]);
>   if (commit)
>   commit->object.flags |= UNINTERESTING;
>   }
> + stop_progress();
>  
>   /*
>* As this loop runs, oids->nr may grow, but not more
>* than the number of missing commits in the reachable
>* closure.
>*/
> + if (report_progress)
> + progress = start_delayed_progress(
> + _("Expanding reachable commits in commit graph"), j = 
> 0);
>   for (i = 0; i < oids->nr; i++) {
>   display_progress(progress, ++j);
>   commit = lookup_commit(the_repository, >list[i]);
> @@ -668,7 +671,11 @@ static void close_reachable(struct packed_oid_list 
> *oids, int report_progress)
>   if (commit && !parse_commit(commit))
>   add_missing_parents(oids, commit);
>   }
> + stop_progress();
>  
> + if (report_progress)
> + progress = start_delayed_progress(
> + _("Clearing commit marks in commit graph"), j = 0);
>   for (i = 0; i < oids->nr; i++) {
>   display_progress(progress, ++j);
>   commit = lookup_commit(the_repository, >list[i]);
> -- 
> 2.19.1.1182.g4ecb1133ce
> 


Re: [PATCH] t5562: skip if NO_CURL is enabled

2018-11-19 Thread Jeff King
On Mon, Nov 19, 2018 at 11:26:03PM +0200, Max Kirillov wrote:

> On Mon, Nov 19, 2018 at 11:36:08AM -0800, Carlo Arenas wrote:
> > NO_CURL reflects the build setting (for http support); CURL checks for
> > the curl binary, but as Ævar points out the requirements must be from
> > somewhere else since a NO_CURL=1 build (tested in macOS) still passes
> > the test, but not in NetBSD.
> > 
> > tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in
> > t5562/invoke-with-content-length.pl,
> 
> I see.
> 
> In other perl files I can see either '#!/usr/bin/perl' or
> '#!/ust/bin/env perl'. The second one should be more
> portable. Does the latter work on the NetBSD?
> 
> To all: what is supposed to be done about it?

You should swap this out for $PERL_PATH. You can use write_script() to
help if you're copying the script around anyway. Though it looks like
you just run it from the one function. So maybe just:

diff --git a/t/t5562-http-backend-content-length.sh 
b/t/t5562-http-backend-content-length.sh
index b24d8b05a4..90d890d02f 100755
--- a/t/t5562-http-backend-content-length.sh
+++ b/t/t5562-http-backend-content-length.sh
@@ -31,6 +31,7 @@ test_http_env() {
PATH_TRANSLATED="$PWD/.git/git-$handler_type-pack" \
GIT_HTTP_EXPORT_ALL=TRUE \
REQUEST_METHOD=POST \
+   "$PERL_PATH" \
"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl \
"$request_body" git http-backend >act.out 2>act.err
 }

(note that it's normally OK to just run "perl", because we use a
shell-function wrapper that respects $PERL_PATH, but here we're actually
passing it to "env").

You could also lose the executable bit on the script at that point. It
doesn't matter much, but it would catch an erroneous call relying on the
shebang line.

-Peff


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 19 2018, Derrick Stolee wrote:

> On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Nov 19 2018, Derrick Stolee wrote:
>>
>>> [...]
>>> builtin/rebase.c
>>> 62c23938fa 55) return env;
>>> [...]
>>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
>>> where rebase.useBuiltin is off
>> This one would be covered with
>> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
>> the rest of the coverage would look different when passed through the 
>> various GIT_TEST_* options.
>>
>
> Thanks for pointing out this GIT_TEST_* variable to me. I had been
> running builds with some of them enabled, but didn't know about this
> one.
>
> Unfortunately, t3406-rebase-message.sh fails with
> GIT_TEST_REBASE_USE_BUILTIN=false and it bisects to 4520c2337: Merge
> branch 'ab/rebase-in-c-escape-hatch'.
>
> The issue is that the commit 04519d72 "rebase: validate -C and
> --whitespace= parameters early" introduced the following test
> that cares about error messages:
>
> +test_expect_success 'error out early upon -C or --whitespace=' '
> + test_must_fail git rebase -Cnot-a-number HEAD 2>err &&
> + test_i18ngrep "numerical value" err &&
> + test_must_fail git rebase --whitespace=bad HEAD 2>err &&
> + test_i18ngrep "Invalid whitespace option" err
> +'
>
> The merge commit then was the first place where this test could run
> with that variable.

Yup. Sorry should have mentioned that, it's broken in master. Reported
it earlier today:
https://public-inbox.org/git/874lcd1bub@evledraar.gmail.com/

> What's the correct fix here? Force the builtin rebase in this test?
> Unify the error message in the non-builtin case?

Probably to just force the builtin, unless Johannes wants to also fix
the bug for the shellscript version. I don't know if for 2.20 we're
trying to maintain 100% compatibility.


Re: [PATCH] t5562: skip if NO_CURL is enabled

2018-11-19 Thread Max Kirillov
On Mon, Nov 19, 2018 at 11:36:08AM -0800, Carlo Arenas wrote:
> NO_CURL reflects the build setting (for http support); CURL checks for
> the curl binary, but as Ævar points out the requirements must be from
> somewhere else since a NO_CURL=1 build (tested in macOS) still passes
> the test, but not in NetBSD.
> 
> tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in
> t5562/invoke-with-content-length.pl,

I see.

In other perl files I can see either '#!/usr/bin/perl' or
'#!/ust/bin/env perl'. The second one should be more
portable. Does the latter work on the NetBSD?

To all: what is supposed to be done about it?

> while I seem to be getting some
> sporadic errors in 9 with the following output :

This is more complicated.

Does it happen often?

Does test 12 ("push gzipped") ever fail?

So far I can imagine either a buffering issue or some
mistake in length calculation.


Re: [PATCH] technical doc: add a design doc for the evolve command

2018-11-19 Thread Stefan Xenos
> Subcommand names and --long-options are just as descriptive.

Yeah, I'm convinced about the descriptiveness. If you check the latest
version of the patch, I've already updated the "change" command to use
subcommands rather than lettered arguments.

> If a user wants to deal with reflogs, then there is 'git help reflog'

I guess it depends on whether you prefer having a single big help page
(risk: user may see irrelevant content), or a number of small help
pages (risk: user may need to follow cross-references). My guess is
that we should probably try to hit the sweet spot that minimizes the
amount of irrelevant information on a help page, the probability of
needing to follow a cross-reference to understand context, and the
amount of content that needs to be duplicated between pages.

But assuming we add a bunch of formatting options to obslog that match
log, it may make sense to just have an "--obslog" argument to log.

> I think 'git obslog' should allow the same when showing the log of a change.

Sounds good. We should probably also change the default formatting for
the obslog command to be some sort of description of what changed
since the commit message will probably be very similar for every
entry. I'll update the proposal to mention formatting options once we
sort out where obslog will actually live.

> By adding several new commands users will have to consult the manpages of 
> 'evolve',
> 'change', 'obslog', etc., even though the commands and the concepts are 
> closely related.

I'm not sure that's the case. There is some common background material
you'd need to understand in order to use any of those commands ("what
are changes?"), but the same could be said of pretty much any git
command ("what are commits?"). Assuming the user knows what a change
is, I'm pretty sure I could write a self-contained description for
evolve, change, or obslog that doesn't require cross-referencing any
of the other commands... and the evolve command could probably be
understood even without understanding changes.

But since several comments have focused on the commands, let's brainstorm!

Here's some options that occur to me:
1. Three commands: evolve + change + obslog as top-level commands (the
current proposal). Change gets a bunch of subcommands for manipulating
the change graph and metas/ namespace.
2. All top-level: evolve + lschange + mkchange + rmchange +
prunechange + obslog. None of the commands get subcommands.
3. Everything under change: "change evolve", "change obslog" become
new change subcommands.
4. Evolve as a rebase argument, obslog as a log argument. Use "rebase
--evolve" to initiate evolve and use "log --obslog" to initiate
obslog. The change command stays as it is in the proposal.
5. Two commands: evolve + change. obslog becomes a "log" argument.

Note that there will be more "evolve"-specific arguments in the
future. For most transformations that evolve uses, there will be a
matching argument to enable or disable that transformation and as we
add transformations we'll also add arguments.

If we go with option 3, it would make for a very cluttered help page.
For example, if you're looking for information on how to use evolve,
you'd have to scroll past a bunch of formatting information that are
only relevant to obslog... and if you're looking for the formatting
options, you'd have to scroll past a bunch of
transformation-enablement options that are only relevant to evolve.

Based on your log feedback above, I'm thinking #5 may make sense.

  - Stefan
On Mon, Nov 19, 2018 at 7:55 AM SZEDER Gábor  wrote:
>
> On Sat, Nov 17, 2018 at 12:30:58PM -0800, Stefan Xenos wrote:
> > > Further, I see that this document tries to suggest a proliferation of new 
> > > commands
> >
> > It does. Let me explain a bit about the reasoning behind this
> > breakdown of commands. My main priority was to keep the commands as
> > consistent with existing git commands as possible. Secondary goals
> > were:
> > - Mapping a single intent to a single command where possible makes it
> > easier to explain what that command does.
> > - Having lots of simpler commands as opposed to a few complex commands
> > makes them easier to type.
> > - Command names are more descriptive than lettered arguments.
>
> Subcommand names and --long-options are just as descriptive.
>
>
> > Git already has a "log" and "reflog" command for displaying two
> > different types of log,
>
> No, there is 'git log' for displaying logs in various ways, and there
> is 'git reflog' which not only displays reflogs, but also operates on
> them, e.g. deletes specific reflog entries or expires old entries,
> necessitating and justifying the dedicated 'git reflog' command.
>
> > so putting "obslog" on its own command makes
> > it consistent with the existing logs, easier to type, and keeps the
> > command simple.
>
> > - We could turn "obslog" into an extra option on the "log" command,
> > but that would be inconsistent with reflog and would complicate the
> > 

Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Derrick Stolee

On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote:

On Mon, Nov 19 2018, Derrick Stolee wrote:


[...]
builtin/rebase.c
62c23938fa 55) return env;
[...]
Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
where rebase.useBuiltin is off

This one would be covered with
GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
the rest of the coverage would look different when passed through the various 
GIT_TEST_* options.



Thanks for pointing out this GIT_TEST_* variable to me. I had been 
running builds with some of them enabled, but didn't know about this one.


Unfortunately, t3406-rebase-message.sh fails with 
GIT_TEST_REBASE_USE_BUILTIN=false and it bisects to 4520c2337: Merge 
branch 'ab/rebase-in-c-escape-hatch'.


The issue is that the commit 04519d72 "rebase: validate -C and 
--whitespace= parameters early" introduced the following test that 
cares about error messages:


+test_expect_success 'error out early upon -C or --whitespace=' '
+   test_must_fail git rebase -Cnot-a-number HEAD 2>err &&
+   test_i18ngrep "numerical value" err &&
+   test_must_fail git rebase --whitespace=bad HEAD 2>err &&
+   test_i18ngrep "Invalid whitespace option" err
+'

The merge commit then was the first place where this test could run with 
that variable.


What's the correct fix here? Force the builtin rebase in this test? 
Unify the error message in the non-builtin case?


Thanks,
-Stolee


Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Duy Nguyen
On Mon, Nov 19, 2018 at 10:03 PM Duy Nguyen  wrote:
> Thanks Carlo for the file and "stat" output. The problem is APFS has
> 64-bit inode (according to the Internet) while we store inodes as
> 32-bit, so it's truncated.
> ...
> We will have to deal with the same
> truncated inode elsewhere to make sure we index refresh performance
> does not degrade on APFS.

... and we don't have a problem there. Either Linus predicted dealing
with 64-bit inodes, or he had a habit of casting st_ino to unsigned
int, I cannot tell. This code

ce->st_ino != (unsigned int)st->st_ino

is from e83c516331 (Initial revision of "git", the information manager
from hell - 2005-04-07) and it's still used today for comparing sd_ino
with st->st_ino in read-cache.c. I guess I should have copied and
pasted more often.
-- 
Duy


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Derrick Stolee

On 11/19/2018 2:00 PM, Ben Peart wrote:



On 11/19/2018 10:40 AM, Derrick Stolee wrote:


There are a lot of lines introduced by the IEOT extension in these 
commits:


 > Ben Peart  3255089ad: ieot: add Index Entry Offset Table 
(IEOT) extension
 > Ben Peart  3b1d9e045: eoie: add End of Index Entry (EOIE) 
extension
 > Ben Peart  77ff1127a: read-cache: load cache entries on worker 
threads
 > Ben Peart  abb4bb838: read-cache: load cache extensions on a 
worker thread
 > Ben Peart  c780b9cfe: config: add new index.threads config 
setting
 > Ben Peart  d1664e73a: add: speed up cmd_add() by utilizing 
read_cache_preload()
 > Ben Peart  fa655d841: checkout: optimize "git checkout -b 
"




These should be hit if you run the test suite with 
GIT_TEST_INDEX_THREADS=2.  Without that, the indexes for the various 
tests are too small to trigger multi-threaded index reads/writes.


From t/README:

GIT_TEST_INDEX_THREADS= enables exercising the multi-threaded loading
of the index for the whole test suite by bypassing the default number of
cache entries and thread minimums. Setting this to 1 will make the
index loading single threaded.


I updated my build to add GIT_TEST_INDEX_THREADS=2 and I still see lots 
of uncovered stuff, including that load_cache_entries_threaded() is 
never run.


I added the following diff to my repo and ran the test suite manually 
with GIT_TEST_INDEX_THREADS=2 and it didn't fail:


diff --git a/read-cache.c b/read-cache.c
index 4ca81286c0..36502586a2 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2057,6 +2057,9 @@ static unsigned long 
load_cache_entries_threaded(struct index_state *istate, con

    struct load_cache_entries_thread_data *data;
    unsigned long consumed = 0;

+   fprintf(stderr, "load_cache_entries_threaded\n");
+   exit(1);
+
    /* a little sanity checking */
    if (istate->name_hash_initialized)
    BUG("the name hash isn't thread safe");

Am I missing something? Is there another variable I should add?

When I look for where the GIT_TEST_INDEX_THREADS variable is checked, I 
see that the calls to git_config_get_index_threads() are followed by a 
check for NO_PTHREADS (setting the number of threads to 1 again). Is it 
possible that my compiler environment is not allowing me to even compile 
with threads?


Thanks,
-Stolee





Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Duy Nguyen
... and I "dear Ramsay" without CCing him.. sigh.. sorry for the noise.

On Mon, Nov 19, 2018 at 10:03 PM Duy Nguyen  wrote:
>
> First of all, Ramsay, it would be great if you could test the below
> patch and see if it works on Cygwin. I assume since Cygwin shares the
> underlying filesystem, it will share the same "no trusting inode"
> issue with native builds (or it calculates inodes anyway using some
> other source?).
>
> Back to the APFS problem...
>
> On Mon, Nov 19, 2018 at 07:24:26PM +0100, Duy Nguyen wrote:
> > Could you send me the "index" file in  t/trash\
> > directory.t5601-clone/icasefs/bogus/.git/index ? Also the output of
> > "stat /path/to/icase/bogus/x"
> >
> > My only explanation is somehow the inode value we save is not the same
> > one on disk, which is weird and could even cause other problems. I'd
> > like to know why this happens before trying to fix anything.
>
> Thanks Carlo for the file and "stat" output. The problem is APFS has
> 64-bit inode (according to the Internet) while we store inodes as
> 32-bit, so it's truncated. Which means this comparison
>
> sd_ino == st_ino
>
> is never true because sd_ino is truncated (0x2121063) while st_ino is
> not (0x202121063).
>
> Carlo, it would be great if you could test this patch also with
> APFS. It should fix problem. We will have to deal with the same
> truncated inode elsewhere to make sure we index refresh performance
> does not degrade on APFS. But that's a separate problem. Thank you for
> bringing this up.
>
> -- 8< --
> diff --git a/entry.c b/entry.c
> index 5d136c5d55..809d3e2ba7 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -404,13 +404,13 @@ static void mark_colliding_entries(const struct 
> checkout *state,
>  {
> int i, trust_ino = check_stat;
>
> -#if defined(GIT_WINDOWS_NATIVE)
> +#if defined(GIT_WINDOWS_NATIVE) || defined(__CYGWIN__)
> trust_ino = 0;
>  #endif
>
> ce->ce_flags |= CE_MATCHED;
>
> -   for (i = 0; i < state->istate->cache_nr; i++) {
> +   for (i = 0; i < trust_ino && state->istate->cache_nr; i++) {
> struct cache_entry *dup = state->istate->cache[i];
>
> if (dup == ce)
> @@ -419,10 +419,24 @@ static void mark_colliding_entries(const struct 
> checkout *state,
> if (dup->ce_flags & (CE_MATCHED | CE_VALID | 
> CE_SKIP_WORKTREE))
> continue;
>
> -   if ((trust_ino && dup->ce_stat_data.sd_ino == st->st_ino) ||
> -   (!trust_ino && !fspathcmp(ce->name, dup->name))) {
> +   if (dup->ce_stat_data.sd_ino == (unsigned int)st->st_ino) {
> dup->ce_flags |= CE_MATCHED;
> +   return;
> +   }
> +   }
> +
> +   for (i = 0; i < state->istate->cache_nr; i++) {
> +   struct cache_entry *dup = state->istate->cache[i];
> +
> +   if (dup == ce)
> break;
> +
> +   if (dup->ce_flags & (CE_MATCHED | CE_VALID | 
> CE_SKIP_WORKTREE))
> +   continue;
> +
> +   if (!fspathcmp(ce->name, dup->name)) {
> +   dup->ce_flags |= CE_MATCHED;
> +   return;
> }
> }
>  }
> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
> index f1a49e94f5..c28d51bd59 100755
> --- a/t/t5601-clone.sh
> +++ b/t/t5601-clone.sh
> @@ -628,7 +628,7 @@ test_expect_success 'clone on case-insensitive fs' '
> )
>  '
>
> -test_expect_success !MINGW,!CYGWIN,CASE_INSENSITIVE_FS 'colliding file 
> detection' '
> +test_expect_success !MINGW,CASE_INSENSITIVE_FS 'colliding file detection' '
> grep X icasefs/warning &&
> grep x icasefs/warning &&
> test_i18ngrep "the following paths have collided" icasefs/warning
> -- 8< --



-- 
Duy


Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Duy Nguyen
First of all, Ramsay, it would be great if you could test the below
patch and see if it works on Cygwin. I assume since Cygwin shares the
underlying filesystem, it will share the same "no trusting inode"
issue with native builds (or it calculates inodes anyway using some
other source?).

Back to the APFS problem...

On Mon, Nov 19, 2018 at 07:24:26PM +0100, Duy Nguyen wrote:
> Could you send me the "index" file in  t/trash\
> directory.t5601-clone/icasefs/bogus/.git/index ? Also the output of
> "stat /path/to/icase/bogus/x"
> 
> My only explanation is somehow the inode value we save is not the same
> one on disk, which is weird and could even cause other problems. I'd
> like to know why this happens before trying to fix anything.

Thanks Carlo for the file and "stat" output. The problem is APFS has
64-bit inode (according to the Internet) while we store inodes as
32-bit, so it's truncated. Which means this comparison

sd_ino == st_ino

is never true because sd_ino is truncated (0x2121063) while st_ino is
not (0x202121063).

Carlo, it would be great if you could test this patch also with
APFS. It should fix problem. We will have to deal with the same
truncated inode elsewhere to make sure we index refresh performance
does not degrade on APFS. But that's a separate problem. Thank you for
bringing this up.

-- 8< --
diff --git a/entry.c b/entry.c
index 5d136c5d55..809d3e2ba7 100644
--- a/entry.c
+++ b/entry.c
@@ -404,13 +404,13 @@ static void mark_colliding_entries(const struct checkout 
*state,
 {
int i, trust_ino = check_stat;
 
-#if defined(GIT_WINDOWS_NATIVE)
+#if defined(GIT_WINDOWS_NATIVE) || defined(__CYGWIN__)
trust_ino = 0;
 #endif
 
ce->ce_flags |= CE_MATCHED;
 
-   for (i = 0; i < state->istate->cache_nr; i++) {
+   for (i = 0; i < trust_ino && state->istate->cache_nr; i++) {
struct cache_entry *dup = state->istate->cache[i];
 
if (dup == ce)
@@ -419,10 +419,24 @@ static void mark_colliding_entries(const struct checkout 
*state,
if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
continue;
 
-   if ((trust_ino && dup->ce_stat_data.sd_ino == st->st_ino) ||
-   (!trust_ino && !fspathcmp(ce->name, dup->name))) {
+   if (dup->ce_stat_data.sd_ino == (unsigned int)st->st_ino) {
dup->ce_flags |= CE_MATCHED;
+   return;
+   }
+   }
+
+   for (i = 0; i < state->istate->cache_nr; i++) {
+   struct cache_entry *dup = state->istate->cache[i];
+
+   if (dup == ce)
break;
+
+   if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
+   continue;
+
+   if (!fspathcmp(ce->name, dup->name)) {
+   dup->ce_flags |= CE_MATCHED;
+   return;
}
}
 }
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index f1a49e94f5..c28d51bd59 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -628,7 +628,7 @@ test_expect_success 'clone on case-insensitive fs' '
)
 '
 
-test_expect_success !MINGW,!CYGWIN,CASE_INSENSITIVE_FS 'colliding file 
detection' '
+test_expect_success !MINGW,CASE_INSENSITIVE_FS 'colliding file detection' '
grep X icasefs/warning &&
grep x icasefs/warning &&
test_i18ngrep "the following paths have collided" icasefs/warning
-- 8< --


Re: [PATCH] tests: add a special setup where rebase.useBuiltin is off (Re: Git Test Coverage Report (v2.20.0-rc0))

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 19 2018, Jonathan Nieder wrote:

> Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Nov 19 2018, Derrick Stolee wrote:
>
>>> [...]
>>> builtin/rebase.c
>>> 62c23938fa 55) return env;
>>> [...]
>>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
>>> where rebase.useBuiltin is off
>>
>> This one would be covered with
>> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
>> the rest of the coverage would look different when passed through the 
>> various GIT_TEST_* options.
>
> I wonder if we can do better for this kind of thing.
>
> When I do routine development, I am not running tests with any
> non-default flags.  So why should tests run with non-default flags
> count toward coverage?  Is there a way to make the default test
> settings dip their feet into some non-default configurations, without
> running the full battery of tests and slowing tests down accordingly?
> E.g. is there some kind of smoke test that rebase with
> useBuiltin=false works at all that could run, even if I am not running
> the full battery of rebase tests?

Yeah, definitely. Just pointing out that it would smoke out coverage we
don't have at all v.s. cases where we just don't have coverage with the
default tests without any special modes.

Derrick: I think it would be useful to produce some delta report showing
covered lines without any GIT_TEST_* variables v.s. when they're set. As
Jonathan points out those should ideally be tested with the normal test
suite, leaving GIT_TEST_* just for stress testing to find new bugs.

> That's a bit of a non sequitor for this example, which is actual code
> to handle GIT_TEST_REBASE_USE_BUILTIN, though.  For it, I wonder why
> we need rebase.c to understand the envvar --- couldn't test-lib.sh
> take care of setting rebase.useBuiltin to false when it's set?

I guess the test-lib.sh could pass down things like
"GIT_CONFIG_PARAMETERS='x.y=z'". Maybe that's a better way to do it.


Re: [PATCH] commit-graph: split up close_reachable() progress output

2018-11-19 Thread Derrick Stolee

On 11/19/2018 3:23 PM, Ævar Arnfjörð Bjarmason wrote:

+   if (report_progress)
+   progress = start_delayed_progress(
+   _("Expanding reachable commits in commit graph"), j = 
0);


This should be the only one that shows up in all but the very largest of 
repositories.


LGTM.

Thanks,
-Stolee


help.autoCorrect prefix selection considered a bit dangerous

2018-11-19 Thread Ævar Arnfjörð Bjarmason
Replying to this blast from the past:
https://public-inbox.org/git/1290787239-4508-1-git-send-email-kusmab...@gmail.com/

I apparently like to live dangerously and have help.autoCorrect
enabled. I just had:

git puss

Auto-corrected to:

git push

When I meant:

git pull

(For those wondering how I could have mistyped that, "l" and "s" are
right next to each other on a Dvorak layout).

As seen in the E-Mail from 2010 this intentional, i.e. "pull" is pruned
since the "pu" prefix isn't matched, but "pus" is. This was meant to
correct e.g. "git st" to "git status".

I don't have time to poke at this now, but wonder if:

 1) The correction facility shouldn't at least have a list of "this does
stuff over the wire" commands and would then use a more conservative
estimate.

 2) Whether we can do better with typo detection. E.g. add commands like
"pull" to the list if we have a long enough prefix for them, and if
the number of characters entered matches the number of characters in
another command.


Re: [PATCH] technical doc: add a design doc for the evolve command

2018-11-19 Thread Jonathan Nieder
Hi,

Xenos wrote:

> Lets explore the "when" question. I think there's a compelling reason
> to add them as soon as possible - namely, gerrit. If and when we come
> to some sort of agreement on this proposal, gerrit could start adding
> tooling to understand change graphs as an alternative to change-id
> footers. That work could proceed in parallel with the work in git-core
> once we know what the data structures look like, but it can't start
> until the data structures are sufficient to address all the use cases
> that were previously covered by change-id. At the moment, meta-commits
> without origin parents would not cover all of gerrit's use-cases so
> this would block adoption in gerrit.

By this, are you referring to the "Cherry-picks" list in the Gerrit
web UI?

Thanks,
Jonathan


[PATCH] commit-graph: split up close_reachable() progress output

2018-11-19 Thread Ævar Arnfjörð Bjarmason
Amend the progress output added in 7b0f229222 ("commit-graph write:
add progress output", 2018-09-17) so that the total numbers it reports
aren't higher than the total number of commits anymore. See [1] for a
bug report pointing that out.

When I added this I wasn't intending to provide an accurate count, but
just have some progress output to show the user the command wasn't
hanging[2]. But since we are showing numbers, let's make them
accurate. The progress descriptions were suggested by Derrick Stolee
in [3].

As noted in [2] we are unlikely to show anything except the "Expanding
reachable..." message even on fairly large repositories such as
linux.git. On a test repository I have with north of 7 million commits
all of these are displayed. Two of them don't show up for long, but as
noted in [5] future-proofing this for if the loops become more
expensive in the future makes sense.

1. https://public-inbox.org/git/20181010203738.ge23...@szeder.dev/
2. https://public-inbox.org/git/87pnwhea8y@evledraar.gmail.com/
3. https://public-inbox.org/git/f7a0cbee-863c-61d3-4959-5cec8b43c...@gmail.com/
4. https://public-inbox.org/git/20181015160545.gg19...@szeder.dev/
5. https://public-inbox.org/git/87murle8da@evledraar.gmail.com/

Reported-by: SZEDER Gábor 
Helped-by: Derrick Stolee 
Signed-off-by: Ævar Arnfjörð Bjarmason 
---

On Mon, Nov 19 2018, SZEDER Gábor wrote:

> Ping?
>
> We are at -rc0, this progress output is a new feature since v2.19.0,
> and the numbers shown are still way off.

I was under the impression after your
https://public-inbox.org/git/20181015160545.gg19...@szeder.dev/ that
you were going to do some more digging & report back, so I put it on
my "waiting for feedback" list and then forgot about it.

But here's a patch that should address the issue you pointed out, but
I don't know if it fixes whatever you were alluding to in the linked
E-Mail above.

 commit-graph.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 40c855f185..9c0d6914be 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -641,26 +641,29 @@ static void add_missing_parents(struct packed_oid_list 
*oids, struct commit *com
 
 static void close_reachable(struct packed_oid_list *oids, int report_progress)
 {
-   int i;
+   int i, j;
struct commit *commit;
struct progress *progress = NULL;
-   int j = 0;
 
if (report_progress)
progress = start_delayed_progress(
-   _("Annotating commits in commit graph"), 0);
+   _("Loading known commits in commit graph"), j = 0);
for (i = 0; i < oids->nr; i++) {
display_progress(progress, ++j);
commit = lookup_commit(the_repository, >list[i]);
if (commit)
commit->object.flags |= UNINTERESTING;
}
+   stop_progress();
 
/*
 * As this loop runs, oids->nr may grow, but not more
 * than the number of missing commits in the reachable
 * closure.
 */
+   if (report_progress)
+   progress = start_delayed_progress(
+   _("Expanding reachable commits in commit graph"), j = 
0);
for (i = 0; i < oids->nr; i++) {
display_progress(progress, ++j);
commit = lookup_commit(the_repository, >list[i]);
@@ -668,7 +671,11 @@ static void close_reachable(struct packed_oid_list *oids, 
int report_progress)
if (commit && !parse_commit(commit))
add_missing_parents(oids, commit);
}
+   stop_progress();
 
+   if (report_progress)
+   progress = start_delayed_progress(
+   _("Clearing commit marks in commit graph"), j = 0);
for (i = 0; i < oids->nr; i++) {
display_progress(progress, ++j);
commit = lookup_commit(the_repository, >list[i]);
-- 
2.19.1.1182.g4ecb1133ce



Re: [PATCH] technical doc: add a design doc for the evolve command

2018-11-19 Thread Stefan Xenos
> But it is not immediately obvious to me how it would help to have "Z was 
> cherry-picked from W" in "evolve".

The evolve command would use it for handling the
obsolescence-over-cherry-pick (OOCP) feature. If someone cherry-picks
a commit and then amends the original, the evolve command would give
you the option of applying the same amendment to the cherry-picked
version.

Are you claiming that this is undesirable, or are you claiming that
this could be accomplished without origin parents?

> the developer wanted to use the change between W^ and W in a context that is 
> quite different from

I guess that depends on the reason for doing the cherry-pick. A very
common scenario I see for cherry-picks is cherry-picking a bugfix from
a development branch to a maintenance branch. In that situation, if
there was a better version of the original bugfix you'd also want to
update the cherry-pick on the maintenance branch to use the better
version of the fix. That's what OOCP does.

> make no sense to "evolve" anything that was built on top of W on top of Z.

Agreed. But that's not what evolve would do with the origin edges. It
would be looking for amendments of W, not children of W.

> It is of course OK to build a different feature that can take advantage of 
> the cherry-pick information on top of the same meta commit concept in later 
> steps

All valid points - we could build a useful "evolve" command without
origin edges (and without OOCP), we could easily add origin parents
later to a design that just supported obsolete and content parents,
and the decision about /when/ to add origin parents is orthogonal to
the decision about /if/ to add them.

Lets explore the "when" question. I think there's a compelling reason
to add them as soon as possible - namely, gerrit. If and when we come
to some sort of agreement on this proposal, gerrit could start adding
tooling to understand change graphs as an alternative to change-id
footers. That work could proceed in parallel with the work in git-core
once we know what the data structures look like, but it can't start
until the data structures are sufficient to address all the use cases
that were previously covered by change-id. At the moment, meta-commits
without origin parents would not cover all of gerrit's use-cases so
this would block adoption in gerrit.

  - Stefan
On Sun, Nov 18, 2018 at 8:15 PM Junio C Hamano  wrote:
>
> Stefan Xenos  writes:
>
> > The scenario you describe would not produce an origin edge in the
> > metacommit graph. If the user amended X, there would be no origin
> > edges - just a replacement. If you cherry-picked Z you'd get no
> > replacements and just an origin. In neither case would you get both
> > types of parent.
>
> OK, that makes things a lot simpler.
>
> I can see why we want to record "commit X obsoletes commit Y" to
> help the "evolve" feature, which was the original motivation this
> started the whole discussion.  But it is not immediately obvious to
> me how it would help to have "Z was cherry-picked from W" in
> "evolve".
>
> The whole point of cherry-picking an old commit W to produce a new
> commit Z is because the developer wanted to use the change between
> W^ and W in a context that is quite different from W^, so it would
> make no sense to "evolve" anything that was built on top of W on top
> of Z.
>
> It is of course OK to build a different feature that can take
> advantage of the cherry-pick information on top of the same meta
> commit concept in later steps, and to ensure that is doable, the
> initial meta commit design must be done in a way that is flexible
> enough to be extended, but it is not clear to me if this "origin"
> thing is "while this does not have much to do with 'evolve', let's
> throw in fields that would help another feature while we are at it"
> or "in addition to 'X obsoletes Y', we need the cherry-pick
> information for 'evolve' feature because..." (and because it is not
> clear, I am assuming that it is the former).  If we can design the
> "evolve" thing with only the "contents" and "obsoletes", that would
> allow us to limit the scope of discussion we need to have around
> meta commit and have something that works earlier, wouldn't it?
>
> Thanks.


Re: [PATCH] test-lib-functions: make 'test_cmp_rev' more informative on failure

2018-11-19 Thread Jeff King
On Mon, Nov 19, 2018 at 02:28:18PM +0100, SZEDER Gábor wrote:

> The 'test_cmp_rev' helper is merely a wrapper around 'test_cmp'
> checking the output of two 'git rev-parse' commands, which means that
> its output on failure is not particularly informative, as it's
> basically two OIDs with a bit of extra clutter of the diff header, but
> without any indication of which two revisions have caused the failure:
> 
>   --- expect.rev  2018-11-17 14:02:11.569747033 +
>   +++ actual.rev  2018-11-17 14:02:11.569747033 +
>   @@ -1 +1 @@
>   -d79ce1670bdcb76e6d1da2ae095e890ccb326ae9
>   +139b20d8e6c5b496de61f033f642d0e3dbff528d
> 
> It also pollutes the test repo with these two intermediate files,
> though that doesn't seem to cause any complications in our current
> tests (meaning that I couldn't find any tests that have to work around
> the presence of these files by explicitly removing or ignoring them).
> 
> Enhance 'test_cmp_rev' to provide a more useful output on failure with
> less clutter:
> 
>   error: two revisions point to different objects:
> 'HEAD^': d79ce1670bdcb76e6d1da2ae095e890ccb326ae9
> 'extra': 139b20d8e6c5b496de61f033f642d0e3dbff528d
> 
> Doing so is more convenient when storing the OIDs outputted by 'git
> rev-parse' in a local variable each, which, as a bonus, won't pollute
> the repository with intermediate files.
> 
> While at it, also ensure that 'test_cmp_rev' is invoked with the right
> number of parameters, namely two.

This is an improvement, in my opinion (and I agree that using your new
BUG for this last part would be better still). It also saves a process
in the common case.

One question:

> + else
> + local r1 r2
> + r1=$(git rev-parse --verify "$1") &&
> + r2=$(git rev-parse --verify "$2") &&
> + if test "$r1" != "$r2"
> + then
> + cat >&4 <<-EOF
> + error: two revisions point to different objects:
> +   '$1': $r1
> +   '$2': $r2
> + EOF
> + return 1
> + fi

Why does this cat go to descriptor 4? I get why you'd want it to (it's
meant for the user's eyes, and that's where 4 goes), but we do not
usually bother to do so for our helper functions (like test_cmp).

I don't think it matters usually in practice, because nobody tries to
capture the stderr of test_cmp, etc. I don't think it would ever hurt,
though. Should we be doing that for all the others, too?

-Peff


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Derrick Stolee

On 11/19/2018 2:39 PM, Ævar Arnfjörð Bjarmason wrote:

On Mon, Nov 19 2018, Derrick Stolee wrote:


The coverage report has been using the following:

export GIT_TEST_MULTI_PACK_INDEX=1
export GIT_TEST_COMMIT_GRAPH=1
export GIT_TEST_INDEX_VERION=4
export GIT_TEST_SPLIT_INDEX=yes
export GIT_TEST_OE_SIZE=10
export GIT_TEST_OE_DELTA_SIZE=5

I need to add GIT_TEST_INDEX_THREADS=2 and GIT_TEST_REBASE_USE_BUILTIN=false

...although note you'll need to also test without
GIT_TEST_REBASE_USE_BUILTIN=false, otherwise a lot of the new C code
won't have coverage.


Sorry for lack of clarity: I first run 'make coverage-test' with no 
GIT_TEST_* variables, then run the test suite again with the optional 
variables.


Thanks,
-Stolee


Re: [PATCH] tests: send "bug in the test script" errors to the script's stderr

2018-11-19 Thread Jeff King
On Mon, Nov 19, 2018 at 02:13:26PM +0100, SZEDER Gábor wrote:

> Send these "bug in the test script" error messages directly to the
> test scripts standard error and thus to the terminal, so those bugs
> will be much harder to overlook.  Instead of updating all ~20 such
> 'error' calls with a redirection, let's add a BUG() function to
> 'test-lib.sh', wrapping an 'error' call with the proper redirection
> and also including the common prefix of those error messages, and
> convert all those call sites [4] to use this new BUG() function
> instead.

Yes, I think this is an improvement.

> +BUG () {
> + error >&7 "bug in the test script: $*"
> +}

I naively expected this to go to >&4, but of course that is the
conditional "stderr or /dev/null, depending on --verbose" descriptor. I
have a feeling that we could get rid of descriptors 5 and 7 in favor of
3 and 4, if we did the conditional redirection when running each test,
instead of ahead of time.

But unless we are running out of descriptors, it's not worth the effort
(it's debatable whether we are; 9be795fbce (t5615: avoid re-using
descriptor 4, 2017-12-08) made me nervous, but it's more about the
special-ness of BASHE_XTRACEFD than anything).

Anyway, that's all a tangent to your patch.

I do notice that many of the existing "FATAL:" errors use descriptor 5,
which goes to stdout. I'm not sure if those should actually be going to
stderr (or if there's some TAP significance to those lines).

-Peff


Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Carlo Arenas
On Mon, Nov 19, 2018 at 9:23 AM Ramsay Jones
 wrote:
> ok 99 # skip colliding file detection (missing !CYGWIN of 
> !MINGW,!CYGWIN,CASE_INSENSITIVE_FS)

you need to enable this specific test first (removing !CYGWIN) so it
doesn't get skipped

Carlo


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 19 2018, Derrick Stolee wrote:

> On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Nov 19 2018, Derrick Stolee wrote:
>>
>>> Here is a test coverage report for the uncovered lines introduced in
>>> v2.20.0-rc0 compared to v2.19.1.
>> Thanks a lot for this.
>>
>>> [...]
>>> builtin/rebase.c
>>> 62c23938fa 55) return env;
>>> [...]
>>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
>>> where rebase.useBuiltin is off
>> This one would be covered with
>> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
>> the rest of the coverage would look different when passed through the 
>> various GIT_TEST_* options.
>
> The coverage report has been using the following:
>
> export GIT_TEST_MULTI_PACK_INDEX=1
> export GIT_TEST_COMMIT_GRAPH=1
> export GIT_TEST_INDEX_VERION=4
> export GIT_TEST_SPLIT_INDEX=yes
> export GIT_TEST_OE_SIZE=10
> export GIT_TEST_OE_DELTA_SIZE=5
>
> I need to add GIT_TEST_INDEX_THREADS=2 and GIT_TEST_REBASE_USE_BUILTIN=false

...although note you'll need to also test without
GIT_TEST_REBASE_USE_BUILTIN=false, otherwise a lot of the new C code
won't have coverage.


Re: [PATCH] t5562: skip if NO_CURL is enabled

2018-11-19 Thread Carlo Arenas
On Mon, Nov 19, 2018 at 10:40 AM Max Kirillov  wrote:
>
> On Mon, Nov 19, 2018 at 02:15:35AM -0800, Carlo Marcelo Arenas Belón wrote:
> > 6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", 
> > 2018-07-27)
> > introduced all tests but without a check for CURL support from git.
>
> The tests should not be using curl, they just pipe data to
> http-backend's standard input.

NO_CURL reflects the build setting (for http support); CURL checks for
the curl binary, but as Ævar points out the requirements must be from
somewhere else since a NO_CURL=1 build (tested in macOS) still passes
the test, but not in NetBSD.

tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in
t5562/invoke-with-content-length.pl, while I seem to be getting some
sporadic errors in 9 with the following output :

++ env CONTENT_TYPE=application/x-git-receive-pack-request
QUERY_STRING=/repo.git/git-receive-pack
'PATH_TRANSLATED=/home/carenas/src/git/t/trash
directory.t5562-http-backend-content-length/.git/git-receive-pack'
GIT_HTTP_EXPORT_ALL=TRUE REQUEST_METHOD=POST
/home/carenas/src/git/t/t5562/invoke-with-content-length.pl push_body
git http-backend
++ verify_http_result '200 OK'
++ grep fatal: act.err
Binary file act.err matches
++ return 1
error: last command exited with $?=1
not ok 9 - push plain

and the following output in act.err (with a 200 in act)

fatal: the remote end hung up unexpectedly

Carlo


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Derrick Stolee

On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote:

On Mon, Nov 19 2018, Derrick Stolee wrote:


Here is a test coverage report for the uncovered lines introduced in
v2.20.0-rc0 compared to v2.19.1.

Thanks a lot for this.


[...]
builtin/rebase.c
62c23938fa 55) return env;
[...]
Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
where rebase.useBuiltin is off

This one would be covered with
GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
the rest of the coverage would look different when passed through the various 
GIT_TEST_* options.


The coverage report has been using the following:

export GIT_TEST_MULTI_PACK_INDEX=1
export GIT_TEST_COMMIT_GRAPH=1
export GIT_TEST_INDEX_VERION=4
export GIT_TEST_SPLIT_INDEX=yes
export GIT_TEST_OE_SIZE=10
export GIT_TEST_OE_DELTA_SIZE=5

I need to add GIT_TEST_INDEX_THREADS=2 and GIT_TEST_REBASE_USE_BUILTIN=false

Thanks!
-Stolee


[PATCH 1/1] revision.c: use new topo-order logic in tests

2018-11-19 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

The revision-walk machinery is being rewritten to use generation
numbers in the commit-graph when availble. Due to some problematic
commit histories, the new logic can be slower than the previous
method due to how commit dates and generation numbers interact.
Thus, the new logic is not used in comparison queries, such as

git log --topo-order A..B

The logic for these queries was implemented during the refactor,
but is unreachable due to the potential performance penalty. The
code came along with a larger block of code that was copied from
the old code. When generation numbers are updated to v2 (corrected
commit dates), then we will no longer have a performance penalty
and this logic is what we will want to use.

In the meantime, use the new logic when GIT_TEST_COMMIT_GRAPH is
enabled. This will demonstrate that the new logic works for all
comparison queries in the test suite, including these variants:

git log --topo-order --ancestry-path A..B
git log --topo-order A...B

Signed-off-by: Derrick Stolee 
---
 revision.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/revision.c b/revision.c
index 4ef47d2fb4..d52da6e24f 100644
--- a/revision.c
+++ b/revision.c
@@ -27,6 +27,7 @@
 #include "commit-reach.h"
 #include "commit-graph.h"
 #include "prio-queue.h"
+#include "config.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -3143,6 +3144,9 @@ int prepare_revision_walk(struct rev_info *revs)
commit_list_sort_by_date(>commits);
if (revs->no_walk)
return 0;
+   if (revs->limited &&
+   git_env_bool(GIT_TEST_COMMIT_GRAPH, 0))
+   revs->limited = 0;
if (revs->limited) {
if (limit_list(revs) < 0)
return -1;
-- 
gitgitgadget


[PATCH 0/1] Use new topo-order logic with GIT_TEST_COMMIT_GRAPH

2018-11-19 Thread Derrick Stolee via GitGitGadget
The recent Git test report for v2.20.0-rc0 shows that the logic around
UNINTERESTING commits is not covered by the test suite. This is because the
code is currently unreachable! See the commit message for details.

An alternate approach would be to delete the code around UNINTERESTING
commits, but that doesn't seem necessary.

Thanks, -Stolee

Derrick Stolee (1):
  revision.c: use new topo-order logic in tests

 revision.c | 4 
 1 file changed, 4 insertions(+)


base-commit: 561b583749b7428f1790f03164d0d0e75be71d7b
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-83%2Fderrickstolee%2Ftopo-order-test-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-83/derrickstolee/topo-order-test-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/83
-- 
gitgitgadget


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Ben Peart




On 11/19/2018 10:40 AM, Derrick Stolee wrote:
The test coverage reports started mid-way through this release cycle, so 
I thought it would be good to do a full review of the new uncovered code 
since the last release.


I eliminated most of the uncovered code due to the following cases:

1. Code was only moved or refactored.
2. Code was related to unusual error conditions (e.g. open_pack_index() 
fails)


The comments below are intended only to point out potential directions 
to improve test coverage. Some of it is for me to do!


Thanks,
-Stolee

On 11/18/2018 9:54 PM, Derrick Stolee wrote:

There are a lot of lines introduced by the IEOT extension in these commits:

 > Ben Peart  3255089ad: ieot: add Index Entry Offset Table (IEOT) 
extension

 > Ben Peart  3b1d9e045: eoie: add End of Index Entry (EOIE) extension
 > Ben Peart  77ff1127a: read-cache: load cache entries on worker 
threads
 > Ben Peart  abb4bb838: read-cache: load cache extensions on a 
worker thread

 > Ben Peart  c780b9cfe: config: add new index.threads config setting
 > Ben Peart  d1664e73a: add: speed up cmd_add() by utilizing 
read_cache_preload()
 > Ben Peart  fa655d841: checkout: optimize "git checkout -b 
"




These should be hit if you run the test suite with 
GIT_TEST_INDEX_THREADS=2.  Without that, the indexes for the various 
tests are too small to trigger multi-threaded index reads/writes.


From t/README:

GIT_TEST_INDEX_THREADS= enables exercising the multi-threaded loading
of the index for the whole test suite by bypassing the default number of
cache entries and thread minimums. Setting this to 1 will make the
index loading single threaded.




Re: [PATCH] tests: add a special setup where rebase.useBuiltin is off (Re: Git Test Coverage Report (v2.20.0-rc0))

2018-11-19 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:
> On Mon, Nov 19 2018, Derrick Stolee wrote:

>> [...]
>> builtin/rebase.c
>> 62c23938fa 55) return env;
>> [...]
>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
>> where rebase.useBuiltin is off
>
> This one would be covered with
> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
> the rest of the coverage would look different when passed through the various 
> GIT_TEST_* options.

I wonder if we can do better for this kind of thing.

When I do routine development, I am not running tests with any
non-default flags.  So why should tests run with non-default flags
count toward coverage?  Is there a way to make the default test
settings dip their feet into some non-default configurations, without
running the full battery of tests and slowing tests down accordingly?
E.g. is there some kind of smoke test that rebase with
useBuiltin=false works at all that could run, even if I am not running
the full battery of rebase tests?

That's a bit of a non sequitor for this example, which is actual code
to handle GIT_TEST_REBASE_USE_BUILTIN, though.  For it, I wonder why
we need rebase.c to understand the envvar --- couldn't test-lib.sh
take care of setting rebase.useBuiltin to false when it's set?

Thanks,
Jonathan


Re: [PATCH] t5562: skip if NO_CURL is enabled

2018-11-19 Thread Max Kirillov
On Mon, Nov 19, 2018 at 02:15:35AM -0800, Carlo Marcelo Arenas Belón wrote:
> 6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", 
> 2018-07-27)
> introduced all tests but without a check for CURL support from git.

The tests should not be using curl, they just pipe data to
http-backend's standard input.

-- 
Max


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Jeff King
On Mon, Nov 19, 2018 at 11:21:38AM -0500, Jeff King wrote:

> > +   if (use_delta_islands) {
> > +   const char *p;
> > +   unsigned depth = 0;
> > +   struct object_entry *ent;
> > +
> > +   for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
> > +   depth++;
> > +
> > +   ent = packlist_find(_pack, obj->oid.hash, NULL);
> > +   if (ent && depth > ent->tree_depth)
> > +   ent->tree_depth = depth;
> > +   }
> >  }
> > 
> > And that 'ent->tree_depth = depth;' line is replaced with the
> > oe_set_tree_depth() call in the report.
> > 
> > Since depth is never incremented, we are not covering this block. Is it
> > possible to test?
> 
> Looks like t5320 only has single-level trees. We probably just need to
> add a deeper tree. A more interesting case is when an object really is
> found at multiple depths, but constructing a case that cared about that
> would be quite complicated.
> 
> That said, there is much bigger problem with this code, which is that
> 108f530385 (pack-objects: move tree_depth into 'struct packing_data',
> 2018-08-16) is totally broken. It works on the trivial repository in the
> test, but try this (especially under valgrind or ASan) on a real
> repository:
> 
>   git repack -adi
> 
> which will crash immediately.  It doesn't correctly maintain the
> invariant that if tree_depth is not NULL, it is the same size as
> the main object array.
> 
> I'll see if I can come up with a fix.

Just a quick update to prevent anyone else looking at it: I have a fix
for this (and another related issue with that commit).

There's an edge case in that depth computation that I think is
unhandled, as well. I _think_ it doesn't trigger very often, but I'm
running some experiments to verify it. That's S-L-O-W, so I probably
won't have results until tomorrow.

-Peff


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 19 2018, Derrick Stolee wrote:

> Here is a test coverage report for the uncovered lines introduced in
> v2.20.0-rc0 compared to v2.19.1.

Thanks a lot for this.

> [...]
> builtin/rebase.c
> 62c23938fa 55) return env;
> [...]
> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup
> where rebase.useBuiltin is off

This one would be covered with
GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if
the rest of the coverage would look different when passed through the various 
GIT_TEST_* options.

That would make it take at least 12x as long if you did them one at a
time. Probably just 2-3x as long in practice since most can be combined
in some way, and somewhat computationally infeasible if you're going to
try all possible combinations.

> [...]
> fsck.c
> fb8952077d 214) die_errno("Could not read '%s'", path);
>
> René Scharfe fb8952077: fsck: use strbuf_getline() to read
> skiplist file

The fault of a patch I submitted. Couldn't find a sane & portable way to
emulate cases where ferror() would trigger.


Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Duy Nguyen
On Mon, Nov 19, 2018 at 6:14 PM Carlo Arenas  wrote:
>
> On Mon, Nov 19, 2018 at 4:28 AM Torsten Bögershausen  wrote:
> >
> > Did you test it on Mac ?
>
> macOS 10.14.1 but only using APFS, did you test my patch with HFS+?
>
> > So what exactly are you trying to fix ?
>
> I get
>
> not ok 99 - colliding file detection
> #
> # grep X icasefs/warning &&
> # grep x icasefs/warning &&
> # test_i18ngrep "the following paths have collided" icasefs/warning
> #
>
> and the output of "warning" only shows one of the conflicting files,
> instead of both:
>
> Cloning into 'bogus'...
> done.
> warning: the following paths have collided (e.g. case-sensitive paths
> on a case-insensitive filesystem) and only one from the same
> colliding group is in the working tree:
>
>   'x'
>
> Carlo

Could you send me the "index" file in  t/trash\
directory.t5601-clone/icasefs/bogus/.git/index ? Also the output of
"stat /path/to/icase/bogus/x"

My only explanation is somehow the inode value we save is not the same
one on disk, which is weird and could even cause other problems. I'd
like to know why this happens before trying to fix anything.
-- 
Duy


Re: [PATCH/RFC v1 1/1] Use size_t instead of unsigned long

2018-11-19 Thread René Scharfe
Am 19.11.2018 um 06:33 schrieb Torsten Bögershausen:
> The archive-tar.c is actually a good example, why a step-by-step update
> is not ideal (the code would not work any more on Win64).
> 
> If we look here:
> 
> static int stream_blocked(const struct object_id *oid)
> {
>   struct git_istream *st;
>   enum object_type type;
>   size_t sz;
>   char buf[BLOCKSIZE];
>   ssize_t readlen;
> 
>   st = open_istream(oid, , , NULL);
>   ^
>   if (!st)
>   return error(_("cannot stream blob %s"), oid_to_hex(oid));
>   for (;;) {
> 
> The sz variable must follow whatever open_istream() uses, so if we start
> with archive-tar.c, we must use either size_t or ulong, whatever
> open_istream() needs. Otherwise things will break:
> archive-tar.c uses ulong, open_istream() size_t, but we are passing pointers
> around, and here  != _t
> 
> If we only update open_istream(), but not archive-tar.c, then
> things are not better:
> _t != 
> 
> I don't have a good idea how to split the patch.

sz is not actually used later in that function; this change can
be done independently of any other ulong/size_t conversion in that
file.

Hmm, looking at that call I wonder why open_istream() doesn't return
type and size in the struct git_istream.  To make it match
read_object_file(), I suppose.  Perhaps it's an opportunity to
improve its interface?

René


Re: [PATCH/RFC v1 1/1] Use size_t instead of unsigned long

2018-11-19 Thread Torsten Bögershausen
On Sun, Nov 18, 2018 at 03:18:52PM -0500, Derrick Stolee wrote:
> On 11/17/2018 10:11 AM, tbo...@web.de wrote:
> >From: Torsten Bögershausen 
> >
> >Currently Git users can not commit files >4Gib under 64 bit Windows,
> >where "long" is 32 bit but size_t is 64 bit.
> >Improve the code base in small steps, as small as possible.
> >What started with a small patch to replace "unsigned long" with size_t
> >in one file (convert.c) ended up with a change in many files.
> >
> >Signed-off-by: Torsten Bögershausen 
> >---
> >
> >This needs to go on top of pu, to cover all the good stuff
> >   cooking here.
> 
> Better to work on top of 'master', as the work in 'pu' will be rewritten
> several times, probably.
> 
> >I have started this series on November 1st, since that 2 or 3 rebases
> >   had been done to catch up, and now it is on pu from November 15.
> >
> >I couldn't find a reason why changing "unsigned ling"
> >   into "size_t" may break anything, any thoughts, please ?
> 
> IIRC, the blocker for why we haven't done this already is that "size_t",
> "timestamp_t" and "off_t" are all 64-bit types that give more implied
> meaning, so we should swap types specifically to these as we want. One
> example series does a specific, small change [1].
> 
> If we wanted to do a single swap that removes 'unsigned long' with an
> unambiguously 64-bit type, I would recommend "uint64_t". Later replacements
> to size_t, off_t, and timestamp_t could happen on a case-by-case basis for
> type clarity.
> 
> It may benefit reviewers to split this change into multiple patches,
> starting at the lowest levels of the call stack (so higher 'unsigned long's
> can up-cast to the 64-bit types when calling a function) and focusing the
> changes to one or two files at a time (X.c and X.h, preferrably).
> 
> Since you are talking about the benefits for Git for Windows to accept 4GB
> files, I wonder if we can add a test that verifies such a file will work. If
> you have such a test, then I could help verify that the test fails before
> the change and succeeds afterward.
> 
> Finally, it may be good to add a coccinelle script that replaces 'unsigned
> long' with 'uint64_t' so we can easily fix any new introductions that happen
> in the future.

The plan was never to change "unsigned long" to a 64 bit value in general.
The usage of "unsigned long" (instead of size_t) was (and is) still good
for 32bit systems, where both are 32 bit. (at least all system I am aware of).

For 64 bit systems like Linux or Mac OS it is the same, both are 64 bit.

The only problematic system is Win64, where "unsigned long" is 32 bit,
and therefore we must use size_t to address data in memory.
This is not to be confused with off_t, which is used for "data on disk"
(and nothing else) or timestamp_t which is used for timestamps (and nothing 
else).

I haven't followed the "coccinelle script" development at all, if someone
makes a patch do replace "unsigned long" with size_t, that could replace
my whole patch. (Some of them may be downgraded to "unsigned int" ?)

However, as we need to let  tb/print-size-t-with-uintmax-format make it to
master, otherwise we are not able to print the variables in a portable way. 


> Thanks! I do think we should make this change, but we must be careful. It
> may be disruptive to topics in flight.
> 
> -Stolee
> 
> [1] https://public-inbox.org/git/20181112084031.11769-1-care...@gmail.com/
> 


Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Ramsay Jones



On 19/11/2018 12:28, Torsten Bögershausen wrote:
> On 2018-11-19 09:20, Carlo Marcelo Arenas Belón wrote:
>> While I don't have an HFS+ volume to test, I suspect this patch should be
>> needed for both, even if I have to say thay even the broken output was
>> better than the current state.
>>
>> Travis seems to be using a case sensitive filesystem so wouldn't catch this.
>>
>> Was windows/cygwin tested?
>>
>> Carlo
>> -- >8 --
>> Subject: [PATCH] entry: fix t5061 on macOS
>>
>> b878579ae7 ("clone: report duplicate entries on case-insensitive 
>> filesystems",
>> 2018-08-17) was tested on Linux with an excemption for Windows that needs
>> to be expanded for macOS (using APFS), which then would show :
>>
>> $ git clone git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git
>> warning: the following paths have collided (e.g. case-sensitive paths
>> on a case-insensitive filesystem) and only one from the same
>> colliding group is in the working tree:
>>
>>   'man2/_Exit.2'
>>   'man2/_exit.2'
>>   'man3/NAN.3'
>>   'man3/nan.3'
>>
>> Signed-off-by: Carlo Marcelo Arenas Belón 
>> ---
>>  entry.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/entry.c b/entry.c
>> index 5d136c5d55..3845f570f7 100644
>> --- a/entry.c
>> +++ b/entry.c
>> @@ -404,7 +404,7 @@ static void mark_colliding_entries(const struct checkout 
>> *state,
>>  {
>>  int i, trust_ino = check_stat;
>>  
>> -#if defined(GIT_WINDOWS_NATIVE)
>> +#if defined(GIT_WINDOWS_NATIVE) || defined(__APPLE__)
>>  trust_ino = 0;
>>  #endif
>>  
>>
> 
> Sorry,
> but I can't reproduce your problem here.
> 
> Did you test it on Mac ?
> If I run t5601 on a case sensitive files system
> (Mac, mounted NFS, exported from Linux)
> I get:
> ok 99 # skip colliding file detection (missing CASE_INSENSITIVE_FS of
> !MINGW,!CYGWIN,CASE_INSENSITIVE_FS)

I tested v2.20.0-rc0 on cygwin last night and it passed just fine.
I just ran t5601-clone.sh on its own and got:

$ ./t5601-clone.sh
...
ok 98 - clone on case-insensitive fs
ok 99 # skip colliding file detection (missing !CYGWIN of 
!MINGW,!CYGWIN,CASE_INSENSITIVE_FS)
ok 100 - partial clone
ok 101 - partial clone: warn if server does not support object filtering
ok 102 - batch missing blob request during checkout
ok 103 - batch missing blob request does not inadvertently try to fetch 
gitlinks
# passed all 103 test(s)
# SKIP no web server found at '/usr/sbin/apache2'
1..103
$ 

ATB,
Ramsay Jones


Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Carlo Arenas
On Mon, Nov 19, 2018 at 4:28 AM Torsten Bögershausen  wrote:
>
> Did you test it on Mac ?

macOS 10.14.1 but only using APFS, did you test my patch with HFS+?

> So what exactly are you trying to fix ?

I get

not ok 99 - colliding file detection
#
# grep X icasefs/warning &&
# grep x icasefs/warning &&
# test_i18ngrep "the following paths have collided" icasefs/warning
#

and the output of "warning" only shows one of the conflicting files,
instead of both:

Cloning into 'bogus'...
done.
warning: the following paths have collided (e.g. case-sensitive paths
on a case-insensitive filesystem) and only one from the same
colliding group is in the working tree:

  'x'

Carlo


Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Jeff King
On Mon, Nov 19, 2018 at 10:40:53AM -0500, Derrick Stolee wrote:

> > 2fa233a554 builtin/pack-objects.c 1512) hashcpy(base_oid.hash,
> > base_sha1);
> > 2fa233a554 builtin/pack-objects.c 1513) if
> > (!in_same_island(>idx.oid, _oid))
> > 2fa233a554 builtin/pack-objects.c 1514) return 0;
> 
> These lines are inside a block for the following if statement:
> 
> +   /*
> +    * Otherwise, reachability bitmaps may tell us if the receiver has
> it,
> +    * even if it was buried too deep in history to make it into the
> +    * packing list.
> +    */
> +   if (thin && bitmap_has_sha1_in_uninteresting(bitmap_git, base_sha1))
> {
> 
> Peff: is this difficult to test?

A bit.

The two features (thin-packs and delta-islands) would not generally be
used together (one is for serving fetches, the other is for optimizing
on-disk packs to serve fetches). Something like:

 echo HEAD^ | git pack-objects --revs --thin --delta-islands

would probably exercise it (assuming there's a delta in HEAD^ against
something in HEAD), but you'd need to construct a very specific scenario
if you wanted to do any kind of checks no the output.

> > 28b8a73080 builtin/pack-objects.c 2793) depth++;
> > 108f530385 builtin/pack-objects.c 2797) oe_set_tree_depth(_pack, ent,
> > depth);
> 
> This 'depth' variable is incremented as part of a for loop in this patch:
> 
>  static void show_object(struct object *obj, const char *name, void *data)
> @@ -2686,6 +2706,19 @@ static void show_object(struct object *obj, const
> char *name, void *data)
>     add_preferred_base_object(name);
>     add_object_entry(>oid, obj->type, name, 0);
>     obj->flags |= OBJECT_ADDED;
> +
> +   if (use_delta_islands) {
> +   const char *p;
> +   unsigned depth = 0;
> +   struct object_entry *ent;
> +
> +   for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
> +   depth++;
> +
> +   ent = packlist_find(_pack, obj->oid.hash, NULL);
> +   if (ent && depth > ent->tree_depth)
> +   ent->tree_depth = depth;
> +   }
>  }
> 
> And that 'ent->tree_depth = depth;' line is replaced with the
> oe_set_tree_depth() call in the report.
> 
> Since depth is never incremented, we are not covering this block. Is it
> possible to test?

Looks like t5320 only has single-level trees. We probably just need to
add a deeper tree. A more interesting case is when an object really is
found at multiple depths, but constructing a case that cared about that
would be quite complicated.

That said, there is much bigger problem with this code, which is that
108f530385 (pack-objects: move tree_depth into 'struct packing_data',
2018-08-16) is totally broken. It works on the trivial repository in the
test, but try this (especially under valgrind or ASan) on a real
repository:

  git repack -adi

which will crash immediately.  It doesn't correctly maintain the
invariant that if tree_depth is not NULL, it is the same size as
the main object array.

I'll see if I can come up with a fix.

-Peff


Re: [PATCH v3 1/2] commit-graph write: add progress output

2018-11-19 Thread SZEDER Gábor
Ping?

We are at -rc0, this progress output is a new feature since v2.19.0,
and the numbers shown are still way off.


On Mon, Oct 15, 2018 at 06:54:47PM +0200, SZEDER Gábor wrote:
> On Mon, Sep 17, 2018 at 03:33:35PM +, Ævar Arnfjörð Bjarmason wrote:
> 
> > @@ -560,6 +563,9 @@ static int add_packed_commits(const struct object_id 
> > *oid,
> > off_t offset = nth_packed_object_offset(pack, pos);
> > struct object_info oi = OBJECT_INFO_INIT;
> >  
> > +   if (list->progress)
> > +   display_progress(list->progress, ++list->progress_done);
> 
> Note that add_packed_commits() is used as a callback function for
> for_each_object_in_pack() (with '--stdin-packs') or
> for_each_packed_object() (no options), i.e. this will count the number
> of objects, not commits:
> 
>   $ git rev-list --all |wc -l
>   768524
>   $ git rev-list --objects --all |wc -l
>   6130295
>   # '--count --objects' together didn't work as expected.
>   $ time ~/src/git/git commit-graph write
>   Finding commits for commit graph: 6130295, done.
>   Annotating commits in commit graph: 2305572, done.
>   Computing commit graph generation numbers: 100% (768524/768524), done.
> 
> (Now I also see the 3x difference in the "Annotating commits" counter
> that you mentioned.)
> 
> I see two options:
> 
>   - Provide a different title for this progress counter, e.g.
> "Scanning objects for c-g", or "Processing objects...", or
> something else that says "objects" instead of "commits".
> 
>   - Move this condition and display_progress() call to the end of the
> function, so it will only count commits, not any other objects.
> (As far as I understand both for_each_object_in_pack() and
> for_each_packed_object() iterate in pack .idx order, i.e. it's
> essentially random.  This means that commit objects should be
> distributed evenly among other kinds of objects, so we don't have
> to worry about the counter stalling for a long stretch of
> consecutive non-commit objects.  At least in theory.)
> 
> 
> 


Re: [PATCH] technical doc: add a design doc for the evolve command

2018-11-19 Thread SZEDER Gábor
On Sat, Nov 17, 2018 at 12:30:58PM -0800, Stefan Xenos wrote:
> > Further, I see that this document tries to suggest a proliferation of new 
> > commands
> 
> It does. Let me explain a bit about the reasoning behind this
> breakdown of commands. My main priority was to keep the commands as
> consistent with existing git commands as possible. Secondary goals
> were:
> - Mapping a single intent to a single command where possible makes it
> easier to explain what that command does.
> - Having lots of simpler commands as opposed to a few complex commands
> makes them easier to type.
> - Command names are more descriptive than lettered arguments.

Subcommand names and --long-options are just as descriptive.


> Git already has a "log" and "reflog" command for displaying two
> different types of log,

No, there is 'git log' for displaying logs in various ways, and there
is 'git reflog' which not only displays reflogs, but also operates on
them, e.g. deletes specific reflog entries or expires old entries,
necessitating and justifying the dedicated 'git reflog' command.

> so putting "obslog" on its own command makes
> it consistent with the existing logs, easier to type, and keeps the
> command simple.

> - We could turn "obslog" into an extra option on the "log" command,
> but that would be inconsistent with reflog and would complicate the
> already-complex log command.

On one hand, it's unclear to me what additional operations the
proposed 'git obslog' command will perform besides showing the log of
a change.  If there are no such operations, then it can't really be
compared to 'git reflog' to justify a dedicated 'git obslog' command.

OTOH, note that 'git log' already has a '--walk-reflogs' option, and
indeed 'git reflog [show]' is implemented via the common log
machinery.  And this is not a mere implementation detail, because "git
reflog show accepts any of the options accepted by git log" (quoting
its manpage), making it possible to filter, limit and format reflog
entries, e.g.:

  git reflog --format='%h %cd %s' --author=szeder -5 branch file

I think 'git obslog' should allow the same when showing the log of a
change.


> Personally, I don't
> consider a proliferation of new commands to be inherently bad (or
> inherently good, really). Is there a reason new commands should be
> avoided?

If a user wants to deal with reflogs, then there is 'git help reflog'
which in one manpage describes the concept, and how to list and
expire reflogs and delete individual entries from a reflog using the
various subcommands.  If a user wants to deal with stashes, then there
is 'git help stash', which in one manpage describes the concept, and
how to create, list, show, apply, delete, etc. stashes using the
various subcommands.  See where this is going?  The same applies to
bisect, notes, remotes, rerere, submodules, worktree; maybe there are
more.  This is a Good Thing.

By adding several new commands users will have to consult the manpages
of 'evolve', 'change', 'obslog', etc., even though the commands and
the concepts are closely related.




Re: Git Test Coverage Report (v2.20.0-rc0)

2018-11-19 Thread Derrick Stolee
The test coverage reports started mid-way through this release cycle, so 
I thought it would be good to do a full review of the new uncovered code 
since the last release.


I eliminated most of the uncovered code due to the following cases:

1. Code was only moved or refactored.
2. Code was related to unusual error conditions (e.g. open_pack_index() 
fails)


The comments below are intended only to point out potential directions 
to improve test coverage. Some of it is for me to do!


Thanks,
-Stolee

On 11/18/2018 9:54 PM, Derrick Stolee wrote:

66ec0390e7 builtin/fsck.c 888) midx_argv[2] = "--object-dir";
66ec0390e7 builtin/fsck.c 889) midx_argv[3] = alt->path;
66ec0390e7 builtin/fsck.c 890) if (run_command(_verify))
66ec0390e7 builtin/fsck.c 891) errors_found |= ERROR_COMMIT_GRAPH;



There are two things wrong here:

1. Not properly covering multi-pack-index fsck with alternates.
2. the ERROR_COMMIT_GRAPH flag when the multi-pack-index is being checked.

I'll submit a patch to fix this.

2fa233a554 builtin/pack-objects.c 1512) hashcpy(base_oid.hash, 
base_sha1);
2fa233a554 builtin/pack-objects.c 1513) if 
(!in_same_island(>idx.oid, _oid))

2fa233a554 builtin/pack-objects.c 1514) return 0;


These lines are inside a block for the following if statement:

+   /*
+    * Otherwise, reachability bitmaps may tell us if the receiver 
has it,

+    * even if it was buried too deep in history to make it into the
+    * packing list.
+    */
+   if (thin && bitmap_has_sha1_in_uninteresting(bitmap_git, 
base_sha1)) {


Peff: is this difficult to test?


28b8a73080 builtin/pack-objects.c 2793) depth++;
108f530385 builtin/pack-objects.c 2797) oe_set_tree_depth(_pack, 
ent, depth); 


This 'depth' variable is incremented as part of a for loop in this patch:

 static void show_object(struct object *obj, const char *name, void *data)
@@ -2686,6 +2706,19 @@ static void show_object(struct object *obj, const 
char *name, void *data)

    add_preferred_base_object(name);
    add_object_entry(>oid, obj->type, name, 0);
    obj->flags |= OBJECT_ADDED;
+
+   if (use_delta_islands) {
+   const char *p;
+   unsigned depth = 0;
+   struct object_entry *ent;
+
+   for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
+   depth++;
+
+   ent = packlist_find(_pack, obj->oid.hash, NULL);
+   if (ent && depth > ent->tree_depth)
+   ent->tree_depth = depth;
+   }
 }

And that 'ent->tree_depth = depth;' line is replaced with the 
oe_set_tree_depth() call in the report.


Since depth is never incremented, we are not covering this block. Is it 
possible to test?



builtin/repack.c
16d75fa48d  48) use_delta_islands = git_config_bool(var, value);
16d75fa48d  49) return 0;


This is a simple config option check for "repack.useDeltaIslands". The 
logic it enables is tested, so this is an OK gap, in my opinion.


> builtin/submodule--helper.c

ee69b2a90c 1476) out->type = sub->update_strategy.type;
ee69b2a90c 1477) out->command = sub->update_strategy.command;


This block was introduced by this part of the patch:

+   } else if (sub->update_strategy.type != SM_UPDATE_UNSPECIFIED) {
+   trace_printf("loaded thing");
+   out->type = sub->update_strategy.type;
+   out->command = sub->update_strategy.command;

Which seems to be an important case, as the other SM_UPDATE_* types seem 
like interesting cases.


Stefan: what actions would trigger this block? Is it easy to test?


delta-islands.c
c8d521faf7  53) memcpy(b, old, size);


This memcpy happens when the 'old' island_bitmap is passed to 
island_bitmap_new(), but...



c8d521faf7 187) b->refcount--;
c8d521faf7 188) b = kh_value(island_marks, pos) = island_bitmap_new(b);


This block has the only non-NULL caller to island_bitmap_new().


c8d521faf7 212) obj = ((struct tag *)obj)->tagged;
c8d521faf7 213) if (obj) {
c8d521faf7 214) parse_object(the_repository, >oid);
c8d521faf7 215) marks = create_or_get_island_marks(obj);
c8d521faf7 216) island_bitmap_set(marks, island_counter);


It appears that this block would happen if we placed a tag in the delta 
island.



c8d521faf7 397) strbuf_addch(_name, '-');


This block is inside the following patch:

+   if (matches[ARRAY_SIZE(matches) - 1].rm_so != -1)
+   warning(_("island regex from config has "
+ "too many capture groups (max=%d)"),
+   (int)ARRAY_SIZE(matches) - 2);
+
+   for (m = 1; m < ARRAY_SIZE(matches); m++) {
+   regmatch_t *match = [m];
+
+   if (match->rm_so == -1)
+   continue;
+
+   if (island_name.len)
+   strbuf_addch(_name, '-');
+
+   strbuf_add(_name, refname + match->rm_so, 
match->rm_eo - match->rm_so);

+   }

This likely means that ARRAY_SIZE(matches) is never more than 

Re: [PATCH 5/5] tree-walk: support :(attr) matching

2018-11-19 Thread Duy Nguyen
On Sun, Nov 18, 2018 at 8:58 PM Ævar Arnfjörð Bjarmason
 wrote:
>
>
> On Sun, Nov 18 2018, Nguyễn Thái Ngọc Duy wrote:
>
> As noted in
> https://public-inbox.org/git/87d0r217vr@evledraar.gmail.com/ I'm
> happy to see this implemented. I have not read this patch in much
> detail...
>
> > [...]
> >  Documentation/glossary-content.txt |  2 +
> > [...]
> > diff --git a/Documentation/glossary-content.txt 
> > b/Documentation/glossary-content.txt
> > index 0d2aa48c63..023ca95e7c 100644
> > --- a/Documentation/glossary-content.txt
> > +++ b/Documentation/glossary-content.txt
> > @@ -404,6 +404,8 @@ these forms:
> >  - "`!ATTR`" requires that the attribute `ATTR` be
> >unspecified.
> >  +
> > +Note that when matching against a tree object, attributes are still
> > +obtained from working tree, not from the given tree object.
> >
> >  exclude;;
> >   After a path matches any non-exclude pathspec, it will be run
>
> Just a poke again about what I brought up in the thread you replied to
> in
> https://public-inbox.org/git/cacsjy8clhq0mkhkxvtdaqy9tlwefbsvheu5ubpxhx4is2mk...@mail.gmail.com/
>
> I.e. the documentation of these various wildmatch() / attributes
> patterns we support is all over the place. Some in gitignore(5), some
> not documented at all, and some in gitglossary(7) (which really should
> not be serving as primary documentation for anything).

Poking me is not going help, I'm afraid. Except bugs, which have
highest priority to me, I do other stuff first either because I need
it or it's fun to do, and this wildmatch documentation is neither (and
I have a long backlog).
-- 
Duy


Re: [PATCH 0/5] Make :(attr) pathspec work with "git log"

2018-11-19 Thread Duy Nguyen
On Mon, Nov 19, 2018 at 12:42 PM Ævar Arnfjörð Bjarmason
 wrote:
>
>
> On Sun, Nov 18 2018, Nguyễn Thái Ngọc Duy wrote:
>
> > When :(attr) was added, it supported one of the two main pathspec
> > matching functions, the one that works on a list of paths. The other
> > one works on a tree, tree_entry_interesting(), which gets :(attr)
> > support in this series.
> >
> > With this, "git grep   -- :(attr)" or "git log :(attr)"
> > will not abort with BUG() anymore.
> >
> > But this also reveals an interesting thing: even though we walk on a
> > tree, we check attributes from _worktree_ (and optionally fall back to
> > the index). This is how attributes are implemented since forever. I
> > think this is not a big deal if we communicate clearly with the user.
> > But otherwise, this series can be scraped, as reading attributes from
> > a specific tree could be a lot of work.
> >
> > The main patch is the last one. The others are just to open a path to
> > pass "struct index_state *" down to tree_entry_interesting(). This may
> > become standard procedure because we don't want to stick the_index (or
> > the_repository) here and there.
>
> Another side-note (this thread is turning into my personal blog at this
> point...) I found an old related thread:
> https://public-inbox.org/git/20170509225219.gb106...@google.com/
>
> So this series fixes 1/2 of the issues noted there, but git-ls-tree will
> still die with the same error.

If you mean BUG(), not it does not. There are just a couple more
places besides tree_entry_interesting() and match_pathspec() that need
to be aware of all the magic things. ls-tree is one of those but it's
well guarded and will display this instead

$ git ls-tree HEAD ':(attr:abc=def)'
fatal: :(attr:abc=def): pathspec magic not supported by this command: 'attr'

If you mean making ls-tree support :(attr) and other stuff, it's not
going to happen in near future. It may be nice to switch to
tree_entry_interesting() in this command, but if it was easy to do I
would have done it years ago :P
-- 
Duy


Re: [PATCH 0/5] Make :(attr) pathspec work with "git log"

2018-11-19 Thread Duy Nguyen
On Mon, Nov 19, 2018 at 12:16 PM Ævar Arnfjörð Bjarmason
 wrote:
> As an aside, how do you do the inverse of matching for an attribute by
> value? I.e.:
>
> $ git ls-files | wc -l; git ls-files ':(attr:diff=perl)' | wc -l
> 3522
> 65
>
> I'd like something gives me all files that don't match diff=perl,
> i.e. 3522-65 = 3457 files, or what I'd get if I constructed such a match
> manually with excludes:
>
> $ git ls-files $(grep diff=perl .gitattributes | cut -d ' ' -f1 | sed 
> 's!^!:(exclude)!') | wc -l
> 3457
>
> From my reading of parse_pathspec_attr_match() and match_attrs() this
> isn't possible and I'd need to support ':(attr:diff!=perl)' via a new
> MATCH_NOT_VALUE mode. But I wanted to make sure I wasn't missing some
> subtlety, i.e. that this was implemented already via some other feature.
>
> I thought I could do:
>
> git ls-files ':(exclude):(attr:diff=perl)'
>
> But we don't support chaining like that, and this would only exclude a
> file that's actually called ":(attr:diff=perl)". I.e. created via
> something like "touch ':(attr:diff=perl)'".

I think we allow :(exclude,attr:diff=perl) which should "exclude all
paths that have diff=perl attribute". It's actually tested in t6135
for ls-files (but I didn't add the same test for 'git grep' because I
was so confident it would work; I'll work on that).
-- 
Duy


Re: [PATCH/RFC] checkout: print something when checking out paths

2018-11-19 Thread Duy Nguyen
On Mon, Nov 19, 2018 at 2:08 PM Ævar Arnfjörð Bjarmason
 wrote:
>
>
> On Mon, Nov 12 2018, Duy Nguyen wrote:
>
> > On Mon, Nov 12, 2018 at 7:21 AM Junio C Hamano  wrote:
> >>
> >> Nguyễn Thái Ngọc Duy   writes:
> >>
> >> > Since the purpose of printing this is to help disambiguate. Only do it
> >> > when "--" is missing (the actual reason though is many tests check
> >> > empty stderr to see that no error is raised and I'm too lazy to fix
> >> > all the test cases).
> >>
> >> Heh, honesty is good but in this particular case the official reason
> >> alone would make perfect sense, too ;-)
> >>
> >> As with progress output, shouldn't this automatically be turned off
> >> when the standard error stream goes to non tty, as the real purpose
> >> of printing is to help the user sitting in front of the terminal and
> >> interacting with the command?
> >
> > I see this at the same level as "Switched to branch 'foo'" which is
> > also protected by opts->quiet. If we start hiding messages based on
> > tty it should be done for other messages in update_refs_for_switch()
> > too, I guess?
>
> I have no real opinion either way, but whatever we can do about
> "checkout" being so confusing since it does so many things is most
> welcome.
>
> Just an alternative: Maybe we can start this out as advice() output
> that's either opt-in via config (not on by default) to start with, or
> have some advice_tty() that only emits it in the same circumstances we
> emit the progress output?

No let's drop this for now. I promise to come back with something
better (at least it still sounds better in my mind). If that idea does
not work out, we can come back and see if we can improve this.
-- 
Duy


[PATCH] msvc: Directly use MS version (_stricmp) of strcasecmp

2018-11-19 Thread Sven Strickroth
This also removes an implicit conversion from size_t (unsigned) to int (signed).

_stricmp as well as _strnicmp are both available since VS2012.

Signed-off-by: Sven Strickroth 
---
 compat/msvc.h | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/compat/msvc.h b/compat/msvc.h
index e6e1a6bbf7..2d558bae14 100644
--- a/compat/msvc.h
+++ b/compat/msvc.h
@@ -14,18 +14,12 @@
 #define inline __inline
 #define __inline__ __inline
 #define __attribute__(x)
+#define strcasecmp   _stricmp
 #define strncasecmp  _strnicmp
 #define ftruncate_chsize
 #define strtoull _strtoui64
 #define strtoll  _strtoi64
 
-static __inline int strcasecmp (const char *s1, const char *s2)
-{
-   int size1 = strlen(s1);
-   int sisz2 = strlen(s2);
-   return _strnicmp(s1, s2, sisz2 > size1 ? sisz2 : size1);
-}
-
 #undef ERROR
 
 #define ftello _ftelli64
-- 
2.19.1.windows.1


Strange behavior of `git describe`

2018-11-19 Thread Alexander Amelkin
Hi all!

We're observing a strange behavior of git describe. We have a local fork of an 
open-source github.com/openbmc/openbmc repository. The local fork has some 
extra commits and we merge from upstream on a regular basis. Until today it all 
worked fine, the last merge was from upstream commit b1b8f77 and `git describe` 
was showing the correct info based on `v2.3` tag added with commit ad35219:

```
$ git log --oneline --decorate=short --graph
*   f0f01334f (HEAD) Merge remote-tracking branch 'upstream/master'
|\ 
| * b1b8f7705 phosphor-logging: srcrev bump f8d38bbebe..30047bf964
| * 0a848f84d entity-manager: srcrev bump 8ca708689f..d534f7433d
| * 7e16090cd google-ipmi-sys: srcrev bump 3ebf2dd524..ff40f273a7
| * 9b7dcc807 google-ipmi-i2c: srcrev bump 38e8c6e170..ad05703ce0
...
| * ad35219cd (tag: v2.3) Update bmcweb to latest
...
$
$ git describe --debug
searching to describe HEAD
finished search at eb8644742b22a77f40e6cfe9f3d325e92a602def
 annotated   1601 v2.3
 annotated   1965 v2.2
 annotated   2416 v2.1
 annotated   2637 v2.0
 annotated   2899 v1.99.10
traversed 2974 commits
v2.3-1601-gf0f0133
$
```

Since that point there have been some commits both to the upstream and to our 
local fork. Now as we've merged again from upstream, we're seeing something 
strange. The latest (timestamp-wise) commit before the merge was to the 
upstream and it was tagged `v2.4` (annotated):

```
$ git log --oneline --decorate=short --graph upstream/master
* c71bcdc01 (tag: v2.4, upstream/master, upstream/HEAD, upstream) 
phosphor-webui: srcrev bump 1cc7021c9e..bc5cc7f75c
* 3db7820e6 skiboot: Bump to 6.0.13
* 5a963b231 linux-aspeed: Move to 4.18.16
* 36a49e2d7 Add Entity Manager to Facebook Tiogapass
```

Our local fork after merging looks like this:

```
$ git merge upstream/master
Merge made by the 'recursive' strategy.
 meta-aspeed/recipes-kernel/linux/linux-aspeed_git.bb   
 |  4 ++--
 meta-facebook/meta-tiogapass/conf/machine/tiogapass.conf   
 |  1 +
 
meta-facebook/meta-tiogapass/recipes-fbtp/packagegroups/packagegroup-fb-apps.bb 
| 18 ++
 meta-google/recipes-google/ipmi/google-ipmi-sys_git.bb 
 |  2 +-
 meta-openpower/recipes-bsp/skiboot/skiboot.inc 
 |  6 +++---
 meta-phosphor/recipes-phosphor/state/phosphor-state-manager_git.bb 
 |  2 +-
 meta-phosphor/recipes-phosphor/webui/phosphor-webui_git.bb 
 |  2 +-
 7 files changed, 27 insertions(+), 8 deletions(-)
 create mode 100644 
meta-facebook/meta-tiogapass/recipes-fbtp/packagegroups/packagegroup-fb-apps.bb
$
$ git log --oneline --decorate=short --graph
*   4aaa7d156 (HEAD -> master) Merge remote-tracking branch 'upstream/master'
|\ 
| * c71bcdc01 (tag: v2.4, upstream/master, upstream/HEAD, upstream) 
phosphor-webui: srcrev bump 1cc7021c9e..bc5cc7f75c
| * 3db7820e6 skiboot: Bump to 6.0.13
| * 5a963b231 linux-aspeed: Move to 4.18.16
| * 36a49e2d7 Add Entity Manager to Facebook Tiogapas
...
$
```

The strange part is this:

```
$ git describe --debug
searching to describe HEAD
finished search at eb8644742b22a77f40e6cfe9f3d325e92a602def
 annotated   1612 v2.3
 annotated   1976 v2.2
 annotated   2427 v2.1
 annotated   2525 v2.4
 annotated   2648 v2.0
 annotated   2910 v1.99.10
traversed 2985 commits
v2.3-1612-g4aaa7d1
```

As you can see, the `v2.4` tag, that according to the graph log is the 
immediate parent of the current HEAD, is not the best candidate somehow, and 
`v2.3` is instead reported. It also somehow thinks that there are 2525 commits 
between `v2.4` tag and the HEAD.

Is this a bug in git or are we doing anything wrong?

Thanks.

Alexander.

P.S. This has been verified both with stock git version 2.7.4 and with version 
2.20.0.rc0.349.g148beda from `next` top-of-branch.



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] test-lib-functions: make 'test_cmp_rev' more informative on failure

2018-11-19 Thread SZEDER Gábor
On Mon, Nov 19, 2018 at 02:28:18PM +0100, SZEDER Gábor wrote:
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index d158c8d0bf..fc84db67a1 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -854,9 +854,23 @@ test_must_be_empty () {
>  
>  # Tests that its two parameters refer to the same revision
>  test_cmp_rev () {
> - git rev-parse --verify "$1" >expect.rev &&
> - git rev-parse --verify "$2" >actual.rev &&
> - test_cmp expect.rev actual.rev
> + if test $# != 2
> + then
> + error "bug in the test script: test_cmp_rev requires two 
> revisions, but got $#"

And this here is another "bug in the test script" error that should be
turned into:

  BUG "test_cmp_rev requires two revisions, but got $#"

See: https://public-inbox.org/git/20181119131326.2435-1-szeder@gmail.com/



[PATCH] test-lib-functions: make 'test_cmp_rev' more informative on failure

2018-11-19 Thread SZEDER Gábor
The 'test_cmp_rev' helper is merely a wrapper around 'test_cmp'
checking the output of two 'git rev-parse' commands, which means that
its output on failure is not particularly informative, as it's
basically two OIDs with a bit of extra clutter of the diff header, but
without any indication of which two revisions have caused the failure:

  --- expect.rev  2018-11-17 14:02:11.569747033 +
  +++ actual.rev  2018-11-17 14:02:11.569747033 +
  @@ -1 +1 @@
  -d79ce1670bdcb76e6d1da2ae095e890ccb326ae9
  +139b20d8e6c5b496de61f033f642d0e3dbff528d

It also pollutes the test repo with these two intermediate files,
though that doesn't seem to cause any complications in our current
tests (meaning that I couldn't find any tests that have to work around
the presence of these files by explicitly removing or ignoring them).

Enhance 'test_cmp_rev' to provide a more useful output on failure with
less clutter:

  error: two revisions point to different objects:
'HEAD^': d79ce1670bdcb76e6d1da2ae095e890ccb326ae9
'extra': 139b20d8e6c5b496de61f033f642d0e3dbff528d

Doing so is more convenient when storing the OIDs outputted by 'git
rev-parse' in a local variable each, which, as a bonus, won't pollute
the repository with intermediate files.

While at it, also ensure that 'test_cmp_rev' is invoked with the right
number of parameters, namely two.

Signed-off-by: SZEDER Gábor 
---
 t/test-lib-functions.sh | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index d158c8d0bf..fc84db67a1 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -854,9 +854,23 @@ test_must_be_empty () {
 
 # Tests that its two parameters refer to the same revision
 test_cmp_rev () {
-   git rev-parse --verify "$1" >expect.rev &&
-   git rev-parse --verify "$2" >actual.rev &&
-   test_cmp expect.rev actual.rev
+   if test $# != 2
+   then
+   error "bug in the test script: test_cmp_rev requires two 
revisions, but got $#"
+   else
+   local r1 r2
+   r1=$(git rev-parse --verify "$1") &&
+   r2=$(git rev-parse --verify "$2") &&
+   if test "$r1" != "$r2"
+   then
+   cat >&4 <<-EOF
+   error: two revisions point to different objects:
+ '$1': $r1
+ '$2': $r2
+   EOF
+   return 1
+   fi
+   fi
 }
 
 # Print a sequence of integers in increasing order, either with
-- 
2.20.0.rc0.134.gf0022f8e60



[PATCH] tests: send "bug in the test script" errors to the script's stderr

2018-11-19 Thread SZEDER Gábor
Some of the functions in our test library check that they were invoked
properly with conditions like this:

  test "$#" = 2 ||
  error "bug in the test script: not 2 parameters to test-expect-success"

If this particular condition is triggered, then 'error' will abort the
whole test script with a bold red error message [1] right away.

However, under certain circumstances the test script will be aborted
completely silently, namely if:

  - a similar condition in a test helper function like
'test_line_count' is triggered,
  - which is invoked from the test script's "main" shell [2],
  - and the test script is run manually (i.e. './t1234-foo.sh' as
opposed to 'make t1234-foo.sh' or 'make test') [3]
  - and without the '--verbose' option,

because the error message is printed from within 'test_eval_', where
standard output is redirected either to /dev/null or to a log file.
The only indication that something is wrong is that not all tests in
the script are executed and at the end of the test script's output
there is no "# passed all N tests" message, which are subtle and can
easily go unnoticed, as I had to experience myself.

Send these "bug in the test script" error messages directly to the
test scripts standard error and thus to the terminal, so those bugs
will be much harder to overlook.  Instead of updating all ~20 such
'error' calls with a redirection, let's add a BUG() function to
'test-lib.sh', wrapping an 'error' call with the proper redirection
and also including the common prefix of those error messages, and
convert all those call sites [4] to use this new BUG() function
instead.

[1] That particular error message from 'test_expect_success' is
printed in color only when running with or without '--verbose';
with '--tee' or '--verbose-log' the error is printed without
color, but it is printed to the terminal nonetheless.

[2] If such a condition is triggered in a subshell of a test, then
'error' won't be able to abort the whole test script, but only the
subshell, which in turn causes the test to fail in the usual way,
indicating loudly and clearly that something is wrong.

[3] Well, 'error' aborts the test script the same way when run
manually or by 'make' or 'prove', but both 'make' and 'prove' pay
attention to the test script's exit status, and even a silently
aborted test script would then trigger those tools' usual
noticable error messages.

[4] Strictly speaking, not all those 'error' calls need that
redirection to send their output to the terminal, see e.g.
'test_expect_success' in the opening example, but I think it's
better to be consistent.

Signed-off-by: SZEDER Gábor 
---
 t/perf/perf-lib.sh  |  4 ++--
 t/t0001-init.sh |  4 ++--
 t/t4013-diff-various.sh |  2 +-
 t/t5516-fetch-push.sh   |  2 +-
 t/t9902-completion.sh   |  2 +-
 t/test-lib-functions.sh | 25 -
 t/test-lib.sh   | 10 +++---
 7 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 11d1922cf5..2e33ab3ec3 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -82,7 +82,7 @@ test_perf_do_repo_symlink_config_ () {
 
 test_perf_create_repo_from () {
test "$#" = 2 ||
-   error "bug in the test script: not 2 parameters to test-create-repo"
+   BUG "not 2 parameters to test-create-repo"
repo="$1"
source="$2"
source_git="$("$MODERN_GIT" -C "$source" rev-parse --git-dir)"
@@ -184,7 +184,7 @@ test_wrapper_ () {
test_start_
test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
test "$#" = 2 ||
-   error "bug in the test script: not 2 or 3 parameters to 
test-expect-success"
+   BUG "not 2 or 3 parameters to test-expect-success"
export test_prereq
if ! test_skip "$@"
then
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 182da069f1..42a263cada 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -319,14 +319,14 @@ test_lazy_prereq GETCWD_IGNORES_PERMS '
base=GETCWD_TEST_BASE_DIR &&
mkdir -p $base/dir &&
chmod 100 $base ||
-   error "bug in test script: cannot prepare $base"
+   BUG "cannot prepare $base"
 
(cd $base/dir && /bin/pwd -P)
status=$?
 
chmod 700 $base &&
rm -rf $base ||
-   error "bug in test script: cannot clean $base"
+   BUG "cannot clean $base"
return $status
 '
 
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 73f7038253..7d985ff6b1 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -129,7 +129,7 @@ do
case "$magic" in
noellipses) ;;
*)
-   die "bug in t4103: unknown magic $magic" ;;
+   BUG "unknown magic $magic" ;;
esac ;;
*)
cmd="$magic $cmd" magic=
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh

Re: [PATCH/RFC] checkout: print something when checking out paths

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Duy Nguyen wrote:

> On Mon, Nov 12, 2018 at 7:21 AM Junio C Hamano  wrote:
>>
>> Nguyễn Thái Ngọc Duy   writes:
>>
>> > Since the purpose of printing this is to help disambiguate. Only do it
>> > when "--" is missing (the actual reason though is many tests check
>> > empty stderr to see that no error is raised and I'm too lazy to fix
>> > all the test cases).
>>
>> Heh, honesty is good but in this particular case the official reason
>> alone would make perfect sense, too ;-)
>>
>> As with progress output, shouldn't this automatically be turned off
>> when the standard error stream goes to non tty, as the real purpose
>> of printing is to help the user sitting in front of the terminal and
>> interacting with the command?
>
> I see this at the same level as "Switched to branch 'foo'" which is
> also protected by opts->quiet. If we start hiding messages based on
> tty it should be done for other messages in update_refs_for_switch()
> too, I guess?

I have no real opinion either way, but whatever we can do about
"checkout" being so confusing since it does so many things is most
welcome.

Just an alternative: Maybe we can start this out as advice() output
that's either opt-in via config (not on by default) to start with, or
have some advice_tty() that only emits it in the same circumstances we
emit the progress output?


Re: [PATCH v2 2/2] rebase: validate -C and --whitespace= parameters early

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Wed, Nov 14 2018, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin 
>
> It is a good idea to error out early upon seeing, say, `-Cbad`, rather
> than starting the rebase only to have the `--am` backend complain later.
>
> Let's do this.
>
> The only options accepting parameters which we pass through to `git am`
> (which may, or may not, forward them to `git apply`) are `-C` and
> `--whitespace`. The other options we pass through do not accept
> parameters, so we do not have to validate them here.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  builtin/rebase.c  | 12 +++-
>  t/t3406-rebase-message.sh |  7 +++
>  2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 96ffa80b71..571cf899d5 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -1064,12 +1064,22 @@ int cmd_rebase(int argc, const char **argv, const 
> char *prefix)
>   }
>
>   for (i = 0; i < options.git_am_opts.argc; i++) {
> - const char *option = options.git_am_opts.argv[i];
> + const char *option = options.git_am_opts.argv[i], *p;
>   if (!strcmp(option, "--committer-date-is-author-date") ||
>   !strcmp(option, "--ignore-date") ||
>   !strcmp(option, "--whitespace=fix") ||
>   !strcmp(option, "--whitespace=strip"))
>   options.flags |= REBASE_FORCE;
> + else if (skip_prefix(option, "-C", )) {
> + while (*p)
> + if (!isdigit(*(p++)))
> + die(_("switch `C' expects a "
> +   "numerical value"));
> + } else if (skip_prefix(option, "--whitespace=", )) {
> + if (*p && strcmp(p, "warn") && strcmp(p, "nowarn") &&
> + strcmp(p, "error") && strcmp(p, "error-all"))
> + die("Invalid whitespace option: '%s'", p);
> + }
>   }
>
>   if (!(options.flags & REBASE_NO_QUIET))
> diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
> index 0392e36d23..2c79eed4fe 100755
> --- a/t/t3406-rebase-message.sh
> +++ b/t/t3406-rebase-message.sh
> @@ -84,4 +84,11 @@ test_expect_success 'rebase --onto outputs the invalid 
> ref' '
>   test_i18ngrep "invalid-ref" err
>  '
>
> +test_expect_success 'error out early upon -C or --whitespace=' '
> + test_must_fail git rebase -Cnot-a-number HEAD 2>err &&
> + test_i18ngrep "numerical value" err &&
> + test_must_fail git rebase --whitespace=bad HEAD 2>err &&
> + test_i18ngrep "Invalid whitespace option" err
> +'
> +

The combination of this gitster/js/rebase-am-options and my
gitster/ab/rebase-in-c-escape-hatch breaks tests under
GIT_TEST_REBASE_USE_BUILTIN=false for obvious reasons. The C version is
now more strict.


Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Torsten Bögershausen
On 2018-11-19 09:20, Carlo Marcelo Arenas Belón wrote:
> While I don't have an HFS+ volume to test, I suspect this patch should be
> needed for both, even if I have to say thay even the broken output was
> better than the current state.
> 
> Travis seems to be using a case sensitive filesystem so wouldn't catch this.
> 
> Was windows/cygwin tested?
> 
> Carlo
> -- >8 --
> Subject: [PATCH] entry: fix t5061 on macOS
> 
> b878579ae7 ("clone: report duplicate entries on case-insensitive filesystems",
> 2018-08-17) was tested on Linux with an excemption for Windows that needs
> to be expanded for macOS (using APFS), which then would show :
> 
> $ git clone git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git
> warning: the following paths have collided (e.g. case-sensitive paths
> on a case-insensitive filesystem) and only one from the same
> colliding group is in the working tree:
> 
>   'man2/_Exit.2'
>   'man2/_exit.2'
>   'man3/NAN.3'
>   'man3/nan.3'
> 
> Signed-off-by: Carlo Marcelo Arenas Belón 
> ---
>  entry.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/entry.c b/entry.c
> index 5d136c5d55..3845f570f7 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -404,7 +404,7 @@ static void mark_colliding_entries(const struct checkout 
> *state,
>  {
>   int i, trust_ino = check_stat;
>  
> -#if defined(GIT_WINDOWS_NATIVE)
> +#if defined(GIT_WINDOWS_NATIVE) || defined(__APPLE__)
>   trust_ino = 0;
>  #endif
>  
> 

Sorry,
but I can't reproduce your problem here.

Did you test it on Mac ?
If I run t5601 on a case sensitive files system
(Mac, mounted NFS, exported from Linux)
I get:
ok 99 # skip colliding file detection (missing CASE_INSENSITIVE_FS of
!MINGW,!CYGWIN,CASE_INSENSITIVE_FS)

And if I run it on a case-insensitive HFS+,
I get
ok 99 - colliding file detection

So what exactly are you trying to fix ?



Re: [PATCH 0/5] Make :(attr) pathspec work with "git log"

2018-11-19 Thread Jeff King
On Sun, Nov 18, 2018 at 08:51:52PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > But this also reveals an interesting thing: even though we walk on a
> > tree, we check attributes from _worktree_ (and optionally fall back to
> > the index). This is how attributes are implemented since forever. I
> > think this is not a big deal if we communicate clearly with the user.
> > But otherwise, this series can be scraped, as reading attributes from
> > a specific tree could be a lot of work.
> 
> I'm very happy to see this implemented, and I think the behavior
> described here is the right way to go. E.g. in git.git we have diff=perl
> entries in .gitattributes. It would suck if:
> 
> git log ':(attr:diff=perl)'
> 
> Would only list commits as far as 20460635a8 (".gitattributes: use the
> "perl" differ for Perl", 2018-04-26), since that's when we stop having
> that attribute. Ditto for wanting to run "grep" on e.g. perl files in
> 2.12.0.
> 
> I have also run into cases where I want to use a .gitattributes file
> from a specific commit. E.g. when writing pre-receive hooks where I've
> wanted the .gitattributes of the commit being pushed to configure
> something about it. But as you note this isn't supported at all.
> 
> But a concern is whether we should be making :(attr:*) behave like this
> for now. Are we going to regret it later? I don't think so, I think
> wanting to use the current working tree's / index's is the most sane
> default, and if we get the ability to read it from revisions as we
> e.g. walk the log it would make most sense to just call that
> :(treeattr:*) or something like that.

I think that ship already sailed with the fact that "git log -p" will
show diffs using the worktree attrs. I agree that it would sometimes be
nice to specify attributes from a particular tree, but at this point the
default probably needs to remain as it is.

-Peff


Re: Big repository on Aix

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 19 2018, Mgr Georg Black wrote:

> Hello everyone,
> We have source codes in many files of 20 years development. It's approx 
> 800mb. There are many programs. Every has about 100-200 modules.
> Company has 40 developers. They all works via terminal on aix.
> At this time we have three folders for three versions. Everybody send changes 
> to them via script blocking parallel work.
> It's possible migrate to the Git version system. We are afraid of big local 
> copies for every developer. We have not enough space for 40 x 800MB plus 
> history etc. Exist some less dramatic way?
> Thanks for your info.

The first thing you should do is import what you have into git. Then run
"git gc --prune=now'.

Then see how big your checked-out directory is (everything except .git)
and the history (the .git folder).

If it's say 1GB for the working tree and 2GB for the history maybe you
can just use git the simplest of modes. You'd have 3GB of data for each
developer, you have 40 so 3*40=120GB. Let's add RAID 1 to that and end
up at 50% usage, you're still below 500GB of disk space. Is that really
that big of an investment for 40 devs?

If it is there's still things you can do. E.g. shallow clones, or using
altrenates on-disk to de-dupe space. See my
https://public-inbox.org/git/87muwzc2kv@evledraar.gmail.com/ and
https://public-inbox.org/git/87in7nbi5b@evledraar.gmail.com/ for
some details. That needs to be managed very carefully least you
introduce repository corruption, but it would bring the disk space down
to something closer to 40GB for these 40 developers in this example.


Re: [PATCH 0/5] Make :(attr) pathspec work with "git log"

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Sun, Nov 18 2018, Nguyễn Thái Ngọc Duy wrote:

> When :(attr) was added, it supported one of the two main pathspec
> matching functions, the one that works on a list of paths. The other
> one works on a tree, tree_entry_interesting(), which gets :(attr)
> support in this series.
>
> With this, "git grep   -- :(attr)" or "git log :(attr)"
> will not abort with BUG() anymore.
>
> But this also reveals an interesting thing: even though we walk on a
> tree, we check attributes from _worktree_ (and optionally fall back to
> the index). This is how attributes are implemented since forever. I
> think this is not a big deal if we communicate clearly with the user.
> But otherwise, this series can be scraped, as reading attributes from
> a specific tree could be a lot of work.
>
> The main patch is the last one. The others are just to open a path to
> pass "struct index_state *" down to tree_entry_interesting(). This may
> become standard procedure because we don't want to stick the_index (or
> the_repository) here and there.

Another side-note (this thread is turning into my personal blog at this
point...) I found an old related thread:
https://public-inbox.org/git/20170509225219.gb106...@google.com/

So this series fixes 1/2 of the issues noted there, but git-ls-tree will
still die with the same error.


Big repository on Aix

2018-11-19 Thread Mgr Georg Black
Hello everyone,
We have source codes in many files of 20 years development. It's approx 800mb. 
There are many programs. Every has about 100-200 modules.
Company has 40 developers. They all works via terminal on aix.
At this time we have three folders for three versions. Everybody send changes 
to them via script blocking parallel work.
It's possible migrate to the Git version system. We are afraid of big local 
copies for every developer. We have not enough space for 40 x 800MB plus 
history etc. Exist some less dramatic way?
Thanks for your info.
Best Regards
Georg Black


Re: [PATCH 0/5] Make :(attr) pathspec work with "git log"

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Sun, Nov 18 2018, Ævar Arnfjörð Bjarmason wrote:

> On Sun, Nov 18 2018, Nguyễn Thái Ngọc Duy wrote:
>
>> When :(attr) was added, it supported one of the two main pathspec
>> matching functions, the one that works on a list of paths. The other
>> one works on a tree, tree_entry_interesting(), which gets :(attr)
>> support in this series.
>>
>> With this, "git grep   -- :(attr)" or "git log :(attr)"
>> will not abort with BUG() anymore.
>>
>> But this also reveals an interesting thing: even though we walk on a
>> tree, we check attributes from _worktree_ (and optionally fall back to
>> the index). This is how attributes are implemented since forever. I
>> think this is not a big deal if we communicate clearly with the user.
>> But otherwise, this series can be scraped, as reading attributes from
>> a specific tree could be a lot of work.
>
> I'm very happy to see this implemented, and I think the behavior
> described here is the right way to go. E.g. in git.git we have diff=perl
> entries in .gitattributes. It would suck if:
>
> git log ':(attr:diff=perl)'
>
> Would only list commits as far as 20460635a8 (".gitattributes: use the
> "perl" differ for Perl", 2018-04-26), since that's when we stop having
> that attribute. Ditto for wanting to run "grep" on e.g. perl files in
> 2.12.0.
>
> I have also run into cases where I want to use a .gitattributes file
> from a specific commit. E.g. when writing pre-receive hooks where I've
> wanted the .gitattributes of the commit being pushed to configure
> something about it. But as you note this isn't supported at all.
>
> But a concern is whether we should be making :(attr:*) behave like this
> for now. Are we going to regret it later? I don't think so, I think
> wanting to use the current working tree's / index's is the most sane
> default, and if we get the ability to read it from revisions as we
> e.g. walk the log it would make most sense to just call that
> :(treeattr:*) or something like that.

As an aside, how do you do the inverse of matching for an attribute by
value? I.e.:

$ git ls-files | wc -l; git ls-files ':(attr:diff=perl)' | wc -l
3522
65

I'd like something gives me all files that don't match diff=perl,
i.e. 3522-65 = 3457 files, or what I'd get if I constructed such a match
manually with excludes:

$ git ls-files $(grep diff=perl .gitattributes | cut -d ' ' -f1 | sed 
's!^!:(exclude)!') | wc -l
3457

>From my reading of parse_pathspec_attr_match() and match_attrs() this
isn't possible and I'd need to support ':(attr:diff!=perl)' via a new
MATCH_NOT_VALUE mode. But I wanted to make sure I wasn't missing some
subtlety, i.e. that this was implemented already via some other feature.

I thought I could do:

git ls-files ':(exclude):(attr:diff=perl)'

But we don't support chaining like that, and this would only exclude a
file that's actually called ":(attr:diff=perl)". I.e. created via
something like "touch ':(attr:diff=perl)'".


Re: [PATCH] t5562: skip if NO_CURL is enabled

2018-11-19 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 19 2018, Carlo Marcelo Arenas Belón wrote:

> 6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", 
> 2018-07-27)
> introduced all tests but without a check for CURL support from git.
>
> Signed-off-by: Carlo Marcelo Arenas Belón 
> ---
>  t/t5562-http-backend-content-length.sh | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/t/t5562-http-backend-content-length.sh 
> b/t/t5562-http-backend-content-length.sh
> index b24d8b05a4..7594899471 100755
> --- a/t/t5562-http-backend-content-length.sh
> +++ b/t/t5562-http-backend-content-length.sh
> @@ -3,6 +3,12 @@
>  test_description='test git-http-backend respects CONTENT_LENGTH'
>  . ./test-lib.sh

This seems like the wrong fix for whatever bug you're encountering. I
just built with NO_CURL and:

$ ./t5561-http-backend.sh
1..0 # SKIP skipping test, git built without http support
$ ./t5562-http-backend-content-length.sh
ok 1 - setup
ok 2 - setup, compression related
ok 3 - fetch plain
ok 4 - fetch plain truncated
ok 5 - fetch plain empty
ok 6 - fetch gzipped
ok 7 - fetch gzipped truncated
ok 8 - fetch gzipped empty
ok 9 - push plain
ok 10 - push plain truncated
ok 11 - push plain empty
ok 12 - push gzipped
ok 13 - push gzipped truncated
ok 14 - push gzipped empty
ok 15 - CONTENT_LENGTH overflow ssite_t
ok 16 - empty CONTENT_LENGTH
# passed all 16 test(s)
1..16

So all these test pass.

Of courses I still have curl on my system, but I don't see the curl(1)
utility used in the test, and my git at this point can't operate on
https?:// URLs, so what error are you getting? Can you paste the test
output with -x -v?

> +if test -n "$NO_CURL"
> +then
> + skip_all='skipping test, git built without http support'
> + test_done
> +fi
> +
>  test_lazy_prereq GZIP 'gzip --version'
>
>  verify_http_result() {

If we do end up needing this after all it seems better to do something
like:

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index a8729f8232..adad654277 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -30,11 +30,7 @@
 # Copyright (c) 2008 Clemens Buchacher 
 #

-if test -n "$NO_CURL"
-then
-   skip_all='skipping test, git built without http support'
-   test_done
-fi
+. "$TEST_DIRECTORY"/lib-no-curl.sh

 if test -n "$NO_EXPAT" && test -n "$LIB_HTTPD_DAV"
 then
diff --git a/t/lib-no-curl.sh b/t/lib-no-curl.sh
new file mode 100644
index 00..014947aa2d
--- /dev/null
+++ b/t/lib-no-curl.sh
@@ -0,0 +1,5 @@
+if test -n "$NO_CURL"
+then
+   skip_all='skipping test, git built without http support'
+   test_done
+fi
diff --git a/t/t5562-http-backend-content-length.sh 
b/t/t5562-http-backend-content-length.sh
index b24d8b05a4..cffb460673 100755
--- a/t/t5562-http-backend-content-length.sh
+++ b/t/t5562-http-backend-content-length.sh
@@ -2,6 +2,7 @@

 test_description='test git-http-backend respects CONTENT_LENGTH'
 . ./test-lib.sh
+. ./lib-no-curl.sh

 test_lazy_prereq GZIP 'gzip --version'

Not really a problem with your patch, we have lots of this copy/pasting
all over the place already. I.e. stuff like:

if test -n "$X"
then
skip_all="$Y"
test_done
fi

or:

if ! test_have_prereq "$X"
then
skip_all="$Y"
test_done
fi

Maybe we should make more use of test_lazy_prereq and factor all that
into a new helper like:

test_have_prereq_or_skip_all "$X" "$Y"

Which could be put at the top of these various tests...


Re: how often do you check in and with what tool(s)?

2018-11-19 Thread Jeff King
On Tue, Nov 13, 2018 at 06:04:15PM -0500, _g e r r y _ _l o w r y _ wrote:

> Examples:
> 
> {me, extreme often, Windows}  very granular, with a brief yet appropriate 
> comment [like narrating a story] per commit-i change a few
> lines of code,
> Alt+Tab to Git Bash, Git Add/Commit,
> Alt+Tab back to some coding tool (example LINQPad).
> [generally, i check in one source file before moving to the next source file]
> 
> 
> {not me, very extreme seldom} in some project, not at all granular, in 
> batches such as 50 of 75 files that have been modified, all
> are checked in with a single detailed comment as to the overall purpose of 
> the batched changes.
> 
> 
> QUESTION:  how often do you check in and with what tool(s)?

I think the more important thing is not how often you commit, but that
your final product of commits is reasonably split to communicate a
logical flow to the proposed changes. That organization helps reviewers
understand what's going on, but it also helps me convince myself that
the direction is good (and that each individual change is necessary and
correct). I get there through a combination of:

  - breaking the problem down ahead of time into logical steps, then
committing after each step is done. E.g., if I'm writing a feature
in foo.c that's going to need infrastructure from bar.c, it's pretty
clear the patch series is going to look something like:

  - refactor bar.c into reusable bits
  - start using reusable bits in foo.c

Take as an example the recent object-cache series I did in:

  https://public-inbox.org/git/20181112144627.ga2...@sigill.intra.peff.net/

I knew I needed to make the existing object-cache code from the
alternates struct available more universally, so I did that first
and then finally in patch 8 I could make use of it in the new spot.

  - committing when you discover a new logical breakpoint. This is
mostly intuition, but is often obvious. Say you're refactoring bar.c
into reusable bits, and you realize there are three distinct bits of
factoring.

Going back to that real-world example above, I found there were a
bunch of discrete steps: renaming the struct (patch 3), refactoring
path creation (patch 5), unifying the alt/non-alt cases (patch 6),
and then providing the cache helpers (patch 7).

Those mostly became clear as I was doing the work.

  - similarly, you might stumble over unrelated or tangential issues. In
that same series, I noticed an existing bug, whose fix became patch
1. That was definitely _not_ where I started, but as soon as I saw
that bug, I probably did a "git add -p && git commit" to file it
away (even with a lousy commit message, and then later I went back
and polished it with "rebase -i").

  - if you're lucky that all happens linearly. But most of the time it
doesn't. It's usually more like a rabbit hole, where you know you're
trying to get to point X, and trying to get there reveals all the
other problems. So at any given time during a series like that, my
working tree is a mess of related half-finished changes. I'll
periodically break that down into rough patches with "add -p" and
commit.

Those intermediate results often have minor compilation problems
(because of ordering issues, or maybe the whole state I had when I
broke it down didn't work yet, but I could start to see the chunks).
So then I take a pass with "rebase -i" ("rebase -x 'make test'" is
especially helpful here) and get something sensible.

So in response to your original question, I commit as often as once a
minute, or as rarely as a big chunk at the end of a day. :)

But in the latter case, I'm almost always going back to convert it into
a series of commits that each represent probably no more than a half
hour of work (and that could be read in much less time).

-Peff


[PATCH] t5562: skip if NO_CURL is enabled

2018-11-19 Thread Carlo Marcelo Arenas Belón
6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", 2018-07-27)
introduced all tests but without a check for CURL support from git.

Signed-off-by: Carlo Marcelo Arenas Belón 
---
 t/t5562-http-backend-content-length.sh | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/t/t5562-http-backend-content-length.sh 
b/t/t5562-http-backend-content-length.sh
index b24d8b05a4..7594899471 100755
--- a/t/t5562-http-backend-content-length.sh
+++ b/t/t5562-http-backend-content-length.sh
@@ -3,6 +3,12 @@
 test_description='test git-http-backend respects CONTENT_LENGTH'
 . ./test-lib.sh
 
+if test -n "$NO_CURL"
+then
+   skip_all='skipping test, git built without http support'
+   test_done
+fi
+
 test_lazy_prereq GZIP 'gzip --version'
 
 verify_http_result() {
-- 
2.20.0.rc0



Draft of Git Rev News edition 45

2018-11-19 Thread Christian Couder
Hi,

A draft of a new Git Rev News edition is available here:

  https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-45.md

Everyone is welcome to contribute in any section either by editing the
above page on GitHub and sending a pull request, or by commenting on
this GitHub issue:

  https://github.com/git/git.github.io/issues/314

You can also reply to this email.

In general all kinds of contribution, for example proofreading,
suggestions for articles or links, help on the issues in GitHub, and
so on, are very much appreciated.

I tried to cc everyone who appears in this edition, but maybe I missed
some people, sorry about that.

Jakub, Markus, Gabriel and me plan to publish this edition on
Wednesday November 21st.

Thanks,
Christian.


[PATCH v5] clone: report duplicate entries on case-insensitive filesystems

2018-11-19 Thread Carlo Marcelo Arenas Belón
While I don't have an HFS+ volume to test, I suspect this patch should be
needed for both, even if I have to say thay even the broken output was
better than the current state.

Travis seems to be using a case sensitive filesystem so wouldn't catch this.

Was windows/cygwin tested?

Carlo
-- >8 --
Subject: [PATCH] entry: fix t5061 on macOS

b878579ae7 ("clone: report duplicate entries on case-insensitive filesystems",
2018-08-17) was tested on Linux with an excemption for Windows that needs
to be expanded for macOS (using APFS), which then would show :

$ git clone git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git
warning: the following paths have collided (e.g. case-sensitive paths
on a case-insensitive filesystem) and only one from the same
colliding group is in the working tree:

  'man2/_Exit.2'
  'man2/_exit.2'
  'man3/NAN.3'
  'man3/nan.3'

Signed-off-by: Carlo Marcelo Arenas Belón 
---
 entry.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/entry.c b/entry.c
index 5d136c5d55..3845f570f7 100644
--- a/entry.c
+++ b/entry.c
@@ -404,7 +404,7 @@ static void mark_colliding_entries(const struct checkout 
*state,
 {
int i, trust_ino = check_stat;
 
-#if defined(GIT_WINDOWS_NATIVE)
+#if defined(GIT_WINDOWS_NATIVE) || defined(__APPLE__)
trust_ino = 0;
 #endif
 
-- 
2.20.0.rc0