date:20180803

With the new cache-tree, we could mostly avoid I/O (due to odb access)
the code mostly becomes a loop of "check this, check that, add the
entry to the index". We could skip a couple checks in this giant loop
to go faster:

- We know here that we're copying entries from the source index to the
  result one. All paths in the source index must have been validated
  at load time already (and we're not taking strange paths from tree
  objects) which means we can skip verify_path() without compromise.

- We also know that D/F conflicts can't happen for all these entries
  (since cache-tree and all the trees are the same) so we can skip
  that as well.

This gives rather nice speedups for "unpack trees" rows where "unpack
trees" time is now cut in half compared to when
traverse_by_cache_tree() is added, or 1/7 of the original "unpack
trees" time.

   baseline  cache-treethis patch
 
   0.018239226   0.019365414   0.020519621 s: read cache .git/index
   0.052541655   0.049605548   0.048814384 s: preload index
   0.001537598   0.001571695   0.001575382 s: refresh index
   0.168167768   0.049677212   0.024719308 s: unpack trees
   0.002897186   0.002845256   0.00280 s: update worktree after a merge
   0.131661745   0.136597522   0.134891617 s: repair cache-tree
   0.075389117   0.075422517   0.074832291 s: write index, changed mask = 2a
   0.111702023   0.032813253   0.008616479 s: unpack trees
   0.23245   0.22002   0.26630 s: update worktree after a merge
   0.111793866   0.032933140   0.008714071 s: diff-index
   0.587933288   0.398924370   0.380452871 s: git command: 
/home/pclouds/w/git/git

Total saving of this new patch looks even less impressive, now that
time spent in unpacking trees is so small. Which is why the next
attempt should be on that "repair cache-tree" line.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 cache.h|  1 +
 read-cache.c   |  3 ++-
 unpack-trees.c | 20 
 unpack-trees.h |  1 +
 4 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 8b447652a7..e6f7ee4b64 100644
--- a/cache.h
+++ b/cache.h
@@ -673,6 +673,7 @@ extern int index_name_pos(const struct index_state *, const 
char *name, int name
 #define ADD_CACHE_JUST_APPEND 8/* Append only; 
tree.c::read_tree() */
 #define ADD_CACHE_NEW_ONLY 16  /* Do not replace existing ones */
 #define ADD_CACHE_KEEP_CACHE_TREE 32   /* Do not invalidate cache-tree */
+#define ADD_CACHE_SKIP_VERIFY_PATH 64  /* Do not verify path */
 extern int add_index_entry(struct index_state *, struct cache_entry *ce, int 
option);
 extern void rename_index_entry_at(struct index_state *, int pos, const char 
*new_name);
 
diff --git a/read-cache.c b/read-cache.c
index e865254bea..b0b5df5de7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1170,6 +1170,7 @@ static int add_index_entry_with_check(struct index_state 
*istate, struct cache_e
int ok_to_add = option & ADD_CACHE_OK_TO_ADD;
int ok_to_replace = option & ADD_CACHE_OK_TO_REPLACE;
int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;
+   int skip_verify_path = option & ADD_CACHE_SKIP_VERIFY_PATH;
int new_only = option & ADD_CACHE_NEW_ONLY;
 
if (!(option & ADD_CACHE_KEEP_CACHE_TREE))
@@ -1210,7 +1211,7 @@ static int add_index_entry_with_check(struct index_state 
*istate, struct cache_e
 
if (!ok_to_add)
return -1;
-   if (!verify_path(ce->name, ce->ce_mode))
+   if (!skip_verify_path && !verify_path(ce->name, ce->ce_mode))
return error("Invalid path '%s'", ce->name);
 
if (!skip_df_check &&
diff --git a/unpack-trees.c b/unpack-trees.c
index c8defc2015..1438ee1555 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -201,6 +201,7 @@ static int do_add_entry(struct unpack_trees_options *o, 
struct cache_entry *ce,
 
ce->ce_flags = (ce->ce_flags & ~clear) | set;
return add_index_entry(>result, ce,
+  o->extra_add_index_flags |
   ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE);
 }
 
@@ -701,6 +702,24 @@ static int traverse_by_cache_tree(int pos, int nr_entries, 
int nr_names,
if (!o->merge)
BUG("We need cache-tree to do this optimization");
 
+   /*
+* Try to keep add_index_entry() as fast as possible since
+* we're going to do a lot of them.
+*
+* Skipping verify_path() should totally be safe because these
+* paths are from the source index, which must have been
+* verified.
+*
+* Skipping D/F and cache-tree validation checks is trickier
+* because it assumes what n-merge code would do when all
+* trees and the index are the same. We probably could just
+* optimize those code instead (e.g. we don't invalidate that
+* many cache-tree, but the searching for them

[PATCH v3 0/4] Speed up unpack_trees()

This is a minor update to address Ben's comments and add his
measurements in the commit message of 2/4 for the record.

I've also checked about the lookahead thing in unpack_trees() to see
if we accidentally break something there, which is my biggest worry.
See [1] and [2] for context, but I believe since we can't have D/F
conflicts, the situation where lookahead is needed will not occur. So
we should be safe.

[1] da165f470e (unpack-trees.c: prepare for looking ahead in the index - 
2010-01-07)
[2] 730f72840c (unpack-trees.c: look ahead in the index - 2009-09-20)

range-diff:

1:  789f7e2872 ! 1:  05eb762d2d unpack-trees.c: add performance tracing
@@ -1,6 +1,6 @@
 Author: Nguyễn Thái Ngọc Duy 
 
-unpack-trees.c: add performance tracing
+unpack-trees: add performance tracing
 
 We're going to optimize unpack_trees() a bit in the following
 patches. Let's add some tracing to measure how long it takes before
2:  589bed1366 ! 2:  02286ad123 unpack-trees: optimize walking same trees with 
cache-tree
@@ -32,6 +32,24 @@
 0.111793866   0.032933140 s: diff-index
 0.587933288   0.398924370 s: git command: /home/pclouds/w/git/git
 
+Another measurement from Ben's running "git checkout" with over 500k
+trees (on the whole series):
+
+baselinenew
+  
--
+0.535510167 0.556558733 s: read cache .git/index
+0.3057373   0.3147105   s: initialize name hash
+0.0184082   0.023558433 s: preload index
+0.086910967 0.089085967 s: refresh index
+7.889590767 2.191554433 s: unpack trees
+0.120760833 0.131941267 s: update worktree after a merge
+2.2583504   2.572663167 s: repair cache-tree
+0.8916137   0.959495233 s: write index, changed mask = 28
+3.405199233 0.2710663   s: unpack trees
+0.000999667 0.0021554   s: update worktree after a merge
+3.4063306   0.273318333 s: diff-index
+16.9524923  9.462943133 s: git command: git.exe checkout
+
 This command calls unpack_trees() twice, the first time on 2way merge
 and the second 1way merge. In both times, "unpack trees" time is
 reduced to one third. Overall time reduction is not that impressive of
@@ -39,7 +57,6 @@
 repair cache-tree line.
 
 Signed-off-by: Nguyễn Thái Ngọc Duy 
-Signed-off-by: Junio C Hamano 
 
 diff --git a/unpack-trees.c b/unpack-trees.c
 --- a/unpack-trees.c
@@ -170,7 +187,7 @@
  }
  
 +/*
-+ * Note that traverse_by_cache_tree() duplicates some logic in this 
funciton
++ * Note that traverse_by_cache_tree() duplicates some logic in this 
function
 + * without actually calling it. If you change the logic here you may need 
to
 + * check and change there as well.
 + */
@@ -189,12 +206,3 @@
  static int unpack_callback(int n, unsigned long mask, unsigned long 
dirmask, struct name_entry *names, struct traverse_info *info)
  {
struct cache_entry *src[MAX_UNPACK_TREES + 1] = { NULL, };
-@@
-   uint64_t start = getnanotime();
- 
-   if (len > MAX_UNPACK_TREES)
--  die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
-+  die(_("unpack_trees takes at most %d trees"), MAX_UNPACK_TREES);
- 
-   memset(, 0, sizeof(el));
-   if (!core_apply_sparse_checkout || !o->update)
3:  7c6f863fc0 = 3:  c87b82ffee unpack-trees: reduce malloc in cache-tree walk
4:  6ca17b1138 ! 4:  e791cdfc82 unpack-trees: cheaper index update when walking 
by cache-tree
@@ -40,7 +40,6 @@
 attempt should be on that "repair cache-tree" line.
 
 Signed-off-by: Nguyễn Thái Ngọc Duy 
-Signed-off-by: Junio C Hamano 
 
 diff --git a/cache.h b/cache.h
 --- a/cache.h
@@ -119,20 +118,6 @@
free(tree_ce);
if (o->debug_unpack)
printf("Unpacked %d entries from %s to %s using cache-tree\n",
-@@
-   if (!ret) {
-   if (!o->result.cache_tree)
-   o->result.cache_tree = cache_tree();
-+  /*
-+   * TODO: Walk o.src_index->cache_tree, quickly check
-+   * if o->result.cache has the exact same content for
-+   * any valid cache-tree in o.src_index, then we can
-+   * just copy the cache-tree over instead of hashing a
-+   * new tree object.
-+   */
-   if (!cache_tree_fully_valid(o->result.cache_tree))
-   cache_tree_update(>result,
-

[PATCH v3 2/4] unpack-trees: optimize walking same trees with cache-tree

From: Duy Nguyen 

In order to merge one or many trees with the index, unpack-trees code
walks multiple trees in parallel with the index and performs n-way
merge. If we find out at start of a directory that all trees are the
same (by comparing OID) and cache-tree happens to be available for
that directory as well, we could avoid walking the trees because we
already know what these trees contain: it's flattened in what's called
"the index".

The upside is of course a lot less I/O since we can potentially skip
lots of trees (think subtrees). We also save CPU because we don't have
to inflate and the apply deltas. The downside is of course more
fragile code since the logic in some functions are now duplicated
elsewhere.

"checkout -" with this patch on gcc.git:

baseline  new
  
0.018239226   0.019365414 s: read cache .git/index
0.052541655   0.049605548 s: preload index
0.001537598   0.001571695 s: refresh index
0.168167768   0.049677212 s: unpack trees
0.002897186   0.002845256 s: update worktree after a merge
0.131661745   0.136597522 s: repair cache-tree
0.075389117   0.075422517 s: write index, changed mask = 2a
0.111702023   0.032813253 s: unpack trees
0.23245   0.22002 s: update worktree after a merge
0.111793866   0.032933140 s: diff-index
0.587933288   0.398924370 s: git command: /home/pclouds/w/git/git

Another measurement from Ben's running "git checkout" with over 500k
trees (on the whole series):

baselinenew
  --
0.535510167 0.556558733 s: read cache .git/index
0.3057373   0.3147105   s: initialize name hash
0.0184082   0.023558433 s: preload index
0.086910967 0.089085967 s: refresh index
7.889590767 2.191554433 s: unpack trees
0.120760833 0.131941267 s: update worktree after a merge
2.2583504   2.572663167 s: repair cache-tree
0.8916137   0.959495233 s: write index, changed mask = 28
3.405199233 0.2710663   s: unpack trees
0.000999667 0.0021554   s: update worktree after a merge
3.4063306   0.273318333 s: diff-index
16.9524923  9.462943133 s: git command: git.exe checkout

This command calls unpack_trees() twice, the first time on 2way merge
and the second 1way merge. In both times, "unpack trees" time is
reduced to one third. Overall time reduction is not that impressive of
course because index operations take a big chunk. And there's that
repair cache-tree line.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.c | 117 +
 1 file changed, 117 insertions(+)

diff --git a/unpack-trees.c b/unpack-trees.c
index a32ddee159..ba3d2e947e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -644,6 +644,102 @@ static inline int are_same_oid(struct name_entry *name_j, 
struct name_entry *nam
return name_j->oid && name_k->oid && !oidcmp(name_j->oid, name_k->oid);
 }
 
+static int all_trees_same_as_cache_tree(int n, unsigned long dirmask,
+   struct name_entry *names,
+   struct traverse_info *info)
+{
+   struct unpack_trees_options *o = info->data;
+   int i;
+
+   if (!o->merge || dirmask != ((1 << n) - 1))
+   return 0;
+
+   for (i = 1; i < n; i++)
+   if (!are_same_oid(names, names + i))
+   return 0;
+
+   return cache_tree_matches_traversal(o->src_index->cache_tree, names, 
info);
+}
+
+static int index_pos_by_traverse_info(struct name_entry *names,
+ struct traverse_info *info)
+{
+   struct unpack_trees_options *o = info->data;
+   int len = traverse_path_len(info, names);
+   char *name = xmalloc(len + 1 /* slash */ + 1 /* NUL */);
+   int pos;
+
+   make_traverse_path(name, info, names);
+   name[len++] = '/';
+   name[len] = '\0';
+   pos = index_name_pos(o->src_index, name, len);
+   if (pos >= 0)
+   BUG("This is a directory and should not exist in index");
+   pos = -pos - 1;
+   if (!starts_with(o->src_index->cache[pos]->name, name) ||
+   (pos > 0 && starts_with(o->src_index->cache[pos-1]->name, name)))
+   BUG("pos must point at the first entry in this directory");
+   free(name);
+   return pos;
+}
+
+/*
+ * Fast path if we detect that all trees are the same as cache-tree at this
+ * path. We'll walk these trees recursively using cache-tree/index instead of
+ * ODB since already know what these trees contain.
+ */
+static int traverse_by_cache_tree(int pos, int nr_entries, int nr_names,
+ struct name_entry *names,
+ struct traverse_info *info)
+{
+   struct

[PATCH v3 1/4] unpack-trees: add performance tracing

We're going to optimize unpack_trees() a bit in the following
patches. Let's add some tracing to measure how long it takes before
and after. This is the baseline ("git checkout -" on gcc.git, 80k
files on worktree)

0.018239226 s: read cache .git/index
0.052541655 s: preload index
0.001537598 s: refresh index
0.168167768 s: unpack trees
0.002897186 s: update worktree after a merge
0.131661745 s: repair cache-tree
0.075389117 s: write index, changed mask = 2a
0.111702023 s: unpack trees
0.23245 s: update worktree after a merge
0.111793866 s: diff-index
0.587933288 s: git command: /home/pclouds/w/git/git checkout -

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
---
 cache-tree.c   | 2 ++
 unpack-trees.c | 4 
 2 files changed, 6 insertions(+)

diff --git a/cache-tree.c b/cache-tree.c
index 6b46711996..0dbe10fc85 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -426,6 +426,7 @@ static int update_one(struct cache_tree *it,
 
 int cache_tree_update(struct index_state *istate, int flags)
 {
+   uint64_t start = getnanotime();
struct cache_tree *it = istate->cache_tree;
struct cache_entry **cache = istate->cache;
int entries = istate->cache_nr;
@@ -437,6 +438,7 @@ int cache_tree_update(struct index_state *istate, int flags)
if (i < 0)
return i;
istate->cache_changed |= CACHE_TREE_CHANGED;
+   trace_performance_since(start, "repair cache-tree");
return 0;
 }
 
diff --git a/unpack-trees.c b/unpack-trees.c
index cd0680f11e..a32ddee159 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -352,6 +352,7 @@ static int check_updates(struct unpack_trees_options *o)
struct progress *progress = NULL;
struct index_state *index = >result;
struct checkout state = CHECKOUT_INIT;
+   uint64_t start = getnanotime();
int i;
 
state.force = 1;
@@ -423,6 +424,7 @@ static int check_updates(struct unpack_trees_options *o)
errs |= finish_delayed_checkout();
if (o->update)
git_attr_set_direction(GIT_ATTR_CHECKIN, NULL);
+   trace_performance_since(start, "update worktree after a merge");
return errs != 0;
 }
 
@@ -1275,6 +1277,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
int i, ret;
static struct cache_entry *dfc;
struct exclude_list el;
+   uint64_t start = getnanotime();
 
if (len > MAX_UNPACK_TREES)
die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
@@ -1423,6 +1426,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
goto done;
}
}
+   trace_performance_since(start, "unpack trees");
 
ret = check_updates(o) ? (-2) : 0;
if (o->dst_index) {
-- 
2.18.0.656.gda699b98b3

[PATCH v3 3/4] unpack-trees: reduce malloc in cache-tree walk

This is a micro optimization that probably only shines on repos with
deep directory structure. Instead of allocating and freeing a new
cache_entry in every iteration, we reuse the last one and only update
the parts that are new each iteration.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
---
 unpack-trees.c | 29 -
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index ba3d2e947e..c8defc2015 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -694,6 +694,8 @@ static int traverse_by_cache_tree(int pos, int nr_entries, 
int nr_names,
 {
struct cache_entry *src[MAX_UNPACK_TREES + 1] = { NULL, };
struct unpack_trees_options *o = info->data;
+   struct cache_entry *tree_ce = NULL;
+   int ce_len = 0;
int i, d;
 
if (!o->merge)
@@ -708,30 +710,39 @@ static int traverse_by_cache_tree(int pos, int 
nr_entries, int nr_names,
 * in the first place.
 */
for (i = 0; i < nr_entries; i++) {
-   struct cache_entry *tree_ce;
-   int len, rc;
+   int new_ce_len, len, rc;
 
src[0] = o->src_index->cache[pos + i];
 
len = ce_namelen(src[0]);
-   tree_ce = xcalloc(1, cache_entry_size(len));
+   new_ce_len = cache_entry_size(len);
+
+   if (new_ce_len > ce_len) {
+   new_ce_len <<= 1;
+   tree_ce = xrealloc(tree_ce, new_ce_len);
+   memset(tree_ce, 0, new_ce_len);
+   ce_len = new_ce_len;
+
+   tree_ce->ce_flags = create_ce_flags(0);
+
+   for (d = 1; d <= nr_names; d++)
+   src[d] = tree_ce;
+   }
 
tree_ce->ce_mode = src[0]->ce_mode;
-   tree_ce->ce_flags = create_ce_flags(0);
tree_ce->ce_namelen = len;
oidcpy(_ce->oid, [0]->oid);
memcpy(tree_ce->name, src[0]->name, len + 1);
 
-   for (d = 1; d <= nr_names; d++)
-   src[d] = tree_ce;
-
rc = call_unpack_fn((const struct cache_entry * const *)src, o);
-   free(tree_ce);
-   if (rc < 0)
+   if (rc < 0) {
+   free(tree_ce);
return rc;
+   }
 
mark_ce_used(src[0], o);
}
+   free(tree_ce);
if (o->debug_unpack)
printf("Unpacked %d entries from %s to %s using cache-tree\n",
   nr_entries,
-- 
2.18.0.656.gda699b98b3

[PATCH] Makefile: enable DEVELOPER by default

Reviewer bandwidth is limited, so let's have the machine of the
(potentially new) contributor warn about issues with the code by default.

As setting DEVELOPER, the compiler is stricter and we may run into problems
on some architectures. But packagers of said platforms are knowledgeable
enough to turn off this flag. (Also they are fewer than the number of new
contributors)

Signed-off-by: Stefan Beller 
---

Resending with  cc'd.

 Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Makefile b/Makefile
index 41b93689add..95aa3ff3185 100644
--- a/Makefile
+++ b/Makefile
@@ -497,6 +497,8 @@ ALL_CFLAGS = $(CPPFLAGS) $(CFLAGS)
 ALL_LDFLAGS = $(LDFLAGS)
 STRIP ?= strip
 
+DEVELOPER=1
+
 # Create as necessary, replace existing, make ranlib unneeded.
 ARFLAGS = rcs
 
-- 
2.18.0.597.ga71716f1ad-goog

[PATCH] Makefile: enable DEVELOPER by default

Reviewer bandwidth is limited, so let's have the machine of the
(potentially new) contributor warn about issues with the code by default.

As setting DEVELOPER, the compiler is stricter and we may run into problems
on some architectures. But packagers of said platforms are knowledgeable
enough to turn off this flag. (Also they are fewer than the number of new
contributors)

Signed-off-by: Stefan Beller 
---
 Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Makefile b/Makefile
index 41b93689add..95aa3ff3185 100644
--- a/Makefile
+++ b/Makefile
@@ -497,6 +497,8 @@ ALL_CFLAGS = $(CPPFLAGS) $(CFLAGS)
 ALL_LDFLAGS = $(LDFLAGS)
 STRIP ?= strip
 
+DEVELOPER=1
+
 # Create as necessary, replace existing, make ranlib unneeded.
 ARFLAGS = rcs
 
-- 
2.18.0.597.ga71716f1ad-goog

[RFC PATCH 7/7] diff/am: enhance diff format to use */~ for moved lines

Try it out via
./git-format-patch --mark-moved 15ef69314d^..15ef69314d
to see if you like it.

This separates the coloring decision from the detection of moved lines.
When giving --mark-moved, move detection is still performed and the output
markers are adjusted to */~ for new and old code.

git-apply and git-am will also accept these patches by rewriting those
signs back to +/-.

Signed-off-by: Stefan Beller 
---
 apply.c | 12 
 diff.c  | 21 +
 diff.h  |  5 -
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/apply.c b/apply.c
index 23a0f25ded8..cc42a4fa02a 100644
--- a/apply.c
+++ b/apply.c
@@ -2900,6 +2900,12 @@ static int apply_one_fragment(struct apply_state *state,
ws_blank_line(patch + 1, plen, ws_rule))
is_blank_context = 1;
/* fallthrough */
+   case '~':
+   /*
+* For now ignore moved line indicators and apply
+* as a regular old line
+*/
+   /* fallthrough */
case '-':
memcpy(old, patch + 1, plen);
add_line_info(, old, plen,
@@ -2908,6 +2914,12 @@ static int apply_one_fragment(struct apply_state *state,
if (first == '-')
break;
/* fallthrough */
+   case '*':
+   /*
+* For now ignore moved line indicators and apply
+* as a regular new line
+*/
+   /* fallthrough */
case '+':
/* --no-add does not add new lines */
if (first == '+' && state->no_add)
diff --git a/diff.c b/diff.c
index 56bab011df7..8e39e77229f 100644
--- a/diff.c
+++ b/diff.c
@@ -1043,6 +1043,9 @@ static const char *determine_line_color(struct 
diff_options *o,
const int off = (eds->s == DIFF_SYMBOL_PLUS) ?
DIFF_FILE_NEW_MOVED - DIFF_FILE_OLD_MOVED : 0;
 
+   if (!o->color_moved)
+   goto default_color;
+
switch (flags & (DIFF_SYMBOL_MOVED_LINE |
 DIFF_SYMBOL_MOVED_LINE_ALT |
 DIFF_SYMBOL_MOVED_LINE_UNINTERESTING)) {
@@ -1063,6 +1066,7 @@ static const char *determine_line_color(struct 
diff_options *o,
set = diff_get_color_opt(o, DIFF_FILE_OLD_MOVED + off);
break;
default:
+default_color:
set = (eds->s == DIFF_SYMBOL_PLUS) ?
diff_get_color_opt(o, DIFF_FILE_NEW):
diff_get_color_opt(o, DIFF_FILE_OLD);
@@ -1152,6 +1156,9 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
 
first = o->output_indicators[OI_NEW] ?
o->output_indicators[OI_NEW] : "+";
+   if (o->output_indicators[OI_MOVED_NEW] &&
+  (flags & DIFF_SYMBOL_MOVED_LINE))
+   first = "*";
emit_line_ws_markup(o, set_sign, set, reset, first, line, len,
flags & DIFF_SYMBOL_CONTENT_WS_MASK,
flags & DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF);
@@ -1176,6 +1183,9 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
}
first = o->output_indicators[OI_OLD] ?
o->output_indicators[OI_OLD] : "-";
+   if (o->output_indicators[OI_MOVED_NEW] &&
+  (flags & DIFF_SYMBOL_MOVED_LINE))
+   first = "~";
emit_line_ws_markup(o, set_sign, set, reset, first, line, len,
flags & DIFF_SYMBOL_CONTENT_WS_MASK, 0);
break;
@@ -4795,6 +4805,7 @@ int diff_opt_parse(struct diff_options *options,
else if (!strcmp(arg, "--no-color"))
options->use_color = 0;
else if (!strcmp(arg, "--color-moved")) {
+   options->color_moved = 1;
if (diff_color_moved_default)
options->markup_moved = diff_color_moved_default;
if (options->markup_moved == COLOR_MOVED_NO)
@@ -4806,6 +4817,16 @@ int diff_opt_parse(struct diff_options *options,
if (cm < 0)
die("bad --color-moved argument: %s", arg);
options->markup_moved = cm;
+   options->color_moved = 1;
+   } else if (skip_prefix(arg, "--mark-moved", )) {
+   /*
+* NEEDSWORK:
+* Once merged with 51da15eb230 (diff.c: add a blocks mode for
+* moved code detection, 2018-07-16), make it COLOR_MOVED_BLOCKS
+*/
+   options->markup_moved = COLOR_MOVED_PLAIN;
+

[PATCH 2/7] diff.c: add --output-indicator-{new, old, context}

This will prove useful in range-diff in a later patch as we will be able to
differentiate between adding a new file (that line is starting with +++
and then the file name) and regular new lines.

It could also be useful for experimentation in new patch formats, i.e.
we could teach git to emit moved lines with lines other than +/-.

Signed-off-by: Stefan Beller 
---
 diff.c | 21 +
 diff.h |  5 +
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/diff.c b/diff.c
index 57a8a38755e..2e711809700 100644
--- a/diff.c
+++ b/diff.c
@@ -1032,7 +1032,7 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
 struct emitted_diff_symbol *eds)
 {
static const char *nneof = " No newline at end of file\n";
-   const char *context, *reset, *set, *set_sign, *meta, *fraginfo;
+   const char *context, *reset, *set, *set_sign, *meta, *fraginfo, *first;
struct strbuf sb = STRBUF_INIT;
 
enum diff_symbol s = eds->s;
@@ -1083,7 +1083,9 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
else if (c == '-')
set = diff_get_color_opt(o, DIFF_FILE_OLD);
}
-   emit_line_ws_markup(o, set_sign, set, reset, " ", line, len,
+   first = o->output_indicators[OI_CONTEXT] ?
+   o->output_indicators[OI_CONTEXT] : " ";
+   emit_line_ws_markup(o, set_sign, set, reset, first, line, len,
flags & (DIFF_SYMBOL_CONTENT_WS_MASK), 0);
break;
case DIFF_SYMBOL_PLUS:
@@ -1126,7 +1128,10 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
set = diff_get_color_opt(o, DIFF_CONTEXT_BOLD);
flags |= WS_IGNORE_FIRST_SPACE;
}
-   emit_line_ws_markup(o, set_sign, set, reset, "+", line, len,
+
+   first = o->output_indicators[OI_NEW] ?
+   o->output_indicators[OI_NEW] : "+";
+   emit_line_ws_markup(o, set_sign, set, reset, first, line, len,
flags & DIFF_SYMBOL_CONTENT_WS_MASK,
flags & DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF);
break;
@@ -1169,7 +1174,9 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
else
set = diff_get_color_opt(o, DIFF_CONTEXT_DIM);
}
-   emit_line_ws_markup(o, set_sign, set, reset, "-", line, len,
+   first = o->output_indicators[OI_OLD] ?
+   o->output_indicators[OI_OLD] : "-";
+   emit_line_ws_markup(o, set_sign, set, reset, first, line, len,
flags & DIFF_SYMBOL_CONTENT_WS_MASK, 0);
break;
case DIFF_SYMBOL_WORDS_PORCELAIN:
@@ -4670,6 +4677,12 @@ int diff_opt_parse(struct diff_options *options,
 options->output_format |= DIFF_FORMAT_DIFFSTAT;
} else if (!strcmp(arg, "--no-compact-summary"))
 options->flags.stat_with_summary = 0;
+   else if (skip_prefix(arg, "--output-indicator-new=", ))
+   options->output_indicators[OI_NEW] = arg;
+   else if (skip_prefix(arg, "--output-indicator-old=", ))
+   options->output_indicators[OI_OLD] = arg;
+   else if (skip_prefix(arg, "--output-indicator-context=", ))
+   options->output_indicators[OI_CONTEXT] = arg;
 
/* renames options */
else if (starts_with(arg, "-B") ||
diff --git a/diff.h b/diff.h
index a08a3b2a293..b8bbe7baeb8 100644
--- a/diff.h
+++ b/diff.h
@@ -194,6 +194,11 @@ struct diff_options {
FILE *file;
int close_file;
 
+#define OI_NEW 0
+#define OI_OLD 1
+#define OI_CONTEXT 2
+   const char *output_indicators[3];
+
struct pathspec pathspec;
pathchange_fn_t pathchange;
change_fn_t change;
-- 
2.18.0.597.ga71716f1ad-goog

[PATCH 5/7] diff.c: rename color_moved to markup_moved

This just renames a variable to make the next patch easier to review.

Signed-off-by: Stefan Beller 
---
 diff.c | 28 ++--
 diff.h |  2 +-
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/diff.c b/diff.c
index 2e711809700..d3829c7d086 100644
--- a/diff.c
+++ b/diff.c
@@ -821,7 +821,7 @@ static int shrink_potential_moved_blocks(struct moved_entry 
**pmb,
 }
 
 /*
- * If o->color_moved is COLOR_MOVED_PLAIN, this function does nothing.
+ * If o->markup_moved is COLOR_MOVED_PLAIN, this function does nothing.
  *
  * Otherwise, if the last block has fewer alphanumeric characters than
  * COLOR_MOVED_MIN_ALNUM_COUNT, unset DIFF_SYMBOL_MOVED_LINE on all lines in
@@ -836,7 +836,7 @@ static int shrink_potential_moved_blocks(struct moved_entry 
**pmb,
 static void adjust_last_block(struct diff_options *o, int n, int block_length)
 {
int i, alnum_count = 0;
-   if (o->color_moved == COLOR_MOVED_PLAIN)
+   if (o->markup_moved == COLOR_MOVED_PLAIN)
return;
for (i = 1; i < block_length + 1; i++) {
const char *c = o->emitted_symbols->buf[n - i].line;
@@ -895,7 +895,7 @@ static void mark_color_as_moved(struct diff_options *o,
 
l->flags |= DIFF_SYMBOL_MOVED_LINE;
 
-   if (o->color_moved == COLOR_MOVED_PLAIN)
+   if (o->markup_moved == COLOR_MOVED_PLAIN)
continue;
 
/* Check any potential block runs, advance each or nullify */
@@ -4220,7 +4220,7 @@ void diff_setup(struct diff_options *options)
options->b_prefix = "b/";
}
 
-   options->color_moved = diff_color_moved_default;
+   options->markup_moved = diff_color_moved_default;
 }
 
 void diff_setup_done(struct diff_options *options)
@@ -4333,7 +4333,7 @@ void diff_setup_done(struct diff_options *options)
die(_("--follow requires exactly one pathspec"));
 
if (!options->use_color || external_diff())
-   options->color_moved = 0;
+   options->markup_moved = 0;
 }
 
 static int opt_arg(const char *arg, int arg_short, const char *arg_long, int 
*val)
@@ -4796,16 +4796,16 @@ int diff_opt_parse(struct diff_options *options,
options->use_color = 0;
else if (!strcmp(arg, "--color-moved")) {
if (diff_color_moved_default)
-   options->color_moved = diff_color_moved_default;
-   if (options->color_moved == COLOR_MOVED_NO)
-   options->color_moved = COLOR_MOVED_DEFAULT;
+   options->markup_moved = diff_color_moved_default;
+   if (options->markup_moved == COLOR_MOVED_NO)
+   options->markup_moved = COLOR_MOVED_DEFAULT;
} else if (!strcmp(arg, "--no-color-moved"))
-   options->color_moved = COLOR_MOVED_NO;
+   options->markup_moved = COLOR_MOVED_NO;
else if (skip_prefix(arg, "--color-moved=", )) {
int cm = parse_color_moved(arg);
if (cm < 0)
die("bad --color-moved argument: %s", arg);
-   options->color_moved = cm;
+   options->markup_moved = cm;
} else if (skip_to_optional_arg_default(arg, "--color-words", 
>word_regex, NULL)) {
options->use_color = 1;
options->word_diff = DIFF_WORDS_COLOR;
@@ -5623,7 +5623,7 @@ static void diff_flush_patch_all_file_pairs(struct 
diff_options *o)
if (WSEH_NEW & WS_RULE_MASK)
BUG("WS rules bit mask overlaps with diff symbol flags");
 
-   if (o->color_moved)
+   if (o->markup_moved)
o->emitted_symbols = 
 
for (i = 0; i < q->nr; i++) {
@@ -5633,7 +5633,7 @@ static void diff_flush_patch_all_file_pairs(struct 
diff_options *o)
}
 
if (o->emitted_symbols) {
-   if (o->color_moved) {
+   if (o->markup_moved) {
struct hashmap add_lines, del_lines;
 
hashmap_init(_lines,
@@ -5643,7 +5643,7 @@ static void diff_flush_patch_all_file_pairs(struct 
diff_options *o)
 
add_lines_to_move_detection(o, _lines, _lines);
mark_color_as_moved(o, _lines, _lines);
-   if (o->color_moved == COLOR_MOVED_ZEBRA_DIM)
+   if (o->markup_moved == COLOR_MOVED_ZEBRA_DIM)
dim_moved_lines(o);
 
hashmap_free(_lines, 0);
@@ -5731,7 +5731,7 @@ void diff_flush(struct diff_options *options)
fclose(options->file);
options->file = xfopen("/dev/null", "w");
options->close_file = 1;
-   options->color_moved = 0;
+   options->markup_moved = 0;
for (i = 0; i < q->nr; i++) {
struct diff_filepair *p = q->queue[i];
if

[PATCH 0/7] improve range-diffs coloring and [RFC] move detection

This builds on top of sb/range-diff-colors, which builds on js/range-diff.
As Thomas seemed happy with the range-diff and said
"and fwiw I didn't think there's anything major that needs to be addressed."
I found it reasonable to build more on top of those series.
[1] https://public-inbox.org/git/20180729215053.ge9...@hank.intra.tgummerer.com/
It can also be found via
  git fetch https://github.com/stefanbeller/git moved-in-patches

This series provides for 2 goals, which share a common base refactoring:

The refactoring part is to allow the diff machinery to prefix
old/new/context lines with prefixes as chosen by the user.

The first feature is to slightly adapt coloring of range-diff to have the
file markers

  --- a/diff.c
  +++ b/diff.c
  
be shown as context color. This is done by using custom prefixes for
old/new/context lines in the inner diffs and indenting all other lines
instead by a white space indicating it to be context.

The second feature is more RFC-ish in nature and just exposes the
mechanism to the user in a meaning-ful way.
All of the diff family (including format-patch) as well as apply/am
can use */~ instead of +/- for moved lines of code.

Thanks,
Stefan

Stefan Beller (7):
  diff.c: emit_line_0 to take string instead of first sign
  diff.c: add --output-indicator-{new, old, context}
  range-diff: make use of different output indicators
  range-diff: indent special lines as context
  diff.c: rename color_moved to markup_moved
  diff.c: factor determine_line_color out of
emit_diff_symbol_from_struct
  diff/am: enhance diff format to use */~ for moved lines

 apply.c   |  12 +++
 diff.c| 180 ++
 diff.h|  10 ++-
 range-diff.c  |  17 +++-
 t/t3206-range-diff.sh |  12 +--
 5 files changed, 153 insertions(+), 78 deletions(-)

-- 
2.18.0.597.ga71716f1ad-goog

[PATCH 3/7] range-diff: make use of different output indicators

This change itself only changes the internal communication and should
have no visible effect to the user. We instruct the diff code that produces
the inner diffs to use X, Y, Z instead of the usual markers for new, old
and context lines

Signed-off-by: Stefan Beller 
---
 range-diff.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/range-diff.c b/range-diff.c
index 347b4a79f25..a4ff945427e 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -38,6 +38,9 @@ static int read_patches(const char *range, struct string_list 
*list)
 
argv_array_pushl(, "log", "--no-color", "-p", "--no-merges",
"--reverse", "--date-order", "--decorate=no",
+   "--output-indicator-new=X",
+   "--output-indicator-old=Y",
+   "--output-indicator-context=Z",
"--no-abbrev-commit", range,
NULL);
cp.out = -1;
@@ -108,8 +111,18 @@ static int read_patches(const char *range, struct 
string_list *list)
 * we are not interested.
 */
continue;
-   else
+   else if (line.buf[0] == 'X') {
+   strbuf_addch(, '+');
+   strbuf_add(, line.buf + 1, line.len - 1);
+   } else if (line.buf[0] == 'Y') {
+   strbuf_addch(, '-');
+   strbuf_add(, line.buf + 1, line.len - 1);
+   } else if (line.buf[0] == 'Z') {
+   strbuf_addch(, ' ');
+   strbuf_add(, line.buf + 1, line.len - 1);
+   } else {
strbuf_addbuf(, );
+   }
 
strbuf_addch(, '\n');
util->diffsize++;
-- 
2.18.0.597.ga71716f1ad-goog

[PATCH 4/7] range-diff: indent special lines as context

The range-diff coloring is a bit fuzzy when it comes to special lines of
a diff, such as indicating new and old files with +++ and ---, as it
would pickup the first character and interpret it for its coloring, which
seems annoying as in regular diffs, these lines are colored bold via
DIFF_METAINFO.

By indenting these lines by a white space, they will be treated as context
which is much more useful, an example [1] on the range diff series itself:

[...]
+ diff --git a/Documentation/git-range-diff.txt 
b/Documentation/git-range-diff.txt
+ new file mode 100644
+ --- /dev/null
+ +++ b/Documentation/git-range-diff.txt
+@@
++git-range-diff(1)
[...]
+
  diff --git a/Makefile b/Makefile
  --- a/Makefile
  +++ b/Makefile
[...]

The first lines that introduce the new file for the man page will have the
'+' sign colored and the rest of the line will be bold.

The later lines that indicate a change to the Makefile will be treated as
context both in the outer and inner diff, such that those lines stay
regular color.

[1] ./git-range-diff pr-1/dscho/branch-diff-v3...pr-1/dscho/branch-diff-v4
These tags are found at https://github.com/gitgitgadget/git

Signed-off-by: Stefan Beller 
---
 range-diff.c  |  2 ++
 t/t3206-range-diff.sh | 12 ++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index a4ff945427e..91d5f12180d 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -85,6 +85,7 @@ static int read_patches(const char *range, struct string_list 
*list)
strbuf_addch(, '\n');
if (!util->diff_offset)
util->diff_offset = buf.len;
+   strbuf_addch(, ' ');
strbuf_addbuf(, );
} else if (in_header) {
if (starts_with(line.buf, "Author: ")) {
@@ -121,6 +122,7 @@ static int read_patches(const char *range, struct 
string_list *list)
strbuf_addch(, ' ');
strbuf_add(, line.buf + 1, line.len - 1);
} else {
+   strbuf_addch(, ' ');
strbuf_addbuf(, );
}
 
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index e3b0764b433..0cd23cbff41 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -133,9 +133,9 @@ test_expect_success 'changed message' '
Z
+Also a silly comment here!
+
-   Zdiff --git a/file b/file
-   Z--- a/file
-   Z+++ b/file
+   Z diff --git a/file b/file
+   Z --- a/file
+   Z +++ b/file
3:  147e64e = 3:  b9cb956 s/11/B/
4:  a63e992 = 4:  8add5f1 s/12/B/
EOF
@@ -152,9 +152,9 @@ test_expect_success 'dual-coloring' '
: 
:+Also a silly comment here!
:+
-   : diff --git a/file b/file
-   : --- a/file
-   : +++ b/file
+   :  diff --git a/file b/file
+   :  --- a/file
+   :  +++ b/file
:3:  0559556 ! 3:  
b9cb956 s/11/B/
:@@ -10,7 +10,7 @@
:  9
-- 
2.18.0.597.ga71716f1ad-goog

[PATCH 1/7] diff.c: emit_line_0 to take string instead of first sign

By providing a string as the first part of the emission we can extend
it later more easily.

While at it, document emit_line_0.

Signed-off-by: Stefan Beller 
---
 diff.c | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/diff.c b/diff.c
index 2bd4d3d6839..57a8a38755e 100644
--- a/diff.c
+++ b/diff.c
@@ -575,9 +575,15 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t 
*mf2,
ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
 }
 
+/*
+ * Emits
+ *   LF
+ * if they are present. 'first' is a NULL terminated string,
+ * 'second' is a buffer of length 'len'.
+ */
 static void emit_line_0(struct diff_options *o,
const char *set_sign, const char *set, const char 
*reset,
-   int first, const char *line, int len)
+   const char *first, const char *second, int len)
 {
int has_trailing_newline, has_trailing_carriage_return;
int reverse = !!set && !!set_sign;
@@ -587,11 +593,11 @@ static void emit_line_0(struct diff_options *o,
 
fputs(diff_line_prefix(o), file);
 
-   has_trailing_newline = (len > 0 && line[len-1] == '\n');
+   has_trailing_newline = (len > 0 && second[len-1] == '\n');
if (has_trailing_newline)
len--;
 
-   has_trailing_carriage_return = (len > 0 && line[len-1] == '\r');
+   has_trailing_carriage_return = (len > 0 && second[len-1] == '\r');
if (has_trailing_carriage_return)
len--;
 
@@ -609,7 +615,7 @@ static void emit_line_0(struct diff_options *o,
}
 
if (first)
-   fputc(first, file);
+   fputs(first, file);
 
if (!len)
goto end_of_line;
@@ -620,7 +626,7 @@ static void emit_line_0(struct diff_options *o,
fputs(set, file);
needs_reset = 1;
}
-   fwrite(line, len, 1, file);
+   fwrite(second, len, 1, file);
needs_reset |= len > 0;
 
 end_of_line:
@@ -635,7 +641,7 @@ static void emit_line_0(struct diff_options *o,
 static void emit_line(struct diff_options *o, const char *set, const char 
*reset,
  const char *line, int len)
 {
-   emit_line_0(o, set, NULL, reset, 0, line, len);
+   emit_line_0(o, set, NULL, reset, NULL, line, len);
 }
 
 enum diff_symbol {
@@ -996,7 +1002,7 @@ static void dim_moved_lines(struct diff_options *o)
 static void emit_line_ws_markup(struct diff_options *o,
const char *set_sign, const char *set,
const char *reset,
-   char sign, const char *line, int len,
+   const char *sign, const char *line, int len,
unsigned ws_rule, int blank_at_eof)
 {
const char *ws = NULL;
@@ -1039,7 +1045,7 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
context = diff_get_color_opt(o, DIFF_CONTEXT);
reset = diff_get_color_opt(o, DIFF_RESET);
putc('\n', o->file);
-   emit_line_0(o, context, NULL, reset, '\\',
+   emit_line_0(o, context, NULL, reset, "\\",
nneof, strlen(nneof));
break;
case DIFF_SYMBOL_SUBMODULE_HEADER:
@@ -1077,7 +1083,7 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
else if (c == '-')
set = diff_get_color_opt(o, DIFF_FILE_OLD);
}
-   emit_line_ws_markup(o, set_sign, set, reset, ' ', line, len,
+   emit_line_ws_markup(o, set_sign, set, reset, " ", line, len,
flags & (DIFF_SYMBOL_CONTENT_WS_MASK), 0);
break;
case DIFF_SYMBOL_PLUS:
@@ -1120,7 +1126,7 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
set = diff_get_color_opt(o, DIFF_CONTEXT_BOLD);
flags |= WS_IGNORE_FIRST_SPACE;
}
-   emit_line_ws_markup(o, set_sign, set, reset, '+', line, len,
+   emit_line_ws_markup(o, set_sign, set, reset, "+", line, len,
flags & DIFF_SYMBOL_CONTENT_WS_MASK,
flags & DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF);
break;
@@ -1163,7 +1169,7 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
else
set = diff_get_color_opt(o, DIFF_CONTEXT_DIM);
}
-   emit_line_ws_markup(o, set_sign, set, reset, '-', line, len,
+   emit_line_ws_markup(o, set_sign, set, reset, "-", line, len,
flags & DIFF_SYMBOL_CONTENT_WS_MASK, 0);
break;
case DIFF_SYMBOL_WORDS_PORCELAIN:
-- 
2.18.0.597.ga71716f1ad-goog

[PATCH 6/7] diff.c: factor determine_line_color out of emit_diff_symbol_from_struct

Factoring out code that determines color for both the new and old code
even saves some lines, though minuscule. However we introduce an offset
computation in the color array.

Signed-off-by: Stefan Beller 
---
 diff.c | 88 +-
 1 file changed, 44 insertions(+), 44 deletions(-)

diff --git a/diff.c b/diff.c
index d3829c7d086..56bab011df7 100644
--- a/diff.c
+++ b/diff.c
@@ -1028,6 +1028,48 @@ static void emit_line_ws_markup(struct diff_options *o,
}
 }
 
+static const char *determine_line_color(struct diff_options *o,
+   struct emitted_diff_symbol *eds)
+{
+   const char *set;
+   unsigned flags = eds->flags;
+
+   /*
+* To have these offsets work, we need to keep
+* DIFF_FILE_OLD_MOVED{_ALT, _ALT_DIM, DIM, _}
+* in the same order as their _NEW_ equivalents; we do not need
+* to care about DIFF_FILE_{NEW, OLD} and their relations to others.
+*/
+   const int off = (eds->s == DIFF_SYMBOL_PLUS) ?
+   DIFF_FILE_NEW_MOVED - DIFF_FILE_OLD_MOVED : 0;
+
+   switch (flags & (DIFF_SYMBOL_MOVED_LINE |
+DIFF_SYMBOL_MOVED_LINE_ALT |
+DIFF_SYMBOL_MOVED_LINE_UNINTERESTING)) {
+   case DIFF_SYMBOL_MOVED_LINE |
+DIFF_SYMBOL_MOVED_LINE_ALT |
+DIFF_SYMBOL_MOVED_LINE_UNINTERESTING:
+   set = diff_get_color_opt(o, DIFF_FILE_OLD_MOVED_ALT_DIM + off);
+   break;
+   case DIFF_SYMBOL_MOVED_LINE |
+DIFF_SYMBOL_MOVED_LINE_ALT:
+   set = diff_get_color_opt(o, DIFF_FILE_OLD_MOVED_ALT + off);
+   break;
+   case DIFF_SYMBOL_MOVED_LINE |
+DIFF_SYMBOL_MOVED_LINE_UNINTERESTING:
+   set = diff_get_color_opt(o, DIFF_FILE_OLD_MOVED_DIM + off);
+   break;
+   case DIFF_SYMBOL_MOVED_LINE:
+   set = diff_get_color_opt(o, DIFF_FILE_OLD_MOVED + off);
+   break;
+   default:
+   set = (eds->s == DIFF_SYMBOL_PLUS) ?
+   diff_get_color_opt(o, DIFF_FILE_NEW):
+   diff_get_color_opt(o, DIFF_FILE_OLD);
+   }
+   return set;
+}
+
 static void emit_diff_symbol_from_struct(struct diff_options *o,
 struct emitted_diff_symbol *eds)
 {
@@ -1089,28 +1131,7 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
flags & (DIFF_SYMBOL_CONTENT_WS_MASK), 0);
break;
case DIFF_SYMBOL_PLUS:
-   switch (flags & (DIFF_SYMBOL_MOVED_LINE |
-DIFF_SYMBOL_MOVED_LINE_ALT |
-DIFF_SYMBOL_MOVED_LINE_UNINTERESTING)) {
-   case DIFF_SYMBOL_MOVED_LINE |
-DIFF_SYMBOL_MOVED_LINE_ALT |
-DIFF_SYMBOL_MOVED_LINE_UNINTERESTING:
-   set = diff_get_color_opt(o, 
DIFF_FILE_NEW_MOVED_ALT_DIM);
-   break;
-   case DIFF_SYMBOL_MOVED_LINE |
-DIFF_SYMBOL_MOVED_LINE_ALT:
-   set = diff_get_color_opt(o, DIFF_FILE_NEW_MOVED_ALT);
-   break;
-   case DIFF_SYMBOL_MOVED_LINE |
-DIFF_SYMBOL_MOVED_LINE_UNINTERESTING:
-   set = diff_get_color_opt(o, DIFF_FILE_NEW_MOVED_DIM);
-   break;
-   case DIFF_SYMBOL_MOVED_LINE:
-   set = diff_get_color_opt(o, DIFF_FILE_NEW_MOVED);
-   break;
-   default:
-   set = diff_get_color_opt(o, DIFF_FILE_NEW);
-   }
+   set = determine_line_color(o, eds);
reset = diff_get_color_opt(o, DIFF_RESET);
if (!o->flags.dual_color_diffed_diffs)
set_sign = NULL;
@@ -1136,28 +1157,7 @@ static void emit_diff_symbol_from_struct(struct 
diff_options *o,
flags & DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF);
break;
case DIFF_SYMBOL_MINUS:
-   switch (flags & (DIFF_SYMBOL_MOVED_LINE |
-DIFF_SYMBOL_MOVED_LINE_ALT |
-DIFF_SYMBOL_MOVED_LINE_UNINTERESTING)) {
-   case DIFF_SYMBOL_MOVED_LINE |
-DIFF_SYMBOL_MOVED_LINE_ALT |
-DIFF_SYMBOL_MOVED_LINE_UNINTERESTING:
-   set = diff_get_color_opt(o, 
DIFF_FILE_OLD_MOVED_ALT_DIM);
-   break;
-   case DIFF_SYMBOL_MOVED_LINE |
-DIFF_SYMBOL_MOVED_LINE_ALT:
-   set = diff_get_color_opt(o, DIFF_FILE_OLD_MOVED_ALT);
-   break;
-   case DIFF_SYMBOL_MOVED_LINE |
-DIFF_SYMBOL_MOVED_LINE_UNINTERESTING:
-   set

Re: [PATCH] t4150: fix broken test for am --scissors

2018-08-03 Thread Andrei Rybak

On 2018-08-04 01:04, Junio C Hamano wrote:
> Hmph, I am not quite sure what is going on.  Is the only bug in the
> original that scissors-patch.eml and no-scissors-patch.eml files were
> incorrectly named?  IOW, if we fed no-scissors-patch.eml (which has
> a scissors line in it) with --scissors option to am, would it have
> worked just fine without other changes in this patch?

Just swapping eml files wouldn't be enough, because in old tests the
prepared commits touch different files: scissors-file and
no-scissors-file. And since tests are about cutting/keeping commit
message, it doesn't make much sense to keep two eml files which differ
only in contents of their diffs. I'll try to reword the commit message
to also include this bit.

> I am not saying that we shouldn't make other changes or renaming the
> confusing .eml files.  I am just trying to understand what the
> nature of the breakage was.  For example, it is not immediately
> obvious why the new test needs to prepare the message _with_
> "Subject:" in front of the first line when it prepares the commit
> to be used for testing.
> 
>   ... goes back and thinks a bit ...
> 
> OK, the Subject: thing that appears after the scissors line serves
> as an in-body header to override the subject line of the e-mail
> itself.  That change is necessary to _drop_ the subject from the
> outer e-mail and replace it with the subject we do want to use.
> 
> So I can explain why "Subject:" thing needed to be added.

Yes, the adding of "Subject: " prefix is completely overlooked in the
commit message. I'll add explanation in re-send.

> I cannot still explain why a blank line needs to be removed after
> the scissors line, though.  We should be able to have blank lines
> before the in-body header, IIRC.

I'll double check this and restore the blank line in v2, if the
removal is not needed. IIRC, I removed it by accident and didn't
think too much of it.

Thank you for review.

Re: [PATCH] t7406: Make a test_must_fail call fail for the right reason

On Fri, Aug 3, 2018 at 7:14 PM Elijah Newren  wrote:
> A test making use of test_must_fail was failing like this:
>   fatal: ambiguous argument '|': unknown revision or path not in the working 
> tree.
> when the intent was to verify that a specific string was not found
> in the output of the git diff command, i.e. that grep returned
> non-zero.  Fix the test to do that.
> ---
> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> @@ -599,7 +599,7 @@ test_expect_success 'submodule update - update=none in 
> .git/config but --checkou
>  git diff --raw | grep "submodule" &&
>  git submodule update --checkout &&
> -test_must_fail git diff --raw \| grep "submodule" &&
> +git diff --raw | test_must_fail grep " submodule" &&

Unfortunately, this is a mis-use of test_must_fail() which is intended
only for Git commands; it does extra checking to ensure that the Git
command failed in a sane way (say, by returning a failing exit status)
rather than by crashing. It's not intended for use with system
commands which are assumed to be bug-free.

Having a Git command upstream of a pipe is also discouraged since the
pipe swallows its exit status, which means we won't know if the Git
command actually crashed. So, what you really want is this:

git diff --raw >out &&
! grep "" out &&

(where  is a literal TAB)

Since this script has a number of instances of Git commands upstream
pipes, it may not make sense to fix just this one. So, either a
preparatory cleanup patch could fix them all at once, and then this
patch could come on top or, if you don't want to fix all the pipe
cases, you could do this instead:

! git diff --raw | grep "" &&

Re: [PATCH] t3031: update test description to mention desired behavior

Elijah Newren  writes:

> This test description looks like it was written with the originally
> observed behavior ("causes segfault") rather than the desired and now
> current behavior ("does not cause segfault").  Fix it.

GOod.

>
> Signed-off-by: Elijah Newren 
> ---
>  t/t3031-merge-criscross.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/t3031-merge-criscross.sh b/t/t3031-merge-criscross.sh
> index e59b0a32d6..3824756a02 100755
> --- a/t/t3031-merge-criscross.sh
> +++ b/t/t3031-merge-criscross.sh
> @@ -88,7 +88,7 @@ test_expect_success 'setup repo with criss-cross history' '
>   git branch G
>  '
>  
> -test_expect_success 'recursive merge between F and G, causes segfault' '
> +test_expect_success 'recursive merge between F and G does not cause 
> segfault' '
>   git merge F
>  '

Re: [PATCH] t7406: Make a test_must_fail call fail for the right reason

On Fri, Aug 3, 2018 at 7:40 PM Eric Sunshine  wrote:
> git diff --raw >out &&
> ! grep "" out &&


> (where  is a literal TAB)
>
> Since this script has a number of instances of Git commands upstream
> pipes, it may not make sense to fix just this one. So, either a
> preparatory cleanup patch could fix them all at once, and then this
> patch could come on top or, if you don't want to fix all the pipe
> cases, you could do this instead:
>
> ! git diff --raw | grep "" &&

Of course, I mean "submodule".

Re: [PATCH] t7406: Make a test_must_fail call fail for the right reason

Elijah Newren  writes:

> A test making use of test_must_fail was failing like this:
>   fatal: ambiguous argument '|': unknown revision or path not in the working 
> tree.
> when the intent was to verify that a specific string was not found
> in the output of the git diff command, i.e. that grep returned
> non-zero.  Fix the test to do that.
> ---
>  t/t7406-submodule-update.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> index f604ef7a72..7be8b59569 100755
> --- a/t/t7406-submodule-update.sh
> +++ b/t/t7406-submodule-update.sh
> @@ -599,7 +599,7 @@ test_expect_success 'submodule update - update=none in 
> .git/config but --checkou
>) &&
>git diff --raw | grep "submodule" &&
>git submodule update --checkout &&
> -  test_must_fail git diff --raw \| grep "submodule" &&
> +  git diff --raw | test_must_fail grep " submodule" &&

Good spotting, but a few comments.

 * I've seen "do not have 'git' upstream on a pipee (it would hide
   the exit status from an unexpected failure of 'git') recently.
   We probably want to do the same.

 * We do not use test_must_fail for non-git things, as we are not in
   the business of protecting us from unexpected segfault of system
   binaries like grep.

So an immediate improvement for this line would be

! git diff --raw | grep " submodule" &&

and longer-term clean-up would aim for

git diff --raw >differs &&
! grep " submodule" &&

or something like that.  I suspect that --raw may want to be updated
to --name-only or something, as I do not see the tests using the
object names hence no strong need for using --raw format.

[PATCH] t7406: Make a test_must_fail call fail for the right reason

2018-08-03 Thread Elijah Newren

A test making use of test_must_fail was failing like this:
  fatal: ambiguous argument '|': unknown revision or path not in the working 
tree.
when the intent was to verify that a specific string was not found
in the output of the git diff command, i.e. that grep returned
non-zero.  Fix the test to do that.
---
 t/t7406-submodule-update.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index f604ef7a72..7be8b59569 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -599,7 +599,7 @@ test_expect_success 'submodule update - update=none in 
.git/config but --checkou
 ) &&
 git diff --raw | grep "submodule" &&
 git submodule update --checkout &&
-test_must_fail git diff --raw \| grep "submodule" &&
+git diff --raw | test_must_fail grep " submodule" &&
 (cd submodule &&
  test_must_fail compare_head
 ) &&
-- 
2.18.0.548.gcd835b18f7

[PATCH] t3031: update test description to mention desired behavior

2018-08-03 Thread Elijah Newren

This test description looks like it was written with the originally
observed behavior ("causes segfault") rather than the desired and now
current behavior ("does not cause segfault").  Fix it.

Signed-off-by: Elijah Newren 
---
 t/t3031-merge-criscross.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t3031-merge-criscross.sh b/t/t3031-merge-criscross.sh
index e59b0a32d6..3824756a02 100755
--- a/t/t3031-merge-criscross.sh
+++ b/t/t3031-merge-criscross.sh
@@ -88,7 +88,7 @@ test_expect_success 'setup repo with criss-cross history' '
git branch G
 '
 
-test_expect_success 'recursive merge between F and G, causes segfault' '
+test_expect_success 'recursive merge between F and G does not cause segfault' '
git merge F
 '
 
-- 
2.18.0.548.gcd835b18f7

Re: [PATCH] t4150: fix broken test for am --scissors

Andrei Rybak  writes:

> Tests for "git am --[no-]scissors" [1] work in the following way:
>
>  1. Create files with commit messages
>  2. Use these files to create expected commits
>  3. Generate eml file with patch from expected commits
>  4. Create commits using git am with these eml files
>  5. Compare these commits with expected
>
> The test for "git am --scissors" is supposed to take a message with a
> scissors line above commit message and demonstrate that only the text
> below the scissors line is included in the commit created by invocation
> of "git am --scissors".  However, the setup of the test uses commits
> without the scissors line in the commit message, therefore creating an
> eml file without scissors line.
>
> This can be checked by intentionally breaking is_scissors_line function
> in mailinfo.c. Test t4150-am.sh should fail, but does not.
>
> Fix broken test by generating only one eml file--with scissors line, and
> by using it both for --scissors and --no-scissors. To clarify the
> intention of the test, give files and tags more explicit names.
>
> [1]: introduced in bf72ac17d (t4150: tests for am --[no-]scissors,
>  2015-07-19)
>
> Signed-off-by: Andrei Rybak 

Thanks for digging and fixing.

> diff --git a/t/t4150-am.sh b/t/t4150-am.sh
> index e9b6f8158..23e3b0e91 100755
> --- a/t/t4150-am.sh
> +++ b/t/t4150-am.sh
> @@ -67,17 +67,18 @@ test_expect_success 'setup: messages' '
>  
>   EOF
>  
> - cat >scissors-msg <<-\EOF &&
> - Test git-am with scissors line
> + cat >msg-without-scissors-line <<-\EOF &&
> + Test that git-am --scissors cuts at the scissors line
>  
>   This line should be included in the commit message.
>   EOF
>  
> - cat - scissors-msg >no-scissors-msg <<-\EOF &&
> + printf "Subject: " >subject-prefix &&
> +
> + cat - subject-prefix msg-without-scissors-line >msg-with-scissors-line 
> <<-\EOF &&
>   This line should not be included in the commit message with --scissors 
> enabled.
>  
>- - >8 - - remove everything above this line - - >8 - -
> -
>   EOF

So... the original prepared a scissors-msg file that does not have a
line with the scissors sign, and used it to create no-scissors-msg
file that does have a line with the scissors sign?  That does sound
like a backward way to name these files X-<.  And the renamed result
obviously looks much saner.

I see two more differences; the new one has "Subject: " in front of
the first line, and also it lacks a blank line immediately after the
scissors line.  Do these differences matter, a reader wonders, and
reads on...

> @@ -150,18 +151,17 @@ test_expect_success setup '
>   } >patch1-hg.eml &&
>  
>  
> - echo scissors-file >scissors-file &&
> - git add scissors-file &&
> - git commit -F scissors-msg &&
> - git tag scissors &&
> - git format-patch --stdout scissors^ >scissors-patch.eml &&

OK, the old one created the message with formta-patch, so it
shouldn't have "Subject: " there to begin with.  How does the new
one work, a reader now wonders, and reads on...

> + echo file >file &&
> + git add file &&
> + git commit -F msg-without-scissors-line &&
> + git tag scissors-used &&

So far, the commit created is more or less the same from the original,
but we do not yet create an e-mail message yet.

>   git reset --hard HEAD^ &&
>  
> - echo no-scissors-file >no-scissors-file &&
> - git add no-scissors-file &&
> - git commit -F no-scissors-msg &&
> - git tag no-scissors &&
> - git format-patch --stdout no-scissors^ >no-scissors-patch.eml &&

The old one committed with scissors and told format-patch to create
an e-mail, which does not make much sense to me, but I guess users
are allowed to do this.

> + echo file >file &&
> + git add file &&
> + git commit -F msg-with-scissors-line &&
> + git tag scissors-not-used &&
> + git format-patch --stdout scissors-not-used^ 
> >patch-with-scissors-line.eml &&

Now we get an e-mail out of the system, presumably with a line with
scissors marker on it in the log message.

>   git reset --hard HEAD^ &&
>  
>   sed -n -e "3,\$p" msg >file &&
> @@ -418,10 +418,10 @@ test_expect_success 'am --scissors cuts the message at 
> the scissors line' '
>   rm -fr .git/rebase-apply &&
>   git reset --hard &&
>   git checkout second &&
> - git am --scissors scissors-patch.eml &&
> + git am --scissors patch-with-scissors-line.eml &&
>   test_path_is_missing .git/rebase-apply &&
> - git diff --exit-code scissors &&
> - test_cmp_rev scissors HEAD
> + git diff --exit-code scissors-used &&
> + test_cmp_rev scissors-used HEAD

Hmph, I am not quite sure what is going on.  Is the only bug in the
original that scissors-patch.eml and no-scissors-patch.eml files were
incorrectly named?  IOW, if we fed no-scissors-patch.eml (which has
a scissors line in it) with --scissors option to am, would it have
worked just fine without

Re: [PATCH] add a script to diff rendered documentation

On Fri, Aug 3, 2018 at 5:47 PM Junio C Hamano  wrote:
> Eric Sunshine  writes:
> > There doesn't seem to a usage() function defined anywhere (and
> > OPTIONS_SPEC doesn't seem to be used).
>
> Isn't this using the parse-options thing git-sh-setup gives us for
> free?

Yes. I saw that git-sh-setup was being dot-sourced and then promptly
forgot about it. Peff corrected me.

[PATCH] t4150: fix broken test for am --scissors

2018-08-03 Thread Andrei Rybak

Tests for "git am --[no-]scissors" [1] work in the following way:

 1. Create files with commit messages
 2. Use these files to create expected commits
 3. Generate eml file with patch from expected commits
 4. Create commits using git am with these eml files
 5. Compare these commits with expected

The test for "git am --scissors" is supposed to take a message with a
scissors line above commit message and demonstrate that only the text
below the scissors line is included in the commit created by invocation
of "git am --scissors".  However, the setup of the test uses commits
without the scissors line in the commit message, therefore creating an
eml file without scissors line.

This can be checked by intentionally breaking is_scissors_line function
in mailinfo.c. Test t4150-am.sh should fail, but does not.

Fix broken test by generating only one eml file--with scissors line, and
by using it both for --scissors and --no-scissors. To clarify the
intention of the test, give files and tags more explicit names.

[1]: introduced in bf72ac17d (t4150: tests for am --[no-]scissors,
 2015-07-19)

Signed-off-by: Andrei Rybak 
---

Applies on top of 980a3d3dd (Merge branch 'pt/am-tests', 2015-08-03).
This patch is also available at

  https://github.com/rybak/git fix-am-scissors-test


 t/t4150-am.sh | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index e9b6f8158..23e3b0e91 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -67,17 +67,18 @@ test_expect_success 'setup: messages' '
 
EOF
 
-   cat >scissors-msg <<-\EOF &&
-   Test git-am with scissors line
+   cat >msg-without-scissors-line <<-\EOF &&
+   Test that git-am --scissors cuts at the scissors line
 
This line should be included in the commit message.
EOF
 
-   cat - scissors-msg >no-scissors-msg <<-\EOF &&
+   printf "Subject: " >subject-prefix &&
+
+   cat - subject-prefix msg-without-scissors-line >msg-with-scissors-line 
<<-\EOF &&
This line should not be included in the commit message with --scissors 
enabled.
 
 - - >8 - - remove everything above this line - - >8 - -
-
EOF
 
signoff="Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
@@ -150,18 +151,17 @@ test_expect_success setup '
} >patch1-hg.eml &&
 
 
-   echo scissors-file >scissors-file &&
-   git add scissors-file &&
-   git commit -F scissors-msg &&
-   git tag scissors &&
-   git format-patch --stdout scissors^ >scissors-patch.eml &&
+   echo file >file &&
+   git add file &&
+   git commit -F msg-without-scissors-line &&
+   git tag scissors-used &&
git reset --hard HEAD^ &&
 
-   echo no-scissors-file >no-scissors-file &&
-   git add no-scissors-file &&
-   git commit -F no-scissors-msg &&
-   git tag no-scissors &&
-   git format-patch --stdout no-scissors^ >no-scissors-patch.eml &&
+   echo file >file &&
+   git add file &&
+   git commit -F msg-with-scissors-line &&
+   git tag scissors-not-used &&
+   git format-patch --stdout scissors-not-used^ 
>patch-with-scissors-line.eml &&
git reset --hard HEAD^ &&
 
sed -n -e "3,\$p" msg >file &&
@@ -418,10 +418,10 @@ test_expect_success 'am --scissors cuts the message at 
the scissors line' '
rm -fr .git/rebase-apply &&
git reset --hard &&
git checkout second &&
-   git am --scissors scissors-patch.eml &&
+   git am --scissors patch-with-scissors-line.eml &&
test_path_is_missing .git/rebase-apply &&
-   git diff --exit-code scissors &&
-   test_cmp_rev scissors HEAD
+   git diff --exit-code scissors-used &&
+   test_cmp_rev scissors-used HEAD
 '
 
 test_expect_success 'am --no-scissors overrides mailinfo.scissors' '
@@ -429,10 +429,10 @@ test_expect_success 'am --no-scissors overrides 
mailinfo.scissors' '
git reset --hard &&
git checkout second &&
test_config mailinfo.scissors true &&
-   git am --no-scissors no-scissors-patch.eml &&
+   git am --no-scissors patch-with-scissors-line.eml &&
test_path_is_missing .git/rebase-apply &&
-   git diff --exit-code no-scissors &&
-   test_cmp_rev no-scissors HEAD
+   git diff --exit-code scissors-not-used &&
+   test_cmp_rev scissors-not-used HEAD
 '
 
 test_expect_success 'setup: new author and committer' '
-- 
2.18.0

Re: [PATCH 0/7] Resend of sb/submodule-update-in-c

Stefan Beller  writes:

> * Introduce new patch
>   "submodule--helper: replace connect-gitdir-workingtree by 
> ensure-core-worktree"
>   that resolves the conflict with earlier versions of this series with
>   sb/submodule-core-worktree
> * This series is based on master, which already contains 
>   sb/submodule-core-worktree

Thanks; as this is not a bugfix but a new way of implementing the
thing, it is good to base it on 'master'.

Will queue.

[PATCH 4/7] builtin/submodule--helper: store update_clone information in a struct

The information that is printed for update_submodules in
'submodule--helper update-clone' and consumed by 'git submodule update'
is stored as a string per submodule. This made sense at the time of
48308681b07 (git submodule update: have a dedicated helper for cloning,
2016-02-29), but as we want to migrate the rest of the submodule update
into C, we're better off having access to the raw information in a helper
struct.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 builtin/submodule--helper.c | 37 +++--
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 32f00ca6f87..40b94dd622e 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1446,6 +1446,12 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
return 0;
 }
 
+struct update_clone_data {
+   const struct submodule *sub;
+   struct object_id oid;
+   unsigned just_cloned;
+};
+
 struct submodule_update_clone {
/* index into 'list', the list of submodules to look into for cloning */
int current;
@@ -1465,8 +1471,9 @@ struct submodule_update_clone {
const char *recursive_prefix;
const char *prefix;
 
-   /* Machine-readable status lines to be consumed by git-submodule.sh */
-   struct string_list projectlines;
+   /* to be consumed by git-submodule.sh */
+   struct update_clone_data *update_clone;
+   int update_clone_nr; int update_clone_alloc;
 
/* If we want to stop as fast as possible and return an error */
unsigned quickstop : 1;
@@ -1480,7 +1487,7 @@ struct submodule_update_clone {
 #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \
NULL, NULL, NULL, \
-   STRING_LIST_INIT_DUP, 0, NULL, 0, 0}
+   NULL, 0, 0, 0, NULL, 0, 0, 0}
 
 
 static void next_submodule_warn_missing(struct submodule_update_clone *suc,
@@ -1574,10 +1581,12 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
strbuf_addf(, "%s/.git", ce->name);
needs_cloning = !file_exists(sb.buf);
 
-   strbuf_reset();
-   strbuf_addf(, "dummy %s %d\t%s\n",
-   oid_to_hex(>oid), needs_cloning, ce->name);
-   string_list_append(>projectlines, sb.buf);
+   ALLOC_GROW(suc->update_clone, suc->update_clone_nr + 1,
+  suc->update_clone_alloc);
+   oidcpy(>update_clone[suc->update_clone_nr].oid, >oid);
+   suc->update_clone[suc->update_clone_nr].just_cloned = needs_cloning;
+   suc->update_clone[suc->update_clone_nr].sub = sub;
+   suc->update_clone_nr++;
 
if (!needs_cloning)
goto cleanup;
@@ -1720,7 +1729,8 @@ static int git_update_clone_config(const char *var, const 
char *value,
 
 static int update_submodules(struct submodule_update_clone *suc)
 {
-   struct string_list_item *item;
+   int i;
+   struct strbuf sb = STRBUF_INIT;
 
run_processes_parallel(suc->max_jobs,
   update_clone_get_next_task,
@@ -1739,9 +1749,16 @@ static int update_submodules(struct 
submodule_update_clone *suc)
if (suc->quickstop)
return 1;
 
-   for_each_string_list_item(item, >projectlines)
-   fprintf(stdout, "%s", item->string);
+   for (i = 0; i < suc->update_clone_nr; i++) {
+   strbuf_addf(, "dummy %s %d\t%s\n",
+   oid_to_hex(>update_clone[i].oid),
+   suc->update_clone[i].just_cloned,
+   suc->update_clone[i].sub->path);
+   fprintf(stdout, "%s", sb.buf);
+   strbuf_reset();
+   }
 
+   strbuf_release();
return 0;
 }
 
-- 
2.18.0.132.g195c49a2227

[PATCH 6/7] submodule--helper: replace connect-gitdir-workingtree by ensure-core-worktree

e98317508c0 (submodule: ensure core.worktree is set after update,
2018-06-18) was overly aggressive in calling connect_work_tree_and_git_dir
as that ensures both the 'core.worktree' configuration is set as well as
setting up correct gitlink file pointing at the git directory.

We do not need to check for the gitlink in this part of the cmd_update
in git-submodule.sh, as the initial call to update-clone will have ensured
that. So we can reduce the work to only (check and potentially) set the
'core.worktree' setting.

While at it move the check from shell to C as that proves to be useful in
a follow up patch, as we do not need the 'name' in shell now.

Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c | 64 +++--
 git-submodule.sh|  7 ++--
 2 files changed, 42 insertions(+), 29 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 8b1088ab58a..e7635d5d9ab 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1964,6 +1964,45 @@ static int push_check(int argc, const char **argv, const 
char *prefix)
return 0;
 }
 
+static int ensure_core_worktree(int argc, const char **argv, const char 
*prefix)
+{
+   const struct submodule *sub;
+   const char *path;
+   char *cw;
+   struct repository subrepo;
+
+   if (argc != 2)
+   BUG("submodule--helper connect-gitdir-workingtree  
");
+
+   path = argv[1];
+
+   sub = submodule_from_path(the_repository, _oid, path);
+   if (!sub)
+   BUG("We could get the submodule handle before?");
+
+   if (repo_submodule_init(, the_repository, path))
+   die(_("could not get a repository handle for submodule '%s'"), 
path);
+
+   if (!repo_config_get_string(, "core.worktree", )) {
+   char *cfg_file, *abs_path;
+   const char *rel_path;
+   struct strbuf sb = STRBUF_INIT;
+
+   cfg_file = xstrfmt("%s/config", subrepo.gitdir);
+
+   abs_path = absolute_pathdup(path);
+   rel_path = relative_path(abs_path, subrepo.gitdir, );
+
+   git_config_set_in_file(cfg_file, "core.worktree", rel_path);
+
+   free(cfg_file);
+   free(abs_path);
+   strbuf_release();
+   }
+
+   return 0;
+}
+
 static int absorb_git_dirs(int argc, const char **argv, const char *prefix)
 {
int i;
@@ -2029,29 +2068,6 @@ static int check_name(int argc, const char **argv, const 
char *prefix)
return 0;
 }
 
-static int connect_gitdir_workingtree(int argc, const char **argv, const char 
*prefix)
-{
-   struct strbuf sb = STRBUF_INIT;
-   const char *name, *path;
-   char *sm_gitdir;
-
-   if (argc != 3)
-   BUG("submodule--helper connect-gitdir-workingtree  
");
-
-   name = argv[1];
-   path = argv[2];
-
-   strbuf_addf(, "%s/modules/%s", get_git_dir(), name);
-   sm_gitdir = absolute_pathdup(sb.buf);
-
-   connect_work_tree_and_git_dir(path, sm_gitdir, 0);
-
-   strbuf_release();
-   free(sm_gitdir);
-
-   return 0;
-}
-
 #define SUPPORT_SUPER_PREFIX (1<<0)
 
 struct cmd_struct {
@@ -2065,7 +2081,7 @@ static struct cmd_struct commands[] = {
{"name", module_name, 0},
{"clone", module_clone, 0},
{"update-clone", update_clone, 0},
-   {"connect-gitdir-workingtree", connect_gitdir_workingtree, 0},
+   {"ensure-core-worktree", ensure_core_worktree, 0},
{"relative-path", resolve_relative_path, 0},
{"resolve-relative-url", resolve_relative_url, 0},
{"resolve-relative-url-test", resolve_relative_url_test, 0},
diff --git a/git-submodule.sh b/git-submodule.sh
index 8caaf274e25..19d010eac06 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -535,6 +535,8 @@ cmd_update()
do
die_if_unmatched "$quickabort" "$sha1"
 
+   git submodule--helper ensure-core-worktree "$sm_path"
+
name=$(git submodule--helper name "$sm_path") || exit
if ! test -z "$update"
then
@@ -577,11 +579,6 @@ cmd_update()
die "$(eval_gettext "Unable to find current 
\${remote_name}/\${branch} revision in submodule path '\$sm_path'")"
fi
 
-   if ! $(git config -f "$(git rev-parse 
--git-common-dir)/modules/$name/config" core.worktree) 2>/dev/null
-   then
-   git submodule--helper connect-gitdir-workingtree 
"$name" "$sm_path"
-   fi
-
if test "$subsha1" != "$sha1" || test -n "$force"
then
subforce=$force
-- 
2.18.0.132.g195c49a2227

[PATCH 2/7] git-submodule.sh: rename unused variables

The 'mode' variable is not used in cmd_update for its original purpose,
rename it to 'dummy' as it only serves the purpose to abort quickly
documenting this knowledge.

The variable 'stage' is also not used any more in cmd_update, so remove it.

This went unnoticed as first each function used the commonly used
submodule listing, which was converted in 74703a1e4df (submodule: rewrite
`module_list` shell function in C, 2015-09-02). When cmd_update was
using its own function starting in 48308681b07 (git submodule update:
have a dedicated helper for cloning, 2016-02-29), its removal was missed.

A later patch in this series also touches the communication between
the submodule helper and git-submodule.sh, but let's have this as
a preparatory patch, as it eases the next patch, which stores the
raw data instead of the line printed for this communication.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 builtin/submodule--helper.c | 5 ++---
 git-submodule.sh| 4 ++--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index a3c4564c6c8..da700c88963 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1573,9 +1573,8 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
needs_cloning = !file_exists(sb.buf);
 
strbuf_reset();
-   strbuf_addf(, "%06o %s %d %d\t%s\n", ce->ce_mode,
-   oid_to_hex(>oid), ce_stage(ce),
-   needs_cloning, ce->name);
+   strbuf_addf(, "dummy %s %d\t%s\n",
+   oid_to_hex(>oid), needs_cloning, ce->name);
string_list_append(>projectlines, sb.buf);
 
if (!needs_cloning)
diff --git a/git-submodule.sh b/git-submodule.sh
index 5a58812645d..8caaf274e25 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -531,9 +531,9 @@ cmd_update()
"$@" || echo "#unmatched" $?
} | {
err=
-   while read -r mode sha1 stage just_cloned sm_path
+   while read -r quickabort sha1 just_cloned sm_path
do
-   die_if_unmatched "$mode" "$sha1"
+   die_if_unmatched "$quickabort" "$sha1"
 
name=$(git submodule--helper name "$sm_path") || exit
if ! test -z "$update"
-- 
2.18.0.132.g195c49a2227

[PATCH 1/7] git-submodule.sh: align error reporting for update mode to use path

All other error messages in cmd_update are reporting the submodule based
on its path, so let's do that for invalid update modes, too.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 git-submodule.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 8b5ad59bdee..5a58812645d 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -632,7 +632,7 @@ cmd_update()
must_die_on_failure=yes
;;
*)
-   die "$(eval_gettext "Invalid update mode 
'$update_module' for submodule '$name'")"
+   die "$(eval_gettext "Invalid update mode 
'$update_module' for submodule path '$path'")"
esac
 
if (sanitize_submodule_env; cd "$sm_path" && $command 
"$sha1")
-- 
2.18.0.132.g195c49a2227

[PATCH 5/7] builtin/submodule--helper: factor out method to update a single submodule

In a later patch we'll find this method handy.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 builtin/submodule--helper.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 40b94dd622e..8b1088ab58a 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1727,10 +1727,17 @@ static int git_update_clone_config(const char *var, 
const char *value,
return 0;
 }
 
+static void update_submodule(struct update_clone_data *ucd)
+{
+   fprintf(stdout, "dummy %s %d\t%s\n",
+   oid_to_hex(>oid),
+   ucd->just_cloned,
+   ucd->sub->path);
+}
+
 static int update_submodules(struct submodule_update_clone *suc)
 {
int i;
-   struct strbuf sb = STRBUF_INIT;
 
run_processes_parallel(suc->max_jobs,
   update_clone_get_next_task,
@@ -1749,16 +1756,9 @@ static int update_submodules(struct 
submodule_update_clone *suc)
if (suc->quickstop)
return 1;
 
-   for (i = 0; i < suc->update_clone_nr; i++) {
-   strbuf_addf(, "dummy %s %d\t%s\n",
-   oid_to_hex(>update_clone[i].oid),
-   suc->update_clone[i].just_cloned,
-   suc->update_clone[i].sub->path);
-   fprintf(stdout, "%s", sb.buf);
-   strbuf_reset();
-   }
+   for (i = 0; i < suc->update_clone_nr; i++)
+   update_submodule(>update_clone[i]);
 
-   strbuf_release();
return 0;
 }
 
-- 
2.18.0.132.g195c49a2227

[PATCH 7/7] submodule--helper: introduce new update-module-mode helper

This chews off a bit of the shell part of the update command in
git-submodule.sh. When writing the C code, keep in mind that the
submodule--helper part will go away eventually and we want to have
a C function that is able to determine the submodule update strategy,
it as a nicety, make determine_submodule_update_strategy accessible
for arbitrary repositories.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 builtin/submodule--helper.c | 61 +
 git-submodule.sh| 16 +-
 2 files changed, 62 insertions(+), 15 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index e7635d5d9ab..e72157664f5 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1446,6 +1446,66 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
return 0;
 }
 
+static void determine_submodule_update_strategy(struct repository *r,
+   int just_cloned,
+   const char *path,
+   const char *update,
+   struct 
submodule_update_strategy *out)
+{
+   const struct submodule *sub = submodule_from_path(r, _oid, path);
+   char *key;
+   const char *val;
+
+   key = xstrfmt("submodule.%s.update", sub->name);
+
+   if (update) {
+   trace_printf("parsing update");
+   if (parse_submodule_update_strategy(update, out) < 0)
+   die(_("Invalid update mode '%s' for submodule path 
'%s'"),
+   update, path);
+   } else if (!repo_config_get_string_const(r, key, )) {
+   if (parse_submodule_update_strategy(val, out) < 0)
+   die(_("Invalid update mode '%s' configured for 
submodule path '%s'"),
+   val, path);
+   } else if (sub->update_strategy.type != SM_UPDATE_UNSPECIFIED) {
+   trace_printf("loaded thing");
+   out->type = sub->update_strategy.type;
+   out->command = sub->update_strategy.command;
+   } else
+   out->type = SM_UPDATE_CHECKOUT;
+
+   if (just_cloned &&
+   (out->type == SM_UPDATE_MERGE ||
+out->type == SM_UPDATE_REBASE ||
+out->type == SM_UPDATE_NONE))
+   out->type = SM_UPDATE_CHECKOUT;
+
+   free(key);
+}
+
+static int module_update_module_mode(int argc, const char **argv, const char 
*prefix)
+{
+   const char *path, *update = NULL;
+   int just_cloned;
+   struct submodule_update_strategy update_strategy = { .type = 
SM_UPDATE_CHECKOUT };
+
+   if (argc < 3 || argc > 4)
+   die("submodule--helper update-module-clone expects 
  []");
+
+   just_cloned = git_config_int("just_cloned", argv[1]);
+   path = argv[2];
+
+   if (argc == 4)
+   update = argv[3];
+
+   determine_submodule_update_strategy(the_repository,
+   just_cloned, path, update,
+   _strategy);
+   fputs(submodule_strategy_to_string(_strategy), stdout);
+
+   return 0;
+}
+
 struct update_clone_data {
const struct submodule *sub;
struct object_id oid;
@@ -2080,6 +2140,7 @@ static struct cmd_struct commands[] = {
{"list", module_list, 0},
{"name", module_name, 0},
{"clone", module_clone, 0},
+   {"update-module-mode", module_update_module_mode, 0},
{"update-clone", update_clone, 0},
{"ensure-core-worktree", ensure_core_worktree, 0},
{"relative-path", resolve_relative_path, 0},
diff --git a/git-submodule.sh b/git-submodule.sh
index 19d010eac06..19c9f1215e1 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -537,27 +537,13 @@ cmd_update()
 
git submodule--helper ensure-core-worktree "$sm_path"
 
-   name=$(git submodule--helper name "$sm_path") || exit
-   if ! test -z "$update"
-   then
-   update_module=$update
-   else
-   update_module=$(git config submodule."$name".update)
-   if test -z "$update_module"
-   then
-   update_module="checkout"
-   fi
-   fi
+   update_module=$(git submodule--helper update-module-mode 
$just_cloned "$sm_path" $update)
 
displaypath=$(git submodule--helper relative-path 
"$prefix$sm_path" "$wt_prefix")
 
if test $just_cloned -eq 1
then
subsha1=
-   case "$update_module" in
-   merge | rebase | none)
-   update_module=checkout ;;
-   esac
else

[PATCH 0/7] Resend of sb/submodule-update-in-c

* Introduce new patch
  "submodule--helper: replace connect-gitdir-workingtree by 
ensure-core-worktree"
  that resolves the conflict with earlier versions of this series with
  sb/submodule-core-worktree
* This series is based on master, which already contains 
  sb/submodule-core-worktree
  
Thanks,
Stefan

Stefan Beller (7):
  git-submodule.sh: align error reporting for update mode to use path
  git-submodule.sh: rename unused variables
  builtin/submodule--helper: factor out submodule updating
  builtin/submodule--helper: store update_clone information in a struct
  builtin/submodule--helper: factor out method to update a single
submodule
  submodule--helper: replace connect-gitdir-workingtree by
ensure-core-worktree
  submodule--helper: introduce new update-module-mode helper

 builtin/submodule--helper.c | 216 ++--
 git-submodule.sh|  29 +
 2 files changed, 164 insertions(+), 81 deletions(-)

  ./git-range-diff origin/sb/submodule-update-in-c...HEAD
  [...]
  -:  --- > 338:  1d89318c48d Fifth batch for 2.19 cycle
  1:  c1cb423b249 = 339:  3090cbcb46e git-submodule.sh: align error reporting 
for update mode to use path
  2:  f188b30a9b9 = 340:  850a16e9085 git-submodule.sh: rename unused variables
  3:  70d84fa6a09 ! 341:  88af0cdcba6 builtin/submodule--helper: factor out 
submodule updating
@@ -73,10 +73,10 @@
};
suc.prefix = prefix;
  
--  config_from_gitmodules(gitmodules_update_clone_config, _jobs);
--  git_config(gitmodules_update_clone_config, _jobs);
-+  config_from_gitmodules(gitmodules_update_clone_config, _jobs);
-+  git_config(gitmodules_update_clone_config, _jobs);
+-  update_clone_config_from_gitmodules(_jobs);
+-  git_config(git_update_clone_config, _jobs);
++  update_clone_config_from_gitmodules(_jobs);
++  git_config(git_update_clone_config, _jobs);
  
argc = parse_options(argc, argv, prefix, module_update_clone_options,
 git_submodule_helper_usage, 0);
  4:  511e8a139c9 = 342:  2fdd479a6d5 builtin/submodule--helper: store 
update_clone information in a struct
  5:  65b2a720b90 = 343:  34589e724b3 builtin/submodule--helper: factor out 
method to update a single submodule
  -:  --- > 344:  ee2bb4f23d8 submodule--helper: replace 
connect-gitdir-workingtree by ensure-core-worktree
  6:  e5803d07f9b ! 345:  03300626ba7 submodule--helper: introduce new 
update-module-mode helper
@@ -88,15 +88,15 @@
{"clone", module_clone, 0},
 +  {"update-module-mode", module_update_module_mode, 0},
{"update-clone", update_clone, 0},
+   {"ensure-core-worktree", ensure_core_worktree, 0},
{"relative-path", resolve_relative_path, 0},
-   {"resolve-relative-url", resolve_relative_url, 0},
 
 diff --git a/git-submodule.sh b/git-submodule.sh
 --- a/git-submodule.sh
 +++ b/git-submodule.sh
 @@
-   do
-   die_if_unmatched "$quickabort" "$sha1"
+ 
+   git submodule--helper ensure-core-worktree "$sm_path"
  
 -  name=$(git submodule--helper name "$sm_path") || exit
 -  if ! test -z "$update"

[PATCH 3/7] builtin/submodule--helper: factor out submodule updating

Separate the command line parsing from the actual execution of the command
within the repository. For now there is not a lot of execution as
most of it is still in git-submodule.sh.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 builtin/submodule--helper.c | 59 +
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index da700c88963..32f00ca6f87 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1474,6 +1474,8 @@ struct submodule_update_clone {
/* failed clones to be retried again */
const struct cache_entry **failed_clones;
int failed_clones_nr, failed_clones_alloc;
+
+   int max_jobs;
 };
 #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \
@@ -1716,11 +1718,36 @@ static int git_update_clone_config(const char *var, 
const char *value,
return 0;
 }
 
+static int update_submodules(struct submodule_update_clone *suc)
+{
+   struct string_list_item *item;
+
+   run_processes_parallel(suc->max_jobs,
+  update_clone_get_next_task,
+  update_clone_start_failure,
+  update_clone_task_finished,
+  suc);
+
+   /*
+* We saved the output and put it out all at once now.
+* That means:
+* - the listener does not have to interleave their (checkout)
+*   work with our fetching.  The writes involved in a
+*   checkout involve more straightforward sequential I/O.
+* - the listener can avoid doing any work if fetching failed.
+*/
+   if (suc->quickstop)
+   return 1;
+
+   for_each_string_list_item(item, >projectlines)
+   fprintf(stdout, "%s", item->string);
+
+   return 0;
+}
+
 static int update_clone(int argc, const char **argv, const char *prefix)
 {
const char *update = NULL;
-   int max_jobs = 1;
-   struct string_list_item *item;
struct pathspec pathspec;
struct submodule_update_clone suc = SUBMODULE_UPDATE_CLONE_INIT;
 
@@ -1742,7 +1769,7 @@ static int update_clone(int argc, const char **argv, 
const char *prefix)
OPT_STRING(0, "depth", , "",
   N_("Create a shallow clone truncated to the "
  "specified number of revisions")),
-   OPT_INTEGER('j', "jobs", _jobs,
+   OPT_INTEGER('j', "jobs", _jobs,
N_("parallel jobs")),
OPT_BOOL(0, "recommend-shallow", _shallow,
N_("whether the initial clone should follow the 
shallow recommendation")),
@@ -1758,8 +1785,8 @@ static int update_clone(int argc, const char **argv, 
const char *prefix)
};
suc.prefix = prefix;
 
-   update_clone_config_from_gitmodules(_jobs);
-   git_config(git_update_clone_config, _jobs);
+   update_clone_config_from_gitmodules(_jobs);
+   git_config(git_update_clone_config, _jobs);
 
argc = parse_options(argc, argv, prefix, module_update_clone_options,
 git_submodule_helper_usage, 0);
@@ -1774,27 +1801,7 @@ static int update_clone(int argc, const char **argv, 
const char *prefix)
if (pathspec.nr)
suc.warn_if_uninitialized = 1;
 
-   run_processes_parallel(max_jobs,
-  update_clone_get_next_task,
-  update_clone_start_failure,
-  update_clone_task_finished,
-  );
-
-   /*
-* We saved the output and put it out all at once now.
-* That means:
-* - the listener does not have to interleave their (checkout)
-*   work with our fetching.  The writes involved in a
-*   checkout involve more straightforward sequential I/O.
-* - the listener can avoid doing any work if fetching failed.
-*/
-   if (suc.quickstop)
-   return 1;
-
-   for_each_string_list_item(item, )
-   fprintf(stdout, "%s", item->string);
-
-   return 0;
+   return update_submodules();
 }
 
 static int resolve_relative_path(int argc, const char **argv, const char 
*prefix)
-- 
2.18.0.132.g195c49a2227

RE: abstracting commit signing/verify to support other signing schemes

2018-08-03 Thread Randall S. Becker

On August 3, 2018 5:39 PM, Tacitus Aedifex wrote:
> I'm looking at the existing commit signing and verification integration and 
> it is
> all GPG specific. I'm interested in refactoring the code to have a generic
> signing/verifying interface so that "drivers"
> for other signing tools can be created and other signing tools can be used
> (e.g. OpenBSD signify).
> 
> The existing interface defined in gpg-interface.h is already fairly generic. 
> It
> looks like the only things that would need to be fixed are the names of some
> members in the signature_check struct and the GPG specific constants.
> 
> I propose to rename the gpg-interface.h file to signature-interface.h.
> There are several different ways to do the "polymorphism" needed to have a
> base signature_check struct with a tool-specific part for storing the tool-
> specific data (e.g. gpg_output, gpg_status, result). I'm looking for
> suggestions on the way this has been done in other places in the Git code so I
> can do it the same way. My initial impulse it to have a union of tool-specific
> structs inside of the signature_check struct.
> 
> The plan for changing the signing behavior is to change the code looking for
> commit.gpgsign in sequencer.c to instead look for commit.signtool.
> The string value will define which signing tool to use. The default will be 
> null
> which is the equivilent to gpgsign=false. To get GPG signing the user would
> set it to "gpg". To maintain backwards compatibility, the code will continue 
> to
> check for commit.gpgsign and translate that to commit.signtool=gpg and
> output a warning.
> 
> I also think that it makes sense to move the user.signingkey to be
> gpg.signingkey since that only makes sense in the context of GPG.
> 
> The real trick here is how to handle signatures from different tools in a 
> given
> project. I think the answer is to store the value of commit.signtool along 
> with
> the signature blob associted with each signed commit. That way the
> signature verification code can know which tool to use to verify the
> signature. If a commit has a signture but no tool selector, the default will 
> be
> to assume GPG to preserve backwards compatibility.

If I may suggest something a little off the wall... the abstraction needs to go 
beyond just the signing tool, but the whole signing infrastructure. I would 
like to see something along the lines of introducing a signing authority into 
the mix, so that not only the tool of signing is abstracted, but also the 
interface to who, if anyone, is responsible for signing. If I had my dream, it 
would be that one (or more) signing authorities would have potentially 
overlapping responsibilities for signing parts of the tree either on demand or 
by requirement.

So when a commit occurs, at least on master, or other designated branches, it 
may be the repository requires a signature from a particular authority, 
regardless of whether the committer has requested one. And there may be more 
than one authority or notary involved. Or, the repository could accept the 
signature of the committer as abstracted.

Where I'm going is that I would like to see a tighter integration with 
block-chain concepts in git. My customer base has very tight requirements for 
this type of software certification. Signatures, GPG or other, may only go so 
far. I am even considering whether particular parts of the tree are even 
visible (remember the Islands of Sparceness discussion?).

I expect to be able to contribute more to this conversation in a few months 
(current $NDA prohibition), if this goes anywhere.

My feature time machine window doesn't see this any time soon, if ever, but one 
never knows. I have my delusional hopes. 

Please take this as simply a suggestion for the long-term.

Cheers,
Randall

-- Brief whoami:
 NonStop developer since approximately 2112884442
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.

Re: abstracting commit signing/verify to support other signing schemes

On Fri, Aug 03, 2018 at 09:38:34PM +, Tacitus Aedifex wrote:

> I'm looking at the existing commit signing and verification
> integration and it is all GPG specific. I'm interested in refactoring
> the code to have a generic signing/verifying interface so that "drivers"
> for other signing tools can be created and other signing tools can be
> used (e.g. OpenBSD signify).
> [...]
> Any other thoughts and/or suggestions?

There's been some work on this lately. See this patch and the response
thread:

  https://public-inbox.org/git/20180409204129.43537-9-mastahy...@gmail.com/

One of the main complaints there was that it was doing just enough to
make gpgsm work, and it was unclear if some of the abstractions would be
insufficient for something like signify.

The more recent work focused on just doing the minimum to provide
gpg/gpgsm variants:

  https://public-inbox.org/git/cover.1531831244.git.henning.sch...@siemens.com/

That replaces the earlier patch series, and is currently merged to the
'next' branch and is on track to get merged to 'master' before Git 2.19
is released.

One of the downsides there is that if we eventually move to a generic
signing-tool config, we'd have to support two layers of historical
abstraction (the original "gpg.program" config, and the new
"gpg..*" config).

So _if_ we knew what it would take to support signify, we could
potentially adjust what's going into 2.19 in order to skip straight to
the more generic interface. But on the OTOH, it may not be worth
rushing, and there is already a vague plan of how the gpg..*
config would interact with a more generic config.

Anyway. Hopefully that gives you a sense of what the current state is,
and that work should answer the questions you asked about how to
approach the code changes.

-Peff

Re: [PATCH] add a script to diff rendered documentation

Eric Sunshine  writes:

>> +   shift; break ;;
>> +   *)
>> +   usage ;;
>
> There doesn't seem to a usage() function defined anywhere (and
> OPTIONS_SPEC doesn't seem to be used).

Isn't this using the parse-options thing git-sh-setup gives us for
free?

abstracting commit signing/verify to support other signing schemes

2018-08-03 Thread Tacitus Aedifex

I'm looking at the existing commit signing and verification
integration and it is all GPG specific. I'm interested in refactoring
the code to have a generic signing/verifying interface so that "drivers"
for other signing tools can be created and other signing tools can be
used (e.g. OpenBSD signify).

The existing interface defined in gpg-interface.h is already fairly
generic. It looks like the only things that would need to be fixed
are the names of some members in the signature_check struct and the GPG
specific constants.

I propose to rename the gpg-interface.h file to signature-interface.h.
There are several different ways to do the "polymorphism" needed to have
a base signature_check struct with a tool-specific part for storing the
tool-specific data (e.g. gpg_output, gpg_status, result). I'm looking
for suggestions on the way this has been done in other places in the Git
code so I can do it the same way. My initial impulse it to have a union
of tool-specific structs inside of the signature_check struct.

The plan for changing the signing behavior is to change the code looking
for commit.gpgsign in sequencer.c to instead look for commit.signtool.
The string value will define which signing tool to use. The default will
be null which is the equivilent to gpgsign=false. To get GPG
signing the user would set it to "gpg". To maintain backwards
compatibility, the code will continue to check for commit.gpgsign and
translate that to commit.signtool=gpg and output a warning.

I also think that it makes sense to move the user.signingkey to be
gpg.signingkey since that only makes sense in the context of GPG.

The real trick here is how to handle signatures from different tools in
a given project. I think the answer is to store the value of
commit.signtool along with the signature blob associted with each signed
commit. That way the signature verification code can know which tool to
use to verify the signature. If a commit has a signture but no tool
selector, the default will be to assume GPG to preserve backwards
compatibility.

Any other thoughts and/or suggestions?

//t??

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

Stefan Beller  writes:

> - The second part of having an immediate plan is *very* nice
>   to see, though I would argue that it could be improved
>   by having these updates in the thread instead of summarized
>   unrelated to that thread.
>
>   We do not do this for now due to tooling issues, I suppose.

I need the current draft of "What's cooking" message that I haven't
sent out (and the committed copy on 'todo' branch, which is a copy
of what was already sent out) to be the single place I go that
drives my day-to-day integration work anyway.  I don't have time to
jump around 30-70 discussion threads (assuming there only is one
active one per each topic in flight) to see what the last tentative
verdict I gave to each one of them before deciding which ones to
merge to 'next' in day's integration run.

You are welcome to split these pieces out of the "What's cooking"
report and spray them into the original message, or write a bot to
do so.

> Is there a better way to start a workflow discussion?

My take on it is not to discuss ways to add more to other's workload
to worsen the bottleneck in the first place, and instead try to do
things you would think you can do to help, see if it actually helps,
and then after that encourage discussit as a way for others to do to
help the project, perhaps?

Re: [PATCH] add a script to diff rendered documentation

On Fri, Aug 3, 2018 at 5:38 PM Jeff King  wrote:
> On Fri, Aug 03, 2018 at 05:33:17PM -0400, Eric Sunshine wrote:
> I suppose so. Frankly I only added that line to appease git-sh-options
> anyway.
> > Should "j" and "f" be "-j" and "-f", respectively?
> No, they're input to "rev-parse --parseopt".
> > There doesn't seem to a usage() function defined anywhere (and
> > OPTIONS_SPEC doesn't seem to be used).
> It's git-sh-setup automagic. Try "./doc-diff --foo"

Ah yes, I saw but then forgot that 'git-sh-setup' was sourced.

> > > +# We'll do both builds in a single worktree, which lets make reuse
> > > +# results that don't differ between the two trees.
> >
> > "which lets make reuse"?
>
> As in, lets the tool "make" reuse results...

Okay, I was confused by "make" being a verb, and thought you had made
a last-minute edit, rewriting "which makes it possible to reuse...",
and intending to say instead "which lets us reuse...". Had it been
formatted "which lets 'make' reuse...", I'd probably have read it
correctly. Not worth a re-roll.

Re: [PATCH] add a script to diff rendered documentation

On Fri, Aug 03, 2018 at 05:33:17PM -0400, Eric Sunshine wrote:

> > +OPTIONS_SPEC="\
> > +doc-diff   [-- diff options]
> 
> Should this be?
> 
> doc-diff []   [-- ]

I suppose so. Frankly I only added that line to appease git-sh-options
anyway.

> > +--
> > +j  parallel argument to pass to make
> > +f  force rebuild; do not rely on cached results
> > +"
> 
> Should "j" and "f" be "-j" and "-f", respectively?

No, they're input to "rev-parse --parseopt".

> > +while test $# -gt 0
> > +do
> > +   case "$1" in
> > +   -j)
> > +   parallel=${1#-j} ;;
> > +   -f)
> > +   force=t ;;
> > +   --)
> > +   shift; break ;;
> > +   *)
> > +   usage ;;
> 
> There doesn't seem to a usage() function defined anywhere (and
> OPTIONS_SPEC doesn't seem to be used).

It's git-sh-setup automagic. Try "./doc-diff --foo"

> > +# We'll do both builds in a single worktree, which lets make reuse
> > +# results that don't differ between the two trees.
> 
> "which lets make reuse"?

As in, lets the tool "make" reuse results...

-Peff

Re: [PATCH] add a script to diff rendered documentation

On Fri, Aug 3, 2018 at 4:52 PM Jeff King  wrote:
> [...]
> Let's provide a script that builds and installs the manpages
> for two commits, renders the results using "man", and diffs
> the result. Since this is time-consuming, we'll also do our
> best to avoid repeated work, keeping intermediate results
> between runs.
> [...]
> Signed-off-by: Jeff King 
> ---
> diff --git a/Documentation/doc-diff b/Documentation/doc-diff
> @@ -0,0 +1,100 @@
> +OPTIONS_SPEC="\
> +doc-diff   [-- diff options]

Should this be?

doc-diff []   [-- ]

> +--
> +j  parallel argument to pass to make
> +f  force rebuild; do not rely on cached results
> +"

Should "j" and "f" be "-j" and "-f", respectively?

> +while test $# -gt 0
> +do
> +   case "$1" in
> +   -j)
> +   parallel=${1#-j} ;;
> +   -f)
> +   force=t ;;
> +   --)
> +   shift; break ;;
> +   *)
> +   usage ;;

There doesn't seem to a usage() function defined anywhere (and
OPTIONS_SPEC doesn't seem to be used).

> +   esac
> +   shift
> +done
> +# We'll do both builds in a single worktree, which lets make reuse
> +# results that don't differ between the two trees.

"which lets make reuse"?

[PATCH] add a script to diff rendered documentation

After making a change to the documentation, it's easy to
forget to check the rendered version to make sure it was
formatted as you intended. And simply doing a diff between
the two built versions is less trivial than you might hope:

  - diffing the roff or html output isn't particularly
readable; what we really care about is what the end user
will see

  - you have to tweak a few build variables to avoid
spurious differences (e.g., version numbers, build
times)

Let's provide a script that builds and installs the manpages
for two commits, renders the results using "man", and diffs
the result. Since this is time-consuming, we'll also do our
best to avoid repeated work, keeping intermediate results
between runs.

Some of this could probably be made a little less ugly if we
built support into Documentation/Makefile. But by relying
only on "make install-man" working, this script should work
for generating a diff between any two versions, whether they
include this script or not.

Signed-off-by: Jeff King 
---
I wrote this up for my own use after our discussion in [1]. I'm not sure
if it's too ugly for inclusion, or if it might be helpful to others.
I've only just written it, but my plan is to try to run it on anything I
submit to check the formatting. So it _seems_ useful and appears to
work, but only after a few minutes of playing with it. :)

[1] 
https://public-inbox.org/git/20180720223608.ge18...@genre.crustytoothpaste.net/

 Documentation/.gitignore |   1 +
 Documentation/doc-diff   | 100 +++
 2 files changed, 101 insertions(+)
 create mode 100755 Documentation/doc-diff

diff --git a/Documentation/.gitignore b/Documentation/.gitignore
index c7096f11f1..3ef54e0adb 100644
--- a/Documentation/.gitignore
+++ b/Documentation/.gitignore
@@ -12,3 +12,4 @@ cmds-*.txt
 mergetools-*.txt
 manpage-base-url.xsl
 SubmittingPatches.txt
+tmp-doc-diff/
diff --git a/Documentation/doc-diff b/Documentation/doc-diff
new file mode 100755
index 00..61deb2579e
--- /dev/null
+++ b/Documentation/doc-diff
@@ -0,0 +1,100 @@
+#!/bin/sh
+
+OPTIONS_SPEC="\
+doc-diff   [-- diff options]
+--
+j  parallel argument to pass to make
+f  force rebuild; do not rely on cached results
+"
+SUBDIRECTORY_OK=1
+. "$(git --exec-path)/git-sh-setup"
+
+parallel=8
+force=
+while test $# -gt 0
+do
+   case "$1" in
+   -j)
+   parallel=${1#-j} ;;
+   -f)
+   force=t ;;
+   --)
+   shift; break ;;
+   *)
+   usage ;;
+   esac
+   shift
+done
+
+test $# -gt 1 || usage
+from=$1; shift
+to=$1; shift
+
+from_oid=$(git rev-parse --verify "$from") || exit 1
+to_oid=$(git rev-parse --verify "$to") || exit 1
+
+cd_to_toplevel
+tmp=Documentation/tmp-doc-diff
+
+if test -n "$force"
+then
+   rm -rf "$tmp"
+fi
+
+# We'll do both builds in a single worktree, which lets make reuse
+# results that don't differ between the two trees.
+if ! test -d "$tmp/worktree"
+then
+   git worktree add --detach "$tmp/worktree" "$from" &&
+   dots=$(echo "$tmp/worktree" | sed 's#[^/]*#..#g') &&
+   ln -s "$dots/config.mak" "$tmp/worktree/config.mak"
+fi
+
+# generate_render_makefile  
+generate_render_makefile () {
+   find "$1" -type f |
+   while read src
+   do
+   dst=$2/${src#$1/}
+   printf 'all:: %s\n' "$dst"
+   printf '%s: %s\n' "$dst" "$src"
+   printf '\t@echo >&2 "  RENDER $(notdir $@)" && \\\n'
+   printf '\tmkdir -p $(dir $@) && \\\n'
+   printf '\tMANWIDTH=80 man -l $< >$@+ && \\\n'
+   printf '\tmv $@+ $@\n'
+   done
+}
+
+# render_tree  
+render_tree () {
+   # Skip install-man entirely if we already have an installed directory.
+   # We can't rely on make here, since "install-man" unconditionally
+   # copies the files (spending effort, but also updating timestamps that
+   # we then can't rely on during the render step). We use "mv" to make
+   # sure we don't get confused by a previous run that failed partway
+   # through.
+   if ! test -d "$tmp/installed/$1"
+   then
+   git -C "$tmp/worktree" checkout "$2" &&
+   make -j$parallel -C "$tmp/worktree" \
+   GIT_VERSION=omitted \
+   SOURCE_DATE_EPOCH=0 \
+   DESTDIR="$PWD/$tmp/installed/$1+" \
+   install-man &&
+   mv "$tmp/installed/$1+" "$tmp/installed/$1"
+   fi &&
+
+   # As with "installed" above, we skip the render if it's already been
+   # done.  So using make here is primarily just about running in
+   # parallel.
+   if ! test -d "$tmp/rendered/$1"
+   then
+   generate_render_makefile "$tmp/installed/$1" 
"$tmp/rendered/$1+" |
+   make -j$parallel -f - &&
+   mv "$tmp/rendered/$1+" "$tmp/rendered/$1"
+   fi
+}
+
+render_tree $from_oid

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

On Fri, Aug 3, 2018 at 1:07 PM Junio C Hamano  wrote:
>
> Brandon Williams  writes:
>
> > [1] 
> > https://public-inbox.org/git/cagz79kygs4dvoetygx01cinrxxlcqgxovsplhmgyz8b51lz...@mail.gmail.com/
> > This mail seems to counter that indicating that the "What's Cooking"
> > emails should not be used as a status update.
>
> You are the second one who were negatively affected by Stefan's
> "summary" that reads a lot more in what I said than what actually
> was said by me.  Stop paying attention to that message, but do go to
> the original if you want to hear what I actually said.

Please note that I put that one out to "in a deliberatly
outrageous way"[1] so that I get more arguments on why
this workflow is the best we have.

Personally I think the mailing list workflow could be improved
by having less manual work on your side.

That would (a) make the whole process from one end (of writing
the patch) to the other (of seeing its effect in a shipped binary
by a distribution) more transparent, as then DScho could build
his amlog consumer more easily for example.
And (b) it would be less work for you, though different work.
(i.e. instead of resolving conflicts yourself, you'd tell 2 people
to resolve conflicts between their series and you'd then fetch
from them; c.f. Linus lieutenants).

[1] https://public-inbox.org/git/xmqqo9ejwag9@gitster-ct.c.googlers.com/

I did miss the first person who was negatively affected?

> The mention of "discussion thread on the list is the authoritative"
> was said in the context where somebody refuted these "cf. " on
> a topic and I got an impression that it was done in the belief that
> doing so for each and every "cf. " would be enough to exonerate
> the topic of all the issues.  I was explaining that they were not
> meant to be exhaustive list, but rather are my personal reminder so
> that I can go back to the thread to recall why I said things like
> "Waiting for review to conclude", "expecting a reroll", etc.; as I
> do not need to point at all the review comments that matter for them
> to serve that purpose, these "cf. " must not be taken as the
> "clear these and you are home free" list.  To cover all the issues
> pointed out in the review process, you must go to the original.
>
> "What's cooking" primarily serves two purposes.
>
>  - After list of patches in a topic, I try to summarize what the
>topic is about.  This is to become merge commit message for the
>topic when it is merged to various integration branches, and also
>to become an entry in the release notes.
>
>  - Then an immediate plan like "Will merge to 'next'", etc. for the
>topic is announced, optionally with comments like "cf. " to
>remind me why I chose to label the topic as such.
>
> The former is outside the topic of this thread, but both are *not*
> obviously written by everybody; the former is my attempt to
> summarize, and people are welcome to improve them.  If my attempted
> summary is way incorrect, that might be an indication that the
> topic, especially its log messages, is not clearly done.
>
> If my immediate plan contradicts the list concensus, it likely is an
> indication that I missed the discussion, and it is very much
> appreciated for those involved in the discussion to correct me.
> That may result in my dropping a "cf. " when an issue I thought
> to be blocking (or to require the topic to be fast-tracked) turns
> out to have been resolved already, or adding another one when it is
> pointed out to me that I missed an important issue to block (or
> fast-track) the topic.

Thanks for explaining the thoughts behind the cooking emails.

- The first part is very nice to have on the mailing list, but it is
  not strictly needed as you could just write the release notes
  and then people could send patches to the release notes as
  usual.

- The second part of having an immediate plan is *very* nice
  to see, though I would argue that it could be improved
  by having these updates in the thread instead of summarized
  unrelated to that thread.

  We do not do this for now due to tooling issues, I suppose.

  But I would think it makes the threads more discoverable if
  we'd have the status updates in each thread as then people
  would keep the discussion there and not jump on the cooking
  email to continue discussion there.

> Hope this clarifies a bit of the confusion caused by that summary.

I am sorry that this seems as if I would stir up the community
and cause troubles intentionally, but all I really want is a better
workflow process. And as I do not know what is better, I think out
loud trying to get a discussion going that point out things that are
good or bad.

Is there a better way to start a workflow discussion?

Thanks,
Stefan

Re: [PATCH 2/2] rebase --exec: make it work with --rebase-merges

"Johannes Schindelin via GitGitGadget" 
writes:

> + /*
> +  * Insert  after every pick. Here, fixup/squash chains
> +  * are considered part of the pick, so we insert the commands *after*
> +  * those chains if there are any.
> +  */
> + insert_final_commands = 1;
> + for (i = 0; i < todo_list.nr; ) {
> + enum todo_command command = todo_list.items[i].command;
> + int j = 0;
> + ...
> + /* skip fixup/squash chain, if any */
> + for (i++; i < todo_list.nr; i++, j = 0) {

Does 'j' need to be reset to 0 in each iteration?  Nobody looks at
'j' after exiting this inner loop, and every referernce to 'j'
inside this inner loop happens _after_ it gets assigned "i+1" at the
beginning of "skip comment" loop.

For that matter, I wonder if 'j' can be a variable local to this
inner loop, not the outer loop that iterates over todo_list.items[].

> + command = todo_list.items[i].command;
> +
> + if (is_fixup(command))
> + continue;
> +
> + if (command != TODO_COMMENT)
> + break;
> +
> + /* skip comment if followed by any fixup/squash */
> + for (j = i + 1; j < todo_list.nr; j++)
> + if (todo_list.items[j].command != TODO_COMMENT)
> + break;
> + if (j < todo_list.nr &&
> + is_fixup(todo_list.items[j].command)) {
> + i = j;
> + continue;
> + }
> + break;
>   }

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

Brandon Williams writes:

> [1]
> https://public-inbox.org/git/cagz79kygs4dvoetygx01cinrxxlcqgxovsplhmgyz8b51lz...@mail.gmail.com/
> This mail seems to counter that indicating that the "What's Cooking"
> emails should not be used as a status update.

You are the second one who were negatively affected by Stefan's
"summary" that reads a lot more in what I said than what actually
was said by me. Stop paying attention to that message, but do go to
the original if you want to hear what I actually said.

The mention of "discussion thread on the list is the authoritative"
was said in the context where somebody refuted these "cf. " on
a topic and I got an impression that it was done in the belief that
doing so for each and every "cf. " would be enough to exonerate
the topic of all the issues. I was explaining that they were not
meant to be exhaustive list, but rather are my personal reminder so
that I can go back to the thread to recall why I said things like
"Waiting for review to conclude", "expecting a reroll", etc.; as I
do not need to point at all the review comments that matter for them
to serve that purpose, these "cf. " must not be taken as the
"clear these and you are home free" list. To cover all the issues
pointed out in the review process, you must go to the original.

"What's cooking" primarily serves two purposes.

- After list of patches in a topic, I try to summarize what the
topic is about. This is to become merge commit message for the
topic when it is merged to various integration branches, and also
to become an entry in the release notes.

- Then an immediate plan like "Will merge to 'next'", etc. for the
topic is announced, optionally with comments like "cf. " to
remind me why I chose to label the topic as such.

The former is outside the topic of this thread, but both are *not*
obviously written by everybody; the former is my attempt to
summarize, and people are welcome to improve them. If my attempted
summary is way incorrect, that might be an indication that the
topic, especially its log messages, is not clearly done.

If my immediate plan contradicts the list concensus, it likely is an
indication that I missed the discussion, and it is very much
appreciated for those involved in the discussion to correct me.
That may result in my dropping a "cf. " when an issue I thought
to be blocking (or to require the topic to be fast-tracked) turns
out to have been resolved already, or adding another one when it is
pointed out to me that I missed an important issue to block (or
fast-track) the topic.

Hope this clarifies a bit of the confusion caused by that summary.

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

2018-08-03 Thread Brandon Williams

On 08/03, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
> > "brian m. carlson"  writes:
> >
> >> On Fri, Aug 03, 2018 at 11:40:08AM -0700, Junio C Hamano wrote:
> >>> "brian m. carlson"  writes:
> >>> 
> >>> > On Thu, Aug 02, 2018 at 04:02:36PM -0700, Junio C Hamano wrote:
> >>> >> --
> >>> >> [New Topics]
> >>> >
> >>> > I had expected to see
> >>> > <20180729192803.1047050-1-sand...@crustytoothpaste.net> (the refspec @
> >>> > handling) in this list, but I don't.  Were you expecting changes or
> >>> > additional feedback before picking it up?
> >>> 
> >>> Neither.  I just am not all that interested in seeing @ used for
> >>> HEAD in the first place, as I find it quite confusing.

Discussion on our contributor workflow (and potentially adjusting or
changing it) tends to be an area that can result in heated discussion, I
would like to avoid that.  At some point I would really like it if we
could have a constructive discussion on how to improve our workflow as
well as make things easier for newcomers.  Maybe this isn't the time or
place but I find this situation particularly challenging as a
contributor.

Someone sent a patch to the list, they received a review (which doesn't
happen all the time because no one is assigned to review a patches) and
then they wait.  Our project doesn't have a bug tracker (yes jrn you'll
say we have one, https://crbug.com/git/, but its not very discoverable
and the project as a whole hasn't started using it) (and I mention a bug
tracker because that's also a place which could be used to communicate
the 'status' of a series), and it doesn't make use of a code review tool
which has a definitive status of a patch series.  As a contributor I'm
left waiting and unsure of if my patch series needs more review, do I
need to send a re-roll, does the project not like the idea and I should
drop it, or the patch is good as-is and will be merged.

When I started contributing to the project I was told that these "What's
cooking" emails were supposed to be that status[1].  I knew that they
weren't real-time but it meant that I could at least get an idea about
once or twice a week about the various status' of ongoing series in the
project.  When something is just silently omitted, what is a contributor
to think?  Even if someone sends something that the project isn't
interested in taking, shouldn't they at least get informed of that?

Anyway those are just some of my thoughts.  If my thinking is mistaken or I'm
missing something please point it out to me.

[1] 
https://public-inbox.org/git/cagz79kygs4dvoetygx01cinrxxlcqgxovsplhmgyz8b51lz...@mail.gmail.com/
This mail seems to counter that indicating that the "What's Cooking"
emails should not be used as a status update.

-- 
Brandon Williams

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

Junio C Hamano  writes:

> "brian m. carlson"  writes:
>
>> On Fri, Aug 03, 2018 at 11:40:08AM -0700, Junio C Hamano wrote:
>>> "brian m. carlson"  writes:
>>> 
>>> > On Thu, Aug 02, 2018 at 04:02:36PM -0700, Junio C Hamano wrote:
>>> >> --
>>> >> [New Topics]
>>> >
>>> > I had expected to see
>>> > <20180729192803.1047050-1-sand...@crustytoothpaste.net> (the refspec @
>>> > handling) in this list, but I don't.  Were you expecting changes or
>>> > additional feedback before picking it up?
>>> 
>>> Neither.  I just am not all that interested in seeing @ used for
>>> HEAD in the first place, as I find it quite confusing.
>>
>> Okay.  Is that just in this case, or in general?  If nobody but me wants
>> this feature, I'm fine dropping it.
>
> That's the typical case of me saying something in a deliberatly
> outrageous way so that we can hopefully see people from both sides.

I guess that makes the answer to the original question "the latter"
;-)

Re: [PATCH/RFC] clone: report duplicate entries on case-insensitive filesystems

On Fri, Aug 03, 2018 at 02:23:17PM -0400, Jeff Hostetler wrote:

> > Maybe. It might not work as ino_t. Or it might be expensive to get.  Or
> > maybe it's simply impossible. I don't know much about Windows. Some
> > searching implies that NTFS does have a "file index" concept which is
> > supposed to be unique.
> 
> This is hard and/or expensive on Windows.  Yes, you can get the
> "file index" values for an open file handle with a cost similar to
> an fstat().  Unfortunately, the FindFirst/FindNext routines (equivalent
> to the opendir/readdir routines), don't give you that data.  So we'd
> have to scan the directory and then open and stat each file.  This is
> terribly expensive on Windows -- and the reason we have the fscache
> layer (in the GfW version) to intercept the lstat() calls whenever
> possible.

I think that high cost might be OK for our purposes here. This code
would _only_ kick in during a clone, and then only on the error path
once we knew we had a collision during the checkout step.

> Another thing to keep in mind is that the collision could be because
> of case folding (or other such nonsense) on a directory in the path.
> I mean, if someone on Linux builds a commit containing:
> 
> a/b/c/D/e/foo.txt
> a/b/c/d/e/foo.txt
> 
> we'll get a similar collision as if one of them were spelled "FOO.txt".

True, though I think that may be OK. If you had conflicting directories
you'd get a _ton_ of duplicates listed, but that makes sense: you
actually have a ton of duplicates.

> Also, do we need to worry about hard-links or symlinks here?

I think we can ignore hardlinks. Git never creates them, and we know the
directory was empty when we started. Symlinks should be handled by using
lstat(). (Obviously that's for a Unix-ish platform).

> I'm sure there are other edge cases here that make reporting
> difficult; these are just a few I thought of.  I guess what I'm
> trying to say is that as a first step just report that you found
> a collision -- without trying to identify the set existing objects
> that it collided with.

I certainly don't disagree with that. :)

> > At any rate, until we have an actual plan for Windows, I think it would
> > make sense only to split the cases into "has working inodes" and
> > "other", and make sure "other" does something sensible in the meantime
> > (like mention the conflict, but skip trying to list duplicates).
> 
> Yes, this should be split.  Do the "easy" Linux version first.
> Keep in mind that there may also be a different solution for the Mac.

I assumed that an inode-based solution would work for Mac, since it's
mostly BSD under the hood. There may be subtleties I don't know about,
though.

-Peff

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

"brian m. carlson"  writes:

> On Fri, Aug 03, 2018 at 11:40:08AM -0700, Junio C Hamano wrote:
>> "brian m. carlson"  writes:
>> 
>> > On Thu, Aug 02, 2018 at 04:02:36PM -0700, Junio C Hamano wrote:
>> >> --
>> >> [New Topics]
>> >
>> > I had expected to see
>> > <20180729192803.1047050-1-sand...@crustytoothpaste.net> (the refspec @
>> > handling) in this list, but I don't.  Were you expecting changes or
>> > additional feedback before picking it up?
>> 
>> Neither.  I just am not all that interested in seeing @ used for
>> HEAD in the first place, as I find it quite confusing.
>
> Okay.  Is that just in this case, or in general?  If nobody but me wants
> this feature, I'm fine dropping it.

That's the typical case of me saying something in a deliberatly
outrageous way so that we can hopefully see people from both sides.

Re: [PATCH/RFC] clone: report duplicate entries on case-insensitive filesystems

Jeff Hostetler  writes:

> Another thing to keep in mind is that the collision could be because
> of case folding (or other such nonsense) on a directory in the path.
> I mean, if someone on Linux builds a commit containing:
>
> a/b/c/D/e/foo.txt
> a/b/c/d/e/foo.txt
>
> we'll get a similar collision as if one of them were spelled "FOO.txt".

I'd think the approach to teach checkout_entry() codepath to notice
it needed to unlink the existing file in order to check out the
entry it wanted to check out would cover this equally well.

> Also, do we need to worry about hard-links or symlinks here?

I do not think so.  You do not get a file with multiple hardlinks
in a "git clone" or "git checkout" result, and we do not check
things out beyond a symbolic link in the first place.

> If checkout populates symlinks, then you might have another collision
> opportunity.  For example:
>
> a/b/c/D/e/foo.txt
> a/link -> ./b/c/d
> a/link/e/foo.txt

In other words, a tree with a/link (symlink) and a/link/
that requires a/link to be a symlink and a directory at the same
time cannot be created, so you won't get one with "git clone"

> Also, some platforms (like the Mac) allow directory hard-links.
> Granted, Git doesn't create hard-links during checkout, but the
> user might.

And we'd report "we are doing a fresh checkout immediately after a
clone and saw some file we haven't created, which may indicate a
case smashing filesystem glitch (or a competing third-party process
creating random files)", so noticing that would be a good thing, I
would think.

> I'm sure there are other edge cases here that make reporting
> difficult; these are just a few I thought of.  I guess what I'm
> trying to say is that as a first step just report that you found
> a collision -- without trying to identify the set existing objects
> that it collided with.

Yup, I think that is sensible.  If it can be done cheaply, i.e. on a
filesystem with trustable and cheap inum, after noticing such a
collision, go back and lstat() all paths in the index we have
checked out so far to see which ones are colliding, it adds useful
clue to the report, but noticing the collision in the first place
obviously has more value.

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

2018-08-03 Thread brian m. carlson

On Fri, Aug 03, 2018 at 11:40:08AM -0700, Junio C Hamano wrote:
> "brian m. carlson"  writes:
> 
> > On Thu, Aug 02, 2018 at 04:02:36PM -0700, Junio C Hamano wrote:
> >> --
> >> [New Topics]
> >
> > I had expected to see
> > <20180729192803.1047050-1-sand...@crustytoothpaste.net> (the refspec @
> > handling) in this list, but I don't.  Were you expecting changes or
> > additional feedback before picking it up?
> 
> Neither.  I just am not all that interested in seeing @ used for
> HEAD in the first place, as I find it quite confusing.

Okay.  Is that just in this case, or in general?  If nobody but me wants
this feature, I'm fine dropping it.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

"brian m. carlson"  writes:

> On Thu, Aug 02, 2018 at 04:02:36PM -0700, Junio C Hamano wrote:
>> --
>> [New Topics]
>
> I had expected to see
> <20180729192803.1047050-1-sand...@crustytoothpaste.net> (the refspec @
> handling) in this list, but I don't.  Were you expecting changes or
> additional feedback before picking it up?

Neither.  I just am not all that interested in seeing @ used for
HEAD in the first place, as I find it quite confusing.

Nurses contact information

2018-08-03 Thread Cheryl Strack

Hello there,

 

We have a newly compiled list of Nurses contact information.

 

Contact Information such Name, Company's Name, Phone Number, Fax Number, Job
Title, Email address, Complete Mailing Address, SIC code, Company revenue,
size, Web address etc.

 

We also have other specialist such as Cardiologist, Internal medicine,
Radiologists Family Practitioners Orthopedists and all others.

 

We provide these contacts with complete details with direct email address
with unlimited usage for your email marketing. Please let me know if you are
interested.

 

Please let me know the below and I shall get back to you with other list
details accordingly.

 

Target Specialist___?

Target Geography___?

 

Hope to hear from you soon.

 

Regards,

 

Cheryl Strack - Marketing Analyst

 

This is an attempt to begin a conversation with you. If you would rather not
hear from us, please respond mentioning UNSUBSCRIBE in the subject line.

If you are not the right person please forward this email to the right
person in your organization.

Re: [PATCH 2/2] rebase --exec: make it work with --rebase-merges

"Johannes Schindelin via GitGitGadget" 
writes:

> From: Johannes Schindelin 
>
> The idea of `--exec` is to append an `exec` call after each `pick`.
>
> Since the introduction of fixup!/squash! commits, this idea was extended
> to apply to "pick, possibly followed by a fixup/squash chain", i.e. an
> exec would not be inserted between a `pick` and any of its corresponding
> `fixup` or `squash` lines.
>
> The current implementation uses a dirty trick to achieve that: it
> assumes that there are only pick/fixup/squash commands, and then
> *inserts* the `exec` lines before any `pick` but the first, and appends
> a final one.

Ahh, it may be "dirty" but "clever" ;-) As there is no way to say
"add exec after only the third one", inserting before 'pick',
assuming the lines before that would be a "previous group" that
began with a pick and then possibly its amending operations, was
sufficient.  Makes sense.

> With the todo lists generated by `git rebase --rebase-merges`, this
> simple implementation shows its problems: it produces the exact wrong
> thing when there are `label`, `reset` and `merge` commands.

Understandable.

> Let's change the implementation to do exactly what we want: look for
> `pick` lines, skip any fixup/squash chains, and then insert the `exec`
> line. Lather, rinse, repeat.
>
> While at it, also add `exec` lines after `merge` commands, because they
> are similar in spirit to `pick` commands: they add new commits.

Yeah, while reading the "Let's change" paragraph, that exactly was
what came to my mind---the old one was buggy because we didn't
anticipate that 'pick' won't stay to be the first command in the
block that gives us a good anchoring point to find the end of the
previous block, so assuming that 'pick' will stay to be the only
command the users want to add exec after would be making a design
mistake that is quite similar without learning any lesson from the
bug being fixed.

I do agree that it is a good idea to add exec after merge; what I
wanted to say with the above is that I just do not think it is
"While at it"; it would be an integral part of supporting --exec
in rebase-merges mode.

> Signed-off-by: Johannes Schindelin 
> ---
>  sequencer.c  | 59 
>  t/t3430-rebase-merges.sh |  2 +-
>  2 files changed, 49 insertions(+), 12 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index 31038472f..dda5cdbba 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4244,10 +4244,9 @@ int sequencer_add_exec_commands(const char *commands)
>  {
>   const char *todo_file = rebase_path_todo();
>   struct todo_list todo_list = TODO_LIST_INIT;
> - struct todo_item *item;
>   struct strbuf *buf = _list.buf;
>   size_t offset = 0, commands_len = strlen(commands);
> - int i, first;
> + int i, insert_final_commands;
>  
>   if (strbuf_read_file(_list.buf, todo_file, 0) < 0)
>   return error(_("could not read '%s'."), todo_file);
> @@ -4257,19 +4256,57 @@ int sequencer_add_exec_commands(const char *commands)
>   return error(_("unusable todo list: '%s'"), todo_file);
>   }
>  
> - first = 1;
> - /* insert  before every pick except the first one */
> - for (item = todo_list.items, i = 0; i < todo_list.nr; i++, item++) {
> - if (item->command == TODO_PICK && !first) {
> - strbuf_insert(buf, item->offset_in_buf + offset,
> -   commands, commands_len);
> - offset += commands_len;
> + /*
> +  * Insert  after every pick. Here, fixup/squash chains
> +  * are considered part of the pick, so we insert the commands *after*
> +  * those chains if there are any.
> +  */

This is a tangent, but can a merge be amended with fixup/squash?  I
am hoping that I can use this machinery to augment Meta/Reintegrate
logic someday, and amending merges to resolve semantic conflicts
between topiocs in flight is what needs to happen constantly.

It appears the code treats TODO_PICK and TODO_MERGE the same way, so
the answer to the question apparently is "yes", which is good.

"after every pick" needs to become "after every pick and merge", or
if you prefer "after creating every new commit (i.e. pick and merge)".

> + insert_final_commands = 1;

We assume, before entering the loop, that we'd need to append
another exec at the end.

> + for (i = 0; i < todo_list.nr; ) {
> + enum todo_command command = todo_list.items[i].command;
> + int j = 0;
> +
> + if (command != TODO_PICK && command != TODO_MERGE) {
> + i++;
> + continue;

If we ever see a todo-list without any pick/merge, then insert_final
is still 1 when we leave the loop and we will add one single exec at
the end.  Which may or may not make sense---I dunno, as I do not
offhand think of a reason why the user would give us such a sequence
in the first place, so

Re: [PATCH/RFC] clone: report duplicate entries on case-insensitive filesystems

2018-08-03 Thread Jeff Hostetler





On 8/2/2018 5:28 PM, Jeff King wrote:

On Thu, Aug 02, 2018 at 02:14:30PM -0700, Junio C Hamano wrote:


Jeff King  writes:


I also wonder if Windows could return some other file-unique identifier
that would work in place of an inode here. That would be pretty easy to
swap in via an #ifdef's helper function. I'd be OK shipping without that
and letting Windows folks fill it in later (as long as we do not do
anything too stupid until then, like claim all of the inode==0 files are
the same).


Yeah, but such a useful file-unique identifier would probably be
used in place of inum in their (l)stat emulation already, if exists,
no?


Maybe. It might not work as ino_t. Or it might be expensive to get.  Or
maybe it's simply impossible. I don't know much about Windows. Some
searching implies that NTFS does have a "file index" concept which is
supposed to be unique.


This is hard and/or expensive on Windows.  Yes, you can get the
"file index" values for an open file handle with a cost similar to
an fstat().  Unfortunately, the FindFirst/FindNext routines (equivalent
to the opendir/readdir routines), don't give you that data.  So we'd
have to scan the directory and then open and stat each file.  This is
terribly expensive on Windows -- and the reason we have the fscache
layer (in the GfW version) to intercept the lstat() calls whenever
possible.

It might be possible to use the NTFS Master File Table to discover
this (very big handwave), but I would need to do a little digging.

This would all be NTFS specific.  FAT and other volume types would not
be covered.

Another thing to keep in mind is that the collision could be because
of case folding (or other such nonsense) on a directory in the path.
I mean, if someone on Linux builds a commit containing:

a/b/c/D/e/foo.txt
a/b/c/d/e/foo.txt

we'll get a similar collision as if one of them were spelled "FOO.txt".

Also, do we need to worry about hard-links or symlinks here?
If checkout populates symlinks, then you might have another collision
opportunity.  For example:

a/b/c/D/e/foo.txt
a/link -> ./b/c/d
a/link/e/foo.txt

Also, some platforms (like the Mac) allow directory hard-links.
Granted, Git doesn't create hard-links during checkout, but the
user might.

I'm sure there are other edge cases here that make reporting
difficult; these are just a few I thought of.  I guess what I'm
trying to say is that as a first step just report that you found
a collision -- without trying to identify the set existing objects
that it collided with.



At any rate, until we have an actual plan for Windows, I think it would
make sense only to split the cases into "has working inodes" and
"other", and make sure "other" does something sensible in the meantime
(like mention the conflict, but skip trying to list duplicates).


Yes, this should be split.  Do the "easy" Linux version first.
Keep in mind that there may also be a different solution for the Mac.


When somebody wants to work on Windows support, then we can figure out
if it just needs to wrap the "get unique identifier" operation, or if it
would use a totally different algorithm.

-Peff



Jeff

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

2018-08-03 Thread brian m. carlson

On Thu, Aug 02, 2018 at 04:02:36PM -0700, Junio C Hamano wrote:
> --
> [New Topics]

I had expected to see
<20180729192803.1047050-1-sand...@crustytoothpaste.net> (the refspec @
handling) in this list, but I don't.  Were you expecting changes or
additional feedback before picking it up?
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: [RFC PATCH v5 0/4] add -p: select individual hunk lines

On Fri, Aug 03 2018, Phillip Wood wrote:

[snip (all made sense)]

> It seems clear for your comment and Junio's that I need to improve the
> documentation, I'm not sure if that will be enough though or do we need
> to change the behavior? [I'm beginning to see why all the other programs
> I tried while writing this (tig, gitg, gitk and mercurial's version of
> add -i) don't make any attempt to stage modified lines correctly, though
> I think git should have some way of doing it.]

I think this is ready to move forward conceptually, i.e. in these two
review rounds I've been focusing on the UI this exposes, and whether
this is even a viable thing to have conceptually.

I'm convinced that it is, even though there's osme confusing edge cases
(as noted in my mail) I don't see how those could be entire avoided, a
feature like this will inherently have such edge cases.

I've avoided going deep into how the actual code itself works / is
implemented in these review rounds, figuring that given the topic it was
better to move past RFC for that type of review once we had docs.

Thanks for working on this.

Re: [PATCH v2 2/2] doc hash-function-transition: pick SHA-256 as NewHash

2018-08-03 Thread brian m. carlson

On Fri, Aug 03, 2018 at 12:20:14AM -0700, Jonathan Nieder wrote:
> Ævar Arnfjörð Bjarmason wrote:
> >  Object format
> >  ~
> >  The content as a byte sequence of a tag, commit, or tree object named
> > -by sha1 and newhash differ because an object named by newhash-name refers 
> > to
> > +by sha1 and sha256 differ because an object named by sha256-name refers to
> 
> Not about this patch: this should say SHA-1 and SHA-256, I think.
> Leaving it as is in this patch as you did is the right thing.
> 
> [...]
> > @@ -255,10 +252,10 @@ network byte order):
> >up to and not including the table of CRC32 values.
> >  - Zero or more NUL bytes.
> >  - The trailer consists of the following:
> > -  - A copy of the 20-byte NewHash checksum at the end of the
> > +  - A copy of the 20-byte SHA-256 checksum at the end of the
> 
> Not about this patch: a SHA-256 is 32 bytes.  Leaving that for a
> separate patch as you did is the right thing, though.
> 
> [...]
> > -  - 20-byte NewHash checksum of all of the above.
> > +  - 20-byte SHA-256 checksum of all of the above.
> 
> Likewise.

For the record, my code for these does use 32 bytes.  I'm fine with this
being a separate patch, though.

> [...]
> > @@ -351,12 +348,12 @@ the following steps:
> > (This list only contains objects reachable from the "wants". If the
> > pack from the server contained additional extraneous objects, then
> > they will be discarded.)
> > -3. convert to newhash: open a new (newhash) packfile. Read the 
> > topologically
> > +3. convert to sha256: open a new (sha256) packfile. Read the topologically
> 
> Not about this patch: this one's more murky, since it's talking about
> the object names instead of the hash function.  I guess "sha256"
> instead of "SHA-256" for this could be right, but I worry it's going
> to take time for me to figure out the exact distinction.  That seems
> like a reason to just call it SHA-256 (but in a separate patch).

My assumption has been that when we are referring to the algorithm,
we'll use SHA-1 and SHA-256, and when we're referring to the input to
Git (in config files or in ^{sha256} notation), we use "sha1" and
"sha256".  I see this as analogous to "Git" and "git".

Otherwise, I'm fine with this document as it is.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Need assistance synchronize a fork with it's parent

2018-08-03 Thread Dennis German


I need assistance synchronize a fork with it's parent

   https://github.com/DG12/ruuvitag_fw
   This branch is 5 commits ahead, 29 commits behind ruuvi:master.

I want to revise main.c and the issue a pull request.

Sorry, but I am lost.

Seems that I am trying to do something simple.

Any help / comments would be appreciated.

SIncerely,

Dennis German

Re: [PATCH v2 2/2] doc hash-function-transition: pick SHA-256 as NewHash



On Fri, Aug 03 2018, Jonathan Nieder wrote:

> Hi again,
>
> Sorry for the slow review.  I finally got a chance to look this over
> again.
>
> My main nits are about the commit message: I think it still focuses
> too much on the process instead of the usual "knowing what I know now,
> here's the clearest explanation for why we need this patch" approach.
> I can send a patch illustrating what I mean tomorrow morning.

I think it makes if you just take over 2/2 of this series (or even the
whole thing), since the meat of it is already something I copy/pasted
from you, and you've got more of an opinion / idea about how to proceed
(which is good!); it's more efficient than me trying to fix various
stuff you're pointing out at this point, I also think it makes sense
that you change the "Author" line for that, since the rest of the
changes will mainly be search-replace by me.

Perhaps it's better for readability if those search-replace changes go
into their own change, i.e. make it a three-part where 2/3 does the
search-replace, and promises that 3/3 has the full rationale etc.

> Ævar Arnfjörð Bjarmason wrote:
>
>> From a security perspective, it seems that SHA-256, BLAKE2, SHA3-256,
>> K12, and so on are all believed to have similar security properties.
>> All are good options from a security point of view.
>>
>> SHA-256 has a number of advantages:
>>
>> * It has been around for a while, is widely used, and is supported by
>>   just about every single crypto library (OpenSSL, mbedTLS, CryptoNG,
>>   SecureTransport, etc).
>>
>> * When you compare against SHA1DC, most vectorized SHA-256
>>   implementations are indeed faster, even without acceleration.
>>
>> * If we're doing signatures with OpenPGP (or even, I suppose, CMS),
>>   we're going to be using SHA-2, so it doesn't make sense to have our
>>   security depend on two separate algorithms when either one of them
>>   alone could break the security when we could just depend on one.
>>
>> So SHA-256 it is.
>
> The above is what I wrote, so of course I'd like it. ;-)
>
>>See the "Hash algorithm analysis" thread as of
>> [1]. Linus has come around to this choice and suggested Junio make the
>> final pick, and he's endorsed SHA-256 [3].
>
> The above paragraph has the same problem as before of (1) not being
> self-contained and (2) focusing on the process that led to this patch
> instead of the benefit of the patch itself.  I think we should omit it.
>
>> This follow-up change changes occurrences of "NewHash" to
>> "SHA-256" (or "sha256", depending on the context). The "Selection of a
>> New Hash" section has also been changed to note that historically we
>> used the the "NewHash" name while we didn't know what the new hash
>> function would be.
>
> nit: Commit messages are usually in the imperative tense.  This is in
> the past tense, I think because it is a continuation of that
> discussion about process.
>
> For this part, I think we can let the patch speak for itself.
>
>> This leaves no use of "NewHash" anywhere in git.git except in the
>> aforementioned section (and as a variable name in t/t9700/test.pl, but
>> that use from 2008 has nothing to do with this transition plan).
>
> This part is helpful --- good.
>
>> 1. 
>> https://public-inbox.org/git/20180720215220.gb18...@genre.crustytoothpaste.net/
>> 2. 
>> https://public-inbox.org/git/ca+55afwse9bf8e0hlk9pp3fvd5lavy5grdsv3fbntgzekja...@mail.gmail.com/
>> 3. https://public-inbox.org/git/xmqqzhygwd5o@gitster-ct.c.googlers.com/
>
> Footnotes to the historical part --- I'd recommend removing these.
>
>> Helped-by: Jonathan Nieder 
>> Helped-by: Junio C Hamano 
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>
> Here I'd want to put a pile of acks, e.g.:
>
>  Acked-by: Linus Torvalds 
>  Acked-by: brian m. carlson 
>  Acked-by: Johannes Schindelin 
>  Acked-by: Dan Shumow 
>  Acked-by: Junio C Hamano 
>
> [...]
>> --- a/Documentation/technical/hash-function-transition.txt
>> +++ b/Documentation/technical/hash-function-transition.txt
>> @@ -59,14 +59,11 @@ that are believed to be cryptographically secure.
>>
>>  Goals
>>  -
>> -Where NewHash is a strong 256-bit hash function to replace SHA-1 (see
>> -"Selection of a New Hash", below):
>> -
>> -1. The transition to NewHash can be done one local repository at a time.
>> +1. The transition to SHA-256 can be done one local repository at a time.
>
> Yay!
>
> [...]
>>  [extensions]
>> -objectFormat = newhash
>> +objectFormat = sha256
>>  compatObjectFormat = sha1
>
> Yes, makes sense.
>
> [...]
>> @@ -155,36 +152,36 @@ repository extensions.
>>  Object names
>>  
>>  Objects can be named by their 40 hexadecimal digit sha1-name or 64
>> -hexadecimal digit newhash-name, plus names derived from those (see
>> +hexadecimal digit sha256-name, plus names derived from those (see
>>  gitrevisions(7)).
>>
>>  The sha1-name of an object is the SHA-1 of the concatenation of its
>>  type, length, a

[PATCH 0/2] Make git rebase work with --rebase-merges and --exec

2018-08-03 Thread Johannes Schindelin via GitGitGadget

It was reported via IRC that the exec lines are inserted in the wrong spots
when using --rebase-merges.

The reason is that we used a simple, incorrect implementation that happened
to work as long as the generated todo list only contains pick, fixup and 
squash commands. Which is not the case with--rebase-merges.

Fix this issue by using a correct, if longer and slightly more complex
implementation instead.

Johannes Schindelin (2):
  t3430: demonstrate what -r, --autosquash & --exec should do
  rebase --exec: make it work with --rebase-merges

 sequencer.c  | 59 
 t/t3430-rebase-merges.sh | 17 
 2 files changed, 65 insertions(+), 11 deletions(-)


base-commit: 1d89318c48d233d52f1db230cf622935ac3c69fa
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-13%2Fdscho%2Frebase-merges-and-exec-commands-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-13/dscho/rebase-merges-and-exec-commands-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/13
-- 
gitgitgadget

[PATCH 1/2] t3430: demonstrate what -r, --autosquash & --exec should do

2018-08-03 Thread Johannes Schindelin via GitGitGadget

From: Johannes Schindelin 

The --exec option's implementation is not really well-prepared for
--rebase-merges. Demonstrate this.

Signed-off-by: Johannes Schindelin 
---
 t/t3430-rebase-merges.sh | 17 +
 1 file changed, 17 insertions(+)

diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index 9e6229727..0bf5eaa37 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -363,4 +363,21 @@ test_expect_success 'octopus merges' '
EOF
 '
 
+test_expect_failure 'with --autosquash and --exec' '
+   git checkout -b with-exec H &&
+   echo Booh >B.t &&
+   test_tick &&
+   git commit --fixup B B.t &&
+   write_script show.sh <<-\EOF &&
+   subject="$(git show -s --format=%s HEAD)"
+   content="$(git diff HEAD^! | tail -n 1)"
+   echo "$subject: $content"
+   EOF
+   test_tick &&
+   git rebase -ir --autosquash --exec ./show.sh A >actual &&
+   grep "B: +Booh" actual &&
+   grep "E: +Booh" actual &&
+   grep "G: +G" actual
+'
+
 test_done
-- 
gitgitgadget

[PATCH 2/2] rebase --exec: make it work with --rebase-merges

2018-08-03 Thread Johannes Schindelin via GitGitGadget

From: Johannes Schindelin 

The idea of `--exec` is to append an `exec` call after each `pick`.

Since the introduction of fixup!/squash! commits, this idea was extended
to apply to "pick, possibly followed by a fixup/squash chain", i.e. an
exec would not be inserted between a `pick` and any of its corresponding
`fixup` or `squash` lines.

The current implementation uses a dirty trick to achieve that: it
assumes that there are only pick/fixup/squash commands, and then
*inserts* the `exec` lines before any `pick` but the first, and appends
a final one.

With the todo lists generated by `git rebase --rebase-merges`, this
simple implementation shows its problems: it produces the exact wrong
thing when there are `label`, `reset` and `merge` commands.

Let's change the implementation to do exactly what we want: look for
`pick` lines, skip any fixup/squash chains, and then insert the `exec`
line. Lather, rinse, repeat.

While at it, also add `exec` lines after `merge` commands, because they
are similar in spirit to `pick` commands: they add new commits.

Signed-off-by: Johannes Schindelin 
---
 sequencer.c  | 59 
 t/t3430-rebase-merges.sh |  2 +-
 2 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 31038472f..dda5cdbba 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4244,10 +4244,9 @@ int sequencer_add_exec_commands(const char *commands)
 {
const char *todo_file = rebase_path_todo();
struct todo_list todo_list = TODO_LIST_INIT;
-   struct todo_item *item;
struct strbuf *buf = _list.buf;
size_t offset = 0, commands_len = strlen(commands);
-   int i, first;
+   int i, insert_final_commands;
 
if (strbuf_read_file(_list.buf, todo_file, 0) < 0)
return error(_("could not read '%s'."), todo_file);
@@ -4257,19 +4256,57 @@ int sequencer_add_exec_commands(const char *commands)
return error(_("unusable todo list: '%s'"), todo_file);
}
 
-   first = 1;
-   /* insert  before every pick except the first one */
-   for (item = todo_list.items, i = 0; i < todo_list.nr; i++, item++) {
-   if (item->command == TODO_PICK && !first) {
-   strbuf_insert(buf, item->offset_in_buf + offset,
- commands, commands_len);
-   offset += commands_len;
+   /*
+* Insert  after every pick. Here, fixup/squash chains
+* are considered part of the pick, so we insert the commands *after*
+* those chains if there are any.
+*/
+   insert_final_commands = 1;
+   for (i = 0; i < todo_list.nr; ) {
+   enum todo_command command = todo_list.items[i].command;
+   int j = 0;
+
+   if (command != TODO_PICK && command != TODO_MERGE) {
+   i++;
+   continue;
+   }
+
+   /* skip fixup/squash chain, if any */
+   for (i++; i < todo_list.nr; i++, j = 0) {
+   command = todo_list.items[i].command;
+
+   if (is_fixup(command))
+   continue;
+
+   if (command != TODO_COMMENT)
+   break;
+
+   /* skip comment if followed by any fixup/squash */
+   for (j = i + 1; j < todo_list.nr; j++)
+   if (todo_list.items[j].command != TODO_COMMENT)
+   break;
+   if (j < todo_list.nr &&
+   is_fixup(todo_list.items[j].command)) {
+   i = j;
+   continue;
+   }
+   break;
}
-   first = 0;
+
+   if (i >= todo_list.nr) {
+   insert_final_commands = 1;
+   break;
+   }
+
+   strbuf_insert(buf, todo_list.items[i].offset_in_buf + offset,
+ commands, commands_len);
+   offset += commands_len;
+   insert_final_commands = 0;
}
 
/* append final  */
-   strbuf_add(buf, commands, commands_len);
+   if (insert_final_commands)
+   strbuf_add(buf, commands, commands_len);
 
i = write_message(buf->buf, buf->len, todo_file, 0);
todo_list_release(_list);
diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index 0bf5eaa37..90ae613e2 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -363,7 +363,7 @@ test_expect_success 'octopus merges' '
EOF
 '
 
-test_expect_failure 'with --autosquash and --exec' '
+test_expect_success 'with --autosquash and --exec' '
git checkout -b with-exec H &&
echo Booh >B.t &&
test_tick &&
-- 
gitgitgadget

Re: [PATCH 1/1] verify-tag/verify-commit should exit unsuccessfully when signature is not trusted

Santiago Torres  writes:

> Now that I think about it though, verify tag can verify more than one
> tag. I assume that this would make it difficult to propagate individual
> errors in trusting. I honestly don't know what's the best way to modify
> this behavior then.

I am not sure if changing the behaviour is warranted to begin with.
Especially when somebody wants to get more than a single bit.  I
think our single bit signals "does signature compute correctly?"
without taking "how much trust do we place in that particular key?"
into account.  As Brian mentioned in an earlier response, those who
want to take different factors like the level of trust etc. into
account can read from "--raw", using the exit code only for "does
signature compute correctly?" and nothing else.  That would allow
them to perform any validation underlying GNUPG let them to do.

Re: Squash & Merge Implementation

2018-08-03 Thread Vadim Belov

Thanks, Peff!

I'm just doing the CI and the status check is for testing each commit to
the PR-Branch.
I'll try to get response from github on this as you suggested.

Thanks again,
Vadim

On Fri, Aug 3, 2018 at 4:49 PM Jeff King  wrote:

> On Fri, Aug 03, 2018 at 03:01:15PM +0300, Vadim Belov wrote:
>
> > 1. This merges successfully without squash:
> > git checkout origin/master
> > git merge ${PR-Branch}
> > git push origin HEAD:master
> > git push origin --delete ${PR-Branch}
>
> Right, this is a normal merge.
>
> > 2. This closes the PR, but there is no update seen on master:
> > git checkout origin/master
> > git merge --squash --commit ${PR-Branch}
> > git push origin HEAD:master
> > git push origin --delete ${PR-Branch}
>
> Doing "merge --squash --commit" doesn't do what you expect; namely
> "--commit" does not override the non-committing nature of "--squash". It
> only override a "--no-commit" found elsewhere.
>
> IMHO this is something that could be improved in Git (i.e., telling the
> difference between "the user did not say --no-commit" and "the user said
> --commit" and respecting it for --squash).
>
> But that explains what you see. The push to master is a noop, since you
> didn't make a new commit. And then deleting the PR branch on GitHub
> auto-closes the PR.
>
> > 3. This fails to push to master with the error "GH006: Protected branch
> > update failed"  (despite that the PR is set to SUCCESS):
> > git checkout origin/master
> > git merge --squash ${PR-Branch}
> > git commit -am"comment"
> > git push origin HEAD:${m_mainBranch}
> > git push origin --delete ${m_prBranch}
>
> So here you _do_ make an actual commit. But that commit doesn't look
> like a merge to the receiver; it just looks like a single commit that
> has all the changes there were on PR-Branch.
>
> The tree of that commit should be the same tree that would result from a
> real merge. So in theory things like protected-branch status checks
> could handle that, but I suspect they use the actual commit id (the tree
> id is fine if you're just doing CI, but if you wanted to have a status
> check for commit messages, say, you'd obviously want that to be tied to
> the actual commit object).
>
> I don't offhand recall how that is implemented (and you could also be
> falling afoul of other checks, like required reviews). But this is a
> GitHub-specific question, and you should probably ask GitHub support to
> go further.
>
> -Peff
>

Re: [PATCH 00/12] Kill the_index part2, header file cleanup

Duy Nguyen  writes:

> Another friendly ping. I really need to know if I need to update (or
> drop) this part before moving on to part 3.

Easier is to resend and see if it sticks this time (that is how I
landed format-patch back when I was just an individual contributor
;-)

I will see if I can locate them in the archive (I actually do not
have strong preference either way myself, so the inaction was not a
form of objection but was merely showing a "Meh").

Thanks for prodding.

Re: [PATCH v2 2/2] doc hash-function-transition: pick SHA-256 as NewHash

2018-08-03 Thread Linus Torvalds

On Fri, Aug 3, 2018 at 9:40 AM Junio C Hamano  wrote:
>
> > [...]
> >> -  - 20-byte NewHash checksum of all of the above.
> >> +  - 20-byte SHA-256 checksum of all of the above.
> >
> > Likewise.
>
> Hmph, I've always assumed since NewHash plan was written that this
> part was not about tamper resistance but was about bit-flipping
> detection and it was deliberate to stick to 20-byte sum, truncating
> as necessary.

Yeah, in fact, since this was one area where people actually had
performance issues with the hash, it might be worth considering
_weakening_ the hash.

Things like the pack index files (and just the regular file index) are
entirely local, and the SHA1 in those is really just a fancy CRC. It's
really just "good protection against disk corruption" (it happens),
and in fact you cannot use it as protection against active tampering,
since there's no secret there and any active attacker that has access
to your local filesystem could just recompute the hash anyway.

And because they are local anyway and aren't really transported
(modulo "shared repositories" where you use them across users or
legacy rsync-like synchronization), they can be handled separately
from any hashing changes. The pack and index file formats have in fact
been changed before.

It might make sense to either keep it as SHA1 (just to minimize any
changes) or if there are still issues with index file performance it
could even be made to use something fast-but-not-cryptographic like
just making it use crc32().

Remember: one of the original core git design requirements was
"corruption detection".

For normal hashed objects, that came naturally, and the normal object
store additionally has active tamper protection thanks to the
interconnected nature of the hashes and the distribution of the
objects.

But for things like packfiles and the file index, it is just a
separate checksum. There is no tamper protection anyway, since anybody
who can modify them directly can just recompute the hash checksum.

The fact that both of these things used SHA1 was more of a convenience
issue than anything else, and the pack/index file checksum is
fundamentally not cryptographic (but a cryptographic hash is obviously
by definition also a very good corruption detector).

   Linus

Re: [RFC PATCH v5 0/4] add -p: select individual hunk lines

Phillip Wood  writes:

> ... [I'm beginning to see why all the other programs I tried while
> writing this (tig, gitg, gitk and mercurial's version of add -i)
> don't make any attempt to stage modified lines correctly, though I
> think git should have some way of doing it.]

Yes, this is a kind of feature that one can propose and implement
something that works well for some of the limited cases one uses,
but not in other cases, and it becomes very hard to explain how to
work around the implementation limitation---that is why I stopped
at "split this hunk?" and did not go beyond it when I designed the
original "incremental add" feature.

I think the real reason why it is hard is that there is no good
definition of "modified" in "stage modified lines", and worse, there
is no good way to mechanically figure it out, because a patch only
gives you "these were deleted" and "these are added", without giving
you "this line in the deleted block corresponds to these two lines
in the added block" (i.e. "this original one line was modified into
this thing in the result").

[PATCH] builtin/config.c: separate message and value by colon in error messages

2018-08-03 Thread Ralf Thielow

The error message and the value that caused that error should be
separated by a colon.

Signed-off-by: Ralf Thielow 
---
 builtin/config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/config.c b/builtin/config.c
index 2c93a289a7..862f870b58 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -98,7 +98,7 @@ static int option_parse_type(const struct option *opt, const 
char *arg,
else if (!strcmp(arg, "color"))
new_type = TYPE_COLOR;
else
-   die(_("unrecognized --type argument, %s"), arg);
+   die(_("unrecognized --type argument: %s"), arg);
}
 
to_type = opt->value;
-- 
2.18.0.203.gfac676dfb9

Re: [PATCH v2 2/2] doc hash-function-transition: pick SHA-256 as NewHash

2018-08-03 Thread Linus Torvalds

On Fri, Aug 3, 2018 at 12:20 AM Jonathan Nieder  wrote:
>
>
> Here I'd want to put a pile of acks, e.g.:
>
>  Acked-by: Linus Torvalds 
> ..

Sure, feel free to add my Ack for this.

   Linus

Re: [PATCH v2 2/2] doc hash-function-transition: pick SHA-256 as NewHash

Jonathan Nieder  writes:

> Sorry for the slow review.  I finally got a chance to look this over
> again.
>
> My main nits are about the commit message: I think it still focuses
> too much on the process instead of the usual "knowing what I know now,
> here's the clearest explanation for why we need this patch" approach.
> I can send a patch illustrating what I mean tomorrow morning.

I'll turn 'Will merge to next' to 'Hold' for now.

>>  Object format
>>  ~
>>  The content as a byte sequence of a tag, commit, or tree object named
>> -by sha1 and newhash differ because an object named by newhash-name refers to
>> +by sha1 and sha256 differ because an object named by sha256-name refers to
>
> Not about this patch: this should say SHA-1 and SHA-256, I think.
> Leaving it as is in this patch as you did is the right thing.

Perhaps deserves a "NEEDSWORK" comment, though.

> [...]
>> @@ -255,10 +252,10 @@ network byte order):
>>up to and not including the table of CRC32 values.
>>  - Zero or more NUL bytes.
>>  - The trailer consists of the following:
>> -  - A copy of the 20-byte NewHash checksum at the end of the
>> +  - A copy of the 20-byte SHA-256 checksum at the end of the
>
> Not about this patch: a SHA-256 is 32 bytes.  Leaving that for a
> separate patch as you did is the right thing, though.
>
> [...]
>> -  - 20-byte NewHash checksum of all of the above.
>> +  - 20-byte SHA-256 checksum of all of the above.
>
> Likewise.

Hmph, I've always assumed since NewHash plan was written that this
part was not about tamper resistance but was about bit-flipping
detection and it was deliberate to stick to 20-byte sum, truncating
as necessary.

It definitely is a good idea to leave it for a separate patch to
update this part.

Thanks.

Re: [RFC PATCH v2 06/12] submodule--helper: add a '--stage' option to the 'config' sub command

Antonio Ospite  writes:

> The rationale behind the change is to close the circle started with 04
> and 05 and stop referring to .gitmodules explicitly by file path in git
> commands. The change is not strictly needed for the series, it's for
> consistency sake.

Sorry, but I am still not quite sure what consistency you are
referring to.

I also do not see a reason why we want to stop referring to
.gitmodules explicitly by name.  We do not hide the fact that
in-tree .gitignore and .gitattributes files are used to hold the
metainformation about the project tree, saying that it is an
implementation detail.  Is there a good reason why .gitmodules
should be different from these other two?

Re: [PATCH 00/12] Kill the_index part2, header file cleanup

2018-08-03 Thread Duy Nguyen

Another friendly ping. I really need to know if I need to update (or
drop) this part before moving on to part 3.

On Sat, Jul 21, 2018 at 11:06 AM Duy Nguyen  wrote:
>
> On Sat, Jun 30, 2018 at 11:20 AM Nguyễn Thái Ngọc Duy  
> wrote:
> >
> > Like part 1 this is also boring. I wanted to drop these 'extern'
> > everywhere actually, so before I touched any header file in this
> > series, I did a clean up first. This is the result (and to reduce diff
> > noise later)
>
> Junio, part1 of the "kill the_index" series is dropped, but what about
> this one? I think it's still a good cleanup and it only slightly
> conflicts with 'pu'.
>
> >
> > Nguyễn Thái Ngọc Duy (12):
> >   apply.h: drop extern on func declaration
> >   attr.h: drop extern from function declaration
> >   blame.h: drop extern on func declaration
> >   cache-tree.h: drop extern from function declaration
> >   convert.h: drop 'extern' from function declaration
> >   diffcore.h: drop extern from function declaration
> >   diff.h: remove extern from function declaration
> >   line-range.h: drop extern from function declaration
> >   rerere.h: drop extern from function declaration
> >   repository.h: drop extern from function declaration
> >   revision.h: drop extern from function declaration
> >   submodule.h: drop extern from function declaration
> >
> >  apply.h  |  23 +-
> >  attr.h   |  24 +--
> >  blame.h  |  28 ++--
> >  cache-tree.h |   2 +-
> >  convert.h|  56 
> >  diff.h   | 120 +--
> >  diffcore.h   |  50 ++---
> >  line-range.h |  12 +++---
> >  repository.h |  25 +--
> >  rerere.h |  14 +++---
> >  revision.h   |  69 ++---
> >  submodule.h  | 112 +++
> >  12 files changed, 269 insertions(+), 266 deletions(-)
> >
> > --
> > 2.18.0.rc2.476.g39500d3211
> >
>
>
> --
> Duy



-- 
Duy

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

Pratik Karki  writes:

> Hi Junio,
>
> On Fri, Aug 3, 2018 at 4:47 AM Junio C Hamano  wrote:
>>
>> * pk/rebase-in-c (2018-07-30) 3 commits
>>  - builtin/rebase: support running "git rebase "
>>  - rebase: refactor common shell functions into their own file
>>  - rebase: start implementing it as a builtin
>>
>>  Rewrite of the "rebase" machinery in C.
>>
>>  Will merge to 'next'.
>
> On a recent mail[1], I had suggested stalling this iteration in `pu`
> and wait for another iteration to merge in `next`.
>
> [1]: 
> https://public-inbox.org/git/CAOZc8M_FJmMCEB1MkJrBRpLiFjy8OTEg_MxoNQTh5-brHxR-=a...@mail.gmail.com/

Thanks for a prompt response and correction.  Very much appreciated.

Re: What's cooking in git.git (Aug 2018, #01; Thu, 2)

Stefan Beller  writes:

> On Thu, Aug 2, 2018 at 4:02 PM Junio C Hamano  wrote:
>
>> * sb/config-write-fix (2018-08-01) 3 commits
>>  - git-config: document accidental multi-line setting in deprecated syntax
>>  - config: fix case sensitive subsection names on writing
>>  - t1300: document current behavior of setting options
>>
>>  Recent update to "git config" broke updating variable in a
>>  subsection, which has been corrected.
>>
>>  Not quite?
>>  cf. 
>
> I'd rather point to
> https://public-inbox.org/git/xmqqftzx67vo@gitster-ct.c.googlers.com/
> https://public-inbox.org/git/xmqqva8t4s63@gitster-ct.c.googlers.com/
> instead (reason: shoddiness),

Thanks; I do not think the series was particulary shoddy, but does
deserve a bit more polishing.

> Personally I do not want to care about the old notation
> and by implementing it the way the series is, the
> old notation doesn't see any *changes*.

Yup, I agree that it is good enough.

>> * ds/commit-graph-with-grafts (2018-07-19) 8 commits
>>   (merged to 'next' on 2018-08-02 at 0ee624e329)
>>  + commit-graph: close_commit_graph before shallow walk
>>  + commit-graph: not compatible with uninitialized repo
>>  + commit-graph: not compatible with grafts
>>  + commit-graph: not compatible with replace objects
>>  + test-repository: properly init repo
>>  + commit-graph: update design document
>>  + refs.c: upgrade for_each_replace_ref to be a each_repo_ref_fn callback
>>  + refs.c: migrate internal ref iteration to pass thru repository argument
>>
>>  The recently introduced commit-graph auxiliary data is incompatible
>>  with mechanisms such as replace & grafts that "breaks" immutable
>>  nature of the object reference relationship.  Disable optimizations
>>  based on its use (and updating existing commit-graph) when these
>>  incompatible features are in use in the repository.
>
> Makes sense as a whole, but I dislike the first 2 patches
> (they were my suggestion) for the refs API. I plan to re send patches
> https://public-inbox.org/git/20180730194731.220191-1-sbel...@google.com/
> but fixed for real.
>
> (do not let this stop you from merging down this series)

Well, I do not think we are in a particular hurry to get this down
to 'master', and honestly I'd feel safer to cook a topic that has
potential impact to the core longer in 'next' than other things like
this one (the distinction between the "core" and "other" being how
many things are potentially affected, and because commit-graph is
being integrated into history walking, a bug in the subsystem has a
lot bigger impact than say "rebase -i" that breaks "rebase -i
--root" by producing a malformed author-script file, whose impact
may be severe but limited).

>> * sb/submodule-update-in-c (2018-07-18) 6 commits
>>  - submodule--helper: introduce new update-module-mode helper
>>  - builtin/submodule--helper: factor out method to update a single submodule
>>  - builtin/submodule--helper: store update_clone information in a struct
>>  - builtin/submodule--helper: factor out submodule updating
>>  - git-submodule.sh: rename unused variables
>>  - git-submodule.sh: align error reporting for update mode to use path
>>
>>  "git submodule update" is getting rewritten piece-by-piece into C.
>>
>>  Will merge to 'next'.
>
> Please do not, AFAICT this is still breaking in combination with the
> series merged at 7e25437d35a (Merge branch 'sb/submodule-core-worktree',
> 2018-07-18) and I do not recall fixing the interaction between those two.

Thanks for stopping me.

Re: [PATCH 1/1] verify-tag/verify-commit should exit unsuccessfully when signature is not trusted

On Fri, Aug 03, 2018 at 11:43:44AM -0400, Santiago Torres wrote:

> > This is not a deviation. GPG correctly recognizes difference between 
> > trusted,
> > untrusted and unknown levels. git on the other hand does not. Well it did 
> > until
> > the commit a4cc18f29. That one removed GPG exit code propagation.
> 
> Oh wow. Sorry my assumption parted from looking at the code back in the
> day where this happens. I assumed git was quietly propagating the gpg
> error code and took it from there. 
> 
> Now that I think about it though, verify tag can verify more than one
> tag. I assume that this would make it difficult to propagate individual
> errors in trusting. I honestly don't know what's the best way to modify
> this behavior then.

I think the only sensible thing is to err on the conservative side, and
return non-zero if we saw _any_ invalid signature.

I will note, though, that just checking the exit code of `verify-tag`
isn't really that thorough. It shows that there was _a_ signature, but
we don't know:

  - if it was an identity the user would expect to be signing tags

  - if it even matches the refname we used to find the tag

So I'd argue that any real verification needs to either have a human in
the loop, or implement a custom policy based on reading the full output.

I know we (and you specifically Santiago) talked about this a while ago,
and we ended up providing ways to get more information out of
verify-tag, so that a tool could sit on top of that and implement more
project-specific policy. I don't know offhand of any reusable tools that
do so, though.

-Peff

Re: [PATCH v2] checkout: optimize "git checkout -b "

2018-08-03 Thread Duy Nguyen

On Thu, Aug 02, 2018 at 02:02:00PM -0400, Ben Peart wrote:
> 
> 
> On 8/1/2018 11:10 AM, Duy Nguyen wrote:
> > On Tue, Jul 31, 2018 at 7:03 PM Ben Peart  wrote:
> >>
> >> From: Ben Peart 
> >>
> >> Skip merging the commit, updating the index and working directory if and
> >> only if we are creating a new branch via "git checkout -b ."
> >> Any other checkout options will still go through the former code path.
> > 
> > I'd like to see this giant list of checks broken down and pushed down
> > to smaller areas so that chances of new things being added but checks
> > not updated become much smaller. And ideally there should just be no
> > behavior change (I think with your change, "checkout -b" will not
> > report local changes, but it's not mentioned in the config document;
> > more things like that can easily slip).
> > 
> 
> One trade off of pushing these optimizations down into the lower-level 
> functions is that they have a greater potential to break other command 
> if our assumptions are wrong.  Changing these low level functions is a 
> much more invasive set of patches.
> 
> I didn't feel confident enough to pursue this path and instead, decided 
> to do the single, high level optimization around the specific scenario. 
> While it has its own drawbacks (the nasty set of conditions we're 
> testing), the potential for breaking other commands is much smaller.
> 
> That said, I'm willing to look into the model of pushing the 
> checks/optimizations down to smaller areas if we can 1) ensure we aren't 
> breaking other commands and 2) we can get similar performance.
> 
> > So. I assume this reason for this patch is because on super large worktrees
> > 
> >   - 2-way merge is too slow
> >   - applying spare checkout patterns on a huge worktree is also slow
> >   - writing index is, again, slow
> >   - show_local_changes() slow
> > 
> 
> That is pretty close but here is some real data on a large repo.
> 
> "git checkout -b " with this patch takes 0.32 seconds.
> "git checkout -b " without this patch takes 14.6 seconds.
> 
> Note, all numbers are with a hot disk cache - real world numbers for the 
> unpatched case can be much worse as it has to do a lot of disk IO to 
> read/write the 267 MB index, load 500K+ tree objects, etc:
> 
> Name  Inc % Inc
>   ||+ git!mainCRTStartup   89.2  13,380
>   || + git!__tmainCRTStartup   89.2  13,380
>   ||  + git!cmd_main   89.2  13,380
>   ||   + git!handle_builtin89.2  13,380
>   ||+ git!cmd_checkout 89.2  13,380
>   || + git!unpack_trees71.5  10,725
>   || |+ git!traverse_trees 39.7   5,956
>   || |+ git!cache_tree_update  16.1   2,408
>   || |+ git!??unpack_callback  11.0   1,649
>   || |+ git!discard_index   2.8 423
>   || + git!write_locked_index   8.4   1,257
>   || + git!??cache_tree_invalidate_path 5.1 767
>   || + git!read_index_preload   3.4 514
> 
> > For 2-way merge, I believe we can detect inside unpack_trees() that
> > it's a 2-way merge (fn == twoway_merge), from HEAD to HEAD (simple
> > enough check), then from the 2-way merge table we know for sure
> > nothing is going to change and we can just skip traverse_trees() call
> > in unpack_trees().
> > 
> 
> If we can skip the call to traverse_trees(), that will give us the bulk 
> of the savings (39.7% + 11% = 50.7% if my math is correct).

That cache_tree_invalidate_path() should belong to unpack_trees() as
well. At least I known unpack_trees() does that (and I'm pretty sure
it's useless to call it in this context), but maybe it's called
elsewhere too.

> 
> > On the sparse checkout application. This only needs to be done when
> > there are new files added, or the spare-checkout file has been updated
> > since the last time it's been used. We can keep track of these things
> > (sparse-checkout file change could be kept track with just stat info
> > maybe as an index extension) then we can skip applying sparse checkout
> > not for this particular case for but general checkouts as well. Spare
> > checkout file rarely changes. Big win overall.
> > 
> 
> With the current patch, we don't need to load or update the index at 
> all.  Without the patch, we've already replaced the standard 
> sparse-checkout logic with something significantly faster so in our 
> particular case, I think it's safe to skip the additional complexity of 
> keeping track of changes to the sparse-checkout file.
> 
> > And if all go according to plan, there will be no changes made in the
> > index (by either 2-way merge or sparse checkout stuff) we should be
> > able to just skip writing down the index, if we haven't done that
> > already.
> > 
> 
> That would be great as writing the index is

Re: [PATCH 1/2] config: document git config getter return value.

Jeff King  writes:

> This is returning the value of git_config_from_file(), which is 0/-1. I
> think it should be left where it is in the original, and not part of
> this block of functions.
>
> Other than that, the patch seems sane to me (I think the 0/1 return
> value from these "get" functions is unnecessarily inconsistent with the
> rest of Git, but changing it would be a pain. Documenting it is at least
> a step forward).

I do not think changing it would be in the scope of this series.

It makes sense to document (1) that zero is success, non-zero is
failure, to help those who are reading the current callers and
adding new callers, and (2) that we are aware that the exact
"non-zero" value chosen for these are not in line with the rest of
the system.  The latter can just be a "NEEDSWORK" comment.

Re: [PATCH v3 1/2] sequencer: handle errors in read_author_ident()

Eric Sunshine  writes:

> I think this patch can be simplified considerably by shifting one's
> perspective. If we admit that read_author_ident() is already correctly
> reporting an error by returning NULL (which is exactly what it is
> doing), then the bug is is purely on the calling side; namely, the
> caller is ignoring the error. (In fact, your commit message already
> states this.)

This approach looks quite sensible.

Re: [PATCH 2/3] config: fix case sensitive subsection names on writing

Stefan Beller  writes:

> And *technically* the two level is old style, so I figured it's ok.

If we call the bit not after the recentness of the style but after
what it is about, e.g. "is the section name as a whole (including
its possible subsection part) case insensitive?", then yes, a
two-level name can be treated exactly the same way as the old style
names with a subsection.  Perhaps we should do that rename to save
us from future confusion.

>> > - !strncasecmp(cf->var.buf, store->key, 
>> > store->baselen);
>> > + !cmpfn(cf->var.buf, store->key, store->baselen);
>>
>> OK.  Section names should still be case insensitive (only the case
>> sensitivity of subsection names is special), but presumably that's
>> already normalized by the caller so we do not have to worry when we
>> use strncmp()?  Can we add a test to demonstrate that it works
>> correctly?
>
> That was already demonstrated (but not tested) in
> https://public-inbox.org/git/20180730230443.74416-4-sbel...@google.com/

Yeah, I also manually tried when I was playing with the old-style
names to see how well it works.  We would want to make sure this
won't get broken in the future, still.

Thanks.

Re: [PATCH 1/1] verify-tag/verify-commit should exit unsuccessfully when signature is not trusted

2018-08-03 Thread Santiago Torres

> > disable fetching keys from hkp servers. This way signature verification
> > should fail.
> > 
> > Thanks,
> > -Santiago.
> 
> This is not a deviation. GPG correctly recognizes difference between trusted,
> untrusted and unknown levels. git on the other hand does not. Well it did 
> until
> the commit a4cc18f29. That one removed GPG exit code propagation.

Oh wow. Sorry my assumption parted from looking at the code back in the
day where this happens. I assumed git was quietly propagating the gpg
error code and took it from there. 

Now that I think about it though, verify tag can verify more than one
tag. I assume that this would make it difficult to propagate individual
errors in trusting. I honestly don't know what's the best way to modify
this behavior then.

Thanks,
-Santiago


signature.asc
Description: PGP signature

Re: Re* [PATCH] push: comment on a funny unbalanced option help

René Scharfe  writes:

> Am 02.08.2018 um 22:36 schrieb Junio C Hamano:
>> Ævar Arnfjörð Bjarmason  writes:
>> 
>>> Thanks, FWIW that's fine by me, and also if you want to drop this "fake"
>>> patch of mine and replace it with something René came up with (I have
>>> not yet read his 1-6 patches submitted on this topic, so maybe they're
>>> not mutually exclusive).
>> 
>> I think the 6-patch series supersedes this one.  Thanks anyway.
>
> Actually no, I specifically avoided touching builtin/push.c because this
> patch already handled it; it should still be applied before patch 6 from
> my series.

You're of course correct.  Thanks.

Re: [PATCH] push: comment on a funny unbalanced option help

René Scharfe  writes:

> Am 02.08.2018 um 22:01 schrieb Junio C Hamano:
>> 
>> Straying sideways into a tangent, but do we know if any locale wants
>> to use something other than "<>" as an enclosing braket around a
>> placeholder?
>
> Bulgarian seems to use capitals instead; here are some examples found
> with git grep -A1 'msgid "<' po/:
>
> po/bg.po:msgid ""
> po/bg.po-msgstr "ОТДАЛЕЧЕНО_ХРАНИЛИЩЕ"
> ...
>> 
>> Perhaps we should do _("<%s>") here?  That way, the result would
>> hopefully be made consistent with
>> 
>>  N_("[:]")  with LIT-ARG-HELP
>> 
>> which allows translator to use the bracket of the locale's choice.

I did not consider locales that do not use any kind of bracket, but
I guess _("<%s>") would work for them, too, in this context.  When
the locale's convention is to upcase, for example, the arg-help text
that is not marked with PARSE_OPT_LITERAL_ARGHELP would be upcased,
and _("<%s>") in the usage_argh() can be translated to "%s", which
passes the translated (i.e. upcased) arg-help text through.

Re: [PATCH v1 01/25] structured-logging: design document

2018-08-03 Thread Ben Peart





On 7/13/2018 12:55 PM, g...@jeffhostetler.com wrote:

From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
  Documentation/technical/structured-logging.txt | 816 +
  1 file changed, 816 insertions(+)
  create mode 100644 Documentation/technical/structured-logging.txt

diff --git a/Documentation/technical/structured-logging.txt 
b/Documentation/technical/structured-logging.txt
new file mode 100644
index 000..794c614
--- /dev/null
+++ b/Documentation/technical/structured-logging.txt
@@ -0,0 +1,816 @@
+Structured Logging
+==
+
+Structured Logging (SLOG) is an optional feature to allow Git to
+generate structured log data for executed commands.  This includes
+command line arguments, command run times, error codes and messages,
+child process information, time spent in various critical functions,
+and repository data-shape information.  Data is written to a target
+log file in JSON[1,2,3] format.
+
+SLOG is disabled by default.  Several steps are required to enable it:
+
+1. Add the compile-time flag "STRUCTURED_LOGGING=1" when building git
+   to include the SLOG routines in the git executable.
+


Is the intent to remove this compile-time flag before this is merged? 
With it off by default in builds, the audience for this is limited to 
those who build their own/custom versions of git. I can see other 
organizations wanting to use this that don't have a custom fork of git 
they build and install on their users machines.


Like the GIT_TRACE mechanism today, I think this should be compiled in 
but turned off via the default settings by default.



+2. Set "slog.*" config settings[5] to enable SLOG in your repo.
+
+
+Motivation
+==
+
+Git users may be faced with scenarios that are surprisingly slow or
+produce unexpected results.  And Git developers may have difficulty
+reproducing these experiences.  Structured logging allows users to
+provide developers with additional usage, performance and error data
+that can help diagnose and debug issues.
+
+Many Git hosting providers and users with many developers have bespoke
+efforts to help troubleshoot problems; for example, command wrappers,
+custom pre- and post-command hooks, and custom instrumentation of Git
+code.  These are inefficient and/or difficult to maintain.  The goal
+of SLOG is to provide this data as efficiently as possible.
+
+And having structured rather than free format log data, will help
+developers with their analysis.
+
+
+Background (Git Merge 2018 Barcelona)
+=
+
+Performance and/or error logging was discussed during the contributor's
+summit in Barcelona.  Here are the relevant notes from the meeting
+minutes[6].
+
+> Performance misc (Ævar)
+> ---
+> [...]
+>  - central error reporting for git
+>- `git status` logging
+>- git config that collects data, pushes to known endpoint with `git push`
+>- pre_command and post_command hooks, for logs
+>- `gvfs diagnose` that looks at packfiles, etc
+>- detect BSODs, etc
+>- Dropbox writes out json with index properties and command-line
+>information for status/fetch/push, fork/execs external tool to upload
+>- windows trace facility; would be nice to have cross-platform
+>- would hosting providers care?
+>- zipfile of logs to give when debugging
+>- sanitizing data is harder
+>- more in a company setting
+>- fileshare to upload zipfile
+>- most of the errors are proxy when they shouldn't, wrong proxy, proxy
+>specific to particular URL; so upload endpoint wouldn't work
+>- GIT_TRACE is supposed to be that (for proxy)
+>- but we need more trace variables
+>- series to make tracing cheaper
+>- except that curl selects the proxy
+>- trace should have an API, so it can call an executable
+>- dump to .git/traces/... and everything else happens externally
+>- tools like visual studio can't set GIT_TRACE, so
+>- sourcetree has seen user environments where commands just take forever
+>- third-party tools like perf/strace - could we be better leveraging 
those?
+>- distribute turn-key solution to handout to collect more data?
+


While it makes sense to have clear goals in the design document, the 
motivation and background sections feel somehow out of place.  I'd 
recommend you clearly articulate the design goals and drop the 
background data that led you to the goals.



+
+A Quick Example
+===
+
+Note: JSON pretty-printing is enabled in all of the examples shown in
+this document.  When pretty-printing is turned off, each event is
+written on a single line.  Pretty-printing is intended for debugging.
+It should be turned off in production to make post-processing easier.
+ > +$ git config slog.pretty 
+


nit - I'd move this "Note:" section to the end of your "A Quick 
Example." While it's good to understand about pretty printing, its not 
the first or most important

Re: [PATCH/RFC] clone: report duplicate entries on case-insensitive filesystems

2018-08-03 Thread Torsten Bögershausen

On 2018-08-02 15:43, Duy Nguyen wrote:
> On Wed, Aug 1, 2018 at 11:20 PM Junio C Hamano  wrote:
>>
>> Junio C Hamano  writes:
>>
>>> Jeff King  writes:
>>>
> Presumably we are already in an error codepath, so if it is
> absolutely necessary, then we can issue a lstat() to grab the inum
> for the path we are about to create, iterate over the previously
> checked out paths issuing lstat() and see which one yields the same
> inum, to find the one who is the culprit.

 Yes, this is the cleverness I was missing in my earlier response.

 So it seems do-able, and I like that this incurs no cost in the
 non-error case.
>>>
>>> Not so fast, unfortunately.
>>>
>>> I suspect that some filesystems do not give us inum that we can use
>>> for that "identity" purpose, and they tend to be the ones with the
>>> case smashing characteristics where we need this code in the error
>>> path the most X-<.
>>
>> But even if inum is unreliable, we should be able to use other
>> clues, perhaps the same set of fields we use for cached stat
>> matching optimization we use for "diff" plumbing commands, to
>> implement the error report.  The more important part of the idea is
>> that we already need to notice that we need to remove a path that is
>> in the working tree while doing the checkout, so the alternative
>> approach won't incur any extra cost for normal cases where the
>> project being checked out does not have two files whose pathnames
>> are only different in case (or checking out such an offending
>> project to a case sensitive filesytem, of course).
>>
>> So I guess it still _is_ workable.  Any takers?
> 
> OK so we're going back to the original way of checking that we check
> out the different files on the same place (because fs is icase) and
> try to collect all paths for reporting, yes?

I would say: Yes.

> I can give it another go
> (but of course if anybody else steps up, I'd very gladly hand this
> over)
> 

Not at the moment.

Re: [PATCH v3 2/2] sequencer: fix quoting in write_author_script

2018-08-03 Thread Phillip Wood

Hi Eric
On 03/08/18 11:02, Eric Sunshine wrote:

On Fri, Aug 3, 2018 at 5:33 AM Phillip Wood wrote:

If there isn't some backward compatibility then if git gets upgraded
while rebase is stopped then the author data will be silently corrupted
if it contains "'". read_author_ident() will error out but that is only
used for the root commit. read_env_script() which is used for normal
picks will not dequote the badly quoted value correctly and will not
return an error. It is unlikely but possible, I'll leave it to Junio to
decide if it is worth it

If I understand correctly, the approach you implemented earlier[1]
(perhaps coupled with the more robust detection suggested here[2])
would be sufficient to handle this backward compatibility concern.
While it may not be as pretty or generalized as the current patch, it
involves far less machinery, thus is less likely to harbor its own
bugs. The earlier version is also much more self-contained, which
makes it easier to drop at some point when backward compatibility is
no longer a concern (if ever).

Yes I think the earlier approach with the more robust detection you
suggested is probably a good compromise. Junio does that sound good to you?

Best Wishes

Phillip

There is a precedent for adding backwards compatibility 84df4560ed
("rebase: extract code for writing basic state", 2011-02-06) though it
is much simpler.

Indeed, it is much simpler, adding a one-liner 'else' case to an
'if-then' for backward compatibility. Your earlier implementation[1]
was pretty much the equivalent, just adding an extra one-liner arm to
an 'if-then' statement.

The bug fix itself is important, and, while I do favor the cleaner
approach of not worrying about backward compatibility for this fairly
unlikely case, your earlier version seems a better compromise between
having no backward compatibility and the much more heavyweight version
implemented here.

Anyhow, I'm fine with whatever Junio decides.

[1]:
https://public-inbox.org/git/2018073532.9358-3-phillip.w...@talktalk.net/
[2]:
https://public-inbox.org/git/capig+ctttbv2fjnos_sztwh2j4wwzsbk+48babt1cv0utyn...@mail.gmail.com/

Re: Squash & Merge Implementation

On Fri, Aug 03, 2018 at 03:01:15PM +0300, Vadim Belov wrote:

> 1. This merges successfully without squash:
> git checkout origin/master
> git merge ${PR-Branch}
> git push origin HEAD:master
> git push origin --delete ${PR-Branch}

Right, this is a normal merge.

> 2. This closes the PR, but there is no update seen on master:
> git checkout origin/master
> git merge --squash --commit ${PR-Branch}
> git push origin HEAD:master
> git push origin --delete ${PR-Branch}

Doing "merge --squash --commit" doesn't do what you expect; namely
"--commit" does not override the non-committing nature of "--squash". It
only override a "--no-commit" found elsewhere.

IMHO this is something that could be improved in Git (i.e., telling the
difference between "the user did not say --no-commit" and "the user said
--commit" and respecting it for --squash).

But that explains what you see. The push to master is a noop, since you
didn't make a new commit. And then deleting the PR branch on GitHub
auto-closes the PR.

> 3. This fails to push to master with the error "GH006: Protected branch
> update failed"  (despite that the PR is set to SUCCESS):
> git checkout origin/master
> git merge --squash ${PR-Branch}
> git commit -am"comment"
> git push origin HEAD:${m_mainBranch}
> git push origin --delete ${m_prBranch}

So here you _do_ make an actual commit. But that commit doesn't look
like a merge to the receiver; it just looks like a single commit that
has all the changes there were on PR-Branch.

The tree of that commit should be the same tree that would result from a
real merge. So in theory things like protected-branch status checks
could handle that, but I suspect they use the actual commit id (the tree
id is fine if you're just doing CI, but if you wanted to have a status
check for commit messages, say, you'd obviously want that to be tied to
the actual commit object).

I don't offhand recall how that is implemented (and you could also be
falling afoul of other checks, like required reviews). But this is a
GitHub-specific question, and you should probably ask GitHub support to
go further.

-Peff

Re: [PATCH 1/1] verify-tag/verify-commit should exit unsuccessfully when signature is not trusted

2018-08-03 Thread Karel Kočí

On Tue, Jul 31, 2018 at 08:25:47PM -0400, Santiago Torres wrote:
> On Wed, Aug 01, 2018 at 12:19:42AM +, brian m. carlson wrote:
> > On Tue, Jul 31, 2018 at 10:05:22PM +0200, Vojtech Myslivec wrote:
> > > Hello,
> > > 
> > > me and my colleague are struggling with automation of verifying git
> > > repositories and we have encountered that git verify-commit and
> > > verify-tag accepts untrusted signatures and exit successfully.
> > 
> > I don't have strong feelings on your change one way or the other, but
> > for automation it may be useful to use the --raw flag, which gives you
> > the raw gpg output and much greater control.  For example, you can
> > require that a subkey is or is not used or require certain algorithms.
> > 
> > I will say that most signatures are untrusted in my experience, so
> > unless people are using TOFU mode or making local signatures, git will
> > exit nonzero for most signatures.  I think the current status is to exit
> > on a good signature, even if it isn't necessarily a valid signature.
> > 
> > I'm interested to hear others' thoughts on this.
> 
> I'd find it odd that we deviate from the gpg behavior, that returns 0
> when verifyng an untrusted signatures. Tooling around gpg is generally
> difficult for this reason, but using the raw output should be enough to
> discard signatures with untrusted keys.
> 
> Another alternative is to use a keyring with trusted keys *only* and
> disable fetching keys from hkp servers. This way signature verification
> should fail.
> 
> Thanks,
> -Santiago.

This is not a deviation. GPG correctly recognizes difference between trusted,
untrusted and unknown levels. git on the other hand does not. Well it did until
the commit a4cc18f29. That one removed GPG exit code propagation.

I think that core of the problem is that git considers both 'TRUST_NEVER' and
'TRUST_UNDEFINED' as being the same. In git they both should result in error or 
at
lest 'TRUST_NEVER' should (the same way as it does with just GPG).

There is an illustration of difference of untrusted and unknown key verification
at the end of this mail. You can test it just by using your own GPG key (just 
use
our fingerprint instead).

My and my colleague's expectations are that 'git verify-commit branch' would
handle the commit on the tip of `branch` the  same way as 'git merge
--verify-signature branch'.

It is also confusing that untrusted key is ok but expired or missing one results
in error. GPG's primary usage is for building web of trust. And I think that
untrusted keys are more problematic in that sense than expired ones.

I also disagree with idea that using --raw should be only way how to check if we
trust a key. It is handy in scripts for sure if some additional info about
signature is required but I think that it should not be primary way to check for
signature validity. Exit code should serve that purpose the same way as in case 
of
GPG itself.

Small example would be that to replace current GPG behavior in portable way
(shells such as dash or ash) it is required to do at least this:

set -e
RES="$(git verify-tag --raw v1.1 2>&1)"
! echo "$RES" | grep -q "^\[GNUPG:\] TRUST_NEVER"

And that will for sure get more complicated when I use other trust levels and/or
trust database and different trust model but direct.

Overall I thing that deviation from returning GPG exit code was not the best 
and I
don't understand why it was done. Decision if signature validity is reported as
error should be on GPG not on git specially when there is almost no 
configuration
possibility for GPG in git.

Note that our proposed change is conservative and drawn as a unification of
behavior of pull/merge and verify-*. To make it really optimal for automation it
would be best to return exit code of GPG command.

With regards
Karel Kočí

Illustration of difference between unknown and untrusted key:

$ gpg --sign file
$ mkdir -m 700 gpgt
$ export GNUPGHOME=$PWD/gpgt
$ echo 'trust-model:0:"direct' | gpgconfg --change-options gpg
$ gpg --recv-keys 2B1F70F95F1B48DA2265A7FAA6BC8B8CEB31659B
$ gpg --list-key
/home/cynerd/gpgt/pubring.kbx
-
pub   rsa2048 2016-07-07 [SC]
  2B1F70F95F1B48DA2265A7FAA6BC8B8CEB31659B
uid   [ unknown] Karel Kočí (nic.cz) 
uid   [ unknown] Karel Kočí (cynerd.cz) 
uid   [ unknown] Karel Kočí (sh.cvut.cz) 
uid   [ unknown] Karel Kočí (fel.cvut.cz) 
sub   rsa2048 2016-07-07 [E]
sub   rsa4096 2016-09-27 [S]
$ gpg --verify file.gpg
gpg: Signature made Fri 03 Aug 2018 02:01:49 PM CEST
gpg:using RSA key 46D715A0ED0E0C433CBF5963D83BD732AC2BD828
gpg:issuer "karel.k...@nic.cz"
gpg: Good signature from "Karel Kočí (nic.cz) " [marginal]
gpg: aka "Karel Kočí (cynerd.cz) " [marginal]
gpg: aka "Karel Kočí (sh.cvut.cz) " 
[marginal]
gpg: aka "Karel Kočí (fel.cvut.cz) " 
[marginal]
gpg: WARNING: This key is not certified with sufficiently trusted

[RFC] broken test for git am --scissors

2018-08-03 Thread Andrei Rybak

I was tweaking is_scissors_line function in mailinfo.c and tried writing
some tests for it.  And discovered that existing test for git am option
--scissors is broken.  I then confirmed that by intentionally breaking
is_scissors_line, like this:

--- 8< ---
Subject: [PATCH 1/2] mailinfo.c: intentionally break is_scissors_line

It seems that tests for "git am" don't actually test the --scissor
option logic.  Break is_scissors_line function by using bogus symbols to
be able to check the tests.

Note that test suite does not pass with this patch applied.  The
expected failure does not happen.
---
 mailinfo.c| 4 ++--
 t/t4150-am.sh | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 3281a37d5..7938d85e3 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -682,8 +682,8 @@ static int is_scissors_line(const char *line)
perforation++;
continue;
}
-   if ((!memcmp(c, ">8", 2) || !memcmp(c, "8<", 2) ||
-!memcmp(c, ">%", 2) || !memcmp(c, "%<", 2))) {
+   if ((!memcmp(c, ">o", 2) || !memcmp(c, "o<", 2) ||
+!memcmp(c, ">/", 2) || !memcmp(c, "/<", 2))) {
in_perforation = 1;
perforation += 2;
scissors += 2;
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index 1ebc587f8..59bcb5afd 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -412,7 +412,8 @@ test_expect_success 'am with failing post-applypatch hook' '
test_cmp head.expected head.actual
 '
 
-test_expect_success 'am --scissors cuts the message at the scissors line' '
+# Test should fail, but succeeds
+test_expect_failure 'am --scissors cuts the message at the scissors line' '
rm -fr .git/rebase-apply &&
git reset --hard &&
git checkout second &&

--- >8 ---

Here's a proof-of-concept patch for the test, to make it actually fail
when is_scissors_line is broken.  It is the easiest way to do so, that I
could come up with, it is not ready to be applied.  I think the two
tests for --scissors should be rewritten pretty much from scratch, with
more obvious naming of files used.

(I made the changes to files in both tests the same just to be able to
re-use file "no-scissors-patch.eml", it's not relevant to the scissor
line in the commit messages.)

--- 8< ---
Subject: [PATCH 2/2] t4150-am.sh: fix test for --scissors

Test for option --scissors should check that the eml file with a scissor
line inside will be cut up and only the part under the cut will be
turned into commit.

However, the test for --scissors generates eml file without such line.
Fix the test for --scissors option.

Signed-off-by: Andrei Rybak 
---
 t/t4150-am.sh | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index 59bcb5afd..5ad71fe05 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -69,17 +69,22 @@ test_expect_success 'setup: messages' '
 
EOF
 
+   cat >new-scissors-msg <<-\EOF &&
+   Subject: Test git-am with scissors line
+
+   This line should be included in the commit message.
+   EOF
+
cat >scissors-msg <<-\EOF &&
Test git-am with scissors line
 
This line should be included in the commit message.
EOF
 
-   cat - scissors-msg >no-scissors-msg <<-\EOF &&
+   cat - new-scissors-msg >no-scissors-msg <<-\EOF &&
This line should not be included in the commit message with --scissors 
enabled.
 
 - - >8 - - remove everything above this line - - >8 - -
-
EOF
 
signoff="Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
@@ -148,15 +153,15 @@ test_expect_success setup '
} >patch1-hg.eml &&
 
 
-   echo scissors-file >scissors-file &&
-   git add scissors-file &&
+   echo file >file &&
+   git add file &&
git commit -F scissors-msg &&
git tag scissors &&
git format-patch --stdout scissors^ >scissors-patch.eml &&
git reset --hard HEAD^ &&
 
-   echo no-scissors-file >no-scissors-file &&
-   git add no-scissors-file &&
+   echo file >file &&
+   git add file &&
git commit -F no-scissors-msg &&
git tag no-scissors &&
git format-patch --stdout no-scissors^ >no-scissors-patch.eml &&
@@ -417,7 +422,7 @@ test_expect_failure 'am --scissors cuts the message at the 
scissors line' '
rm -fr .git/rebase-apply &&
git reset --hard &&
git checkout second &&
-   git am --scissors scissors-patch.eml &&
+   git am --scissors no-scissors-patch.eml &&
test_path_is_missing .git/rebase-apply &&
git diff --exit-code scissors &&
test_cmp_rev scissors HEAD
--

--- >8 ---

Relevant old threads:

1. 
https://public-inbox.org/git/1435861000-25278-11-git-send-email-pyoka...@gmail.com
2.

Re: Question regarding quarantine environments

On Fri, Aug 03, 2018 at 03:25:08PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I'd be a bit careful with that, though, as the definition of "new" is
> > vague there.
> >
> > For example, completing a thin pack may mean that the receiver creates a
> > copy of a base object found in the main repo. That object isn't new as
> > part of the push, nor was it even sent on the wire, but it will appear
> > in the quarantine directory. But only sometimes, depending on whether we
> > kept the sender's pack or exploded it to loose objects.
> 
> Right, I mean:
> 
> is_new = !in_quarantine() && in_main()
> 
> Or:
> 
> is_new = !in_main()
> 
> Should work, in the latter case if the object really is missing from the
> quarnatine too, other fsck bits will stop the push.

Ah, OK. Yes, I agree that should work to cover new objects (including
ones that the other side but aren't actually needed to update the refs,
though hopefully that is rare).

There may also be other object stores, if the main repository used
alternates (or if somebody set GIT_ALTERNATE_OBJECT_DIRECTORIES). You
can probably disregard that, though, as:

  1. If you ignore the main repo, presumably you ignore its
 recursive info/alternates, too.

  2. The easy mechanism for ignoring the main repo is to ignore
 GIT_ALTERNATE_OBJECT_DIRECTORIES, so you'd already be handling
 that.

> But as you point out:
> 
> is_new = in_quarantine()
> 
> Cannot be relied upon, although it'll be true most of the time.
> 
> Perhaps I'm missing some edge case above, but I wanted to reword it to
> make sure I understood it correctly (and perhaps you have a correction).

Nope, I just didn't think through what you were saying carefully enough. ;)

-Peff

Re: [GSoC][PATCH v4 15/20] rebase -i: rewrite write_basic_state() in C

On Mon, Jul 30, 2018 at 08:25:16PM +0200, SZEDER Gábor wrote:

> > diff --git a/sequencer.c b/sequencer.c
> > index 1c035ceec7..d257903db0 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> 
> > +int write_basic_state(struct replay_opts *opts, const char *head_name,
> > + const char *onto, const char *orig_head)
> > +{
> > +   const char *quiet = getenv("GIT_QUIET");
> > +
> > +   if (head_name)
> > +   write_file(rebase_path_head_name(), "%s\n", head_name);
> > +   if (onto)
> > +   write_file(rebase_path_onto(), "%s\n", onto);
> > +   if (orig_head)
> > +   write_file(rebase_path_orig_head(), "%s\n", orig_head);
> > +
> > +   if (quiet)
> > +   write_file(rebase_path_quiet(), "%s\n", quiet);
> > +   else
> > +   write_file(rebase_path_quiet(), "");
> 
> This is not a faithful conversion of the original.  git-rebase.sh writes
> this 'quiet' file with:
> 
>   echo "$GIT_QUIET" > "$state_dir"/quiet
> 
> which means that a single newline character was written even when
> $GIT_QUIET was unset/empty.

write_file() will call strbuf_complete_line(), so even passing "" will
result in a file with a newline in it (I didn't dig, but this comes from
e7ffa38c in 2015, so it may well have been a response to the breakage
you were thinking of).

So actually all of the "%s\n" here can be just "%s".

But there _is_ a reason not to use "", which is that it triggers
-Wformat-zero-length (which is part of -Wall unless you explicitly turn
it off, which our DEVELOPER=1 setup does). For a long time you _had_ to
do that, but these days we're actually clean with respect to that
warning.

So using "\n" here is better, and likewise here:

> > +   if (opts->verbose)
> > +   write_file(rebase_path_verbose(), "");

Unless we really do want a zero-length file, in which case:

  write_file_buf(path, "", 0);

is the right option. That matches the shell version, but I'd be
surprised if it mattered, since it's clearly meant to be "if this file
exists, we're verbose".

-Peff

Re: Question regarding quarantine environments



On Fri, Aug 03 2018, Jeff King wrote:

> On Fri, Aug 03, 2018 at 02:56:11PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > Any Git commands you run should therefore find objects from either
>> > location, but any writes would go to the quarantine (most notably, Git's
>> > own index-pack/unpack-objects processes, which is the point of the
>> > quarantine in the first place).
>>
>> To add to this, one interesting thing that you can do with hooks because
>> of this quarantine is to answer certain questions about the push that
>> were prohibitively expensive before it existed, but there's no explicit
>> documentation for this.
>>
>> E.g. for a hook that wants to ban big blobs in the repo, but wants to
>> allow all existing blobs (you don't want to block e.g. a revert of a
>> commit that removed it from the checkout), you can juggle these two env
>> variables and hide the "main" object dir from the hook for some
>> operations, so e.g. if a blob lookup succeeds in the alternate
>> quarantine dir, but not the main object dir, you know it's new.
>
> I'd be a bit careful with that, though, as the definition of "new" is
> vague there.
>
> For example, completing a thin pack may mean that the receiver creates a
> copy of a base object found in the main repo. That object isn't new as
> part of the push, nor was it even sent on the wire, but it will appear
> in the quarantine directory. But only sometimes, depending on whether we
> kept the sender's pack or exploded it to loose objects.

Right, I mean:

is_new = !in_quarantine() && in_main()

Or:

is_new = !in_main()

Should work, in the latter case if the object really is missing from the
quarnatine too, other fsck bits will stop the push.

But as you point out:

is_new = in_quarantine()

Cannot be relied upon, although it'll be true most of the time.

Perhaps I'm missing some edge case above, but I wanted to reword it to
make sure I understood it correctly (and perhaps you have a correction).

Re: [PoC] coccinelle: make Coccinelle-related make targets more fine-grained

On Thu, Aug 02, 2018 at 11:21:44PM -0700, Jonathan Nieder wrote:

> > diff --git a/Makefile b/Makefile
> > index d616c0412..86fdcf567 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -2674,15 +2674,17 @@ COCCI_SOURCES = $(filter-out 
> > sha1collisiondetection/%,$(C_SOURCES))
> >  else
> >  COCCI_SOURCES = $(filter-out sha1dc/%,$(C_SOURCES))
> >  endif
> > +COCCI_COMBINED = contrib/coccinelle/combined.cocci
> 
> I like this approach.

I was pretty pleased with myself, too, but I had a lingering doubt about
whether just cat-ing the files was legitimate. It sounds from the
response elsewhere that it's not (but just happens to work now for out
limited case). But it also sounds like there may be even better options.

> > I guess you could even replace "COCCI_COMBINED" with "COCCI_PATCH" in
> > most of the targets, and that would let people do individual:
> > 
> >   make COCCI_PATCH=contrib/coccinelle/foo.cocci coccicheck
> 
> The issue here is that the dependencies for foo.cocci become
> unreliable, so I'd rather have a separate target for that (e.g.
> depending on FORCE) if we go that way.

Can you be more specific? I don't see how it's unreliable, unless you
mean that anything relying on "coccicheck" would depend on the exact
value of COCCI_PATCH.

But it may all be moot anyway, based no the responses elsewhere in the
thread.

-Peff

Re: [PATCH] negotiator/skipping: skip commits during fetch

2018-08-03 Thread Johannes Schindelin

Hi Jonathan,

On Fri, 27 Jul 2018, Johannes Schindelin wrote:

> On Thu, 26 Jul 2018, Jonathan Tan wrote:
> 
> > > On Mon, 16 Jul 2018, Jonathan Tan wrote:
> > > 
> > > >  t/t5552-skipping-fetch-negotiator.sh | 179 +++
> > > 
> > > This test seems to be failing consistently in the recent `pu` builds:
> > > 
> > > https://git-for-windows.visualstudio.com/git/_build/results?buildId=14337=logs
> > > 
> > > Could you have a look, please?
> > 
> > Hmm...on my Linux computer, this test passes on both pu (as of the time
> > of writing) and 838143aa5c ("Merge branch 'ab/newhash-is-sha256' into
> > pu", 2018-07-25) (pu at the time of that build, according to the website
> > you linked above). If you could rerun that test with additional code,
> > could you add a "cat trace" and show me what the client sends?
> 
> I can give you something even better: a playground. Just open a PR at
> https://github.com/gitgitgadget/git (all of the branches on gitster/git ar
> mirrored, including yours, I am sure, so you can target that branch
> specifically).
> 
> Once you open a Pull Request, it will automatically build and run the test
> suite on Windows, macOS and Linux. You will see it in the "checks" section
> on the bottom. Example for my range-diff series:
> 
> https://git-for-windows.visualstudio.com/git/_build/results?buildId=14279
> 
> For a quicker turnaround, you could add a commit that forces the `all`
> target in `t/Makefile` to run only your test.
> 
> > When I do that, the relevant parts are:
> > 
> >   packet:fetch> have 9ab46928dc282aa09f4dbf96893a252e058e7e8e
> >   packet:fetch> have dc824fafb05f3229aedf1f320bbe572e35364dfe
> >   packet:fetch> have caef059de69917b9119176a11b88afcef769331d
> >   packet:fetch> have 41bd8dc092ee110ba80e350a346ec507ab2e42a0
> >   packet:fetch> have e9a2c092a8e911567a377c881a7f6031e7f892ea
> >   packet:fetch> done
> > 
> > which is exactly as I (and the test) expect.
> > 
> > Two possible reasons for the discrepancy that I can think of offhand are
> > that (1) my computer generates different commits from your test system,
> > and (2) the priority queue pops commits in a different order. For (1),
> > that's not possible because the SHA-1s are the same (as can be seen by
> > comparing your link and the "have" lines I quoted above), and for (2),
> > the code seems OK:
> > 
> >   static int compare(const void *a_, const void *b_, void *unused)
> >   {
> > const struct entry *a = a_;
> > const struct entry *b = b_;
> > return compare_commits_by_commit_date(a->commit, b->commit, NULL);
> >   }
> > 
> > Let me know if you can observe the output of "cat trace" or if you have
> > any other ideas.
> 
> Like I said, you can use those "CI" builds, I think that would be more
> effective than if you waited for me to react, I am quite overwhelmed these
> days.

Hopefully you have a chance to do so. I got the impression that it is
actually more of a flakey test than a consistent test failure:

https://git-for-windows.visualstudio.com/git/_build/results?buildId=15015=logs

Ciao,
Dscho

Re: [PATCH 1/2] config: document git config getter return value.

On Thu, Aug 02, 2018 at 08:29:48PM -0700, Jonathan Nieder wrote:

> (cc-ing peff, config whiz)

Actually, this is all about the configset caching layer, which I never
really worked on. This is mostly from Tanay Abhra ,
who was a GSoC student mentored by Matthieu Moy .

That said...

> > +
> > +/*
> > + * These functions return 1 if not found, and 0 if found, leaving the found
> > + * value in the 'dest' pointer.
> > + */
> > +extern int git_configset_add_file(struct config_set *cs, const char 
> > *filename);
> 
> This function doesn't take a 'dest' argument.  Is the comment meant to
> apply to it?  If not, can the comment go below it?

This is returning the value of git_config_from_file(), which is 0/-1. I
think it should be left where it is in the original, and not part of
this block of functions.

Other than that, the patch seems sane to me (I think the 0/1 return
value from these "get" functions is unnecessarily inconsistent with the
rest of Git, but changing it would be a pain. Documenting it is at least
a step forward).

-Peff

Re: Question regarding quarantine environments

On Fri, Aug 03, 2018 at 02:56:11PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > Any Git commands you run should therefore find objects from either
> > location, but any writes would go to the quarantine (most notably, Git's
> > own index-pack/unpack-objects processes, which is the point of the
> > quarantine in the first place).
> 
> To add to this, one interesting thing that you can do with hooks because
> of this quarantine is to answer certain questions about the push that
> were prohibitively expensive before it existed, but there's no explicit
> documentation for this.
> 
> E.g. for a hook that wants to ban big blobs in the repo, but wants to
> allow all existing blobs (you don't want to block e.g. a revert of a
> commit that removed it from the checkout), you can juggle these two env
> variables and hide the "main" object dir from the hook for some
> operations, so e.g. if a blob lookup succeeds in the alternate
> quarantine dir, but not the main object dir, you know it's new.

I'd be a bit careful with that, though, as the definition of "new" is
vague there.

For example, completing a thin pack may mean that the receiver creates a
copy of a base object found in the main repo. That object isn't new as
part of the push, nor was it even sent on the wire, but it will appear
in the quarantine directory. But only sometimes, depending on whether we
kept the sender's pack or exploded it to loose objects.

-Peff

Re: Question regarding quarantine environments