[PATCH 3/3] pathspec: handle non-terminated strings with :(attr)

2018-11-01 Thread Jeff King
The pathspec code always takes names to be matched as a
name/namelen pair, but match_attrs() never looks at namelen,
and just treats "name" like a NUL-terminated string, passing
it to git_check_attr().

This usually works anyway. Every caller passes a
NUL-terminated string, and in all but one the passed-in
length is the same as the length of the string (the
exception is dir_path_match(), which may pass a smaller
length to drop a trailing slash). So we won't currently ever
read random memory, and the one case I found actually
happens to work correctly because the attr code can handle
the trailing slash itself.

But it's still worth addressing, as the function interface
implies that the name does not have to be NUL-terminated,
making this an accident waiting to happen.

Since teaching git_check_attr() to take a ptr/len pair would
be a big refactor, we'll just allocate a new string. We can
do this only when necessary, which avoids paying the cost
for most callers.

Signed-off-by: Jeff King 
---
 dir.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/dir.c b/dir.c
index 47c2fca8dc..ab6477d777 100644
--- a/dir.c
+++ b/dir.c
@@ -281,8 +281,15 @@ static int match_attrs(const struct index_state *istate,
   const struct pathspec_item *item)
 {
int i;
+   char *to_free = NULL;
+
+   if (name[namelen])
+   name = to_free = xmemdupz(name, namelen);
 
git_check_attr(istate, name, item->attr_check);
+
+   free(to_free);
+
for (i = 0; i < item->attr_match_nr; i++) {
const char *value;
int matched;
-- 
2.19.1.1336.g081079ac04


Understanding pack format

2018-11-01 Thread Farhan Khan
Hi all,

I am trying to understand the pack file format and have been reading
the documentation, specifically https://git-scm.com/docs/pack-format
(which is in git's own git repository as
"Documentation/technical/pack-format.txt"). I see that the file starts
with the "PACK" signature, followed by the 4 byte version and 4 byte
number of objects. After this, the documentation speaks about
Undeltified and Deltified representations. I understand conceptually
what each is, but do not know specifically how git parses it out.

Can someone please explain this to me? Is there any sample code of how
to interpret each entry? Where is this in the git code? That might
serve as a good guide.

I see a few references to "PACK_SIGNATURE", but not certain which
actually reads the data.

Thanks!
--
Farhan Khan
PGP Fingerprint: B28D 2726 E2BC A97E 3854 5ABE 9A9F 00BC D525 16EE


[PATCH 2/3] approxidate: handle pending number for "specials"

2018-11-01 Thread Jeff King
The approxidate parser has a table of special keywords like
"yesterday", "noon", "pm", etc. Some of these, like "pm", do
the right thing if we've recently seen a number: "3pm" is
what you'd think.

However, most of them do not look at or modify the
pending-number flag at all, which means a number may "jump"
across a significant keyword and be used unexpectedly. For
example, when parsing:

  January 5th noon pm

we'd connect the "5" to "pm", and ignore it as a
day-of-month. This is obviously a bit silly, as "noon"
already implies "pm". And other mis-parsed things are
generally as silly ("January 5th noon, years ago" would
connect the 5 to "years", but probably nobody would type
that).

However, the fix is simple: when we see a keyword like
"noon", we should flush the pending number (as we would if
we hit another number, or the end of the string). In a few
of the specials that actually modify the day, we can simply
throw away the number (saying "Jan 5 yesterday" should not
respect the number at all).

Note that we have to either move or forward-declare the
static pending_number() to make it accessible to these
functions; this patch moves it.

Signed-off-by: Jeff King 
---
 date.c  | 60 +++--
 t/t0006-date.sh |  1 +
 2 files changed, 34 insertions(+), 27 deletions(-)

diff --git a/date.c b/date.c
index 49f943e25b..7adce327a3 100644
--- a/date.c
+++ b/date.c
@@ -887,13 +887,42 @@ static time_t update_tm(struct tm *tm, struct tm *now, 
time_t sec)
return n;
 }
 
+/*
+ * Do we have a pending number at the end, or when
+ * we see a new one? Let's assume it's a month day,
+ * as in "Dec 6, 1992"
+ */
+static void pending_number(struct tm *tm, int *num)
+{
+   int number = *num;
+
+   if (number) {
+   *num = 0;
+   if (tm->tm_mday < 0 && number < 32)
+   tm->tm_mday = number;
+   else if (tm->tm_mon < 0 && number < 13)
+   tm->tm_mon = number-1;
+   else if (tm->tm_year < 0) {
+   if (number > 1969 && number < 2100)
+   tm->tm_year = number - 1900;
+   else if (number > 69 && number < 100)
+   tm->tm_year = number;
+   else if (number < 38)
+   tm->tm_year = 100 + number;
+   /* We screw up for number = 00 ? */
+   }
+   }
+}
+
 static void date_now(struct tm *tm, struct tm *now, int *num)
 {
+   *num = 0;
update_tm(tm, now, 0);
 }
 
 static void date_yesterday(struct tm *tm, struct tm *now, int *num)
 {
+   *num = 0;
update_tm(tm, now, 24*60*60);
 }
 
@@ -908,16 +937,19 @@ static void date_time(struct tm *tm, struct tm *now, int 
hour)
 
 static void date_midnight(struct tm *tm, struct tm *now, int *num)
 {
+   pending_number(tm, num);
date_time(tm, now, 0);
 }
 
 static void date_noon(struct tm *tm, struct tm *now, int *num)
 {
+   pending_number(tm, num);
date_time(tm, now, 12);
 }
 
 static void date_tea(struct tm *tm, struct tm *now, int *num)
 {
+   pending_number(tm, num);
date_time(tm, now, 17);
 }
 
@@ -953,6 +985,7 @@ static void date_never(struct tm *tm, struct tm *now, int 
*num)
 {
time_t n = 0;
localtime_r(, tm);
+   *num = 0;
 }
 
 static const struct special {
@@ -1110,33 +1143,6 @@ static const char *approxidate_digit(const char *date, 
struct tm *tm, int *num,
return end;
 }
 
-/*
- * Do we have a pending number at the end, or when
- * we see a new one? Let's assume it's a month day,
- * as in "Dec 6, 1992"
- */
-static void pending_number(struct tm *tm, int *num)
-{
-   int number = *num;
-
-   if (number) {
-   *num = 0;
-   if (tm->tm_mday < 0 && number < 32)
-   tm->tm_mday = number;
-   else if (tm->tm_mon < 0 && number < 13)
-   tm->tm_mon = number-1;
-   else if (tm->tm_year < 0) {
-   if (number > 1969 && number < 2100)
-   tm->tm_year = number - 1900;
-   else if (number > 69 && number < 100)
-   tm->tm_year = number;
-   else if (number < 38)
-   tm->tm_year = 100 + number;
-   /* We screw up for number = 00 ? */
-   }
-   }
-}
-
 static timestamp_t approxidate_str(const char *date,
   const struct timeval *tv,
   int *error_ret)
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 64ff86df8e..b7ea5fbc36 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -113,6 +113,7 @@ check_approxidate '3:00' '2009-08-30 03:00:00'
 check_approxidate '15:00' '2009-08-30 15:00:00'
 check_approxidate 'noon today' '2009-08-30 12:00:00'
 

[PATCH 1/3] rev-list: handle flags for --indexed-objects

2018-11-01 Thread Jeff King
When a traversal sees the --indexed-objects option, it adds
all blobs and valid cache-trees from the index to the
traversal using add_index_objects_to_pending(). But that
function totally ignores its flags parameter!

That means that doing:

  git rev-list --objects --indexed-objects

and

  git rev-list --objects --not --indexed-objects

produce the same output, because we ignore the UNINTERESTING
flag when walking the index in the second example.

Nobody noticed because this feature was added as a way for
tools like repack to increase their coverage of reachable
objects, meaning it would only be used like the first
example above.

But since it's user facing (and because the documentation
describes it "as if the objects are listed on the command
line"), we should make sure the negative case behaves
sensibly.

Signed-off-by: Jeff King 
---
 revision.c   | 15 +--
 t/t6000-rev-list-misc.sh |  7 +++
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/revision.c b/revision.c
index a1ddb9e11c..8e56c9641f 100644
--- a/revision.c
+++ b/revision.c
@@ -1321,13 +1321,14 @@ void add_reflogs_to_pending(struct rev_info *revs, 
unsigned flags)
 }
 
 static void add_cache_tree(struct cache_tree *it, struct rev_info *revs,
-  struct strbuf *path)
+  struct strbuf *path, unsigned int flags)
 {
size_t baselen = path->len;
int i;
 
if (it->entry_count >= 0) {
struct tree *tree = lookup_tree(revs->repo, >oid);
+   tree->object.flags |= flags;
add_pending_object_with_path(revs, >object, "",
 04, path->buf);
}
@@ -1335,14 +1336,15 @@ static void add_cache_tree(struct cache_tree *it, 
struct rev_info *revs,
for (i = 0; i < it->subtree_nr; i++) {
struct cache_tree_sub *sub = it->down[i];
strbuf_addf(path, "%s%s", baselen ? "/" : "", sub->name);
-   add_cache_tree(sub->cache_tree, revs, path);
+   add_cache_tree(sub->cache_tree, revs, path, flags);
strbuf_setlen(path, baselen);
}
 
 }
 
 static void do_add_index_objects_to_pending(struct rev_info *revs,
-   struct index_state *istate)
+   struct index_state *istate,
+   unsigned int flags)
 {
int i;
 
@@ -1356,13 +1358,14 @@ static void do_add_index_objects_to_pending(struct 
rev_info *revs,
blob = lookup_blob(revs->repo, >oid);
if (!blob)
die("unable to add index blob to traversal");
+   blob->object.flags |= flags;
add_pending_object_with_path(revs, >object, "",
 ce->ce_mode, ce->name);
}
 
if (istate->cache_tree) {
struct strbuf path = STRBUF_INIT;
-   add_cache_tree(istate->cache_tree, revs, );
+   add_cache_tree(istate->cache_tree, revs, , flags);
strbuf_release();
}
 }
@@ -1372,7 +1375,7 @@ void add_index_objects_to_pending(struct rev_info *revs, 
unsigned int flags)
struct worktree **worktrees, **p;
 
read_index(revs->repo->index);
-   do_add_index_objects_to_pending(revs, revs->repo->index);
+   do_add_index_objects_to_pending(revs, revs->repo->index, flags);
 
if (revs->single_worktree)
return;
@@ -1388,7 +1391,7 @@ void add_index_objects_to_pending(struct rev_info *revs, 
unsigned int flags)
if (read_index_from(,
worktree_git_path(wt, "index"),
get_worktree_git_dir(wt)) > 0)
-   do_add_index_objects_to_pending(revs, );
+   do_add_index_objects_to_pending(revs, , flags);
discard_index();
}
free_worktrees(worktrees);
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index fb4d295aa0..0507999729 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -90,11 +90,18 @@ test_expect_success 'rev-list can show index objects' '
9200b628cf9dc883a85a7abc8d6e6730baee589c two
EOF
echo only-in-index >only-in-index &&
+   test_when_finished "git reset --hard" &&
git add only-in-index &&
git rev-list --objects --indexed-objects >actual &&
test_cmp expect actual
 '
 
+test_expect_success 'rev-list can negate index objects' '
+   git rev-parse HEAD >expect &&
+   git rev-list -1 --objects HEAD --not --indexed-objects >actual &&
+   test_cmp expect actual
+'
+
 test_expect_success '--bisect and --first-parent can not be combined' '
test_must_fail git rev-list --bisect --first-parent HEAD
 '
-- 
2.19.1.1336.g081079ac04



[PATCH 0/3] mixed bag of -Wunused-parameter bugfixes

2018-11-01 Thread Jeff King
Here are three minor bug-fixes that can be applied independently. The
common thread is that I found them by looking at the results of
compiling with -Wunused-parameter. In each of these cases, the parameter
_should_ be used, and not doing so was a bug.

  [1/3]: rev-list: handle flags for --indexed-objects
  [2/3]: approxidate: handle pending number for "specials"
  [3/3]: pathspec: handle non-terminated strings with :(attr)

 date.c   | 60 ++--
 dir.c|  7 +
 revision.c   | 15 ++
 t/t0006-date.sh  |  1 +
 t/t6000-rev-list-misc.sh |  7 +
 5 files changed, 57 insertions(+), 33 deletions(-)

-Peff


Re: [PATCH 2/3] [Outreachy] ident: introduce set_fallback_ident() function

2018-11-01 Thread Junio C Hamano
Junio C Hamano  writes:

> As we currently have no idea when builtin/stash.c becomes ready for
> 'next', how about doing something like this instead, in order to
> help end-users without waiting in the meantime?  The fix can be
> picked up and ported when the C rewrite is updated, of course.

I think a better approach to fix the current C version, assuming
that a reroll won't change the structure of the code too much and
keeps using commit_tree() to synthesize the stash entries, would be
to teach commit_tree() -> commit_tree_extended() codepath to take
commiter identity just like it takes author identity.  Then inside
builtin/stash.c, we can choose what committer and author identity to
pass without having commit_tree_extended() ask for identity with the
STRICT option.  Right now, commit_tree() does not have a good way,
other than somehow lying to git_author/committer_info(), to record a
commit created under an arbitrary identity, which would be fixed
with such an approach and will help callers with similar needs in
the future.






What's cooking in git.git (Nov 2018, #02; Fri, 2)

2018-11-01 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

The eighth batch is all about "rebase and rebase-i in C" ;-)

The disruptive "let's split up the Documentation/config.txt file"
topic will enter 'next' soon, and I expect I'd be slow while it
skips over various topics in flight that changes the source it
touches, as they would produce conflicts with different shape and
won't be helped by rerere.  Let's merge it down quickly.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ag/rebase-i-in-c (2018-10-09) 20 commits
  (merged to 'next' on 2018-10-19 at 1b24712d46)
 + rebase -i: move rebase--helper modes to rebase--interactive
 + rebase -i: remove git-rebase--interactive.sh
 + rebase--interactive2: rewrite the submodes of interactive rebase in C
 + rebase -i: implement the main part of interactive rebase as a builtin
 + rebase -i: rewrite init_basic_state() in C
 + rebase -i: rewrite write_basic_state() in C
 + rebase -i: rewrite the rest of init_revisions_and_shortrevisions() in C
 + rebase -i: implement the logic to initialize $revisions in C
 + rebase -i: remove unused modes and functions
 + rebase -i: rewrite complete_action() in C
 + t3404: todo list with commented-out commands only aborts
 + sequencer: change the way skip_unnecessary_picks() returns its result
 + sequencer: refactor append_todo_help() to write its message to a buffer
 + rebase -i: rewrite checkout_onto() in C
 + rebase -i: rewrite setup_reflog_action() in C
 + sequencer: add a new function to silence a command, except if it fails
 + rebase -i: rewrite the edit-todo functionality in C
 + editor: add a function to launch the sequence editor
 + rebase -i: rewrite append_todo_help() in C
 + sequencer: make three functions and an enum from sequencer.c public
 (this branch is used by ag/sequencer-reduce-rewriting-todo, 
cb/printf-empty-format, js/rebase-autostash-fix, js/rebase-i-break, 
js/rebase-i-shortopt, js/rebase-in-c-5.5-work-with-rebase-i-in-c and 
pk/rebase-in-c-6-final.)

 Rewrite of the remaining "rebase -i" machinery in C.


* cb/printf-empty-format (2018-10-27) 1 commit
  (merged to 'next' on 2018-11-01 at 9fcb05f22c)
 + sequencer: cleanup for gcc warning in non developer mode
 (this branch uses ag/rebase-i-in-c; is tangled with 
ag/sequencer-reduce-rewriting-todo, js/rebase-autostash-fix, js/rebase-i-break, 
js/rebase-i-shortopt, js/rebase-in-c-5.5-work-with-rebase-i-in-c and 
pk/rebase-in-c-6-final.)

 Build fix for a topic in flight.


* jc/rebase-in-c-5-test-typofix (2018-10-11) 1 commit
  (merged to 'next' on 2018-10-19 at 08c9d86ffd)
 + rebase: fix typoes in error messages
 (this branch uses pk/rebase-in-c, pk/rebase-in-c-2-basic, 
pk/rebase-in-c-3-acts, pk/rebase-in-c-4-opts and pk/rebase-in-c-5-test; is 
tangled with js/rebase-autostash-fix, 
js/rebase-in-c-5.5-work-with-rebase-i-in-c and pk/rebase-in-c-6-final.)

 Typofix.


* js/rebase-autostash-fix (2018-10-24) 5 commits
  (merged to 'next' on 2018-10-29 at 680e648001)
 + rebase --autostash: fix issue with dirty submodules
 + rebase --autostash: demonstrate a problem with dirty submodules
 + rebase (autostash): use an explicit OID to apply the stash
 + rebase (autostash): store the full OID in /autostash
 + rebase (autostash): avoid duplicate call to state_dir_path()
 (this branch uses ag/rebase-i-in-c, 
js/rebase-in-c-5.5-work-with-rebase-i-in-c, pk/rebase-in-c, 
pk/rebase-in-c-2-basic, pk/rebase-in-c-3-acts, pk/rebase-in-c-4-opts, 
pk/rebase-in-c-5-test and pk/rebase-in-c-6-final; is tangled with 
ag/sequencer-reduce-rewriting-todo, cb/printf-empty-format, 
jc/rebase-in-c-5-test-typofix, js/rebase-i-break and js/rebase-i-shortopt.)

 "git rebase" that has recently been rewritten in C had a few issues
 in its "--autstash" feature, which have been corrected.


* js/rebase-i-break (2018-10-12) 2 commits
  (merged to 'next' on 2018-10-19 at 6db9b14495)
 + rebase -i: introduce the 'break' command
 + rebase -i: clarify what happens on a failed `exec`
 (this branch is used by js/rebase-i-shortopt; uses ag/rebase-i-in-c; is 
tangled with ag/sequencer-reduce-rewriting-todo, cb/printf-empty-format, 
js/rebase-autostash-fix, js/rebase-in-c-5.5-work-with-rebase-i-in-c and 
pk/rebase-in-c-6-final.)

 "git rebase -i" learned a new insn, 'break', that the user can
 insert in the to-do list.  Upon hitting it, the command returns
 control back to the user.


* js/rebase-i-shortopt (2018-10-26) 1 commit
  (merged to 'next' on 2018-11-01 at 4e9da19145)
 + rebase -i: recognize short commands without arguments
 (this branch uses ag/rebase-i-in-c and js/rebase-i-break; is tangled with 

[PATCH 2+3/3] stash: tolerate missing user identity

2018-11-01 Thread Junio C Hamano
The "git stash" command insists on having a usable user identity to
the same degree as the "git commit-tree" and "git commit" commands
do, because it uses the same codepath that creates commit objects
as these commands.

It is not strictly necesary to do so.  Check if we will barf before
creating commit objects and then supply fake identity to please the
machinery that creates commits.

This is not that much of usability improvement, as the users who run
"git stash" would eventually want to record their changes that are
temporarily stored in the stashes in a more permanent history by
committing, and they must do "git config user.{name,email}" at that
point anyway, so arguably this change is only delaying a step that
is necessary to work in the repository.

Signed-off-by: Junio C Hamano 
---

 * This time with a proposed commit log message and a flip to the
   test; this would be able to replce 2/3 and 3/3 without waiting
   for ps/stash-in-c to stabilize and become ready to be based on
   further work like this one.

   We need to extend the test so that when a reasonable identity is
   present, the stashes are created under that identity and not with
   the fallback one, which I do not think is tested with the previous
   step, so there still is a bit of room to improve [PATCH 1/3]

 git-stash.sh | 17 +
 t/t3903-stash.sh |  2 +-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/git-stash.sh b/git-stash.sh
index 94793c1a91..789ce2f41d 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -55,6 +55,20 @@ untracked_files () {
git ls-files -o $z $excl_opt -- "$@"
 }
 
+prepare_fallback_ident () {
+   if ! git -c user.useconfigonly=yes var GIT_COMMITTER_IDENT >/dev/null 
2>&1
+   then
+   GIT_AUTHOR_NAME="git stash"
+   GIT_AUTHOR_EMAIL=git@stash
+   GIT_COMMITTER_NAME="git stash"
+   GIT_COMMITTER_EMAIL=git@stash
+   export GIT_AUTHOR_NAME
+   export GIT_AUTHOR_EMAIL
+   export GIT_COMMITTER_NAME
+   export GIT_COMMITTER_EMAIL
+   fi
+}
+
 clear_stash () {
if test $# != 0
then
@@ -67,6 +81,9 @@ clear_stash () {
 }
 
 create_stash () {
+
+   prepare_fallback_ident
+
stash_msg=
untracked=
while test $# != 0
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index a9a573efa0..3dcf2f14d1 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1096,7 +1096,7 @@ test_expect_success 'stash --  works with binary 
files' '
test_path_is_file subdir/untracked
 '
 
-test_expect_failure 'stash works when user.name and user.email are not set' '
+test_expect_success 'stash works when user.name and user.email are not set' '
git reset &&
>1 &&
git add 1 &&
-- 
2.19.1-801-gd582ea202b



Re: [PATCH 2/3] [Outreachy] ident: introduce set_fallback_ident() function

2018-11-01 Thread Junio C Hamano
Junio C Hamano  writes:

> Rather than adding this fallback trap, can't we do it more like
> this?
>
> - At the beginning of "git stash", after parsing the command
>   line, we know what subcommand of "git stash" we are going to
>   run.
>
> - If it is a subcommand that could need the ident (i.e. the ones
>   that create a stash entry), we check the ident (e.g. make a
>   call to git_author/committer_info() ourselves) but without
>   STRICT bit, so that we can probe without dying if we need to
>   supply a fallback identy.
>
>   - And if we do need it, then setenv() the necessary
> environment variables and arrange the next call by anybody
> to git_author/committer_info() will get the fallback values
> from there.

As we currently have no idea when builtin/stash.c becomes ready for
'next', how about doing something like this instead, in order to
help end-users without waiting in the meantime?  The fix can be
picked up and ported when the C rewrite is updated, of course.

 git-stash.sh | 17 +
 1 file changed, 17 insertions(+)

diff --git a/git-stash.sh b/git-stash.sh
index 94793c1a91..789ce2f41d 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -55,6 +55,20 @@ untracked_files () {
git ls-files -o $z $excl_opt -- "$@"
 }
 
+prepare_fallback_ident () {
+   if ! git -c user.useconfigonly=yes var GIT_COMMITTER_IDENT >/dev/null 
2>&1
+   then
+   GIT_AUTHOR_NAME="git stash"
+   GIT_AUTHOR_EMAIL=git@stash
+   GIT_COMMITTER_NAME="git stash"
+   GIT_COMMITTER_EMAIL=git@stash
+   export GIT_AUTHOR_NAME
+   export GIT_AUTHOR_EMAIL
+   export GIT_COMMITTER_NAME
+   export GIT_COMMITTER_EMAIL
+   fi
+}
+
 clear_stash () {
if test $# != 0
then
@@ -67,6 +81,9 @@ clear_stash () {
 }
 
 create_stash () {
+
+   prepare_fallback_ident
+
stash_msg=
untracked=
while test $# != 0


Re: [PATCH v3] i18n: make GETTEXT_POISON a runtime option

2018-11-01 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON
> test parameter to only be a GIT_TEST_GETTEXT_POISON=
> runtime parameter, to be consistent with other parameters documented
> in "Running tests with special setups" in t/README.
> ...
>  #ifndef NO_GETTEXT
>  extern void git_setup_gettext(void);
>  extern int gettext_width(const char *s);
>  #else
>  static inline void git_setup_gettext(void)
>  {
> + use_gettext_poison();; /* getenv() reentrancy paranoia */
>  }

Too many "case/esac" statement in shell scripting?



Re: git appears to ignore GIT_CONFIG environment variable

2018-11-01 Thread Junio C Hamano
Sirio Balmelli  writes:

> It appears that git ignores the GIT_CONFIG environment variable,
> while git-config *does* consider it.

Yup, that is exactly how it is designed and documented.  These
dasys, with "git config" taking "--file" to work on any arbitrary
filename, you do not necessarily need GIT_CONFIG enviornment.



Re: [PATCH 2/3] [Outreachy] ident: introduce set_fallback_ident() function

2018-11-01 Thread Junio C Hamano
Slavica Djukic  writes:

> Usually, when creating a commit, ident is needed to record the author
> and commiter.
> But, when there is commit not intended to published, e.g. when stashing
> changes,  valid ident is not necessary.
> To allow creating commits in such scenario, let's introduce helper
> function "set_fallback_ident(), which will pre-load the ident.
>
> In following commit, set_fallback_ident() function will be called in stash.
>
> Signed-off-by: Slavica Djukic 
> ---

This is quite the other way around from what I expected to see.

I would have expected that a patch would introduce a new flag to
tell git_author/committer_info() functions that it is OK not to have
name or email given, and pass that flag down the callchain.

Anybody who knows that git_author/committer_info() will eventually
get called can instead tell these helpers not to fail (and yield a
substitute non-answer) beforehand with this function, instead of
passing a flag down to affect _only_ one callflow without affecting
others, using this new function.

I am not yet saying that being opposite from my intuition is
necessarily wrong, but the approach is like setting a global
variable that affects everybody and it will probably make it
harder to later libify the functions involved.  It certainly
makes this patch (and the next step) much simpler than passing
a flag IDENT_NO_NAME_OK|IDENT_NO_MAIL_OK thru the codepath.

> +void set_fallback_ident(const char *name, const char *email)
> +{
> + if (!git_default_name.len) {
> + strbuf_addstr(_default_name, name);
> + committer_ident_explicitly_given |= IDENT_NAME_GIVEN;
> + author_ident_explicitly_given |= IDENT_NAME_GIVEN;
> + ident_config_given |= IDENT_NAME_GIVEN;
> + }
> +
> + if (!git_default_email.len) {
> + strbuf_addstr(_default_email, email);
> + committer_ident_explicitly_given |= IDENT_MAIL_GIVEN;
> + author_ident_explicitly_given |= IDENT_MAIL_GIVEN;
> + ident_config_given |= IDENT_MAIL_GIVEN;
> + }
> +}

One terrible thing about this approach is that the caller cannot
tell if it used fallback name or a real name, because it lies in
these "explicitly_given" fields.  The immediate caller (i.e. the one
that creates commit objects used to represent a stash entry) may not
care, but a helper function pretending to be reusable incredient to
solve a more general issue, this is far less than ideal.

So in short, I do agree that the series tackles an issue worth
addressing, but I am not impressed with the approach at all.

Rather than adding this fallback trap, can't we do it more like
this?

- At the beginning of "git stash", after parsing the command
  line, we know what subcommand of "git stash" we are going to
  run.

- If it is a subcommand that could need the ident (i.e. the ones
  that create a stash entry), we check the ident (e.g. make a
  call to git_author/committer_info() ourselves) but without
  STRICT bit, so that we can probe without dying if we need to
  supply a fallback identy.

  - And if we do need it, then setenv() the necessary
environment variables and arrange the next call by anybody
to git_author/committer_info() will get the fallback values
from there.


git-rebase is ignoring working-tree-encoding

2018-11-01 Thread Adrián Gimeno Balaguer
I’m attempting to perform fixups via git-rebase of UTF-16 LE files
(the project I’m working on requires that exact encoding on certain
files). When the rebase is complete, Git changes that file’s encoding
to UTF-16 BE. I have been using the newer working-tree-encoding
attribute in .gitattributes. I’m using Git for Windows.

$ git version
git version 2.19.1.windows.1

Here is a sample UTF-16 LE file (with BOM and LF endings) with
following atributes in .gitattributes:

test.txt eol=lf -text working-tree-encoding=UTF-16

I put eol=lf and -text to tell Git to not change the encoding of the
file on checkout, but that doesn’t even help. Asides, the newer
working-tree-encoding allows me to view human-readable diffs of that
file (in GitHub Desktop and Git Bash). Now, note that doing for
example consecutive commits to the same file does not affect the
UTF-16 LE encoding. And before I discovered this attribute, the whole
thing was even worse when squash/fixup rebasing, as Git would modify
the file with Chinese characters (when manually setting it as text via
.gitattributes).

So, again the problem with the exposed .gitattributes line is that
after fixup rebasing, UTF-16 LE files encoding change to UTF-16 BE.

For long, I have been working with the involved UTF-16 LE files set as
binary via .gitattributes (e.g. “test.txt binary”), so that Git would
not modify the file encoding, but this doesn’t allow me to view the
diffs upon changes in GitHub Desktop, which I want (and neither via
git diff).


Re: [PATCH] travis-ci: install packages in 'ci/install-dependencies.sh'

2018-11-01 Thread Junio C Hamano
SZEDER Gábor  writes:

> Ever since we started using Travis CI, we specified the list of
> packages to install in '.travis.yml' via the APT addon.  While running
> our builds on Travis CI's container-based infrastructure we didn't
> have another choice, because that environment didn't support 'sudo',
> and thus we didn't have permission to install packages ourselves.  With
> the switch to the VM-based infrastructure in the previous patch we do
> get a working 'sudo', so we can install packages by running 'sudo
> apt-get -y install ...' as well.

OK, so far we learned that this is now _doable_; but not enough to
decide if this is a good thing to do or not.  Let's read on to find
out.

> Let's make use of this and install necessary packages in
> 'ci/install-dependencies.sh', so all the dependencies (i.e. both
> packages and "non-packages" (P4 and Git-LFS)) are handled in the same
> file.  

So we used to have two waysto prepare the test environment; non
packaged software were done via install-dependencies.sh, but
packaged ones weren't.  Unifying them so that the script installs
both would be a good change to simplify the procedure.  

Is that how this sentence argues for this change?

> Install gcc-8 only in the 'linux-gcc' build job; so far it has
> been unnecessarily installed in the 'linux-clang' build job as well.

Is this "unneeded gcc-8 was installed" something we can fix only
because we now stopped doing the installation via apt addon?  Or we
could have fixed it while we were on apt addon but we didn't bother,
and this patch fixes it "while at it"---primarily because the shell
script is far more flexible to work with than travis.yml matrix and
this kind of customization is far easier to do?

> Print the versions of P4 and Git-LFS conditionally, i.e. only when
> they have been installed; with this change even the static analysis
> and documentation build jobs start using 'ci/install-dependencies.sh'
> to install packages, and neither of these two build jobs depend on and
> thus install those.
>
> This change will presumably be beneficial for the upcoming Azure
> Pipelines integration [1]: preliminary versions of that patch series
> run a couple of 'apt-get' commands to install the necessary packages
> before running 'ci/install-dependencies.sh', but with this patch it
> will be sufficient to run only 'ci/install-dependencies.sh'.

So the main point of this change is to have less knowledge to
prepare the target configuration in the .travis.yml file and keep
them all in ci/install-dependencies.sh, which hopefully is more
reusable than .travis.yml in a non Travis environment?

If that is the case, it makes sense to me.

> This patch should go on top of 'ss/travis-ci-force-vm-mode'.
>
> I'm not sure about the last paragraph, because:
>
>   - It talks about presumed benefits for a currently still
> work-in-progress patch series of an other contributor, and I'm not
> really sure that that's a good thing.  Perhaps I should have
> rather put it below the '---'.
>
>   - I'm confused about the name of this Azure thing.  The cover letter
> mentions "Azure Pipelines", the file is called
> 'azure-pipelines.yml', but the relevant patch I link to talks
> about "Azure DevOps" in the commit message.
>
> Anyway, keep that last paragraph or drop it as you see fit.

I hope we'll hear from Dscho in one or two revolutions of the Earth
;-)

> diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
> index 75a9fd2475..06c3546e1e 100755
> --- a/ci/install-dependencies.sh
> +++ b/ci/install-dependencies.sh
> @@ -10,6 +10,15 @@ 
> LFSWHENCE=https://github.com/github/git-lfs/releases/download/v$LINUX_GIT_LFS_VE
>  
>  case "$jobname" in
>  linux-clang|linux-gcc)
> + sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
> + sudo apt-get -q update
> + sudo apt-get -q -y install language-pack-is git-svn apache2
> + case "$jobname" in
> + linux-gcc)
> + sudo apt-get -q -y install gcc-8
> + ;;
> + esac
> +
>   mkdir --parents "$P4_PATH"
>   pushd "$P4_PATH"
>   wget --quiet "$P4WHENCE/bin.linux26x86_64/p4d"
> @@ -32,11 +41,25 @@ osx-clang|osx-gcc)
>   brew link --force gettext
>   brew install caskroom/cask/perforce
>   ;;
> +StaticAnalysis)
> + sudo apt-get -q update
> + sudo apt-get -q -y install coccinelle
> + ;;
> +Documentation)
> + sudo apt-get -q update
> + sudo apt-get -q -y install asciidoc xmlto
> + ;;
>  esac
>  
> -echo "$(tput setaf 6)Perforce Server Version$(tput sgr0)"
> -p4d -V | grep Rev.
> -echo "$(tput setaf 6)Perforce Client Version$(tput sgr0)"
> -p4 -V | grep Rev.
> -echo "$(tput setaf 6)Git-LFS Version$(tput sgr0)"
> -git-lfs version
> +if type p4d >/dev/null && type p4 >/dev/null
> +then
> + echo "$(tput setaf 6)Perforce Server Version$(tput sgr0)"
> + p4d -V | grep Rev.
> + echo "$(tput setaf 6)Perforce Client Version$(tput sgr0)"
> + p4 -V | grep Rev.

Re: [PATCHv2 00/24] Bring more repository handles into our code base]

2018-11-01 Thread Junio C Hamano
Stefan Beller  writes:

>> What was posted would have been perfectly fine as a "how about doing
>> it this way" weatherbaloon patch, but as a part of real series, it
>> needs to explain to the developers what the distinctions between two
>> classes are, and who is to use the cocci patch at what point in the
>> development cycle, etc., in an accompanying documentation update.
>
> if only we had documentation [as found via "git grep coccicheck"]
> that I could update ... I'd totally do that.
> I guess this asks for documentation to begin with, now?

So far, we didn't even need any, as there was no "workflow" to speak
of.  It's just "any random developer finds a suggested update,
either by running 'make coccicheck' oneself or by peeking Travis
log, and sends in a patch".

But the "pending" thing has a lot more workflow elements, who is
responsible for noticing update suggested by "pending" ones, for
updating the code, and for promoting "pending" to the normal.  These
are all new, and these are all needed as ongoing basis to help
developers---I'd say the way Documentation/SubmittingPatches helps
developers is very close to the way this new document would help them.


Re: [PATCH] send-email: Avoid empty transfer encoding header

2018-11-01 Thread brian m. carlson
On Thu, Nov 01, 2018 at 09:09:34PM -0400, Aaron Lindsay wrote:
> Fix a small bug introduced by "7a36987ff (send-email: add an auto option
> for transfer encoding, 2018-07-14)
> 
> I saw the following message when setting --transfer-encoding for a file
> with the same encoding:
> $ git send-email --transfer-encoding=8bit example.patch
> Use of uninitialized value $xfer_encoding in concatenation (.) or string
> at /usr/lib/git-core/git-send-email line 1744.
> 
> Signed-off-by: Aaron Lindsay 
> ---
>  git-send-email.perl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/git-send-email.perl b/git-send-email.perl
> index 2be5dac33..39c15bcc8 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -1834,7 +1834,7 @@ sub apply_transfer_encoding {
>   my $from = shift;
>   my $to = shift;
>  
> - return $message if ($from eq $to and $from ne '7bit');
> + return ($message, $to) if ($from eq $to and $from ne '7bit');

Thanks, this is obviously correct.

Would you like to squash in the below patch to add a test?

- >% -
From  Mon Sep 17 00:00:00 2001
From: "brian m. carlson" 
Date: Fri, 2 Nov 2018 01:51:33 +
Subject: [PATCH] squash! send-email: Avoid empty transfer encoding header

Signed-off-by: brian m. carlson 
---
 t/t9001-send-email.sh | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 1ef1a19003..ee1efcc59d 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -492,6 +492,21 @@ do
--validate \
$patches longline.patch
'
+
+done
+
+for enc in 7bit 8bit quoted-printable base64
+do
+   test_expect_success $PREREQ "--transfer-encoding=$enc produces correct 
header" '
+   clean_fake_sendmail &&
+   git send-email \
+   --from="Example " \
+   --to=nob...@example.com \
+   --smtp-server="$(pwd)/fake.sendmail" \
+   --transfer-encoding=$enc \
+   $patches &&
+   grep "Content-Transfer-Encoding: $enc" msgtxt1
+   '
 done
 
 test_expect_success $PREREQ 'Invalid In-Reply-To' '
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 1/3] commit-reach: implement get_reachable_subset

2018-11-01 Thread Junio C Hamano
Derrick Stolee  writes:

> We could do this, but it does come with a performance hit when the following
> are all true:
>
> 1. 'to' is not reachable from any 'from' commits.
>
> 2. The 'to' and 'from' commits are close in commit-date.
>
> 3. Generation numbers are not available, or the topology is skewed to have
>commits with high commit date and low generation number.
>
> Since in_merge_bases_many() calls paint_down_to_common(), it has the same
> issues with the current generation numbers. This can be fixed when we have
> the next version of generation numbers available.
>
> I'll make a note to have in_merge_bases_many() call get_reachable_subset()
> conditionally (like the generation_numbers_available() trick in the
> --topo-order
> series) after the generation numbers are settled and implemented.

Sounds good.

I wondered how this compares with in_merge_bases_many() primarily
because how performance characteristics between two approaches trade
off.  After all, what it needs to compute is a specialized case of
get_reachable_subset() where "to" happens to be a set with only
single element.  If the latter with a single element "to" has the
above performance "issues" compared to paint-down-to-common based
approach, it could be possible that, for small enough N>1, running N
in_merge_bases_many() traversals is more performant than a single
get_reachable_subset() call that works on N-element "to".

I am hoping that an update to better generation numbers to help
get_reachable_subset() would help paint-down-to-common the same way,
and would eventually allow us to use a single traversal that is best
for N==1 and N>1.

Thanks.



git appears to ignore GIT_CONFIG environment variable

2018-11-01 Thread Sirio Balmelli
It appears that git ignores the GIT_CONFIG environment variable, while 
git-config *does* consider it.

I have written a short script that shows this in detail and reproduces, it is 
included below and also posted on github at 
https://github.com/siriobalmelli/toolbench/blob/master/git/git-env-check.sh
This behavior is confirmed in 2.17.1, 2.18.0 and 2.19.1

I have tried to google this but don’t see any references to GIT_CONFIG outside 
of the git-config manual.
Is it intended behavior that this environment variable is only valid with 
git-config?

Thank you very much,

Sirio

—

#!/bin/bash
# this script demonstrates that the GIT_CONFIG is used by 'git-config'
# but *not* by 'git' itself.
# 2018 Sirio Balmelli

cleanup()
{
rm -rf ~/.gitconfig git-env-check
}

fail()
{
echo "$*" >&2
cleanup
exit 1
}

# don't break the user's config
if [[ -e ~/.gitconfig ]]; then
# don't fail, deleting ~/.gitconfig, for obvious reasons
echo "this script would break your existing ~/.gitconfig - please 
remove it and run again" >&2
exit 1
fi

unset GIT_CONFIG
git config -l | grep -q alias.he=help \
&& fail "alias 'he' already set, can't use it for this test" \
|| echo "1. the alias 'he' is unset by default"

echo "2. write a gitconfig in a non-standard location; export to GIT_CONFIG:"
mkdir git-env-check
cat 

Re: Using --word-diff breaks --color-moved

2018-11-01 Thread james harvey
On Wed, Oct 31, 2018 at 1:42 PM Stefan Beller  wrote:
>
> On Tue, Oct 30, 2018 at 7:06 PM james harvey  wrote:
> > I think "--color-moved" should have precedence over "--word-diff".
>
> I agree for precedence as in "work well together". Now we'd need
> to figure out what that means. In its current form, the move
> detection can detect moved lines across diff hunks or file
> boundaries.
>
> Should that also be the case for word diffing?
> I think word diffing is mostly used for free text, which has different
> properties compared to code, that the color-moved was originally
> intended for.

That's how I think of it too.  I think I'd be fine if word diffing
stayed not being able to be detected with moved lines across diff
hunks or file boundaries.

> >   I
> > cannot think of a scenario where a user would supply both options, and
> > actually want "--word-diff" to take precedence.  If I'm not thinking
> > of a scenario where this wouldn't be desired, perhaps whichever is
> > first as an argument could take precedence.
>
> word diffing and move detection are completely orthogonal at the moment.
> Instead of option order, I'd rather introduce a new option that tells us
> how to resolve some corner case. Or in the short term we might just
> want to raise an error?

I'm fine with option order not mattering, as it does now.  Was
assuming it didn't matter now, but mentioned trying it in case it
worked that way.  And, mentioned it as an alternative in case it
turned out the two could conflict in some corner case.  I think
defaulting to resolving one way or the other with an optional option
to go the other way makes sense.

> > (The same behavior happens if 4+ lines are moved and
> > "--color-moved{default=zebra}" is used, but below
> > "--color-moved=plain" is used to be a smaller testcase.)
> >
> > [...]
>
> This sounds like you are asking for two things:
> (1) make color-moved work with words (somehow)
> (2) allow the user to fine tune the heuristics for a block,
> such that default=zebra would still work.

I was asking for #1.  #2 might be a good idea, but I just tried using
"--color-moved" for the first time the other day, so haven't used it
enough to get that far.  If they worked together, I'm not sure yet if
I'd be using plain or zebra.  I mentioned "4+ lines" because I can
remember something said zebra only worked with more than 3 lines.  Not
sure where that was.  I thought it was the manpage, but I'm not seeing
that in there now.


[PATCH] send-email: Avoid empty transfer encoding header

2018-11-01 Thread Aaron Lindsay
Fix a small bug introduced by "7a36987ff (send-email: add an auto option
for transfer encoding, 2018-07-14)

I saw the following message when setting --transfer-encoding for a file
with the same encoding:
$ git send-email --transfer-encoding=8bit example.patch
Use of uninitialized value $xfer_encoding in concatenation (.) or string
at /usr/lib/git-core/git-send-email line 1744.

Signed-off-by: Aaron Lindsay 
---
 git-send-email.perl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2be5dac33..39c15bcc8 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1834,7 +1834,7 @@ sub apply_transfer_encoding {
my $from = shift;
my $to = shift;
 
-   return $message if ($from eq $to and $from ne '7bit');
+   return ($message, $to) if ($from eq $to and $from ne '7bit');
 
require MIME::QuotedPrint;
require MIME::Base64;
-- 
2.19.1



Re: master updated? (was Re: What's cooking in git.git (Nov 2018, #01; Thu, 1))

2018-11-01 Thread Junio C Hamano
Derrick Stolee  writes:

> On 11/1/2018 5:59 AM, Junio C Hamano wrote:
>> --
>> [Graduated to "master"]
>
> I see that several topics graduated, but it appears the master branch
> was not updated at https://github.com/gister/git. Was this
> intentional?

The "Nov #1" issue of "What's cooking" report lists various topics
including ah/doc-updates, ..., uk/merge- subtree-doc-update in the
"Graduated" section by mistake (probably due to crossing month
boundary); they already were listed in the previous "Oct #6" issue.

Hopefully we'll have the following topics graduate to 'master'
today, as part of the Eighth batch of the cycle.

js/rebase-i-shortopt
js/rebase-i-break
js/rebase-autostash-fix
cb/printf-empty-format
jc/rebase-in-c-5-test-typofix
pk/rebase-in-c-6-final
js/rebase-in-c-5.5-work-with-rebase-i-in-c
pk/rebase-in-c-5-test
pk/rebase-in-c-4-opts
pk/rebase-in-c-3-acts
pk/rebase-in-c-2-basic
ag/rebase-i-in-c
pk/rebase-in-c

Thanks.


Re: [PATCH v4 0/7] Use generation numbers for --topo-order

2018-11-01 Thread Junio C Hamano
Derrick Stolee  writes:

>> Review discussions seem to have petered out.  Would we merge this to
>> 'next' and start cooking, perhaps for the remainder of this cycle?
>
> Thanks, but I've just sent a v5 responding to Jakub's feedback on v4. [1]
>
> I'd be happy to let it sit in next until you feel it has cooked long
> enough. I'm available to respond to feedback in the form of new
> topics.

OK.  I'm quite happy to see this round of review helped greatly by
Jakub, by the way.

THanks, both.


Re: [PATCH v2] completion: use builtin completion for format-patch

2018-11-01 Thread Junio C Hamano
Duy Nguyen  writes:

>> > I have no comment about this. In an ideal world, sendemail.perl could
>> > be taught to support --git-completion-helper but I don't think my
>> > little remaining Perl knowledge (or time) is enough to do it. Perhaps
>> > this will do. I don't know.
>>
>> So "all", "attach", etc. are added to this list while these similar
>> options are lost from the other variable?  Is this a good trade-off?
>
> Not sure if I understand you correctly, but it looks to me that the
> options in git-send-email.perl are well organized, so we could...

Yes, but I wasn't commenting on your "sendemail should also be able
to help completion by supporting --completion-helper option" (I think
that is a sensible approach).  My comment was about Denton's patch,
which reduced the hard-coded list of format-patch options (i.e. the
first hunk) but had to add back many of them to send-email's
completion (i.e. the last hunk)---overall, it did not help reducing
the number of options hardcoded in the script.

If it makes sense to complete all options to format-patch to
send-email, then as you outlined, grabbing them out of format-patch
with the --completion-helper option at runtime, and using them to
complete both format-patch and send-email would be a good idea.  And
that should be doable even before send-email learns how to list its
supported options to help the completion.

> --git-completon-helper in that script to print all send-email specific
> options, then call "git format-patch --git-completion-helper" to add a
> bunch more. The options that are handled by setup_revisions() will
> have to be maintained manually here like $__git_format_patch_options
> and added on top in both _git_send_email () and _git_format_patch ().
>
> So, nothing option is lost and by the time setup_revisions() supports
> -git-completion-helper, we can get rid of the manual shell variable
> too. The downside is, lots of work, probably.


Re: [PATCH 3/3] tests: optionally skip `git rebase -p` tests

2018-11-01 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 01.11.18 um 07:12 schrieb Junio C Hamano:
>> "Johannes Schindelin via GitGitGadget" 
>> writes:
>>
>>> The `--preserve-merges` mode of the `rebase` command is slated to be
>>> deprecated soon, ...
>>
>> Is everybody on board on this statement?  I vaguely recall that some
>> people wanted to have something different from what rebase-merges
>> does (e.g. wrt first-parent history), and extending perserve-merges
>> might be one way to do so.
>
> Maybe you are referring to my proposals from a long time ago. My
> first-parent hack did not work very well, and I have changed my
> workflow. --preserve-merges is certainly not a feature that *I* would
> like to keep.

Thanks, that reduces my worries.

> The important question is whether there are too many users of
> preserve-merges who would be hurt when it is removed.

Yes, and the claim this series makes is that there is none and all
existing users should be able to happily use the rebase-merges,
which also means that we need to commit to improve rebase-merges to
support them, if there were some corner cases, which we failed to
consider so far, that are not yet served well.

As I said, as long as everybody agrees with the plan (e.g. we'll
know when we hear no objections to the planned deprecation in a few
weeks), I am perfectly OK with it.

Thanks.


Re: [PATCH v2 06/16] sequencer: refactor sequencer_add_exec_commands() to work on a todo_list

2018-11-01 Thread Alban Gruin
Le 30/10/2018 à 17:47, Phillip Wood a écrit :
> On 27/10/2018 22:29, Alban Gruin wrote:
>> This refactors sequencer_add_exec_commands() to work on a todo_list to
>> avoid redundant reads and writes to the disk.
>>
>> An obvious way to do this would be to insert the `exec' command between
>> the other commands, and reparse it once this is done.  This is not what
>> is done here.  Instead, the command is appended to the buffer once, and
>> a new list of items is created.  Items from the old list are copied to
>> it, and new `exec' items are appended when necessary.  This eliminates
>> the need to reparse the todo list, but this also means its buffer cannot
>> be directly written to the disk, hence todo_list_write_to_disk().
> 
> I'd reword this slightly, maybe
> 
> Instead of just inserting the `exec' command between the other commands,
> and re-parsing the buffer at the end the exec command is appended to the
> buffer once, and a new list of items is created.  Items from the old
> list are copied across and new `exec' items are appended when necessary.
>  This eliminates the need to reparse the buffer, but this also means we
> have to use todo_list_write_to_disk() to write the file.
> 
>> sequencer_add_exec_commands() still reads the todo list from the disk,
>> as it is needed by rebase -p.  todo_list_add_exec_commands() works on a
>> todo_list structure, and reparses it at the end.
> 
> I think the saying 'reparses' is confusing as that is what we're trying
> to avoid.
> 
>> complete_action() still uses sequencer_add_exec_commands() for now.
>> This will be changed in a future commit.
>>
>> Signed-off-by: Alban Gruin 
>> ---
>>   sequencer.c | 69 +
>>   1 file changed, 49 insertions(+), 20 deletions(-)
>>
>> diff --git a/sequencer.c b/sequencer.c
>> index e12860c047..12a3efeca8 100644
>> --- a/sequencer.c
>> +++ b/sequencer.c
>> @@ -4216,6 +4216,50 @@ int sequencer_make_script(FILE *out, int argc,
>> const char **argv,
>>   return 0;
>>   }
>>   +static void todo_list_add_exec_commands(struct todo_list *todo_list,
>> +    const char *commands)
>> +{
>> +    struct strbuf *buf = _list->buf;
>> +    const char *old_buf = buf->buf;
>> +    size_t commands_len = strlen(commands + strlen("exec ")) - 1;
>> +    int i, first = 1, nr = 0, alloc = 0;
> 
> Minor nit pick, I think it is clearer if first is initialized just
> before the loop as it is in the deleted code below.
> 
>> +    struct todo_item *items = NULL,
>> +    base_item = {TODO_EXEC, NULL, 0, 0, commands_len, 0};
>> +
>> +    strbuf_addstr(buf, commands);
>> +    base_item.offset_in_buf = buf->len - commands_len - 1;
>> +    base_item.arg = buf->buf + base_item.offset_in_buf;
> 
> I think if the user gives --exec more than once on the command line then
> commands will contain more than one exec command so this needs to parse
> commands and create one todo_item for each command.
> 

Ouch, you’re right.  Thanks for the heads up.

>> +
>> +    /*
>> + * Insert  after every pick. Here, fixup/squash chains
>> + * are considered part of the pick, so we insert the commands
>> *after*
>> + * those chains if there are any.
>> + */
>> +    for (i = 0; i < todo_list->nr; i++) {
>> +    enum todo_command command = todo_list->items[i].command;
>> +    if (todo_list->items[i].arg)
>> +    todo_list->items[i].arg = todo_list->items[i].arg -
>> old_buf + buf->buf;
>> +
>> +    if (command == TODO_PICK && !first) {
>> +    ALLOC_GROW(items, nr + 1, alloc);
>> +    memcpy(items + nr++, _item, sizeof(struct todo_item));
> 
> I think it would be clearer to say
> items[nr++] = base_item;
> rather than using memcpy. This applies below and to some of the other
> patches as well. Also this needs to loop over all the base_items if the
> user gave --exec more than once on the command line.
> 

I agree with you, it’s way more readable, IMO.  But for some reason, I
thought it was not possible to assign a struct to another in C.

> Best Wishes
> 
> Phillip
> 

Cheers,
Alban



Re: [PATCH v2 04/16] sequencer: introduce todo_list_write_to_file()

2018-11-01 Thread Alban Gruin
Hi Phillip,

Le 30/10/2018 à 17:28, Phillip Wood a écrit :
> Hi Alban
> 
> I like the direction this is going, it is an improvement on re-scanning
> the list at the end of each function.
> 
> On 27/10/2018 22:29, Alban Gruin wrote:
>> This introduce a new function to recreate the text of a todo list from
>> its commands, and then to write it to the disk.  This will be useful in
>> the future, the buffer of a todo list won’t be treated as a strict
>> mirror of the todo file by some of its functions once they will be
>> refactored.
> 
> I'd suggest rewording this slightly, maybe something like
> 
> This introduces a new function to recreate the text of a todo list from
> its commands and write it to a file. This will be useful as the next few
> commits will change the use of the buffer in struct todo_list so it will
> no-longer be a mirror of the file on disk.
> 
>> This functionnality can already be found in todo_list_transform(), but
> 
> s/functionnality/functionality/
> 
>> it is specifically made to replace the buffer of a todo list, which is
>> not the desired behaviour.  Thus, the part of todo_list_transform() that
>> actually creates the buffer is moved to a new function,
>> todo_list_to_strbuf().  The rest is unused, and so is dropped.
>>
>> todo_list_write_to_file() can also take care to append the help text to
> 
> s/care to append/care of appending/
> 
>> the buffer before writing it to the disk, or to write only the first n
>> items of the list.
> 
> Why/when do we only want to write a subset of the items?
>

In skip_unnecessary_picks(), in patch [10/16].  It needs to write the
elements of the todo list that were already done in the `done' file.

> […]
>> +int todo_list_write_to_file(struct todo_list *todo_list, const char
>> *file,
>> +    const char *shortrevisions, const char *shortonto,
>> +    int command_count, int append_help, int num, unsigned
>> flags)
> 
> This is a really long argument list which makes it easy for callers to
> get the parameters in the wrong order. I think append_help could
> probably be folded into the flags, I'm not sure what the command_count
> is used for but I've only read the first few patches. Maybe it would be
> better to pass a struct so we have named fields.
> 

You’re right, command_count is not really needed since we pass the
complete todo list.

The only bit that irks me is that, if I stop passing command_count, I
would have to call count_commands() twice in complete_action(): once to
check if there are any commands in the todo list, and again inside of
todo_list_write_to_file() (see [09/16].)

Perhaps I could move this check before calling todo_list_rearrange_squash()?

As a sidenote, this is not why I added command_count to the parameters
of todo_list_write_to_file().  It was a confusion of my part.

> Best Wishes
> 
> Phillip
> 

Cheers,
Alban



Re: [RFC] Generation Number v2

2018-11-01 Thread Jakub Narebski
Derrick Stolee  writes:

> Here is a re-formatted version of the tables I introduced earlier.
> The tables were too wide for public-inbox to render correctly (when
> paired with my email client). Hopefully this bulleted-list format
> works better. Thanks, Stefan, for pointing out the rendering
> problems!
>
> ### Test 1: `git log --topo-order -N`
>
> This test focuses on the number of commits that are parsed during
> a `git log --topo-order` before writing `N` commits to output.
>
> You can reproduce this test using `topo-order-tests.sh` and
> see all the data in `topo-order-summary.txt`. The values reported
> here are a sampling of the data, ignoring tests where all values
> were the same or extremely close in value.
>
> android-base, N = 100
>     V0: 5487
>     V1: 8534
>     V2: 6937
>     V3: 6419
>     V4: 6453
[...]

> ### Test 2: `git log --topo-order -10 A..B`
[...]
> android-base 53c1972bc8f 92f18ac3e39
>   OLD: 39403
>    V0:  1544
>    V1:  6957
>    V2:    26
>    V3:  1015
>    V4:  1098

Two minor issues.  First, now that the data is not in the table format,
where every bit of horizontal space matters, why not spell the names of
new proposed "generation numbers" (or rather reachability orderings) in
full, instead of using V0...V4 shortcuts?  It is not as if we were short
of space.

Second, why OLD data is present in tests 2 and 3, but not in test 1?

Best,
-- 
Jakub Narębski


Re: [PATCH v5] branch: introduce --show-current display option

2018-11-01 Thread Jeff King
On Fri, Oct 26, 2018 at 09:52:24AM +0900, Junio C Hamano wrote:

> Eric Sunshine  writes:
> 
> >> +   test_when_finished "git tag -d branch-and-tag-name" &&
> >> +   git tag branch-and-tag-name &&
> >
> > If git-tag crashes before actually creating the new tag, then "git tag
> > -d", passed to test_when_finished(), will error out too, which is
> > probably undesirable since "cleanup code" isn't expected to error out.
> 
> Ah, I somehow thought that clean-up actions set up via when_finished
> are allowed to fail without affecting the outcome, but apparently I
> was mistaken.

If a when_finished block fails, we consider that a test failure. But if
we failed to create the tag, the test is failing anyway. Do we actually
care at that point?

We would still want to make sure we run the rest of the cleanup, but
looking at the definition of test_when_finished(), I think we do.

> I haven't gone through the list of when_finished clean-up actions
> that do not end with "|| :"; I suspect some of them are simply being
> sloppy and would want to have "|| :", but what I want to find out
> out of such an audit is if there is a legitimate case where it helps
> to catch failures in the clean-up actions.  If there is none, then
> ...

I think in the success case it is legitimately helpful. If that "tag -d"
failed above (after the tag creation and the rest of the test
succeeded), it would certainly be unexpected and we would want to know
that it happened. So I think "|| :" in this case is not just
unnecessary, but actively bad.

-Peff


Re: [PATCH v4 9/9] Documentation/config: add odb..promisorRemote

2018-11-01 Thread Jeff King
On Wed, Oct 31, 2018 at 07:28:09AM +0100, Christian Couder wrote:

> > For (2), I would like to see us improve the remote helper
> > infrastructure instead of introducing a new ODB helper.  Remote
> > helpers are already permitted to fetch some objects without listing
> > refs --- perhaps we will want to
> >
> >  i. split listing refs to a separate capability, so that a remote
> > helper can advertise that it doesn't support that.  (Alternatively
> > the remote could advertise that it has no refs.)
> >
> >  ii. Use the "long-running process" mechanism to improve how Git
> >  communicates with a remote helper.
> 
> Yeah, I agree that improving the remote helper infrastructure is
> probably better than what I have been trying to add. And I agree with
> the above 2 steps you propose.

One thing you might want to port over is the ability to ask the remote
helper "tell me the type and size of these objects". The reason I built
that into the original external-odb interface proposal was so that diff
could easily skip large objects without faulting them in (because
they're considered binary, and we'd just say "binary files differ"
anyway). That makes things like "git log -p" work a lot better (try it
with a blob-less partial clone now; it's pretty painful).

I know that's kind of the _opposite_ of how partial clones work now,
where we try really hard not to have to even tell the client the full
list of objects. That's good if the reason you want a partial clone is
because there are gigantic numbers of objects (e.g., the Windows repo).
But I think many people are interested in having a more moderate number
of large objects (e.g., things like game development that are using
git-lfs now). It would be great if we could support both use cases
easily.

-Peff


Re: [RFC] Generation Number v2

2018-11-01 Thread Jakub Narebski
Stefan Beller  writes:

>> Based on the performance results alone, we should remove minimum
>> generation numbers, (epoch, date) pairs, and FELINE index from
>> consideration. There are enough examples of these indexes performing
>> poorly.
>>
>> In contrast, maximum generation numbers and corrected commit
>> dates both performed quite well. They are frequently the top
>> two performing indexes, and rarely significantly different.
>>
>> The trade-off here now seems to be: which _property_ is more important,
>> locally-computable or backwards-compatible?
>>
>> * Maximum generation number is backwards-compatible but not
>>locally-computable or immutable.
>
> These maximum generation numbers sound like the reverse of
> the generation numbers as they are currently implemented, i.e.
> we count all commits between the commit A and all heads.

Actually

  maximum generation number =
  number of commits  -  reverse generation number

> How would this impact creation of a commit?
>
> The current generation numbers can be lazily updated or not
> updated at all. In my understanding of the maximum generation
> numbers, a new commit would make these maximum generation
> numbers invalid (i.e. they have to be recomputed).
>
> Are there ways out by deferring the computation of maximum
> generation numbers while still being able to work with some
> commits that are un-accounted for?

As I understand it, new commits added since commit-graph file was
created would get simply INFINITY maximum/maximal generation numbers (if
we were using reverse generation numbers, new commits would get ZERO for
generation number).

> When recomputing these numbers, the current generation number
> (V0) has the property that already existing numbers can be re-used
> as-is. We only need to compute the numbers for new commits,
> and then insert this to the appropriate data structures (which is
> a separate problem, one could imagine a split generation
> numbers file like the split index)

Sidenote: I wonder if there is some on-disk data structure with
similarly fast read, but easier update.

> For the V2 maximum generation numbers, would we need to
> rewrite the numbers for all commits once we recompute them?
> Assuming that is true, it sounds like the benchmark doesn't
> cover the whole costs associated with V2, which is why the
> exceptional performance can be explained.

Let's check it using a simple example

First, (minimum) parent-based generation numbers before and after
extending the commit graph:

  1   2 3   4 5   6   7new
  1   2 3   4 5   -   -old
  .---.-.---.-.---*---*
   \
\   3   4 5   6new
 \  3   4 5   6old
  \-.---.-.---.
 \
  \   5new
   \  -old
\-*


Next, maximum generation numbers.  We start with 9 commits, and we end
up with 12 commits after the change

  6   7 8   9 10  11  12   new
  4   5 7   8 9   -   -old
  .---.-.---.-.---*---*
   \
\   9   1011  12   new
 \  6   7 8   9old
  \-.---.-.---.
 \
  \   12   new
   \  -old
\-*


As you can see all maximum generation numbers got rewritten.

Though if instead using the number of commits, we use the maximum
generation number, or in other words the length of the longest path, we
get the following:

  1   2 3   4 5   6   7new
  1   2 4   5 6   -   -old
  .---.-.---.-.---*---*
   \
\   4   5 6   7new
 \  3   4 5   6old
  \-.---.-.---.
 \
  \   7new
   \  -old
\-*

A bit better, but still much change in numbers.

[...]
>>
>> * Corrected commit-date is locally-computable and immutable,
>>but not backwards-compatible.
>
> How are these dates not backwards incompatible?
> We don't have to expose these dates to the user, but
> - just like generation numbers - could store them and use them
> but not tell the user about it.
>
> We would need to be really careful to not expose them at all
> as they look like the real dates, so that could make for an
> awkward bug report.
>
> The approach of "corrected commit date" sounds like we could
> have a very lazy approach, i.e. no extra data structures needed
> for many commits (as the corrected date equals the real date)
> and only need to store the corrections for some commits.
> Such an approach however would not make it easy to know
> if we operate on corrected dates, or if we even checked them
> for correctness before.
>
> So if we'd have an additional row in the generation numbers file
> telling the corrected date, then we should be able to be backwards
> compatible?

Here, from what I understand, being "backwards 

Re: [PATCHv2 00/24] Bring more repository handles into our code base]

2018-11-01 Thread Stefan Beller
On Tue, Oct 30, 2018 at 11:41 PM Junio C Hamano  wrote:
>
> Stefan Beller  writes:
>
> > I also picked up the patch for pending semantic patches, as the
> > first patch, can I have your sign off, please?
>
> I find this step quite lacking.
>
> What was posted would have been perfectly fine as a "how about doing
> it this way" weatherbaloon patch, but as a part of real series, it
> needs to explain to the developers what the distinctions between two
> classes are, and who is to use the cocci patch at what point in the
> development cycle, etc., in an accompanying documentation update.

if only we had documentation [as found via "git grep coccicheck"]
that I could update ... I'd totally do that.
I guess this asks for documentation to begin with, now?

> So can we have both sign-off and write-up (perhaps cut from
> the original e-mail discussion)?

I'll see where to put the docs; I assumed commit messages are enough.
63f0a758a0 (add coccicheck make target, 2016-09-15)
is what I found nice.


> >   t/helper/test-repository: celebrate independence from the_repository
>
> It seems that this topic is full of commits with overly long title.

yep.
> > git range-diff origin/sb/more-repo-in-api...
> > [...] # rebased to origin/master
>
> I offhand do not recall what happened during these 100+ pacthes.
> DId we acquire something significantly new that would have
> conflicted with this new round of the topic?  I do not mind at all
> seeing that a series gets adjusted to the updated codebase, but I do
> dislike seeing it done without explanation---an unexplained rebase
> to a new base is a wasted opportunity to learn what areas of the
> codebase are so hot that multiple and separate topics would want to
> touch them.

>From the point of view of these large scale refactorings,
all of the code is hot, e.g. the patch that was present in the RFC
"apply all semantic patches" would clash with nearly any topic.

As I do not carry that patch any more, I do not recall any conflicts
that needed to be resolved.

Thanks.


Re: ab/* topics

2018-11-01 Thread Ævar Arnfjörð Bjarmason


On Thu, Nov 01 2018, Duy Nguyen wrote:

> On Thu, Nov 1, 2018 at 3:04 PM Junio C Hamano  wrote:
>>
>> Ævar Arnfjörð Bjarmason  writes:
>>
>> > Could you please pick up
>> > https://public-inbox.org/git/20181024114725.3927-1-ava...@gmail.com/ ?
>> > It seems to have fallen between the cracks and addressed the feedback on
>> > v1, and looks good to me (and nobody's objected so far...).
>>
>> If this is the runtime-gettext-poison thing, this did not fall thru
>> the cracks---I didn't get the impression that "no objection was a
>> sign of being excellent"; rather I recall feeling that you were the
>> only person who were excited about it, while everybody else was
>> "Meh".
>
> I would be more excited if the scrambling patches go first (*)

Sorry to be so unexciting :)

I've sent a v3 in
https://public-inbox.org/git/20181101193115.32681-1-ava...@gmail.com/T/#u
which which hopefully addresses the concerns you & SZEDER had.

> and then we start to make "make test" poisoned by default. Scrambled
> output is still very readable and it will make people not forget about
> grepping translated stuff the wrong way. Of course *i18n* functions in
> the test suite need to be updated to to grep/compare for real, not
> just exit early like they do now.

I think now that this is a runtime option we'd instead just do stuff
like:

GIT_TEST_GETTEXT_POISON= git ... 2>err &&
grep str err

Which has the advantages of:

 1) You can grep for error messages you find in the code and find tests
that check for them.

 2) When you run the tests and something goes wrong, it's not some
scrambled output, so you can quickly e.g. search for that error in
the code / on Google without needing to hunt for some "how did I
disable the scrambling again...?" knob.

After all, the entire point of the facility is to catch us updating
strings which have existing tests without "GIT_TEST_GETTEXT_POISON=" (or
test_i18n...), but once we find those (with my patch) we can just
selectively turn the whole thing off.

> (*) The pseudolocalization idea is also good, but printing unicode by
> default may be a bit of a stretch. Not everybody is running the test
> suite with a unicode-capable terminal.


[PATCH v3] i18n: make GETTEXT_POISON a runtime option

2018-11-01 Thread Ævar Arnfjörð Bjarmason
Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON
test parameter to only be a GIT_TEST_GETTEXT_POISON=
runtime parameter, to be consistent with other parameters documented
in "Running tests with special setups" in t/README.

When I added GETTEXT_POISON in bb946bba76 ("i18n: add GETTEXT_POISON
to simulate unfriendly translator", 2011-02-22) I was concerned with
ensuring that the _() function would get constant folded if NO_GETTEXT
was defined, and likewise that GETTEXT_POISON would be compiled out
unless it was defined.

But as the benchmark in my [1] shows doing a one-off runtime
getenv("GIT_TEST_[...]") is trivial, and since GETTEXT_POISON was
originally added the GIT_TEST_* env variables have become the common
idiom for turning on special test setups.

So change GETTEXT_POISON to work the same way. Now the
GETTEXT_POISON=YesPlease compile-time option is gone, and running the
tests with GIT_TEST_GETTEXT_POISON=[YesPlease|] can be toggled on/off
without recompiling.

This allows for conditionally amending tests to test with/without
poison, similar to what 859fdc0c3c ("commit-graph: define
GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH. Do
some of that, now we e.g. always run the t0205-gettext-poison.sh test.

I did enough there to remove the GETTEXT_POISON prerequisite, but its
inverse C_LOCALE_OUTPUT is still around, and surely some tests using
it can be converted to e.g. always set GIT_TEST_GETTEXT_POISON=.

Notes on the implementation:

 * We still compile a dedicated GETTEXT_POISON build in Travis CI.
   This is probably the wrong thing to do and should be followed-up
   with something similar to ae59a4e44f ("travis: run tests with
   GIT_TEST_SPLIT_INDEX", 2018-01-07) to re-use an existing test setup
   for running in the GIT_TEST_GETTEXT_POISON mode.

 * We now skip a test in t-basic.sh under
   GIT_TEST_GETTEXT_POISON=YesPlease that wasn't skipped before. This
   test relies on C locale output, but due to an edge case in how the
   previous implementation of GETTEXT_POISON worked (reading it from
   GIT-BUILD-OPTIONS) wasn't enabling poison correctly. Now it does,
   and needs to be skipped.

 * The getenv() function is not reentrant, so out of paranoia about
   code of the form:

   printf(_("%s"), getenv("some-env"));

   call use_gettext_poison() in our early setup in git_setup_gettext()
   so we populate the "poison_requested" variable in a codepath that's
   won't suffer from that race condition.

See also [3] for more on the motivation behind this patch, and the
history of the GETTEXT_POISON facility.

1. https://public-inbox.org/git/871s8gd32p@evledraar.gmail.com/
2. https://public-inbox.org/git/87woq7b58k@evledraar.gmail.com/
3. https://public-inbox.org/git/878t2pd6yu@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

Now:

 * The new i18n helper is gone. We just use "test -n" semantics for
   $GIT_TEST_GETTEXT_POISON

 * We error out in the Makefile if you're still saying
   GETTEXT_POISON=YesPlease.

   This makes more sense than just making it a synonym since now this
   also needs to be defined at runtime.

 * The caveat with avoiding test_lazy_prereq is gone (although there's
   still some unrelated bug there worth looking into).

 * We call use_gettext_poison() really early to avoid any reentrancy
   issue with getenv().


 .travis.yml   |  2 +-
 Makefile  |  8 +---
 ci/lib-travisci.sh|  4 ++--
 gettext.c | 11 +++
 gettext.h |  9 +++--
 git-sh-i18n.sh|  2 +-
 po/README | 13 -
 t/README  |  6 ++
 t/lib-gettext.sh  |  2 +-
 t/t-basic.sh  |  2 +-
 t/t0205-gettext-poison.sh |  8 +---
 t/t3406-rebase-message.sh |  2 +-
 t/t7201-co.sh |  6 +++---
 t/t9902-completion.sh |  3 ++-
 t/test-lib-functions.sh   |  8 
 t/test-lib.sh |  6 +-
 16 files changed, 43 insertions(+), 49 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 4d4e26c9df..4523a2e5ec 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -26,7 +26,7 @@ addons:
 
 matrix:
   include:
-- env: jobname=GETTEXT_POISON
+- env: jobname=GIT_TEST_GETTEXT_POISON
   os: linux
   compiler:
   addons:
diff --git a/Makefile b/Makefile
index b08d5ea258..3a08626db0 100644
--- a/Makefile
+++ b/Makefile
@@ -362,11 +362,6 @@ all::
 # Define HAVE_DEV_TTY if your system can open /dev/tty to interact with the
 # user.
 #
-# Define GETTEXT_POISON if you are debugging the choice of strings marked
-# for translation.  In a GETTEXT_POISON build, you can turn all strings marked
-# for translation into gibberish by setting the GIT_GETTEXT_POISON variable
-# (to any value) in your environment.
-#
 # Define JSMIN to point to JavaScript minifier that functions as
 # a filter to have gitweb.js minified.
 #
@@ -1450,7 +1445,7 @@ ifdef NO_SYMLINK_HEAD

Re: [PATCH 18/19] submodule: use submodule repos for object lookup

2018-11-01 Thread Stefan Beller
On Wed, Oct 31, 2018 at 6:38 AM Derrick Stolee  wrote:
>
> On 10/16/2018 7:35 PM, Stefan Beller wrote:
> > @@ -482,14 +483,46 @@ void prepare_submodule_repo_env(struct argv_array 
> > *out)
> >DEFAULT_GIT_DIR_ENVIRONMENT);
> >   }
> >
> > -/* Helper function to display the submodule header line prior to the full
> > - * summary output. If it can locate the submodule objects directory it will
> > - * attempt to lookup both the left and right commits and put them into the
> > - * left and right pointers.
> > +/*
> > + * Initialize 'out' based on the provided submodule path.
> > + *
> > + * Unlike repo_submodule_init, this tolerates submodules not present
> > + * in .gitmodules. This function exists only to preserve historical 
> > behavior,
> > + *
> > + * Returns 0 on success, -1 when the submodule is not present.
> > + */
> > +static int open_submodule(struct repository *out, const char *path)
> > +{
> > + struct strbuf sb = STRBUF_INIT;
> > +
> > + if (submodule_to_gitdir(, path) || repo_init(out, sb.buf, NULL)) {
> > + strbuf_release();
> > + return -1;
> > + }
> > +
> > + out->submodule_prefix = xstrdup(path);
> > + out->submodule_prefix = xstrfmt("%s%s/",
> > + the_repository->submodule_prefix ?
> > + the_repository->submodule_prefix :
> > + "", path);
> > +
> > + strbuf_release();
> > + return 0;
> > +}
>
> Based on the recent test coverage report [1], this xstrfmt() call is never
> run witha non-null the_repository->submodule_prefix. Is there a way we can
> exercise that branch?

No it's dead code, actually. the_repository never has submodule_prefix set
as it is the main repository. So this is overly cautious to enable the
'any repo'
case.
In a resend we'll go with xstrdup(path);


Re: [PATCH 0/3] Make add_missing_tags() linear

2018-11-01 Thread Derrick Stolee

On 11/1/2018 2:57 PM, Elijah Newren wrote:

On Thu, Nov 1, 2018 at 5:32 AM Derrick Stolee  wrote:

No rush. I'd just like to understand how removing the commit-graph file
can make the new algorithm faster. Putting a similar count in the old
algorithm would involve giving a count for every call to
in_merge_bases_many(), which would be very noisy.

$ time git push --dry-run --follow-tags /home/newren/repo-mirror
count: 92912
To /home/newren/repo-mirror
  * [new branch]  test5 -> test5

real0m3.024s
user0m2.752s
sys0m0.320s


Is the above test with or without the commit-graph file? Can you run it 
in the other mode, too? I'd like to see if the "count" value changes 
when the only difference is the presence of a commit-graph file.


Thanks,
-Stolee


Re: [PATCH 0/3] Make add_missing_tags() linear

2018-11-01 Thread Elijah Newren
On Thu, Nov 1, 2018 at 5:32 AM Derrick Stolee  wrote:
> On 11/1/2018 2:52 AM, Elijah Newren wrote:
> > On Wed, Oct 31, 2018 at 5:05 AM Derrick Stolee  wrote:
> >> On 10/31/2018 2:04 AM, Elijah Newren wrote:
> >>>
> >>> On the original repo where the topic was brought up, with commit-graph
> >>> NOT turned on and using origin/master, I see:
> >>>
> >>> $ time git push --dry-run --follow-tags /home/newren/repo-mirror
> >>> To /home/newren/repo-mirror
> >>>   * [new branch]   test5 -> test5
> >>>
> >>> real 1m20.081s
> >>> user 1m19.688s
> >>> sys 0m0.292s
> >>>
> >>> Merging this series in, I now get:
> >>>
> >>> $ time git push --dry-run --follow-tags /home/newren/repo-mirror
> >>> To /home/newren/repo-mirror
> >>>   * [new branch]   test5 -> test5
> >>>
> >>> real 0m2.857s
> >>> user 0m2.580s
> >>> sys 0m0.328s
> >>>
> >>> which provides a very nice speedup.
> >>>
> >>> Oddly enough, if I _also_ do the following:
> >>> $ git config core.commitgraph true
> >>> $ git config gc.writecommitgraph true
> >>> $ git gc
> >>>
> >>> then my timing actually slows down just slightly:
> >>> $ time git push --follow-tags --dry-run /home/newren/repo-mirror
> >>> To /home/newren/repo-mirror
> >>>   * [new branch]  test5 -> test5
> >>>
> >>> real 0m3.027s
> >>> user 0m2.696s
> >>> sys 0m0.400s
> >> So you say that the commit-graph is off in the 2.8s case, but not here
> >> in the 3.1s case? I would expect _at minimum_ that the cost of parsing
> >> commits would have a speedup in the commit-graph case.  There may be
> >> something else going on here, since you are timing a `push` event that
> >> is doing more than the current walk.
> >>
> >>> (run-to-run variation seems pretty consistent, < .1s variation, so
> >>> this difference is just enough to notice.)  I wouldn't be that
> >>> surprised if that means there's some really old tags with very small
> >>> generation numbers, meaning it's not gaining anything in this special
> >>> case from the commit-graph, but it does pay the cost of loading the
> >>> commit-graph.
> >> While you have this test environment, do you mind applying the diff
> >> below and re-running the tests? It will output a count for how many
> >> commits are walked by the algorithm. This should help us determine if
> >> this is another case where generation numbers are worse than commit-date,
> >> or if there is something else going on. Thanks!
> > I can do that, but wouldn't you want a similar patch for the old
> > get_merge_bases_many() in order to compare?  Does an absolute number
> > help by itself?
> > It's going to have to be tomorrow, though; not enough time tonight.
>
> No rush. I'd just like to understand how removing the commit-graph file
> can make the new algorithm faster. Putting a similar count in the old
> algorithm would involve giving a count for every call to
> in_merge_bases_many(), which would be very noisy.

$ time git push --dry-run --follow-tags /home/newren/repo-mirror
count: 92912
To /home/newren/repo-mirror
 * [new branch]  test5 -> test5

real0m3.024s
user0m2.752s
sys0m0.320s


Also:
$ git rev-list --count HEAD
55764
$ git rev-list --count --all
91820

Seems a little odd to me that count is greater than `git rev-list
--count --all`.  However, the fact that they are close in magnitude
isn't surprising since I went digging for the commit with smallest
generation number not found in the upstream repo, and found:
$ git ls-remote /home/newren/repo-mirror/ | grep refs/tags/v0.2.0; echo $?
1
$ git rev-list --count refs/tags/v0.2.0
4
$ git rev-list --count refs/tags/v0.2.0 ^HEAD
4


So, the commit-graph can only help us avoid parsing 3 or so commits,
but we have to parse the 5M .git/objects/info/commit-graph file, and
then for every parse_commit() call we need to bsearch_graph() for the
commit.My theory is that parsing the commit-graph file and binary
searching it for commits is relatively fast, but that the time is just
big enough to measure and notice, while avoiding parsing 3 more
commits is a negligible time savings.

To me, I'm thinking this is one of those bizarre corner cases where
the commit-graph is almost imperceptibly slower than without the
commit-graph.  (And it is a very weird repo; someone repeatedly
filter-branched lots of small independent repos into a monorepo, but
didn't always push everything and didn't clean out all old stuff.)
But if you still see weird stuff you want to dig into further (maybe
the 92912 > 91820 bit?), I'm happy to try out other stuff.


Re: [PATCH] pretty: Add %(trailer:X) to display single trailer

2018-11-01 Thread Jeff King
On Thu, Nov 01, 2018 at 12:01:28AM +0100, Anders Waldenborg wrote:

> Jeff King writes:
> 
> > On the other hand, if the rule were not "this affects the next
> > placeholder" but had a true ending mark, then we could make a real
> > parse-tree out of it, and format chunks of placeholders. E.g.:
> >
> >   %(format:lpad=30,filename)%(subject) %(authordate)%(end)
> >
> > would pad and format the whole string with two placeholders. I know that
> > going down this road eventually involves reinventing XML, but I think
> > having an actual tree structure may not be an unreasonable thing to
> > shoot for.
> 
> Yes. I'm thinking that with [] for formatting specifiers and () for
> placeholders, {} would be available for nesting. E.g:
> 
>%[lpad=30,mangle]{%(subject) %ad%}

Hmm. That's kind of ugly, but probably not really any uglier than any of
the things I showed. And it has the advantage that we could implement
%[] now, and later extend it (well, I guess we'd want to make sure that
"%[lpad=30]{foo}" does not treat the curly braces literally, since we'd
eventually make them syntactically significant).

> I'm planning to work on the initial "trailer:key=" part later this
> week. Maybe I can play around with different formatting options and see
> how it affects the parser.

Great! Thanks for working on this.

-Peff


Re: [BUG?] protocol.version=2 sends HTTP "Expect" headers

2018-11-01 Thread Jeff King
On Thu, Nov 01, 2018 at 12:48:04AM +, brian m. carlson wrote:

> > So a few questions:
> > 
> >   - is this a bug or not? I.e., do we still need to care about proxies
> > that can't handle Expect? The original commit was from 2011. Maybe
> > things are better now. Or maybe that's blind optimism.
> > 
> > Nobody has complained yet, but that's probably just because v2 isn't
> > widely deployed yet.
> 
> HTTP/1.1 requires support for 100 Continue on the server side and in
> proxies (it is mandatory to implement).  The original commit disabling
> it (959dfcf42f ("smart-http: Really never use Expect: 100-continue",
> 2011-03-14)), describes proxies as the reason for disabling it.
> 
> It's my understanding that all major proxies (including, as of version
> 3.2, Squid) support HTTP/1.1 at this point.  Moreover, Kerberos is more
> likely to be used in enterprises, where proxies (especially poorly
> behaved and outright broken proxies) are more common.  We haven't seen
> any complaints about Kerberos support in a long time.  This leads me to
> believe that things generally work[0].

Rooting around in the archive a bit, I found this discussion:

  
https://public-inbox.org/git/CAJo=hjvyormjfyznvwz4izr88ewor6luqoe-mpt4lspyodd...@mail.gmail.com/

The original motivation for 959dfcf42f was apparently Google's own
servers. And they are likely the biggest users of the v2 protocol
(since my impression is that Google ships a modified client to most of
their devs that flips the v2 switch). So if they're not having problems,
maybe that's a sign that this is a non-issue.

> Finally, modern versions of libcurl also have a timeout after which they
> assume that the server is not going to respond and just send the request
> body anyways.  This makes the issue mostly moot.

I think this was always the case. It's just that it introduced annoying
stalls into the protocol.

> >   - alternatively, should we just leave it on for v2, and provide a
> > config switch to disable it if you have a crappy proxy? I don't know
> > how widespread the problem is, but I can imagine that the issue is
> > subtle enough that most users wouldn't even know.
> 
> For the reasons I mentioned above, I'd leave it on for now.  Between
> libcurl and better proxy support, I think this issue is mostly solved.
> If we need to fix it in the future, we can, and people can fall back to
> older protocols via config in the interim.

OK. If nobody is complaining, this seems like a good way to ease into
the migration. I.e., if we dropped the old "suppress Expect headers"
code, people might complain that things are now broken. But if it
gradually follows the v2 adoptions, that's a bit more of a gentle curve
(well, until we hit the cliff where we switch the default to trying v2).

-Peff


Re: [PATCH 3/3] tests: optionally skip `git rebase -p` tests

2018-11-01 Thread Johannes Sixt

Am 01.11.18 um 07:12 schrieb Junio C Hamano:

"Johannes Schindelin via GitGitGadget" 
writes:


The `--preserve-merges` mode of the `rebase` command is slated to be
deprecated soon, ...


Is everybody on board on this statement?  I vaguely recall that some
people wanted to have something different from what rebase-merges
does (e.g. wrt first-parent history), and extending perserve-merges
might be one way to do so.


Maybe you are referring to my proposals from a long time ago. My 
first-parent hack did not work very well, and I have changed my 
workflow. --preserve-merges is certainly not a feature that *I* would 
like to keep.


The important question is whether there are too many users of 
preserve-merges who would be hurt when it is removed. It is irrelevant 
whether preserve-merges is a good base for extensions because it can 
easily be resurrected if the need arises.



I do not mind at all if the way forward were to extend rebase-merges
for any future features.  To me, it is preferrable having to deal
with just one codepath than two written in different languages.

I just want to make sure we know everybody is on board the plan that
we will eventually remove preserve-merges, tell those who want to
use it to switch to rebase-merges, and we will extend rebase-merges
when they raise issues with it saying that they cannot do something
preserve-merges would have served them well with rebase-merges.


-- Hannes


Re: [PATCH] doc: move git-cherry to plumbing

2018-11-01 Thread Duy Nguyen
On Thu, Oct 11, 2018 at 9:38 PM Daniels Umanovskis
 wrote:
>
> Also remove git-cherry from Bash completion because plumbing
> commands do not belong there.

Er.. why?

>
> Signed-off-by: Daniels Umanovskis 
> ---
>
> Up to discussion whether cherry should be considered plumbing.
> I lean towards considering it a rarely-used porcelain command, but
> a case could be made either way so let's see what the list thinks.
>
>  command-list.txt   |  2 +-
>  contrib/completion/git-completion.bash | 11 ---
>  2 files changed, 1 insertion(+), 12 deletions(-)
>
> diff --git a/command-list.txt b/command-list.txt
> index c36ea3c18..bdca6e3d3 100644
> --- a/command-list.txt
> +++ b/command-list.txt
> @@ -62,7 +62,7 @@ git-check-mailmap   purehelpers
>  git-checkoutmainporcelain   history
>  git-checkout-index  plumbingmanipulators
>  git-check-ref-formatpurehelpers
> -git-cherry  ancillaryinterrogators  
> complete
> +git-cherry  plumbinginterrogators  
> complete
>  git-cherry-pick mainporcelain
>  git-citool  mainporcelain
>  git-clean   mainporcelain
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index d63d2dffd..12f7ce0c5 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -1340,17 +1340,6 @@ _git_checkout ()
> esac
>  }
>
> -_git_cherry ()
> -{
> -   case "$cur" in
> -   --*)
> -   __gitcomp_builtin cherry
> -   return
> -   esac
> -
> -   __git_complete_refs

I think this is a regression. Because now "git cherry " will not
complete refs (the default completer can still complete "git cherry
--" fine). We support option completion of all commands no matter
what categeory they are. The category is mostly to hide them from "git
".

If you just want to hide "git cherry" from the "git " list, then
you could remove the "complete" tag in command-list.txt above.

> -}
> -
>  __git_cherry_pick_inprogress_options="--continue --quit --abort"
>
>  _git_cherry_pick ()
> --
> 2.19.1.330.g93276587c.dirty
>


-- 
Duy


Re: [PATCH v4 0/5] am/rebase: share read_author_script()

2018-11-01 Thread Phillip Wood

Hi Junio

On 01/11/2018 03:01, Junio C Hamano wrote:

Phillip Wood  writes:


From: Phillip Wood 

Sorry for the confusion with v3, here are the updated patches.

Thanks to Junio for the feedback on v2. I've updated patch 4 based on
those comments, the rest are unchanged.


The mistake of overwriting -1 (i.e. earlier we detected dup) with
the third instance of the same originates at [2/5], so updating
[4/5] without fixing it at its source would mean [4/5] is not a pure
code movement to make it available to libgit users---it instead hides
a (not so important) bugfix in it.


Facepalm. Thanks for clearing up my mess, what you've got in pu looks 
fine to me.


Thanks

Phillip





v1 cover letter:

This is a follow up to pw/rebase-i-author-script-fix, it reduces code
duplication and improves rebase's parsing of the author script. After
this I'll do another series to share the code to write the author
script.


Phillip Wood (5):
   am: don't die in read_author_script()
   am: improve author-script error reporting
   am: rename read_author_script()
   add read_author_script() to libgit
   sequencer: use read_author_script()

  builtin/am.c |  60 ++--
  sequencer.c  | 192 ---
  sequencer.h  |   3 +
  3 files changed, 128 insertions(+), 127 deletions(-)




Git Test Coverage Report (Thursday, Nov 1) Part A

2018-11-01 Thread Derrick Stolee

Today's test report doesn't contain any information for the 'master'
branch because the ref was not updated before this run (see 'master' and
'master@{1}' are the same below). If 'master' updates today, then I will
rerun the build and send the report for new lines in 'next' and 'master'.

Thanks,
-Stolee

[1] https://git.visualstudio.com/git/_build/results?buildId=237=logs

---

pu: f86906bb5aab712b2a63746c4a643efd6ce19d5b
jch: ce1cd767b7bb0e53083d26b4d60257e3ec0eaeb4
next: 0c4a18a71f0e8e4f10970951c5f8875f429eaef7
master: 4ede3d42dfb57f9a41ac96a1f216c62eb7566cc2
master@{1}: 4ede3d42dfb57f9a41ac96a1f216c62eb7566cc2

Uncovered code in 'pu' not in 'jch'
--

builtin/blame.c
f86906bb5a builtin/blame.c    200) 
repo_unuse_commit_buffer(the_repository, commit, message);
74e8221b52 builtin/blame.c    924) blame_date_width = sizeof("Thu Oct 19 
16:00");

74e8221b52 builtin/blame.c    925) break;

builtin/describe.c
f86906bb5a builtin/describe.c 257) repo_parse_commit(the_repository, p);

builtin/fsck.c
209830491c builtin/fsck.c 622) fprintf_ln(stderr, _("Checking %s link"), 
head_ref_name);

209830491c builtin/fsck.c 627) return error(_("invalid %s"), head_ref_name);

builtin/grep.c
76e9bdc437 builtin/grep.c  429) grep_read_unlock();

builtin/pack-objects.c
f86906bb5a builtin/pack-objects.c 2832) if 
(!repo_has_object_file(the_repository, >oid) && 
is_promisor_object(>oid))


builtin/reflog.c
c9ef0d95eb builtin/reflog.c 585) all_worktrees = 0;
c9ef0d95eb builtin/reflog.c 621) continue;

date.c
74e8221b52  113) die("Timestamp too large for this system: %"PRItime, time);
74e8221b52  216) if (tm->tm_mon == human_tm->tm_mon) {
74e8221b52  217) if (tm->tm_mday > human_tm->tm_mday) {
74e8221b52  219) } else if (tm->tm_mday == human_tm->tm_mday) {
74e8221b52  220) hide.date = hide.wday = 1;
74e8221b52  221) } else if (tm->tm_mday + 5 > human_tm->tm_mday) {
74e8221b52  223) hide.date = 1;
74e8221b52  231) gettimeofday(, NULL);
74e8221b52  232) show_date_relative(time, tz, , buf);
74e8221b52  233) return;
74e8221b52  246) hide.seconds = 1;
74e8221b52  247) hide.tz |= !hide.date;
74e8221b52  248) hide.wday = hide.time = !hide.year;
74e8221b52  262) strbuf_rtrim(buf);
74e8221b52  287) gettimeofday(, NULL);
74e8221b52  290) human_tz = local_time_tzoffset(now.tv_sec, _tm);
74e8221b52  886) static int auto_date_style(void)
74e8221b52  888) return (isatty(1) || pager_in_use()) ? DATE_HUMAN : 
DATE_NORMAL;

74e8221b52  909) return DATE_HUMAN;
74e8221b52  911) return auto_date_style();

fsck.c
f86906bb5a  858) repo_unuse_commit_buffer(the_repository, commit, buffer);
f86906bb5a  878) repo_read_object_file(the_repository,
f86906bb5a  879)   >object.oid, , );

http-push.c
f86906bb5a 1635) if (!repo_has_object_file(the_repository, _oid))
f86906bb5a 1642) if (!repo_has_object_file(the_repository, 
_ref->old_oid))


merge-recursive.c
4cdc48e412 1585) return -1;
4cdc48e412 1588) return -1;
4cdc48e412 1594) return -1;
4cdc48e412 1596) if (update_file(o, 1, b_oid, b_mode, collide_path))
4cdc48e412 1597) return -1;
4cdc48e412 1664) return -1;
4cdc48e412 1667) return -1;
4cdc48e412 1670) return -1;
b58ae691c0 1703) return -1;
387361a6a7 1738) return -1;
387361a6a7 1786) return -1;
387361a6a7 1795) new_path = unique_path(o, a->path, ci->branch1);
387361a6a7 1796) output(o, 1, _("Refusing to lose untracked file"
387361a6a7 1802) return -1;
387361a6a7 1805) return -1;
387361a6a7 1815) return -1;
387361a6a7 1831) return -1;
387361a6a7 1834) return -1;

negotiator/default.c
f86906bb5a  71) if (repo_parse_commit(the_repository, commit))

refs.c
3a3b9d8cde  657) return 0;

refs/files-backend.c
remote.c
879b6a9e6f 1140) return error(_("dst ref %s receives from more than one 
src."),


revision.c
f86906bb5a  726) if (repo_parse_commit(the_repository, p) < 0)

sequencer.c
f86906bb5a 1624) repo_unuse_commit_buffer(the_repository, head_commit,
f86906bb5a 3868) repo_unuse_commit_buffer(the_repository,

sha1-array.c
bba406749a 91) oidcpy([dst], [src]);

submodule-config.c
bcbc780d14 740) return CONFIG_INVALID_KEY;
45f5ef3d77 755) warning(_("Could not update .gitmodules entry %s"), key);

submodule.c
b303ef65e7  524) the_repository->submodule_prefix :
e2419f7e30 1378) strbuf_release();
7454fe5cb6 1501) struct get_next_submodule_task *task = task_cb;
7454fe5cb6 1505) get_next_submodule_task_release(task);
7454fe5cb6 1532) return 0;
7454fe5cb6 1536) goto out;
7454fe5cb6 1551) return 0;

tree.c
f86906bb5a 108) if (repo_parse_commit(the_repository, commit))

worktree.c
3a3b9d8cde 495) return -1;
3a3b9d8cde 508) return -1;
3a3b9d8cde 517) return -1;
ab3e1f78ae 537) break;

wrapper.c
5efde212fc  70) die("Out of memory, malloc failed (tried to allocate %" 
PRIuMAX " bytes)",
5efde212fc  73) error("Out of memory, malloc failed (tried to allocate 
%" PRIuMAX " bytes)",


Commits introducing uncovered code:
Ævar Arnfjörð Bjarmason  879b6a9e6: i18n: remote.c: mark error(...) 
messages for translation
Antonio Ospite  

Re: [PATCH v5 6/7] revision.c: generation-based topo-order algorithm

2018-11-01 Thread Derrick Stolee

On 11/1/2018 11:48 AM, SZEDER Gábor wrote:

On Thu, Nov 01, 2018 at 01:46:22PM +, Derrick Stolee wrote:

1. EXPLORE: using the explore_queue priority queue (ordered by
maximizing the generation number)
2. INDEGREE: using the indegree_queue priority queue (ordered
by maximizing the generation number)

Nit: I've been pondering for a while what exactly does "order by
maximizing ..." mean.  Highest to lowest or lowest to highest?  If I
understand the rest of the descriptions (that I snipped) correctly,
then it's the former, but I find that phrase in itself too ambiguous.


It means that our priority-queue "get" operation selects the item in the
queue that is largest by our comparison function (first generation number,
thencommit-date for ties).This means we walk commits that have high
generation number before those with lower generation number, guaranteeing
that we walk all children of a commit before walking that commit.

Thanks,
-Stolee


master updated? (was Re: What's cooking in git.git (Nov 2018, #01; Thu, 1))

2018-11-01 Thread Derrick Stolee

On 11/1/2018 5:59 AM, Junio C Hamano wrote:

--
[Graduated to "master"]


I see that several topics graduated, but it appears the master branch 
was not updated at https://github.com/gister/git. Was this intentional?


I noticed because the test-coverage build [1] using the previous master 
as master@{1} reported no uncovered code into master (because no new 
commits exist on master).


Thanks,
-Stolee

[1] https://git.visualstudio.com/git/_build/results?buildId=237=log


Re: [PATCH v5 6/7] revision.c: generation-based topo-order algorithm

2018-11-01 Thread SZEDER Gábor
On Thu, Nov 01, 2018 at 01:46:22PM +, Derrick Stolee wrote:
> 1. EXPLORE: using the explore_queue priority queue (ordered by
>maximizing the generation number)

> 2. INDEGREE: using the indegree_queue priority queue (ordered
>by maximizing the generation number)

Nit: I've been pondering for a while what exactly does "order by
maximizing ..." mean.  Highest to lowest or lowest to highest?  If I
understand the rest of the descriptions (that I snipped) correctly,
then it's the former, but I find that phrase in itself too ambiguous.



Re: [PATCH v2] completion: use builtin completion for format-patch

2018-11-01 Thread Duy Nguyen
On Thu, Nov 1, 2018 at 2:42 AM Junio C Hamano  wrote:
> >> @@ -2080,16 +2071,19 @@ _git_send_email ()
> >> return
> >> ;;
> >> --*)
> >> -   __gitcomp "--annotate --bcc --cc --cc-cmd --chain-reply-to
> >> -   --compose --confirm= --dry-run --envelope-sender
> >> -   --from --identity
> >> -   --in-reply-to --no-chain-reply-to 
> >> --no-signed-off-by-cc
> >> -   --no-suppress-from --no-thread --quiet --reply-to
> >> -   --signed-off-by-cc --smtp-pass --smtp-server
> >> -   --smtp-server-port --smtp-encryption= --smtp-user
> >> -   --subject --suppress-cc= --suppress-from --thread 
> >> --to
> >> -   --validate --no-validate
> >> -   $__git_format_patch_options"
> >> +   __gitcomp "--all --annotate --attach --bcc --binary --cc 
> >> --cc= --cc-cmd
> >> +   --chain-reply-to --compose --confirm= 
> >> --cover-letter --dry-run
> >> +   --dst-prefix= --envelope-sender --from 
> >> --full-index --identity
> >> +   --ignore-if-in-upstream --inline --in-reply-to 
> >> --in-reply-to=
> >> +   --keep-subject --no-attach --no-chain-reply-to 
> >> --no-prefix
> >> +   --no-signature --no-signed-off-by-cc 
> >> --no-suppress-from --not --notes
> >> +   --no-thread --no-validate --numbered 
> >> --numbered-files
> >> +   --output-directory --quiet --reply-to 
> >> --reroll-count --signature
> >> +   --signed-off-by-cc --signoff --smtp-encryption= 
> >> --smtp-pass
> >> +   --smtp-server --smtp-server-port --smtp-user 
> >> --src-prefix=
> >> +   --start-number --stdout --subject 
> >> --subject-prefix= --suffix=
> >> +   --suppress-cc= --suppress-from --thread --thread= 
> >> --to --to=
> >> +   --validate"
> >
> > I have no comment about this. In an ideal world, sendemail.perl could
> > be taught to support --git-completion-helper but I don't think my
> > little remaining Perl knowledge (or time) is enough to do it. Perhaps
> > this will do. I don't know.
>
> So "all", "attach", etc. are added to this list while these similar
> options are lost from the other variable?  Is this a good trade-off?

Not sure if I understand you correctly, but it looks to me that the
options in git-send-email.perl are well organized, so we could add
--git-completon-helper in that script to print all send-email specific
options, then call "git format-patch --git-completion-helper" to add a
bunch more. The options that are handled by setup_revisions() will
have to be maintained manually here like $__git_format_patch_options
and added on top in both _git_send_email () and _git_format_patch ().

So, nothing option is lost and by the time setup_revisions() supports
-git-completion-helper, we can get rid of the manual shell variable
too. The downside is, lots of work, probably.
-- 
Duy


Re: ab/* topics

2018-11-01 Thread Duy Nguyen
On Thu, Nov 1, 2018 at 3:04 PM Junio C Hamano  wrote:
>
> Ævar Arnfjörð Bjarmason  writes:
>
> > Could you please pick up
> > https://public-inbox.org/git/20181024114725.3927-1-ava...@gmail.com/ ?
> > It seems to have fallen between the cracks and addressed the feedback on
> > v1, and looks good to me (and nobody's objected so far...).
>
> If this is the runtime-gettext-poison thing, this did not fall thru
> the cracks---I didn't get the impression that "no objection was a
> sign of being excellent"; rather I recall feeling that you were the
> only person who were excited about it, while everybody else was
> "Meh".

I would be more excited if the scrambling patches go first (*) and
then we start to make "make test" poisoned by default. Scrambled
output is still very readable and it will make people not forget about
grepping translated stuff the wrong way. Of course *i18n* functions in
the test suite need to be updated to to grep/compare for real, not
just exit early like they do now.

(*) The pseudolocalization idea is also good, but printing unicode by
default may be a bit of a stretch. Not everybody is running the test
suite with a unicode-capable terminal.
-- 
Duy


RE: [RFE] Please add name and email to git credentials

2018-11-01 Thread Randall S. Becker
On November 1, 2018 10:13 AM, Christian Couder wrote:
> Sent: > To: nicolas.mail...@laposte.net
> Cc: git 
> Subject: Re: [RFE] Please add name and email to git credentials
> 
> On Thu, Nov 1, 2018 at 2:31 PM Nicolas Mailhot
>  wrote:
> >
> > Le jeudi 01 novembre 2018 à 12:22 +0100, Ævar Arnfjörð Bjarmason a
> > écrit :
> > >
> > > Where would we get an E-Mail to lookup to pass to the helper? Are
> > > you just asking that the helper git the result of $(git config
> > > user.name && git config user.email)? If so why can't it just look
> > > this up itself?
> >
> > So, just in case it was not clear enough, allow things in .gitconfig
> > like
> >
> > [credential "https://pkgs.fedoraproject.org/;]
> > username = doe4ever
> > name = John Doe
> > email = doe4e...@fedoraproject.org
> > [credential "https://gitlab.corp.com/;] username = jdoe56874 name =
> > John Doe, Snr Engineer email = john@corp.com
> >
> > Instead of just
> >
> > [user]
> > name = John Doe
> > email =  john@corp.com
> > [credential "https://pkgs.fedoraproject.org/;]
> > username = doe4ever
> > [credential "https://gitlab.corp.com/;] username = jdoe56874
> >
> > and drat, I've commited to XXX with the wrong name/email again
> 
> How can Git know when you commit where you will want to push the
> commit afterwards?
> 
> What if you want to push the same commit to 2 different places that need
> different credentials?

Agree. You are asking git to change history depending on where pushes are done. 
Applying a legacy VCS paradigm here? Git has a global view of history. It must 
be the same everywhere, so if you push to two different places, the history 
must be the same, because those two places may in turn push to another shared 
repo. Then who is the authority?

What I have managed to do is to associated name and email with config --local 
so that it binds to the clone and overrides your --global setting. I can have 
different identities based on what clone I am working on, but once history is 
created, that's it. If I push from one clone to another, the identity of the 
clone where I did my first commit is what the second clone sees.

Hope this helps.
Randall

-- Brief whoami:
 NonStop developer since approximately 2112884442
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.





Re: [PATCH 1/3][Outreachy] t3903-stash: test without configured user.name and user.email

2018-11-01 Thread Christian Couder
On Thu, Nov 1, 2018 at 2:31 PM Slavica Djukic
 wrote:
>
> Add test to assert that stash fails if user.name and user.email

Nit: I am not sure that "assert" is the right word here.
test_expect_failure() is more for documenting an existing bug than for
really asserting a behavior (that users could rely upon). So I would
replace "assert" with "document" or maybe "document the bug".

> are not configured.
> In the final commit, test will be updated to expect success.

Other nit: maybe use "In a later commit" instead of "In the final
commit" as you, or someone else, may add another commit in this patch
series after the current final one.

> Signed-off-by: Slavica Djukic 

Thanks!


Re: [RFE] Please add name and email to git credentials

2018-11-01 Thread Nicolas Mailhot
Le jeudi 01 novembre 2018 à 15:13 +0100, Christian Couder a écrit :
> 
> How can Git know when you commit where you will want to push the
> commit afterwards?

You have an url in the repo config. of course you can change it between
the commit and the push, but that's not the general case.

Nowadays, most git projects have a preferred git hosting, and your
name/email with the project match the credentials you use to push
(otherwise things like gitlab/github issues trackers would not work at
all).

> What if you want to push the same commit to 2 different places that
> need different credentials?

Then you do not use git credentials and have to configure all by hand.
Which will usually be a major error-prone PITA, so you’ll end up pushing
to the system that matches the ID you want to se in git logs, and then
push from this system to others.

-- 
Nicolas Mailhot



Re: [RFE] Please add a standard ref object to releases

2018-11-01 Thread Nicolas Mailhot
Le jeudi 01 novembre 2018 à 14:09 +0100, Ævar Arnfjörð Bjarmason a
é

> You're the one trying to ostensibly file some sort of bug report or
> feature request, it's up to you to explain the problem in detail.
> 
> > I’m just asking that when a project releases “x.y.z”
> > 
> > 1. there was a standard git command available to it to create "the
> > release “x.y.z”" ref
> 
> And would these also be annotated tags, or are you proposing that we
> add
> a new "release" object type in addition to blob/tree/commit/tag? If so
> what would that look like?

It could have been done via annotated tags if release annotations, and
the associated porcelain, had been clearly defined and documented from
the start up. The technical implementation would have been fuzzy but
strong convention would have compensated the fuzziness.

This ship sailed a long time ago, at this point not only you do not have
convention to help you but you have all the custom workarounds people
invented to get by to overcome. So, I doubt anything short of a separate
object can work now (but I'd be delighted to be proven wrong).

Of course as long as the porcelain is unambiguous most git users won't
care how it is stored by git.

> > 2. there was a git syntax to say "the release “x.y.z”" ref in all
> > git
> > commands that manipulate refs
> 
> What do you mean "git syntaX"? The rev-parse syntax (see 'git help
> rev-parse`) or something else?

I mean that:
– I give you a repo name and its URL.
– I give you a release name such as 1.2.3.4
 1. write the command to checkout this release
 2. write  the command to diff this release with the latest master
commit
 3. write  the command to declare that master is release 1.2.3.4.5
etc

You’re forbidden to look at the content of the repo to browse its tags
and branches and use human logic to guess human convention. You can only
be sure that 1.2.3.4 is the actual release chosen by the project owner
as stated in its release notes.

And then try to do it for Apache Thrift 0.11 and git 2.19.0
https://github.com/apache/thrift/
and
https://github.com/git/git

(see, I’m nice, I didn’t even fed you Gitlab vs GitHub differences, or
some project released by an illustrious anonymous)

Or alternatively, try
gnome-calendar 3.28.0.1 and git 2.18.1

They’re on
https://gitlab.com/git-vcs/git/
https://gitlab.gnome.org/GNOME/gnome-calendar/

so, latest version of Gitlab for both of them.

> > 3. that this "release “x.y.z”" ref could be used in all the
> > "download
> > release “x.y.z”" on GitLab/GitHub, etc
> 
> So instead of offering a download of annotated tags as they do now,
> see
> https://github.com/git/git/releases and
> https://gitlab.com/git-vcs/git/tags they'd offer a download of whatevr
> this new thing is?

So they could build a
https://github.com/git/git/releases//
or
https://gitlab.com/git-vcs/git/releases//

and it would just give you the correct release archive, for example. Not
difficult for them to do as long as the mapping from release name “x.y.z
” to git repo object is well defined.

> > 4. that there was no fuziness or human interpretation involved into
> > converting "x.y.z" into the syntax used to select "release “x.y.z”"
> > in git commands
> 
> So since we'd store this in refs/* somewhere since it's (presumably)
> named named ref of some sort, you're suggesting that this thing be
> exempt from the DWYM name resolution of refs, e.g. us resolving
> "master"
> to "refs/heads/master" or v1.0 to "refs/tags/v1.0" (but as you note
> that's ambiguous).
> 
> So you'd always need to do e.g. "git show refs/releases/v1.0"?

It would be nice to have something shorter to type, but the root problem
is the ambiguity and lack of definition of current git releases, so
something unambiguous trumps something short, but fuzzy.

And, in your example, the v is unnecessary.

The v was just shoved in tag names to try to distinguish releases. It
didn’t work. It would have worked if git had blocked any tag starting
with v that was not a release tag, and forbidden branches with a v name.
But that ship sailed a long time ago.

Years of workarounds have brainwashed you into thinking the v is
necessary. But, it does not exist outside git land. Or even in the
release notes filenames git releases itself
https://github.com/git/git/tree/master/Documentation/RelNotes

(Much as I hate the v thing, I could have lived with it if it was
consistently applied by git projects. It isn’t)

> Whereas if you're proposing some mechanism that we draw a line
> somewhere
> in that flow and say "now tag/mark/release stuff as different sorts of
> objects" it's up to you to convince us why that's a realistic view of
> the world.

If it was not a realistic view of the world github wouln’t have had to
define separate sections for releases and tags in its UI.

If tags and release were the same thing one could take random git
projects on github or gitlab and apply release rules like semver to
their tags. Even projects that use semver do not have semver-only 

Re: ab/* topics (was: Re: What's cooking in git.git (Nov 2018, #01; Thu, 1))

2018-11-01 Thread SZEDER Gábor
On Thu, Nov 01, 2018 at 02:46:41PM +0100, Ævar Arnfjörð Bjarmason wrote:
> > However, if you push that patch with 'sh-i18n--helper' as-is, then I
> > do object now: parsing a boolean in shell is not at all that difficult
> > to justify this new command.
> 
> So instead of calling a helper (which is undocumented, and only used by
> git itself internally) you'd instead like to have some shellscript
> thingy like:
> 
> if test $value = 'true' -o $value = '1' []
> then
> exit 0
> elif test $value = 'false' -o $value = '0' [...]

Yes, but more like:

  case "$GIT_TEST_GETTEXT_POISON" in
  yes|true|on|1)
GIT_INTERNAL_GETTEXT_SH_SCHEME=poison
;;
  esac

There is no need to check for 'false', 0, etc.

Yes, I know that this is not as fully-fledged as git_env_bool(), e.g.
it won't recognize 'TrUe' and '42' as true, but since this is "merely"
a testing aid, I think it's sufficient.

> Sure, if that's the consensus I can change that, but it seems like the
> opposite of the direction we're moving with the general *.sh -> *.c
> migration. I.e. implementing helpers whenever possible instead of adding
> new shellscript-only logic.

I doubt that we really want to implement helpers "whenever possible",
i.e. even instead of such a rather trivial piece of shell code.

In the discusson of v1 of that patch there were suggestions on how to
turn an environment variable into a boolean in a more proper way, e.g.
by extending 'git var', but I agree with Peff that "we should do
nothing for now and see how often this comes up" [1].  In the meantime
this builtin seems to be the worse direction of all.

[1] https://public-inbox.org/git/20181025212358.ga23...@sigill.intra.peff.net/



Re: [RFE] Please add name and email to git credentials

2018-11-01 Thread Christian Couder
On Thu, Nov 1, 2018 at 2:31 PM Nicolas Mailhot
 wrote:
>
> Le jeudi 01 novembre 2018 à 12:22 +0100, Ævar Arnfjörð Bjarmason a
> écrit :
> >
> > Where would we get an E-Mail to lookup to pass to the helper? Are you
> > just asking that the helper git the result of $(git config user.name
> > &&
> > git config user.email)? If so why can't it just look this up itself?
>
> So, just in case it was not clear enough, allow things in .gitconfig
> like
>
> [credential "https://pkgs.fedoraproject.org/;]
> username = doe4ever
> name = John Doe
> email = doe4e...@fedoraproject.org
> [credential "https://gitlab.corp.com/;]
> username = jdoe56874
> name = John Doe, Snr Engineer
> email = john@corp.com
>
> Instead of just
>
> [user]
> name = John Doe
> email =  john@corp.com
> [credential "https://pkgs.fedoraproject.org/;]
> username = doe4ever
> [credential "https://gitlab.corp.com/;]
> username = jdoe56874
>
> and drat, I've commited to XXX with the wrong name/email again

How can Git know when you commit where you will want to push the
commit afterwards?

What if you want to push the same commit to 2 different places that
need different credentials?


Re: [PATCH v4 0/7] Use generation numbers for --topo-order

2018-11-01 Thread Derrick Stolee

On 11/1/2018 1:21 AM, Junio C Hamano wrote:

"Derrick Stolee via GitGitGadget"  writes:


This patch series performs a decently-sized refactoring of the revision-walk
machinery. Well, "refactoring" is probably the wrong word, as I don't
actually remove the old code. Instead, when we see certain options in the
'rev_info' struct, we redirect the commit-walk logic to a new set of methods
that distribute the workload differently. By using generation numbers in the
commit-graph, we can significantly improve 'git log --graph' commands (and
the underlying 'git rev-list --topo-order').

Review discussions seem to have petered out.  Would we merge this to
'next' and start cooking, perhaps for the remainder of this cycle?


Thanks, but I've just sent a v5 responding to Jakub's feedback on v4. [1]

I'd be happy to let it sit in next until you feel it has cooked long 
enough. I'm available to respond to feedback in the form of new topics.


Thanks,
-Stolee

[1] 
https://public-inbox.org/git/20181101134623.84055-1-dsto...@microsoft.com/T/#u


Re: ab/* topics (was: Re: What's cooking in git.git (Nov 2018, #01; Thu, 1))

2018-11-01 Thread Ævar Arnfjörð Bjarmason


On Thu, Nov 01 2018, SZEDER Gábor wrote:

> On Thu, Nov 01, 2018 at 12:02:01PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> Could you please pick up
>> https://public-inbox.org/git/20181024114725.3927-1-ava...@gmail.com/ ?
>> It seems to have fallen between the cracks and addressed the feedback on
>> v1, and looks good to me (and nobody's objected so far...).
>
> I didn't object, because in
>
>   https://public-inbox.org/git/87muqzllh0@evledraar.gmail.com/
>
> you asked for "a more general review than just the problem of how
> we turn an env variable into a boolean".
>
> However, if you push that patch with 'sh-i18n--helper' as-is, then I
> do object now: parsing a boolean in shell is not at all that difficult
> to justify this new command.

So instead of calling a helper (which is undocumented, and only used by
git itself internally) you'd instead like to have some shellscript
thingy like:

if test $value = 'true' -o $value = '1' []
then
exit 0
elif test $value = 'false' -o $value = '0' [...]

Sure, if that's the consensus I can change that, but it seems like the
opposite of the direction we're moving with the general *.sh -> *.c
migration. I.e. implementing helpers whenever possible instead of adding
new shellscript-only logic.


[PATCH v5 6/7] revision.c: generation-based topo-order algorithm

2018-11-01 Thread Derrick Stolee
The current --topo-order algorithm requires walking all
reachable commits up front, topo-sorting them, all before
outputting the first value. This patch introduces a new
algorithm which uses stored generation numbers to
incrementally walk in topo-order, outputting commits as
we go. This can dramatically reduce the computation time
to write a fixed number of commits, such as when limiting
with "-n " or filling the first page of a pager.

When running a command like 'git rev-list --topo-order HEAD',
Git performed the following steps:

1. Run limit_list(), which parses all reachable commits,
   adds them to a linked list, and distributes UNINTERESTING
   flags. If all unprocessed commits are UNINTERESTING, then
   it may terminate without walking all reachable commits.
   This does not occur if we do not specify UNINTERESTING
   commits.

2. Run sort_in_topological_order(), which is an implementation
   of Kahn's algorithm. It first iterates through the entire
   set of important commits and computes the in-degree of each
   (plus one, as we use 'zero' as a special value here). Then,
   we walk the commits in priority order, adding them to the
   priority queue if and only if their in-degree is one. As
   we remove commits from this priority queue, we decrement the
   in-degree of their parents.

3. While we are peeling commits for output, get_revision_1()
   uses pop_commit on the full list of commits computed by
   sort_in_topological_order().

In the new algorithm, these three steps correspond to three
different commit walks. We run these walks simultaneously,
and advance each only as far as necessary to satisfy the
requirements of the 'higher order' walk. We know when we can
pause each walk by using generation numbers from the commit-
graph feature.

Recall that the generation number of a commit satisfies:

* If the commit has at least one parent, then the generation
  number is one more than the maximum generation number among
  its parents.

* If the commit has no parent, then the generation number is one.

There are two special generation numbers:

* GENERATION_NUMBER_INFINITY: this value is 0x and
  indicates that the commit is not stored in the commit-graph and
  the generation number was not previously calculated.

* GENERATION_NUMBER_ZERO: this value (0) is a special indicator
  to say that the commit-graph was generated by a version of Git
  that does not compute generation numbers (such as v2.18.0).

Since we use generation_numbers_enabled() before using the new
algorithm, we do not need to worry about GENERATION_NUMBER_ZERO.
However, the existence of GENERATION_NUMBER_INFINITY implies the
following weaker statement than the usual we expect from
generation numbers:

If A and B are commits with generation numbers gen(A) and
gen(B) and gen(A) < gen(B), then A cannot reach B.

Thus, we will walk in each of our stages until the "maximum
unexpanded generation number" is strictly lower than the
generation number of a commit we are about to use.

The walks are as follows:

1. EXPLORE: using the explore_queue priority queue (ordered by
   maximizing the generation number), parse each reachable
   commit until all commits in the queue have generation
   number strictly lower than needed. During this walk, update
   the UNINTERESTING flags as necessary.

2. INDEGREE: using the indegree_queue priority queue (ordered
   by maximizing the generation number), add one to the in-
   degree of each parent for each commit that is walked. Since
   we walk in order of decreasing generation number, we know
   that discovering an in-degree value of 0 means the value for
   that commit was not initialized, so should be initialized to
   two. (Recall that in-degree value "1" is what we use to say a
   commit is ready for output.) As we iterate the parents of a
   commit during this walk, ensure the EXPLORE walk has walked
   beyond their generation numbers.

3. TOPO: using the topo_queue priority queue (ordered based on
   the sort_order given, which could be commit-date, author-
   date, or typical topo-order which treats the queue as a LIFO
   stack), remove a commit from the queue and decrement the
   in-degree of each parent. If a parent has an in-degree of
   one, then we add it to the topo_queue. Before we decrement
   the in-degree, however, ensure the INDEGREE walk has walked
   beyond that generation number.

The implementations of these walks are in the following methods:

* explore_walk_step and explore_to_depth
* indegree_walk_step and compute_indegrees_to_depth
* next_topo_commit and expand_topo_walk

These methods have some patterns that may seem strange at first,
but they are probably carry-overs from their equivalents in
limit_list and sort_in_topological_order.

One thing that is missing from this implementation is a proper
way to stop walking when the entire queue is UNINTERESTING, so
this implementation is not enabled by comparisions, such as in
'git rev-list --topo-order A..B'. This 

[PATCH v5 2/7] test-reach: add run_three_modes method

2018-11-01 Thread Derrick Stolee
The 'test_three_modes' method assumes we are using the 'test-tool
reach' command for our test. However, we may want to use the data
shape of our commit graph and the three modes (no commit-graph,
full commit-graph, partial commit-graph) for other git commands.

Split test_three_modes to be a simple translation on a more general
run_three_modes method that executes the given command and tests
the actual output to the expected output.

Signed-off-by: Derrick Stolee 
---
 t/t6600-test-reach.sh | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index d139a00d1d..9d65b8b946 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -53,18 +53,22 @@ test_expect_success 'setup' '
git config core.commitGraph true
 '
 
-test_three_modes () {
+run_three_modes () {
test_when_finished rm -rf .git/objects/info/commit-graph &&
-   test-tool reach $1 actual &&
+   "$@" actual &&
test_cmp expect actual &&
cp commit-graph-full .git/objects/info/commit-graph &&
-   test-tool reach $1 actual &&
+   "$@" actual &&
test_cmp expect actual &&
cp commit-graph-half .git/objects/info/commit-graph &&
-   test-tool reach $1 actual &&
+   "$@" actual &&
test_cmp expect actual
 }
 
+test_three_modes () {
+   run_three_modes test-tool reach "$@"
+}
+
 test_expect_success 'ref_newer:miss' '
cat >input <<-\EOF &&
A:commit-5-7
-- 
2.19.1.542.gc4df23f792



[PATCH v5 3/7] test-reach: add rev-list tests

2018-11-01 Thread Derrick Stolee
The rev-list command is critical to Git's functionality. Ensure it
works in the three commit-graph environments constructed in
t6600-test-reach.sh. Here are a few important types of rev-list
operations:

* Basic: git rev-list --topo-order HEAD
* Range: git rev-list --topo-order compare..HEAD
* Ancestry: git rev-list --topo-order --ancestry-path compare..HEAD
* Symmetric Difference: git rev-list --topo-order compare...HEAD

Signed-off-by: Derrick Stolee 
---
 t/t6600-test-reach.sh | 84 +++
 1 file changed, 84 insertions(+)

diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index 9d65b8b946..288f703b7b 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -243,4 +243,88 @@ test_expect_success 'commit_contains:miss' '
test_three_modes commit_contains --tag
 '
 
+test_expect_success 'rev-list: basic topo-order' '
+   git rev-parse \
+   commit-6-6 commit-5-6 commit-4-6 commit-3-6 commit-2-6 
commit-1-6 \
+   commit-6-5 commit-5-5 commit-4-5 commit-3-5 commit-2-5 
commit-1-5 \
+   commit-6-4 commit-5-4 commit-4-4 commit-3-4 commit-2-4 
commit-1-4 \
+   commit-6-3 commit-5-3 commit-4-3 commit-3-3 commit-2-3 
commit-1-3 \
+   commit-6-2 commit-5-2 commit-4-2 commit-3-2 commit-2-2 
commit-1-2 \
+   commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 
commit-1-1 \
+   >expect &&
+   run_three_modes git rev-list --topo-order commit-6-6
+'
+
+test_expect_success 'rev-list: first-parent topo-order' '
+   git rev-parse \
+   commit-6-6 \
+   commit-6-5 \
+   commit-6-4 \
+   commit-6-3 \
+   commit-6-2 \
+   commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 
commit-1-1 \
+   >expect &&
+   run_three_modes git rev-list --first-parent --topo-order commit-6-6
+'
+
+test_expect_success 'rev-list: range topo-order' '
+   git rev-parse \
+   commit-6-6 commit-5-6 commit-4-6 commit-3-6 commit-2-6 
commit-1-6 \
+   commit-6-5 commit-5-5 commit-4-5 commit-3-5 commit-2-5 
commit-1-5 \
+   commit-6-4 commit-5-4 commit-4-4 commit-3-4 commit-2-4 
commit-1-4 \
+   commit-6-3 commit-5-3 commit-4-3 \
+   commit-6-2 commit-5-2 commit-4-2 \
+   commit-6-1 commit-5-1 commit-4-1 \
+   >expect &&
+   run_three_modes git rev-list --topo-order commit-3-3..commit-6-6
+'
+
+test_expect_success 'rev-list: range topo-order' '
+   git rev-parse \
+   commit-6-6 commit-5-6 commit-4-6 \
+   commit-6-5 commit-5-5 commit-4-5 \
+   commit-6-4 commit-5-4 commit-4-4 \
+   commit-6-3 commit-5-3 commit-4-3 \
+   commit-6-2 commit-5-2 commit-4-2 \
+   commit-6-1 commit-5-1 commit-4-1 \
+   >expect &&
+   run_three_modes git rev-list --topo-order commit-3-8..commit-6-6
+'
+
+test_expect_success 'rev-list: first-parent range topo-order' '
+   git rev-parse \
+   commit-6-6 \
+   commit-6-5 \
+   commit-6-4 \
+   commit-6-3 \
+   commit-6-2 \
+   commit-6-1 commit-5-1 commit-4-1 \
+   >expect &&
+   run_three_modes git rev-list --first-parent --topo-order 
commit-3-8..commit-6-6
+'
+
+test_expect_success 'rev-list: ancestry-path topo-order' '
+   git rev-parse \
+   commit-6-6 commit-5-6 commit-4-6 commit-3-6 \
+   commit-6-5 commit-5-5 commit-4-5 commit-3-5 \
+   commit-6-4 commit-5-4 commit-4-4 commit-3-4 \
+   commit-6-3 commit-5-3 commit-4-3 \
+   >expect &&
+   run_three_modes git rev-list --topo-order --ancestry-path 
commit-3-3..commit-6-6
+'
+
+test_expect_success 'rev-list: symmetric difference topo-order' '
+   git rev-parse \
+   commit-6-6 commit-5-6 commit-4-6 \
+   commit-6-5 commit-5-5 commit-4-5 \
+   commit-6-4 commit-5-4 commit-4-4 \
+   commit-6-3 commit-5-3 commit-4-3 \
+   commit-6-2 commit-5-2 commit-4-2 \
+   commit-6-1 commit-5-1 commit-4-1 \
+   commit-3-8 commit-2-8 commit-1-8 \
+   commit-3-7 commit-2-7 commit-1-7 \
+   >expect &&
+   run_three_modes git rev-list --topo-order commit-3-8...commit-6-6
+'
+
 test_done
-- 
2.19.1.542.gc4df23f792



[PATCH v5 5/7] commit/revisions: bookkeeping before refactoring

2018-11-01 Thread Derrick Stolee
There are a few things that need to move around a little before
making a big refactoring in the topo-order logic:

1. We need access to record_author_date() and
   compare_commits_by_author_date() in revision.c. These are used
   currently by sort_in_topological_order() in commit.c.

2. Moving these methods to commit.h requires adding an author_date_slab
   declaration to commit.h. Consumers will need their own implementation.

3. The add_parents_to_list() method in revision.c performs logic
   around the UNINTERESTING flag and other special cases depending
   on the struct rev_info. Allow this method to ignore a NULL 'list'
   parameter, as we will not be populating the list for our walk.
   Also rename the method to the slightly more generic name
   process_parents() to make clear that this method does more than
   add to a list (and no list is required anymore).

Helped-by: Jeff King 
Signed-off-by: Derrick Stolee 
---
 commit.c   |  9 -
 commit.h   |  7 +++
 revision.c | 18 ++
 3 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/commit.c b/commit.c
index d0f199e122..a025a0db60 100644
--- a/commit.c
+++ b/commit.c
@@ -655,11 +655,10 @@ struct commit *pop_commit(struct commit_list **stack)
 /* count number of children that have not been emitted */
 define_commit_slab(indegree_slab, int);
 
-/* record author-date for each commit object */
 define_commit_slab(author_date_slab, timestamp_t);
 
-static void record_author_date(struct author_date_slab *author_date,
-  struct commit *commit)
+void record_author_date(struct author_date_slab *author_date,
+   struct commit *commit)
 {
const char *buffer = get_commit_buffer(commit, NULL);
struct ident_split ident;
@@ -684,8 +683,8 @@ static void record_author_date(struct author_date_slab 
*author_date,
unuse_commit_buffer(commit, buffer);
 }
 
-static int compare_commits_by_author_date(const void *a_, const void *b_,
- void *cb_data)
+int compare_commits_by_author_date(const void *a_, const void *b_,
+  void *cb_data)
 {
const struct commit *a = a_, *b = b_;
struct author_date_slab *author_date = cb_data;
diff --git a/commit.h b/commit.h
index 2b1a734388..ec5b9093ad 100644
--- a/commit.h
+++ b/commit.h
@@ -8,6 +8,7 @@
 #include "gpg-interface.h"
 #include "string-list.h"
 #include "pretty.h"
+#include "commit-slab.h"
 
 #define COMMIT_NOT_FROM_GRAPH 0x
 #define GENERATION_NUMBER_INFINITY 0x
@@ -328,6 +329,12 @@ extern int remove_signature(struct strbuf *buf);
  */
 extern int check_commit_signature(const struct commit *commit, struct 
signature_check *sigc);
 
+/* record author-date for each commit object */
+struct author_date_slab;
+void record_author_date(struct author_date_slab *author_date,
+   struct commit *commit);
+
+int compare_commits_by_author_date(const void *a_, const void *b_, void 
*unused);
 int compare_commits_by_commit_date(const void *a_, const void *b_, void 
*unused);
 int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, 
void *unused);
 
diff --git a/revision.c b/revision.c
index 2dcde8a8ac..36458265a0 100644
--- a/revision.c
+++ b/revision.c
@@ -768,8 +768,8 @@ static void commit_list_insert_by_date_cached(struct commit 
*p, struct commit_li
*cache = new_entry;
 }
 
-static int add_parents_to_list(struct rev_info *revs, struct commit *commit,
-   struct commit_list **list, struct commit_list **cache_ptr)
+static int process_parents(struct rev_info *revs, struct commit *commit,
+  struct commit_list **list, struct commit_list 
**cache_ptr)
 {
struct commit_list *parent = commit->parents;
unsigned left_flag;
@@ -808,7 +808,8 @@ static int add_parents_to_list(struct rev_info *revs, 
struct commit *commit,
if (p->object.flags & SEEN)
continue;
p->object.flags |= SEEN;
-   commit_list_insert_by_date_cached(p, list, cached_base, 
cache_ptr);
+   if (list)
+   commit_list_insert_by_date_cached(p, list, 
cached_base, cache_ptr);
}
return 0;
}
@@ -847,7 +848,8 @@ static int add_parents_to_list(struct rev_info *revs, 
struct commit *commit,
p->object.flags |= left_flag;
if (!(p->object.flags & SEEN)) {
p->object.flags |= SEEN;
-   commit_list_insert_by_date_cached(p, list, cached_base, 
cache_ptr);
+   if (list)
+   commit_list_insert_by_date_cached(p, list, 
cached_base, cache_ptr);
}
if (revs->first_parent_only)
break;
@@ -1091,7 +1093,7 @@ static int 

[PATCH v5 4/7] revision.c: begin refactoring --topo-order logic

2018-11-01 Thread Derrick Stolee
When running 'git rev-list --topo-order' and its kin, the topo_order
setting in struct rev_info implies the limited setting. This means
that the following things happen during prepare_revision_walk():

* revs->limited implies we run limit_list() to walk the entire
  reachable set. There are some short-cuts here, such as if we
  perform a range query like 'git rev-list COMPARE..HEAD' and we
  can stop limit_list() when all queued commits are uninteresting.

* revs->topo_order implies we run sort_in_topological_order(). See
  the implementation of that method in commit.c. It implies that
  the full set of commits to order is in the given commit_list.

These two methods imply that a 'git rev-list --topo-order HEAD'
command must walk the entire reachable set of commits _twice_ before
returning a single result.

If we have a commit-graph file with generation numbers computed, then
there is a better way. This patch introduces some necessary logic
redirection when we are in this situation.

In v2.18.0, the commit-graph file contains zero-valued bytes in the
positions where the generation number is stored in v2.19.0 and later.
Thus, we use generation_numbers_enabled() to check if the commit-graph
is available and has non-zero generation numbers.

When setting revs->limited only because revs->topo_order is true,
only do so if generation numbers are not available. There is no
reason to use the new logic as it will behave similarly when all
generation numbers are INFINITY or ZERO.

In prepare_revision_walk(), if we have revs->topo_order but not
revs->limited, then we trigger the new logic. It breaks the logic
into three pieces, to fit with the existing framework:

1. init_topo_walk() fills a new struct topo_walk_info in the rev_info
   struct. We use the presence of this struct as a signal to use the
   new methods during our walk. In this patch, this method simply
   calls limit_list() and sort_in_topological_order(). In the future,
   this method will set up a new data structure to perform that logic
   in-line.

2. next_topo_commit() provides get_revision_1() with the next topo-
   ordered commit in the list. Currently, this simply pops the commit
   from revs->commits.

3. expand_topo_walk() provides get_revision_1() with a way to signal
   walking beyond the latest commit. Currently, this calls
   add_parents_to_list() exactly like the old logic.

While this commit presents method redirection for performing the
exact same logic as before, it allows the next commit to focus only
on the new logic.

Signed-off-by: Derrick Stolee 
---
 revision.c | 42 ++
 revision.h |  4 
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/revision.c b/revision.c
index e18bd530e4..2dcde8a8ac 100644
--- a/revision.c
+++ b/revision.c
@@ -25,6 +25,7 @@
 #include "worktree.h"
 #include "argv-array.h"
 #include "commit-reach.h"
+#include "commit-graph.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -2454,7 +2455,7 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
if (revs->diffopt.objfind)
revs->simplify_history = 0;
 
-   if (revs->topo_order)
+   if (revs->topo_order && !generation_numbers_enabled(the_repository))
revs->limited = 1;
 
if (revs->prune_data.nr) {
@@ -2892,6 +2893,33 @@ static int mark_uninteresting(const struct object_id 
*oid,
return 0;
 }
 
+struct topo_walk_info {};
+
+static void init_topo_walk(struct rev_info *revs)
+{
+   struct topo_walk_info *info;
+   revs->topo_walk_info = xmalloc(sizeof(struct topo_walk_info));
+   info = revs->topo_walk_info;
+   memset(info, 0, sizeof(struct topo_walk_info));
+
+   limit_list(revs);
+   sort_in_topological_order(>commits, revs->sort_order);
+}
+
+static struct commit *next_topo_commit(struct rev_info *revs)
+{
+   return pop_commit(>commits);
+}
+
+static void expand_topo_walk(struct rev_info *revs, struct commit *commit)
+{
+   if (add_parents_to_list(revs, commit, >commits, NULL) < 0) {
+   if (!revs->ignore_missing_links)
+   die("Failed to traverse parents of commit %s",
+   oid_to_hex(>object.oid));
+   }
+}
+
 int prepare_revision_walk(struct rev_info *revs)
 {
int i;
@@ -2928,11 +2956,13 @@ int prepare_revision_walk(struct rev_info *revs)
commit_list_sort_by_date(>commits);
if (revs->no_walk)
return 0;
-   if (revs->limited)
+   if (revs->limited) {
if (limit_list(revs) < 0)
return -1;
-   if (revs->topo_order)
-   sort_in_topological_order(>commits, revs->sort_order);
+   if (revs->topo_order)
+   sort_in_topological_order(>commits, 
revs->sort_order);
+   } else if (revs->topo_order)
+   init_topo_walk(revs);
if (revs->line_level_traverse)

[PATCH v5 7/7] t6012: make rev-list tests more interesting

2018-11-01 Thread Derrick Stolee
As we are working to rewrite some of the revision-walk machinery,
there could easily be some interesting interactions between the
options that force topological constraints (--topo-order,
--date-order, and --author-date-order) along with specifying a
path.

Add extra tests to t6012-rev-list-simplify.sh to add coverage of
these interactions. To ensure interesting things occur, alter the
repo data shape to have different orders depending on topo-, date-,
or author-date-order.

When testing using GIT_TEST_COMMIT_GRAPH, this assists in covering
the new logic for topo-order walks using generation numbers. The
extra tests can be added indepently.

Signed-off-by: Derrick Stolee 
---
 t/t6012-rev-list-simplify.sh | 45 
 1 file changed, 36 insertions(+), 9 deletions(-)

diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh
index b5a1190ffe..a10f0df02b 100755
--- a/t/t6012-rev-list-simplify.sh
+++ b/t/t6012-rev-list-simplify.sh
@@ -12,6 +12,22 @@ unnote () {
git name-rev --tags --stdin | sed -e "s|$OID_REGEX (tags/\([^)]*\)) |\1 
|g"
 }
 
+#
+# Create a test repo with interesting commit graph:
+#
+# A--B--G--H--I--K--L
+#  \  \   / /
+#   \  \ / /
+#C--E---F J
+#\_/
+#
+# The commits are laid out from left-to-right starting with
+# the root commit A and terminating at the tip commit L.
+#
+# There are a few places where we adjust the commit date or
+# author date to make the --topo-order, --date-order, and
+# --author-date-order flags produce different output.
+
 test_expect_success setup '
echo "Hi there" >file &&
echo "initial" >lost &&
@@ -21,10 +37,18 @@ test_expect_success setup '
 
git branch other-branch &&
 
+   git symbolic-ref HEAD refs/heads/unrelated &&
+   git rm -f "*" &&
+   echo "Unrelated branch" >side &&
+   git add side &&
+   test_tick && git commit -m "Side root" &&
+   note J &&
+   git checkout master &&
+
echo "Hello" >file &&
echo "second" >lost &&
git add file lost &&
-   test_tick && git commit -m "Modified file and lost" &&
+   test_tick && GIT_AUTHOR_DATE=$(($test_tick + 120)) git commit -m 
"Modified file and lost" &&
note B &&
 
git checkout other-branch &&
@@ -63,13 +87,6 @@ test_expect_success setup '
test_tick && git commit -a -m "Final change" &&
note I &&
 
-   git symbolic-ref HEAD refs/heads/unrelated &&
-   git rm -f "*" &&
-   echo "Unrelated branch" >side &&
-   git add side &&
-   test_tick && git commit -m "Side root" &&
-   note J &&
-
git checkout master &&
test_tick && git merge --allow-unrelated-histories -m "Coolest" 
unrelated &&
note K &&
@@ -103,14 +120,24 @@ check_result () {
check_outcome success "$@"
 }
 
-check_result 'L K J I H G F E D C B A' --full-history
+check_result 'L K J I H F E D C G B A' --full-history --topo-order
+check_result 'L K I H G F E D C B J A' --full-history
+check_result 'L K I H G F E D C B J A' --full-history --date-order
+check_result 'L K I H G F E D B C J A' --full-history --author-date-order
 check_result 'K I H E C B A' --full-history -- file
 check_result 'K I H E C B A' --full-history --topo-order -- file
 check_result 'K I H E C B A' --full-history --date-order -- file
+check_result 'K I H E B C A' --full-history --author-date-order -- file
 check_result 'I E C B A' --simplify-merges -- file
+check_result 'I E C B A' --simplify-merges --topo-order -- file
+check_result 'I E C B A' --simplify-merges --date-order -- file
+check_result 'I E B C A' --simplify-merges --author-date-order -- file
 check_result 'I B A' -- file
 check_result 'I B A' --topo-order -- file
+check_result 'I B A' --date-order -- file
+check_result 'I B A' --author-date-order -- file
 check_result 'H' --first-parent -- another-file
+check_result 'H' --first-parent --topo-order -- another-file
 
 check_result 'E C B A' --full-history E -- lost
 test_expect_success 'full history simplification without parent' '
-- 
2.19.1.542.gc4df23f792



[PATCH v5 0/7] Use generation numbers for --topo-order

2018-11-01 Thread Derrick Stolee
This patch series performs a decently-sized refactoring of the
revision-walk machinery. Well, "refactoring" is probably the wrong word,
as I don't actually remove the old code. Instead, when we see certain
options in the 'rev_info' struct, we redirect the commit-walk logic to
a new set of methods that distribute the workload differently. By using
generation numbers in the commit-graph, we can significantly improve
'git log --graph' commands (and the underlying 'git rev-list --topo-order').

On the Linux repository, I got the following performance results when
comparing to the previous version with or without a commit-graph:

Test: git rev-list --topo-order -100 HEAD
HEAD~1, no commit-graph: 6.80 s
HEAD~1, w/ commit-graph: 0.77 s
  HEAD, w/ commit-graph: 0.02 s

Test: git rev-list --topo-order -100 HEAD -- tools
HEAD~1, no commit-graph: 9.63 s
HEAD~1, w/ commit-graph: 6.06 s
  HEAD, w/ commit-graph: 0.06 s

If you want to read this series but are unfamiliar with the commit-graph
and generation numbers, then I recommend reading
`Documentation/technical/commit-graph.txt` or a blog post [1] I wrote on
the subject. In particular, the three-part walk described in "revision.c:
refactor basic topo-order logic" is present (but underexplained) as an
animated PNG [2].

**UPDATED** Now that we have had some review and some dogfooding, I'm
removing the paragraph I had here about "RFC quality". I think this is
ready to merge!

One notable case that is not included in this series is the case of a
history comparison such as 'git rev-list --topo-order A..B'. The existing
code in limit_list() has ways to cut the walk short when all pending
commits are UNINTERESTING. Since this code depends on commit_list instead
of the prio_queue we are using here, I chose to leave it untouched for now.
We can revisit it in a separate series later. Since handle_commit() turns
on revs->limited when a commit is UNINTERESTING, we do not hit the new
code in this case. Removing this 'revs->limited = 1;' line yields correct
results, but the performance can be worse.

**UPDATED** See the discussion about Generation Number V2 [4] for more
on this topic.

Changes in V5: Thanks Jakub for feedback on the huge commit! I think
I've responded to all the code feedback. See the range-diff at the
end of this cover-page.

Thanks,
-Stolee

[1] 
https://blogs.msdn.microsoft.com/devops/2018/07/09/supercharging-the-git-commit-graph-iii-generations/
   Supercharging the Git Commit Graph III: Generations and Graph Algorithms

[2] 
https://msdnshared.blob.core.windows.net/media/2018/06/commit-graph-topo-order-b-a.png
Animation showing three-part walk

[3] https://github.com/derrickstolee/git/tree/topo-order/test
A branch containing this series along with commits to compute commit-graph 
in entire test suite.

[4] https://public-inbox.org/git/6367e30a-1b3a-4fe9-611b-d931f51ef...@gmail.com/
[RFC] Generation Number v2

Note: I'm not submitting this version via GitGitGadget because it's
currently struggling with how to handle a PR in a conflict state.
The new flags in revision.h have a conflict with recent changes in
master.

Derrick Stolee (7):
  prio-queue: add 'peek' operation
  test-reach: add run_three_modes method
  test-reach: add rev-list tests
  revision.c: begin refactoring --topo-order logic
  commit/revisions: bookkeeping before refactoring
  revision.c: generation-based topo-order algorithm
  t6012: make rev-list tests more interesting

 commit.c |   9 +-
 commit.h |   7 +
 object.h |   4 +-
 prio-queue.c |   9 ++
 prio-queue.h |   6 +
 revision.c   | 243 +--
 revision.h   |   6 +
 t/helper/test-prio-queue.c   |  26 ++--
 t/t0009-prio-queue.sh|  14 ++
 t/t6012-rev-list-simplify.sh |  45 +--
 t/t6600-test-reach.sh|  96 +-
 11 files changed, 426 insertions(+), 39 deletions(-)


base-commit: 2d3b1c576c85b7f5db1f418907af00ab88e0c303
-- 
2.19.1.542.gc4df23f792

-->8--

1:  2358cfd5ed = 1:  7c75a56505 prio-queue: add 'peek' operation
2:  3a4b68e479 = 2:  686c4370de test-reach: add run_three_modes method
3:  12a3f6d367 = 3:  7410c00982 test-reach: add rev-list tests
4:  cd9eef9688 = 4:  5439e11e37 revision.c: begin refactoring --topo-order logic
5:  f3e291665d ! 5:  71554deb9b commit/revisions: bookkeeping before refactoring
@@ -9,8 +9,8 @@
compare_commits_by_author_date() in revision.c. These are used
currently by sort_in_topological_order() in commit.c.
 
-2. Moving these methods to commit.h requires adding the author_slab
-   definition to commit.h.
+2. Moving these methods to commit.h requires adding an author_date_slab
+   declaration to commit.h. Consumers will need their own 
implementation.
 
 3. The add_parents_to_list() method in revision.c performs 

[PATCH v5 1/7] prio-queue: add 'peek' operation

2018-11-01 Thread Derrick Stolee
When consuming a priority queue, it can be convenient to inspect
the next object that will be dequeued without actually dequeueing
it. Our existing library did not have such a 'peek' operation, so
add it as prio_queue_peek().

Add a reference-level comparison in t/helper/test-prio-queue.c
so this method is exercised by t0009-prio-queue.sh. Further, add
a test that checks the behavior when the compare function is NULL
(i.e. the queue becomes a stack).

Signed-off-by: Derrick Stolee 
---
 prio-queue.c   |  9 +
 prio-queue.h   |  6 ++
 t/helper/test-prio-queue.c | 26 ++
 t/t0009-prio-queue.sh  | 14 ++
 4 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/prio-queue.c b/prio-queue.c
index a078451872..d3f488cb05 100644
--- a/prio-queue.c
+++ b/prio-queue.c
@@ -85,3 +85,12 @@ void *prio_queue_get(struct prio_queue *queue)
}
return result;
 }
+
+void *prio_queue_peek(struct prio_queue *queue)
+{
+   if (!queue->nr)
+   return NULL;
+   if (!queue->compare)
+   return queue->array[queue->nr - 1].data;
+   return queue->array[0].data;
+}
diff --git a/prio-queue.h b/prio-queue.h
index d030ec9dd6..682e51867a 100644
--- a/prio-queue.h
+++ b/prio-queue.h
@@ -46,6 +46,12 @@ extern void prio_queue_put(struct prio_queue *, void *thing);
  */
 extern void *prio_queue_get(struct prio_queue *);
 
+/*
+ * Gain access to the "thing" that would be returned by
+ * prio_queue_get, but do not remove it from the queue.
+ */
+extern void *prio_queue_peek(struct prio_queue *);
+
 extern void clear_prio_queue(struct prio_queue *);
 
 /* Reverse the LIFO elements */
diff --git a/t/helper/test-prio-queue.c b/t/helper/test-prio-queue.c
index 9807b649b1..5bc9c46ea5 100644
--- a/t/helper/test-prio-queue.c
+++ b/t/helper/test-prio-queue.c
@@ -22,14 +22,24 @@ int cmd__prio_queue(int argc, const char **argv)
struct prio_queue pq = { intcmp };
 
while (*++argv) {
-   if (!strcmp(*argv, "get"))
-   show(prio_queue_get());
-   else if (!strcmp(*argv, "dump")) {
-   int *v;
-   while ((v = prio_queue_get()))
-  show(v);
-   }
-   else {
+   if (!strcmp(*argv, "get")) {
+   void *peek = prio_queue_peek();
+   void *get = prio_queue_get();
+   if (peek != get)
+   BUG("peek and get results do not match");
+   show(get);
+   } else if (!strcmp(*argv, "dump")) {
+   void *peek;
+   void *get;
+   while ((peek = prio_queue_peek())) {
+   get = prio_queue_get();
+   if (peek != get)
+   BUG("peek and get results do not 
match");
+   show(get);
+   }
+   } else if (!strcmp(*argv, "stack")) {
+   pq.compare = NULL;
+   } else {
int *v = malloc(sizeof(*v));
*v = atoi(*argv);
prio_queue_put(, v);
diff --git a/t/t0009-prio-queue.sh b/t/t0009-prio-queue.sh
index e56dfce668..3941ad2528 100755
--- a/t/t0009-prio-queue.sh
+++ b/t/t0009-prio-queue.sh
@@ -47,4 +47,18 @@ test_expect_success 'notice empty queue' '
test_cmp expect actual
 '
 
+cat >expect <<'EOF'
+3
+2
+6
+4
+5
+1
+8
+EOF
+test_expect_success 'stack order' '
+   test-tool prio-queue stack 8 1 5 4 6 2 3 dump >actual &&
+   test_cmp expect actual
+'
+
 test_done

base-commit: 2d3b1c576c85b7f5db1f418907af00ab88e0c303
-- 
2.19.1.542.gc4df23f792



Re: [RFC] Generation Number v2

2018-11-01 Thread Derrick Stolee

On 11/1/2018 8:27 AM, Jakub Narebski wrote:

[I have noticed that in some places I wrote A..B instead of B..A.  Sorry
  about that]

Derrick Stolee  writes:


Please also let me know about any additional tests that I could
run. Now that I've got a lot of test scripts built up, I can re-run
the test suite pretty quickly.

I would add straighforward 1-to-N and N-to-1 reachability tests in the
form of `git branch / tag --contains` and `git branch / tag --merged`,
and perhaps also ahead-behind calculations used by `git status`, and
N-to-M reachability tests used by tag following code in push and fetch.

Possibly also A...B walk, if it is not implemented via calculating
merge-base.


I believe this uses paint_down_to_common(), but looks at the PARENT1 and
PARENT2 flags afterward to determine the left/right/both relationships.


Maybe even --ancestry-path walk, though I am not sure how important best
performance for rhis case is (we would want good performance, but
perhaps best is not needed for rarely used command).


Currently, the implementation of --ancestry-path does not use a
reachability index.



See explanation below.



Generation Number Performance Test
==

Git uses the commit history to answer many basic questions, such as
computing a merge-base. Some of these algorithms can benefit from a
_reachability index_, which is a condition that allows us to answer
"commit A cannot reach commit B" in some cases. These require pre-
computing some values that we can load and use at run-time. Git
already has a notion of _generation number_, stores it in the commit-
graph file, and uses it in several reachability algorithms.

Note that there are other kinds of reachability indices.

First, there are reachability indices that can answer the full
reachability query (if commit A can reach commit B, or if commit A
cannot reach commit B) directly, without walking the commit graph at
all: so called label-only approach.  For example one could store for
each commit the compressed list of all commits reachable from it
(transitive closure compression).

Those, I think (but I have not checked), would be of not much use for
Git, as the size of the index grows stronger than linear with the number
of commits, as grows the time to compute such index.  So probably of no
use to Git, at least not directly (Git uses so called "bitmap index",
see e.g. [1], which stores reachability bit-vector as compressed
bitmap... but not for all commits, only for a small subset).


Second, beside negative-cut reachability indexes, that can answer with
certainity that "commit A cannot reach commit B", like the generation
numbers (also known as level, or topological level), there also
positive-cut indexes (usually if not always based on the spanning tree
or trees for the DAG), that can answer when "commit A can reach commit
B".

I think that those can lead to significant speedups for at least some
types of commit traversal and reachability queries that Git needs to
answer.  But currently no algorithms has a provision for using
positive-cut filter index.  Anyway, such index would be auxiliary thing,
see the next point.


Third, we require more than having reachability index in the sense of
having some kind of _labelling_, often composite, that can answer either
"commit A cannot reach commit B" or "commit A can reach commit B",
depending on the type.  Git for most operations needs more, namely an
actual _ordering_, that is a weak order (or to be more exact a total
preorder, i.e. there can be two different commits with the same
"generation number" or index, but always either idx(A) ≲ idx(B) or
idx(B) ≲ idx(A)) and not only partial ordering (where there can be
incomparable elements, i.e neither idx(A) ≼ idx(B) nor idx(B) ≼ idx(A)).
This is because it needs to be used in priority queue to decide which
commit to travel next; more on that later.  This is also why there can
be only one such "main" reachability _index_.

[1]: https://githubengineering.com/counting-objects/


Thanks for the details. At the moment, I'm only interested in improving our
negative-cut reachability index, as we have algorithms that can take 
advantage

of them (and can compare their performance directly).




You can read more about generation numbers and how to use them in
algorithms in [this blog
post](https://blogs.msdn.microsoft.com/devops/2018/07/09/supercharging-the-git-commit-graph-iii-generations/).

However, [generation numbers do not always improve our
algorithms](https://public-inbox.org/git/pull.28.git.gitgitgad...@gmail.com/T/#u).
Specifically, some algorithms in Git already use commit date as a
heuristic reachability index. This has some problems, though, since
commit date can be incorrect for several reasons (clock skew between
machines, purposefully setting GIT_COMMIT_DATE to the author date,
etc.).

That's what we use the "slop" mechanism for: Git would walk a few
commits more than necessary than if there were no clock skew (if 

Re: ab/* topics

2018-11-01 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> Could you please pick up
> https://public-inbox.org/git/20181024114725.3927-1-ava...@gmail.com/ ?
> It seems to have fallen between the cracks and addressed the feedback on
> v1, and looks good to me (and nobody's objected so far...).

If this is the runtime-gettext-poison thing, this did not fall thru
the cracks---I didn't get the impression that "no objection was a
sign of being excellent"; rather I recall feeling that you were the
only person who were excited about it, while everybody else was
"Meh".

Thanks for pinging.  It is very possible that I didn't read (or
rememer) the thread correctly.  Let me go back to the archive in the
morning to double check.




Re: [RFE] Please add name and email to git credentials

2018-11-01 Thread Ævar Arnfjörð Bjarmason


On Thu, Nov 01 2018, Nicolas Mailhot wrote:

> Le jeudi 01 novembre 2018 à 12:22 +0100, Ævar Arnfjörð Bjarmason a
> écrit :
>>
>> Where would we get an E-Mail to lookup to pass to the helper? Are you
>> just asking that the helper git the result of $(git config user.name
>> &&
>> git config user.email)? If so why can't it just look this up itself?
>
>
> So, just in case it was not clear enough, allow things in .gitconfig
> like
>
> [credential "https://pkgs.fedoraproject.org/;]
> username = doe4ever
> name = John Doe
> email = doe4e...@fedoraproject.org
> [credential "https://gitlab.corp.com/;]
> username = jdoe56874
> name = John Doe, Snr Engineer
> email = john@corp.com
>
> Instead of just
>
> [user]
> name = John Doe
> email =  john@corp.com
> [credential "https://pkgs.fedoraproject.org/;]
> username = doe4ever
> [credential "https://gitlab.corp.com/;]
> username = jdoe56874
>
> and drat, I've commited to XXX with the wrong name/email again

Aaaah! So really you just want to set user.{name,email} if you match a
given URL in the project, and this per-se has nothing to do with
credentials..

Yeah that's a fair request. Although I think tying that up with
credential.* doesn't make sense because we'd:

 1) Need yet another place (config, env vars, now this...) to search for
what we're putting in the commit object.

 2) Users want to configure this for e.g. different URLs even though
they don't need different credentials for the two.

I'm too lazy to dig up the thread, but there's been a discussion before
of extending the IncludeIf syntax to support more things that "gitdir",
e.g. matching on the remote URL.

So then you'd do:

[credential "https://pkgs.fedoraproject.org/;]
username = doe4ever
[IncludeIf "remote:https://pkgs.fedoraproject.org/*;]
path ~/.gitconfig.d/fedoraproject.config

But now what you need to do is clone all the projects in
e.g. ~/git/fedoraproject/* and do:

[credential "https://pkgs.fedoraproject.org/;]
username = doe4ever
[IncludeIf "gitdir:~/g/fedoraproject/*"]
path ~/.gitconfig.d/fedoraproject.config


Re: ab/* topics (was: Re: What's cooking in git.git (Nov 2018, #01; Thu, 1))

2018-11-01 Thread SZEDER Gábor
On Thu, Nov 01, 2018 at 12:02:01PM +0100, Ævar Arnfjörð Bjarmason wrote:
> Could you please pick up
> https://public-inbox.org/git/20181024114725.3927-1-ava...@gmail.com/ ?
> It seems to have fallen between the cracks and addressed the feedback on
> v1, and looks good to me (and nobody's objected so far...).

I didn't object, because in

  https://public-inbox.org/git/87muqzllh0@evledraar.gmail.com/

you asked for "a more general review than just the problem of how
we turn an env variable into a boolean".

However, if you push that patch with 'sh-i18n--helper' as-is, then I
do object now: parsing a boolean in shell is not at all that difficult
to justify this new command.



Re: [RFE] Please add a standard ref object to releases

2018-11-01 Thread Ævar Arnfjörð Bjarmason


On Thu, Nov 01 2018, Nicolas Mailhot wrote:

> Le jeudi 01 novembre 2018 à 12:15 +0100, Ævar Arnfjörð Bjarmason a
> écrit :
>>
>> For both this and your other report, it would be helpful if you
>> describe
>> in concrete terms (with examples of git commands, or UI etc.) what git
>> commands do now, what's wrong with it, and some sketch of what you
>> expect an alternate interface to look like.
>>
>> As for this report: I know this area of git quite well, but I still
>> have no idea quite what you're asking for.
>
> It says a lot that you claim to know this area of git quite well, but
> have no understanding how it is (mis)used on github/gitlab/bitbucket/etc
> by people who actually try to use git.

Yeah I think it says a lot about how vague your proposal is.

You're the one trying to ostensibly file some sort of bug report or
feature request, it's up to you to explain the problem in detail.

> I’m just asking that when a project releases “x.y.z”
>
> 1. there was a standard git command available to it to create "the
> release “x.y.z”" ref

And would these also be annotated tags, or are you proposing that we add
a new "release" object type in addition to blob/tree/commit/tag? If so
what would that look like?

> 2. there was a git syntax to say "the release “x.y.z”" ref in all git
> commands that manipulate refs

What do you mean "git syntaX"? The rev-parse syntax (see 'git help
rev-parse`) or something else?

> 3. that this "release “x.y.z”" ref could be used in all the "download
> release “x.y.z”" on GitLab/GitHub, etc

So instead of offering a download of annotated tags as they do now, see
https://github.com/git/git/releases and
https://gitlab.com/git-vcs/git/tags they'd offer a download of whatevr
this new thing is?

> 4. that there was no fuziness or human interpretation involved into
> converting "x.y.z" into the syntax used to select "release “x.y.z”" in
> git commands

So since we'd store this in refs/* somewhere since it's (presumably)
named named ref of some sort, you're suggesting that this thing be
exempt from the DWYM name resolution of refs, e.g. us resolving "master"
to "refs/heads/master" or v1.0 to "refs/tags/v1.0" (but as you note
that's ambiguous).

So you'd always need to do e.g. "git show refs/releases/v1.0"?

> 5. there was no ambiguïty in this syntax that led to the selection of
> things that are not "release" refs and just look like them

I don't know what you mean by "syntax" and "selection" in this context.

> And **not** the current situation where there are no official "release
> ref" objects and "just map release names to tags, mapping recipe is left
> to the reader". Because everyone invents its own recipe and the result
> does not work.

I'm still entirely unclear about "does not work" here.

> And no, vfoo is not a form of release ref, because v1 can be a branch,
> not a tag, version3 tag is not the release ersion3, etc (real-world
> examples I can provide links if you don't believe me). You can’t let
> things undefined as they are now because git users as a whole are making
> a mess of things without tooling help.

This seems like another complaint about the ref ambiguities. That's fair
enough, but how is that per-se related to this proposal? We also have
ambiguity now between "master" tags and branches.

>> If we assume this is a good idea, how do you imagine this would work
>> once you don't just have two levels (random labels v.s. "real"
>> releases)
>> but three or more (random labels v.s. "real" releases v.s. "LTS"
>> releases, )?
>
> IMHO you’re over-engineering things there. There is a need for separate
> release refs, as evidenced by the fact every major git web frontend had
> to separate them from normal tags in its UI. I'm not aware of the same
> thing for LTS or whatever.
>
> Of course implementing generic namespacing, would be a way to get a
> separate release namespace. As long as this release namespace is
> unambiguously defined at the git level without replaying the 'just
> invent your own tag recipe' mess at another level.

It's not over-engineering. Right now we have a hierarchical namespace
for tags and branches, so if you want N levels deep or whatever you do
that with prefixes or subdirectories, and there's plenty of examples of
that in the wild. E.g. tags created for hacking, QA, pre-release,
release, LTS etc.

Whereas if you're proposing some mechanism that we draw a line somewhere
in that flow and say "now tag/mark/release stuff as different sorts of
objects" it's up to you to convince us why that's a realistic view of
the world.


Re: [RFE] Please add name and email to git credentials

2018-11-01 Thread Nicolas Mailhot
Le jeudi 01 novembre 2018 à 12:22 +0100, Ævar Arnfjörð Bjarmason a
écrit :
> 
> Where would we get an E-Mail to lookup to pass to the helper? Are you
> just asking that the helper git the result of $(git config user.name
> &&
> git config user.email)? If so why can't it just look this up itself?


So, just in case it was not clear enough, allow things in .gitconfig
like

[credential "https://pkgs.fedoraproject.org/;]
username = doe4ever
name = John Doe
email = doe4e...@fedoraproject.org
[credential "https://gitlab.corp.com/;]
username = jdoe56874
name = John Doe, Snr Engineer
email = john@corp.com

Instead of just

[user]
name = John Doe
email =  john@corp.com
[credential "https://pkgs.fedoraproject.org/;]
username = doe4ever
[credential "https://gitlab.corp.com/;]
username = jdoe56874

and drat, I've commited to XXX with the wrong name/email again


-- 
Nicolas Mailhot



Re: [PATCH 0/3] Make add_missing_tags() linear

2018-11-01 Thread Derrick Stolee

On 11/1/2018 2:52 AM, Elijah Newren wrote:

On Wed, Oct 31, 2018 at 5:05 AM Derrick Stolee  wrote:

On 10/31/2018 2:04 AM, Elijah Newren wrote:


On the original repo where the topic was brought up, with commit-graph
NOT turned on and using origin/master, I see:

$ time git push --dry-run --follow-tags /home/newren/repo-mirror
To /home/newren/repo-mirror
  * [new branch]   test5 -> test5

real 1m20.081s
user 1m19.688s
sys 0m0.292s

Merging this series in, I now get:

$ time git push --dry-run --follow-tags /home/newren/repo-mirror
To /home/newren/repo-mirror
  * [new branch]   test5 -> test5

real 0m2.857s
user 0m2.580s
sys 0m0.328s

which provides a very nice speedup.

Oddly enough, if I _also_ do the following:
$ git config core.commitgraph true
$ git config gc.writecommitgraph true
$ git gc

then my timing actually slows down just slightly:
$ time git push --follow-tags --dry-run /home/newren/repo-mirror
To /home/newren/repo-mirror
  * [new branch]  test5 -> test5

real 0m3.027s
user 0m2.696s
sys 0m0.400s

So you say that the commit-graph is off in the 2.8s case, but not here
in the 3.1s case? I would expect _at minimum_ that the cost of parsing
commits would have a speedup in the commit-graph case.  There may be
something else going on here, since you are timing a `push` event that
is doing more than the current walk.


(run-to-run variation seems pretty consistent, < .1s variation, so
this difference is just enough to notice.)  I wouldn't be that
surprised if that means there's some really old tags with very small
generation numbers, meaning it's not gaining anything in this special
case from the commit-graph, but it does pay the cost of loading the
commit-graph.

While you have this test environment, do you mind applying the diff
below and re-running the tests? It will output a count for how many
commits are walked by the algorithm. This should help us determine if
this is another case where generation numbers are worse than commit-date,
or if there is something else going on. Thanks!

I can do that, but wouldn't you want a similar patch for the old
get_merge_bases_many() in order to compare?  Does an absolute number
help by itself?
It's going to have to be tomorrow, though; not enough time tonight.


No rush. I'd just like to understand how removing the commit-graph file
can make the new algorithm faster. Putting a similar count in the old
algorithm would involve giving a count for every call to
in_merge_bases_many(), which would be very noisy.

Thanks!
-Stolee


Re: [RFC] Generation Number v2

2018-11-01 Thread Jakub Narebski
[I have noticed that in some places I wrote A..B instead of B..A.  Sorry
 about that]

Derrick Stolee  writes:

> We've discussed in several places how to improve upon generation
> numbers. This RFC is a report based on my investigation into a
> few new options, and how they compare for Git's purposes on
> several existing open-source repos.
>
> You can find this report and the associated test scripts at
> https://github.com/derrickstolee/gen-test.

Which use prototype test implementation from the `reach-perf` branch in
https://github.com/derrickstolee/git.

> I have some more work to do in order to make this fully
> reproducible (including a single script to clone all of the
> repos I used, compute the commit-graphs using the different
> index versions, and run the tests).
>
> TL;DR: I recommend we pursue one of the following two options:
>
> 1. Maximum Generation Number.
> 2. Corrected Commit Date.
>
> Each of these perform well, but have different implementation
> drawbacks. I'd like to hear what everyone thinks about those
> drawbacks.

I agree with Junio that incremental computation is more important than
backwards compatibility (that the old clients can work with the new
commit-graph without changes).  We would have compatibility breaking
anyway with the planned move from SHA-1 to NewHash aka. SHA-256.

> Please also let me know about any additional tests that I could
> run. Now that I've got a lot of test scripts built up, I can re-run
> the test suite pretty quickly.

I would add straighforward 1-to-N and N-to-1 reachability tests in the
form of `git branch / tag --contains` and `git branch / tag --merged`,
and perhaps also ahead-behind calculations used by `git status`, and
N-to-M reachability tests used by tag following code in push and fetch.

Possibly also A...B walk, if it is not implemented via calculating
merge-base.

Maybe even --ancestry-path walk, though I am not sure how important best
performance for rhis case is (we would want good performance, but
perhaps best is not needed for rarely used command).

See explanation below.


> Generation Number Performance Test
> ==
>
> Git uses the commit history to answer many basic questions, such as
> computing a merge-base. Some of these algorithms can benefit from a
> _reachability index_, which is a condition that allows us to answer
> "commit A cannot reach commit B" in some cases. These require pre-
> computing some values that we can load and use at run-time. Git
> already has a notion of _generation number_, stores it in the commit-
> graph file, and uses it in several reachability algorithms.

Note that there are other kinds of reachability indices.

First, there are reachability indices that can answer the full
reachability query (if commit A can reach commit B, or if commit A
cannot reach commit B) directly, without walking the commit graph at
all: so called label-only approach.  For example one could store for
each commit the compressed list of all commits reachable from it
(transitive closure compression).

Those, I think (but I have not checked), would be of not much use for
Git, as the size of the index grows stronger than linear with the number
of commits, as grows the time to compute such index.  So probably of no
use to Git, at least not directly (Git uses so called "bitmap index",
see e.g. [1], which stores reachability bit-vector as compressed
bitmap... but not for all commits, only for a small subset).


Second, beside negative-cut reachability indexes, that can answer with
certainity that "commit A cannot reach commit B", like the generation
numbers (also known as level, or topological level), there also
positive-cut indexes (usually if not always based on the spanning tree
or trees for the DAG), that can answer when "commit A can reach commit
B".

I think that those can lead to significant speedups for at least some
types of commit traversal and reachability queries that Git needs to
answer.  But currently no algorithms has a provision for using
positive-cut filter index.  Anyway, such index would be auxiliary thing,
see the next point.


Third, we require more than having reachability index in the sense of
having some kind of _labelling_, often composite, that can answer either
"commit A cannot reach commit B" or "commit A can reach commit B",
depending on the type.  Git for most operations needs more, namely an
actual _ordering_, that is a weak order (or to be more exact a total
preorder, i.e. there can be two different commits with the same
"generation number" or index, but always either idx(A) ≲ idx(B) or
idx(B) ≲ idx(A)) and not only partial ordering (where there can be
incomparable elements, i.e neither idx(A) ≼ idx(B) nor idx(B) ≼ idx(A)).
This is because it needs to be used in priority queue to decide which
commit to travel next; more on that later.  This is also why there can
be only one such "main" reachability _index_.

[1]: https://githubengineering.com/counting-objects/

> You can 

Re: [RFE] Please add a standard ref object to releases

2018-11-01 Thread Nicolas Mailhot
Le jeudi 01 novembre 2018 à 12:15 +0100, Ævar Arnfjörð Bjarmason a
écrit :
> 
> For both this and your other report, it would be helpful if you
> describe
> in concrete terms (with examples of git commands, or UI etc.) what git
> commands do now, what's wrong with it, and some sketch of what you
> expect an alternate interface to look like.
> 
> As for this report: I know this area of git quite well, but I still
> have no idea quite what you're asking for.

It says a lot that you claim to know this area of git quite well, but
have no understanding how it is (mis)used on github/gitlab/bitbucket/etc
by people who actually try to use git.

I’m just asking that when a project releases “x.y.z”

1. there was a standard git command available to it to create "the
release “x.y.z”" ref

2. there was a git syntax to say "the release “x.y.z”" ref in all git
commands that manipulate refs

3. that this "release “x.y.z”" ref could be used in all the "download
release “x.y.z”" on GitLab/GitHub, etc

4. that there was no fuziness or human interpretation involved into
converting "x.y.z" into the syntax used to select "release “x.y.z”" in
git commands

5. there was no ambiguïty in this syntax that led to the selection of
things that are not "release" refs and just look like them

And **not** the current situation where there are no official "release
ref" objects and "just map release names to tags, mapping recipe is left
to the reader". Because everyone invents its own recipe and the result
does not work.

And no, vfoo is not a form of release ref, because v1 can be a branch,
not a tag, version3 tag is not the release ersion3, etc (real-world
examples I can provide links if you don't believe me). You can’t let
things undefined as they are now because git users as a whole are making
a mess of things without tooling help.

> If we assume this is a good idea, how do you imagine this would work
> once you don't just have two levels (random labels v.s. "real"
> releases)
> but three or more (random labels v.s. "real" releases v.s. "LTS"
> releases, )?

IMHO you’re over-engineering things there. There is a need for separate
release refs, as evidenced by the fact every major git web frontend had
to separate them from normal tags in its UI. I'm not aware of the same
thing for LTS or whatever.

Of course implementing generic namespacing, would be a way to get a
separate release namespace. As long as this release namespace is
unambiguously defined at the git level without replaying the 'just
invent your own tag recipe' mess at another level.

Regards,

-- 
Nicolas Mailhot



[PATCH 3/3] [Outreachy] stash: use set_fallback_ident() function

2018-11-01 Thread Slavica Djukic
Call set_fallback_ident() in cmd_stash() and update test
from the first commit to expect success.

Executing stash without user.name and user.email configured
can be useful when bots or similar users use stash, without anyone
specifing valid ident. Use case would be automated testing.
There are also users who  find this convinient.
For example, in this thread:
https://public-inbox.org/git/87o9debty4@evledraar.gmail.com/T/#ma4fb50903a54cbcdecd4ef05856bf8094bc3c323
user points out that he would find it useful if stash had --author option.

Signed-off-by: Slavica Djukic 
---
 builtin/stash.c  | 1 +
 t/t3903-stash.sh | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index 965e938ebd..add30aae64 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1523,6 +1523,7 @@ int cmd_stash(int argc, const char **argv, const char 
*prefix)
trace_repo_setup(prefix);
setup_work_tree();
 
+   set_fallback_ident("git stash", "stash@git.commands");
git_config(git_default_config, NULL);
 
argc = parse_options(argc, argv, prefix, options, git_stash_usage,
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index aaff36978e..06a2ffb398 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1156,7 +1156,7 @@ test_expect_success 'stash --  works with binary 
files' '
test_path_is_file subdir/untracked
 '
 
-test_expect_failure 'stash works when user.name and user.email are not set' '
+test_expect_success 'stash works when user.name and user.email are not set' '
git reset &&
>1 &&
git add 1 &&
-- 
2.19.1.windows.1



[PATCH 2/3] [Outreachy] ident: introduce set_fallback_ident() function

2018-11-01 Thread Slavica Djukic
Usually, when creating a commit, ident is needed to record the author
and commiter.
But, when there is commit not intended to published, e.g. when stashing
changes,  valid ident is not necessary.
To allow creating commits in such scenario, let's introduce helper
function "set_fallback_ident(), which will pre-load the ident.

In following commit, set_fallback_ident() function will be called in stash.

Signed-off-by: Slavica Djukic 
---
 cache.h |  1 +
 ident.c | 17 +
 2 files changed, 18 insertions(+)

diff --git a/cache.h b/cache.h
index 681307f716..6b5b559a05 100644
--- a/cache.h
+++ b/cache.h
@@ -1470,6 +1470,7 @@ extern const char *git_sequence_editor(void);
 extern const char *git_pager(int stdout_is_tty);
 extern int is_terminal_dumb(void);
 extern int git_ident_config(const char *, const char *, void *);
+void set_fallback_ident(const char *name, const char *email);
 extern void reset_ident_date(void);
 
 struct ident_split {
diff --git a/ident.c b/ident.c
index 33bcf40644..410bd495e9 100644
--- a/ident.c
+++ b/ident.c
@@ -505,6 +505,23 @@ int git_ident_config(const char *var, const char *value, 
void *data)
return 0;
 }
 
+void set_fallback_ident(const char *name, const char *email)
+{
+   if (!git_default_name.len) {
+   strbuf_addstr(_default_name, name);
+   committer_ident_explicitly_given |= IDENT_NAME_GIVEN;
+   author_ident_explicitly_given |= IDENT_NAME_GIVEN;
+   ident_config_given |= IDENT_NAME_GIVEN;
+   }
+
+   if (!git_default_email.len) {
+   strbuf_addstr(_default_email, email);
+   committer_ident_explicitly_given |= IDENT_MAIL_GIVEN;
+   author_ident_explicitly_given |= IDENT_MAIL_GIVEN;
+   ident_config_given |= IDENT_MAIL_GIVEN;
+   }
+}
+
 static int buf_cmp(const char *a_begin, const char *a_end,
   const char *b_begin, const char *b_end)
 {
-- 
2.19.1.windows.1



[PATCH 1/3][Outreachy] t3903-stash: test without configured user.name and user.email

2018-11-01 Thread Slavica Djukic
Add test to assert that stash fails if user.name and user.email
are not configured.
In the final commit, test will be updated to expect success.

Signed-off-by: Slavica Djukic 
---
 t/t3903-stash.sh | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 9e06494ba0..aaff36978e 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1156,4 +1156,19 @@ test_expect_success 'stash --  works with binary 
files' '
test_path_is_file subdir/untracked
 '
 
+test_expect_failure 'stash works when user.name and user.email are not set' '
+   git reset &&
+   >1 &&
+   git add 1 &&
+   test_config user.useconfigonly true &&
+   test_config stash.usebuiltin true &&
+   sane_unset GIT_AUTHOR_NAME &&
+   sane_unset GIT_AUTHOR_EMAIL &&
+   sane_unset GIT_COMMITTER_NAME &&
+   sane_unset GIT_COMMITTER_EMAIL &&
+   test_unconfig user.email &&
+   test_unconfig user.name &&
+   git stash
+'
+
 test_done
-- 
2.19.1.windows.1



[PATCH 0/3] [Outreachy] make stash work if user.name and user.email are not configured

2018-11-01 Thread Slavica Djukic
Enhancement request that ask for 'git stash' to work even if 
'user.name' and 'user.email' are not configured.
Due to an implementation detail, git-stash undesirably requires 
'user.name' and 'user.email' to be set, but shouldn't.

Slavica Djukic(3):
  [Outreachy] t3903-stash: test without configured user.name and
user.email
  [Outreachy] ident: introduce set_fallback_ident() function
  [Outreachy] stash: use set_fallback_ident() function

 builtin/stash.c  |  1 +
 cache.h  |  1 +
 ident.c  | 17 +
 t/t3903-stash.sh | 15 +++
 4 files changed, 34 insertions(+)

-- 
2.19.1.windows.1



[PATCH] travis-ci: install packages in 'ci/install-dependencies.sh'

2018-11-01 Thread SZEDER Gábor
Ever since we started using Travis CI, we specified the list of
packages to install in '.travis.yml' via the APT addon.  While running
our builds on Travis CI's container-based infrastructure we didn't
have another choice, because that environment didn't support 'sudo',
and thus we didn't have permission to install packages ourselves.  With
the switch to the VM-based infrastructure in the previous patch we do
get a working 'sudo', so we can install packages by running 'sudo
apt-get -y install ...' as well.

Let's make use of this and install necessary packages in
'ci/install-dependencies.sh', so all the dependencies (i.e. both
packages and "non-packages" (P4 and Git-LFS)) are handled in the same
file.  Install gcc-8 only in the 'linux-gcc' build job; so far it has
been unnecessarily installed in the 'linux-clang' build job as well.
Print the versions of P4 and Git-LFS conditionally, i.e. only when
they have been installed; with this change even the static analysis
and documentation build jobs start using 'ci/install-dependencies.sh'
to install packages, and neither of these two build jobs depend on and
thus install those.

This change will presumably be beneficial for the upcoming Azure
Pipelines integration [1]: preliminary versions of that patch series
run a couple of 'apt-get' commands to install the necessary packages
before running 'ci/install-dependencies.sh', but with this patch it
will be sufficient to run only 'ci/install-dependencies.sh'.

[1] 
https://public-inbox.org/git/1a22efe849d6da79f2c639c62a1483361a130238.1539598316.git.gitgitgad...@gmail.com/

Signed-off-by: SZEDER Gábor 
---

This patch should go on top of 'ss/travis-ci-force-vm-mode'.

I'm not sure about the last paragraph, because:

  - It talks about presumed benefits for a currently still
work-in-progress patch series of an other contributor, and I'm not
really sure that that's a good thing.  Perhaps I should have
rather put it below the '---'.

  - I'm confused about the name of this Azure thing.  The cover letter
mentions "Azure Pipelines", the file is called
'azure-pipelines.yml', but the relevant patch I link to talks
about "Azure DevOps" in the commit message.

Anyway, keep that last paragraph or drop it as you see fit.


 .travis.yml| 21 -
 ci/install-dependencies.sh | 35 +--
 2 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 8d2499739e..a5a82d6832 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -12,16 +12,6 @@ compiler:
   - clang
   - gcc
 
-addons:
-  apt:
-sources:
-- ubuntu-toolchain-r-test
-packages:
-- language-pack-is
-- git-svn
-- apache2
-- gcc-8
-
 matrix:
   include:
 - env: jobname=GETTEXT_POISON
@@ -50,22 +40,11 @@ matrix:
 - env: jobname=StaticAnalysis
   os: linux
   compiler:
-  addons:
-apt:
-  packages:
-  - coccinelle
-  before_install:
   script: ci/run-static-analysis.sh
   after_failure:
 - env: jobname=Documentation
   os: linux
   compiler:
-  addons:
-apt:
-  packages:
-  - asciidoc
-  - xmlto
-  before_install:
   script: ci/test-documentation.sh
   after_failure:
 
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 75a9fd2475..06c3546e1e 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -10,6 +10,15 @@ 
LFSWHENCE=https://github.com/github/git-lfs/releases/download/v$LINUX_GIT_LFS_VE
 
 case "$jobname" in
 linux-clang|linux-gcc)
+   sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
+   sudo apt-get -q update
+   sudo apt-get -q -y install language-pack-is git-svn apache2
+   case "$jobname" in
+   linux-gcc)
+   sudo apt-get -q -y install gcc-8
+   ;;
+   esac
+
mkdir --parents "$P4_PATH"
pushd "$P4_PATH"
wget --quiet "$P4WHENCE/bin.linux26x86_64/p4d"
@@ -32,11 +41,25 @@ osx-clang|osx-gcc)
brew link --force gettext
brew install caskroom/cask/perforce
;;
+StaticAnalysis)
+   sudo apt-get -q update
+   sudo apt-get -q -y install coccinelle
+   ;;
+Documentation)
+   sudo apt-get -q update
+   sudo apt-get -q -y install asciidoc xmlto
+   ;;
 esac
 
-echo "$(tput setaf 6)Perforce Server Version$(tput sgr0)"
-p4d -V | grep Rev.
-echo "$(tput setaf 6)Perforce Client Version$(tput sgr0)"
-p4 -V | grep Rev.
-echo "$(tput setaf 6)Git-LFS Version$(tput sgr0)"
-git-lfs version
+if type p4d >/dev/null && type p4 >/dev/null
+then
+   echo "$(tput setaf 6)Perforce Server Version$(tput sgr0)"
+   p4d -V | grep Rev.
+   echo "$(tput setaf 6)Perforce Client Version$(tput sgr0)"
+   p4 -V | grep Rev.
+fi
+if type git-lfs >/dev/null
+then
+   echo "$(tput setaf 6)Git-LFS Version$(tput sgr0)"
+   git-lfs version
+fi
-- 

Re: [RFE] Please add name and email to git credentials

2018-11-01 Thread Nicolas Mailhot
Le jeudi 01 novembre 2018 à 10:59 +0100, Nicolas Mailhot a écrit :
> Hi,
> 
> A dev persona is not just a username, please add email (and probably
> also name) support to git credentials so the correct set for a repo
> url
> is automatically picked up by git

So, just in case it was not clear enough, allow things in .gitconfig
like

[credential "https://pkgs.fedoraproject.org/;]
username = doe4ever
name = John Doe
email = foo4e...@fedoraproject.org
[credential "https://gitlab.corp.com/;]
username = jdoe56874
name = John Doe, Snr Engineer
email = john@corp.com

Instead of just

[user]
name = John Doe
email =  john@corp.com
[credential "https://pkgs.fedoraproject.org/;]
username = doe4ever
email = f...@fedoraproject.org
[credential "https://gitlab.corp.com/;]
username = jdoe56874

and drat, I've commited to XXX with the wrong name/email again

-- 
Nicolas Mailhot



Re: [RFE] Please add name and email to git credentials

2018-11-01 Thread Ævar Arnfjörð Bjarmason


On Thu, Nov 01 2018, Nicolas Mailhot wrote:

> A dev persona is not just a username, please add email (and probably
> also name) support to git credentials so the correct set for a repo url
> is automatically picked up by git

The "git-credential" helper needs to look at a URL like
g...@github.com:git/git.git and decide that protocol=ssh, username=git,
host=github.com, path=git/git.git etc, because that's the credential we
need to lookup to push to.

Where would we get an E-Mail to lookup to pass to the helper? Are you
just asking that the helper git the result of $(git config user.name &&
git config user.email)? If so why can't it just look this up itself?


HOPEFULLY

2018-11-01 Thread Zongo Daniel
Hello

I am Zongo Daniel and work with BICIAB BANK Of Burkina Faso. I have an
important business proposal that will be of mutual benefit.

Please send me your direct number or your personal email so i can tell
you more about my proposal. It is very important we communicate and
you will be very glad with my proposition.

Waiting your response soonest through my alternative address below:

zongodanie...@yahoo.com

My dearest regards.

Zongo Daniel


Re: [RFE] Please add a standard ref object to releases

2018-11-01 Thread Ævar Arnfjörð Bjarmason


On Thu, Nov 01 2018, Nicolas Mailhot wrote:

> git makes no provision for versioned release references.
>
> However, software projects need versioned releases. Software project
> integrators need versionned releases. Security auditors need versioned
> release. Software project users need versioned releases.
>
> Versioned releases are not the same thing as free-form tags. They have
> semantics to allow deducing upgrade paths (usually, a form of semver).
> They imply some form of API stability promise. They imply release
> documentation completion. They're not just a random point in the project
> history like tags are.
>
> This is why most git hosting sites provide a way to select versioned
> releases, even if it's not a native git concept. And this way is clearly
> separate and distinct from git tag selection.
>
> Unfortunately, since git makes no provision for versioned release
> references, git hosting sites have to shove release refs into tag refs.
> And it's a huge mess.
>
> Some put release ids in tags as-is, others add a v prefix, others a
> version_ prefix, it's all hoster-specific, it's all inconsistent. It
> ends up being inconsistent within projects, as they migrate from a
> hoster to another, are mirrored from one hoster to another. It depends
> on the habits of the person cutting a release, and the release manager
> of a project can change over time. It ends up being inconsistent in
> release archives, as the version munging can percolate in the archive
> name and structure, or not, for mysterious heuristic reasons that change
> over time.
>
> As a result, when assembling a project that uses other git repositories,
> you spend more time checking repository by repository and version by
> version how the version ref was mangled in a tag ref for this specific
> (repo,version,date) tuple, than doing actual software dev and QA work.
>
> Please add a specific release reference to git, so software projects
> that do versioned releases can use this ref object directly, without
> needing to invent custom version rewriting rules to shove them in tags
> while marking they are not just tags but release references.

For both this and your other report, it would be helpful if you describe
in concrete terms (with examples of git commands, or UI etc.) what git
commands do now, what's wrong with it, and some sketch of what you
expect an alternate interface to look like.

As for this report: I know this area of git quite well, but I still have
no idea quite what you're asking for.

Do you just mean that we should have some other second-level namespace
for tags, i.e. in addition to refs/tags/* we'd have
refs/this-time-I-really-meant-it-tags/*., and e.g. linux.git and git.git
v* tags would go into that, and have some "git tag --i-really-mean-it"
interface?

If we assume this is a good idea, how do you imagine this would work
once you don't just have two levels (random labels v.s. "real" releases)
but three or more (random labels v.s. "real" releases v.s. "LTS"
releases, )?


ab/* topics (was: Re: What's cooking in git.git (Nov 2018, #01; Thu, 1))

2018-11-01 Thread Ævar Arnfjörð Bjarmason


On Thu, Nov 01 2018, Junio C Hamano wrote:

> * ab/push-dwim-dst (2018-10-29) 9 commits
>  - SQUASH???
>  - push doc: document the DWYM behavior pushing to unqualified 
>  - push: add DWYM support for "git push refs/remotes/...:"
>  - push: test that  doesn't DWYM if  is unqualified
>  - push: add an advice on unqualified  push
>  - push: move unqualified refname error into a function
>  - push: improve the error shown on unqualified  push
>  - i18n: remote.c: mark error(...) messages for translation
>  - remote.c: add braces in anticipation of a follow-up change
>
>  "git push $there $src:$dst" rejects when $dst is not a fully
>  qualified refname and not clear what the end user meant.  The
>  codepath has been taught to give a clearer error message, and also
>  guess where the push should go by taking the type of the pushed
>  object into account (e.g. a tag object would want to go under
>  refs/tags/).
>
>  The last few steps are questionable.
>  cf. <87in1lkw54@evledraar.gmail.com>

Will send an update to this soon.

> * ab/pack-tests-cleanup (2018-10-31) 3 commits
>  - index-pack tests: don't leave test repo dirty at end
>  - pack-objects tests: don't leave test .git corrupt at end
>  - pack-objects test: modernize style
>
>  A couple of tests used to leave the repository in a state that is
>  deliberately corrupt, which have been corrected.
>
>  Will merge to 'next'.

Thanks!

> * ab/reject-alias-loop (2018-10-19) 1 commit
>   (merged to 'next' on 2018-10-26 at bc213f1bef)
>  + alias: detect loops in mixed execution mode
>
>  Two (or more) aliases that mutually refer to each other can form an
>  infinite loop; we now attempt to notice and stop.
>
>  Discarded.
>  Reverted out of 'next'.
>  cf. <87sh0slvxm@evledraar.gmail.com>

*nod* will try to find time to work on this soon, but treating it as
non-urgent.

Could you please pick up
https://public-inbox.org/git/20181024114725.3927-1-ava...@gmail.com/ ?
It seems to have fallen between the cracks and addressed the feedback on
v1, and looks good to me (and nobody's objected so far...).


[RFE] Please add a standard ref object to releases

2018-11-01 Thread Nicolas Mailhot
Hi,

git makes no provision for versioned release references.

However, software projects need versioned releases. Software project
integrators need versionned releases. Security auditors need versioned
release. Software project users need versioned releases.

Versioned releases are not the same thing as free-form tags. They have
semantics to allow deducing upgrade paths (usually, a form of semver).
They imply some form of API stability promise. They imply release
documentation completion. They're not just a random point in the project
history like tags are.

This is why most git hosting sites provide a way to select versioned
releases, even if it's not a native git concept. And this way is clearly
separate and distinct from git tag selection.

Unfortunately, since git makes no provision for versioned release
references, git hosting sites have to shove release refs into tag refs.
And it's a huge mess.

Some put release ids in tags as-is, others add a v prefix, others a
version_ prefix, it's all hoster-specific, it's all inconsistent. It
ends up being inconsistent within projects, as they migrate from a
hoster to another, are mirrored from one hoster to another. It depends
on the habits of the person cutting a release, and the release manager
of a project can change over time. It ends up being inconsistent in
release archives, as the version munging can percolate in the archive
name and structure, or not, for mysterious heuristic reasons that change
over time.

As a result, when assembling a project that uses other git repositories,
you spend more time checking repository by repository and version by
version how the version ref was mangled in a tag ref for this specific
(repo,version,date) tuple, than doing actual software dev and QA work.

Please add a specific release reference to git, so software projects
that do versioned releases can use this ref object directly, without
needing to invent custom version rewriting rules to shove them in tags
while marking they are not just tags but release references.

Regards,


-- 
Nicolas Mailhot



Re: [PATCH 1/1] poll: use GetTickCount64() to avoid wrap-around issues

2018-11-01 Thread Johannes Sixt

Am 31.10.18 um 22:11 schrieb Steve Hoelzer via GitGitGadget:

From: Steve Hoelzer 

 From Visual Studio 2015 Code Analysis: Warning C28159 Consider using
'GetTickCount64' instead of 'GetTickCount'.

Reason: GetTickCount() overflows roughly every 49 days. Code that does
not take that into account can loop indefinitely. GetTickCount64()
operates on 64 bit values and does not have that problem.

Note: this patch has been carried in Git for Windows for almost two
years, but with a fallback for Windows XP, as the GetTickCount64()
function is only available on Windows Vista and later. However, in the
meantime we require Vista or later, hence we can drop that fallback.

Signed-off-by: Steve Hoelzer 
Signed-off-by: Johannes Schindelin 
---
  compat/poll/poll.c | 10 +++---
  1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/compat/poll/poll.c b/compat/poll/poll.c
index ad5dcde439..4abbfcb6a4 100644
--- a/compat/poll/poll.c
+++ b/compat/poll/poll.c
@@ -18,6 +18,9 @@
 You should have received a copy of the GNU General Public License along
 with this program; if not, see .  */
  
+/* To bump the minimum Windows version to Windows Vista */

+#include "git-compat-util.h"
+
  /* Tell gcc not to warn about the (nfd < 0) tests, below.  */
  #if (__GNUC__ == 4 && 3 <= __GNUC_MINOR__) || 4 < __GNUC__
  # pragma GCC diagnostic ignored "-Wtype-limits"
@@ -449,7 +452,8 @@ poll (struct pollfd *pfd, nfds_t nfd, int timeout)
static HANDLE hEvent;
WSANETWORKEVENTS ev;
HANDLE h, handle_array[FD_SETSIZE + 2];
-  DWORD ret, wait_timeout, nhandles, start = 0, elapsed, orig_timeout = 0;
+  DWORD ret, wait_timeout, nhandles, elapsed, orig_timeout = 0;
+  ULONGLONG start = 0;
fd_set rfds, wfds, xfds;
BOOL poll_again;
MSG msg;
@@ -465,7 +469,7 @@ poll (struct pollfd *pfd, nfds_t nfd, int timeout)
if (timeout != INFTIM)
  {
orig_timeout = timeout;
-  start = GetTickCount();
+  start = GetTickCount64();
  }
  
if (!hEvent)

@@ -614,7 +618,7 @@ restart:
  
if (!rc && orig_timeout && timeout != INFTIM)

  {
-  elapsed = GetTickCount() - start;
+  elapsed = (DWORD)(GetTickCount64() - start);


AFAICS, this subtraction in the old code is the correct way to take 
account of wrap-arounds in the tick count. The new code truncates the 64 
bit difference to 32 bits; the result is exactly identical to a 
difference computed from truncated 32 bit values, which is what we had 
in the old code.


IOW, there is no change in behavior. The statement "avoid wrap-around 
issues" in the subject line is not correct. The patch's only effect is 
that it removes Warning C28159.


What is really needed is that all quantities in the calculations are 
promoted to ULONGLONG. Unless, of course, we agree that a timeout of 
more than 49 days cannot happen ;)



timeout = elapsed >= orig_timeout ? 0 : orig_timeout - elapsed;
  }
  


-- Hannes


[RFE] Please add name and email to git credentials

2018-11-01 Thread Nicolas Mailhot
Hi,

A dev persona is not just a username, please add email (and probably
also name) support to git credentials so the correct set for a repo url
is automatically picked up by git

Regards,

-- 
Nicolas Mailhot



What's cooking in git.git (Nov 2018, #01; Thu, 1)

2018-11-01 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

Two groups of "rewrite rebase/rebase-i in C" topics, together with a
handful of associated fix-up topics to them, will all be merged to
'master' tomorrow.  Some of them haven't spent as much time as usual
in 'next', so there still may be rough edges, but let's make sure we
find them and smooth them out before the release.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ah/doc-updates (2018-10-23) 6 commits
  (merged to 'next' on 2018-10-26 at b0bb46a602)
 + doc: fix formatting in git-update-ref
 + doc: fix indentation of listing blocks in gitweb.conf.txt
 + doc: fix descripion for 'git tag --format'
 + doc: fix inappropriate monospace formatting
 + doc: fix ASCII art tab spacing
 + doc: clarify boundaries of 'git worktree list --porcelain'

 Doc updates.


* bc/hash-transition-part-15 (2018-10-15) 15 commits
  (merged to 'next' on 2018-10-26 at 4ff8111b4b)
 + rerere: convert to use the_hash_algo
 + submodule: make zero-oid comparison hash function agnostic
 + apply: rename new_sha1_prefix and old_sha1_prefix
 + apply: replace hard-coded constants
 + tag: express constant in terms of the_hash_algo
 + transport: use parse_oid_hex instead of a constant
 + upload-pack: express constants in terms of the_hash_algo
 + refs/packed-backend: express constants using the_hash_algo
 + packfile: express constants in terms of the_hash_algo
 + pack-revindex: express constants in terms of the_hash_algo
 + builtin/fetch-pack: remove constants with parse_oid_hex
 + builtin/mktree: remove hard-coded constant
 + builtin/repack: replace hard-coded constants
 + pack-bitmap-write: use GIT_MAX_RAWSZ for allocation
 + object_id.cocci: match only expressions of type 'struct object_id'

 More codepaths are moving away from hardcoded hash sizes.


* cb/compat-mmap-is-private-read-only (2018-10-25) 1 commit
  (merged to 'next' on 2018-10-26 at d3bfab3034)
 + compat: make sure git_mmap is not expected to write

 Code tightening.


* cb/khash-maybe-unused-function (2018-10-24) 2 commits
  (merged to 'next' on 2018-10-26 at 17fc4e55a0)
 + khash: silence -Wunused-function for delta-islands
 + commit-slabs: move MAYBE_UNUSED out

 Build fix.


* cb/remove-dead-init (2018-10-19) 2 commits
  (merged to 'next' on 2018-10-26 at ba725a64ad)
 + multi-pack-index: avoid dead store for struct progress
 + unpack-trees: avoid dead store for struct progress

 Code clean-up.


* ch/subtree-build (2018-10-18) 3 commits
  (merged to 'next' on 2018-10-18 at f89fd5e6aa)
 + Revert "subtree: make install targets depend on build targets"
  (merged to 'next' on 2018-10-16 at 919599cc37)
 + subtree: make install targets depend on build targets
  (merged to 'next' on 2018-10-12 at 4ed9ff6300)
 + subtree: add build targets 'man' and 'html'

 Build update for "git subtree" (in contrib/) documentation pages.


* dl/mergetool-gui-option (2018-10-25) 3 commits
  (merged to 'next' on 2018-10-26 at 2c46355e81)
 + doc: document diff/merge.guitool config keys
 + completion: support `git mergetool --[no-]gui`
 + mergetool: accept -g/--[no-]gui as arguments

 "git mergetool" learned to take the "--[no-]gui" option, just like
 "git difftool" does.


* ds/ci-commit-graph-and-midx (2018-10-19) 1 commit
  (merged to 'next' on 2018-10-26 at a13664e49a)
 + ci: add optional test variables

 One of our CI tests to run with "unusual/experimental/random"
 settings now also uses commit-graph and midx.


* ds/reachable (2018-10-23) 1 commit
  (merged to 'next' on 2018-10-26 at 76b5fc9a46)
 + commit-reach: fix cast in compare_commits_by_gen()

 Trivial bugfix.


* ds/reachable-first-parent-fix (2018-10-19) 1 commit
  (merged to 'next' on 2018-10-26 at 076442d512)
 + commit-reach: fix first-parent heuristic

 Correct performance regression in commit ancestry computation when
 generation numbers are involved.


* jc/cocci-preincr (2018-10-24) 2 commits
  (merged to 'next' on 2018-10-26 at cbd98b44e2)
 + fsck: s/++i > 1/i++/
 + cocci: simplify "if (++u > 1)" to "if (u++)"

 Code cleanup.


* jc/receive-deny-current-branch-fix (2018-10-19) 1 commit
  (merged to 'next' on 2018-10-26 at 2975c42215)
 + receive: denyCurrentBranch=updateinstead should not blindly update

 The receive.denyCurrentBranch=updateInstead codepath kicked in even
 when the push should have been rejected due to other reasons, such
 as it does not fast-forward or the update-hook rejects it, which
 has been corrected.


* jk/run-command-notdot (2018-10-25) 2 commits
  (merged to 'next' on 2018-10-26 at 9d9335b23f)
 + t0061: adjust to test-tool transition
 + run-command: 

Re: Import/Export as a fast way to purge files from Git?

2018-11-01 Thread Elijah Newren
On Wed, Oct 31, 2018 at 12:16 PM Lars Schneider
 wrote:
> > On Sep 24, 2018, at 7:24 PM, Elijah Newren  wrote:
> > On Sun, Sep 23, 2018 at 6:08 AM Lars Schneider  
> > wrote:
> >>
> >> Hi,
> >>
> >> I recently had to purge files from large Git repos (many files, many 
> >> commits).
> >> The usual recommendation is to use `git filter-branch --index-filter` to 
> >> purge
> >> files. However, this is *very* slow for large repos (e.g. it takes 45min to
> >> remove the `builtin` directory from git core). I realized that I can remove
> >> files *way* faster by exporting the repo, removing the file references,
> >> and then importing the repo (see Perl script below, it takes ~30sec to 
> >> remove
> >> the `builtin` directory from git core). Do you see any problem with this
> >> approach?
> >
> > It looks like others have pointed you at other tools, and you're
> > already shifting to that route.  But I think it's a useful question to
> > answer more generally, so for those that are really curious...
> >
> >
> > The basic approach is fine, though if you try to extend it much you
> > can run into a few possible edge/corner cases (more on that below).
> > I've been using this basic approach for years and even created a
> > mini-python library[1] designed specifically to allow people to create
> > "fast-filters", used as
> >   git fast-export  | your-fast-filter | git fast-import 
> >
> > But that library didn't really take off; even I have rarely used it,
> > often opting for filter-branch despite its horrible performance or a
> > simple fast-export | long-sed-command | fast-import (with some extra
> > pre-checking to make sure the sed wouldn't unintentionally munge other
> > data).  BFG is great, as long as you're only interested in removing a
> > few big items, but otherwise doesn't seem very useful (to be fair,
> > it's very upfront about only wanting to solve that problem).
> > Recently, due to continuing questions on filter-branch and folks still
> > getting confused with it, I looked at existing tools, decided I didn't
> > think any quite fit, and started looking into converting
> > git_fast_filter into a filter-branch-like tool instead of just a
> > libary.  Found some bugs and missing features in fast-export along the
> > way (and have some patches I still need to send in).  But I kind of
> > got stuck -- if the tool is in python, will that limit adoption too
> > much?  It'd be kind of nice to have this tool in core git.  But I kind
> > of like leaving open the possibility of using it as a tool _or_ as a
> > library, the latter for the special cases where case-specific
> > programmatic filtering is needed.  But a developer-convenience library
> > makes almost no sense unless in a higher level language, such as
> > python.  I'm still trying to make up my mind about what I want (and
> > what others might want), and have been kind of blocking on that.  (If
> > others have opinions, I'm all ears.)
>
> That library sounds like a very interesting idea. Unfortunately, the
> referenced repo seems not to be available anymore:
> git://gitorious.org/git_fast_filter/mainline.git

Yeah, gitorious went down at a time when I was busy with enough other
things that I never bothered moving my repos to a new hosting site.
Sorry about that.

I've got a copy locally, but I've been editing it heavily, without the
testing I should have in place, so I hesitate to point you at it right
now.  (Also, the old version failed to handle things like --no-data
output, which is important.)  I'll post an updated copy soon; feel
free to ping me in a week if you haven't heard anything yet.

> I very much like Python. However, more recently I started to
> write Git tools in Perl as they work out of the box on every
> machine with Git installed ... and I think Perl can be quite
> readable if no shortcuts are used :-).

Yeah, when portability matters, perl makes sense.  I thought about
switching it over, but I'm not sure I want to rewrite 1-2k lines of
code.  Especially since repo-filtering tools are kind of one-shot by
nature, and only need to be done by one person of a team, on one
specific machine, and won't affect daily development thereafter.
(Also, since I don't depend on any libraries and use only stuff from
the default python library, it ought to be relatively portable
anyway.)


Re: [PATCH v3 8/8] merge-recursive: improve rename/rename(1to2)/add[/add] handling

2018-11-01 Thread Elijah Newren
On Wed, Oct 31, 2018 at 8:08 AM Derrick Stolee  wrote:
>
> On 10/19/2018 3:31 PM, Elijah Newren wrote:
> > [snip]
> >
> > + char *new_path = NULL;
> > + if (dir_in_way(b->path, !o->call_depth, 0)) {
> > + new_path = unique_path(o, b->path, 
> > ci->branch2);
> > + output(o, 1, _("%s is a directory in %s 
> > adding "
> > +"as %s instead"),
> > +b->path, ci->branch1, new_path);
>
> I tried really hard, but failed to get a test to cover the block below.
> I was able to
> find that the "check handling of differently renamed file with D/F
> conflicts" test
> in t6022-merge-rename.sh covers the block above. Trying to tweak the
> example using
> untracked files seems to hit an error message from unpack-trees.c instead.
>
> > + } else if (would_lose_untracked(b->path)) {
> > + new_path = unique_path(o, b->path, 
> > ci->branch2);
> > + output(o, 1, _("Refusing to lose untracked 
> > file"
> > +" at %s; adding as %s 
> > instead"),
> > +b->path, new_path);
>
> It could also be that I failed because I'm less familiar with this part
> of the
> codebase. Elijah, do you think it is possible to hit this block?

Yeah, this one's going to be a little harder; the upper block would be
done with a D/F, but I think for this block you'd need a directory
rename so that unpack-trees.c can't tell that the untracked file in
the way is actually in the way of anything.  But since this is in the
rename/rename(1to2) area, I think I had rules around avoiding doing
directory renames if the other side renamed the file to avoid getting
into rename/rename(1to3) situations and other weirdness.  So, it might
require a transitive rename (i.e. file renamed on both sides, and on
one side it's renamed into a directory that the other side renamed
away).

I'll try to take a look at it tomorrow, with everything else.  We'll
see how much I can get done.

Thanks for digging in to all these and bringing them up.


Re: [PATCH v3 2/8] t6036, t6042: testcases for rename collision of already conflicting files

2018-11-01 Thread Elijah Newren
On Wed, Oct 31, 2018 at 7:01 AM Derrick Stolee  wrote:
>
> On 10/19/2018 3:31 PM, Elijah Newren wrote:
> > +test_expect_success "setup nested conflicts" '
>
> nit: should these test names be single-quoted? I see you using double-quotes
> in PATCH 1/8 as well, but that seems to be because there are variables in
> the test names.
>
>
> ...
> > +test_expect_failure "check nested conflicts" '
>
> Same here.
>
> > +test_expect_success "setup nested conflicts from rename/rename(2to1)" '
>
> > +test_expect_failure "check nested conflicts from rename/rename(2to1)" '

I'll fix them up; thanks.


Re: [PATCH v3 4/8] merge-recursive: new function for better colliding conflict resolutions

2018-11-01 Thread Elijah Newren
On Wed, Oct 31, 2018 at 6:57 AM Derrick Stolee  wrote:
>
> On 10/31/2018 9:53 AM, Derrick Stolee wrote:
> > On 10/19/2018 3:31 PM, Elijah Newren wrote:
> >> +#if 0 // #if-0-ing avoids unused function warning; will make live in
> >> next commit
> >> +static int handle_file_collision(struct merge_options *o,
> >> + const char *collide_path,
> >> + const char *prev_path1,
> >> + const char *prev_path2,
> >> + const char *branch1, const char *branch2,
> >> + const struct object_id *a_oid,
> >> + unsigned int a_mode,
> >> + const struct object_id *b_oid,
> >> + unsigned int b_mode)
> >> +{
> >> +struct merge_file_info mfi;
> >> +struct diff_filespec null, a, b;
> >> +char *alt_path = NULL;
> >> +const char *update_path = collide_path;
> >> +
> >> +/*
> >> + * In the recursive case, we just opt to undo renames
> >> + */
> >> +if (o->call_depth && (prev_path1 || prev_path2)) {
> >> +/* Put first file (a_oid, a_mode) in its original spot */
> >> +if (prev_path1) {
> >> +if (update_file(o, 1, a_oid, a_mode, prev_path1))
> >> +return -1;
> >> +} else {
> >> +if (update_file(o, 1, a_oid, a_mode, collide_path))
> >
> > The latest test coverage report [1] shows this if statement is never
> > run, so
> > it appears that every call to this method in the test suite has either
> > o->call_depth positive, prev_path1 non-NULL, or both prev_path1 and
> > prev_path2
> > NULL.
> >
> > Is there a way we can add a test case that calls this method with
> > o->call_depth
> > positive, prev_path1 NULL, and prev_path2 non-NULL?
> >
> >> +return -1;
> >> +}
> >> +
> >> +/* Put second file (b_oid, b_mode) in its original spot */
> >> +if (prev_path2) {
> >> +if (update_file(o, 1, b_oid, b_mode, prev_path2))
> >
> > Since this line is covered, we _do_ call the method with prev_path2
> > non-NULL, but
> > prev_path1 must be non-NULL in all cases.
> >
> > I may have found a reason why this doesn't happen in one of the
> > callers you introduced.
> > I'm going to comment on PATCH 8/8 to see if that is the case.
>
> Nevermind on the PATCH 8/8 situation. I thought I saw you pass (a->path,
> NULL) and
> (b->path, NULL) into the (prev_path1, prev_path2) pairs, but in each
> case the non-NULL
> parameter is actually for 'collide_path'.
>
> It is still interesting if we can hit this case. Perhaps we need a
> different kind of
> conflict, like (rename, delete) [but I struggle to make sense of how to
> do that].

rename/delete conflicts are sent through handle_rename_delete() which
do not call into handle_file_collision().  What you'd instead need is
a rename/add conflict, in the virtual merge base, on the appropriate
side.  The fact that the prev_path2 non-NULL case is covered means
there's already a regression test that's probably nearly good enough,
you'd just need to edit the committer timestamps of the two merge
bases so that a different one was older.  I'm pretty sure we can come
up with one without too much effort.  I'll take a look tomorrow; too
late tonight.


Re: [PATCH 0/3] Make add_missing_tags() linear

2018-11-01 Thread Elijah Newren
On Wed, Oct 31, 2018 at 5:05 AM Derrick Stolee  wrote:
>
> On 10/31/2018 2:04 AM, Elijah Newren wrote:
> > On Tue, Oct 30, 2018 at 7:16 AM Derrick Stolee via GitGitGadget
> >  wrote:
> >>
> >> As reported earlier [1], the add_missing_tags() method in remote.c has
> >> quadratic performance. Some of that performance is curbed due to the
> >> generation-number cutoff in in_merge_bases_many(). However, that fix 
> >> doesn't
> >> help users without a commit-graph, and it can still be painful if that
> >> cutoff is sufficiently low compared to the tags we are using for
> >> reachability testing.
> >>
> >> Add a new method in commit-reach.c called get_reachable_subset() which does
> >> a many-to-many reachability test. Starting at the 'from' commits, walk 
> >> until
> >> the generation is below the smallest generation in the 'to' commits, or all
> >> 'to' commits have been discovered. This performs only one commit walk for
> >> the entire add_missing_tags() method, giving linear performance in the 
> >> worst
> >> case.
> >>
> >> Tests are added in t6600-test-reach.sh to ensure get_reachable_subset()
> >> works independently of its application in add_missing_tags().
> >
> > On the original repo where the topic was brought up, with commit-graph
> > NOT turned on and using origin/master, I see:
> >
> > $ time git push --dry-run --follow-tags /home/newren/repo-mirror
> > To /home/newren/repo-mirror
> >  * [new branch]   test5 -> test5
> >
> > real 1m20.081s
> > user 1m19.688s
> > sys 0m0.292s
> >
> > Merging this series in, I now get:
> >
> > $ time git push --dry-run --follow-tags /home/newren/repo-mirror
> > To /home/newren/repo-mirror
> >  * [new branch]   test5 -> test5
> >
> > real 0m2.857s
> > user 0m2.580s
> > sys 0m0.328s
> >
> > which provides a very nice speedup.
> >
> > Oddly enough, if I _also_ do the following:
> > $ git config core.commitgraph true
> > $ git config gc.writecommitgraph true
> > $ git gc
> >
> > then my timing actually slows down just slightly:
> > $ time git push --follow-tags --dry-run /home/newren/repo-mirror
> > To /home/newren/repo-mirror
> >  * [new branch]  test5 -> test5
> >
> > real 0m3.027s
> > user 0m2.696s
> > sys 0m0.400s
>
> So you say that the commit-graph is off in the 2.8s case, but not here
> in the 3.1s case? I would expect _at minimum_ that the cost of parsing
> commits would have a speedup in the commit-graph case.  There may be
> something else going on here, since you are timing a `push` event that
> is doing more than the current walk.
>
> > (run-to-run variation seems pretty consistent, < .1s variation, so
> > this difference is just enough to notice.)  I wouldn't be that
> > surprised if that means there's some really old tags with very small
> > generation numbers, meaning it's not gaining anything in this special
> > case from the commit-graph, but it does pay the cost of loading the
> > commit-graph.
>
> While you have this test environment, do you mind applying the diff
> below and re-running the tests? It will output a count for how many
> commits are walked by the algorithm. This should help us determine if
> this is another case where generation numbers are worse than commit-date,
> or if there is something else going on. Thanks!

I can do that, but wouldn't you want a similar patch for the old
get_merge_bases_many() in order to compare?  Does an absolute number
help by itself?
It's going to have to be tomorrow, though; not enough time tonight.


Re: [PATCH 3/3] tests: optionally skip `git rebase -p` tests

2018-11-01 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> The `--preserve-merges` mode of the `rebase` command is slated to be
> deprecated soon, ...

Is everybody on board on this statement?  I vaguely recall that some
people wanted to have something different from what rebase-merges
does (e.g. wrt first-parent history), and extending perserve-merges
might be one way to do so.

I do not mind at all if the way forward were to extend rebase-merges
for any future features.  To me, it is preferrable having to deal
with just one codepath than two written in different languages.

I just want to make sure we know everybody is on board the plan that
we will eventually remove preserve-merges, tell those who want to
use it to switch to rebase-merges, and we will extend rebase-merges
when they raise issues with it saying that they cannot do something
preserve-merges would have served them well with rebase-merges.


Re:Business proposition for you

2018-11-01 Thread Melvin Greg
Hello, 



I have a client from Syrian who will like to invest with your 
company. My client is willing to invest $4 Million. Can I have 
your company website to show to my client your company so that 
they will check and decide if they will invest there funds with 
you as joint partner. 

This information is needed urgently.

Please reply. 

Best Regards,
Agent Melvin Greg
Tel:+1 4045966532