date:20180806

Re: What's cooking in git.git (Aug 2018, #02; Mon, 6)

2018-08-06 Thread Eric Sunshine

On Mon, Aug 6, 2018 at 11:47 PM Eric Sunshine  wrote:
> On Mon, Aug 6, 2018 at 6:36 PM Junio C Hamano  wrote:
> > * pw/rebase-i-author-script-fix (2018-08-02) 2 commits
> >  - sequencer: fix quoting in write_author_script
> >  - sequencer: handle errors in read_author_ident()
> >  (this branch uses es/rebase-i-author-script-fix.)
> >
> >  Undecided.
> >  Is it the list consensus to favor this "with extra code, read the
> >  script written by bad writer" approach?
>
> Phillip's original "effectively one-liner" backward compatibility[1]
> seemed a reasonable compromise[2] between the choices of no backward
> compatibility and heavyweight backward compatibility of his
> re-roll[3]. His reference[4] to an earlier "one-liner" backward
> compatibility solution given similar circumstances bolstered the case
> for his original approach.

To summarize, I think Phillip is planning to re-roll, going with the
simpler backward-compatibility (though he was waiting an "okay" from
you[2]).

[2]: 
https://public-inbox.org/git/455fafb5-3c92-4348-0c2c-0a4ab62cf...@talktalk.net/

Re: What's cooking in git.git (Aug 2018, #02; Mon, 6)

2018-08-06 Thread Ævar Arnfjörð Bjarmason

On Mon, Aug 06 2018, Junio C Hamano wrote:

> * ab/newhash-is-sha256 (2018-08-06) 2 commits
>  - doc hash-function-transition: pick SHA-256 as NewHash
>  - doc hash-function-transition: note the lack of a changelog
>
>  Documentation update.
>
>  Will Merge to 'next'.

The tip of gitster/ab/newhash-is-sha256 has a unicode-corrupted
signed-off-by line from me. See
https://github.com/git/git/commit/f855a19d45 not sure how that got
there...

Re: Support JSON with --json flag and provide header files

2018-08-06 Thread Alexander Mills

To make it clear, these headers files would allow for compile-time type safety.
When you are parsing serialized data into a data structure, you know
have header files that describe the data type(s).
It would be nice to not have to "guess" about the structure of the
JSON that git produces.

On Mon, Aug 6, 2018 at 9:54 PM, Alexander Mills
 wrote:
> @git
>
> I hear there is a movement underway to have git support JSON for the
> purposes of machine readable stdio.
>
> That is all good and well. I want to add to that request. Imagine a
> software world were serialized data could be typed. So JSON produced
> by C/C++ could be described and the receiving process could have type
> information about how the JSON would be deserialized.
>
> How would that work? For every typed language, Git would provide
> header files for that language that include type information for each
> JSON string that would be produced by git commands.
>
> How could that be accomplished? Someone needs to create a system to
> generate type information (header files) for each language based off a
> single central format. So we create a JSON format that describes the
> JSON data. And from that single format we generate header files for
> each typed language.
>
> I hope you see the value in that. There could be a git command that
> generates the header files for a given language, given the git
> version:
>
> $ git --generate-header-files --language=java > $dir
>
> the above command would generate the right header files for Java given
> the git version. Or the headers could be stored on the www and could
> be downloadable for each git version. Perhaps Git authors wouldn't do
> all this work - perhaps they just author a JSON format that describes
> JSON. And the OSS community can write tools that convert the JSON
> format into header files for each language. But I would like something
> OFFICIAL to take place here.
>
> Hope this makes sense. I am looking forward to a software future where
> parsing stdio is easier by using machine readable JSON. Want human
> readable output? Use the regular command. Want machine readable
> output? use the --json flag.
>
> -alex
>
> --
> Alexander D. Mills
> ¡¡¡ New cell phone number: (415)730-1805 !!!
> alexander.d.mi...@gmail.com
>
> www.linkedin.com/pub/alexander-mills/b/7a5/418/



-- 
Alexander D. Mills
¡¡¡ New cell phone number: (415)730-1805 !!!
alexander.d.mi...@gmail.com

www.linkedin.com/pub/alexander-mills/b/7a5/418/

Support JSON with --json flag and provide header files

2018-08-06 Thread Alexander Mills

@git

I hear there is a movement underway to have git support JSON for the
purposes of machine readable stdio.

That is all good and well. I want to add to that request. Imagine a
software world were serialized data could be typed. So JSON produced
by C/C++ could be described and the receiving process could have type
information about how the JSON would be deserialized.

How would that work? For every typed language, Git would provide
header files for that language that include type information for each
JSON string that would be produced by git commands.

How could that be accomplished? Someone needs to create a system to
generate type information (header files) for each language based off a
single central format. So we create a JSON format that describes the
JSON data. And from that single format we generate header files for
each typed language.

I hope you see the value in that. There could be a git command that
generates the header files for a given language, given the git
version:

$ git --generate-header-files --language=java > $dir

the above command would generate the right header files for Java given
the git version. Or the headers could be stored on the www and could
be downloadable for each git version. Perhaps Git authors wouldn't do
all this work - perhaps they just author a JSON format that describes
JSON. And the OSS community can write tools that convert the JSON
format into header files for each language. But I would like something
OFFICIAL to take place here.

Hope this makes sense. I am looking forward to a software future where
parsing stdio is easier by using machine readable JSON. Want human
readable output? Use the regular command. Want machine readable
output? use the --json flag.

-alex

-- 
Alexander D. Mills
¡¡¡ New cell phone number: (415)730-1805 !!!
alexander.d.mi...@gmail.com

www.linkedin.com/pub/alexander-mills/b/7a5/418/

Re: What's cooking in git.git (Aug 2018, #02; Mon, 6)

2018-08-06 Thread Eric Sunshine

On Mon, Aug 6, 2018 at 6:36 PM Junio C Hamano  wrote:
> * pw/rebase-i-author-script-fix (2018-08-02) 2 commits
>  - sequencer: fix quoting in write_author_script
>  - sequencer: handle errors in read_author_ident()
>  (this branch uses es/rebase-i-author-script-fix.)
>
>  Recent "git rebase -i" update started to write bogusly formatted
>  author-script, with a matching broken reading code.  These are
>  being fixed.
>
>  Undecided.
>  Is it the list consensus to favor this "with extra code, read the
>  script written by bad writer" approach?

Phillip's original "effectively one-liner" backward compatibility[1]
seemed a reasonable compromise[2] between the choices of no backward
compatibility and heavyweight backward compatibility of his
re-roll[3]. His reference[4] to an earlier "one-liner" backward
compatibility solution given similar circumstances bolstered the case
for his original approach.

[1]: 
https://public-inbox.org/git/2018073532.9358-3-phillip.w...@talktalk.net/
[2]: 
https://public-inbox.org/git/455fafb5-3c92-4348-0c2c-0a4ab62cf...@talktalk.net/
[3]: 
https://public-inbox.org/git/20180802112002.720-3-phillip.w...@talktalk.net/
[4]: 
https://public-inbox.org/git/c7b8629d-7b93-2fbf-6793-0d566e86a...@talktalk.net/

Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-06 Thread Eric Sunshine

On Mon, Aug 6, 2018 at 9:20 PM Hilco Wijbenga  wrote:
> But your suggestion did make me think about what behaviour I would
> like to see, exactly. I like that Git removes commits that no longer
> serve any purpose (because I've included their changes in earlier
> commits). So I would not want to keep commits that become empty during
> the rebase. What I would like to see is that commits that _start out_
> as empty, are retained. Would such behaviour make sense? Or would that
> be considered surprising behaviour?

I, personally, have no opinion since I don't use empty commits.
Perhaps someone more experienced and more long-sighted will chime in.

Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-06 Thread Hilco Wijbenga

On Tue, Jul 31, 2018 at 11:22 PM, Eric Sunshine  wrote:
> On Tue, Jul 31, 2018 at 9:30 PM Hilco Wijbenga  
> wrote:
>> On Tue, Jul 31, 2018 at 12:33 AM, Eric Sunshine  
>> wrote:
>> > This is a re-roll of [1] which fixes sequencer bugs resulting in commit
>> > object corruption when "rebase -i --root" swaps in a new commit as root.
>> > Unfortunately, those bugs made it into v2.18.0 and have already
>> > corrupted at least one repository (a local project of mine). Patches 3/4
>> > and 4/4 are new.
>>
>> Does this also fix losing the initial commit if it is empty?
>>
>> git init ; git commit -m 'Initial commit' --allow-empty ; touch
>> file.txt ; git add file.txt ; git commit -m 'Add file.txt' ; git
>> rebase --root
>>
>> I would expect there to be 2 commits but the first one has
>> disappeared. (This usually happens with "git rebase -i --root" early
>> on in a new project.)
>
> This series should have no impact on that issue; it is fixing root
> commit object corruption, and does not touch or change how the commit
> graph is maintained or how the rebasing machinery itself works (and
> the --allow-empty stuff is part of that machinery).
>
> As Dscho is cc:'d already, perhaps he can chime in as a resident expert.
>
> By the way, what happens if you use --keep-empty while rebasing?

I was not aware of that flag. But, although I was expecting it to, if
I use it with the rebase, I don't see any different behaviour. I can't
really make out what its purpose is from "Keep the commits that do not
change anything from its parents in the result.".

But your suggestion did make me think about what behaviour I would
like to see, exactly. I like that Git removes commits that no longer
serve any purpose (because I've included their changes in earlier
commits). So I would not want to keep commits that become empty during
the rebase. What I would like to see is that commits that _start out_
as empty, are retained. Would such behaviour make sense? Or would that
be considered surprising behaviour?

Re: [PATCH] Documentation/diff-options: explain different diff algorithms

2018-08-06 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -91,14 +91,18 @@ appearing as a deletion or addition in the output. It 
> uses the "patience
>  diff" algorithm internally.
>  
>  --diff-algorithm={patience|minimal|histogram|myers}::
> - Choose a diff algorithm. The variants are as follows:
> + Choose a diff algorithm. See the discussion of DIFF ALGORITHMS
> +ifndef::git-diff[]
> + in linkgit:git-diff[1]
> +endif::git-diff[]
> + . The variants are as follows:

This means outside of git-diff(1), I'd see

See the discussion of DIFF ALGORITHMS in git-diff(1) .

And in git-diff(1), I'd see

See the discussion of DIFF ALGORITHMS .

Both don't seem quite right, since they have an extra space before the
period.  The git-diff(1) seems especially not quite right --- does it
intend to say something like "See the DIFF ALGORITHMS section for more
discussion"?

[...]
> --- a/Documentation/git-diff.txt
> +++ b/Documentation/git-diff.txt
> @@ -119,6 +119,40 @@ include::diff-options.txt[]
>  
>  include::diff-format.txt[]
>  
> +DIFF ALGORITHMS
> +---

Please add some introductory words about what the headings refer to.

> +`Myers`
> +
> +A diff as produced by the basic greedy algorithm described in
> +link:http://www.xmailserver.org/diff2.pdf[An O(ND) Difference Algorithm and 
> its Variations].
> +with a run time of O(M + N + D^2). It employs a heuristic to allow for
> +a faster diff at the small cost of diff size.
> +The `minimal` algorithm has that heuristic turned off.
> +
> +`Patience`
> +
> +This algorithm by Bram Cohen matches the longest common subsequence
> +of unique lines on both sides, recursively. It obtained its name by
> +the way the longest subsequence is found, as that is a byproduct of
> +the patience sorting algorithm. If there are no unique lines left
> +it falls back to `myers`. Empirically this algorithm produces
> +a more readable output for code, but it does not garantuee

nit: s/garantuee/guarantee/

> +the shortest output.

Trivia: the `minimal` variant of Myers doesn't guarantee shortest
output, either: what it minimizes is the number of lines marked as
added or removed.  If you want to minimize context lines too, then
that would be a new variant. ;-)

[...]
> +`Histogram`
> +
> +This algorithm finds the longest common substring and recursively
> +diffs the content before and after the longest common substring.

optional: may be worth a short aside in the text about the distinction
between LCS and LCS. ;-)

It would be especially useful here, since the alphabet used in these
strings is *lines* instead of characters, so the first-time reader
could probably use some help in building their intuition.

> +If there are no common substrings left, fallback to `myers`.

nit: fallback is the noun, fall back is the verb.

> +This is often the fastest, but in corner cases (when there are
> +many common substrings of the same length) it produces bad

Can you clarify what "bad" means?  E.g. would "unexpected", or "poorly
aligned", match what you mean?

> +results as seen in:
> +
> + seq 1 100 >one
> + echo 99 > two
> + seq 1 2 98 >>two
> + git diff --no-index --histogram one two
> +
>  EXAMPLES
>  

Thanks,
Jonathan

[RFC/WIP PATCH 1/3] rerere: avoid buffer overrun

2018-08-06 Thread Elijah Newren

check_one_conflict() compares `i` to `active_nr` in two places to avoid
buffer overruns, but left out an important third location.  This has
not previously been a problem, because existing merge strategies have
tended to not create entries at stage #1 that do not have a
corresponding entry at either stage #2 or stage #3.  However, this is
not guaranteed, so add a check to avoid segfaults.

Signed-off-by: Elijah Newren 
---
 rerere.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/rerere.c b/rerere.c
index 16c8aac621..7d22fb08c7 100644
--- a/rerere.c
+++ b/rerere.c
@@ -533,7 +533,7 @@ static int check_one_conflict(int i, int *type)
}
 
*type = PUNTED;
-   while (ce_stage(active_cache[i]) == 1)
+   while (i < active_nr && ce_stage(active_cache[i]) == 1)
i++;
 
/* Only handle regular files with both stages #2 and #3 */
-- 
2.18.0.550.g44d6daf40a.dirty

[RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts

2018-08-06 Thread Elijah Newren

== Summary ==

For non-textual conflicts, I would like to provide additional information
in the working copy in the form of additional conflict markers and
explanatory text stating what type of non-textual conflict was involved.
This should
  * Make it clearer to users what conflicts they are dealing with and why
  * Enable new features like Thomas Rast' old remerge-diff proposal[1]

[1] https://public-inbox.org/git/cover.1409860234.git...@thomasrast.ch/

If this sounds rather imprecise, concrete examples are provided in the
next section of this email.  If this change sounds surprising or
non-intuitive, more detailed rationale motivating this change (which is
admittedly slightly non-obvious) can be found in the remainder of this
email.  It may be the more of the information below needs to be moved into
the commit message for patch 3.

== Examples of Proposal ==

There are two basic types of changes at play here, each best shown with a
representative example:

1) Representative example: A modify/delete conflict; the path in question
in the working tree would have conflict information at the top of the file
followed by the normal file contents; thus it could be of the form:

 HEAD
Conflict hint: This block of text was not part of the original
branch; it serves instead to hint about non-textual conflicts:
  MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH

Conflict hint: This block of text was not part of the original
branch; it serves instead to hint about non-textual conflicts:
  MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
 BRANCH
Lorem ipsum dolor sit amet, consectetuer sadipscing elitr,
sed diam nonumy eirmod tempor invidunt ut labore et dolore
magna aliquyam erat, sed diam voluptua. At vero eos et
accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor
sit amet.

Alternative ideas for handling the explanatory text here are welcome.  I
chose to use identical text on both sides of the conflict in an attempt
to highlight that this isn't a normal textual conflict and the text isn't
meant to be part of the file.

This type of example could apply for each of the following types of
conflicts:
  * modify/delete
  * rename/delete
  * directory/file
  * submodule/file
  * symlink/file
  * rename/rename(1to2)
  * executable mode conflict (i.e. 100644 vs. 100755 mode; could come
from add/add or modify/delete or rename/delete)

It could also be used for the following types of conflicts to help
differentiate between it and other conflict types:
  * add/add
  * rename/add[/delete]
  * rename/rename(2to1)[/delete[/delete]]
  * rename/rename(1to2)/add[/add]

However, any of the types above would be inappropriate if the regular
file(s) in question were binary; in those cases, they'd actually fall
into category two:


2) Representative example: A binary edit/edit conflict.  In this case,
it would be inappropriate to put the conflict markers inside the
binary file.  Instead, we create another file (e.g. path~CONFLICTS)
and put conflict markers in it:

 HEAD
Conflict hint: This block of text was not part of the original
branch; it serves instead to hint about non-textual conflicts:
  BINARY conflict: path foo modified in both branches

Conflict hint: This block of text was not part of the original
branch; it serves instead to hint about non-textual conflicts:
  BINARY conflict: path foo modified in both branches
 BRANCH

This file would also be added to the index at stage 1 (so that 'git merge
--abort' would clean this file out instead of leaving it around untracked,
and also because 'git status' would report "deleted in both" which seems
reasonable).

This type of example could apply for each of the following types of
conflicts:
  * binary edit/edit
  * any of the conflicts from type 1 when binary files are involved
  * symlink edit/edit (or add/add)
  * symlink/submodule
  * symlink/directory
  * directory/submodule
  * submodule/submodule

It could also apply to the following new corner case conflict types from
directory rename detection:
  * N-way colliding paths (N>=2) due to directory renames
  * directory rename split; half renamed to one directory and half to another


== Motivation, part 1: Problem statement ==

When conflicts arise we need ways to inform the user of the existence of
the conflicts and their nature.  For textual conflicts with regular files,
we have a simple way of doing this: inserting conflict markers into the
relevant region of the file with both conflicting versions present.
Importantly, this representation of the conflict is present in the working
copy.

For other types of conflicts (path-based or non-regular files), we often
provide no hint in the working copy about either the existence or the
nature of the conflict.  I think this is suboptimal from a users'

[RFC/WIP PATCH 3/3] merge-recursive: provide more conflict hints for non-textual conflicts

2018-08-06 Thread Elijah Newren

For non-textual conflicts, provide additional information in the working
copy in the form of additional conflict markers and explanatory text
stating what type of non-textual conflict was involved.  In particular,
regular files involved in path-based conflicts would have something like
the following at the beginning of the file:

 HEAD
Conflict hint: This block of text was not part of the original
branch; it serves instead to hint about non-textual conflicts:
  MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH

Conflict hint: This block of text was not part of the original
branch; it serves instead to hint about non-textual conflicts:
  MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
 BRANCH

When non regular files (binaries, symlinks, etc.) are involved in
conflicts, we instead put this information in a separate file that only
contains this conflict information.

The goals of providing this extra information are:
  * Make it clearer to users what conflicts they are dealing with and why
  * Enable new features like Thomas Rast' old remerge-diff proposal[1]

[1] https://public-inbox.org/git/cover.1409860234.git...@thomasrast.ch/

Signed-off-by: Elijah Newren 
---
 merge-recursive.c   | 133 +++-
 t/t3031-merge-criscross.sh  |   2 +
 t/t6022-merge-rename.sh |  39 ++--
 t/t6043-merge-rename-directories.sh |   4 +-
 t/t7610-mergetool.sh|   4 +
 5 files changed, 144 insertions(+), 38 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 4832234073..cdfe9824d2 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -313,6 +313,96 @@ static void output_commit_title(struct merge_options *o, 
struct commit *commit)
flush_output(o);
 }
 
+static void write_conflict_notice(struct strbuf *buf,
+ struct strbuf *notice,
+ int use_crlf)
+{
+   int marker_size = 8;
+
+   strbuf_addchars(buf, '<', marker_size);
+   if (use_crlf)
+   strbuf_addch(buf, '\r');
+   strbuf_addch(buf, '\n');
+
+   strbuf_addbuf(buf, notice);
+   if (use_crlf)
+   strbuf_addch(buf, '\r');
+   strbuf_addch(buf, '\n');
+
+   strbuf_addchars(buf, '=', marker_size);
+   if (use_crlf)
+   strbuf_addch(buf, '\r');
+   strbuf_addch(buf, '\n');
+
+   strbuf_addbuf(buf, notice);
+   if (use_crlf)
+   strbuf_addch(buf, '\r');
+   strbuf_addch(buf, '\n');
+
+   strbuf_addchars(buf, '>', marker_size);
+   if (use_crlf)
+   strbuf_addch(buf, '\r');
+   strbuf_addch(buf, '\n');
+}
+
+static int create_non_textual_conflict_file(struct merge_options *o,
+   struct strbuf *notice,
+   const char *path,
+   struct object_id *oid)
+{
+   struct strbuf contents = STRBUF_INIT;
+   int ret = 0;
+   int use_crlf = 0; /* FIXME: Determine platform default?? */
+
+   write_conflict_notice(&contents, notice, use_crlf);
+
+   if (write_object_file(contents.buf, contents.len, "blob", oid))
+   ret = err(o, _("Unable to add %s to database"), path);
+
+   strbuf_release(&contents);
+   return ret;
+}
+
+static int insert_non_textual_conflict_header(struct merge_options *o,
+ struct strbuf *notice,
+ const char *path,
+ struct object_id *oid,
+ unsigned int mode)
+{
+   struct strbuf header = STRBUF_INIT;
+   struct strbuf dst_buf = STRBUF_INIT;
+   enum object_type type;
+   char *buf;
+   unsigned long size;
+   char *end;
+   int use_crlf;
+   int ret = 0;
+
+   if (!S_ISREG(mode))
+   BUG("insert_non_textual_conflict_header called on file with 
wrong mode: %0d", mode);
+
+   buf = read_object_file(oid, &type, &size);
+   if (!buf)
+   return err(o, _("cannot read object %s '%s'"), oid_to_hex(oid), 
path);
+   if (type != OBJ_BLOB) {
+   return err(o, _("blob expected for %s '%s'"), oid_to_hex(oid), 
path);
+   }
+
+   end = strchrnul(buf, '\n');
+   use_crlf = (end > buf && end[-1] == '\r');
+   write_conflict_notice(&header, notice, use_crlf);
+
+   strbuf_addbuf(&dst_buf, &header);
+   strbuf_add(&dst_buf, buf, size);
+
+   if (write_object_file(dst_buf.buf, dst_buf.len, type_name(type), oid))
+   ret = err(o, _("Unable to add %s to database"), path);
+
+   strbuf_release(&header);
+   strbuf_release(&dst_buf);
+   return ret;
+}
+
 static int add_cacheinfo(struct merge_options *o,
 unsigned int mode, const str

[RFC/WIP PATCH 2/3] merge-recursive: fix handling of submodules in modify/delete conflicts

2018-08-06 Thread Elijah Newren

Similar to commit c641ca670729 ("merge-recursive: handle addition of
submodule on our side of history", 2017-11-14) a submodule can be
confused for a D/F conflict for modify/delete and rename/delete
conflicts.  (To the code, it appears there is a directory in the way of
us writing our expected path to the working tree, because our path is a
submodule; i.e. the submodule is itself the directory and any directory
is assumed to be a D/F conflict that is in the way.)  So, when we are
dealing with a submodule, avoid checking the working copy for a
directory being in the way.

Signed-off-by: Elijah Newren 
---

It might be better to first check that the submodule existed on the HEAD
side of the merge, because there could be a directory in the way of the
submodule.  But that's starting to get ugly quick, and the real fix to
make this cleaner is the rewrite of merge-recursive that does an index-only
merge first, then updates the working copy later...

 merge-recursive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1446e92bea..4832234073 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1466,7 +1466,7 @@ static int handle_change_delete(struct merge_options *o,
const char *update_path = path;
int ret = 0;
 
-   if (dir_in_way(path, !o->call_depth, 0) ||
+   if (dir_in_way(path, !o->call_depth && !S_ISGITLINK(changed_mode), 0) ||
(!o->call_depth && would_lose_untracked(path))) {
update_path = alt_path = unique_path(o, path, change_branch);
}
-- 
2.18.0.550.g44d6daf40a.dirty

[RFC/WIP PATCH 0/1] Simplify handling of directory/file conflicts

2018-08-06 Thread Elijah Newren

Directory/file conflicts are more difficult than they need to be for users
to resolve (and to un-resolve).  Simplify that process by doing to the
index what we do to the working tree: renaming the file in the
directory/file conflict to a different location.  This also avoids leaving
cruft untracked files around if the user decides to abort the merge.

>From one angle this proposal might appear surprising, so extended rationale
for this change can be found below (and if it seems surprising, some of the
below may need to be moved into the commit message for the patch).


== What git does, prior to this series ==

Let's say there's a directory/file conflict for path 'foo'.  One might see
git report:
  CONFLICT (file/directory): There is a directory with name foo in
  BRANCH2. Adding foo as foo~BRANCH
Further, at this point, git will record:
  * foo~BRANCH in the working tree
  * foo/ in the working tree
  * foo at higher order stage in the index
  * foo/* entries found in the index (at stage 0)

== User experience resolving directory/file conflicts ==

Let's say the user wants to resolve by just moving 'foo' (the file) to
somewhere else.  Here's five different things a user might try at this
point (not sequentially, but rather competing ideas they might try),
along with commentary about how each fails to resolve the conflict:

$ git mv foo other
  # Moves the directory instead, oops

$ mv foo~BRANCH other
$ git add other
  # Still leaves 'foo' conflicted in the index

$ git mv foo~BRANCH other
  # Error: "Not under source control"

$ git add -u
  # Removes conflict entry for 'foo' from index, but doesn't add
  # new one and leaves foo~BRANCH around untracked

$ git rm foo
  # Doesn't work ("foo: needs merge...fatal: git rm: 'foo': Is a directory)

== User experience un-resolving directory/file conflict ==

If the user decides they don't like the merge and run 'git merge --abort',
the abort fails due to a separate bug being fixed here:

  https://public-inbox.org/git/20180713163331.22446-1-new...@gmail.com/

However, even once the fixes there are part of git, a 'git merge --abort'
will leave behind new untracked files that were created by the merge
attempt (in the case above, one named foo~BRANCH).  This is suboptimal.

== Correct solution; old and new ==

Currently, this is what a user needs to run to resolve this conflict:

$ mv foo~BRANCH other
$ git add other
$ git rm --cached foo

If git would record foo~BRANCH at a higher stage in the index instead
of recording foo there, then we could shorten this to:

$ git add foo~BRANCH
$ git mv foo~BRANCH other

If we could also teach git-mv to quit reporting "not under version
control" for index entries with higher order stages (and instead rename
them while keeping them as higher order stages), then we could also allow
those two commands to be reversed:

$ git mv foo~BRANCH other
$ git add other

While this change to what git records in the index might feel like a lie,
it does make it easier for the user and we already have a precedent for
treating user convience as more important than trying to represent how we
got to the current state, as shown in the following analogy:

== Rename analogy ==

If one side of history renames A->B but there are content conflicts, one
choice for the contents for the index would be:
1A
2A
3B
However, that's not what git does.  In particular, this choice would require
the user to run both 'git add B' and 'git rm --cached A' to resolve the
conflict.  Further, it prevents the user from running commands such as
  git checkout [--ours|--theirs|--conflict|-m] B
  git diff [--ours|--theirs|--base] B
This would also make it harder to pair up entries from 'git ls-files -u'
(especially if there are many entries between A and B), since nothing
marks them as related anymore.  So, instead, git records the following in
the index:
1B
2B
3B
This might seem like a lie if you view the index as a place to record how
we got to the current state rather than a way to help users resolve
conflicts and update state, but it is certainly far more convenient for
the user to work with.  Follow suit with directory/file conflicts.


Elijah Newren (1):
  merge-recursive: make file/directory conflicts easier to resolve

 merge-recursive.c| 38 ++--
 t/t3030-merge-recursive.sh   | 16 ++--
 t/t6020-merge-df.sh  |  4 +--
 t/t6022-merge-rename.sh  |  4 +--
 t/t6036-recursive-corner-cases.sh|  5 ++--
 t/t6042-merge-rename-corner-cases.sh |  4 +--
 t/t6043-merge-rename-directories.sh  |  4 +--
 7 files changed, 49 insertions(+), 26 deletions(-)

-- 
2.18.0.550.g44d6daf40a.dirty

[RFC/WIP PATCH 1/1] merge-recursive: make file/directory conflicts easier to resolve

2018-08-06 Thread Elijah Newren

File/directory conflicts are somewhat difficult to resolve; by way of
contrast, renames are much easier to handle.  For file/directory
conflicts, we already have to record the file under a different name in
the working copy due to the directory being in the way; if we also record
the file under a different name in the index then it simplifies matters
for the user, and ensures that 'git reset --hard' and 'git merge --abort'
will clean up files created by the merge operation.

Note that there are multiple ways to get a file/directory conflict:
  * add/add (one side adds a file, the other a directory)
  * modify/delete (the side that deletes puts a directory in the way)
  * rename vs add-directory (directory added on one side in way of rename)
As such, there are multiple code paths that need updating.

FIXME: Several testcases were updated to show how this affected the
  testsuite in general, but there are still 11 more tests across five
  testfiles that need fixing.  As I said, this is just WIP/RFC.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c| 38 ++--
 t/t3030-merge-recursive.sh   | 16 ++--
 t/t6020-merge-df.sh  |  4 +--
 t/t6022-merge-rename.sh  |  4 +--
 t/t6036-recursive-corner-cases.sh|  5 ++--
 t/t6042-merge-rename-corner-cases.sh |  4 +--
 t/t6043-merge-rename-directories.sh  |  4 +--
 7 files changed, 49 insertions(+), 26 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1446e92bea..34906b0f90 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1527,6 +1527,20 @@ static int handle_change_delete(struct merge_options *o,
 * path.  We could call update_file_flags() with update_cache=0
 * and update_wd=0, but that's a no-op.
 */
+   if (!o->call_depth && alt_path) {
+   struct diff_filespec orig, new;
+   int stage = (change_branch == o->branch1) ? 2 : 3;
+
+   remove_file_from_cache(path);
+   orig.mode = o_mode;
+   oidcpy(&orig.oid, o_oid);
+   new.mode = changed_mode;
+   oidcpy(&new.oid, changed_oid);
+   if (update_stages(o, alt_path, &orig,
+ stage == 2 ? &new : NULL,
+ stage == 3 ? &new : NULL))
+   ret = -1;
+   }
if (change_branch != o->branch1 || alt_path)
ret = update_file(o, 0, changed_oid, changed_mode, 
update_path);
}
@@ -3089,11 +3103,11 @@ static int merge_content(struct merge_options *o,
 
if (df_conflict_remains || is_dirty) {
char *new_path;
-   if (o->call_depth) {
-   remove_file_from_cache(path);
-   } else {
+   new_path = unique_path(o, path, rename_conflict_info->branch1);
+   remove_file_from_cache(path);
+   if (!o->call_depth) {
if (!mfi.clean) {
-   if (update_stages(o, path, &one, &a, &b))
+   if (update_stages(o, new_path, &one, &a, &b))
return -1;
} else {
int file_from_stage2 = was_tracked(o, path);
@@ -3101,14 +3115,13 @@ static int merge_content(struct merge_options *o,
oidcpy(&merged.oid, &mfi.oid);
merged.mode = mfi.mode;
 
-   if (update_stages(o, path, NULL,
+   if (update_stages(o, new_path, NULL,
  file_from_stage2 ? &merged : 
NULL,
  file_from_stage2 ? NULL : 
&merged))
return -1;
}
 
}
-   new_path = unique_path(o, path, rename_conflict_info->branch1);
if (is_dirty) {
output(o, 1, _("Refusing to lose dirty file at %s"),
   path);
@@ -3244,10 +3257,19 @@ static int process_entry(struct merge_options *o,
output(o, 1, _("CONFLICT (%s): There is a directory 
with name %s in %s. "
   "Adding %s as %s"),
   conf, path, other_branch, path, new_path);
+   remove_file_from_cache(path);
+   if (!o->call_depth) {
+   struct diff_filespec dfs;
+
+   dfs.mode = mode;
+   oidcpy(&dfs.oid, oid);
+   if (update_stages(o, new_path, NULL,
+ a_oid ? &df

Two RFC/WIP series for facilitating merge conflict resolution

2018-08-06 Thread Elijah Newren

Hi everyone,

Last week, Duy posted an interesting RFC patch for improving merge
conflict resolution by coloring the "CONFLICT' messages that are
output during a merge. I have two more ideas (complementary to his)
for improving merge conflict resolution experience.  My two ideas are
mostly independent of each other but with a couple important ties;
I'll shortly respond with a short RFC/WIP series for each.

Some notes applicable to both:
  - Each has a very long cover letter, due to trying to anticipate
questions/surprises and responding to each.  It may be simpler to skip
the cover letter (or just read the up-front summary) and see if the
patches make sense,
  - I am planning to imminently start looking into my merge-recursive
rewrite[1].  One possibility is just doing these changes as part of
the new strategy...thoughts?


Elijah

[1] See part E from
https://public-inbox.org/git/cabpp-bfqjzhfcjz1qvhvvcmd-_sofi0fkm5pexewzzn+zw2...@mail.gmail.com/;
also the two RFCs I'm about to post are part C from that email.

What's cooking in git.git (Aug 2018, #02; Mon, 6)

2018-08-06 Thread Junio C Hamano

Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[New Topics]

* js/rebase-merges-exec-fix (2018-08-06) 2 commits
 - rebase --exec: make it work with --rebase-merges
 - t3430: demonstrate what -r, --autosquash & --exec should do

 The "--exec" option to "git rebase --rebase-merges" placed the exec
 commands at wrong places, which has been corrected.
 
 cf. 


* nd/no-extern (2018-08-03) 12 commits
 - submodule.h: drop extern from function declaration
 - revision.h: drop extern from function declaration
 - repository.h: drop extern from function declaration
 - rerere.h: drop extern from function declaration
 - line-range.h: drop extern from function declaration
 - diff.h: remove extern from function declaration
 - diffcore.h: drop extern from function declaration
 - convert.h: drop 'extern' from function declaration
 - cache-tree.h: drop extern from function declaration
 - blame.h: drop extern on func declaration
 - attr.h: drop extern from function declaration
 - apply.h: drop extern on func declaration

 Noiseword "extern" has been removed from function decls in the
 header files.

 Will merge to 'next'.


* ar/t4150-am-scissors-test-fix (2018-08-06) 1 commit
 - t4150: fix broken test for am --scissors

 Test fix.

 Will merge to 'next'.


* en/t3031-title-fix (2018-08-06) 1 commit
 - t3031: update test description to mention desired behavior

 Test fix.

 Will merge to 'next'.


* hn/config-in-code-comment (2018-08-06) 1 commit
 - config: document git config getter return value

 Header update.

 Will merge to 'next'.


* jk/diff-rendered-docs (2018-08-06) 1 commit
 - add a script to diff rendered documentation

 Developer support to allow the end result of documentation update
 to be inspected more easily.

 Will merge to 'next'.


* js/pull-rebase-type-shorthand (2018-08-06) 1 commit
 - pull --rebase=: allow single-letter abbreviations for the type

 "git pull --rebase=interactive" learned "i" as a short-hand for
 "interactive".

 Will merge to 'next'.


* nd/complete-config-vars (2018-08-06) 1 commit
 - Makefile: add missing dependency for command-list.h

 Build fix.

 Will merge to 'next'.


* nd/config-blame-sort (2018-08-06) 1 commit
 - config.txt: reorder blame stuff to keep config keys sorted

 Doc fix.

 Will merge to 'next'.


* wc/make-funnynames-shared-lazy-prereq (2018-08-06) 1 commit
 - t: factor out FUNNYNAMES as shared lazy prereq

 A test prerequisite defined by various test scripts with slightly
 different sematics has been consolidated into a single copy and
 made into a lazily defined one.



--
[Stalled]

* ma/wrapped-info (2018-05-28) 2 commits
 - usage: prefix all lines in `vreportf()`, not just the first
 - usage: extract `prefix_suffix_lines()` from `advise()`

 An attempt to help making multi-line messages fed to warning(),
 error(), and friends more easily translatable.

 Will discard and wait for a cleaned-up rewrite.
 cf. <20180529213957.gf7...@sigill.intra.peff.net>


* hn/bisect-first-parent (2018-04-21) 1 commit
 - bisect: create 'bisect_flags' parameter in find_bisection()

 Preliminary code update to allow passing more flags down the
 bisection codepath in the future.

 We do not add random code that does not have real users to our
 codebase, so let's have it wait until such a real code materializes
 before too long.


* av/fsmonitor-updates (2018-01-04) 6 commits
 - fsmonitor: use fsmonitor data in `git diff`
 - fsmonitor: remove debugging lines from t/t7519-status-fsmonitor.sh
 - fsmonitor: make output of test-dump-fsmonitor more concise
 - fsmonitor: update helper tool, now that flags are filled later
 - fsmonitor: stop inline'ing mark_fsmonitor_valid / _invalid
 - dir.c: update comments to match argument name

 Code clean-up on fsmonitor integration, plus optional utilization
 of the fsmonitor data in diff-files.

 Waiting for an update.
 cf. 



* pb/bisect-helper-2 (2018-07-23) 8 commits
 - t6030: make various test to pass GETTEXT_POISON tests
 - bisect--helper: `bisect_start` shell function partially in C
 - bisect--helper: `get_terms` & `bisect_terms` shell function in C
 - bisect--helper: `bisect_next_check` shell function in C
 - bisect--helper: `check_and_set_terms` shell function in C
 - wrapper: move is_empty_file() and rename it as is_empty_or_missing_file()
 - bisect--helper: `bisect_write` shell function in C
 - bisect--helper: `bisect_reset` shell function in C

 Expecting a reroll.
 cf. 
<0102015f5e5ee171-f30f4868-886f-47a1-a4e4-b4936afc545d-000...@eu-west-1.amazonses.com>

 I just

[PATCH] Documentation/diff-options: explain different diff algorithms

2018-08-06 Thread Stefan Beller

As a user I wondered what the diff algorithms are about. Offer at least
a basic explanation on the differences of the diff algorithms.

Signed-off-by: Stefan Beller 
---
 Documentation/diff-options.txt | 10 +++---
 Documentation/git-diff.txt | 34 ++
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index f394608b42c..eae033a21ea 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -91,14 +91,18 @@ appearing as a deletion or addition in the output. It uses 
the "patience
 diff" algorithm internally.
 
 --diff-algorithm={patience|minimal|histogram|myers}::
-   Choose a diff algorithm. The variants are as follows:
+   Choose a diff algorithm. See the discussion of DIFF ALGORITHMS
+ifndef::git-diff[]
+   in linkgit:git-diff[1]
+endif::git-diff[]
+   . The variants are as follows:
 +
 --
 `default`, `myers`;;
The basic greedy diff algorithm. Currently, this is the default.
 `minimal`;;
-   Spend extra time to make sure the smallest possible diff is
-   produced.
+   The same algorithm as `myers`, but spend extra time to make
+   sure the smallest possible diff is produced.
 `patience`;;
Use "patience diff" algorithm when generating patches.
 `histogram`;;
diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index b180f1fa5bf..b182389aaae 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -119,6 +119,40 @@ include::diff-options.txt[]
 
 include::diff-format.txt[]
 
+DIFF ALGORITHMS
+---
+`Myers`
+
+A diff as produced by the basic greedy algorithm described in
+link:http://www.xmailserver.org/diff2.pdf[An O(ND) Difference Algorithm and 
its Variations].
+with a run time of O(M + N + D^2). It employs a heuristic to allow for
+a faster diff at the small cost of diff size.
+The `minimal` algorithm has that heuristic turned off.
+
+`Patience`
+
+This algorithm by Bram Cohen matches the longest common subsequence
+of unique lines on both sides, recursively. It obtained its name by
+the way the longest subsequence is found, as that is a byproduct of
+the patience sorting algorithm. If there are no unique lines left
+it falls back to `myers`. Empirically this algorithm produces
+a more readable output for code, but it does not garantuee
+the shortest output.
+
+`Histogram`
+
+This algorithm finds the longest common substring and recursively
+diffs the content before and after the longest common substring.
+If there are no common substrings left, fallback to `myers`.
+This is often the fastest, but in corner cases (when there are
+many common substrings of the same length) it produces bad
+results as seen in:
+
+   seq 1 100 >one
+   echo 99 > two
+   seq 1 2 98 >>two
+   git diff --no-index --histogram one two
+
 EXAMPLES
 
 
-- 
2.18.0.597.ga71716f1ad-goog

Re: [PATCH 2/2] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Junio C Hamano

Han-Wen Nienhuys  writes:

>> If your "make test" did not catch this as an error, then we may need
>> to fix t/Makefile, as it is supposed to run test-lint.
>
> I've been running tests individually as
>
>  sh t5409-colorize-remote-messages.sh  -v -d

During your own development that may be sufficient, but do run "make
test" to run tests by other people; it will help you catch new bugs
you are introducing whey they notice their expectations are broken.

You are breaking 5541 (there may be others, but I noticed a breakage
there) perhaps with your debugging cruft.

Re: [BUG] A part of an edge from an octopus merge gets colored, even with --color=never

2018-08-06 Thread Jeff King

On Mon, Aug 06, 2018 at 02:34:45PM -0400, Noam Postavsky wrote:

> On 30 June 2018 at 08:47, Noam Postavsky  
> wrote:
> 
> > I'm still having trouble getting a big picture understanding of how
> > the graph struct relates the what gets drawn on screen, but through
> > some poking around with the debugger + trial & error, I've arrived at
> > a new patch which seems to work. It's also a lot simpler. I hope you
> > can tell me if it makes sense.
> >
> > Also attached an updated test-multiway-merge.sh which allows adding
> > more branches to test different sized merges more easily.
> 
> Ping? (I got some bounce message regarding test-multiway-merge.sh, but
> it does show up in the mailing list archive, so I think my message has
> gone through)
> 
> https://marc.info/?l=git&m=153036284214253&w=2

No, it made it. It just got shuffled to the bottom of my pile (for
future reference, you can ping more frequently than once a month if you
think something was dropped ;) ).

-Peff

Re: [BUG] A part of an edge from an octopus merge gets colored, even with --color=never

2018-08-06 Thread Jeff King

On Sat, Jun 30, 2018 at 08:47:16AM -0400, Noam Postavsky wrote:

> I'm still having trouble getting a big picture understanding of how
> the graph struct relates the what gets drawn on screen, but through
> some poking around with the debugger + trial & error, I've arrived at
> a new patch which seems to work. It's also a lot simpler. I hope you
> can tell me if it makes sense.
> [...]
>   int num_dashes =
>   ((graph->num_parents - dashless_commits) * 2) - 1;
>   for (i = 0; i < num_dashes; i++) {
> - col_num = (i / 2) + dashless_commits + graph->commit_index;
> + col_num = (i / 2) + dashless_commits;
>   strbuf_write_column(sb, &graph->new_columns[col_num], '-');
>   }
> - col_num = (i / 2) + dashless_commits + graph->commit_index;
> + col_num = (i / 2) + dashless_commits;

Hmm. So this seems to work because in your example, we're showing the
merge in slot 1. So the index is 1, and we have an off-by-one problem,
and getting rid of that solves it.

But what if we had another line of development to the left of that?
I.e., if the "c" in your script was itself on the right-hand side of a
merge.

We can simulate that by adding this to your script, right before the
invocation of log:

  # We traverse in commit-date order, so make sure the new commit is
  # more recent than the others. This is also the cause of your "calling
  # it x doesn't work" comment, I think (all of these commits are
  # created in a single second, so the order we hit the refs in --all
  # matters).
  sleep 1

  git checkout -b a-prime master^
  git commit --allow-empty -m a-prime

That gives me a graph like this (for d-e-f):

  * d342ed8 (HEAD -> a-prime) a-prime
  | * 14aae3a (c) c
  | | *---.   4bacae1 (m) merge a b d e f
  | | |\ \ \ \ \  
  | |/ / / / / /  
  | | | | | | * f19c3a9 (f) f
  | |_|_|_|_|/  
  |/| | | | |   
  | | | | | * 48fd961 (e) e
  | |_|_|_|/  
  |/| | | |   
  | | | | * 3f4914f (d) d
  | |_|_|/  
  |/| | |   
  | | | * 8bef98c (b) b
  | |_|/  
  |/| |   
  | | * 253e7ba (a) a
  | |/  
  |/|   
  | * 8a60f32 (master) 1
  |/  
  * b00ba42 0

and graph->commit_index is 2.

That doesn't trigger valgrind, but all the colors are off-by-one (which
makes sense; we're off-by-one towards the beginning of the array now).
Using "graph->commit_index - 1" seems to yield the right results, but it
feels like we're just hacking around it. And my understanding was that
the "straight edge" case actually works with the current code, and we'd
probably be breaking that.

I still think it makes more sense to iterate over the columns rather
than over the dashes, which removes a lot of these confusing cases. This
is what I came up with:

diff --git a/graph.c b/graph.c
index c782590202..d0a3e0858b 100644
--- a/graph.c
+++ b/graph.c
@@ -847,22 +847,24 @@ static void graph_output_commit_char(struct git_graph 
*graph, struct strbuf *sb)
 static int graph_draw_octopus_merge(struct git_graph *graph,
struct strbuf *sb)
 {
+   int col_num, first_col, last_col;
+
+   /* Skip the current commit, since we've already drawn its asterisk. */
+   first_col = graph->commit_index + 1;
/*
-* Here dashless_commits represents the number of parents
-* which don't need to have dashes (because their edges fit
-* neatly under the commit).
+* We subtract three, one each for:
+*  - the commit we're directly on top of
+*  - the commit to our left that we're merged into
+*  - we want to point to the final column, not one past
 */
-   const int dashless_commits = 2;
-   int col_num, i;
-   int num_dashes =
-   ((graph->num_parents - dashless_commits) * 2) - 1;
-   for (i = 0; i < num_dashes; i++) {
-   col_num = (i / 2) + dashless_commits;
+   last_col = first_col + graph->num_parents - 3;
+
+   for (col_num = first_col; col_num <= last_col; col_num++) {
strbuf_write_column(sb, &graph->new_columns[col_num], '-');
+   strbuf_write_column(sb, &graph->new_columns[col_num],
+   col_num == last_col ? '.' : '-');
}
-   col_num = (i / 2) + dashless_commits;
-   strbuf_write_column(sb, &graph->new_columns[col_num], '.');
-   return num_dashes + 1;
+   return 2 * (last_col - first_col + 1);
 }

 static void graph_output_commit_line(struct git_graph *graph, struct strbuf 
*sb)

I suspect it still has a bug, which is that it is handling this
first-parent-goes-left case, but probably gets the straight-parent case
wrong. But at least in this form, I think it is obvious to see where
that bug is (the "three" in the comment is not accurate in that latter
case, and it should be two). Which I think is what your original fix was
getting at: we need to set first/last to start off with, and then
"shrink" them with a conditional depending on which form we're seeing.

-Pe

Re: abstracting commit signing/verify to support other signing schemes

2018-08-06 Thread Tacitus Aedifex


On Fri, Aug 03, 2018 at 06:07:46PM -0400, Jeff King wrote:

There's been some work on this lately. See this patch and the response
thread:

 https://public-inbox.org/git/20180409204129.43537-9-mastahy...@gmail.com/

The more recent work focused on just doing the minimum to provide
gpg/gpgsm variants:

 https://public-inbox.org/git/cover.1531831244.git.henning.sch...@siemens.com/

That replaces the earlier patch series, and is currently merged to the
'next' branch and is on track to get merged to 'master' before Git 2.19
is released.


thank you for setting the context. it looks like both patch sets are going in 
the same direction that i suggested, at least with the config variables.  
personally, i prefer the 'signingtool.' approach in the older patch set 
over the 'gpg.' approach in the newer patch set since my goal is to get 
away from assuming gpg.


the older patch set suggested the idea of using PEM strings to match up the 
signature payload with a certain signing tool.  i can't tell if they mean the 
'pre-ecapsulation boundary' (e.g. '-BEGIN FOO-') or if they mean the 
encapsulated headers; both as defined in RFC 1421 [0].


the newer patch set looks specifically at the pre-encapsulation boundary to 
switch behaviors. that works assuming that the signing tools all understand 
PEM. in the case of signify, it doesn't, so the driver code in git will have to 
translate to/from PEM.


i suggest that we switch to a standard format for all signatures that is 
similar to the commit message format with colon ':' separated fields followed 
by a payload.  the colon separated fields would specify the signing tool used 
to generate the signature and the tool specific data followed by the signature 
blob like so:


 signing-tool: gpg
 gpg-keyid: 0123456789ABCDEF
 
 -BEGIN PGP SIGNATURE-

 
 -END PGP SIGNATURE-

by adopting this format, git will be fully abstracted from the underlying 
signing tool and the user can specify multiple signing tools in their config 
and git will be able to map the signature to the tool when verifying (e.g. git 
log --show-signature).


a signify signature would look something like this:

 signing-tool: signify
 signify-public-key: 
 
 


i hope we adopt a more generic approach like this.


One of the downsides there is that if we eventually move to a generic
signing-tool config, we'd have to support two layers of historical
abstraction (the original "gpg.program" config, and the new
"gpg..*" config).


i like the idea of aliasing all of the old config variables to their equivilent 
and outputting a deprecation warning when we get plan on removing the aliases 
altogether in the future.



So _if_ we knew what it would take to support signify, we could
potentially adjust what's going into 2.19 in order to skip straight to
the more generic interface. But on the OTOH, it may not be worth
rushing, and there is already a vague plan of how the gpg..*
config would interact with a more generic config.


there's no rush, but i would prefer that the newer patch set get changed to use 
the more generic 'signingtool..*' instead of 'gpg..*'. if you all 
agree, i'll follow up with a patch to change that part of what is going into 
2.19.


then round two will be to experiment with a new, standard signature format that 
doesn't assume anything about the underlying signing tool.


//t??

[0] https://tools.ietf.org/html/rfc1421

Re: [PATCH 0/7] improve range-diffs coloring and [RFC] move detection

2018-08-06 Thread Junio C Hamano

Stefan Beller  writes:

> On Sat, Aug 4, 2018 at 9:57 AM Junio C Hamano  wrote:
>>
>> Stefan Beller  writes:
>>
>> > This builds on top of sb/range-diff-colors, which builds on js/range-diff.
>>
>> As another round of js/range-diff is expected, according to
>>
>> 
>
> Oh right. I forgot to mention that in this cover letter, but Johannes
> has had work issues last week and people threw lots of stuff at him,
> so I concluded the resend might take a while. Hence I just put it out there
> in case we're happy with the status quo of that series.
> ...
>
> Thanks! I'll resend when appropriate.

Thanks; that way I have one less thing I have to remember ;-)

Re: [PATCH v3] t4150: fix broken test for am --scissors

2018-08-06 Thread Junio C Hamano

Andrei Rybak  writes:

> Tests for "git am --[no-]scissors" [1] work in the following way:
>
>  1. Create files with commit messages
>  2. Use these files to create expected commits
>  3. Generate eml file with patch from expected commits
>  4. Create commits using git am with these eml files
>  5. Compare these commits with expected
>
> The test for "git am --scissors" is supposed to take an e-mail with a
> scissors line and in-body "Subject:" header and demonstrate that the
> subject line from the e-mail itself is overridden by the in-body header
> and that only text below the scissors line is included in the commit
> message of the commit created by the invocation of "git am --scissors".
> However, the setup of the test incorrectly uses a commit without the
> scissors line and without the in-body header in the commit message,
> producing eml file not suitable for testing of "git am --scissors".
>
> This can be checked by intentionally breaking is_scissors_line function
> in mailinfo.c, for example, by changing string ">8", which is used by
> the test. With such change the test should fail, but does not.
>
> Fix broken test by generating eml file with scissors line and in-body
> header "Subject:". Since the two tests for --scissors and --no-scissors
> options are there to test cutting or keeping the commit message, update
> both tests to change the test file in the same way, which allows us to
> generate only one eml file to be passed to git am. To clarify the
> intention of the test, give files and tags more explicit names.
>
> [1]: introduced in bf72ac17d (t4150: tests for am --[no-]scissors,
>  2015-07-19)
>
> Signed-off-by: Andrei Rybak 
> ---
>
> Applies on top of 980a3d3dd (Merge branch 'pt/am-tests', 2015-08-03).
> This patch is also available at
>
>   https://github.com/rybak/git fix-am-scissors-test-v3
>
> Only changes since v2 are more clear tag names.

... and updated log message, which I think makes it worthwhile to
replace the previous one plus the squash/fixup with this version.

Thanks.


>  t/t4150-am.sh | 39 ---
>  1 file changed, 20 insertions(+), 19 deletions(-)
>
> diff --git a/t/t4150-am.sh b/t/t4150-am.sh
> index e9b6f8158..a821dfda5 100755
> --- a/t/t4150-am.sh
> +++ b/t/t4150-am.sh
> @@ -67,13 +67,15 @@ test_expect_success 'setup: messages' '
>  
>   EOF
>  
> - cat >scissors-msg <<-\EOF &&
> - Test git-am with scissors line
> + cat >msg-without-scissors-line <<-\EOF &&
> + Test that git-am --scissors cuts at the scissors line
>  
>   This line should be included in the commit message.
>   EOF
>  
> - cat - scissors-msg >no-scissors-msg <<-\EOF &&
> + printf "Subject: " >subject-prefix &&
> +
> + cat - subject-prefix msg-without-scissors-line >msg-with-scissors-line 
> <<-\EOF &&
>   This line should not be included in the commit message with --scissors 
> enabled.
>  
>- - >8 - - remove everything above this line - - >8 - -
> @@ -150,18 +152,17 @@ test_expect_success setup '
>   } >patch1-hg.eml &&
>  
>  
> - echo scissors-file >scissors-file &&
> - git add scissors-file &&
> - git commit -F scissors-msg &&
> - git tag scissors &&
> - git format-patch --stdout scissors^ >scissors-patch.eml &&
> + echo file >file &&
> + git add file &&
> + git commit -F msg-without-scissors-line &&
> + git tag expected-for-scissors &&
>   git reset --hard HEAD^ &&
>  
> - echo no-scissors-file >no-scissors-file &&
> - git add no-scissors-file &&
> - git commit -F no-scissors-msg &&
> - git tag no-scissors &&
> - git format-patch --stdout no-scissors^ >no-scissors-patch.eml &&
> + echo file >file &&
> + git add file &&
> + git commit -F msg-with-scissors-line &&
> + git tag expected-for-no-scissors &&
> + git format-patch --stdout expected-for-no-scissors^ 
> >patch-with-scissors-line.eml &&
>   git reset --hard HEAD^ &&
>  
>   sed -n -e "3,\$p" msg >file &&
> @@ -418,10 +419,10 @@ test_expect_success 'am --scissors cuts the message at 
> the scissors line' '
>   rm -fr .git/rebase-apply &&
>   git reset --hard &&
>   git checkout second &&
> - git am --scissors scissors-patch.eml &&
> + git am --scissors patch-with-scissors-line.eml &&
>   test_path_is_missing .git/rebase-apply &&
> - git diff --exit-code scissors &&
> - test_cmp_rev scissors HEAD
> + git diff --exit-code expected-for-scissors &&
> + test_cmp_rev expected-for-scissors HEAD
>  '
>  
>  test_expect_success 'am --no-scissors overrides mailinfo.scissors' '
> @@ -429,10 +430,10 @@ test_expect_success 'am --no-scissors overrides 
> mailinfo.scissors' '
>   git reset --hard &&
>   git checkout second &&
>   test_config mailinfo.scissors true &&
> - git am --no-scissors no-scissors-patch.eml &&
> + git am --no-scissors patch-with-scissors-line.eml &&
>   test_path_is_missing .git/rebase-apply

Re: [PATCH 1/1] pull --rebase=: allow single-letter abbreviations for the type

2018-08-06 Thread Junio C Hamano

"Johannes Schindelin via GitGitGadget" 
writes:

> From: Johannes Schindelin 
>
> Git for Windows' original 4aa8b8c8283 (Teach 'git pull' to handle
> --rebase=interactive, 2011-10-21) had support for the very convenient
> abbreviation
>
>   git pull --rebase=i
>
> which was later lost when it was ported to the builtin `git pull`, and
> it was not introduced before the patch eventually made it into Git as
> f5eb87b98dd (pull: allow interactive rebase with --rebase=interactive,
> 2016-01-13).
>
> However, it is *really* a useful short hand for the occasional rebasing
> pull on branches that do not usually want to be rebased.
>
> So let's reintroduce this convenience, at long last.
>
> Signed-off-by: Johannes Schindelin 
> ---

Makes sense, whether this patch is adding back what in the past
existed only in the Windows port but lost, and whether the lossage
was long time ago or in just a few releases go, or it is adding a
complete new feature for that matter.  This looks like a good
short-hand.

Will queue.

[GSoC] [PATCH v6 2/3] rebase: refactor common shell functions into their own file

2018-08-06 Thread Pratik Karki

The functions present in `git-legacy-rebase.sh` are used by the rebase
backends as they are implemented as shell script functions in the
`git-rebase--` files.

To make the `builtin/rebase.c` work, we have to provide support via
a Unix shell script snippet that uses these functions and so, we
want to use the rebase backends *directly* from the builtin rebase
without going through `git-legacy-rebase.sh`.

This commit extracts the functions to a separate file,
`git-rebase--common`, that will be read by `git-legacy-rebase.sh` and
by the shell script snippets which will be used extensively in the
following commits.

Signed-off-by: Pratik Karki 
---
Unchanged since v5.

 .gitignore|  1 +
 Makefile  |  1 +
 git-legacy-rebase.sh  | 69 ++-
 git-rebase--common.sh | 68 ++
 4 files changed, 72 insertions(+), 67 deletions(-)
 create mode 100644 git-rebase--common.sh

diff --git a/.gitignore b/.gitignore
index ec23959014..824141cba1 100644
--- a/.gitignore
+++ b/.gitignore
@@ -117,6 +117,7 @@
 /git-read-tree
 /git-rebase
 /git-rebase--am
+/git-rebase--common
 /git-rebase--helper
 /git-rebase--interactive
 /git-rebase--merge
diff --git a/Makefile b/Makefile
index c311532791..c210681a1d 100644
--- a/Makefile
+++ b/Makefile
@@ -619,6 +619,7 @@ SCRIPT_SH += git-web--browse.sh
 SCRIPT_LIB += git-mergetool--lib
 SCRIPT_LIB += git-parse-remote
 SCRIPT_LIB += git-rebase--am
+SCRIPT_LIB += git-rebase--common
 SCRIPT_LIB += git-rebase--interactive
 SCRIPT_LIB += git-rebase--preserve-merges
 SCRIPT_LIB += git-rebase--merge
diff --git a/git-legacy-rebase.sh b/git-legacy-rebase.sh
index 7973447645..af2cdfef03 100755
--- a/git-legacy-rebase.sh
+++ b/git-legacy-rebase.sh
@@ -57,12 +57,7 @@ cd_to_toplevel
 LF='
 '
 ok_to_skip_pre_rebase=
-resolvemsg="
-$(gettext 'Resolve all conflicts manually, mark them as resolved with
-"git add/rm ", then run "git rebase --continue".
-You can instead skip this commit: run "git rebase --skip".
-To abort and get back to the state before "git rebase", run "git rebase 
--abort".')
-"
+
 squash_onto=
 unset onto
 unset restrict_revision
@@ -102,6 +97,7 @@ case "$(git config --bool commit.gpgsign)" in
 true)  gpg_sign_opt=-S ;;
 *) gpg_sign_opt= ;;
 esac
+. git-rebase--common
 
 read_basic_state () {
test -f "$state_dir/head-name" &&
@@ -132,67 +128,6 @@ read_basic_state () {
}
 }
 
-write_basic_state () {
-   echo "$head_name" > "$state_dir"/head-name &&
-   echo "$onto" > "$state_dir"/onto &&
-   echo "$orig_head" > "$state_dir"/orig-head &&
-   echo "$GIT_QUIET" > "$state_dir"/quiet &&
-   test t = "$verbose" && : > "$state_dir"/verbose
-   test -n "$strategy" && echo "$strategy" > "$state_dir"/strategy
-   test -n "$strategy_opts" && echo "$strategy_opts" > \
-   "$state_dir"/strategy_opts
-   test -n "$allow_rerere_autoupdate" && echo "$allow_rerere_autoupdate" > 
\
-   "$state_dir"/allow_rerere_autoupdate
-   test -n "$gpg_sign_opt" && echo "$gpg_sign_opt" > 
"$state_dir"/gpg_sign_opt
-   test -n "$signoff" && echo "$signoff" >"$state_dir"/signoff
-}
-
-output () {
-   case "$verbose" in
-   '')
-   output=$("$@" 2>&1 )
-   status=$?
-   test $status != 0 && printf "%s\n" "$output"
-   return $status
-   ;;
-   *)
-   "$@"
-   ;;
-   esac
-}
-
-move_to_original_branch () {
-   case "$head_name" in
-   refs/*)
-   message="rebase finished: $head_name onto $onto"
-   git update-ref -m "$message" \
-   $head_name $(git rev-parse HEAD) $orig_head &&
-   git symbolic-ref \
-   -m "rebase finished: returning to $head_name" \
-   HEAD $head_name ||
-   die "$(eval_gettext "Could not move back to \$head_name")"
-   ;;
-   esac
-}
-
-apply_autostash () {
-   if test -f "$state_dir/autostash"
-   then
-   stash_sha1=$(cat "$state_dir/autostash")
-   if git stash apply $stash_sha1 >/dev/null 2>&1
-   then
-   echo "$(gettext 'Applied autostash.')" >&2
-   else
-   git stash store -m "autostash" -q $stash_sha1 ||
-   die "$(eval_gettext "Cannot store \$stash_sha1")"
-   gettext 'Applying autostash resulted in conflicts.
-Your changes are safe in the stash.
-You can run "git stash pop" or "git stash drop" at any time.
-' >&2
-   fi
-   fi
-}
-
 finish_rebase () {
rm -f "$(git rev-parse --git-path REBASE_HEAD)"
apply_autostash &&
diff --git a/git-rebase--common.sh b/git-rebase--common.sh
new file mode 100644
index 00..7e39d22871
--- /dev/null
+++ b/git-rebase--common.sh
@@ -0,0 +1,68 @@
+
+resolvemsg="
+$(gettext 'Resolve all co

[GSoC] [PATCH v6 1/3] rebase: start implementing it as a builtin

2018-08-06 Thread Pratik Karki

This commit imitates the strategy that was used to convert the
difftool to a builtin. We start by renaming the shell script
`git-rebase.sh` to `git-legacy-rebase.sh` and introduce a
`builtin/rebase.c` that simply executes the shell script version,
unless the config setting `rebase.useBuiltin` is set to `true`.

The motivation behind this is to rewrite all the functionality of the
shell script version in the aforementioned `rebase.c`, one by one and
be able to conveniently test new features by configuring
`rebase.useBuiltin`.

In the original difftool conversion, if sane_execvp() that attempts to
run the legacy scripted version returned with non-negative status, the
command silently exited without doing anything with success, but
sane_execvp() should not return with non-negative status in the first
place, so we use die() to notice such an abnormal case.

We intentionally avoid reading the config directly to avoid
messing up the GIT_* environment variables when we need to fall back to
exec()ing the shell script. The test of builtin rebase can be done by
`git -c rebase.useBuiltin=true rebase ...`

Signed-off-by: Pratik Karki 
---
Unchanged since v5.

 .gitignore|  1 +
 Makefile  |  3 +-
 builtin.h |  1 +
 builtin/rebase.c  | 58 +++
 git-rebase.sh => git-legacy-rebase.sh |  0
 git.c |  6 +++
 6 files changed, 68 insertions(+), 1 deletion(-)
 create mode 100644 builtin/rebase.c
 rename git-rebase.sh => git-legacy-rebase.sh (100%)

diff --git a/.gitignore b/.gitignore
index 3284a1e9b1..ec23959014 100644
--- a/.gitignore
+++ b/.gitignore
@@ -78,6 +78,7 @@
 /git-init-db
 /git-interpret-trailers
 /git-instaweb
+/git-legacy-rebase
 /git-log
 /git-ls-files
 /git-ls-remote
diff --git a/Makefile b/Makefile
index bc4fc8eeab..c311532791 100644
--- a/Makefile
+++ b/Makefile
@@ -609,7 +609,7 @@ SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
-SCRIPT_SH += git-rebase.sh
+SCRIPT_SH += git-legacy-rebase.sh
 SCRIPT_SH += git-remote-testgit.sh
 SCRIPT_SH += git-request-pull.sh
 SCRIPT_SH += git-stash.sh
@@ -1063,6 +1063,7 @@ BUILTIN_OBJS += builtin/prune.o
 BUILTIN_OBJS += builtin/pull.o
 BUILTIN_OBJS += builtin/push.o
 BUILTIN_OBJS += builtin/read-tree.o
+BUILTIN_OBJS += builtin/rebase.o
 BUILTIN_OBJS += builtin/rebase--helper.o
 BUILTIN_OBJS += builtin/receive-pack.o
 BUILTIN_OBJS += builtin/reflog.o
diff --git a/builtin.h b/builtin.h
index 0362f1ce25..44651a447f 100644
--- a/builtin.h
+++ b/builtin.h
@@ -202,6 +202,7 @@ extern int cmd_prune_packed(int argc, const char **argv, 
const char *prefix);
 extern int cmd_pull(int argc, const char **argv, const char *prefix);
 extern int cmd_push(int argc, const char **argv, const char *prefix);
 extern int cmd_read_tree(int argc, const char **argv, const char *prefix);
+extern int cmd_rebase(int argc, const char **argv, const char *prefix);
 extern int cmd_rebase__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_receive_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_reflog(int argc, const char **argv, const char *prefix);
diff --git a/builtin/rebase.c b/builtin/rebase.c
new file mode 100644
index 00..c44addb2a4
--- /dev/null
+++ b/builtin/rebase.c
@@ -0,0 +1,58 @@
+/*
+ * "git rebase" builtin command
+ *
+ * Copyright (c) 2018 Pratik Karki
+ */
+
+#include "builtin.h"
+#include "run-command.h"
+#include "exec-cmd.h"
+#include "argv-array.h"
+#include "dir.h"
+
+static int use_builtin_rebase(void)
+{
+   struct child_process cp = CHILD_PROCESS_INIT;
+   struct strbuf out = STRBUF_INIT;
+   int ret;
+
+   argv_array_pushl(&cp.args,
+"config", "--bool", "rebase.usebuiltin", NULL);
+   cp.git_cmd = 1;
+   if (capture_command(&cp, &out, 6)) {
+   strbuf_release(&out);
+   return 0;
+   }
+
+   strbuf_trim(&out);
+   ret = !strcmp("true", out.buf);
+   strbuf_release(&out);
+   return ret;
+}
+
+int cmd_rebase(int argc, const char **argv, const char *prefix)
+{
+   /*
+* NEEDSWORK: Once the builtin rebase has been tested enough
+* and git-legacy-rebase.sh is retired to contrib/, this preamble
+* can be removed.
+*/
+
+   if (!use_builtin_rebase()) {
+   const char *path = mkpath("%s/git-legacy-rebase",
+ git_exec_path());
+
+   if (sane_execvp(path, (char **)argv) < 0)
+   die_errno(_("could not exec %s"), path);
+   else
+   BUG("sane_execvp() returned???");
+   }
+
+   if (argc != 2)
+   die(_("Usage: %s "), argv[0]);
+   prefix = setup_git_directory();
+   trace_repo_setup(prefix);
+   setup_work_tree();
+
+

[GSoC] [PATCH v6 0/3] rebase: rewrite rebase in C

2018-08-06 Thread Pratik Karki

As a GSoC project, I have been working on the builtin rebase.

The motivation behind the rewrite of rebase i.e. from shell script to C
are for following reasons:

1.  Writing shell scripts and getting it to production is much faster
than doing the equivalent in C but lacks in performance and extra
workarounds are needed for non-POSIX platforms.

2.  Git for Windows is at loss as the installer size increases due to
addition of extra dependencies for the shell scripts which are usually
available in POSIX compliant platforms.

This series of patches serves to demonstrate a minimal builtin rebase
which supports running `git rebase ` and also serves to ask for
reviews.

This is only the first patch series, with  many more to come.
I have finished the conversion, but I want to organize the patches in a neat
patch series to make review process more pleasant, and you can see the progress
at .

The organization of patches is also almost done. After the follow-up patches,
the rebase operation will be handled completely by this `builtin/rebase.c`.

Changes since v5:

  -  Fix `builtin/rebase: support running "git rebase "` as it
 does not need to switch to another branch and only needs to detach the
 `HEAD` for now.

  -  The `reset_head()` function is introduced in
 `builtin/rebase: support running "git rebase "`
 which is only used to detach the `HEAD` to `onto` for starting the rebase,
 but upcoming patches will add more functionality like moving or updating to
 original branch.

  -  Lots of changes that fix bugs discovered while getting the test suite to 
pass
 with the patch series.

Pratik Karki (3):
  rebase: start implementing it as a builtin
  rebase: refactor common shell functions into their own file
  builtin/rebase: support running "git rebase "

 .gitignore|   2 +
 Makefile  |   4 +-
 builtin.h |   1 +
 builtin/rebase.c  | 421 ++
 git-rebase.sh => git-legacy-rebase.sh |  69 +
 git-rebase--common.sh |  68 +
 git.c |   6 +
 7 files changed, 503 insertions(+), 68 deletions(-)
 create mode 100644 builtin/rebase.c
 rename git-rebase.sh => git-legacy-rebase.sh (90%)
 create mode 100644 git-rebase--common.sh

-- 
2.18.0

[GSoC] [PATCH v6 3/3] builtin/rebase: support running "git rebase "

2018-08-06 Thread Pratik Karki

This patch gives life to the skeleton added in the previous patches:
With this change, we can perform a elementary rebase (without any
options).

It can be tested thusly by:

git -c rebase.usebuiltin=true rebase HEAD~2

The rebase backends (i.e. the shell script functions defined in
`git-rebase--`) are still at work here and the "builtin
rebase"'s purpose is simply to parse the options and set
everything up so that those rebase backends can do their work.

Note: We take an alternative approach here which is *not* to set the
environment variables via `run_command_v_opt_cd_env()` because those
variables would then be visible by processes spawned from the rebase
backends. Instead, we work hard to set them in the shell script snippet.
On Windows, some of the tests fail merely due to core.fileMode not
being heeded that's why the core.*config variables is parsed here.

The `reset_head()` function is currently only used to detach the HEAD
to onto by way of starting the rebase, but it offers additional
functionality that subsequent patches will need like moving to the
original branch (possibly updating it) and also to do the equivalent of
`git reset --hard`.

The next commits will handle and support all the wonderful rebase
options.

Signed-off-by: Pratik Karki 
---
 builtin/rebase.c | 365 ++-
 1 file changed, 364 insertions(+), 1 deletion(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index c44addb2a4..e695d8a430 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -9,6 +9,24 @@
 #include "exec-cmd.h"
 #include "argv-array.h"
 #include "dir.h"
+#include "packfile.h"
+#include "refs.h"
+#include "quote.h"
+#include "config.h"
+#include "cache-tree.h"
+#include "unpack-trees.h"
+#include "lockfile.h"
+
+static GIT_PATH_FUNC(apply_dir, "rebase-apply")
+static GIT_PATH_FUNC(merge_dir, "rebase-merge")
+
+enum rebase_type {
+   REBASE_UNSPECIFIED = -1,
+   REBASE_AM,
+   REBASE_MERGE,
+   REBASE_INTERACTIVE,
+   REBASE_PRESERVE_MERGES
+};
 
 static int use_builtin_rebase(void)
 {
@@ -30,8 +48,260 @@ static int use_builtin_rebase(void)
return ret;
 }
 
+static int apply_autostash(void)
+{
+   warning("TODO");
+   return 0;
+}
+
+struct rebase_options {
+   enum rebase_type type;
+   const char *state_dir;
+   struct commit *upstream;
+   const char *upstream_name;
+   char *head_name;
+   struct object_id orig_head;
+   struct commit *onto;
+   const char *onto_name;
+   const char *revisions;
+   int root;
+   struct commit *restrict_revision;
+   int dont_finish_rebase;
+};
+
+/* Returns the filename prefixed by the state_dir */
+static const char *state_dir_path(const char *filename, struct rebase_options 
*opts)
+{
+   static struct strbuf path = STRBUF_INIT;
+   static size_t prefix_len;
+
+   if (!prefix_len) {
+   strbuf_addf(&path, "%s/", opts->state_dir);
+   prefix_len = path.len;
+   }
+
+   strbuf_setlen(&path, prefix_len);
+   strbuf_addstr(&path, filename);
+   return path.buf;
+}
+
+static int finish_rebase(struct rebase_options *opts)
+{
+   struct strbuf dir = STRBUF_INIT;
+   const char *argv_gc_auto[] = { "gc", "--auto", NULL };
+
+   delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
+   apply_autostash();
+   close_all_packs(the_repository->objects);
+   /*
+* We ignore errors in 'gc --auto', since the
+* user should see them.
+*/
+   run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
+   strbuf_addstr(&dir, opts->state_dir);
+   remove_dir_recursively(&dir, 0);
+   strbuf_release(&dir);
+
+   return 0;
+}
+
+static struct commit *peel_committish(const char *name)
+{
+   struct object *obj;
+   struct object_id oid;
+
+   if (get_oid(name, &oid))
+   return NULL;
+   obj = parse_object(the_repository, &oid);
+   return (struct commit *)peel_to_type(name, 0, obj, OBJ_COMMIT);
+}
+
+static void add_var(struct strbuf *buf, const char *name, const char *value)
+{
+   if (!value)
+   strbuf_addf(buf, "unset %s; ", name);
+   else {
+   strbuf_addf(buf, "%s=", name);
+   sq_quote_buf(buf, value);
+   strbuf_addstr(buf, "; ");
+   }
+}
+
+static int run_specific_rebase(struct rebase_options *opts)
+{
+   const char *argv[] = { NULL, NULL };
+   struct strbuf script_snippet = STRBUF_INIT;
+   int status;
+   const char *backend, *backend_func;
+
+   add_var(&script_snippet, "GIT_DIR", absolute_path(get_git_dir()));
+   add_var(&script_snippet, "state_dir", opts->state_dir);
+
+   add_var(&script_snippet, "upstream_name", opts->upstream_name);
+   add_var(&script_snippet, "upstream",
+oid_to_hex(&opts->upstream->object.oid));
+   add_var(&script_snippet, "head_name", opts->head_name);
+   add_var(&

Re: [PATCH] add a script to diff rendered documentation

2018-08-06 Thread Junio C Hamano

Jeff King  writes:

>> > +  case "$1" in
>> > +  -j)
>> > +  parallel=${1#-j} ;;
>> 
>> This is curious.  Did you mean "-j*)" on the line above this one?
>
> Hmph, yes, I think this was broken even in the original. And after going
> through "rev-parse --parseopt", we should have a separate argument
> anyway, even for the "stuck" form. Worse, the OPTIONS_SPEC doesn't
> mention the argument, so it barfs on "-j4".

Ah, I forgot (just like somebody else, and even worse is that I
reminded him of this fact that I am forgetting here) that we use the
parseopt thing to normalize the option parsing, so with the correct
spec "-j)" is the right thing to say (but yes then you'd look at $2
and then shift both aawy).

> I think we need to squash this in:

Yup.  Thanks.

> diff --git a/Documentation/doc-diff b/Documentation/doc-diff
> index 5d5b243384..f483fe427c 100755
> --- a/Documentation/doc-diff
> +++ b/Documentation/doc-diff
> @@ -3,7 +3,7 @@
>  OPTIONS_SPEC="\
>  doc-diff [options]   [-- ]
>  --
> -jparallel argument to pass to make
> +j=n  parallel argument to pass to make
>  fforce rebuild; do not rely on cached results
>  "
>  SUBDIRECTORY_OK=1
> @@ -15,7 +15,7 @@ while test $# -gt 0
>  do
>   case "$1" in
>   -j)
> - parallel=${1#-j} ;;
> + parallel=$2; shift ;;
>   -f)
>   force=t ;;
>   --)
>
>> Then "script -j" (no explicit number) would get here and autodetect.
>> Running the script without any "-j" would also get an empty parallel
>> and come here.
>
> Yeah, I think that is the wrong thing. If anything "-j" should behave
> like "make -j". However, it looks like "rev-parse --parseopt" doesn't
> play well with optional arguments for short-only options. You get "-j",
> but then you have no idea whether the next argument is an optional value
> for it, or another option entirely. Arguably it should give a blank
> string or something (if you have long options, then it uses the
> long-stock form, which is fine).
>
>> So "script -j1" would be how a user would say "I want to use exactly
>> one process, not any parallelism", which makes sense.
>
> Right, that was the thing I actually wanted to have happen. :)
>
> -Peff

Re: [RFC PATCH 2/5] Add delta-islands.{c,h}

2018-08-06 Thread Duy Nguyen

On Mon, Aug 6, 2018 at 8:54 PM Christian Couder
 wrote:
> > Is it worth it? The "pahole" comment in this file is up to date. We
> > use 80 bytes per object. This series makes the struct 88 bytes (I've
> > just rerun pahole).
>
> Did you run it on V1 or on V2? I guess on V2, but then what do you
> think about converting the 'layer' field into a bit field, which might
> be simpler and save space?

V2. I kinda ignored the layer field because Jeff was suggesting that
it may not be needed. It may be better to just move it out though
because it's not that complicated and even if we have enough bits
left, they may be spread out that it's hard to reorder so that they're
grouped together and use can use them.

> > On linux repo with 12M objects, "git pack-objects
> > --all" needs extra 96MB memory even if this feature is not used. So
> > yes I still think it's worth moving these fields out of struct
> > object_entry.
>
> And what about the fields already in struct object_entry? While I am
> at it, I think I could move some of them too if it is really so worth

Way ahead of you. In v2.17.0 this struct is 136 bytes so going down to
80 bytes now is a big memory saving. I think I  squeezed everything
possible and was a bit too aggressive that I created a new performance
regression in v2.18.0 ;-)
-- 
Duy

RE: [PATCH] Makefile: enable DEVELOPER by default

2018-08-06 Thread Randall S. Becker

On August 6, 2018 1:42 PM, Ævar Arnfjörð Bjarmason wrote:
> On Mon, Aug 06 2018, Randall S. Becker wrote:
> 
> > On August 6, 2018 12:40 PM, Ævar Arnfjörð Bjarmason wrote:
> >> On Sat, Aug 04 2018, Junio C Hamano wrote:
> >>
> >> > Duy Nguyen  writes:
> >> >
> >> >> On Sat, Aug 4, 2018 at 8:11 AM Jonathan Nieder
> >> >> 
> >> wrote:
> >> >>> My main concern is not about them but about other people building
> >> >>> from source in order to run (instead of to develop) Git, and by
> >> >>> extension, the people they go to for help when it doesn't work.
> >> >>> I have lots of bitter experience of -Werror being a support
> >> >>> headache and leading to bad workarounds when someone upgrades
> >> >>> their compiler and the build starts failing due to a new warning it has
> introduced.
> >> >>
> >> >> Even old compilers can also throw some silly, false positive
> >> >> warnings (which now turn into errors) because they are not as
> >> >> smart as new ones.
> >> >
> >> > I agree with both of the above.  I do not think the pros-and-cons
> >> > are in favor of forcing the developer bit to everybody, even though
> >> > I am sympathetic to the desire to see people throw fewer bad
> >> > changes that waste review bandwidth by not compiling or passing its
> >> > own tests at us.
> >>
> >> I agree.
> >>
> >> Responding to the thread in general, perhaps people would like this
> >> more if we turned DEVELOPER=1 DEVOPTS=no-error on by default?
> >>
> >> That's basically why I added it in 99f763baf5 ("Makefile: add a
> >> DEVOPTS to suppress -Werror under DEVELOPER", 2018-04-14), because I
> >> wanted the abilty to have verbose informative output without the
> >> build dying on some older systems / compilers.
> >>
> >> It's fine and understandable if you're someone who's just building a
> >> package on some older system if you get a bunch of compiler warnings,
> >> but more annoying if you have to dig into how to disable a default -
> Werror.
> >
> > I am the platform maintainer for HPE NonStop and need to make sure I'm
> > not packaging DEV builds to anyone
> 
> Perhaps confusingly, the DEVELOPER=1 flag in git is not like the developer
> flag in some other projects. It's purely there to turn on extra compiler
> warnings (by default, fatal), it doesn't e.g. turn on extra asserts, tracing, 
> or
> suppress stripping of the binaries.
> 
> So if we enabled some variant of it by default it would be fine to ship the
> result of that to your users, e.g. I ship DEVELOPER=1 builds to users.

That works for me. I generally consider warnings to be errors in my builds 
anyway, by policy (except when I have no choice in the matter and the warning 
is explainable).

Cheers,
Randall

-- Brief whoami:
 NonStop developer since approximately 2112884442
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.

Re: [PATCH] Makefile: enable DEVELOPER by default

2018-08-06 Thread Jeff King

On Mon, Aug 06, 2018 at 10:11:19AM -0700, Jonathan Nieder wrote:

> > We're developers ourselves, and we interact with new developers that we
> > want to help.  But there are masses of people[1] building Git who are
> > _not_ developers, and want the default to be as robust as possible.
> > They're probably not going to show up in this thread.
> >
> > -Peff
> >
> > [1] I actually wonder how large that mass is. Clearly there are many
> > orders of magnitude more users than there are developers. But I have
> > no idea what percentage of them build from source versus using
> > somebody else's binary package.
> 
> Relatedly, we need to think about the incentives these defaults
> create.  Personally, I want *more* naive users to be building from
> source, because then they are better able to test recent versions,
> bisect, test my patches, etc.
> 
> As I hinted in my earlier reply, I think it would be best to try some
> basic things to make DEVELOPER more visible first.  If that fails,
> then we can revisit how to make this more drastic change in a way that
> minimizes the harm (and I am not sure yet that that is possible).

Yes, I agree very much with both of those paragraphs.

-Peff

Re: [PATCH v3 0/4] Speed up unpack_trees()

2018-08-06 Thread Junio C Hamano

Duy Nguyen  writes:

> We require the unpacked entry from all input trees to be a tree
> objects (the dirmask thing), so if one tree has 't' as a file,

Ah, OK, this is still part of that "all the trees match cache tree
so we walk the index instead" optimization.  I forgot about that.

Re: [PATCH] add a script to diff rendered documentation

2018-08-06 Thread Jeff King

On Mon, Aug 06, 2018 at 11:25:07AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > +while test $# -gt 0
> > +do
> > +   case "$1" in
> > +   -j)
> > +   parallel=${1#-j} ;;
> 
> This is curious.  Did you mean "-j*)" on the line above this one?

Hmph, yes, I think this was broken even in the original. And after going
through "rev-parse --parseopt", we should have a separate argument
anyway, even for the "stuck" form. Worse, the OPTIONS_SPEC doesn't
mention the argument, so it barfs on "-j4".

I think we need to squash this in:

diff --git a/Documentation/doc-diff b/Documentation/doc-diff
index 5d5b243384..f483fe427c 100755
--- a/Documentation/doc-diff
+++ b/Documentation/doc-diff
@@ -3,7 +3,7 @@
 OPTIONS_SPEC="\
 doc-diff [options]   [-- ]
 --
-j  parallel argument to pass to make
+j=nparallel argument to pass to make
 f  force rebuild; do not rely on cached results
 "
 SUBDIRECTORY_OK=1
@@ -15,7 +15,7 @@ while test $# -gt 0
 do
case "$1" in
-j)
-   parallel=${1#-j} ;;
+   parallel=$2; shift ;;
-f)
force=t ;;
--)

> Then "script -j" (no explicit number) would get here and autodetect.
> Running the script without any "-j" would also get an empty parallel
> and come here.

Yeah, I think that is the wrong thing. If anything "-j" should behave
like "make -j". However, it looks like "rev-parse --parseopt" doesn't
play well with optional arguments for short-only options. You get "-j",
but then you have no idea whether the next argument is an optional value
for it, or another option entirely. Arguably it should give a blank
string or something (if you have long options, then it uses the
long-stock form, which is fine).

> So "script -j1" would be how a user would say "I want to use exactly
> one process, not any parallelism", which makes sense.

Right, that was the thing I actually wanted to have happen. :)

-Peff

Re: [PATCH v6 1/1] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Eric Sunshine

On Mon, Aug 6, 2018 at 1:45 PM Han-Wen Nienhuys  wrote:
> The colorization is controlled with the config setting "color.remote".
> [...]
> Helped-by: Duy Nguyen 
> Signed-off-by: Han-Wen Nienhuys 
> ---
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> @@ -1263,6 +1263,18 @@ color.push::
> +color.remote::
> +   If set, keywords at the start of the line are highlighted. The
> +   keywords are "error", "warning", "hint" and "success", and are
> +   matched case-insensitively. Maybe set to `always`, `false` (or

s/Maybe/May be/

> +   `never`) or `auto` (or `true`). If unset, then the value of
> +   `color.ui` is used (`auto` by default).
> diff --git a/sideband.c b/sideband.c
> +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, 
> int n)
> +{
> +   if (!want_color_stderr(use_sideband_colors())) {
> +   strbuf_add(dest, src, n);
> +   fprintf(stderr, "added: '%s'\n", dest->buf);

Debugging gunk?

> +   return;
> +   }
> @@ -96,6 +204,7 @@ int recv_sideband(const char *me, int in_stream, int out)
> if (outbuf.len) {
> +   fprintf(stderr, "last bit '%s'\n", outbuf.buf);

Debugging gunk?

> strbuf_addch(&outbuf, '\n');
> xwrite(2, outbuf.buf, outbuf.len);
> }

Re: [RFC PATCH 2/5] Add delta-islands.{c,h}

2018-08-06 Thread Christian Couder

On Mon, Aug 6, 2018 at 5:53 PM, Duy Nguyen  wrote:
> On Sun, Aug 5, 2018 at 8:53 PM Christian Couder
>  wrote:
>>
>> As you can see the patch 6/6 (in the v2 of this patch series that I
>> just sent) moves `unsigned int tree_depth` from 'struct object_entry'
>> to 'struct packing_data'. I am not sure that I did it right and that
>> it is worth it though as it is a bit more complex.
>
> It is a bit more complex than I expected. But I think if you go with
> Jeff's suggestion (i.e. think of the new tree_depth array an extension
> of objects array) it's a bit simpler: you access both arrays using the
> same index, both arrays should have the same size and be realloc'd at
> the same time in packlist_alloc().

Ok, I will take a look at doing that to simplify things. Thanks to
Peff and you for that suggestion!

> Is it worth it? The "pahole" comment in this file is up to date. We
> use 80 bytes per object. This series makes the struct 88 bytes (I've
> just rerun pahole).

Did you run it on V1 or on V2? I guess on V2, but then what do you
think about converting the 'layer' field into a bit field, which might
be simpler and save space?

> On linux repo with 12M objects, "git pack-objects
> --all" needs extra 96MB memory even if this feature is not used. So
> yes I still think it's worth moving these fields out of struct
> object_entry.

And what about the fields already in struct object_entry? While I am
at it, I think I could move some of them too if it is really so worth
it.

Re: [PATCH] Makefile: enable DEVELOPER by default

2018-08-06 Thread Stefan Beller

> I had the impression that DEVELOPER=1 was allowed to set flags that old
> versions might not even know about. Hence they might actually barf, even
> without -Werror. Maybe that's better since the introduction of the
> detect-compiler script, though.
>
> I do think we may have a skewed view of the population on this list.
> We're developers ourselves, and we interact with new developers that we
> want to help.  But there are masses of people[1] building Git who are
> _not_ developers, and want the default to be as robust as possible.
> They're probably not going to show up in this thread.

Good point about the skewed perception of people who compile Git
themselves.

At first I thought this could be mitigated by detecting if someone is
a developer by being more clever, for example if they have a commit
(at HEAD) that is not contained in any remote, it is a pretty good indicator
that fresh code was written. But this could also be the case for some
non mainstream platform, that needs fixing.

But then I entertained the thought that we actually do not care about
people committing non conforming code to their copy of git.git, but
actually only care about the code when we see it (i.e. upon review).

So I wonder if we want to "fork" git-send-email and provide a tool to
send emails with a pre-send compile&lint check specific to our code
base.

But then all these sound a lot more complicated than a simple knob
that we turn on and off. So I don't look any further into it.

Stefan

[PATCH] t: factor out FUNNYNAMES as shared lazy prereq

2018-08-06 Thread William Chargin

A fair number of tests need to check that the filesystem supports file
names including "funny" characters, like newline, tab, and double-quote.
Jonathan Nieder suggested that this be extracted into a lazy prereq in
the top-level `test-lib.sh`. This patch effects that change.

The FUNNYNAMES prereq now uniformly requires support for newlines, tabs,
and double-quotes in filenames. This very slightly decreases the power
of some tests, which might have run previously on a system that supports
(e.g.) newlines and tabs but not double-quotes, but now will not. This
seems to me like an acceptable tradeoff for consistency.

One test (`t/t9902-completion.sh`) defined FUNNYNAMES to further require
the separators \034 through \037, the test for which was implemented
using the Bash-specific $'\034' syntax. I've elected to leave this one
as is, renaming it to FUNNIERNAMES.

After this patch, `git grep 'test_\(set\|lazy\)_prereq.*FUNNYNAMES'` has
only one result.

Signed-off-by: William Chargin 
---
Note: I've tested this only on an Ubuntu-like system, where FUNNYNAMES
and FUNNIERNAMES are both naturally satisfied. I've verified that the
tests correctly skip when the prereqs are stubbed out to fail by
prepending `false &&`, but I haven't verified that the actual logic for
testing the prereq has the correct behavior on non-FUNNYNAMES systems.

 t/t3600-rm.sh|  8 +++-
 t/t4135-apply-weird-filenames.sh | 10 +-
 t/t9902-completion.sh|  6 +++---
 t/t9903-bash-prompt.sh   | 13 +
 t/test-lib.sh| 14 ++
 5 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index b8fbdefcd..5829dfd12 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -14,15 +14,13 @@ test_expect_success \
  git add -- foo bar baz 'space embedded' -q &&
  git commit -m 'add normal files'"
 
-if test_have_prereq !MINGW && touch -- 'tabembedded' 'newline
-embedded' 2>/dev/null
-then
-   test_set_prereq FUNNYNAMES
-else
+if test_have_prereq !FUNNYNAMES; then
say 'Your filesystem does not allow tabs in filenames.'
 fi
 
 test_expect_success FUNNYNAMES 'add files with funny names' "
+ touch -- 'tab embedded' 'newline
+embedded' &&
  git add -- 'tab   embedded' 'newline
 embedded' &&
  git commit -m 'add files with tabs and newlines'
diff --git a/t/t4135-apply-weird-filenames.sh b/t/t4135-apply-weird-filenames.sh
index c7c688fcc..6bc3fb97a 100755
--- a/t/t4135-apply-weird-filenames.sh
+++ b/t/t4135-apply-weird-filenames.sh
@@ -15,15 +15,7 @@ test_expect_success 'setup' '
git checkout -f preimage^0 &&
git read-tree -u --reset HEAD &&
git update-index --refresh
-   } &&
-
-   test_when_finished "rm -f \"tab embedded.txt\"" &&
-   test_when_finished "rm -f '\''\"quoteembedded\".txt'\''" &&
-   if test_have_prereq !MINGW &&
-   touch -- "tab   embedded.txt" '\''"quoteembedded".txt'\''
-   then
-   test_set_prereq FUNNYNAMES
-   fi
+   }
 '
 
 try_filename() {
diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 5ff43b9cb..175f83d70 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -1278,7 +1278,7 @@ test_expect_success 'setup for path completion tests' '
   touch BS\\dir/DQ\"file \
 '$'separators\034in\035dir/sep\036in\037file''
then
-   test_set_prereq FUNNYNAMES
+   test_set_prereq FUNNIERNAMES
else
rm -rf BS\\dir '$'separators\034in\035dir''
fi
@@ -1320,7 +1320,7 @@ test_expect_success '__git_complete_index_file - UTF-8 in 
ls-files output' '
test_path_completion árvíztűrő/С "árvíztűrő/Сайн яваарай"
 '
 
-test_expect_success FUNNYNAMES \
+test_expect_success FUNNIERNAMES \
 '__git_complete_index_file - C-style escapes in ls-files output' '
test_path_completion BS \
 BS\\dir &&
@@ -1332,7 +1332,7 @@ test_expect_success FUNNYNAMES \
 BS\\dir/DQ\"file
 '
 
-test_expect_success FUNNYNAMES \
+test_expect_success FUNNIERNAMES \
 '__git_complete_index_file - \nnn-escaped characters in ls-files output' '
test_path_completion sep '$'separators\034in\035dir'' &&
test_path_completion '$'separators\034i'' \
diff --git a/t/t9903-bash-prompt.sh b/t/t9903-bash-prompt.sh
index 04440685a..ab890d3d4 100755
--- a/t/t9903-bash-prompt.sh
+++ b/t/t9903-bash-prompt.sh
@@ -63,18 +63,15 @@ test_expect_success 'prompt - unborn branch' '
test_cmp expected "$actual"
 '
 
-repo_with_newline='repo
-with
-newline'
-
-if test_have_prereq !MINGW && mkdir "$repo_with_newline" 2>/dev/null
-then
-   test_set_prereq FUNNYNAMES
-else
+if test_have_prereq !FUNNYNAMES; then
say 'Your filesystem does not allow newlines in filenames.'
 fi
 
 test_expect_success FUNNYNAMES 'prompt - with newline in path' '
+rep

Re: [BUG] A part of an edge from an octopus merge gets colored, even with --color=never

2018-08-06 Thread Noam Postavsky

On 30 June 2018 at 08:47, Noam Postavsky  wrote:

> I'm still having trouble getting a big picture understanding of how
> the graph struct relates the what gets drawn on screen, but through
> some poking around with the debugger + trial & error, I've arrived at
> a new patch which seems to work. It's also a lot simpler. I hope you
> can tell me if it makes sense.
>
> Also attached an updated test-multiway-merge.sh which allows adding
> more branches to test different sized merges more easily.

Ping? (I got some bounce message regarding test-multiway-merge.sh, but
it does show up in the mailing list archive, so I think my message has
gone through)

https://marc.info/?l=git&m=153036284214253&w=2

[no subject]

2018-08-06 Thread johpaulesq...@gmail.com

Chúng tôi mang lại thông báo cho bạn rằng chính phủ của chúng tôi
chiến đấu cho những người lừa đảo bây giờ ở nước ta Togo và chính phủ
của chúng tôi IMF phát hiện ra địa chỉ email của bạn trong danh sách
nạn nhân lừa đảo ở nước ta Togo và từ bi tổng số của bạn $ 850;
000.00. Chúng tôi không thể chuyển tiền cho bạn qua địa chỉ email.

  Chúng tôi sẽ chuyển hàng ngày cho bạn $ 5000 thông qua văn phòng
ngân hàng gram tiền của chúng tôi cho đến khi quỹ chuyển hoàn toàn cho
bạn. Xin vui lòng nếu bạn không sẵn sàng để nhận được quỹ này, đừng
trả lời chúng tôi

Chúng tôi yêu cầu khi nào có thể chuyển tiền cho bạn

Tên đầy đủ của bạn
Địa chỉ nhà
Quốc gia / thành phố
Số điện thoại
Bản sao hộ chiếu

Chúng tôi mong được nghe từ bạn khẩn trương.

Trân trọng
  Mr.Felix Richard
  M.G GIÁM ĐỐC NGÂN HÀNG
Điện thoại: +22891368099

Re: [PATCH] add a script to diff rendered documentation

2018-08-06 Thread Junio C Hamano

Jeff King  writes:

> +while test $# -gt 0
> +do
> + case "$1" in
> + -j)
> + parallel=${1#-j} ;;

This is curious.  Did you mean "-j*)" on the line above this one?

> + -f)
> + force=t ;;
> + --)
> + shift; break ;;
> + *)
> + usage ;;
> + esac
> + shift
> +done
> +
> +if test -z "$parallel"
> +then

Then "script -j" (no explicit number) would get here and autodetect.
Running the script without any "-j" would also get an empty parallel
and come here.

So "script -j1" would be how a user would say "I want to use exactly
one process, not any parallelism", which makes sense.

> + parallel=$(getconf _NPROCESSORS_ONLN 2>/dev/null)
> + if test $? != 0 || test -z "$parallel"
> + then
> + parallel=1
> + fi
> +fi

Re: [RFC PATCH v2 06/12] submodule--helper: add a '--stage' option to the 'config' sub command

2018-08-06 Thread Stefan Beller

> The consistency I was referring to is merely in the mechanism used to
> deal with .gitmodules: by always using "submodule--helper config".
>
> As a side argument: in some previous discussion Stefan mentioned the
> possibility that, in the future, he may be interested to explore
> different locations for submodules configuration, like a special ref,
> to prevent .gitmodules from being in the working tree at all, not even
> for writing submodule configuration.
>
> Having an opaque "stage submodule configuration" operation would
> prepare for that too, but as I said this is not my immediate goal.
>

Thanks for demonstrating that this will be easy to achieve in the future!

> Having said that I can drop patches 06/07 if this can help moving things
> forward; or if I had some success in convincing you I can improve the
> user interface (any advice?) and the commit message.
>

I think dropping the patches would be a good idea,
given that this is not your immediate goal.

Thanks,
Stefan

Re: [PATCH 1/1] t/test-lib: make `test_dir_is_empty` more robust

2018-08-06 Thread Eric Sunshine

On Mon, Aug 6, 2018 at 9:02 AM Jeff King  wrote:
> On Sun, Aug 05, 2018 at 04:52:31PM -0400, Jeff King wrote:
> > Perhaps even simpler:
> >
> >   test "$1" = "$(find "$1")"
>
> Actually, I guess it needs to add "-print", since IIRC that is not the
> default on some old versions of "find".

You recall correctly.

I tend to avoid 'find' in scripts when I don't need its recursive
behavior due to *extremely* poor performance on Windows (at least that
was the case years ago), especially when there are a lot of files in
the directory being listed. However, for this usage, we expect the
directory to be empty, so perhaps performance isn't an issue.

Re: [PATCH v5 2/2] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Junio C Hamano

Han-Wen Nienhuys  writes:

>> I think the code is OK without any assert() or BUG(), and that is
>> because the design is "we just paint the keyword at the beginning of
>> what the other side of the sideband wants us to show as a single
>> unit".  If the other side sends a payload with an embedded LF in a
>> single packet, that's their choice and we are free not to paint the
>> beginning of the second line after that LF.  So from that point of
>> view, perhaps we shouldn't even talk about "a single line only".
>
> I don't understand this remark. Isn't the call to strpbrk() meant to
> split the input on line endings?

Yes, but that happens only for band #2 and not band #3, to which
this coloring also applies if I am reading your changes correctly.

And I still think not splitting band #3 packet into lines and
painting only the beginning of the first line is perfectly OK.

>> >  #define ANSI_SUFFIX "\033[K"
>> > -#define DUMB_SUFFIX ""
>> > +#define DUMB_SUFFIX " "
>> >
>>
>> Was this change intended and if so for what purpose?  I can drop
>> this hunk if this is a mere finger-slip without proofreading, but I
>> do not want to do so without making sure I am not missing anything
>> and not discarding a meaningful change.
>
> This was my poor use of the tabify function.

OK, so I can drop this hunk---eh, perhaps v6 won't have it so I
won't have to worry about it ;-)

Thanks.

[PATCH v3] t4150: fix broken test for am --scissors

2018-08-06 Thread Andrei Rybak

Tests for "git am --[no-]scissors" [1] work in the following way:

 1. Create files with commit messages
 2. Use these files to create expected commits
 3. Generate eml file with patch from expected commits
 4. Create commits using git am with these eml files
 5. Compare these commits with expected

The test for "git am --scissors" is supposed to take an e-mail with a
scissors line and in-body "Subject:" header and demonstrate that the
subject line from the e-mail itself is overridden by the in-body header
and that only text below the scissors line is included in the commit
message of the commit created by the invocation of "git am --scissors".
However, the setup of the test incorrectly uses a commit without the
scissors line and without the in-body header in the commit message,
producing eml file not suitable for testing of "git am --scissors".

This can be checked by intentionally breaking is_scissors_line function
in mailinfo.c, for example, by changing string ">8", which is used by
the test. With such change the test should fail, but does not.

Fix broken test by generating eml file with scissors line and in-body
header "Subject:". Since the two tests for --scissors and --no-scissors
options are there to test cutting or keeping the commit message, update
both tests to change the test file in the same way, which allows us to
generate only one eml file to be passed to git am. To clarify the
intention of the test, give files and tags more explicit names.

[1]: introduced in bf72ac17d (t4150: tests for am --[no-]scissors,
 2015-07-19)

Signed-off-by: Andrei Rybak 
---

Applies on top of 980a3d3dd (Merge branch 'pt/am-tests', 2015-08-03).
This patch is also available at

  https://github.com/rybak/git fix-am-scissors-test-v3

Only changes since v2 are more clear tag names.

 t/t4150-am.sh | 39 ---
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index e9b6f8158..a821dfda5 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -67,13 +67,15 @@ test_expect_success 'setup: messages' '
 
EOF
 
-   cat >scissors-msg <<-\EOF &&
-   Test git-am with scissors line
+   cat >msg-without-scissors-line <<-\EOF &&
+   Test that git-am --scissors cuts at the scissors line
 
This line should be included in the commit message.
EOF
 
-   cat - scissors-msg >no-scissors-msg <<-\EOF &&
+   printf "Subject: " >subject-prefix &&
+
+   cat - subject-prefix msg-without-scissors-line >msg-with-scissors-line 
<<-\EOF &&
This line should not be included in the commit message with --scissors 
enabled.
 
 - - >8 - - remove everything above this line - - >8 - -
@@ -150,18 +152,17 @@ test_expect_success setup '
} >patch1-hg.eml &&
 
 
-   echo scissors-file >scissors-file &&
-   git add scissors-file &&
-   git commit -F scissors-msg &&
-   git tag scissors &&
-   git format-patch --stdout scissors^ >scissors-patch.eml &&
+   echo file >file &&
+   git add file &&
+   git commit -F msg-without-scissors-line &&
+   git tag expected-for-scissors &&
git reset --hard HEAD^ &&
 
-   echo no-scissors-file >no-scissors-file &&
-   git add no-scissors-file &&
-   git commit -F no-scissors-msg &&
-   git tag no-scissors &&
-   git format-patch --stdout no-scissors^ >no-scissors-patch.eml &&
+   echo file >file &&
+   git add file &&
+   git commit -F msg-with-scissors-line &&
+   git tag expected-for-no-scissors &&
+   git format-patch --stdout expected-for-no-scissors^ 
>patch-with-scissors-line.eml &&
git reset --hard HEAD^ &&
 
sed -n -e "3,\$p" msg >file &&
@@ -418,10 +419,10 @@ test_expect_success 'am --scissors cuts the message at 
the scissors line' '
rm -fr .git/rebase-apply &&
git reset --hard &&
git checkout second &&
-   git am --scissors scissors-patch.eml &&
+   git am --scissors patch-with-scissors-line.eml &&
test_path_is_missing .git/rebase-apply &&
-   git diff --exit-code scissors &&
-   test_cmp_rev scissors HEAD
+   git diff --exit-code expected-for-scissors &&
+   test_cmp_rev expected-for-scissors HEAD
 '
 
 test_expect_success 'am --no-scissors overrides mailinfo.scissors' '
@@ -429,10 +430,10 @@ test_expect_success 'am --no-scissors overrides 
mailinfo.scissors' '
git reset --hard &&
git checkout second &&
test_config mailinfo.scissors true &&
-   git am --no-scissors no-scissors-patch.eml &&
+   git am --no-scissors patch-with-scissors-line.eml &&
test_path_is_missing .git/rebase-apply &&
-   git diff --exit-code no-scissors &&
-   test_cmp_rev no-scissors HEAD
+   git diff --exit-code expected-for-no-scissors &&
+   test_cmp_rev expected-for-no-scissors HEAD
 '
 
 test_expect_success 'setup: new author and committer' '
-- 
2.18.0

Re: [PATCH v5 2/2] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Han-Wen Nienhuys

On Mon, Aug 6, 2018 at 7:21 PM Junio C Hamano  wrote:
> > + If set, keywords at the start of the line are highlighted. The
> > + keywords are "error", "warning", "hint" and "success", and are
> > + matched case-insensitively. Maybe set to `always`, `false` (or
> > + `never`) or `auto` (or `true`). If unset, then the value of
> > + `color.ui` is used (`auto` by default).
>
> Reads much better.
>
> I am still trying to find a concise way to help readers who saw a
> line that begins with "Warnings: foo bar bla" and accept that it is
> OK the early 7 chars are not painted.  "... case-insensitively and
> honoring word boundary" is the best I came up so far, but  I am
> afraid that is adding more words without hinting what I want to convey
> strongly enough, so I am not going to suggest that (at least not yet).
>

what do you think of the idea of requiring a colon explicitly? That
would avoid spurious highlighting in case a keyword happens to be
wrapped onto a line start.

--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

[PATCH v6 1/1] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Han-Wen Nienhuys

The colorization is controlled with the config setting "color.remote".

Supported keywords are "error", "warning", "hint" and "success". They
are highlighted if they appear at the start of the line, which is
common in error messages, eg.

   ERROR: commit is missing Change-Id

The Git push process itself prints lots of non-actionable messages
(eg. bandwidth statistics, object counters for different phases of the
process). This obscures actionable error messages that servers may
send back. Highlighting keywords in the sideband draws more attention
to those messages.

The background for this change is that Gerrit does server-side
processing to create or update code reviews, and actionable error
messages (eg. missing Change-Id) must be communicated back to the user
during the push. User research has shown that new users have trouble
seeing these messages.

The highlighting is done on the client rather than server side, so
servers don't have to grow capabilities to understand terminal escape
codes and terminal state. It also consistent with the current state
where Git is control of the local display (eg. prefixing messages with
"remote: ").

The highlighting can be configured using color.remote.
configuration settings. Since the keys are matched case insensitively,
we match the keywords case insensitively too.

Finally, this solution is backwards compatible: many servers already
prefix their messages with "error", and they will benefit from this
change without requiring a server update. By contrast, a server-side
solution would likely require plumbing the TERM variable through the
git protocol, so it would require changes to both server and client.

Helped-by: Duy Nguyen 
Signed-off-by: Han-Wen Nienhuys 
---
 Documentation/config.txt|  12 +++
 help.c  |   1 +
 help.h  |   1 +
 sideband.c  | 127 ++--
 t/t5409-colorize-remote-messages.sh |  87 +++
 5 files changed, 219 insertions(+), 9 deletions(-)
 create mode 100755 t/t5409-colorize-remote-messages.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 63365dcf3d..33bc1a3def 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1263,6 +1263,18 @@ color.push::
 color.push.error::
Use customized color for push errors.
 
+color.remote::
+   If set, keywords at the start of the line are highlighted. The
+   keywords are "error", "warning", "hint" and "success", and are
+   matched case-insensitively. Maybe set to `always`, `false` (or
+   `never`) or `auto` (or `true`). If unset, then the value of
+   `color.ui` is used (`auto` by default).
+
+color.remote.::
+   Use customized color for each remote keyword. `` may be
+   `hint`, `warning`, `success` or `error` which match the
+   corresponding keyword.
+
 color.showBranch::
A boolean to enable/disable color in the output of
linkgit:git-show-branch[1]. May be set to `always`,
diff --git a/help.c b/help.c
index 3ebf0568db..b6cafcfc0a 100644
--- a/help.c
+++ b/help.c
@@ -425,6 +425,7 @@ void list_config_help(int for_human)
{ "color.diff", "", list_config_color_diff_slots },
{ "color.grep", "", list_config_color_grep_slots },
{ "color.interactive", "", 
list_config_color_interactive_slots },
+   { "color.remote", "", list_config_color_sideband_slots },
{ "color.status", "", list_config_color_status_slots },
{ "fsck", "", list_config_fsck_msg_ids },
{ "receive.fsck", "", list_config_fsck_msg_ids },
diff --git a/help.h b/help.h
index f8b15323a6..9eab6a3f89 100644
--- a/help.h
+++ b/help.h
@@ -83,6 +83,7 @@ void list_config_color_diff_slots(struct string_list *list, 
const char *prefix);
 void list_config_color_grep_slots(struct string_list *list, const char 
*prefix);
 void list_config_color_interactive_slots(struct string_list *list, const char 
*prefix);
 void list_config_color_status_slots(struct string_list *list, const char 
*prefix);
+void list_config_color_sideband_slots(struct string_list *list, const char 
*prefix);
 void list_config_fsck_msg_ids(struct string_list *list, const char *prefix);
 
 #endif /* HELP_H */
diff --git a/sideband.c b/sideband.c
index 325bf0e974..d882e79a5f 100644
--- a/sideband.c
+++ b/sideband.c
@@ -1,6 +1,110 @@
 #include "cache.h"
+#include "color.h"
+#include "config.h"
 #include "pkt-line.h"
 #include "sideband.h"
+#include "help.h"
+
+struct keyword_entry {
+   /*
+* We use keyword as config key so it should be a single alphanumeric 
word.
+*/
+   const char *keyword;
+   char color[COLOR_MAXLEN];
+};
+
+static struct keyword_entry keywords[] = {
+   { "hint",   GIT_COLOR_YELLOW },
+   { "warning",GIT_COLOR_BOLD_YELLOW },
+   { "success",GIT_COLOR_BOLD_GREEN },
+   { "error",  GIT_COLOR_BOLD_R

[PATCH v6 0/1] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Han-Wen Nienhuys

address Jun's comments.

Han-Wen Nienhuys (1):
  sideband: highlight keywords in remote sideband output

 Documentation/config.txt|  12 +++
 help.c  |   1 +
 help.h  |   1 +
 sideband.c  | 127 ++--
 t/t5409-colorize-remote-messages.sh |  87 +++
 5 files changed, 219 insertions(+), 9 deletions(-)
 create mode 100755 t/t5409-colorize-remote-messages.sh

--
2.18.0.597.ga71716f1ad-goog

Re: [PATCH v2] t4150: fix broken test for am --scissors

2018-08-06 Thread Andrei Rybak

On 2018-08-06 10:58, Paul Tan wrote:
>> +   git commit -F msg-without-scissors-line &&
>> +   git tag scissors-used &&
> 
> Nit: I'm not quite sure about naming the tag "scissors-used" though,
> since this commit was not created from the output of "git am
> --scissors". Maybe it should be named `commit-without-scissors-line`
> or something?
> 
>> +   git commit -F msg-with-scissors-line &&
>> +   git tag scissors-not-used &&
> 
> Nit: Likewise, perhaps this tag could be named `commit-with-scissors-line`?

How about "expected-for-scissors" and "expected-for-no-scissors"?
Junio, I'll send out v3 with updated tag names, if that's OK.
Also, squash-able patch below.

> So, this patch fixes the 3 problems with the tests, and so looks correct to 
> me.

Thank you for such thorough review.

--- 8< ---
From: Andrei Rybak 
Date: Mon, 6 Aug 2018 19:29:03 +0200
Subject: [PATCH] squash! t4150: fix broken test for am --scissors

clarify tag names
---
 t/t4150-am.sh | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index bb2d951a70..a821dfda54 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -155,14 +155,14 @@ test_expect_success setup '
echo file >file &&
git add file &&
git commit -F msg-without-scissors-line &&
-   git tag scissors-used &&
+   git tag expected-for-scissors &&
git reset --hard HEAD^ &&
 
echo file >file &&
git add file &&
git commit -F msg-with-scissors-line &&
-   git tag scissors-not-used &&
-   git format-patch --stdout scissors-not-used^ 
>patch-with-scissors-line.eml &&
+   git tag expected-for-no-scissors &&
+   git format-patch --stdout expected-for-no-scissors^ 
>patch-with-scissors-line.eml &&
git reset --hard HEAD^ &&
 
sed -n -e "3,\$p" msg >file &&
@@ -421,8 +421,8 @@ test_expect_success 'am --scissors cuts the message at the 
scissors line' '
git checkout second &&
git am --scissors patch-with-scissors-line.eml &&
test_path_is_missing .git/rebase-apply &&
-   git diff --exit-code scissors-used &&
-   test_cmp_rev scissors-used HEAD
+   git diff --exit-code expected-for-scissors &&
+   test_cmp_rev expected-for-scissors HEAD
 '
 
 test_expect_success 'am --no-scissors overrides mailinfo.scissors' '
@@ -432,8 +432,8 @@ test_expect_success 'am --no-scissors overrides 
mailinfo.scissors' '
test_config mailinfo.scissors true &&
git am --no-scissors patch-with-scissors-line.eml &&
test_path_is_missing .git/rebase-apply &&
-   git diff --exit-code scissors-not-used &&
-   test_cmp_rev scissors-not-used HEAD
+   git diff --exit-code expected-for-no-scissors &&
+   test_cmp_rev expected-for-no-scissors HEAD
 '
 
 test_expect_success 'setup: new author and committer' '
-- 
2.18.0

Re: [PATCH] Makefile: enable DEVELOPER by default

2018-08-06 Thread Ævar Arnfjörð Bjarmason



On Mon, Aug 06 2018, Randall S. Becker wrote:

> On August 6, 2018 12:40 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Sat, Aug 04 2018, Junio C Hamano wrote:
>>
>> > Duy Nguyen  writes:
>> >
>> >> On Sat, Aug 4, 2018 at 8:11 AM Jonathan Nieder 
>> wrote:
>> >>> My main concern is not about them but about other people building
>> >>> from source in order to run (instead of to develop) Git, and by
>> >>> extension, the people they go to for help when it doesn't work.  I
>> >>> have lots of bitter experience of -Werror being a support headache
>> >>> and leading to bad workarounds when someone upgrades their compiler
>> >>> and the build starts failing due to a new warning it has introduced.
>> >>
>> >> Even old compilers can also throw some silly, false positive warnings
>> >> (which now turn into errors) because they are not as smart as new
>> >> ones.
>> >
>> > I agree with both of the above.  I do not think the pros-and-cons are
>> > in favor of forcing the developer bit to everybody, even though I am
>> > sympathetic to the desire to see people throw fewer bad changes that
>> > waste review bandwidth by not compiling or passing its own tests at
>> > us.
>>
>> I agree.
>>
>> Responding to the thread in general, perhaps people would like this more if
>> we turned DEVELOPER=1 DEVOPTS=no-error on by default?
>>
>> That's basically why I added it in 99f763baf5 ("Makefile: add a DEVOPTS to
>> suppress -Werror under DEVELOPER", 2018-04-14), because I wanted the
>> abilty to have verbose informative output without the build dying on some
>> older systems / compilers.
>>
>> It's fine and understandable if you're someone who's just building a package
>> on some older system if you get a bunch of compiler warnings, but more
>> annoying if you have to dig into how to disable a default -Werror.
>
> I am the platform maintainer for HPE NonStop and need to make sure I'm
> not packaging DEV builds to anyone

Perhaps confusingly, the DEVELOPER=1 flag in git is not like the
developer flag in some other projects. It's purely there to turn on
extra compiler warnings (by default, fatal), it doesn't e.g. turn on
extra asserts, tracing, or suppress stripping of the binaries.

So if we enabled some variant of it by default it would be fine to ship
the result of that to your users, e.g. I ship DEVELOPER=1 builds to
users.

Re: [PATCH v5 2/2] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Han-Wen Nienhuys

On Mon, Aug 6, 2018 at 7:21 PM Junio C Hamano  wrote:
>
> Han-Wen Nienhuys  writes:
>
> > The Git push process itself prints lots of non-actionable messages
> > (eg. bandwidth statistics, object counters for different phases of the
> > process), which obscures actionable error messages that servers may
>
> s/which obscures/which obscure/, as I think that "which" refers to
> "messages" above.
>
Done.

> > The highlighting is done on the client rather than server side, so
> > servers don't have to grow capabilities to understand terminal escape
> > codes and terminal state. It also consistent with the current state
> > where Git is control of the local display (eg. prefixing messages with
> > "remote: ").
>
> Yup.
>
> When we introduce "the receiving end asks messages to be sent with
> such and such decoration" protocol support, we would want a lot more
> than just painting messages in color, e.g. i18n, verbosity, and even
> "Hey, I am a script, send them in json".
>
> Until that happens, let's keep things simpler.  No i18n messages and
> no colored output over the wire.

Ack.

>
> > +color.remote::
> > + If set, keywords at the start of the line are highlighted. The
> > + keywords are "error", "warning", "hint" and "success", and are
> > + matched case-insensitively. Maybe set to `always`, `false` (or
> > + `never`) or `auto` (or `true`). If unset, then the value of
> > + `color.ui` is used (`auto` by default).
>
> Reads much better.
>
> I am still trying to find a concise way to help readers who saw a
> line that begins with "Warnings: foo bar bla" and accept that it is
> OK the early 7 chars are not painted.  "... case-insensitively and
> honoring word boundary" is the best I came up so far, but  I am
> afraid that is adding more words without hinting what I want to convey
> strongly enough, so I am not going to suggest that (at least not yet).

I would suggest that the phrase "keyword" implies a tokenization, so
I'd leave as is.

> > + for (i = 0; i < ARRAY_SIZE(keywords); i++) {
> > + strbuf_reset(&sb);
> > + strbuf_addf(&sb, "%s.%s", key, keywords[i].keyword);
>
> This design means future enhancement to allow more keywords will
> have to be done in the form of adding more "color.remote.",
> which means a few restrictions on them are cast in stone at the
> end-user facing design level, which we need to be careful about.
>
> Side note. I do not worry about the implementation level
> limitation at all.  For example, the current code will not
> allow end-users and projects to add new keywords to be
> painted, as it relies on the keywords[] static array we can
> see above.  But that implementation detail does not prevent
> us from improving it later to support more code in this
> codepath that notices "color.remote.failure" in config file
> and paint a line that begins with "failure:".
>
> Because the third-level "variable" name is case insensive, matching
> of any future keyword MUST be also done case insensitively.
>
> Also, as you mentioned elsewhere in this patch, the string that can
> be in the keyword MUST begin with an alphabetic and consists only of
> alphanumeric or dash.
>
> I do not think these limitations imposed by the design decision this
> patch is making are particularly bad ones---we just need to be aware
> of and firm about them.  When somebody else comes in the future and
> wants to recognize "F A I L" as a custom keyword case sensitively,
> we must be able to comfortably say "no" to it.
>
> Side note. We _could_ instead use "remotemsg..color"
> namespace, as the subsection names at the second level is a
> lot looser, but I do not think it is a good idea to use in
> this application, as they are case sensitive.
>
> The above discussion may deserve to be in the log message as a
> record to tell future ourselves why we decided to use
> color.remote..

I added a note about case insensitivity.

> > + if (git_config_get_string(sb.buf, &value))
> > + continue;
> > + if (color_parse(value, keywords[i].color))
> > + die(_("config value %s is not a color: %s"), sb.buf, 
> > value);
>
> That's a bit inconsistent, isn't it?  If the configuration is not
> even a string, we ignore it and continue, but if it is a string, we
> insist that it names a color and otherwise die?

Done; added test.

> > + * Optionally highlight one keyword in remote output if it appears at the 
> > start
> > + * of the line. This should be called for a single line only, which must be
> > + * passed as the first N characters of the SRC array.
> > + */
>
> Saying "MUST be" is cheap, but do we have anybody who polices that
> requirement?

rephrased.

> I think the code is OK without any assert() or BUG(), and that is
> because the design is "we just paint the keyword at the beginning of
> what the other side of the sideband

Re: [PATCH] Makefile: enable DEVELOPER by default

2018-08-06 Thread Ævar Arnfjörð Bjarmason

On Mon, Aug 06 2018, Jeff King wrote:

> On Mon, Aug 06, 2018 at 06:40:14PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> Responding to the thread in general, perhaps people would like this more
>> if we turned DEVELOPER=1 DEVOPTS=no-error on by default?
>>
>> That's basically why I added it in 99f763baf5 ("Makefile: add a DEVOPTS
>> to suppress -Werror under DEVELOPER", 2018-04-14), because I wanted the
>> abilty to have verbose informative output without the build dying on
>> some older systems / compilers.
>>
>> It's fine and understandable if you're someone who's just building a
>> package on some older system if you get a bunch of compiler warnings,
>> but more annoying if you have to dig into how to disable a default
>> -Werror.
>
> I had the impression that DEVELOPER=1 was allowed to set flags that old
> versions might not even know about. Hence they might actually barf, even
> without -Werror. Maybe that's better since the introduction of the
> detect-compiler script, though.

I misrecalled that -Wimadethis-up wouldn't error on e.g. GCC today in
the interest of forward-compatibility, but it does. So that changes
things.

Although we could today pick some set of flags greater than what we use
today if there's interest, and enough compiler compatibility.

> I do think we may have a skewed view of the population on this list.
> We're developers ourselves, and we interact with new developers that we
> want to help.  But there are masses of people[1] building Git who are
> _not_ developers, and want the default to be as robust as possible.
> They're probably not going to show up in this thread.

Indeed.

Re: [RFC PATCH v2 06/12] submodule--helper: add a '--stage' option to the 'config' sub command

2018-08-06 Thread Junio C Hamano

Antonio Ospite  writes:

>> I also do not see a reason why we want to stop referring to
>> .gitmodules explicitly by name.  We do not hide the fact that
>> in-tree .gitignore and .gitattributes files are used to hold the
>> metainformation about the project tree, saying that it is an
>> implementation detail.  Is there a good reason why .gitmodules
>> should be different from these other two?
>
> Not sure about that, but one difference I can see
> between .gitignore/.gitattributes and .gitmodules is that I got the
> impression that editing the latter by hand is strongly discouraged, if
> that is indeed the case a layer of indirection can make sense IMHO to
> make the actual file path less relevant.

I do not think we discourage hand editing of .gitmodules more than
others, say .gitignore; and I do not see a sane reason to do so.

"If you commit broken .gitmodules and let another person clone it,
submodules will not be checked out correctly" is *not* a sane
reason, as exactly the same thing can be said for incorrect checkout
of files with broken .gitattributes.

Quite honestly, I just want to get over with this minor detail that
won't help any scripts (after all submodule--helper is not meant to
be used by humans) and focus on other parts of the patch series.

Re: [PATCH] add a script to diff rendered documentation

2018-08-06 Thread Jeff King

On Fri, Aug 03, 2018 at 04:52:05PM -0400, Jeff King wrote:

> I wrote this up for my own use after our discussion in [1]. I'm not sure
> if it's too ugly for inclusion, or if it might be helpful to others.
> I've only just written it, but my plan is to try to run it on anything I
> submit to check the formatting. So it _seems_ useful and appears to
> work, but only after a few minutes of playing with it. :)
> 
> [1] 
> https://public-inbox.org/git/20180720223608.ge18...@genre.crustytoothpaste.net/

OK, people seem to think this is possibly useful, so here it is with a
little bit of polish based on earlier reviews:

 - we now default to $(getconf _NPROCESSORS_ONLN) parallelism, or 1 if
   that doesn't work (thanks for the getconf suggestion, Jonathan)

 - fixed formatting of usage message, per Eric's suggestion

 - put "make" as a noun in quotes ;)

I suspect the rendering step could be done a little more efficiently. In
addition to `man`, we run a shell, a `mkdir`, and a `mv` for each file.
Probably the whole thing could be done via a single perl script,
exec-ing man as appropriate. But we'd lose the parallelism, unless we do
something clever with threads. So I've left it for now, but if anybody
is interested in poking at it, go for it.

-- >8 --
Subject: [PATCH] add a script to diff rendered documentation

After making a change to the documentation, it's easy to
forget to check the rendered version to make sure it was
formatted as you intended. And simply doing a diff between
the two built versions is less trivial than you might hope:

  - diffing the roff or html output isn't particularly
readable; what we really care about is what the end user
will see

  - you have to tweak a few build variables to avoid
spurious differences (e.g., version numbers, build
times)

Let's provide a script that builds and installs the manpages
for two commits, renders the results using "man", and diffs
the result. Since this is time-consuming, we'll also do our
best to avoid repeated work, keeping intermediate results
between runs.

Some of this could probably be made a little less ugly if we
built support into Documentation/Makefile. But by relying
only on "make install-man" working, this script should work
for generating a diff between any two versions, whether they
include this script or not.

Signed-off-by: Jeff King 
---
 Documentation/.gitignore |   1 +
 Documentation/doc-diff   | 109 +++
 2 files changed, 110 insertions(+)
 create mode 100755 Documentation/doc-diff

diff --git a/Documentation/.gitignore b/Documentation/.gitignore
index c7096f11f1..3ef54e0adb 100644
--- a/Documentation/.gitignore
+++ b/Documentation/.gitignore
@@ -12,3 +12,4 @@ cmds-*.txt
 mergetools-*.txt
 manpage-base-url.xsl
 SubmittingPatches.txt
+tmp-doc-diff/
diff --git a/Documentation/doc-diff b/Documentation/doc-diff
new file mode 100755
index 00..5d5b243384
--- /dev/null
+++ b/Documentation/doc-diff
@@ -0,0 +1,109 @@
+#!/bin/sh
+
+OPTIONS_SPEC="\
+doc-diff [options]   [-- ]
+--
+j  parallel argument to pass to make
+f  force rebuild; do not rely on cached results
+"
+SUBDIRECTORY_OK=1
+. "$(git --exec-path)/git-sh-setup"
+
+parallel=
+force=
+while test $# -gt 0
+do
+   case "$1" in
+   -j)
+   parallel=${1#-j} ;;
+   -f)
+   force=t ;;
+   --)
+   shift; break ;;
+   *)
+   usage ;;
+   esac
+   shift
+done
+
+if test -z "$parallel"
+then
+   parallel=$(getconf _NPROCESSORS_ONLN 2>/dev/null)
+   if test $? != 0 || test -z "$parallel"
+   then
+   parallel=1
+   fi
+fi
+
+test $# -gt 1 || usage
+from=$1; shift
+to=$1; shift
+
+from_oid=$(git rev-parse --verify "$from") || exit 1
+to_oid=$(git rev-parse --verify "$to") || exit 1
+
+cd_to_toplevel
+tmp=Documentation/tmp-doc-diff
+
+if test -n "$force"
+then
+   rm -rf "$tmp"
+fi
+
+# We'll do both builds in a single worktree, which lets "make" reuse
+# results that don't differ between the two trees.
+if ! test -d "$tmp/worktree"
+then
+   git worktree add --detach "$tmp/worktree" "$from" &&
+   dots=$(echo "$tmp/worktree" | sed 's#[^/]*#..#g') &&
+   ln -s "$dots/config.mak" "$tmp/worktree/config.mak"
+fi
+
+# generate_render_makefile  
+generate_render_makefile () {
+   find "$1" -type f |
+   while read src
+   do
+   dst=$2/${src#$1/}
+   printf 'all:: %s\n' "$dst"
+   printf '%s: %s\n' "$dst" "$src"
+   printf '\t@echo >&2 "  RENDER $(notdir $@)" && \\\n'
+   printf '\tmkdir -p $(dir $@) && \\\n'
+   printf '\tMANWIDTH=80 man -l $< >$@+ && \\\n'
+   printf '\tmv $@+ $@\n'
+   done
+}
+
+# render_tree  
+render_tree () {
+   # Skip install-man entirely if we already have an installed directory.
+   # We can't rely on make here, since "install-man" unconditionally
+   # copies the files (spending

Re: [PATCH v5 2/2] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Junio C Hamano

Han-Wen Nienhuys  writes:

> The Git push process itself prints lots of non-actionable messages
> (eg. bandwidth statistics, object counters for different phases of the
> process), which obscures actionable error messages that servers may

s/which obscures/which obscure/, as I think that "which" refers to
"messages" above.

> The highlighting is done on the client rather than server side, so
> servers don't have to grow capabilities to understand terminal escape
> codes and terminal state. It also consistent with the current state
> where Git is control of the local display (eg. prefixing messages with
> "remote: ").

Yup.

When we introduce "the receiving end asks messages to be sent with
such and such decoration" protocol support, we would want a lot more
than just painting messages in color, e.g. i18n, verbosity, and even
"Hey, I am a script, send them in json".

Until that happens, let's keep things simpler.  No i18n messages and
no colored output over the wire.

> +color.remote::
> + If set, keywords at the start of the line are highlighted. The
> + keywords are "error", "warning", "hint" and "success", and are
> + matched case-insensitively. Maybe set to `always`, `false` (or
> + `never`) or `auto` (or `true`). If unset, then the value of
> + `color.ui` is used (`auto` by default).

Reads much better.

I am still trying to find a concise way to help readers who saw a
line that begins with "Warnings: foo bar bla" and accept that it is
OK the early 7 chars are not painted.  "... case-insensitively and
honoring word boundary" is the best I came up so far, but  I am
afraid that is adding more words without hinting what I want to convey
strongly enough, so I am not going to suggest that (at least not yet).

> diff --git a/help.h b/help.h
> index f8b15323a6..9eab6a3f89 100644
> --- a/help.h
> +++ b/help.h
> @@ -83,6 +83,7 @@ void list_config_color_diff_slots(struct string_list *list, 
> const char *prefix);
>  void list_config_color_grep_slots(struct string_list *list, const char 
> *prefix);
>  void list_config_color_interactive_slots(struct string_list *list, const 
> char *prefix);
>  void list_config_color_status_slots(struct string_list *list, const char 
> *prefix);
> +void list_config_color_sideband_slots(struct string_list *list, const char 
> *prefix);
>  void list_config_fsck_msg_ids(struct string_list *list, const char *prefix);
>  
>  #endif /* HELP_H */
> diff --git a/sideband.c b/sideband.c
> index 325bf0e974..239be2ec85 100644
> --- a/sideband.c
> +++ b/sideband.c
> @@ -1,6 +1,108 @@
>  #include "cache.h"
> +#include "color.h"
> +#include "config.h"
>  #include "pkt-line.h"
>  #include "sideband.h"
> +#include "help.h"
> +
> +struct keyword_entry {
> + /*
> +  * We use keyword as config key so it should be a single alphanumeric 
> word.
> +  */
> + const char *keyword;
> + char color[COLOR_MAXLEN];
> +};
> +
> +static struct keyword_entry keywords[] = {
> + { "hint",   GIT_COLOR_YELLOW },
> + { "warning",GIT_COLOR_BOLD_YELLOW },
> + { "success",GIT_COLOR_BOLD_GREEN },
> + { "error",  GIT_COLOR_BOLD_RED },
> +};
> +/* Returns a color setting (GIT_COLOR_NEVER, etc). */
> +static int use_sideband_colors(void)
> +{
> + static int use_sideband_colors_cached = -1;
> +
> + const char *key = "color.remote";
> + struct strbuf sb = STRBUF_INIT;
> + char *value;
> + int i;
> +
> + if (use_sideband_colors_cached >= 0)
> + return use_sideband_colors_cached;
> +
> + if (!git_config_get_string(key, &value)) {
> + use_sideband_colors_cached = git_config_colorbool(key, value);
> + } else if (!git_config_get_string("color.ui", &value)) {
> + use_sideband_colors_cached = git_config_colorbool("color.ui", 
> value);
> + } else {
> + use_sideband_colors_cached = GIT_COLOR_AUTO;
> + }
> +
> + for (i = 0; i < ARRAY_SIZE(keywords); i++) {
> + strbuf_reset(&sb);
> + strbuf_addf(&sb, "%s.%s", key, keywords[i].keyword);

This design means future enhancement to allow more keywords will
have to be done in the form of adding more "color.remote.",
which means a few restrictions on them are cast in stone at the
end-user facing design level, which we need to be careful about.

Side note. I do not worry about the implementation level
limitation at all.  For example, the current code will not
allow end-users and projects to add new keywords to be
painted, as it relies on the keywords[] static array we can
see above.  But that implementation detail does not prevent
us from improving it later to support more code in this
codepath that notices "color.remote.failure" in config file
and paint a line that begins with "failure:".

Because the third-level "variable" name is case insensive, matching
of any future keyword MUST be also done case insensitively.

Also, as you mentioned elsewh

Re: [PATCH] Makefile: enable DEVELOPER by default

2018-08-06 Thread Jonathan Nieder

Hi,

Jeff King wrote:

> I had the impression that DEVELOPER=1 was allowed to set flags that old
> versions might not even know about. Hence they might actually barf, even
> without -Werror.

Yes.

[...]
> We're developers ourselves, and we interact with new developers that we
> want to help.  But there are masses of people[1] building Git who are
> _not_ developers, and want the default to be as robust as possible.
> They're probably not going to show up in this thread.
>
> -Peff
>
> [1] I actually wonder how large that mass is. Clearly there are many
> orders of magnitude more users than there are developers. But I have
> no idea what percentage of them build from source versus using
> somebody else's binary package.

Relatedly, we need to think about the incentives these defaults
create.  Personally, I want *more* naive users to be building from
source, because then they are better able to test recent versions,
bisect, test my patches, etc.

As I hinted in my earlier reply, I think it would be best to try some
basic things to make DEVELOPER more visible first.  If that fails,
then we can revisit how to make this more drastic change in a way that
minimizes the harm (and I am not sure yet that that is possible).

Thanks,
Jonathan

RE: [PATCH] Makefile: enable DEVELOPER by default

2018-08-06 Thread Randall S. Becker

On August 6, 2018 1:02 PM, Peff wrote:
> To: Ævar Arnfjörð Bjarmason 
> Cc: Junio C Hamano ; Duy Nguyen
> ; Jonathan Nieder ; Stefan
> Beller ; Git Mailing List ; git-
> packag...@googlegroups.com; Han-Wen Nienhuys 
> Subject: Re: [PATCH] Makefile: enable DEVELOPER by default
> 
> On Mon, Aug 06, 2018 at 06:40:14PM +0200, Ævar Arnfjörð Bjarmason
> wrote:
> 
> > Responding to the thread in general, perhaps people would like this
> > more if we turned DEVELOPER=1 DEVOPTS=no-error on by default?
> >
> > That's basically why I added it in 99f763baf5 ("Makefile: add a
> > DEVOPTS to suppress -Werror under DEVELOPER", 2018-04-14), because I
> > wanted the abilty to have verbose informative output without the build
> > dying on some older systems / compilers.
> >
> > It's fine and understandable if you're someone who's just building a
> > package on some older system if you get a bunch of compiler warnings,
> > but more annoying if you have to dig into how to disable a default
> > -Werror.
> 
> I had the impression that DEVELOPER=1 was allowed to set flags that old
> versions might not even know about. Hence they might actually barf, even
> without -Werror. Maybe that's better since the introduction of the detect-
> compiler script, though.
> 
> I do think we may have a skewed view of the population on this list.
> We're developers ourselves, and we interact with new developers that we
> want to help.  But there are masses of people[1] building Git who are _not_
> developers, and want the default to be as robust as possible.
> They're probably not going to show up in this thread.
> 
> -Peff
> 
> [1] I actually wonder how large that mass is. Clearly there are many
> orders of magnitude more users than there are developers. But I have
> no idea what percentage of them build from source versus using
> somebody else's binary package.

One? 😉

Jokingly,
Randall

-- Brief whoami:
 NonStop developer since approximately 2112884442
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.

RE: [PATCH] Makefile: enable DEVELOPER by default

2018-08-06 Thread Randall S. Becker

On August 6, 2018 12:40 PM, Ævar Arnfjörð Bjarmason wrote:
> On Sat, Aug 04 2018, Junio C Hamano wrote:
> 
> > Duy Nguyen  writes:
> >
> >> On Sat, Aug 4, 2018 at 8:11 AM Jonathan Nieder 
> wrote:
> >>> My main concern is not about them but about other people building
> >>> from source in order to run (instead of to develop) Git, and by
> >>> extension, the people they go to for help when it doesn't work.  I
> >>> have lots of bitter experience of -Werror being a support headache
> >>> and leading to bad workarounds when someone upgrades their compiler
> >>> and the build starts failing due to a new warning it has introduced.
> >>
> >> Even old compilers can also throw some silly, false positive warnings
> >> (which now turn into errors) because they are not as smart as new
> >> ones.
> >
> > I agree with both of the above.  I do not think the pros-and-cons are
> > in favor of forcing the developer bit to everybody, even though I am
> > sympathetic to the desire to see people throw fewer bad changes that
> > waste review bandwidth by not compiling or passing its own tests at
> > us.
> 
> I agree.
> 
> Responding to the thread in general, perhaps people would like this more if
> we turned DEVELOPER=1 DEVOPTS=no-error on by default?
> 
> That's basically why I added it in 99f763baf5 ("Makefile: add a DEVOPTS to
> suppress -Werror under DEVELOPER", 2018-04-14), because I wanted the
> abilty to have verbose informative output without the build dying on some
> older systems / compilers.
> 
> It's fine and understandable if you're someone who's just building a package
> on some older system if you get a bunch of compiler warnings, but more
> annoying if you have to dig into how to disable a default -Werror.

Any idea when this is going to be in an official release, and exactly what the 
settings will be for "Not Developer". I assume DEVELOPER=0 and DEVOPTS=error, 
which is the current behaviour, correct? I am the platform maintainer for HPE 
NonStop and need to make sure I'm not packaging DEV builds to anyone, since I'm 
the only one doing this for the platform. It's another hoop, but hopefully not 
a bad one. The question is the best place to set this, assuming we are using 
Jenkins for our builds, and I'd rather keep the existing config.mak.uname the 
same, since at least it seems stable. We currently just run "make" for our 
build. So make arguments? And is making the change now non-destructive in 
preparation so that I don't forget when the time comes?

Cheers,
Randall

-- Brief whoami:
 NonStop developer since approximately 2112884442
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.

Re: [PATCH] Makefile: enable DEVELOPER by default

2018-08-06 Thread Jeff King

On Mon, Aug 06, 2018 at 06:40:14PM +0200, Ævar Arnfjörð Bjarmason wrote:

> Responding to the thread in general, perhaps people would like this more
> if we turned DEVELOPER=1 DEVOPTS=no-error on by default?
> 
> That's basically why I added it in 99f763baf5 ("Makefile: add a DEVOPTS
> to suppress -Werror under DEVELOPER", 2018-04-14), because I wanted the
> abilty to have verbose informative output without the build dying on
> some older systems / compilers.
> 
> It's fine and understandable if you're someone who's just building a
> package on some older system if you get a bunch of compiler warnings,
> but more annoying if you have to dig into how to disable a default
> -Werror.

I had the impression that DEVELOPER=1 was allowed to set flags that old
versions might not even know about. Hence they might actually barf, even
without -Werror. Maybe that's better since the introduction of the
detect-compiler script, though.

I do think we may have a skewed view of the population on this list.
We're developers ourselves, and we interact with new developers that we
want to help.  But there are masses of people[1] building Git who are
_not_ developers, and want the default to be as robust as possible.
They're probably not going to show up in this thread.

-Peff

[1] I actually wonder how large that mass is. Clearly there are many
orders of magnitude more users than there are developers. But I have
no idea what percentage of them build from source versus using
somebody else's binary package.

Re: [PATCH] add a script to diff rendered documentation

2018-08-06 Thread Jeff King

On Mon, Aug 06, 2018 at 08:16:47AM -0700, Junio C Hamano wrote:

> It's a bit dissapointing that we cannot express personal preference
> in config.mak ;-)

Try this:

diff --git a/Makefile b/Makefile
index e7994888e8..36bddff3be 100644
--- a/Makefile
+++ b/Makefile
@@ -1119,6 +1119,11 @@ ifdef DEVELOPER
 include config.mak.dev
 endif
 
+ifneq ($(or $(J_RECURSED), $(J), done), done)
+%:
+   $(MAKE) -j$(J) J_RECURSED=done $@
+else
+
 comma := ,
 empty :=
 space := $(empty) $(empty)
@@ -3034,3 +3039,4 @@ cover_db: coverage-report
 cover_db_html: cover_db
cover -report html -outputdir cover_db_html cover_db
 
+endif


Pretty nasty. ;)

-Peff

Re: [PATCH] add a script to diff rendered documentation

2018-08-06 Thread Jeff King

On Mon, Aug 06, 2018 at 08:01:00AM -0700, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> >   3. Default to number of CPUs, which is what a lot of other threading
> >  in Git does. Unfortunately getting that from the shell is
> >  non-trivial. I'm OK with $(grep -c ^processor /proc/cpuinfo), but
> >  people on non-Linux platforms would have to fill in their own
> >  implementation.
> 
> How about $(getconf _NPROCESSORS_ONLN)?  That's what Linux's
> scripts/coccicheck uses (apropos of a recent discussion :)).

Thanks, that's certainly less gross than grepping /proc/cpuinfo. getconf
is POSIX, but _NPROCESSORS_ONLN is not. According to [1], it works on
Linux and macOS, which is probably a reasonable start, though. This is,
after all, a script aimed at developers, and the worst case is that we
default back to 1.

-Peff

[1] 
https://stackoverflow.com/questions/42862559/one-liner-for-n-1-cores/42863212#42863212

Re: [PATCH v5 1/2] config: document git config getter return value

2018-08-06 Thread Han-Wen Nienhuys

On Mon, Aug 6, 2018 at 6:32 PM Junio C Hamano  wrote:
> patch and queue it on its own merit (not that I think the other one
> is not yet good enough---I haven't even looked at it yet ;-).

I discovered that emacs tabify will happily also add tabs in the
middle of the line, which breaks in the case of DUMB_SUFFIX.  I can
reroll a v6 if needed, though.

--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

Re: [PATCH] Makefile: enable DEVELOPER by default

2018-08-06 Thread Ævar Arnfjörð Bjarmason



On Sat, Aug 04 2018, Junio C Hamano wrote:

> Duy Nguyen  writes:
>
>> On Sat, Aug 4, 2018 at 8:11 AM Jonathan Nieder  wrote:
>>> My main concern is not about them but about other
>>> people building from source in order to run (instead of to develop)
>>> Git, and by extension, the people they go to for help when it doesn't
>>> work.  I have lots of bitter experience of -Werror being a support
>>> headache and leading to bad workarounds when someone upgrades their
>>> compiler and the build starts failing due to a new warning it has
>>> introduced.
>>
>> Even old compilers can also throw some silly, false positive warnings
>> (which now turn into errors) because they are not as smart as new
>> ones.
>
> I agree with both of the above.  I do not think the pros-and-cons
> are in favor of forcing the developer bit to everybody, even though
> I am sympathetic to the desire to see people throw fewer bad changes
> that waste review bandwidth by not compiling or passing its own
> tests at us.

I agree.

Responding to the thread in general, perhaps people would like this more
if we turned DEVELOPER=1 DEVOPTS=no-error on by default?

That's basically why I added it in 99f763baf5 ("Makefile: add a DEVOPTS
to suppress -Werror under DEVELOPER", 2018-04-14), because I wanted the
abilty to have verbose informative output without the build dying on
some older systems / compilers.

It's fine and understandable if you're someone who's just building a
package on some older system if you get a bunch of compiler warnings,
but more annoying if you have to dig into how to disable a default
-Werror.

Re: [BUG] 'git ls-files --no-exclude' segfault & co

2018-08-06 Thread Jeff King

On Mon, Aug 06, 2018 at 06:07:06PM +0200, SZEDER Gábor wrote:

> 'git ls-files' has the options '--exclude', '--exclude-from',
> '--exclude-per-directory', and '--exclude-standard', which all work
> fine.  However, it also allows the negated version of all these four
> options, and they definitely don't work very well:
> 
>   $ git ls-files --no-exclude
>   Segmentation fault
>   $ git ls-files --no-exclude-from
>   warning: unable to access '(null)': Bad address
>   fatal: cannot use (null) as an exclude file
> 
> And '--no-exclude-standard' has the same effect as
> '--exclude-standard', because its parseopt callback function
> option_parse_exclude_standard() doesn't bother to look at its 'unset'
> parameter.

I think --exclude-per-directory is fine, since it uses OPT_STRING().
Using "--no-exclude-per-directory" just means we'll cancel any
previously found option.

In an ideal world we'd perhaps do something useful with the negated
forms for the others, but I don't think the underlying code is set up to
do that (i.e., how do you undo "setup_standard_excludes()"). Possibly we
could switch to setting a flag (which could then be cleared), and
resolve the flags after parsing the whole command line. But often
options with callbacks list this have subtle user-visible timing effects
(e.g., that the command line options need to take effect in the order
they were found).

So unless somebody actually wants these negated forms to do something
useful, it's probably not worth the trouble and risk of regression.  But
we obviously should at least disallow them explicitly rather than
segfaulting.

I thought adding PARSE_OPT_NONEG like this would work:

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 88bb2019ad..9adee62358 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -547,15 +547,16 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
N_("show resolve-undo information")),
{ OPTION_CALLBACK, 'x', "exclude", &exclude_list, N_("pattern"),
N_("skip files matching pattern"),
-   0, option_parse_exclude },
+   PARSE_OPT_NONEG, option_parse_exclude },
{ OPTION_CALLBACK, 'X', "exclude-from", &dir, N_("file"),
N_("exclude patterns are read from "),
-   0, option_parse_exclude_from },
+   PARSE_OPT_NONEG, option_parse_exclude_from },
OPT_STRING(0, "exclude-per-directory", &dir.exclude_per_dir, 
N_("file"),
N_("read additional per-directory exclude patterns in 
")),
{ OPTION_CALLBACK, 0, "exclude-standard", &dir, NULL,
N_("add the standard git exclusions"),
-   PARSE_OPT_NOARG, option_parse_exclude_standard },
+   PARSE_OPT_NOARG | PARSE_OPT_NONEG,
+   option_parse_exclude_standard },
OPT_SET_INT_F(0, "full-name", &prefix_len,
  N_("make the output relative to the project top 
directory"),
  0, PARSE_OPT_NONEG),

But it actually does something quite interesting. Because of the NONEG
flag, we know that the user cannot mean "--no-exclude" itself. So our
liberal prefix-matching kicks in, and we treat it as
--no-exclude-per-directory. I.e., it becomes a silent noop and the user
gets no warning.

I think parse_options() may be overly liberal here, and we might want to
change that. But in the interim, it probably makes sense to just detect
the "unset" case in the callbacks, report an error, and return -1.

-Peff

[PATCH] Makefile: add missing dependency for command-list.h

2018-08-06 Thread Nguyễn Thái Ngọc Duy

Commit 3ac68a93fd (help: add --config to list all available config -
2018-05-26) makes generate-cmdlist.sh adds a new input source
config.txt but it's not a Makefile dependency. Any changes in
config.txt will not trigger command-list.h regeneration and the config
list in this file becomes outdated. Correct the dependency.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index bc4fc8eeab..47ca56d0e2 100644
--- a/Makefile
+++ b/Makefile
@@ -2037,7 +2037,7 @@ $(BUILT_INS): git$X
 
 command-list.h: generate-cmdlist.sh command-list.txt
 
-command-list.h: $(wildcard Documentation/git*.txt)
+command-list.h: $(wildcard Documentation/git*.txt) Documentation/config.txt
$(QUIET_GEN)$(SHELL_PATH) ./generate-cmdlist.sh command-list.txt >$@+ 
&& mv $@+ $@
 
 SCRIPT_DEFINES = $(SHELL_PATH_SQ):$(DIFF_SQ):$(GIT_VERSION):\
-- 
2.18.0.901.gd528ad0b1c.dirty

Re: [PATCH v5 1/2] config: document git config getter return value

2018-08-06 Thread Junio C Hamano

Han-Wen Nienhuys  writes:

> Signed-off-by: Han-Wen Nienhuys 
> ---
>  config.h | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)

I think this is good (I read all the comments on v3 and v4 in the
archive before I said this).  Let's split it off from the other
patch and queue it on its own merit (not that I think the other one
is not yet good enough---I haven't even looked at it yet ;-).

Thanks.

>
> diff --git a/config.h b/config.h
> index bb2f506b27..183b31ebf4 100644
> --- a/config.h
> +++ b/config.h
> @@ -188,9 +188,14 @@ struct config_set {
>  
>  extern void git_configset_init(struct config_set *cs);
>  extern int git_configset_add_file(struct config_set *cs, const char 
> *filename);
> -extern int git_configset_get_value(struct config_set *cs, const char *key, 
> const char **value);
>  extern const struct string_list *git_configset_get_value_multi(struct 
> config_set *cs, const char *key);
>  extern void git_configset_clear(struct config_set *cs);
> +
> +/*
> + * These functions return 1 if not found, and 0 if found, leaving the found
> + * value in the 'dest' pointer.
> + */
> +extern int git_configset_get_value(struct config_set *cs, const char *key, 
> const char **dest);
>  extern int git_configset_get_string_const(struct config_set *cs, const char 
> *key, const char **dest);
>  extern int git_configset_get_string(struct config_set *cs, const char *key, 
> char **dest);
>  extern int git_configset_get_int(struct config_set *cs, const char *key, int 
> *dest);

Re: [PATCH 2/3] t7406: avoid having git commands upstream of a pipe

2018-08-06 Thread Elijah Newren

On Mon, Aug 6, 2018 at 8:48 AM SZEDER Gábor  wrote:
> > @@ -922,7 +928,7 @@ test_expect_success 'submodule update clone shallow 
> > submodule' '
> >   sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules 
> > >.gitmodules.tmp &&
> >   mv -f .gitmodules.tmp .gitmodules &&
> >   git submodule update --init --depth=$commit_count &&
> > - test 1 = $(git -C submodule log --oneline | wc -l)
> > + test 1 = $(git -C submodule rev-list --count HEAD)
> >   )
> >  '
> >
> > @@ -938,7 +944,7 @@ test_expect_success 'submodule update clone shallow 
> > submodule outside of depth'
> >   test_i18ngrep "Direct fetching of that commit failed." actual 
> > &&
> >   git -C ../submodule config 
> > uploadpack.allowReachableSHA1InWant true &&
> >   git submodule update --init --depth=1 >actual &&
> > - test 1 = $(git -C submodule log --oneline | wc -l)
> > + test 1 = $(git -C submodule rev-list --count HEAD)
> >   )
> >  '
>
> These two hunks don't have the desired effect, because command
> substitutions used like this will hide the exit code anyway.  I'd
> suggest
>
>   git -C submodule log --oneline >out &&
>   test_line_count = 1 out
>
> instead, with the additional benefit of a nice error message on
> failure.

Ah, good point...and good suggestion.  I'll wait for further feedback
then resend with this change.

[BUG] 'git ls-files --no-exclude' segfault & co

2018-08-06 Thread SZEDER Gábor

'git ls-files' has the options '--exclude', '--exclude-from',
'--exclude-per-directory', and '--exclude-standard', which all work
fine.  However, it also allows the negated version of all these four
options, and they definitely don't work very well:

  $ git ls-files --no-exclude
  Segmentation fault
  $ git ls-files --no-exclude-from
  warning: unable to access '(null)': Bad address
  fatal: cannot use (null) as an exclude file

And '--no-exclude-standard' has the same effect as
'--exclude-standard', because its parseopt callback function
option_parse_exclude_standard() doesn't bother to look at its 'unset'
parameter.

Re: [PATCH v2 2/2] rebase --exec: make it work with --rebase-merges

2018-08-06 Thread Phillip Wood


On 06/08/18 16:23, Phillip Wood wrote:


Hi Johannes
On 06/08/18 10:52, Johannes Schindelin via GitGitGadget wrote:


From: Johannes Schindelin 

The idea of `--exec` is to append an `exec` call after each `pick`.

Since the introduction of fixup!/squash! commits, this idea was extended
to apply to "pick, possibly followed by a fixup/squash chain", i.e. an
exec would not be inserted between a `pick` and any of its corresponding
`fixup` or `squash` lines.

The current implementation uses a dirty trick to achieve that: it
assumes that there are only pick/fixup/squash commands, and then
*inserts* the `exec` lines before any `pick` but the first, and appends
a final one.

With the todo lists generated by `git rebase --rebase-merges`, this
simple implementation shows its problems: it produces the exact wrong
thing when there are `label`, `reset` and `merge` commands.

Let's change the implementation to do exactly what we want: look for
`pick` lines, skip any fixup/squash chains, and then insert the `exec`
line. Lather, rinse, repeat.

While at it, also add `exec` lines after `merge` commands, because they
are similar in spirit to `pick` commands: they add new commits.

Signed-off-by: Johannes Schindelin 
---
  sequencer.c  | 37 +++--
  t/t3430-rebase-merges.sh |  2 +-
  2 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 31038472f..ed2e694ff 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4244,10 +4244,9 @@ int sequencer_add_exec_commands(const char 
*commands)

  {
  const char *todo_file = rebase_path_todo();
  struct todo_list todo_list = TODO_LIST_INIT;
-    struct todo_item *item;
  struct strbuf *buf = &todo_list.buf;
  size_t offset = 0, commands_len = strlen(commands);
-    int i, first;
+    int i, insert;
  if (strbuf_read_file(&todo_list.buf, todo_file, 0) < 0)
  return error(_("could not read '%s'."), todo_file);
@@ -4257,19 +4256,37 @@ int sequencer_add_exec_commands(const char 
*commands)

  return error(_("unusable todo list: '%s'"), todo_file);
  }
-    first = 1;
-    /* insert  before every pick except the first one */
-    for (item = todo_list.items, i = 0; i < todo_list.nr; i++, item++) {
-    if (item->command == TODO_PICK && !first) {
-    strbuf_insert(buf, item->offset_in_buf + offset,
-  commands, commands_len);
+    /*
+ * Insert  after every pick. Here, fixup/squash chains
+ * are considered part of the pick, so we insert the commands 
*after*

+ * those chains if there are any.
+ */
+    insert = -1;
+    for (i = 0; i < todo_list.nr; i++) {
+    enum todo_command command = todo_list.items[i].command;
+
+    if (insert >= 0) {
+    /* skip fixup/squash chains */
+    if (command == TODO_COMMENT)
+    continue;


insert is not updated so if the next command is not a fixup the exec 
line will be inserted before the comment.



+    else if (is_fixup(command)) {
+    insert = i + 1;
+    continue;
+    }
+    strbuf_insert(buf,
+  todo_list.items[insert].offset_in_buf +
+  offset, commands, commands_len);
  offset += commands_len;
+    insert = -1;
  }
-    first = 0;
+
+    if (command == TODO_PICK || command == TODO_MERGE)
+    insert = i + 1;
  }
  /* append final  */
-    strbuf_add(buf, commands, commands_len);
+    if (insert >= 0 || !offset)
+    strbuf_add(buf, commands, commands_len);


Having read your other message about this patch I think if you wanted to 
fix the position of the final exec in the case where the todo list ends 
with a comment you could do something like


 if (insert >= 0)
     strbuf_insert(buf,
   todo_list.items[insert].offset_in_buf +
   offset, commands, commands_len);
 else


Oops that should be 'else if (!offset)'


     strbuf_add(buf, commands, commands_len);

I'm not sure it matters that much though

The rest of this patch looks fine to me

Best Wishes

Phillip


  i = write_message(buf->buf, buf->len, todo_file, 0);
  todo_list_release(&todo_list);
diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index 0bf5eaa37..90ae613e2 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -363,7 +363,7 @@ test_expect_success 'octopus merges' '
  EOF
  '
-test_expect_failure 'with --autosquash and --exec' '
+test_expect_success 'with --autosquash and --exec' '
  git checkout -b with-exec H &&
  echo Booh >B.t &&
  test_tick &&

Re: [PATCH v3 0/4] Speed up unpack_trees()

2018-08-06 Thread Duy Nguyen

On Mon, Aug 6, 2018 at 5:48 PM Junio C Hamano  wrote:
> > I've also checked about the lookahead thing in unpack_trees() to see
> > if we accidentally break something there, which is my biggest worry.
> > See [1] and [2] for context, but I believe since we can't have D/F
> > conflicts, the situation where lookahead is needed will not occur. So
> > we should be safe.
>
> Isn't this about branch switching, where the currently checked out
> branch may have a regular file 't' and checking out another branch
> that has directory 't' in it (or vice versa, possibly with the index
> having either a regular file 't' or requiring 't' to be a diretory
> by having a blob 't/1' in it)?

We require the unpacked entry from all input trees to be a tree
objects (the dirmask thing), so if one tree has 't' as a file,
all_trees_same_as_cache_tree() should return false and not trigger
this optimization. Same thing for the index, if it has the file 't',
then we should not have the cache-tree at path 't' and the
optimization is skipped as well.

So yes branch switching definitely can have d/f conflicts, but we
should never ever accidentally run this new optimization when that
happens.

> The log messge of [1] talks about
> walking three trees together with the index, but even if we limit us
> to two-tree walk, I do not think that the picture fundamentally
> changes.  So I am not sure how we can confidently say "we can't have
> D/F".  I'd need to block a solid time to take a look at the patches.

Yes please :)
-- 
Duy

Re: [RFC PATCH 2/5] Add delta-islands.{c,h}

2018-08-06 Thread Duy Nguyen

On Sun, Aug 5, 2018 at 8:53 PM Christian Couder
 wrote:
>
> On Sun, Jul 22, 2018 at 10:50 AM, Duy Nguyen  wrote:
> > On Sun, Jul 22, 2018 at 7:51 AM Christian Couder
> >  wrote:
>
> >> diff --git a/pack-objects.h b/pack-objects.h
> >> index edf74dabdd..8eecd67991 100644
> >> --- a/pack-objects.h
> >> +++ b/pack-objects.h
> >> @@ -100,6 +100,10 @@ struct object_entry {
> >> unsigned type_:TYPE_BITS;
> >> unsigned no_try_delta:1;
> >> unsigned in_pack_type:TYPE_BITS; /* could be delta */
> >> +
> >> +   unsigned int tree_depth; /* should be repositioned for packing? */
> >> +   unsigned char layer;
> >> +
> >
> > This looks very much like an optional feature. To avoid increasing
> > pack-objects memory usage for common case, please move this to struct
> > packing_data.
>
> As you can see the patch 6/6 (in the v2 of this patch series that I
> just sent) moves `unsigned int tree_depth` from 'struct object_entry'
> to 'struct packing_data'. I am not sure that I did it right and that
> it is worth it though as it is a bit more complex.

It is a bit more complex than I expected. But I think if you go with
Jeff's suggestion (i.e. think of the new tree_depth array an extension
of objects array) it's a bit simpler: you access both arrays using the
same index, both arrays should have the same size and be realloc'd at
the same time in packlist_alloc().

Is it worth it? The "pahole" comment in this file is up to date. We
use 80 bytes per object. This series makes the struct 88 bytes (I've
just rerun pahole). On linux repo with 12M objects, "git pack-objects
--all" needs extra 96MB memory even if this feature is not used. So
yes I still think it's worth moving these fields out of struct
object_entry.
-- 
Duy

Re: [PATCH v3 0/4] Speed up unpack_trees()

2018-08-06 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> This is a minor update to address Ben's comments and add his
> measurements in the commit message of 2/4 for the record.

Yay.

> I've also checked about the lookahead thing in unpack_trees() to see
> if we accidentally break something there, which is my biggest worry.
> See [1] and [2] for context, but I believe since we can't have D/F
> conflicts, the situation where lookahead is needed will not occur. So
> we should be safe.

Isn't this about branch switching, where the currently checked out
branch may have a regular file 't' and checking out another branch
that has directory 't' in it (or vice versa, possibly with the index
having either a regular file 't' or requiring 't' to be a diretory
by having a blob 't/1' in it)?  The log messge of [1] talks about
walking three trees together with the index, but even if we limit us
to two-tree walk, I do not think that the picture fundamentally
changes.  So I am not sure how we can confidently say "we can't have
D/F".  I'd need to block a solid time to take a look at the patches.

> [1] da165f470e (unpack-trees.c: prepare for looking ahead in the index - 
> 2010-01-07)
> [2] 730f72840c (unpack-trees.c: look ahead in the index - 2009-09-20)

Thanks.

Re: [PATCH 2/3] t7406: avoid having git commands upstream of a pipe

2018-08-06 Thread SZEDER Gábor




> When a git command is on the left side of a pipe, the pipe will swallow
> its exit status, preventing us from detecting failures in said commands.
> Restructure the tests to put the output in a temporary file to avoid
> this problem.
> 
> Signed-off-by: Elijah Newren 
> ---
>  t/t7406-submodule-update.sh | 24 +++-
>  1 file changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> index e2405c96b5..c8971abd07 100755
> --- a/t/t7406-submodule-update.sh
> +++ b/t/t7406-submodule-update.sh

> @@ -922,7 +928,7 @@ test_expect_success 'submodule update clone shallow 
> submodule' '
>   sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules 
> >.gitmodules.tmp &&
>   mv -f .gitmodules.tmp .gitmodules &&
>   git submodule update --init --depth=$commit_count &&
> - test 1 = $(git -C submodule log --oneline | wc -l)
> + test 1 = $(git -C submodule rev-list --count HEAD)
>   )
>  '
> 
> @@ -938,7 +944,7 @@ test_expect_success 'submodule update clone shallow 
> submodule outside of depth'
>   test_i18ngrep "Direct fetching of that commit failed." actual &&
>   git -C ../submodule config uploadpack.allowReachableSHA1InWant 
> true &&
>   git submodule update --init --depth=1 >actual &&
> - test 1 = $(git -C submodule log --oneline | wc -l)
> + test 1 = $(git -C submodule rev-list --count HEAD)
>   )
>  '

These two hunks don't have the desired effect, because command
substitutions used like this will hide the exit code anyway.  I'd
suggest

  git -C submodule log --oneline >out &&
  test_line_count = 1 out

instead, with the additional benefit of a nice error message on
failure.

Re: Fetch on submodule update

2018-08-06 Thread Robert Dailey

On Mon, Aug 6, 2018 at 10:41 AM, Jonathan Nieder  wrote:
> Robert Dailey wrote:
>
>>  Automatic would be
>> great if submodules were treated as integrated in a similar manner to
>> subtree, but it's not there. I wasn't aware that `submodule update`
>> did a fetch, because sometimes if I do that, I get errors saying SHA1
>> is not present (because the submodule did not get fetched). Granted I
>> haven't seen this in a while, so maybe the fetch on submodule update
>> is a newer feature. Do you know what triggers the fetch on update
>> without --remote? Is it the missing SHA1 that triggers it, or is it
>> fetching unconditionally?
>
> Thanks for this and the rest of the context you sent.  It's very
> helpful.
>
> The relevant code in git-submodule.sh is
>
> # Run fetch only if $sha1 isn't present or it
> # is not reachable from a ref.
> is_tip_reachable "$sm_path" "$sha1" ||
> fetch_in_submodule "$sm_path" $depth ||
> say "$(eval_gettext "Unable to fetch in submodule path 
> '\$displaypath'")"
>
> # Now we tried the usual fetch, but $sha1 may
> # not be reachable from any of the refs
> is_tip_reachable "$sm_path" "$sha1" ||
> fetch_in_submodule "$sm_path" $depth "$sha1" ||
> die "$(eval_gettext "Fetched in submodule path '\$displaypath', but 
> it did not contain \$sha1. Direct fetching of that commit failed.")"
>
> The fallback to fetching by SHA-1 was introduced in v2.8.0-rc0~9^2
> (submodule: try harder to fetch needed sha1 by direct fetching sha1,
> 2018-02-23).

Yep, that's the root cause; I was basing my concerns on a legacy
issue. I just had avoided using `update` when I expected a fetch, so I
never saw the issue again, and thus didn't realize it was corrected.
Very helpful. Thanks again!

Re: Fetch on submodule update

2018-08-06 Thread Jonathan Nieder

Robert Dailey wrote:

>  Automatic would be
> great if submodules were treated as integrated in a similar manner to
> subtree, but it's not there. I wasn't aware that `submodule update`
> did a fetch, because sometimes if I do that, I get errors saying SHA1
> is not present (because the submodule did not get fetched). Granted I
> haven't seen this in a while, so maybe the fetch on submodule update
> is a newer feature. Do you know what triggers the fetch on update
> without --remote? Is it the missing SHA1 that triggers it, or is it
> fetching unconditionally?

Thanks for this and the rest of the context you sent.  It's very
helpful.

The relevant code in git-submodule.sh is

# Run fetch only if $sha1 isn't present or it
# is not reachable from a ref.
is_tip_reachable "$sm_path" "$sha1" ||
fetch_in_submodule "$sm_path" $depth ||
say "$(eval_gettext "Unable to fetch in submodule path 
'\$displaypath'")"

# Now we tried the usual fetch, but $sha1 may
# not be reachable from any of the refs
is_tip_reachable "$sm_path" "$sha1" ||
fetch_in_submodule "$sm_path" $depth "$sha1" ||
die "$(eval_gettext "Fetched in submodule path '\$displaypath', but it 
did not contain \$sha1. Direct fetching of that commit failed.")"

The fallback to fetching by SHA-1 was introduced in v2.8.0-rc0~9^2
(submodule: try harder to fetch needed sha1 by direct fetching sha1,
2018-02-23).

Jonathan

Re: [PATCH] config.txt: reorder blame stuff to keep config keys sorted

2018-08-06 Thread Duy Nguyen

On Mon, Aug 6, 2018 at 7:59 AM Stefan Beller  wrote:
>
> On Fri, Aug 3, 2018 at 11:25 PM Nguyễn Thái Ngọc Duy  
> wrote:
> >
> > The color group in config.txt is actually sorted but changes in
> > sb/blame-color broke this. Reorder color.blame.* and move
> > blame.coloring back to the rest of blame.* (and reorder that group too
> > while we're there)
> >
> > Signed-off-by: Nguyễn Thái Ngọc Duy 
>
> FWIW:
> Acked-by: Stefan Beller 
>
> Are there more hidden sorts in the config.txt that I am not aware of?

I think it's "mostly sorted" now. It's easier to see when you do

git grep '::$' Documentation/config.txt
-- 
Duy

Re: [PATCH 1/4] line-log: demonstrate a bug with nearly-overlapping ranges

2018-08-06 Thread Jonathan Nieder

Jonathan Nieder wrote:
> Johannes Schindelin wrote:

>> It is really, really important to realize how valuable it is to have the
>> regression test as an individual patch that can be used to verify that
>> there is a bug, to pinpoint where it was introduced, to test alternative
>> fixes, to keep records separate, and I could go on and on and on. Please
>> do not ignore these very good reasons, and please refrain from
>> recommending such conflation in the future.
>
> If you want to propose changing the project's style to always separate
> tests from the patch that fixes a bug, that's a discussion we can have,
> in a separate thread.

By the way, don't get me wrong: I am sympathetic to the motivation for
such a change.

I have worked on projects where tests were in a separate repository
from the application.  There are costs and benefits.  To support the
kind of use cases you're describing, the tests would include
conditionals to allow running on old versions of the application (the
test expectations were based on the latest version, but setup code
sometimes had to accomodate differences between versions).  It worked
okay and was probably worth it for that project, despite the added
friction.  It's not clear it would be worth it for Git.

Jonathan

Re: t5570-git-daemon fails with SIGPIPE on OSX

2018-08-06 Thread SZEDER Gábor

[Resending with Clemens' last used email address.
Clemens, please consider sending a patch to update our .mailmap file.]


On Mon, Aug 6, 2018 at 5:11 PM SZEDER Gábor  wrote:
>
> Travis CI changed its default OSX image to use XCode 9.4 on 2018-07-31
> [1].  Since then OSX build jobs fail rather frequently because of a
> SIGPIPE in the tests 'fetch notices corrupt pack' or 'fetch notices
> corrupt idx' in 't5570-git-daemon.sh' [2].  I think this is a symptom
> a real bug in Git affecting other platforms as well, but these tests
> are too lax to catch it.
>
> What it boils down to is this sequence:
>
>   - The test first prepares a repository containing a corrupt pack,
> ready to be server via 'git daemon'.
>
>   - Then the test runs 'test_must_fail git fetch ', which connects
> to 'git daemon', which forks 'git upload-pack', which then
> advertises refs (only HEAD) and capabilities.  So far so good.
>
>   - 'git fetch' eventually calls fetch-pack.c:find_common().  The
> first half of this function assembles a request consisting of a
> want and a flush pkt-line, and sends it via a send_request() call.
>
> At this point the scheduling becomes important: let's suppose that
> fetch is slow and upload-pack is fast.
>
>   - 'git upload-pack' receives the request, parses the want line,
> notices the corrupt pack, responds with an 'ERR upload-pack: not
> our ref' pkt-line, and die()s right away.
>
>   - 'git fetch' finally approaches the end of the function, where it
> attempts to send a done pkt-line via another send_request() call
> through the now closing TCP socket.
>
>   - What happens now seems to depend on the platform:
>
> - On Linux, both on my machine and on Travis CI, it shows textbook
>   example behaviour: write() returns with error and sets errno to
>   ECONNRESET.  Since it happens in write_or_die(), 'git fetch'
>   die()s with 'fatal: write error: Connection reset by peer', and
>   doesn't show the error send by 'git upload-pack'; how could it,
>   it doesn't even get as far to receive upload-pack's ERR
>   pkt-line.
>
>   The test only checks that 'git fetch' fails, but it doesn't
>   check whether it failed with the right error message, so the
>   test still succeeds.  Had it checked the error message as well,
>   we most likely had noticed this issue already, it doesn't happen
>   all that rarely.
>
> - On the new OSX images with XCode 9.4 on Travis CI the write()
>   triggers SIGPIPE right away, and 'test_must_fail' notices it and
>   fails the test.  I couldn't see any sign of an ECONNRESET or any
>   other error that we could act upon to avoid the SIGPIPE.
>
> - On OSX with XCode 9.2 on Travis CI there is neither SIGPIPE, nor
>   ECONNRESET, but sending the request actually succeeds even
>   though there is no process on the other end of the socket
>   anymore.  'git fetch' then simply continues execution, reads and
>   parses the ERR pkt-line, and then dies()s with 'fatal: remote
>   error: upload-pack: not our ref'.  So, on the face of it, it
>   shows the desired behaviour, but I have no idea how that write()
>   could succeed instead of returning error.
>
> I don't know what happens on a real Mac as I don't have access to one;
> I figured out all the above by enabling packet tracing, adding a
> couple of well placed tracing printf() and sleep() calls, running a
> bunch of builds on Travis CI, and looking through their logs.  But
> without access to a debugger and netstat and what not I can't really
> go any further.  So I would now happily pass the baton to those who
> have a Mac and know a thing or two about its porting issues to first
> check whether OSX on a real Mac shows the same behaviour as it does in
> Travis CI's virtualized(?) environment.  And then they can pass the
> baton to those who know all the intricacies of the pack protocol and
> its implementation to decide what to do with this issue.
>
> For a mostly reliable reproduction recipe you might want to fetch this
> branch:
>
>   https://github.com/szeder/git t5570-git-daemon-sigpipe
>
> and then run 'make && cd t && ./t5570-git-daemon.sh -v -x'
>
>
> Have fun! ;)
>
>
> 1 - https://blog.travis-ci.com/2018-07-19-xcode9-4-default-announce
>
> 2 - On git.git's master:
> https://travis-ci.org/git/git/jobs/411517552#L2717

[PATCH 1/3] t7406: simplify by using diff --name-only instead of diff --raw

2018-08-06 Thread Elijah Newren

We can get rid of some quoted tabs and make a few tests slightly easier
to read and edit by just asking for the names of the files modified,
since that's all these tests were interested in anyway.

Signed-off-by: Elijah Newren 
---
 t/t7406-submodule-update.sh | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index f604ef7a72..e2405c96b5 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -579,9 +579,9 @@ test_expect_success 'submodule update - update=none in 
.git/config' '
  git checkout master &&
  compare_head
 ) &&
-git diff --raw | grep "submodule" &&
+git diff --name-only | grep submodule &&
 git submodule update &&
-git diff --raw | grep "submodule" &&
+git diff --name-only | grep submodule &&
 (cd submodule &&
  compare_head
 ) &&
@@ -597,9 +597,9 @@ test_expect_success 'submodule update - update=none in 
.git/config but --checkou
  git checkout master &&
  compare_head
 ) &&
-git diff --raw | grep "submodule" &&
+git diff --name-only | grep submodule &&
 git submodule update --checkout &&
-test_must_fail git diff --raw \| grep "submodule" &&
+test_must_fail git diff --name-only \| grep submodule &&
 (cd submodule &&
  test_must_fail compare_head
 ) &&
-- 
2.18.0.548.gd57a518419

[PATCH 3/3] t7406: make a test_must_fail call fail for the right reason

2018-08-06 Thread Elijah Newren

A test making use of test_must_fail was failing like this:
  fatal: ambiguous argument '|': unknown revision or path not in the working 
tree.
when the intent was to verify that a specific string was not found
in the output of the git diff command, i.e. that grep returned
non-zero.  Fix the test to do that.

Signed-off-by: Elijah Newren 
---
 t/t7406-submodule-update.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index c8971abd07..f65049ec73 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -603,7 +603,8 @@ test_expect_success 'submodule update - update=none in 
.git/config but --checkou
 git diff --name-only >out &&
 grep submodule out &&
 git submodule update --checkout &&
-test_must_fail git diff --name-only \| grep submodule &&
+git diff --name-only >out &&
+! grep submodule out &&
 (cd submodule &&
  test_must_fail compare_head
 ) &&
-- 
2.18.0.548.gd57a518419

[PATCH 0/2] Simple fixes to t7406

2018-08-06 Thread Elijah Newren

Changes since v1:
  - Simplify multiple tests using diff --name-only instead of diff --raw;
these tests are only interested in which file was modified anyway.
(Suggested by Junio)
  - Avoid putting git commands upstream of a pipe, since we don't run under
set -o pipefail (Suggested by Eric)
  - Avoid using test_must_fail for system binaries (Pointed out by
both Eric and Junio)

Elijah Newren (2):
  t7406: simplify by using diff --name-only instead of diff --raw
  t7406: avoid having git commands upstream of a pipe
  t7406: make a test_must_fail call fail for the right reason

 t/t7406-submodule-update.sh | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

-- 
2.18.0.548.gd57a518419

[PATCH 2/3] t7406: avoid having git commands upstream of a pipe

2018-08-06 Thread Elijah Newren

When a git command is on the left side of a pipe, the pipe will swallow
its exit status, preventing us from detecting failures in said commands.
Restructure the tests to put the output in a temporary file to avoid
this problem.

Signed-off-by: Elijah Newren 
---
 t/t7406-submodule-update.sh | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index e2405c96b5..c8971abd07 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -481,7 +481,8 @@ test_expect_success 'recursive submodule update - command 
in .git/config catches
 
 test_expect_success 'submodule init does not copy command into .git/config' '
(cd super &&
-H=$(git ls-files -s submodule | cut -d" " -f2) &&
+git ls-files -s submodule >out &&
+H=$(cut -d" " -f2 out) &&
 mkdir submodule1 &&
 git update-index --add --cacheinfo 16 $H submodule1 &&
 git config -f .gitmodules submodule.submodule1.path submodule1 &&
@@ -579,9 +580,11 @@ test_expect_success 'submodule update - update=none in 
.git/config' '
  git checkout master &&
  compare_head
 ) &&
-git diff --name-only | grep submodule &&
+git diff --name-only >out &&
+grep submodule out &&
 git submodule update &&
-git diff --name-only | grep submodule &&
+git diff --name-only >out &&
+grep submodule out &&
 (cd submodule &&
  compare_head
 ) &&
@@ -597,7 +600,8 @@ test_expect_success 'submodule update - update=none in 
.git/config but --checkou
  git checkout master &&
  compare_head
 ) &&
-git diff --name-only | grep submodule &&
+git diff --name-only >out &&
+grep submodule out &&
 git submodule update --checkout &&
 test_must_fail git diff --name-only \| grep submodule &&
 (cd submodule &&
@@ -885,7 +889,8 @@ test_expect_success 'submodule update properly revives a 
moved submodule' '
 H=$(git rev-parse --short HEAD) &&
 git commit -am "pre move" &&
 H2=$(git rev-parse --short HEAD) &&
-git status | sed "s/$H/XXX/" >expect &&
+git status >out &&
+sed "s/$H/XXX/" out >expect &&
 H=$(cd submodule2 && git rev-parse HEAD) &&
 git rm --cached submodule2 &&
 rm -rf submodule2 &&
@@ -894,7 +899,8 @@ test_expect_success 'submodule update properly revives a 
moved submodule' '
 git config -f .gitmodules submodule.submodule2.path "moved/sub module" 
&&
 git commit -am "post move" &&
 git submodule update &&
-git status | sed "s/$H2/XXX/" >actual &&
+git status > out &&
+sed "s/$H2/XXX/" out >actual &&
 test_cmp expect actual
)
 '
@@ -912,7 +918,7 @@ test_expect_success SYMLINKS 'submodule update can handle 
symbolic links in pwd'
 
 test_expect_success 'submodule update clone shallow submodule' '
test_when_finished "rm -rf super3" &&
-   first=$(git -C cloned submodule status submodule |cut -c2-41) &&
+   first=$(git -C cloned rev-parse HEAD:submodule) &&
second=$(git -C submodule rev-parse HEAD) &&
commit_count=$(git -C submodule rev-list --count $first^..$second) &&
git clone cloned super3 &&
@@ -922,7 +928,7 @@ test_expect_success 'submodule update clone shallow 
submodule' '
sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules 
>.gitmodules.tmp &&
mv -f .gitmodules.tmp .gitmodules &&
git submodule update --init --depth=$commit_count &&
-   test 1 = $(git -C submodule log --oneline | wc -l)
+   test 1 = $(git -C submodule rev-list --count HEAD)
)
 '
 
@@ -938,7 +944,7 @@ test_expect_success 'submodule update clone shallow 
submodule outside of depth'
test_i18ngrep "Direct fetching of that commit failed." actual &&
git -C ../submodule config uploadpack.allowReachableSHA1InWant 
true &&
git submodule update --init --depth=1 >actual &&
-   test 1 = $(git -C submodule log --oneline | wc -l)
+   test 1 = $(git -C submodule rev-list --count HEAD)
)
 '
 
-- 
2.18.0.548.gd57a518419

Re: [PATCH v2 2/2] rebase --exec: make it work with --rebase-merges

2018-08-06 Thread Phillip Wood


Hi Johannes
On 06/08/18 10:52, Johannes Schindelin via GitGitGadget wrote:


From: Johannes Schindelin 

The idea of `--exec` is to append an `exec` call after each `pick`.

Since the introduction of fixup!/squash! commits, this idea was extended
to apply to "pick, possibly followed by a fixup/squash chain", i.e. an
exec would not be inserted between a `pick` and any of its corresponding
`fixup` or `squash` lines.

The current implementation uses a dirty trick to achieve that: it
assumes that there are only pick/fixup/squash commands, and then
*inserts* the `exec` lines before any `pick` but the first, and appends
a final one.

With the todo lists generated by `git rebase --rebase-merges`, this
simple implementation shows its problems: it produces the exact wrong
thing when there are `label`, `reset` and `merge` commands.

Let's change the implementation to do exactly what we want: look for
`pick` lines, skip any fixup/squash chains, and then insert the `exec`
line. Lather, rinse, repeat.

While at it, also add `exec` lines after `merge` commands, because they
are similar in spirit to `pick` commands: they add new commits.

Signed-off-by: Johannes Schindelin 
---
  sequencer.c  | 37 +++--
  t/t3430-rebase-merges.sh |  2 +-
  2 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 31038472f..ed2e694ff 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4244,10 +4244,9 @@ int sequencer_add_exec_commands(const char *commands)
  {
const char *todo_file = rebase_path_todo();
struct todo_list todo_list = TODO_LIST_INIT;
-   struct todo_item *item;
struct strbuf *buf = &todo_list.buf;
size_t offset = 0, commands_len = strlen(commands);
-   int i, first;
+   int i, insert;
  
  	if (strbuf_read_file(&todo_list.buf, todo_file, 0) < 0)

return error(_("could not read '%s'."), todo_file);
@@ -4257,19 +4256,37 @@ int sequencer_add_exec_commands(const char *commands)
return error(_("unusable todo list: '%s'"), todo_file);
}
  
-	first = 1;

-   /* insert  before every pick except the first one */
-   for (item = todo_list.items, i = 0; i < todo_list.nr; i++, item++) {
-   if (item->command == TODO_PICK && !first) {
-   strbuf_insert(buf, item->offset_in_buf + offset,
- commands, commands_len);
+   /*
+* Insert  after every pick. Here, fixup/squash chains
+* are considered part of the pick, so we insert the commands *after*
+* those chains if there are any.
+*/
+   insert = -1;
+   for (i = 0; i < todo_list.nr; i++) {
+   enum todo_command command = todo_list.items[i].command;
+
+   if (insert >= 0) {
+   /* skip fixup/squash chains */
+   if (command == TODO_COMMENT)
+   continue;


insert is not updated so if the next command is not a fixup the exec 
line will be inserted before the comment.



+   else if (is_fixup(command)) {
+   insert = i + 1;
+   continue;
+   }
+   strbuf_insert(buf,
+ todo_list.items[insert].offset_in_buf +
+ offset, commands, commands_len);
offset += commands_len;
+   insert = -1;
}
-   first = 0;
+
+   if (command == TODO_PICK || command == TODO_MERGE)
+   insert = i + 1;
}
  
  	/* append final  */

-   strbuf_add(buf, commands, commands_len);
+   if (insert >= 0 || !offset)
+   strbuf_add(buf, commands, commands_len);


Having read your other message about this patch I think if you wanted to 
fix the position of the final exec in the case where the todo list ends 
with a comment you could do something like


if (insert >= 0)
strbuf_insert(buf,
  todo_list.items[insert].offset_in_buf +
  offset, commands, commands_len);
else
strbuf_add(buf, commands, commands_len);

I'm not sure it matters that much though

The rest of this patch looks fine to me

Best Wishes

Phillip

  
  	i = write_message(buf->buf, buf->len, todo_file, 0);

todo_list_release(&todo_list);
diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index 0bf5eaa37..90ae613e2 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -363,7 +363,7 @@ test_expect_success 'octopus merges' '
EOF
  '
  
-test_expect_failure 'with --autosquash and --exec' '

+test_expect_success 'with --autosquash and --exec' '
git checkout -b with-exec H &&
echo Booh >B.t &&
test_tick &&

Re: concurrent access to multiple local git repos is error prone

2018-08-06 Thread Duy Nguyen

On Mon, Aug 6, 2018 at 9:38 AM Alexander Mills
 wrote:
> Yeah this concurrency problem is real. Not only does it happen with
> `git status` the same thing happens with `git rev-parse
> --show-toplevel`.

"git rev-parse --show-toplevel" having this same problem helps. This
command should never make any update in the repository, not even
taking any lock and very basic access to the repository (I think it
just needs to resolve HEAD, it does not even access object database).
It's so "simple" [1] that makes me think this problem is something not
really related to git. Perhaps the process is terminated abnormally
because it hits some system limits?

[1] well the repo discovery steps are a bit messy  but I think it's
unlikely we have any racing issues there.

> What happens is that I get no stdout when repos are accessed
> concurrently (and no stderr).
-- 
Duy

Re: [PATCH] add a script to diff rendered documentation

2018-08-06 Thread Junio C Hamano

Jeff King  writes:

> On Mon, Aug 06, 2018 at 09:39:55AM -0400, Jeff King wrote:
>
>>   3. Default to number of CPUs, which is what a lot of other threading
>>  in Git does. Unfortunately getting that from the shell is
>>  non-trivial. I'm OK with $(grep -c ^processor /proc/cpuinfo), but
>>  people on non-Linux platforms would have to fill in their own
>>  implementation.
>
> Is this too horrible to contemplate?

Chickens, eggs and a kitchen sink.  Sounds like a title of a movie.

It's a bit dissapointing that we cannot express personal preference
in config.mak ;-)

> diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
> index 0f09bbbf65..fa8caeec0c 100644
> --- a/builtin/rev-parse.c
> +++ b/builtin/rev-parse.c
> @@ -635,6 +635,11 @@ int cmd_rev_parse(int argc, const char **argv, const 
> char *prefix)
>   continue;
>   }
>  
> + if (!strcmp(arg, "--online-cpus")) {
> + printf("%d", online_cpus());
> + continue;
> + }
> +
>   /* The rest of the options require a git repository. */
>   if (!did_repo_setup) {
>   prefix = setup_git_directory();
>
> -Peff

Re: [PATCH 2/2] rebase --exec: make it work with --rebase-merges

2018-08-06 Thread Junio C Hamano

Johannes Schindelin  writes:

>> If we ever see a todo-list without any pick/merge, then insert_final
>> is still 1 when we leave the loop and we will add one single exec at
>> the end.  Which may or may not make sense---I dunno, as I do not
>> offhand think of a reason why the user would give us such a sequence
>> in the first place, so it probably may not matter in practice.
>
> Think of the `noop` command. It was introduced specifically to allow
> rebasing patches interactively to an upstream that already applied the
> local patches. In that case, an `--exec` should still run at least once,
> to avoid negative surprises.

Ah, yes, it probably makes sense when you have `noop`; even if there
is no picks and merges that change the history from previous state
(which presumably matches the state the user started running the
current "rebase -i" session), after the whole sequence you would
want run it one iteration.

In any case, if the current code without this change works like so,
there is no point in redesigning that part of the semantics at all.

t5570-git-daemon fails with SIGPIPE on OSX

2018-08-06 Thread SZEDER Gábor

Travis CI changed its default OSX image to use XCode 9.4 on 2018-07-31
[1].  Since then OSX build jobs fail rather frequently because of a
SIGPIPE in the tests 'fetch notices corrupt pack' or 'fetch notices
corrupt idx' in 't5570-git-daemon.sh' [2].  I think this is a symptom
a real bug in Git affecting other platforms as well, but these tests
are too lax to catch it.

What it boils down to is this sequence:

  - The test first prepares a repository containing a corrupt pack,
ready to be server via 'git daemon'.

  - Then the test runs 'test_must_fail git fetch ', which connects
to 'git daemon', which forks 'git upload-pack', which then
advertises refs (only HEAD) and capabilities.  So far so good.

  - 'git fetch' eventually calls fetch-pack.c:find_common().  The
first half of this function assembles a request consisting of a
want and a flush pkt-line, and sends it via a send_request() call.

At this point the scheduling becomes important: let's suppose that
fetch is slow and upload-pack is fast.

  - 'git upload-pack' receives the request, parses the want line,
notices the corrupt pack, responds with an 'ERR upload-pack: not
our ref' pkt-line, and die()s right away.

  - 'git fetch' finally approaches the end of the function, where it
attempts to send a done pkt-line via another send_request() call
through the now closing TCP socket.

  - What happens now seems to depend on the platform:

- On Linux, both on my machine and on Travis CI, it shows textbook
  example behaviour: write() returns with error and sets errno to
  ECONNRESET.  Since it happens in write_or_die(), 'git fetch'
  die()s with 'fatal: write error: Connection reset by peer', and
  doesn't show the error send by 'git upload-pack'; how could it,
  it doesn't even get as far to receive upload-pack's ERR
  pkt-line.

  The test only checks that 'git fetch' fails, but it doesn't
  check whether it failed with the right error message, so the
  test still succeeds.  Had it checked the error message as well,
  we most likely had noticed this issue already, it doesn't happen
  all that rarely.

- On the new OSX images with XCode 9.4 on Travis CI the write()
  triggers SIGPIPE right away, and 'test_must_fail' notices it and
  fails the test.  I couldn't see any sign of an ECONNRESET or any
  other error that we could act upon to avoid the SIGPIPE.

- On OSX with XCode 9.2 on Travis CI there is neither SIGPIPE, nor
  ECONNRESET, but sending the request actually succeeds even
  though there is no process on the other end of the socket
  anymore.  'git fetch' then simply continues execution, reads and
  parses the ERR pkt-line, and then dies()s with 'fatal: remote
  error: upload-pack: not our ref'.  So, on the face of it, it
  shows the desired behaviour, but I have no idea how that write()
  could succeed instead of returning error.

I don't know what happens on a real Mac as I don't have access to one;
I figured out all the above by enabling packet tracing, adding a
couple of well placed tracing printf() and sleep() calls, running a
bunch of builds on Travis CI, and looking through their logs.  But
without access to a debugger and netstat and what not I can't really
go any further.  So I would now happily pass the baton to those who
have a Mac and know a thing or two about its porting issues to first
check whether OSX on a real Mac shows the same behaviour as it does in
Travis CI's virtualized(?) environment.  And then they can pass the
baton to those who know all the intricacies of the pack protocol and
its implementation to decide what to do with this issue.

For a mostly reliable reproduction recipe you might want to fetch this
branch:

  https://github.com/szeder/git t5570-git-daemon-sigpipe

and then run 'make && cd t && ./t5570-git-daemon.sh -v -x'


Have fun! ;)


1 - https://blog.travis-ci.com/2018-07-19-xcode9-4-default-announce

2 - On git.git's master:
https://travis-ci.org/git/git/jobs/411517552#L2717

Re: [PATCH v2] t4150: fix broken test for am --scissors

2018-08-06 Thread Junio C Hamano

Paul Tan  writes:

> I've taken a look at the original test, and it is pretty broken. My
> ...
> So, there are 3 problems that will need to be fixed.
> ...
> This fixes problem (3) by using an in-body header.
> ...
> This fixes the first half of problem (2) by making the naming of the
> files the same.
> ...
> Nit: I'm not quite sure about naming the tag "scissors-used" though,
> since this commit was not created from the output of "git am
> --scissors". Maybe it should be named `commit-without-scissors-line`
> or something?
>
> This hunk removes the line:
>
> git format-patch --stdout scissors^ >scissors-patch.eml &&
>
> without a corresponding replacement, but that is fine because the test
> should not be using a patch without a scissors line.
> ...
> This fixes the other half of problem (2) by making the naming of the
> files the same.
> ...
> So, this patch fixes the 3 problems with the tests, and so looks correct to 
> me.

Beautifully explained.  There are many styles of patch review, and
I'd personally call this "think aloud to follow the patch author's
flow of thought." style.  Your review is probably one of the best
examples of reviews in the style.  Very readable, helping readers to
reach the same degree of understanding of what problem the patch
tries to address and how, and demonostrates the reviewer thought
through the problem just as deeply as the patch author, all of which
gives weight to the final "looks correct to me".

Thanks, both.  Will queue.

Re: [PATCH] add a script to diff rendered documentation

2018-08-06 Thread Jonathan Nieder

Jeff King wrote:

>   3. Default to number of CPUs, which is what a lot of other threading
>  in Git does. Unfortunately getting that from the shell is
>  non-trivial. I'm OK with $(grep -c ^processor /proc/cpuinfo), but
>  people on non-Linux platforms would have to fill in their own
>  implementation.

How about $(getconf _NPROCESSORS_ONLN)?  That's what Linux's
scripts/coccicheck uses (apropos of a recent discussion :)).

Re: [PATCH 1/4] line-log: demonstrate a bug with nearly-overlapping ranges

2018-08-06 Thread Jonathan Nieder

Hi Dscho,

Johannes Schindelin wrote:
> On Sat, 4 Aug 2018, Jonathan Nieder wrote:

>> Alternatively, could it be squashed in with the patch that fixes it?
>
> There is this really awful trend on this mailing list to suggest
> conflating the demonstration of a bug with the bug fix.
>
> It is really, really important to realize how valuable it is to have the
> regression test as an individual patch that can be used to verify that
> there is a bug, to pinpoint where it was introduced, to test alternative
> fixes, to keep records separate, and I could go on and on and on. Please
> do not ignore these very good reasons, and please refrain from
> recommending such conflation in the future.

If you want to propose changing the project's style to always separate
tests from the patch that fixes a bug, that's a discussion we can have,
in a separate thread.

Today, we do allow and encourage putting the test with the patch that
fixes it, and that has real advantages:

- the tests are easier to understand when found with "git log" because
  they are in context

- as the patch evolves, it is easier to remember to update the test at
  the same time

- newcomers imitating existing patches have a clear hint to write
  tests

- the beginning of a patch series can be applied and merged down while
  the end is still under review, without either leaving out the tests
  or applying a test that doesn't pass and accomplishes little

I've never found it difficult to use the test from a patch to verify
that there is a bug or pinpoint where it was introduced.  Tests are
separate from the application code since they're in the t/ directory;
this is a very easy thing to do.  That isn't to say that a patch that
only adds a (passing or expected-failure) test isn't valuable, even
without a fix.  It is valuable, precisely when it is self-explanatory.

More importantly, I am a bit surprised that instead of accepting the
feedback, you are basically calling a reviewer complicit, for pointing
out a pretty normal possible improvement that follows the project's
conventions.

I'm beyond words.

Jonathan

Re: Fetch on submodule update

2018-08-06 Thread Robert Dailey

On Thu, Aug 2, 2018 at 1:08 AM, Jonathan Nieder  wrote:
> I think I misread this the first time.  I got distracted by your
> mention of the --remote option, but you mentioned you want to use the
> SHA-1 of the submodule listed, so that was silly of me.
>
> I think you'll find that "git fetch --no-recurse-submodules" and "git
> submodule update" do exactly what you want.  "git submodule update"
> does perform a fetch (unless you pass --no-fetch).
>
> Let me know how it goes. :)
>
> I'd still be interested in hearing more about the nature of the
> submodules involved --- maybe `submodule.fetchJobs` would help, or
> maybe this is a workflow where a tool that transparently fetches
> submodules on demand like
> https://gerrit.googlesource.com/gitfs/+/master/docs/design.md would be
> useful (I'm not recommending using slothfs for this today, since it's
> read-only, but it illustrates the idea).

Hi thanks for your response, sorry I am a bit late getting back with you.

Maybe my workflow is dated, because I'm still used to treating
submodules as distinctly separated and independent things. I realize
submodule recursion is becoming more inherent in many high level git
commands, but outside of git there are separation issues that make
this workflow doomed to be non-seamless. For example, pull requests
will never offer the same uniformity: You will still have 1 pull
request per submodule. There's also the issue of log audits: You
cannot use blame, log, bisect, or other "diagnostic" commands to
introspect into submodules "as if" they were subtree or something of
the like (i.e. truly part of the DAG). A more realistic example of one
of the common questions I still can't answer easily is: "How do you
determine which commit in a submodule made it into which release of
the software?" In the case where the parent repository has the
annotated tags (representing software release milestones), and the
submodule is just a common library (which does not have those tags and
has no release cycle). Anyway, none of these issues are particularly
related but they do contribute to the answer to your question
regarding my workflow and use cases. The list goes on but I hope you
get the idea.

Some of the more functional issues are performance related: I am aware
enough, at times, that I can save time (in both local operations and
network overhead) by skipping submodules. For example, if I know that
I'm merging mainline branches, I do not need to mess with the
submodules (I can fetch, merge, commit, push from the parent repo
without messing with the submodules. This saves me time). If
`fetchJobs` was also `updateJobs`, i.e. you could update submodules in
parallel too, that might make this less of an issue. Think of
repositories [like boost][1] that have (I think) over a hundred
sibling submodules: Fetching 8 in parallel *and* doing `submodule
update` in parallel 8 times might also speed things up. There's also
`git status`, that if it recurses into submodules, is also
significantly slow in the boost case (I'm not sure if it is
parallelized).

Again, none of this is particularly related, but just to give you more
context on the "why" for my ask. Sorry if I'm dragging this out too
far.

The TLDR is that I do prefer the manual control. Automatic would be
great if submodules were treated as integrated in a similar manner to
subtree, but it's not there. I wasn't aware that `submodule update`
did a fetch, because sometimes if I do that, I get errors saying SHA1
is not present (because the submodule did not get fetched). Granted I
haven't seen this in a while, so maybe the fetch on submodule update
is a newer feature. Do you know what triggers the fetch on update
without --remote? Is it the missing SHA1 that triggers it, or is it
fetching unconditionally?

Thanks for confirming it behaves as I already wanted. And as you can
tell, I'm also happy to further discuss motivation / use cases /
details related to overall usage of submodules if you'd like. I'm
happy to help however I can!

[1]: https://github.com/boostorg/boost

[PATCH v5 2/2] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Han-Wen Nienhuys

The colorization is controlled with the config setting "color.remote".

Supported keywords are "error", "warning", "hint" and "success". They
are highlighted if they appear at the start of the line, which is
common in error messages, eg.

   ERROR: commit is missing Change-Id

The Git push process itself prints lots of non-actionable messages
(eg. bandwidth statistics, object counters for different phases of the
process), which obscures actionable error messages that servers may
send back. Highlighting keywords in the sideband draws more attention
to those messages.

The background for this change is that Gerrit does server-side
processing to create or update code reviews, and actionable error
messages (eg. missing Change-Id) must be communicated back to the user
during the push. User research has shown that new users have trouble
seeing these messages.

The highlighting is done on the client rather than server side, so
servers don't have to grow capabilities to understand terminal escape
codes and terminal state. It also consistent with the current state
where Git is control of the local display (eg. prefixing messages with
"remote: ").

Finally, this solution is backwards compatible: many servers already
prefix their messages with "error", and they will benefit from this
change without requiring a server update. By contrast, a server-side
solution would likely require plumbing the TERM variable through the
git protocol, so it would require changes to both server and client.

Helped-by: Duy Nguyen 
Signed-off-by: Han-Wen Nienhuys 
---
 Documentation/config.txt|  12 +++
 help.c  |   1 +
 help.h  |   1 +
 sideband.c  | 126 +---
 t/t5409-colorize-remote-messages.sh |  80 ++
 5 files changed, 210 insertions(+), 10 deletions(-)
 create mode 100755 t/t5409-colorize-remote-messages.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 63365dcf3d..33bc1a3def 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1263,6 +1263,18 @@ color.push::
 color.push.error::
Use customized color for push errors.
 
+color.remote::
+   If set, keywords at the start of the line are highlighted. The
+   keywords are "error", "warning", "hint" and "success", and are
+   matched case-insensitively. Maybe set to `always`, `false` (or
+   `never`) or `auto` (or `true`). If unset, then the value of
+   `color.ui` is used (`auto` by default).
+
+color.remote.::
+   Use customized color for each remote keyword. `` may be
+   `hint`, `warning`, `success` or `error` which match the
+   corresponding keyword.
+
 color.showBranch::
A boolean to enable/disable color in the output of
linkgit:git-show-branch[1]. May be set to `always`,
diff --git a/help.c b/help.c
index 3ebf0568db..b6cafcfc0a 100644
--- a/help.c
+++ b/help.c
@@ -425,6 +425,7 @@ void list_config_help(int for_human)
{ "color.diff", "", list_config_color_diff_slots },
{ "color.grep", "", list_config_color_grep_slots },
{ "color.interactive", "", 
list_config_color_interactive_slots },
+   { "color.remote", "", list_config_color_sideband_slots },
{ "color.status", "", list_config_color_status_slots },
{ "fsck", "", list_config_fsck_msg_ids },
{ "receive.fsck", "", list_config_fsck_msg_ids },
diff --git a/help.h b/help.h
index f8b15323a6..9eab6a3f89 100644
--- a/help.h
+++ b/help.h
@@ -83,6 +83,7 @@ void list_config_color_diff_slots(struct string_list *list, 
const char *prefix);
 void list_config_color_grep_slots(struct string_list *list, const char 
*prefix);
 void list_config_color_interactive_slots(struct string_list *list, const char 
*prefix);
 void list_config_color_status_slots(struct string_list *list, const char 
*prefix);
+void list_config_color_sideband_slots(struct string_list *list, const char 
*prefix);
 void list_config_fsck_msg_ids(struct string_list *list, const char *prefix);
 
 #endif /* HELP_H */
diff --git a/sideband.c b/sideband.c
index 325bf0e974..239be2ec85 100644
--- a/sideband.c
+++ b/sideband.c
@@ -1,6 +1,108 @@
 #include "cache.h"
+#include "color.h"
+#include "config.h"
 #include "pkt-line.h"
 #include "sideband.h"
+#include "help.h"
+
+struct keyword_entry {
+   /*
+* We use keyword as config key so it should be a single alphanumeric 
word.
+*/
+   const char *keyword;
+   char color[COLOR_MAXLEN];
+};
+
+static struct keyword_entry keywords[] = {
+   { "hint",   GIT_COLOR_YELLOW },
+   { "warning",GIT_COLOR_BOLD_YELLOW },
+   { "success",GIT_COLOR_BOLD_GREEN },
+   { "error",  GIT_COLOR_BOLD_RED },
+};
+
+/* Returns a color setting (GIT_COLOR_NEVER, etc). */
+static int use_sideband_colors(void)
+{
+   static int use_sideband_colors_cached = -1;
+
+   co

[PATCH v5 0/2] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Han-Wen Nienhuys

Address Jun's & Jrn's comments.

Han-Wen Nienhuys (2):
  config: document git config getter return value
  sideband: highlight keywords in remote sideband output

 Documentation/config.txt|  12 +++
 config.h|   7 +-
 help.c  |   1 +
 help.h  |   1 +
 sideband.c  | 126 +---
 t/t5409-colorize-remote-messages.sh |  80 ++
 6 files changed, 216 insertions(+), 11 deletions(-)
 create mode 100755 t/t5409-colorize-remote-messages.sh

--
2.18.0.597.ga71716f1ad-goog

[PATCH v5 1/2] config: document git config getter return value

2018-08-06 Thread Han-Wen Nienhuys

Signed-off-by: Han-Wen Nienhuys 
---
 config.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/config.h b/config.h
index bb2f506b27..183b31ebf4 100644
--- a/config.h
+++ b/config.h
@@ -188,9 +188,14 @@ struct config_set {
 
 extern void git_configset_init(struct config_set *cs);
 extern int git_configset_add_file(struct config_set *cs, const char *filename);
-extern int git_configset_get_value(struct config_set *cs, const char *key, 
const char **value);
 extern const struct string_list *git_configset_get_value_multi(struct 
config_set *cs, const char *key);
 extern void git_configset_clear(struct config_set *cs);
+
+/*
+ * These functions return 1 if not found, and 0 if found, leaving the found
+ * value in the 'dest' pointer.
+ */
+extern int git_configset_get_value(struct config_set *cs, const char *key, 
const char **dest);
 extern int git_configset_get_string_const(struct config_set *cs, const char 
*key, const char **dest);
 extern int git_configset_get_string(struct config_set *cs, const char *key, 
char **dest);
 extern int git_configset_get_int(struct config_set *cs, const char *key, int 
*dest);
-- 
2.18.0.597.ga71716f1ad-goog

Re: concurrent access to multiple local git repos is error prone

2018-08-06 Thread Jonathan Nieder

(administrivia: please don't top-post)
Hi Alex,

Alexander Mills wrote:

> Yeah this concurrency problem is real. Not only does it happen with
> `git status` the same thing happens with `git rev-parse
> --show-toplevel`.

Sorry for the confusion --- I didn't mean to claim your experience was
not real!

What I wanted to make clear is that

 1. Git is designed to allow concurrent reads of a repository (and
pushes to a repository).  If it doesn't work, that is a simple bug,
not a design goal.

 2. Plenty of people rely on concurrently accessing repositories, so
if it doesn't work, then (i) we definitely want to know and (ii)
we're going to need a lot of detail to figure out what's happening,
so we can fix it.

Does that make it clearer?

> What happens is that I get no stdout when repos are accessed
> concurrently (and no stderr). If I limit concurrency to 1, the problem
> goes away. When I up the concurrency, the problem is sporadic, which
> is the exact signal for a concurrency/race-condition related issue.
> The signs are damn clear. I have seen this problem on MacOS I think a
> year back on a different project, but I never reported it b/c I hadn't
> really verified it.
>
> Like I said I am on Ubuntu. I have 3 git repos that are incorporated
> into the tool that's generating the problem. For one repo I got this:
>
> $ git fsck
>
> Checking object directories: 100% (256/256), done.
> dangling tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>
> For `$ git version --build-options` I have:
>
> git version 2.17.1
> cpu: x86_64
> no commit associated with this build
> sizeof-long: 8

Thanks.  My best idea for a next step is that if you can come up with a
reproduction recipe, that would be very helpful.

It doesn't have to reproduce 100% of the time, but e.g. if you have a
script that reproduces it 50% of the time, I can run that script in a
loop.

Thanks,
Jonathan

Re: [PATCH v2] checkout: optimize "git checkout -b "

2018-08-06 Thread Ben Peart





On 8/3/2018 11:58 AM, Duy Nguyen wrote:

On Thu, Aug 02, 2018 at 02:02:00PM -0400, Ben Peart wrote:






But if you still want to push it further, this is something I have in
mind. It probably has bugs, but at least preliminary test shows me
that it could skip 99% work inside unpack_trees() and not need to
write the index.

The main check to determine "checkout -b" is basically the new
oidcmp() in merge_working_tree(). Not sure if I miss any condition in
that check, I didn't look very closely at checkout code.



Thanks Duy.  I think this is an interesting idea to pursue but... when I 
tried running this patch on a virtualized repo it started triggering 
many object downloads.  After taking a quick look, it appears that 
CE_UPDATE is set on every cache entry so check_updates() ends up calling 
checkout_entry() which writes out every file to the working tree - even 
those supposedly skipped by the skip-wortree bit.  Oops.


Not too surprising (you did say it probably has bugs :)) but it means I 
can't trivially get performance data on how much this will help.  It 
also fails a lot of tests (see below).


It experience does highlight the additional risk of this model of 
changing the underlying functions (vs the high level optimization of my 
original patch).  In addition, the new special cases in those 
lower-level functions do add additional complexity and fragility to the 
codebase.  So, like most things, to me it isn't a clear better/worse 
decision - it's just different.  While I like the idea of general 
optimizations that could apply more broadly to other commands; I do 
worry about the additional complexity, amount of code churn, and 
associated risk with the change.


When I have cycles, I'll take a look at how to fix this bug and get some 
performance data.  I just wanted to give you a heads up that I'm not 
ignoring your patch, just that it is going to take additional time and 
effort before I can properly evaluate how much impact it will have.



Test Summary Report
---
./t1011-read-tree-sparse-checkout.sh   (Wstat: 256 Tests: 21 
Failed: 1)

  Failed test:  20
  Non-zero exit status: 1
./t1400-update-ref.sh  (Wstat: 256 Tests: 
170 Failed: 73)

  Failed tests:  40, 42-45, 55-59, 70, 72, 82, 85, 87-88
90-100, 103-110, 113-119, 127, 129-130
132-133, 136-137, 140-147, 150-157, 160-166
170
  Non-zero exit status: 1
./t2011-checkout-invalid-head.sh   (Wstat: 256 Tests: 10 
Failed: 5)

  Failed tests:  3, 6-7, 9-10
  Non-zero exit status: 1
./t2015-checkout-unborn.sh (Wstat: 256 Tests: 6 
Failed: 3)

  Failed tests:  2-4
  Non-zero exit status: 1
./t2017-checkout-orphan.sh (Wstat: 256 Tests: 13 
Failed: 7)

  Failed tests:  7-13
  Non-zero exit status: 1
./t3033-merge-toplevel.sh  (Wstat: 256 Tests: 13 
Failed: 11)

  Failed tests:  3-13
  Non-zero exit status: 1
./t3200-branch.sh  (Wstat: 256 Tests: 
139 Failed: 2)

  Failed tests:  137-138
  Non-zero exit status: 1
./t5616-partial-clone.sh   (Wstat: 256 Tests: 13 
Failed: 1)

  Failed test:  4
  Non-zero exit status: 1
./t5516-fetch-push.sh  (Wstat: 256 Tests: 90 
Failed: 1)

  Failed test:  34
  Non-zero exit status: 1
./t6300-for-each-ref.sh(Wstat: 256 Tests: 
205 Failed: 9)

  Failed tests:  189-196, 199
  Non-zero exit status: 1
./t7114-reset-sparse-checkout.sh   (Wstat: 256 Tests: 3 
Failed: 2)

  Failed tests:  2-3
  Non-zero exit status: 1
./t7063-status-untracked-cache.sh  (Wstat: 256 Tests: 50 
Failed: 1)

  Failed test:  23
  Non-zero exit status: 1
./t7201-co.sh  (Wstat: 256 Tests: 38 
Failed: 33)

  Failed tests:  4, 6-27, 29-38
  Non-zero exit status: 1
./t7409-submodule-detached-work-tree.sh(Wstat: 256 Tests: 2 
Failed: 1)

  Failed test:  1
  Non-zero exit status: 1
./t9350-fast-export.sh (Wstat: 256 Tests: 37 
Failed: 1)

  Failed test:  12
  Non-zero exit status: 1
./t9903-bash-prompt.sh (Wstat: 256 Tests: 65 
Failed: 52)

  Failed tests:  4, 6-10, 14-34, 36, 39-51, 53-62, 65
  Non-zero exit status: 1
Files=834, Tests=19658, 2081 wallclock secs (10.42 usr 15.09 sys + 
1082.56 cusr 3530.46 csys = 4638.53 CPU)

Result: FAIL



-- 8< --
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 28627650cd..912e565acc 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -478,6 +478,7 @@ static int merge_working_tree(const struct checkout_opts 
*opts,
  {
int ret;
struct lock_file lock_file = LOCK_INIT;
+   int skip_cache_tree_update = 0;
  
  	hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);

if (read_cache_preload(NULL) < 0)
@@ -493,6 +494,7 @@ static int merge_wo

Re: [PATCH 2/2] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Han-Wen Nienhuys

On Fri, Aug 3, 2018 at 5:52 AM Jonathan Nieder  wrote:
>
> Hi,
>
> Han-Wen Nienhuys wrote:
>
> > The colorization is controlled with the config setting "color.remote".
> >
> > Supported keywords are "error", "warning", "hint" and "success". They
> > are highlighted if they appear at the start of the line, which is
> > common in error messages, eg.
> >
> >ERROR: commit is missing Change-Id
>
> A few questions:
>
> - should this be documented somewhere in Documentation/technical/*protocol*?
>   That way, server implementors can know to take advantage of it.

The protocol docs seem to work on a much different level. Maybe there
should be a "best practices" document or similar?

> - how does this interact with (future) internationalization of server
>   messages?  If my server knows that the client speaks French, should
>   they send "ERROR:" anyway and rely on the client to translate it?
>   Or are we deferring that question to a later day?

It doesn't, and we are deferring that question.

> [...]
> > The Git push process itself prints lots of non-actionable messages
> > (eg. bandwidth statistics, object counters for different phases of the
> > process), and my hypothesis is that these obscure the actionable error
> > messages that our server sends back. Highlighting keywords in the
> > draws more attention to actionable messages.
>
> I'd be interested in ways to minimize Git's contribution to this as
> well.  E.g. can we make more use of \r to make client-produced progress
> messages take less screen real estate?  Should some of the servers
> involved (e.g., Gerrit) do so as well?

Yes, I'm interested in this too, but I fear it would veer into a
territory that is prone to bikeshedding.

Gerrit should definitely also send less noise.

> > Finally, this solution is backwards compatible: many servers already
> > prefix their messages with "error", and they will benefit from this
> > change without requiring a server update.
>
> Yes, this seems like a compelling reason to follow this approach.
>
> [...]
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -1229,6 +1229,15 @@ color.push::
> >  color.push.error::
> >   Use customized color for push errors.
> >
> > +color.remote::
> > + A boolean to enable/disable colored remote output. If unset,
> > + then the value of `color.ui` is used (`auto` by default).
> > +
> > +color.remote.::
> > + Use customized color for each remote keywords. `` may be
> > + `hint`, `warning`, `success` or `error` which match the
> > + corresponding keyword.
>
> What positions in a remote message are matched?  If a server prints a
> message about something being discouraged because it is error-prone,
> would the "error" in error-prone turn red?

yes, if it happened to occur after a line-break.

We could decide that we will only highlight

  ':' rest of line

or

  '\n'

that would work for the Gerrit case, and would be useful forcing
function to uniformize remote error messages.

> > +struct kwtable {
>
> nit: perhaps kwtable_entry?

done.

> > +/*
> > + * Optionally highlight some keywords in remote output if they appear at 
> > the
> > + * start of the line. This should be called for a single line only.
> > + */
> > +void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
>
> Can this be static?

Done.

> What does 'n' represent?  Can the comment say?  (Or if it's the length
> of src, can it be renamed to srclen?)

Added comment.

> Super-pedantic nit: even if there are multiple special words at the
> start, this will only highlight one. :)  So it could say something
> like "Optionally check if the first word on the line is a keyword and
> highlight it if so."

Super pedantic answer: if people care about it that much, they can
read the 20 lines of code below :-)

> > +{
> > + int i;
> > + if (!want_color_stderr(use_sideband_colors())) {
>
> nit: whitespace damage (you can check for this with "git show --check").
> There's a bit more elsewhere.

ran tabify on whole file.

> > + for (i = 0; i < ARRAY_SIZE(keywords); i++) {
> > + struct kwtable* p = keywords + i;
>
> nit: * should attach to the variable, C-style.

Done.

> You can use "make style" to do some automatic formatting (though this
> is a bit experimental, so do double-check the results).

sad panda:

hanwen@han-wen:~/vc/git$ make style
git clang-format --style file --diff --extensions c,h
Traceback (most recent call last):
  File "/usr/bin/git-clang-format", line 580, in 
main()
  File "/usr/bin/git-clang-format", line 62, in main
config = load_git_config()
  File "/usr/bin/git-clang-format", line 195, in load_git_config
name, value = entry.split('\n', 1)
ValueError: need more than 1 value to unpack
Makefile:2663: recipe for target 'style' failed
make: *** [style] Error 1

> [...]
> > @@ -48,8 +145,10 @@ int recv_sideband(const char *me, int in_stream, int 
> > out)
> >   len--;
> >   switch (band)

Re: [RFC PATCH 2/5] Add delta-islands.{c,h}

2018-08-06 Thread Jeff King

On Sun, Aug 05, 2018 at 08:53:19PM +0200, Christian Couder wrote:

> - 'layer' is used in add_to_write_order() which is called from many
> places in add_descendants_to_write_order() and compute_write_order()
> for example like this:
> 
> for (s = DELTA_SIBLING(e); s; s = DELTA_SIBLING(s)) {
> add_to_write_order(wo, endp, s);
> }
> 
> So it seems to me that moving 'layer' to 'struct packing_data' would
> require changing the DELTA_SIBLING macros or adding a hack to easily
> find the index in an array corresponding to a 'struct object_entry'.

Right, that hack is how the various the existing memory-shrinking macros
work. Every "struct object_entry *" that we deal with _must_ be in the
list in "packing_data", so that we can compute an index as
"entry - to_pack.objects".

So I think Duy is just suggesting:

  void oe_set_layer(struct packing_data *pack,
struct object_entry *entry,
int layer)
  {
  if (!pack->layers)
  ALLOC_ARRAY(pack->layers, pack->nr_objects);
  pack->layers[entry - pack->objects] = layer;
  }

  int oe_get_layer(struct packing_data *pack,
   struct object_entry *entry)
  {
  if (!pack->layers)
  return 0;
  return pack->layers[entry - pack->nr_objects];
  }

That said, I wonder if need to cache the layer at all. We compute the
layers all at once in the new compute_pack_layers(). But it is just a
linear walk over the objects, asking for each "is this in the core
island?".

Could we just ask "is this in the core island?" when we consider each
object? That's a hash lookup each time we ask instead of looking at our
int, but I'd think we would only ask in linear proportion to the number
of objects. So there's no algorithmic speedup, though maybe a
constant-time one (for two layers, I'd expect we'd probably ask about
each object twice).

> - I don't see why it is different from other fields in 'struct
> object_entry' that are also used in compute_write_order(), for
> example:
> 
> uint32_t delta_idx;/* delta base object */
> uint32_t delta_child_idx; /* deltified objects who bases me */
> uint32_t delta_sibling_idx; /* other deltified objects who ... */
> 
> Would we also want to move these fields to 'struct packing_data'? Why
> or why not? If we want to move them, then it might be a good idea to
> do all the moving at the same time to make sure we get something
> coherent.

We actually use those multiple times. They're reset and used in
compute_write_order(), but I think we use them earlier as part of the
delta search (at the very least we use delta_idx; maybe no the
child/sibling stuff).

-Peff

Re: [PATCH 2/2] sideband: highlight keywords in remote sideband output

2018-08-06 Thread Han-Wen Nienhuys

On Thu, Aug 2, 2018 at 8:22 PM Junio C Hamano  wrote:
> >
> > Helped-by: Duy Nguyen 
> > Signed-off-by: Han-Wen Nienhuys 
> > ---
> >  Documentation/config.txt|   9 +++
> >  help.c  |   1 +
> >  help.h  |   1 +
> >  sideband.c  | 119 +---
> >  t/t5409-colorize-remote-messages.sh |  47 +++
> >  5 files changed, 168 insertions(+), 9 deletions(-)
> >  create mode 100644 t/t5409-colorize-remote-messages.sh
>
> I'll "chmod +x" while queuing.
>
Done.

> If your "make test" did not catch this as an error, then we may need
> to fix t/Makefile, as it is supposed to run test-lint.

I've been running tests individually as

 sh t5409-colorize-remote-messages.sh  -v -d

> > +color.remote::
> > + A boolean to enable/disable colored remote output. If unset,
> > + then the value of `color.ui` is used (`auto` by default).
>
> Nobody tells the end-users what "colored remote output" does;
> arguably they can find it out themselves by enabling the feature and
> observing remote messages, but that is not user friendly.

expanded doc.

> > +color.remote.::
> > + Use customized color for each remote keywords. `` may be
>
> Isn't 'each' a singular, i.e. "for each remote keyword"?  If so I do
> not mind dropping 's' myself while queuing.

Done.

>
> > + `hint`, `warning`, `success` or `error` which match the
> > + corresponding keyword.
>
> We need to say that keywords are painted case insensitively
> somewhere in the doc.  Either do that here, or in the updated
> description of `color.remote`---I am not sure which one results in
> more readable text offhand.

Done.

> > +void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> > +{
> > + int i;
>
> In a block with a dozen more more lines, it is easier to have a
> blank line between decls and the first statement, i.e. here.

Done.

> > + if (!want_color_stderr(use_sideband_colors())) {
>
> The above line is indented with SP followed by HT; don't.

Fixed. It would be great if there were a pre-commit hook that I could
install to prevent this from ever getting committed.


> > + struct kwtable* p = keywords + i;
>
> struct kwtable *p = keywords + i;

Done.

> > + int len = strlen(p->keyword);
> > +/*
> > + * Match case insensitively, so we colorize output from 
> > existing
> > + * servers regardless of the case that they use for their
> > + * messages. We only highlight the word precisely, so
> > + * "successful" stays uncolored.
> > + */
>
> Indent with tabs, not a run of spaces, i.e.

Done.

> Use write_script, i.e. instead of all the above, say

Done.


> Our tests are not written to demonstrate that our code works as
> written.  It is to protect our code from getting broken by others
> who may not share vision of the original author.  Make sure that you
> cast what you care about in stone, e.g. include "echo ERROR: bad" or
> something in the above to ensure that future updates to the code
> will not turn the match into a case sensitive one without breaking
> the test suite.

Add some more cases.

> > + echo 1 >file &&
> > + git add file &&
> > + git commit -m 1 &&
> > + git clone . child &&
> > + cd child &&
> > + echo 2 > file &&
> > + git commit -a -m 2
>
> Don't chdir the whole testing environment like this.  Depending on
> the success and failure in the middle of the above &&-chain, the
> next test will start at an unexpected place, which is bad.
>
> Instead, do something like
>
> git clone . child &&
> echo 2 >child/file &&
> git -C child commit -a -m 2
>
> or
>
Done.

> > +test_expect_success 'push' '
> > + git -c color.remote=always push -f origin HEAD:refs/heads/newbranch 
> > 2>output &&
> > + test_decode_color decoded &&
> > + grep "error:" decoded &&
> > + grep "hint:" decoded &&
> > + grep "success:" decoded &&
> > + grep "warning:" decoded &&
> > + grep "prefixerror: error" decoded
>
> A comment before this test (which covers both of these two) that
> explains why many "grep" invocations are necessary, instead of a
> comparison with a single multi-line expected result file.  I am
> guessing that it is *not* because you cannot rely on the order of
> these lines coming out from the update hook, but because the remote
> output have lines other than what is given by the update hook and
> we cannot afford to write them in the expected result file.

Comparing exact outputs is IMO an antipattern in general. It makes the
test more fragile than they need to be. (what if we change the
"remote: " prefix to something else for example?).

If using a golden output, the second test would either require a
repetitive, too long golden file, or it would use loose greps anyway.

Added a comment.

--
Google Germany

Re: [RFC PATCH 3/5] pack-objects: add delta-islands support

2018-08-06 Thread Jeff King

On Mon, Aug 06, 2018 at 10:44:14AM +0200, Christian Couder wrote:

> Taking a look at how we use regexec() in our code base, it looks like
> it might be better to use regexec_buf() defined in
> "git-compat-util.h".
> 
> I am not completely sure about that because apply.c has:
> 
> status = regexec(stamp, timestamp, ARRAY_SIZE(m), m, 0);
> if (status) {
> if (status != REG_NOMATCH)
> warning(_("regexec returned %d for input: %s"),
> status, timestamp);
> return 0;
> }
> 
> Though the above uses a regex that is defined in apply.c. The regex
> doesn't come from the config file.
> 
> Actually except the above there is a mix of regexec() and
> regexec_buf() in our code base, which are used with only 0, 1 or 2
> capture groups, so it is not very clear what should be used.

I don't think we need regexec_buf(). The advantage it has is that it can
operate on strings that aren't NUL-terminated, but that isn't the case
here.

> And anyway I still don't see how we could diagnose when the end user
> input requires more captures than we support.

We can use the final element as a sentinel, and complain if it gets
filled in:

diff --git a/delta-islands.c b/delta-islands.c
index dcc6590cc1..18426ffb18 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -375,6 +375,10 @@ static int find_island_for_ref(const char *refname, const 
struct object_id *oid,
if (i < 0)
return 0;

+   if (matches[ARRAY_SIZE(matches)-1].rm_so != -1)
+   die("island regex had too many matches (max=%d)",
+   (int)ARRAY_SIZE(matches) - 2);
+
for (m = 1; m < ARRAY_SIZE(matches); m++) {
regmatch_t *match = &matches[m];

The big downside is that it only kicks in when you actually successfully
make a match. So you could have:

  [pack]
  island = refs/(one)/(two)/(three)/(four)/(five)/(six)/(seven)

in your config for years, and then one day it blows up when somebody
actually has a ref that matches it.

I think it would be fine to just say "we only respect the first N
capture groups". And maybe even issue a warning (based on the detection
above). I'd also be fine with bumping the "matches" array to something
more ridiculous, like 32. The current value of 8 was supposed to be
ridiculous already (we've never used more than 2).

-Peff

1 2 >

1 - 100 of 118 matches

Mail list logo