Re: [PATCH] pack-protocol.txt: accept error packets in any context

2018-11-28 Thread Junio C Hamano
Junio C Hamano  writes:

>> diff --git a/connect.c b/connect.c
>> index 24281b608..458906e60 100644
>> --- a/connect.c
>> +++ b/connect.c
>> @@ -306,8 +306,6 @@ struct ref **get_remote_heads(struct packet_reader 
>> *reader,
>>  die_initial_contact(1);
>>  case PACKET_READ_NORMAL:
>>  len = reader->pktlen;
>> -if (len > 4 && skip_prefix(reader->line, "ERR ", ))
>> -die(_("remote error: %s"), arg);
>>  break;
>>  case PACKET_READ_FLUSH:
>>  state = EXPECTING_DONE;

This breaks build by not removing the decl of arg while removing the
last user of that variable in the function.

 connect.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/connect.c b/connect.c
index 458906e60d..4813f005ab 100644
--- a/connect.c
+++ b/connect.c
@@ -296,7 +296,6 @@ struct ref **get_remote_heads(struct packet_reader *reader,
struct ref **orig_list = list;
int len = 0;
enum get_remote_heads_state state = EXPECTING_FIRST_REF;
-   const char *arg;
 
*list = NULL;
 
-- 
2.20.0-rc1-10-g7068cbc4ab



Re: [PATCH v2] log -G: Ignore binary files

2018-11-28 Thread Junio C Hamano
Junio C Hamano  writes:

>> +test_expect_success 'log -G ignores binary files' '
>> +git checkout --orphan orphan1 &&
>> +printf "a\0a" >data.bin &&
>> +git add data.bin &&
>> +git commit -m "message" &&
>> +git log -Ga >result &&
>> +test_must_be_empty result
>> +'
>
> As this is the first mention of data.bin, this is adding a new file
> data.bin that has two 'a' but is a binary file.  And that is the
> only commit in the history leading to orphan1.
>
> The fact that "log -Ga" won't find any means it missed the creation
> event, because the blob is binary.  Good.

By the way, this root commit records another file whose path is
"file" and has "Picked" in it.  If the file had 'a' in it, it
would have been included in "git log" output, but that is too subtle
a point to be noticed by the readers who are only reading this patch
without seeing what has been done to the index before this test
piece.

If you are going to restructure these tests to create a three-commit
history in a single expect_success that is inspected with various
"log -Ga" invocations in subsequent tests, it is worth removing that
other file (or rather, starting with "read-tree --empty" immediately
after checking out the orphan branch, to clarify to the readers that
there is nothing but what you add in the set-up step in the index)
to make the test more robust.



Re: [PATCH v2] log -G: Ignore binary files

2018-11-28 Thread Junio C Hamano
Thomas Braun  writes:

> Subject: Re: [PATCH v2] log -G: Ignore binary files

s/Ig/ig/; (will locally munge--this alone is no reason to reroll).

The code changes looked sensible.

> diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
> index 844df760f7..5c3e2a16b2 100755
> --- a/t/t4209-log-pickaxe.sh
> +++ b/t/t4209-log-pickaxe.sh
> @@ -106,4 +106,44 @@ test_expect_success 'log -S --no-textconv (missing 
> textconv tool)' '
>   rm .gitattributes
>  '
>  
> +test_expect_success 'log -G ignores binary files' '
> + git checkout --orphan orphan1 &&
> + printf "a\0a" >data.bin &&
> + git add data.bin &&
> + git commit -m "message" &&
> + git log -Ga >result &&
> + test_must_be_empty result
> +'

As this is the first mention of data.bin, this is adding a new file
data.bin that has two 'a' but is a binary file.  And that is the
only commit in the history leading to orphan1.

The fact that "log -Ga" won't find any means it missed the creation
event, because the blob is binary.  Good.

> +test_expect_success 'log -G looks into binary files with -a' '
> + git checkout --orphan orphan2 &&
> + printf "a\0a" >data.bin &&
> + git add data.bin &&
> + git commit -m "message" &&

This starts from the state left by the previous test piece, i.e. we
have a binary data.bin file with two 'a' in it.  We pretend to
modify and add, but these two steps are no-op if the previous
succeeded, but even if the previous step failed, we get what we want
in the data.bin file.  And then we make an initial commit the same
way.

> + git log -a -Ga >actual &&
> + git log >expected &&

And we ran the same test but this time with "-a" to tell Git that
binary-ness should not matter.  It will find the sole commit.  Good.

> + test_cmp actual expected
> +'
> +
> +test_expect_success 'log -G looks into binary files with textconv filter' '
> + git checkout --orphan orphan3 &&
> + echo "* diff=bin" > .gitattributes &&

s/> />/; (will locally munge--this alone is no reason to reroll).

> + printf "a\0a" >data.bin &&
> + git add data.bin &&
> + git commit -m "message" &&
> + git -c diff.bin.textconv=cat log -Ga >actual &&

This exposes a slight iffy-ness in the design.  The textconv filter
used here does not strip the "binary-ness" from the payload, but it
is enough to tell the machinery that -G should look into the
difference.  Is that really desirable, though?

IOW, if this weren't the initial commit (which is handled by the
codepath to special-case creation and deletion in diff_grep()
function), would "log -Ga" show it without "-a"?  Should it?

I think this test piece (and probably the previous ones for "-a" vs
"no -a" without textconv, as well) should be using a history with
three commits, where

- the root commit introduces "a\0a" to data.bin (creation event)

- the second commit adds another instance of "a\0a" to data.bin
  (forces comparison)

- the third commit removes data.bin (deletion event)

and make sure that the three are treated identically.  If "log -Ga"
finds one (with the combination of other conditions like use of
textconv or -a option), it should find all three, and vice versa.

> + git log >expected &&
> + test_cmp actual expected
> +'
> +
> +test_expect_success 'log -S looks into binary files' '
> + git checkout --orphan orphan4 &&
> + printf "a\0a" >data.bin &&
> + git add data.bin &&
> + git commit -m "message" &&
> + git log -Sa >actual &&
> + git log >expected &&
> + test_cmp actual expected
> +'

Likewise.  This would also benefit from a three-commit history.

Perhaps you can create such a history at the beginning of these
additions as another "setup -G/-S binary test" step and test
different variations in subsequent tests without the setup?

>  test_done


Re: [PATCH 5/5] test-lib: add support for GIT_TODO_TESTS

2018-11-28 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> -To skip tests, set the GIT_SKIP_TESTS variable. Individual tests can
> -be skipped:
> +To skip tests, set either the GIT_SKIP_TESTS or GIT_TODO_TESTS
> +variables. The difference is that with SKIP the tests won't be run at
> +all, whereas they will be run with TODO, but in success or failure
> +annotated as a TODO test.

It is entirely unclear what "They will be run with TODO" means.

I know what you want to achieve from the code change; instead of
just skipping, you want to cause as much side effect the skipped
test piece wants to make until it fails and dies, without taking
the remainder of the test down.  And I kind of agree that sometimes
such a mode would be very useful (i.e. when you do not want to
bother separating such a failing test properly into the setup part
that would affect later tests and the one-thing-it-wants-to-test
part that can now be safely skipped).  From the cursory look, the
code change in this patch is sensible realization of that idea.

Having said all that, I won't picking this up until next month ;-) I
really want to see that everybody is concentrating on making sure
that 2.20 is solid before playing with shiny new toys.

Thanks.



Re: [PATCH] transport-helper.c: do not translate a string twice

2018-11-28 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  My bad.
>
>  transport-helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/transport-helper.c b/transport-helper.c
> index 7213fa0d32..bf225c698f 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -573,7 +573,7 @@ static int run_connect(struct transport *transport, 
> struct strbuf *cmdbuf)
>   fprintf(stderr, "Debug: Falling back to dumb "
>   "transport.\n");
>   } else {
> - die(_(_("unknown response to connect: %s")),
> + die(_("unknown response to connect: %s"),
>   cmdbuf->buf);

Thanks.


Re: [PATCH v2 5/7] checkout: split options[] array in three pieces

2018-11-28 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> +static struct option *add_switch_branch_options(struct checkout_opts *opts,
> + struct option *prevopts)
> +{
> + struct option options[] = {
>   OPT_STRING('b', NULL, >new_branch, N_("branch"),
>  N_("create and checkout a new branch")),

I think there should be another step to rename the options to more
sensible ones for the context.  In the context of overall "checkout"
command, the 'b' option

git checkout -b  "

that signals that its parameter has something to do with a 'branch'
makes perfect sense.  But when the user uses the new command

git switch-branch -b  

does not convey the more important fact in the context.  In the
orignal context, "this is about a branch" and "we are not checking
out an existing one, but are creating" are both worthwhile thing to
express, but because a single letter option cannot stand for both,
"-b" is the most reasonable choice (compared to calling it "-c" that
stands for "create" that invites "what exactly are you creating?").

In the new context of "switch-branch", it no longer has to be said
that the optional feature is about "branch".  So I would imagine
that users naturally expect this option to be called

git switch-branch --create  

(or -c for short).

I'll just stop with this single example, but I think we should make
sure all the options make sense in the context of new command.

Of course, that means it will NOT be sufficient to just split the
option table into two tables and stitch them together for the single
command.  This option must stay to be "-b" (giving it a synonym
"--create-branch" is OK) in the context of "checkout".


Re: [PATCH/RFC v2 0/7] Introduce new commands switch-branch and checkout-files

2018-11-28 Thread Junio C Hamano
Stefan Xenos  writes:

> So - IMO - detaching should always be an explicit action. Some options
> that occur to me:
>
> git switch-branch --detach

That is the most obvious way to spell it, and it is why we have "git
checkout --detach".  If we were to split one half of "checkout" into
"switch-branch", I would support the idea to make switch-branch only
take a branch name and nothing else, allow it to take any commit-ish
to detach the HEAD at that commit in the history graph with the
--detach option".  I also do not think anybody minds explaining the
resulting state to be "on an unnamed branch" (or is it "the" unnamed
branch?  Given that HEAD@{} reflog is a singleton, perhaps the right
mental model is that there is only one unnamed branch per worktree).

> git detach

The detached HEAD state is not all that special to deserve a
separate command.  After all, all history growing commands like
"commit", "cherry-pick", "rebase", "merge", etc. work the same way
on the unnamed branch.

> git switch-branch HEAD

I do not think this one is acceptable.  "git checkout HEAD" does not
detach but works as if you said "git checkout master" (when on the
'master' branch).  And you do not want "git switch-branch HEAD^0" to
be that explicit way to tell Git to detach the HEAD as that will
take us back to square one ("git checkout HEAD^0" is the more
concise and time-honored way to say "git checkout --detach HEAD").


Re: [PATCH v2 7/7] Suggest other commands instead of "git checkout"

2018-11-28 Thread Junio C Hamano
Duy Nguyen  writes:

> I see my deliberate attempt to provoke has failed :D Giving your view
> of the new commands as "training wheels", I take it we still should
> make them visible as much as possible, but we just not try to hide
> "git checkout" as much (e.g. we mention both new and old commands,
> instead of just mentioning the new one, when suggesting something)?

Yes, I do support the overall idea of learning two (or possibly
three) separate commands would help new users to form the right
mental model much sooner than learning one that can be used in
multiple ways.  Another possible approach could be to split the use
of "reset" that does not move "HEAD" into the same half of
"checkout" that does not move "HEAD", i.e. checkout-files.


Re: [PATCH v2 6/7] checkout: split into switch-branch and checkout-files

2018-11-28 Thread Junio C Hamano
Stefan Xenos  writes:

> Although I have no problem with "switch-branch" as a command name,
> some alternative names we might consider for switch-branch might be:
>
> chbranch
> swbranch

Please never go in that direction.  So far, we made a conscious
effort to keep the names of most frequently used subcommand to
proper words that can be understood by coders (IOW, I expect they
know what 'grep' is, even though that may not be a 'proper word').

> switch
> branch change (as a subcommand for the "branch" command)

It is more like moving to the branch to work on.  I think 'switch'
is something SVN veterans may find familiar.  Both are much better
than ch/swbranch that are overly cryptic.


Re: [PATCH v2 6/7] checkout: split into switch-branch and checkout-files

2018-11-28 Thread Junio C Hamano
Stefan Beller  writes:

> I dislike the checkout-* names, as we already have checkout-index
> as plumbing, so it would be confusing as to which checkout-* command
> should be used when and why as it seems the co-index moves
> content *from index* to the working tree, but the co-files moves content
> *to files*, whereas checkout-branch is neither 'moving' to or from a branch
> but rather 'switching' to that branch.

To me, "switching to work on the branch", is like "checking the book
out from the library to read".  IOW, "check the branch out to work
on" does not have to involve *any* moving of contents.


Re: [PATCH 1/1] rebase --stat: fix when rebasing to an unrelated history

2018-11-28 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> The built-in version of the `git rebase` command blindly translated that
> shell script code, assuming that there is no need to test whether there
> *was* a merge base, and due to its better error checking, exited with a
> fatal error (because it tried to read the object with hash ...
> as a tree).

And the scripted version produced a wrong result because it did not
check the lack of merge base and fed an empty string (presumably, as
that is what you would get from mb=$(merge-base A B) when A and B
are unrelated) to later steps?  If that is the case, then it *is* a
very significant thing to record here.  As it was not merely "failed
to stop due to lack of error checking", but a lot worse.  It was
producing a wrong result.

A more faithful reimplementation would not have exited with fatal
error, but would have produced the same wrong result, so "a better
error checking caused the reimplementation die where the original
did not die" may be correct but is much less important than the fact
that "the logic used in the original to produce diffstat forgot that
there may not be a merge base and would not have worked correctly in
such a case, and that is what we are correcting" is more important.

Please descibe the issue as such, if that is the case (I cannot read
from the description in the proposed log message if that is the
case---but I came to the conclusion that it is what you wanted to
fix reading the code).

>   if (options.flags & REBASE_VERBOSE)
>   printf(_("Changes from %s to %s:\n"),
> - oid_to_hex(_base),
> + is_null_oid(_base) ?
> + "(empty)" : oid_to_hex(_base),

I am not sure (empty) is a good idea for two reasons.  It is going
to be inserted into an translated string without translation.  Also
it is too similar to the fallback string used by some printf
implementations when NULL was given to %s by mistake.

If there weren't i18n issues, I would suggest to use "empty merge
base" or "empty tree" instead, but the old one would have shown
0{40}, so perhaps empty_tree_oid_hex() is a good substitute?

> @@ -1494,8 +1495,9 @@ int cmd_rebase(int argc, const char **argv, const char 
> *prefix)
>   DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT;
>   opts.detect_rename = DIFF_DETECT_RENAME;
>   diff_setup_done();
> - diff_tree_oid(_base, >object.oid,
> -   "", );
> + diff_tree_oid(is_null_oid(_base) ?
> +   the_hash_algo->empty_tree : _base,
> +   >object.oid, "", );

The original would have run "git diff '' $onto", and died with an
error message, so both the original and the reimplementation were
failing.  Just in different ways ;-)

> diff --git a/git-legacy-rebase.sh b/git-legacy-rebase.sh
> index b97ffdc9dd..be3b241676 100755
> --- a/git-legacy-rebase.sh
> +++ b/git-legacy-rebase.sh
> @@ -718,10 +718,12 @@ if test -n "$diffstat"
>  then
>   if test -n "$verbose"
>   then
> - echo "$(eval_gettext "Changes from \$mb to \$onto:")"
> + mb_display="${mb:-(empty)}"
> + echo "$(eval_gettext "Changes from \$mb_display to \$onto:")"
>   fi
> + mb_tree="${mb:-$(git hash-object -t tree /dev/null)}"
>   # We want color (if set), but no pager
> - GIT_PAGER='' git diff --stat --summary "$mb" "$onto"
> + GIT_PAGER='' git diff --stat --summary "$mb_tree" "$onto"

Code fix for diff-tree invocation, both in the builtin/ version and
this one, looks correct.


Re: [PATCH] config.mak.dev: enable -Wpedantic in clang

2018-11-28 Thread Junio C Hamano
Eric Sunshine  writes:

> Playing Devi's Advocate, what if Apple's clang "8" was, in reality,
> real-world clang 3? Then this condition would incorrectly enable the
> compiler option on Apple for a (real) clang version below 4. For this
> reason, it seems we shouldn't be trusting only the clang version
> number when dealing with Apple.
>
> (I suspect that it won't make a difference in actual practice, so it
> may be reasonable to punt on this issue until/if someone complains.)

Why do we care which true version of clang is being used here in the
first place?  Is it because some version of clang (take -Wpedantic
but misbehave | barf when -Wpedantic is given) while some others are
OK?

If the only problem is that some version of clang barf when the
option is given, perhaps we can do a trial-compile of helloworld.c
or something, or is that something we are trying to avoid in the
first place?

It appears to me that ./detect-compiler tool (by the way, perhaps we
should start moving these build-helper-tools sitting at the top level
down to ./build-helpers/ subdirectory or something so that we can focus
on the source code when running "ls" at the top level of the hierarchy)
can become more intelligent to help things like this.


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-28 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> I don't think something like the endgame you've described in
> https://public-inbox.org/git/xmqqzhtwuhpc@gitster-ct.c.googlers.com/
> is ever going to work. Novice git users (the vast majority) are not
> going to diligently update both .gitignore and some .gitattribute
> mechanism in lockstep.

That goes both ways, no?  Forcing people to add the same pattern,
e.g. *.o, to .gitignore and .gitattribute to say they are
expendable, when most of the time they are, is not something you
should expect from the normal population.

>> I would think that a proper automation needs per-path hint from the
>> user and/or the project, not just a single-size-fits-all --force
>> option, and "unlike all the *.o ignored files that are expendable,
>> this vendor-supplied-object.o is not" is one way to give such a
>> per-path hint.
>>
>>> This would give scripts which relied on our stable plumbing consistent
>>> behavior, while helping users who're using our main porcelain not to
>>> lose data. I could then add a --force option to the likes of read-tree
>>> (on by default), so you could get porcelain-like behavior with
>>> --no-force.
>>
>> At that low level, I suspect that a single size fits all "--force"
>> would work even less well.
>
> Yeah I don't think the one-size-fits-all way out of this is a single
> --force flag.

Yes, indeed.  That's why I prefer the "precious" bit.  The system
would behave the same way with or without it, but projects (not
individual endusers) can take advantage of the feature if they
wanted to.



Re: [PATCH] rebase: mark the C reimplementation as an experimental opt-in feature (was Re: [ANNOUNCE] Git v2.20.0-rc1)

2018-11-28 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> Since I raised this 'should we hold off?' I thought I'd chime in and say
> that I'm fine with going along with what you suggest and having the
> builtin as the default in the final. IOW not merge
> jc/postpone-rebase-in-c down.

OK.


Re: [PATCH 0/5] Add a new "sparse" tree walk algorithm

2018-11-28 Thread Derrick Stolee

On 11/28/2018 5:18 PM, Ævar Arnfjörð Bjarmason wrote:

This is really interesting. I tested this with:

 diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
 index 124b1bafc4..5c7615f06c 100644
 --- a/builtin/pack-objects.c
 +++ b/builtin/pack-objects.c
 @@ -3143 +3143 @@ static void get_object_list(int ac, const char **av)
 -   mark_edges_uninteresting(, show_edge, sparse);
 +   mark_edges_uninteresting(, show_edge, 1);

To emulate having a GIT_TEST_* mode for this, which seems like a good
idea since it turned up a lot of segfaults in pack-objects. I wasn't
able to get a backtrace for that since it always happens indirectly, and
I didn't dig enough to see how to manually invoke it the right way.


Thanks for double-checking this. I had run a similar test in my 
prototype implementation, but over-simplified some code when rewriting 
it for submission (and then forgot to re-run that test).


Specifically, these null checks are important:

diff --git a/list-objects.c b/list-objects.c
index 9bb93d1640..7e864b4db8 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -232,6 +232,10 @@ static void add_edge_parents(struct commit *commit,
    for (parents = commit->parents; parents; parents = parents->next) {
    struct commit *parent = parents->item;
    struct tree *tree = get_commit_tree(parent);
+
+   if (!tree)
+   continue;
+
    oidset_insert(set, >object.oid);

    if (!(parent->object.flags & UNINTERESTING))
@@ -261,6 +265,8 @@ void mark_edges_uninteresting(struct rev_info *revs,

    if (sparse) {
    struct tree *tree = get_commit_tree(commit);
+   if (!tree)
+   continue;

    if (commit->object.flags & UNINTERESTING)
    tree->object.flags |= UNINTERESTING;

I will definitely include a GIT_TEST_* variable in v2.

Thanks,

-Stolee



Re: [PATCH 2/2] format-patch: allow for independent diff & range-diff options

2018-11-28 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> +   [--range-diff]]

Let's make sure a random string thrown at this mechanism will
properly get noticed and diagnosed.

> @@ -257,6 +258,13 @@ feeding the result to `git send-email`.
>   creation/deletion cost fudge factor. See linkgit:git-range-diff[1])
>   for details.
>  
> +--range-diff::
> + Other options prefixed with `--range-diff` are stripped of
> + that prefix and passed as-is to the diff machinery used to
> + generate the range-diff, e.g. `--range-diff-U0` and
> + `--range-diff--no-color`. This allows for adjusting the format
> + of the range-diff independently from the patch itself.

Taking anything is of course the most general, but I am afraid if
this backfires if there are some options that do not make sense to
be different between the invocations of range-diff and format-patch.

> @@ -1689,8 +1688,32 @@ int cmd_format_patch(int argc, const char **argv, 
> const char *prefix)
>   rev.preserve_subject = keep_subject;
>  
>   argc = setup_revisions(argc, argv, , _r_opt);
> - if (argc > 1)
> - die(_("unrecognized argument: %s"), argv[1]);
> + if (argc > 1) {
> + struct argv_array args = ARGV_ARRAY_INIT;
> + const char *prefix = "--range-diff";

Please call that anything but "prefix" that hides the parameter to
the function.  

const char *range_diff_opt = "--range-diff";

might work OK, or it might not.  Let's read on.

> + int have_prefix = 0;
> +
> + for (i = 0; i < argc; i++) {
> + struct strbuf sb = STRBUF_INIT;
> + char *str;
> +
> + strbuf_addstr(, argv[i]);
> + if (starts_with(argv[i], prefix)) {
> + have_prefix = 1;
> + strbuf_remove(, 0, strlen(prefix));
> + }
> + str = strbuf_detach(, NULL);
> + strbuf_release();
> +
> + argv_array_push(, str);
> + }
> +

Is the body of the loop essentially this?

char *passopt = argv[i];
if (!skip_prefix(passopt, range_diff_opt, ))
saw_range_diff_opt = 1;
argv_array_push(, xstrdup(passopt));

We only use that "prefix" thing once, so we may not even need the
variable.

> + if (!have_prefix)
> + die(_("unrecognized argument: %s"), argv[1]);

So we take normal options and check the leftover args; if there is
no --range-diff among the leftover bits, we pretend that
we stumbled while reading the first such leftover arg.

> + argc = setup_revisions(args.argc, args.argv, _rev, NULL);
> + if (argc > 1)
> + die(_("unrecognized argument: %s"), argv[1]);
> + }

Otherwise, we pass all the leftover bits, which is a random mixture
but guaranteed to have at least one meant for range-diff, to another
setup_revisions().  If it leaves a leftover arg, then that is
diagnosed here, so we'd be OK (iow, this is not a new "attack
vector" to inject random string to command line parser).

One minor glitch I can see is "format-patch --range-diffSilly" would
report "unrecognised arg: Silly".  As we are pretending to be and
reporting errors as format-patch, it would be better if we report
that --range-diffSilly was what we did not understand.

> Junio: I know it's late, but unless Eric has objections to this UI
> change I'd really like to have this in 2.20 since this is a change to
> a new command-line UI that's newly added in 2.20.

Quite honestly, I'd rather document "driving range-diff from
format-patch is experimental and does silly things when given
non-standard options in this release" and not touch the code at this
late stage in the game.  Would it be less intrusive a change to
*not* support the --range-diff option, still use rd_rev
that is separate from the main rev, and use a reasonable hardcoded
default settings when preparing rd_rev?




Re: Forcing GC to always fail

2018-11-28 Thread Bryan Turner
On Wed, Nov 28, 2018 at 5:19 PM Junio C Hamano  wrote:
>
> > Another issue with the canned steps for "git gc" is that it means it
> > can't be used to do specific types of cleanup on a different schedule
> > from others. For example, we use "git pack-refs" directly to
> > frequently pack the refs in our repositories, separate from "git
> > repack" + "git prune" for repacking objects. That allows us to keep
> > our refs packed better without incurring the full overhead of
> > constantly building new packs.
>
> I am not sure if the above is an example of things that are good.
> We keep individual "pack-refs" and "rev-list | pack-objects"
> available exactly to give finer grained control to repository
> owners, and "gc" is meant to be one-size-fits-all easy to run
> by end users.  Adding options to "git gc --no-reflog --pack-refs"
> to complicate it sounds somewhat backwards.

I think we're in agreement there. I was citing the fact that GC isn't
good for targeted maintenance as a reason why we use "pack-refs"
directly, which sounds like what you're saying as well. I don't think
that inflating GC with options to skip specific steps is a good idea,
but that does mean that, for those targeted operations, we need to use
the lower-level commands directly, rather than GC.

Bryan


Re: BUG: CR marker ^M doesn't show up in '-' lines of diffs when the ending of the removed line is CR+LF

2018-11-28 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 27.11.18 um 00:31 schrieb Junio C Hamano:
>> Johannes Sixt  writes:
>>> Am 26.11.18 um 04:04 schrieb Junio C Hamano:
>>> ... this goes too far, IMO. It is the pager's task to decode control
>>> characters.
>>
>> It was tongue-in-cheek suggestion to split a CR into caret-em on our
>> end, but we'd get essentially the same visual effect if we added a
>> rule:
>>
>>  When producing a colored output (not limited to whitespace
>>  error coloring of diff output), insert  before a CR
>>  that comes immediately before a LF.

This was misspoken a bit.  Let's revise it to

When producing a colored output (not limited to whitespace
error coloring of diff output) for contents that are not
marked as eol=crlf (and other historical means), insert
 before a CR that comes immediately before a LF.

to exempt CR in CRLF sequence that the contents and the user agree
to be part of the end-of-line marker.

> I wouldn't want that to happen for all output (context lines, - lines,
> + lines): I really am not interested to see all the CRs in my CRLF
> files.

So your CRLF files will not give caret-em.

>> And a good thing is that I do not think that new rule is doing any
>> decode of control chars on our end.  We are just producing colored
>> output normally.
>
> But we already have it, as Brian pointed out:
>
>git diff --ws-error-highlight=old,new
>
> or by setting diff.wsErrorHighlight accordingly.

Not really.  If the context lines end with CRLF and the material is
not CRLF, I do not think CR at the end of them should be highlighted
as whitespace errors (as we tell to detect errors on "old" and "new"
lines), but I think we still should make an effort to help them be
visible to the users.  Otherwise, next Frank will come and complain
after seeing

-something
 some common thing
+something_new^M

With the suggested change, you can distinguish the following two
(and use your imagination that there are many "some common thing"
lines), which would help the user guide if the file should be marked
as CRLF file, or the contents mistakenly has some lines that end
with CRLF by mistake.  The corrective action between the two cases
would vastly be different.

-something^M-something  
 some common thing^M some common thing
 some common thing^M some common thing
 some common thing^M some common thing
+something_new^M+something_new^M  

Without, both would look like the RHS of the illustration, and with
the "highlight both", you'd only get an extra caret-em on the
removed line, and need to judge without the help from common context
lines.

Besides, --ws-error-highlight shows all whitespace errors, and
showing CR as caret-em is mere side effect of the interaction
between its coloring and the pager.



Re: in 2.19.2 t0061-run-command FAILs if . is in $PATH

2018-11-28 Thread Junio C Hamano
Johannes Schindelin  writes:

> -test_expect_success 'run_command is restricted to PATH' '
> +test_lazy_prereq DOT_IN_PATH '
> + case ":$PATH:" in
> + *:.:*) true;;
> + *) false;;
> + esac
> +'

An empty element in the colon-separated list also serves as an
instruction to pick up executable from $cwd, so

case ":$PATH:" in
*:.:** | *::*) true ;;
*) false ;;
esac

perhaps.

> +test_expect_success !DOT_IN_PATH 'run_command is restricted to PATH' '
>   write_script should-not-run <<-\EOF &&
>   echo yikes
>   EOF
> -- snap --
>
> If so, can you please provide a commit message for it (you can add my
> Signed-off-by: line and your Tested-by: line).
>
> Thanks,
> Johannes


Re: Forcing GC to always fail

2018-11-28 Thread Junio C Hamano
Bryan Turner  writes:

> For us, the biggest issue was "git gc"'s insistence on trying to run
> "git reflog expire". That triggers locking behaviors that resulted in
> very frequent GC failures--and the only reflogs Bitbucket Server (by
> default) creates are all configured to never ex[ire or be pruned, so
> the effort is all wasted anyway.

Detecting that the expiry threshold is set to "never" before
spending cycles and seeks to sift between old and new and not
spawning the expire command?

This seems like an obvious low-hanging fruit to me.

> Another issue with the canned steps for "git gc" is that it means it
> can't be used to do specific types of cleanup on a different schedule
> from others. For example, we use "git pack-refs" directly to
> frequently pack the refs in our repositories, separate from "git
> repack" + "git prune" for repacking objects. That allows us to keep
> our refs packed better without incurring the full overhead of
> constantly building new packs.

I am not sure if the above is an example of things that are good.
We keep individual "pack-refs" and "rev-list | pack-objects"
available exactly to give finer grained control to repository
owners, and "gc" is meant to be one-size-fits-all easy to run
by end users.  Adding options to "git gc --no-reflog --pack-refs"
to complicate it sounds somewhat backwards.


Re: [PATCH 09/10] fetch: try fetching submodules if needed objects were not fetched

2018-11-28 Thread Stefan Beller
On Fri, Oct 26, 2018 at 1:41 PM Jonathan Tan  wrote:
>
> > But this default fetch is not sufficient, as a newly fetched commit in
> > the superproject could point to a commit in the submodule that is not
> > in the default refspec. This is common in workflows like Gerrit's.
> > When fetching a Gerrit change under review (from refs/changes/??), the
> > commits in that change likely point to submodule commits that have not
> > been merged to a branch yet.
> >
> > Try fetching a submodule by object id if the object id that the
> > superproject points to, cannot be found.
>
> I see that these suggestions of mine (from [1]) was implemented, but not
> others. If you disagree, that's fine, but I think they should be
> discussed.

ok.

>
> > - if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
> > - (recurse_submodules != RECURSE_SUBMODULES_ON))
> > + if (recurse_submodules != RECURSE_SUBMODULES_OFF)
>
> I think the next patch should be squashed into this patch. Then you can
> say that these are now redundant and can be removed.

ok.

>
> > @@ -1218,8 +1218,12 @@ struct submodule_parallel_fetch {
> >   int result;
> >
> >   struct string_list changed_submodule_names;
> > + struct get_next_submodule_task **fetch_specific_oids;
> > + int fetch_specific_oids_nr, fetch_specific_oids_alloc;
> >  };
>
> Add documentation for fetch_specific_oids. Also, it might be better to
> call these oid_fetch_tasks and the struct, "struct fetch_task".

ok.

> Here, struct get_next_submodule_task is used for 2 different things:
>  (1) After the first round fetch, fetch_finish() uses it to determine if
>  a second round is needed.
>  (2) In submodule_parallel_fetch.fetch_specific_oids, to tell the
>  parallel runner (through get_next_submodule_task()) to do this
>  fetch.
>
> I think that it's better to have 2 different structs for these 2
> different uses. (Note that task_cb can be NULL for the second round.
> Unless we plan to recheck the OIDs? Currently we recheck them, but we
> don't do anything either way.)

I think it is easier to only have one struct until we have substantially
more to communicate. (1) is a boolean answer, for which (non-)NULL
is sufficient.


> I think that this is best described as the submodule that has no entry
> in .gitmodules? Maybe call it "get_non_gitmodules_submodule" and
> document it with a similar comment as in get_submodule_repo_for().

done.

>
> > +
> > +static struct get_next_submodule_task *get_next_submodule_task_create(
> > + struct repository *r, const char *path)
> > +{
> > + struct get_next_submodule_task *task = xmalloc(sizeof(*task));
> > + memset(task, 0, sizeof(*task));
> > +
> > + task->sub = submodule_from_path(r, _oid, path);
> > + if (!task->sub) {
> > + task->sub = get_default_submodule(path);
> > + task->free_sub = 1;
> > + }
> > +
> > + return task;
> > +}
>
> Clearer to me would be to make get_next_submodule_task_create() return
> NULL if submodule_from_path() returns NULL.

I doubled down on this one and return NULL when get_default_submodule
(now renamed to get_non_gitmodules_submodule) returns NULL, as then we
can move the free() from get_next_submodule here and there we'll just have

task = fetch_task_create(spf->r, ce->name);
if (!task)
continue;

which helps get_next_submodule to stay readable.


> Same comment about "on-demand" as in my previous e-mail.

I'd want to push back on refactoring and defer that to a later series.

> Break lines to 80.
[...]
> Same comment about "s--h" as in my previous e-mail.

done

> > + /* Are there commits that do not exist? */
> > + if (commits->nr) {
> > + /* We already tried fetching them, do not try again. */
> > + if (task->commits)
> > + return 0;
>
> Same comment about "task->commits" as in my previous e-mail.

Good call. I reordered the function read easier and added a comment
on any early return how it could happen.

>
> > diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
> > index 6c2f9b2ba2..5a75b57852 100755
>
> One more thing to test is the case where a submodule doesn't have a
> .gitmodules entry.

added a test.

I just resent the series.

Stefan


[PATCH 4/9] submodule.c: tighten scope of changed_submodule_names struct

2018-11-28 Thread Stefan Beller
The `changed_submodule_names` are only used for fetching, so let's make it
part of the struct that is passed around for fetching submodules.

Signed-off-by: Stefan Beller 
---
 submodule.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/submodule.c b/submodule.c
index 3c388f85cc..f93f0aff82 100644
--- a/submodule.c
+++ b/submodule.c
@@ -25,7 +25,6 @@
 #include "commit-reach.h"
 
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
-static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
 static struct oid_array ref_tips_before_fetch;
 static struct oid_array ref_tips_after_fetch;
@@ -1136,7 +1135,8 @@ void check_for_new_submodule_commits(struct object_id 
*oid)
oid_array_append(_tips_after_fetch, oid);
 }
 
-static void calculate_changed_submodule_paths(struct repository *r)
+static void calculate_changed_submodule_paths(struct repository *r,
+   struct string_list *changed_submodule_names)
 {
struct argv_array argv = ARGV_ARRAY_INIT;
struct string_list changed_submodules = STRING_LIST_INIT_DUP;
@@ -1174,7 +1174,8 @@ static void calculate_changed_submodule_paths(struct 
repository *r)
continue;
 
if (!submodule_has_commits(r, path, commits))
-   string_list_append(_submodule_names, 
name->string);
+   string_list_append(changed_submodule_names,
+  name->string);
}
 
free_submodules_oids(_submodules);
@@ -1221,8 +1222,10 @@ struct submodule_parallel_fetch {
int default_option;
int quiet;
int result;
+
+   struct string_list changed_submodule_names;
 };
-#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0}
+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0, 
STRING_LIST_INIT_DUP }
 
 static int get_fetch_recurse_config(const struct submodule *submodule,
struct submodule_parallel_fetch *spf)
@@ -1284,7 +1287,7 @@ static int get_next_submodule(struct child_process *cp,
case RECURSE_SUBMODULES_ON_DEMAND:
if (!submodule ||
!string_list_lookup(
-   _submodule_names,
+   >changed_submodule_names,
submodule->name))
continue;
default_argv = "on-demand";
@@ -1376,8 +1379,8 @@ int fetch_populated_submodules(struct repository *r,
argv_array_push(, "--recurse-submodules-default");
/* default value, "--submodule-prefix" and its value are added later */
 
-   calculate_changed_submodule_paths(r);
-   string_list_sort(_submodule_names);
+   calculate_changed_submodule_paths(r, _submodule_names);
+   string_list_sort(_submodule_names);
run_processes_parallel(max_parallel_jobs,
   get_next_submodule,
   fetch_start_failure,
@@ -1386,7 +1389,7 @@ int fetch_populated_submodules(struct repository *r,
 
argv_array_clear();
 out:
-   string_list_clear(_submodule_names, 1);
+   string_list_clear(_submodule_names, 1);
return spf.result;
 }
 
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH 9/9] fetch: try fetching submodules if needed objects were not fetched

2018-11-28 Thread Stefan Beller
Currently when git-fetch is asked to recurse into submodules, it dispatches
a plain "git-fetch -C " (with some submodule related options
such as prefix and recusing strategy, but) without any information of the
remote or the tip that should be fetched.

But this default fetch is not sufficient, as a newly fetched commit in
the superproject could point to a commit in the submodule that is not
in the default refspec. This is common in workflows like Gerrit's.
When fetching a Gerrit change under review (from refs/changes/??), the
commits in that change likely point to submodule commits that have not
been merged to a branch yet.

Try fetching a submodule by object id if the object id that the
superproject points to, cannot be found.

builtin/fetch used to only inspect submodules when they were fetched
"on-demand", as in either on/off case it was clear whether the submodule
needs to be fetched. However to know whether we need to try fetching the
object ids, we need to identify the object names, which is done in this
function check_for_new_submodule_commits(), so we'll also run that code
in case the submodule recursion is set to "on".

The submodule checks were done only when a ref in the superproject
changed, these checks were extended to also be performed when fetching
into FETCH_HEAD for completeness, and add a test for that too.

Signed-off-by: Stefan Beller 
---
 builtin/fetch.c |  11 +-
 submodule.c | 206 +++-
 t/t5526-fetch-submodules.sh |  86 +++
 3 files changed, 265 insertions(+), 38 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index e0140327aa..91f9b7d9c8 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -763,9 +763,6 @@ static int update_local_ref(struct ref *ref,
what = _("[new ref]");
}
 
-   if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
-   (recurse_submodules != RECURSE_SUBMODULES_ON))
-   check_for_new_submodule_commits(>new_oid);
r = s_update_ref(msg, ref, 0);
format_display(display, r ? '!' : '*', what,
   r ? _("unable to update local ref") : NULL,
@@ -779,9 +776,6 @@ static int update_local_ref(struct ref *ref,
strbuf_add_unique_abbrev(, >object.oid, 
DEFAULT_ABBREV);
strbuf_addstr(, "..");
strbuf_add_unique_abbrev(, >new_oid, 
DEFAULT_ABBREV);
-   if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
-   (recurse_submodules != RECURSE_SUBMODULES_ON))
-   check_for_new_submodule_commits(>new_oid);
r = s_update_ref("fast-forward", ref, 1);
format_display(display, r ? '!' : ' ', quickref.buf,
   r ? _("unable to update local ref") : NULL,
@@ -794,9 +788,6 @@ static int update_local_ref(struct ref *ref,
strbuf_add_unique_abbrev(, >object.oid, 
DEFAULT_ABBREV);
strbuf_addstr(, "...");
strbuf_add_unique_abbrev(, >new_oid, 
DEFAULT_ABBREV);
-   if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
-   (recurse_submodules != RECURSE_SUBMODULES_ON))
-   check_for_new_submodule_commits(>new_oid);
r = s_update_ref("forced-update", ref, 1);
format_display(display, r ? '!' : '+', quickref.buf,
   r ? _("unable to update local ref") : _("forced 
update"),
@@ -892,6 +883,8 @@ static int store_updated_refs(const char *raw_url, const 
char *remote_name,
ref->force = rm->peer_ref->force;
}
 
+   if (recurse_submodules != RECURSE_SUBMODULES_OFF)
+   check_for_new_submodule_commits(>old_oid);
 
if (!strcmp(rm->name, "HEAD")) {
kind = "";
diff --git a/submodule.c b/submodule.c
index d1b6646f42..1ce944a737 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1231,8 +1231,14 @@ struct submodule_parallel_fetch {
int result;
 
struct string_list changed_submodule_names;
+
+   /* The submodules to fetch in */
+   struct fetch_task **oid_fetch_tasks;
+   int oid_fetch_tasks_nr, oid_fetch_tasks_alloc;
 };
-#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0, 
STRING_LIST_INIT_DUP }
+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0, \
+ STRING_LIST_INIT_DUP, \
+ NULL, 0, 0}
 
 static int get_fetch_recurse_config(const struct submodule *submodule,
struct submodule_parallel_fetch *spf)
@@ -1259,6 +1265,73 @@ static int get_fetch_recurse_config(const struct 
submodule *submodule,
return spf->default_option;
 }
 
+struct fetch_task {
+   struct repository *repo;
+   const struct submodule 

[PATCHv2 0/9] Resending sb/submodule-recursive-fetch-gets-the-tip

2018-11-28 Thread Stefan Beller
This is a resend of sb/submodule-recursive-fetch-gets-the-tip,
with all feedback addressed. As it took some time, I'll send it
without range-diff, but would ask for full review.

I plan on resending after the next release as this got delayed quite a bit,
which is why I also rebased it to master.

Thanks,
Stefan

Previous round:
https://public-inbox.org/git/20181016181327.107186-1-sbel...@google.com/

Stefan Beller (9):
  sha1-array: provide oid_array_filter
  submodule.c: fix indentation
  submodule.c: sort changed_submodule_names before searching it
  submodule.c: tighten scope of changed_submodule_names struct
  submodule: store OIDs in changed_submodule_names
  repository: repo_submodule_init to take a submodule struct
  submodule: migrate get_next_submodule to use repository structs
  submodule.c: fetch in submodules git directory instead of in worktree
  fetch: try fetching submodules if needed objects were not fetched

 Documentation/technical/api-oid-array.txt|   5 +
 builtin/fetch.c  |  11 +-
 builtin/grep.c   |  17 +-
 builtin/ls-files.c   |  12 +-
 builtin/submodule--helper.c  |   2 +-
 repository.c |  27 +-
 repository.h |  12 +-
 sha1-array.c |  17 ++
 sha1-array.h |   3 +
 submodule.c  | 284 ---
 t/helper/test-submodule-nested-repo-config.c |   8 +-
 t/t5526-fetch-submodules.sh  |  86 ++
 12 files changed, 395 insertions(+), 89 deletions(-)

-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH 5/9] submodule: store OIDs in changed_submodule_names

2018-11-28 Thread Stefan Beller
'calculate_changed_submodule_paths' uses a local list to compute the
changed submodules, and then produces the result by copying appropriate
items into the result list.

Instead use the result list directly and prune items afterwards
using string_list_remove_empty_items.

By doing so we'll have access to the util pointer for longer that
contains the commits that we need to fetch, which will be
useful in a later patch.

Signed-off-by: Stefan Beller 
Reviewed-by: Jonathan Tan 
---
 submodule.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/submodule.c b/submodule.c
index f93f0aff82..0c81aca6f2 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1139,8 +1139,7 @@ static void calculate_changed_submodule_paths(struct 
repository *r,
struct string_list *changed_submodule_names)
 {
struct argv_array argv = ARGV_ARRAY_INIT;
-   struct string_list changed_submodules = STRING_LIST_INIT_DUP;
-   const struct string_list_item *name;
+   struct string_list_item *name;
 
/* No need to check if there are no submodules configured */
if (!submodule_from_path(r, NULL, NULL))
@@ -1157,9 +1156,9 @@ static void calculate_changed_submodule_paths(struct 
repository *r,
 * Collect all submodules (whether checked out or not) for which new
 * commits have been recorded upstream in "changed_submodule_names".
 */
-   collect_changed_submodules(r, _submodules, );
+   collect_changed_submodules(r, changed_submodule_names, );
 
-   for_each_string_list_item(name, _submodules) {
+   for_each_string_list_item(name, changed_submodule_names) {
struct oid_array *commits = name->util;
const struct submodule *submodule;
const char *path = NULL;
@@ -1173,12 +1172,14 @@ static void calculate_changed_submodule_paths(struct 
repository *r,
if (!path)
continue;
 
-   if (!submodule_has_commits(r, path, commits))
-   string_list_append(changed_submodule_names,
-  name->string);
+   if (submodule_has_commits(r, path, commits)) {
+   oid_array_clear(commits);
+   *name->string = '\0';
+   }
}
 
-   free_submodules_oids(_submodules);
+   string_list_remove_empty_items(changed_submodule_names, 1);
+
argv_array_clear();
oid_array_clear(_tips_before_fetch);
oid_array_clear(_tips_after_fetch);
@@ -1389,7 +1390,7 @@ int fetch_populated_submodules(struct repository *r,
 
argv_array_clear();
 out:
-   string_list_clear(_submodule_names, 1);
+   free_submodules_oids(_submodule_names);
return spf.result;
 }
 
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH 3/9] submodule.c: sort changed_submodule_names before searching it

2018-11-28 Thread Stefan Beller
We can string_list_insert() to maintain sorted-ness of the
list as we find new items, or we can string_list_append() to
build an unsorted list and sort it at the end just once.

As we do not rely on the sortedness while building the
list, we pick the "append and sort at the end" as it
has better worst case execution times.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 submodule.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/submodule.c b/submodule.c
index bc48ea3b68..3c388f85cc 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1283,7 +1283,7 @@ static int get_next_submodule(struct child_process *cp,
case RECURSE_SUBMODULES_DEFAULT:
case RECURSE_SUBMODULES_ON_DEMAND:
if (!submodule ||
-   !unsorted_string_list_lookup(
+   !string_list_lookup(
_submodule_names,
submodule->name))
continue;
@@ -1377,6 +1377,7 @@ int fetch_populated_submodules(struct repository *r,
/* default value, "--submodule-prefix" and its value are added later */
 
calculate_changed_submodule_paths(r);
+   string_list_sort(_submodule_names);
run_processes_parallel(max_parallel_jobs,
   get_next_submodule,
   fetch_start_failure,
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH 7/9] submodule: migrate get_next_submodule to use repository structs

2018-11-28 Thread Stefan Beller
We used to recurse into submodules, even if they were broken having
only an objects directory. The child process executed in the submodule
would fail though if the submodule was broken. This is tested via
"fetching submodule into a broken repository" in t5526.

This patch tightens the check upfront, such that we do not need
to spawn a child process to find out if the submodule is broken.

Signed-off-by: Stefan Beller 
---
 submodule.c | 56 +
 1 file changed, 44 insertions(+), 12 deletions(-)

diff --git a/submodule.c b/submodule.c
index 0c81aca6f2..77ace5e784 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1253,6 +1253,30 @@ static int get_fetch_recurse_config(const struct 
submodule *submodule,
return spf->default_option;
 }
 
+static struct repository *get_submodule_repo_for(struct repository *r,
+const struct submodule *sub)
+{
+   struct repository *ret = xmalloc(sizeof(*ret));
+
+   if (repo_submodule_init(ret, r, sub)) {
+   /*
+* No entry in .gitmodules? Technically not a submodule,
+* but historically we supported repositories that happen to be
+* in-place where a gitlink is. Keep supporting them.
+*/
+   struct strbuf gitdir = STRBUF_INIT;
+   strbuf_repo_worktree_path(, r, "%s/.git", sub->path);
+   if (repo_init(ret, gitdir.buf, NULL)) {
+   strbuf_release();
+   free(ret);
+   return NULL;
+   }
+   strbuf_release();
+   }
+
+   return ret;
+}
+
 static int get_next_submodule(struct child_process *cp,
  struct strbuf *err, void *data, void **task_cb)
 {
@@ -1260,12 +1284,11 @@ static int get_next_submodule(struct child_process *cp,
struct submodule_parallel_fetch *spf = data;
 
for (; spf->count < spf->r->index->cache_nr; spf->count++) {
-   struct strbuf submodule_path = STRBUF_INIT;
-   struct strbuf submodule_git_dir = STRBUF_INIT;
struct strbuf submodule_prefix = STRBUF_INIT;
const struct cache_entry *ce = spf->r->index->cache[spf->count];
-   const char *git_dir, *default_argv;
+   const char *default_argv;
const struct submodule *submodule;
+   struct repository *repo;
struct submodule default_submodule = SUBMODULE_INIT;
 
if (!S_ISGITLINK(ce->ce_mode))
@@ -1300,15 +1323,11 @@ static int get_next_submodule(struct child_process *cp,
continue;
}
 
-   strbuf_repo_worktree_path(_path, spf->r, "%s", 
ce->name);
-   strbuf_addf(_git_dir, "%s/.git", submodule_path.buf);
strbuf_addf(_prefix, "%s%s/", spf->prefix, ce->name);
-   git_dir = read_gitfile(submodule_git_dir.buf);
-   if (!git_dir)
-   git_dir = submodule_git_dir.buf;
-   if (is_directory(git_dir)) {
+   repo = get_submodule_repo_for(spf->r, submodule);
+   if (repo) {
child_process_init(cp);
-   cp->dir = strbuf_detach(_path, NULL);
+   cp->dir = xstrdup(repo->worktree);
prepare_submodule_repo_env(>env_array);
cp->git_cmd = 1;
if (!spf->quiet)
@@ -1319,10 +1338,23 @@ static int get_next_submodule(struct child_process *cp,
argv_array_push(>args, default_argv);
argv_array_push(>args, "--submodule-prefix");
argv_array_push(>args, submodule_prefix.buf);
+
+   repo_clear(repo);
+   free(repo);
ret = 1;
+   } else {
+   /*
+* An empty directory is normal,
+* the submodule is not initialized
+*/
+   if (S_ISGITLINK(ce->ce_mode) &&
+   !is_empty_dir(ce->name)) {
+   spf->result = 1;
+   strbuf_addf(err,
+   _("Could not access submodule 
'%s'"),
+   ce->name);
+   }
}
-   strbuf_release(_path);
-   strbuf_release(_git_dir);
strbuf_release(_prefix);
if (ret) {
spf->count++;
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH 8/9] submodule.c: fetch in submodules git directory instead of in worktree

2018-11-28 Thread Stefan Beller
Keep the properties introduced in 10f5c52656 (submodule: avoid
auto-discovery in prepare_submodule_repo_env(), 2016-09-01), by fixating
the git directory of the submodule.

Signed-off-by: Stefan Beller 
---
 submodule.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/submodule.c b/submodule.c
index 77ace5e784..d1b6646f42 100644
--- a/submodule.c
+++ b/submodule.c
@@ -494,6 +494,12 @@ void prepare_submodule_repo_env(struct argv_array *out)
 DEFAULT_GIT_DIR_ENVIRONMENT);
 }
 
+static void prepare_submodule_repo_env_in_gitdir(struct argv_array *out)
+{
+   prepare_submodule_repo_env_no_git_dir(out);
+   argv_array_pushf(out, "%s=.", GIT_DIR_ENVIRONMENT);
+}
+
 /* Helper function to display the submodule header line prior to the full
  * summary output. If it can locate the submodule objects directory it will
  * attempt to lookup both the left and right commits and put them into the
@@ -1327,8 +1333,8 @@ static int get_next_submodule(struct child_process *cp,
repo = get_submodule_repo_for(spf->r, submodule);
if (repo) {
child_process_init(cp);
-   cp->dir = xstrdup(repo->worktree);
-   prepare_submodule_repo_env(>env_array);
+   cp->dir = xstrdup(repo->gitdir);
+   prepare_submodule_repo_env_in_gitdir(>env_array);
cp->git_cmd = 1;
if (!spf->quiet)
strbuf_addf(err, "Fetching submodule %s%s\n",
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH 6/9] repository: repo_submodule_init to take a submodule struct

2018-11-28 Thread Stefan Beller
When constructing a struct repository for a submodule for some revision
of the superproject where the submodule is not contained in the index,
it may not be present in the working tree currently either. In that
situation giving a 'path' argument is not useful. Upgrade the
repo_submodule_init function to take a struct submodule instead.
The submodule struct can be obtained via submodule_from_{path, name} or
an artificial submodule struct can be passed in.

While we are at it, rename the repository struct in the repo_submodule_init
function, which is to be initialized, to a name that is not confused with
the struct submodule as easily. Perform such renames in similar functions
as well.

Also move its documentation into the header file.

Reviewed-by: Jonathan Tan 
Signed-off-by: Stefan Beller 
---
 builtin/grep.c   | 17 +++-
 builtin/ls-files.c   | 12 +
 builtin/submodule--helper.c  |  2 +-
 repository.c | 27 
 repository.h | 12 +++--
 t/helper/test-submodule-nested-repo-config.c |  8 +++---
 6 files changed, 43 insertions(+), 35 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 71df52a333..d6bd887b2d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -404,7 +404,10 @@ static int grep_submodule(struct grep_opt *opt, struct 
repository *superproject,
  const struct object_id *oid,
  const char *filename, const char *path)
 {
-   struct repository submodule;
+   struct repository subrepo;
+   const struct submodule *sub = submodule_from_path(superproject,
+ _oid, path);
+
int hit;
 
/*
@@ -420,12 +423,12 @@ static int grep_submodule(struct grep_opt *opt, struct 
repository *superproject,
return 0;
}
 
-   if (repo_submodule_init(, superproject, path)) {
+   if (repo_submodule_init(, superproject, sub)) {
grep_read_unlock();
return 0;
}
 
-   repo_read_gitmodules();
+   repo_read_gitmodules();
 
/*
 * NEEDSWORK: This adds the submodule's object directory to the list of
@@ -437,7 +440,7 @@ static int grep_submodule(struct grep_opt *opt, struct 
repository *superproject,
 * store is no longer global and instead is a member of the repository
 * object.
 */
-   add_to_alternates_memory(submodule.objects->objectdir);
+   add_to_alternates_memory(subrepo.objects->objectdir);
grep_read_unlock();
 
if (oid) {
@@ -462,14 +465,14 @@ static int grep_submodule(struct grep_opt *opt, struct 
repository *superproject,
 
init_tree_desc(, data, size);
hit = grep_tree(opt, pathspec, , , base.len,
-   object->type == OBJ_COMMIT, );
+   object->type == OBJ_COMMIT, );
strbuf_release();
free(data);
} else {
-   hit = grep_cache(opt, , pathspec, 1);
+   hit = grep_cache(opt, , pathspec, 1);
}
 
-   repo_clear();
+   repo_clear();
return hit;
 }
 
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index c70a9c7158..583a0e1ca2 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -206,17 +206,19 @@ static void show_files(struct repository *repo, struct 
dir_struct *dir);
 static void show_submodule(struct repository *superproject,
   struct dir_struct *dir, const char *path)
 {
-   struct repository submodule;
+   struct repository subrepo;
+   const struct submodule *sub = submodule_from_path(superproject,
+ _oid, path);
 
-   if (repo_submodule_init(, superproject, path))
+   if (repo_submodule_init(, superproject, sub))
return;
 
-   if (repo_read_index() < 0)
+   if (repo_read_index() < 0)
die("index file corrupt");
 
-   show_files(, dir);
+   show_files(, dir);
 
-   repo_clear();
+   repo_clear();
 }
 
 static void show_ce(struct repository *repo, struct dir_struct *dir,
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index d38113a31a..4eceb8f040 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2053,7 +2053,7 @@ static int ensure_core_worktree(int argc, const char 
**argv, const char *prefix)
if (!sub)
BUG("We could get the submodule handle before?");
 
-   if (repo_submodule_init(, the_repository, path))
+   if (repo_submodule_init(, the_repository, sub))
die(_("could not get a repository handle for submodule '%s'"), 
path);
 
if (!repo_config_get_string(, "core.worktree", )) {
diff --git a/repository.c b/repository.c
index 

[PATCH 1/9] sha1-array: provide oid_array_filter

2018-11-28 Thread Stefan Beller
Helped-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/technical/api-oid-array.txt |  5 +
 sha1-array.c  | 17 +
 sha1-array.h  |  3 +++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/technical/api-oid-array.txt 
b/Documentation/technical/api-oid-array.txt
index 9febfb1d52..c97428c2c3 100644
--- a/Documentation/technical/api-oid-array.txt
+++ b/Documentation/technical/api-oid-array.txt
@@ -48,6 +48,11 @@ Functions
is not sorted, this function has the side effect of sorting
it.
 
+`oid_array_filter`::
+   Apply the callback function `want` to each entry in the array,
+   retaining only the entries for which the function returns true.
+   Preserve the order of the entries that are retained.
+
 Examples
 
 
diff --git a/sha1-array.c b/sha1-array.c
index b94e0ec0f5..d922e94e3f 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -77,3 +77,20 @@ int oid_array_for_each_unique(struct oid_array *array,
}
return 0;
 }
+
+void oid_array_filter(struct oid_array *array,
+ for_each_oid_fn want,
+ void *cb_data)
+{
+   unsigned nr = array->nr, src, dst;
+   struct object_id *oids = array->oid;
+
+   for (src = dst = 0; src < nr; src++) {
+   if (want([src], cb_data)) {
+   if (src != dst)
+   oidcpy([dst], [src]);
+   dst++;
+   }
+   }
+   array->nr = dst;
+}
diff --git a/sha1-array.h b/sha1-array.h
index 232bf95017..55d016c4bf 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -22,5 +22,8 @@ int oid_array_for_each(struct oid_array *array,
 int oid_array_for_each_unique(struct oid_array *array,
  for_each_oid_fn fn,
  void *data);
+void oid_array_filter(struct oid_array *array,
+ for_each_oid_fn want,
+ void *cbdata);
 
 #endif /* SHA1_ARRAY_H */
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH 2/9] submodule.c: fix indentation

2018-11-28 Thread Stefan Beller
The submodule subsystem is really bad at staying within 80 characters.
Fix it while we are here.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 submodule.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index 6415cc5580..bc48ea3b68 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1271,7 +1271,8 @@ static int get_next_submodule(struct child_process *cp,
if (!submodule) {
const char *name = default_name_or_path(ce->name);
if (name) {
-   default_submodule.path = default_submodule.name 
= name;
+   default_submodule.path = name;
+   default_submodule.name = name;
submodule = _submodule;
}
}
@@ -1281,8 +1282,10 @@ static int get_next_submodule(struct child_process *cp,
default:
case RECURSE_SUBMODULES_DEFAULT:
case RECURSE_SUBMODULES_ON_DEMAND:
-   if (!submodule || 
!unsorted_string_list_lookup(_submodule_names,
-submodule->name))
+   if (!submodule ||
+   !unsorted_string_list_lookup(
+   _submodule_names,
+   submodule->name))
continue;
default_argv = "on-demand";
break;
-- 
2.20.0.rc1.387.gf8505762e3-goog



Re: [PATCH v2 6/7] checkout: split into switch-branch and checkout-files

2018-11-28 Thread Stefan Xenos
More thoughts:

git switch-branch should never detach HEAD unless asked to do so
explicitly. That also means that "git switch-branch" shouldn't accept
any of the non-branch tree-ish arguments that would have caused "git
checkout" to do so.
On Wed, Nov 28, 2018 at 3:26 PM Stefan Xenos  wrote:
>
> Although I have no problem with "switch-branch" as a command name,
> some alternative names we might consider for switch-branch might be:
>
> chbranch
> swbranch
> switch
> branch change (as a subcommand for the "branch" command)
>
> I've personally been using "chbranch" as an alias for this
> functionality for some time.
>
>   - Stefan
> On Wed, Nov 28, 2018 at 3:22 PM Stefan Xenos  wrote:
> >
> > > Since the other one is already "checkout-files", maybe this one could 
> > > just be "checkout-branch".
> >
> > I rather like switch-branch and dislike the word "checkout" since it
> > has been overloaded in git for so long (does it mean moving HEAD or
> > copying files to my working tree?)
> >
> > > nobody will become "sick of" the single "checkout" command that can
> >
> > I have to admit I'm already sick of the checkout command. :-p I can
> > see myself using these two new commands 100% of the time and never
> > missing the old one.
> >
> > Some behaviors I'd expect to see from these commands (I haven't yet
> > checked to see if you've already done this):
> >
> > git checkout-files 
> > should reset all the files in the repository regardless of the current
> > directory - it should produce the same effect as "git reset --hard
> >  && git reset HEAD@{1}". It should also delete
> > locally-created files that aren't present in , such that the
> > final working tree is exactly identical to what was committed in that
> > tree-ish.
> >
> > git checkout-files foo -- myfile.txt
> > should delete myfile.txt if it is present locally but not present in foo.
> >
> > git checkout-files foo -- .
> > should recursively checkout all files in the current folder and all
> > subfolders, and delete any locally-created files if they're not
> > present in foo.
> >
> > git checkout-files should never move HEAD in any circumstance.
> >
> > Suggestion:
> > If git checkout-files overwrites or deletes any locally-modified files
> > from the workspace or index, those files could be auto-stashed. That
> > would make it easy to restore them in the event of a mistyped command.
> > Auto-stashing could be suppressed with a command-line argument (with
> > alternate behaviors being fail-if-modified or always-overwrite).
> >
> > Idea:
> > If git checkout-files modifies the submodules file, it could also
> > auto-update the submodules. (For example, with something like "git
> > submodule update --init --recursive --progress").
> >
> >   - Stefan
> > On Wed, Nov 28, 2018 at 7:31 AM Duy Nguyen  wrote:
> > >
> > > On Wed, Nov 28, 2018 at 7:03 AM Junio C Hamano  wrote:
> > > >
> > > > Nguyễn Thái Ngọc Duy   writes:
> > > >
> > > > > The good old "git checkout" command is still here and will be until
> > > > > all (or most of users) are sick of it.
> > > >
> > > > Two comments on the goal (the implementation looked reasonable
> > > > assuming the reader agrees with the gaol).
> > > >
> > > > At least to me, the verb "switch" needs two things to switch
> > > > between, i.e. "switch A and B", unless it is "switch to X".
> > > > Either "switch-to-branch" or simply "switch-to", perhaps?
> > > >
> > > > As I already hinted in my response to Stefan (?) about
> > > > checkout-from-tree vs checkout-from-index, a command with multiple
> > > > modes of operation is not confusing to people with the right mental
> > > > model, and I suspect that having two separate commands for "checking
> > > > out a branch" and "checking out paths" that is done by this step
> > > > would help users to form the right mental model.
> > >
> > > Since the other one is already "checkout-files", maybe this one could
> > > just be "checkout-branch".
> > >
> > > > So I tend to think
> > > > these two are "training wheels", and suspect that once they got it,
> > > > nobody will become "sick of" the single "checkout" command that can
> > > > be used to do either.  It's just the matter of being aware what can
> > > > be done (which requires the right mental model) and how to tell Git
> > > > what the user wants it do (two separate commands, operating mode
> > > > option, or just the implied command line syntax---once the user
> > > > knows what s/he is doing, these do not make that much a difference).
> > >
> > > I would hope this becomes better defaults and being used 90% of time.
> > > Even though I know "git checkout" quite well, it still bites me from
> > > time to time. Having the right mental model is one thing. Having to
> > > think a bit every time to write "git checkout" with the right syntax,
> > > and whether you need "--" (that ambiguation problem can still bite you
> > > from time to time), is frankly something I'd rather avoid.
> > > --
> > > Duy


Re: [PATCH v2 6/7] checkout: split into switch-branch and checkout-files

2018-11-28 Thread Stefan Xenos
Although I have no problem with "switch-branch" as a command name,
some alternative names we might consider for switch-branch might be:

chbranch
swbranch
switch
branch change (as a subcommand for the "branch" command)

I've personally been using "chbranch" as an alias for this
functionality for some time.

  - Stefan
On Wed, Nov 28, 2018 at 3:22 PM Stefan Xenos  wrote:
>
> > Since the other one is already "checkout-files", maybe this one could just 
> > be "checkout-branch".
>
> I rather like switch-branch and dislike the word "checkout" since it
> has been overloaded in git for so long (does it mean moving HEAD or
> copying files to my working tree?)
>
> > nobody will become "sick of" the single "checkout" command that can
>
> I have to admit I'm already sick of the checkout command. :-p I can
> see myself using these two new commands 100% of the time and never
> missing the old one.
>
> Some behaviors I'd expect to see from these commands (I haven't yet
> checked to see if you've already done this):
>
> git checkout-files 
> should reset all the files in the repository regardless of the current
> directory - it should produce the same effect as "git reset --hard
>  && git reset HEAD@{1}". It should also delete
> locally-created files that aren't present in , such that the
> final working tree is exactly identical to what was committed in that
> tree-ish.
>
> git checkout-files foo -- myfile.txt
> should delete myfile.txt if it is present locally but not present in foo.
>
> git checkout-files foo -- .
> should recursively checkout all files in the current folder and all
> subfolders, and delete any locally-created files if they're not
> present in foo.
>
> git checkout-files should never move HEAD in any circumstance.
>
> Suggestion:
> If git checkout-files overwrites or deletes any locally-modified files
> from the workspace or index, those files could be auto-stashed. That
> would make it easy to restore them in the event of a mistyped command.
> Auto-stashing could be suppressed with a command-line argument (with
> alternate behaviors being fail-if-modified or always-overwrite).
>
> Idea:
> If git checkout-files modifies the submodules file, it could also
> auto-update the submodules. (For example, with something like "git
> submodule update --init --recursive --progress").
>
>   - Stefan
> On Wed, Nov 28, 2018 at 7:31 AM Duy Nguyen  wrote:
> >
> > On Wed, Nov 28, 2018 at 7:03 AM Junio C Hamano  wrote:
> > >
> > > Nguyễn Thái Ngọc Duy   writes:
> > >
> > > > The good old "git checkout" command is still here and will be until
> > > > all (or most of users) are sick of it.
> > >
> > > Two comments on the goal (the implementation looked reasonable
> > > assuming the reader agrees with the gaol).
> > >
> > > At least to me, the verb "switch" needs two things to switch
> > > between, i.e. "switch A and B", unless it is "switch to X".
> > > Either "switch-to-branch" or simply "switch-to", perhaps?
> > >
> > > As I already hinted in my response to Stefan (?) about
> > > checkout-from-tree vs checkout-from-index, a command with multiple
> > > modes of operation is not confusing to people with the right mental
> > > model, and I suspect that having two separate commands for "checking
> > > out a branch" and "checking out paths" that is done by this step
> > > would help users to form the right mental model.
> >
> > Since the other one is already "checkout-files", maybe this one could
> > just be "checkout-branch".
> >
> > > So I tend to think
> > > these two are "training wheels", and suspect that once they got it,
> > > nobody will become "sick of" the single "checkout" command that can
> > > be used to do either.  It's just the matter of being aware what can
> > > be done (which requires the right mental model) and how to tell Git
> > > what the user wants it do (two separate commands, operating mode
> > > option, or just the implied command line syntax---once the user
> > > knows what s/he is doing, these do not make that much a difference).
> >
> > I would hope this becomes better defaults and being used 90% of time.
> > Even though I know "git checkout" quite well, it still bites me from
> > time to time. Having the right mental model is one thing. Having to
> > think a bit every time to write "git checkout" with the right syntax,
> > and whether you need "--" (that ambiguation problem can still bite you
> > from time to time), is frankly something I'd rather avoid.
> > --
> > Duy


Re: [PATCH v2 6/7] checkout: split into switch-branch and checkout-files

2018-11-28 Thread Stefan Xenos
> Since the other one is already "checkout-files", maybe this one could just be 
> "checkout-branch".

I rather like switch-branch and dislike the word "checkout" since it
has been overloaded in git for so long (does it mean moving HEAD or
copying files to my working tree?)

> nobody will become "sick of" the single "checkout" command that can

I have to admit I'm already sick of the checkout command. :-p I can
see myself using these two new commands 100% of the time and never
missing the old one.

Some behaviors I'd expect to see from these commands (I haven't yet
checked to see if you've already done this):

git checkout-files 
should reset all the files in the repository regardless of the current
directory - it should produce the same effect as "git reset --hard
 && git reset HEAD@{1}". It should also delete
locally-created files that aren't present in , such that the
final working tree is exactly identical to what was committed in that
tree-ish.

git checkout-files foo -- myfile.txt
should delete myfile.txt if it is present locally but not present in foo.

git checkout-files foo -- .
should recursively checkout all files in the current folder and all
subfolders, and delete any locally-created files if they're not
present in foo.

git checkout-files should never move HEAD in any circumstance.

Suggestion:
If git checkout-files overwrites or deletes any locally-modified files
from the workspace or index, those files could be auto-stashed. That
would make it easy to restore them in the event of a mistyped command.
Auto-stashing could be suppressed with a command-line argument (with
alternate behaviors being fail-if-modified or always-overwrite).

Idea:
If git checkout-files modifies the submodules file, it could also
auto-update the submodules. (For example, with something like "git
submodule update --init --recursive --progress").

  - Stefan
On Wed, Nov 28, 2018 at 7:31 AM Duy Nguyen  wrote:
>
> On Wed, Nov 28, 2018 at 7:03 AM Junio C Hamano  wrote:
> >
> > Nguyễn Thái Ngọc Duy   writes:
> >
> > > The good old "git checkout" command is still here and will be until
> > > all (or most of users) are sick of it.
> >
> > Two comments on the goal (the implementation looked reasonable
> > assuming the reader agrees with the gaol).
> >
> > At least to me, the verb "switch" needs two things to switch
> > between, i.e. "switch A and B", unless it is "switch to X".
> > Either "switch-to-branch" or simply "switch-to", perhaps?
> >
> > As I already hinted in my response to Stefan (?) about
> > checkout-from-tree vs checkout-from-index, a command with multiple
> > modes of operation is not confusing to people with the right mental
> > model, and I suspect that having two separate commands for "checking
> > out a branch" and "checking out paths" that is done by this step
> > would help users to form the right mental model.
>
> Since the other one is already "checkout-files", maybe this one could
> just be "checkout-branch".
>
> > So I tend to think
> > these two are "training wheels", and suspect that once they got it,
> > nobody will become "sick of" the single "checkout" command that can
> > be used to do either.  It's just the matter of being aware what can
> > be done (which requires the right mental model) and how to tell Git
> > what the user wants it do (two separate commands, operating mode
> > option, or just the implied command line syntax---once the user
> > knows what s/he is doing, these do not make that much a difference).
>
> I would hope this becomes better defaults and being used 90% of time.
> Even though I know "git checkout" quite well, it still bites me from
> time to time. Having the right mental model is one thing. Having to
> think a bit every time to write "git checkout" with the right syntax,
> and whether you need "--" (that ambiguation problem can still bite you
> from time to time), is frankly something I'd rather avoid.
> --
> Duy


Re: [PATCH/RFC v2 0/7] Introduce new commands switch-branch and checkout-files

2018-11-28 Thread Stefan Xenos
I think users have problems with detached heads for several reasons:

1. Users often enter the detached head state unexpectedly (for
example, by mistyping a "checkout" command or not understanding its
multipurpose nature, or as a side-effect of running a submodule
command). The change described here will go in the right direction
towards addressing this, since you won't be able to accidentally
detach your head by mistyping checkout. If detaching was always an
intentional action by the user, it would be much less intimidating.

2. Git shows a lot of scary error messages when running in a detached
head state. They tell you you're doing something dangerous, which
dissuades users from experimenting with the feature. IMO, it would be
better to replace the scary warning messages with instructions on
where to get more information about detached heads (and those
instructions should frontload info on how to reattach to a branch and
how to recover commits from the reflog).

3. When in a detached head state, a lot of commands add extra
confirmation prompts, which are unnecessary and intimidating.

Basically, the current user interface implies to the user that a
detached head is an error condition they'll need to recover from
rather than the normal and useful mode of working that it is.

(I'm resending this email without html - sorry if you received it twice).

So - IMO - detaching should always be an explicit action. Some options
that occur to me:

git switch-branch --detach
git detach
git switch-branch HEAD

  - Stefan

On Wed, Nov 28, 2018 at 2:48 PM Stefan Xenos  wrote:
>
> I think users have problems with detached heads for several reasons:
>
> 1. Users often enter the detached head state unexpectedly (for example, by 
> mistyping a "checkout" command or not understanding its multipurpose nature, 
> or as a side-effect of running a submodule command). The change described 
> here will go in the right direction towards addressing this, since you won't 
> be able to accidentally detach your head by mistyping checkout. If detaching 
> was always an intentional action by the user, it would be much less 
> intimidating.
>
> 2. Git shows a lot of scary error messages when running in a detached head 
> state. They tell you you're doing something dangerous, which dissuades users 
> from experimenting with the feature. IMO, it would be better to replace the 
> scary warning messages with instructions on where to get more information 
> about detached heads (and those instructions should frontload info on how to 
> reattach to a branch and how to recover commits from the reflog).
>
> 3. When in a detached head state, a lot of commands add extra confirmation 
> prompts, which are unnecessary and intimidating.
>
> Basically, the current user interface implies to the user that a detached 
> head is an error condition they'll need to recover from rather than the 
> normal and useful mode of working that it is.
>
>   - Stefan
>
> On Wed, Nov 28, 2018 at 12:01 PM Duy Nguyen  wrote:
>>
>> On Tue, Nov 27, 2018 at 5:53 PM Nguyễn Thái Ngọc Duy  
>> wrote:
>> >
>> > v2 is just a bit better to look at than v1. This is by no means final.
>> > If you think the command name is bad, the default behavior should
>> > change, or something else, speak up. It's still very "RFC".
>> >
>> > v2 breaks down the giant patch in v1 and starts adding some changes in
>> > these new commands:
>> >
>> > - restore-paths is renamed to checkout-paths. I wrote I didn't like
>> >   "checkout" because of completion conflict. But who am I kidding,
>> >   I'll use aliases anyway. "-files" instead of "-paths" because we
>> >   already have ls-files.
>> > - both commands will not accept no arguments. There is no "git
>> >   checkout" equivalent.
>> > - ambiguation rules are now aware that "switch-branch" for example
>> >   can't take pathspec...
>>
>> Another thing. Since we start with a clean slate, should we do
>> something about detached HEAD in this switch-branch command (or
>> whatever its name will be)?
>>
>> This is usually a confusing concept to new users and I've seen some
>> users have a hard time getting out of it. I'm too comfortable with
>> detached HEAD to see this from new user's perspective and a better way
>> to deal with it.
>> --
>> Duy


Re: [PATCH 0/5] Add a new "sparse" tree walk algorithm

2018-11-28 Thread Ævar Arnfjörð Bjarmason


On Wed, Nov 28 2018, Derrick Stolee via GitGitGadget wrote:

> One of the biggest remaining pain points for users of very large
> repositories is the time it takes to run 'git push'. We inspected some slow
> pushes by our developers and found that the "Enumerating Objects" phase of a
> push was very slow. This is unsurprising, because this is why reachability
> bitmaps exist. However, reachability bitmaps are not available to us because
> of the single pack-file requirement. The bitmap approach is intended for
> servers anyway, and clients have a much different behavior pattern.
>
> Specifically, clients are normally pushing a very small number of objects
> compared to the entire working directory. A typical user changes only a
> small cone of the working directory, so let's use that to our benefit.
>
> Create a new "sparse" mode for 'git pack-objects' that uses the paths that
> introduce new objects to direct our search into the reachable trees. By
> collecting trees at each path, we can then recurse into a path only when
> there are uninteresting and interesting trees at that path. This gains a
> significant performance boost for small topics while presenting a
> possibility of packing extra objects.
>
> The main algorithm change is in patch 4, but is set up a little bit in
> patches 1 and 2.
>
> As demonstrated in the included test script, we see that the existing
> algorithm can send extra objects due to the way we specify the "frontier".
> But we can send even more objects if a user copies objects from one folder
> to another. I say "copy" because a rename would (usually) change the
> original folder and trigger a walk into that path, discovering the objects.
>
> In order to benefit from this approach, the user can opt-in using the
> pack.useSparse config setting. This setting can be overridden using the
> '--no-sparse' option.

This is really interesting. I tested this with:

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 124b1bafc4..5c7615f06c 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3143 +3143 @@ static void get_object_list(int ac, const char **av)
-   mark_edges_uninteresting(, show_edge, sparse);
+   mark_edges_uninteresting(, show_edge, 1);

To emulate having a GIT_TEST_* mode for this, which seems like a good
idea since it turned up a lot of segfaults in pack-objects. I wasn't
able to get a backtrace for that since it always happens indirectly, and
I didn't dig enough to see how to manually invoke it the right way.


Re: [PATCH 3/5] pack-objects: add --sparse option

2018-11-28 Thread Stefan Beller
> +--sparse::
> +   Use the "sparse" algorithm to determine which objects to include in
> +   the pack. This can have significant performance benefits when 
> computing
> +   a pack to send a small change. However, it is possible that extra
> +   objects are added to the pack-file if the included commits contain
> +   certain types of direct renames.

As a user, where do I find a discussion of these walking algorithms?
(i.e. how can I tell if I can really expect to gain performance benefits,
what are the tradeoffs? "If it were strictly better than the original,
it would be on by default" might be a thought of a user)


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-28 Thread Ævar Arnfjörð Bjarmason


On Wed, Nov 28 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  writes:
>
>> What do you think about some patch like that which retains the plumbing
>> behavior for things like read-tree, doesn't introduce "precious" or
>> "trashable", and just makes you specify "[checkout|merge|...] --force"
>> in cases where we'd have clobbering?
>
> Whether you like it or not, don't people's automation use tons of
> invocations of "git merge", "git checkout", etc.?  You'd be breaking
> them by such a change.

I'm so sympathetic to this argument that I tried to convince you of
something like this around a year and a half ago:
https://public-inbox.org/git/cacbzzx59kxpoejiuktzln6zjo_xpiwve7xga6q-53j2lwvf...@mail.gmail.com/
:)

I was probing for what your current stance on this sort of thing is,
because discussions like this tend to get bogged down in the irrelevant
distraction of whether something is plumbing or porcelain, which almost
none of our users care about, and we've effectively stopped caring about
ourselves.

But we must have some viable way to repair warts in the tools, and
losing user data is a *big* wart.

I don't think something like the endgame you've described in
https://public-inbox.org/git/xmqqzhtwuhpc@gitster-ct.c.googlers.com/
is ever going to work. Novice git users (the vast majority) are not
going to diligently update both .gitignore and some .gitattribute
mechanism in lockstep. I'd bet most git users haven't read more than a
few paragraphs of our entire documentation at best.

So what's the way forward? I think ultimately we must move to something
where we effectively version the entire CLI UI similar to stable API
versions. I.e. for things like this that would break some things (or
Duy's new "split checkout") introduce them as flags first, then bundle
up all such flags and cut a major release "Git 3, 4, ...", and
eventually remove old functionality.

> Other than that, if we never had Git before and do not have to worry
> about existing users, I'd think it would be a lot closer to the ideal
> than today's system if "checkout  foo.o" rejected overwriting
> "foo.o" that is not tracked in the current index but matches an ignore
> pattern, and required a "--force" option to overwrite it.
>
> A user, during a conflict resolution, may say "I want this 'git
> checkout foo/' to ignore conflicted paths in that directory, so I
> would give "--force" option to it, but now "--force" also implies
> that I am willing to clobber ignored paths, which means I cannot use
> it".
>
> I would think that a proper automation needs per-path hint from the
> user and/or the project, not just a single-size-fits-all --force
> option, and "unlike all the *.o ignored files that are expendable,
> this vendor-supplied-object.o is not" is one way to give such a
> per-path hint.
>
>> This would give scripts which relied on our stable plumbing consistent
>> behavior, while helping users who're using our main porcelain not to
>> lose data. I could then add a --force option to the likes of read-tree
>> (on by default), so you could get porcelain-like behavior with
>> --no-force.
>
> At that low level, I suspect that a single size fits all "--force"
> would work even less well.

Yeah I don't think the one-size-fits-all way out of this is a single
--force flag.


[PATCH 4/5] revision: implement sparse algorithm

2018-11-28 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

When enumerating objects to place in a pack-file during 'git
pack-objects --revs', we discover the "frontier" of commits
that we care about and the boundary with commit we find
uninteresting. From that point, we walk trees to discover which
trees and blobs are uninteresting. Finally, we walk trees to find
the interesting trees.

This commit introduces a new, "sparse" way to discover the
uninteresting trees. We use the perspective of a single user trying
to push their topic to a large repository. That user likely changed
a very small fraction of the paths in their working directory, but
we spend a lot of time walking all reachable trees.

The way to switch the logic to work in this sparse way is to start
caring about which paths introduce new trees. While it is not
possible to generate a diff between the frontier boundary and all
of the interesting commits, we can simulate that behavior by
inspecting all of the root trees as a whole, then recursing down
to the set of trees at each path.

We already had taken the first step by passing an oidset to
mark_trees_uninteresting_sparse(). We now create a dictionary
whose keys are paths and values are oidsets. We consider the set
of trees that appear at each path. While we inspect a tree, we
add its subtrees to the oidsets corresponding to the tree entry's
path. We also mark trees as UNINTERESTING if the tree we are
parsing is UNINTERESTING.

To actually improve the peformance, we need to terminate our
recursion unless the oidset contains some intersting trees and
some uninteresting trees. Technically, we only need one interesting
tree for this to speed up in most cases, but we also will not mark
anything UNINTERESTING if there are no uninteresting trees, so
that would be wasted effort.

There are a few ways that this is not a universally better option.

First, we can pack extra objects. If someone copies a subtree
from one tree to another, the first tree will appear UNINTERESTING
and we will not recurse to see that the subtree should also be
UNINTERESTING. We will walk the new tree and see the subtree as
a "new" object and add it to the pack. We add a test case that
demonstrates this as a way to prove that the --sparse option is
actually working.

Second, we can have extra memory pressure. If instead of being a
single user pushing a small topic we are a server sending new
objects from across the entire working directory, then we will
gain very little (the recursion will rarely terminate early) but
will spend extra time maintaining the path-oidset dictionaries.

Despite these potential drawbacks, the benefits of the algorithm
are clear. By adding a counter to 'add_children_by_path' and
'mark_tree_contents_uninteresting', I measured the number of
parsed trees for the two algorithms in a variety of repos.

For git.git, I used the following input:

v2.19.0
^v2.19.0~10

 Objects to pack: 550
Walked (old alg): 282
Walked (new alg): 130

For the Linux repo, I used the following input:

v4.18
^v4.18~10

 Objects to pack:   518
Walked (old alg): 4,836
Walked (new alg):   188

The two repos above are rather "wide and flat" compared to
other repos that I have used in the past. As a comparison,
I tested an old topic branch in the Azure DevOps repo, which
has a much deeper folder structure than the Linux repo.

 Objects to pack:220
Walked (old alg): 22,804
Walked (new alg):129

I used the number of walked trees the main metric above because
it is consistent across multiple runs. When I ran my tests, the
performance of the pack-objects command with the same options
could change the end-to-end time by 10x depending on the file
system being warm. However, by repeating the same test on repeat
I could get more consistent timing results. The git.git and
Linux tests were too fast overall (less than 0.5s) to measure
an end-to-end difference. The Azure DevOps case was slow enough
to see the time improve from 15s to 1s in the warm case. The
cold case was 90s to 9s in my testing.

These improvements will have even larger benefits in the super-
large Windows repository. In our experiments, we see the
"Enumerate objects" phase of pack-objects taking 60-80% of the
end-to-end time of non-trivial pushes, taking longer than the
network time to send the pack and the server time to verify the
pack.

Signed-off-by: Derrick Stolee 
---
 revision.c | 111 ++---
 t/t5322-pack-objects-sparse.sh |  21 +--
 2 files changed, 116 insertions(+), 16 deletions(-)

diff --git a/revision.c b/revision.c
index 3a62c7c187..7e4bfe621a 100644
--- a/revision.c
+++ b/revision.c
@@ -99,26 +99,117 @@ void mark_tree_uninteresting(struct repository *r, struct 
tree *tree)
mark_tree_contents_uninteresting(r, tree);
 }
 
+struct paths_and_oids {
+   struct string_list list;
+};
+
+static void paths_and_oids_init(struct paths_and_oids *po)
+{
+   string_list_init(>list, 1);
+}
+
+static void 

[PATCH 0/5] Add a new "sparse" tree walk algorithm

2018-11-28 Thread Derrick Stolee via GitGitGadget
One of the biggest remaining pain points for users of very large
repositories is the time it takes to run 'git push'. We inspected some slow
pushes by our developers and found that the "Enumerating Objects" phase of a
push was very slow. This is unsurprising, because this is why reachability
bitmaps exist. However, reachability bitmaps are not available to us because
of the single pack-file requirement. The bitmap approach is intended for
servers anyway, and clients have a much different behavior pattern.

Specifically, clients are normally pushing a very small number of objects
compared to the entire working directory. A typical user changes only a
small cone of the working directory, so let's use that to our benefit.

Create a new "sparse" mode for 'git pack-objects' that uses the paths that
introduce new objects to direct our search into the reachable trees. By
collecting trees at each path, we can then recurse into a path only when
there are uninteresting and interesting trees at that path. This gains a
significant performance boost for small topics while presenting a
possibility of packing extra objects.

The main algorithm change is in patch 4, but is set up a little bit in
patches 1 and 2.

As demonstrated in the included test script, we see that the existing
algorithm can send extra objects due to the way we specify the "frontier".
But we can send even more objects if a user copies objects from one folder
to another. I say "copy" because a rename would (usually) change the
original folder and trigger a walk into that path, discovering the objects.

In order to benefit from this approach, the user can opt-in using the
pack.useSparse config setting. This setting can be overridden using the
'--no-sparse' option.

Derrick Stolee (5):
  revision: add mark_tree_uninteresting_sparse
  list-objects: consume sparse tree walk
  pack-objects: add --sparse option
  revision: implement sparse algorithm
  pack-objects: create pack.useSparse setting

 Documentation/git-pack-objects.txt |   9 +-
 bisect.c   |   2 +-
 builtin/pack-objects.c |   9 +-
 builtin/rev-list.c |   2 +-
 http-push.c|   2 +-
 list-objects.c |  51 ++-
 list-objects.h |   4 +-
 revision.c | 113 +++
 revision.h |   2 +
 t/t5322-pack-objects-sparse.sh | 139 +
 10 files changed, 323 insertions(+), 10 deletions(-)
 create mode 100755 t/t5322-pack-objects-sparse.sh


base-commit: a1598010f775d82b5adf12c29d0f5bc9b41434c6
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-89%2Fderrickstolee%2Fpush%2Fsparse-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-89/derrickstolee/push/sparse-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/89
-- 
gitgitgadget


[PATCH 3/5] pack-objects: add --sparse option

2018-11-28 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

Add a '--sparse' option flag to the pack-objects builtin. This
allows the user to specify that they want to use the new logic
for walking trees. This logic currently does not differ from the
existing output, but will in a later change.

Create a new test script, t5322-pack-objects-sparse.sh, to ensure
the object list that is selected matches what we expect. When we
update the logic to walk in a sparse fashion, the final test will
be updated to show the extra objects that are added.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-pack-objects.txt |   9 ++-
 builtin/pack-objects.c |   5 +-
 t/t5322-pack-objects-sparse.sh | 115 +
 3 files changed, 127 insertions(+), 2 deletions(-)
 create mode 100755 t/t5322-pack-objects-sparse.sh

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index 40c825c381..ced2630eb3 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -14,7 +14,7 @@ SYNOPSIS
[--local] [--incremental] [--window=] [--depth=]
[--revs [--unpacked | --all]] [--keep-pack=]
[--stdout [--filter=] | base-name]
-   [--shallow] [--keep-true-parents] < object-list
+   [--shallow] [--keep-true-parents] [--sparse] < object-list
 
 
 DESCRIPTION
@@ -196,6 +196,13 @@ depth is 4095.
Add --no-reuse-object if you want to force a uniform compression
level on all data no matter the source.
 
+--sparse::
+   Use the "sparse" algorithm to determine which objects to include in
+   the pack. This can have significant performance benefits when computing
+   a pack to send a small change. However, it is possible that extra
+   objects are added to the pack-file if the included commits contain
+   certain types of direct renames.
+
 --thin::
Create a "thin" pack by omitting the common objects between a
sender and a receiver in order to reduce network transfer. This
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5f70d840a7..7d5b0735e3 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -84,6 +84,7 @@ static unsigned long pack_size_limit;
 static int depth = 50;
 static int delta_search_threads;
 static int pack_to_stdout;
+static int sparse;
 static int thin;
 static int num_preferred_base;
 static struct progress *progress_state;
@@ -3135,7 +3136,7 @@ static void get_object_list(int ac, const char **av)
 
if (prepare_revision_walk())
die(_("revision walk setup failed"));
-   mark_edges_uninteresting(, show_edge, 0);
+   mark_edges_uninteresting(, show_edge, sparse);
 
if (!fn_show_object)
fn_show_object = show_object;
@@ -3292,6 +3293,8 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
{ OPTION_CALLBACK, 0, "unpack-unreachable", NULL, N_("time"),
  N_("unpack unreachable objects newer than "),
  PARSE_OPT_OPTARG, option_parse_unpack_unreachable },
+   OPT_BOOL(0, "sparse", ,
+N_("use the sparse reachability algorithm")),
OPT_BOOL(0, "thin", ,
 N_("create thin packs")),
OPT_BOOL(0, "shallow", ,
diff --git a/t/t5322-pack-objects-sparse.sh b/t/t5322-pack-objects-sparse.sh
new file mode 100755
index 00..81f6805bc3
--- /dev/null
+++ b/t/t5322-pack-objects-sparse.sh
@@ -0,0 +1,115 @@
+#!/bin/sh
+
+test_description='pack-objects object selection using sparse algorithm'
+. ./test-lib.sh
+
+test_expect_success 'setup repo' '
+   test_commit initial &&
+   for i in $(test_seq 1 3)
+   do
+   mkdir f$i &&
+   for j in $(test_seq 1 3)
+   do
+   mkdir f$i/f$j &&
+   echo $j >f$i/f$j/data.txt
+   done
+   done &&
+   git add . &&
+   git commit -m "Initialized trees" &&
+   for i in $(test_seq 1 3)
+   do
+   git checkout -b topic$i master &&
+   echo change-$i >f$i/f$i/data.txt &&
+   git commit -a -m "Changed f$i/f$i/data.txt"
+   done &&
+   cat >packinput.txt <<-EOF &&
+   topic1
+   ^topic2
+   ^topic3
+   EOF
+   git rev-parse   \
+   topic1  \
+   topic1^{tree}   \
+   topic1:f1   \
+   topic1:f1/f1\
+   topic1:f1/f1/data.txt | sort >expect_objects.txt
+'
+
+test_expect_success 'non-sparse pack-objects' '
+   git pack-objects --stdout --revs nonsparse.pack &&
+   git index-pack -o nonsparse.idx nonsparse.pack &&
+   git show-index nonsparse_objects.txt &&
+   test_cmp expect_objects.txt nonsparse_objects.txt
+'
+
+test_expect_success 'sparse pack-objects' '
+   git pack-objects --stdout --revs --sparse sparse.pack &&
+ 

[PATCH 5/5] pack-objects: create pack.useSparse setting

2018-11-28 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

The '--sparse' flag in 'git pack-objects' changes the algorithm
used to enumerate objects to one that is faster for individual
users pushing new objects that change only a small cone of the
working directory. The sparse algorithm is not recommended for a
server, which likely sends new objects that appear across the
entire working directory.

Create a 'pack.useSparse' setting that enables this new algorithm.
This allows 'git push' to use this algorithm without passing a
'--sparse' flag all the way through four levels of run_command()
calls.

If the '--no-sparse' flag is set, then this config setting is
overridden.

Signed-off-by: Derrick Stolee 
---
 builtin/pack-objects.c |  4 
 t/t5322-pack-objects-sparse.sh | 15 +++
 2 files changed, 19 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7d5b0735e3..124b1bafc4 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2711,6 +2711,10 @@ static int git_pack_config(const char *k, const char *v, 
void *cb)
use_bitmap_index_default = git_config_bool(k, v);
return 0;
}
+   if (!strcmp(k, "pack.usesparse")) {
+   sparse = git_config_bool(k, v);
+   return 0;
+   }
if (!strcmp(k, "pack.threads")) {
delta_search_threads = git_config_int(k, v);
if (delta_search_threads < 0)
diff --git a/t/t5322-pack-objects-sparse.sh b/t/t5322-pack-objects-sparse.sh
index 45dba6e014..8f5699bd91 100755
--- a/t/t5322-pack-objects-sparse.sh
+++ b/t/t5322-pack-objects-sparse.sh
@@ -121,4 +121,19 @@ test_expect_success 'sparse pack-objects' '
test_cmp expect_sparse_objects.txt sparse_objects.txt
 '
 
+test_expect_success 'pack.useSparse enables algorithm' '
+   git config pack.useSparse true &&
+   git pack-objects --stdout --revs sparse.pack &&
+   git index-pack -o sparse.idx sparse.pack &&
+   git show-index sparse_objects.txt &&
+   test_cmp expect_sparse_objects.txt sparse_objects.txt
+'
+
+test_expect_success 'pack.useSparse overridden' '
+   git pack-objects --stdout --revs --no-sparse sparse.pack &&
+   git index-pack -o sparse.idx sparse.pack &&
+   git show-index sparse_objects.txt &&
+   test_cmp expect_objects.txt sparse_objects.txt
+'
+
 test_done
-- 
gitgitgadget


[PATCH 2/5] list-objects: consume sparse tree walk

2018-11-28 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

When creating a pack-file using 'git pack-objects --revs' we provide
a list of interesting and uninteresting commits. For example, a push
operation would make the local topic branch be interesting and the
known remote refs as uninteresting. We want to discover the set of
new objects to send to the server as a thin pack.

We walk these commits until we discover a frontier of commits such
that every commit walk starting at interesting commits ends in a root
commit or unintersting commit. We then need to discover which
non-commit objects are reachable from  uninteresting commits.

The mark_edges_unintersting() method in list-objects.c iterates on
the commit list and does the following:

* If the commit is UNINTERSTING, then mark its root tree and every
  object it can reach as UNINTERESTING.

* If the commit is interesting, then mark the root tree of every
  UNINTERSTING parent (and all objects that tree can reach) as
  UNINTERSTING.

At the very end, we repeat the process on every commit directly
given to the revision walk from stdin. This helps ensure we properly
cover shallow commits that otherwise were not included in the
frontier.

The logic to recursively follow trees is in the
mark_tree_uninteresting() method in revision.c. The algorithm avoids
duplicate work by not recursing into trees that are already marked
UNINTERSTING.

Add a new 'sparse' option to the mark_edges_uninteresting() method
that performs this logic in a slightly new way. As we iterate over
the commits, we add all of the root trees to an oidset. Then, call
mark_trees_uninteresting_sparse() on that oidset. Note that we
include interesting trees in this process. The current implementation
of mark_trees_unintersting_sparse() will walk the same trees as
the old logic, but this will be replaced in a later change.

The sparse option is not used by any callers at the moment, but
will be wired to 'git pack-objects' in the next change.

Signed-off-by: Derrick Stolee 
---
 bisect.c   |  2 +-
 builtin/pack-objects.c |  2 +-
 builtin/rev-list.c |  2 +-
 http-push.c|  2 +-
 list-objects.c | 51 ++
 list-objects.h |  4 +++-
 6 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/bisect.c b/bisect.c
index 487675c672..842f8b4b8f 100644
--- a/bisect.c
+++ b/bisect.c
@@ -656,7 +656,7 @@ static void bisect_common(struct rev_info *revs)
if (prepare_revision_walk(revs))
die("revision walk setup failed");
if (revs->tree_objects)
-   mark_edges_uninteresting(revs, NULL);
+   mark_edges_uninteresting(revs, NULL, 0);
 }
 
 static void exit_if_skipped_commits(struct commit_list *tried,
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 411aefd687..5f70d840a7 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3135,7 +3135,7 @@ static void get_object_list(int ac, const char **av)
 
if (prepare_revision_walk())
die(_("revision walk setup failed"));
-   mark_edges_uninteresting(, show_edge);
+   mark_edges_uninteresting(, show_edge, 0);
 
if (!fn_show_object)
fn_show_object = show_object;
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 2880ed37e3..9663cbfae0 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -543,7 +543,7 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
if (prepare_revision_walk())
die("revision walk setup failed");
if (revs.tree_objects)
-   mark_edges_uninteresting(, show_edge);
+   mark_edges_uninteresting(, show_edge, 0);
 
if (bisect_list) {
int reaches, all;
diff --git a/http-push.c b/http-push.c
index cd48590912..ea52d6f9f6 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1933,7 +1933,7 @@ int cmd_main(int argc, const char **argv)
pushing = 0;
if (prepare_revision_walk())
die("revision walk setup failed");
-   mark_edges_uninteresting(, NULL);
+   mark_edges_uninteresting(, NULL, 0);
objects_to_send = get_delta(, ref_lock);
finish_all_active_slots();
 
diff --git a/list-objects.c b/list-objects.c
index c41cc80db5..9bb93d1640 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -222,25 +222,68 @@ static void mark_edge_parents_uninteresting(struct commit 
*commit,
}
 }
 
-void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge)
+static void add_edge_parents(struct commit *commit,
+struct rev_info *revs,
+show_edge_fn show_edge,
+struct oidset *set)
+{
+   struct commit_list *parents;
+
+   for (parents = commit->parents; parents; parents = parents->next) {
+   struct commit *parent = parents->item;
+   struct tree *tree 

[PATCH 1/5] revision: add mark_tree_uninteresting_sparse

2018-11-28 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

In preparation for a new algorithm that walks fewer trees when
creating a pack from a set of revisions, create a method that
takes an oidset of tree oids and marks reachable objects as
UNINTERESTING.

The current implementation uses the existing
mark_tree_uninteresting to recursively walk the trees and blobs.
This will walk the same number of trees as the old mechanism.

There is one new assumption in this approach: we are also given
the oids of the interesting trees. This implementation does not
use those trees at the moment, but we will use them in a later
rewrite of this method.

Signed-off-by: Derrick Stolee 
---
 revision.c | 22 ++
 revision.h |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/revision.c b/revision.c
index 13e0519c02..3a62c7c187 100644
--- a/revision.c
+++ b/revision.c
@@ -99,6 +99,28 @@ void mark_tree_uninteresting(struct repository *r, struct 
tree *tree)
mark_tree_contents_uninteresting(r, tree);
 }
 
+void mark_trees_uninteresting_sparse(struct repository *r,
+struct oidset *set)
+{
+   struct object_id *oid;
+   struct oidset_iter iter;
+
+   oidset_iter_init(set, );
+   while ((oid = oidset_iter_next())) {
+   struct tree *tree = lookup_tree(r, oid);
+
+   if (tree->object.flags & UNINTERESTING) {
+   /*
+* Remove the flag so the next call
+* is not a no-op. The flag is added
+* in mark_tree_unintersting().
+*/
+   tree->object.flags ^= UNINTERESTING;
+   mark_tree_uninteresting(r, tree);
+   }
+   }
+}
+
 struct commit_stack {
struct commit **items;
size_t nr, alloc;
diff --git a/revision.h b/revision.h
index 7987bfcd2e..f828e91ae9 100644
--- a/revision.h
+++ b/revision.h
@@ -67,6 +67,7 @@ struct rev_cmdline_info {
 #define REVISION_WALK_NO_WALK_SORTED 1
 #define REVISION_WALK_NO_WALK_UNSORTED 2
 
+struct oidset;
 struct topo_walk_info;
 
 struct rev_info {
@@ -327,6 +328,7 @@ void put_revision_mark(const struct rev_info *revs,
 
 void mark_parents_uninteresting(struct commit *commit);
 void mark_tree_uninteresting(struct repository *r, struct tree *tree);
+void mark_trees_uninteresting_sparse(struct repository *r, struct oidset *set);
 
 void show_object_with_name(FILE *, struct object *, const char *);
 
-- 
gitgitgadget



Re: [PATCH] i18n: fix small typos

2018-11-28 Thread Eric Sunshine
On Wed, Nov 28, 2018 at 4:43 PM Jean-Noël Avila  wrote:
> Translating the new strings introduced for v2.20 showed some typos.

Hard to spot by eyeball when looking at the diff, but both fixes make
sense. Thanks.

> Signed-off-by: Jean-Noël Avila 


[PATCH] i18n: fix small typos

2018-11-28 Thread Jean-Noël Avila
Translating the new strings introduced for v2.20 showed some typos.

Signed-off-by: Jean-Noël Avila 
---
 http.c | 2 +-
 midx.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/http.c b/http.c
index 3dc8c560d6..eacc2a75ef 100644
--- a/http.c
+++ b/http.c
@@ -834,7 +834,7 @@ static CURL *get_curl_handle(void)
 #if LIBCURL_VERSION_NUM >= 0x072c00
curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, 
CURLSSLOPT_NO_REVOKE);
 #else
-   warning(_("CURLSSLOPT_NO_REVOKE not suported with cURL < 
7.44.0"));
+   warning(_("CURLSSLOPT_NO_REVOKE not supported with cURL < 
7.44.0"));
 #endif
}
 
diff --git a/midx.c b/midx.c
index 730ff84dff..2a6a24fcd7 100644
--- a/midx.c
+++ b/midx.c
@@ -202,7 +202,7 @@ int prepare_midx_pack(struct multi_pack_index *m, uint32_t 
pack_int_id)
struct strbuf pack_name = STRBUF_INIT;
 
if (pack_int_id >= m->num_packs)
-   die(_("bad pack-int-id: %u (%u total packs"),
+   die(_("bad pack-int-id: %u (%u total packs)"),
pack_int_id, m->num_packs);
 
if (m->packs[pack_int_id])
-- 
2.18.0



Re: [PATCH 1/2] git-reset.txt: render tables correctly under Asciidoctor

2018-11-28 Thread Martin Ågren
On Wed, 28 Nov 2018 at 20:45, Ævar Arnfjörð Bjarmason  wrote:
> On Wed, Nov 28 2018, Martin Ågren wrote:
>
> > Asciidoctor removes the indentation of each line in these tables, so the
> > last lines of each table have a completely broken alignment.
>
> Earlier I was trying to get the Documentation/doc-diff script to diff
> the asciidoc and asciidoctor docs without much success (hadn't used it
> before, just hacking the Makefile to turn on asciidoctor yielded syntax
> errors or something).
>
> Is something like that a thing we could make into a regression test?

Interesting idea. I just tried a gross hack:

 * Use `make --always-make ... install-man` in doc-diff.
 * ./doc-diff -f HEAD HEAD # note -f
 * Add empty commit and tweak config.mak
 * ./doc-diff HEAD^ HEAD # note no -f

There are lots of irrelevant differences in the headers and footers,
which is a bit unfortunate.

Also, lots of annoying differences originating in Asciidoctor's
inclination to render a space after linkgit:foo . There are those
differences themselves, obviously, but also follow-on differences such
as entire paragraphs that wrap differently.

I could spot a few things that should be fixable on our side, but on a
quick skimming, I didn't spot too many "huge" differences, which feels
good. The one which this patch fixes, obviously, and there's some work
to do in git-status.txt and git-column.txt (at least).

Tacking on `--stat` to the call to `git diff --no-index` singles out
git-config.txt, but it seems like lots of small or maybe even irrelevant
differences, plus lots of spaces around linkgit: , as already mentioned.

Martin


Re: [PATCH/RFC v2 0/7] Introduce new commands switch-branch and checkout-files

2018-11-28 Thread Stefan Beller
On Wed, Nov 28, 2018 at 12:09 PM Duy Nguyen  wrote:
>
> On Wed, Nov 28, 2018 at 9:01 PM Duy Nguyen  wrote:
> > should we do
> > something about detached HEAD in this switch-branch command (or
> > whatever its name will be)?
> >
> > This is usually a confusing concept to new users
>
> And it just occurred to me that perhaps we should call this "unnamed
> branch" (at least at high UI level) instead of detached HEAD. It is
> technically not as accurate, but much better to understand.

or 'direct' branch? I mean 'detached HEAD' itself is also not correct
as the HEAD points to a valid commit/tag usually, so it is attached to
that content. The detachment comes from the implicit "from a branch".


Re: Git pull confusing output

2018-11-28 Thread Stefan Beller
On Tue, Nov 27, 2018 at 10:31 PM Junio C Hamano  wrote:
>
> Will  writes:
>
> > I’m far from being a guru, but I consider myself a competent Git
> > user. Yet, here’s my understanding of the output of one the most-used
> > commands, `git push`:
> >> Counting objects: 6, done.
> > No idea what an “object” is. Apparently there’s 6 of them
> > here. What does “counting” them means? Should I care?
>
> You vaguely recall that the last time you pushed you saw ~400
> objects counted there, so you get the feeling how active you were
> since then.
>
> It is up to you if you are interested in such a feel of the level of
> activity.  "git fetch" (hence "git pull") would also give you a
> similar "feel", e.g. "the last fetch was ~1200 objects and today's
> is mere ~200, so it seems it is a relatively slow day".
>
> As to "what is an object?", there are plenty of Git tutorials and
> books to learn the basics from.  Again, it is up to you if you care.

While this is very interesting to the experienced git user, the
approximation of activity by object count is very coarse to say at least.

As It approximates changes in the DAG object count and nothing
about the deltas (which as we learn comes later and it comes with
a progress meter in bytes), it only provides the basics.

>
> I think we have "--quiet" option for those who do not care.

I think some users are not focused as much on the version control as
much as they are focused on another problem that is solved with
the content inside the repo.

Which means they only care about 'actionable' output, such as
* errors
* information provided by remote
  (e.g. links to click to start a code review)
* too long waiting time
  (so they can abort and inspect the problem)

I would suggest we come up with a mode that is "not quiet", but
cuts down to only the basic actionable items [and make that
the default eventually].

Now these actionable items depend on the workflow used,
for which I think an email based maintainers workflow is not
the norm. The vast majority of people uses git-push to
upload their change to a code review system instead.
And for such a workflow the size (as proxied by
object/delta count) is not as important as the target ref
that you push to or potentially the diffstat output of
a potential merge to a target branch.

TLDR: I still think making git-push a bit more quiet is
beneficial to the user base at large.

Stefan


[PATCH 1/2] format-patch: add test for --range-diff diff output

2018-11-28 Thread Ævar Arnfjörð Bjarmason
As noted in 43dafc4172 ("format-patch: don't include --stat with
--range-diff output", 2018-11-22) the diff options provided on the
command-line currently affect both the range-diff and the patch
output, but there was no test for checking this with output where we'd
show a patch diff. Let's add one.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t3206-range-diff.sh | 60 +++
 1 file changed, 60 insertions(+)

diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 90def330bd..bc5facc1cd 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -267,4 +267,64 @@ test_expect_success 'format-patch --range-diff as 
commentary' '
test_i18ngrep "^Range-diff:$" actual
 '
 
+test_expect_success 'format-patch with ' '
+   # No diff options
+   git format-patch --cover-letter --stdout --range-diff=topic~..topic \
+   changed~..changed >actual.raw &&
+   sed -ne "/^1:/,/^--/p" actual.range-diff &&
+   sed -e "s|:$||" >expect <<-\EOF &&
+   1:  a63e992 ! 1:  d966c5c s/12/B/
+   @@ -8,7 +8,7 @@
+@@
+ 9
+ 10
+   - B
+   + BB
+-12
++B
+ 13
+   -- :
+   EOF
+   test_cmp expect actual.range-diff &&
+   sed -ne "/^--- /,/^--/p" actual.diff &&
+   sed -e "s|:$||" >expect <<-\EOF &&
+   --- a/file
+   +++ b/file
+   @@ -9,7 +9,7 @@ A
+9
+10
+BB
+   -12
+   +B
+13
+14
+15
+   -- :
+   EOF
+   test_cmp expect actual.diff &&
+
+   # -U0
+   git format-patch --cover-letter --stdout -U0 \
+   --range-diff=topic~..topic changed~..changed >actual.raw &&
+   sed -ne "/^1:/,/^--/p" actual.range-diff &&
+   sed -e "s|:$||" >expect <<-\EOF &&
+   1:  a63e992 ! 1:  d966c5c s/12/B/
+   @@ -11 +11 @@
+   - B
+   + BB
+   -- :
+   EOF
+   test_cmp expect actual.range-diff &&
+   sed -ne "/^--- /,/^--/p" actual.diff &&
+   sed -e "s|:$||" >expect <<-\EOF &&
+   --- a/file
+   +++ b/file
+   @@ -12 +12 @@ BB
+   -12
+   +B
+   -- :
+   EOF
+   test_cmp expect actual.diff
+'
+
 test_done
-- 
2.20.0.rc1.387.gf8505762e3



[PATCH 0/2] format-patch: fix root cause of recent regression

2018-11-28 Thread Ævar Arnfjörð Bjarmason
As noted in 2/2 this fixes the root cause of the bug I plastered over
in
https://public-inbox.org/git/20181122211248.24546-3-ava...@gmail.com/
(that patch is sitting in 'next').

1/2 is a test for existing behavior, to make it more easily understood
what's being changed.

Junio: I know it's late, but unless Eric has objections to this UI
change I'd really like to have this in 2.20 since this is a change to
a new command-line UI that's newly added in 2.20.

As noted in 2/2 the current implementation is inherently limited, you
can't tweak diff options for the range-diff in the cover-letter and
the patch independently, now you can, and the implementation is much
less nasty now that we're not having to share diffopts across two
different modes of operation.

Ævar Arnfjörð Bjarmason (2):
  format-patch: add test for --range-diff diff output
  format-patch: allow for independent diff & range-diff options

 Documentation/git-format-patch.txt |  10 ++-
 builtin/log.c  |  42 +---
 t/t3206-range-diff.sh  | 101 +
 3 files changed, 142 insertions(+), 11 deletions(-)

-- 
2.20.0.rc1.387.gf8505762e3



[PATCH 2/2] format-patch: allow for independent diff & range-diff options

2018-11-28 Thread Ævar Arnfjörð Bjarmason
Change the semantics of the "--range-diff" option so that the regular
diff options can be provided separately for the range-diff and the
patch. This allows for supplying e.g. --range-diff-U0 and -U1 to
"format-patch" to provide different context for the range-diff and the
patch. This wasn't possible before.

Ever since the "--range-diff" option was added in
31e2617a5f ("format-patch: add --range-diff option to embed diff in
cover letter", 2018-07-22) the "rev->diffopt" we pass down to the diff
machinery has been the one we get from "format-patch"'s own
setup_revisions().

This sort of thing is unique among the log-like commands in
builtin/log.c, no command than format-patch will embed the output of
another log-like command. Since the "rev->diffopt" is reused we need
to munge it before we pass it to show_range_diff(). See
43dafc4172 ("format-patch: don't include --stat with --range-diff
output", 2018-11-22) for a related regression fix which is being
mostly reverted here.

Implementation notes: 1) We're not bothering with the full teardown
around die() and will leak memory, but it's too much boilerplate to do
all the frees with/without the die() and not worth it. 2) We call
repo_init_revisions() for "rd_rev" even though we could get away with
a shallow copy like the code we're replacing (and which
show_range_diff() itself does). This is to make this code more easily
understood.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/git-format-patch.txt | 10 ++-
 builtin/log.c  | 42 +++---
 t/t3206-range-diff.sh  | 41 +
 3 files changed, 82 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-format-patch.txt 
b/Documentation/git-format-patch.txt
index aba4c5febe..6c048f415f 100644
--- a/Documentation/git-format-patch.txt
+++ b/Documentation/git-format-patch.txt
@@ -24,7 +24,8 @@ SYNOPSIS
   [--to=] [--cc=]
   [--[no-]cover-letter] [--quiet] [--notes[=]]
   [--interdiff=]
-  [--range-diff= [--creation-factor=]]
+  [--range-diff= [--creation-factor=]
+ [--range-diff]]
   [--progress]
   []
   [  |  ]
@@ -257,6 +258,13 @@ feeding the result to `git send-email`.
creation/deletion cost fudge factor. See linkgit:git-range-diff[1])
for details.
 
+--range-diff::
+   Other options prefixed with `--range-diff` are stripped of
+   that prefix and passed as-is to the diff machinery used to
+   generate the range-diff, e.g. `--range-diff-U0` and
+   `--range-diff--no-color`. This allows for adjusting the format
+   of the range-diff independently from the patch itself.
+
 --notes[=]::
Append the notes (see linkgit:git-notes[1]) for the commit
after the three-dash line.
diff --git a/builtin/log.c b/builtin/log.c
index 02d88fa233..7658e56ecc 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1023,7 +1023,8 @@ static void show_diffstat(struct rev_info *rev,
fprintf(rev->diffopt.file, "\n");
 }
 
-static void make_cover_letter(struct rev_info *rev, int use_stdout,
+static void make_cover_letter(struct rev_info *rev, struct rev_info *rd_rev,
+ int use_stdout,
  struct commit *origin,
  int nr, struct commit **list,
  const char *branch_name,
@@ -1095,13 +1096,9 @@ static void make_cover_letter(struct rev_info *rev, int 
use_stdout,
}
 
if (rev->rdiff1) {
-   struct diff_options opts;
-   memcpy(, >diffopt, sizeof(opts));
-   opts.output_format &= ~(DIFF_FORMAT_DIFFSTAT | 
DIFF_FORMAT_SUMMARY);
-
fprintf_ln(rev->diffopt.file, "%s", rev->rdiff_title);
show_range_diff(rev->rdiff1, rev->rdiff2,
-   rev->creation_factor, 1, );
+   rev->creation_factor, 1, _rev->diffopt);
}
 }
 
@@ -1485,6 +1482,7 @@ int cmd_format_patch(int argc, const char **argv, const 
char *prefix)
struct commit *commit;
struct commit **list = NULL;
struct rev_info rev;
+   struct rev_info rd_rev;
struct setup_revision_opt s_r_opt;
int nr = 0, total, i;
int use_stdout = 0;
@@ -1603,6 +1601,7 @@ int cmd_format_patch(int argc, const char **argv, const 
char *prefix)
init_log_defaults();
git_config(git_format_config, NULL);
repo_init_revisions(the_repository, , prefix);
+   repo_init_revisions(the_repository, _rev, prefix);
rev.commit_format = CMIT_FMT_EMAIL;
rev.expand_tabs_in_log_default = 0;
rev.verbose_header = 1;
@@ -1689,8 +1688,32 @@ int cmd_format_patch(int argc, const char **argv, const 
char *prefix)
rev.preserve_subject = keep_subject;
 
argc = setup_revisions(argc, argv, 

Re: [PATCH/RFC v2 0/7] Introduce new commands switch-branch and checkout-files

2018-11-28 Thread Duy Nguyen
On Wed, Nov 28, 2018 at 9:01 PM Duy Nguyen  wrote:
> should we do
> something about detached HEAD in this switch-branch command (or
> whatever its name will be)?
>
> This is usually a confusing concept to new users

And it just occurred to me that perhaps we should call this "unnamed
branch" (at least at high UI level) instead of detached HEAD. It is
technically not as accurate, but much better to understand.
-- 
Duy


Re: [PATCH/RFC v2 0/7] Introduce new commands switch-branch and checkout-files

2018-11-28 Thread Duy Nguyen
On Tue, Nov 27, 2018 at 5:53 PM Nguyễn Thái Ngọc Duy  wrote:
>
> v2 is just a bit better to look at than v1. This is by no means final.
> If you think the command name is bad, the default behavior should
> change, or something else, speak up. It's still very "RFC".
>
> v2 breaks down the giant patch in v1 and starts adding some changes in
> these new commands:
>
> - restore-paths is renamed to checkout-paths. I wrote I didn't like
>   "checkout" because of completion conflict. But who am I kidding,
>   I'll use aliases anyway. "-files" instead of "-paths" because we
>   already have ls-files.
> - both commands will not accept no arguments. There is no "git
>   checkout" equivalent.
> - ambiguation rules are now aware that "switch-branch" for example
>   can't take pathspec...

Another thing. Since we start with a clean slate, should we do
something about detached HEAD in this switch-branch command (or
whatever its name will be)?

This is usually a confusing concept to new users and I've seen some
users have a hard time getting out of it. I'm too comfortable with
detached HEAD to see this from new user's perspective and a better way
to deal with it.
-- 
Duy


Re: [BUG REPORT] t5322: demonstrate a pack-objects bug

2018-11-28 Thread Derrick Stolee

On 11/28/2018 2:45 PM, Derrick Stolee wrote:

I was preparing a new "sparse" algorithm for calculating the
interesting objects to send on push. The important steps happen
during 'git pack-objects', so I was creating test cases to see
how the behavior changes in narrow cases. Specifically, when
copying a directory across sibling directories (see test case),
the new logic would accidentally send that object as an extra.

However, I found a bug in the existing logic. The included test
demonstrates this during the final 'git index-pack' call. It
fails with the message

'fatal: pack has 1 unresolved delta'


I realize now that I've gone through this effort that the problem is me 
(of course it is).



+   git pack-objects --stdout --thin --revs nonsparse.pack 
&&


Since I'm packing thin packs, the index operation is complaining about 
deltas. So, I need to use a different mechanism to list the objects in 
the pack (say, 'git verify-pack -v').


Sorry for the noise!

Thanks,

-Stolee



[BUG REPORT] t5322: demonstrate a pack-objects bug

2018-11-28 Thread Derrick Stolee
I was preparing a new "sparse" algorithm for calculating the
interesting objects to send on push. The important steps happen
during 'git pack-objects', so I was creating test cases to see
how the behavior changes in narrow cases. Specifically, when
copying a directory across sibling directories (see test case),
the new logic would accidentally send that object as an extra.

However, I found a bug in the existing logic. The included test
demonstrates this during the final 'git index-pack' call. It
fails with the message

'fatal: pack has 1 unresolved delta'

It is probable that this is not a minimal test case, but happens
to be the test I had created before discovering the problem.

I compiled v2.17.0 and v2.12.0 as checks to see if I could find
a "good" commit with which to start a bisect, but both failed.
This is an old bug!

Signed-off-by: Derrick Stolee 
---
 t/t5322-pack-objects-sparse.sh | 94 ++
 1 file changed, 94 insertions(+)
 create mode 100755 t/t5322-pack-objects-sparse.sh

diff --git a/t/t5322-pack-objects-sparse.sh b/t/t5322-pack-objects-sparse.sh
new file mode 100755
index 00..36faa70fe9
--- /dev/null
+++ b/t/t5322-pack-objects-sparse.sh
@@ -0,0 +1,94 @@
+#!/bin/sh
+
+test_description='pack-objects object selection using sparse algorithm'
+. ./test-lib.sh
+
+test_expect_success 'setup repo' '
+   test_commit initial &&
+   for i in $(test_seq 1 3)
+   do
+   mkdir f$i &&
+   for j in $(test_seq 1 3)
+   do
+   mkdir f$i/f$j &&
+   echo $j >f$i/f$j/data.txt
+   done
+   done &&
+   git add . &&
+   git commit -m "Initialized trees" &&
+   for i in $(test_seq 1 3)
+   do
+   git checkout -b topic$i master &&
+   echo change-$i >f$i/f$i/data.txt &&
+   git commit -a -m "Changed f$i/f$i/data.txt"
+   done &&
+   cat >packinput.txt <<-EOF &&
+   topic1
+   ^topic2
+   ^topic3
+   EOF
+   git rev-parse   \
+   topic1  \
+   topic1^{tree}   \
+   topic1:f1   \
+   topic1:f1/f1\
+   topic1:f1/f1/data.txt | sort >actual_objects.txt
+'
+
+test_expect_success 'non-sparse pack-objects' '
+   git pack-objects --stdout --thin --revs nonsparse.pack 
&&
+   git index-pack -o nonsparse.idx nonsparse.pack &&
+   git show-index nonsparse_objects.txt &&
+   test_cmp actual_objects.txt nonsparse_objects.txt
+'
+
+# Demonstrate that both algorithms send "extra" objects because
+# they are not in the frontier.
+
+test_expect_success 'duplicate a folder from f3 and commit to topic1' '
+   git checkout topic1 &&
+   echo change-3 >f3/f3/data.txt &&
+   git commit -a -m "Changed f3/f3/data.txt" &&
+   git rev-parse   \
+   topic1~1\
+   topic1~1^{tree} \
+   topic1^{tree}   \
+   topic1  \
+   topic1:f1   \
+   topic1:f1/f1\
+   topic1:f1/f1/data.txt   \
+   topic1:f3   \
+   topic1:f3/f3\
+   topic1:f3/f3/data.txt | sort >actual_objects.txt
+'
+
+test_expect_success 'non-sparse pack-objects' '
+   git pack-objects --stdout --thin --revs nonsparse.pack 
&&
+   git index-pack -o nonsparse.idx nonsparse.pack &&
+   git show-index nonsparse_objects.txt &&
+   test_cmp actual_objects.txt nonsparse_objects.txt
+'
+
+test_expect_success 'duplicate a folder from f3 and commit to topic1' '
+   mkdir f3/f4 &&
+   cp -r f1/f1/* f3/f4 &&
+   git add f3/f4 &&
+   git commit -m "Copied f1/f1 to f3/f4" &&
+   cat >packinput.txt <<-EOF &&
+   topic1
+   ^topic1~1
+   EOF
+   git rev-parse   \
+   topic1  \
+   topic1^{tree}   \
+   topic1:f3 | sort >actual_objects.txt
+'
+
+test_expect_success 'non-sparse pack-objects' '
+   git pack-objects --stdout --thin --revs nonsparse.pack 
&&
+   git index-pack -o nonsparse.idx nonsparse.pack &&
+   git show-index nonsparse_objects.txt &&
+   test_cmp actual_objects.txt nonsparse_objects.txt
+'
+
+test_done
-- 
2.20.0.rc1



Re: [PATCH 1/2] git-reset.txt: render tables correctly under Asciidoctor

2018-11-28 Thread Ævar Arnfjörð Bjarmason


On Wed, Nov 28 2018, Martin Ågren wrote:

> Asciidoctor removes the indentation of each line in these tables, so the
> last lines of each table have a completely broken alignment.
>
> Similar to 379805051d ("Documentation: render revisions correctly under
> Asciidoctor", 2018-05-06), use an explicit literal block to indicate
> that we want to keep the leading whitespace in the tables.
>
> Because this gives us some extra indentation, we can remove the one that
> we have been carrying explicitly. That is, drop the first six spaces of
> indentation on each line. With Asciidoc (8.6.10), this results in
> identical rendering before and after this commit, both for git-reset.1
> and git-reset.html.
>
> Reported-by: Paweł Samoraj 
> Signed-off-by: Martin Ågren 

Earlier I was trying to get the Documentation/doc-diff script to diff
the asciidoc and asciidoctor docs without much success (hadn't used it
before, just hacking the Makefile to turn on asciidoctor yielded syntax
errors or something).

Is something like that a thing we could make into a regression test?


Re: [bug report] git-gui child windows are blank

2018-11-28 Thread Stefan Beller
On Wed, Nov 28, 2018 at 6:13 AM Kenn Sebesta  wrote:
>
> v2.19.2, installed from brew on macOS Mojave 14.2.1.
>
> `git-gui` is my much beloved go-to tool for everything git.
> Unfortunately, on my new Macbook Air it seems to have a bug. When I
> first load the program, the parent window populates normally with the
> stage/unstaged and diff panes. However, when I click Push, the child
> window is completely blank. The frame is there, but there is no
> content.
>
> This same behavior is seen if I do a `git gui blame`, where the
> primary blame window opens normally but all the children windows are
> empty.
>
> I have googled but was unsuccessful in finding a solution. Is this a
> known issue?

Looking through the mailing list archive, this
seemed to be one of the last relevant messges
https://public-inbox.org/git/20180724065120.7664-1-sunsh...@sunshineco.com/

>
>
> --Kenn


Re: [PATCH v2 6/7] checkout: split into switch-branch and checkout-files

2018-11-28 Thread Duy Nguyen
On Wed, Nov 28, 2018 at 8:08 PM Stefan Beller  wrote:
>
> On Wed, Nov 28, 2018 at 7:31 AM Duy Nguyen  wrote:
> >
> > On Wed, Nov 28, 2018 at 7:03 AM Junio C Hamano  wrote:
> > >
> > > Nguyễn Thái Ngọc Duy   writes:
> > >
> > > > The good old "git checkout" command is still here and will be until
> > > > all (or most of users) are sick of it.
> > >
> > > Two comments on the goal (the implementation looked reasonable
> > > assuming the reader agrees with the gaol).
> > >
> > > At least to me, the verb "switch" needs two things to switch
> > > between, i.e. "switch A and B", unless it is "switch to X".
> > > Either "switch-to-branch" or simply "switch-to", perhaps?
> > >
> > > As I already hinted in my response to Stefan (?) about
> > > checkout-from-tree vs checkout-from-index, a command with multiple
> > > modes of operation is not confusing to people with the right mental
> > > model, and I suspect that having two separate commands for "checking
> > > out a branch" and "checking out paths" that is done by this step
> > > would help users to form the right mental model.
> >
> > Since the other one is already "checkout-files", maybe this one could
> > just be "checkout-branch".
>
> I dislike the checkout-* names, as we already have checkout-index
> as plumbing, so it would be confusing as to which checkout-* command
> should be used when and why as it seems the co-index moves
> content *from index* to the working tree, but the co-files moves content
> *to files*, whereas checkout-branch is neither 'moving' to or from a branch
> but rather 'switching' to that branch.

OK back to square one. Another thing I noticed, not sure if it
matters, is that these two commands will be the only ones with a
hyphen in them in "git help". But I guess it's even harder to find
one-word command names for these.
-- 
Duy


Re: [PATCH v2 6/7] checkout: split into switch-branch and checkout-files

2018-11-28 Thread Stefan Beller
On Wed, Nov 28, 2018 at 7:31 AM Duy Nguyen  wrote:
>
> On Wed, Nov 28, 2018 at 7:03 AM Junio C Hamano  wrote:
> >
> > Nguyễn Thái Ngọc Duy   writes:
> >
> > > The good old "git checkout" command is still here and will be until
> > > all (or most of users) are sick of it.
> >
> > Two comments on the goal (the implementation looked reasonable
> > assuming the reader agrees with the gaol).
> >
> > At least to me, the verb "switch" needs two things to switch
> > between, i.e. "switch A and B", unless it is "switch to X".
> > Either "switch-to-branch" or simply "switch-to", perhaps?
> >
> > As I already hinted in my response to Stefan (?) about
> > checkout-from-tree vs checkout-from-index, a command with multiple
> > modes of operation is not confusing to people with the right mental
> > model, and I suspect that having two separate commands for "checking
> > out a branch" and "checking out paths" that is done by this step
> > would help users to form the right mental model.
>
> Since the other one is already "checkout-files", maybe this one could
> just be "checkout-branch".

I dislike the checkout-* names, as we already have checkout-index
as plumbing, so it would be confusing as to which checkout-* command
should be used when and why as it seems the co-index moves
content *from index* to the working tree, but the co-files moves content
*to files*, whereas checkout-branch is neither 'moving' to or from a branch
but rather 'switching' to that branch.


[PATCH 1/2] git-reset.txt: render tables correctly under Asciidoctor

2018-11-28 Thread Martin Ågren
Asciidoctor removes the indentation of each line in these tables, so the
last lines of each table have a completely broken alignment.

Similar to 379805051d ("Documentation: render revisions correctly under
Asciidoctor", 2018-05-06), use an explicit literal block to indicate
that we want to keep the leading whitespace in the tables.

Because this gives us some extra indentation, we can remove the one that
we have been carrying explicitly. That is, drop the first six spaces of
indentation on each line. With Asciidoc (8.6.10), this results in
identical rendering before and after this commit, both for git-reset.1
and git-reset.html.

Reported-by: Paweł Samoraj 
Signed-off-by: Martin Ågren 
---
 Documentation/git-reset.txt | 140 
 1 file changed, 78 insertions(+), 62 deletions(-)

diff --git a/Documentation/git-reset.txt b/Documentation/git-reset.txt
index 2dac95c71a..7c925e3a29 100644
--- a/Documentation/git-reset.txt
+++ b/Documentation/git-reset.txt
@@ -365,53 +365,65 @@ index in state B.  It resets (i.e. moves) the HEAD (i.e. 
the tip of
 the current branch, if you are on one) to "target" (which has the file
 in state D).
 
-  working index HEAD target working index HEAD
-  
-   A   B CD --soft   A   B D
-   --mixed  A   D D
-   --hard   D   D D
-   --merge (disallowed)
-   --keep  (disallowed)
-
-  working index HEAD target working index HEAD
-  
-   A   B CC --soft   A   B C
-   --mixed  A   C C
-   --hard   C   C C
-   --merge (disallowed)
-   --keep   A   C C
-
-  working index HEAD target working index HEAD
-  
-   B   B CD --soft   B   B D
-   --mixed  B   D D
-   --hard   D   D D
-   --merge  D   D D
-   --keep  (disallowed)
-
-  working index HEAD target working index HEAD
-  
-   B   B CC --soft   B   B C
-   --mixed  B   C C
-   --hard   C   C C
-   --merge  C   C C
-   --keep   B   C C
-
-  working index HEAD target working index HEAD
-  
-   B   C CD --soft   B   C D
-   --mixed  B   D D
-   --hard   D   D D
-   --merge (disallowed)
-   --keep  (disallowed)
-
-  working index HEAD target working index HEAD
-  
-   B   C CC --soft   B   C C
-   --mixed  B   C C
-   --hard   C   C C
-   --merge  B   C C
-   --keep   B   C C
+
+working index HEAD target working index HEAD
+
+ A   B CD --soft   A   B D
+ --mixed  A   D D
+ --hard   D   D D
+ --merge (disallowed)
+ --keep  (disallowed)
+
+
+
+working index HEAD target working index HEAD
+
+ A   B CC --soft   A   B C
+ --mixed  A   C C
+ --hard   C   C C
+ --merge (disallowed)
+ --keep   A   C C
+
+
+
+working index HEAD target working index HEAD
+
+ B   B CD --soft   B   B D
+ --mixed  B   D D
+ --hard   D   D D
+ --merge  D   D D
+ --keep  (disallowed)
+
+
+
+working index HEAD target working index HEAD
+
+ B   B CC --soft   B   B C
+ --mixed  B   C C
+ --hard   C   C C
+

[PATCH 0/2] Re: Broken alignment in git-reset docs

2018-11-28 Thread Martin Ågren
On Wed, 28 Nov 2018 at 13:02, Martin Ågren  wrote:
>
> On Wed, 28 Nov 2018 at 12:42, Paweł Samoraj  wrote:
> >
> > The git-reset documentation page section which is accessible via URL
> > https://git-scm.com/docs/git-reset#_discussion is not looking good.
>
> [...] The correct fix could be something like 379805051d
> ("Documentation: render revisions correctly under Asciidoctor",
> 2018-05-06).

Turns out it probably is, so here's a proposed fix, followed by a patch
to typeset more of this document in monospace. That should also make
things prettier, but not in such a dramatic way as the first patch.

This is obviously not 2.20-material. About where to queue this, I had
expected this to depend on 743e63f3ed ("Documentation: use 8-space tabs
with Asciidoctor", 2018-05-06) just like 379805051d does for proper
rendering, but from my testing, somehow it doesn't.

Paweł, I'm hoping this fix should be in v2.21 in a few months and then
eventually trickle down to git-scm. Thanks again for reporting this.

Martin Ågren (2):
  git-reset.txt: render tables correctly under Asciidoctor
  git-reset.txt: render literal examples as monospace

 Documentation/git-reset.txt | 277 +++-
 1 file changed, 147 insertions(+), 130 deletions(-)

-- 
2.20.0.rc1.8.g46351a2c6f



[PATCH 2/2] git-reset.txt: render literal examples as monospace

2018-11-28 Thread Martin Ågren
Large parts of this document do not use `backticks` around literal
examples such as branch names (`topic/wip`), git usages, `HEAD` and
`` so they render as ordinary text. Fix that.

Signed-off-by: Martin Ågren 
---
 Documentation/git-reset.txt | 131 ++--
 1 file changed, 66 insertions(+), 65 deletions(-)

diff --git a/Documentation/git-reset.txt b/Documentation/git-reset.txt
index 7c925e3a29..9f69ae8b69 100644
--- a/Documentation/git-reset.txt
+++ b/Documentation/git-reset.txt
@@ -14,14 +14,14 @@ SYNOPSIS
 
 DESCRIPTION
 ---
-In the first and second form, copy entries from  to the index.
-In the third form, set the current branch head (HEAD) to , optionally
-modifying index and working tree to match.  The / defaults
-to HEAD in all forms.
+In the first and second form, copy entries from `` to the index.
+In the third form, set the current branch head (`HEAD`) to ``,
+optionally modifying index and working tree to match.
+The ``/`` defaults to `HEAD` in all forms.
 
 'git reset' [-q] [] [--] ...::
-   This form resets the index entries for all  to their
-   state at .  (It does not affect the working tree or
+   This form resets the index entries for all `` to their
+   state at ``.  (It does not affect the working tree or
the current branch.)
 +
 This means that `git reset ` is the opposite of `git add
@@ -36,7 +36,7 @@ working tree in one go.
 
 'git reset' (--patch | -p) [] [--] [...]::
Interactively select hunks in the difference between the index
-   and  (defaults to HEAD).  The chosen hunks are applied
+   and `` (defaults to `HEAD`).  The chosen hunks are applied
in reverse to the index.
 +
 This means that `git reset -p` is the opposite of `git add -p`, i.e.
@@ -44,16 +44,16 @@ you can use it to selectively reset hunks. See the 
``Interactive Mode''
 section of linkgit:git-add[1] to learn how to operate the `--patch` mode.
 
 'git reset' [] []::
-   This form resets the current branch head to  and
-   possibly updates the index (resetting it to the tree of ) and
-   the working tree depending on . If  is omitted,
-   defaults to "--mixed". The  must be one of the following:
+   This form resets the current branch head to `` and
+   possibly updates the index (resetting it to the tree of ``) and
+   the working tree depending on ``. If `` is omitted,
+   defaults to `--mixed`. The `` must be one of the following:
 +
 --
 --soft::
Does not touch the index file or the working tree at all (but
-   resets the head to , just like all modes do). This leaves
-   all your changed files "Changes to be committed", as 'git status'
+   resets the head to ``, just like all modes do). This leaves
+   all your changed files "Changes to be committed", as `git status`
would put it.
 
 --mixed::
@@ -66,24 +66,24 @@ linkgit:git-add[1]).
 
 --hard::
Resets the index and working tree. Any changes to tracked files in the
-   working tree since  are discarded.
+   working tree since `` are discarded.
 
 --merge::
Resets the index and updates the files in the working tree that are
-   different between  and HEAD, but keeps those which are
+   different between `` and `HEAD`, but keeps those which are
different between the index and working tree (i.e. which have changes
which have not been added).
-   If a file that is different between  and the index has unstaged
-   changes, reset is aborted.
+   If a file that is different between `` and the index has
+   unstaged changes, reset is aborted.
 +
-In other words, --merge does something like a 'git read-tree -u -m ',
+In other words, `--merge` does something like a `git read-tree -u -m `,
 but carries forward unmerged index entries.
 
 --keep::
Resets index entries and updates files in the working tree that are
-   different between  and HEAD.
-   If a file that is different between  and HEAD has local changes,
-   reset is aborted.
+   different between `` and `HEAD`.
+   If a file that is different between `` and `HEAD` has local
+   changes, reset is aborted.
 --
 
 If you want to undo a commit other than the latest on a branch,
@@ -116,15 +116,15 @@ $ git pull git://info.example.com/ nitfol  <4>
 +
 <1> You are happily working on something, and find the changes
 in these files are in good order.  You do not want to see them
-when you run "git diff", because you plan to work on other files
+when you run `git diff`, because you plan to work on other files
 and changes with these files are distracting.
 <2> Somebody asks you to pull, and the changes sound worthy of merging.
 <3> However, you already dirtied the index (i.e. your index does
-not match the HEAD commit).  But you know the pull you are going
-to make does not affect frotz.c or filfre.c, so you revert the
+not match the `HEAD` commit).  But you know the pull you are 

Re: Simple git push --tags deleted all branches

2018-11-28 Thread Mateusz Loskot
On Wed, 28 Nov 2018 at 17:50, Mateusz Loskot  wrote:
>
> (using git version 2.19.2.windows.1)
> [...]
> I restored the repo and tried out
>
> git push origin 1.0
> git push origin --tags
>
> and this time both succeeded, without wiping out any refs.

And, to my surprise, this pushed all branches and all tags:

git push --all origin

So, I  did not have to run follow with tags push only using

git push --tags origin

Has anything changes in [1] that now
--all
Push all branches ...AND tags?

[1] https://git-scm.com/docs/git-push#git-push---all

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net


Re: [PATCH] rebase -i: introduce the 'test' command

2018-11-28 Thread Paul Morelle
Hi Johannes,

On 28/11/18 16:19, Johannes Schindelin wrote:
> Hi Paul,
>
> On Wed, 28 Nov 2018, Paul Morelle wrote:
>
>> The 'exec' command can be used to run tests on a set of commits,
>> interrupting on failing commits to let the user fix the tests.
>>
>> However, the 'exec' line has been consumed, so it won't be ran again by
>> 'git rebase --continue' is ran, even if the tests weren't fixed.
>>
>> This commit introduces a new command 'test' equivalent to 'exec', except
>> that it is automatically rescheduled in the todo list if it fails.
>>
>> Signed-off-by: Paul Morelle 
> Would it not make more sense to add a command-line option (and a config
> setting) to re-schedule failed `exec` commands? Like so:

Your proposition would do in most cases, however it is not possible to
make a distinction between reschedulable and non-reschedulable commands.

Do you think that we can ignore the distinction between reschedulable
and non-reschedulable commands in the same script?
In this case, I would add some tests to your patch below, and propose
the result on this mailing list.

> -- snip --
> diff --git a/builtin/rebase--interactive.c b/builtin/rebase--interactive.c
> index a2ab68ed0632..a9ab009fdbca 100644
> --- a/builtin/rebase--interactive.c
> +++ b/builtin/rebase--interactive.c
> @@ -192,6 +192,8 @@ int cmd_rebase__interactive(int argc, const char **argv, 
> const char *prefix)
>   OPT_STRING(0, "onto-name", _name, N_("onto-name"), 
> N_("onto name")),
>   OPT_STRING(0, "cmd", , N_("cmd"), N_("the command to run")),
>   OPT_RERERE_AUTOUPDATE(_rerere_auto),
> + OPT_BOOL(0, "reschedule-failed-exec", 
> _failed_exec,
> +  N_("automatically re-schedule any `exec` that fails")),
>   OPT_END()
>   };
>  
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 5b3e5baec8a0..700cbc4e150e 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -104,6 +104,7 @@ struct rebase_options {
>   int rebase_merges, rebase_cousins;
>   char *strategy, *strategy_opts;
>   struct strbuf git_format_patch_opt;
> + int reschedule_failed_exec;
>  };
>  
>  static int is_interactive(struct rebase_options *opts)
> @@ -415,6 +416,8 @@ static int run_specific_rebase(struct rebase_options 
> *opts)
>   argv_array_push(, opts->gpg_sign_opt);
>   if (opts->signoff)
>   argv_array_push(, "--signoff");
> + if (opts->reschedule_failed_exec)
> + argv_array_push(, 
> "--reschedule-failed-exec");
>  
>   status = run_command();
>   goto finished_rebase;
> @@ -903,6 +906,9 @@ int cmd_rebase(int argc, const char **argv, const char 
> *prefix)
>  "strategy")),
>   OPT_BOOL(0, "root", ,
>N_("rebase all reachable commits up to the root(s)")),
> + OPT_BOOL(0, "reschedule-failed-exec",
> +  _failed_exec,
> +  N_("automatically re-schedule any `exec` that fails")),
>   OPT_END(),
>   };
>   int i;
> @@ -1195,6 +1201,9 @@ int cmd_rebase(int argc, const char **argv, const char 
> *prefix)
>   break;
>   }
>  
> + if (options.reschedule_failed_exec && !is_interactive())
> + die(_("--reschedule-failed-exec requires an interactive 
> rebase"));
> +
>   if (options.git_am_opts.argc) {
>   /* all am options except -q are compatible only with --am */
>   for (i = options.git_am_opts.argc - 1; i >= 0; i--)
> diff --git a/sequencer.c b/sequencer.c
> index e1a4dd15f1a8..69bee63e116f 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -158,6 +158,7 @@ static GIT_PATH_FUNC(rebase_path_strategy, 
> "rebase-merge/strategy")
>  static GIT_PATH_FUNC(rebase_path_strategy_opts, "rebase-merge/strategy_opts")
>  static GIT_PATH_FUNC(rebase_path_allow_rerere_autoupdate, 
> "rebase-merge/allow_rerere_autoupdate")
>  static GIT_PATH_FUNC(rebase_path_quiet, "rebase-merge/quiet")
> +static GIT_PATH_FUNC(rebase_path_reschedule_failed_exec, 
> "rebase-merge/reschedule-failed-exec")
>  
>  static int git_sequencer_config(const char *k, const char *v, void *cb)
>  {
> @@ -2362,6 +2363,9 @@ static int read_populate_opts(struct replay_opts *opts)
>   opts->signoff = 1;
>   }
>  
> + if (file_exists(rebase_path_reschedule_failed_exec()))
> + opts->reschedule_failed_exec = 1;
> +
>   read_strategy_opts(opts, );
>   strbuf_release();
>  
> @@ -2443,6 +2447,8 @@ int write_basic_state(struct replay_opts *opts, const 
> char *head_name,
>   write_file(rebase_path_gpg_sign_opt(), "-S%s\n", 
> opts->gpg_sign);
>   if (opts->signoff)
>   write_file(rebase_path_signoff(), "--signoff\n");
> + if (opts->reschedule_failed_exec)
> + 

Simple git push --tags deleted all branches

2018-11-28 Thread Mateusz Loskot
Hi,

(using git version 2.19.2.windows.1)

I've just encountered one of those WTH moments.

I have a bare repository

core.git (BARE:master) $ git branch
  1.0
  2.0
* master

core.git (BARE:master) $ git tag
1.0.1651
1.0.766
2.0.1103
2.0.1200

I published the repo using: git push --all --follow-tags

This succeeded, but there seem to be no tags pushed, just branches.
So, I followed with

core.git (BARE:master) $ git push --tags
To XXX
 - [deleted]   1.0
 - [deleted]   2.0
 ! [remote rejected]   master (refusing to delete the current
branch: refs/heads/master)
error: failed to push some refs to 'XXX'

And, I've found out that all branches and tags have been
wiped in both, local repo and remote :)

I restored the repo and tried out

git push origin 1.0
git push origin --tags

and this time both succeeded, without wiping out any refs.

Could anyone help me to understand why remote-less

git push --tags

is/was so dangerous and unforgiving?!

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net


Re: [PATCH v3 1/7] rebase: fix incompatible options error message

2018-11-28 Thread Elijah Newren
On Wed, Nov 28, 2018 at 8:12 AM Duy Nguyen  wrote:
>
> On Thu, Nov 22, 2018 at 7:32 PM Elijah Newren  wrote:
> >
> > In commit f57696802c30 ("rebase: really just passthru the `git am`
> > options", 2018-11-14), the handling of `git am` options was simplified
> > dramatically (and an option parsing bug was fixed), but it introduced
> > a small regression in the error message shown when options only
> > understood by separate backends were used:
> >
> > $ git rebase --keep --ignore-whitespace
> > fatal: error: cannot combine interactive options (--interactive, --exec,
> > --rebase-merges, --preserve-merges, --keep-empty, --root + --onto) with
> > am options (.git/rebase-apply/applying)
> >
> > $ git rebase --merge --ignore-whitespace
> > fatal: error: cannot combine merge options (--merge, --strategy,
> > --strategy-option) with am options (.git/rebase-apply/applying)
> >
> > Note that in both cases, the list of "am options" is
> > ".git/rebase-apply/applying", which makes no sense.  Since the lists of
> > backend-specific options is documented pretty thoroughly in the rebase
> > man page (in the "Incompatible Options" section, with multiple links
> > throughout the document), and since I expect this list to change over
> > time, just simplify the error message.
>
> Can we simplify it further and remove the "error: " prefix? "fatal:
> error: " looks redundant.

Sure, I can do that.  Looks like there are a few other cases that need
fixing as well:
$ git grep error: builtin/rebase.c
builtin/rebase.c:   die(_("error: cannot combine
interactive options "
builtin/rebase.c:   die(_("error: cannot combine
merge options (--merge, "
builtin/rebase.c:   die(_("error: cannot combine
'--preserve-merges' with "
builtin/rebase.c:   die(_("error: cannot combine
'--rebase-merges' with "
builtin/rebase.c:   die(_("error: cannot combine
'--rebase-merges' with "

Perhaps, for consistency, I should also change the error message in
the git-legacy-rebase.sh script to use 'fatal' instead of 'error'?:

$ git grep error: *rebase*.sh
git-legacy-rebase.sh:   die "$(gettext "error: cannot
combine interactive options (--interactive, --exec, --rebase-merges,
--preserve-merges, --keep-empty, --root + --onto) with am options
($incompatible_opts)")"
git-legacy-rebase.sh:   die "$(gettext "error: cannot
combine merge options (--merge, --strategy, --strategy-option) with am
options ($incompatible_opts)")"
git-legacy-rebase.sh:   die "$(gettext "error: cannot combine
'--signoff' with '--preserve-merges'")"
git-legacy-rebase.sh:   die "$(gettext "error: cannot combine
'--preserve-merges' with '--rebase-merges'")"
git-legacy-rebase.sh:   die "$(gettext "error: cannot combine
'--rebase-merges' with '--strategy-option'")"
git-legacy-rebase.sh:   die "$(gettext "error: cannot combine
'--rebase-merges' with '--strategy'")"


Re: [PATCH v3 1/7] rebase: fix incompatible options error message

2018-11-28 Thread Duy Nguyen
On Thu, Nov 22, 2018 at 7:32 PM Elijah Newren  wrote:
>
> In commit f57696802c30 ("rebase: really just passthru the `git am`
> options", 2018-11-14), the handling of `git am` options was simplified
> dramatically (and an option parsing bug was fixed), but it introduced
> a small regression in the error message shown when options only
> understood by separate backends were used:
>
> $ git rebase --keep --ignore-whitespace
> fatal: error: cannot combine interactive options (--interactive, --exec,
> --rebase-merges, --preserve-merges, --keep-empty, --root + --onto) with
> am options (.git/rebase-apply/applying)
>
> $ git rebase --merge --ignore-whitespace
> fatal: error: cannot combine merge options (--merge, --strategy,
> --strategy-option) with am options (.git/rebase-apply/applying)
>
> Note that in both cases, the list of "am options" is
> ".git/rebase-apply/applying", which makes no sense.  Since the lists of
> backend-specific options is documented pretty thoroughly in the rebase
> man page (in the "Incompatible Options" section, with multiple links
> throughout the document), and since I expect this list to change over
> time, just simplify the error message.

Can we simplify it further and remove the "error: " prefix? "fatal:
error: " looks redundant.
-- 
Duy


Re: [PATCH v3 1/7] rebase: fix incompatible options error message

2018-11-28 Thread Elijah Newren
On Wed, Nov 28, 2018 at 12:28 AM Johannes Schindelin
 wrote:
>
> Hi Elijah,
>
> On Wed, 21 Nov 2018, Elijah Newren wrote:
>
> > In commit f57696802c30 ("rebase: really just passthru the `git am`
> > options", 2018-11-14), the handling of `git am` options was simplified
> > dramatically (and an option parsing bug was fixed), but it introduced
> > a small regression in the error message shown when options only
> > understood by separate backends were used:
> >
> > $ git rebase --keep --ignore-whitespace
> > fatal: error: cannot combine interactive options (--interactive, --exec,
> > --rebase-merges, --preserve-merges, --keep-empty, --root + --onto) with
> > am options (.git/rebase-apply/applying)
> >
> > $ git rebase --merge --ignore-whitespace
> > fatal: error: cannot combine merge options (--merge, --strategy,
> > --strategy-option) with am options (.git/rebase-apply/applying)
> >
> > Note that in both cases, the list of "am options" is
> > ".git/rebase-apply/applying", which makes no sense.  Since the lists of
> > backend-specific options is documented pretty thoroughly in the rebase
> > man page (in the "Incompatible Options" section, with multiple links
> > throughout the document), and since I expect this list to change over
> > time, just simplify the error message.
> >
> > Signed-off-by: Elijah Newren 
> > ---
>
> This patch is obviously good.
>
> Given that you embedded it in the patch series that makes the sequencer
> the work horse also for the `merge` backend of `git rebase` in addition to
> the `interactive` one, may I assume that you intend this patch for post
> v2.20.0?
>
> Ciao,
> Dscho

I think post v2.20.0 probably makes the most sense.  I was unsure what
the policy was around changing strings late in the cycle, but figured
that the worst case with this regression is someone basically
understands what the message is trying to say but thinks it might be
saying more than they understand and reach out with questions.  In
contrast, if we decide to change the string and some translators don't
have enough time to translate it before 2.20.0 goes out, then someone
who doesn't understand English gets an English error message, which
seems a little worse.

I at least wanted it to be discussed, though, which is why I
highlighted it in the cover letter.


Re: [PATCH v2 7/7] Suggest other commands instead of "git checkout"

2018-11-28 Thread Duy Nguyen
On Wed, Nov 28, 2018 at 7:04 AM Junio C Hamano  wrote:
>
> Nguyễn Thái Ngọc Duy   writes:
>
> > The assumption made is here
> >
> > - "git checkout" is a horrible monster that should only be touched
> >   with a two-meter pole
> >
> > - there are other commands that can achieve the same thing
>
> Thanks for clearly spelling out the assumptions.  It is good that
> this step cames at the end, as the earlier 6 steps looked reasonable
> to me.

I see my deliberate attempt to provoke has failed :D Giving your view
of the new commands as "training wheels", I take it we still should
make them visible as much as possible, but we just not try to hide
"git checkout" as much (e.g. we mention both new and old commands,
instead of just mentioning the new one, when suggesting something)?
-- 
Duy


Re: [PATCH v2 6/7] checkout: split into switch-branch and checkout-files

2018-11-28 Thread Duy Nguyen
On Wed, Nov 28, 2018 at 7:03 AM Junio C Hamano  wrote:
>
> Nguyễn Thái Ngọc Duy   writes:
>
> > The good old "git checkout" command is still here and will be until
> > all (or most of users) are sick of it.
>
> Two comments on the goal (the implementation looked reasonable
> assuming the reader agrees with the gaol).
>
> At least to me, the verb "switch" needs two things to switch
> between, i.e. "switch A and B", unless it is "switch to X".
> Either "switch-to-branch" or simply "switch-to", perhaps?
>
> As I already hinted in my response to Stefan (?) about
> checkout-from-tree vs checkout-from-index, a command with multiple
> modes of operation is not confusing to people with the right mental
> model, and I suspect that having two separate commands for "checking
> out a branch" and "checking out paths" that is done by this step
> would help users to form the right mental model.

Since the other one is already "checkout-files", maybe this one could
just be "checkout-branch".

> So I tend to think
> these two are "training wheels", and suspect that once they got it,
> nobody will become "sick of" the single "checkout" command that can
> be used to do either.  It's just the matter of being aware what can
> be done (which requires the right mental model) and how to tell Git
> what the user wants it do (two separate commands, operating mode
> option, or just the implied command line syntax---once the user
> knows what s/he is doing, these do not make that much a difference).

I would hope this becomes better defaults and being used 90% of time.
Even though I know "git checkout" quite well, it still bites me from
time to time. Having the right mental model is one thing. Having to
think a bit every time to write "git checkout" with the right syntax,
and whether you need "--" (that ambiguation problem can still bite you
from time to time), is frankly something I'd rather avoid.
-- 
Duy


Dear I Need An Investment Partner

2018-11-28 Thread Aisha Gaddafi



--
Hello Dear ,
I came across your contact during my private search
Mrs Aisha Al-Qaddafi is my name, the only daughter of late Libyan
president, I have funds the sum
of $27.5 million USD for investment, I am interested in you for
investment project assistance in your country,
i shall compensate you 30% of the total sum after the funds are
transfer into your account,
Reply me urgent for more details
Mrs Aisha Al-Qaddafi
--



git@vger.kernel.org has been hacked! Change your password immediately!

2018-11-28 Thread kassie
Hello!

I have very bad news for you.
03/08/2018 - on this day I hacked your OS and got full access to your account 
git@vger.kernel.org
On this day your account git@vger.kernel.org has password: dirgantara

So, you can change the password, yes.. But my malware intercepts it every time.

How I made it:
In the software of the router, through which you went online, was a 
vulnerability.
I just hacked this router and placed my malicious code on it.
When you went online, my trojan was installed on the OS of your device.

After that, I made a full dump of your disk (I have all your address book, 
history of viewing sites, all files, phone numbers and addresses of all your 
contacts).

A month ago, I wanted to lock your device and ask for a not big amount of btc 
to unlock.
But I looked at the sites that you regularly visit, and I was shocked by what I 
saw!!!
I'm talk you about sites for adults.

I want to say - you are a BIG pervert. Your fantasy is shifted far away from 
the normal course!

And I got an idea
I made a screenshot of the adult sites where you have fun (do you understand 
what it is about, huh?).
After that, I made a screenshot of your joys (using the camera of your device) 
and glued them together.
Turned out amazing! You are so spectacular!

I'm know that you would not like to show these screenshots to your friends, 
relatives or colleagues.
I think $781 is a very, very small amount for my silence.
Besides, I have been spying on you for so long, having spent a lot of time!

Pay ONLY in Bitcoins!
My BTC wallet: 1FgfdebSqbXRciP2DXKJyqPSffX3Sx57RF

You do not know how to use bitcoins?
Enter a query in any search engine: "how to replenish btc wallet".
It's extremely easy

For this payment I give you two days (48 hours).
As soon as this letter is opened, the timer will work.

After payment, my virus and dirty screenshots with your enjoys will be 
self-destruct automatically.
If I do not receive from you the specified amount, then your device will be 
locked, and all your contacts will receive a screenshots with your "enjoys".

I hope you understand your situation.
- Do not try to find and destroy my virus! (All your data, files and 
screenshots is already uploaded to a remote server)
- Do not try to contact me (you yourself will see that this is impossible, the 
sender address is automatically generated)
- Various security services will not help you; formatting a disk or destroying 
a device will not help, since your data is already on a remote server.

P.S. You are not my single victim. so, I guarantee you that I will not disturb 
you again after payment!
 This is the word of honor hacker

I also ask you to regularly update your antiviruses in the future. This way you 
will no longer fall into a similar situation.

Do not hold evil! I just do my job.
Good luck.



Re: [PATCH v2 1/7] parse-options: allow parse_options_concat(NULL, options)

2018-11-28 Thread Duy Nguyen
On Tue, Nov 27, 2018 at 8:44 PM Stefan Beller  wrote:
>
> On Tue, Nov 27, 2018 at 8:53 AM Nguyễn Thái Ngọc Duy  
> wrote:
> >
> > There is currently no caller that calls this function with "a" being
> > NULL. But it will be introduced shortly. It is used to construct the
> > option array from scratch, e.g.
> >
> >struct parse_options opts = NULL;
> >opts = parse_options_concat(opts, opts_1);
> >opts = parse_options_concat(opts, opts_2);
>
> While this addresses the immediate needs, I'd prefer to think
> about the API exposure of parse_options_concat,
> (related: do we want to have docs in its header file?)
> and I'd recommend to make it symmetrical, i.e.
> allow the second argument to also be NULL?

I'll just drop this patch. There's a better way to do the same without
adding special handling like this.
-- 
Duy


Re: [PATCH] rebase -i: introduce the 'test' command

2018-11-28 Thread Johannes Schindelin
Hi Paul,

On Wed, 28 Nov 2018, Paul Morelle wrote:

> The 'exec' command can be used to run tests on a set of commits,
> interrupting on failing commits to let the user fix the tests.
> 
> However, the 'exec' line has been consumed, so it won't be ran again by
> 'git rebase --continue' is ran, even if the tests weren't fixed.
> 
> This commit introduces a new command 'test' equivalent to 'exec', except
> that it is automatically rescheduled in the todo list if it fails.
> 
> Signed-off-by: Paul Morelle 

Would it not make more sense to add a command-line option (and a config
setting) to re-schedule failed `exec` commands? Like so:

-- snip --
diff --git a/builtin/rebase--interactive.c b/builtin/rebase--interactive.c
index a2ab68ed0632..a9ab009fdbca 100644
--- a/builtin/rebase--interactive.c
+++ b/builtin/rebase--interactive.c
@@ -192,6 +192,8 @@ int cmd_rebase__interactive(int argc, const char **argv, 
const char *prefix)
OPT_STRING(0, "onto-name", _name, N_("onto-name"), 
N_("onto name")),
OPT_STRING(0, "cmd", , N_("cmd"), N_("the command to run")),
OPT_RERERE_AUTOUPDATE(_rerere_auto),
+   OPT_BOOL(0, "reschedule-failed-exec", 
_failed_exec,
+N_("automatically re-schedule any `exec` that fails")),
OPT_END()
};
 
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 5b3e5baec8a0..700cbc4e150e 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -104,6 +104,7 @@ struct rebase_options {
int rebase_merges, rebase_cousins;
char *strategy, *strategy_opts;
struct strbuf git_format_patch_opt;
+   int reschedule_failed_exec;
 };
 
 static int is_interactive(struct rebase_options *opts)
@@ -415,6 +416,8 @@ static int run_specific_rebase(struct rebase_options *opts)
argv_array_push(, opts->gpg_sign_opt);
if (opts->signoff)
argv_array_push(, "--signoff");
+   if (opts->reschedule_failed_exec)
+   argv_array_push(, 
"--reschedule-failed-exec");
 
status = run_command();
goto finished_rebase;
@@ -903,6 +906,9 @@ int cmd_rebase(int argc, const char **argv, const char 
*prefix)
   "strategy")),
OPT_BOOL(0, "root", ,
 N_("rebase all reachable commits up to the root(s)")),
+   OPT_BOOL(0, "reschedule-failed-exec",
+_failed_exec,
+N_("automatically re-schedule any `exec` that fails")),
OPT_END(),
};
int i;
@@ -1195,6 +1201,9 @@ int cmd_rebase(int argc, const char **argv, const char 
*prefix)
break;
}
 
+   if (options.reschedule_failed_exec && !is_interactive())
+   die(_("--reschedule-failed-exec requires an interactive 
rebase"));
+
if (options.git_am_opts.argc) {
/* all am options except -q are compatible only with --am */
for (i = options.git_am_opts.argc - 1; i >= 0; i--)
diff --git a/sequencer.c b/sequencer.c
index e1a4dd15f1a8..69bee63e116f 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -158,6 +158,7 @@ static GIT_PATH_FUNC(rebase_path_strategy, 
"rebase-merge/strategy")
 static GIT_PATH_FUNC(rebase_path_strategy_opts, "rebase-merge/strategy_opts")
 static GIT_PATH_FUNC(rebase_path_allow_rerere_autoupdate, 
"rebase-merge/allow_rerere_autoupdate")
 static GIT_PATH_FUNC(rebase_path_quiet, "rebase-merge/quiet")
+static GIT_PATH_FUNC(rebase_path_reschedule_failed_exec, 
"rebase-merge/reschedule-failed-exec")
 
 static int git_sequencer_config(const char *k, const char *v, void *cb)
 {
@@ -2362,6 +2363,9 @@ static int read_populate_opts(struct replay_opts *opts)
opts->signoff = 1;
}
 
+   if (file_exists(rebase_path_reschedule_failed_exec()))
+   opts->reschedule_failed_exec = 1;
+
read_strategy_opts(opts, );
strbuf_release();
 
@@ -2443,6 +2447,8 @@ int write_basic_state(struct replay_opts *opts, const 
char *head_name,
write_file(rebase_path_gpg_sign_opt(), "-S%s\n", 
opts->gpg_sign);
if (opts->signoff)
write_file(rebase_path_signoff(), "--signoff\n");
+   if (opts->reschedule_failed_exec)
+   write_file(rebase_path_reschedule_failed_exec(), "%s", "");
 
return 0;
 }
@@ -3586,9 +3592,10 @@ static int pick_commits(struct todo_list *todo_list, 
struct replay_opts *opts)
*end_of_arg = saved;
 
/* Reread the todo file if it has changed. */
-   if (res)
-   ; /* fall through */
-   else if (stat(get_todo_path(opts), ))
+   if (res) {
+   if (opts->reschedule_failed_exec)
+  

Re: [PATCH] t5562: skip if NO_CURL is enabled

2018-11-28 Thread Ævar Arnfjörð Bjarmason


On Thu, Nov 22 2018, Jeff King wrote:

> On Thu, Nov 22, 2018 at 02:17:01AM -0800, Carlo Arenas wrote:
>> PS. upstreaming the PERL_PATH fix is likely to be good to do soonish
>> as I presume at least all BSD might be affected, let me know if you
>> would rather me do that instead as I suspect we might be deadlocked
>> otherwise ;)
>
> Yeah, the $PERL_PATH thing is totally orthogonal, and should graduate
> separately.

On the subject of orthagonal things: This test fails on AIX with /bin/sh
(but not /bin/bash) due to some interaction of ssize_b100dots and the
build_option function. On that system:

$ ./git version --build-options
git version 2.20.0-rc1
cpu: 00FA74164C00
no commit associated with this build
sizeof-long: 4
sizeof-size_t: 4

But it somehow ends up in the 'die' condition in that case statement. I
dug around briefly but couldn't find the cause, probably some limitation
in the shell constructs it supports. Just leaving a note about this...


[bug report] git-gui child windows are blank

2018-11-28 Thread Kenn Sebesta
v2.19.2, installed from brew on macOS Mojave 14.2.1.

`git-gui` is my much beloved go-to tool for everything git.
Unfortunately, on my new Macbook Air it seems to have a bug. When I
first load the program, the parent window populates normally with the
stage/unstaged and diff panes. However, when I click Push, the child
window is completely blank. The frame is there, but there is no
content.

This same behavior is seen if I do a `git gui blame`, where the
primary blame window opens normally but all the children windows are
empty.

I have googled but was unsuccessful in finding a solution. Is this a
known issue?


--Kenn


[PATCH] rebase -i: introduce the 'test' command

2018-11-28 Thread Paul Morelle
The 'exec' command can be used to run tests on a set of commits,
interrupting on failing commits to let the user fix the tests.

However, the 'exec' line has been consumed, so it won't be ran again by
'git rebase --continue' is ran, even if the tests weren't fixed.

This commit introduces a new command 'test' equivalent to 'exec', except
that it is automatically rescheduled in the todo list if it fails.

Signed-off-by: Paul Morelle 
---
 Documentation/git-rebase.txt  |  9 ++---
 rebase-interactive.c  |  1 +
 sequencer.c   | 23 +++
 t/lib-rebase.sh   |  4 +++-
 t/t3404-rebase-interactive.sh | 16 
 5 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 80793bad8..c8f565637 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -693,8 +693,8 @@ $ git rebase -i -p --onto Q O
 Reordering and editing commits usually creates untested intermediate
 steps.  You may want to check that your history editing did not break
 anything by running a test, or at least recompiling at intermediate
-points in history by using the "exec" command (shortcut "x").  You may
-do so by creating a todo list like this one:
+points in history by using the "exec" command (shortcut "x") or the
+"test" command.  You may do so by creating a todo list like this one:
  ---
 pick deadbee Implement feature XXX
@@ -702,7 +702,7 @@ fixup f1a5c00 Fix to feature XXX
 exec make
 pick c0ffeee The oneline of the next commit
 edit deadbab The oneline of the commit after
-exec cd subdir; make test
+test cd subdir; make test
 ...
 ---
 @@ -715,6 +715,9 @@ in `$SHELL`, or the default shell if `$SHELL` is
not set), so you can
 use shell features (like "cd", ">", ";" ...). The command is run from
 the root of the working tree.
 +The "test" command does the same, but will remain in the todo list as
+the next command, until it succeeds.
+
 --
 $ git rebase -i --exec "make test"
 --
diff --git a/rebase-interactive.c b/rebase-interactive.c
index 78f3263fc..4a408661d 100644
--- a/rebase-interactive.c
+++ b/rebase-interactive.c
@@ -14,6 +14,7 @@ void append_todo_help(unsigned edit_todo, unsigned
keep_empty,
 "s, squash  = use commit, but meld into previous commit\n"
 "f, fixup  = like \"squash\", but discard this commit's log
message\n"
 "x, exec  = run command (the rest of the line) using shell\n"
+"   test  = same as exec command, but keep it in TODO if it
fails\n"
 "b, break = stop here (continue rebase later with 'git rebase
--continue')\n"
 "d, drop  = remove commit\n"
 "l, label  = label current HEAD with a name\n"
diff --git a/sequencer.c b/sequencer.c
index e1a4dd15f..c3dde6910 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1508,6 +1508,7 @@ enum todo_command {
TODO_SQUASH,
/* commands that do something else than handling a single commit */
TODO_EXEC,
+   TODO_TEST,
TODO_BREAK,
TODO_LABEL,
TODO_RESET,
@@ -1530,6 +1531,7 @@ static struct {
{ 'f', "fixup" },
{ 's', "squash" },
{ 'x', "exec" },
+   { 0,   "test" },
{ 'b', "break" },
{ 'l', "label" },
{ 't', "reset" },
@@ -2072,7 +2074,7 @@ static int parse_insn_line(struct todo_item *item,
const char *bol, char *eol)
 command_to_string(item->command));
if (item->command == TODO_EXEC || item->command == TODO_LABEL ||
-   item->command == TODO_RESET) {
+   item->command == TODO_RESET || item->command == TODO_TEST) {
item->commit = NULL;
item->arg = bol;
item->arg_len = (int)(eol - bol);
@@ -3576,7 +3578,7 @@ static int pick_commits(struct todo_list
*todo_list, struct replay_opts *opts)
item->arg, item->arg_len, opts,
res, to_amend);
}
-   } else if (item->command == TODO_EXEC) {
+   } else if (item->command == TODO_EXEC || item->command == 
TODO_TEST) {
char *end_of_arg = (char *)(item->arg + item->arg_len);
int saved = *end_of_arg;
struct stat st;
@@ -3586,9 +3588,12 @@ static int pick_commits(struct todo_list
*todo_list, struct replay_opts *opts)
*end_of_arg = saved;
/* Reread the todo file if it has changed. */
-   if (res)
+   if (res) {
; /* fall through */
-   else if (stat(get_todo_path(opts), ))
+   if (item->command == TODO_TEST) {
+   reschedule = 1;
+   

Re: [PATCH v1] teach git to support a virtual (partially populated) work directory

2018-11-28 Thread SZEDER Gábor
On Tue, Nov 27, 2018 at 02:50:57PM -0500, Ben Peart wrote:

> diff --git a/t/t1092-virtualworkdir.sh b/t/t1092-virtualworkdir.sh
> new file mode 100755
> index 00..0cdfe9b362
> --- /dev/null
> +++ b/t/t1092-virtualworkdir.sh
> @@ -0,0 +1,393 @@
> +#!/bin/sh
> +
> +test_description='virtual work directory tests'
> +
> +. ./test-lib.sh
> +
> +# We need total control of the virtual work directory hook
> +sane_unset GIT_TEST_VIRTUALWORKDIR
> +
> +clean_repo () {
> + rm .git/index &&
> + git -c core.virtualworkdir=false reset --hard HEAD &&
> + git -c core.virtualworkdir=false clean -fd &&
> + touch untracked.txt &&

We would usually run '>untracked.txt' instead, sparing the external
process.

A further nit is that a function called 'clean_repo' creates new
untracked files...

> + touch dir1/untracked.txt &&
> + touch dir2/untracked.txt
> +}
> +
> +test_expect_success 'setup' '
> + mkdir -p .git/hooks/ &&
> + cat > .gitignore <<-\EOF &&

CodingGuidelines suggest no space between redirection operator and
filename.

> + .gitignore
> + expect*
> + actual*
> + EOF
> + touch file1.txt &&
> + touch file2.txt &&
> + mkdir -p dir1 &&
> + touch dir1/file1.txt &&
> + touch dir1/file2.txt &&
> + mkdir -p dir2 &&
> + touch dir2/file1.txt &&
> + touch dir2/file2.txt &&
> + git add . &&
> + git commit -m "initial" &&
> + git config --local core.virtualworkdir true
> +'


> +test_expect_success 'verify files not listed are ignored by git clean -f -x' 
> '
> + clean_repo &&

I find it odd to clean the repo right after setting it up; but then
again, 'clean_repo' not only cleans, but also creates new files.
Perhaps rename it to 'reset_repo'?  Dunno.

> + write_script .git/hooks/virtual-work-dir <<-\EOF &&
> + printf "untracked.txt\0"
> + printf "dir1/\0"
> + EOF
> + mkdir -p dir3 &&
> + touch dir3/untracked.txt &&
> + git clean -f -x &&
> + test -f file1.txt &&

Please use the 'test_path_is_file', ...

> + test -f file2.txt &&
> + test ! -f untracked.txt &&

... 'test_path_is_missing', and ...

> + test -d dir1 &&

... 'test_path_is_dir' helpers, respectively, because they print
informative error messages on failure.

> + test -f dir1/file1.txt &&
> + test -f dir1/file2.txt &&
> + test ! -f dir1/untracked.txt &&
> + test -f dir2/file1.txt &&
> + test -f dir2/file2.txt &&
> + test -f dir2/untracked.txt &&
> + test -d dir3 &&
> + test -f dir3/untracked.txt
> +'


Re: [PATCH] t5562: skip if NO_CURL is enabled

2018-11-28 Thread SZEDER Gábor
On Tue, Nov 20, 2018 at 04:11:08AM -0500, Jeff King wrote:
> On Mon, Nov 19, 2018 at 11:36:08AM -0800, Carlo Arenas wrote:
> 
> > tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in
> > t5562/invoke-with-content-length.pl, while I seem to be getting some
> > sporadic errors in 9 with the following output :
> > 
> > ++ env CONTENT_TYPE=application/x-git-receive-pack-request
> > QUERY_STRING=/repo.git/git-receive-pack
> > 'PATH_TRANSLATED=/home/carenas/src/git/t/trash
> > directory.t5562-http-backend-content-length/.git/git-receive-pack'
> > GIT_HTTP_EXPORT_ALL=TRUE REQUEST_METHOD=POST
> > /home/carenas/src/git/t/t5562/invoke-with-content-length.pl push_body
> > git http-backend
> > ++ verify_http_result '200 OK'
> > ++ grep fatal: act.err
> > Binary file act.err matches
> > ++ return 1
> > error: last command exited with $?=1
> > not ok 9 - push plain
> > 
> > and the following output in act.err (with a 200 in act)
> > 
> > fatal: the remote end hung up unexpectedly
> 
> This bit me today, too, and I can reproduce it by running under my
> stress-testing script.

I saw this a few times on Travis CI as well.

> Curiously, the act.err file also has 54 NUL bytes before the "fatal:"
> message.

I think those NUL bytes come from the file system.

The contents of 'act.err' from the previous test ('fetch gzipped
empty') is usually:

  fatal: request ended in the middle of the gzip stream
  fatal: the remote end hung up unexpectedly

Notice that the length of the first line is 54 bytes (including the
trailing newline).  So I suspect that the following is happening:

  - http-backend in the previous test writes the first line,
  - that test finishes and this one starts,
  - this test truncates 'act.err',
  - and then the still-running http-backend from the previous test
finally writes the second line.

So at this point 'act.err' is empty, but the offset of the fd of the
redirection still open from the previous test is at 54, so the file
system fills those bytes with NULs.


> I tried adding an "strace" to see who was producing that
> output, but I can't seem to get it to fail when running under strace
> (presumably because the timing is quite different, and this likely some
> kind of pipe race).
> 
> -Peff


Re: [PATCH v2] log -G: Ignore binary files

2018-11-28 Thread Ævar Arnfjörð Bjarmason


On Wed, Nov 28 2018, Thomas Braun wrote:

Looks much better this time around.

> The -G option of log looks for the differences whose patch text
> contains added/removed lines that match regex.
>
> As the concept of patch text only makes sense for text files, we need to
> ignore binary files when searching with -G  as well.
>
> The -S option of log looks for differences that changes
> the number of occurrences of the specified block of text (i.e.
> addition/deletion) in a file. As we want to keep the current behaviour,
> add a test to ensure it.
> [...]
> diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
> index c0a60f3158..059ddd3431 100644
> --- a/Documentation/gitdiffcore.txt
> +++ b/Documentation/gitdiffcore.txt
> @@ -242,7 +242,7 @@ textual diff has an added or a deleted line that matches 
> the given
>  regular expression.  This means that it will detect in-file (or what
>  rename-detection considers the same file) moves, which is noise.  The
>  implementation runs diff twice and greps, and this can be quite
> -expensive.
> +expensive.  Binary files without textconv filter are ignored.

Now that we support --text that should be documented. I tried to come up
with something on top:

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 0378cd574e..42ae65fb57 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -524,6 +524,10 @@ struct), and want to know the history of that block 
since it first
 came into being: use the feature iteratively to feed the interesting
 block in the preimage back into `-S`, and keep going until you get the
 very first version of the block.
++
+Unlike `-G` the `-S` option will always search through binary files
+without a textconv filter. [[TODO: Don't we want to support --no-text
+then as an optimization?]].

 -G::
Look for differences whose patch text contains added/removed
@@ -545,6 +549,15 @@ occurrences of that string did not change).
 +
 See the 'pickaxe' entry in linkgit:gitdiffcore[7] for more
 information.
++
+Unless `--text` is supplied binary files without a textconv filter
+will be ignored.  This was not the case before Git version 2.21..
++
+With `--text`, instead of patch lines we ::
Look for differences that change the number of occurrences of
diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
index c0a60f3158..26880b4149 100644
--- a/Documentation/gitdiffcore.txt
+++ b/Documentation/gitdiffcore.txt
@@ -251,6 +251,10 @@ criterion in a changeset, the entire changeset is 
kept.  This behavior
 is designed to make reviewing changes in the context of the whole
 changeset easier.

+Both `-S' and `-G' will ignore binary files without a textconv filter
+by default, this can be overriden with `--text`. With `--text` the
+binary patch we look through is generated as [[TODO: ???]].
+
 diffcore-order: For Sorting the Output Based on Filenames
 -

But as you can see given the TODO comments I don't know how this works
exactly. I *could* dig, but that's my main outstanding problem with this
patch, the commit message / docs aren't being updated to reflect the new
behavior.

I.e. let's leave the docs in some state where the reader can as
unambiguously know what to expect with -G and these binary diffs we've
been implicitly supporting as with the textual diffs. Ideally with some
examples of how to generate them (re my question about the base85 output
in v1).

Part of that's obviously behavior we've had all along, but it's much
more convincing to say:

We are changing X which we've done for ages, it works exactly like
this, and here's a switch to get it back.

Instead of:

X doesn't make sense, let's turn it off.

Also the diffcore docs already say stuff about how slow/fast things are,
and in a side-thread you said:

My main motiviation is to speed up "log -G" as that takes a
considerable amount of time when it wades through MBs of binary
files which change often.

Makes sense, but then let's say something about that in that section of
the docs.

>  When `-S` or `-G` are used without `--pickaxe-all`, only filepairs
>  that match their respective criterion are kept in the output.  When
> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> index 69fc55ea1e..4cea086f80 100644
> --- a/diffcore-pickaxe.c
> +++ b/diffcore-pickaxe.c
> @@ -154,6 +154,12 @@ static int pickaxe_match(struct diff_filepair *p, struct 
> diff_options *o,
>   if (textconv_one == textconv_two && diff_unmodified_pair(p))
>   return 0;
>
> + if ((o->pickaxe_opts & DIFF_PICKAXE_KIND_G) &&
> + !o->flags.text &&
> + ((!textconv_one && diff_filespec_is_binary(o->repo, p->one)) ||
> +  (!textconv_two && 

Re: [PATCH] rebase: mark the C reimplementation as an experimental opt-in feature (was Re: [ANNOUNCE] Git v2.20.0-rc1)

2018-11-28 Thread Ævar Arnfjörð Bjarmason


On Wed, Nov 28 2018, Johannes Schindelin wrote:

> Hi Jonathan,
>
> On Tue, 27 Nov 2018, Jonathan Nieder wrote:
>
>> At https://bugs.debian.org/914695 is a report of a test regression in
>> an outside project that is very likely to have been triggered by the
>> new faster rebase code.
>
> From looking through that log.gz (without having a clue where the test
> code lives, so I cannot say what it is supposed to do, and also: this is
> the first time I hear about dgit...), it would appear that this must be a
> regression in the reflog messages produced by `git rebase`.
>
>> The issue has not been triaged, so I don't know yet whether it's a
>> problem in rebase-in-c or a manifestation of a bug in the test.
>
> It ends thusly:
>
> -- snip --
> [...]
> + git reflog
> + egrep 'debrebase new-upstream.*checkout'
> + test 1 = 0
> + t-report-failure
> + set +x
> TEST FAILED
> -- snap --
>
> Which makes me think that the reflog we produce in *some* code path that
> originally called `git checkout` differs from the scripted rebase's
> generated reflog.
>
>> That said, Google has been running with the new rebase since ~1 month
>> ago when it became the default, with no issues reported by users.  As a
>> result, I am confident that it can cope with what most users of "next"
>> throw at it, which means that if we are to find more issues to polish it
>> better, it will need all the exposure it can get.
>
> Right. And there are a few weeks before the holidays, which should give me
> time to fix whatever bugs are discovered (I only half mind being the only
> one who fixes these bugs).
>
>> In the Google deployment, we will keep using rebase-in-c even if it
>> gets disabled by default, in order to help with that.
>>
>> From the Debian point of view, it's only a matter of time before
>> rebase-in-c becomes the default: even if it's not the default in 2.20,
>> it would presumably be so in 2.21 or 2.22.  That means the community's
>> attention when resolving security and reliability bugs would be on the
>> rebase-in-c implementation.  As a result, the Debian package will most
>> likely enable rebase-in-c by default even if upstream disables it, in
>> order to increase the package's shelf life (i.e. to ease the
>> maintenance burden of supporting whichever version of the package ends
>> up in the next Debian stable).
>>
>> So with either hat on, it doesn't matter whether you apply this patch
>> upstream.
>>
>> Having two pretty different deployments end up with the same
>> conclusion leads me to suspect that it's best for upstream not to
>> apply the revert patch, unless either
>>
>>   (a) we have a concrete regression to address and then try again, or
>>   (b) we have a test or other plan to follow before trying again.
>
> In this instance, I am more a fan of the "let's move fast and break
> things, then move even faster fixing them" approach.
>
> Besides, the bug that Ævar discovered was a bug already in the scripted
> rebase, but hidden by yet another bug (the missing error checking).
>
> I get the pretty firm impression that the common code paths are now pretty
> robust, and only lesser-exercised features may expose a bug (or
> regression, as in the case of the reflogs, where one could argue that the
> exact reflog message is not something we promise not to fiddle with).

Since I raised this 'should we hold off?' I thought I'd chime in and say
that I'm fine with going along with what you suggest and having the
builtin as the default in the final. IOW not merge
jc/postpone-rebase-in-c down.


Re: Broken alignment in git-reset docs

2018-11-28 Thread Martin Ågren
On Wed, 28 Nov 2018 at 12:42, Paweł Samoraj  wrote:
>
> Hi!
> The git-reset documentation page section which is accessible via URL
> https://git-scm.com/docs/git-reset#_discussion is not looking good.
>
[snip]
>
> The web archive has got a snapshot from 2014-06-28 when it was ok
> (https://web.archive.org/web/20140628170155/http://git-scm.com/docs/git-reset).

Hi Paweł

Thanks for the report. The sources have not changed since 2010, so this
is most likely because something in the build has changed. It's probably
that git-scm.com has switched to using Asciidoctor (as opposed to
Asciidoc). The correct fix could be something like 379805051d
("Documentation: render revisions correctly under Asciidoctor",
2018-05-06).

Do you feel like attempting a patch against git.git? The hard part of
that is probably to get all the build dependencies in place, in
particular to be able to test with both Asciidoc and Asciidoctor. See
USE_ASCIIDOCTOR in Documentation/Makefile. If you're not up for it, no
problem, I should be able to get to this later today.

Thanks again for the report.
Martin


Broken alignment in git-reset docs

2018-11-28 Thread Paweł Samoraj
Hi!
The git-reset documentation page section which is accessible via URL
https://git-scm.com/docs/git-reset#_discussion is not looking good.

ASCII tables should look like this:

  working index HEAD target working index HEAD
  
   A   B CD --soft  A   B D
 --mixed   A   D D
 --hard D   D D
 --merge (disallowed)
 --keep  (disallowed)

but are like this:

  working index HEAD target working index HEAD
  
   A   B CD --soft   A   B D
--mixed  A   D D
--hard   D   D D
--merge (disallowed)
--keep  (disallowed)


The web archive has got a snapshot from 2014-06-28 when it was ok
(https://web.archive.org/web/20140628170155/http://git-scm.com/docs/git-reset).


[PATCH v2] log -G: Ignore binary files

2018-11-28 Thread Thomas Braun
The -G option of log looks for the differences whose patch text
contains added/removed lines that match regex.

As the concept of patch text only makes sense for text files, we need to
ignore binary files when searching with -G  as well.

The -S option of log looks for differences that changes
the number of occurrences of the specified block of text (i.e.
addition/deletion) in a file. As we want to keep the current behaviour,
add a test to ensure it.

Signed-off-by: Thomas Braun 
---

Changes since v1:
- Merged both patches into one
- Adapted commit messages
- Added missing support for -a flag with tests
- Placed new code into correct location to be able to reuse an existing
  optimization
- Uses help-suggested -Ga writing without spaces
- Uses orphan branches instead of cannonball cleanup with rm -rf
- Changed search text to make it clear that it is not about the \0 boundary

 Documentation/gitdiffcore.txt |  2 +-
 diffcore-pickaxe.c|  6 ++
 t/t4209-log-pickaxe.sh| 40 +++
 3 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
index c0a60f3158..059ddd3431 100644
--- a/Documentation/gitdiffcore.txt
+++ b/Documentation/gitdiffcore.txt
@@ -242,7 +242,7 @@ textual diff has an added or a deleted line that matches 
the given
 regular expression.  This means that it will detect in-file (or what
 rename-detection considers the same file) moves, which is noise.  The
 implementation runs diff twice and greps, and this can be quite
-expensive.
+expensive.  Binary files without textconv filter are ignored.
 
 When `-S` or `-G` are used without `--pickaxe-all`, only filepairs
 that match their respective criterion are kept in the output.  When
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 69fc55ea1e..4cea086f80 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -154,6 +154,12 @@ static int pickaxe_match(struct diff_filepair *p, struct 
diff_options *o,
if (textconv_one == textconv_two && diff_unmodified_pair(p))
return 0;
 
+   if ((o->pickaxe_opts & DIFF_PICKAXE_KIND_G) &&
+   !o->flags.text &&
+   ((!textconv_one && diff_filespec_is_binary(o->repo, p->one)) ||
+(!textconv_two && diff_filespec_is_binary(o->repo, p->two
+   return 0;
+
mf1.size = fill_textconv(o->repo, textconv_one, p->one, );
mf2.size = fill_textconv(o->repo, textconv_two, p->two, );
 
diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
index 844df760f7..5c3e2a16b2 100755
--- a/t/t4209-log-pickaxe.sh
+++ b/t/t4209-log-pickaxe.sh
@@ -106,4 +106,44 @@ test_expect_success 'log -S --no-textconv (missing 
textconv tool)' '
rm .gitattributes
 '
 
+test_expect_success 'log -G ignores binary files' '
+   git checkout --orphan orphan1 &&
+   printf "a\0a" >data.bin &&
+   git add data.bin &&
+   git commit -m "message" &&
+   git log -Ga >result &&
+   test_must_be_empty result
+'
+
+test_expect_success 'log -G looks into binary files with -a' '
+   git checkout --orphan orphan2 &&
+   printf "a\0a" >data.bin &&
+   git add data.bin &&
+   git commit -m "message" &&
+   git log -a -Ga >actual &&
+   git log >expected &&
+   test_cmp actual expected
+'
+
+test_expect_success 'log -G looks into binary files with textconv filter' '
+   git checkout --orphan orphan3 &&
+   echo "* diff=bin" > .gitattributes &&
+   printf "a\0a" >data.bin &&
+   git add data.bin &&
+   git commit -m "message" &&
+   git -c diff.bin.textconv=cat log -Ga >actual &&
+   git log >expected &&
+   test_cmp actual expected
+'
+
+test_expect_success 'log -S looks into binary files' '
+   git checkout --orphan orphan4 &&
+   printf "a\0a" >data.bin &&
+   git add data.bin &&
+   git commit -m "message" &&
+   git log -Sa >actual &&
+   git log >expected &&
+   test_cmp actual expected
+'
+
 test_done
-- 
2.19.0.271.gfe8321ec05.dirty



Re: [PATCH v1 1/2] log -G: Ignore binary files

2018-11-28 Thread Thomas Braun
> Ævar Arnfjörð Bjarmason  hat am 22. November 2018 um 11:16 
> geschrieben:

[...]

> >
> > +test_expect_success 'log -G ignores binary files' '
> > +   rm -rf .git &&
> > +   git init &&
> > +   printf "a\0b" >data.bin &&
> > +   git add data.bin &&
> > +   git commit -m "message" &&
> > +   git log -G a >result &&
> 
> Would be less confusing as "-Ga" since that's the invocation we
> document, even though I see (but wasn't aware that...) "-G a" works too.

Done.

> > +   test_must_be_empty result
> > +'
> > +
> > +test_expect_success 'log -G looks into binary files with textconv filter' '
> > +   rm -rf .git &&
> > +   git init &&
> > +   echo "* diff=bin" > .gitattributes &&
> > +   printf "a\0b" >data.bin &&
> > +   git add data.bin &&
> > +   git commit -m "message" &&
> > +   git -c diff.bin.textconv=cat log -G a >actual &&
> > +   git log >expected &&
> > +   test_cmp actual expected
> > +'
> > +
> >  test_done
> 
> This patch seems like the wrong direction to me. In particular the
> assertion that "the concept of differences only makes sense for text
> files". That's just not true. This patch breaks this:
> 
> (
> rm -rf /tmp/g-test &&
> git init /tmp/g-test &&
> cd /tmp/g-test &&
> for i in {1..10}; do
> echo "Always matching thensome 5" >file &&
> printf "a thensome %d binary \0" $i >>file &&
> git add file &&
> git commit -m"Bump $i"
> done &&
> git log -Gthensome.*5
> )
> 
> Right now this will emit 3/10 patches, and the right ones! I.e. "Bump
> [156]". The 1st one because it introduces the "Always matching thensome
> 5". Then 5/6 because the add/remove the string "a thensome 5 binary",
> respectively. Which matches /thensome.*5/.

log -p does not show you the patch text in your example because it is treated
as binary. And currently "log -G" has a different opinion into what it looks
and what it ignores. My patch tries to bring both more in line.
 
> I.e. in the first one we do a regex match against the content here
> because we don't have both sides:
> https://github.com/git/git/blob/v2.19.2/diffcore-pickaxe.c#L48-L53
> 
> And then for the later ones where we have both sides we end up in
> diffgrep_consume():
> https://github.com/git/git/blob/v2.19.2/diffcore-pickaxe.c#L27-L36
> 
> I think there may be a real issue here to address, which might be some
> combination of:
> 
>  a) Even though the diffcore can do a binary diff internally, this is
> not what it exposes with "-p", we just say "Binary files differ".
> 
> I don't know how to emit the raw version we'll end up passing to
> diffgrep_consume() in this case. Is it just --binary without the
> encoding? I don't know...
> 
>  b) Your test case shows that you're matching a string at a \0
> boundary. Is this perhaps something you ran into? I.e. that we don't
> have some -F version of -G so we can't supply regexes that match
> past a \0? I had some related work on grep for this that hasn't been
> carried over to the diffcore:
> 
> git log --grep='grep:.*\\0' --author=Ævar
> 
>  c) Is this binary diff we end up matching against just bad in some
> cases? I haven't dug but that wouldn't surprise me, i.e. that it's
> trying to be line-based so we'll overmatch in many cases.
> 
> So maybe this is something that should be passed down as a flag? See a
> recent discussion at
> https://public-inbox.org/git/87lg77cmr1@evledraar.gmail.com/ for how
> that could be done.

It is not about the \0 boundary. v2 of the patches will clarify that. My main
motiviation is to speed up "log -G" as that takes a considerable amount of time 
when it wades through MBs of binary files which change often. And in multiple 
places
I can already treat binary files differently (e.g. turn off delta compression, 
skip
trying to diff them, no EOL normalization). And for me making log -G ignore 
what git 
thinks are binary files is making the line clearer between what should be 
treated
as binary and what as text.

> Also if we don't have some tests already that were failing with this
> patch we really should have those as "let's test the current behavior
> first". Unfortunately tests in this area are really lacking, see
> e.g. my:
> 
> git log --author=Junio --min-parents=2 --grep=ab/.*grep
> 
> For some series of patches to grep where to get one patch in I needed to
> often lead with 5-10 test patches to convince reviewers that I knew what
> I was changing, and also to be comfortable that I'd covered all the edge
> cases we currently supported, but weren't testing for.

I'm happy to add more test cases to convince everyone involved :)


Re: [PATCH v1 2/2] log -S: Add test which searches in binary files

2018-11-28 Thread Thomas Braun
> Junio C Hamano  hat am 22. November 2018 um 02:34 
> geschrieben:
> 
> 
> Thomas Braun  writes:
> 
> > The -S  option of log looks for differences that changes the
> > number of occurrences of the specified string (i.e. addition/deletion)
> > in a file.
> 
> s/-S /-S/ and
> s/the specified string/the specified block of text/ would make it
> more in line with how Documentation/gitdiffcore.txt explains it.
> The original discussion from early 2017 also explains with a pointer
> why the primary mode of -S is not  but is .

Thanks for the pointer. I've updated the commit message.
 
> > diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
> > index 42cc8afd8b..d430f6f2f9 100755
> > --- a/t/t4209-log-pickaxe.sh
> > +++ b/t/t4209-log-pickaxe.sh
> > @@ -128,4 +128,15 @@ test_expect_success 'log -G looks into binary files 
> > with textconv filter' '
> > test_cmp actual expected
> >  '
> >  
> > +test_expect_success 'log -S looks into binary files' '
> > +   rm -rf .git &&
> > +   git init &&
> 
> Same comment as the one for 1/2 applies here.

Fixed as well.

> > +   printf "a\0b" >data.bin &&
> > +   git add data.bin &&
> > +   git commit -m "message" &&
> > +   git log -S a >actual &&
> > +   git log >expected &&
> > +   test_cmp actual expected
> > +'
> > +
> >  test_done
> 
> Other than these, I think both patches look sensible.  Thanks for
> resurrecting the old topic and reigniting it.
>


Re: [PATCH v1 1/2] log -G: Ignore binary files

2018-11-28 Thread Thomas Braun


> Junio C Hamano  hat am 27. November 2018 um 01:51 
> geschrieben:
> 
> 
> Stefan Beller  writes:
> 
> > On Wed, Nov 21, 2018 at 1:08 PM Thomas Braun
> >  wrote:
> >>
> >> The -G  option of log looks for the differences whose patch text
> >> contains added/removed lines that match regex.
> >>
> >> The concept of differences only makes sense for text files, therefore
> >> we need to ignore binary files when searching with -G  as well.
> >
> > What about partial text/partial binary files?
> 
> Good point. You'd use "-a" (or "--text") to tell the diff machinery
> to treat the contents as text, and the new logic must pay attention
> to that command line option.

Yes exactly. Either use -a for the occasional use or a textconv filter
for permanent use.

Coming from the opposite side: I usually mark svg files as binary as the
textual diff is well, let's say uninspiring.


Re: [PATCH v1 2/2] log -S: Add test which searches in binary files

2018-11-28 Thread Thomas Braun
> Ævar Arnfjörð Bjarmason  hat am 22. November 2018 um 10:14 
> geschrieben:
> 
> 
> 
> On Wed, Nov 21 2018, Thomas Braun wrote:
> 
> > The -S  option of log looks for differences that changes the
> > number of occurrences of the specified string (i.e. addition/deletion)
> > in a file.
> >
> > Add a test to ensure that we keep looking into binary files with -S
> > as changing that would break backwards compatibility in unexpected ways.
> >
> > Signed-off-by: Thomas Braun 
> > ---
> >  t/t4209-log-pickaxe.sh | 11 +++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
> > index 42cc8afd8b..d430f6f2f9 100755
> > --- a/t/t4209-log-pickaxe.sh
> > +++ b/t/t4209-log-pickaxe.sh
> > @@ -128,4 +128,15 @@ test_expect_success 'log -G looks into binary files 
> > with textconv filter' '
> > test_cmp actual expected
> >  '
> >
> > +test_expect_success 'log -S looks into binary files' '
> > +   rm -rf .git &&
> > +   git init &&
> > +   printf "a\0b" >data.bin &&
> > +   git add data.bin &&
> > +   git commit -m "message" &&
> > +   git log -S a >actual &&
> > +   git log >expected &&
> > +   test_cmp actual expected
> > +'
> > +
> >  test_done
> 
> This should just be part of 1/2 since the behavior is changed there &
> the commit message should describe both cases.

My reasoning was that this is a separate test which does not fit in with the 
other part.
But I'm happy in folding both into one patch. Done.


Re: [PATCH v1 1/2] log -G: Ignore binary files

2018-11-28 Thread Thomas Braun
> Ævar Arnfjörð Bjarmason  hat am 22. November 2018 um 11:16 
> geschrieben:

[...]

> >
> > +test_expect_success 'log -G ignores binary files' '
> > +   rm -rf .git &&
> > +   git init &&
> > +   printf "a\0b" >data.bin &&
> > +   git add data.bin &&
> > +   git commit -m "message" &&
> > +   git log -G a >result &&
> 
> Would be less confusing as "-Ga" since that's the invocation we
> document, even though I see (but wasn't aware that...) "-G a" works too.

Done.

> > +   test_must_be_empty result
> > +'
> > +
> > +test_expect_success 'log -G looks into binary files with textconv filter' '
> > +   rm -rf .git &&
> > +   git init &&
> > +   echo "* diff=bin" > .gitattributes &&
> > +   printf "a\0b" >data.bin &&
> > +   git add data.bin &&
> > +   git commit -m "message" &&
> > +   git -c diff.bin.textconv=cat log -G a >actual &&
> > +   git log >expected &&
> > +   test_cmp actual expected
> > +'
> > +
> >  test_done
> 
> This patch seems like the wrong direction to me. In particular the
> assertion that "the concept of differences only makes sense for text
> files". That's just not true. This patch breaks this:
> 
> (
> rm -rf /tmp/g-test &&
> git init /tmp/g-test &&
> cd /tmp/g-test &&
> for i in {1..10}; do
> echo "Always matching thensome 5" >file &&
> printf "a thensome %d binary \0" $i >>file &&
> git add file &&
> git commit -m"Bump $i"
> done &&
> git log -Gthensome.*5
> )
> 
> Right now this will emit 3/10 patches, and the right ones! I.e. "Bump
> [156]". The 1st one because it introduces the "Always matching thensome
> 5". Then 5/6 because the add/remove the string "a thensome 5 binary",
> respectively. Which matches /thensome.*5/.

log -p does not show you the patch text in your example because it is treated
as binary. And currently "log -G" has a different opinion into what it looks
and what it ignores. My patch tries to bring both more in line.
 
> I.e. in the first one we do a regex match against the content here
> because we don't have both sides:
> https://github.com/git/git/blob/v2.19.2/diffcore-pickaxe.c#L48-L53
> 
> And then for the later ones where we have both sides we end up in
> diffgrep_consume():
> https://github.com/git/git/blob/v2.19.2/diffcore-pickaxe.c#L27-L36
> 
> I think there may be a real issue here to address, which might be some
> combination of:
> 
>  a) Even though the diffcore can do a binary diff internally, this is
> not what it exposes with "-p", we just say "Binary files differ".
> 
> I don't know how to emit the raw version we'll end up passing to
> diffgrep_consume() in this case. Is it just --binary without the
> encoding? I don't know...
> 
>  b) Your test case shows that you're matching a string at a \0
> boundary. Is this perhaps something you ran into? I.e. that we don't
> have some -F version of -G so we can't supply regexes that match
> past a \0? I had some related work on grep for this that hasn't been
> carried over to the diffcore:
> 
> git log --grep='grep:.*\\0' --author=Ævar
> 
>  c) Is this binary diff we end up matching against just bad in some
> cases? I haven't dug but that wouldn't surprise me, i.e. that it's
> trying to be line-based so we'll overmatch in many cases.
> 
> So maybe this is something that should be passed down as a flag? See a
> recent discussion at
> https://public-inbox.org/git/87lg77cmr1@evledraar.gmail.com/ for how
> that could be done.

It is not about the \0 boundary. v2 of the patches will clarify that. My main
motiviation is to speed up "log -G" as that takes a considerable amount of time 
when it wades through MBs of binary files which change often. And in multiple 
places
I can already treat binary files differently (e.g. turn off delta compression, 
skip
trying to diff them, no EOL normalization). And for me making log -G ignore 
what git 
thinks are binary files is making the line clearer between what should be 
treated as binary
and what as text.

> Also if we don't have some tests already that were failing with this
> patch we really should have those as "let's test the current behavior
> first". Unfortunately tests in this area are really lacking, see
> e.g. my:
> 
> git log --author=Junio --min-parents=2 --grep=ab/.*grep
> 
> For some series of patches to grep where to get one patch in I needed to
> often lead with 5-10 test patches to convince reviewers that I knew what
> I was changing, and also to be comfortable that I'd covered all the edge
> cases we currently supported, but weren't testing for.

I'm happy to add more test cases to convince everyone involved :)


Re: [PATCH v1 1/2] log -G: Ignore binary files

2018-11-28 Thread Thomas Braun
> Jeff King  hat am 22. November 2018 um 17:20 geschrieben:
> 
> 
> On Wed, Nov 21, 2018 at 09:52:27PM +0100, Thomas Braun wrote:
> 
> > diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> > index 69fc55ea1e..8c2558b07d 100644
> > --- a/diffcore-pickaxe.c
> > +++ b/diffcore-pickaxe.c
> > @@ -144,6 +144,11 @@ static int pickaxe_match(struct diff_filepair *p, 
> > struct diff_options *o,
> > textconv_two = get_textconv(o->repo->index, p->two);
> > }
> >  
> > +   if ((o->pickaxe_opts & DIFF_PICKAXE_KIND_G) &&
> > +   ((!textconv_one && diff_filespec_is_binary(o->repo, p->one)) ||
> > +(!textconv_two && diff_filespec_is_binary(o->repo, p->two
> > +   return 0;
> 
> If the user passes "-a" to treat binary files as text, we should
> probably skip the binary check. I think we'd need to check
> "o->flags.text" here.

Good point. I missed that flag. Added.

> > diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
> > index 844df760f7..42cc8afd8b 100755
> > --- a/t/t4209-log-pickaxe.sh
> > +++ b/t/t4209-log-pickaxe.sh
> > @@ -106,4 +106,26 @@ test_expect_success 'log -S --no-textconv (missing 
> > textconv tool)' '
> > [...]
> > +test_expect_success 'log -G ignores binary files' '
> > [...]
> > +test_expect_success 'log -G looks into binary files with textconv filter' '
> 
> And likewise add a test here similar to the textconv one.

Added as well.


Re: [PATCH v1 1/2] log -G: Ignore binary files

2018-11-28 Thread Thomas Braun
> Junio C Hamano  hat am 22. November 2018 um 02:29 
> geschrieben:
> 
> 
> Thomas Braun  writes:
> 
> > The -G  option of log looks for the differences whose patch text
> > contains added/removed lines that match regex.
> >
> > The concept of differences only makes sense for text files, therefore
> > we need to ignore binary files when searching with -G  as well.
> >
> > Signed-off-by: Thomas Braun 
> > ---
> >  Documentation/gitdiffcore.txt |  2 +-
> >  diffcore-pickaxe.c|  5 +
> >  t/t4209-log-pickaxe.sh| 22 ++
> >  3 files changed, 28 insertions(+), 1 deletion(-)
> 
> OK.
> 
> > diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
> > index c0a60f3158..059ddd3431 100644
> > --- a/Documentation/gitdiffcore.txt
> > +++ b/Documentation/gitdiffcore.txt
> > @@ -242,7 +242,7 @@ textual diff has an added or a deleted line that 
> > matches the given
> >  regular expression.  This means that it will detect in-file (or what
> >  rename-detection considers the same file) moves, which is noise.  The
> >  implementation runs diff twice and greps, and this can be quite
> > -expensive.
> > +expensive.  Binary files without textconv filter are ignored.
> 
> OK.
> 
> > diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> > index 69fc55ea1e..8c2558b07d 100644
> > --- a/diffcore-pickaxe.c
> > +++ b/diffcore-pickaxe.c
> > @@ -144,6 +144,11 @@ static int pickaxe_match(struct diff_filepair *p, 
> > struct diff_options *o,
> > textconv_two = get_textconv(o->repo->index, p->two);
> > }
> >  
> > +   if ((o->pickaxe_opts & DIFF_PICKAXE_KIND_G) &&
> > +   ((!textconv_one && diff_filespec_is_binary(o->repo, p->one)) ||
> > +(!textconv_two && diff_filespec_is_binary(o->repo, p->two
> > +   return 0;
> > +
> > /*
> >  * If we have an unmodified pair, we know that the count will be the
> >  * same and don't even have to load the blobs. Unless textconv is in
> 
> Shouldn't this new test come after the existing optimization, which
> allows us to leave without loading the blob contents (which is
> needed once you call diff_filespec_is_binary())?

Yes, good point.

> > diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
> > index 844df760f7..42cc8afd8b 100755
> > --- a/t/t4209-log-pickaxe.sh
> > +++ b/t/t4209-log-pickaxe.sh
> > @@ -106,4 +106,26 @@ test_expect_success 'log -S --no-textconv (missing 
> > textconv tool)' '
> > rm .gitattributes
> >  '
> >  
> > +test_expect_success 'log -G ignores binary files' '
> > +   rm -rf .git &&
> > +   git init &&
> 
> Please never never ever do the above two unless you are writing a
> test that checks low-level repository details.
> 
> If you want a clean history that has specific lineage of commits
> without getting affected by commits that have been made by the
> previous test pieces, it is OK to "checkout --orphan" to create an
> empty history to work with.

Thanks for the hint. I thought I had seen a less intrusive way for getting an 
empty history. 
Changed.

> > +   printf "a\0b" >data.bin &&
> > +   git add data.bin &&
> > +   git commit -m "message" &&
> > +   git log -G a >result &&
> > +   test_must_be_empty result
> > +'
> > +
> > +test_expect_success 'log -G looks into binary files with textconv filter' '
> > +   rm -rf .git &&
> > +   git init &&
> > +   echo "* diff=bin" > .gitattributes &&
> > +   printf "a\0b" >data.bin &&
> > +   git add data.bin &&
> > +   git commit -m "message" &&
> > +   git -c diff.bin.textconv=cat log -G a >actual &&
> > +   git log >expected &&
> > +   test_cmp actual expected
> > +'
> > +
> >  test_done
>


Re: in 2.19.2 t0061-run-command FAILs if . is in $PATH

2018-11-28 Thread Johannes Schindelin
Hi,

On Wed, 28 Nov 2018, H.Merijn Brand wrote:

> the test is explicitely checking that it should not find runnable
> scripts outside $PATH, *assuming* $PATH does not have . in it

Does this fix it for you?

-- snip --
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index f3f308920f04..4949fdfde88b 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -33,7 +33,14 @@ test_expect_success 'run_command can run a command' '
test_must_be_empty err
 '
 
-test_expect_success 'run_command is restricted to PATH' '
+test_lazy_prereq DOT_IN_PATH '
+   case ":$PATH:" in
+   *:.:*) true;;
+   *) false;;
+   esac
+'
+
+test_expect_success !DOT_IN_PATH 'run_command is restricted to PATH' '
write_script should-not-run <<-\EOF &&
echo yikes
EOF
-- snap --

If so, can you please provide a commit message for it (you can add my
Signed-off-by: line and your Tested-by: line).

Thanks,
Johannes


> 
> Having '.' in $PATH can be seen as a bad idea (and it most likely is),
> but the tests should either remove '.' from $PATH before testing or
> ignore that fail if $PATH does have '.', as it is not illegal
> 
> $ git-2.19.2/t 504 > prove -v t0061-run-command.sh
> t0061-run-command.sh ..
> ok 1 - start_command reports ENOENT (slash)
> ok 2 - start_command reports ENOENT (no slash)
> ok 3 - run_command can run a command
> ok 4 - run_command is restricted to PATH
> ok 5 - run_command can run a script without a #! line
> ok 6 - run_command does not try to execute a directory
> ok 7 - run_command passes over non-executable file
> ok 8 - run_command reports EACCES
> ok 9 - unreadable directory in PATH
> ok 10 - run_command runs in parallel with more jobs available than tasks
> ok 11 - run_command runs in parallel with as many jobs as tasks
> ok 12 - run_command runs in parallel with more tasks than jobs available
> ok 13 - run_command is asked to abort gracefully
> ok 14 - run_command outputs
> ok 15 - GIT_TRACE with environment variables
> # passed all 15 test(s)
> 1..15
> ok
> All tests successful.
> Files=1, Tests=15,  1 wallclock secs ( 0.04 usr  0.01 sys +  0.26 cusr  0.07 
> csys =  0.38 CPU)
> Result: PASS
> 
> $ env PATH="$PATH"":." prove -v t0061-run-command.sh
> t0061-run-command.sh ..
> ok 1 - start_command reports ENOENT (slash)
> ok 2 - start_command reports ENOENT (no slash)
> ok 3 - run_command can run a command
> not ok 4 - run_command is restricted to PATH
> #
> #   write_script should-not-run <<-\EOF &&
> #   echo yikes
> #   EOF
> #   test_must_fail test-tool run-command run-command 
> should-not-run
> #
> ok 5 - run_command can run a script without a #! line
> ok 6 - run_command does not try to execute a directory
> ok 7 - run_command passes over non-executable file
> ok 8 - run_command reports EACCES
> ok 9 - unreadable directory in PATH
> ok 10 - run_command runs in parallel with more jobs available than tasks
> ok 11 - run_command runs in parallel with as many jobs as tasks
> ok 12 - run_command runs in parallel with more tasks than jobs available
> ok 13 - run_command is asked to abort gracefully
> ok 14 - run_command outputs
> ok 15 - GIT_TRACE with environment variables
> # failed 1 among 15 test(s)
> 1..15
> Dubious, test returned 1 (wstat 256, 0x100)
> Failed 1/15 subtests
> 
> Test Summary Report
> ---
> t0061-run-command.sh (Wstat: 256 Tests: 15 Failed: 1)
>   Failed test:  4
>   Non-zero exit status: 1
> Files=1, Tests=15,  1 wallclock secs ( 0.03 usr  0.00 sys +  0.24 cusr  0.07 
> csys =  0.34 CPU)
> Result: FAIL
> 
> -- 
> H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
> using perl5.00307 .. 5.29   porting perl5 on HP-UX, AIX, and openSUSE
> http://mirrors.develooper.com/hpux/http://www.test-smoke.org/
> http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/
> 


Re: [PATCH v1] mem_pool: add GIT_TRACE_MEMPOOL support

2018-11-28 Thread Johannes Schindelin
Hi Ben,

On Tue, 27 Nov 2018, Ben Peart wrote:

> From: Ben Peart 
> 
> Add tracing around initializing and discarding mempools. In discard report
> on the amount of memory unused in the current block to help tune setting
> the initial_size.
> 
> Signed-off-by: Ben Peart 
> ---

Looks good.

My only question: should we also trace calls to _alloc(), _calloc() and
_combine()?

Ciao,
Johannes

> 
> Notes:
> Base Ref: * git-trace-mempool
> Web-Diff: https://github.com/benpeart/git/commit/9ac84bbca2
> Checkout: git fetch https://github.com/benpeart/git git-trace-mempool-v1 
> && git checkout 9ac84bbca2
> 
>  mem-pool.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mem-pool.c b/mem-pool.c
> index a2841a4a9a..065389aaec 100644
> --- a/mem-pool.c
> +++ b/mem-pool.c
> @@ -5,6 +5,7 @@
>  #include "cache.h"
>  #include "mem-pool.h"
>  
> +static struct trace_key trace_mem_pool = TRACE_KEY_INIT(MEMPOOL);
>  #define BLOCK_GROWTH_SIZE 1024*1024 - sizeof(struct mp_block);
>  
>  /*
> @@ -48,12 +49,16 @@ void mem_pool_init(struct mem_pool **mem_pool, size_t 
> initial_size)
>   mem_pool_alloc_block(pool, initial_size, NULL);
>  
>   *mem_pool = pool;
> + trace_printf_key(_mem_pool, "mem_pool (%p): init (%"PRIuMAX") 
> initial size\n",
> + pool, (uintmax_t)initial_size);
>  }
>  
>  void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
>  {
>   struct mp_block *block, *block_to_free;
>  
> + trace_printf_key(_mem_pool, "mem_pool (%p): discard (%"PRIuMAX") 
> unused\n",
> + mem_pool, (uintmax_t)(mem_pool->mp_block->end - 
> mem_pool->mp_block->next_free));
>   block = mem_pool->mp_block;
>   while (block)
>   {
> 
> base-commit: bb75be6cb916297f271c846f2f9caa3daaaec718
> -- 
> 2.18.0.windows.1
> 
> 


Re: [PATCH v1/RFC 1/1] 'git clone C:\cygwin\home\USER\repo' is working (again)

2018-11-28 Thread Johannes Schindelin
Hi J.H.

On Wed, 28 Nov 2018, Houder wrote:

> On 2018-11-28 09:46, Johannes Schindelin wrote:
> > 
> > On Wed, 28 Nov 2018, J.H. van de Water wrote:
> > 
> > > > > me@work /cygdrive
> > > > > $ ls
> > > > > c  d
> > > > >
> > > > > So `/cygdrive` *is* a valid directory in Cygwin.
> > > >
> > > > That supports the code that does not special case a path that begins
> > > > with /cygdrive/ and simply treats it as a full path and freely use
> > > > relative path, I guess.  Very good point.
> > > 
> > > Please read
> > > 
> > > https://cygwin.com/cygwin-ug-net/using.html#cygdrive
> > > ( The cygdrive path prefix )
> > > 
> > >  you can access arbitary drives on your system by using the cygdrive
> > > path
> > > prefix. The default value for this prefix is /cygdrive ...
> > > 
> > > 
> > > The cygdrive prefix is a >>> virtual directory <<< under which all drives
> > > on
> > > a system are subsumed ...
> > > 
> > > 
> > > The cygdrive prefix may be CHANGED in the fstab file as outlined above
> > > !
> > > 
> > > 
> > > To simplify scripting, Cygwin also provides a /proc/cygdrive symlink, ...
> > > 
> > > =
> > 
> > That's all very interesting, but I fail to see the relevance with regards
> > to the issue at hand, namely whether to special-case `/cygdrive` as a
> > special prefix that cannot be treated as directory in Git.
> > 
> > I still maintain that it should not be special-cased, no matter whether it
> > is a virtual directory or whether it can be renamed to `/jh-likes-cygwin`
> > or whatever.
> 
> Ok. Sorry about the noise.
> 
> From your post I got the impression that you believed that there will always
> be a directory called /cygdrive on Cygwin.

I know it can be different. In MSYS2 it is set to `/` via this line in
`/etc/fstab`:

none / cygdrive binary,posix=0,noacl,user 0 0

Which is just to say that I am fully aware of the option to rename it.

> My point: it can have a different name.

Indeed.

And whatever name you give it, Cygwin can handle it as if it were a
regular directory. So we *must not* special-case it in Git.

Ciao,
Johannes


Re: [ANNOUNCE] Git v2.20.0-rc1

2018-11-28 Thread Johannes Schindelin
Hi Junio,

On Wed, 28 Nov 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > ...
> > In short, even a thorough study of the code (keeping in mind the few
> > tidbits of information provided by you) leaves me really wondering which
> > code you run, because it sure does not look like current `master` to me.
> >
> > And if it is not `master`, then I have to ask why you keep suggesting to
> > turn off the built-in rebase by default in `master`.
> >
> > Ciao,
> > Johannes
> >
> > P.S.: Maybe you have a hook you forgot to mention?
> 
> Any response?  Or can I retract jc/postpone-rebase-in-c that was
> prepared as a reaction to this?

I worked with Ævar via IRC and we figured out the root cause and I
submitted https://public-inbox.org/git/pull.88.git.gitgitgad...@gmail.com/
to fix it.

Given that this is a really obscure edge case (`git rebase --stat -v -i`
onto an unrelated commit history, if you take away one of these, the bug
does not trigger), and that it was only discovered to be a bug *because*
of the built-in rebase (the scripted version had the same bug, but simply
forgot to do proper error checking), I would not think that the reported
bug is a strong argument in favor of turning off the built-in rebase by
defauly.

In other words, after understanding the bug I am even more confident than
before that the built-in rebase is actually in a pretty good shape.

I do not expect any major regressions, and if any happen: we do have that
escape hatch for corner cases while I fix those bugs.

Ciao,
Dscho

Re: [PATCH] rebase: mark the C reimplementation as an experimental opt-in feature (was Re: [ANNOUNCE] Git v2.20.0-rc1)

2018-11-28 Thread Johannes Schindelin
Hi Jonathan,

On Tue, 27 Nov 2018, Jonathan Nieder wrote:

> At https://bugs.debian.org/914695 is a report of a test regression in
> an outside project that is very likely to have been triggered by the
> new faster rebase code.

>From looking through that log.gz (without having a clue where the test
code lives, so I cannot say what it is supposed to do, and also: this is
the first time I hear about dgit...), it would appear that this must be a
regression in the reflog messages produced by `git rebase`.

> The issue has not been triaged, so I don't know yet whether it's a
> problem in rebase-in-c or a manifestation of a bug in the test.

It ends thusly:

-- snip --
[...]
+ git reflog
+ egrep 'debrebase new-upstream.*checkout'
+ test 1 = 0
+ t-report-failure
+ set +x
TEST FAILED
-- snap --

Which makes me think that the reflog we produce in *some* code path that
originally called `git checkout` differs from the scripted rebase's
generated reflog.

> That said, Google has been running with the new rebase since ~1 month
> ago when it became the default, with no issues reported by users.  As a
> result, I am confident that it can cope with what most users of "next"
> throw at it, which means that if we are to find more issues to polish it
> better, it will need all the exposure it can get.

Right. And there are a few weeks before the holidays, which should give me
time to fix whatever bugs are discovered (I only half mind being the only
one who fixes these bugs).

> In the Google deployment, we will keep using rebase-in-c even if it
> gets disabled by default, in order to help with that.
> 
> From the Debian point of view, it's only a matter of time before
> rebase-in-c becomes the default: even if it's not the default in 2.20,
> it would presumably be so in 2.21 or 2.22.  That means the community's
> attention when resolving security and reliability bugs would be on the
> rebase-in-c implementation.  As a result, the Debian package will most
> likely enable rebase-in-c by default even if upstream disables it, in
> order to increase the package's shelf life (i.e. to ease the
> maintenance burden of supporting whichever version of the package ends
> up in the next Debian stable).
> 
> So with either hat on, it doesn't matter whether you apply this patch
> upstream.
> 
> Having two pretty different deployments end up with the same
> conclusion leads me to suspect that it's best for upstream not to
> apply the revert patch, unless either
> 
>   (a) we have a concrete regression to address and then try again, or
>   (b) we have a test or other plan to follow before trying again.

In this instance, I am more a fan of the "let's move fast and break
things, then move even faster fixing them" approach.

Besides, the bug that Ævar discovered was a bug already in the scripted
rebase, but hidden by yet another bug (the missing error checking).

I get the pretty firm impression that the common code paths are now pretty
robust, and only lesser-exercised features may expose a bug (or
regression, as in the case of the reflogs, where one could argue that the
exact reflog message is not something we promise not to fiddle with).

Ciao,
Dscho

in 2.19.2 t0061-run-command FAILs if . is in $PATH

2018-11-28 Thread H.Merijn Brand
the test is explicitely checking that it should not find runnable
scripts outside $PATH, *assuming* $PATH does not have . in it

Having '.' in $PATH can be seen as a bad idea (and it most likely is),
but the tests should either remove '.' from $PATH before testing or
ignore that fail if $PATH does have '.', as it is not illegal

$ git-2.19.2/t 504 > prove -v t0061-run-command.sh
t0061-run-command.sh ..
ok 1 - start_command reports ENOENT (slash)
ok 2 - start_command reports ENOENT (no slash)
ok 3 - run_command can run a command
ok 4 - run_command is restricted to PATH
ok 5 - run_command can run a script without a #! line
ok 6 - run_command does not try to execute a directory
ok 7 - run_command passes over non-executable file
ok 8 - run_command reports EACCES
ok 9 - unreadable directory in PATH
ok 10 - run_command runs in parallel with more jobs available than tasks
ok 11 - run_command runs in parallel with as many jobs as tasks
ok 12 - run_command runs in parallel with more tasks than jobs available
ok 13 - run_command is asked to abort gracefully
ok 14 - run_command outputs
ok 15 - GIT_TRACE with environment variables
# passed all 15 test(s)
1..15
ok
All tests successful.
Files=1, Tests=15,  1 wallclock secs ( 0.04 usr  0.01 sys +  0.26 cusr  0.07 
csys =  0.38 CPU)
Result: PASS

$ env PATH="$PATH"":." prove -v t0061-run-command.sh
t0061-run-command.sh ..
ok 1 - start_command reports ENOENT (slash)
ok 2 - start_command reports ENOENT (no slash)
ok 3 - run_command can run a command
not ok 4 - run_command is restricted to PATH
#
#   write_script should-not-run <<-\EOF &&
#   echo yikes
#   EOF
#   test_must_fail test-tool run-command run-command should-not-run
#
ok 5 - run_command can run a script without a #! line
ok 6 - run_command does not try to execute a directory
ok 7 - run_command passes over non-executable file
ok 8 - run_command reports EACCES
ok 9 - unreadable directory in PATH
ok 10 - run_command runs in parallel with more jobs available than tasks
ok 11 - run_command runs in parallel with as many jobs as tasks
ok 12 - run_command runs in parallel with more tasks than jobs available
ok 13 - run_command is asked to abort gracefully
ok 14 - run_command outputs
ok 15 - GIT_TRACE with environment variables
# failed 1 among 15 test(s)
1..15
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/15 subtests

Test Summary Report
---
t0061-run-command.sh (Wstat: 256 Tests: 15 Failed: 1)
  Failed test:  4
  Non-zero exit status: 1
Files=1, Tests=15,  1 wallclock secs ( 0.03 usr  0.00 sys +  0.24 cusr  0.07 
csys =  0.34 CPU)
Result: FAIL

-- 
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.29   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/


pgpSlesgyXBek.pgp
Description: OpenPGP digital signature


Re: [PATCH] advice: don't pointlessly suggest --convert-graft-file

2018-11-28 Thread Johannes Schindelin
Hi Ævar,

On Tue, 27 Nov 2018, Ævar Arnfjörð Bjarmason wrote:

> The advice to run 'git replace --convert-graft-file' added in
> f9f99b3f7d ("Deprecate support for .git/info/grafts", 2018-04-29)
> didn't add an exception for the 'git replace --convert-graft-file'
> codepath itself.
> 
> As a result we'd suggest running --convert-graft-file while the user
> was running --convert-graft-file, which makes no sense. Before:
> 
> $ git replace --convert-graft-file
> hint: Support for /info/grafts is deprecated
> hint: and will be removed in a future Git version.
> hint:
> hint: Please use "git replace --convert-graft-file"
> hint: to convert the grafts into replace refs.
> hint:
> hint: Turn this message off by running
> hint: "git config advice.graftFileDeprecated false"
> 
> Add a check for that case and skip printing the advice while the user
> is busy following our advice.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  builtin/replace.c  | 1 +
>  t/t6050-replace.sh | 5 -
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/replace.c b/builtin/replace.c
> index a58b9c6d13..affcdfb416 100644
> --- a/builtin/replace.c
> +++ b/builtin/replace.c
> @@ -495,6 +495,7 @@ static int convert_graft_file(int force)
>   if (!fp)
>   return -1;
>  
> + advice_graft_file_deprecated = 0;
>   while (strbuf_getline(, fp) != EOF) {
>   if (*buf.buf == '#')
>   continue;

As long as we keep this code in the one-shot code path, it is probably
okay. Otherwise we'd have to save the original value and restoring it
before returning from this function.

But then, we might never move `convert_graft_file()` into `libgit.a`.

Thanks,
Dscho

> diff --git a/t/t6050-replace.sh b/t/t6050-replace.sh
> index 86374a9c52..5d6d3184ac 100755
> --- a/t/t6050-replace.sh
> +++ b/t/t6050-replace.sh
> @@ -461,7 +461,10 @@ test_expect_success '--convert-graft-file' '
>   printf "%s\n%s %s\n\n# comment\n%s\n" \
>   $(git rev-parse HEAD^^ HEAD^ HEAD^^ HEAD^2) \
>   >.git/info/grafts &&
> - git replace --convert-graft-file &&
> + git status 2>stderr &&
> + test_i18ngrep "hint:.*grafts is deprecated" stderr &&
> + git replace --convert-graft-file 2>stderr &&
> + test_i18ngrep ! "hint:.*grafts is deprecated" stderr &&
>   test_path_is_missing .git/info/grafts &&
>  
>   : verify that the history is now "grafted" &&
> -- 
> 2.20.0.rc1.379.g1dd7ef354c
> 
> 

Re: [PATCH v1/RFC 1/1] 'git clone C:\cygwin\home\USER\repo' is working (again)

2018-11-28 Thread Houder

On 2018-11-28 09:46, Johannes Schindelin wrote:

Hi J.H.,

On Wed, 28 Nov 2018, J.H. van de Water wrote:


> > me@work /cygdrive
> > $ ls
> > c  d
> >
> > So `/cygdrive` *is* a valid directory in Cygwin.
>
> That supports the code that does not special case a path that begins
> with /cygdrive/ and simply treats it as a full path and freely use
> relative path, I guess.  Very good point.

Please read

https://cygwin.com/cygwin-ug-net/using.html#cygdrive
( The cygdrive path prefix )

 you can access arbitary drives on your system by using the 
cygdrive path

prefix. The default value for this prefix is /cygdrive ...


The cygdrive prefix is a >>> virtual directory <<< under which all 
drives on

a system are subsumed ...


The cygdrive prefix may be CHANGED in the fstab file as outlined above 
!



To simplify scripting, Cygwin also provides a /proc/cygdrive symlink, 
...


=


That's all very interesting, but I fail to see the relevance with 
regards

to the issue at hand, namely whether to special-case `/cygdrive` as a
special prefix that cannot be treated as directory in Git.

I still maintain that it should not be special-cased, no matter whether 
it
is a virtual directory or whether it can be renamed to 
`/jh-likes-cygwin`

or whatever.


Ok. Sorry about the noise.

From your post I got the impression that you believed that there will 
always

be a directory called /cygdrive on Cygwin.

My point: it can have a different name.

Regards,

Henri


  1   2   >