Re: [PATCH v2 00/18] Add `branch-diff`, a `tbdiff` lookalike

2018-05-05 Thread Junio C Hamano
Johannes Schindelin  writes:

> Johannes Schindelin (17):
>   Add a function to solve least-cost assignment problems
>   Add a new builtin: branch-diff

Perhaps retitling these to

hungarian: a function to solve least-cost assignment problems
branch-diff: a new builtin to compare iterations of a topic

may serve as good precedents to changes other people may later make
to these files.  Especially the second one is already consistent
with the several changes that are listed below ;-)

>   branch-diff: first rudimentary implementation
>   branch-diff: improve the order of the shown commits
>   branch-diff: also show the diff between patches
>...


Re: cherry-pick --no-commit does not work well with --continue in case of conflicts

2018-05-05 Thread Ilya Kantor
Let's say master..feature has 2 commits: A and B.

Then `git cherry-pick -n master..feature` should pick-up A and then B
into the working directory and the index.

If applying A leads to a conflict, then it stops on A, like here:
>>> git cherry-pick -n master..feature
>> error: could not apply 2c11f12... Run work

Then I add the fix to the index, and should be able to continue on (B
is yet unpicked):

>>> git add .
>>> git cherry-pick --continue
>> error: your local changes would be overwritten by cherry-pick.
>> fatal: cherry-pick failed

Now it fails.
Expected behavior: should continue with picking B.

P.S. I assume, `cherry-pick -n ` is meant to merge given
commits' changes into the current working directory and the index,
without making new commits, for any given set of commits, hope that's right?

Then we should be able to --continue in case of a conflict without committing.


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-05 Thread Jacob Keller
On Sat, May 5, 2018 at 6:05 PM, Igor Djordjevic
 wrote:
> Hi Dscho,
>
> On 05/05/2018 23:57, Johannes Schindelin wrote:
>>
>> > > This builtin does not do a whole lot so far, apart from showing a
>> > > usage that is oddly similar to that of `git tbdiff`. And for a
>> > > good reason: the next commits will turn `branch-diff` into a
>> > > full-blown replacement for `tbdiff`.
>> >
>> > One minor point about the name: will it become annoying as a tab
>> > completion conflict with git-branch?
>>
>> I did mention this in the commit message of 18/18:
>>
>> Without this patch, we would only complete the `branch-diff` part but
>> not the options and other arguments.
>>
>> This of itself may already be slightly disruptive for well-trained
>> fingers that assume that `git braorimas` would expand to
>> `git branch origin/master`, as we now no longer automatically append a
>> space after completing `git branch`: this is now ambiguous.
>>
>> > It feels really petty complaining about the name, but I just want
>> > to raise the point, since it will never be easier to change than
>> > right now.
>>
>> I do hear you. Especially since I hate `git cherry` every single
>> time that I try to tab-complete `git cherry-pick`.
>>
>> > (And no, I don't really have another name in mind; I'm just
>> > wondering if "subset" names like this might be a mild annoyance in
>> > the long run).
>>
>> They totally are, and if you can come up with a better name, I am
>> really interested in changing it before this hits `next`, even.
>
> I gave this just a quick glance so might be I`m missing something
> obvious or otherwise well-known here, bur why not `diff-branch` instead?
>
> From user interface perspective, I would (personally) rather expect a
> command that does "diff of branches" to belong to "diff family" of
> commands (just operating on branches, instead of "branch" command
> knowing to "diff itself"), and I see we already have `diff-files`,
> `diff-index` and `diff-tree`, for what that`s worth.
>
> Heck, I might even expect something like `git diff --branch ...` to work,
> but I guess that is yet a different matter :)
>
> Thanks, Buga

I like diff-branch, though I suppose that also conflicts with diff too.

Thanks,
Jake


Re: [PATCHv2] git-send-email: allow re-editing of message

2018-05-05 Thread Junio C Hamano
Drew DeVault  writes:

> When shown the email summary, an opportunity is presented for the user
> to edit the email as if they had specified --annotate. This also permits
> them to edit it multiple times.
>
> Signed-off-by: Drew DeVault 
> Reviewed-by: Simon Ser 
>
> ---
> Thanks for the review Eric, updated to address your feedback.

Instead you could credit him with Helped-by:, perhaps in place of
Reviewed-by: somebody who does not have any commit in our history,
which does not help others decide how good this patch is because
they do not know how much trust they should place in the ability to
review of somebody they never have heard of---Simon may be a super
human programmer whose name alone should assure us that a patch
endorsed is good, but the thing is, that won't happen until we know
that.

The patch looks good.  Eric may say "This round looks good to me"
before I have a chance to queue it, in which case I'll add his
Reviewed-by: and queue.  In either case, unless there is something
else comes up, there is no need to resend this patch.

Thanks.

>  git-send-email.perl | 38 +++---
>  1 file changed, 31 insertions(+), 7 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index 2fa7818ca..b45953733 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -1330,9 +1330,14 @@ sub file_name_is_absolute {
>   return File::Spec::Functions::file_name_is_absolute($path);
>  }
>  
> -# Returns 1 if the message was sent, and 0 otherwise.
> -# In actuality, the whole program dies when there
> -# is an error sending a message.
> +# Prepares the email, then asks the user what to do.
> +#
> +# If the user chooses to send the email, it's sent and 1 is returned.
> +# If the user chooses not to send the email, 0 is returned.
> +# If the user decides they want to make further edits, -1 is returned and the
> +# caller is expected to call send_message again after the edits are 
> performed.
> +#
> +# If an error occurs sending the email, this just dies.
>  
>  sub send_message {
>   my @recipients = unique_email_list(@to);
> @@ -1404,15 +1409,17 @@ Message-Id: $message_id
>  
>  EOF
>   }
> - # TRANSLATORS: Make sure to include [y] [n] [q] [a] in your
> + # TRANSLATORS: Make sure to include [y] [n] [e] [q] [a] in your
>   # translation. The program will only accept English input
>   # at this point.
> - $_ = ask(__("Send this email? ([y]es|[n]o|[q]uit|[a]ll): "),
> -  valid_re => qr/^(?:yes|y|no|n|quit|q|all|a)/i,
> + $_ = ask(__("Send this email? ([y]es|[n]o|[e]dit|[q]uit|[a]ll): 
> "),
> +  valid_re => qr/^(?:yes|y|no|n|edit|e|quit|q|all|a)/i,
>default => $ask_default);
>   die __("Send this email reply required") unless defined $_;
>   if (/^n/i) {
>   return 0;
> + } elsif (/^e/i) {
> + return -1;
>   } elsif (/^q/i) {
>   cleanup_compose_files();
>   exit(0);
> @@ -1552,7 +1559,12 @@ $references = $initial_in_reply_to || '';
>  $subject = $initial_subject;
>  $message_num = 0;
>  
> -foreach my $t (@files) {
> +# Prepares the email, prompts the user, sends it out
> +# Returns 0 if an edit was done and the function should be called again, or 1
> +# otherwise.
> +sub process_file {
> + my ($t) = @_;
> +
>   open my $fh, "<", $t or die sprintf(__("can't open file %s"), $t);
>  
>   my $author = undef;
> @@ -1755,6 +1767,10 @@ foreach my $t (@files) {
>   }
>  
>   my $message_was_sent = send_message();
> + if ($message_was_sent == -1) {
> + do_edit($t);
> + return 0;
> + }
>  
>   # set up for the next message
>   if ($thread && $message_was_sent &&
> @@ -1776,6 +1792,14 @@ foreach my $t (@files) {
>   undef $auth;
>   sleep($relogin_delay) if defined $relogin_delay;
>   }
> +
> + return 1;
> +}
> +
> +foreach my $t (@files) {
> + while (!process_file($t)) {
> + # user edited the file
> + }
>  }
>  
>  # Execute a command (e.g. $to_cmd) to get a list of email addresses


Re: [RFC PATCH] checkout: Force matching mtime between files

2018-05-05 Thread Junio C Hamano
Jeff King  writes:

> The files in your checkout would all be consistent, but they might be
> inconsistent with other files _not_ created by Git (e.g., one might be
> saved in your editor). Now you may have introduced skew that cause
> "make" to do the wrong thing, because your source and target files are
> really operating from two different clocks.
>
> I really don't know how possible or common this is, but I feel like I've
> been warned about this distinction in the past. I wouldn't be surprised
> to find that it's an archaic thing found only on ancient versions of
> NFS, and oral tradition passed down the warnings. But I also would not
> be surprised if it's still possible and common.

It was pretty common back when I still was on NFS ;-)  I do not
think we care too deeply about a working tree that spans across
filesystem boundary, so one possible workaround is to read the fs
timestamp back out of the _first_ file we write in the process, and
then consistently use that time for the rest of the files that are
checked out by the same process with utimes().  

I personally still do not think it is worth the complication; these
projects are "holding it wrong" by placing build artifacts in SCM
(not in tarball) ;-).



Re: cherry-pick --no-commit does not work well with --continue in case of conflicts

2018-05-05 Thread Junio C Hamano
Ilya Kantor  writes:

> Somewhy cherry-pick --no-commit does not work well with --continue.
>
> Let's say I'm copying changes w/o committing and get a conflict:
>
>> git cherry-pick -n master..feature
> error: could not apply 2c11f12... Run work
>
> Then I fix the conflict, but cherry-pick refuses to go on:
>
>> git add .
>> git cherry-pick --continue
> error: your local changes would be overwritten by cherry-pick.
> fatal: cherry-pick failed
>
> It could continue *if* I committed, but I'm --no-commit for a reason,
> so I shouldn't have to commit to go on with cherry-pick.

Of course you shouldn't have to, and cherry-pick --continue
shouldn't commit either.

Once you resolve the conflicts, there is no more things to do for
cherry-pick command, so --continue does not make any sense, I would
think, when using --no-commit.

For that matter, "cherry-pick --no-commit A..B", unless you are
absolutely sure A..B consists of only one commit (in which case you
should just be saying "cherry-pick --no-commit B" instead), makes no
sense, either.

So perhaps these are what we should be fixing?  I.e. reject
range-pick when --no-commit is given, and reject --continue when
working in --no-commit mode.


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-05 Thread Junio C Hamano
Johannes Schindelin  writes:

> Hi Peff,
>
> On Sat, 5 May 2018, Jeff King wrote:
>
>> On Fri, May 04, 2018 at 05:34:32PM +0200, Johannes Schindelin wrote:
>> 
>> > This builtin does not do a whole lot so far, apart from showing a usage
>> > that is oddly similar to that of `git tbdiff`. And for a good reason:
>> > the next commits will turn `branch-diff` into a full-blown replacement
>> > for `tbdiff`.
>> 
>> One minor point about the name: will it become annoying as a tab
>> completion conflict with git-branch?

If tbdiff were "Thomas's branch diff", I would call this jbdiff ;-)
but I think the 't' in there stands for "topic", not "Thomas's".

How about "git topic-diff"?


Re: [PATCH v2 05/18] branch-diff: also show the diff between patches

2018-05-05 Thread Igor Djordjevic
Hi Johannes,

On 04/05/2018 17:34, Johannes Schindelin wrote:
> Just like tbdiff, we now show the diff between matching patches. This is
> a "diff of two diffs", so it can be a bit daunting to read for the
> beginner.
> 
> And just like tbdiff, we now also accept the `--no-patches` option
> (which is actually equivalent to the diff option `-s`).

A quick nit - would `--no-patch` (singular form) option name be more 
aligned with diff `-s` option it resembles?

Thanks, Buga


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-05 Thread Igor Djordjevic
Hi Dscho,

On 05/05/2018 23:57, Johannes Schindelin wrote:
> 
> > > This builtin does not do a whole lot so far, apart from showing a
> > > usage that is oddly similar to that of `git tbdiff`. And for a
> > > good reason: the next commits will turn `branch-diff` into a
> > > full-blown replacement for `tbdiff`.
> >
> > One minor point about the name: will it become annoying as a tab
> > completion conflict with git-branch?
> 
> I did mention this in the commit message of 18/18:
> 
> Without this patch, we would only complete the `branch-diff` part but
> not the options and other arguments.
> 
> This of itself may already be slightly disruptive for well-trained
> fingers that assume that `git braorimas` would expand to
> `git branch origin/master`, as we now no longer automatically append a
> space after completing `git branch`: this is now ambiguous.
> 
> > It feels really petty complaining about the name, but I just want
> > to raise the point, since it will never be easier to change than
> > right now.
> 
> I do hear you. Especially since I hate `git cherry` every single
> time that I try to tab-complete `git cherry-pick`.
> 
> > (And no, I don't really have another name in mind; I'm just
> > wondering if "subset" names like this might be a mild annoyance in
> > the long run).
> 
> They totally are, and if you can come up with a better name, I am
> really interested in changing it before this hits `next`, even.

I gave this just a quick glance so might be I`m missing something 
obvious or otherwise well-known here, bur why not `diff-branch` instead?

>From user interface perspective, I would (personally) rather expect a 
command that does "diff of branches" to belong to "diff family" of 
commands (just operating on branches, instead of "branch" command 
knowing to "diff itself"), and I see we already have `diff-files`, 
`diff-index` and `diff-tree`, for what that`s worth.

Heck, I might even expect something like `git diff --branch ...` to work, 
but I guess that is yet a different matter :)

Thanks, Buga


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-05 Thread Todd Zullinger
I wrote:
> Would it be possible and reasonable to teach 'git branch' to
> call this as a subcommand, i.e. as 'git branch diff'?  Then
> the completion wouldn't offer git branch-diff.

Of course right after I sent this, it occurred to me that
'git branch diff' would make mask the ability to create a
branch named diff.  Using 'git branch --diff ...' wouldn't
suffer that problem.

It does add a bit more overhead to the 'git branch' command,
in terms of documentation and usage.  I'm not sure it's too
much though.  The git-branch summary wouldn't change much:

-git-branch - List, create, or delete branches
+git-branch - List, create, delete, or diff branches

I hesitate to hit send again, in case I'm once again
overlooking a glaringly obvious problem with this idea. ;)

-- 
Todd
~~
Quick to judge, quick to anger, slow to understand.
Ignorance and prejudice and fear walk hand in hand.



Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-05 Thread Todd Zullinger
Hi Johannes,

Johannes Schindelin wrote:
> On Sat, 5 May 2018, Jeff King wrote:
>> One minor point about the name: will it become annoying as a tab
>> completion conflict with git-branch?
> 
> I did mention this in the commit message of 18/18:
> 
> Without this patch, we would only complete the `branch-diff` part but
> not the options and other arguments.
> 
> This of itself may already be slightly disruptive for well-trained
> fingers that assume that `git braorimas` would expand to
> `git branch origin/master`, as we now no longer automatically append a
> space after completing `git branch`: this is now ambiguous.
> 
>> It feels really petty complaining about the name, but I just want to
>> raise the point, since it will never be easier to change than right now.
> 
> I do hear you. Especially since I hate `git cherry` every single time that
> I try to tab-complete `git cherry-pick`.
> 
>> (And no, I don't really have another name in mind; I'm just wondering if
>> "subset" names like this might be a mild annoyance in the long run).
> 
> They totally are, and if you can come up with a better name, I am really
> interested in changing it before this hits `next`, even.

Would it be possible and reasonable to teach 'git branch' to
call this as a subcommand, i.e. as 'git branch diff'?  Then
the completion wouldn't offer git branch-diff.

Users could still call it directly if they wanted, though
I'd tend to think that should be discouraged and have it
treated as an implementation detail that it's a separate
binary.

We have a number of commands which take subcommands this way
(bundle, bisect, notes, submodule, and stash come to mind).
I don't know if any are used with and without a subcommand,
but it doesn't seem too strange from a UI point of view, to
me.

(I don't know if it's coincidental that of the existing
commands I noted above, 3 of the 5 are currently implemented
as shell scripts.  But they've all seen at least some work
toward converting them to C, I believe).

The idea might be gross and/or unreasonable from an
implementation or UI view.  I'm not sure, but I thought I
would toss the idea out.

This wouldn't work for git cherry{,-pick} where you wouldn't
consider 'git cherry pick' as related to 'git cherry'
though.

We also have this with git show{,-branch} and some others.
It's a mild annoyance, but muscle memory adapts eventually.

-- 
Todd
~~
A budget is just a method of worrying before you spend money, as well
as afterward.



Re: [PATCH v2 12/18] branch-diff: use color for the commit pairs

2018-05-05 Thread Todd Zullinger
Hi Johannes,

As many others have already said, thanks for this series!
I've used tbdiff a bit over the years, but having a builtin
will make it much more convenient (and the speed boost from
a C implementation will be a very nice bonus).

Johannes Schindelin wrote:
> @@ -430,6 +451,8 @@ int cmd_branch_diff(int argc, const char **argv, const 
> char *prefix)
>   struct string_list branch1 = STRING_LIST_INIT_DUP;
>   struct string_list branch2 = STRING_LIST_INIT_DUP;
>  
> + git_diff_basic_config("diff.color.frag", "magenta", NULL);
> +
>   diff_setup();
>   diffopt.output_format = DIFF_FORMAT_PATCH;
>   diffopt.flags.suppress_diff_headers = 1;

Should this also (or only) check color.diff.frag?  I thought
that color.diff.* was preferred over diff.color.*, though
that doesn't seem to be entirely true in all parts of the
current codebase.

In testing this series it seems that setting color.diff
options to change the various colors read earlier in this
patch via diff_get_color_opt, as well as the 'frag' slot,
are ignored.  Setting them via diff.color. does work.

The later patch adding a man page documents branch-diff as
using `diff.color.*` and points to git-config(1), but the
config docs only list color.diff.

Is this a bug in the diff_get_color{,_opt}() tooling?
It's certainly not anything you've introduced here, of
course.  I just noticed that some custom color.diff settings
I've used weren't picked up by branch-diff, despite your
clear intention to respect colors from the config.

-- 
Todd
~~
Abandon the search for Truth; settle for a good fantasy.



Re: [PATCH v2 13/18] color: provide inverted colors, too

2018-05-05 Thread Johannes Schindelin
Hi Peff,

On Sat, 5 May 2018, Jeff King wrote:

> On Fri, May 04, 2018 at 05:34:58PM +0200, Johannes Schindelin wrote:
> 
> > For every regular color, there exists the inverted equivalent where
> > background and foreground colors are exchanged.
> > 
> > We will use this in the next commit to allow inverting *just* the +/-
> > signs in a diff.
> 
> There's a "reverse" attribute (which we already parse and support) that
> can do this without having to repeat the colors. AFAIK it's well
> supported everywhere, but I could be wrong.

How would I use that here, though? I need to get the thing via
diff_get_color_opt() which takes a parameter of type `enum color_diff`.
There is no way I can specify `reverse` here, can I?

> I wonder if that would make configuring this slightly more pleasant,
> since it saves the user having to define "oldinv" whenever they change
> "old".

I am all for making the configuration more pleasant. So I hope I can make
use of the `reverse` thing here, without having to introduce a new enum
value.

Ciao,
Dscho


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-05 Thread Johannes Schindelin
Hi Peff,

On Sat, 5 May 2018, Jeff King wrote:

> On Fri, May 04, 2018 at 05:34:32PM +0200, Johannes Schindelin wrote:
> 
> > This builtin does not do a whole lot so far, apart from showing a usage
> > that is oddly similar to that of `git tbdiff`. And for a good reason:
> > the next commits will turn `branch-diff` into a full-blown replacement
> > for `tbdiff`.
> 
> One minor point about the name: will it become annoying as a tab
> completion conflict with git-branch?

I did mention this in the commit message of 18/18:

Without this patch, we would only complete the `branch-diff` part but
not the options and other arguments.

This of itself may already be slightly disruptive for well-trained
fingers that assume that `git braorimas` would expand to
`git branch origin/master`, as we now no longer automatically append a
space after completing `git branch`: this is now ambiguous.

> It feels really petty complaining about the name, but I just want to
> raise the point, since it will never be easier to change than right now.

I do hear you. Especially since I hate `git cherry` every single time that
I try to tab-complete `git cherry-pick`.

> (And no, I don't really have another name in mind; I'm just wondering if
> "subset" names like this might be a mild annoyance in the long run).

They totally are, and if you can come up with a better name, I am really
interested in changing it before this hits `next`, even.

Ciao,
Dscho


Re: [PATCH v2 01/18] Add a function to solve least-cost assignment problems

2018-05-05 Thread Johannes Schindelin
Hi Peff,

On Sat, 5 May 2018, Jeff King wrote:

> On Fri, May 04, 2018 at 05:34:29PM +0200, Johannes Schindelin wrote:
> 
> > The Jonker-Volgenant algorithm was implemented to answer questions such
> > as: given two different versions of a topic branch (or iterations of a
> > patch series), what is the best pairing of commits/patches between the
> > different versions?
> 
> I love git-tbdiff, so I'm excited to see it getting more exposure (and a
> speedup). Thanks for working on this!

:-)

> Two minor nits on this patch:
> 
> > +/*
> > + * The parameter `cost` is the cost matrix: the cost to assign column j to 
> > row
> > + * i is `cost[j + column_count * i].
> > + */
> > +int compute_assignment(int column_count, int row_count, double *cost,
> > +  int *column2row, int *row2column)
> > +{
> > +   double *v = xmalloc(sizeof(double) * column_count), *d;
> 
> Please use st_mult, xcalloc, or ALLOC_ARRAY here to avoid unchecked
> multiplication in an allocation. This is probably hard to exploit in
> practice (since you'd need sizeof(size_t)/8 columns, which presumably
> requires allocating some heavier-weight struct per item). But it makes
> auditing easier if we avoid the pattern altogether.

Sure. I did mean to return errors in those case, but I guess it is not
worth the trouble (what would we do in case of out-of-memory?).

> > +/*
> > + * Compute an assignment of columns -> rows (and vice versa) such that 
> > every
> > + * column is assigned to at most one row (and vice versa) minimizing the
> > + * overall cost.
> > + *
> > + * The parameter `cost` is the cost matrix: the cost to assign column j to 
> > row
> > + * i is `cost[j + column_count * i].
> > + *
> > + * The arrays column2row and row2column will be populated with the 
> > respective
> > + * assignments (-1 for unassigned, which can happen only if column_count !=
> > + * row_count).
> > + */
> > +int compute_assignment(int column_count, int row_count, double *cost,
> > +  int *column2row, int *row2column);
> 
> It looks like this always returns zero. Is there a ever a case where we
> would return an error if this? If not, should it just be void?

Fixed.

Ciao,
Dscho


cherry-pick --no-commit does not work well with --continue in case of conflicts

2018-05-05 Thread Ilya Kantor
Somewhy cherry-pick --no-commit does not work well with --continue.

Let's say I'm copying changes w/o committing and get a conflict:

> git cherry-pick -n master..feature
error: could not apply 2c11f12... Run work

Then I fix the conflict, but cherry-pick refuses to go on:

> git add .
> git cherry-pick --continue
error: your local changes would be overwritten by cherry-pick.
fatal: cherry-pick failed

It could continue *if* I committed, but I'm --no-commit for a reason,
so I shouldn't have to commit to go on with cherry-pick.

Maybe there's a preliminary check that prevents --continue and should be fixed?

---
Best Regards,
Ilya Kantor


Re: [PATCH 02/18] Add a new builtin: branch-diff

2018-05-05 Thread Johannes Schindelin
Hi Elijah,

On Fri, 4 May 2018, Elijah Newren wrote:

> On Thu, May 3, 2018 at 11:40 PM, Johannes Schindelin
>  wrote:
> > I actually have a hacky script to fixup commits in a patch series. It lets
> > me stage part of the current changes, then figures out which of the
> > commits' changes overlap with the staged changed. If there is only one
> > commit, it automatically commits with --fixup, otherwise it lets me choose
> > which one I want to fixup (giving me the list of candidates).
> 
> Ooh, interesting.  Are you willing to share said hacky script by chance?

It is part of a real huge hacky script of pretty much all things I add as
aliases, so I extracted the relevant part for you:

-- snip --
#!/bin/sh

fixup () { # [--upstream=] [--not=]
upstream=
not=
while case "$1" in
--upstream) shift; upstream="$1";;
--upstream=*) upstream="${1#*=}";;
--not) shift; not="$not $1";;
--not=*) not="$not ${1#*=}";;
-*) die "Unknown option: $1";;
*) break;;
esac; do shift; done

test $# -le 1 ||
die "Need 0 or 1 commit"

! git diff-index --cached --quiet --ignore-submodules HEAD -- || {
git update-index --ignore-submodules --refresh
! git diff-files --quiet --ignore-submodules -- ||
die "No changes"

git add -p ||
exit
}
! git diff-index --cached --quiet --ignore-submodules HEAD -- ||
die "No staged changes"

test $# = 1 || {
if test -z "$upstream"
then
upstream="$(git rev-parse --symbolic-full-name \
HEAD@{upstream} 2> /dev/null)" &&
test "$(git rev-parse HEAD)" != \
"$(git rev-parse $upstream)" ||
upstream=origin/master
fi

revs="$(git rev-list $upstream.. --not $not --)" ||
die "Could not get commits between $upstream and HEAD"

test -n "$revs" ||
die "No commits between $upstream and HEAD"

while count=$(test -z "$revs" && echo 0 || echo "$revs" | wc -l 
| tr -dc 0-9) &&
test $count -gt 1
do
printf '\nMultiple candidates:\n'
echo $revs | xargs git log --no-walk --oneline | cat -n

read input
case "$input" in
[0-9]|[0-9][0-9]|[0-9][0-9][0-9])
revs="$(echo "$revs" | sed -n "${input}p")"
count=1
break
;;
h|hh)
revs=$(history_of_staged_changes $upstream..)
continue
;;
)
history_of_staged_changes -p $upstream..
continue
;;
p)
git log -p --no-walk $revs
continue
;;

[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f])
revs=$input
continue
esac
revs="$(git rev-list --no-walk --grep="$input" $revs)"
done

test $count = 1 ||
die "No commit given"

set $revs
}

git commit --fixup "$1"
message="$(git show -s --format=%s%n%n%b)"
case "$message" in
'fixup! fixup! '*)
message="${message#fixup! }"
message="${message#fixup! }"
message="${message#fixup! }"
git commit --amend -m "fixup! $message"
;;
esac
}

history_of_staged_changes () { # 
pretty=
if test "a-p" = "a$1"
then
pretty=-p
shift
fi

test $# -le 1 ||
die "Usage: $0 "

test $# = 1 || {
upstream=$(git rev-parse --verify HEAD@{u} 2>/dev/null) ||
upstream=origin/master
set $upstream..
}

args=$(for file in $(git diff --no-color --cached --name-only)
do
for hunk in $(get_hunks --cached -- "$file")
do

Re: [PATCH v2 00/18] Add `branch-diff`, a `tbdiff` lookalike

2018-05-05 Thread Johannes Schindelin
Hi Elijah,

On Fri, 4 May 2018, Elijah Newren wrote:

> On Fri, May 4, 2018 at 8:34 AM, Johannes Schindelin
>  wrote:
> > The incredibly useful `git-tbdiff` tool to compare patch series (say, to see
> > what changed between two iterations sent to the Git mailing list) is 
> > slightly
> > less useful for this developer due to the fact that it requires the 
> > `hungarian`
> > and `numpy` Python packages which are for some reason really hard to build 
> > in
> > MSYS2. So hard that I even had to give up, because it was simply easier to
> > reimplement the whole shebang as a builtin command.
> 
> tbdiff is awesome; thanks for bringing it in as a builtin to git.

You're welcome.

> I've run through a few cases, comparing output of tbdiff and
> branch-diff.  So far, what I've noted is that they produce largely the
> same output except that:
> 
> - tbdiff seems to shorten shas to 7 characters, branch-diff is using
> 10, in git.git at least.  (Probably a good change)

Yes, it is a good change ;-)

> - tbdiff aligned output columns better when there were more than 9
> patches (I'll comment more on patch 09/18)

I added a new patch to align the patch numbers specifically. I considered
squashing it into 9/18, but decided against it: it will make it easier to
read through the rationale when calling `git annotate` on those lines.

> - As noted elsewhere in the review of round 1, tbdiff uses difflib
> while branch-diff uses xdiff.  I found some cases where that mattered,
> and in all of them, I either felt like the difference was irrelevant
> or that difflib was suboptimal, so this is definitely an improvement
> for me.

Indeed. It is more or less ambiguities that get resolved differently.

> - branch-diff produces it's output faster, and it is automatically
> paged.  This is really cool.

:-)

It was actually the paging that made the most difference for me. The `git
tbdiff` command broke for me as soon as I switched on the pager, as tbdiff
got confused by the decoration (AEvar had put up a PR to fix that, but
that PR has not even so much as been answered in the meantime, so I
thought it'd be a good time to rewrite the entire shebang in C, also
because I could not use tbdiff *at all* on Windows due to its hefty
dependencies).

> Also, I don't have bash-completion for either tbdiff or branch-diff.
> :-(  But I saw some discussion on the v1 patches about how this gets
> handled...  :-)

Oh? Does 18/18 not work for you?
https://public-inbox.org/git/71698f11835311c103aae565a2a761d10f4676b9.1525448066.git.johannes.schinde...@gmx.de/

Ciao,
Dscho


Re: [PATCH 02/18] Add a new builtin: branch-diff

2018-05-05 Thread Johannes Schindelin
Hi Ramsay,

On Fri, 4 May 2018, Ramsay Jones wrote:

> On 04/05/18 07:40, Johannes Schindelin wrote:
> [snip] 
> > BTW I ran `make sparse` for the first time, and it spits out tons of
> > stuff. And I notice that they are all non-fatal warnings, but so were the
> > ones you pointed out above. This is a bit sad, as I would *love* to
> > install a VSTS build job to run `make sparse` automatically. Examples of
> > warnings *after* applying your patch:
> > 
> > connect.c:481:40: warning: incorrect type in argument 2 (invalid types)
> > connect.c:481:40:expected union __CONST_SOCKADDR_ARG [usertype] __addr
> > connect.c:481:40:got struct sockaddr *ai_addr
> > 
> > or
> > 
> > pack-revindex.c:65:23: warning: memset with byte count of 262144
> > 
> > What gives?
> 
> My stock answer, until recently, was that you are using a very
> old version of sparse.

Sure. I did this in an Ubuntu 16.04 LTS VM, via `sudo apt-get install
sparse`.

> Which is probably still true here - but I recently noticed that more
> up-to-date platforms/gcc versions also have many problems. (The main
> sparse contributors tend to stick with conservative distros and/or don't
> use sparse on any software that uses system headers - thus they tend not
> to notice the problems caused by new gcc/glibc versions! ;-) )
> 
> Since I am on Linux Mint 18.3 (based on the last Ubuntu LTS) and build
> sparse from source, the current 'master', 'next' and 'pu' branches are
> all 'sparse-clean' for me. (On cygwin, which is built with NO_REGEX, I
> have a single sparse warning).
> 
> I was just about to say that, unusually for me, I was using a sparse
> built from a release tag, but then remembered that I have some
> additional patches which fixes a problem on fedora 27!  Using sparse on
> fedora 27 is otherwise useless. (There are still many warnings spewed on
> f27 - but they are caused by incorrect system headers :( ).
> 
> The current release of sparse is v0.5.2, which probably hasn't been
> included in any distro yet (I think the previous release v0.5.1, which
> should also work for you, is in Debian unstable).  If you wanted to try
> building a more up-to-date sparse, the repo is at:
> git://git.kernel.org/pub/scm/devel/sparse/sparse.git.

Well, what I would want to do is let the cloud work for me. By adding an
automated build to my Visual Studio Team Services (VSTS) account, of
course, as I have "cloud privilege" (i.e. I work in the organization
providing the service, so I get to play with all of it for free).

So I really don't want to build sparse every time a new revision needs to
be tested (whether that be from one of my branches, an internal PR for
pre-review of patches to be sent to the mailing list, or maybe even `pu`
or the personalized branches on https://github.com/gitster/git).

I really would need a ready-to-install sparse, preferably as light-weight
as possible (by not requiring any dependencies outside what is available
in VSTS' hosted Linux build agents.

Maybe there is a specific apt source for sparse?

> Linux Mint 19, which will be released in about a month, will be
> using the Ubuntu 18.04 LTS as a base, so I guess it is possible
> that I will need to debug sparse again ...

:-)

> BTW, I spent some time last night playing with 'git-branch-diff'.

Great!

> First of all - Good Job! This tool will be very useful (thanks
> also go to Thomas, of course).

Both Thomases. Thomas Rast and Thomas Gummerer.

> I noticed that there seemed to be an occasional 'whitespace error'
> indicator (red background) directly after an +/- change character
> which I suspect is an error (I haven't actually checked). However,
> this indicator disappears if you add the --dual-color option.

Indeed. This is a quirk of the whitespace error paired with diffing diffs:
the whitespace error correctly treats the leading space as marker for a
context line, but if you diff diffs, the next character may still be a
marker for a context line (but this time the "inner" diff). And our
whitespace error detection mechanism cannot guess that it looks at a diff
of diffs.

However, in dual-color mode, we know that we will almost certainly look at
diffs of diffs (except if the change is in the commit message, in which
case I don't care about whitespace errors at all, anyway).

So I have this hack in 16/18:
https://public-inbox.org/git/b99ab186c4f11239a10950b9902d9c87d0e60513.1525448066.git.johannes.schinde...@gmx.de/T/#u

Essentially, I extend the dual-color mode to detect where such a bogus
whitespace error would be detected, and simply *skip the space*! I can get
away with that because dual-color is meant for human consumption, and if a
horizontal tab follows, it would not matter whether there was a space: it
would find the same tab stop. Likewise, if the space comes before a CR or
LF, or even just before the final NUL, the space can be safely omitted
from the output, too.

Ciao,
Dscho


Re: [PATCH v2 1/4] test-tool: help verifying BUG() code paths

2018-05-05 Thread Johannes Schindelin
Hi Duy,

On Wed, 2 May 2018, Duy Nguyen wrote:

> On Wed, May 2, 2018 at 11:38 AM, Johannes Schindelin
>  wrote:
> > When we call BUG(), we signal via SIGABRT that something bad happened,
> > dumping cores if so configured. In some setups these coredumps are
> > redirected to some central place such as /proc/sys/kernel/core_pattern,
> > which is a good thing.
> >
> > However, when we try to verify in our test suite that bugs are caught in
> > certain code paths, we do *not* want to clutter such a central place
> > with unnecessary coredumps.
> >
> > So let's special-case the test helpers (which we use to verify such code
> > paths) so that the BUG() calls will *not* call abort() but exit with a
> > special-purpose exit code instead.
> >
> > Helped-by: Nguyễn Thái Ngọc Duy 
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  t/helper/test-tool.c | 2 ++
> >  usage.c  | 5 +
> >  2 files changed, 7 insertions(+)
> >
> > diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> > index 87066ced62a..5176f9f20ae 100644
> > --- a/t/helper/test-tool.c
> > +++ b/t/helper/test-tool.c
> > @@ -47,7 +47,9 @@ static struct test_cmd cmds[] = {
> >  int cmd_main(int argc, const char **argv)
> >  {
> > int i;
> > +   extern int BUG_exit_code;
> >
> > +   BUG_exit_code = 99;
> 
> It may be even better to let individual tests in t1406 control this,
> pretty much like your original patch, except that we tell usage.c to
> not send SIGABRT and just return a special fault code. That way we
> don't accidentally suppress BUG()'s sigabrt behavior  in tests that do
> not anticipate it (even in t1406).

I thought long and hard about this, even slept over it. And I came to the
conclusion that I do not really know whether we want such a special
treatment (you may even want to go crazier and limit *which* BUG() call
you intend to catch, so that others are still reported).

And I came to an important realization: whether or not to handle the bugs
in the way you described is actually *outside* the purpose of this patch
series. This patch series is really only about converting die(BUG:...)
calls to BUG() calls. And it simply leaves the concern you raised for
another patch series.

> But this patch is ok for me too if you don't want another reroll.

I don't ;-) (for the reasons mentioned above: I don't disagree with you, I
just think it should be addressed in another patch series than this here
patch series).

Ciao,
Dscho

Re: [PATCH v2 0/6] Let the sequencer handle `git rebase -i --root`

2018-05-05 Thread Johannes Schindelin
Hi Stefan,

On Fri, 4 May 2018, Stefan Beller wrote:

> > Branch-diff vs v1:
> >  1: 42db734a980 ! 1: 73398da7119 sequencer: learn about the special "fake 
> > root commit" handling
> >  @@ -54,40 +54,50 @@
> > return NULL;
> >}
> >
> >  ++/* Read author-script and return an ident line (author  
> > timestamp) */
> >   +static const char *read_author_ident(struct strbuf *buf)
> 
> 
> I like the new way of read_author_ident. Thanks for writing it!

You're welcome. After sleeping a night over it, I think this function
might also benefit from a new name: extract_ident_from_author_script().
What do you think?

> >  @@ -159,7 +169,17 @@
> >   +/* Does this command create a (non-merge) commit? */
> >   +static int is_pick_or_similar(enum todo_command command)
> >   +{
> >  -+ return command <= TODO_SQUASH;
> >  ++ switch (command) {
> >  ++ case TODO_PICK:
> >  ++ case TODO_REVERT:
> >  ++ case TODO_EDIT:
> >  ++ case TODO_REWORD:
> >  ++ case TODO_FIXUP:
> >  ++ case TODO_SQUASH:
> >  ++ return 1;
> >  ++ default:
> >  ++ return 0;
> >  ++ }
> 
> The switch case is not as bad as I thought following the discussion on of v1.

Yes, it makes things explicit, and it is not too long a case-chain.

> This series is
> Reviewed-by: Stefan Beller 

Thanks!

> 
> During a lunch discussion I wondered if the branch diff format could lead to
> another form of machine readable communication, i.e. if we want to add the
> ability to read the branch diff format and apply the changes. Note how 
> applying
> this diff-diff would not create new commits, but rather amend existing 
> commits.

 (which BTW is not valid XML)

I do not think that it would be a wise idea to detour even further from
Git when exchanging patch series iterations. We have a lovely way to
exchange commits, after all: `git fetch` and `git push`, and for times you
cannot agree on a central server, `git bundle`.

Ciao,
Dscho


Re: git update-ref fails to create reference. (bug)

2018-05-05 Thread Rafael Ascensão
Thanks Martin for the quick fix.

On Fri, May 04, 2018 at 08:26:46PM +0200, Martin �gren wrote:
> Anyway, that's not where I'm stuck... Regardless of how I try to write
> tests (in t1400), they just pass beautifully even before this patch. I
> might be able to look into that more on the weekend. If anyone has
> ideas, I am all ears. Or if someone feels like picking this up and
> running with it, feel free.

In t1400 `m=refs/heads/master` is used in the majority of tests. And
this issue doesn't manifest itself if refs are being written under refs/

It also seems particular about having the "old sha" set to 40 zeros or
the empty string.

So I guess we should add some extra tests to cover the variations of
these two cases.

e.g.
 test_expect_success "create PSEUDOREF" '
 git update-ref PSEUDOREF $A
 &&
 test $A = $(cat .git/PSEUDOREF)
 '

fails/succeeds appropriately in my limited testing.

I am busy this weekend, but can try to write some if no one writes it
until after the weekend.

Cumprimentos,
Rafael Ascensão


Re: [RFC PATCH] checkout: Force matching mtime between files

2018-05-05 Thread Jeff King
On Fri, Apr 13, 2018 at 07:01:29PM +0200, Michał Górny wrote:

> In order to avoid unnecessary cache mismatches, force a matching mtime
> between all files created by a single checkout action.  This seems to be
> the best course of action.  Matching mtimes do not trigger cache
> updates.  They also match the concept of 'checkout' being an atomic
> action.  Finally, this change does not break backwards compatibility
> as the new result is a subset of the possible previous results.

There's one case that might be regressed. As long as we assume time
always moves forward, I think you're right, but...

> diff --git a/unpack-trees.c b/unpack-trees.c
> index e73745051..e1efefb68 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -346,6 +346,7 @@ static int check_updates(struct unpack_trees_options *o)
>   state.quiet = 1;
>   state.refresh_cache = 1;
>   state.istate = index;
> + state.checkout_mtime = time(NULL);

ISTR that it's possible for "system time" to be different from
"filesystem time". Is there any case where the time we get from
time(NULL) might vary wildly from what would be written by the
filesystem if we were to simply write the file? E.g., perhaps on some
types of network-mounted filesystems.

The files in your checkout would all be consistent, but they might be
inconsistent with other files _not_ created by Git (e.g., one might be
saved in your editor). Now you may have introduced skew that cause
"make" to do the wrong thing, because your source and target files are
really operating from two different clocks.

I really don't know how possible or common this is, but I feel like I've
been warned about this distinction in the past. I wouldn't be surprised
to find that it's an archaic thing found only on ancient versions of
NFS, and oral tradition passed down the warnings. But I also would not
be surprised if it's still possible and common.

-Peff


Re: [PATCH v2 13/18] color: provide inverted colors, too

2018-05-05 Thread Jeff King
On Fri, May 04, 2018 at 05:34:58PM +0200, Johannes Schindelin wrote:

> For every regular color, there exists the inverted equivalent where
> background and foreground colors are exchanged.
> 
> We will use this in the next commit to allow inverting *just* the +/-
> signs in a diff.

There's a "reverse" attribute (which we already parse and support) that
can do this without having to repeat the colors. AFAIK it's well
supported everywhere, but I could be wrong.

I wonder if that would make configuring this slightly more pleasant,
since it saves the user having to define "oldinv" whenever they change
"old".

-Peff


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-05 Thread Jeff King
On Fri, May 04, 2018 at 05:34:32PM +0200, Johannes Schindelin wrote:

> This builtin does not do a whole lot so far, apart from showing a usage
> that is oddly similar to that of `git tbdiff`. And for a good reason:
> the next commits will turn `branch-diff` into a full-blown replacement
> for `tbdiff`.

One minor point about the name: will it become annoying as a tab
completion conflict with git-branch?

It feels really petty complaining about the name, but I just want to
raise the point, since it will never be easier to change than right now.

(And no, I don't really have another name in mind; I'm just wondering if
"subset" names like this might be a mild annoyance in the long run).

-Peff


Re: [PATCH v2 01/18] Add a function to solve least-cost assignment problems

2018-05-05 Thread Jeff King
On Fri, May 04, 2018 at 05:34:29PM +0200, Johannes Schindelin wrote:

> The Jonker-Volgenant algorithm was implemented to answer questions such
> as: given two different versions of a topic branch (or iterations of a
> patch series), what is the best pairing of commits/patches between the
> different versions?

I love git-tbdiff, so I'm excited to see it getting more exposure (and a
speedup). Thanks for working on this!

Two minor nits on this patch:

> +/*
> + * The parameter `cost` is the cost matrix: the cost to assign column j to 
> row
> + * i is `cost[j + column_count * i].
> + */
> +int compute_assignment(int column_count, int row_count, double *cost,
> +int *column2row, int *row2column)
> +{
> + double *v = xmalloc(sizeof(double) * column_count), *d;

Please use st_mult, xcalloc, or ALLOC_ARRAY here to avoid unchecked
multiplication in an allocation. This is probably hard to exploit in
practice (since you'd need sizeof(size_t)/8 columns, which presumably
requires allocating some heavier-weight struct per item). But it makes
auditing easier if we avoid the pattern altogether.

> +/*
> + * Compute an assignment of columns -> rows (and vice versa) such that every
> + * column is assigned to at most one row (and vice versa) minimizing the
> + * overall cost.
> + *
> + * The parameter `cost` is the cost matrix: the cost to assign column j to 
> row
> + * i is `cost[j + column_count * i].
> + *
> + * The arrays column2row and row2column will be populated with the respective
> + * assignments (-1 for unassigned, which can happen only if column_count !=
> + * row_count).
> + */
> +int compute_assignment(int column_count, int row_count, double *cost,
> +int *column2row, int *row2column);

It looks like this always returns zero. Is there a ever a case where we
would return an error if this? If not, should it just be void?

-Peff


Re: [PATCH] pack-objects: validation and documentation about unreachable options

2018-05-05 Thread Jeff King
On Sat, May 05, 2018 at 10:47:16AM +0200, Nguyễn Thái Ngọc Duy wrote:

> These options are added in [1] [2] [3]. All these depend on running
> rev-list internally which is normally true since they are always used
> with "--all --objects" which implies --revs. But let's keep this
> dependency explicit.
> 
> While at there, add documentation for them. These are mostly used
> internally by git-repack. But it's still good to not chase down the
> right commit message to know how they work.

Yeah, this all make sense to me. Thanks for documenting this.

-Peff


Re: [PATCH 1/2] Fix support for merge options.

2018-05-05 Thread Christian Couder
On Sat, May 5, 2018 at 5:41 PM,   wrote:
> Christian wrote:
>>
>> It looks like git-reintegrate is not managed in Git but in this
>> repository :
>>
>> https://github.com/felipec/git-reintegrate
>>
>> So could be a bit confusing to send those patches to the Git mailing
>> list without telling that your patches are not to be integrated into
>> Git itself.
>
> Right, I stand corrected, thanks :)
>
>> Nice to see you again on the list anyway!

Also congratulations for Shadow's success! (https://shadow.tech/usen/)


Re: [PATCH 1/2] Fix support for merge options.

2018-05-05 Thread ydirson
Christian wrote:
> Hi Yann,
> 
> On Sat, May 5, 2018 at 3:24 PM, Yann Dirson  wrote:
> > ---
> >  git-reintegrate | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/git-reintegrate b/git-reintegrate
> 
> It looks like git-reintegrate is not managed in Git but in this
> repository :
> 
> https://github.com/felipec/git-reintegrate
> 
> So could be a bit confusing to send those patches to the Git mailing
> list without telling that your patches are not to be integrated into
> Git itself.

Right, I stand corrected, thanks :)

> Nice to see you again on the list anyway!
> 
> Best,
> Christian.
> 


Re: [GSoC] Yet another blog series about the GSoC

2018-05-05 Thread Christian Couder
Hi Pratik and Alban,

On Sat, May 5, 2018 at 2:24 PM, Pratik Karki  wrote:
> On Sat, May 5, 2018 at 5:11 PM, Alban Gruin  wrote:
>>
>> as my fellow students, I started a blog series about my GSoC project[1].
>> First, I wanted to post them directly on the mailing list, but a blog
>> allows me to edit the content easily if needed.
>>
>> Any feedback is welcome!
>>
>> [1] https://blog.pa1ch.fr/posts/2018/05/05/en/gsoc2018-week-1.html
>
> Nice post. Great job, Alban.
> Just a little typo I found out there: hazardous -> hasardous.

Thanks both for setting up nice blogs for your GSoC!

Best,
Christian.


Re: [PATCH 1/2] Fix support for merge options.

2018-05-05 Thread Christian Couder
Hi Yann,

On Sat, May 5, 2018 at 3:24 PM, Yann Dirson  wrote:
> ---
>  git-reintegrate | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/git-reintegrate b/git-reintegrate

It looks like git-reintegrate is not managed in Git but in this repository :

https://github.com/felipec/git-reintegrate

So could be a bit confusing to send those patches to the Git mailing
list without telling that your patches are not to be integrated into
Git itself.

Nice to see you again on the list anyway!

Best,
Christian.


Re: [RFC] Other chunks for commit-graph, part 1 - Bloom filters, topo order, etc.

2018-05-05 Thread Jakub Narebski
Ævar Arnfjörð Bjarmason  writes:
> On Fri, May 04 2018, Jakub Narebski wrote:
>
> (Just off-the cuff here and I'm surely about to be corrected by
> Derrick...)
>
>> * What to do about merge commits, and octopus merges in particular?
>>   Should Bloom filter be stored for each of the parents?  How to ensure
>>   fast access then (fixed-width records) - use large edge list?
>
> You could still store it fixed with, you'd just say that if you
> encounter a merge with N parents the filter wouldn't store files changed
> in that commit, but rather whether any of the N (including the merge)
> had changes to files as of the their common merge-base.

Well, that is one solution: to store union of changes (sum of changes)
from all parents of a merge commit in a Bloom filter for a merge.

But this wouldn't help us to skip merge entirely, because which parent
we would walk then?  But perhaps I am mistaken, and that does not
matter.

> Then if they did you'd need to walk all sides of the merge where each
> commit would also have the filter to figure out where the change(s)
> was/were, but if they didn't you could skip straight to the merge base
> and keep walking.

Another solution that I thought of is to use the same mechanism that
commit-graph file uses for storing merges: store Bloom filters for first
two parents, and if there are more parents (octopus merge), store Bloom
filters for the remaining parents in large edge extension for Bloom
filters.

This meant accepting some padding waste for CDAT chunk, to have faster
access.  We could do the same for Bloom filters, but it may mean quite a
bit of waste, depending on how many bits Bloom filter would use... but
there is another solution: for merge commits store Bloom filters for
first two parents that are half the size - this means of course more
false positives, but it may be acceptable solution.

> Which, on the topic of what else a commit graph could store: A mapping
> from merge commits of N parents to the merge-base of those commits.

The problem is that those N parents may have more than one merge-base,
and if so then those merge-bases may have also multiple merge-bases,
recursively (what 'recursive' merge strategy handles).  Though this
could be solved with 'large merge-base list' extension, just like
existing EDGE chunk - I think we can assume that most merge parents have
only single merge-base (but I have not checked this).

> You could also store nothing for merges (or only files the merge itself
> changed v.s. its parents). Derrick talked about how the bloom filter
> implementation has a value that's "Didn't compute (for whatever reason),
> look at it manually".

Right, another solution could be to store nothing for merges, or store
Bloom filter for changes against only first parent.  The goal of Bloom
filter is to avoid calculating diff if we don't need to.

Derrick, could you tell us what solution VSTS uses for Bloom filters on
merge commits?  Thanks in advance.

>> * Then there is problem of rename and copying detection - I think we can
>>   simply ignore it: unless someone has an idea about how to handle it?
>>
>>   Though this means that "git log --follow " wouldn't get any
>>   speedup, and neither the half of "git gui blame" that runs "git blame
>>   --incremental -C -C -w" -- the one that allows code copying and
>>   movement detection.
>
> Couldn't the bloom filter also speed up --follow if you did two passes
> through the history? The first to figure out all files that ever changed
> names, and then say you did `--follow sha1-name.c` on git.git. The
> filter would have had all the bits for both sha1_name.c and sha1-name.c
> set on all commits that touched either for all of the history.
>
> Of course this would only work with a given default value of -M, but
> on the assumption that most users left it at the default, and
> furthermore that renames weren't so common as to make the filter useless
> with too many false-positives as a result, it might be worth it. If you

I think it would be much simpler to just ensure that we store in Bloom
filter as changed files also pure renames, and leave doing rename
detection to the walk.  This way we do not fix old rename detecion
algorithm in stone.

The walk would simply change the name of file it would ask Bloom filters
about.

Thank you for your comments,
-- 
Jakub Narębski


[PATCH 1/2] Fix support for merge options.

2018-05-05 Thread Yann Dirson
---
 git-reintegrate | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-reintegrate b/git-reintegrate
index a1c17d2..da45239 100755
--- a/git-reintegrate
+++ b/git-reintegrate
@@ -338,7 +338,7 @@ class Integration
   def finalize_command(cmd, args, message)
 begin
   fun = @@map[cmd] || "cmd_#{cmd}".to_sym
-  send(fun, message, *args)
+  send(fun, message, *args.split(' '))
 rescue NoMethodError
   raise Integration::Stop, "Unknown command: #{cmd}"
 end
@@ -559,7 +559,7 @@ def do_apply
   inst = inst.lines.reject do |line|
 next true if line =~ /^base /
 if line =~ /^merge (.*)$/
-  system(*%W[git merge-base --is-ancestor #{$1} HEAD])
+  system(*%W[git merge-base --is-ancestor #{$1.split(' ')[0]} HEAD])
   next true if $?.success?
 end
 false
@@ -621,7 +621,7 @@ def do_status
 when 'base'
   $status_base = args
 when 'merge'
-  status_merge(*args)
+  status_merge(*args.split(' ')[0])
 when '.'
   status_dot(*args)
 else
-- 
2.11.0



[PATCH 2/2] Fix wrong merge-base invocation preventing detection of up-to-date branches.

2018-05-05 Thread Yann Dirson
---
 git-reintegrate | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-reintegrate b/git-reintegrate
index da45239..a6e3cff 100755
--- a/git-reintegrate
+++ b/git-reintegrate
@@ -581,7 +581,7 @@ def status_merge(branch_to_merge = nil)
   elsif system(*%w[git merge-base --is-ancestor], branch_to_merge, 
$status_base)
 state = "+"
 verbose_state = "merged to #{$status_base}"
-  elsif system(*%w[git-merge-base --is-ancestor], branch_to_merge, 
$branch.name)
+  elsif system(*%w[git merge-base --is-ancestor], branch_to_merge, 
$branch.name)
 state = "*"
 verbose_state = "up-to-date"
   else
-- 
2.11.0



Re: [GSoC] Yet another blog series about the GSoC

2018-05-05 Thread Pratik Karki
On Sat, May 5, 2018 at 5:11 PM, Alban Gruin  wrote:
> Hi everybody,
>
> as my fellow students, I started a blog series about my GSoC project[1].
> First, I wanted to post them directly on the mailing list, but a blog
> allows me to edit the content easily if needed.
>
> Any feedback is welcome!
>
> [1] https://blog.pa1ch.fr/posts/2018/05/05/en/gsoc2018-week-1.html
>
> Cheers,
> Alban Gruin

Nice post. Great job, Alban.
Just a little typo I found out there: hazardous -> hasardous.

Cheers,
Pratik Karki


[GSoC] Yet another blog series about the GSoC

2018-05-05 Thread Alban Gruin
Hi everybody,

as my fellow students, I started a blog series about my GSoC project[1].
First, I wanted to post them directly on the mailing list, but a blog
allows me to edit the content easily if needed.

Any feedback is welcome!

[1] https://blog.pa1ch.fr/posts/2018/05/05/en/gsoc2018-week-1.html

Cheers,
Alban Gruin


[PATCH] pack-objects: validation and documentation about unreachable options

2018-05-05 Thread Nguyễn Thái Ngọc Duy
These options are added in [1] [2] [3]. All these depend on running
rev-list internally which is normally true since they are always used
with "--all --objects" which implies --revs. But let's keep this
dependency explicit.

While at there, add documentation for them. These are mostly used
internally by git-repack. But it's still good to not chase down the
right commit message to know how they work.

[1] ca11b212eb (let pack-objects do the writing of unreachable objects
as loose objects - 2008-05-14)
[2] 08cdfb1337 (pack-objects --keep-unreachable - 2007-09-16)
[3] e26a8c4721 (repack: extend --keep-unreachable to loose objects -
2016-06-13)

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-pack-objects.txt | 13 +
 builtin/pack-objects.c |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index 81bc490ac5..44245e5815 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -267,6 +267,19 @@ Unexpected missing object will raise an error.
locally created objects [without .promisor] and objects from the
promisor remote [with .promisor].)  This is used with partial clone.
 
+--keep-unreachable::
+   Objects unreachable from the refs in packs named with
+   --unpacked= option are added to the resulting pack, in
+   addition to the reachable objects that are not in packs marked
+   with *.keep files. This implies `--revs`.
+
+--pack-loose-unreachable::
+   Pack unreachable loose objects (and their loose counterparts
+   removed). This implies `--revs`.
+
+--unpack-unreachable::
+   Keep unreachable objects in loose form. This implies `--revs`.
+
 SEE ALSO
 
 linkgit:git-rev-list[1]
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4bdae5a1d8..cfac021360 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3085,6 +3085,8 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
fetch_if_missing = 0;
argv_array_push(, "--exclude-promisor-objects");
}
+   if (unpack_unreachable || keep_unreachable || pack_loose_unreachable)
+   use_internal_rev_list = 1;
 
if (!reuse_object)
reuse_delta = 0;
-- 
2.17.0.705.g3525833791



Re: [PATCH 2/2] builtin/grep.c: teach '-o', '--only-matching' to 'git-grep'

2018-05-05 Thread Eric Sunshine
On Sat, May 5, 2018 at 12:03 AM, Taylor Blau  wrote:
> Teach GNU grep(1)'s '-o' ('--only-matching') to 'git-grep'. This option
> prints only the matching components of each line. It writes multiple
> lines if more than one match exists on a given line.
>
> For example:
>
>   $ git grep -on --column --heading git -- README.md | head -3
>   README.md
>   15:56:git
>   18:20:git
>
> By using show_line_header(), 'git grep --only-matching' correctly
> respects the '--header' option:

What is the '--header' option? I don't see it used in any example.

>   $ git grep -on --column --heading git -- README.md | head -4
>   README.md
>   15:56:git
>   18:20:git
>   19:16:git

How does this example differ from the earlier example (other than
showing 4 lines of output rather than 3)?

> Signed-off-by: Taylor Blau 


Re: [PATCH 1/2] grep.c: extract show_line_header()

2018-05-05 Thread Eric Sunshine
On Sat, May 5, 2018 at 12:03 AM, Taylor Blau  wrote:
> Teach 'git-grep(1)' how to print a line header multiple times from
> show_line() in preparation for it learning '--only-matching'.
>
> show_line_header() comprises of the code in show_line() executed in

s/of//

> order to produce a line header. It is a one-for-one extraction of the
> existing implementation.
>
> For now, show_line_header() provides no benefit over the change before
> this patch. The following patch will call conditionally call

s/call conditionally call/conditionally call/

> show_line_header() multiple times per invocation to show_line(), which
> is the desired benefit of this change.
>
> Signed-off-by: Taylor Blau 


[PATCH] completion: fix misspelled config key aliasesfiletype

2018-05-05 Thread Nguyễn Thái Ngọc Duy
The correct name in git-send-email.perl is aliasfiletype [1]. There are
actually two instances of this misspelling. The other was found and
fixed in 6068ac8848 (completion: add missing configuration variables -
2010-12-20)

[1] 994d6c66d3 (send-email: address expansion for common mailers - 2006-05-14)

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 01dd9ff07a..bc8ce24a89 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -2172,7 +2172,7 @@ _git_config ()
__gitcomp "$__git_log_date_formats"
return
;;
-   sendemail.aliasesfiletype)
+   sendemail.aliasfiletype)
__gitcomp "mutt mailrc pine elm gnus"
return
;;
-- 
2.17.0.705.g3525833791



Re: [PATCH 2/2] builtin/grep.c: teach '-o', '--only-matching' to 'git-grep'

2018-05-05 Thread Ævar Arnfjörð Bjarmason

On Sat, May 05 2018, Taylor Blau wrote:

> +--o::
> +--only-matching::
> + Show only the matching part of the lines.
> +

Makes sense to steal GNU grep's description here:

Print only the matched (non-empty) parts of a matching line, with
each such part on a separate output line.

> + if (!opt->only_matching)
> + output_color(opt, bol, match.rm_so, line_color);

This should also have braces, see "When there are multiple arms to a
conditional" in Documentation/CodingGuidelines.


>  '
>
> +cat >expected < +file:1:5:mmap
> +file:2:5:mmap
> +file:3:5:mmap
> +file:3:14:mmap
> +file:4:5:mmap
> +file:4:14:mmap
> +file:5:5:mmap
> +file:5:14:mmap
> +EOF

This should be set up as part of the test itself, see e.g. my c8b2cec09e
("branch: add test for -m renaming multiple config sections",
2017-06-18) for how to do that.

> +test_expect_success 'grep --only-matching' '
> + git grep --only-matching --line-number --column mmap file >actual &&
> + test_cmp expected actual
> +'
> +
> +cat >expected < +file
> +1:5:mmap
> +2:5:mmap
> +3:5:mmap
> +3:14:mmap
> +4:5:mmap
> +4:14:mmap
> +5:5:mmap
> +5:14:mmap
> +EOF
> +
> +test_expect_success 'grep --only-matching --heading' '
> + git grep --only-matching --heading --line-number --column mmap file 
> >actual &&
> + test_cmp expected actual
> +'
> +
>  cat >expected <  hello.c
>  4:int main(int argc, const char **argv)

We should test this a lot more, I think a good way to do that would be
to extend this series by first importing GNU grep's -o tests, see
http://git.savannah.gnu.org/cgit/grep.git/tree/tests/foad1 they are
license-compatible. Then change the grep_test() function to call git
grep instead.

It should also be tested with the various grep.patternType options to
make sure it works with basic, extended, perl, fixed etc.


Re: [PATCH v4 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)'

2018-05-05 Thread Duy Nguyen
On Sat, May 5, 2018 at 4:43 AM, Taylor Blau  wrote:
> Teach 'git-grep(1)' a new option, '--column', to show the column
> number of the first match on a non-context line.

Why? Or put it another way, what is this option used for? Only
git-jump? (which should also be mentioned here if true)

>
> For example:
>
>   $ git grep -n --column foo | head -n3
>   .clang-format:51:14:# myFunction(foo, bar, baz);
>   .clang-format:64:7:# int foo();
>   .clang-format:75:8:# void foo()
>
-- 
Duy