Re: [PATCH v2 2/5] builtin/config.c: support `--type=` as preferred alias for `--`

2018-04-26 Thread Eric Sunshine
On Thu, Apr 26, 2018 at 1:58 AM, Taylor Blau  wrote:
> `git config` has long allowed the ability for callers to provide a 'type
> specifier', which instructs `git config` to (1) ensure that incoming
> values can be interpreted as that type, and (2) that outgoing values are
> canonicalized under that type.
>
> In another series, we propose to extend this functionality with
> `--type=color` and `--default` to replace `--get-color`.

Now that you've combined the two series, this sentence no longer makes
sense as written. Perhaps say:

Later patches will extend this functionality...

> However, we traditionally use `--color` to mean "colorize this output",
> instead of "this value should be treated as a color".
>
> Currently, `git config` does not support this kind of colorization, but
> we should be careful to avoid squatting on this option too soon, so that
> `git config` can support `--color` (in the traditional sense) in the
> future, if that is desired.
>
> In this patch, we support `--type=` in
> addition to `--int`, `--bool`, and etc. This allows the aforementioned
> upcoming patch to support querying a color value with a default via
> `--type=color --default=...`, without squandering `--color`.
>
> We retain the historic behavior of complaining when multiple,

Drop the comma and be more specific:
s/multiple,/multiple conflicting/

> legacy-style `--` flags are given, as well as extend this to
> conflicting new-style `--type=` flags. `--int --type=int` (and its
> commutative pair) does not complain, but `--bool --type=int` (and its
> commutative pair) does.

Confusing. Part of the selling point of the commit message of patch
1/5 is the removal of this complaint/restriction, claiming that it
intentionally treats "git config --int --bool" simply as "git config
--bool", and that that loosening is required to support "git config
--int --type=int" without complaining, thus is a good thing. But this
commit message (2/5) backpedals and reinstates the original
complaint/restriction.

Perhaps I could have understood if 1/5 said that the loosening of the
restriction was only temporary and that it would be restored by a
later patch rather than using the restriction-removal as a selling
point. However, this patch series doesn't need to be crafted such that
a feature is temporarily lost and later restored, so I'm having
trouble buying the way this series is architected.

What could make more sense would be for 1/5 to introduce
option_parse_type() for --, thus retaining the restriction, and
for 2/5 simply to augment option_parse_type() to also understand
--type=.

> Signed-off-by: Taylor Blau 


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 5:54 PM, Ben Peart  wrote:
> On 4/26/2018 6:52 PM, Elijah Newren wrote:
>> On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart 
>> wrote:
>>
>>> diff --git a/merge-recursive.h b/merge-recursive.h
>>> index 80d69d1401..0c5f7eff98 100644
>>> --- a/merge-recursive.h
>>> +++ b/merge-recursive.h
>>> @@ -17,7 +17,8 @@ struct merge_options {
>>>  unsigned renormalize : 1;
>>>  long xdl_opts;
>>>  int verbosity;
>>> -   int detect_rename;
>>> +   int diff_detect_rename;
>>> +   int merge_detect_rename;
>>>  int diff_rename_limit;
>>>  int merge_rename_limit;
>>>  int rename_score;
>>> @@ -28,6 +29,11 @@ struct merge_options {
>>>  struct hashmap current_file_dir_set;
>>>  struct string_list df_conflict_file_set;
>>>   };
>>> +inline int merge_detect_rename(struct merge_options *o)
>>> +{
>>> +   return o->merge_detect_rename >= 0 ? o->merge_detect_rename :
>>> +   o->diff_detect_rename >= 0 ? o->diff_detect_rename : 1;
>>> +}
>>
>>
>> Why did you split o->detect_rename into two fields?  You then
>> recombine them in merge_detect_rename(), and after initial setup only
>> ever access them through that function.  Having two fields worries me
>> that people will accidentally introduce bugs by using one of them
>> instead of the merge_detect_rename() function.  Is there a reason you
>> decided against having the initial setup just set a single value and
>> then use it directly?
>>
> The setup of this value is split into 3 places that may or may not all get
> called.  The initial values, the values that come from the config settings
> and then any values passed on the command line.
>
> Because the merge value can now inherit from the diff value, you only know
> the final value after you have received all possible inputs.  That makes it
> necessary to be a calculated value.
>
> If you look at diff_rename_limit/merge_rename_limit, detect_rename follow
> the same pattern for the same reasons.  It turns out detect_rename was a
> little more complex because it is used in 3 different locations (vs just
> one) which is why I wrapped the inheritance logic into the helper function
> merge_detect_rename().

Ah, you're following the precedent set by
diff_rename_limit/merge_rename_limit; that makes sense.  Thanks for
the explanation.  I believe another possibility here is that for both
the {merge,diff}_rename_limit pair of variables and the
{diff,merge}_renames pair of variables, since the code parses all
inputs before ever using the result, we could calculate the result
once and store it rather than storing the constituent pieces of the
calculation.  That would also prevent people from trying to use one of
the pieces of the calculation instead of treating it as a coherent
whole.  However, while I would have preferred that the rename_limit
pair of variables also went away in favor of just one field which is
updated as it parses each input option, what you have is fine for this
series.


Re: [PATCH v4 1/2] blame: prevent error if range ends past end of file

2018-04-26 Thread Isabella Stephens
> Maybe I misread the previous discussion and/or your cover letter,
> but I have been assuming that you are trying to avoid failing the
> command in a useless way (e.g. when the file has only ~800 lines but
> the user does not know exactly how many, instead of letting -L1,820 
> to fail with "the file only has 815 lines", pretend that the -L1,815
> was given) and instead give a reasonable fall-back behaviour.

That's correct. In doing so I picked up on a few extra cases where the
behaviour wasn't intuitive, so I've attempted to fix all of those with
this patch. 

> And to be consistent with that world view, I would have expected
> that the meaning of -L,-20 to be updated from "fail if
>  is before line #20, or show 20 lines leading to
> " to "show lines leading to , up to 20 lines
> but it is OK if there aren't enough lines in the file to show that
> many".

This is the existing behaviour. -L10,-20 for example will blame the
first 10 lines of a file, it will not fail. My patch doesn't change
this. The case I am discussing is -L,-20 which at the moment blames
the first line of the file. Trying to go backwards from the start of
a file should be considered invalid, in my opinion, however I don't
feel strongly about it - I don't expect this case is common in 
practice.


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 7:23 PM, Junio C Hamano  wrote:
> Ben Peart  writes:
>
>> Color me puzzled. :)  The consensus was that the default value for
>> merge.renames come from diff.renames.  diff.renames supports copy
>> detection which means that merge.renames will inherit that value.  My
>> assumption was that is what was intended so when I reimplemented it, I
>> fully implemented it that way.
>>
>> Are you now requesting to only use diff.renames as the default if the
>> value is true or false but not if it is copy?  What should happen if
>> diff.renames is actually set to copy?  Should merge silently change
>> that to true, display a warning, error out, or something else?  Do you
>> have some other behavior for how to handle copy being inherited from
>> diff.renames you'd like to see?
>>
>> Can you write the documentation that clearly explains the exact
>> behavior you want?  That would kill two birds with one stone... :)
>
> I think demoting from copy to rename-only is a good idea, at least
> for now, because I do not believe we have figured out what we want
> to happen when we detect copied files are involved in a merge.
>
> But I am not sure if we even want to fail merge.renames=copy as an
> invalid configuration.  So my gut feeling of the best solution to
> the above is to do something like:
>
>  - whether the configuration comes from diff.renames or
>merge.renames, turn *.renames=copy to true inside the merge
>recursive machinery.
>
>  - document the fact in "git merge-recursive" documentation (or "git
>merge" documentation) to say "_currently_ asking for rename
>detection to find copies and renames will do the same
>thing---copies are ignored", impliying "this might change in the
>future", in the BUGS section.

Yes, I agree.  One more thing:

  - It may be best to avoid advertising "copies" as a vaild option for
merge.renames since it doesn't have any current practical use
anywhere.  (Remove the sentence 'If set to "copies" or "copy", Git
will detect copies, as well.' from the documentation)

My rationale for translating "copy" to "true" is a little different
than Junio's, though:

1) The reason we have configuration options around renames and copies
is primarily because they are expensive to compute.  So we let some
users specify that they don't want them, other users are willing to
pay for rename detection, and others are willing to pay for both
rename and copy detection.
2) If rename/copy detection were cheap, every part of git would just
compute whatever level of detection was relevant and use it.
3) The resolve and octopus merge strategies ignores diff.renames and
merge.renames, because they don't have logic to use any rename
information.  diff and log can use both renames and copies.  And the
recursive merge machinery is code which can use renames but not
copies.
4) Therefore, translating from "copy" to "true" inside the merge
recursive machinery is fine and not an error because we are using as
much detection information as is relevant to the algorithm and which
the user is willing to pay for.

To throw one more wrinkle in here, merge.renames could actually be set
to "copy" and make sense, because we compute diffs multiple times.
Twice within the recursive merge machinery (for which we'd want to
translate "copy" to "true"), and once for the diffstat at the end
(which comes from builtin/merge.c, and for which it could make sense
to detect copies).

(Kind of curious whether Junio agrees with my rationale or thinks I'm
out in left field with it...)


In some rebases, `exec git -C ...` has wrong working directory

2018-04-26 Thread William Chargin
Here is a repro script:

#!/bin/sh
set -eux
git --version
tmpdir="$(mktemp -d)"
cd "${tmpdir}"
mkdir target repo
cd repo
git init
touch file; git add file
git commit -m "Initial commit"
git rebase HEAD --exec "git -C ${tmpdir}/target init"

The end of this script prints something like

Executing: git -C /tmp/tmp.gd2q51jO93/target init
Reinitialized existing Git repository in /tmp/tmp.gd2q51jO93/repo/.git/
Successfully rebased and updated refs/heads/master.

But this is wrong: the repository should be initialized in `target`, not
reinitialized in `repo`.

Notes:

  - This propagates to subprocesses: if you run `exec make test` and
your test suite ends up calling `git -C`, then the same problem
occurs.

  - Substituting `rebase --root` for `rebase HEAD` causes the problem to
go away.

  - The `rebase HEAD` exec context adds the `GIT_DIR` environment
variable, and this is sufficient to reproduce the problem:
running `GIT_DIR="$PWD" git -C /tmp/target init` puts the repo in
the current working directory. The `rebase --root` context adds no
such environment variable. (You can use `--exec 'env >tempfile'` to
verify these.)

My `git --version` is 2.16.2.


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Junio C Hamano
Ben Peart  writes:

> Color me puzzled. :)  The consensus was that the default value for
> merge.renames come from diff.renames.  diff.renames supports copy
> detection which means that merge.renames will inherit that value.  My
> assumption was that is what was intended so when I reimplemented it, I
> fully implemented it that way.
>
> Are you now requesting to only use diff.renames as the default if the
> value is true or false but not if it is copy?  What should happen if
> diff.renames is actually set to copy?  Should merge silently change
> that to true, display a warning, error out, or something else?  Do you
> have some other behavior for how to handle copy being inherited from
> diff.renames you'd like to see?
>
> Can you write the documentation that clearly explains the exact
> behavior you want?  That would kill two birds with one stone... :)

I think demoting from copy to rename-only is a good idea, at least
for now, because I do not believe we have figured out what we want
to happen when we detect copied files are involved in a merge.

But I am not sure if we even want to fail merge.renames=copy as an
invalid configuration.  So my gut feeling of the best solution to
the above is to do something like:

 - whether the configuration comes from diff.renames or
   merge.renames, turn *.renames=copy to true inside the merge
   recursive machinery.

 - document the fact in "git merge-recursive" documentation (or "git
   merge" documentation) to say "_currently_ asking for rename
   detection to find copies and renames will do the same
   thing---copies are ignored", impliying "this might change in the
   future", in the BUGS section.



Re: [PATCH v4 1/2] blame: prevent error if range ends past end of file

2018-04-26 Thread Junio C Hamano
Isabella Stephens  writes:

> On 27/4/18 10:50 am, Junio C Hamano wrote:
>> isteph...@atlassian.com writes:
>> 
>>> diff --git a/line-range.c b/line-range.c
>>> index 323399d16..023aee1f5 100644
>>> --- a/line-range.c
>>> +++ b/line-range.c
>>> @@ -47,7 +47,7 @@ static const char *parse_loc(const char *spec, 
>>> nth_line_fn_t nth_line,
>>> else if (!num)
>>> *ret = begin;
>>> else
>>> -   *ret = begin + num;
>>> +   *ret = begin + num ? begin + num : -1;
>> 
>> When parsing "-L,-20" to grab some lines before the line
>> specified by , if that something happens to be line #20,
>> this gives -1 to *ret.  If it is line #19, *ret becomes -1, and if
>> it is line #18 or before, *ret becomes -2, -3, ...
>> 
>> Is that what we really want here?  It is disturbing that only line
>> #19 and #20 are treated identically in the above example.  If it
>> were "if going backwards by -num lines from begin goes beyond the
>> beginning of the file, clip it to the first line", I would
>> understand it, but as written, I am not sure what the code is trying
>> to do.
>> 

[administrivia] Do not top-post, but cull the context to leave
enough to remind readers what the discussion was about.

> My intention was to modify existing behaviour as little as possible,
> but I agree clipping to the first line makes a lot more sense. That
> raises the question though, do we clip to 1 and treat -L,-n as a valid
> input, or clip to -1 so that this case be detected?

Maybe I misread the previous discussion and/or your cover letter,
but I have been assuming that you are trying to avoid failing the
command in a useless way (e.g. when the file has only ~800 lines but
the user does not know exactly how many, instead of letting -L1,820 
to fail with "the file only has 815 lines", pretend that the -L1,815
was given) and instead give a reasonable fall-back behaviour.

And to be consistent with that world view, I would have expected
that the meaning of -L,-20 to be updated from "fail if
 is before line #20, or show 20 lines leading to
" to "show lines leading to , up to 20 lines
but it is OK if there aren't enough lines in the file to show that
many".

So the answer to the question probably depends on what happens when
"this case is detected" by returning -1 from here.  Do we detect and
fail?  That would defeat the overall theme of these patches.  Do we
detct and warn but continue?  That may be sensible in theory, but in
practice, especially where this "the users may not know how many
lines exactly the blob has, so do not force them to count" matters,
"blame" and "log" would show a lot of output that is sent to the
pager, so the warning message may not be shown in a noticeable way.
Compared to that, "pretend as if the first line was specified and go
on" looks like we have one fewer thing to worry about ;-)




Re: [PATCH v4 1/2] blame: prevent error if range ends past end of file

2018-04-26 Thread Isabella Stephens
My intention was to modify existing behaviour as little as possible,
but I agree clipping to the first line makes a lot more sense. That
raises the question though, do we clip to 1 and treat -L,-n as a valid
input, or clip to -1 so that this case be detected?

On 27/4/18 10:50 am, Junio C Hamano wrote:
> isteph...@atlassian.com writes:
> 
>> diff --git a/line-range.c b/line-range.c
>> index 323399d16..023aee1f5 100644
>> --- a/line-range.c
>> +++ b/line-range.c
>> @@ -47,7 +47,7 @@ static const char *parse_loc(const char *spec, 
>> nth_line_fn_t nth_line,
>>  else if (!num)
>>  *ret = begin;
>>  else
>> -*ret = begin + num;
>> +*ret = begin + num ? begin + num : -1;
> 
> When parsing "-L,-20" to grab some lines before the line
> specified by , if that something happens to be line #20,
> this gives -1 to *ret.  If it is line #19, *ret becomes -1, and if
> it is line #18 or before, *ret becomes -2, -3, ...
> 
> Is that what we really want here?  It is disturbing that only line
> #19 and #20 are treated identically in the above example.  If it
> were "if going backwards by -num lines from begin goes beyond the
> beginning of the file, clip it to the first line", I would
> understand it, but as written, I am not sure what the code is trying
> to do.
> 


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Ben Peart



On 4/26/2018 6:52 PM, Elijah Newren wrote:

On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart  wrote:


+merge.renames::
+   Whether and how Git detects renames.  If set to "false",
+   rename detection is disabled. If set to "true", basic rename
+   detection is enabled.  If set to "copies" or "copy", Git will
+   detect copies, as well.  Defaults to the value of diff.renames.
+


We shouldn't allow users to force copy detection on for merges  The
diff side of the code will detect them correctly but the code in
merge-recursive will mishandle the copy pairs.  I think fixing it is
somewhere between big can of worms and
it's-a-configuration-that-doesn't-even-make-sense, but it's been a
while since I thought about it.


Color me puzzled. :)  The consensus was that the default value for 
merge.renames come from diff.renames.  diff.renames supports copy 
detection which means that merge.renames will inherit that value.  My 
assumption was that is what was intended so when I reimplemented it, I 
fully implemented it that way.


Are you now requesting to only use diff.renames as the default if the 
value is true or false but not if it is copy?  What should happen if 
diff.renames is actually set to copy?  Should merge silently change that 
to true, display a warning, error out, or something else?  Do you have 
some other behavior for how to handle copy being inherited from 
diff.renames you'd like to see?


Can you write the documentation that clearly explains the exact behavior 
you want?  That would kill two birds with one stone... :)





diff --git a/merge-recursive.h b/merge-recursive.h
index 80d69d1401..0c5f7eff98 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -17,7 +17,8 @@ struct merge_options {
 unsigned renormalize : 1;
 long xdl_opts;
 int verbosity;
-   int detect_rename;
+   int diff_detect_rename;
+   int merge_detect_rename;
 int diff_rename_limit;
 int merge_rename_limit;
 int rename_score;
@@ -28,6 +29,11 @@ struct merge_options {
 struct hashmap current_file_dir_set;
 struct string_list df_conflict_file_set;
  };
+inline int merge_detect_rename(struct merge_options *o)
+{
+   return o->merge_detect_rename >= 0 ? o->merge_detect_rename :
+   o->diff_detect_rename >= 0 ? o->diff_detect_rename : 1;
+}


Why did you split o->detect_rename into two fields?  You then
recombine them in merge_detect_rename(), and after initial setup only
ever access them through that function.  Having two fields worries me
that people will accidentally introduce bugs by using one of them
instead of the merge_detect_rename() function.  Is there a reason you
decided against having the initial setup just set a single value and
then use it directly?



The setup of this value is split into 3 places that may or may not all 
get called.  The initial values, the values that come from the config 
settings and then any values passed on the command line.


Because the merge value can now inherit from the diff value, you only 
know the final value after you have received all possible inputs.  That 
makes it necessary to be a calculated value.


If you look at diff_rename_limit/merge_rename_limit, detect_rename 
follow the same pattern for the same reasons.  It turns out 
detect_rename was a little more complex because it is used in 3 
different locations (vs just one) which is why I wrapped the inheritance 
logic into the helper function merge_detect_rename().


Re: [PATCH v4 1/2] blame: prevent error if range ends past end of file

2018-04-26 Thread Junio C Hamano
isteph...@atlassian.com writes:

> diff --git a/line-range.c b/line-range.c
> index 323399d16..023aee1f5 100644
> --- a/line-range.c
> +++ b/line-range.c
> @@ -47,7 +47,7 @@ static const char *parse_loc(const char *spec, 
> nth_line_fn_t nth_line,
>   else if (!num)
>   *ret = begin;
>   else
> - *ret = begin + num;
> + *ret = begin + num ? begin + num : -1;

When parsing "-L,-20" to grab some lines before the line
specified by , if that something happens to be line #20,
this gives -1 to *ret.  If it is line #19, *ret becomes -1, and if
it is line #18 or before, *ret becomes -2, -3, ...

Is that what we really want here?  It is disturbing that only line
#19 and #20 are treated identically in the above example.  If it
were "if going backwards by -num lines from begin goes beyond the
beginning of the file, clip it to the first line", I would
understand it, but as written, I am not sure what the code is trying
to do.


Proposal

2018-04-26 Thread MS Zeliha Omer Faruk



Hello

   Greetings to you today i asked before but i did't get a response please
i know this might come to you as a surprise because you do not know me
personally i have a business proposal for you please reply for more
info.



Best Regards,

Esentepe Mahallesi Büyükdere
Caddesi Kristal Kule Binasi
No:215
 Sisli - Istanbul, Turkey



Re: git merge banch w/ different submodule revision

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 3:49 AM, Middelschulte, Leif
 wrote:
> Hi,
>
> we're using git-flow as a basic development workflow. However, doing so 
> revealed unexpected merge-behavior by git.
>
> Assume the following setup:
>
> - Repository `S` is sourced by repository `p` as submodule `s`
> - Repository `p` has two branches: `feature_x` and `develop`
> - The revisions sourced via the submodule have a linear history
>
>
> * 1c1d38f (feature_x) update submodule revision to b17e9d9
> | * 3290e69 (HEAD -> develop) update submodule revision to 0598394
> |/
> * cd5e1a5 initial submodule revision
>
>
> Problem case: Merge either branch into the other
>
> Expected behavior: Merge conflict.
>
> Actual behavior: Auto merge without conflicts.
>
> Note 1: A merge conflict does occur, if the sourced revisions do *not* have a 
> linear history
>
> Did I get something wrong about how git resolves merges? Shouldn't git be 
> like: "hey, you're trying to merge two different contents for the same line" 
> (the submodule's revision)

Hard to say without saying what commit was referenced for the
submodule in the merge-bases for the two repositories you have.  In
the basic case..

If branch A and branch B have different commits checked out in the
submodule, say:
   A: deadbeef
   B: ba5eba11

then it's not clear whether there's a conflict or not.  The merge-base
(the common point of history) matters.  So, for example if the
original version (which I'll refer to as 'O") had:
  O: deadbeef

then you would say, "Oh, branch A made no change to this submodule but
B did.  So let's go with what B has."  Conversely, of O had ba5eba11,
then you'd go the other way.

But, there is some further smarts in that if either A or B point at
commits that contain the other in their history and both contain the
commit that O points at, then you can just do a fast-forward update to
the newest.


You didn't tell us how the merge-base (cd5e1a5 from the diagram you
gave) differed in your example here between the two repositories.  In
fact, the non-linear case could have several merge-bases, in which
case they all become potentially relevant (as does their merge-bases
since at that point you'll trigger the recursive portion of
merge-recursive).  Giving us that info might help us point out what
happened, though if either the fast-forward logic comes into play or
the recursive logic gets in the mix, then we may need you to provide a
testcase (or access to the repo in question) in order to explain it
and/or determine if you've found a bug.

Does that help?

Elijah


Re: git merge banch w/ different submodule revision

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 2:46 PM, Jacob Keller  wrote:
> On Thu, Apr 26, 2018 at 10:56 AM, Stefan Beller  wrote:
>> We often treating a submodule as a file from the superproject, but not 
>> always.
>> And in case of a merge, git seems to be a bit smarter than treating it
>> as a textfile
>> with two different lines.
>
> Sure, but a submodule is checked out "at a commit", so if two branches
> of history are merged, and they conflict over which place the
> submodule is at shouldn't that produce a conflict??

By "which place a submodule is at", do you mean the commit it points
to, or the path at which the submodule is found within the parent
repository?  Continuing on it sounds like you meant the former, but I
was unsure if you were asking mutliple different questions here.

> I mean, how is the merge algorithm supposed to know which is right?
> The patch you linked appears to be able to resolve it to the one which
> contains both commits.. but that may not actually be true since you
> can rewind submodules since they're *pointers* to commits, not commits
> themselves.

Only if both commits also contain the base; see lines 328 to 332 of
that patch.  So, if the submodules are rewound, that algorithm would
leave them as conflicted.


[no subject]

2018-04-26 Thread Scott McKellar
  Good morning Git

https://bit.ly/2HSZB08



Scott McKellar


Proposal

2018-04-26 Thread MS Zeliha Omer Faruk



Hello

   Greetings to you today i asked before but i did't get a response please
i know this might come to you as a surprise because you do not know me
personally i have a business proposal for you please reply for more
info.



Best Regards,

Esentepe Mahallesi Büyükdere
Caddesi Kristal Kule Binasi
No:215
 Sisli - Istanbul, Turkey



Re: Fetching tags overwrites existing tags

2018-04-26 Thread Junio C Hamano
Junio C Hamano  writes:

> Wink Saville  writes:
>
>> I've tried to teach 'git remote add' the --prefix-tags option using the
>> technique Junio provided. At moment it is PR #486 on github [1]
>> and I'd love some comments on whether or not this the right direction
>> for fetching tags and putting them in the branches namespace.
>>
>> -- Wink
>>
>> [1] https://github.com/git/git/pull/486
>
> FWIW, here is how that pull/486/head looks like.
>
> -- >8 --
>
> From: Wink Saville 
> Date: Thu, 26 Apr 2018 09:56:11 -0700
> Subject: [PATCH] Teach remote add the --prefix-tags option
>
> When --prefix-tags is passed to `git remote add` the tagopt is set to
> --prefix-tags and a second fetch line is added so tags are placed in
> the branches namespace.

When I hear "branches namespace", what comes to my mind is refs/heads/
or perhaps refs/remotes/*/.  "... are placed in a separate hierarchy
per remote" or something, perhaps?

>
> ...
> And the .git/config remote "gbenchmark" section looks like:
>   [remote "gbenchmark"]
> url = g...@github.com:google/benchmark
> fetch = +refs/heads/*:refs/remotes/gbenchmark/*
> fetch = +refs/tags/*:refs/remote-tags/gbenchmark/*
> tagopt = --prefix-tags
> ---

Missing sign-off ;-)

> +static void add_remote_tags(const char *key, const char *branchname,
> +const char *remotename, struct strbuf *tmp)
> +{
> + strbuf_reset(tmp);
> + strbuf_addch(tmp, '+');
> + strbuf_addf(tmp, "refs/tags/%s:refs/remote-tags/%s/%s",
> + branchname, remotename, branchname);

With "+refs/tags/%s:refs/remote-tags/%s/%s", combine addch/addf into
one, perhaps?

> + git_config_set_multivar(key, tmp->buf, "^$", 0);
> +}

Calling the second parameter "branchname" makes little sense, I
would think.  Practically, you would call this at most once with its
second parameter set to '*', and even if the second parameter is not
a wildcard/asterisk, it would be a tagname.


>  static const char mirror_advice[] =
>  N_("--mirror is dangerous and deprecated; please\n"
> "\t use --mirror=fetch or --mirror=push instead");
> @@ -161,6 +172,9 @@ static int add(int argc, const char **argv)
>   OPT_SET_INT(0, "tags", _tags,
>   N_("import all tags and associated objects when 
> fetching"),
>   TAGS_SET),
> + OPT_SET_INT(0, "prefix-tags", _tags,
> + N_("import all tags and associated objects when 
> fetching and prefix with "),
> +  TAGS_SET_PREFIX),

Funny indent.  Use monospaced font in your editor, set tab width to
8 and align, imitating how the above OPT_SET_INT() item does for
TAGS_SET.

> @@ -215,10 +229,35 @@ static int add(int argc, const char **argv)
>   }
>  
>   if (fetch_tags != TAGS_DEFAULT) {
> + if (fetch_tags == TAGS_SET_PREFIX) {
> + strbuf_reset();
> + strbuf_addf(, "remote.%s.fetch", name);
> + if (track.nr == 0)
> + string_list_append(, "*");
> + for (i = 0; i < track.nr; i++) {
> + add_remote_tags(buf.buf, track.items[i].string,
> + name, );
> + }

The "track" thing is made incompatible with anything but mirror in
early part of this function (outside the precontext).  I highly
suspect that --prefix-tags does *not* make sense when mirroring.

Hence (1) we should detect and error out when --prefix-tags is used
with mirror fetch near where we do the same for track used without
mirror fetch already, (2) detect and error out when --prefix-tags is
used with track, and (3) add "+refs/tags/*:refs/remote-tags/$name/*"
just once without paying attention to track here.  We may not even
want add_remote_tags() helper function if we go that route.

> + }
> +
>   strbuf_reset();
>   strbuf_addf(, "remote.%s.tagopt", name);
> - git_config_set(buf.buf,
> -fetch_tags == TAGS_SET ? "--tags" : "--no-tags");
> + char* config_val = NULL;

decl-after-statement.  Also "char *var", not "char* var".

> + switch (fetch_tags) {
> + case TAGS_UNSET:
> + config_val = "--no-tags";
> + break;
> + case TAGS_SET:
> + config_val = "--tags";
> + break;
> + case TAGS_SET_PREFIX:
> + config_val = "--prefix-tags";
> + break;
> + default:
> + die(_("Unexpected TAGS enum %d"), fetch_tags);
> + break;
> + }
> + git_config_set(buf.buf, config_val);
>   }
>  
>   if (fetch && fetch_remote(name))
> diff --git a/remote.c b/remote.c
> index 91eb010ca9..f383ce3cdf 

Re: [PATCH v3 1/3] merge: update documentation for {merge,diff}.renameLimit

2018-04-26 Thread Jonathan Tan
On Thu, 26 Apr 2018 16:11:50 -0700
Elijah Newren  wrote:

> Patch looks fine, but it's hard for me not to notice a separate issue
> in this area independent of your series: I'm curious if we should
> document that the value of 0 is special here (as per Jonathan Tan's
> commit 89973554b52c ("diffcore-rename: make diff-tree -l0 mean
> -l", 2017-11-29)), and doesn't actually drop the limit to 0.
> cc'ing Jonathan Tan for his thoughts.

Documenting that the value of 0 is special does make sense to me. I
think this patch can go in as-is, though - it is already an improvement.


Re: [PATCH v3 1/3] merge: update documentation for {merge,diff}.renameLimit

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart  wrote:
> Update the documentation to better indicate that the renameLimit setting is
> ignored if rename detection is turned off via command line options or config
> settings.
>
> Signed-off-by: Ben Peart 
> ---
>  Documentation/diff-config.txt  | 3 ++-
>  Documentation/merge-config.txt | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
> index 5ca942ab5e..77caa66c2f 100644
> --- a/Documentation/diff-config.txt
> +++ b/Documentation/diff-config.txt
> @@ -112,7 +112,8 @@ diff.orderFile::
>
>  diff.renameLimit::
> The number of files to consider when performing the copy/rename
> -   detection; equivalent to the 'git diff' option `-l`.
> +   detection; equivalent to the 'git diff' option `-l`. This setting
> +   has no effect if rename detection is turned off.
>
>  diff.renames::
> Whether and how Git detects renames.  If set to "false",
> diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
> index 12b6bbf591..48ee3bce77 100644
> --- a/Documentation/merge-config.txt
> +++ b/Documentation/merge-config.txt
> @@ -35,7 +35,8 @@ include::fmt-merge-msg-config.txt[]
>  merge.renameLimit::
> The number of files to consider when performing rename detection
> during a merge; if not specified, defaults to the value of
> -   diff.renameLimit.
> +   diff.renameLimit. This setting has no effect if rename detection
> +   is turned off.
>
>  merge.renormalize::
> Tell Git that canonical representation of files in the
> --
> 2.17.0.windows.1

Patch looks fine, but it's hard for me not to notice a separate issue
in this area independent of your series: I'm curious if we should
document that the value of 0 is special here (as per Jonathan Tan's
commit 89973554b52c ("diffcore-rename: make diff-tree -l0 mean
-l", 2017-11-29)), and doesn't actually drop the limit to 0.
cc'ing Jonathan Tan for his thoughts.


Re: [PATCH v3 3/3] merge: pass aggressive when rename detection is turned off

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart  wrote:
> Set aggressive flag in git_merge_trees() when rename detection is turned off.
> This allows read_tree() to auto resolve more cases that would have otherwise
> been handled by the rename detection.
>
> Reviewed-by: Johannes Schindelin 
> Signed-off-by: Ben Peart 
> ---
>  merge-recursive.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 2637d34d87..6cc4404144 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -276,6 +276,7 @@ static void init_tree_desc_from_tree(struct tree_desc 
> *desc, struct tree *tree)
>  }
>
>  static int git_merge_trees(int index_only,
> +  int aggressive,
>struct tree *common,
>struct tree *head,
>struct tree *merge)
> @@ -294,6 +295,7 @@ static int git_merge_trees(int index_only,
> opts.fn = threeway_merge;
> opts.src_index = _index;
> opts.dst_index = _index;
> +   opts.aggressive = aggressive;
> setup_unpack_trees_porcelain(, "merge");
>
> init_tree_desc_from_tree(t+0, common);
> @@ -1993,7 +1995,7 @@ int merge_trees(struct merge_options *o,
> return 1;
> }
>
> -   code = git_merge_trees(o->call_depth, common, head, merge);
> +   code = git_merge_trees(o->call_depth, !merge_detect_rename(o), 
> common, head, merge);
>
> if (code != 0) {
> if (show(o, 4) || o->call_depth)
> --
> 2.17.0.windows.1

Patch looks fine but as a heads up -- since merge_options is a
parameter in git_merge_trees after the
en/rename-directory-detection-reboot lands, we'll be able to switch
this patch to set opts.aggressive directly instead of needing to pass
it in as a parameter.


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart  wrote:

> +merge.renames::
> +   Whether and how Git detects renames.  If set to "false",
> +   rename detection is disabled. If set to "true", basic rename
> +   detection is enabled.  If set to "copies" or "copy", Git will
> +   detect copies, as well.  Defaults to the value of diff.renames.
> +

We shouldn't allow users to force copy detection on for merges  The
diff side of the code will detect them correctly but the code in
merge-recursive will mishandle the copy pairs.  I think fixing it is
somewhere between big can of worms and
it's-a-configuration-that-doesn't-even-make-sense, but it's been a
while since I thought about it.

> diff --git a/merge-recursive.h b/merge-recursive.h
> index 80d69d1401..0c5f7eff98 100644
> --- a/merge-recursive.h
> +++ b/merge-recursive.h
> @@ -17,7 +17,8 @@ struct merge_options {
> unsigned renormalize : 1;
> long xdl_opts;
> int verbosity;
> -   int detect_rename;
> +   int diff_detect_rename;
> +   int merge_detect_rename;
> int diff_rename_limit;
> int merge_rename_limit;
> int rename_score;
> @@ -28,6 +29,11 @@ struct merge_options {
> struct hashmap current_file_dir_set;
> struct string_list df_conflict_file_set;
>  };
> +inline int merge_detect_rename(struct merge_options *o)
> +{
> +   return o->merge_detect_rename >= 0 ? o->merge_detect_rename :
> +   o->diff_detect_rename >= 0 ? o->diff_detect_rename : 1;
> +}

Why did you split o->detect_rename into two fields?  You then
recombine them in merge_detect_rename(), and after initial setup only
ever access them through that function.  Having two fields worries me
that people will accidentally introduce bugs by using one of them
instead of the merge_detect_rename() function.  Is there a reason you
decided against having the initial setup just set a single value and
then use it directly?


Re: Fetching tags overwrites existing tags

2018-04-26 Thread Junio C Hamano
Wink Saville  writes:

> I've tried to teach 'git remote add' the --prefix-tags option using the
> technique Junio provided. At moment it is PR #486 on github [1]
> and I'd love some comments on whether or not this the right direction
> for fetching tags and putting them in the branches namespace.
>
> -- Wink
>
> [1] https://github.com/git/git/pull/486

FWIW, here is how that pull/486/head looks like.

-- >8 --

From: Wink Saville 
Date: Thu, 26 Apr 2018 09:56:11 -0700
Subject: [PATCH] Teach remote add the --prefix-tags option

When --prefix-tags is passed to `git remote add` the tagopt is set to
--prefix-tags and a second fetch line is added so tags are placed in
the branches namespace.

For example:
  $ git remote add -f --prefix-tags gbenchmark g...@github.com:google/benchmark
  Updating gbenchmark
  warning: no common commits
  remote: Counting objects: 4406, done.
  remote: Compressing objects: 100% (18/18), done.
  remote: Total 4406 (delta 7), reused 13 (delta 6), pack-reused 4382
  Receiving objects: 100% (4406/4406), 1.34 MiB | 7.46 MiB/s, done.
  Resolving deltas: 100% (2865/2865), done.
  From github.com:google/benchmark
   * [new branch]  clangtidy   -> gbenchmark/clangtidy
   * [new branch]  iter_report -> gbenchmark/iter_report
   * [new branch]  master  -> gbenchmark/master
   * [new branch]  releasing   -> gbenchmark/releasing
   * [new branch]  reportercleanup -> gbenchmark/reportercleanup
   * [new branch]  rmheaders   -> gbenchmark/rmheaders
   * [new branch]  v2  -> gbenchmark/v2
   * [new tag] v0.0.9  -> refs/remote-tags/gbenchmark/v0.0.9
   * [new tag] v0.1.0  -> refs/remote-tags/gbenchmark/v0.1.0
   * [new tag] v1.0.0  -> refs/remote-tags/gbenchmark/v1.0.0
   * [new tag] v1.1.0  -> refs/remote-tags/gbenchmark/v1.1.0
   * [new tag] v1.2.0  -> refs/remote-tags/gbenchmark/v1.2.0
   * [new tag] v1.3.0  -> refs/remote-tags/gbenchmark/v1.3.0
   * [new tag] v1.4.0  -> refs/remote-tags/gbenchmark/v1.4.0

And the .git/config remote "gbenchmark" section looks like:
  [remote "gbenchmark"]
url = g...@github.com:google/benchmark
fetch = +refs/heads/*:refs/remotes/gbenchmark/*
fetch = +refs/tags/*:refs/remote-tags/gbenchmark/*
tagopt = --prefix-tags
---
 Documentation/git-remote.txt |  8 --
 builtin/remote.c | 47 +---
 remote.c |  2 ++
 3 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-remote.txt b/Documentation/git-remote.txt
index 4feddc0293..cdfd24e2ea 100644
--- a/Documentation/git-remote.txt
+++ b/Documentation/git-remote.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 
 [verse]
 'git remote' [-v | --verbose]
-'git remote add' [-t ] [-m ] [-f] [--[no-]tags] 
[--mirror=]  
+'git remote add' [-t ] [-m ] [-f] [--prefix-tags | --tags | 
--no-tags] [--mirror=]  
 'git remote rename'  
 'git remote remove' 
 'git remote set-head'  (-a | --auto | -d | --delete | )
@@ -54,7 +54,11 @@ With `-f` option, `git fetch ` is run immediately after
 the remote information is set up.
 +
 With `--tags` option, `git fetch ` imports every tag from the
-remote repository.
+remote repository to refs/tags, use --prefix-tags to import them
+to refs/remote-tags//.
++
+With `--prefix-tags` option, `git fetch ` imports every tag from the
+remote repository to refs/remote-tags//.
 +
 With `--no-tags` option, `git fetch ` does not import tags from
 the remote repository.
diff --git a/builtin/remote.c b/builtin/remote.c
index 805ffc05cd..75813eeaa3 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -11,7 +11,7 @@
 
 static const char * const builtin_remote_usage[] = {
N_("git remote [-v | --verbose]"),
-   N_("git remote add [-t ] [-m ] [-f] [--tags | 
--no-tags] [--mirror=]  "),
+   N_("git remote add [-t ] [-m ] [-f] [--prefix-tags | 
--tags | --no-tags] [--mirror=]  "),
N_("git remote rename  "),
N_("git remote remove "),
N_("git remote set-head  (-a | --auto | -d | --delete | 
)"),
@@ -101,7 +101,8 @@ static int fetch_remote(const char *name)
 enum {
TAGS_UNSET = 0,
TAGS_DEFAULT = 1,
-   TAGS_SET = 2
+   TAGS_SET = 2,
+   TAGS_SET_PREFIX = 3
 };
 
 #define MIRROR_NONE 0
@@ -123,6 +124,16 @@ static void add_branch(const char *key, const char 
*branchname,
git_config_set_multivar(key, tmp->buf, "^$", 0);
 }
 
+static void add_remote_tags(const char *key, const char *branchname,
+  const char *remotename, struct strbuf *tmp)
+{
+   strbuf_reset(tmp);
+   strbuf_addch(tmp, '+');
+   strbuf_addf(tmp, "refs/tags/%s:refs/remote-tags/%s/%s",
+   branchname, remotename, branchname);
+   git_config_set_multivar(key, 

Re: git merge banch w/ different submodule revision

2018-04-26 Thread Stefan Beller
Stefan wrote:
> See https://github.com/git/git/commit/68d03e4a6e448aa557f52adef92595ac4d6cd4bd
> (68d03e4a6e (Implement automatic fast-forward merge for submodules, 
> 2010-07-07)
> to explain the situation you encounter. (specifically merge_submodule
> at the end of the diff)

+cc Heiko, author of that commit.

On Thu, Apr 26, 2018 at 2:46 PM, Jacob Keller  wrote:
> On Thu, Apr 26, 2018 at 10:56 AM, Stefan Beller  wrote:
>> We often treating a submodule as a file from the superproject, but not 
>> always.
>> And in case of a merge, git seems to be a bit smarter than treating it
>> as a textfile with two different lines.
>
> Sure, but a submodule is checked out "at a commit", so if two branches
> of history are merged, and they conflict over which place the
> submodule is at shouldn't that produce a conflict??

Stepping back a little bit:

When two branches developed a file differently, they can be merged
iff they do not change the same lines (plus a little bit of margin of 1
extra line)

That is the builtin merge-driver for "plain text files" and seems to be accepted
widely as "good enough" or "that is how git merges".

What if this text file happens to be the .gitmodules file and the changed lines
happen to be 2 options in there (Say one option was the path, as one branch
renamed the submodule, and the other option is submodule.branch) ?

Then we could do better as we know the structure of the file. We would not
need the extra buffer line as a cautious step, but instead could parse both
sides of the merge and merge each config in-memory and then write out
a .gitmodules file. I think David Turner proposed a custom merge driver
for .gitmodules a couple month ago.

Another example is the merge code respecting renames on one side
(even for directories) and edits in the other side. Technically the rename
of a file is a "delete of all lines in this path", which could also argued to
just conflict with the edit on the other side.

With these examples given, I think it is legit to treat submodule changes
not as "two lines of text differ at the same place, mark it as conflict",
but we are allowed to be smarter about it.

> I mean, how is the merge algorithm supposed to know which is right?

Good question. As said above, the merge algorithm for text files is just
correct for "plain text files". In source code, I can give an example
which merges fine, but doesn't compile after merging: One side changes
a function signature and the other side adds a call to the function (still using
the old signature).

Here you can see that our merge algorithm is wrong. It sucks.
The solution is a custom merge driver for C code (or whatever
language you happen to use).

For submodules, the given commit made the assumption that
progressing in history of a submodule is never bad, i.e. there are
no reverts and no bugs introduced, only perfect features are added
by new submodule commits. (I don't know which assumptions were
actually made, I made this up).

Maybe we need to revisit that decision?

> The patch you linked appears to be able to resolve it to the one which
> contains both commits.. but that may not actually be true since you
> can rewind submodules since they're *pointers* to commits, not commits
> themselves.

Right, and that is the problem, as the pointer is a small thing, which
doesn't allow for the dumb text merging strategy that is used in files.

So we could always err out and have the user make a decision.
Or we could provide a basic merge driver for submodules (which
was implemented in that commit).

If you use a different workflow this doesn't work for you, so
obviously you want a different custom merge driver for
submodules?

> I'm not against that as a possible strategy to merge submodules, but
> it seems like not necessarily something you would always want...

I agree that it is reasonable to want different things, just like
wanting a merge driver that works better with C code.
(side note: I am rebasing a large series currently and one of the
frequent conflicts were different #includes at the top of a file.
You could totally automate merging that :/)

Thanks,
Stefan


Proposal

2018-04-26 Thread MS Zeliha Omer Faruk



Hello

   Greetings to you today i asked before but i did't get a response please
i know this might come to you as a surprise because you do not know me
personally i have a business proposal for you please reply for more
info.



Best Regards,

Esentepe Mahallesi Büyükdere
Caddesi Kristal Kule Binasi
No:215
 Sisli - Istanbul, Turkey



Re: [PATCH v3 0/3] add merge.renames config setting

2018-04-26 Thread Elijah Newren
Hi Ben,

On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart  wrote:
> This is a complete rewrite based on the feedback from earlier patches.

Thanks for pushing forward on this.

> Update the documentation to better indicate command line options that override
> various config settings related to merge.
>
> Add a new config merge.renames setting to to control the rename detection
> behavior of merge.  This setting will default to the value of diff.renames.
>
> Also adds logic so that when rename detection is turned off, the aggressive
> flag is passed to read_tree() so that it can auto resolve more cases that 
> would
> have been handled by rename detection.
>
> For the repro that I have been using this drops the merge time from ~1 hour to
> ~5 minutes and the unmerged entries goes down from ~40,000 to 1.
>
> Helped-by: Kevin Willford 
> Reviewed-by: Johannes Schindelin 
> Signed-off-by: Ben Peart 
>
> Base Ref: master

We may need to figure out how to coordinate amongst a few topics.
Looking over your patches, there are going to be a few conflicts with
en/rename-directory-detection-reboot, so this won't apply to pu.
Martin's series to introduce clear_unpack_trees_porcelain()[1], which
he was waiting to submit until mine went through, will also conflict
with this, if he uses the changes I suggested for the handling in
merge-recursive[2].  These aren't major conflicts, but I'm just
flagging it.

[1] https://public-inbox.org/git/cover.1524545557.git.martin.ag...@gmail.com/
[2] https://public-inbox.org/git/20180424162939.20956-1-new...@gmail.com/


Re: git merge banch w/ different submodule revision

2018-04-26 Thread Jacob Keller
On Thu, Apr 26, 2018 at 10:56 AM, Stefan Beller  wrote:
> We often treating a submodule as a file from the superproject, but not always.
> And in case of a merge, git seems to be a bit smarter than treating it
> as a textfile
> with two different lines.

Sure, but a submodule is checked out "at a commit", so if two branches
of history are merged, and they conflict over which place the
submodule is at shouldn't that produce a conflict??

I mean, how is the merge algorithm supposed to know which is right?
The patch you linked appears to be able to resolve it to the one which
contains both commits.. but that may not actually be true since you
can rewind submodules since they're *pointers* to commits, not commits
themselves.

I'm not against that as a possible strategy to merge submodules, but
it seems like not necessarily something you would always want...

Thanks,
Jake


[PATCH v3 3/3] merge: pass aggressive when rename detection is turned off

2018-04-26 Thread Ben Peart
Set aggressive flag in git_merge_trees() when rename detection is turned off.
This allows read_tree() to auto resolve more cases that would have otherwise
been handled by the rename detection.

Reviewed-by: Johannes Schindelin 
Signed-off-by: Ben Peart 
---
 merge-recursive.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 2637d34d87..6cc4404144 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -276,6 +276,7 @@ static void init_tree_desc_from_tree(struct tree_desc 
*desc, struct tree *tree)
 }
 
 static int git_merge_trees(int index_only,
+  int aggressive,
   struct tree *common,
   struct tree *head,
   struct tree *merge)
@@ -294,6 +295,7 @@ static int git_merge_trees(int index_only,
opts.fn = threeway_merge;
opts.src_index = _index;
opts.dst_index = _index;
+   opts.aggressive = aggressive;
setup_unpack_trees_porcelain(, "merge");
 
init_tree_desc_from_tree(t+0, common);
@@ -1993,7 +1995,7 @@ int merge_trees(struct merge_options *o,
return 1;
}
 
-   code = git_merge_trees(o->call_depth, common, head, merge);
+   code = git_merge_trees(o->call_depth, !merge_detect_rename(o), common, 
head, merge);
 
if (code != 0) {
if (show(o, 4) || o->call_depth)
-- 
2.17.0.windows.1



[PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Ben Peart
Add the ability to control rename detection for merge via a config setting.
This setting behaves the same and defaults to the value of diff.renames but only
applies to merge.

Reviewed-by: Johannes Schindelin 
Signed-off-by: Ben Peart 
---
 Documentation/merge-config.txt|  6 ++
 Documentation/merge-strategies.txt|  6 --
 diff.c|  2 +-
 diff.h|  2 ++
 merge-recursive.c | 23 +--
 merge-recursive.h |  8 +++-
 t/t3034-merge-recursive-rename-options.sh | 18 ++
 7 files changed, 55 insertions(+), 10 deletions(-)

diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 48ee3bce77..59848e5634 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -38,6 +38,12 @@ merge.renameLimit::
diff.renameLimit. This setting has no effect if rename detection
is turned off.
 
+merge.renames::
+   Whether and how Git detects renames.  If set to "false",
+   rename detection is disabled. If set to "true", basic rename
+   detection is enabled.  If set to "copies" or "copy", Git will
+   detect copies, as well.  Defaults to the value of diff.renames.
+
 merge.renormalize::
Tell Git that canonical representation of files in the
repository has changed over time (e.g. earlier commits record
diff --git a/Documentation/merge-strategies.txt 
b/Documentation/merge-strategies.txt
index 4a58aad4b8..1e0728aa12 100644
--- a/Documentation/merge-strategies.txt
+++ b/Documentation/merge-strategies.txt
@@ -84,12 +84,14 @@ no-renormalize;;
`merge.renormalize` configuration variable.
 
 no-renames;;
-   Turn off rename detection.
+   Turn off rename detection. This overrides the `merge.renames`
+   configuration variable.
See also linkgit:git-diff[1] `--no-renames`.
 
 find-renames[=];;
Turn on rename detection, optionally setting the similarity
-   threshold.  This is the default.
+   threshold.  This is the default. This overrides the
+   'merge.renames' configuration variable.
See also linkgit:git-diff[1] `--find-renames`.
 
 rename-threshold=;;
diff --git a/diff.c b/diff.c
index 1289df4b1f..5dfc24aa6d 100644
--- a/diff.c
+++ b/diff.c
@@ -177,7 +177,7 @@ static int parse_submodule_params(struct diff_options 
*options, const char *valu
return 0;
 }
 
-static int git_config_rename(const char *var, const char *value)
+int git_config_rename(const char *var, const char *value)
 {
if (!value)
return DIFF_DETECT_RENAME;
diff --git a/diff.h b/diff.h
index d29560f822..806faee2b3 100644
--- a/diff.h
+++ b/diff.h
@@ -324,6 +324,8 @@ extern int git_diff_ui_config(const char *var, const char 
*value, void *cb);
 extern void diff_setup(struct diff_options *);
 extern int diff_opt_parse(struct diff_options *, const char **, int, const 
char *);
 extern void diff_setup_done(struct diff_options *);
+extern int git_config_rename(const char *var, const char *value);
+
 
 #define DIFF_DETECT_RENAME 1
 #define DIFF_DETECT_COPY   2
diff --git a/merge-recursive.c b/merge-recursive.c
index 0c0d48624d..2637d34d87 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -555,13 +555,13 @@ static struct string_list *get_renames(struct 
merge_options *o,
struct diff_options opts;
 
renames = xcalloc(1, sizeof(struct string_list));
-   if (!o->detect_rename)
+   if (!merge_detect_rename(o))
return renames;
 
diff_setup();
opts.flags.recursive = 1;
opts.flags.rename_empty = 0;
-   opts.detect_rename = DIFF_DETECT_RENAME;
+   opts.detect_rename = merge_detect_rename(o);
opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
o->diff_rename_limit >= 0 ? o->diff_rename_limit :
1000;
@@ -2232,9 +2232,18 @@ int merge_recursive_generic(struct merge_options *o,
 
 static void merge_recursive_config(struct merge_options *o)
 {
+   char *value = NULL;
git_config_get_int("merge.verbosity", >verbosity);
git_config_get_int("diff.renamelimit", >diff_rename_limit);
git_config_get_int("merge.renamelimit", >merge_rename_limit);
+   if (!git_config_get_string("diff.renames", )) {
+   o->diff_detect_rename = git_config_rename("diff.renames", 
value);
+   free(value);
+   }
+   if (!git_config_get_string("merge.renames", )) {
+   o->merge_detect_rename = git_config_rename("merge.renames", 
value);
+   free(value);
+   }
git_config(git_xmerge_config, NULL);
 }
 
@@ -2244,10 +2253,11 @@ void init_merge_options(struct merge_options *o)
memset(o, 0, sizeof(struct merge_options));
o->verbosity = 2;
  

[PATCH v3 1/3] merge: update documentation for {merge,diff}.renameLimit

2018-04-26 Thread Ben Peart
Update the documentation to better indicate that the renameLimit setting is
ignored if rename detection is turned off via command line options or config
settings.

Signed-off-by: Ben Peart 
---
 Documentation/diff-config.txt  | 3 ++-
 Documentation/merge-config.txt | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index 5ca942ab5e..77caa66c2f 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -112,7 +112,8 @@ diff.orderFile::
 
 diff.renameLimit::
The number of files to consider when performing the copy/rename
-   detection; equivalent to the 'git diff' option `-l`.
+   detection; equivalent to the 'git diff' option `-l`. This setting
+   has no effect if rename detection is turned off.
 
 diff.renames::
Whether and how Git detects renames.  If set to "false",
diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 12b6bbf591..48ee3bce77 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -35,7 +35,8 @@ include::fmt-merge-msg-config.txt[]
 merge.renameLimit::
The number of files to consider when performing rename detection
during a merge; if not specified, defaults to the value of
-   diff.renameLimit.
+   diff.renameLimit. This setting has no effect if rename detection
+   is turned off.
 
 merge.renormalize::
Tell Git that canonical representation of files in the
-- 
2.17.0.windows.1



[PATCH v3 0/3] add merge.renames config setting

2018-04-26 Thread Ben Peart
This is a complete rewrite based on the feedback from earlier patches.

Update the documentation to better indicate command line options that override
various config settings related to merge.

Add a new config merge.renames setting to to control the rename detection
behavior of merge.  This setting will default to the value of diff.renames.

Also adds logic so that when rename detection is turned off, the aggressive
flag is passed to read_tree() so that it can auto resolve more cases that would
have been handled by rename detection.

For the repro that I have been using this drops the merge time from ~1 hour to
~5 minutes and the unmerged entries goes down from ~40,000 to 1.

Helped-by: Kevin Willford 
Reviewed-by: Johannes Schindelin 
Signed-off-by: Ben Peart 

Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/6a8372d517
Checkout: git fetch https://github.com/benpeart/git merge-options-v3 && git 
checkout 6a8372d517

### Patches

Ben Peart (3):
  merge: update documentation for {merge,diff}.renameLimit
  merge: Add merge.renames config setting
  merge: pass aggressive when rename detection is turned off

 Documentation/diff-config.txt |  3 ++-
 Documentation/merge-config.txt|  9 +++-
 Documentation/merge-strategies.txt|  6 +++--
 diff.c|  2 +-
 diff.h|  2 ++
 merge-recursive.c | 27 +--
 merge-recursive.h |  8 ++-
 t/t3034-merge-recursive-rename-options.sh | 18 +++
 8 files changed, 62 insertions(+), 13 deletions(-)


base-commit: 1f1cddd558b54bb0ce19c8ace353fd07b758510d
-- 
2.17.0.windows.1




Proposal

2018-04-26 Thread MS Zeliha Omer Faruk



Hello

   Greetings to you today i asked before but i did't get a response please
i know this might come to you as a surprise because you do not know me
personally i have a business proposal for you please reply for more
info.



Best Regards,

Esentepe Mahallesi Büyükdere
Caddesi Kristal Kule Binasi
No:215
 Sisli - Istanbul, Turkey



Re: Fetching tags overwrites existing tags

2018-04-26 Thread Wink Saville
I've tried to teach 'git remote add' the --prefix-tags option using the
technique Junio provided. At moment it is PR #486 on github [1]
and I'd love some comments on whether or not this the right direction
for fetching tags and putting them in the branches namespace.

-- Wink

[1] https://github.com/git/git/pull/486


Re: [RFC PATCH] checkout: Force matching mtime between files

2018-04-26 Thread Duy Nguyen
On Thu, Apr 26, 2018 at 07:53:58PM +0200, SZEDER Gábor wrote:
> BTW, wouldn't running
> 
>   git clone --template=/path/to/template/dir/with/hooks/
> 
> invoke the post-checkout hook in there?
> 

Yes but it's cumbersome, preparing a template just for one extra
hook. I never like this template thing to be honest.
--
Duy


Re: [RFC PATCH] checkout: Force matching mtime between files

2018-04-26 Thread Duy Nguyen
On Thu, Apr 26, 2018 at 05:48:35PM +, Robin H. Johnson wrote:
> On Thu, Apr 26, 2018 at 06:43:56PM +0200, Duy Nguyen wrote:
> > On Wed, Apr 25, 2018 at 5:18 PM, Marc Branchaud  
> > wrote:
> > > Are we all that sure that the performance hit is that drastic?  After all,
> > > we've just done write_entry().  Calling utime() at that point should just
> > > hit the filesystem cache.
> > I have a feeling this has "this is linux" assumption. Anybody knows
> > how freebsd, mac os x and windows behave?
> I don't know sorry. futimens might be better here if it can be used
> before the fd is closed.
> 
> > > * In a "file checkout" ("git checkout -- path/to/file"), $1 and $2 are
> > > identical so the above loop does nothing.  Offhand I'm not even sure how a
> > > hook might get the right files in this case.
> > Would a hook that gives you the list of updated files (in the exact
> > same order that git updates) help?
> Yes, that, along with the target revision I think would allow most or
> all of the desired behaviors mentioned in this thread *.

Target revision should be available in the index. But this gives me an
idea to another thing that bugs me: sending the list to the hook means
I have to deal with separator (\n or NUL?) or escaping. This mentions
of index makes me take a different direction. I could produce a small
index that contains just what is modified, then you can retrieve
whatever info you want with `git ls-files` or even `git show` after
pointing $GIT_INDEX_FILE to it.

So it's basically what the following (hacky) patch does. It adds
support for a new hook named post-checkout-modified. This hook will
prepares $GIT_DIR/index.modified which contains just the files
git-checkout has touched and deletes it after the hook finishes.

My test hook is pretty simple just to dump out what in there

#!/bin/sh
GIT_INDEX_FILE=`git rev-parse --git-path index.modified` git ls-files 
--stage

and it seems to work.

Of course, this does not give you the checkout order. But checkout
order has always been sorted order by path if I remember correctly and
it's unlikely to change (and I don't think you really need that exact
order anyway)

-- 8< --
diff --git a/builtin/checkout.c b/builtin/checkout.c
index b49b582071..92b30cd05f 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -52,6 +52,8 @@ struct checkout_opts {
const char *prefix;
struct pathspec pathspec;
struct tree *source_tree;
+
+   struct index_state istate_modified;
 };
 
 static int post_checkout_hook(struct commit *old_commit, struct commit 
*new_commit,
@@ -470,7 +472,7 @@ static void setup_branch_path(struct branch_info *branch)
branch->path = strbuf_detach(, NULL);
 }
 
-static int merge_working_tree(const struct checkout_opts *opts,
+static int merge_working_tree(struct checkout_opts *opts,
  struct branch_info *old_branch_info,
  struct branch_info *new_branch_info,
  int *writeout_error)
@@ -595,6 +597,27 @@ static int merge_working_tree(const struct checkout_opts 
*opts,
if (!cache_tree_fully_valid(active_cache_tree))
cache_tree_update(_index, WRITE_TREE_SILENT | 
WRITE_TREE_REPAIR);
 
+   if (find_hook("post-checkout-modified")) {
+   int i;
+
+   for (i = 0; i < the_index.cache_nr; i++) {
+   struct cache_entry *ce = the_index.cache[i];
+   struct cache_entry *new_ce;
+
+   /*
+* Hack: this is an abuse of this flag, hidden
+* dependency with write_locked_index()
+*/
+   if (!(ce->ce_flags & CE_UPDATE_IN_BASE))
+   continue;
+
+   new_ce = xcalloc(1, cache_entry_size(ce_namelen(ce)));
+   memcpy(new_ce, ce, cache_entry_size(ce_namelen(ce)));
+   add_index_entry(>istate_modified, new_ce,
+   ADD_CACHE_JUST_APPEND);
+   }
+   }
+
if (write_locked_index(_index, _file, COMMIT_LOCK))
die(_("unable to write new index file"));
 
@@ -811,7 +834,7 @@ static void orphaned_commit_warning(struct commit 
*old_commit, struct commit *ne
clear_commit_marks_all(ALL_REV_FLAGS);
 }
 
-static int switch_branches(const struct checkout_opts *opts,
+static int switch_branches(struct checkout_opts *opts,
   struct branch_info *new_branch_info)
 {
int ret = 0;
@@ -848,6 +871,16 @@ static int switch_branches(const struct checkout_opts 
*opts,
 
update_refs_for_switch(opts, _branch_info, new_branch_info);
 
+   if (find_hook("post-checkout-modified")) {
+   struct lock_file lock_file = LOCK_INIT;
+
+   hold_lock_file_for_update(_file, 
git_path("index.modified"),
+   

Re: git merge banch w/ different submodule revision

2018-04-26 Thread Stefan Beller
On Thu, Apr 26, 2018 at 3:49 AM, Middelschulte, Leif
 wrote:
> Hi,
>
> we're using git-flow as a basic development workflow. However, doing so 
> revealed unexpected merge-behavior by git.
>
> Assume the following setup:
>
> - Repository `S` is sourced by repository `p` as submodule `s`
> - Repository `p` has two branches: `feature_x` and `develop`
> - The revisions sourced via the submodule have a linear history
>
>
> * 1c1d38f (feature_x) update submodule revision to b17e9d9
> | * 3290e69 (HEAD -> develop) update submodule revision to 0598394
> |/
> * cd5e1a5 initial submodule revision
>
>
> Problem case: Merge either branch into the other
>
> Expected behavior: Merge conflict.
>
> Actual behavior: Auto merge without conflicts.
>
> Note 1: A merge conflict does occur, if the sourced revisions do *not* have a 
> linear history
>
> Did I get something wrong about how git resolves merges?

We often treating a submodule as a file from the superproject, but not always.
And in case of a merge, git seems to be a bit smarter than treating it
as a textfile
with two different lines.

See https://github.com/git/git/commit/68d03e4a6e448aa557f52adef92595ac4d6cd4bd
(68d03e4a6e (Implement automatic fast-forward merge for submodules, 2010-07-07)
to explain the situation you encounter. (specifically merge_submodule
at the end of the diff)

> Shouldn't git be like: "hey, you're trying to merge two different contents 
> for the same line" (the submodule's revision)

As we have a history in the submodule we can do more than that and
resolve the conflict.

For two lines, you usually need manual intervention (which line to
pick, or craft a complete
new line out of parts of each line?), whereas for submodule commits
you can reason
about their dependencies due to their history and not just look at the
textual conflict.

Stefan


Re: [RFC PATCH] checkout: Force matching mtime between files

2018-04-26 Thread SZEDER Gábor
> On Wed, Apr 25, 2018 at 10:41:18AM +0200, �var Arnfj�r� Bjarmason wrote:
> >  2. Add some config so we can have hook search paths, and ship with a
> > default search path for hooks shipped with git.
> 
> I would go for something like this instead of search paths. This
> allows you to specify a path to any specific hook in hooks.* config
> group. This is basically "$GIT_DIR/hooks directory in config file" but
> with lower priority than those in $GIT_DIR/hooks.
> 
> Now we can do something like
> 
> 
> git -c hooks.post-checkout=/path/to/some/script clone ...
> 
> but of course I would need to fix the FIXME or this hook config is
> only effective just this one time. (And of course you could put it in
> ~/.gitconfig)
> 
> -- 8< --
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 7df5932b85..143413ed2d 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1063,6 +1063,11 @@ int cmd_clone(int argc, const char **argv, const char 
> *prefix)
>   strbuf_addf(_top, "refs/remotes/%s/", option_origin);
>   }
>  
> + /*
> +  * FIXME: we should keep all custom config settings via
> +  * "git  -c ..." in $GIT_DIR/config.
> +  */
> +

We definitely should not, see the difference between 'git -c ... clone
url' and 'git clone -c ... url'

BTW, wouldn't running

  git clone --template=/path/to/template/dir/with/hooks/

invoke the post-checkout hook in there?



Re: [RFC PATCH] checkout: Force matching mtime between files

2018-04-26 Thread Robin H. Johnson
On Thu, Apr 26, 2018 at 07:15:40PM +0200, Duy Nguyen wrote:
> On Wed, Apr 25, 2018 at 10:41:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
> >  2. Add some config so we can have hook search paths, and ship with a
> > default search path for hooks shipped with git.
> 
> I would go for something like this instead of search paths. This
> allows you to specify a path to any specific hook in hooks.* config
> group. This is basically "$GIT_DIR/hooks directory in config file" but
> with lower priority than those in $GIT_DIR/hooks.
> 
> Now we can do something like
> 
> 
> git -c hooks.post-checkout=/path/to/some/script clone ...
> 
> but of course I would need to fix the FIXME or this hook config is
> only effective just this one time. (And of course you could put it in
> ~/.gitconfig)
The FIXME leaves something ambiguous:
How do you differentiate between -c behavior that should be
one-time-only vs persisted by being written to $GIT_DIR/config during
$GIR_DIR init?

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


Re: [RFC PATCH] checkout: Force matching mtime between files

2018-04-26 Thread Robin H. Johnson
On Thu, Apr 26, 2018 at 06:43:56PM +0200, Duy Nguyen wrote:
> On Wed, Apr 25, 2018 at 5:18 PM, Marc Branchaud  wrote:
> > Are we all that sure that the performance hit is that drastic?  After all,
> > we've just done write_entry().  Calling utime() at that point should just
> > hit the filesystem cache.
> I have a feeling this has "this is linux" assumption. Anybody knows
> how freebsd, mac os x and windows behave?
I don't know sorry. futimens might be better here if it can be used
before the fd is closed.

> > * In a "file checkout" ("git checkout -- path/to/file"), $1 and $2 are
> > identical so the above loop does nothing.  Offhand I'm not even sure how a
> > hook might get the right files in this case.
> Would a hook that gives you the list of updated files (in the exact
> same order that git updates) help?
Yes, that, along with the target revision I think would allow most or
all of the desired behaviors mentioned in this thread *. It also needs to
fire in cases like 'git reset --hard $REV'.

* For this case, I just need the mtimes to be consistent within a single
  checkout, I don't need them to have specific values.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


Re: BUG report: unicode normalization on APFS (Mac OS High Sierra)

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 10:13 AM, Torsten Bögershausen  wrote:
> Hm,
> thanks for the report.
> I don't have a high sierra box, but I can probably get one.
> t0050 -should- pass automagically, so I feel that I can do something.
> Unless someone is faster of course.

Sweet, thanks for taking a look.

> Is it possible that  you run
> debug=t verbose=t ./t0050-filesystem.sh
> and send the output to me ?

Sure.  First, though, note that I can make it pass (or at least "not
ok...TODO known breakage") with the following patch (may be
whitespace-damaged by gmail):

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 483c8d6d7..770b91f8c 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1106,12 +1106,7 @@ test_lazy_prereq UTF8_NFD_TO_NFC '
auml=$(printf "\303\244")
aumlcdiar=$(printf "\141\314\210")
>"$auml" &&
-   case "$(echo *)" in
-   "$aumlcdiar")
-   true ;;
-   *)
-   false ;;
-   esac
+   stat "$aumlcdiar" >/dev/null 2>/dev/null
 '

 test_lazy_prereq AUTOIDENT '


I'm just worried there are bugs elsewhere in dealing with filesystems
like this that would need to be fixed and that this papers over them.

Anyway, the output you requested, at least for the last two failing tests, is:


expecting success:
git mv "$aumlcdiar" "$auml" &&
git commit -m rename

fatal: destination exists, source=ä, destination=ä
not ok 9 - rename (silent unicode normalization)

#
# git mv "$aumlcdiar" "$auml" &&
# git commit -m rename
#

expecting success:
git reset --hard initial &&
git merge topic

HEAD is now at 1b3caf6 initial
Updating 1b3caf6..2db1bf9
error: The following untracked working tree files would be overwritten by merge:
ä
Please move or remove them before you merge.
Aborting
not ok 10 - merge (silent unicode normalization)

#
# git reset --hard initial &&
# git merge topic
#

# still have 1 known breakage(s)
# failed 2 among remaining 9 test(s)


Re: [RFC PATCH] checkout: Force matching mtime between files

2018-04-26 Thread Duy Nguyen
On Wed, Apr 25, 2018 at 10:41:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
>  2. Add some config so we can have hook search paths, and ship with a
> default search path for hooks shipped with git.

I would go for something like this instead of search paths. This
allows you to specify a path to any specific hook in hooks.* config
group. This is basically "$GIT_DIR/hooks directory in config file" but
with lower priority than those in $GIT_DIR/hooks.

Now we can do something like


git -c hooks.post-checkout=/path/to/some/script clone ...

but of course I would need to fix the FIXME or this hook config is
only effective just this one time. (And of course you could put it in
~/.gitconfig)

-- 8< --
diff --git a/builtin/clone.c b/builtin/clone.c
index 7df5932b85..143413ed2d 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1063,6 +1063,11 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
strbuf_addf(_top, "refs/remotes/%s/", option_origin);
}
 
+   /*
+* FIXME: we should keep all custom config settings via
+* "git  -c ..." in $GIT_DIR/config.
+*/
+
strbuf_addf(, "+%s*:%s*", src_ref_prefix, branch_top.buf);
strbuf_addf(, "remote.%s.url", option_origin);
git_config_set(key.buf, repo);
diff --git a/run-command.c b/run-command.c
index 84899e423f..ee5b6ea329 100644
--- a/run-command.c
+++ b/run-command.c
@@ -7,6 +7,7 @@
 #include "strbuf.h"
 #include "string-list.h"
 #include "quote.h"
+#include "config.h"
 
 void child_process_init(struct child_process *child)
 {
@@ -1256,6 +1257,8 @@ const char *find_hook(const char *name)
strbuf_reset();
strbuf_git_path(, "hooks/%s", name);
if (access(path.buf, X_OK) < 0) {
+   const char *cfg_hook;
+   struct strbuf cfg_key = STRBUF_INIT;
int err = errno;
 
 #ifdef STRIP_EXTENSION
@@ -1278,9 +1281,14 @@ const char *find_hook(const char *name)
   path.buf);
}
}
-   return NULL;
+
+   strbuf_reset();
+   strbuf_addf(_key, "hooks.%s", name);
+   if (!git_config_get_pathname(cfg_key.buf, _hook))
+   strbuf_addstr(, cfg_hook);
+   strbuf_release(_key);
}
-   return path.buf;
+   return path.len ? path.buf : NULL;
 }
 
 int run_hook_ve(const char *const *env, const char *name, va_list args)
-- 8< --
--
Duy


Re: BUG report: unicode normalization on APFS (Mac OS High Sierra)

2018-04-26 Thread Torsten Bögershausen
On 26.04.18 18:48, Elijah Newren wrote:
> On HFS (which appears to be the default Mac filesystem prior to High
> Sierra), unicode names are "normalized" before recording.  Thus with a
> script like:
> 
> mkdir tmp
> cd tmp
> 
> auml=$(printf "\303\244")
> aumlcdiar=$(printf "\141\314\210")
> >"$auml"
> 
> echo "auml:  " $(echo -n "$auml" | xxd)
> echo "aumlcdiar: " $(echo -n "$aumlcdiar" | xxd)
> echo "Dir contents:  " $(echo -n * | xxd)
> 
> echo "Stat auml: " "$(stat -f "%i   %Sm   %Su %N" "$auml")"
> echo "Stat aumlcdiar:" "$(stat -f "%i   %Sm   %Su %N" "$aumlcdiar")"
> 
> We see output like:
> 
> auml:   : c3a4 ..
> aumlcdiar:  : 61cc 88 a..
> Dir contents:   : 61cc 88 a..
> Stat auml:  857473   Apr 26 09:40:40 2018   newren ä
> Stat aumlcdiar: 857473   Apr 26 09:40:40 2018   newren ä
> 
> On APFS, which appears to be the new default filesystem in Mac OS High
> Sierra, we instead see:
> 
> auml:   : c3a4 ..
> aumlcdiar:  : 61cc 88 a..
> Dir contents:   : c3a4 ..
> Stat auml:  8591766636   Apr 26 09:40:59 2018   newren ä
> Stat aumlcdiar: 8591766636   Apr 26 09:40:59 2018   newren ä
> 
> i.e. APFS appears to record the filename as specified by the user, but
> continues to allow the user to access it via any name that normalizes
> to the same thing.  This difference causes t0050-filesystem.sh to fail
> the final two tests.  I could change the "UTF8_NFD_TO_NFC" flag
> checking in test-lib.sh to instead test the exit code of stat to make
> it pass these two tests, but I have no idea if there are problems
> elsewhere that this would just be papering over.
> 
> I dislike Mac OS and avoid it, so I'd prefer to find someone else
> motivated to fix this.  If no one is, I may eventually try to fix this
> up...in a year or three from now.  But is someone else interested?
> Would this serve as a good microproject for our microprojects list (or
> are the internals hairy enough that this is too big of a project for
> that list)?
> 
> 
> Elijah
> 

Hm,
thanks for the report.
I don't have a high sierra box, but I can probably get one.
t0050 -should- pass automagically, so I feel that I can do something.
Unless someone is faster of course.

Is it possible that  you run
debug=t verbose=t ./t0050-filesystem.sh 
and send the output to me ?





BUG report: unicode normalization on APFS (Mac OS High Sierra)

2018-04-26 Thread Elijah Newren
On HFS (which appears to be the default Mac filesystem prior to High
Sierra), unicode names are "normalized" before recording.  Thus with a
script like:

mkdir tmp
cd tmp

auml=$(printf "\303\244")
aumlcdiar=$(printf "\141\314\210")
>"$auml"

echo "auml:  " $(echo -n "$auml" | xxd)
echo "aumlcdiar: " $(echo -n "$aumlcdiar" | xxd)
echo "Dir contents:  " $(echo -n * | xxd)

echo "Stat auml: " "$(stat -f "%i   %Sm   %Su %N" "$auml")"
echo "Stat aumlcdiar:" "$(stat -f "%i   %Sm   %Su %N" "$aumlcdiar")"

We see output like:

auml:   : c3a4 ..
aumlcdiar:  : 61cc 88 a..
Dir contents:   : 61cc 88 a..
Stat auml:  857473   Apr 26 09:40:40 2018   newren ä
Stat aumlcdiar: 857473   Apr 26 09:40:40 2018   newren ä

On APFS, which appears to be the new default filesystem in Mac OS High
Sierra, we instead see:

auml:   : c3a4 ..
aumlcdiar:  : 61cc 88 a..
Dir contents:   : c3a4 ..
Stat auml:  8591766636   Apr 26 09:40:59 2018   newren ä
Stat aumlcdiar: 8591766636   Apr 26 09:40:59 2018   newren ä

i.e. APFS appears to record the filename as specified by the user, but
continues to allow the user to access it via any name that normalizes
to the same thing.  This difference causes t0050-filesystem.sh to fail
the final two tests.  I could change the "UTF8_NFD_TO_NFC" flag
checking in test-lib.sh to instead test the exit code of stat to make
it pass these two tests, but I have no idea if there are problems
elsewhere that this would just be papering over.

I dislike Mac OS and avoid it, so I'd prefer to find someone else
motivated to fix this.  If no one is, I may eventually try to fix this
up...in a year or three from now.  But is someone else interested?
Would this serve as a good microproject for our microprojects list (or
are the internals hairy enough that this is too big of a project for
that list)?


Elijah


Re: [RFC PATCH] checkout: Force matching mtime between files

2018-04-26 Thread Duy Nguyen
On Wed, Apr 25, 2018 at 5:18 PM, Marc Branchaud  wrote:
> Are we all that sure that the performance hit is that drastic?  After all,
> we've just done write_entry().  Calling utime() at that point should just
> hit the filesystem cache.

I have a feeling this has "this is linux" assumption. Anybody knows
how freebsd, mac os x and windows behave?

> The post-checkout hook approach is not exactly straightforward.
>
> Naively, it's simply
>
> for F in `git diff --name-only $1 $2`; do touch "$F"; done
>
> But consider:
>
> * Symlinks can cause the wrong file to be touched.  (Granted, Michał's
> proposed patch also doesn't deal with symlinks.)  Let's assume that a hook
> can be crafted will all possible sophistication.  There are still some
> fundamental problems:
>
> * In a "file checkout" ("git checkout -- path/to/file"), $1 and $2 are
> identical so the above loop does nothing.  Offhand I'm not even sure how a
> hook might get the right files in this case.

Would a hook that gives you the list of updated files (in the exact
same order that git updates) help?

> * The hook has to be set up in every repo and submodule (at least until
> something like Ævar's experiments come to fruition).
>
> * A fresh clone can't run the hook.  This is especially important when
> dealing with submodules.  (In one case where we were bit by this, make
> though that half of a fresh submodule clone's files were stale, and decided
> to re-autoconf the entire thing.)

This to me sounds like something we should be able to improve in
general. The alternative is "git clone --no-checkout" then checkout
manually (which I see jenkins plugin does) is really not optimal even
if it works.
-- 
Duy


Re: [PATCHv3 0/9] object store: oid_object_info is the next contender

2018-04-26 Thread Brandon Williams
On 04/25, Stefan Beller wrote:
> v3:
> * fixed and extended the commit message of last commit
> * fixed the last patch, as Jonathan Tan suggested, see interdiff:
> 
> $ git diff remotes/origin/sb/oid-object-info (which is v2)
> diff --git c/sha1_file.c w/sha1_file.c
> index 94123e0299..dcd6b879ac 100644
> --- c/sha1_file.c
> +++ w/sha1_file.c
> @@ -1289,14 +1289,13 @@ int oid_object_info_extended(struct repository 
> *r, const struct object_id *oid,
>  
> /* Check if it is a missing object */
> if (fetch_if_missing && repository_format_partial_clone &&
> -   !already_retried) {
> +   !already_retried && r == the_repository) {
> /*
>  * TODO Investigate having fetch_object() return
>  * TODO error/success and stopping the music here.
> -* TODO Pass a repository struct through 
> fetch_object.
> +* TODO Pass a repository struct through 
> fetch_object,
> +* such that arbitrary repositories work.
>  */
> -   if (r != the_repository)
> -   die(_("partial clones only supported in 
> the_repository"));
> fetch_object(repository_format_partial_clone, 
> real->hash);
> already_retried = 1;
> continue;
> 
> Thanks,
> Stefan

v3 looks good, thanks for taking care of this.

> 
> v2:
> 
> * fixed the sha1/oid typo
> * removed spurious new line
> * Brandon and Jonthan discovered another dependency that I missed due
>   to cherrypicking that commit from a tree before partial clone was a thing.
>   We error out when attempting to use fetch_object for repos that are not
>   the_repository.
> 
> Thanks,
> Stefan
> 
> v1:
> This applies on top of origin/sb/object-store-replace and is available as
> https://github.com/stefanbeller/git/tree/oid_object_info
> 
> This continues the work of sb/packfiles-in-repository,
> extending the layer at which we have to pass in an explicit
> repository object to oid_object_info.
> 
> A test merge to next shows only a minor merge conflicit (adding
> different #include lines in one c file), so this might be a good next
> step for the object store series.
> 
> Notes on further object store series:
> I plan on converting the "parsed object store" next,
> which would be {alloc, object, tree, commit, tag}.c as that is a prerequisite
> for migrating shallow (which is intermingled with grafts) information to the
> object store.
> 
> There is currently work going on in allocation (mempool - Jameson Miller)
> and grafts (deprecate grafts - DScho), which is why I am sending this
> series first. I think it can go in parallel to the "parsed object store"
> that is coming next.
> 
> Thanks,
> Stefan
> 
> Jonathan Nieder (1):
>   packfile: add repository argument to packed_object_info
> 
> Stefan Beller (8):
>   cache.h: add repository argument to oid_object_info_extended
>   cache.h: add repository argument to oid_object_info
>   packfile: add repository argument to retry_bad_packed_offset
>   packfile: add repository argument to packed_to_object_type
>   packfile: add repository argument to read_object
>   packfile: add repository argument to unpack_entry
>   packfile: add repository argument to cache_or_unpack_entry
>   cache.h: allow oid_object_info to handle arbitrary repositories
> 
>  archive-tar.c|  2 +-
>  archive-zip.c|  3 ++-
>  blame.c  |  4 ++--
>  builtin/blame.c  |  2 +-
>  builtin/cat-file.c   | 12 ++--
>  builtin/describe.c   |  2 +-
>  builtin/fast-export.c|  2 +-
>  builtin/fetch.c  |  2 +-
>  builtin/fsck.c   |  3 ++-
>  builtin/index-pack.c |  4 ++--
>  builtin/ls-tree.c|  2 +-
>  builtin/mktree.c |  2 +-
>  builtin/pack-objects.c   | 11 +++
>  builtin/prune.c  |  3 ++-
>  builtin/replace.c| 11 ++-
>  builtin/tag.c|  4 ++--
>  builtin/unpack-objects.c |  2 +-
>  cache.h  |  7 +--
>  diff.c   |  3 ++-
>  fast-import.c| 16 ++--
>  list-objects-filter.c|  2 +-
>  object.c |  2 +-
>  pack-bitmap-write.c  |  3 ++-
>  pack-check.c |  3 ++-
>  packfile.c   | 40 +++-
>  packfile.h   |  6 --
>  reachable.c  |  2 +-
>  refs.c   |  2 +-
>  remote.c |  2 +-
>  sequencer.c  |  3 ++-
>  sha1_file.c  | 37 +
>  sha1_name.c  | 12 ++--
>  streaming.c  |  2 +-
>  submodule.c  |  2 +-
>  tag.c|  

Re: [PATCH 18/41] index-pack: abstract away hash function constant

2018-04-26 Thread Duy Nguyen
On Wed, Apr 25, 2018 at 8:49 PM, Martin Ågren  wrote:
>> I agree that pack v2 is not going to have anything but SHA-1.  However,
>> writing all the code such that it's algorithm agnostic means that we can
>> do testing of new algorithms by wholesale replacing the algorithm with a
>> new one, which simplifies things considerably.
>
> Ok. I do sort of wonder if a "successful" test run after globally
> substituting Hash-Foo for SHA-1 (regardless of whether the size changes
> or not) hints at a problem. That is, nowhere do we test that this code
> uses 20-byte SHA-1s, regardless of what other hash functions are
> available and configured. Of course, until soon, that did not really
> have to be tested since there was only one hash function available to
> choose from. As for identifying all the places that matter ... no idea.
>
> Of course I can see how this helps get things to a point where Git does
> not crash and burn because the hash has a different size, and where the
> test suite doesn't spew failures because the initial chaining value of
> "SHA-1" is changed.
>
> Once that is accomplished, I sort of suspect that this code will want to
> be updated to not always blindly use the_hash_algo, but to always work
> with SHA-1 sizes. Or rather, this would turn into more generic code to
> handle both "v2 with SHA-1" and "v3 with some hash function(s)". This
> commit might be a good first step in that direction.

I also have an uneasy feeling when things this close to on-disk file
format get hash-agnostic treatment. I think we would need to start
adding new file formats soon, from bottom up with simple things like
loose object files (cat-file and hash-object should be enough to test
blobs...), then moving up to pack files and more. This is when we can
really decide where to use the new hash and whether we should keep
some hashes as sha-1.

For trailing hashes for example, there's no need to move to a new hash
which only costs us more cycles. We just use it as a fancy checksum to
avoid bit flips. But then my assumption about cost may be completely
wrong without experimenting.

> Long rambling short, yeah, I see your point.

So yeah. It may be ok to move everything to "new hash" now. But we
need a closer look soon.
-- 
Duy


Re: What's cooking in git.git (Apr 2018, #03; Wed, 25)

2018-04-26 Thread Duy Nguyen
On Wed, Apr 25, 2018 at 10:37 AM, Junio C Hamano  wrote:
> * nd/pack-objects-pack-struct (2018-04-16) 15 commits
>  ...
>
>  "git pack-objects" needs to allocate tons of "struct object_entry"
>  while doing its work, and shrinking its size helps the performance
>  quite a bit.
>
>  What's the doneness of this thing?  The interdiff since previous
>  rounds looked reasonable, but I didn't see this round otherwise
>  scrutinized by reviewers.  The numbers given in the commit near the
>  tip do look impressive, though ;-)

I think it's ok to move it to next, though I'd prefer to move it to
master just right after a release so it gets tested for a whole
release cycle. This also gives Jeff a chance to check it after he's
back (if he wants to).

> * nd/repack-keep-pack (2018-04-16) 7 commits
>  ...
>
>  "git gc" in a large repository takes a lot of time as it considers
>  to repack all objects into one pack by default.  The command has
>  been taught to pretend as if the largest existing packfile is
>  marked with ".keep" so that it is left untouched while objects in
>  other packs and loose ones are repacked.
>
>  What's the doneness of this thing?  The interdiff since the earlier
>  one looked reasonable, but I didn't see this round otherwise
>  scrutinized by reviewers.

This one should be safer than the previous one. I think it's ok to move to next.

Anyway I'll re-read these two series this weekend to see if I could
spot anything new.
-- 
Duy


Re: [RFC PATCH] checkout: Force matching mtime between files

2018-04-26 Thread Michał Górny
W dniu czw, 26.04.2018 o godzinie 10∶25 +0900, użytkownik Junio C Hamano
napisał:
> Marc Branchaud  writes:
> 
> > > But Git is not an archiver (tar), but is a source code control
> > > system, so I do not think we should spend any extra cycles to
> > > "improve" its behaviour wrt the relative ordering, at least for the
> > > default case.  Only those who rely on having build artifact *and*
> > > source should pay the runtime (and preferrably also the
> > > maintainance) cost.
> > 
> > Anyone who uses "make" or some other mtime-based tool is affected by
> > this.  I agree that it's not "Everyone" but it sure is a lot of
> > people.
> 
> That's an exaggerated misrepresentation.  Only those who put build
> artifacts as well as source to SCM *AND* depend on mtime are
> affected.
> 
> A shipped tarball often contain configure.in as well as generated
> configure, so that consumers can just say ./configure without having
> the whole autoconf toolchain to regenerate it (I also heard horror
> stories that this is done to control the exact version of autoconf
> to avoid compatibility issues), but do people arrange configure to
> be regenerated from configure.in in their Makefile of such a project
> automatically when building the default target?  In any case, that is
> a tarball usecase, not a SCM one.
> 
> > Are we all that sure that the performance hit is that drastic?  After
> > all, we've just done write_entry().  Calling utime() at that point
> > should just hit the filesystem cache.
> 
> I do not know about others, but I personally am more disburbed by
> the conceptual ugliness that comes from having to have such a piece
> of code in the codebase.

For the record, we're using this with ebuilds and respective cache files
(which are expensive to generate).  We are using separate repository
which combines sources and cache files to keep the development
repository clean.  I have researched different solutions for this but
git turned out the best option for incremental updates for us.

Tarballs are out of question, unless you expect users to fetch >100 MiB
every time, and they are also expensive to update.  Deltas of tarballs
are just slow and require storing a lot of extra data.  Rsync is not
very efficient at frequent updates, and has significant overhead
on every run.  With all its disadvantages, git is still something that
lets our users fetch updates frequently with minimal network overhead.

So what did I do to deserve being called insane here?  Is it because I
wanted to use the tools that work for us?  Because I figured out that I
can improve our use case without really harming anyone in the process?

-- 
Best regards,
Michał Górny



Re: [RFC PATCH] checkout: Force matching mtime between files

2018-04-26 Thread Marc Branchaud

On 2018-04-25 09:25 PM, Junio C Hamano wrote:

Marc Branchaud  writes:


But Git is not an archiver (tar), but is a source code control
system, so I do not think we should spend any extra cycles to
"improve" its behaviour wrt the relative ordering, at least for the
default case.  Only those who rely on having build artifact *and*
source should pay the runtime (and preferrably also the
maintainance) cost.


Anyone who uses "make" or some other mtime-based tool is affected by
this.  I agree that it's not "Everyone" but it sure is a lot of
people.


That's an exaggerated misrepresentation.  Only those who put build
artifacts as well as source to SCM *AND* depend on mtime are
affected.

A shipped tarball often contain configure.in as well as generated
configure, so that consumers can just say ./configure without having
the whole autoconf toolchain to regenerate it (I also heard horror
stories that this is done to control the exact version of autoconf
to avoid compatibility issues), but do people arrange configure to
be regenerated from configure.in in their Makefile of such a project
automatically when building the default target?


Yes.  I've seen many automake-generated Makefiles with "recheck" 
machinery that'll conveniently rerun autoconf if "necessary".


(And those horror stories are true, BTW.)


In any case, that is
a tarball usecase, not a SCM one.


No, it's an SCM case when you need to have the project's code as part of 
your own.  I can't make everyone do things the Right Way, so I'm stuck 
using projects that are not 100% pure-source, because they "helpfully" 
release their code after the autoconf step for some reason.



Are we all that sure that the performance hit is that drastic?  After
all, we've just done write_entry().  Calling utime() at that point
should just hit the filesystem cache.


I do not know about others, but I personally am more disburbed by
the conceptual ugliness that comes from having to have such a piece
of code in the codebase.


The ugliness arises from the problem being solved.  It's not git's fault 
that the world is so stupid.


That git might be willing to suffer a bit of self-mutilation for the 
benefit of its users should be seen as a point of pride.


M.



Re: [PATCH v4 03/10] commit-graph: compute generation numbers

2018-04-26 Thread Derrick Stolee



On 4/26/2018 8:58 AM, Derrick Stolee wrote:

n 4/25/2018 10:35 PM, Junio C Hamano wrote:

Derrick Stolee  writes:

@@ -439,6 +439,9 @@ static void write_graph_chunk_data(struct 
hashfile *f, int hash_len,

  else
  packedDate[0] = 0;
  +    if ((*list)->generation != GENERATION_NUMBER_INFINITY)
+    packedDate[0] |= htonl((*list)->generation << 2);
+
  packedDate[1] = htonl((*list)->date);
  hashwrite(f, packedDate, 8);

The ones that have infinity are written as zero here.  The code that
reads the generation field off of a file in fill_commit_graph_info()
and fill_commit_in_graph() both leave such a record in file as-is,
so the reader of what we write out will think it is _ZERO, not _INF.

Not that it matters, as it seems that most of the code being added
by this series treat _ZERO and _INF more or less interchangeably.
But it does raise another question, i.e. do we need both _ZERO and
_INF, or is it sufficient to have just a single _UNKNOWN?


This code is confusing. The 'if' condition is useless, since at this 
point every commit should be finite (since we computed generation 
numbers for everyone). We should just write the value always.


For the sake of discussion, the value _INFINITY means not in the graph 
and _ZERO means in the graph without a computed generation number. 
It's a small distinction, but it gives a single boundary to use for 
reachability queries that test generation number.




@@ -571,6 +574,46 @@ static void close_reachable(struct 
packed_oid_list *oids)

  }
  }
  +static void compute_generation_numbers(struct commit** commits,
+   int nr_commits)
+{
+    int i;
+    struct commit_list *list = NULL;
+
+    for (i = 0; i < nr_commits; i++) {
+    if (commits[i]->generation != GENERATION_NUMBER_INFINITY &&
+    commits[i]->generation != GENERATION_NUMBER_ZERO)
+    continue;
+
+    commit_list_insert(commits[i], );
+    while (list) {
+    struct commit *current = list->item;
+    struct commit_list *parent;
+    int all_parents_computed = 1;
+    uint32_t max_generation = 0;
+
+    for (parent = current->parents; parent; parent = 
parent->next) {
+    if (parent->item->generation == 
GENERATION_NUMBER_INFINITY ||
+    parent->item->generation == 
GENERATION_NUMBER_ZERO) {

+    all_parents_computed = 0;
+    commit_list_insert(parent->item, );
+    break;
+    } else if (parent->item->generation > 
max_generation) {

+    max_generation = parent->item->generation;
+    }
+    }
+
+    if (all_parents_computed) {
+    current->generation = max_generation + 1;
+    pop_commit();
+    }

If we haven't computed all parents' generations yet,
current->generation is undefined (or at least "left as
initialized"), so it does not make much sense to attempt to clip it
at _MAX at this point.  At leat not yet.

IOW, shouldn't the following two lines be inside the "we now know
genno of all parents, so we can compute genno for commit" block
above?


You're right! Good catch. This code sets every merge commit to _MAX. 
It should be in the block above.





+    if (current->generation > GENERATION_NUMBER_MAX)
+    current->generation = GENERATION_NUMBER_MAX;
+    }
+    }


This bothered me: why didn't I catch a bug here? I rebased my "fsck" RFC 
onto this branch and it succeeded. Then, I realized that this does not 
actually write incorrect values, since we re-visit this commit again 
after we pop the stack down to this commit. However, there is time in 
the middle where we have set the generation (in memory) incorrectly and 
that could easily turn into a real bug by a later change.


I'll stick the _MAX check in the if above to prevent confusion.

Thanks,
-Stolee



Re: [PATCH v4 03/10] commit-graph: compute generation numbers

2018-04-26 Thread Derrick Stolee

n 4/25/2018 10:35 PM, Junio C Hamano wrote:

Derrick Stolee  writes:


@@ -439,6 +439,9 @@ static void write_graph_chunk_data(struct hashfile *f, int 
hash_len,
else
packedDate[0] = 0;
  
+		if ((*list)->generation != GENERATION_NUMBER_INFINITY)

+   packedDate[0] |= htonl((*list)->generation << 2);
+
packedDate[1] = htonl((*list)->date);
hashwrite(f, packedDate, 8);

The ones that have infinity are written as zero here.  The code that
reads the generation field off of a file in fill_commit_graph_info()
and fill_commit_in_graph() both leave such a record in file as-is,
so the reader of what we write out will think it is _ZERO, not _INF.

Not that it matters, as it seems that most of the code being added
by this series treat _ZERO and _INF more or less interchangeably.
But it does raise another question, i.e. do we need both _ZERO and
_INF, or is it sufficient to have just a single _UNKNOWN?


This code is confusing. The 'if' condition is useless, since at this 
point every commit should be finite (since we computed generation 
numbers for everyone). We should just write the value always.


For the sake of discussion, the value _INFINITY means not in the graph 
and _ZERO means in the graph without a computed generation number. It's 
a small distinction, but it gives a single boundary to use for 
reachability queries that test generation number.





@@ -571,6 +574,46 @@ static void close_reachable(struct packed_oid_list *oids)
}
  }
  
+static void compute_generation_numbers(struct commit** commits,

+  int nr_commits)
+{
+   int i;
+   struct commit_list *list = NULL;
+
+   for (i = 0; i < nr_commits; i++) {
+   if (commits[i]->generation != GENERATION_NUMBER_INFINITY &&
+   commits[i]->generation != GENERATION_NUMBER_ZERO)
+   continue;
+
+   commit_list_insert(commits[i], );
+   while (list) {
+   struct commit *current = list->item;
+   struct commit_list *parent;
+   int all_parents_computed = 1;
+   uint32_t max_generation = 0;
+
+   for (parent = current->parents; parent; parent = 
parent->next) {
+   if (parent->item->generation == 
GENERATION_NUMBER_INFINITY ||
+   parent->item->generation == 
GENERATION_NUMBER_ZERO) {
+   all_parents_computed = 0;
+   commit_list_insert(parent->item, );
+   break;
+   } else if (parent->item->generation > 
max_generation) {
+   max_generation = 
parent->item->generation;
+   }
+   }
+
+   if (all_parents_computed) {
+   current->generation = max_generation + 1;
+   pop_commit();
+   }

If we haven't computed all parents' generations yet,
current->generation is undefined (or at least "left as
initialized"), so it does not make much sense to attempt to clip it
at _MAX at this point.  At leat not yet.

IOW, shouldn't the following two lines be inside the "we now know
genno of all parents, so we can compute genno for commit" block
above?


You're right! Good catch. This code sets every merge commit to _MAX. It 
should be in the block above.





+   if (current->generation > GENERATION_NUMBER_MAX)
+   current->generation = GENERATION_NUMBER_MAX;
+   }
+   }
+}
+
  void write_commit_graph(const char *obj_dir,
const char **pack_indexes,
int nr_packs,
@@ -694,6 +737,8 @@ void write_commit_graph(const char *obj_dir,
if (commits.nr >= GRAPH_PARENT_MISSING)
die(_("too many commits to write graph"));
  
+	compute_generation_numbers(commits.list, commits.nr);

+
graph_name = get_commit_graph_filename(obj_dir);
fd = hold_lock_file_for_update(, graph_name, 0);


Re: What's cooking in git.git (Apr 2018, #03; Wed, 25)

2018-04-26 Thread Derrick Stolee

On 4/25/2018 1:43 PM, Brandon Williams wrote:

On 04/25, Ævar Arnfjörð Bjarmason wrote:

* bw/protocol-v2 (2018-03-15) 35 commits
   (merged to 'next' on 2018-04-11 at 23ee234a2c)
  + remote-curl: don't request v2 when pushing
  + remote-curl: implement stateless-connect command
  + http: eliminate "# service" line when using protocol v2
  + http: don't always add Git-Protocol header
  + http: allow providing extra headers for http requests
  + remote-curl: store the protocol version the server responded with
  + remote-curl: create copy of the service name
  + pkt-line: add packet_buf_write_len function
  + transport-helper: introduce stateless-connect
  + transport-helper: refactor process_connect_service
  + transport-helper: remove name parameter
  + connect: don't request v2 when pushing
  + connect: refactor git_connect to only get the protocol version once
  + fetch-pack: support shallow requests
  + fetch-pack: perform a fetch using v2
  + upload-pack: introduce fetch server command
  + push: pass ref prefixes when pushing
  + fetch: pass ref prefixes when fetching
  + ls-remote: pass ref prefixes when requesting a remote's refs
  + transport: convert transport_get_remote_refs to take a list of ref prefixes
  + transport: convert get_refs_list to take a list of ref prefixes
  + connect: request remote refs using v2
  + ls-refs: introduce ls-refs server command
  + serve: introduce git-serve
  + test-pkt-line: introduce a packet-line test helper
  + protocol: introduce enum protocol_version value protocol_v2
  + transport: store protocol version
  + connect: discover protocol version outside of get_remote_heads
  + connect: convert get_remote_heads to use struct packet_reader
  + transport: use get_refs_via_connect to get refs
  + upload-pack: factor out processing lines
  + upload-pack: convert to a builtin
  + pkt-line: add delim packet support
  + pkt-line: allow peeking a packet line without consuming it
  + pkt-line: introduce packet_read_with_status
  (this branch is used by bw/server-options.)

  The beginning of the next-gen transfer protocol.

  Will cook in 'next'.

With a month & 10 days of no updates & this looking stable it would be
great to have it in master sooner than later to build on top of it in
the 2.18 window.

I personally think that this series is ready to graduate to master but I
mentioned to Junio off-list that it might be a good idea to wait until
it has undergone a little more stress testing.  We've been in the
process of trying to get this rolled out to our internal server but due
to a few roadblocks and people being out of the office its taken a bit
longer than I would have liked to get it up and running.  I'm hoping in
another few days/a week I'll have some more data on live traffic.  At
that point I think I'll be more convinced that it will be safe to merge it.

I may be overly cautions but I'm hoping that we can get this right
without needing to do another protocol version bump for a very long
time.  Technically using v2 is hidden behind an "experimental" config
flag so we should still be able to tweak it after the fact if we
absolutely needed to, but I'd like to avoid that if necessary.


There's no testing better than production. I think if we have an 
opportunity to test with realistic traffic, we should take advantage.


But I also agree that this series looks stable.

I realize it's hard to communicate both "this series is ready to merge" 
and "I appreciate your caution."


Thanks,

-Stolee



git merge banch w/ different submodule revision

2018-04-26 Thread Middelschulte, Leif
Hi,

we're using git-flow as a basic development workflow. However, doing so 
revealed unexpected merge-behavior by git.

Assume the following setup:

- Repository `S` is sourced by repository `p` as submodule `s`
- Repository `p` has two branches: `feature_x` and `develop`
- The revisions sourced via the submodule have a linear history


* 1c1d38f (feature_x) update submodule revision to b17e9d9
| * 3290e69 (HEAD -> develop) update submodule revision to 0598394
|/  
* cd5e1a5 initial submodule revision


Problem case: Merge either branch into the other

Expected behavior: Merge conflict.

Actual behavior: Auto merge without conflicts.

Note 1: A merge conflict does occur, if the sourced revisions do *not* have a 
linear history

Did I get something wrong about how git resolves merges? Shouldn't git be like: 
"hey, you're trying to merge two different contents for the same line" (the 
submodule's revision)

Thanks in advance,

Leif

Re: [PATCH v3 0/4] rebase -i: avoid stale "# This is a combinationof" in commit messages

2018-04-26 Thread Phillip Wood
On 26/04/18 10:51, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Wed, 25 Apr 2018, Phillip Wood wrote:
> 
>> On 25/04/18 13:48, Johannes Schindelin wrote:
>>>
>>> On Mon, 23 Apr 2018, Phillip Wood wrote:
>>>
 On 23/04/18 19:11, Stefan Beller wrote:
>
> On Sat, Apr 21, 2018 at 12:34 AM, Johannes Schindelin
>  wrote:
>> Eric Sunshine pointed out that I had such a commit message in
>> https://public-inbox.org/git/CAPig+cRrS0_nYJJY=o6cbov630snqhpv5qgrqdd8mw-syzn...@mail.gmail.com/
>> and I went on a hunt to figure out how the heck this happened.
>>
>> Turns out that if there is a fixup/squash chain where the *last*
>> command fails with merge conflicts, and we either --skip ahead
>> or resolve the conflict to a clean tree and then --continue, our
>> code does not do a final cleanup.
>>
>> Contrary to my initial gut feeling, this bug was not introduced
>> by my rewrite in C of the core parts of rebase -i, but it looks
>> to me as if that bug was with us for a very long time (at least
>> the --skip part).
>>
>> The developer (read: user of rebase -i) in me says that we would
>> want to fast-track this, but the author of rebase -i in me says
>> that we should be cautious and cook this in `next` for a while.
>
> I looked through the patches again and think this series is good
> to go.

 I've just realized I commented on an outdated version as the new
 version was posted there rather than as a reply to v1. I've just
 looked through it and I'm not sure it addresses the unnecessary
 editing of the commit message of the previous commit if a single
 squash command is skipped as outlined in
 https://public-inbox.org/git/b6512eae-e214-9699-4d69-77117a0da...@talktalk.net/
>>>
>>> I have not forgotten about this! I simply did not find the time yet,
>>> is all...
>>
>> I wondered if that was the case but I wanted to check as I wasn't sure
>> if you'd seen the original message as it was on an obsolete version of
>> the series
>>
>>> The patch series still has not been merged to `next`, but I plan on
>>> working on your suggested changes as an add-on commit anyway. I am not
>>> quite sure yet how I want to handle the "avoid running commit for the
>>> first fixup/squash in the series" problem, but I think we will have to
>>> add *yet another* file that is written (in the "we already have
>>> comments in the commit message" conditional block in
>>> error_failed_squash())...
>>
>> I wonder if creating the file in update_squash_messages() rather than
>> error_failed_squash() would be a better approach as then it is easy to
>> only create rebase_path_amend_type() when there has already been a
>> squash or fixup.  The file is removed in the loop that picks commits in
>> pick_commits() so it would be cleaned up at the beginning of the next
>> pick if it's not needed.
> 
> That would be a good idea in general, but I think we have to take care of
> the following scenario:
> 
>   pick<- succeeds
>   squash  <- succeeds
>   fixup   <- fails, will be skipped
> 
> In this case, we do need to open the editor. But in this scenario, we do
> not:
> 
>   pick<- succeeds
>   fixup   <- succeeds
>   squash  <- fails, will be skipped
> 
> If we write the amend-type file in update_squash_messages(), we would
> write "squash" into it in both cases. My hope was to somehow avoid that.

Good point, I'd not thought of that

> I just realized that the current iteration does not fulfill that goal, as
> the message-fixup file would be long gone by the time
> error_failed_squash() was called in the latter example.
> 
> Also, I realized something else: my previous work-around for the
> GETTEXT_POISON case (where I fail gently when a commit message does not
> contain the "This is a combination of # commits" count in ASCII)
> would be much superior if it simply would not abuse the comment in the
> commit message, but had a robust, non-l18ned way to count the fixup/squash
> commits.
> 
> My current thinking is to reconcile both problems by shunning the
> amend-type and instead just record the sequence of fixup/squash commits
> that went into HEAD, in a new file, say, current-fixups.
> 
> To answer the question how many commit messages are combined, I then
> simply need to count the lines in that file.
> 
> To answer the question whether a skipped fixup/squash requires the editor
> to be launched, I can simply look whether there is a "squash" line
> (ignoring the last line).

That sounds like a good plan, keeping count of the fixup/squash without
having to parse the last message is a good idea.

> Oh, and I also forgot to test whether this is the "final fixup". If we are
> skipping a "fixup" in the middle of a chain, there is no need to clean the
> commit message to begin with.
> 
> This will take a while... ;-)

Yes, it sounds like quite a bit of work, but it will be a very

Re: [PATCH v2 2/5] builtin/config.c: support `--type=` as preferred alias for `--`

2018-04-26 Thread Junio C Hamano
Taylor Blau  writes:

> +static int option_parse_type(const struct option *opt, const char *arg,
> +  int unset)
> +{
> + /*
> +  * To support '--' style flags, begin with new_type equal to
> +  * opt->defval.
> +  */
> + int new_type = opt->defval;
> + int *to_type = opt->value;
> +
> + if (unset) {
> + *((int *) opt->value) = 0;

As you moved the definition of to_type higher than the previous
rounds, you can already use it here to avoid cryptic casting, i.e.

*to_type = 0;

no?

> + return 0;
> + }


Re: [PATCH v3 0/4] rebase -i: avoid stale "# This is a combinationof" in commit messages

2018-04-26 Thread Johannes Schindelin
Hi Phillip,

On Wed, 25 Apr 2018, Phillip Wood wrote:

> On 25/04/18 13:48, Johannes Schindelin wrote:
> > 
> > On Mon, 23 Apr 2018, Phillip Wood wrote:
> > 
> > > On 23/04/18 19:11, Stefan Beller wrote:
> > > >
> > > > On Sat, Apr 21, 2018 at 12:34 AM, Johannes Schindelin
> > > >  wrote:
> > > > > Eric Sunshine pointed out that I had such a commit message in
> > > > > https://public-inbox.org/git/CAPig+cRrS0_nYJJY=o6cbov630snqhpv5qgrqdd8mw-syzn...@mail.gmail.com/
> > > > > and I went on a hunt to figure out how the heck this happened.
> > > > >
> > > > > Turns out that if there is a fixup/squash chain where the *last*
> > > > > command fails with merge conflicts, and we either --skip ahead
> > > > > or resolve the conflict to a clean tree and then --continue, our
> > > > > code does not do a final cleanup.
> > > > >
> > > > > Contrary to my initial gut feeling, this bug was not introduced
> > > > > by my rewrite in C of the core parts of rebase -i, but it looks
> > > > > to me as if that bug was with us for a very long time (at least
> > > > > the --skip part).
> > > > >
> > > > > The developer (read: user of rebase -i) in me says that we would
> > > > > want to fast-track this, but the author of rebase -i in me says
> > > > > that we should be cautious and cook this in `next` for a while.
> > > >
> > > > I looked through the patches again and think this series is good
> > > > to go.
> > >
> > > I've just realized I commented on an outdated version as the new
> > > version was posted there rather than as a reply to v1. I've just
> > > looked through it and I'm not sure it addresses the unnecessary
> > > editing of the commit message of the previous commit if a single
> > > squash command is skipped as outlined in
> > > https://public-inbox.org/git/b6512eae-e214-9699-4d69-77117a0da...@talktalk.net/
> > 
> > I have not forgotten about this! I simply did not find the time yet,
> > is all...
> 
> I wondered if that was the case but I wanted to check as I wasn't sure
> if you'd seen the original message as it was on an obsolete version of
> the series
> 
> > The patch series still has not been merged to `next`, but I plan on
> > working on your suggested changes as an add-on commit anyway. I am not
> > quite sure yet how I want to handle the "avoid running commit for the
> > first fixup/squash in the series" problem, but I think we will have to
> > add *yet another* file that is written (in the "we already have
> > comments in the commit message" conditional block in
> > error_failed_squash())...
> 
> I wonder if creating the file in update_squash_messages() rather than
> error_failed_squash() would be a better approach as then it is easy to
> only create rebase_path_amend_type() when there has already been a
> squash or fixup.  The file is removed in the loop that picks commits in
> pick_commits() so it would be cleaned up at the beginning of the next
> pick if it's not needed.

That would be a good idea in general, but I think we have to take care of
the following scenario:

pick<- succeeds
squash  <- succeeds
fixup   <- fails, will be skipped

In this case, we do need to open the editor. But in this scenario, we do
not:

pick<- succeeds
fixup   <- succeeds
squash  <- fails, will be skipped

If we write the amend-type file in update_squash_messages(), we would
write "squash" into it in both cases. My hope was to somehow avoid that.

I just realized that the current iteration does not fulfill that goal, as
the message-fixup file would be long gone by the time
error_failed_squash() was called in the latter example.

Also, I realized something else: my previous work-around for the
GETTEXT_POISON case (where I fail gently when a commit message does not
contain the "This is a combination of # commits" count in ASCII)
would be much superior if it simply would not abuse the comment in the
commit message, but had a robust, non-l18ned way to count the fixup/squash
commits.

My current thinking is to reconcile both problems by shunning the
amend-type and instead just record the sequence of fixup/squash commits
that went into HEAD, in a new file, say, current-fixups.

To answer the question how many commit messages are combined, I then
simply need to count the lines in that file.

To answer the question whether a skipped fixup/squash requires the editor
to be launched, I can simply look whether there is a "squash" line
(ignoring the last line).

Oh, and I also forgot to test whether this is the "final fixup". If we are
skipping a "fixup" in the middle of a chain, there is no need to clean the
commit message to begin with.

This will take a while... ;-)

Ciao,
Dscho


Re: [PATCH 0/3] enable userdiff for more things in git.git

2018-04-26 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> A noticed that git.git doesn't have userdiff enabled for perl files,
> and looking over some recent patches this gave better results, while
> I'm at it add one for Python too. I couldn't find anything in
> gitattributes(5) that was worth the bother of enabling this (e.g. we
> just have one Ruby file).
>
> Ævar Arnfjörð Bjarmason (3):
>   .gitattributes: add *.pl extension for Perl
>   .gitattributes: use the "perl" differ for Perl
>   .gitattributes: add a diff driver for Python

All looked sane, except one minor "Huh?" in the titles.

The last one in the above list makes it look as if you are adding
the func header pattern and/or textconv filter for Python source
code, but the patch actually just specifies that .py is to be
processed by the existing diff driver meant for Python, and at least
to me, the wording for the second one reflects that better.



Re: `iconv` should have the encoding `ISO646-SE2`

2018-04-26 Thread Johannes Schindelin
Hi Abinsium,

On Wed, 25 Apr 2018, Abinsium wrote:

> I installed from `Git-2.16.2-64-bit.exe` from git-scm.com. `iconv` is included
> in this package. I think `iconv` should have the encoding `ISO646-SE2`. Ubuntu
> 16.04 has this encoding. I use it to read old Swedish text files, which there
> are a lot of e.g.:
> `curl -s https://www.abc.se/programbanken/abc/abc80/asmkod/basicii.txt |
> dos2unix | iconv -f ISO646-SE2 -t UTF8 | less`
> `ISO646-SE2` is used by e.g. the retro-computers Luxor
> [ABC80](https://en.wikipedia.org/wiki/ABC_80) (1978) and
> [ABC800](https://en.wikipedia.org/wiki/ABC_800)-series (1981). At my
> university we only have Git Bash and not Ubuntu for WSL in Windows 10
> computers.
> 
> (Not clear where I should report this, but one should be able to report issues
> with the configuration of the other programs than `git` in the package
> somewhere. If there is a better place, please let me know.)
> Originally reported at https://github.com/git/git-scm.com/issues/1199

This has also been reported as
https://github.com/git-for-windows/git/issues/1655. I will comment there,
as this issue is really not relevant to Git.

Ciao,
Johannes


Re: [PATCH v5 09/11] technical/shallow: stop referring to grafts

2018-04-26 Thread Johannes Schindelin
Hi Kuba,

On Wed, 25 Apr 2018, Jakub Narębski wrote:

> On 25 April 2018 at 11:54, Johannes Schindelin
>  wrote:
> > diff --git a/Documentation/technical/shallow.txt 
> > b/Documentation/technical/shallow.txt
> > index 5183b154229..4ec721335d2 100644
> > --- a/Documentation/technical/shallow.txt
> > +++ b/Documentation/technical/shallow.txt
> > @@ -8,15 +8,10 @@ repo, and therefore grafts are introduced pretending that
> >  these commits have no parents.
> >  *
> >
> > -The basic idea is to write the SHA-1s of shallow commits into
> > -$GIT_DIR/shallow, and handle its contents like the contents
> > -of $GIT_DIR/info/grafts (with the difference that shallow
> > -cannot contain parent information).
> > -
> > -This information is stored in a new file instead of grafts, or
> > -even the config, since the user should not touch that file
> > -at all (even throughout development of the shallow clone, it
> > -was never manually edited!).
> > +$GIT_DIR/shallow lists commit object names and tells Git to
> > +pretend as if they are root commits (e.g. "git log" traversal
> > +stops after showing them; "git fsck" does not complain saying
> > +the commits listed on their "parent" lines do not exist).
> >
> >  Each line contains exactly one SHA-1. When read, a commit_graft
> >  will be constructed, which has nr_parent < 0 to make it easier
> 
> Is the removed information (repeated below) important or not?
> 
>   the user should not touch that file
>   at all (even throughout development of the shallow clone, it
>   was never manually edited!).

Back in the days, it might have been necessary to tell people not to
meddle with Git's internals. Nowadays I don't think that'd be necessary
anymore, hence I removed it.

Ciao,
Dscho


Re: [PATCH v4 04/10] commit: use generations in paint_down_to_common()

2018-04-26 Thread Jakub Narebski
Junio C Hamano  writes:
> Derrick Stolee  writes:
>
>> Define compare_commits_by_gen_then_commit_date(), which uses generation
>> numbers as a primary comparison and commit date to break ties (or as a
>> comparison when both commits do not have computed generation numbers).
>>
>> Since the commit-graph file is closed under reachability, we know that
>> all commits in the file have generation at most GENERATION_NUMBER_MAX
>> which is less than GENERATION_NUMBER_INFINITY.
>
> I suspect that my puzzlement may be coming from my not "getting"
> what you meant by "closed under reachability",

It means that if commit A is in the commit graph, then all of its
ancestors (all commits reachable from A) are also in the commit graph.

>but could you also
> explain how _INF and _ZERO interact with commits with normal
> generation numbers?  I've always assumed that genno will be used
> only when comparing two commits with valid genno and otherwise we'd
> fall back to the traditional date based one, but...
>
>> +int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, 
>> void *unused)
>> +{
>> +const struct commit *a = a_, *b = b_;
>> +
>> +/* newer commits first */
>> +if (a->generation < b->generation)
>> +return 1;
>> +else if (a->generation > b->generation)
>> +return -1;
>
> ... this does not check if a->generation is _ZERO or _INF.  
>
> Both being _MAX is OK (the control will fall through and use the
> dates below).  One being _MAX and the other being a normal value is
> also OK (the above comparisons will declare the commit with _MAX is
> farther than less-than-max one from a root).
>
> Or is the assumption that if one has _ZERO, that must have come from
> an ancient commit-graph file and none of the commits have anything
> but _ZERO?

There is stronger and weaker version of the negative-cut criteria based
on generation numbers.

The strong criteria:

  if A != B and gen(A) <= gen(B), then A cannot reach B

The weaker criteria:

  if gen(A) < gen(B), then A cannot reach B


Because commit-graph is closed under reachability, this means that

  if A is in commit graph, and B is outside of it, then A cannot reach B

If A is in commit graph, then either _MAX >= gen(A) >= 1,
or gen(A) == _ZERO.  Because _INFINITY > _MAX > _ZERO, then we have

  if _MAX >= gen(A) >= 1 || gen(A) == 0, and gen(B) == _INFINITY
  then A cannot reach B

which also fullfils the weaker criteria

  if gen(A) < gen(B), then A cannot reach B


If both A and B are outside commit-graph, i.e. gen(A) = gen(B) = _INFINITY,
or if both A and B have gen(A) = gen(B) = _MAX,
or if both A and B come from old commit graph with gen(A) = gen(B) =_ZERO,
then we cannot say anything about reachability... and weak criteria
also does not say anything about reachability.


Maybe the following ASCII table would make it clear.

 |  gen(B)
 | :::
gen(A)   | _INFINITY | _MAX | larger   | smaller  | _ZERO
-+---+--+--+--+
_INFINITY| = | >| >| >| >
_MAX | < Nn  | =| >| >| >
larger   | < Nn  | < Nn | = n  | >| >
smaller  | < Nn  | < Nn | < Nn | = n  | >
_ZERO| < Nn  | < Nn | < Nn | < Nn | =

Here "n" denotes stronger condition, and "N" denotes weaker condition.
We have _INFINITY > _MAX > larger > smaller > _ZERO.


NOTE however that it is a *tradeoff*.  Using weaker criteria, with
strict inequality, means that we don't need to handle _INFINITY, _MAX
and _ZERO corner-cases in a special way; but it also means that we would
walk slightly more commits than if we used stronger criteria, with less
or equals.

For Linux kernel public repository commit graph[1] we have maximum of 512
commits sharing the same level, 5.43 sharing the same commit on average,
and 50% of time only 2 commits sharing the same level (median, or 2nd
quartile, or 50% percentile).  This is roughly the amount of commits we
walk more with weaker cut-off condition.

[1]: with 750k commits, but which is not largest commit graph any more :-0

>> +/* use date as a heuristic when generations are equal */
>> +if (a->date < b->date)
>> +return 1;
>> +else if (a->date > b->date)
>> +return -1;
>> +return 0;
>> +}

HTH
-- 
Jakub Narębski


[PATCH 0/3] enable userdiff for more things in git.git

2018-04-26 Thread Ævar Arnfjörð Bjarmason
A noticed that git.git doesn't have userdiff enabled for perl files,
and looking over some recent patches this gave better results, while
I'm at it add one for Python too. I couldn't find anything in
gitattributes(5) that was worth the bother of enabling this (e.g. we
just have one Ruby file).

Ævar Arnfjörð Bjarmason (3):
  .gitattributes: add *.pl extension for Perl
  .gitattributes: use the "perl" differ for Perl
  .gitattributes: add a diff driver for Python

 .gitattributes | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

-- 
2.17.0.290.gded63e768a



[PATCH 3/3] .gitattributes: add a diff driver for Python

2018-04-26 Thread Ævar Arnfjörð Bjarmason
Declare that the *.py files in our tree are Python for the purposes of
diffing, and as in 00ddc9d13c ("Fix build with core.autocrlf=true",
2017-05-09) set eol=lf on them, which makes sense like with the *.perl
files.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 .gitattributes | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitattributes b/.gitattributes
index aa08dd219d..1bdc91e282 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -4,6 +4,7 @@
 *.perl eol=lf diff=perl
 *.pl eof=lf diff=perl
 *.pm eol=lf diff=perl
+*.py eol=lf diff=python
 /Documentation/git-*.txt eol=lf
 /command-list.txt eol=lf
 /GIT-VERSION-GEN eol=lf
-- 
2.17.0.290.gded63e768a



[PATCH 2/3] .gitattributes: use the "perl" differ for Perl

2018-04-26 Thread Ævar Arnfjörð Bjarmason
As noted in gitattributes(5) this gives better patch context for these
types of files.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 .gitattributes | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/.gitattributes b/.gitattributes
index 482af05a6a..aa08dd219d 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -1,9 +1,9 @@
 * whitespace=!indent,trail,space
 *.[ch] whitespace=indent,trail,space diff=cpp
 *.sh whitespace=indent,trail,space eol=lf
-*.perl eol=lf
-*.pl eof=lf
-*.pm eol=lf
+*.perl eol=lf diff=perl
+*.pl eof=lf diff=perl
+*.pm eol=lf diff=perl
 /Documentation/git-*.txt eol=lf
 /command-list.txt eol=lf
 /GIT-VERSION-GEN eol=lf
-- 
2.17.0.290.gded63e768a



[PATCH 1/3] .gitattributes: add *.pl extension for Perl

2018-04-26 Thread Ævar Arnfjörð Bjarmason
Change the list of Perl extensions added in 00ddc9d13c ("Fix build
with core.autocrlf=true", 2017-05-09) to also include *.pl, we have
some of those in the tree, e.g. in t/.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 .gitattributes | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitattributes b/.gitattributes
index 8ce9c6b888..482af05a6a 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -2,6 +2,7 @@
 *.[ch] whitespace=indent,trail,space diff=cpp
 *.sh whitespace=indent,trail,space eol=lf
 *.perl eol=lf
+*.pl eof=lf
 *.pm eol=lf
 /Documentation/git-*.txt eol=lf
 /command-list.txt eol=lf
-- 
2.17.0.290.gded63e768a



[PATCH v4 2/2] log: prevent error if line range ends past end of file

2018-04-26 Thread istephens
From: Isabella Stephens 

If the -L option is used to specify a line range in git log, and the end
of the range is past the end of the file, git will fail with a fatal
error. This commit prevents such behaviour - instead we perform the log
for existing lines within the specified range.

This commit also fixes a corner case where -L ,-n:file would be treated
as a log over the whole file. Now we complain that this is an empty
range.

Signed-off-by: Isabella Stephens 
---
 line-log.c  | 10 +++---
 t/t4211-line-log.sh | 11 ---
 2 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/line-log.c b/line-log.c
index cdc2257db..ad3987062 100644
--- a/line-log.c
+++ b/line-log.c
@@ -599,11 +599,15 @@ parse_lines(struct commit *commit, const char *prefix, 
struct string_list *args)
lines, anchor, , ,
full_name))
die("malformed -L argument '%s'", range_part);
-   if (lines < end || ((lines || begin) && lines < begin))
-   die("file %s has only %lu lines", name_part, lines);
+   if (!begin && end < 0)
+   die("-L invalid empty range");
+   if ((!lines && (begin || end)) || lines < begin)
+   die(Q_("file %s has only %lu line",
+  "file %s has only %lu lines",
+  lines), name_part, lines);
if (begin < 1)
begin = 1;
-   if (end < 1)
+   if (end < 1 || lines < end)
end = lines;
begin--;
line_log_data_insert(, full_name, begin, end);
diff --git a/t/t4211-line-log.sh b/t/t4211-line-log.sh
index d0377fae5..0b96496e3 100755
--- a/t/t4211-line-log.sh
+++ b/t/t4211-line-log.sh
@@ -60,7 +60,6 @@ test_bad_opts "-L 1:nonexistent" "There is no path"
 test_bad_opts "-L 1:simple" "There is no path"
 test_bad_opts "-L '/foo:b.c'" "argument not .start,end:file"
 test_bad_opts "-L 1000:b.c" "has only.*lines"
-test_bad_opts "-L 1,1000:b.c" "has only.*lines"
 test_bad_opts "-L :b.c" "argument not .start,end:file"
 test_bad_opts "-L :foo:b.c" "no match"
 
@@ -84,16 +83,6 @@ test_expect_success '-L ,Y (Y == nlines)' '
git log -L ,$n:b.c
 '
 
-test_expect_success '-L ,Y (Y == nlines + 1)' '
-   n=$(expr $(wc -l 

[PATCH v4 1/2] blame: prevent error if range ends past end of file

2018-04-26 Thread istephens
From: Isabella Stephens 

If the -L option is used to specify a line range in git blame, and the
end of the range is past the end of the file, git will fail with a fatal
error. This commit prevents such behavior - instead we display the blame
for existing lines within the specified range. Tests and documentation
are ammended accordingly.

This commit also fixes two corner cases. Blaming -L n,-(n+1) now blames
the first n lines of a file rather than from n to the end of the file.
Blaming -L ,-n will complain of an empty range, rather than blaming the
whole file.

Signed-off-by: Isabella Stephens 
---
 Documentation/git-blame.txt   | 10 ++
 builtin/blame.c   |  6 --
 line-range.c  |  2 +-
 t/t8003-blame-corner-cases.sh | 17 +
 4 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-blame.txt b/Documentation/git-blame.txt
index 16323eb80..8cb81f57a 100644
--- a/Documentation/git-blame.txt
+++ b/Documentation/git-blame.txt
@@ -152,6 +152,16 @@ Also you can use a regular expression to specify the line 
range:
 
 which limits the annotation to the body of the `hello` subroutine.
 
+A range that begins or ends outside the bounds of the file will
+blame the relevant lines. For example:
+
+   git blame -L 10,-20 foo
+   git blame -L 10,+20 foo
+
+will respectively blame the first 10 and last 11 lines of a
+20 line file. However, blaming a line range that is entirely
+outside the bounds of the file will fail.
+
 When you are not interested in changes older than version
 v2.6.18, or changes older than 3 weeks, you can use revision
 range specifiers  similar to 'git rev-list':
diff --git a/builtin/blame.c b/builtin/blame.c
index 9dcb367b9..1204ab142 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -886,13 +886,15 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
nth_line_cb, , lno, anchor,
, , sb.path))
usage(blame_usage);
-   if (lno < top || ((lno || bottom) && lno < bottom))
+   if (!bottom && top < 0)
+   die("-L invalid empty range");
+   if ((!lno && (top || bottom)) || lno < bottom)
die(Q_("file %s has only %lu line",
   "file %s has only %lu lines",
   lno), path, lno);
if (bottom < 1)
bottom = 1;
-   if (top < 1)
+   if (top < 1 || lno < top)
top = lno;
bottom--;
range_set_append_unsafe(, bottom, top);
diff --git a/line-range.c b/line-range.c
index 323399d16..023aee1f5 100644
--- a/line-range.c
+++ b/line-range.c
@@ -47,7 +47,7 @@ static const char *parse_loc(const char *spec, nth_line_fn_t 
nth_line,
else if (!num)
*ret = begin;
else
-   *ret = begin + num;
+   *ret = begin + num ? begin + num : -1;
return term;
}
return spec;
diff --git a/t/t8003-blame-corner-cases.sh b/t/t8003-blame-corner-cases.sh
index 661f9d430..4a0c51658 100755
--- a/t/t8003-blame-corner-cases.sh
+++ b/t/t8003-blame-corner-cases.sh
@@ -216,14 +216,23 @@ test_expect_success 'blame -L with invalid start' '
 '
 
 test_expect_success 'blame -L with invalid end' '
-   test_must_fail git blame -L1,5 tres 2>errors &&
-   test_i18ngrep "has only 2 lines" errors
+   git blame -L1,5 tres >out &&
+   test_line_count = 2 out
 '
 
 test_expect_success 'blame parses  part of -L' '
git blame -L1,1 tres >out &&
-   cat out &&
-   test $(wc -l < out) -eq 1
+   test_line_count = 1 out
+'
+
+test_expect_success 'blame -Ln,-(n+1)' '
+   git blame -L3,-4 nine_lines >out &&
+   test_line_count = 3 out
+'
+
+test_expect_success 'blame -L,-n' '
+   test_must_fail git blame -L,-1 tres 2>errors &&
+   test_i18ngrep "-L invalid empty range"
 '
 
 test_expect_success 'indent of line numbers, nine lines' '
-- 
2.14.3 (Apple Git-98)



[PATCH v4 0/2] blame and log: prevent error if range ends past end of file

2018-04-26 Thread istephens
Picking this back up after a little while. This solution means we can still 
accept
-L, for an empty file but any other range will fail. I've made the change for 
both
blame and log (as two separate patches).

I've also changed behaviour in a couple of corner cases - before we couldn't
distinguish between -Ln,-(n+1) and -Ln, so -Ln,-(n+1) would blame from n to the 
end of
the file rather than the first n lines. Also, we now complain that -L,-n is an 
empty
range where previously this would blame the whole file.



Re: Antw: Re: java diffs show no method context

2018-04-26 Thread Ævar Arnfjörð Bjarmason

On Thu, Apr 26 2018, Ulrich Windl wrote:

> Thanks for that. It sounds plausible, but I wonder why it works automagically
> for C, but not for Java (Politcal reasons put aside): Using ".c" for C is 
> about
> as common as using ".java" for Java ;-)

It has a bit to do with it being in C, but not in the way you think. By
default Git doesn't enable the "cpp" driver either for *.c, but it just
so happens to do the right thing more of the time because the default
heuristic is basically to search for a nearby line that doesn't start
with whitespace for context.

This doesn't work for Java because your methods tend to be indented
since they're part of the class you're working on.


Antw: Re: java diffs show no method context

2018-04-26 Thread Ulrich Windl
Hi!

Thanks for that. It sounds plausible, but I wonder why it works automagically
for C, but not for Java (Politcal reasons put aside): Using ".c" for C is about
as common as using ".java" for Java ;-)

Regards,
Ulrich

>>> Alban Gruin  schrieb am 25.04.2018 um 17:05 in
Nachricht
:
> Le 25/04/2018 à 14:53, Ulrich Windl a écrit :
>> Hi!
>> 
>> This is for git 2.13.6, and it may be an FAQ or frequent feature request. 
> Anyway:
>> I'm new to Java, and writing my first project using Git, I found that "git

> diff" only reports the class in the diff context, but not the method (as
seen 
> for C, for example).
>> I'd wish to have the method where the diff is located.
> 
> Hi,
> 
> to achieve this behaviour, you have to create a file named
> ".gitattributes" at the root of your project, containing this line:
> 
> *.java diff=java
> 
> .gitattributes allows you to configure other things, as described in the
> documentation[1].
> 
> I hope it helps.
> 
> [1] https://www.git-scm.com/docs/gitattributes 
> 
> Cheers,
> Alban





Re: [PATCH v9 00/17] rebase -i: offer to recreate commit topology by rebasing merges

2018-04-26 Thread Junio C Hamano
Junio C Hamano  writes:

>> - Rebased the patch series on top of current `master`, i.e. both
>>   `pw/rebase-keep-empty-fixes` and `pw/rebase-signoff`, to resolve merge
>>   conflicts myself.
>
> Good to see the last item, as this gave me a chance to make sure
> that the conflict resolution I've been carrying matches how you
> would have resolved as the original author.  Applying these on the
> old base (with minor conflict resolution) to match the old iteration
> and merging the result to the new base1f1cddd5 ("The fourth batch
> for 2.18", 2018-04-25) resulted in the same tree as the tree that
> results from applying these on top of the new base.
>
> That was done only to validate the result of the past resolution
> (and also seeing the interdiff from the old iteration).  There is no
> reason to keep this series back-portable to older tip of 'master',
> so I'll queue the result of applying the patches to the new base.

By the way, the rebasing made the topic textually merge cleanly to
the tip of 'pu' which made it slightly more cumbersome to deal with
a semantic conflict the topic has with another topic that modifies
the function signature of get_main_ref_store().  This topic adds a
new callsite in sequencer.c to this function.

The old base that forced the integrator to resolve conflicts in
sequencer.c with some other topic, thanks to that exact textual
conflicts, gave rerere a chance to record the adjustment for this
semantic conflict.

Now because the series applied to new base does not have textual
conflicts in sequencer.c when merged to 'pu', the adjustment for the
semantic conflict needs to be carried by a different mechanism.

Side note.  Do not take the above as a complaint.  Dealing with
interactions among various topics in flight while keeping them
as straight and clean topic is what I do.  It is a normal part
of running an active project.

It may be an interesting exercise to attempt to rebase tonight's
'pu' onto something younger in 'pu', say 'pu~4', without changing
anything in "pu^2" (which is the tip of this topic) and see how well
the merge recreation feature of this topic handles the evil merge.

The gist of the evil merge looks like this:

diff --cc sequencer.c
index a428fc7db7,e2f8394284..729cf05768
--- a/sequencer.c
+++ b/sequencer.c
@@@ -2483,6 -2527,349 +2556,349 @@@ static int do_exec(const char *command_
 ...
+ 
+ static int do_label(const char *name, int len)
+ {
 -  struct ref_store *refs = get_main_ref_store();
++  struct ref_store *refs = get_main_ref_store(the_repository);
+   struct ref_transaction *transaction;
+   struct strbuf ref_name = STRBUF_INIT, err = STRBUF_INIT;
+   struct strbuf msg = STRBUF_INIT;
+...


Re: [PATCH 5/5] builtin/config: introduce `color` type specifier

2018-04-26 Thread Taylor Blau
On Thu, Apr 26, 2018 at 02:32:54PM +0900, Junio C Hamano wrote:
> Taylor Blau  writes:
>
> > diff --git a/builtin/config.c b/builtin/config.c
> > index d7acf912cd..ec5c11293b 100644
> > --- a/builtin/config.c
> > +++ b/builtin/config.c
> > @@ -61,6 +61,7 @@ static int show_origin;
> >  #define TYPE_BOOL_OR_INT   3
> >  #define TYPE_PATH  4
> >  #define TYPE_EXPIRY_DATE   5
> > +#define TYPE_COLOR 6
> >
> >  #define OPT_CALLBACK_VALUE(s, l, v, h, i) \
> > { OPTION_CALLBACK, (s), (l), (v), NULL, (h), PARSE_OPT_NOARG | \
> > @@ -231,6 +232,11 @@ static int format_config(struct strbuf *buf, const 
> > char *key_, const char *value
> > if (git_config_expiry_date(, key_, value_) < 0)
> > return -1;
> > strbuf_addf(buf, "%"PRItime, t);
> > +   } else if (type == TYPE_COLOR) {
> > +   char v[COLOR_MAXLEN];
> > +   if (git_config_color(v, key_, value_) < 0)
> > +   return -1;
> > +   strbuf_addstr(buf, v);
> > } else if (value_) {
> > strbuf_addstr(buf, value_);
> > } else {
> > @@ -376,6 +382,20 @@ static char *normalize_value(const char *key, const 
> > char *value)
> > else
> > return xstrdup(v ? "true" : "false");
> > }
> > +   if (type == TYPE_COLOR) {
> > +   char v[COLOR_MAXLEN];
> > +   if (git_config_color(v, key, value))
> > +   die("cannot parse color '%s'", value);
> > +
> > +   /*
> > +* The contents of `v` now contain an ANSI escape
> > +* sequence, not suitable for including within a
> > +* configuration file. Treat the above as a
> > +* "sanity-check", and return the given value, which we
> > +* know is representable as valid color code.
> > +*/
> > +   return xstrdup(value);
> > +   }
> >
> > die("BUG: cannot normalize type %d", type);
> >  }
>
> Hmmm, option_parse_type() introduced in [PATCH 2/5] used to learn
> to parse "color" in this step, but it no longer does.

Oof, again. I dropped this hunk on the floor when integrating. I put it
back in v2.


Thanks,
Taylor


Re: [PATCH 2/5] builtin/config.c: support `--type=` as preferred alias for `--type`

2018-04-26 Thread Taylor Blau
On Thu, Apr 26, 2018 at 02:25:44PM +0900, Junio C Hamano wrote:
> Taylor Blau  writes:
>
> > Subject: Re: [PATCH 2/5] builtin/config.c: support `--type=` as 
> > preferred alias for `--type`
>
> I'd retitle while queuing, as the last 'type' is a placeholder for
> concrete types like  above.

Good idea. I amended v2 in this fashion.

> > +...
> > +   new_type = opt->defval;
> > +   if (!new_type) {
> > +...
> > +   }
> > +
> > +   *to_type = opt->value;
>
> But this is wrong, no?  You meant opt->value points at an integer
> variable that receives the type we discover by parsing, i.e.
>
>   to_type = opt->value;

Oof. You're absolutely right. I fixed this and moved the assignment to
the declaration at the top of this function.


Thanks,
Taylor