Re: [PATCH v4 5/6] get_short_oid: sort ambiguous objects by type, then SHA-1

2018-05-10 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> diff --git a/sha1-name.c b/sha1-name.c
> index 9d7bbd3e96..46d8b1afa6 100644
> --- a/sha1-name.c
> +++ b/sha1-name.c
> @@ -409,6 +437,8 @@ static int get_short_oid(const char *name, int len, 
> struct object_id *oid,
>   status = finish_object_disambiguation(, oid);
>  
>   if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
> + struct oid_array collect = OID_ARRAY_INIT;
> +
>   error(_("short SHA1 %s is ambiguous"), ds.hex_pfx);
>  
>   /*
> @@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, 
> struct object_id *oid,
>   ds.fn = NULL;
>  
>   advise(_("The candidates are:"));
> - for_each_abbrev(ds.hex_pfx, show_ambiguous_object, );

So we used to let for_each_abbrev() to enumerate these object names
that share the prefix in the object name order and fed
show_ambiguous_object() to show them, which was the cause of the
output that is not grouped by type.  Now you collect them into
another oid_array and sort by type, relying on the fact to that the
for_each_abbrev() in the "collect" phase already does the de-duping.

Sounds good.

> + for_each_abbrev(ds.hex_pfx, collect_ambiguous, );
> + QSORT(collect.oid, collect.nr, sort_ambiguous);
> +
> + if (oid_array_for_each(, show_ambiguous_object, ))
> + BUG("show_ambiguous_object shouldn't return non-zero");
> + oid_array_clear();
>   }


> diff --git a/t/t1512-rev-parse-disambiguation.sh 
> b/t/t1512-rev-parse-disambiguation.sh
> index 711704ba5a..2701462041 100755
> --- a/t/t1512-rev-parse-disambiguation.sh
> +++ b/t/t1512-rev-parse-disambiguation.sh
> @@ -361,4 +361,25 @@ test_expect_success 'core.disambiguate does not override 
> context' '
>   git -c core.disambiguate=committish rev-parse $sha1^{tree}
>  '
>  
> +test_expect_success C_LOCALE_OUTPUT 'ambiguous commits are printed by type 
> first, then hash order' '
> + test_must_fail git rev-parse  2>stderr &&
> + grep ^hint: stderr >hints &&
> + grep  hints >objects &&
> + cat >expected <<-\EOF &&
> + tag
> + commit
> + tree
> + blob
> + EOF
> + awk "{print \$3}" objects.types &&
> + uniq objects.types.uniq &&

Eww, that is somewhat tricky (but correct) use of "uniq", which
POSIX not just mandates adjacent duplicates to be removed, but also
forbids from removing duplicates that are not adjacent from each
other.  So the objects in the "hints" file are not grouped by type,
we will fail to see these four lines.

> + test_cmp expected objects.types.uniq &&
> + for type in tag commit tree blob
> + do
> + grep $type objects >$type.objects &&
> + sort $type.objects >$type.objects.sorted &&
> + test_cmp $type.objects.sorted $type.objects

We not only want to see objects grouped by type (and types shown in
a desired order), but within the same type we want them ordered by
object name.

OK.

> + done
> +'
> +
>  test_done


Re: [PATCH] fast-export: avoid NULL pointer arithmetic

2018-05-10 Thread Junio C Hamano
Junio C Hamano  writes:

> René Scharfe  writes:
>
>>> But it somehow feels backwards in spirit to me, as the reason why we
>>> use "void *" there in the decoration field is because we expect that
>>> we'd have a pointer to some struture most of the time, and we have
>>> to occasionally store a small integer there.
>>
>> Yes, fast-export seems to be the only place that stores an integer as
>> a decoration.
>
> With the decoration subsystem that might be the case, but I think
> we have other codepaths where "void * .util" field in the structure
> is used to store (void *)1, expecting that a normal allocation will
> never yield a pointer that is indistinguishable from that value.

I was looking at a different topic and noticed that bisect.c uses
commit->util (which is of type "void *") to always store an int (it
never stores a pointer and there is no mixing).  This one is equally
unportable as fast-export after your fix.



Re: [PATCH v2] pack-format.txt: more details on pack file format

2018-05-10 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> +Both ofs-delta and ref-delta store the "delta" against another
> +object. The difference between them is, ref-delta directly encodes
> +20-byte base object name. If the base object is in the same pack,
> +ofs-delta encodes the offset of the base object in the pack instead.

Those of us who know how delta works would understand it, but "delta
against another object" followed by a mention of "base object" may
not necessarily click to readers that "another object" and "base
object" refer to the same concept.

... store the 'delta' to be applied to another object
(called 'base object') to reconstruct the object.

perhaps?

> ...
> +  
> +--+-+-+-+-+---+---+---+
> +  | 1xxx | offset1 | offset2 | offset3 | offset4 | size1 | size2 | size3 
> |
> +  
> +--+-+-+-+-+---+---+---+
> +
> +This is the instruction format to copy a byte range from the source
> +object. It encodes the offset to copy from any the number of bytes to
> +copy. Offset and size are in little-endian order.

"any the number"???  Ah, s/any/and/ that is.

Other than that, looks good to me.

Thanks.


Re: [PATCH] apply: clarify "-p" documentation

2018-05-10 Thread Junio C Hamano
Jeff King  writes:

> How about this?
>
> -- >8 --
> Subject: [PATCH] apply: clarify "-p" documentation
>
> We're not really removing slashes, but slash-separated path
> components. Let's make that more clear.
>
> Reported-by: kelly elton 
> Signed-off-by: Jeff King 
> ---
>  Documentation/git-apply.txt | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
> index 4ebc3d3271..c993fbf714 100644
> --- a/Documentation/git-apply.txt
> +++ b/Documentation/git-apply.txt
> @@ -113,8 +113,10 @@ explained for the configuration variable 
> `core.quotePath` (see
>  linkgit:git-config[1]).
>  
>  -p::
> - Remove  leading slashes from traditional diff paths. The
> - default is 1.
> + Remove  leading path components (separated by slashes) from
> + traditional diff paths. E.g., with `-p2`, a patch against
> + `a/dir/file` will be applied directly to `file`. The default is
> + 1.

Thanks for an obvious improvement.  Will queue.

>  
>  -C::
>   Ensure at least  lines of surrounding context match before


Re: [PATCH 1/2] t5516-fetch-push: fix 'push with dry-run' test

2018-05-10 Thread Junio C Hamano
SZEDER Gábor  writes:

> In a while-at-it cleanup replacing a 'cd dir && <...> && cd ..' with a
> subshell, commit 28391a80a9 (receive-pack: allow deletion of corrupt
> refs, 2007-11-29) also moved the assignment of the $old_commit
> variable to that subshell.  This variable, however, is used outside of
> that subshell as a parameter of check_push_result(), to check that a
> ref still points to the commit where it is supposed to.  With the
> variable remaining unset outside the subshell check_push_result()
> doesn't perform that check at all.

Sigh/Blush.  Thanks for finding an old screw-up.

>
> Use 'git -C  cmd...', so we don't need to change directory, and
> thus don't need the subshell either when setting $old_commit.
>
> Furthermore, change check_push_result() to require at least three
> parameters (the repo, the oid, and at least one ref), so it will catch
> similar issues earlier should they ever arise.
>
> Signed-off-by: SZEDER Gábor 
> ---
>  t/t5516-fetch-push.sh | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
> index 82239138d5..832b79ad40 100755
> --- a/t/t5516-fetch-push.sh
> +++ b/t/t5516-fetch-push.sh
> @@ -94,6 +94,9 @@ mk_child() {
>  }
>  
>  check_push_result () {
> + test $# -ge 3 ||
> + error "bug in the test script: check_push_result requires at least 3 
> parameters"
> +
>   repo_name="$1"
>   shift
>  
> @@ -553,10 +556,7 @@ test_expect_success 'branch.*.pushremote config order is 
> irrelevant' '
>  test_expect_success 'push with dry-run' '
>  
>   mk_test testrepo heads/master &&
> - (
> - cd testrepo &&
> - old_commit=$(git show-ref -s --verify refs/heads/master)
> - ) &&
> + old_commit=$(git -C testrepo show-ref -s --verify refs/heads/master) &&
>   git push --dry-run testrepo : &&
>   check_push_result testrepo $old_commit heads/master
>  '


Re: [PATCH v4 2/6] sha1-array.h: align function arguments

2018-05-10 Thread Junio C Hamano
Junio C Hamano  writes:

> Jeff King  writes:
>
>> On Thu, May 10, 2018 at 12:42:59PM +, Ævar Arnfjörð Bjarmason wrote:
>>
>>> The arguments weren't lined up with the opening parenthesis. Fixes up
>>> code added in aae0caf19e ("sha1-array.h: align function arguments",
>>> 2018-04-30).
> ...
> But then "fixes up code added in" is not quite right, either.  It is
> what the commit should have touched but didn't ;-)

FWIW, I ended up with this description.

The arguments weren't lined up with the opening parenthesis, after
910650d2 ("Rename sha1_array to oid_array", 2017-03-31) renamed the
function.



Re: [PATCH v4 2/6] sha1-array.h: align function arguments

2018-05-10 Thread Junio C Hamano
Jeff King  writes:

> On Thu, May 10, 2018 at 12:42:59PM +, Ævar Arnfjörð Bjarmason wrote:
>
>> The arguments weren't lined up with the opening parenthesis. Fixes up
>> code added in aae0caf19e ("sha1-array.h: align function arguments",
>> 2018-04-30).
>
> I think that's this patch. :)
>
> Presumably you meant 910650d2f8 (Rename sha1_array to oid_array,
> 2017-03-31)?

Sharp eyes.  I couldn't quite tell from a cursory read of the blame
output, until I realized that the original before that culprit were
aligned and the renaming was what made them out of alignment.

But then "fixes up code added in" is not quite right, either.  It is
what the commit should have touched but didn't ;-)

Thanks.


Re: Regression in patch add?

2018-05-10 Thread Junio C Hamano
Phillip Wood  writes:

> Yes, I think it probably makes sense to do that. Originally I didn't
> count empty lines as context lines in case the user accidentally added
> some empty lines at the end of the hunk but if 'git apply' does then I
> think 'git add -p' should as well

I am not sure if "adding to the tail" should be tolerated, but in
any case, newer GNU diff can show an empty unaffected line as an
empty line (unlike traditional unified context format in which such
a line is expressed as a line with a lone SP on it), which is
allowed as "implementation defined" by POSIX.1 [*1*]. Modern "git
apply" knows about this.

If "add -p" parses a patch, it should learn to do so, too.


[Reference]

*1* http://pubs.opengroup.org/onlinepubs/9699919799/utilities/diff.html

>
>> Meanwhile, I can easily configure my editor not to do this for `*.diff` 
>> files.
>> 
>> Thanks for your help, Phillip and Martin!
>
> Thanks for posting an example so we could test it, it makes it much
> easier to track the problem down
>
> Best Wishes
>
> Phillip
>
>> Mahmoud, does this also explain your problem as per your original post?
>> 


Re: [Best Practices Request] clean/smudge configuration

2018-05-10 Thread Junio C Hamano
"Randall S. Becker"  writes:

> What if we create a ../.gitconfig like ../.gitattributes, that is loaded
> before .git/config?

You should not forget one of the two reasons why clean/smudge/diff
etc. drivers must be given with a step of redirection, split between
attributes and config.  "This path holds text from ancient macs" at
the logical level may be expressed in .gitattributes, and then "when
checking out text from ancient macs, process the blob data with this
program to turn it into a form suitable for working tree" is given
as a smudge filter program, that is (1) suitable for the platform
_you_ as the writer of the config file is using *AND* (2) something
you are confortable running on behalf of you.

We *deliberately* lack a mechanism to pay attention to in-tree
config that gets propagated to those who get them via "git clone",
exactly because otherwise your upstream may not just specify a
"smudge" program that your platform would be unable to run, but can
specify a "smudge" program that pretends to be a smudger, but is
actually a program that installs a keylogger and posts your login
password and bank details to this mailing list ;-)

So, no, I do not think in-tree configuration will fly.


Re: [PATCH] fast-export: avoid NULL pointer arithmetic

2018-05-10 Thread Junio C Hamano
René Scharfe  writes:

>> But it somehow feels backwards in spirit to me, as the reason why we
>> use "void *" there in the decoration field is because we expect that
>> we'd have a pointer to some struture most of the time, and we have
>> to occasionally store a small integer there.
>
> Yes, fast-export seems to be the only place that stores an integer as
> a decoration.

With the decoration subsystem that might be the case, but I think
we have other codepaths where "void * .util" field in the structure
is used to store (void *)1, expecting that a normal allocation will
never yield a pointer that is indistinguishable from that value.

> Using struct decorate in fast-export has the benefit of not
> requiring separate allocations for individual entries.  Switching to
> struct hashmap would require individual allocations.  Adding a
> custom clone of decorate with a uint32_t payload would be an option.

As long as we know uint32_t is no wider than uintptr_t, your patch
should be safe, shouldn't it?


Re: [PATCH v2] add status config and command line options for rename detection

2018-05-10 Thread Junio C Hamano
Elijah Newren  writes:

>> Note: I removed the --no-breaks command line option from the original patch 
>> as
>> it will no longer be needed once the default has been changed [1] to turn it 
>> off.
>>
>> [1] 
>> https://public-inbox.org/git/20180430093421.27551-2-eckhard.s.ma...@gmail.com/
>
> I'd just drop these lines from the commit message, and instead mention
> that your patch depends on em/status-rename-config.
>
>> Original-Patch-by: Alejandro Pauly 
>> Signed-off-by: Ben Peart 
>> ---

Other things seem to have been resolved between you two already, so
I'll only comment on a minor tangent here.

>> Notes:
>> Base Ref: master
>
> This patch does not apply to master; it has conflicts.
>
>> Web-Diff: https://github.com/benpeart/git/commit/823212725b

As Git is distributed, unlike tags that are meant to be global among
project participants by convention, a branch name can never be used
as a trustable base among developers.  Your 'master' branch may
point at a different commit from mine, and my 'master' branch today
may point at a different commit from mine yesterday.

I've seen patches that used a similar note below the three-dash line
that named an exact commit object name.  That is a lot more reliable
way to convey the information necessary to consturct the exact state
the contributor worked on.

> This web diff shows em/status-rename-config as the parent commit, not
> master.  Since your commit message mentions you want the change to
> break detection provided by that series, just listing it as the
> explicit base seems like the right way to go.

Thanks for digging.  That would work well, too.


Re: [PATCH 0/1] Fix UX issue with commit-graph.lock

2018-05-10 Thread Junio C Hamano
Derrick Stolee  writes:

>> Also, can't we tell why we failed to acquire the lock at step #1?
>> Do we only get a NULL that says "I am not telling you why, but we
>> failed to lock"?
>
> To tell why we failed to acquire the lock, we could inspect
> "errno". However, this requires whitebox knowledge of both the
> lockfile API and the tempfile API to know that the last system call to
> set errno was an open() or adjust_shared_perm().

That depends on your viewpoint.  We can make it a part of documented
API so that we keep the promise of preserving errno in the error
codepath when we update these APIs and then you do not have to worry
about errno being whitebox knowledge ;-).



Re: [GSoC] Info: Week 02 Blog Post

2018-05-10 Thread Pratik Karki
On Thu, May 10, 2018 at 11:35 PM, Stefan Beller  wrote:
> Hi Pratik,

Hi Stefan,

> On Wed, May 9, 2018 at 8:07 PM, Pratik Karki  wrote:
>> Hi,
>>
>> The week 02 blog post[1] is live. This post is part I out of II and this
>> time it will be biweekly. The part II of will come in 2-3 days which
>> will describe the current `git-rebase.sh`.
>
> Cool post!

Thanks!

>> Do give me feedback.
>
> In "and you’re choice of DVCS is Git.", s/you're/your/

Fixed it! Thanks for pointing that out.


Re: [PATCH 0/2] Submodule merging: i18n, verbosity

2018-05-10 Thread Stefan Beller
Hi Elijah,

On Thu, May 10, 2018 at 5:04 PM, Elijah Newren  wrote:
> On Thu, May 10, 2018 at 2:19 PM, Stefan Beller  wrote:
>> Leif wrote:
>>> Sure, let me know what to use instead and I’ll update and resubmit the 
>>> patch.
>>> Sure, but `MERGE_WARNING` prefixes all the messages with "Failed to
>>> merge submodule“.
>>
>> I thought about replying and coming up with good reasons, but I wrote some
>> patches instead.
>>
>> They can also be found at 
>> https://github.com/stefanbeller/git/tree/submodule_i18n_verbose
>>
>> I think these would be a good foundation for your patch as well, as you can 
>> use the
>> output() function for the desired cases.
>>
>> Feel free to take these patches as part of your series or adapt
>> (or be inspired by) as needed.
>
> This is awesome.  In addition to the good reasons you gave, switching
> merge_submodule() to use output() was one of several things on my todo
> list since I think it'd be needed for remerge-diffs
> (https://bugs.chromium.org/p/git/issues/detail?id=12) and might be
> useful for merges in bare repos; thanks for tackling it.

Thanks for the encouraging words!
The one nit I find on that series is that we need to rely on and export
the add_submodule_odb function as I want to get rid of that function
once the object store series has progressed far enough.

>
> Patches look good to me.  Having Leif's patch on top of these two
> would be great.

ok, Let's go with that.

Stefan


Re: [PATCH 0/2] Submodule merging: i18n, verbosity

2018-05-10 Thread Elijah Newren
On Thu, May 10, 2018 at 2:19 PM, Stefan Beller  wrote:
> Leif wrote:
>> Sure, let me know what to use instead and I’ll update and resubmit the patch.
>> Sure, but `MERGE_WARNING` prefixes all the messages with "Failed to
>> merge submodule“.
>
> I thought about replying and coming up with good reasons, but I wrote some
> patches instead.
>
> They can also be found at 
> https://github.com/stefanbeller/git/tree/submodule_i18n_verbose
>
> I think these would be a good foundation for your patch as well, as you can 
> use the
> output() function for the desired cases.
>
> Feel free to take these patches as part of your series or adapt
> (or be inspired by) as needed.

This is awesome.  In addition to the good reasons you gave, switching
merge_submodule() to use output() was one of several things on my todo
list since I think it'd be needed for remerge-diffs
(https://bugs.chromium.org/p/git/issues/detail?id=12) and might be
useful for merges in bare repos; thanks for tackling it.

Patches look good to me.  Having Leif's patch on top of these two
would be great.

Elijah


Re: [PATCH] sha1dc: fix for compiling on AIX using IBM XLC compiler

2018-05-10 Thread Jonathan Nieder
(cc-ing Marc Stevens for real this time. Sorry for the noise)
Ævar Arnfjörð Bjarmason wrote:
> On Wed, May 09 2018, jens persson wrote:

>> Hello, first patch. I'm having trouble compiling on AIX using IBMs
>> compiler, leading to
>> unusable binaries. The following patch solved the problem for 2.17.0.
>> The patch below is cut via gmail to allow for firewalls, but
>> exists in an unmolested form on github:
>> https://github.com/MrShark/git/commit/44bfcaca6637e24548ec06f46fb6035a846b14af
>>
>> Best regards
>> /jp

Thanks for the patch.

>> Building on AIX using XLC every checkout gives an error:
>> fatal: pack is corrupted (SHA1 mismatch)
>> fatal: index-pack failed
>>
>> Back tracking it was introduced in 2.13.2, most likely in [1]
>>
>> Add a #ifdef guard based on macros defined at [2] and [3].
>>
>> Should perhaps __xlc__ should should be changed to or combined with _AIX
>> based on the behavour of GCC on AIX or XL C on Linux.
>>
>> 1. https://github.com/git/git/commit/6b851e536b05e0c8c61f77b9e4c3e7cedea39ff8
>> 2. 
>> https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.3/com.ibm.xlc1313.aix.doc/compiler_ref/macros_platform.html
>> 3. 
>> https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.3/com.ibm.xlc1313.aix.doc/compiler_ref/xlmacros.html
>>
>> Signed-off-by: jens persson 
>> ---
>>  sha1dc/sha1.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
>> index 25eded139..68a8a0180 100644
>> --- a/sha1dc/sha1.c
>> +++ b/sha1dc/sha1.c
>> @@ -84,7 +84,7 @@
>>  /* Not under GCC-alike or glibc or *BSD or newlib */
>>  #elif (defined(__ARMEB__) || defined(__THUMBEB__) || defined(__AARCH64EB__) 
>> || \
>> defined(__MIPSEB__) || defined(__MIPSEB) || defined(_MIPSEB) || \
>> -   defined(__sparc))
>> +   defined(__sparc) || (defined(__powerpc) && defined(__xlc__)))

I wonder if there's a simpler way to detect this.

__powerpc seems orthogonal to the goal, since there are little-endian
powerpc machines.

It appears that XLC defines _BIG_ENDIAN on big-endian machines.  I
wonder if we should do

 #elif defined(_BYTE_ORDER) && defined(_BIG_ENDIAN) && defined(_LITTLE_ENDIAN)
  ... as today ...
 #elif !defined(_BYTE_ORDER) && defined(_BIG_ENDIAN) && !defined(_LITTLE_ENDIAN)
 # define SHA1DC_BIGENDIAN
 #elif !defined(_BYTE_ORDER) && !defined(_BIG_ENDIAN) && defined(_LITTLE_ENDIAN)
  /* little endian. */
 #else
  ...

It also seems to me that Git should enable the #error in the
fallthrough case.  The test suite would catch this kind of problem but
it does not seem that everyone runs the test suite, so a compiler
error is more robust.  Is there a #define we can set to enable that?

>>  /*
>>   * Should define Big Endian for a whitelist of known processors. See
>>   * https://sourceforge.net/p/predef/wiki/Endianness/ and
>
> This patch looks sane to me, but we don't manage this software but
> instead try to pull it as-is from upstream at
> https://github.com/cr-marcstevens/sha1collisiondetection
>
> Which we could apply this, it would be much better if you could submit a
> pull request with this to upstream, and then we can update our copy as I
> did in e.g. 9936c1b52a ("sha1dc: update from upstream", 2017-07-01).

Thanks,
Jonathan


Re: [PATCH v2] mailmap: update brian m. carlson's email address

2018-05-10 Thread brian m. carlson
On Wed, May 09, 2018 at 12:33:59PM +0530, Kaartic Sivaraam wrote:
> On Wednesday 09 May 2018 05:49 AM, brian m. carlson wrote:
> > Correct, it doesn't.  In my case, I was using --pretty='%aN <%aE>',
> > which is how I noticed it in the first place.
> 
> So, how about updating the commit message to avoid confusions to the
> incidental future reader? (Or is it just not worth it? ;-])

I don't think it's worth it.  It does apply to git log, just not unless
you enable the mailmap, which I supposed was implicit.  I can do a v3 if
someone feels strongly, but otherwise I think it's fine as it is.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v4 00/13] object store: alloc

2018-05-10 Thread Stefan Beller
On Thu, May 10, 2018 at 1:56 PM, Jonathan Tan  wrote:
> On Thu, 10 May 2018 10:32:09 -0700
> Stefan Beller  wrote:
>
>> >  - I would call them release_commit() and release_tag(), to match
>> >strbuf_release().
>>
>> Why not commit_release and tag_release to also have the same order
>> of words as in strbuf_release ?
>
> At this point in the discussion, either is fine.

ok, please express your opinion in form of a patch. ;)

I do not plan on resending this series unless someone comments
something that they themselves do not consider bikesheddy.

Stefan


Re: [PATCH v2] add status config and command line options for rename detection

2018-05-10 Thread Elijah Newren
Hi Ben,

On Thu, May 10, 2018 at 12:09 PM, Ben Peart  wrote:
> On 5/10/2018 12:19 PM, Elijah Newren wrote:
>> On Thu, May 10, 2018 at 7:16 AM, Ben Peart 
>> wrote:

> Given the example perf impact is arbitrary (the actual example that
> triggered this patch took status from 2+ hours to seconds) and can't be
> replicated using the current performance tools in git, I'm just going drop
> the specific numbers.  I believe the patch is worth while just to give users
> the flexibility to control these behaviors.

Your parenthetical statement of timing going from hours to seconds I
think would be great; I don't think we need precise numbers.

>>> +   if ((intptr_t)rename_score_arg != -1) {
>>> +   s.detect_rename = DIFF_DETECT_RENAME;
>>
>>
>> I'd still prefer this was a
>>  if (s.detect_rename < DIFF_DETECT_RENAME)
>>  s.detect_rename = DIFF_DETECT_RENAME;
>>
>> If a user specifies they are willing to pay for copy detection, but
>> then just passes --find-renames=40% because they want to find more
>> renames, it seems odd to disable copy detection to me.
>>
>
> I agree and will change it. It is unfortunate this will behave differently
> than it does with merge.  Fixing the merge behavior to match is outside the
> scope of this patch.

I agree that changing merge is outside the scope of this patch, but
I'm curious what change you have in mind for it to "make it match".
In particular, merge-recursive.c already has (or will shortly have)
+   if (opts.detect_rename > DIFF_DETECT_RENAME)
+   opts.detect_rename = DIFF_DETECT_RENAME;
from your commit 85b460305ce7 ("merge: add merge.renames config
setting", 2018-05-02), so I'm not sure why we'd want to carefully
propagate a larger value for o->{diff,merge}_detect_rename prior to
this point.  If it's just "future proofing" because you suspect that
copy information could be useful to the merging algorithm and we'll
eventually get rid of these two lines of code, then I could get behind
such a change, though color me skeptical that copy information would
ever turn out to be useful in that context.

The one place copy detection does make sense inside a merge is for the
diffstat shown at the end (from builtin/merge.c), but it currently
isn't controlled by any configuration setting at all.  When it is
hooked up, it'd probably store the value separately from
merge-recursive's internal o->{diff,merge}_detect_rename anyway,
because builtin/merge.c's diffstat should be controlled by the
relevant confiig settings and flags (merge.renames, diff.renames,
-Xfind-renames, etc.) regardless of which merge strategy (recursive,
resolve, octopus, ours, ort) is employed.  And when that is hooked up,
I agree with you that it should look like what you've done with
status.renames here.  In fact, if you'd like to take a crack at it, I
think you'd do a great job.  :-)  If not, it's on my list of things to
do.

>> Testcases look good.  It'd be nice to also add a few testcases where
>> copy detection is turned on -- in particular, I'd like to see one with
>> --find-renames=$DIFFERENT_THAN_DEFAULT being passed when
>> merge.renames=copies.
>>
>
> OK.  I also added tests to verify the settings correctly impact commit.

Nice!


[PATCH] submodule: port submodule subcommand 'foreach' from shell to C

2018-05-10 Thread Stefan Beller
From: Prathamesh Chavan 

This aims to make git-submodule foreach a builtin. 'foreach' is ported to
the submodule--helper, and submodule--helper is called from
git-submodule.sh.

Helped-by: Brandon Williams 
Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
Signed-off-by: Stefan Beller 
---

This is a resend of the last commit in origin/pc/submodule-helper-foreach
It addresses the micro nits of funny comment indentation.

Thanks,
Stefan

 builtin/submodule--helper.c | 144 
 git-submodule.sh|  39 +-
 2 files changed, 145 insertions(+), 38 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index c2403a915ff..4002026d1ac 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -439,6 +439,149 @@ static void for_each_listed_submodule(const struct 
module_list *list,
fn(list->entries[i], cb_data);
 }
 
+struct cb_foreach {
+   int argc;
+   const char **argv;
+   const char *prefix;
+   int quiet;
+   int recursive;
+};
+#define CB_FOREACH_INIT { 0 }
+
+static void runcommand_in_submodule_cb(const struct cache_entry *list_item,
+  void *cb_data)
+{
+   struct cb_foreach *info = cb_data;
+   const char *path = list_item->name;
+   const struct object_id *ce_oid = _item->oid;
+
+   const struct submodule *sub;
+   struct child_process cp = CHILD_PROCESS_INIT;
+   char *displaypath;
+
+   displaypath = get_submodule_displaypath(path, info->prefix);
+
+   sub = submodule_from_path(the_repository, _oid, path);
+
+   if (!sub)
+   die(_("No url found for submodule path '%s' in .gitmodules"),
+   displaypath);
+
+   if (!is_submodule_populated_gently(path, NULL))
+   goto cleanup;
+
+   prepare_submodule_repo_env(_array);
+
+   /*
+* For the purpose of executing  in the submodule,
+* separate shell is used for the purpose of running the
+* child process.
+*/
+   cp.use_shell = 1;
+   cp.dir = path;
+
+   /*
+* NEEDSWORK: the command currently has access to the variables $name,
+* $sm_path, $displaypath, $sha1 and $toplevel only when the command
+* contains a single argument. This is done for maintaining a faithful
+* translation from shell script.
+*/
+   if (info->argc == 1) {
+   char *toplevel = xgetcwd();
+   struct strbuf sb = STRBUF_INIT;
+
+   argv_array_pushf(_array, "name=%s", sub->name);
+   argv_array_pushf(_array, "sm_path=%s", path);
+   argv_array_pushf(_array, "displaypath=%s", displaypath);
+   argv_array_pushf(_array, "sha1=%s",
+   oid_to_hex(ce_oid));
+   argv_array_pushf(_array, "toplevel=%s", toplevel);
+
+   /*
+* Since the path variable was accessible from the script
+* before porting, it is also made available after porting.
+* The environment variable "PATH" has a very special purpose
+* on windows. And since environment variables are
+* case-insensitive in windows, it interferes with the
+* existing PATH variable. Hence, to avoid that, we expose
+* path via the args argv_array and not via env_array.
+*/
+   sq_quote_buf(, path);
+   argv_array_pushf(, "path=%s; %s",
+sb.buf, info->argv[0]);
+   strbuf_release();
+   free(toplevel);
+   } else {
+   argv_array_pushv(, info->argv);
+   }
+
+   if (!info->quiet)
+   printf(_("Entering '%s'\n"), displaypath);
+
+   if (info->argv[0] && run_command())
+   die(_("run_command returned non-zero status for %s\n."),
+   displaypath);
+
+   if (info->recursive) {
+   struct child_process cpr = CHILD_PROCESS_INIT;
+
+   cpr.git_cmd = 1;
+   cpr.dir = path;
+   prepare_submodule_repo_env(_array);
+
+   argv_array_pushl(, "--super-prefix", NULL);
+   argv_array_pushf(, "%s/", displaypath);
+   argv_array_pushl(, "submodule--helper", "foreach", 
"--recursive",
+   NULL);
+
+   if (info->quiet)
+   argv_array_push(, "--quiet");
+
+   argv_array_pushv(, info->argv);
+
+   if (run_command())
+   die(_("run_command returned non-zero status while"
+   "recursing in the nested submodules of %s\n."),
+  

[PATCH 2/2] merge-recursive: i18n submodule merge output and respect verbosity

2018-05-10 Thread Stefan Beller
The submodule merge code now uses the output() function that is used by
all the rest of the merge-recursive-code. This allows for respecting
internationalisation as well as the verbosity setting.

Signed-off-by: Stefan Beller 
---
 merge-recursive.c | 33 +++--
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 700ba15bf88..a4b91d17f87 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1048,18 +1048,17 @@ static void print_commit(struct commit *commit)
strbuf_release();
 }
 
-#define MERGE_WARNING(path, msg) \
-   warning("Failed to merge submodule %s (%s)", path, msg);
-
-static int merge_submodule(struct object_id *result, const char *path,
+static int merge_submodule(struct merge_options *o,
+  struct object_id *result, const char *path,
   const struct object_id *base, const struct object_id 
*a,
-  const struct object_id *b, int search)
+  const struct object_id *b)
 {
struct commit *commit_base, *commit_a, *commit_b;
int parent_count;
struct object_array merges;
 
int i;
+   int search = !o->call_depth;
 
/* store a in result in case we fail */
oidcpy(result, a);
@@ -1073,21 +1072,21 @@ static int merge_submodule(struct object_id *result, 
const char *path,
return 0;
 
if (add_submodule_odb(path)) {
-   MERGE_WARNING(path, "not checked out");
+   output(o, 1, _("Failed to merge submodule %s (not checked 
out)"), path);
return 0;
}
 
if (!(commit_base = lookup_commit_reference(base)) ||
!(commit_a = lookup_commit_reference(a)) ||
!(commit_b = lookup_commit_reference(b))) {
-   MERGE_WARNING(path, "commits not present");
+   output(o, 1, _("Failed to merge submodule %s (commits not 
present)"), path);
return 0;
}
 
/* check whether both changes are forward */
if (!in_merge_bases(commit_base, commit_a) ||
!in_merge_bases(commit_base, commit_b)) {
-   MERGE_WARNING(path, "commits don't follow merge-base");
+   output(o, 1, _("Failed to merge submodule %s (commits don't 
follow merge-base)"), path);
return 0;
}
 
@@ -1116,25 +1115,24 @@ static int merge_submodule(struct object_id *result, 
const char *path,
parent_count = find_first_merges(, path, commit_a, commit_b);
switch (parent_count) {
case 0:
-   MERGE_WARNING(path, "merge following commits not found");
+   output(o, 1, _("Failed to merge submodule %s (merge following 
commits not found)"), path);
break;
 
case 1:
-   MERGE_WARNING(path, "not fast-forward");
-   fprintf(stderr, "Found a possible merge resolution "
-   "for the submodule:\n");
+   output(o, 1, _("Failed to merge submodule %s (not 
fast-forward)"), path);
+   output(o, 1, _("Found a possible merge resolution for the 
submodule:\n"));
print_commit((struct commit *) merges.objects[0].item);
-   fprintf(stderr,
+   output(o, 1, _(
"If this is correct simply add it to the index "
"for example\n"
"by using:\n\n"
"  git update-index --cacheinfo 16 %s \"%s\"\n\n"
-   "which will accept this suggestion.\n",
+   "which will accept this suggestion.\n"),
oid_to_hex([0].item->oid), path);
break;
 
default:
-   MERGE_WARNING(path, "multiple merges found");
+   output(o, 1, _("Failed to merge submodule %s (multiple merges 
found)"), path);
for (i = 0; i < merges.nr; i++)
print_commit((struct commit *) merges.objects[i].item);
}
@@ -1205,12 +1203,11 @@ static int merge_file_1(struct merge_options *o,
return ret;
result->clean = (merge_status == 0);
} else if (S_ISGITLINK(a->mode)) {
-   result->clean = merge_submodule(>oid,
+   result->clean = merge_submodule(o, >oid,
   one->path,
   >oid,
   >oid,
-  >oid,
-  !o->call_depth);
+  >oid);
} else if (S_ISLNK(a->mode)) {
switch (o->recursive_variant) {
 

[PATCH 1/2] submodule.c: move submodule merging to merge-recursive.c

2018-05-10 Thread Stefan Beller
In a later patch we want to improve submodule merging by using the output()
function in merge-recursive.c for submodule merges to deliver a consistent
UI to users.

To do so we could either make the output() function globally available
so we can use it in submodule.c#merge_submodule(), or we could integrate
the submodule merging into the merging code. Choose the later as we
generally want to move submodules closer into the core.

Therefore we move any function related to merging submodules
(merge_submodule(), find_first_merges() and print_commit) to
merge-recursive.c.  We'll keep add_submodule_odb() in submodule.c as it
is used by other submodule functions. While at it, add a TODO note that
we do not really like the function add_submodule_odb().

This commit is best viewed with --color-moved.

Signed-off-by: Stefan Beller 
---
 merge-recursive.c | 166 +
 submodule.c   | 168 +-
 submodule.h   |   6 +-
 3 files changed, 170 insertions(+), 170 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 0c0d48624da..700ba15bf88 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -23,6 +23,7 @@
 #include "merge-recursive.h"
 #include "dir.h"
 #include "submodule.h"
+#include "revision.h"
 
 struct path_hashmap_entry {
struct hashmap_entry e;
@@ -977,6 +978,171 @@ static int merge_3way(struct merge_options *o,
return merge_status;
 }
 
+static int find_first_merges(struct object_array *result, const char *path,
+   struct commit *a, struct commit *b)
+{
+   int i, j;
+   struct object_array merges = OBJECT_ARRAY_INIT;
+   struct commit *commit;
+   int contains_another;
+
+   char merged_revision[42];
+   const char *rev_args[] = { "rev-list", "--merges", "--ancestry-path",
+  "--all", merged_revision, NULL };
+   struct rev_info revs;
+   struct setup_revision_opt rev_opts;
+
+   memset(result, 0, sizeof(struct object_array));
+   memset(_opts, 0, sizeof(rev_opts));
+
+   /* get all revisions that merge commit a */
+   xsnprintf(merged_revision, sizeof(merged_revision), "^%s",
+   oid_to_hex(>object.oid));
+   init_revisions(, NULL);
+   rev_opts.submodule = path;
+   /* FIXME: can't handle linked worktrees in submodules yet */
+   revs.single_worktree = path != NULL;
+   setup_revisions(ARRAY_SIZE(rev_args)-1, rev_args, , _opts);
+
+   /* save all revisions from the above list that contain b */
+   if (prepare_revision_walk())
+   die("revision walk setup failed");
+   while ((commit = get_revision()) != NULL) {
+   struct object *o = &(commit->object);
+   if (in_merge_bases(b, commit))
+   add_object_array(o, NULL, );
+   }
+   reset_revision_walk();
+
+   /* Now we've got all merges that contain a and b. Prune all
+* merges that contain another found merge and save them in
+* result.
+*/
+   for (i = 0; i < merges.nr; i++) {
+   struct commit *m1 = (struct commit *) merges.objects[i].item;
+
+   contains_another = 0;
+   for (j = 0; j < merges.nr; j++) {
+   struct commit *m2 = (struct commit *) 
merges.objects[j].item;
+   if (i != j && in_merge_bases(m2, m1)) {
+   contains_another = 1;
+   break;
+   }
+   }
+
+   if (!contains_another)
+   add_object_array(merges.objects[i].item, NULL, result);
+   }
+
+   object_array_clear();
+   return result->nr;
+}
+
+static void print_commit(struct commit *commit)
+{
+   struct strbuf sb = STRBUF_INIT;
+   struct pretty_print_context ctx = {0};
+   ctx.date_mode.type = DATE_NORMAL;
+   format_commit_message(commit, " %h: %m %s", , );
+   fprintf(stderr, "%s\n", sb.buf);
+   strbuf_release();
+}
+
+#define MERGE_WARNING(path, msg) \
+   warning("Failed to merge submodule %s (%s)", path, msg);
+
+static int merge_submodule(struct object_id *result, const char *path,
+  const struct object_id *base, const struct object_id 
*a,
+  const struct object_id *b, int search)
+{
+   struct commit *commit_base, *commit_a, *commit_b;
+   int parent_count;
+   struct object_array merges;
+
+   int i;
+
+   /* store a in result in case we fail */
+   oidcpy(result, a);
+
+   /* we can not handle deletion conflicts */
+   if (is_null_oid(base))
+   return 0;
+   if (is_null_oid(a))
+   return 0;
+   if (is_null_oid(b))
+   return 0;
+
+   if (add_submodule_odb(path)) {
+   MERGE_WARNING(path, "not checked out");
+   

[PATCH 0/2] Submodule merging: i18n, verbosity

2018-05-10 Thread Stefan Beller
Leif wrote:
> Sure, let me know what to use instead and I’ll update and resubmit the patch.
> Sure, but `MERGE_WARNING` prefixes all the messages with "Failed to
> merge submodule“.

I thought about replying and coming up with good reasons, but I wrote some
patches instead.

They can also be found at 
https://github.com/stefanbeller/git/tree/submodule_i18n_verbose

I think these would be a good foundation for your patch as well, as you can use 
the
output() function for the desired cases.

Feel free to take these patches as part of your series or adapt
(or be inspired by) as needed.

Thanks,
Stefan


Stefan Beller (2):
  submodule.c: move submodule merging to merge-recursive.c
  merge-recursive: i18n submodule merge output and respect verbosity

 merge-recursive.c | 169 +-
 submodule.c   | 168 +
 submodule.h   |   6 +-
 3 files changed, 170 insertions(+), 173 deletions(-)

-- 
2.17.0.255.g8bfb7c0704



Re: [PATCH v4 00/13] object store: alloc

2018-05-10 Thread Jonathan Tan
On Thu, 10 May 2018 10:32:09 -0700
Stefan Beller  wrote:

> >  - I would call them release_commit() and release_tag(), to match
> >strbuf_release().
> 
> Why not commit_release and tag_release to also have the same order
> of words as in strbuf_release ?

At this point in the discussion, either is fine.

> >  - It might be better to just inline the handling of releasing commit
> >and tag memory. This code already knows that, for a tree, it needs to
> >free its buffer and only its buffer, so it is not much of a stretch
> >to think that it similarly knows the details of commit and tag
> >objects too.
> 
> We still call out to free_tree_buffer? Not sure I understand.

I meant that since we call out to free_tree_buffer (as you said), this
shows that the code knows the internal details of a tree object (in that
it has a buffer, and that needs to be freed, and that is the only thing
that needs to be freed), so maybe the code should operate on the
internal details of commits and tags as well. But again, this is a minor
point.


Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-10 Thread Martin Ågren
On 10 May 2018 at 21:22, Stefan Beller  wrote:
> On Thu, May 10, 2018 at 12:05 PM, Martin Ågren  wrote:

>> I hope to find time to do some more hands-on testing of this to see that
>> errors actually do get caught.

> Packfiles and loose objects are primary data, which means that those
> need a more advanced way to diagnose and repair them, so I would imagine
> the commit graph fsck is closer to bitmaps fsck, which I would have suspected
> to be found in t5310, but a quick read doesn't reveal many tests that are
> checking for integrity. So I guess the test coverage here is ok, (although we
> should always ask for more)

Since I'm wrapping up for today, I'm posting some simple tests that I
assembled. The last of these showcases one or two problems with the
current error-reporting. Depending on the error, there can be *lots* of
errors reported and there are no new-lines, so the result on stdout can
be a wall of not-very-legible text.

Some of these might not make sense. I just started going through the
documentation on the format, causing some sort of corruption in each
field. Maybe this can be helpful somehow.

Martin

diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 82f95eb11f..a7e48db2de 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -255,4 +255,49 @@ test_expect_success 'git fsck (checks commit-graph)' '
git fsck
 '
 
+# usage: corrupt_data   []
+corrupt_data() {
+   file=$1
+   pos=$2
+   data="${3:-\0}"
+   printf "$data" | dd of="$file" bs=1 seek="$pos" conv=notrunc
+}
+
+test_expect_success 'detect bad signature' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 0 "\0" &&
+   test_must_fail git commit-graph verify 2>err &&
+   grep "graph signature" err
+'
+
+test_expect_success 'detect bad version number' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 4 "\02" &&
+   test_must_fail git commit-graph verify 2>err &&
+   grep "graph version" err
+'
+
+test_expect_success 'detect bad hash version' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 5 "\02" &&
+   test_must_fail git commit-graph verify 2>err &&
+   grep "hash version" err
+'
+
+test_expect_success 'detect too small chunk-count' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 6 "\01" &&
+   test_must_fail git commit-graph verify 2>err &&
+   cat err
+'
+
+
 test_done
-- 
2.17.0



Re: [PATCH 1/1] Warn about fast-forwarding of submodules during merge

2018-05-10 Thread Leif Middelschulte
Hi Stefan,


Am 10. Mai 2018 um 20:49:39, Stefan Beller
(sbel...@google.com(mailto:sbel...@google.com)) schrieb:

> On Thu, May 10, 2018 at 11:26 AM, Leif Middelschulte
> wrote:
> > From: Leif Middelschulte
>
> Hi Leif!
>
> thanks for following up with a patch!
sure, thanks for the quick review.
>
> > Warn the user about an automatically fast-forwarded submodule. The silent 
> > merge
> > behavior was introduced by commit 68d03e4a6e44 ("Implement automatic 
> > fast-forward
> > merge for submodules", 2010-07-07)).
> >
> > Signed-off-by: Leif Middelschulte
> > ---
> > submodule.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/submodule.c b/submodule.c
> > index 74d35b257..0198a72e6 100644
> > --- a/submodule.c
> > +++ b/submodule.c
> > @@ -1817,10 +1817,12 @@ int merge_submodule(struct object_id *result, const 
> > char *path,
> > /* Case #1: a is contained in b or vice versa */
> > if (in_merge_bases(commit_a, commit_b)) {
> > oidcpy(result, b);
> > + warning("Fast-forwarding submodule %s", path);
> > return 1;
> > }
> > if (in_merge_bases(commit_b, commit_a)) {
> > oidcpy(result, a);
> > + warning("Fast-forwarding submodule %s", path);
> > return 1;
> > }
>
> The code looks correct, however I think we can improve it.
> (Originally I was just wondering if stderr is the right output,
> which lead me to the thoughts below:)
I’ve had the same thoughts about stderr. However, I thought that using a
log function named `warning` to warn the user would be the right choice.
If anything, I thought, the warning function might need refactoring.

> Looking through the code of merge-recursive.c,
> all the other merge outputs are done via 'output()'
> that is able to buffer up the output as well as handles
> the output for different verbosity settings.
>
> So I would think we should make the output() function available
> outside of merge-recursive.c. (and rename it to a be more concise
> and descriptive in the global namespace) and make use of it.
Sure, let me know what to use instead and I’ll update and resubmit the patch.

>
> Funnily we already have MERGE_WARNING in submodule.c
> which outputs information for all the other cases. I would think
> we ought to convert those to the output(), too.
Sure, but `MERGE_WARNING` prefixes all the messages with "Failed to
merge submodule“.
>
> Thanks,
> Stefan

Thank you,
Leif


Re: Git case-sensitivity bug

2018-05-10 Thread Bryan Turner
Hey Cliff,

On Thu, May 10, 2018 at 12:46 PM Cliff  wrote:

> I believe I have discovered a bug with git tools.

> If you create a git branch, you can refer to that branch with
> case-insensitive alterations and it will track as the same branch.

This comes up on the list fairly often on the list. The most recent thread
is here:
https://public-inbox.org/git/d4d8d8208b6a41c380ecf20807763...@xch15-05-02.nw.nos.boeing.com/

The replies to that thread explain the behavior, why it does what it does,
and why it hasn't been changed yet. It happens on case-insensitive macOS
filesystems and on NTFS on Windows.

Hope this helps!
Bryan Turner


RE: [Best Practices Request] clean/smudge configuration

2018-05-10 Thread Randall S. Becker


On May 9, 2018 6:39 PM, Bryan Turner wrote:
> On Wed, May 9, 2018 at 3:09 PM Randall S. Becker
> 
> wrote:
> 
> > The question: what is the best practice for versioning the parts of
> > clean/smudge filters that are in .git/config given that only some
> > users in my environment will be cloning the repository in question and
> > that I
> really
> > can't put the entries in /etc/gitconfig or ~/.gitconfig because of
> potential
> > conflicts with other repositories that might also have clean/smudge
> > definitions.
> 
> Depending on level of trust, one approach might be to use an [include] in
> .git/config to include a file that's in the repository. Something like:
> 
> [include]
>  path = ../path/to/config
> 

What if we create a ../.gitconfig like ../.gitattributes, that is loaded
before .git/config? With loads of warnings in the documentation about what
*NOT* to put in here, any platform specifics and your own risk. The code in
config.c would look like the following, with obvious updates to
documentation and the test suite, so it's not fully baked yet. So far, I
don't have a solution to the chicken-and-egg problem, other than this.
However, if I'm barking up the wrong ballpark...

diff --git a/config.c b/config.c
index b0c20e6cb..75d5288ff 100644
--- a/config.c
+++ b/config.c
@@ -1555,11 +1555,15 @@ static int do_git_config_sequence(const struct
config_options *opts,
char *xdg_config = xdg_config_home("config");
char *user_config = expand_user_path("~/.gitconfig", 0);
char *repo_config;
+   char *repo_config_versioned;

-   if (opts->commondir)
+   if (opts->commondir) {
repo_config = mkpathdup("%s/config", opts->commondir);
-   else
+   repo_config_versioned = mkpathdup("%s/../.gitconfig",
opts->commondir);
+   } else {
repo_config = NULL;
+   repo_config_versioned = NULL;
+   }

current_parsing_scope = CONFIG_SCOPE_SYSTEM;
if (git_config_system() && !access_or_die(git_etc_gitconfig(), R_OK,
0))
@@ -1574,6 +1578,8 @@ static int do_git_config_sequence(const struct
config_options *opts,
ret += git_config_from_file(fn, user_config, data);

current_parsing_scope = CONFIG_SCOPE_REPO;
+   if (repo_config_versioned && !access_or_die(repo_config_versioned,
R_OK, 0))
+   ret += git_config_from_file(fn, repo_config_versioned,
data);
if (repo_config && !access_or_die(repo_config, R_OK, 0))
ret += git_config_from_file(fn, repo_config, data);

@@ -1585,6 +1591,7 @@ static int do_git_config_sequence(const struct
config_options *opts,
free(xdg_config);
free(user_config);
free(repo_config);
+   free(repo_config_versioned);
return ret;
 }




[PATCH v2 3/4] object.c: free replace map in raw_object_store_clear

2018-05-10 Thread Stefan Beller
The replace map for objects was missed to free in the object store in
the conversion of c1274495 ("replace-object: eliminate replace objects
prepared flag", 2018-04-11). We also missed to free the replaced objects
that are put into the replace map in that whole series.

Signed-off-by: Stefan Beller 
---
 object.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/object.c b/object.c
index 242d922d953..80d1cae53c0 100644
--- a/object.c
+++ b/object.c
@@ -481,6 +481,9 @@ void raw_object_store_clear(struct raw_object_store *o)
FREE_AND_NULL(o->objectdir);
FREE_AND_NULL(o->alternate_db);
 
+   oidmap_free(o->replace_map, 1);
+   FREE_AND_NULL(o->replace_map);
+
free_alt_odbs(o);
o->alt_odb_tail = NULL;
 
-- 
2.17.0.255.g8bfb7c0704



[PATCH v2 4/4] replace-object.c: remove the_repository from prepare_replace_object

2018-05-10 Thread Stefan Beller
This was missed in 5982da9d2ce (replace-object: allow
prepare_replace_object to handle arbitrary repositories, 2018-04-11)

Technically the code works correctly as the replace_map is the same
size in different repositories, however it is hard to read. So convert
the code to the familiar pattern of dereferencing the pointer that we
assign in the sizeof itself.

Signed-off-by: Stefan Beller 
---
 replace-object.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/replace-object.c b/replace-object.c
index 246b98cd4f1..801b5c16789 100644
--- a/replace-object.c
+++ b/replace-object.c
@@ -37,7 +37,7 @@ static void prepare_replace_object(struct repository *r)
return;
 
r->objects->replace_map =
-   xmalloc(sizeof(*the_repository->objects->replace_map));
+   xmalloc(sizeof(*r->objects->replace_map));
oidmap_init(r->objects->replace_map, 0);
 
for_each_replace_ref(r, register_replace_ref, NULL);
-- 
2.17.0.255.g8bfb7c0704



[PATCH v2 0/4] Fix mem leaks of recent object store conversions.

2018-05-10 Thread Stefan Beller
This series replaces the two commits that were queued on 
sb/object-store-replace,
fixing memory leaks that were recently introduced.

Compared to v1, I merged the two independent series from yesterday,
rewrote the commit message to clear up Junios confusion and addresses Peffs
comments for the packfiles as well.
Also added another free to free the oidmap for the replaces themselves.

Thanks,
Stefan

Stefan Beller (4):
  packfile: close and free packs upon releasing an object store
  packfile.h: remove all extern keywords
  object.c: free replace map in raw_object_store_clear
  replace-object.c: remove the_repository from prepare_replace_object

 object.c |  7 +++--
 packfile.c   | 13 
 packfile.h   | 79 
 replace-object.c |  2 +-
 4 files changed, 58 insertions(+), 43 deletions(-)

-- 
2.17.0.255.g8bfb7c0704



[PATCH v2 2/4] packfile.h: remove all extern keywords

2018-05-10 Thread Stefan Beller
Per our coding guidelines we prefer to only use the extern keyword
when needed.

Signed-off-by: Stefan Beller 
---
 packfile.h | 80 +++---
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/packfile.h b/packfile.h
index cdab0557979..eb3b1154501 100644
--- a/packfile.h
+++ b/packfile.h
@@ -10,32 +10,32 @@
  *
  * Example: odb_pack_name(out, sha1, "idx") => 
".git/objects/pack/pack-1234..idx"
  */
-extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, 
const char *ext);
+char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char 
*ext);
 
 /*
  * Return the name of the (local) packfile with the specified sha1 in
  * its name.  The return value is a pointer to memory that is
  * overwritten each time this function is called.
  */
-extern char *sha1_pack_name(const unsigned char *sha1);
+char *sha1_pack_name(const unsigned char *sha1);
 
 /*
  * Return the name of the (local) pack index file with the specified
  * sha1 in its name.  The return value is a pointer to memory that is
  * overwritten each time this function is called.
  */
-extern char *sha1_pack_index_name(const unsigned char *sha1);
+char *sha1_pack_index_name(const unsigned char *sha1);
 
-extern struct packed_git *parse_pack_index(unsigned char *sha1, const char 
*idx_path);
+struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
 /* A hook to report invalid files in pack directory */
 #define PACKDIR_FILE_PACK 1
 #define PACKDIR_FILE_IDX 2
 #define PACKDIR_FILE_GARBAGE 4
-extern void (*report_garbage)(unsigned seen_bits, const char *path);
+void (*report_garbage)(unsigned seen_bits, const char *path);
 
-extern void reprepare_packed_git(struct repository *r);
-extern void install_packed_git(struct repository *r, struct packed_git *pack);
+void reprepare_packed_git(struct repository *r);
+void install_packed_git(struct repository *r, struct packed_git *pack);
 
 struct packed_git *get_packed_git(struct repository *r);
 struct list_head *get_packed_git_mru(struct repository *r);
@@ -46,31 +46,31 @@ struct list_head *get_packed_git_mru(struct repository *r);
  */
 unsigned long approximate_object_count(void);
 
-extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
+struct packed_git *find_sha1_pack(const unsigned char *sha1,
 struct packed_git *packs);
 
-extern void pack_report(void);
+void pack_report(void);
 
 /*
  * mmap the index file for the specified packfile (if it is not
  * already mmapped).  Return 0 on success.
  */
-extern int open_pack_index(struct packed_git *);
+int open_pack_index(struct packed_git *);
 
 /*
  * munmap the index file for the specified packfile (if it is
  * currently mmapped).
  */
-extern void close_pack_index(struct packed_git *);
+void close_pack_index(struct packed_git *);
 
-extern unsigned char *use_pack(struct packed_git *, struct pack_window **, 
off_t, unsigned long *);
-extern void close_pack_windows(struct packed_git *);
-extern void close_pack(struct packed_git *);
-extern void close_all_packs(struct raw_object_store *o);
-extern void close_and_free_packs(struct raw_object_store *o);
-extern void unuse_pack(struct pack_window **);
-extern void clear_delta_base_cache(void);
-extern struct packed_git *add_packed_git(const char *path, size_t path_len, 
int local);
+unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, 
unsigned long *);
+void close_pack_windows(struct packed_git *);
+void close_pack(struct packed_git *);
+void close_all_packs(struct raw_object_store *o);
+void close_and_free_packs(struct raw_object_store *o);
+void unuse_pack(struct pack_window **);
+void clear_delta_base_cache(void);
+struct packed_git *add_packed_git(const char *path, size_t path_len, int 
local);
 
 /*
  * Make sure that a pointer access into an mmap'd index file is within bounds,
@@ -80,7 +80,7 @@ extern struct packed_git *add_packed_git(const char *path, 
size_t path_len, int
  * (like the 64-bit extended offset table), as we compare the size to the
  * fixed-length parts when we open the file.
  */
-extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
+void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
 
 /*
  * Perform binary search on a pack-index for a given oid. Packfile is expected 
to
@@ -96,51 +96,51 @@ int bsearch_pack(const struct object_id *oid, const struct 
packed_git *p, uint32
  * at the SHA-1 within the mmapped index.  Return NULL if there is an
  * error.
  */
-extern const unsigned char *nth_packed_object_sha1(struct packed_git *, 
uint32_t n);
+const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n);
 /*
  * Like nth_packed_object_sha1, but write the data into the object specified by
  * the the first argument.  Returns the first argument on success, and NULL on
  * error.
  */
-extern const struct object_id 

[PATCH v2 1/4] packfile: close and free packs upon releasing an object store

2018-05-10 Thread Stefan Beller
In d0b59866223 (object-store: close all packs upon clearing the object
store, 2018-03-23), we made sure to close all packfiles on releasing
an object store, but we also have to free the memory of the closed packs.

Signed-off-by: Stefan Beller 
---

Notes:
> Should that INIT_LIST_HEAD get moved down into that function?

done.

> Probably the same applies to setting NULL here; you're left with a
> dangling pointer if you just call close_and_free_packs(). Should
> that helper maybe just be a static function in object.c?

I just realize that

while (o->packed_git) {
...
o->packed_git = p->next;
...
}

will make sure that o->packed_git is NULL afterwards,
hence I removed the explicit set to NULL in object.c
and we rely on the code in replace-object.c

Thanks,
Stefan

 object.c   |  4 +---
 packfile.c | 13 +
 packfile.h |  1 +
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/object.c b/object.c
index 66cffaf6e51..242d922d953 100644
--- a/object.c
+++ b/object.c
@@ -484,7 +484,5 @@ void raw_object_store_clear(struct raw_object_store *o)
free_alt_odbs(o);
o->alt_odb_tail = NULL;
 
-   INIT_LIST_HEAD(>packed_git_mru);
-   close_all_packs(o);
-   o->packed_git = NULL;
+   close_and_free_packs(o);
 }
diff --git a/packfile.c b/packfile.c
index 6c3ddc3c31d..7f2a9e28a2b 100644
--- a/packfile.c
+++ b/packfile.c
@@ -322,6 +322,19 @@ void close_all_packs(struct raw_object_store *o)
close_pack(p);
 }
 
+void close_and_free_packs(struct raw_object_store *o)
+{
+   close_all_packs(o);
+
+   INIT_LIST_HEAD(>packed_git_mru);
+
+   while (o->packed_git) {
+   struct packed_git *p = o->packed_git;
+   o->packed_git = p->next;
+   free(p);
+   }
+}
+
 /*
  * The LRU pack is the one with the oldest MRU window, preferring packs
  * with no used windows, or the oldest mtime if it has no windows allocated.
diff --git a/packfile.h b/packfile.h
index 9c2f8859945..cdab0557979 100644
--- a/packfile.h
+++ b/packfile.h
@@ -67,6 +67,7 @@ extern unsigned char *use_pack(struct packed_git *, struct 
pack_window **, off_t
 extern void close_pack_windows(struct packed_git *);
 extern void close_pack(struct packed_git *);
 extern void close_all_packs(struct raw_object_store *o);
+extern void close_and_free_packs(struct raw_object_store *o);
 extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, 
int local);
-- 
2.17.0.255.g8bfb7c0704



Re: [PATCH] fast-export: avoid NULL pointer arithmetic

2018-05-10 Thread René Scharfe
Am 10.05.2018 um 12:51 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> The standard says about uintptr_t that "any valid pointer to void can
>> be converted to this type, then converted back to pointer to void, and
>> the result will compare equal to the original pointer".  So void * ->
>> uintptr_t -> void * is a proper roundtrip, but that doesn't imply that
>> casting arbitrary uintptr_t values to void * would be lossless.
>>
>> I don't know an architecture where this would bite us, but I wonder if
>> there is a cleaner way.  Perhaps changing the type of the decoration
>> member of struct decoration_entry in decorate.h to uintptr_t?
> 
> In order to ensure "void * -> uintptr_t -> void *" roundtrip holds,
> the implementation would guarantee that uintptr_t is wider than
> void*, so what you suggest technically makes sense.  We should be
> able to store any pointer in the field.  And we should be able to
> store any value of an unsigned integral type that is narrower than
> uintptr_t.
> 
> But it somehow feels backwards in spirit to me, as the reason why we
> use "void *" there in the decoration field is because we expect that
> we'd have a pointer to some struture most of the time, and we have
> to occasionally store a small integer there.

Yes, fast-export seems to be the only place that stores an integer as
a decoration.

>  So I'd naively expect
> that
> 
>   uint32_t mark = 23;
>   de->decoration = (void *)mark;
> 
> would be a good way to store mark #23 in the field and
> 
>   uint32_t mark;
>   mark = (typeof(mark))de->decoration;
> 
> would be a good way to read it off of the "void *" field.  Of
> course, this assume that (void *) is at least as wide as 32-bit and
> it also ignores the standard ;-)

Right, it looks deceptively good and works fine if memory is flat and
valid values for pointers are in a contiguous range starting at zero.
The standard allows for other models as well, though.

> This is an unrelated tangent but the mark-to-ptr() and ptr-to-mark()
> implementations feel wasteful, especially when we worry about 32-bit
> archs.  A naive platform implementation of
> 
>   (uint32_t *)mark - (uint32_t *)NULL;
> 
> would be ((uintptr_t)mark) / 4, i.e. the de->decoration field will
> always have two LSB clear and only utilize top 30-bit to represent
> the value of mark.

That's right, but I don't see what's naive about it, or how a 32-bit
architecture could avoid wasting those two bits.


Using struct decorate in fast-export has the benefit of not
requiring separate allocations for individual entries.  Switching to
struct hashmap would require individual allocations.  Adding a
custom clone of decorate with a uint32_t payload would be an option.

René


Git case-sensitivity bug

2018-05-10 Thread Cliff
I believe I have discovered a bug with git tools.

If you create a git branch, you can refer to that branch with
case-insensitive alterations and it will track as the same branch.

If I create branch "test" I cannot then create branch "Test" because
the same name is already used.

However, git commands ARE case-sensitive, causing unspecified behavior!

If you create branch "test" and then run "git checkout Test" and then
run "git branch" it will list the branches "master" and "test" and
NEITHER ONE will be flagged * as the current branch! (It does not list
Test and does not show any branch as being the current branch)

As far as git is concerned... you're not on a branch! But you're
secretly on the "test" branch.

Also, I think this may cause further issues. Switching to "test" tells
me that I am 1 commit behind master, but switching to "Test" gives no
such message, and appears to want me to --set-upstream again, for the
same branch.

This is really bad behavior.

I'm on MacOS so this may be unique to the Mac version of git tools.


[PATCH v2 1/3] refs.c: refer to "object ID", not "sha1", in error messages

2018-05-10 Thread Martin Ågren
We have two error messages that complain about the "sha1". Because we
are about to touch one of these sites and add some tests, let's first
modernize the messages to say "object ID" instead.

While at it, make the second one use `error()` instead of `warning()`.
After printing the message, we do not continue, but actually drop the
lock and return -1 without deleting the pseudoref.

Signed-off-by: Martin Ågren 
---
We could make error-reporting more consistent in general in this file,
but I'd rather not lose track of the original goal of this series.

 refs.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index 64aadd14c9..7820a52c4f 100644
--- a/refs.c
+++ b/refs.c
@@ -684,7 +684,8 @@ static int write_pseudoref(const char *pseudoref, const 
struct object_id *oid,
if (read_ref(pseudoref, _old_oid))
die("could not read ref '%s'", pseudoref);
if (oidcmp(_old_oid, old_oid)) {
-   strbuf_addf(err, "unexpected sha1 when writing '%s'", 
pseudoref);
+   strbuf_addf(err, "unexpected object ID when writing 
'%s'",
+   pseudoref);
rollback_lock_file();
goto done;
}
@@ -722,7 +723,8 @@ static int delete_pseudoref(const char *pseudoref, const 
struct object_id *old_o
if (read_ref(pseudoref, _old_oid))
die("could not read ref '%s'", pseudoref);
if (oidcmp(_old_oid, old_oid)) {
-   warning("Unexpected sha1 when deleting %s", pseudoref);
+   error("unexpected object ID when deleting '%s'",
+ pseudoref);
rollback_lock_file();
return -1;
}
-- 
2.17.0



[PATCH v2 0/3] refs: handle zero oid for pseudorefs

2018-05-10 Thread Martin Ågren
On 7 May 2018 at 12:05, Martin Ågren  wrote:
> On 7 May 2018 at 09:39, Michael Haggerty  wrote:
>> Thanks for the patch. This looks good to me. But it it seems that the
>> test coverage related to pseudorefs is still not great. Ideally, all of
>> the following combinations should be tested:
>
> Thank you for your comments. I was not able to find much
> pseudoref-testing. I think what I should do is a patch 1/2 adding the
> tests you outlined (some will be expected failures), then turn this
> patch into a patch 2/2.

Well, it turned into three patches. One to move away from "sha1" in some
error messages before spreading them to the test suite, then one to add
the tests, then one for the actual fix.

Martin

Martin Ågren (3):
  refs.c: refer to "object ID", not "sha1", in error messages
  t1400: add tests around adding/deleting pseudorefs
  refs: handle zero oid for pseudorefs

 t/t1400-update-ref.sh | 60 +++
 refs.c| 22 
 2 files changed, 77 insertions(+), 5 deletions(-)

-- 
2.17.0



[PATCH v2 2/3] t1400: add tests around adding/deleting pseudorefs

2018-05-10 Thread Martin Ågren
I have not been able to find any tests around adding pseudorefs using
`git update-ref`. Add some as outlined in this table (original design by
Michael Haggerty; modified and extended by me):

Pre-update value   | ref-update old OID   | Expected result
---|--|
missing| value| reject
missing| none given   | accept
set| none given   | accept
set| correct value| accept
set| wrong value  | reject
missing| zero | accept *
set| zero | reject *

The tests marked with a * currently fail, despite git-update-ref(1)
claiming that it is possible to "specify 40 '0' or an empty string as
 to make sure that the ref you are creating does not exist."
These failing tests will be fixed in the next commit.

It is only natural to test deletion as well. Test deletion without an
old OID, with a correct one and with an incorrect one.

Suggested-by: Michael Haggerty 
Signed-off-by: Martin Ågren 
---
 t/t1400-update-ref.sh | 60 +++
 1 file changed, 60 insertions(+)

diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index 664a3a4e4e..3996109ba4 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -457,6 +457,66 @@ test_expect_success 'git cat-file blob master@{2005-05-26 
23:42}:F (expect OTHER
test OTHER = $(git cat-file blob "master@{2005-05-26 23:42}:F")
 '
 
+# Test adding and deleting pseudorefs
+
+test_expect_success 'given old value for missing pseudoref, do not create' '
+   test_must_fail git update-ref PSEUDOREF $A $B 2>err &&
+   test_path_is_missing .git/PSEUDOREF &&
+   grep "could not read ref" err
+'
+
+test_expect_success 'create pseudoref' '
+   git update-ref PSEUDOREF $A &&
+   test $A = $(cat .git/PSEUDOREF)
+'
+
+test_expect_success 'overwrite pseudoref with no old value given' '
+   git update-ref PSEUDOREF $B &&
+   test $B = $(cat .git/PSEUDOREF)
+'
+
+test_expect_success 'overwrite pseudoref with correct old value' '
+   git update-ref PSEUDOREF $C $B &&
+   test $C = $(cat .git/PSEUDOREF)
+'
+
+test_expect_success 'do not overwrite pseudoref with wrong old value' '
+   test_must_fail git update-ref PSEUDOREF $D $E 2>err &&
+   test $C = $(cat .git/PSEUDOREF) &&
+   grep "unexpected object ID" err
+'
+
+test_expect_success 'delete pseudoref' '
+   git update-ref -d PSEUDOREF &&
+   test_path_is_missing .git/PSEUDOREF
+'
+
+test_expect_success 'do not delete pseudoref with wrong old value' '
+   git update-ref PSEUDOREF $A &&
+   test_must_fail git update-ref -d PSEUDOREF $B 2>err &&
+   test $A = $(cat .git/PSEUDOREF) &&
+   grep "unexpected object ID" err
+'
+
+test_expect_success 'delete pseudoref with correct old value' '
+   git update-ref -d PSEUDOREF $A &&
+   test_path_is_missing .git/PSEUDOREF
+'
+
+test_expect_failure 'create pseudoref with old OID zero' '
+   git update-ref PSEUDOREF $A $Z &&
+   test $A = $(cat .git/PSEUDOREF)
+'
+
+test_expect_failure 'do not overwrite pseudoref with old OID zero' '
+   test_when_finished git update-ref -d PSEUDOREF &&
+   test_must_fail git update-ref PSEUDOREF $B $Z 2>err &&
+   test $A = $(cat .git/PSEUDOREF) &&
+   grep "already exists" err
+'
+
+# Test --stdin
+
 a=refs/heads/a
 b=refs/heads/b
 c=refs/heads/c
-- 
2.17.0



[PATCH v2 3/3] refs: handle zero oid for pseudorefs

2018-05-10 Thread Martin Ågren
According to the documentation, it is possible to "specify 40 '0' or an
empty string as  to make sure that the ref you are creating
does not exist." But in the code for pseudorefs, we do not implement
this, as demonstrated by the failing tests added in the previous commit.
If we fail to read the old ref, we immediately die. But a failure to
read would actually be a good thing if we have been given the zero oid.

With the zero oid, allow -- and even require -- the ref-reading to fail.
This implements the "make sure that the ref ... does not exist" part of
the documentation and fixes both failing tests from the previous commit.

Since we have a `strbuf err` for collecting errors, let's use it and
signal an error to the caller instead of dying hard.

Reported-by: Rafael Ascensão 
Helped-by: Rafael Ascensão 
Signed-off-by: Martin Ågren 
---
 t/t1400-update-ref.sh |  4 ++--
 refs.c| 16 +---
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index 3996109ba4..faf0dfe993 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -503,12 +503,12 @@ test_expect_success 'delete pseudoref with correct old 
value' '
test_path_is_missing .git/PSEUDOREF
 '
 
-test_expect_failure 'create pseudoref with old OID zero' '
+test_expect_success 'create pseudoref with old OID zero' '
git update-ref PSEUDOREF $A $Z &&
test $A = $(cat .git/PSEUDOREF)
 '
 
-test_expect_failure 'do not overwrite pseudoref with old OID zero' '
+test_expect_success 'do not overwrite pseudoref with old OID zero' '
test_when_finished git update-ref -d PSEUDOREF &&
test_must_fail git update-ref PSEUDOREF $B $Z 2>err &&
test $A = $(cat .git/PSEUDOREF) &&
diff --git a/refs.c b/refs.c
index 7820a52c4f..26af07fc51 100644
--- a/refs.c
+++ b/refs.c
@@ -681,9 +681,19 @@ static int write_pseudoref(const char *pseudoref, const 
struct object_id *oid,
if (old_oid) {
struct object_id actual_old_oid;
 
-   if (read_ref(pseudoref, _old_oid))
-   die("could not read ref '%s'", pseudoref);
-   if (oidcmp(_old_oid, old_oid)) {
+   if (read_ref(pseudoref, _old_oid)) {
+   if (!is_null_oid(old_oid)) {
+   strbuf_addf(err, "could not read ref '%s'",
+   pseudoref);
+   rollback_lock_file();
+   goto done;
+   }
+   } else if (is_null_oid(old_oid)) {
+   strbuf_addf(err, "ref '%s' already exists",
+   pseudoref);
+   rollback_lock_file();
+   goto done;
+   } else if (oidcmp(_old_oid, old_oid)) {
strbuf_addf(err, "unexpected object ID when writing 
'%s'",
pseudoref);
rollback_lock_file();
-- 
2.17.0



Bug: Untracked file deleted by git-stash

2018-05-10 Thread Martin Kunev
I stumbled upon the following issue with git 2.11.0 on Debian 9.

When a tracked file is removed and a directory with the same name is created, 
git-stash would delete the directory with all its contents. No warning is 
displayed and git stores no information about the deleted content (as far as I 
can tell). The following steps can be used to reproduce:

$ mkdir /tmp/bug; cd /tmp/bug
$ git init
Initialized empty Git repository in /tmp/bug/.git/
$ echo 'original file' > entry
$ git add entry
$ git commit -m 'entry added'
[master (root-commit) 483319e] entry added
 1 file changed, 1 insertion(+)
 create mode 100644 entry
$ rm entry
removed 'entry'
$ mkdir entry
$ echo 'data that will be destroyed' > entry/content
$ git status
On branch master
Changes not staged for commit:
deleted:entry

no changes added to commit
$ ls -l
total 4
drwxr-xr-x 2 martin root 4096 May 10 21:16 entry
$ git stash
Saved working directory and index state WIP on master: 483319e entry added
HEAD is now at 483319e entry added

After the stash, the working tree contains only the regular file entry and the 
stash contains no information about the directory or its contents:

$ ls -l
total 4
-rw-r--r-- 1 martin root 5 May 10 21:16 entry
$ git status
On branch master
nothing to commit, working tree clean
$ git stash show -p
diff --git a/entry b/entry
deleted file mode 100644
index 1269488..000
--- a/entry
+++ /dev/null
@@ -1 +0,0 @@
-original file


Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-10 Thread Stefan Beller
On Thu, May 10, 2018 at 12:05 PM, Martin Ågren  wrote:
> On 10 May 2018 at 19:34, Derrick Stolee  wrote:
>
>> Commits 01-07 focus on the 'git commit-graph verify' subcommand. These
>> are ready for full, rigorous review.
>
> I don't know about "full" and "rigorous", but I tried to offer my thoughts.
>
> A lingering feeling I have is that users could possibly benefit from
> seeing "the commit-graph has a bad foo" a bit more than just "the
> commit-graph is bad". But adding "the bar is baz, should have been
> frotz" might not bring that much. Maybe you could keep the translatable
> string somewhat simple, then, if the more technical data could be useful
> to Git developers, dump it in a non-translated format. (I guess it could
> be hidden behind a debug switch, but let's take one step at a time..)
> This is nothing I feel strongly about.
>
>>  t/t5318-commit-graph.sh  |  25 +
>
> I wonder about tests. Some tests seem to use `dd` to corrupt files and
> check that it gets caught. Doing this in a a hash-agnostic way could be
> tricky, but maybe not impossible. I guess we could do something
> probabilistic, like "set the first two bytes of the very last OID to
> zero -- surely all OIDs can't start with 16 zero bits". Hmm, that might
> still require knowing the size of the OIDs...
>
> I hope to find time to do some more hands-on testing of this to see that
> errors actually do get caught.

Given that the commit graph is secondary data, the users work around
to quickly get back to a well working repository is most likely to remove
the file and regenerate it.
As a developer who wants to fix the bug, a stacktrace/datadump and the
history of git commands might be most valuable, but I agree we should
hide that behind a debug flag.

Packfiles and loose objects are primary data, which means that those
need a more advanced way to diagnose and repair them, so I would imagine
the commit graph fsck is closer to bitmaps fsck, which I would have suspected
to be found in t5310, but a quick read doesn't reveal many tests that are
checking for integrity. So I guess the test coverage here is ok, (although we
should always ask for more)

Thanks,
Stefan


Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-10 Thread Ævar Arnfjörð Bjarmason

On Thu, May 10 2018, Derrick Stolee wrote:

> The behavior in this patch series does the following:
>
> 1. Near the end of 'git gc', run 'git commit-graph write'. The location
>of this code assumes that a 'git gc --auto' has not terminated early
>due to not meeting the auto threshold.
>
> 2. At the end of 'git fetch', run 'git commit-graph write'. This means
>that every reachable commit will be in the commit-graph after a
>a successful fetch, which seems a reasonable frequency. Then, the
>only times we would be missing a reachable commit is after creating
>one locally. There is a problem with the current patch, though: every
>'git fetch' call runs 'git commit-graph write', even if there were no
>ref updates or objects downloaded. Is there a simple way to detect if
>the fetch was non-trivial?
>
> One obvious problem with this approach: if we compute this during 'gc'
> AND 'fetch', there will be times where a 'fetch' calls 'gc' and triggers
> two commit-graph writes. If I were to abandon one of these patches, it
> would be the 'fetch' integration. A 'git gc' really wants to delete all
> references to unreachable commits, and without updating the commit-graph
> we may still have commit data in the commit-graph file that is not in
> the object database. In fact, deleting commits from the object database
> but not from the commit-graph will cause 'git commit-graph verify' to
> fail!
>
> I welcome discussion on these ideas, as we are venturing out of the
> "pure data structure" world and into the "user experience" world. I am
> less confident in my skills in this world, but the feature is worthless
> if it does not improve the user experience.

I really like #1 here, but I wonder why #2 is necessary.

I.e. is it critical for the performance of the commit graph feature that
it be kept really up-to-date, moreso than other things that rely on gc
--auto (e.g. the optional bitmap index)?

Even if that's the case, I think something that does this via gc --auto
is a much better option. I.e. now we have gc.auto & gc.autoPackLimit, if
the answer to my question above is "yes" this could also be accomplished
by introducing a new graph-specific gc.* setting, and --auto would just
update the graph more often, but leave the rest.


[RFC PATCH v2] Implement --first-parent for git rev-list --bisect.

2018-05-10 Thread Tiago Botelho
This will enable users to implement bisecting on first parents
which can be useful for when the commits from a feature branch
that we want to merge are not always tested.

Signed-off-by: Tiago Botelho 
---

This patch is based on pu so that it can be on top of hn/bisect-first-parent,
tests will still need to be developed for this functionality.

 bisect.c   | 53 +++--
 bisect.h   |  1 +
 builtin/rev-list.c |  3 +++
 3 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/bisect.c b/bisect.c
index 4eafc8262..f43713574 100644
--- a/bisect.c
+++ b/bisect.c
@@ -34,7 +34,7 @@ static const char *term_good;
  * We care just barely enough to avoid recursing for
  * non-merge entries.
  */
-static int count_distance(struct commit_list *entry)
+static int count_distance(struct commit_list *entry, unsigned bisect_flags)
 {
int nr = 0;
 
@@ -49,10 +49,10 @@ static int count_distance(struct commit_list *entry)
commit->object.flags |= COUNTED;
p = commit->parents;
entry = p;
-   if (p) {
+   if (p && !(bisect_flags & BISECT_FIRST_PARENT)) {
p = p->next;
while (p) {
-   nr += count_distance(p);
+   nr += count_distance(p, bisect_flags);
p = p->next;
}
}
@@ -82,15 +82,16 @@ static inline void weight_set(struct commit_list *elem, int 
weight)
*((int*)(elem->item->util)) = weight;
 }
 
-static int count_interesting_parents(struct commit *commit)
+static int count_interesting_parents(struct commit *commit, unsigned 
bisect_flags)
 {
struct commit_list *p;
int count;
 
for (count = 0, p = commit->parents; p; p = p->next) {
-   if (p->item->object.flags & UNINTERESTING)
-   continue;
-   count++;
+   if (!(p->item->object.flags & UNINTERESTING))
+   count++;
+   if (bisect_flags & BISECT_FIRST_PARENT)
+   break;
}
return count;
 }
@@ -117,10 +118,10 @@ static inline int halfway(struct commit_list *p, int nr)
 }
 
 #if !DEBUG_BISECT
-#define show_list(a,b,c,d) do { ; } while (0)
+#define show_list(a,b,c,d,e) do { ; } while (0)
 #else
 static void show_list(const char *debug, int counted, int nr,
- struct commit_list *list)
+ struct commit_list *list, unsigned bisect_flags)
 {
struct commit_list *p;
 
@@ -146,10 +147,14 @@ static void show_list(const char *debug, int counted, int 
nr,
else
fprintf(stderr, "---");
fprintf(stderr, " %.*s", 8, oid_to_hex(>object.oid));
-   for (pp = commit->parents; pp; pp = pp->next)
+   for (pp = commit->parents; pp; pp = pp->next) {
fprintf(stderr, " %.*s", 8,
oid_to_hex(>item->object.oid));
 
+   if (bisect_flags & BISECT_FIRST_PARENT)
+   break;
+   }
+
subject_len = find_commit_subject(buf, _start);
if (subject_len)
fprintf(stderr, " %.*s", subject_len, subject_start);
@@ -267,13 +272,13 @@ static struct commit_list *do_find_bisection(struct 
commit_list *list,
unsigned flags = commit->object.flags;
 
p->item->util = [n++];
-   switch (count_interesting_parents(commit)) {
+   switch (count_interesting_parents(commit, bisect_flags)) {
case 0:
if (!(flags & TREESAME)) {
weight_set(p, 1);
counted++;
show_list("bisection 2 count one",
- counted, nr, list);
+ counted, nr, list, bisect_flags);
}
/*
 * otherwise, it is known not to reach any
@@ -289,7 +294,7 @@ static struct commit_list *do_find_bisection(struct 
commit_list *list,
}
}
 
-   show_list("bisection 2 initialize", counted, nr, list);
+   show_list("bisection 2 initialize", counted, nr, list, bisect_flags);
 
/*
 * If you have only one parent in the resulting set
@@ -310,7 +315,7 @@ static struct commit_list *do_find_bisection(struct 
commit_list *list,
continue;
if (weight(p) != -2)
continue;
-   weight_set(p, count_distance(p));
+   weight_set(p, count_distance(p, bisect_flags));
clear_distance(list);
 
/* Does it happen to be at exactly 

Re: [PATCH v2] add status config and command line options for rename detection

2018-05-10 Thread Ben Peart



On 5/10/2018 12:19 PM, Elijah Newren wrote:

Hi Ben,

On Thu, May 10, 2018 at 7:16 AM, Ben Peart  wrote:

After performing a merge that has conflicts, git status will by default attempt
to detect renames which causes many objects to be examined.  In a virtualized
repo, those objects do not exist locally so the rename logic triggers them to be
fetched from the server. This results in the status call taking hours to
complete on very large repos.  Even in a small repo (the GVFS repo) turning off
break and rename detection has a significant impact:


It'd be nice if you could show that impact by comparing 'git status'
to 'git status --no-renames', for some repo.  Showing only the latter
gives us no way to assess the impact.



Given the example perf impact is arbitrary (the actual example that 
triggered this patch took status from 2+ hours to seconds) and can't be 
replicated using the current performance tools in git, I'm just going 
drop the specific numbers.  I believe the patch is worth while just to 
give users the flexibility to control these behaviors.



git status --no-renames:
31 secs., 105 loose object downloads

git status --no-breaks
7 secs., 17 loose object downloads

git status --no-breaks --no-renames
1 sec., 1 loose object download


This patch doesn't add a --no-breaks option and it doesn't exist
previously, so adding it to the commit message serves to confuse
rather than help.  I'd just drop the last two of these (and redo the
timing for --no-renames assuming you are built on
em/status-rename-config).



OK


Add a new config status.renames setting to enable turning off rename detection
during status.  This setting will default to the value of diff.renames.

Add a new config status.renamelimit setting to to enable bounding the time spent
finding out inexact renames during status.  This setting will default to the
value of diff.renamelimit.


It may be worth mentioning that these config settings also affect 'git
commit' (and it does, in my testing, which I think is a good thing).



I agree this is a good thing as the other status settings behave the 
same way.  I'll update the documentation to reflect this as well.



Add status --no-renames command line option that enables overriding the config
setting from the command line. Add --find-renames[=] to enable detecting
renames and optionally setting the similarity index from the command line.


The command line options are specific to 'git status'.  I don't really
have a strong opinion on whether they should also be added to
git-commit; I suspect users would be more likely to use the config
options in order to set it once and forget about it and that users
would be more likely to want to override their config setting for
status than for commit.


Note: I removed the --no-breaks command line option from the original patch as
it will no longer be needed once the default has been changed [1] to turn it 
off.

[1] 
https://public-inbox.org/git/20180430093421.27551-2-eckhard.s.ma...@gmail.com/


I'd just drop these lines from the commit message, and instead mention
that your patch depends on em/status-rename-config.



OK


+   if ((intptr_t)rename_score_arg != -1) {
+   s.detect_rename = DIFF_DETECT_RENAME;


I'd still prefer this was a
 if (s.detect_rename < DIFF_DETECT_RENAME)
 s.detect_rename = DIFF_DETECT_RENAME;

If a user specifies they are willing to pay for copy detection, but
then just passes --find-renames=40% because they want to find more
renames, it seems odd to disable copy detection to me.



I agree and will change it. It is unfortunate this will behave 
differently than it does with merge.  Fixing the merge behavior to match 
is outside the scope of this patch.



+++ b/t/t7525-status-rename.sh


Testcases look good.  It'd be nice to also add a few testcases where
copy detection is turned on -- in particular, I'd like to see one with
--find-renames=$DIFFERENT_THAN_DEFAULT being passed when
merge.renames=copies.



OK.  I also added tests to verify the settings correctly impact commit.




+test_expect_success 'setup' '
+   echo 1 >original &&
+   git add . &&
+   git commit -m"Adding original file." &&
+   mv original renamed &&
+   echo 2 >> renamed &&
+   git add .
+'




+cat >.gitignore <<\EOF
+.gitignore
+expect*
+actual*
+EOF


Can this just be included in the setup?



OK



Everything else in the patch looked good to me.



Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-10 Thread Martin Ågren
On 10 May 2018 at 19:34, Derrick Stolee  wrote:

> Commits 01-07 focus on the 'git commit-graph verify' subcommand. These
> are ready for full, rigorous review.

I don't know about "full" and "rigorous", but I tried to offer my thoughts.

A lingering feeling I have is that users could possibly benefit from
seeing "the commit-graph has a bad foo" a bit more than just "the
commit-graph is bad". But adding "the bar is baz, should have been
frotz" might not bring that much. Maybe you could keep the translatable
string somewhat simple, then, if the more technical data could be useful
to Git developers, dump it in a non-translated format. (I guess it could
be hidden behind a debug switch, but let's take one step at a time..)
This is nothing I feel strongly about.

>  t/t5318-commit-graph.sh  |  25 +

I wonder about tests. Some tests seem to use `dd` to corrupt files and
check that it gets caught. Doing this in a a hash-agnostic way could be
tricky, but maybe not impossible. I guess we could do something
probabilistic, like "set the first two bytes of the very last OID to
zero -- surely all OIDs can't start with 16 zero bits". Hmm, that might
still require knowing the size of the OIDs...

I hope to find time to do some more hands-on testing of this to see that
errors actually do get caught.

I saw you redirect stdout to a file "output", and anticipated later
commits to actually look into it. I never saw that though. (I did not
apply the patches, so I could have missed something.)

Martin


Re: Is rebase --force-rebase any different from rebase --no-ff?

2018-05-10 Thread Ilya Kantor
Hi,

If that's indeed true (as far as I could see that, still can be
mistaken), then as a git user, not developer, I'd stick to --no-ff,
because it's the more intuitive naming.

Just 5¢.
---
Best Regards,
Ilya Kantor


On Thu, May 10, 2018 at 9:34 PM, Marc Branchaud  wrote:
> On 2018-05-09 03:46 PM, Ilya Kantor wrote:
>>
>> I tried to compare --force-rebase VS --no-ff for the following repository:
>> http://jmp.sh/E7TRjcL
>>
>> There's no difference in the resulf of:
>> git rebase --force-rebase 54a4
>> git rebase --no-ff 54a4
>>
>> (rebases all 3 commits of feature)
>>
>> Also, there's no difference in interactive mode:
>> git rebase --force-rebase -i 54a4
>> git rebase --no-ff -i 54a4
>>
>> (picks all 3 commits of feature)
>>
>> Is there a case where --no-ff differs from --force-rebase?
>
>
> So now that "rebase -i" respects --force-rebase, the question is what to do
> about it:
>
> 1. Teach "rebase -i" to stop respecting --force-rebase (restoring the
> original intent when --no-ff was introduced)?
>
> 2. Deprecate --no-ff?
>
> 3. Deprecate --force-rebase?
>
> As a heavy rebase user, I find --no-ff more intuitive than --force-rebase.
> I'd be in favour of option 3, and keeping just --no-ff (with -f as a
> synonym).
>
> M.
>
>
>> ---
>> Best Regards,
>> Ilya Kantor
>>
>>
>> On Wed, May 9, 2018 at 10:27 PM, Marc Branchaud 
>> wrote:
>>>
>>> On 2018-05-09 02:21 PM, Stefan Beller wrote:


 +cc Marc and Johannes who know more about rebase.

 On Wed, May 9, 2018 at 9:01 AM, Ilya Kantor  wrote:
>
>
> Right now in "git help rebase" for --no-ff:
> "Without --interactive, this is a synonym for --force-rebase."
>
> But *with* --interactive, is there any difference?



 I found


 https://code.googlesource.com/git/+/b499549401cb2b1f6c30d09681380fd519938eb0
 from 2010-03-24
>>>
>>>
>>>
>>> In the original discussion around this option [1], at one point I
>>> proposed
>>> teaching rebase--interactive to respect --force-rebase instead of adding
>>> a
>>> new option [2].  Ultimately --no-ff was chosen as the better user
>>> interface
>>> design [3], because an interactive rebase can't be "forced" to run.
>>>
>>> At the time, I think rebase--interactive only recognized --no-ff.  That
>>> might have been muddled a bit in the migration to rebase--helper.c.
>>>
>>> Looking at it now, I don't have a strong opinion about keeping both
>>> options
>>> or deprecating one of them.
>>>
>>>  M.
>>>
>>> [1] https://public-inbox.org/git/4b9fd9c1.9060...@xiplink.com/t/
>>> [2]
>>>
>>> https://public-inbox.org/git/1269361187-31291-1-git-send-email-marcn...@xiplink.com/
>>> [3] https://public-inbox.org/git/7vzl1yd5j4@alter.siamese.dyndns.org/
>>>
>>>
   Teach rebase the --no-ff option.

   For git-rebase.sh, --no-ff is a synonym for --force-rebase.

   For git-rebase--interactive.sh, --no-ff cherry-picks all the
 commits
 in
   the rebased branch, instead of fast-forwarding over any unchanged
 commits.

   --no-ff offers an alternative way to deal with reverted merges.
 Instead of
   "reverting the revert" you can use "rebase --no-ff" to recreate
 the
 branch
   with entirely new commits (they're new because at the very least
 the
   committer time is different).  This obviates the need to revert
 the
   reversion, as you can re-merge the new topic branch directly.
 Added
 an
   addendum to revert-a-faulty-merge.txt describing the situation and
 how to
   use --no-ff to handle it.

 which sounds as if there is?

 Stefan

>>>
>


Re: [PATCH 1/1] Warn about fast-forwarding of submodules during merge

2018-05-10 Thread Stefan Beller
On Thu, May 10, 2018 at 11:26 AM, Leif Middelschulte
 wrote:
> From: Leif Middelschulte 

Hi Leif!

thanks for following up with a patch!

> Warn the user about an automatically fast-forwarded submodule. The silent 
> merge
> behavior was introduced by commit 68d03e4a6e44 ("Implement automatic 
> fast-forward
> merge for submodules", 2010-07-07)).
>
> Signed-off-by: Leif Middelschulte 
> ---
>  submodule.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/submodule.c b/submodule.c
> index 74d35b257..0198a72e6 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -1817,10 +1817,12 @@ int merge_submodule(struct object_id *result, const 
> char *path,
> /* Case #1: a is contained in b or vice versa */
> if (in_merge_bases(commit_a, commit_b)) {
> oidcpy(result, b);
> +   warning("Fast-forwarding submodule %s", path);
> return 1;
> }
> if (in_merge_bases(commit_b, commit_a)) {
> oidcpy(result, a);
> +   warning("Fast-forwarding submodule %s", path);
> return 1;
> }

The code looks correct, however I think we can improve it.
(Originally I was just wondering if stderr is the right output,
which lead me to the thoughts below:)

Looking through the code of merge-recursive.c,
all the other merge outputs are done via 'output()'
that is able to buffer up the output as well as handles
the output for different verbosity settings.

So I would think we should make the output() function available
outside of merge-recursive.c. (and rename it to a be more concise
and descriptive in the global namespace) and make use of it.

Funnily we already have MERGE_WARNING in submodule.c
which outputs information for all the other cases. I would think
we ought to convert those to the output(), too.

Thanks,
Stefan


Re: Is rebase --force-rebase any different from rebase --no-ff?

2018-05-10 Thread Marc Branchaud

On 2018-05-09 03:46 PM, Ilya Kantor wrote:

I tried to compare --force-rebase VS --no-ff for the following repository:
http://jmp.sh/E7TRjcL

There's no difference in the resulf of:
git rebase --force-rebase 54a4
git rebase --no-ff 54a4

(rebases all 3 commits of feature)

Also, there's no difference in interactive mode:
git rebase --force-rebase -i 54a4
git rebase --no-ff -i 54a4

(picks all 3 commits of feature)

Is there a case where --no-ff differs from --force-rebase?


So now that "rebase -i" respects --force-rebase, the question is what to 
do about it:


1. Teach "rebase -i" to stop respecting --force-rebase (restoring the 
original intent when --no-ff was introduced)?


2. Deprecate --no-ff?

3. Deprecate --force-rebase?

As a heavy rebase user, I find --no-ff more intuitive than 
--force-rebase.  I'd be in favour of option 3, and keeping just --no-ff 
(with -f as a synonym).


M.



---
Best Regards,
Ilya Kantor


On Wed, May 9, 2018 at 10:27 PM, Marc Branchaud  wrote:

On 2018-05-09 02:21 PM, Stefan Beller wrote:


+cc Marc and Johannes who know more about rebase.

On Wed, May 9, 2018 at 9:01 AM, Ilya Kantor  wrote:


Right now in "git help rebase" for --no-ff:
"Without --interactive, this is a synonym for --force-rebase."

But *with* --interactive, is there any difference?



I found

https://code.googlesource.com/git/+/b499549401cb2b1f6c30d09681380fd519938eb0
from 2010-03-24



In the original discussion around this option [1], at one point I proposed
teaching rebase--interactive to respect --force-rebase instead of adding a
new option [2].  Ultimately --no-ff was chosen as the better user interface
design [3], because an interactive rebase can't be "forced" to run.

At the time, I think rebase--interactive only recognized --no-ff.  That
might have been muddled a bit in the migration to rebase--helper.c.

Looking at it now, I don't have a strong opinion about keeping both options
or deprecating one of them.

 M.

[1] https://public-inbox.org/git/4b9fd9c1.9060...@xiplink.com/t/
[2]
https://public-inbox.org/git/1269361187-31291-1-git-send-email-marcn...@xiplink.com/
[3] https://public-inbox.org/git/7vzl1yd5j4@alter.siamese.dyndns.org/



  Teach rebase the --no-ff option.

  For git-rebase.sh, --no-ff is a synonym for --force-rebase.

  For git-rebase--interactive.sh, --no-ff cherry-picks all the commits
in
  the rebased branch, instead of fast-forwarding over any unchanged
commits.

  --no-ff offers an alternative way to deal with reverted merges.
Instead of
  "reverting the revert" you can use "rebase --no-ff" to recreate the
branch
  with entirely new commits (they're new because at the very least the
  committer time is different).  This obviates the need to revert the
  reversion, as you can re-merge the new topic branch directly.  Added
an
  addendum to revert-a-faulty-merge.txt describing the situation and
how to
  use --no-ff to handle it.

which sounds as if there is?

Stefan





Re: [PATCH 2/2] replace-object.c: remove the_repository from prepare_replace_object

2018-05-10 Thread Stefan Beller
> Using `git grep` I see 230 instances of 'xmalloc' and 261 instances of
> 'xcalloc'. After the Coccinelle transformation, these are down to 194 and
> 190, respectively, because the rest allocate in the same line as the
> definition. It's worth thinking about the macro pattern for those cases.

Thanks for reporting the coccinelle experiment!

As we follow a strict declare before code, and we do not know if further
declarations make use of this already, e.g. given

struct foo *f = xmalloc(sizeof(*f));
struct bar b = >baz;

we cannot split up the line declaring and assigning f, but the macro
has to recreate the assignment upon declaration, for that we'd
need to have something like

ALLOCATE_TYPE(type, name);

which over complicates things IMHO.

Maybe it is worth identifying the pattern where 'f' is not used in further
declarations, such that we can make patches as

-struct foo *f = xmalloc(sizeof(*f));
+   struct foo *f;
struct baz b = 
+
+ ALLOCATE(f);
+

Thanks,
Stefan


Re: bug: SHA1 calculation on PPC machines when built with gcc older than 4.6

2018-05-10 Thread Ævar Arnfjörð Bjarmason
On Thu, May 10, 2018 at 8:11 PM, Ken Cunningham
 wrote:
> Some vintage Apple PPC machines build a non-funtional version of git as of 
> git 13.1 when using the stock gcc compilers that are installed with the OS; 
> the SHA1 calculations are faulty. This can be repaired with a simple patch 
> (attached).
>
>
> Stock vintage Apple PPC machines come with gcc-4.0 or gcc-4.2. On MacOS 10.4 
> and earlier, or when not using Apple Common Crypto on 10.5, git uses the SHA1 
> calculation code from here 
> . The code in 
> 
>  tries to detect all systems that are BIG_ENDIAN, but the above noted systems 
> fall through because they fail the tests.
>
> It appears that the primary test:
>
> #if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)
>
> only works as of gcc-4.6 and newer, so the code is built as LITTLE_ENDIAN on 
> PPC with older gcc versions.
>
>
> Issue report:
>
> 
>
>
> MacPorts bug report:
>
> 
>
>
> The included patch to git fixes the issue on our testing.
>
> Thanks for git!
>
> Ken Cunningham
>
>
>
>
> =
>
> diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
> index 25eded1..5faf5a5 100644
> --- a/sha1dc/sha1.c
> +++ b/sha1dc/sha1.c
> @@ -92,6 +92,10 @@
>   */
>  #define SHA1DC_BIGENDIAN
>
> +#elif (defined(__APPLE__) && defined(__BIG_ENDIAN__) && 
> !defined(SHA1DC_BIGENDIAN))
> +/* older gcc compilers which are the default  on Apple PPC do not define 
> __BYTE_ORDER__ */
> +#define SHA1DC_BIGENDIAN
> +
>  /* Not under GCC-alike or glibc or *BSD or newlib or  */
>  #elif defined(SHA1DC_ON_INTEL_LIKE_PROCESSOR)
>  /*

Thanks. As noted in
https://public-inbox.org/git/87603xxc3k@evledraar.gmail.com/
patches like this should be sent to the upstream, it appears you just
opened an issue there, but sent the patch here. Could you open a PR
with upstream with this patch?


Re: [PATCH 04/12] commit-graph: verify fanout and lookup table

2018-05-10 Thread Martin Ågren
On 10 May 2018 at 19:34, Derrick Stolee  wrote:
> While running 'git commit-graph verify', verify that the object IDs
> are listed in lexicographic order and that the fanout table correctly
> navigates into that list of object IDs.
>
> Signed-off-by: Derrick Stolee 
> ---
>  commit-graph.c | 33 +
>  1 file changed, 33 insertions(+)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index ce11af1d20..b4c146c423 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -839,6 +839,9 @@ static int verify_commit_graph_error;
>
>  int verify_commit_graph(struct commit_graph *g)
>  {
> +   uint32_t i, cur_fanout_pos = 0;
> +   struct object_id prev_oid, cur_oid;
> +
> if (!g) {
> graph_report(_("no commit-graph file loaded"));
> return 1;
> @@ -853,5 +856,35 @@ int verify_commit_graph(struct commit_graph *g)
> if (!g->chunk_commit_data)
> graph_report(_("commit-graph is missing the Commit Data 
> chunk"));
>
> +   for (i = 0; i < g->num_commits; i++) {
> +   hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
> +
> +   if (i > 0 && oidcmp(_oid, _oid) >= 0)
> +   graph_report(_("commit-graph has incorrect oid order: 
> %s then %s"),

Minor: I think our style would prefer s/i > 0/i/.

I suppose the second check should be s/>=/>/, but it's not like it
should matter. ;-)

I wonder if this is a message that would virtually never make sense to a
user, but more to a developer. Leave it untranslated to make sure any
bug reports to the list are readable to us?

> +
> +   oid_to_hex(_oid),
> +   oid_to_hex(_oid));

Hmm, these two lines do not actually achieve anything?

> +   oidcpy(_oid, _oid);
> +
> +   while (cur_oid.hash[0] > cur_fanout_pos) {
> +   uint32_t fanout_value = get_be32(g->chunk_oid_fanout 
> + cur_fanout_pos);
> +   if (i != fanout_value)
> +   graph_report(_("commit-graph has incorrect 
> fanout value: fanout[%d] = %u != %u"),
> +cur_fanout_pos, fanout_value, i);

Same though re `_()`, even more so because of the more technical
notation.

> +
> +   cur_fanout_pos++;
> +   }
> +   }
> +
> +   while (cur_fanout_pos < 256) {
> +   uint32_t fanout_value = get_be32(g->chunk_oid_fanout + 
> cur_fanout_pos);
> +
> +   if (g->num_commits != fanout_value)
> +   graph_report(_("commit-graph has incorrect fanout 
> value: fanout[%d] = %u != %u"),
> +cur_fanout_pos, fanout_value, i);

Same here. Or maybe these should just give a translated user-readable
basic idea of what is wrong and skip the details?

Martin


[PATCH 1/1] Warn about fast-forwarding of submodules during merge

2018-05-10 Thread Leif Middelschulte
From: Leif Middelschulte 

Warn the user about an automatically fast-forwarded submodule. The silent merge
behavior was introduced by commit 68d03e4a6e44 ("Implement automatic 
fast-forward
merge for submodules", 2010-07-07)).

Signed-off-by: Leif Middelschulte 
---
 submodule.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/submodule.c b/submodule.c
index 74d35b257..0198a72e6 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1817,10 +1817,12 @@ int merge_submodule(struct object_id *result, const 
char *path,
/* Case #1: a is contained in b or vice versa */
if (in_merge_bases(commit_a, commit_b)) {
oidcpy(result, b);
+   warning("Fast-forwarding submodule %s", path);
return 1;
}
if (in_merge_bases(commit_b, commit_a)) {
oidcpy(result, a);
+   warning("Fast-forwarding submodule %s", path);
return 1;
}
 
-- 
2.15.1 (Apple Git-101)



[PATCH 0/1] warn about auto fast-forwarded submodules during merges

2018-05-10 Thread Leif Middelschulte
From: Leif Middelschulte 

Warn the user during merges about automatically fast-forwarded submodules.
This is just informational and does *not* change behavior otherwise.

It is a follow up to Elijah Newren's suggestion[0] to provide the attached 
patch.

[0] https://marc.info/?l=git=152544498723355=2

Leif Middelschulte (1):
  Warn about fast-forwarding of submodules during merge

 submodule.c | 2 ++
 1 file changed, 2 insertions(+)

-- 
2.15.1 (Apple Git-101)



Re: [PATCH 02/12] commit-graph: verify file header information

2018-05-10 Thread Martin Ågren
On 10 May 2018 at 19:34, Derrick Stolee  wrote:
> During a run of 'git commit-graph verify', list the issues with the
> header information in the commit-graph file. Some of this information
> is inferred from the loaded 'struct commit_graph'. Some header
> information is checked as part of load_commit_graph_one().
>
> Signed-off-by: Derrick Stolee 
> ---
>  commit-graph.c | 23 ++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index b25aaed128..c3b8716c14 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -818,7 +818,28 @@ void write_commit_graph(const char *obj_dir,
> oids.nr = 0;
>  }
>
> +static int verify_commit_graph_error;
> +#define graph_report(...) \
> +   do {\
> +   verify_commit_graph_error = 1;\
> +   printf(__VA_ARGS__);\
> +   } while (0);
> +

It seems to me that other users of __VA_ARGS__ are protected with a
check for HAVE_VARIADIC_MACROS and provide an alternative
non-__VA_ARGS__-implementation. Or maybe I've missed something in my
grepping and we are actually (slowly) moving towards assuming
__VA_ARGS__ is always available?

>  int verify_commit_graph(struct commit_graph *g)
>  {
> -   return !g;
> +   if (!g) {
> +   graph_report(_("no commit-graph file loaded"));
> +   return 1;
> +   }
> +
> +   verify_commit_graph_error = 0;
> +
> +   if (!g->chunk_oid_fanout)
> +   graph_report(_("commit-graph is missing the OID Fanout 
> chunk"));
> +   if (!g->chunk_oid_lookup)
> +   graph_report(_("commit-graph is missing the OID Lookup 
> chunk"));
> +   if (!g->chunk_commit_data)
> +   graph_report(_("commit-graph is missing the Commit Data 
> chunk"));
> +
> +   return verify_commit_graph_error;

If you can't rely on __VA_ARGS__, maybe bite the bullet and introduce
braces... The expanded code wouldn't be too horrible, albeit a bit
repetitive.

Martin


Re: [PATCH 01/12] commit-graph: add 'verify' subcommand

2018-05-10 Thread Martin Ågren
On 10 May 2018 at 19:34, Derrick Stolee  wrote:
> In case the commit-graph file becomes corrupt, we need a way to
> verify its contents match the object database. In the manner of

s/verify its/verify that its/ might read better.

> 'git fsck' we will implement a 'git commit-graph verify' subcommand
> to report all issues with the file.
>
> Add the 'verify' subcommand to the 'commit-graph' builtin and its
> documentation. The subcommand is currently a no-op except for
> loading the commit-graph into memory, which may trigger run-time
> errors that would be caught by normal use. Add a simple test that
> ensures the command returns a zero error code.
>
> If no commit-graph file exists, this is an acceptable state. Do
> not report any errors.

This all makes sense to me.

> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt

> +'verify'::
> +
> +Read the commit-graph file and verify its contents against the object
> +database. Used to verify for corrupted data.

s/verify for/check for/?

> --- a/builtin/commit-graph.c
> +++ b/builtin/commit-graph.c
> @@ -7,11 +7,17 @@
>
>  static char const * const builtin_commit_graph_usage[] = {
> N_("git commit-graph [--object-dir ]"),
> +   N_("git commit-graph verify [--object-dir ]"),
> N_("git commit-graph read [--object-dir ]"),
> N_("git commit-graph write [--object-dir ] [--append] 
> [--stdin-packs|--stdin-commits]"),

Minor nit: In the man-page, you added verify after read, which makes
more sense I think (r < v < w).

(I also note that the man-page synopsis doesn't give the no-subcommand
usage.)

> +static int graph_verify(int argc, const char **argv)
> +{
> +   struct commit_graph *graph = 0;
> +   char *graph_name;
> +
> +   static struct option builtin_commit_graph_verify_options[] = {
> +   OPT_STRING(0, "object-dir", _dir,
> +  N_("dir"),
> +  N_("The object directory to store the graph")),
> +   OPT_END(),
> +   };
> +
> +   argc = parse_options(argc, argv, NULL,
> +builtin_commit_graph_verify_options,
> +builtin_commit_graph_verify_usage, 0);
> +
> +   if (!opts.obj_dir)
> +   opts.obj_dir = get_object_directory();
> +
> +   graph_name = get_commit_graph_filename(opts.obj_dir);
> +   graph = load_commit_graph_one(graph_name);
> +
> +   if (!graph)
> +   return 0;
> +   FREE_AND_NULL(graph_name);

Maybe the FREE_AND_NULL could go immediately after the call to
`load_commit_graph_one()`. It makes it more obvious that you're done
with the name, and -- perhaps more importantly -- means that throwing a
leak-checker at this won't complain if we take the early return.

> +
> +   return verify_commit_graph(graph);

A leak-checker would still complain about leaking `graph`. I think it
would be ok to just UNLEAK it before calling `verify_commit_graph()`.
This is IMHO close enough to returning from `cmd_commit_graph()` to make
UNLEAK an acceptable, or even the correct, solution.

I realize that `graph_read()` is doing something similar to this patch
already, so what you have here is certainly the most consistent code.

Martin


bug: SHA1 calculation on PPC machines when built with gcc older than 4.6

2018-05-10 Thread Ken Cunningham
Some vintage Apple PPC machines build a non-funtional version of git as of git 
13.1 when using the stock gcc compilers that are installed with the OS; the 
SHA1 calculations are faulty. This can be repaired with a simple patch 
(attached).


Stock vintage Apple PPC machines come with gcc-4.0 or gcc-4.2. On MacOS 10.4 
and earlier, or when not using Apple Common Crypto on 10.5, git uses the SHA1 
calculation code from here 
. The code in 

 tries to detect all systems that are BIG_ENDIAN, but the above noted systems 
fall through because they fail the tests.

It appears that the primary test:

#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)

only works as of gcc-4.6 and newer, so the code is built as LITTLE_ENDIAN on 
PPC with older gcc versions.


Issue report:




MacPorts bug report:




The included patch to git fixes the issue on our testing.

Thanks for git!

Ken Cunningham




=

diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
index 25eded1..5faf5a5 100644
--- a/sha1dc/sha1.c
+++ b/sha1dc/sha1.c
@@ -92,6 +92,10 @@
  */
 #define SHA1DC_BIGENDIAN
 
+#elif (defined(__APPLE__) && defined(__BIG_ENDIAN__) && 
!defined(SHA1DC_BIGENDIAN))
+/* older gcc compilers which are the default  on Apple PPC do not define 
__BYTE_ORDER__ */
+#define SHA1DC_BIGENDIAN
+
 /* Not under GCC-alike or glibc or *BSD or newlib or  */
 #elif defined(SHA1DC_ON_INTEL_LIKE_PROCESSOR)
 /*

==




Re: Regression in patch add?

2018-05-10 Thread Phillip Wood
On 10/05/18 15:11, Oliver Joseph Ash wrote:
> You found the problem Phillip! My editor was trimming trailing white space, 
> which breaks the context line.

I'm glad we found the source of the problem (and that it wasn't some
obscure bug)

> I had tried to use an alternative editor to account for any editor specific 
> behaviour, but it turns out both the editors I tested in were doing this!
> 
> I suspect this change in behaviour will effect a lot of users? If so, it 
> would be good if `git add -p` allowed for this behaviour, in the same way 
> `git apply` does.

Yes, I think it probably makes sense to do that. Originally I didn't
count empty lines as context lines in case the user accidentally added
some empty lines at the end of the hunk but if 'git apply' does then I
think 'git add -p' should as well

> Meanwhile, I can easily configure my editor not to do this for `*.diff` files.
> 
> Thanks for your help, Phillip and Martin!

Thanks for posting an example so we could test it, it makes it much
easier to track the problem down

Best Wishes

Phillip

> Mahmoud, does this also explain your problem as per your original post?
> 



Re: [GSoC] Info: Week 02 Blog Post

2018-05-10 Thread Stefan Beller
Hi Pratik,

On Wed, May 9, 2018 at 8:07 PM, Pratik Karki  wrote:
> Hi,
>
> The week 02 blog post[1] is live. This post is part I out of II and this
> time it will be biweekly. The part II of will come in 2-3 days which
> will describe the current `git-rebase.sh`.

Cool post!

> Do give me feedback.

In "and you’re choice of DVCS is Git.", s/you're/your/


[PATCH v2] commit-graph: fix UX issue when .lock file exists

2018-05-10 Thread Derrick Stolee
We use the lockfile API to avoid multiple Git processes from writing to
the commit-graph file in the .git/objects/info directory. In some cases,
this directory may not exist, so we check for its existence.

The existing code does the following when acquiring the lock:

1. Try to acquire the lock.
2. If it fails, try to create the .git/object/info directory.
3. Try to acquire the lock, failing if necessary.

The problem is that if the lockfile exists, then the mkdir fails, giving
an error that doesn't help the user:

  "fatal: cannot mkdir .git/objects/info: File exists"

While technically this honors the lockfile, it does not help the user.

Instead, do the following:

1. Check for existence of .git/objects/info; create if necessary.
2. Try to acquire the lock, failing if necessary.

The new output looks like:

  fatal: Unable to create
  '/.git/objects/info/commit-graph.lock': File exists.

  Another git process seems to be running in this repository, e.g.
  an editor opened by 'git commit'. Please make sure all processes
  are terminated then try again. If it still fails, a git process
  may have crashed in this repository earlier:
  remove the file manually to continue.

Helped-by: Jeff King 
Signed-off-by: Derrick Stolee 
---
 commit-graph.c | 22 +-
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index a8c337dd77..bb54c1214c 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "config.h"
+#include "dir.h"
 #include "git-compat-util.h"
 #include "lockfile.h"
 #include "pack.h"
@@ -640,7 +641,6 @@ void write_commit_graph(const char *obj_dir,
struct hashfile *f;
uint32_t i, count_distinct = 0;
char *graph_name;
-   int fd;
struct lock_file lk = LOCK_INIT;
uint32_t chunk_ids[5];
uint64_t chunk_offsets[5];
@@ -754,23 +754,11 @@ void write_commit_graph(const char *obj_dir,
compute_generation_numbers();
 
graph_name = get_commit_graph_filename(obj_dir);
-   fd = hold_lock_file_for_update(, graph_name, 0);
-
-   if (fd < 0) {
-   struct strbuf folder = STRBUF_INIT;
-   strbuf_addstr(, graph_name);
-   strbuf_setlen(, strrchr(folder.buf, '/') - folder.buf);
-
-   if (mkdir(folder.buf, 0777) < 0)
-   die_errno(_("cannot mkdir %s"), folder.buf);
-   strbuf_release();
-
-   fd = hold_lock_file_for_update(, graph_name, 
LOCK_DIE_ON_ERROR);
-
-   if (fd < 0)
-   die_errno("unable to create '%s'", graph_name);
-   }
+   if (safe_create_leading_directories(graph_name))
+   die_errno(_("unable to create leading directories of %s"),
+ graph_name);
 
+   hold_lock_file_for_update(, graph_name, LOCK_DIE_ON_ERROR);
f = hashfd(lk.tempfile->fd, lk.tempfile->filename.buf);
 
hashwrite_be32(f, GRAPH_SIGNATURE);

base-commit: 34fdd433396ee0e3ef4de02eb2189f8226eafe4e
-- 
2.16.2.329.gfb62395de6



[PATCH 09/12] commit-graph: add '--reachable' option

2018-05-10 Thread Derrick Stolee
When writing commit-graph files, it can be convenient to ask for all
reachable commits (starting at the ref set) in the resulting file. This
is particularly helpful when writing to stdin is complicated, such as a
future integration with 'git gc' which will call
'git commit-graph write --reachable' after performing cleanup of the
object database.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-commit-graph.txt |  8 ++--
 builtin/commit-graph.c | 41 ++
 t/t5318-commit-graph.sh| 10 ++
 3 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index 1daefa7fb1..fbab3feba1 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -38,12 +38,16 @@ Write a commit graph file based on the commits found in 
packfiles.
 +
 With the `--stdin-packs` option, generate the new commit graph by
 walking objects only in the specified pack-indexes. (Cannot be combined
-with --stdin-commits.)
+with --stdin-commits or --reachable.)
 +
 With the `--stdin-commits` option, generate the new commit graph by
 walking commits starting at the commits specified in stdin as a list
 of OIDs in hex, one OID per line. (Cannot be combined with
---stdin-packs.)
+--stdin-packs or --reachable.)
++
+With the `--reachable` option, generate the new commit graph by walking
+commits starting at all refs. (Cannot be combined with --stdin-commits
+or --stind-packs.)
 +
 With the `--append` option, include all commits that are present in the
 existing commit-graph file.
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index f5d891f2b8..4c9328cdf2 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -3,13 +3,14 @@
 #include "dir.h"
 #include "lockfile.h"
 #include "parse-options.h"
+#include "refs.h"
 #include "commit-graph.h"
 
 static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--object-dir ]"),
N_("git commit-graph verify [--object-dir ]"),
N_("git commit-graph read [--object-dir ]"),
-   N_("git commit-graph write [--object-dir ] [--append] 
[--stdin-packs|--stdin-commits]"),
+   N_("git commit-graph write [--object-dir ] [--append] 
[--reachable|--stdin-packs|--stdin-commits]"),
NULL
 };
 
@@ -24,12 +25,13 @@ static const char * const builtin_commit_graph_read_usage[] 
= {
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-   N_("git commit-graph write [--object-dir ] [--append] 
[--stdin-packs|--stdin-commits]"),
+   N_("git commit-graph write [--object-dir ] [--append] 
[--reachable|--stdin-packs|--stdin-commits]"),
NULL
 };
 
 static struct opts_commit_graph {
const char *obj_dir;
+   int reachable;
int stdin_packs;
int stdin_commits;
int append;
@@ -113,6 +115,25 @@ static int graph_read(int argc, const char **argv)
return 0;
 }
 
+struct hex_list {
+   char **hex_strs;
+   int hex_nr;
+   int hex_alloc;
+};
+
+static int add_ref_to_list(const char *refname,
+  const struct object_id *oid,
+  int flags, void *cb_data)
+{
+   struct hex_list *list = (struct hex_list*)cb_data;
+
+   ALLOC_GROW(list->hex_strs, list->hex_nr + 1, list->hex_alloc);
+   list->hex_strs[list->hex_nr] = xcalloc(GIT_MAX_HEXSZ + 1, 1);
+   strcpy(list->hex_strs[list->hex_nr], oid_to_hex(oid));
+   list->hex_nr++;
+   return 0;
+}
+
 static int graph_write(int argc, const char **argv)
 {
const char **pack_indexes = NULL;
@@ -127,6 +148,8 @@ static int graph_write(int argc, const char **argv)
OPT_STRING(0, "object-dir", _dir,
N_("dir"),
N_("The object directory to store the graph")),
+   OPT_BOOL(0, "reachable", ,
+   N_("start walk at all refs")),
OPT_BOOL(0, "stdin-packs", _packs,
N_("scan pack-indexes listed by stdin for commits")),
OPT_BOOL(0, "stdin-commits", _commits,
@@ -140,8 +163,8 @@ static int graph_write(int argc, const char **argv)
 builtin_commit_graph_write_options,
 builtin_commit_graph_write_usage, 0);
 
-   if (opts.stdin_packs && opts.stdin_commits)
-   die(_("cannot use both --stdin-commits and --stdin-packs"));
+   if (opts.reachable + opts.stdin_packs + opts.stdin_commits > 1)
+   die(_("use at most one of --reachable, --stdin-commits, or 
--stdin-packs"));
if (!opts.obj_dir)
opts.obj_dir = get_object_directory();
 
@@ -164,6 +187,16 @@ static int graph_write(int argc, const char **argv)
commit_hex = lines;
commits_nr = lines_nr;
}
+   } else if 

[PATCH 10/12] gc: automatically write commit-graph files

2018-05-10 Thread Derrick Stolee
The commit-graph file is a very helpful feature for speeding up git
operations. In order to make it more useful, write the commit-graph file
by default during standard garbage collection operations.

Add a 'gc.commitGraph' config setting that triggers writing a
commit-graph file after any non-trivial 'git gc' command. Defaults to
false while the commit-graph feature matures. We specifically do not
want to turn this on by default until the commit-graph feature is fully
integrated with history-modifying features like shallow clones.

Signed-off-by: Derrick Stolee 
---
 Documentation/config.txt | 6 ++
 Documentation/git-gc.txt | 4 
 builtin/gc.c | 8 
 3 files changed, 18 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 11f027194e..9a3abd87e7 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1553,6 +1553,12 @@ gc.autoDetach::
Make `git gc --auto` return immediately and run in background
if the system supports it. Default is true.
 
+gc.commitGraph::
+   If true, then gc will rewrite the commit-graph file after any
+   change to the object database. If '--auto' is used, then the
+   commit-graph will not be updated unless the threshold is met.
+   See linkgit:git-commit-graph[1] for details.
+
 gc.logExpiry::
If the file gc.log exists, then `git gc --auto` won't run
unless that file is more than 'gc.logExpiry' old.  Default is
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 571b5a7e3c..17dd654a59 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -119,6 +119,10 @@ The optional configuration variable `gc.packRefs` 
determines if
 it within all non-bare repos or it can be set to a boolean value.
 This defaults to true.
 
+The optional configuration variable 'gc.commitGraph' determines if
+'git gc' runs 'git commit-graph write'. This can be set to a boolean
+value. This defaults to false.
+
 The optional configuration variable `gc.aggressiveWindow` controls how
 much time is spent optimizing the delta compression of the objects in
 the repository when the --aggressive option is specified.  The larger
diff --git a/builtin/gc.c b/builtin/gc.c
index 77fa720bd0..8403445738 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -34,6 +34,7 @@ static int aggressive_depth = 50;
 static int aggressive_window = 250;
 static int gc_auto_threshold = 6700;
 static int gc_auto_pack_limit = 50;
+static int gc_commit_graph = 0;
 static int detach_auto = 1;
 static timestamp_t gc_log_expire_time;
 static const char *gc_log_expire = "1.day.ago";
@@ -46,6 +47,7 @@ static struct argv_array repack = ARGV_ARRAY_INIT;
 static struct argv_array prune = ARGV_ARRAY_INIT;
 static struct argv_array prune_worktrees = ARGV_ARRAY_INIT;
 static struct argv_array rerere = ARGV_ARRAY_INIT;
+static struct argv_array commit_graph = ARGV_ARRAY_INIT;
 
 static struct tempfile *pidfile;
 static struct lock_file log_lock;
@@ -121,6 +123,7 @@ static void gc_config(void)
git_config_get_int("gc.aggressivedepth", _depth);
git_config_get_int("gc.auto", _auto_threshold);
git_config_get_int("gc.autopacklimit", _auto_pack_limit);
+   git_config_get_bool("gc.commitgraph", _commit_graph);
git_config_get_bool("gc.autodetach", _auto);
git_config_get_expiry("gc.pruneexpire", _expire);
git_config_get_expiry("gc.worktreepruneexpire", 
_worktrees_expire);
@@ -374,6 +377,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
argv_array_pushl(, "prune", "--expire", NULL);
argv_array_pushl(_worktrees, "worktree", "prune", "--expire", 
NULL);
argv_array_pushl(, "rerere", "gc", NULL);
+   argv_array_pushl(_graph, "commit-graph", "write", "--reachable", 
NULL);
 
/* default expiry time, overwritten in gc_config */
gc_config();
@@ -480,6 +484,10 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
if (pack_garbage.nr > 0)
clean_pack_garbage();
 
+   if (gc_commit_graph)
+   if (run_command_v_opt(commit_graph.argv, RUN_GIT_CMD))
+   return error(FAILED_RUN, commit_graph.argv[0]);
+
if (auto_gc && too_many_loose_objects())
warning(_("There are too many unreachable loose objects; "
"run 'git prune' to remove them."));
-- 
2.16.2.329.gfb62395de6



[PATCH 05/12] commit: force commit to parse from object database

2018-05-10 Thread Derrick Stolee
In anticipation of verifying commit-graph file contents against the
object database, create parse_commit_internal() to allow side-stepping
the commit-graph file and parse directly from the object database.

Most consumers expect that if a commit exists in the commit-graph, then
the commit was loaded from the commit-graph so values such as generation
numbers are loaded. Hence, this method should not be called unless the
intention is explicit in avoiding commits from the commit-graph file.

Signed-off-by: Derrick Stolee 
---
 commit.c | 13 +
 commit.h |  1 +
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/commit.c b/commit.c
index 1d28677dfb..7c92350373 100644
--- a/commit.c
+++ b/commit.c
@@ -392,7 +392,7 @@ int parse_commit_buffer(struct commit *item, const void 
*buffer, unsigned long s
return 0;
 }
 
-int parse_commit_gently(struct commit *item, int quiet_on_missing)
+int parse_commit_internal(struct commit *item, int quiet_on_missing, int 
use_commit_graph)
 {
enum object_type type;
void *buffer;
@@ -403,17 +403,17 @@ int parse_commit_gently(struct commit *item, int 
quiet_on_missing)
return -1;
if (item->object.parsed)
return 0;
-   if (parse_commit_in_graph(item))
+   if (use_commit_graph && parse_commit_in_graph(item))
return 0;
buffer = read_sha1_file(item->object.oid.hash, , );
if (!buffer)
return quiet_on_missing ? -1 :
error("Could not read %s",
-oid_to_hex(>object.oid));
+   oid_to_hex(>object.oid));
if (type != OBJ_COMMIT) {
free(buffer);
return error("Object %s not a commit",
-oid_to_hex(>object.oid));
+   oid_to_hex(>object.oid));
}
ret = parse_commit_buffer(item, buffer, size, 0);
if (save_commit_buffer && !ret) {
@@ -424,6 +424,11 @@ int parse_commit_gently(struct commit *item, int 
quiet_on_missing)
return ret;
 }
 
+int parse_commit_gently(struct commit *item, int quiet_on_missing)
+{
+   return parse_commit_internal(item, quiet_on_missing, 1);
+}
+
 void parse_commit_or_die(struct commit *item)
 {
if (parse_commit(item))
diff --git a/commit.h b/commit.h
index b5afde1ae9..5fde74fcd7 100644
--- a/commit.h
+++ b/commit.h
@@ -73,6 +73,7 @@ struct commit *lookup_commit_reference_by_name(const char 
*name);
 struct commit *lookup_commit_or_die(const struct object_id *oid, const char 
*ref_name);
 
 int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long 
size, int check_graph);
+int parse_commit_internal(struct commit *item, int quiet_on_missing, int 
use_commit_graph);
 int parse_commit_gently(struct commit *item, int quiet_on_missing);
 static inline int parse_commit(struct commit *item)
 {
-- 
2.16.2.329.gfb62395de6



[PATCH 11/12] fetch: compute commit-graph by default

2018-05-10 Thread Derrick Stolee
During a call to 'git fetch', we expect new commits and updated refs.
Use these updated refs to add the new commits to the commit-graph file,
automatically providing performance benefits in other calls.

Use 'fetch.commitGraph' config setting to enable or disable this
behavior. Defaults to false while the commit-graph feature matures.
Specifically, we do not want this on by default until the commit-graph
feature integrates with history-modifying features such as shallow
clones.

Signed-off-by: Derrick Stolee 
---
 Documentation/config.txt |  4 
 builtin/fetch.c  | 13 +
 2 files changed, 17 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 9a3abd87e7..3d8225600a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1409,6 +1409,10 @@ fetch.output::
`full` and `compact`. Default value is `full`. See section
OUTPUT in linkgit:git-fetch[1] for detail.
 
+fetch.commitGraph::
+   If true, fetch will automatically update the commit-graph file.
+   See linkgit:git-commit-graph[1].
+
 format.attach::
Enable multipart/mixed attachments as the default for
'format-patch'.  The value can also be a double quoted string
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8ee998ea2e..254f6ecfb6 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -38,6 +38,7 @@ enum {
 static int fetch_prune_config = -1; /* unspecified */
 static int prune = -1; /* unspecified */
 #define PRUNE_BY_DEFAULT 0 /* do we prune by default? */
+static int fetch_commit_graph = 0;
 
 static int all, append, dry_run, force, keep, multiple, update_head_ok, 
verbosity, deepen_relative;
 static int progress = -1;
@@ -66,6 +67,11 @@ static int git_fetch_config(const char *k, const char *v, 
void *cb)
return 0;
}
 
+   if (!strcmp(k, "fetch.commitGraph")) {
+   fetch_commit_graph = git_config_bool(k, v);
+   return 0;
+   }
+
if (!strcmp(k, "submodule.recurse")) {
int r = git_config_bool(k, v) ?
RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
@@ -1462,6 +1468,13 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
result = fetch_multiple();
}
 
+   if (!result && fetch_commit_graph) {
+   struct argv_array commit_graph = ARGV_ARRAY_INIT;
+   argv_array_pushl(_graph, "commit-graph", "write", 
"--reachable", NULL);
+   if (run_command_v_opt(commit_graph.argv, RUN_GIT_CMD))
+   result = 1;
+   }
+
if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
struct argv_array options = ARGV_ARRAY_INIT;
 
-- 
2.16.2.329.gfb62395de6



[PATCH 08/12] fsck: verify commit-graph

2018-05-10 Thread Derrick Stolee
If core.commitGraph is true, verify the contents of the commit-graph
during 'git fsck' using the 'git commit-graph verify' subcommand. Run
this check on all alternates, as well.

We use a new process for two reasons:

1. The subcommand decouples the details of loading and verifying a
   commit-graph file from the other fsck details.

2. The commit-graph verification requires the commits to be loaded
   in a specific order to guarantee we parse from the commit-graph
   file for some objects and from the object database for others.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-fsck.txt |  3 +++
 builtin/fsck.c | 21 +
 t/t5318-commit-graph.sh|  5 +
 3 files changed, 29 insertions(+)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index b9f060e3b2..ab9a93fb9b 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -110,6 +110,9 @@ Any corrupt objects you will have to find in backups or 
other archives
 (i.e., you can just remove them and do an 'rsync' with some other site in
 the hopes that somebody else has the object you have corrupted).
 
+If core.commitGraph is true, the commit-graph file will also be inspected
+using 'git commit-graph verify'. See linkgit:git-commit-graph[1].
+
 Extracted Diagnostics
 -
 
diff --git a/builtin/fsck.c b/builtin/fsck.c
index ef78c6c00c..a6d5045b77 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -16,6 +16,7 @@
 #include "streaming.h"
 #include "decorate.h"
 #include "packfile.h"
+#include "run-command.h"
 
 #define REACHABLE 0x0001
 #define SEEN  0x0002
@@ -45,6 +46,7 @@ static int name_objects;
 #define ERROR_REACHABLE 02
 #define ERROR_PACK 04
 #define ERROR_REFS 010
+#define ERROR_COMMIT_GRAPH 020
 
 static const char *describe_object(struct object *obj)
 {
@@ -815,5 +817,24 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
}
 
check_connectivity();
+
+   if (core_commit_graph) {
+   struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
+   const char *verify_argv[] = { "commit-graph", "verify", NULL, 
NULL, NULL, NULL };
+   commit_graph_verify.argv = verify_argv;
+   commit_graph_verify.git_cmd = 1;
+
+   if (run_command(_graph_verify))
+   errors_found |= ERROR_COMMIT_GRAPH;
+
+   prepare_alt_odb();
+   for (alt = alt_odb_list; alt; alt = alt->next) {
+   verify_argv[2] = "--object-dir";
+   verify_argv[3] = alt->path;
+   if (run_command(_graph_verify))
+   errors_found |= ERROR_COMMIT_GRAPH;
+   }
+   }
+
return errors_found;
 }
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 6ca451dfd2..9680e6e6e0 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -240,4 +240,9 @@ test_expect_success 'git commit-graph verify' '
git commit-graph verify >output
 '
 
+test_expect_success 'git fsck (checks commit-graph)' '
+   cd "$TRASH_DIRECTORY/full" &&
+   git fsck
+'
+
 test_done
-- 
2.16.2.329.gfb62395de6



[PATCH 03/12] commit-graph: parse commit from chosen graph

2018-05-10 Thread Derrick Stolee
Before verifying a commit-graph file against the object database, we
need to parse all commits from the given commit-graph file. Create
parse_commit_in_graph_one() to target a given struct commit_graph.

Signed-off-by: Derrick Stolee 
---
 commit-graph.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index c3b8716c14..ce11af1d20 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -309,7 +309,7 @@ static int find_commit_in_graph(struct commit *item, struct 
commit_graph *g, uin
}
 }
 
-int parse_commit_in_graph(struct commit *item)
+int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
 {
uint32_t pos;
 
@@ -317,9 +317,21 @@ int parse_commit_in_graph(struct commit *item)
return 0;
if (item->object.parsed)
return 1;
+
+   if (find_commit_in_graph(item, g, ))
+   return fill_commit_in_graph(item, g, pos);
+
+   return 0;
+}
+
+int parse_commit_in_graph(struct commit *item)
+{
+   if (!core_commit_graph)
+   return 0;
+
prepare_commit_graph();
-   if (commit_graph && find_commit_in_graph(item, commit_graph, ))
-   return fill_commit_in_graph(item, commit_graph, pos);
+   if (commit_graph)
+   return parse_commit_in_graph_one(commit_graph, item);
return 0;
 }
 
-- 
2.16.2.329.gfb62395de6



[PATCH 06/12] commit-graph: load a root tree from specific graph

2018-05-10 Thread Derrick Stolee
When lazy-loading a tree for a commit, it will be important to select
the tree from a specific struct commit_graph. Create a new method that
specifies the commit-graph file and use that in
get_commit_tree_in_graph().

Signed-off-by: Derrick Stolee 
---
 commit-graph.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index b4c146c423..8c636abba9 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -357,14 +357,20 @@ static struct tree *load_tree_for_commit(struct 
commit_graph *g, struct commit *
return c->maybe_tree;
 }
 
-struct tree *get_commit_tree_in_graph(const struct commit *c)
+static struct tree *get_commit_tree_in_graph_one(struct commit_graph *g,
+const struct commit *c)
 {
if (c->maybe_tree)
return c->maybe_tree;
if (c->graph_pos == COMMIT_NOT_FROM_GRAPH)
BUG("get_commit_tree_in_graph called from non-commit-graph 
commit");
 
-   return load_tree_for_commit(commit_graph, (struct commit *)c);
+   return load_tree_for_commit(g, (struct commit *)c);
+}
+
+struct tree *get_commit_tree_in_graph(const struct commit *c)
+{
+   return get_commit_tree_in_graph_one(commit_graph, c);
 }
 
 static void write_graph_chunk_fanout(struct hashfile *f,
-- 
2.16.2.329.gfb62395de6



[PATCH 07/12] commit-graph: verify commit contents against odb

2018-05-10 Thread Derrick Stolee
When running 'git commit-graph verify', compare the contents of the
commits that are loaded from the commit-graph file with commits that are
loaded directly from the object database. This includes checking the
root tree object ID, commit date, and parents.

Parse the commit from the graph during the initial loop through the
object IDs to guarantee we parse from the commit-graph file.

In addition, verify the generation number calculation is correct for all
commits in the commit-graph file.

Signed-off-by: Derrick Stolee 
---
 commit-graph.c | 88 ++
 1 file changed, 88 insertions(+)

diff --git a/commit-graph.c b/commit-graph.c
index 8c636abba9..24f5031f3e 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -863,6 +863,8 @@ int verify_commit_graph(struct commit_graph *g)
graph_report(_("commit-graph is missing the Commit Data 
chunk"));
 
for (i = 0; i < g->num_commits; i++) {
+   struct commit *graph_commit;
+
hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
 
if (i > 0 && oidcmp(_oid, _oid) >= 0)
@@ -880,6 +882,10 @@ int verify_commit_graph(struct commit_graph *g)
 
cur_fanout_pos++;
}
+
+   graph_commit = lookup_commit(_oid);
+   if (!parse_commit_in_graph_one(g, graph_commit))
+   graph_report(_("failed to parse %s from commit-graph"), 
oid_to_hex(_oid));
}
 
while (cur_fanout_pos < 256) {
@@ -892,5 +898,87 @@ int verify_commit_graph(struct commit_graph *g)
cur_fanout_pos++;
}
 
+   for (i = 0; i < g->num_commits; i++) {
+   struct commit *graph_commit, *odb_commit;
+   struct commit_list *graph_parents, *odb_parents;
+   int num_parents = 0;
+
+   hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
+
+   graph_commit = lookup_commit(_oid);
+   odb_commit = (struct commit *)create_object(cur_oid.hash, 
alloc_commit_node());
+   if (parse_commit_internal(odb_commit, 0, 0))
+   graph_report(_("failed to parse %s from object 
database"), oid_to_hex(_oid));
+
+   if (oidcmp(_commit_tree_in_graph_one(g, 
graph_commit)->object.oid,
+  get_commit_tree_oid(odb_commit)))
+   graph_report(_("root tree object ID for commit %s in 
commit-graph is %s != %s"),
+oid_to_hex(_oid),
+
oid_to_hex(get_commit_tree_oid(graph_commit)),
+
oid_to_hex(get_commit_tree_oid(odb_commit)));
+
+   if (graph_commit->date != odb_commit->date)
+   graph_report(_("commit date for commit %s in 
commit-graph is %"PRItime" != %"PRItime""),
+oid_to_hex(_oid),
+graph_commit->date,
+odb_commit->date);
+
+
+   graph_parents = graph_commit->parents;
+   odb_parents = odb_commit->parents;
+
+   while (graph_parents) {
+   num_parents++;
+
+   if (odb_parents == NULL)
+   graph_report(_("commit-graph parent list for 
commit %s is too long (%d)"),
+oid_to_hex(_oid),
+num_parents);
+
+   if (oidcmp(_parents->item->object.oid, 
_parents->item->object.oid))
+   graph_report(_("commit-graph parent for %s is 
%s != %s"),
+oid_to_hex(_oid),
+
oid_to_hex(_parents->item->object.oid),
+
oid_to_hex(_parents->item->object.oid));
+
+   graph_parents = graph_parents->next;
+   odb_parents = odb_parents->next;
+   }
+
+   if (odb_parents != NULL)
+   graph_report(_("commit-graph parent list for commit %s 
terminates early"),
+oid_to_hex(_oid));
+
+   if (graph_commit->generation) {
+   uint32_t max_generation = 0;
+   graph_parents = graph_commit->parents;
+
+   while (graph_parents) {
+   if (graph_parents->item->generation == 
GENERATION_NUMBER_ZERO ||
+   graph_parents->item->generation == 
GENERATION_NUMBER_INFINITY)
+   graph_report(_("commit-graph has valid 
generation for %s but not its parent, %s"),
+oid_to_hex(_oid),
+  

[PATCH 04/12] commit-graph: verify fanout and lookup table

2018-05-10 Thread Derrick Stolee
While running 'git commit-graph verify', verify that the object IDs
are listed in lexicographic order and that the fanout table correctly
navigates into that list of object IDs.

Signed-off-by: Derrick Stolee 
---
 commit-graph.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/commit-graph.c b/commit-graph.c
index ce11af1d20..b4c146c423 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -839,6 +839,9 @@ static int verify_commit_graph_error;
 
 int verify_commit_graph(struct commit_graph *g)
 {
+   uint32_t i, cur_fanout_pos = 0;
+   struct object_id prev_oid, cur_oid;
+
if (!g) {
graph_report(_("no commit-graph file loaded"));
return 1;
@@ -853,5 +856,35 @@ int verify_commit_graph(struct commit_graph *g)
if (!g->chunk_commit_data)
graph_report(_("commit-graph is missing the Commit Data 
chunk"));
 
+   for (i = 0; i < g->num_commits; i++) {
+   hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
+
+   if (i > 0 && oidcmp(_oid, _oid) >= 0)
+   graph_report(_("commit-graph has incorrect oid order: 
%s then %s"),
+
+   oid_to_hex(_oid),
+   oid_to_hex(_oid));
+   oidcpy(_oid, _oid);
+
+   while (cur_oid.hash[0] > cur_fanout_pos) {
+   uint32_t fanout_value = get_be32(g->chunk_oid_fanout + 
cur_fanout_pos);
+   if (i != fanout_value)
+   graph_report(_("commit-graph has incorrect 
fanout value: fanout[%d] = %u != %u"),
+cur_fanout_pos, fanout_value, i);
+
+   cur_fanout_pos++;
+   }
+   }
+
+   while (cur_fanout_pos < 256) {
+   uint32_t fanout_value = get_be32(g->chunk_oid_fanout + 
cur_fanout_pos);
+
+   if (g->num_commits != fanout_value)
+   graph_report(_("commit-graph has incorrect fanout 
value: fanout[%d] = %u != %u"),
+cur_fanout_pos, fanout_value, i);
+
+   cur_fanout_pos++;
+   }
+
return verify_commit_graph_error;
 }
-- 
2.16.2.329.gfb62395de6



[PATCH 12/12] commit-graph: update design document

2018-05-10 Thread Derrick Stolee
The commit-graph feature is now integrated with 'fsck' and 'gc',
so remove those items from the "Future Work" section of the
commit-graph design document.

Also remove the section on lazy-loading trees, as that was completed
in an earlier patch series.

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph.txt | 22 --
 1 file changed, 22 deletions(-)

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
index e1a883eb46..c664acbd76 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -118,9 +118,6 @@ Future Work
 - The commit graph feature currently does not honor commit grafts. This can
   be remedied by duplicating or refactoring the current graft logic.
 
-- The 'commit-graph' subcommand does not have a "verify" mode that is
-  necessary for integration with fsck.
-
 - After computing and storing generation numbers, we must make graph
   walks aware of generation numbers to gain the performance benefits they
   enable. This will mostly be accomplished by swapping a commit-date-ordered
@@ -130,25 +127,6 @@ Future Work
 - 'log --topo-order'
 - 'tag --merged'
 
-- Currently, parse_commit_gently() requires filling in the root tree
-  object for a commit. This passes through lookup_tree() and consequently
-  lookup_object(). Also, it calls lookup_commit() when loading the parents.
-  These method calls check the ODB for object existence, even if the
-  consumer does not need the content. For example, we do not need the
-  tree contents when computing merge bases. Now that commit parsing is
-  removed from the computation time, these lookup operations are the
-  slowest operations keeping graph walks from being fast. Consider
-  loading these objects without verifying their existence in the ODB and
-  only loading them fully when consumers need them. Consider a method
-  such as "ensure_tree_loaded(commit)" that fully loads a tree before
-  using commit->tree.
-
-- The current design uses the 'commit-graph' subcommand to generate the graph.
-  When this feature stabilizes enough to recommend to most users, we should
-  add automatic graph writes to common operations that create many commits.
-  For example, one could compute a graph on 'clone', 'fetch', or 'repack'
-  commands.
-
 - A server could provide a commit graph file as part of the network protocol
   to avoid extra calculations by clients. This feature is only of benefit if
   the user is willing to trust the file, because verifying the file is correct
-- 
2.16.2.329.gfb62395de6



[PATCH 02/12] commit-graph: verify file header information

2018-05-10 Thread Derrick Stolee
During a run of 'git commit-graph verify', list the issues with the
header information in the commit-graph file. Some of this information
is inferred from the loaded 'struct commit_graph'. Some header
information is checked as part of load_commit_graph_one().

Signed-off-by: Derrick Stolee 
---
 commit-graph.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index b25aaed128..c3b8716c14 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -818,7 +818,28 @@ void write_commit_graph(const char *obj_dir,
oids.nr = 0;
 }
 
+static int verify_commit_graph_error;
+#define graph_report(...) \
+   do {\
+   verify_commit_graph_error = 1;\
+   printf(__VA_ARGS__);\
+   } while (0);
+
 int verify_commit_graph(struct commit_graph *g)
 {
-   return !g;
+   if (!g) {
+   graph_report(_("no commit-graph file loaded"));
+   return 1;
+   }
+
+   verify_commit_graph_error = 0;
+
+   if (!g->chunk_oid_fanout)
+   graph_report(_("commit-graph is missing the OID Fanout chunk"));
+   if (!g->chunk_oid_lookup)
+   graph_report(_("commit-graph is missing the OID Lookup chunk"));
+   if (!g->chunk_commit_data)
+   graph_report(_("commit-graph is missing the Commit Data 
chunk"));
+
+   return verify_commit_graph_error;
 }
-- 
2.16.2.329.gfb62395de6



[PATCH 01/12] commit-graph: add 'verify' subcommand

2018-05-10 Thread Derrick Stolee
In case the commit-graph file becomes corrupt, we need a way to
verify its contents match the object database. In the manner of
'git fsck' we will implement a 'git commit-graph verify' subcommand
to report all issues with the file.

Add the 'verify' subcommand to the 'commit-graph' builtin and its
documentation. The subcommand is currently a no-op except for
loading the commit-graph into memory, which may trigger run-time
errors that would be caught by normal use. Add a simple test that
ensures the command returns a zero error code.

If no commit-graph file exists, this is an acceptable state. Do
not report any errors.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-commit-graph.txt |  6 ++
 builtin/commit-graph.c | 38 ++
 commit-graph.c |  5 +
 commit-graph.h |  2 ++
 t/t5318-commit-graph.sh| 10 ++
 5 files changed, 61 insertions(+)

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index 4c97b555cc..1daefa7fb1 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -10,6 +10,7 @@ SYNOPSIS
 
 [verse]
 'git commit-graph read' [--object-dir ]
+'git commit-graph verify' [--object-dir ]
 'git commit-graph write'  [--object-dir ]
 
 
@@ -52,6 +53,11 @@ existing commit-graph file.
 Read a graph file given by the commit-graph file and output basic
 details about the graph file. Used for debugging purposes.
 
+'verify'::
+
+Read the commit-graph file and verify its contents against the object
+database. Used to verify for corrupted data.
+
 
 EXAMPLES
 
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 37420ae0fd..f5d891f2b8 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -7,11 +7,17 @@
 
 static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--object-dir ]"),
+   N_("git commit-graph verify [--object-dir ]"),
N_("git commit-graph read [--object-dir ]"),
N_("git commit-graph write [--object-dir ] [--append] 
[--stdin-packs|--stdin-commits]"),
NULL
 };
 
+static const char * const builtin_commit_graph_verify_usage[] = {
+   N_("git commit-graph verify [--object-dir ]"),
+   NULL
+};
+
 static const char * const builtin_commit_graph_read_usage[] = {
N_("git commit-graph read [--object-dir ]"),
NULL
@@ -29,6 +35,36 @@ static struct opts_commit_graph {
int append;
 } opts;
 
+
+static int graph_verify(int argc, const char **argv)
+{
+   struct commit_graph *graph = 0;
+   char *graph_name;
+
+   static struct option builtin_commit_graph_verify_options[] = {
+   OPT_STRING(0, "object-dir", _dir,
+  N_("dir"),
+  N_("The object directory to store the graph")),
+   OPT_END(),
+   };
+
+   argc = parse_options(argc, argv, NULL,
+builtin_commit_graph_verify_options,
+builtin_commit_graph_verify_usage, 0);
+
+   if (!opts.obj_dir)
+   opts.obj_dir = get_object_directory();
+
+   graph_name = get_commit_graph_filename(opts.obj_dir);
+   graph = load_commit_graph_one(graph_name);
+
+   if (!graph)
+   return 0;
+   FREE_AND_NULL(graph_name);
+
+   return verify_commit_graph(graph);
+}
+
 static int graph_read(int argc, const char **argv)
 {
struct commit_graph *graph = NULL;
@@ -160,6 +196,8 @@ int cmd_commit_graph(int argc, const char **argv, const 
char *prefix)
 PARSE_OPT_STOP_AT_NON_OPTION);
 
if (argc > 0) {
+   if (!strcmp(argv[0], "verify"))
+   return graph_verify(argc, argv);
if (!strcmp(argv[0], "read"))
return graph_read(argc, argv);
if (!strcmp(argv[0], "write"))
diff --git a/commit-graph.c b/commit-graph.c
index a8c337dd77..b25aaed128 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -817,3 +817,8 @@ void write_commit_graph(const char *obj_dir,
oids.alloc = 0;
oids.nr = 0;
 }
+
+int verify_commit_graph(struct commit_graph *g)
+{
+   return !g;
+}
diff --git a/commit-graph.h b/commit-graph.h
index 96cccb10f3..71a39c5a57 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -53,4 +53,6 @@ void write_commit_graph(const char *obj_dir,
int nr_commits,
int append);
 
+int verify_commit_graph(struct commit_graph *g);
+
 #endif
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 77d85aefe7..6ca451dfd2 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -11,6 +11,11 @@ test_expect_success 'setup full repo' '
objdir=".git/objects"
 '
 
+test_expect_success 'verify graph with no graph file' '
+   cd "$TRASH_DIRECTORY/full" &&
+  

[PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-10 Thread Derrick Stolee
The commit-graph feature is not useful to end users until the
commit-graph file is maintained automatically by Git during normal
upkeep operations. One natural place to trigger this write is during
'git gc'.

Before automatically generating a commit-graph, we need to be able to
verify the contents of a commit-graph file. Integrate commit-graph
checks into 'fsck' that check the commit-graph contents against commits
in the object database.

Thanks, Jakub, for the feedback on the RFC. I think there are still some
things to decide at a high-level before we dig too far into the review.
Specifically, the integration points in 'fsck', 'gc', and 'fetch' are
worth considering our alternatives.

For 'fsck', the current integration is minimal: if core.commitGraph is
true, then 'git fsck' calls 'git commit-graph verify --object-dir=X'
for the objects directory and every alternate. There are a few options
to consider here:

1. Keep this behavior: we should always check the commit-graph if it
   exists.

2. Add a --[no-]commit-graph argument to 'fsck' that toggles the
   commit-graph verification.

3. Remove all direct integration between 'fsck' and 'commit-graph' and
   instead rely on users checking 'git commit-graph verify' manually
   when they suspect a problem with the commit-graph file. While this
   option is worth considering, it is my least favorite since it requires
   more from users.

For 'gc' and 'fetch', these seemed like natural places to update the
commit-graph file. Relative to the other maintenance that occurs during
these commands, the 'git commit-graph write' command is fast, especially
for incremental updates; only the "new" commits are walked when computing
generation numbers and other metadata for the commit-graph file.

The behavior in this patch series does the following:

1. Near the end of 'git gc', run 'git commit-graph write'. The location
   of this code assumes that a 'git gc --auto' has not terminated early
   due to not meeting the auto threshold.

2. At the end of 'git fetch', run 'git commit-graph write'. This means
   that every reachable commit will be in the commit-graph after a
   a successful fetch, which seems a reasonable frequency. Then, the
   only times we would be missing a reachable commit is after creating
   one locally. There is a problem with the current patch, though: every
   'git fetch' call runs 'git commit-graph write', even if there were no
   ref updates or objects downloaded. Is there a simple way to detect if
   the fetch was non-trivial?

One obvious problem with this approach: if we compute this during 'gc'
AND 'fetch', there will be times where a 'fetch' calls 'gc' and triggers
two commit-graph writes. If I were to abandon one of these patches, it
would be the 'fetch' integration. A 'git gc' really wants to delete all
references to unreachable commits, and without updating the commit-graph
we may still have commit data in the commit-graph file that is not in
the object database. In fact, deleting commits from the object database
but not from the commit-graph will cause 'git commit-graph verify' to
fail!

I welcome discussion on these ideas, as we are venturing out of the
"pure data structure" world and into the "user experience" world. I am
less confident in my skills in this world, but the feature is worthless
if it does not improve the user experience.

Thanks,
-Stolee

Derrick Stolee (12):

Commits 01-07 focus on the 'git commit-graph verify' subcommand. These
are ready for full, rigorous review.

  commit-graph: add 'verify' subcommand
  commit-graph: verify file header information
  commit-graph: parse commit from chosen graph
  commit-graph: verify fanout and lookup table
  commit: force commit to parse from object database
  commit-graph: load a root tree from specific graph
  commit-graph: verify commit contents against odb

Commit 08 integrates 'git commit-graph verify' into fsck. The work here
is the minimum integration possible. (See above discussion of options.)

  fsck: verify commit-graph

Commit 09 introduces a new '--reachable' option only to make the calls
from 'gc' and 'fetch' simpler. Commits 10-11 integrate writing the
commit-graph into 'gc' and 'fetch', respectively. (See above disucssion.)

  commit-graph: add '--reachable' option
  gc: automatically write commit-graph files
  fetch: compute commit-graph by default

Commit 12 simply deletes sections from the "Future Work" section
of the commit-graph design document.

  commit-graph: update design document

 Documentation/config.txt |  10 ++
 Documentation/git-commit-graph.txt   |  14 ++-
 Documentation/git-fsck.txt   |   3 +
 Documentation/git-gc.txt |   4 +
 Documentation/technical/commit-graph.txt |  22 
 builtin/commit-graph.c   |  79 +-
 builtin/fetch.c  |  13 +++
 builtin/fsck.c   |  21 
 builtin/gc.c |   8 ++
 

Re: [PATCH v4 00/13] object store: alloc

2018-05-10 Thread Stefan Beller
On Thu, May 10, 2018 at 10:16 AM, Jonathan Tan  wrote:
> On Wed,  9 May 2018 17:40:11 -0700
> Stefan Beller  wrote:
>
>>   if (obj->type == OBJ_TREE)
>> - release_tree_node((struct tree*)obj);
>> + free_tree_buffer((struct tree*)obj);
>>   else if (obj->type == OBJ_COMMIT)
>> - release_commit_node((struct commit*)obj);
>> + release_commit_memory((struct commit*)obj);
>>   else if (obj->type == OBJ_TAG)
>> - release_tag_node((struct tag*)obj);
>> + free_tag_buffer((struct tag*)obj);
>
> This might seem a bit bikesheddy, but I wouldn't call it
> free_tag_buffer(), since what's being freed is not the buffer of the
> object itself, but just a string. If you want such a function, I would
> just call it release_tag_memory() to match release_commit_memory().
>
> Other than that, all the patches look fine to me.
>
> Some optional comments (this is almost certainly bikeshedding):

Who doesn't love some bikeshedding in late spring?

>
>  - I would call them release_commit() and release_tag(), to match
>strbuf_release().

Why not commit_release and tag_release to also have the same order
of words as in strbuf_release ?

>  - It might be better to just inline the handling of releasing commit
>and tag memory. This code already knows that, for a tree, it needs to
>free its buffer and only its buffer, so it is not much of a stretch
>to think that it similarly knows the details of commit and tag
>objects too.

We still call out to free_tree_buffer? Not sure I understand.


Re: [PATCH 1/2] object.c: free replace map in raw_object_store_clear

2018-05-10 Thread Stefan Beller
Hi Junio,

On Thu, May 10, 2018 at 3:16 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> The replace map for objects was missed to free in the object store in
>> the conversion of 174774cd519 (Merge branch 'sb/object-store-replace',
>> 2018-05-08)
>
>
> Or is this just a simple "the topic that ends at 174774cd519^2 had
> this leak that needs to be fixed by this patch; instead of rerolling
> this is an incremental, because the topic has already been merged to
> 'master' and it is too late now"?

This is the case. I wondered if I want to point to the exact commit
(which I have trouble pointing to as the whole topic has no memory
leak fixes, Closest would be d88f9fdf8b2 (replace-object: move
replace_map to object store, 2018-04-11))
or rather just point at the series. I did not think of the confusion
that might arise there.

> Looking at this patch in the context of the side branch (instead of
> in the merged result) already makes sense to me, so I am guessing it
> is the latter (i.e. not a botched merge that missed semantic
> conflicts), in which case the proposed log message is a bit too
> alarming and points readers in a wrong direction.  Shouldn't it
> point at, say, c1274495 ("replace-object: eliminate replace objects
> prepared flag", 2018-04-11) that turned the oidmap instance into a
> pointer in raw_object_store?

Ah, that is the best place to point at. Makes sense.
I'll reroll this,

Thanks,
Stefan


Re: [PATCH 1/9] Add and use generic name->id mapping code for color slot parsing

2018-05-10 Thread Stefan Beller
On Thu, May 10, 2018 at 7:19 AM, Nguyễn Thái Ngọc Duy  wrote:

>  7 files changed, 82 insertions(+), 112 deletions(-)

Nice!


>
> +static const char *color_branch_slots[] = {
> +   [BRANCH_COLOR_RESET]= "reset",

In 512f41cfac5 (clean.c: use designated initializer, 2017-07-14)
we thought we'll do it once and see if anyone complains
(and it shipped v2.15.0, 2017-10-29), and so far
nobody complained half a year later. So designated initializers
are all good now? Do we want to mention this decision in the
commit message?

If so, the patch looks good!
Thanks,
Stefan


Re: [PATCH v4 00/13] object store: alloc

2018-05-10 Thread Jonathan Tan
On Wed,  9 May 2018 17:40:11 -0700
Stefan Beller  wrote:

>   if (obj->type == OBJ_TREE)
> - release_tree_node((struct tree*)obj);
> + free_tree_buffer((struct tree*)obj);
>   else if (obj->type == OBJ_COMMIT)
> - release_commit_node((struct commit*)obj);
> + release_commit_memory((struct commit*)obj);
>   else if (obj->type == OBJ_TAG)
> - release_tag_node((struct tag*)obj);
> + free_tag_buffer((struct tag*)obj);

This might seem a bit bikesheddy, but I wouldn't call it
free_tag_buffer(), since what's being freed is not the buffer of the
object itself, but just a string. If you want such a function, I would
just call it release_tag_memory() to match release_commit_memory().

Other than that, all the patches look fine to me.

Some optional comments (this is almost certainly bikeshedding):

 - I would call them release_commit() and release_tag(), to match
   strbuf_release().
 - It might be better to just inline the handling of releasing commit
   and tag memory. This code already knows that, for a tree, it needs to
   free its buffer and only its buffer, so it is not much of a stretch
   to think that it similarly knows the details of commit and tag
   objects too.


Re: [PATCH v2] pack-format.txt: more details on pack file format

2018-05-10 Thread Stefan Beller
On Thu, May 10, 2018 at 8:09 AM, Nguyễn Thái Ngọc Duy  wrote:
> The current document mentions OBJ_* constants without their actual
> values. A git developer would know these are from cache.h but that's
> not very friendly to a person who wants to read this file to implement
> a pack file parser.
>
> Similarly, the deltified representation is not documented at all (the
> "document" is basically patch-delta.c). Translate that C code to
> English with a bit more about what ofs-delta and ref-delta mean.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  This is a much better description than v1. I hope.
>
>  Documentation/technical/pack-format.txt | 78 +
>  cache.h |  5 ++
>  2 files changed, 83 insertions(+)
>
> diff --git a/Documentation/technical/pack-format.txt 
> b/Documentation/technical/pack-format.txt
> index 8e5bf60be3..d20bf592aa 100644
> --- a/Documentation/technical/pack-format.txt
> +++ b/Documentation/technical/pack-format.txt
> @@ -36,6 +36,84 @@ Git pack format
>
>- The trailer records 20-byte SHA-1 checksum of all of the above.
>
> +=== Object types
> +
> +Valid object types are:
> +
> +- OBJ_COMMIT (1)
> +- OBJ_TREE (2)
> +- OBJ_BLOB (3)
> +- OBJ_TAG (4)
> +- OBJ_OFS_DELTA (6)
> +- OBJ_REF_DELTA (7)
> +
> +Type 5 is reserved for future expansion. Type 0 is invalid.
> +
> +=== Deltified representation
> +
> +Conceptually there are only four object types: commit, tree, tag and
> +blob. However to save space, an object could be stored as a "delta" of
> +another "base" object. These representations are assigned new types
> +ofs-delta and ref-delta, which is only valid in a pack file.

...only valid...

as opposed to loose objects or as opposed to referencing cross-packs?
I would think the former, not the latter.

> +Both ofs-delta and ref-delta store the "delta" against another
> +object. The difference between them is, ref-delta directly encodes
> +20-byte base object name. If the base object is in the same pack,
> +ofs-delta encodes the offset of the base object in the pack instead.

Reading this paragraph clears up the question from before.
The ref delta is a delta to another "reference by hash id (sha1)".
What abbreviation is OFS? OFfSet ?

> +The delta data is a sequence of instructions to reconstruct an object
> +from the base object.

As said before the base object is of type 1..4, we do not "delta-on-delta"
yet, but to construct the object we have to create the base object first,
which itself can be represented as a deltified object leading to a delta
chain.

> Each instruction appends more and more data to
> +the target object until it's complete. There are two supported
> +instructions so far: one for copy a byte range from the source object
> +and one for inserting new data embedded in the instruction itself.

ok. So there are 2 types of instructions, "copy from (offset, size)" and
"new data follows".

The next paragraphs seem to describe the copy instruction, maybe
add a sub-headline here?

> +Each instruction has variable length. Instruction type is determined
> +by the seventh bit of the first octet. The following diagrams follow
> +the convention in RFC 1951 (Deflate compressed data format).
> +
> +  
> +--+-+-+-+-+---+---+---+
> +  | 1xxx | offset1 | offset2 | offset3 | offset4 | size1 | size2 | size3 
> |
> +  
> +--+-+-+-+-+---+---+---+
> +
> +This is the instruction format to copy a byte range from the source
> +object. It encodes the offset to copy from any the number of bytes to
> +copy. Offset and size are in little-endian order.
> +
> +All offset and size bytes are optional. This is to reduce the
> +instruction size when encoding small offsets or sizes. The first seven
> +bits in the first octet determines which of the next seven octets is
> +present. If bit zero is set, offset1 is present. If bit one is set
> +offset2 is present and so on.
> +
> +Note that a more compact instruction does not change offset and size
> +encoding. For example, if only offset2 is omitted like below, offset3
> +still contains bits 16-23. It does not become offset2 and contains
> +bits 8-15 even if it's right next to offset1.
> +
> +  +--+-+-+
> +  | 1101 | offset1 | offset3 |
> +  +--+-+-+

It reads very fluently to here.

> +In its most compact form, this instruction only takes up one byte
> +(0x80) with both offset and size omitted, which will have default
> +values zero. There is another exception: size zero is automatically
> +converted to 0x1.

This "another exception" sounds a bit tacked on, but is still understandable.
I would imagine that the size of 0 is used frequently to copy large blocks
and coincidentally it is represented using the lowest number of bytes
for size. Cute!

Before the next diagram we could have a 

Re: [PATCH] repository: fix free problem with repo_clear(the_repository)

2018-05-10 Thread Stefan Beller
Hi Junio,

On Thu, May 10, 2018 at 2:27 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> So this would go with the latest sb/object-store-alloc ?
>>
>> My impression was that we never call repo_clear() on
>> the_repository, which would allow us to special case
>> the_repository further just as I did in v2 of that series[1] by
>> having static allocations for certain objects in case of \
>> the_repository.
>>
>> [1] https://public-inbox.org/git/20180501213403.14643-14-sbel...@google.com/
>>
>> We could just deal with all the exceptions, but that makes repo_clear
>> ugly IMHO.
>
> So perhaps
>
>  void repo_clear(...)
>  {
> +   if (repo == the_repository)
> +   BUG("repo_clear() called on the repository");
> ...
>
> or something?

This would work, but Duy convinced me to have repo_clear working
on the_repository is a good idea by giving a minimal test helper[1],
which helped me to find leaks[2][3], so I'd rather go with the solution
by Duy in [4] from a developers perspective.

Stefan

[1] 
https://public-inbox.org/git/cacsjy8c7n2w821h8yr8vakdcsoscdtqi_yt7z8hhndo-vxj...@mail.gmail.com/
https://gist.github.com/pclouds/86a2df6c28043f1b6fa3d4e72e7a1276
[2] https://public-inbox.org/git/20180510001211.163692-1-sbel...@google.com/
[3] https://public-inbox.org/git/20180509234059.52156-1-sbel...@google.com/
[4] 
https://public-inbox.org/git/CACsJy8AdJcQpiGrR3p6xfuqU0=qoP3=stgbwrnckdfka6di...@mail.gmail.com/


Re: [PATCH v2] add status config and command line options for rename detection

2018-05-10 Thread Elijah Newren
Hi Ben,

On Thu, May 10, 2018 at 7:16 AM, Ben Peart  wrote:
> After performing a merge that has conflicts, git status will by default 
> attempt
> to detect renames which causes many objects to be examined.  In a virtualized
> repo, those objects do not exist locally so the rename logic triggers them to 
> be
> fetched from the server. This results in the status call taking hours to
> complete on very large repos.  Even in a small repo (the GVFS repo) turning 
> off
> break and rename detection has a significant impact:

It'd be nice if you could show that impact by comparing 'git status'
to 'git status --no-renames', for some repo.  Showing only the latter
gives us no way to assess the impact.

> git status --no-renames:
> 31 secs., 105 loose object downloads
>
> git status --no-breaks
> 7 secs., 17 loose object downloads
>
> git status --no-breaks --no-renames
> 1 sec., 1 loose object download

This patch doesn't add a --no-breaks option and it doesn't exist
previously, so adding it to the commit message serves to confuse
rather than help.  I'd just drop the last two of these (and redo the
timing for --no-renames assuming you are built on
em/status-rename-config).

> Add a new config status.renames setting to enable turning off rename detection
> during status.  This setting will default to the value of diff.renames.
>
> Add a new config status.renamelimit setting to to enable bounding the time 
> spent
> finding out inexact renames during status.  This setting will default to the
> value of diff.renamelimit.

It may be worth mentioning that these config settings also affect 'git
commit' (and it does, in my testing, which I think is a good thing).

> Add status --no-renames command line option that enables overriding the config
> setting from the command line. Add --find-renames[=] to enable detecting
> renames and optionally setting the similarity index from the command line.

The command line options are specific to 'git status'.  I don't really
have a strong opinion on whether they should also be added to
git-commit; I suspect users would be more likely to use the config
options in order to set it once and forget about it and that users
would be more likely to want to override their config setting for
status than for commit.

> Note: I removed the --no-breaks command line option from the original patch as
> it will no longer be needed once the default has been changed [1] to turn it 
> off.
>
> [1] 
> https://public-inbox.org/git/20180430093421.27551-2-eckhard.s.ma...@gmail.com/

I'd just drop these lines from the commit message, and instead mention
that your patch depends on em/status-rename-config.

> Original-Patch-by: Alejandro Pauly 
> Signed-off-by: Ben Peart 
> ---
>
> Notes:
> Base Ref: master

This patch does not apply to master; it has conflicts.

> Web-Diff: https://github.com/benpeart/git/commit/823212725b

This web diff shows em/status-rename-config as the parent commit, not
master.  Since your commit message mentions you want the change to
break detection provided by that series, just listing it as the
explicit base seems like the right way to go.

> ### Interdiff (v1..v2):

Thanks.

> +   if ((intptr_t)rename_score_arg != -1) {
> +   s.detect_rename = DIFF_DETECT_RENAME;

I'd still prefer this was a
if (s.detect_rename < DIFF_DETECT_RENAME)
s.detect_rename = DIFF_DETECT_RENAME;

If a user specifies they are willing to pay for copy detection, but
then just passes --find-renames=40% because they want to find more
renames, it seems odd to disable copy detection to me.

> +++ b/t/t7525-status-rename.sh

Testcases look good.  It'd be nice to also add a few testcases where
copy detection is turned on -- in particular, I'd like to see one with
--find-renames=$DIFFERENT_THAN_DEFAULT being passed when
merge.renames=copies.


> +test_expect_success 'setup' '
> +   echo 1 >original &&
> +   git add . &&
> +   git commit -m"Adding original file." &&
> +   mv original renamed &&
> +   echo 2 >> renamed &&
> +   git add .
> +'


> +cat >.gitignore <<\EOF
> +.gitignore
> +expect*
> +actual*
> +EOF

Can this just be included in the setup?


Everything else in the patch looked good to me.


Re: [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector

2018-05-10 Thread Jeff King
On Thu, May 10, 2018 at 12:03:22PM -0400, Jeff King wrote:

> And finally, your 06fa example for me shows behavior that's either
> buggy, or I'm just confused. I get:
> 
>   $ git rev-parse 06fa^{blob}
>   error: short SHA1 06fa is ambiguous
>   hint: The candidates are:
>   hint:   06fa2b7c2b tag v2.1.4
>   hint:   06faa52353 commit 2005-10-18 - 2005-10-18 midnight
>   hint:   06fac427af blob
>   hint:   06faf6ba64 tree
> 
> (That 06fac blob comes Junio's refs/notes/amlog). Shouldn't the blob
> disambiguator show me just that object?

Ah, I see. No, "^{blob}" does not actually pass the blob disambiguator.
We only handle committish and treeish right now, and this iteration of
the series omits the function to fix that.

So I think the principle of this commit is sound without that patch, but
your example is not a good one anymore. :)

-Peff


Re: [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector

2018-05-10 Thread Jeff King
On Thu, May 10, 2018 at 12:03:22PM -0400, Jeff King wrote:

> But some cases _don't_ issue an error. For example, try this:
> 
>   $ git log ..06faf
> 
> which returns an empty output! We return the single matching tree, even
> though the ".." triggers the commit hint. The revision machinery just
> queues the tree, and then later when we see we're not doing an --objects
> traversal, it just gets ignored. (That's a separate issue, but it shows
> that the hints are just that: hints. The code that runs after does not
> necessarily require a matching type).

I actually have a patch that generates a warning for this case (below).
I've been running with it for about a year, but it annoyingly produces
warnings for "git log --all":

  $ git log --all
  warning: ignoring blob object in traversal: refs/tags/junio-gpg-pub

I guess ideally it would distinguish between items added explicitly and
those added by a wildcard (or perhaps the wildcard adder should be more
careful about adding useless objects).

---
diff --git a/revision.c b/revision.c
index 1cff11833e..816d6b75ee 100644
--- a/revision.c
+++ b/revision.c
@@ -215,6 +215,16 @@ void add_pending_oid(struct rev_info *revs, const char 
*name,
add_pending_object(revs, object, name);
 }
 
+static void warn_ignored_object(struct object *object, const char *name)
+{
+   if (object->flags & UNINTERESTING)
+   return;
+
+   warning(_("ignoring %s object in traversal: %s"),
+   type_name(object->type),
+   (name && *name) ? name : oid_to_hex(>oid));
+}
+
 static struct commit *handle_commit(struct rev_info *revs,
struct object_array_entry *entry)
 {
@@ -272,8 +282,10 @@ static struct commit *handle_commit(struct rev_info *revs,
 */
if (object->type == OBJ_TREE) {
struct tree *tree = (struct tree *)object;
-   if (!revs->tree_objects)
+   if (!revs->tree_objects) {
+   warn_ignored_object(object, name);
return NULL;
+   }
if (flags & UNINTERESTING) {
mark_tree_contents_uninteresting(tree);
return NULL;
@@ -286,8 +298,10 @@ static struct commit *handle_commit(struct rev_info *revs,
 * Blob object? You know the drill by now..
 */
if (object->type == OBJ_BLOB) {
-   if (!revs->blob_objects)
+   if (!revs->blob_objects) {
+   warn_ignored_object(object, name);
return NULL;
+   }
if (flags & UNINTERESTING)
return NULL;
add_pending_object_with_path(revs, object, name, mode, path);


Re: [PATCH v4 0/6] get_short_oid UI improvements

2018-05-10 Thread Jeff King
On Thu, May 10, 2018 at 12:42:57PM +, Ævar Arnfjörð Bjarmason wrote:

> This is like v3 except all the patches to the peel syntax & docs have
> been dropped, which were controversial.
> 
> I think it's worthwhile to re-work that, but I don't have time for
> that now, so I'm submitting this. Maybe I'll have time in the future
> to re-work the rest, but then I can base it on top of this.

I'm not quite on-board with the final patch, but with the exception of a
few nits I sent already, the first 5 look good to me.

-Peff


Re: [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector

2018-05-10 Thread Jeff King
On Thu, May 10, 2018 at 12:43:03PM +, Ævar Arnfjörð Bjarmason wrote:

> The SHA1 prefix 06fa currently matches no blobs in git.git. When
> disambiguating short SHA1s we've been quietly ignoring the user's type
> selector as a fallback mechanism, this was intentionally added in
> 1ffa26c461 ("get_short_sha1: list ambiguous objects on error",
> 2016-09-26).
> 
> I think that behavior makes sense, it's not very useful to just show
> nothing because a preference has been expressed via core.disambiguate,
> but it's bad that we're quietly doing this. The user might thing that
> we just didn't understand what e.g 06fa^{blob} meant.

I had to read this through a few times to figure out what problem you
were solving. Possibly because you lead with 06fa, which is really just
an example (and also, I have an 06fa blob in my clone ;) ).

Maybe:

  If the short-sha1 disambiguation code is told to use a particular hint
  (e.g., treeish or blob) but no objects with that short-sha1 match that
  hint, we end up ignoring the hint. This can result in either:

1. We choose the non-matching object if there is only one. This will
   typically result in an error later up the stack (since whatever
   gave us the hint is expecting a particular type).

2. We list all objects with that short-sha1, including those with
   non-matching types.

  This second case can be confusing to the user, who might think that we
  didn't apply the hint properly (especially if the hint came from
  them). For example, in git.git there is no blob with the prefix 06fa.
  So the user may see:

$ git rev-parse 06fa^{blob}
hint: The candidates are:
hint:   06fa2b7c2b tag v2.1.4
hint:   06faf6ba64 tree
06fa^{blob}
fatal: ambiguous argument '06fa^{blob}': unknown revision or path not in 
the working tree.

  Let's help them out by issuing a warning whenever the hint is ignored.

So that at least explains it in a way that makes sense to me. But now
that I've propped up my strawman, let me take a few swings...

Your patch just covers case 2, I think. And for the error case, that's
probably OK:

  $ git rev-parse 06faf^{blob}
  error: 06faf^{blob}: expected blob type, but the object dereferences to tree 
type
  06faf^{blob}
  error: 06faf^{blob}: expected blob type, but the object dereferences to tree 
type
  fatal: ambiguous argument '06faf^{blob}': unknown revision or path not in the 
working tree.

(though there is a separate bug in showing the error twice).

But some cases _don't_ issue an error. For example, try this:

  $ git log ..06faf

which returns an empty output! We return the single matching tree, even
though the ".." triggers the commit hint. The revision machinery just
queues the tree, and then later when we see we're not doing an --objects
traversal, it just gets ignored. (That's a separate issue, but it shows
that the hints are just that: hints. The code that runs after does not
necessarily require a matching type).

And that example shows another issue, which is that the user does not
necessarily feed us the hint explicitly. We're using a committish hint
there, but I'm not sure if mentioning that would confuse the user or
not. Certainly this warning:

> warning: Your hint (via core.disambiguate or peel syntax) was ignored, we 
> fell
> back to showing all object types since no object of the requested type
> matched the provide short SHA1 06fa

is not accurate, because the hint came from neither of those places. ;)

So all that said together, I kind of wonder if we should consider
issuing the warning earlier, doing so for all cases, and being a bit
less chatty. Like:

  $ git rev-parse 06fa^{blob}
  warning: short object id 06fa did not match any objects of type 'blob'

If that were followed by any of:

  1. error: short SHA1 06fa is ambiguous, then a bunch of non-blobs

  2. error: expected blob but I got a tree

  3. the command proceeds and silently ignores the matched object

I think it would be helpful. We'd need to add in an extra mapping of
GET_OID_* back to a human-readable string, but I think that should be
pretty easy.

And finally, your 06fa example for me shows behavior that's either
buggy, or I'm just confused. I get:

  $ git rev-parse 06fa^{blob}
  error: short SHA1 06fa is ambiguous
  hint: The candidates are:
  hint:   06fa2b7c2b tag v2.1.4
  hint:   06faa52353 commit 2005-10-18 - 2005-10-18 midnight
  hint:   06fac427af blob
  hint:   06faf6ba64 tree

(That 06fac blob comes Junio's refs/notes/amlog). Shouldn't the blob
disambiguator show me just that object?

-Peff


Re: [PATCH v3 13/13] alloc: allow arbitrary repositories for alloc functions

2018-05-10 Thread Duy Nguyen
On Wed, May 9, 2018 at 9:20 PM, Stefan Beller  wrote:
> On Wed, May 9, 2018 at 10:18 AM, Duy Nguyen  wrote:
>>
>> If you want to reproduce, this is what I used to test this with.
>>
>> https://gist.github.com/pclouds/86a2df6c28043f1b6fa3d4e72e7a1276
>
> This only applied cleanly after I created an empty file at
> t/helper/test-abc.c, using git-apply.

Right. I created the patch with "git add -N". I know exactly what the
bug is but I'll need to be careful with renaming a bit before trying
to fix this. Thanks for reminding me.
-- 
Duy


Re: [PATCH v4 5/6] get_short_oid: sort ambiguous objects by type, then SHA-1

2018-05-10 Thread Jeff King
On Thu, May 10, 2018 at 12:43:02PM +, Ævar Arnfjörð Bjarmason wrote:

> Now we'll instead show:
> 
> hint:   e8f2650052 tag v2.17.0
> hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached 
> HEAD abbreviated object name
> hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from 
> jeffhostetler/jeffhostetler/register_rename_src
> hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for 
> multiple remote.url entries
> hint:   e8f2093055 tree
> hint:   e8f25a3a50 tree
> hint:   e8f28d537c tree
> hint:   e8f2cf6ec0 tree
> hint:   e8f21d02f7 blob
> hint:   e8f21d577c blob
> hint:   e8f2867228 blob
> hint:   e8f2a35526 blob

I said already that I like the output, but this time I'll actually read
the code. ;)

It all looks good to me, with the exception of a few documentation nits
I'll mention below.

> A note on the implementation: Derrick rightly pointed out[1] that
> we're bending over backwards here in get_short_oid() to first
> de-duplicate the list, and then emit it, but could simply do it in one
> step.
> 
> The reason for that is that oid_array_for_each_unique() doesn't
> actually require that the array be sorted by oid_array_sort(), it just
> needs to be sorted in some order that guarantees that all objects with
> the same ID are adjacent to one another, which (barring a hash
> collision, which'll be someone else's problem) the sort_ambiguous()
> function does.

If we were to go this route, I think it would make sense to add a
sorting function pointer to "struct oid_array". I'm OK with punting on
it for now, though.

> diff --git a/Documentation/technical/api-oid-array.txt 
> b/Documentation/technical/api-oid-array.txt
> index b0c11f868d..94b529722c 100644
> --- a/Documentation/technical/api-oid-array.txt
> +++ b/Documentation/technical/api-oid-array.txt
> @@ -35,13 +35,18 @@ Functions
>   Free all memory associated with the array and return it to the
>   initial, empty state.
>  
> +`oid_array_for_each`::
> + Iterate over each element of the list, executing the callback
> + function for each one. Does not sort the list, so any custom
> + hash order is retained. If the callback returns a non-zero
> + value, the iteration ends immediately and the callback's
> + return is propagated; otherwise, 0 is returned.
> +
>  `oid_array_for_each_unique`::
> - Efficiently iterate over each unique element of the list,
> - executing the callback function for each one. If the array is
> - not sorted, this function has the side effect of sorting it. If
> - the callback returns a non-zero value, the iteration ends
> - immediately and the callback's return is propagated; otherwise,
> - 0 is returned.
> + Iterate over each unique element of the list in sort order ,
> + but otherwise behaves like `oid_array_for_each`. If the array
> + is not sorted, this function has the side effect of sorting
> + it.

Extra space in "sort order ,".

I'd probably say "sorted order", but that might be a matter of
preference.

Also, your parallel verb tenses don't agree. ;) It should be "Iterate
... but otherwise behave", not "behaves".

> + /*
> +  * Between object types show tags, then commits, and finally
> +  * trees and blobs.
> +  *
> +  * The object_type enum is commit, tree, blob, tag, but we
> +  * want tag, commit, tree blob. Cleverly (perhaps too
> +  * cleverly) do that with modulus, since the enum assigns 1 to
> +  * commit, so tag becomes 0.
> +  */
> + a_type_sort = a_type % 4;
> + b_type_sort = b_type % 4;
> + return a_type_sort > b_type_sort ? 1 : -1;

This is amusingly clever, and should be very efficient. I'm glad there's
a comment at least, though.

-Peff


[PATCH v2] pack-format.txt: more details on pack file format

2018-05-10 Thread Nguyễn Thái Ngọc Duy
The current document mentions OBJ_* constants without their actual
values. A git developer would know these are from cache.h but that's
not very friendly to a person who wants to read this file to implement
a pack file parser.

Similarly, the deltified representation is not documented at all (the
"document" is basically patch-delta.c). Translate that C code to
English with a bit more about what ofs-delta and ref-delta mean.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 This is a much better description than v1. I hope.

 Documentation/technical/pack-format.txt | 78 +
 cache.h |  5 ++
 2 files changed, 83 insertions(+)

diff --git a/Documentation/technical/pack-format.txt 
b/Documentation/technical/pack-format.txt
index 8e5bf60be3..d20bf592aa 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -36,6 +36,84 @@ Git pack format
 
   - The trailer records 20-byte SHA-1 checksum of all of the above.
 
+=== Object types
+
+Valid object types are:
+
+- OBJ_COMMIT (1)
+- OBJ_TREE (2)
+- OBJ_BLOB (3)
+- OBJ_TAG (4)
+- OBJ_OFS_DELTA (6)
+- OBJ_REF_DELTA (7)
+
+Type 5 is reserved for future expansion. Type 0 is invalid.
+
+=== Deltified representation
+
+Conceptually there are only four object types: commit, tree, tag and
+blob. However to save space, an object could be stored as a "delta" of
+another "base" object. These representations are assigned new types
+ofs-delta and ref-delta, which is only valid in a pack file.
+
+Both ofs-delta and ref-delta store the "delta" against another
+object. The difference between them is, ref-delta directly encodes
+20-byte base object name. If the base object is in the same pack,
+ofs-delta encodes the offset of the base object in the pack instead.
+
+The delta data is a sequence of instructions to reconstruct an object
+from the base object. Each instruction appends more and more data to
+the target object until it's complete. There are two supported
+instructions so far: one for copy a byte range from the source object
+and one for inserting new data embedded in the instruction itself.
+
+Each instruction has variable length. Instruction type is determined
+by the seventh bit of the first octet. The following diagrams follow
+the convention in RFC 1951 (Deflate compressed data format).
+
+  +--+-+-+-+-+---+---+---+
+  | 1xxx | offset1 | offset2 | offset3 | offset4 | size1 | size2 | size3 |
+  +--+-+-+-+-+---+---+---+
+
+This is the instruction format to copy a byte range from the source
+object. It encodes the offset to copy from any the number of bytes to
+copy. Offset and size are in little-endian order.
+
+All offset and size bytes are optional. This is to reduce the
+instruction size when encoding small offsets or sizes. The first seven
+bits in the first octet determines which of the next seven octets is
+present. If bit zero is set, offset1 is present. If bit one is set
+offset2 is present and so on.
+
+Note that a more compact instruction does not change offset and size
+encoding. For example, if only offset2 is omitted like below, offset3
+still contains bits 16-23. It does not become offset2 and contains
+bits 8-15 even if it's right next to offset1.
+
+  +--+-+-+
+  | 1101 | offset1 | offset3 |
+  +--+-+-+
+
+In its most compact form, this instruction only takes up one byte
+(0x80) with both offset and size omitted, which will have default
+values zero. There is another exception: size zero is automatically
+converted to 0x1.
+
+  +--++
+  | 0xxx |data|
+  +--++
+
+This is the instruction to construct target object without the base
+object. The following data is appended to the target object. The first
+seven bits of the first octet determines the size of data in
+bytes. The size must be non-zero.
+
+  +--+
+  |  |
+  +--+
+
+This is the instruction reserved for future expansion.
+
 == Original (version 1) pack-*.idx files have the following format:
 
   - The header consists of 256 4-byte network byte order
diff --git a/cache.h b/cache.h
index 77b7acebb6..ad549e258e 100644
--- a/cache.h
+++ b/cache.h
@@ -373,6 +373,11 @@ extern void free_name_hash(struct index_state *istate);
 #define read_blob_data_from_cache(path, sz) 
read_blob_data_from_index(_index, (path), (sz))
 #endif
 
+/*
+ * Values in this enum (except those outside the 3 bit range) are part
+ * of pack file format. See Documentation/technical/pack-format.txt
+ * for more information.
+ */
 enum object_type {
OBJ_BAD = -1,
OBJ_NONE = 0,
-- 
2.17.0.705.g3525833791



Re: [PATCH v4 2/6] sha1-array.h: align function arguments

2018-05-10 Thread Jeff King
On Thu, May 10, 2018 at 12:42:59PM +, Ævar Arnfjörð Bjarmason wrote:

> The arguments weren't lined up with the opening parenthesis. Fixes up
> code added in aae0caf19e ("sha1-array.h: align function arguments",
> 2018-04-30).

I think that's this patch. :)

Presumably you meant 910650d2f8 (Rename sha1_array to oid_array,
2017-03-31)?

-Peff


Re: [PATCH v4 3/6] git-p4: change "commitish" typo to "committish"

2018-05-10 Thread Luke Diamand
On 10 May 2018 at 13:43, Ævar Arnfjörð Bjarmason  wrote:
> This was the only occurrence of "commitish" in the tree, but as the
> log will reveal we've had others in the past. Fixes up code added in
> 00ad6e3182 ("git-p4: work with a detached head", 2015-11-21).
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 

Looks good to me!

Thanks,
Luke


> ---
>  git-p4.py | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index 7bb9cadc69..1afa87cd9d 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2099,11 +2099,11 @@ class P4Submit(Command, P4UserMap):
>
>  commits = []
>  if self.master:
> -commitish = self.master
> +committish = self.master
>  else:
> -commitish = 'HEAD'
> +committish = 'HEAD'
>
> -for line in read_pipe_lines(["git", "rev-list", "--no-merges", 
> "%s..%s" % (self.origin, commitish)]):
> +for line in read_pipe_lines(["git", "rev-list", "--no-merges", 
> "%s..%s" % (self.origin, committish)]):
>  commits.append(line.strip())
>  commits.reverse()
>
> --
> 2.17.0.410.g4ac3413cc8
>


Re: [PATCH] t5310-pack-bitmaps: make JGit tests work with GIT_TEST_SPLIT_INDEX

2018-05-10 Thread SZEDER Gábor
On Thu, May 10, 2018 at 4:34 PM, Jeff King  wrote:
> On Thu, May 10, 2018 at 03:58:52PM +0200, SZEDER Gábor wrote:
>> Since testing bitmaps doesn't need a worktree in the first place,
>> let's just create bare clones for the two JGit tests, so the cloned
>> won't have an index, and these two tests can be executed even with
>> split index enabled.
>
> Nice, this seems like a clever workaround.
>
> Reviewed-by: Jeff King 
>
> The more heavy-handed approach would be to just disable the JGIT prereq
> when GIT_TEST_SPLIT_INDEX is in use, which would cover this and
> potentially any other cases. This is nicer because it lets us continue
> using the test. And it's not like we have a ton of jgit dependencies,
> such that dealing with each individually would be a burden.

We could also 'sane_unset GIT_TEST_SPLIT_INDEX' in these two tests,
but I think that we should do that only in tests that specifically
check split index behavior (i.e. t1700).


Re: [PATCH 2/2] replace-object.c: remove the_repository from prepare_replace_object

2018-05-10 Thread Derrick Stolee

On 5/10/2018 7:56 AM, Jeff King wrote:

On Thu, May 10, 2018 at 07:23:13PM +0900, Junio C Hamano wrote:


This one was doing

ptr = xmalloc(sizeof(*another_ptr))

and it was OK because ptr and another_ptr happened to be of the same
type.  I wonder if we are making it safer, or making it more obscure
to seasoned C programmers, if we introduced a pair of helper macros,
perhaps like these:

#define ALLOCATE(ptr) (ptr) = xmalloc(sizeof(*(ptr)))
#define CALLOCATE(ptr,cnt) (ptr) = xcalloc((cnt), sizeof(*(ptr)))

I've often wondered that, too. It's the natural endgame of the
ALLOC_ARRAY() road we've been going down.


I'll chime in that I like this idea.

Because I'm trying to learn more about Coccinelle, I added the diff 
below and ran 'make coccicheck'. It resulted in a 1000-line patch on 
'next'. I'll refrain from sending that patch series, but it's nice to 
know a "simple" transformation is possible.


Using `git grep` I see 230 instances of 'xmalloc' and 261 instances of 
'xcalloc'. After the Coccinelle transformation, these are down to 194 
and 190, respectively, because the rest allocate in the same line as the 
definition. It's worth thinking about the macro pattern for those cases.


diff --git a/contrib/coccinelle/alloc.cocci b/contrib/coccinelle/alloc.cocci
new file mode 100644
index 00..95f426c4dc
--- /dev/null
+++ b/contrib/coccinelle/alloc.cocci
@@ -0,0 +1,13 @@
+@@
+expression p;
+@@
+- p = xmalloc(sizeof(*p));
++ ALLOCATE(p);
+
+@@
+expression c;
+expression p;
+@@
+- p = xcalloc(c, sizeof(*p));
++ CALLOCATE(p,c);
+
diff --git a/git-compat-util.h b/git-compat-util.h
index f9e4c5f9bc..5398f217d7 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -843,6 +843,9 @@ extern FILE *fopen_or_warn(const char *path, const 
char *mode);

  */
 #define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

+#define ALLOCATE(ptr) (ptr) = xmalloc(sizeof(*(ptr)))
+#define CALLOCATE(ptr,cnt) (ptr) = xcalloc((cnt), sizeof(*(ptr)))
+
 #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult(sizeof(*(x)), 
(alloc)))
 #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), 
st_mult(sizeof(*(x)), (alloc)))




Re: [PATCH] t5310-pack-bitmaps: make JGit tests work with GIT_TEST_SPLIT_INDEX

2018-05-10 Thread Jeff King
On Thu, May 10, 2018 at 03:58:52PM +0200, SZEDER Gábor wrote:

> The two JGit tests 'we can read jgit bitmaps' and 'jgit can read our
> bitmaps' in 't5310-pack-bitmaps.sh' fail when run with
> GIT_TEST_SPLIT_INDEX=YesPlease.  Both tests create a clone of the test
> repository to check bitmap interoperability with JGit.  With split
> indexes enabled the index in the clone repositories contains the
> 'link' extension, which JGit doesn't support and, consequently, an
> exception aborts it:
> 
>   <...>
>   org.eclipse.jgit.api.errors.JGitInternalException: DIRC extension 'link' 
> not supported by this version.
>   at org.eclipse.jgit.dircache.DirCache.readFrom(DirCache.java:562)
>   <...>
> 
> Since testing bitmaps doesn't need a worktree in the first place,
> let's just create bare clones for the two JGit tests, so the cloned
> won't have an index, and these two tests can be executed even with
> split index enabled.

Nice, this seems like a clever workaround.

Reviewed-by: Jeff King 

The more heavy-handed approach would be to just disable the JGIT prereq
when GIT_TEST_SPLIT_INDEX is in use, which would cover this and
potentially any other cases. This is nicer because it lets us continue
using the test. And it's not like we have a ton of jgit dependencies,
such that dealing with each individually would be a burden.

-Peff


[PATCH] apply: clarify "-p" documentation

2018-05-10 Thread Jeff King
On Fri, Apr 27, 2018 at 12:40:05PM -0500, kelly elton wrote:

> >  -p
> > Remove  leading slashes from traditional diff paths. The default is 1.
> 
> This suggests to me the following outcomes:
> 1) home/user/repos/myrepo with -p1 becomes home/user/repos/myrepo
> 2) home/user/repos/myrepo with -p2 becomes home/user/repos/myrepo
> 3) /home/user/repos/myrepo with -p1 becomes home/user/repos/myrepo
> 4) /home/user/repos/myrepo with -p2 becomes home/user/repos/myrepo
> 5) //home/user/repos/myrepo with -p1 becomes /home/user/repos/myrepo
> 6) //home/user/repos/myrepo with -p2 becomes home/user/repos/myrepo
> 
> `Remove  leading slashes`...That's not really what's happening.
> 
> What seems to actually happen is that it is removing directories from the 
> path:
> 1) home/user/repos/myrepo with -p1 becomes home/user/repos/myrepo
> 2) home/user/repos/myrepo with -p2 becomes user/repos/myrepo
> 
> This argument seems to be removing folders from the path, not slashes.

Yes. I agree the current documentation is quite misleading.

How about this?

-- >8 --
Subject: [PATCH] apply: clarify "-p" documentation

We're not really removing slashes, but slash-separated path
components. Let's make that more clear.

Reported-by: kelly elton 
Signed-off-by: Jeff King 
---
 Documentation/git-apply.txt | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
index 4ebc3d3271..c993fbf714 100644
--- a/Documentation/git-apply.txt
+++ b/Documentation/git-apply.txt
@@ -113,8 +113,10 @@ explained for the configuration variable `core.quotePath` 
(see
 linkgit:git-config[1]).
 
 -p::
-   Remove  leading slashes from traditional diff paths. The
-   default is 1.
+   Remove  leading path components (separated by slashes) from
+   traditional diff paths. E.g., with `-p2`, a patch against
+   `a/dir/file` will be applied directly to `file`. The default is
+   1.
 
 -C::
Ensure at least  lines of surrounding context match before
-- 
2.17.0.984.g9b00a423a4



Re: [PATCH 0/9] completion: avoid hard coding config var list

2018-05-10 Thread Duy Nguyen
On Thu, May 10, 2018 at 4:19 PM, Nguyễn Thái Ngọc Duy  wrote:
> (on top of nd/command-list)

Oh.. and it does make use of C99 designated initializer feature. I
could go with out it, but since git-clean has used it for some time
without anybody complaining, I figure I should take advantage of it.
-- 
Duy


Re: Missing --relative documentation

2018-05-10 Thread Jeff King
On Fri, Apr 27, 2018 at 12:24:26PM -0500, kelly elton wrote:

> git format-patch is missing documentation for --relative.
> There is also no auto complete(or tab complete, whatever it's called)
> for the --relative switch/argument.

The missing documentation is due to the ancient d4cb003fff (format-patch
documentation: Remove diff options that are not useful, 2009-11-07).
Unfortunately I couldn't find any discussion on why those options were
thought to be useless.

I assume the notion was that the point of format-patch is to generate
full text patches for somebody else to apply, and thus something like
--relative usually wouldn't make sense. I could see it being useful if
you are doing something funny, though, like generating patches for a
subset of the repository as if they were at the top level.

I'm not sure if it would make sense to add the option back to the
documentation, though, since the point was to try to de-clutter the
format-patch manpage. I wonder if we could say something like "this
command supports all the diff options; see git-diff[1] for the full
list".

-Peff


[PATCH 4/9] help: add --config to list all available config

2018-05-10 Thread Nguyễn Thái Ngọc Duy
Sometimes it helps to list all available config vars so the user can
search for something they want. The config man page can also be used
but it's harder to search if you want to focus on the variable name,
for example.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-help.txt |  5 
 advice.c   |  9 +++
 builtin/branch.c   |  6 +
 builtin/clean.c|  6 +
 builtin/commit.c   |  7 ++
 builtin/help.c |  9 +++
 diff.c |  7 ++
 fsck.c | 11 +
 generate-cmdlist.sh| 20 
 grep.c |  7 ++
 help.c | 48 ++
 help.h | 18 ++
 log-tree.c |  6 +
 13 files changed, 159 insertions(+)

diff --git a/Documentation/git-help.txt b/Documentation/git-help.txt
index a40fc38d8b..83d25d825a 100644
--- a/Documentation/git-help.txt
+++ b/Documentation/git-help.txt
@@ -45,6 +45,11 @@ OPTIONS
When used with `--verbose` print description for all recognized
commands.
 
+-c::
+--config::
+   List all available configuration variables. This is a short
+   summary of the list in linkgit:git-config[1].
+
 -g::
 --guides::
Prints a list of useful guides on the standard output. This
diff --git a/advice.c b/advice.c
index 406efc183b..b211ebc588 100644
--- a/advice.c
+++ b/advice.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "config.h"
+#include "help.h"
 
 int advice_push_update_rejected = 1;
 int advice_push_non_ff_current = 1;
@@ -84,6 +85,14 @@ int git_default_advice_config(const char *var, const char 
*value)
return 0;
 }
 
+void list_advices(const char *prefix)
+{
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(advice_config); i++)
+   printf("%s.%s\n", prefix, advice_config[i].name);
+}
+
 int error_resolve_conflict(const char *me)
 {
if (!strcmp(me, "cherry-pick"))
diff --git a/builtin/branch.c b/builtin/branch.c
index b41f332589..23bbdcb301 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -22,6 +22,7 @@
 #include "wt-status.h"
 #include "ref-filter.h"
 #include "worktree.h"
+#include "help.h"
 
 static const char * const builtin_branch_usage[] = {
N_("git branch [] [-r | -a] [--merged | --no-merged]"),
@@ -67,6 +68,11 @@ static const char *color_branch_slots[] = {
 static struct string_list output = STRING_LIST_INIT_DUP;
 static unsigned int colopts;
 
+void list_color_branch_slots(const char *prefix)
+{
+   PRINT_ARRAY(prefix, color_branch_slots);
+}
+
 static int git_branch_config(const char *var, const char *value, void *cb)
 {
const char *slot_name;
diff --git a/builtin/clean.c b/builtin/clean.c
index 0ccd95e693..72cea1537e 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -16,6 +16,7 @@
 #include "column.h"
 #include "color.h"
 #include "pathspec.h"
+#include "help.h"
 
 static int force = -1; /* unset */
 static int interactive;
@@ -91,6 +92,11 @@ struct menu_stuff {
void *stuff;
 };
 
+void list_color_interactive_slots(const char *prefix)
+{
+   PRINT_ARRAY(prefix, color_interactive_slots);
+}
+
 static int git_clean_config(const char *var, const char *value, void *cb)
 {
const char *slot_name;
diff --git a/builtin/commit.c b/builtin/commit.c
index bee5825bd2..41c3c4f5cd 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -32,6 +32,7 @@
 #include "column.h"
 #include "sequencer.h"
 #include "mailmap.h"
+#include "help.h"
 
 static const char * const builtin_commit_usage[] = {
N_("git commit [] [--] ..."),
@@ -1194,6 +1195,12 @@ static int parse_status_slot(const char *slot)
return LOOKUP_CONFIG(color_status_slots, slot);
 }
 
+void list_color_status_slots(const char *prefix)
+{
+   printf("%s.%s\n", prefix, "added");
+   PRINT_ARRAY(prefix, color_status_slots);
+}
+
 static int git_status_config(const char *k, const char *v, void *cb)
 {
struct wt_status *s = cb;
diff --git a/builtin/help.c b/builtin/help.c
index 5727fb5e51..a1153cf473 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -36,6 +36,7 @@ static const char *html_path;
 
 static int show_all = 0;
 static int show_guides = 0;
+static int show_config;
 static int verbose;
 static unsigned int colopts;
 static enum help_format help_format = HELP_FORMAT_NONE;
@@ -44,6 +45,7 @@ static struct option builtin_help_options[] = {
OPT_BOOL('a', "all", _all, N_("print all available commands")),
OPT_HIDDEN_BOOL(0, "exclude-guides", _guides, N_("exclude 
guides")),
OPT_BOOL('g', "guides", _guides, N_("print list of useful 
guides")),
+   OPT_BOOL('c', "config", _config, N_("print list recognized config 
variables")),
OPT_SET_INT('m', "man", _format, N_("show man page"), 
HELP_FORMAT_MAN),
OPT_SET_INT('w', "web", _format, N_("show manual in web browser"),

[PATCH 5/9] advice: keep config name in camelCase in advice_config[]

2018-05-10 Thread Nguyễn Thái Ngọc Duy
For parsing, we don't really need this because the main config parser
will lowercase everything so we can do exact matching. But this array
now is also used for printing in `git help --config`. Keep camelCase
so we have a nice printout.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 advice.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/advice.c b/advice.c
index b211ebc588..d8ea93637a 100644
--- a/advice.c
+++ b/advice.c
@@ -25,27 +25,27 @@ static struct {
const char *name;
int *preference;
 } advice_config[] = {
-   { "pushupdaterejected", _push_update_rejected },
-   { "pushnonffcurrent", _push_non_ff_current },
-   { "pushnonffmatching", _push_non_ff_matching },
-   { "pushalreadyexists", _push_already_exists },
-   { "pushfetchfirst", _push_fetch_first },
-   { "pushneedsforce", _push_needs_force },
-   { "statushints", _status_hints },
-   { "statusuoption", _status_u_option },
-   { "commitbeforemerge", _commit_before_merge },
-   { "resolveconflict", _resolve_conflict },
-   { "implicitidentity", _implicit_identity },
-   { "detachedhead", _detached_head },
-   { "setupstreamfailure", _set_upstream_failure },
-   { "objectnamewarning", _object_name_warning },
-   { "rmhints", _rm_hints },
-   { "addembeddedrepo", _add_embedded_repo },
-   { "ignoredhook", _ignored_hook },
-   { "waitingforeditor", _waiting_for_editor },
+   { "pushUpdateRejected", _push_update_rejected },
+   { "pushNonFFCurrent", _push_non_ff_current },
+   { "pushNonFFMatching", _push_non_ff_matching },
+   { "pushAlreadyExists", _push_already_exists },
+   { "pushFetchFirst", _push_fetch_first },
+   { "pushNeedsForce", _push_needs_force },
+   { "statusHints", _status_hints },
+   { "statusUoption", _status_u_option },
+   { "commitBeforeMerge", _commit_before_merge },
+   { "resolveConflict", _resolve_conflict },
+   { "implicitIdentity", _implicit_identity },
+   { "detachedHead", _detached_head },
+   { "setupStreamFailure", _set_upstream_failure },
+   { "objectNameWarning", _object_name_warning },
+   { "rmHints", _rm_hints },
+   { "addEmbeddedRepo", _add_embedded_repo },
+   { "ignoredHook", _ignored_hook },
+   { "waitingForEditor", _waiting_for_editor },
 
/* make this an alias for backward compatibility */
-   { "pushnonfastforward", _push_update_rejected }
+   { "pushNonFastForward", _push_update_rejected }
 };
 
 void advise(const char *advice, ...)
@@ -76,7 +76,7 @@ int git_default_advice_config(const char *var, const char 
*value)
return 0;
 
for (i = 0; i < ARRAY_SIZE(advice_config); i++) {
-   if (strcmp(k, advice_config[i].name))
+   if (strcasecmp(k, advice_config[i].name))
continue;
*advice_config[i].preference = git_config_bool(var, value);
return 0;
-- 
2.17.0.705.g3525833791



[PATCH 8/9] completion: support case-insensitive config vars just a bit

2018-05-10 Thread Nguyễn Thái Ngọc Duy
config variable names are technically case-insensitive while this case
construct is by default case-sensitive. Moving it to case-insensitive
could be a lot more work. Instead let's just accept the form that is
more likely to show up in practice.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 8d3257c4de..e7829b5b24 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -2084,7 +2084,7 @@ __git_compute_config_vars ()
 _git_config ()
 {
case "$prev" in
-   branch.*.remote|branch.*.pushremote)
+   branch.*.remote|branch.*.push[Rr]emote)
__gitcomp_nl "$(__git_remotes)"
return
;;
@@ -2096,7 +2096,7 @@ _git_config ()
__gitcomp "false true preserve interactive"
return
;;
-   remote.pushdefault)
+   remote.push[Dd]efault)
__gitcomp_nl "$(__git_remotes)"
return
;;
@@ -2162,7 +2162,7 @@ _git_config ()
__gitcomp "$__git_send_email_suppresscc_options"
return
;;
-   sendemail.transferencoding)
+   sendemail.transfer[Ee]ncoding)
__gitcomp "7bit 8bit quoted-printable base64"
return
;;
-- 
2.17.0.705.g3525833791



[PATCH 9/9] log-tree: allow to customize 'grafted' color

2018-05-10 Thread Nguyễn Thái Ngọc Duy
Commit 76f5df305b (log: decorate grafted commits with "grafted" -
2011-08-18) lets us decorate grafted commits but I forgot about the
color.decorate.* config.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt | 3 ++-
 log-tree.c   | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 91f7eaed7b..c63d66906d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1138,7 +1138,8 @@ color.diff.::
 color.decorate.::
Use customized color for 'git log --decorate' output.  `` is one
of `branch`, `remoteBranch`, `tag`, `stash` or `HEAD` for local
-   branches, remote-tracking branches, tags, stash and HEAD, respectively.
+   branches, remote-tracking branches, tags, stash and HEAD, respectively
+   and `grafted` for grafted commits.
 
 color.grep::
When set to `always`, always highlight matches.  When `false` (or
diff --git a/log-tree.c b/log-tree.c
index 19cfebd231..414dbce0dd 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -34,6 +34,7 @@ static const char *color_decorate_slots[] = {
[DECORATION_REF_TAG]= "tag",
[DECORATION_REF_STASH]  = "stash",
[DECORATION_REF_HEAD]   = "HEAD",
+   [DECORATION_GRAFTED]= "grafted",
 };
 
 static const char *decorate_get_color(int decorate_use_color, enum 
decoration_type ix)
-- 
2.17.0.705.g3525833791



[PATCH 0/9] completion: avoid hard coding config var list

2018-05-10 Thread Nguyễn Thái Ngọc Duy
This is part of the attempt to automate more in git-completion.bash.
This series is about making 'git' binary provide the list of
recognized configuration variables and feeding it to git-completion.bash.

The approach is less than ideal. Unlike parse-options, we don't have a
reliable source for configuration variables. Instead, we go grep
config.txt for variable names.

This should work most of the time even if it might miss some variables
if we start to write funny things in config.txt. But compared to the
current approach "every now and then, manually go through config.txt
(or something) and update _git_config() function", this is still
definitely better.

The result is about 200 more completable options (from current 324).
Granted most of this is fsck message ids, but I'm sure it has picked
up a few good config variables and will continue so in the future.

While at there, this config list is also available to the user via
`git help --config` if you can't remember the exact config name you
want and don't want to go through that big git-config man page.

PS. Yes too many complete options (524 now) is not that helpful. I
have a plan to deal with this but probably after I'm done with the
--no-xxx completion topic. Stay tuned.

(on top of nd/command-list)

Nguyễn Thái Ngọc Duy (9):
  Add and use generic name->id mapping code for color slot parsing
  grep: keep all colors in an array
  fsck: factor out msg_id_info[] lazy initialization code
  help: add --config to list all available config
  advice: keep config name in camelCase in advice_config[]
  am: move advice.amWorkDir parsing back to advice.c
  completion: drop the hard coded list of config vars
  completion: support case-insensitive config vars just a bit
  log-tree: allow to customize 'grafted' color

 Documentation/config.txt   |   3 +-
 Documentation/git-help.txt |   5 +
 advice.c   |  51 ++--
 advice.h   |   1 +
 builtin/am.c   |   5 +-
 builtin/branch.c   |  28 +-
 builtin/clean.c|  28 +-
 builtin/commit.c   |  40 +--
 builtin/help.c |  14 +
 config.c   |  13 +
 config.h   |   4 +
 contrib/completion/git-completion.bash | 340 +
 diff.c |  60 ++---
 fsck.c |  49 ++--
 generate-cmdlist.sh|  20 ++
 grep.c | 113 
 grep.h |  21 +-
 help.c |  74 ++
 help.h |  18 ++
 log-tree.c |  36 +--
 t/t4254-am-corrupt.sh  |   2 +-
 21 files changed, 387 insertions(+), 538 deletions(-)

-- 
2.17.0.705.g3525833791



[PATCH 2/9] grep: keep all colors in an array

2018-05-10 Thread Nguyễn Thái Ngọc Duy
This is more inline with how we handle color slots in other code. It
also allows us to get the list of configurable color slots later.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 grep.c | 106 ++---
 grep.h |  21 +++-
 2 files changed, 62 insertions(+), 65 deletions(-)

diff --git a/grep.c b/grep.c
index 65b90c10a3..2f7ebe60f6 100644
--- a/grep.c
+++ b/grep.c
@@ -13,6 +13,17 @@ static int grep_source_is_binary(struct grep_source *gs);
 
 static struct grep_opt grep_defaults;
 
+static const char *color_grep_slots[] = {
+   [GREP_COLOR_CONTEXT] = "context",
+   [GREP_COLOR_FILENAME] = "filename",
+   [GREP_COLOR_FUNCTION] = "function",
+   [GREP_COLOR_LINENO] = "lineNumber",
+   [GREP_COLOR_MATCH_CONTEXT] = "matchContext",
+   [GREP_COLOR_MATCH_SELECTED] = "matchSelected",
+   [GREP_COLOR_SELECTED] = "selected",
+   [GREP_COLOR_SEP] = "separator",
+};
+
 static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 {
fwrite(buf, size, 1, stdout);
@@ -42,14 +53,14 @@ void init_grep_defaults(void)
opt->pathname = 1;
opt->max_depth = -1;
opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
-   color_set(opt->color_context, "");
-   color_set(opt->color_filename, "");
-   color_set(opt->color_function, "");
-   color_set(opt->color_lineno, "");
-   color_set(opt->color_match_context, GIT_COLOR_BOLD_RED);
-   color_set(opt->color_match_selected, GIT_COLOR_BOLD_RED);
-   color_set(opt->color_selected, "");
-   color_set(opt->color_sep, GIT_COLOR_CYAN);
+   color_set(opt->colors[GREP_COLOR_CONTEXT], "");
+   color_set(opt->colors[GREP_COLOR_FILENAME], "");
+   color_set(opt->colors[GREP_COLOR_FUNCTION], "");
+   color_set(opt->colors[GREP_COLOR_LINENO], "");
+   color_set(opt->colors[GREP_COLOR_MATCH_CONTEXT], GIT_COLOR_BOLD_RED);
+   color_set(opt->colors[GREP_COLOR_MATCH_SELECTED], GIT_COLOR_BOLD_RED);
+   color_set(opt->colors[GREP_COLOR_SELECTED], "");
+   color_set(opt->colors[GREP_COLOR_SEP], GIT_COLOR_CYAN);
opt->color = -1;
opt->output = std_output;
 }
@@ -76,7 +87,7 @@ static int parse_pattern_type_arg(const char *opt, const char 
*arg)
 int grep_config(const char *var, const char *value, void *cb)
 {
struct grep_opt *opt = _defaults;
-   char *color = NULL;
+   const char *slot;
 
if (userdiff_config(var, value) < 0)
return -1;
@@ -103,32 +114,18 @@ int grep_config(const char *var, const char *value, void 
*cb)
 
if (!strcmp(var, "color.grep"))
opt->color = git_config_colorbool(var, value);
-   else if (!strcmp(var, "color.grep.context"))
-   color = opt->color_context;
-   else if (!strcmp(var, "color.grep.filename"))
-   color = opt->color_filename;
-   else if (!strcmp(var, "color.grep.function"))
-   color = opt->color_function;
-   else if (!strcmp(var, "color.grep.linenumber"))
-   color = opt->color_lineno;
-   else if (!strcmp(var, "color.grep.matchcontext"))
-   color = opt->color_match_context;
-   else if (!strcmp(var, "color.grep.matchselected"))
-   color = opt->color_match_selected;
-   else if (!strcmp(var, "color.grep.selected"))
-   color = opt->color_selected;
-   else if (!strcmp(var, "color.grep.separator"))
-   color = opt->color_sep;
-   else if (!strcmp(var, "color.grep.match")) {
-   int rc = 0;
-   if (!value)
-   return config_error_nonbool(var);
-   rc |= color_parse(value, opt->color_match_context);
-   rc |= color_parse(value, opt->color_match_selected);
-   return rc;
-   }
-
-   if (color) {
+   if (!strcmp(var, "color.grep.match")) {
+   if (grep_config("color.grep.matchcontext", value, cb) < 0)
+   return -1;
+   if (grep_config("color.grep.matchselected", value, cb) < 0)
+   return -1;
+   } else if (skip_prefix(var, "color.grep.", )) {
+   int i = LOOKUP_CONFIG(color_grep_slots, slot);
+   char *color;
+
+   if (i < 0)
+   return -1;
+   color = opt->colors[i];
if (!value)
return config_error_nonbool(var);
return color_parse(value, color);
@@ -144,6 +141,7 @@ int grep_config(const char *var, const char *value, void 
*cb)
 void grep_init(struct grep_opt *opt, const char *prefix)
 {
struct grep_opt *def = _defaults;
+   int i;
 
memset(opt, 0, sizeof(*opt));
opt->prefix = prefix;
@@ -160,14 +158,8 @@ void grep_init(struct grep_opt *opt, const char *prefix)
opt->relative = def->relative;
opt->output = def->output;
 
-  

  1   2   >