[PATCH 0/3] Improve abbreviation disambiguation

2017-09-15 Thread Derrick Stolee
Hello, My name is Derrick Stolee and I just switched teams at Microsoft from the VSTS Git Server to work on performance improvements in core Git. This is my first patch submission, and I look forward to your feedback. Thanks, Stolee When displaying object ids, we frequently want to see

[PATCH 2/3] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-09-15 Thread Derrick Stolee
. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 57 ++--- 1 file changed, 42 insertions(+), 15 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index 134ac9742..f2a1ebe49 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -

[PATCH 1/3] sha1_name: Create perf test for find_unique_abbrev()

2017-09-15 Thread Derrick Stolee
repos. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Makefile | 1 + t/helper/.gitignore| 1 + t/helper/test-abbrev.c | 23 +++ t/perf/p0008-abbrev.sh | 12 4 files changed, 37 insertions(+) create mode 100644 t/helper/test-ab

[PATCH 3/3] sha1_name: Parse less while finding common prefix

2017-09-15 Thread Derrick Stolee
Create get_hex_char_from_oid() to parse oids one hex character at a time. This prevents unnecessary copying of hex characters in extend_abbrev_len() when finding the length of a common prefix. This change decreases the time to run test-abbrev by up to 40% on large repos. Signed-off-by: Derrick

Re: [PATCH 1/3] sha1_name: Create perf test for find_unique_abbrev()

2017-09-18 Thread Derrick Stolee
On 9/17/2017 5:51 PM, Junio C Hamano wrote: Derrick Stolee <dsto...@microsoft.com> writes: +int cmd_main(int ac, const char **av) +{ + setup_git_directory(); As far as I recall, we do not (yet) allow declaration after statement in our codebase. Move this down to make it aft

Re: [PATCH] cleanup: fix possible overflow errors in binary search

2017-10-06 Thread Derrick Stolee
On 10/6/2017 10:18 AM, Jeff King wrote: On Fri, Oct 06, 2017 at 09:52:31AM -0400, Derrick Stolee wrote: A common mistake when writing binary search is to allow possible integer overflow by using the simple average: mid = (min + max) / 2; Instead, use the overflow-safe version

[PATCH] cleanup: fix possible overflow errors in binary search

2017-10-06 Thread Derrick Stolee
: grep "/ 2;" *.c grep "/ 2;" */*.c grep "/2;" */*.c Making this cleanup will prevent future review friction when a new binary search is contructed based on existing code. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- builtin/ind

[PATCH v4 1/4] p4211-line-log.sh: add log --online --raw --parents perf test

2017-10-08 Thread Derrick Stolee
Add a new perf test for testing the performance of log while computing OID abbreviations. Using --oneline --raw and --parents options maximizes the number of OIDs to abbreviate while still spending some time computing diffs. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- t/perf

[PATCH v4 3/4] sha1_name: parse less while finding common prefix

2017-10-08 Thread Derrick Stolee
Create get_hex_char_from_oid() to parse oids one hex character at a time. This prevents unnecessary copying of hex characters in extend_abbrev_len() when finding the length of a common prefix. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 14 --

[PATCH v4 4/4] sha1_name: minimize OID comparisons during disambiguation

2017-10-08 Thread Derrick Stolee
three copies of the Linux repo: | Packs | Loose | HEAD~3 | HEAD | Rel% | |---||--|--|---| | 1| 0 | 41.27 s | 38.93 s | -4.8% | | 24| 0 | 98.04 s | 91.35 s | -5.7% | | 23| 323952 | 117.78 s | 112.18 s | -4.8% | Signed-off-by: Derrick

[PATCH v4 0/4] Improve abbreviation disambiguation

2017-10-08 Thread Derrick Stolee
% | |---||---|--|---| | 1| 0 | 41.27 s | 38.93 s | -4.8% | | 24| 0 | 98.04 s | 91.35 s | -5.7% | | 23| 323952 | 117.78 s | 112.18 s | -4.8% | Derrick Stolee (4): p4211-line-log.sh: add log --online --raw --parents perf test sha1_name: Unroll len loop

[PATCH v4 2/4] sha1_name: unroll len loop in find_unique_abbrev_r

2017-10-08 Thread Derrick Stolee
. The focus of this change is to refactor the existing method in a way that clearly does not change the current behavior. In some cases, the new method is slower than the previous method. Later changes will correct all performance loss. Signed-off-by: Derrick Stolee <dsto...@microsoft.

[PATCH v2] cleanup: fix possible overflow errors in binary search

2017-10-08 Thread Derrick Stolee
conditioned on "min < max". The included changes were found using the following git grep: git grep '/ *2;' '*.c' Making this cleanup will prevent future review friction when a new binary search is contructed based on existing code. Signed-off-by: Derrick Stolee <dsto...@microsoft.c

Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-10-04 Thread Derrick Stolee
On 10/4/2017 2:07 AM, Junio C Hamano wrote: Derrick Stolee <dsto...@microsoft.com> writes: - exists = has_sha1_file(sha1); - while (len < GIT_SHA1_HEXSZ) { - struct object_id oid_ret; - status = get_short_oid(hex, len, _ret, GET_OI

Re: [PATCH] test-list-objects: mark file-local symbols as static

2017-10-04 Thread Derrick Stolee
On 10/3/2017 5:51 PM, Ramsay Jones wrote: Signed-off-by: Ramsay Jones --- Hi Derrick, If you need to re-roll your 'ds/find-unique-abbrev-optim' branch, could you please squash this into the relevant patch (commit 3792c78ba0, "test-list-objects: list a subset of

Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-10-04 Thread Derrick Stolee
On 10/4/2017 2:10 AM, Junio C Hamano wrote: Derrick Stolee <sto...@gmail.com> writes: ... I understand that this patch on its own does not have good numbers. I split the patches 3 and 4 specifically to highlight two distinct changes: Patch 3: Unroll the len loop that may inspect all

Re: [PATCH v4 4/4] sha1_name: minimize OID comparisons during disambiguation

2017-10-10 Thread Derrick Stolee
On 10/10/2017 8:56 AM, Junio C Hamano wrote: Jeff King writes: OK, I think that makes more sense. But note the p->num_objects thing I mentioned. If I do: git pack-objects .git/objects/pack/pack num_objects) return; Technically that also covers open_pack_index()

Re: [PATCH v4 4/4] sha1_name: minimize OID comparisons during disambiguation

2017-10-10 Thread Derrick Stolee
On 10/9/2017 9:49 AM, Jeff King wrote: On Sun, Oct 08, 2017 at 02:49:42PM -0400, Derrick Stolee wrote: @@ -505,6 +506,65 @@ static int extend_abbrev_len(const struct object_id *oid, void *cb_data) return 0; } +static void find_abbrev_len_for_pack(struct packed_git *p

Re: git-clone causes out of memory

2017-10-13 Thread Derrick Stolee
On 10/13/2017 8:44 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 03:12:43PM +0300, Constantine wrote: On 13.10.2017 15:04, Junio C Hamano wrote: Christian Couder writes: Yeah, but perhaps Git could be smarter when rev-listing too and avoid processing files or

Re: [PATCH v2 1/5] test-list-objects: List a subset of object ids

2017-10-05 Thread Derrick Stolee
On 10/5/2017 6:00 AM, Jeff King wrote: On Thu, Oct 05, 2017 at 06:48:10PM +0900, Junio C Hamano wrote: Jeff King writes: This is weirdly specific. Can we accomplish the same thing with existing tools? E.g., could: git cat-file --batch-all-objects

Re: git-clone causes out of memory

2017-10-13 Thread Derrick Stolee
On 10/13/2017 9:15 AM, Derrick Stolee wrote: On 10/13/2017 8:44 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 03:12:43PM +0300, Constantine wrote: On 13.10.2017 15:04, Junio C Hamano wrote: Christian Couder <christian.cou...@gmail.com> writes: Yeah, but perhaps Git could be smarter wh

Re: git-clone causes out of memory

2017-10-13 Thread Derrick Stolee
On 10/13/2017 10:26 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 10:25:10AM -0400, Derrick Stolee wrote: This does appear to be the problem. The missing DIFF_OPT_HAS_CHANGES is causing diff_can_quit_early() to return false. Due to the corner-case of the bug it seems it will not be a huge

Re: git-clone causes out of memory

2017-10-13 Thread Derrick Stolee
On 10/13/2017 9:50 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 09:39:14AM -0400, Derrick Stolee wrote: Since I don't understand enough about the consumers to diff_tree_oid() (and the fact that the recursive behavior may be wanted in some cases), I think we can fix this in builtin/rev-list.c

Re: [PATCH] revision: quit pruning diff more quickly when possible

2017-10-13 Thread Derrick Stolee
On 10/13/2017 11:27 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 10:26:46AM -0400, Jeff King wrote: On Fri, Oct 13, 2017 at 10:25:10AM -0400, Derrick Stolee wrote: This does appear to be the problem. The missing DIFF_OPT_HAS_CHANGES is causing diff_can_quit_early() to return false. Due

Re: git-clone causes out of memory

2017-10-13 Thread Derrick Stolee
On 10/13/2017 10:20 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 10:10:18AM -0400, Jeff King wrote: Hmm. So this patch makes it go fast: diff --git a/revision.c b/revision.c index d167223e69..b52ea4e9d8 100644 --- a/revision.c +++ b/revision.c @@ -409,7 +409,7 @@ static void

[PATCH v5 3/4] sha1_name: parse less while finding common prefix

2017-10-12 Thread Derrick Stolee
Create get_hex_char_from_oid() to parse oids one hex character at a time. This prevents unnecessary copying of hex characters in extend_abbrev_len() when finding the length of a common prefix. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 14 --

[PATCH v5 1/4] p4211-line-log.sh: add log --online --raw --parents perf test

2017-10-12 Thread Derrick Stolee
Add a new perf test for testing the performance of log while computing OID abbreviations. Using --oneline --raw and --parents options maximizes the number of OIDs to abbreviate while still spending some time computing diffs. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- t/perf

[PATCH v5 2/4] sha1_name: unroll len loop in find_unique_abbrev_r

2017-10-12 Thread Derrick Stolee
. The focus of this change is to refactor the existing method in a way that clearly does not change the current behavior. In some cases, the new method is slower than the previous method. Later changes will correct all performance loss. Signed-off-by: Derrick Stolee <dsto...@microsoft.

[PATCH v5 4/4] sha1_name: minimize OID comparisons during disambiguation

2017-10-12 Thread Derrick Stolee
three copies of the Linux repo: | Packs | Loose | HEAD~3 | HEAD | Rel% | |---||--|--|---| | 1| 0 | 41.27 s | 38.93 s | -4.8% | | 24| 0 | 98.04 s | 91.35 s | -5.7% | | 23| 323952 | 117.78 s | 112.18 s | -4.8% | Signed-off-by: Derrick

[PATCH v5 0/4] Improve abbreviation disambiguation

2017-10-12 Thread Derrick Stolee
| 41.27 s | 38.93 s | -4.8% | | 24| 0 | 98.04 s | 91.35 s | -5.7% | | 23| 323952 | 117.78 s | 112.18 s | -4.8% | Derrick Stolee (4): p4211-line-log.sh: add log --online --raw --parents perf test sha1_name: unroll len loop in find_unique_abbrev_r sha1_name: parse less while

Re: [PATCH v5 0/4] Improve abbreviation disambiguation

2017-10-12 Thread Derrick Stolee
On 10/12/2017 8:02 AM, Derrick Stolee wrote: Changes since previous version: * Make 'pos' unsigned in get_hex_char_from_oid() * Check response from open_pack_index() * Small typos in commit messages Thanks, Stolee I forgot to mention that I rebased on master this morning to be sure

[PATCH v3 2/5] p0008-abbrev.sh: Test find_unique_abbrev() perf

2017-10-02 Thread Derrick Stolee
probability). Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Makefile | 1 + t/helper/.gitignore| 1 + t/helper/test-abbrev.c | 18 ++ t/perf/p0008-abbrev.sh | 22 ++ 4 files changed, 42 insertions(+) create mode 100644 t/

[PATCH v3 4/5] sha1_name: Parse less while finding common prefix

2017-10-02 Thread Derrick Stolee
running in Windows Subsystem for Linux: Pack Files: 50 Packed Objects: 22,385,898 Loose Objects: 492 Base Time: 3.91 s New Time: 3.08 s Rel %: -21.1% Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 14 -- 1 file changed, 12 insertions

[PATCH v3 5/5] sha1_name: Minimize OID comparisons during disambiguation

2017-10-02 Thread Derrick Stolee
Time: 2.72 s Rel %: -11.8 Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 70 + 1 file changed, 66 insertions(+), 4 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index 5081aeb71..54b3a37da

[PATCH v3 1/5] test-list-objects: List a subset of object ids

2017-10-02 Thread Derrick Stolee
to avoid clustering and therefore quite uniformly distributed. If a command line argument "--missing" is given before the sample count, then a list of OIDs is generated without examining the repo. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Makefile

[PATCH v3 0/5] Improve abbreviation disambituation

2017-10-02 Thread Derrick Stolee
o running in Windows Subsystem for Linux: Pack Files: 50 Packed Objects: 22,385,898 Loose Objects: 492 Base Time: 38.9 s New Time: 2.7 s Rel %: -93.1% Derrick Stolee (5): test-list-objects: List a subset of object ids p0008-abbrev.sh: Test find_unique_abbrev() perf s

[PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-10-02 Thread Derrick Stolee
-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 57 ++--- 1 file changed, 42 insertions(+), 15 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index 134ac9742..f2a1ebe49 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -474,10 +

Re: [PATCH v2 4/5] sha1_name: Parse less while finding common prefix

2017-10-02 Thread Derrick Stolee
My v3 patch is incoming, but I wanted to respond directly to this message. On 9/25/2017 7:42 PM, Stefan Beller wrote: On Mon, Sep 25, 2017 at 2:54 AM, Derrick Stolee <dsto...@microsoft.com> wrote: Create get_hex_char_from_oid() to parse oids one hex character at a time. This pr

Re: What's cooking in git.git (Sep 2017, #06; Fri, 29)

2017-09-29 Thread Derrick Stolee
Hi Junio, On 9/29/2017 12:34 AM, Junio C Hamano wrote: * ds/find-unique-abbrev-optim (2017-09-19) 4 commits - SQUASH??? - sha1_name: parse less while finding common prefix - sha1_name: unroll len loop in find_unique_abbrev_r() - sha1_name: create perf test for find_unique_abbrev()

Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-10-03 Thread Derrick Stolee
On 10/3/2017 6:49 AM, Junio C Hamano wrote: Derrick Stolee <dsto...@microsoft.com> writes: p0008.1: find_unique_abbrev() for existing objects -- For 10 repeated tests, each checking 100,000 known objects, we find the following result

Re: [PATCH v3 5/5] sha1_name: Minimize OID comparisons during disambiguation

2017-10-03 Thread Derrick Stolee
On 10/3/2017 11:55 AM, Stefan Beller wrote: @@ -505,6 +506,65 @@ static int extend_abbrev_len(const struct object_id *oid, void *cb_data) return 0; } +static void find_abbrev_len_for_pack(struct packed_git *p, +struct min_abbrev_data *mad) +{ +

[PATCH v2 2/5] p0008-abbrev.sh: Test find_unique_abbrev() perf

2017-09-25 Thread Derrick Stolee
probability). For each test, use `sort -R` to (deterministically) shuffle the sample of object ids to not check abbreviations in lexicographic order. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Makefile | 1 + t/helper/.gitignore| 1 + t/helper/test-abbrev.

[PATCH v2 0/5] Improve abbreviation disambiguation

2017-09-25 Thread Derrick Stolee
: 22,385,898 Loose Objects: 492 Base Time: 38.9 s New Time: 2.7 s Rel %: -93.1% Derrick Stolee (5): test-list-objects: List a subset of object ids p0008-abbrev.sh: Test find_unique_abbrev() perf sha1_name: Unroll len loop in find_unique_abbrev_r sha1_name: Parse less while

[PATCH v2 4/5] sha1_name: Parse less while finding common prefix

2017-09-25 Thread Derrick Stolee
running in Windows Subsystem for Linux: Pack Files: 50 Packed Objects: 22,385,898 Loose Objects: 492 Base Time: 3.91 s New Time: 3.08 s Rel %: -21.1% Signed-off-by: Derrick Stolee <dsto...@microsoft.com> Signed-off-by: Derrick Stolee <dsto...@microsoft.com> ---

[PATCH v2 5/5] sha1_name: Minimize OID comparisons during disambiguation

2017-09-25 Thread Derrick Stolee
Time: 2.72 s Rel %: -11.8 Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 70 + 1 file changed, 66 insertions(+), 4 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index bb47b6702..1566cd4fc

[PATCH v2 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-09-25 Thread Derrick Stolee
Stolee <dsto...@microsoft.com> Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 57 ++--- 1 file changed, 42 insertions(+), 15 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index 134ac9742..f2a1

[PATCH v2 1/5] test-list-objects: List a subset of object ids

2017-09-25 Thread Derrick Stolee
to avoid clustering and therefore quite uniformly distributed. If a second command line argument "--missing" is given, then a list of OIDs is generated without examining the repo. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Makefile | 1 + t/h

Re: [PATCH v2 1/5] test-list-objects: List a subset of object ids

2017-10-07 Thread Derrick Stolee
On 10/6/2017 10:11 AM, Jeff King wrote: On Thu, Oct 05, 2017 at 08:39:42AM -0400, Derrick Stolee wrote: I'll run some perf numbers for these commands you recommend, and also see if I can replicate some of the pain points that triggered this change using the Linux repo. Thanks! -Peff In my

Re: [PATCH v4 4/4] sha1_name: minimize OID comparisons during disambiguation

2017-10-11 Thread Derrick Stolee
On 10/10/2017 9:30 AM, Jeff King wrote: On Tue, Oct 10, 2017 at 09:11:15AM -0400, Derrick Stolee wrote: On 10/10/2017 8:56 AM, Junio C Hamano wrote: Jeff King <p...@peff.net> writes: OK, I think that makes more sense. But note the p->num_objects thing I mentioned. If I do:

Re: cherry-pick very slow on big repository

2017-11-10 Thread Derrick Stolee
On 11/10/2017 7:37 AM, Peter Krefting wrote: Jeff King: Can you get a backtrace? I'd do something like: Seems that it spends most time in diffcore_count_changes(), that is where it hits whenever I hit Ctrl+C (various line numbers 199-207 in diffcore-delta.c; this is on the v2.15.0 tag).

Re: [PATCH v2] sha1_file: use strbuf_add() instead of strbuf_addf()

2017-12-04 Thread Derrick Stolee
On 12/4/2017 11:56 AM, Jeff King wrote: When you put your cover letter first, you need to use "scissors" like: -- >8 -- to separate it from the commit message. Using three-dashes means "git am" will include your cover letter as the commit message, and omit your real commit message entirely.

Re: [PATCH] sha1_file: use strbuf_add() instead of strbuf_addf()

2017-12-01 Thread Derrick Stolee
On 12/1/2017 1:22 PM, Jeff King wrote: On Fri, Dec 01, 2017 at 12:49:56PM -0500, Derrick Stolee wrote: [snip] diff --git a/sha1_file.c b/sha1_file.c index 8ae6cb6285..2160323c4a 100644 This overall looks good, but I noticed one bug and a few cosmetic improvements. Thanks for finding quality

Re: Question about the ahead-behind computation and status

2017-12-15 Thread Derrick Stolee
On 12/15/2017 10:08 AM, Jeff Hostetler wrote: On 12/15/2017 5:08 AM, Jeff King wrote: On Thu, Dec 14, 2017 at 04:49:31PM -0500, Jeff Hostetler wrote: [*] Sadly, the local repo was only about 20 days out of date (including the Thanksgiving holidays) Taking 20 seconds to traverse 20

Re: Question about the ahead-behind computation and status

2017-12-15 Thread Derrick Stolee
On 12/15/2017 1:30 PM, Junio C Hamano wrote: Derrick Stolee <sto...@gmail.com> writes: The biggest reason for the 20 seconds is not just the number of commits in the ahead/behind but how many commits are walked (including common to both branches) before paint_down_to_common() breaks its

[PATCH v2] sha1_file: use strbuf_add() instead of strbuf_addf()

2017-12-04 Thread Derrick Stolee
-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_file.c | 12 +++- t/perf/p4211-line-log.sh | 4 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 8ae6cb6285..2fc8fa93b4 100644 --- a/sha1_file.c +++ b/sha1_

Re: [PATCH] builtin/tag.c: return appropriate value when --points-at finds an empty list

2017-12-11 Thread Derrick Stolee
On 12/11/2017 8:44 AM, George Papanikolaou wrote: `git tag --points-at` can simply return if the given rev does not have any tags pointing to it. It's not a failure but it shouldn't return with 0 value. I disagree. I think the 0 return means "I completed successfully" and the empty output

[PATCH] sha1_file: use strbuf_add() instead of strbuf_addf()

2017-12-01 Thread Derrick Stolee
: HEAD~1HEAD 7.70(7.15+0.54) 7.44(7.09+0.29) -3.4% Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_file.c | 7 --- t/perf/p4211-line-log.sh | 4 2 files changed, 8 insertions(+), 3 deletions(-) diff

[RFC] 'unsigned long' to 'size_t' conversion

2017-12-06 Thread Derrick Stolee
There are several places in Git where we refer to the size of an object by an 'unsigned long' instead of a 'size_t'. In 64-bit Linux, 'unsigned long' is 8 bytes, but in 64-bit Windows it is 4 bytes. The main issue with this conversion is that large objects fail to load (they seem to hash and

Re: [RFC] protocol version 2

2017-10-25 Thread Derrick Stolee
On 10/20/2017 1:18 PM, Brandon Williams wrote: Overview == This document presents a specification for a version 2 of Git's wire protocol. Protocol v2 will improve upon v1 in the following ways: * Instead of multiple service names, multiple commands will be supported by a

Re: [RFC] Other chunks for commit-graph, part 1 - Bloom filters, topo order, etc.

2018-05-07 Thread Derrick Stolee
//public-inbox.org/git/alpine.DEB.2.20.1803091557510.23109@alexmv-linux/ I think it was what Derrick Stolee was talking about at the end of his part of "Making Git for Windows" presentation at Git Merge 2018: https://youtu.be/oOMzi983Qmw?t=1835 This was also mentioned in subthread of &quo

Re: What's cooking in git.git (May 2018, #01; Mon, 7)

2018-05-07 Thread Derrick Stolee
On 5/7/2018 10:58 AM, Junio C Hamano wrote: * ds/generation-numbers (2018-05-02) 11 commits - commit-graph.txt: update design document - merge: check config before loading commits - commit: use generation number in remove_redundant() - commit: add short-circuit to paint_down_to_common()

Re: [PATCH 4/4] mark_parents_uninteresting(): avoid most allocation

2018-05-14 Thread Derrick Stolee
On 5/14/2018 9:09 AM, Jeff King wrote: On Mon, May 14, 2018 at 08:47:33AM -0400, Derrick Stolee wrote: On 5/11/2018 2:03 PM, Jeff King wrote: Commit 941ba8db57 (Eliminate recursion in setting/clearing marks in commit list, 2012-01-14) used a clever double-loop to avoid allocations for single

Re: [PATCH 0/4] a few mark_parents_uninteresting cleanups

2018-05-14 Thread Derrick Stolee
insertions(+), 40 deletions(-) -Peff This series looks good to me. I found Patch 3 hard to read in the diff, so I just looked at the final result and the new arrangement is very clear about how it should behave. Reviewed-by: Derrick Stolee <dsto...@microsoft.com>

Re: [RFC] Other chunks for commit-graph, part 1 - Bloom filters, topo order, etc.

2018-05-14 Thread Derrick Stolee
On 5/12/2018 10:00 AM, Jakub Narebski wrote: Derrick Stolee <sto...@gmail.com> writes: On 5/4/2018 3:40 PM, Jakub Narebski wrote: With early parts of commit-graph feature (ds/commit-graph and ds/lazy-load-trees) close to being merged into "master", see https://public-inbox.org/

Re: [PATCH v2 01/12] commit-graph: add 'verify' subcommand

2018-05-14 Thread Derrick Stolee
On 5/12/2018 9:31 AM, Martin Ågren wrote: On 11 May 2018 at 23:15, Derrick Stolee <dsto...@microsoft.com> wrote: graph_name = get_commit_graph_filename(opts.obj_dir); graph = load_commit_graph_one(graph_name); + FREE_AND_NULL(graph_name); if (

Re: [PATCH 4/4] mark_parents_uninteresting(): avoid most allocation

2018-05-14 Thread Derrick Stolee
On 5/11/2018 2:03 PM, Jeff King wrote: Commit 941ba8db57 (Eliminate recursion in setting/clearing marks in commit list, 2012-01-14) used a clever double-loop to avoid allocations for single-parent chains of history. However, it did so only when following parents of parents (which was an uncommon

Re: [PATCH v2 02/12] commit-graph: verify file header information

2018-05-14 Thread Derrick Stolee
On 5/12/2018 9:35 AM, Martin Ågren wrote: +static int verify_commit_graph_error; + +static void graph_report(const char *fmt, ...) +{ + va_list ap; + struct strbuf sb = STRBUF_INIT; + verify_commit_graph_error = 1; + + va_start(ap, fmt); + strbuf_vaddf(, fmt, ap); +

Re: [PATCH v2 08/12] commit-graph: verify commit contents against odb

2018-05-14 Thread Derrick Stolee
On 5/12/2018 5:17 PM, Martin Ågren wrote: On 11 May 2018 at 23:15, Derrick Stolee <dsto...@microsoft.com> wrote: When running 'git commit-graph verify', compare the contents of the commits that are loaded from the commit-graph file with commits that are loaded directly from the object da

Re: [PATCH v2 14/14] commit.h: delete 'util' field in struct commit

2018-05-14 Thread Derrick Stolee
On 5/14/2018 1:30 PM, Duy Nguyen wrote: On Mon, May 14, 2018 at 6:07 PM, Duy Nguyen wrote: On Mon, May 14, 2018 at 04:52:29PM +0900, Junio C Hamano wrote: Nguyễn Thái Ngọc Duy writes: diff --git a/commit.h b/commit.h index 838f6a6b26..70371e111e

Re: Implementing reftable in Git

2018-05-09 Thread Derrick Stolee
On 5/9/2018 10:33 AM, Christian Couder wrote: Hi, I might start working on implementing reftable in Git soon. During the last Git Merge conference last March Stefan talked about reftable. In Alex Vandiver's notes [1] it is asked that people announce it on the list when they start working on

Re: [PATCH 1/1] commit-graph: fix UX issue when .lock file exists

2018-05-09 Thread Derrick Stolee
On 5/9/2018 10:42 AM, Jeff King wrote: On Wed, May 09, 2018 at 02:15:38PM +, Derrick Stolee wrote: The commit-graph file lives in the .git/objects/info directory. Previously, a failure to acquire the commit-graph.lock file was assumed to be due to the lack of the info directory, so a mkdir

[PATCH 0/1] Fix UX issue with commit-graph.lock

2018-05-09 Thread Derrick Stolee
e manually to continue. This patch is based on ds/generation-numbers Derrick Stolee (1): commit-graph: fix UX issue when .lock file exists commit-graph.c | 24 1 file changed, 8 insertions(+), 16 deletions(-) base-commit: 7547b95b4fbb8591726b1d9381c176cc27fc6aea --

[PATCH 1/1] commit-graph: fix UX issue when .lock file exists

2018-05-09 Thread Derrick Stolee
fatal: cannot mkdir .git/objects/info: File exists" Rearrange the expectations of this directory existing to avoid this error, and instead show the typical message when a lockfile already exists. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.c | 24 ---

Re: [PATCH 0/1] Fix UX issue with commit-graph.lock

2018-05-10 Thread Derrick Stolee
On 5/10/2018 3:00 AM, Junio C Hamano wrote: Derrick Stolee <dsto...@microsoft.com> writes: We use the lockfile API to avoid multiple Git processes from writing to the commit-graph file in the .git/objects/info directory. In some cases, this directory may not exist, so we check f

Re: [PATCH 2/2] replace-object.c: remove the_repository from prepare_replace_object

2018-05-10 Thread Derrick Stolee
On 5/10/2018 7:56 AM, Jeff King wrote: On Thu, May 10, 2018 at 07:23:13PM +0900, Junio C Hamano wrote: This one was doing ptr = xmalloc(sizeof(*another_ptr)) and it was OK because ptr and another_ptr happened to be of the same type. I wonder if we are making it safer, or making it

[PATCH 10/12] gc: automatically write commit-graph files

2018-05-10 Thread Derrick Stolee
-trivial 'git gc' command. Defaults to false while the commit-graph feature matures. We specifically do not want to turn this on by default until the commit-graph feature is fully integrated with history-modifying features like shallow clones. Signed-off-by: Derrick Stolee <dsto...@microsoft.

[PATCH 09/12] commit-graph: add '--reachable' option

2018-05-10 Thread Derrick Stolee
' after performing cleanup of the object database. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.txt | 8 ++-- builtin/commit-graph.c | 41 ++ t/t5318-commit-graph.sh| 10 ++ 3

[PATCH 05/12] commit: force commit to parse from object database

2018-05-10 Thread Derrick Stolee
was loaded from the commit-graph so values such as generation numbers are loaded. Hence, this method should not be called unless the intention is explicit in avoiding commits from the commit-graph file. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit.c | 13 + commit.

[PATCH 02/12] commit-graph: verify file header information

2018-05-10 Thread Derrick Stolee
During a run of 'git commit-graph verify', list the issues with the header information in the commit-graph file. Some of this information is inferred from the loaded 'struct commit_graph'. Some header information is checked as part of load_commit_graph_one(). Signed-off-by: Derrick Stolee <d

[PATCH 01/12] commit-graph: add 'verify' subcommand

2018-05-10 Thread Derrick Stolee
state. Do not report any errors. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.txt | 6 ++ builtin/commit-graph.c | 38 ++ commit-graph.c | 5 + commit-g

[PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-10 Thread Derrick Stolee
ent in my skills in this world, but the feature is worthless if it does not improve the user experience. Thanks, -Stolee Derrick Stolee (12): Commits 01-07 focus on the 'git commit-graph verify' subcommand. These are ready for full, rigorous review. commit-graph: add 'verify' subcommand co

[PATCH 11/12] fetch: compute commit-graph by default

2018-05-10 Thread Derrick Stolee
while the commit-graph feature matures. Specifically, we do not want this on by default until the commit-graph feature integrates with history-modifying features such as shallow clones. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/config.txt | 4 builtin/f

[PATCH 08/12] fsck: verify commit-graph

2018-05-10 Thread Derrick Stolee
-graph file from the other fsck details. 2. The commit-graph verification requires the commits to be loaded in a specific order to guarantee we parse from the commit-graph file for some objects and from the object database for others. Signed-off-by: Derrick Stolee <dsto...@microsoft.

[PATCH 07/12] commit-graph: verify commit contents against odb

2018-05-10 Thread Derrick Stolee
the initial loop through the object IDs to guarantee we parse from the commit-graph file. In addition, verify the generation number calculation is correct for all commits in the commit-graph file. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.

[PATCH 04/12] commit-graph: verify fanout and lookup table

2018-05-10 Thread Derrick Stolee
While running 'git commit-graph verify', verify that the object IDs are listed in lexicographic order and that the fanout table correctly navigates into that list of object IDs. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.c | 33 ++

[PATCH 12/12] commit-graph: update design document

2018-05-10 Thread Derrick Stolee
The commit-graph feature is now integrated with 'fsck' and 'gc', so remove those items from the "Future Work" section of the commit-graph design document. Also remove the section on lazy-loading trees, as that was completed in an earlier patch series. Signed-off-by: Derrick St

[PATCH 03/12] commit-graph: parse commit from chosen graph

2018-05-10 Thread Derrick Stolee
Before verifying a commit-graph file against the object database, we need to parse all commits from the given commit-graph file. Create parse_commit_in_graph_one() to target a given struct commit_graph. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.

[PATCH 06/12] commit-graph: load a root tree from specific graph

2018-05-10 Thread Derrick Stolee
When lazy-loading a tree for a commit, it will be important to select the tree from a specific struct commit_graph. Create a new method that specifies the commit-graph file and use that in get_commit_tree_in_graph(). Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.

[PATCH v2] commit-graph: fix UX issue when .lock file exists

2018-05-10 Thread Derrick Stolee
o continue. Helped-by: Jeff King <p...@peff.net> Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.c | 22 +- 1 file changed, 5 insertions(+), 17 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index a8c337dd77..bb54c1214c 100644 --- a/commi

Re: [PATCH 04/12] commit-graph: verify fanout and lookup table

2018-05-11 Thread Derrick Stolee
On 5/10/2018 2:29 PM, Martin Ågren wrote: On 10 May 2018 at 19:34, Derrick Stolee <dsto...@microsoft.com> wrote: While running 'git commit-graph verify', verify that the object IDs are listed in lexicographic order and that the fanout table correctly navigates into that list of obje

Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-11 Thread Derrick Stolee
On 5/10/2018 4:45 PM, Martin Ågren wrote: On 10 May 2018 at 21:22, Stefan Beller wrote: On Thu, May 10, 2018 at 12:05 PM, Martin Ågren wrote: I hope to find time to do some more hands-on testing of this to see that errors actually do get caught.

Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-11 Thread Derrick Stolee
On 5/10/2018 3:22 PM, Stefan Beller wrote: On Thu, May 10, 2018 at 12:05 PM, Martin Ågren <martin.ag...@gmail.com> wrote: On 10 May 2018 at 19:34, Derrick Stolee <dsto...@microsoft.com> wrote: Commits 01-07 focus on the 'git commit-graph verify' subcommand. These are ready for fu

Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-11 Thread Derrick Stolee
On 5/10/2018 3:17 PM, Ævar Arnfjörð Bjarmason wrote: On Thu, May 10 2018, Derrick Stolee wrote: The behavior in this patch series does the following: 1. Near the end of 'git gc', run 'git commit-graph write'. The location of this code assumes that a 'git gc --auto' has not terminated

[PATCH v2 02/12] commit-graph: verify file header information

2018-05-11 Thread Derrick Stolee
During a run of 'git commit-graph verify', list the issues with the header information in the commit-graph file. Some of this information is inferred from the loaded 'struct commit_graph'. Some header information is checked as part of load_commit_graph_one(). Signed-off-by: Derrick Stolee <d

[PATCH v2 01/12] commit-graph: add 'verify' subcommand

2018-05-11 Thread Derrick Stolee
state. Do not report any errors. During review, we noticed that a FREE_AND_NULL(graph_name) was placed after a possible 'return', and this pattern was also in graph_read(). Fix that case, too. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.tx

[PATCH v2 03/12] commit-graph: test that 'verify' finds corruption

2018-05-11 Thread Derrick Stolee
Add test cases to t5318-commit-graph.sh that corrupt the commit-graph file and check that the 'git commit-graph verify' command fails. These tests verify the header and chunk information is checked carefully. Helped-by: Martin Ågren <martin.ag...@gmail.com> Signed-off-by: Derrick Stolee

[PATCH v2 00/12] Integrate commit-graph into fsck and gc

2018-05-11 Thread Derrick Stolee
by default in graph_report(). The integration into 'fetch' is dropped (thanks Ævar!). Derrick Stolee (12): commit-graph: add 'verify' subcommand commit-graph: verify file header information commit-graph: test that 'verify' finds corruption commit-graph: parse commit from chosen graph

[PATCH v2 06/12] commit: force commit to parse from object database

2018-05-11 Thread Derrick Stolee
is explicit in avoiding commits from the commit-graph file. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit.c | 13 + commit.h | 1 + 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/commit.c b/commit.c index 1d28677dfb..7c92350373

[PATCH v2 05/12] commit-graph: verify fanout and lookup table

2018-05-11 Thread Derrick Stolee
matches the exact error we inserted, but since our OID order test triggers incorrect fanout values (with possibly different numbers of output lines) we focus only that the correct error is written in that case. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.c

[PATCH v2 04/12] commit-graph: parse commit from chosen graph

2018-05-11 Thread Derrick Stolee
Before verifying a commit-graph file against the object database, we need to parse all commits from the given commit-graph file. Create parse_commit_in_graph_one() to target a given struct commit_graph. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.

  1   2   3   4   5   6   7   8   9   10   >