Re: [PATCH v3 8/9] commit-graph: always load commit-graph information

2018-04-23 Thread Derrick Stolee
On 4/18/2018 8:02 PM, Jakub Narebski wrote: Derrick Stolee <dsto...@microsoft.com> writes: Most code paths load commits using lookup_commit() and then parse_commit(). In some cases, including some branch lookups, the commit is parsed using parse_object_buffer() which side-steps parse_

Re: [PATCH v3 5/9] ref-filter: use generation number for --contains

2018-04-23 Thread Derrick Stolee
On 4/18/2018 5:02 PM, Jakub Narebski wrote: Here I can offer only the cursory examination, as I don't know this area of code in question. Derrick Stolee <dsto...@microsoft.com> writes: A commit A can reach a commit B only if the generation number of A is larger than the generation numbe
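
The pruning rule quoted in this snippet (a commit A can reach a commit B only if A's generation number is larger) can be illustrated with a small sketch. This is not the ref-filter code from the patch: the `parents` and `generation` dictionaries, the function name `can_reach`, and the sample history are all assumptions made up for illustration.

```python
def can_reach(parents, generation, start, target):
    """Return True if `target` is reachable from `start` by following parents.

    parents:    dict mapping commit id -> list of parent commit ids
    generation: dict mapping commit id -> generation number
                (1 for root commits, else 1 + max over the parents)
    Both structures are hypothetical stand-ins for Git's commit data.
    """
    if start == target:
        return True
    stack = [start]
    seen = {start}
    while stack:
        commit = stack.pop()
        for parent in parents[commit]:
            if parent == target:
                return True
            # A parent whose generation is not larger than the target's
            # cannot reach the target, so its history is never walked.
            if generation[parent] <= generation[target] or parent in seen:
                continue
            seen.add(parent)
            stack.append(parent)
    return False


# Hypothetical history: d merges b and c, both children of root a; e is an unrelated root.
parents = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"], "e": []}
generation = {"a": 1, "b": 2, "c": 2, "d": 3, "e": 1}
```

The generation check only skips work; with or without it the answer is the same, which is why such a cutoff can be added to an existing walk without changing results.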

Re: [PATCH 0/6] Compute and consume generation numbers

2018-04-23 Thread Derrick Stolee
On 4/21/2018 4:44 PM, Jakub Narebski wrote: Jakub Narebski <jna...@gmail.com> writes: Derrick Stolee <sto...@gmail.com> writes: On 4/11/2018 3:32 PM, Jakub Narebski wrote: What would you suggest as a good test that could imply performance? The Google Colab notebook linked to ab

Re: [PATCH v3 0/9] Compute and consume generation numbers

2018-04-23 Thread Derrick Stolee
On 4/18/2018 8:04 PM, Jakub Narebski wrote: Derrick Stolee <dsto...@microsoft.com> writes: -- >8 -- This is one of several "small" patches that follow the serialized Git commit graph patch (ds/commit-graph) and lazy-loading trees (ds/lazy-load-trees). As describ

Re: [PATCH v3 7/9] commit: add short-circuit to paint_down_to_common()

2018-04-24 Thread Derrick Stolee
On 4/23/2018 5:38 PM, Jakub Narebski wrote: Derrick Stolee <sto...@gmail.com> writes: On 4/18/2018 7:19 PM, Jakub Narebski wrote: Derrick Stolee <dsto...@microsoft.com> writes: [...] [...], and this saves time during 'git branch --contains' queries that would otherwise

[RFC PATCH 02/12] commit-graph: add 'check' subcommand

2018-04-17 Thread Derrick Stolee
documentation. Add a simple test that ensures the command returns a zero error code. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.txt | 7 +- builtin/commit-graph.c | 38 ++ commit-graph.c

[RFC PATCH 08/12] commit-graph: verify commit contents against odb

2018-04-17 Thread Derrick Stolee
calculation is correct for all commits in the commit-graph file. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.c | 82 ++ 1 file changed, 82 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index 80a2

[RFC PATCH 09/12] fsck: check commit-graph

2018-04-17 Thread Derrick Stolee
If a commit-graph file exists, check its contents during 'git fsck'. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- builtin/fsck.c | 13 + 1 file changed, 13 insertions(+) diff --git a/builtin/fsck.c b/builtin/fsck.c index ef78c6c00c..9712f230ba 100644 --- a/b

[RFC PATCH 00/12] Integrate commit-graph into 'fsck' and 'gc'

2018-04-17 Thread Derrick Stolee
This RFC is based on v3 of ds/generation-numbers, and the first commit is a fixup! based on a bug in that version that I caught while prepping this series. Thanks, -Stolee Derrick Stolee (12): fixup! commit-graph: always load commit-graph information commit-graph: add 'check' subcommand co

[RFC PATCH 03/12] commit-graph: check file header information

2018-04-17 Thread Derrick Stolee
During a run of 'git commit-graph check', list the issues with the header information in the commit-graph file. Some of this information is inferred from the loaded 'struct commit_graph'. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.

[RFC PATCH 11/12] gc: automatically write commit-graph files

2018-04-17 Thread Derrick Stolee
-trivial 'git gc' command. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-gc.txt | 4 builtin/gc.c | 8 2 files changed, 12 insertions(+) diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt index 571b5a7e3c..17dd654a59

[RFC PATCH 01/12] fixup! commit-graph: always load commit-graph information

2018-04-17 Thread Derrick Stolee
--- commit-graph.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/commit-graph.c b/commit-graph.c index 21e853c21a..3f0c142603 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -304,7 +304,7 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g,

[RFC PATCH 06/12] commit: force commit to parse from object database

2018-04-17 Thread Derrick Stolee
is explicit in avoiding commits from the commit-graph file. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit.c | 14 ++ commit.h | 1 + 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/commit.c b/commit.c index 9ef6f699bd..07752d8503

[RFC PATCH 05/12] commit-graph: check fanout and lookup table

2018-04-17 Thread Derrick Stolee
that file in advance. We perform this parse now to ensure the object cache contains only commits from this commit-graph file. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.c | 34 ++ 1 file changed, 34 insertions(+) diff --git a/commit-g

[RFC PATCH 12/12] commit-graph: update design document

2018-04-17 Thread Derrick Stolee
The commit-graph feature is now integrated with 'fsck' and 'gc', so remove those items from the "Future Work" section of the commit-graph design document. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/technical/commit-graph.txt | 9 - 1 file chan

[RFC PATCH 10/12] commit-graph: add '--reachable' option

2018-04-17 Thread Derrick Stolee
' after performing cleanup of the object database. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.txt | 8 -- builtin/commit-graph.c | 41 +++--- t/t5318-commit-graph.sh| 10 3 files chang

[RFC PATCH 07/12] commit-graph: load a root tree from specific graph

2018-04-17 Thread Derrick Stolee
When lazy-loading a tree for a commit, it will be important to select the tree from a specific struct commit_graph. Create a new method that specifies the commit-graph file and use that in get_commit_tree_in_graph(). Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.

[RFC PATCH 04/12] commit-graph: parse commit from chosen graph

2018-04-17 Thread Derrick Stolee
Before checking a commit-graph file against the object database, we need to parse all commits from the given commit-graph file. Create parse_commit_in_graph_one() to target a given struct commit_graph. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.

Re: [PATCH v3 8/9] commit-graph: always load commit-graph information

2018-04-17 Thread Derrick Stolee
On 4/17/2018 1:00 PM, Derrick Stolee wrote: Most code paths load commits using lookup_commit() and then parse_commit(). In some cases, including some branch lookups, the commit is parsed using parse_object_buffer() which side-steps parse_commit() in favor of parse_commit_buffer

Re: [RFC PATCH 00/12] Integrate commit-graph into 'fsck' and 'gc'

2018-04-17 Thread Derrick Stolee
On 4/17/2018 2:10 PM, Derrick Stolee wrote: The commit-graph feature is not useful to end users until the commit-graph file is maintained automatically by Git during normal upkeep operations. One natural place to trigger this write is during 'git gc'. Before automatically generating a commit

Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Derrick Stolee
On 4/19/2018 2:41 PM, Stefan Beller wrote: On Thu, Apr 19, 2018 at 11:35 AM, Elijah Newren wrote: On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren wrote: This series is a reboot of the directory rename detection series that was merged to master and then

Re: [RFC 0/1] Tolerate broken headers in `packed-refs` files

2018-03-26 Thread Derrick Stolee
On 3/26/2018 8:42 AM, Michael Haggerty wrote: [...] But there might be some tools out in the wild that have been writing broken headers. In that case, users who upgrade Git might suddenly find that they can't read repositories that they could read before. In fact, a tool that we wrote and use

Re: [PATCH 4/3] sha1_name: use bsearch_pack() in unique_in_pack()

2018-03-25 Thread Derrick Stolee
cement as done by patch 3. Speed is less of a concern here -- at least I don't know a commonly used command that needs to resolve lots of short hashes. Thanks, René! Good teamwork on this patch series. Reviewed-by: Derrick Stolee <dsto...@microsoft.com>

Re: [PATCH] unpack-trees: release oid_array after use in check_updates()

2018-03-25 Thread Derrick Stolee
On 3/25/2018 12:31 PM, René Scharfe wrote: Signed-off-by: Rene Scharfe --- That leak was introduced by c0c578b33c (unpack-trees: batch fetching of missing blobs). unpack-trees.c | 1 + 1 file changed, 1 insertion(+) diff --git a/unpack-trees.c b/unpack-trees.c index

Re: [ANNOUNCE] Git v2.17.0-rc1

2018-03-25 Thread Derrick Stolee
On 3/25/2018 2:42 PM, Ævar Arnfjörð Bjarmason wrote: On Sun, Mar 25 2018, Derrick Stolee wrote: On 3/23/2018 1:59 PM, Ævar Arnfjörð Bjarmason wrote: On Wed, Mar 21 2018, Junio C. Hamano wrote: A release candidate Git v2.17.0-rc1 is now available for testing at the usual places

Re: [ANNOUNCE] Git v2.17.0-rc1

2018-03-25 Thread Derrick Stolee
On 3/23/2018 1:59 PM, Ævar Arnfjörð Bjarmason wrote: On Wed, Mar 21 2018, Junio C. Hamano wrote: A release candidate Git v2.17.0-rc1 is now available for testing at the usual places. It is comprised of 493 non-merge commits since v2.16.0, contributed by 62 people, 19 of which are new faces.

Re: [PATCH v4 00/13] Serialized Git Commit Graph

2018-04-02 Thread Derrick Stolee
/git/20180314192736.70602-1-dsto...@microsoft.com/T/#u Derrick Stolee <sto...@gmail.com> writes: As promised [1], this patch contains a way to serialize the commit graph. The current implementation defines a new file format to store the graph structure (parent relationships) and basic

Re: [PATCH v4 01/13] commit-graph: add format document

2018-04-02 Thread Derrick Stolee
On 3/30/2018 9:25 AM, Jakub Narebski wrote: Derrick Stolee <sto...@gmail.com> writes: +== graph-*.graph files have the following format: What is this '*' here? No longer necessary. It used to be a placeholder for a hash value, but now the graph is stored in objects/info/commit

Re: [PATCH v4 00/13] Serialized Git Commit Graph

2018-04-02 Thread Derrick Stolee
On 4/2/2018 10:46 AM, Jakub Narebski wrote: Derrick Stolee <sto...@gmail.com> writes: [...] At one point, I was investigating these reachability indexes (I read "SCARAB: Scaling Reachability Computation on Large Graphs" by Jihn, Ruan, Dey, and Xu [2]) but find the question th

Re: [PATCH v4 00/13] Serialized Git Commit Graph

2018-04-02 Thread Derrick Stolee
On 4/2/2018 1:35 PM, Stefan Beller wrote: On Mon, Apr 2, 2018 at 8:02 AM, Derrick Stolee <sto...@gmail.com> wrote: I would be happy to review any effort to extend the commit-graph format to include such indexes, as long as the performance benefits outweigh the complexity to create the

[PATCH v7 13/14] commit-graph: build graph from starting commits

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Teach git-commit-graph to read commits from stdin when the --stdin-commits flag is specified. Commits reachable from these commits are added to the graph. This is a much faster way to construct the graph than inspecting all packed o

[PATCH v7 11/14] commit: integrate commit graph with commit parsing

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Teach Git to inspect a commit graph file to supply the contents of a struct commit when calling parse_commit_gently(). This implementation satisfies all post-conditions on the struct commit, including loading parents, the root tree, and the commi

[PATCH v7 01/14] csum-file: rename hashclose() to finalize_hashfile()

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> The hashclose() method behaves very differently depending on the flags parameter. In particular, the file descriptor is not always closed. Perform a simple rename of "hashclose()" to "finalize_hashfile()" in preparation for f

[PATCH v7 03/14] commit-graph: add format document

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Add document specifying the binary format for commit graphs. This format allows for: * New versions. * New hash functions and hash lengths. * Optional extensions. Basic header information is followed by a binary table of contents into

[PATCH v7 12/14] commit-graph: read only from specific pack-indexes

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Teach git-commit-graph to inspect the objects only in a certain list of pack-indexes within the given pack directory. This allows updating the commit graph iteratively. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Docum

[PATCH v7 02/14] csum-file: refactor finalize_hashfile() method

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> If we want to use a hashfile on the temporary file for a lockfile, then we need finalize_hashfile() to fully write the trailing hash but also keep the file descriptor open. Do this by adding a new CSUM_HASH_IN_STREAM flag along with a func

[PATCH v7 14/14] commit-graph: implement "--additive" option

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Teach git-commit-graph to add all commits from the existing commit-graph file to the file about to be written. This should be used when adding new commits without performing garbage collection. Signed-off-by: Derrick Stolee <dsto...@micr

[PATCH v7 09/14] commit-graph: add core.commitGraph setting

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> The commit graph feature is controlled by the new core.commitGraph config setting. This defaults to 0, so the feature is opt-in. The intention of core.commitGraph is that a user can always stop checking for or parsing commit graph

[PATCH v7 00/14] Serialized Git Commit Graph

2018-04-02 Thread Derrick Stolee
est containing the latest version of this patch. Derrick Stolee (14): csum-file: rename hashclose() to finalize_hashfile() csum-file: refactor finalize_hashfile() method commit-graph: add format document graph: add commit graph design document commit-graph: create git-commit-graph

[PATCH v7 06/14] commit-graph: implement write_commit_graph()

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Teach Git to write a commit graph file by checking all packed objects to see if they are commits, then store the file in the given object directory. Helped-by: Jeff King <p...@peff.net> Signed-off-by: Derrick Stolee <dsto..

[PATCH v7 04/14] graph: add commit graph design document

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Add Documentation/technical/commit-graph.txt with details of the planned commit graph feature, including future plans. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/technical/commit-graph.txt | 163 ++

[PATCH v7 10/14] commit-graph: close under reachability

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Teach write_commit_graph() to walk all parents from the commits discovered in packfiles. This prevents gaps given by loose objects or previously-missed packfiles. Also automatically add commits from the existing graph file, if it exists. Sign

[PATCH v7 08/14] commit-graph: implement git commit-graph read

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Teach git-commit-graph to read commit graph files and summarize their contents. Use the read subcommand to verify the contents of a commit graph file in the tests. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentatio

[PATCH v7 05/14] commit-graph: create git-commit-graph builtin

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Teach git the 'commit-graph' builtin that will be used for writing and reading packed graph files. The current implementation is mostly empty, except for an '--object-dir' option. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> ---

[PATCH v7 07/14] commit-graph: implement git-commit-graph write

2018-04-02 Thread Derrick Stolee
From: Derrick Stolee <dsto...@microsoft.com> Teach git-commit-graph to write graph files. Create new test script to verify this command succeeds without failure. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.txt | 41 ++ builtin/co

Re: [PATCH 0/3] Lazy-load trees when reading commit-graph

2018-04-03 Thread Derrick Stolee
On 4/3/2018 9:06 AM, Jeff King wrote: On Tue, Apr 03, 2018 at 08:00:54AM -0400, Derrick Stolee wrote: There are several commit-graph walks that require loading many commits but never walk the trees reachable from those commits. However, the current logic in parse_commit() requires the root

Re: [PATCH 0/6] Compute and consume generation numbers

2018-04-03 Thread Derrick Stolee
On 4/3/2018 12:51 PM, Derrick Stolee wrote: This is the first of several "small" patches that follow the serialized Git commit graph patch (ds/commit-graph). As described in Documentation/technical/commit-graph.txt, the generation number of a commit is one more than the maximum

[PATCH 1/6] object.c: parse commit in graph first

2018-04-03 Thread Derrick Stolee
need to ensure that any commit that exists in the graph is loaded from the graph, so check parse_commit_in_graph() before calling parse_commit_buffer(). Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- object.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/ob

[PATCH 4/6] commit: use generations in paint_down_to_common()

2018-04-03 Thread Derrick Stolee
this extra effort, even if it is somewhat rare. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit.c | 19 ++- commit.h | 1 + 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/commit.c b/commit.c index 3e39c86abf..95ae7e13a3 100644 --- a/commit.

[PATCH 5/6] commit.c: use generation to halt paint walk

2018-04-03 Thread Derrick Stolee
the minimum generation number of a commit that enters the queue with nonstale status. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit.c | 37 ++--- 1 file changed, 30 insertions(+), 7 deletions(-) diff --git a/commit.c b/commit.c index 95ae

[PATCH 6/6] commit-graph.txt: update future work

2018-04-03 Thread Derrick Stolee
We now calculate generation numbers in the commit-graph file and use them in paint_down_to_common(). Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/technical/commit-graph.txt | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/Documen

[PATCH 0/6] Compute and consume generation numbers

2018-04-03 Thread Derrick Stolee
a starting point. A more substantial refactoring of revision.c is required before making 'git log --graph' use generation numbers effectively. This patch series depends on v7 of ds/commit-graph. Derrick Stolee (6): object.c: parse commit in graph first commit: add generation number to stru

[PATCH 3/6] commit-graph: compute generation numbers

2018-04-03 Thread Derrick Stolee
commits to the commit-graph file. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- commit-graph.c | 46 ++ commit.h | 1 + 2 files changed, 47 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index d24b947525..b80c8ad80e

[PATCH 2/6] commit: add generation number to struct commmit

2018-04-03 Thread Derrick Stolee
. The second (_NONE) means the generation number was loaded from a commit graph file that was stored before generation numbers were computed. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- alloc.c| 1 + commit-graph.c | 2 ++ commit.h | 3 +++ 3 files changed, 6 insertions(+)

Re: [PATCH 3/3] commit-graph: lazy-load trees

2018-04-03 Thread Derrick Stolee
On 4/3/2018 2:00 PM, Stefan Beller wrote: On Tue, Apr 3, 2018 at 5:00 AM, Derrick Stolee <dsto...@microsoft.com> wrote: The commit-graph file provides quick access to commit data, including the OID of the root tree for each commit in the graph. When performing a deep commit-graph walk,

Re: [PATCH 0/6] Compute and consume generation numbers

2018-04-03 Thread Derrick Stolee
On 4/3/2018 2:03 PM, Brandon Williams wrote: On 04/03, Derrick Stolee wrote: This is the first of several "small" patches that follow the serialized Git commit graph patch (ds/commit-graph). As described in Documentation/technical/commit-graph.txt, the generation number of a commit i

Re: [PATCH 2/6] commit: add generation number to struct commmit

2018-04-03 Thread Derrick Stolee
On 4/3/2018 2:28 PM, Jeff King wrote: On Tue, Apr 03, 2018 at 11:05:36AM -0700, Brandon Williams wrote: On 04/03, Derrick Stolee wrote: The generation number of a commit is defined recursively as follows: * If a commit A has no parents, then the generation number of A is one. * If a commit
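
The recursive definition quoted in this snippet is cut off after its first rule. As a rough sketch, assuming the second rule is "one more than the maximum generation number among the commit's parents" (which is how Git's commit-graph design document states it), the numbers could be computed as follows. The `parents` dictionary and the function name `compute_generations` are hypothetical and are not taken from the patch.

```python
def compute_generations(parents):
    """Map each commit id to its generation number.

    parents: dict mapping commit id -> list of parent commit ids
             (a hypothetical stand-in for Git's commit data).
    A commit with no parents gets generation 1; any other commit gets
    one more than the largest generation among its parents.
    """
    generation = {}
    for tip in parents:
        stack = [tip]
        while stack:
            commit = stack[-1]
            if commit in generation:
                stack.pop()
                continue
            pending = [p for p in parents[commit] if p not in generation]
            if pending:
                # Parents must be numbered first; this commit is revisited later.
                stack.extend(pending)
                continue
            generation[commit] = 1 + max(
                (generation[p] for p in parents[commit]), default=0
            )
            stack.pop()
    return generation
```

An explicit stack is used instead of recursion so that a long linear history does not hit Python's recursion limit.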

Re: [PATCH 1/6] object.c: parse commit in graph first

2018-04-03 Thread Derrick Stolee
On 4/3/2018 2:28 PM, Jeff King wrote: On Tue, Apr 03, 2018 at 11:21:36AM -0700, Jonathan Tan wrote: On Tue, 3 Apr 2018 12:51:38 -0400 Derrick Stolee <dsto...@microsoft.com> wrote: Most code paths load commits using lookup_commit() and then parse_commit(). In some cases, includin

Re: [PATCH v6 00/14] Serialized Git Commit Graph

2018-03-19 Thread Derrick Stolee
On 3/16/2018 12:28 PM, Lars Schneider wrote: On 14 Mar 2018, at 21:43, Junio C Hamano <gits...@pobox.com> wrote: Derrick Stolee <sto...@gmail.com> writes: Hopefully this version is ready to merge. I have several follow-up topics in mind to submit soon after, including: A few

Re: [PATCH v6 12/14] commit-graph: read only from specific pack-indexes

2018-03-19 Thread Derrick Stolee
On 3/15/2018 6:50 PM, SZEDER Gábor wrote: On Wed, Mar 14, 2018 at 8:27 PM, Derrick Stolee <sto...@gmail.com> wrote: From: Derrick Stolee <dsto...@microsoft.com> Teach git-commit-graph to inspect the objects only in a certain list of pack-indexes within the given pack directory.

Re: [PATCH v6 07/14] commit-graph: implement 'git-commit-graph write'

2018-03-19 Thread Derrick Stolee
On 3/18/2018 9:25 AM, Ævar Arnfjörð Bjarmason wrote: On Wed, Mar 14 2018, Derrick Stolee jotted: +'git commit-graph write' [--object-dir ] + + +DESCRIPTION +--- + +Manage the serialized commit graph file. + + +OPTIONS +--- +--object-dir:: + Use given directory

[PATCH] sha1_name: use bsearch_hash() for abbreviations

2018-03-20 Thread Derrick Stolee
log --oneline --parents Before: 7.85s After: 7.29s Rel %: -7.1% Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- packfile.c | 23 +++ packfile.h | 8 sha1_name.c | 24 3 files changed, 35 insertions(+), 20 deletions

[PATCH v2 2/3] packfile: define and use bsearch_pack()

2018-03-22 Thread Derrick Stolee
to a new method, bsearch_pack(), so this can be re-used in other code paths. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- packfile.c | 42 ++ packfile.h | 8 2 files changed, 34 insertions(+), 16 deletions(-) diff --git a/packfi

[PATCH v2 1/3] sha1_name: convert struct min_abbrev_data to object_id

2018-03-22 Thread Derrick Stolee
From: "brian m. carlson" This structure is only written to in one place, where we already have a struct object_id. Convert the struct to use a struct object_id instead. Signed-off-by: brian m. carlson --- sha1_name.c | 6 +++--- 1

[PATCH v2 3/3] sha1_name: use bsearch_pack() for abbreviations

2018-03-22 Thread Derrick Stolee
--oneline --parents --raw Before: 59.2s After: 56.9s Rel %: -3.8% * git log --oneline --parents Before: 6.48s After: 5.91s Rel %: -8.9% Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 24 1 file changed, 4 insertions(+), 20 del

[PATCH v2 0/3] Use bsearch_hash() for abbreviations

2018-03-22 Thread Derrick Stolee
Thanks to Jonathan and Brian for the help with the proper way to handle OIDs and existing callers to bsearch_hash(). This patch includes one commit that Brian sent in the previous discussion (included again here for completeness). Derrick Stolee (2): packfile: define and use bsearch_pack

Re: [PATCH v6 00/14] Serialized Git Commit Graph

2018-03-19 Thread Derrick Stolee
On 3/16/2018 4:19 PM, Jeff King wrote: On Fri, Mar 16, 2018 at 04:06:39PM -0400, Jeff King wrote: Furthermore, in order to look at an object it has to be zlib inflated first, and since commit objects tend to be much smaller than trees and especially blobs, there are a lot less bytes to

Re: [PATCH v6 07/14] commit-graph: implement 'git-commit-graph write'

2018-03-19 Thread Derrick Stolee
On 3/19/2018 10:36 AM, Ævar Arnfjörð Bjarmason wrote: On Mon, Mar 19 2018, Derrick Stolee jotted: On 3/18/2018 9:25 AM, Ævar Arnfjörð Bjarmason wrote: On Wed, Mar 14 2018, Derrick Stolee jotted: +'git commit-graph write' [--object-dir ] + + +DESCRIPTION +--- + +Manage

Re: What's cooking in git.git (Mar 2018, #03; Wed, 14)

2018-03-19 Thread Derrick Stolee
On 3/15/2018 4:36 AM, Ævar Arnfjörð Bjarmason wrote: On Thu, Mar 15 2018, Junio C. Hamano jotted: * nd/repack-keep-pack (2018-03-07) 6 commits - SQUASH??? - pack-objects: display progress in get_object_details() - pack-objects: show some progress when counting kept objects - gc --auto:

Re: [PATCH] sha1_name: use bsearch_hash() for abbreviations

2018-03-21 Thread Derrick Stolee
On 3/20/2018 6:25 PM, Jonathan Tan wrote: On Tue, 20 Mar 2018 16:03:25 -0400 Derrick Stolee <dsto...@microsoft.com> wrote: This patch updates the abbreviation code to use bsearch_hash() as defined in [1]. It gets a nice speedup since the old implementation did not use the fanout table
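
The speedup discussed in this thread is attributed to the fanout table. A minimal sketch of a fanout-narrowed binary search is shown here, assuming the layout used by pack index files: 256 entries, where entry `b` holds the number of hashes whose first byte is at most `b`. This is not Git's `bsearch_hash()`; the names `build_fanout` and `find_hash` are hypothetical.

```python
import bisect


def build_fanout(sorted_hashes):
    """Return a 256-entry table: fanout[b] = count of hashes with first byte <= b."""
    fanout = [0] * 256
    for h in sorted_hashes:
        fanout[h[0]] += 1
    for b in range(1, 256):
        fanout[b] += fanout[b - 1]
    return fanout


def find_hash(sorted_hashes, fanout, needle):
    """Return the index of `needle` in `sorted_hashes`, or -1 if it is absent.

    sorted_hashes: list of bytes objects in ascending order.
    """
    first = needle[0]
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    # Binary search only within the slice that shares the needle's first byte.
    pos = bisect.bisect_left(sorted_hashes, needle, lo, hi)
    if pos < hi and sorted_hashes[pos] == needle:
        return pos
    return -1
```

With hashes spread evenly over the first byte, the table cuts the search range by a factor of about 256 before any comparison is made, which saves roughly eight binary-search steps per lookup.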

Re: [PATCH v6 00/14] Serialized Git Commit Graph

2018-03-19 Thread Derrick Stolee
On 3/19/2018 8:55 AM, Derrick Stolee wrote: Thanks for this! Fixing this performance problem is very important to me, as we will use the "--stdin-packs" mechanism in the GVFS scenario (we will walk all commits in the prefetch packs full of commits and trees instead of relyi

Re: How to debug a "git merge"?

2018-03-19 Thread Derrick Stolee
On 3/14/2018 4:53 PM, Lars Schneider wrote: On 14 Mar 2018, at 18:02, Derrick Stolee <sto...@gmail.com> wrote: On 3/14/2018 12:56 PM, Lars Schneider wrote: Hi, I am investigating a Git merge (a86dd40fe) in which an older version of a file won over the newer version. I try to understa

Re: Contributor Summit planning

2018-03-05 Thread Derrick Stolee
On 3/3/2018 5:39 AM, Jeff King wrote: On Sat, Mar 03, 2018 at 05:30:10AM -0500, Jeff King wrote: As in past years, I plan to run it like an unconference. Attendees are expected to bring topics for group discussion. Short presentations are also welcome. We'll put the topics on a whiteboard in

Re: [RFC] Contributing to Git (on Windows)

2018-03-05 Thread Derrick Stolee
I really appreciate the feedback on this document, Jonathan. On 3/3/2018 1:27 PM, Jonathan Nieder wrote: Hi Dscho, Johannes Schindelin wrote: Jonathan Nieder writes: Dereck Stolee wrote: nit: s/Dereck/Derrick/ Is my outgoing email name misspelled, or do you have a

[RFC] Contributing to Git (on Windows)

2018-03-01 Thread Derrick Stolee
We (Git devs at Microsoft) have had several people start contributing to Git over the past few years (I'm the most-recent addition). As we on-boarded to Git development on our Windows machines, we collected our setup steps on an internal wiki page. Now, we'd like to make that document

[PATCH v5 13/13] commit-graph: implement "--additive" option

2018-02-26 Thread Derrick Stolee
Teach git-commit-graph to add all commits from the existing commit-graph file to the file about to be written. This should be used when adding new commits without performing garbage collection. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.tx

[PATCH v5 12/13] commit-graph: build graph from starting commits

2018-02-26 Thread Derrick Stolee
, 700,000+ commits were added to the graph file starting from 'master' in 7-9 seconds, depending on the number of packfiles in the repo (1, 24, or 120). Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.txt | 14 +- builtin/commit-g

[PATCH v5 03/13] commit-graph: create git-commit-graph builtin

2018-02-26 Thread Derrick Stolee
Thanks for the help in getting all the details right in setting up a builtin. -- >8 -- Teach git the 'commit-graph' builtin that will be used for writing and reading packed graph files. The current implementation is mostly empty, except for an '--object-dir' option. Signed-off-by: Derr

[PATCH v5 02/13] graph: add commit graph design document

2018-02-26 Thread Derrick Stolee
Add Documentation/technical/commit-graph.txt with details of the planned commit graph feature, including future plans. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/technical/commit-graph.txt | 164 +++ 1 file changed, 164 inse

[PATCH v5 10/13] commit: integrate commit graph with commit parsing

2018-02-26 Thread Derrick Stolee
| 1.35s | 0.32s | -76% | | rev-list --all | 6.7s | 0.83s | -87% | | rev-list --all --objects | 33.0s | 27.5s | -16% | Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- alloc.c | 1 + commit-graph.c

[PATCH v5 06/13] commit-graph: implement 'git-commit-graph write'

2018-02-26 Thread Derrick Stolee
Teach git-commit-graph to write graph files. Create new test script to verify this command succeeds without failure. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.txt | 39 builtin/commit-graph.c | 33 ++ t

[PATCH v5 00/13] Serialized Git Commit Graph

2018-02-26 Thread Derrick Stolee
ting the ODB. You can run your own performance comparisons by toggling the 'core.commitGraph' setting. [1] https://github.com/derrickstolee/git/pull/2 A GitHub pull request containing the latest version of this patch. Derrick Stolee (13): commit-graph: add format document graph: add commit gr

[PATCH v5 05/13] commit-graph: implement write_commit_graph()

2018-02-26 Thread Derrick Stolee
Teach Git to write a commit graph file by checking all packed objects to see if they are commits, then store the file in the given object directory. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Makefile | 1 + commit-graph.c

[PATCH v5 04/13] csum-file: add CSUM_KEEP_OPEN flag

2018-02-26 Thread Derrick Stolee
it in the commit that follows. -- >8 -- If we want to use a hashfile on the temporary file for a lockfile, then we need hashclose() to fully write the trailing hash but also keep the file descriptor open. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- csum-file.c | 10 +++

[PATCH v5 01/13] commit-graph: add format document

2018-02-26 Thread Derrick Stolee
r commit would cause an extra level of indirection for every merge commit. (Octopus merges suffer from this indirection, but they are very rare.) Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/technical/commit-graph-format.txt | 98 + 1 file changed

[PATCH v5 11/13] commit-graph: read only from specific pack-indexes

2018-02-26 Thread Derrick Stolee
Teach git-commit-graph to inspect the objects only in a certain list of pack-indexes within the given pack directory. This allows updating the commit graph iteratively. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.txt | 11 ++- builtin/

[PATCH v5 07/13] commit-graph: implement git commit-graph read

2018-02-26 Thread Derrick Stolee
Teach git-commit-graph to read commit graph files and summarize their contents. Use the read subcommand to verify the contents of a commit graph file in the tests. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- Documentation/git-commit-graph.txt | 12 builtin/commit-g

[PATCH v5 08/13] commit-graph: add core.commitGraph setting

2018-02-26 Thread Derrick Stolee
The commit graph feature is controlled by the new core.commitGraph config setting. This defaults to 0, so the feature is opt-in. The intention of core.commitGraph is that a user can always stop checking for or parsing commit graph files if core.commitGraph=0. Signed-off-by: Derrick Stolee <d

[PATCH v5 09/13] commit-graph: close under reachability

2018-02-26 Thread Derrick Stolee
Teach write_commit_graph() to walk all parents from the commits discovered in packfiles. This prevents gaps given by loose objects or previously-missed packfiles. Also automatically add commits from the existing graph file, if it exists. Signed-off-by: Derrick Stolee <dsto...@microsoft.
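The idea of closing a commit set under reachability can be sketched roughly as follows. This is a toy illustration, not the code from the patch: `struct toy_commit`, `close_under_reachability()`, and the fixed-size index table are all hypothetical names invented here, and commits are plain array indexes rather than object IDs.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_COMMITS 16
#define MAX_PARENTS 2
#define NO_PARENT   (-1)

/* Hypothetical toy commit: parents are indexes into a fixed table. */
struct toy_commit {
	int parents[MAX_PARENTS];
};

/*
 * Starting from the seed commits (e.g. those found in packfiles),
 * walk all parents and mark every commit that is reached.
 * Returns the number of commits marked in in_graph[].
 */
static int close_under_reachability(const struct toy_commit *commits,
				    int nr_commits,
				    const int *seeds, int nr_seeds,
				    unsigned char *in_graph)
{
	int stack[MAX_COMMITS];
	int top = 0, count = 0, i;

	assert(nr_commits <= MAX_COMMITS);
	for (i = 0; i < nr_commits; i++)
		in_graph[i] = 0;

	/* Each commit is marked when pushed, so it is pushed at most once. */
	for (i = 0; i < nr_seeds; i++) {
		if (!in_graph[seeds[i]]) {
			in_graph[seeds[i]] = 1;
			stack[top++] = seeds[i];
		}
	}

	while (top > 0) {
		int c = stack[--top];

		count++;
		for (i = 0; i < MAX_PARENTS; i++) {
			int p = commits[c].parents[i];

			if (p != NO_PARENT && !in_graph[p]) {
				in_graph[p] = 1;
				stack[top++] = p;
			}
		}
	}
	return count;
}
```

In this sketch a commit that exists only as a loose object would still be picked up as long as some seed commit reaches it through its parents, which is the kind of gap the message describes.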

[PATCH] sha1_name: fix uninitialized memory errors

2018-02-26 Thread Derrick Stolee
. Then nth_packed_object_oid() does not initialize "oid". Use the return value of nth_packed_object_oid() to prevent these errors. Reported-by: Christian Couder <christian.cou...@gmail.com> Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 11 +++ 1 file
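The general pattern of using a lookup's return value before reading its output buffer might look like the sketch below. Everything here is a hypothetical stand-in: `toy_nth_oid()`, `struct toy_pack`, and `toy_first_byte()` are invented for illustration and are not Git's actual functions or the change made in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_oid {
	unsigned char hash[20];
};

struct toy_pack {
	const struct toy_oid *oids;
	size_t nr;
};

/*
 * Hypothetical stand-in for an "nth object" lookup: fills *out and
 * returns it on success, or returns NULL and leaves *out untouched
 * when n is out of range.
 */
static const struct toy_oid *toy_nth_oid(struct toy_oid *out,
					 const struct toy_pack *p, size_t n)
{
	if (n >= p->nr)
		return NULL;
	memcpy(out, &p->oids[n], sizeof(*out));
	return out;
}

/*
 * Returns the first byte of the nth object ID, or -1 if the lookup
 * failed. The return value is checked so that the uninitialized
 * local "oid" is never read.
 */
static int toy_first_byte(const struct toy_pack *p, size_t n)
{
	struct toy_oid oid;

	assert(p != NULL);
	if (!toy_nth_oid(&oid, p, n))
		return -1;
	return oid.hash[0];
}
```

Without the check, a caller asking for an index past the end of the table would read whatever happened to be on the stack, which is the kind of access valgrind flags as a use of an uninitialised value.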

[PATCH v2] sha1_name: fix uninitialized memory errors

2018-02-27 Thread Derrick Stolee
<christian.cou...@gmail.com> Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index 611c7d24dd..a041d8d24f 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -547,

Re: Use of uninitialised value of size 8 in sha1_name.c

2018-02-26 Thread Derrick Stolee
On 2/26/2018 5:23 AM, Christian Couder wrote: On Mon, Feb 26, 2018 at 10:53 AM, Jeff King wrote: On Mon, Feb 26, 2018 at 10:04:22AM +0100, Christian Couder wrote: ==21455== Use of uninitialised value of size 8 ==21455==at 0x2D2A73: get_hex_char_from_oid (sha1_name.c:492)

Re: [PATCH] commit-graph: fix some "plain integer as NULL pointer" warnings

2018-02-26 Thread Derrick Stolee
On 2/24/2018 12:42 AM, René Scharfe wrote: Am 24.02.2018 um 03:24 schrieb Ramsay Jones: diff --git a/commit-graph.c b/commit-graph.c index fc5ee7e99..c2f443436 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -45,7 +45,7 @@ char *get_graph_latest_filename(const char *obj_dir) {

Re: [PATCH] revision.c: reduce object database queries

2018-02-28 Thread Derrick Stolee
On 2/28/2018 1:37 AM, Jeff King wrote: On Tue, Feb 27, 2018 at 03:16:58PM -0800, Junio C Hamano wrote: This code comes originally from 454fbbcde3 (git-rev-list: allow missing objects when the parent is marked UNINTERESTING, 2005-07-10). But later, in aeeae1b771 (revision traversal: allow

Re: [PATCH 03/11] packfile: allow install_packed_git to handle arbitrary repositories

2018-02-28 Thread Derrick Stolee
On 2/27/2018 8:06 PM, Stefan Beller wrote: -void install_packed_git(struct packed_git *pack) +void install_packed_git(struct repository *r, struct packed_git *pack) This is a good thing to do. I'm just making note that this will collide with the new instances of install_packed_git() that I

Re: [PATCH 00/11] Moving global state into the repository object (part 2)

2018-02-28 Thread Derrick Stolee
On 2/27/2018 9:15 PM, Duy Nguyen wrote: On Tue, Feb 27, 2018 at 05:05:57PM -0800, Stefan Beller wrote: This applies on top of origin/sb/object-store and is the continuation of that series, adding the repository as a context argument to functions. This series focusses on packfile handling,

Re: What's cooking in git.git (Mar 2018, #01; Thu, 1)

2018-03-02 Thread Derrick Stolee
On 3/1/2018 5:20 PM, Junio C Hamano wrote: -- [Graduated to "master"] * jt/binsearch-with-fanout (2018-02-15) 2 commits (merged to 'next' on 2018-02-15 at 7648891022) + packfile: refactor hash search with fanout table + packfile: remove

Re: [RFC] Contributing to Git (on Windows)

2018-03-02 Thread Derrick Stolee
On 3/1/2018 11:44 PM, Jonathan Nieder wrote: Hi, Derrick Stolee wrote: Now, we'd like to make that document publicly available. These steps are focused on a Windows user, so we propose putting them in the git-for-windows/git repo under CONTRIBUTING.md. I have a pull request open for feedback

Re: What's cooking in git.git (Apr 2018, #03; Wed, 25)

2018-04-26 Thread Derrick Stolee
On 4/25/2018 1:43 PM, Brandon Williams wrote: On 04/25, Ævar Arnfjörð Bjarmason wrote: * bw/protocol-v2 (2018-03-15) 35 commits (merged to 'next' on 2018-04-11 at 23ee234a2c) + remote-curl: don't request v2 when pushing + remote-curl: implement stateless-connect command + http:

Re: [PATCH v4 03/10] commit-graph: compute generation numbers

2018-04-26 Thread Derrick Stolee
On 4/25/2018 10:35 PM, Junio C Hamano wrote: Derrick Stolee <dsto...@microsoft.com> writes: @@ -439,6 +439,9 @@ static void write_graph_chunk_data(struct hashfile *f, int hash_len, else packedDate[0] = 0; + if ((*list)->g
