[PATCH v4 3/7] pack-objects: add delta-islands support

2018-08-11 Thread Christian Couder
From: Jeff King Implement support for delta islands in git pack-objects and document how delta islands work in "Documentation/git-pack-objects.txt" and Documentation/config.txt. This allows users to set up delta islands in their config and get the benefit of less disk usage while cloning and

[PATCH v4 1/7] Add delta-islands.{c,h}

2018-08-11 Thread Christian Couder
From: Jeff King Hosting providers that allow users to "fork" existing repos want those forks to share as much disk space as possible. Alternates are an existing solution to keep all the objects from all the forks into a unique central repo, but this can have some drawbacks. Especially when

[PATCH v4 6/7] pack-objects: move tree_depth into 'struct packing_data'

2018-08-11 Thread Christian Couder
This reduces the size of 'struct object_entry' and therefore makes packing objects more efficient. This also renames cmp_tree_depth() into tree_depth_compare(), as it is more modern to have the name of the compare functions end with "compare". Helped-by: Jeff King Helped-by: Duy Nguyen

[PATCH v4 7/7] pack-objects: move 'layer' into 'struct packing_data'

2018-08-11 Thread Christian Couder
This reduces the size of 'struct object_entry' from 88 bytes to 80 and therefore makes packing objects more efficient. For example, on a Linux repo with 12M objects, `git pack-objects --all` needs an extra 96MB of memory even if the layer feature is not used. Helped-by: Jeff King Helped-by: Duy Nguyen
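
The 96MB figure follows from the 8 bytes saved per entry: 12M objects × 8 bytes = 96MB. Below is a minimal sketch of the general technique, not the patch's code. The names `struct entry`, `struct table`, `table_layer()` and `table_set_layer()` are hypothetical. The rarely used field moves out of the per-object struct into a side array that is allocated only when the feature is first used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-object record; kept small because there are millions. */
struct entry {
	unsigned long size;
};

/* Hypothetical container owning all entries plus optional side data. */
struct table {
	struct entry *entries;
	size_t nr;
	unsigned char *layer; /* NULL until a layer is first set */
};

/* Read the layer of an entry; 0 if the feature was never used. */
static unsigned char table_layer(const struct table *t, const struct entry *e)
{
	assert(e >= t->entries && e < t->entries + t->nr);
	if (!t->layer)
		return 0;
	return t->layer[e - t->entries];
}

/* Set the layer, allocating the side array on first use. Returns 0 on success. */
static int table_set_layer(struct table *t, const struct entry *e,
			   unsigned char layer)
{
	assert(e >= t->entries && e < t->entries + t->nr);
	if (!t->layer) {
		t->layer = calloc(t->nr, sizeof(*t->layer));
		if (!t->layer)
			return -1;
	}
	t->layer[e - t->entries] = layer;
	return 0;
}
```

Callers that never set a layer pay only for one NULL pointer in the container, rather than for a field in every entry.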

[PATCH v4 5/7] t: add t5319-delta-islands.sh

2018-08-11 Thread Christian Couder
From: Jeff King Signed-off-by: Jeff King Signed-off-by: Christian Couder --- t/t5319-delta-islands.sh | 143 +++ 1 file changed, 143 insertions(+) create mode 100755 t/t5319-delta-islands.sh diff --git a/t/t5319-delta-islands.sh b/t/t5319-delta-islands.sh

[PATCH v4 4/7] repack: add delta-islands support

2018-08-11 Thread Christian Couder
From: Jeff King Implement simple support for --delta-islands option and repack.useDeltaIslands config variable in git repack. This allows users to set up delta islands in their config and get the benefit of less disk usage while cloning and fetching is still quite fast and not much more CPU

[PATCH v4 0/7] Add delta islands support

2018-08-11 Thread Christian Couder
This patch series is upstreaming work made by GitHub and available in: https://github.com/peff/git/commits/jk/delta-islands The above work has already been described in the following article: https://githubengineering.com/counting-objects/ The above branch contains only one patch. In this

[PATCH v4 2/7] pack-objects: refactor code into compute_layer_order()

2018-08-11 Thread Christian Couder
In a following commit, as we will use delta islands, we will have to compute the write order for different layers, not just for one. Let's prepare for that by refactoring the code that will be used to compute the write order for a given layer into a new compute_layer_order() function. This will

Re: [PATCH v3 2/8] Add delta-islands.{c,h}

2018-08-11 Thread Christian Couder
On Sat, Aug 11, 2018 at 4:12 PM, Jeff King wrote: > On Sat, Aug 11, 2018 at 12:32:32PM +0200, Christian Couder wrote: > >> Ok, I have made the following changes in the branch I will send next. >> >> diff --git a/delta-islands.c b/delta-islands.c >> index 92137f2eca..22e4360810 100644 >> ---

[PATCH v3] test_dir_is_empty: properly detect files with newline in name

2018-08-11 Thread William Chargin
While the `test_dir_is_empty` function appears correct in most normal use cases, it can fail when filenames contain newlines. This patch changes the implementation to check that the output of `ls -a` has at most two lines (for `.` and `..`), which should be better behaved. The newly added unit

Re: [PATCH 1/1] t/test-lib: make `test_dir_is_empty` more robust

2018-08-11 Thread William Chargin
> That will recurse any subdirectories, possibly wasting time, but since > the point is that we expect it to be empty, that's probably OK. One caveat involves invocations of `test_must_fail test_dir_is_empty`, wherein we _don't_ actually expect the directory to be empty. It looks like there might

Re: function get_delta_base() is a file-local symbol

2018-08-11 Thread Christian Couder
Hi Ramsay and Peff, On Sun, Aug 12, 2018 at 3:16 AM, Jeff King wrote: > On Sun, Aug 12, 2018 at 01:30:02AM +0100, Ramsay Jones wrote: > >> My static-check.pl script has pinged me about the get_delta_base() >> symbol from packfile.[co]. The first patch from your 'cc/delta-islands' >> branch

[ANNOUNCE] git-cinnabar 0.5.0

2018-08-11 Thread Mike Hommey
Hi, Git-cinnabar is a git remote helper to interact with mercurial repositories. It allows you to clone, pull and push from/to mercurial remote repositories, using git. Code on https://github.com/glandium/git-cinnabar This release on https://github.com/glandium/git-cinnabar/releases/tag/0.5.0

Re: function get_delta_base() is a file-local symbol

2018-08-11 Thread Jeff King
On Sun, Aug 12, 2018 at 01:30:02AM +0100, Ramsay Jones wrote: > Hi Christian, > > My static-check.pl script has pinged me about the get_delta_base() > symbol from packfile.[co]. The first patch from your 'cc/delta-islands' > branch exports this symbol, saying that it will soon be called from >

function get_delta_base() is a file-local symbol

2018-08-11 Thread Ramsay Jones
Hi Christian, My static-check.pl script has pinged me about the get_delta_base() symbol from packfile.[co]. The first patch from your 'cc/delta-islands' branch exports this symbol, saying that it will soon be called from outside packfile.c. As far as I can tell, no other patch in that series adds

[PATCH] rebase: fix a sparse 'plain integer as NULL pointer' warning

2018-08-11 Thread Ramsay Jones
Signed-off-by: Ramsay Jones --- Hi Pratik, If you need to re-roll your 'pk/rebase-in-c-4-opts' branch, could you please squash this into the relevant patch (commit b0721e7b48, "builtin rebase: support `-C` and `--whitespace=`", 2018-08-08). Thanks! ATB, Ramsay Jones builtin/rebase.c | 2

Re: [PATCHv2 3/6] Move definition of enum branch_track from cache.h to branch.h

2018-08-11 Thread Ramsay Jones
On 11/08/18 21:50, Elijah Newren wrote: > 'branch_track' feels more closely related to branching, and it is > needed later in branch.h; rather than #include'ing cache.h in branch.h > for this small enum, just move the enum and the external declaration > for git_branch_track to branch.h. > >

Re: [PATCH 2/2] fsck: use oidset for skiplist

2018-08-11 Thread René Scharfe
Am 11.08.2018 um 19:23 schrieb Jeff King: On Sat, Aug 11, 2018 at 01:02:48PM -0400, Jeff King wrote: - we could probably improve the speed of oidset. Two things I notice about its implementation: Before any optimizations, my best-of-five timing for: git cat-file

Re: [PATCH 1/2] fsck: use strbuf_getline() to read skiplist file

2018-08-11 Thread René Scharfe
Am 11.08.2018 um 18:48 schrieb Jeff King: And one I'm not sure about: - a read() error will now be quietly ignored; I guess we'd have to check ferror(fp) to cover this. I'm not sure if it matters. I'm not sure, either. It would catch media errors or file system corruption, right?

[PATCHv2 6/6] Add missing includes and forward declares

2018-08-11 Thread Elijah Newren
Signed-off-by: Elijah Newren --- bisect.h | 2 ++ pack-objects.h | 1 + 2 files changed, 3 insertions(+) diff --git a/bisect.h b/bisect.h index a5d9248a47..34df209351 100644 --- a/bisect.h +++ b/bisect.h @@ -1,6 +1,8 @@ #ifndef BISECT_H #define BISECT_H +struct commit_list; + /* *

[PATCHv2 4/6] urlmatch.h: fix include guard

2018-08-11 Thread Elijah Newren
Signed-off-by: Elijah Newren --- urlmatch.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/urlmatch.h b/urlmatch.h index 37ee5da85e..e482148248 100644 --- a/urlmatch.h +++ b/urlmatch.h @@ -1,4 +1,6 @@ #ifndef URL_MATCH_H +#define URL_MATCH_H + #include "string-list.h" struct url_info

[PATCHv2 2/6] alloc: make allocate_alloc_state and clear_alloc_state more consistent

2018-08-11 Thread Elijah Newren
Since both functions are using the same data type, they should either both refer to it as void *, or both use the real type (struct alloc_state *). Opt for the latter. Signed-off-by: Elijah Newren --- alloc.c | 2 +- alloc.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git

[PATCHv2 1/6] Add missing includes and forward declares

2018-08-11 Thread Elijah Newren
Signed-off-by: Elijah Newren --- alloc.h | 2 ++ apply.h | 3 +++ archive.h | 3 +++ attr.h| 1 + branch.h | 2 ++ bulk-checkin.h| 2 ++ column.h | 1 + commit-graph.h| 1 + config.h

[PATCHv2 0/6] Add missing includes and forward declares

2018-08-11 Thread Elijah Newren
This series fixes compilation errors when using a simple test.c file that includes git-compat-util.h and then exactly one other header (and repeating this for different headers of git). Changes since v1: - Followed Peff's suggestion to make my simple test .c file first #include
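
A header that only passes around pointers to a struct does not need the full definition. A forward declaration is enough for it to compile on its own after a single base include. Below is a minimal sketch of that idea. `struct node` and `list_length()` are hypothetical names for illustration and do not come from the series.

```c
#include <stddef.h>

/* --- what a self-contained header would contain --- */
struct node;                                  /* forward declaration: incomplete type */
size_t list_length(const struct node *head);  /* pointers to it are fine */

/* --- what the implementation file would contain --- */
struct node {
	int value;
	struct node *next;
};

size_t list_length(const struct node *head)
{
	size_t n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}
```

The prototype compiles before the struct is defined because only a pointer to the incomplete type is used. The full definition is needed only where members are accessed.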

[PATCHv2 5/6] compat/precompose_utf8.h: use more common include guard style

2018-08-11 Thread Elijah Newren
Signed-off-by: Elijah Newren --- compat/precompose_utf8.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/compat/precompose_utf8.h b/compat/precompose_utf8.h index a94e7c4342..6f843d3e1a 100644 --- a/compat/precompose_utf8.h +++ b/compat/precompose_utf8.h @@ -1,4 +1,6 @@

[PATCHv2 3/6] Move definition of enum branch_track from cache.h to branch.h

2018-08-11 Thread Elijah Newren
'branch_track' feels more closely related to branching, and it is needed later in branch.h; rather than #include'ing cache.h in branch.h for this small enum, just move the enum and the external declaration for git_branch_track to branch.h. Signed-off-by: Elijah Newren --- branch.h | 11

Re: [PATCH 1/2] fsck: use strbuf_getline() to read skiplist file

2018-08-11 Thread Jeff King
On Sat, Aug 11, 2018 at 05:39:27PM +0200, René Scharfe wrote: > The char array named "buffer" is unlikely to contain a NUL character, so > printing its contents using %s in a die() format is unsafe. Clang's > ASan reports running over the end of buffer in the recently added > skiplist tests in

Re: [PATCH 2/2] fsck: use oidset for skiplist

2018-08-11 Thread Ramsay Jones
On 11/08/18 16:47, René Scharfe wrote: > Object IDs to skip are stored in a shared static oid_array.  Lookups do > a binary search on the sorted array.  The code checks if the object IDs > are already in the correct order while loading and skips sorting in that > case. > > Simplify the code by

Re: [PATCH 1/9] Add missing includes and forward declares

2018-08-11 Thread Jeff King
On Fri, Aug 10, 2018 at 09:32:10PM -0700, Elijah Newren wrote: > diff --git a/argv-array.h b/argv-array.h > index a39ba43f57..c46238784c 100644 > --- a/argv-array.h > +++ b/argv-array.h > @@ -1,6 +1,8 @@ > #ifndef ARGV_ARRAY_H > #define ARGV_ARRAY_H > > +#include "git-compat-util.h" /* for

Re: [PATCH 0/9] Add missing includes and forward declares

2018-08-11 Thread Jeff King
On Sat, Aug 11, 2018 at 01:59:50AM -0700, Elijah Newren wrote: > The part of my story you snipped in the ellipsis is kind of important, > though: "...and decided to determine which header files were missing > their own necessary #include's and forward declarations." The way I > did so was making

Re: [PATCH 2/2] fsck: use oidset for skiplist

2018-08-11 Thread Jeff King
On Sat, Aug 11, 2018 at 01:02:48PM -0400, Jeff King wrote: > - we could probably improve the speed of oidset. Two things I notice > about its implementation: > > - it has to malloc for each entry, which I suspect is the main > bottleneck. We could probably pool-allocate blocks,

Re: [PATCH 2/2] fsck: use oidset for skiplist

2018-08-11 Thread Ævar Arnfjörð Bjarmason
On Sat, Aug 11 2018, René Scharfe wrote: > Object IDs to skip are stored in a shared static oid_array. Lookups do > a binary search on the sorted array. The code checks if the object IDs > are already in the correct order while loading and skips sorting in that > case. I think this change

Re: [PATCH 2/2] fsck: use oidset for skiplist

2018-08-11 Thread Jeff King
On Sat, Aug 11, 2018 at 05:47:56PM +0200, René Scharfe wrote: > Object IDs to skip are stored in a shared static oid_array. Lookups do > a binary search on the sorted array. The code checks if the object IDs > are already in the correct order while loading and skips sorting in that > case. > >

Re: Help with "fatal: unable to read ...." error during GC?

2018-08-11 Thread Jeff King
On Sat, Aug 11, 2018 at 04:38:00PM +0200, Duy Nguyen wrote: > On Sat, Aug 11, 2018 at 4:25 PM Jeff King wrote: > > Responding myself and adding Duy to the cc to increase visibility among > > worktree experts. :) > > I do silently watch this thread (and yes I still have to fix that fsck > thing,

[PATCH 2/2] fsck: use oidset for skiplist

2018-08-11 Thread René Scharfe
Object IDs to skip are stored in a shared static oid_array. Lookups do a binary search on the sorted array. The code checks if the object IDs are already in the correct order while loading and skips sorting in that case. Simplify the code by using an oidset instead. Memory usage is a bit
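
Below is a minimal sketch of the array-based scheme that the message describes as the starting point: append while tracking whether the input is still in order, sort lazily only if needed, then look up by binary search. It is an illustration under assumptions, not git's oid_array code. `struct id_array`, `id_array_append()` and `id_array_contains()` are hypothetical names, and IDs are assumed to be 20 raw bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ID_LEN 20

/* Hypothetical growable array of fixed-size IDs. */
struct id_array {
	unsigned char (*ids)[ID_LEN];
	size_t nr, alloc;
	int sorted; /* 1 while every appended ID arrived in order */
};

#define ID_ARRAY_INIT { NULL, 0, 0, 1 }

static int id_cmp(const void *a, const void *b)
{
	return memcmp(a, b, ID_LEN);
}

/* Append an ID; remember if it broke the sorted order. Returns 0 on success. */
static int id_array_append(struct id_array *arr, const unsigned char *id)
{
	assert(arr && id);
	if (arr->nr == arr->alloc) {
		size_t new_alloc = arr->alloc ? arr->alloc * 2 : 16;
		void *p = realloc(arr->ids, new_alloc * ID_LEN);
		if (!p)
			return -1;
		arr->ids = p;
		arr->alloc = new_alloc;
	}
	if (arr->nr && memcmp(arr->ids[arr->nr - 1], id, ID_LEN) > 0)
		arr->sorted = 0;
	memcpy(arr->ids[arr->nr++], id, ID_LEN);
	return 0;
}

/* Binary-search lookup; sorts first only if the input was out of order. */
static int id_array_contains(struct id_array *arr, const unsigned char *id)
{
	assert(arr && id);
	if (!arr->nr)
		return 0;
	if (!arr->sorted) {
		qsort(arr->ids, arr->nr, ID_LEN, id_cmp);
		arr->sorted = 1;
	}
	return bsearch(id, arr->ids, arr->nr, ID_LEN, id_cmp) != NULL;
}
```

A hash-based set removes the sorted-flag bookkeeping and the lazy sort, which is the simplification the patch is after. The memory and speed trade-offs are discussed in the replies.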

[PATCH 1/2] fsck: use strbuf_getline() to read skiplist file

2018-08-11 Thread René Scharfe
The char array named "buffer" is unlikely to contain a NUL character, so printing its contents using %s in a die() format is unsafe. Clang's ASan reports running over the end of buffer in the recently added skiplist tests in t5504-fetch-receive-strict.sh as a result. Use an idiomatic
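
Below is a minimal sketch of the hazard described above. It is independent of the patch's actual fix, which, per the subject line, switches to strbuf_getline(). When a buffer filled by a raw read has no terminating NUL, `%s` keeps reading past its end, whereas `%.*s` with an explicit length stops at the right place. `format_bad_line()` is a hypothetical helper, not a git function.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Format a message that quotes `len` bytes of `buf`, which need not be
 * NUL-terminated. The precision bounds how many bytes printf may read.
 */
static int format_bad_line(char *out, size_t outlen,
			   const char *buf, size_t len)
{
	int precision = len > INT_MAX ? INT_MAX : (int)len;

	return snprintf(out, outlen, "invalid entry: %.*s", precision, buf);
}
```

Passing the same buffer to a plain `%s` would be undefined behavior whenever no NUL happens to follow the data, which is the kind of overrun ASan reports.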

Re: Help with "fatal: unable to read ...." error during GC?

2018-08-11 Thread Duy Nguyen
On Sat, Aug 11, 2018 at 4:25 PM Jeff King wrote: > Responding myself and adding Duy to the cc to increase visibility among > worktree experts. :) I do silently watch this thread (and yes I still have to fix that fsck thing, hit a roadblock with ref names but I should really restart it soon). Now

Re: Help with "fatal: unable to read ...." error during GC?

2018-08-11 Thread Jeff King
On Sat, Aug 11, 2018 at 10:23:41AM -0400, Jeff King wrote: > > I do still have these warnings and no amount of git gc/git fsck/etc. > > has reduced them in any way: > > > > $ git gc > > warning: reflog of 'HEAD' references pruned commits > > warning: reflog of 'HEAD' references pruned commits >

Re: Help with "fatal: unable to read ...." error during GC?

2018-08-11 Thread Jeff King
On Sat, Aug 11, 2018 at 08:13:17AM -0400, Paul Smith wrote: > I rebuilt Git 2.18.0 without optimization to try to get more debug > information. Unfortunately I didn't think to create a backup of my > problematic .git directory. > > When I ran the above command under the debugger using the

Re: [PATCH v3 2/8] Add delta-islands.{c,h}

2018-08-11 Thread Jeff King
On Sat, Aug 11, 2018 at 12:32:32PM +0200, Christian Couder wrote: > Ok, I have made the following changes in the branch I will send next. > > diff --git a/delta-islands.c b/delta-islands.c > index 92137f2eca..22e4360810 100644 > --- a/delta-islands.c > +++ b/delta-islands.c > @@ -322,8 +322,7 @@

Re: [PATCH v3 1/1] clone: report duplicate entries on case-insensitive filesystems

2018-08-11 Thread Duy Nguyen
On Sat, Aug 11, 2018 at 12:09 PM SZEDER Gábor wrote: > > > > Paths that only differ in case work fine in a case-sensitive > > filesystems, but if those repos are cloned in a case-insensitive one, > > you'll get problems. The first thing to notice is "git status" will > > never be clean with no

Re: Help with "fatal: unable to read ...." error during GC?

2018-08-11 Thread Paul Smith
On Wed, 2018-08-08 at 14:24 -0400, Jeff King wrote: > If so, can you try running it under gdb and getting a stack trace? > Something like: > > gdb git > [and then inside gdb...] > set args pack-objects --all --reflog --indexed-objects foo > break die > run > bt > > That might give us

Re: [PATCH v3 2/8] Add delta-islands.{c,h}

2018-08-11 Thread Christian Couder
On Sat, Aug 11, 2018 at 11:04 AM, SZEDER Gábor wrote: >> diff --git a/delta-islands.c b/delta-islands.c >> new file mode 100644 >> index 00..448ddcbbe4 >> --- /dev/null >> +++ b/delta-islands.c > >> +static void deduplicate_islands(void) >> +{ >> + struct remote_island *island, *core

Re: [PATCH v3 1/1] clone: report duplicate entries on case-insensitive filesystems

2018-08-11 Thread SZEDER Gábor
> Paths that only differ in case work fine in a case-sensitive > filesystems, but if those repos are cloned in a case-insensitive one, > you'll get problems. The first thing to notice is "git status" will > never be clean with no indication what exactly is "dirty". > > This patch helps the

Re: [PATCH v3 2/8] Add delta-islands.{c,h}

2018-08-11 Thread SZEDER Gábor
> diff --git a/delta-islands.c b/delta-islands.c > new file mode 100644 > index 00..448ddcbbe4 > --- /dev/null > +++ b/delta-islands.c > +static void deduplicate_islands(void) > +{ > + struct remote_island *island, *core = NULL, **list; > + unsigned int island_count, dst, src,

Re: [PATCH 0/9] Add missing includes and forward declares

2018-08-11 Thread Elijah Newren
On Sat, Aug 11, 2018 at 1:30 AM Ævar Arnfjörð Bjarmason wrote: > On Sat, Aug 11 2018, Elijah Newren wrote: > > [CC'd sha1dc maintainers, for context the relevant patch is > https://public-inbox.org/git/20180811043218.31456-8-new...@gmail.com/T/#u] > > > * Patches 6-8: These patches might need to

Re: [PATCH 0/9] Add missing includes and forward declares

2018-08-11 Thread Ævar Arnfjörð Bjarmason
On Sat, Aug 11 2018, Elijah Newren wrote: [CC'd sha1dc maintainers, for context the relevant patch is https://public-inbox.org/git/20180811043218.31456-8-new...@gmail.com/T/#u] > * Patches 6-8: These patches might need to be submitted to separate > projects elsewhere. Let me know if so.