Re: [PATCH 3/5] pack-objects: add --sparse option
On 11/29/2018 9:39 PM, Junio C Hamano wrote: Derrick Stolee writes: While _eventually_ we should make this opt-out, we shouldn't do that until it has cooked a while. I actually do not agree. If the knob gives enough benefit, the users will learn about it viva voce, and in a few more releases we'll hear "enough users complain that they have to turn it on, let's make it the default". If that does not happen, the knob does not deserve to be turned on in the first place. That's a good philosophy. I'll keep it in mind. Having said that, I won't be commenting on this shiny new toy before the final. I want to see more people help tying the loose ends and give it final varnish to the upcoming release, as it seems to have become rockier and larger release than we originally anticipated. Yeah, no rush on this one. I just wanted to get some initial feedback about the idea. Thanks, -Stolee
Re: [PATCH 3/5] pack-objects: add --sparse option
Derrick Stolee writes: > You're right that having this hidden as an opt-in config variable > makes it hard to discover as a typical user. > > I would argue that we should actually make the config setting true by > default, and recommend that servers opt-out. Here are my reasons: > > 1. The vast majority of users are clients. > > 2. Client users are not likely to know about and tweak these settings. > > 3. Server users are more likely to keep an eye on the different knobs > they can tweak. > > 4. Servers should use the reachability bitmaps, which don't use this > logic anyway. > > While _eventually_ we should make this opt-out, we shouldn't do that > until it has cooked a while. I actually do not agree. If the knob gives enough benefit, the users will learn about it viva voce, and in a few more releases we'll hear "enough users complain that they have to turn it on, let's make it the default". If that does not happen, the knob does not deserve to be turned on in the first place. The same applies to many shiny new toys people are discussing recently on this list (e.g. precious vs expendable and non-overlay checkout are the ones that immediately come to my mind). Having said that, I won't be commenting on this shiny new toy before the final. I want to see more people help tying the loose ends and give it final varnish to the upcoming release, as it seems to have become rockier and larger release than we originally anticipated.
Re: [PATCH 3/5] pack-objects: add --sparse option
On 11/28/2018 5:11 PM, Stefan Beller wrote: +--sparse:: + Use the "sparse" algorithm to determine which objects to include in + the pack. This can have significant performance benefits when computing + a pack to send a small change. However, it is possible that extra + objects are added to the pack-file if the included commits contain + certain types of direct renames. As a user, where do I find a discussion of these walking algorithms? (i.e. how can I tell if I can really expect to gain performance benefits, what are the tradeoffs? "If it were strictly better than the original, it would be on by default" might be a thought of a user) You're right that having this hidden as an opt-in config variable makes it hard to discover as a typical user. I would argue that we should actually make the config setting true by default, and recommend that servers opt-out. Here are my reasons: 1. The vast majority of users are clients. 2. Client users are not likely to know about and tweak these settings. 3. Server users are more likely to keep an eye on the different knobs they can tweak. 4. Servers should use the reachability bitmaps, which don't use this logic anyway. While _eventually_ we should make this opt-out, we shouldn't do that until it has cooked a while. Thanks, -Stolee
Re: [PATCH 3/5] pack-objects: add --sparse option
> +--sparse:: > + Use the "sparse" algorithm to determine which objects to include in > + the pack. This can have significant performance benefits when > computing > + a pack to send a small change. However, it is possible that extra > + objects are added to the pack-file if the included commits contain > + certain types of direct renames. As a user, where do I find a discussion of these walking algorithms? (i.e. how can I tell if I can really expect to gain performance benefits, what are the tradeoffs? "If it were strictly better than the original, it would be on by default" might be a thought of a user)
[PATCH 3/5] pack-objects: add --sparse option
From: Derrick Stolee Add a '--sparse' option flag to the pack-objects builtin. This allows the user to specify that they want to use the new logic for walking trees. This logic currently does not differ from the existing output, but will in a later change. Create a new test script, t5322-pack-objects-sparse.sh, to ensure the object list that is selected matches what we expect. When we update the logic to walk in a sparse fashion, the final test will be updated to show the extra objects that are added. Signed-off-by: Derrick Stolee --- Documentation/git-pack-objects.txt | 9 ++- builtin/pack-objects.c | 5 +- t/t5322-pack-objects-sparse.sh | 115 + 3 files changed, 127 insertions(+), 2 deletions(-) create mode 100755 t/t5322-pack-objects-sparse.sh diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index 40c825c381..ced2630eb3 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -14,7 +14,7 @@ SYNOPSIS [--local] [--incremental] [--window=] [--depth=] [--revs [--unpacked | --all]] [--keep-pack=] [--stdout [--filter=] | base-name] - [--shallow] [--keep-true-parents] < object-list + [--shallow] [--keep-true-parents] [--sparse] < object-list DESCRIPTION @@ -196,6 +196,13 @@ depth is 4095. Add --no-reuse-object if you want to force a uniform compression level on all data no matter the source. +--sparse:: + Use the "sparse" algorithm to determine which objects to include in + the pack. This can have significant performance benefits when computing + a pack to send a small change. However, it is possible that extra + objects are added to the pack-file if the included commits contain + certain types of direct renames. + --thin:: Create a "thin" pack by omitting the common objects between a sender and a receiver in order to reduce network transfer. This diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 5f70d840a7..7d5b0735e3 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -84,6 +84,7 @@ static unsigned long pack_size_limit; static int depth = 50; static int delta_search_threads; static int pack_to_stdout; +static int sparse; static int thin; static int num_preferred_base; static struct progress *progress_state; @@ -3135,7 +3136,7 @@ static void get_object_list(int ac, const char **av) if (prepare_revision_walk()) die(_("revision walk setup failed")); - mark_edges_uninteresting(, show_edge, 0); + mark_edges_uninteresting(, show_edge, sparse); if (!fn_show_object) fn_show_object = show_object; @@ -3292,6 +3293,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) { OPTION_CALLBACK, 0, "unpack-unreachable", NULL, N_("time"), N_("unpack unreachable objects newer than "), PARSE_OPT_OPTARG, option_parse_unpack_unreachable }, + OPT_BOOL(0, "sparse", , +N_("use the sparse reachability algorithm")), OPT_BOOL(0, "thin", , N_("create thin packs")), OPT_BOOL(0, "shallow", , diff --git a/t/t5322-pack-objects-sparse.sh b/t/t5322-pack-objects-sparse.sh new file mode 100755 index 00..81f6805bc3 --- /dev/null +++ b/t/t5322-pack-objects-sparse.sh @@ -0,0 +1,115 @@ +#!/bin/sh + +test_description='pack-objects object selection using sparse algorithm' +. ./test-lib.sh + +test_expect_success 'setup repo' ' + test_commit initial && + for i in $(test_seq 1 3) + do + mkdir f$i && + for j in $(test_seq 1 3) + do + mkdir f$i/f$j && + echo $j >f$i/f$j/data.txt + done + done && + git add . && + git commit -m "Initialized trees" && + for i in $(test_seq 1 3) + do + git checkout -b topic$i master && + echo change-$i >f$i/f$i/data.txt && + git commit -a -m "Changed f$i/f$i/data.txt" + done && + cat >packinput.txt <<-EOF && + topic1 + ^topic2 + ^topic3 + EOF + git rev-parse \ + topic1 \ + topic1^{tree} \ + topic1:f1 \ + topic1:f1/f1\ + topic1:f1/f1/data.txt | sort >expect_objects.txt +' + +test_expect_success 'non-sparse pack-objects' ' + git pack-objects --stdout --revs nonsparse.pack && + git index-pack -o nonsparse.idx nonsparse.pack && + git show-index nonsparse_objects.txt && + test_cmp expect_objects.txt nonsparse_objects.txt +' + +test_expect_success 'sparse pack-objects' ' + git pack-objects --stdout --revs --sparse sparse.pack && +