[PATCH v2 02/36] t/helper: merge test-chmtime into test-tool

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy --- Makefile | 3 ++- t/helper/test-chmtime.c | 15 +++--- t/helper/test-tool.c | 1 + t/helper/test-tool.h | 2 ++ t/lib-git-svn.sh | 2

[PATCH v4 4/7] gc: add gc.bigPackThreshold config

2018-03-24 Thread Nguyễn Thái Ngọc Duy
The --keep-largest-pack option is not very convenient to use because you need to tell gc to do this explicitly (and probably on just a few large repos). Add a config key that enables this mode when packs larger than a limit are found. Note that there's a slight behavior difference compared to

[PATCH v4 3/7] gc: add --keep-largest-pack option

2018-03-24 Thread Nguyễn Thái Ngọc Duy
This adds a new repack mode that combines everything into a secondary pack, leaving the largest pack alone. This could help reduce memory pressure. On linux-2.6.git, valgrind massif reports 1.6GB heap in "pack all" case, and 535MB in "pack all except the base pack" case. We save roughly 1GB

[PATCH v2 01/36] t/helper: add an empty test-tool program

2018-03-24 Thread Nguyễn Thái Ngọc Duy
This will become an umbrella program that absorbs most [1] t/helper programs in. By having a single executable binary we reduce disk usage (libgit.a is replicated by every t/helper program) and shorten link time a bit. Running "make --jobs=1; du -sh t/helper" with ccache fully populated, it takes
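A minimal sketch of how such an umbrella program could dispatch to its sub-commands, assuming a simple name-to-function table. The `cmd__` prefix follows the cover letter for this series; the sub-commands `cmd__hello` and `cmd__fail`, the `struct test_cmd` layout, and the `dispatch()` helper are hypothetical names for illustration, not the actual contents of t/helper/test-tool.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sub-commands; real helpers would do actual test work. */
int cmd__hello(int argc, const char **argv)
{
	(void)argv;
	return argc;	/* report how many arguments were received */
}

int cmd__fail(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	return 1;
}

/* One table entry per helper that has been merged into the umbrella binary. */
struct test_cmd {
	const char *name;
	int (*fn)(int argc, const char **argv);
};

static const struct test_cmd cmds[] = {
	{ "hello", cmd__hello },
	{ "fail", cmd__fail },
};

/*
 * Look up argv[1] in the table and run it with the remaining arguments.
 * Returns the sub-command's exit code, or -1 if no sub-command matches.
 */
int dispatch(int argc, const char **argv)
{
	size_t i;

	if (argc < 2)
		return -1;
	for (i = 0; i < sizeof(cmds) / sizeof(cmds[0]); i++)
		if (!strcmp(cmds[i].name, argv[1]))
			return cmds[i].fn(argc - 1, argv + 1);
	return -1;
}
```

With a table like this, adding a helper means adding one row and one function, and all helpers share a single link against libgit.a.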

[PATCH v2 07/36] t/helper: merge test-date into test-tool

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy --- Makefile | 2 +- t/helper/test-date.c | 17 + t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + t/t0006-date.sh | 8 t/t1300-repo-config.sh | 2 +- t/test-lib.sh |

[PATCH v2 04/36] t/helper: merge test-lazy-init-name-hash into test-tool

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy --- Makefile | 2 +- t/helper/test-lazy-init-name-hash.c | 13 +++-- t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 +

[PATCH v2 00/36] Combine t/helper binaries into a single one

2018-03-24 Thread Nguyễn Thái Ngọc Duy
v2 fixes a couple of typos in commit messages and uses the cmd__ prefix for test commands instead of test_, which avoids a naming conflict with the existing function test_lazy_init_name_hash. Nguyễn Thái Ngọc Duy (36): t/helper: add an empty test-tool program t/helper: merge test-chmtime into

[PATCH v4 1/7] t7700: have closing quote of a test at the beginning of line

2018-03-24 Thread Nguyễn Thái Ngọc Duy
The closing quote of a test body by convention is always at the start of line. Signed-off-by: Nguyễn Thái Ngọc Duy Signed-off-by: Junio C Hamano --- t/t7700-repack.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t7700-repack.sh

Re: [RFC][GSoC] Project proposal: convert interactive rebase to C

2018-03-24 Thread Christian Couder
Hi, On Thu, Mar 22, 2018 at 11:03 PM, Alban Gruin wrote: > Hi, > > here is my second draft of my proposal. As last time, any feedback is > welcome :) > > I did not write my phone number and address here for obvious reasons, > but they will be in the “about me” section of

[PATCH v2 06/36] t/helper: merge test-ctype into test-tool

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy --- Makefile | 2 +- t/helper/test-ctype.c | 3 ++- t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + t/t0070-fundamental.sh | 2 +- 5 files changed, 6 insertions(+), 3 deletions(-) diff --git a/Makefile b/Makefile

[PATCH v4 6/7] gc --auto: exclude base pack if not enough mem to "repack -ad"

2018-03-24 Thread Nguyễn Thái Ngọc Duy
pack-objects can be a big memory hog, especially on large repos; everybody knows that. The suggestion to stick a .keep file on the giant base pack to avoid this problem has also been known for a long time. Recent patches add an option to do just this, but it has to be either configured or activated

[PATCH v4 5/7] gc: handle a corner case in gc.bigPackThreshold

2018-03-24 Thread Nguyễn Thái Ngọc Duy
This config allows us to keep packs back if their size is larger than a limit. But if this N >= gc.autoPackLimit, we may have a problem. We are supposed to reduce the number of packs after a threshold because it affects performance. We could tell the user that they have incompatible

[PATCH v4 7/7] pack-objects: show some progress when counting kept objects

2018-03-24 Thread Nguyễn Thái Ngọc Duy
We only show progress when there are new objects to be packed. But when --keep-pack is specified on the base pack, we will exclude most objects. This makes 'pack-objects' stay silent for a long time while the counting phase is going. Let's show some progress whenever we visit an object

[PATCH v4 2/7] repack: add --keep-pack option

2018-03-24 Thread Nguyễn Thái Ngọc Duy
We allow keeping existing packs by having companion .keep files. This is helpful when a pack is permanently kept. In the next patch, git-gc just wants to keep a pack temporarily, for one pack-objects run. git-gc can use --keep-pack for this use case. A note about why the pack_keep field cannot be

[PATCH v2 05/36] t/helper: merge test-config into test-tool

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy --- Makefile | 2 +- t/helper/test-config.c| 5 +++-- t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + t/t1305-config-include.sh | 2 +- t/t1308-config-set.sh | 22 +++---

[PATCH v2 03/36] t/helper: merge test-sha1 into test-tool

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy --- Documentation/howto/recover-corrupted-object-harder.txt | 2 +- Makefile | 4 ++-- t/helper/test-sha1.c | 3 ++- t/helper/test-sha1.sh

[PATCH v4 0/7] nd/repack-keep-pack updates

2018-03-24 Thread Nguyễn Thái Ngọc Duy
v4 is mostly refining tests and other minor fixes on v3 - --keep-base-pack is renamed to --keep-largest-pack - "Counting objects" progress line is back - test and docs updates Interdiff diff --git a/Documentation/config.txt b/Documentation/config.txt index 6b602f918f..cf862d3edf 100644 ---

Re: [RFC PATCH v5 4/8] Extract functions out of git_rebase__interactive

2018-03-24 Thread Eric Sunshine
On Fri, Mar 23, 2018 at 5:25 PM, Wink Saville wrote: > The extracted functions are: > [...] > Signed-off-by: Wink Saville > --- > diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh > @@ -740,8 +740,20 @@ get_missing_commit_check_level () { >

Re: [PATCH v1 0/2] perf/aggregate: sort result by regression

2018-03-24 Thread Christian Couder
On Fri, Mar 23, 2018 at 10:16 PM, Junio C Hamano wrote: > Christian Couder writes: > >> This small patch series makes it easy to spot big performance >> regressions, so that they can later be investigated. >> >> For example: >> >> $ ./aggregate.perl

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-24 Thread Jeff King
On Fri, Mar 23, 2018 at 04:01:50PM +, Ramsay Jones wrote: > Not that it matters, but I assume this was something like: > > $ time (echo HEAD | git cat-file --batch-check="%(objectsize:disk)") > > ... and I suspect it was on the linux.git repo, yes? Yes to both. > If I do this on my

[PATCH v7 03/13] pack-objects: use bitfield for object_entry::dfs_state

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy --- builtin/pack-objects.c | 3 +++ pack-objects.h | 28 +--- 2 files changed, 20 insertions(+), 11 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 647c01ea34..83f8154865

[PATCH v7 12/13] pack-objects: shrink delta_size field in struct object_entry

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Allowing a delta size of 64 bits is crazy. Shrink this field down to 31 bits with one overflow bit. If we find an existing delta larger than 2GB, we do not cache delta_size at all and will get the value from oe_size(), potentially from disk if it's larger than 4GB. Note, since DELTA_SIZE() is
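The "31 bits plus one overflow bit" idea can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: `struct entry`, `set_delta_size()` and `get_delta_size()` are hypothetical names, and the `full_size` member only stands in for the slow fallback path (oe_size() in the patch description), which in the real code would not be stored alongside the bitfield.

```c
#include <assert.h>
#include <stdint.h>

struct entry {
	unsigned delta_size_ : 31;	/* cached size, only meaningful when valid */
	unsigned delta_size_valid : 1;	/* 0 means "too big, use the slow path" */
	uint64_t full_size;		/* stand-in for the slow lookup (assumption) */
};

/* Cache the size in the bitfield only when it fits in 31 bits (below 2GB). */
void set_delta_size(struct entry *e, uint64_t size)
{
	e->full_size = size;
	if (size < (UINT64_C(1) << 31)) {
		e->delta_size_ = (unsigned)size;
		e->delta_size_valid = 1;
	} else {
		e->delta_size_ = 0;
		e->delta_size_valid = 0;
	}
}

/* Fast path reads the bitfield; oversized values fall back to the slow path. */
uint64_t get_delta_size(const struct entry *e)
{
	if (e->delta_size_valid)
		return e->delta_size_;
	return e->full_size;
}
```

The common case pays four bytes instead of eight, and only the rare oversized delta takes the slower route.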

[PATCH v7 11/13] pack-objects: shrink size field in struct object_entry

2018-03-24 Thread Nguyễn Thái Ngọc Duy
It's very very rare that an uncompressed object is larger than 4GB (partly because Git does not handle those large files very well to begin with). Let's optimize it for the common case where object size is smaller than this limit. Shrink size field down to 32 bits [1] and one overflow bit. If the

[PATCH v7 09/13] pack-objects: don't check size when the object is bad

2018-03-24 Thread Nguyễn Thái Ngọc Duy
sha1_object_info() in check_objects() may fail to locate an object in the pack and return type OBJ_BAD. In that case, it will likely leave the "size" field untouched. We delay error handling until later in prepare_pack() though. Until then, do not touch "size" field. This field should contain the

[PATCH v7 07/13] pack-objects: refer to delta objects by index instead of pointer

2018-03-24 Thread Nguyễn Thái Ngọc Duy
These delta pointers always point to elements in the objects[] array in the packing_data struct. We can only hold a maximum of 4G of those objects because the array size in nr_objects is uint32_t. We could use uint32_t indexes to address these elements instead of pointers. On 64-bit architecture (8 bytes
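A rough sketch of replacing a pointer with a 32-bit index into the same array. The struct and function names here (`struct obj`, `struct pack`, `get_delta()`, `set_delta()`) are hypothetical, and storing "index + 1" so that 0 can mean "no delta base" is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct obj {
	uint32_t delta_idx;	/* index of the base object + 1; 0 means no base */
	int payload;
};

struct pack {
	struct obj *objects;	/* all entries live in this one array */
	uint32_t nr_objects;
};

/* Translate the stored index back into a pointer, or NULL if there is none. */
struct obj *get_delta(const struct pack *p, const struct obj *o)
{
	return o->delta_idx ? &p->objects[o->delta_idx - 1] : NULL;
}

/* Store the base as an offset from the start of the array instead of a pointer. */
void set_delta(const struct pack *p, struct obj *o, struct obj *base)
{
	o->delta_idx = base ? (uint32_t)(base - p->objects) + 1 : 0;
}
```

Each such field then costs 4 bytes rather than 8 on a 64-bit platform, at the price of needing the owning array at every access.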

[PATCH v7 06/13] pack-objects: move in_pack out of struct object_entry

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Instead of using 8 bytes (on 64 bit arch) to store a pointer to a pack, use an index, since the number of packs should be relatively small. This limits the number of packs we can handle to 1k. Since we can't be sure people can never run into the situation where they have more than 1k pack

[PATCH v7 08/13] pack-objects: shrink z_delta_size field in struct object_entry

2018-03-24 Thread Nguyễn Thái Ngọc Duy
We only cache deltas when they are smaller than a certain limit. This limit defaults to 1000, but we save the compressed length in a 64-bit field. Shrink that field down to 16 bits, so you can only cache 65kb deltas. Larger deltas must be recomputed when the pack is written down. Signed-off-by: Nguyễn

[PATCH v7 02/13] pack-objects: turn type and in_pack_type to bitfields

2018-03-24 Thread Nguyễn Thái Ngọc Duy
An extra field type_valid is added to carry the equivalent of OBJ_BAD in the original "type" field. in_pack_type always contains a valid type so we only need 3 bits for it. A note about accepting OBJ_NONE as "valid" type. The function read_object_list_from_stdin() can pass this value [1] and it

[PATCH v7 10/13] pack-objects: clarify the use of object_entry::size

2018-03-24 Thread Nguyễn Thái Ngọc Duy
While this field most of the time contains the canonical object size, there is one case it does not: when we have found that the base object of the delta in question is also to be packed, we will very happily reuse the delta by copying it over instead of regenerating the new delta. "size" in this

[PATCH v7 05/13] pack-objects: move in_pack_pos out of struct object_entry

2018-03-24 Thread Nguyễn Thái Ngọc Duy
This field is only needed for pack-bitmap, which is an optional feature. Move it to a separate array that is only allocated when pack-bitmap is used (it's not freed in the same way that objects[] is not). Signed-off-by: Nguyễn Thái Ngọc Duy --- builtin/pack-objects.c | 3 ++-

[PATCH v7 04/13] pack-objects: use bitfield for object_entry::depth

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Because of struct packing, from now on we can only handle a max depth of 4095 (or even lower when new booleans are added to this struct). This should be OK since long delta chains will cause significant slowdown anyway. Signed-off-by: Nguyễn Thái Ngọc Duy ---
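The 4095 limit corresponds to a 12-bit field, since 2^12 - 1 = 4095. A minimal sketch, assuming illustrative names (`DEPTH_BITS`, `struct entry_flags`, `set_depth()`) that are not taken from the patch:

```c
#include <assert.h>

#define DEPTH_BITS 12				/* 12 bits hold values 0..4095 */
#define MAX_DEPTH ((1u << DEPTH_BITS) - 1)

struct entry_flags {
	unsigned depth : DEPTH_BITS;
	unsigned some_flag : 1;			/* further booleans share the same word */
};

/*
 * Store the depth only if it fits in the bitfield.
 * Returns 0 on success and -1 (leaving the field untouched) when it does not.
 */
int set_depth(struct entry_flags *e, unsigned depth)
{
	if (depth > MAX_DEPTH)
		return -1;
	e->depth = depth;
	return 0;
}
```

Checking the range before assignment matters because an out-of-range value assigned to an unsigned bitfield is silently truncated rather than rejected.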

[PATCH v7 00/13] nd/pack-objects-pack-struct updates

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Phew.. pack-objects is tough to crack. v7 changes - the 16k pack limit is removed thanks to Jeff's suggestion. The limit for memory saving though is reduced down to 1k again. - only object size below 2G is cached (previously 4G) to avoid 1 << 32 on 32 bits. - fix oe_size() retrieving wrong size

[PATCH v7 13/13] pack-objects: reorder members to shrink struct object_entry

2018-03-24 Thread Nguyễn Thái Ngọc Duy
Previous patches leave lots of holes and padding in this struct. This patch reorders the members and shrinks the struct down to 80 bytes (from 136 bytes, before any field shrinking is done) with 16 bits to spare (and a couple more in in_pack_header_size when we really run out of bits). This is
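Why reordering alone can shrink a struct: the compiler inserts padding so each member meets its alignment, and grouping members from largest to smallest tends to remove those holes. The two structs below are hypothetical and unrelated to the real struct object_entry layout. Exact sizes are implementation-defined, so the sketch only relies on the reordered layout being no larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small members interleaved with large ones leave padding holes. */
struct loose_order {
	uint8_t flag_a;
	uint64_t offset;
	uint8_t flag_b;
	uint32_t idx;
};

/* Same members, sorted from largest to smallest alignment. */
struct tight_order {
	uint64_t offset;
	uint32_t idx;
	uint8_t flag_a;
	uint8_t flag_b;
};

/* Bytes saved per entry by reordering on the current platform. */
size_t bytes_saved(void)
{
	return sizeof(struct loose_order) - sizeof(struct tight_order);
}
```

For an array holding millions of entries, even a few bytes saved per entry add up to a noticeable reduction in heap usage.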

[PATCH v7 01/13] pack-objects: a bit of document about struct object_entry

2018-03-24 Thread Nguyễn Thái Ngọc Duy
The role of this comment block becomes more important after we shuffle fields around to shrink this struct. It will be much harder to see what field is related to what. Signed-off-by: Nguyễn Thái Ngọc Duy --- pack-objects.h | 45 +
