Signed-off-by: Nguyễn Thái Ngọc Duy
---
Makefile | 3 ++-
t/helper/test-chmtime.c | 15 +++---
t/helper/test-tool.c | 1 +
t/helper/test-tool.h | 2 ++
t/lib-git-svn.sh | 2
The --keep-largest-pack option is not very convenient to use because
you need to tell gc to do this explicitly (and probably on just a few
large repos).
Add a config key that enables this mode when packs larger than a limit
are found. Note that there's a slight behavior difference compared to
This adds a new repack mode that combines everything into a secondary
pack, leaving the largest pack alone.
This could help reduce memory pressure. On linux-2.6.git, valgrind
massif reports a 1.6GB heap in the "pack all" case, and 535MB in the
"pack all except the base pack" case. We save roughly 1GB
This will become an umbrella program that absorbs most [1] of the
t/helper programs. By having a single executable binary we reduce disk
usage (libgit.a is replicated in every t/helper program) and shorten
link time a bit.
Running "make --jobs=1; du -sh t/helper" with ccache fully populated,
it takes
Signed-off-by: Nguyễn Thái Ngọc Duy
---
Makefile | 2 +-
t/helper/test-date.c | 17 +
t/helper/test-tool.c | 1 +
t/helper/test-tool.h | 1 +
t/t0006-date.sh | 8
t/t1300-repo-config.sh | 2 +-
t/test-lib.sh |
Signed-off-by: Nguyễn Thái Ngọc Duy
---
Makefile | 2 +-
t/helper/test-lazy-init-name-hash.c | 13 +++--
t/helper/test-tool.c | 1 +
t/helper/test-tool.h | 1 +
v2 fixes a couple of typos in commit messages and uses the cmd__ prefix
for test commands instead of test_, which avoids a naming conflict
with the existing function test_lazy_init_name_hash
Nguyễn Thái Ngọc Duy (36):
t/helper: add an empty test-tool program
t/helper: merge test-chmtime into
The closing quote of a test body is, by convention, always at the start
of the line.
Signed-off-by: Nguyễn Thái Ngọc Duy
Signed-off-by: Junio C Hamano
---
t/t7700-repack.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/t7700-repack.sh
Hi,
On Thu, Mar 22, 2018 at 11:03 PM, Alban Gruin wrote:
> Hi,
>
> here is my second draft of my proposal. As last time, any feedback is
> welcome :)
>
> I did not write my phone number and address here for obvious reasons,
> but they will be in the “about me” section of
Signed-off-by: Nguyễn Thái Ngọc Duy
---
Makefile | 2 +-
t/helper/test-ctype.c | 3 ++-
t/helper/test-tool.c | 1 +
t/helper/test-tool.h | 1 +
t/t0070-fundamental.sh | 2 +-
5 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/Makefile b/Makefile
pack-objects could be a big memory hog especially on large repos,
everybody knows that. The suggestion to stick a .keep file on the
giant base pack to avoid this problem is also known for a long time.
Recent patches add an option to do just this, but it has to be either
configured or activated
This config allows us to keep packs back if their size is larger
than a limit. But if this N >= gc.autoPackLimit, we may have a
problem. We are supposed to reduce the number of packs after a
threshold because it affects performance.
We could tell the user that they have incompatible
We only show progress when there are new objects to be packed. But
when --keep-pack is specified on the base pack, we will exclude most
of the objects. This makes 'pack-objects' stay silent for a long time
while the counting phase is running.
Let's show some progress whenever we visit an object
We allow keeping existing packs by having companion .keep files. This
is helpful when a pack is permanently kept. In the next patch, git-gc
just wants to keep a pack temporarily, for one pack-objects
run. git-gc can use --keep-pack for this use case.
A note about why the pack_keep field cannot be
Signed-off-by: Nguyễn Thái Ngọc Duy
---
Makefile | 2 +-
t/helper/test-config.c | 5 +++--
t/helper/test-tool.c | 1 +
t/helper/test-tool.h | 1 +
t/t1305-config-include.sh | 2 +-
t/t1308-config-set.sh | 22 +++---
Signed-off-by: Nguyễn Thái Ngọc Duy
---
Documentation/howto/recover-corrupted-object-harder.txt | 2 +-
Makefile | 4 ++--
t/helper/test-sha1.c | 3 ++-
t/helper/test-sha1.sh
v4 is mostly test refinements and other minor fixes on top of v3
- --keep-base-pack is renamed to --keep-largest-pack
- "Counting objects" progress line is back
- test and docs updates
Interdiff
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6b602f918f..cf862d3edf 100644
---
On Fri, Mar 23, 2018 at 5:25 PM, Wink Saville wrote:
> The extracted functions are:
> [...]
> Signed-off-by: Wink Saville
> ---
> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
> @@ -740,8 +740,20 @@ get_missing_commit_check_level () {
>
On Fri, Mar 23, 2018 at 10:16 PM, Junio C Hamano wrote:
> Christian Couder writes:
>
>> This small patch series makes it easy to spot big performance
>> regressions, so that they can later be investigated.
>>
>> For example:
>>
>> $ ./aggregate.perl
On Fri, Mar 23, 2018 at 04:01:50PM +0000, Ramsay Jones wrote:
> Not that it matters, but I assume this was something like:
>
> $ time (echo HEAD | git cat-file --batch-check="%(objectsize:disk)")
>
> ... and I suspect it was on the linux.git repo, yes?
Yes to both.
> If I do this on my
Signed-off-by: Nguyễn Thái Ngọc Duy
---
builtin/pack-objects.c | 3 +++
pack-objects.h | 28 +---
2 files changed, 20 insertions(+), 11 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 647c01ea34..83f8154865
Allowing a delta size of 64 bits is crazy. Shrink this field down to
31 bits with one overflow bit.
If we find an existing delta larger than 2GB, we do not cache
delta_size at all and will get the value from oe_size(), potentially
from disk if it's larger than 4GB.
Note, since DELTA_SIZE() is
It's very very rare that an uncompressed object is larger than 4GB
(partly because Git does not handle those large files very well to
begin with). Let's optimize it for the common case where object size
is smaller than this limit.
Shrink size field down to 32 bits [1] and one overflow bit. If the
sha1_object_info() in check_objects() may fail to locate an object in
the pack and return type OBJ_BAD. In that case, it will likely leave
the "size" field untouched. We delay error handling until later in
prepare_pack() though. Until then, do not touch "size" field.
This field should contain the
These delta pointers always point to elements in the objects[] array
in packing_data struct. We can only hold maximum 4G of those objects
because the array size in nr_objects is uint32_t. We could use
uint32_t indexes to address these elements instead of pointers. On
64-bit architecture (8 bytes
Instead of using 8 bytes (on a 64-bit arch) to store a pointer to a
pack, use an index, since the number of packs should be
relatively small.
This limits the number of packs we can handle to 1k. Since we can't be
sure people will never run into the situation where they have more than
1k pack
We only cache deltas when they are smaller than a certain limit. This
limit defaults to 1000, but we save the compressed length in a 64-bit
field. Shrink that field down to 16 bits, so we can only cache deltas
up to 65KB. Larger deltas must be recomputed when the pack is
written out.
Signed-off-by: Nguyễn
An extra field type_valid is added to carry the equivalent of OBJ_BAD
in the original "type" field. in_pack_type always contains a valid
type so we only need 3 bits for it.
A note about accepting OBJ_NONE as "valid" type. The function
read_object_list_from_stdin() can pass this value [1] and it
While this field most of the time contains the canonical object size,
there is one case where it does not: when we have found that the base
object of the delta in question is also to be packed, we will very
happily reuse the delta by copying it over instead of regenerating the
new delta.
"size" in this
This field is only needed for pack-bitmap, which is an optional
feature. Move it to a separate array that is only allocated when
pack-bitmap is used (like objects[], it is not freed).
Signed-off-by: Nguyễn Thái Ngọc Duy
---
builtin/pack-objects.c | 3 ++-
Because of struct packing, from now on we can only handle a max depth
of 4095 (or even lower when new booleans are added to this struct).
This should be OK, since long delta chains cause significant slowdown
anyway.
Signed-off-by: Nguyễn Thái Ngọc Duy
---
Phew.. pack-objects is tough to crack. v7 changes:
- the 16k pack limit is removed thanks to Jeff's suggestion. The limit
for memory saving, though, is reduced down to 1k again.
- only object sizes below 2GB are cached (previously 4GB) to avoid
computing 1 << 32 on 32-bit platforms.
- fix oe_size() retrieving the wrong size
Previous patches leave lots of holes and padding in this struct. This
patch reorders the members and shrinks the struct down to 80 bytes
(from 136 bytes, before any field shrinking is done) with 16 bits to
spare (and a couple more in in_pack_header_size when we really run out
of bits).
This is
The role of this comment block becomes more important after we shuffle
fields around to shrink this struct. It will be much harder to see what
field is related to what.
Signed-off-by: Nguyễn Thái Ngọc Duy
---
pack-objects.h | 45 +