Sivakumar Selvam <gerritc...@gmail.com> writes:

> I ran git repack on a single larger repository abc.git where the pack
> file size 34 GB. Generally it used to take 20-25 minutes in my server to
> complete the repacking. During repacking I noticed, disk usage was more, So
> I thought of splitting the pack file into 4 GB chunks. I used the following
> command to do repacking.
>     git repack -A -b -d -q --depth=50 --window=10 abc.git
>
> After adding --max-pack-size=4g to the above command again I ran to split
> pack files..
>     git repack -A -b -d -q --depth=50 --window=10 --max-pack-size=4g abc.git
>
> When I finished running, I found 12 pack files with each 4 GB and the
> size is 48 GB. Now my disk usage has increased by 14 GB. Again, I ran to
> check the performance, but the size (48 GB) and time to repacking takes
> another 35 minutes more. Why this issue?

Hmmm, what is "this issue"? I do not see anything surprising.

If you have N objects and run repack with window=10, you would
(roughly speaking, without taking the various optimizations we have
and bootstrap conditions into account) check each of these N objects
against 10 other objects to find a good delta base, no matter how big
your max pack-size is set. And that takes the bulk of the time in the
repack process. On top of that, it has to write more data to disk (see
below), it has to find a good place to split, and it has to adjust
bookkeeping data at the pack boundary. In general it has to do more,
not less, to produce split packs. It would be surprising if it took
less time.

Each pack by definition has to be self-sufficient; every delta in the
pack must have its base object in the same pack. Now, imagine that an
object (call it X) would have been expressed as a delta derived from
another object (call it Y) if you were producing a single pack, and
imagine that the pack has grown to be 4 GB big just before you write
object X out. The current pack (which already contains the base object
Y) needs to be closed, and then a new pack is opened. Imagine how you
would write X into that new pack. You have to discard the deltified
representation of X (which by definition is much smaller, because it
is an instruction to reconstitute X given an object Y whose contents
are very similar to X) and write the base representation of X to the
pack, because X can no longer be expressed as a delta derived from Y.

That is why you would need to write more.
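
To make the size argument concrete, here is a toy model in Python. It is
only a sketch and not how pack-objects is actually implemented: the names
(Obj, pack_sizes) and the byte counts are made up for illustration, and it
assumes objects are written in a fixed order with each delta base written
before the objects that depend on it. The one rule it encodes is the one
described above: a delta may be used only if its base sits in the pack
currently being written.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Obj:
    full_size: int        # bytes when the object is stored whole
    delta_size: int       # bytes when stored as a delta against `base`
    base: Optional[int]   # index of the delta base in write order, or None


def pack_sizes(objs: List[Obj], max_pack_size: Optional[int] = None) -> List[int]:
    """Return the size of each pack produced by writing objs in order."""
    packs = [0]      # running size of each pack; the last one is open
    pack_of = {}     # object index -> index of the pack that holds it
    for i, o in enumerate(objs):
        current = len(packs) - 1
        # A delta is usable only when its base lives in the open pack.
        usable = o.base is not None and pack_of[o.base] == current
        size = o.delta_size if usable else o.full_size
        over = max_pack_size is not None and packs[current] + size > max_pack_size
        if over and packs[current] > 0:
            # Close the pack and open a new one. Any base this object had
            # is now in a closed pack, so the object must be stored whole.
            packs.append(0)
            current += 1
            size = o.full_size
        packs[current] += size
        pack_of[i] = current
    return packs


# One 1000-byte base object followed by nine near-identical revisions,
# each a 100-byte delta against the object written just before it.
objs = [Obj(1000, 1000, None)] + [Obj(1000, 100, i - 1) for i in range(1, 10)]

single = pack_sizes(objs)
split = pack_sizes(objs, max_pack_size=1200)
print(single, sum(single))   # [1900] 1900
print(split, sum(split))     # [1200, 1200, 1200, 1000] 4600
```

In this toy run the total grows from 1900 to 4600 bytes, because the first
object written into each new pack has to be stored whole even though a
small delta for it already existed. The numbers are exaggerated on purpose;
how much a real repository grows depends on how many delta chains happen to
straddle a pack boundary.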