Re: [git-sizer] Implications of a large commit object
On Wed, Mar 14, 2018 at 09:33:32AM +0100, Michael Haggerty wrote:

> Maybe your migration tool created a huge commit message, for example
> listing each of the files that was changed.
>
> AFAIK this won't cause Git itself any problems, but it's likely to be
> inconvenient. For example, when you type `git log` and 7 million
> characters page by. Or when you use some GUI tool to view your history
> and it performs badly because it wasn't built to handle such enormous
> commit messages.

Probably one such commit won't break the bank, but it will make history
traversals that cross it slower (e.g., "--contains", merge-bases, etc).
We'll load the whole 7MB object just to find its parents.

If you imagine the average commit object is more like 1k and that
current traversals bottleneck on loading the commit object bytes (both
of which I think are roughly accurate), then crossing that one commit in
a traversal is equivalent to crossing 7000 "normal" commits in cost.

At least until Stolee's serialized commit graph work is merged, at which
point it will only be expensive if we actually try to show the commit
message for that particular object.

-Peff
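
The estimate above is simple arithmetic and can be checked in a few lines of Python. This is only a sketch under the assumptions stated in the message (a typical commit object of roughly 1 KiB, and traversal cost proportional to the bytes loaded per commit). The function name and its default value are illustrative and are not part of Git.

```python
def equivalent_normal_commits(large_commit_bytes: int,
                              typical_commit_bytes: int = 1024) -> float:
    """Return how many typical commits cost as much to load as one large one.

    Assumes traversal cost scales linearly with commit object size, which
    is the rough model described in the message above.
    """
    if typical_commit_bytes <= 0:
        raise ValueError("typical_commit_bytes must be positive")
    return large_commit_bytes / typical_commit_bytes


# 7.33 MiB is the maximum commit size reported by git-sizer in this thread.
large = int(7.33 * 1024 * 1024)
print(round(equivalent_normal_commits(large)))
```

With these inputs the result lands in the same ballpark as the "7000 normal commits" figure. The exact number depends on what is assumed for a typical commit size.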
Re: [git-sizer] Implications of a large commit object
> On 14 Mar 2018, at 09:33, Michael Haggerty wrote:
>
> On Wed, Mar 14, 2018 at 9:14 AM, Lars Schneider wrote:
>> I am using Michael's fantastic Git repo analyzer tool "git-sizer" [*]
>> and it detected a very large commit of 7.33 MiB in my repo (see chart
>> below).
>>
>> This large commit is expected. I've imported that repo from another
>> version control system but excluded all binary files (e.g. images) and
>> some 3rd party components as their history is not important [**]. I've
>> reintroduced these files in the head commit again. This is where the
>> large commit came from.
>>
>> This repo is not used in production yet but I wonder if this kind of
>> approach can cause trouble down the line? Are there any relevant
>> implication of a single large commit like this in history?
>> [...]
>>
>> ###
>> ## git-sizer output
>>
>> [...]
>> | Name                         | Value     | Level of concern |
>> | ---------------------------- | --------- | ---------------- |
>> [...]
>> | Biggest objects              |           |                  |
>> | * Commits                    |           |                  |
>> |   * Maximum size         [1] |  7.33 MiB | !!               |
>> [...]
>
> The "commit size" that is being referred to here is the size of the
> actual commit object; i.e., the author name, parent commits, etc plus
> the log message. So a huge commit probably means that you have a huge
> log message. This has nothing to do with the number or sizes of the
> files added by the commit.
>
> Maybe your migration tool created a huge commit message, for example
> listing each of the files that was changed.

D'oh! Of course. I was so focused on that commit with the large number
of files that I missed that. Looking at the reference [1] reveals the
problem. Sorry for wasting your time!

> AFAIK this won't cause Git itself any problems, but it's likely to be
> inconvenient. For example, when you type `git log` and 7 million
> characters page by. Or when you use some GUI tool to view your history
> and it performs badly because it wasn't built to handle such enormous
> commit messages.

Thank you,
Lars
Re: [git-sizer] Implications of a large commit object
On Wed, Mar 14, 2018 at 9:14 AM, Lars Schneider wrote:
> I am using Michael's fantastic Git repo analyzer tool "git-sizer" [*]
> and it detected a very large commit of 7.33 MiB in my repo (see chart
> below).
>
> This large commit is expected. I've imported that repo from another
> version control system but excluded all binary files (e.g. images) and
> some 3rd party components as their history is not important [**]. I've
> reintroduced these files in the head commit again. This is where the
> large commit came from.
>
> This repo is not used in production yet but I wonder if this kind of
> approach can cause trouble down the line? Are there any relevant
> implication of a single large commit like this in history?
> [...]
>
> ###
> ## git-sizer output
>
> [...]
> | Name                         | Value     | Level of concern |
> | ---------------------------- | --------- | ---------------- |
> [...]
> | Biggest objects              |           |                  |
> | * Commits                    |           |                  |
> |   * Maximum size         [1] |  7.33 MiB | !!               |
> [...]

The "commit size" that is being referred to here is the size of the
actual commit object; i.e., the author name, parent commits, etc plus
the log message. So a huge commit probably means that you have a huge
log message. This has nothing to do with the number or sizes of the
files added by the commit.

Maybe your migration tool created a huge commit message, for example
listing each of the files that was changed.

AFAIK this won't cause Git itself any problems, but it's likely to be
inconvenient. For example, when you type `git log` and 7 million
characters page by. Or when you use some GUI tool to view your history
and it performs badly because it wasn't built to handle such enormous
commit messages.

Michael
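
To make the point about commit object size concrete, here is a minimal Python sketch that splits a raw commit object (the kind of content `git cat-file commit <id>` prints) into its header fields and its log message. The commit built below is synthetic: the object IDs, author identity, and file-list message are made up for illustration. `split_commit` is a hypothetical helper, not a Git API.

```python
from __future__ import annotations


def split_commit(raw: bytes) -> tuple[dict[str, list[str]], bytes]:
    """Split a raw commit object into header fields and the log message.

    The headers (tree, parent, author, committer, ...) end at the first
    blank line; everything after that is the log message.
    """
    header_block, _, message = raw.partition(b"\n\n")
    headers: dict[str, list[str]] = {}
    for line in header_block.decode("utf-8", errors="replace").split("\n"):
        if line.startswith(" "):
            continue  # continuation of a multi-line header such as gpgsig
        key, _, value = line.partition(" ")
        headers.setdefault(key, []).append(value)
    return headers, message


# Synthetic commit: tiny headers plus a message listing many changed files.
TREE = "a" * 40      # made-up object ID
PARENT = "b" * 40    # made-up object ID
message = ("changed: some/path/file.txt\n" * 1000).encode()
raw = (
    f"tree {TREE}\n"
    f"parent {PARENT}\n"
    "author A U Thor <author@example.com> 1521015212 +0100\n"
    "committer A U Thor <author@example.com> 1521015212 +0100\n"
    "\n"
).encode() + message

headers, body = split_commit(raw)
print("parents:", headers.get("parent", []))
print("header bytes:", len(raw) - len(body), "message bytes:", len(body))
```

The parent IDs sit in the first few hundred bytes, while almost all of the object's size comes from the message. The files a commit adds are stored in separate tree and blob objects, so they do not count toward this size.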
[git-sizer] Implications of a large commit object
Hi,

I am using Michael's fantastic Git repo analyzer tool "git-sizer" [*]
and it detected a very large commit of 7.33 MiB in my repo (see chart
below).

This large commit is expected. I've imported that repo from another
version control system but excluded all binary files (e.g. images) and
some 3rd party components as their history is not important [**]. I've
reintroduced these files in the head commit again. This is where the
large commit came from.

This repo is not used in production yet but I wonder if this kind of
approach can cause trouble down the line? Are there any relevant
implication of a single large commit like this in history?

Thanks,
Lars


[*] https://github.com/github/git-sizer

[**] I know some of this stuff shouldn't be in the repo in the first
     place, but I am constrained in the things I can change.


###
## git-sizer output

Processing blobs: 543782
Processing trees: 517104
Processing commits: 43365
Matching commits to trees: 43365
Processing annotated tags: 3
Processing references: 123
| Name                         | Value     | Level of concern |
| ---------------------------- | --------- | ---------------- |
| Overall repository size      |           |                  |
| * Blobs                      |           |                  |
|   * Total size               |  18.8 GiB | **               |
|                              |           |                  |
| Biggest objects              |           |                  |
| * Commits                    |           |                  |
|   * Maximum size         [1] |  7.33 MiB | !!               |
| * Trees                      |           |                  |
|   * Maximum entries      [2] |  6.84 k   | **               |
|                              |           |                  |
| History structure            |           |                  |
| * Maximum tag depth      [3] |     1     | *                |
|                              |           |                  |
| Biggest checkouts            |           |                  |
| * Number of directories  [4] |  21.9 k   | **               |
| * Maximum path depth     [4] |    18     | *                |
| * Maximum path length    [5] |   225 B   | **               |
| * Number of files        [4] |   256 k   | *                |
| * Total size of files    [6] |  2.08 GiB | **               |
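
As a rough cross-check of the "Maximum size" row for commits, the following Python sketch picks the largest commit out of lines in the default `<object id> <type> <size>` format that `git cat-file --batch-check` prints. The helper name and the sample lines are hypothetical and were made up for illustration. This is not how git-sizer itself computes the value.

```python
from typing import Iterable, Optional, Tuple


def largest_commit(lines: Iterable[str]) -> Optional[Tuple[str, int]]:
    """Return (object_id, size_in_bytes) of the biggest commit, or None.

    Each line is expected to look like "<object id> <type> <size>";
    lines that do not match, or that are not commits, are skipped.
    """
    best: Optional[Tuple[str, int]] = None
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or parts[1] != "commit":
            continue
        oid, size = parts[0], int(parts[2])
        if best is None or size > best[1]:
            best = (oid, size)
    return best


# Made-up sample data; real input would come from the repository.
sample = [
    "1" * 40 + " blob 9000000",
    "2" * 40 + " commit 912",
    "3" * 40 + " commit 7686062",
    "4" * 40 + " tree 340",
]
print(largest_commit(sample))
```

In a real repository the input lines could be produced with `git cat-file --batch-all-objects --batch-check`, and the resulting object ID inspected with `git cat-file -s <id>` or `git cat-file commit <id>`.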