Actually I don't believe the data loss is that large. (There may also be Mercurial commits that are intentionally ignored by the conversion script, like commits that only add tags?)

hg log | grep '^changeset:' | wc -l
313209

git log | grep '^commit ' | wc -l
301478

So there is a difference of 11731 commits (about 4%), but those couldn't have such a large impact on repository size.

I hope somebody else is willing to work with me on this so we document everything and do a reproducible repository conversion.

--emi

On Wed, Nov 23, 2016 at 9:10 PM, Emilian Bold <[email protected]> wrote:

> Well, I dunno what black magic `gc --aggressive` does but the repository
> is 0.85GB now!
>
> I also ran `git reflog expire` first but it didn't change the size at all.
>
> One thing to keep in mind is that I used --force although I had 6 commits
> with the warning "repository has at least one unnamed head". Which were
> probably all close branch commits (hg commit --close-branch).
>
> So I might have data loss(!) since I believe I read hg-fast-export.sh
> picks only one unnamed head as the migration winner. I wonder if the gc
> command didn't just purge a lot of valid commits from such an unnamed head
> and that's why the repository became so small.
>
> Could somebody else try a test repository conversion and validate my
> numbers?
>
> git gc --aggressive --prune=now
> Counting objects: 4085031, done.
> Delta compression using up to 8 threads.
> Compressing objects: 100% (2909203/2909203), done.
> Writing objects: 100% (4085031/4085031), done.
> Total 4085031 (delta 2150468), reused 1585934 (delta 0)
> Checking connectivity: 4085031, done.
>
> --emi
>
> On Wed, Nov 23, 2016 at 7:59 PM, Paul Merlin <[email protected]>
> wrote:
>
>> Hi Emilian,
>>
>> > I see hg-fast-export.sh finished at some point.
>> >
>> > As expected though, git does not have any of the disk space gains. The
>> > converted git releases/ repository is 3.6GB.
>>
>> Just a thought.
>> Did you try some git cleanups after the conversion?
>>
>> git reflog expire --expire=now --all
>> git gc --aggressive --prune=now
>>
>> Cheers
>>
>>
>> > In case these statistics mean something:
>> >
>> > git-fast-import statistics:
>> > ---------------------------------------------------------------------
>> > Alloc'd objects:    4090000
>> > Total objects:      4085509 (  40220100 duplicates )
>> >       blobs  :      1036365 (  28386238 duplicates     858087 deltas of     969684 attempts)
>> >       trees  :      2735935 (  11833862 duplicates    1370606 deltas of    2613480 attempts)
>> >       commits:       313209 (         0 duplicates          0 deltas of          0 attempts)
>> >       tags   :            0 (         0 duplicates          0 deltas of          0 attempts)
>> > Total branches:        1283 (       346 loads )
>> >       marks:        1048576 (    313209 unique )
>> >       atoms:         124011
>> > Memory total:        218429 KiB
>> >        pools:         26711 KiB
>> >      objects:        191718 KiB
>> > ---------------------------------------------------------------------
>> > pack_report: getpagesize()            = 4096
>> > pack_report: core.packedGitWindowSize = 1073741824
>> > pack_report: core.packedGitLimit      = 8589934592
>> > pack_report: pack_used_ctr            = 39000045
>> > pack_report: pack_mmap_calls          = 733040
>> > pack_report: pack_open_windows        = 4 / 7
>> > pack_report: pack_mapped              = 4280730006 / 6950823920
>> > ---------------------------------------------------------------------
>> >
>> >
>> > --emi
>> >
>> > On Fri, Nov 18, 2016 at 1:32 PM, Emilian Bold <[email protected]>
>> > wrote:
>> >
>> >> A releases/ clone which on my system takes 3.8GB is reduced to 1.6GB with
>> >> the generaldelta and aggressivemergedeltas flags (took about 14 hours).
>> >>
>> >> Pretty impressive!
>> >>
>> >> Converting to git with hg-fast-export.sh complains that "repository has at
>> >> least one unnamed head" for about 6 revisions. With --force I'm able to
>> >> start the conversion but it hasn't finished yet.
>> >>
>> >> The git conversion is about 35% done and already using 1.3GB.
>> >>
>> >> So... I assume it's going to need just like the original repository about
>> >> 3.8GB.
>> >>
>> >> I wonder if git has similar space-saving tricks?
>> >>
>> >> --emi
>> >>
>> >> On Thu, Nov 17, 2016 at 8:46 AM, Emilian Bold <[email protected]>
>> >> wrote:
>> >>
>> >>> Forgot about this. I've just started the Mercurial repository conversion
>> >>> which will take a few hours.
>> >>>
>> >>> Will report tomorrow or when it's done.
>> >>>
>> >>>
>> >>> --emi
>> >>>
>> >>> On Wed, Nov 16, 2016 at 11:18 PM, cowwoc <[email protected]> wrote:
>> >>>
>> >>>> Hi Emilian,
>> >>>>
>> >>>> Any update on this?
>> >>>>
>> >>>> Thanks,
>> >>>> Gili
>> >>>>
>> >>>>
>> >>>> On 2016-11-11 01:33 (-0500), Emilian Bold <[email protected]> wrote:
>> >>>>> Thank you for following through with this after we talked on IRC.
>> >>>>>
>> >>>>> I will check later the size reduction for the releases/ repo.
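
For anyone repeating the commit-count comparison at the top of this message, here is a minimal sketch of the arithmetic. The helper name `commit_gap` is hypothetical, not part of hg-fast-export. One thing that might be worth checking: a plain `git log` lists only commits reachable from HEAD, while `hg log` lists every changeset in the repository, so `git rev-list --all --count` may be the closer comparison. The git-fast-import statistics quoted above do report 313209 commits imported.

```shell
#!/bin/sh
# Sketch only: commit_gap is a hypothetical helper for this thread.
# Prints the difference between two commit counts, followed by that
# difference as a percentage of the first (Mercurial) count.
commit_gap() {
    hg_count=$1
    git_count=$2
    awk -v h="$hg_count" -v g="$git_count" \
        'BEGIN { printf "%d %.1f\n", h - g, (h - g) * 100 / h }'
}

# With the numbers reported in this message:
commit_gap 313209 301478

# To collect the counts directly, run inside the respective clones:
#   hg_count=$(hg log | grep -c '^changeset:')
#   git_count=$(git rev-list --all --count)   # all refs, not just HEAD
#   commit_gap "$hg_count" "$git_count"
```

If the `--all` count comes out close to the Mercurial count, the gap would be explained by branches not reachable from HEAD rather than by lost commits; that is an assumption to verify, not a confirmed result.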
