Emilian, Jan, Mark, great work. A smooth migration from Hg to Git is essential for a successful migration to Apache. Thanks a lot for investigating how to do that.
My plan (as described in another email) is to prepare the code donation in Hg and update it incrementally with code integrated into Hg. Are your conversion methods ready for incremental updates, or do they only work as a one-time batch conversion?

-jt

On Thursday, 24 November 2016 10:41:50 CET Jan Lahoda wrote:
> Interesting. I tried "git gc --aggressive" on Mark's converted
> repository, and the result is:
> netbeans-import/.git$ du -hs .
> 792M .
>
> The original was:
> netbeans-import.git $ du -hs .
> 3,5G .
>
> (IIRC Mark was converting http://hg.netbeans.org/main, not releases, so the
> repository is a little bit smaller than the releases one.)
>
> I tried:
> $ git log -p | sha1sum
>
> on both repositories, and the hashes appear to be the same. I also tried to
> clone the gc-ed repository using git clone --bare --no-local, and the
> resulting repository is still about the same size. So, this seems good to
> me, unless there is some downside I don't know about.
>
> Jan
>
> On Wed, Nov 23, 2016 at 8:26 PM, Emilian Bold <[email protected]> wrote:
> > Actually I don't believe the data loss is that large. (There may also be
> > Mercurial commits that are intentionally ignored by the conversion script,
> > like commits that only add tags?)
> >
> > hg log | grep '^changeset:' | wc -l
> > 313209
> >
> > git log | grep '^commit ' | wc -l
> > 301478
> >
> > So there is a difference of 11731 commits (about 4%) but those couldn't
> > have such a large impact on repository size.
> >
> > I hope somebody else is willing to work with me on this so we document
> > everything and do a reproducible repository conversion.
> >
> > --emi
> >
> > On Wed, Nov 23, 2016 at 9:10 PM, Emilian Bold <[email protected]> wrote:
> > > Well, I dunno what black magic `gc --aggressive` does but the repository
> > > is 0.85GB now!
> > >
> > > I also ran `git reflog expire` first but it didn't change the size at all.
> > >
> > > One thing to keep in mind is that I used --force although I had 6 commits
> > > with the warning "repository has at least one unnamed head", which were
> > > probably all close-branch commits (hg commit --close-branch).
> > >
> > > So I might have data loss(!) since I believe I read that hg-fast-export.sh
> > > picks only one unnamed head as the migration winner. I wonder if the gc
> > > command didn't just purge a lot of valid commits from such an unnamed head
> > > and that's why the repository became so small.
> > >
> > > Could somebody else try a test repository conversion and validate my
> > > numbers?
> > >
> > > git gc --aggressive --prune=now
> > > Counting objects: 4085031, done.
> > > Delta compression using up to 8 threads.
> > > Compressing objects: 100% (2909203/2909203), done.
> > > Writing objects: 100% (4085031/4085031), done.
> > > Total 4085031 (delta 2150468), reused 1585934 (delta 0)
> > > Checking connectivity: 4085031, done.
> > >
> > > --emi
> > >
> > > On Wed, Nov 23, 2016 at 7:59 PM, Paul Merlin <[email protected]> wrote:
> > >> Hi Emilian,
> > >>
> > >> > I see hg-fast-export.sh finished at some point.
> > >> >
> > >> > As expected though, git does not have any of the disk space gains.
> > >> > The converted git releases/ repository is 3.6GB.
> > >>
> > >> Just a thought.
> > >> Did you try some git cleanups after the conversion?
> > >>
> > >> git reflog expire --expire=now --all
> > >> git gc --aggressive --prune=now
> > >>
> > >> Cheers
> > >>
> > >> > In case these statistics mean something:
> > >> >
> > >> > git-fast-import statistics:
> > >> > ---------------------------------------------------------------------
> > >> > Alloc'd objects:    4090000
> > >> > Total objects:      4085509 (  40220100 duplicates                  )
> > >> >       blobs  :      1036365 (  28386238 duplicates     858087 deltas of     969684 attempts)
> > >> >       trees  :      2735935 (  11833862 duplicates    1370606 deltas of    2613480 attempts)
> > >> >       commits:       313209 (         0 duplicates          0 deltas of          0 attempts)
> > >> >       tags   :            0 (         0 duplicates          0 deltas of          0 attempts)
> > >> > Total branches:        1283 (       346 loads     )
> > >> >       marks:        1048576 (    313209 unique    )
> > >> >       atoms:         124011
> > >> > Memory total:        218429 KiB
> > >> >        pools:         26711 KiB
> > >> >      objects:        191718 KiB
> > >> > ---------------------------------------------------------------------
> > >> > pack_report: getpagesize()            =       4096
> > >> > pack_report: core.packedGitWindowSize = 1073741824
> > >> > pack_report: core.packedGitLimit      = 8589934592
> > >> > pack_report: pack_used_ctr            =   39000045
> > >> > pack_report: pack_mmap_calls          =     733040
> > >> > pack_report: pack_open_windows        =          4 /          7
> > >> > pack_report: pack_mapped              = 4280730006 / 6950823920
> > >> > ---------------------------------------------------------------------
> > >> >
> > >> > --emi
> > >> >
> > >> > On Fri, Nov 18, 2016 at 1:32 PM, Emilian Bold <[email protected]> wrote:
> > >> >> A releases/ clone which on my system takes 3.8GB is reduced to 1.6GB with
> > >> >> the generaldelta and aggressivemergedeltas flags (took about 14 hours).
> > >> >> Pretty impressive!
> > >> >>
> > >> >> Converting to git with hg-fast-export.sh complains that "repository has at
> > >> >> least one unnamed head" for about 6 revisions. With --force I'm able to
> > >> >> start the conversion but it hasn't finished yet.
> > >> >>
> > >> >> The git conversion is about 35% done and already using 1.3GB.
> > >> >>
> > >> >> So... I assume it's going to need just like the original repository about
> > >> >> 3.8GB.
> > >> >>
> > >> >> I wonder if git has similar space-saving tricks?
> > >> >>
> > >> >> --emi
> > >> >>
> > >> >> On Thu, Nov 17, 2016 at 8:46 AM, Emilian Bold <[email protected]> wrote:
> > >> >>> Forgot about this. I've just started the Mercurial repository conversion
> > >> >>> which will take a few hours.
> > >> >>>
> > >> >>> Will report tomorrow or when it's done.
> > >> >>>
> > >> >>> --emi
> > >> >>>
> > >> >>> On Wed, Nov 16, 2016 at 11:18 PM, cowwoc <[email protected]> wrote:
> > >> >>>> Hi Emilian,
> > >> >>>>
> > >> >>>> Any update on this?
> > >> >>>>
> > >> >>>> Thanks,
> > >> >>>> Gili
> > >> >>>>
> > >> >>>> On 2016-11-11 01:33 (-0500), Emilian Bold <[email protected]> wrote:
> > >> >>>>> Thank you for following through with this after we talked on
> > >> >>>>> IRC.
> > >> >>>>>
> > >> >>>>> I will check later the size reduction for the releases/ repo.
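
The checks discussed in this thread (comparing commit counts, hashing the patch history, and repacking before measuring size) could be scripted roughly as follows. This is only a minimal sketch, not the procedure anyone in the thread actually ran: the function names commit_shortfall, history_fingerprint and repack_and_measure are hypothetical, and it assumes git, awk, sha1sum and du are available on the PATH.

```shell
#!/bin/sh
# Sketch only: all function names below are hypothetical helpers.

# Print how many commits are missing from the converted repository,
# as a count and as a percentage of the Mercurial commit count.
# Usage: commit_shortfall <hg_commit_count> <git_commit_count>
commit_shortfall() {
    awk -v hg="$1" -v git="$2" \
        'BEGIN { d = hg - git; printf "%d missing (%.1f%%)\n", d, 100 * d / hg }'
}

# Hash the full patch history of a Git repository so that two
# repositories (for example before and after gc) can be compared.
# Usage: history_fingerprint <path-to-repo>
history_fingerprint() {
    git -C "$1" log -p | sha1sum | cut -d' ' -f1
}

# Expire reflogs, repack aggressively, then report the on-disk size.
# Usage: repack_and_measure <path-to-repo>
repack_and_measure() {
    git -C "$1" reflog expire --expire=now --all &&
    git -C "$1" gc --aggressive --prune=now &&
    du -hs "$1"
}
```

For example, `commit_shortfall 313209 301478` prints `11731 missing (3.7%)`, which matches the "about 4%" difference reported above. Running history_fingerprint on the repository before and after repack_and_measure should give the same hash if gc did not drop any reachable history.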
