Re: hg-git and round-tripping (and file copies?)

2017-03-18 Thread Danek Duvall
Sean Farley wrote:

> Danek Duvall writes:
> 
> > Mike Hommey wrote:
> >
> >> If your goal in trying to round-trip between mercurial and git is to
> >> provide developers with the possibility to use mercurial or git as they
> >> like, and somehow make it work with developers pushing on both ends, you
> >> should instead use a single source of truth (mercurial or git, whichever
> >> you prefer keeping a server for), and let developers use conversion tools
> >> on their end. hg-git can be used by developers who prefer mercurial when
> >> the server is git (although it annoyingly adds visible metadata to git
> >> commits in that case). git-cinnabar or git-remote-hg can be used by
> >> developers who prefer git when the server is mercurial. (full
> >> disclosure, I'm the author of git-cinnabar)
> >
> > Our source of truth is mercurial.  But our (Oracle Solaris) external,
> > read-only mirror is being moved from a mercurial repo on java.net to
> > github.
> 
> :-(

My thoughts exactly.

Danek
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: hg-git and round-tripping (and file copies?)

2017-03-18 Thread Sean Farley
Danek Duvall writes:

> Mike Hommey wrote:
>
>> If your goal in trying to round-trip between mercurial and git is to
>> provide developers with the possibility to use mercurial or git as they
>> like, and somehow make it work with developers pushing on both ends, you
>> should instead use a single source of truth (mercurial or git, whichever
>> you prefer keeping a server for), and let developers use conversion tools
>> on their end. hg-git can be used by developers who prefer mercurial when
>> the server is git (although it annoyingly adds visible metadata to git
>> commits in that case). git-cinnabar or git-remote-hg can be used by
>> developers who prefer git when the server is mercurial. (full
>> disclosure, I'm the author of git-cinnabar)
>
> Our source of truth is mercurial.  But our (Oracle Solaris) external,
> read-only mirror is being moved from a mercurial repo on java.net to
> github.

:-(


Re: hg-git and round-tripping (and file copies?)

2017-03-16 Thread Danek Duvall
Mike Hommey wrote:

> If your goal in trying to round-trip between mercurial and git is to
> provide developers with the possibility to use mercurial or git as they
> like, and somehow make it work with developers pushing on both ends, you
> should instead use a single source of truth (mercurial or git, whichever
> you prefer keeping a server for), and let developers use conversion tools
> on their end. hg-git can be used by developers who prefer mercurial when
> the server is git (although it annoyingly adds visible metadata to git
> commits in that case). git-cinnabar or git-remote-hg can be used by
> developers who prefer git when the server is mercurial. (full
> disclosure, I'm the author of git-cinnabar)

Our source of truth is mercurial.  But our (Oracle Solaris) external,
read-only mirror is being moved from a mercurial repo on java.net to
github.  I would like for people who might have had clones of the mercurial
repo to be able to use hg-git to pull from the github repo and get the same
changeset IDs as they had before (and as we have internally), but that
looks like it's not possible.

Danek


Re: hg-git and round-tripping (and file copies?)

2017-03-16 Thread Mike Hommey
On Thu, Mar 16, 2017 at 01:38:18PM -0700, Gregory Szorc wrote:
> On Thu, Mar 16, 2017 at 1:05 PM, Danek Duvall wrote:
> 
> > In trying to convert
> >
> > https://hg.java.net/hg/solaris-userland~gate
> >
> > to a git repo and back, I'm seeing issues at changeset 34, where the hash
> > changes for reasons I can't see.  If I do a diff of the debug log, I see
> > it's due to the manifest:
> >
> > $ diff -u =(hg log -R userland-more --debug -r 34) =(hg log -R
> > userland-more.hgagain --debug -r 34 | grep -v "^phase:")
> > --- /tmp/zshhHyEIb  2017-03-16 11:37:57.601340643 -0700
> > +++ /tmp/zshlyqHbd  2017-03-16 11:37:57.793642372 -0700
> > @@ -1,12 +1,10 @@
> >  no terminfo entry for sitm
> > -changeset:   34:d20b10eba31725ad8954aa6d20374da512f0e636
> > -tag:         build-149
> > +changeset:   34:2ccb817b85926f410df2a6bd23000265805088df
> >  parent:      33:371c8e56136d19872ae7db8d273f9de78c8fa783
> >  parent:      -1:
> > -manifest:    34:e031f26e68549dadb3dfb4705d429c75622a58b4
> > +manifest:    34:5a12a2a1bf3e7c0f7c30d01bd09a2e37185bcfb6
> >  user:        Norm Jacobs
> >  date:        Sun Sep 19 13:50:53 2010 -0700
> > -phase:       public
> >  files:
> > components/Makefile
> > make-rules/prep.mk
> >
> > and if I use debugdata to look at the manifest at changeset 34, I see:
> >
> > $ gdiff -a -u =(hg -R userland-more debugdata -m 34) =(hg -R
> > userland-more.hgagain debugdata -m 34)
> > --- /tmp/zshOdnjza  2017-03-16 11:53:16.971130878 +
> > +++ /tmp/zshzoTzmc  2017-03-16 11:53:17.118194061 +
> > @@ -24,12 +24,12 @@
> >  make-rules/setup.py.mk 302733d738cc7c6cceb63457442f24f931867472
> >  make-rules/shared-macros.mk 03dd5df583b6e39a17ba66fc6ed6205df7f6be49
> >  tools/Makefile cc964766028e3b963b4a321c88815d211415006b
> > -tools/bass-o-matic a618ef38ceda467b9a09680dd8b94debcd303037x
> > +tools/bass-o-matic 349f9611499fddf1a110f9488a84fb110c90b7bfx
> >  tools/build-watch.d f69b9a2b6a265c06268733430bbf3f9aa7d5e160x
> >  tools/build-watch.pl 5e23340c7a84ac555e630a5ccdc28eceda95f4b6x
> >  tools/time.c a0a1f64ff8ac947ce9d045e0448f8ee72f9fd273
> > -tools/userland-fetch 851170bb5cebf2648c53d4909eac26ac2055cdd3x
> > -tools/userland-unpack 0977e35fa356d4cfab889b93613dc75d90d89b6bx
> > +tools/userland-fetch bae023e70db29fd07f6f989aaa858cfaed09238ax
> > +tools/userland-unpack b3800b9db86df38a644a653b3095805b269b6ac6x
> >  transforms/actuators c9d84677229efde5f89b1d985de5cd1b09267b56
> >  transforms/archive-libraries-drop 5b346a0133242f460ff66f6689790da094ce27f6
> >  transforms/comparison-cleanup de1288c586594a171d43a3da5234cb920be408cc
> >
> > Now, those three files were copied in that changeset, but they're not the
> > first to be copied, so it's not that, strictly.  But it is the first
> > changeset in which files were copied without being modified.
> >
> > The index data is off-by-one, if that makes any difference:
> >
> > $ hg -R userland-more debugrevlog -d tools/bass-o-matic
> > # rev p1rev p2rev start   end deltastart base   p1   p2 rawsize totalsize compression heads chainlen
> >     0    -1    -1     0  2175          0    0    0    0    6005      6005           2     1        0
> >     1     0    -1  2175  2228          0    0    0    0    5929     11934           5     1        1
> >
> > $ hg -R userland-more.hgagain debugrevlog -d tools/bass-o-matic
> > # rev p1rev p2rev start   end deltastart base   p1   p2 rawsize totalsize compression heads chainlen
> >     0    -1    -1     0  2174          0    0    0    0    6005      6005           2     1        0
> >     1     0    -1  2174  2227          0    0    0    0    5929     11934           5     1        1
> >
> > Any thoughts on how to further debug this?
> >
> > Or is this just
> >
> > https://bitbucket.org/durin42/hg-git/issues/46

Note that bug is about git->hg conversion where the original repository
is git.

> >
> > and I'm out of luck?
> >
> 
> It is effectively impossible to round-trip between Git and Mercurial when
> file copies are involved. This is because Mercurial's filelog hashes
> include copy metadata and the parent nodes. Git's blob hashes, by contrast,
> are effectively content only. When you convert from Mercurial to Git, it
> will drop copy metadata (because Git doesn't track it explicitly). Then
> when you convert back to Mercurial, the copies have to be detected "just
> right" by hg-git for the hashes to align. Furthermore, the files have to be
> reintroduced in the same order, or the filelog parents may not align and
> the hashes may diverge. If a repo isn't linear, there's a non-zero chance
> of that happening.

hg-git actually "stores" copy/rename in the commit messages, but that's
assuming the commit was done in mercurial and 

Re: hg-git and round-tripping (and file copies?)

2017-03-16 Thread Danek Duvall
Gregory Szorc wrote:

> It is effectively impossible to round-trip between Git and Mercurial when
> file copies are involved. This is because Mercurial's filelog hashes
> include copy metadata and the parent nodes. Git's blob hashes, by contrast,
> are effectively content only. When you convert from Mercurial to Git, it
> will drop copy metadata (because Git doesn't track it explicitly). Then
> when you convert back to Mercurial, the copies have to be detected "just
> right" by hg-git for the hashes to align. Furthermore, the files have to be
> reintroduced in the same order, or the filelog parents may not align and
> the hashes may diverge. If a repo isn't linear, there's a non-zero chance
> of that happening.

Got it.

> It is best to have a single canonical repo and replicate from that.
> Attempting "syncing" from multiple discrete repos will only lead to
> divergence.

Sadly, that's not an option.  We're just going to have to deal with the
breakage, until such time as we can avoid the git intermediary.

Thanks,
Danek


Re: hg-git and round-tripping (and file copies?)

2017-03-16 Thread Gregory Szorc
On Thu, Mar 16, 2017 at 1:05 PM, Danek Duvall wrote:

> In trying to convert
>
> https://hg.java.net/hg/solaris-userland~gate
>
> to a git repo and back, I'm seeing issues at changeset 34, where the hash
> changes for reasons I can't see.  If I do a diff of the debug log, I see
> it's due to the manifest:
>
> $ diff -u =(hg log -R userland-more --debug -r 34) =(hg log -R
> userland-more.hgagain --debug -r 34 | grep -v "^phase:")
> --- /tmp/zshhHyEIb  2017-03-16 11:37:57.601340643 -0700
> +++ /tmp/zshlyqHbd  2017-03-16 11:37:57.793642372 -0700
> @@ -1,12 +1,10 @@
>  no terminfo entry for sitm
> -changeset:   34:d20b10eba31725ad8954aa6d20374da512f0e636
> -tag:         build-149
> +changeset:   34:2ccb817b85926f410df2a6bd23000265805088df
>  parent:      33:371c8e56136d19872ae7db8d273f9de78c8fa783
>  parent:      -1:
> -manifest:    34:e031f26e68549dadb3dfb4705d429c75622a58b4
> +manifest:    34:5a12a2a1bf3e7c0f7c30d01bd09a2e37185bcfb6
>  user:        Norm Jacobs
>  date:        Sun Sep 19 13:50:53 2010 -0700
> -phase:       public
>  files:
> components/Makefile
> make-rules/prep.mk
>
> and if I use debugdata to look at the manifest at changeset 34, I see:
>
> $ gdiff -a -u =(hg -R userland-more debugdata -m 34) =(hg -R
> userland-more.hgagain debugdata -m 34)
> --- /tmp/zshOdnjza  2017-03-16 11:53:16.971130878 +
> +++ /tmp/zshzoTzmc  2017-03-16 11:53:17.118194061 +
> @@ -24,12 +24,12 @@
>  make-rules/setup.py.mk 302733d738cc7c6cceb63457442f24f931867472
>  make-rules/shared-macros.mk 03dd5df583b6e39a17ba66fc6ed6205df7f6be49
>  tools/Makefile cc964766028e3b963b4a321c88815d211415006b
> -tools/bass-o-matic a618ef38ceda467b9a09680dd8b94debcd303037x
> +tools/bass-o-matic 349f9611499fddf1a110f9488a84fb110c90b7bfx
>  tools/build-watch.d f69b9a2b6a265c06268733430bbf3f9aa7d5e160x
>  tools/build-watch.pl 5e23340c7a84ac555e630a5ccdc28eceda95f4b6x
>  tools/time.c a0a1f64ff8ac947ce9d045e0448f8ee72f9fd273
> -tools/userland-fetch 851170bb5cebf2648c53d4909eac26ac2055cdd3x
> -tools/userland-unpack 0977e35fa356d4cfab889b93613dc75d90d89b6bx
> +tools/userland-fetch bae023e70db29fd07f6f989aaa858cfaed09238ax
> +tools/userland-unpack b3800b9db86df38a644a653b3095805b269b6ac6x
>  transforms/actuators c9d84677229efde5f89b1d985de5cd1b09267b56
>  transforms/archive-libraries-drop 5b346a0133242f460ff66f6689790da094ce27f6
>  transforms/comparison-cleanup de1288c586594a171d43a3da5234cb920be408cc
>
> Now, those three files were copied in that changeset, but they're not the
> first to be copied, so it's not that, strictly.  But it is the first
> changeset in which files were copied without being modified.
>
> The index data is off-by-one, if that makes any difference:
>
> $ hg -R userland-more debugrevlog -d tools/bass-o-matic
> # rev p1rev p2rev start   end deltastart base   p1   p2 rawsize totalsize compression heads chainlen
>     0    -1    -1     0  2175          0    0    0    0    6005      6005           2     1        0
>     1     0    -1  2175  2228          0    0    0    0    5929     11934           5     1        1
>
> $ hg -R userland-more.hgagain debugrevlog -d tools/bass-o-matic
> # rev p1rev p2rev start   end deltastart base   p1   p2 rawsize totalsize compression heads chainlen
>     0    -1    -1     0  2174          0    0    0    0    6005      6005           2     1        0
>     1     0    -1  2174  2227          0    0    0    0    5929     11934           5     1        1
>
> Any thoughts on how to further debug this?
>
> Or is this just
>
> https://bitbucket.org/durin42/hg-git/issues/46
>
> and I'm out of luck?
>

It is effectively impossible to round-trip between Git and Mercurial when
file copies are involved. This is because Mercurial's filelog hashes
include copy metadata and the parent nodes. Git's blob hashes, by contrast,
are effectively content only. When you convert from Mercurial to Git, it
will drop copy metadata (because Git doesn't track it explicitly). Then
when you convert back to Mercurial, the copies have to be detected "just
right" by hg-git for the hashes to align. Furthermore, the files have to be
reintroduced in the same order, or the filelog parents may not align and
the hashes may diverge. If a repo isn't linear, there's a non-zero chance
of that happening.
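
To make the difference concrete, here is a minimal sketch of the two
hashing schemes as I understand them, not code lifted from either project.
It assumes a Git blob ID is the SHA-1 of a "blob <size>" header, a NUL
byte, and the content, and that a Mercurial filelog node is the SHA-1 of
the two sorted parent nodes followed by the stored text, where copy
information is stored as a small metadata block in front of the content.
The function names, the copy source path, and the copy revision below are
invented for the example.

```python
import hashlib
from typing import Optional

NULL_ID = b"\0" * 20  # Mercurial's null node: 20 zero bytes


def git_blob_id(data: bytes) -> str:
    """Git blob ID: depends only on the content (and its length)."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def hg_filelog_node(data: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID,
                    copy_source: Optional[bytes] = None,
                    copy_rev: Optional[bytes] = None) -> str:
    """Filelog node: depends on parents, copy metadata, and content."""
    text = data
    if copy_source is not None:
        # Copy metadata is wrapped in \x01\n markers ahead of the content.
        meta = b"copy: %s\ncopyrev: %s\n" % (copy_source, copy_rev)
        text = b"\x01\n" + meta + b"\x01\n" + data
    first, second = sorted([p1, p2])
    return hashlib.sha1(first + second + text).hexdigest()


content = b"#!/bin/sh\necho hello\n"

# Same bytes, three different filelog histories.
plain = hg_filelog_node(content)
copied = hg_filelog_node(content,
                         copy_source=b"tools/original-script",  # hypothetical
                         copy_rev=b"1" * 40)                    # hypothetical
reparented = hg_filelog_node(content,
                             p1=hashlib.sha1(b"another parent").digest())

print("git blob:       ", git_blob_id(content))
print("hg, no copy:    ", plain)
print("hg, copied:     ", copied)
print("hg, other p1:   ", reparented)
```

Git reports one blob ID for these bytes no matter how the file came to
exist, while the three filelog nodes all differ. A conversion back from
Git therefore has to reproduce both the copy metadata and the filelog
parents exactly to arrive at the original node.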

It is best to have a single canonical repo and replicate from that.
Attempting "syncing" from multiple discrete repos will only lead to
divergence.