Hi,

> The point you missed is that especially the "grafted from" links do not 
> include the full URL, just the hg-hash (which is different from git-hashes). 
> And just greping for "grafted from" gives me 425 results (in total -- if you 
> want the log of individual branches, you need to use the `-b` option).
> For a more precise count, you should grep for hexadecimal numbers longer than 
> a few digits inside the commit messages.
I see, thanks for the explanation. 

> I somewhat doubt that any existing hg->git converters automatically 
> translates these hashes, but I'd be very happy if someone finds out 
> otherwise. Changing these manually is definitely not an option.
I might have good news on this one: we are apparently not the only project 
migrating from Mercurial to Git. The OpenJDK project (a free implementation of 
the Java platform) has created Skara, a set of tools for all kinds of tasks 
related to contributing to OpenJDK (https://github.com/openjdk/skara). Some of 
the tools could be really helpful for our issues (see 
https://openjdk.java.net/jeps/357). 

The relevant tool seems to be git-openjdk-import, which is used to import from 
Mercurial to Git. I only had a short glance at the code, but it seems to be 
very generic and does not seem to contain any OpenJDK-specific parts. The 
interesting part is the following paragraph from 
https://openjdk.java.net/jeps/357:

> We've also prototyped new tool, git-translate. This tool uses a file 
> called .hgcommits that is generated by the conversion tools and committed to 
> the Git repositories. This file contains a sequence of lines, each of which 
> contains two hexadecimal hashes: the first is the hash of a Mercurial 
> changeset and the second is the hash of the Git commit resulting from 
> converting that Mercurial changeset. The tool git-translate simply queries 
> the file .hgcommits


I haven't managed to get everything to work out of the box, but I haven't 
tried too hard. It might even be worth opening a thread on the Skara mailing 
list. 
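
To make the idea concrete, here is a minimal sketch of the kind of lookup the 
JEP paragraph describes, assuming .hgcommits really is just one 
"<hg hash> <git hash>" pair per line. The function names and the prefix 
matching for abbreviated hashes are assumptions for illustration, not Skara's 
actual code:

```python
def load_hg_to_git(lines):
    """Parse '<hg hash> <git hash>' lines into a dict (assumed file format)."""
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            hg_hash, git_hash = parts
            mapping[hg_hash] = git_hash
    return mapping


def translate(mapping, hg_ref):
    """Return the Git hash for a full or abbreviated Mercurial hash.

    Returns None when the reference is unknown or ambiguous.
    """
    if hg_ref in mapping:
        return mapping[hg_ref]
    matches = [git for hg, git in mapping.items() if hg.startswith(hg_ref)]
    return matches[0] if len(matches) == 1 else None
```

With the mapping loaded via load_hg_to_git(open(".hgcommits")), a short hash 
taken from a "grafted from" message could be passed to translate() directly; 
ambiguous abbreviations are deliberately not guessed.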

However, even with a translation tool this is still complicated: changing 
hashes or links in a commit message again alters the Git hash of that commit, 
so the translation is then wrong for this particular commit. This could be a 
problem if a commit is referenced by more than one other commit, or if commit 
A references commit B, which in turn references commit C. 
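
One way around this ordering problem, as a rough sketch: rewrite the 
references during the conversion itself, in an order where every referenced 
commit has already been converted, so that each new hash is computed from the 
already-rewritten message. The hash function below is only a stand-in for 
Git's real commit hash, and all names are hypothetical:

```python
import hashlib
import re

# Hexadecimal tokens of 12 to 40 digits are treated as candidate hg hashes.
HEX_REF = re.compile(r"\b[0-9a-f]{12,40}\b")


def fake_git_hash(message, parent_hashes):
    """Stand-in for Git's commit hash: depends on message and parents."""
    data = "\n".join(parent_hashes) + "\n" + message
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


def convert(commits):
    """Convert commits, rewriting hash references in their messages.

    commits: list of (hg_hash, parent_hg_hashes, message) tuples, ordered so
    that parents and referenced commits come before the commits that depend
    on them. Returns (hg_to_git, rewritten_messages).
    """
    hg_to_git = {}
    messages = {}

    def rewrite(match):
        ref = match.group(0)
        hits = [g for h, g in hg_to_git.items() if h.startswith(ref)]
        # Unknown or ambiguous references are left untouched.
        return hits[0][:len(ref)] if len(hits) == 1 else ref

    for hg_hash, parents, message in commits:
        new_message = HEX_REF.sub(rewrite, message)
        parent_git = [hg_to_git[p] for p in parents]
        hg_to_git[hg_hash] = fake_git_hash(new_message, parent_git)
        messages[hg_hash] = new_message
    return hg_to_git, messages
```

This only works if references never point forward to a commit that is 
converted later; such references would simply stay unchanged here.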

>>> I see essentially three options:
>>> 1. Migrate to another mercurial provider
>>> 2. Convert to git, stay at bitbucket
>>> 3. Convert to git, migrate to another provider
>> 1. We could migrate to Tuxfamily and keep mercurial. As you said this would 
>> imply we have to handle pull requests separately which is possible. As you 
>> surly know LLVM does exactly that by using Phabricator. However this would 
>> fix some of the issues above but links to bitbucket would remain a problem. 
>> Another downside of mercurial is that only very few projects are using it 
>> and contributing would be much easier in the case of git.
> 
> I really don't see much difference in usability between hg and git -- both 
> have their advantages and little quirks, IMO. And I don't think that hg was 
> ever the main-hurdle for people contributing to Eigen ...
> 
> If Phabricator allows to import our existing PRs that would of course be a 
> nice option. But I'm really pessimistic about that at the moment, since this 
> also requires to match all users which made the PR or took part in the 
> discussion to the new host (maybe that would be the only argument for staying 
> with bitbucket).

I tried a few things regarding PRs: we can clearly get all Bitbucket PRs using 
its API (e.g. curl 
https://api.bitbucket.org/2.0/repositories/eigen/eigen/pullrequests --request 
GET), but such a Bitbucket PR is basically defined by its source and 
destination repos and doesn't seem to contain any kind of diff. The obvious 
problem is that not only the Eigen repo will be closed (or deleted...) but 
also all of its forks. To really transfer PRs we would have to migrate at 
least some of the forks as well, which is absolutely unrealistic.
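
For reference, a minimal sketch of walking all pages of that endpoint, 
assuming the response is paginated with a "values" list and a "next" URL, as 
the 2.0 API responses appear to be. The helper names are made up:

```python
import json
from urllib.request import urlopen

API = "https://api.bitbucket.org/2.0/repositories/eigen/eigen/pullrequests"


def fetch_json(url):
    """Download one page of the API and decode it as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iter_pull_requests(start_url, fetch=fetch_json):
    """Yield every PR object, following 'next' links until none is left."""
    url = start_url
    while url:
        page = fetch(url)
        yield from page.get("values", [])
        url = page.get("next")
```

As far as I can tell only open PRs are returned by default, so merged and 
declined ones would need an explicit filter, e.g. 
iter_pull_requests(API + "?state=MERGED").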

I've also tried Phabricator and think it's a great tool, but it has major 
downsides: it uses a different kind of workflow based on pure diffs (you can 
literally just copy the result of hg diff or git diff into a web tool), which 
might be hard to adapt to for new users, and it is only free if self-hosted. 
The only real reason I'm mentioning this is that I guess we could get plain 
diffs from the Bitbucket PRs and make them work with Phabricator. I really 
don't want to advertise this solution, but it is at least an option.

I'm really pessimistic on this issue but see basically two options:
1. Try something exotic like the Phabricator workaround sketched above (I'm 
totally unsure about this).
2. Get the diffs from all Bitbucket PRs and archive them separately (on an 
Eigen page for historical purposes only). Handle all open PRs and define a 
migration period during which we don't accept new PRs.
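
Option 2 could be sketched roughly as follows. This assumes each PR object 
carries a diff URL under links.diff.href, which would need to be verified 
(especially for PRs whose fork is already gone); fetch_text and the file 
layout are hypothetical:

```python
from pathlib import Path


def archive_pull_requests(prs, fetch_text, out_dir):
    """Write one .diff file per PR plus a tab-separated index.

    prs:        iterable of PR objects (dicts) as returned by the API
    fetch_text: callable mapping a URL to the downloaded text
    Returns the number of archived PRs.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    index_lines = []
    for pr in prs:
        # Assumed field layout; PRs without a diff link are skipped.
        diff_url = pr.get("links", {}).get("diff", {}).get("href")
        if diff_url is None:
            continue
        name = f"pr-{pr['id']}.diff"
        (out / name).write_text(fetch_text(diff_url), encoding="utf-8")
        index_lines.append(
            f"{pr['id']}\t{pr.get('state', '?')}\t{pr.get('title', '')}"
        )
    (out / "INDEX.tsv").write_text("\n".join(index_lines) + "\n", encoding="utf-8")
    return len(index_lines)
```

The resulting directory is plain text only, so it could be put on a static 
Eigen page without depending on any hosting provider.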

Thanks,
David


> On 24. Aug 2019, at 15:05, Christoph Hertzberg 
> <[email protected]> wrote:
> 
> Hi!
> 
> On 24/08/2019 12.30, David Tellenbach wrote:
>> just some thoughts about some points you've made:
>>> b) Fixing internal links inside commit messages ("grafted from ...", "fixes 
>>> error introduced in commit ...")
>> Maybe I've forgot something crucial but doing something like
>> for branch in $(hg branches | awk '{print $1}'); do
>>     hg update -C  $branch > /dev/null
>>     echo "$branch $(hg log -v | egrep "bitbucket.org" | wc -l)"
>> done
>> gives me
>> Branch                       Links
>> ------                       ------
>> default                      9
>> [...]
> 
> The point you missed is that especially the "grafted from" links do not 
> include the full URL, just the hg-hash (which is different from git-hashes). 
> And just greping for "grafted from" gives me 425 results (in total -- if you 
> want the log of individual branches, you need to use the `-b` option).
> For a more precise count, you should grep for hexadecimal numbers longer than 
> a few digits inside the commit messages.
> 
> I somewhat doubt that any existing hg->git converters automatically 
> translates these hashes, but I'd be very happy if someone finds out 
> otherwise. Changing these manually is definitely not an option.
> 
> Also, if we stayed with mercurial, but used a different provider, we can't 
> modify the history, because that would influence all the hashes (but then 
> only the 9 direct links to "bitbucket.org/..." you found would be broken, 
> which is acceptable, IMO)
> 
> Of course we can just ignore these links (though I think broken links/hashes 
> are even worse than non-existing ones ...)
> 
>> Another point are links inside the codebase that point to bitbucket.
>> Following the same logic as above I use
>> hg grep "bitbucket.org"
>> and get 11 links (all seem to be the same). Again something fixable manually.
> 
> Agreed, this part is easy to fix manually.
> 
>>> c) Fixing external links to the repository. Most notably, any links from 
>>> our bugtracker will eventually fail (even if we stayed with bitbucket, the 
>>> hashes won't match). I doubt that we could set up any automatic forwarding 
>>> for that.
>> This might be by far the most complicated point since a lot (the majority?) 
>> of all issues contain links to commits. If desired I can find a concrete 
>> number but I doubt that it will be very...motivating. I also doubt that 
>> Bitbucket will provide any functionality to redirect links to other Git 
>> providers but I could image that there could be some workaround if we decide 
>> to migrate to Bitbucket Git. Something we should keep in mind before 
>> choosing a new provider.
> 
> If you (or anyone else) are/is really interested, I can try to make a MySQL 
> dump of the underlying database (I'd need to strip the user data). If we have 
> some automatic translation between the hashes, this could even allow us to 
> automatically convert all links.
> Migrating to bitbucket-git will still break all existing links, since the 
> hashes don't match. And as bitbucket is not even planning to provide an 
> automated repository conversion, I would not count on any kind of forwarding 
> mechanism.
> 
> 
>>> Any third-party which relies on our main repository will need to change as 
>>> well (not directly "our" problem, but we need to give a reasonable amount 
>>> of time for everyone to migrate to whatever will be our future official 
>>> repository).
>> It's currently unclear for me what exactly will happen with the hg repo but 
>> I guess it will be archived or something similar. In this case we can link 
>> to the new repo on the README page. I don't have any further ideas regarding 
>> this but also think we should migrate somewhat fast.
> 
> Yes, I think this is unclear for everyone at the moment. The announcement 
> from bitbucket sounds a lot like they will literally delete all 
> hg-repositories in June next year :(
> If it was at least frozen/archived as it is, we would have almost no problems 
> with point c).
> 
> For manual redirection, we can of course open a new git-project which just 
> contains a README.md saying that bitbucket dropped hg-support, and point to 
> where Eigen migrated to.
> 
>>> I see essentially three options:
>>> 1. Migrate to another mercurial provider
>>> 2. Convert to git, stay at bitbucket
>>> 3. Convert to git, migrate to another provider
>> 1. We could migrate to Tuxfamily and keep mercurial. As you said this would 
>> imply we have to handle pull requests separately which is possible. As you 
>> surly know LLVM does exactly that by using Phabricator. However this would 
>> fix some of the issues above but links to bitbucket would remain a problem. 
>> Another downside of mercurial is that only very few projects are using it 
>> and contributing would be much easier in the case of git.
> 
> I really don't see much difference in usability between hg and git -- both 
> have their advantages and little quirks, IMO. And I don't think that hg was 
> ever the main-hurdle for people contributing to Eigen ...
> 
> If Phabricator allows to import our existing PRs that would of course be a 
> nice option. But I'm really pessimistic about that at the moment, since this 
> also requires to match all users which made the PR or took part in the 
> discussion to the new host (maybe that would be the only argument for staying 
> with bitbucket).
> 
> 
>> 2. The only reason I see for this is the one I mentioned above: If there is 
>> (or will be) any support to redirect bitbucket links it will most likely 
>> only work if we stay at bitbucket. Compared with other code hosting services 
>> I find bitbucket (not mercurial) to be really complicated and not intuitive.
> 
> It might be an option, if they allowed to automatically migrate 
> pull-requests. But at the moment, they don't even seem to plan automatic 
> migration of repositories.
> 
>> 3. In an ideal world this would be my absolute preference (not very 
>> surprising). Regarding the choice of a service I want to make the personal 
>> point that I would rather migrate to Gitlab than to Github because it is as 
>> least as good as Github and I think that diversity of tools and providers is 
>> crucial for open source. In the long run we could even think about migrating 
>> issues to Gitlab and installing test runners (this is another story).
> 
> In my ideal world, somebody volunteers to do the work necessary for migration 
> :) -- including the issues I pointed out (doesn't have to be the same person 
> doing everything, of course). Even some proof-of-concept demos what can be 
> automated would be nice!
> 
> I don't have any real preferences between mercurial/git or 
> github/gitlab/bitbucket.
> 
> I totally agree that having automated test runners on pull-requests will be a 
> big plus (for which I'm even willing to sacrifice some of my original points, 
> especially since we may need to anyway).
> 
> Cheers,
> Christoph
> 
> 
>> Thanks,
>> David
>>> On 21. Aug 2019, at 14:53, Christoph Hertzberg 
>>> <[email protected]> wrote:
>>> 
>>> Hello Eigen users and contributers!
>>> 
>>> As some may have noticed, bitbucket/atlassian is "sunsetting" its mercurial 
>>> support:
>>> 
>>> https://bitbucket.org/blog/sunsetting-mercurial-support-in-bitbucket
>>> 
>>> If they stick to their timeline, we will have to migrate until June 1st, 
>>> 2020. That means we still have time, but if we do nothing, things will 
>>> break ...
>>> 
>>> 
>>> Converting the repository itself to git should not be a bigger issue -- and 
>>> if we do this we could as well migrate to a more mainstream provider (i.e., 
>>> github).
>>> 
>>> I think the main problems for migration are:
>>> a) Migrating open pull-requests (for historical reasons, the closed/merged 
>>> ones should probably be archived as well)
>>> b) Fixing internal links inside commit messages ("grafted from ...", "fixes 
>>> error introduced in commit ...")
>>> c) Fixing external links to the repository. Most notably, any links from 
>>> our bugtracker will eventually fail (even if we stayed with bitbucket, the 
>>> hashes won't match). I doubt that we could set up any automatic forwarding 
>>> for that.
>>> d) Any third-party which relies on our main repository will need to change 
>>> as well (not directly "our" problem, but we need to give a reasonable 
>>> amount of time for everyone to migrate to whatever will be our future 
>>> official repository).
>>> 
>>> Smaller issues (relatively easy to fix or not as important):
>>> e) Change links from our wiki (to downloads)
>>> f) Change URLs for automated doxygen generation and for unit-tests
>>> g) Automatic links from the repository to our bugtracker (currently "Bug X" 
>>> automatically links to http://eigen.tuxfamily.org/bz/show_bug.cgi?id=X)
>>> h) Change hashes in bench/perf_monitoring/changesets.txt
>>> 
>>> I probably missed a few things ...
>>> 
>>> 
>>> I see essentially three options:
>>> 1. Migrate to another mercurial provider
>>> 2. Convert to git, stay at bitbucket
>>> 3. Convert to git, migrate to another provider
>>> 
>>> Honestly, I see no good reason for option 2. And the only real reason I see 
>>> for option 1 would be that it safes a lot of hassle with b) and h) -- also 
>>> perhaps it would simplify c) (e.g., we could easily crawl through our 
>>> bugzilla-database and just replace some URLs).
>>> 
>>> 
>>> Any opinions on this? Preferences for how to proceed, or other alternatives?
>>> Does anyone have experience with migrating from hg to git? Or migrating 
>>> between providers? Especially, also dealing with the issues listed above.
>>> Does anyone see issues I forgot?
>>> 
>>> 
>>> Cheers,
>>> Christoph
>>> 
> 
> -- 
> Dr.-Ing. Christoph Hertzberg
> 
> Visiting address of the branch office:
> DFKI GmbH
> Robotics Innovation Center
> Robert-Hooke-Straße 5
> 28359 Bremen, Germany
> 
> Postal address of the main office, Bremen site:
> DFKI GmbH
> Robotics Innovation Center
> Robert-Hooke-Straße 1
> 28359 Bremen, Germany
> 
> Tel.:     +49 421 178 45-4021
> Switchboard: +49 421 178 45-0
> E-Mail:   [email protected]
> 
> Further information: http://www.dfki.de/robotik
>  -------------------------------------------------------------
>  Deutsches Forschungszentrum für Künstliche Intelligenz GmbH
>  Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
> 
>  Management:
>  Prof. Dr. Jana Koehler (Chair)
>  Dr. Walter Olthoff
> 
>  Chairman of the Supervisory Board:
>  Prof. Dr. h.c. Hans A. Aukes
>  Amtsgericht Kaiserslautern, HRB 2313
>  -------------------------------------------------------------
> 
> 
