Re: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories
On Mon, Oct 01 2018, Jose Gisbert wrote:

> Dear members of the Git community,
>
> The enterprise I work for is planning to switch from svn to git.
>
> Before the complete switch to git we have decided to implement a scenario
> where the two SCMs live together, being the svn repository the reference. We
> also want this scenario to be transparent for both SCM users.
>
> I read the articles referenced at the end of the email and I come to the
> following solution.
>
> My proposal consists to import the svn repository to git using git svn and set
> receive.denyCurrentBranch to updateInstead. Then install pre-receive and
> post-receive hooks and set that repository as the central repository for git
> users.
>
> The pre-receive hook does git svn rebase and, if there is an update at the svn
> repository, rejects the push and instructs the user to do git pull. The
> post-receive hook does git svn dcommit to update the state of the svn
> repository, then instructs the user to do git pull too.
>
> Both scripts check the changes pushed are made at master before doing anything
> and exit after performing these tasks. branches.master.rebase is set to merges
> at the user repository to avoid the histories of the central and the user
> repositories diverge after doing git svn dcommit.
>
> However I'm stuck at this point because the pre-receive hook it's not allowed
> to do git svn rebase because update refs are not allowed at the quarantine
> environment. I was sure that I tried this solution with a past version of git
> and it worked, but now I doubt this because the restriction to update refs at
> quarantine environment was delivered at version 2.13, that dates from April
> 2017, if I'm not wrong.
>
> I don't know if this solution could be implemented or is there a better way to
> accomplish this kind of synchronization (I tried Tmate SubGit, but it didn't
> work for me and I don't know if we will be willing to purchase a license).
> Could you help me with this question?
>
> I come here asking for help because I think this is the appropriate place to
> do so. I apologise if this is not the case. Any help is welcome. If anything
> needs to be clarified, please, ask me to do so. I can share with you the
> source code of the hook scripts, if necessary.

A very long time ago I had a similar setup where some clients were using
git-svn. This was for the first attempt to migrate the Wikimedia
repositories away from SVN.

There I had a setup where users could fetch my git-svn clone, which was
hosted on GitHub, and through some magic (I forgot the details) "catch up"
with their local client. I.e. there was some mapping data that wasn't sent
over.

But users would always push to svn, not git. I think if you can live with
that you'd have a much easier time; a setup where you push to git and then
have to carry that push forward to svn is a lot more complex than just
having the clients do that.

GitHub also has an SVN gateway, which has no open source equivalent that I
know of: https://help.github.com/articles/support-for-subversion-clients/

Maybe that's something you'd like to consider, i.e. fully migrate to git
sooner than later, and for any leftover SVN clients have them push to a
private repo on GitHub. Even if you only keep that GitHub repo as a bridge
during the migration and host Git in-house, it'll be a lot easier with git as
a DVCS to continually merge in those changes than pulling the same trick
with a centralized system like SVN.
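For readers following along, the check that both hooks perform ("the changes pushed are made at master") can be sketched as below. This is only an illustration under stated assumptions, not Jose's actual script: the function names pushes_master and configure_central_repo are hypothetical. Note also that git spells the per-branch setting branch.<name>.rebase, so on the user side the key would be branch.master.rebase.

```shell
#!/bin/sh
# Sketch only: helper functions a pre-receive/post-receive hook could share.
# Both hooks receive one line per updated ref on stdin:
#   <old-sha> <new-sha> <ref-name>

# Hypothetical helper: returns 0 if any pushed ref is refs/heads/master,
# 1 otherwise. Reads the hook's stdin.
pushes_master() {
    while read -r old new ref; do
        if [ "$ref" = "refs/heads/master" ]; then
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: apply the server-side setting named in the thread.
# $1 = path to the central (non-bare) repository.
configure_central_repo() {
    git -C "$1" config receive.denyCurrentBranch updateInstead
}
```

A hook would call it as "pushes_master || exit 0" near the top, so that pushes to other branches skip the svn synchronization entirely. On each user's clone, the matching setting would be "git config branch.master.rebase merges".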
RE: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories
> > Dear members of the Git community,
> >
> > The enterprise I work for is planning to switch from svn to git.
> > [...]
>
> A very long time ago I had a similar setup where some clients were using
> git-svn. This was for the first attempt to migrate the Wikimedia
> repositories away from SVN.
>
> There I had a setup where users could fetch my git-svn clone, which was
> hosted on GitHub, and through some magic (I forgot the details) "catch up"
> with their local client. I.e. there was some mapping data that wasn't sent
> over.
>
> But users would always push to svn, not git. I think if you can live with
> that you'd have a much easier time; a setup where you push to git and then
> have to carry that push forward to svn is a lot more complex than just
> having the clients do that.
>
> GitHub also has an SVN gateway, which has no open source equivalent that I
> know of: https://help.github.com/articles/support-for-subversion-clients/
>
> Maybe that's something you'd like to consider, i.e. fully migrate to git
> sooner than later, and for any leftover SVN clients have them push to a
> private repo on GitHub. Even if you only keep that GitHub repo as a bridge
> during the migration and host Git in-house, it'll be a lot easier with git as
> a DVCS to continually merge in those changes than pulling the same trick
> with a centralized system like SVN.

Hi Ævar,

First of all, thank you very much for your quick response.

I don't think making users always commit to svn is necessary. In fact, from my
point of view, updating the svn repository with the changes committed to the
git central repository is easy, because nothing prevents us from running
git svn dcommit in the post-update hook.

What I haven't managed to accomplish is to pull diffs from the svn repository
into the git central repository without manual intervention. I suppose that in
the setup you describe you manually pulled changes from the svn repository
into your git-svn repository at GitHub. If not, it would be very useful for
me if you could remember how you managed to do it automatically.

I guess the GitHub svn bridge (thank you for telling me about it, I didn't
know about its existence) could be the solution if it were not for the fact
that we want to keep our svn repository. Our whole CD infrastructure feeds
from that repository, and we'd like to figure out whether everybody is
comfortable using git, and what the actual value of using it as a team is,
before making the effort of changing everything.

Regards,
Jose
Re: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories
On Mon, Oct 01, 2018 at 08:12:49AM +, Jose Gisbert wrote:

> My proposal consists to import the svn repository to git using git svn and set
> receive.denyCurrentBranch to updateInstead. Then install pre-receive and
> post-receive hooks and set that repository as the central repository for git
> users.
>
> The pre-receive hook does git svn rebase and, if there is an update at the svn
> repository, rejects the push and instructs the user to do git pull. The
> post-receive hook does git svn dcommit to update the state of the svn
> repository, then instructs the user to do git pull too.
> [...]
> However I'm stuck at this point because the pre-receive hook it's not allowed
> to do git svn rebase because update refs are not allowed at the quarantine
> environment. I was sure that I tried this solution with a past version of git
> and it worked, but now I doubt this because the restriction to update refs at
> quarantine environment was delivered at version 2.13, that dates from April
> 2017, if I'm not wrong.

As you noticed, this used to be allowed. But it's dangerous, because if
the movement of the objects out of quarantine fails, then you're left
with a corrupted ref (ditto, anybody looking at the ref after update but
before quarantine ends will see what appears to be a corrupted
repository).

There are two solutions I can think of:

  1. The unsafe thing is to just unset $GIT_QUARANTINE_PATH before
     running "git svn rebase". That will skip the safety check
     completely, enabling the pre-v2.13 behavior. I don't really
     recommend this, but modulo the race and unlikely file-moving
     errors, it would probably generally work.

  2. Store intermediate results from pre-receive not as actual refs, and
     then install the refs as part of the post-receive. I don't think
     there's out of the box support for this, since "git svn rebase" is
     always going to call "git rebase", which is going to try to write
     refs.

     The smoothest thing would be for the refs code to see that
     $GIT_QUARANTINE_PATH is set, write a journal of ref updates into a
     file in that path, and then have the quarantine code try to apply
     those ref updates immediately after moving the objects out of
     quarantine (with the usual lease-locking mechanism).

     That's likely to be pretty tricky to implement, so I'm not even
     going to try to sketch out a patch in this email.

     You might be able to do something similar by hand by using a
     temporary sub-repository. E.g., "clone -s" to a temp repo, do the
     rebase there, and then in the post-receive fetch the refs back.
     That's less efficient, but the boundaries of the operation are very
     easy to understand.

> I come here asking for help because I think this is the appropriate place to
> do so. I apologise if this is not the case. Any help is welcome. If anything
> needs to be clarified, please, ask me to do so. I can share with you the
> source code of the hook scripts, if necessary.

This is definitely the right place. Sorry I don't have a better answer!

-Peff
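Option 1 above amounts to clearing one environment variable for the duration of a single command. A minimal sketch, assuming a POSIX shell hook; the helper name without_quarantine_check is made up for illustration, and the caveats Peff lists (the race and the file-moving errors) still apply in full.

```shell
#!/bin/sh
# Hypothetical helper: run one command with GIT_QUARANTINE_PATH unset.
# The parentheses start a subshell, so the variable is only removed for
# the wrapped command; the rest of the hook still sees it, and every
# other variable git sets for the hook is left alone.
without_quarantine_check() {
    (
        unset GIT_QUARANTINE_PATH
        "$@"
    )
}

# Inside the pre-receive hook this could be used as (assumed usage):
#   without_quarantine_check git svn rebase || exit 1
```

Keeping the unset inside a subshell limits the unsafe window to the one command that needs it, rather than disabling the check for the whole hook.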
Re: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories
On Mon, Oct 1, 2018 at 4:17 PM Jose Gisbert wrote:
>
> > [...]
>
> Hi Ævar,
>
> First of all, thank you very much for your quick response.
>
> I don't think making users always commit to svn is necessary. In fact, from my
> point of view, updating the svn repository with the changes committed to the
> git central repository is easy, because nothing prevents us from running
> git svn dcommit in the post-update hook.
>
> What I haven't managed to accomplish is to pull diffs from the svn repository
> into the git central repository without manual intervention. I suppose that in
> the setup you describe you manually pulled changes from the svn repository
> into your git-svn repository at GitHub. If not, it would be very useful for
> me if you could remember how you managed to do it automatically.
>
> I guess the GitHub svn bridge (thank you for telling me about it, I didn't
> know about its existence) could be the solution if it were not for the fact
> that we want to keep our svn repository. Our whole CD infrastructure feeds
> from that repository, and we'd like to figure out whether everybody is
> comfortable using git, and what the actual value of using it as a team is,
> before making the effort of changing everything.

Makes sense. It's certainly not impossible to have some magic
RE: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories
> As you noticed, this used to be allowed. But it's dangerous, because if
> the movement of the objects out of quarantine fails, then you're left
> with a corrupted ref (ditto, anybody looking at the ref after update but
> before quarantine ends will see what appears to be a corrupted
> repository).
>
> There are two solutions I can think of:
>
>   1. The unsafe thing is to just unset $GIT_QUARANTINE_PATH before
>      running "git svn rebase". That will skip the safety check
>      completely, enabling the pre-v2.13 behavior. I don't really
>      recommend this, but modulo the race and unlikely file-moving
>      errors, it would probably generally work.
>
>   2. Store intermediate results from pre-receive not as actual refs, and
>      then install the refs as part of the post-receive. I don't think
>      there's out of the box support for this, since "git svn rebase" is
>      always going to call "git rebase", which is going to try to write
>      refs.
>
>      The smoothest thing would be for the refs code to see that
>      $GIT_QUARANTINE_PATH is set, write a journal of ref updates into a
>      file in that path, and then have the quarantine code try to apply
>      those ref updates immediately after moving the objects out of
>      quarantine (with the usual lease-locking mechanism).
>
>      That's likely to be pretty tricky to implement, so I'm not even
>      going to try to sketch out a patch in this email.
>
>      You might be able to do something similar by hand by using a
>      temporary sub-repository. E.g., "clone -s" to a temp repo, do the
>      rebase there, and then in the post-receive fetch the refs back.
>      That's less efficient, but the boundaries of the operation are very
>      easy to understand.

I read about the dangers of updating refs in the pre-receive hook in the
git-receive-pack man page and, of course, it's something to take into
account.

But given that this setup will be temporary, and given the technical
complexity of the other solutions you propose (solution #2 seems unfeasible,
and I don't know where to find or how to modify the refs code or the
quarantine code), I'm more inclined to try solution #1 for now.

Moreover, assuming that solution #1 will generally work, and given that:

- I think it would be possible for us to recover from a corrupted repository
  fairly easily. Couldn't we, for instance, reset from a failed push and try
  it again?
- the chances of corrupting the svn repository, our reference here, seem small
  because git svn dcommit is the last operation in the chain and is only
  performed when everything else went ok
- we are a small team and git is not our main VCS, so we can stop pushing to
  git while we fix the repository

I'm more inclined to apply this solution. Maybe I'm being too optimistic
with my assumptions.

Nevertheless, I will investigate the "clone -s" alternative. It seems easy to
implement, and performance is currently not a limitation for us.

> This is definitely the right place. Sorry I don't have a better answer!

Thank you for your detailed response, Jeff. It has been a pleasure to write to
the git community mailing list.

Regards,
Jose
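The second point in the list above rests on ordering: git svn dcommit runs last, and only if every earlier step succeeded. A rough sketch of that ordering in shell follows. The helper name run_steps is hypothetical, and the step list in the usage comment is an assumption about what such a hook might run, not Jose's actual script.

```shell
#!/bin/sh
# Hypothetical helper: run each argument as a shell command, in order,
# and stop at the first one that fails. Later steps are never started
# once a step has failed.
run_steps() {
    for step in "$@"; do
        if ! sh -c "$step"; then
            echo "step failed, stopping: $step" >&2
            return 1
        fi
    done
    return 0
}

# Assumed usage, with dcommit as the final step so svn is only touched
# when everything before it went ok:
#   run_steps 'git svn rebase' 'git svn dcommit' || exit 1
```

With this shape, a failure anywhere before the last step leaves the svn repository untouched, which is the property the risk analysis depends on.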
RE: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories
> Makes sense. It's certainly not impossible to have some magic "push to
> git". I only wanted to point out that it's extra complexity, so if you
> could do away with that aspect of it you'd save yourself some
> complexity. I was going to elaborate a bit on how that can go wrong,
> but I see Jeff sent a mail just now that was better than what I had :)
>
> I'll only add that I think you're somewhat fooling yourself if you
> think you can run Subversion and Git side-by-side and evaluate both on
> their merits, even if you solve the technical aspects of doing that.
> Such a system will always need to cater to the lowest common
> denominator of Subversion's very centralized workflow.
>
> The big advantage you get out of DVCSs is being able to be more
> flexible, and e.g. using hosting sites (in-house or external) like
> GitHub or GitLab which are built around that flexibility. So
> ultimately any decision about switching SCMs needs to be a
> forward-looking management decision for the project, not based on how
> well Git can emulate a SVN-based workflow, which is ultimately not
> what you're interested in if you do make the switch.

I agree with you about the extra complexity of letting users push to git and
carrying the changes to svn through git hooks, Ævar. The reason we decided to
do this is to avoid forcing those who will use git to go through svn for some
actions. We think letting them focus on git will give them a better
understanding of git and all of its mechanisms.

I know the benefits of git well; I've been a git user since 2009. But I agreed
with the development team leaders that this temporary setup will give team
members the opportunity to learn git while, at the same time, we avoid
changing all of our CD infrastructure until we are ready.

Besides that, this will allow both team leaders and members to experience
git's advantages. For instance, developers will be able to use branches to
work on specific features, or commit only the changes that are ready. Though,
as you rightly say, they won't know git's real benefits until we migrate to
git for good and discard svn.

Nevertheless, thank you very much for your advice; I will consider it.

Regards,
Jose
Re: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories
On Tue, Oct 02, 2018 at 10:28:38AM +, Jose Gisbert wrote:

> Moreover, assuming that solution #1 will generally work, and given that:
>
> - I think it would be possible for us to recover from a corrupted repository
>   fairly easily. Couldn't we, for instance, reset from a failed push and try
>   it again?

Yes, I think that would generally allow you to recover (it just may
require some manual fiddling by the admin).

> - the chances of corrupting the svn repository, our reference here, seem small
>   because git svn dcommit is the last operation in the chain and is only
>   performed when everything else went ok
> - we are a small team and git is not our main VCS, so we can stop pushing to
>   git while we fix the repository
>
> I'm more inclined to apply this solution. Maybe I'm being too optimistic
> with my assumptions.

I think your analysis of the risk seems pretty accurate. I make no
promises, of course. :)

-Peff