Re: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories

2018-10-02 Thread Jeff King
On Tue, Oct 02, 2018 at 10:28:38AM +, Jose Gisbert wrote:

> Moreover, assuming that solution #1 will generally work, and given that:
> 
> - I think it would be possible for us to recover from a corrupted repository
>   fairly easily. Couldn't we, for instance, reset from a failed push and try
>   it again?

Yes, I think that would generally allow you to recover (it just may
require some manual fiddling by the admin).

> - the chances of corrupting the svn repository, our reference here, seem small
>   because git svn dcommit is the last operation in the chain and is only
>   performed when everything else went ok
> - we are a small team and git is not our main VCS, so we can stop pushing to
>   git while we fix the repository
> 
> I'm more inclined to apply this solution. Maybe I'm being too optimistic
> with my assumptions.

I think your analysis of the risk seems pretty accurate. I make no
promises, of course. :)

-Peff


RE: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories

2018-10-02 Thread Jose Gisbert
> Makes sense. It's certainly not impossible to have some magic "push to
> git". I only wanted to point out that it's extra complexity, so if you
> could do away with that aspect of it you'd save yourself some
> complexity. I was going to elaborate a bit on how that can go wrong,
> but I see Jeff sent a mail just now that was better than what I had :)
> 
> I'll only add that I think you're somewhat fooling yourself if you
> think you can run Subversion and Git side-by-side and evaluate both on
> their merits, even if you solve the technical aspects of doing that.
> Such a system will always need to cater to the lowest common
> denominator of Subversion's very centralized workflow.
> 
> The big advantage you get out of DVCSs is being able to be more
> flexible, and e.g. using hosting sites (in-house or external) like
> GitHub or GitLab which are built around that flexibility. So
> ultimately any decision about switching SCMs needs to be a
> forward-looking management decision for the project, not based on how
> well Git can emulate a SVN-based workflow, which is ultimately not
> what you're interested in if you do make the switch.

I agree with you about the extra complexity of letting users push to git and
carrying the changes to svn through git hooks, Ævar. The reason we decided to
do this is to avoid forcing those who will try out git to use svn for some
actions. We think letting them focus on using git will give them a better
understanding of git and all of its mechanisms.

I know the benefits of git well; I've been a git user since 2009. But I
agreed with the development team leaders that this temporary setup will give
team members the opportunity to learn git while, at the same time, we avoid
changing all of our CD infrastructure until we are ready. Besides that, this
will allow both team leaders and members to experience git's advantages: for
instance, developers will be able to use branches to work on specific features
or commit only changes that are ready. Though, as you rightly say, they won't
know git's real benefits until we definitively migrate to git and discard svn.

Nevertheless, thank you very much for your advice; I will consider it.

Regards,

Jose

RE: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories

2018-10-02 Thread Jose Gisbert
> As you noticed, this used to be allowed. But it's dangerous, because if
> the movement of the objects out of quarantine fails, then you're left
> with a corrupted ref (ditto, anybody looking at the ref after update but
> before quarantine ends will see what appears to be a corrupted
> repository).
> 
> There are two solutions I can think of:
> 
>   1. The unsafe thing is to just unset $GIT_QUARANTINE_PATH before
>      running "git svn rebase". That will skip the safety check
>      completely, enabling the pre-v2.13 behavior. I don't really
>      recommend this, but modulo the race and unlikely file-moving
>      errors, it would probably generally work.
> 
>   2. Store intermediate results from pre-receive not as actual refs, and
>      then install the refs as part of the post-receive. I don't think
>      there's out of the box support for this, since "git svn rebase" is
>      always going to call "git rebase", which is going to try to write
>      refs.
> 
>      The smoothest thing would be for the refs code to see that
>      $GIT_QUARANTINE_PATH is set, write a journal of ref updates into a
>      file in that path, and then have the quarantine code try to apply
>      those ref updates immediately after moving the objects out of
>      quarantine (with the usual lease-locking mechanism).
> 
>      That's likely to be pretty tricky to implement, so I'm not even
>      going to try to sketch out a patch in this email.
> 
>      You might be able to do something similar by hand by using a
>      temporary sub-repository. E.g., "clone -s" to a temp repo, do the
>      rebase there, and then in the post-receive fetch the refs back.
>      That's less efficient, but the boundaries of the operation are very
>      easy to understand.

I read about the dangers of updating refs in the pre-receive hook in the
git-receive-pack man page and, of course, it's something to take into
account. But, given that this setup will be temporary and given the technical
complexity of the other solutions you propose (solution #2 seems unfeasible
and I don't know where to find or how to modify the refs code or the
quarantine code), I'm more inclined to try solution #1 for now.

Moreover, assuming that solution #1 will generally work, and given that:

- I think it would be possible for us to recover from a corrupted repository
  fairly easily. Couldn't we, for instance, reset from a failed push and try
  it again?
- the chances of corrupting the svn repository, our reference here, seem small
  because git svn dcommit is the last operation in the chain and is only
  performed when everything else went ok
- we are a small team and git is not our main VCS, so we can stop pushing to
  git while we fix the repository

I'm more inclined to apply this solution. Maybe I'm being too optimistic
with my assumptions.

Nevertheless, I will investigate the "clone -s" alternative. It seems easy to
implement and performance is currently not a limitation for us.

> This is definitely the right place. Sorry I don't have a better answer!

Thank you for your detailed response, Jeff. It has been a pleasure to write
to the git community mailing list.

Regards,

Jose

Re: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories

2018-10-01 Thread Ævar Arnfjörð Bjarmason
On Mon, Oct 1, 2018 at 4:17 PM Jose Gisbert wrote:
>
> > > Dear members of the Git community,
> > >
> > > The enterprise I work for is planning to switch from svn to git.
> > >
> > > Before the complete switch to git we have decided to implement a scenario
> > > where the two SCMs live together, with the svn repository as the
> > > reference. We also want this scenario to be transparent for both SCM
> > > users.
> > >
> > > I read the articles referenced at the end of the email and came up with
> > > the following solution.
> > >
> > > My proposal is to import the svn repository to git using git svn and set
> > > receive.denyCurrentBranch to updateInstead. Then install pre-receive and
> > > post-receive hooks and set that repository as the central repository for
> > > git users.
> > >
> > > The pre-receive hook does git svn rebase and, if there is an update at
> > > the svn repository, rejects the push and instructs the user to do git
> > > pull. The post-receive hook does git svn dcommit to update the state of
> > > the svn repository, then instructs the user to do git pull too.
> > >
> > > Both scripts check that the pushed changes are made on master before
> > > doing anything and exit after performing these tasks.
> > > branch.master.rebase is set to merges in the user repository to keep the
> > > histories of the central and the user repositories from diverging after
> > > git svn dcommit.
> > >
> > > However, I'm stuck at this point because the pre-receive hook is not
> > > allowed to run git svn rebase: ref updates are not allowed in the
> > > quarantine environment. I was sure that I had tried this solution with a
> > > past version of git and it worked, but now I doubt it because the
> > > restriction on updating refs in the quarantine environment was
> > > introduced in version 2.13, which dates from April 2017, if I'm not
> > > wrong.
> > >
> > > I don't know whether this solution can be implemented or whether there
> > > is a better way to accomplish this kind of synchronization (I tried
> > > Tmate SubGit, but it didn't work for me and I don't know if we will be
> > > willing to purchase a license). Could you help me with this question?
> > >
> > > I come here asking for help because I think this is the appropriate place
> > > to do so. I apologise if this is not the case. Any help is welcome. If
> > > anything needs to be clarified, please, ask me to do so. I can share with
> > > you the source code of the hook scripts, if necessary.
> >
> > A very long time ago I had a similar setup where some clients were using
> > git-svn. This was for the first attempt to migrate the Wikimedia
> > repositories away from SVN.
> >
> > There I had a setup where users could fetch my git-svn clone, which was
> > hosted on github, and through some magic (I forgot the details) "catch up"
> > with their local client. I.e. there was some mapping data that wasn't sent
> > over.
> >
> > But users would always push to svn, not git. I think if you can live with
> > that you'd have a much easier time, having this setup where you push to git
> > and you then have to carry that push forward to svn is a lot more complex
> > than just having the clients do that.
> >
> > GitHub also has a SVN gateway, that has no open source equivalent that I
> > know of: https://help.github.com/articles/support-for-subversion-clients/
> >
> > Maybe that's something you'd like to consider, i.e. fully migrate to git
> > sooner than later, and for any leftover SVN clients have them push to a
> > private repo on GitHub. Even if you only keep that GitHub repo as a bridge
> > during the migration and host Git in-house it'll be a lot easier with git as
> > a DVCS to continually merge in those changes than pulling the same trick
> > with a centralized system like SVN.
>
> Hi Ævar,
>
> First of all, thank you very much for your prompt response.
>
> I don't think making users always commit to svn is necessary. In fact, from
> my point of view, updating the svn repository with the changes committed to
> the git central repository is easy because nothing prevents running git svn
> dcommit in the post-update hook.
>
> What I haven't managed to accomplish is to pull diffs from the svn
> repository into the git central repository without manual intervention. I
> suppose that in the setup you describe you manually pulled changes from the
> svn repository into your git-svn repository at GitHub. If not, it would be
> very useful to me if you could remember how you managed to do it
> automatically.
>
> I guess the GitHub svn bridge (thank you for telling me about it, I didn't
> know about its existence) could be the solution if it were not for the fact
> that we want to keep our svn repository. Our whole CD infrastructure feeds
> from that repository and we'd like to figure out whether everybody is
> comfortable using git and what the actual value of using it as a team is
> before making the effort of changing everything.

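Makes sense. It's certainly not impossible to have some magic "push to
git". I only wanted to point out that it's extra complexity, so if you
could do away with that aspect of it you'd save yourself some
complexity. I was going to elaborate a bit on how that can go wrong,
but I see Jeff sent a mail just now that was better than what I had :)

I'll only add that I think you're somewhat fooling yourself if you
think you can run Subversion and Git side-by-side and evaluate both on
their merits, even if you solve the technical aspects of doing that.
Such a system will always need to cater to the lowest common
denominator of Subversion's very centralized workflow.

The big advantage you get out of DVCSs is being able to be more
flexible, and e.g. using hosting sites (in-house or external) like
GitHub or GitLab which are built around that flexibility. So
ultimately any decision about switching SCMs needs to be a
forward-looking management decision for the project, not based on how
well Git can emulate a SVN-based workflow, which is ultimately not
what you're interested in if you do make the switch.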

Re: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories

2018-10-01 Thread Jeff King
On Mon, Oct 01, 2018 at 08:12:49AM +, Jose Gisbert wrote:

> My proposal is to import the svn repository to git using git svn and set
> receive.denyCurrentBranch to updateInstead. Then install pre-receive and
> post-receive hooks and set that repository as the central repository for git
> users.
> 
> The pre-receive hook does git svn rebase and, if there is an update at the svn
> repository, rejects the push and instructs the user to do git pull. The
> post-receive hook does git svn dcommit to update the state of the svn
> repository, then instructs the user to do git pull too.
> [...]
> However, I'm stuck at this point because the pre-receive hook is not allowed
> to run git svn rebase: ref updates are not allowed in the quarantine
> environment. I was sure that I had tried this solution with a past version of
> git and it worked, but now I doubt it because the restriction on updating refs
> in the quarantine environment was introduced in version 2.13, which dates from
> April 2017, if I'm not wrong.

As you noticed, this used to be allowed. But it's dangerous, because if
the movement of the objects out of quarantine fails, then you're left
with a corrupted ref (ditto, anybody looking at the ref after update but
before quarantine ends will see what appears to be a corrupted
repository).

There are two solutions I can think of:

  1. The unsafe thing is to just unset $GIT_QUARANTINE_PATH before
     running "git svn rebase". That will skip the safety check
     completely, enabling the pre-v2.13 behavior. I don't really
     recommend this, but modulo the race and unlikely file-moving
     errors, it would probably generally work.

  2. Store intermediate results from pre-receive not as actual refs, and
     then install the refs as part of the post-receive. I don't think
     there's out of the box support for this, since "git svn rebase" is
     always going to call "git rebase", which is going to try to write
     refs.

     The smoothest thing would be for the refs code to see that
     $GIT_QUARANTINE_PATH is set, write a journal of ref updates into a
     file in that path, and then have the quarantine code try to apply
     those ref updates immediately after moving the objects out of
     quarantine (with the usual lease-locking mechanism).

     That's likely to be pretty tricky to implement, so I'm not even
     going to try to sketch out a patch in this email.

     You might be able to do something similar by hand by using a
     temporary sub-repository. E.g., "clone -s" to a temp repo, do the
     rebase there, and then in the post-receive fetch the refs back.
     That's less efficient, but the boundaries of the operation are very
     easy to understand.
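
For illustration only, option 1 could look roughly like the following if the
pre-receive hook were written in Python. This is a minimal sketch, not a
tested hook: the helper names (env_without_quarantine, run_svn_rebase) and
the demo values are made up for the example, and it inherits all the risks
of option 1 described above.

```python
import os
import subprocess


def env_without_quarantine(environ):
    """Return a copy of `environ` with GIT_QUARANTINE_PATH removed.

    Every other variable is passed through unchanged, so only the
    quarantine marker itself is dropped.
    """
    env = dict(environ)
    env.pop("GIT_QUARANTINE_PATH", None)
    return env


def run_svn_rebase(environ=None):
    """Run "git svn rebase" without the quarantine marker (unsafe)."""
    env = env_without_quarantine(os.environ if environ is None else environ)
    # check=False: the caller decides what a non-zero exit status means.
    return subprocess.run(["git", "svn", "rebase"], env=env, check=False).returncode


# Demo with a made-up environment; no git command is executed here.
demo = env_without_quarantine(
    {"GIT_QUARANTINE_PATH": "/tmp/incoming", "PATH": "/usr/bin"}
)
print(sorted(demo))
```

A hook would call run_svn_rebase() and use its return value to decide
whether to accept or reject the push.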

> I come here asking for help because I think this is the appropriate place to
> do so. I apologise if this is not the case. Any help is welcome. If anything
> needs to be clarified, please, ask me to do so. I can share with you the
> source code of the hook scripts, if necessary.

This is definitely the right place. Sorry I don't have a better answer!

-Peff


RE: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories

2018-10-01 Thread Jose Gisbert
> > Dear members of the Git community,
> >
> > The enterprise I work for is planning to switch from svn to git.
> >
> > Before the complete switch to git we have decided to implement a scenario
> > where the two SCMs live together, with the svn repository as the
> > reference. We also want this scenario to be transparent for both SCM
> > users.
> >
> > I read the articles referenced at the end of the email and came up with
> > the following solution.
> >
> > My proposal is to import the svn repository to git using git svn and set
> > receive.denyCurrentBranch to updateInstead. Then install pre-receive and
> > post-receive hooks and set that repository as the central repository for
> > git users.
> >
> > The pre-receive hook does git svn rebase and, if there is an update at
> > the svn repository, rejects the push and instructs the user to do git
> > pull. The post-receive hook does git svn dcommit to update the state of
> > the svn repository, then instructs the user to do git pull too.
> >
> > Both scripts check that the pushed changes are made on master before
> > doing anything and exit after performing these tasks.
> > branch.master.rebase is set to merges in the user repository to keep the
> > histories of the central and the user repositories from diverging after
> > git svn dcommit.
> >
> > However, I'm stuck at this point because the pre-receive hook is not
> > allowed to run git svn rebase: ref updates are not allowed in the
> > quarantine environment. I was sure that I had tried this solution with a
> > past version of git and it worked, but now I doubt it because the
> > restriction on updating refs in the quarantine environment was
> > introduced in version 2.13, which dates from April 2017, if I'm not
> > wrong.
> >
> > I don't know whether this solution can be implemented or whether there
> > is a better way to accomplish this kind of synchronization (I tried
> > Tmate SubGit, but it didn't work for me and I don't know if we will be
> > willing to purchase a license). Could you help me with this question?
> >
> > I come here asking for help because I think this is the appropriate place
> > to do so. I apologise if this is not the case. Any help is welcome. If
> > anything needs to be clarified, please, ask me to do so. I can share with
> > you the source code of the hook scripts, if necessary.
> 
> A very long time ago I had a similar setup where some clients were using
> git-svn. This was for the first attempt to migrate the Wikimedia
> repositories away from SVN.
> 
> There I had a setup where users could fetch my git-svn clone, which was
> hosted on github, and through some magic (I forgot the details) "catch up"
> with their local client. I.e. there was some mapping data that wasn't sent
> over.
> 
> But users would always push to svn, not git. I think if you can live with
> that you'd have a much easier time, having this setup where you push to git
> and you then have to carry that push forward to svn is a lot more complex
> than just having the clients do that.
> 
> GitHub also has a SVN gateway, that has no open source equivalent that I
> know of: https://help.github.com/articles/support-for-subversion-clients/
> 
> Maybe that's something you'd like to consider, i.e. fully migrate to git
> sooner than later, and for any leftover SVN clients have them push to a
> private repo on GitHub. Even if you only keep that GitHub repo as a bridge
> during the migration and host Git in-house it'll be a lot easier with git as
> a DVCS to continually merge in those changes than pulling the same trick
> with a centralized system like SVN.

Hi Ævar,

First of all, thank you very much for your prompt response.

I don't think making users always commit to svn is necessary. In fact, from
my point of view, updating the svn repository with the changes committed to
the git central repository is easy because nothing prevents running git svn
dcommit in the post-update hook.

What I haven't managed to accomplish is to pull diffs from the svn
repository into the git central repository without manual intervention. I
suppose that in the setup you describe you manually pulled changes from the
svn repository into your git-svn repository at GitHub. If not, it would be
very useful to me if you could remember how you managed to do it
automatically.

I guess the GitHub svn bridge (thank you for telling me about it, I didn't
know about its existence) could be the solution if it were not for the fact
that we want to keep our svn repository. Our whole CD infrastructure feeds
from that repository and we'd like to figure out whether everybody is
comfortable using git and what the actual value of using it as a team is
before making the effort of changing everything.

Regards,

Jose

Re: Using git svn rebase at quarantine environment or a feasible alternative to synchronize git and svn repositories

2018-10-01 Thread Ævar Arnfjörð Bjarmason


On Mon, Oct 01 2018, Jose Gisbert wrote:

> Dear members of the Git community,
>
> The enterprise I work for is planning to switch from svn to git.
>
> Before the complete switch to git we have decided to implement a scenario
> where the two SCMs live together, with the svn repository as the
> reference. We also want this scenario to be transparent for both SCM
> users.
>
> I read the articles referenced at the end of the email and came up with
> the following solution.
>
> My proposal is to import the svn repository to git using git svn and set
> receive.denyCurrentBranch to updateInstead. Then install pre-receive and
> post-receive hooks and set that repository as the central repository for
> git users.
>
> The pre-receive hook does git svn rebase and, if there is an update at
> the svn repository, rejects the push and instructs the user to do git
> pull. The post-receive hook does git svn dcommit to update the state of
> the svn repository, then instructs the user to do git pull too.
>
> Both scripts check that the pushed changes are made on master before
> doing anything and exit after performing these tasks.
> branch.master.rebase is set to merges in the user repository to keep the
> histories of the central and the user repositories from diverging after
> git svn dcommit.
>
> However, I'm stuck at this point because the pre-receive hook is not
> allowed to run git svn rebase: ref updates are not allowed in the
> quarantine environment. I was sure that I had tried this solution with a
> past version of git and it worked, but now I doubt it because the
> restriction on updating refs in the quarantine environment was
> introduced in version 2.13, which dates from April 2017, if I'm not
> wrong.
>
> I don't know whether this solution can be implemented or whether there
> is a better way to accomplish this kind of synchronization (I tried
> Tmate SubGit, but it didn't work for me and I don't know if we will be
> willing to purchase a license). Could you help me with this question?
>
> I come here asking for help because I think this is the appropriate place to
> do so. I apologise if this is not the case. Any help is welcome. If anything
> needs to be clarified, please, ask me to do so. I can share with you the
> source code of the hook scripts, if necessary.
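
As an aside, the "only act on pushes to master" check mentioned in the quoted
message could be sketched as below. This is an assumption about what such a
check might look like, not the poster's actual hook code (which was not
shared). It relies only on the documented fact that pre-receive and
post-receive hooks get one "<old-value> <new-value> <ref-name>" line per
updated ref on standard input; the function names and the sample values are
hypothetical.

```python
def pushed_refs(lines):
    """Parse hook input lines of the form '<old> <new> <refname>'."""
    updates = []
    for line in lines:
        parts = line.split()
        if len(parts) == 3:  # skip blank or malformed lines
            old, new, ref = parts
            updates.append((old, new, ref))
    return updates


def touches_master(updates, branch="refs/heads/master"):
    """Return True if any pushed ref is the given branch."""
    return any(ref == branch for _old, _new, ref in updates)


# Demo with made-up object names; a real hook would pass sys.stdin instead.
sample = [
    "0" * 40 + " " + "1" * 40 + " refs/heads/feature\n",
    "2" * 40 + " " + "3" * 40 + " refs/heads/master\n",
]
print(touches_master(pushed_refs(sample)))
```

A hook built on this would exit with status 0 right away when
touches_master() is false, and only otherwise go on to the svn
synchronization step.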

A very long time ago I had a similar setup where some clients were using
git-svn. This was for the first attempt to migrate the Wikimedia
repositories away from SVN.

There I had a setup where users could fetch my git-svn clone, which was
hosted on github, and through some magic (I forgot the details) "catch
up" with their local client. I.e. there was some mapping data that
wasn't sent over.

But users would always push to svn, not git. I think if you can live
with that you'd have a much easier time, having this setup where you
push to git and you then have to carry that push forward to svn is a lot
more complex than just having the clients do that.

GitHub also has a SVN gateway, that has no open source equivalent that I
know of:
https://help.github.com/articles/support-for-subversion-clients/

Maybe that's something you'd like to consider, i.e. fully migrate to git
sooner than later, and for any leftover SVN clients have them push to a
private repo on GitHub. Even if you only keep that GitHub repo as a
bridge during the migration and host Git in-house it'll be a lot easier
with git as a DVCS to continually merge in those changes than pulling
the same trick with a centralized system like SVN.