Re: Transform log message during migration svn -> git (using git-svn)
On 20.06.2017 at 14:32, paul.mat...@s4m.com wrote:
> Well this is a possibility, of course. Our problem is that our SVN
> repository contains about 220.000 revisions currently. As a colleague of
> mine said that the command you suggest might take about 4 seconds per
> revision, it would take about 10 days to do this for our whole repository.
> So of course it could save a lot of time generally if such operation could
> be done immediately during git-svn.

My data point is this: A "git filter-branch" run with ~2000 revisions took
several hours on a Windows 7 box. That number seems to be roughly the same
as your number. A comparable run on a Linux box took only about 10 minutes.

So: If your benchmark was done on Windows, you might want to repeat it on
Linux.
Re: Transform log message during migration svn -> git (using git-svn)
On Tue, Jun 20, 2017 at 02:46:22PM +0200, Lars Schneider wrote:

> > On 20 Jun 2017, at 14:32, paul.mat...@s4m.com wrote:
> >
> > Well this is a possibility, of course. Our problem is that our SVN
> > repository contains about 220.000 revisions currently. As a colleague of
> > mine said that the command you suggest might take about 4 seconds per
> > revision, it would take about 10 days to do this for our whole repository.
> > So of course it could save a lot of time generally if such operation could
> > be done immediately during git-svn.
>
> Your colleague is most likely correct. I suggested it as this is a one time
> operation and therefore still somewhat practical from my point of view.

I didn't follow this whole thread, but I happened to see this bit. I
think the command in question is:

  git filter-branch -f --msg-filter 'perl -lape "s/^T(\d+)/#\$1/"'

I know filter-branch is slow, but a msg-filter should be relatively
fast. I'd be surprised at 4 seconds per revision (the main cost is
kicking off a new perl process per revision). It's more like 120/sec on
my machine.

However, I think the fastest way would be to do it with fast-export,
where you can just tweak the stream as it flows through:

  # set up a new repo to hold the results; we won't bother
  # copying the blobs, so just point at the current repo as an
  # alternate.
  git init fixed-repo
  echo "../../../.git/objects" >fixed-repo/.git/objects/info/alternates

  git fast-export --no-data --all |
  perl -ne '
    # look for "data" chunks which contain the commit message
    if (/^data (\d+)/) {
      read STDIN, my $buf, $1;
      $buf =~ s/^T(\d+)/#$1/;
      print "data ", length($buf), "\n";
      print $buf;
    } else {
      print;
    }
  ' |
  git -C fixed-repo fast-import

That runs at about 3600 commits/sec on my machine. Most of that time
goes to doing a tree diff on each commit. Technically that is not
required for your use case, but I don't think there's a way to get
fast-export to skip that (and it's an inherent part of the fast-import
stream).

It's probably fast enough, but it's possible that a specialized tool
like BFG repo cleaner[1] could do better (I don't know offhand if it
handles commit message rewrites or not).

-Peff

[1] https://rtyley.github.io/bfg-repo-cleaner/
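After a rewrite like the fast-export pipeline above, it may be worth checking that no commit subject still starts with the old prefix. The following is only a sketch: count_unconverted is a hypothetical helper name, and it assumes the result lives in fixed-repo as in the message above. It looks at subjects only, since the substitution is anchored to the start of the message.

```shell
#!/bin/sh
# count_unconverted (hypothetical helper): read one commit subject per line
# on stdin and print how many still start with "T<digit>".
count_unconverted() {
    # grep -c prints 0 but exits non-zero when nothing matches, so mask
    # the exit status to keep the helper usable under "set -e".
    grep -c '^T[0-9]' || true
}

# Against the rewritten repository (not run here):
#   git -C fixed-repo log --all --format=%s | count_unconverted

# Small self-contained demonstration with made-up subjects:
printf '#123456 converted\n123456 legacy id\nT42 missed\n' | count_unconverted
```

A result of 0 for the real repository would suggest every T-prefixed subject was converted; the demonstration line prints 1 because one sample subject was deliberately left unconverted.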
Re: Transform log message during migration svn -> git (using git-svn)
> On 20 Jun 2017, at 14:32, <paul.mat...@s4m.com> wrote:
>
> Well this is a possibility, of course. Our problem is that our SVN
> repository contains about 220.000 revisions currently. As a colleague of
> mine said that the command you suggest might take about 4 seconds per
> revision, it would take about 10 days to do this for our whole repository.
> So of course it could save a lot of time generally if such operation could
> be done immediately during git-svn.

Your colleague is most likely correct. I suggested it as this is a one-time
operation and therefore still somewhat practical from my point of view.

If you don't like the solution then you need to change the git-svn code.
Probably here somewhere (I am not familiar with this code):
https://github.com/git/git/blob/master/git-svn.perl#L1836

- Lars

PS: Please don't top post on this mailing list :-)
https://en.wikipedia.org/wiki/Posting_style#Top-posting
Re: Transform log message during migration svn -> git (using git-svn)
Well, this is a possibility, of course. Our problem is that our SVN
repository currently contains about 220,000 revisions. A colleague of mine
said that the command you suggest might take about 4 seconds per revision,
so it would take about 10 days to do this for our whole repository. So of
course it could save a lot of time if such an operation could be done
directly during git-svn.

Paul Mattke
Software Developer
-
Arvato Systems S4M GmbH
Am Coloneum 3
50829 Köln

Phone: +49 221 28555-443
Fax: +49 221 28555-210
E-Mail: paul.mat...@s4m.com
www.s4m.arvato-systems.com
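The 10-day figure can be checked with a little integer arithmetic, and the same calculation puts the other rates reported in this thread (about 120 and about 3600 commits per second) in perspective. This is a back-of-the-envelope sketch; the function names are made up for illustration, and the rates are machine-dependent measurements, not guarantees.

```shell
#!/bin/sh
# Rough wall-clock estimates for rewriting a number of commits.
# Both helpers (hypothetical names) print whole seconds.

# estimate_from_cost COMMITS SECONDS_PER_COMMIT
estimate_from_cost() {
    echo $(( $1 * $2 ))
}

# estimate_from_rate COMMITS COMMITS_PER_SECOND (rounded up)
estimate_from_rate() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

commits=220000   # revision count mentioned in the thread

echo "4 s per commit:  $(estimate_from_cost "$commits" 4) s"
echo "120 commits/s:   $(estimate_from_rate "$commits" 120) s"
echo "3600 commits/s:  $(estimate_from_rate "$commits" 3600) s"
```

This prints 880000 s (a little over 10 days, at 86400 s per day), 1834 s (about half an hour), and 62 s (about a minute).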
Re: Transform log message during migration svn -> git (using git-svn)
> On 20 Jun 2017, at 09:32, paul.mat...@s4m.com wrote:
>
> Hi there,
>
> this is actually not really a bug report, but much more a feature request
> (if I did not overlook an already existing feature like this):
>
> We want to migrate our SVN repository to GIT and will be using git-svn for
> that of course. Currently in SVN, all our commit log messages start either
> with:
>
> 123456 (a number, representing the Bug Id in our old legacy bug tracker)
>
> or
>
> T123456 (a number, but prefixed with T, referring to a TFS item in this case)
>
> During conversion to GIT, we want to replace the T in such log messages with
> a #, so commits referring to a TFS item will start with #123456 in the future.
> We don't care about log messages which do not start with a T, only the
> TXX messages need to be transformed here.
>
> I guess an operation like this is currently not possible with git-svn, is
> it? So it would be nice if a feature could be implemented that gives the
> user the possibility to specify some kind of script file, for example, which
> transforms the log message in any way we want.

You can migrate your repo from SVN to Git as is. Afterwards you can fix up
the commit messages with the following command:

  git filter-branch -f --msg-filter 'perl -lape "s/^T(\d+)/#\$1/"'

(this might take a while on a large repo)

- Lars
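Before rewriting a whole repository, the substitution can be tried on a few sample messages. The sketch below uses sed instead of perl and is not the command from the message above; rewrite_msg is a hypothetical name. One difference worth noting: perl -p applies the substitution to every input line, so a body line starting with T and digits would also be changed, whereas the sed expression here is restricted to the first line only.

```shell
#!/bin/sh
# rewrite_msg (hypothetical helper): read a commit message on stdin and
# write it to stdout with a leading "T<digits>" on the FIRST line changed
# to "#<digits>". All other lines pass through untouched.
rewrite_msg() {
    sed -e '1s/^T\([0-9][0-9]*\)/#\1/'
}

# Dry run on made-up sample messages:
printf 'T123456 Fix login\n\nT999 mentioned in body\n' | rewrite_msg
printf '123456 Legacy bug id\n' | rewrite_msg
printf 'Tidy up whitespace\n' | rewrite_msg

# The same expression could serve as a message filter (not run here):
#   git filter-branch -f --msg-filter 'sed -e "1s/^T\([0-9][0-9]*\)/#\1/"'
```

Only the first sample's subject changes (to "#123456 Fix login"); the legacy numeric id, the word starting with T, and the body line are left alone. This still starts one process per commit, so it should not be expected to run much faster than the perl filter.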
Transform log message during migration svn -> git (using git-svn)
Hi there,

this is actually not really a bug report, but much more a feature request
(if I did not overlook an already existing feature like this):

We want to migrate our SVN repository to GIT and will be using git-svn for
that of course. Currently in SVN, all our commit log messages start either
with:

123456 (a number, representing the Bug Id in our old legacy bug tracker)

or

T123456 (a number, but prefixed with T, referring to a TFS item in this case)

During conversion to GIT, we want to replace the T in such log messages with
a #, so commits referring to a TFS item will start with #123456 in the
future. We don't care about log messages which do not start with a T, only
the TXX messages need to be transformed here.

I guess an operation like this is currently not possible with git-svn, is
it? So it would be nice if a feature could be implemented that gives the
user the possibility to specify some kind of script file, for example, which
transforms the log message in any way we want.

Paul Mattke
Software Developer
Arvato Systems S4M GmbH