Hi, [+CC: Daniel, for making me notice this email in the first place]

Eric Raymond writes:
> My interest in tools for repository surgery has continued, and I recently
> spotted an opportunity in the increasing use of git-fast-import streams
> as a history-interchange format. I have written what I believe is the first
> *native* application for fast-import streams, a repository editor I
> call reposurgeon.

Cool tool!

> Which brings me to my feature request. Please add native support for
> fast-export and fast-import to svndump. This would be a good idea
> in general, but my specific reason for wanting it is to enable
> reposurgeon to edit Subversion repositories.
>
> The export side is, of course, almost trivial. Proof of concept under
> MIT license is here: <http://c133.org/code/svn-fast-export.c>. It
> needs a bit of extension work around tags and branches; I won't
> belabor the obvious (and easily solvable) issues with those. There are
> two more substantive ones:

With svnrdump (merged into Subversion trunk in subversion/svnrdump)
and svn-fe (merged into git.git in contrib/svn-fe + vcs-svn/), it's
possible to produce a fast-import stream from a remote repository
without the need for any local mirroring. Unfortunately, svnrdump can
only produce a deltified dumpfile v3, and the patch series that adds
dumpfile v3 support to svn-fe hasn't been merged into git.git yet. You
can pick up the branch `dumpfilev3` from David's repository
<http://github.com/barrbrain/git> though.

> 1) Whatever merge-tracking hair you represent internally should be dumped
> as 'merge' commit properties.
>
> 2) User commit properties (e.g. those not in the svn: namespace)
> should be exported using the bzr properties extension, which
> reposurgeon handles now and which seems likely to make it into git core at
> some point. Syntax:
>
> property <space> NAME <space> VALUE-LENGTH <space> VALUE LF
>
> or, if the value is empty:
>
> property <space> NAME LF
>
> NAME and VALUE are utf8-encoded. The properties for each commit are sorted
> by the property name.
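
To make sure I'm reading the quoted syntax correctly, here is a minimal
Python sketch of how an exporter might emit those property lines. The
function names are made up for illustration and are not taken from
reposurgeon, bzr, or svn-fe. I'm also assuming that VALUE-LENGTH is the
decimal byte count of the UTF-8 encoded value, which the description
above doesn't state explicitly.

```python
def format_property(name: str, value: str = "") -> bytes:
    """Serialize one commit property line (hypothetical helper)."""
    name_b = name.encode("utf-8")
    value_b = value.encode("utf-8")
    if not value_b:
        # Empty value: "property <space> NAME LF"
        return b"property " + name_b + b"\n"
    # Assumption: VALUE-LENGTH counts bytes of the UTF-8 value, in decimal.
    length_b = str(len(value_b)).encode("ascii")
    # "property <space> NAME <space> VALUE-LENGTH <space> VALUE LF"
    return b"property " + name_b + b" " + length_b + b" " + value_b + b"\n"


def format_properties(props: dict) -> bytes:
    """Serialize all properties of one commit, sorted by property name."""
    return b"".join(format_property(k, props[k]) for k in sorted(props))
```

For example, format_properties({"b": "", "a": "xy"}) would give the two
lines "property a 2 xy" and "property b", in that order.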
>
> Also note that an import stream actually containing commit-property
> declarations
> should have a line reading "feature commit-properties" before the first
> commit.

Actually, the objective of svn-fe is to produce a conformant
fast-import stream (so Git can import it into its object store): some
information is lost in the process. Does reposurgeon require all the
information, or can it operate on the stream that svn-fe produces?

That brings us to another point: a fast-import stream is probably not
the most faithful representation of a Subversion repository, and I
think a dumpfile v3 fits this bill. Subversion already supports native
export/import of this format: svnadmin (dump|load) when mirrored
locally and svnrdump (dump|load) when it's not :)

> The import side is less trivial, but given that you've already got internal
> representations for merge-tracking it shouldn't be too difficult either.
> <http://www.kernel.org/pub/software/scm/git/docs/git-fast-import.html>.

svn-fe already supports converting a dumpfile v3 to a fast-import
stream. Getting it to do the reverse shouldn't be too hard; we are
already working on it :)

> Finally, I will note that I think this feature could be significant
> for Subversion's competitive posture. Because exporters are easy while
> importers are more difficult, supporting import streams only with
> exporters and only through sketchy third-party tools tends to
> encourage migration to git while discouraging migration away from it.

Are you happy with having a combination of svnrdump and svn-fe for
this, or do you think Subversion should natively support fast-import?
I don't think it'll be very difficult to support natively, but it's
kind of a hack because Subversion already has so much infrastructure
to deal with dumpfile v3.

p.s- I'm one of the students who did a GSoC project with Git this summer.
If you recall, you even commented on the proposal I posted to the Git
list :) svnrdump and svn-fe are the products of that same project.

-- Ram