Some months back I contributed svncutter to Subversion. This was a tool for doing surgery on dumpfiles, intended to remove artifacts associated with conversions from older VCSes.
My interest in tools for repository surgery has continued, and I recently spotted an opportunity in the increasing use of git-fast-import streams as a history-interchange format. I have written what I believe is the first *native* application for fast-import streams, a repository editor I call reposurgeon.

You can read the announcement here:

http://esr.ibiblio.org/?p=2718

Project resource page with tarballs:

http://www.catb.org/~esr/reposurgeon/

Freshmeat page:

http://freshmeat.net/projects/reposurgeon

HTML manual:

http://www.catb.org/~esr/reposurgeon/reposurgeon.html

Perhaps the most interesting thing about reposurgeon is that, by design, it knows almost nothing about any individual VCS. All it counts on is the ability to get a fast-import dump from a repo and then the ability to create a repo from the dump after the contents of the import stream have been modified.

If you haven't heard about this before, it's because the project is in alpha and only two weeks old. Nevertheless, it is already good enough for production use on git repositories. Operations supported include editing of commit and tag metadata, deletion of commits, expunges of file history, coalescing single-file commit cliques with identical comments, and topological cut. The code is backed by an extensive regression-test suite and fully documented.

I also have working support for bzr and hg, though the practical utility of same is presently limited by unstable and poorly-supported export/import tools. I'm working with a bzr dev to address this problem; better solutions should be forthcoming within weeks, if not days.

Which brings me to my feature request. Please add native support for fast-export and fast-import to svndump. This would be a good idea in general, but my specific reason for wanting it is to enable reposurgeon to edit Subversion repositories.

The export side is, of course, almost trivial. Proof of concept under MIT license is here: <http://c133.org/code/svn-fast-export.c>.
It needs a bit of extension work around tags and branches; I won't belabor the obvious (and easily solvable) issues with those. There are two more substantive ones:

1) Whatever merge-tracking hair you represent internally should be dumped as 'merge' commit properties.

2) User commit properties (i.e. those not in the svn: namespace) should be exported using the bzr properties extension, which reposurgeon handles now and which seems likely to make it into git core at some point. Syntax:

    property <space> NAME <space> VALUE-LENGTH <space> VALUE LF

or, if the value is empty:

    property <space> NAME LF

NAME and VALUE are utf8-encoded. The properties for each commit are sorted by the property name. Also note that an import stream actually containing commit-property declarations should have a line reading "feature commit-properties" before the first commit.

The import side is less trivial, but given that you've already got internal representations for merge-tracking it shouldn't be too difficult either.

I'd offer to do this, but I'm deliberately staying away from writing export/import code myself, other than the implementations inside reposurgeon. It will be better, long-term, if my reposurgeon assumptions don't leak into other implementations; they ought to be engineered from the fast-import stream documentation. See the definitive web page at: <http://www.kernel.org/pub/software/scm/git/docs/git-fast-import.html>.

Finally, I will note that I think this feature could be significant for Subversion's competitive posture. Because exporters are easy while importers are more difficult, supporting import streams only with exporters and only through sketchy third-party tools tends to encourage migration to git while discouraging migration away from it.

Other VCSes, with bzr taking point, are positioning themselves as destinations rather than places to leave by mainlining importers. As a friend of Subversion, I strongly recommend that it should do likewise.
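To make the commit-property syntax described above concrete, here is a minimal Python sketch of an emitter for one commit's properties. The function name format_properties and the dict input are hypothetical, chosen only for illustration, and the code is not taken from reposurgeon or any existing exporter. It also assumes that VALUE-LENGTH counts the bytes of the utf8-encoded value, which the syntax summary does not state explicitly.

```python
# Line an exporter should emit once, before the first commit, when the
# stream contains commit-property declarations.
FEATURE_LINE = b"feature commit-properties\n"


def format_properties(props):
    """Serialize a dict of {name: value} strings as 'property' lines.

    Assumption: VALUE-LENGTH is the byte length of the UTF-8 value.
    """
    lines = []
    # Properties for each commit are sorted by property name.
    for name in sorted(props):
        key = name.encode("utf-8")
        value = props[name].encode("utf-8")
        if value:
            length = str(len(value)).encode("ascii")
            lines.append(b"property " + key + b" " + length + b" " + value + b"\n")
        else:
            # Empty value: the short form carries the name only.
            lines.append(b"property " + key + b"\n")
    return b"".join(lines)
```

For example, format_properties({"reviewer": "alice", "blocked": ""}) produces the line "property blocked" followed by the line "property reviewer 5 alice", in that sorted order.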
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects.
    -- Robert A. Heinlein, "Time Enough for Love"

