Re: Acceptance criteria for the git conversion

2015-09-03 Thread Jason Merrill

On 09/03/2015 02:59 AM, Trevor Saunders wrote:
> On Tue, Sep 01, 2015 at 06:06:33PM +0200, Andreas Schwab wrote:
> > "Eric S. Raymond"  writes:
> >
> > > There is no way to maintain those links for git, so yes, you want to
> > > keep a read-only Subversion instance around.
> >
> > The mapping can also be put in some git notes tree for use by bugzilla.
> > That would only need to be set up once.
>
> I'd think that would be the way you'd want to associate git commits with
> a svn commit, but I think bugzilla wants to do the reverse map svn
> commits to git ones (or we could just rewrite the link targets) but
> either way that needs a mapping in the other direction.  Obviously
> having a mapping in one direction makes getting the reverse pretty
> trivial.


It's pretty trivial to map from SVN rev numbers to git with either 
git-svn or reposurgeon --legacy commit decorations.


git log --grep '^git-svn-id:.*@1234 ' --all -1
git log --grep '^Legacy-ID: 1234$' --all -1
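For scripted lookups (say, from a bugzilla hook) the same match can be run
over commit messages directly.  A minimal Python sketch, assuming the
messages have already been read out of the repository (for example with
git log --all).  The function names are hypothetical, and the Legacy-ID
trailer format is taken from the grep pattern above, not from reposurgeon's
documentation:

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Trailer written by git-svn, e.g.
#   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@226697 138bc75d-...
GIT_SVN_ID = re.compile(r"^git-svn-id: \S+@(\d+) \S+\s*$", re.MULTILINE)
# Trailer format assumed from the grep pattern quoted in this thread.
LEGACY_ID = re.compile(r"^Legacy-ID: (\d+)\s*$", re.MULTILINE)


def svn_revision(message: str) -> Optional[int]:
    """Return the SVN revision recorded in a commit message, if any."""
    for pattern in (GIT_SVN_ID, LEGACY_ID):
        match = pattern.search(message)
        if match:
            return int(match.group(1))
    return None


def build_rev_map(commits: Iterable[Tuple[str, str]]) -> Dict[int, str]:
    """Map SVN revision -> commit id from (commit id, message) pairs.

    If a revision shows up more than once, the first commit seen wins.
    """
    rev_map: Dict[int, str] = {}
    for sha, message in commits:
        rev = svn_revision(message)
        if rev is not None:
            rev_map.setdefault(rev, sha)
    return rev_map
```

Inverting the resulting dictionary gives the git-to-SVN direction.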

Jason



Re: Acceptance criteria for the git conversion

2015-09-03 Thread shmeel gutl

On 01-Sep-15 01:54 PM, Eric S. Raymond wrote:

> What kind of mechanical transformation or hand-editing would add value for you?

I am working from a clone of the current git repository. Is there an 
automated procedure that will enable me to switch to the new repository 
and still keep all of the commit history of my local branches?




Re: Acceptance criteria for the git conversion

2015-09-03 Thread Andreas Schwab
"Eric S. Raymond"  writes:

> (I'm pretty sure there's also a way to do this using the obscure "git bundle"
> feature, but I've never learned it in detail.)

git bundle cannot help here because it cannot rewrite commits.  It's
just an implementation of the git protocol over some unspecified file
transfer.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Acceptance criteria for the git conversion

2015-09-03 Thread Trevor Saunders
On Tue, Sep 01, 2015 at 06:06:33PM +0200, Andreas Schwab wrote:
> "Eric S. Raymond"  writes:
> 
> > There is no way to maintain those links for git, so yes, you want to
> > keep a read-only Subversion instance around.
> 
> The mapping can also be put in some git notes tree for use by bugzilla.
> That would only need to be set up once.

I'd think that would be the way you'd want to associate git commits with
a svn commit, but I think bugzilla wants to do the reverse map svn
commits to git ones (or we could just rewrite the link targets) but
either way that needs a mapping in the other direction.  Obviously
having a mapping in one direction makes getting the reverse pretty
trivial.

Trev

> 
> Andreas.
> 
> -- 
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."


Re: Acceptance criteria for the git conversion

2015-09-03 Thread Eric S. Raymond
shmeel gutl :
> On 01-Sep-15 01:54 PM, Eric S. Raymond wrote:
> > What kind of mechanical transformation or hand-editing would add value
> > for you?
> I am working from a clone of the current git repository. Is there an
> automated procedure that will enable me to switch to the new repository and
> still keep all of the commit history of my local branches?

Fully automated, no.  The closest possible approach to that would be a script
that took your branch point locations and branch names as arguments.

In that script, you should be able to turn your local branches into
patch sequences using git-format-patch, call git branch to create new
local branches in a clone of the conversion, and then apply the
sequences.
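
A minimal Python sketch of such a script, under stated assumptions: the
function names are hypothetical, the caller supplies the old branch point
and the equivalent commit in the converted history, and the commands are
only built as argument lists until run_all is called.  Note that
git format-patch does not emit merge commits, so a branch containing merges
comes out linearized.

```python
import subprocess
from typing import List, Sequence


def export_command(branch: str, old_branch_point: str, patch_dir: str) -> List[str]:
    """Command for the OLD clone: write one patch file per branch commit."""
    return ["git", "format-patch", "-o", patch_dir,
            f"{old_branch_point}..{branch}"]


def import_commands(branch: str, new_base: str,
                    patch_files: Sequence[str]) -> List[List[str]]:
    """Commands for a clone of the NEW repository.

    new_base is the commit in the converted history that corresponds to the
    old branch point; finding it is left to the caller.
    """
    return [
        ["git", "branch", branch, new_base],
        ["git", "checkout", branch],
        # format-patch numbers its files (0001-..., 0002-...), so sorting
        # the names restores commit order.
        ["git", "am", "--3way", *sorted(patch_files)],
    ]


def run_all(commands: Sequence[Sequence[str]], cwd: str) -> None:
    """Run each command in cwd, stopping at the first failure."""
    for argv in commands:
        subprocess.run(list(argv), cwd=cwd, check=True)
```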

(I'm pretty sure there's also a way to do this using the obscure "git bundle"
feature, but I've never learned it in detail.)
-- 
Eric S. Raymond <http://www.catb.org/~esr/>


Re: Acceptance criteria for the git conversion

2015-09-03 Thread Andreas Schwab
shmeel gutl  writes:

> I am working from a clone of the current git repository. Is there an
> automated procedure that will enable me to switch to the new repository
> and still keep all of the commit history of my local branches?

The easiest way to do that is to fetch the new repository into your
local repository, find the equivalent commit in the rewritten history
that corresponds to the upstream commit from the svn mirror, and do a
merge -s ours on that commit.  Then switch the upstream to the new
repository.
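
A minimal Python sketch of that sequence, built as argument lists so it can
be inspected before anything runs.  The remote name "converted", the branch
name "master" and the function name are assumptions, and finding the
equivalent commit is left to the caller.  The --allow-unrelated-histories
flag is needed by Git 2.9 and later when the two histories share no
commits; older versions do not know the option, so drop it there.

```python
from typing import List


def merge_ours_commands(new_url: str, equivalent_commit: str,
                        remote: str = "converted",
                        new_branch: str = "master") -> List[List[str]]:
    """Commands to run in the existing clone, on the local branch to keep.

    equivalent_commit is the commit in the converted history whose tree
    matches the old upstream commit the local branch was based on.
    """
    return [
        ["git", "remote", "add", remote, new_url],
        ["git", "fetch", remote],
        # "-s ours" records the new history as a parent but keeps the
        # current branch's tree unchanged.
        ["git", "merge", "-s", "ours", "--allow-unrelated-histories",
         "-m", f"Join converted history at {equivalent_commit}",
         equivalent_commit],
        # Track the converted repository from now on.
        ["git", "branch", f"--set-upstream-to={remote}/{new_branch}"],
    ]
```

The merge applies to whichever branch is checked out, so the merge step has
to be repeated for each local branch that should follow the new history.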

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Acceptance criteria for the git conversion

2015-09-03 Thread Joseph Myers
On Thu, 3 Sep 2015, shmeel gutl wrote:

> I am working from a clone of the current git repository. Is there an automated
> procedure that will enable me to switch to the new repository and still keep
> all of the commit history of my local branches?

Well, the "git fetch" command I proposed in 
 could no doubt be 
accompanied by commands people can run in their clones to adapt them for 
the renaming of refs - that sort of conversion instructions is one of the 
things we need anyway before making the switch.  Given then that you've 
fetched the newly converted history into your clone (which in fact doesn't 
depend on having both sets of objects in the one gcc.gnu.org repository), 
you can do a "git merge -s ours" from the relevant revision of the new 
history (and then do future merges from the new master), or do something 
with "git rebase --onto" to create new branches based on the new history 
with the commits from your old branches.
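
The "git rebase --onto" variant can be sketched the same way.  This is a
minimal Python sketch, not a tested migration tool: the function name and
the backup/ prefix are hypothetical, and each branch needs its old branch
point and the matching commit in the new history supplied by hand.

```python
from typing import Dict, List, Tuple


def rebase_onto_commands(
        branches: Dict[str, Tuple[str, str]]) -> List[List[str]]:
    """Build commands that replay local branches onto the new history.

    branches maps a branch name to (old_branch_point, new_base).  A backup
    ref is created first because rebase rewrites the branch in place.
    """
    commands: List[List[str]] = []
    for branch, (old_branch_point, new_base) in branches.items():
        commands.append(["git", "branch", f"backup/{branch}", branch])
        # Replay old_branch_point..branch on top of new_base.
        commands.append(["git", "rebase", "--onto", new_base,
                         old_branch_point, branch])
    return commands
```

Unlike the "merge -s ours" route, this leaves no link to the old commits in
the rebased branch, so the backup refs are the only way back.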

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Acceptance criteria for the git conversion

2015-09-02 Thread Joseph Myers
On Tue, 1 Sep 2015, David Malcolm wrote:

> Caution: this script performs numerous URL GETs on gcc.gnu.org;
> it caches everything, but the first time you run it, the cache
> will be cold.  (So please be careful!)

It may be better to rsync the whole archive 
(gcc.gnu.org::gcc-patches-ml-archive) for doing anything like this.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Joseph Myers
On Tue, 1 Sep 2015, Mikhail Maltsev wrote:

> Actually, I did not propose to alter the repository history. I just 
> meant to say that if .c -> .cc renaming is still planned, it could be 
> done right after conversion, as a normal commit, or, perhaps series of 
> commits on trunk and active development (feature) branches.

If such a change were desired, "right after conversion" seems like a bad 
time for it - we should allow time for people to become familiar with 
working with the new repository, and to iron out any issues with hooks 
etc. that weren't found before the conversion went live, before doing any 
such major rearrangements of files.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Joseph Myers
On Tue, 1 Sep 2015, Eric S. Raymond wrote:

> Joseph Myers :
> > Indeed.  Ideally the tree objects in the git conversion should have 
> > exactly the same contents as SVN commits, and so be shared with the 
> > git-svn history to reduce the eventual repository size (except where there 
> > are defects in the git-svn history, or the git conversion fixes up cvs2svn 
> > artifacts and so some old revisions end up more accurately reflecting old 
> > history than the SVN repository does).
> 
> I don't think sharing with the git-svn history will be possible.  git-svn
> is a terrible whole-history converter; the odds of getting the same
> topology out of reposurgeon are basically nil, and the problem of matching
> different topologies is quite hard.

I'm not proposing sharing topology (commit objects).  Only blob and tree 
objects.  If two files have the same hash they will share the same blob 
object, and if two trees have files with the same hashes at the same paths 
then the tree objects will also have the same hash, and will be shared.  
Now, git-svn may well have made mistakes meaning some trees in the git-svn 
repository do not accurately correspond to any SVN revision of any branch 
(and so the objects aren't shared), but I'd expect most to be shared (even 
without disabling smart ignore handling, lots of tree objects for 
subdirectories would be shared, if those subdirectories don't have any 
ignore files or svn:ignore properties).
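
To make the sharing argument concrete, here is a minimal Python sketch of
how git derives blob and tree ids, assuming the default SHA-1 object format
and handling only a flat directory of regular non-executable files (real
trees also hold subdirectories, executables and symlinks).  The function
names are hypothetical.

```python
import hashlib
from typing import Dict


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <length>' + NUL + body, as git stores objects."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def git_blob_id(data: bytes) -> str:
    """Id of a file's contents; the file name plays no part."""
    return git_object_id("blob", data)


def git_tree_id(files: Dict[str, bytes]) -> str:
    """Id of a flat tree of regular files (mode 100644)."""
    body = b""
    # Entries are ordered by name bytes; each is 'mode name' + NUL + raw id.
    for name in sorted(files, key=lambda n: n.encode("utf-8")):
        raw_id = bytes.fromhex(git_blob_id(files[name]))
        body += b"100644 " + name.encode("utf-8") + b"\0" + raw_id
    return git_object_id("tree", body)
```

Two conversions that produce the same file contents at the same paths get
the same tree id no matter what the commits around them look like, while
renaming a file changes the tree id but leaves the blob id alone.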

The point is that since the git-svn repository has been in use for years, 
and there are many git-only branches there with lots of development on 
them, there are also many git commit references in list archives etc. 
which need to remain meaningful.  While it would be possible to move the 
existing repository to a different URI (or put the new repository at a 
less-obvious URI), it seems simpler to put both sets of objects (with many 
objects in common) in the same repository (with appropriately renamed refs 
from the git-svn repository so that the objects aren't garbage-collected).

This isn't something for reposurgeon to do.  It's something that should be 
easy to do at the pure git level.  At a minimum, I think it might be just 
one command to add the git-svn objects to a repository converted with 
reposurgeon.  Untested, but should give an idea of what I'm thinking of:

git fetch git://gcc.gnu.org/git/gcc.git \
'refs/heads/*:refs/heads/git-old/*' \
'refs/remotes/*:refs/heads/git-svn-old/*' \
'refs/tags/*:refs/tags/git-old/*'

(OK, you want to git gc afterwards to repack the whole repository.)
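
For anyone checking where a given ref would land, the wildcard mapping a
refspec performs can be modelled in a few lines.  A minimal Python sketch
covering only the simple "src:dst" form with at most one "*" per side; the
function name is hypothetical and this is not git's own implementation.

```python
from typing import Optional


def apply_refspec(ref: str, refspec: str) -> Optional[str]:
    """Return the destination ref for ref, or None if the source side
    of the refspec does not match it."""
    src, dst = refspec.lstrip("+").split(":", 1)
    if "*" not in src:
        return dst if ref == src else None
    prefix, suffix = src.split("*", 1)
    if len(ref) < len(prefix) + len(suffix):
        return None
    if not (ref.startswith(prefix) and ref.endswith(suffix)):
        return None
    # Whatever the "*" matched on the source side is substituted for the
    # "*" on the destination side.
    matched = ref[len(prefix):len(ref) - len(suffix)]
    return dst.replace("*", matched, 1)
```

For example, refs/heads/master under 'refs/heads/*:refs/heads/git-old/*'
maps to refs/heads/git-old/master.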

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Joseph Myers
On Tue, 1 Sep 2015, Richard Earnshaw wrote:

> Renaming the files during the conversion is clearly *not* the right
> thing to do: it would break all builds of old code.

Indeed.  Ideally the tree objects in the git conversion should have 
exactly the same contents as SVN commits, and so be shared with the 
git-svn history to reduce the eventual repository size (except where there 
are defects in the git-svn history, or the git conversion fixes up cvs2svn 
artifacts and so some old revisions end up more accurately reflecting old 
history than the SVN repository does).

One particular case: we have well-maintained .gitignore files, that might 
even be more accurate than the svn:ignore properties, and I think the 
conversion should keep those and disable all smart ignore handling (just 
discard svn:ignore properties, and pass through the existing .gitignore 
files (and .cvsignore files)).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Mikhail Maltsev
On 09/01/2015 08:11 PM, Joseph Myers wrote:
> On Tue, 1 Sep 2015, Richard Earnshaw wrote:
> 
>> Renaming the files during the conversion is clearly *not* the right
>> thing to do: it would break all builds of old code.
> 
> Indeed.  Ideally the tree objects in the git conversion should have 
> exactly the same contents as SVN commits, and so be shared with the 
> git-svn history to reduce the eventual repository size (except where there 
> are defects in the git-svn history, or the git conversion fixes up cvs2svn 
> artifacts and so some old revisions end up more accurately reflecting old 
> history than the SVN repository does).

Actually, I did not propose to alter the repository history. I just meant to
say that if .c -> .cc renaming is still planned, it could be done right after
conversion, as a normal commit, or, perhaps series of commits on trunk and
active development (feature) branches.

-- 
Regards,
Mikhail Maltsev


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Eric S. Raymond
Joseph Myers :
> Indeed.  Ideally the tree objects in the git conversion should have 
> exactly the same contents as SVN commits, and so be shared with the 
> git-svn history to reduce the eventual repository size (except where there 
> are defects in the git-svn history, or the git conversion fixes up cvs2svn 
> artifacts and so some old revisions end up more accurately reflecting old 
> history than the SVN repository does).

I don't think sharing with the git-svn history will be possible.  git-svn
is a terrible whole-history converter; the odds of getting the same
topology out of reposurgeon are basically nil, and the problem of matching
different topologies is quite hard.

I'll be frank; if it's doable at all (which I doubt) I think this is a
*really bad idea* - a complexity hairball with few or no actual benefits.
I'm not willing to even try for it unless demand from the development
group is overwhelming and you're able to wait a long, long time for
results.

> One particular case: we have well-maintained .gitignore files, that might 
> even be more accurate than the svn:ignore properties, and I think the 
> conversion should keep those and disable all smart ignore handling (just 
> discard svn:ignore properties, and pass through the existing .gitignore 
> files (and .cvsignore files)).

This is also not currently possible, but it's not an intrinsically bad
idea. Giving reposurgeon an option to support it wouldn't be very
difficult.
-- 
Eric S. Raymond <http://www.catb.org/~esr/>


Re: Acceptance criteria for the git conversion

2015-09-01 Thread David Malcolm
On Tue, 2015-09-01 at 11:30 -0400, Eric S. Raymond wrote:
> Joseph Myers :
> > With 227369 revisions I don't think adding git-style summary lines is 
> > really practical without some very reliable automation to match commits to 
> > corresponding gcc-patches messages (whose Subject: headers would be the 
> > natural choice for such summary lines)
> 
> In this case you may be right.  Select =L tells me there are 101139
> commits wanting that sort of adjustment, which I think is at least
> 2.5x the bulk I've ever had to deal with before.
> 
> Still, if anyone else is brave enough to write a script that will munch
> through gcc-patches producing committer/date/subject-line triples, I'll
> give it a try.

I don't think committer/date/subject-line triples are adequate: the
dates are unlikely to match up, for one thing.

I think such a solution would need to somehow locate and match patches
themselves.

I was feeling brave, so I had a go at writing a scraper; see:
https://github.com/davidmalcolm/patch-finder
for what I have so far (tested with Python 2.7).

This can scrape the gcc-patches archives and locate mails containing
patches, extracting the patches (some of them anyway...).  The idea
would be to stuff the patches into some kind of big data store, and
somehow then try to locate them (perhaps within a rough date "window").

Does this seem like a viable approach?
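
One way the matching step might look, as a minimal Python sketch and not
what patch-finder actually does: compare the set of added lines in each
commit's diff with those in each mailed patch, considering only mails
within a date window.  The Change type, the 30-day window and the 0.5
similarity threshold are all assumptions chosen for illustration.

```python
from datetime import date, timedelta
from typing import Iterable, NamedTuple, Optional, Set


class Change(NamedTuple):
    ident: str   # commit id or list-archive URL
    when: date
    diff: str    # unified diff text


def added_lines(diff: str) -> Set[str]:
    """Non-blank lines a diff adds, ignoring '+++' file headers."""
    lines = set()
    for line in diff.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            text = line[1:].strip()
            if text:
                lines.add(text)
    return lines


def best_match(commit: Change, mails: Iterable[Change],
               window_days: int = 30,
               threshold: float = 0.5) -> Optional[Change]:
    """Return the mail whose patch best overlaps the commit, or None."""
    wanted = added_lines(commit.diff)
    window = timedelta(days=window_days)
    best, best_score = None, 0.0
    for mail in mails:
        if abs(commit.when - mail.when) > window:
            continue
        found = added_lines(mail.diff)
        union = wanted | found
        # Jaccard similarity of the two sets of added lines.
        score = len(wanted & found) / len(union) if union else 0.0
        if score >= threshold and score > best_score:
            best, best_score = mail, score
    return best
```

Comparing line sets tolerates the small rewording and re-diffing that
happens between a posted patch and the committed version, at the cost of
missing patches that were heavily revised after review.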

Caution: this script performs numerous URL GETs on gcc.gnu.org;
it caches everything, but the first time you run it, the cache
will be cold.  (So please be careful!)

> About scale:  The largest repository I've dealt with before this was
> NetBSD, with a working set of 18GB, vs 45GB for this one.  The way
> reposurgeon's internal representations work, working set is dominated by
> comment text.  So the GCC repo has about 2.5x the comment bulk of NetBSD.




Re: Acceptance criteria for the git conversion

2015-09-01 Thread Eric S. Raymond
David Malcolm :
> > Still, if anyone else is brave enough to write a script that will munch
> > through gcc-patches producing committer/date/subject-line triples, I'll
> > give it a try.
> 
> I don't think committer/date/subject-line triples are adequate: the
> dates are unlikely to match up, for one thing.

Agreed. They're unlikely to match up exactly.

> I think such a solution would need to somehow locate and match patches
> themselves.
> 
> I was feeling brave, so I had a go at writing a scraper; see:
> https://github.com/davidmalcolm/patch-finder
> for what I have so far (tested with Python 2.7).
> 
> This can scrape the gcc-patches archives and locate mails containing
> patches, extracting the patches (some of them anyway...).  The idea
> would be to stuff the patches into some kind of big data store, and
> somehow then try to locate them (perhaps within a rough date "window").
> 
> Does this seem like a viable approach?

I think it's as good as we're likely to get given the data available.
-- 
Eric S. Raymond <http://www.catb.org/~esr/>


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Rainer Orth
Joseph Myers  writes:

> On Tue, 1 Sep 2015, Eric S. Raymond wrote:
>
>> As a trivial example of the possibilities, sometimes when I do conversions
>> I fix obvious comment typos. I generally have to edit the comment history
>> anyway to tweak comments that don't have git-style summary lines into
>> shape, so fixing typos is not much additional work.
>
> With 227369 revisions I don't think adding git-style summary lines is 
> really practical without some very reliable automation to match commits to 
> corresponding gcc-patches messages (whose Subject: headers would be the 
> natural choice for such summary lines)

And even that wouldn't work, I believe: a considerable number of patches
are submitted in the context of some thread whose patch e.g. introduced
a regression or bootstrap failure, without changing the subject.  So,
unless you detect this case and make something up, the result is likely
to be confusing rather than helpful.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Joseph Myers
On Tue, 1 Sep 2015, Eric S. Raymond wrote:

> As a trivial example of the possibilities, sometimes when I do conversions
> I fix obvious comment typos. I generally have to edit the comment history
> anyway to tweak comments that don't have git-style summary lines into
> shape, so fixing typos is not much additional work.

With 227369 revisions I don't think adding git-style summary lines is 
really practical without some very reliable automation to match commits to 
corresponding gcc-patches messages (whose Subject: headers would be the 
natural choice for such summary lines)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Richard Earnshaw
On 01/09/15 15:26, Mikhail Maltsev wrote:
> On 09/01/2015 01:54 PM, Eric S. Raymond wrote:
>> With the machinery for the git conversion now in reasonable shape, it's
>> time to ask GCC's developers in general:  what do you want this
>> conversion to accomplish?
> There was some discussion concerning file renaming:
> https://gcc.gnu.org/ml/gcc/2015-04/msg00175.html
> 
> I think using different extensions for C and C++ code is a good thing
> (because having C++ code in ".c" files is confusing both for humans and for
> tools), and probably moving to git is a good occasion for such rename.
> I could take care of performing this change (i.e. make a list of C++ files
> with .c extension, fix build scripts, run some tests), if it is acceptable.
> 

Renaming the files during the conversion is clearly *not* the right
thing to do: it would break all builds of old code.

Whether the renames should be done post conversion is a completely
different question.  Personally, I don't see the point.

R.


Re: Acceptance criteria for the git conversion

2015-09-01 Thread David Edelsohn
On Tue, Sep 1, 2015 at 10:26 AM, Mikhail Maltsev  wrote:
> On 09/01/2015 01:54 PM, Eric S. Raymond wrote:
>> With the machinery for the git conversion now in reasonable shape, it's
>> time to ask GCC's developers in general:  what do you want this
>> conversion to accomplish?
> There was some discussion concerning file renaming:
> https://gcc.gnu.org/ml/gcc/2015-04/msg00175.html
>
> I think using different extensions for C and C++ code is a good thing
> (because having C++ code in ".c" files is confusing both for humans and for
> tools), and probably moving to git is a good occasion for such rename.
> I could take care of performing this change (i.e. make a list of C++ files
> with .c extension, fix build scripts, run some tests), if it is acceptable.

Definitely not.

- David


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Mikhail Maltsev
On 09/01/2015 01:54 PM, Eric S. Raymond wrote:
> With the machinery for the git conversion now in reasonable shape, it's
> time to ask GCC's developers in general:  what do you want this
> conversion to accomplish?
There was some discussion concerning file renaming:
https://gcc.gnu.org/ml/gcc/2015-04/msg00175.html

I think using different extensions for C and C++ code is a good thing
(because having C++ code in ".c" files is confusing both for humans and for
tools), and probably moving to git is a good occasion for such rename.
I could take care of performing this change (i.e. make a list of C++ files
with .c extension, fix build scripts, run some tests), if it is acceptable.

-- 
Regards,
Mikhail Maltsev


Re: Acceptance criteria for the git conversion

2015-09-01 Thread David Malcolm
On Tue, 2015-09-01 at 06:54 -0400, Eric S. Raymond wrote:
> With the machinery for the git conversion now in reasonable shape, it's
> time to ask GCC's developers in general:  what do you want this
> conversion to accomplish?
> 
> There are some obvious things we might expect it to accomplish, like
> 
> (1) Encouraging people to do finer-grained commits because the operation is
> so much faster.

FWIW, I'm not convinced (1) is so relevant to gcc.  For me, most of my
time spent on committing patches to gcc is the part where I'm waiting on
my machine to do bootstrap testing; the actual commit is
relatively fast.   I believe we have a policy that although we may break
up patches into chunks for ease of review, commits themselves should be
"atomic", so that the repository is always in a working state.  At
least, that's the ideal :)

> (2) Attract developers who think Subversion is clunky and old-fashioned.

Yes, though I think many of us do almost all of our gcc work using git
already, using the git-svn mirror, and only touch svn for the final
commit.

Hence I think this is more a marketing "smell" thing: potential
developers may see "SVN" on our website and go "ugh!  gcc is so old
fashioned!", but the reality for me is that I can already do almost all
of my gcc work in git without touching svn.

> (3) Enable bisection as a bug-localization technique.

> But there's not much Jason or I can do to advance *those* goals; any
> conversion except one that's too crappy to be usable would accomplish them.
> 
> What I'm interested in, as I assist the process, is how your desires
> ought to affect what we do during the conversion.
> 
> As a trivial example of the possibilities, sometimes when I do conversions
> I fix obvious comment typos. I generally have to edit the comment history
> anyway to tweak comments that don't have git-style summary lines into
> shape, so fixing typos is not much additional work.
> 
> What kind of mechanical transformation or hand-editing would add value
> for you?

Will the resulting git commits have some kind of metadata identifying
the corresponding SVN revision?   For example, I see something like this
in the git svn mirror e.g. this line shows up for
0c0caab66d411b8df6a9057d788f1c8bcf77a83a:

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@226697 
138bc75d-0d04-0410-961f-82ee72b054a4

We have numerous references to specific revisions in bugzilla and in the
list archives, so retaining this mapping seems very useful for future
"archaeological digs"; it would be a major regression compared to the
git-svn mirror if we lost them.

Similarly, our bugzilla automatically turns text like "r226697" into
links to the relevant commit in SVN.  I don't know if there's a way to
maintain those links for git (beyond e.g. creating a named tag for every
r[0-9]+, but that's clearly insane), so presumably we'd want to keep the
old SVN web interface around to service those bugzilla URLs?
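
If a revision-to-commit table is available, the "or we could just rewrite
the link targets" option discussed elsewhere in this thread is mostly a
text substitution.  A minimal Python sketch; the function name and the URL
template are placeholders, not bugzilla's or gcc.gnu.org's actual
configuration:

```python
import re
from typing import Mapping

# Matches tokens such as "r226697" that stand alone as a word.
REV_TOKEN = re.compile(r"\br(\d+)\b")


def linkify_revisions(text: str, rev_to_commit: Mapping[int, str],
                      url_template: str = "https://example.org/git/?h={sha}") -> str:
    """Append a git URL after each rNNN token that has a known commit.

    Revisions missing from rev_to_commit are left untouched, so old text
    stays readable even where the mapping is incomplete.
    """
    def replace(match):
        sha = rev_to_commit.get(int(match.group(1)))
        if sha is None:
            return match.group(0)
        return f"{match.group(0)} ({url_template.format(sha=sha)})"

    return REV_TOKEN.sub(replace, text)
```

With {226697: "0c0caab66d411b8df6a9057d788f1c8bcf77a83a"} as the table, the
text "Fixed in r226697" gains a link to that commit and keeps the original
revision number visible.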


I'd love it if the commits gained readable titles (where they don't
already).  For example, if I run "git shortlog" (on a working copy from
the git-svn mirror), I see this for some of my commits:
  2013-08-06  David Malcolm  
  2013-08-07  David Malcolm  
  2013-08-07  David Malcolm  
  2013-08-07  David Malcolm  
  2013-08-07  David Malcolm  
  2013-08-13  David Malcolm  
  gcc/testsuite
  gcc/testsuite
(etc)
which is less than helpful.  Since I noticed this, I've been trying to
add decent title lines to my SVN commits so that they show up nicely in
git.

So it'd be great if a script could identify those commit titles that are
just the top of ChangeLog entries, scrape the gcc-patches mailing list
archives to try to locate the Subject lines of the pertinent patches
(and clean away extraneous [PATCH] or [PING] fragments, though other
fragments may be pertinent e.g. identifying subsystems).  Potentially it
could also add the URL of the discussion in the list archive.  Clearly a
non-trivial task though.
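
The Subject clean-up part, at least, is small.  A minimal Python sketch,
assuming that any bracketed tag mentioning PATCH or PING is noise and that
other bracketed tags (subsystem markers and the like) should survive; the
function name and the exact patterns are assumptions and would need tuning
against the real archive:

```python
import re

# Bracketed tags that mention PATCH or PING, e.g. "[PATCH 2/3]", "[PING^2]".
NOISE_TAG = re.compile(r"\[[^\]]*\b(?:PATCH|PING)\b[^\]]*\]", re.IGNORECASE)
# Any run of leading "Re:" / "Fw:" / "Fwd:" prefixes.
REPLY_PREFIX = re.compile(r"^(?:\s*(?:Re|Fwd?):\s*)+", re.IGNORECASE)


def clean_subject(subject: str) -> str:
    """Turn a gcc-patches Subject line into a candidate commit title."""
    subject = REPLY_PREFIX.sub("", subject)
    subject = NOISE_TAG.sub("", subject)
    # Collapse the whitespace left behind by the removed fragments.
    return " ".join(subject.split())
```

One known limitation: a combined tag such as "[PATCH, libstdc++]" is dropped
whole, taking the subsystem name with it.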

Thanks; hope this is constructive
Dave



Re: Acceptance criteria for the git conversion

2015-09-01 Thread Eric S. Raymond
Joseph Myers :
> With 227369 revisions I don't think adding git-style summary lines is 
> really practical without some very reliable automation to match commits to 
> corresponding gcc-patches messages (whose Subject: headers would be the 
> natural choice for such summary lines)

In this case you may be right.  Select =L tells me there are 101139
commits wanting that sort of adjustment, which I think is at least
2.5x the bulk I've ever had to deal with before.

Still, if anyone else is brave enough to write a script that will munch
through gcc-patches producing committer/date/subject-line triples, I'll
give it a try.

About scale:  The largest repository I've dealt with before this was
NetBSD, with a working set of 18GB, vs 45GB for this one.  The way reposurgeon's
internal representations work, working set is dominated by comment text.  So
the GCC repo has about 2.5x the comment bulk of NetBSD.
-- 
Eric S. Raymond <http://www.catb.org/~esr/>


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Andreas Schwab
"Eric S. Raymond"  writes:

> There is no way to maintain those links for git, so yes, you want to
> keep a read-only Subversion instance around.

The mapping can also be put in some git notes tree for use by bugzilla.
That would only need to be set up once.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Acceptance criteria for the git conversion

2015-09-01 Thread Eric S. Raymond
David Malcolm :
> > What kind of mechanical transformation or hand-editing would add value
> > for you?
> 
> Will the resulting git commits have some kind of metadata identifying
> the corresponding SVN revision?

That's what the --legacy option does.  I think Jason plans to use it.

I've noted previously that I actually recommend against this based on
previous experience with Subversion conversions.  You get a lot of
clutter in the result and demand for such lookups (especially *outside
the specific context of a buglist*) tends to drop off faster than
people expect.  But that policy decision isn't mine to make.

> Similarly, our bugzilla automatically turns text like "r226697" into
> links to the relevant commit in SVN.  I don't know if there's a way to
> maintain those links for git (beyond e.g. creating a named tag for every
> r[0-9]+, but that's clearly insane), so presumably we'd want to keep the
> old SVN web interface around to service those bugzilla URLs?

There is no way to maintain those links for git, so yes, you want to
keep a read-only Subversion instance around.

This also lowers the utility of keeping the legacy-ID in every commit.

> So it'd be great if a script could identify those commit titles that are
> just the top of ChangeLog entries, scrape the gcc-patches mailing list
> archives to try to locate the Subject lines of the pertinent patches
> (and clean away extraneous [PATCH] or [PING] fragments, though other
> fragments may be pertinent e.g. identifying subsystems).  Potentially it
> could also add the URL of the discussion in the list archive.  Clearly a
> non-trivial task though.

There was already some discussion of this.  If someone else is willing
to digest the gcc-patches archives into committer/date/subject-line
triples I'll see what I can do.

> Thanks; hope this is constructive

Yes, very much the sort of thing I was looking for.
-- 
Eric S. Raymond <http://www.catb.org/~esr/>