Hello!

On Tue, Jun 23, 2009 at 01:25:34PM +0200, olafbuddenha...@gmx.net wrote:
> On Thu, Jun 18, 2009 at 03:02:41PM +0200, Thomas Schwinge wrote:
> > Olaf asked whether we could fix the author and committer information
> > for the changesets. This can't be done reliably in an automated way
> > and surely no one wants to inspect 10,000+ changesets manually. As I
> > consider a correct-believed but nevertheless incorrect automatic
> > conversion worse than the current one where you exactly know that the
> > information is not accurate, I decided to leave this alone as is.
>
> I don't agree. What's the use of this damn pedantic GNU-style changelog
> format, if we can't even reliably extract author information from it?!

You can extract that information, but it's a manual process. It
involves looking into the ChangeLog *file* and extracting the
information from there. Consider this case:

P commits change C1 with log message L1. L1 does *not* contain any
authorship information, but contains solely the textual description of
the actual change. P commits further changes Cx with log messages Lx,
as above. Eventually P will commit change C(ChangeLog) with log message
L(``.''). Only then will the ChangeLog reflect the correct attribution
of the changes. And note that I didn't make this up, but this is in
fact how it has (partly) been done in the past.

This means that the revision control system alone will never be able to
correctly describe the authorship information of individual changes.
Only the ``serialized form'' (say, a tarball release's ChangeLog file)
will have the correct information.

> I also do not agree that having everything wrong is better than having a
> few errors, perhaps, or maybe not.

I object: having most of it correct, but not all, gives the false
impression that everything is correct. And as we're talking about
legally relevant matters here, it is better to be on the safe side.

> (And it's not even more consistent, as any new commits will have it
> right.)

Indeed there is a cut-off point: before it the ChangeLog is correct, and
after it the Git information is correct. At the cut-off point (i.e.,
now), the ChangeLog files will be removed from the trees (or be renamed
to ChangeLog.old or whatever).

> Also note that not only the original Author information is missing, but
> also the Committer is "tschwinge" for all commits -- I guess you did
> some careless rebasing or something like that... So the result is that
> the Committer is bogus, the Author contains the actual comitter, and the
> real author is only mentioned in the Changelog. That's extremely ugly
> and confusing IMHO.

We can't really do anything about it.

The situation after the cut-off point: standard Git usage.

The situation before the cut-off point:

    Git committer name == CVS committer name, or tschwinge
    Git committer date == CVS committer date, or a recent date
    Git author name    == CVS committer name
    Git author date    == CVS committer date

Indeed -- you guessed correctly; set aside the ``careless'' allegation
-- the Git committer {name,date} != CVS committer {name,date} in the
cases where I manually re-spooled a lot of series of commits in order to
re-craft proper merges between branches. So, in the Git sense, it is
correct that the Git committer {name,date} is updated to the person
having re-committed each original commit.

So, to sum up: after the cut-off point, everything is as expected.
Before the cut-off point, the Git committer information is useless, the
Git author information is the CVS committer information, and the
changes' author information is hidden in the relevant ChangeLog file.

> > Also, there was the idea of aggregating all the individual one-file,
> > [...], then ChangeLog commits into aggregates, but this also can't be
> > done reliably in an automated fashion without a lot of manual
> > corrections (as could be seen in the glibc CVS to Git conversion), so
> > I also left that alone.
>
> I feared that much... It's a pity, but I guess there is nothing we can
> do about it :-(

Unfortunately, yes.

See, these are (a part of) the reasons why the CVS to Git migration took
so long. Like you, I wanted to get it all right, so that the Git VCS
information is ``correct''. But it is just not possible (without manual
intervention, of course).

Regards,
 Thomas
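
To illustrate the manual ChangeLog lookup described above, here is a
minimal sketch of how the author lines could be pulled out of a
ChangeLog *file*. It is not part of the conversion and not something
that was run on the repositories. It assumes the common GNU entry header
of the form ``YYYY-MM-DD  Name  <email>'' followed by indented
``* file'' lines; older header styles are not handled. The function name
changelog_authors and the sample names and addresses are hypothetical.

```python
import re

# Assumed header of a GNU-style ChangeLog entry: "YYYY-MM-DD  Name  <email>".
HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(.+?)\s+<([^>]+)>\s*$")
# First file named on an indented change line: "\t* path/to/file (func): ...".
FILE_LINE = re.compile(r"^\s+\*\s+([^\s:(,]+)")


def changelog_authors(text):
    """Return one (date, author, email, files) tuple per ChangeLog entry."""
    entries = []
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            # A new entry starts; later "* file" lines belong to it.
            current = (header.group(1), header.group(2), header.group(3), [])
            entries.append(current)
        elif current is not None:
            mention = FILE_LINE.match(line)
            if mention:
                current[3].append(mention.group(1))
    return entries


if __name__ == "__main__":
    # Hypothetical sample entry, for illustration only.
    sample = (
        "2009-06-18  Jane Doe  <jane@example.org>\n"
        "\n"
        "\t* foo.c (bar): Fix it.\n"
    )
    for date, author, email, files in changelog_authors(sample):
        print(date, author, email, files)
```

Note that this only yields (author, file) pairs per ChangeLog entry. It
picks up just the first file on a ``* a.c, b.c:'' line, and it cannot
tell which of the earlier commits Cx an entry belongs to, which is
exactly the part that needs manual inspection.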