On Sat, 16 Apr 2005, Ingo Molnar wrote:
> 
> i've converted the Linux kernel CVS tree into 'flat patchset' format, 
> which gave a series of 28237 separate patches. (Each patch represents a 
> changeset, in the order they were applied. I've used the cvsps utility.)
> 
> the history data starts at 2.4.0 and ends at 2.6.12-rc2. I've included a 
> script that will apply all the patches in order and will create a 
> pristine 2.6.12-rc2 tree.

Hey, that's great. I got the CVS repo too, and I was looking at it, but 
the more I looked at it, the more I felt that the main reason I want to 
import it into git ends up being to validate that my size estimates are at 
all realistic.

I see that Thomas Gleixner seems to have done that already, and come to a 
figure of 3.2GB for the last three years, which I'm very happy with, 
mainly because it seems to match my estimates to a tee. Which means that I 
just feel that much more confident about git actually being able to handle 
the kernel long-term, and not just as a stop-gap measure.

But I wonder if we actually want to populate the whole history.. 
Now that my size estimates have been verified, I have little real 
reason to put the history into git. There are no visualization tools done 
for git yet, and no helpers to actually find problems, and by the time 
there will be, we'll have new history.

So I'd _almost_ suggest just starting from a clean slate after all.  
Keeping the old history around, of course, but not necessarily putting it
into git now. It would just force everybody who is getting used to git in 
the first place to work with a 3GB archive from day one, rather than 
getting into it a bit more gradually.

What do people think? I'm not so much worried about the data itself: the
git architecture is _so_ damn simple that, now that the size estimate has
been confirmed, I don't think it would be a problem per se to put
3.2GB into the archive. But it will bog down "rsync" horribly, so it will
actually hurt synchronization until somebody writes the rev-tree-like
stuff to communicate changes more efficiently..

IOW, it smells to me like we don't have the infrastructure to really work 
with 3GB archives, and that if we start from scratch (2.6.12-rc2), we can 
build up the infrastructure in parallel with starting to really need it.

But it's _great_ to have the history in this format, especially since 
looking at CVS just reminded me how much I hated it.

Comments?

                Linus
