On Sun, 2005-08-14 at 23:58 +0200, "Martin v. Löwis" wrote:
> Guido van Rossum wrote:
> > Here's another POV.
>
> I think I agree with Daniel's view, in particular wrt. performance.
> Whatever the replacement tool, it should perform as well as or better
> than CVS currently does; it also shouldn't perform much worse than
> Subversion.
Then, in fairness, I should note that annotate is slower on Subversion (and Monotone, and anything using binary deltas) than on CVS. This is because you can't generate the line diffs that annotate wants directly from binary copy+add deltas; you have to reconstruct the actual revision texts and then line-diff them. Thus CVS is O(N) here, while SVN and other binary-delta users are O(N^2).

You wouldn't really notice the speed difference when annotating a file with 100 revisions. You would if you annotated the 800k ChangeLog, which has 30k trunk revisions: CVS takes 4 seconds, svn takes ~5 minutes, with the whole time spent diffing those revisions. I rewrote the blame algorithm recently so that it only takes about 2 minutes on that ChangeLog, but it cheats: it knows it can stop early once it has blamed every line (since our ChangeLog rotates).

For those curious: you also can't directly generate "always-correct" byte-level differences from the deltas, since their goal is to find the most space-efficient way to transform rev old into rev new, *not* to record the actual byte-level changes that occurred between old and new. It may turn out that doing an add of 2 bytes is cheaper than specifying the opcode for copy(start, len). Actual diffs are produced by reconstructing the texts and line-diffing them. Such is the cost of efficient storage :).

> > I've been using git (or, rather, cogito) to keep up-to-date with the
> > Linux kernel. While performance of git is really good, storage
> > requirements are *quite* high, and the initial "checkout" takes a long
> > time - even though the Linux kernel repository stores virtually no
> > history (there was a strict cut when converting the bitkeeper HEAD).
> > So these distributed tools would cause quite some disk consumption
> > on client machines. bazaar-ng apparently supports remote-only
> > repositories as well, so that might be no concern.
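To make the quadratic cost concrete, here is a toy Python sketch (emphatically *not* SVN's actual code). It cheats by handing back full revision texts where a real binary-delta store would have to replay a delta chain to reconstruct each revision; the point is only the shape of the algorithm: every revision must be materialized in full and line-diffed against its predecessor before any line can be blamed.

```python
import difflib

def reconstruct(revisions):
    """Stand-in for delta application. Here the history is just a list of
    full texts; in a real binary-delta store, producing each revision would
    itself cost work proportional to its delta chain -- which is where the
    extra factor of N comes from."""
    for text in revisions:
        yield text

def annotate(revisions):
    """Return (revision_index, line) pairs for the newest revision's lines."""
    blame = []
    prev_lines = []
    for rev, text in enumerate(reconstruct(revisions)):
        lines = text.splitlines()
        matcher = difflib.SequenceMatcher(a=prev_lines, b=lines)
        new_blame = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == 'equal':
                # unchanged lines keep the blame they already had
                new_blame.extend(blame[i1:i2])
            else:
                # lines inserted or replaced in this revision are blamed on it
                new_blame.extend((rev, line) for line in lines[j1:j2])
        blame = new_blame
        prev_lines = lines
    return blame

history = ["a\nb\n", "a\nb\nc\n", "a\nX\nc\n"]
print(annotate(history))  # → [(0, 'a'), (2, 'X'), (1, 'c')]
```

The early-stop trick mentioned above corresponds to walking the history newest-to-oldest instead and bailing out as soon as every line of the target text has an owner; for a file like a rotating ChangeLog, where old revisions contribute nothing to the current text, that saves most of the work.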
The argument "network and disk is cheap" doesn't work for us when you are talking about 5-10 gigabytes of initial transfer :). However, I doubt it's more than a hundred meg or so for Python, if that. You may run into these problems in 10 years :)

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com