Hi Zooko,

Thanks for your clarification (and more timings)!

On Wed, Jan 23, 2008 at 08:43:53AM -0700, zooko wrote:
> On Jan 23, 2008, at 5:02 AM, David Roundy wrote:
> > 1. Am I correct that all these timings involve push -a and pull -a when
> > there is nothing to push or pull?
>
> Yes.
>
> > 2. Did you enable a global cache directory with ~/.darcs/sources
>
> No. There is only one darcs-2-format repository on each machine so
> it didn't occur to me that a global cache could be useful.

Actually, I think you're right. So you can save time and not bother with
the global cache.

> > 3. Did you repeat your measurements? In particular, if you use a global
> > cache the second time you push/pull with either --hashed on both sides
> > or --darcs-2 on both sides should be lightning fast. (And if it's not,
> > I want to figure out why.)
>
> Okay I just did it again. For each operation I did four trials and
> am reporting the best (fastest):
>
> A. darcs 1.0.9, old-format local repo, old-format remote repo
> B. darcs 2 pre, hashed-format local repo, old-format remote repo

Two cases I'd like to see are darcs 2 pre with hashed-format on each side,
and darcs 2 pre with old-format on each side. The latter case really ought
to be about as fast as darcs 1.0.9, so it can serve as a test for whether
we've done something seriously stupid (i.e. given the same format, we
shouldn't require more network accesses).

The former (hashed-hashed) ought to be faster than accessing a remote
old-fashioned repository. It really should only take O(1) network
accesses... although, to be honest, the old-fashioned repository will also
take O(1), so maybe it won't actually be faster. The (small) slowdown
you're seeing may be something stupid like accessing the remote
_darcs/format multiple times.

> B2. same but with a newly created global cache on the local side
> B3. same but with an already-populated global cache on the local side
> C. darcs 2 pre, new-format local repo, new-format remote repo
> C2. same but with a newly created global cache on the local side
> C3. same but with an already-populated global cache on the local side
>
> column 1: push, column 2: pull ; all values in seconds
>
> A.    7.9  18.2
> B.   16.3  14.7
> B2.  14.6  19.1
> B3.  16.0  14.3
> C.    9.0  12.1
> C2.  11.4  15.7
> C3.   9.9   9.8
>
> Note that these measurements varied by many seconds -- up to 100% in
> some runs -- possibly because of network contention or disk or CPU
> contention on the remote repo. So consider them to be only broad
> general measurements of performance, and also use these measurements
> as reminders that your handling of network latency and resource
> contention are often more important than your algorithm efficiency.
>
> So far my general feeling is that switching over to darcs-2 and using
> hashed repository format (the B rows) would not be an improvement
> over using darcs-1, and that requiring everyone who wants to share
> source with me to switch to darcs-2 and use darcs-2-repository-format
> (the C rows) might be an improvement, but it isn't clear yet, and
> obviously having a "flag day" style upgrade like that is a deterrent.

I think that if your remote repository is hashed, then you should get the
same performance (when there are no conflicts) with --hashed as you do with
--darcs-2, which is an improvement in pull according to your timings. If
you've got the disk space, a cron job to keep --old-fashioned and --hashed
repositories on the server in sync is pretty easy and safe, and allows
darcs 2 users to get faster results.

> Oh, by the way, I suspect that darcs on windows (the "local" machine
> in these measurements) isn't actually finding the global cache (I
> told it "~/.darcs/cache", and I guess it doesn't know what I mean by
> "~"), so you can probably consider the 1/2/3 rows as just showing the
> noise in the system rather than actually using the cache. :-)
>
> I'll try to explain to darcs-on-windows where to do caching, later.

I'll appreciate that!
There's a function in the Haskell libraries that is supposed to give
something like a cross-platform "home" directory, but I know there's been
debate about how that's best defined (since Windows doesn't really have
the equivalent of a Unix home directory, at least not in the sense of "a
place to put .files").
-- 
David Roundy
Department of Physics
Oregon State University
_______________________________________________
darcs-devel mailing list
darcs-devel@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-devel