Hi Zooko,

Thanks for your clarification (and more timings)!

On Wed, Jan 23, 2008 at 08:43:53AM -0700, zooko wrote:
> On Jan 23, 2008, at 5:02 AM, David Roundy wrote:
> > 1. Am I correct that all these timings involve push -a and pull -a when
> > there is nothing to push or pull?
> 
> Yes.
> 
> > 2. Did you enable a global cache directory with ~/.darcs/sources?
> 
> No.  There is only one darcs-2-format repository on each machine so  
> it didn't occur to me that a global cache could be useful.

Actually, I think you're right.  So you can save time and not bother with
the global cache.

> > 3. Did you repeat your measurements? In particular, if you use a global
> > cache the second time you push/pull with either --hashed on both sides
> > or --darcs-2 on both sides should be lightning fast.  (And if it's not,
> > I want to figure out why.)
> 
> Okay I just did it again.  For each operation I did four trials and  
> am reporting the best (fastest):
> 
> A.  darcs 1.0.9, old-format local repo, old-format remote repo
> B.  darcs 2 pre, hashed-format local repo, old-format remote repo

Two cases I'd like to see are darcs 2 pre with hashed-format on each side,
and darcs 2 pre with old-format on each side.

The latter case really ought to be about as fast as darcs 1.0.9, so it can
serve as a test for whether we've done something seriously stupid
(i.e. given the same format, we shouldn't require more network accesses).

The former (hashed-hashed) ought to be faster than accessing a remote
old-fashioned repository.  It really should only take O(1) network
accesses... although, to be honest, the old-fashioned repository will also
take O(1), so maybe it won't actually be faster.

The (small) slowdown you're seeing may be something stupid like accessing
the remote _darcs/format multiple times.

> B2. same but with a newly created global cache on the local side
> B3. same but with an already-populated global cache on the local side
> C.  darcs 2 pre, new-format local repo, new-format remote repo
> C2. same but with a newly created global cache on the local side
> C3. same but with an already-populated global cache on the local side
> 
> column 1: push, column 2: pull ; all values in seconds
> 
> A.     7.9    18.2
> B.    16.3    14.7
> B2.   14.6    19.1
> B3.   16.0    14.3
> C.     9.0    12.1
> C2.   11.4    15.7
> C3.    9.9     9.8
> 
> Note that these measurements varied by many seconds -- up to 100% in  
> some runs -- possibly because of network contention or disk or CPU  
> contention on the remote repo.  So consider them to be only broad  
> general measurements of performance, and also use these measurements  
> as reminders that your handling of network latency and resource  
> contention are often more important than your algorithm efficiency.
> 
> So far my general feeling is that switching over to darcs-2 and using  
> hashed repository format (the B rows) would not be an improvement  
> over using darcs-1, and that requiring everyone who wants to share  
> source with me to switch to darcs-2 and use darcs-2-repository-format  
> (the C rows) might be an improvement, but it isn't clear yet, and  
> obviously having a "flag day" style upgrade like that is a deterrent.

I think that if your remote repository is hashed, then you should get the
same performance (when there are no conflicts) with --hashed as you do with
--darcs-2, which is an improvement in pull according to your timings.  If
you've got the disk space, a cron job to keep --old-fashioned and --hashed
repositories on the server in sync is pretty easy and safe, and gives
darcs 2 users faster pulls.

> Oh, by the way, I suspect that darcs on windows (the "local" machine  
> in these measurements) isn't actually finding the global cache (I  
> told it "~/.darcs/cache", and I guess it doesn't know what I mean by  
> "~"), so you can probably consider the 1/2/3 rows as just showing the  
> noise in the system rather than actually using the cache.  :-)
> 
> I'll try to explain to darcs-on-windows where to do caching, later.

I'd appreciate that! There's a function in the Haskell libraries that is
supposed to give something like a cross-platform "home" directory, but I
know there's been debate about how that's best defined (since Windows
doesn't really have the equivalent of a Unix home directory, at least not
in the sense of "a place to put .files").
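
A minimal sketch of the idea, assuming the System.Directory module from the
"directory" package is the library in question (that is a guess, and this is
not what darcs itself does). The helper expandTilde is hypothetical and only
illustrates replacing a leading "~" by hand; getHomeDirectory and
getAppUserDataDirectory are existing library calls:

```haskell
module Main (main) where

import System.Directory (getAppUserDataDirectory, getHomeDirectory)
import System.FilePath ((</>))

-- | Hypothetical helper (not part of darcs): replace a leading "~" with
-- the given home directory. Any other path is returned unchanged,
-- including the "~user/..." form, which is not handled here.
expandTilde :: FilePath -> FilePath -> FilePath
expandTilde home path = case path of
  "~" -> home
  ('~' : sep : rest)
    | sep == '/' || sep == '\\' -> home </> rest
  _ -> path

main :: IO ()
main = do
  -- The library's notion of a per-user "home" directory.
  home <- getHomeDirectory
  -- A per-application data directory; "darcs" is just the name passed in.
  appDir <- getAppUserDataDirectory "darcs"
  putStrLn ("home directory:      " ++ home)
  putStrLn ("expanded cache path: " ++ expandTilde home "~/.darcs/cache")
  putStrLn ("app data directory:  " ++ appDir)
```

On Unix, getAppUserDataDirectory "darcs" typically yields $HOME/.darcs, while
on Windows it points somewhere under the user's application data folder, which
is one way around the "place to put .files" question.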
-- 
David Roundy
Department of Physics
Oregon State University
_______________________________________________
darcs-devel mailing list
darcs-devel@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-devel
