On Sat, 1 Nov 2008, David Roundy wrote:

> On Thu, Oct 30, 2008 at 06:01:26AM +0000, Ganesh Sittampalam wrote:
>> On Wed, 29 Oct 2008, Jason Dagit wrote:
>>> Have you retimed things with the full set of patches you submitted?
>>> Do you know what the overall improvement would be?
>>
>> Nope - without some really good "fire and forget" infrastructure and a
>> dedicated machine that can be guaranteed quiescent, benchmarking is quite
>> fiddly and time-consuming, so I've only been doing it for things where it
>> seemed particularly warranted.
>
> I've just reviewed this one, and it looks correct, but I couldn't predict
> its performance behavior.  So I'd rather not apply it, unless either you
> can explain it to me in such a way that I can understand what the
> improvement is, or you have benchmarks demonstrating the improvement.

I've timed it on whatsnew -sl in a directory containing 1000 new files,
and it's definitely a major speedup (I forget the timing numbers, but the
profile clearly showed it going from quadratic to linear, as I claimed).

> I can see that you replace (+>+) with (:>:) using some clever tricks (which
> is definitely always a bonus), but that only affects the scaling when many,
> many changes are made to a single file, in which case this is almost
> certainly not a bottleneck (since we're diffing said file, which is a slow
> operation).

> The other change (and I think these two changes are separable?) is a switch
> from foldl' to foldr, and I must admit that these fold functions almost
> always confuse the heck out of me...

They are separable, and I didn't check which of them was responsible for
the actual speedup in the test case I made, but both seemed sensible as a
general principle. The basic goal is that if we ever use (+>+), we should
make sure it associates to the right, since, like (++), it's linear in
its *left* argument but constant time in its right argument. That's the
effect of switching to foldr here. I wouldn't expect any semantic
difference between the folds in this particular case, as (+>+) is
associative; it's purely a performance change. In addition, the fact that
we're building a lazy data structure rather than a strict value means
that foldr is almost certainly the more natural thing to use here.
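
To illustrate the associativity point, here is a minimal sketch using
ordinary lists, with (++) standing in for (+>+). This is not the darcs
code: the names chunks, leftNested and rightNested are made up for the
example, and it assumes only that (+>+) has the same cost profile as (++).

```haskell
module Main where

import Data.List (foldl')

-- Hypothetical stand-in for a sequence of patches: plain lists, with (++)
-- playing the role of (+>+).
chunks :: [[Int]]
chunks = [[i] | i <- [1 .. 1000]]

-- Left-nested: (((([] ++ c1) ++ c2) ++ c3) ++ ...).
-- Each (++) re-traverses its left argument, so consuming the full result
-- costs time quadratic in the number of chunks.
leftNested :: [[Int]] -> [Int]
leftNested = foldl' (++) []

-- Right-nested: c1 ++ (c2 ++ (c3 ++ ...)).
-- Each chunk is traversed once, so the total cost is linear, and the
-- result can be consumed lazily.
rightNested :: [[Int]] -> [Int]
rightNested = foldr (++) []

main :: IO ()
main = do
  -- (++) is associative, so both folds build the same list.
  print (leftNested chunks == rightNested chunks)
  -- Laziness: the right fold yields a prefix even from infinite input.
  print (take 3 (rightNested (map (: []) [1 ..])))
```

The two folds agree on the result because the operator is associative;
only the nesting, and hence the cost, differs. The last line also shows
why foldr fits a lazily built structure: it produces output before the
whole input has been seen, which the left fold cannot do.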

Ganesh
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
