On Sat, 1 Nov 2008, David Roundy wrote:

> On Thu, Oct 30, 2008 at 06:01:26AM +0000, Ganesh Sittampalam wrote:
>> On Wed, 29 Oct 2008, Jason Dagit wrote:
>>> Have you retimed things with the full set of patches you submitted?
>>> Do you know what the overall improvement would be?
>>
>> Nope - without some really good "fire and forget" infrastructure and a
>> dedicated machine that can be guaranteed quiescent, benchmarking is quite
>> fiddly and time-consuming, so I've only been doing it for things where it
>> seemed particularly warranted.
>
> I've just reviewed this one, and it looks correct, but I couldn't predict
> whether its performance behavior. So I'd rather not apply it, unless
> either you can explain it to me in such a way that I can understand the
> improvement is, or you have benchmarks demonstrating the improvement.

I've timed it on whatsnew -sl in a directory containing 1000 new files, and
it's definitely a major speedup. I forget the timing numbers, but the
profile clearly went from quadratic to linear, as I claimed.

> I can see that you replace (+>+) with (:>:) using some clever tricks (which
> is definitely always a bonus), but that only affects the scaling when many,
> many changes are made to a single file, in which case this is almost
> certainly not a bottleneck (since we're diffing said file, which is a slow
> operation).

> The other change (and I think these two changes are separable?) is a switch
> from foldl' to foldr, and I must admit that these fold functions almost
> always confuse the heck out of me...

They are separable. I didn't check which of them was responsible for the
actual speedup in my test case, but both seemed sensible as a general
principle.

The basic goal is that whenever we use (+>+), we should make sure it
associates to the right, since, like (++), it is linear in its *left*
argument but constant time in its right argument. That's the effect of
switching to foldr here. I wouldn't expect any semantic difference between
the two folds in this particular case, because (+>+) is associative; it's
purely a performance change.

In addition, we're building a lazy data structure rather than a strict
value, so foldr is almost certainly the more natural thing to use here.

Ganesh
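
To make the association argument concrete, here is a minimal sketch using
ordinary lists, with (++) standing in for (+>+). This is not darcs code:
chunks, concatLeft and concatRight are names made up for the illustration,
and the only property assumed of (+>+) is the one stated above, that it
costs time proportional to its left argument.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for "one small batch of changes per file".
chunks :: Int -> [[Int]]
chunks n = [[i] | i <- [1 .. n]]

-- Left-nested: (((([] ++ a) ++ b) ++ c) ++ ...).
-- Every element of an early chunk has to pass through one (++) per later
-- chunk, so consuming the whole result is quadratic in the number of chunks.
concatLeft :: [[a]] -> [a]
concatLeft = foldl' (++) []

-- Right-nested: a ++ (b ++ (c ++ ...)).
-- Each chunk is the left argument of exactly one (++), so the total work is
-- linear, and the result is produced lazily, chunk by chunk.
concatRight :: [[a]] -> [a]
concatRight = foldr (++) []
```

Both functions return the same list because (++) is associative; only the
cost differs. The foldr version also works on an unbounded input, for
example take 3 (concatRight (map (\i -> [i]) [1 ..])), which is the sense in
which foldr is the natural fold when the result is a lazy structure.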
