David Roundy wrote:
> On Fri, Dec 14, 2007 at 10:04:57AM +0000, Simon Marlow wrote:
>> David Roundy wrote:
>>> Okay, it turns out that it was indeed bad strictness causing the
>>> trouble.  For some reason, I had made the PatchInfoAnd data type
>>> strict in both its components, which meant that every time we read a
>>> patch ID, we also needed to parse the patch itself.  Very foolish.
>>> There may be some further regressions (I'm still running an optimize
>>> with profiling enabled).  But darcs changes --last 10 (with profiling
>>> running) now takes me just a bit over a minute, and not too much
>>> memory (I don't recall the exact figure).
>> Ok, that is certainly an improvement:
>>
>>   $ time darcs2 cha --last=10
>>   ...
>>   60.60s real   59.83s user   0.21s system   99% darcs2 cha --last=10
>>
>> But this is still 1000 times slower than darcs1 for the same
>> operation.  Doesn't darcs changes just dump the contents of the
>> inventory?

> If you run darcs optimize first, this drops to 1s for me.  Still a bit
> slow, but not so bad (and that's most of why darcs1 is faster).

Ok, confirmed.

However, I never use optimize, and only use tag when I need to. This is mainly because I'm paranoid and I don't fully understand what optimize does, and perhaps also because I'd like to understand what goes wrong if you don't use it.

I guess I don't understand why optimize is exposed to the user at all. If there's an optimal state for the repository, why can't it be maintained in that state?

> The problem is that --last isn't tuned for efficiency at all; instead it
> reuses the same code that handles --from-tag, which could require
> reordering patches (--from-tag could), so there are O(N^2) operations
> going on, where N is the number of patches since the last
> known-to-be-in-order tag.
>
> This has never been a problem (that I'm aware of), and it simplifies the
> code, since we only have to deal with one case.  Reusing the same code
> also ensures that performance improvements for one command are leveraged
> by other commands.  Which comes down to: I'd rather not tune changes
> --last specifically for the case of 17k patches and no tags (or never
> running optimize).  But I could certainly be convinced, because we are
> indeed taking a very roundabout approach.  Then again, darcs1 uses
> exactly the same approach, so if we could gain another factor of ten
> without losing this abstraction I'd rather know how, particularly as the
> improvement would likely benefit all the other darcs commands.

Sure, code re-use is definitely a good thing, and I agree that optimising this operation in ways that darcs1 does not would be premature, given that there is still a factor of 20 difference between darcs1 and darcs2 unaccounted for (the 60s run is roughly 1000x darcs1, putting darcs1 at about 0.06s, so even the optimized 1s is still around 20x slower).

Thanks for the quick response to my feedback so far... things are definitely heading in the right direction!

Cheers,
        Simon

_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc
