Hi again!

Petr Rockai <[email protected]> writes:
> many-files: 20 directories, 2000 files each
> many-dirs: 20 directories, 20 subdirs each, 100 files per subdir
> many-files darcs wh: 2.7s
> many-files darcs-diff: 1.1s
> many-files git diff: .15s
darcs-diff now: .25s

> many-dirs darcs: 2.5s
> many-dirs darcs-diff: .75
> many-dirs git: .2s
darcs-diff now: .25s

However, this involves a slight hack: I had to circumvent the
"modificationTime" library call, which incurs significant overhead
thanks to its use of unsafePerformIO (yuck). To do that, I have lifted
the wrapper around System.Posix.Internal from darcs (which is, btw.,
only used on win32, which is maybe a mistake... it might make sense to
use it on unix systems as well, maybe with the addition of an #ifdef'd
createLink implementation...).

Anyhow, I am pretty happy about how things perform now on this front.
There are some changes that still need to be done, e.g. automatic *and*
fast index updates, but that won't be very painful. After that, I'll
look into two things:

1. Implementing a TreeIO monad, which will in turn enable an
   implementation of the WritableDirectory monad class used in Darcs.
2. A packed format for repositories.

Other than those, I still need to do a lot of cleaning up of the API and
come up with some ways to verify that the code does what it's supposed
to do. Currently I just check the benchmarks and that it produces the
diffs correctly (manually, of course). However, it shouldn't be (too)
hard to come up with some unit tests.

I'm starting to think it might be worth getting at least partial support
for this into darcs 2.3. (I'm wondering if Kowey will require a sunset
procedure for SlurpDirectory though, if we really take this route...)

You can still find the library and documentation at the respective URLs:

- http://repos.mornfall.net/hashed-storage/
- http://repos.mornfall.net/hashed-storage/dist/doc/html/hashed-storage/

Yours,
   Petr.

PS: Since there is a certain interest in slow filesystems, I have tried
the many-files benchmark on my NFS home at university (that'd be gigabit
ethernet), with the result: darcs-diff 6.1s, git diff 6.1s.
I guess it can't get much better than that; stat-ing 40k files over NFS
is going to cost, no matter what.

PPS: All times are with hot cache (and have always been).

-- 
Peter Rockai | me()mornfall!net | prockai()redhat!com
http://blog.mornfall.net | http://web.mornfall.net

"In My Egotistical Opinion, most people's C programs should be
 indented six feet downward and covered with dirt."
     -- Blair P. Houghton on the subject of C program indentation

_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
