-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thanks!  I had actually made the original change based on a general
preference for system libraries over custom code (there were no comments
justifying the custom code), but it's nice to see it comes with a
potential performance boost for certain configurations.

Out of curiosity, and at the risk of boring folks, I added a third method for comparison, which performs a System.Process.runCommand("cp -r fromdir todir") >>= waitForProcess, essentially testing against standard cp plus the overhead of the subprocess creation.
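
For anyone who wants to reproduce that third method, here is a minimal sketch of what it amounts to. The helper name copyViaCp is just for illustration, and the paths are spliced into the shell command unquoted, so it assumes they contain no spaces or shell metacharacters.

```haskell
import System.Exit (ExitCode(..))
import System.Process (runCommand, waitForProcess)

-- | Copy a directory tree by shelling out to cp and blocking until it exits.
-- Hypothetical helper: the paths go into the shell command unquoted, so this
-- sketch assumes they contain no spaces or shell metacharacters.
copyViaCp :: FilePath -> FilePath -> IO ExitCode
copyViaCp fromdir todir =
  runCommand (unwords ["cp", "-r", fromdir, todir]) >>= waitForProcess
```

The elapsed time for this method therefore includes starting a shell and the cp process, not just the copy itself.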

I also ran this on Mac OS X Tiger x86.

Updated results for Linux x86 reiserfs:

$ for X in 1 2 3 4 5 6 ; do rm -r copytestdir/small{2,3,4}/*; ./copytest copytestdir/small?; done
Copy of 998 files:
   via System.Directory.copyFile: 0.172032s
   via readFilePS >= writeFilePS: 0.164614s
   via   System.Process("cp -r"): 0.115983s
Copy of 998 files:
   via System.Directory.copyFile: 0.180447s
   via readFilePS >= writeFilePS: 0.176695s
   via   System.Process("cp -r"): 0.120259s
Copy of 998 files:
   via System.Directory.copyFile: 0.176865s
   via readFilePS >= writeFilePS: 0.165542s
   via   System.Process("cp -r"): 0.128305s
Copy of 998 files:
   via System.Directory.copyFile: 0.183365s
   via readFilePS >= writeFilePS: 0.176855s
   via   System.Process("cp -r"): 0.126601s
Copy of 998 files:
   via System.Directory.copyFile: 0.183976s
   via readFilePS >= writeFilePS: 0.17487s
   via   System.Process("cp -r"): 0.125522s
Copy of 998 files:
   via System.Directory.copyFile: 0.178875s
   via readFilePS >= writeFilePS: 0.171737s
   via   System.Process("cp -r"): 0.127577s

$ for X in 1 2 3 4 5 6 ; do rm -r copytestdir/big{2,3,4}/*; /kwq/unstable/copytest copytestdir/big?; done
Copy of 4 files:
   via System.Directory.copyFile: 8.204959s
   via readFilePS >= writeFilePS: 11.083989s
   via   System.Process("cp -r"): 13.116506s
Copy of 4 files:
   via System.Directory.copyFile: 7.864635s
   via readFilePS >= writeFilePS: 9.829027s
   via   System.Process("cp -r"): 11.634171s
Copy of 4 files:
   via System.Directory.copyFile: 7.08999s
   via readFilePS >= writeFilePS: 10.456167s
   via   System.Process("cp -r"): 15.665767s
Copy of 4 files:
   via System.Directory.copyFile: 8.318283s
   via readFilePS >= writeFilePS: 10.496049s
   via   System.Process("cp -r"): 10.557133s
Copy of 4 files:
   via System.Directory.copyFile: 5.262135s
   via readFilePS >= writeFilePS: 13.083001s
   via   System.Process("cp -r"): 11.247626s
Copy of 4 files:
   via System.Directory.copyFile: 7.754436s
   via readFilePS >= writeFilePS: 10.117483s
   via   System.Process("cp -r"): 13.9466s

The results for Mac OS X Tiger x86:

$ for X in 1 2 3 4 5 6 ; do rm -r copytestdir/small{2,3,4}/*; ./copytest copytestdir/small?; done
Copy of 998 files:
   via System.Directory.copyFile: 0.455989s
   via readFilePS >= writeFilePS: 0.435122s
   via   System.Process("cp -r"): 0.25074s
Copy of 998 files:
   via System.Directory.copyFile: 0.353156s
   via readFilePS >= writeFilePS: 0.367436s
   via   System.Process("cp -r"): 0.250192s
Copy of 998 files:
   via System.Directory.copyFile: 0.36179s
   via readFilePS >= writeFilePS: 0.450544s
   via   System.Process("cp -r"): 0.249152s
Copy of 998 files:
   via System.Directory.copyFile: 0.353648s
   via readFilePS >= writeFilePS: 0.366946s
   via   System.Process("cp -r"): 0.250617s
Copy of 998 files:
   via System.Directory.copyFile: 0.354281s
   via readFilePS >= writeFilePS: 0.417799s
   via   System.Process("cp -r"): 0.256529s
Copy of 998 files:
   via System.Directory.copyFile: 0.390028s
   via readFilePS >= writeFilePS: 0.363963s
   via   System.Process("cp -r"): 0.248796s

$ for X in 1 2 3 4 5 6 ; do rm -r copytestdir/big{2,3,4}/*; ./copytest copytestdir/big?; done
Copy of 4 files:
   via System.Directory.copyFile: 4.419766s
   via readFilePS >= writeFilePS: 1.27034s
   via   System.Process("cp -r"): 1.649713s
Copy of 4 files:
   via System.Directory.copyFile: 1.429918s
   via readFilePS >= writeFilePS: 1.40437s
   via   System.Process("cp -r"): 1.456039s
Copy of 4 files:
   via System.Directory.copyFile: 1.552916s
   via readFilePS >= writeFilePS: 1.219946s
   via   System.Process("cp -r"): 1.465697s
Copy of 4 files:
   via System.Directory.copyFile: 1.426121s
   via readFilePS >= writeFilePS: 1.32494s
   via   System.Process("cp -r"): 0.985613s
Copy of 4 files:
   via System.Directory.copyFile: 1.429138s
   via readFilePS >= writeFilePS: 1.319917s
   via   System.Process("cp -r"): 1.40475s
Copy of 4 files:
   via System.Directory.copyFile: 1.428278s
   via readFilePS >= writeFilePS: 1.471474s
   via   System.Process("cp -r"): 1.425157s


There's clearly some variability in the results that could be investigated, but I'm not sure it's worth the effort to fine-tune the benchmark since we're after "typical" results. Apart from the first big-file run on Mac OS X, copyFile never seems to be significantly slower than readFilePS >>= writeFilePS, and it can be significantly faster.

- -KQ


On 31 Jul 2007, at 10:46 AM, Jason Dagit wrote:

Fantastic analysis. Exactly what I was looking for. I'm now convinced :)

The icing on the cake would be to check this on various platforms but
I'm convinced enough already.

Thanks!
Jason

On 7/30/07, Kevin Quick <[EMAIL PROTECTED]> wrote:
On Sat, 28 Jul 2007 17:23:13 -0700, "Jason Dagit"
<[EMAIL PROTECTED]> wrote:
I think it would be nice to see a benchmark between the old and new
where there are thousands of tiny little files. Is there a noticeable
performance difference?  The reason for lots of little files is
because of the permission copying. I don't know that it would affect
the performance, but if the permission changes require a lot of
function calls then it could have a big impact.

Jason

Attached is copytest.hs.  It is given three directory names and
copies from the first to the second using System.Directory.copyFile,
then copies from the first to the third using readFilePS >>=
writeFilePS.  It then reports the elapsed time for each test.
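
For readers without the attachment, the shape of such a test can be sketched roughly as follows. This is not the attached copytest.hs: the function names are made up for illustration, and strict Data.ByteString I/O stands in for darcs's readFilePS and writeFilePS so that the sketch builds outside the darcs tree. It also assumes the source directory holds only regular files and that both destination directories already exist.

```haskell
import qualified Data.ByteString as B
import Control.Monad (forM_)
import Data.Time.Clock (NominalDiffTime, diffUTCTime, getCurrentTime)
import System.Directory (copyFile, listDirectory)
import System.FilePath ((</>))

-- | Run an action and return the wall-clock time it took.
timed :: IO () -> IO NominalDiffTime
timed act = do
  start <- getCurrentTime
  act
  end <- getCurrentTime
  return (end `diffUTCTime` start)

-- | Copy every entry of src into dst using the given per-file copier.
-- Assumes src holds only regular files and dst already exists.
copyAll :: (FilePath -> FilePath -> IO ()) -> FilePath -> FilePath -> IO ()
copyAll copier src dst = do
  names <- listDirectory src
  forM_ names $ \n -> copier (src </> n) (dst </> n)

-- | Stand-in for readFilePS >>= writeFilePS (an assumption, see above).
copyViaByteString :: FilePath -> FilePath -> IO ()
copyViaByteString from to = B.readFile from >>= B.writeFile to

-- | Time src -> dst1 via copyFile, then src -> dst2 via the ByteString copy.
compareCopies :: FilePath -> FilePath -> FilePath
              -> IO (NominalDiffTime, NominalDiffTime)
compareCopies src dst1 dst2 = do
  t1 <- timed (copyAll copyFile src dst1)
  t2 <- timed (copyAll copyViaByteString src dst2)
  return (t1, t2)
```

Note that copyFile also copies permissions, while the read-then-write variant does not, which is the difference the small-file test is meant to expose.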

To build (assuming it is located in the top-level of the darcs source
tree):

$ ghc --make -isrc copytest src/fpstring.c -lz

I used two input sets. The first directory set had 998 small files (50 bytes to 500 bytes) in the source directory. The second set had 4 big
files:

$ ls -lh copytestdir/big1
total 83M
-rw-rw-r-- 1 kquick kquick 16M Jul 30 22:15 bigfile1
-rw-rw-r-- 1 kquick kquick 14M Jul 30 22:15 bigfile2
-rw-rw-r-- 1 kquick kquick 30M Jul 30 22:15 bigfile3
-rw-rw-r-- 1 kquick kquick 25M Jul 30 22:16 bigfile4

Test runs:

$ for X in 1 2 3 4 5 6 ; do rm copytestdir/small{2,3}/*; ./copytest copytestdir/small?; done
Copy of 998 files:
   via System.Directory.copyFile: 0.160895s
   via readFilePS >= writeFilePS: 0.153626s
Copy of 998 files:
   via System.Directory.copyFile: 0.161436s
   via readFilePS >= writeFilePS: 0.153718s
Copy of 998 files:
   via System.Directory.copyFile: 0.163191s
   via readFilePS >= writeFilePS: 0.155526s
Copy of 998 files:
   via System.Directory.copyFile: 0.16112s
   via readFilePS >= writeFilePS: 0.156255s
Copy of 998 files:
   via System.Directory.copyFile: 0.162132s
   via readFilePS >= writeFilePS: 0.157913s
Copy of 998 files:
   via System.Directory.copyFile: 0.163213s
   via readFilePS >= writeFilePS: 0.157451s
$


$ for X in 1 2 3 4 5 6 ; do rm copytestdir/big{2,3}/*; ./copytest copytestdir/big?; done
Copy of 4 files:
   via System.Directory.copyFile: 10.418745s
   via readFilePS >= writeFilePS: 11.420843s
Copy of 4 files:
   via System.Directory.copyFile: 8.318079s
   via readFilePS >= writeFilePS: 16.533595s
Copy of 4 files:
   via System.Directory.copyFile: 8.384256s
   via readFilePS >= writeFilePS: 11.35574s
Copy of 4 files:
   via System.Directory.copyFile: 7.752898s
   via readFilePS >= writeFilePS: 14.43615s
Copy of 4 files:
   via System.Directory.copyFile: 8.029765s
   via readFilePS >= writeFilePS: 14.187116s
Copy of 4 files:
   via System.Directory.copyFile: 7.85273s
   via readFilePS >= writeFilePS: 12.406907s


From the above, I conclude that System.Directory.copyFile is better at
actually copying the file data, and that the overhead of copying
permissions (visible from copying small files) is quite small (about 8
us/file).
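
The per-file figure can be rechecked from the small-file timings above. The sketch below takes the six pairs of elapsed times as printed, averages each method, and divides the difference by the 998 files; the names are only for illustration.

```haskell
-- Elapsed seconds for the six small-file runs quoted above.
copyFileRuns, readWriteRuns :: [Double]
copyFileRuns  = [0.160895, 0.161436, 0.163191, 0.16112, 0.162132, 0.163213]
readWriteRuns = [0.153626, 0.153718, 0.155526, 0.156255, 0.157913, 0.157451]

-- | Arithmetic mean of a non-empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- | Mean extra cost of copyFile per file, in microseconds, over 998 files.
overheadMicros :: Double
overheadMicros = (mean copyFileRuns - mean readWriteRuns) / 998 * 1e6
```

Averaged over all six runs this comes to a little over 6 us/file, with individual runs ranging from roughly 4 to 8 us/file, so the overhead is small either way.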


--
--
Kevin Quick
quick at org after sparq

_______________________________________________
darcs-devel mailing list
darcs-devel@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-devel




-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iD8DBQFGsIh4t76lKrRL0ewRAgtPAJ9l+dByfhH3gy+xazwAVX9n887z3QCaAjHa
CAdzHVaUoihXE4Csd/g9QaI=
=2f27
-----END PGP SIGNATURE-----