On Thu, Aug 19, 2010 at 1:54 PM, Johan Corveleyn <jcor...@gmail.com> wrote: > On Wed, Aug 18, 2010 at 3:44 PM, Johan Corveleyn <jcor...@gmail.com> wrote: >> FWIW, below is the output of a recent run with my instrumented trunk >> build. I've instrumented svn_diff_diff in diff.c and >> svn_diff__get_tokens in token.c. I checked out bigfile.xml from a >> repository, and edited a single line of it in my working copy. The two >> statements "Starting diff" and "Diff finished" are the first and last >> statements inside the svn_diff_diff function. These are numbers from >> the second run (any following runs show approximately the same >> numbers). >> >> $ time svn diff bigfile.xml >> Starting diff ... (entered svn_diff_diff in diff.c) >> -> calling svn_diff__get_tokens for svn_diff_datasource_original >> == svn_diff__get_tokens datasource_open: 0 usec >> == svn_diff__get_tokens while loop: 265625 usec >> calls to datasource_get_next_token: 62500 usec >> svn_diff__tree_insert_token: 171875 usec >> rest: 15625 usec >> <- done: 281250 usec >> -> calling svn_diff__get_tokens for svn_diff_datasource_modified >> == svn_diff__get_tokens datasource_open: 234375 usec >> == svn_diff__get_tokens while loop: 312500 usec >> calls to datasource_get_next_token: 62500 usec >> svn_diff__tree_insert_token: 171875 usec >> rest: 46875 usec >> <- done: 562500 usec >> -> calling svn_diff__lcs >> <- done: 0 usec >> -> calling svn_diff__diff >> <- done: 0 usec >> Diff finished in 875000 usec (875 millis) >> Index: bigfile.xml >> =================================================================== >> [snip] >> >> real 0m1.266s >> user 0m0.015s >> sys 0m0.031s [snip] > > A release build will also probably be faster, but I have no time to > make one and test it now. If I time my 1.6.5 build from tigris.org, > that's still a lot faster: > $ time svn diff bigfile.xml > [snip] > real 0m0.468s > user 0m0.015s > sys 0m0.031s > > Maybe that can be totally attributed to the build (release vs. debug), > or maybe 1.6.5 was faster at svn diff than trunk is?
Feh, I just redid my apr_time_now+printf profiling with a release build (of trunk), instead of a debug build, and that makes a *big* difference. Total time of the svn_diff_diff call is now down to ~300 ms. Still a lot slower than GNU diff (78 ms), but a lot better than with the debug build. That will teach me not to do performance tests with a debug build. Still, the challenge holds: why is svn diff ~4 times slower than GNU diff? Also, still puzzling to me: why does datasource_open take so long in the "modified" case? Now it takes twice as long as the while loop ... Now, I'll spend some time to do some real profiling with "very sleepy". In the meantime, if anybody has any brilliant ideas/insight, please let me know ... Numbers below (release build of tr...@yesterday on Windows, instrumented with apr_time_now and printf): $ time svn diff bigfile.xml Starting diff ... (entered svn_diff_diff in diff.c) -> calling svn_diff__get_tokens for svn_diff_datasource_original == svn_diff__get_tokens datasource_open: 0 usec == svn_diff__get_tokens while loop: 46875 usec calls to datasource_get_next_token: 15625 usec svn_diff__tree_insert_token: 0 usec rest: 0 usec <- done: 78125 usec -> calling svn_diff__get_tokens for svn_diff_datasource_modified == svn_diff__get_tokens datasource_open: 125000 usec == svn_diff__get_tokens while loop: 62500 usec calls to datasource_get_next_token: 31250 usec svn_diff__tree_insert_token: 31250 usec rest: 0 usec <- done: 218750 usec Diff finished in 296875 usec (296 millis) [snip] real 0m0.531s user 0m0.015s sys 0m0.015s Cheers, -- Johan