On Wed, 26 Mar 2014 19:41:38 +0000 Philip Martin <philip.mar...@wandisco.com> wrote:
> Henrik Carlqvist <h...@poolhem.se> writes:
>
> > Would people hosting public svn repositories think that it would be
> > nice if some people using my tool would make one svn connection for
> > each revision in the repository?
>
> It's a user problem as well since making a request per revision doesn't
> scale well and will be very slow for large projects.

As the merge information you get from "svn log -g" is somewhat recursive,
it seems that the time grows exponentially with the number of revisions (or
perhaps rather with the number of merges). However, my own test version of
svn2cvsgraph, which calls svn once for each revision, does a pclose on the
svn call after reading the first log entry and the second log entry (which
might be a merge). With such a solution the time grows linearly with the
number of revisions, but svn older than 1.7 will write some
"svn: Write error: Broken pipe" messages to stderr.

I did a benchmark comparing a box running Slackware 14.1 with svn 1.7.16
and another box running Slackware 13.1 with svn 1.6.16. On these machines I
tested 3 versions of svn2cvsgraph:

svn2cvsgraph 1.2: makes a single call to "svn log -q -g" on the subversion
repository root.

svn2cvsgraph 2.0: makes one call to "svn log -q -g" for each branch (and
trunk).

svn2cvsgraph 2.1beta: makes one call to "svn log -q -g" for each revision;
the call is aborted with pclose to avoid wasting time on redundant
information.

The benchmarks were run on a test subversion repository which was loaded
from a 2.9 GB subversion dump file of an actual project repository. The
repository contains 13570 revisions and 160 branches. 206 merges have been
logged in the repository since it was upgraded to version 1.5 of
subversion. The test repository was accessed as file:/// on an NFS server.
Times were measured with the /usr/bin/time command.

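The per-revision approach of 2.1beta could be sketched roughly as follows.
This is only an illustration in Python, not the actual svn2cvsgraph source.
The function names, the use of "-r" to select a single revision, and the
assumption that log entries are delimited by lines consisting only of dashes
are mine; closing the pipe and waiting for svn stands in for pclose.

```python
import subprocess


def is_separator(line):
    # Assumption: svn log delimits entries with a line made only of dashes.
    stripped = line.strip()
    return len(stripped) >= 10 and set(stripped) == {"-"}


def first_entries(lines, limit=2):
    """Collect the first `limit` log entries, then stop reading input."""
    entries = []
    current = None  # None until the first separator has been seen
    for line in lines:
        if is_separator(line):
            if current:
                entries.append(current)
                if len(entries) == limit:
                    break  # ignore the remaining (redundant) merge output
            current = []
        elif current is not None and line.strip():
            current.append(line.strip())
    return entries


def log_head(url, revision, limit=2):
    """Run svn for one revision and abandon its output after `limit` entries."""
    # Hypothetical command line; the original mail only names "svn log -q -g".
    cmd = ["svn", "log", "-q", "-g", "-r", str(revision), url]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    try:
        return first_entries(proc.stdout, limit)
    finally:
        # Rough equivalent of pclose: close our end of the pipe and wait.
        # svn may then report a broken pipe on stderr, as noted above.
        proc.stdout.close()
        proc.wait()
```

A caller would invoke log_head(url, rev) once per revision with its own
repository URL; the second entry, if present, is the one that may describe a
merge.
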
These are the results:

subversion  svn2cvsgraph  time                      result
1.7.16      1.2           6:13.70elapsed 17%CPU     No merges found
1.7.16      2.1beta       7:20.73elapsed 55%CPU     All merges found
1.7.16      2.0           13:49.48elapsed 45%CPU    23 merges lost
1.6.16      2.1beta       52:53.63elapsed 81%CPU    All merges found
1.6.16      1.2           134:55:22elapsed 41%CPU   All merges found
1.6.16      2.0           135:14:04elapsed 41%CPU   All merges found

Subversion 1.7.16 seems a lot faster than 1.6.16. Even though the tests
were run on different machines and the Slackware 14.1 machine is slightly
faster than the Slackware 13.1 machine, I think that most of the difference
is because 1.7.16 gives less recursive merge information to wade through.

No merges are found when only doing "svn log -q -g" on the repository root
with version 1.7.16. This is expected, as the behavior of "svn log -g"
changed with version 1.6.17.

23 merges were lost with "svn log -q -g" on every branch with 1.7.16; this
is most likely because of issue 4477.

Doing "svn log -q -g" for each revision and aborting the output with pclose
is the fastest way to get correct results for both version 1.6.16 and
1.7.16. However, this assumes that the repository is accessed with
file://. Previously I have instead been using svn+ssh:// with svn 1.6.16,
and with one call for each branch or only for the repository root that
takes about 24 hours (compared with about 135 hours above). However, using
svn+ssh:// instead of file:// when doing one call for each revision would
be a lot slower.

regards Henrik