Howdy,

I've got a program that needs to check out specific files at specific revisions. In this particular case a branch does not make sense. I've found that the performance of svn+ssh in this case is very poor.

I run the rough equivalent of:
svn update -r 2 file1 file2 file3 file4 file5
svn update -r 3 file6 file7 file8 file9 file10

Overall I have about 100 such files and 2 svn update calls. I've accomplished this with an xargs front end to svn so as not to overrun the command line.

If I use file:/// as the protocol, it runs in 3 seconds.
If I use svn+ssh:// as the protocol, it takes 53 seconds.
If I run an svn update -r 3 with no file arguments, it takes about 2 seconds.

I wrote a program directly against the svn C API to accept the file lists, perform authentication a single time, and then call svn_client_update3 (one call per revision batch). This still runs extremely slowly: still around 53 seconds.
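
Stripped down, the core of that program looks roughly like this (a minimal sketch; error handling and the auth-baton setup are elided, and update_batch is just my name for the wrapper):

    #include <svn_client.h>

    /* Sketch: update one batch of paths to a single revision with one
     * svn_client_update3() call.  The same svn_client_ctx_t (and thus the
     * same auth baton) is reused for every batch, so authentication only
     * happens once on my side. */
    static svn_error_t *
    update_batch(const apr_array_header_t *paths,   /* const char * elems */
                 svn_revnum_t rev,
                 svn_client_ctx_t *ctx,
                 apr_pool_t *pool)
    {
      apr_array_header_t *result_revs;
      svn_opt_revision_t revision;

      revision.kind = svn_opt_revision_number;
      revision.value.number = rev;

      return svn_client_update3(&result_revs, paths, &revision,
                                svn_depth_files,
                                FALSE,   /* depth_is_sticky */
                                TRUE,    /* ignore_externals */
                                FALSE,   /* allow_unver_obstructions */
                                ctx, pool);
    }

Even with all the paths for a revision batched into a single svn_client_update3 call like this, the wall-clock time matches the command-line client, which is what sent me into the client source.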

I suspect the problem is that each individual file is called out, locked, etc. Is there a way to batch these locks together or otherwise improve performance? Can I cause the ssh channel/RA session to be reused?

Perusing the source suggests that svn_client__update_internal will be called for each element of my paths array. Since an individual file lock/.svn directory write does not seem to be overly costly, I suspect the problem lies in the svn_client__open_ra_session_internal and svn_ra_do_update2 calls made from svn_client__update_internal. Is Subversion opening a new RA session for each of these files, at the cost of a fresh ssh connection and svnserve process on the remote end? Is there a way to force a single RA session across all the files at the API level, without writing my own svn_client__update_internal?
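
To make the question concrete, the shape I'd hope for at the RA layer is something like the sketch below. This is illustration only, not working update code: update_all_one_session, root_url, and the assumption that all targets live under one repository root are mine, and svn_delta_default_editor() is a no-op stand-in for the real working-copy update editor (svn_wc_get_update_editor3), so this would fetch and discard the updates rather than apply them.

    #include <svn_client.h>
    #include <svn_ra.h>
    #include <svn_delta.h>
    #include <svn_path.h>

    /* Sketch: open ONE RA session (one ssh connection, one remote
     * svnserve) and reuse it for every target via svn_ra_reparent(),
     * instead of opening a fresh session per file. */
    static svn_error_t *
    update_all_one_session(const char *root_url,
                           const apr_array_header_t *urls, /* const char * */
                           svn_revnum_t base_rev,
                           svn_revnum_t target_rev,
                           svn_client_ctx_t *ctx,
                           apr_pool_t *pool)
    {
      svn_ra_session_t *session;
      int i;

      SVN_ERR(svn_client_open_ra_session(&session, root_url, ctx, pool));

      for (i = 0; i < urls->nelts; i++)
        {
          const char *url = APR_ARRAY_IDX(urls, i, const char *);
          const char *anchor_url, *target;
          const svn_ra_reporter3_t *reporter;
          void *report_baton;

          /* Anchor on the file's parent directory, as the client does. */
          svn_path_split(url, &anchor_url, &target, pool);
          SVN_ERR(svn_ra_reparent(session, anchor_url, pool));

          SVN_ERR(svn_ra_do_update2(session, &reporter, &report_baton,
                                    target_rev, target, svn_depth_files,
                                    FALSE,  /* send_copyfrom_args */
                                    svn_delta_default_editor(pool), NULL,
                                    pool));

          /* Minimal report: everything is currently at base_rev. */
          SVN_ERR(reporter->set_path(report_baton, "", base_rev,
                                     svn_depth_infinity, FALSE, NULL, pool));
          SVN_ERR(reporter->finish_report(report_baton, pool));
        }

      return SVN_NO_ERROR;
    }

If there's a supported way to get svn_client_update3 (or some other public API) to drive a loop like that over a single session, that's exactly what I'm after.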

Any thoughts?

thanks!
   --eric

