Howdy,

I've got a program that needs to check out specific files at specific revisions. In this particular case a branch does not make sense. I've found that the performance of svn+ssh in this case is very poor.

I run the rough equivalent of:
svn update -r 2 file1 file2 file3 file4 file5
svn update -r 3 file6 file7 file8 file9 file10

Overall I have about 100 such files and 2 svn update calls. I've accomplished this with an xargs front end to svn so as not to overrun the command line.

If I use file:/// as the protocol, it runs in 3 seconds.
If I use svn+ssh:// as the protocol, it takes 53 seconds.
If I run an svn update -r 3 with no file arguments, it takes about 2 seconds.

I wrote a program directly against the svn C API to accept the file lists, perform authentication a single time, and then call svn_client_update3 (one call per revision batch). This still runs extremely slowly: still around 53 seconds.
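
Stripped down, the core of that program looks roughly like this (a minimal sketch; error handling and the auth-baton setup are elided, and update_batch is just my name for the wrapper):

    #include <svn_client.h>

    /* Sketch: update one batch of paths to a single revision with one
     * svn_client_update3() call.  The same svn_client_ctx_t (and thus the
     * same auth baton) is reused for every batch, so authentication only
     * happens once on my side. */
    static svn_error_t *
    update_batch(const apr_array_header_t *paths,   /* const char * elems */
                 svn_revnum_t rev,
                 svn_client_ctx_t *ctx,
                 apr_pool_t *pool)
    {
      apr_array_header_t *result_revs;
      svn_opt_revision_t revision;

      revision.kind = svn_opt_revision_number;
      revision.value.number = rev;

      return svn_client_update3(&result_revs, paths, &revision,
                                svn_depth_files,
                                FALSE,   /* depth_is_sticky */
                                TRUE,    /* ignore_externals */
                                FALSE,   /* allow_unver_obstructions */
                                ctx, pool);
    }

Even with all the paths for a revision batched into a single svn_client_update3 call like this, the wall-clock time matches the command-line client, which is what sent me into the client source.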

I suspect the problem is that each individual file is called out, locked, etc. Is there a way to batch these locks together or otherwise improve performance? Can I cause the ssh channel/RA session to be reused?

Perusing the source suggests that svn_client__update_internal will be called for each element of my paths array. Since an individual file lock/.svn directory write does not seem to be overly costly, I suspect the problem lies in the svn_client__open_ra_session_internal and svn_ra_do_update2 calls made from svn_client__update_internal. Is Subversion opening a new RA session for each of these files, at the cost of a fresh ssh connection and svnserve process on the remote end? Is there a way to force a single RA session across all the files at the API level, without writing my own svn_client__update_internal?
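
To make the question concrete, the shape I'd hope for at the RA layer is something like the sketch below. This is illustration only, not working update code: update_all_one_session, root_url, and the assumption that all targets live under one repository root are mine, and svn_delta_default_editor() is a no-op stand-in for the real working-copy update editor (svn_wc_get_update_editor3), so this would fetch and discard the updates rather than apply them.

    #include <svn_client.h>
    #include <svn_ra.h>
    #include <svn_delta.h>
    #include <svn_path.h>

    /* Sketch: open ONE RA session (one ssh connection, one remote
     * svnserve) and reuse it for every target via svn_ra_reparent(),
     * instead of opening a fresh session per file. */
    static svn_error_t *
    update_all_one_session(const char *root_url,
                           const apr_array_header_t *urls, /* const char * */
                           svn_revnum_t base_rev,
                           svn_revnum_t target_rev,
                           svn_client_ctx_t *ctx,
                           apr_pool_t *pool)
    {
      svn_ra_session_t *session;
      int i;

      SVN_ERR(svn_client_open_ra_session(&session, root_url, ctx, pool));

      for (i = 0; i < urls->nelts; i++)
        {
          const char *url = APR_ARRAY_IDX(urls, i, const char *);
          const char *anchor_url, *target;
          const svn_ra_reporter3_t *reporter;
          void *report_baton;

          /* Anchor on the file's parent directory, as the client does. */
          svn_path_split(url, &anchor_url, &target, pool);
          SVN_ERR(svn_ra_reparent(session, anchor_url, pool));

          SVN_ERR(svn_ra_do_update2(session, &reporter, &report_baton,
                                    target_rev, target, svn_depth_files,
                                    FALSE,  /* send_copyfrom_args */
                                    svn_delta_default_editor(pool), NULL,
                                    pool));

          /* Minimal report: everything is currently at base_rev. */
          SVN_ERR(reporter->set_path(report_baton, "", base_rev,
                                     svn_depth_infinity, FALSE, NULL, pool));
          SVN_ERR(reporter->finish_report(report_baton, pool));
        }

      return SVN_NO_ERROR;
    }

If there's a supported way to get svn_client_update3 (or some other public API) to drive a loop like that over a single session, that's exactly what I'm after.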

Any thoughts?

thanks!
   --eric

