Hi Hans,

Thanks for the issue report! There seem to be two independent approaches that we could take to address the problem -- see below.
On 05.11.2016 00:19, Hans van Kranenburg wrote:
(please keep me in Cc: since I'm not on the list (yet))

Hi,

We recently switched from using a htgroup file with apache2 to using the <repo>/conf/authz file with AuthzSVNReposRelativeAccessFile, for performance reasons. The authz files are really simple, like this:

[/]
f...@example.com = rw
b...@example.com = rw
The performance benefit is ludicrous, but we ran into an issue: since we have nginx in front of the apache2+svn box, only a small number of connections are open between nginx and apache2. Those connections carry requests from different end users, and live relatively long. Inside mod_authz_svn the per-repo authz files are cached using a per-connection cache.
Up to here, everything should "just" work.
Since we dynamically create and remove repositories, and dynamically add and remove users, users that are added to a repository are not able to use the repository immediately. Also, our automated create-repo flow, which first creates the repo, checks whether it can be accessed correctly and then adds the user to the authorization list, triggers this problem in nearly 100% of cases.
Yes, the dynamic aspect is an issue.
Currently, I'm running a modified mod_authz_svn that simply disables caching (not causing any noticeable performance difference): http://paste.debian.net/plainh/0b2f4e20

I understand that it's possible to create really complex authorization structures inside a file for a single repository, so caching the result of parsing the file might be useful in some cases.
Looking at your workaround, I think the cleanest solution would be to introduce an AuthzSVNDisableCache option. Just grep in the same source file for AuthzSVNNoAuthWhenAnonymousAllowed and no_auth_when_anon_ok as an example for a boolean option. Use the new flag in get_access_conf() to optionally skip the cache access.
My suggestion about improving this situation would be to simply save the modification time of the authz file with the cached information, and check the mtime again on every request, invalidating the cache whenever we notice that the file has changed.
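To make the mtime proposal concrete, here is a minimal sketch in Python rather than in the module's C. It is not mod_authz_svn code: `MtimeCache` and `read_rules` are hypothetical names, and the "parser" only collects non-empty lines. The cache stores the modification time next to the parsed result and re-parses whenever a later stat shows a different mtime.

```python
import os


def read_rules(path):
    """Hypothetical stand-in for an authz parser: non-empty, stripped lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


class MtimeCache:
    """Cache parsed files and re-parse when a file's mtime changes."""

    def __init__(self, parse=read_rules):
        self._parse = parse
        self._entries = {}  # path -> (mtime in ns, parsed result)

    def get(self, path):
        # Stat before parsing: if the file changes in between, the stored
        # mtime is the older one, so the next call re-parses.
        mtime = os.stat(path).st_mtime_ns
        entry = self._entries.get(path)
        if entry is not None and entry[0] == mtime:
            return entry[1]
        parsed = self._parse(path)
        self._entries[path] = (mtime, parsed)
        return parsed
```

The cost is one stat call per request. The check can only notice changes that move the timestamp, so two writes within the filesystem's timestamp resolution may go undetected.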
svnserve already has code for that. The idea is to use the content SHA1 instead of timestamps. The latter might be racy or have too low a resolution during the repo creation phase, where an authz file could be created first and then filled shortly thereafter. SHA1s, on the other hand, are immediately available for in-repository authz, and even for external files, calculating the SHA1 is ~10x as fast as actually parsing the file. The plan is to introduce a new authz implementation in, hopefully, 1.10, and making the cache invalidation available everywhere is part of that.
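As an illustration of the content-hash idea only, here is a minimal Python sketch. It is not the svnserve implementation: `ContentHashCache` and `parse_rules` are hypothetical names, and the "parser" just collects non-empty lines. The cache re-parses only when the SHA1 of the file content differs from the one stored with the cached result.

```python
import hashlib


def parse_rules(data):
    """Hypothetical stand-in for an authz parser: non-empty, stripped lines."""
    return [line.strip() for line in data.decode("utf-8").splitlines()
            if line.strip()]


class ContentHashCache:
    """Cache parsed files and re-parse only when the content SHA1 changes."""

    def __init__(self, parse=parse_rules):
        self._parse = parse
        self._entries = {}  # path -> (SHA1 hex digest, parsed result)

    def get(self, path):
        with open(path, "rb") as f:
            data = f.read()
        digest = hashlib.sha1(data).hexdigest()
        entry = self._entries.get(path)
        if entry is not None and entry[0] == digest:
            return entry[1]  # content unchanged, reuse the parsed result
        parsed = self._parse(data)
        self._entries[path] = (digest, parsed)
        return parsed
```

Unlike a timestamp check, this notices a rewrite even when the file's mtime is unchanged, at the price of reading and hashing the file on every lookup.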
I can give it a shot to prepare a patch to do this, if wanted.
Patches are always welcome. I think adding AuthzSVNDisableCache should be easy enough for "giving it a shot".

-- Stefan^2.