On 15.06.2022 02:33, Branko Čibej wrote:
On 08.06.2022 15:49, Julian Foad wrote:
I did some profiling for SVN-4832 "Authz perf regression 1.9 -> 1.10"
<https://subversion.apache.org/issue/4832>.

I updated the issue with my findings, including a sample of the debug
timings output from my debug code.

TL;DR: if an authz file specifies a large number of ACLs and a large
number of users, it is slow. Nearly all the time is used by
svn_authz__parse() calling update_user_rights() this many times:

   (total number of ACLs in the authz file) x (total number of "concrete
users" mentioned in the authz file).

This bottleneck apparently was not present in the previous
implementation (svn 1.9), but I understand other cases were slow then.
The optimization in svn 1.10 apparently optimizes for other cases but
makes things worse in these cases.

As far as I can see the "update_user_rights" code looks functionally
trivial and the problem is just the accumulated time over a huge number
of iterations of it.

I am looking to see if there is anything that can be done about speeding
it up, whether algorithmic or local optimization. If anyone can lend
insight or a hand, I know there are some users who would appreciate it.

It's in subversion/libsvn_repos/authz_parse.c: update_user_rights()
called right at the end of expand_acl_callback(), which is called right
at the end of svn_authz__parse(). In pseudo-code:

   for each of all ACLs:
     for each of all users:
       update_user_rights()

That's all mine, isn't it. You're right, there's an O(n * m) in there and it can't be resolved unless we find a better way to do this. Local optimisations won't help much, we either need a better algorithm or we have to parallelize this code. Although note that even parallelizing will only speed up by a constant factor.

Hm. Thinking ... I surely never considered we'd have so many ACL's * users that this would be a problem. One assumption of the authz rewrite was that we'd parse the authz file once, then keep the result cached over many requests -- the parsed representation is immutable and can be read in parallel from any number of request threads.

Is this something we could fix by adding lazy evaluation? That is, to expand only when a lookup for a given user happens? Lazy evaluation would break that immutability promise and we'd have to add locking.

For the svn+ssh case described in the issue, we're only interested in rules for one user, which we know in advance. So another solution would be to give the parser an optional per-user filter.

-- Brane

Reply via email to