Den fre 22 dec. 2023 kl 12:22 skrev radek.krotil.siemens.com via users < users@subversion.apache.org>:
> Hi SVN Community. > > > > I’d like to consult a problem with occasional high CPU utilization by > mod_authz_svn. Please, do CC me and my colleague in the response, because > we are subscribed to the list. > > > > Our Java application uses Subversion as a primary data storage and the SVN > data are being accessed via Apache HTTPD mod_dav_svn and also by svnserve > server on top of the same repository. > > > > We are facing a problem in one of our productive environments hosted in > AWS. The server is running RHEL 8.8, Apache 2.4.37 with SVN 1.14.1 > binaries. Occasionally, the Apache HTTPD is acting strange, utilizing 100% > CPU for periods of minutes to tens of minutes. This situation happens > several times a week and causes delayed responses from the SVN server up to > several minutes. Similar behavior is visible on several httpd server > processes (mpm_event module is used), where each of them has tens of active > requests being processed. When this CPU spike occurs, there several Apache > HTTPD server processes with tens of threads busy in the following code > related to the Subversion ACL. > > > > The access file is quite big. It contains about 20000 group definitions > with roughly 10000 users assigned to them and about 35000 authorization > rules for 15000 paths, and it’s size is over 5 MB. The access file is > getting changed 100 times a day, but the CPU spikes do not correlate with > the CPU spikes. > > > > Initially we suspected filesystem issues, but we were not able to confirm > them. When the spike occurs, CPU I/O wait goes to zero and all the CPU is > consumed in user space. > > > > We are looking for suggestions how to further troubleshoot the problem, or > possible confirmation that this is a Subversion bug or inefficiency. > > > > Thank you, > > Radek Krotil > > Siemens Digital Industries Software > > > > > > *Pstack output example* > > Thread 127 (Thread 0x7f0003fcf700 (LWP 2934729)): > > > > #0 0x00007f0049b79b60 in svn_authz.get_acl_access () from > target:/lib64/libsvn_repos-1.so.0 > > #1 0x00007f0049b7a864 in update_user_rights () from > target:/lib64/libsvn_repos-1.so.0 > > #2 0x00007f0048e35d06 in hash_do_callback () from > target:/lib64/libsvn_subr-1.so.0 > > #3 0x00007f0058bc8ebe in apr_hash_do () from target:/lib64/libapr-1.so.0 > > #4 0x00007f0048e35d6a in svn_iter_apr_hash () from > target:/lib64/libsvn_subr-1.so.0 > > #5 0x00007f0049b7a7b4 in expand_acl_callback () from > target:/lib64/libsvn_repos-1.so.0 > > #6 0x00007f0048e35e41 in svn_iter_apr_array () from > target:/lib64/libsvn_subr-1.so.0 > > #7 0x00007f0049b7c109 in svn_authz.parse () from > target:/lib64/libsvn_repos-1.so.0 > > #8 0x00007f0049b78f74 in svn_repos_authz_read4 () from > target:/lib64/libsvn_repos-1.so.0 > > #9 0x00007f004845e4a3 in get_access_conf () from > target:/etc/httpd/modules/mod_authz_svn.so > > #10 0x00007f004845e705 in req_check_access () from > target:/etc/httpd/modules/mod_authz_svn.so > > #11 0x00007f004845efd6 in auth_checker () from > target:/etc/httpd/modules/mod_authz_svn.so > > #12 0x0000561c8b0724c8 in ap_run_auth_checker () > > #13 0x0000561c8b074a9c in ap_process_request_internal () > > #14 0x0000561c8b093a70 in ap_process_async_request () > > #15 0x0000561c8b08fed0 in ap_process_http_connection () > > #16 0x0000561c8b0862b8 in ap_run_process_connection () > > #17 0x00007f004d6b9a47 in process_socket () from > target:/etc/httpd/modules/mod_mpm_event.so > > #18 0x00007f004d6ba3ea in worker_thread () from > target:/etc/httpd/modules/mod_mpm_event.so > > #19 0x00007f005899a1ca in start_thread () from > target:/lib64/libpthread.so.0 > > #20 0x00007f0058402e73 in clone () from target:/lib64/libc.so.6 > > > > > > *TOP output looks as follows* > > top - 13:00:01 up 15 days, 7:09, 0 users, load average: 129.31, 100.24, > 53.38 > > Tasks: 241 total, 7 running, 233 sleeping, 0 stopped, 1 zombie > > %Cpu(s): 95.6 us, 3.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.7 hi, 0.0 > si, 0.0 st > > MiB Mem : 126914.0 total, 55179.2 free, 20765.8 used, 50969.0 buff/cache > > MiB Swap: 0.0 total, 0.0 free, 0.0 used. 104540.6 avail Mem > > > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ > COMMAND > > 233319 apache 20 0 5818164 2.9g 13844 S 456.2 2.3 183:35.14 > httpd > > 233318 apache 20 0 5751168 2.7g 13844 S 443.8 2.2 127:53.12 > httpd > > 232848 apache 20 0 5226028 2.0g 13844 S 218.8 1.6 46:08.83 > httpd > > 232849 apache 20 0 5291384 2.1g 13804 S 168.8 1.7 68:34.92 > httpd > > 232847 apache 20 0 5291628 1.7g 13908 S 137.5 1.4 39:14.61 > httpd > > > > *And Apache server status clearly shows the number of “stuck” requests* > > Apache Server Status for localhost (via ::1) > > Server Version: Apache/2.4.37 (Red Hat Enterprise Linux) OpenSSL/1.1.1k > SVN/1.14.1 > > Server MPM: event > > Server Built: Aug 17 2023 10:52:25 > ------------------------------ > > Current Time: Monday, 18-Dec-2023 13:00:01 UTC > > Restart Time: Saturday, 16-Dec-2023 05:23:02 UTC > > Parent Server Config. Generation: 2 > > Parent Server MPM Generation: 1 > > Server uptime: 2 days 7 hours 36 minutes 58 seconds > > Server load: 129.31 100.24 53.38 > > Total accesses: 212201 - Total Traffic: 1.9 GB - Total Duration: 67348142 > > CPU Usage: u27647.8 s220.14 cu338.36 cs31.99 - 14.1% CPU load > > 1.06 requests/sec - 10.0 kB/second - 9.4 kB/request - 317.379 ms/request > > 112 requests currently being processed, 0 workers gracefully restarting, > 138 idle workers > > *Slot* > > *PID* > > *Stopping* > > *Connections* > > *Threads* > > *Async connections* > > *total* > > *accepting* > > *busy* > > *graceful* > > *idle* > > *writing* > > *keep-alive* > > *closing* > > 0 > > 232847 > > no > > 12 > > yes > > 11 > > 0 > > 39 > > 0 > > 0 > > 1 > > 1 > > 232848 > > no > > 11 > > yes > > 11 > > 0 > > 39 > > 0 > > 0 > > 0 > > 3 > > 232849 > > no > > 15 > > yes > > 15 > > 0 > > 35 > > 0 > > 0 > > 0 > > 5 > > 233318 > > no > > 40 > > yes > > 37 > > 0 > > 13 > > 0 > > 0 > > 2 > > 6 > > 233319 > > no > > 37 > > yes > > 38 > > 0 > > 12 > > 0 > > 0 > > 0 > > Sum > > 5 > > 0 > > 115 > > > > 112 > > 0 > > 138 > > 0 > > 0 > > 3 > > W___W____W_____________WWW__W_______W__W__W_W_____W_W_____WW_W__ > > W__W____W____________W______WW______............................ > > ......................_W__W____W_W____WWW__WW_W____W___W_____WW_ > > _W______..................................................WWWW__ > > WWWWW_WWWW_WWW_WWWWWW_W_WWW__W_W_WWW__WWWWWWWWWWW__WWWW_WW_W_W_W > > WW__WW_WW_W_WWWWW_WWWWWWWWWWWW.................................. > > ................................................................ > > ................................................................ > > ................................................................ > > ................................................................ > > ................................................................ > > ................................................................ > > ................................ > > Scoreboard Key: > "*_*" Waiting for Connection, "*S*" Starting up, "*R*" Reading Request, > "*W*" Sending Reply, "*K*" Keepalive (read), "*D*" DNS Lookup, > "*C*" Closing connection, "*L*" Logging, "*G*" Gracefully finishing, > "*I*" Idle cleanup of worker, "*.*" Open slot with no current process > > *Srv* > > *PID* > > *Acc* > > *M* > > *CPU* > > *SS* > > *Req* > > *Dur* > > *Conn* > > *Child* > > *Slot* > > *Client* > > *Protocol* > > *VHost* > > *Request* > > *0-1* > > 232847 > > 0/20/27 > > *W* > > 1796.47 > > 183 > > 0 > > 77128 > > 0.0 > > 0.25 > > 0.28 > > 10.X.X.239 > > http/1.1 > > xyz:443 > > PROPFIND /repo/.polarion/user-management/users/… > > *0-1* > > 232847 > > 0/23/32 > > *W* > > 1761.15 > > 201 > > 0 > > 190003 > > 0.0 > > 0.18 > > 0.23 > > 10.X.X.239 > > http/1.1 > > xyz:443 > > GET /repo/!svn/bc/1731720/External/… > > *0-1* > > 232847 > > 0/24/35 > > *W* > > 1752.23 > > 182 > > 0 > > 81788 > > 0.0 > > 0.31 > > 0.36 > > 10.X.X.239 > > http/1.1 > > xyz:443 > > GET /repo/!svn/bc/1743998/External/… > > > > > Hi, There are other reports: https://lists.apache.org/thread/4vzo7tddyv556vs372kxb4hz83lomcdr https://lists.apache.org/thread/t9j8psdrkflfyb9rfb69dkq24lf26t84 Can you look at this to see if this is similar to your problem? I don't know if anything was ever done to this, as far as I understand this is caused by the parsing of the authorization file. Kind regards, Daniel