Bug#1057126: "AH03490: scoreboard is full" after nightly maintenance
Hi! Stefan Fritsch: Some processes are in "stopping" state but are not dying. They accumulate until the scoreboard is full. First you should verify with ps if the processes with the PIDs in the "(old gen)" lines still exist. If not, it is a bug in apache itself (probably in mpm_event). They do indeed still exist in the process table. Just looking at the command lines doesn't tell me much as they are all apache2 forks/threads, but see below: # apachectl status 2>&1 |awk '/old gen/ { print "tr \"\\0\" \" \" < /proc/" $2 "/cmdline; echo" }' | sh /usr/sbin/apache2 -k start /usr/sbin/apache2 -k start /usr/sbin/apache2 -k start /usr/sbin/apache2 -k start /usr/sbin/apache2 -k start /usr/sbin/apache2 -k start If they do, one needs to find out why they are not dying. This is likely the fault of some module. The server is running Trac with its default configuration, there might be something that has changed there? # dpkg -l trac Desired=Unknown/Install/Remove/Purge/Hold | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Tri |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad) ||/ Name Version Architecture Description +++-==---=== ii trac 1.6-2all Enhanced wiki and issue tra The processes do indeed seem to be from Trac, here is one example: # ls -l /proc/210869/fd total 0 lr-x-- 1 root root 64 dec 14 09:19 0 -> /dev/null l-wx-- 1 root root 64 dec 14 09:19 1 -> /dev/null lrwx-- 1 root root 64 dec 14 09:19 10 -> 'anon_inode:[eventpoll]' lr-x-- 1 root root 64 dec 14 09:19 11 -> 'pipe:[35832724]' l-wx-- 1 root root 64 dec 14 09:19 12 -> 'pipe:[35832724]' l-wx-- 1 root root 64 dec 14 09:19 14 -> /srv/trac/log/trac.log lrwx-- 1 root root 64 dec 14 09:19 18 -> /srv/trac/db/trac.db l-wx-- 1 root root 64 dec 14 09:19 2 -> '/var/log/apache2/error.log.1 (deleted)' lrwx-- 1 root root 64 dec 14 09:19 21 -> /srv/trac/db/trac.db lrwx-- 1 root root 64 dec 14 09:19 3 -> 'socket:[35833027]' lr-x-- 1 root root 64 dec 14 09:19 5 -> 'pipe:[35832707]' l-wx-- 1 root root 64 dec 14 09:19 6 -> 'pipe:[35832707]' l-wx-- 1 root root 64 dec 14 09:19 7 -> /var/log/apache2/other_vhosts_access.log l-wx-- 1 root root 64 dec 14 09:19 8 -> '/var/log/apache2/access.log.1 (deleted)' l-wx-- 1 root root 64 dec 14 09:19 9 -> '/var/log/apache2/access.log.1 (deleted)' Should I reassign the bug to Trac instead? -- \\// Peter - http://www.softwolves.pp.se/
Bug#1057126: "AH03490: scoreboard is full" after nightly maintenance
Hi, Am 30.11.23 um 09:05 schrieb Peter Krefting: we are experiencing that the Apache httpd locks up, filling the error.log with errors after the nightly maintenance (not every night, though): [Wed Nov 29 00:00:01.922731 2023] [mpm_event:notice] [pid 62346:tid 139841215223680] AH00489: Apache/2.4.57 (Debian) mod_wsgi/4.9.4 Python/3.11 configured -- resuming normal operations [Wed Nov 29 00:00:01.922790 2023] [core:notice] [pid 62346:tid 139841215223680] AH00094: Command line: '/usr/sbin/apache2' [Wed Nov 29 00:00:03.924683 2023] [mpm_event:error] [pid 62346:tid 139841215223680] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit. [Wed Nov 29 00:00:04.925780 2023] [mpm_event:error] [pid 62346:tid 139841215223680] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit. (etc) I set up a nightly job to mail me the output of "apachectl status", and this is the contents reported at Nov 29 00:00:00, i.e. just before the nightly maintenance: Some processes are in "stopping" state but are not dying. They accumulate until the scoreboard is full. First you should verify with ps if the processes with the PIDs in the "(old gen)" lines still exist. If not, it is a bug in apache itself (probably in mpm_event). If they do, one needs to find out why they are not dying. This is likely the fault of some module. Third-party modules that do not come with apache itself are more likely to be the culprit. Since you are using mod_wsgi, that is a likely candidate. Look in the error log for the last log lines of the processes with those PIDs. Maybe that gives some hint. Or maybe mod_wsgi writes some logs itself, I am not familiar with that one. Apache Server Status for localhost (via ::1) Server Version: Apache/2.4.57 (Debian) mod_wsgi/4.9.4 Python/3.11 Server MPM: event Server Built: 2023-04-13T03:26:51 __ Current Time: Wednesday, 29-Nov-2023 00:00:01 CET Restart Time: Tuesday, 21-Nov-2023 08:51:55 CET Parent Server Config. Generation: 8 Parent Server MPM Generation: 7 Server uptime: 7 days 15 hours 8 minutes 6 seconds Server load: 0.01 0.02 0.00 Total accesses: 34860 - Total Traffic: 717.8 MB - Total Duration: 2289464 CPU Usage: u297.08 s30.61 cu.01 cs.07 - .0497% CPU load .0529 requests/sec - 1141 B/second - 21.1 kB/request - 65.676 ms/request 2 requests currently being processed, 48 idle workers Slot PID Stopping Connections Threads Async connections total accepting busy idle writing keep-alive closing 0 62348 yes (old gen) 0 no 0 0 0 0 0 1 62350 yes (old gen) 0 no 0 0 0 0 0 2 66497 yes (old gen) 0 no 0 0 0 0 0 3 66498 yes (old gen) 0 no 0 0 0 0 0 4 73089 yes (old gen) 0 no 0 0 0 0 0 5 73090 yes (old gen) 0 no 0 0 0 0 0 6 79644 yes (old gen) 0 no 0 0 0 0 0 7 79645 yes (old gen) 0 no 0 0 0 0 0 8 86126 yes (old gen) 0 no 0 0 0 0 0 9 86127 yes (old gen) 0 no 0 0 0 0 0 10 92669 yes (old gen) 0 no 0 0 0 0 0 11 92670 yes (old gen) 0 no 0 0 0 0 0 12 99203 yes (old gen) 0 no 0 0 0 0 0 13 99204 yes (old gen) 0 no 0 0 0 0 0 14 105761 no 0 yes 0 25 0 0 0 15 105762 no 0 yes 2 23 0 0 0 Sum 16 14 0 2 48 0 0 0 .._W W___ Scoreboard Key: "_" Waiting for Connection, "S" Starting up, "R" Reading Request, "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup, "C" Closing connection, "L" Logging, "G" Gracefully finishing, "I" Idle cleanup of worker, "." Open slot with no current process '/usr/bin/lynx -dump http://localhost:80/server-status' failed. Maybe you need to install a package providing www-browser or you need to adjust the APACHE_LYNX variable in /etc/apache2/envvars Cheers, Stefan
Bug#1057126: "AH03490: scoreboard is full" after nightly maintenance
Package: apache2 Version: 2.4.57-2 Severity: normal Dear Maintainer, we are experiencing that the Apache httpd locks up, filling the error.log with errors after the nightly maintenance (not every night, though): [Wed Nov 29 00:00:01.922731 2023] [mpm_event:notice] [pid 62346:tid 139841215223680] AH00489: Apache/2.4.57 (Debian) mod_wsgi/4.9.4 Python/3.11 configured -- resuming normal operations [Wed Nov 29 00:00:01.922790 2023] [core:notice] [pid 62346:tid 139841215223680] AH00094: Command line: '/usr/sbin/apache2' [Wed Nov 29 00:00:03.924683 2023] [mpm_event:error] [pid 62346:tid 139841215223680] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit. [Wed Nov 29 00:00:04.925780 2023] [mpm_event:error] [pid 62346:tid 139841215223680] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit. (etc) I set up a nightly job to mail me the output of "apachectl status", and this is the contents reported at Nov 29 00:00:00, i.e. just before the nightly maintenance: Apache Server Status for localhost (via ::1) Server Version: Apache/2.4.57 (Debian) mod_wsgi/4.9.4 Python/3.11 Server MPM: event Server Built: 2023-04-13T03:26:51 __ Current Time: Wednesday, 29-Nov-2023 00:00:01 CET Restart Time: Tuesday, 21-Nov-2023 08:51:55 CET Parent Server Config. Generation: 8 Parent Server MPM Generation: 7 Server uptime: 7 days 15 hours 8 minutes 6 seconds Server load: 0.01 0.02 0.00 Total accesses: 34860 - Total Traffic: 717.8 MB - Total Duration: 2289464 CPU Usage: u297.08 s30.61 cu.01 cs.07 - .0497% CPU load .0529 requests/sec - 1141 B/second - 21.1 kB/request - 65.676 ms/request 2 requests currently being processed, 48 idle workers Slot PID Stopping Connections Threads Async connections total accepting busy idle writing keep-alive closing 0 62348 yes (old gen) 0 no 0 0 0 0 0 1 62350 yes (old gen) 0 no 0 0 0 0 0 2 66497 yes (old gen) 0 no 0 0 0 0 0 3 66498 yes (old gen) 0 no 0 0 0 0 0 4 73089 yes (old gen) 0 no 0 0 0 0 0 5 73090 yes (old gen) 0 no 0 0 0 0 0 6 79644 yes (old gen) 0 no 0 0 0 0 0 7 79645 yes (old gen) 0 no 0 0 0 0 0 8 86126 yes (old gen) 0 no 0 0 0 0 0 9 86127 yes (old gen) 0 no 0 0 0 0 0 10 92669 yes (old gen) 0 no 0 0 0 0 0 11 92670 yes (old gen) 0 no 0 0 0 0 0 12 99203 yes (old gen) 0 no 0 0 0 0 0 13 99204 yes (old gen) 0 no 0 0 0 0 0 14 105761 no 0 yes 0 25 0 0 0 15 105762 no 0 yes 2 23 0 0 0 Sum 16 14 0 2 48 0 0 0 .._W W___ Scoreboard Key: "_" Waiting for Connection, "S" Starting up, "R" Reading Request, "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup, "C" Closing connection, "L" Logging, "G" Gracefully finishing, "I" Idle cleanup of worker, "." Open slot with no current process '/usr/bin/lynx -dump http://localhost:80/server-status' failed. Maybe you need to install a package providing www-browser or you need to adjust the APACHE_LYNX variable in /etc/apache2/envvars The server is fairly lightly loaded, running Trac 1.6. While looking for information on the subject, I found this thread on Reddit, noting the exact same problem, also being tied to the nightly maintenance: https://www.reddit.com/r/debian/comments/15stmn7 Weird thing is, it happens right after a reload of apache that happens at midnight. So it's not a case of heavy usage (this site is barely used at all during the day, never mind at night). In fact there were no accesses between the time it reloaded the service and the error started. -- Package-specific info: -- System Information: Debian Release: 12.2 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 6.1.0-13-amd64 (SMP w/2 CPU threads; PREEMPT) Init: systemd (via /run/systemd/system) LSM: AppArmor: enabled Versions of packages apache2 depends on: ii apache2-bin2.4.57-2 ii apache2-data 2.4.57-2 ii apache2-utils 2.4.57-2 ii init-system-helpers1.65.2 ii lsb-base 11.6 ii media-types10.0.0 ii perl 5.36.0-7 ii procps 2:4.0.2-3 ii sysvinit-utils [lsb-base] 3.06-4 Versions of packages apache2 recommends: ii ssl-cert 1.1.2 Versions of packages apache2 suggests: pn apache2-doc pn apache2-suexec-pristine | apache2-suexec-custom ii lynx [www-browser] 2.9.0dev.12-1 Versions of packages apache2-bin depends on: ii libapr1 1.7.2-3 ii libaprutil1 1.6.3-1 ii