Bug#1057126: Bug#1067104: Acknowledgement (server stalls: AH00046: child process 2876749 still did not exit, sending a SIGKILL)
On 2024-03-21 13:12, Yaroslav Halchenko wrote:

> FWIW here is a dirty workaround script I just crafted with chatgpt to
> monitor/restart apache2 as soon as it starts happening

My workaround is simpler: I have this line in root's crontab:

5 * * * * curl --silent --max-time 5 --output /dev/null http://localhost/trac/ || systemctl restart apache2

It seems to restart Apache once every 5-8 days, according to the notices I see from Zabbix. The frequency might very well be related to the number of accesses to the server.

-- 
\\// Peter - http://www.softwolves.pp.se/
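For readers who prefer a script to a crontab one-liner, the same probe-then-restart idea can be sketched in Python. This is only an illustration: the function names `probe`, `watchdog` and `restart_apache` are made up here, and only the URL, the 5-second timeout and the `systemctl restart apache2` command come from the message above. Like curl without `--fail`, the sketch treats any HTTP response, even an error status, as a sign of life.

```python
import http.client
import subprocess
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return True if the server answers at all within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, just with an error status; it is not hung.
        return True
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, reset, or a malformed reply.
        return False


def watchdog(is_alive, restart):
    """Call restart() if is_alive() reports failure; return True if restarted."""
    if is_alive():
        return False
    restart()
    return True


def restart_apache():
    """Restart Apache through systemd (requires root, as in the crontab line)."""
    subprocess.run(["systemctl", "restart", "apache2"], check=True)


# Hypothetical usage from a root cron job:
#   watchdog(lambda: probe("http://localhost/trac/"), restart_apache)
```

Passing the probe and the restart action in as callables keeps the decision logic separate from the side effects, so it can be exercised without touching a real server.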
Bug#1057126: "AH03490: scoreboard is full" after nightly maintenance
Hi!

Stefan Fritsch:

> Some processes are in "stopping" state but are not dying. They accumulate
> until the scoreboard is full. First you should verify with ps if the
> processes with the PIDs in the "(old gen)" lines still exist. If not, it
> is a bug in apache itself (probably in mpm_event).

They do indeed still exist in the process table. Just looking at the command lines doesn't tell me much, as they are all apache2 forks/threads, but see below:

# apachectl status 2>&1 |awk '/old gen/ { print "tr \"\\0\" \" \" < /proc/" $2 "/cmdline; echo" }' | sh
/usr/sbin/apache2 -k start
/usr/sbin/apache2 -k start
/usr/sbin/apache2 -k start
/usr/sbin/apache2 -k start
/usr/sbin/apache2 -k start
/usr/sbin/apache2 -k start

> If they do, one needs to find out why they are not dying. This is likely
> the fault of some module.

The server is running Trac with its default configuration; might something have changed there?

# dpkg -l trac
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Tri
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==---===
ii trac 1.6-2 all Enhanced wiki and issue tra

The processes do indeed seem to be from Trac. Here is one example:

# ls -l /proc/210869/fd
total 0
lr-x-- 1 root root 64 dec 14 09:19 0 -> /dev/null
l-wx-- 1 root root 64 dec 14 09:19 1 -> /dev/null
lrwx-- 1 root root 64 dec 14 09:19 10 -> 'anon_inode:[eventpoll]'
lr-x-- 1 root root 64 dec 14 09:19 11 -> 'pipe:[35832724]'
l-wx-- 1 root root 64 dec 14 09:19 12 -> 'pipe:[35832724]'
l-wx-- 1 root root 64 dec 14 09:19 14 -> /srv/trac/log/trac.log
lrwx-- 1 root root 64 dec 14 09:19 18 -> /srv/trac/db/trac.db
l-wx-- 1 root root 64 dec 14 09:19 2 -> '/var/log/apache2/error.log.1 (deleted)'
lrwx-- 1 root root 64 dec 14 09:19 21 -> /srv/trac/db/trac.db
lrwx-- 1 root root 64 dec 14 09:19 3 -> 'socket:[35833027]'
lr-x-- 1 root root 64 dec 14 09:19 5 -> 'pipe:[35832707]'
l-wx-- 1 root root 64 dec 14 09:19 6 -> 'pipe:[35832707]'
l-wx-- 1 root root 64 dec 14 09:19 7 -> /var/log/apache2/other_vhosts_access.log
l-wx-- 1 root root 64 dec 14 09:19 8 -> '/var/log/apache2/access.log.1 (deleted)'
l-wx-- 1 root root 64 dec 14 09:19 9 -> '/var/log/apache2/access.log.1 (deleted)'

Should I reassign the bug to Trac instead?

-- 
\\// Peter - http://www.softwolves.pp.se/
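The check performed by the awk pipeline in this message (take the PID from every "(old gen)" row of `apachectl status` and read its command line from /proc) can also be sketched in Python. The helper names `old_gen_pids` and `cmdline` are hypothetical, and the row layout (slot number, PID, then "yes (old gen)") is assumed from the status output quoted in this thread.

```python
import re
from pathlib import Path
from typing import List, Optional

# A status row looks like: "<slot> <pid> yes (old gen) ..."
OLD_GEN_ROW = re.compile(r"\b(\d+)\s+(\d+)\s+yes \(old gen\)")


def old_gen_pids(status_text: str) -> List[int]:
    """Return the PIDs of all rows marked 'yes (old gen)' in the status text."""
    return [int(match.group(2)) for match in OLD_GEN_ROW.finditer(status_text)]


def cmdline(pid: int) -> Optional[str]:
    """Return the command line of a process, or None if it no longer exists."""
    try:
        raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    except OSError:
        return None
    # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
    return raw.replace(b"\0", b" ").decode(errors="replace").strip()


# Hypothetical usage, given the captured output of `apachectl status`:
#   for pid in old_gen_pids(status_text):
#       print(pid, cmdline(pid))
```

A `None` result for a listed PID would point at Apache itself, as Stefan suggests; a live process points at whatever is keeping it from exiting.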
Bug#1057126: found 1057126 2.4.58-1
found 1057126 2.4.58-1
thanks

Still present in 2.4.58-1 from testing. It had been running fine from 2023-11-30 until midnight on 2023-12-08.

[Fri Dec 08 00:00:02.485510 2023] [mpm_event:notice] [pid 123558:tid 140621062399872] AH00489: Apache/2.4.58 (Debian) mod_wsgi/4.9.4 Python/3.11 configured -- resuming normal operations
[Fri Dec 08 00:00:02.485594 2023] [core:notice] [pid 123558:tid 140621062399872] AH00094: Command line: '/usr/sbin/apache2'
[Fri Dec 08 00:00:04.487814 2023] [mpm_event:error] [pid 123558:tid 140621062399872] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
[Fri Dec 08 00:00:05.488946 2023] [mpm_event:error] [pid 123558:tid 140621062399872] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
(...and so on until restarted...)

This is the output of apachectl status just before the deadlock:

Apache Server Status for localhost (via ::1)

Server Version: Apache/2.4.58 (Debian) mod_wsgi/4.9.4 Python/3.11
Server MPM: event
Server Built: 2023-10-19T10:56:29
__
Current Time: Friday, 08-Dec-2023 00:00:01 CET
Restart Time: Thursday, 30-Nov-2023 09:12:39 CET
Parent Server Config. Generation: 8
Parent Server MPM Generation: 7
Server uptime: 7 days 14 hours 47 minutes 22 seconds
Server load: 0.00 0.00 0.00
Total accesses: 33931 - Total Traffic: 704.1 MB - Total Duration: 2032773
CPU Usage: u273.86 s30.97 cu0 cs0 - .0463% CPU load
.0516 requests/sec - 1121 B/second - 21.2 kB/request - 59.909 ms/request
1 requests currently being processed, 0 workers gracefully restarting, 49 idle workers

                             Connections       Threads               Async connections
Slot  PID     Stopping       total  accepting  busy  graceful  idle  writing  keep-alive  closing
0     123559  yes (old gen)  0      no         0     0         0     0        0           0
1     123561  yes (old gen)  0      no         0     0         0     0        0           0
2     130244  yes (old gen)  0      no         0     0         0     0        0           0
3     130245  yes (old gen)  0      no         0     0         0     0        0           0
4     136773  yes (old gen)  0      no         0     0         0     0        0           0
5     136774  yes (old gen)  0      no         0     0         0     0        0           0
6     143347  yes (old gen)  0      no         0     0         0     0        0           0
7     143348  yes (old gen)  0      no         0     0         0     0        0           0
8     149859  yes (old gen)  0      no         0     0         0     0        0           0
9     149860  yes (old gen)  0      no         0     0         0     0        0           0
10    156457  yes (old gen)  0      no         0     0         0     0        0           0
11    156458  yes (old gen)  0      no         0     0         0     0        0           0
12    163598  yes (old gen)  0      no         0     0         0     0        0           0
13    163599  yes (old gen)  0      no         0     0         0     0        0           0
14    170137  no             0      yes        0     0         25    0        0           0
15    170138  no             0      yes        1     0         24    0        0           0
Sum   16      14             0                 1     0         49    0        0           0

..__ _W__

Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process

'/usr/bin/lynx -dump http://localhost:80/server-status' failed.
Maybe you need to install a package providing www-browser or you
need to adjust the APACHE_LYNX variable in /etc/apache2/envvars

-- 
\\// Peter - http://www.softwolves.pp.se/
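The snapshot in this message hints at a simple slot budget: 16 slots in total, 14 of them held by old-generation children that appear in PID pairs, and an MPM generation of 7. Assuming that every reload strands the two children of the outgoing generation (an inference from this snapshot, not something confirmed in the thread), the number of further reloads the server can absorb can be worked out as below. The function name `reloads_survivable` and its parameters are made up for this illustration.

```python
def reloads_survivable(total_slots: int, stuck: int, per_generation: int) -> int:
    """Estimate how many more reloads fit before the scoreboard is full.

    Assumes each reload leaves the outgoing generation's `per_generation`
    children stuck in their slots, while the new generation needs the same
    number of free slots to start.
    """
    if per_generation <= 0:
        raise ValueError("per_generation must be positive")
    # Slots still free once the currently running generation is counted too.
    spare = total_slots - stuck - per_generation
    return max(0, spare // per_generation)


# Figures from the status output above: 16 slots, 14 stuck, 2 per generation.
print(reloads_survivable(16, 14, 2))
# A freshly started server under the same assumption.
print(reloads_survivable(16, 0, 2))
```

Under this assumption a freshly started server absorbs seven reloads and fails on the eighth, which would fit the roughly week-long interval between lockups with one reload per night.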
Bug#1057126: "AH03490: scoreboard is full" after nightly maintenance
Package: apache2
Version: 2.4.57-2
Severity: normal

Dear Maintainer,

Apache httpd locks up and fills error.log with errors after the nightly maintenance (though not every night):

[Wed Nov 29 00:00:01.922731 2023] [mpm_event:notice] [pid 62346:tid 139841215223680] AH00489: Apache/2.4.57 (Debian) mod_wsgi/4.9.4 Python/3.11 configured -- resuming normal operations
[Wed Nov 29 00:00:01.922790 2023] [core:notice] [pid 62346:tid 139841215223680] AH00094: Command line: '/usr/sbin/apache2'
[Wed Nov 29 00:00:03.924683 2023] [mpm_event:error] [pid 62346:tid 139841215223680] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
[Wed Nov 29 00:00:04.925780 2023] [mpm_event:error] [pid 62346:tid 139841215223680] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
(etc)

I set up a nightly job to mail me the output of "apachectl status", and these are the contents reported at Nov 29 00:00:00, i.e. just before the nightly maintenance:

Apache Server Status for localhost (via ::1)

Server Version: Apache/2.4.57 (Debian) mod_wsgi/4.9.4 Python/3.11
Server MPM: event
Server Built: 2023-04-13T03:26:51
__
Current Time: Wednesday, 29-Nov-2023 00:00:01 CET
Restart Time: Tuesday, 21-Nov-2023 08:51:55 CET
Parent Server Config. Generation: 8
Parent Server MPM Generation: 7
Server uptime: 7 days 15 hours 8 minutes 6 seconds
Server load: 0.01 0.02 0.00
Total accesses: 34860 - Total Traffic: 717.8 MB - Total Duration: 2289464
CPU Usage: u297.08 s30.61 cu.01 cs.07 - .0497% CPU load
.0529 requests/sec - 1141 B/second - 21.1 kB/request - 65.676 ms/request
2 requests currently being processed, 48 idle workers

                             Connections       Threads     Async connections
Slot  PID     Stopping       total  accepting  busy  idle  writing  keep-alive  closing
0     62348   yes (old gen)  0      no         0     0     0        0           0
1     62350   yes (old gen)  0      no         0     0     0        0           0
2     66497   yes (old gen)  0      no         0     0     0        0           0
3     66498   yes (old gen)  0      no         0     0     0        0           0
4     73089   yes (old gen)  0      no         0     0     0        0           0
5     73090   yes (old gen)  0      no         0     0     0        0           0
6     79644   yes (old gen)  0      no         0     0     0        0           0
7     79645   yes (old gen)  0      no         0     0     0        0           0
8     86126   yes (old gen)  0      no         0     0     0        0           0
9     86127   yes (old gen)  0      no         0     0     0        0           0
10    92669   yes (old gen)  0      no         0     0     0        0           0
11    92670   yes (old gen)  0      no         0     0     0        0           0
12    99203   yes (old gen)  0      no         0     0     0        0           0
13    99204   yes (old gen)  0      no         0     0     0        0           0
14    105761  no             0      yes        0     25    0        0           0
15    105762  no             0      yes        2     23    0        0           0
Sum   16      14             0                 2     48    0        0           0

.._W W___

Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process

'/usr/bin/lynx -dump http://localhost:80/server-status' failed.
Maybe you need to install a package providing www-browser or you
need to adjust the APACHE_LYNX variable in /etc/apache2/envvars

The server is fairly lightly loaded, running Trac 1.6.

While looking for information on the subject, I found this thread on Reddit, which describes the exact same problem, also tied to the nightly maintenance:

https://www.reddit.com/r/debian/comments/15stmn7

  Weird thing is, it happens right after a reload of apache that happens
  at midnight. So it's not a case of heavy usage (this site is barely used
  at all during the day, never mind at night). In fact there were no
  accesses between the time it reloaded the service and the error started.

-- Package-specific info:

-- System Information:
Debian Release: 12.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-13-amd64 (SMP w/2 CPU threads; PREEMPT)
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages apache2 depends on:
ii  apache2-bin                2.4.57-2
ii  apache2-data               2.4.57-2
ii  apache2-utils              2.4.57-2
ii  init-system-helpers        1.65.2
ii  lsb-base                   11.6
ii  media-types                10.0.0
ii  perl                       5.36.0-7
ii  procps                     2:4.0.2-3
ii  sysvinit-utils [lsb-base]  3.06-4

Versions of packages apache2 recommends:
ii  ssl-cert  1.1.2

Versions of packages apache2 suggests:
pn  apache2-doc
pn  apache2-suexec-pristine | apache2-suexec-custom
ii  lynx [www-browser]  2.9.0dev.12-1

Versions of packages apache2-bin depends on:
ii  libapr1      1.7.2-3
ii  libaprutil1  1.6.3-1
ii