Bug#1057126: Bug#1067104: Acknowledgement (server stalls: AH00046: child process 2876749 still did not exit, sending a SIGKILL)

2024-03-21 Thread Peter Krefting

2024-03-21 13:12 skrev Yaroslav Halchenko:


FWIW here is a dirty workaround script I just crafted with chatgpt to
monitor/restart apache2 as soon as it starts happening


My workaround is simpler, I have this line in root's crontab:

 5 * * * * curl --silent --max-time 5 --output /dev/null 
http://localhost/trac/ || systemctl restart apache2


It seems to restart Apache once every 5-8 days, according to the notices 
I see from Zabbix. The frequency might very well be related to the 
number of accesses to the server.


--
\\// Peter - http://www.softwolves.pp.se/



Bug#1057126: "AH03490: scoreboard is full" after nightly maintenance

2023-12-14 Thread Peter Krefting

Hi!

Stefan Fritsch:

Some processes are in "stopping" state but are not dying. They accumulate 
until the scoreboard is full.


First you should verify with ps if the processes with the PIDs in the "(old 
gen)" lines still exist. If not, it is a bug in apache itself (probably in 
mpm_event).


They do indeed still exist in the process table. Just looking at the 
command lines doesn't tell me much as they are all apache2 
forks/threads, but see below:


  # apachectl status 2>&1 |awk '/old gen/ { print "tr \"\\0\" \" \" < /proc/" $2 
"/cmdline; echo" }'  | sh
  /usr/sbin/apache2 -k start
  /usr/sbin/apache2 -k start
  /usr/sbin/apache2 -k start
  /usr/sbin/apache2 -k start
  /usr/sbin/apache2 -k start
  /usr/sbin/apache2 -k start

If they do, one needs to find out why they are not dying. This is likely the 
fault of some module.


The server is running Trac with its default configuration, there might 
be something that has changed there?


  # dpkg -l trac
  Desired=Unknown/Install/Remove/Purge/Hold
  | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Tri
  |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
  ||/ Name   Version  Architecture Description
  +++-==---===
  ii  trac   1.6-2all  Enhanced wiki and issue tra

The processes do indeed seem to be from Trac, here is one example:

  # ls -l /proc/210869/fd
  total 0
  lr-x-- 1 root root 64 dec 14 09:19 0 -> /dev/null
  l-wx-- 1 root root 64 dec 14 09:19 1 -> /dev/null
  lrwx-- 1 root root 64 dec 14 09:19 10 -> 'anon_inode:[eventpoll]'
  lr-x-- 1 root root 64 dec 14 09:19 11 -> 'pipe:[35832724]'
  l-wx-- 1 root root 64 dec 14 09:19 12 -> 'pipe:[35832724]'
  l-wx-- 1 root root 64 dec 14 09:19 14 -> /srv/trac/log/trac.log
  lrwx-- 1 root root 64 dec 14 09:19 18 -> /srv/trac/db/trac.db
  l-wx-- 1 root root 64 dec 14 09:19 2 -> '/var/log/apache2/error.log.1 
(deleted)'
  lrwx-- 1 root root 64 dec 14 09:19 21 -> /srv/trac/db/trac.db
  lrwx-- 1 root root 64 dec 14 09:19 3 -> 'socket:[35833027]'
  lr-x-- 1 root root 64 dec 14 09:19 5 -> 'pipe:[35832707]'
  l-wx-- 1 root root 64 dec 14 09:19 6 -> 'pipe:[35832707]'
  l-wx-- 1 root root 64 dec 14 09:19 7 -> 
/var/log/apache2/other_vhosts_access.log
  l-wx-- 1 root root 64 dec 14 09:19 8 -> '/var/log/apache2/access.log.1 
(deleted)'
  l-wx-- 1 root root 64 dec 14 09:19 9 -> '/var/log/apache2/access.log.1 
(deleted)'

Should I reassign the bug to Trac instead?

--
\\// Peter - http://www.softwolves.pp.se/



Bug#1057126: found 1057126 2.4.58-1

2023-12-08 Thread Peter Krefting

found 1057126 2.4.58-1
thanks

Still present in 2.4.58-1 from testing. It was running fine since 
2023-11-30 until midnight 2023-12-08.


[Fri Dec 08 00:00:02.485510 2023] [mpm_event:notice] [pid 123558:tid 
140621062399872] AH00489: Apache/2.4.58 (Debian) mod_wsgi/4.9.4 
Python/3.11 configured -- resuming normal operations
[Fri Dec 08 00:00:02.485594 2023] [core:notice] [pid 123558:tid 
140621062399872] AH00094: Command line: '/usr/sbin/apache2'
[Fri Dec 08 00:00:04.487814 2023] [mpm_event:error] [pid 123558:tid 
140621062399872] AH03490: scoreboard is full, not at 
MaxRequestWorkers.Increase ServerLimit.
[Fri Dec 08 00:00:05.488946 2023] [mpm_event:error] [pid 123558:tid 
140621062399872] AH03490: scoreboard is full, not at 
MaxRequestWorkers.Increase ServerLimit.

(...and so on until restarted...)


This is the output of apachectl status just before deadlock:

  Apache Server Status for localhost (via ::1)

   Server Version: Apache/2.4.58 (Debian) mod_wsgi/4.9.4 Python/3.11
   Server MPM: event
   Server Built: 2023-10-19T10:56:29
 __

   Current Time: Friday, 08-Dec-2023 00:00:01 CET
   Restart Time: Thursday, 30-Nov-2023 09:12:39 CET
   Parent Server Config. Generation: 8
   Parent Server MPM Generation: 7
   Server uptime: 7 days 14 hours 47 minutes 22 seconds
   Server load: 0.00 0.00 0.00
   Total accesses: 33931 - Total Traffic: 704.1 MB - Total Duration:
  2032773

   CPU Usage: u273.86 s30.97 cu0 cs0 - .0463% CPU load
   .0516 requests/sec - 1121 B/second - 21.2 kB/request - 59.909
  ms/request

   1 requests currently being processed, 0 workers gracefully restarting,
  49 idle workers

   Slot PID Stopping Connections Threads Async connections
   total accepting busy graceful idle writing keep-alive closing
   0 123559 yes (old gen) 0 no 0 0 0 0 0 0
   1 123561 yes (old gen) 0 no 0 0 0 0 0 0
   2 130244 yes (old gen) 0 no 0 0 0 0 0 0
   3 130245 yes (old gen) 0 no 0 0 0 0 0 0
   4 136773 yes (old gen) 0 no 0 0 0 0 0 0
   5 136774 yes (old gen) 0 no 0 0 0 0 0 0
   6 143347 yes (old gen) 0 no 0 0 0 0 0 0
   7 143348 yes (old gen) 0 no 0 0 0 0 0 0
   8 149859 yes (old gen) 0 no 0 0 0 0 0 0
   9 149860 yes (old gen) 0 no 0 0 0 0 0 0
   10 156457 yes (old gen) 0 no 0 0 0 0 0 0
   11 156458 yes (old gen) 0 no 0 0 0 0 0 0
   12 163598 yes (old gen) 0 no 0 0 0 0 0 0
   13 163599 yes (old gen) 0 no 0 0 0 0 0 0
   14 170137 no 0 yes 0 0 25 0 0 0
   15 170138 no 0 yes 1 0 24 0 0 0
   Sum 16 14 0   1 0 49 0 0 0



..__
_W__

   Scoreboard Key:
   "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
   "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
   "C" Closing connection, "L" Logging, "G" Gracefully finishing,
   "I" Idle cleanup of worker, "." Open slot with no current process
'/usr/bin/lynx -dump http://localhost:80/server-status' failed.
Maybe you need to install a package providing www-browser or you
need to adjust the APACHE_LYNX variable in /etc/apache2/envvars

--
\\// Peter - http://www.softwolves.pp.se/



Bug#1057126: "AH03490: scoreboard is full" after nightly maintenance

2023-11-30 Thread Peter Krefting

Package: apache2
Version: 2.4.57-2
Severity: normal

Dear Maintainer,

we are experiencing that the Apache httpd locks up, filling the 
error.log with errors after the nightly maintenance (not every night, 
though):


  [Wed Nov 29 00:00:01.922731 2023] [mpm_event:notice] [pid 62346:tid 
139841215223680] AH00489: Apache/2.4.57 (Debian) mod_wsgi/4.9.4 Python/3.11 
configured -- resuming normal operations
  [Wed Nov 29 00:00:01.922790 2023] [core:notice] [pid 62346:tid 
139841215223680] AH00094: Command line: '/usr/sbin/apache2'
  [Wed Nov 29 00:00:03.924683 2023] [mpm_event:error] [pid 62346:tid 
139841215223680] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase 
ServerLimit.
  [Wed Nov 29 00:00:04.925780 2023] [mpm_event:error] [pid 62346:tid 
139841215223680] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase 
ServerLimit.

(etc)

I set up a nightly job to mail me the output of "apachectl status", 
and this is the contents reported at Nov 29 00:00:00, i.e. just before 
the nightly maintenance:


  Apache Server Status for localhost (via ::1)

   Server Version: Apache/2.4.57 (Debian) mod_wsgi/4.9.4 Python/3.11
   Server MPM: event
   Server Built: 2023-04-13T03:26:51
 __

   Current Time: Wednesday, 29-Nov-2023 00:00:01 CET
   Restart Time: Tuesday, 21-Nov-2023 08:51:55 CET
   Parent Server Config. Generation: 8
   Parent Server MPM Generation: 7
   Server uptime: 7 days 15 hours 8 minutes 6 seconds
   Server load: 0.01 0.02 0.00
   Total accesses: 34860 - Total Traffic: 717.8 MB - Total Duration:
  2289464

   CPU Usage: u297.08 s30.61 cu.01 cs.07 - .0497% CPU load
   .0529 requests/sec - 1141 B/second - 21.1 kB/request - 65.676
  ms/request

   2 requests currently being processed, 48 idle workers

   Slot PID Stopping Connections Threads Async connections
   total accepting busy idle writing keep-alive closing
   0 62348 yes (old gen) 0 no 0 0 0 0 0
   1 62350 yes (old gen) 0 no 0 0 0 0 0
   2 66497 yes (old gen) 0 no 0 0 0 0 0
   3 66498 yes (old gen) 0 no 0 0 0 0 0
   4 73089 yes (old gen) 0 no 0 0 0 0 0
   5 73090 yes (old gen) 0 no 0 0 0 0 0
   6 79644 yes (old gen) 0 no 0 0 0 0 0
   7 79645 yes (old gen) 0 no 0 0 0 0 0
   8 86126 yes (old gen) 0 no 0 0 0 0 0
   9 86127 yes (old gen) 0 no 0 0 0 0 0
   10 92669 yes (old gen) 0 no 0 0 0 0 0
   11 92670 yes (old gen) 0 no 0 0 0 0 0
   12 99203 yes (old gen) 0 no 0 0 0 0 0
   13 99204 yes (old gen) 0 no 0 0 0 0 0
   14 105761 no 0 yes 0 25 0 0 0
   15 105762 no 0 yes 2 23 0 0 0
   Sum 16 14 0   2 48 0 0 0



.._W
W___

   Scoreboard Key:
   "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
   "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
   "C" Closing connection, "L" Logging, "G" Gracefully finishing,
   "I" Idle cleanup of worker, "." Open slot with no current process
'/usr/bin/lynx -dump http://localhost:80/server-status' failed.
Maybe you need to install a package providing www-browser or you
need to adjust the APACHE_LYNX variable in /etc/apache2/envvars


The server is fairly lightly loaded, running Trac 1.6.

While looking for information on the subject, I found this thread on 
Reddit, noting the exact same problem, also being tied to the nightly 
maintenance:


  https://www.reddit.com/r/debian/comments/15stmn7

Weird thing is, it happens right after a reload of apache that happens 
at midnight. So it's not a case of heavy usage (this site is barely 
used at all during the day, never mind at night). In fact there were 
no accesses between the time it reloaded the service and the error 
started.


-- Package-specific info:

-- System Information:
Debian Release: 12.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-13-amd64 (SMP w/2 CPU threads; PREEMPT)
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages apache2 depends on:
ii  apache2-bin2.4.57-2
ii  apache2-data   2.4.57-2
ii  apache2-utils  2.4.57-2
ii  init-system-helpers1.65.2
ii  lsb-base   11.6
ii  media-types10.0.0
ii  perl   5.36.0-7
ii  procps 2:4.0.2-3
ii  sysvinit-utils [lsb-base]  3.06-4

Versions of packages apache2 recommends:
ii  ssl-cert  1.1.2

Versions of packages apache2 suggests:
pn  apache2-doc  
pn  apache2-suexec-pristine | apache2-suexec-custom  
ii  lynx [www-browser]   2.9.0dev.12-1

Versions of packages apache2-bin depends on:
ii  libapr1  1.7.2-3
ii  libaprutil1  1.6.3-1
ii