Bug#1057126: Bug#1067104: Acknowledgement (server stalls: AH00046: child process 2876749 still did not exit, sending a SIGKILL)
On 2024-03-21 13:12, Yaroslav Halchenko wrote:

> FWIW here is a dirty workaround script I just crafted with chatgpt to
> monitor/restart apache2 as soon as it starts happening

My workaround is simpler; I have this line in root's crontab:

5 * * * * curl --silent --max-time 5 --output /dev/null http://localhost/trac/ || systemctl restart apache2

It seems to restart Apache once every 5-8 days, according to the notices I see from Zabbix. The frequency might very well be related to the number of accesses to the server.

--
\\// Peter - http://www.softwolves.pp.se/
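For reference, the schedule `5 * * * *` fires at minute 5 of every hour, so the probe in the crontab line runs hourly. To probe more often, the same idea could be wrapped in a small script and scheduled with something like `*/5 * * * *`. The following is a minimal sketch under that assumption, not the script either poster uses; the function names `probe`, `recover` and `check_once` are hypothetical, while the curl flags, URL and restart command are taken from the crontab line.

```shell
#!/bin/bash
# Sketch only: probe, recover and check_once are hypothetical names.

# Succeeds if the URL answers within 5 seconds
# (same curl flags as in the crontab line).
probe() {
    curl --silent --max-time 5 --output /dev/null "$1"
}

# Recovery action, taken from the crontab line.
recover() {
    systemctl restart apache2
}

# Probe once; run the recovery action only when the probe fails.
check_once() {
    local url="$1"
    if probe "$url"; then
        echo "ok: $url"
    else
        echo "unresponsive: $url, restarting"
        recover
    fi
}
```

Saved under a path such as /usr/local/sbin/apache-probe (a made-up location) with a final line `check_once http://localhost/trac/`, it could be called from root's crontab as `*/5 * * * * /usr/local/sbin/apache-probe`.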
Bug#1057126: Bug#1067104: Acknowledgement (server stalls: AH00046: child process 2876749 still did not exit, sending a SIGKILL)
"All ingenious is simple" -- thanks for sharing. I might redo mine following your example, but checking more frequently.

On Thu, 21 Mar 2024, Peter Krefting wrote:

> My workaround is simpler, I have this line in root's crontab:
>
> 5 * * * * curl --silent --max-time 5 --output /dev/null http://localhost/trac/ || systemctl restart apache2

--
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
WWW:   http://www.linkedin.com/in/yarik
Bug#1057126: Bug#1067104: Acknowledgement (server stalls: AH00046: child process 2876749 still did not exit, sending a SIGKILL)
I think "my" https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1067104 is a duplicate of this one. I blame mod_wsgi, since I believe this started to happen after I started using it.

FWIW here is a dirty workaround script I just crafted with chatgpt to monitor/restart apache2 as soon as it starts happening (it doesn't happen upon every maintenance event for me, I believe). Let me know if I should gather any additional information.

#!/bin/bash

set -eu

# Define the lock file and log directory
lock_file="/var/log/apache-scoreboard-restart/lock.lck"
log_dir="/var/log/apache-scoreboard-restart/"

# Ensure the log directory exists
mkdir -p "$log_dir"

# Attempt to acquire a lock
exec 200>"$lock_file"
if ! flock -n 200 ; then
    echo "Another instance is running."
    exit 0
fi

# Function to perform actions when the specified log line is found
handle_scoreboard_full() {
    local timestamp=$(date --iso-8601=seconds)
    local log_file="${log_dir}${timestamp}.log"

    echo "Logging system information to $log_file."
    { ps auxw -H; echo "---"; lsof; } > "$log_file"

    echo "Reloading Apache." >> "$log_file"
    service apache2 reload

    echo "Sleeping for a minute." >> "$log_file"
    sleep 60
}

# Monitor the Apache error log
while true; do
    tail --follow=name /var/log/apache2/error.log | while read -r line ; do
        if echo "$line" | grep -q "AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit." ; then
            handle_scoreboard_full
            break  # so we start with a fresh tail
        fi
    done
done

--
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
WWW:   http://www.linkedin.com/in/yarik
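A side note on the detection step: the pattern given to grep in the script is interpreted as a regular expression, so each `.` in it matches any character. That still matches the intended log line; a stricter variant would use fixed-string matching with `grep -F`. A minimal sketch of just that step, reading log lines from standard input, might look as follows; `wait_for_scoreboard_full` is a hypothetical name and not part of the script above.

```shell
#!/bin/bash
# Sketch only: wait_for_scoreboard_full is a hypothetical helper.
# Reads lines from stdin; returns 0 at the first line that contains the
# AH03490 message, or 1 if the input ends without a match.
wait_for_scoreboard_full() {
    local marker="AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit."
    local line
    while IFS= read -r line; do
        # -F: treat the marker as a literal string, not as a regex
        if printf '%s\n' "$line" | grep -qF -- "$marker"; then
            return 0
        fi
    done
    return 1
}
```

It could be fed from the same source as in the script, for example `tail --follow=name /var/log/apache2/error.log | wait_for_scoreboard_full`, with the reload step run once it returns successfully.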
Bug#1067104: server stalls: AH00046: child process 2876749 still did not exit, sending a SIGKILL
On 18.03.24 at 13:59, Yaroslav Halchenko wrote:

> Package: apache2
> Version: 2.4.57-2
> Severity: important
>
> Server was working just fine for years and recently started to stall
> completely after 3-7 days of functioning normally. error logs get filled up
> first with AH03490 and then eventually with AH00045 messages:
>
> [Sun Mar 17 02:26:01.353381 2024] [mpm_event:error] [pid 2649373:tid 139846579189632] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
> ...
> [Sun Mar 17 22:00:42.201774 2024] [mpm_event:error] [pid 2649373:tid 139846579189632] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
> [Sun Mar 17 22:00:42.995574 2024] [mpm_event:error] [pid 2649373:tid 139846579189632] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
> [Sun Mar 17 22:00:42.998488 2024] [mpm_event:notice] [pid 2649373:tid 139846579189632] AH00492: caught SIGWINCH, shutting down gracefully
> [Sun Mar 17 22:00:46.358981 2024] [core:warn] [pid 2649373:tid 139846579189632] AH00045: child process 2649375 still did not exit, sending a SIGTERM
> [Sun Mar 17 22:00:46.359064 2024] [core:warn] [pid 2649373:tid 139846579189632] AH00045: child process 2649376 still did not exit, sending a SIGTERM

Have you tried increasing ServerLimit as the warning suggests?

Apart from that, it is probably the same as https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1057126 . mod_wsgi or some python script is preventing apache processes from dying, and they accumulate until the scoreboard is full. Which versions of the wsgi related packages are you using?
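To see how the AH03490 and AH00045 messages are distributed, the error log can be summarised per message code. A minimal sketch, assuming the log format shown in the quoted excerpt; `count_ah_codes` is a hypothetical helper that reads log lines on standard input.

```shell
#!/bin/bash
# Sketch only: count_ah_codes is a hypothetical helper.
# Prints "CODE COUNT" for every Apache message code (AHnnnnn) seen on stdin,
# sorted by code.
count_ah_codes() {
    grep -oE 'AH[0-9]{5}' | sort | uniq -c | awk '{ print $2, $1 }'
}
```

It could be run as `count_ah_codes < /var/log/apache2/error.log`. For the version question, `dpkg -l | grep -i wsgi` should list the installed wsgi related packages.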
Processed: tagging 1032628
Processing commands for cont...@bugs.debian.org:

> tags 1032628 + pending
Bug #1032628 [libapache2-mod-proxy-uwsgi] please drop transitional package libapache2-mod-proxy-uwsgi from src:apache2
Added tag(s) pending.
> thanks
Stopping processing here.

Please contact me if you need assistance.
--
1032628: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1032628
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems