[EMAIL PROTECTED] wrote:
> [In order for any reply to be added to the PR database, ]
> [you need to include <[EMAIL PROTECTED]> in the Cc line ]
> [and leave the subject line UNCHANGED. This is not done]
> [automatically because of the potential for mail loops. ]
>
> Synopsis: 'apachectl restart' or 'apachectl graceful' causes httpd to die.
>
> State-Changed-From-To: open-analyzed
> State-Changed-By: brian
> State-Changed-When: Tue May 5 18:10:45 PDT 1998
> State-Changed-Why:
> Could you run "truss", "strace", or some other type of system
> call tracking program on it, so we could see where it dies or
> becomes unresponsive? You could also use "gcore" to get a core
> file and see where it might be hung.
I did a 'gcore' on the httpd that remained after 'apachectl restart'. Then I
ran gdb on the core file and obtained this backtrace.
(gdb) backtrace
#0 0x8018255b in ?? () from /usr/lib/libc.so.1
#1 0x8019904f in ?? () from /usr/lib/libc.so.1
#2 0x80a6dca in reclaim_child_processes ()
#3 0x80a8c10 in standalone_main ()
#4 0x80a8f44 in main ()
#5 0x805baa7 in _start ()
Here is 'truss1.out' from 'truss -o /tmp/truss1.out -p 3110'
(3110 was the pid of the master httpd).
[...]
waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0x08045B98, 0, 1000) = 0
time() = 894467787
waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0x08045B98, 0, 1000) = 0
time() = 894467788
waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0x08045B98, 0, 1000) = 0
time() = 894467789
waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0x08045B98, 0, 1000) = 0
time() = 894467790
waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0x08045B98, 0, 1000) = 0
time() = 894467791
waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0x08045B98, 0, 1000) = 0
time() = 894467792
waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0x08045B98, 0, 1000) = 0
time() = 894467793
waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0x08045B98, 0, 1000) = 0
time() = 894467794
poll(0x08045B00, 1, 0) = 1
waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0x08045B98, 0, 1000) = 0
time() = 894467795
waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
Received signal #1, SIGHUP, in poll() [caught]
siginfo: SIGHUP pid=3175 uid=0
poll(0x08045B98, 0, 1000) Err#4 EINTR
setcontext(0x0804597C)
time() = 894467795
sigaction(SIGHUP, 0x08047B68, 0x08047BC4) = 0
sigaction(SIGUSR1, 0x08047B60, 0x08047BBC) = 0
kill(-3110, SIGHUP) = 0
Received signal #1, SIGHUP [ignored]
siginfo: SIGHUP pid=3110 uid=0
Received signal #18, SIGCLD [default]
siginfo: SIGCLD CLD_EXITED pid=3128 status=0x0000
poll(0x08045B80, 0, 17) = 0
waitid(P_PID, 3127, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
waitid(P_PID, 3128, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
waitid(P_PID, 3129, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
waitid(P_PID, 3131, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
waitid(P_PID, 3133, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
waitid(P_PID, 3157, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
kill(3123, SIGTERM) = 0
poll(0x08045B80, 0, 66) = 0
waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
kill(3123, SIGTERM) = 0
poll(0x08045B80, 0, 263) = 0
waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
kill(3123, SIGTERM) = 0
poll(0x08045B80, 0, 1049) = 0
waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
kill(3123, SIGTERM) = 0
poll(0x08045B80, 0, 4195) (sleeping...)
poll(0x08045B80, 0, 4195) = 0
waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
kill(3123, SIGTERM) = 0
poll(0x08045B80, 0, 16778) (sleeping...)
poll(0x08045B80, 0, 16778) = 0
waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
kill(3123, SIGTERM) = 0
poll(0x08045B80, 0, 67109) (sleeping...)
*** process killed ***
And below is the list of running httpds at the time of restart. Note that
process 3123 is not in the list.
pts/2 socrates[16]# ps -ef | grep http
~/src/apache_1.3b6/src/main
nobody 3128 3110 3 08:14:25 ? 0:01 /usr/local/apache/sbin/httpd
nobody 3127 3110 2 08:14:25 ? 0:01 /usr/local/apache/sbin/httpd
nobody 3129 3110 0 08:14:25 ? 0:01 /usr/local/apache/sbin/httpd
root 3110 1 0 08:14:21 ? 0:00 /usr/local/apache/sbin/httpd
nobody 3131 3110 1 08:14:25 ? 0:01 /usr/local/apache/sbin/httpd
nobody 3133 3110 0 08:14:25 ? 0:00 /usr/local/apache/sbin/httpd
nobody 3157 3110 0 08:15:02 ? 0:00 /usr/local/apache/sbin/httpd
Further investigation revealed that process 3123 belongs to 'rotatelogs',
which I use for all my logging. Rotatelogs does not install a signal handler
for SIGTERM. Could this be the problem?
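For reference, my reading of the truss output above is that the parent
sleeps, does a non-blocking wait on each child, sends SIGTERM to any child it
still cannot reap, and then roughly quadruples the sleep. A minimal C sketch
of that pattern follows. It is not taken from the Apache source: the names
reclaim_child and backoff_ms, the 16384-microsecond starting value, and the
round-up to milliseconds are assumptions, chosen only because they reproduce
the poll() timeouts seen in the trace (17, 66, 263, ... ms).

```c
#include <assert.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sleep time in milliseconds for a retry round.
 * Assumes a 16384 us base that quadruples each round, rounded up to ms. */
static int backoff_ms(int round)
{
    assert(round >= 0 && round <= 8);   /* keep the result well inside int */
    return (int)(((long long)16384 << (2 * round)) / 1000 + 1);
}

/* Hypothetical sketch of the retry loop suggested by the trace.
 * Returns 1 if the child was reaped, 0 if it was still running after
 * max_rounds attempts, and -1 if waitpid() failed (e.g. not our child). */
static int reclaim_child(pid_t pid, int max_rounds)
{
    int round;

    for (round = 0; round < max_rounds; round++) {
        int status;
        pid_t r;

        /* Sleep; poll() with no descriptors acts as a millisecond timer. */
        poll(NULL, 0, backoff_ms(round));

        /* Non-blocking check: has the child exited yet? */
        r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return 1;
        if (r == (pid_t)-1)
            return -1;

        /* Still running: ask it to terminate, then wait longer next round. */
        kill(pid, SIGTERM);
    }
    return 0;
}
```

A child that never exits in response to SIGTERM would make a loop like this
run through every round with ever longer sleeps, which is consistent with
what the trace shows for pid 3123.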
--
| Charles R. (C. R.) Oldham | NCA Commission on Schools |
| [EMAIL PROTECTED] | Arizona St. Univ., PO Box 873011,|
| V:602/965-8700 F:602/965-9423 | Tempe, AZ 85287-3011 _ |
| "I like it!"--Citizen G'Kar | #include <disclaimer.h> X_>|