Hello all, I had another qmail failure. I say that because qmail seems
to be the only service affected. I lump tcpserver, smtpd and pop3d in
as qmail for now, as I don't have a definitive place to look.

Here is the ps output after the crash at 12:35 today:

    1 ?        S      0:51 init
    2 ?        SW     0:01 [kflushd]
    3 ?        SW     0:11 [kupdate]
    4 ?        SW     0:00 [kpiod]
    5 ?        SW     0:04 [kswapd]
    6 ?        SW<    0:00 [mdrecoveryd]
  339 ?        S    657:23 syslogd -m 0
  350 ?        SW     0:00 [klogd]
  366 ?        S      0:00 /usr/sbin/atd
  382 ?        S      0:00 crond
  398 ?        S      0:01 inetd
  414 ?        S      0:01 /usr/local/sbin/sshd
  430 ?        SW     0:00 [lpd]
  463 ?        S      0:02 /usr/local/apache/bin/httpsd
  466 ?        SW     0:00 [gcache]
  485 ?        SW     0:00 [supervise]
  486 ?        S      1:43 splogger qmail
  487 ?        S     32:32 qmail-send
  501 ?        SW     0:00 [supervise]
  502 ?        SW     0:00 [splogger]
  503 ?        S      2:31 tcpserver -p -q -r -c100 -x /etc/tcprules.d/qmail-smt
  504 ?        S      0:51 qmail-lspawn # Using qmail-local to deliver messages
  505 ?        S      0:15 qmail-rspawn
  506 ?        S      0:29 qmail-clean
  520 ?        SW     0:00 [supervise]
  521 ?        S      0:30 tcpserver -q -R -H -c100 -x /etc/tcprules.d/qmail-pop
  535 ?        SW     0:00 [supervise]
  536 ?        S      0:06 tcpserver -q -R -H -c100 -u0 -g0 vmail.f-tech.net pop
  669 tty2     SW     0:00 [mingetty]
  670 tty3     SW     0:00 [mingetty]
  671 tty4     SW     0:00 [mingetty]
  672 tty5     SW     0:00 [mingetty]
  673 tty6     SW     0:00 [mingetty]
  822 tty1     S      0:00 login -- root
30124 ?        S      0:00 /usr/local/apache/bin/httpsd
30313 ?        S      0:00 /usr/local/apache/bin/httpsd
30369 ?        S      0:00 /usr/local/apache/bin/httpsd
 5254 ?        S      0:00 /usr/local/apache/bin/httpsd
 5262 ?        S      0:00 /usr/local/apache/bin/httpsd
 5263 ?        S      0:00 /usr/local/apache/bin/httpsd
12132 ?        S      0:00 /usr/local/apache/bin/httpsd
14163 ?        S      0:00 /usr/local/apache/bin/httpsd
 7065 ?        S      0:00 /usr/local/apache/bin/httpsd
 8274 ?        S      0:00 /usr/local/apache/bin/httpsd
21419 ?        S      0:00 /usr/local/apache/bin/httpsd
21461 ?        S      0:00 /usr/local/apache/bin/httpsd
21477 ?        S      0:00 /usr/local/apache/bin/httpsd
22458 ?        S      0:00 in.identd -l -e -o
22637 tty1     S      0:00 -bash
22699 tty1     R      0:00 ps ax

No POP3 or SMTP services were available, and telnet didn't work either.
I have identd commented out in inetd.conf, so I have no idea where it is
coming from or how it started.

After stopping qmail.init, vpopmail, pop3d and smtpd this is a list of the
new processes:

23026 tty1     S      0:00 supervise /var/lock/qmail qmail-start # Using qmail-l
23027 tty1     S      0:00 splogger qmail
23028 tty1     S      0:00 qmail-send
23029 tty1     S      0:00 qmail-lspawn # Using qmail-local to deliver messages
23030 tty1     S      0:00 qmail-rspawn
23031 tty1     S      0:00 qmail-clean
23032 tty1     S      0:00 qmail-remote pa-arng.ngb.army.mil [EMAIL PROTECTED]
23043 tty1     S      0:00 supervise /var/lock/qmail-smtpd tcpserver -p -q -r -c
23044 tty1     S      0:00 splogger qmail-smptd
23045 tty1     S      0:00 tcpserver -p -q -r -c100 -x /etc/tcprules.d/qmail-smt
23056 tty1     S      0:00 supervise /var/lock/qmail-pop3d tcpserver -q -R -H -c
23057 tty1     S      0:00 tcpserver -q -R -H -c 100 -x /etc/tcprules.d/qmail-pop
23085 tty1     S      0:00 supervise /var/lock/qmail-vpop3d tcpserver -q -R -H -
23086 tty1     S      0:00 tcpserver -q -R -H -c100 -u0 -g0 vmail.f-tech.net pop
23092 tty1     S      0:00 qmail-popup vmail.f-tech.net /home/vpopmail/bin/vchkp
23093 tty1     S      0:00 qmail-vpop3d Maildir
23094 tty1     Z      0:00 [tcprules <defunct>]
23100 tty1     S      0:00 qmail-popup mail.f-tech.net checkpassword qmail-pop3d
23101 tty1     S      0:00 qmail-pop3d Maildir
23341 tty1     S      0:00 qmail-smtpd
23374 tty1     S      0:00 qmail-smtpd

Notice that I have a new pid 23094 tcprules <defunct>, but the pid seems
to indicate that it was from the restart of the services, not the original
ps output.
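
In case anyone wants to run the same check on their own box, here is a rough
Python sketch that picks defunct (state Z) entries out of saved "ps ax"
output. The function name find_defunct and the five-column layout (PID, TTY,
STAT, TIME, COMMAND) are my assumptions based on the listing above, not
anything that ships with qmail or ps.

```python
def find_defunct(ps_output):
    """Return (pid, command) pairs for processes whose STAT starts with Z.

    Assumes the five-column `ps ax` layout shown in this post:
    PID, TTY, STAT, TIME, COMMAND.
    """
    found = []
    for line in ps_output.splitlines():
        # Split into at most five fields so the command keeps its spaces.
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip headers, blank lines, and wrapped fragments
        pid, _tty, stat, _time, command = fields
        if stat.startswith("Z"):
            found.append((int(pid), command))
    return found


# Three lines taken from the listing above.
SAMPLE_PS = """\
23093 tty1     S      0:00 qmail-vpop3d Maildir
23094 tty1     Z      0:00 [tcprules <defunct>]
23100 tty1     S      0:00 qmail-popup mail.f-tech.net checkpassword qmail-pop3d
"""

if __name__ == "__main__":
    for pid, command in find_defunct(SAMPLE_PS):
        print(pid, command)
```

Comparing the result before and after a restart would show whether a zombie
predates the restart or was created by it.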

The mail log just before the stoppage indicates that tcpserver encountered
a huge number of CNAME lookup failures for outgoing mail, but DNS was
fine (the main DNS server is still running, and I do not cache entries on
the mail server).

Going through the mail log I see:

Jul 20 11:30:19 mail qmail: 964107019.317584 status: local 1/100 remote 51/100
Jul 20 11:30:20 mail qmail: 964107020.447592 delivery 118231: deferral: qmail-spawn_unable_to_fork._(#4.3.0)/

and 

Jul 20 11:48:28 mail qmail: 964108108.168548 status: local 0/100 remote 46/100
Jul 20 11:48:35 mail qmail: 964108115.918335 delivery 118202: deferral: Connected_to_24.0.95.29_but_connection_died._(#4.4.2)/

then 

Jul 20 12:01:47 mail qmail: 964108907.824767 status: local 0/100 remote 47/100
Jul 20 12:01:47 mail qmail: 964108907.917107 delivery 118245: deferral: CNAME_lookup_failed_temporarily._(#4.4.3)/

Concurrency went up to 51 for remote and 1 for local during the 30 minutes
it took to fail.
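
For what it's worth, this is roughly how those numbers can be pulled out of
the log. It is only a sketch in Python: the function name summarize and the
two regular expressions are my own, written against the log lines quoted
above, and it assumes each log entry sits on a single line the way syslog
writes it (not wrapped as in this mail).

```python
import re
from collections import Counter

# Patterns written against the qmail-send lines quoted in this post.
STATUS_RE = re.compile(r"status: local (\d+)/(\d+) remote (\d+)/(\d+)")
DEFERRAL_RE = re.compile(r"delivery \d+: deferral: (\S+)")


def summarize(lines):
    """Return (peak_local, peak_remote, Counter of deferral reasons)."""
    peak_local = peak_remote = 0
    reasons = Counter()
    for line in lines:
        status = STATUS_RE.search(line)
        if status:
            peak_local = max(peak_local, int(status.group(1)))
            peak_remote = max(peak_remote, int(status.group(3)))
            continue
        deferral = DEFERRAL_RE.search(line)
        if deferral:
            # Drop the trailing "/" and turn underscores back into spaces.
            reason = deferral.group(1).rstrip("/").replace("_", " ")
            reasons[reason] += 1
    return peak_local, peak_remote, reasons


# The six log entries quoted above, one entry per line.
SAMPLE_LOG = [
    "Jul 20 11:30:19 mail qmail: 964107019.317584 status: local 1/100 remote 51/100",
    "Jul 20 11:30:20 mail qmail: 964107020.447592 delivery 118231: deferral: qmail-spawn_unable_to_fork._(#4.3.0)/",
    "Jul 20 11:48:28 mail qmail: 964108108.168548 status: local 0/100 remote 46/100",
    "Jul 20 11:48:35 mail qmail: 964108115.918335 delivery 118202: deferral: Connected_to_24.0.95.29_but_connection_died._(#4.4.2)/",
    "Jul 20 12:01:47 mail qmail: 964108907.824767 status: local 0/100 remote 47/100",
    "Jul 20 12:01:47 mail qmail: 964108907.917107 delivery 118245: deferral: CNAME_lookup_failed_temporarily._(#4.4.3)/",
]

if __name__ == "__main__":
    local, remote, reasons = summarize(SAMPLE_LOG)
    print("peak local:", local, "peak remote:", remote)
    for reason, count in reasons.most_common():
        print(count, reason)
```

To run it over the real log, pass an open file object to summarize in place
of SAMPLE_LOG; the tally of deferral reasons should show whether the fork
failures or the CNAME failures came first and which dominated.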

ANY ideas??????

Thanks!


Paul Farber
Farber Technology
[EMAIL PROTECTED]
Ph  570-628-5303
Fax 570-628-5545
