On Thu, Jul 20, 2000 at 02:16:21PM -0400, Paul Farber wrote:
> Hello all, I had another qmail failure... I say that because qmail seems
> to be the only service affected..... I lump tcpserver, smtpd and pop3d in
> as qmail for now as I don't have a definitive place to look.
> 
> Here is the ps output after the crash at 12:35 today:

Which looks ok to me. All the processes that are needed seem to be
there.

> No pop3 or smtp services were available, telnet didn't work either.

What do you mean by "available"? What happened when you tried to connect
to those ports?

> I have identd commented out in inet.conf, so I have no idea where it is
> coming from, or how it started.

What do you expect identd to tell you that's relevant to qmail?

> After stopping qmail.init, vpopmail, pop3d and smtpd this is a list of the
> new processes:

Apart from the qmail-remotes, what relevant differences do you see
between this and the last ps?

> 23026 tty1     S      0:00 supervise /var/lock/qmail qmail-start # Using
> qmail-l23027 tty1     S      0:00 splogger qmail
> 23028 tty1     S      0:00 qmail-send
> 23029 tty1     S      0:00 qmail-lspawn # Using qmail-local to deliver
> messages

Is this a heavily edited 'ps' output? It makes it difficult to read if it is.

> Notice that I have a new pid 23094 tcprules <defunct>, but the pid seems
> to indicate that it was from the restart of the services, not the original
> ps output.

tcprules hasn't much to do with the running of qmail though.

> The mail log just before the stoppage indicates that tcpserver encountered
> a huge number of cname lookup failures for outgoing mail..... but DNS was
> fine (the main DNS server is still running, and I do not cache entries on
> the mail server).

What do you mean by DNS "was fine"? Do you mean it had reachability to
resolve queries or do you mean that the process was running?

My suspicion is that you lost the ability to do DNS lookups, and
that stalled reverse lookups, which in turn stalls every process that
wants to do one (such as a wrappered telnet).
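
One way to test that suspicion from the mail server itself is to time a
forward and a reverse lookup. What follows is only a rough sketch: the
helper names (timed_lookup, check_dns) are made up for illustration, and
the hostname in the usage comment is a placeholder. The IP address is the
one from your log.

```python
import socket
import time


def timed_lookup(resolve, target):
    """Call resolve(target); return (result or None, seconds taken, error or None)."""
    start = time.monotonic()
    try:
        result, err = resolve(target), None
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        result, err = None, exc
    return result, time.monotonic() - start, err


def check_dns(hostname, ip):
    """Time one forward (name to address) and one reverse (address to name) lookup."""
    return {
        "forward": timed_lookup(socket.gethostbyname, hostname),
        "reverse": timed_lookup(lambda addr: socket.gethostbyaddr(addr)[0], ip),
    }


# Example use (performs real lookups, so it is left commented out):
# for kind, (result, secs, err) in check_dns("example.com", "24.0.95.29").items():
#     print(f"{kind}: result={result!r} took={secs:.2f}s error={err!r}")
```

If either lookup takes many seconds or fails while the name server process
is still running, that would point at reachability rather than at qmail.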

> Going through the mail log I see:
> 
> Jul 20 11:30:19 mail qmail: 964107019.317584 status: local 1/100 remote
> 51/100
> Jul 20 11:30:20 mail qmail: 964107020.447592 delivery 118231: deferral:
> qmail-spawn_unable_to_fork._(#4.3.0)/
> 
> and 
> 
> Jul 20 11:48:28 mail qmail: 964108108.168548 status: local 0/100 remote
> 46/100
> Jul 20 11:48:35 mail qmail: 964108115.918335 delivery 118202: deferral:
> Connected_to_24.0.95.29_but_connection_died._(#4.4.2)/
> 
> then 
> 
> Jul 20 12:01:47 mail qmail: 964108907.824767 status: local 0/100 remote
> 47/100
> Jul 20 12:01:47 mail qmail: 964108907.917107 delivery 118245: deferral:
> CNAME_lookup_failed_temporarily._(#4.4.3)/
> 
> Concurrency went up to 51 for remote, local 1 during the 30 minutes it
> took to fail.

And what do you make of that? The first message tells me that you
haven't given qmail-send enough resources to start all the processes
you've configured it for - but it's not fatal.
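
To see how concurrency and the deferral reasons moved over the half hour,
rather than picking out three samples by hand, something like the sketch
below could summarize the log. It assumes each log entry sits on a single
line (the lines quoted above were wrapped by the mailer), and the function
name summarize is only illustrative.

```python
import re

# Patterns match the qmail-send lines quoted above, once unwrapped.
STATUS_RE = re.compile(r"status: local (\d+)/(\d+) remote (\d+)/(\d+)")
DEFERRAL_RE = re.compile(r"delivery (\d+): deferral: (\S+)")


def summarize(lines):
    """Return (peak local concurrency, peak remote concurrency, deferral counts)."""
    peak_local = peak_remote = 0
    deferrals = {}
    for line in lines:
        m = STATUS_RE.search(line)
        if m:
            peak_local = max(peak_local, int(m.group(1)))
            peak_remote = max(peak_remote, int(m.group(3)))
            continue
        m = DEFERRAL_RE.search(line)
        if m:
            # qmail logs the reason with "_" for spaces and a trailing "/".
            reason = m.group(2).rstrip("/").replace("_", " ")
            deferrals[reason] = deferrals.get(reason, 0) + 1
    return peak_local, peak_remote, deferrals


sample = [
    "Jul 20 11:30:19 mail qmail: 964107019.317584 status: local 1/100 remote 51/100",
    "Jul 20 11:30:20 mail qmail: 964107020.447592 delivery 118231: deferral: "
    "qmail-spawn_unable_to_fork._(#4.3.0)/",
]
print(summarize(sample))
```

Run over the whole log, the deferral counts would show whether the fork
failures or the CNAME failures dominated before the stoppage.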


Mark.
