Re: qmail died again... 3x in 3 weeks

2000-07-24 Thread Paul Farber

It seems that all of a sudden my RH box has hit a resource limit problem.
DNS is fine, but after 61 qmail-remotes it would appear that RH ran out of
resources.

I searched the archives and added some ulimit commands to the qmail.init
script, but I couldn't find a way to determine how many open files etc.
to allow.

If anyone knows how many resources qmail needs for a concurrency of 100,
let me know, as the default RH settings are too low once you add the other
services on the box (https, ssh, ntp, etc.).

Paul Farber
Farber Technology
[EMAIL PROTECTED]
Ph  570-628-5303
Fax 570-628-5545





Re: qmail died again... 3x in 3 weeks

2000-07-22 Thread Eric Cox



Paul Farber wrote:
> 
> telnetting to port 25 and 110 just timed out.  

This usually means (when it has happened to me anyway) that the 
server is listening on the port you're telnetting to, but is 
stalled doing a reverse DNS lookup of the client's IP address.  
Perhaps a munged reverse DNS zonefile?


> DNS was fine... it means
> just that, I could ping via hostname and the dns logs show it was running.

That could still happen under the above scenario...
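
The stalled reverse lookup described above can be checked directly. This is a minimal sketch, assuming Python is available on the mail server; `timed_reverse_lookup` and its `resolver` parameter are hypothetical names introduced here for illustration. It runs a PTR lookup in a worker thread and reports how long the caller had to wait.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as LookupTimeout

def timed_reverse_lookup(ip, timeout=5.0, resolver=socket.gethostbyaddr):
    """Return (status, hostname, seconds) for a reverse (PTR) lookup of ip.

    status is "ok", "timeout" (resolver stalled) or "error" (lookup failed).
    The lookup runs in a worker thread so a stalled resolver cannot hang
    the caller for longer than `timeout` seconds.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    start = time.monotonic()
    future = pool.submit(resolver, ip)
    try:
        hostname = future.result(timeout=timeout)[0]
        status = "ok"
    except LookupTimeout:
        hostname, status = None, "timeout"
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses
        hostname, status = None, "error"
    finally:
        pool.shutdown(wait=False)  # do not block on a stuck worker thread
    return status, hostname, time.monotonic() - start

if __name__ == "__main__":
    # 127.0.0.1 is only a placeholder; use the address of a client that timed out.
    print(timed_reverse_lookup("127.0.0.1"))
```

Pinging by hostname only exercises forward resolution. A server that looks up each client's address depends on reverse resolution, which is what this sketch times, so a "timeout" result here would be consistent with the symptom even when forward lookups work.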

Eric



Re: qmail died again... 3x in 3 weeks

2000-07-20 Thread Ricardo Cerqueira

> 
> Have you thought about migrating to the latest version of
> daemontools and its startup idiom?  More importantly, 
> using cyclog?

That would be kind of difficult with the _latest_ version. :)
Try multilog :)

RC

-- 
+---
| Ricardo Cerqueira  
| PGP Key fingerprint  -  B7 05 13 CE 48 0A BF 1E  87 21 83 DB 28 DE 03 42 
| Novis  -  Engenharia ISP / Rede Técnica 
| Pç. Duque Saldanha, 1, 7º E / 1050-094 Lisboa / Portugal
| Tel: +351 21 3166700 (24h/dia) - Fax: +351 21 3166701



Re: qmail died again... 3x in 3 weeks

2000-07-20 Thread John White

On Thu, Jul 20, 2000 at 02:16:21PM -0400, Paul Farber wrote:
> Here is the ps output after the crash at 12:35 today:
> 
>   339 ?        S    657:23 syslogd -m 0
[...]
>   486 ?        S      1:43 splogger qmail
 
Have you thought about the resources used by syslogd during
heavy qmail use?

Have you thought about migrating to the latest version of
daemontools and its startup idiom?  More importantly, 
using cyclog?

John 



Re: qmail died again... 3x in 3 weeks

2000-07-20 Thread Paul Farber

Telnetting to ports 25 and 110 just timed out.  By "DNS was fine" I mean
just that: I could ping via hostname, and the DNS logs show it was running.
No other host (the web server specifically) showed a DNS failure; its log
files had hostnames resolved during the 30-minute mail 'outage'.

I have the ps output saved as files; gpm cut/paste will only go so far.

"qmail unable to fork" is a new one for me.  I have concurrency set to 100,
and ulimit returns unlimited (from the command line).

ulimit -a shows
core file size (blocks)     100
data seg size (kbytes)      unlimited
file size (blocks)          unlimited
max memory size (kbytes)    unlimited
stack size (kbytes)         8192
cpu time (seconds)          unlimited
max user processes          2048
pipe size (512 bytes)       8
open files                  1024
virtual memory (kbytes)     2105343
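
The figures above come from an interactive shell. Resource limits are per-process and inherited, so processes started from the qmail.init script may be running under different limits than the ones shown here. The following is a minimal sketch, assuming Python is available on the box, that prints the limits of whichever process runs it; `current_limits` and `fmt` are hypothetical helper names. Invoked from the init script after its ulimit commands, it would show what qmail actually inherits.

```python
import resource

# Limits most relevant to an "unable to fork" deferral: process count and
# open files, plus address space. Values are in the kernel's units
# (bytes for RLIMIT_AS), not the kbytes that ulimit prints.
NAMES = ["RLIMIT_NPROC", "RLIMIT_NOFILE", "RLIMIT_AS"]

def fmt(value):
    """Render a limit the way ulimit does, using 'unlimited' for RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)

def current_limits():
    """Return {name: (soft, hard)} for the limits this process inherited."""
    limits = {}
    for name in NAMES:
        const = getattr(resource, name, None)  # not every platform defines all of them
        if const is not None:
            limits[name] = resource.getrlimit(const)
    return limits

if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name:14} soft={fmt(soft):>12} hard={fmt(hard):>12}")
```

The soft limit is the one enforced; a process can raise it only up to the hard limit unless it runs as root.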


System memory available:
Mem:   131022848 125976576   5046272   8724480  77127680  23965696
Swap:   41250816   3854336  37396480
MemTotal:     127952 kB
MemFree:        4928 kB
MemShared:      8520 kB
Buffers:       75320 kB
Cached:        23404 kB
SwapTotal:     40284 kB
SwapFree:      36520 kB

I can see a large mailing from one single user in the logs; 2 of over
50 bounced (per qmail-qread).  But I'm pretty sure I've handled Cc: or To:
lists of 150 before without problems (from a site we host).

What else can I do to raise the available resources?

Paul Farber
Farber Technology
[EMAIL PROTECTED]
Ph  570-628-5303
Fax 570-628-5545





Re: qmail died again... 3x in 3 weeks

2000-07-20 Thread markd

On Thu, Jul 20, 2000 at 02:16:21PM -0400, Paul Farber wrote:
> Hello all, I had another qmail failure... I say that because qmail seems
> to be the only service affected. I lump tcpserver, smtpd and pop3d in
> as qmail for now as I don't have a definitive place to look.
> 
> Here is the ps output after the crash at 12:35 today:

Which looks OK to me. All the processes that are needed seem to be
there.

> No pop3 or smtp services were available, telnet didn't work either.

What do you mean by "available"? What happened when you tried to connect
to those ports?

> I have identd commented out in inetd.conf, so I have no idea where it is
> coming from, or how it started.

What do you expect identd to tell you that's relevant to qmail?

> After stopping qmail.init, vpopmail, pop3d and smtpd this is a list of the
> new processes:

Apart from the qmail-remotes, what relevant differences do you see
between this and the last ps?

> 23026 tty1 S  0:00 supervise /var/lock/qmail qmail-start # Using qmail-l
> 23027 tty1 S  0:00 splogger qmail
> 23028 tty1 S  0:00 qmail-send
> 23029 tty1 S  0:00 qmail-lspawn # Using qmail-local to deliver messages

Is this a heavily edited 'ps' output? It makes it difficult if it is.

> Notice that I have a new pid 23094 tcprules, but the pid seems
> to indicate that it was from the restart of the services, not the original
> ps output.

tcprules hasn't much to do with the running of qmail though.

> The mail log just before the stoppage indicates that tcpserver encountered
> a huge number of CNAME lookup failures for outgoing mail, but DNS was
> fine (the main DNS server is still running, and I do not cache entries on
> the mail server).

What do you mean by DNS "was fine"? Do you mean it had reachability to
resolve queries or do you mean that the process was running?

My suspicion is that you lost the ability to do DNS lookups, and that
stalled reverse lookups, which in turn stalls every process that wants
to do one (such as a wrappered telnet).

> Going through the mail log I see:
> 
> Jul 20 11:30:19 mail qmail: 964107019.317584 status: local 1/100 remote 51/100
> Jul 20 11:30:20 mail qmail: 964107020.447592 delivery 118231: deferral: qmail-spawn_unable_to_fork._(#4.3.0)/
> 
> and 
> 
> Jul 20 11:48:28 mail qmail: 964108108.168548 status: local 0/100 remote 46/100
> Jul 20 11:48:35 mail qmail: 964108115.918335 delivery 118202: deferral: Connected_to_24.0.95.29_but_connection_died._(#4.4.2)/
> 
> then 
> 
> Jul 20 12:01:47 mail qmail: 964108907.824767 status: local 0/100 remote 47/100
> Jul 20 12:01:47 mail qmail: 964108907.917107 delivery 118245: deferral: CNAME_lookup_failed_temporarily._(#4.4.3)/
> 
> Concurrency went up to 51 for remote, local 1 during the 30 minutes it
> took to fail.

And what do you make of that? The first message tells me that you
haven't given qmail-send enough resources to start all the processes
you've configured it for - but it's not fatal.
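
To see how often each kind of deferral occurs, and how close concurrency came to the configured limit, the log can be tallied rather than read by eye. This is a minimal sketch, assuming each log entry sits on one line as it does in the log file itself; `summarize` and the two regular expressions are hypothetical names, written against the line shapes quoted in this thread only.

```python
import re
from collections import Counter

STATUS = re.compile(r"status: local (\d+)/(\d+) remote (\d+)/(\d+)")
DEFERRAL = re.compile(r"delivery (\d+): deferral: (\S+)")

def summarize(lines):
    """Return (peak_local, peak_remote, Counter of deferral reasons)."""
    peak_local = peak_remote = 0
    reasons = Counter()
    for line in lines:
        m = STATUS.search(line)
        if m:
            peak_local = max(peak_local, int(m.group(1)))
            peak_remote = max(peak_remote, int(m.group(3)))
            continue
        m = DEFERRAL.search(line)
        if m:
            # The log shows the reason with underscores in place of spaces and
            # a trailing "/"; undo both for readability. A genuine underscore
            # in a reason would also become a space.
            reasons[m.group(2).rstrip("/").replace("_", " ")] += 1
    return peak_local, peak_remote, reasons

# The six entries quoted in this thread, one per line.
SAMPLE = [
    "Jul 20 11:30:19 mail qmail: 964107019.317584 status: local 1/100 remote 51/100",
    "Jul 20 11:30:20 mail qmail: 964107020.447592 delivery 118231: deferral: qmail-spawn_unable_to_fork._(#4.3.0)/",
    "Jul 20 11:48:28 mail qmail: 964108108.168548 status: local 0/100 remote 46/100",
    "Jul 20 11:48:35 mail qmail: 964108115.918335 delivery 118202: deferral: Connected_to_24.0.95.29_but_connection_died._(#4.4.2)/",
    "Jul 20 12:01:47 mail qmail: 964108907.824767 status: local 0/100 remote 47/100",
    "Jul 20 12:01:47 mail qmail: 964108907.917107 delivery 118245: deferral: CNAME_lookup_failed_temporarily._(#4.4.3)/",
]

if __name__ == "__main__":
    local, remote, reasons = summarize(SAMPLE)
    print("peak local:", local, "peak remote:", remote)
    for reason, count in reasons.most_common():
        print(count, reason)
```

On the quoted entries the remote peak is 51 of a configured 100, which matches the observation that the fork failure appears well before the concurrency limit is reached.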


Mark.