Not being a programmer, I have no clue how to trace this, but if someone
were able to help me, I'd be glad to give it a go.  I'm on FreeBSD
4.2-STABLE at the moment, and will be updating again soon.

Qmail is built with a concurrency patch and the patches from the FreeBSD
port.  qmail-remote itself is unpatched from 1.03.

What I'm seeing is that these stuck processes are in the 'sbwait' state
(as shown by top).  netstat doesn't show any open connections to the
remote hosts (smtp or otherwise).

This problem doesn't seem to be related to the remote host, no matter the
MTA.  I've seen several stuck qmail-remote processes to a certain host, but
scanning through logs shows that mail has been successfully sent to that
same host on multiple occasions, both before and after the stuck process
was launched.

This doesn't seem to be a networking problem.  On one occasion, I had over
1500 messages queued up because the stuck qmail-remote processes had eaten
up my concurrency limit.  After clearing up the blockage, the box
processed those 1500 messages in less than 30 minutes.  However, it left
behind another handful of stuck qmail-remote processes.  Other messages
were undeliverable and left in the queue, and still others were sent back to
sender with permanent errors.

Logs are intact.  There's a start of delivery entry, but if qmail-remote
gets stuck, there is no further reference to those messages.

Yes, I can read the messages in the queue.  They are intact and appear to be
properly formatted.

There is no proxy server or firewall between this box and the rest of the
Internet.  Only a Cisco 2924 switch, a 3640 router and a T1 ride out to AT&T
or Sprint.


I hope all this information helps.  Anyone should feel free to ask for more
details, but please be specific in the information you need.  Remember, a
lot of us here are admins, not developers.


--
  Troy Settle
  Pulaski Networks
  540.994.4254


** -----Original Message-----
** From: Mark [mailto:[EMAIL PROTECTED]]
** Sent: Thursday, June 07, 2001 6:00 PM
** To: [EMAIL PROTECTED]
** Subject: Re: qmail-remote (cry wolf?)
**
**
** >  What are the probabilities of the Sendmail server being the one
** > causing the problems? What if the mail admin of mg.hk5.outblaze.com
** > has used some sort of patch that is causing qmail-remote's to hang?
** > Has anyone communicated with outblaze.com's postmaster?
**
** There is nothing a remote system can do that will hang qmail-remote on
** a correctly functioning OS. If the local TCP stack has accepted data
** and indicated available via the select() return, then the remote
** system has no further say as the read() only fetches the data
** previously received.
**
** I'll bet it's an OS bug - most likely in the TCP stack. Eg, it may be
** that the local TCP stack - in some circumstance - discards unread
** data, *then* marks the local socket as unreadable, rather than around
** the other way. That sort of window would wedge the select/read
** sequence in qmail-remote.
**
**
** Regards.
**
**
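
For anyone following along, the select/read sequence Mark describes can be
sketched roughly as below.  This is only an illustration, not the qmail
source: the function name read_with_timeout and the choice of ETIMEDOUT as
the timeout error are assumptions made for the example.  It shows why the
window he suspects would matter: the timeout covers only the select() call,
so if the data vanished between the two steps, the read() on a blocking
socket would wait with no timer at all.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from qmail): wait up to `seconds` for
 * fd to become readable, then read from it.  Returns the byte count
 * from read(), or -1 with errno set (ETIMEDOUT on timeout).
 */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int seconds)
{
    fd_set readable;
    struct timeval tv;

    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    FD_ZERO(&readable);
    FD_SET(fd, &readable);

    /* Step 1: the timeout applies only to this call. */
    if (select(fd + 1, &readable, NULL, NULL, &tv) == -1)
        return -1;

    if (!FD_ISSET(fd, &readable)) {
        errno = ETIMEDOUT;   /* nothing arrived in time */
        return -1;
    }

    /*
     * Step 2: select() said data is waiting, so this should return at
     * once.  On a blocking descriptor there is no timer here; if the
     * kernel had dropped the data in between, this call would sleep.
     */
    return read(fd, buf, len);
}
```

A process asleep in step 2 would be waiting on the socket buffer, which
seems consistent with the 'sbwait' state reported above, though that alone
does not prove this is where qmail-remote is stuck.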
