On 2006-10-25 07:22:02 -0500, Peter Eisch wrote:
> On 10/25/06 1:11 AM, "Peter J. Holzer" <[EMAIL PROTECTED]> wrote:
> > On 2006-10-24 20:44:04 -0500, Peter Eisch wrote:
> >> On 10/24/06 8:23 PM, "Ask Bjørn Hansen" <[EMAIL PROTECTED]> wrote:
> >>>> I propose in the TODO the ability to clock the time it takes for
> >>>> spamd to scan an email.  A month or so ago there was a thread about
> >>>> this and I forgot I had such a mechanism in my spamassassin plugin.
> >>>> Rather than pollute this patch with its already suspect sprintf and
> >>>> sed.  I can submit that as a subsequent patch.
> >>> 
> >>> I don't get it.  Is "spamd is overwhelmed" a sign the mail was spam?
> >>> Or are you talking about handling timeouts from spamd with a 451
> >>> response to the smtp client?
> > 
> > But you would have to issue the 451 before passing the mail to spamd.
> > Otherwise you'll just exacerbate the problem by scanning lots of mails
> > where the result won't ever be used.
> > 
> 
> I came at the problem from users getting multiple copies of email.  After
> tracking the problem it was clear that client had gone away at some point
> during the scan.  The queuing unfortunately still took place.  While one
> spamd spun out of control, other incoming would get delivered multiple
> times.  The sending MTA would simply retry later.  If spamd chewed for an
> hour on an email and eventually returned, the mail was queued to the
> recipient.  The resulting 250 Queued would be written to a dead socket.  The
> recipient would then receive that email multiple times.  These were
> typically larger email but small enough to be scanned.

Yep. RFC 2821 says that the client should wait at least 10 minutes after
the terminating "." for the reply from the server. Unfortunately some
clients aren't that patient. So it is clearly desirable to keep the time
for scanning and queueing a message short.

If your system is mostly idle, but scanning of a few messages takes a
long time (maybe because of slow DNS lookups), returning an early
temporary error may help.
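
A minimal sketch of such an early temporary error, in Python for
illustration only. The scan callable, the time limit, and the reply
texts are assumptions, not anything spamd or qpsmtpd provides:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def smtp_reply_after_scan(scan, message, limit, pool):
    """Return an SMTP reply line, giving up on the scan after `limit` seconds.

    `scan` is a hypothetical callable: scan(message) -> True if spam.
    """
    future = pool.submit(scan, message)
    try:
        is_spam = future.result(timeout=limit)
    except TimeoutError:
        # We only stop waiting here; the scan itself keeps running
        # in its worker thread until it finishes on its own.
        return "451 Scanner busy, please try again later"
    if is_spam:
        return "554 Message rejected as spam"
    return "250 Queued"
```

Note that the timeout only shortens the wait seen by the client. The
scan is not cancelled, so this helps only when the scanner has spare
capacity.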

But that wasn't the case I was thinking of. I was thinking of the case
where the average time to scan a mail approaches the average time
between mails. In this case, returning a 45x error while spamd is still
scanning the mail makes the problem worse: spamd will continue to scan
the mail (AFAIK there is no way to abort it), thus delaying the scanning
of other mails. When the client retries, scanning will take even
longer than before, so the retry won't be queued either; it only adds
load, which makes scanning slower still ... you get the picture.
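
The feedback loop can be illustrated with a toy model (a sketch, not a
measurement). It assumes a single scanner working in arrival order,
scans that cannot be aborted, and clients that give up after a fixed
timeout and retry later; all names and parameters are made up:

```python
import heapq


def simulate(interval, scan_time, timeout, retry_delay, n_mails, max_tries=5):
    """Return (delivered, wasted): scans whose result was used vs. discarded."""
    # Each entry is (arrival time, mail id, attempt number).
    arrivals = [(i * interval, i, 1) for i in range(n_mails)]
    heapq.heapify(arrivals)
    free_at = 0.0          # time at which the scanner becomes idle
    delivered = wasted = 0
    while arrivals:
        t, mail, attempt = heapq.heappop(arrivals)
        start = max(t, free_at)
        free_at = start + scan_time      # the scan always runs to completion
        if free_at - t <= timeout:
            delivered += 1               # client was still waiting
        else:
            wasted += 1                  # client got a 45x or gave up
            if attempt < max_tries:
                retry_at = t + timeout + retry_delay
                heapq.heappush(arrivals, (retry_at, mail, attempt + 1))
    return delivered, wasted
```

In this model, simulate(10, 5, 60, 300, 100) delivers every mail, while
simulate(10, 12, 60, 300, 100) delivers only the first 25: once the
backlog exceeds the timeout, every later scan, including all retries,
is wasted work.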


> It would be best if there were a way to "ping" the sending MTA to ensure that
> if you continue on to queue the message that it will be there to receive the
> confirmation that the message was queued.

It may be possible to do that with a (non-blocking) read. But I don't
think this is reliable: if the client gets impatient and sends a QUIT
before disconnecting, the read returns the QUIT that is still "in the
queue" instead of EOF, so the disconnect goes unnoticed.
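
A minimal sketch of such a non-blocking check, using Python sockets for
illustration (peer_state is a hypothetical helper name). It peeks at
the socket without consuming data and shows why pending input hides
the disconnect:

```python
import socket


def peer_state(sock):
    """Classify the peer as 'open', 'closed' or 'unknown' without consuming data."""
    old_timeout = sock.gettimeout()
    sock.setblocking(False)
    try:
        data = sock.recv(1, socket.MSG_PEEK)
    except BlockingIOError:
        return "open"       # nothing to read and no EOF seen so far
    except ConnectionError:
        return "closed"     # e.g. connection reset by the peer
    finally:
        sock.settimeout(old_timeout)
    if data == b"":
        return "closed"     # EOF: the peer has shut down its side
    return "unknown"        # unread bytes (such as a QUIT) sit in front of any EOF
```

If the client sent a QUIT and then closed the connection, this returns
"unknown" until the pending bytes have been read; only then does the
EOF become visible. Even an "open" result only reflects the moment of
the check.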

I don't know if you can determine whether the socket is still writable
without actually writing to it. If that works it should be more reliable -
you still have a race condition, but the critical time is much shorter.

Generally speaking, getting "exactly once" semantics is a hard
problem if you don't have a way to uniquely identify a message.
Unfortunately, in mail (unlike in Usenet) the Message-ID can't be
used to remove duplicates.

        hp

-- 
   _  | Peter J. Holzer    | Schlagfertigkeit ist das, was einem
|_|_) | Sysadmin WSR       | auf dem Nachhauseweg einfällt.
| |   | [EMAIL PROTECTED]         |    -- Lars 'Cebewee' Noschinski in dasr.
__/   | http://www.hjp.at/ |
