There is a lot of email in the archives of this list complaining about
things such as
  warning: unable to unlink local/9/3601004; will try again later

I saw this too (running with the RPMs made by Bruce Guenter <[EMAIL PROTECTED]>).

I investigated what was going on.  The key is to look at errno when
the unlink fails.  (By the way, I suggest that when the warning about
the failed unlink is printed, the error code ought to be printed too.)
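
To make that suggestion concrete, here is a minimal sketch of a warning
that includes the error code.  It is not qmail's actual logging code:
the helper name unlink_verbose is made up for illustration, and it
writes to stderr with stdio, which qmail itself does not use.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of qmail): try to unlink a file and,
 * on failure, report errno both as text and as a number. */
static int unlink_verbose(const char *path)
{
    if (unlink(path) == 0)
        return 0;

    int saved = errno;  /* fprintf may overwrite errno, so keep a copy */
    fprintf(stderr,
            "warning: unable to unlink %s: %s (errno %d); "
            "will try again later\n",
            path, strerror(saved), saved);
    errno = saved;      /* leave errno intact for the caller */
    return -1;
}
```

With something like this in place, the log line itself would show
whether the failure is an I/O error or something else.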

The unlink returned error code 5 (I/O error) sometimes, but not
always.

Taking out the syncdir patch makes the problem go away.

I mounted the ext2fs filesystem "sync" (it turns out the only thing on
that disk is my qmail queue and my alias maildirs, so it is an
excellent candidate for being mounted "sync".)

Now the system works much better, with none of those "unable to
unlink" messages in the logs.

A related problem:  The "try again later" is 123 seconds later:
   pe.dt = now() + SLEEP_SYSFAIL;
This can cause problems if more than a few hundred messages get into
this state (especially when using syslogd).  The problem is that qmail
spends all its time looking at these messages.  It would be much better
if the retry were scheduled with a quadratic backoff strategy, to avoid
swamping qmail with these bad messages.
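
As a rough sketch of what I mean by quadratic backoff (this is not a
patch against qmail): the delay grows with the square of the number of
failed attempts and is capped.  The function retry_delay, the failure
counter passed to it, and the one-day cap MAX_RETRY_DELAY are all my
own assumptions; only SLEEP_SYSFAIL and its value of 123 come from the
behaviour described above.

```c
#include <time.h>

#define SLEEP_SYSFAIL   123              /* base delay in seconds */
#define MAX_RETRY_DELAY (24 * 60 * 60)   /* assumed cap: one day */

/* Hypothetical helper: seconds to wait before the next retry, given
 * how many times the operation has failed so far.  The delay is
 * SLEEP_SYSFAIL * failures^2, capped at MAX_RETRY_DELAY. */
static time_t retry_delay(unsigned int failures)
{
    unsigned long n = failures ? failures : 1;  /* treat 0 as first failure */

    if (n > 1000)
        n = 1000;   /* keep n * n * SLEEP_SYSFAIL within 32 bits */

    unsigned long delay = (unsigned long)SLEEP_SYSFAIL * n * n;
    if (delay > MAX_RETRY_DELAY)
        delay = MAX_RETRY_DELAY;
    return (time_t)delay;
}
```

The scheduling line would then become something like
pe.dt = now() + retry_delay(failures), where failures is a per-message
counter that would have to be added.  The first retry still comes after
123 seconds, the second after 492, the third after 1107, and so on.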

-Bradley



