Hello world,

When the host running my Mailman installation is under high load
(high CPU usage, many CPU cycles spent waiting for I/O), posts to
some lists are sometimes sent out with about 1 in 300 subscribers
missing.

First, I checked whether the recipients who didn't receive posts had
mail delivery suspended; they didn't. Then I looked at the Mailman
logs, which contain the following entries:

#v+
/var/log/mailman/post:
Jun 19 11:13:07 2008 (3760) post to listname from [EMAIL PROTECTED] size=42101, message-id=<[EMAIL PROTECTED]>, success
#v-

#v+
/var/log/mailman/smtp:
Jun 19 11:13:09 2008 (3760) <[EMAIL PROTECTED]> smtp to listname for 226 recips, completed in 1.179 seconds
#v-

Note the 226 recipients and the long time it took to submit the
messages (1.179 seconds); as I said, the system was somewhat
"congested" at that time.

Let's have a look at the corresponding Postfix log entries:

#v+
Jun 19 11:13:07 mout03 postfix/smtpd[7065]: connect from localhost[127.0.0.1]
Jun 19 11:13:07 mout03 postfix/smtpd[7065]: B251578003: client=localhost[127.0.0.1]
Jun 19 11:13:07 mout03 postfix/cleanup[7062]: B251578003: message-id=<[EMAIL PROTECTED]>
Jun 19 11:13:07 mout03 postfix/qmgr[3962]: B251578003: from=<[EMAIL PROTECTED]>, size=42602, nrcpt=22 (queue active)
Jun 19 11:13:07 mout03 postfix/smtpd[7065]: B630678004: client=localhost[127.0.0.1]
Jun 19 11:13:07 mout03 postfix/cleanup[7062]: B630678004: message-id=<[EMAIL PROTECTED]>
Jun 19 11:13:07 mout03 postfix/qmgr[3962]: B630678004: from=<[EMAIL PROTECTED]>, size=42627, nrcpt=26 (queue active)
Jun 19 11:13:07 mout03 postfix/smtpd[7065]: C041B78005: client=localhost[127.0.0.1]
Jun 19 11:13:07 mout03 postfix/cleanup[7062]: C041B78005: message-id=<[EMAIL PROTECTED]>
Jun 19 11:13:08 mout03 postfix/qmgr[3962]: C041B78005: from=<[EMAIL PROTECTED]>, size=42419, nrcpt=19 (queue active)
Jun 19 11:13:08 mout03 postfix/smtpd[7065]: 2DD0D78003: client=localhost[127.0.0.1]
Jun 19 11:13:08 mout03 postfix/cleanup[7062]: 2DD0D78003: message-id=<[EMAIL PROTECTED]>
Jun 19 11:13:08 mout03 postfix/qmgr[3962]: 2DD0D78003: from=<[EMAIL PROTECTED]>, size=42153, nrcpt=158 (queue active)
Jun 19 11:13:09 mout03 postfix/smtpd[7065]: disconnect from localhost[127.0.0.1]
#v-

Now, 22+26+19+158 equals 225, not 226, and the log shows no rejected
mails and no NOQUEUE entries. So either Postfix or Mailman is lying.
How can I find out which one it is, aside from running ngrep/tcpdump?
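
For anyone who wants to repeat the tally on their own logs, here is a
minimal Python sketch of the comparison. It assumes the log line
formats shown above; the function names and the two regular
expressions are my own and not part of Mailman or Postfix. Counts are
keyed by queue ID because qmgr may log the same message more than
once if it is retried.

```python
import re

# Assumed format: "... postfix/qmgr[PID]: QUEUEID: from=<...>, size=N, nrcpt=N (queue active)"
NRCPT_RE = re.compile(r"postfix/qmgr\[\d+\]: ([0-9A-F]+): .*\bnrcpt=(\d+)")
# Assumed format: "... smtp to LISTNAME for N recips, completed in ..."
MAILMAN_RE = re.compile(r"smtp to \S+ for (\d+) recips")


def postfix_recipients(lines):
    """Return {queue_id: nrcpt} for every qmgr line in `lines`."""
    counts = {}
    for line in lines:
        m = NRCPT_RE.search(line)
        if m:
            # Keyed by queue ID so a repeated qmgr line is not counted twice.
            counts[m.group(1)] = int(m.group(2))
    return counts


def mailman_recipients(line):
    """Return the recipient count from a Mailman smtp log line, or None."""
    m = MAILMAN_RE.search(line)
    return int(m.group(1)) if m else None


# The qmgr lines and the Mailman smtp line from this post.
POSTFIX_SAMPLE = [
    "Jun 19 11:13:07 mout03 postfix/qmgr[3962]: B251578003: from=<[EMAIL PROTECTED]>, size=42602, nrcpt=22 (queue active)",
    "Jun 19 11:13:07 mout03 postfix/qmgr[3962]: B630678004: from=<[EMAIL PROTECTED]>, size=42627, nrcpt=26 (queue active)",
    "Jun 19 11:13:08 mout03 postfix/qmgr[3962]: C041B78005: from=<[EMAIL PROTECTED]>, size=42419, nrcpt=19 (queue active)",
    "Jun 19 11:13:08 mout03 postfix/qmgr[3962]: 2DD0D78003: from=<[EMAIL PROTECTED]>, size=42153, nrcpt=158 (queue active)",
]
MAILMAN_SAMPLE = (
    "Jun 19 11:13:09 2008 (3760) <[EMAIL PROTECTED]> smtp to listname "
    "for 226 recips, completed in 1.179 seconds"
)

if __name__ == "__main__":
    accepted = sum(postfix_recipients(POSTFIX_SAMPLE).values())
    claimed = mailman_recipients(MAILMAN_SAMPLE)
    print("Postfix accepted:", accepted)
    print("Mailman claimed: ", claimed)
    print("Missing:         ", claimed - accepted)
```

This only confirms the discrepancy between the two logs; it does not
tell which side dropped the recipient.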

Which additional configuration data should I provide to help with
debugging this remotely?


Ciao
Stefan
-- 
Stefan Förster     http://www.incertum.net/     Public Key: 0xBBE2A9E9
FdI #68: WWW - World Wide Waiting