Re: [Mailman-Users] Mailman / Sendmail horriblyslowperformance..suggestions please?!?
for closure of this topic: Begin forwarded message: From: "Andrew Falgout" <[EMAIL PROTECTED]> Date: Mon Aug 25, 2003 8:22:05 pm Europe/London To: "Richard Barrett" <[EMAIL PROTECTED]> Subject: Re: [Mailman-Users] Mailman / Sendmail horribly slowperformance..suggestions please?!? Reply-To: "Andrew Falgout" <[EMAIL PROTECTED]> Problem solved. It was archiving. That's all I needed was to figure out what it was pausing on. I turned off archival and informed my manager that the overhead associated with it would be too much for the system to bear. It is FLYING through the queued mails now and delivering them now. Been a lifesaver, and THANKS!!! - Original Message - From: "Richard Barrett" <[EMAIL PROTECTED]> To: "Andrew Falgout" <[EMAIL PROTECTED]> Sent: Monday, August 25, 2003 12:11 PM Subject: Re: [Mailman-Users] Mailman / Sendmail horribly slowperformance..suggestions please?!? On Monday, August 25, 2003, at 04:22 pm, Andrew Falgout wrote: I have not modified the default values in the SMTPDirect module or Defaults.py. If I could get qrunner to tell me anything about what message(s) it is working I could tell you how many recipients were on that list. Each list in the current system should be no more than 2-5 people. We moved over a small subset of mailing lists from the older majordomo server. However, there are some lists with several hundres recipients on it that will be moving in the future when these issues get worked out. Orphaned locks? (ocean) root $ ls -lt (~mailman/locks) total 16 -rw-rw-r--2 mailman mailman55 Aug 25 2003 qrunner.lock -rw-rw-r--2 mailman mailman55 Aug 25 2003 qrunner.lock.ocean.otr.usm.edu.16302 -rw-rw-r--2 mailman mailman62 Aug 25 2003 gifted.studies.lock -rw-rw-r--2 mailman mailman62 Aug 25 2003 gifted.studies.lock.ocean.otr.usm.edu.16302 smtp log Aug 25 10:16:47 2003 (16302) smtp for 1 recips, completed in 1.400 seconds Aug 25 10:16:47 2003 (16302) smtp for 1 recips, completed in 3.799 seconds Aug 25 10:16:49 2003 (16302) smtp for 1 recips, completed in 1.968 seconds Aug 25 10:16:49 2003 (16302) smtp for 1 recips, completed in 1.990 seconds I'm guessing this would mean that there are not any orphaned locks. When I restart qrunner by hand, I kill the proces and remove all the locks from the locks directory. I thought the 3.799 seconds in the smtp log was qrunner telling you how long it took to send the mail through the MTA. Mailman's qrunner process each message through a pipeline. Attached is a modified version of the file $prefix/Mailman/Handlers/HandlerAPI.py This logs extra information about the progress of each message through the pipeline into the smtp log. This may give some insight as to where the qrunner is spending its time. The changes in the revised file are as follows: /Users/richard/development/mailman-htdig/mailman-2.0.13/Mailman/ Handlers bash-2.05a$ diff -u HandlerAPI.py.orig HandlerAPI.py --- HandlerAPI.py.orig Mon Aug 25 17:58:17 2003 +++ HandlerAPI.py Mon Aug 25 18:06:01 2003 @@ -73,10 +73,12 @@ def do_pipeline(mlist, msg, msgdata, pipeline): +t0x = t1x = t2x = time.time() while pipeline: modname = pipeline.pop(0) mod = __import__('Mailman.Handlers.' + modname) func = getattr(getattr(getattr(mod, 'Handlers'), modname), 'process') +t1x = time.time() try: pid = os.getpid() func(mlist, msg, msgdata) @@ -138,6 +140,10 @@ traceback.print_exc(file=s) syslog('error', s.getvalue()) break +t2x = time.time() +syslog('smtp', 'do_pipeline: list %s, modname %s, startat %.3f, durn %.3f' % + (mlist.internal_name(), modname, (t1x-t0x), (t2x-t1x))) +t1x = t2x return pipeline The revised files is: --- - About the best I can do to help. - Original Message - From: "Richard Barrett" <[EMAIL PROTECTED]> To: "Andrew Falgout" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Monday, August 25, 2003 9:52 AM Subject: Re: [Mailman-Users] Mailman / Sendmail horribly slowperformance..suggestions please?!? On Monday, August 25, 2003, at 02:28 pm, Andrew Falgout wrote: The error log is at 0 bytes. I have not overridden the Defaults.py which shows Defaults.py:MTA_ALIASES_STYLE = 'sendmail'. So I assume that means that sendmail is the MTA. You will be using SMTP to the local MTA (which you say is Sendmail) if the following MM config variables are set as follows in either mm_cfg.py or the Defaults.py file. The following are th standard MM 2.0.13 default values: DELIVERY_MODULE = 'SMTPDirect' SMTPHOST = 'localhost' Also are the following default values still in place: SMTP_MAX_RCPTS = 500 MAX_DELIVERY_THREADS = 0 Aug 25 08:08:31 2003 (27798) smtp for 2 recips, completed in 1.045 seconds Aug 25 08:08:31 2003 (27798)
Re: [Mailman-Users] Mailman / Sendmail horriblyslowperformance..suggestions please?!?
I have not modified the default values in the SMTPDirect module or Defaults.py. If I could get qrunner to tell me anything about what message(s) it is working I could tell you how many recipients were on that list. Each list in the current system should be no more than 2-5 people. We moved over a small subset of mailing lists from the older majordomo server. However, there are some lists with several hundres recipients on it that will be moving in the future when these issues get worked out. Orphaned locks? (ocean) root $ ls -lt (~mailman/locks) total 16 -rw-rw-r--2 mailman mailman55 Aug 25 2003 qrunner.lock -rw-rw-r--2 mailman mailman55 Aug 25 2003 qrunner.lock.ocean.otr.usm.edu.16302 -rw-rw-r--2 mailman mailman62 Aug 25 2003 gifted.studies.lock -rw-rw-r--2 mailman mailman62 Aug 25 2003 gifted.studies.lock.ocean.otr.usm.edu.16302 smtp log Aug 25 10:16:47 2003 (16302) smtp for 1 recips, completed in 1.400 seconds Aug 25 10:16:47 2003 (16302) smtp for 1 recips, completed in 3.799 seconds Aug 25 10:16:49 2003 (16302) smtp for 1 recips, completed in 1.968 seconds Aug 25 10:16:49 2003 (16302) smtp for 1 recips, completed in 1.990 seconds I'm guessing this would mean that there are not any orphaned locks. When I restart qrunner by hand, I kill the proces and remove all the locks from the locks directory. I thought the 3.799 seconds in the smtp log was qrunner telling you how long it took to send the mail through the MTA. - Original Message - From: "Richard Barrett" <[EMAIL PROTECTED]> To: "Andrew Falgout" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Monday, August 25, 2003 9:52 AM Subject: Re: [Mailman-Users] Mailman / Sendmail horribly slowperformance..suggestions please?!? > > On Monday, August 25, 2003, at 02:28 pm, Andrew Falgout wrote: > > > The error log is at 0 bytes. I have not overridden the Defaults.py > > which > > shows Defaults.py:MTA_ALIASES_STYLE = 'sendmail'. So I assume that > > means > > that sendmail is the MTA. > You will be using SMTP to the local MTA (which you say is Sendmail) if > the following MM config variables are set as follows in either > mm_cfg.py or the Defaults.py file. The following are th standard MM > 2.0.13 default values: > > DELIVERY_MODULE = 'SMTPDirect' > SMTPHOST = 'localhost' > > Also are the following default values still in place: > > SMTP_MAX_RCPTS = 500 > MAX_DELIVERY_THREADS = 0 > > > > > Aug 25 08:08:31 2003 (27798) smtp for 2 recips, completed in 1.045 > > seconds > > Aug 25 08:08:31 2003 (27798) smtp for 2 recips, completed in 1.149 > > seconds > > Aug 25 08:12:47 2003 (27798) smtp for 1 recips, completed in 0.208 > > seconds > > Aug 25 08:12:47 2003 (27798) smtp for 1 recips, completed in 0.241 > > seconds > > Aug 25 08:14:25 2003 (31184) smtp for 1 recips, completed in 0.701 > > seconds > > Aug 25 08:14:25 2003 (31184) smtp for 1 recips, completed in 0.716 > > seconds > > Aug 25 08:17:21 2003 (31184) smtp for 1 recips, completed in 0.167 > > seconds > > Aug 25 08:17:21 2003 (31184) smtp for 1 recips, completed in 0.179 > > seconds > > Aug 25 08:19:34 2003 (31184) smtp for 1 recips, completed in 3.962 > > seconds > > Aug 25 08:19:34 2003 (31184) smtp for 1 recips, completed in 4.234 > > seconds > > > > This is just a few of the lines in my /var/log/mailman/smtp file. > > There are > > a few bounces from time to time, but not on great frequency. I did > > send an > > email to a test list, and attempted to time when I would see my email > > address in the post log, but after 10 minutes I have not seen it. This > > really confuses me, because the mail system on this server delivers > > mail > > directly to a mailbox in seconds. Why the delay? > > > I am going to assume that you had not made the edit to > $prefix/Mailman/Handlers/SMTPDirect.py I suggested in one of my > response and that these log entries are the regular ones? > > Bear in mind that the number in brackets in these log entries - (27798) > for instance - is the process id of the qurnner handling the message > and the same qrunner is handling multiple messages. > > The smtp log entries are interesting if you have a large backlog of > messages in the qfiles directory because: > > 1. only a few recipients are being delivered for each log entry. Are > your mailing list that sparely populated for subscribers? Do these smtp > log entries tie in with post entries for lists with few members and > /var/log/mail entries for outgoing mail through the MTA? > > 2. quite long periods - 2, 3, 4 minutes in the sample log entries you > give - elapse between log entries. But the log entries are for the same > process so it has been continuing to run during these pregnant pauses. > But what is it doing? > > Some thoughts. Mailman's qrunner can get slow if it is: > > a. archiving and > > b. the messages for a list are large messages and/or > > c. the number of messages in the current archive of a list is large. > > On the other hand,
Re: [Mailman-Users] Mailman / Sendmail horriblyslowperformance..suggestions please?!?
I've included some of the diagnostics that was requested. As of this moment there are 7423 messages in the qfiles directory for mailman. And it continues to grow. Please let me know if my thinking is flawed on how mails flow through the system. 1. Message to a list hits sendmail initially and goes into the mqueue directory for sendmail for processing. 2. Sendmail delivers the message to the mailman wrapper which places the message in the qfiles directory for qrunner to process. 3. qrunner picks up these messages one by one, and delivers the message to each of the recpts listed for the list. How does qrunner decide which email to process? Is there a way to have it process a certain message in the qfiles directory? Are there any utilities for determining what is in the queue or why it was queued by qrunner. I really do appreciate all the help. Yes.. I am using sendmail as my MTA. Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. 220 ocean.otr.usm.edu ESMTP Sendmail 8.11.6/8.11.6; Mon, 25 Aug 2003 09:21:56 -0500 (ocean) root $ time host python.org DNS lookups are almost instantaneous. python.org has address 194.109.137.226 host python.org 0.04s user 0.01s system 97% cpu 0.051 total (ocean) root $ dig -t python.org (/var/spool/mailman/qfiles) ;; Warning, ignoring invalid type python.org ; <<>> DiG 9.2.1 <<>> -t python.org ;; global options: printcmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19709 ;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 13 ;; QUESTION SECTION: ;. IN NS ;; ANSWER SECTION: . 5402IN NS G.ROOT-SERVERS.NET. . 5402IN NS H.ROOT-SERVERS.NET. . 5402IN NS I.ROOT-SERVERS.NET. . 5402IN NS J.ROOT-SERVERS.NET. . 5402IN NS K.ROOT-SERVERS.NET. . 5402IN NS L.ROOT-SERVERS.NET. . 5402IN NS M.ROOT-SERVERS.NET. . 5402IN NS A.ROOT-SERVERS.NET. . 5402IN NS B.ROOT-SERVERS.NET. . 5402IN NS C.ROOT-SERVERS.NET. . 5402IN NS D.ROOT-SERVERS.NET. . 5402IN NS E.ROOT-SERVERS.NET. . 5402IN NS F.ROOT-SERVERS.NET. ;; ADDITIONAL SECTION: A.ROOT-SERVERS.NET. 600537 IN A 198.41.0.4 B.ROOT-SERVERS.NET. 600537 IN A 128.9.0.107 C.ROOT-SERVERS.NET. 510409 IN A 192.33.4.12 D.ROOT-SERVERS.NET. 510409 IN A 128.8.10.90 E.ROOT-SERVERS.NET. 602151 IN A 192.203.230.10 F.ROOT-SERVERS.NET. 510409 IN A 192.5.5.241 G.ROOT-SERVERS.NET. 602151 IN A 192.112.36.4 H.ROOT-SERVERS.NET. 602151 IN A 128.63.2.53 I.ROOT-SERVERS.NET. 510408 IN A 192.36.148.17 J.ROOT-SERVERS.NET. 602151 IN A 192.58.128.30 K.ROOT-SERVERS.NET. 510408 IN A 193.0.14.129 L.ROOT-SERVERS.NET. 510408 IN A 198.32.64.12 M.ROOT-SERVERS.NET. 510409 IN A 202.12.27.33 ;; Query time: 3 msec ;; SERVER: 131.95.84.2#53(131.95.84.2) ;; WHEN: Mon Aug 25 09:24:48 2003 ;; MSG SIZE rcvd: 436 - Original Message - From: "Jon Carnes" <[EMAIL PROTECTED]> To: "Andrew Falgout" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]>; "Richard Barrett" <[EMAIL PROTECTED]> Sent: Monday, August 25, 2003 9:09 AM Subject: Re: [Mailman-Users] Mailman / Sendmail horribly slowperformance..suggestions please?!? > The description of the problem fits right in with the primary DNS > resolver not working and the server having to fall back to the secondary > for each look up. > > That might not be your problem, but in anycase, you should run a > local-caching Nameserver on your mailserver. Then check to see how long > it takes to do dns lookups: > host -t mx some.host.com > dig -t mx some.host.com > > The "dig" command will tell you how long it took to get the dns > information. The "host" command will simply give you the information > you requested - which in this case is a listing of the Mailservers for > the domain some.host.com > > You might also want to tweak your sendmail configuration: > http://www.trilug.org/~jonc/mailserver/PartIII.html > > BTW: the many MTA's use the "sendmail aliases style", the fact that > Mailman is using it to communicate with your MTA does not insure that > you are using Sendmail as your MTA. > > To find out which MTA you are using: > telnet localhost 25 > > Read the header it displays. It should indicate what service is acting > as your MTA. As an example, here is what I get: > 220 Anncons4.nc.rr.com ESMTP Sendmail 8.12.8/8.12.8; Mon, 25 Aug 2003 > 10:07:09 -0400 > > As you can see, I'm run