[Mailman-Users] failing qrunner
Hi guys,

We've got a problem with a half-completed delivery run. Somehow an address with a ? at the end of the domain managed to get into the list addresses, i.e. something like [EMAIL PROTECTED] instead of just [EMAIL PROTECTED]. Exim drops the connection when it sees this address, which means that none of the recipients in that run receive the message.

Firstly, Mailman should not have accepted that address, but this may have been fixed since (this is a rather old version; no, I can't upgrade it, nor am I allowed to fix the exim config ... don't even bother asking).

What I want to know is how Mailman handles the message delivery runs. Afaik each message that needs to go out is stored in some location, along with a list of recipients. Periodically Mailman checks which messages need to go out, and to which recipients, and then tries to make those deliveries, removing the recipients that it successfully delivers to.

Is there a manual way to remove the problem-causing email address from this list for the particular message? We've already removed it from the main list so it won't cause issues in future, but it's now holding up the delivery of an already sent message.

Jaco

--
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: http://mail.python.org/mailman/options/mailman-users/archive%40jab.org
Security Policy: http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.027.htp
Re: [Mailman-Users] failing qrunner
Jaco Kroon wrote:
> What I want to know is how Mailman handles the message delivery runs.
> Afaik each message that needs to go out is stored in some location,
> along with a list of recipients. Periodically Mailman checks which
> messages need to go out, and to which recipients, and then tries to
> make those deliveries, removing the recipients that it successfully
> delivers to.

That is correct. Assuming this is at least Mailman 2.1.x, the messages to be sent are placed in Mailman's 'out' queue (normally Mailman's qfiles/out/ directory) and picked up and delivered by OutgoingRunner.

If the MTA returns a non-retryable failure for one or more recipients, that is logged in Mailman's smtp-failure log and treated as a bounce for the failed recipients.

If the MTA returns a retryable failure for one or more recipients, that is also logged in Mailman's smtp-failure log, and the message is queued in the 'retry' queue for delivery to the failed recipients. Every 15 minutes, RetryRunner moves the message from the retry queue back to the out queue. This continues for DELIVERY_RETRY_PERIOD (default 5 days), after which Mailman gives up on this message.

> Is there a manual way to remove the problem-causing email address from
> this list for the particular message? We've already removed it from
> the main list so it won't cause issues in future, but it's now holding
> up the delivery of an already sent message.

First find the entry (a long, mostly numeric name ending in .pck) in qfiles/retry, and move that file aside. Then use Mailman's bin/dumpdb to dump the file. This will output the raw message and the message metadata. The metadata contains a list, 'recips', which holds the addresses remaining to be delivered.

If you are proficient in Python, you could write a short script to unpickle the message and metadata from the file, remove the bad recipient from recips and repickle the message and metadata. Then you could put the file in qfiles/out for delivery. (I'm currently debugging one I just wrote - I'll post a link soon.)

Alternatively, you could just remail the message outside of Mailman to the remaining recipients.

--
Mark Sapiro [EMAIL PROTECTED]        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
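
The unpickle / edit / repickle step described in this reply might look roughly like the sketch below. This is not Mark's script. It assumes a queue entry holds two consecutive pickles, the message followed by a metadata dictionary with a 'recips' list; the function name remove_recipient and the file paths are made up for illustration. On a real installation it would have to run under the same Python that Mailman uses (an old Python 2, which may need the 'with' blocks rewritten as try/finally), with Mailman's modules importable if the message was pickled as an object.

```python
import pickle


def remove_recipient(src_path, dst_path, bad_address):
    """Copy a queue entry, dropping bad_address from metadata['recips'].

    Hypothetical helper. Assumes the file is two pickles back to back:
    first the message, then the metadata dictionary.
    Returns the number of recipients removed.
    """
    with open(src_path, 'rb') as fp:
        msg = pickle.load(fp)        # first pickle: the message, left untouched
        metadata = pickle.load(fp)   # second pickle: the metadata dictionary

    recips = metadata.get('recips', [])
    kept = [r for r in recips if r.lower() != bad_address.lower()]
    metadata['recips'] = kept

    # Write to a new file so the original stays available as a backup.
    with open(dst_path, 'wb') as fp:
        pickle.dump(msg, fp, 2)
        pickle.dump(metadata, fp, 2)

    return len(recips) - len(kept)
```

One would run this on the file that was moved aside, check the result with bin/dumpdb, and only then move the new file into qfiles/out.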
Re: [Mailman-Users] failing qrunner
Mark Sapiro wrote:
> If you are proficient in Python, you could write a short script to
> unpickle the message and metadata from the file, remove the bad
> recipient from recips and repickle the message and metadata. Then you
> could put the file in qfiles/out for delivery. (I'm currently
> debugging one I just wrote - I'll post a link soon.)

The minimally tested script is at http://veenet.value.net/~msapiro/scripts/remove_recips and mirrored at http://fog.ccsf.edu/~msapiro/scripts/remove_recips.

--
Mark Sapiro [EMAIL PROTECTED]        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Re: [Mailman-Users] failing qrunner
Jaco Kroon wrote:
> Mark Sapiro wrote:
>> Jaco Kroon wrote:
>
> Ok. That covers the 4xx and 5xx responses to rcpt to:. What happens if
> the MTA simply closes the connection? From what I gathered, the smtp
> conversation had to look something like:
>
> S: 220 servername ESMTP Exim
> C: helo servername
> S: 250 servername Hello localhost [127.0.0.1]
> C: mail from: [EMAIL PROTECTED]
> S: 250 OK
> C: rcpt to: [EMAIL PROTECTED]
> S: 250 OK
> C: rcpt to: [EMAIL PROTECTED]
> S: --- force close connection ---

It will be logged in the 'smtp-failure' log as a 'Low level smtp error' and in the 'post' log with the number refused. It shouldn't be retried. What's in Mailman's 'smtp', 'smtp-failure' and 'post' logs?

> Now, the problem here is that you don't really know whether it's a 5xx
> or a 4xx error code, and it actually looks like the entire run for
> that message gets interrupted and put to sleep in its entirety. This
> may have been a bug that got fixed at some point (I don't even know
> which exact version of Mailman I'm working with, but it's at the
> latest something released around Feb 2007).
>
> So at this point it simply wouldn't continue any further, and
> smtp-failure actually logs the address after the faulty one as the one
> causing a problem.

It depends on what exception is returned by Python's smtplib. If Exim really just closes the connection, it will be logged in 'post' with a number of failures, as well as being logged in 'smtp-failure' as a 'Low level' error and in 'smtp'. Each attempted recipient from that transaction (all the ones, up to SMTP_MAX_RCPTS (default 500), that were going to be delivered, not just the ones whose rcpt to was not sent) will be logged in 'smtp-failure' as 'code -1: error'. Then the message will be put in the retry queue with the same recips list, minus any that were successfully delivered in a prior smtp transaction.

What is in the Mailman logs?

>> This continues for DELIVERY_RETRY_PERIOD (default 5 days), after
>> which Mailman gives up on this message.
>>
>>> Is there a manual way to remove the problem-causing email address
>>> from this list for the particular message? We've already removed it
>>> from the main list so it won't cause issues in future, but it's now
>>> holding up the delivery of an already sent message.
>>
>> First find the entry (a long, mostly numeric name ending in .pck) in
>> qfiles/retry, and move that file aside. Then use Mailman's bin/dumpdb
>> to dump the file. This will output the raw message and the message
>> metadata. The metadata contains a list, 'recips', which holds the
>> addresses remaining to be delivered.
>
> I saw the dumpdb program, had no idea what it does though. Now I do,
> and it'll make my life a lot easier next time. Any way to repack the
> file?
>
>> If you are proficient in Python, you could write a short script to
>> unpickle the message and metadata from the file, remove the bad
>> recipient from recips and repickle the message and metadata. Then
>> you could put the file in qfiles/out for delivery. (I'm currently
>> debugging one I just wrote - I'll post a link soon.)
>
> ... or issue mailmanctl stop, use vim on the file, find the invalid
> address and, without changing the size of the file, change the address
> to an RFC-legal address that is bogus, i.e. [EMAIL PROTECTED] can be
> changed to [EMAIL PROTECTED]. This doesn't break the pickle and won't
> cause exim to close the connection ... instead the message will bounce
> back to Mailman, harmlessly, since this server isn't using VERP.

Yes, you could do that, but as I posted later in this thread, there is now a script to just delete the bad address at http://veenet.value.net/~msapiro/scripts/remove_recips and mirrored at http://fog.ccsf.edu/~msapiro/scripts/remove_recips.

And, if you really wanted to use vim, there's no need to stop Mailman. Just move the file out of the queue directory, edit it, dump it with bin/dumpdb to verify it can still be unpickled, and move it back.

--
Mark Sapiro [EMAIL PROTECTED]        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
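
The failure handling discussed in this thread (5xx treated as a bounce, 4xx retried, and a dropped connection logged as code -1 for every attempted recipient and then retried) can be illustrated with a small sketch. This is not Mailman's code. The function names are invented for illustration, and the input is assumed to be a dictionary mapping each refused recipient to a (code, message) pair, which is the shape Python's smtplib returns from sendmail().

```python
# Default retry window quoted in the thread: 5 days, expressed in seconds.
DELIVERY_RETRY_PERIOD = 5 * 24 * 60 * 60


def classify_refusals(refused):
    """Split refused recipients into (permanent, temporary) lists.

    refused maps recipient -> (code, message). A code of 500 or above is
    treated as permanent (a bounce); anything else, including the -1 used
    for a low level error such as a dropped connection, is retried.
    """
    permanent, temporary = [], []
    for recip, (code, _message) in sorted(refused.items()):
        if code >= 500:
            permanent.append(recip)
        else:
            temporary.append(recip)
    return permanent, temporary


def past_retry_period(first_attempt, now, retry_period=DELIVERY_RETRY_PERIOD):
    """Return True once retries have gone on longer than retry_period."""
    return now - first_attempt > retry_period
```

With this shape, a single dropped connection puts every recipient of that transaction in the temporary list, which matches the behaviour Jaco observed: the whole run keeps being retried until the bad address is removed.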
Re: [Mailman-Users] failing qrunner
On 9/15/07, Jaco Kroon wrote:
> So at this point it simply wouldn't continue any further, and
> smtp-failure actually logs the address after the faulty one as the one
> causing a problem.

To avoid this problem in the future, try enabling personalization on the list and using VERP. Mailman will then make a separate delivery attempt for each user, and only the invalid one would fail in the manner you described. The rest should go through normally.

This would be a bigger performance hit on the server, but would help make your day-to-day operations more robust. This is especially important since you've said you can't upgrade any of the software, and we know that more recent versions of Mailman have significantly improved their ability to handle failures of various types and continue trying to deliver everything else.

--
Brad Knowles [EMAIL PROTECTED]
LinkedIn Profile: http://tinyurl.com/y8kpxu
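
For reference, the site-wide side of Brad's suggestion might look something like the fragment below, added to Mailman 2.1's Mailman/mm_cfg.py. The setting names are given as I recall them from Mailman 2.1's Defaults.py and should be checked against the Defaults.py of the installed version, which in this case is an old one. mm_cfg.py normally writes these values as Yes/No; plain integers are used here only so the fragment stands alone.

```python
# Sketch of mm_cfg.py additions; verify each name in your own Defaults.py.

# Let list owners turn on personalization in the list admin pages.
OWNERS_CAN_ENABLE_PERSONALIZATION = 1

# Use VERP for deliveries that are personalized.
VERP_PERSONALIZED_DELIVERIES = 1

# VERP every delivery (0 disables it, N means every Nth delivery).
VERP_DELIVERY_INTERVAL = 1
```

Personalization itself is then switched on per list by the list owner, and Mailman has to be restarted for mm_cfg.py changes to take effect.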