Thanks for the very informative message! Out of curiosity, is there any reason From munging needs to be off for non-Gmail hosts, specifically Fastmail (which probably shows up in your log as messagingengine.com)? I saw you mention that they bounced messages from the EC2 server, but it’s unclear whether any of the original issues apply.

Gaelan

> On Dec 31, 2019, at 12:15 AM, omd via agora-discussion
> <agora-discussion@agoranomic.org> wrote:
>
> On Fri, Dec 27, 2019 at 2:05 PM James Cook <jc...@cs.berkeley.edu> wrote:
>> Some data about trying to pinpoint the end of the mailing list outage.
>> It looks like it's slightly different per list; I suppose this may
>> reflect the dates omd updated the configurations. Dates below are
>> according to the archives at mailman.agoranomic.org.
>
> Here's a not-quite-exact chronology reconstructed from logs:
>
> - Unknown, but no later than Oct 29, when my logs start: Gmail first
> starts returning 421 errors (temporary failure) with "authentication
> information" message. At least since Oct 29, all list messages were
> delivered on later attempts.
>
> - Dec 14 23:01 UTC: Gmail first starts returning 550 errors (permanent
> failure) with same error message; first affected message is this one:
> https://mailman.agoranomic.org/cgi-bin/mailman/private/agora-discussion/2019-December/056000.html
>
> [During this period, Gmail rejected most deliveries, although it
> accepted some. The list could still receive from Gmail and deliver to
> other servers.]
>
> - Around Dec 22 23:21 UTC: Reconfigured qmail on vps.qoid.us (which
> hosts the lists) to forward via ec2.qoid.us.
>
> - Around Dec 23 00:39 UTC: Fixed ec2.qoid.us mail server to use IPv4
> instead of IPv6. Gmail etc. don't like mail coming from IPv6 because
> you can't do effective IP bans.
>
> [During this period, Gmail accepted... most deliveries, albeit delayed
> due to rate limits, but it did reject a lot of daily digests, which
> some people are subscribed to. Moreover, icloud.com started rejecting
> all deliveries; apparently ec2.qoid.us got onto the proofpoint.com
> blacklist, a "machine-learning driven content classification system".
> Sigh.]
>
> - Around Dec 24 05:53 UTC: Reconfigured Mailman to send messages
> through SMTP directly to ec2.qoid.us rather than going through the
> local qmail. This shouldn't affect anything.
>
> - Around Dec 28 00:33 UTC: Turned on From munging and DKIM signing and
> switched back to vps.qoid.us. No mass rejections since then.
>
> In all cases, the three lists were affected at the same time (except
> turning on From munging, which happened a few seconds apart for each
> list).
>
> Unfortunately, since each subscriber gets their own separate delivery
> attempt (mostly), there's no clear line between the list working and
> not working. The possibility of delayed delivery makes things even
> more complicated, as does the interaction with daily digests. I do
> think it's a good idea to resolve this via ratification.
>
> Sorry for the delay in explaining what's going on. I'm with family
> for the holidays, and I end up not spending any time on non-family
> stuff, even though I have plenty of time.
>
> The "authentication information" error message in question:
>
> 550-5.7.26 This message does not have authentication information or fails to
> 550-5.7.26 pass authentication checks. To best protect our users from spam, the
> 550-5.7.26 message has been blocked. Please visit
> 550-5.7.26 https://support.google.com/mail/answer/81126#authentication for more
> 550 5.7.26 information. t17si15910193pjr.44 - gsmtp
>
> The message is misleading. Without From munging, list messages do
> often fail DKIM authentication checks, because of the DIS/BUS/OFF
> prefix added to the subject. But this failure has existed for years
> without causing Gmail to reject messages, although it sometimes sent
> them to Spam or marked them as suspicious. Moreover, sending the same
> messages from ec2.qoid.us worked... or at least didn't fail the same
> way. So it seems like Gmail decided to distrust vps.qoid.us's IP
> address. I can think of a few possible reasons why:
>
> - Backscatter: I recently checked the IP address against various spam
> blacklists, and while it wasn't on the most common ones, it was on the
> backscatterer.org blacklist. This surprised me. Turns out that my
> server was vulnerable to a straightforward backscatter attack, where
> you send mail to an intentionally invalid recipient, setting the From
> address to whoever you want to spam, and the resulting bounce message
> is delivered unsolicited to them. The version of qmail I'm using has
> a mechanism to reject invalid recipients synchronously within the SMTP
> connection, rather than sending a bounce message... but when I first
> started running the lists, I had to disable this mechanism due to a
> bug. I forgot that I still hadn't fixed that. Oops.
>
> Since then, I've fixed the bug and re-enabled recipient verification.
> As a bonus, I also wrote some code to synchronously reject messages to
> the lists if the sender isn't subscribed to that list. This
> duplicates an existing check in Mailman, which has always been
> enabled, but is asynchronous. Originally it was set to reject
> messages from non-subscribers with an explanatory bounce message, but
> a long time ago I had to switch it to silently ignoring them, again
> for fear of backscatter spam. Having messages silently ignored is
> confusing; now I can return a proper error without risking
> backscatter. (The error will probably be returned to the sender as a
> bounce message, but coming from their own delivery agent rather than
> my server; it doesn't have the same issue because it knows the From
> isn't forged.) Note that this doesn't currently work for other
> Mailman rejections, such as for oversized messages.
>
> Anyway, other possible reasons:
>
> - Backscatter via Mailman: Some "bot" email addresses, like
> agora-<listname>-subscr...@agoranomic.org, would send a response that
> quotes your original message in full, creating the possibility of
> another kind of backscatter. Not sure if anyone was actually doing
> this, but for now I've disabled these aliases; now the only way to
> subscribe is by filling out the web form, and the only way to verify
> the subscription is by clicking the link in the confirmation message
> (as opposed to replying to it). A bit suboptimal, especially since
> the confirmation messages have a tendency to get rejected as
> unsolicited mail, but meh.
>
> - Approval requests: Technically, messages from non-subscribers were
> not blackholed but held for moderation. This resulted in approval
> request messages which I had set to go to my Gmail account, so that in
> theory I could spot legitimate messages – even if I usually wasn't
> paying attention, because the messages were almost all just random
> spam sent to the list email addresses. Well, Gmail started marking
> approval requests for spam as being spam itself, which was convenient
> for me as it could help distinguish legitimate messages. However, the
> stream of "spam" from my server probably harmed its reputation. This
> was dumb; I should have changed it long ago. In any case, now that
> I'm synchronously rejecting messages from non-subscribers, the stream
> of approval requests has stopped.
>
> - Forwarding: Completely unrelated to Agora, but hosted on the same
> server, I had some email aliases which forwarded all incoming mail to
> Gmail accounts, which of course included spam. I've switched this to
> a different server.
>
> Since I've addressed all these issues, as I've said, I'm hoping that
> vps.qoid.us's reputation will improve and I'll be able to turn From
> munging back off eventually.
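
For anyone following along, here is a rough sketch of the synchronous rejection idea described in the quoted message, written in Python purely for illustration (it is not omd's qmail code). The `LISTS` table, the example addresses, the `check_rcpt` name and the reply texts are all made up, and I'm assuming the check keys on the envelope sender. The only point is that the accept/reject decision is made while the sending server is still connected, so the list server never has to generate a bounce of its own.

```python
from dataclasses import dataclass

# Hypothetical data: list address -> set of subscribed sender addresses.
# The real lists and subscriber data live in qmail/Mailman, not in a dict.
LISTS = {
    "agora-discussion@agoranomic.org": {"alice@example.com"},
}


@dataclass
class Reply:
    code: int   # SMTP reply code (250 = accepted, 550 = permanent failure)
    text: str   # human-readable text; wording here is illustrative only


def check_rcpt(mail_from: str, rcpt_to: str) -> Reply:
    """Decide the reply to RCPT TO during the SMTP conversation.

    Refusing here means no bounce message is created by this server;
    the connecting server is left to notify its own user.
    """
    subscribers = LISTS.get(rcpt_to.lower())
    if subscribers is None:
        # Invalid recipient: refuse now instead of accepting and bouncing later.
        return Reply(550, "no such recipient")
    if mail_from.lower() not in subscribers:
        # Assumption: membership is checked against the envelope sender.
        return Reply(550, "sender is not subscribed to this list")
    return Reply(250, "ok")
```

With this shape, mail for a nonexistent address or from a non-subscriber is refused with a 550 before the message is ever accepted, which is what removes the backscatter risk; an accept-then-bounce design would instead send mail to whatever (possibly forged) address appears as the sender.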