We've been considering blocking messages that break RFC compliance by including a space before or after the colon in MAIL FROM: and RCPT TO: commands. From RFC 5321 Section 3.3:

   Since it has been a common source of errors, it is worth noting that
   spaces are not permitted on either side of the colon following FROM
   in the MAIL command or TO in the RCPT command.  The syntax is exactly
   as given above.
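
To make the rule concrete, here is a minimal sketch in Python of the kind of check being discussed. It is not our MTA's actual code. The function name and regex are made up for illustration, and it only looks at whitespace around the colon; it does not validate the rest of the command.

```python
import re

# Hypothetical helper for illustration only (not taken from any real MTA).
# Captures any spaces/tabs found immediately before and after the colon
# in a MAIL FROM or RCPT TO command. Command verbs are case-insensitive.
_CMD = re.compile(r"^(MAIL FROM|RCPT TO)([ \t]*):([ \t]*)(.*)$", re.IGNORECASE)


def has_space_around_colon(command_line: str) -> bool:
    """Return True if a MAIL/RCPT command has whitespace on either side of the colon."""
    m = _CMD.match(command_line.rstrip("\r\n"))
    if m is None:
        # Not a MAIL FROM / RCPT TO command, or malformed in some other way;
        # this sketch only cares about the space-around-colon case.
        return False
    before, after = m.group(2), m.group(3)
    return bool(before or after)


if __name__ == "__main__":
    for line in (
        "MAIL FROM:<user@example.com>",
        "MAIL FROM: <user@example.com>",
        "RCPT TO :<user@example.com>",
    ):
        print(repr(line), has_space_around_colon(line))
```

A strict server would reject when this returns True (typically with a 501 syntax error reply), while a lenient one would just log it.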

I put together the following statistics to see just what would happen if we started enforcing this syntax. As it turns out, we would be able to skip a whole lot of RBL, SPF, and address lookups as well as content scans, and we would also block some spam that would otherwise have gotten through our filter... but we would also block *some* legit mail. Most of the legit mail that would have been blocked was sent by email marketers, though, and wouldn't necessarily have been missed by our customers; and if the senders know what they're doing, they can always fix the syntax on their end.

Anyway, I thought I'd post it here and see if anyone has any opinions on whether it's acceptable to enforce this. It's certainly very tempting.



Within a period of about 24 hours, the node that I examined processed 22,523 transactions that would have violated strict syntax checks WRT spaces.
-------------------------------
Rejections:
invalid recip  15,881
rbl             4,856
spamassassin      683
spf               387
syntax            168 (missing angle brackets, etc.)
relay denied       38
blacklisted sender 23
no rdns            14
rdns mismatch       2
phishing            2
-------------------------------
Deferrals:
spf defer          24
-------------------------------
Quarantined:      211
Delivered:        192
-------------------------------


Total Blocked:   22,265 (99.1%)
Total Delivered:    192 (0.9%)

This node processed a total of 33,553 transactions overall, 3,796 of which resulted in delivery. 66.9% of all transactions would have been blocked outright by rejecting improper spaces in MAIL FROM/RCPT TO commands. Doing so would also have decreased the number of delivered messages by 5% and the number of quarantined messages by 32%, but a few of these new blocks would have been false positives.
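
For anyone who wants to check the arithmetic, here is a quick sketch in Python using only the counts listed above. The variable names are mine, and it assumes the percentages are taken against blocked plus delivered (22,265 + 192 = 22,457), which leaves the 24 SPF deferrals out.

```python
# Counts copied from the table above.
rejections = {
    "invalid recip": 15881,
    "rbl": 4856,
    "spamassassin": 683,
    "spf": 387,
    "syntax": 168,
    "relay denied": 38,
    "blacklisted sender": 23,
    "no rdns": 14,
    "rdns mismatch": 2,
    "phishing": 2,
}
quarantined = 211
delivered = 192
node_total = 33553       # all transactions on the node
node_delivered = 3796    # all deliveries on the node

rejected = sum(rejections.values())
blocked = rejected + quarantined   # rejected or quarantined
decided = blocked + delivered      # assumption: deferrals are excluded


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


blocked_share = pct(blocked, decided)            # share of violators already blocked
delivered_share = pct(delivered, decided)        # share of violators delivered
share_of_node = pct(decided, node_total)         # violators as a share of all traffic
delivered_drop = pct(delivered, node_delivered)  # drop in deliveries if enforced

if __name__ == "__main__":
    print(rejected, blocked, decided)
    print(blocked_share, delivered_share, share_of_node, delivered_drop)
```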

I examined 77 of the delivered messages that would have been blocked due to RFC-ignorance. There were some definite false positives. Most of these false positives were addressed with 'bounce@' and similar addresses and were obviously sent by direct marketing companies. Three messages, however, were sent by normal end-users who would have received bounces if we were blocking based on RFC ignorance.

Of the remaining delivered messages with invalid syntax, nearly all *appeared* to be sent by direct marketers. Upon investigation, however, most of these appeared very likely to be illegitimate. I suspect that a small handful of popular direct marketing email software applications use this flawed syntax, and are used by some legitimate marketers, as well as by a large number of spammers, and perhaps a few small marketing companies that don't realize (and probably don't care) that their customers are sending spam through their servers. StrongMail, the apparent MTA for multiview.com, is one candidate:

http://www.strongmail.com/what-we-offer/email-deliverability-solutions/

At any rate, some obviously legitimate mass email campaigns would have been blocked due to RFC ignorance from the following organizations:

bounce.rd.com (Reader's Digest): 3
cmpgnr.com (Campaigner):         5
email.joann.com (Joann Fabric):  2
service.govdelivery.com:         2
medtech.cfmvmail.com:            1
mktg.artfulhome.com:             1
strongmail.multiview.com:        1
----------------------------------------
Marketing Senders:              15
Human Senders:                   3
Total:                          18



So, the samples that I examined had a 23.4% false positive rate, although FWIW if you don't count mass-marketers, it would be a 3.9% FP rate. The overall FP rate, without taking collisions with other rejection methods into account, would be around 0.2%. Considering that this syntax is clearly prohibited by RFC, and the majority of senders are professional organizations who can do something about the issue if they want their messages to get through our systems, it may be appropriate to reject outright in spite of the possibility of blocking some legitimate mail.

-Jared
