Re: MIME_BASE64_TEXT only on us-ascii
On Tue, 30 Nov 2021, Philip Prindeville wrote:

> On Nov 17, 2021, at 9:50 AM, Bill Cole wrote:
> > SpamAssassin rules are not laws in any sense. They do not prescribe or
> > proscribe any action. They do not reflect any sort of moral or ethical
> > judgment. They do not express or define technical correctness.
>
> Isn't that exactly what we're discussing here? "Technical correctness"?

The way I generally put it is: SpamAssassin is not an RFC-compliance audit tool.

--
John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
jhar...@impsec.org                    pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
The police of a state should never be stronger or better armed than the citizenry. An armed citizenry, willing to fight, is the foundation of civil freedom. -- Robert A. Heinlein, 1942
---
549 days since the first private commercial manned orbital mission (SpaceX)
Re: SPF_NONE scoring
On 2021-11-30 at 13:47:36 UTC-0500 (Tue, 30 Nov 2021 11:47:36 -0700) Philip Prindeville is rumored to have said:

> Hi,
>
> I'm looking at the 0.001 scoring for SPF_NONE and scratching my head. This
> was discussed a bit in early 2015, but maybe it needs revisiting with new
> perspective.
>
> Surely no one who cares about maintaining their reputation by protecting
> themselves against spoofing would fail to provide SPF records...

Surely no one who cares about the security of their email would run their own on-premises Exchange...

Having started my sysadmin career less than 30 years ago, I never have been exposed to an Internet where the dominant visible feature of my fellow admins has been operational competence. We're all a bunch of bozos making stupid mistakes...

> So how is this score arrived at?

In theory, it is set in concert with all of the other default rules by periodic analyses of the scoring of spam and ham corpora submitted by members of the SA community. As a 'network' rule, it is only included in analysis weekly.

In practice, it is nailed down at a tiny non-zero value because otherwise it would not be "good enough" to publish and demand has been expressed for its publication.

> And of Ham, how much of it has a valid SPF?

Recently: 90.1202%

> And of Spam, how much of it lacks a valid SPF?

Recently: 65.3614%

> Has anyone run some numbers?

Yes. See https://ruleqa.spamassassin.org/. The numbers above are drawn from the last "network masscheck" accessible there.

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: SPF_NONE scoring
> So how is this score arrived at?

I believe that scores of 0.001 are generally manually set, and not intended to be anything other than a visible marker that the rule hit. That is probably the case here.

Loren
Re: MIME_BASE64_TEXT only on us-ascii
On Tue, Nov 30, 2021 at 12:03:15PM -0700, Philip Prindeville wrote:
> > On Nov 17, 2021, at 9:50 AM, Bill Cole wrote:
> > SpamAssassin rules are not laws in any sense. They do not prescribe or
> > proscribe any action. They do not reflect any sort of moral or ethical
> > judgment. They do not express or define technical correctness.
>
> Isn't that exactly what we're discussing here? "Technical correctness"?

Hm, no? An app encoding pure ASCII in Base64 is not breaking any RFC, so it is behaving "technically correctly".

> Good internetworking implementations follow (to the extent they don't
> conflict with good security practices) Postel's Law, "be conservative in what
> you send, be liberal [but not naive] in what you accept".

Well, antispam efforts (like security for important stuff) are mostly exactly the OPPOSITE of good internetworking implementations of the old Postel's law. And for good reasons: in the internetworking implementations of old, the vast majority of peers (if not all) you interacted with were GOOD guys trying to do good things. In today's e-mail (and security), the majority of the actors are enemies trying to penetrate your defensive lines.

Also, see https://en.wikipedia.org/wiki/Robustness_principle#Criticism

> Rereading:
> > Base64 encoding is only necessary if there are non-ASCII characters used.
> > UTF-8 is a superset of ASCII & it is normal for MUAs to not encode more
> > than needed.
>
> Exactly. Encoding is only used when and where necessary.

...by legitimate users. Spammers on the other hand will sometimes encode even when it is NOT needed, probably in an attempt to evade less advanced antispam tools (or due to sheer laziness when writing the spam tool). The fact that such encoding is technically allowed does NOT change the fact that the technique is used vastly more by spammers than by innocent parties.

> Properly encoded HTML uses HTML-Entity naming, which is also ASCII-friendly,
> i.e. instead of Latin1 etc. or raw 8bit characters.

There are several "proper" (i.e. allowed by different RFCs) ways to encode that information in mail. Statistical analyses seem to say that some of those ways are used much more by spammers than by legitimate users. Hence the score for those methods.

--
Opinions above are GNU-copylefted.
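A minimal sketch, in Python, of the kind of observation being discussed: finding text parts that were sent as Base64 although their decoded content is pure ASCII. This is only an illustration built on the standard `email` module, not how SpamAssassin implements MIME_BASE64_TEXT; the function name `base64_ascii_text_parts` and the helper `make_sample` are hypothetical.

```python
import base64
import email


def base64_ascii_text_parts(raw_message: str) -> list[str]:
    """Return the content types of text parts that use base64 as their
    transfer encoding even though the decoded bytes are pure ASCII."""
    msg = email.message_from_string(raw_message)
    hits = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        cte = str(part.get("Content-Transfer-Encoding", "")).strip().lower()
        if cte != "base64":
            continue
        payload = part.get_payload(decode=True)  # bytes after base64 decoding
        if payload and payload.isascii():
            hits.append(part.get_content_type())
    return hits


def make_sample(body: bytes, charset: str) -> str:
    """Build a tiny single-part test message (illustrative helper only)."""
    encoded = base64.b64encode(body).decode("ascii")
    return (
        "From: sender@example.com\n"
        "Subject: test\n"
        "MIME-Version: 1.0\n"
        f"Content-Type: text/plain; charset={charset}\n"
        "Content-Transfer-Encoding: base64\n"
        "\n"
        f"{encoded}\n"
    )


ascii_sample = make_sample(b"Hello, plain ASCII text\n", "us-ascii")
utf8_sample = make_sample("Héllo, not only ASCII\n".encode("utf-8"), "utf-8")
```

Here `base64_ascii_text_parts(ascii_sample)` reports the `text/plain` part, while the UTF-8 sample is not reported because its decoded body contains non-ASCII bytes. Whether such a hit deserves a score is the statistical question, not something this check can answer.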
Re: SPF_NONE scoring
On Tue, Nov 30, 2021 at 11:47:36AM -0700, Philip Prindeville wrote:
> I'm looking at the 0.001 scoring for SPF_NONE and scratching my head. This
> was discussed a bit in early 2015, but maybe it needs revisiting with new
> perspective.

SPF is a double-edged sword. Sure, it is great for authenticating envelope senders when it works, but:

- When used in combination with mailing lists, plain message forwarding etc., it will break with false positives, marking (for example) this perfectly valid message of mine as a fake. See https://en.wikipedia.org/wiki/Sender_Policy_Framework#FAIL_and_forwarding
  This is the reason why you can only really use it for "SPF OK" validation: "SPF FAIL" does not really tell you anything, as it will happen as often for forged senders as for valid senders. This is why it will often end as "?all" or "~all" and not "-all" (and/or soft DMARC policies).

- Also, the envelope sender (on which SPF operates) is something completely different from the "From:" header, which is what the vast majority of users will see, so it does not provide the protection one might expect. See https://en.wikipedia.org/wiki/Sender_Policy_Framework#Header_limitations
  And this makes "SPF OK" much less useful than it sounds in theory.

- Then there are misconfigurations (hitting the limit of max 10 DNS lookups, SPF records which were set up once but not kept up to date, etc.).

Thus, SPF is IMHO not very usable for scoring on its own, but it does have a useful purpose for creating custom SA rules and is often very usable for short-circuiting with whitelist_auth.

> Surely no one who cares about maintaining their reputation by protecting
> themselves against spoofing would fail to provide SPF records...

For example, I do not provide it on my few other e-mail accounts by choice (especially most of them, which deal with many mailing lists or with users who use non-SRS e-mail forwarding), as the mere existence of SPF there causes much more damage than the potential help it brings.
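To make the "?all" / "~all" / "-all" distinction and the 10-lookup limit concrete, here is a rough Python sketch that inspects a single SPF record string. It is an assumption-laden illustration, not a real SPF evaluator: `summarize_spf` is a hypothetical name, it performs no DNS queries, and it only counts lookup-causing terms that appear directly in the given record (terms pulled in through `include` or `redirect` also count toward the limit but are not followed here).

```python
import re

# Qualifier prefixes and the results they produce on a match.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
# Mechanisms that cost a DNS lookup; the "redirect" modifier does too.
LOOKUP_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}


def summarize_spf(record: str) -> dict:
    """Report what the 'all' mechanism yields and how many DNS-lookup
    terms the record itself contains."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    all_result = None
    lookups = 0
    for term in terms[1:]:
        lowered = term.lower()
        if lowered.startswith("redirect="):
            lookups += 1
            continue
        qualifier = "+"  # default when no qualifier is written
        if lowered[0] in QUALIFIERS:
            qualifier, lowered = lowered[0], lowered[1:]
        # The mechanism name ends at the first ':' or '/'.
        name = re.split(r"[:/]", lowered, maxsplit=1)[0]
        if name == "all":
            all_result = QUALIFIERS[qualifier]
        elif name in LOOKUP_MECHANISMS:
            lookups += 1
    return {"all": all_result, "direct_lookups": lookups}


example = summarize_spf("v=spf1 include:_spf.example.com mx ip4:192.0.2.0/24 ~all")
```

For the example record this yields a `softfail` result for unmatched senders and two direct lookup terms (`include` and `mx`); `ip4` needs no DNS query.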
> So how is this score arrived at?

Of that, I am not sure. Perhaps by how well it works as an indicator on the ham/spam corpora that are run to determine SA scores in general?

> And of Ham, how much of it has a valid SPF?

For my recent hams, I get this:

    714 SPF_PASS=
    128 SPF_NONE=
     67 SPF_NEUTRAL_ALL=
      9 SPF_FAIL=
      1 SPF_SOFTFAIL=

So, about 1 message in 7 hams does not have SPF.

> And of Spam, how much of it lacks a valid SPF?

For recent spams that reach any kind of mailbox here (e.g. not hitting very-safe RBLs, and not having very high SA scores, i.e. having at least a minimum of potential for being misclassified as non-spam):

   2291 SPF_PASS=
    667 SPF_SOFTFAIL=
    472 SPF_NONE=
    353 SPF_FAIL=
    154 SPF_NEUTRAL_ALL=
    129 SPF_PERMERROR=
     53 SPF_NEUTRAL=
     17 SPF_TEMPERROR=

So, about 1 message in 9 spams does not have SPF.

In summary, there does not seem to be a big difference between SPF adoption among spammers and among legitimate users.

--
Opinions above are GNU-copylefted.
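The "1 in 7" and "1 in 9" figures can be rechecked from the tallies above with a few lines of Python. This sketch assumes each message hit exactly one of the listed SPF rules, so that the sum of the counts equals the number of messages; the function name `share` is illustrative.

```python
# Rule-hit tallies copied from the message above.
ham = {
    "SPF_PASS": 714,
    "SPF_NONE": 128,
    "SPF_NEUTRAL_ALL": 67,
    "SPF_FAIL": 9,
    "SPF_SOFTFAIL": 1,
}
spam = {
    "SPF_PASS": 2291,
    "SPF_SOFTFAIL": 667,
    "SPF_NONE": 472,
    "SPF_FAIL": 353,
    "SPF_NEUTRAL_ALL": 154,
    "SPF_PERMERROR": 129,
    "SPF_NEUTRAL": 53,
    "SPF_TEMPERROR": 17,
}


def share(counts: dict[str, int], rule: str) -> float:
    """Fraction of all tallied messages that hit the given rule."""
    return counts[rule] / sum(counts.values())


ham_none = share(ham, "SPF_NONE")    # 128 of 919 messages
spam_none = share(spam, "SPF_NONE")  # 472 of 4136 messages

print(f"ham  SPF_NONE: {ham_none:.1%} (about 1 in {1 / ham_none:.1f})")
print(f"spam SPF_NONE: {spam_none:.1%} (about 1 in {1 / spam_none:.1f})")
```

That works out to roughly 13.9% of ham and 11.4% of spam, which matches the "1 in 7" and "1 in 9" estimates and the conclusion that SPF_NONE separates the two poorly in this sample.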
Re: SPF_NONE scoring
Philip Prindeville writes:

> I'm looking at the 0.001 scoring for SPF_NONE and scratching my head. This
> was discussed a bit in early 2015, but maybe it needs revisiting with new
> perspective.
>
> Surely no one who cares about maintaining their reputation by
> protecting themselves against spoofing would fail to provide SPF
> records... So how is this score arrived at?
>
> And of Ham, how much of it has a valid SPF?
>
> And of Spam, how much of it lacks a valid SPF?
>
> Has anyone run some numbers?

I see 0.001 as a score that says: this might be a spam sign, we don't know, and this way it shows up in reports without really affecting anything.

Lots of people think SPF is silly. And spammers spamming from a domain they control can even set up DKIM/DMARC.

So I agree that actual data would be interesting.
Re: MIME_BASE64_TEXT only on us-ascii
> On Nov 17, 2021, at 9:50 AM, Bill Cole wrote:
>
> SpamAssassin rules are not laws in any sense. They do not prescribe or
> proscribe any action. They do not reflect any sort of moral or ethical
> judgment. They do not express or define technical correctness.

Isn't that exactly what we're discussing here? "Technical correctness"?

Good internetworking implementations follow (to the extent they don't conflict with good security practices) Postel's Law, "be conservative in what you send, be liberal [but not naive] in what you accept".

The point earlier in the thread was that using more encoding than is strictly necessary is not being "conservative in what you send", since it puts extra burden on the receiver to have a robust and complete implementation, and creates more opportunity to have an interoperability failure.

Rereading:

> Base64 encoding is only necessary if there are non-ASCII characters used.
> UTF-8 is a superset of ASCII & it is normal for MUAs to not encode more than
> needed.

Exactly. Encoding is only used when and where necessary.

Properly encoded HTML uses HTML-Entity naming, which is also ASCII-friendly, i.e. instead of Latin1 etc. or raw 8bit characters.

-Philip
SPF_NONE scoring
Hi,

I'm looking at the 0.001 scoring for SPF_NONE and scratching my head. This was discussed a bit in early 2015, but maybe it needs revisiting with new perspective.

Surely no one who cares about maintaining their reputation by protecting themselves against spoofing would fail to provide SPF records... So how is this score arrived at?

And of Ham, how much of it has a valid SPF?

And of Spam, how much of it lacks a valid SPF?

Has anyone run some numbers?

Thanks,

-Philip