Re: MIME_BASE64_TEXT only on us-ascii

2021-11-30 Thread John Hardin

On Tue, 30 Nov 2021, Philip Prindeville wrote:

> > On Nov 17, 2021, at 9:50 AM, Bill Cole wrote:
> >
> > SpamAssassin rules are not laws in any sense. They do not prescribe or 
> > proscribe any action. They do not reflect any sort of moral or ethical 
> > judgment. They do not express or define technical correctness.
>
> Isn't that exactly what we're discussing here?  "Technical correctness"?

The way I generally put it is: SpamAssassin is not an RFC-compliance audit 
tool.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  The police of a state should never be stronger or better armed
  than the citizenry. An armed citizenry, willing to fight, is the
  foundation of civil freedom.            -- Robert A. Heinlein, 1942
---
 549 days since the first private commercial manned orbital mission (SpaceX)


Re: SPF_NONE scoring

2021-11-30 Thread Bill Cole

On 2021-11-30 at 13:47:36 UTC-0500 (Tue, 30 Nov 2021 11:47:36 -0700)
Philip Prindeville 
is rumored to have said:


> Hi,
>
> I'm looking at the 0.001 scoring for SPF_NONE and scratching my head.  
> This was discussed a bit in early 2015, but maybe it needs revisiting 
> with new perspective.
>
> Surely no one who cares about maintaining their reputation by 
> protecting themselves against spoofing would fail to provide SPF 
> records...

Surely no one who cares about the security of their email would run 
their own on-premises Exchange...

Having started my sysadmin career less than 30 years ago, I never have 
been exposed to an Internet where the dominant visible feature of my 
fellow admins has been operational competence. We're all a bunch of 
bozos making stupid mistakes...

> So how is this score arrived at?

In theory, it is set in concert with all of the other default rules by 
periodic analyses of the scoring of spam and ham corpora submitted by 
members of the SA community. As a 'network' rule, it is only included in 
analysis weekly.

In practice, it is nailed down at a tiny non-zero value because 
otherwise it would not be "good enough" to publish and demand has been 
expressed for its publication.

> And of Ham, how much of it has a valid SPF?

Recently: 90.1202%

> And of Spam, how much of it lacks a valid SPF?

Recently: 65.3614%

> Has anyone run some numbers?

Yes. See https://ruleqa.spamassassin.org/. The numbers above are drawn 
from the last "network masscheck" accessible there.



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: SPF_NONE scoring

2021-11-30 Thread Loren Wilton

> So how is this score arrived at?


I believe that scores of 0.001 are generally manually set, and not intended 
to be anything other than a visible marker that the rule hit. That is 
probably the case here.
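
To make the "visible marker" idea concrete, here is a minimal Python sketch
(not SpamAssassin code) of how a 0.001 score appears among the hits without
changing the verdict. The 5.0 threshold is the stock required_score default;
EXAMPLE_RULE_A and EXAMPLE_RULE_B and their scores are made up for
illustration.

```python
# Illustration only: this is not how SpamAssassin is implemented.
REQUIRED_SCORE = 5.0  # stock default for required_score

def verdict(hits):
    """Return (total, is_spam) for a dict of rule name -> score."""
    total = sum(hits.values())
    return total, total >= REQUIRED_SCORE

# Hypothetical rule hits for one message (names and scores are invented).
hits = {"EXAMPLE_RULE_A": 2.5, "EXAMPLE_RULE_B": 1.2}

# The same message, now also hitting SPF_NONE at its 0.001 marker score.
with_marker = dict(hits, SPF_NONE=0.001)

total_without, spam_without = verdict(hits)
total_with, spam_with = verdict(with_marker)

print(sorted(with_marker))        # the rule name is visible in the hit list
print(spam_without, spam_with)    # the verdict is the same either way
```

The rule name shows up in the report, so it can be grepped and counted, but
the total moves by only a thousandth of a point.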


   Loren



Re: MIME_BASE64_TEXT only on us-ascii

2021-11-30 Thread Matija Nalis
On Tue, Nov 30, 2021 at 12:03:15PM -0700, Philip Prindeville wrote:
> > On Nov 17, 2021, at 9:50 AM, Bill Cole 
> >  wrote:
> > SpamAssassin rules are not laws in any sense. They do not prescribe or 
> > proscribe any action. They do not reflect any sort of moral or ethical 
> > judgment. They do not express or define technical correctness.
> 
> Isn't that exactly what we're discussing here?  "Technical correctness"?

Hm, no? An app encoding pure ASCII as Base64 is not breaking any RFC.
So it is behaving "technically correctly".

> Good internetworking implementations follow (to the extent they don't 
> conflict with good security practices) Postel's Law, "be conservative in what 
> you send, be liberal [but not naive] in what you accept".

Well, antispam efforts (like security for anything important) are
mostly exactly the OPPOSITE of the good internetworking implementations
that the old Postel's law describes.

And for good reasons - in the internetworking implementations of
old, the vast majority of peers (if not all) you interacted with
were GOOD guys trying to do good things.

In today's e-mail (and security), the majority of the actors are
enemies trying to penetrate your defensive lines.

Also, see https://en.wikipedia.org/wiki/Robustness_principle#Criticism


> Rereading:
> > Base64 encoding is only necessary if there are non-ASCII characters used. 
> > UTF-8 is a superset of ASCII & it is normal for MUAs to not encode more 
> > than needed.
> 
> Exactly.  Encoding is only used when and where necessary.

...by legitimate users. Spammers on the other hand will sometimes 
encode even when it is NOT needed, probably in an attempt to evade less
advanced antispam tools (or out of sheer laziness when writing the spam
tool). 

The fact that such encoding is technically allowed does NOT change the
fact that the technique is used vastly more by spammers than by
innocent parties.
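
As a rough illustration of the pattern under discussion, the Python sketch
below flags a text part that is base64-encoded even though its decoded body
is pure ASCII. It uses only the standard library. It is NOT the actual logic
of the MIME_BASE64_TEXT rule, whose exact conditions are not stated in this
thread; the sample message and the function name needlessly_base64 are
invented for the example.

```python
import base64
from email import message_from_string

# Invented sample message: an ASCII-only body sent with base64 encoding.
RAW = (
    "From: sender@example.com\n"
    "To: rcpt@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + base64.b64encode(b"Hello, this is plain ASCII text.\n").decode("ascii")
    + "\n"
)

def needlessly_base64(msg):
    """True if any text/* part is base64-encoded yet decodes to pure ASCII."""
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        cte = str(part.get("Content-Transfer-Encoding") or "").strip().lower()
        if cte != "base64":
            continue
        payload = part.get_payload(decode=True) or b""
        # bytes.isascii() needs Python 3.7+; skip empty bodies.
        if payload and payload.isascii():
            return True
    return False

msg = message_from_string(RAW)
flagged = needlessly_base64(msg)
print(flagged)
```

A check like this says nothing about RFC compliance; it only picks out a
habit that can then be counted in ham and spam corpora.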

> Properly encoded HTML uses HTML-Entity naming, which is also ASCII-friendly, 
> i.e.  instead of Latin1  etc. or raw 8bit characters.

There are several "proper" (i.e. allowed by different RFCs) ways to
encode that information in mail. Statistical analyses seem to say that
some of the ways are used much more by spammers than by legitimate
users. Hence, the score for those methods.

-- 
Opinions above are GNU-copylefted.


Re: SPF_NONE scoring

2021-11-30 Thread Matija Nalis
On Tue, Nov 30, 2021 at 11:47:36AM -0700, Philip Prindeville wrote:
> I'm looking at the 0.001 scoring for SPF_NONE and scratching my head.  This 
> was discussed a bit in early 2015, but maybe it needs revisiting with new 
> perspective.

SPF is a double-edged sword. Sure, it is great for authenticating
envelope senders when it works, but:

- when used in combination with mailing lists, plain message
  forwarding etc. it will break with a false positive, marking
  (for example) this perfectly valid message of mine as a fake.
  See https://en.wikipedia.org/wiki/Sender_Policy_Framework#FAIL_and_forwarding

  This is the reason why you can only really use it for "SPF OK"
  validation - "SPF FAIL" does not really tell you anything, as it
  will happen as often for forged senders, as for valid senders.

  This is why SPF records will often end in "?all" or "~all" and not
  "-all" (and/or come with soft DMARC policies).

- Also, the envelope sender (on which SPF operates) is a completely
  different thing from the header "From:", which is what the vast
  majority of users will see, so it does not provide the protection
  one might expect.
  See https://en.wikipedia.org/wiki/Sender_Policy_Framework#Header_limitations

  And this makes "SPF OK" much less useful than it sounds in theory.

- Then there are misconfigurations (hitting the limit of max 10 DNS
  lookups, SPF records which were set up once but not kept up-to-date,
  etc).

Thus, SPF is IMHO not very usable for scoring on its own, but it does
have a useful purpose for creating custom SA rules and is often very
usable for short circuiting with whitelist_auth.
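
For the 10-lookup misconfiguration mentioned in the list above, a minimal
Python sketch of counting the DNS-querying terms in a single SPF record
might look like this. The record and the function name count_lookup_terms
are made up. The sketch looks only at the top-level record and does no DNS
itself, so lookups hidden inside "include:" or "redirect=" targets are not
counted; a real evaluator has to follow those too.

```python
# Mechanisms that cost a DNS lookup under RFC 7208; the "redirect"
# modifier counts as well. "all", "ip4", "ip6" and "exp" do not.
LOOKUP_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}

def count_lookup_terms(record):
    """Count DNS-querying terms in one SPF record (top level only)."""
    count = 0
    for term in record.split()[1:]:          # skip the "v=spf1" version tag
        term = term.lstrip("+-~?").lower()   # drop the qualifier, if any
        if term.startswith("redirect="):
            count += 1
            continue
        # Strip any ":domain" argument and "/cidr" suffix to get the name.
        name = term.split(":", 1)[0].split("/", 1)[0]
        if name in LOOKUP_MECHANISMS:
            count += 1
    return count

# Hypothetical record, for illustration only.
record = "v=spf1 mx a:mail.example.com include:_spf.example.net ip4:192.0.2.0/24 ~all"
n = count_lookup_terms(record)
print(n)
```

Records that chain several third-party includes can exceed the limit without
the owner noticing, which then surfaces as a permanent error on the
receiving side.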

> Surely no one who cares about maintaining their reputation by protecting 
> themselves against spoofing would fail to provide SPF records...  

For example, I do not provide it on my few other e-mail accounts by
choice (especially those which deal with many mailing lists, or with
users who use non-SRS e-mail forwarding), as the mere existence
of SPF there causes much more damage than the potential help it
brings.

> So how is this score arrived at?

Of that, I am not sure. Perhaps by how well it works as an indicator on
the ham/spam corpora used to determine scores in general in SA? 

> And of Ham, how much of it has a valid SPF?

For my recent hams, I get this:

714 SPF_PASS=
128 SPF_NONE=
 67 SPF_NEUTRAL_ALL=
  9 SPF_FAIL=
  1 SPF_SOFTFAIL=

So, about 1 message in 7 hams does not have SPF.

> And of Spam, how much of it lacks a valid SPF?

For recent spams that reach any kind of mailbox here (e.g. not
hitting very-safe RBLs, and not having very high SA scores - i.e. 
having at least a minimum of potential for being misclassified as
non-spam):

2291 SPF_PASS=
 667 SPF_SOFTFAIL=
 472 SPF_NONE=
 353 SPF_FAIL=
 154 SPF_NEUTRAL_ALL=
 129 SPF_PERMERROR=
  53 SPF_NEUTRAL=
  17 SPF_TEMPERROR=

So, about 1 message in 9 spams does not have SPF.

In summary, there does not seem to be a big difference in SPF
adoption between spammers and legitimate users.
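
The "1 in 7" and "1 in 9" figures can be re-derived from the tallies above
with a few lines of Python. This sketch assumes that each message hit
exactly one of the listed SPF_* rules, so that the counts add up to the
number of messages; the thread does not state that explicitly.

```python
# Tallies copied from the ham and spam lists above.
ham = {
    "SPF_PASS": 714, "SPF_NONE": 128, "SPF_NEUTRAL_ALL": 67,
    "SPF_FAIL": 9, "SPF_SOFTFAIL": 1,
}
spam = {
    "SPF_PASS": 2291, "SPF_SOFTFAIL": 667, "SPF_NONE": 472,
    "SPF_FAIL": 353, "SPF_NEUTRAL_ALL": 154, "SPF_PERMERROR": 129,
    "SPF_NEUTRAL": 53, "SPF_TEMPERROR": 17,
}

def share(tally, rule):
    """Fraction of messages in the tally that hit the given rule."""
    return tally[rule] / sum(tally.values())

ham_none = share(ham, "SPF_NONE")
spam_none = share(spam, "SPF_NONE")

print(f"ham  SPF_NONE: {ham_none:.1%}, about 1 in {1 / ham_none:.1f}")
print(f"spam SPF_NONE: {spam_none:.1%}, about 1 in {1 / spam_none:.1f}")
```

Under that assumption the shares come out at roughly 14% of ham and 11% of
spam, which is consistent with the summary above.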

-- 
Opinions above are GNU-copylefted.


Re: SPF_NONE scoring

2021-11-30 Thread Greg Troxel

Philip Prindeville  writes:

> I'm looking at the 0.001 scoring for SPF_NONE and scratching my head.  This 
> was discussed a bit in early 2015, but maybe it needs revisiting with new 
> perspective.
>
> Surely no one who cares about maintaining their reputation by
> protecting themselves against spoofing would fail to provide SPF
> records...  So how is this score arrived at?
>
> And of Ham, how much of it has a valid SPF?
>
> And of Spam, how much of it lacks a valid SPF?
>
> Has anyone run some numbers?

I see 0.001 as a score that says: this might be a spam sign, we don't
know, and this way it shows up in reports, without really affecting
anything.

Lots of people think SPF is silly.  And spammers spamming from a domain
they control can even dkim/dmarc.   So I agree that actual data would be
interesting.




Re: MIME_BASE64_TEXT only on us-ascii

2021-11-30 Thread Philip Prindeville


> On Nov 17, 2021, at 9:50 AM, Bill Cole 
>  wrote:
> 
> SpamAssassin rules are not laws in any sense. They do not prescribe or 
> proscribe any action. They do not reflect any sort of moral or ethical 
> judgment. They do not express or define technical correctness.


Isn't that exactly what we're discussing here?  "Technical correctness"?

Good internetworking implementations follow (to the extent they don't conflict 
with good security practices) Postel's Law, "be conservative in what you send, 
be liberal [but not naive] in what you accept".

The point earlier in the thread was that using more encoding than is strictly 
necessary is not being "conservative in what you send", since it puts extra 
burden on the receiver to have a robust and complete implementation, and 
creates more opportunity to have an interoperability failure.

Rereading:


> Base64 encoding is only necessary if there are non-ASCII characters used. 
> UTF-8 is a superset of ASCII & it is normal for MUAs to not encode more than 
> needed.


Exactly.  Encoding is only used when and where necessary.

Properly encoded HTML uses HTML-Entity naming, which is also ASCII-friendly, 
i.e.  instead of Latin1  etc. or raw 8bit characters.

-Philip



SPF_NONE scoring

2021-11-30 Thread Philip Prindeville
Hi,

I'm looking at the 0.001 scoring for SPF_NONE and scratching my head.  This was 
discussed a bit in early 2015, but maybe it needs revisiting with new 
perspective.

Surely no one who cares about maintaining their reputation by protecting 
themselves against spoofing would fail to provide SPF records...  So how is 
this score arrived at?

And of Ham, how much of it has a valid SPF?

And of Spam, how much of it lacks a valid SPF?

Has anyone run some numbers?

Thanks,

-Philip