Re: Spam from addresses where full name mirrors left-hand side of address

2018-04-03 Thread John Hardin

On Tue, 3 Apr 2018, RW wrote:

> On Mon, 2 Apr 2018 11:33:27 -0700 (PDT)
> John Hardin wrote:
>
> > On Mon, 2 Apr 2018, Amir Caspi wrote:
> >
> > > many organizations -- especially government or other
> > > large orgs -- also use firstname.middleinitial.lastname as their
> > > user part.
> >
> > So require a minimum length for the middle part:
> >
> >    header THREE_WORD_MONTY  From =~ /(\w+) (\w{2,}) (\w+) <\1.\2.\3/
> >
> > > A meta rule using multi-dots could work, by either looking for
> > > specific keywords or matching with other spammy indicators... but
> > > by itself there's no real way to distinguish these AFAICT.  I think
> > > a meta rule is the only safe way to go, but personally I would
> > > _NOT_ use a rule like the one suggested where the quoted part
> > > equals the user part, since every firstname.lastname address will
> > > get caught that way.
> >
> > Your comment is valid, but the suggested rule requires three parts,
> > so won't hit on firstname.lastname-style mailbox naming.
> >
> > However, since it's looking for periods, it won't hit the dash- and
> > underscore-delimited versions.
>
> It looks for . not \.


Ah, yes, my mistake.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  The world has enough Mouse Clicking System Engineers.
   -- Dave Pooser
---
 10 days until Thomas Jefferson's 275th Birthday


Re: Spam from addresses where full name mirrors left-hand side of address

2018-04-03 Thread RW
On Mon, 2 Apr 2018 11:33:27 -0700 (PDT)
John Hardin wrote:

> On Mon, 2 Apr 2018, Amir Caspi wrote:
> 
> > many organizations -- especially government or other 
> > large orgs -- also use firstname.middleinitial.lastname as their
> > user part.  
> 
> So require a minimum length for the middle part:
> 
>    header THREE_WORD_MONTY  From =~ /(\w+) (\w{2,}) (\w+) <\1.\2.\3/
> 
> > A meta rule using multi-dots could work, by either looking for
> > specific keywords or matching with other spammy indicators... but
> > by itself there's no real way to distinguish these AFAICT.  I think
> > a meta rule is the only safe way to go, but personally I would
> > _NOT_ use a rule like the one suggested where the quoted part
> > equals the user part, since every firstname.lastname address will
> > get caught that way.  
> 
> Your comment is valid, but the suggested rule requires three parts,
> so won't hit on firstname.lastname-style mailbox naming.
> 
> However, since it's looking for periods, it won't hit the dash- and 
> underscore-delimited versions.

It looks for . not \.
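
To illustrate: an unescaped "." in a regex matches any single character, so the rule as quoted is not limited to period-delimited addresses. The following is only a rough sketch in Python, whose `re` module is close to, but not identical to, the Perl regex engine SpamAssassin uses. The `ESCAPED` pattern, the sample From: values, and the example.com domain are made up for the comparison and do not come from the thread.

```python
import re

# Python approximation of the pattern in the quoted THREE_WORD_MONTY rule.
# The unescaped "." matches any single character, not only a literal period.
UNESCAPED = re.compile(r'(\w+) (\w{2,}) (\w+) <\1.\2.\3')

# Hypothetical variant with the dots escaped, shown only for contrast.
ESCAPED = re.compile(r'(\w+) (\w{2,}) (\w+) <\1\.\2\.\3')

# Made-up From: values; example.com is a placeholder domain.
samples = [
    'home warranty special <home.warranty.special@example.com>',
    'home warranty special <home-warranty-special@example.com>',
    'home warranty special <home_warranty_special@example.com>',
]

for value in samples:
    # Print whether each pattern matches, followed by the sample itself.
    print(bool(UNESCAPED.search(value)), bool(ESCAPED.search(value)), value)
```

Run as written, the unescaped pattern matches all three samples, while the escaped variant matches only the period-delimited one.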


Re: Spam from addresses where full name mirrors left-hand side of address

2018-04-02 Thread John Hardin

On Mon, 2 Apr 2018, Amir Caspi wrote:

> many organizations -- especially government or other
> large orgs -- also use firstname.middleinitial.lastname as their user
> part.

So require a minimum length for the middle part:

  header THREE_WORD_MONTY  From =~ /(\w+) (\w{2,}) (\w+) <\1.\2.\3/

> A meta rule using multi-dots could work, by either looking for specific
> keywords or matching with other spammy indicators... but by itself
> there's no real way to distinguish these AFAICT.  I think a meta rule is
> the only safe way to go, but personally I would _NOT_ use a rule like
> the one suggested where the quoted part equals the user part, since
> every firstname.lastname address will get caught that way.


Your comment is valid, but the suggested rule requires three parts, so 
won't hit on firstname.lastname-style mailbox naming.
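
A quick way to check both points (three parts are required, and a middle part of at least two characters skips the middle-initial style) is to try the patterns outside SpamAssassin. This is a minimal sketch using Python's `re` as a stand-in for the Perl regex engine. The sample From: values and the example.com domain are invented, and they are all lowercase so that case plays no part in the comparison.

```python
import re

# Python stand-ins for the two patterns discussed in this thread:
# the originally suggested rule and the one with a minimum middle length.
ANY_MIDDLE = re.compile(r'(\w+) (\w+) (\w+) <\1.\2.\3')
LONG_MIDDLE = re.compile(r'(\w+) (\w{2,}) (\w+) <\1.\2.\3')

# Made-up From: values; example.com is a placeholder domain.
two_part = 'jane doe <jane.doe@example.com>'
with_initial = 'jane q doe <jane.q.doe@example.com>'
three_word = 'eliminate fat fast <eliminate.fat.fast@example.com>'

for value in (two_part, with_initial, three_word):
    # Print whether each pattern matches, followed by the sample itself.
    print(bool(ANY_MIDDLE.search(value)), bool(LONG_MIDDLE.search(value)), value)
```

With these samples, the two-part address matches neither pattern, the middle-initial address matches only the pattern without the length minimum, and the three-word address matches both.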


However, since it's looking for periods, it won't hit the dash- and 
underscore-delimited versions.


Perhaps:

  header THREE_WORD_MONTY  From =~ /(\w+) (\w{2,}) (\w+)\s+<\1[-._]\2[-._]\3\@/

And maybe a little more flexible to hit the *last three* parts of a 4+ 
part address:


  header THREE_WORD_MONTY  From =~ /(\w+) (\w{2,}) (\w+)\s+<[^@]*\1[-._]\2[-._]\3\@/

Potentially lots of backtracking there, though. Fortunately the string is 
not apt to be very long.
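
For anyone who wants to experiment before deploying, the two variants above can be compared with a rough Python sketch. Python's `re` is only an approximation of the Perl engine SpamAssassin uses, and the names `ANCHORED`, `TRAILING`, and `hits`, the sample From: values, and the example.com domain are all made up for illustration.

```python
import re

# Python stand-ins for the two patterns above.
# ANCHORED: the three words must start right after "<" and end at "@".
ANCHORED = re.compile(r'(\w+) (\w{2,}) (\w+)\s+<\1[-._]\2[-._]\3\@')
# TRAILING: anything without "@" may precede the three words in the address.
TRAILING = re.compile(r'(\w+) (\w{2,}) (\w+)\s+<[^@]*\1[-._]\2[-._]\3\@')


def hits(pattern, value):
    """Return True if the pattern matches anywhere in the From: value."""
    return pattern.search(value) is not None


# Made-up From: values; example.com is a placeholder domain.
dashed = 'home warranty special <home-warranty-special@example.com>'
four_part = 'best home warranty special <best.home.warranty.special@example.com>'
initial = 'jane q doe <jane_q_doe@example.com>'

for value in (dashed, four_part, initial):
    print(hits(ANCHORED, value), hits(TRAILING, value), value)
```

With these samples, the dash-delimited address matches both patterns, the four-part address matches only the more flexible pattern, and the middle-initial address matches neither.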


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  When fascism comes to America, it will be wrapped in
  "Diversity" and demanding "Safe Spaces." -- Mona Charen
---
 368 days since the first commercial re-flight of an orbital booster (SpaceX)


Re: Spam from addresses where full name mirrors left-hand side of address

2018-04-02 Thread Amir Caspi
On Apr 1, 2018, at 11:33 PM, Rich Wales wrote:
> 
> I do realize some perfectly legitimate "From:" lines conform to this same 
> pattern, and the only way to really tell the difference may be via AI or a 
> real human brain.

Not just "some" legitimate mail... a LOT of legitimate mail, basically anything 
that conforms to "FirstName LastName" >.  One might think checking for multiple 
dots would help (as I suggested last week), but many organizations -- 
especially government or other large orgs -- also use 
firstname.middleinitial.lastname as their user part.

A meta rule using multi-dots could work, by either looking for specific 
keywords or matching with other spammy indicators... but by itself there's no 
real way to distinguish these AFAICT.  I think a meta rule is the only safe way 
to go, but personally I would _NOT_ use a rule like the one suggested where the 
quoted part equals the user part, since every firstname.lastname address will 
get caught that way.

Cheers.

--- Amir



Re: Spam from addresses where full name mirrors left-hand side of address

2018-04-02 Thread Bill Cole

On 2 Apr 2018, at 1:33 (-0400), Rich Wales wrote:

> [I tried asking this question a couple of days ago, but I've seen no
> signs that it made it out to the list -- possibly because the sample
> e-mail addresses I included in my question might have caused it to be
> flagged as spam.  So here goes again, this time with the addresses
> mangled a bit.]
>
> I see a lot of spam with "From:" lines where the left-hand side of the
> address is essentially the same (modulo punctuation) as the "full name"
> portion of the address.  The right-hand side, on the other hand, is a
> random gibberish domain.
>
> A few examples currently sitting in my local server's spam quarantine
> (with the addresses edited so they hopefully won't trigger any spam checks):
>
>     Adding To Human Lifespan
>     "Eliminate Fat Fast" jeanettejtaylor (dot) com>
>     "Home Warranty Special" racerville (dot) com>
>     Smartphone Screen Protector dtqmp (dot) com>
>
> Two questions:
>
> Is it *technically possible* to create a Spamassassin rule which would
> match this sort of "From:" line?

This (UNTESTED) should do it:

header THREE_WORD_MONTY  From =~ /(\w+) (\w+) (\w+) <\1.\2.\3/

> And assuming it can be done, is it *worthwhile* to do it?

Not a clue. Maybe worth a try?

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole


Spam from addresses where full name mirrors left-hand side of address

2018-04-01 Thread Rich Wales
[I tried asking this question a couple of days ago, but I've seen no
signs that it made it out to the list -- possibly because the sample
e-mail addresses I included in my question might have caused it to be
flagged as spam.  So here goes again, this time with the addresses
mangled a bit.]

I see a lot of spam with "From:" lines where the left-hand side of the
address is essentially the same (modulo punctuation) as the "full name"
portion of the address.  The right-hand side, on the other hand, is a
random gibberish domain.

A few examples currently sitting in my local server's spam quarantine
(with the addresses edited so they hopefully won't trigger any spam checks):

    Adding To Human Lifespan 
    "Eliminate Fat Fast" 
    "Home Warranty Special" 
    Smartphone Screen Protector 

Two questions:

Is it *technically possible* to create a SpamAssassin rule which would
match this sort of "From:" line?

And assuming it can be done, is it *worthwhile* to do it?  I do realize
some perfectly legitimate "From:" lines conform to this same pattern,
and the only way to really tell the difference may be via AI or a real
human brain.
-- 
*Rich Wales*
ri...@richw.org