Re: Loads of recent low-scoring snowshoe spam

2019-09-26 Thread John Hardin

On Thu, 26 Sep 2019, Amir Caspi wrote:


> On Sep 26, 2019, at 10:18 AM, John Hardin wrote:
>>
>> Some of those are following a pattern I've recently noticed - fairly obviously
>> bogus spamvertising domain URLs with some .gov URLs thrown in as well. I'm
>> assuming that's an attempt to leverage naïve domain whitelisting. One has a
>> Humane Society URL, I presume the goal is similar.
>
> Although they may not be in the spamples I provided, I've also seen .edu links.


Yeah, I'm starting to see those too. Added __URI_DOTEDU to see what it's 
worth.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Our politicians should bear in mind the fact that the American
  Revolution was touched off by the then-current government
  attempting to confiscate firearms from the people.
---
 3 days until the 78th anniversary of the massacre at Babi Yar
 Disarmament enables genocide - Registration enables disarmament

Re: Loads of recent low-scoring snowshoe spam

2019-09-26 Thread Amir Caspi
On Sep 26, 2019, at 10:18 AM, John Hardin wrote:
> 
> Some of those are following a pattern I've recently noticed - fairly 
> obviously bogus spamvertising domain URLs with some .gov URLs thrown in as 
> well. I'm assuming that's an attempt to leverage naïve domain whitelisting. 
> One has a Humane Society URL, I presume the goal is similar.

Although they may not be in the spamples I provided, I've also seen .edu links.
And in today's spam I got a .gov.on.ca link. So we might need some variants,
but then again, I suspect these will require a lot of tuning to guard against
FPs.

My new AC_ rules (particularly AC_LARGE_INDENT and AC_POST*EXTRAS) do really 
well locally, but not so much in masscheck ... but they hit otherwise very 
low-scoring spam.  I would request that someone more talented than I am look at 
tuning those against FPs, if they are willing...

Cheers.

--- Amir



Re: Loads of recent low-scoring snowshoe spam

2019-09-26 Thread John Hardin

On Wed, 25 Sep 2019, Amir Caspi wrote:


> Just a few (of many) spamples here:
> https://pastebin.com/wRFBSCEZ
> https://pastebin.com/FUdFEdhT
> https://pastebin.com/LkqSEdAh


Some of those are following a pattern I've recently noticed - fairly 
obviously bogus spamvertising domain URLs with some .gov URLs thrown in as 
well. I'm assuming that's an attempt to leverage naïve domain 
whitelisting. One has a Humane Society URL, I presume the goal is similar.


I added __URI_DOTGOV but the performance isn't that great at the moment. I 
expect the masscheck corpora aren't seeing a lot of these (yet?). It's 
possible some of the DOTGOV combinations would work better in the Real 
World than they currently do in masschecks...
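
For anyone who wants to experiment outside SpamAssassin, here is a rough Python sketch of the pattern described above: a message whose URLs mix .gov/.edu-style hosts with unrelated hosts. This is not how __URI_DOTGOV is defined; the function names, the suffix list, and the URL regex are all assumptions made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Loose URL matcher for illustration only; real mail parsing is messier.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

# Hypothetical suffix list based on the TLDs mentioned in this thread.
TRUSTED_SUFFIXES = (".gov", ".edu", ".gov.on.ca")


def hosts_in(body):
    """Return the set of lowercased hostnames found in URLs in the body."""
    hosts = set()
    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # skip URLs that urlsplit rejects as malformed
        if host:
            hosts.add(host.lower().rstrip("."))
    return hosts


def is_trusted(host):
    """True if the host ends with one of the assumed 'trusted' suffixes."""
    return host.endswith(TRUSTED_SUFFIXES)


def mixes_trusted_and_other(body):
    """True if the body links to at least one trusted-suffix host AND at
    least one host outside that list."""
    hosts = hosts_in(body)
    trusted = {h for h in hosts if is_trusted(h)}
    return bool(trusted) and bool(hosts - trusted)
```

Plenty of legitimate mail links to a .gov or .edu page alongside other sites, so a check like this would FP heavily on its own. It would only make sense as one weak signal combined with others.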



--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Are you a mildly tech-literate politico horrified by the level of
  ignorance demonstrated by lawmakers gearing up to regulate online
  technology they don't even begin to grasp? Cool. Now you have a
  tiny glimpse into a day in the life of a gun owner.   -- Sean Davis
---
 3 days until the 78th anniversary of the massacre at Babi Yar
 Disarmament enables genocide - Registration enables disarmament

Loads of recent low-scoring snowshoe spam

2019-09-25 Thread Amir Caspi
Hi all,

In recent weeks, my server has been getting hit with tons of snowshoe spam.
Much of it is not getting filtered because even when it hits Bayes, it hits
basically no other rules, and therefore scores just below 5 points.
(Much of it hits only BAYES_50 and therefore scores even lower.)

Does anyone have any rules that can help hit these spams?  It seems like none 
of the default rules, nor KAM.cf, nor nonKAMrules.cf, are hitting these.  
Sometimes I'm lucky and Razor/DCC/Pyzor and/or URIBLs have already picked them 
up, throwing their score over threshold... but often I'm at the beginning of 
the queue and none of the hashes/BLs have gotten them yet.

(Reporting to SpamCop, it seems that almost all of this spam from today is 
coming from relays owned by sourcedns / liquidweb, and references URIs hosted 
by losangelesdedicated... although yesterday's spam came from a Romanian relay 
with URIs hosted by versaweb / fiberhub, so obviously there's no long-term 
pattern to the sources.)

Just a few (of many) spamples here:
https://pastebin.com/wRFBSCEZ
https://pastebin.com/FUdFEdhT
https://pastebin.com/LkqSEdAh

I've been testing some custom rules which are doing very well locally but which 
seem to have a high FP rate on masscheck, so would need some tuning before 
being included in the default rules, and I unfortunately haven't had time to do 
this tuning.  (If anyone wants to take a stab at it... the custom rules are 
AC_LOW_OPACITY, AC_POSTHTML_EXTRAS, AC_POSTIMG_EXTRAS, and AC_LARGE_INDENT. 
There is also AC_TINY_FONT but that seems to FP all over the place.)

Thanks in advance for any ideas/help... these have been really annoying my 
users.

Cheers.

--- Amir