RE: Canadian Spam - tired of writing rules!

2008-04-27 Thread Michael Hutchinson
 -----Original Message-----
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 Sent: 21 April 2008 8:48 a.m.
 To: James Wilkinson
 Cc: users@spamassassin.apache.org
 Subject: Re: Canadian Spam - tired of writing rules!
 
 
 James Wilkinson writes:
  Michael Hutchinson wrote:
   There's been a rise in Canadian Pharmaceutical Spam lately. This spam is
   quite basic, generally only including some text and a link. The link is
   always changing so we can't score against that.
  
   About the only other thing it scores on is the FORGED_HOTMAIL_RCVD rule,
   which doesn't have a big enough score to push the Spam over the 5.0
   points threshold.
  
   Does anyone have some effective rules / rulesets / update channels that
   would help to eliminate this stuff? I've been writing rules against it
   for the past few months. We've just employed our 61st rule against this
   type of Spam. Admittedly a lot of those are just basic phrase matching,
   and aren't complicated rules - but then the Spam changes enough each
   cycle, that it avoids complicated rules that I might write.
 
  I find that a meta rule where the body contains http:// and has no
  paragraphs above 100 to 140 characters¹ will give a few false positives,
  so you can't score it too highly, but it catches a *lot* of spam.
 
  The ham that matches this rule tends to be surprisingly rare, doesn't
  score highly on anything else, and is from regular correspondents (so
  the AWL helps).
 
  If any of the SA developers are reading, I'd love to see how rules like
  this play in the sandbox...
 
  James.
 
  ¹ I'd like to do it on body length, but I can't find a suitable way of
  doing this. body /.{100}/ will match on any e-mail which *has* got a
  paragraph of > 99 characters...
 
 Provide a plugin that does it efficiently, and I'll try it out ;)
 

I think even our internal mail would get caught by that rule, and I can foresee
enough FPs to be a problem straight away. I don't think I'll employ a rule
like this. It must be time to go back to my RegExp training so that, hopefully,
I can come up with some good rules to be rid of the Pharmacy spam.

Cheers,
Mike



Re: Canadian Spam - tired of writing rules!

2008-04-20 Thread James Wilkinson
Michael Hutchinson wrote:
 There's been a rise in Canadian Pharmaceutical Spam lately. This spam is
 quite basic, generally only including some text and a link. The link is
 always changing so we can't score against that.
 
 About the only other thing it scores on is the FORGED_HOTMAIL_RCVD rule,
 which doesn't have a big enough score to push the Spam over the 5.0
 points threshold.
 
 Does anyone have some effective rules / rulesets / update channels that
 would help to eliminate this stuff? I've been writing rules against it
 for the past few months. We've just employed our 61st rule against this
 type of Spam. Admittedly a lot of those are just basic phrase matching,
 and aren't complicated rules - but then the Spam changes enough each
 cycle, that it avoids complicated rules that I might write.

I find that a meta rule where the body contains http:// and has no
paragraphs above 100 to 140 characters¹ will give a few false positives,
so you can't score it too highly, but it catches a *lot* of spam.

The ham that matches this rule tends to be surprisingly rare, doesn't
score highly on anything else, and is from regular correspondents (so
the AWL helps).

If any of the SA developers are reading, I'd love to see how rules like
this play in the sandbox...

James.

¹ I'd like to do it on body length, but I can't find a suitable way of
doing this. body /.{100}/ will match on any e-mail which *has* got a
paragraph of > 99 characters...

-- 
E-mail: james@ | The opinions expressed herein are not necessarily those
aprilcottage.co.uk | of my employer, are not necessarily mine, and in fact are
   | probably not necessary at all...


Re: Canadian Spam - tired of writing rules!

2008-04-20 Thread Justin Mason

James Wilkinson writes:
 Michael Hutchinson wrote:
  There's been a rise in Canadian Pharmaceutical Spam lately. This spam is
  quite basic, generally only including some text and a link. The link is
  always changing so we can't score against that.
  
  About the only other thing it scores on is the FORGED_HOTMAIL_RCVD rule,
  which doesn't have a big enough score to push the Spam over the 5.0
  points threshold.
  
  Does anyone have some effective rules / rulesets / update channels that
  would help to eliminate this stuff? I've been writing rules against it
  for the past few months. We've just employed our 61st rule against this
  type of Spam. Admittedly a lot of those are just basic phrase matching,
  and aren't complicated rules - but then the Spam changes enough each
  cycle, that it avoids complicated rules that I might write.
 
 I find that a meta rule where the body contains http:// and has no
 paragraphs above 100 to 140 characters¹ will give a few false positives,
 so you can't score it too highly, but it catches a *lot* of spam.
 
 The ham that matches this rule tends to be surprisingly rare, doesn't
 score highly on anything else, and is from regular correspondents (so
 the AWL helps).
 
 If any of the SA developers are reading, I'd love to see how rules like
 this play in the sandbox...
 
 James.
 
 ¹ I'd like to do it on body length, but I can't find a suitable way of
 doing this. body /.{100}/ will match on any e-mail which *has* got a
 paragraph of > 99 characters...

Provide a plugin that does it efficiently, and I'll try it out ;)

--j.


Re: Canadian Spam - tired of writing rules!

2008-04-20 Thread Loren Wilton

¹ I'd like to do it on body length, but I can't find a suitable way of
doing this. body /.{100}/ will match on any e-mail which *has* got a
paragraph of > 99 characters...


body    __BODY_100      /.{100}/
meta    BODY_LT_100     !__BODY_100

or maybe

body    BODY_LT_100     !~ /.{100}/

   Loren
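
For reference, a minimal sketch of how the sub-rule negation above might be
combined with James's "contains a link" condition. The rule names and the
score are hypothetical placeholders, not tested values. It also assumes a
uri rule in place of matching http:// in the body text, since uri rules run
against the links extracted from the message. As far as I know, body rules
take the pattern directly after the rule name, so the negation has to live
in the meta rule.

# Hypothetical names and placeholder score; check against your own mail first.
uri       __LOCAL_HAS_LINK     /^https?:\/\//i
body      __LOCAL_PARA_100     /.{100}/
meta      LOCAL_SHORT_LINK     (__LOCAL_HAS_LINK && !__LOCAL_PARA_100)
describe  LOCAL_SHORT_LINK     Has a link, no paragraph of 100+ characters
score     LOCAL_SHORT_LINK     1.0

Running "spamassassin --lint" after adding rules like these should flag any
syntax problems before they go live. Given the false positives James
mentions, a low score that only tips already-suspicious mail over the
threshold seems the safer starting point.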



Canadian Spam - tired of writing rules!

2008-04-13 Thread Michael Hutchinson
Hello everyone,

 

There's been a rise in Canadian Pharmaceutical Spam lately. This spam is
quite basic, generally only including some text and a link. The link is
always changing so we can't score against that.

About the only other thing it scores on is the FORGED_HOTMAIL_RCVD rule,
which doesn't have a big enough score to push the Spam over the 5.0
points threshold.

 

Does anyone have some effective rules / rulesets / update channels that
would help to eliminate this stuff? I've been writing rules against it
for the past few months. We've just employed our 61st rule against this
type of Spam. Admittedly a lot of those are just basic phrase matching,
and aren't complicated rules - but then the Spam changes enough each
cycle, that it avoids complicated rules that I might write.

 

Basically, I'm getting sick of writing rules all the time - I'm thinking
I probably shouldn't need to. Is there any way around this?



I know there is a SARE ruleset against Pharmacy Spam, but I am very
hesitant to employ it because we have several clients that are pharmacy
outlets, and I worry those rules will burn them.

 

Thanks in advance, for any information.

 

Michael Hutchinson

Manux Solutions Ltd

Email: [EMAIL PROTECTED]