RE: giberish

2008-03-03 Thread Michael Hutchinson
 -----Original Message-----
 From: JP Kelly [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, 4 March 2008 6:54 a.m.
 To: spamassassin-users
 Subject: giberish
 
 does anyone know of a rule that might catch this kind of spam which
 contains a lot of non words
 a grammar checking rule or plugin would be nice too since many spams
 contain a lot of nonsense.
 
SNIP!
   Content-Type:   text/plain; charset=iso-8859-1
 
   Content-Transfer-Encoding:  8bit
 
 
 
 Howdy!
 
 Go to get further directions: http://jennakilroytm.blogspot.com
 
 misbrandingmegadyne delightable underbodice undergore
 fica orchidist miamiforrad
 
 commiserates denominablebronteum architectonically capsulogenous
 disfigured
 
 unteemsimulated


I score for blogspot links in emails, and give them 5 points while I'm
at it:
uri CST_BADLY_SPELT2  /blogspot\.com/
score CST_BADLY_SPELT2  5.0
describe CST_BADLY_SPELT2   blogspot Link.. probable SPAM


I don't know how the rest of you feel about blogspot links, but I've
never seen a valid/authentic one in an email that isn't spam before.

I used to run phrase matching with lots of OR statements to try to catch
spam like this, but have since given up rewriting those rules every day
in favour of this one.

Cheers,
Michael Hutchinson.



RE: giberish

2008-03-03 Thread John Hardin

On Tue, 4 Mar 2008, Michael Hutchinson wrote:

I don't know how the rest of you feel about blogspot links, but I've 
never seen a valid/authentic one in an email that isn't spam before.


Be careful if you correspond with someone who has a blog on blogspot.

And I occasionally email blog URLs to my wife.

Poison Pill rules are generally a bad idea.

--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  You are in a maze of twisty little protocols,
  all written by Microsoft.
--
 10 days until Albert Einstein's 129th Birthday


Blogspot (was Re: giberish)

2008-03-03 Thread Kelson

Michael Hutchinson wrote:

I don't know how the rest of you feel about blogspot links, but I've
never seen a valid/authentic one in an email that isn't spam before.


I have.  In the last two weeks, I've seen blogspot links in the Drupal 
newsletter, the OpenOffice.org newsletter, Fedora Weekly News, and a 
newsletter for the Comic Book Legal Defense Fund -- all things I've 
signed up for.


And that's just me -- that's not counting anyone else on the mail server 
I manage.  I set up a rule to match blogspot links, and tracked the 
results.  It hit things like the Slashdot daily summary, and several 
newsletters and mailing lists where I couldn't tell whether the recipient 
signed up or not, on topics ranging from chess to ASP to financial news 
to political opinions.


And then there's people sending personal mail referencing a random blog 
post, or including their blogspot-hosted site in their email signatures.


We do still score blogspot URIs --- but we only add 1 point for it. 
Scoring at 5 would block legit mail.
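
To make the arithmetic behind that concrete, here is a minimal Python sketch (not SpamAssassin code). It assumes the stock threshold of 5.0 points, which is SpamAssassin's default required_score; your site may use a different value. The helper name is made up for illustration. SpamAssassin adds up the scores of every rule that hit and flags the message once the total reaches the threshold, so a single 5-point rule condemns a message all by itself.

```python
# Illustrative sketch only; is_flagged is a hypothetical helper, not part of SpamAssassin.
REQUIRED_SCORE = 5.0  # assumed: SpamAssassin's default required_score


def is_flagged(rule_scores, required=REQUIRED_SCORE):
    """Return True when the summed scores of the rules that hit reach the threshold."""
    return sum(rule_scores.values()) >= required


# A blogspot rule at 5 points flags the message with no other evidence.
print(is_flagged({"CST_BADLY_SPELT2": 5.0}))
# At 1 point the same hit only nudges the total; other rules must agree.
print(is_flagged({"CST_BADLY_SPELT2": 1.0}))
```

At 1 point the blogspot hit still counts as evidence, but a newsletter that trips nothing else gets through.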


--
Kelson Vibber
SpeedGate Communications www.speed.net


Re: giberish

2008-03-03 Thread Michael Scheidell
Just block anything from 'yahoo' that contains blogspot in it.

Even DKIM-signed email.  (Yahoo has sent me back 'ignorantgrams' claiming
that valid, DKIM-signed email from Yahoo wasn't from Yahoo.)

Grammar detection? Shoot, you would drop 90% of the email from the
teenagers.


-- 
Michael Scheidell, CTO
|SECNAP Network Security
Winner 2008 Network Products Guide Hot Companies
FreeBsd SpamAssassin Ports maintainer
Charter member, ICSA labs anti-spam consortium



Re: giberish

2008-03-03 Thread JP Kelly

thanks for the rule, looks like a good one.
can you point me to jennifer's rules?
thanks.
jp


On Mar 3, 2008, at 2:56 PM, Loren Wilton wrote:

body  LW_WORDLIST_15P /(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+){15}/

describe LW_WORDLIST_15P  string of 15+ random words
score  LW_WORDLIST_15P  5

Ignoring the blogspot comments, something along the lines of the  
above rule will catch this sort of stuff.  It looks like there are  
only 13 random words in your case, so you would need to cut the  
number of words down, and the score down.
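
One way to pick the lower word count is to try the pattern on a sample before putting it in a rule. The Python sketch below is only an approximation: it uses Python's re module rather than SpamAssassin's Perl engine, it runs against the gibberish lines as they appear in this thread, and the helper names are invented for illustration. The regex is the one from the rule above, written on one line, with the repeat count made a parameter.

```python
import re

# Words the rule refuses to count, copied from LW_WORDLIST_15P.
COMMON = "from|that|have|this|were|with"


def wordlist_pattern(min_words):
    """Build the LW_WORDLIST-style regex for a run of min_words words."""
    return re.compile(r"(?:\b(?!(?:%s)\b)[a-z]{4,12}\s+){%d}" % (COMMON, min_words))


def has_word_run(text, min_words):
    """Return True if text holds min_words consecutive lowercase words of 4-12 letters."""
    return wordlist_pattern(min_words).search(text) is not None


# The gibberish lines as quoted earlier in this thread.
SAMPLE = """misbrandingmegadyne delightable underbodice undergore
fica orchidist miamiforrad

commiserates denominablebronteum architectonically capsulogenous
disfigured

unteemsimulated
"""

for n in (15, 8, 7):
    print(n, has_word_run(SAMPLE, n))
```

On this copy of the sample, the 15-word and 8-word versions do not match and the 7-word version does. Words longer than 12 letters, such as "misbrandingmegadyne", are not counted and break the run, so the usable threshold is lower than a plain word count suggests. The archive may have altered the original spacing, so treat the exact number as specific to this copy.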


Some of Jennifer's rules would also catch this sort of thing, but I  
don't recall which rules.  She had some that checked for unusual  
letter sequences that can't happen in English.  That doesn't help if  
your main mail is Slovak, but if it is English it might be useful.


  Loren