On Fri, 23 Jan 2009, Dennis Hardy wrote:
Why are those scores low? What gives them a negative score?
Those rules have quite high scores...
Here is an example (without my rules): http://pastebin.com/m4400a74d
Can you repost that with full headers?
The ones that get through are relatively short and simple, and many are
very "clean".
No DNSBL hits on the URI domain?
I've been thinking about maybe writing an SA plugin that counts the
three repeated URL patterns that are always present in all of these
spams, but I don't know where to start in trying to do that.
We'd need more than one sample URI to do a good job. Have you been
collecting a corpus?
I notice that this URI has a format that may be a good spam sign: the
domain name, followed by a long string of unpunctuated text gibberish.
Just off the top of my head and untested, how does this do against your
corpus?
uri GIBBERISH m;://[^/]{4,50}/(?=[a-z]{25,80}$)[a-z]{0,80}q[^u][a-z]{0,80}$;i
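
If it helps to try the pattern outside of SA first, here is a rough, untested-against-real-spam sketch in Python that runs the same regex over a list of URIs and reports the hit rate. The helper names (is_gibberish_uri, hit_rate) and the sample URIs are made up for illustration; they are not from the pastebin sample. The idea behind the pattern: the lookahead requires the path to be 25 to 80 letters and nothing else, and the "q[^u]" part looks for a "q" that is not followed by "u", which is rare in real English text.

```python
import re

# Same pattern as the GIBBERISH rule, without the Perl delimiters.
# The rule's /i modifier becomes re.IGNORECASE here.
GIBBERISH = re.compile(
    r"://[^/]{4,50}/(?=[a-z]{25,80}$)[a-z]{0,80}q[^u][a-z]{0,80}$",
    re.IGNORECASE,
)


def is_gibberish_uri(uri):
    """Return True if the URI path is 25-80 letters and has a 'q' not followed by 'u'."""
    return GIBBERISH.search(uri) is not None


def hit_rate(uris):
    """Fraction of URIs in the iterable that the pattern hits (0.0 if empty)."""
    uris = list(uris)
    if not uris:
        return 0.0
    return sum(is_gibberish_uri(u) for u in uris) / len(uris)


# Hypothetical URIs, for illustration only.
SAMPLES = [
    "http://example.com/abcdefghijklmnopqrstuvwxyzabc",      # long letters, has "qr"
    "http://example.com/quickquestionaboutquotasandqueues",  # every "q" is followed by "u"
    "http://example.com/index.html",                         # short, punctuated path
]

if __name__ == "__main__":
    for uri in SAMPLES:
        print(is_gibberish_uri(uri), uri)
    print("hit rate:", hit_rate(SAMPLES))
```

Running it over a spam corpus and a ham corpus separately would give a first idea of both the hit rate and the false positive rate before the rule gets a score.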
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79