On 11/25/2011 11:06 AM, Kevin A. McGrail wrote:
On 11/25/2011 12:23 AM, Alex wrote:
Some time ago we created the following rule on this list to identify
mail with fewer than 200 characters in the body:
uri __HAS_HTTP_URI m~^https?://~
rawbody __KB_RAWBODY_200 /^.{0,200}$/s
meta LOC_SHORT (__HAS_HTTP_URI && __KB_RAWBODY_200)
score LOC_SHORT 0.6
describe LOC_SHORT Has URI and short body
I'm finding that it's hitting on mail that is much larger than 200
characters and I don't understand why. Is it only the text/plain
component of the body? Here's an example:
http://pastebin.com/raw.php?i=XNHjxfTz
I see the same issue on trunk and 3.3.2. Playing with code now since
I see this type of crap spam a lot as well.
It was a brilliantly simple idea, but this concept won't work if I am
looking at things correctly. The loop for the pattern test appears to
test line by line, so if any single line is 200 chars or fewer, the
rule hits, no matter how long the whole body is.
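To illustrate the effect, here is a rough Python sketch (not SpamAssassin code). It assumes, as described above, that the pattern is tried against each line separately; the function names and sample bodies are made up for the demonstration, and re.DOTALL stands in for the /s flag.

```python
import re

# Python stand-in for /^.{0,200}$/s; re.DOTALL plays the role of the /s flag.
SHORT = re.compile(r"^.{0,200}$", re.DOTALL)

def hits_per_line(body: str) -> bool:
    """Assumed behaviour: the pattern is tried on each line in turn."""
    return any(SHORT.search(line) for line in body.splitlines(keepends=True))

def hits_whole_body(body: str) -> bool:
    """Intended behaviour: the entire body fits in 200 characters."""
    return SHORT.search(body) is not None

# A body far longer than 200 characters, made of many short lines.
long_body = "".join(f"line {i}: some ordinary text\n" for i in range(100))
# A genuinely short body with a single URI.
short_body = "Look at this http://example.com/\n"

print(hits_per_line(long_body))     # per-line test fires on the long body
print(hits_whole_body(long_body))   # whole-body test does not
print(hits_whole_body(short_body))  # whole-body test fires on the short body
```

With per-line matching, any message containing at least one short line satisfies the pattern, which would explain the hits on large mail.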
I think you need to look at building on some of the LENGTH rules in
20_html_tests.cf, but it's possible that an eval plugin change is needed.
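As a sketch of the kind of length-based test I have in mind, the Python below adds up the length of every body line before comparing against the limit. It only shows the logic: the function names and the 200-character default are hypothetical, the URI pattern is a loose stand-in for __HAS_HTTP_URI, and none of this is SpamAssassin's eval API.

```python
import re

# Loose stand-in for the __HAS_HTTP_URI sub-rule (assumption, not the real URI parser).
URI = re.compile(r"https?://")

def body_length(lines):
    """Total number of characters across all body lines."""
    return sum(len(line) for line in lines)

def short_body_with_uri(lines, limit=200):
    """True only when the whole body is at most `limit` characters
    and at least one line contains an http(s) URI."""
    return body_length(lines) <= limit and any(URI.search(line) for line in lines)

short_mail = ["Hi,\n", "check this out: http://example.com/\n"]
long_mail = ["short line with http://example.com/\n"] * 50

print(short_body_with_uri(short_mail))
print(short_body_with_uri(long_mail))
```

The difference from the regex rule is that the length is accumulated over the whole body first, so a long message made of short lines no longer qualifies.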
Regards,
KAM