On 8/5/2014 1:08 PM, Andy Balholm wrote:
Over the last few days, I’ve been getting a lot of spam that follows a similar
pattern. They are plain-text messages, and each one ends with a paragraph from
a restaurant review (apparently to confuse Bayesian filters), with some numbers
inserted. There is an 8-digit decimal number and a 32-digit hex one. Each
number appears two or three times. This is a consistent enough pattern that I
wrote a rule to match it:
body REPEATED_TRACKING_NUMBERS / (\d{8}) .* ([0-9a-f]{32}) .*\g1.*\g2/
score REPEATED_TRACKING_NUMBERS 1
describe REPEATED_TRACKING_NUMBERS A large number and a hex hash, each showing up at least twice.
The spaces in the regex are necessary to avoid matching notification emails
from eBay.
The first thing I notice is that you are using .* three times in a body rule.
That's probably going to be a performance issue, since each unbounded .* can
backtrack across long stretches of body text.
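One common way to address the .* concern is to cap each gap with a bounded quantifier. The sketch below is only an illustration in Python, not the rule itself: the Perl backreferences \g1 and \g2 are written as \1 and \2, the 200-character cap is an assumed value, and the sample text is made up. It also ignores how SpamAssassin prepares body text before matching.

```python
import re

# Python translation of the regex from the rule above
# (Perl's \g1 and \g2 become \1 and \2).
ORIGINAL = re.compile(r" (\d{8}) .* ([0-9a-f]{32}) .*\1.*\2")

# Hypothetical bounded variant: every .* is capped at 200 characters.
# The cap of 200 is an assumption, not a value taken from this thread.
BOUNDED = re.compile(
    r" (\d{8}) .{0,200} ([0-9a-f]{32}) .{0,200}\1.{0,200}\2"
)

# Synthetic sample text; the numbers are invented for illustration.
hex_id = "0123456789abcdef0123456789abcdef"
spam = (
    f"The soup was fine. 12345678 and then {hex_id} "
    f"the dessert 12345678 was cold {hex_id} overall."
)
no_repeat = f"Order 12345678 shipped with token {hex_id} today."


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for name, pattern in (("original", ORIGINAL), ("bounded", BOUNDED)):
        print(name, matches(pattern, spam), matches(pattern, no_repeat))
```

Both patterns hit the sample where each number repeats and miss the one where it does not. The bounded form only limits how far the engine can scan between the pieces, so the cap would need tuning against real samples.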
The other thing is that you're going to be matching against legitimate
emails that have unsubscribe links (such as Facebook updates or bank
notifications).