On Wed, 2011-11-23 at 15:13 +0100, Simon Loewenthal wrote:
> I have spam that hits on these rules.
> 
> X-Spam-Report:
>     *  1.7 URIBL_BLACK Contains an URL listed in the URIBL blacklist
>     *      [URIs: europjobs.eu]
>     *  1.2 URIBL_JP_SURBL Contains an URL listed in the JP SURBL blocklist
>     *      [URIs: europjobs.eu]
>     *  0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay lines
>     *  0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60%
>     *      [score: 0.5000]
>     *  1.1 DCC_CHECK Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
>     *  1.4 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/)
>     *  0.3 DIGEST_MULTIPLE Message hits more than one network digest check
>     *  0.8 RDNS_NONE Delivered to internal network by a host with no rDNS
> 
> What I fail to understand is why it did not hit on this local.cf rule:
> 
> describe RBODY_JOB_DOMAINS1 English language job opportunity1
> rawbody RBODY_JOB_DOMAINS1 /\@(?:axeabout|career-lists|careers-consult|eur-exlusive|europe-career|europ-exlusive|it-jobsearch\.com|uk-exlusive|tech-newposition|new-joboffers|joblists|web-newcarer|world-jobsearch|gb-totaljob|simple-jobneed|sprytex-it|europjobs.eu|businesinsiders.com)\./
> score    RBODY_JOB_DOMAINS1 4.5
> 
> ( I tried the same by replacing |europjobs.eu| with |europjobs\.eu| in
> case it helped, but made no difference)
>
What Axb said. I'd just add that your rule description is misleading:
the rule is a list of partial domain names rather than any specifically
English words or phrases. You'll get fewer FPs and, probably, a better
hit rate if you use a meta to combine generic job offer phrases with
something else, along the lines of:

describe JOB_OFFERS Phrases typical of English language job offers
body     JOB_OFFERS /(my client|(contract|permanent) jobs)/i
score    JOB_OFFERS 0.01

describe UNWANTED_JOB_OFFERS Jobs at blacklisted sites
meta     UNWANTED_JOB_OFFERS JOB_OFFERS && (URIBL_BLACK || URIBL_JP_SURBL)
score    UNWANTED_JOB_OFFERS 4.5

because your rule is in effect a private blacklist that duplicates what
the URIBLs are already doing. Of course my JOB_OFFERS rule is merely an
example. In Real Life (tm) it would be a set of rather more elaborate
rules that you've built to recognise your particular jobspam stream.
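
Before loading patterns like the ones above, it can help to try them
against a few sample strings. The following is only a rough sketch: it
uses Python's re module as a stand-in for the Perl regexes SpamAssassin
runs, keeps just two alternatives from the quoted RBODY_JOB_DOMAINS1
pattern, and the sample strings and the rule_hits helper are made up
for illustration.

```python
import re

# Shortened copy of the quoted RBODY_JOB_DOMAINS1 pattern (assumption:
# only two of the original alternatives are kept for brevity).
# Note the literal "@" it requires before, and the literal "." after,
# the domain fragment.
DOMAIN_RULE = re.compile(r"\@(?:it-jobsearch\.com|europjobs.eu)\.")

# JOB_OFFERS pattern from the example rule, with balanced parentheses.
JOB_OFFERS = re.compile(r"(my client|(contract|permanent) jobs)", re.IGNORECASE)


def rule_hits(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


# Hypothetical message fragments, not taken from the actual spam.
samples = [
    "Apply at http://europjobs.eu/apply",
    "Contact hr@europjobs.eu.example.com",
    "Contact hr@europjobs.eu today",
    "My client has permanent jobs in Europe",
]

for text in samples:
    print(rule_hits(DOMAIN_RULE, text), rule_hits(JOB_OFFERS, text), text)
```

With these made-up samples, the domain pattern matches only the second
string, since it needs both an "@" directly before the domain fragment
and a "." directly after it; a bare URL or an address that ends at
"europjobs.eu" does not qualify. The job phrase pattern matches only
the last string.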


Martin

