Couple of useful tests
Hi,

I created these tests, which I find very accurate for detecting spam, and so thought I'd let the list have a view. Lots of numbers or consonants in the reply-to usually bodes ill.

header REPLY_TO_NUMS_CJ Reply-To =~ /[0-9]{6,}/
score REPLY_TO_NUMS_CJ 5.000
header RET_PATH_NUMS_CJ Return-path =~ /[0-9]{6,}/
score RET_PATH_NUMS_CJ 5.000
header REPLY_TO_CONSON_CJ Reply-To =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
score REPLY_TO_CONSON_CJ 5.000
header RET_PATH_CONSON_CJ Return-path =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
score RET_PATH_CONSON_CJ 5.000

If you can improve, please do -- have no mercy.

Craig Jackson
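As a quick way to see what the two patterns catch, here is a minimal Python sketch that runs them over a few header values. The patterns are copied from the rules above; the `hits` helper and the sample addresses are invented for illustration, and Python's `re` module is only standing in for the Perl regex engine that SpamAssassin uses (the two should agree for patterns this simple).

```python
import re

# Patterns copied from the rules in the post.
NUMS = re.compile(r"[0-9]{6,}")
CONSON = re.compile(r"[bcdfghjklmnpqrstvwxyz]{5,}.*@", re.IGNORECASE)


def hits(header_value):
    """Return the names of the patterns that match a header value."""
    found = []
    if NUMS.search(header_value):
        found.append("NUMS")
    if CONSON.search(header_value):
        found.append("CONSON")
    return found


# Invented sample values, not taken from any real mail.
samples = [
    "bob8841273@example.com",  # long run of digits
    "xkqzrtw@example.com",     # long run of consonants
    "alice@example.com",       # neither
]

for value in samples:
    print(value, hits(value))
```

Note that the consonant pattern only requires the run to appear somewhere before an `@`, so a display name in the header can trigger it just as well as the local part of the address.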
Re: Couple of useful tests
Have you run it through the corpus tests?

----- Original Message -----
From: Craig Jackson [EMAIL PROTECTED]
To: users@spamassassin.apache.org
Sent: Wednesday, June 01, 2005 12:50 PM
Subject: Couple of useful tests

| Hi,
| I created these tests which I find very accurate for detecting spam and
| so thought I'd let the list have a view. Lots of numbers or consonants
| in the reply-to usually bodes ill.
|
| header REPLY_TO_NUMS_CJ Reply-To =~ /[0-9]{6,}/
| score REPLY_TO_NUMS_CJ 5.000
| header RET_PATH_NUMS_CJ Return-path =~ /[0-9]{6,}/
| score RET_PATH_NUMS_CJ 5.000
| header REPLY_TO_CONSON_CJ Reply-To =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
| score REPLY_TO_CONSON_CJ 5.000
| header RET_PATH_CONSON_CJ Return-path =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
| score RET_PATH_CONSON_CJ 5.000
|
| If you can improve, please do -- have no mercy.
|
| Craig Jackson
Re: Couple of useful tests
I am checking it now; I will have results in a few minutes.

> Have you run it through the corpus tests?
Re: Couple of useful tests
On 06/01/05 20:50, Craig Jackson wrote:

> Hi,
> I created these tests which I find very accurate for detecting spam and
> so thought I'd let the list have a view. Lots of numbers or consonants
> in the reply-to usually bodes ill.

Good point about the reply-to, thanks!

> header REPLY_TO_NUMS_CJ Reply-To =~ /[0-9]{6,}/
> score REPLY_TO_NUMS_CJ 5.000
> header RET_PATH_NUMS_CJ Return-path =~ /[0-9]{6,}/
> score RET_PATH_NUMS_CJ 5.000
> header REPLY_TO_CONSON_CJ Reply-To =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
> score REPLY_TO_CONSON_CJ 5.000
> header RET_PATH_CONSON_CJ Return-path =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
> score RET_PATH_CONSON_CJ 5.000

I'd suggest removing the y there.

Shouldn't that be Return-Path instead of Return-path?

Speaking of Return-Paths, have you checked your rules against mailing list software (ezmlm?!) envelope sender addresses? IIRC, they slightly resemble what you are trying to match ...

Regards,

wolfgang
RE: Couple of useful tests
Even after corpus tests, I never give a single rule a score over 3 (local threshold is 6). There's no reason a real live person couldn't choose a consonant-only email name, and I know of some universities that give out addresses like [EMAIL PROTECTED] which would trigger your first rule.

Pierre Thomson
BIC

-----Original Message-----
From: Craig Jackson [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 01, 2005 2:50 PM
To: users@spamassassin.apache.org
Subject: Couple of useful tests

Hi,

I created these tests which I find very accurate for detecting spam and so thought I'd let the list have a view. Lots of numbers or consonants in the reply-to usually bodes ill.

header REPLY_TO_NUMS_CJ Reply-To =~ /[0-9]{6,}/
score REPLY_TO_NUMS_CJ 5.000
header RET_PATH_NUMS_CJ Return-path =~ /[0-9]{6,}/
score RET_PATH_NUMS_CJ 5.000
header REPLY_TO_CONSON_CJ Reply-To =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
score REPLY_TO_CONSON_CJ 5.000
header RET_PATH_CONSON_CJ Return-path =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
score RET_PATH_CONSON_CJ 5.000

If you can improve, please do -- have no mercy.

Craig Jackson
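The score cap is easier to weigh with numbers. Below is a rough sketch, assuming a message is tagged once its summed rule scores reach the threshold. The `is_tagged` helper and the 1.5-point companion rule are hypothetical; only the 5.0, 3.0 and 6.0 figures come from the thread.

```python
def is_tagged(rule_scores, threshold=6.0):
    """Assumed model: tag the message when the summed scores reach the threshold."""
    return sum(rule_scores) >= threshold


# One of the posted rules at 5.0 plus a single hypothetical 1.5-point rule.
aggressive = [5.0, 1.5]

# The same two hits with the custom rule capped at 3.0.
capped = [3.0, 1.5]

print(is_tagged(aggressive))  # one weak extra hit is enough to tag
print(is_tagged(capped))      # needs more evidence before tagging
```

With a threshold of 6, a 5-point rule leaves almost no room for a false positive, while a 3-point rule needs at least two or three other hits to agree.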
Re: Couple of useful tests
Wolfgang Zeikat wrote:

> I'd suggest removing the y there.
>
> Shouldn't that be Return-Path instead of Return-path?

In SpamAssassin, header names are case-insensitive.

> Speaking of Return-Paths, have you checked your rules against mailing
> list software (ezmlm?!) envelope sender addresses? IIRC, they slightly
> resemble what you are trying to match ...

True, although most only use 5 digits or so. I guess it depends a lot on how deep the list archive goes, e.g. this list: [EMAIL PROTECTED]
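To make the digit-count point concrete, here is a small sketch of the number pattern applied to list-style envelope senders. It assumes ezmlm-style return addresses of the form `list-return-NUMBER-user=host@listhost`; the list name, message numbers and subscriber below are all made up.

```python
import re

# Digit pattern copied from the posted rules.
NUMS = re.compile(r"[0-9]{6,}")

# Hypothetical ezmlm-style envelope senders (assumed format, invented values).
five_digits = "users-return-48213-jane=example.com@lists.example.org"
six_digits = "users-return-148213-jane=example.com@lists.example.org"

print(bool(NUMS.search(five_digits)))  # five digits: below the {6,} minimum
print(bool(NUMS.search(six_digits)))   # six digits: the rule fires
```

Under that assumption, a list would start tripping RET_PATH_NUMS_CJ once its message counter passes 99999.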
Re: Couple of useful tests
Wolfgang Zeikat wrote:

> On 06/01/05 20:50, Craig Jackson wrote:
>
>> Hi,
>> I created these tests which I find very accurate for detecting spam and
>> so thought I'd let the list have a view. Lots of numbers or consonants
>> in the reply-to usually bodes ill.
>
> Good point about the reply-to, thanks!
>
>> header REPLY_TO_NUMS_CJ Reply-To =~ /[0-9]{6,}/
>> score REPLY_TO_NUMS_CJ 5.000
>> header RET_PATH_NUMS_CJ Return-path =~ /[0-9]{6,}/
>> score RET_PATH_NUMS_CJ 5.000
>> header REPLY_TO_CONSON_CJ Reply-To =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
>> score REPLY_TO_CONSON_CJ 5.000
>> header RET_PATH_CONSON_CJ Return-path =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
>> score RET_PATH_CONSON_CJ 5.000
>
> I'd suggest removing the y there.
>
> Shouldn't that be Return-Path instead

Yes, you might be right. Many first names end in consonants plus y, followed by more consonants at the start of the last name, e.g. [EMAIL PROTECTED]

Also, it might be good to throw in a ~ and a *, which are usually part of spam.
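The effect of dropping the y can be checked directly. This is a minimal sketch comparing the posted character class with one that leaves y out; the address is invented to fit the "name ending in y, surname starting with consonants" case described above.

```python
import re

# Posted pattern, and the same pattern with y removed from the class.
WITH_Y = re.compile(r"[bcdfghjklmnpqrstvwxyz]{5,}.*@", re.IGNORECASE)
WITHOUT_Y = re.compile(r"[bcdfghjklmnpqrstvwxz]{5,}.*@", re.IGNORECASE)

# Invented address: "kathy" + "schmidt" gives the run t-h-y-s-c-h-m.
address = "kathyschmidt@example.com"

print(bool(WITH_Y.search(address)))     # y joins the run, so the rule fires
print(bool(WITHOUT_Y.search(address)))  # y splits the run into "th" and "schm"
```

A run of pure consonants such as `xkqzrtw@example.com` still matches both versions, so removing the y should only cost hits on names like the one above.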
Re: Couple of useful tests
Pierre Thomson wrote:

> Even after corpus tests, I never give a single rule a score over 3
> (local threshold is 6). There's no reason a real live person couldn't
> choose a consonant-only email name, and I know of some universities
> that give out addresses like [EMAIL PROTECTED] which would trigger
> your first rule.
>
> Pierre Thomson
> BIC
>
> -----Original Message-----
> From: Craig Jackson [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, June 01, 2005 2:50 PM
> To: users@spamassassin.apache.org
> Subject: Couple of useful tests
>
> Hi,
>
> I created these tests which I find very accurate for detecting spam
> and so thought I'd let the list have a view. Lots of numbers or
> consonants in the reply-to usually bodes ill.
>
> header REPLY_TO_NUMS_CJ Reply-To =~ /[0-9]{6,}/
> score REPLY_TO_NUMS_CJ 5.000
> header RET_PATH_NUMS_CJ Return-path =~ /[0-9]{6,}/
> score RET_PATH_NUMS_CJ 5.000
> header REPLY_TO_CONSON_CJ Reply-To =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
> score REPLY_TO_CONSON_CJ 5.000
> header RET_PATH_CONSON_CJ Return-path =~ /[bcdfghjklmnpqrstvwxyz]{5,}.*@/i
> score RET_PATH_CONSON_CJ 5.000

We're extremely aggressive with the scores because the tagged mail is sent to an IMAP folder -- and not deleted. We have strict email policies that preclude all personal email. This means that many emails that SpamAssassin would ordinarily try to allow through are fair targets for us.