Hello Sandy,
Tuesday, August 3, 2004, 6:43:13 AM, you wrote:
SS> Tom -
SS> I've seen these too, and have put in the following rules to try and catch
SS> them. They've helped, but a few are still slipping through. I think if you
SS> raise the scores some they'd catch a lot more, but I need to be a little
SS> more sure I won't get FPs before I do this!
Sandy, results of my mass-check here:
Section 3 -- Frequencies Log
(First numeric frequencies, followed by percentage frequencies)
OVERALL SPAM HAM S/O RANK SCORE NAME
58315 33581 24734 0.576 0.00 0.00 (all messages)
668 462 206 0.623 1.00 0.60 ODD_CHAR_COMMA_BA
287 249 38 0.828 0.89 0.60 ODD_CHAR_CARET_BA
557 261 296 0.394 0.78 0.60 ODD_CHAR_DOT_BA
75 44 31 0.511 0.44 0.60 ODD_CHAR_TIC1_BA
887 197 690 0.174 0.33 0.60 ODD_CHAR_UNDERSCORE_BA
6363 240 6123 0.028 0.11 0.60 ODD_CHAR_TILDE_BA
5988 194 5794 0.024 0.00 0.60 ODD_CHAR_DASH_BA
1151 183 968 0.122 0.00 0.60 ODD_CHAR_TIC2_BA
OVERALL% SPAM% HAM% S/O RANK SCORE NAME
58315 33581 24734 0.576 0.00 0.00 (all messages)
100.000 57.5855 42.4145 0.576 0.00 0.00 (all messages as %)
1.146 1.3758 0.8329 0.623 1.00 0.60 ODD_CHAR_COMMA_BA
0.492 0.7415 0.1536 0.828 0.89 0.60 ODD_CHAR_CARET_BA
0.955 0.7772 1.1967 0.394 0.78 0.60 ODD_CHAR_DOT_BA
0.129 0.1310 0.1253 0.511 0.44 0.60 ODD_CHAR_TIC1_BA
1.521 0.5866 2.7897 0.174 0.33 0.60 ODD_CHAR_UNDERSCORE_BA
10.911 0.7147 24.7554 0.028 0.11 0.60 ODD_CHAR_TILDE_BA
10.268 0.5777 23.4252 0.024 0.00 0.60 ODD_CHAR_DASH_BA
1.974 0.5450 3.9136 0.122 0.00 0.60 ODD_CHAR_TIC2_BA
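All of these rules hit a significant amount of ham. Indeed, UNDERSCORE_BA,
DOT_BA, TILDE_BA, DASH_BA, and TIC2_BA hit more ham than spam in my corpus.
TILDE_BA and DASH_BA each hit nearly a quarter of my ham, so raising the
scores on those looks risky.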
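
For reference, the S/O column above appears to be the spam hit rate divided by the sum of the spam and ham hit rates, which is why the corpus being mostly spam does not inflate it. A minimal Python sketch that recomputes it from the raw counts in the first table (the function name `s_over_o` and the `ROWS` layout are my own, not part of the mass-check tools):

```python
# Corpus totals, taken from the "(all messages)" row.
TOTAL_SPAM = 33581
TOTAL_HAM = 24734

# (spam hits, ham hits) per rule, copied from the numeric table.
ROWS = {
    "ODD_CHAR_COMMA_BA": (462, 206),
    "ODD_CHAR_CARET_BA": (249, 38),
    "ODD_CHAR_DOT_BA": (261, 296),
    "ODD_CHAR_TIC1_BA": (44, 31),
    "ODD_CHAR_UNDERSCORE_BA": (197, 690),
    "ODD_CHAR_TILDE_BA": (240, 6123),
    "ODD_CHAR_DASH_BA": (194, 5794),
    "ODD_CHAR_TIC2_BA": (183, 968),
}


def s_over_o(spam_hits, ham_hits, total_spam=TOTAL_SPAM, total_ham=TOTAL_HAM):
    """Spam hit rate divided by (spam hit rate + ham hit rate)."""
    spam_pct = 100.0 * spam_hits / total_spam
    ham_pct = 100.0 * ham_hits / total_ham
    return spam_pct / (spam_pct + ham_pct)


if __name__ == "__main__":
    for name, (spam_hits, ham_hits) in ROWS.items():
        print(f"{s_over_o(spam_hits, ham_hits):.3f}  {name}")
```

Anything well below 0.5 fires proportionally more often on ham than on spam, so scoring it positively works against you.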
Bob Menschel
SS> body ODD_CHAR_DASH_BA /\-[,\~\'\`\_\.\^]/
SS> describe ODD_CHAR_DASH_BA Odd combo of special chars
SS> score ODD_CHAR_DASH_BA .6
SS> body ODD_CHAR_CARET_BA /\^[\-,\~\'\`\_\.\^]/
SS> describe ODD_CHAR_CARET_BA Odd combo of special chars
SS> score ODD_CHAR_CARET_BA .6
SS> body ODD_CHAR_DOT_BA /\.[\^\~\'\`\_]/
SS> describe ODD_CHAR_DOT_BA Odd combo of special chars
SS> score ODD_CHAR_DOT_BA .6
SS> body ODD_CHAR_UNDERSCORE_BA /\_[\.\^\-,\~\'\`]/
SS> describe ODD_CHAR_UNDERSCORE_BA Odd combo of special chars
SS> score ODD_CHAR_UNDERSCORE_BA .6
SS> body ODD_CHAR_TIC1_BA /\`[\_\.\^\-,\~\']/
SS> describe ODD_CHAR_TIC1_BA Odd combo of special chars
SS> score ODD_CHAR_TIC1_BA .6
SS> body ODD_CHAR_TIC2_BA /\'[\`\_\.\^\-,\~]/
SS> describe ODD_CHAR_TIC2_BA Odd combo of special chars
SS> score ODD_CHAR_TIC2_BA .6
SS> body ODD_CHAR_TILDE_BA /\~[\'\`\_\.\^\-,\~]/
SS> describe ODD_CHAR_TILDE_BA Odd combo of special chars
SS> score ODD_CHAR_TILDE_BA .6
SS> body ODD_CHAR_COMMA_BA /,[\~\'\`\_\.\^\-]/
SS> describe ODD_CHAR_COMMA_BA Odd combo of special chars
SS> score ODD_CHAR_COMMA_BA .6
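
To see why these patterns hit ham, it can help to run them over ordinary text. A rough Python sketch follows. The patterns are my transcription of the rules above with the redundant backslashes dropped, the sample strings are made up for illustration, and Python's `re` is only an approximation of how SpamAssassin applies body rules to rendered text.

```python
import re

# Transcribed from the ODD_CHAR_* body rules; same characters, fewer escapes.
RULES = {
    "ODD_CHAR_DASH_BA": r"-[,~'`_.^]",
    "ODD_CHAR_CARET_BA": r"\^[-,~'`_.^]",
    "ODD_CHAR_DOT_BA": r"\.[\^~'`_]",
    "ODD_CHAR_UNDERSCORE_BA": r"_[.^\-,~'`]",
    "ODD_CHAR_TIC1_BA": r"`[_.^\-,~']",
    "ODD_CHAR_TIC2_BA": r"'[`_.^\-,~]",
    "ODD_CHAR_TILDE_BA": r"~['`_.^\-,~]",
    "ODD_CHAR_COMMA_BA": r",[~'`_.^\-]",
}


def rules_hit(text):
    """Return the sorted names of all rules whose pattern occurs in text."""
    return sorted(name for name, pat in RULES.items() if re.search(pat, text))


# Hypothetical ham-like snippets, not taken from any real corpus.
SAMPLES = [
    "~~~~~~~~~~",                   # a separator line of tildes
    "see __init__.py",              # a file name with underscore then dot
    "offset of -.5 dB",             # a negative decimal without leading zero
    "He called it 'odd', twice.",   # a closing quote followed by a comma
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(f"{sample!r}: {rules_hit(sample)}")
```

Note that TILDE_BA includes the tilde in its own character class, so any run of two or more tildes matches it.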
SS> Sandy
SS> ----- Original Message -----
SS> From: "Thomas Kinghorn" <[EMAIL PROTECTED]>
SS> To: <[EMAIL PROTECTED]>
SS> Sent: Tuesday, August 03, 2004 5:38 AM
SS> Subject: more and more junk getting through
>> Hi List
>>
>> I have had a dramatic increase in junk mail.
>>
>> I am using SA 2.63 with SURBL.
>> MTA is exim 4.34 using sa-exim-4
>>
>> I have attached a sample.
>>
>> Subjects and originating IPs are always different,
>> but the formatting of the body is similar.
>>
>> Any ideas would be appreciated.
>>
>> <<important increased muscle mass without exercise.msg>>
>>
>> Regards,
>>
>> Tom Kinghorn
>>
>>
--
Best regards,
Robert