----- Original Message ----- 
From: "Robert Menschel" <[EMAIL PROTECTED]>
To: "Sandy S" <[EMAIL PROTECTED]>
Cc: "Thomas Kinghorn" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Tuesday, August 03, 2004 2:42 PM
Subject: Re[2]: more are more junk getting through


> Hello Sandy,
>
> Tuesday, August 3, 2004, 6:43:13 AM, you wrote:
>
> SS> Tom  -
> SS> I've seen these too, and have put in the following rules to try and catch
> SS> them.  They've helped, but a few are still slipping through.  I think if you
> SS> raise the scores some they'd catch a lot more, but I need to be a little
> SS> more sure I won't get FPs before I do this!
>
> Sandy, results of my mass-check here:
>
> Section 3 -- Frequencies Log
> (First numeric frequencies, followed by percentage frequencies)
>
> OVERALL    SPAM      HAM      S/O    RANK  SCORE  NAME
>   58315    33581    24734    0.576   0.00   0.00  (all messages)
>     668      462      206    0.623   1.00   0.60  ODD_CHAR_COMMA_BA
>     287      249       38    0.828   0.89   0.60  ODD_CHAR_CARET_BA
>     557      261      296    0.394   0.78   0.60  ODD_CHAR_DOT_BA
>      75       44       31    0.511   0.44   0.60  ODD_CHAR_TIC1_BA
>     887      197      690    0.174   0.33   0.60  ODD_CHAR_UNDERSCORE_BA
>    6363      240     6123    0.028   0.11   0.60  ODD_CHAR_TILDE_BA
>    5988      194     5794    0.024   0.00   0.60  ODD_CHAR_DASH_BA
>    1151      183      968    0.122   0.00   0.60  ODD_CHAR_TIC2_BA
>
> OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
>   58315    33581    24734    0.576   0.00    0.00  (all messages)
> 100.000  57.5855  42.4145    0.576   0.00    0.00  (all messages as %)
>   1.146   1.3758   0.8329    0.623   1.00    0.60  ODD_CHAR_COMMA_BA
>   0.492   0.7415   0.1536    0.828   0.89    0.60  ODD_CHAR_CARET_BA
>   0.955   0.7772   1.1967    0.394   0.78    0.60  ODD_CHAR_DOT_BA
>   0.129   0.1310   0.1253    0.511   0.44    0.60  ODD_CHAR_TIC1_BA
>   1.521   0.5866   2.7897    0.174   0.33    0.60  ODD_CHAR_UNDERSCORE_BA
>  10.911   0.7147  24.7554    0.028   0.11    0.60  ODD_CHAR_TILDE_BA
>  10.268   0.5777  23.4252    0.024   0.00    0.60  ODD_CHAR_DASH_BA
>   1.974   0.5450   3.9136    0.122   0.00    0.60  ODD_CHAR_TIC2_BA
>
> All these rules hit significant ham.  Indeed, UNDERSCORE_BA, DOT_BA,
> TILDE_BA, DASH_BA, and TIC2_BA hit more ham than spam in my corpus.
>
> Bob Menschel
>

Ouch!  Thanks for running the check.  Obviously I will need to keep these
scored very low if I continue to use them.

Sandy
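
For anyone reading the S/O column in Bob's tables: the figures are consistent with S/O = spam% / (spam% + ham%), where each percentage is taken against its own corpus size (33581 spam, 24734 ham). The following is only a minimal sketch under that assumption, not the mass-check code itself; the function name s_o and the dictionary layout are made up for illustration, and the hit counts are copied from the numeric table above.

```python
# Sketch: recompute the S/O column from raw hit counts.
# Assumption: S/O = spam% / (spam% + ham%), each percentage relative
# to the size of its own corpus. This matches the posted figures.

SPAM_TOTAL = 33581  # spam messages in the corpus (from the table)
HAM_TOTAL = 24734   # ham messages in the corpus (from the table)


def s_o(spam_hits, ham_hits, spam_total=SPAM_TOTAL, ham_total=HAM_TOTAL):
    """Return the spam/overall ratio for one rule, or None if it never hit."""
    spam_pct = 100.0 * spam_hits / spam_total
    ham_pct = 100.0 * ham_hits / ham_total
    if spam_pct + ham_pct == 0:
        return None
    return spam_pct / (spam_pct + ham_pct)


# (spam hits, ham hits) per rule, copied from the numeric table.
RULES = {
    "ODD_CHAR_COMMA_BA": (462, 206),
    "ODD_CHAR_CARET_BA": (249, 38),
    "ODD_CHAR_DOT_BA": (261, 296),
    "ODD_CHAR_TIC1_BA": (44, 31),
    "ODD_CHAR_UNDERSCORE_BA": (197, 690),
    "ODD_CHAR_TILDE_BA": (240, 6123),
    "ODD_CHAR_DASH_BA": (194, 5794),
    "ODD_CHAR_TIC2_BA": (183, 968),
}

if __name__ == "__main__":
    for name, (spam_hits, ham_hits) in RULES.items():
        print(f"{s_o(spam_hits, ham_hits):.3f}  {name}")
```

An S/O below 0.5 means the rule fires on a larger share of the ham corpus than of the spam corpus, which is why the UNDERSCORE, DOT, TILDE, DASH and TIC2 rules look risky at any meaningful score.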

