Hello Loren,

Monday, February 16, 2004, 7:42:19 AM, you wrote:

LW> body     DUMB_PERIODS    /(?:.*\b[a-z]{3,10}[\.\!][a-z]{3,10}\b){6,30}/i
LW> describe DUMB_PERIODS    Writer doesn't put spaces after periods.
LW> score    DUMB_PERIODS    2.0    # not real high, can match source code listings

LW> This is UNTESTED, but might help.  You can twiddle the score higher if
LW> nobody ever sends you code listings in mail.  I'd really like to run this
LW> against a corpus and see how much ham it catches before putting it in my own
LW> configuration.
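
For anyone who wants to poke at the pattern before running a full mass-check, here is a rough sketch in Python. It assumes Python's re module treats this particular pattern the same way Perl does, and it tests one line at a time, which only approximates how SpamAssassin feeds body text to a rule. The three sample strings are invented for illustration; they are not from any corpus.

```python
import re

# Loren's pattern, transcribed for Python's re module.
# Assumption: Perl and Python behave the same for this pattern.
DUMB_PERIODS = re.compile(
    r"(?:.*\b[a-z]{3,10}[\.\!][a-z]{3,10}\b){6,30}",
    re.IGNORECASE,
)


def hits(text: str) -> bool:
    """Return True if the rule pattern matches anywhere in the text."""
    return DUMB_PERIODS.search(text) is not None


# Hypothetical samples, made up for this sketch.
spam_like = "Buy now.Great deals.Save today.Call now.Limited time.Order here.Thanks"
code_like = "x = self.data; foo.bar(); obj.method(); sys.argv; math.sqrt(2); time.sleep(1)"
normal = "Please see the attached report. It covers sales for March."

for label, sample in [("spam_like", spam_like),
                      ("code_like", code_like),
                      ("normal", normal)]:
    print(label, hits(sample))
```

The code-like line contains six word.word tokens, so it trips the rule just as the run-together spam text does, while ordinary prose with a space after each period does not.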

Results against my corpus:

DUMB_PERIODS -- 5029s/1518h of 100794 corpus (82099s/18695h) 02/16/04
DUMB_PERIODS -- suggested score: 0.184 (of 5.0)

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
 100794    82099    18695    0.815   0.00    0.00  (all messages)
100.000  81.4523  18.5477    0.815   0.00    0.00  (all messages as %)

  6.495   6.1255   8.1198    0.430   0.00    2.00  DUMB_PERIODS

It matches 8.1% of my ham but only 6.1% of my spam, so on my corpus it leans toward ham rather than spam.
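
The percentages follow directly from the hit counts. The sketch below redoes the arithmetic, assuming S/O is computed as spam% / (spam% + ham%); that formula is my assumption, though it does reproduce the 0.430 reported for this rule.

```python
# Hit counts and corpus sizes taken from the results reported here.
spam_hits, ham_hits = 5029, 1518
spam_total, ham_total = 82099, 18695

# Share of each class that the rule hits, as percentages.
spam_pct = 100 * spam_hits / spam_total
ham_pct = 100 * ham_hits / ham_total
overall_pct = 100 * (spam_hits + ham_hits) / (spam_total + ham_total)

# Assumed definition of S/O: the spam share of the two hit rates.
# A value below 0.5 means the rule hits ham at a higher rate than spam.
s_o = spam_pct / (spam_pct + ham_pct)

print(f"OVERALL% {overall_pct:.3f}  SPAM% {spam_pct:.4f}  "
      f"HAM% {ham_pct:.4f}  S/O {s_o:.3f}")
```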

Bob Menschel




