Hello Loren,
Monday, February 16, 2004, 7:42:19 AM, you wrote:
LW> body DUMB_PERIODS /(?:.*\b[a-z]{3,10}[\.\!][a-z]{3,10}\b){6,30}/i
LW> describe DUMB_PERIODS Writer doesn't put spaces after periods.
LW> score DUMB_PERIODS 2.0 # not real high, can match source code listings
LW> This is UNTESTED, but might help. You can twiddle the score higher if
LW> nobody ever sends you code listings in mail. I'd really like to run this
LW> against a corpus and see how much ham it catches before putting it in my own
LW> configuration.
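As a quick sanity check before a full corpus run, the pattern can be tried on a few hand-written lines. This is only a rough sketch: it uses Python's `re` rather than SpamAssassin's Perl engine, the sample strings are made up, and the helper name `hits` is hypothetical.

```python
import re

# Python rendering of the DUMB_PERIODS body pattern; re.IGNORECASE mirrors /i.
DUMB_PERIODS = re.compile(
    r"(?:.*\b[a-z]{3,10}[\.\!][a-z]{3,10}\b){6,30}", re.IGNORECASE
)

def hits(text: str) -> bool:
    """Hypothetical helper: True if the pattern is found in one line of text."""
    return DUMB_PERIODS.search(text) is not None

# Made-up samples, not taken from any corpus.
spammy = "Buy now.Save big.Call today!Free offer.Act fast.Limited time.Order here."
code_line = ("import java.util then call list.size, list.add, "
             "map.get, map.put, str.trim, str.length")
prose = "Please send the report on Monday. Thanks for the help!"

print(hits(spammy))     # True: six word.word pairs with no space after the period
print(hits(code_line))  # True: dotted identifiers look the same to the pattern
print(hits(prose))      # False: every period is followed by a space or the end
```

The second sample shows the concern about source code listings: dotted names satisfy the same word, period, word shape as run-together sentences.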
Results against my corpus:
DUMB_PERIODS -- 5029s/1518h of 100794 corpus (82099s/18695h) 02/16/04
DUMB_PERIODS -- suggested score: 0.184 (of 5.0)
OVERALL%   SPAM%     HAM%      S/O    RANK   SCORE  NAME
 100794    82099     18695     0.815  0.00   0.00   (all messages)
100.000    81.4523   18.5477   0.815  0.00   0.00   (all messages as %)
  6.495     6.1255    8.1198   0.430  0.00   2.00   DUMB_PERIODS
It matches 8.1% of my ham and only 6.1% of my spam (S/O 0.430), so as written it hits ham more readily than spam.
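For anyone who wants to check the percentages, they can be recomputed from the raw hit counts. This is a small sketch, assuming S/O is spam% divided by (spam% + ham%); that formula is inferred from the figures in the table, and the function name `hit_stats` is made up for illustration.

```python
def hit_stats(spam_hits, ham_hits, spam_total, ham_total):
    """Return (spam %, ham %, S/O) for one rule.

    Assumption: S/O = spam% / (spam% + ham%), inferred from the table above.
    """
    spam_pct = 100.0 * spam_hits / spam_total
    ham_pct = 100.0 * ham_hits / ham_total
    s_o = spam_pct / (spam_pct + ham_pct)
    return spam_pct, ham_pct, s_o

# Counts from the DUMB_PERIODS summary line: 5029s/1518h of 82099s/18695h.
spam_pct, ham_pct, s_o = hit_stats(5029, 1518, 82099, 18695)
print(f"{spam_pct:.4f} {ham_pct:.4f} {s_o:.3f}")  # 6.1255 8.1198 0.430
```

An S/O below 0.5 means the rule fires on a larger share of ham than of spam, which fits the low suggested score.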
Bob Menschel