On Thu, 2003-03-13 at 02:40, Bob Miller wrote:
> In that particular example, if you split on any punctuation, there are
> nine non-words.  If you consider apostrophe as a word-constituent
> character, there are no words correctly spelled.
> 
> Would a test that simple work?

That might be a great test to apply to subject lines.

> Aside: I thought one of the original design goals for TarProxy was to
> reuse, not reinvent, filtering heuristics.

Such functionality would go in a Tokenizer, which would notify a
Classifier of a very badly spelled subject line
(META.BAD_SUBJECT_SPELLING=Y ?); the Classifier is still the decision
maker.  Since multiple Tokenizers can be used, TarProxy admins can
choose what info to provide to a Classifier.  I hope to have a lot of
Tokenizers available as part of an initial distribution.
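
As a rough illustration of the idea, here is a minimal sketch of such a
spelling check, using the "split on any punctuation" variant of the test.
It is written in Python for brevity and does not use TarProxy's actual
Tokenizer interface; the function name, the tiny word list, and the 0.5
threshold are all assumptions made up for the example.

```python
import re

# Hypothetical stand-in for a real dictionary; an actual tokenizer
# would load a full word list instead.
KNOWN_WORDS = {"get", "your", "free", "meeting", "notes",
               "for", "the", "project", "update", "on"}

def subject_spelling_tokens(subject, known_words=KNOWN_WORDS, threshold=0.5):
    """Return META tokens describing a subject line (hypothetical interface).

    Emits META.BAD_SUBJECT_SPELLING=Y when more than `threshold` of the
    words are not in `known_words`. The decision about the message is
    left to whatever classifier consumes the token.
    """
    # Split on anything that is not a letter, so punctuation
    # (including apostrophes) breaks a word into pieces.
    words = [w.lower() for w in re.split(r"[^A-Za-z]+", subject) if w]
    if not words:
        return []
    misspelled = sum(1 for w in words if w not in known_words)
    if misspelled / len(words) > threshold:
        return ["META.BAD_SUBJECT_SPELLING=Y"]
    return []
```

For example, subject_spelling_tokens("G.e.t y.o.u.r f.r.e.e") splits into
eleven single-letter fragments, none of which is a known word, so the
token is emitted; a subject made only of known words yields no token.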

Here are a couple more that might be useful to Classifiers:
META.OUTSIDE_BUSINESS_HOURS=Y and META.WEEKEND=Y.
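
A sketch of how those two tokens might be derived from a message's
arrival time follows. Again this is Python for illustration only, not
TarProxy code; the function name and the 9-to-17 business hours are
assumptions, and a real version would presumably make both the hours and
the time zone configurable.

```python
from datetime import datetime

def time_tokens(received, open_hour=9, close_hour=17):
    """Return META tokens based on when a message arrived (hypothetical).

    `received` is a datetime in the admin's local time. The two tokens
    are computed independently, so a Saturday-noon message gets
    META.WEEKEND=Y but not META.OUTSIDE_BUSINESS_HOURS=Y.
    """
    tokens = []
    # datetime.weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
    if received.weekday() >= 5:
        tokens.append("META.WEEKEND=Y")
    # Business hours are treated as the half-open range [open, close).
    if not (open_hour <= received.hour < close_hour):
        tokens.append("META.OUTSIDE_BUSINESS_HOURS=Y")
    return tokens
```

A message received at 02:40 on Thursday 2003-03-13 would get only
META.OUTSIDE_BUSINESS_HOURS=Y under these assumed hours.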

-- 
Marty Lamb
Martian Software
<mlamb at martiansoftware dot com>

----
: The tarproxy-list mailing list is archived at
:   http://www.mail-archive.com/tarproxy-list%40martiansoftware.com/
:
: To unsubscribe from this list, follow the instructions at
:   http://www.martiansoftware.com/contact.html
:
: TarProxy's project page can be found at
:   http://www.martiansoftware.com/tarproxy
