The following module was proposed for inclusion in the Module List:
modid: Mail::Classifier
DSLIP: adpOp
description: Framework for statistical/bayesian mail sort
userid: DAGOLDEN (David Golden)
chapterid: 19 (Mail_and_Usenet_News)
communities:
comp.lang.perl.modules; Mail::Box mailing list
similar:
Mail::SpamTest::Bayesian AI::Categorizer
rationale:
The goal of Mail::Classifier is to facilitate rapid creation,
testing, and tuning of mail classification algorithms, such as the
newly popularized Naive Bayesian methods. Unlike other modules, this
module is both more specific, in that it concentrates on processing
mail, and broader, in that it provides utility routines useful for
developers of new or hybrid approaches. The primary functionality of
the abstract base class is mailbox and message handling (leveraging
the capabilities of Mail::Box), data persistence, optional on-disk or
in-memory data table storage, and statistical validation of
classifier performance.
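As an illustration of what statistical validation of a classifier can
look like, here is a minimal sketch in Python. It is not the module's
Perl API: the names validate, keyword_rule, and held_out are
hypothetical, and the statistics shown (a confusion tally and overall
accuracy) are an assumption about what such validation might report.

```python
from collections import Counter


def validate(classify, labeled_messages):
    """Tally (actual, predicted) pairs and compute overall accuracy.

    classify: any callable mapping message text to a category name.
    labeled_messages: iterable of (actual_category, message_text) pairs.
    """
    confusion = Counter()
    for actual, text in labeled_messages:
        confusion[(actual, classify(text))] += 1
    total = sum(confusion.values())
    correct = sum(n for (actual, predicted), n in confusion.items()
                  if actual == predicted)
    accuracy = correct / total if total else 0.0
    return confusion, accuracy


# Toy stand-in classifier, used only to exercise validate().
def keyword_rule(text):
    return "spam" if "offer" in text.lower() else "ham"


# Hypothetical held-out messages with known categories.
held_out = [
    ("spam", "Special offer inside"),
    ("spam", "You have won"),
    ("ham", "Minutes from the meeting"),
    ("ham", "Offer letter attached"),
]
confusion, accuracy = validate(keyword_rule, held_out)
```

Keeping the full tally rather than a single number makes it possible
to look at false positives (good mail marked as spam) separately from
false negatives, which usually matters more for mail sorting than
accuracy alone.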
Compared to Mail::SpamTest::Bayesian, Mail::Classifier is more
general -- it is restricted neither to spam identification nor to the
Bayesian approach. As a framework, it will support derived classes
that can handle any number of categories of mail and which implement
any of a number of algorithms or hybrids. (Indeed, one could write
Mail::Classifier::SpamTest::Bayesian to use Mail::SpamTest::Bayesian
as the implementation behind the scenes.)
Compared to AI::Categorizer, Mail::Classifier is more specific to
the task of processing e-mail. AI::Categorizer is a broader package,
more "academic" in nature and jargon. Because of its steeper learning
curve it is, in my opinion, more useful for research than for people
wanting to hack around with mail sorting. Mail::Classifier
is designed to be much simpler to use and extend. Derived classes
need only implement the specific methods that are core to an
algorithm (parse(), learn(), score(), and a few simple helper
methods to manage algorithm-specific data). The included examples
demonstrate both a trivial (near-random) method and the "Paul
Graham" Bayesian approach as a jumping-off point for people to
develop and test their own ideas and methods.
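To make the division of labor concrete, here is a rough sketch, in
Python rather than Perl, of a base class whose subclasses supply only
parse(), learn(), and score(). It is not the module's actual
interface: the class names Classifier and GrahamStyle, the classify()
helper, and the min_count and top_n parameters are all hypothetical.
The subclass is a simplified two-category variant of the approach in
Paul Graham's "A Plan for Spam" (doubled good counts, probabilities
clamped to 0.01..0.99, 0.4 for rarely seen tokens, the 15 most
interesting tokens combined).

```python
import re
from abc import ABC, abstractmethod
from collections import Counter


class Classifier(ABC):
    """Hypothetical base class: subclasses supply parse/learn/score."""

    @abstractmethod
    def parse(self, message):
        """Turn raw message text into a list of tokens."""

    @abstractmethod
    def learn(self, category, message):
        """Update stored data from one message of known category."""

    @abstractmethod
    def score(self, message):
        """Return a dict mapping each category to a score."""

    def classify(self, message):
        # Shared helper: pick the highest-scoring category.
        scores = self.score(message)
        return max(scores, key=scores.get)


class GrahamStyle(Classifier):
    """Simplified two-category variant of Graham's method."""

    def __init__(self, min_count=5, top_n=15):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}
        self.min_count = min_count
        self.top_n = top_n

    def parse(self, message):
        # Letters, digits, dollar signs, apostrophes, and dashes.
        return re.findall(r"[A-Za-z0-9$'-]+", message.lower())

    def learn(self, category, message):
        self.tokens[category].update(self.parse(message))
        self.messages[category] += 1

    def _token_prob(self, token):
        good = 2 * self.tokens["ham"][token]  # good counts are doubled
        bad = self.tokens["spam"][token]
        if good + bad == 0 or good + bad < self.min_count:
            return 0.4  # unknown or rarely seen token
        g = min(1.0, good / max(1, self.messages["ham"]))
        b = min(1.0, bad / max(1, self.messages["spam"]))
        return max(0.01, min(0.99, b / (g + b)))

    def score(self, message):
        probs = [self._token_prob(t) for t in set(self.parse(message))]
        # Keep the tokens whose probability is furthest from neutral.
        probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
        spam, ham = 1.0, 1.0
        for p in probs[: self.top_n]:
            spam *= p
            ham *= 1.0 - p
        p_spam = spam / (spam + ham)
        return {"spam": p_spam, "ham": 1.0 - p_spam}
```

A short usage example, with min_count lowered so that a four-message
toy corpus is enough to move the scores:

```python
c = GrahamStyle(min_count=1)
c.learn("spam", "buy cheap pills now")
c.learn("spam", "cheap pills cheap offer")
c.learn("ham", "meeting agenda for monday")
c.learn("ham", "lunch on monday")
print(c.classify("cheap pills"))     # spam
print(c.classify("monday meeting"))  # ham
```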
enteredby: DAGOLDEN (David Golden)
enteredon: Tue Jan 14 19:53:08 2003 GMT
The resulting entry would be:
Mail::
::Classifier adpOp Framework for statistical/bayesian mail sort DAGOLDEN
Thanks for registering,
The Pause Team