On Thu, 23 Dec 2004, John wrote:

> On Thu, 23 Dec 2004, Matt Kettler wrote:
> > At 12:06 PM 12/23/2004, John wrote:
> > >Matt,
> > >I appreciate this info! Is there a place where I can go to find more about
> > >how this all works?
> >
> > Not that I'm aware of. There's some bits of information in the wiki, but
> > there's no "one general source" of information...
> >
> > http://wiki.apache.org/spamassassin/HowScoresAreAssigned?action=highlight&value=perceptron
> >
> >
> > Probably the best block of information is in the readme for the perceptron
> > itself:
> >
> > http://spamassassin.apache.org/full/3.0.x/dist/masses/README.perceptron
> >
> > The rest of it tends to come from having a feel for statistics and how
> > statistical systems operate, and a little bit of watching the devs discuss
> > concepts over the years.
> >
> Matt,
> I checked out these sources earlier, but as you say, they seem like just bits
> and pieces. I guess there is no one place that has all that is needed to
> generate scores. I am currently testing some of the rulesets found in
> rules_du_jour and I am running into a problem. These rules hit my spam
> corpus, and when I run mass-check the spam.log file shows entries for these
> rules, but when I run the perceptron, perceptron.scores has no entries
> for these custom rules. Also, when I run hit-frequencies none of these
> rules show up. I placed the custom rules in
> Mail-SpamAssassin-3.0.1/masses/spamassassin/local.cf. I also placed them
> in Mail-SpamAssassin-3.0.1/rules/local.cf. It seems that mass-check finds
> them but perceptron doesn't.
> http://wiki.apache.org/spamassassin/MassCheck suggests running mass-check
> on custom rules but it doesn't really describe where these rules should be
> placed. In going through the code of hit-frequencies (it also cannot
> find these rules) I noticed that it calls parse-rules-for-masses, which
> apparently only checks the rules directory for "[0-9]*.cf". I haven't
> looked through perceptron.c yet, but before I start investing a lot of
> time I was wondering if you have run into this problem.
> One more thing: I run hit-frequencies and perceptron without any options
> that change the rules directory, so they should be running with the default
> of ../rules.
> Thanks,
> John

Well, the obvious solution was to put the new rules into the existing
rules/[0-9]*.cf files, which I did, and I now have scores for the custom
rules in perceptron.scores. Either I was doing something wrong, or
mass-check looks in more places than perceptron does for rules, which is a
bit confusing when wading through all of this to generate scores. Any other
comments to help enlighten me on this subject are welcome. Other than that,
thanks for the info.
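
For anyone else who runs into this, here is a rough Python sketch of why
local.cf would be skipped if the "[0-9]*.cf" pattern is applied to file
names as a shell-style glob. That is my assumption; this is not the
parse-rules-for-masses code itself, and the file names below are only
examples for illustration.

```python
from fnmatch import fnmatchcase

# Pattern quoted above; assumed here to act as a shell-style glob on file names.
PATTERN = "[0-9]*.cf"


def picked_up(filenames, pattern=PATTERN):
    """Return the file names that match the glob pattern, in input order."""
    return [name for name in filenames if fnmatchcase(name, pattern)]


# Hypothetical file names, chosen only to show what does and does not match.
candidates = ["20_head_tests.cf", "50_scores.cf", "local.cf", "99_custom.cf"]

matched = picked_up(candidates)
skipped = [name for name in candidates if name not in matched]

print("matched:", matched)
print("skipped:", skipped)
```

Under that assumption, only file names that start with a digit and end in
.cf are matched, so local.cf is left out while a digit-prefixed file in the
same directory would be picked up.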


