Bayesian spam matcher + SMTPACL after DATA patch for James server v3.0a1 (cvs20040219)
Author: Alexander Zhukov <zhukov@ukrpost.net>

For installation instructions, read the INSTALL file.
This patch adds the following features to James:

    * Bayesian Spam Filter matcher
        There is a lot of buzz around Bayesian spam filtering. The most notable
        article on the subject is Paul Graham's "A Plan for Spam"
        (http://www.paulgraham.com/spam.html). It describes a whole new approach
        to filtering spam, which I won't describe here (read the article). I
        decided to implement this type of spam filter as a James matcher. My
        matcher is based on my Bayesian filter library (called bayes.jar :)).
        The library uses a simplified version of CRM114's (crm114.sf.net)
        approach to classifying text streams.
        It has three modules: tokenizer, persistence (very poor), and classifier.
        The tokenizer performs the initial tokenization of spam and non-spam
        messages. The tokens are stored in persistent storage (filesystem only
        for now). The classifier uses the tokenizer to convert an incoming
        message into a stream of tokens and calculates the spamminess of the
        message based on the tokens in persistent storage.
        Matcher configuration is simple:
        Add the following to your matchers:
        ---- cut here ----
        <mailet match="BayesianSpam=/ukrpost/mail/bayes.spam.stats" class="..."/>
        ---- cut here ----
        /ukrpost/mail/bayes.spam.stats is the URL of the persistent token storage.
        This is what you downloaded from
        http://www.ukrpost.net/research/james/bayes.spam.stats.gz
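
        The tokenizer and classifier described above can be illustrated with a
        minimal Java sketch. It is not the bayes.jar API: the class BayesSketch
        and all of its method names are hypothetical, tokens are kept in memory
        instead of persistent storage, and the scoring is a simplified form of
        the combining rule from Graham's article (unknown tokens get 0.4,
        probabilities are clamped to 0.01..0.99), not the CRM114-style approach
        the library uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical illustration only; not the bayes.jar API. */
final class BayesSketch {
    // Number of training messages of each kind that contained a given token.
    private final Map<String, Integer> spamCounts = new HashMap<>();
    private final Map<String, Integer> hamCounts = new HashMap<>();
    private int spamMessages = 0;
    private int hamMessages = 0;

    /** Tokenizer: lower-case the text and split on anything that is not
     *  an ASCII letter, a digit, '$', an apostrophe or '-'. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9$'-]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Training: count each distinct token once per message. */
    void train(String message, boolean isSpam) {
        Map<String, Integer> counts = isSpam ? spamCounts : hamCounts;
        for (String t : new HashSet<>(tokenize(message))) {
            counts.merge(t, 1, Integer::sum);
        }
        if (isSpam) {
            spamMessages++;
        } else {
            hamMessages++;
        }
    }

    /** Estimated probability that a message containing this token is spam. */
    double tokenProbability(String token) {
        int s = spamCounts.getOrDefault(token, 0);
        int h = hamCounts.getOrDefault(token, 0);
        if (s + h == 0) {
            return 0.4; // token never seen in training
        }
        double spamFreq = (double) s / Math.max(1, spamMessages);
        double hamFreq = (double) h / Math.max(1, hamMessages);
        double p = spamFreq / (spamFreq + hamFreq);
        return Math.max(0.01, Math.min(0.99, p)); // avoid certainty from few samples
    }

    /** Classifier: combine the per-token probabilities into one score in (0, 1). */
    double spamminess(String message) {
        double logSpam = 0.0;
        double logHam = 0.0;
        for (String t : new HashSet<>(tokenize(message))) {
            double p = tokenProbability(t);
            logSpam += Math.log(p);
            logHam += Math.log(1.0 - p);
        }
        // prod(p) / (prod(p) + prod(1 - p)), computed in log space
        return 1.0 / (1.0 + Math.exp(logHam - logSpam));
    }
}
```

        After calling train() on a few known spam and non-spam messages, a
        matcher built on such a class would compare spamminess(message) with a
        threshold of its choosing. Unlike the method in Graham's article, this
        sketch uses every distinct token instead of only the most significant
        ones.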
    
    * SMTPACL after DATA
        This is an extension of my previous SMTPACL patch: ACLs are now applied
        after the DATA command in the SMTP dialog.
        NOTE: Cool feature:
        You can use the BayesianSpam matcher after DATA as an access-control
        matcher, so spam never even reaches your spool.
        Imagine:
        > HELO spammerhost.but.seems.like.normal.one.com
        < 250 hi
        > MAIL FROM: <user@remote>
        < 250 Sender ok
        > RCPT TO: <user@local>
        < 250 Recipient ok
        > DATA
        < 354 Ok Send data ending with <CRLF>.<CRLF>
        > Spam here 
        > Enlarge penis
        > Free viagra
        > bla bla bla
        < 550 SMTPACLReject spam detected.
        > QUIT
        < 221 closing

        This patch separates message processing (the processMail method) from
        message delivery to the spool (the spoolMail method) in the SMTPHandler
        class.
        Matcher + SMTPACL after DATA:

        Add the following to james-config.xml under
        config->smtpserver->handler->smtpacl:
        ---- cut here ----
            <data>
                 <mailet match="BayesianSpam=/ukrpost/mail/bayes.spam.stats" class="SMTPACLReject"/>
                 <mailet match="All" class="SMTPACLAccept"/>
            </data>
        ---- cut here ----
        This will run the Bayesian matcher after the remote server has sent the
        "." that ends the message.
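
        How such a <data> rule list might be evaluated can be sketched in Java.
        This is an illustration, not the patch's code: the class
        AclAfterDataSketch, its Rule type and onEndOfData() are hypothetical,
        and it assumes that rules are tried in order, that the first match
        wins, and that mail is accepted when no rule matches. Only the
        processMail/spoolMail split and the 550 reply text come from this
        README; the 250 reply text is a placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical illustration only; not the code added by the patch. */
final class AclAfterDataSketch {
    enum Action { ACCEPT, REJECT }

    /** One rule: a matcher over the message text plus the action taken on a match. */
    private static final class Rule {
        final Predicate<String> matcher;
        final Action action;

        Rule(Predicate<String> matcher, Action action) {
            this.matcher = matcher;
            this.action = action;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    final List<String> spool = new ArrayList<>(); // stand-in for the real spool

    void addRule(Predicate<String> matcher, Action action) {
        rules.add(new Rule(matcher, action));
    }

    /** Processing step: the first matching rule decides (assumption). */
    Action processMail(String message) {
        for (Rule rule : rules) {
            if (rule.matcher.test(message)) {
                return rule.action;
            }
        }
        return Action.ACCEPT; // assumed default when nothing matches
    }

    /** Delivery step: only reached for accepted messages. */
    void spoolMail(String message) {
        spool.add(message);
    }

    /** Called once the client has finished sending DATA; returns the SMTP reply. */
    String onEndOfData(String message) {
        if (processMail(message) == Action.REJECT) {
            return "550 SMTPACLReject spam detected.";
        }
        spoolMail(message);
        return "250 Message received"; // placeholder reply text
    }
}
```

        Because the decision is made before spoolMail is called, a rejected
        message is never written to the spool.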

        TODO: We could add temporary blocking of IPs that send too much spam,
        based on this matcher. For example: if an IP reaches a limit of 10 spam
        messages per hour, block it for 2 hours.
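
        One possible shape for this TODO, as a Java sketch. Nothing like it
        exists in the patch: the class SpamSourceBlocker and its methods are
        hypothetical, state is kept in memory only, and the caller is assumed
        to pass the current time in milliseconds.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of the patch. */
final class SpamSourceBlocker {
    private final int limit;          // spam messages allowed per window
    private final long windowMillis;  // length of the counting window
    private final long blockMillis;   // how long an offending IP stays blocked

    private final Map<String, Deque<Long>> spamTimes = new HashMap<>();
    private final Map<String, Long> blockedUntil = new HashMap<>();

    SpamSourceBlocker(int limit, long windowMillis, long blockMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.blockMillis = blockMillis;
    }

    /** Record one spam message from this IP; start a block once the limit is reached. */
    synchronized void recordSpam(String ip, long now) {
        Deque<Long> times = spamTimes.computeIfAbsent(ip, k -> new ArrayDeque<>());
        times.addLast(now);
        // Forget spam that is older than the counting window.
        while (!times.isEmpty() && now - times.peekFirst() >= windowMillis) {
            times.removeFirst();
        }
        if (times.size() >= limit) {
            blockedUntil.put(ip, now + blockMillis);
            times.clear();
        }
    }

    /** True while the IP is inside its block period. */
    synchronized boolean isBlocked(String ip, long now) {
        Long until = blockedUntil.get(ip);
        if (until == null) {
            return false;
        }
        if (now >= until) {
            blockedUntil.remove(ip); // block has expired
            return false;
        }
        return true;
    }
}
```

        With new SpamSourceBlocker(10, 3600000L, 7200000L), the tenth spam
        message from one IP within an hour starts a two-hour block. The SMTP
        handler could then check isBlocked() when a connection arrives.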
