Hi Joe,

Digging in deeper yields the following:

1.  The mailet log says there is no mail server named hostname.myfirst.domain, so
I changed my config.xml to just [email protected] for Bayesian analysis.
What should I be forwarding my spam to? Do I need to purchase another domain for this to work?

No, you don't need to purchase another domain.

The 'trick' is to ensure that you have SMTP Authentication enabled in James. Then in your config file you set up the root pipeline such that any email sent by an authenticated user to your special spam or ham address gets passed to the Bayesian analysis feeder. If you do this then the ham or spam addresses can be anything you like. Indeed, it is preferable to define some nonsense addresses so that only authenticated users can update the corpus (you don't want spammers to poison your corpus by sending non-spam to your spam address... rare though that might be). I'll show you how to do this in a minute after I've answered your other questions.

2.  All of the tables except deadletter are empty.

Yes.  This indicates that no spam or ham is being added to the corpus.

3.  Sending to [email protected] - I see "Corpus loaded" message in
mailet.log, but no entry in db - bayesian.._spam table.

This message doesn't mean that anything tried to add a spam or ham message. It simply means the Bayesian filter mailet loaded the corpus, which happens to be empty. It is good news that the corpus was loaded. It is bad news that we know it is empty!

4.  When I forward to [email protected]  - The spam ends up in
file://var/mail/address-error/ and not in database.

This also points to the fact that something is wrong with adding spam or ham to the corpus. Nothing is picking up emails and sending them to the Bayesian feeder.

5.  Should my config.xml file include a complete table name in the
repositoryPath?  It is currently  db://maildb    WITHOUT any table name,
i.e. bayesiananalysis_spam....  How does it know this?  Is it hardcoded in
the class?

The table names are hard-coded, I believe, so just specifying 'db://maildb' is sufficient.


Ok... so that's the theory and questions out of the way...

Now, in your case, according to the config.xml file you published for us, you have the following in your root processing pipeline:

<mailet match="[email protected]" class="BayesianAnalysisFeeder">
  <repositoryPath> db://maildb </repositoryPath>
  <feedType>ham</feedType>
  <maxSize>500000</maxSize>
</mailet>

<mailet match="[email protected]" class="BayesianAnalysisFeeder">
  <repositoryPath> db://maildb </repositoryPath>
  <feedType>spam</feedType>
  <maxSize>500000</maxSize>
</mailet>

Can you see the problem ;-) ?

According to this you should send your ham messages to '[email protected]' and your spam messages to... exactly the same address (assuming this is not an error you made in editing the file to remove sensitive data).

For comparison here's the same section in my config.xml: -

<!-- "not spam" bayesian analysis feeder. -->
<mailet match="[email protected]" class="BayesianAnalysisFeeder">
  <repositoryPath> db://maildb </repositoryPath>
  <feedType>ham</feedType>
  <maxSize>500000</maxSize>
</mailet>

<!-- "spam" bayesian analysis feeder. -->
<mailet match="[email protected]" class="BayesianAnalysisFeeder">
  <repositoryPath> db://maildb </repositoryPath>
  <feedType>spam</feedType>
  <maxSize>500000</maxSize>
</mailet>

I send all my spam messages to '[email protected]' and ham messages to '[email protected]'. I have not sanitized these addresses; they are the actual addresses I use. I can do this because my mail client (Thunderbird in my case) is set up to send mail through my James SMTP server, so it has to log in to the server and is therefore SMTP authenticated. The Bayesian feeder picks up this email early in the pipeline, so James never attempts to deliver it to an address in the xxx.yyy domain.
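
If you want to make the "authenticated users only" rule explicit rather than relying on relay restrictions, one possible layout is sketched below. Treat it as a rough illustration and not a tested configuration: the processor name 'bayesian-feed' and both feed addresses are placeholders I made up, and I am assuming your config still has the stock SMTPAuthSuccessful, RecipientIs and All matchers, the ToProcessor mailet, and a 'transport' processor as in the sample config.xml shipped with James.

```xml
<!-- Sketch only. The processor name "bayesian-feed" and both feed addresses
     are hypothetical placeholders, not values from either config.xml. -->
<processor name="root">
  <!-- Divert mail from SMTP-authenticated senders to the feeder processor. -->
  <mailet match="SMTPAuthSuccessful" class="ToProcessor">
    <processor> bayesian-feed </processor>
  </mailet>

  <!-- ... the rest of your existing root pipeline stays here ... -->
</processor>

<processor name="bayesian-feed">
  <!-- "not spam" feeder: note the address differs from the spam one. -->
  <mailet match="RecipientIs=ham-feed@example.invalid" class="BayesianAnalysisFeeder">
    <repositoryPath> db://maildb </repositoryPath>
    <feedType>ham</feedType>
    <maxSize>500000</maxSize>
  </mailet>

  <!-- "spam" feeder. -->
  <mailet match="RecipientIs=spam-feed@example.invalid" class="BayesianAnalysisFeeder">
    <repositoryPath> db://maildb </repositoryPath>
    <feedType>spam</feedType>
    <maxSize>500000</maxSize>
  </mailet>

  <!-- Any other mail from authenticated users carries on to delivery. -->
  <mailet match="All" class="ToProcessor">
    <processor> transport </processor>
  </mailet>
</processor>
```

Be aware that with this layout mail from authenticated users skips whatever else is in your root processor (spam checks, for instance), so only use it if that is acceptable for your setup. The important points are simply that the ham and spam feeders match two different addresses, and that only authenticated senders can reach them.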

Hope this helps.

Regards,
David Legg

