Re: How do I reduce SPAM

David Legg Sun, 01 Feb 2009 10:38:22 -0800

Hi Joe,

Digging in DEEPER yields the following;
1.  mailet log says there is no mail server named hostname.myfirst.domain so
I change my config.xml to just [email protected] for bayesian analysis.
What should I be forwarding my spam to? Do I need to purchase another domain for this to work?


No you don't need to purchase another domain.

The 'trick' is to ensure that you have SMTP Authentication enabled in James. Then in your config file you set up the root pipeline such that any email sent by an authenticated user to your special spam or ham address gets passed to the Bayesian analysis feeder. If you do this then the ham or spam addresses can be anything you like. Indeed, it is preferable to define some nonsense addresses so that only authenticated users can update the corpus (you don't want spammers to poison your corpus by sending non-spam to your spam address... rare though that might be). I'll show you how to do this in a minute after I've answered your other questions.

2.  All of the tables except deadletter are empty.


Yes.  This indicates that no spam or ham is being added to the corpus.

3.  Sending to [email protected] - I see "Corpus loaded" message in
mailet.log, but no entry in db - bayesian.._spam table.

This message doesn't mean that a spam or ham message was added. It simply means the Bayesian filter mailet loaded the empty corpus. It is good news that the corpus was loaded. It is bad news that we know it is empty!

4.  When I forward to [email protected]  - The spam ends up in
file://var/mail/address-error/ and not in database.

This also points to the fact that something is wrong with adding spam or ham to the corpus. Nothing is picking up emails and sending them to the Bayesian feeder.

5.  Should my config.xml file include a complete table name in the
repositoryPath?  It is currently  db://maildb  WITHOUT any table name,
i.e. bayesiananalysis_spam....  How does it know this?  Is it hardcoded in
the class?

The table names are hard-coded, I believe, so just specifying 'db://maildb' is sufficient.



Ok... so that's the theory and questions out of the way...

Now in your case, according to the config.xml file you published for us, you have the following in your root processing pipeline: -

<mailet match="[email protected]" class="BayesianAnalysisFeeder">
 <repositoryPath> db://maildb </repositoryPath>
 <feedType>ham</feedType>
 <maxSize>500000</maxSize>
 </mailet>

<mailet match="[email protected]" class="BayesianAnalysisFeeder">
 <repositoryPath> db://maildb </repositoryPath>
 <feedType>spam</feedType>
 <maxSize>500000</maxSize>
 </mailet>

Can you see the problem ;-) ?

According to this you should send your ham messages to '[email protected]' and your spam messages to... exactly the same address (assuming this is not an error you made in editing the file to remove sensitive data).
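
If you want to check for this kind of clash mechanically, something along these lines might help. It is only a rough sketch in Python, not anything that ships with James: the function name, the sample pipeline and the 'example.invalid' address are all made up for illustration, and it assumes you feed it a well-formed XML fragment containing the mailet elements.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_conflicting_feeders(pipeline_xml: str) -> dict:
    """Map each match condition to its feed types when more than one
    feed type shares that match (hypothetical helper, not part of James)."""
    root = ET.fromstring(pipeline_xml)
    feed_types_by_match = defaultdict(set)
    for mailet in root.iter("mailet"):
        # Only the Bayesian feeder mailets are of interest here.
        if mailet.get("class") != "BayesianAnalysisFeeder":
            continue
        feed_type = (mailet.findtext("feedType") or "").strip()
        feed_types_by_match[mailet.get("match")].add(feed_type)
    # Keep only the match conditions claimed by both ham and spam feeders.
    return {
        match: sorted(types)
        for match, types in feed_types_by_match.items()
        if len(types) > 1
    }


# Made-up fragment with the same mistake: both feeders share one match.
SAMPLE = """
<processor name="root">
  <mailet match="feeder@example.invalid" class="BayesianAnalysisFeeder">
    <repositoryPath> db://maildb </repositoryPath>
    <feedType>ham</feedType>
  </mailet>
  <mailet match="feeder@example.invalid" class="BayesianAnalysisFeeder">
    <repositoryPath> db://maildb </repositoryPath>
    <feedType>spam</feedType>
  </mailet>
</processor>
"""

if __name__ == "__main__":
    print(find_conflicting_feeders(SAMPLE))
```

An empty result means every feeder has its own match condition; anything else lists the match that both the ham and the spam feeder are competing for.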


For comparison here's the same section in my config.xml: -

        <!-- "not spam" bayesian analysis feeder. -->

<mailet match="[email protected]" class="BayesianAnalysisFeeder">
           <repositoryPath> db://maildb </repositoryPath>
           <feedType>ham</feedType>
           <maxSize>500000</maxSize>
        </mailet>

<mailet match="[email protected]" class="BayesianAnalysisFeeder">
           <repositoryPath> db://maildb </repositoryPath>
           <feedType>spam</feedType>
           <maxSize>500000</maxSize>
        </mailet>

I send all my spam messages to '[email protected]' and ham messages to '[email protected]'. I have not sanitized these addresses.... they are the real actual addresses. I can do this because my mail client (Thunderbird in my case) is set up to use my James SMTP server to send this email and therefore my mail client has to log into the server and thus is SMTP authenticated. The Bayesian feeder extracts this email before it gets too far down the pipeline so James won't attempt to actually send the email to an address in the xxx.yyy domain.
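
If you would rather script this than use a mail client, the same idea can be sketched in Python with the standard smtplib module. Treat it as an illustration under assumptions rather than a tested recipe for James: the host, port, account and 'example.invalid' addresses are placeholders, the two function names are my own, and your server may insist on TLS before it accepts a login.

```python
import smtplib


def open_authenticated_session(host, port, user, password):
    """Connect to the SMTP server and log in, so that mail sent on this
    session counts as coming from an authenticated user."""
    session = smtplib.SMTP(host, port)
    session.ehlo()
    session.login(user, password)
    return session


def feed_message(session, sender, feeder_address, raw_message: bytes):
    """Resend an existing message, unchanged, to a feeder address.

    Returns the dict of refused recipients from sendmail(), which is
    empty when the server accepted the recipient.
    """
    return session.sendmail(sender, [feeder_address], raw_message)


# Example use (placeholders throughout; needs a reachable server):
#
#   session = open_authenticated_session("mail.example.invalid", 25,
#                                         "user", "secret")
#   with open("suspect.eml", "rb") as handle:
#       feed_message(session, "user@example.invalid",
#                    "spam@example.invalid", handle.read())
#   session.quit()
```

Sending the raw bytes of the saved message, rather than composing a new one around it, keeps the original headers and body intact for the feeder to analyse.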


Hope this helps.

Regards,
David Legg

