From Ramchandra's original message it sounds as if the corpus has only
been trained with 'spam' messages. The Bayesian filter *needs* both
spam AND ham (preferably in equal measure) before it can give sensible
results.
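
To illustrate why a spam-only corpus misfires, here is a minimal Python sketch of a Paul Graham style token score. It is not the James implementation: the function names, the 0.4 default for unseen tokens, the doubling of ham counts and the 0.01/0.99 clamps are all assumptions chosen for illustration.

```python
def token_spam_probability(spam_count, ham_count, n_spam_msgs, n_ham_msgs):
    """Per-token spam probability, clamped to [0.01, 0.99] (illustrative only)."""
    spam_freq = min(1.0, spam_count / max(1, n_spam_msgs))
    # Ham counts are doubled to bias against false positives (an assumption).
    ham_freq = min(1.0, 2 * ham_count / max(1, n_ham_msgs))
    if spam_freq + ham_freq == 0:
        return 0.4  # token never seen in either corpus
    p = spam_freq / (spam_freq + ham_freq)
    return max(0.01, min(0.99, p))


def combine(probabilities):
    """Naive Bayes combination of the individual token probabilities."""
    product = 1.0
    inverse_product = 1.0
    for p in probabilities:
        product *= p
        inverse_product *= 1.0 - p
    return product / (product + inverse_product)


# Spam-only corpus: every known token has ham_count == 0, so it scores 0.99.
spam_only = token_spam_probability(3, 0, 10, 0)
# Balanced corpus: the same token also appears in ham, so it scores below 0.5.
balanced = token_spam_probability(3, 3, 10, 10)
```

With no ham in the corpus, any token the filter has seen before is pushed to the upper clamp, so an ordinary message that shares just two such tokens with the spam corpus already combines to a score above 0.99.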
Secondly, by today's standards the Bayesian code is a little naive; in
particular, it makes no attempt to decode base64-encoded content.
This means that if your spam contains a lot of images, it adds a lot of
random-looking strings to the corpus, which makes it more likely that
they will also occur in ham messages.
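
As a rough sketch of what decoding before tokenizing could look like, the Python snippet below uses the standard `email` module to undo the transfer encoding of text parts and to skip non-text parts such as images. It is not how James works today; `tokens_for_training` and its token pattern are hypothetical names and choices made up for this example.

```python
import email
import re


def tokens_for_training(raw_message):
    """Return word tokens from the decoded text parts of a raw message."""
    msg = email.message_from_string(raw_message)
    tokens = []
    for part in msg.walk():
        # Skip multipart containers, images and other non-text parts so
        # their base64 payload never reaches the corpus.
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 or quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        tokens.extend(re.findall(r"[A-Za-z0-9$'-]+", text))
    return tokens


# A two-part message: base64-encoded text plus a (truncated) base64 image.
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="b"\n'
    "\n"
    "--b\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "Y2hlYXAgcGlsbHM=\n"
    "--b\n"
    "Content-Type: image/png\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "iVBORw0KGgoAAAANSUhEUg==\n"
    "--b--\n"
)
tokens = tokens_for_training(raw)
```

Only the words from the decoded text part end up in `tokens`; the image payload contributes nothing.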
Thirdly, it has no support for n-grams, which means it has a very hard
time analyzing emails written in multi-byte UTF-8 scripts such as Chinese.
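
For illustration, a character n-gram tokenizer can be as small as the Python sketch below. This assumes "n-grams" here means overlapping character n-grams; `char_ngrams` is a hypothetical helper, not something James provides.

```python
def char_ngrams(text, n=2):
    """Split text into overlapping character n-grams, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return []
    if len(chars) < n:
        return ["".join(chars)]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


# A four-character Chinese phrase yields three overlapping bigrams.
bigrams = char_ngrams("免费中奖")
```

A tokenizer that splits only on whitespace and punctuation would treat the whole phrase as a single token, which rarely repeats exactly between messages; the overlapping bigrams give the filter smaller units that do recur.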
Regards,
David Legg
On 26/07/13 12:30, Eric Charles wrote:
Maybe your training is too wide.
What if you don't train, or only send a few mails for training? Does
James still mark all mails as spam?
On 23/07/2013 13:27, Ramchandra Naik wrote:
Hi Guys,
We are using the Bayesian analysis feeder for spam feeding/filtering.
After feeding spam into it, the corpus gets reloaded and then the Bayesian
analysis marks all my incoming mails as spam. Can you guys please
look into it and suggest a solution?
I am using James Server 3.0-beta4 with MySQL and following is the
configuration of bayesian analysis:
  <!-- "not spam" bayesian analysis feeder. -->
  <mailet match="[email protected]"
          class="BayesianAnalysisFeeder">
    <repositoryPath>db://maildb</repositoryPath>
    <feedType>ham</feedType>
    <maxSize>200000</maxSize>
  </mailet>
  <!-- "spam" bayesian analysis feeder. -->
  <mailet match="[email protected]"
          class="BayesianAnalysisFeeder">
    <repositoryPath>db://maildb</repositoryPath>
    <feedType>spam</feedType>
    <maxSize>200000</maxSize>
  </mailet>
  <!-- Anti spam bayesian analysis -->
  <mailet match="All" class="BayesianAnalysis"
          onMailetException="ignore">
    <repositoryPath>db://maildb</repositoryPath>
    <maxSize>200000</maxSize>
    <headerName>X-MessageIsSpamProbability</headerName>
    <ignoreLocalSender>true</ignoreLocalSender>
  </mailet>
  <mailet
      match="CompareNumericHeaderValue=X-MessageIsSpamProbability > 0.90"
      class="SetMailAttribute" onMatchException="noMatch">
    <isSpam>true</isSpam>
  </mailet>
  <mailet
      match="CompareNumericHeaderValue=X-MessageIsSpamProbability > 0.90"
      class="SetMimeHeader" onMatchException="noMatch">
    <name>X-MessageIsSpam</name>
    <value>true</value>
  </mailet>
  <mailet
      match="CompareNumericHeaderValue=X-MessageIsSpamProbability > 0.99"
      class="ToProcessor" onMatchException="noMatch">
    <processor>spam</processor>
    <notice>Spam not accepted</notice>
  </mailet>
  <!-- Send remaining mails to the transport processor for
       either local or remote delivery -->
  <mailet match="All" class="ToProcessor">
    <processor>transport</processor>
  </mailet>
</processor>
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]