http://bugzilla.spamassassin.org/show_bug.cgi?id=3023





------- Additional Comments From [EMAIL PROTECTED]  2004-02-09 07:49 -------
Subject: Re:  Detecting random garbage in emails

On [02/09/04 07:38], [EMAIL PROTECTED] wrote:
> I think you may need to not apply this rule if there are fewer
> than, say, 50 unique words in the text, just because there won't be a large
> enough sample.

Totally agree with this.

I think one of the nicer parts of this check is that it is very easy to
implement.  In addition, it is language independent -- whatever the majority
of your tokens are, the check will work.
In its current state, this rule will not take any additional processing time.
However, I think we should explore the possibility of using frequency
information (nspam/nham) from Bayes to gauge how frequent the word is.
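
To make the idea concrete, here is a rough sketch of how the 50-unique-word
cutoff and the Bayes counts might fit together.  This is only an illustration,
not the actual rule: the function name, the rare_below threshold and the plain
dict standing in for the Bayes token database are all assumptions, and none of
it is SpamAssassin's real API.

```python
from typing import Dict, Iterable, Optional, Tuple

# Cutoff suggested in the quoted comment: skip texts with too few unique words.
MIN_UNIQUE_WORDS = 50


def rare_word_ratio(
    words: Iterable[str],
    counts: Dict[str, Tuple[int, int]],
    rare_below: int = 2,
) -> Optional[float]:
    """Return the fraction of unique words that are rarely seen.

    counts maps a word to a hypothetical (nspam, nham) pair, standing in
    for a lookup in the Bayes database.  rare_below is an assumed
    threshold: a word counts as rare when nspam + nham is below it.
    Returns None when the sample is too small to judge.
    """
    unique = {w.lower() for w in words}
    if len(unique) < MIN_UNIQUE_WORDS:
        return None  # not enough unique words, so do not apply the rule

    rare = 0
    for word in unique:
        nspam, nham = counts.get(word, (0, 0))  # unseen words count as 0/0
        if nspam + nham < rare_below:
            rare += 1
    return rare / len(unique)
```

A rule built on this would fire when the returned ratio is above some cutoff,
which would have to be tuned against real mail.  Note that the lookup adds one
database read per unique word, so unlike the current version it would not be
free.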





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
