Patrick T. Tsang wrote:
I don't know which types of email you need to filter.
Disabling the chain feature will let more spam through, even if it is
Chinese spam.
You might now get better accuracy on Chinese mail, but you will get
more false positives on other types of spam.
Have you taken that into account?
The chain feature is the most basic filtering technique we must apply.
I got a few false positives, but overall it is doing
well and seems to be improving. Here are the stats:
TP True Positives: 251
TN True Negatives: 137
FP False Positives: 8
FN False Negatives: 12
SC Spam Corpusfed: 639
NC Nonspam Corpusfed: 349
TL Training Left: 2006
SHR Spam Hit Rate 95.44%
HSR Ham Strike Rate: 5.52%
OCA Overall Accuracy: 95.10%
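
For anyone who wants to check the percentages, they can be recomputed from the four counts above. This is only a minimal sketch in Python. The formulas are the usual confusion-matrix definitions, which I am assuming because they reproduce the reported figures; they are not taken from the filter's source. The function name filter_rates is made up for illustration.

```python
def filter_rates(tp, tn, fp, fn):
    """Return (spam hit rate, ham strike rate, overall accuracy) in percent.

    Assumed definitions (not taken from the filter's own code):
      spam hit rate    = TP / (TP + FN)   share of spam that was caught
      ham strike rate  = FP / (FP + TN)   share of good mail flagged as spam
      overall accuracy = (TP + TN) / all classified messages
    """
    shr = 100.0 * tp / (tp + fn)
    hsr = 100.0 * fp / (fp + tn)
    oca = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return shr, hsr, oca


# Counts copied from the stats listed above.
shr, hsr, oca = filter_rates(tp=251, tn=137, fp=8, fn=12)
print(f"SHR {shr:.2f}%  HSR {hsr:.2f}%  OCA {oca:.2f}%")
# prints: SHR 95.44%  HSR 5.52%  OCA 95.10%
```

So the 8 false positives are measured against the 145 legitimate messages classified so far, which is why the ham strike rate looks high next to the overall accuracy.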
--
Kent Tong
Useful news for software developers at
http://www2.cpttm.org.mo/cyberlab/softdev/newsletter