* On 26/01/07 19:38 +0800, Kent Tong wrote:
| Odhiambo Washington wrote:
| >* On 26/01/07 17:41 +0800, Kent Tong wrote:
| >| Hi,
| >|
| >| I'm pilot testing dspam and am training it. It detects spam in English
| >| or Russian quite well, but it almost always fails to detect spam in
| >| Chinese. I've fed it about 2,000 ham messages from my mailbox and
| >| corrected about 200 missed spams (false negatives), but it doesn't
| >| seem to be improving.
| >|
| >| Does anyone have good experience with Chinese spam?
| >
| >How much spam (esp. Chinese) have you trained it with?
|
| Below are the stats:
|
| dspam_stats -H [EMAIL PROTECTED]
| [EMAIL PROTECTED]:
|         TP True Positives:        648
|         TN True Negatives:        290
|         FP False Positives:         0
|         FN False Negatives:       202
|         SC Spam Corpusfed:         89
|         NC Nonspam Corpusfed:    1359
|         TL Training Left:         851
|         SHR Spam Hit Rate       76.24%
|         HSR Ham Strike Rate:     0.00%
|         OCA Overall Accuracy:   82.28%
|
| Among those false negatives, at least 50% are Chinese. So at least 100
| Chinese spam messages have been fed to dspam as errors.

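For anyone reading the stats above, here is a rough sketch of how the three percentages appear to follow from the four counters. The formulas are an assumption on my part, not taken from the dspam source, and the function name dspam_rates is made up for illustration; they do reproduce the figures quoted above.

```python
def dspam_rates(tp, tn, fp, fn):
    """Return (SHR, HSR, OCA) as percentages.

    Assumed formulas (not verified against the dspam source):
      SHR = TP / (TP + FN)       share of spam that was caught
      HSR = FP / (FP + TN)       share of ham wrongly flagged as spam
      OCA = (TP + TN) / total    share of all messages classified correctly
    """
    spam_total = tp + fn
    ham_total = tn + fp
    total = spam_total + ham_total
    # Guard against division by zero on a freshly created user.
    shr = 100.0 * tp / spam_total if spam_total else 0.0
    hsr = 100.0 * fp / ham_total if ham_total else 0.0
    oca = 100.0 * (tp + tn) / total if total else 0.0
    return shr, hsr, oca


# Counters copied from the dspam_stats output quoted above.
shr, hsr, oca = dspam_rates(tp=648, tn=290, fp=0, fn=202)
print(f"SHR {shr:.2f}%  HSR {hsr:.2f}%  OCA {oca:.2f}%")
```

With these counters, every one of the 202 errors is a missed spam, so the hit rate is the only figure that more spam training can move.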
Train DSPAM! Train!!! It's not a human being. That is why training is
required.

-Wash

http://www.netmeister.org/news/learn2quote.html

DISCLAIMER: See http://www.wananchi.com/bms/terms.php

--
Odhiambo Washington <[EMAIL PROTECTED]>
Wananchi Online Ltd.  www.wananchi.com
Tel: +254 20 313985-9  +254 20 313922
GSM: +254 722 743223   +254 733 744121

If all the world's economists were laid end to end, we wouldn't reach a
conclusion.
                -- William Baumol
