Hi,
Here are my stats after retraining hundreds of messages, both spam and ham:
{227} dspam_stats -H
antispam:
TP True Positives: 4818
TN True Negatives: 22115
FP False Positives: 4
FN False Negatives: 5
SC Spam Corpusfed: 0
NC Nonspam Corpusfed: 0
TL Training Left: 0
SHR Spam Hit Rate 99.90%
HSR Ham Strike Rate: 0.02%
PPV Positive predictive value: 99.92%
OCA Overall Accuracy: 99.97%
Last night it caught maybe 100 emails, but I had many more than that
in my inbox.
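
For anyone comparing numbers, the four percentage lines appear to follow
from the raw counts alone. Below is a minimal Python sketch, assuming the
usual definitions (SHR = TP/(TP+FN), HSR = FP/(TN+FP), PPV = TP/(TP+FP),
OCA = (TP+TN)/total). The function name dspam_rates is made up for
illustration and is not part of dspam, and returning 0 for an empty
denominator is my assumption, not necessarily what dspam_stats does.

```python
def dspam_rates(tp, tn, fp, fn):
    """Return (SHR, HSR, PPV, OCA) as percentages from the raw counts.

    Hypothetical helper, not part of dspam; formulas are the assumed
    standard definitions of these four rates.
    """
    def pct(num, den):
        # Assumption: report 0.0 when there is no data for a rate.
        return 100.0 * num / den if den else 0.0

    shr = pct(tp, tp + fn)                 # share of spam that was caught
    hsr = pct(fp, tn + fp)                 # share of ham wrongly flagged
    ppv = pct(tp, tp + fp)                 # share of flagged mail that was spam
    oca = pct(tp + tn, tp + tn + fp + fn)  # share of all mail classified correctly
    return shr, hsr, ppv, oca


if __name__ == "__main__":
    # Counts taken from the dspam_stats output above.
    shr, hsr, ppv, oca = dspam_rates(tp=4818, tn=22115, fp=4, fn=5)
    print(f"SHR {shr:.2f}%  HSR {hsr:.2f}%  PPV {ppv:.2f}%  OCA {oca:.2f}%")
```

With my counts this gives the same four percentages that dspam_stats
printed, so the definitions above at least match this output.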
Kind Regards,
Al
On Jul 23, 2015, at 12:13 PM, Nathanael D. Noblet wrote:
> On Wed, 2015-07-22 at 17:48 -0700, waterdog wrote:
>
>> The dspam_stats for this user don't look too good even after multiple
>> training attempts:
>>
>> TP True Positives: 0
>> TN True Negatives: 4
>> FP False Positives: 2353
>> FN False Negatives: 1947
>> SC Spam Corpusfed: 0
>> NC Nonspam Corpusfed: 0
>> TL Training Left: 143
>
> You can see from this line that it needs to receive another 143
> messages before it is out of training. It requires about 2500 messages
> before it flips a switch. I can't remember what switch but it flips
> one.
>
> When I set myself up years ago, I found a corpus of spam and fed it
> my entire mailbox plus the spam. Now you can see my stats years later:
>
> TP True Positives: 3354
> TN True Negatives: 239849
> FP False Positives: 1448
> FN False Negatives: 981
> SC Spam Corpusfed: 0
> NC Nonspam Corpusfed: 0
> TL Training Left: 0
> SHR Spam Hit Rate 77.37%
> HSR Ham Strike Rate: 0.60%
> PPV Positive predictive value: 69.85%
> OCA Overall Accuracy: 99.01%
>
> You don't have enough data for dspam to reliably do anything.
> Retraining one message as spam will *not* automatically get it to be
> classified as spam on the *next* classification.
>
> Watch the numbers in your stats, which show whether training is
> occurring. If you have a false negative (spam delivered as ham), train
> it and you should see the FN increment. If it does, dspam is working as
> expected.
>
> The other implied part of your question is 'Why isn't dspam effective
> yet?'. That is partly due to the amount of mail you've received so
> far, the type of spam, and the dspam settings. I used to set people up
> with TEFT, as that was the recommendation and, I think, the default.
> Over the years I've seen it mentioned on this list multiple times that
> you should use TOE by default.
>
> I also use
>
> Algorithm graham burton
> Tokenizer osb
>
> because users of this list explained back in the day that they were
> better defaults.
>
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Dspam-user mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dspam-user