On 11.12.2015 at 18:42, Martin Gregorie wrote:
For instance, I have two portmanteau rules, SALE (contains sales
phrases like "huge discount") and PRODUCT (contains phrases like "fur
coat") that are ANDed by a meta called SALESPAM. The nice thing about
this approach is that, once the SALE and PRODUCT lists have grown to a
decent size, the SALESPAM meta starts to fire on previously unseen
combinations without generating FPs. The only downside is that, unlike
Bayes, you have to build the lists manually, but that's probably no worse
to do than building a hand-crafted Bayes DB like Reyndl does.
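
As a rough illustration of the scheme described above, a minimal sketch in SpamAssassin rule syntax might look like the following. The sub-rule names __SALE and __PRODUCT, every phrase except "huge discount" and "fur coat", the description text and the score are assumptions made up for this example, not the actual rules under discussion.

```
# Hypothetical phrase lists; real ones would grow much longer over time.
# The double-underscore prefix marks sub-rules that carry no score of their own.
body      __SALE      /\b(?:huge discount|clearance sale|lowest prices)\b/i
body      __PRODUCT   /\b(?:fur coat|leather jacket|designer handbag)\b/i

# The meta fires only when both lists match, so any SALE phrase combined
# with any PRODUCT phrase is caught, including pairings never seen before.
meta      SALESPAM    (__SALE && __PRODUCT)
describe  SALESPAM    Sales phrase combined with a product phrase
score     SALESPAM    3.0
```

Adding one phrase to either list extends coverage to every phrase in the other list, which is why the meta starts to hit unseen combinations once both lists are large.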

hand crafted bayes?
worse?

what nonsense

what's handcrafted there?
that i don't trust autolearn and don't like autoexpire?

well, how many of you had to train christmas spam this year while my bayes already knew it from last year?

how many of you are training the same spam types again and again because spammers are aware of autoexpire and just need to stop using a campaign for some weeks until 99% of default setups have forgotten about it?

what i do is just KEEP all training messages so that i can rebuild my bayes at any point in time without starting to learn from scratch

since "bayes_token_sources all" came with the last release, and "normalize_charset 1", which i enabled later, changed its behavior with the latest release, i know why - well, i knew it from the first moment: "keep the corpus in case something in the tokenizer changes later"
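
for illustration only, a setup along these lines could be sketched as below. the four option names are real SpamAssassin settings; the values for autolearn and autoexpire are inferred from what is said above, and the corpus paths in the comments are hypothetical.

```
# no automatic learning and no automatic expiry of tokens
bayes_auto_learn      0
bayes_auto_expire     0

# tokenizer-related settings mentioned above
bayes_token_sources   all
normalize_charset     1

# after a tokenizer change, the database can be rebuilt from the kept corpus,
# e.g. (hypothetical paths):
#   sa-learn --clear
#   sa-learn --spam /path/to/corpus/spam
#   sa-learn --ham  /path/to/corpus/ham
```

because the tokens stored in the database depend on how the tokenizer splits and normalizes a message, a rebuild from the original messages is the only way to get tokens that match the new behavior.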
