Keith C. Ivey <[EMAIL PROTECTED]> wrote:
> Bob George <[EMAIL PROTECTED]> wrote:
>
>> I've noticed that the add-on rules help recognize new
>> patterns, which is very useful for training bayes. But once
>> bayes has the patterns, it alone is more than adequate.
>
> I'm not sure what you mean by "patterns", but it should be
> clarified that Bayes doesn't deal with patterns like the ones
> recognized by most rules. It deals only with the presence of
> tokens, and individual tokens at that, not even combinations.
> Rules can recognize much more general and complex patterns in
> messages than anything Bayes can (at least as Bayes is
> implemented in SA).

Ah, I hope I'm not spreading bad information. I'm hardly an SA expert, just a very happy end-user. It seems that using the add-on rules in conjunction with bayes has resulted in NONE of the "clever" spams getting through. I have spent some time thinking through training bayes (including NOT feeding it this list as ham!) and it seems to have paid off. Perhaps I'm simply benefiting from better recognition in the basic SA rules.

Just to verify, most spam I receive, regardless of technique used, seems to be tagged with BAYES lately (90+ mostly). So I thought the weird "patterns" (more correctly, broken-word tokens) were also going into bayes, with the result that since those odd spellings of v-drug, backhair, spammer domains and such ONLY show up in spam, bayes learns them as statistical indicators of spam. Have I misunderstood?

So if the word "quatrain" only appears in random-word spam (here at least), or more importantly, never shows in non-spam, it won't help (nor necessarily hinder) detecting spam. And "eeVagra" and such will ONLY be in spam. If spammers are using common word lists, I'd think there would be some repetition, so it *might* help. Am I off base?

- Bob
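
To make the "individual tokens" point concrete, here is a minimal sketch of per-token spam probabilities. It is an illustration only and is not SpamAssassin's actual Bayes code, which handles tokenization, rare tokens, and score combining differently. The function names (tokenize, train, token_spam_prob) and the sample messages are made up for this example.

```python
from collections import Counter


def tokenize(text):
    # Assumption: whitespace-separated words are the tokens, and each
    # token counts at most once per message (presence, not frequency).
    return set(text.split())


def train(messages):
    # Count, for each token, how many messages it appears in.
    counts = Counter()
    for msg in messages:
        counts.update(tokenize(msg))
    return counts


def token_spam_prob(token, spam_counts, ham_counts, n_spam, n_ham):
    # Fraction of spam and of ham messages containing the token.
    # Assumes n_spam and n_ham are both greater than zero.
    s = spam_counts[token] / n_spam
    h = ham_counts[token] / n_ham
    if s + h == 0:
        return 0.5  # token never seen: no evidence either way
    return s / (s + h)


if __name__ == "__main__":
    # Hypothetical training data.
    spam = ["buy eeVagra now", "eeVagra quatrain cheap"]
    ham = ["meeting notes for the list", "buy lunch now"]
    spam_counts, ham_counts = train(spam), train(ham)
    for tok in ("eeVagra", "quatrain", "buy", "lunch"):
        p = token_spam_prob(tok, spam_counts, ham_counts, len(spam), len(ham))
        print(tok, p)
```

In this toy model a token seen only in spam scores as fully spammy no matter how odd its spelling is, and a token seen equally in both sets is neutral. Nothing here looks at combinations of tokens or at message structure, which is the kind of thing rules can do.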
