On 08/22/16 09:06, Dianne Skoll wrote:
> On Mon, 22 Aug 2016 09:03:38 -0700
> Marc Perkel <supp...@junkemailfilter.com> wrote:
>
>> The ones that are the same are of no interest. Only where it matches
>> one side and not the other.
>
> But... but... that's exactly like Bayes if you throw out tokens whose
> observed probability is not 0 or 1.
>
> Also, in your list of tokens, they are all phrases ranging from 1 to 4 words,
> and that's why you get good results.  Multiword Bayes is just as good,
> and I know that from experience.
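
To make the comparison above concrete, here is a minimal sketch of a multiword token table in which every token whose observed spam probability is not exactly 0 or 1 is thrown out. The function names (`phrases`, `extreme_tokens`), the whitespace tokenizer, and the per-message counting are illustrative assumptions, not code from either poster's filter.

```python
from collections import Counter

def phrases(text, max_len=4):
    """Return the set of sequential 1..max_len word phrases in text."""
    words = text.lower().split()
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    }

def extreme_tokens(ham_msgs, spam_msgs):
    """Map each token to its observed spam probability, keeping only 0.0 or 1.0."""
    # Count how many messages on each side contain each phrase.
    ham = Counter(t for m in ham_msgs for t in phrases(m))
    spam = Counter(t for m in spam_msgs for t in phrases(m))
    probs = {
        t: spam[t] / (spam[t] + ham[t])
        for t in ham.keys() | spam.keys()
    }
    # Tokens seen on both sides have 0 < p < 1 and are discarded.
    return {t: p for t, p in probs.items() if p in (0.0, 1.0)}
```

Under these assumptions, the surviving tokens are exactly the phrases seen on one side and never on the other.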



This is nothing like Bayes. Bayes is creating a mental block. When I describe it to people who don't know Bayes, they immediately get it. If I describe it to people who know Bayes, they confuse the two. Bayes is a probability spectrum based on a frequency match on both sets. That's not even close to what I'm doing.

Also, some of what I'm doing uses all combinations of words, not just sequential ones. So it's like a system that writes and scores its own rules. I just throw data at it and it classifies it.

The real magic is the feedback learning. So as it identifies ham it learns new words and phrases that then match email from other people. So it learns how normal people speak, it learns how spammers speak, and it identifies the DIFFERENCES between the two. And it's completely automated.
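
A rough sketch of the idea as described above, not the actual junkemailfilter.com implementation: features are unordered word combinations, only features seen on exactly one side count toward the score, and confidently classified mail is fed back into the matching set. The class name `DifferenceFilter`, the pair-sized combination limit, and the simple count-based score are all assumptions made for illustration.

```python
from itertools import combinations

class DifferenceFilter:
    """Toy classifier that scores only features exclusive to ham or to spam."""

    def __init__(self):
        self.ham = set()
        self.spam = set()

    @staticmethod
    def features(text, max_len=2):
        # All word combinations up to max_len, regardless of position.
        words = sorted(set(text.lower().split()))
        return {
            combo
            for n in range(1, max_len + 1)
            for combo in combinations(words, n)
        }

    def learn(self, text, is_spam):
        (self.spam if is_spam else self.ham).update(self.features(text))

    def score(self, text):
        feats = self.features(text)
        # Features present on both sides are of no interest and are skipped.
        spam_only = sum(1 for f in feats if f in self.spam and f not in self.ham)
        ham_only = sum(1 for f in feats if f in self.ham and f not in self.spam)
        return spam_only - ham_only

    def classify_and_learn(self, text):
        # Feedback step: a confident verdict teaches the filter new features.
        s = self.score(text)
        if s > 0:
            self.learn(text, is_spam=True)
            return "spam"
        if s < 0:
            self.learn(text, is_spam=False)
            return "ham"
        return "unsure"
```

In this sketch, a message classified as ham contributes its previously unseen words to the ham set, so a later message containing only those new words can still be matched.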


--
Marc Perkel - Sales/Support
supp...@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400
