Re: Trying to understand how bayes works.

2015-12-12 Thread Reindl Harald
On 12.12.2015 at 20:12, Axb wrote: On 12/12/2015 05:13 PM, RW wrote: The number of tokens depends on how many you train, not on how many you scan. Obvious... via autolearn my Bayes gets a constant feed of +500k "forced learn" spams/day. Works for me to expire those after 3 or 7 days, depen

Re: Trying to understand how bayes works.

2015-12-12 Thread Axb
On 12/12/2015 05:13 PM, RW wrote: The number of tokens depends on how many you train, not on how many you scan. Obvious... via autolearn my Bayes gets a constant feed of +500k "forced learn" spams/day. Works for me to expire those after 3 or 7 days, depending on the trap feed. Production tra
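The idea of expiring learned tokens after a fixed number of days can be sketched as follows. This is only an illustration of age-based expiry, not SpamAssassin's actual expiry logic; the `expire_tokens` helper, the `last_seen` mapping, and the timestamps are all hypothetical.

```python
SECONDS_PER_DAY = 86400

def expire_tokens(last_seen, now, max_age_days):
    """Keep only tokens whose last-seen time is within max_age_days of now.

    last_seen maps token -> Unix timestamp of the last time it was learned.
    This is a hypothetical stand-in for a real Bayes token store.
    """
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return {tok: ts for tok, ts in last_seen.items() if ts >= cutoff}

# Hypothetical store: one token seen 1 day ago, one seen 5 days ago.
now = 10 * SECONDS_PER_DAY
store = {"fresh-token": 9 * SECONDS_PER_DAY, "older-token": 5 * SECONDS_PER_DAY}

kept_3_days = expire_tokens(store, now, 3)  # drops "older-token"
kept_7_days = expire_tokens(store, now, 7)  # keeps both tokens
```

A shorter window keeps the database small under a heavy trap feed, at the cost of forgetting tokens sooner.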

Re: Trying to understand how bayes works.

2015-12-12 Thread Reindl Harald
On 12.12.2015 at 17:13, RW wrote: On Sat, 12 Dec 2015 13:29:40 +0100 Axb wrote: On 12/12/2015 01:08 PM, Reindl Harald wrote: I hate stale data... that's all But you do keep stale data in the retained tokens, what you are getting rid of is the contribution from old mails that's least lik

Re: Trying to understand how bayes works.

2015-12-12 Thread RW
On Sat, 12 Dec 2015 13:29:40 +0100 Axb wrote:
> On 12/12/2015 01:08 PM, Reindl Harald wrote:
>
>> I hate stale data... that's all

But you do keep stale data in the retained tokens, what you are getting rid of is the contribution from old mails that's least likely to make a difference to any

Re: Trying to understand how bayes works.

2015-12-12 Thread Axb
On 12/12/2015 02:07 AM, Reindl Harald wrote: On 11.12.2015 at 20:58, Axb wrote: I hate stale data... that's all how can bayes data be stale? a spam message is a spam message now, tomorrow and next year the same especially for ham over time... header patterns change url patterns change ht

Re: Trying to understand how bayes works.

2015-12-11 Thread Reindl Harald
On 11.12.2015 at 20:58, Axb wrote: I hate stale data... that's all how can bayes data be stale? a spam message is a spam message now, tomorrow and next year the same especially for ham

Re: Trying to understand how bayes works.

2015-12-11 Thread Axb
On 12/11/2015 07:24 PM, Reindl Harald wrote: On 11.12.2015 at 19:12, Axb wrote: On 12/11/2015 06:51 PM, Reindl Harald wrote: well, how many of you trained Christmas spam this year while my bayes did know it from last year? I like my Bayes fresh like bread out of the oven, new guitar strings

Re: Trying to understand how bayes works.

2015-12-11 Thread Axb
On 12/11/2015 07:29 PM, Joe Quinn wrote: On 12/11/2015 1:24 PM, Reindl Harald wrote: On 11.12.2015 at 19:12, Axb wrote: On 12/11/2015 06:51 PM, Reindl Harald wrote: well, how many of you trained Christmas spam this year while my bayes did know it from last year? I like my Bayes fresh like

Re: Trying to understand how bayes works.

2015-12-11 Thread Joe Quinn
On 12/11/2015 1:24 PM, Reindl Harald wrote: On 11.12.2015 at 19:12, Axb wrote: On 12/11/2015 06:51 PM, Reindl Harald wrote: well, how many of you trained Christmas spam this year while my bayes did know it from last year? I like my Bayes fresh like bread out of the oven, new guitar strings

Re: Trying to understand how bayes works.

2015-12-11 Thread Reindl Harald
On 11.12.2015 at 19:12, Axb wrote: On 12/11/2015 06:51 PM, Reindl Harald wrote: well, how many of you trained Christmas spam this year while my bayes did know it from last year? I like my Bayes fresh like bread out of the oven, new guitar strings and clean sheets. well, i like my bayes cat

Re: Trying to understand how bayes works.

2015-12-11 Thread Axb
On 12/11/2015 06:51 PM, Reindl Harald wrote: well, how many of you trained Christmas spam this year while my bayes did know it from last year? I like my Bayes fresh like bread out of the oven, new guitar strings and clean sheets. Last year's turkey doesn't appeal to me. SCR

Re: Trying to understand how bayes works.

2015-12-11 Thread Reindl Harald
On 11.12.2015 at 18:42, Martin Gregorie wrote: For instance, I have two portmanteau rules, SALE (contains sales phrases like "huge discount") and PRODUCT (contains phrases like "fur coat") that are ANDed by a meta called SALESPAM. The nice thing about this approach is that, once the SALE and P
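The "two portmanteau rules ANDed by a meta" idea can be sketched outside SpamAssassin as two phrase matchers combined with a logical AND. This is a rough Python illustration, not SpamAssassin rule syntax; only "huge discount" and "fur coat" come from the message above, and the other phrases and the `salespam` function name are hypothetical.

```python
import re

# Each "portmanteau" rule is one alternation of phrases on a single theme.
# "clearance sale" and "designer watch" are made-up extra phrases.
SALE = re.compile(r"huge discount|clearance sale", re.IGNORECASE)
PRODUCT = re.compile(r"fur coat|designer watch", re.IGNORECASE)

def salespam(text):
    """Meta rule: fires only when both SALE and PRODUCT match."""
    return bool(SALE.search(text)) and bool(PRODUCT.search(text))

hit = salespam("Huge discount on every fur coat in stock!")
miss = salespam("Huge discount on flights this weekend")
```

Extending coverage then only means adding a phrase to one of the two alternations; the combining rule itself stays unchanged.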

Re: Trying to understand how bayes works.

2015-12-11 Thread Martin Gregorie
On Fri, 2015-12-11 at 09:05 -0800, Marc Perkel wrote:
> For example. I create rules that look for many phrases about a subject
> and the subject becomes a token. For example:
>
> JESUS
> ROYALTY
> MONEY
>
> By themselves they are not an indicator of spam. But if you have all 3 then it's
> definite

Re: Trying to understand how bayes works.

2015-12-11 Thread Dianne Skoll
On Fri, 11 Dec 2015 09:05:10 -0800 Marc Perkel wrote:
> What I was thinking about doing was creating a string of tokens that
> represented key features of the message. Then run that through a
> program that created new tokens out of every possible combination of
> 2 tokens and adding that to the
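The proposal quoted above, deriving new tokens from every combination of two feature tokens, might look roughly like this. It is a sketch of the idea only, not anything SpamAssassin does; the `pair_tokens` helper and the `+` separator are assumptions made for illustration.

```python
from itertools import combinations

def pair_tokens(tokens):
    """Return the unique tokens followed by one combined token per unordered pair."""
    unique = sorted(set(tokens))  # dedupe and sort so "A+B" and "B+A" collapse
    pairs = [f"{a}+{b}" for a, b in combinations(unique, 2)]
    return unique + pairs

# Feature tokens taken from the example elsewhere in this thread.
features = pair_tokens(["MONEY", "JESUS", "ROYALTY"])
```

Note that n feature tokens produce n(n-1)/2 pair tokens, so the token database grows quadratically with the number of features per message.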

Re: Trying to understand how bayes works.

2015-12-11 Thread Reindl Harald
Summary of what you said below: that's what bayes already does; just train it properly instead of re-inventing the wheel. On 11.12.2015 at 18:05, Marc Perkel wrote: What I was thinking about doing was creating a string of tokens that represented key features of the message. Then run that through a prog

Re: Trying to understand how bayes works.

2015-12-11 Thread Marc Perkel
On 12/11/15 06:58, RW wrote: On Thu, 10 Dec 2015 13:54:05 -0800 Marc Perkel wrote: Bayes breaks the message down into some sort of tokens and then does statistics on those tokens as to tokens found in spam vs. tokens found in ham. But what about combinations of tokens? I'm thinking that I'd l

Re: Trying to understand how bayes works.

2015-12-11 Thread Reindl Harald
On 11.12.2015 at 17:40, Kris Deugau wrote: Marc Perkel wrote: I've had bayes disabled in SA because it seems to not be able to stay working in a high volume situation. The MySQL server can't seem to keep up with it even on very fast computers. I'm curious where you started seeing performanc

Re: Trying to understand how bayes works.

2015-12-11 Thread Kris Deugau
Marc Perkel wrote:
> I've had bayes disabled in SA because it seems to not be able to stay
> working in a high volume situation. The MySQL server can't seem to keep
> up with it even on very fast computers.

I'm curious where you started seeing performance issues (number of messages, users; hardwa

Re: Trying to understand how bayes works.

2015-12-11 Thread Benny Pedersen
On December 11, 2015 9:38:49 AM Axb wrote: Again.. SA's Redis backend speed and ease of use can't be beat... mariadb engine=memory hack mariadb start stop to change engine before stop and after start for persistence db There's some help in https://svn.apache.org/repos/asf/spamassassin/tr

Re: Trying to understand how bayes works.

2015-12-11 Thread RW
On Thu, 10 Dec 2015 13:54:05 -0800 Marc Perkel wrote:
> Bayes breaks the message down into some sort of tokens and then does
> statistics on those tokens as to tokens found in spam vs. tokens
> found in ham.
>
> But what about combinations of tokens? I'm thinking that I'd like to
> have somethi
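The "statistics on tokens found in spam vs. tokens found in ham" described in the quoted text can be illustrated with a deliberately simplified per-token estimate. This is not SpamAssassin's actual tokenizer or probability formula; the whitespace tokenizer, the function names, and the training messages are all assumptions.

```python
from collections import Counter

def train(counts, messages):
    """Count, per token, how many messages it appears in (naive whitespace tokens)."""
    for msg in messages:
        counts.update(set(msg.lower().split()))

def token_spam_prob(token, spam_counts, ham_counts, nspam, nham):
    """Share of the token's relative frequency that comes from spam (0.5 if unseen)."""
    s = spam_counts[token] / nspam
    h = ham_counts[token] / nham
    if s + h == 0:
        return 0.5  # never seen in training: no evidence either way
    return s / (s + h)

# Made-up training corpus for illustration.
spam_msgs = ["cheap pills now", "cheap watches"]
ham_msgs = ["meeting notes now", "lunch plans"]

spam_counts, ham_counts = Counter(), Counter()
train(spam_counts, spam_msgs)
train(ham_counts, ham_msgs)

def prob(token):
    return token_spam_prob(token, spam_counts, ham_counts, len(spam_msgs), len(ham_msgs))
```

Here `prob("cheap")` leans fully toward spam and `prob("lunch")` fully toward ham, while a token seen equally often in both, such as "now", stays neutral. Each token is scored on its own, which is exactly why the question about combinations arises.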

Re: Trying to understand how bayes works.

2015-12-11 Thread Axb
On 12/11/2015 06:28 AM, Marc Perkel wrote: On 12/10/15 18:31, Benny Pedersen wrote: Marc Perkel wrote on 2015-12-10 22:54: I've had bayes disabled in SA because it seems to not be able to stay working in a high volume situation. The MySQL server can't seem to keep up with it even on very fast

Re: Trying to understand how bayes works.

2015-12-10 Thread Marc Perkel
On 12/10/15 18:31, Benny Pedersen wrote: Marc Perkel wrote on 2015-12-10 22:54: I've had bayes disabled in SA because it seems to not be able to stay working in a high volume situation. The MySQL server can't seem to keep up with it even on very fast computers. i got a palm Zire that can do

Re: Trying to understand how bayes works.

2015-12-10 Thread Dianne Skoll
On Fri, 11 Dec 2015 03:31:56 +0100 Benny Pedersen wrote:
> if z is scored as spam, and x and y is ham, then it's ham; basically
> that's how bayes works, but a single mail might be lots of digest to
> compare for this to say spam or not

The thing is, the probability of token Y is not independent of th

Re: Trying to understand how bayes works.

2015-12-10 Thread Benny Pedersen
Marc Perkel wrote on 2015-12-10 22:54: I've had bayes disabled in SA because it seems to not be able to stay working in a high volume situation. The MySQL server can't seem to keep up with it even on very fast computers. i got a palm Zire that can do ocr on handwritten text :=) pretty good for

Re: Trying to understand how bayes works.

2015-12-10 Thread Dianne Skoll
On Thu, 10 Dec 2015 13:54:05 -0800 Marc Perkel wrote:
> But what about combinations of tokens? I'm thinking that I'd like to
> have something that says when it sees tokens X and Y and Z then
> that's spam even though X,Y,Z might be in ham when not combined.

The SpamAssassin Bayes implementation

Re: Trying to understand how bayes works.

2015-12-10 Thread Axb
On 12/10/2015 10:54 PM, Marc Perkel wrote: I've had bayes disabled in SA because it seems to not be able to stay working in a high volume situation. The MySQL server can't seem to keep up with it even on very fast computers. Redis is your friend. Redis over the wire is faster than any local SDB

Trying to understand how bayes works.

2015-12-10 Thread Marc Perkel
I've had bayes disabled in SA because it seems to not be able to stay working in a high volume situation. The MySQL server can't seem to keep up with it even on very fast computers. But - thinking about trying something interesting - doing my own bayes in a different way. Here's my question.