Re: local score ignored
>>> What output does the command "sa-learn --dump magic" produce?

> On Fri, 19 Apr 2013, Joe Acquisto-j4 wrote:
>> 0.000          0       1872          0  non-token data: nspam
>> 0.000          0       9184          0  non-token data: nham

On 19.04.13 07:41, John Hardin wrote:
> Generally you want the ratio of trained messages to reflect the ratio of
> mail you're seeing. Most people get a lot more spam than ham, so it looks
> like you need a lot more spam trained in.
>
> I try to maintain at least a 2:1 spam:ham ratio.

Well, my ratio is skewed toward ham in much the same way:

0.000          0       1736          0  non-token data: nspam
0.000          0      54729          0  non-token data: nham

and it still catches most spam...

Note that false negatives (spam that is not caught) are still better than
false positives (ham that is caught as spam).

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
I feel like I'm diagonally parked in a parallel universe.
Re: local score ignored
>>> On 4/19/2013 at 10:41 AM, John Hardin wrote:
> On Fri, 19 Apr 2013, Joe Acquisto-j4 wrote:
>
>>> What output does the command "sa-learn --dump magic" produce?
>>
>> 0.000          0       1872          0  non-token data: nspam
>> 0.000          0       9184          0  non-token data: nham
>
> Generally you want the ratio of trained messages to reflect the ratio of
> mail you're seeing. Most people get a lot more spam than ham, so it looks
> like you need a lot more spam trained in.
>
> I try to maintain at least a 2:1 spam:ham ratio.

Interesting. I had not paid attention. From my personal experience, the
totals seem reversed.

I will have to check how others are feeding. I suspect a certain other
party may have their signals crossed on what to send where. In which case,
I may have to clear bayes and re-feed.

joe a
Re: local score ignored
Joe Acquisto-j4 wrote on 2013-04-19 13:10:

> 0.000          0       1872          0  non-token data: nspam
> 0.000          0       9184          0  non-token data: nham

Are you using whitelist_from anywhere?

score whitelist_from 0.001

Why? whitelist_from can be forged, and it will poison bayes if you are not
careful with scores.

The default score is -100 :(

--
senders that put my email into body content will deliver it to my own
trashcan, so if you like to get reply, dont do it
Re: local score ignored
On 4/19/2013 7:10 AM, Joe Acquisto-j4 wrote:
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0       1872          0  non-token data: nspam
> 0.000          0       9184          0  non-token data: nham
> 0.000          0     140303          0  non-token data: ntokens
> 0.000          0 1364766063          0  non-token data: oldest atime
> 0.000          0 1366368683          0  non-token data: newest atime
> 0.000          0 1366367890          0  non-token data: last journal sync atime
> 0.000          0 1366146116          0  non-token data: last expiry atime
> 0.000          0    1382400          0  non-token data: last expire atime delta
> 0.000          0      26360          0  non-token data: last expire reduction count

You are learning to the same DB that's being used by SA, right?

--
Bowie
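The nspam and nham counts discussed in this thread can be pulled out of a saved copy of the dump and compared. The following is only a minimal sketch: it assumes the column layout shown in the quoted output (value in the third column, label after "non-token data:"), and the function names are hypothetical, not part of SpamAssassin.

```python
import re

# Matches lines such as:
#   0.000          0       1872          0  non-token data: nspam
# The third whitespace-separated column is taken as the value (an assumption
# based on the output quoted in this thread).
MAGIC_LINE = re.compile(
    r"^\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data:\s+(.+?)\s*$"
)


def parse_dump_magic(text):
    """Return {label: value} for every 'non-token data' line in the dump."""
    counts = {}
    for line in text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def spam_ham_ratio(counts):
    """Return trained spam per trained ham, or None if no ham was trained."""
    if counts.get("nham", 0) == 0:
        return None
    return counts["nspam"] / counts["nham"]


SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0       1872          0  non-token data: nspam
0.000          0       9184          0  non-token data: nham
"""

counts = parse_dump_magic(SAMPLE)
ratio = spam_ham_ratio(counts)
print(f"nspam={counts['nspam']} nham={counts['nham']} spam:ham={ratio:.2f}")
```

With the figures posted above, the database holds roughly one trained spam for every five trained ham, which is the imbalance the replies point out.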
Re: local score ignored
On Fri, 19 Apr 2013, Joe Acquisto-j4 wrote:

>> What output does the command "sa-learn --dump magic" produce?
>
> 0.000          0       1872          0  non-token data: nspam
> 0.000          0       9184          0  non-token data: nham

Generally you want the ratio of trained messages to reflect the ratio of
mail you're seeing. Most people get a lot more spam than ham, so it looks
like you need a lot more spam trained in.

I try to maintain at least a 2:1 spam:ham ratio.

--
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
Ten-millimeter explosive-tip caseless, standard light armor
piercing rounds. Why?
---
Today: the 238th anniversary of The Shot Heard 'Round The World
Re: local score ignored
On Fri, 19 Apr 2013, Joe Acquisto-j4 wrote:

>> On 18.04.13 21:45, Joe Acquisto-j4 wrote:
>>> All I can do is feed it.
>>
>> that is what you should do. You need to train on both spam and ham, since
>> the BAYES filter must know how they differ...
>
> That has always given me pause, as I get very little ham. Got one this AM.
> which I will feed but that's the first in at least a month.
>
> I gather that aged info is not useful?

Ham changes character over time much less than spam. Train with whatever
you have to start, then train with misclassified messages.

--
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
Ten-millimeter explosive-tip caseless, standard light armor
piercing rounds. Why?
---
Today: the 238th anniversary of The Shot Heard 'Round The World
Re: local score ignored
>>> Niamh Holding 04/19/13 7:11 AM >>>
>> You only get one ham email a month?

On 19.04.13 09:22, Joe Acquisto-j4 wrote:
> That's all *I* seem to get. Other users may differ, but I have given them
> instructions on how to forward stuff for training. This is a rather small
> system compared to what many of you deal with.

Do you use a shared bayes database? Note that this may not be ideal for
many users, since different users can have different opinions on what is
spam and what is ham.

Also, are you sure SA is using the same BAYES database you are feeding?
It's quite possible that the database you have trained is not the one being
used (and this would explain your problem).

The question is how SA is called and how people train the database...

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Save the whales. Collect the whole set.
Re: local score ignored
That's all *I* seem to get. Other users may differ, but I have given them
instructions on how to forward stuff for training. This is a rather small
system compared to what many of you deal with.

joe a.

>>> Niamh Holding 04/19/13 7:11 AM >>>
> Hello Joe,
>
> Friday, April 19, 2013, 12:02:32 PM, you wrote:
>
> JAj> That has always given me pause, as I get very little ham. Got one
> JAj> this AM. which I will feed but that's the first in at least a month.
>
> You only get one ham email a month?
>
> --
> Best regards,
> Niamh    mailto:ni...@fullbore.co.uk
Re: local score ignored
> On 4/19/2013 at 6:29 AM, Matus UHLAR - fantomas wrote:
>> that is what you should do. You need to train on both spam and ham, since
>> the BAYES filter must know how they differ...

On 19.04.13 07:02, Joe Acquisto-j4 wrote:
> That has always given me pause, as I get very little ham. Got one this AM.
> which I will feed but that's the first in at least a month.
>
> I gather that aged info is not useful?

I think it could be useful, mostly when it's aged HAM, but even aged spam
is better than no spam...

Training missed spam is more important, but even training caught spam helps.

In your case (just a few ham) I'd train _all_ ham and all spam that does
not hit BAYES_99.

I looked at my spam history: only ~10% of my spam does not hit BAYES_99,
and the last spam hitting BAYES_50 was about a year and 500 spams ago.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Due to unexpected conditions Windows 2000 will be released in
first quarter of year 1901
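The training rule suggested above (train all ham, and all spam that does not hit BAYES_99) can be written down as a small predicate. This is a minimal sketch under stated assumptions: `should_train` is a hypothetical helper, and the caller is assumed to already know the message's true class and the set of rule names that fired on it.

```python
def should_train(is_spam, rules_hit):
    """Decide whether to feed a message to Bayes training.

    is_spam   -- the message's true class, as judged by a human
    rules_hit -- iterable of rule names that fired on the message
    """
    if not is_spam:
        # Ham is scarce on this system, so every ham message is trained.
        return True
    # Spam that Bayes already rates at the top bucket is skipped here.
    return "BAYES_99" not in set(rules_hit)


# A spam message that only reached BAYES_50 is worth training.
print(should_train(True, ["BAYES_50", "HTML_MESSAGE"]))
# A spam message that already hits BAYES_99 is skipped.
print(should_train(True, ["BAYES_99"]))
```

Skipping BAYES_99 spam is only the heuristic from this message; training it as well does no harm, as noted above.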
Re: local score ignored
>>> On 4/19/2013 at 6:35 AM, Matus UHLAR - fantomas wrote:
> On 4/19/2013 at 12:06 AM, John Hardin wrote:
>>> BAYES_50 is the bayes classifier's way of saying "insufficient data" or "I
>>> don't know".
>>>
>>> Do you really want to assign 3 points for "I don't know"?
>
> On 19.04.13 06:09, Joe Acquisto-j4 wrote:
>> In this case, from the samples I've seen. Absolutely, yes.
>
> as I said, the problem is that your BAYES database does not have enough of
> spam/ham samples. You have to feed it, not to increase score for BAYES_50.
> With your logic you can give high score to any other rule that hits, e.g.
> HTML_MESSAGE.

Well, I *could* do a lot of things. And have. (See these scars?)

>> For me, this last few days, I have seen lots of missed spam that has
>> virtually nothing else to trigger on.
>
> do you have network checks enabled? Plugins allowed? packages installed?
> blacklist, uribl?
> razor, pyzor, DCC, they all need plugins and installed clients.

I have to check. I set this up a while ago, cron'd up feeding BAYES and
such, and sat back.

> Do you have your trusted_networks and internal_networks properly set?

Probably.

>> Been so irritated by this I considered giving it a 5.0. But, even for me,
>> that's over the top.
>
> If you receive many spam with BAYES_50, there's something wrong with your
> BAYES database, even disabling could behave better (but training would do
> much better).

I have suspected such, but . . .

> What output does the command "sa-learn --dump magic" produce?

0.000          0          3          0  non-token data: bayes db version
0.000          0       1872          0  non-token data: nspam
0.000          0       9184          0  non-token data: nham
0.000          0     140303          0  non-token data: ntokens
0.000          0 1364766063          0  non-token data: oldest atime
0.000          0 1366368683          0  non-token data: newest atime
0.000          0 1366367890          0  non-token data: last journal sync atime
0.000          0 1366146116          0  non-token data: last expiry atime
0.000          0    1382400          0  non-token data: last expire atime delta
0.000          0      26360          0  non-token data: last expire reduction count

joe a.
Re: local score ignored
Hello Joe,

Friday, April 19, 2013, 12:02:32 PM, you wrote:

JAj> That has always given me pause, as I get very little ham. Got one
JAj> this AM. which I will feed but that's the first in at least a month.

You only get one ham email a month?

--
Best regards,
Niamh    mailto:ni...@fullbore.co.uk
Re: local score ignored
>>> On 4/19/2013 at 6:29 AM, Matus UHLAR - fantomas wrote:
> On 4/18/2013 at 7:21 AM, Matus UHLAR - fantomas wrote:
>>> Train your bayes database, if you get many spams with this small score.
>
> On 18.04.13 21:45, Joe Acquisto-j4 wrote:
>> All I can do is feed it.
>
> that is what you should do. You need to train on both spam and ham, since
> the BAYES filter must know how they differ...

That has always given me pause, as I get very little ham. Got one this AM.
which I will feed but that's the first in at least a month.

I gather that aged info is not useful?

joe a.
Re: local score ignored
> On 4/19/2013 at 12:06 AM, John Hardin wrote:
>> BAYES_50 is the bayes classifier's way of saying "insufficient data" or "I
>> don't know".
>>
>> Do you really want to assign 3 points for "I don't know"?

On 19.04.13 06:09, Joe Acquisto-j4 wrote:
> In this case, from the samples I've seen. Absolutely, yes.

As I said, the problem is that your BAYES database does not have enough
spam/ham samples. You have to feed it, not increase the score for BAYES_50.
With your logic you could give a high score to any other rule that hits,
e.g. HTML_MESSAGE.

> For me, this last few days, I have seen lots of missed spam that has
> virtually nothing else to trigger on.

Do you have network checks enabled? Plugins allowed? Packages installed?
Blacklists, URIBL?
razor, pyzor, DCC: they all need plugins and installed clients.

Do you have your trusted_networks and internal_networks properly set?

> Been so irritated by this I considered giving it a 5.0. But, even for me,
> that's over the top.

If you receive a lot of spam with BAYES_50, there's something wrong with
your BAYES database; even disabling it could behave better (but training
would do much better).

What output does the command "sa-learn --dump magic" produce?

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
(R)etry, (A)bort, (C)ancer
Re: local score ignored
> On 4/18/2013 at 7:21 AM, Matus UHLAR - fantomas wrote:
>> Train your bayes database, if you get many spams with this small score.

On 18.04.13 21:45, Joe Acquisto-j4 wrote:
> All I can do is feed it.

That is what you should do. You need to train on both spam and ham, since
the BAYES filter must know how they differ...

>> DO NOT play with BAYES_50 score.
>
> ? What can it hurt?

You can get many false positives.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Christian Science Programming: "Let God Debug It!".
Re: local score ignored
>>> On 4/19/2013 at 12:06 AM, John Hardin wrote:
> On Thu, 18 Apr 2013, Joe Acquisto-j4 wrote:
>
>> On 4/18/2013 at 7:21 AM, Matus UHLAR - fantomas wrote:
>>> On 18.04.13 06:45, Joe Acquisto-j4 wrote:
>>>> I was concerned about this:
>>>>
>>>> [score: 0.4968]
>>>
>>> This meant that BAYES has computed a 49.68% probability that the mail is
>>> spam and the rest (50.32%) that it is HAM.
>>
>> ok
>>
>>> DO NOT play with BAYES_50 score.
>>
>> ? What can it hurt?
>
> BAYES_50 is the bayes classifier's way of saying "insufficient data" or "I
> don't know".
>
> Do you really want to assign 3 points for "I don't know"?

In this case, from the samples I've seen. Absolutely, yes.

For me, this last few days, I have seen lots of missed spam that has
virtually nothing else to trigger on.

Been so irritated by this I considered giving it a 5.0. But, even for me,
that's over the top.

joe a
Re: local score ignored
On Thu, 18 Apr 2013, Joe Acquisto-j4 wrote:

> On 4/18/2013 at 7:21 AM, Matus UHLAR - fantomas wrote:
>> On 18.04.13 06:45, Joe Acquisto-j4 wrote:
>>> I was concerned about this:
>>>
>>> [score: 0.4968]
>>
>> This meant that BAYES has computed a 49.68% probability that the mail is
>> spam and the rest (50.32%) that it is HAM.
>
> ok
>
>> DO NOT play with BAYES_50 score.
>
> ? What can it hurt?

BAYES_50 is the bayes classifier's way of saying "insufficient data" or "I
don't know".

Do you really want to assign 3 points for "I don't know"?

--
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
Ten-millimeter explosive-tip caseless, standard light armor
piercing rounds. Why?
---
Tomorrow: the 238th anniversary of The Shot Heard 'Round The World
Re: local score ignored
On 2013-04-18 19:45, Joe Acquisto-j4 wrote:
>> DO NOT play with BAYES_50 score.
>
> ? What can it hurt?

It can cause significant false positives, since BAYES_50 indicates that
there's a 50% chance that this message isn't spam.

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren
Re: local score ignored
>>> On 4/18/2013 at 7:21 AM, Matus UHLAR - fantomas wrote:
> On 18.04.13 06:45, Joe Acquisto-j4 wrote:
>> I was concerned about this:
>>
>> [score: 0.4968]
>
> This meant that BAYES has computed a 49.68% probability that the mail is
> spam and the rest (50.32%) that it is HAM.

ok

> Train your bayes database, if you get many spams with this small score.

All I can do is feed it.

> DO NOT play with BAYES_50 score.

? What can it hurt?

joe a.
Re: local score ignored
On 18.04.13 06:45, Joe Acquisto-j4 wrote:
> I was concerned about this:
>
> [score: 0.4968]

This means that BAYES has computed a 49.68% probability that the mail is
spam and the rest (50.32%) that it is HAM.

Train your bayes database if you get many spams with this small score.

DO NOT play with BAYES_50 score.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
2B|!2B, that's a question!
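The arithmetic behind the quoted `[score: 0.4968]` line can be checked directly: 0.4968 is 49.68% spam and 50.32% ham. The sketch below is illustrative only. The 40 to 60% range comes from the rule description quoted in this thread; treating both ends as inclusive is an assumption, and the function names are hypothetical.

```python
def bayes_percentages(score):
    """Convert a Bayes probability (0.0 to 1.0) into (spam %, ham %)."""
    spam_pct = round(score * 100, 2)
    ham_pct = round(100 - spam_pct, 2)
    return spam_pct, ham_pct


def in_bayes_50_range(score):
    """True when the score falls in the 40 to 60% band described for BAYES_50.

    Inclusive bounds are an assumption made for this sketch.
    """
    return 0.40 <= score <= 0.60


spam_pct, ham_pct = bayes_percentages(0.4968)
print(f"spam {spam_pct}% / ham {ham_pct}%, BAYES_50 band: {in_bayes_50_range(0.4968)}")
```

A score this close to 0.5 says the classifier has no real opinion, which is why the advice is to train rather than to raise the rule's score.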
Re: local score ignored
On 18.04.2013 13:45, Joe Acquisto-j4 wrote:
> On 4/18/2013 at 6:38 AM, Axb wrote:
>> On 04/18/2013 12:23 PM, Joe Acquisto-j4 wrote:
>>> On 4/18/2013 at 6:15 AM, Axb wrote:
>>>> On 04/18/2013 12:11 PM, Joe Acquisto-j4 wrote:
>>>>> I'm missing something.
>>>>>
>>>>> Find a fair amount of missed SPAM showing, among others:
>>>>>
>>>>> *  0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60%
>>>>> *      [score: 0.4968]
>>>>>
>>>>> Bayes is way too low, in my HO.
>>>>
>>>> it's obviously not learning enough of whatever it's not scoring higher...
>>>>
>>>>> I am puzzled by the line after it.
>>>>>
>>>>> I set local.cf with: score BAYES_50_BODY 3.6 and restarted sa.
>>>>>
>>>>> Still comes thru with the low score. Am I sticking it in the wrong
>>>>> place? Or can it not be overridden? I do not recall if the odd
>>>>> second line was there before I made the change.
>>>>
>>>> seems like a phatphingers error:
>>>>
>>>> should be:
>>>>
>>>> score BAYES_50 3.6
>>>>
>>>> and NOT:
>>>>
>>>> score BAYES_50_BODY 3.6
>>>
>>> Boy, that phingers guy is a real PITA. More like phathead, as that's what
>>> I intended. But right after I posted I re-examined and said to myself
>>> "self . . ."
>>>
>>> Thanks. Might the odd line be a result of that?
>>
>> What odd line? report looks normal
>
> I was concerned about this:
>
> [score: 0.4968]
>
> On a line by itself. But I see similar in all headers, now that I bother
> to look.
>
> Excuse the morning fog, please.
>
> joe a.

BAYES_50 means: "Can't say if this is SPAM or HAM". I have the score near
zero for it.

For HAM I use negative scores, for SPAM positive. But BAYES_50 is neither
SPAM nor HAM.

3.6 is way too spammy a score for it, IMHO.

--
There is an old time toast which is golden for its beauty.
"When you ascend the hill of prosperity may you not meet a friend."
-- Mark Twain
Re: local score ignored
>>> On 4/18/2013 at 6:38 AM, Axb wrote:
> On 04/18/2013 12:23 PM, Joe Acquisto-j4 wrote:
>> On 4/18/2013 at 6:15 AM, Axb wrote:
>>> On 04/18/2013 12:11 PM, Joe Acquisto-j4 wrote:
>>>> I'm missing something.
>>>>
>>>> Find a fair amount of missed SPAM showing, among others:
>>>>
>>>> *  0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60%
>>>> *      [score: 0.4968]
>>>>
>>>> Bayes is way too low, in my HO.
>>>
>>> it's obviously not learning enough of whatever it's not scoring higher...
>>>
>>>> I am puzzled by the line after it.
>>>>
>>>> I set local.cf with: score BAYES_50_BODY 3.6 and restarted sa.
>>>>
>>>> Still comes thru with the low score. Am I sticking it in the wrong
>>>> place? Or can it not be overridden? I do not recall if the odd
>>>> second line was there before I made the change.
>>>
>>> seems like a phatphingers error:
>>>
>>> should be:
>>>
>>> score BAYES_50 3.6
>>>
>>> and NOT:
>>>
>>> score BAYES_50_BODY 3.6
>>
>> Boy, that phingers guy is a real PITA. More like phathead, as that's what
>> I intended. But right after I posted I re-examined and said to myself
>> "self . . ."
>>
>> Thanks. Might the odd line be a result of that?
>
> What odd line? report looks normal

I was concerned about this:

[score: 0.4968]

On a line by itself. But I see similar in all headers, now that I bother
to look.

Excuse the morning fog, please.

joe a.
Re: local score ignored
On 04/18/2013 12:23 PM, Joe Acquisto-j4 wrote:
> On 4/18/2013 at 6:15 AM, Axb wrote:
>> On 04/18/2013 12:11 PM, Joe Acquisto-j4 wrote:
>>> I'm missing something.
>>>
>>> Find a fair amount of missed SPAM showing, among others:
>>>
>>> *  0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60%
>>> *      [score: 0.4968]
>>>
>>> Bayes is way too low, in my HO.
>>
>> it's obviously not learning enough of whatever it's not scoring higher...
>>
>>> I am puzzled by the line after it.
>>>
>>> I set local.cf with: score BAYES_50_BODY 3.6 and restarted sa.
>>>
>>> Still comes thru with the low score. Am I sticking it in the wrong
>>> place? Or can it not be overridden? I do not recall if the odd
>>> second line was there before I made the change.
>>
>> seems like a phatphingers error:
>>
>> should be:
>>
>> score BAYES_50 3.6
>>
>> and NOT:
>>
>> score BAYES_50_BODY 3.6
>
> Boy, that phingers guy is a real PITA. More like phathead, as that's what
> I intended. But right after I posted I re-examined and said to myself
> "self . . ."
>
> Thanks. Might the odd line be a result of that?

What odd line? report looks normal
Re: local score ignored
>>> On 4/18/2013 at 6:15 AM, Axb wrote:
> On 04/18/2013 12:11 PM, Joe Acquisto-j4 wrote:
>> I'm missing something.
>>
>> Find a fair amount of missed SPAM showing, among others:
>>
>> *  0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60%
>> *      [score: 0.4968]
>>
>> Bayes is way too low, in my HO.
>
> it's obviously not learning enough of whatever it's not scoring higher...
>
>> I am puzzled by the line after it.
>>
>> I set local.cf with: score BAYES_50_BODY 3.6 and restarted sa.
>>
>> Still comes thru with the low score. Am I sticking it in the wrong
>> place? Or can it not be overridden? I do not recall if the odd
>> second line was there before I made the change.
>
> seems like a phatphingers error:
>
> should be:
>
> score BAYES_50 3.6
>
> and NOT:
>
> score BAYES_50_BODY 3.6

Boy, that phingers guy is a real PITA. More like phathead, as that's what
I intended. But right after I posted I re-examined and said to myself
"self . . ."

Thanks. Might the odd line be a result of that?

joea.
Re: local score ignored
On 04/18/2013 12:11 PM, Joe Acquisto-j4 wrote:
> I'm missing something.
>
> Find a fair amount of missed SPAM showing, among others:
>
> *  0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60%
> *      [score: 0.4968]
>
> Bayes is way too low, in my HO.

it's obviously not learning enough of whatever it's not scoring higher...

> I am puzzled by the line after it.
>
> I set local.cf with: score BAYES_50_BODY 3.6 and restarted sa.
>
> Still comes thru with the low score. Am I sticking it in the wrong
> place? Or can it not be overridden? I do not recall if the odd
> second line was there before I made the change.

seems like a phatphingers error:

should be:

score BAYES_50 3.6

and NOT:

score BAYES_50_BODY 3.6
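The mix-up comes from reading the report line as if "BODY" were part of the rule name. In the line quoted above, the second field is the rule name and "BODY: ..." opens the description. A minimal sketch of splitting such a line follows, assuming the layout shown in this thread; `parse_report_line` is a hypothetical helper, not a SpamAssassin function.

```python
import re

# "*  <points> <RULE_NAME> <description...>" as it appears in the quoted report.
REPORT_LINE = re.compile(r"^\*\s+(-?\d+(?:\.\d+)?)\s+(\S+)\s+(.*)$")


def parse_report_line(line):
    """Return (points, rule_name, description), or None for other lines.

    Continuation lines such as "*      [score: 0.4968]" carry no points
    field and therefore do not match.
    """
    match = REPORT_LINE.match(line.strip())
    if match is None:
        return None
    return float(match.group(1)), match.group(2), match.group(3)


hit = parse_report_line("*  0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60%")
print(hit)
print(parse_report_line("*      [score: 0.4968]"))
```

The name to use in a `score` line in local.cf is therefore the second field, BAYES_50, exactly as the reply says.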