Re: BAYES question

2013-04-28 Thread Matus UHLAR - fantomas
Joe Acquisto-j4 wrote on 2013-04-27 13:37: Very interesting. However, I don't see any BAYES_xx markings in the headers at all. On 27.04.13 19:00, Joe Acquisto-j4 wrote: I seem to have not stated my query clearly, as several have suggested this. Or, it was perfectly understood, but I am not

Re: BAYES question

2013-04-27 Thread Karsten Bräckelmann
On Sat, 2013-04-27 at 19:00 -0400, Joe Acquisto-j4 wrote: > > > Very interesting. However, I don't see any BAYES_xx markings in the > > > headers at all. > > > I assumed that is because it is not scoring yet, due to low samples. > > > Or some other reason. > > > > that could be the reason, othe

Re: BAYES question

2013-04-27 Thread John Hardin
On Sat, 27 Apr 2013, Joe Acquisto-j4 wrote: I don't want to know how to see the tokens, etc (I do, but already know how). I was curious about this BAYES_xx thing, which I gather is something I should see in a message header. Yes, the BAYES_## are rules that would show up in the hit-rules list

Re: BAYES question

2013-04-27 Thread John Hardin
On Sat, 27 Apr 2013, Alex wrote: Hi, To feed "ham" to bayes, should one only use mis-flagged mail, or may one use unflagged (below 5) mail? Expressed differently, can one feed "good" messages, "sa-learn --ham path-to-ham" as one might feed missed spam, "sa-learn --spam path-to-spam" You

Re: BAYES question

2013-04-27 Thread Joe Acquisto-j4
>>> On 4/27/2013 at 11:17 AM, Benny Pedersen wrote: > Joe Acquisto-j4 wrote on 2013-04-27 13:37: > >> Very interesting. However, I don't see any BAYES_xx markings in the >> headers at all. > > how is your bayes setup? > > what gives "sa-learn --dump magic"? > >> I assumed that is because i

Re: BAYES question

2013-04-27 Thread Joe Acquisto-j4
>>> On 4/27/2013 at 1:20 PM, John Hardin wrote: > On Fri, 26 Apr 2013, Joe Acquisto-j4 wrote: > >> So, I could just feed a bunch of good mail, to --ham, and spam that is > correctly marked >> as spam as well as missed spam, to --spam? > > Correct; the important part is that what you train with

Re: BAYES question

2013-04-27 Thread Alex
Hi, To feed "ham" to bayes, should one only use mis-flagged mail, or may one >> use unflagged (below 5) mail? >> >> Expressed differently, can one feed "good" messages, "sa-learn --ham >> path-to-ham" as one might feed missed spam, "sa-learn --spam path-to-spam" >> > > You can train hams that h

Re: BAYES question

2013-04-27 Thread Jari Fredriksson
27.04.2013 23:15, Karsten Bräckelmann wrote: > Point being, am I correct in assuming these numbers roughly reflect your > ham/spam ratio? > >> > 0.000 0 28252 0 non-token data: nspam >> > 0.000 0 187579 0 non-token data: nham Yes. I want more spam,

Re: BAYES question

2013-04-27 Thread Karsten Bräckelmann
On Sat, 2013-04-27 at 11:59 +0300, Jari Fredriksson wrote: > 27.04.2013 04:54, Karsten Bräckelmann wrote: > > And it is good advice to keep the initial training corpora to a > > ratio roughly resembling your ham/spam ratio, or maybe 1/1. (At this > > point, we're approaching voodoo. Learning 10

Re: BAYES question

2013-04-27 Thread John Hardin
On Sat, 27 Apr 2013, Niamh Holding wrote: Hello John, Saturday, April 27, 2013, 12:50:34 AM, you wrote: JH> Simple rule: train any ham that doesn't hit BAYES_00. ??? What about ham that hits BAYES_00 and shows autolearn=no ? If a ham hits BAYES_00 that means the Bayes system did a good job
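The "simple rule" quoted above can be sketched roughly as follows. This is only an illustration, assuming the message carries a standard X-Spam-Status header with a tests= list; the function names hit_rules and should_train_as_ham are hypothetical, and the caller is assumed to have already verified by hand that the message really is ham.

```python
import re


def hit_rules(status_header: str) -> set[str]:
    """Return the rule names from the tests= field of an X-Spam-Status value.

    Assumes the usual "tests=A,B,C autolearn=..." layout, possibly folded
    across lines after a comma.
    """
    match = re.search(r"tests=(.*?)(?=\s+[a-z]+=|$)", status_header, re.S)
    if not match:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


def should_train_as_ham(status_header: str) -> bool:
    """Apply the rule of thumb: train verified ham unless it already hit BAYES_00."""
    return "BAYES_00" not in hit_rules(status_header)


# Illustrative header values (made up for this sketch, not taken from the thread).
already_good = "No, score=-1.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW autolearn=no version=3.3.1"
undecided = "No, score=1.2 required=5.0 tests=BAYES_50,\n\tHTML_MESSAGE autolearn=no version=3.3.1"

print(should_train_as_ham(already_good))
print(should_train_as_ham(undecided))
```

Messages for which the function returns True would then be candidates for "sa-learn --ham".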

Re: BAYES question

2013-04-27 Thread John Hardin
On Fri, 26 Apr 2013, Joe Acquisto-j4 wrote: So, I could just feed a bunch of good mail, to --ham, and spam that is correctly marked as spam as well as missed spam, to --spam? Correct; the important part is that what you train with must be *correctly classified* - training a ham as spam is no

Re: BAYES question

2013-04-27 Thread Jari Fredriksson
27.04.2013 18:24, Benny Pedersen wrote: > Jari Fredriksson wrote on 2013-04-27 10:59: > >> 0.000 0 28252 0 non-token data: nspam >> 0.000 0 187579 0 non-token data: nham >> >> I have no problems with Bayes whatsoever. > > this is a good working m

Re: BAYES question

2013-04-27 Thread Benny Pedersen
Niamh Holding wrote on 2013-04-27 18:25: What about ham that hits BAYES_00 and shows autolearn=no ? if it's spam, sa-learn --spam; else the above is ok, there is no need to learn it if it is already learned as ham -- senders that put my email into body content will deliver it to my own trashcan, so

Re: BAYES question

2013-04-27 Thread Niamh Holding
Hello John, Saturday, April 27, 2013, 12:50:34 AM, you wrote: JH> Simple rule: train any ham that doesn't hit BAYES_00. ??? What about ham that hits BAYES_00 and shows autolearn=no ? -- Best regards, Niamh

Re: BAYES question

2013-04-27 Thread Benny Pedersen
Jari Fredriksson wrote on 2013-04-27 10:59: 0.000 0 28252 0 non-token data: nspam 0.000 0 187579 0 non-token data: nham I have no problems with Bayes whatsoever. this is a good working mta setup, not a bayes problem :) -- senders that put my e

Re: BAYES question

2013-04-27 Thread Benny Pedersen
Joe Acquisto-j4 wrote on 2013-04-27 13:37: Very interesting. However, I don't see any BAYES_xx markings in the headers at all. how is your bayes setup? what gives "sa-learn --dump magic"? I assumed that is because it is not scoring yet, due to low samples. Or some other reason. that c

Re: BAYES question

2013-04-27 Thread Benny Pedersen
Joe Acquisto-j4 wrote on 2013-04-27 01:38: path-to-ham" as one might feed missed spam, "sa-learn --spam path-to-spam" yes, but if you sort based on scores there is no point in using bayes in the first place; the only thing that is important is to feed what is spam and what is ham to learning

Re: BAYES question

2013-04-27 Thread Matus UHLAR - fantomas
Do train those, which have a Bayesian probability close(r) to 0.5. Or even worse, have a Bayesian probability contrary to the overall score, or actual classification. Training the plethora of spam hitting BAYES_99 might not be a mistake. But it is pretty likely, to *not* improve general SA perfor

Re: BAYES question

2013-04-27 Thread Joe Acquisto-j4
. . . > Do train those, which have a Bayesian probability close(r) to 0.5. Or > even worse, have a Bayesian probability contrary to the overall score, > or actual classification. > > Training the plethora of spam hitting BAYES_99 might not be a mistake. > But it is pretty likely, to *not* improve

Re: BAYES question

2013-04-27 Thread Jari Fredriksson
27.04.2013 12:03, Axb wrote: > On 04/27/2013 10:59 AM, Jari Fredriksson wrote: >> 27.04.2013 04:54, Karsten Bräckelmann wrote: >>> And it is good advice to keep the initial training corpora to a >>> ratio roughly resembling your ham/spam ratio, or maybe 1/1. (At this >>> point, we're approa

Re: BAYES question

2013-04-27 Thread Axb
On 04/27/2013 10:59 AM, Jari Fredriksson wrote: 27.04.2013 04:54, Karsten Bräckelmann wrote: And it is good advice to keep the initial training corpora to a ratio roughly resembling your ham/spam ratio, or maybe 1/1. (At this point, we're approaching voodoo. Learning 10 times more ham than s

Re: BAYES question

2013-04-27 Thread Jari Fredriksson
27.04.2013 04:54, Karsten Bräckelmann wrote: > And it is good advice to keep the initial training corpora to a > ratio roughly resembling your ham/spam ratio, or maybe 1/1. (At this > point, we're approaching voodoo. Learning 10 times more ham than spam is > most likely to be a bad choice, thou

Re: BAYES question

2013-04-26 Thread Karsten Bräckelmann
On Fri, 2013-04-26 at 19:38 -0400, Joe Acquisto-j4 wrote: > To feed "ham" to bayes, should one only use mis-flagged mail, or may > one use unflagged (below 5) mail? The Bayesian classifier is a subsystem mostly independent from SA. Most SA rules are rather white or black. Match, or don't. And sc

Re: BAYES question

2013-04-26 Thread Karsten Bräckelmann
On Fri, 2013-04-26 at 21:25 -0400, Joe Acquisto-j4 wrote: > Well, right now, there are no bayes hits at all. I cleared bayes to > re-train, after correcting for a botched initial scheme. > > While I am getting a fair amount of missed spam, there is very little > mis-classified. > > So I am look

Re: BAYES question

2013-04-26 Thread Joe Acquisto-j4
>>> On 4/26/2013 at 7:50 PM, John Hardin wrote: > On Fri, 26 Apr 2013, Joe Acquisto-j4 wrote: > >> To feed "ham" to bayes, should one only use mis-flagged mail, or may >> one use unflagged (below 5) mail? >> >> Expressed differently, can one feed "good" messages, "sa-learn --ham >> path-to-ham

Re: BAYES question

2013-04-26 Thread John Hardin
On Fri, 26 Apr 2013, Joe Acquisto-j4 wrote: To feed "ham" to bayes, should one only use mis-flagged mail, or may one use unflagged (below 5) mail? Expressed differently, can one feed "good" messages, "sa-learn --ham path-to-ham" as one might feed missed spam, "sa-learn --spam path-to-spam"

Re: Bayes Question

2007-04-23 Thread Matt Kettler
Craig wrote: > Hello All- > > My bayes database seems to have problems and I would like suggestion > on how to correct. Here is my issue- > I take any spam email from my users and run the following commands > a. spamassassin -R name of spam file to check > b. spamassassin -r name of spam file to

Re: Bayes question

2006-02-21 Thread M. Lewis
Sorry, I am in the habit of 'reply' as opposed to 'reply all'. I see no 'obvious' errors in spamassassin -D --lint which was the first thing I checked. Shortly before you asked about the 'sa-learn --dump magic', I found this message from Matt: http://marc.theaimsgroup.com/?l=spamassassin-us

Re: Bayes question

2006-02-21 Thread Steven Stern
M. Lewis wrote: Thanks Steve, # sa-learn --dump magic 0.000 0 3 0 non-token data: bayes db version 0.000 0 57468 0 non-token data: nspam 0.000 0 16419 0 non-token data: nham 0.000 0 181931 0 non-to
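For readers who want to check the nspam/nham counters programmatically, the following is a rough sketch of parsing "sa-learn --dump magic" output. It assumes the column layout shown in the snippet above (four numeric columns followed by "non-token data: label"); the function name parse_magic is hypothetical, and the sample text only repeats the three complete lines quoted in this entry.

```python
def parse_magic(dump: str) -> dict[str, int]:
    """Map each "non-token data" label to the integer in its third column."""
    counts: dict[str, int] = {}
    for line in dump.splitlines():
        if "non-token data:" not in line:
            continue  # skip token lines or anything unexpected
        fields, label = line.split("non-token data:", 1)
        counts[label.strip()] = int(fields.split()[2])
    return counts


# The three complete lines from the dump quoted above.
sample = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 57468 0 non-token data: nspam
0.000 0 16419 0 non-token data: nham
"""

counts = parse_magic(sample)
spam_per_ham = counts["nspam"] / counts["nham"]
print(counts["nspam"], counts["nham"], round(spam_per_ham, 2))
```

In practice the text would come from running the command itself, for example via subprocess, rather than from a literal string.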

Re: Bayes question

2006-02-21 Thread Steven Stern
M. Lewis wrote: I recently lost a hard drive and have had to set up everything again. I'm seeing a fair amount of spam that is getting through my filters. From what I can see in the headers of messages, bayes does not seem to be used at all. I'm reasonably sure this is the reason I'm seeing sp

Re: bayes question (sa-learn)

2006-02-15 Thread Patrick von der Hagen
Philipp Snizek wrote: [...] However, I fear SA learns that headers coming from my internal MTA could be spam and so causing false results on real spam. Exactly. Forwarding e-mail breaks the original information and has to be avoided. What experiences have you made or how have you solved thi

Re: Bayes question

2005-07-27 Thread Matt Kettler
Alan Fullmer wrote: > I attempted to do that once, with a network file system, but it didn't > seem to know how to handle the locking properly. I know I did something > wrong, so if anyone else has a solution, I'd also be happy to hear it! :) As JamesDR suggested... Do it right, use SQL. It's a dat

RE: Bayes question

2005-07-27 Thread Tyler Nally
Boy... anytime I've done some kind of network file sharing across a system or two, I have never done it for good performance reasons... only convenience sakes. And even then, never large files. Almost a decade ago when I was performing massive COBOL database conversions to load data into flat fil

RE: Bayes question

2005-07-27 Thread Alan Fullmer
I attempted to do that once, with a network file system, but it didn't seem to know how to handle the locking properly. I know I did something wrong, so if anyone else has a solution, I'd also be happy to hear it! :) -Alan Fullmer [EMAIL PROTECTED] www.xnote.com www.zoobuh.com

Re: Bayes question

2005-07-27 Thread JamesDR
Robert Swan wrote: I have a pair of SpamAssassin servers filtering e-mail (SpamAssassin 3.0.4, spamd/spamc, Postfix, Red Hat 9). I was wondering if I could share the bayes database between the two servers rather than having each with its own and having to do the sa-learn process twice. Any Th

Re: Bayes question

2005-04-14 Thread Matt Kettler
Joe Zitnik wrote: >I apologize if this has been asked before, but I need some >clarification. If I have autolearn for ham set to 0, and the default >BAYES_00 score assigns mail a negative value, and a spam message comes >through with enough good text in it to give it a BAYES_00 and therefore >a n

Re: bayes question

2005-01-10 Thread Michael Parker
In the future, please be sure to CC the list as well, so it can get dumped into the archives for future use. On Mon, Jan 10, 2005 at 06:13:16PM -0500, Sunny Forro wrote: > Michael, > I am running it as root. I get the error every time I run > SA-LEARN -D --SYNC, I don't get bayes checking wi

Re: bayes question

2005-01-10 Thread Michael Parker
On Mon, Jan 10, 2005 at 04:50:57PM -0500, Sunny Forro wrote: > debug: bayes: found bayes db version 2 > bayes: bayes db version 2 is not able to be used, aborting! at > /usr/local/lib/perl5/site_perl/5.8.4/Mail/SpamAssassin/BayesStore/DBM.pm > line 160. Ok, yeah, this is just a warning, no error,

RE: bayes question

2005-01-10 Thread Sunny Forro
[EMAIL PROTECTED] Web: http://www.compcoind.com/ -Original Message- From: Michael Parker [mailto:[EMAIL PROTECTED] Sent: Monday, January 10, 2005 4:30 PM To: Sunny Forro Cc: users@spamassassin.apache.org Subject: Re: bayes question On Mon, Jan 10, 2005 at 04:22:03PM -0500, Sunny

Re: bayes question

2005-01-10 Thread Michael Parker
On Mon, Jan 10, 2005 at 04:22:03PM -0500, Sunny Forro wrote: > Help! > I know this has got to be the number 1 question. But I haven't > had any luck with it: > Actually, it doesn't happen that often these days. > I'm getting: > Bayes: bayes db version 2 is not able to be used, aborting! >

Re: Bayes question

2004-12-23 Thread Chuck Campbell
On Mon, Dec 20, 2004 at 04:38:34PM -0600, Steve Bondy wrote: > > > > > On Mon, Dec 20, 2004 at 04:13:44PM -0600, Steve Bondy wrote: > > > I'm no expert on Bayes, but as far as I know, repeatedly > > learning the > > > same message over and over again doesn't do you any good. Once the > > > to

Re: Bayes question

2004-12-21 Thread Theo Van Dinter
On Mon, Dec 20, 2004 at 08:28:45PM -0800, Jon Drukman wrote: > also bayes won't learn the *exact* same message repeatedly. if it's > already seen a message it won't process it at all. i'm not sure if it > works off the message-id or a hash of the message content. Just for clarification, it's a

Re: Bayes question

2004-12-21 Thread Jon Drukman
Chuck Campbell wrote: On Mon, Dec 20, 2004 at 12:56:43PM -0600, Steve Bondy wrote: For example, the default score in 2.6.x for BAYES_90 is either 2.454 or 2.101. If that's the only rule you hit, and your threshold is above those numbers, it will come through. But what if you repeatedly learn the m

RE: Bayes question

2004-12-20 Thread Steve Bondy
> > On Mon, Dec 20, 2004 at 04:13:44PM -0600, Steve Bondy wrote: > > I'm no expert on Bayes, but as far as I know, repeatedly > learning the > > same message over and over again doesn't do you any good. Once the > > tokens are in there, that's it. The bayes score goes up as more > > tokens

Re: Bayes question

2004-12-20 Thread Michael Parker
On Mon, Dec 20, 2004 at 04:18:58PM -0600, Chuck Campbell wrote: > It's not the same message... exactly. It is the same spam, coming from many > different senders, each with a unique message ID. I keep getting more of > them, > and I keep learning them with sa-learn. > > I'm just not getting SA

Re: Bayes question

2004-12-20 Thread Chuck Campbell
On Mon, Dec 20, 2004 at 04:13:44PM -0600, Steve Bondy wrote: > I'm no expert on Bayes, but as far as I know, repeatedly learning the > same message over and over again doesn't do you any good. Once the > tokens are in there, that's it. The bayes score goes up as more tokens > in the message match

RE: Bayes question

2004-12-20 Thread Steve Bondy
confirm if I'm right... It would help me out too. Steve > -Original Message- > From: Chuck Campbell [mailto:[EMAIL PROTECTED] > Sent: Monday, December 20, 2004 3:54 PM > To: Steve Bondy > Cc: SpamAssassin Users > Subject: Re: Bayes question > > > On Mon,

Re: Bayes question

2004-12-20 Thread Chuck Campbell
On Mon, Dec 20, 2004 at 12:56:43PM -0600, Steve Bondy wrote: > Just because you learn something as spam doesn't mean it will be > blocked. > SA will add a score to the message based on the bayes rules, but if the > bayes rules are the only ones that get hit, and they score less than > your threshol

RE: Bayes question

2004-12-20 Thread Steve Bondy
Just because you learn something as spam doesn't mean it will be blocked. SA will add a score to the message based on the bayes rules, but if the bayes rules are the only ones that get hit, and they score less than your threshold, it won't keep the stuff out. For example, the default score in 2.6.x
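The arithmetic behind this point can be sketched as below. It is a simplified illustration only: the BAYES_90 figure of 2.454 is the 2.6.x default quoted elsewhere in this thread, the threshold of 5.0 is an assumed required score, and OTHER_RULE with its score of 3.0 is a hypothetical rule invented for the example.

```python
def is_flagged(rule_scores: dict[str, float], required: float = 5.0) -> bool:
    """Return True when the summed rule scores reach the required threshold."""
    return sum(rule_scores.values()) >= required


# Only the Bayes rule hit: 2.454 stays below the assumed threshold of 5.0.
bayes_only = {"BAYES_90": 2.454}

# Bayes plus one more (hypothetical) rule: 2.454 + 3.0 crosses the threshold.
bayes_plus_other = {"BAYES_90": 2.454, "OTHER_RULE": 3.0}

print(is_flagged(bayes_only))
print(is_flagged(bayes_plus_other))
```

So a well-trained Bayes database raises the score, but a message is only tagged when the total from all rules reaches the threshold.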

Re: Bayes question

2004-12-06 Thread Michael Parker
On Mon, Dec 06, 2004 at 01:28:23AM -, Gray, Richard wrote: > > So, what happens when you take these two overlapping databases and > > combine them is that certain tokens (those that have overlap) are then > > double counted. This makes the database, at least according to the > > bayes model SA

RE: Bayes question

2004-12-06 Thread Gray, Richard
> So, what happens when you take these two overlapping databases and > combine them is that certain tokens (those that have overlap) are then > double counted. This makes the database, at least according to the > bayes model SA is using, statistic

Re: Bayes question

2004-12-05 Thread Ricardo Oliveira
Michael, I understood the dangers behind the theory - I'll get into the analysis of all the bayes databases later on. I guess the only way to do it cleanly is to feed the same HAM+SPAM messages to all the bayes learning mechanisms... Thanks for your time, Ricardo

Re: Bayes question

2004-12-04 Thread Michael Parker
On Sat, Dec 04, 2004 at 10:46:22AM +, Ricardo Oliveira wrote: > According to the docs, --restore is destructive (in the sense it > destroys the previous contents of the database). > > Would you guys be interested in such a feature? I plan to use a > generic bayes DB (which is maintained by our

Re: Bayes question

2004-12-04 Thread Ricardo Oliveira
According to the docs, --restore is destructive (in the sense it destroys the previous contents of the database). Would you guys be interested in such a feature? I plan to use a generic bayes DB (which is maintained by our tech team), and merge it with each client's own DB (which would result in

Re: Bayes question

2004-12-03 Thread Mike
On Fri, 3 Dec 2004 19:37:05 +, Ricardo Oliveira <[EMAIL PROTECTED]> wrote: > What about joining several databases together? > > I'd like to use a "general" bayes DB, and join it with some clients' > particular DBs. > > TIA, > Ricardo > Never tried it, but it should be possible with sa-l

Re: Bayes question

2004-12-03 Thread Ricardo Oliveira
What about joining several databases together? I'd like to use a "general" bayes DB, and join it with some clients' particular DBs. TIA, Ricardo

Re: Bayes question

2004-12-02 Thread Mike
On Thu, 2 Dec 2004 22:27:05 +, Ricardo Oliveira <[EMAIL PROTECTED]> wrote: > By the way - are the bayes databases on disk portable (in the sense I > could "import" or copy them to another server and use them > accordingly)? > > Thanks in advance > I haven't had a problem doing that, moving f

Re: Bayes question

2004-12-02 Thread Ricardo Oliveira
By the way - are the bayes databases on disk portable (in the sense I could "import" or copy them to another server and use them accordingly)? Thanks in advance

Re: Bayes question

2004-11-24 Thread Rakesh
Austin Weidner wrote: Really trying to figure out bayes. Auto learn is set up, and my headers are showing autolearn=spam However, when I do sa-learn --dump magic, there are zero spams and zero hams. By using the -D (debug) option, I can see sa-learn is looking at: debug: bayes: 17216 tie-ing to DB

Re: Bayes question

2004-11-23 Thread Matt Kettler
At 01:58 AM 11/23/2004 -0500, Austin Weidner wrote: Really trying to figure out bayes. Auto learn is set up, and my headers are showing autolearn=spam However, when I do sa-learn --dump magic, there are zero spams and zero hams. By using the -D (debug) option, I can see sa-learn is looking at: debu