Re: Help with Bayes-SQL-Configuration
Julian Kippels wrote:

> Hi, I am in the process of setting up a bayes-sql-database but I am unsure whether I want to set the bayes_sql_override_username option. I would like to have per-user-bayes scores, so that scores from user A will not interfere with messages sent to user B. If I understand it correctly, no matter what I use as the username for sa-learn, this option when set will override it with whatever I put there. Does this not effectively disable per-user-bayes scores and bundle them all under one meta-user? However I have read, when using amavis (which I do) to call SA, I should set this variable to the username which runs the amavis process. What should I do?

You're getting two similarly-named but unrelated options confused.

bayes_sql_override_username explicitly and specifically disables the per-user Bayes DB setup you're asking for. (IMO a per-user setup is of limited use at any scale larger than "a handful of technically minded users", but it *can* work as long as your users are willing to feed the system. For most other setups you're going to get better results by using a single, centrally-maintained site-wide Bayes database.)

The option you're probably looking for is "bayes_sql_username" (and the related bayes_sql_dsn and bayes_sql_password). This sets the SQL connection user - not the SA/Bayes user! - which is what you need if you want to keep many per-user Bayes datasets in an SQL database instead of one of the other backends.

Calling SpamAssassin from Amavis or some other glue layer that operates during the same part of mail flow means that it's sometimes extremely difficult to make use of per-user Bayes and other settings, because you have one message with many recipients.

Full per-user SA settings are IMO best handled by calling SA on final mail delivery, where you are guaranteed to be calling SA for exactly one recipient on any given call. The downside of this method is that if a message originally had multiple recipients, SA will be called for each of those recipients.

-kgd
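As a rough illustration of the distinction drawn in the reply above, the following Python sketch models which Bayes user a message would be filed under. It is not SpamAssassin code: `effective_bayes_user` and its arguments are hypothetical names, and the only behaviour assumed is what the reply describes, namely that bayes_sql_override_username replaces whatever username was supplied.

```python
from typing import Optional


def effective_bayes_user(requested_user: str,
                         override_username: Optional[str] = None) -> str:
    """Return the name the Bayes data would be stored under.

    Hypothetical model: if an override is configured, every caller
    ends up sharing that one Bayes user; otherwise each caller keeps
    its own dataset.
    """
    if override_username:
        return override_username
    return requested_user


# Per-user setup: no override, so users A and B stay separate.
per_user = {u: effective_bayes_user(u) for u in ("userA", "userB")}

# With an override (for example the account running amavis), both
# collapse into a single shared Bayes user.
shared = {u: effective_bayes_user(u, "amavis") for u in ("userA", "userB")}

print(per_user)
print(shared)
```

bayes_sql_username does not appear in the sketch on purpose: per the reply it is only the login used for the SQL connection and has no bearing on which Bayes user the tokens belong to.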
Help with Bayes-SQL-Configuration
Hi, I am in the process of setting up a bayes-sql-database but I am unsure whether I want to set the bayes_sql_override_username option. I would like to have per-user-bayes scores, so that scores from user A will not interfere with messages sent to user B. If I understand it correctly, no matter what I use as the username for sa-learn, this option when set will override it with whatever I put there. Does this not effectively disable per-user-bayes scores and bundle them all under one meta-user? However I have read, when using amavis (which I do) to call SA, I should set this variable to the username which runs the amavis process. What should I do?

Thanks
Julian
Re: Help with bayes
Kai Schaetzl wrote:

> Troy Settle wrote on Mon, 17 Nov 2008 13:33:10 -0500:
>
> > I'm having a major problem with the bayes system. I cleared the bayes database and let it start re-learning. Once it kicked in, I again started getting false hits with BAYES_00=-2.599 on a great many spam/uce messages.
>
> How did you let it start re-learning? What's the output of sa-learn dump magic?

From incoming mail. I'm still working on building a corpus suitable for sa-learn.

$ sa-learn --dump magic
0.000 0 3 0 non-token data: bayes db version
0.000 0 44946 0 non-token data: nspam
0.000 0 36757 0 non-token data: nham
0.000 0 545675 0 non-token data: ntokens
0.000 0 1226964376 0 non-token data: oldest atime
0.000 0 1227033356 0 non-token data: newest atime
0.000 0 1227033315 0 non-token data: last journal sync atime
0.000 0 1227007705 0 non-token data: last expiry atime
0.000 0 43200 0 non-token data: last expire atime delta
0.000 0 393274 0 non-token data: last expire reduction count

FWIW, how bad would I screw things up if I were to override the BAYES_00 score to 0?

--
Troy Settle
Pulaski Networks ~ http://www.psknet.com
866.477.5638 ~ 540.994.4254
Re: Help with bayes
On Tue, 2008-11-18 at 15:19 -0500, Troy Settle wrote:

> Kai Schaetzl wrote:
>
> > Troy Settle wrote on Mon, 17 Nov 2008 13:33:10 -0500:
> >
> > > I'm having a major problem with the bayes system. I cleared the bayes database and let it start re-learning. Once it kicked in, I again started getting false hits with BAYES_00=-2.599 on a great many spam/uce messages.
> >
> > How did you let it start re-learning? What's the output of sa-learn dump magic?
>
> From incoming mail. I'm still working on building a corpus suitable for sa-learn.

You *need* to train on error. Also, you definitely will want to learn manually, at the very least until Bayes has been trained properly.

If you rely solely on auto-learning, there are a great many spams that will not be learned, and those are pretty much exactly the ones where Bayes can make a difference!

http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html#learning_options
http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html

By default, auto-learning will *not* learn any spam with a total score below 12 (calculated without Bayes, etc.), or with body or header test scores below 3 each. It won't learn ham with a score above 0.1 either. This is a safety measure.

> FWIW, how bad would I screw things up if I were to override the BAYES_00 score to 0?

That's not gonna solve your problems. You'd better properly train Bayes on the stuff not auto-learned, so it will eventually learn the difference between ham and spam. So far it only knows about the extreme ends, which really don't need Bayes to make a difference anyway.

guenther

--
char *t=[EMAIL PROTECTED]; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: Help with bayes
From: Troy Settle [EMAIL PROTECTED]
Date: Tue, 18 Nov 2008 15:19:56 -0500

> Kai Schaetzl wrote:
>
> > Troy Settle wrote on Mon, 17 Nov 2008 13:33:10 -0500:
> >
> > > I'm having a major problem with the bayes system. I cleared the bayes database and let it start re-learning. Once it kicked in, I again started getting false hits with BAYES_00=-2.599 on a great many spam/uce messages.

How many, and what percentage, of your spam messages are getting BAYES_00? A few spam messages getting BAYES_00/05/20 is ok. If you are getting a large percentage of spam hitting BAYES_00 then you have some sort of problem with the messages that are being learned. Most likely you are (auto)learning spam messages as ham. Any mistakes made in learning need to be corrected by relearning those messages. Any spam message that has autolearn=ham has to be relearned as spam. Or perhaps you are not learning from enough spam messages.

For spam messages getting BAYES_00, what do you get for the following:

spamassassin -D --test-mode --debug all,bayes msg.txt 2>&1 | grep bayes:

Which spammy looking tokens have low values?

> > How did you let it start re-learning? What's the output of sa-learn dump magic?
>
> From incoming mail. I'm still working on building a corpus suitable for sa-learn.
>
> $ sa-learn --dump magic
> 0.000 0 3 0 non-token data: bayes db version
> 0.000 0 44946 0 non-token data: nspam
> 0.000 0 36757 0 non-token data: nham
> 0.000 0 545675 0 non-token data: ntokens
> ...

You should probably increase the size of the Bayes database, eg

bayes_expiry_max_db_size 200

> FWIW, how bad would I screw things up if I were to override the BAYES_00 score to 0?

With proper training this should not be necessary. Also, 0 would disable the test, so you won't get any BAYES_00 hits. A small temporary non-zero score would be better so you can continue to track the problem.

-jeff
Re: Help with bayes
Troy Settle wrote on Tue, 18 Nov 2008 15:19:56 -0500:

> From incoming mail.

well, but how? By auto-learning? In that case you are just multiplying your problem. It seems a lot of spam gets miscategorized as ham. Auto-learning that spam as ham means reinforcing this miscategorization and that's what you see as a result.

> 0.000 0 44946 0 non-token data: nspam
> 0.000 0 36757 0 non-token data: nham
> 0.000 0 545675 0 non-token data: ntokens

looking fine if the ham tokens were really ham.

> 0.000 0 1227007705 0 non-token data: last expiry atime
> 0.000 0 393274 0 non-token data: last expire reduction count

Hm, you just did an expire that slashed your db almost in half? You may want to let it grow a bit.

> FWIW, how bad would I screw things up if I were to override the BAYES_00 score to 0?

As it is causing you grief now, probably not much. It means that real ham that also gets detected as Bayes_00 will not enjoy the benefits of this negative score. Maybe switching Bayes off for a while is better.

I would start over with that db.

1. stop Bayes and check how the categorization works without Bayes; in theory you should already have a good number of miscategorized spam (as ham) without Bayes.
2. collect some ham and spam where you can be absolutely sure that they are in the right category and then train Bayes with these. Stop autolearning for bayes for a while.
3. switch it on with your new db and check if Bayes seems to categorize better now
4. if it does then switch auto-learning on, but move the auto-learning threshold for ham a bit down, so that the chance of spam creeping in is smaller.

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
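The "almost in half" remark can be checked against the dump Troy posted earlier in the thread. The Python sketch below is only an illustration: `parse_dump_magic` is a hypothetical helper, it assumes the value of interest is the third column of each "non-token data" line, and it treats the current ntokens as the number of tokens left after the expiry, which ignores anything learned since.

```python
from datetime import datetime, timezone

# Output of "sa-learn --dump magic" as posted in the thread.
DUMP = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 44946 0 non-token data: nspam
0.000 0 36757 0 non-token data: nham
0.000 0 545675 0 non-token data: ntokens
0.000 0 1226964376 0 non-token data: oldest atime
0.000 0 1227033356 0 non-token data: newest atime
0.000 0 1227033315 0 non-token data: last journal sync atime
0.000 0 1227007705 0 non-token data: last expiry atime
0.000 0 43200 0 non-token data: last expire atime delta
0.000 0 393274 0 non-token data: last expire reduction count
"""


def parse_dump_magic(text: str) -> dict:
    """Map each 'non-token data' label to its integer value (third column)."""
    values = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, label = line.split("non-token data:", 1)
        values[label.strip()] = int(fields.split()[2])
    return values


magic = parse_dump_magic(DUMP)
removed = magic["last expire reduction count"]
remaining = magic["ntokens"]  # assumption: roughly the post-expiry size
fraction_removed = removed / (removed + remaining)
last_expiry = datetime.fromtimestamp(magic["last expiry atime"], tz=timezone.utc)

print(f"nspam={magic['nspam']} nham={magic['nham']}")
print(f"last expiry removed about {fraction_removed:.0%} of the tokens")
print(f"last expiry ran at {last_expiry:%Y-%m-%d %H:%M} UTC")
```

Under those assumptions the expiry removed a bit over 40% of the tokens, which fits the description above.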
Re: Help with bayes
Kai Schaetzl wrote:

> well, but how? By auto-learning? In that case you are just multiplying your problem. It seems a lot of spam gets miscategorized as ham. Auto-learning that spam as ham means enforcing this miscategorization and that's what you see as a result.

When SpamAssassin decides whether or not to learn a message, it does not take Bayes scores into account. So if you have a message that only hits BAYES_00 (with a score of either -2.3 or -2.6) and another rule with a score of 0.2, that message will not be learnt (unless you change the limits), because 0.2 is greater than 0.1 (the limit).

Hope this helps,

James.

--
E-mail: james@          | ‘Sir, they’ve taken Mr. Rimmer!’
aprilcottage.co.uk      | ‘Quick, let’s get out of here before they bring him
                        | back!’
                        |     -- Kryten and Cat, ‘Red Dwarf’
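James's example can be worked through in a few lines of Python. This is a sketch of the arithmetic only, not SpamAssassin's implementation: the rule name SOME_OTHER_RULE is made up, the 0.1 ham limit is the default quoted in the thread, and treating a score exactly at the limit as "not learned" is an assumption.

```python
HAM_THRESHOLD = 0.1  # default ham limit quoted in the thread


def learning_score(hits: dict) -> float:
    """Sum the rule scores while ignoring every BAYES_* rule."""
    return sum(score for rule, score in hits.items()
               if not rule.startswith("BAYES_"))


# A message that hits BAYES_00 plus one small positive rule.
hits = {"BAYES_00": -2.6, "SOME_OTHER_RULE": 0.2}

final = sum(hits.values())      # what ends up in the spam report
learn = learning_score(hits)    # what the auto-learner looks at
would_learn_as_ham = learn < HAM_THRESHOLD  # boundary handling assumed

print(f"final score {final:.1f}, learning score {learn:.1f}")
print("auto-learn as ham?", would_learn_as_ham)
```

The final score is clearly negative, yet the learning score of 0.2 sits above the limit, so the message is left alone.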
Re: Help with bayes
James Wilkinson wrote on Tue, 18 Nov 2008 21:56:34 +:

> > well, but how? By auto-learning? In that case you are just multiplying your problem. It seems a lot of spam gets miscategorized as ham. Auto-learning that spam as ham means enforcing this miscategorization and that's what you see as a result.
>
> When SpamAssassin decides whether or not to learn a message, it does not take Bayes scores into account. So if you have a message that only hits BAYES_00 (with a score of either -2.3 or -2.6) and another rule with a score of 0.2, that message will not be learnt (unless you change the limits), because 0.2 is greater than 0.1 (the limit).

Very well, but that doesn't affect anything I wrote. ;-) I think you misunderstood my explanation.

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
Help with bayes
I'm having a major problem with the bayes system. I cleared the bayes database and let it start re-learning. Once it kicked in, I again started getting false hits with BAYES_00=-2.599 on a great many spam/uce messages. Can someone point me to some good reading material to better understand why this is happening, and how to prevent it? SA is running under a single user site-wide (about 2500 mailboxes total). Is this screwing things up for me? Would I have better results if I were to run SA for each user separately? Thanks, -- Troy Settle Pulaski Networks 866.477.5638
Re: Help with bayes
Troy Settle wrote on Mon, 17 Nov 2008 13:33:10 -0500:

> I'm having a major problem with the bayes system. I cleared the bayes database and let it start re-learning. Once it kicked in, I again started getting false hits with BAYES_00=-2.599 on a great many spam/uce messages.

How did you let it start re-learning? What's the output of sa-learn dump magic?

> SA is running under a single user site-wide (about 2500 mailboxes total). Is this screwing things up for me? Would I have better results if I were to run SA for each user separately?

If your users each get enough mail to produce enough Bayes tokens, maybe.

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
Help with BAYES + MYSQL
Hope someone can help me!!

I am using SpamAssassin 3.0.1 (users stored in MySQL, with vpopmail and qmail).
Slackware version 10.2
MySQL version 5.0

My problem: when I use the standard Bayes configuration (.db files on the hard drive), everything works fine. When I change to Bayes in MySQL, strange things happen. The first time I receive an email, the system gives me a message that the user does not exist in the Bayes tables, but in the end it uses learn mode and everything is OK. After this, when the same user starts to receive email, the system dies with a Segmentation fault message (see debug example 1). And after that a --lit debug example. Any idea what is happening?

## debug example 1 ###

Thu Mar 2 14:01:24 2006 [17152] dbg: prefork: ordered 17156 to accept
Thu Mar 2 14:01:24 2006 [17152] dbg: prefork: sysread(6) not ready, wait max 300 secs
Thu Mar 2 14:01:24 2006 [17156] info: spamd: connection from localhost [127.0.0.1] at port 48847
Thu Mar 2 14:01:24 2006 [17152] dbg: prefork: child 17156: entering state 2
Thu Mar 2 14:01:24 2006 [17152] dbg: prefork: new lowest idle kid: 17157
Thu Mar 2 14:01:24 2006 [17156] info: spamd: handle_user unable to find user: [EMAIL PROTECTED]
Thu Mar 2 14:01:24 2006 [17156] dbg: config: Conf::SQL: executing SQL: select preference, value from userpref where username = '[EMAIL PROTECTED]' or username = '@GLOBAL' order by username asc
Thu Mar 2 14:01:24 2006 [17156] dbg: config: retrieving prefs for [EMAIL PROTECTED] from SQL server
Thu Mar 2 14:01:24 2006 [17156] dbg: info: user has changed
Thu Mar 2 14:01:24 2006 [17156] dbg: bayes: using username: [EMAIL PROTECTED]
Thu Mar 2 14:01:24 2006 [17156] dbg: bayes: database connection established
Thu Mar 2 14:01:24 2006 [17156] dbg: bayes: found bayes db version 3
Thu Mar 2 14:01:24 2006 [17156] dbg: bayes: Using userid: 4
Thu Mar 2 14:01:24 2006 [17156] dbg: bayes: not available for scanning, only 0 spam(s) in bayes DB < 10
Thu Mar 2 14:01:24 2006 [17156] dbg: config: score set 1 chosen.
Thu Mar 2 14:01:24 2006 [17156] dbg: dns: name server: 192.168.100.105, LocalAddr: 0.0.0.0
Thu Mar 2 14:01:24 2006 [17156] info: spamd: processing message [EMAIL PROTECTED] for [EMAIL PROTECTED]:0
Thu Mar 2 14:01:24 2006 [17156] dbg: bayes: database connection established
Thu Mar 2 14:01:24 2006 [17156] dbg: bayes: found bayes db version 3
Thu Mar 2 14:01:24 2006 [17156] dbg: bayes: Using userid: 4
Thu Mar 2 14:01:24 2006 [17156] dbg: bayes: not available for scanning, only 0 spam(s) in bayes DB < 10
Thu Mar 2 14:01:24 2006 [17156] dbg: received-header: parsed as [ ip=209.73.178.172 rdns=web60524.mail.yahoo.com helo=web60524.mail.yahoo.com by=nisyros.psmi.com.br ident= envfrom= intl=0 id= auth= ]
Thu Mar 2 14:01:24 2006 [17156] dbg: dns: looking up A records for 'nisyros.psmi.com.br'
Thu Mar 2 14:01:24 2006 [17156] dbg: dns: A records for 'nisyros.psmi.com.br': 201.64.97.21 201.64.97.21 201.64.97.21
Thu Mar 2 14:01:24 2006 [17156] dbg: dns: looking up A records for 'nisyros.psmi.com.br'
Thu Mar 2 14:01:24 2006 [17156] dbg: dns: A records for 'nisyros.psmi.com.br': 201.64.97.21 201.64.97.21 201.64.97.21
Thu Mar 2 14:01:24 2006 [17156] dbg: received-header: 'by' nisyros.psmi.com.br has public IP 201.64.97.21
Thu Mar 2 14:01:24 2006 [17156] dbg: received-header: relay 209.73.178.172 trusted? no internal? no
Thu Mar 2 14:01:24 2006 [17156] dbg: dns: looking up PTR record for '201.64.97.17'
Thu Mar 2 14:01:24 2006 [17156] dbg: dns: PTR for '201.64.97.17': 'nagios.psmi.com.br'
Thu Mar 2 14:01:24 2006 [17156] dbg: received-header: parsed as [ ip=201.64.97.17 rdns=nagios.psmi.com.br helo= by=web60524.mail.yahoo.com ident= envfrom= intl=0 id= auth= ]
Thu Mar 2 14:01:24 2006 [17156] dbg: received-header: relay 201.64.97.17 trusted? no internal? no
Thu Mar 2 14:01:24 2006 [17156] dbg: metadata: X-Spam-Relays-Trusted:
Thu Mar 2 14:01:24 2006 [17156] dbg: metadata: X-Spam-Relays-Untrusted: [ ip=209.73.178.172 rdns=web60524.mail.yahoo.com helo=web60524.mail.yahoo.com by=nisyros.psmi.com.br ident= envfrom= intl=0 id= auth= ] [ ip=201.64.97.17 rdns=nagios.psmi.com.br helo= by=web60524.mail.yahoo.com ident= envfrom= intl=0 id= auth= ]
Thu Mar 2 14:01:24 2006 [17156] dbg: message: MIME PARSER START
Thu Mar 2 14:01:24 2006 [17156] dbg: message: main message type: multipart/alternative
Thu Mar 2 14:01:24 2006 [17156] dbg: message: parsing multipart, got boundary: 0-1994749066-1141318697=:2926
Thu Mar 2 14:01:24 2006 [17156] dbg: message: found part of type text/plain, boundary: 0-1994749066-1141318697=:2926
Thu Mar 2 14:01:24 2006 [17156] dbg: message: parsing normal part
Thu Mar 2 14:01:24 2006 [17156] dbg: message: added part, type: text/plain
Thu Mar 2 14:01:24 2006 [17156] dbg: message: found part of type text/html, boundary: 0-1994749066-1141318697=:2926
Thu Mar 2 14:01:24 2006
Help with bayes configuration
Hello,

I installed spamassassin a month ago with the bayes auto learn option, but there is still 60% of spam that is not detected. In my bayes db there was nothing ...

[EMAIL PROTECTED] root]# sa-learn --dump magic
0.000 0 3 0 non-token data: bayes db version
0.000 0 0 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 0 0 non-token data: ntokens
0.000 0 0 0 non-token data: oldest atime
0.000 0 0 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count

Today I rebuilt the db but it still does not seem to be working ...

[EMAIL PROTECTED] root]# sa-learn --ham --no-rebuild /etc/mail/spamassassin/
The --no-rebuild option has been deprecated. Please use --no-sync instead.
Learned from 5 message(s) (5 message(s) examined).
[EMAIL PROTECTED] root]# sa-learn --spam --no-rebuild /etc/mail/spamassassin/
The --no-rebuild option has been deprecated. Please use --no-sync instead.
Learned from 6 message(s) (6 message(s) examined).
[EMAIL PROTECTED] root]#
[EMAIL PROTECTED] root]# sa-learn --rebuild
The --rebuild option has been deprecated. Please use --sync instead.
synced Bayes databases from journal in 0 seconds: 549 unique entries (549 total entries)
[EMAIL PROTECTED] root]#
[EMAIL PROTECTED] spamassassin]# sa-learn --dump magic
0.000 0 3 0 non-token data: bayes db version
0.000 0 6 0 non-token data: nspam
0.000 0 5 0 non-token data: nham
0.000 0 337 0 non-token data: ntokens
0.000 0 1132317402 0 non-token data: oldest atime
0.000 0 1132317451 0 non-token data: newest atime
0.000 0 1132317466 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count

My spamassassin configuration:

[EMAIL PROTECTED] spamassassin]# cat local.cf
# This is the right place to customize your installation of SpamAssassin.
#
# See 'perldoc Mail::SpamAssassin::Conf' for details of what can be
# tweaked.
#
# rewrite_header Subject *SPAM*
# report_safe 1
trusted_networks 10. 192.168.1.
# lock_method flock

# Scoring
required_score 5
# Score for a spam probability between 50 and 60%:
score DCC_CHECK 4.000
score RAZOR2_CHECK 2.500

score BAYES_60 3
# Score for a probability between 60 and 70%
score BAYES_70 4
score BAYES_80 4.8
score BAYES_95 5
score BAYES_99 6

#user_scores_dsn DBI:mysql:spamassassin:127.0.0.1
#user_scores_sql_username spamassassin
#user_scores_sql_password password

# Encapsulation ?
report_safe 0

dns_available yes

# Settings bayes
bayes_path /etc/mail/spamassassin/
bayes_file_mode 0777
use_auto_whitelist 1
auto_whitelist_path /etc/mail/spamassassin/whitelist

use_bayes 1
use_bayes_rules 1
bayes_auto_learn 1
bayes_auto_learn_threshold_spam 25
bayes_auto_learn_threshold_nonspam -5
bayes_min_ham_num 60
bayes_min_spam_num 100

#required_hits 2.6
rewrite_subject 1
subject_tag *SPAM*

# Enable or disable network checks
skip_rbl_checks 0
use_razor2 1
use_dcc 1
use_pyzor 0

# Mail using languages used in these country codes will not be marked
# as being possibly spam in a foreign language.
# - english french

# Mail using locales used in these country codes will not be marked
# as being possibly spam in a foreign language.
[EMAIL PROTECTED] spamassassin]#
Help with Bayes auto-learn
I would like to enable the Bayes system with auto-learning. I thought that I had my config set up correctly but apparently I don't. My config looks like this:

##
# How we want to modify the email
rewrite_header subject [**SPAM**]
report_safe 0
#Bayes learning system
use_bayes 1
bayes_auto_learn 1
# Define the sensitivity level. Standard level is 5.
required_hits 6.8
# Enable SpamAssassin's RBL checking features :
skip_rbl_checks 0
rbl_timeout 3
num_check_received 3
score RCVD_IN_BL_SPAMCOP_NET 3
report_header 1
use_terse_report 1
##

so I thought from the reading in the FAQ and on the wiki that this would enable bayes, and turn on its auto_learn for spam that hits higher than the default of 12. But in my logs I end up with this:

2005-05-12 23:30:33.240563500 2005-05-13 06:30:33 [88906] i: connection from localhost.whootis.com [127.0.0.1] at port 4737
2005-05-12 23:30:33.333094500 2005-05-13 06:30:33 [88906] i: processing message [EMAIL PROTECTED] for qmaild:10004.
2005-05-12 23:30:33.431814500 2005-05-13 06:30:33 [88906] i: identified spam (23.2/6.8) for qmaild:10004 in 0.2 seconds, 1311 bytes.
2005-05-12 23:30:33.432514500 2005-05-13 06:30:33 [88906] i: result: Y 23 - BAYES_99,FORGED_MUA_THEBAT_BOUN,FORGED_THEBAT_HTML,FORGED_YAHOO_RCVD,HEAD_ILLEGAL_CHARS,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,MSGID_RANDY,NORMAL_HTTP_TO_IP,RCVD_BY_IP,RCVD_DOUBLE_IP_LOOSE,RCVD_HELO_IP_MISMATCH,RCVD_NUMERIC_HELO,SUBJ_ILLEGAL_CHARS scantime=0.2,size=1311,mid=[EMAIL PROTECTED],bayes=0.999,autolearn=no

Does the autolearn=no mean that this message has not been submitted to bayes for auto-learn? And if not, can someone steer me in the right direction for getting my config set up correctly?

Thanks very much,
Geoff Sweet
RE: Help with Bayes auto-learn
I can swear I saw this question in at least 20 different messages, not to mention the website. I really recommend you research your question before asking it.

autolearn=no means that it didn't 'learn' this message. Other possible states are 'spam', 'ham' and ... 'DISABLED'. If autolearn were disabled, you would see this last one.

> I would like to enable the Bayes system with auto-learning. I thought that I had my config setup correctly but apparently I don't. My config looks like this:
>
> ##
> # How we want to modify the email
> rewrite_header subject [**SPAM**]
> report_safe 0
> #Bayes learning system
> use_bayes 1
> bayes_auto_learn 1
> # Define the sensitivity level. Standard level is 5.
> required_hits 6.8
> # Enable SpamAssassin's RBL checking features :
> skip_rbl_checks 0
> rbl_timeout 3
> num_check_received 3
> score RCVD_IN_BL_SPAMCOP_NET 3
> report_header 1
> use_terse_report 1
> ##
>
> so I thought from the reading in the FAQ and on the wiki that this would enable bayes, and turn on its auto_learn for spam that hits higher then the default of 12. But in my logs I end up with this:
>
> 2005-05-12 23:30:33.240563500 2005-05-13 06:30:33 [88906] i: connection from localhost.whootis.com [127.0.0.1] at port 4737
> 2005-05-12 23:30:33.333094500 2005-05-13 06:30:33 [88906] i: processing message [EMAIL PROTECTED] for qmaild:10004.
> 2005-05-12 23:30:33.431814500 2005-05-13 06:30:33 [88906] i: identified spam (23.2/6.8) for qmaild:10004 in 0.2 seconds, 1311 bytes.
> 2005-05-12 23:30:33.432514500 2005-05-13 06:30:33 [88906] i: result: Y 23 - BAYES_99,FORGED_MUA_THEBAT_BOUN,FORGED_THEBAT_HTML,FORGED_YAHOO_RCVD,HEAD_ILLEGAL_CHARS,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,MSGID_RANDY,NORMAL_HTTP_TO_IP,RCVD_BY_IP,RCVD_DOUBLE_IP_LOOSE,RCVD_HELO_IP_MISMATCH,RCVD_NUMERIC_HELO,SUBJ_ILLEGAL_CHARS scantime=0.2,size=1311,mid=[EMAIL PROTECTED],bayes=0.999,autolearn=no
>
> Does the autolearn=no mean that this message has not been submitted to bayes for auto-learn? And if not, can someone steer me in the right direction for getting my config setup correctly?
>
> Thanks very much,
> Geoff Sweet
Re: Help with Bayes auto-learn
In an older episode (Friday 13 May 2005 08:38), Geoff Sweet wrote:

> I would like to enable the Bayes system with auto-learning. I thought that I had my config setup correctly but apparently I don't. My config looks like this:
>
> ##
> # How we want to modify the email
> rewrite_header subject [**SPAM**]
> report_safe 0
> #Bayes learning system
> use_bayes 1
> bayes_auto_learn 1

In an older episode (Friday 13 May 2005 10:17), George Breahna wrote:

> I really recommend you research your question before asking it.

good point, anyway:

man Mail::SpamAssassin::Conf

and

http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_Conf.html

would tell you:

bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
    To be accurate, the Bayes system does not activate until a certain number of ham (non-spam) and spam have been learned. The default is 200 of each ham and spam, but you can tune these up or down with these two settings.

for information on how to learn the needed amount of mail, see

man sa-learn

regards,
wolfgang
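The activation rule quoted from the documentation can be expressed as a small check. The Python below is a sketch, not SpamAssassin code: `bayes_is_active` is a hypothetical helper, the 200/200 defaults come from the documentation excerpt above, and treating "exactly at the minimum" as sufficient is an assumption.

```python
def bayes_is_active(nspam: int, nham: int,
                    min_spam_num: int = 200,
                    min_ham_num: int = 200) -> bool:
    """True once enough spam AND enough ham have been learned.

    nspam and nham correspond to the counters printed by
    "sa-learn --dump magic"; the minimums mirror bayes_min_spam_num
    and bayes_min_ham_num.
    """
    return nspam >= min_spam_num and nham >= min_ham_num


# Plenty of spam learned, but too little ham: Bayes stays inactive.
print(bayes_is_active(nspam=1000, nham=150))

# Both counters at or above the defaults: Bayes rules can fire.
print(bayes_is_active(nspam=250, nham=200))

# Lowered minimums, but only a handful of messages learned so far.
print(bayes_is_active(nspam=6, nham=5, min_spam_num=100, min_ham_num=60))
```

Both counters have to clear their own minimum; a large spam corpus does not compensate for a missing ham corpus, or the other way round.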
Re: Help with Bayes auto-learn
Yes, but his result lists BAYES_99 as one of the rules hit, which means bayes is active, which means it has been fed the necessary 200 spam and 200 ham. If it hadn't been fed the necessary spam and ham, the message would not have been given a BAYES score at all. The fact that the mail was not autolearned could mean that it did not fall within the autolearn range OR that an identical message had already been learned. With a score like BAYES_99, it is probably the latter.

wolfgang [EMAIL PROTECTED] 5/13/2005 4:38 AM

> In an older episode (Friday 13 May 2005 08:38), Geoff Sweet wrote:
>
> > I would like to enable the Bayes system with auto-learning. I thought that I had my config setup correctly but apparently I don't. My config looks like this:
> >
> > ##
> > # How we want to modify the email
> > rewrite_header subject [**SPAM**]
> > report_safe 0
> > #Bayes learning system
> > use_bayes 1
> > bayes_auto_learn 1
>
> In an older episode (Friday 13 May 2005 10:17), George Breahna wrote:
>
> > I really recommend you research your question before asking it.
>
> good point, anyway:
>
> man Mail::SpamAssassin::Conf
>
> and
>
> http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_Conf.html
>
> would tell you:
>
> bayes_min_ham_num (Default: 200)
> bayes_min_spam_num (Default: 200)
>     To be accurate, the Bayes system does not activate until a certain number of ham (non-spam) and spam have been learned. The default is 200 of each ham and spam, but you can tune these up or down with these two settings.
>
> for information how to learn the needed amount of mails, see
>
> man sa-learn
>
> regards,
> wolfgang
Re: Help with Bayes auto-learn
In an older episode (Friday 13 May 2005 12:26), Joe Zitnik wrote:

> Yes, but his scoring list BAYES_99 as one of the scores, which means bayes is active, which means it has been fed the necessary 200 spam and 200 ham. If it hadn't been fed the necessary spam and ham, it would not have been given a BAYES score at all.

thanks for pointing that out, i had missed that.

wolfgang
Re: Help with Bayes auto-learn
At 02:38 AM 5/13/2005, Geoff Sweet wrote:

> 2005-05-12 23:30:33.432514500 2005-05-13 06:30:33 [88906] i: result: Y 23 - BAYES_99,FORGED_MUA_THEBAT_BOUN,FORGED_THEBAT_HTML,FORGED_YAHOO_RCVD,HEAD_ILLEGAL_CHARS,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,MSGID_RANDY,NORMAL_HTTP_TO_IP,RCVD_BY_IP,RCVD_DOUBLE_IP_LOOSE,RCVD_HELO_IP_MISMATCH,RCVD_NUMERIC_HELO,SUBJ_ILLEGAL_CHARS scantime=0.2,size=1311,mid=[EMAIL PROTECTED],bayes=0.999,autolearn=no
>
> Does the autolearn=no mean that this message has not been submitted to bayes for auto-learn? And if not, can someone steer me in the right direction for getting my config setup correctly?

First, I'm assuming you're using SA 3.0.0 or higher. If not, please specify the version and I'll correct my message (some of the details differ).

That does mean the message was not autolearned. However, it does not mean that no messages will be autolearned. In SA 3.0, if autolearning was disabled or failing, you would have seen disabled or failed, not no.

The requirements for autolearning are considerably more complex than just "total score over xx". The following things have to happen.

Note: ALL scores referenced below are the learning score. The learning score is NOT the same as the final spam score. It is the score recalculated as if bayes was disabled, *including* changing scoreset. Also, AWL, whitelist, and blacklist rules don't count towards this score.

1) total learning score over bayes_auto_learn_threshold_spam (default 12)
2) learning score of header rules must be over 3.0
3) learning score of body rules must be over 3.0
4) existing bayes learning must not be strongly ham (ie: don't learn as spam anything that would otherwise get bayes_00'ed)
5) From addresses (including Return-Path, etc) must not match a bayes_ignore_from statement
6) To addresses (including Cc, etc) must not match a bayes_ignore_to statement
7) The bayes DB must not be locked by some other SA process (another learner, expiry, etc). Note: this test results in autolearn=failed.

See also: http://wiki.apache.org/spamassassin/AutolearningNotWorking
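The seven conditions listed in this message can be sketched as a single decision function. The Python below is a simplified model for illustration, not SpamAssassin's implementation: `LearningScores` and `autolearn_spam_status` are hypothetical names, the inputs are assumed to be learning scores already recalculated without Bayes, "over" is read as strictly greater than, and the order in which the checks run is a guess.

```python
from dataclasses import dataclass


@dataclass
class LearningScores:
    """Scores recalculated as if Bayes were disabled (the learning score)."""
    total: float
    header: float
    body: float


def autolearn_spam_status(scores: LearningScores,
                          threshold_spam: float = 12.0,
                          bayes_strongly_ham: bool = False,
                          address_ignored: bool = False,
                          db_locked: bool = False) -> str:
    """Return "spam", "no" or "failed" for one message."""
    # Conditions 5 and 6: an ignored sender or recipient blocks learning.
    if address_ignored:
        return "no"
    # Conditions 1-3: total, header and body learning scores.
    if scores.total <= threshold_spam:
        return "no"
    if scores.header <= 3.0 or scores.body <= 3.0:
        return "no"
    # Condition 4: never learn as spam what Bayes already rates strongly ham.
    if bayes_strongly_ham:
        return "no"
    # Condition 7: a locked database is reported as a failure, not a refusal.
    if db_locked:
        return "failed"
    return "spam"


print(autolearn_spam_status(LearningScores(total=15.0, header=4.0, body=5.0)))
print(autolearn_spam_status(LearningScores(total=15.0, header=2.0, body=9.0)))
print(autolearn_spam_status(LearningScores(total=15.0, header=4.0, body=5.0),
                            db_locked=True))
```

A message can therefore show a high final score and still come back with autolearn=no, for instance when most of its points come from header rules and the body rules contribute less than the required 3.0.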
Re: Need help with Bayes DB
Also make sure that you updated the DB format if you moved from 2.6x to 3.0.1. Maybe Bayes is turned on, but every time it tastes the DB it doesn't like the format. Loren
Need help with Bayes DB
Hello:

I am running SpamAssassin 3.0.1 and have been having a problem. When I run sa-learn -D --dump magic I get the following output.

0.000 0 3 0 non-token data: bayes db version
0.000 0 1084 0 non-token data: nspam
0.000 0 1361 0 non-token data: nham
0.000 0 109079 0 non-token data: ntokens
0.000 0 1078967175 0 non-token data: oldest atime
0.000 0 1103809663 0 non-token data: newest atime
0.000 0 1103809671 0 non-token data: last journal sync atime
0.000 0 1103808704 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count

Now the problem is that the numbers for nspam, nham, ntokens, and oldest atime NEVER change. What could be the cause for this? How could I fix this problem?

Thanks in advance,
Ronald Vazquez
Re: Need help with Bayes DB
Check to make sure that you don't have a phantom local.cf somewhere that's pointing SA to the wrong directory for bayes. For instance, see if you have both a /etc/spamassassin and /etc/mail/spamassassin folder. Make sure it's not putting a new bayes database in your user/.spamassassin directory. Make sure that you have appropriate rights to the bayes folder and files. I've been using chmod 666.

RO

SpamAssassin User wrote:

> Hello:
>
> I am running SpamAssassin 3.0.1 and have been having a problem. When I run sa-learn -D --dump magic I get the following output.
>
> 0.000 0 3 0 non-token data: bayes db version
> 0.000 0 1084 0 non-token data: nspam
> 0.000 0 1361 0 non-token data: nham
> 0.000 0 109079 0 non-token data: ntokens
> 0.000 0 1078967175 0 non-token data: oldest atime
> 0.000 0 1103809663 0 non-token data: newest atime
> 0.000 0 1103809671 0 non-token data: last journal sync atime
> 0.000 0 1103808704 0 non-token data: last expiry atime
> 0.000 0 0 0 non-token data: last expire atime delta
> 0.000 0 0 0 non-token data: last expire reduction count
>
> Now the problem is that the numbers for nspam, nham, ntokens, and oldest atime NEVER change. What could be the cause for this? How could I fix this problem?
>
> Thanks in advance,
> Ronald Vazquez