Help with Bayes auto-learn
I would like to enable the Bayes system with auto-learning. I thought that I had my config set up correctly, but apparently I don't. My config looks like this:

    ##
    # How we want to modify the email
    rewrite_header subject [**SPAM**]
    report_safe 0

    #Bayes learning system
    use_bayes 1
    bayes_auto_learn 1

    # Define the sensitivity level. Standard level is 5.
    required_hits 6.8

    # Enable SpamAssassin's RBL checking features :
    skip_rbl_checks 0
    rbl_timeout 3
    num_check_received 3
    score RCVD_IN_BL_SPAMCOP_NET 3
    report_header 1
    use_terse_report 1
    ##

From my reading of the FAQ and the wiki, I thought this would enable Bayes and turn on its auto-learn for spam that scores higher than the default of 12. But in my logs I end up with this:

    2005-05-12 23:30:33.240563500 2005-05-13 06:30:33 [88906] i: connection from localhost.whootis.com [127.0.0.1] at port 4737
    2005-05-12 23:30:33.333094500 2005-05-13 06:30:33 [88906] i: processing message [EMAIL PROTECTED] for qmaild:10004.
    2005-05-12 23:30:33.431814500 2005-05-13 06:30:33 [88906] i: identified spam (23.2/6.8) for qmaild:10004 in 0.2 seconds, 1311 bytes.
    2005-05-12 23:30:33.432514500 2005-05-13 06:30:33 [88906] i: result: Y 23 - BAYES_99,FORGED_MUA_THEBAT_BOUN,FORGED_THEBAT_HTML,FORGED_YAHOO_RCVD,HEAD_ILLEGAL_CHARS,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,MSGID_RANDY,NORMAL_HTTP_TO_IP,RCVD_BY_IP,RCVD_DOUBLE_IP_LOOSE,RCVD_HELO_IP_MISMATCH,RCVD_NUMERIC_HELO,SUBJ_ILLEGAL_CHARS scantime=0.2,size=1311,mid=[EMAIL PROTECTED],bayes=0.999,autolearn=no

Does the autolearn=no mean that this message has not been submitted to Bayes for auto-learn? And if so, can someone steer me in the right direction for getting my config set up correctly?

Thanks very much,
Geoff Sweet
RE: Help with Bayes auto-learn
I could swear I have seen this question in at least 20 different messages, not to mention on the website. I really recommend you research your question before asking it.

autolearn=no means that it didn't 'learn' this message. Other possible states are 'spam', 'ham' and 'DISABLED'. If autolearn were disabled, you would see this last one.
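For anyone who wants to check their own logs, here is a rough Python sketch that pulls the autolearn state out of a spamd result line. It assumes the log format shown in this thread; the function name autolearn_state is made up for illustration and is not part of SpamAssassin.

```python
import re

# Matches the "autolearn=<state>" field of a spamd "result:" log line.
# Assumption: the field looks like the one in the log excerpt in this thread.
AUTOLEARN_RE = re.compile(r"autolearn=(\w+)")


def autolearn_state(log_line):
    """Return the lowercased autolearn state from a log line, or None if absent."""
    match = AUTOLEARN_RE.search(log_line)
    return match.group(1).lower() if match else None


line = "scantime=0.2,size=1311,mid=[EMAIL PROTECTED],bayes=0.999,autolearn=no"
print(autolearn_state(line))  # no
```

A result of "no" only says that this one message was not learned; it does not say that autolearning is switched off.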
Re: Help with Bayes auto-learn
In an older episode (Friday 13 May 2005 08:38), Geoff Sweet wrote:
> I would like to enable the Bayes system with auto-learning. I thought that I had my config setup correctly but apparently I don't. My config looks like this:
> ##
> # How we want to modify the email
> rewrite_header subject [**SPAM**]
> report_safe 0
> #Bayes learning system
> use_bayes 1
> bayes_auto_learn 1

In an older episode (Friday 13 May 2005 10:17), George Breahna wrote:
> I really recommend you research your question before asking it.

good point, anyway:

man Mail::SpamAssassin::Conf
and
http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_Conf.html
would tell you:

bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
    To be accurate, the Bayes system does not activate until a certain number of ham (non-spam) and spam have been learned. The default is 200 of each ham and spam, but you can tune these up or down with these two settings.

for information on how to learn the needed amount of mail, see
man sa-learn

regards,
wolfgang
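The activation rule quoted from the man page boils down to a two-sided threshold check. A minimal Python sketch of that rule follows; the function name bayes_is_active is invented here, the defaults are the 200/200 values quoted above, and the learned counts are assumed to come from somewhere like the output of sa-learn --dump magic.

```python
def bayes_is_active(nspam, nham, min_spam=200, min_ham=200):
    """Return True once enough spam AND enough ham have been learned.

    min_spam and min_ham stand in for bayes_min_spam_num and
    bayes_min_ham_num; both minimums must be met, not just one.
    """
    return nspam >= min_spam and nham >= min_ham


# Plenty of spam learned, but too little ham: Bayes stays inactive.
print(bayes_is_active(nspam=950, nham=120))  # False
```

The point of the sketch is that a large spam corpus alone is not enough; the ham count has to reach its minimum as well.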
Re: Help with Bayes auto-learn
Yes, but his score lists BAYES_99 as one of the rules hit, which means Bayes is active, which means it has been fed the necessary 200 spam and 200 ham. If it hadn't been fed the necessary spam and ham, the message would not have been given a BAYES score at all. The fact that the mail was not autolearned could mean that it did not fall within the autolearn range OR that an identical message had already been learned. With a score like BAYES_99, it is probably the latter.
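The same check can be done mechanically on the rule list from a log line. This is only an illustrative Python sketch; bayes_rules_hit is a made-up helper name, and it assumes the comma-separated rule list format seen in the log earlier in this thread.

```python
def bayes_rules_hit(tests_field):
    """Return the BAYES_* rule names found in a comma-separated rule list."""
    names = [name.strip() for name in tests_field.split(",")]
    return [name for name in names if name.startswith("BAYES_")]


tests = "BAYES_99,FORGED_MUA_THEBAT_BOUN,HTML_MESSAGE,MSGID_RANDY"
print(bayes_rules_hit(tests))  # ['BAYES_99']
```

An empty list would suggest that Bayes did not score the message at all, which is the case the minimum ham and spam counts would explain.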
Re: Help with Bayes auto-learn
In an older episode (Friday 13 May 2005 12:26), Joe Zitnik wrote:
> Yes, but his scoring list BAYES_99 as one of the scores, which means bayes is active, which means it has been fed the necessary 200 spam and 200 ham. If it hadn't been fed the necessary spam and ham, it would not have been given a BAYES score at all.

thanks for pointing that out, i had missed that.

wolfgang
Re: Help with Bayes auto-learn
At 02:38 AM 5/13/2005, Geoff Sweet wrote:
> 2005-05-12 23:30:33.432514500 2005-05-13 06:30:33 [88906] i: result: Y 23 - BAYES_99,FORGED_MUA_THEBAT_BOUN,FORGED_THEBAT_HTML,FORGED_YAHOO_RCVD,HEAD_ILLEGAL_CHARS,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,MSGID_RANDY,NORMAL_HTTP_TO_IP,RCVD_BY_IP,RCVD_DOUBLE_IP_LOOSE,RCVD_HELO_IP_MISMATCH,RCVD_NUMERIC_HELO,SUBJ_ILLEGAL_CHARS scantime=0.2,size=1311,mid=[EMAIL PROTECTED],bayes=0.999,autolearn=no
>
> Does the autolearn=no mean that this message has not been submitted to bayes for auto-learn? And if not, can someone steer me in the right direction for getting my config setup correctly?

First, I'm assuming you're using SA 3.0.0 or higher. If not, please specify your version and I'll correct my message (some of the details differ).

That does mean the message was not autolearned. However, it does not mean that no messages will be autolearned. In SA 3.0, if autolearning were disabled or failing, you would have seen "disabled" or "failed", not "no".

The requirements for autolearning are considerably more complex than just "total score over xx". The following things have to happen:

Note: ALL scores referenced below are the learning score. The learning score is NOT the same as the final spam score. It is the score recalculated as if bayes was disabled, *including* changing scoreset. Also, AWL, whitelist, and blacklist rules don't count towards this score.

1) The total learning score must be over bayes_auto_learn_threshold_spam (default 12).
2) The learning score of header rules must be over 3.0.
3) The learning score of body rules must be over 3.0.
4) Existing bayes learning must not be strongly ham (i.e. don't learn as spam anything that would otherwise get bayes_00'ed).
5) From addresses (including Return-Path, etc.) must not match a bayes_ignore_from statement.
6) To addresses (including Cc, etc.) must not match a bayes_ignore_to statement.
7) The bayes DB must not be locked by some other SA process (another learner, expiry, etc.). Note: failing this test results in autolearn=failed.

See also: http://wiki.apache.org/spamassassin/AutolearningNotWorking
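To make the list above concrete, here is a rough Python sketch of the spam-side decision as described in this message. It is not SpamAssassin's actual code: the names LearningScores and autolearn_outcome, the flag parameters, and the example numbers are all invented for illustration, and the "over" comparisons simply follow the wording of the list.

```python
from dataclasses import dataclass


@dataclass
class LearningScores:
    """Learning scores (Bayes disabled, no AWL/whitelist/blacklist), not final scores."""
    total: float
    header: float
    body: float


def autolearn_outcome(scores, bayes_strongly_ham=False, ignored_address=False,
                      db_locked=False, threshold_spam=12.0):
    """Return 'spam', 'no', or 'failed' for a candidate spam message.

    threshold_spam stands in for bayes_auto_learn_threshold_spam.
    ignored_address covers a match on a From or To ignore setting.
    """
    if scores.total <= threshold_spam:
        return "no"      # total learning score not over the threshold
    if scores.header <= 3.0 or scores.body <= 3.0:
        return "no"      # header and body rules must each be over 3.0
    if bayes_strongly_ham or ignored_address:
        return "no"      # existing Bayes data or an ignore setting blocks learning
    if db_locked:
        return "failed"  # another process holds the Bayes DB lock
    return "spam"


# Hypothetical message: high total, but almost all of it from body rules.
example = LearningScores(total=14.0, header=2.1, body=11.9)
print(autolearn_outcome(example))  # no
```

The example shows how a message can be far above the total threshold and still come back as autolearn=no, because the header portion of its learning score is too low.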