Re: score=19.9 points, tflags=autolearn_force; = autolearn=no autolearn_force=no; WTF?
On 4/21/2015 11:48 PM, David B Funk wrote:
> I've got some home-grown rules that I trust to which have added
> tflags autolearn_force
> Recently I've seen some spam that hit those rules and racked up
> enough points that they should have auto-learned. But the scoring
> analysis explicitly says autolearn=no autolearn_force=no.
> What's going on here?

Different rules are categorized differently and you likely aren't
hitting the requirements:

  The score threshold above which a mail has to score, to be fed into
  SpamAssassin's learning systems automatically as a spam message.

  Note: SpamAssassin requires at least 3 points from the header, and
  3 points from the body to auto-learn as spam. Therefore, the minimum
  working value for this option is 6.

  If the test option autolearn_force is set, the minimum value will
  remain at 6 points but there is no requirement that the points come
  from body and header rules. This option is useful for autolearning
  with rules that are considered to be extremely safe indicators of
  the spaminess of a message.

> is the autolearn_force being ignored because of the initial BAYES_00
> score?
> Is there a 'autolearn_force_yes_I_really_mean_it' tflag that can be
> used to overcome that inhibition?

I'd run with debug and look for these debugs:

    dbg("learn: auto-learn: autolearn_force flagged for a rule. Removing seperate body and head point threshold. Body Only Points: $body_only_points ($required_body_points req'd) / Head Only Points: $head_only_points ($required_head_points req'd)");
    dbg("learn: auto-learn: autolearn_force flagged because of rule(s): $force_autolearn_names");
  } else {
    dbg("learn: auto-learn: autolearn_force not flagged for a rule. Body Only Points: $body_only_points ($required_body_points req'd) / Head Only Points: $head_only_points ($required_head_points req'd)");
  }

regards,
KAM
Re: score=19.9 points, tflags=autolearn_force; = autolearn=no autolearn_force=no; WTF?
On Tue, 21 Apr 2015 22:48:46 -0500 (CDT) David B Funk wrote:
> is the autolearn_force being ignored because of the initial BAYES_00
> score?

Yes, a Bayes point in the opposite direction prevents auto-training.
All the force flag does is override the 3+3 rule.

> Is there a 'autolearn_force_yes_I_really_mean_it' tflag that can be
> used to overcome that inhibition?

Not as such, but it is possible to get that behaviour by transferring
the score of BAYES_00 into two mutually exclusive meta-rules, one
marked learn, and the other noautolearn. The former will retain the
sanity-check and the latter won't.
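For readers following the thread, here is a rough Python model of the spam-side decision as the replies describe it: a score threshold, the 3+3 header/body requirement that autolearn_force lifts, and a Bayes sanity check that autolearn_force does not lift. It is only a sketch, not SpamAssassin's code. The class and function names, the order of the checks, the -1.0 cutoff for the Bayes veto and the header/body split of the example message are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScanSummary:
    points: float          # autolearn score (Bayes rules already excluded)
    head_points: float     # portion contributed by header rules
    body_points: float     # portion contributed by body rules
    learned_points: float  # points from the Bayes rules, e.g. BAYES_00 = -1.9
    force: bool            # a hit rule carries tflags autolearn_force


def spam_autolearn_verdict(scan, threshold=6.0, bayes_veto=-1.0):
    """Return 'spam' if the message would be auto-learned, else a reason."""
    if scan.points < threshold:
        return "no: below bayes_auto_learn_threshold_spam"
    # autolearn_force only lifts the 3 header + 3 body requirement.
    if not scan.force and (scan.head_points < 3.0 or scan.body_points < 3.0):
        return "no: fewer than 3 header or 3 body points"
    # Sanity check: Bayes pointing the other way blocks training.
    # The exact cutoff is an assumption of this sketch.
    if scan.learned_points < bayes_veto:
        return "no: Bayes disagrees"
    return "spam"


# Loosely based on the 19.9-point report in this thread (19.9 + 1.9 once
# BAYES_00 is left out). The head/body split is invented.
before = ScanSummary(points=21.8, head_points=1.1, body_points=20.7,
                     learned_points=-1.9, force=True)
# Same message with Bayes agreeing (BAYES_80 = 2.0).
after = ScanSummary(points=21.8, head_points=1.1, body_points=20.7,
                    learned_points=2.0, force=True)

print(spam_autolearn_verdict(before))
print(spam_autolearn_verdict(after))
```

In this model the forced rule gets the message past the 3+3 check either way; only the Bayes points decide the outcome, which is the behaviour RW describes.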
score=19.9 points, tflags=autolearn_force; = autolearn=no autolearn_force=no; WTF?
I've got some home-grown rules that I trust, to which I have added
tflags autolearn_force.

Recently I've seen some spam that hit those rules and racked up enough
points that they should have auto-learned. But the scoring analysis
explicitly says autolearn=no autolearn_force=no.
What's going on here?

# spamc -R < /tmp/food-0
19.9/6.0
Checker-Version SpamAssassin 3.4.0 (2014-02-07) on xyzzy.engr.uiowa.edu

Content analysis details:   (19.9 points, 6.0 required, autolearn=no autolearn_force=no)

 pts rule name                 description
---- ------------------------- ------------------------------------------
  10 SURBL_URI_DBF4            Contains an URL in My SURBL list 4
                               [URIs: zxrich.com]
 4.0 SURBL_URI_DBF2            Contains an URL in My SURBL list 2
                               [URIs: zxrich.com]
-0.0 RCVD_IN_MSPIKE_H4         RBL: Very Good reputation (+4)
                               [178.23.244.208 listed in wl.mailspike.net]
 0.0 MISSING_HEADERS           Missing To: header
-0.0 SPF_HELO_PASS             SPF: HELO matches SPF record
-1.9 BAYES_00                  BODY: Bayes spam probability is 0 to 1%
                               [score: 0.]
 1.1 RAZOR2_CF_RANGE_51_100    Razor2 gives confidence level above 50%
                               [cf: 100]
 1.9 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level
                               above 50%
                               [cf: 100]
 0.9 RAZOR2_CHECK              Listed in Razor2 (http://razor.sf.net/)
 2.0 KAM_OBF                   Obfuscated Porn Spams
 0.8 KAM_ASCII_DIVIDERS        Spam that uses ascii formatting tricks
-0.0 RCVD_IN_MSPIKE_WL         Mailspike good senders
 1.0 TO_CC_NONE                No To: or Cc: header
-0.0 T__RECEIVED_2             More than one untrusted relay
 0.1 KHOP_SC_CIDR8             Relay CIDR /8 is among worst in SpamCop

The odd thing is that if I manually explicitly learn them with
  sa-learn --spam --mbox /tmp/food-0
then suddenly the 'autolearn_force=yes' takes effect (with no other
change, exact same message, seconds later).

# spamc -R < /tmp/food-0
23.8/6.0
Checker-Version SpamAssassin 3.4.0 (2014-02-07) on xyzzy.engr.uiowa.edu

Content analysis details:   (23.8 points, 6.0 required, autolearn=unavailable autolearn_force=yes (SURBL_URI_DBF4))

 pts rule name                 description
---- ------------------------- ------------------------------------------
  10 SURBL_URI_DBF4            Contains an URL in My SURBL list 4
                               [URIs: zxrich.com]
 4.0 SURBL_URI_DBF2            Contains an URL in My SURBL list 2
                               [URIs: zxrich.com]
 0.0 MISSING_HEADERS           Missing To: header
-0.0 SPF_HELO_PASS             SPF: HELO matches SPF record
-0.0 RCVD_IN_MSPIKE_H4         RBL: Very Good reputation (+4)
                               [178.23.244.208 listed in wl.mailspike.net]
 2.0 BAYES_80                  BODY: Bayes spam probability is 80 to 95%
                               [score: 0.9197]
 1.1 RAZOR2_CF_RANGE_51_100    Razor2 gives confidence level above 50%
                               [cf: 100]
 1.9 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level
                               above 50%
                               [cf: 100]
 0.9 RAZOR2_CHECK              Listed in Razor2 (http://razor.sf.net/)
 2.0 KAM_OBF                   Obfuscated Porn Spams
 0.8 KAM_ASCII_DIVIDERS        Spam that uses ascii formatting tricks
-0.0 RCVD_IN_MSPIKE_WL         Mailspike good senders
 1.0 TO_CC_NONE                No To: or Cc: header
-0.0 T__RECEIVED_2             More than one untrusted relay
 0.1 KHOP_SC_CIDR8             Relay CIDR /8 is among worst in SpamCop

Is the autolearn_force being ignored because of the initial BAYES_00
score?
Is there a 'autolearn_force_yes_I_really_mean_it' tflag that can be
used to overcome that inhibition?

-- 
Dave Funk                                  University of Iowa
dbfunk (at) engineering.uiowa.edu          College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
Re: autolearn_force
I suggest spending some time reading the relevant documentation, in
particular the M::SA::Conf and AutoLearnThreshold docs.

  http://spamassassin.apache.org/doc/

On Sat, 2014-05-24 at 22:12 -0700, Ian Zimmerman wrote:
> So, now I am really confused. I think I did everything right in
> user_prefs:

> tflags INVALID_DATE autolearn_force

tflags are part of the Privileged Settings (see that section in the
M::SA::Conf docs). For security and efficiency reasons, these are not
allowed in user_prefs, unless allow_user_rules is enabled.

(Just for completeness, dunno if you enabled it.)

> Nonetheless:
>
> X-Spam-Score: 6.9
> X-Spam-Tests: BAYES_99=3.5,BAYES_999=0.2,HTML_FONT_LOW_CONTRAST=0.001,
>  HTML_MESSAGE=0.001,MIME_HTML_ONLY=0.723,RDNS_NONE=0.793,SPF_PASS=-0.001,
>  T_REMOTE_IMAGE=0.01,URIBL_BLACK=1.7
> X-Spam-Autolearn: no autolearn_force=no

As RW already pointed out, quoting the AutoLearnThreshold man page, the
score used for the decision to auto-learn is not the same as the
overall score shown above.

To prevent Bayes self-feeding, (a) the Bayesian rules themselves are
ignored, and (b) the respective non-Bayes score-set is used. The latter
often (but not necessarily) results in higher scores per rule.

However, 6.9 - 3.7 for the BAYES_* rules is unlikely to exceed the
threshold, even when using the respective non-Bayes score-set.

> And here's a case where it doesn't autolearn ham (same user_prefs as
> above):
>
> X-Spam-Status: No
> X-Spam-Level:
> X-Spam-Score: -2.7
> X-Spam-Tests: BAYES_00=-1.9,DKIM_SIGNED=0.1,DKIM_VALID=-0.1,DKIM_VALID_AU=-0.1,
>  FREEMAIL_FORGED_FROMDOMAIN=0.001,FREEMAIL_FROM=0.001,
>  HEADER_FROM_DIFFERENT_DOMAINS=0.001,HTML_MESSAGE=0.001,RCVD_IN_DNSWL_LOW=-0.7,
>  RCVD_IN_MSPIKE_H2=-0.001,SPF_PASS=-0.001
> X-Spam-Autolearn: no autolearn_force=no
>
> The documentation certainly doesn't say anything like the 3/3 and
> force mechanism is in place for ham. So this _should_ autolearn.
> Right? Right??

No. At no point does the documentation suggest autolearn_force would
work with ham.

The AutoLearnThreshold doc mentions this option only in the context of
the spam threshold, and the M::SA::Conf doc is even clearer about it:

  autolearn_force
    The test will be subject to less stringent autolearn thresholds.

    Normally, SpamAssassin will require 3 points from the header and
    3 points from the body to be auto-learned as spam. This option
    keeps the threshold at 6 points total but changes it to have no
    regard to the source of the points.

Moreover, that message did *not* trigger any of the rules you set
tflags autolearn_force for. Thus, regardless of the actual score and of
it not being spam, that message would never be considered for
autolearn_force.

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
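The arithmetic in the reply above can be checked with a few lines of Python. This is a sketch only: it reuses the per-rule scores from the two X-Spam-Tests headers as reported, while the real auto-learn decision uses the non-Bayes score-set, whose per-rule values may differ. The thresholds are the ones from the quoted user_prefs (6.00 and -2.00); the function name is made up.

```python
def non_bayes_total(tests):
    """Sum the reported scores, leaving out the BAYES_* rules."""
    return sum(score for name, score in tests.items()
               if not name.startswith("BAYES_"))


spam_case = {
    "BAYES_99": 3.5, "BAYES_999": 0.2, "HTML_FONT_LOW_CONTRAST": 0.001,
    "HTML_MESSAGE": 0.001, "MIME_HTML_ONLY": 0.723, "RDNS_NONE": 0.793,
    "SPF_PASS": -0.001, "T_REMOTE_IMAGE": 0.01, "URIBL_BLACK": 1.7,
}
ham_case = {
    "BAYES_00": -1.9, "DKIM_SIGNED": 0.1, "DKIM_VALID": -0.1,
    "DKIM_VALID_AU": -0.1, "FREEMAIL_FORGED_FROMDOMAIN": 0.001,
    "FREEMAIL_FROM": 0.001, "HEADER_FROM_DIFFERENT_DOMAINS": 0.001,
    "HTML_MESSAGE": 0.001, "RCVD_IN_DNSWL_LOW": -0.7,
    "RCVD_IN_MSPIKE_H2": -0.001, "SPF_PASS": -0.001,
}

SPAM_THRESHOLD = 6.0    # bayes_auto_learn_threshold_spam from the post
HAM_THRESHOLD = -2.0    # bayes_auto_learn_threshold_nonspam from the post

spam_total = non_bayes_total(spam_case)
ham_total = non_bayes_total(ham_case)

print(f"spam case: {spam_total:.3f}, learn as spam: {spam_total > SPAM_THRESHOLD}")
print(f"ham case: {ham_total:.3f}, learn as ham: {ham_total < HAM_THRESHOLD}")
```

With the Bayes rules left out, the first message sits well under 6.0 and the second is not below -2.0, so on these numbers neither one qualifies, whatever autolearn_force says.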
Re: autolearn_force
On Sat, 24 May 2014 22:12:10 -0700 Ian Zimmerman wrote:
> So, now I am really confused. I think I did everything right in
> user_prefs:
> ...
> Nonetheless:
>
> X-Spam-Score: 6.9
> X-Spam-Tests: BAYES_99=3.5,BAYES_999=0.2,HTML_FONT_LOW_CONTRAST=0.001,
>  HTML_MESSAGE=0.001,MIME_HTML_ONLY=0.723,RDNS_NONE=0.793,SPF_PASS=-0.001,
>  T_REMOTE_IMAGE=0.01,URIBL_BLACK=1.7
> X-Spam-Autolearn: no autolearn_force=no
>
> And here's a case where it doesn't autolearn ham (same user_prefs as
> above):
> ...
> The documentation certainly doesn't say anything like the 3/3 and
> force mechanism is in place for ham. So this _should_ autolearn.
> Right? Right??

Mail::SpamAssassin::Plugin::AutoLearnThreshold(3)

NAME
    Mail::SpamAssassin::Plugin::AutoLearnThreshold - threshold-based
    discriminator for Bayes auto-learning

SYNOPSIS
    loadplugin Mail::SpamAssassin::Plugin::AutoLearnThreshold

DESCRIPTION
    This plugin implements the threshold-based auto-learning
    discriminator for SpamAssassin's Bayes subsystem. Auto-learning is
    a mechanism whereby high-scoring mails (or low-scoring mails, for
    non-spam) are fed into its learning systems without user
    intervention, during scanning.

    Note that certain tests are ignored when determining whether a
    message should be trained upon:

    o rules with tflags set to 'learn' (the Bayesian rules)
    o rules with tflags set to 'userconf' (user configuration)
    o rules with tflags set to 'noautolearn'

    Also note that auto-learning occurs using scores from either
    scoreset 0 or 1, depending on what scoreset is used during message
    check. It is likely that the message check and auto-learn scores
    will be different.
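The two notes at the end of the man page excerpt can be sketched in Python as follows. This is an illustration under assumptions, not SpamAssassin's implementation: it relies on the usual numbering of the four score sets (0 = no Bayes, no network tests; 1 = network only; 2 = Bayes only; 3 = both), and the rule LOCAL_EXAMPLE and every score value below are made up to show how the check score and the auto-learn score can diverge.

```python
IGNORED_TFLAGS = {"learn", "userconf", "noautolearn"}


def autolearn_scoreset(check_scoreset):
    """Map the scoreset used for the check (0..3) to its non-Bayes twin (0 or 1)."""
    return check_scoreset & ~2  # clear the "Bayes enabled" bit


def autolearn_points(hits, check_scoreset):
    """Sum the hits in the non-Bayes scoreset, skipping ignored tflags."""
    s = autolearn_scoreset(check_scoreset)
    return sum(scores[s] for scores, tflags in hits.values()
               if not (tflags & IGNORED_TFLAGS))


# rule name -> (scores for scoresets 0..3, tflags); all values hypothetical
HITS = {
    "BAYES_99": ((0.0, 0.0, 3.8, 3.5), {"learn"}),
    "URIBL_BLACK": ((0.0, 1.7, 0.0, 1.7), {"net"}),
    "LOCAL_EXAMPLE": ((1.0, 2.5, 0.8, 1.2), set()),
}

check_total = sum(scores[3] for scores, _ in HITS.values())
learn_total = autolearn_points(HITS, check_scoreset=3)

print(f"score shown in the report (scoreset 3): {check_total:.1f}")
print(f"score used for auto-learning (scoreset 1): {learn_total:.1f}")
```

The point of the example is only that the number in X-Spam-Score is not the number compared against bayes_auto_learn_threshold_spam.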
Re: autolearn_force
On 5/25/2014 1:12 AM, Ian Zimmerman wrote:
> So, now I am really confused. I think I did everything right in
> user_prefs:
>
> bayes_auto_learn 1
> bayes_auto_learn_threshold_nonspam -2.00
> bayes_auto_learn_threshold_spam 6.00
> bayes_auto_learn_on_error 0
>
> [snip]
>
> tflags URIBL_DBL_SPAM autolearn_force
> tflags URIBL_JP_SURBL autolearn_force
> tflags URIBL_BLACK autolearn_force
> tflags INVALID_DATE autolearn_force
>
> Nonetheless:
>
> X-Spam-Score: 6.9
> X-Spam-Tests: BAYES_99=3.5,BAYES_999=0.2,HTML_FONT_LOW_CONTRAST=0.001,
>  HTML_MESSAGE=0.001,MIME_HTML_ONLY=0.723,RDNS_NONE=0.793,SPF_PASS=-0.001,
>  T_REMOTE_IMAGE=0.01,URIBL_BLACK=1.7
> X-Spam-Autolearn: no autolearn_force=no
>
> And here's a case where it doesn't autolearn ham (same user_prefs as
> above):
>
> X-Spam-Status: No
> X-Spam-Level:
> X-Spam-Score: -2.7
> X-Spam-Tests: BAYES_00=-1.9,DKIM_SIGNED=0.1,DKIM_VALID=-0.1,DKIM_VALID_AU=-0.1,
>  FREEMAIL_FORGED_FROMDOMAIN=0.001,FREEMAIL_FROM=0.001,
>  HEADER_FROM_DIFFERENT_DOMAINS=0.001,HTML_MESSAGE=0.001,RCVD_IN_DNSWL_LOW=-0.7,
>  RCVD_IN_MSPIKE_H2=-0.001,SPF_PASS=-0.001
> X-Spam-Autolearn: no autolearn_force=no
>
> The documentation certainly doesn't say anything like the 3/3 and
> force mechanism is in place for ham. So this _should_ autolearn.
> Right? Right??

Hi Ian,

Perhaps a bug. Hard to say from this little output. Please turn on -D
and pastebin the output. If you also want to pastebin the email, I'll
look at it.

But if not, these are the current debugs I'll be looking for:

    dbg("learn: auto-learn: autolearn_force flagged for a rule. Removing seperate body and head point threshold. Body Only Points: $body_only_points ($required_body_points req'd) / Head Only Points: $head_only_points ($required_head_points req'd)");
    dbg("learn: auto-learn: autolearn_force flagged because of rule(s): $force_autolearn_names");
  } else {
    dbg("learn: auto-learn: autolearn_force not flagged for a rule. Body Only Points: $body_only_points ($required_body_points req'd) / Head Only Points: $head_only_points ($required_head_points req'd)");
  }

Regards,
KAM
Re: autolearn_force
On 05/25/2014 07:12 AM, Ian Zimmerman wrote:
> tflags URIBL_DBL_SPAM autolearn_force
> tflags URIBL_JP_SURBL autolearn_force
> tflags URIBL_BLACK autolearn_force
> tflags INVALID_DATE autolearn_force

URIBL rules are not set to use 'userconf' (user configuration), so
entries in user_prefs shouldn't affect the results.

If anything, it should go in a system-wide rule file (i.e. local.cf),
not in user_prefs.

Your:

  tflags URIBL_DBL_SPAM autolearn_force

should probably read:

  tflags URIBL_DBL_SPAM net domains_only autolearn_force

etc, etc - and not in user_

IIRC, this will also influence Bayes's scoring/learning behaviour.
Modifying rules' tflags should be done with care.
Re: autolearn_force
On Sun, 25 May 2014 16:40:44 +0200
Axb axb.li...@gmail.com wrote:

Axb> URIBL rules are not set to use 'userconf' (user configuration)
Axb> so entries in user_prefs shouldn't affect the results

Axb> if anything it should go in a system wide rule (ie: local.cf) (not
Axb> user_prefs)

Axb> your: tflags URIBL_DBL_SPAM autolearn_force

Axb> should probably read:

Axb> tflags URIBL_DBL_SPAM net domains_only autolearn_force

Axb> etc, etc - and not in user_

Axb> iirc, this will also influence Bayes's scoring/learning behaviour.
Axb> modifying rules' tflags should be done with care

But it does autolearn in _some_ instances:

  May 25 08:33:50 host spamd[13561]: spamd: result: Y 10 -
  BAYES_99,BAYES_999,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,MIME_HTML_ONLY,
  RDNS_NONE,SPF_PASS,T_REMOTE_IMAGE,URIBL_BLACK,URIBL_DBL_SPAM,URIBL_JP_SURBL
  scantime=1.7,size=6496,user=itz,uid=1000,required_score=4.3,rhost=127.0.0.1,
  raddr=127.0.0.1,rport=52900,
  mid=24251386609892242521126914206...@lun5bim.dollazo.eu,bayes=1.00,
  autolearn=spam autolearn_force=yes (URIBL_JP_SURBL,URIBL_DBL_SPAM,URIBL_BLACK)

So I'm afraid I can't be satisfied with this explanation.

The whole autolearning settings thing just feels way unpredictable for
me. If there are so many hurdles, does anyone actually do it?

-- 
Please *no* private copies of mailing list or newsgroup messages.
Re: autolearn_force
On 05/25/2014 05:59 PM, Ian Zimmerman wrote:
> On Sun, 25 May 2014 16:40:44 +0200
> Axb axb.li...@gmail.com wrote:
>
> Axb> URIBL rules are not set to use 'userconf' (user configuration)
> Axb> so entries in user_prefs shouldn't affect the results
>
> Axb> if anything it should go in a system wide rule (ie: local.cf) (not
> Axb> user_prefs)
>
> Axb> your: tflags URIBL_DBL_SPAM autolearn_force
>
> Axb> should probably read:
>
> Axb> tflags URIBL_DBL_SPAM net domains_only autolearn_force
>
> Axb> etc, etc - and not in user_
>
> Axb> iirc, this will also influence Bayes's scoring/learning behaviour.
> Axb> modifying rules' tflags should be done with care
>
> But it does autolearn in _some_ instances:
>
>   May 25 08:33:50 host spamd[13561]: spamd: result: Y 10 -
>   BAYES_99,BAYES_999,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,MIME_HTML_ONLY,
>   RDNS_NONE,SPF_PASS,T_REMOTE_IMAGE,URIBL_BLACK,URIBL_DBL_SPAM,URIBL_JP_SURBL
>   scantime=1.7,size=6496,user=itz,uid=1000,required_score=4.3,rhost=127.0.0.1,
>   raddr=127.0.0.1,rport=52900,
>   mid=24251386609892242521126914206...@lun5bim.dollazo.eu,bayes=1.00,
>   autolearn=spam autolearn_force=yes (URIBL_JP_SURBL,URIBL_DBL_SPAM,URIBL_BLACK)

Yes, when it reached certain conditions and a score above 15.0.

You can tune that score via local.cf entries:

  bayes_auto_learn_threshold_nonspam
  bayes_auto_learn_threshold_spam

Depending on your traffic, you may want to raise or lower those scores.
There are safe default settings, but playing with those scores helps
tune learning sensitivity.

> The whole autolearning settings thing just feels way unpredictable for
> me.

It feels unpredictable because of the overwhelming number of variables
which influence learning.

Just stick to experimenting with settings till you find the best
performance for your traffic. There is no one size fits all, because
each system's ham/spam traffic can be so different.

> If there are so many hurdles, does anyone actually do it?

Since Bayes was added to SA, I've used nothing else. (2004? 2005?)
Re: autolearn_force
On Sun, 25 May 2014 20:06:22 +0200
Axb axb.li...@gmail.com wrote:

Axb> Yes, when it reached certain conditions and a score above 15.0

Axb> you can tune that score via local.cf entries:

Axb> bayes_auto_learn_threshold_nonspam bayes_auto_learn_threshold_spam

Please see the prefs in my post upthread - I have already done this.
That's why I am so confused, and frankly, irritated. I have done
everything the documentation says to do, and it still behaves magically
and strangely.

-- 
Please *no* private copies of mailing list or newsgroup messages.
Re: autolearn_force
On Sun, 25 May 2014 08:59:28 -0700 Ian Zimmerman wrote:
> On Sun, 25 May 2014 16:40:44 +0200
> Axb axb.li...@gmail.com wrote:
>
> Axb> URIBL rules are not set to use 'userconf' (user configuration)
> Axb> so entries in user_prefs shouldn't affect the results
>
> Axb> if anything it should go in a system wide rule (ie: local.cf)
> Axb> (not user_prefs)
>
> Axb> your: tflags URIBL_DBL_SPAM autolearn_force
>
> Axb> should probably read:
>
> Axb> tflags URIBL_DBL_SPAM net domains_only autolearn_force
>
> Axb> etc, etc - and not in user_
>
> Axb> iirc, this will also influence Bayes's scoring/learning
> Axb> behaviour. modifying rules' tflags should be done with care
>
> But it does autolearn in _some_ instances:
>
>   May 25 08:33:50 host spamd[13561]: spamd: result: Y 10 -
>   BAYES_99,BAYES_999,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,MIME_HTML_ONLY,
>   RDNS_NONE,SPF_PASS,T_REMOTE_IMAGE,URIBL_BLACK,URIBL_DBL_SPAM,URIBL_JP_SURBL
>   scantime=1.7,size=6496,user=itz,uid=1000,required_score=4.3,rhost=127.0.0.1,
>   raddr=127.0.0.1,rport=52900,
>   mid=24251386609892242521126914206...@lun5bim.dollazo.eu,bayes=1.00,
>   autolearn=spam autolearn_force=yes (URIBL_JP_SURBL,URIBL_DBL_SPAM,URIBL_BLACK)

A difference between this and the other one you quoted is that this
one appears to be over the 6 point threshold and the other wasn't. (I
haven't done the exact arithmetic for scoreset 1, but the other was
only slightly over 6 in scoreset 3 including BAYES_99, and this one is
well over.)

That would mean that even if autolearn_force worked correctly, the
other message still wouldn't have been autolearned.

It would be interesting to see if you can reproduce the previous
autolearn_force=no result on a very high scoring spam. It's possible
there may be a cosmetic bug where autolearn_force is not logged
correctly when the spam isn't going to be autolearned anyway.
Re: autolearn_force
On Thu, 22 May 2014 15:54:42 +0100
RW rwmailli...@googlemail.com wrote:

Ian> But in fact this is a per-test setting, a subcategory of tflags.
Ian> Do I have to specify it separately for every test? Why?

RW> The point is to set it for a small number of rules that are
RW> sufficiently strong as to guarantee there will be no mislearning in
RW> combination with the autolearn as spam threshold.

So, now I am really confused. I think I did everything right in
user_prefs:

  bayes_auto_learn 1
  bayes_auto_learn_threshold_nonspam -2.00
  bayes_auto_learn_threshold_spam 6.00
  bayes_auto_learn_on_error 0

  [snip]

  tflags URIBL_DBL_SPAM autolearn_force
  tflags URIBL_JP_SURBL autolearn_force
  tflags URIBL_BLACK autolearn_force
  tflags INVALID_DATE autolearn_force

Nonetheless:

  X-Spam-Score: 6.9
  X-Spam-Tests: BAYES_99=3.5,BAYES_999=0.2,HTML_FONT_LOW_CONTRAST=0.001,
   HTML_MESSAGE=0.001,MIME_HTML_ONLY=0.723,RDNS_NONE=0.793,SPF_PASS=-0.001,
   T_REMOTE_IMAGE=0.01,URIBL_BLACK=1.7
  X-Spam-Autolearn: no autolearn_force=no

One suspect thing I see in the log:

  May 24 20:29:58 host spamd[13561]: spamd: result: Y 6 -
  BAYES_99,BAYES_999,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,MIME_HTML_ONLY,
  RDNS_NONE,SPF_PASS,T_REMOTE_IMAGE,URIBL_BLACK
  scantime=1.9,size=6208,user=itz,uid=1000,required_score=4.3,rhost=127.0.0.1,
  raddr=127.0.0.1,rport=60231,
  mid=23931386609892239320827813806...@86adv5n4.disabilism.eu,bayes=1.00,
  autolearn=no autolearn_force=no

Note the 6 - is it possible that SA truncates the score to an integer
for this purpose, and then treats it as a strict lower bound - that is,
if I set bayes_auto_learn_threshold_spam = 6.00, the lowest score to
actually trigger autolearn would be 7? That is the only rational
explanation I have, tortured as it is.

It sure looks like SA is going out of its way to force me to do manual
training :-(

-- 
Please *no* private copies of mailing list or newsgroup messages.
Re: autolearn_force
So, now I am really confused. I think I did everything right in
user_prefs:

  bayes_auto_learn 1
  bayes_auto_learn_threshold_nonspam -2.00
  bayes_auto_learn_threshold_spam 6.00
  bayes_auto_learn_on_error 0

  [snip]

  tflags URIBL_DBL_SPAM autolearn_force
  tflags URIBL_JP_SURBL autolearn_force
  tflags URIBL_BLACK autolearn_force
  tflags INVALID_DATE autolearn_force

Nonetheless:

  X-Spam-Score: 6.9
  X-Spam-Tests: BAYES_99=3.5,BAYES_999=0.2,HTML_FONT_LOW_CONTRAST=0.001,
   HTML_MESSAGE=0.001,MIME_HTML_ONLY=0.723,RDNS_NONE=0.793,SPF_PASS=-0.001,
   T_REMOTE_IMAGE=0.01,URIBL_BLACK=1.7
  X-Spam-Autolearn: no autolearn_force=no

And here's a case where it doesn't autolearn ham (same user_prefs as
above):

  X-Spam-Status: No
  X-Spam-Level:
  X-Spam-Score: -2.7
  X-Spam-Tests: BAYES_00=-1.9,DKIM_SIGNED=0.1,DKIM_VALID=-0.1,DKIM_VALID_AU=-0.1,
   FREEMAIL_FORGED_FROMDOMAIN=0.001,FREEMAIL_FROM=0.001,
   HEADER_FROM_DIFFERENT_DOMAINS=0.001,HTML_MESSAGE=0.001,RCVD_IN_DNSWL_LOW=-0.7,
   RCVD_IN_MSPIKE_H2=-0.001,SPF_PASS=-0.001
  X-Spam-Autolearn: no autolearn_force=no

The documentation certainly doesn't say anything like the 3/3 and force
mechanism is in place for ham. So this _should_ autolearn. Right?
Right??

-- 
Please *no* private copies of mailing list or newsgroup messages.
Re: autolearn_force
On Wed, 21 May 2014 21:34:23 -0700 Ian Zimmerman wrote:
> I don't understand this setting, and reading the documentation
> doesn't help.
>
> It seems it should make bayes learn spam whenever the total score
> surpasses the value of bayes_auto_learn_threshold_spam, and not
> require 3 points from header and body each; that would make it a
> global setting similar in purpose to bayes_auto_learn_threshold_spam.
>
> But in fact this is a per-test setting, a subcategory of tflags. Do I
> have to specify it separately for every test? Why?

The point is to set it for a small number of rules that are
sufficiently strong as to guarantee there will be no mislearning in
combination with the autolearn as spam threshold.

It's probably best to create a single metarule for this - something
that eliminates the possibility of mistraining through a lot of
overlapping rules. I do something similar to get more spam into my
high-scoring folder. I assign a lot of the near-certain spam rules to
different classes (BAYES, RBLs, URIBLs, relaycountry etc.) and then
count the number of classes.
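The class-counting idea in the last paragraph might look roughly like the Python sketch below. RW did not post his rules, so the rule-to-class table, the RCVD_IN_EXAMPLE_RBL and RELAYCOUNTRY_EXAMPLE names and the minimum of three classes are all hypothetical. In SpamAssassin itself this would be written as meta rules rather than Python; the sketch only shows the logic of counting independent kinds of evidence instead of overlapping rules.

```python
# Hypothetical mapping of near-certain spam rules to evidence classes.
RULE_CLASS = {
    "BAYES_99": "bayes",
    "BAYES_999": "bayes",
    "URIBL_BLACK": "uribl",
    "URIBL_DBL_SPAM": "uribl",
    "URIBL_JP_SURBL": "uribl",
    "RCVD_IN_EXAMPLE_RBL": "rbl",            # made-up rule name
    "RELAYCOUNTRY_EXAMPLE": "relaycountry",  # made-up rule name
}


def classes_hit(hits):
    """Return the set of distinct evidence classes among the hit rules."""
    return {RULE_CLASS[name] for name in hits if name in RULE_CLASS}


def near_certain_spam(hits, min_classes=3):
    """True when enough independent classes agree (threshold assumed)."""
    return len(classes_hit(hits)) >= min_classes


# Three URIBL hits plus Bayes are still only two classes.
overlapping = ["BAYES_99", "URIBL_BLACK", "URIBL_DBL_SPAM", "URIBL_JP_SURBL"]
independent = ["BAYES_99", "URIBL_BLACK", "RCVD_IN_EXAMPLE_RBL"]

print(sorted(classes_hit(overlapping)), near_certain_spam(overlapping))
print(sorted(classes_hit(independent)), near_certain_spam(independent))
```

Counting classes rather than rules is what keeps several overlapping URIBL hits from looking like stronger evidence than they are.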
Re: autolearn_force
On Thu, 22 May 2014 15:54:42 +0100
RW rwmailli...@googlemail.com wrote:

Ian> I don't understand this setting, and reading the documentation
Ian> doesn't help.

Ian> It seems it should make Bayes learn spam whenever the total score
Ian> surpasses the value of bayes_auto_learn_threshold_spam, and not
Ian> require 3 points from header and body each; that would make it a
Ian> global setting similar in purpose to
Ian> bayes_auto_learn_threshold_spam.

Ian> But in fact this is a per-test setting, a subcategory of tflags.
Ian> Do I have to specify it separately for every test? Why?

RW> The point is to set it for a small number of rules that are
RW> sufficiently strong as to guarantee there will be no mislearning in
RW> combination with the autolearn as spam threshold.

RW> It's probably best to create a single metarule for this - something
RW> that eliminates the possibility of mistraining through a lot of
RW> overlapping rules. I do something similar to get more spam into my
RW> high-scoring folder. I assign a lot of the near-certain spam rules
RW> to different classes: BAYES, RBLs, URIBLs, relaycountry etc and then
RW> count the number of classes.

The problem I am trying to solve is that nearly all of my spam is
flagged due to body rules. The header rules seem to be close to useless
with the latest campaigns - spammers seem to have learned enough to
avoid sending obvious stinking pieces of turd. (The one exception is
patterns in the Message-ID, but I am afraid that will be short lived
too, and is insufficient by itself even now.)

Thus, even if I set bayes_auto_learn_threshold_spam low, very few of my
spams are autolearned because of the 3/3 requirement. The damn 3/3 is
my problem - how can I work around it? If I have to spend an hour a day
manually training the classifier the spammers have won :-(

By the way, how are meta rules counted for this purpose? The
documentation says nothing about that.

-- 
Please *no* private copies of mailing list or newsgroup messages.
autolearn_force
I don't understand this setting, and reading the documentation doesn't
help.

It seems it should make bayes learn spam whenever the total score
surpasses the value of bayes_auto_learn_threshold_spam, and not require
3 points from header and body each; that would make it a global setting
similar in purpose to bayes_auto_learn_threshold_spam.

But in fact this is a per-test setting, a subcategory of tflags. Do I
have to specify it separately for every test? Why?

Or is there another way to bypass the 3/3 requirement?

-- 
Please *no* private copies of mailing list or newsgroup messages.