Re: Bayes auto-learn - not happening
IMHO you need 100 ham and 100 spam messages for auto-learning to work. Do manual learning.

On 08.08.2017 8:20 PM, "Scott Techlist" wrote:

> Centos7
> Postfix 3.2.2
> Amavisd-new 2.11.0
> Spamassassin 3.4.0
> Site-wide configuration
>
> This is a new box and I've configured some conservative values for
> auto-learn. I've enabled it properly AFAIK, but I can't see any sign of it
> working.
>
> I have these set in local.cf
> use_bayes 1
> bayes_auto_learn 1
> bayes_auto_learn_threshold_nonspam -1.7
> bayes_auto_learn_threshold_spam 10.0
> # this is a filename prefix, not a directory per se
> bayes_path /etc/mail/bayes/bayes
> bayes_file_mode 0666
>
> -bayes prep
> Start fresh for troubleshooting:
> su amavis -c 'sa-learn --clear'
>
> Add one spam manually and check tokens:
>
> [root@tn2 mail]# su amavis -c 'sa-learn --dump magic'
> 0.000 0 3 0 non-token data: bayes db version
> 0.000 0 1 0 non-token data: nspam
> 0.000 0 0 0 non-token data: nham
> 0.000 0 2157 0 non-token data: ntokens
>
> -amavisd prep
>
> Restart amavisd/spamassassin just to be sure all configs read..
>
> --- ready to process -
>
> The next high scoring spam arrives, it was sent to my spam mailbox. It
> did NOT autolearn. Nor did several others.
>
> To troubleshoot, I took one that did not autolearn, and learned it
> manually by:
> su amavis -c 'sa-learn -D --spam --showdots --mbox /home/mail/onespam'
>
> even though this message was slightly over the threshold, the log says it
> learned anyway:
> -D log snippet:
> -
> Aug 8 12:37:27.216 [13198] info: archive-iterator: skipping large
> message: 858 lines, 262203 bytes, limit 262144 bytes
>
> Learned tokens from 1 message(s) (1 message(s) examined)
> -
>
> Verified it learned:
>
> [root@tn2 mail]# su amavis -c 'sa-learn --dump magic'
> 0.000 0 3 0 non-token data: bayes db version
> 0.000 0 2 0 non-token data: nspam
>
> Partial header from that message:
>
> X-Spam-Flag: YES
> X-Spam-Score: 17.374
> X-Spam-Level: *
> X-Spam-Status: Yes, score=17.374 tag=- tag2=5 kill=6.31
> tests=[RCVD_IN_BRBL_LASTEXT=1.644, RCVD_IN_DNSWL_NONE=-0.0001,
> RCVD_IN_RP_RNBL=1.284, RCVD_IN_SBL_CSS=3.558,
> RCVD_IN_SORBS_WEB=1.5,
> RP_MATCHES_RCVD=-0.001, SUSPICIOUS_RECIPS=2.497,
> URIBL_ABUSE_SURBL=1.948, URIBL_BLACK=1.7, URIBL_DBL_SPAM=2.5,
> URIBL_SBL=0.644, URIBL_SBL_A=0.1] autolearn=no autolearn_force=no
>
> Why aren't my spams getting auto-learned? If sa-learn "ate" it, shouldn't
> auto-learn too?
>
> I know there is a default 200 threshold before Bayes starts tagging
> anything, but I understand it should learn without issue.
>
> Can't figure out what's wrong...
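The `sa-learn --dump magic` checks in this thread are done by eye. As a rough sketch only, the counters could be read programmatically like this. The function names `parse_dump_magic` and `bayes_ready` are made up for illustration, and the 200-message default is taken from the poster's own remark rather than verified here.

```python
def parse_dump_magic(output):
    """Map each 'non-token data' label to its integer value.

    Assumes the layout shown in the thread: four numeric columns,
    with the value in the third column, followed by the label.
    """
    counts = {}
    for line in output.splitlines():
        if "non-token data:" not in line:
            continue
        fields, label = line.split("non-token data:", 1)
        columns = fields.split()
        if len(columns) >= 3:
            counts[label.strip()] = int(columns[2])
    return counts


def bayes_ready(counts, min_spam=200, min_ham=200):
    """True once both learned-message counters reach the thresholds.

    The 200/200 defaults are an assumption based on the poster's comment
    about a "default 200 threshold"; adjust them to your configuration.
    """
    return counts.get("nspam", 0) >= min_spam and counts.get("nham", 0) >= min_ham
```

Feeding it the text printed by `su amavis -c 'sa-learn --dump magic'` gives a dictionary with keys such as `nspam`, `nham` and `ntokens`, which makes it easy to watch whether the counters move after new mail arrives.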
Re: Relay Country Plugin GEOIP issue - solved
On 2015-10-14 18:31, Mark Martinec wrote:

> Check your database:
>
> $ spamassassin --lint -D metadata 2>&1 | fgrep RelayCountry
>
> should yield something like:
>
> Oct 15 01:26:45.584 [78315] dbg: metadata: RelayCountry: Using database: Geo::IP GEO-106FREE 20151006 Build 1 Copyright (c) 2015 MaxMind Inc All Rights Reserved

We see that exact response, but we still get the warning.

> Fresh files can be downloaded from:
> http://dev.maxmind.com/geoip/legacy/geolite/
> unzip them and place them to their expected location,
> typically /usr/local/share/GeoIP/ . You need files GeoIP.dat
> and GeoIPv6.dat there.
>
> Mark

We tried this with no luck. Then we discovered we were patching the bug in the wrong location.

Wrong location (system): /usr/local/share/perl5/Geo/IP.pm
Right location (cPanel): /usr/local/cpanel/3rdparty/perl/514/lib64/perl5/cpanel_lib/Geo/IP.pm

Now it's working as it's supposed to. Thanks for your help.

Allen
a...@satester.com
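The root cause here was two copies of `Geo/IP.pm` on the same box. A minimal sketch for listing every copy under a set of directory trees follows. `find_module_copies` is a hypothetical helper, not part of SpamAssassin or Geo::IP, and it only locates files; it does not tell you which copy a given Perl interpreter actually loads.

```python
import os


def find_module_copies(roots, relative_path=os.path.join("Geo", "IP.pm")):
    """Return every file named relative_path found anywhere below the given roots."""
    matches = []
    for root in roots:
        for dirpath, _dirnames, _filenames in os.walk(root):
            candidate = os.path.join(dirpath, relative_path)
            if os.path.isfile(candidate):
                matches.append(candidate)
    return sorted(matches)
```

On a machine like the one described, calling it with `["/usr/local/share/perl5", "/usr/local/cpanel/3rdparty/perl"]` would be one way to spot that a second copy exists before patching either of them.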
Re: Relay Country Plugin GEOIP issue
Hi,

I cannot get the fix below to work. Does the Geo::IP package need to be recompiled for the change to take effect? If so, any tips on how to recompile would be greatly appreciated.

Allen
a...@satester.com

On 2015-10-14 12:04, George Ficzeri wrote:

> This?
> https://github.com/maxmind/geoip-api-perl/pull/22
>
> If you click 'Files changed' you'll see the path, and see the fix.
>
> On 10/14/15 11:49 AM, a...@satester.com wrote:
>
>> Hi,
>>
>> We activated the relay country plugin yesterday. As part of the
>> process we did a yum install perl-Geo-IP. Now we get the following
>> warning when we lint or sa-learn.
>>
>> Use of uninitialized value $hasStructureInfo in numeric eq (==) at (eval 31) line 5520
>>
>> I have no idea which file line 5520 is in, and I am not finding much
>> in Google and I'm hoping someone here has a clue. Thanks.
>>
>> Allen
>> a...@satester.com
Re: Relay Country Plugin GEOIP issue
Thanks for the reply, George. We tried that link yesterday and made the change as described, with no results. We restarted mailscanner but nothing else. Maybe I need to restart our MTA or another daemon.

Allen
a...@bandwise.com

On 2015-10-14 12:04, George Ficzeri wrote:

> This?
> https://github.com/maxmind/geoip-api-perl/pull/22
>
> If you click 'Files changed' you'll see the path, and see the fix.
>
> On 10/14/15 11:49 AM, a...@satester.com wrote:
>
>> Hi,
>>
>> We activated the relay country plugin yesterday. As part of the
>> process we did a yum install perl-Geo-IP. Now we get the following
>> warning when we lint or sa-learn.
>>
>> Use of uninitialized value $hasStructureInfo in numeric eq (==) at (eval 31) line 5520
>>
>> I have no idea which file line 5520 is in, and I am not finding much
>> in Google and I'm hoping someone here has a clue. Thanks.
>>
>> Allen
>> a...@satester.com
Relay Country Plugin GEOIP issue
Hi,

We activated the relay country plugin yesterday. As part of the process we did a yum install perl-Geo-IP. Now we get the following warning when we lint or sa-learn.

Use of uninitialized value $hasStructureInfo in numeric eq (==) at (eval 31) line 5520

I have no idea which file line 5520 is in, and I am not finding much in Google and I'm hoping someone here has a clue. Thanks.

Allen
a...@satester.com
satester.com update
I have been working on satester.com this past week in my spare time and recently put up a new version (v1.02). I think I may have finally gotten it to the point where it is usable with a large ruleset. (It used to throw errors on pastes of large .cf files, among other things.) It also shows the rule name in the results now. I hope you and others find this tool useful in certain situations.

satester.com

Allen Marsalis
a...@satester.com
Re: SA Rule Tester/Checker
On 2015-07-18 04:54, Martin Gregorie wrote:

> There are lots of possibilities. I test using a big (and growing) spam
> collection, which I keep so I can regression test my current rule set.
> That's quite crude: if everything in the collection is recognised as
> spam, nothing gets flagged up during the test run and that's a pass.

Thanks for the reply. Having a test server seems like a good idea. I also like the idea of not storing a spam collection on a more expensive production server.

> My setup is fairly simple: I run spamc/spamd on a development box and
> have a collection of bash scripts that can pass one or more spam
> samples (piped into spamc) through spamd for testing. I maintain local
> copies of all cf files and a set of bash scripts that can:

I keep local copies on a small production server and run a bash script that moves our .cf files (after linting) to several servers and reloads the rules. It's very simple but does the trick.

On a related note, I wish I could pass a message from the quarantine. That is, I'm trying to figure out a way to pass a raw message to my little satester for visual analysis. We use mailscanner/mailwatch and I would like to be able to just click once on a message in mailwatch and be looking at it in satester. To me, that would be cool.

> - lint check the cf file collection locally by calling spamassassin
> - start/stop/status check the local spamd (it's stopped except when
>   testing rule changes)
> - move cf files to the local /etc/mail/spamassassin and restart spamd
> - run selected messages through spamc/spamd showing SA generated
>   headers or
> - run selected messages through spamc/spamd showing whether the rule
>   under test fires
> - run a full regression test that displays the messages that AREN'T
>   flagged as spam
> - load the current cf file collection into my production mail server
>   and restart spamd.
>
> I don't pretend this is the best approach, but it works for me and has
> also been used to test, develop and control the installation of SA
> plugins.

I like your approach and thanks again for sharing. It definitely gives me some ideas on possible directions to proceed.

Allen

> Hopefully this shows that you can run spamc/spamd anywhere, that it
> doesn't need to be associated with an MTA for rule development and
> testing and that the test setup can be quite simple - certainly no
> more complex than you'd use to develop any other single-purpose server.
>
> Martin
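The "full regression test that displays the messages that AREN'T flagged as spam" can be sketched roughly as below. This is not Martin's bash tooling, just an illustration that assumes one message per file in a directory and a running spamd. It relies on `spamc -c`, which exits with status 1 when the message is judged spam. The names `spamc_says_spam` and `unflagged_messages` are invented for the example.

```python
import subprocess
from pathlib import Path


def spamc_says_spam(message_path):
    """Pipe one message into 'spamc -c'; exit status 1 means spam."""
    with open(message_path, "rb") as handle:
        result = subprocess.run(
            ["spamc", "-c"], stdin=handle, stdout=subprocess.DEVNULL
        )
    return result.returncode == 1


def unflagged_messages(collection_dir, is_spam=spamc_says_spam):
    """Return the files in a spam collection that were NOT flagged as spam.

    An empty list is a pass, matching the "nothing gets flagged up" idea.
    The is_spam parameter lets you swap in another checker for dry runs.
    """
    return [
        path
        for path in sorted(Path(collection_dir).iterdir())
        if path.is_file() and not is_spam(path)
    ]
```

Keeping the checker injectable means the same loop can be exercised without spamd, for example with a stub that marks files by name.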
Re: SA Rule Tester/Checker
On 2015-07-17 16:49, Kevin A. McGrail wrote:

>> We use maildir most of the time on our servers. Is that a problem or
>> are you referring to a mbox file on a client machine? I never ran
>> spamassassin on a client before. Sorry, just trying to understand
>> your test environment.
>
> I usually am working and researching based on submissions. Your mail
> flow might differ but mutt supports maildir.

Yea sorry I'm totally lost. It happens. Do you test on a production server, other (test) server, or local mbox with Mutt as your client? By submissions, do you mean customer submissions of FP and FN hits submitted by users?

>> I looked for regression_tests.cf but I couldn't find it in any
>> directory on my server.
>
> You likely need an svn checkout of trunk to get it.

Understood, that will probably get me on the right path.

> Exactly. It's in addition to your cf files now and adds that
> regression testing layer to see if they do what you expect. And if you
> get a new spam in the same vein, you add the string, modify the rule
> and see if it still works on your old patterns and the new, etc.

Wow, I can now obviously see how that can be useful. I will definitely give it a try.

>> There really isn't much out there on the subject that is clear enough
>> for someone getting started. I'm not sure I'm Wiki material, but I
>> may try to put together a few basic howto's that interested folks can
>> be pointed to on occasions like this.
>
> Anything is better than a vacuum! All about combatting spam!

Yeppers!

Allen
am -at- sarules.com
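The "add the string, modify the rule and see if it still works on your old patterns and the new" loop can be illustrated with a small sketch. This is not the `regression_tests.cf` format or the `make test` harness from SpamAssassin trunk. It only shows the same idea with Python's `re` module, and `check_rule` is a hypothetical name.

```python
import re


def check_rule(pattern, should_hit, should_not_hit, flags=0):
    """Return a list of (problem, text) pairs; an empty list means the rule behaves."""
    rx = re.compile(pattern, flags)
    failures = []
    for text in should_hit:
        if not rx.search(text):
            failures.append(("missed", text))
    for text in should_not_hit:
        if rx.search(text):
            failures.append(("false hit", text))
    return failures
```

Each time a new spam variant shows up, its phrase goes into `should_hit`, the pattern gets adjusted, and the older strings stay in place so that a regression shows up as a non-empty result.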
Re: SA Rule Tester/Checker
On 2015-07-17 09:27, Kevin A. McGrail wrote:

> On 7/16/2015 8:00 PM, Allen Marsalis wrote:
>
>> Can you elaborate on the macros any?
>
> Sure. Mutt is a very powerful little mail client and it's perfect for
> me for analysis of mbox files.

We use maildir most of the time on our servers. Is that a problem or are you referring to a mbox file on a client machine? I never ran spamassassin on a client before. Sorry, just trying to understand your test environment.

> Creating a .muttrc file, you can add some macros like ctrl-y (why is
> this hitting KAM ;-) ):
>
> macro index \cy "spamassassin -t -D 2>&1 | grep -e KAM -e Content\\ analysis\n" "Test Message with Spamassassin for KAM Rules"
>
> or prompt for a string to match with ctrl-v
>
> macro index \cv "spamassassin -t -D 2>&1 | grep -i -e Content\\ analysis -e " "Test Message with Spamassassin for Rules Matching Search"
>
> ctrl-o to look at everything:
>
> macro index \co "spamassassin -t -D\n" "Test Message with Spamassassin for all Rules"
>
> And when you view a message I can hit ctrl v to sha1sum an attachment
> for example:
>
> macro attach \cv "sha1sum\n" "sha1sum on an attachment"
>
> Hope this helps and perhaps you can edit our wiki and add any ideas
> you find useful for others!

Yes, that helps tremendously. Thanks Kevin. I was thinking bash scripts with spamassassin -t or something when I saw the word "macro". Knowing these are Mutt macros set off the light bulb. I will definitely play around with Mutt and your macros and see what it's all about.

> regression_tests.cf is a file you edit with a rule name and strings it
> should and should not hit on. You then run make test and will be told
> if your rule hits/doesn't hit as expected. Off-hand not sure exactly
> which test does it but once you figure that out you can do
> prove -v t/testname.t and run just that test.

I looked for regression_tests.cf but I couldn't find it in any directory on my server. Not in /etc/mail/spamassassin/ or anyplace else I looked. I did google sample copies of the file and, looking at it, I was a little confused since it doesn't look like other .cf files I'm familiar with. I see the "test", "ok" and "fail" attributes but no regex, just words. I'm guessing this is a .cf file that I need to add alongside my other .cf files (not part of installation). I never ran make test before either. But it shouldn't be too hard to figure out.

>> Sorry, not trying to spam my rule tool but just gain insight on where
>> and if it is truly useful.
>
> I think it is useful for new rule testers. I try and automate my stuff
> as much as possible and these days I can pickup spam patterns in my
> sleep...

LOL. I may not be so good at coding patterns, but I can smell a good spam phrase in a heartbeat. I am also very careful to double check, lint rules, etc. and I can't be too careful.

>> Anyway, a link or two for (basic|convention|intended) rule checking
>> might be enough to get me started and more familiar with regular
>> methods of checking/debugging.
>
> Sorry, only thing I would be doing is a Google search... I'm not sure
> such a document exists though it should. Perhaps some of the other
> people who write rules can share some of their tricks?

There really isn't much out there on the subject that is clear enough for someone getting started. I'm not sure I'm Wiki material, but I may try to put together a few basic howto's that interested folks can be pointed to on occasions like this.

Thanks again for taking the time to help someone low on the mountain get up the mountain. I surely appreciate it.

Allen
am -at- satester.com
Re: SA Rule Tester/Checker
On 2015-07-16 04:53, Kevin A. McGrail wrote:

> You might find the regression_tests.cf in the trunk rules/ dir
> interesting. It's a way of giving strings you want to hit/not-hit on
> rules and see if it properly hits/doesn't hit as you expect.
>
> I also use mutt and a few macros such as one that runs
> spamassassin -t 2>&1 with a prompt for a keyword. Helpful for
> debugging.

Can you elaborate on the macros any? After searching, I'm still having a hard time understanding conventional SA rule checking/debugging methods. I've been going my own route so far, but I would like to have a basic understanding of how most folks do it. I'm not finding much to get me started. (Guides on regression_tests.cf etc.)

Without knowing more at this point, do you think there may be some usefulness to a tool that responds to keystrokes/keyphrases in real time like satester/rubular do? That is why I found the Rubular site so handy for checking my regex patterns in the first place and was inspired to write satester. For example, as I bang out a new rule, I can vary the sample text very quickly to check the pattern. Add/change/delete a character here or there and see what happens instantly. With satester it is the same, just on a larger scale.

Sorry, not trying to spam my rule tool but just gain insight on where and if it is truly useful. Anyway, a link or two for (basic|convention|intended) rule checking might be enough to get me started and more familiar with regular methods of checking/debugging.

Allen
am -at- satester.com
Re: SA Rule Tester/Checker
On 2015-07-16 07:32, Axb wrote:

>> header __KAM_NOTINMYNETWORK1 X-No-Relay =~ /./i
>> header __KAM_MULTIPLE_FROM From =~ /^./
>>
>> I think I get the first one (if anything exists in X-No-Relay) but
>> I'll have to look deeper to understand why you would trigger on any
>> From address. Anyway I'm having fun, learning a lot, and doing my
>> customers a lot of good by developing rules. Thanks again for your
>> tips and help.
>
> did you miss the next line?
>
> tflags __KAM_MULTIPLE_FROM multiple,maxhits=2

Understood. I just had a "what the heck is this?" moment. I'm a little excited by the new tool and I can't wait to dig into mutt and spamassassin -t today to see how they work by comparison. Yea, no more Rubular. heh.

Allen Marsalis
am -at- satester.com
Re: SA Rule Tester/Checker
On 2015-07-16 04:53, Kevin A. McGrail wrote:

> You might find the regression_tests.cf in the trunk rules/ dir
> interesting. It's a way of giving strings you want to hit/not-hit on
> rules and see if it properly hits/doesn't hit as you expect.
>
> I also use mutt and a few macros such as one that runs
> spamassassin -t 2>&1 with a prompt for a keyword. Helpful for
> debugging.

Thanks Kevin. I really appreciate your sharing. I will check these out today. I've used regex for years but I'm relatively new to SA rules.

I did see something interesting this morning. I pasted KAM.cf into satester figuring it might overload my script, but it worked. However, any sample text I type triggers these two rules of yours.

header __KAM_NOTINMYNETWORK1 X-No-Relay =~ /./i
header __KAM_MULTIPLE_FROM From =~ /^./

I think I get the first one (if anything exists in X-No-Relay) but I'll have to look deeper to understand why you would trigger on any From address. Anyway I'm having fun, learning a lot, and doing my customers a lot of good by developing rules. Thanks again for your tips and help.

Allen Marsalis
am -at- satest.com
SA Rule Tester/Checker
I started writing SA rules about a year ago. Although I am new to this list, I have been lurking for quite a while. I would like to thank Kevin McGrail and others for providing rules and tips that inspired me to write my own custom rules.

Today I wrote a little tool that helps me test my SA rules. I was using Rubular.com to check one pattern at a time, which was very tedious. With my new tool, I can paste my entire rule.cf file (or just one rule) and check it against any test string to see which rules hit. (It operates like a multi-line version of Rubular.)

I hope some of you find this tool useful. I wrote it because I couldn't find another one like it on Google. If there is something better at testing SA rules like this, please let me know so I don't waste any further development effort. If it is useful, ideas and suggestions will be heartily appreciated.

www.satester.com

It's a one-page site created in one day, so it doesn't look like much right now. We might style it better later on. There is no database and we save nothing entered into the site. It ignores meta, score, and describe at this time (any line without a regex in it). Simply paste in a rule and enter some sample text, and it automatically highlights the hits.

I have noticed a couple of bugs already. I've seen an odd rule hit on one of our span tags used for highlighting sample results. Also, I need to add mimeheader to the list of lines that contain a regex to be checked (along with header, body, rawbody, etc.)

Hope you enjoy!

Allen Marsalis
President, Bandwise LLC
am -at- satester dot com
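The behaviour described in this post (take a pasted .cf file, keep only lines that carry a regex, and report which rules hit a sample string) can be sketched as follows. This is not satester's actual code. It is a simplified illustration that uses Python's `re` instead of Perl regexes, applies header patterns to the whole sample rather than to a specific header, and only understands the plain `/pattern/flags` form. All function names are hypothetical.

```python
import re

# Rule types treated as carrying a regex, per the list in the post.
REGEX_RULE_TYPES = {"header", "body", "rawbody", "mimeheader"}

# A trailing /pattern/flags literal, allowing escaped slashes inside the pattern.
REGEX_LITERAL = re.compile(r"/((?:\\.|[^/\\])*)/([a-z]*)\s*$")

# Perl modifiers that have a direct Python equivalent; others are ignored.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL, "x": re.VERBOSE}


def parse_rules(cf_text):
    """Return {rule_name: compiled_regex} for lines that contain a regex."""
    rules = {}
    for line in cf_text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) < 3 or parts[0] not in REGEX_RULE_TYPES:
            continue  # meta, score, describe, comments and blank lines are skipped
        match = REGEX_LITERAL.search(parts[2])
        if not match:
            continue
        flags = 0
        for letter in match.group(2):
            flags |= FLAG_MAP.get(letter, 0)
        rules[parts[1]] = re.compile(match.group(1), flags)
    return rules


def rules_hit(cf_text, sample):
    """Return the sorted names of rules whose regex matches the sample text."""
    return sorted(
        name for name, rx in parse_rules(cf_text).items() if rx.search(sample)
    )
```

Because header patterns are matched against the raw sample here, a rule such as `From =~ /^./` hits on any non-empty input, so very permissive patterns will always light up in a tool built this way.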