RE: How to create spam score list for sample email messages
On Tue, 2011-10-11 at 15:37 +, Sharma, Ashish wrote:
> Martin,
>
> Your testing strategy of spamassassin is interesting to emulate and I
> have following queries:
>
> Following are the plugins that get loaded in my spamassassin:
>
> SpamAssassin loaded plugins: AutoLearnThreshold, Bayes, BodyEval,
> Check, DKIM, DNSEval, FreeMail, FuzzyOcr, HTMLEval, HTTPSMismatch,
> Hashcash, HeaderEval, ImageInfo, MIMEEval, MIMEHeader, Pyzor, Razor2,
> RelayEval, ReplaceTags, SPF, SpamCop, URIDNSBL, URIDetail, URIEval,
> VBounce, WLBLEval, WhiteListSubject
>
> 1. There are network rules in my spamassassin(correct me if I am
> wrong), How do you simulate and test them?
>
My testing machine has the same external access rights as the live box, and also does all DNS lookups via the copy of bind 9 on the live server, so I run the same set of spamd plugins on both boxes. Both use the same SA version and the same sa-update cycle.

In fact, I don't much mind if there are differences, because my local rule set has ended up with very little reliance on standard rules and the testing set-up is primarily for developing local rules. This is because almost all my spam comes from mailing lists where it has been input through a web forum: the effect is that, by and large, header-based rules don't fire on it.

I have one locally developed plugin which whitelists senders I have previously sent mail to, by accessing a view of my mail archive database. The associated rule is in its own .cf file. Most of my straightforward rules are in local.cf and link to rules that use complex patterns in a second .cf file, which is built automatically by an awk script that translates human-readable text (i.e. one alternate pattern per line) into faster but unreadable SA rules. These all populate the testing system's configuration directory.

The testing system is managed by a set of scripts that start and stop it and convert test results into readable statistics and summaries.
Finally there is a script that transfers the configuration to the live system and then restarts it. This has a small amount of selectivity about what gets transferred.

I've released the rule generator, but the rest of the set-up is probably too specific to my needs to be worth the trouble of releasing. Anyway, I'm certain that anybody who needs something like it can easily build it by accretion.

All my scripts are written in a sinister mix of bash and awk with a tiny amount of grep thrown in. If you don't know awk, it's worth getting to grips with: it's fast and you can do a lot with very little code once you understand its structure. I also use scp to transfer files between systems: it's very easy to use from a bash script.

> 2. How can I divide my spamassassin rulesets, so that network rules
> and local rules can be tested?
>
Decide on functional rule groupings, put them into separate .cf files and look at using a script or two to automate the process of applying them to your live system(s).

HTH
Martin
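For anyone wanting a starting point, the "one alternate pattern per line" idea mentioned above could be sketched roughly as follows. This is not the released rule generator: the function name gen_rule, the input format (blank lines and # comments ignored) and the example rule name are all assumptions made for illustration, and patterns are assumed not to contain an unescaped "/".

```shell
#!/bin/sh
# gen_rule NAME SCORE < patterns.txt
# Hypothetical helper: joins one alternative per input line into a single
# SpamAssassin body rule. The input format is an assumption, not the
# format used by any released generator.
gen_rule() {
    awk -v name="$1" -v score="$2" '
        /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
        { alts = (n++ ? alts "|" : "") $0 }     # join alternatives with "|"
        END {
            if (n == 0) exit 1                  # nothing to generate
            printf "body     %s  /(?:%s)/i\n", name, alts
            printf "describe %s  Generated from %d alternatives\n", name, n
            printf "score    %s  %s\n", name, score
        }
    '
}
```

Something like `gen_rule LOCAL_PHRASES 2.5 < phrases.txt > 50_generated.cf` would then write the generated rule into its own .cf file, which keeps hand-written and generated rules apart (both file names here are made up).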
RE: How to create spam score list for sample email messages
Martin,

Your testing strategy of spamassassin is interesting to emulate and I have the following queries:

Following are the plugins that get loaded in my spamassassin:

SpamAssassin loaded plugins: AutoLearnThreshold, Bayes, BodyEval, Check, DKIM, DNSEval, FreeMail, FuzzyOcr, HTMLEval, HTTPSMismatch, Hashcash, HeaderEval, ImageInfo, MIMEEval, MIMEHeader, Pyzor, Razor2, RelayEval, ReplaceTags, SPF, SpamCop, URIDNSBL, URIDetail, URIEval, VBounce, WLBLEval, WhiteListSubject

1. There are network rules in my spamassassin (correct me if I am wrong). How do you simulate and test them?

2. How can I divide my spamassassin rulesets, so that network rules and local rules can be tested?

Can you please elaborate?

Thanks in advance
Ashish

-----Original Message-----
From: Martin Gregorie [mailto:mar...@gregorie.org]
Sent: Monday, October 10, 2011 9:59 PM
To: users@spamassassin.apache.org
Subject: Re: How to create spam score list for sample email messages

On Mon, 2011-10-10 at 15:36 +, Sharma, Ashish wrote:
> I want to create a report of sample emails with the spam scores
> generated in accordance with permissible limits after deploying the
> spamassassin updated rulesets.
>
> For that I am trying out on a shell script providing with my test
> email messages
>
I do something similar, but keep my test messages as separate text files in a directory because I find that easier to manage.

I do approximately this on a computer that's entirely separate from my mail host and runs its own copy of spamd so I can mess around with its rule sets and configuration without upsetting the live copy of SA. The testing SA runs in effectively the same configuration as my live SA because the test rig has an identical set of SA config files: when I'm happy with the test operation I export the entire set of configuration files to the live system and then restart spamd.

Here's the guts of the test system:

    for f in testdata/*.txt
    do
        # keep only the X-Spam-* header lines (grep pattern assumed)
        spamc <"$f" | grep '^X-Spam' >>result.txt
    done
    analysis_prog result.txt
    rm result.txt

My analysis program is an awk script: that or Perl are probably the weapons of choice for writing this type of program.

You probably need to feed the messages to amavis-new since it is creating a special header, rather than to spamc/spamd as I do, but I question whether your command line is right since amavis has direct access to the Perl modules that make up spamassassin.

Disclaimer: the previous paragraph contains almost everything I know about amavis-new. Somebody else may be able to help with the amavis-new command line, but not me since I don't use it. What I do know is that Postfix passes a message at a time to spamc/spamd so it's entirely probable it does the same with amavis-new if you're running that as a Postfix service.

Martin
Re: How to create spam score list for sample email messages
On Mon, 2011-10-10 at 20:08 +0100, RW wrote:
> On Mon, 10 Oct 2011 17:29:08 +0100
> Martin Gregorie wrote:
>
> > for f in testdata/*.txt
> > do
> >     spamc <"$f" | grep '^X-Spam' >>result.txt
>
> For that to work you need the setting
>
> fold_headers 0

Fair comment: I use gawk rather than grep and my filter looks like this:

    spamc -l <$s |
    gawk '
        BEGIN         { tag = 0 }
        /^X-Spam/     { tag = 1; print; next }    # start of an X-Spam header
        /^ / || /^\t/ {                           # folded continuation line
            if (tag == 1) { print }
            next
        }
                      { tag = 0 }                 # any other line ends the header
    ' |
    where_ever

I don't use 'fold_headers 0' because I don't want *any* differences between my test config and the live one. From the look of that filter I obviously ran into the folded line thing and solved the problem with gawk.

Martin
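A variation on the same idea, for anyone who would rather have each X-Spam header on a single line so that line-oriented tools can be used afterwards. This is only a sketch written in plain awk, not the filter actually in use above: the function name xspam_headers is made up, and it assumes the message arrives on stdin with its header block ended by a blank line.

```shell
#!/bin/sh
# xspam_headers: read one message on stdin and print each X-Spam-* header
# as a single unfolded line. Hypothetical helper for illustration only.
xspam_headers() {
    awk '
        /^\r?$/ { exit }                    # blank line: end of the headers
        /^[ \t]/ {                          # folded continuation line
            if (keep) { sub(/^[ \t]+/, " "); buf = buf $0 }
            next
        }
        {
            if (keep) print buf             # flush the header collected so far
            keep = ($0 ~ /^X-Spam-/)        # is the new header one we want?
            buf = $0
        }
        END { if (keep) print buf }         # flush the last header, if wanted
    '
}
```

With this in place, something like `spamc <"$f" | xspam_headers` should behave the same whether or not fold_headers is set, so the test and live configurations can stay identical. Note that the leading whitespace of each continuation line is collapsed to a single space.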
Re: How to create spam score list for sample email messages
On Mon, 10 Oct 2011 17:29:08 +0100
Martin Gregorie wrote:

> for f in testdata/*.txt
> do
>     spamc <"$f" | grep '^X-Spam' >>result.txt

For that to work you need the setting

    fold_headers 0
Re: How to create spam score list for sample email messages
On Mon, 2011-10-10 at 15:36 +, Sharma, Ashish wrote:
> I want to create a report of sample emails with the spam scores
> generated in accordance with permissible limits after deploying the
> spamassassin updated rulesets.
>
> For that I am trying out on a shell script providing with my test
> email messages
>
I do something similar, but keep my test messages as separate text files in a directory because I find that easier to manage.

I do approximately this on a computer that's entirely separate from my mail host and runs its own copy of spamd so I can mess around with its rule sets and configuration without upsetting the live copy of SA. The testing SA runs in effectively the same configuration as my live SA because the test rig has an identical set of SA config files: when I'm happy with the test operation I export the entire set of configuration files to the live system and then restart spamd.

Here's the guts of the test system:

    for f in testdata/*.txt
    do
        # keep only the X-Spam-* header lines (grep pattern assumed)
        spamc <"$f" | grep '^X-Spam' >>result.txt
    done
    analysis_prog result.txt
    rm result.txt

My analysis program is an awk script: that or Perl are probably the weapons of choice for writing this type of program.

You probably need to feed the messages to amavis-new since it is creating a special header, rather than to spamc/spamd as I do, but I question whether your command line is right since amavis has direct access to the Perl modules that make up spamassassin.

Disclaimer: the previous paragraph contains almost everything I know about amavis-new. Somebody else may be able to help with the amavis-new command line, but not me since I don't use it. What I do know is that Postfix passes a message at a time to spamc/spamd so it's entirely probable it does the same with amavis-new if you're running that as a Postfix service.

Martin
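To make the "analysis program" step a little more concrete, here is one possible shape for a per-message score report. It is a sketch under stated assumptions rather than the script described above: the names score_report and SCANNER are invented, the scanner is assumed to be spamc talking to the test spamd, and the X-Spam-Status header is assumed to carry the usual score= and required= fields.

```shell
#!/bin/sh
# score_report DIR: print "file<TAB>score<TAB>required<TAB>verdict" for
# every DIR/*.txt. Hypothetical helper; SCANNER is an assumed hook so the
# loop can also be run over messages that have already been scored.
SCANNER=${SCANNER:-spamc}

score_report() {
    for f in "$1"/*.txt; do
        [ -e "$f" ] || continue                     # directory had no matches
        $SCANNER <"$f" | awk -v file="${f##*/}" '
            /^\r?$/            { exit }             # end of the header block
            /^X-Spam-Status:/  { hdr = $0; grab = 1; next }
            /^[ \t]/ && grab   { hdr = hdr " " $0; next }   # folded continuation
                               { grab = 0 }
            END {
                score = "?"; req = "?"; verdict = "unscored"
                n = split(hdr, w, /[ \t]+/)
                for (i = 1; i <= n; i++) {
                    if (w[i] ~ /^score=/)    score = substr(w[i], 7)
                    if (w[i] ~ /^required=/) req   = substr(w[i], 10)
                }
                if (hdr ~ /^X-Spam-Status: Yes/)     verdict = "spam"
                else if (hdr ~ /^X-Spam-Status: No/) verdict = "ham"
                printf "%s\t%s\t%s\t%s\n", file, score, req, verdict
            }
        '
    done
}
```

Running `score_report testdata > report.tsv` would give one line per test message, which is easy to diff between rule-set versions. Because the folded continuation lines are joined inside awk, this does not depend on the fold_headers setting.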
How to create spam score list for sample email messages
Hi,

I have a mail-receiving setup in which Postfix (2.6.6) is the MTA and amavisd-new (with SpamAssassin and ClamAV) is the content filter. I have enabled the spam report header in my amavisd-new conf file.

I want to create a report of sample emails with the spam scores generated in accordance with permissible limits after deploying the updated spamassassin rulesets.

For that I am trying out a shell script that feeds my test email messages to the following command:

    spamassassin -C /etc/amavisd.conf -e --progress < testemail.eml

I would like it to produce a report that lists the spam scores of all the email messages parsed by the above tool.

Is this possible? At present I am unable to get the spam scores out of the above command in any form that could be added to the report.

Moreover, I am using the amavisd-new config file here: is that the right approach?

Will the above command affect the Bayesian learning of the spamassassin setup? I don't want that to happen.

Thanks
Ashish Sharma