A few rules to catch current gmail spam
I have seen a few posts from people complaining about spam from Gmail (often linking to blogspot pages) which no existing rules catch, and I have had a number of these myself. This is only a small fraction of the spam I am seeing, but it is annoying nonetheless!

NOTE: I am not a particularly good rule writer and there are probably far more elegant ways of doing this! Feel free to suggest improvements and to use these rules however you like.

The easiest way that I can see to catch these emails is to combine a number of existing rules and to add a couple of new rules which look for specific things:

Existing rules used:
FreeMail.pm Plugin
ChickenPox.cf

New Rule 1 - Find all emails which link to a free blog site:

uri FHS_FREEBLOG /(?:spaces\.msn\.com|blogeasy\.com|easyjournal\.com|multiply\.com|blog-city\.com|blogharbor\.com|bloghi\.com|bloghorn\.com|blogspirit\.com|blogsource\.com|ebloggy\.com|pitas\.com|blogger\.de|blogsome\.com|weblogs\.us|wordpress\.com|wpblogs\.com|blogthing\.com|globbo\.org|theblog\.cc|learnerblogs\.org|uniblogs\.org|edublogs\.org|hrblogs\.org|beblogger\.com|evilsupergenius\.net|blogcafe\.com|blogspot\.com|weblogs\.hu|weblogs\.cz|blogs\.ro|weblogs\.pl|blogs\.fi|blogs\.no|blogs\.dk|blogs\.se|blog\.com|blog\.de|blog\.co\.uk|blog\.ca|freewebs\.com|livejournal\.com|20six\.co\.uk|xanga\.com|aeonity\.com|bloggercrab\.com|upsaid\.com|diaryland\.com|blogs\.ie|modblog\.com|efx2\.com|blogdrive\.com|tblog\.com|blogcult\.com|seo-blog\.com|quickblog\.org|diary-x\.com|blurty\.com|upsaid\.com|bloggercrab\.com|blogghost\.com)/i
describe FHS_FREEBLOG Contains a link to a free blog.
score FHS_FREEBLOG 0.001

New Rule 2 - Look for a proper HTML link in the email (i.e. long URL and short description):

rawbody FHS_LINK /\]{20,50}\>[^<]{6,15}\<\/a/i
describe FHS_LINK Contains a long URL with a short description - a well written link
score FHS_LINK 0.001

Now consider that people who send messages from a free email address are very unlikely to go to the trouble of using a properly formatted link in their email (they will just copy and paste the URL):

meta FREEMAIL_LINK_BLOG (FREEMAIL_FROM && FHS_LINK && FHS_FREEBLOG)
describe FREEMAIL_LINK_BLOG From a freemail address and includes a well written link to a blog
score FREEMAIL_LINK_BLOG 2.0

The next thing I noticed was that most of these emails hit various bits of the chickenpox.cf ruleset, so I created a set of meta rules to count how many of these were hit, and then combined this with the freemail rules:

meta FHS_COUNT_CHICKENPOX_3 (( J_CHICKENPOX_12 + J_CHICKENPOX_13 + J_CHICKENPOX_14 + J_CHICKENPOX_15 + J_CHICKENPOX_16 + J_CHICKENPOX_17 + J_CHICKENPOX_18 + J_CHICKENPOX_19 + J_CHICKENPOX_110 + J_CHICKENPOX_111 + J_CHICKENPOX_21 + J_CHICKENPOX_22 + J_CHICKENPOX_23 + J_CHICKENPOX_24 + J_CHICKENPOX_25 + J_CHICKENPOX_26 + J_CHICKENPOX_27 + J_CHICKENPOX_28 + J_CHICKENPOX_29 + J_CHICKENPOX_210 + J_CHICKENPOX_31 + J_CHICKENPOX_32 + J_CHICKENPOX_33 + J_CHICKENPOX_34 + J_CHICKENPOX_35 + J_CHICKENPOX_36 + J_CHICKENPOX_37 + J_CHICKENPOX_38 + J_CHICKENPOX_39 + J_CHICKENPOX_41 + J_CHICKENPOX_42 + J_CHICKENPOX_43 + J_CHICKENPOX_44 + J_CHICKENPOX_45 + J_CHICKENPOX_46 + J_CHICKENPOX_47 + J_CHICKENPOX_48 + J_CHICKENPOX_51 + J_CHICKENPOX_52 + J_CHICKENPOX_53 + J_CHICKENPOX_54 + J_CHICKENPOX_55 + J_CHICKENPOX_56 + J_CHICKENPOX_57 + J_CHICKENPOX_61 + J_CHICKENPOX_62 + J_CHICKENPOX_63 + J_CHICKENPOX_64 + J_CHICKENPOX_65 + J_CHICKENPOX_66 + J_CHICKENPOX_71 + J_CHICKENPOX_72 + J_CHICKENPOX_73 + J_CHICKENPOX_74 + J_CHICKENPOX_75 + J_CHICKENPOX_81 + J_CHICKENPOX_82 + J_CHICKENPOX_83 + J_CHICKENPOX_84 + J_CHICKENPOX_91 + J_CHICKENPOX_92 + J_CHICKENPOX_93 + J_CHICKENPOX_101 + J_CHICKENPOX_102 ) > 2)
describe FHS_COUNT_CHICKENPOX_3 Three or more odd character combinations
score FHS_COUNT_CHICKENPOX_3 0.1

meta FHS_COUNT_CHICKENPOX_5 (( J_CHICKENPOX_12 + J_CHICKENPOX_13 + J_CHICKENPOX_14 + J_CHICKENPOX_15 + J_CHICKENPOX_16 + J_CHICKENPOX_17 + J_CHICKENPOX_18 + J_CHICKENPOX_19 + J_CHICKENPOX_110 + J_CHICKENPOX_111 + J_CHICKENPOX_21 + J_CHICKENPOX_22 + J_CHICKENPOX_23 + J_CHICKENPOX_24 + J_CHICKENPOX_25 + J_CHICKENPOX_26 + J_CHICKENPOX_27 + J_CHICKENPOX_28 + J_CHICKENPOX_29 + J_CHICKENPOX_210 + J_CHICKENPOX_31 + J_CHICKENPOX_32 + J_CHICKENPOX_33 + J_CHICKENPOX_34 + J_CHICKENPOX_35 + J_CHICKENPOX_36 + J_CHICKENPOX_37 + J_CHICKENPOX_38 + J_CHICKENPOX_39 + J_CHICKENPOX_41 + J_CHICKENPOX_42 + J_CHICKENPOX_43 + J_CHICKENPOX_44 + J_CHICKENPOX_45 + J_CHICKENPOX_46 + J_CHICKENPOX_47 + J_CHICKENPOX_48 + J_CHICKENPOX_51 + J_CHICKENPOX_52 + J_CHICKENPOX_53 + J_CHICKENPOX_54 + J_CHICKENPOX_55 + J_CHICKENPOX_56 + J_CHICKENPOX_57 + J_CHICKENPOX_61 + J_CHICKENPOX_62 + J_CHICKENPOX_63 + J_CHICKENPOX_64 + J_CHICKENPOX_65 + J_CHICKENPOX_66 + J_CHICKENPOX_71 + J_CHICKENPOX_72 + J_CHICKENPOX_73 + J_CHICKENPOX_74 + J_CHICKENPOX_75 + J_CHICKENPOX_81 + J_CHICKENPOX_82 + J_CHICKENPOX_83 + J_CHICKENPOX_84 + J_CHICKENPOX_91 + J_CHICKEN
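Going back to New Rule 2: the FHS_LINK pattern as archived does not look like a complete regular expression, and it seems likely that an HTML-like fragment between the opening `/\` and `]{20,50}` was stripped somewhere along the way. The following is only a sketch of what such a rule might look like, assuming the lost piece matched the start of an anchor tag. The `<a\s+href=[^>` part is an assumption, not recovered text:

```
# Assumed reconstruction, not the original rule:
# "<a href=" plus 20-50 characters of tag content (the long URL),
# then 6-15 characters of link text (the short description),
# then the closing anchor tag.
rawbody  FHS_LINK  /<a\s+href=[^>]{20,50}>[^<]{6,15}<\/a/i
describe FHS_LINK  Contains a long URL with a short description
score    FHS_LINK  0.001
```

Running `spamassassin --lint` after adding a rule like this should show whether the pattern at least parses; whether it matches the same mail as the original rule cannot be confirmed from the archived text.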
Script to generate whitelist based on outgoing email
Not sure if this will be of any use to anyone else, or if it can be made to work with anything other than Exim, but here is the first draft of a script to generate a whitelist based on outgoing email! I have had it running on a server for the last 2 months, handling 20,000 emails a week for a variety of end users, and as yet it hasn't caused any problems and has helped to reduce the chances of false positives...

I got the idea because a lot of desktop antispam solutions will automatically add the addresses of people you send email to to a whitelist. Usually this feature is called something like AutoWhiteList (not to be confused with the SpamAssassin AWL, which does something else entirely).

The following script (which I hope comes through successfully) looks through the last 4 weeks of Exim mail logs and can be used to generate a SpamAssassin rule file to down-score incoming emails (or as part of a shortcircuit rule).

I admit to having very little knowledge of Linux utilities and scripts, having only started messing with them a few months ago, so I am sure someone with better skills than mine will have a good laugh at what I have done, but the idea is there and though the code is not elegant it does work! I would appreciate any suggestions or comments you have :D

## The Script - out.sh ##

# Script to create a spamassassin ruleset to down-score emails from addresses which have previously had email SENT to them.
# This is designed to work with exim logs and will need to be customised to fit your system!
# This script looks at the current mail log and the ones from the previous four weeks and is designed to be run once per day (probably at night).
# NOTE: Email addresses which have repeatedly been sent to over this period are given a better score than ones which appear in only one log file.
# This script is in no way optimised or designed for use on a production mail server - it is very much a proof of concept!
# Version 0.1 Alpha - Updated 09-12-2007 (D-M-Y)
# Bugs / ToDo's:
#   Currently if a log file does not include any outgoing email then the generated rule will match EVERY incoming email. Make sure you don't schedule it directly after a log-rotate!
# Usage:
#   ./out.sh > out.cf
# The process:
#   AWK the current email log for lines which relate to outgoing email sent by local users
#   Sort it alphabetically
#   Remove any duplicates
#   NOTE: the next few steps can probably be done with one command if you have been using TR and SED for more than the 10 minutes I have!
#   Remove line breaks - replace them with commas
#   Remove the final comma
#   Replace the commas with |
#   Escape the .'s using SED
#   Escape the @'s using SED
#   Create the text of a spamassassin rule which matches any email addresses that have been sent to in the mail log file
#   Remove line breaks created by AWK

awk '/T=remote_smtp/ && /[Cc]="250 [Oo][Kk]/ && !/F=<>/ {print $5}' /var/log/exim/mainlog | sort | uniq | tr "\r\n" "," | sed '$s/,$//' | tr "," "|" | sed 's/[.]/\\./g' | sed 's/@/\\@/g' | awk 'BEGIN {print "header __MAIL_SENT_TO_0 FROM =~ /("} {print $0} END {print ")/i\n"}' | tr -d "\r\n"
echo
echo describe __MAIL_SENT_TO_0 From address which had been sent to during the last week
echo

awk '/T=remote_smtp/ && /[Cc]="250 [Oo][Kk]/ && !/F=<>/ {print $5}' /var/log/exim/mainlog.1 | sort | uniq | tr "\r\n" "," | sed '$s/,$//' | tr "," "|" | sed 's/[.]/\\./g' | sed 's/@/\\@/g' | awk 'BEGIN {print "header __MAIL_SENT_TO_1 FROM =~ /("} {print $0} END {print ")/i\n"}' | tr -d "\r\n"
echo
echo describe __MAIL_SENT_TO_1 From address which had been sent to one week ago
echo

awk '/T=remote_smtp/ && /[Cc]="250 [Oo][Kk]/ && !/F=<>/ {print $5}' /var/log/exim/mainlog.2 | sort | uniq | tr "\r\n" "," | sed '$s/,$//' | tr "," "|" | sed 's/[.]/\\./g' | sed 's/@/\\@/g' | awk 'BEGIN {print "header __MAIL_SENT_TO_2 FROM =~ /("} {print $0} END {print ")/i\n"}' | tr -d "\r\n"
echo
echo describe __MAIL_SENT_TO_2 From address which had been sent to two weeks ago
echo

awk '/T=remote_smtp/ && /[Cc]="250 [Oo][Kk]/ && !/F=<>/ {print $5}' /var/log/exim/mainlog.3 | sort | uniq | tr "\r\n" "," | sed '$s/,$//' | tr "," "|" | sed 's/[.]/\\./g' | sed 's/@/\\@/g' | awk 'BEGIN {print "header __MAIL_SENT_TO_3 FROM =~ /("} {print $0} END {print ")/i\n"}' | tr -d "\r\n"
echo
echo describe __MAIL_SENT_TO_3 From address which had been sent to three weeks ago
echo

awk '/T=remote_smtp/ && /[Cc]="250 [Oo][Kk]/ && !/F=<>/ {print $5}' /var/log/exim/mainlog.4 | sort | uniq | tr "\r\n" "," | sed '$s/,$//' | tr "," "|" | sed 's/[.]/\\./g' | sed 's/@/\\@/g' | awk 'BEGIN {print "header __MAIL_SENT_TO_4 FROM =~ /("} {print $0} END {print ")/i\n"}' | tr -d "\r\n"
echo
echo describe __MAIL_SENT_TO_4 From address which had been sent to four weeks ago
echo
echo

echo meta MAIL_SENT_TO \(\(__MAIL_SENT_TO_0 + __MAIL_SENT_TO_1 + __MAIL_SENT_TO_2 + __MAIL_SENT_TO_3 + __MAIL_SENT_TO_
Re: Stop tests when score is high
Not that I am aware of... The complication with this would be the order in which tests are carried out - you might have a genuine email which hits some good and some bad tests, and if the bad tests are hit first then you might have a problem! However, it is a feature I would like to see, as it could be used in conjunction with the Short Circuit plugin.

I am currently using Short Circuit to improve spam processing speed. I have set fast tests and rules with a high accuracy to run first (using a low, negative priority), and when specific combinations of rules fire which should never cause false positives, I then break out of further testing and classify the email as spam.

-- View this message in context: http://www.nabble.com/Stop-tests-when-score-is-high-tp14432409p14437413.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
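As a rough illustration of the Short Circuit setup described above, something along these lines could go in local.cf. This is a sketch under assumptions rather than the configuration actually in use: MY_FAST_RULE and MY_SC_SPAM are hypothetical names, the Subject pattern is a placeholder, and the priorities and score are illustrative only.

```
ifplugin Mail::SpamAssassin::Plugin::Shortcircuit

# Hypothetical fast, accurate local rule - replace the pattern with
# something that is known to be reliable on your own mail
header    MY_FAST_RULE  Subject =~ /\bexample spam phrase\b/i
describe  MY_FAST_RULE  Placeholder for a fast high-accuracy rule
score     MY_FAST_RULE  0.001
priority  MY_FAST_RULE  -500

# Run the stock BAYES_99 rule early as well (lower numbers run first)
priority  BAYES_99      -400

# A combination which is assumed never to fire on ham
meta         MY_SC_SPAM  (MY_FAST_RULE && BAYES_99)
describe     MY_SC_SPAM  Fast rule plus BAYES_99 - stop further tests
score        MY_SC_SPAM  20.0
priority     MY_SC_SPAM  -300
shortcircuit MY_SC_SPAM  spam

endif
```

The meta rule is given a priority after the rules it depends on, so both have already run by the time it is evaluated. Anything with a default priority (0 or above) is then skipped for messages where MY_SC_SPAM fires.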
syswrite() to parent failed: Broken pipe at /usr/lib/perl5/site_perl/5.8.5/Mail/SpamAssassin/SpamdForkScaling.pm line 570
SpamD seems to die every now and again (every couple of days), and though I have a script which regularly checks for various key services and restarts them if they are missing, a couple of spam messages get through each time...

The error message I am getting in my maillog when this happens is:

server spamd[9522]: syswrite() to parent failed: Broken pipe at /usr/lib/perl5/site_perl/5.8.5/Mail/SpamAssassin/SpamdForkScaling.pm line 570.

I have installed the logging plugin and will grab a copy of the next message to cause this to see if that sheds any light on the problem, but I was wondering if anyone had seen this problem before?

This is running on a CentOS 4.4 (Red Hat) VPS with Exim 4.67 (not that this is probably relevant) and is running SpamAssassin 3.2.3 with all the normal additions (Razor, DCC, iXhash, BotNet, SARE, PDFInfo, ClamAV Plugin, Extra DNSBLs, and a few custom ShortCircuits).

Thanks!

-- View this message in context: http://www.nabble.com/syswrite%28%29-to-parent-failed%3A-Broken-pipe-at--usr-lib-perl5-site_perl-5.8.5-Mail-SpamAssassin-SpamdForkScaling.pm-line-570-tf4751769.html#a13587308
Re: How to block the bat!
If you want to reduce the spam you get which claims to be from The Bat! then do the following:

Create a rule which looks for The Bat! in the X-Mailer header, with a 0.001 score.

Create a meta rule which looks for email which is caught by the above rule AND hits BAYES_99 AND/OR (you choose, based on how worried you are about FPs) hits BOTNET. Give this meta rule a score of 5 or more.

That's how I would handle it (if my current config weren't already catching all these emails).

-- View this message in context: http://www.nabble.com/How-to-block-the-bat%21-tf4644470.html#a13362545
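The two steps above could be written roughly as follows. This is a minimal sketch, not a tested ruleset: MY_THE_BAT and MY_BAT_SPAM are hypothetical rule names, the X-Mailer pattern assumes the client identifies itself with a string starting "The Bat!", and BOTNET only exists if the BotNet plugin is installed.

```
# Step 1: informational rule, scored at 0.001 so it has no effect alone
header    MY_THE_BAT   X-Mailer =~ /^The Bat!/
describe  MY_THE_BAT   Claims to have been sent with The Bat!
score     MY_THE_BAT   0.001

# Step 2: stricter variant (fewer FPs) - needs BAYES_99 AND BOTNET
meta      MY_BAT_SPAM  (MY_THE_BAT && BAYES_99 && BOTNET)
# Looser variant - swap in this expression instead:
#   (MY_THE_BAT && (BAYES_99 || BOTNET))
describe  MY_BAT_SPAM  The Bat! mailer plus other strong spam signs
score     MY_BAT_SPAM  5.0
```

Starting with the stricter AND form and checking what it catches before loosening it is the more cautious route if false positives are the main worry.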
Re: Manual sorting based on score count
You already can - try this in your local.cf:

rewrite_header Subject SPAM [_STARS(X)_]

This will give you something which looks like:

SPAM [X] Some Dodgy Subject

You can also put in the actual numeric score (rather than a number of X's which equals the whole number part of the score), but I find it easier to create rules in email clients which count whole numbers of X's. Note: You can use any other character rather than X if you want.

To include the actual score use:

rewrite_header Subject *SPAM* (_SCORE_)

This will give you something which looks like:

*SPAM* (9.7) Some Dodgy Subject

Hope this helps!

Jesse Molina wrote:
>
>
> Hi
>
> I admin my personal mail system with SpamAssassin. I use maildrop as my
> MDA to process mail through SpamAssassin and then deliver it to the
> proper new-spam folder based on the spam's score.
>
> However, I then need to manually go through my new-spam folder from time
> to time and find the false-positives and train the Bayes system as
> appropriate.
>
> I use Seamonkey (Mozilla) and mutt as my MUAs. I'm usually using
> Seamonkey when I'm doing my manual sorting and processing of my new-spam
> folder.
>
> Today I was thinking about adding a feature to rewrite the Subject field
> of spam-tagged messages with the numerical value of the score. For
> example;
>
> Subject: *SPAM:Score=24* old-subject-goes-here
>
> or maybe
>
> Subject: *SPAM:24* old-subject-goes-here
>
> This would make sorting of my new-spam folder easy, based on the
> alphabetical/numerical ordering of the subjects. Lower scored mails are
> more likely to be false positives, so I can go through them first and
> then forget about anything with a score over 15 or 20.
>
> This is pretty easy to do, but I wanted to ask if anyone else is doing
> this, and if they have any superior methodologies that they have
> discovered.
>
> Comments would be appreciated
>
>
>
> --
> # Jesse Molina
> # Mail = [EMAIL PROTECTED]
> # Page = [EMAIL PROTECTED]
> # Cell = 1.602.323.7608
> # Web = http://www.opendreams.net/jesse/
>
>
>

-- View this message in context: http://www.nabble.com/Manual-sorting-based-on-score-count-tf4376119.html#a12477936
Re: False negative
You need to either get him to change the way he sends his emails or adjust your scores!

If he is sending directly from a dynamic IP address then he will be blocked by a lot of people's filters - for instance, there is no chance of his emails being accepted by AOL! The way round this is for him to relay through his ISP's outgoing mail server if at all possible, i.e. put smtp.ispname.com (or something like that) in the outgoing server address of his email client.

If you want to accept emails from people with a similar setup to his without adding them manually to a whitelist, then you will have to reduce the scores for the rules which fire on these mails. Edit your local.cf file (probably in /etc/mail/spamassassin) to include something like:

score FH_HOST_ALMOST_IP 1.0
score FH_HOST_EQ_DYNAMICIP 1.0
score RCVD_IN_SORBS_DUL 0.5

These rules will still help to catch some spam (though less than at their default scores), and the lower scores will hopefully be enough to let emails like this through as long as they don't hit any other rules.

I would suggest NOT using the BOTNET plugin as it will probably make the problem worse!

-- View this message in context: http://www.nabble.com/False-negative-tf4335349.html#a12347708
Some thoughts on Bayesian Setup...
Site Wide Bayes or Per User Bayes?

This is something I have been thinking about and thought I would share to see what other people think...

Site wide bayes has one database. Per user bayes has one per user or domain (depending on how your server is configured). For example, if you have 40 users with a 10 MB bayes database each, then you either have to read and write these to and from disk when an email comes in, or load all 400 MB of data into memory.

1. Most users don't know how, aren't allowed, or can't be bothered to train Bayes. In most cases spamassassin is left to auto-train bayes.
2. Most people would consider the same emails to be SPAM. 90% of what I think is spam would also be what you think is spam, with only a small percentage of emails that we disagree on.
3. The emails which we would disagree on would probably be newsletters and advertising emails from legitimate companies. Unwanted newsletters and advertising emails which people have deliberately (possibly due to stupidity) signed up to should not be trained as SPAM, but should be manually blacklisted if necessary.
4. Site wide bayes saves disk space and, more importantly, it saves significantly on disk IO or memory requirements.
5. A larger database leads to more accurate Bayesian identification - I am guessing this is right?

Do you agree or disagree with the five above statements?

Based on the five above statements I would suggest that: Site wide bayes is as good as, if not slightly better than (due to a potentially larger single database), per user bayes when it comes to identifying SPAM emails.

1. What I think of as HAM emails could be widely different from what you think of as HAM emails - if I were to sort your inbox by hand (without knowing you personally) I would probably delete some good emails by mistake while getting rid of the spam.
2. If a server has one customer who is a plumber and one who is an artist, site wide bayes would learn that emails containing the words pipes or canvas are good. The plumber will get emails with the word canvas in them tagged as bayes_00 and vice versa.
3. If per user bayes is chosen then bayes_00 will only fire on emails containing words which have occurred in emails which YOU have received in the past and which scored low enough to be autolearned.
4. If a HAM email is misclassified as SPAM then users are more likely to report this to their admin, or to train the filter themselves, than they are for SPAM emails which are not tagged. People will ignore a few spam slipping through but not false positives!

Do you agree or disagree with the four above statements?

Based on the four above statements I would suggest that: Per user bayes is better than site wide bayes when it comes to correctly identifying HAM emails.

If my various assumptions are correct then perhaps there should be a third type of bayes to choose from in spamassassin? Namely one where:

SPAM tokens are stored on a server wide basis - can be a LARGE database if this helps
HAM tokens are stored on a per user basis - probably only needs a 1-2 MB file per user.

Any comments?

PS. I am not up to coding anything like this myself so don't bother suggesting that I try it and report back!

-- View this message in context: http://www.nabble.com/Some-thoughts-on-Baysian-Setup...-tf4335489.html#a12347630
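For reference, the two existing setups compared above differ by only a couple of lines in local.cf. The sketch below shows a site wide configuration using the file-based Bayes store; the directory is a placeholder and the file mode assumes that every account running SpamAssassin shares a group with access to that directory. The proposed split (server wide SPAM tokens, per user HAM tokens) is not something these options can express.

```
# Site wide bayes: every user shares one database.
# The path is a placeholder; "bayes" at the end is the filename prefix,
# giving files such as bayes_toks and bayes_seen in that directory.
bayes_path       /var/spamassassin/bayes/bayes

# Shared files need to be writable by every user that runs SpamAssassin.
# 0770 assumes those users are in a common group.
bayes_file_mode  0770

# Per user bayes: leave bayes_path unset, so each user gets their own
# database under ~/.spamassassin/ (the default location).
```

With roughly 40 users this is the difference between one set of token files and 40 of them, which is where the disk IO and memory argument in point 4 of the first list comes from.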
How to query the AWL at an earlier stage for Short Circuit?
I am playing with the Short Circuit plugin to speed up scanning (by skipping Network Tests on obviously good emails) and wanted to be able to query the AWL as part of this, as I don't want to Short Circuit on BAYES_00 alone, i.e. Short Circuit as HAM if both BAYES_00 and AWL fire.

I tried this:

priority USER_IN_WHITELIST -1000
priority ALL_TRUSTED -950
priority BAYES_00 -400

shortcircuit USER_IN_WHITELIST on
shortcircuit ALL_TRUSTED on

# Add a high priority rule to check if the sender is in the AWL
header __MY_AWL eval:check_from_in_auto_whitelist()
describe __MY_AWL Sender has been seen before.
priority __MY_AWL -300

meta MY_HAM_SC (( BAYES_00 + __MY_AWL ) > 1)
describe MY_HAM_SC Clearly not SPAM.
priority MY_HAM_SC -200
tflags MY_HAM_SC nice
score MY_HAM_SC -50
shortcircuit MY_HAM_SC on

However, this does not work, as messages which get BAYES_00 and AWL do not get Short Circuited... I presume that this is because the AWL, which normally runs at a priority of 1000, can't be accessed at an earlier stage?

I still want the AWL to do its normal job once the other scoring has finished, so I don't want to make its priority less than 1000, but was hoping that there was a way to query its information earlier in the SpamAssassin process.

Any ideas?

-- View this message in context: http://www.nabble.com/How-to-query-the-AWL-at-an-earlier-stage-for-Short-Circuit--tf4332696.html#a12339661
Re: Problem with clamav plugin
You need to set a high priority for the meta rules, as otherwise they are evaluated BEFORE the ClamAV plugin is used (I think?).

I am not an expert in how SA works, but I eventually came up with the following solution (for using several different 3rd party ClamAV signatures). This is my clamav.cf file:

loadplugin ClamAV clamav.pm
full CLAMAV eval:check_clamav()
describe CLAMAV Clam AntiVirus detected something...
score CLAMAV 0.001

# Look for specific types of ClamAV detections
header __CLAMAV_PHISH X-Spam-Virus =~ /Yes.{1,20}Phishing/i
header __CLAMAV_SANE X-Spam-Virus =~ /Yes.{1,20}Sanesecurity/i
header __CLAMAV_MBL X-Spam-Virus =~ /Yes.{1,20}MBL/
header __CLAMAV_MSRBL X-Spam-Virus =~ /Yes.{1,20}MSRBL/

# Give the above rules a very late priority so that they can see the output
# of previous rules - otherwise they don't work! Not sure what the correct
# priority should be but this seems to work...
priority __CLAMAV_PHISH
priority __CLAMAV_SANE
priority __CLAMAV_MBL
priority __CLAMAV_MSRBL

# Work out what ClamAV detected and score accordingly
meta CLAMAV_VIRUS (CLAMAV && !__CLAMAV_PHISH && !__CLAMAV_SANE && !__CLAMAV_MBL && !__CLAMAV_MSRBL)
describe CLAMAV_VIRUS Virus found by ClamAV default signatures
score CLAMAV_VIRUS 20.0

meta CLAMAV_PHISH (CLAMAV && __CLAMAV_PHISH && !__CLAMAV_SANE)
describe CLAMAV_PHISH Phishing email found by ClamAV default signatures
score CLAMAV_PHISH 10.0

meta CLAMAV_SANE (CLAMAV && __CLAMAV_SANE)
describe CLAMAV_SANE SPAM found by ClamAV SaneSecurity signatures
score CLAMAV_SANE 7.5

meta CLAMAV_MBL (CLAMAV && __CLAMAV_MBL)
describe CLAMAV_MBL Malware found by ClamAV MBL signatures
score CLAMAV_MBL 7.5

meta CLAMAV_MSRBL (CLAMAV && __CLAMAV_MSRBL)
describe CLAMAV_MSRBL SPAM found by ClamAV MSRBL signatures
score CLAMAV_MSRBL 2.0

In your case you could fix what you have done (which looks to be taken from one of my previous messages while trying to get this to work myself?) by making it:

header __MY_CLAMAV X-Spam-Virus =~ /Yes/i
priority __MY_CLAMAV
header __MY_CLAMAV_SANE X-Spam-Virus =~ /Yes.{1,50}Sanesecurity/i
priority __MY_CLAMAV_SANE
meta MY_CLAMAV_SANE (__MY_CLAMAV && __MY_CLAMAV_SANE)
score MY_CLAMAV_SANE 5

Hope this helps!

-- View this message in context: http://www.nabble.com/Problem-with-clamav-plugin-tf4135813.html#a11763227
Re: My bash script to upload PDFinfo daily, safely
I have found SaneSecurity definitions to be VERY good - they hit about 60% of my SPAM, which is incredible given that they only match exact results (they are not fuzzy). However, this high percentage may be because I am based in the UK, as is the author of the SaneSecurity definitions. Also, they tend to hit already high scoring spam, so they aren't a miracle spam fighting measure, though they are good.

My biggest concern was over possible false positives, given that there is only one person working on these definitions, unlike the official ClamAV signatures... However, I have yet to have any problems with them in the month that I have been using them.

There are also two other sets of ClamAV signatures which I am now testing (though these are not as good IMHO):

http://www.malware.com.br/ (various formats including ClamAV)
http://www.msrbl.com/site/ (ClamAV as well as RBLs)

As a solution to my own concerns over false positives I have changed from virus scanning at SMTP time and have moved to using the ClamAV SpamAssassin plugin:

http://wiki.apache.org/spamassassin/ClamAVPlugin

Rather than using the standard clamav.cf I have written my own, which gives different scores depending on which ClamAV signature found something:

loadplugin ClamAV clamav.pm
full CLAMAV eval:check_clamav()
describe CLAMAV Clam AntiVirus detected something...
score CLAMAV 0.001

# Look for specific types of ClamAV detections
header __CLAMAV_PHISH X-Spam-Virus =~ /Yes.{1,20}Phishing/i
header __CLAMAV_SANE X-Spam-Virus =~ /Yes.{1,20}Sanesecurity/i
header __CLAMAV_MBL X-Spam-Virus =~ /Yes.{1,20}MBL/
header __CLAMAV_MSRBL X-Spam-Virus =~ /Yes.{1,20}MSRBL/

# Give the above rules a very late priority so that they can see the output
# of previous rules - otherwise they don't work!
priority __CLAMAV_PHISH
priority __CLAMAV_SANE
priority __CLAMAV_MBL
priority __CLAMAV_MSRBL

# Work out what ClamAV detected and score accordingly
meta CLAMAV_VIRUS (CLAMAV && !__CLAMAV_PHISH && !__CLAMAV_SANE && !__CLAMAV_MBL && !__CLAMAV_MSRBL)
describe CLAMAV_VIRUS Virus found by ClamAV default signatures
score CLAMAV_VIRUS 20.0

meta CLAMAV_PHISH (CLAMAV && __CLAMAV_PHISH && !__CLAMAV_SANE)
describe CLAMAV_PHISH Phishing email found by ClamAV default signatures
score CLAMAV_PHISH 10.0

meta CLAMAV_SANE (CLAMAV && __CLAMAV_SANE)
describe CLAMAV_SANE SPAM found by ClamAV SaneSecurity signatures
score CLAMAV_SANE 7.5

meta CLAMAV_MBL (CLAMAV && __CLAMAV_MBL)
describe CLAMAV_MBL Malware found by ClamAV MBL signatures
score CLAMAV_MBL 7.5

meta CLAMAV_MSRBL (CLAMAV && __CLAMAV_MSRBL)
describe CLAMAV_MSRBL SPAM found by ClamAV MSRBL signatures
score CLAMAV_MSRBL 2.0

Hope this is of some help to someone...

-- View this message in context: http://www.nabble.com/My-bash-script-to-upload-PDFinfo-daily%2C-safely-tf4115144.html#a11732078
Re: is there a whitelist rhswl available
http://www.dnswl.org/
http://wiki.ctyme.com/index.php/Spam_DNS_Lists

Both work well IMHO

Ramprasad wrote:
>
> There are quite a few domains you can trust not to send spam.
> For example the airlines, the banks, and a lot of others like
> spamassassin.apache.org :-)
>
> If mails from these domains get an SPF/DK pass we can simply pass the
> mails. Today I manually maintain a list of whitelist_from_auth
>
> Is there a global DNS WL available somewhere, so that I don't have to
> keep tracking myself which new bank has put up SPF
> records?
>
>
> Thanks
> Ram
>
>

-- View this message in context: http://www.nabble.com/is-there-a-whitelist-rhswl-available-tf4102536.html#a11668610
Re: ClamAV in SA( was: SaneSecurity)
Is the following easy to do? I am a bit of a Linux novice I'm afraid...

I have tried discarding at SMTP with ClamAV and Exim, and scanning in SA using the ClamAV plugin, but wasn't 100% happy with either solution (for the reasons you give).

Any pointers would be gratefully accepted!

>We do, and I think they are. Currently I run two instances of
>clamd in our mail gateway.
>One instance has only the official ClamAV databases with phishing
>signatures turned off. This instance is used by MIMEDefang (a
>milter) for discarding infected mail.
>The second instance has the official databases with phishing
>signatures (and some other stuff) turned on as well as the
>SaneSecurity*, MSRBL* and Malware* signatures. This instance is
>used by SpamAssassin for scoring mail.

-- View this message in context: http://www.nabble.com/SaneSecurity-tf3989268.html#a11400255
Writing a rule to access SA ClamAV Plugin Header
There is a SpamAssassin plugin which checks messages with ClamAV, and which adds the following header to emails it processes:

X-Spam-Virus: Yes ($VirusName)

http://wiki.apache.org/spamassassin/ClamAVPlugin

By default you can set a score in its clamav.cf file:

score CLAMAV 10

I am currently testing a 3rd party set of ClamAV definitions from a website called www.sanesecurity.co.uk, which look to be very effective against some phishing and image spam emails. When it fires on an email, the header the ClamAV plugin adds is as follows:

X-Spam-Virus: Yes ($Name.Sanesecurity)

What I would like to do is score the ClamAV detection differently depending on whether it was detected by the ClamAV default signatures (virus) or the Sanesecurity signatures (spam).

I have tried adding the following to local.cf but it doesn't seem to be working:

header __MY_CLAMAV X-Spam-Virus =~ /Yes/i
header __MY_CLAMAV_SANE X-Spam-Virus =~ /Yes.{1,50}Sanesecurity/i
meta MY_CLAMAV (__MY_CLAMAV && !__MY_CLAMAV_SANE)
meta MY_CLAMAV_SANE (__MY_CLAMAV && __MY_CLAMAV_SANE)
score MY_CLAMAV 10
score MY_CLAMAV_SANE 5

Any suggestions?

-- View this message in context: http://www.nabble.com/Writing-a-rule-to-access-SA-ClamAV-Plugin-Header-tf4007944.html#a11382177
Re: exposing rules
Assuming that you have managed to get SA to add headers to messages which it thinks are spam, and are looking to add a header to ALL messages so you can see what rules are firing on your HAM, then you can do the following. This may not be what you are after, but may be of some use!

Edit your local.cf file and add:

add_header all Status _YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTSSCORES(,)_ _DCCR_ _PYZOR_ _RBL_ autolearn=_AUTOLEARN_ languages=_LANGUAGES_

Note: this should all be added as ONE long line!

-- View this message in context: http://www.nabble.com/exposing-rules-tf3979477.html#a11314268
Re: Botnet Score
Though BotNet is VERY effective in catching SPAM, the default score of 5 is way too high IMHO.

With a well trained BAYES, using a selected list of RBLs and URIBLs for scoring, the SARE rules, and some custom rules of my own, I am confident that I am catching well over 90% of the SPAM hitting my server (about 5000 emails received a week), with almost no false positives.

Based on this I set BotNet to score 0.001 for all its rules (so as not to confuse the issue), and after a week examined its effectiveness using sa-stats.pl...

It detected 91.7% of SPAM, which is FANTASTIC! But it also fired on 9.6% of my HAM emails, which is not so good :(

Normally if a rule gets an FP rate this high then I would discard it, but given the amount of SPAM it catches I have left it running, set to add only 1 to the scores of the emails it detects (as this will not be enough to greatly affect the scores of the false positive ham emails it hits). In this fashion it helps to up-score my SPAM enough to push it over my BAYES training threshold and my Delete threshold.

One other benefit of BotNet is that it includes some rules which can be used to down-score some genuine commercial emails and emails sent through an ISP's mail servers.

My scores for BotNet are as follows:

score BOTNET 1.000
score BOTNET_CLIENT 0.100
score BOTNET_CLIENTWORDS 0.100
score BOTNET_IPINHOSTNAME 0.500
score BOTNET_SOHO -0.100
score BOTNET_SERVERWORDS -0.500

Other things you should look at are upgrading to SA 3.2.1, as this includes URIBL_BLACK by default (another very effective rule), and possibly using the SAGREY plugin (which uses the auto white list feature to see if an email is the first one you have had from an address, and in this case if it looks to be SPAM it adds a bit more to its score!).

Obviously your mileage may vary!

Oliver

Matt-123 wrote:
>
> I have added botnet to my Spamassassin install. It seems to have
> helped quite a bit so far. I am just wondering about the 5 points it
> gives for a hit. Is that too much? Does it have a lot of false
> positives or not?
>
> Matt
>
>

-- View this message in context: http://www.nabble.com/Botnet-Score-tf3971206.html#a11276655
Changes to SURBL in SA 3.2.1?
EDIT: My mistake - the URIBLs are listed in two different places in the 3.2.1 rules table! However, URIBL_BLACK does seem to be listed twice with different names and scores...

I have just been picking through some of the changes in 3.2.1 (having just installed it) to see what impact this will have on my custom rules and RBLs etc. and have noticed something strange!

3.2.0 checked the following URIBLs http://spamassassin.apache.org/tests_3_0_x.html:

URIBL_SBL
URIBL_SC_SURBL
URIBL_WS_SURBL
URIBL_PH_SURBL
URIBL_OB_SURBL
URIBL_AB_SURBL

3.2.1 checks the following URIBLs http://spamassassin.apache.org/tests_3_2_x.html:

URIBL_COMPLETEWHOIS
URIBL_RHS_ABUSE
URIBL_RHS_AHBL
URIBL_RHS_BOGUSMX
URIBL_RHS_DOB
URIBL_RHS_DSN
URIBL_RHS_POST
URIBL_RHS_TLD_WHOIS
URIBL_RHS_URIBL_BLACK
URIBL_RHS_URIBL_GREY
URIBL_RHS_WHOIS
URIBL_XS_SURBL (URL listed in XS SURBL - Testing)

My question is: Does URIBL_XS_SURBL replace all the previous SURBL black lists? Is it in effect multi.surbl.org? I can't find any details on XS_SURBL on the surbl.org website...

If it is a multiple check then this will reduce the scoring of some SPAM, as it is scored at 1, when some of the old SURBL rules were scored at 2, 3, and even 4! Not a problem IMHO, as SA now includes several other good URI BLs including the excellent URIBL_BLACK.

-- View this message in context: http://www.nabble.com/Changes-to-SURBL-in-SA-3.2.1--tf3969802.html#a11267936