Confused about how to use sa-update
I have just found out why most of my emails have been getting tagged as spam this year. It's because of a bug in a rule which causes this hit when it shouldn't:

FH_DATE_PAST_20XX  The date is grossly in the future.

The file at fault is 72_active.cf, a SpamAssassin rule file, and it can be fixed by getting the new file via sa-update. But I don't understand how to use sa-update. I've run it and I can see all the new rule files in /var/lib/spamassassin/3.002005. However, I think my rules run off the files in /usr/share/spamassassin/.

The wiki at http://wiki.apache.org/spamassassin/RuleUpdates#Using_sa-update says NOT to use the --updatedir parameter to put updates in /usr/share/spamassassin. So how exactly do you get the new rule files into /usr/share/spamassassin so they start working? Do you just copy them across manually, or is there a way of getting sa-update to do it automatically?
keep-alive check?
I've just found this line on the spamc man page:

-K  Perform a keep-alive check of spamd, instead of a full message check.

Does anyone know what it means, and what it actually does?
Re: keep-alive check?
On Wednesday, 31 of March 2010, David wrote:
> I've just found that line on the spamc man page:
> -K  Perform a keep-alive check of spamd, instead of a full message check.
> Someone knows what it means, and what it actually does?

It does what it says. A keep-alive check just connects to spamd to see whether the daemon is still alive. A full message scan is not needed for this.

--
k...@epsilon.eu.org
http://epsilon.eu.org/
sa-update
I installed the following packages with yum: postfix, amavisd-new and spamassassin. I have *.cf files in the /usr/share/spamassassin/ directory and now I would like to update them. Is that possible with sa-update? If yes, what is the complete command to update the *.cf files in /usr/share/spamassassin/?

Thanks
Andrea
Re: Confused about how to use sa-update
On Wed, 2010-03-31 at 19:15 +1100, Phill Edwards wrote:
> I have just found out why most of my emails have been getting tagged as spam this year. It's because of a bug in a rule which causes this hit to happen when it shouldn't:
> FH_DATE_PAST_20XX  The date is grossly in the future.
> The actual file at fault is 72_active.cf which is a spamassassin rule file and it can be fixed by getting the new file via sa-update. But I don't understand how to use sa-update. I've run it and I can see all the new rule files in /var/lib/spamassassin/3.002005. However, I think my rules run off the files in /usr/share/spamassassin/.

man spamassassin

Pay special attention to the section "Configuration Files". sa-update is doing the right thing, and spamassassin will use the update dir instead of the base stock rules. You merely need to restart spamd, or whatever else you are using.

> The wiki at http://wiki.apache.org/spamassassin/RuleUpdates#Using_sa-update says NOT to use the --updatedir parameter to put updates in /usr/share/spamassassin. So how exactly do you get the new rule files into /usr/share/spamassassin so they start working? Do you just copy them across manually, or is there a way of getting sa-update to do it automatically?

Pretty much the same answer as above. Do not move, copy or otherwise harm those files. :) sa-update knows where spamassassin expects the rules.

guenther

--
char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: Confused about how to use sa-update
Phill Edwards wrote on Wed, 31 Mar 2010 19:15:18 +1100:

So, you have finally found sa-update? Wow.

> So how exactly do you get the new rule files into /usr/share/spamassassin so they start working?

Run a debug lint and you will see that the /var/lib directory gets used when it contains rules. Nothing gets moved.

Kai

--
Get your web at Conactive Internet Services: http://www.conactive.com
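For anyone wondering where the directory name 3.002005 comes from, here is a small Python sketch. It assumes the name encodes the SpamAssassin version as major, then minor and patch each zero-padded to three digits (so 3.2.5 becomes 3.002005); that pattern is inferred from the single path quoted in this thread, and the helper names are made up for illustration.

```python
def sa_version_dir(version: str) -> str:
    """Format a dotted version such as '3.2.5' as '3.002005'.

    Assumption: minor and patch are each zero-padded to three digits,
    as suggested by the /var/lib/spamassassin/3.002005 path in the thread.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor:03d}{patch:03d}"


def update_dir(version: str, state_dir: str = "/var/lib/spamassassin") -> str:
    """Build the directory where sa-update is expected to place rule updates."""
    return f"{state_dir}/{sa_version_dir(version)}"


if __name__ == "__main__":
    print(update_dir("3.2.5"))
```

If that directory exists and contains rules, the debug lint mentioned above should list it among the paths in use.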
Re: sa-update
On 31.3.2010 14:02, Andrea Bencini wrote:
> I installed with yum the following packages: postfix, amavisd-new and spamassassin. I have *.cf in /usr/share/spamassassin/ directory and now I would like update them. Is it possible? with sa-update? If yes which is the complete command to use to update *.cf in /usr/share/spamassassin/ directory?

The rules are normally in /usr/lib/spamassassin, and sa-update knows where they are supposed to be in any case. The simplest case of sa-update is just

# sa-update

That's all. But there are options. You may want to include more channels than the default. A good tutorial can be found at

http://khopesh.com/wiki/Anti-spam

The author is a member of this list, and is also available on IRC. Currently I run sa-update just as it says in that tutorial, with the same channels.

--
http://www.iki.fi/jarif/
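If you run sa-update from a script, it helps to act on its exit status. The Python sketch below assumes the convention described in the sa-update man page, where 0 means updates were downloaded and installed and 1 means no fresh updates were available; every other status is treated as an error here, which is a deliberately cautious choice of this sketch. The function names are hypothetical, and restarting spamd afterwards is left to you.

```python
import subprocess


def rules_changed(exit_code: int) -> bool:
    """Interpret an sa-update exit status.

    Assumed convention: 0 = updates installed, 1 = nothing new.
    Anything else is treated as a failure by this sketch.
    """
    if exit_code == 0:
        return True
    if exit_code == 1:
        return False
    raise RuntimeError(f"sa-update failed with exit status {exit_code}")


def run_sa_update(command=("sa-update",)) -> bool:
    """Run sa-update and report whether spamd should be restarted."""
    result = subprocess.run(list(command), check=False)
    return rules_changed(result.returncode)
```

A cron job could call run_sa_update() and restart spamd only when it returns True.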
Re: sa-update
On 3/31/2010 7:02 AM, Andrea Bencini wrote:
> I installed with yum the following packages: postfix, amavisd-new and spamassassin. I have *.cf in /usr/share/spamassassin/ directory and now I would like update them. Is it possible? with sa-update? If yes which is the complete command to use to update *.cf in /usr/share/spamassassin/ directory?
> Thanks
> Andrea

Don't. Let sa-update put them in /var/lib/spamassassin. SpamAssassin will check this location for new rulesets and use it instead of /usr/share.

If you're not sure this is happening, try running

spamassassin --lint -D

and check the top of the debug output for the rule paths SA is using.
Limit SA to scan messages 100k and below
Hi Guys,

My current sysadmin has now left the company and I'm new to SA and Exim. Needless to say, I have been assigned the task of looking after the server. I'm hoping I've come to the right place for my questions to be answered.

The system I have is running on:

Gentoo Base System release 1.12.10
SpamAssassin version 3.2.5 running on Perl version 5.8.8
Exim version 4.69

Here is my spamd.conf file:

===
SPAMD_OPTS=-m 25 -H -u mail -D

# spamd stores its pid in this file. If you use the -u option to
# run spamd under another user, you might need to adjust it.
PIDFILE=/var/run/spamd.pid

# SPAMD_NICELEVEL lets you set the 'nice'ness of the running
# spamd process
SPAMD_NICELEVEL=1
===

I've read somewhere that the default setting for SA to scan a message is 500k. Can I reduce this, so that SA scans messages 100k and below?

Many Thanks in advance
Re: Limit SA to scan messages 100k and below
Hi

On Wed, Mar 31, 2010 at 2:24 PM, Keith De Souza kbdeso...@googlemail.com wrote:
> Hi Guys,
> [snip]
> I've read somewhere that the default setting for SA to scan a message is 500k. Can I reduce this, so that SA scans messages 100k and below?

Have you tried Google first?

http://www.google.dk/#hl=dasafe=offq=spamd+scan+messages+sizemeta=aq=faqi=aql=oq=gs_rfai=fp=15904d39482f0df0

Maybe this one:

http://spamassassin.apache.org/full/3.2.x/doc/spamc.html

I'm no expert at spamc, but this seems to be the right setting to go for. But is there a reason for lowering it?

> Many Thanks in advance

mvh
Mikael Syska
Re: Limit SA to scan messages 100k and below
On Wed, 2010-03-31 at 13:24 +0100, Keith De Souza wrote:
> My current sysadmin has now left the company and I'm new to SA and Exim.
> [...]
> I've read somewhere that the default setting for SA to scan a message is 500k.

That's actually the default for spamc. Messages exceeding the threshold just won't be passed to spamd. SA (and spamd) will check everything it gets passed.

> Can I reduce this, so that SA scans messages 100k and below?

You need to change whatever glue you are using to pass messages to SA, and skip the scanning for messages larger than your desired threshold.

That said, IMHO 100k is rather low. Why do you want that particular threshold?

guenther
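To make the "glue does the size check" point concrete, here is a minimal Python sketch of the decision. It is not how spamc or any particular MTA implements it; the constant treats the 500k default as 500 * 1024 bytes, which is an assumption, and scan stands in for whatever actually hands the message to SA.

```python
from typing import Callable

# Assumption: the "500k" default mentioned above is taken as 500 * 1024 bytes.
DEFAULT_MAX_SIZE = 500 * 1024


def should_scan(message: bytes, max_size: int = DEFAULT_MAX_SIZE) -> bool:
    """Return True when the message is small enough to be passed to SA."""
    return len(message) <= max_size


def filter_message(message: bytes,
                   scan: Callable[[bytes], bytes],
                   max_size: int = DEFAULT_MAX_SIZE) -> bytes:
    """Scan small messages; pass oversized ones through untouched."""
    if should_scan(message, max_size):
        return scan(message)
    return message
```

With max_size set to 100 * 1024, anything larger than 100k skips the scanner entirely, which is why a low threshold lets large spam through unchecked.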
Re: Limit SA to scan messages 100k and below
Hi,

Remember to respond to the mailing list, so other users can follow this too.

On Wed, Mar 31, 2010 at 2:54 PM, Keith De Souza kbdeso...@googlemail.com wrote:
>> But are there are reason for dropping it?
>
> I'm having a few errors in my Exim logs from legitimate senders not coming through:
> ===
> 2010-03-31 01:22:25 1Nwlbc-0001QS-Ua H=host81-136-197-86.in-addr.btopenworld.com (mail.duke.tv) [81.136.197.86] F=l...@dukeandearl.com temporarily rejected after DATA
> ===
> And after checking my SA logs:
> ===
> Mar 31 01:25:51 mailserver spamd[5379]: spamd: result: . -4 - GENESIS_PHONENUMBER07 scantime=300.0,size=24337,user=nobody,uid=8,required_score=3.2,rhost=localhost,raddr=127.0.0.1,rport=42308,mid=c7d27527.8a78%l...@dukeandearl.com,autolearn=unavailable
> ===

Your required score is very low, but that's not the problem.

> I'm trying to understand why is it taking 300.0 seconds to scan a message only 24Kb in size??

This is not the way to go. There could be other problems, like SA rules or RBLs timing out. Are you running sa-update?

> I'm beginning to think that because SA is taking so long to scan the message, it is timing out and hence Exim returning a temporarily reject after DATA. My thoughts so far is to perhaps reducing the file size that SA takes to scan and see if the scan time reduces.

Are there lots of mails in the queue?

> I may be wrong in my troubleshooting methods but I'm not sure why this is happening at present.
> Many Thanks

mvh
Mikael Syska
Re: Limit SA to scan messages 100k and below
Hi

> You need to change whatever glue you are using to pass messages to SA, and skip the scanning for messages larger than your desired threshold.

Sorry, as I'm new to SA, can you elaborate on what you mean by glue?

> That said, IMHO 100k is rather low. Why do you want that particular threshold?

Judging from your response, I may be wrong in what I need to do. Basically I'm having a few errors in my Exim logs from legitimate senders not coming through:

===
2010-03-31 01:22:25 1Nwlbc-0001QS-Ua H=host81-136-197-86.in-addr.btopenworld.com (mail.duke.tv) [81.136.197.86] F=l...@dukeandearl.com temporarily rejected after DATA
===

And after checking my SA logs:

===
Mar 31 01:25:51 mailserver spamd[5379]: spamd: result: . -4 - GENESIS_PHONENUMBER07 scantime=300.0,size=24337,user=nobody,uid=8,required_score=3.2,rhost=localhost,raddr=127.0.0.1,rport=42308,mid=c7d27527.8a78%l...@dukeandearl.com,autolearn=unavailable
===

I'm trying to understand why it is taking 300.0 seconds to scan a message only 24 KB in size. I'm beginning to think that because SA is taking so long to scan the message, it is timing out, and hence Exim is returning a temporary reject after DATA.

My thought so far is to reduce the message size that SA will scan and see if the scan time drops. I may be wrong in my troubleshooting methods, but I'm not sure why this is happening at present.

Many Thanks
Re: Limit SA to scan messages 100k and below
Hi

Oops, I only realized after I had sent you the message, but will do.

> Are you running sa-update?

I might not be. How can I check?

> Are there lots of mails in the queue?

No mails in the queue. I should also say that mail is coming in fine and we are receiving it, but certain legitimate mail (like the one sent) is not getting through, and SA takes 300.0 seconds to scan it. I'm also seeing these in my logs:

spam acl condition: error reading from spamd socket: Connection timed out

Many Thanks
Re: Limit SA to scan messages 100k and below
From: Keith De Souza kbdeso...@googlemail.com
Date: Wed, 31 Mar 2010 14:10:50 +0100

> Judging from your response, I may be wrong in what I need to do: Basically I'm having a few errors in my Exim logs from legitimate senders not coming through:
> ===
> 2010-03-31 01:22:25 1Nwlbc-0001QS-Ua H=host81-136-197-86.in-addr.btopenworld.com (mail.duke.tv) [81.136.197.86] F=l...@dukeandearl.com temporarily rejected after DATA
> ===
> And after checking my SA logs:
> ===
> Mar 31 01:25:51 mailserver spamd[5379]: spamd: result: . -4 - GENESIS_PHONENUMBER07 scantime=300.0,size=24337,user=nobody,uid=8,required_score=3.2,rhost=localhost,raddr=127.0.0.1,rport=42308,mid=c7d27527.8a78%l...@dukeandearl.com,autolearn=unavailable
> ===

300 seconds looks like a timeout. Something is giving up after waiting 300 seconds. Note the autolearn=unavailable. I'd guess that you are getting locked out of the Bayes database. You probably had a Bayes expire running at the same time. There should be messages about this in a log file.

If this is the case, you can turn off bayes_auto_expire and run the expire from cron. You could also try learning to the journal and running sa-learn --sync periodically from cron.

-jeff
Re: Limit SA to scan messages 100k and below
Keith De Souza wrote:
> I'm trying to understand why is it taking 300.0 seconds to scan a message only 24Kb in size?? I'm beginning to think that because SA is taking so long to scan the message, it is timing out and hence Exim returning a temporarily reject after DATA.
> My thoughts so far is to perhaps reducing the file size that SA takes to scan and see if the scan time reduces.
> I may be wrong in my troubleshooting methods but I'm not sure why this is happening at present.

My first suggestion to anyone who is having problems with SA running slowly is to check memory usage. You posted previously that your conf file contained this:

SPAMD_OPTS=-m 25 -H -u mail -D

-m 25 means that you are allowing up to 25 spamd child processes. On my system (with a few extra rulesets), the spamd processes take up about 60-70M each. How much memory do you have? You need to make sure that the machine doesn't go into swap. If it does, SA will slow down dramatically. Try running the free command to see how much memory you have available. If you are close to the edge, you may want to lower the number of processes.

-H is an option to change the home directory and generally requires an argument, so I'm not sure what it's doing here.

-u mail means spamd is running as the user mail. So when you are testing, manually training the Bayes db, etc., make sure you are logged in as mail so that you are using the same settings and databases as spamd.

-D puts spamd into debug mode. Aside from filling up your logs with excess debug information, this will probably slightly increase memory use and slow down the scanning process. If you don't need it for some reason, get rid of it.

--
Bowie
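The memory arithmetic above is easy to put into a few lines of Python. This is only a back-of-the-envelope sketch: the 60-70M per child figure is the one quoted for one particular system, and the amount of RAM and the reserve for other services in the example call are made-up numbers.

```python
def spamd_memory_mb(children: int, mb_per_child: int) -> int:
    """Worst-case memory use if every spamd child is running."""
    return children * mb_per_child


def max_children(available_mb: int, mb_per_child: int, reserve_mb: int = 0) -> int:
    """Largest child count that fits in RAM after reserving some for other services."""
    usable = available_mb - reserve_mb
    return max(0, usable // mb_per_child)


if __name__ == "__main__":
    # 25 children at 60-70 MB each, as discussed above.
    print(spamd_memory_mb(25, 60), spamd_memory_mb(25, 70))
    # Hypothetical box: 1024 MB RAM, 256 MB kept back for Exim and friends.
    print(max_children(1024, 70, reserve_mb=256))
```

With -m 25 the worst case is roughly 1.5 to 1.75 GB for spamd alone, so a machine with less RAM than that would be pushed into swap under load.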
Re: Limit SA to scan messages 100k and below
On Wed, 31 Mar 2010, Keith De Souza wrote:
> Sorry as I'm new to SA can you elaborate what you mean by glue?

Geek terminology for the program, script or other mechanism that 'connects' your MTA and your SA. I.e., the calling MTA or its script must do the size check, then decide *whether* to call SA.

> I'm trying to understand why is it taking 300.0 seconds to scan a message only 24Kb in size??

1) The server is overloaded. Your load only has to go 10-20% over your system's 'maximum capacity' to cause processing times to jump from 20 seconds up to five minutes or more.

2) Something that SA relies upon, like your DNS server, is taking way too long to do its job. Check that your DNS has a reasonable timeout value. Otherwise it could be waiting on a non-existent domain. This would be the case if the problem occurs for certain addresses, or more often on spam (which comes from 'unknown' systems) than on legitimate mail.

3) There may be a 'locking' issue with any databases (Bayes?) that SA uses. Again, this may only become a problem under heavy load, with too many concurrent SA processes.

> My thoughts so far is to perhaps reducing the file size that SA takes to scan and see if the scan time reduces.

It is a better idea to try to reduce the number of emails that SA will process at the same time.

- C
Re: Scanning large-body spam
Alex wrote:
> What settings do people typically have these days for the maximum scanned message size? Surprisingly, at least to me, I'm seeing spam in the 650k and 700k range, at least a few per hour, and are not scanned.
> Does anyone have any suggestions for optimizing the process for spam containing just a large image that would therefore bypass the typical scanning? Should I be scanning messages that large, then?

Depends on your available CPU resources. If you always have a low load average, you can scan larger messages. My production deployment is such a workhorse that I've got it set to 1.1MB.

My general advice is that since many spammers will check against a default SA scan before blasting out their messages, you want something slightly larger than whatever the default is (actually, in the event that it has changed between versions, something slightly larger than the largest default SA has ever shipped with).

Maybe somebody who knows the innards better can comment on how quickly and efficiently SA can ignore non-text attachments (for those of us who don't try to decode Word documents and PDFs or use OCR on images).

Wasn't some earlier version of SA capable of scanning just the /first/ [size] of an email? Probably harder to implement within MIME, but some control to internally truncate remaining pieces (for scanning only, like the pseudo-headers) would allow scanning beyond the size limit.
Re: Scanning large-body spam
On Wed, Mar 31, 2010 at 11:05:57AM -0400, Adam Katz wrote:
> Wasn't some earlier version of SA capable of scanning just the /first/ [size] of an email? Probably harder to implement within MIME, but some control to internally truncate remaining pieces (for scanning only, like the pseudo-headers) would allow scanning beyond the size limit.

SA 3.3 has special handling for truncated messages, and amavisd-new (if it's your choice of glue) has already done it since 2.6.3. Never encountered a problem with it. Here are the release notes for the record:

- large messages beyond $sa_mail_body_size_limit are now partially passed to SpamAssassin and other spam scanners for checking: a copy passed to a spam scanner is truncated near or slightly past the indicated limit. Large messages are no longer given an almost free passage through spam checks.

  Note that message truncation can invalidate a DKIM or DK signature. If using (non-default) SpamAssassin rules to assign score points to mail with no valid signatures from authors which are expected to always provide a valid signature, the message truncation can cause false positives on these rules. As a workaround, to a truncated message passed to spam scanners, amavisd inserts a header field:

    X-Amavis-MessageSize: m, TRUNCATED to n

  which can be captured by SpamAssassin rules, e.g.:

    header __TRUNCATED X-Amavis-MessageSize =~ m{\A[^\n]*TRUNCATED}m

  and used in rules like NOTVALID_EBAY to prevent them from triggering.

  Starting with version 3.3.0 of SpamAssassin, its DKIM plugin understands the issue and receives undamaged DKIM signature objects directly from amavisd, so the above workaround is not needed. Also, a hit on a __TRUNCATED rule is automatically generated (explicit header rule is not necessary), just in case it might be useful for some purpose.

For other glue, I recommend taking it up with the author to support truncating properly. (Hmm, I don't think spamc has been enhanced yet..)

Of course we hope that someday SA will have true support for ignoring useless attachment data.
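To illustrate the truncate-and-mark idea from those release notes, here is a rough Python sketch. It is not amavisd-new's code: it cuts exactly at the limit rather than "near or slightly past" it, it assumes that m and n in the header are the original and truncated sizes in bytes, and the function name is invented. The regular expression is a Python rendering of the rule pattern quoted above.

```python
import re
from typing import Optional, Tuple

# Python equivalent of the quoted rule pattern m{\A[^\n]*TRUNCATED}m
TRUNCATED_RE = re.compile(r"\A[^\n]*TRUNCATED", re.MULTILINE)


def truncate_for_scan(message: bytes, limit: int) -> Tuple[bytes, Optional[str]]:
    """Return the copy to hand to the scanner and the marker header value.

    Messages within the limit are returned unchanged with no marker.
    Larger ones are cut at `limit` bytes and get an X-Amavis-MessageSize
    header prepended (assumed meaning: original size, truncated size).
    """
    if len(message) <= limit:
        return message, None
    value = f"{len(message)}, TRUNCATED to {limit}"
    header = f"X-Amavis-MessageSize: {value}\r\n".encode("ascii")
    return header + message[:limit], value


if __name__ == "__main__":
    big = b"Subject: test\r\n\r\n" + b"x" * 2000
    copy, marker = truncate_for_scan(big, 500)
    print(marker, bool(TRUNCATED_RE.search(marker)))
```

Only the copy given to the scanner is shortened; the message that gets delivered is still the original.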
Re: Limit SA to scan messages 100k and below
On Wed, 2010-03-31 at 15:06 +0200, Mikael Syska wrote:
> I'm trying to understand why is it taking 300.0 seconds to scan a message only 24Kb in size??

Use the sysstat tool-set to find out what's going on in your system and fix that.

I agree with those who say that -m 25 is too large a value. If that's the problem then you don't need the sysstat programs to see it: just run 'top' and you'll see the swap space used value changing and kswapd being busy. Try simply deleting the -m option, which gives you the default of 5 children, and see how SA performance changes.

To provide some guide numbers I looked at my two SA setups, which both use the default number of children:

- My SA rule development rig runs on a 1.5GHz CoreDuo laptop with 1GB RAM. It can scan my two biggest spam test messages (412KB and 360KB) in 21 seconds. However, scan time depends on the message content: the 412KB message takes only 1 second to scan by itself, while the 360KB one takes the other 20 seconds. This set-up uses SA 3.3.0.

- My main server is a lot smaller: an 866MHz P3 with 512 MB RAM. It runs SA 3.2.5. Here are the numbers from its set of maillogs:

  Messages scanned: 2758
  Message size: min 2072, avg 7223, max 417840 bytes
  Scan times: min 0.7, avg 2.247, max 21.1 seconds

  This machine is also running getmail, Postfix, Dovecot, named, ntpd, Samba, Apache and PostgreSQL, and is used for Java development as well.

Martin
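Numbers like these can be pulled out of the "spamd: result:" lines in the maillog. The Python sketch below is one way to do it, not the script used for the figures above; it assumes each result line carries scantime= and size= fields laid out as in the log line quoted earlier in this thread, and the second sample line is invented purely for illustration.

```python
import re
from statistics import mean

# Assumes the field layout seen in this thread: ...scantime=300.0,size=24337,...
RESULT_RE = re.compile(
    r"spamd: result:.*?scantime=(?P<scantime>[\d.]+),size=(?P<size>\d+)"
)


def scan_stats(lines):
    """Return message count plus (min, avg, max) for size and scan time, or None."""
    times, sizes = [], []
    for line in lines:
        match = RESULT_RE.search(line)
        if match:
            times.append(float(match.group("scantime")))
            sizes.append(int(match.group("size")))
    if not times:
        return None
    return {
        "messages": len(times),
        "size": (min(sizes), mean(sizes), max(sizes)),
        "scantime": (min(times), mean(times), max(times)),
    }


if __name__ == "__main__":
    sample = [
        # Field values taken from the log line quoted in this thread.
        "spamd[5379]: spamd: result: . -4 - GENESIS_PHONENUMBER07 "
        "scantime=300.0,size=24337,user=nobody,uid=8",
        # Hypothetical second line, made up for the example.
        "spamd[5380]: spamd: result: . 0 - HTML_MESSAGE "
        "scantime=2.0,size=1000,user=nobody,uid=8",
    ]
    print(scan_stats(sample))
```

A maximum scan time that sits at exactly 300.0 seconds while the average stays at a few seconds points at a timeout rather than at message size.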
Re: Scanning large-body spam
On Wed, 31 Mar 2010, Henrik K wrote: SA 3.3 has special handling for truncated messages Excuse me for not *thinking* earlier, but it occurs to me that there is a very big drawback to *truncating* a message before passing it to SA, as opposed to my original request/suggestion to *flag* (or set a config param?) to tell SA to *ignore* parts of a message past a certain size. I believe it is fairly common practice for MTA's to expect SA to return the *entire* message, complete with X-Spam header 'markup', from SA's standard output stream. This is particularly important where mail classified as *slightly* spammy is delivered to a special spam folder based upon the headers added by SA. Or on a system where all mail tagged as spam is quarantined. Having SA's markup/explanations is critical to analysing false positives/negatives. So SA needs to read and write the *entire* message, but then be given a parameter to keep it from thrashing over the really large ones. - Charles
SPAM from legit a Yahoo/Gmail account
I'm wondering if anyone else has an issue with spam that comes from a real Yahoo or Gmail account?

I've noticed a few emails get let into our organization every day that are sent from a free email account such as Yahoo or Gmail. When I do an rDNS lookup of the IP, it points back to a real server (not a spam server).

Here's an example of one that just got let in:

Mar 31 12:05:34 mailgate2 spamd[14709]: spamd: processing message 39701.814...@web36505.mail.mud.yahoo.com for apache:48
Mar 31 12:05:38 mailgate2 spamd[14709]: spamd: clean message (-0.1/4.4) for apache:48 in 3.8 seconds, 22865 bytes.
Mar 31 12:05:38 mailgate2 spamd[14709]: spamd: result: . 0 - DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,T_RP_MATCHES_RCVD

The subject of this email was: Launch of www.girlsandwomen.com G(irls) 20 Summit Website

Does anyone have any recommendations on how to fix that?

Thanks!
Kaleb
Is report_safe broken?
Greetings!

I upgraded SA from version 3.2.5 to 3.3.1 this morning. Since that time, all of the emails that are marked as spam are being converted to attachments.

One other oddity. If you look closely at the rewrite_header Subject line, you will count three %'s after the word SPAM. This is a change I made to test whether SA was reading this config file at all. It is. My email subject lines went from %%SPAM%% to %%SPAM%%% just as expected, but the required score stayed at 5.0 and didn't change to 50.0.

I re-checked the report_safe setting in the local.cf file and it is still set to zero, as it was before. I also checked for another cf file changing that parameter, but there are no other cf files that mention it. (Or .pre files, for that matter.) The user_prefs file in the home directory has nothing that is not commented out and has not been changed from the default.

I am calling spamc from a postfix filter line in master.cf, but that hasn't been changed since before the upgrade.

As always, what am I missing? TIA!

Here is my /etc/mail/spamassassin/local.cf file:

===
rewrite_header Subject %%SPAM%%% (_SCORE_)
add_header all Level _STARS(X)_
# required_score 5.0
required_score 50.0
report_safe 0
ok_locales en
# ok_languages en
# Use Bayesian classifier (default: 1)
# use_bayes 1
# Bayesian classifier auto-learning (default: 1)
# bayes_auto_learn 0
===

Michael Weber
Network Administrator
Allied National, Inc.
Re: SPAM from legit a Yahoo/Gmail account
Kaleb Hosie kho...@spectraaluminum.com 03/31/10 12:18 PM:
> I'm wondering if anyone else has an issue with SPAM that comes from a real yahoo or gmail account?
> [...]
> Does anyone have any recommendations on how to fix that?

One likely scenario is that the spammer managed to hack into an existing account and then used it to send out their garbage. One way to fix that is to ensure all humans with computer access always employ best practices for choosing and protecting secure passwords.

Another possible scenario is that the spammer created their own account just so their spam would look more legitimate. This is another human behavior issue for which (like the one above) there is unlikely ever to be an acceptable technological solution.

You're never going to stop ALL the spam, and for situations that represent, as you said, only a few messages, the effort to catch them is often more trouble than it's worth. Or the problem may just go away (the freemail host notices and closes the account) by the time you start trying to think of a solution.
Spamhaus Uncovers Fake DNSBL: nszones.com
Spamhaus has uncovered a fake spam filter company which was pirating and selling DNSBL data stolen from major anti-spam systems including Spamhaus, CBL and SURBL, republishing the stolen data under the name nszones.com. more: http://www.spamhaus.org/organization/statement.lasso?ref=8 -- Neil Schwartzman Senior Director Security Strategy, Receiver Services Return Path Inc. [303] 999-3217 Tweets: ReturnPathHelp
Re: Scanning large-body spam
Hi,

>> Does anyone have any suggestions for optimizing the process for spam containing just a large image that would therefore bypass the typical scanning? Should I be scanning messages that large, then?
>
> Depends on your available CPU resources. If you always have a low load average, you can scan larger messages. My production deployment is such a workhorse that I've got it set to 1.1MB.

Will messages this large have the benefit of Bayes? What would be the impact on the corresponding sa-learn of a message of that size? Perhaps only learn the header and the body components that aren't an attachment, somehow?

Thanks,
Alex
Re: SPAM from legit a Yahoo/Gmail account
Hi,

> I've noticed a few emails get let into our organization everyday that is sent from a free email account such as yahoo and gmail. When I do a rDNS lookup, of the IP, it points back to a real server (not a spam server).
> Here's an example of one that just got let in:
> Mar 31 12:05:34 mailgate2 spamd[14709]: spamd: processing message 39701.814...@web36505.mail.mud.yahoo.com for apache:48

That's a Yahoo message ID, but did it in fact come from Yahoo?

> Mar 31 12:05:38 mailgate2 spamd[14709]: spamd: result: . 0 - DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,T_RP_MATCHES_RCVD

Where did you get that T_RP_MATCHES_RCVD rule and what does it do? Is it something you wrote to match on a yahoo.com sender?

I've put together a few rules that match on freemail domains with particular contents (typically a URI) in the body, for instances just such as this.

If you're really having trouble, post a message to pastebin.com and a message to the list here with that link, so we can help further.

Best,
Alex
Re: Scanning large-body spam
On Wednesday March 31 2010 18:05:52 Charles Gregory wrote: Excuse me for not *thinking* earlier, but it occurs to me that there is a very big drawback to *truncating* a message before passing it to SA, as opposed to my original request/suggestion to *flag* (or set a config param?) to tell SA to *ignore* parts of a message past a certain size. I believe it is fairly common practice for MTA's to expect SA to return the *entire* message, complete with X-Spam header 'markup', from SA's standard output stream. This is particularly important where mail classified as *slightly* spammy is delivered to a special spam folder based upon the headers added by SA. Or on a system where all mail tagged as spam is quarantined. Having SA's markup/explanations is critical to analysing false positives/negatives. So SA needs to read and write the *entire* message, but then be given a parameter to keep it from thrashing over the really large ones. There are some drawbacks in depriving SpamAssassin of the full message and letting it work on a truncated message, appropriately marked as one. But the message header alone often carries half of the value that goes into the score. Adding the first 400 kB of the body to that already covers plenty of information about a message. It would of course be better to give SA access to full or summarized information about the rest of the message (such as its attachments) too, but doing without is not too bad. Comparing the quality of a score on a partial message with having no score at all (and passing any big message as clean) makes the decision trivial (it just needs to be done). I believe it is fairly common practice for MTA's to expect SA to return the *entire* message, complete with X-Spam header 'markup', from SA's standard output stream. Sure, but this is an implementation detail. 
There is no underlying reason that spamc could not keep the original message and only feed part of it to spamd, then merge the results back and do the final message editing (like inserting/editing header fields) by itself. Alternatively, spamd could be modified to handle messages of arbitrary size by dropping its current paradigm of keeping the entire message in memory. Anyway, the amavisd glue to SpamAssassin does just that: it lets SpamAssassin see only the first 400 kB (configurable) of a large message, then edits the original message based on the results obtained from SpamAssassin. This offers the best of both worlds: it handles messages of arbitrary size and avoids SpamAssassin slurping them entirely into memory. The tricky details are in editing the message, and ensuring that DKIM and DK signatures survive (which is done by using an out-of-band channel between a caller and SA with its plugins, as provided by SA 3.3). Mark
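The truncate-then-merge idea Mark describes can be sketched roughly as follows. This is a minimal illustration in Python, not amavisd's actual code: `scan_prefix` is a hypothetical stand-in for a call into SpamAssassin, `scan_and_tag` is a name invented here, and the 400 kB limit is simply the figure quoted in the message. It also leaves out the DKIM-related details mentioned above.

```python
from typing import Callable, Dict

SCAN_LIMIT = 400 * 1024  # bytes handed to the scanner; 400 kB as quoted above


def scan_and_tag(raw: bytes,
                 scan_prefix: Callable[[bytes], Dict[str, str]],
                 limit: int = SCAN_LIMIT) -> bytes:
    """Scan only the first `limit` bytes, then tag the full original message.

    `scan_prefix` is a hypothetical callback standing in for SpamAssassin:
    it receives the truncated message and returns header-name -> value pairs.
    """
    verdict = scan_prefix(raw[:limit])
    new_headers = b"".join(
        f"{name}: {value}\r\n".encode("ascii") for name, value in verdict.items()
    )
    # Prepend the new header fields; the original bytes are passed through
    # unchanged, so the scanner never has to hold or return the whole message.
    return new_headers + raw
```

The point of the design is that the caller, not the scanner, owns the full message: the scanner only ever sees a bounded prefix, while delivery still gets the complete message plus the added header fields.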
Re: Scanning large-body spam
On Wed, 31 Mar 2010, Mark Martinec wrote: and let it handle arbitrary size messages by avoiding its current paradigm of keeping the entire message in memory. Is there really a problem with the in-memory size? I would have thought the major concern was the processing time for evaluating 'full' (and rawbody?) rules on a large message. Anyway, the amavisd glue to SpamAssassin does just that: let SpamAssassin see only the first 400 kB (configurable) of a large message, then edit the original message based on results obtained from SpamAssassin. Good for amavis-d, but not for those of us relying on SA to do the whole job, and not have our MTAs perform any further message modification. I would be interested in having some of the developers offer an opinion on this. Where is the real 'cost' in running SA against a large message? Is it just the memory used? Or is it, as I suspect, the use of 'full' rules? - Charles
Re: Scanning large-body spam
On Wednesday March 31 2010 23:43:25 Charles Gregory wrote: Is there really a problem with the in-memory size? I would have thought the major concern was the processing time for evaluating 'full' (and rawbody?) rules on a large message. Yes, sure, the main issue is with evaluating regexp rules over a large message. Nevertheless, even now, keeping 50 child processes with a 100 MB memory footprint each is not to be underestimated. On top of that, holding several copies (raw, decoded, array of lines, ...) of a large message in Perl's data structures can be a big deal. And bear in mind that once a process running Perl extends its virtual memory, it cannot shrink back, so it stays huge forever after processing one large message. Mark
Re: Is report_safe broken?
On 3/31/2010 12:34 PM, Michael Weber wrote: Greetings! I upgraded SA from version 3.2.5 to 3.3.1 this morning. Since that time all of the emails that are marked as spam are being converted to attachments. One other oddity. If you look closely at the rewrite_header Subject line, you will count three %'s after the word SPAM. This is a change I made to test if SA was reading this config file at all. It is. My email subject lines went from %%SPAM%% to %%SPAM%%% just as expected, but the required score stayed at 5.0 and didn't change to 50.0. I re-checked the report_safe setting in the local.cf file and it is still set to zero as it was before. I also checked for another cf file changing that parameter but there are no other cf files that mention it. (Or .pre files for that matter.) The user_prefs file in the home directory has nothing that is not commented out and has not been changed from the default. I am calling spamc from a postfix filter line in master.cf, but that hasn't been changed since before the upgrade. As always, what am I missing? That should work fine. Did you run spamassassin --lint, to see if SA can parse the configuration files? (This should run with no output, but if there's a parse error, it will complain.) SA could be tripping on an illegal character and skipping several lines of your config file... Given it's taking the rewrite_header option, it's obvious you've got the right local.cf and restarted spamd, etc., so there's something else amiss. TIA! Here is my /etc/mail/spamassassin/local.cf file:
===
rewrite_header Subject %%SPAM%%% (_SCORE_)
add_header all Level _STARS(X)_
# required_score 5.0
required_score 50.0
report_safe 0
ok_locales en
# ok_languages en
# Use Bayesian classifier (default: 1)
# use_bayes 1
# Bayesian classifier auto-learning (default: 1)
# bayes_auto_learn 0
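When a setting such as required_score appears to be ignored, it can help to list only the lines of the file that are actually active. The Python sketch below does that for text in the style of the local.cf above. It is a rough approximation, not SpamAssassin's real parser: it only skips blank lines and whole-line `#` comments and treats the first word as the directive name, and `active_directives` is a name made up for this example. It is no substitute for running spamassassin --lint.

```python
from typing import List, Tuple


def active_directives(conf_text: str) -> List[Tuple[str, str]]:
    """Return (directive, value) pairs for lines that are not commented out."""
    result = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or whole-line comment
        parts = stripped.split(None, 1)  # split on the first run of whitespace
        name = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        result.append((name, value))
    return result


# Illustrative excerpt modelled on the file quoted above.
sample = """\
# required_score 5.0
required_score 50.0
report_safe 0
"""

for name, value in active_directives(sample):
    print(name, "=", value)
```

If a directive shows up here with the expected value but SpamAssassin still behaves differently, the file is probably not the cause, and the value is likely being set or overridden somewhere else.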
Re: Confused about how to use sa-update
But I don't understand how to use sa-update. I've run it and I can see all the new rule files in /var/lib/spamassassin/3.002005. However, I think my rules run off the files in /usr/share/spamassassin/. The wiki at http://wiki.apache.org/spamassassin/RuleUpdates#Using_sa-update says NOT to use the --updatedir parameter to put updates in /usr/share/spamassassin. So how exactly do you get the new rule files into /usr/share/spamassassin so they start working? Do you just copy them across manually, or is there a way of getting sa-update to do it automatically? OK - I've found an answer to my own question at https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6269#c41 - The rules in /usr/share/spamassassin are not consulted if a directory /var/lib/spamassassin/3.x exists. The sa-update only updates the latter. One other question I have which is more about this mailing list - is the list actually still active? It seems to have extremely low traffic for a product like spamassassin. And I've also found it can be quite difficult to get replies to questions. It makes me wonder whether I'm actually posting to the right place! Is this the official spamassassin mailing list?
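The behaviour described in that bug comment can be pictured with a small Python sketch. It is a deliberate simplification based only on the statement above (the stock rules are not consulted once the update directory exists); SpamAssassin's real lookup is more involved, and `rules_dir_in_use` is a hypothetical helper, not part of SpamAssassin or sa-update. The two paths are the ones mentioned in this thread.

```python
import os


def rules_dir_in_use(update_dir: str,
                     stock_dir: str = "/usr/share/spamassassin") -> str:
    """Return the directory whose rules would be used, per the description above.

    If the sa-update directory exists it wins; otherwise the stock rules
    shipped with the package are used.
    """
    return update_dir if os.path.isdir(update_dir) else stock_dir


# Path as created by sa-update on the poster's system (version-specific).
print(rules_dir_in_use("/var/lib/spamassassin/3.002005"))
```

This is why nothing needs to be copied into /usr/share/spamassassin by hand: once sa-update has populated its own directory, restarting spamd (or whatever calls SpamAssassin) is enough for the new rules to be picked up.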
Re: Confused about how to use sa-update
On 3/31/2010 9:10 PM, Phill Edwards wrote: But I don't understand how to use sa-update. I've run it and I can see all the new rule files in /var/lib/spamassassin/3.002005. However, I think my rules run off the files in /usr/share/spamassassin/. The wiki at http://wiki.apache.org/spamassassin/RuleUpdates#Using_sa-update says NOT to use the --updatedir parameter to put updates in /usr/share/spamassassin. So how exactly do you get the new rule files into /usr/share/spamassassin so they start working? Do you just copy them across manually, or is there a way of getting sa-update to do it automatically? OK - I've found an answer to my own question at https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6269#c41 - The rules in /usr/share/spamassassin are not consulted if a directory /var/lib/spamassassin/3.x exists. The sa-update only updates the latter. One other question I have which is more about this mailing list - is the list actually still active? It seems to have extremely low traffic for a product like spamassassin. And I've also found it can be quite difficult to get replies to questions. It makes me wonder whether I'm actually posting to the right place! Is this the official spamassassin mailing list? The list is definitely active. Now, is it 100 messages a minute? No.. but your original post did get two replies providing the answer, both slightly over 2 hours after your question.
Re: Confused about how to use sa-update
The list is definitely active. Now, is it 100 messages a minute? No.. but your original post did get two replies providing the answer, both slightly over 2 hours after your question. Yeah, I've subsequently found them on a Nabble list. For some reason I'm not getting any email from this list into my Gmail mailbox. Other mailing lists are coming through just fine, but not spamassassin. I wondered if they'd been spammed by Gmail in some sort of delicious twist of irony, but no. I have no idea why they're not showing up, but am glad to hear that the list is so active!