rulesemporium
Hello, does anybody know whether the rules on www.rulesemporium are available again? I would like to use them to update the SpamAssassin rules. Greetings, Ralf
Best practices for getting HAM for bayes training
Hi, I am running SA in a hosted environment where SA is the MX: it scans the mail and forwards it to the real mail server. We have a report-spam facility where users report spam that got through SA. I am not using Bayes yet but want to start. To train Bayes we have enough spam (via users' spam reports) but not much ham. How can I get enough ham for SA training in such an environment? What is a good ham/spam ratio when training SA? Are there any other best practices I can use in such an environment? raj
Re: once again problems with sa-learn
On Mon, 2 Feb 2009 15:41:21 -0500 Caleb Cushing wrote:
> On Monday 02 February 2009 15:23:34 wolfgang wrote:
> > My idea: try "...cur/*" instead of ".../cur" when calling sa-learn.
> > Thus it should learn each file in that directory IMHO.
>
> I've tried that too, the results are the same. also according to prior
> conversation on this list both should work.

Have you tried renaming ~/.spamassassin and letting sa-learn recreate it? If that doesn't work, I would cd to the parent directory and run sa-learn on cur/, just in case it has a problem with the square brackets in the path.
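One way to rule out shell quoting of the square brackets is to hand the path to sa-learn as a single argument without going through a shell. The sketch below is only an illustration in Python: the helper names `count_messages`, `sa_learn_command` and `learn_spam` are made up here, and the only sa-learn options used are the `--spam` and `--showdots` flags already quoted in this thread. It counts the files in cur/ first, so an empty or mistyped directory shows up before sa-learn runs.

```python
import subprocess
from pathlib import Path


def count_messages(cur_dir):
    """Return the number of regular files directly inside a maildir cur/ directory."""
    path = Path(cur_dir).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"not a directory: {path}")
    return sum(1 for entry in path.iterdir() if entry.is_file())


def sa_learn_command(cur_dir):
    """Build the argument list; the path stays one argument, brackets untouched."""
    return ["sa-learn", "--showdots", "--spam", str(Path(cur_dir).expanduser())]


def learn_spam(cur_dir):
    """Hypothetical wrapper: run sa-learn only if the directory holds messages."""
    total = count_messages(cur_dir)
    if total == 0:
        raise RuntimeError(f"no message files found in {cur_dir}")
    # No shell is involved, so "[Gmail]" needs no backslash escaping.
    return subprocess.run(sa_learn_command(cur_dir), check=True)
```

If `count_messages` already reports 0 or raises, the problem is the path itself rather than sa-learn or its database.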
Re: own address in AWL?
On Mon, February 2, 2009 19:04, Greg Troxel wrote:
> I have removed my address from the whitelist and will keep an eye on
> how it gets back in.

AWL is not really a whitelist but a score averager, and it does a fairly good job IMHO. If you are unhappy with it, change its configuration, not the data it creates in the database. See: perldoc Mail::SpamAssassin::Plugin::AWL
-- http://localhost/ 100% uptime and 100% mirrored :)
Re: once again problems with sa-learn
In an older episode (Monday, 2. February 2009), Caleb Cushing wrote:
> On Monday 02 February 2009 15:23:34 wolfgang wrote:
> > My idea: try "...cur/*" instead of ".../cur" when calling sa-learn.
> > Thus it should learn each file in that directory IMHO.
>
> I've tried that too, the results are the same. also according to
> prior conversation on this list both should work.

2 more ideas:
What happens if you copy one file from .../cur/ to /tmp and run sa-learn on it there?
Does it help to use the full path instead of .kde4.2/.../cur/ ?

Regards, wolfgang
Re: once again problems with sa-learn
On Monday 02 February 2009 15:23:34 wolfgang wrote:
> My idea: try "...cur/*" instead of ".../cur" when calling sa-learn. Thus
> it should learn each file in that directory IMHO.

I've tried that too, the results are the same. also according to prior conversation on this list both should work.
-- Caleb Cushing http://xenoterracide.blogspot.com
Re: once again problems with sa-learn
Hi,

In an older episode (Sunday, 1. February 2009), Caleb Cushing wrote:
> sa-learn -D --showdots --spam
> .kde4.2/share/apps/kmail/dimap/.1734756527.directory/.
> \[Gmail\].directory/Spam/
> cur/
> Learned tokens from 0 message(s) (0 message(s) examined)
> right now I've no idea why it's not examining any of 2k messages in
> that folder (maildir)

My idea: try "...cur/*" instead of ".../cur" when calling sa-learn. Thus it should learn each file in that directory IMHO.

Hope this helps, wolfgang
Re: own address in AWL?
I forgot to say: I am running spamass-milter via postfix. I wonder if the previous hop is getting lost during that process leading to ip=none. milter support in postfix is not quite 100% there.
own address in AWL?
I am running spamassassin 3.2.5. I found one of my own messages filed as spam. The message was not relayed - sent from gnus to postfix on the mail server. Here is the header and AWL info (with the hostname and my domain name query-replaced, but otherwise unmunged). I have adjusted NO_RELAYS to a much lower score, which is fortunate in this case.

Return-Path:
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on gdt-server.example.com
X-Spam-Level: **
X-Spam-Status: Yes, score=2.5 required=1.0 tests=AWL,BAYES_00,HASHCASH_20,
    NO_RELAYS autolearn=no version=3.2.5
X-Spam-Report:
    * -0.5 HASHCASH_20 Contains valid Hashcash token (20 bits)
    * -10 NO_RELAYS Informational: message was not relayed via SMTP
    * -2.6 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
    *      [score: 0.]
    * 16 AWL AWL: From: address is in the auto white-list
X-Original-To: g...@foo.example.org
Delivered-To: g...@gdt-server.example.com
Received: by gdt-server.example.com (Postfix, from userid 9545)
    id 00B5516F3C; Mon, 2 Feb 2009 12:48:06 -0500 (EST)
X-Hashcash: 1:20:090202:g...@foo.example.org::5aeXQ1z3aUrCT7YF:0\
    02cBF
From: Greg Troxel
To: Greg Troxel
Subject: tgest
Date: Mon, 02 Feb 2009 12:48:05 -0500
Message-ID:
User-Agent: Gnus/5.110011 (No Gnus v0.11) Emacs/22.3 (berkeley-unix)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

This seems to be hitting this AWL entry:

15.7 (204.6/13) -- g...@foo.example.org|ip=none

I really doubt that such extremely spammy messages have been generated on the machine with my username, especially since cron jobs that send reports etc. are not configured with my example.org domain, but would just pick up the server hostname. I looked at the logs and can't find evidence of that but will look harder.

So: Is there a way to exclude my own address from AWL processing, at least for ip=none?

AWL uses only the first 2 bytes of the IP address, and that mixes mail from my own machine on FiOS and botnet machines on FiOS into the same bucket.
I am concerned that this will misattribute botnet spam to my own mail, but this is currently theoretical. Is there any easy way to turn on a log of each AWL update so I can find out how these are getting added? I suspect it's not hard to munge the code, but haven't looked yet. Any clues as to how AWL processing could hit ip=none when the mail is really delivered from off the machine? Perhaps in misparsing cases it should be ip=unknown instead of ip=none. I have removed my address from the whitelist and will keep an eye on how it gets back in.
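The numbers in the report can be roughly reproduced from the AWL entry. The sketch below is a back-of-the-envelope check in Python, not SpamAssassin's own code. It assumes the default auto_whitelist_factor of 0.5, assumes the adjustment is (historical mean - current score) * factor, and assumes the 204.6/13 entry was read after this message's pre-AWL score of -13.1 (the sum of HASHCASH_20, NO_RELAYS and BAYES_00) had already been recorded. The function name `awl_adjustment` is invented for illustration.

```python
def awl_adjustment(total, count, score, factor=0.5):
    """Return the AWL score delta for a message with pre-AWL `score`.

    `total` and `count` describe the stored history *before* this message.
    """
    if count == 0:
        return 0.0  # no history yet, nothing to average toward
    mean = total / count
    return (mean - score) * factor


# Pre-AWL score from the report: HASHCASH_20 + NO_RELAYS + BAYES_00.
pre_awl = -0.5 + -10 + -2.6

# Assumption: "204.6/13" already includes this message, so the history
# seen at scan time was one message and -13.1 points smaller.
prior_total = 204.6 - pre_awl
prior_count = 13 - 1

delta = awl_adjustment(prior_total, prior_count, pre_awl)
final = pre_awl + delta

print(f"AWL delta: {delta:.1f}, final score: {final:.1f}")
```

Under those assumptions this prints an AWL delta of 15.6 and a final score of 2.5, which agrees with score=2.5 in X-Spam-Status. It also suggests the earlier twelve messages recorded under this address with ip=none averaged about 18 points each, which is what needs explaining.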
RE: please help, getting hammered with snowshoe spam
> Do people generally have good non-FP experience with BRBL? I am thinking of
> bumping up the score, but I get so much spam per day it is hard to check for
> FPs with it enabled. It seems like a great resource, will it be pushed out
> with "sa-update" soon? I believe it is enabled in svn, from what I've read.

On one of the systems we run we set it to 0.1 initially to see how it went. After three months of monitoring we upped it to 3.0 and have never had any problems. However, you have to take this in the context of the other settings and the mail throughput for this particular system: a tagging score of 4 and a drop score of 12 (yes, this is a bit high), on roughly 4000 emails per day (after zen.spamhaus.org DNSBL blocking).

Faris.
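To make the point about context concrete, here is a small Python sketch of how a single 3.0-point rule sits against a tagging score of 4 and a drop score of 12. The function name `disposition` and the extra 1.5 points from other rules are made up for illustration; only the 3.0, 4 and 12 values come from the message above.

```python
def disposition(total_score, tag_at=4.0, drop_at=12.0):
    """Classify a message by its total score against the two thresholds."""
    if total_score >= drop_at:
        return "drop"
    if total_score >= tag_at:
        return "tag"
    return "deliver"


BRBL_SCORE = 3.0  # the value used after three months of monitoring at 0.1

# A BRBL hit on its own stays below the tagging score.
print(disposition(BRBL_SCORE))
# Hypothetical: BRBL plus 1.5 points from other rules crosses it.
print(disposition(BRBL_SCORE + 1.5))
```

With these thresholds a BRBL false positive alone cannot tag a message, which is part of why 3.0 was safe on that system; a site with a lower tagging score would need to check that before copying the value.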
Re: please help, getting hammered with snowshoe spam
Yes, it has been a problem as there are so many domains used. However, I took everyone's earlier suggestions, including training Bayes against FN snowshoe spam and adding the Barracuda RBL (BRBL), and this appears to almost completely take care of the problem!! So far I have been able to remove all of my custom rules except for BRBL of course, and only a few of these snowshoe spams get through now. Nice!

Do people generally have good non-FP experience with BRBL? I am thinking of bumping up the score, but I get so much spam per day it is hard to check for FPs with it enabled. It seems like a great resource; will it be pushed out with "sa-update" soon? I believe it is enabled in svn, from what I've read.

Also, I am using policyd-weight to do front-end greylisting if the DNSBL checks trigger, as this reduces load on the server. Can anyone suggest how to enable the BRBL in policyd-weight? I'm not sure what values to use.

Again, thank you for your help with this problem! It is great to see SA working so well now against it :-)
-- View this message in context: http://www.nabble.com/please-help%2C-getting-hammered-with-snowshoe-spam-tp21627042p21792616.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.