rulesemporium

2009-02-02 Thread Ralf Heidenreich

hello,

Does anybody know if the rules on www.rulesemporium are available again?
I would like to update the SpamAssassin rules.

greetings ralf


Best practices for getting HAM for bayes training

2009-02-02 Thread Rajkumar S
Hi,

I am running SA in a hosted environment where SA is the MX: it scans
the mail and forwards it to the real mail server. We have a report-spam
facility where users report spam that got through SA. I am not using
Bayes yet but want to start. To train Bayes we have enough spam (via
users' spam reports) but not much ham.

My problem is how to get enough ham for SA training in such an
environment? What is a good ratio for ham/spam when training SA? Any
other best practices that I can use in such an environment?

raj


Re: once again problems with sa-learn

2009-02-02 Thread RW
On Mon, 2 Feb 2009 15:41:21 -0500
Caleb Cushing  wrote:

> On Monday 02 February 2009 15:23:34 wolfgang wrote:
> > My idea: try "...cur/*" instead of ".../cur" when calling sa-learn.
> > Thus it should learn each file in that directory IMHO.
> 
> I've tried that too, the results are the same. also according to prior
> conversation on this list both should work.

Have you tried renaming ~/.spamassassin and letting sa-learn recreate
it?

If that doesn't work, I'd cd to the parent directory and run sa-learn on
cur/, just in case it has a problem with the square brackets in the
path.
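
A rough sketch of both steps, in case it helps. SPAM_DIR and the two
function names are placeholders of mine, not anything from the thread;
point SPAM_DIR at the folder that contains cur/. The state directory is
moved aside rather than deleted so it can be restored afterwards.

```shell
# Placeholder: the maildir folder that holds cur/ (adjust before use).
SPAM_DIR="${SPAM_DIR:-$HOME/Maildir/.Spam}"

# Move the per-user SpamAssassin state aside (rather than deleting it) so
# that the next sa-learn run starts from a fresh directory.
reset_sa_state() {
    if [ -d "$HOME/.spamassassin" ]; then
        mv "$HOME/.spamassassin" "$HOME/.spamassassin.bak.$(date +%Y%m%d%H%M%S)"
    fi
}

# Learn from cur/ via a relative path, so the bracketed part of the full
# path never appears on the sa-learn command line.
learn_spam_relative() {
    ( cd "$SPAM_DIR" && sa-learn --showdots --spam cur/ )
}
```

Run reset_sa_state first, then learn_spam_relative, and compare the
"message(s) examined" count with the earlier run.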


Re: own address in AWL?

2009-02-02 Thread Benny Pedersen

On Mon, February 2, 2009 19:04, Greg Troxel wrote:
> I have removed my address from the whitelist and will keep an eye on
> how it gets back in.

AWL is not really a WHITELIST but a score averager, and it does a
fairly good job imho. If you are unhappy with it, change its conf, not
the data it creates in the db.

perldoc Mail::SpamAssassin::Plugin::AWL
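
To make "score averager" concrete: as far as I understand the plugin,
AWL pulls the current score towards the sender's historical mean by
(mean - score) * auto_whitelist_factor, and the factor defaults to 0.5.
The sketch below is only that arithmetic done in awk, not the plugin
code. The function name awl_delta is made up, and the sample numbers are
from the header in Greg's original post, under my assumption that the
204.6/13 entry already includes the message in question.

```shell
# awl_delta TOTAL COUNT SCORE [FACTOR]
#   TOTAL, COUNT: accumulated score and message count stored for the sender
#   SCORE:        score of the current message before AWL is applied
#   FACTOR:       auto_whitelist_factor (0.5 is the documented default)
awl_delta() {
    awk -v total="$1" -v count="$2" -v score="$3" -v factor="${4:-0.5}" 'BEGIN {
        if (count == 0) { printf "%.2f\n", 0; exit }   # no history, no adjustment
        mean = total / count
        printf "%.2f\n", (mean - score) * factor
    }'
}

# Pre-AWL score of the test mail: -0.5 - 10 - 2.6 = -13.1.
# If 204.6/13 already contains this mail, the history before it was
# 204.6 + 13.1 = 217.7 over 12 messages.
awl_delta 217.7 12 -13.1    # prints 15.62
```

-13.1 + 15.62 is about 2.5, which is the total shown in X-Spam-Status,
so a high stored mean is enough to explain the hit.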

-- 
http://localhost/ 100% uptime and 100% mirrored :)



Re: once again problems with sa-learn

2009-02-02 Thread wolfgang
In an older episode (Monday, 2. February 2009), Caleb Cushing wrote:
> On Monday 02 February 2009 15:23:34 wolfgang wrote:
> > My idea: try "...cur/*" instead of ".../cur" when calling sa-learn.
> > Thus it should learn each file in that directory IMHO.
>
> I've tried that too, the results are the same. also according to
> prior conversation on this list both should work.

2 more ideas:
What happens if you copy one file from .../cur/ to /tmp and run sa-learn 
on it there?

Does it help to use the full path instead of
.kde4.2/.../cur/ ?
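
Both checks as a rough sketch. SPAM_CUR and the function names are
placeholders of mine; set SPAM_CUR to the cur/ directory from the
failing command.

```shell
# Placeholder: the cur/ directory that sa-learn fails to examine.
SPAM_CUR="${SPAM_CUR:-$HOME/Maildir/.Spam/cur}"

# Idea 1: copy a single message to /tmp and learn it from there.
learn_one_copy() {
    first=$(find "$SPAM_CUR" -type f | head -n 1)
    [ -n "$first" ] || { echo "no message files under $SPAM_CUR" >&2; return 1; }
    copy=$(mktemp /tmp/sa-test.XXXXXX) || return 1
    cp "$first" "$copy" && sa-learn -D --spam "$copy"
}

# Idea 2: resolve the directory to an absolute path before calling sa-learn.
learn_full_path() {
    abs=$(cd "$SPAM_CUR" && pwd) || return 1
    sa-learn --showdots --spam "$abs"
}
```

If learn_one_copy examines the message but learn_full_path still reports
0 examined, the path is the likely culprit rather than the messages.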

Regards,

wolfgang


Re: once again problems with sa-learn

2009-02-02 Thread Caleb Cushing
On Monday 02 February 2009 15:23:34 wolfgang wrote:
> My idea: try "...cur/*" instead of ".../cur" when calling sa-learn. Thus
> it should learn each file in that directory IMHO.

I've tried that too, the results are the same. also according to prior
conversation on this list both should work.
-- 
Caleb Cushing

http://xenoterracide.blogspot.com


Re: once again problems with sa-learn

2009-02-02 Thread wolfgang
Hi,

In an older episode (Sunday, 1. February 2009), Caleb Cushing wrote:
> sa-learn -D --showdots --spam \
>   .kde4.2/share/apps/kmail/dimap/.1734756527.directory/.\[Gmail\].directory/Spam/cur/

> Learned tokens from 0 message(s) (0 message(s) examined)

> right now I've no idea why it's not examining any of 2k messages in
> that folder (maildir)

My idea: try "...cur/*" instead of ".../cur" when calling sa-learn. Thus 
it should learn each file in that directory IMHO.

Hope this helps,

wolfgang


Re: own address in AWL?

2009-02-02 Thread Greg Troxel

I forgot to say: I am running spamass-milter via postfix.  I wonder if
the previous hop is getting lost during that process leading to ip=none.
milter support in postfix is not quite 100% there.




own address in AWL?

2009-02-02 Thread Greg Troxel

I am running spamassassin 3.2.5.  I found one of my own messages filed
as spam.  The message was not relayed - sent from gnus to postfix on the
mail server.

Here is the header and AWL info (with the hostname and my domain name
query-replaced, but otherwise unmunged).  I have adjusted NO_RELAYS to a
much lower score, which is fortunate in this case.

Return-Path:
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
    gdt-server.example.com
X-Spam-Level: **
X-Spam-Status: Yes, score=2.5 required=1.0 tests=AWL,BAYES_00,HASHCASH_20,
    NO_RELAYS autolearn=no version=3.2.5
X-Spam-Report:
    * -0.5 HASHCASH_20 Contains valid Hashcash token (20 bits)
    *  -10 NO_RELAYS Informational: message was not relayed via SMTP
    * -2.6 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
    *  [score: 0.]
    *   16 AWL AWL: From: address is in the auto white-list
X-Original-To: g...@foo.example.org
Delivered-To: g...@gdt-server.example.com
Received: by gdt-server.example.com (Postfix, from userid 9545)
    id 00B5516F3C; Mon,  2 Feb 2009 12:48:06 -0500 (EST)
X-Hashcash: 1:20:090202:g...@foo.example.org::5aeXQ1z3aUrCT7YF:0\
    02cBF
From: Greg Troxel
To: Greg Troxel
Subject: tgest
Date: Mon, 02 Feb 2009 12:48:05 -0500
Message-ID:
User-Agent: Gnus/5.110011 (No Gnus v0.11) Emacs/22.3 (berkeley-unix)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

This seems to be hitting this AWL entry:

15.7  (204.6/13)  --  g...@foo.example.org|ip=none

I really doubt that such extremely spammy messages have been generated
on the machine with my username, especially since cron jobs that send
reports etc. are not configured with my example.org domain, but would
just pick up the server hostname.  I looked at the logs and can't find
evidence of that but will look harder.

So:

  Is there a way to exclude my own address from AWL processing, at least
  for ip=none?

  AWL uses only the first 2 bytes of the IP address, and that mixes mail
  from my own machine on FiOS and botnet machines on FiOS into the same
  bucket.  I am concerned that this will misattribute botnet spam to my
  own mail, but this is currently theoretical.

  Is there any easy way to turn on a log of each AWL update so I can
  find out how these are getting added?  I suspect it's not hard to
  munge the code, but haven't looked yet.

  Any clues as to how AWL processing could hit ip=none when the mail is
  really delivered from off the machine?  Perhaps in misparsing cases it
  should be ip=unknown instead of ip=none.


I have removed my address from the whitelist and will keep an eye on how
it gets back in.



RE: please help, getting hammered with snowshoe spam

2009-02-02 Thread Faris Raouf
> Do people generally have good non-FP experience with BRBL?  I am thinking of
> bumping up the score, but I get so much spam per day it is hard to check for
> FPs with it enabled.  It seems like a great resource, will it be pushed out
> with "sa-update" soon?  I believe it is enabled in svn, from what I've read.

On one of the systems we run, we set it to 0.1 initially to see how it went.
After three months of monitoring we upped it to 3.0 and have never had any
problems. However, you have to take this in the context of the other settings
and mail throughput for this particular system: a tagging score of 4 and a
drop score of 12 (yes, this is a bit high), on roughly 4000 emails per day
(after zen.spamhaus.org DNSBL blocking).
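
For anyone wanting to try the same cautious start, here is a sketch of
how such a rule could be written out. This is an assumption about a
typical local setup, not our actual config: the rule name RCVD_IN_BRBL
and the helper write_brbl_cf are made up for illustration, and
b.barracudacentral.org is the BRBL lookup zone as far as I know (it
requires registration with Barracuda).

```shell
# write_brbl_cf FILE [SCORE]
# Writes a local rule file that queries the BRBL and scores hits.
# RCVD_IN_BRBL is a locally chosen rule name; the default of 0.1 mirrors
# the cautious starting score described above.
write_brbl_cf() {
    cat > "$1" <<EOF
header   RCVD_IN_BRBL  eval:check_rbl('brbl', 'b.barracudacentral.org.')
describe RCVD_IN_BRBL  Relay listed in the Barracuda Reputation Block List
tflags   RCVD_IN_BRBL  net
score    RCVD_IN_BRBL  ${2:-0.1}
EOF
}
```

Something like "write_brbl_cf ./brbl.cf 0.1" produces the file; after
copying it into the site config directory, "spamassassin --lint" should
report any syntax problems. Raising the score later is just a matter of
regenerating the file with 3.0.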

Faris.



Re: please help, getting hammered with snowshoe spam

2009-02-02 Thread Dennis Hardy

Yes, it has been a problem as there are so many domains used.  However, I
took everyone's earlier suggestions, including training Bayes against FN
snowshoe spam and adding the Barracuda RBL (BRBL), and this appears to
almost completely take care of the problem!!  So far I have been able to
remove all of my custom rules except for BRBL of course, and only a few of
these snowshoe spams get through now.  Nice!

Do people generally have good non-FP experience with BRBL?  I am thinking of
bumping up the score, but I get so much spam per day it is hard to check for
FPs with it enabled.  It seems like a great resource, will it be pushed out
with "sa-update" soon?  I believe it is enabled in svn, from what I've read.

Also I am using policyd-weight to do front-end greylisting if the DNSBL
checks trigger as this reduces load on the server.  Can anyone suggest how
to enable the BRBL in policyd-weight?  I'm not sure what values to use.

Again thank you for your help with this problem!  It is great to see SA
working so well now against it :-)

