Re: RDJ and bogus virus warnings rule

2005-04-07 Thread .rp
I did not have a problem downloading it this week.


Net::DNS trouble

2005-04-07 Thread Craig Baird
I just attempted an upgrade from SA 2.64 to 3.0.2, and am now having problems 
with SURBLs and RBLs not working.  I upgraded all of the perl modules 
mentioned in INSTALL to the latest versions prior to installing SA 3.0.2, 
including Net::DNS, which is at version 0.49.  When I run:

spamassassin -D --lint

I get the following messages relating to Net::DNS:

debug: diag: module installed: Net::DNS, version (undef)
debug: is Net::DNS::Resolver available? no
debug: is DNS available? 0

I assume this means that SpamAssassin can't figure out what version of 
Net::DNS I'm running, and is therefore failing to use it.  I tried downgrading 
Net::DNS to version 0.48 with the same results.

I have four SA servers, all with Debian Woody, and have tried to upgrade two 
of them to 3.0.2.  This problem is happening on both of these machines.

Does anyone have any idea how I can fix this?

Thanks!

Craig


RE: Spam is marked but delivered anyway

2005-04-07 Thread Rakesh


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Friday, April 08, 2005 12:13 AM
To: users@spamassassin.apache.org
Subject: Re: Spam is marked but delivered anyway

> On Thursday 07 April 2005 09:38, [EMAIL PROTECTED] typed:
>
>
> SpamAssassin is only a tagging filter, not a delivery agent.  You need
> something else in the pipeline that checks the status lines after SA is
> finished and routes the mail appropriately.
>
> There is the chance that bayes_99 will trip on legit mail, but normally
> this only occurs if you haven't trained the bayesian database properly so
> that it has a good set of tokens representing ham and spam.
>
I see. So you're saying that the BAYES_99 mail that is being delivered is
due to the configuration of my MTA (Postfix), not SpamAssassin?

I checked my Postfix config files (main.cf, master.cf) and neither have
anything about it, so I would think that SpamAssassin is the one deciding
on which spam to drop and which spam to let through. If that isn't the
case, any idea what file I need to edit to block the BAYES_99 spam?




Are you using a content filter such as Amavis or MailScanner in your setup?
If not, and Postfix is handing mail directly to spamd, don't expect spam to
be stopped or quarantined. It will only be tagged, because SpamAssassin is a
tagging agent, not a filtering agent.

People usually put a content filter (e.g. Amavis) after their MTA, which
scans mail for viruses and spam by invoking SpamAssassin. Can you please
confirm whether you are using a content filter?




Re: Spam is marked but delivered anyway

2005-04-07 Thread Evan Platt
At 11:42 AM 4/7/2005, you wrote:
> so I would think that SpamAssassin is the one deciding
> on which spam to drop and which spam to let through.

SpamAssassin is only a filter. SpamAssassin cannot 'drop' mail or reject 
mail.

> If that isn't the case, any idea what file I need to edit to block the 
> BAYES_99 spam?

A procmail recipe perhaps? 
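To illustrate what "something else in the pipeline" has to do, here is a minimal Python sketch of the routing decision. It is not a procmail recipe and not part of SpamAssassin; the folder names "Spam" and "INBOX" and the function name are assumptions for illustration. It relies only on the X-Spam-Flag header that SpamAssassin adds to mail it tags as spam.

```python
from email import message_from_string


def route(raw_message):
    """Return the (hypothetical) folder a scanned message should go to."""
    msg = message_from_string(raw_message)
    # SpamAssassin tags spam with "X-Spam-Flag: YES"; it does not move mail.
    flag = (msg.get("X-Spam-Flag") or "").strip().upper()
    return "Spam" if flag == "YES" else "INBOX"


tagged = "X-Spam-Flag: YES\nSubject: cheap pills\n\nbody\n"
clean = "Subject: lunch?\n\nbody\n"
print(route(tagged))
print(route(clean))
```

A delivery agent (procmail, a Sieve script, or a content filter) would make the same kind of check and then file, quarantine, or discard the message.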



Re: Spam is marked but delivered anyway

2005-04-07 Thread mailings
> On Thursday 07 April 2005 09:38, [EMAIL PROTECTED] typed:
>
>> I recently took over admin duty for a mailserver. The system I'm taking
>>  over is a FreeBSD mailserver with SpamAssassin 3.0.2 running.
>> Unfortunately, a significant amount of spam gets through, and I think I
>>  know the reason.
>>
>> The spam that gets through is marked with [Suspected Spam]. Spamassasin
>>  says that it is possible spam and attaches the actual message as an
>> attachment. The number of points is always ridiculously high--
>> generally
>
> SpamAssassin is only a tagging filter, not a delivery agent.  You need
> something else in the pipeline that checks the status lines after SA is
> finished and routes the mail appropriately.
>
> There is the chance that bayes_99 will trip on legit mail, but normally
> this only occurs if you haven't trained the bayesian database properly so
> that it has a good set of tokens representing ham and spam.
>
I see. So you're saying that the BAYES_99 mail that is being delivered is
due to the configuration of my MTA (Postfix), not SpamAssassin?

I checked my Postfix config files (main.cf, master.cf) and neither have
anything about it, so I would think that SpamAssassin is the one deciding
on which spam to drop and which spam to let through. If that isn't the
case, any idea what file I need to edit to block the BAYES_99 spam?


Re: Tables obscuring words

2005-04-07 Thread Todd Ellison

Matt Kettler wrote:
> See yesterday's thread "Re: Extra Sare Rules for meds?"
> Jesse Houwing posted a beta-grade rule for this:
> BODY TABLEOBFU
> m{]+|"[^"]+)>(<([^>]+|"[^"]+)>)*[a-z]{1,2}(<([^>]+|"[^"]+)>)*]+|"[^"]+)>}i
 

Argh.  I hate when I do that.  Looks like I just stopped reading that 
thread too early.  This apparently doesn't hit on my sample yet, but 
with tweakage it might. Working on that now.

Todd


Re: Tables obscuring words

2005-04-07 Thread Matt Kettler
Todd Ellison wrote:

> Hi all.
>
> I have seen several spammers lately using tables to obscure offensive
> words in the message body.  Most of them get caught by other tests,
> but some get through.  Do you guys have any ideas for rules to check
> this? (Example table source below)


See yesterday's thread "Re: Extra Sare Rules for meds?"

Jesse Houwing posted a beta-grade rule for this:

BODY TABLEOBFU
m{]+|"[^"]+)>(<([^>]+|"[^"]+)>)*[a-z]{1,2}(<([^>]+|"[^"]+)>)*]+|"[^"]+)>}i
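The rule as archived here is missing part of its pattern, so it cannot be used as shown. As a rough illustration of the general idea only (this is not Jesse Houwing's rule, and the names and the threshold of three cells are assumptions), a Python sketch can count table cells that hold just one or two letters:

```python
import re

# A table cell whose whole content is one or two letters, e.g. <td>Ex</td>.
SHORT_CELL = re.compile(r"<td\b[^>]*>\s*[a-z]{1,2}\s*</td\s*>", re.IGNORECASE)


def short_cell_count(html):
    """Count cells that contain only a one- or two-letter fragment."""
    return len(SHORT_CELL.findall(html))


def looks_obfuscated(html, min_cells=3):
    """Heuristic: several tiny cells suggest a word split across a table."""
    return short_cell_count(html) >= min_cells


sample = "<table><tr><td>Ex</td><td>am</td><td>pl</td><td>e</td></tr></table>"
normal = "<table><tr><td>Name</td><td>Price</td></tr></table>"
print(looks_obfuscated(sample), looks_obfuscated(normal))
```

A real rule would need testing against a ham corpus, since legitimate tables can also contain short cells.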



Tables obscuring words

2005-04-07 Thread Todd Ellison
Hi all.
I have seen several spammers lately using tables to obscure offensive 
words in the message body.  Most of them get caught by other tests, but 
some get through.  Do you guys have any ideas for rules to check this? 
(Example table source below)

Thanks
Todd
---Begin example

 
 
   Exp
    
    and En
    
    your
Pen
    
    with   
    
   largement
p
   ill 
    
 
   and
   large
   is
   our en
    


RE: Re[2]: it's getting worse again

2005-04-07 Thread Chris Santerre
>
>I hold down a 50-60 hour work week, family, volunteer time for NPOs,
>plus personal interests, and still find time to fight spam via SARE,
>because it's that important to me. Personal preference.
>
>If you don't want to spend the time required to tweak SA to a high
>enough performance (you probably don't need the 99.9% accuracy I
>want), then you can buy someone else's package and let them worry about
>the tweaking.
>
>There's a balance point -- some time invested vs some gain received.
>The question becomes whether the gain received does balance the time
>invested, and only you can answer that question.
>
>Bob Menschel
>

Yeah what I think Bob is trying to say is, if you don't have the time to
update SA with SARE rulesets, then just buy a commercial package. They seem
to use the SARE rulesets for you. 

Oh... did I say that out loud? :) 

I know that's not totally true... some actually change the rule names!
Whe.

I'm not bitter on it, I just wish they would add the disclaimer:

"Ar, there be ninja codes inside!"

--Chris (Hallo, Salute, its me, your duke.)



RE: Rule-sets

2005-04-07 Thread David Brodbeck
On Thu, 7 Apr 2005 12:27:58 +0100, Gray, Richard wrote
> You probably also want to learn more about regular expressions too.
> There
> Was a lot of stuff that I didn't know before I started doing this.
> 
> In particular, useful things like back chaining and forward referencing
> are useful to understand.
> 
> I wish I could tell you I had found a good site or book about it,
>  but I haven't. If you do find one, would you let me know?

The best I've found so far is the section on regular expressions in
_Programming Perl_.  But that's an awfully thick book to buy just for that. ;)
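Assuming "back chaining and forward referencing" refers to backreferences and lookahead assertions (an interpretation, not something the poster confirmed), here is a short sketch of both. It is written with Python's re module; the same two constructs use the same syntax in Perl.

```python
import re

# Backreference: \1 must repeat whatever group 1 matched ("the the").
doubled = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

# Lookahead: match "free" only when " offer" follows, without consuming it.
free_offer = re.compile(r"free(?= offer)", re.IGNORECASE)

print(doubled.search("click the the link").group(1))
print(free_offer.search("a FREE offer today").group(0))
print(free_offer.search("feel free to reply"))
```

The lookahead leaves " offer" unmatched, so a later part of the pattern could still examine it.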



Re: Re[2]: it's getting worse again

2005-04-07 Thread Florin Andrei
On Thu, 2005-04-07 at 11:24 -0400, Kevin Sullivan wrote:

> If you think about this, it isn't surprising.  At the time the mass-checks 
> ran for 3.0.2, the 3.0.2 rules caught almost all of the spam of that time. 
> (If they didn't, then people wrote rules until they did.)  So dynamic 
> systems like Bayes and SURBL didn't add much, and thus scored low.  This 
> will be the case during every release.
> But now, between releases, spammers write spam which evades the "standard" 
> rules.  Sure, there will be new "standard" rules for the next SA release, 
> but until then dynamic systems like Bayes and SURBL are all that are 
> catching some spam.

This looks like the smoking gun.

Wouldn't it make sense, then, to pre-emptively bump up the scores for the
dynamic systems right from the beginning? They will be slightly overrated
at first, become perfectly rated after a while, then start getting
underrated as more time passes.
That will certainly be better than the current situation, when the
dynamic rules are perfectly rated only at first, but immediately they
start to "decay" (well, not really, but i'm trying to find a concise
metaphor).

Or how about this:
In the big SA config file, add a parameter that controls the overall
weight of the static versus the dynamic things. Release the next SA
version with that parameter set so that it gives more importance to the
static systems. Tell users (in a prominent, obvious, even intrusive
fashion!) to adjust that parameter every now and then, to give more
importance to the dynamic systems as time passes.
Or even, heck, make SA track time and automatically increase the
importance of the dynamic systems as time passes. That will make it, of
course, one of those scary self-adaptive systems that pull the carpet
from under sysadmin's feet :-) but if it stops the spam, then who cares.
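The time-based idea can be sketched as follows. This is purely hypothetical: nothing in this thread says SpamAssassin has such a setting, and the function names, the growth rate, and the cap are invented for illustration.

```python
def dynamic_weight(days_since_release, start=1.0, per_day=0.005, cap=2.0):
    """Hypothetical multiplier for dynamic rules (Bayes, SURBL).

    Starts at `start` on release day and grows linearly up to `cap`.
    """
    return min(cap, start + per_day * days_since_release)


def adjusted_score(base_score, days_since_release):
    """Scale a dynamic rule's base score by the time-based multiplier."""
    return round(base_score * dynamic_weight(days_since_release), 3)


for days in (0, 100, 1000):
    print(days, dynamic_weight(days), adjusted_score(3.0, days))
```

The cap matters: without it, an old installation would eventually let a single dynamic rule decide every message by itself.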

I believe you're right, that's what's causing me problems - the spammers
started to learn the static rules and are evading them. Well, if that's
true, then SA must provide a mechanism to control that. An overall
static-vs-dynamic "balance button" might be a good idea. Or not. 

I will try to bump up the Bayes rules and see where that goes.

Thanks everyone.

-- 
Florin Andrei

http://florin.myip.org/



Re: rules du jour and windows

2005-04-07 Thread Chris Thielen
Hi Ben,
Ben Wylie wrote:
> I run spamassassin on windows. 
> I like the SARE rules and would love to be able to automatically keep them
> up to date. Is there a windows alternative for "rules du jour"? I do have
> cygwin installed. Is it easy to set it up with that? I guess I would prefer
> to do it in windows, but if cygwin is the only way I could do it, I would.
> Are there instructions on how to set it up with cygwin?
 

Theoretically there should be few problems getting RDJ running on 
cygwin, although I don't actually know of anyone who's done so yet (I'm 
the author of RDJ).

If that's something you want to try, I'll walk through it with you and 
we can come up with a HOWTO.  Feel free to contact me offlist.

Chris Thielen



RE: rules du jour and windows

2005-04-07 Thread Bret Miller
> I run spamassassin on windows.
> I like the SARE rules and would love to be able to
> automatically keep them
> up to date. Is there a windows alternative for "rules du
> jour"? I do have
> cygwin installed. Is it easy to set it up with that? I guess
> I would prefer
> to do it in windows, but if cygwin is the only way I could do
> it, I would.
> Are there instructions on how to set it up with cygwin?

I have a setup that another person shared with me a while back. I
haven't actually used it, but can't see why it wouldn't work. It
essentially uses wget to download a list of files. It was never
automated to the point that it would actually update the rules, but I
don't see why that would be hard to do. I think I'd want to do some more
coding on it before using it to check the file's timestamp and only
update newer files. Maybe I'll do that someday...

For now, it'd be fairly straightforward to create a batch file to wget a
list of files daily and then restart your spam scanner if necessary.
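The "only update newer files" check mentioned above could look something like this in Python. It is a sketch under assumptions: the function name is made up, and it assumes the server's Last-Modified value is a standard HTTP date ending in GMT.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def needs_update(local_mtime, last_modified):
    """True when the server copy is newer than the local file.

    local_mtime:   local file's modification time (seconds since epoch)
    last_modified: value of the HTTP Last-Modified header (GMT assumed)
    """
    remote_time = parsedate_to_datetime(last_modified)
    local_time = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return remote_time > local_time


local_copy = datetime(2005, 4, 6, tzinfo=timezone.utc).timestamp()
print(needs_update(local_copy, "Thu, 07 Apr 2005 12:00:00 GMT"))
```

wget can also do this comparison itself with its -N (timestamping) option, which may be enough for a simple batch file.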

Bret






rules du jour and windows

2005-04-07 Thread Ben Wylie
I run spamassassin on windows. 
I like the SARE rules and would love to be able to automatically keep them
up to date. Is there a windows alternative for "rules du jour"? I do have
cygwin installed. Is it easy to set it up with that? I guess I would prefer
to do it in windows, but if cygwin is the only way I could do it, I would.
Are there instructions on how to set it up with cygwin?

Sorry for my ignorance,

Ben




Re: it's getting worse again

2005-04-07 Thread Kris Deugau
Robert Menschel has already addressed most of your points pretty well.
MEE too!!oneone!1!!

Florin Andrei wrote:
> 
> 
> I've a fairly demanding job, i've a few pretty convoluted personal
> projects i'm involved in, i've a family and other details that
> typically show up if one is not an archetypal pale-faced
> geek-in-the-basement.  I do try to take care of my personal webserver
> (to which i'm the sole admin), mailserver (SA, Postfix, Cyrus,
> Squirrelmail), VoIP PBX, etc., despite the schedule overload.

Understood.  I happen to be in the position of doing this as a part of
my day job;  I administer a number of local servers for what used to be
a local ISP (bought out two years ago).

As I noted, however, I don't currently spend a whole lot of time
specifically tuning SA, because I've got a well-tuned setup (on the
ISP-account filter server and domain hosting server, according to
customers;  and on my own personal system) that needs all of about five
minutes attention to SA per *week*.  I've left all three systems running
SA 2.64, patched for SURBL lookups, because of this- I have no real need
to upgrade.

That said, all of those systems have been in more or less continuous
operation for several years now, and have had the benefit of quite a bit
of my time doing the tuning since I installed SA.  I'm also seeing far
less customer feedback;  whether that's due to lack of FPs and FNs on
most accounts (possible) or just nobody noticing (somewhat more likely,
sadly) I can't say.

Any good filter *will* take some time to get well-tuned for YOUR
particular mail flow.  :/

> And these days i was looking at SA and i'm, like, "it's not gonna
> happen, i don't have time for this." I chose to play the dumb user on
> purpose, just because i can't fix everything myself.

I know that feeling.  

> I do apologize for not reporting the actual nature of the problem.

Ranting is allowed.  But if you really expect help, a brief summary
of what you think is wrong and what you've tried to fix the problem lets
others provide advice that may allow you to spend five minutes making a
VERY noticeable improvement in your setup.

Tweaking the Bayes and SURBL (aka URIRBL) scores will probably give you
the most visible, immediate improvement in your spam detection rates
with SA3.x without having to write or test rules or rulesets.
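Why raising a couple of scores helps can be seen from how the verdict is reached: the scores of all rules that hit are added up and compared with the threshold. The sketch below uses the threshold of 5 mentioned elsewhere in this thread; the rule name URIBL_EXAMPLE and every score value are invented for illustration and are not SpamAssassin's shipped scores.

```python
REQUIRED_SCORE = 5.0  # spam threshold used by several posters in this thread


def verdict(hits, scores, required=REQUIRED_SCORE):
    """Sum the scores of the rules that hit and compare to the threshold."""
    total = sum(scores[name] for name in hits)
    return total, total >= required


# Illustrative numbers only.
stock = {"BAYES_99": 1.9, "URIBL_EXAMPLE": 1.5, "HTML_MESSAGE": 0.1}
tuned = dict(stock, BAYES_99=4.0, URIBL_EXAMPLE=3.0)
hits = ["BAYES_99", "URIBL_EXAMPLE", "HTML_MESSAGE"]

print(verdict(hits, stock))
print(verdict(hits, tuned))
```

With the lower scores the same message stays under the threshold even though both dynamic tests fired; with the raised scores it is tagged.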

-kgd
-- 
Get your mouse off of there!  You don't know where that email has been!


RE: Rule-sets

2005-04-07 Thread Chris Santerre


>-----Original Message-----
>From: Bowie Bailey [mailto:[EMAIL PROTECTED]
>Sent: Thursday, April 07, 2005 10:44 AM
>Cc: users@spamassassin.apache.org
>Subject: RE: Rule-sets
>
>
>From: Ron McKeating [mailto:[EMAIL PROTECTED]
>> 
>> On Thu, 2005-04-07 at 12:27 +0100, Gray, Richard wrote:
>> > > > Thanks to all the replied, we have rules_du_jour and I am now
>> > > > getting an idea of how it works. I suppose the obvious 
>question is
>> > > > has anybody written a good howto on writing your own 
>rules. And if
>> > > > so where is it?
>> > > 
>> > 
>> > You probably also want to learn more about regular expressions too.
>> > There Was a lot of stuff that I didn't know before I 
>started doing this.
>
>> > 
>> > In particular, useful things like back chaining and 
>forward referencing
>> > are useful to understand.
>> > 
>> > I wish I could tell you I had found a good site or book 
>about it, but I
>> > haven't.  If you do find one, would you let me know?
>> > 
>> 
>> I use mastering regular expressions by O'Reilly
>
>Seconded.  Excellent book.
>
>Mastering Regular Expressions by Jeffrey E.F. Friedl
>Published by O'Reilly
>
>I found it when I was learning Perl.  It has lots of generic 
>RE stuff as
>well as a whole chapter (100 pages) on Perl specifics.

Most of my Perl knowledge comes from Theo's random signatures! ;) 

--Chris (Larry who?)


RE: Rule-sets

2005-04-07 Thread Bowie Bailey
From: Ron McKeating [mailto:[EMAIL PROTECTED]
> 
> On Thu, 2005-04-07 at 12:27 +0100, Gray, Richard wrote:
> > > > Thanks to all the replied, we have rules_du_jour and I am now
> > > > getting an idea of how it works. I suppose the obvious question is
> > > > has anybody written a good howto on writing your own rules. And if
> > > > so where is it?
> > > 
> > 
> > You probably also want to learn more about regular expressions too.
> > There Was a lot of stuff that I didn't know before I started doing this.

> > 
> > In particular, useful things like back chaining and forward referencing
> > are useful to understand.
> > 
> > I wish I could tell you I had found a good site or book about it, but I
> > haven't.  If you do find one, would you let me know?
> > 
> 
> I use mastering regular expressions by O'Reilly

Seconded.  Excellent book.

Mastering Regular Expressions by Jeffrey E.F. Friedl
Published by O'Reilly

I found it when I was learning Perl.  It has lots of generic RE stuff as
well as a whole chapter (100 pages) on Perl specifics.

Bowie


Re: Rule-sets

2005-04-07 Thread Matt Kettler
At 05:53 AM 4/7/2005, Matthew Newton wrote:
> The "main" site for rules is generally http://www.rulesemporium.com/,
> and specifically the http://www.rulesemporium.com/rules.htm page. They
> have descriptions for what they do. You'll also find more on
> http://www.exit0.us/, including the RulesDuJour script at
> http://www.exit0.us/index.php?pagename=RulesDuJour that will
> automatically check for new rules for you.

Don't forget the list of rulesets in the official Wiki:
http://wiki.apache.org/spamassassin/CustomRulesets

Really, I would regard the wiki page as the best starting page, as it 
references all the others, including exit0 and rulesemporium.

Rulesemporium is the outlet for all the rules built up by SARE, who make by 
far the most and best add-on rules, but they aren't the only ones.




Re: Retain original headers

2005-04-07 Thread Kevin Peuhkurinen
Harry Putnam wrote:
> [Possible duplicate Alert... Posted on gmane a few days ago but did
>
> I'd like to get the old behavior temporarily, where SA just inserted
> headers into the mail instead of encapsulating etc.  Just to see what
> is really happening for sure.
>
> Scanning thru perldoc Mail::SpamAssassin::Conf, nothing jumped out at
> me in the way of a flag or long var or whatever to get the old
> behavior. 
 

Adding "report_safe 0" to your /etc/mail/spamassassin/local.cf file will stop SA from 
encapsulating the original email.  If you want to keep the message encapsulated but have the 
original Received headers copied into the wrapper, leave report_safe enabled and add 
"report_safe_copy_headers Received".
Kevin
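For checking what is really inside an encapsulated message, here is a small Python sketch that pulls the Received headers out of the attached original. It assumes "report_safe 1", where the original is attached as a message/rfc822 part; the function names and all header values in the demo are made up.

```python
from email import message_from_string
from email.message import EmailMessage


def original_received(raw_wrapper):
    """Return the Received headers of the first attached message/rfc822 part."""
    wrapper = message_from_string(raw_wrapper)
    for part in wrapper.walk():
        if part.get_content_type() == "message/rfc822":
            original = part.get_payload(0)  # the attached original message
            return original.get_all("Received", [])
    return []


def build_demo_wrapper():
    """Build a stand-in for a report_safe wrapper; all values are invented."""
    original = EmailMessage()
    original["Received"] = "from relay.example by isp.example"
    original["Received"] = "from sender.example by relay.example"
    original["Subject"] = "hello"
    original.set_content("original body\n")

    wrapper = EmailMessage()
    wrapper["Subject"] = "*SPAM* hello"
    wrapper.set_content("Spam report placeholder\n")
    wrapper.add_attachment(original)
    return wrapper.as_string()


for line in original_received(build_demo_wrapper()):
    print("Received:", line)
```

If this prints the full chain for a real message saved from the mailbox, the headers were kept and it is only the mail client that hides them.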



RE: it's getting worse again

2005-04-07 Thread Bowie Bailey
From: Florin Andrei [mailto:[EMAIL PROTECTED]
> 
> On Wed, 2005-04-06 at 15:53 -0400, Kris Deugau wrote:
> 
> > I'm a little puzzled what you're asking for, then;  addon rulesets are
> > available from SARE, and somewhere there's a tool to automatically check
> > for updates on those rules.
> 
> My impression, after a quick perusal, was that any mentions about SARE and
> the like are pretty well "hidden" on the SA main website.  Yes, there is a
> mention, but there's a big fat "Use at your own risk" warning at the top
> of the page. What would a new user think?

The default rules in SA have been carefully tweaked to optimize the scoring.
The addition of extra rules can throw off that balance and cause false
positives or negatives.  So new users should be careful rather than just
dumping all the SARE rules into their setup.  The extra rules can also
increase the memory and processing time required by SA, which may be a
problem with older systems.

Personally, I try to be fairly conservative.  My servers tag spam, but do
not modify or drop messages on most accounts.  I have found that with the
following rulesets, I can leave the spam threshold at 5 and get pretty good
results.  For my personal email, I have dropped the threshold down to 4
without seeing any extra false positives.  I have not tweaked the Bayes_99
score, but I am considering it since my Bayes database seems to be reliable.

These are the rules that I use:
  70_sare_evilnum0.cf
  70_sare_genlsubj0.cf
  70_sare_header0.cf
  70_sare_html0.cf
  70_sare_specific.cf
  70_sare_uri0.cf
  chickenpox.cf
  weeds.cf

Bowie


RE: Rule-sets

2005-04-07 Thread Ron McKeating
On Thu, 2005-04-07 at 12:27 +0100, Gray, Richard wrote:
> > > Thanks to all the replied, we have rules_du_jour and I am 
> > now getting 
> > > an idea of how it works. I suppose the obvious question is 
> > has anybody 
> > > written a good howto on writing your own rules. And if so 
> > where is it?
> > 
> 
> You probably also want to learn more about regular expressions too.
> There
> Was a lot of stuff that I didn't know before I started doing this. 
> 
> In particular, useful things like back chaining and forward referencing
> are useful to understand.
> 
> I wish I could tell you I had found a good site or book about it, but I
> haven't.
> If you do find one, would you let me know?
> 

I use mastering regular expressions by O'Reilly

Ron
> HTH
> 
> R
> 
> 
> ---
> This email from dns has been validated by dnsMSS Managed Email Security and 
> is free from all known viruses.
> 
> For further information contact [EMAIL PROTECTED]
> 
> 
> 
> 
-- 
Ron McKeating
Senior IT Services Specialist
Computing Services
Loughborough University
01509 222329



Got it installed, now what?

2005-04-07 Thread Josh Peters
Pardon my ignorance, and maybe this question doesn't belong here, but 
I've looked at all the documentation and it seems there's an endless number of 
ways to set up SpamAssassin.  So I'll start from the beginning.  I have a VPS 
FreeBSD server running Apache, with Sendmail as my mail service. I also have 
MySQL, if that's pertinent to the setup. I have about 80 POP3 mailboxes that are 
running fine. We've been bombarded with spam the past few months, so I decided 
to start looking for some type of spam filtering software and came to 
SpamAssassin.  Yesterday I successfully upgraded my version of Perl in order 
to install SA, and got a clean make and install of SA.

So now what?  I've been trying to figure out how to test this on a single 
mailbox, so I created a directory called /usr/home//.spamassassin . In 
that I created a file called user_prefs and inserted this code

 
# How many hits before a message is considered spam.
required_hits   5.0

# Whether to change the subject of suspected spam
rewrite_subject 1

# Text to prepend to subject if rewrite_subject is used
subject_tag     *SPAM*

# Encapsulate spam in an attachment
report_safe     1

# Use terse version of the spam report
use_terse_report 0

# Enable the Bayes system
use_bayes       1

# Enable Bayes auto-learning
auto_learn      1

# Enable or disable network checks
skip_rbl_checks 0
use_razor2      1
use_dcc         1
use_pyzor       1

# Mail using languages used in these country codes will not be marked
# as being possibly spam in a foreign language.
ok_languages    all

# Mail using locales used in these country codes will not be marked
# as being possibly spam in a foreign language.
ok_locales      all

###  

How do I know everything is running and how do I test this thing?  Again, 
sorry if this isn't the place for a question like this but I'm having issues 
figuring this out.  It's on my server, just need to configure it now.  
Thanks  
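One common way to check that a SpamAssassin install is scanning is to feed it a message containing the GTUBE test string, which is meant to be scored as spam. The sketch below only builds such a message; the addresses and the function name are placeholders.

```python
from email.message import EmailMessage

# GTUBE: the "Generic Test for Unsolicited Bulk Email" string.
GTUBE = "XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X"


def build_test_message(sender, recipient):
    """Return a plain-text message whose body contains the GTUBE line."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "SpamAssassin GTUBE test"
    msg.set_content("This is a filter test.\n\n" + GTUBE + "\n")
    return msg.as_string()


text = build_test_message("tester@example.com", "mailbox@example.com")
print(text)
```

Saving the output to a file and running "spamassassin -t < file" should show a report with a very high score; mailing the same text to the test mailbox shows whether the mail path actually calls SpamAssassin.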



Retain original headers

2005-04-07 Thread Harry Putnam

[Possible duplicate Alert... Posted on gmane a few days ago but did
not appear on my server... now posted direct to list]

Running SA 3.0.2

I may be just misunderstanding something here, if so I hope someone
will help me straighten out my flawed view of how this works.

I pull down mail from an isp server over POP3, to my linux box running
fetchmail and sendmail.

In messages flagged as spam that have the encapsulated original
message, I don't see the original headers.  Only a few headers like 
From: Subject: Cc: Msgid:  (maybe a few more).

So where are the original `Received:' or the many others that would
normally be there?

First I thought they were retained in the message headers that do not
get encapsulated, but on many of the spam flagged messages I see only
one `Received:' and that is my localhost receiving from my local
sendmail.

These messages would have had to have more `Received:' headers from
when the ISP pulled them in, wouldn't they?

As it arrives now, some spam mail appears to have been injected
directly into my local sendmail, but can not have been since it is
invisible from the internet.

I'd like to get the old behavior temporarily, where SA just inserted
headers into the mail instead of encapsulating etc.  Just to see what
is really happening for sure.

Scanning thru perldoc Mail::SpamAssassin::Conf, nothing jumped out at
me in the way of a flag or long var or whatever to get the old
behavior. 


Re: WHich is better

2005-04-07 Thread Peter Marshall
Thank you for the detailed response.
SpamAssassin will automatically autolearn ham as ham?  I thought you had 
to learn ham messages manually (and do an equal number as you do 
with spam).  I also read that you have to learn about 200 of both spam 
and non-spam messages for it to work ...

Also, is it better to run a cron job for each user, or have it done 
system-wide by root?
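On the autolearn question, the decision can be pictured roughly as below. The two thresholds correspond to the bayes_auto_learn_threshold_nonspam and bayes_auto_learn_threshold_spam settings, with the documented defaults of 0.1 and 12.0 (check the documentation for your version). The function is a simplification: the real autolearner applies further conditions, and the exact handling of scores equal to a threshold is an assumption here.

```python
def autolearn_decision(score, nonspam_threshold=0.1, spam_threshold=12.0):
    """Simplified view of Bayes auto-learning.

    Very low scores are learned as ham, very high scores as spam,
    and everything in between is not learned automatically.
    """
    if score < nonspam_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    return "none"


for score in (-1.2, 3.4, 15.0):
    print(score, autolearn_decision(score))
```

So ham is learned automatically too, but only mail that scores very low; the mail in the middle band is what manual sa-learn runs are for.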

Thank you,
Peter
Gray, Richard wrote:
 


-----Original Message-----
From: Peter Marshall [mailto:[EMAIL PROTECTED] 
Sent: 07 April 2005 13:30
To: SpamAssassin list
Subject: WHich is better

I am looking for opinions.
Problems I have with both:
1.  What is the best method of obtaining the spam / ham.  I 
have the server create a spam folder for each user when the 
user is created. 
spamassassin will automatically put all mail marked as spam 
in this folder.  Obviously I will use this folder to run 
salearn on for spam.  I will also instruct users to move mail 
that is spam that was not marked as spam to this folder.  My 
problem is, where do I run salearn for ham. 

Not to mention potentially learning on all the false positives
that SA may or may not produce, or the fact that with Bayes turned
on, most of the messages in the folder will already have been seen
by the system.
I guess the ideal solution would be to have a False Positive folder
that people can drag messages into when the filter has gone wrong (tell
them that this is where you look, and if it's not there, you can't
deal with it). Most users will realise that the fastest way to stop
receiving the spam is to put it in there.
You probably also want to tag the messages that are put into the
Spam folder, then maybe once a day run through each user's mailbox
and find the messages in their other folders that have a spam tag, as
it's these messages that SA incorrectly tagged (not the ones in the 
Spam folder).

SA automatically learns messages that score above and below a
defined threshold, so you don't need to run these through again. 
What you actually want to force through Bayes are the FPs and FNs
that occur, and these are best identified by eyeball (you have no
idea how many users will put legitimate things in as spam!).

In short, automated is nice, but learn the right things, and 
ideally look at them to be sure.

HTH
R
---
This email from dns has been validated by dnsMSS Managed Email Security and is 
free from all known viruses.
For further information contact [EMAIL PROTECTED]



--
Peter Marshall, BCS
System Administrator, CARIS
CARIS 2005 - Mapping a Seamless Society
10th International User Group Conference and Educational Sessions
Halifax, NS, Canada
E-mail [EMAIL PROTECTED] for more.


Re: Create mbox directory structure

2005-04-07 Thread Duncan Hill
On Thursday 07 April 2005 14:22, [EMAIL PROTECTED] typed:
> Hello All
>
> Does someone know a linux command to create a mailbox (mbox) without X11 or
> KDE.

An mbox mailbox is merely a named file in a location: mkdir, touch.  So long as 
the directory is there, most MTAs and MDAs can deal with the rest.
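The mkdir-and-touch approach can also be scripted. Here is a minimal Python sketch using the standard mailbox module; the "learn/spam" and "learn/ham" paths are placeholders, and the demo writes into a temporary directory rather than a real mail spool.

```python
import mailbox
import os
import tempfile


def create_mbox(path):
    """Create an empty mbox file (and its parent directory) if missing."""
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    box = mailbox.mbox(path, create=True)  # creates the file when absent
    box.close()
    return path


# Demo in a throwaway directory; real paths depend on your mail setup.
base = tempfile.mkdtemp()
spam_box = create_mbox(os.path.join(base, "learn", "spam"))
ham_box = create_mbox(os.path.join(base, "learn", "ham"))
print(os.path.getsize(spam_box), os.path.getsize(ham_box))
```

Ownership and permissions still have to match the account that will receive the mail, which this sketch does not handle.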


Create mbox directory structure

2005-04-07 Thread bruno . delladucata




Hello All

Does anyone know a Linux command to create a mailbox (mbox) without X11 or
KDE?

I have to create two mailboxes for spam and ham learning.
Each mbox has its own user account.

Thanks





RE: WHich is better

2005-04-07 Thread Gray, Richard
 

> -----Original Message-----
> From: Peter Marshall [mailto:[EMAIL PROTECTED] 
> Sent: 07 April 2005 13:30
> To: SpamAssassin list
> Subject: WHich is better
> 
> I am looking for opinions.
> 
> Problems I have with both:
> 1.  What is the best method of obtaining the spam / ham.  I 
> have the server create a spam folder for each user when the 
> user is created. 
> spamassassin will automatically put all mail marked as spam 
> in this folder.  Obviously I will use this folder to run 
> salearn on for spam.  I will also instruct users to move mail 
> that is spam that was not marked as spam to this folder.  My 
> problem is, where do I run salearn for ham. 

Not to mention potentially learning on all the false positives
that SA may or may not produce, or the fact that with Bayes turned
on, most of the messages in the folder will already have been seen
by the system.

I guess the ideal solution would be to have a False Positive folder
that people can drag messages into when the filter has gone wrong (tell
them that this is where you look, and if it's not there, you can't
deal with it). Most users will realise that the fastest way to stop
receiving the spam is to put it in there.

You probably also want to tag the messages that are put into the
Spam folder, then maybe once a day run through each user's mailbox
and find the messages in their other folders that have a spam tag, as
it's these messages that SA incorrectly tagged (not the ones in the 
Spam folder).

SA automatically learns messages that score above and below a
defined threshold, so you don't need to run these through again. 
What you actually want to force through Bayes are the FPs and FNs
that occur, and these are best identified by eyeball (you have no
idea how many users will put legitimate things in as spam!).

In short, automated is nice, but learn the right things, and 
ideally look at them to be sure.
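The daily pass described above can be sketched like this. It is an illustration under assumptions: messages are represented as (folder, headers, id) tuples, "Spam" is the spam folder name, and the X-Spam-Flag header is used as the spam tag. Reading real Maildir folders is left out.

```python
def is_tagged_spam(headers):
    """True when SpamAssassin tagged the message (X-Spam-Flag: YES)."""
    return headers.get("X-Spam-Flag", "").strip().upper() == "YES"


def find_corrections(messages, spam_folder="Spam"):
    """Split user-corrected mail into ham and spam relearn lists.

    Tagged mail outside the spam folder was moved out by the user
    (false positive); untagged mail inside it was dragged in
    (false negative).
    """
    relearn_as_ham, relearn_as_spam = [], []
    for folder, headers, ident in messages:
        tagged = is_tagged_spam(headers)
        if tagged and folder != spam_folder:
            relearn_as_ham.append(ident)
        elif not tagged and folder == spam_folder:
            relearn_as_spam.append(ident)
    return relearn_as_ham, relearn_as_spam


mail = [
    ("INBOX", {"X-Spam-Flag": "YES"}, "m1"),  # false positive
    ("Spam", {"X-Spam-Flag": "YES"}, "m2"),   # correctly tagged
    ("Spam", {}, "m3"),                       # false negative
    ("INBOX", {}, "m4"),                      # ordinary ham
]
print(find_corrections(mail))
```

The first list would go to "sa-learn --ham" and the second to "sa-learn --spam", ideally after someone has looked the messages over.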

HTH

R


---
This email from dns has been validated by dnsMSS Managed Email Security and is 
free from all known viruses.

For further information contact [EMAIL PROTECTED]






RE: Rule-sets

2005-04-07 Thread Chris Santerre

>
>Thanks to all who replied. We have rules_du_jour and I am now getting an
>idea of how it works. I suppose the obvious question is: has anybody
>written a good howto on writing your own rules? And if so, where is it?
>
>Ron


see this page:
http://www.rulesemporium.com/links.htm

I need to add more. 

Chris Santerre 
System Admin and SARE Ninja
http://www.rulesemporium.com 


Re: lint: issues detected - where?

2005-04-07 Thread Dermot Paikkos
Excellent - big typo in my meta rule. 

I had spelt BAYES as BAYNES.
Thanx Anthony.
Dp.


On 7 Apr 2005 at 13:39, [EMAIL PROTECTED] wrote:
> Hi,
> 
> I think your problems lie here:
> 
> 
> 
> warning: description exists for non-existent rule ANTI_BAYES_SPAMCOP_00
> warning: description exists for non-existent rule ANTI_BAYES_SPAMCOP_40
> warning: description exists for non-existent rule ANTI_BAYES_SPAMCOP_05
> warning: description exists for non-existent rule ANTI_BAYES_SPAMCOP_20
> warning: score set for non-existent rule ANTI_BAYES_SPAMCOP_00
> warning: score set for non-existent rule ANTI_BAYES_SPAMCOP_40
> warning: score set for non-existent rule ANTI_BAYES_SPAMCOP_05
> warning: score set for non-existent rule ANTI_BAYES_SPAMCOP_20
> 
> 
> 
> -- 
> Anthony Peacock   
> CHIME, Royal Free & University College Medical School
> WWW:http://www.chime.ucl.ac.uk/~rmhiajp/
> I'm in shape. - ROUND is a shape.
> 
> 
> 



Re: lint: issues detected - where?

2005-04-07 Thread a . peacock
Hi,

I think your problems lie here:



warning: description exists for non-existent rule ANTI_BAYES_SPAMCOP_00
warning: description exists for non-existent rule ANTI_BAYES_SPAMCOP_40
warning: description exists for non-existent rule ANTI_BAYES_SPAMCOP_05
warning: description exists for non-existent rule ANTI_BAYES_SPAMCOP_20
warning: score set for non-existent rule ANTI_BAYES_SPAMCOP_00
warning: score set for non-existent rule ANTI_BAYES_SPAMCOP_40
warning: score set for non-existent rule ANTI_BAYES_SPAMCOP_05
warning: score set for non-existent rule ANTI_BAYES_SPAMCOP_20
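
When the debug output is long, warnings like these are easy to miss. The following is a small Python sketch, not part of SpamAssassin, that pulls the rule names out of captured lint output; the function name is hypothetical and the pattern only covers the two warning texts shown above. Here four rule names each appear with a description warning and a score warning, which would add up to the eight issues lint counted, assuming these are the only warnings.

```python
import re

# Matches only the two warning forms quoted in this thread.
WARNING_RE = re.compile(
    r"warning: (description exists|score set) for non-existent rule\s+(\S+)"
)


def missing_rules(lint_output):
    """Map each rule lint calls non-existent to the kinds of setting that mention it."""
    # Collapse all whitespace so warnings wrapped across lines still match.
    text = " ".join(lint_output.split())
    found = {}
    for kind, rule in WARNING_RE.findall(text):
        found.setdefault(rule, set()).add(kind)
    return found
```

A rule name that turns up here usually means the rule definition itself failed to parse or is spelled differently from its "describe" and "score" lines.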



-- 
Anthony Peacock   
CHIME, Royal Free & University College Medical School
WWW:http://www.chime.ucl.ac.uk/~rmhiajp/
I'm in shape. - ROUND is a shape.




Which is better

2005-04-07 Thread Peter Marshall
I am looking for opinions.
I have been building a new mailserver to replace my old one.
The new one has postfix, Cyrus-imap, anomy, spamassassin.  I am trying 
to set up the Bayes auto-learn stuff.  Each user has a home directory on 
the server (they cannot log onto the server).  I am using the Maildir 
format.

Is it better to have a cron job run by a single user (say root) to do 
the ham / spam learning for everyone, or should I run a cron for each 
individual user.  All users belong to the same company.

Problems I have thought of with the latter:
1.  There would be approximately 130 cron jobs running sa-learn at the 
same time, or it would run constantly if I staggered it for every 
user.  What kind of load will that have on my 850 with 756 MB of RAM?

Problems I have with both:
1.  What is the best method of obtaining the spam / ham?  I have the 
server create a spam folder for each user when the user is created. 
spamassassin will automatically put all mail marked as spam in this 
folder.  Obviously I will use this folder to run sa-learn on for spam.  I 
will also instruct users to move mail that is spam that was not marked 
as spam to this folder.  My problem is, where do I run sa-learn for ham? 
If I run it on the INBOX, then I could potentially be running it on spam 
mail that has not yet been moved to the spam directory.

2. How often should I run sa-learn?  Users here for the most part get 
mail in their inbox and then after reading it move it to some other sub 
folder (of which everyone's is different, and some have over 100).

Are there any downsides to running a site-wide one?  What is the best 
method of doing this if this is a better method?  Currently I plan to 
use this to learn the spam.  Does anyone see any problems?
(Note:  this assumes it is being run as a particular user.)

/usr/bin/sa-learn --spam --dir ~/Maildir/.Spam/new
/usr/bin/sa-learn --spam --dir ~/Maildir/.Spam/cur
mv ~/Maildir/.Spam/new/* ~/Maildir/.Trash
mv ~/Maildir/.Spam/cur/* ~/Maildir/.Trash
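
One thing to watch with the commands above: the mv lines fail with an error when a directory is empty, and the files are moved even if sa-learn itself failed. A hedged Python sketch of the same steps follows; the function names and the trash_dir parameter are illustrative, and the sa-learn path and flags are simply the ones from the commands above.

```python
import shutil
import subprocess
from pathlib import Path


def learn_commands(maildir, sa_learn="/usr/bin/sa-learn"):
    """Build one sa-learn invocation per Spam subdirectory, as in the commands above."""
    spam = Path(maildir) / ".Spam"
    return [[sa_learn, "--spam", "--dir", str(spam / sub)] for sub in ("new", "cur")]


def move_learned(maildir, trash_dir):
    """Move every file from .Spam/new and .Spam/cur into trash_dir; return the count."""
    spam = Path(maildir) / ".Spam"
    moved = 0
    for sub in ("new", "cur"):
        # An empty or missing directory simply yields nothing to move.
        for item in sorted((spam / sub).glob("*")):
            if item.is_file():
                shutil.move(str(item), str(Path(trash_dir) / item.name))
                moved += 1
    return moved


def learn_and_clear(maildir, trash_dir):
    """Run sa-learn on both directories and clear them only if every run succeeded."""
    for cmd in learn_commands(maildir):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return move_learned(maildir, trash_dir)
```

Whether trash_dir should be the .Trash folder itself or its cur subdirectory depends on how the mail store reads Maildir folders, so that is left to the caller to check.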

Thanks for the input,
Peter




--
Peter Marshall, BCS
System Administrator, CARIS
CARIS 2005 - Mapping a Seamless Society
10th International User Group Conference and Educational Sessions
Halifax, NS, Canada
E-mail [EMAIL PROTECTED] for more.


lint: issues detected - where?

2005-04-07 Thread Dermot Paikkos
hi,

SA 3.0.0 with exim-acl

I ran spamassassin -D --lint after making a change to the local.cf 
and noticed the following:

debug: tests=ALL_TRUSTED,BAYES_40,MISSING_HEADERS,MISSING_SUBJECT,NO_REAL_NAME
debug: subtests=__HAS_MSGID,__MSGID_OK_DIGITS,__MSGID_OK_HOST,__SANE_MSGID,__UNUSABLE_MSGID
lint: 8 issues detected.  please rerun with debug enabled for more information.


I am trying to find the issues but am not getting very far. I noticed 
the following that might account for 3 issues 

debug: diag: module not installed: Net::LDAP ('require' failed)
debug: diag: module not installed: Razor2::Client::Agent ('require' failed)
debug: diag: module not installed: URI ('require' failed)

I am not sure why URI is failing as it is installed and mails do get 
tagged with "3.5 URIBL_OB_SURBL"  etc. The other modules I am not 
using.

Can anyone help me try and locate the other errors? I have tried to 
pass mails to SA with:
>spamassassin -D < testmail 
but I am not getting any more feedback than I would with lint. I have 
put the complete output below.

Thanx.
DP.

debug: SpamAssassin version 3.0.0
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/usr/local/sbin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/sbin', keeping.
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/bin', keeping.
debug: PATH included '/usr/X11R6/bin', keeping.
debug: PATH included '/usr/games', keeping.
debug: PATH included '/opt/www/htdig/bin', keeping.
debug: PATH included '/usr/lib/java/bin', keeping.
debug: PATH included '/usr/lib/java/jre/bin', keeping.
debug: PATH included '/opt/kde/bin', keeping.
debug: PATH included '/usr/lib/qt-3.2.1/bin', keeping.
debug: PATH included '/usr/share/texmf/bin', keeping.
debug: Final PATH set to: /usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/games:/opt/www/htdig/bin:/usr/lib/java/bin:/usr/lib/java/jre/bin:/opt/kde/bin:/usr/lib/qt-3.2.1/bin:/usr/share/texmf/bin
debug: diag: module installed: DBI, version 1.37
debug: diag: module installed: DB_File, version 1.804
debug: diag: module installed: Digest::SHA1, version 2.10
debug: diag: module installed: IO::Socket::UNIX, version 1.2
debug: diag: module installed: MIME::Base64, version 3.05
debug: diag: module installed: Net::DNS, version 0.48
debug: diag: module not installed: Net::LDAP ('require' failed)
debug: diag: module not installed: Razor2::Client::Agent ('require' failed)
debug: diag: module installed: Storable, version 2.04
debug: diag: module not installed: URI ('require' failed)
debug: ignore: using a test message to lint rules
debug: using "/etc/mail/spamassassin/init.pre" for site rules init.pre
debug: config: read file /etc/mail/spamassassin/init.pre
debug: using "/usr/share/spamassassin" for default rules dir
debug: config: read file /usr/share/spamassassin/10_misc.cf
debug: config: read file /usr/share/spamassassin/20_anti_ratware.cf
debug: config: read file /usr/share/spamassassin/20_body_tests.cf
debug: config: read file /usr/share/spamassassin/20_compensate.cf
debug: config: read file /usr/share/spamassassin/20_dnsbl_tests.cf
debug: config: read file /usr/share/spamassassin/20_drugs.cf
debug: config: read file /usr/share/spamassassin/20_fake_helo_tests.cf
debug: config: read file /usr/share/spamassassin/20_head_tests.cf
debug: config: read file /usr/share/spamassassin/20_html_tests.cf
debug: config: read file /usr/share/spamassassin/20_meta_tests.cf
debug: config: read file /usr/share/spamassassin/20_phrases.cf
debug: config: read file /usr/share/spamassassin/20_porn.cf
debug: config: read file /usr/share/spamassassin/20_ratware.cf
debug: config: read file /usr/share/spamassassin/20_uri_tests.cf
debug: config: read file /usr/share/spamassassin/23_bayes.cf
debug: config: read file /usr/share/spamassassin/25_body_tests_es.cf
debug: config: read file /usr/share/spamassassin/25_hashcash.cf
debug: config: read file /usr/share/spamassassin/25_spf.cf
debug: config: read file /usr/share/spamassassin/25_uribl.cf
debug: config: read file /usr/share/spamassassin/30_text_de.cf
debug: config: read file /usr/share/spamassassin/30_text_fr.cf
debug: config: read file /usr/share/spamassassin/30_text_nl.cf
debug: config: read file /usr/share/spamassassin/30_text_pl.cf
debug: config: read file /usr/share/spamassassin/50_scores.cf
debug: config: read file /usr/share/spamassassin/60_whitelist.cf
debug: config: read file /usr/share/spamassassin/70_sare_bayes_poison_nxm.cf
debug: config: read file /usr/share/spamassassin/70_sare_oem.cf
debug: config: read file /usr/share/spamassassin/99_sare_fraud_post25x.cf
debug: using "/etc/mail/spamassassin" for site rules dir
debug: config: read file /etc/mail/spamassassin/anti_bayes.cf
debug: config: read file /etc/mail/spamassassin/bigevil.cf
debug: config: read file /et

RE: Rule-sets

2005-04-07 Thread Gray, Richard
> > Thanks to all who replied, we have rules_du_jour and I am now getting
> > an idea of how it works. I suppose the obvious question is has anybody
> > written a good howto on writing your own rules. And if so where is it?
> 

You probably also want to learn more about regular expressions.
There was a lot of stuff that I didn't know before I started doing this.

In particular, things like back chaining and forward referencing
are useful to understand.

I wish I could tell you I had found a good site or book about it, but I
haven't.
If you do find one, would you let me know?

HTH

R








Re: Rule-sets

2005-04-07 Thread Matthew Newton
On Thu, Apr 07, 2005 at 11:00:52AM +0100, Ron McKeating wrote:
> On Thu, 2005-04-07 at 10:53 +0100, Matthew Newton wrote:
> > Ron,
> > 
> > On Thu, Apr 07, 2005 at 10:23:24AM +0100, Ron McKeating wrote:
> > > Thanks to all of you who replied about the job offer spams. Could
> > > anybody point at the best site for the latest rulesets and an
> > > explanation of what each one does.
> > 
> > The "main" site for rules is generally http://www.rulesemporium.com/,
> > and specifically the http://www.rulesemporium.com/rules.htm page. They
> > have descriptions for what they do. You'll also find more on
> > http://www.exit0.us/, including the RulesDuJour script at
> > http://www.exit0.us/index.php?pagename=RulesDuJour that will
> > automatically check for new rules for you.
> > 
> > I can send you the current RulesDuJour settings I am using, if you like,
> > assuming you are not already using it. You should check it yourself and
> > make sure you are happy with the rules yourself, though.
> > 
> > I still find that there are some spam messages that don't seem to be
> > covered by rules, so end up writing my own. I'm no expert, but basic
> > rule-writing isn't that hard if you can write regular expressions.
> > 
> > Matthew
> > 
> 
> Thanks to all who replied, we have rules_du_jour and I am now getting an
> idea of how it works. I suppose the obvious question is has anybody
> written a good howto on writing your own rules. And if so where is it?

http://wiki.apache.org/spamassassin/WritingRules is good. Remember to
run "spamassassin --lint" before restarting SA to make sure you haven't
made any errors. I usually score my rules with 0.1 first to see how
messages are hitting them, and then increase after a few days.
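
To see how a low-scored trial rule is hitting, one option is to tally rule names from the X-Spam-Status headers of mail that has already been scanned. The Python sketch below assumes the header carries a comma-separated "tests=" field (possibly folded across lines); the function name and the rule name in the usage note are made up for illustration.

```python
import re
from collections import Counter


def rule_hits(status_headers):
    """Count rule names listed in the tests= field of X-Spam-Status header values."""
    counts = Counter()
    for value in status_headers:
        # Rejoin a test list that was folded across header lines after a comma.
        unfolded = re.sub(r",\s+", ",", value)
        match = re.search(r"tests=(\S*)", unfolded)
        if not match:
            continue
        for name in match.group(1).split(","):
            if name and name != "none":
                counts[name] += 1
    return counts
```

Feeding it the headers from a few days of mail and looking up a hypothetical LOCAL_TRIAL_RULE in the result gives a hit count to weigh before raising that rule's score.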

http://www.exit0.us/index.php?pagename=RulesBasics may also be useful.

The most confusing thing I originally found was the difference between
body, rawbody and raw. The exit0.us page above seems to explain that
fairly well.

Matthew


-- 
Matthew Newton <[EMAIL PROTECTED]>

UNIX and e-mail Systems Administrator, Network Support Section,
Computer Centre, University of Leicester,
Leicester LE1 7RH, United Kingdom


Re: Rule-sets

2005-04-07 Thread Ron McKeating
On Thu, 2005-04-07 at 10:53 +0100, Matthew Newton wrote:
> Ron,
> 
> On Thu, Apr 07, 2005 at 10:23:24AM +0100, Ron McKeating wrote:
> > Thanks to all of you who replied about the job offer spams. Could
> > anybody point at the best site for the latest rulesets and an
> > explanation of what each one does.
> 
> The "main" site for rules is generally http://www.rulesemporium.com/,
> and specifically the http://www.rulesemporium.com/rules.htm page. They
> have descriptions for what they do. You'll also find more on
> http://www.exit0.us/, including the RulesDuJour script at
> http://www.exit0.us/index.php?pagename=RulesDuJour that will
> automatically check for new rules for you.
> 
> I can send you the current RulesDuJour settings I am using, if you like,
> assuming you are not already using it. You should check it yourself and
> make sure you are happy with the rules yourself, though.
> 
> I still find that there are some spam messages that don't seem to be
> covered by rules, so end up writing my own. I'm no expert, but basic
> rule-writing isn't that hard if you can write regular expressions.
> 
> Matthew
> 

Thanks to all who replied, we have rules_du_jour and I am now getting an
idea of how it works. I suppose the obvious question is has anybody
written a good howto on writing your own rules. And if so where is it?

Ron
> 
-- 
Ron McKeating
Senior IT Services Specialist
Computing Services
Loughborough University
01509 222329



Re: Rule-sets

2005-04-07 Thread Matthew Newton
Ron,

On Thu, Apr 07, 2005 at 10:23:24AM +0100, Ron McKeating wrote:
> Thanks to all of you who replied about the job offer spams. Could
> anybody point at the best site for the latest rulesets and an
> explanation of what each one does.

The "main" site for rules is generally http://www.rulesemporium.com/,
and specifically the http://www.rulesemporium.com/rules.htm page. They
have descriptions for what they do. You'll also find more on
http://www.exit0.us/, including the RulesDuJour script at
http://www.exit0.us/index.php?pagename=RulesDuJour that will
automatically check for new rules for you.

I can send you the current RulesDuJour settings I am using, if you like,
assuming you are not already using it. You should check it yourself and
make sure you are happy with the rules yourself, though.

I still find that there are some spam messages that don't seem to be
covered by rules, so end up writing my own. I'm no expert, but basic
rule-writing isn't that hard if you can write regular expressions.

Matthew


-- 
Matthew Newton <[EMAIL PROTECTED]>

UNIX and e-mail Systems Administrator, Network Support Section,
Computer Centre, University of Leicester,
Leicester LE1 7RH, United Kingdom


Rule-sets

2005-04-07 Thread Ron McKeating
Thanks to all of you who replied about the job offer spams. Could
anybody point at the best site for the latest rulesets and an
explanation of what each one does.

Ron
-- 
Ron McKeating
Senior IT Services Specialist
Computing Services
Loughborough University
01509 222329



Re: Spam is marked but delivered anyway

2005-04-07 Thread Duncan Hill
On Thursday 07 April 2005 09:38, [EMAIL PROTECTED] typed:
> I recently took over admin duty for a mailserver. The system I'm taking
> over is a FreeBSD mailserver with SpamAssassin 3.0.2 running.
> Unfortunately, a significant amount of spam gets through, and I think I
> know the reason.
>
> The spam that gets through is marked with [Suspected Spam]. SpamAssassin
> says that it is possible spam and attaches the actual message as an
> attachment. The number of points is always ridiculously high-- generally

SpamAssassin is only a tagging filter, not a delivery agent.  You need 
something else in the pipeline that checks the status lines after SA is 
finished and routes the mail appropriately.

There is the chance that bayes_99 will trip on legit mail, but normally this 
only occurs if you haven't trained the bayesian database properly so that it 
has a good set of tokens representing ham and spam.
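
As an illustration of the "something else in the pipeline" step, here is a minimal Python sketch of a routing decision made from the headers SpamAssassin leaves behind. It assumes an "X-Spam-Flag: YES" header on tagged mail and a "score=" (or older "hits=") value in X-Spam-Status; the discard_at cut-off and the destination names are hypothetical, and a real setup would do this in the MTA or delivery agent rather than in a standalone script.

```python
import re
from email import message_from_string

# Accepts either spelling of the score field in X-Spam-Status.
SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def route(raw_message, discard_at=15.0):
    """Return 'inbox', 'spam-folder' or 'discard' based on SpamAssassin's headers."""
    msg = message_from_string(raw_message)
    flagged = msg.get("X-Spam-Flag", "").strip().upper() == "YES"
    if not flagged:
        return "inbox"
    match = SCORE_RE.search(msg.get("X-Spam-Status", ""))
    score = float(match.group(1)) if match else None
    # Only very high scores are dropped outright; everything else tagged is filed.
    if score is not None and score >= discard_at:
        return "discard"
    return "spam-folder"
```

Keeping a spam folder for mid-range scores leaves room to recover the occasional false positive.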


Spam is marked but delivered anyway

2005-04-07 Thread mailings
I recently took over admin duty for a mailserver. The system I'm taking
over is a FreeBSD mailserver with SpamAssassin 3.0.2 running.
Unfortunately, a significant amount of spam gets through, and I think I
know the reason.

The spam that gets through is marked with [Suspected Spam]. SpamAssassin
says that it is possible spam and attaches the actual message as an
attachment. The number of points is always ridiculously high-- generally
20+ -- but since it is marked as BAYES_99, it goes through. I guess the
old sysadmin assumed that BAYES_99 could be legit email so he let them
through... But there's a DCC_CHECK and lots of other rules that match the
spam, and it definitely should go in the trash.

However, I perused through the configuration files for SA (local.cf and
user_prefs), and I can't find out how he set this up. I'd like to change
it to let through BAYES_99 only if it isn't accompanied by any other
high-value rules. Any idea where I can find the configuration information
for the setup and how I can change it to how I want it?


Re: it's getting worse again

2005-04-07 Thread Martin Hepworth
Florin
Depends on how well it's set up in the first place. The default rulesets 
are a pretty good starting point, but I find I need to add quite a few 
extra ones in from www.rulesemporium.com etc in order to get a reasonable 
catch rate.

The URI-RBL from surbl.org has helped tremendously in providing a more 
automatic update system and the (my_)rules_du_jour from the SARE gang at 
rulesemporium.com helps with their rules.

From what I hear of the developers they have been discussing ways of 
providing an auto-update mechanism, but they are trying to lock it down 
from what I see, so you know the updates are from them and not spoofed.

--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300
Florin Andrei wrote:
I'm using SA since... well, a long time ago, and one thing that i
noticed was a pattern in the way its efficiency varies: it's pretty good
soon after a new release, then it gets continuously worse; then a new
release and all of a sudden it's good again, then it starts "decaying"
again...
Well, it's been a while since the last release, and it's already
noticeably worse. I know this has been discussed before, i am aware of
the VirusScannerTypeUpdates FAQ entry, but you know what, from an end-
user's point of view, it does not matter. All that matters is that,
despite brilliant technical discussions, the efficiency is going down
and, if a new version is not released soon enough, the users start to
complain. This is what's happening right now.
I guess something has to change. "Then change it yourself" type of
advices will go straight to /dev/null, thank you, because as far as SA
is concerned, i'm just a user. I am merely pointing out the problem.
**
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.
This footnote confirms that this email message has been swept
for the presence of computer viruses and is believed to be clean.   
**


Re: RDJ and bogus virus warnings rule

2005-04-07 Thread Martin Hepworth
Chris
Tim does seem to have quite a few problems with people getting to this 
on a regular basis. Perhaps he needs to host directly off the SARE 
pages? Or can anyone give him bandwidth?

--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300
Chris wrote:
I think RDJ has been trying for about a week and a half now to download this 
rule.  I either get a '403' error or today I got a new one:

RulesDuJour Run Summary on cpollock.localdomain:
The following rules had errors:
Tim Jackson's (et al) bogus virus warnings had an unknown error:
curl exit code: 18
curl: (18) transfer closed with 79007 bytes remaining to read
200
Just tried a few minutes ago and got another '403'.  Has anyone successfully 
gotten this updated rule?

Chris


Re[2]: it's getting worse again

2005-04-07 Thread Robert Menschel
Hello Florin,

Wednesday, April 6, 2005, 5:40:10 PM, you wrote:

FA> So what is the reason why BAYES_99 is scored so low?

The algorithm/process that determines scores came out with a low score
like that.  It seemed a good bet for this new version.  Many of us
have decided that it wasn't, and we've increased the score for it in
our systems.

One of the strengths of SA is its flexibility -- if you want to be
more aggressive, raise the score(s) or lower the threshold. If you
want to be more conservative, lower the score(s) or raise the
threshold. SA's scores are aimed at a fairly conservative target,
since false positives are horrendously worse than false negatives.
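
The trade-off is simple arithmetic: a message is tagged when the sum of the scores of the rules it hit reaches the required threshold. A small Python sketch with made-up scores (they are not the shipped defaults) shows both levers.

```python
def verdict(hits, scores, required=5.0):
    """Return (is_spam, total) for the rules a message hit, given per-rule scores."""
    total = sum(scores.get(rule, 0.0) for rule in hits)
    return total >= required, total


# Hypothetical scores, chosen only to illustrate the two adjustments.
scores = {"BAYES_99": 2.0, "HTML_MESSAGE": 0.5}
hits = ["BAYES_99", "HTML_MESSAGE"]

as_shipped = verdict(hits, scores)                        # below the threshold
raised = verdict(hits, dict(scores, BAYES_99=5.0))        # raise the rule score
lowered = verdict(hits, scores, required=2.5)             # or lower the threshold
```

Raising one rule's score affects only messages that hit that rule, while lowering the threshold makes every rule count for more, so the first is usually the more conservative change.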

>> I'm a little puzzled what you're asking for, then;  addon rulesets are
>> available from SARE, and somewhere there's a tool to automatically check
>> for updates on those rules.

FA> My impression, after a quick perusal, was that any mentions about SARE
FA> and the like are pretty well "hidden" on the SA main website.
FA> Yes, there is a mention, but there's a big fat "Use at your own risk"
FA> warning at the top of the page. What would a new user think?

A new user should think, and think twice, before using SARE rules.
A new user that doesn't read this list probably should think three or
four times before using SARE rules, and should do so slowly and
carefully, if at all, and only after reading the documentation within
those rules files.

A user who has read this list and sees how many people use SARE rules,
should also be capable of looking at the documentation within those
rules and deciding which ones might be worth trying. (And should
probably do so slowly and carefully anyway.)

>> If you're really not interested in tweaking your SA setup

FA> I've a fairly demanding job, i've a few pretty convoluted personal
FA> projects i'm involved in, i've a family and other details that typically
FA> show up if one is not an archetypal pale-faced geek-in-the-basement.
FA> I do try to take care of my personal webserver (to which i'm the sole
FA> admin), mailserver (SA, Postfix, Cyrus, Squirrelmail), VoIP PBX, etc.,
FA> despite the schedule overload.

FA> And these days i was looking at SA and i'm, like, "it's not gonna
FA> happen, i don't have time for this." I chose to play the dumb user on
FA> purpose, just because i can't fix everything myself.

I hold down a 50-60 hour work week, family, volunteer time for NPOs,
plus personal interests, and still find time to fight spam via SARE,
because it's that important to me. Personal preference.

If you don't want to spend the time required to tweak SA to a high
enough performance (you probably don't need the 99.9% accuracy I
want), then you can buy someone else's package and let them worry about
the tweaking.

There's a balance point -- some time invested vs some gain received.
The question becomes whether the gain received does balance the time
invested, and only you can answer that question.

Bob Menschel




Re: it's getting worse again

2005-04-07 Thread Robert Menschel
Hello Florin,

Wednesday, April 6, 2005, 11:29:51 AM, you wrote:

FA> I'm using SA since... well, a long time ago, and one thing that i
FA> noticed was a pattern in the way its efficiency varies: it's pretty good
FA> soon after a new release, then it gets continuously worse; then a new
FA> release and all of a sudden it's good again, then it starts "decaying"
FA> again...

FA> Well, it's been a while since the last release, and it's already
FA> noticeably worse. I know this has been discussed before, i am aware of
FA> the VirusScannerTypeUpdates FAQ entry, but you know what, from an end-
FA> user's point of view, it does not matter. All that matters is that,
FA> despite brilliant technical discussions, the efficiency is going down
FA> and, if a new version is not released soon enough, the users start to
FA> complain. This is what's happening right now.

FA> I guess something has to change. "Then change it yourself" type of
FA> advices will go straight to /dev/null, thank you, because as far as SA
FA> is concerned, i'm just a user. I am merely pointing out the problem.

That's one of the goals of SARE, to provide useful rule updates to
keep SpamAssassin's performance high even late in the cycle between
releases.

IMO we do very well.  My systems are still running 99.9% accurate at
this date (processing about 50k emails a week, 50/50 ham/spam).

To benefit from this work, you need to be able to judiciously apply
SARE updates whenever they come out.

Bob Menschel





Re: it's getting worse again

2005-04-07 Thread Florin Andrei
On Wed, 2005-04-06 at 15:53 -0400, Kris Deugau wrote:

> This WILL HAPPEN if you rely entirely on static rules - spammers adjust
> their tactics to avoid those rules.  That's why dynamic rules or systems
> such as Bayes and SURBL are so important.

I religiously feed false negatives back into Bayes. I've a cron job
that's polling a special folder in my IMAP account (i wrote a Perl
script based on a CPAN IMAP module) and i just drag the spam there and
forget about it.

> The most common detail in most other reports like yours (you don't say
> much beyond "It's broke.  Fix it.")

Increasing number of false negatives.

> is that spam is hitting BAYES_99
> and nothing else.  In 2.6x, this wasn't a problem, BAYES_99 scored over
> the threshold of 5 in the default setup, and spam would be correctly
> tagged in that case.  With 3.x, the BAYES_nn scores have been rather
> reduced, and a number of people have reported good results from just
> copying the 2.64 BAYES_nn scores.

So what is the reason why BAYES_99 is scored so low?

> I'm a little puzzled what you're asking for, then;  addon rulesets are
> available from SARE, and somewhere there's a tool to automatically check
> for updates on those rules.

My impression, after a quick perusal, was that any mentions about SARE
and the like are pretty well "hidden" on the SA main website.
Yes, there is a mention, but there's a big fat "Use at your own risk"
warning at the top of the page. What would a new user think?

> If you're really not interested in tweaking your SA setup



I've a fairly demanding job, i've a few pretty convoluted personal
projects i'm involved in, i've a family and other details that typically
show up if one is not an archetypal pale-faced geek-in-the-basement.
I do try to take care of my personal webserver (to which i'm the sole
admin), mailserver (SA, Postfix, Cyrus, Squirrelmail), VoIP PBX, etc.,
despite the schedule overload.

And these days i was looking at SA and i'm, like, "it's not gonna
happen, i don't have time for this." I chose to play the dumb user on
purpose, just because i can't fix everything myself.

I do apologize for not reporting the actual nature of the problem.

-- 
Florin Andrei

http://florin.myip.org/



RDJ and bogus virus warnings rule

2005-04-07 Thread Chris
I think RDJ has been trying for about a week and a half now to download this 
rule.  I either get a '403' error or today I got a new one:

RulesDuJour Run Summary on cpollock.localdomain:

The following rules had errors:
Tim Jackson's (et al) bogus virus warnings had an unknown error:
curl exit code: 18
curl: (18) transfer closed with 79007 bytes remaining to read
200

Just tried a few minutes ago and got another '403'.  Has anyone successfully 
gotten this updated rule?

Chris

-- 
Chris
Registered Linux User 283774 http://counter.li.org
18:57:14 up 1:25, 2 users, load average: 0.49, 0.48, 0.44
Mandrake Linux 10.1 Official, kernel 2.6.8.1-12mdk

Green's Law of Debate:
Anything is possible if you don't know what you're talking about.

Live - Classic Rock - From Virgin Radio UK Derek And The Dominos - Layla -