Querying the AWL

2008-09-11 Thread David Allsopp
I've been happily using SpamAssassin with MIMEDefang for nearly a year now.
I have a question about controlling and querying the whitelist.

The per-user automatic whitelist is enabled and clearly doing something
(because it's growing in size) but I can't find much documentation about it.
Is there any way to query the email addresses stored in the AWL? For
example, periodically I wouldn't mind going through the AWL and promoting
addresses to the actual whitelist for each user (and then mark all the
others as permanently blacklisted) just to help the AWL on its way. Also,
as I use MIMEDefang, this would allow me to implement the blacklist earlier
without actually calling spamc as I could bounce the email immediately
following the SMTP FROM clause. Looking through the docs - I can't see any
way of querying the addresses and scores from the AWL... did I miss
something?

Ultimately, what I'm planning on doing, either in my MIMEDefang filter or by
parsing the sendmail logs every now and then, is to update users' whitelists
so that any email address emailed *by* a user is automatically added to
their personal whitelist in user_prefs. Additionally, because most of my
users use Outlook, I'd periodically synchronise Outlook address books with
the server. MIMEDefang is configured to bounce email above a certain
threshold: giving it the users' address books allows this bounce threshold
to be very low (e.g. 3-5) as MIMEDefang could use a higher bounce threshold
(e.g. 10) for recognised email addresses - which would hopefully still catch
SPAM with a forged from address (though from what I can see it's relatively
rare to get SPAM from a forged address that you actually know).

Is this:

a) Sensible (is it a good idea to have a huge number of email addresses in
user_prefs?)
b) Has anyone configured SpamAssassin (via MIMEDefang or any other milter)
to work this before?

I've googled around and couldn't see anything. I'm invoking SpamAssassin via
spamc running version 3.2.5 on Fedora 9 but I don't expect that's relevant
here.

Thanks in advance for any advice/tips


David 



Re: sa-learn with tagged mail

2008-09-11 Thread Massimiliano Marini
 'man Mail::SpamAssassin::Conf' and read about bayes_ignore_header. 

Very helpful, thanks :)

-- 
Massimiliano Marini - http://www.linuxtime.it/massimilianomarini/
It's easier to invent the future than to predict it.  -- Alan Kay


Capture -D --lint output

2008-09-11 Thread Jack L. Stone
Folks, I'm trying to capture/grep specific given info from the subject
output, like this:

#spamassassin -D --lint | grep database

I KNOW that doesn't work, but describes my issue at hand. I've spent an
hour+ searching for others with this same question without success. I
remember this being posted on this list apprx 2 years ago and I can't find
it now.

Piping and grepping is easy to grab on other commands, but this one escapes
me.

Appreciate any help.

Jack

(^_^)
Happy trails,
Jack L. Stone

System Admin
Sage-american


RE: Querying the AWL

2008-09-11 Thread Giampaolo Tomassoni
 -Original Message-
 From: David Allsopp [mailto:[EMAIL PROTECTED]
 Sent: Thursday, September 11, 2008 1:58 PM
 
 I've been happily using SpamAssassin with MIMEDefang for nearly a year
 now.
 I have a question about controlling and querying the whitelist.
 
 The per-user automatic whitelist is enabled and clearly doing
 something (because it's growing in size) but I can't find much
 documentation about it.
 Is there any way to query the email addresses stored in the AWL? For
 example, periodically I wouldn't mind going through the AWL and
 promoting
 addresses to the actual whitelist for each user (and then mark all the
 others as permanently blacklisted) just to help the AWL on its way.
 Also,
 as I use MIMEDefang, this would allow me to implement the blacklist
 earlier
 without actually calling spamc as I could bounce the email immediately
 following the SMTP FROM clause. Looking through the docs - I can't see
 any
 way of querying the addresses and scores from the AWL... did I miss
 something?

You may use a SQL backend to store AWL data. This would help a lot in
interrogating the AWL DB.

Have a look at the Mail::SpamAssassin::Plugin::AWL perldoc to set up a
SQL-backed AWL.
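
As a rough illustration of why the SQL route makes the AWL easy to interrogate, here is a minimal Python sketch. It is not taken from the SpamAssassin documentation: it assumes an AWL table with username, email, ip, count and totscore columns (check the schema shipped with your version), uses an in-memory SQLite database as a stand-in for the real backend, and the helper name awl_means and all sample rows are made up for the example.

```python
import sqlite3

# Stand-in for a SQL-backed AWL. Table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE awl (username TEXT, email TEXT, ip TEXT, "
    "count INTEGER, totscore REAL)"
)
conn.executemany(
    "INSERT INTO awl VALUES (?, ?, ?, ?, ?)",
    [
        # (user, sender address, truncated relay address, messages seen, summed score)
        ("david", "friend@example.org", "192.0", 12, -18.0),
        ("david", "bulk@example.net", "198.51", 9, 171.0),
        ("david", "once@example.com", "203.0", 1, 2.5),
    ],
)


def awl_means(conn, username, min_count=5):
    """Return (email, ip, mean score) for senders seen at least min_count times."""
    rows = conn.execute(
        "SELECT email, ip, totscore / count AS mean FROM awl "
        "WHERE username = ? AND count >= ? ORDER BY mean",
        (username, min_count),
    )
    return rows.fetchall()


means = awl_means(conn, "david")
for email, ip, mean in means:
    print(f"{email:24} {ip:8} {mean:6.1f}")
```

Filtering on a minimum message count keeps one-off senders out of any list you might later promote to a whitelist or blacklist.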


 Ultimately, what I'm planning on doing, either in my MIMEDefang filter
 or by parsing the sendmail logs every now and then, is to update
 users' whitelists so that any email address emailed *by* a user is
 automatically added to their personal whitelist in user_prefs.
 Additionally, because most of my users use Outlook, I'd periodically
 synchronise Outlook address books with the server. MIMEDefang is
 configured to bounce email above a certain threshold: giving it the
 users' address books allows this bounce threshold to be very low
 (e.g. 3-5) as MIMEDefang could use a higher bounce threshold
 (e.g. 10) for recognised email addresses - which would hopefully still
 catch SPAM with a forged from address (though from what I can see it's
 relatively rare to get SPAM from a forged address that you actually know).
 
 Is this:
 
 a) Sensible (is it a good idea to have a huge number of email addresses
 in
 user_prefs?)
 b) Has anyone configured SpamAssassin (via MIMEDefang or any other
 milter)
 to work this before?

This is a feature often referred to as Pen Pals. I know amavisd
has support for it.

Giampaolo


 I've googled around and couldn't see anything. I'm invoking
 SpamAssassin via
 spamc running version 3.2.5 on Fedora 9 but I don't expect that's
 relevant
 here.
 
 Thanks in advance for any advice/tips
 
 
 David



Re: Capture -D --lint output

2008-09-11 Thread Mariusz Kruk
On Thu, 2008-09-11 at 07:53 -0500, Jack L. Stone wrote:
 Folks, I'm trying to capture/grep specific given info from the subject
 output, like this:
 
 #spamassassin -D --lint | grep database
 
 I KNOW that doesn't work, but describes my issue at hand. I've spent an
 hour+ searching for others with this same question without success. I
 remember this being posted on this list apprx 2 years ago and I can't find
 it now.
 
 Piping and grepping is easy to grab on other commands, but this one escapes
 me.

I don't understand what's your problem. You can't redirect stderr to
different stream or what?

-- 
\.\.\.\.\.\.\.\.\.\.\.\.\.\ A  new  film  by  Borg...  Star Borg VI: The
[EMAIL PROTECTED] Unassimilated Country
\.http://epsilon.eu.org/\.\ 
.\.\.\.\.\.\.\.\.\.\.\.\.\. 



Re: Capture -D --lint output

2008-09-11 Thread John Wilcock

Mariusz Kruk wrote:

On Thu, 2008-09-11 at 07:53 -0500, Jack L. Stone wrote:

Folks, I'm trying to capture/grep specific given info from the subject
output, like this:

#spamassassin -D --lint | grep database

I KNOW that doesn't work, but describes my issue at hand. I've spent an
hour+ searching for others with this same question without success. I
remember this being posted on this list apprx 2 years ago and I can't find
it now.

Piping and grepping is easy to grab on other commands, but this one escapes
me.


I don't understand what's your problem. You can't redirect stderr to
different stream or what?



No need for that attitude, we were all newbies once...
It wouldn't have taken any longer to give the actual solution:

spamassassin -D --lint 2>&1 | grep database

John.

--
-- Over 3000 webcams from ski resorts around the world - www.snoweye.com
-- Translate your technical documents and web pages- www.tradoc.fr


Re: Capture -D --lint output

2008-09-11 Thread Matt Kettler
Jack L. Stone wrote:
 Folks, I'm trying to capture/grep specific given info from the subject
 output, like this:

 #spamassassin -D --lint | grep database

 I KNOW that doesn't work, but describes my issue at hand. I've spent an
 hour+ searching for others with this same question without success. I
 remember this being posted on this list apprx 2 years ago and I can't find
 it now.

 Piping and grepping is easy to grab on other commands, but this one escapes
 me.
   

You need to redirect stderr to stdout if you want to use a pipe.

spamassassin -D --lint 2>&1 | grep database


You can also dump it to a file:

spamassassin -D --lint 2> somefile.txt
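
The same idea, sketched in Python for anyone capturing the output from a script rather than a shell: debug output arrives on stderr, so a plain pipe on stdout never sees it unless the two streams are merged. This is only an illustration; the child process below is a stand-in that writes one line to each stream, not spamassassin itself, and its messages are invented.

```python
import subprocess
import sys

# Stand-in child: a debug-style line on stderr and a result line on stdout.
child = [
    sys.executable, "-c",
    "import sys; print('dbg: database path', file=sys.stderr); print('lint ok')",
]

# Capturing stdout only: the stderr line never reaches the pipe.
stdout_only = subprocess.run(
    child, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True
).stdout

# Merging stderr into stdout (what sh-style 2>&1 does) makes it searchable.
merged = subprocess.run(
    child, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
).stdout

matches = [line for line in merged.splitlines() if "database" in line]
print(matches)
```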




Re: Capture -D --lint output

2008-09-11 Thread ram

On Thu, 2008-09-11 at 07:53 -0500, Jack L. Stone wrote:
 Folks, I'm trying to capture/grep specific given info from the subject
 output, like this:
 
 #spamassassin -D --lint | grep database
 
spamassassin -D --lint 2>&1 | grep database



Re: Capture -D --lint output

2008-09-11 Thread Mariusz Kruk
On Thu, 2008-09-11 at 15:06 +0200, John Wilcock wrote:
 No need for that attitude, we were all newbies once...

Sorry, wasn't meant as an insult or anything like that. Was more like
surprised because I really didn't understand the problem.

 It wouldn't have taken any longer to give the actual solution:
 
 spamassassin -D --lint 2>&1 | grep database

Unless, of course, you're using another shell.
I'd send the original asker to man page of his shell anyway. To read
about input/output redirection. It can be quite useful in many other
cases.

-- 
  Kruk@ -\   | Microsoft Office 2000: Your IQ is rising
  }- epsilon.eu.org | 
http:// -/   | 
 | 



Re: Different Scores

2008-09-11 Thread Matus UHLAR - fantomas
On 10.09.08 11:24, PileOfMush wrote:
 No, I ran the spamassassin -d -t test as root. I'm not sure which user to 
 run as. I'm using qmail on plesk. I have about 6 different users with 
 the name qmail in them, plus a few mail related users as well as
 popuser.
 
 Here is what's different between the two sets of headers. I threw the Bayes
 part out as well because it's understandable. Does running as a different
 user cause this part to be different as well? This message was manually run
 through literally 1 minute later. 
 
 Automated:
 *  1.5 URIBL_JP_SURBL Contains an URL listed in the JP SURBL
 blocklist
 *  [URIs: opaqbay.com]
 *  1.1 URIBL_RHS_DOB Contains an URI of a new domain (Day Old Bread)
 *  [URIs: wildberyl.com]
 
 Manually run as root:
 *  0.3 DNS_FROM_DOB RBL: Sender from new domain (Day Old Bread)
 *  0.8 RCVD_IN_DOB RBL: Received via relay in new domain (Day Old
 Bread)
 *  0.9 URIBL_RHS_DOB Contains an URI of a new domain (Day Old Bread)
 *  [URIs: opaqbay.com]
 *  2.9 URIBL_JP_SURBL Contains an URL listed in the JP SURBL
 blocklist
 *  [URIs: opaqbay.com]

Since URIBL_JP_SURBL and URIBL_RHS_DOB have different scores in those cases,
it's clear that you are running with different settings.

% grep URIBL_JP_SURBL /var/lib/spamassassin/3.002003/updates_spamassassin_org/50_scores.cf
score URIBL_JP_SURBL 0 2.857 0 1.501 # n=0 n=2

The first run was with, the latter without, the Bayes filter...
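
To make the four-value score line concrete, here is a small Python sketch of how one of the four values is selected. It relies on the documented ordering of score sets (neither Bayes nor network tests, network tests only, Bayes only, both); the function name pick_score is made up for this illustration and is not part of SpamAssassin.

```python
def pick_score(scores, use_bayes, use_net):
    """Select one of four per-rule scores by score set: index = 2*bayes + net."""
    if len(scores) != 4:
        raise ValueError("expected four scores")
    return scores[2 * int(use_bayes) + int(use_net)]


# Values from the 50_scores.cf line quoted above.
uribl_jp_surbl = [0, 2.857, 0, 1.501]

print(pick_score(uribl_jp_surbl, use_bayes=True, use_net=True))
print(pick_score(uribl_jp_surbl, use_bayes=False, use_net=True))
```

With network tests on in both runs, the Bayes-enabled run picks 1.501 (reported as 1.5) and the Bayes-less run picks 2.857 (reported as 2.9), which matches the two sets of headers.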
-- 
Matus UHLAR - fantomas, [EMAIL PROTECTED] ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Honk if you love peace and quiet. 


Per-user preferences directory

2008-09-11 Thread Christian Gregoire
Hello,

Is there any alternative to the %u, %l or %d escapes for the virtual-config-dir
option? I have 12,000 mailboxes, which means as many entries in a single
directory...

Thanks.

Christian



  


Re: Querying the AWL

2008-09-11 Thread Jonas Eckerman

David Allsopp wrote:


The per-user automatic whitelist is enabled and clearly doing something
(because it's growing in size) but I can't find much documentation about it.


perldoc Mail::SpamAssassin::Plugin::AWL
perldoc Mail::SpamAssassin::AutoWhitelist


Is there any way to query the email addresses stored in the AWL? For
example, periodically I wouldn't mind going through the AWL and promoting
addresses to the actual whitelist for each user (and then mark all the
others as permanently blacklisted) just to help the AWL on its way.


If you trust the AWL enough to use it that way, maybe you should 
simply raise the auto_whitelist_factor. This way the AWL's score 
adjustments will get bigger without the need for new code anywhere.
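
A small worked example of what raising the factor does, as a Python sketch. It assumes the adjustment moves the message's score toward the sender's long-term mean by the fraction auto_whitelist_factor (default 0.5), which is how the AWL plugin documentation describes it; the function name and the sample numbers are invented for illustration.

```python
def awl_adjusted_score(score, sender_mean, factor=0.5):
    """Move this message's score part of the way toward the sender's historical mean."""
    return score + (sender_mean - score) * factor


# A sender whose historical mean is -2.0 sends a message that scores 6.0.
for factor in (0.5, 0.8):
    print(factor, awl_adjusted_score(6.0, -2.0, factor))
```

With a factor of 0.5 the message ends up halfway between 6.0 and -2.0; a larger factor pulls it further toward the sender's history.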


Personally I would only be prepared to straight white/blacklist 
for addresses that have a *very* high or low score in the AWL, 
but addresses with very high/low scores will result in a big 
score adjustment from the AWL anyway.


So promoting addresses from the AWL to white/black lists would 
only help if those lists are either used outside SA or used with 
short circuiting.


As for promoting the addresses to straight black/white lists 
in SA, I'm not sure if SA handles partial IP addresses for 
whitelist_from_rcvd, which is what is stored in the AWL.



as I use MIMEDefang, this would allow me to implement the blacklist earlier


This shouldn't be too hard to do if you have the AWL use a SQL 
database.


Otherwise you should be able to do it with the help of 
Mail::SpamAssassin::AutoWhitelist.



Looking through the docs - I can't see any
way of querying the addresses and scores from the AWL... did I miss
something?


perldoc Mail::SpamAssassin::AutoWhitelist should give some hints.

You might need to read some source code though.


Ultimately, what I'm planning on doing, either in my MIMEDefang filter or by
parsing the sendmail logs every now and then, is to update users' whitelists
so that any email address emailed *by* a user is automatically added to
their personal whitelist in user_prefs.


I'm doing something similar to this with MIMEDefang and a 
SpamAssassin plugin. See below.



Additionally, because most of my
users use Outlook, I'd periodically synchronise Outlook address books with
the server.


Do note that the AWL uses (a part of) the IP-address of the relay 
as well as the mail address. This information will be missing 
from the MUAs address books.


Whitelisting based only on email addresses often leads to FNs.


Is there any way to query the email addresses stored in the AWL? For
example, periodically I wouldn't mind going through the AWL and promoting
addresses to the actual whitelist for each user (and then mark all the

[snip]

MIMEDefang is configured to bounce email above a certain
threshold: giving it the users' address books allows this bounce threshold
to be very low (e.g. 3-5) as MIMEDefang could use a higher bounce threshold
(e.g. 10) for recognised email addresses


You don't actually need to do anything special in SA for this. 
Since MIMEDefang knows who a mail is from, which relay it came from, and 
which local address it is for, you could have MIMEDefang use different 
thresholds depending on this information.
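
The decision logic could look roughly like the following Python sketch (a real MIMEDefang filter would be written in Perl). Everything here is hypothetical: the function names, the idea of keying on a sender address plus a truncated relay address, and the thresholds of 4 and 10, which are simply picked from the ranges mentioned in the original question.

```python
def bounce_threshold(sender, relay_prefix, known_senders, strict=4.0, lenient=10.0):
    """Use the lenient threshold only when both address and relay are recognised."""
    key = (sender.lower(), relay_prefix)
    return lenient if key in known_senders else strict


def should_bounce(score, sender, relay_prefix, known_senders):
    """True when the SpamAssassin score reaches the threshold for this sender."""
    return score >= bounce_threshold(sender, relay_prefix, known_senders)


# One local recipient's recognised correspondents: (address, relay prefix) pairs.
known = {("friend@example.org", "192.0")}

print(should_bounce(6.0, "friend@example.org", "192.0", known))
print(should_bounce(6.0, "friend@example.org", "203.0", known))
```

Keying on the relay as well as the address means a forged From address arriving through an unfamiliar relay still gets the strict threshold.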


Using the AWL data to adjust the threshold seems odd to me. Since 
the AWL data already adjusts the score, an additional adjustment to the 
threshold based on the same data will just make the 
adjustment stronger. This can be done more easily and with less code 
by simply adjusting auto_whitelist_factor for SA.



SPAM with a forged from address (though from what I can see it's relatively
rare to get SPAM from a forged address that you actually know).


Especially if you check the sending relay as well as the mail 
address, which I think you should.



b) Has anyone configured SpamAssassin (via MIMEDefang or any other milter)
to work this before?


Not exactly what you describe, but another solution with somewhat 
similar goals.


* My mimedefang-filter saves information about all *outgoing* 
mail to a SQL database. I then have a SpamAssassin plugin that 
checks to see if incoming mail is likely to be replies to 
outgoing mail.


* The filter also keeps track of incoming spam/ham (as 
determined by SA) and uses this both to bypass SpamAssassin and 
to block mail before having to call SA.
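
The first of these two ideas can be sketched as follows. This is only a minimal Python illustration of the general approach, not the actual filter or plugin linked below: the table layout, the function names and the use of an in-memory SQLite database are all assumptions made for the example.

```python
import sqlite3

# Stand-in for the SQL database that records outgoing mail (schema is assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outgoing (local_addr TEXT, remote_addr TEXT, "
    "PRIMARY KEY (local_addr, remote_addr))"
)


def record_outgoing(conn, local_addr, remote_addr):
    """Remember that a local user has written to a remote address."""
    conn.execute(
        "INSERT OR IGNORE INTO outgoing VALUES (?, ?)",
        (local_addr.lower(), remote_addr.lower()),
    )


def is_probable_reply(conn, remote_sender, local_recipient):
    """True if the local recipient has previously mailed this remote sender."""
    row = conn.execute(
        "SELECT 1 FROM outgoing WHERE local_addr = ? AND remote_addr = ?",
        (local_recipient.lower(), remote_sender.lower()),
    ).fetchone()
    return row is not None


record_outgoing(conn, "david@example.com", "Friend@Example.org")
print(is_probable_reply(conn, "friend@example.org", "david@example.com"))
print(is_probable_reply(conn, "stranger@example.net", "david@example.com"))
```

A scanner could then lower the score, or raise the threshold, for mail where is_probable_reply returns True.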


Both the filter and the plugin are available at
http://whatever.frukt.org/mimedefangfilter.text.shtml

The filter is *huge*, but I hope it's not too hard to find the 
relevant parts of it.


Regards
/Jonas
--
Jonas Eckerman, FSDB  Fruktträdet
http://whatever.frukt.org/
http://www.fsdb.org/
http://www.frukt.org/



Folder Redirection Besides classification

2008-09-11 Thread David Carvalho
Hi !

Is it possible to redirect classified spam to another file, just after
classification,  instead of 

appending to the user regular mail file (like /var/mail/usermail) ?

Regards

David

 



Re: Folder Redirection Besides classification

2008-09-11 Thread Theo Van Dinter
On Thu, Sep 11, 2008 at 05:03:06PM +0100, David Carvalho wrote:
 Is it possible to redirect classified spam to another file, just after
 classification,  instead of 

No.

 appending to the user regular mail file (like /var/mail/usermail) ?

SA isn't doing that either.  It's just marking up the message.

-- 
Randomly Selected Tagline:
It started as all journies do, with a beginning...   - Commercial




Re: Folder Redirection Besides classification

2008-09-11 Thread mouss

David Carvalho wrote:

Hi !

Is it possible to redirect classified spam to another file, just after
classification,  instead of 


appending to the user regular mail file (like /var/mail/usermail) ?


Sure. Use maildrop, procmail, Dovecot Sieve, amavisd-new, Postfix, 
etc. SA is not involved in delivery.


MagicSpam

2008-09-11 Thread robb

Does anybody have any experience with this product?

My company wants to replace SpamAssassin with this product, due to  
SpamAssassin not being up to par with other products.


My argument is that people we give SpamAssassin to have no clue how to  
use it and what it's designed to do, therefore they think it sucks.






RE: MagicSpam

2008-09-11 Thread Martin.Hepworth
Rob

Can't say I have, but SA does need someone with a little expertise and a clue 
(tm) to get it going well. After that it takes very little extra work apart 
from upgrading every so often and running sa-update every week or so.

-- 
martin

-Original Message-
From:  [EMAIL PROTECTED]
Sent: Thursday, September 11, 2008 6:12 PM
To: users@spamassassin.apache.org
Subject: MagicSpam

Does anybody have any experience with this product?

My company wants to replace SpamAssassin with this product, due to  
SpamAssassin not being up to par with other products.

My argument is that people we give SpamAssassin to have no clue how to  use it 
and what it's designed to do, therefore they think it sucks.







Re: MagicSpam

2008-09-11 Thread Jesse Stroik

Rob,

Spamassassin is more difficult to configure because commercial products 
don't have the luxury of requiring more sysadmin configuration.  They 
have to be easy or no one would buy them.  The disadvantage of them 
being easier is that they have less flexibility, less information and 
less site-specific configuration to work with.  They also tend to be 
less accurate, erring to the side of enforcement at the risk of 
discarding legitimate mail.


It is important to check spamassassin to see which plugins are installed 
properly and working.  Spamassassin will work with only a few plugins 
installed, but it will work much better if you install all plugins that 
make sense for your site.


To maintain spamassassin well, you also have to have very level-headed 
admins who are willing to drop even very effective plugins if they have 
the potential for false positives.  You have to evaluate the plugins 
yourself, to some extent, and you have to trust behavior that you 
observe.  I recently had to decrease the score of the BOTNET plugin 
significantly.  It's not that the BOTNET plugin is doing something wrong -- 
it's simply that companies often configure their mail servers with mail 
gateways and have internal/private network Received lines that trigger 
the BOTNET plugin.


Commercial products tend to trap lots of spam, like a properly 
configured spamassassin installation, but they also tend to get a lot of 
false positives.  Consider that people complain a lot more about false 
negatives (spam that gets through) than false positives, especially if 
they don't see the false positives.  Because of this behavior pattern, 
commercial products will almost always err to the side of throwing away 
the baby with the bathwater.  And this is more dangerous to email than 
spam is.


Best,
Jesse


Re: MagicSpam

2008-09-11 Thread Aaron Wolfe
On Thu, Sep 11, 2008 at 1:11 PM,  [EMAIL PROTECTED] wrote:
 Does anybody have any experience with this product?


It appears *no one* has any experience with it... Google finds only 2
links and they are on the company's own homepage.

 My company wants to replace SpamAssassin with this product, due to
 SpamAssassin not being up to par with other products.

What is the evidence for this statement?  I move customers from
commercial solutions to my company's SA based filtering regularly and
they are typically very impressed with what we can do for them with
Spamassassin.


 My argument is that people we give SpamAssassin to have no clue how to use
 it and what it's designed to do, therefore they think it sucks.


Why would your users even need to know you are using SA?  How are they
supposed to use it?  Just configure it to make spam go away and they
should be OK with that.  You can set up some sort of quarantine or
tagging system but people generally aren't going to use it much.





From what I can find of the company behind this Magic thing, it looks
like their products are repackaged open source software.  (Their
MagicMail product appears to be qmail).  There's a pretty decent
chance they are selling you Spamassassin anyway :)


Re: MagicSpam

2008-09-11 Thread fchan

Hi,
Sorry I don't have experience with this product.
I do have limited experience with the Barracuda Networks appliance, and I 
think it is a great product for e-mail filtering; I helped a friend set 
one up on their network's email server. It is easy to set up, configure 
and maintain, so it is a great alternative to SpamAssassin. The price is 
fairly good, and since they were an educational institution they got a 
discount.

http://www.barracudanetworks.com/ns/products/spam_overview.php

Frank


Does anybody have any experience with this product?

My company wants to replace SpamAssassin with this product, due to 
SpamAssassin not being up to par with other products.


My argument is that people we give SpamAssassin to have no clue how 
to use it and what it's designed to do, therefore they think it 
sucks.




Re: Capture -D --lint output

2008-09-11 Thread Jack L. Stone
At 03:16 PM 9.11.2008 +0200, Mariusz Kruk wrote:
On Thu, 2008-09-11 at 15:06 +0200, John Wilcock wrote:
 No need for that attitude, we were all newbies once...

Sorry, wasn't meant as an insult or anything like that. Was more like
surprised because I really didn't understand the problem.

 It wouldn't have taken any longer to give the actual solution:
 
 spamassassin -D --lint 2>&1 | grep database

Unless, of course, you're using another shell.
I'd send the original asker to man page of his shell anyway. To read
about input/output redirection. It can be quite useful in many other
cases.

-- 
  Kruk@ -\   | Microsoft Office 2000: Your IQ is rising

Yes, it was the csh shell I use. I tried sh, and the suggested redirects
work fine.

Thanks
Jack

(^_^)
Happy trails,
Jack L. Stone

System Admin
Sage-american


Problems with 3.2.5

2008-09-11 Thread Micah Anderson

I just upgraded to 3.2.5 and have encountered some regressions.

First, I'm getting tons of the following in my logs, literally metric tons:

Sep 11 17:11:28 spamd2 spamd[27357]: Use of uninitialized value in 
concatenation (.) or string at 
/usr/share/perl5/Mail/SpamAssassin/Plugin/Check.pm line 1028, GEN442 line 315.

In order to get it to stop, I had to disable the shortcircuit plugin in
v320.pre. I filled a partition with this line in a couple minutes flat.

I particularly value the savings I get from this plugin, so I would like
to know how I can re-enable it!

This problem is also present in 3.2.4, but not in 3.2.3, if that helps.

Additionally, I am getting the following:

Sep 11 20:25:41 spamd2 spamd[26599]: DNS query timeout for
gamma._domainkey.gmail.com
Sep 11 20:16:02 spamd2 spamd[21923]: Compilation failed in require at
/usr/lib/perl5/Net/DNS/RR/TXT.pm line 11, GEN203 line 78.
Sep 11 20:16:02 spamd2 spamd[21923]: BEGIN failed--compilation aborted
at /usr/lib/perl5/Net/DNS/RR/TXT.pm line 11, GEN203 line 78.

These are obviously related to domainkeys/dkim, but the perl errors are
ugly.

Thanks for everyone's work on SA, it's really appreciated,
Micah



Re: Problems with 3.2.5

2008-09-11 Thread Sahil Tandon
Micah Anderson [EMAIL PROTECTED] wrote:

 In order to get it to stop, I had to disable the shortcircuit plugin in
 v320.pre. I filled a partition with this line in a couple minutes flat.
 
 I particularly value the savings I get from this plugin, so I would like
 to know how I can re-enable it!

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5776#c3

-- 
Sahil Tandon [EMAIL PROTECTED]