Re: **exact** info about "skip_rbl_checks" needed

2007-01-25 Thread Daryl C. W. O'Shea

David B Funk wrote:

On Fri, 26 Jan 2007, Daryl C. W. O'Shea wrote:


Some of my incoming messages are forwarded to my server via a rule
from accounts that some of my clients have on other ISPs' mail servers. For such
incoming messages, I have been creating a temporary copy of the message where
all headers that were ADDED by the other ISP and/or my server are
removed so that the message is brought back to the state that it was in when
originally sent by the original sender (just prior to the ISP's mail server
receiving it). This way, SA can work with what the potential spammer actually
sent, without any Received headers added.

But is that really necessary? Or would I get the same results if, under my 
configuration described above, I just left the extra added headers in there?

To get the same functionality without stripping headers you'd have to
add the forwarders' IPs to your trusted and internal networks config.


Pardon my confusion, but wouldn't it be sufficient to just add them to
the trusted networks list? (IE not adding them to internal too).


If you haven't already defined internal_networks, yes, since 
internal_networks will default to whatever you use for trusted_networks.


Having them in both trusted and internal networks is more similar to 
stripping the received headers than having them in trusted but not internal.




IIUR, internal networks are for clients that will source messages,
trusted is for MTAs that feed you. Am I missing something?


I think you are.

 - for something to be internal it has to be trusted
   (you can't have an internal but not trusted relay)

 - relays that act as MXes need to be both trusted and internal

 - all relays between an MX and SA need to also be both trusted and
   internal


Adding the forwarding situation described:

 - the MX for the account forwarding to the local account is acting
   as an MX for the final destination account (forwarding is messy)

 - all relays between an MX (the one for the account forwarding to
   the local account) and SA (thus all relays between the remote MX
   and your MX) need to be both trusted and internal
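
For illustration only, the forwarding case above could be written down roughly like this. The addresses are made up (taken from the documentation ranges) and forwarder_cf is a hypothetical helper, not something that ships with SpamAssassin; it only prints the config lines so they can be reviewed before going into local.cf.

```shell
# forwarder_cf is a hypothetical helper: for every relay address given,
# print the two lines that make it both trusted and internal.
forwarder_cf() {
  for ip in "$@"; do
    printf 'trusted_networks %s\n' "$ip"
    printf 'internal_networks %s\n' "$ip"
  done
}

# Assumed example: the remote ISP's MX plus one relay between it and your MX.
forwarder_cf 192.0.2.25 198.51.100.7
```

Keep in mind that once internal_networks is set explicitly, it no longer defaults to whatever trusted_networks contains, so your own MXes and relays have to be listed there as well.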



On the submission side (not involved in the original question) it goes 
something like this:


 - if the MSA isn't an MX or internal relay between an MX and SA
   you want it to be trusted but not internal; otherwise it has to
   be both trusted and internal and you'd better have auth tokens
   in the received headers (or be using the POPAuth plugin)


Daryl


RE: Possible false positive?

2007-01-25 Thread Aydin SASMAZ
Actually, I didn't define that rule; it just popped up, and there was no
problem a few days ago. I've looked in /etc/MailScanner/spam.assassin.prefs.conf,
and there is no .spamassassin/local_pref file either. I can't find it.

How can I find where it is defined?

thanks






Hasan Aydın ŞAŞMAZ

Deputy General Manager

BTEĞİTİM

Tel : 0212 274 6998

Fax: 0212 267 4725


-----Original Message-----
From: Theo Van Dinter [mailto:[EMAIL PROTECTED]
Sent: Friday, 26 January 2007 03:02
To: users@spamassassin.apache.org
Subject: Re: Possible false positive?

On Fri, Jan 26, 2007 at 02:32:11AM +0200, Aydin SASMAZ wrote:
> adds a test score Fw_mail 100.00 but I got the email because my email
[...]
> What is the real cause of this and how prevent spamassassin adding this
> Fw_mail 100 score any forwarded email

"Fw_mail" is not a standard rule included with SpamAssassin.  You'll want to
find where you define this rule in your configs and disable it.

-- 
Randomly Selected Tagline:
If a can of Alpo costs 38 cents, would it cost $2.50 in Dog Dollars?



-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.



Re: **exact** info about "skip_rbl_checks" needed

2007-01-25 Thread David B Funk
On Fri, 26 Jan 2007, Daryl C. W. O'Shea wrote:

> >
> > Some of my incoming messages are forwarded to my server via a
> > rule from accounts that some of my clients have on other ISPs' mail servers.
> > For such incoming messages, I have been creating a temporary copy of the
> > message where all headers that were ADDED by the other ISP and/or my
> > server are removed so that the message is brought back to the state that it
> > was in when originally sent by the original sender (just prior to the ISP's
> > mail server receiving it). This way, SA can work with what the potential
> > spammer actually sent, without any Received headers added.
> >
> > But is that really necessary? Or would I get the same results if, under my 
> > configuration described above, I just left the extra added headers in there?
>
> To get the same functionality without stripping headers you'd have to
> add the forwarders' IPs to your trusted and internal networks config.

Pardon my confusion, but wouldn't it be sufficient to just add them to
the trusted networks list? (IE not adding them to internal too).

IIUR, internal networks are for clients that will source messages,
trusted is for MTAs that feed you. Am I missing something?


-- 
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_admin    Iowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: **exact** info about "skip_rbl_checks" needed

2007-01-25 Thread Daryl C. W. O'Shea

My question... why **exactly** can't webmail line wrap messages?  :)


Rob McEwen (PowerView Systems) wrote:


1st question:

Some of my incoming messages are forwarded to my server via a rule
from accounts that some of my clients have on other ISPs' mail servers. For such
incoming messages, I have been creating a temporary copy of the message where
all headers that were ADDED by the other ISP and/or my server are
removed so that the message is brought back to the state that it was in when
originally sent by the original sender (just prior to the ISP's mail server
receiving it). This way, SA can work with what the potential spammer actually
sent, without any Received headers added.

But is that really necessary? Or would I get the same results if, under my 
configuration described above, I just left the extra added headers in there?


To get the same functionality without stripping headers you'd have to 
add the forwarders' IPs to your trusted and internal networks config.




(I'm concerned that, even with RBL checks turned off via skip_rbl_checks, there might still be SPF checking or
other things going on which then might get messed up if I don't present the message in its
"original" form. PLEASE... let me know if that is the case. This will only be about the
10th time that I've asked what other "network checks" happen besides Razor/DCC when
skip_rbl_checks is set to true.)


SPF checks aren't RBL checks, so skip_rbl_checks doesn't affect them. 
Enabling or disabling the SPF plugin and/or rules affects whether SPF 
checks are done.


As for what other network checks are done; run a message through in 
debug mode and find out.  Everything is logged in the debug output. 
This'll only be about the 2nd time I've suggested this, 8 more to go to 
catch up. :)
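
A rough sketch of that debug run, for what it's worth. It captures the full debug output and then keeps only the lines from channels that look network-related. The channel names in the pattern are assumptions, as is the "dbg: channel:" line layout; check your own log and adjust. network_checks is a hypothetical helper name, and message1 / debug1.txt are placeholder file names.

```shell
# Keep only debug lines whose channel suggests a network lookup.
# The channel list is an assumption; extend it to match your own log.
network_checks() {
  grep -E 'dbg: (dns|spf|razor2|dcc|pyzor):' "$1"
}

# Only attempt the real run if SpamAssassin and the sample message exist.
if command -v spamassassin >/dev/null 2>&1 && [ -r message1 ]; then
  # Debug output goes to stderr; the rewritten message on stdout is discarded.
  spamassassin -D < message1 > /dev/null 2> debug1.txt
  network_checks debug1.txt || echo "no lines matched the assumed channels"
fi
```

If nothing matches, read debug1.txt directly; the point is only that everything SA does over the network shows up somewhere in that file.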




2nd question:

Does SA have any problems working with a file that OTHER programs are currently accessing 
(in "read" mode)?


Just like any other simultaneous file access, if the lock states are 
compatible you're OK.  If there is some file in a lock state that isn't 
compatible with what SA wants to do SA will continue on after a short 
time, so there's no risk in SA hanging up.



Daryl


Re: Should I use greylisting

2007-01-25 Thread uNiXpSyChO

Shaun T. Erickson wrote:
> Personally, I didn't like the added delay for first-time mails, which is
> why I chose to greylist only on blocklists, but for a minimal effort my
> spam was significantly reduced.

what are you using to greylist based on blocklists?


I use maRBL. The latest version lets me greylist (I use sqlgrey, but
there are others) anyone who is found on whatever RBLs I configure it
to check, and any connection that comes from a Windows box (the vast
majority of which are botnet zombies). It has had an immense impact on
the amount of spam that gets through to be looked at by SA & clamav.
I've been very happy with it.


hmm.  these two look like they're only for postfix.  darn.

was hoping for a Sendmail version and a SQL plugin.



Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Rich Shepard

On Thu, 25 Jan 2007, Andy Figueroa wrote:


Rich, if you can post the output as text files to a web site somewhere and
just send the link/url, that's the kindest way to do this.  And then if I
knew what I was doing, I'd go look at them and analyze them for you.
Though it won't be me, I'm sure someone will.


Andy et al.:

  You can use <http://www.appl-ecosys.com/temp-files/analyzed-spam.tgz>.

  I'll leave it there for a day. Any insight into how to better trap this
type of spam would be welcome. I have a few other representative types, too.
But, Friday evening I run sa-learn on my spam-uncaught message file and
delete them.

Thanks,

Rich

--
Richard B. Shepard, Ph.D.   |The Environmental Permitting
Applied Ecosystem Services, Inc.|  Accelerator(TM)
 Voice: 503-667-4517  Fax: 503-667-8863


Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Andy Figueroa
Rich, if you can post the output as text files to a web site somewhere
and just send the link/url, that's the kindest way to do this.  And then
if I knew what I was doing, I'd go look at them and analyze them for
you.  Though it won't be me, I'm sure someone will.


Andy Figueroa

Rich Shepard wrote:

On Thu, 25 Jan 2007, Matt Kettler wrote:


The proper command would be:

spamassassin -D bayes < message1 2> debug1.txt


  OK. I have a spam message that made it to my inbox today. Empty body, the
spam base64 encoded. SA gave it a score of 0 this morning.

  I've run it through the debug process per the above, but I've no idea how
to interpret the results or learn from them what -- if anything -- should be
tweaked.

  How should I make the message and debug output tarball available?

Rich



Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Andy Figueroa
Thanks, again, Matt.  I need all the help I can get.  I've only been 
managing my own SpamAssassin installations (two mailservers) for about 
four months and still have a lot to learn.


Andy

Matt Kettler wrote:

Andy Figueroa wrote:

You can capture the debug output by using:
spamassassin -D -t < message1 2> debug1.txt


Andy, you're missing something VERY important here. They need BAYES
debugging, not general debugging. And using -t here is pointless. Won't
hurt, but serves no useful purpose. (-t forces SA to mark the message up
and generate a report like it would for spam, even if the score isn't
over the threshold.)

The proper command would be:

spamassassin -D bayes < message1 2> debug1.txt




Re: Should I use greylisting

2007-01-25 Thread Shaun T. Erickson

> Personally, I didn't like the added delay for first-time mails, which is
> why I chose to greylist only on blocklists, but for a minimal effort my
> spam was significantly reduced.

what are you using to greylist based on blocklists?


I use maRBL. The latest version lets me greylist (I use sqlgrey, but
there are others) anyone who is found on whatever RBLs I configure it
to check, and any connection that comes from a Windows box (the vast
majority of which are botnet zombies). It has had an immense impact on
the amount of spam that gets through to be looked at by SA & clamav.
I've been very happy with it.
--
   -ste


Re: Should I use greylisting

2007-01-25 Thread Magnus Holmgren
On Friday 26 January 2007 03:21, uNiXpSyChO wrote:
> Chris Purves wrote:
> > Personally, I didn't like the added delay for first-time mails, which is
> > why I chose to greylist only on blocklists, but for a minimal effort my
> > spam was significantly reduced.
> >
> > Hope that helps.
>
> what are you using to greylist based on blocklists?

Judging from his presence on the Exim-related mailing lists he is probably 
using the Exim MTA and its ACL facilities.

-- 
Magnus Holmgren        [EMAIL PROTECTED]
   (No Cc of list mail needed, thanks)

  "Exim is better at being younger, whereas sendmail is better for 
   Scrabble (50 point bonus for clearing your rack)" -- Dave Evans
--- Begin Message ---

Marc Haber wrote:

On Tue, Jan 16, 2007 at 01:57:38PM -0700, Chris Purves wrote:
I am having difficulties getting AUTH to work for remote connections.  I
have had it working in the past, but don't normally use my server for
sending e-mail because it has a dynamic IP.  Yesterday I found that it
doesn't seem to be working at all.  I have tried with Thunderbird and
Opera to send e-mail; both say something like the server is not accepting
SMTP connections or is not set up properly.


Any chance that your ISP might be blocking incoming port 25? Does
submission on port 587 have the same problem?


The problem was along these lines.  Port 25 seems to be blocked for 
outgoing on the network I was testing the e-mail client.  I added 
listening on port 587 for situations like that and everything is working 
now; or rather it was always working and I just now realised it.  Thanks 
for pointing out the most obvious reason.  It could have taken weeks for 
my brain to turn on.




I also found that when using telnet remotely, the welcome banner was 
very slow to come up ~60s. I set rfc1413_query_timeout = 0s to get
around that.


If that didn't help, you might be experiencing DNS issues. If it
helped, I have no idea because rfc1413 timeout was always shorter than
30 seconds.


Yes, you're right.  I reset to 30s and from some hosts it takes about 
35s and from others about 3s.  I must have made a mistake when I 
measured 60s.  I have set the timeout to 5s, which I think is the 
default for exim 4.6 (I have 4.5).


Thanks again.

--
Chris


___
Pkg-exim4-users mailing list
[EMAIL PROTECTED]
http://lists.alioth.debian.org/mailman/listinfo/pkg-exim4-users
--- End Message ---




Re: Should I use greylisting

2007-01-25 Thread uNiXpSyChO

Chris Purves wrote:

Matthew Bickerton wrote:



<...snip...>

Personally, I didn't like the added delay for first-time mails, which is 
why I chose to greylist only on blocklists, but for a minimal effort my 
spam was significantly reduced.


Hope that helps.




what are you using to greylist based on blocklists?



Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Rich Shepard

On Thu, 25 Jan 2007, Matt Kettler wrote:


The proper command would be:

spamassassin -D bayes < message1 2> debug1.txt


  OK. I have a spam message that made it to my inbox today. Empty body, the
spam base64 encoded. SA gave it a score of 0 this morning.

  I've run it through the debug process per the above, but I've no idea how
to interpret the results or learn from them what -- if anything -- should be
tweaked.

  How should I make the message and debug output tarball available?

Rich

--
Richard B. Shepard, Ph.D.   |The Environmental Permitting
Applied Ecosystem Services, Inc.|  Accelerator(TM)
 Voice: 503-667-4517  Fax: 503-667-8863


SQL Bayes Store -- initialization of database

2007-01-25 Thread Tom Allison

I'm trying to initialize a database for Bayes from perl (DIY).

Using Test::More as a start I tried:

can_ok('Mail::SpamAssassin::BayesStore', ('tie_db_readonly'));

my $to = 'tom';
my $spamtest = Mail::SpamAssassin->new( {username => $to, debug=>'all'} );

isa_ok($spamtest, 'Mail::SpamAssassin');
my $bayes = 
Mail::SpamAssassin::BayesStore->new($spamtest->{bayes_store_module});
isa_ok($bayes, 'Mail::SpamAssassin::BayesStore');

$bayes->tie_db_readonly();



Everything passes...

Except for the '$bayes->tie_db_readonly()'
returns 'bayes: tie_db_readonly: not implemented'

I'm a little confused as to why.


Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Matt Kettler
Nigel Frankcom wrote:
> Debug results are available on: 
> http://dev.blue-canoe.net/spam/spam01.txt
> http://dev.blue-canoe.net/spam/debug1.txt
>
> http://dev.blue-canoe.net/spam/spam02.txt
> http://dev.blue-canoe.net/spam/debug2.txt
>
> http://dev.blue-canoe.net/spam/spam03.txt
> http://dev.blue-canoe.net/spam/debug3.txt
>
> http://dev.blue-canoe.net/spam/spam04.txt
> http://dev.blue-canoe.net/spam/debug4.txt
>
> Make of them what you will, I think I need more beer before that lot
> makes much sense :-D
>
> Kind regards
>
> Nigel
>   

Sorry, Nigel. Andy steered you a bit wrong and those debug outputs are
useless. You need "-D bayes", not just "-D".

Try it again with:

spamassassin -D bayes < message1 2> debug1.txt

Instead of
spamassassin -D -t < message1 2> debug1.txt
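
To repeat the "-D bayes" run for several saved messages in one go, something along these lines should do. This is only a sketch: bayes_debug_all is a hypothetical helper, and the output naming (spam01.txt giving spam01.bayes-debug.txt) is my own choice, not anything SA requires.

```shell
# Run each readable message through Bayes-only debugging and write
# one debug log per message next to the original file.
bayes_debug_all() {
  for msg in "$@"; do
    [ -r "$msg" ] || continue
    # Debug output is on stderr; the marked-up message on stdout is discarded.
    spamassassin -D bayes < "$msg" > /dev/null 2> "${msg%.txt}.bayes-debug.txt"
  done
}

# Example call, assuming the samples were saved under these names:
# bayes_debug_all spam01.txt spam02.txt spam03.txt spam04.txt
```

The example call is left commented out so that nothing runs until the file names match what you actually have.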




Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Matt Kettler
Andy Figueroa wrote:
> Thanks, Matt.  That sounds like a good suggestion.
>
> Nigel, since you have the emails, if you could capture the debug
> output in a file and post like you did the messages, perhaps someone
> wise could evaluate what is going on.
>
> You can capture the debug output by using:
> spamassassin -D -t < message1 2> debug1.txt

Andy, you're missing something VERY important here. They need BAYES
debugging, not general debugging. And using -t here is pointless. Won't
hurt, but serves no useful purpose. (-t forces SA to mark the message up
and generate a report like it would for spam, even if the score isn't
over the threshold.)

The proper command would be:

spamassassin -D bayes < message1 2> debug1.txt

>
> Matt Kettler wrote:
>>
>> BAYES changes are easily explained by the header changes, but a deeper
>> analysis would involve running through spamassassin -D bayes and looking
>> at the exact tokens.
>>
>



Re: Rulesdujour?

2007-01-25 Thread Matt Kettler
Gene Heskett wrote:
> On Thursday 25 January 2007 11:56, Theo Van Dinter wrote:
>   
>> On Thu, Jan 25, 2007 at 11:50:13AM -0500, Gene Heskett wrote:
>> 
>>> I got this email from Rules_Du_Jour this morning, what is the fix?
>>>   
>> Don't take this the wrong way, but did you read the errors at all?
>>
>> 
>>> Lint output: [16404] warn: config: failed to parse line, skipping:
>>> README: [16404] warn: config: failed to parse line, skipping: WARNING:
>>> YOU HAVE DOWNLOADED THIS RULESET from COMCAST. I am TERMINATING THIS
>>> ACCOUNT. [16404] warn: config: failed to parse line, skipping: Someone
>>> else will eventually have control of this webspace, possibly a
>>> malicious spammer. [16404] warn: config: failed to parse line,
>>> skipping: STOP using RDJ on this file *NOW*
>>> [16404] warn: config: failed to parse line, skipping: Also, make note
>>> of the fact that this file is for users of SA 2.64 and below.
>>>   
>> It makes it pretty clear that you should stop using it and why.
>> 
>
> Yes I did read it, but I'm not sure what rule I should remove, or if I 
> should stop using rulesdujour.  Has it fallen out of favor or was it too 
> good for somebody?
No, you shouldn't stop using RDJ.

You should however stop using RDJ to update antidrug, for the following
reasons:

1) Antidrug is no longer actively maintained. I haven't edited the rules
themselves in a very long time, over a year. You've probably downloaded
updates since, but they were all notes in the comments, e.g. "don't use this
with 3.0.0 or higher" went in back in June or July '06. October '06 saw the
ruleset updated with a comment telling you it had moved (which few read).

2) Antidrug is a part of SA as of SA 3.0.0. If you're using antidrug
with SA 3.0.0 or higher, you're possibly downgrading your antidrug
rules. Unless you're using SA 2.64 or lower, you should remove
antidrug.cf from your system completely.

3) If I ever make updates to the antidrug rules, I'd submit them to the
main SA project to avoid conflicts. I will likely NOT update
antidrug.cf. (anyone using 2.64 or older would get a much bigger boost
in accuracy from updating SA than they will from updating my rules.)

Therefore, checking Antidrug with RDJ is pointless.  In fact, the
current version of RDJ no longer supports antidrug at all for this very
reason.

So, I suggest that you take the following steps:

1) update your RDJ. Chris Thielen, the author of RDJ, has in the past
pointed out that it's no longer available via exit0.us, but can be
gotten here:
http://sandgnat.com/rdj/rules_du_jour

2) remove antidrug.cf from your system unless your SA version is 2.64 or
lower. If it is, I would SERIOUSLY consider upgrading.
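
A quick way to check which side of the 2.64 line you are on might look like the sketch below. needs_antidrug is a hypothetical helper; the assumption that "spamassassin -V" prints a line starting with "SpamAssassin version X.Y.Z" and the /etc/mail/spamassassin path are both guesses about a typical install, so adjust as needed.

```shell
# Succeeds only for SpamAssassin 2.64 or lower, the one case where
# antidrug.cf is still worth keeping according to the advice above.
needs_antidrug() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  [ "$major" -lt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -le 64 ]; }
}

if command -v spamassassin >/dev/null 2>&1; then
  # Assumed output format: "SpamAssassin version 3.1.0"
  ver=$(spamassassin -V | sed -n 's/^SpamAssassin version \([0-9][0-9.]*\).*/\1/p')
  if [ -n "$ver" ]; then
    if needs_antidrug "$ver"; then
      echo "SpamAssassin $ver: antidrug.cf still applies, but consider upgrading"
    else
      echo "SpamAssassin $ver: antidrug.cf should be removed if present"
      # Assumed config directory; adjust to your installation.
      if [ -f /etc/mail/spamassassin/antidrug.cf ]; then
        echo "found /etc/mail/spamassassin/antidrug.cf"
      fi
    fi
  fi
fi
```

The script only reports; removing the file and updating the RDJ config is left to you.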







Re: Possible false positive?

2007-01-25 Thread Theo Van Dinter
On Fri, Jan 26, 2007 at 02:32:11AM +0200, Aydin SASMAZ wrote:
> adds a test score Fw_mail 100.00 but I got the email because my email
[...]
> What is the real cause of this and how prevent spamassassin adding this
> Fw_mail 100 score any forwarded email

"Fw_mail" is not a standard rule included with SpamAssassin.  You'll want to
find where you define this rule in your configs and disable it.

-- 
Randomly Selected Tagline:
If a can of Alpo costs 38 cents, would it cost $2.50 in Dog Dollars?




Possible false positive?

2007-01-25 Thread Aydin SASMAZ
 

Hi all,

 

I'm new to the list and I have a problem with my SpamAssassin. I've
noticed that when I forward one of my emails to my own address, SpamAssassin
adds a test score of Fw_mail 100.00, but I still get the email because my
address is in the whitelist. If I forward a non-spam email from my inbox to
someone else, the email is tagged as spam. This was not the case before; it's
new to me. What is the real cause of this, and how do I prevent SpamAssassin
from adding this Fw_mail 100 score to every forwarded email?

 

Could it arise from the domain name in the "From:" header not matching
the domain name in the Message-ID?

 

The following are the headers of a forwarded email:

 

From: "Aydin SASMAZ" <[EMAIL PROTECTED]>
To: =?iso-8859-9?B?QXlk/W4g3mH+bWF6?= <[EMAIL PROTECTED]>
Subject: FW: Doc uzantili dosyalar
Date: Fri, 26 Jan 2007 01:44:42 +0200
Message-ID: <[EMAIL PROTECTED]>
MIME-Version: 1.0
Content-Type: multipart/alternative;
    boundary="=_NextPart_000_0028_01C740EB.8D22F880"
X-Mailer: Microsoft Office Outlook 11
Thread-Index: AcdAmK7skn8lL6i3QTGfXZHeMrwZawAKWpogAAYrNqA=
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028
X-btegitim-MailScanner-Information: Please contact the ISP for more
    information
X-btegitim-MailScanner: Found to be clean
X-btegitim-MailScanner-SpamCheck: not spam (whitelisted),
    SpamAssassin (score=100.814, required 5, Fw_mail 100.00,
    HTML_MESSAGE 0.00, INFO_TLD 0.81)
X-btegitim-MailScanner-From: [EMAIL PROTECTED]
X-Spam-Status: No
Status:

 

 

 

Platform: RHELAS 3.0 UP6

SpamAssassin:
spamassassin-3.1.0-1
perl-Mail-SpamAssassin-3.1.0-1

MailScanner:
MailScanner-perl-MIME-Base64-3.05-5
mailscanner-4.51.5-1
MailScanner-perl-MIME-Base64-debuginfo-3.05-5

 

 

 

Thanks in advance

 

 

H. Aydin SASMAZ





Re: Botnet plugin

2007-01-25 Thread Matthias Fuhrmann
On Thu, 25 Jan 2007, Jason Little wrote:

>
> I was wondering about the maturity of the botnet plugin and where I can get
> my hands on it again.  I used an early version of it for a while but I
> removed it because we didn't really need it and now it seems I need it again
> with all the spammers finding a way to slip a 3.7 score by spamassassin and
> when I look at the headers it's so obviously from a botnet.

this one: http://people.ucsc.edu/~jrudd/spamassassin/ ?

regards,
Matthias


Re: Should I use greylisting

2007-01-25 Thread Chris Purves

Matthew Bickerton wrote:


I have been thinking about implementing Greylisting. However, I am worried
about blocking/long delays with e-mails from mail farms (gmail, yahoo etc.)



You could compromise by greylisting based on blocklists (such as 
spamhaus, etc.).  This would free up some resources by rejecting a fair 
amount of mail that would otherwise go to spamassassin.  For my setup 
(consisting of two users), greylisting with this method eliminates half
of the spam that would have otherwise gone to spamassassin (about 250 of 500
per week).  It also means that you can greatly increase the greylist
time to several hours or even a day since it would be unlikely that 
legit e-mail would be greylisted, but if it was it would still get 
through, although quite delayed.  Of course if you are using blocklists 
for blocking...then that wouldn't help.


You can also add a whitelist to bypass the greylisting for large mail 
servers.


Personally, I didn't like the added delay for first-time mails, which is 
why I chose to greylist only on blocklists, but for a minimal effort my 
spam was significantly reduced.


Hope that helps.


--
Chris



RE: Should I use greylisting

2007-01-25 Thread Dylan Bouterse
I am using postgrey, which allows for whitelisting of address ranges,
specific IPs, etc. I implemented it on the Thanksgiving weekend so it
could build up its triplet database before hitting the work-week email,
and I've not had a single person complain. On the flip side, I very
rarely see spam come through that isn't sent to postmaster@, which is
whitelisted. Until the spammers build retry into their bots, I'm a
firm believer in greylisting.

Dylan

> -----Original Message-----
> From: Matthew Bickerton [mailto:[EMAIL PROTECTED]
> Sent: Thursday, January 25, 2007 7:33 AM
> To: users@spamassassin.apache.org
> Subject: Should I use greylisting
>
> Hi,
>
> I am setting up a new server, so have a chance to make big changes to my
> email server.
>
> I have been thinking about implementing Greylisting. However, I am worried
> about blocking/long delays with e-mails from mail farms (gmail, yahoo
> etc.)
>
> I would very much appreciate other people's recommendations on Greylisting
> or other approaches to reducing the load on my server by rejecting spam
> early.
> 
> Matthew



Botnet plugin

2007-01-25 Thread Jason Little

I was wondering about the maturity of the botnet plugin and where I can get
my hands on it again.  I used an early version of it for a while but I
removed it because we didn't really need it and now it seems I need it again
with all the spammers finding a way to slip a 3.7 score by spamassassin and
when I look at the headers it's so obviously from a botnet.

Jason Little
Network Admin
Mint Inc
156 Front St W suite 300
Toronto ON



Re: Sa -- lint : HOWTO know which cf file gives the problem ?

2007-01-25 Thread Matthias Fuhrmann
On Thu, 25 Jan 2007, Florent Gilain wrote:

Hi,

> Hello all,
>
> When i run this :
>
> [EMAIL PROTECTED] spamassassin]# spamassassin --lint
> [21570] warn: config: warning: description exists for non-existent rule
> MIME_BOUND_NEXTPART
> [21570] warn: config: warning: description exists for non-existent rule
> BIZ_TLD
> [21570] warn: lint: 2 issues detected, please rerun with debug enabled for
> more information
>
> I am asking myself how to know which *.cf file is the problem...is there an
> easy way to find it ?

Either in /etc/mail/spamassassin or in $PREFIX/share/spamassassin,
run for example: 'grep RULENAME *.cf'
if you were using sa-update you can find those updated main rules in
$PREFIX/var/spamassassin/3.001007/updates_spamassassin_org
this is for 3.1.7, your path might be:
$PREFIX/var/spamassassin/3.001001/updates_spamassassin_org

result is something like:

grep ZMIde_SUBBIG *.cf
70_zmi_german.cf:header   ZMIde_SUBBIG Subject =~ /(?:Eilig
70_zmi_german.cf:describe ZMIde_SUBBIG subject suggesting business
70_zmi_german.cf:score    ZMIde_SUBBIG 1.8

so the file containing the rule is 70_zmi_german.cf in the current
directory.
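
If you don't want to cd into each directory by hand, the same grep can be wrapped up roughly like this. find_rule is a hypothetical helper name and the two directories in the example calls are assumed locations; substitute whatever your install (or sa-update) actually uses.

```shell
# find_rule is a hypothetical helper: print file:line:text for every
# mention of a rule name in the *.cf files of the given directories.
find_rule() {
  rule=$1
  shift
  for dir in "$@"; do
    [ -d "$dir" ] || continue
    # /dev/null makes grep print the file name even when only one file matches;
    # "|| true" because finding nothing in a directory is not an error here.
    grep -n -- "$rule" "$dir"/*.cf /dev/null 2>/dev/null || true
  done
}

# Assumed locations; adjust to your installation.
find_rule MIME_BOUND_NEXTPART /etc/mail/spamassassin /usr/share/spamassassin
find_rule BIZ_TLD /etc/mail/spamassassin /usr/share/spamassassin
```

For the lint warning in question you would expect a "describe" line to turn up without a matching rule definition, and the file it is in is the one to fix.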

regards,
Matthias


**exact** info about "skip_rbl_checks" needed

2007-01-25 Thread Rob McEwen (PowerView Systems)
BACKGROUND:

First, I do NOT use SA for IP or URI based lookups as I do those in my own 
custom programmed spam filter.

But I do desire to use SA for such things as Razor, SARE rules, ImageInfo, etc.

Therefore, I have the following set up to prevent IP lookups:

skip_rbl_checks 1

And other items are "commented out" to prevent such things as SURBL and URIBL
lookups since I'm already doing those, too. I also choose to have Bayes
turned off.

THAT IS THE BACKGROUND... HERE IS THE QUESTION:

1st question:

Some of my incoming messages are forwarded to my server via a rule
from accounts that some of my clients have on other ISPs' mail servers. For such
incoming messages, I have been creating a temporary copy of the message where
all headers that were ADDED by the other ISP and/or my server are
removed so that the message is brought back to the state that it was in when
originally sent by the original sender (just prior to the ISP's mail server
receiving it). This way, SA can work with what the potential spammer actually
sent, without any Received headers added.

But is that really necessary? Or would I get the same results if, under my 
configuration described above, I just left the extra added headers in there?

(I'm concerned that, even with RBL checks turned off via skip_rbl_checks, there might still be
SPF checking or other things going on which then might get messed up if I
don't present the message in its "original" form. PLEASE... let me know if that
is the case. This will only be about the 10th time that I've asked what other
"network checks" happen besides Razor/DCC when skip_rbl_checks is set to true.)

2nd question:

Does SA have any problems working with a file that OTHER programs are currently 
accessing (in "read" mode)?

Thanks!

Rob McEwen
PowerView Systems
[EMAIL PROTECTED]




Re: How to deal with mailing list spam?

2007-01-25 Thread Magnus Holmgren
On Wednesday 24 January 2007 21:29, Chris Purves wrote:
> I was wondering what is the best way to deal with spam that comes
> through on mailing lists?  For mailing lists like spamassassin I
> whitelist all mail because I expect to see examples of spam, but for
> other lists, is it a good idea to run 'sa-learn --spam'?  What about
> reporting those spam to razor/pyzor or spamcop?

Check what header fields the mailing list software adds, and exclude them from 
bayes with bayes_ignore_header if they aren't already on the built-in ignore 
list. Most are covered, but I've found some that are not (at least 
Resent-Sender and Resent-Message-ID from Debian's list server). If the list 
server filters out spam well, the prevalence of ham means that everything 
added by the list software will be seen as a ham sign.

Add the mailing list server to trusted_networks and even internal_networks, 
provided that you believe it not to be accessible to spammers. In practise it 
acts like an MX that receives mail addressed (indirectly) to you and forwards 
it to you. Putting it in internal_networks means that some DNSBL rules will 
work better.
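
As a rough sketch of the first step, the header block of a saved list message can be scanned for Resent-* fields and turned into candidate config lines. resent_ignore_lines is a hypothetical helper and list-message.eml a placeholder file name; it only looks at Resent-* fields (the ones mentioned above) and does not know which headers are already on the built-in ignore list, so review the output before using it.

```shell
# Print a bayes_ignore_header line for every distinct Resent-* field
# found in the header block (everything up to the first empty line).
resent_ignore_lines() {
  sed '/^$/q' "$1" |
    grep -i '^Resent-[A-Za-z-]*:' |
    cut -d: -f1 |
    sort -u |
    sed 's/^/bayes_ignore_header /'
}

# Example call on a saved message, if one exists under this placeholder name.
if [ -r list-message.eml ]; then
  resent_ignore_lines list-message.eml
fi
```

Fields added by other list software (not starting with Resent-) would need their own pattern.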

-- 
Magnus Holmgren        [EMAIL PROTECTED]
   (No Cc of list mail needed, thanks)

  "Exim is better at being younger, whereas sendmail is better for 
   Scrabble (50 point bonus for clearing your rack)" -- Dave Evans




hits=SpamAssassin Client version 3.1.1

2007-01-25 Thread Casey Ralls
Hi All,

I am using the spamd/spamc combination to scan incoming messages on my qmail/vpopmail system.

My SpamAssassin Client version is 3.1.1.
My platform is FreeBSD 5.4-RELEASE.

I recently made some changes to my configuration and am running spamc as user qscand, and I'm sure I made some other changes through trial and error.

My symptom is that all my mail headers show the following line:

X-Spam-Status: No, hits=SpamAssassin Client version 3.1.1 required=?

It used to show an actual number for hits, so I'm assuming that SpamAssassin is not working properly.  Can anyone assist me in diagnosing this issue?

Best Regards,
--Casey


Re: NOTICE: 3.2.0 rescoring mass-checks

2007-01-25 Thread Fred Tarasevicius
Hello Justin,

Thursday, January 25, 2007, 12:57:18 PM, you wrote:

> hi all --

> OK, if you're planning to send us mass-check logs for the 3.2.0 rescoring,
> now's the time!

OK, so we can start running the tests now?  Just to confirm the
procedure: we svn update to the latest release, start the mass-checks
as outlined on the wiki page, and send the logs in when we are done?

-- 
Best regards,
 Fred                          mailto:[EMAIL PROTECTED]



Re: Should I use greylisting

2007-01-25 Thread Chris St. Pierre

"Steven W. Orr" <[EMAIL PROTECTED]> wrote:


I'm running sendmail and I want a good greylist that uses a mysql
database. There are all sorts of things out there but they're not
dbms based.


Relaydelay (http://projects.puremagic.com/greylisting/downloads.html)
is the only Sendmail greylister I know of that uses MySQL.

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University

Never send mail to [EMAIL PROTECTED]



Re: Rulesdujour?

2007-01-25 Thread Gene Heskett
On Thursday 25 January 2007 12:33, Nigel Frankcom wrote:

>On Thu, 25 Jan 2007 12:20:09 -0500, Gene Heskett
>
><[EMAIL PROTECTED]> wrote:
>>On Thursday 25 January 2007 11:56, Theo Van Dinter wrote:
>>>On Thu, Jan 25, 2007 at 11:50:13AM -0500, Gene Heskett wrote:
>>>> I got this email from Rules_Du_Jour this morning, what is the fix?
>>>
>>>Don't take this the wrong way, but did you read the errors at all?
>>>
>>>> Lint output: [16404] warn: config: failed to parse line, skipping:
>>>> README: [16404] warn: config: failed to parse line, skipping:
>>>> WARNING: YOU HAVE DOWNLOADED THIS RULESET from COMCAST. I am
>>>> TERMINATING THIS ACCOUNT. [16404] warn: config: failed to parse
>>>> line, skipping: Someone else will eventually have control of this
>>>> webspace, possibly a malicious spammer. [16404] warn: config: failed
>>>> to parse line, skipping: STOP using RDJ on this file *NOW*
>>>> [16404] warn: config: failed to parse line, skipping: Also, make
>>>> note of the fact that this file is for users of SA 2.64 and below.
>>>
>>>It makes it pretty clear that you should stop using it and why.
>>
>>Yes I did read it, but I'm not sure what rule I should remove, or if I
>>should stop using rulesdujour.  Has it fallen out of favor or was it
>> too good for somebody?
>>
>>FWIW, rulesdujour, if it's complaining about a package, should not only
>> say it's an out of date package, but should name it so that one can
>> find and remove it!  This message didn't arrive until after this one
>> this morning:
>>
>>Matt Kettler's AntiDrug has changed on coyote.coyote.den.
>>Version line: # rev 0.65 10/01/2006 - updated URL, etc
>>
>>So I assume that's the file being bitched about, so I've removed
>> several of them in the /etc/spamassassin/rulesdujour dir, and removed
>> the antidrug thing from /etc/rulesdujour/config.
>>
>>Damn I get enough of that, some of them claim I could get it up if I
>> was 100 years old.  But I'm diabetic & 72, so the chances are
>> somewhere between damned slim and none.
>
>What else is in your RDJ config? It might be worth taking a walk
>through the rules site and just checking what  you've got and what, if
>any have been obfuscated.
>
>Kind regards
>
>Nigel
TRUSTED_RULESETS="EVILNUMBERS EVILNUMBERS1 EVILNUMBERS2 BOGUSVIRUS 
SARE_ADULT SARE_BAYES_POISON_NXM SARE_BML SARE_CODING 
SARE_REDIRECT_POST300 SARE_GENLSUBJ SARE_UNSUB SARE_HEADER0 SARE_HEADER2 
SARE_OBFU0 SARE_OBFU1 SARE_OEM SARE_RANDOM SARE_URI0 SARE_URI1 SARE_URI3 
SARE_URI_ENG SARE_WHITELIST SARE_WHITELIST_SPF SARE_WHITELIST_RCVD 
SARE_SPECIFIC SARE_STOCKS SARE_FRAUD SARE_SPOOF ZMI_GERMAN"
SA_DIR="/etc/mail/spamassassin"
MAIL_ADDRESS="[EMAIL PROTECTED]"
SA_RESTART="killall -HUP spamd"

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2007 by Maurice Eugene Heskett, all rights reserved.


NOTICE: 3.2.0 rescoring mass-checks

2007-01-25 Thread Justin Mason
hi all --

OK, if you're planning to send us mass-check logs for the 3.2.0 rescoring,
now's the time!

http://wiki.apache.org/spamassassin/RescoreDetails has all the details.

Note that the deadline for result submission is Tuesday, Feb 6 as
described at http://wiki.apache.org/spamassassin/Release320Schedule .

cheers!

--j.


Re: Rulesdujour?

2007-01-25 Thread Gene Heskett
On Thursday 25 January 2007 11:56, Theo Van Dinter wrote:
>On Thu, Jan 25, 2007 at 11:50:13AM -0500, Gene Heskett wrote:
>> I got this email from Rules_Du_Jour this morning, what is the fix?
>
>Don't take this the wrong way, but did you read the errors at all?
>
>> Lint output: [16404] warn: config: failed to parse line, skipping:
>> README: [16404] warn: config: failed to parse line, skipping: WARNING:
>> YOU HAVE DOWNLOADED THIS RULESET from COMCAST. I am TERMINATING THIS
>> ACCOUNT. [16404] warn: config: failed to parse line, skipping: Someone
>> else will eventually have control of this webspace, possibly a
>> malicious spammer. [16404] warn: config: failed to parse line,
>> skipping: STOP using RDJ on this file *NOW*
>> [16404] warn: config: failed to parse line, skipping: Also, make note
>> of the fact that this file is for users of SA 2.64 and below.
>
>It makes it pretty clear that you should stop using it and why.

Yes I did read it, but I'm not sure what rule I should remove, or if I 
should stop using rulesdujour.  Has it fallen out of favor or was it too 
good for somebody?

FWIW, rulesdujour, if it's complaining about a package, should not only say 
it's an out of date package, but should name it so that one can find and 
remove it!  This message didn't arrive until after this one this morning:

Matt Kettler's AntiDrug has changed on coyote.coyote.den.
Version line: # rev 0.65 10/01/2006 - updated URL, etc

So I assume that's the file being bitched about, so I've removed several 
of them in the /etc/spamassassin/rulesdujour dir, and removed the 
antidrug thing from /etc/rulesdujour/config.

Damn I get enough of that, some of them claim I could get it up if I was 
100 years old.  But I'm diabetic & 72, so the chances are somewhere 
between damned slim and none.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2007 by Maurice Eugene Heskett, all rights reserved.


Sa -- lint : HOWTO know which cf file gives the problem ?

2007-01-25 Thread Florent Gilain
Hello all,

When I run this:

[EMAIL PROTECTED] spamassassin]# spamassassin --lint
[21570] warn: config: warning: description exists for non-existent rule
MIME_BOUND_NEXTPART
[21570] warn: config: warning: description exists for non-existent rule
BIZ_TLD
[21570] warn: lint: 2 issues detected, please rerun with debug enabled for
more information

I am wondering how to find out which *.cf file is causing the problem. Is
there an easy way to find it?
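
One low-tech way to narrow this down, sketched here on the assumption that the stray lines are ordinary "describe RULENAME ..." entries sitting in some .cf file, is to grep the rule directories for them. find_describe is a made-up helper name and the paths in the comments are only typical locations.

```shell
# Hypothetical helper: list the files among "$@" that contain a
# "describe RULENAME ..." line for the rule named in $1.
find_describe() {
  rule="$1"
  shift
  grep -l -E "^[[:space:]]*describe[[:space:]]+${rule}([[:space:]]|\$)" "$@"
}

# Example (directories are typical locations, adjust to your install):
#   find_describe MIME_BOUND_NEXTPART /etc/mail/spamassassin/*.cf
#   find_describe BIZ_TLD /usr/share/spamassassin/*.cf
```

A hit only shows where the description lives; the rule it describes may have been removed or renamed elsewhere. Rerunning the lint with debug enabled, as the lint output itself suggests, is the other obvious route.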

Thanks

Florent



Re: Should I use greylisting

2007-01-25 Thread --[ UxBoD ]--
On Thu, 25 Jan 2007 11:56:47 -0500 (EST)
"Steven W. Orr" <[EMAIL PROTECTED]> wrote:

> On Thursday, Jan 25th 2007 at 12:49 -, quoth --[ UxBoD ]--:
> 
> =>Check out http://policyd.sourceforge.net/ then as it allows you to
> =>specify Servers/IP that should not be greylisted. Works very well.
> =>
> 
> I know this is the wrong place to discuss this, but since I didn't
> start it, I'm taking advantage. The policyd link above is for
> postfix. What I'd like doesn't seem to exist that I know of, and I'd
> like to know if someone maybe has a pointer.
> 
> I'm running sendmail and I want a good greylist that uses a mysql 
> database. There are all sorts of things out there but they're not
> dbms based.
> 
> Anyone?
> 

try here :- http://www.greylisting.org/




Re: Rulesdujour?

2007-01-25 Thread Nigel Frankcom
On Thu, 25 Jan 2007 12:20:09 -0500, Gene Heskett
<[EMAIL PROTECTED]> wrote:

>On Thursday 25 January 2007 11:56, Theo Van Dinter wrote:
>>On Thu, Jan 25, 2007 at 11:50:13AM -0500, Gene Heskett wrote:
>>> I got this email from Rules_Du_Jour this morning, what is the fix?
>>
>>Don't take this the wrong way, but did you read the errors at all?
>>
>>> Lint output: [16404] warn: config: failed to parse line, skipping:
>>> README: [16404] warn: config: failed to parse line, skipping: WARNING:
>>> YOU HAVE DOWNLOADED THIS RULESET from COMCAST. I am TERMINATING THIS
>>> ACCOUNT. [16404] warn: config: failed to parse line, skipping: Someone
>>> else will eventually have control of this webspace, possibly a
>>> malicious spammer. [16404] warn: config: failed to parse line,
>>> skipping: STOP using RDJ on this file *NOW*
>>> [16404] warn: config: failed to parse line, skipping: Also, make note
>>> of the fact that this file is for users of SA 2.64 and below.
>>
>>It makes it pretty clear that you should stop using it and why.
>
>Yes I did read it, but I'm not sure what rule I should remove, or if I 
>should stop using rulesdujour.  Has it fallen out of favor or was it too 
>good for somebody?
>
>FWIW, rulesdujour, if it's complaining about a package, should not only say 
>it's an out of date package, but should name it so that one can find and 
>remove it!  This message didn't arrive until after this one this morning:
>
>Matt Kettler's AntiDrug has changed on coyote.coyote.den.
>Version line: # rev 0.65 10/01/2006 - updated URL, etc
>
>So I assume that's the file being bitched about, so I've removed several 
>of them in the /etc/spamassassin/rulesdujour dir, and removed the 
>antidrug thing from /etc/rulesdujour/config.
>
>Damn I get enough of that, some of them claim I could get it up if I was 
>100 years old.  But I'm diabetic & 72, so the chances are somewhere 
>between damned slim and none.

What else is in your RDJ config? It might be worth taking a walk
through the rules site and just checking what  you've got and what, if
any have been obfuscated.

Kind regards

Nigel


Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Nigel Frankcom
On Thu, 25 Jan 2007 10:28:21 -0500, Andy Figueroa
<[EMAIL PROTECTED]> wrote:

>Thanks, Matt.  That sounds like a good suggestion.
>
>Nigel, since you have the emails, if you could capture the debug output 
>in a file and post like you did the messages, perhaps someone wise could 
>evaluate what is going on.
>
>You can capture the debug output by using:
>spamassassin -D -t < message1 2> debug1.txt
>
>Andy Figueroa
>
>Matt Kettler wrote:
>> Andy Figueroa wrote:
>>> Matt (but not just to Matt), I don't understand your reply (though I
>>> am deeply in your debt for the work you do for this community).  The
>>> sample emails that Nigel posted are identical in content, including
>>> obfuscation.  I've noted the same situation.  Yet, the scoring is
>>> really different. On the low scoring ones, DCC and RAZOR2 didn't hit,
>>> and the BAYES score is different.  The main differences are in the
>>> headers' different forged From and To addresses.  I thought these
>>> samples were worthy of deeper analysis.
>> 
>> Well, there might be other analysis worth making.
>> 
>>  However,  Nigel asked why the drugs rules weren't matching. I answered
>> that question alone.
>> 
>> Not sure why the change in razor/dcc happened.
>> 
>> BAYES changes are easily explained by the header changes, but a deeper
>> analysis would involve running through spamassassin -D bayes and looking
>> at the exact tokens.
>> 

Debug results are available on: 
http://dev.blue-canoe.net/spam/spam01.txt
http://dev.blue-canoe.net/spam/debug1.txt

http://dev.blue-canoe.net/spam/spam02.txt
http://dev.blue-canoe.net/spam/debug2.txt

http://dev.blue-canoe.net/spam/spam03.txt
http://dev.blue-canoe.net/spam/debug3.txt

http://dev.blue-canoe.net/spam/spam04.txt
http://dev.blue-canoe.net/spam/debug4.txt

Make of them what you will, I think I need more beer before that lot
makes much sense :-D

Kind regards

Nigel


Re: Should I use greylisting

2007-01-25 Thread Steven W. Orr
On Thursday, Jan 25th 2007 at 12:49 -, quoth --[ UxBoD ]--:

=>Check out http://policyd.sourceforge.net/ then as it allows you to
=>specify Servers/IP that should not be greylisted. Works very well.
=>

I know this is the wrong place to discuss this, but since I didn't start 
it, I'm taking advantage. The policyd link above is for postfix. What I'd 
like doesn't seem to exist that I know of, and I'd like to know if someone 
maybe has a pointer.

I'm running sendmail and I want a good greylist that uses a mysql 
database. There are all sorts of things out there but they're not dbms 
based.

Anyone?


Re: Rulesdujour?

2007-01-25 Thread Theo Van Dinter
On Thu, Jan 25, 2007 at 11:50:13AM -0500, Gene Heskett wrote:
> I got this email from Rules_Du_Jour this morning, what is the fix?

Don't take this the wrong way, but did you read the errors at all?

> Lint output: [16404] warn: config: failed to parse line, skipping: README:
> [16404] warn: config: failed to parse line, skipping: WARNING: YOU HAVE 
> DOWNLOADED THIS RULESET from COMCAST. I am TERMINATING THIS ACCOUNT.
> [16404] warn: config: failed to parse line, skipping: Someone else will 
> eventually have control of this webspace, possibly a malicious spammer.
> [16404] warn: config: failed to parse line, skipping: STOP using RDJ on 
> this file *NOW*
> [16404] warn: config: failed to parse line, skipping: Also, make note of 
> the fact that this file is for users of SA 2.64 and below.

It makes it pretty clear that you should stop using it and why.
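
For anyone unsure what "stop using it" means in practice: elsewhere in this thread the file is identified as antidrug.cf, so the fix would be to drop that ruleset from TRUSTED_RULESETS in the RDJ config and delete the downloaded copy. A minimal sketch for checking a config follows, assuming the ruleset is listed under the name ANTIDRUG and that the config is a shell-style file like the one RDJ reads; rdj_has_ruleset is a hypothetical helper.

```shell
# Hypothetical helper: succeed if ruleset $2 is listed in TRUSTED_RULESETS
# of the shell-style RDJ config file $1.
rdj_has_ruleset() {
  conf="$1"
  want="$2"
  (
    # Source the config in a subshell so its variables do not leak out.
    . "$conf" || exit 2
    # Word splitting copes with lists wrapped over several lines.
    for name in $TRUSTED_RULESETS; do
      [ "$name" = "$want" ] && exit 0
    done
    exit 1
  )
}

# Example (path and ruleset name are assumptions):
#   rdj_has_ruleset /etc/rulesdujour/config ANTIDRUG && echo "still enabled"
```

Sourcing executes whatever is in the config file, so this should only be pointed at a file you already trust.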

-- 
Randomly Selected Tagline:
Ask them to list all 54 flavors, then order Vanilla.




Rulesdujour?

2007-01-25 Thread Gene Heskett
Greetings;

I got this email from Rules_Du_Jour this morning, what is the fix?

Thu Jan 25 05:59:52 2007
   
***WARNING***: spamassassin --lint failed.
Rolling configuration files back, not restarting SpamAssassin.
Rollback command is:  
mv -f /etc/mail/spamassassin/antidrug.cf 
/etc/mail/spamassassin/RulesDuJour/antidrug.cf.2; 
mv -f /etc/mail/spamassassin/RulesDuJour/antidrug.cf.20070125-0559 
/etc/mail/spamassassin/antidrug.cf;

Lint output: [16404] warn: config: failed to parse line, skipping: README:
[16404] warn: config: failed to parse line, skipping: WARNING: YOU HAVE 
DOWNLOADED THIS RULESET from COMCAST. I am TERMINATING THIS ACCOUNT.
[16404] warn: config: failed to parse line, skipping: Someone else will 
eventually have control of this webspace, possibly a malicious spammer.
[16404] warn: config: failed to parse line, skipping: STOP using RDJ on 
this file *NOW*
[16404] warn: config: failed to parse line, skipping: Also, make note of 
the fact that this file is for users of SA 2.64 and below.
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""you" "are""
[16404] warn:  (Missing operator before "are"?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""are" "running""
[16404] warn:  (Missing operator before "running"?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""running" "SA""
[16404] warn:  (Missing operator before "SA"?)
[16404] warn: Number found where operator expected at (eval 1122) line 1, 
near ""SA" 3.0.0"
[16404] warn:  (Missing operator before 3.0.0?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near "3.0.0 "or""
[16404] warn:  (Missing operator before "or"?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""or" "higher""
[16404] warn:  (Missing operator before "higher"?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""higher" "you""
[16404] warn:  (Missing operator before "you"?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""you" "already""
[16404] warn:  (Missing operator before "already"?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""already" "have""
[16404] warn:  (Missing operator before "have"?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""have" "antidrug""
[16404] warn:  (Missing operator before "antidrug"?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""antidrug" "and""
[16404] warn:  (Missing operator before "and"?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""and" "this""
[16404] warn:  (Missing operator before "this"?)
[16404] warn: String found where operator expected at (eval 1122) line 1, 
near ""this" "file""
[16404] warn:  (Missing operator before "file"?)
[16404] warn: config: unclosed 'if' in /etc/mail/spamassassin/antidrug.cf: 
if you are running SA 3.0.0 or higher, you already have antidrug and this 
file
[16404] warn: lint: 6 issues detected, please rerun with debug enabled for 
more information


-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2007 by Maurice Eugene Heskett, all rights reserved.


Re: True spam getting really low Bayesian points

2007-01-25 Thread Matt Kettler
Kim Christensen wrote:
> Hey list,
>
> I've recently started training our bayesian filter with spam/ham from my
> personal mailbox, to prepare for live usage on our customer accounts.
>
> % sa-learn --dump magic
> ...
> 0.000  0    340  0  non-token data: nspam
> 0.000  0    475  0  non-token data: nham
> 0.000  0  53404  0  non-token data: ntokens
> ...
>
> So far so good, and spamd is actually using the bayesian db when
> examining incoming mails. However, I find that a few of the legit ham 
> (not a majority) mails get unusually high bayesian points, while some
> of the real spam (which gets scored as spam by sa) often get bayesian
> points < 1. 
>
> Now, I'm sure I haven't trained the database with wrong messages. Is it
> a good idea to continue feeding sa-learn with example spam and ham until
> it reaches a few thousands messages, before relying on the results?
>
> I would think my current amount is sufficient, but I guess something's
> wrong with this picture :-)
>
>
>
>   
If you want to see what the tokens are that are throwing bayes off, try
running a mis-categorized message through spamassassin -D bayes. This
will turn on bayes debugging, and will print all the bayes-matching
tokens in the message (in text form) and their individual probabilities.

It's completely normal for a message to have a few tokens on "the wrong
side", so don't worry about testing every message this way; that
can lead to the mistake of micro-managing your bayes. However, it can be
useful to figure out what bayes is thinking when you have odd results.
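
A sketch of that workflow, assuming the mis-categorized message has been saved to a file. bayes_debug is a made-up wrapper name; -D bayes turns on the bayes debug area and -t is test mode, both options of the spamassassin command.

```shell
# Hypothetical wrapper: run one saved message through SpamAssassin with
# Bayes debugging on, keeping only the debug lines that mention bayes.
# $1 = message file, $2 = where to write the filtered debug output.
bayes_debug() {
  msg="$1"
  out="$2"
  # Debug output goes to stderr; the rewritten message on stdout is dropped.
  spamassassin -D bayes -t < "$msg" 2>&1 >/dev/null | grep -i 'bayes' > "$out"
}

# Example (file names are placeholders):
#   bayes_debug missed-spam.eml bayes-debug.txt
```

Comparing the output for one caught and one missed message side by side is usually more telling than reading either alone.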




Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Nigel Frankcom
On Thu, 25 Jan 2007 10:28:21 -0500, Andy Figueroa
<[EMAIL PROTECTED]> wrote:

>Thanks, Matt.  That sounds like a good suggestion.
>
>Nigel, since you have the emails, if you could capture the debug output 
>in a file and post like you did the messages, perhaps someone wise could 
>evaluate what is going on.
>
>You can capture the debug output by using:
>spamassassin -D -t < message1 2> debug1.txt
>
>Andy Figueroa
>
>Matt Kettler wrote:
>> Andy Figueroa wrote:
>>> Matt (but not just to Matt), I don't understand your reply (though I
>>> am deeply in your debt for the work you do for this community).  The
>>> sample emails that Nigel posted are identical in content, including
>>> obfuscation.  I've noted the same situation.  Yet, the scoring is
>>> really different. On the low scoring ones, DCC and RAZOR2 didn't hit,
>>> and the BAYES score is different.  The main differences are in the
>>> headers' different forged From and To addresses.  I thought these
>>> samples were worthy of deeper analysis.
>> 
>> Well, there might be other analysis worth making.
>> 
>>  However,  Nigel asked why the drugs rules weren't matching. I answered
>> that question alone.
>> 
>> Not sure why the change in razor/dcc happened.
>> 
>> BAYES changes are easily explained by the header changes, but a deeper
>> analysis would involve running through spamassassin -D bayes and looking
>> at the exact tokens.
>> 

I'll sit down with a beer later and run the debug on them. In the
meantime Steve Basford from sanesecurity.com has added them to the
Clam add on I mentioned a while back. 

Their main download point is
http://sanesecurity.com/clamav/downloads.htm (in my experience here
it's worked very well indeed). For those of you who are interested
and are running multiple servers, contact me off list for the URL to
the scripts James Rallo mod'd for updating multiple backend servers
(or you can hunt back through the mail archives for it :-D).

Kind regards

Nigel


Re: [guinevere-discuss] GWAVA dropping Guinevere

2007-01-25 Thread Rob Anderson
>>> Matt Kettler <[EMAIL PROTECTED]> 01/25/07 10:03AM >>>
Clay Davis wrote:
> Has anyone thrown this to the SA wolves... I mean group, to get their
> opinion?  (get ready to duck!)
> Clay
Disclaimer: I'm just a community member, and really don't care about
Guinevere or GWAVA, nor do I know much about either.

Their statements about accuracy make me laugh. Really, it sounds like
they're dropping the SA based product to increase sales of their
in-house engines.

Reading the article, their "spam 2.0" solution sounds like a
re-arrangement of how SA works with bayes. Of course the article isn't
technical enough to know for sure, but it sounds like they're using a SA
style rule-based autolearner to train a bayes system, but when it comes
time to score mail they use the bayes only.

How this is any kind of radical departure from SA's existing
autolearning ability is beyond me. Unless they've found some form of
learning categorizer that works better than bayes, I think they'll
eventually find there's a good reason SA uses both static rules and
bayes. Bayes alone causes lag in adapting to new spam trends.
=>
I know a great deal about both of these products (got into SA as a result of 
Guinevere) and agree with your sentiments.

I listened to them discuss this at their GWAVACON, Dallas earlier this week.  
The whole time I'm thinking SA will do this better.

The problem with guinevere is it's deployed on a windows platform!  SA on Linux 
is so much better!

Rob



Re: [guinevere-discuss] GWAVA dropping Guinevere

2007-01-25 Thread Matt Kettler
Clay Davis wrote:
> Has anyone thrown this to the SA wolves... I mean group, to get their
> opinion?  (get ready to duck!)
> Clay
Disclaimer: I'm just a community member, and really don't care about
Guinevere or GWAVA, nor do I know much about either.

Their statements about accuracy make me laugh. Really, it sounds like
they're dropping the SA based product to increase sales of their
in-house engines.

Reading the article, their "spam 2.0" solution sounds like a
re-arrangement of how SA works with bayes. Of course the article isn't
technical enough to know for sure, but it sounds like they're using a SA
style rule-based autolearner to train a bayes system, but when it comes
time to score mail they use the bayes only.

How this is any kind of radical departure from SA's existing
autolearning ability is beyond me. Unless they've found some form of
learning categorizer that works better than bayes, I think they'll
eventually find there's a good reason SA uses both static rules and
bayes. Bayes alone causes lag in adapting to new spam trends.



Re: [guinevere-discuss] GWAVA dropping Guinevere

2007-01-25 Thread Clay Davis
Has anyone thrown this to the SA wolves... I mean group, to get their
opinion?  (get ready to duck!)
Clay

>>> On 1/25/2007 at 9:49 AM, in message
<[EMAIL PROTECTED]>, "Joe Zitnik" <[EMAIL PROTECTED]>
wrote:
Well, I'm going to bite my tongue on this until I hear something else. 
Every time I've used this platform as a bitch-fest, I've upset someone.
What I will say is that with some additional tweaks to my Guin
configuration in the last month, I haven't received a single spam e-mail
in almost a week for the first time in years.  Pretty amazing for
technology that's unable to keep up with the spammers.

>>> On 1/25/2007 at 9:26 AM, "Dan Abernathy"
<[EMAIL PROTECTED]> wrote:

I received an email from a sales rep letting me know that Guinevere
licensing is coming up for renewal. I replied back and asked about their
timetable for a new release, this was the response from Davin Cooke.
Apologies if this is old news, I hadn't heard about it.

 
Hi Dan
 
Thanks for your email..  We are proud to support Guinevere and its
effectiveness at anti-Spam..  but sadly Spam is growing faster than
Spamassassin and RBL's can keep up with.  We will unlikely be releasing
a newer version of Guinevere in the near future as we have re-thought
our spam blocking approach in favor of scanning and allowing only good
mail to come through.  This is we believe a better and long lasting
approach.  Guinevere leverages some older Spam Blocking technology that
works still to this day but will not in the future..  Here's an article
that better explains this: http://www.gwavanation.com/node/221
 
GWAVA 4 is a good alternative but costs more money..  We also have
GWAVIX that leverages some of the older technology a little better than
Spam Assassin...
 
Our Tech Support Dept is more than willing to help you tweak Guinevere
to make it better as well.. Let me know if you would like to discuss
some alternatives.



Re: True spam getting really low Bayesian points

2007-01-25 Thread maillist

maillist wrote:

Kim Christensen wrote:

Hey list,

I've recently started training our bayesian filter with spam/ham from my
personal mailbox, to prepare for live usage on our customer accounts.

% sa-learn --dump magic
...
0.000  0    340  0  non-token data: nspam
0.000  0    475  0  non-token data: nham
0.000  0  53404  0  non-token data: ntokens
...

So far so good, and spamd is actually using the bayesian db when
examining incoming mails. However, I find that a few of the legit ham
(not a majority) mails get unusually high bayesian points, while some
of the real spam (which gets scored as spam by sa) often get bayesian
points < 1.

Now, I'm sure I haven't trained the database with wrong messages. Is it
a good idea to continue feeding sa-learn with example spam and ham until
it reaches a few thousands messages, before relying on the results?

I would think my current amount is sufficient, but I guess something's
wrong with this picture :-)


Best regards
  
Run spamassassin --test-mode on the messages that are scoring high and 
low.  See if they are actually running through any BAYES_* tests.  I'm 
not 100% sure but I think that by default, the bayes do not even begin 
until you have 500 trained messages of each spam and ham.


You can of course get around this by setting bayes_min_ham_num  and  
bayes_min_spam_num in your local.cf file.


-=Aubrey=-


The default for 3.* is 200 messages for each.  Sorry dude.
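
To check a database against that threshold, something like the following could work. It assumes the message counts sit in the third column of the nspam and nham lines printed by sa-learn --dump magic; bayes_ready is a made-up name, and 200 is the default mentioned above.

```shell
# Hypothetical helper: read `sa-learn --dump magic` output on stdin and
# succeed only if both nspam and nham reach the minimum ($1, default 200).
bayes_ready() {
  awk -v min="${1:-200}" '
    /non-token data: nspam$/ { nspam = $3 + 0 }
    /non-token data: nham$/  { nham  = $3 + 0 }
    END {
      printf "nspam=%d nham=%d min=%d\n", nspam, nham, min
      exit !(nspam >= min && nham >= min)
    }'
}

# Example:
#   sa-learn --dump magic | bayes_ready && echo "Bayes rules should be active"
```

With the counts quoted above (340 spam, 475 ham) both values clear 200. If bayes_min_ham_num or bayes_min_spam_num has been changed in local.cf, pass that value as the first argument instead.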

-=Aubrey=-


Re: True spam getting really low Bayesian points

2007-01-25 Thread maillist

Kim Christensen wrote:

Hey list,

I've recently started training our bayesian filter with spam/ham from my
personal mailbox, to prepare for live usage on our customer accounts.

% sa-learn --dump magic
...
0.000  0    340  0  non-token data: nspam
0.000  0    475  0  non-token data: nham
0.000  0  53404  0  non-token data: ntokens
...

So far so good, and spamd is actually using the bayesian db when
examining incoming mails. However, I find that a few of the legit ham
(not a majority) mails get unusually high bayesian points, while some
of the real spam (which gets scored as spam by sa) often get bayesian
points < 1.


Now, I'm sure I haven't trained the database with wrong messages. Is it
a good idea to continue feeding sa-learn with example spam and ham until
it reaches a few thousands messages, before relying on the results?

I would think my current amount is sufficient, but I guess something's
wrong with this picture :-)


Best regards
  
Run spamassassin --test-mode on the messages that are scoring high and 
low.  See if they are actually running through any BAYES_* tests.  I'm 
not 100% sure but I think that by default, the bayes do not even begin 
until you have 500 trained messages of each spam and ham.


You can of course get around this by setting bayes_min_ham_num  and  
bayes_min_spam_num in your local.cf file.


-=Aubrey=-


Re: lint test failed after rulesdujour update

2007-01-25 Thread Dimitri Yioulos
On Thursday 25 January 2007 10:10 am, Matt Kettler wrote:
> Dimitri Yioulos wrote:
> > On Thursday 25 January 2007 6:33 am, Michael Connors wrote:
> >> Hi,
> >> I am new to spamassassin so sorry if my question is a bit stupid.
> >> I have mail spamassassin 3.1.0 running with mailscanner.
> >> It updates itself via RulesDuJour on a regular basis and I get an email
> >> which informs me of the update.
> >> This morning I noticed that there was an error in the process, I received
> >> a second email which contained the following plus a traceback that
> >> mentioned missing operators.
> >>
> >> **WARNING***: spamassassin --lint failed.
> >> Rolling configuration files back, not restarting SpamAssassin.
> >> Rollback command is:  mv -f /etc/spamassassin/antidrug.cf
> >> /etc/spamassassin/RulesDuJour/antidrug.cf.2; mv -f
> >> /etc/spamassassin/RulesDuJour/antidrug.cf.20070125-0029
> >> /etc/spamassassin/antidrug.cf;
> >>
> >>
> >> I couldn't rollback because the file antidrug.cf.20070125-0029 did not
> >> exist so I decided to run spamassassin --lint at the command line myself
> >> expecting the same error but instead it ran ok, I sent the spamassassin
> >> test email to myself and it was caught so everything seems to be working
> >> as expected, however I would really like to know why the above error was
> >> thrown.
> >> Regards,
> >> Michael
> >
> > The creator of antidrug posted a thorough explanation of the where and
> > when regarding this rule (see
> > marc.theaimsgroup.com/?l=spamassassin-users&m=116965442518029&w=2). 
> > Without trying to sound holier-than-thou (lord knows, I'm the last one
> > that should cop that attitude), you should search the archives first. 
> > That said, a precis of Matt Kettler's post:
> >
> > 1.  The location of antidrug.cf has moved, and;
> > 2.  It's included in SA 3+ and, in fact, can be counter-productive if
> > used in combination with same.
> >
> > HTH.
> >
> > Dimitri
>
> Thank you Dimitri.
>
> I'd also add:
>
> 3) I've posted the error-generating file as a last-resort to draw
> people's attention to the fact they need to change their RDJ before
> someone else, possibly malicious, has control of my old account. A
> malicious person could post a replacement file that whitelists spam.

Matt,

Thanks for completing the info.  Hence my "holier-than-thou" disclaimer.

Dimitri




Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Andy Figueroa

Thanks, Matt.  That sounds like a good suggestion.

Nigel, since you have the emails, if you could capture the debug output 
in a file and post like you did the messages, perhaps someone wise could 
evaluate what is going on.


You can capture the debug output by using:
spamassassin -D -t < message1 2> debug1.txt

Andy Figueroa

Matt Kettler wrote:

Andy Figueroa wrote:

Matt (but not just to Matt), I don't understand your reply (though I
am deeply in your debt for the work you do for this community).  The
sample emails that Nigel posted are identical in content, including
obfuscation.  I've noted the same situation.  Yet, the scoring is
really different. On the low scoring ones, DCC and RAZOR2 didn't hit,
and the BAYES score is different.  The main differences are in the
headers' different forged From and To addresses.  I thought these
samples were worthy of deeper analysis.


Well, there might be other analysis worth making.

 However,  Nigel asked why the drugs rules weren't matching. I answered
that question alone.

Not sure why the change in razor/dcc happened.

BAYES changes are easily explained by the header changes, but a deeper
analysis would involve running through spamassassin -D bayes and looking
at the exact tokens.



Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Matt Kettler
Andy Figueroa wrote:
> Matt (but not just to Matt), I don't understand your reply (though I
> am deeply in your debt for the work you do for this community).  The
> sample emails that Nigel posted are identical in content, including
> obfuscation.  I've noted the same situation.  Yet, the scoring is
> really different. On the low scoring ones, DCC and RAZOR2 didn't hit,
> and the BAYES score is different.  The main differences are in the
> headers' different forged From and To addresses.  I thought these
> samples were worthy of deeper analysis.

Well, there might be other analysis worth making.

 However,  Nigel asked why the drugs rules weren't matching. I answered
that question alone.

Not sure why the change in razor/dcc happened.

BAYES changes are easily explained by the header changes, but a deeper
analysis would involve running through spamassassin -D bayes and looking
at the exact tokens.
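
A rough sketch of how the Bayes part of two saved debug captures could be compared. It assumes debug lines have the shape "[pid] dbg: bayes: ..."; that shape, and the helper names, are assumptions for illustration, so adjust the pattern to whatever your version actually prints.

```python
import re

# Assumption: Bayes debug lines look like "[12345] dbg: bayes: <detail>".
BAYES_DEBUG = re.compile(r"dbg: bayes: (.*)$")


def bayes_debug_lines(debug_text):
    """Return the detail part of every Bayes-related debug line."""
    details = []
    for line in debug_text.splitlines():
        match = BAYES_DEBUG.search(line)
        if match:
            details.append(match.group(1))
    return details


def diff_bayes(debug_a, debug_b):
    """Return Bayes details seen only in the first capture and only in the second."""
    a = set(bayes_debug_lines(debug_a))
    b = set(bayes_debug_lines(debug_b))
    return sorted(a - b), sorted(b - a)
```

Feeding it the text of two debug files (one from a high-scoring sample, one from a low-scoring one) should leave only the lines that differ between the two runs.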



Re: what are the rules directories

2007-01-25 Thread Theo Van Dinter
On Thu, Jan 25, 2007 at 07:18:45PM +0530, Ramprasad wrote:
> How do I make it use files on /usr/share/spamassassin too 

Why would you need that?  If you have an update installed, that's the
directory you want to use.

> I just need a command line version to run lint 

I'm not sure I understand your concern.
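
The behaviour discussed in this thread (an installed update directory taking the place of the stock rules directory) can be modelled roughly as below. This is a simplified illustration, not SpamAssassin's actual lookup code; the function name is hypothetical, and the real update directory normally includes a version-specific subdirectory.

```python
import glob
import os


def pick_rules_dir(update_dir, default_dir):
    """Return the directory whose *.cf rules would be used in this simplified model.

    If the update directory contains any .cf files it replaces the default
    directory entirely; the two are not merged.
    """
    if glob.glob(os.path.join(update_dir, "*.cf")):
        return update_dir
    return default_dir
```

For example, `pick_rules_dir("/var/lib/spamassassin", "/usr/share/spamassassin")` returns the first path as soon as an update has put rule files there.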

-- 
Randomly Selected Tagline:
Why use Windows, since there is a door?
 (By [EMAIL PROTECTED], Andre Fachat)




Re: copy a filter

2007-01-25 Thread Matthias Fuhrmann
On Thu, 25 Jan 2007, pocopelli wrote:

Hi,

> Hello everybody,
>
> we have an extern rootserver with our provider in Germany.
> MTA=Qmail
> Config Webinterface = PLESK 8.0
> We have a number of domains hosted on it with emailAccounts.
> The mails of the different domains are in subdirecties similar to
>
> /var/qmail/mailnames/clientdomain.de/mailuser
>
> A single user has already a well trained spamfilter. The files are in  the
> folder
>
> /var/qmail/mailnames/clientdomain.de/mailuser/.spamassassin
>
> Is it possible just to copy this filter server wide or for certain mail
> adresses ?
> I have root access. Do I have to copy certain files ?

I guess you can do this. Copying the bayes* and auto-whitelist* files
should be no problem, but don't copy user_prefs, since it may contain
user-specific preferences.
Start with a copy for one user and watch it for a while, but it should
work without a problem. Don't forget to 'chown' the files :)
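
A minimal sketch of that copy step, assuming the trained files sit directly in the user's .spamassassin folder. The function name and the optional uid/gid arguments are hypothetical; run it while nothing is writing to the Bayes files.

```python
import glob
import os
import shutil


def copy_trained_filter(src_dir, dst_dir, uid=None, gid=None):
    """Copy bayes* and auto-whitelist* files from src_dir to dst_dir.

    user_prefs is deliberately left alone. If uid and gid are given, the
    copies are chown'ed to the target user (this needs root).
    """
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for pattern in ("bayes*", "auto-whitelist*"):
        for path in sorted(glob.glob(os.path.join(src_dir, pattern))):
            if not os.path.isfile(path):
                continue
            target = os.path.join(dst_dir, os.path.basename(path))
            shutil.copy2(path, target)
            if uid is not None and gid is not None:
                os.chown(target, uid, gid)
            copied.append(os.path.basename(path))
    return copied
```

The return value lists what was copied, which makes it easy to check that user_prefs stayed behind.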

regards,
Matthias


Re: lint test failed after rulesdujour update

2007-01-25 Thread Matt Kettler
Dimitri Yioulos wrote:
> On Thursday 25 January 2007 6:33 am, Michael Connors wrote:
>   
>> Hi,
>> I am new to spamassassin so sorry if my question is a bit stupid.
>> I have mail spamassassin 3.1.0 running with mailscanner.
>> It updates it self via RulesDuJour on a regular basis and I get an email
>> which informs me of the update.
>> This morning I noticed that there was a error in the process, I received
>> a second email which contained the following plus a traceback that
>> mentioned missing operators.
>>
>> **WARNING***: spamassassin --lint failed.
>> Rolling configuration files back, not restarting SpamAssassin.
>> Rollback command is:  mv -f /etc/spamassassin/antidrug.cf
>> /etc/spamassassin/RulesDuJour/antidrug.cf.2; mv -f
>> /etc/spamassassin/RulesDuJour/antidrug.cf.20070125-0029
>> /etc/spamassassin/antidrug.cf;
>>
>>
>> I couldnt rollback because the file antidrug.cf.20070125-0029 did not
>> exist so I decided to run spamassassin --lint at the command line myself
>> expecting the same error but instead it ran ok, I sent the spamassassin
>> test email to myself and it was caught so everything seems to be working
>> as expected, however I would really like to know why the above error was
>> thrown.
>> Regards,
>> Michael
>> 
>
> The creator of antidrug posted a thorough explanation of the where and when 
> regarding this rule (see 
> marc.theaimsgroup.com/?l=spamassassin-users&m=116965442518029&w=2).  Without 
> trying to sound holier-than-thou (lord knows, I'm the last one that should 
> cop that attitude), you should search the archives first.  That said, a 
> precis of Matt Kettler's post:
>
> 1.  The location of antidrug.cf has moved, and;
> 2.  It's included in SA 3+ and, in fact, can be counter-productive if used in 
> combination with same.
>
> HTH.
>
> Dimitri
>
>   
Thank you Dimitri.

I'd also add:

3) I've posted the error-generating file as a last-resort to draw
people's attention to the fact they need to change their RDJ before
someone else, possibly malicious, has control of my old account. A
malicious person could post a replacement file that whitelists spam.




Re: Drug spam, some caught some not - none caught by drug rules

2007-01-25 Thread Andy Figueroa
Matt (but not just to Matt), I don't understand your reply (though I am 
deeply in your debt for the work you do for this community).  The sample 
emails that Nigel posted are identical in content, including 
obfuscation.  I've noted the same situation.  Yet, the scoring is really 
different. On the low scoring ones, DCC and RAZOR2 didn't hit, and the 
BAYES score is different.  The main differences are in the headers' 
different forged From and To addresses.  I thought these samples were 
worthy of deeper analysis.


Sincerely,
Andy Figueroa

Matt Kettler wrote:

Nigel Frankcom wrote:

Hi All,

Does anyone have any idea why there are such scoring disparities
between these two emails? I've been seeing a few of these creep
through lately.

http://dev.blue-canoe.net/spam/spam01.txt
http://dev.blue-canoe.net/spam/spam02.txt
http://dev.blue-canoe.net/spam/spam03.txt
http://dev.blue-canoe.net/spam/spam04.txt

More to the point with these is why are they not hitting any of the
drugs rules?


There's a few million obfuscation methods, and the rules can't always
cover em all.

The examples you posted are using "duplicated letters", as well as
inserted underscores.

The old Antidrug rules (part of xx_drugs.cf now) that I wrote will deal
with the underscores, and a wide range of character substitutions, but
only a few special-cases of insertions.

It's taken the spammers a long time to figure that out, but it appears
they finally have.

I used to have to update the set constantly, but lately I've been a bit
too busy with real life.
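
To illustrate the two tricks mentioned above (inserted underscores and duplicated letters), here is a small Python sketch that undoes both before comparing against a keyword. It is only an illustration of the idea, not how the drug rules themselves are written, and the function names are made up.

```python
import re


def deobfuscate(text):
    """Strip inserted underscores and collapse runs of a repeated letter."""
    # "v_i_a_g_r_a" -> "viagra"
    text = text.replace("_", "")
    # "viiaagraa" -> "viagra"; legitimate doubles ("pill") are collapsed too
    return re.sub(r"([A-Za-z])\1+", r"\1", text)


def mentions(text, keyword):
    """Check for keyword in text after normalising both the same way."""
    return deobfuscate(keyword.lower()) in deobfuscate(text.lower())
```

Because legitimate double letters get collapsed as well, the keyword is run through the same normalisation before the comparison; this also means the check is looser than an exact match and would need care to avoid false positives.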


Re: Enhancing Detection of Certain Spam

2007-01-25 Thread Rich Shepard

On Wed, 24 Jan 2007, Doc Schneider wrote:


I always run sa-update -D to see what is happening.


   Thank you, Doc. Copying the script to /usr/local/bin/ also made a
difference.

Rich

--
Richard B. Shepard, Ph.D.   |The Environmental Permitting
Applied Ecosystem Services, Inc.|  Accelerator(TM)
 Voice: 503-667-4517  Fax: 503-667-8863


Re: what are the rules directories

2007-01-25 Thread Ramprasad
On Wed, 2007-01-24 at 09:46 -0500, Theo Van Dinter wrote:
> On Wed, Jan 24, 2007 at 01:17:15PM +0530, Ramprasad wrote:
> > But If I have /var/lib/spamassassin with some files in it SA is
> > apparently ignoring  /usr/share/spamassassin/*.cf 
> 
> Yes.  That's how updates work.
> 

How do I make it use files on /usr/share/spamassassin too 
I just need a command line version to run lint 

Anyway I use Mailscanner which defines what directories to use for
scanning of mails, so that is not an issue


Thanks
Ram







Re: bayes sql initialization

2007-01-25 Thread Bob McClure Jr
On Thu, Jan 25, 2007 at 05:20:27AM -0500, Tom Allison wrote:
> Bob McClure Jr wrote:
> >On Wed, Jan 24, 2007 at 09:01:58PM -0500, Tom Allison wrote:
> >>Am I correct in understanding that I have to run sa-learn for every user 
> >>who is going to have a bayes token store?
> >
> >If you are running per-user Bayes (nothing else makes much sense,
> >IMHO), yes, but only if they want to train their Bayes with mis-marked
> >ham and spam, or want to pre-load Bayes with some corpus.
> >
> 
> Just to initialize their databases I have to do this?

Not if you're not going to pre-load the Bayes DBs, which you don't
have to do.  If you have not turned off Bayes (it is on by default),
and you are calling spamc at delivery time, say, with the user's
.procmailrc, then SA will initialize the Bayes DBs.

Cheers,
-- 
Bob McClure, Jr. Bobcat Open Systems, Inc.
[EMAIL PROTECTED] http://www.bobcatos.com
Whatever you do, work at it with all your heart, as working for the
Lord, not for men, since you know that you will receive an inheritance
from the Lord as a reward. It is the Lord Christ you are serving.
Colossians 3:23-24 (NIV)


Re: sa-learn on dedicated spamabuse email account

2007-01-25 Thread Pete Russell
You have described why it won't work as well. Using your method the 
headers become useless - Bayes will learn only the body/subject content. 
You would need to tell Bayes to ignore the headers.


With pop3 you only have the option you describe - this may still work OK?
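
One possible way around the header problem, assuming customers can forward the spam as an attachment rather than inline: the original message, headers included, then travels as a message/rfc822 part and can be pulled back out before learning. The sketch below uses Python's standard email package; the function name is hypothetical, and an inline forward would yield nothing here.

```python
import email
from email import policy


def extract_forwarded(raw_bytes):
    """Return the raw bytes of every message/rfc822 attachment in a mail."""
    outer = email.message_from_bytes(raw_bytes, policy=policy.default)
    originals = []
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the embedded message.
            inner = part.get_payload(0)
            originals.append(inner.as_bytes())
    return originals
```

Each returned item could be written to a temporary file and handed to sa-learn in place of the forwarding wrapper.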

Pete

Oenus Tech Services wrote:

Thanks, Peter

Yes, but this would not work with our more than 1500 customers that have
only pop3 access and do not have access to any shared or private folder
in the servers. We needed to implement some way for these pop3-only
customers to report spam back to us, and for now we've only thought of
forwarding spam to a spamabuse account where some scripts could check
its inbox and do bayesian learning. However, does sa-learn take into
account that those emails are being forwarded and they're not the
original source of the spam? I guess not. If that is the case, has
anybody come up with a similar idea?

Ignacio



Peter Russell escribió:

See the attached Python script written by one of the folks on the
MailScanner list.

It's designed for use with Exchange, so I will describe the Exchange
usage and you can modify as you see fit to work on your pop3 server.

In Exchange 2003 create a public folder called SPAM, give everyone
contributor access, not read or edit. Then any user can simply drag spam
to the public folder, but no user can see in the folder.

Modify the script to suit your environment (Exchange server name and
credentials). Make it executable.

Now run it. It will scan your public folder called SPAM and learn the
contents into bayes, then delete the messages it has learned.

The script doesn't seem to run recursively all that well; it may stop
randomly and need to be re-run - if any Python scripters see this,
would you mind having a go at fixing it and re-posting to the list?

Many thanks
Pete

Oenus Tech Services wrote:

Hi there!

Most of our email is delivered through pop3, so right now bayes
filtering is off. Nevertheless Spamassassin is doing a good job
filtering email, but we want to setup a way for our customers to report
to us undetected spam by forwarding that spam to an
[EMAIL PROTECTED] account in our server. If we then point sa-learn
to that inbox, will it work? My concern is that email arriving to that
account is not from the spammer anymore, but from a forwarded mail by
our customer.

TIA,

Ignacio




#!/usr/bin/env python
# Python 2 script: learn every message in an IMAP public folder as spam.
import commands, os, time
import imaplib
import sys, re
import string, random
import StringIO, rfc822

# Set required variables
PREFS = "/etc/MailScanner/spam.assassin.prefs.conf"
TMPFILE = "/var/tmp/salearn.tmp"
SALEARN = "/usr/bin/sa-learn"
SERVER = "x.x.x.x"
USER  = "someuserwithaccesstopublicfolder"
PASSWORD = "somepassword"
LOGFILE = "/var/log/learn.spam.log"
log = file(LOGFILE, 'a+')
log.write("\n\nTraining SpamAssassin on %s at %s\n" % (time.strftime("%Y-%m-%d"),
    time.strftime("%H:%M:%S")))

# connect to server
server = imaplib.IMAP4(SERVER)

# login
server.login(USER, PASSWORD)
server.select("Public Folders/Spam")

# Get messages
typ, data = server.search(None, 'ALL')
for num in data[0].split():
    typ, data = server.fetch(num, '(RFC822)')
    # Write the raw message to a temporary file for sa-learn
    tmp = file(TMPFILE, 'w+')
    tmp.write(data[0][1])
    tmp.close()
    log.write(commands.getoutput("%s --prefs-file=%s --spam %s" % \
        (SALEARN, PREFS, TMPFILE)))
    log.write("\n")
    # Mark learned spam as "Deleted"
    server.store(num, '+FLAGS', '\\Deleted')
# Delete messages marked as "Deleted" from server
server.expunge()
server.logout()




#!/usr/bin/env python
# Python 2 script: same as above, but with the /opt MailScanner prefs path
# and the expunge step left commented out.
import commands, os, time
import imaplib
import sys, re
import string, random
import StringIO, rfc822

# Set required variables
PREFS = "/opt/MailScanner/etc/spam.assassin.prefs.conf"
TMPFILE = "/var/tmp/salearn.tmp"
SALEARN = "/usr/bin/sa-learn"
SERVER = "x.x.x.x"
USER  = "someuserwithaccesstopublicfolder"
PASSWORD = "somepassword"
LOGFILE = "/var/log/learn.spam.log"
log = file(LOGFILE, 'a+')
log.write("\n\nTraining SpamAssassin on %s at %s\n" % (time.strftime("%Y-%m-%d"),
    time.strftime("%H:%M:%S")))

# connect to server
server = imaplib.IMAP4(SERVER)

# login
server.login(USER, PASSWORD)
server.select("Public Folders/Spam")

# Get messages
typ, data = server.search(None, 'ALL')
for num in data[0].split():
    typ, data = server.fetch(num, '(RFC822)')
    # Write the raw message to a temporary file for sa-learn
    tmp = file(TMPFILE, 'w+')
    tmp.write(data[0][1])
    tmp.close()
    log.write(commands.getoutput("%s --prefs-file=%s --spam %s" % \
        (SALEARN, PREFS, TMPFILE)))
    log.write("\n")
    # Mark learned spam as "Deleted"
    server.store(num, '+FLAGS', '\\Deleted')
# Delete messages marked as "Deleted" from server
#server.expunge()
server.logout()





Re: Should I use greylisting

2007-01-25 Thread Steven Stern

Matthew Bickerton wrote:
> Thanks, but does this mean I have to keep/maintain a list of all the mail
> farms. Keeping this list up to date sounds horrid/impossible.
> 
> Matthew  
> 
> -Original Message-
> From: --[ UxBoD ]-- [mailto:[EMAIL PROTECTED] 
> Sent: 25 January 2007 12:49
> To: users@spamassassin.apache.org
> Subject: Re: Should I use greylisting
> 
> Check out http://policyd.sourceforge.net/ then as it allows you to
> specify Servers/IP that should not be greylisted. Works very well.
> 
> On Thu, 25 Jan 2007 12:33:19 -
> "Matthew Bickerton" <[EMAIL PROTECTED]> wrote:
> 
>> Hi,
>>
>> I am setting up a new server, so have a chance to make big changes to
>> my email server.
>>
>> I have been thinking about implementing Greylisting. However, I am
>> worried about blocking/long delays with e-mails from mail farms
>> (gmail, yahoo etc.)
>>
>> I would very much appreciate other people's recommendations on
>> Greylisting or other approaches to reducing the load on my server by
>> rejecting spam early.
>>

I tried out greylisting for several months for a select group of users
using greylist-milter.  Their unanimous opinion was that they wanted to
receive mail "instantly". The 10 - 60 minute delay for first-time
senders was unacceptable. The reduction in spam was not noticeable as we
get great results using a combination of ClamAV and SpamAssassin with a
global bayes filter and many RDJ rules.

-- 

  Steve


Re: Should I use greylisting

2007-01-25 Thread --[ UxBoD ]--
You can use wildcards :)
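
For anyone wondering what wildcard whitelisting in front of a greylist amounts to, here is a rough Python sketch. It is not policyd's code or configuration syntax; the class name, the example host pattern, and the 600-second delay are all made up for illustration.

```python
import fnmatch
import time


class Greylist:
    """Toy greylist: defer unknown (host, sender, recipient) triplets at first."""

    def __init__(self, delay=600, whitelist=()):
        self.delay = delay                  # seconds a new triplet must wait
        self.whitelist = [p.lower() for p in whitelist]
        self.first_seen = {}                # triplet -> time of first attempt

    def check(self, client_host, sender, recipient, now=None):
        now = time.time() if now is None else now
        host = client_host.lower()
        # Whitelisted hosts (wildcards allowed) skip greylisting entirely
        if any(fnmatch.fnmatchcase(host, pat) for pat in self.whitelist):
            return "accept"
        triplet = (host, sender, recipient)
        first = self.first_seen.setdefault(triplet, now)
        return "accept" if now - first >= self.delay else "defer"
```

With a pattern such as `*.mailfarm.example` in the whitelist, every outbound host of that (hypothetical) mail farm is accepted on the first attempt, so there is no per-host list to maintain.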

On Thu, 25 Jan 2007 12:58:51 -
"Matthew Bickerton" <[EMAIL PROTECTED]> wrote:

> Thanks, but does this mean I have to keep/maintain a list of all the
> mail farms. Keeping this list up to date sounds horrid/impossible.
> 
> Matthew  
> 
> -Original Message-
> From: --[ UxBoD ]-- [mailto:[EMAIL PROTECTED] 
> Sent: 25 January 2007 12:49
> To: users@spamassassin.apache.org
> Subject: Re: Should I use greylisting
> 
> Check out http://policyd.sourceforge.net/ then as it allows you to
> specify Servers/IP that should not be greylisted. Works very well.
> 
> On Thu, 25 Jan 2007 12:33:19 -
> "Matthew Bickerton" <[EMAIL PROTECTED]> wrote:
> 
> > Hi,
> > 
> > I am setting up a new server, so have a chance to make big changes
> > to my email server.
> > 
> > I have been thinking about implementing Greylisting. However, I am
> > worried about blocking/long delays with e-mails from mail farms
> > (gmail, yahoo etc.)
> > 
> > I would very much appreciate other people's recommendations on
> > Greylisting or other approaches to reducing the load on my server by
> > rejecting spam early.
> > 
> > Matthew
> > 
> > 
> 




Re: lint test failed after rulesdujour update

2007-01-25 Thread Dimitri Yioulos
On Thursday 25 January 2007 6:33 am, Michael Connors wrote:
> Hi,
> I am new to spamassassin so sorry if my question is a bit stupid.
> I have mail spamassassin 3.1.0 running with mailscanner.
> It updates it self via RulesDuJour on a regular basis and I get an email
> which informs me of the update.
> This morning I noticed that there was a error in the process, I received
> a second email which contained the following plus a traceback that
> mentioned missing operators.
>
> **WARNING***: spamassassin --lint failed.
> Rolling configuration files back, not restarting SpamAssassin.
> Rollback command is:  mv -f /etc/spamassassin/antidrug.cf
> /etc/spamassassin/RulesDuJour/antidrug.cf.2; mv -f
> /etc/spamassassin/RulesDuJour/antidrug.cf.20070125-0029
> /etc/spamassassin/antidrug.cf;
>
>
> I couldnt rollback because the file antidrug.cf.20070125-0029 did not
> exist so I decided to run spamassassin --lint at the command line myself
> expecting the same error but instead it ran ok, I sent the spamassassin
> test email to myself and it was caught so everything seems to be working
> as expected, however I would really like to know why the above error was
> thrown.
> Regards,
> Michael

The creator of antidrug posted a thorough explanation of the where and when 
regarding this rule (see 
marc.theaimsgroup.com/?l=spamassassin-users&m=116965442518029&w=2).  Without 
trying to sound holier-than-thou (lord knows, I'm the last one that should 
cop that attitude), you should search the archives first.  That said, a 
precis of Matt Kettler's post:

1.  The location of antidrug.cf has moved, and;
2.  It's included in SA 3+ and, in fact, can be counter-productive if used in 
combination with same.

HTH.

Dimitri




RE: Should I use greylisting

2007-01-25 Thread Matthew Bickerton
Thanks, but does this mean I have to keep/maintain a list of all the mail
farms. Keeping this list up to date sounds horrid/impossible.

Matthew  

-Original Message-
From: --[ UxBoD ]-- [mailto:[EMAIL PROTECTED] 
Sent: 25 January 2007 12:49
To: users@spamassassin.apache.org
Subject: Re: Should I use greylisting

Check out http://policyd.sourceforge.net/ then as it allows you to
specify Servers/IP that should not be greylisted. Works very well.

On Thu, 25 Jan 2007 12:33:19 -
"Matthew Bickerton" <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> I am setting up a new server, so have a chance to make big changes to
> my email server.
> 
> I have been thinking about implementing Greylisting. However, I am
> worried about blocking/long delays with e-mails from mail farms
> (gmail, yahoo etc.)
> 
> I would very much appreciate other people's recommendations on
> Greylisting or other approaches to reducing the load on my server by
> rejecting spam early.
> 
> Matthew
> 
> 




True spam getting really low Bayesian points

2007-01-25 Thread Kim Christensen
Hey list,

I've recently started training our bayesian filter with spam/ham from my
personal mailbox, to prepare for live usage on our customer accounts.

% sa-learn --dump magic
...
0.000          0        340          0  non-token data: nspam
0.000          0        475          0  non-token data: nham
0.000          0      53404          0  non-token data: ntokens
...

So far so good, and spamd is actually using the bayesian db when
examining incoming mails. However, I find that a few of the legit ham 
(not a majority) mails get unusually high bayesian points, while some
of the real spam (which gets scored as spam by sa) often get bayesian
points < 1. 

Now, I'm sure I haven't trained the database with wrong messages. Is it
a good idea to continue feeding sa-learn with example spam and ham until
it reaches a few thousands messages, before relying on the results?

I would think my current amount is sufficient, but I guess something's
wrong with this picture :-)
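
As a quick sanity check on the numbers, the message counts can be pulled out of the `sa-learn --dump magic` output and compared with a minimum. The sketch below assumes the value sits in the third whitespace-separated column of each "non-token data" line, and uses 200 as the threshold on the understanding that this is the default for bayes_min_spam_num and bayes_min_ham_num; both are assumptions to verify against your own setup, and the function names are made up.

```python
def parse_dump_magic(output):
    """Map each 'non-token data' name to its integer value."""
    counts = {}
    for line in output.splitlines():
        if "non-token data:" not in line:
            continue
        fields, name = line.split("non-token data:", 1)
        numbers = fields.split()
        # Assumed columns: probability, spam count, value, atime
        counts[name.strip()] = int(numbers[2])
    return counts


def enough_training(counts, minimum=200):
    """True when both the spam and the ham counts reach the minimum."""
    return counts.get("nspam", 0) >= minimum and counts.get("nham", 0) >= minimum
```

With 340 spam and 475 ham this check passes, so on those assumptions the database is large enough to be consulted, although more training may still improve the scores.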


Best regards
-- 
Kim Christensen
"You just had a near-life experience."


Re: Should I use greylisting

2007-01-25 Thread --[ UxBoD ]--
Check out http://policyd.sourceforge.net/ then as it allows you to
specify Servers/IP that should not be greylisted. Works very well.

On Thu, 25 Jan 2007 12:33:19 -
"Matthew Bickerton" <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> I am setting up a new server, so have a chance to make big changes to
> my email server.
> 
> I have been thinking about implementing Greylisting. However, I am
> worried about blocking/long delays with e-mails from mail farms
> (gmail, yahoo etc.)
> 
> I would very much appreciate other people's recommendations on
> Greylisting or other approaches to reducing the load on my server by
> rejecting spam early.
> 
> Matthew
> 
> 




Should I use greylisting

2007-01-25 Thread Matthew Bickerton
Hi,

I am setting up a new server, so have a chance to make big changes to my
email server.

I have been thinking about implementing Greylisting. However, I am worried
about blocking/long delays with e-mails from mail farms (gmail, yahoo etc.)

I would very much appreciate other people's recommendations on Greylisting
or other approaches to reducing the load on my server by rejecting spam
early.

Matthew



lint test failed after rulesdujour update

2007-01-25 Thread Michael Connors

Hi,
I am new to SpamAssassin, so sorry if my question is a bit stupid.
I have SpamAssassin 3.1.0 running with MailScanner.
It updates itself via RulesDuJour on a regular basis, and I get an email 
which informs me of the update.
This morning I noticed that there was an error in the process: I received 
a second email which contained the following, plus a traceback that 
mentioned missing operators.


**WARNING***: spamassassin --lint failed.
Rolling configuration files back, not restarting SpamAssassin.
Rollback command is:  mv -f /etc/spamassassin/antidrug.cf 
/etc/spamassassin/RulesDuJour/antidrug.cf.2; mv -f 
/etc/spamassassin/RulesDuJour/antidrug.cf.20070125-0029 
/etc/spamassassin/antidrug.cf;


I couldn't roll back because the file antidrug.cf.20070125-0029 did not 
exist, so I decided to run spamassassin --lint at the command line myself, 
expecting the same error, but instead it ran OK. I sent the SpamAssassin 
test email to myself and it was caught, so everything seems to be working 
as expected. However, I would really like to know why the above error was 
thrown.

Regards,
Michael




copy a filter

2007-01-25 Thread pocopelli

Hello everybody,

We have an external root server with our provider in Germany. 
MTA=Qmail
Config Webinterface = PLESK 8.0
We have a number of domains hosted on it with emailAccounts.
The mails of the different domains are in subdirectories similar to 

/var/qmail/mailnames/clientdomain.de/mailuser

A single user already has a well-trained spam filter. The files are in the
folder

/var/qmail/mailnames/clientdomain.de/mailuser/.spamassassin 

Is it possible just to copy this filter server-wide or for certain mail
addresses?
I have root access. Do I have to copy certain files?

I would be grateful for any help.

Thanks in advance !

poc
-- 
View this message in context: 
http://www.nabble.com/copy-a-filter-tf3097420.html#a8599804
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Re: bayes sql initialization

2007-01-25 Thread Tom Allison

Bob McClure Jr wrote:

On Wed, Jan 24, 2007 at 09:01:58PM -0500, Tom Allison wrote:
Am I correct in understanding that I have to run sa-learn for every user 
who is going to have a bayes token store?


If you are running per-user Bayes (nothing else makes much sense,
IMHO), yes, but only if they want to train their Bayes with mis-marked
ham and spam, or want to pre-load Bayes with some corpus.



Just to initialize their databases I have to do this?


Re: sa-learn on dedicated spamabuse email account

2007-01-25 Thread Oenus Tech Services
Thanks, Peter

Yes, but this would not work with our more than 1500 customers that have
only pop3 access and do not have access to any shared or private folder
in the servers. We needed to implement some way for these pop3-only
customers to report spam back to us, and for now we've only thought of
forwarding spam to a spamabuse account where some scripts could check
its inbox and do bayesian learning. However, does sa-learn take into
account that those emails are being forwarded and they're not the
original source of the spam? I guess not. If that is the case, has
anybody come up with a similar idea?

Ignacio



Peter Russell escribió:
> See the attached Python script written by one of the folks on the
> MailScanner list.
> 
> It's designed for use with Exchange, so I will describe the Exchange
> usage and you can modify as you see fit to work on your pop3 server.
> 
> In Exchange 2003 create a public folder called SPAM, give everyone
> contributor access, not read or edit. Then any user can simply drag spam
> to the public folder, but no user can see in the folder.
> 
> Modify the script to suit your environment (Exchange server name and
> credentials). Make it executable.
> 
> Now run it. It will scan your public folder called SPAM and learn the
> contents into bayes, then delete the messages it has learned.
> 
> The script doesn't seem to run recursively all that well; it may stop
> randomly and need to be re-run - if any Python scripters see this,
> would you mind having a go at fixing it and re-posting to the list?
> 
> Many thanks
> Pete
> 
> Oenus Tech Services wrote:
>> Hi there!
>>
>> Most of our email is delivered through pop3, so right now bayes
>> filtering is off. Nevertheless Spamassassin is doing a good job
>> filtering email, but we want to setup a way for our customers to report
>> to us undetected spam by forwarding that spam to an
>> [EMAIL PROTECTED] account in our server. If we then point sa-learn
>> to that inbox, will it work? My concern is that email arriving to that
>> account is not from the spammer anymore, but from a forwarded mail by
>> our customer.
>>
>> TIA,
>>
>> Ignacio
>>