Re: Clustered MySQL with SA

2005-02-15 Thread Michael Parker
On Tue, Feb 15, 2005 at 10:28:41PM +, Nigel Frankcom wrote:
> 
> Could anyone point me towards any useful info on how to cluster the SA
> MySQL backend (Bayesian etc.)?
> 

Here is a nice little HOWTO:
http://dev.mysql.com/tech-resources/articles/mysql-cluster-for-two-servers.html

Let me know how well it works out, sounds really interesting.

Michael




Re: Re[2]: Care and feeding instructions for SpamAssassin?

2005-02-15 Thread FH
-- Original Message --
Received: Tue, 15 Feb 2005 01:21:21 AM EST
From: Robert Menschel <[EMAIL PROTECTED]>
To: FH <[EMAIL PROTECTED]>
Cc: users@spamassassin.apache.org
Subject: Re[2]: Care and feeding instructions for SpamAssassin?


> Next time you get one of those spam that sneaks through, run
> > spamassassin -D output 2>debug.out

There must be a disconnect somewhere. I just did this with a "drugs online" spam
I just received.  When it first came in it had a rating of 1.9. I saved it as
a file (not an mbox) on the server, ran the above command, and it reported a
12.5!!!  I then saved the email to an mbox and ran sa-learn on it, which didn't
change anything (the above still reported a 12.5).  I then "bounced" the
message back to myself, and when it hit the incoming mailbox again it
was autolearned as ham and rated as 0.4.  Running it back through the above
command again, it scored only 7.9 :( ?!?

So just to double check I'm doing this right:

- Mail comes in to the server and is picked up by postfix (running as
postfix).
- It's passed off to procmail via "mailbox_command = /bin/procmail" in the
postfix/main.cf file.
- Procmail calls spamc, which passes the mail to spamd (running as spamd
and started via an init.d script that runs "spamd -d -u spamd" at startup).
- That runs it through spamassassin, marks it up if appropriate, and then
dumps it into the mailbox.
- Not getting into what the other users are doing, if I get an unmarked spam I
save it to a mailbox (I use [PC-]Pine btw in case that makes a difference) and
occasionally run "sa-learn --showdots --spam --mbox spam" as root on that
file.

This is how it's supposed to work, right?  I did a "find" for journal, seen and
toks and found them only in the expected place
(/var/spool/spamassassin).  The only other spamassassin files I found that
looked "out of place" (i.e. not the config file or the share/rules files) were
in ~root/.spamassassin:

drwx------   2 root     other      512 Feb 14 14:18 ./
drwxr-xr-x  13 root     other      512 Feb 15 15:42 ../
-rw-------   1 root     other    24576 Feb 15 18:09 auto-whitelist
-rw-------   1 filter   filter       0 Dec 25 09:45 auto-whitelist.dir
-rw-------   1 root     other        6 Feb 15 18:09 auto-whitelist.mutex
-rw-------   1 filter   filter    1024 Dec 25 09:45 auto-whitelist.pag
-rw-------   1 root     other       58 Jan  4 14:52 bayes.lock
-rw-------   1 root     other     2030 Jan  4 11:02 bayes.lock.OLD.MACHINE.NAME.2623
-rw-------   1 root     other     4292 Jan  4 14:59 bayes.lock.OLD.MACHINE.NAME.4734
-rw-------   1 root     other       58 Jan  4 14:59 bayes.lock.OLD.MACHINE.NAME.4739
-rw-------   1 root     other       29 Jan  4 15:00 bayes.lock.OLD.MACHINE.NAME.4743
-rwx------   1 root     other     1175 Dec 24 23:52 user_prefs*

The OLD.MACHINE.NAME is the name of the server while I was setting it
up/testing it, it's something different now.  The user_prefs file has
essentially nothing in it (it's all #d out).  

Any hints clues suggestions are appreciated.
Thanks




RE: Good idea or bad idea?

2005-02-15 Thread Matt Kettler
At 06:00 PM 2/15/2005, Austin Weidner wrote:
> So do you think it is better for bayes if you try to keep this ratio more
> toward 50/50? I find it is much harder to train HAM than it is SPAM. But if
> a bad ratio is going to hurt things, one could shut down the SPAM trainer.
> Basically, is too much SPAM a bad thing?

I'd say "ideal" is 50/50.. But mine is *WAY* off from that and I have no 
problems.. I would not worry about ratios, but I would try to configure it 
for as close as possible without going through any major trouble to do it.

I take measures to try and up my ham training whenever I find an easy way 
to do it, but I don't lose sleep over it.

I really meant it when I said these phrases in my post:

> Of course, my all-of-history ratio is about 96:4, and my recent training
> ratio is 90:10 (past day).

> Really, I think ratios are helpful, but a fresh feed of both seems more
> important. I totally agree with the above. Between autolearn and forced
> training scripts, SA learns quite a bit of mail.



RE: Good idea or bad idea?

2005-02-15 Thread Austin Weidner
 
So do you think it is better for bayes if you try to keep this ratio more
toward 50/50? I find it is much harder to train HAM than it is SPAM. But if
a bad ratio is going to hurt things, one could shut down the SPAM trainer.

Basically, is too much SPAM a bad thing?

-Original Message-
From: Matt Kettler [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 15, 2005 5:50 PM
To: Chris Santerre; Thomas Arend; users@spamassassin.apache.org
Subject: RE: Good idea or bad idea?

At 04:56 PM 2/15/2005, Chris Santerre wrote:
>Yes, it does tolerate a deviation well. But I remember DQ saying something
>like this.

Here's one reference to the post I was talking about. In the thread I'd 
been suggesting "optimal" would be best if the training ratio matched your 
"real world" spam:ham ratio (which historically was somewhere around 75/25 
here, but recently it's closer to 60/40).

Dan corrected me and said 50/50 was the goal to shoot for:

http://readlist.com/lists/incubator.apache.org/spamassassin-users/0/2046.html

Of course, my all-of-history ratio is about 96:4, and my recent training 
ratio is 90:10 (past day).



>I agree on a personal scale it works wonders if you *continue* to feed it a
>proper diet.

Really, I think ratios are helpful, but a fresh feed of both seems more 
important. I totally agree with the above. between autolearn and forced 
training scripts, SA learns quite a bit of mail.

>  But when you get to a more general server side solution, I
>don't think the results are worth the effort, when one can write a simple
>rule faster than training.

I don't think that's true.. the autolearner is a big help here.. Although I 
force feed, SA autolearns more mail than my scripts feed it.

(64% of spam and 12% of ham get autolearned the way I'm set up, and I've 
not seen any learning errors so far. However, I do use a setup tweaked to 
avoid false ham learning, something I consider a major issue with the 
default autolearn threshold.)




Re: Good idea or bad idea?

2005-02-15 Thread Matt Kettler
At 05:11 PM 2/15/2005, Justin Mason wrote:
> Yes, this is one thing to watch out for.  If the spamtrap accounts include
> some accounts that *were* previously active as user accounts, you
> *really* need to monitor those for the occasional ham slipping in.

I agree wholeheartedly..  This is also a concern for corpus maintainers for 
SA too... Corpus pollution hurts us all (especially the opposite case where 
spam ends up in the ham corpus.. very deadly to the SA scores)

As for that kind of spamtrap, I don't use that type of account myself. Too 
many problems.. I use hand "seeded" addresses, obviously bogus addresses 
I've purposely mentioned in technical examples on mailing lists with web 
archives, etc.

Things like:
[EMAIL PROTECTED]
Which I might use in an example of a cronjob that checks system status and 
emails it to my pager.. Usually takes 1-4 weeks for some spammer to mine it 
off the web archives and start using it...





RE: Good idea or bad idea?

2005-02-15 Thread Matt Kettler
At 04:56 PM 2/15/2005, Chris Santerre wrote:
> Yes, it does tolerate a deviation well. But I remember DQ saying something
> like this.

Here's one reference to the post I was talking about. In the thread I'd 
been suggesting "optimal" would be best if the training ratio matched your 
"real world" spam:ham ratio (which historically was somewhere around 75/25 
here, but recently it's closer to 60/40).

Dan corrected me and said 50/50 was the goal to shoot for:

http://readlist.com/lists/incubator.apache.org/spamassassin-users/0/2046.html

Of course, my all-of-history ratio is about 96:4, and my recent training 
ratio is 90:10 (past day).

> I agree on a personal scale it works wonders if you *continue* to feed it a
> proper diet.

Really, I think ratios are helpful, but a fresh feed of both seems more 
important. I totally agree with the above. Between autolearn and forced 
training scripts, SA learns quite a bit of mail.

> But when you get to a more general server side solution, I
> don't think the results are worth the effort, when one can write a simple
> rule faster than training.

I don't think that's true.. the autolearner is a big help here.. Although I 
force feed, SA autolearns more mail than my scripts feed it.

(64% of spam and 12% of ham get autolearned the way I'm set up, and I've 
not seen any learning errors so far. However, I do use a setup tweaked to 
avoid false ham learning, something I consider a major issue with the 
default autolearn threshold.)




Clustered MySQL with SA

2005-02-15 Thread Nigel Frankcom
Hi All,

Could anyone point me towards any useful info on how to cluster the SA
MySQL backend (Bayesian etc.)?

TIA

Nigel


Re: Good idea or bad idea?

2005-02-15 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Matt Kettler writes:
> At 03:54 PM 2/15/2005, Austin Weidner wrote:
> >I get certain e-mail accounts that are old and JUST GET SPAM (no question
> >about it). I set up a script that takes e-mails from these accounts and feeds
> >them into sa-learn as SPAM.
> >
> >I have no HAM's right now, however I have plans to add at least a couple
> >hundred to bayes (that is the bare minimum, I believe).
> >
> >My question is: Is there anything wrong with doing this?
> 
> No.. I see nothing wrong with it.. I do this myself...  They're called 
> "spam traps" by most.
> 
> I also have some carefully guarded "ham traps" that I've carefully 
> subscribed to well-trusted industry newsletters, etc. I script-feed these 
> to sa-learn --ham. I also keep a rotating archive of all the learned mail, 
> so I can go through and review it for contamination.

Yes, this is one thing to watch out for.  If the spamtrap accounts include
some accounts that *were* previously active as user accounts, you
*really* need to monitor those for the occasional ham slipping in.

In the past, I've seen spamtrap accounts that get thousands of spams, but
still have a very old subscription to announcement mailing lists that
send maybe one mail every 2 months, for example.

- --j.

> I just use a simple interval cronjob to do this, and I have the individual 
> addresses all aliased into one "spam" and one "ham" account. You can also 
> add things like calls to razor-report, etc.
> 
> (Note I've re-named the mailboxes here with a search/replace. You should 
> too.. You don't want outsiders being able to recognize your spam trap or 
> ham trap accounts. You certainly don't want anything as predictable as 
> "[EMAIL PROTECTED]" as your ham trap.)
> 
> Here's a trimmed down version of my script (no warranties or claims it's 
> bug-free, etc. Just provided to give you some ideas)
> 
> #!/bin/sh
> cd /var/autolearn/
> 
> if [ -f /var/spool/mail/spam ]; then
>   echo learning spam mailbox - spam
>   mv /var/spool/mail/spam .
>   /usr/bin/sa-learn --spam --mbox spam
>   rm spam/spam.alearn6.gz
>   mv spam/spam.alearn5.gz spam/spam.alearn6.gz
>   mv spam/spam.alearn4.gz spam/spam.alearn5.gz
>   mv spam/spam.alearn3.gz spam/spam.alearn4.gz
>   mv spam/spam.alearn2.gz spam/spam.alearn3.gz
> 
>   gzip spam/spam.alearn1
>   mv spam/spam.alearn1.gz spam/spam.alearn2.gz
> 
>   mv spam spam/spam.alearn1
> fi
> 
> if [ -f /var/spool/mail/ham ]; then
>   echo learning ham mailbox - ham
>   mv /var/spool/mail/ham .
>   /usr/bin/sa-learn --ham --mbox ham
>   rm ham/ham.alearn6.gz
>   mv ham/ham.alearn5.gz ham/ham.alearn6.gz
>   mv ham/ham.alearn4.gz ham/ham.alearn5.gz
>   mv ham/ham.alearn3.gz ham/ham.alearn4.gz
>   mv ham/ham.alearn2.gz ham/ham.alearn3.gz
> 
>   gzip ham/ham.alearn1
>   mv ham/ham.alearn1.gz ham/ham.alearn2.gz
> 
>   mv ham ham/ham.alearn1
> fi
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCEnOhMJF5cimLx9ARAtk0AJ4lMWErOwVHWtaC6surOO+VpgE8fgCgncOq
8TE+PQHMIaiP6B+HNzvRK+Q=
=XnOV
-END PGP SIGNATURE-



Re: Good idea or bad idea?

2005-02-15 Thread Rick Macdougall

Chris Santerre wrote:
>> The 1:1 ratio is a mistake based on a wrong interpretation of the bayes
>> theorem. I have a ham : spam ratio of 1 : 40.
>
> Also:
> "I think bayes is a very good addition to individual rules. And when it's
> trained properly it works fine."
>
> I agree on a personal scale it works wonders if you *continue* to feed it a
> proper diet. But when you get to a more general server side solution, I
> don't think the results are worth the effort, when one can write a simple
> rule faster than training.
>
> IMHO if it only works on a personal level, or a very small company level,
> it's not a good solution. (Mainly because the starfish out there can't even
> work their microwaves, never mind feed bayes!)
> --Chris

Works great here at the ISP level, about 40K users.  I let it auto learn 
for a while (a few weeks) then turn off the auto-learning and feed it myself.

Regards,
Rick


RE: Good idea or bad idea?

2005-02-15 Thread Chris Santerre
>>The 1:1 ratio is a mistake based on a wrong interpretation of the bayes
>>theorem. I have a ham : spam ratio of 1 : 40.
>
>No, it's not.. Dan Q actually corrected me once on this.. Technically, the
>"optimal" ratio is 1:1, based on the testing done by the SA development team.
>
>SA is very tolerant of deviation from this, mine is 22:1 and works fine...
>but Chris is correct in saying the ideal is 1:1, and it's not a mistake.

Yes, it does tolerate a deviation well. But I remember DQ saying something
like this. 

Also:
"I think bayes is a very good addition to individual rules. And when it's
trained properly it works fine."

I agree on a personal scale it works wonders if you *continue* to feed it a
proper diet. But when you get to a more general server side solution, I
don't think the results are worth the effort, when one can write a simple
rule faster than training. 

IMHO if it only works on a personal level, or a very small company level,
it's not a good solution. (Mainly because the starfish out there can't even
work their microwaves, never mind feed bayes!)

--Chris 


Re: Good idea or bad idea?

2005-02-15 Thread Matt Kettler
At 04:26 PM 2/15/2005, Thomas Arend wrote:
> The 1:1 ratio is a mistake based on a wrong interpretation of the bayes
> theorem. I have a ham : spam ratio of 1 : 40.

No, it's not.. Dan Q actually corrected me once on this.. Technically, the 
"optimal" ratio is 1:1, based on the testing done by the SA development team.

SA is very tolerant of deviation from this, mine is 22:1 and works fine... 
but Chris is correct in saying the ideal is 1:1, and it's not a mistake.
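
To see where your own database stands, the learned-message counts can be read from `sa-learn --dump magic`. What follows is only a sketch: it assumes the dump prints the counts on lines containing `non-token data: nspam` and `non-token data: nham`, with the count in the third column (check this against the output of your version). `bayes_ratio` is a made-up helper name, not part of SpamAssassin.

```shell
#!/bin/sh
# bayes_ratio: read `sa-learn --dump magic` output on stdin and print the
# spam:ham ratio, normalised so that the ham side is 1.
# Assumption: the message count is the third whitespace-separated field on
# the "non-token data: nspam" and "non-token data: nham" lines.
bayes_ratio() {
  awk '
    /non-token data: nspam/ { nspam = $3 }
    /non-token data: nham/  { nham  = $3 }
    END {
      # Avoid dividing by zero when no ham has been learned.
      if (nham == 0) { print "no ham learned yet"; exit }
      printf "%.1f:1 (spam:ham, %d spam, %d ham)\n", nspam / nham, nspam, nham
    }
  '
}

# Typical use (needs a working SpamAssassin install):
#   sa-learn --dump magic | bayes_ratio
```

Run it as the same user spamd runs as, otherwise you may be looking at a different database than the one doing the scoring.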



Re: sa-learn ham from my emails

2005-02-15 Thread Thomas Arend
On Monday, 14 February 2005 23:13, Daniel Cañas wrote:
> On Feb 14, 2005, at 3:34 PM, Thomas Arend wrote:
> > On Monday, 14 February 2005 20:50, Daniel Cañas wrote:
> >> I have over 2000 emails that I have as ham and would like to feed to
> >> sa-learn..
> >
[..]

> >> I have legit spam that I want to learn but I am afraid to do it if I
> >> don't have corresponding number of ham.
> >
> > In my opinion and experience this is bullshit.
>
> Cool.. this is good to know as I can collect tons of spam.

I have a ratio of 1 : 40 and bayes works fine.


Thomas
-- 
icq:133073900
http://www.t-arend.de




Re: Good idea or bad idea?

2005-02-15 Thread Thomas Arend
On Tuesday, 15 February 2005 22:08, Chris Santerre wrote:
> >I have autolearn disabled in my SpamAssassin config.
> >
> >I get certain e-mail accounts that are old and JUST GET SPAM (no question
> >about it). I set up a script that takes e-mails from these accounts and
> >feeds them into sa-learn as SPAM.
> >
> >I have no HAM right now, however I have plans to add at least a couple
> >hundred to bayes (that is the bare minimum, I believe).
> >
> >My question is: Is there anything wrong with doing this? I've seen some
> >posts about ratios. I figured the more SPAM you feed it, the smarter it
> >will get. Keep in mind I am not trying to use bayes scoring right now, but
> >I thought this setup was better instead of using auto-learn to try to guess
> >which were spam (they are ALL spam!)
>
> When taking a survey on abstinence, is it good to only go and ask college
> kids?  :)
>
> A proper Bayes Diet consists of 50% ham and 50% spam. This would be the
> optimum. Drastic differences can skew the results. Remember Bayes doesn't
> just look for spam, it also looks for ham just as much.

The 1:1 ratio is a mistake based on a wrong interpretation of the bayes 
theorem. I have a ham : spam ratio of 1 : 40.

>
> And YES, Ninja Chris has just answered a Bayes question. I know, I know,
> don't panic! ;)
>
> --Chris (I don't usually answer Bayes questions because I don't think Bayes
> is a good solution.)

I think bayes is a very good addition to individual rules. And when it's
trained properly it works fine.


Thomas


-- 
icq:133073900
http://www.t-arend.de




Re: Good idea or bad idea?

2005-02-15 Thread Matt Kettler
At 03:54 PM 2/15/2005, Austin Weidner wrote:
> I get certain e-mail accounts that are old and JUST GET SPAM (no question
> about it). I set up a script that takes e-mails from these accounts and feeds
> them into sa-learn as SPAM.
>
> I have no HAM right now, however I have plans to add at least a couple
> hundred to bayes (that is the bare minimum, I believe).
>
> My question is: Is there anything wrong with doing this?

No.. I see nothing wrong with it.. I do this myself...  They're called 
"spam traps" by most.

I also have some carefully guarded "ham traps" that I've carefully 
subscribed to well-trusted industry newsletters, etc. I script-feed these 
to sa-learn --ham. I also keep a rotating archive of all the learned mail, 
so I can go through and review it for contamination.

I just use a simple interval cronjob to do this, and I have the individual 
addresses all aliased into one "spam" and one "ham" account. You can also 
add things like calls to razor-report, etc.

(Note I've re-named the mailboxes here with a search/replace. You should 
too.. You don't want outsiders being able to recognize your spam trap or 
ham trap accounts. You certainly don't want anything as predictable as 
"[EMAIL PROTECTED]" as your ham trap.)

Here's a trimmed down version of my script (no warranties or claims it's 
bug-free, etc. Just provided to give you some ideas)

#!/bin/sh
cd /var/autolearn/
if [ -f /var/spool/mail/spam ]; then
 echo learning spam mailbox - spam
 mv /var/spool/mail/spam .
 /usr/bin/sa-learn --spam --mbox spam
 rm spam/spam.alearn6.gz
 mv spam/spam.alearn5.gz spam/spam.alearn6.gz
 mv spam/spam.alearn4.gz spam/spam.alearn5.gz
 mv spam/spam.alearn3.gz spam/spam.alearn4.gz
 mv spam/spam.alearn2.gz spam/spam.alearn3.gz
 gzip spam/spam.alearn1
 mv spam/spam.alearn1.gz spam/spam.alearn2.gz
 mv spam spam/spam.alearn1
fi
if [ -f /var/spool/mail/ham ]; then
 echo learning ham mailbox - ham
 mv /var/spool/mail/ham .
 /usr/bin/sa-learn --ham --mbox ham
 rm ham/ham.alearn6.gz
 mv ham/ham.alearn5.gz ham/ham.alearn6.gz
 mv ham/ham.alearn4.gz ham/ham.alearn5.gz
 mv ham/ham.alearn3.gz ham/ham.alearn4.gz
 mv ham/ham.alearn2.gz ham/ham.alearn3.gz
 gzip ham/ham.alearn1
 mv ham/ham.alearn1.gz ham/ham.alearn2.gz
 mv ham ham/ham.alearn1
fi
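
The rotation steps above can also be written as a loop. This is a sketch under stated assumptions, not Matt's actual script: it keeps six generations like the original, but it keeps the mailbox file and the archive directory under separate names, and it takes the learn command as an optional argument so the rotation can be dry-run. The function name `learn_and_rotate` and the `.mbox` working-file suffix are made up for illustration.

```shell
#!/bin/sh
# learn_and_rotate KIND MAILBOX ARCHIVE_DIR [LEARN_CMD]
#   KIND        "spam" or "ham" (passed to the learner as --spam / --ham)
#   MAILBOX     mbox file to learn from, e.g. /var/spool/mail/spam
#   ARCHIVE_DIR directory holding KIND.alearn1 .. KIND.alearn6.gz
#   LEARN_CMD   learner to run (default: sa-learn); a hypothetical knob so
#               the rotation can be tried without SpamAssassin installed
learn_and_rotate() {
  kind=$1 mailbox=$2 dir=$3 learn=${4:-sa-learn}

  [ -f "$mailbox" ] || return 0      # nothing delivered since the last run
  mkdir -p "$dir" || return 1

  # Move the mailbox aside first so new mail lands in a fresh file.
  work="$dir/$kind.mbox"
  mv "$mailbox" "$work" || return 1
  "$learn" "--$kind" --mbox "$work" || return 1

  # Shift generations 5->6, 4->5, 3->4, 2->3; the oldest is overwritten.
  for n in 5 4 3 2; do
    if [ -f "$dir/$kind.alearn$n.gz" ]; then
      mv -f "$dir/$kind.alearn$n.gz" "$dir/$kind.alearn$(( $n + 1 )).gz"
    fi
  done

  # Compress the previous run into slot 2, then park this run in slot 1.
  if [ -f "$dir/$kind.alearn1" ]; then
    gzip -f "$dir/$kind.alearn1" &&
      mv -f "$dir/$kind.alearn1.gz" "$dir/$kind.alearn2.gz"
  fi
  mv "$work" "$dir/$kind.alearn1"
}

# Example cron usage (paths are placeholders):
#   learn_and_rotate spam /var/spool/mail/spam /var/autolearn/spam
#   learn_and_rotate ham  /var/spool/mail/ham  /var/autolearn/ham
```

Checking each generation with `[ -f ... ]` before moving it means the first few runs do not print errors about archives that do not exist yet.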




Re: Good idea or bad idea?

2005-02-15 Thread Thomas Arend
On Tuesday, 15 February 2005 21:54, Austin Weidner wrote:
> I have autolearn disabled in my SpamAssassin config.
>
> I get certain e-mail accounts that are old and JUST GET SPAM (no question
> about it). I set up a script that takes e-mails from these accounts and
> feeds them into sa-learn as SPAM.
>
> I have no HAM right now, however I have plans to add at least a couple
> hundred to bayes (that is the bare minimum, I believe).

you will need 200 spam and 200 ham  in the default configuration.
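
Whether that threshold has been reached can be checked from the counts in `sa-learn --dump magic`. A rough sketch, assuming the default minimum of 200 for both classes (the `bayes_min_spam_num` and `bayes_min_ham_num` settings) and assuming the dump shows each count in the third column of its `nspam` / `nham` line. `bayes_ready` is a hypothetical helper name.

```shell
#!/bin/sh
# bayes_ready [MIN]: read `sa-learn --dump magic` output on stdin and succeed
# only if both the spam and the ham count are at least MIN (default 200).
# Assumption: counts are the third field of the "nspam" / "nham" lines.
bayes_ready() {
  awk -v min="${1:-200}" '
    /non-token data: nspam/ { nspam = $3 }
    /non-token data: nham/  { nham  = $3 }
    END {
      printf "spam=%d ham=%d min=%d\n", nspam, nham, min
      # Exit status 0 when both classes have reached the minimum, else 1.
      exit !(nspam + 0 >= min + 0 && nham + 0 >= min + 0)
    }
  '
}

# Typical use:
#   sa-learn --dump magic | bayes_ready && echo "bayes has enough training"
```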

>
> My question is: Is there anything wrong with doing this? I've seen some
> posts about ratio's. I figured the more SPAM you feed it, the smarter it
> will get. Keep in mind I am not trying to use bayes scoring right now, but
> I thought this setup was better instead of using auto-learn to try to guess
> which were spam (they are ALL spam!)

You should feed all ham and spam. With auto-learn you risk training a false 
positive as spam and a false negative as ham. This will spoil your database. 
In my experience the default scores are good enough.

I can only encourage you to use bayes. After a little training it is very good 
with "old" spam and not bad with "new" spam.

Regards

Thomas   
-- 
icq:133073900
http://www.t-arend.de




RE: Good idea or bad idea?

2005-02-15 Thread Chris Santerre
>I have autolearn disabled in my SpamAssassin config.
>
>I get certain e-mail accounts that are old and JUST GET SPAM (no question
>about it). I set up a script that takes e-mails from these accounts and
>feeds them into sa-learn as SPAM.
>
>I have no HAM right now, however I have plans to add at least a couple
>hundred to bayes (that is the bare minimum, I believe).
>
>My question is: Is there anything wrong with doing this? I've seen some
>posts about ratios. I figured the more SPAM you feed it, the smarter it
>will get. Keep in mind I am not trying to use bayes scoring right now, but I
>thought this setup was better instead of using auto-learn to try to guess
>which were spam (they are ALL spam!)

When taking a survey on abstinence, is it good to only go and ask college
kids?  :) 

A proper Bayes Diet consists of 50% ham and 50% spam. This would be the
optimum. Drastic differences can skew the results. Remember Bayes doesn't
just look for spam, it also looks for ham just as much.

And YES, Ninja Chris has just answered a Bayes question. I know, I know,
don't panic! ;) 

--Chris (I don't usually answer Bayes questions because I don't think Bayes
is a good solution.)


Good idea or bad idea?

2005-02-15 Thread Austin Weidner
I have autolearn disabled in my SpamAssassin config.

I get certain e-mail accounts that are old and JUST GET SPAM (no question
about it). I set up a script that takes e-mails from these accounts and feeds
them into sa-learn as SPAM.

I have no HAM right now, however I have plans to add at least a couple
hundred to bayes (that is the bare minimum, I believe).

My question is: Is there anything wrong with doing this? I've seen some
posts about ratios. I figured the more SPAM you feed it, the smarter it
will get. Keep in mind I am not trying to use bayes scoring right now, but I
thought this setup was better instead of using auto-learn to try to guess
which were spam (they are ALL spam!)



MISC: HUMOR Instant 419!

2005-02-15 Thread Chris Santerre
Looks like 419ers are using instant messaging to get people now! Funny
conversation that a friend of mine had. Worth the read. 

http://www.merchantsoverseas.com/wwwroot/gorilla/funny419.txt

(Friend's name changed, but engrojie_adams is the real 419er.)

--Chris 


RE: URIDNSBL error

2005-02-15 Thread List Mail User
Chris,

Yes.  Try using the rfci lists and/or AHBL (no they're not in the
code base as delivered, but they work very well).

Paul Shupak
[EMAIL PROTECTED]

>>From [EMAIL PROTECTED] Tue Feb 15 10:53:26 2005
>Mailing-List: contact [EMAIL PROTECTED]; run by ezmlm
>Precedence: bulk
>list-help: 
>list-unsubscribe: 
>list-post: 
>List-Id: 
>Delivered-To: mailing list users@spamassassin.apache.org
>X-ASF-Spam-Status: No, hits=0.1 required=10.0
>   tests=FORGED_RCVD_HELO
>Received-SPF: pass (hermes.apache.org: local policy)
>From: Chris Santerre <[EMAIL PROTECTED]>
>To: "'Matt Kettler'" <[EMAIL PROTECTED]>,
>Austin Weidner <[EMAIL PROTECTED]>, users@spamassassin.apache.org
>Subject: RE: URIDNSBL error
>Date: Tue, 15 Feb 2005 13:46:21 -0500
>X-Mailer: Internet Mail Service (5.5.2653.19)
>X-Virus-Checked: Checked
>
>>surbl.org is the biggest source of URIDNSBLs.
>
>Is there another? :)  
>
>--Chris (Oh no! We're the Microsoft of URIDNSBLs! All your domains are
>belong to us!)
>


[OT] SA Users and spam folder deliveries.

2005-02-15 Thread Dave Goodrich
Good afternoon all,
I'm beginning to see instances where my modified ifspamh script is 
failing. Could be many reasons. Last night I had a spam attack of about 
44k+ messages (several thousand with 15 to 20 recipients each).

I currently have different scripts for each client's needs. I'd like to 
run one method of catching the result of spamc and delivering based on 
the result spamc hands back.

I'm concerned about procmail's use of system resources and its speed, and 
I've never used maildrop. How are others handling delivery after spamc?

Thanks,
DAve
--
Dave Goodrich
Systems Administrator
http://www.tls.net
Get rid of Unwanted Emails...get TLS Spam Blocker!



RE: How Can I find out what SA is doing?

2005-02-15 Thread Chris Santerre

>> > I'd also strongly suggest switching over to enabling network tests and
>> > disabling bigevil.. Much lower overhead.
>>
>>Can you explain how I make that switch or refer me to any relevant docs?
>
>Really, all you need to do is make sure Net::DNS is installed.. SA 3.x will
>by default begin using the SURBL DNS based URL blocklists.
>
>Chris S (author of bigevil) worked with Will S to merge the bigevil data
>into the ws.surbl.org DNS list quite a while ago, and has been recommending
>people use SURBL instead of bigevil.cf where possible.

I heard my name :) 

BigEvil hasn't been updated in about 4 months. Last week someone mentioned
they still used it, and for giggles I checked what it would be like if I
updated it. It killed servers when it was around 900k. It's now 2.7 megs! 

I guess one sure way for people to stop using it is for me to update it :) 

--Chris 


RE: URIDNSBL error

2005-02-15 Thread Chris Santerre

>surbl.org is the biggest source of URIDNSBLs.

Is there another? :)  

--Chris (Oh no! We're the Microsoft of URIDNSBLs! All your domains are
belong to us!)


Re: URIDNSBL error

2005-02-15 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Austin Weidner writes:
> Why am I getting around 20 lines of this in a spamassassin --lint -D:
> 
> debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0xa89414c)
> inhibited further callbacks
> 
> What is URIDNSBL and what is this error?

that's not an error.  you're running with debugs on, and it's
a debugging message ;)

URIDNSBL is the plugin used to do SURBL lookups.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCEkNbMJF5cimLx9ARAk73AJwIL+5kxuc126c84hAeAGQNkl8vrQCglosu
IKkqdXao3Dx4njpgzsIggZA=
=OjKt
-END PGP SIGNATURE-



Re: Wierd Problem identifying Spam

2005-02-15 Thread Daniel Draes
Hi,
> Usually that means the message has been double-scanned..
> First at the MTA layer, where it got tagged as spam and encapsulated.
> The encapsulation also winds up creating new headers for the message.
>
> The second time it got called at the MDA layer (i.e. procmail) and the
> new headers resulted in a lower score that wasn't spam. The second
> scan over-writes all the X-Spam headers with the new status, but
> doesn't modify the subject and body that were tagged by the previous run.

Hmm. I thought it was something like this. Can you help me find out 
if, when and why?

Here is the relevant part of my postfix setup:
smtp      inet  n       -       n       -       -       smtpd
  -o content_filter=spamassassin
[...]
spamassassin unix -     n       n       -       -       pipe
  user=nobody argv=/usr/bin/spamc -f -e
  /usr/sbin/sendmail -oi -f ${sender} ${recipient}

There is no further processing mail with SA through procmail (I checked 
that already).

THX!
Daniel



Re: Wierd Problem identifying Spam

2005-02-15 Thread Matt Kettler
At 01:30 PM 2/15/2005, Daniel Draes wrote:
> Here is what happens:
> SA gets the mail and checks it nicely. However, applying points to the
> mail seems to fail somehow. For example I have mails where the subjects
> will be rewritten according to my config with '*SPAM*' but the
> SA-Spam-Status Flag states:
> No, hits=-4.8 required=5.0 tests=BAYES_00,HTML_MESSAGE autolearn=ham
> version=2.64
>
> The mail itself however says:
> Content analysis details:   (7.8 points, 5.0 required)
> So having a procmail-rule for X-Spam-Level doesn't really help.
> Any ideas what's wrong with my setup?

Usually that means the message has been double-scanned..
First at the MTA layer, where it got tagged as spam and encapsulated. The 
encapsulation also winds up creating new headers for the message.

The second time it got called at the MDA layer (i.e. procmail) and the new 
headers resulted in a lower score that wasn't spam. The second scan 
over-writes all the X-Spam headers with the new status, but doesn't modify 
the subject and body that were tagged by the previous run.
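
One way to spot messages that went through twice is to compare the score in the X-Spam-Status header with the score in the "Content analysis details" report left in the body. The sketch below assumes the header carries `hits=N` (as in the 2.64 example above) or `score=N` (3.x), and that the report line has the form `(N points, ...)`. `scan_mismatch` is a made-up name for illustration.

```shell
#!/bin/sh
# scan_mismatch: read one mail message on stdin, print both scores, and
# return 0 when the X-Spam-Status header and the "Content analysis details"
# report disagree, 1 when they agree, 2 when either score is missing.
scan_mismatch() {
  awk '
    # First X-Spam-Status header: pull out the number after hits= or score=.
    /^X-Spam-Status:/ && header == "" {
      if (match($0, /(hits|score)=-?[0-9.]+/)) {
        header = substr($0, RSTART, RLENGTH)
        sub(/^[a-z]+=/, "", header)
      }
    }
    # First report line: pull out the number between "(" and " points".
    /Content analysis details:/ && body == "" {
      if (match($0, /\(-?[0-9.]+ points/)) {
        body = substr($0, RSTART + 1, RLENGTH - 8)
      }
    }
    END {
      if (header == "" || body == "") exit 2   # nothing to compare
      printf "header=%s body=%s\n", header, body
      exit !(header + 0 != body + 0)
    }
  '
}

# Typical use on a saved message:
#   scan_mismatch < suspect.eml && echo "scanned twice?"
```

A mismatch only hints at double scanning; the Received headers and the mail log are where you would confirm which hop ran the second scan.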



Re: Wierd Problem identifying Spam

2005-02-15 Thread Theo Van Dinter
On Tue, Feb 15, 2005 at 07:30:33PM +0100, Daniel Draes wrote:
> Any ideas whats wrong with my setup?

My guess is that you're running the message through SpamAssassin twice.  The
first time marks it up appropriately, then the second time sees a
substantially different message and marks it up differently, leading to
confusing results.

-- 
Randomly Generated Tagline:
"First learn computer science and all the theory. Next develop a
 programming style.  Then forget all that and just hack."
   - George Carrette




Re: URIDNSBL error

2005-02-15 Thread Theo Van Dinter
On Tue, Feb 15, 2005 at 01:22:01PM -0500, Matt Kettler wrote:
> >debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0xa89414c)
> >inhibited further callbacks
> 
> It's not an error, it's a debugging informational message only.. It's 
> highly important to plugin writing and debugging, but otherwise it's 
> irrelevant... Ignore it.

FYI, the "inhibited further callbacks" debug statement was disabled for 3.0.2:

r55722 | jm | 2004-10-27 14:56:59 -0400 (Wed, 27 Oct 2004) | 1 line
remove the annoying 'inhibited further callbacks' debug message

-- 
Randomly Generated Tagline:
"Do not meddle in the affairs of wizards, for you are crunchy and good
 with ketchup."  - Unknown




Wierd Problem identifying Spam

2005-02-15 Thread Daniel Draes
Hi folks,
I have a pretty weird problem here and cannot figure out why.
Here is my system:
SuSe 9.2
Postfix 2.0.19
SpamAssassin 2.64
SA is running as a daemon, and postfix is connecting correctly to the 
assigned TCP port.

Here is what happens:
SA gets the mail and checks it nicely. However, applying points to the 
mail seems to fail somehow. For example I have mails where the subjects 
will be rewritten according to my config with '*SPAM*' but the 
SA-Spam-Status Flag states:
No, hits=-4.8 required=5.0 tests=BAYES_00,HTML_MESSAGE autolearn=ham 
version=2.64

The mail itself however says:
Content analysis details:   (7.8 points, 5.0 required)
So having a procmail-rule for X-Spam-Level doesn't really help.
Any ideas what's wrong with my setup?
THX!
Daniel



Re: URIDNSBL error

2005-02-15 Thread Matt Kettler
At 12:38 PM 2/15/2005, Austin Weidner wrote:
> Why am I getting around 20 lines of this in a spamassassin --lint -D:
> debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0xa89414c)
> inhibited further callbacks
> What is URIDNSBL and what is this error?

It's not an error, it's a debugging informational message only.. It's 
highly important to plugin writing and debugging, but otherwise it's 
irrelevant... Ignore it.
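
If the repeated lines get in the way while reading debug output, they can simply be filtered out. A small sketch; `quiet_lint_debug` is a made-up name, and it assumes each message arrives as a single line containing the text "inhibited further callbacks".

```shell
#!/bin/sh
# quiet_lint_debug: copy stdin to stdout, dropping the harmless
# "inhibited further callbacks" debug lines.
quiet_lint_debug() {
  grep -v 'inhibited further callbacks'
}

# Debug output goes to stderr, so merge it into the pipe first:
#   spamassassin --lint -D 2>&1 | quiet_lint_debug
```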

URIDNSBLs are DNS based blacklists of domain names found in spamvertized 
URI's (aka URL's) from the message body.

surbl.org is the biggest source of URIDNSBLs.



Re: spamd hanging or looping

2005-02-15 Thread Henk van Lingen
On Tue, Feb 15, 2005 at 06:27:39PM +0100, Sander Holthaus - Orange XL wrote:
  > 
  > From reading your messages, I wouldn't be sure if it is a bug in spamd
  > itself. It could very well be in either the Perl-version or related modules
  > you are using. Some are actually quite old and have known bugs in them
  > which can lead to endless loops.

  Hm. If versions coming with RHEL 3 Linux cause it, I think I would still
  call it a bug...

  > Before submitting a bugreport, upgrade perl and related modules to their
  > latest versions. Also save your db-files and the related messages (which can
  > be handy if it is indeed an unresolved bug).

  I have already filed a report including relevant files:

  http://bugzilla.spamassassin.org/show_bug.cgi?id=4138

  Cheers,

-- 
Henk van Lingen, Systems & Network Administrator  (o-  -+
Dept. of Computer Science, Utrecht University./\|
phone: +31-30-2535278v_/_
http://henk.vanlingen.net/ http://www.tuxtown.net/netiquette/


URIDNSBL error

2005-02-15 Thread Austin Weidner

Why am I getting around 20 lines of this in a spamassassin --lint -D:

debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0xa89414c)
inhibited further callbacks

What is URIDNSBL and what is this error?

Thanks



Re: How Can I find out what SA is doing?

2005-02-15 Thread Matt Kettler
At 12:03 PM 2/15/2005, Chris Withers wrote:
>> I'd also strongly suggest switching over to enabling network tests and
>> disabling bigevil.. Much lower overhead.
>
> Can you explain how I make that switch or refer me to any relevant docs?

Really, all you need to do is make sure Net::DNS is installed.. SA 3.x will 
by default begin using the SURBL DNS based URL blocklists.

Chris S (author of bigevil) worked with Will S to merge the bigevil data 
into the ws.surbl.org DNS list quite a while ago, and has been recommending 
people use SURBL instead of bigevil.cf where possible.  
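
Whether Net::DNS is visible to Perl can be checked from the shell. A small sketch; the helper name `perl_module_version` is hypothetical, and it assumes the `perl` on your PATH is the same one SpamAssassin runs under:

```sh
#!/bin/sh
# Print the version of a Perl module, or return non-zero if perl cannot load it.
perl_module_version() {
    perl -M"$1" -e 'print $ARGV[0]->VERSION, "\n"' "$1" 2>/dev/null
}

if perl_module_version Net::DNS; then
    echo "Net::DNS is installed; network tests can use it"
else
    echo "Net::DNS not found; install it before relying on SURBL lookups"
fi
```

"Disabling bigevil" under rules_du_jour presumably means dropping BIGEVIL from the TRUSTED_RULESETS line and removing the already downloaded bigevil .cf file from the site rules directory; treat that reading as an assumption rather than as something stated in this thread.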



RE: spamd hanging or looping

2005-02-15 Thread Sander Holthaus - Orange XL
> On Mon, Feb 14, 2005 at 08:18:47PM +0100, Henk van Lingen wrote:
>   >
>   > Additional info on this bug:
> 
>   Being a bit surprised about the lack of interest in a bug like this here,
>   I'm trying to submit something to this 'bugzilla'. I've made an account
>   and now it suggest reading the guidelines. However:
> 
>   "The requested URL /bugwritinghelp.html was not found on this server."
> 
>   So, what is priority, severity and URL? Or does it not really matter?
>   And what component to choose? spamd seems logical but i think the prob
>   is in a library.
> 
>   (I hate having to edit a message without vim in a 'textarea' :-))

From reading your messages, I wouldn't be sure if it is a bug in spamd
itself. It could very well be in either the Perl-version or related modules
you are using. Some are actually quite old and have known bugs in them
which can lead to endless loops.

Before submitting a bugreport, upgrade perl and related modules to their
latest versions. Also save your db-files and the related messages (which can
be handy if it is indeed an unresolved bug).

Kind Regards,
Sander Holthaus

PS: What OS and which Perl-version are you using?



Re: How Can I find out what SA is doing?

2005-02-15 Thread Chris Withers
Matt Kettler wrote:
>> TRUSTED_RULESETS="BIGEVIL TRIPWIRE ANTIDRUG EVILNUMBERS";
>
> Please don't use antidrug.cf with SA 3.0 or higher.. SA 3.0 already has
> antidrug as a part of the standard ruleset, and the second .cf is a waste.

Aha, OK.

> I'd also strongly suggest switching over to enabling network tests and
> disabling bigevil.. Much lower overhead.

Can you explain how I make that switch or refer me to any relevant docs?

cheers,

Chris
--
Simplistix - Content Management, Zope & Python Consulting
   - http://www.simplistix.co.uk


Re: How Can I find out what SA is doing?

2005-02-15 Thread Matt Kettler
At 11:35 AM 2/15/2005, Chris Withers wrote:
> I'm using a debian package install - spamassassin 3.0.2-0.backports.org
> I also use rules_du_jour in a my_rules_du_jour style and have the
> following trusted rule sets:
>
> TRUSTED_RULESETS="BIGEVIL TRIPWIRE ANTIDRUG EVILNUMBERS";

Please don't use antidrug.cf with SA 3.0 or higher.. SA 3.0 already has 
antidrug as a part of the standard ruleset, and the second .cf is a waste.

I'd also strongly suggest switching over to enabling network tests and 
disabling bigevil.. Much lower overhead.



How Can I find out what SA is doing?

2005-02-15 Thread Chris Withers
Hi,

I run spamassassin from procmail using the following in my .procmailrc:

# SpamAssassin and other spam filtering
:0fw: spamassassin.lock
| /usr/bin/spamassassin

I'm using a debian package install - spamassassin 3.0.2-0.backports.org

I also use rules_du_jour in a my_rules_du_jour style and have the 
following trusted rule sets:

TRUSTED_RULESETS="BIGEVIL TRIPWIRE ANTIDRUG EVILNUMBERS";

I noticed a big dip in spam catching performance a while after moving 
from a source install to the debian package.

I've also noticed that none of the rulesets I use are being updated by 
rules_du_jour anymore.

So, a couple of questions:
- how can I check what rules are being used and whether the stuff done 
by rules_du_jour and my weekly training script is having an effect?

- what rule sets do people now recommend in rules_du_jour and is 
rules_du_jour still working?

cheers,
Chris
--
Simplistix - Content Management, Zope & Python Consulting
   - http://www.simplistix.co.uk


Re: bayesian filter training

2005-02-15 Thread Matias Lopez Bergero
Robert Menschel wrote:
> Hello Matias,
>
> Friday, February 11, 2005, 5:32:10 AM, you wrote:
>
> MLB> The sa-learn man page says that for a good training of the
> MLB> Bayesian filter, you need to train it with equal amounts of spam
> MLB> and ham, or more ham if is possible. So if I sa-learn the spam
> MLB> folder, the spam tokens are going to grow a lot compared to ham
> MLB> tokens.
>
> IMO, if you manually train ONLY spam into the system, then yes, you
> may end up with Bayes problems. Emphasis: may. It might work just
> fine.
>
> You don't need to worry about training Bayes with equal amounts of
> spam and ham -- my ratio has varied from 10:1 to 15:1 spam:ham, with
> no problem.
>
> But it's important to feed ham into the system as well. I would
> hesitate exceeding a 100:1 ratio, unless your actual spam load exceeds
> 100:1.

I'm running a site-wide Bayes db now, and I'm seeing a lot of ham 
added to the db by autolearn.
I think that this is a very good thing, and it's helping me to keep a 
good ham:spam ratio :)

Since: Feb 13 04:03:57
learned ham: 1671
Learned spam: 1560
:-D
BR,
Matías.
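
The learned counts can also be read back from the Bayes database itself rather than from the logs. A rough sketch, assuming `sa-learn --dump magic` is available on your version and prints the `nspam` and `nham` counters with the count in the third column (worth checking against your own output first); `bayes_ratio` is a made-up helper name:

```sh
#!/bin/sh
# Read "sa-learn --dump magic" style output on stdin and print the
# spam:ham ratio of learned messages.
bayes_ratio() {
    awk '
        $NF == "nspam" { nspam = $3 }
        $NF == "nham"  { nham  = $3 }
        END {
            if (nham > 0) printf "spam:ham = %.2f:1\n", nspam / nham
            else print "no ham learned yet"
        }'
}

# Usage:
#   sa-learn --dump magic | bayes_ratio
```

With the figures quoted above (1560 spam, 1671 ham) this prints `spam:ham = 0.93:1`, comfortably inside the range Bob describes as unproblematic.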


Re: spamd hanging or looping

2005-02-15 Thread Henk van Lingen
On Mon, Feb 14, 2005 at 08:18:47PM +0100, Henk van Lingen wrote:
  > 
  > Additional info on this bug:

  Being a bit surprised about the lack of interest in a bug like this here,
  I'm trying to submit something to this 'bugzilla'. I've made an account
  and now it suggest reading the guidelines. However:

  "The requested URL /bugwritinghelp.html was not found on this server."

  So, what is priority, severity and URL? Or does it not really matter?
  And what component to choose? spamd seems logical but i think the prob
  is in a library.

  (I hate having to edit a message without vim in a 'textarea' :-))

  Cheers,

-- 
Henk van Lingen, Systems & Network Administrator  (o-  -+
Dept. of Computer Science, Utrecht University./\|
phone: +31-30-2535278v_/_
http://henk.vanlingen.net/ http://www.tuxtown.net/netiquette/


Re: Spamassassin with sa-learn

2005-02-15 Thread Kris Deugau
Tinni wrote:
> Here my question is: when a mail is coming to the server, suppose for
> *user2* or many others, will spamassassin check the mails for
> ham/spam with the *default* database which is by default set to
> *user1*? Or will it check only the mails of *user1*? I am a
> little bit confused here.

I don't have your original message, but IIRC you said you're calling
SpamAssassin from procmail.  This implies that you're doing so just
before the message is put into a mail folder (whether that's the inbox
for a user or elsewhere is determined by procmail).  On most mail
systems, this *also* means that mail processing is done one message at a
time, for one recipient at a time.

As I said in my first reply, if you want a single global Bayes database
you **MUST** at the very minimum put a bayes_path statement in one of
your local configuration files - local.cf is most commonly used.

When a message is processed by SA, with that bayes_path statement in
place, *ALL* Bayes activity is done on that global database.

> As I understand it, individual user_prefs will supersede the
> value of the global parameter settings.

For certain settings, yes.  See the man page for
Mail::SpamAssassin::Conf for the details on which ones.

> So does this concept apply to the bayes database also?

IIRC, no;  bayes* options are considered "privileged" settings.  Check
the man page on your installed SpamAssassin copy to be certain for your
usage.

> If yes, then the default bayes database, whatever it learned (spam + ham)
> for default user *user1*, will not work for the other users - is this
> so?

Assuming that bayes* options are not privileged, then yes, any user
could stick in a bayes_path statement and avoid the global database.

Otherwise, all users will refer to the global database.

> I simply want the default path where I see spamassassin
> updating/working to be applicable for all the users.

Please see the suggestions at the bottom of my first reply, and refer to
the man page to make sure you have the settings laid out correctly for
your installed version of SA.

Those settings have been working on one of my systems for several years
now.

-kgd
-- 
Get your mouse off of there!  You don't know where that email has been!


Re: Doesn't work with non local accounts

2005-02-15 Thread Matt Kettler
At 02:12 AM 2/15/2005, Andrew Afliatunov wrote:
> I use spamassassin-3.01 in site-wide mode (spamd+spamc) on a Linux
> Slackware-9.1 mail server.
> Everything worked just fine - about 300 spam letters were filtered daily.
> But then I made the system look up mail users in an LDAP database, and
> removed the accounts from the Linux system.
> Now spamc can't check mail for those users. In procmail.log I see:
> --
> getpwuid() failed: No such file or directory
> procmail: Program failure (71) of "/usr/bin/spamc"
> procmail: Rescue of unfiltered data succeeded
> --
> And users get tons of spam :(.
> How can I make spamassassin work with non-local accounts?

Did you pass the --setuid-with-ldap parameter to spamd? If so, don't unless 
the accounts exist locally.

(spamd can't setuid to a nonexistent account, which is why this feature is 
optional. It is only useful if you have ldap AND local accounts)





Re: Using external tests

2005-02-15 Thread Matt Kettler
At 06:04 AM 2/15/2005, Manuel Schmitt (manitu) wrote:
> I am searching for a way to use an external test in Spamassassin. E.g. I
> want to have a simple bash script which is called by SA every time a mail
> is checked. This external program returns a return code which SA uses for
> testing purposes.
>
> Perhaps this is very simple by having a simple conf directive, but I did
> not find anything in the docs or faq :(

There's no conf directive to do that.. you'd have to write a custom plugin in 
perl that does it.



Re: Spamassassin with sa-learn

2005-02-15 Thread Tinni




Hi 
Thanks for your reply.
> OK, looks good. SA puts preferences and AWL data and Bayes data files
> in ~/.spamassassin/ by default.

I am sorry if my question sounds a little bit funny, but I am new, so I have some confusion.

Here my question is: when a mail is coming to the server, suppose for *user2* or many others, will spamassassin check the mails for ham/spam with the *default* database which is by default set to *user1*? Or will it check only the mails of *user1*? I am a little bit confused here.

As I understand it, individual user_prefs will supersede the value of the global parameter settings. So does this concept apply to the bayes database also? If yes, then the default bayes database, whatever it learned (spam + ham) for default user *user1*, will not work for the other users - is this so?

I simply want the default path where I see spamassassin updating/working to be applicable for all the users.

Suggestions/advice are really appreciated.
Thanks again
-Tinni




Kris Deugau <[EMAIL PROTECTED]> wrote:
> Please post messages in plaintext in the future. Thanks.
>
> Tinni wrote:
> > I am little bit confused of *sa-learn*. I have installed SA 3.02. I
> > did not set any bayes path in local.cf . When i am checking with
> >
> > #spamassassin -lint -D
> >
> > it is showing a path as
> [snip]
> > debug: bayes: 7103 tie-ing to DB file R/O
> > /home/sites/www.domain.org/users//.spamassassin/bayes_toks
> > debug: bayes: 7103 tie-ing to DB file R/O
> > /home/sites/www.domain.org/users//.spamassassin/bayes_seen
>
> OK, looks good. SA puts preferences and AWL data and Bayes data files
> in ~/.spamassassin/ by default.
>
> > I am executing the *sa-learn* as root, So do you think that the
> > central database for bayes is the above path?
>
> No, if you run sa-learn as root it will, like any other default SA call,
> put Bayes data in ~/.spamassassin/. In particular, it will create
> /root/.spamassassin/bayes_seen and /root/.spamassassin/bayes_toks.
>
> > Also i am seeing that
> > the individual users's bayes database also updated. but i am not
> > allowing ANYBODY to execute the *sa-learn*.
> > - Though no user is executing the *sa-learn* then how every
> > userid bayes database is being updated ? (i am telling only
> > seeing the time stamp)
>
> This is due to SpamAssassin's autolearning capability; by default
> messages scoring under 0.1 or over 12 (IIRC, check the documentation)
> will get autolearned as ham or spam respectively. Each user's
> autolearned Bayes data will be put in the appropriate files in
> ~user/.spamassassin/.
>
> > I want that the mail
> > will be filtered through the *central database* only.
> > - Do i need to mention the path of bayes db in the local.cf?
>
> If you want a single, global Bayes database, you **MUST** set bayes_path
> in your configuration.
>
> For instance, on one of the systems I administer, I have the following
> in my local.cf to set up SA's Bayes subsystem:
>
> use_bayes 1
> bayes_auto_learn 1
> bayes_auto_learn_threshold_nonspam -0.01
> bayes_learn_to_journal 1
> bayes_expiry_max_db_size 100
> bayes_auto_expire 0
> bayes_path /var/SpamAssassin/bayes
> bayes_file_mode 0777
>
> I've explicitly set a number of options to their defaults as well, but
> this provides me with a single, global Bayes database, accessible and
> autolearn-able for all users, with a larger number of tokens than the
> default. Check the Mail::SpamAssassin::Conf manpage for details on what
> these options do. Note that some of them may have changed for 3.x; this
> is a working 2.64 install.
>
> -kgd
> --
> Get your mouse off of there! You don't know where that email has been!


FORGED YAHOO RCVD

2005-02-15 Thread Steven Stern
This is the header from a message I sent myself from my Yahoo account. It leads
to two questions:

1. How reliable is the "forged" test?
2. Is there a test for verifying domain keys?

Return-Path: <[EMAIL PROTECTED]>
Received: from web90102.mail.scd.yahoo.com (web90102.mail.scd.yahoo.com
[66.218.94.73])
by ciscy.sterndata.com (8.13.1/8.13.1) with SMTP id j1FD4WQq000538
for <[EMAIL PROTECTED]>; Tue, 15 Feb 2005 07:04:33 -0600
Received: (qmail 73527 invoked by uid 60001); 15 Feb 2005 12:57:50 -
Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws;
  s=s1024; d=yahoo.com;

b=K9rWEvn/iHj0/WO0JfzuDWsuLiNZMlApMytnZlcTU6pj5Qw/ix262PAF6Ch162pWZK4q3M/fYiPS3JOHjjgNaI5en/ejqDKlu0njMzf2rEILE8VLX4O9ss1LC+bH1o+E9C53Sx3IQnSYw5ThpupthffEAaNa1lG417tm0PCzDeY=
;
Message-ID: <[EMAIL PROTECTED]>
Received: from [66.167.178.157] by web90102.mail.scd.yahoo.com via HTTP; Tue,
15 Feb 2005 04:57:49 PST
Date: Tue, 15 Feb 2005 04:57:49 -0800 (PST)
From: Steven Stern <[EMAIL PROTECTED]>
Subject: test from yahoo
To: [EMAIL PROTECTED]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Greylist: Delayed for 00:06:42 by milter-greylist-2.0b2 (ciscy.sterndata.com
[192.168.123.175]); Tue, 15 Feb 2005 07:04:33 -0600 (CST)
X-Virus-Scanned: ClamAV version 0.83, clamav-milter version 0.83 on
ciscy.sterndata.com
X-Virus-Status: Clean
X-Spam-Status: No, score=0.1 required=5.0 tests=BAYES_00,FORGED_YAHOO_RCVD 
autolearn=no version=3.0.1
X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on ciscy.sterndata.com
Status: O
X-UID: 14487
Content-Length: 126
X-Keywords:


-- 
  Steve 
   


little tool for detecting & reanimate hanged Spamd

2005-02-15 Thread Eugene Kurmanin
Hello, all.

This may be useful for someone: SpamdMon, a little tool written in plain C
for Linux. It works fine for me on my busy production servers, which use
the daemonized version of SpamAssassin. As you know, SA sometimes hangs.

http://user.rol.ru/~kurmanin/projects/

Feedback is welcome :)

-- 
Kind regards,
Eugene Kurmanin
mailto:[EMAIL PROTECTED]



Using external tests

2005-02-15 Thread Manuel Schmitt (manitu)
Hello all,
I am searching for a way to use an external test in Spamassassin. E.g. I 
want to have a simple bash script which is called by SA every time a 
mail is checked. This external program returns an return code which SA 
uses for testing purposes.

Perhaps this is very simple by having a simple conf directive, but I did 
not find anything in the docs or faq :(

Regards,
Manuel
--

Manuel Schmitt
- Geschäftsführer -
manitu  [EMAIL PROTECTED]
Welvertstraße 2 http://www.manitu.de
D-66606 St. WendelTelefon: +49-(0)-6851-99808-20
Germany   Telefax: +49-(0)-6851-99808-29
  PGP-Key-ID: 0x3E486E93
Unser Impressum finden Sie unter http://www.manitu.de/impressum


Re: [OT] Spam Quarantine

2005-02-15 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Thomas Kinghorn [MTNNS -Rosebank] wrote:
| Maia seems extremely cool, just wish there were some screenshots to look
| at.
| Definetely seems like a good way to go considering the platforms I am
| currently running on.
Maia was featured in the December issue of Linux Journal, which included
some screenshots.  The article (with screenshots) is available online at
.  The development version
looks a bit different, however, now that it's based on Smarty templates
for user-selectable themes/skins.
- --
Robert LeBlanc <[EMAIL PROTECTED]>
Renaissoft, Inc.
Maia Mailguard 
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.6 (GNU/Linux)
iD8DBQFCEc2fGmqOER2NHewRAlRfAJ9qoVxmovWQ5ZHZGYgLT5rlAWu9hQCgruPt
1daMY886slM056IF65jXLL0=
=Gb4D
-END PGP SIGNATURE-


RE: [OT] Spam Quarantine

2005-02-15 Thread Thomas Kinghorn [MTNNS -Rosebank]
Hi All.

Thanks for the responses.

Maia seems extremely cool, just wish there were some screenshots to look
at.
Definetely seems like a good way to go considering the platforms I am
currently running on. 

Thanks again

Tom

-Original Message-
From: Robert LeBlanc [mailto:[EMAIL PROTECTED] 
Sent: 15 February 2005 11:45 AM
To: Thomas Kinghorn [MTNNS -Rosebank]
Cc: [EMAIL PROTECTED]
Subject: Re: [OT] Spam Quarantine

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Thomas Kinghorn [MTNNS -Rosebank] wrote:
| Hi List.
|
| Sorry to post this here but I am stumped as where to look.
|
| I am looking for webmail software which can be a frontend to A 
| catchall for spam.
|
| Basically I would like to login via the web and release messages.
| Also, a once-a-day digest would be great.

Take a look at Maia Mailguard , which is
based on amavisd-new and SpamAssassin, and uses PHP and Perl scripts.

- --
Robert LeBlanc <[EMAIL PROTECTED]>
Renaissoft, Inc.
Maia Mailguard 

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFCEcSFGmqOER2NHewRAlJfAJ916eQF7MVkiYkAPSSOTytlyHZVPgCdHtQK
3qzppncuFf0CigcF0fhAot0=
=UYMr
-END PGP SIGNATURE-


Re: [OT] Spam Quarantine

2005-02-15 Thread Martin Hepworth
Good, two projects for him to evaluate. Should give Thomas a nice choice.
--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300
Robert LeBlanc wrote:
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Thomas Kinghorn [MTNNS -Rosebank] wrote:
| Hi List.
|
| Sorry to post this here but I am stumped as where to look.
|
| I am looking for webmail software which can be a frontend to
| A catchall for spam.
|
| Basically I would like to login via the web and release messages.
| Also, a once-a-day digest would be great.
Take a look at Maia Mailguard , which is
based on amavisd-new and SpamAssassin, and uses PHP and Perl scripts.
- --
Robert LeBlanc <[EMAIL PROTECTED]>
Renaissoft, Inc.
Maia Mailguard 
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.6 (GNU/Linux)
iD8DBQFCEcSFGmqOER2NHewRAlJfAJ916eQF7MVkiYkAPSSOTytlyHZVPgCdHtQK
3qzppncuFf0CigcF0fhAot0=
=UYMr
-END PGP SIGNATURE-
**
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.
This footnote confirms that this email message has been swept
for the presence of computer viruses and is believed to be clean.
**


Re: [OT] Spam Quarantine

2005-02-15 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Thomas Kinghorn [MTNNS -Rosebank] wrote:
| Hi List.
|
| Sorry to post this here but I am stumped as where to look.
|
| I am looking for webmail software which can be a frontend to
| A catchall for spam.
|
| Basically I would like to login via the web and release messages.
| Also, a once-a-day digest would be great.
Take a look at Maia Mailguard , which is
based on amavisd-new and SpamAssassin, and uses PHP and Perl scripts.
- --
Robert LeBlanc <[EMAIL PROTECTED]>
Renaissoft, Inc.
Maia Mailguard 
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.6 (GNU/Linux)
iD8DBQFCEcSFGmqOER2NHewRAlJfAJ916eQF7MVkiYkAPSSOTytlyHZVPgCdHtQK
3qzppncuFf0CigcF0fhAot0=
=UYMr
-END PGP SIGNATURE-


Re: [OT] Spam Quarantine

2005-02-15 Thread Martin Hepworth
Thomas,

Being a MailScanner user myself, I'd say use that and a couple of little 
extras called MailWatch and Quarantinereport.

Head over to the MailScanner list and we'll give you all the details :-)

--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300

Thomas Kinghorn [MTNNS -Rosebank] wrote:
> Hi List.
>
> Sorry to post this here but I am stumped as where to look.
>
> I am looking for webmail software which can be a frontend to
> A catchall for spam.
>
> Basically I would like to login via the web and release messages.
> Also, a once-a-day digest would be great.
>
> Thanks
>
> Regards,
> Tom



Re: [OT] Spam Quarantine

2005-02-15 Thread Evan Platt
At 12:07 AM 2/15/2005, you wrote:
> Hi List.
>
> Sorry to post this here but I am stumped as where to look.
>
> I am looking for webmail software which can be a frontend to
> A catchall for spam.
>
> Basically I would like to login via the web and release messages.
> Also, a once-a-day digest would be great.
>
> Thanks

Unless I'm misunderstanding your question... Almost any webmail interface 
that supports folders. Have spam moved to a folder, say "POSSIBLE SPAM" 
then go from there. Almost every freeware webmail interface I've seen 
supports that - squirrelmail comes to mind.

Evan 



[OT] Spam Quarantine

2005-02-15 Thread Thomas Kinghorn [MTNNS -Rosebank]
Hi List.

Sorry to post this here but I am stumped as where to look.

I am looking for webmail software which can be a frontend to
A catchall for spam.

Basically I would like to login via the web and release messages.
Also, a once-a-day digest would be great.

Thanks

Regards,
Tom


Doesn't work with non local accounts

2005-02-15 Thread Andrew Afliatunov
Sorry, mistake, it should be
"3000 spam letters were filtered daily."
--
Andrew.



Doesn't work with non local accounts

2005-02-15 Thread Andrew Afliatunov
Hello!
I use spamassassin-3.01 in site-wide mode (spamd+spamc) on a Linux 
Slackware-9.1 mail server.
Everything worked just fine - about 300 spam letters were filtered daily. 
But then I made the system look up mail users in an LDAP database, and 
removed the accounts from the Linux system.
Now spamc can't check mail for those users. In procmail.log I see:
--
getpwuid() failed: No such file or directory
procmail: Program failure (71) of "/usr/bin/spamc"
procmail: Rescue of unfiltered data succeeded
--
And users get tons of spam :(.
How can I make spamassassin work with non-local accounts?

My /etc/procmailrc is:
--
DROPPRIVS=yes
LOGDIR=/var/log
SPOOLDIR=/var/spool/procmail
LOGFILE=${LOGDIR}/procmail.log
:0fw
* < 20
| /usr/bin/spamc
:0:
* ^X-Spam-Level: \*\*\*\*\*\*\*\*
$SPOOLDIR/spam
:0
* ^^rom[ ]
{
 LOG="*** Dropped F off From_ header! Fixing up. "
   :0 fhw
   | sed -e '1s/^/F/'
}
--
--
Andrew.



Re: Less spam blocked with 3.02 - AWL-related?

2005-02-15 Thread Johann Spies
Thanks!  I am learning every day.

Johann
-- 
Johann Spies  Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

 "For by him were all things created, that are in  
  heaven, and that are in earth, visible and invisible, 
  whether they be thrones, or dominions, or  
  principalities, or powers; all things were created by 
  him, and for him." Colossians 1:16 


Re: Query

2005-02-15 Thread Robert Menschel
Hello Atif,

Monday, February 14, 2005, 5:29:34 PM, you wrote:

AM> Hi

AM> I am currently conducting an experiment to test the accuracy of spamassassin.
AM> Having conducted one experiment, I need to re-run it. I just wanted to
AM> know how to delete the data that
AM> has been stored by the bayes component of the filter. Would you know how to
AM> do this? (I am using spamassassin in a windows environment)

Simply delete the Bayes database files (all of them). The system will
recreate them from scratch when you rerun your test.

AM> Also, When training the filter is it ok to use the sa-learn function for
AM> this or pipe the messages through manually?

I'm not sure how you would "pipe the messages through manually."  I do
all my sa-learn by collecting my ham into one mailbox file, my spam
into a second mailbox file, and then executing sa-learn once against
each mailbox, telling it which type of mail is in each.
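
The two-mailbox routine described above can be scripted. A sketch, assuming your sa-learn accepts the `--ham`, `--spam` and `--mbox` switches (check `sa-learn --help` on your install). The `SA_LEARN` variable, the function name and the file names are invented here so that the commands can be previewed without touching the Bayes database:

```sh
#!/bin/sh
# Learn one mbox of ham and one mbox of spam.
# SA_LEARN defaults to the real command; set SA_LEARN="echo sa-learn"
# to print the commands instead of running them.
train_bayes() {
    ham_mbox=$1
    spam_mbox=$2
    ${SA_LEARN:-sa-learn} --ham --mbox "$ham_mbox" &&
    ${SA_LEARN:-sa-learn} --spam --mbox "$spam_mbox"
}

# Example (dry run):
#   SA_LEARN="echo sa-learn" train_bayes ham.mbox spam.mbox
```

Running both passes from one function keeps the ham and spam training together, which makes it harder to feed only spam by accident.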

Bob Menschel





Re[2]: Care and feeding instructions for SpamAssassin?

2005-02-15 Thread Robert Menschel
Hello FH,

Monday, February 14, 2005, 11:30:14 AM, you wrote:

F> Thanks for the help/pointers :D So where are these other FAQs?
http://wiki.apache.org/spamassassin/FrequentlyAskedQuestions

F> I think I need a good crash course on how/where to setup custom rules
http://wiki.apache.org/spamassassin/WritingRules

F> and to make sure the learning process is doing what it is supposed to be
F> doing.
http://wiki.apache.org/spamassassin/AutolearningNotWorking

F> For example the dates on the /var/spool/spamassassin files
F> (journal/seen/toks)
Good.

F> seem to be constantly changing but the /usr/local/share/spamassassin files
F> (what I think are the rules files) haven't changed since I installed them.
F> I would have thought after running sa-learn they would have changed
F> a bit. Does that sound right?
No, sa-learn updates the Bayes files, not the rules files.  The rules
files will not change unless you change them (and you should generally
NOT change the original installation rules files -- that's reserved
for installing new versions).

F> I have to say I'm getting a little frustrated w/ the
F> process/program.  For example no matter how many times I dump spam
F> w/ "Tadalafil" into the sa-learn process (and these are the
F> messages I get not the forwarded messages I was talking about
F> earlier) it's still not marking new messages as spam :(

Based on the problems others have had recently, I wonder whether
you're a) feeding this spam into one Bayes database, but b) reading a
different Bayes database when testing new emails that arrive on your
system.

F> BTW since lint output seems to be a popular thing people ask about
F> here's what I get, in case there's something about the way I'm
F> running it that's not correct.  In particular are those ('require'
F> failed) messages something to be concerned about?

I don't see anything seriously wrong with your output (though there
are lines I couldn't say whether they're right or wrong).

Next time you get one of those spams that sneak through, run
> spamassassin -D < email > output 2> debug.out

where "email" is the file containing that email and nothing but. Then
the -D output together with the output file will tell us the specific
steps that are followed. We'll know whether SURBL is properly testing
URIs in the email, whether Bayes is corrupt, etc.

Those of us with custom rule sets will also be able to run your output
through our SA, and see if we're catching that spam because of custom
rules you don't have installed.

Make sure to attach the output file rather than cut/paste it --
cut/paste destroys information we need for that latter test.

Bob Menschel





Re: local rule, score is ignored

2005-02-15 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


jdow writes:
> From: "Vicki Brown" <[EMAIL PROTECTED]>
> 
> > We allow user rules. (Please don't argue with me about this. It's a very
> > small site and yes, we do trust our users.)
> >
> > The following are in my .spamassassin/user_prefs
> >  header CF_SUB_UID       Subject =~ /vlb|Vicki/i
> >  score CF_SUB_UID        4.0
> >  describe CF_SUB_UID     Subject: contains my ID
> >  header CF_NOT_FOR_ME    ToCc !~ /[EMAIL PROTECTED]/
> >  score CF_NOT_FOR_ME     3.0
> >  describe CF_NOT_FOR_ME  Neither To nor Cc me
> >
> > Here are the headers from a piece of spam
> >
> > X-Spam-Flag: YES
> > X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on cfcl.com
> > X-Spam-Level: *
> > X-Spam-Status: Yes, score=5.0 required=0.5 tests=CF_NOT_FOR_ME,CF_SUB_UID,
> > FORGED_RCVD_HELO autolearn=no version=3.0.2
> > X-Spam-Report:
> > *  1.0 CF_SUB_UID Subject: contains my ID
> > *  1.0 CF_NOT_FOR_ME Neither To nor Cc me
> > *  3.0 FORGED_RCVD_HELO Received: contains a forged HELO
> >
> >
> > What am I doing wrong? My tests are running. But why are my tests scoring
> > only 1.0 and not the score I specify?
> >
> > Does anyone see something really lame I'm missing?
> 
> Unfortunately not. There is a bug in 3.0.x through the current 3.0.2
> which causes this effect. The work arounds are ugly++ to say the least.
> They mostly consume processor cycles.
> 
> What happens is that the first time a given spamd child runs it works
> right. Each time afterwards that it runs it fails to pick up local
> scores even though it picks up the local rules. You might make it work
> by limiting each child to running only once.

this is fixed in 3.1.0, right?

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCEYV8MJF5cimLx9ARAknrAJ0QgIkH0bkeQ+iUS305sUqMr0wn/QCcD51Y
aT/m/9eeimM2VWgNOiMClPo=
=Fv3D
-END PGP SIGNATURE-



Re: local rule, score is ignored

2005-02-15 Thread jdow
From: "Vicki Brown" <[EMAIL PROTECTED]>

> We allow user rules. (Please don't argue with me about this. It's a very
> small site and yes, we do trust our users.)
>
> The following are in my .spamassassin/user_prefs
>  header CF_SUB_UID       Subject =~ /vlb|Vicki/i
>  score CF_SUB_UID        4.0
>  describe CF_SUB_UID     Subject: contains my ID
>  header CF_NOT_FOR_ME    ToCc !~ /[EMAIL PROTECTED]/
>  score CF_NOT_FOR_ME     3.0
>  describe CF_NOT_FOR_ME  Neither To nor Cc me
>
> Here are the headers from a piece of spam
>
> X-Spam-Flag: YES
> X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on cfcl.com
> X-Spam-Level: *
> X-Spam-Status: Yes, score=5.0 required=0.5 tests=CF_NOT_FOR_ME,CF_SUB_UID,
> FORGED_RCVD_HELO autolearn=no version=3.0.2
> X-Spam-Report:
> *  1.0 CF_SUB_UID Subject: contains my ID
> *  1.0 CF_NOT_FOR_ME Neither To nor Cc me
> *  3.0 FORGED_RCVD_HELO Received: contains a forged HELO
>
>
> What am I doing wrong? My tests are running. But why are my tests scoring
> only 1.0 and not the score I specify?
>
> Does anyone see something really lame I'm missing?

Unfortunately not. There is a bug in 3.0.x through the current 3.0.2
which causes this effect. The workarounds are ugly++ to say the least.
They mostly consume processor cycles.

What happens is that the first time a given spamd child runs it works
right. Each time afterwards that it runs it fails to pick up local
scores even though it picks up the local rules. You might make it work
by limiting each child to running only once.
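A quick way to confirm you are hitting this is to compare the configured
scores with the ones actually applied. The following is only a sketch: the
sample data is copied from the user_prefs and X-Spam-Report lines quoted
above, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Configured scores, copied from the quoted user_prefs.
prefs='score CF_SUB_UID 4.0
score CF_NOT_FOR_ME 3.0'

# Applied scores, copied from the quoted X-Spam-Report header.
report='*  1.0 CF_SUB_UID Subject: contains my ID
*  1.0 CF_NOT_FOR_ME Neither To nor Cc me
*  3.0 FORGED_RCVD_HELO Received: contains a forged HELO'

# List every rule whose applied score differs from its configured score.
# Rules without a local "score" line (FORGED_RCVD_HELO here) are skipped.
mismatches=$(
  { printf '%s\n' "$prefs"; printf '%s\n' "$report"; } |
  awk '
    $1 == "score" { want[$2] = $3; next }
    $1 == "*" && ($3 in want) && ($2 + 0 != want[$3] + 0) {
      printf "%s configured=%s applied=%s\n", $3, want[$3], $2
    }'
)
printf '%s\n' "$mismatches"
```

If both local rules show up in that list while the stock rules score
normally, the rules are loading and only the local scores are being dropped.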

The two solutions I have used here rely on special, and very special,
circumstances. The first solution is to simply use "spamassassin"
rather than "spamc/spamd". The second solution, for a single-user
spamassassin setup, is to roll the user rules into the main rule
space and live with it that way. That's what I am doing at the
moment, because the would-be other user on the machine prefers the
2.64 install on a second box we have. (He needs some of the reporting
that 2.63 implements and 3.0.2 does not. I must admit I am not all
that impressed by 3.0.2 as compared to the well-tuned 2.64 on the
other machine. There is not a big enough improvement to really push
a switchover "for real." In some ways the 3.0 series seems like a
serious downgrade. But that's IMOAO and YMMV most assuredly applies.)
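
For anyone who wants to try the first solution, or the one-run-per-child
idea mentioned above, the command lines might look roughly like this. Treat
it as a sketch under assumptions: "-d -u spamd" is just a typical init
script invocation, and --max-conn-per-child should be checked against
"spamd --help" on your version before you rely on it. Nothing is executed;
the commands are only printed for review.

```shell
#!/bin/sh
# Sketch only: the commands are printed, not run.

# Option 1: bypass spamc/spamd and filter with the standalone script.
# In a procmail recipe this would take the place of the spamc call.
STANDALONE_CMD="spamassassin"

# Option 2: keep spamd but recycle every child after a single connection,
# so each message is handled by a freshly started child.
# Assumption: your spamd supports --max-conn-per-child.
SPAMD_CMD="spamd -d -u spamd --max-conn-per-child=1"

printf '%s\n' "$STANDALONE_CMD" "$SPAMD_CMD"
```

Either way you pay for it in processor cycles, since the rule set is
loaded far more often than with long-lived spamd children.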

{^_^}




local rule, score is ignored

2005-02-15 Thread Vicki Brown
We allow user rules. (Please don't argue with me about this. It's a very
small site and yes, we do trust our users.)

The following are in my .spamassassin/user_prefs
 header CF_SUB_UID   Subject =~ /vlb|Vicki/i
 score CF_SUB_UID    4.0
 describe CF_SUB_UID Subject: contains my ID
 header CF_NOT_FOR_ME    ToCc !~ /[EMAIL PROTECTED]/
 score CF_NOT_FOR_ME 3.0
 describe CF_NOT_FOR_ME  Neither To nor Cc me

Here are the headers from a piece of spam

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on cfcl.com
X-Spam-Level: *
X-Spam-Status: Yes, score=5.0 required=0.5 tests=CF_NOT_FOR_ME,CF_SUB_UID,
FORGED_RCVD_HELO autolearn=no version=3.0.2
X-Spam-Report:
*  1.0 CF_SUB_UID Subject: contains my ID
*  1.0 CF_NOT_FOR_ME Neither To nor Cc me
*  3.0 FORGED_RCVD_HELO Received: contains a forged HELO


What am I doing wrong? My tests are running. But why are my tests scoring
only 1.0 and not the score I specify?

Does anyone see something really lame I'm missing?
-- 
Vicki Brown ZZZJourneyman Sourceror:
SF Bay Area, CAzz  |\ _,,,---,,_  Scripts & Philtres
http://www.cfcl.com zz /,`.-'`'-.  ;-;;,_Code, Doc, Process, QA
http://cfcl.com/vlb   |,4-  ) )-,_. ,\ ( `'-'Perl, Unix, Mac OS X, WWW
 '---''(_/--'  `-'\_)  ___


Query

2005-02-15 Thread Atif Munir
Hi
I am currently conducting an experiment to test the accuracy of SpamAssassin.
Having conducted one experiment, I need to re-run it. I just wanted to
know how to delete the data that
has been stored by the Bayes component of the filter. Would you know how to
do this? (I am using SpamAssassin in a Windows environment.)

Also, when training the filter, is it OK to use the sa-learn function for
this, or should I pipe the messages through manually?

Can you please respond directly to my email account - [EMAIL PROTECTED]
Thanks
Atif