Re: 3.0.2 and SARE

2005-03-31 Thread Loren Wilton
 I am thinking of downloading rules from SARE.
 However, I am told that some or many of the
 rules have already been incorporated into 3.0.2

 Can someone recommend the best approach to
 avoid duplicates ?

Sure.  Read the docs on the SARE rules page relating to each ruleset.  We
document which ones you should and shouldn't use with various releases.

Loren



Re: Phishing attempts getting through.

2005-03-31 Thread Loren Wilton
 Can someone expand on the ClamAV detecting phishing attempts. Or direct
 me some where?

Pick up some of the SARE rulesets.  I think spoof or fraud is the one that
contains an assortment of phishhooks.  Won't get 'em all, but will sure cut
down on the more common ones.

Loren



Re: increasing children

2005-03-31 Thread Steve Lake

From 'man spamd':
-m num, --max-children=num Allow maximum num children
Just set that as desired in your script that starts up the spamd daemon.
HAHAHA!!  OMG, I was looking at the wrong man file.  ^_^;;  Thanks 
for the help.



Re: my girlfriend is getting ticked :)

2005-03-31 Thread Steven Dickenson
Matthew Lenz wrote:
X-Spam-Status: No, score=4.1 required=5.0 tests=BAYES_99,HTML_80_90,
HTML_FONT_BIG,HTML_MESSAGE,HTML_TITLE_EMPTY,MIME_HTML_ONLY,
MSGID_FROM_MTA_ID autolearn=no version=3.0.2
I see your false negative scored 99% on bayes.  The BAYES_99 rule has a 
much lower score in v3 than it did in v2.  My users started bitching 
after the upgrade to v3 because all of a sudden spam was starting to get 
through.  Tweaking up the bayes scores a bit helped significantly.

Steven


Re: my girlfriend is getting ticked :)

2005-03-31 Thread AltGrendel
Mike Jackson wrote:
Your bayes database looked to be reasonably trained.  The 
false-negative was labeled 99% spam by Bayes.

I don't see any RBL checks, which might have made the difference on 
this one, if it's already been seen and flagged.  Do you have 
Net::DNS installed and the RBL tests enabled?  What happens if you 
feed it through spamassassin with the -D flag?

In my experience, it's more efficient to let the MTA handle the RBL 
checks instead of Spamassassin. I can't remember what MTA the OP was 
using, but it's trivial to set them up in Sendmail. On my employer's 
boxes, I use the spamhaus.org lists, but on my personal box (where I 
can be much more aggressive) I use a few of the rfc-ignorant.org lists 
and ws.surbl.org. The spamhaus lists are checked first, and they're 
highly effective.
Agreed, I set up my Postfix to do the checks and it's made a world of 
difference. The OP never said what OS/MTA is being used.


Re: my girlfriend is getting ticked :)

2005-03-31 Thread Jeff Chan
On Wednesday, March 30, 2005, 2:20:17 PM, Mike Jackson wrote:
 Your bayes database looked to be reasonably trained.  The false-negative 
 was labeled 99% spam by Bayes.

 I don't see any RBL checks, which might have made the difference on this 
 one, if it's already been seen and flagged.  Do you have Net::DNS 
 installed and the RBL tests enabled?  What happens if you feed it through 
 spamassassin with the -D flag?

 In my experience, it's more efficient to let the MTA handle the RBL checks 
 instead of Spamassassin. I can't remember what MTA the OP was using, but 
 it's trivial to set them up in Sendmail. On my employer's boxes, I use the 
 spamhaus.org lists, but on my personal box (where I can be much more 
 aggressive)

I use sbl.spamhaus.org and list.dsbl.org on most of the MTAs I
have visibility on.

 I use a few of the rfc-ignorant.org lists and ws.surbl.org. The 
 spamhaus lists are checked first, and they're highly effective. 

Hmm, ws.surbl.org shouldn't be used as a regular RBL.  It has
very few IP addresses, and most of those are probably web
servers.  So it won't match most of the IP address RBL checks a
plain old MTA would do.  SURBLs are meant to match message body
URIs, not mail senders.
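
To make the difference concrete, here is a rough Python sketch of the two kinds of DNS lookup. It assumes the usual DNSBL convention (reversed IPv4 octets prepended to the zone) for IP lists and a plain domain-plus-zone name for URI lists; the function names are made up for illustration and are not part of SpamAssassin or any MTA.

```python
import ipaddress


def rbl_query_name(ip: str, zone: str) -> str:
    """DNS name an MTA would look up for a connecting IPv4 address.

    Assumes the common DNSBL convention: octets reversed, then the zone.
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def uri_query_name(domain: str, zone: str) -> str:
    """DNS name looked up for a domain extracted from a message-body URI."""
    return domain.lower().rstrip(".") + "." + zone
```

An MTA doing plain RBL checks only ever builds the first kind of name, from the sender's IP, which is why a list populated with URI domains rarely matches there.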

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/



Re: my girlfriend is getting ticked :)

2005-03-31 Thread Matthew Lenz
- Original Message - 
From: AltGrendel
To: users@spamassassin.apache.org
Sent: Wednesday, March 30, 2005 8:50 PM
Subject: Re: my girlfriend is getting ticked :)


Mike Jackson wrote:
Your bayes database looked to be reasonably trained.  The false-negative 
was labeled 99% spam by Bayes.

I don't see any RBL checks, which might have made the difference on this 
one, if it's already been seen and flagged.  Do you have Net::DNS 
installed and the RBL tests enabled?  What happens if you feed it 
through spamassassin with the -D flag?

In my experience, it's more efficient to let the MTA handle the RBL 
checks instead of Spamassassin. I can't remember what MTA the OP was 
using, but it's trivial to set them up in Sendmail. On my employer's 
boxes, I use the spamhaus.org lists, but on my personal box (where I can 
be much more aggressive) I use a few of the rfc-ignorant.org lists and 
ws.surbl.org. The spamhaus lists are checked first, and they're highly 
effective.
Agreed, I setup my postfix to do the checks and it's made a world of 
difference. The OP never said what OS/MTA is being used.

actually i did in my first post
I'm using 3.0.2 on a Debian woody box.  It's from www.backports.org (great 
site)



Re: my girlfriend is getting ticked :)

2005-03-31 Thread Jeff Chan
On Wednesday, March 30, 2005, 2:21:01 PM, Matthew Lenz wrote:
 I just installed backports perl-libnet-dns (.48, hope that is new
 enough .49 is the newest).  Is there anywhere I can check to see if
 'network tests' (what the SURBL says needs to be enabled) are enabled?

Set your trust path correctly:

(quoting Matt Kettler:)
 Please see the Wiki:
 http://wiki.apache.org/spamassassin/TrustPath/
 
 and look up trusted_networks in man Mail::SpamAssassin::Conf

And enable network tests:

  http://www.surbl.org/faq.html#nettest

And things should work much better.

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/



Re: my girlfriend is getting ticked :)

2005-03-31 Thread AltGrendel
Matthew Lenz wrote:
- Original Message - From: AltGrendel
To: users@spamassassin.apache.org
Sent: Wednesday, March 30, 2005 8:50 PM
Subject: Re: my girlfriend is getting ticked :)

Mike Jackson wrote:
Your bayes database looked to be reasonably trained.  The 
false-negative was labeled 99% spam by Bayes.

I don't see any RBL checks, which might have made the difference on 
this one, if it's already been seen and flagged.  Do you have 
Net::DNS installed and the RBL tests enabled?  What happens if you 
feed it through spamassassin with the -D flag?

In my experience, it's more efficient to let the MTA handle the RBL 
checks instead of Spamassassin. I can't remember what MTA the OP was 
using, but it's trivial to set them up in Sendmail. On my employer's 
boxes, I use the spamhaus.org lists, but on my personal box (where I 
can be much more aggressive) I use a few of the rfc-ignorant.org 
lists and ws.surbl.org. The spamhaus lists are checked first, and 
they're highly effective.

Agreed, I setup my postfix to do the checks and it's made a world of 
difference. The OP never said what OS/MTA is being used.

actually i did in my first post
I'm using 3.0.2 on a debian woody box.  Its from www.backports.org 
(great site)

Ok, so you're using Spamassassin 3.0.2 on Debian. Are you using 
Sendmail, qmail, courier, or postfix? I honestly don't know what Debian 
uses as a default mailserver.


RE: my girlfriend is getting ticked :)

2005-03-31 Thread Michael Bellears

  I'm using 3.0.2 on a debian woody box.  Its from www.backports.org 
  (great site)
 
 Ok, so you're using Spamassassin 3.0.2 on Debian. Are you 
 using Sendmail, qmail, courier, or postfix? I honestly don't 
 know that Debian uses as a default mailserver.

Exim.



Re: my girlfriend is getting ticked :)

2005-03-31 Thread AltGrendel
Michael Bellears wrote:
I'm using 3.0.2 on a debian woody box.  Its from www.backports.org 
(great site)

 

Ok, so you're using Spamassassin 3.0.2 on Debian. Are you 
using Sendmail, qmail, courier, or postfix? I honestly don't 
know that Debian uses as a default mailserver.
   

Exim.
 

Ok, so you might want to check out this: 
http://www.exim.org/howto/rbl.html if you haven't. I've been working 
with a postfix/amavisd-new/spamassassin/clamav setup, so I probably 
wouldn't be much help. Since I started doing RBL checks at the MTA (Exim 
for you) level I've seen a radical reduction in spam. It can save on 
processing too since the spam never gets past the received stage. Be 
careful though, some lists are much less forgiving than others and can 
block legit traffic.

Good luck.


Rule Design Benchmark/Resource Question

2005-03-31 Thread Rocky Olsen

Before i pull my hair out doing bench/resource test, i was wondering if
anyone out there knew if there was much of a speed/resource usage
difference between the following way of writing the same rule.


Method A:
body rule_a  /(?:feh|meh|bleh)/i

vs.

Method B:

body __rule_a  /(?:feh)/i
body __rule_b  /(?:meh)/i
body __rule_c  /(?:bleh)/i

meta rule_d  (__rule_a || __rule_b || __rule_c)


There probably isn't much difference using just 3 rules, but i'm thinking
more along the lines of large(500+) lists and it isn't limited to just body
stuff.  So if anyone has some realworld benching/experience with what is
preferred or if the developers know which is faster for SA, i would love
the input.
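
For anyone who wants to measure rather than guess, a timing harness along these lines might be a starting point. This is only a sketch in Python: its `re` engine is not the Perl engine SpamAssassin uses, so the numbers will not transfer directly, and the word list, function names and repeat count are all placeholders.

```python
import re
import timeit

WORDS = ["feh", "meh", "bleh"]

# Method A: a single rule with one alternation.
combined = re.compile("(?:" + "|".join(map(re.escape, WORDS)) + ")", re.IGNORECASE)

# Method B: one sub-rule per word, OR-ed together the way a meta rule would.
separate = [re.compile(re.escape(word), re.IGNORECASE) for word in WORDS]


def method_a(text):
    """True if the single alternation matches anywhere in text."""
    return combined.search(text) is not None


def method_b(text):
    """True if any of the separate patterns matches anywhere in text."""
    return any(pattern.search(text) for pattern in separate)


def benchmark(text, repeat=1000):
    """Return wall-clock seconds spent by each method over `repeat` runs."""
    return {
        "A": timeit.timeit(lambda: method_a(text), number=repeat),
        "B": timeit.timeit(lambda: method_b(text), number=repeat),
    }
```

Feeding `benchmark()` a representative message body and a 500-word list would give a first impression; the same comparison would then need repeating against SA itself.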

-Rocky
-- 
__


what's with today, today?

Email:  [EMAIL PROTECTED]
PGP:http://rocky.mindphone.org/rocky_mindphone.org.gpg




Re: Anyway to have SA drop high spam tag other?

2005-03-31 Thread Robert Menschel
Hello Bill,

Wednesday, March 30, 2005, 8:15:05 AM, you wrote:

B> I'm running Spamassassin 3.0.2 on Linux in front of Oracle Collaboration
B> Suite. I am using
B> proxsmtp (http://memberwebs.com/nielsen/software/proxsmtp/) to scan
B> mail and then pass it along to oracle.

B> My question is...

B> I know that you can have spamassassin exit with a non-zero code if it detects
B> spam by using the 'spamassassin -e' option.  Does anyone know if it is possible
B> to have SA tag spam and not exit as usual, but exit with a non-zero code if, say,
B> the score is over 10?

B> The other options I came up with is to either write script to check the level,
B> or have SA run twice...once to tag and once to drop.

Or, since SA is only a filter, have the SA output feed a script which
a) copies input to output unchanged, and
b) interprets the score from the X-Spam-Status header, and then exits
with int(score) (0 if negative).
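
A minimal sketch of such a filter in Python might look like the following. It assumes the 3.0.x header layout quoted elsewhere in this digest ("X-Spam-Status: No, score=4.1 required=5.0 ..."), and it clamps the result to 255 because a process exit status cannot exceed that; the function names are invented for the example.

```python
import re
import sys

# Matches the score on the first line of the X-Spam-Status header.
STATUS_RE = re.compile(
    r"^X-Spam-Status:.*?\bscore=(-?\d+(?:\.\d+)?)",
    re.IGNORECASE | re.MULTILINE,
)


def exit_code_for(message: str) -> int:
    """Return int(score) from the headers, 0 if negative or missing."""
    # Only look at the header block, never at the body.
    headers = re.split(r"\r?\n\r?\n", message, maxsplit=1)[0]
    match = STATUS_RE.search(headers)
    if match is None:
        return 0
    score = float(match.group(1))
    return max(0, min(int(score), 255))


def main() -> int:
    """Copy stdin to stdout unchanged and return the exit code to use."""
    message = sys.stdin.read()
    sys.stdout.write(message)
    return exit_code_for(message)
```

Ending the script with `sys.exit(main())` and piping the SA output through it would let the caller drop anything whose exit status is 10 or more.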

Bob Menschel





can Pyzor run localy?

2005-03-31 Thread Alan Shine

Hi,
I have a few questions regarding the benefit/use of SA features.

1. Can Pyzord run locally as SURBL does with rbldnsd (check the message with a local repository, not with the Pyzor web servers)?


2. I would like to activate more features in SA (I currently use only SARE rules).
We are considering SURBL, DCC and Pyzor.
My question is: what are the preferable features that I can add to SA that will result in better spam identification and that will cost the least in performance time?

Thanks a lot.
Alan

Re: Anyway to have SA drop high spam tag other?

2005-03-31 Thread John Andersen
On Wednesday 30 March 2005 08:02 pm, Robert Menschel wrote:
 Hello Bill,

 Wednesday, March 30, 2005, 8:15:05 AM, you wrote:

 B> I'm running Spamassassin 3.0.2 on Linux in front of Oracle Collaboration
 B> Suite. I am using
 B> proxsmtp (http://memberwebs.com/nielsen/software/proxsmtp/) to scan
 B> mail and then pass it along to oracle.

 B> My question is...

 B> I know that you can have spamassassin exit with a non-zero code if it
 B> detects spam by using the 'spamassassin -e' option.  Does anyone know if
 B> it is possible to have SA tag spam and not exit as usual, but exit with
 B> a non-zero code if, say, the score is over 10?

 B> The other options I came up with is to either write script to check the
 B> level, or have SA run twice...once to tag and once to drop.

 Or, since SA is only a filter, have the SA output feed a script which
 a) copies input to output unchanged, and
 b) interprets the score from the X-Spam-Status header, and then exits
 with int(score) (0 if negative).

 Bob Menschel

SA doesn't drop mail. It simply tags it.
If you want to dev null mail, that's what procmail is for.

-- 
_
John Andersen




Re: Bigevil file is gone.

2005-03-31 Thread Martin Hepworth
Hurray and... can't wait
--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300
Chris Santerre wrote:
Too much traffic used for a file no longer updated. BigEvil has been
removed. I shall replace it with our newest ruleset soon. Its a real corker
;) 

Chris Santerre 
System Admin and SARE/SURBL Ninja
http://www.rulesemporium.com
http://www.surbl.org
'It is not the strongest of the species that survives,
not the most intelligent, but the one most responsive to change.'
Charles Darwin 


Getting around URIRBLs

2005-03-31 Thread John Wilcock
h3hLeo BreebaartttTiggletp:Graycat/SOGP/getSeyreniapoLena 
WilliamsrPatrick DersjantnoSimon WaldmandvSorchad.Guitar 
HuwcoThe Senior Wranglerm/h3
This looks like an effective way of getting round URIRBLs (though of 
course it requires the end user to cut and paste).

The rule below seems to catch the technique. Any suggestions for 
improving it or any other rules to suggest?

# 2005-03-31 new rule
rawbody  local_OBFU_HTTP  /(?!https?:\/\/)h(?:.+)?t(?:.+)?t(?:.+)?p(?:.+)?s?(?:.+)?:(?:.+)?\/(?:.+)?\/(?:.+)?/im
describe local_OBFU_HTTP  HTTP obfuscated with tags
score    local_OBFU_HTTP  1.0

John.
--
-- Over 2500 webcams from ski resorts around the world - www.snoweye.com
-- Translate your technical documents and web pages- www.tradoc.fr


Re: Anyway to have SA drop high spam tag other?

2005-03-31 Thread Menno van Bennekom
 I'm running Spamassassin 3.0.2 on Linux in front of Oracle Collaboration
 Suite.
  I am using proxsmtp(http://memberwebs.com/nielsen/software/proxsmtp/) to
 scan
 mail and then pass it along to oracle.

 My question is...

 I know that you can have spamassassin exit with a non-zero code if it
 detects
 spam by using the 'spamassasin -e' option.  Does anyone know if it is
 possible
 to have SA tag spam and not exit as usual, but exit with a non-zero code
 if say
 the score is over 10?

 The other options I came up with is to either write script to check the
 level,
 or have SA run twice...once to tag and once to drop.

 If anyone has any ideas that would be great, thanks in advance!

 --Bill

I don't know about proxsmtp, but it should be possible with amavisd that
calls spamassassin/clamd/etc.
You can decide there at what level (tag2_level) the spam gets marked in
the subject and at what level (kill_level) it is handled as spam.
So you could set tag2_level to 5 and kill_level to 10.
After reaching kill_level you can configure what action has to be taken,
pass the mail, bounce it, or discard it.
A disadvantage is that the quarantine/archive is triggered by the
kill_level so you won't have a spam-archive of the 5 to 10 spam-mails.
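
The two-threshold idea can be sketched in a few lines of Python. This is only an illustration of the decision, not amavisd code: the function name is made up, the defaults are the 5 and 10 from the example above, and treating a score exactly at a level as having reached it is an assumption.

```python
def classify(score: float, tag2_level: float = 5.0, kill_level: float = 10.0) -> str:
    """Decide what happens to a message with the given spam score."""
    if score >= kill_level:
        return "kill"  # handled as spam: pass, bounce or discard, as configured
    if score >= tag2_level:
        return "tag"   # delivered, but marked as spam in the subject
    return "pass"      # delivered untouched
```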
Menno van Bennekom




Re: Getting around URIRBLs

2005-03-31 Thread Rocky Olsen
i wrote something similar to this but instead of using .+, i used [^>]+,
supposedly a tad faster, iirc. also writing s?(?:.+)? as
(?:s(?:[^>]+)?)? should be slightly faster, because if it fails to match on
the 's' it won't move on to check for the stuff after it.
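
A quick way to try the tighter variant outside SA is a Python sketch like this one. It is not the rule from either post: it only allows complete tags (`<...>`) between the letters instead of arbitrary text, the names are invented, and Python's `re` is assumed to behave like Perl for these particular constructs.

```python
import re

# Zero or more complete tags, e.g. "<b>" or "<Some Name>".
TAGS = r"(?:<[^>]+>)*"

OBFU_HTTP = re.compile(
    r"(?!https?://)"            # ignore plain, unobfuscated URLs
    + TAGS.join("http")         # h, t, t, p with optional tags in between
    + TAGS
    + r"(?:s" + TAGS + r")?"    # only look for tags after 's' if 's' matched
    + r":" + TAGS + r"/" + TAGS + r"/",
    re.IGNORECASE,
)


def is_obfuscated(text: str) -> bool:
    """True if text contains http(s):// broken up by tags."""
    return OBFU_HTTP.search(text) is not None
```

Because only tags may sit between the letters, ordinary prose that happens to contain h, t, t, p, a colon and two slashes in that order does not fire, which the `.+` version would.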


-Rocky

On Thu, Mar 31, 2005 at 11:35:26AM +0200, John Wilcock wrote:
 h3hLeo BreebaartttTiggletp:Graycat/SOGP/getSeyreniapoLena 
 WilliamsrPatrick DersjantnoSimon WaldmandvSorchad.Guitar 
 HuwcoThe Senior Wranglerm/h3
 
 This looks like an effective way of getting round URIRBLs (though of 
 course it requires the end user to cut and paste).
 
 The rule below seems to catch the technique. Any suggestions for 
 improving it or any other rules to suggest?
 
 # 2005-03-31 new rule
 rawbody  local_OBFU_HTTP  /(?!https?:\/\/)h(?:.+)?t(?:.+)?t(?:.+)?p(?:.+)?s?(?:.+)?:(?:.+)?\/(?:.+)?\/(?:.+)?/im
 describe local_OBFU_HTTP  HTTP obfuscated with tags
 score    local_OBFU_HTTP  1.0
 
 John.
 
 -- 
 -- Over 2500 webcams from ski resorts around the world - www.snoweye.com
 -- Translate your technical documents and web pages- www.tradoc.fr
 

-- 
__


what's with today, today?

Email:  [EMAIL PROTECTED]
PGP:http://rocky.mindphone.org/rocky_mindphone.org.gpg




From dollars to pounds... and Nigeria to UK

2005-03-31 Thread lists
Nigerian scams are evolving...
I just received that one with only 3 rules matching (SA 2.6)...
BAYES_90 2.10, RCVD_IN_SBL 1.11, RCVD_IN_SORBS 0.10


Return-Path: [EMAIL PROTECTED]
Received: from vsmtp1.tin.it (vsmtp1.tin.it [212.216.176.141])
	by  (8.12.10/8.12.10) with ESMTP id j2V3snX8020382
	for ; Thu, 31 Mar 2005 05:54:49 +0200
Received: from ims1d.cp.tin.it (192.168.70.101) by vsmtp1.tin.it (7.0.027)
id 4238611B0060DD46; Thu, 31 Mar 2005 05:51:00 +0200
Received: from [192.168.70.183] by ims1d.cp.tin.it with HTTP; Thu, 31 
Mar 2005 05:50:59 +0200
Date: Thu, 31 Mar 2005 05:50:59 +0200
Message-ID: [EMAIL PROTECTED]
From: DAVID HESKEY [EMAIL PROTECTED]
Subject: YOUR CONCEPT IS NEEDED[reply.
Reply-To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
MIME-Version: 1.0
Content-Type: text/plain
X-Originating-IP: 80.179.243.4

The Auditor/Head
of Department
Bank of Scotland,
United Kingdom.
(Great Opportunity Very Urgent and Confidential)
___
Greetings,
I am Dr David Heskey, the auditor and computing staff of a bank here in
SCOTLAND UNITED KINGDOM.
I discovered a dormant account in my office, as an auditor  and head of
computing department of a bank here in Scotland, United Kingdom.It will
be in my interest to transfer this fund worth 15,000,000 million pounds
in an account
offshore.
If you can be a collaborator to this please indicate interest immediately
for us to proceed. Your contact phone numbers and name and your account
information  will be necessary for this effect.
Here is my direct phone number (+447040110197)
At the conclusion of this business, you will be given 35% of the  total
amount, 60% will be for me, while 5% will be for expenses both parties might
have incurred during this process.
More details awaits your positive reply.
Regards and respect,
Dr David Heskey




Re: HUMOR: 419 pic

2005-03-31 Thread Niek
On 3/30/2005 10:15 PM +0100, Chris Santerre wrote:
For those of you who don't know, there is a group of ppl that lead 419
scammers on wild goose chases. One of the things they do is request pics for
proof. They have them do some funny stuff. (Bread and fish on head)
This came across my mail today. Pretty funny! (Contains the word p enis.)
http://www.plus613.com/image/12046
Got this one 1-2 weeks ago, 419 scam, wants to give me millions :)
http://asbak.coding-slaves.com/pic.jpg
Niek
--


Re: negative score from ALL_TRUSTED

2005-03-31 Thread Arvinn Løkkebakken
Matt Kettler wrote:
You have a broken trust path. ALL_TRUSTED should *never* match email
from outside your network.
 

But it does anyway, even when trust path is set correctly:
http://bugzilla.spamassassin.org/attachment.cgi?id=2508
Happens when spamassassin fails to parse the Received header(s) 
containing the untrusted host(s).

Arvinn


Re: can Pyzor run localy?

2005-03-31 Thread Stuart Johnston
Alan Shine wrote:
Hi,
I have a few questuions regrding the benefit/use of SA fatures.
 
1. Can Pyzord run localy as SURBL does with rbldnsd (check the 
message with local repository, not with the Pyzor web servers) ?
See: http://pyzor.sourceforge.net/
Since the entire system is released under the GPL, people are free to 
host their own independent servers. Server peering is planned for a 
future release.

 
2.I would like to activate more features to SA (I currently use only 
SARE rules).
We are considering SURBL, DCC and Pyzor. 
My question is - what are the preferable features that I can add to SA, 
that will result in better spam identification, and that will cost the 
lowest in performance time?
Probably SURBL but if you are going to enable network tests it is best 
to have as many activated as possible from the start.

http://wiki.apache.org/spamassassin/SingleUserUnixInstall


Re: negative score from ALL_TRUSTED

2005-03-31 Thread Matt Kettler
At 09:07 AM 3/31/2005, Arvinn Løkkebakken wrote:
You have a broken trust path. ALL_TRUSTED should *never* match email
from outside your network.
But it does anyway, even when trust path is set correctly:
http://bugzilla.spamassassin.org/attachment.cgi?id=2508
Hmm. Well, that happens if and only if SA can't parse your received 
headers: Usually broken AV appliances that insert a by clause in front of 
the from clause cause this.

One of Roy's headers does look strange to me, and might be unparseable 
because the from and by are in separate headers. I've never seen a working 
mailserver do that before, but that doesn't mean it's not parsable by SA.

However, looking at Roy's headers, it looks like he might have a NATed 
mailserver too, which would definitely cause a broken trust path.

However, you might want to suspect that problem after trying to set trust 
path. Setting a trust path may or may not fix the problem, but at least 
it's quick and easy.

The patch is also not a sure-fire fix, as it will NOT help anyone suffering 
from the broken-trust-path problem. It will ONLY help those suffering from 
a broken mailserver.



sa-learn issues

2005-03-31 Thread Chip
I recently installed SA 3.0.2 on FreeBSD 4.10, and it's working great, 
except for this one feature:

I have things setup so each user has a spam folder that they will put 
missed spam in.  This folder will later be trained from cron jobs using 
sa-learn.  The problem is, it seems that sa-learn is ignoring the -u / 
--user= flag.  No matter what I set it to, it trains for root instead of 
that user.  I am verifying this by checking the /root/.spamassassin/ 
directory.  Each time I run sa-learn, the bayes files in the directory 
are updated, instead of the files in /usr/home/user/.spamassassin/

bash-2.05b# /usr/local/bin/sa-learn -u userfoo --spam --showdots --mbox /usr/home/chip/SPAM
..
Learned from 2 message(s) (2 message(s) examined).
bash-2.05b# /usr/local/bin/sa-learn -u userbar --spam --showdots --mbox /usr/home/chip/SPAM
..
Learned from 0 message(s) (2 message(s) examined)

I have tried the following user flags:
--user=userfoo
--user="userfoo"
--user='userfoo'
-u userfoo
-u "userfoo"
-u 'userfoo'
Any idea what I am missing here?


Re: sa-learn issues

2005-03-31 Thread Matt Kettler
Chip wrote:

 The problem is, it seems that sa-learn is ignoring the -u / --user= flag.

Of course it's ignoring it. There is no -u flag in sa-learn. ONLY spamc
and spamd accept that flag.

sa-learn uses the userid of the user that calls it. Period.


Re: sa-learn issues

2005-03-31 Thread Chip
Matt Kettler wrote:
Chip wrote:
 

The problem is, it seems that sa-learn is ignoring the -u / --user= flag.
   

Of course it's ignoring it. There is no -u flag in sa-learn. ONLY spamc
and spamd accept that flag.
sa-learn uses the userid of the user that calls it. Period.
 

man sa-learn says differently:
-u username, --username=username  Override username taken from the 
runtime environment

However if this is the case, how do you use spamc to train spam on a 
users mbox?


Re: sa-learn issues *RETRACTED*

2005-03-31 Thread Matt Kettler
My bad, apparently 3.0.2 and 3.0.1 do have such a flag in sa-learn. I
was looking at the 3.0.0 version, which does not.

Matt Kettler wrote:

Chip wrote:

  

The problem is, it seems that sa-learn is ignoring the -u / --user= flag.



Of course it's ignoring it. There is no -u flag in sa-learn. ONLY spamc
and spamd accept that flag.

sa-learn uses the userid of the user that calls it. Period.

  




Re: sa-learn issues

2005-03-31 Thread Andre Nicholson
 The problem is, it seems that sa-learn is ignoring the -u / --user= flag.

 Of course it's ignoring it. There is no -u flag in sa-learn. ONLY spamc
 and spamd accept that flag.

 sa-learn uses the userid of the user that calls it. Period.

From TFM:

-u username, --username=username Override username taken from the runtime 
environment

Andre

redirect output from lint

2005-03-31 Thread bruno . delladucata




Hello all

Can someone tell me how i can redirect the output from the command:
spamassassin --lint
to a file or maybe grep / awk

spamassassin --lint | grep something
does not work!

I want to automate the checks of all rules

bruno




.packlist

2005-03-31 Thread Matthew Lenz
Anyone know why the /etc/mail/spamassassin
and /usr/local/share/spamassassin stuff isn't being included in
the .packlist?  I realize that there might be some concern about them
being removed if the package is uninstalled (unlikely) but its also kind
of against everything for which the .packlist stands.

-Matt



Re: redirect output from lint

2005-03-31 Thread Matt Kettler
[EMAIL PROTECTED] wrote:



Hello all

Can someone tell me how i can redirect the output from the command:
spamassassin --lint
to a file or maybe grep / awk

spamassassin --lint | grep something
does not work!

I want to autmate the checks of all rules

bruno
  

Ahh, system admin 201, intermediate pipes and redirection

lint's output goes to stderr, therefore you need to do this to redirect it.

spamassassin --lint 2> file.out

Note that the 2> means redirect filehandle 2, and file handle #2 is
stderr.

You can also do piping if you redirect stderr back to stdout (handle 1):

spamassassin --lint 2>&1 | grep sometext > somefile.out
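
If the goal is to automate the check from a script rather than a shell pipeline, the same stderr-into-stdout trick can be sketched with Python's standard `subprocess` module. The helper names are invented here, and what text to filter on is left to the caller, since the exact wording of lint messages is not something this sketch assumes.

```python
import subprocess
import sys


def run_and_filter(command, needle):
    """Run command with stderr merged into stdout; return lines containing needle."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same effect as 2>&1 in the shell
        text=True,
        check=False,
    )
    return [line for line in result.stdout.splitlines() if needle in line]


def main():
    """Print matching lint lines; return 1 if there were any, else 0."""
    needle = sys.argv[1] if len(sys.argv) > 1 else ""
    lines = run_and_filter(["spamassassin", "--lint"], needle)
    for line in lines:
        print(line)
    return 1 if lines else 0
```

Calling `sys.exit(main())` at the end of a script would give cron a non-zero status whenever lint printed something that matched.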



Autolearn=failed when BAYES_00 is only rule hit

2005-03-31 Thread Don Levey
Please forgive me if this is in the archives; I'm having trouble finding it.

I've just finished training my Bayes DB using sa-learn (perversely, when I
was trying to collect 200 spam messages, the spammers decided to stop
sending to me).  Now that the DB is usable, it's interesting that while most
ham messages produce at least one small rule hit and a negative Bayes score
that results in Autolearn=no, when BAYES_00 is the ONLY rule that hits I
get Autolearn=failed.

Two quick questions:
1) What should I do about this, and
2) Should I worry, or just ignore it?

TIA,
 -Don


Re: sa-learn issues

2005-03-31 Thread Michael Parker
On Thu, Mar 31, 2005 at 01:23:24PM -0500, Chip wrote:
 I have things setup so each user has a spam folder that they will put 
 missed spam in.  This folder will later be trained from cron jobs using 
 sa-learn.  The problem is, it seems that sa-learn is ignoring the -u / 
 --user= flag.  No matter what I set it to, it trains for root instead of 
 that user.  I am verifying this by checking the /root/.spamassassin/ 
 directory.  Each time I run sa-learn, the bayes files in the directory 
 are updated, instead of the files in /usr/home/user/.spamassassin/

This is a feature/shortcoming in the -u option for sa-learn when using
non-SQL based bayes storage modules.  That is why the documentation
states:
  You can use this option to specify users in a virtual user
  configuration.

Otherwise the bayes path, if unset via dbpath or in a .cf file is
expanded to be in $ENV{HOME} which in your case is /root/.

I added the -u specifically for BayesSQL users, since it doesn't refer
to an actual directory on the filesystem.

Feel free to file a bug report, but honestly it might end up being a
documentation patch saying that -u is not effective for DBM storage.

BTW, you can easily accomplish the same thing as root using su -c or
similar mechanisms.
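
As a rough illustration of the su approach, a root cron job could build one command per user along these lines. This is a Python sketch under assumptions: the helper name is made up, the `su user -c command` argument order is assumed to be accepted by the local su, and accounts without a usable login shell may need a different mechanism.

```python
import shlex


def learn_command(user: str, mbox_path: str) -> list:
    """Build an argv list that runs sa-learn on mbox_path as the given user."""
    inner = "sa-learn --spam --mbox " + shlex.quote(mbox_path)
    return ["su", user, "-c", inner]
```

Running each list with `subprocess.run(...)` as root would make sa-learn pick up that user's own home directory instead of /root/.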

Michael




Re: sa-learn issues

2005-03-31 Thread Chip
Michael Parker wrote:
On Thu, Mar 31, 2005 at 01:23:24PM -0500, Chip wrote:
 

I have things setup so each user has a spam folder that they will put 
missed spam in.  This folder will later be trained from cron jobs using 
sa-learn.  The problem is, it seems that sa-learn is ignoring the -u / 
--user= flag.  No matter what I set it to, it trains for root instead of 
that user.  I am verifying this by checking the /root/.spamassassin/ 
directory.  Each time I run sa-learn, the bayes files in the directory 
are updated, instead of the files in /usr/home/user/.spamassassin/
   

This is a feature/shortcoming in the -u option for sa-learn when using
non-SQL based bayes storage modules.  That is why the documentation
states:
 You can use this option to specify users in a virtual user
 configuration.
Otherwise the bayes path, if unset via dbpath or in a .cf file is
expanded to be in $ENV{HOME} which in your case is /root/.
I added the -u specifically for BayesSQL users, since it doesn't refer
to an actual directory on the filesystem.
Feel free to file a bug report, but honestly it might end up being a
documentation patch saying that -u is not effective for DBM storage.
BTW, you can easily accomplish the same thing as root using su -c or
similar mechanisms.
Michael
 

Ahh ok.  Make sense!  I will change to a sql backend, as my users have 
no shell access and can't run the command as themselves.  Thanks for the 
clarification!


Re: sa-learn issues

2005-03-31 Thread Michael Parker
On Thu, Mar 31, 2005 at 02:24:23PM -0500, Chip wrote:
 Ahh ok.  Make sense!  I will change to a sql backend, as my users have 
 no shell access and can't run the command as themselves.  Thanks for the 
 clarification!

Not a bad idea.  The Bayes SQL modules have proven to be stable and in
most cases worth the effort, especially in a virtual user environment.

You can find some more information on storing your SpamAssassin user
data in a SQL database here:
http://people.apache.org/~parker/presentations/

Michael

PS Anyone interested in testing a MySQL specific Bayes Storage module?
It requires MySQL 4.1, SA 3.1-dev and InnoDB tables if you want
rollback on error.  It also provides a 30-40% speed up in some cases.
If so, shoot me an email and I'll send you a copy of the module.




SA rescore config file

2005-03-31 Thread Lisheng Sun
Hi, 
I am trying to re-assign scores in the masses/config file:

SCORESET=3
HAM_PREFERENCE=2.0
THRESHOLD=5.0
EPOCHS=100
NOTE=

What does SCORESET mean here? Do I need to change the HAM_PREFERENCE,
THRESHOLD and EPOCHS values?
Thanks.


Re: SA rescore config file

2005-03-31 Thread Jim Maul
Lisheng Sun wrote:
Hi, 
I try to re-assign score, in masses/config file:

SCORESET=3
HAM_PREFERENCE=2.0
THRESHOLD=5.0
EPOCHS=100
NOTE=
What the SCORESET here mean? Do i need to change the HAM_PREFERENCE,
THRESHOLD and EPOCHS value?
Thanks.

umm what is masses/config file?
You should only be changing scores in local.cf
something like:
score RULE_NAME_HERE VALUE_HERE
e.g.: score BAYES_99 5.0
-Jim


Mysql 5.0 with SA 3.0

2005-03-31 Thread bruno . delladucata




Hello

Has someone tested Mysql 5.0 with SA3.0?



Re: Mysql 5.0 with SA 3.0

2005-03-31 Thread Michael Parker
On Thu, Mar 31, 2005 at 10:12:38PM +0200, [EMAIL PROTECTED] wrote:
 
 Has someone tested Mysql 5.0 with SA3.0?
 

Yes.  Are you just asking? or did you find some sort of problem?

I haven't found any problems so far, but all of my testing has been
focused on BayesSQL.

Michael





Re: SA rescore config file

2005-03-31 Thread Matt Kettler
Jim Maul wrote:




 umm what is masses/config file?

That's the configuration file for the mass-check tools. They impact the
perceptron when evolving scoresets (advanced stuff)


 You should only be changing scores in local.cf

 something like:

 score RULE_NAME_HERE VALUE_HERE

 e.g.: score BAYES_99 5.0


That's true for the average user. However, it looks like Lisheng is
trying to re-evolve an entire scoreset from the ground up, which is a
very advanced topic.

Lisheng,

I think you probably want to leave the masses directory alone until
you've got a better understanding of SpamAssassin in its default
configuration. Certainly you should already have an understanding of
what the different scoresets are LONG before you consider trying to
evolve one. There are probably fewer than 100 people in the entire world
who ever play with the masses tools. Thousands of users of SA aren't
even aware that they exist. These tools are mostly for the developers
and the very advanced user. Even most of the SARE ninjas only run a
small number of the tools in here. They largely use mass-check and
hit-frequencies, without using the perceptron.

I'd suggest using the default scores that come with SA to start with.
The SA developers have gone to the effort of running these tools already
to generate a good scoreset that works for most people. Once you've got
a good feel for the basics, you can start looking at the highly advanced
stuff.

To answer your specific questions (bearing in mind that this is really a
highly advanced user thing to be playing with):

What does SCORESET mean here?

SCORESET picks which scoreset you are evolving scores for. See the
description of the score option in man Mail::SpamAssassin::Conf for an
explanation of each scoreset.
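
As a rough illustration of what the scoreset number selects, here is a
minimal Python sketch. It assumes the numbering described in the
Mail::SpamAssassin::Conf documentation for four-value score lines: set 0
is no Bayes and no network tests, set 1 is network tests only, set 2 is
Bayes only, and set 3 is both. The function names and the example scores
are made up for illustration and are not part of SA.

```python
from typing import Sequence


def scoreset_index(bayes_enabled: bool, net_enabled: bool) -> int:
    """Return the scoreset number (0-3) for a given configuration.

    Assumed layout: Bayes contributes 2, network tests contribute 1.
    """
    return (2 if bayes_enabled else 0) + (1 if net_enabled else 0)


def pick_score(scores: Sequence[float], bayes_enabled: bool,
               net_enabled: bool) -> float:
    """Pick the active score from a one-value or four-value score list."""
    if len(scores) == 1:
        # A single value applies to every scoreset.
        return scores[0]
    if len(scores) != 4:
        raise ValueError("expected 1 or 4 scores")
    return scores[scoreset_index(bayes_enabled, net_enabled)]


if __name__ == "__main__":
    example = [1.0, 1.5, 2.0, 2.5]  # hypothetical values, not real SA scores
    print(scoreset_index(bayes_enabled=True, net_enabled=True))
    print(pick_score(example, bayes_enabled=True, net_enabled=False))
```

Under that assumption, SCORESET=3 in masses/config would mean the
perceptron is evolving the fourth value, the one used when both Bayes and
network tests are enabled.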

 Do I need to change the HAM_PREFERENCE,
THRESHOLD and EPOCHS values?


No. These adjust the mathematics of how the perceptron runs while
generating scoresets. You can run perceptron -h to see a short
description of what these do. Unless you've read the perceptron code and
have a really good feel for what they do, you probably only want to
adjust these for experimental reasons.







Re: Rule Design Benchmark/Resource Question

2005-03-31 Thread Matt Kettler
Rocky Olsen wrote:

Before I pull my hair out doing bench/resource tests, I was wondering if
anyone out there knew whether there was much of a speed/resource usage
difference between the following ways of writing the same rule.


Method A:
body   rule_a  /(?:feh|meh|bleh)/i

vs.

Method B:

body   __rule_a  /(?:feh)/i
body   __rule_b  /(?:meh)/i
body   __rule_c  /(?:bleh)/i

meta   rule_d  (__rule_a || __rule_b || __rule_c)


There probably isn't much difference using just 3 rules, but I'm thinking
more along the lines of large (500+) lists, and it isn't limited to just body
stuff.  So if anyone has some real-world benchmarking experience with what is
preferred, or if the developers know which is faster for SA, I would love
the input.
  


To start with, use perl's regex debugger as your friend:

$ perl -Mre=debug -e "/(?:feh|meh|bleh)/i"
size 11 Got 92 bytes for offset annotations.

$ perl -Mre=debug -e "/(?:feh)/i"
Freeing REx: `,'
Compiling REx `(?:feh)'
size 3 Got 28 bytes for offset annotations.

(repeat 2 times)

However, this covers only part of the story: the cost of the regex
itself. It does not deal with the per-rule overhead in SA.

In general I'd favor the combined approach, unless for some reason your
combined rule is considerably larger than the sum of its parts. Bigevil
ran much better once Chris S did some combining and common subexpression
elimination.




Also, I'd suggest eliminating the (?:) for the single-text-matches. It
does nothing of use, and doesn't change the evaluation of the regex any
for a simple single text match. All it does is waste 4 bytes of disk
space per rule.

body __RULE_A   /feh/i

instead of:
body __RULE_A   /(?:feh)/i

I leave comparing the two using re=debug as an exercise for the student.
Also compare to /(feh)/i and /(feh)\1/i to see how backtracking works.
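
If you want a feel for the combined-versus-split trade-off before testing
inside SA, a rough timing harness can help. The sketch below is Python,
not Perl, so it uses a different regex engine and says nothing about SA's
per-rule overhead; treat it only as a template for the kind of comparison
you would then repeat with real rules and real mail. All names in it are
made up for illustration.

```python
import re
import timeit

# The three sample words from the rules above.
WORDS = ["feh", "meh", "bleh"]

# Method A: one pattern with alternation.
combined = re.compile("|".join(re.escape(w) for w in WORDS), re.IGNORECASE)

# Method B: one pattern per word, OR'ed together like the meta rule.
separate = [re.compile(re.escape(w), re.IGNORECASE) for w in WORDS]


def match_combined(text):
    """Return True if the single alternation pattern matches."""
    return combined.search(text) is not None


def match_separate(text):
    """Return True if any of the individual patterns matches."""
    return any(p.search(text) for p in separate)


def bench(text, repeat=2000):
    """Time both approaches on the same text; returns seconds per method."""
    return {
        "combined": timeit.timeit(lambda: match_combined(text), number=repeat),
        "separate": timeit.timeit(lambda: match_separate(text), number=repeat),
    }


if __name__ == "__main__":
    # Worst-ish case: the only hit is at the very end of the text.
    sample = "lorem ipsum " * 200 + "BLEH"
    assert match_combined(sample) == match_separate(sample)
    for name, secs in bench(sample).items():
        print(f"{name}: {secs:.4f}s")
```

The numbers will vary by machine and input, so no expected output is
given. The useful part is checking that both methods agree on every
message before comparing their timings.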









Re: sa-learn issues

2005-03-31 Thread Chip
Chip wrote:
Michael Parker wrote:
On Thu, Mar 31, 2005 at 01:23:24PM -0500, Chip wrote:
 

I have things set up so each user has a spam folder that they will 
put missed spam in.  This folder will later be trained from cron 
jobs using sa-learn.  The problem is, it seems that sa-learn is 
ignoring the -u / --user= flag.  No matter what I set it to, it 
trains for root instead of that user.  I am verifying this by 
checking the /root/.spamassassin/ directory.  Each time I run 
sa-learn, the bayes files in the directory are updated, instead of 
the files in /usr/home/user/.spamassassin/
  

This is a feature/shortcoming in the -u option for sa-learn when using
non-SQL based bayes storage modules.  That is why the documentation
states:
 You can use this option to specify users in a virtual user
 configuration.
Otherwise the bayes path, if unset via dbpath or in a .cf file is
expanded to be in $ENV{HOME} which in your case is /root/.
I added the -u specifically for BayesSQL users, since it doesn't refer
to an actual directory on the filesystem.
Feel free to file a bug report, but honestly it might end up being a
documentation patch saying that -u is not effective for DBM storage.
BTW, you can easily accomplish the same thing as root using su -c or
similar mechanisms.
Michael
 

Ahh ok.  Makes sense!  I will change to a SQL backend, as my users have 
no shell access and can't run the command as themselves.  Thanks for 
the clarification!

Changing the backend storage driver worked perfectly, well, almost.  When 
using DBM storage, the user_prefs file was automatically created when a 
new user got their first mail.  Now, using MySQL, the userpref table is 
empty.  Is this the default behavior?  The reason I ask is that, with no 
examples of what to put in the table, I am unsure of the syntax ;)
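
For reference, here is a sketch of what rows in such a table might look
like. It assumes the username / preference / value layout of the userpref
schema shipped in SpamAssassin's sql/ directory, with '@GLOBAL' as the
site-wide user name, and it assumes each row is read as one config line
of the form "preference value". It uses Python's built-in sqlite3 as a
stand-in for MySQL, and the user name "chip" and the sample values are
hypothetical. Check the schema and README in your own SA distribution
before relying on any of this.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in for the userpref table with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE userpref (
               prefid     INTEGER PRIMARY KEY AUTOINCREMENT,
               username   TEXT NOT NULL DEFAULT '',
               preference TEXT NOT NULL DEFAULT '',
               value      TEXT NOT NULL DEFAULT ''
           )"""
    )
    rows = [
        ("@GLOBAL", "required_score", "5.0"),         # site-wide default
        ("chip", "required_score", "4.0"),            # per-user override
        ("chip", "whitelist_from", "*@example.com"),  # per-user whitelist
    ]
    conn.executemany(
        "INSERT INTO userpref (username, preference, value) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def prefs_for(conn, username):
    """Return global plus per-user rows rendered as 'preference value' lines."""
    cur = conn.execute(
        "SELECT preference, value FROM userpref "
        "WHERE username = ? OR username = '@GLOBAL' "
        "ORDER BY username ASC, prefid ASC",
        (username,),
    )
    return [f"{pref} {value}" for pref, value in cur]


if __name__ == "__main__":
    for line in prefs_for(make_demo_db(), "chip"):
        print(line)
```

Ordering by username puts the '@GLOBAL' rows first, so a per-user row for
the same preference comes later and can override the site-wide one.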



Re: sa-learn issues

2005-03-31 Thread Chip
Chip wrote:
Michael Parker wrote:
On Thu, Mar 31, 2005 at 01:23:24PM -0500, Chip wrote:
 

I have things set up so each user has a spam folder that they will 
put missed spam in.  This folder will later be trained from cron 
jobs using sa-learn.  The problem is, it seems that sa-learn is 
ignoring the -u / --user= flag.  No matter what I set it to, it 
trains for root instead of that user.  I am verifying this by 
checking the /root/.spamassassin/ directory.  Each time I run 
sa-learn, the bayes files in the directory are updated, instead of 
the files in /usr/home/user/.spamassassin/
  

This is a feature/shortcoming in the -u option for sa-learn when using
non-SQL based bayes storage modules.  That is why the documentation
states:
 You can use this option to specify users in a virtual user
 configuration.
Otherwise the bayes path, if unset via dbpath or in a .cf file is
expanded to be in $ENV{HOME} which in your case is /root/.
I added the -u specifically for BayesSQL users, since it doesn't refer
to an actual directory on the filesystem.
Feel free to file a bug report, but honestly it might end up being a
documentation patch saying that -u is not effective for DBM storage.
BTW, you can easily accomplish the same thing as root using su -c or
similar mechanisms.
Michael
 

Ahh ok.  Makes sense!  I will change to a SQL backend, as my users have 
no shell access and can't run the command as themselves.  Thanks for 
the clarification!



Re: Rule Design Benchmark/Resource Question

2005-03-31 Thread Rocky Olsen

Thanks

On Thu, Mar 31, 2005 at 05:16:25PM -0500, Matt Kettler wrote:
 Rocky Olsen wrote:
 
 Before I pull my hair out doing bench/resource tests, I was wondering if
 anyone out there knew whether there was much of a speed/resource usage
 difference between the following ways of writing the same rule.
 
 
 Method A:
 body rule_a  /(?:feh|meh|bleh)/i
 
 vs.
 
 Method B:
 
 body __rule_a  /(?:feh)/i
 body __rule_b  /(?:meh)/i
 body __rule_c  /(?:bleh)/i
 
 meta rule_d  (__rule_a || __rule_b || __rule_c)
 
 
 There probably isn't much difference using just 3 rules, but I'm thinking
 more along the lines of large (500+) lists, and it isn't limited to just body
 stuff.  So if anyone has some real-world benchmarking experience with what is
 preferred, or if the developers know which is faster for SA, I would love
 the input.
   
 
 
 To start with, use perl's regex debugger as your friend:
 
 $ perl -Mre=debug -e "/(?:feh|meh|bleh)/i"
 size 11 Got 92 bytes for offset annotations.
 
 $ perl -Mre=debug -e "/(?:feh)/i"
 Freeing REx: `,'
 Compiling REx `(?:feh)'
 size 3 Got 28 bytes for offset annotations.
 
 (repeat 2 times)
 
 However, this covers only part of the story: the cost of the regex
 itself. It does not deal with the per-rule overhead in SA.
 
 In general I'd favor the combined approach, unless for some reason your
 combined rule is considerably larger than the sum of its parts. Bigevil
 ran much better once Chris S did some combining and common subexpression
 elimination.
 
 
 
 
 Also, I'd suggest eliminating the (?:) for the single-text-matches. It
 does nothing of use, and doesn't change the evaluation of the regex any
 for a simple single text match. All it does is waste 4 bytes of disk
 space per rule.
 
 body __RULE_A   /feh/i
 
 instead of:
 body __RULE_A   /(?:feh)/i
 
 I leave comparing the two using re=debug as an exercise for the student.
 Also compare to /(feh)/i and /(feh)\1/i to see how backtracking works.
 
 
 
 
 
 
 

-- 
__


what's with today, today?

Email:  [EMAIL PROTECTED]
PGP:http://rocky.mindphone.org/rocky_mindphone.org.gpg

